CN108234520A - Anomalous flow pattern recognition method based on Benford's law - Google Patents

Publication number
CN108234520A
Authority
CN
China
Prior art keywords
data set
value
window value
abnormal
feature
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810118785.0A
Other languages
Chinese (zh)
Inventor
肖敏
王艳
孙六英
夏喆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer And Data Communications (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an anomalous flow pattern recognition method based on Benford's law. Benford's law is used to select, as the metric, the feature that best separates a normal network environment from an abnormal one. Mixed data sets for the different attack types are then built on that metric, and each attack type is identified from the differences between the pattern graphs of the mixed data sets. The pattern graphs make it possible to identify different attack types and to infer the number of abnormal sessions. Compared with other anomaly detection techniques, the method only needs to compute the actual first-digit distribution and its fit value, so the computation is light and simple. The same feature serves as the classification criterion across different attack types, which gives the method better applicability.

Description

Anomalous flow pattern recognition method based on Benford's law
Technical field
The invention relates to network traffic pattern recognition, and in particular to an anomalous flow pattern recognition method based on Benford's law.
Background
With the rapid development of the Internet, networks reach more and more users. The network has become an inseparable part of our work and life and is closely tied to our personal interests. Once an anomaly occurs in the network, the damage it causes can be incalculable. Intrusion detection systems are one of the key technologies for network security. Anomalous flow identification means detecting and classifying anomalous flows during intrusion detection, and it is a key factor in how well an intrusion detection system performs.
In recent years, anomalous flow identification has been studied extensively with statistical and data mining methods. However, data mining usually needs a large number of training samples to guarantee the accuracy of the model. When training samples are insufficient, detection quality drops sharply, and the model may perform worse than traditional statistical analysis based on hand-crafted features. Moreover, because real-time computation for data mining is expensive, detection becomes very challenging in high-volume environments.
To solve these problems, a technique that can be used for real-time detection is needed. Because different types of anomalous flows have similar statistical properties, statistical methods can be used to identify them. Statistics analyzes the traffic itself: it reveals the regularities inside the traffic and better reflects traffic trends. How to identify anomalous flows quickly by statistical means is a problem that urgently needs solving.
Summary of the invention
To overcome the defects of the background art described above, the invention provides an anomalous flow pattern recognition method based on Benford's law that applies to multiple attack types and simplifies the classification steps.
The technical solution adopted by the invention to solve the above technical problem is as follows.
An anomalous flow pattern recognition method based on Benford's law: Benford's law is used to select, as the metric, the feature that best separates a normal network environment from an abnormal network environment; mixed data sets of the different attack types are obtained according to the metric; and each attack type is identified from the differences between the pattern graphs of the mixed data sets.
Preferably, the specific method of using Benford's law to select the most discriminating feature as the metric includes:
Normal session feature data set generation step: capture normal network sessions as the normal session set, extract the value of each feature in the normal session set, and take the set of values of each feature as the normal session feature data set of that feature;
Point set acquisition step: divide the normal session feature data set into equal blocks using each window value in a window value set. Traverse the window values in turn; for each one, obtain the fit value within each independent block after division, average these per-block fit values to obtain the average fit value for that window value, and build the point set from all window values and their average fit values;
Best window value acquisition step: obtain the fitting function corresponding to the point set, and take the abscissa of the point on the fitting function nearest to the origin as the best window value;
Metric generation step: capture abnormal network sessions as the abnormal session set, extract the value of each feature in the abnormal session set, and take the set of values of each feature as the abnormal session feature data set of that feature. Divide the abnormal session feature data set and the normal session feature data set according to the best window value and obtain the average fit value of each at the best window value. For each feature, compute the difference between the average fit value of the data set generated from normal sessions and that of the data set generated from abnormal sessions; the feature with the largest difference is the metric.
Preferably, the features include source address, source port, destination address, destination port, packet size, byte size, source-to-destination packet size, source-to-destination byte size, destination-to-source packet size, destination-to-source byte size, relative start time and duration.
Preferably, the point set is obtained from the window values and their average fit values as follows: the abscissa of the point corresponding to each window value is the window value, its ordinate is the average fit value for that window value, and the set of points corresponding to all window values is the point set.
Preferably, the specific method of obtaining the mixed data sets of the different attack types according to the metric and identifying each attack type from the differences between their pattern graphs includes:
Mixed data set generation step: using the metric, extract the feature values of the normal network sessions and the abnormal network sessions under that metric as the normal data set and the abnormal data set. Divide the abnormal data set by attack type to obtain the abnormal data set of each attack type. For the abnormal data set of each attack type, mix it with the normal data set at each anomaly ratio in turn, obtaining the mixed data set for each anomaly ratio under each attack type;
Identification step: divide each mixed data set into equal blocks with each window value in the window value set, obtain the fit value within each independent block after division, average the per-block fit values to obtain the average fit value for each window value, and build the point set of each mixed data set from all window values and their average fit values. Generate the pattern graph from the point set of the mixed data set. Use the pattern graphs of the mixed data sets of each attack type at the different anomaly ratios as features for training, obtaining a classifier for each attack type.
Preferably, the number of mixed data sets is the product of the number of attack types and the number of anomaly ratios.
Preferably, each anomaly ratio is a manually set value.
Preferably, the maximum of the window value set is the length of the data set and its minimum is 1; each window value in the set is traversed in turn.
The beneficial effects of the invention are as follows. Different attack types can be identified from the pattern graphs, and the number of abnormal sessions can be inferred. Compared with other anomaly detection techniques, the method only needs to compute the actual distribution and the fit value, so the computation is light and the procedure is simple. The same feature can be used as the classification criterion across different attack types, which gives better applicability. Because the amount and duration of computation are reduced, the method costs much less than other techniques. The invention selects features that conform to the Benford distribution, uses Benford's law to generate a feature that can distinguish different types of attack, and infers the number of anomalous flows from the pattern graph. It thus proposes a method for distinguishing different types of network attack with a small computational load and low cost. Compared with methods based on manual feature extraction, it depends less on the operator and is more objective. Compared with deep learning algorithms, it needs fewer samples. It applies to multiple attack types and simplifies the classification steps.
Description of the drawings
Fig. 1 is a flowchart of the method of the embodiment of the invention for detecting and classifying anomalous flow patterns based on Benford's law
Fig. 2 is a schematic diagram of data set blocking in the embodiment of the invention
Fig. 3 is the pattern graph of the normal flows in the embodiment of the invention
Fig. 4 is the pattern graph of normal flow set part2 in the embodiment of the invention
Fig. 5 is the pattern graph of normal flow set part3 in the embodiment of the invention
Fig. 6 is the pattern graph of mixed data set part1 in the embodiment of the invention (anomalous flows account for 0.015%)
Fig. 7 is the pattern graph of mixed data set part19 in the embodiment of the invention (anomalous flows account for 4.75%)
Fig. 8 is the pattern graph of mixed data set part4 in the embodiment of the invention (anomalous flows account for 22.38%, located at records 76618-100000)
Fig. 9 is the pattern graph of mixed data set part8 in the embodiment of the invention (anomalous flows account for 31.4%, located at records 68595-100000)
Fig. 10 is the pattern graph of mixed data set part6 in the embodiment of the invention (anomalous flows account for 81.42%, located at records 1-81417)
Fig. 11 is the pattern graph of mixed data set part10 in the embodiment of the invention (anomalous flows account for 74.19%, concentrated at records 1-74194)
Fig. 12 is the pattern graph of anomalous flow data set part5 in the embodiment of the invention
Fig. 13 is a comparison of the pattern graphs of part4 and part8 in the embodiment of the invention
Fig. 14 is a comparison of the pattern graphs of part6 and part10 in the embodiment of the invention
Fig. 15 is the pattern graph of mixed data set part108 in the embodiment of the invention (anomalous flows account for 76.85%, located at records 2316-10000)
Fig. 16 is the pattern graph of mixed data set part128 in the embodiment of the invention (anomalous flows account for 77.72%, located at records 1-7772)
Fig. 17 is the pattern graph of mixed data set part45 in the embodiment of the invention (anomalous flows account for 10.55%, located at records 8946-10000)
Fig. 18 is the pattern graph of mixed data set part54 in the embodiment of the invention (anomalous flows account for 10.5%, located at records 1-1050)
Fig. 19 is the pattern graph of mixed data set part61 in the embodiment of the invention (anomalous flows account for 61.15%, located at records 3741-9855)
Detailed description of the embodiments
The invention is described further below with reference to the accompanying drawings and embodiments.
In the anomalous flow pattern recognition method based on Benford's law of this embodiment, Benford's law is used to select, as the metric, the feature that best separates a normal network environment from an abnormal network environment; mixed data sets of the different attack types are obtained according to the metric; and each attack type is identified from the differences between the pattern graphs of the mixed data sets.
The specific method includes:
Step 1, normal session feature data set generation step: capture normal network sessions as the normal session set, extract the value of each feature in the normal session set, and take the set of values of each feature as the normal session feature data set of that feature;
The features include source address, source port, destination address, destination port, packet size, byte size, source-to-destination packet size, source-to-destination byte size, destination-to-source packet size, destination-to-source byte size, relative start time and duration.
Step 2, point set acquisition step: divide the normal session feature data set into equal blocks using each window value in the window value set. Traverse the window values in turn; for each one, obtain the fit value within each independent block after division, average these per-block fit values to obtain the average fit value for that window value, and build the point set from all window values and their average fit values. The abscissa of the point corresponding to each window value is the window value, its ordinate is the average fit value for that window value, and the set of points corresponding to all window values is the point set.
Step 3, best window value acquisition step: obtain the fitting function corresponding to the point set, and take the abscissa of the point on the fitting function nearest to the origin as the best window value;
Step 4, metric generation step: capture abnormal network sessions as the abnormal session set, extract the value of each feature in the abnormal session set, and take the set of values of each feature as the abnormal session feature data set of that feature. Divide the abnormal session feature data set and the normal session feature data set according to the best window value and obtain the average fit value of each at the best window value. For each feature, compute the difference between the average fit value of the data set generated from normal sessions and that of the data set generated from abnormal sessions; the feature with the largest difference is the metric.
Step 5, mixed data set generation step: using the metric, extract the feature values of the normal network sessions and the abnormal network sessions under that metric as the normal data set and the abnormal data set. Divide the abnormal data set by attack type to obtain the abnormal data set of each attack type. For the abnormal data set of each attack type, mix it with the normal data set at each anomaly ratio in turn, obtaining the mixed data set for each anomaly ratio under each attack type. The number of mixed data sets is the product of the number of attack types and the number of anomaly ratios. Each anomaly ratio is a manually set value.
Step 6, identification step: divide each mixed data set into equal blocks with each window value in the window value set, obtain the fit value within each independent block after division, average the per-block fit values to obtain the average fit value for each window value, and build the point set of each mixed data set from all window values and their average fit values. Generate the pattern graph from the point set of the mixed data set. Use the pattern graphs of the mixed data sets of each attack type at the different anomaly ratios as features for training, obtaining a classifier for each attack type.
Here, the maximum of the window value set is the length of the data set, and the minimum of the window value set is 1.
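The training in Step 6 can be pictured with a small Python sketch. The text does not name a classifier, so a nearest-centroid rule is swapped in here purely as an assumption. Each pattern graph is represented as a vector of average fit values sampled at the same window values. The names `train` and `classify` and the toy vectors are hypothetical and carry no experimental meaning.

```python
import math

def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]

def train(labelled):
    """labelled maps an attack type to a list of pattern-graph vectors.
    Returns one centroid per attack type (assumed nearest-centroid model)."""
    return {label: centroid(vectors) for label, vectors in labelled.items()}

def classify(model, vector):
    """Return the attack type whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], vector))

# Toy vectors for illustration only; they are not measured pattern graphs.
model = train({
    "neptune": [[1.0, 0.5], [1.2, 0.4]],
    "portsweep": [[0.2, 0.9], [0.1, 1.1]],
})
predicted = classify(model, [1.1, 0.5])
```

Any classifier that accepts fixed-length feature vectors could replace the centroid rule. The only requirement carried over from the text is that the pattern graphs of the mixed data sets at the different anomaly ratios serve as the training features.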
The technical solution of the method of this embodiment is illustrated below with experiments on the KDD99 data set. KDD99 simulates the network environment of a United States Air Force LAN and contains more than 5,000,000 network connection records as training data and 2,000,000 network connection records as test data. A network connection record corresponds to one session, and each session has 41 network features. Each session is labeled normal or abnormal, and the abnormal sessions are subdivided into 4 major categories with 39 attack types in total. In the experiments, pattern graphs were obtained for different attack types, mainly the SYN flood attack (neptune) and, among the network probing attacks (probe), the port sweep attack (portsweep) and the satan scanning attack. They show that different attack types have corresponding pattern graphs and that the number of anomalous flows can be detected. This patent combines statistical methods such as Benford's law and the chi-square test. Benford's law supplies the theoretical data and is the key parameter in computing the fit; the chi-square test expresses how well the experimental data fit the theoretical data. The flowchart of the method is shown in Fig. 1. The method specifically includes:
Step 1: Data preprocessing
1) Divide the data set by class: KDD99 contains sessions under three protocols, TCP, UDP and ICMP. Shiravi et al. note that most IP traffic on the Internet is TCP traffic, with a reported share of about 95.11%. Because TCP flows dominate Internet traffic, this patent studies network flows under the TCP protocol (the "sessions" referred to below). The TCP sessions are extracted as a new data set, TCP_DATA. It contains two kinds of session, NORMAL (768668) and MALICIOUS (1101926); MALICIOUS consists mainly of SYN flood attacks (neptune, 1072017), port sweep attacks (portsweep, 10407) and satan scanning attacks (14147). TCP_DATA as a whole serves as the MIXED (1870594) data set. These six types are used as one criterion for dividing the data set. Because each session has 41 network features, further experiments are needed to decide which feature to use when selecting data (that is, to determine the available metric). By the definition of Benford's law, the first digits of naturally generated data follow a logarithmic distribution, so a network feature chosen for study should take values in a continuous range. The network features that meet this requirement are: duration, src_bytes, dst_bytes, bytes, count, srv_count, dst_host_count, dst_host_srv_count. Six network features serve as the other criterion for dividing the data set. This step therefore yields 36 data sets in total, which are the experimental data sets for Step 4.
2) Set the window range: based on the length of the NORMAL data set obtained above, a window range is specified in order to explore the relationship between window value and fit value and thereby choose the best window value. The number of blocks produced by a window must be large enough for the division to be meaningful, and each window must also contain enough sessions; otherwise the first-digit probabilities will hardly show any regularity. Therefore, for the NORMAL data set of 768668 sessions, the window range is [500, 20000] and the step between adjacent window values is 250.
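As a quick check of the window range above, the following Python sketch enumerates the window values from 500 to 20000 in steps of 250 and counts how many full blocks each one yields on the 768668-session NORMAL data set. The function name `window_values` is illustrative, and dropping a trailing partial block is an assumption, since the text does not say how a remainder is handled.

```python
def window_values(start=500, stop=20000, step=250):
    """Window sizes in the closed range [start, stop]."""
    return list(range(start, stop + 1, step))

NORMAL_SESSIONS = 768668  # size of the NORMAL data set given in the text

windows = window_values()
# Full blocks per window size (assumption: a partial tail block is dropped).
blocks = {w: NORMAL_SESSIONS // w for w in windows}
```

With these settings there are 79 candidate windows, giving between 38 blocks (window 20000) and 1537 blocks (window 500), so both the number of blocks and the block size stay reasonably large across the range.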
Step 2: Compute the average fit value
According to the given window range, each window value is taken in turn and the NORMAL data set under each feature is divided into equal blocks of that size (the blocking is shown in Fig. 2). Within each block, the differences between the feature values of adjacent sessions are computed and the probability of each first digit is counted. For example, the NORMAL data set under the dst_bytes feature is divided with a given window value, the probabilities of the first digits 1 to 9 among the adjacent-session differences in each window are counted, and a chi-square test is used to fit these values against Benford's law, giving the fit value within the block. The per-block fit values are averaged to give the average fit value for that window size. Benford's law is the following distribution:

$$P_d = \log_{10}\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \ldots, 9$$

Here d is the value of the first digit and P_d is the probability that d appears as the first digit. Benford's law shows that the first digits of naturally formed data follow this logarithmic distribution; if there is human intervention while the data are generated, the first digits of the resulting data do not follow it. The law is commonly used to detect anomalies and fraud, and the same property can be used to tell normal flows from anomalous flows.
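The distribution above and the comparison against it can be sketched in Python. The exact chi-square form used in the patent is not given, so the statistic below, the sum over digits of (observed frequency minus expected probability) squared divided by the expected probability, is an assumption. When the frequencies sum to 1, multiplying it by the sample size gives Pearson's chi-square statistic. The names `benford_probs` and `fit_value` are illustrative.

```python
import math

def benford_probs():
    """Expected first-digit probabilities P_d = log10(1 + 1/d) for d = 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]

def fit_value(freqs):
    """Chi-square style distance between observed first-digit frequencies
    (index 0 holds digit 1) and Benford's law. Smaller means a closer fit.
    Assumed form; the source does not spell out the formula."""
    return sum((f - p) ** 2 / p for f, p in zip(freqs, benford_probs()))
```

A block whose first digits follow Benford's law exactly scores 0, while a block in which every first digit is 1 scores roughly 2.3, which matches the convention in the text that a smaller fit value means closer agreement with the law.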
Notes: 1. When a difference is 0, the value is discarded and is not used as a parameter in computing the fit value. This does not distort the result: although the frequency of 0 is discarded, the zeros still count toward the denominator, so their effect on the frequencies of the other digits remains. 2. When a difference is a decimal, its first non-zero decimal digit is counted as the first digit. The data set blocking is shown in Fig. 2.
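The two notes can be made concrete with a short Python sketch. It reads note 2 as applying to differences below 1 and takes the sign of a difference to be irrelevant; both readings are assumptions. The helper names `first_digit` and `digit_frequencies` are hypothetical.

```python
from decimal import Decimal

def first_digit(value):
    """Leading non-zero digit (1-9) of value, or None when value is 0.
    For a value below 1 this is its first non-zero decimal digit."""
    digits = Decimal(str(abs(value))).as_tuple().digits
    for d in digits:
        if d != 0:
            return d
    return None

def digit_frequencies(values):
    """First-digit frequencies of the differences between adjacent values.
    Zero differences are not counted for any digit but still count toward
    the denominator, following note 1."""
    counts = [0] * 9
    total = 0
    for a, b in zip(values, values[1:]):
        total += 1
        d = first_digit(b - a)
        if d is not None:
            counts[d - 1] += 1
    return [c / total for c in counts] if total else [0.0] * 9
```

For the sequence 10, 10, 35, 5 the differences are 0, 25 and -30, so the digits 2 and 3 each get frequency 1/3 and the frequencies sum to less than 1 because of the discarded zero. Floating-point differences can carry rounding error, so integer-valued features such as byte counts are the safest input for this sketch.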
After the average fit value for each window has been computed, it can be seen that the fit value becomes more stable as the window value increases, as shown in Fig. 3.
Step 3: Obtain the best window value
Fig. 3 shows the relationship between window value and fit value: the fit value decreases as the window value increases, which indicates that small window values seriously degrade the fit. As the window grows, the fit value stabilizes. To keep the number of samples after division large enough, the experiment chooses the critical point at which the curve becomes stable, as indicated by the arrow in the figure.
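One way to locate such a critical point is the rule stated earlier in the description: take the point nearest the origin. The Python sketch below applies that rule directly to the sampled points rather than to a fitted function, and it rescales both axes to [0, 1] first. The rescaling is an assumption made here because window values and fit values have very different magnitudes; the source does not mention it. The function name `best_window` and the sample points are illustrative.

```python
import math

def best_window(points):
    """points is a list of (window value, average fit value) pairs.
    Returns the window value of the point closest to the origin after
    both axes are min-max scaled (scaling is an assumption)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0

    def distance(point):
        return math.hypot((point[0] - x_min) / x_span,
                          (point[1] - y_min) / y_span)

    return min(points, key=distance)[0]

# Made-up curve with the shape described for Fig. 3: steep fall, then flat.
sample = [(500, 90.0), (1000, 40.0), (2000, 12.0),
          (4000, 10.0), (8000, 9.5), (16000, 9.0)]
chosen = best_window(sample)
```

On a curve that falls steeply and then flattens, the point nearest the scaled origin sits at the bend, which is the behaviour the arrow in Fig. 3 is described as marking.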
Step 4: Determine the available metric
The 36 data sets produced in the data preprocessing stage are divided according to the best window value, and the average fit value of each data set is computed as in Step 2. The results are shown in Table 1:
Table 1: Fit values under different features
For a chosen network feature, the smaller the fit value between the corresponding normal data set and Benford's law, the better the data conform to the Benford distribution; the larger the fit value between the corresponding abnormal data set and Benford's law, the less the data conform to it. This makes it possible to distinguish normal from abnormal.
Among the features above, those that clearly separate the normal, mixed and abnormal data sets numerically are src_bytes, dst_bytes and bytes. For each of them the fit values satisfy normal data set < mixed data set < abnormal data set. The separation is most pronounced for bytes, so the subsequent experiments are carried out on the data sets corresponding to this feature.
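The selection rule, choosing the feature with the largest gap between the abnormal and normal fit values, can be sketched as follows. The function name `select_metric` is hypothetical, and the numbers are placeholders invented for illustration; they are not the values of Table 1.

```python
def select_metric(normal_fit, abnormal_fit):
    """Return the feature whose abnormal fit value exceeds its normal fit
    value by the largest margin. Both arguments map feature name to the
    average fit value at the best window value."""
    gaps = {feature: abnormal_fit[feature] - normal_fit[feature]
            for feature in normal_fit if feature in abnormal_fit}
    if not gaps:
        raise ValueError("no feature is present in both inputs")
    return max(gaps, key=gaps.get)

# Placeholder numbers for illustration only; NOT the values of Table 1.
normal_fit = {"src_bytes": 0.20, "dst_bytes": 0.15, "bytes": 0.10}
abnormal_fit = {"src_bytes": 0.90, "dst_bytes": 1.10, "bytes": 1.60}
metric = select_metric(normal_fit, abnormal_fit)
```

A signed gap is used rather than an absolute difference, on the reading that a useful metric is one where normal traffic fits Benford's law well and abnormal traffic fits it poorly.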
Step 5: Generate the mixed data sets
The steps above fix the protocol as TCP and the metric as bytes. Next, the mixed data set for each attack type is extracted. Because the main attack types in KDD99 are the port sweep attack, the satan scanning attack and the SYN flood attack, the sessions labeled NORMAL and NEPTUNE are extracted from TCP_DATA under the bytes feature as the mixed data set MIX_N&NEP (768668 normal sessions and 1072017 sessions whose attack type is SYN flood). In the same way, the mixed data sets MIX_N&PS (768668 normal sessions and 10407 sessions whose attack type is port sweep) and MIX_N&ST (768668 normal sessions and 14147 sessions whose attack type is satan scanning) are obtained. To ensure enough samples and enough sessions in each sample, MIX_N&NEP is divided in units of 100000 into 19 subsets, referred to as PART_NEP. Because the other two data sets are smaller, MIX_N&NEP, MIX_N&PS and MIX_N&ST are also divided in units of 10000; these subsets are referred to as PART_MALICIOUS. In this way the pattern graphs generated later allow comparisons both between different samples of the same type and between different types.
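The subset arithmetic and the mixing by anomaly ratio can be sketched in Python. The helper names are hypothetical. Placing all anomalous sessions contiguously at the head or tail of a mixed set is an assumption suggested by the record ranges in the figure captions, and keeping a shorter final unit is likewise assumed.

```python
def subset_count(total, unit):
    """Number of subsets when total records are cut into units of `unit`,
    counting a shorter final subset (ceiling division)."""
    return -(-total // unit)

def split_into_units(records, unit):
    """Cut records into consecutive subsets of at most `unit` items."""
    return [records[i:i + unit] for i in range(0, len(records), unit)]

def make_mixed(normal, abnormal, size, ratio, anomalies_first=True):
    """Build one mixed set of `size` records with the given anomaly ratio.
    Anomalies are placed as one contiguous run (assumption)."""
    n_abnormal = round(size * ratio)
    n_normal = size - n_abnormal
    if n_abnormal > len(abnormal) or n_normal > len(normal):
        raise ValueError("not enough records for the requested mix")
    head, tail = abnormal[:n_abnormal], normal[:n_normal]
    return head + tail if anomalies_first else tail + head

# MIX_N&NEP holds 768668 normal plus 1072017 neptune sessions.
parts = subset_count(768668 + 1072017, 100000)
```

The count of 19 agrees with the number of PART_NEP subsets stated in the text. `make_mixed` corresponds to the general mixing step, where one mixed data set is produced per attack type and anomaly ratio.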
Step 6: Window range
For the subsets divided from MIX_N&NEP, MIX_N&PS and MIX_N&ST, the window range runs from 1 to the length of the data set, which ensures that the pattern graph contains enough points.
Step 7: Compute the average fit value
As in Step 2, the window values are traversed in turn. For each one, the data set is divided into equal blocks, the fit value within each window is computed, and the fit values of all blocks produced by that window are averaged. This gives the average fit value for the window value.
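The loop described in this step can be sketched end to end in Python. Several details are assumptions rather than the patent's stated procedure: the fit statistic is the chi-square style distance between first-digit frequencies and Benford's law, zero differences count only toward the denominator, and a trailing partial block is dropped. All function names are illustrative.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def leading_digit(x):
    """First non-zero digit of x, or None when x is zero."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    return None

def block_fit(block):
    """Fit value of one block, from the first digits of adjacent differences.
    Returns None when the block has no adjacent pairs."""
    counts = [0] * 9
    total = 0
    for a, b in zip(block, block[1:]):
        total += 1  # zero differences still count toward the denominator
        d = leading_digit(b - a)
        if d is not None:
            counts[d - 1] += 1
    if total == 0:
        return None
    return sum((c / total - p) ** 2 / p for c, p in zip(counts, BENFORD))

def average_fit(values, window):
    """Mean block fit after cutting values into equal blocks of `window`."""
    fits = []
    for start in range(0, len(values) - window + 1, window):
        fit = block_fit(values[start:start + window])
        if fit is not None:
            fits.append(fit)
    return sum(fits) / len(fits) if fits else None

def pattern_points(values, windows):
    """(window value, average fit value) pairs that make up a pattern graph."""
    points = []
    for w in windows:
        fit = average_fit(values, w)
        if fit is not None:
            points.append((w, fit))
    return points
```

A window of 1 yields blocks with no adjacent pair, so in this sketch it produces no point even though the stated window range starts at 1. Plotting the returned pairs with the window value on the horizontal axis gives the kind of curve the text calls a pattern graph.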
Step 8: Generate the pattern graph
The point set of window values and fit values is displayed as a graph. This graph is the pattern graph.
Table 2 gives the details of PART_NEP, and Fig. 4 to Fig. 12 are the pattern graphs of the individual parts.
Table 2: Details of selected subsets of the MIX_N&NEP data set
Analysis of results:
1. Fig. 4 to Fig. 7 show that the pattern graphs of normal flows, mixed flows and anomalous flows differ markedly. However, the normal flows and the mixed flows in these figures are hard to tell apart, because the mixed sets contain few anomalous flows. Repeated experiments show that once anomalous flows account for more than 10%, normal flows and mixed flows can be distinguished by their pattern graphs.
2. Fig. 4 to Fig. 12 show that the mixed data sets containing abnormal sessions of the SYN flood attack type are self-similar. Repeated experiments further indicate that such data sets mainly show the following two patterns (only for cases where anomalies exceed 10%). First: the graph shows a staircase pattern that descends from left to right. Second: the graph shows a staircase pattern that ascends from left to right.
3. If the anomalous flows are concentrated in the first half of the data set, a point can be found on the graph whose window value equals the number of anomalous flows. This point is characterized by the curve undergoing its largest drop immediately after it, as indicated by the arrows in Fig. 10 and Fig. 11. When the anomalous flows are concentrated in the second half of the data set, the number of normal flows is the window value reached after the curve has risen to its greatest extent, as indicated by the arrows in Fig. 8 and Fig. 9.
4th, by Figure 11 and Figure 12 can be found that attack type be SYN flood attacks mixing concentrate have identical size with And it is similar that abnormal session, which has the ideograph of analogous location,.
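The observation in point 3 suggests a simple way to locate the marked point automatically. The sketch below is a heuristic, not the procedure used in the experiments, which rely on reading the graph: it scans consecutive points of a pattern graph and returns the window value after which the curve falls the most. The function name `largest_drop` and the sample points are hypothetical.

```python
def largest_drop(points):
    """Return (window, drop) for the window value after which the curve falls most.

    `points` is a list of (window value, average fit) pairs sorted by window value.
    Returns (None, 0.0) if the curve never decreases.
    """
    best_window, best_drop = None, 0.0
    for (w0, f0), (_, f1) in zip(points, points[1:]):
        drop = f0 - f1
        if drop > best_drop:
            best_window, best_drop = w0, drop
    return best_window, best_drop


# Made-up points with a clear step after window value 3.
sample = [(1, 0.90), (2, 0.85), (3, 0.80), (4, 0.30), (5, 0.28)]
window, drop = largest_drop(sample)
print(window)
```

Under the reading of point 3, the returned window value would be an estimate of the number of abnormal flows when they are concentrated in the first half of the data set.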
Table 3 shows the details of part of the PART_MALICIOUS data sets, and Figs. 15 to 19 show the pattern graphs of the MIX_N&NEP, MIX_N&PS, and MIX_N&ST subsets.
Table 3: Fitting degree under different features
Analysis of results:
1. Figs. 13 to 16 show that the pattern graphs of different attack types are not consistent with one another.
2. These three attack types also satisfy the third conclusion of Experiment 2. If the abnormal flows are concentrated in the first half of the data set, a point can be found on the graph whose window value equals the number of abnormal flows. This point is characterized by the curve undergoing its largest drop immediately after it, as shown by the arrows in the figures. When the abnormal flows are concentrated in the second half of the data set, the number of normal flows equals the window value at which the curve has risen to its maximum, as shown by the arrows in the figures.
It should be understood that those of ordinary skill in the art may make improvements or changes in light of the above description, and all such improvements and changes shall fall within the scope of protection of the appended claims of the present invention.

Claims (8)

1. An abnormal flow pattern recognition method based on Benford's law, characterized in that: Benford's law is used to select, as the measure, the feature that best discriminates between a normal network environment and an abnormal network environment; mixed data sets of different attack types are obtained according to the measure; and each attack type is identified using the differences between the pattern graphs corresponding to the mixed data sets.
2. The abnormal flow pattern recognition method based on Benford's law according to claim 1, characterized in that the specific method of using Benford's law to select, as the measure, the feature that best discriminates between the normal network environment and the abnormal network environment comprises:
a normal session feature data set generation step: capturing normal network sessions as a normal session set, extracting the feature value of each feature in the normal session set, and taking the set of feature values of each feature as the normal session feature data set of that feature;
a point set obtaining step: dividing the normal session feature data set equally using each window value in a window value set; traversing each window value in the window value set in turn, obtaining the fitting degree within each independent block produced by the division, and averaging the fitting degrees of those blocks to obtain the average fitting degree corresponding to each window value; and obtaining a point set from all window values and the average fitting degree corresponding to each window value;
a best window value obtaining step: obtaining the corresponding fitting function from the point set, and taking as the best window value the abscissa of the point of the fitting function that is closest to zero;
a measure generation step: capturing abnormal network sessions as an abnormal session set, extracting the feature value of each feature in the abnormal session set, and taking the set of feature values of each feature as the abnormal session feature data set of that feature; dividing the abnormal session feature data sets and the normal session feature data sets according to the best window value, and obtaining the average fitting degree of each abnormal session feature data set and each normal session feature data set at the best window value; and, for each feature, obtaining the difference between the average fitting degree of the data set generated from the normal sessions and that of the data set generated from the abnormal sessions, the feature with the largest difference being the measure.
3. The abnormal flow pattern recognition method based on Benford's law according to claim 2, characterized in that: the features include source address, source port, destination address, destination port, packet size, byte size, source-to-destination packet size, source-to-destination byte size, destination-to-source packet size, destination-to-source byte size, relative start time, and duration.
4. The abnormal flow pattern recognition method based on Benford's law according to claim 2, characterized in that the method of obtaining the point set from all window values and the average fitting degree corresponding to each window value comprises: the abscissa of the point corresponding to each window value is that window value, the ordinate is the average fitting degree corresponding to that window value, and the set of points corresponding to all window values is the point set.
5. The abnormal flow pattern recognition method based on Benford's law according to claim 2, characterized in that the specific method of obtaining the mixed data sets of different attack types according to the measure and identifying each attack type using the differences between the pattern graphs corresponding to the mixed data sets comprises:
a mixed data set generation step: using the measure, extracting the feature value sets of the normal network sessions and the abnormal network sessions under the measure as a normal data set and an abnormal data set; dividing the abnormal data set by attack type to obtain the abnormal data set corresponding to each attack type; and, for the abnormal data set corresponding to each attack type, mixing that abnormal data set with the normal data set according to each abnormal proportion in turn, to obtain the mixed data set corresponding to each abnormal proportion under each attack type;
an identification step: dividing each mixed data set equally using each window value in the window value set; traversing each window value in the window value set in turn, obtaining the fitting degree within each independent block produced by the division, and averaging the fitting degrees of those blocks to obtain the average fitting degree corresponding to each window value; obtaining the point set corresponding to each mixed data set from all window values and the average fitting degree corresponding to each window value; generating a pattern graph from the point set corresponding to each mixed data set; and, for each attack type, training with the pattern graphs of the mixed data sets corresponding to the different abnormal proportions as features, to obtain a classifier for each attack type.
6. The abnormal flow pattern recognition method based on Benford's law according to claim 5, characterized in that: the number of mixed data sets is the product of the number of attack types and the number of abnormal proportions.
7. The abnormal flow pattern recognition method based on Benford's law according to claim 5, characterized in that: each abnormal proportion is a manually set value.
8. The abnormal flow pattern recognition method based on Benford's law according to claim 5, characterized in that: the maximum value of the window value set is the data set length, and the minimum value of the window value set is 1.
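The measure selection described in claim 2 can be illustrated with a short sketch. It assumes that "difference" means the absolute difference between the average fitting degrees of the normal and abnormal data sets at the best window value. The function name `select_measure`, the feature names, and all numeric values are made up for illustration and are not results from the experiments.

```python
def select_measure(normal_fit, abnormal_fit):
    """Pick the feature whose normal/abnormal average fits differ the most.

    Both arguments map feature name -> average fit at the best window value.
    """
    gaps = {name: abs(normal_fit[name] - abnormal_fit[name]) for name in normal_fit}
    measure = max(gaps, key=gaps.get)
    return measure, gaps


# Made-up average fits per feature (illustrative only).
normal_fit = {"bytes": 0.02, "packets": 0.05, "duration": 0.10}
abnormal_fit = {"bytes": 0.60, "packets": 0.30, "duration": 0.15}

measure, gaps = select_measure(normal_fit, abnormal_fit)
print(measure)
```

With these illustrative inputs the feature with the widest gap is `bytes`, so it would be chosen as the measure.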
CN201810118785.0A 2018-02-06 2018-02-06 A kind of exception stream mode identification method based on Ben Fute laws Pending CN108234520A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810118785.0A CN108234520A (en) 2018-02-06 2018-02-06 A kind of exception stream mode identification method based on Ben Fute laws

Publications (1)

Publication Number Publication Date
CN108234520A true CN108234520A (en) 2018-06-29

Family

ID=62669755

Country Status (1)

Country Link
CN (1) CN108234520A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110321376A (en) * 2019-03-19 2019-10-11 北京信息科技大学 A kind of data fabrication investigation method based on Ben Fute law

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101557327A (en) * 2009-03-20 2009-10-14 扬州永信计算机有限公司 Intrusion detection method based on support vector machine (SVM)
US20160065613A1 (en) * 2014-09-02 2016-03-03 Sk Infosec Co., Ltd. System and method for detecting malicious code based on web
CN105429977A (en) * 2015-11-13 2016-03-23 武汉邮电科学研究院 Method for monitoring abnormal flows of deep packet detection equipment based on information entropy measurement

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIUYING SUN, ET.: "Detection and Classification of Malicious Patterns In Network Traffic Using Benford’s Law", 《PROCEEDINGS OF APSIPA ANNUAL SUMMIT AND COFERENCE 2017》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180629
