CN101163058A - Stream aggregation arbitrary sampling based packet measuring method - Google Patents

Stream aggregation arbitrary sampling based packet measuring method

Info

Publication number
CN101163058A
Authority
CN
China
Prior art keywords
time
message
sub
interval
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2007101901880A
Other languages
Chinese (zh)
Other versions
CN100558058C (en
Inventor
程光
龚俭
强士卿
丁伟
吴桦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CNB2007101901880A priority Critical patent/CN100558058C/en
Publication of CN101163058A publication Critical patent/CN101163058A/en
Application granted granted Critical
Publication of CN100558058C publication Critical patent/CN100558058C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a packet measurement method based on flow-aggregated random sampling. The measurement interval is divided into several sub-intervals, and a different match bit string is assigned to each sub-interval. Within each sub-interval, packets are sampled by random sampling of network flows. During sampling, the match bit string assigned to the sub-interval is matched against the hash value of each packet's network flow identifier. A single hash function generates all hash bit strings: its input is the flow identifier of the packet, and its output is a hash bit string of the same length as the match bit string. The match bit string assigned to the sub-interval is compared with the output hash bit string; if the two are identical the packet is sampled, otherwise it is discarded. With this method, only the packets of one subspace of the network flow identifier space are measured in each sub-interval, while the packets of the whole flow identifier space are measured over the complete measurement interval.

Description

Packet measuring method based on stream aggregation arbitrary sampling
Technical field
The present invention relates to methods for measuring network flows, and in particular to a packet measurement method based on flow-aggregated random sampling.
Background technology
Network traffic consists of a sequence of packets, and the set of packets that share the same flow identifier constitutes a network flow. The flow identifier can be defined in several ways; the usual definition is the 5-tuple of source IP address, destination IP address, source port, destination port and protocol. A network flow is the set of packets with the same flow identifier within the packet sequence that arrives at the measurement device during one measurement period. For example, suppose the packet sequence arriving within a measurement period is {a, a, b, b, c, a, e, d, f, b}, 10 packets in total, where a, b, c, d, e and f are flow identifiers. The network flows in this sequence are {a, 3} {b, 3} {c, 1} {d, 1} {e, 1} {f, 1}, where {a, 3} means that flow a has length 3, i.e. 3 packets belong to flow a.
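The grouping of a packet sequence into flows can be illustrated with a short Python sketch. It assumes, for illustration only, that each packet is represented by its flow identifier alone; the variable names are not part of the patent.

```python
from collections import Counter

# Packet sequence from the example above; each letter is a flow identifier.
packets = ["a", "a", "b", "b", "c", "a", "e", "d", "f", "b"]

# Group packets by flow identifier and count the packets in each flow.
flow_lengths = Counter(packets)

for flow_id in sorted(flow_lengths):
    print(flow_id, flow_lengths[flow_id])
```

Each printed pair corresponds to one {identifier, length} entry in the example.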
The number of flows in a high-speed network is very large, and many measurement studies have found that a small number of flows account for most of the network traffic. NLANR found that 1% of flows carry more than 80% of the traffic. Fig. 1 shows the network flow distribution measured over 5 minutes at a router of CERNET (China Education and Research Network). The figure shows that short flows are very numerous and long flows are few, while the bulk of the packets belong to the small number of long flows. Network flows are widely used in networking, and flow-based behaviour analysis and anomaly detection are active research topics. Two IETF working groups deal specifically with network flows, the Realtime Traffic Flow Measurement working group (RTFM) and the IP Flow Information Export working group (IPFIX); their task is to establish common standards for network flow measurement. Because the Internet keeps getting "bigger, faster, and subject to sudden change", sampling-based measurement has become a key research problem in network flow measurement.
There are two kinds of network flow sampling techniques: those that randomly sample packets and those that randomly sample network flows. In random packet sampling, the measurement device applies a random function to every arriving packet to decide whether to sample it, so every packet has the same sampling probability. The first systematic study of random packet sampling was carried out in 1993 by K. Claffy, now director of the network data analysis organization CAIDA. That study examined time-based and packet-arrival-order-based trigger mechanisms and the systematic, random and stratified sampling techniques, and compared the performance of the two trigger mechanisms and three sampling techniques using the distributions of packet length and traffic as examples.
In random flow sampling, the measurement device predefines an n-bit match bit string and applies a hash function to the flow identifier of each arriving packet to generate a hash bit string. Then n bits of this hash bit string are compared with the predefined n-bit match bit string; if the two bit strings are identical the packet is sampled, otherwise it is discarded. The hash function can be, for example, CRC32 or MD5; Jain (1992) and Cao (2000) analysed five flow-based hash algorithms. With random flow sampling every network flow has the same sampling probability, and the packets of a flow are either all sampled or all discarded.
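A minimal Python sketch of this hash-matching decision follows. It assumes CRC32 as the hash function (one of the options named above), takes the low n bits of the CRC as the hash bit string, and encodes the flow identifier as a byte string; these choices, and the names `flow_hash` and `is_sampled`, are illustrative assumptions rather than details given in the patent.

```python
import zlib


def flow_hash(flow_id: bytes, n_bits: int) -> int:
    """Return the low n_bits of the CRC32 of a flow identifier (assumed encoding)."""
    return zlib.crc32(flow_id) & ((1 << n_bits) - 1)


def is_sampled(flow_id: bytes, match: int, n_bits: int) -> bool:
    """Keep a packet only if its flow hash equals the n-bit match bit string."""
    return flow_hash(flow_id, n_bits) == match


# Hypothetical 5-tuple serialised as bytes; every packet of this flow
# produces the same hash, so the flow is sampled either entirely or not at all.
five_tuple = b"10.0.0.1|10.0.0.2|1234|80|tcp"
print(is_sampled(five_tuple, match=0b101, n_bits=3))
```

Because the decision depends only on the flow identifier, each flow matches exactly one of the 2^n possible match bit strings.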
As Fig. 1 shows, random packet sampling has two shortcomings. (1) Because flow sizes are heavy-tailed, sampling packets directly at random gives short flows a low probability of being sampled and long flows a high one, so information about short flows cannot be estimated and the sampled packets cannot be used for flow-related applications such as scan and DoS attack detection. (2) Because packet sampling cannot capture a flow completely, the sampled packets cannot be used for applications that need complete flow information, such as passive end-to-end monitoring of delay and jitter. The advantage of random packet sampling is that long flows are sampled with high probability and are estimated accurately.
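The bias toward long flows can be made concrete with a small calculation. The sketch below assumes that packets are sampled independently, each with the same probability p; under that assumption a flow of k packets is seen at least once with probability 1 - (1 - p)^k. The function name is illustrative.

```python
def p_flow_seen(flow_length: int, p: float) -> float:
    """Probability that at least one packet of a flow is sampled when each
    packet is sampled independently with probability p (assumed model)."""
    return 1.0 - (1.0 - p) ** flow_length


# Compare a single-packet flow with a long flow at a 1% sampling probability.
for length in (1, 10, 1000):
    print(length, p_flow_seen(length, 0.01))
```

Under this model a single-packet flow is seen with probability p, while a flow of 1000 packets is almost certain to be seen, which is the imbalance described in shortcoming (1).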
Random flow sampling avoids both problems of packet sampling. Its shortcoming is that, although it measures completely the packets of part of the flow identifier space, the packets of the rest of that space are discarded entirely. The behaviour of the network as a whole therefore cannot be identified, and the technique is hard to use for applications such as traffic accounting and network management. Its advantage is that it measures the information of a large number of network flows fairly accurately.
Summary of the invention
Combining the strengths and weaknesses of random packet sampling and random flow sampling, the present invention proposes a packet measurement method based on flow-aggregated random sampling. It differs from conventional methods in that the measurement interval is divided into several sub-intervals; within each sub-interval packets are sampled by random flow sampling, and a different match bit string is used in each sub-interval. The method can measure flow information across the complete flow identifier space, so the sampled packet data can be used for flow-related applications, for end-to-end network performance measurement, and for detecting and managing the global traffic behaviour of the network. The method provides important technical support for real-time traffic monitoring and security management in the next-generation high-speed Internet.
The technical scheme of the present invention is a packet measurement method based on flow-aggregated random sampling, characterized in that the measurement interval is divided into several sub-intervals, the number of sub-intervals being the inverse of the sampling ratio, and a different match bit string is assigned to each sub-interval. Within each sub-interval, packets are sampled by random flow sampling, in which the match bit string assigned to that sub-interval is matched against the hash value of the network flow identifier. One hash function processes all packet flow identifiers to generate the hash bit strings: its input is the packet's flow identifier and its output is a hash bit string of the same length as the match bit string. The match bit string assigned to the sub-interval is compared with the output hash bit string; if the two bit strings are identical the packet is sampled, otherwise it is discarded.
The measurement interval can be divided into sub-intervals as follows:
Let the length of the match bit string be n bits, where n is a positive integer; the value space of an n-bit string has size 2^n. The sampling probability of random flow sampling with an n-bit match bit string is therefore 1/2^n. The measurement device divides the measurement interval T in advance into as many equal parts as the inverse of the sampling ratio, i.e. 2^n equal parts. The sub-intervals are numbered in chronological order: the first sub-interval is numbered 0 and the last is numbered 2^n - 1.
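The partition can be sketched in Python as a function that maps a timestamp to its sub-interval number. The function name, the `start` parameter and the error handling are assumptions made for illustration; the patent only specifies 2^n equal parts numbered from 0.

```python
def sub_interval_index(now: float, start: float, T: float, n_bits: int) -> int:
    """Return the number (0 .. 2**n_bits - 1) of the sub-interval containing now."""
    count = 1 << n_bits          # number of sub-intervals, 2^n
    width = T / count            # length of one sub-interval, T / 2^n
    if not (start <= now < start + T):
        raise ValueError("timestamp lies outside the measurement interval")
    # Clamp to guard against floating-point rounding at the upper edge.
    return min(int((now - start) / width), count - 1)


# With T = 4 and n = 1 there are two sub-intervals of length 2.
print(sub_interval_index(1.5, start=0.0, T=4.0, n_bits=1))
print(sub_interval_index(2.0, start=0.0, T=4.0, n_bits=1))
```

A timestamp equal to a sub-interval boundary is assigned to the later sub-interval, matching the "greater than or equal to the end time" test used in the method steps.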
A different match bit string can be assigned to each sub-interval as follows:
Set up an array t of size 2^n that records the n-bit match bit string assigned to each sub-interval. Each element t(i) of the array is an n-bit string, where i is the number of the sub-interval and i is greater than or equal to 0 and less than or equal to 2^n - 1. The 2^n distinct numbers from 0 to 2^n - 1 are randomly assigned to the elements of this array of size 2^n; the number assigned to a sub-interval is the n-bit match bit string of that sub-interval.
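Assigning 2^n distinct numbers at random to 2^n array elements amounts to drawing a random permutation. The sketch below uses Python's `random.shuffle` for this; the function name and the optional `seed` parameter are illustrative assumptions.

```python
import random
from typing import List, Optional


def assign_match_strings(n_bits: int, seed: Optional[int] = None) -> List[int]:
    """Return array t: a random permutation of 0 .. 2**n_bits - 1, where
    t[i] is the match bit string used in sub-interval i."""
    rng = random.Random(seed)
    t = list(range(1 << n_bits))
    rng.shuffle(t)
    return t


t = assign_match_strings(3, seed=1)
print([format(value, "03b") for value in t])
```

Because t is a permutation, no two sub-intervals share a match bit string and every possible n-bit value is used exactly once.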
Packets can be sampled within each sub-interval by random flow sampling as follows:
Within each sub-interval, packets are sampled using random flow sampling. The measurement process uses one hash function to process all packet flow identifiers and generate hash bit strings: the input of the hash function is the packet's flow identifier and the output hash value is an n-bit string. This hash bit string is compared with the n-bit match bit string assigned to the sub-interval; if the two bit strings are identical the packet is sampled, otherwise it is discarded.
The steps of the packet measurement method based on flow-aggregated random sampling are as follows:
Step 1: Set the initial parameters
Set the length of the sampling measurement interval to T and divide the measurement interval T into 2^n equal parts, so that each sub-interval has length T/2^n; let the starting sub-interval sequence number itime equal 0;
Set up an array t of size 2^n in which each element t(i) is an n-bit number, where i is greater than or equal to 0 and less than or equal to 2^n - 1, and randomly assign the 2^n numbers from 0 to 2^n - 1 to the elements of the array t of size 2^n;
Select a hash function hash whose input is the flow identifier ID and whose generated hash value is n bits long, with a range greater than or equal to 0 and less than 2^n;
Set the packet memory space size S;
Let the start time of the current measurement interval be current and its end time be end = current + T;
Step 2: Calculate the end time of the current sub-interval
The end time, time, of the current sub-interval is the start time of the sub-interval plus the sub-interval length T/2^n: time = current + itime*T/2^n + T/2^n, where time is the end time of the current sub-interval, current is the start time of the current measurement interval, itime is the sequence number of the current sub-interval, T is the length of the measurement interval, n is a positive integer, T/2^n is the length of a sub-interval, and itime*T/2^n is the offset of the start of the current sub-interval from current; go to Step 3;
Step 3: Check whether the current sub-interval has ended
If the current time of the measurement device is greater than or equal to the end time, time, of the current sub-interval, write the data in the packet memory space M to hard disk, clear the packet records in M, set the number of packet records m in M to 0, and go to Step 7; otherwise, if the current time of the measurement device is less than the end time of the current sub-interval, go to Step 4;
Step 4: Sample the packets arriving at the measurement device
Wait for a packet to arrive at the measurement device. When a packet arrives, extract its flow identifier ID and use the hash function hash to calculate its hash value, value = hash(ID), where value is an n-bit number. If value equals the preset n-bit number t(itime) of the current sub-interval (where itime is the sequence number of the current sub-interval and t is the array of size 2^n), go to Step 5; otherwise return to Step 3;
Step 5: Process the sampled packet
Record the information of the sampled packet in the packet memory space M and increase the number of packet records m in the packet memory space by 1, i.e. m = m + 1. If m is less than S, return to Step 3; otherwise go to Step 6;
Step 6: Output the records in the packet memory space
Write the packet records in the packet memory space M to hard disk, clear the records in the packet memory space, set the number of packet records m to 0, and return to Step 3.
Step 7: Check whether the measurement has ended
If the sub-interval sequence number itime equals the total number of sub-intervals minus 1, i.e. 2^n - 1, stop measuring; otherwise set the new current sub-interval sequence number itime = itime + 1 and go to Step 2.
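The seven steps above can be sketched as a single Python function. This is a simulation under stated assumptions, not the patented implementation: packets are supplied as (arrival time, flow identifier) pairs in time order with arrival times no earlier than `start`, the "hard disk" is modelled as a list of flushed batches, empty buffers are not flushed, and the name `measure` and its parameters are chosen for illustration.

```python
from typing import Callable, Hashable, Iterable, List, Sequence, Tuple

Packet = Tuple[float, Hashable]  # (arrival time, flow identifier)


def measure(
    packets: Iterable[Packet],
    start: float,
    T: float,
    n_bits: int,
    t: Sequence[int],
    hash_fn: Callable[[Hashable], int],
    S: int,
) -> List[List[Packet]]:
    """Simulate the method; returns the batches written to 'disk'."""
    count = 1 << n_bits                  # number of sub-intervals, 2^n
    width = T / count                    # sub-interval length, T / 2^n
    disk: List[List[Packet]] = []        # stands in for the hard disk
    M: List[Packet] = []                 # packet memory space
    itime = 0                            # Step 1
    time_end = start + (itime + 1) * width          # Step 2
    for arrival, flow_id in packets:
        while arrival >= time_end:       # Step 3: sub-interval has ended
            if M:
                disk.append(M)
                M = []
            if itime == count - 1:       # Step 7: last sub-interval done
                return disk
            itime += 1
            time_end = start + (itime + 1) * width  # Step 2 again
        if hash_fn(flow_id) == t[itime]:  # Step 4: hash equals match string
            M.append((arrival, flow_id))  # Step 5: record sampled packet
            if len(M) >= S:               # Step 6: buffer full, flush
                disk.append(M)
                M = []
    if M:                                 # input exhausted before the end
        disk.append(M)
    return disk
```

A real measurement device would poll its own clock while waiting for packets; here the arrival timestamp of each packet plays the role of the device time in Step 3.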
Compared with the prior art, the present invention has the following advantages and beneficial effects:
(1) Only the packets of one network flow subspace are measured in each sub-interval, while the packets of the whole network flow identifier space can be measured over the complete measurement interval. On the one hand the method implements traffic sampling, which solves the measurement and storage problems of high-speed, high-volume network traffic. On the other hand it can measure continuous flow information, so the measured data can be used for network performance and flow-related applications. In addition, because any flow in the whole flow space may be sampled, the global traffic behaviour of the network can be monitored and managed.
(2) Sampling is used, and capturing only part of the packet information solves the measurement and storage problems of high-speed, high-volume network traffic;
(3) Continuous flow information can be measured, so the measured data can be used for flow-related network applications such as scan and DoS attack detection, and for passive end-to-end performance monitoring of delay, jitter and the like;
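The coverage property in advantage (1) follows from the match bit strings forming a permutation: every hash value, and hence every flow subspace, is measured in exactly one sub-interval. The sketch below builds the inverse mapping from hash value to sub-interval number; the variable names and the fixed seed are illustrative assumptions.

```python
import random

n_bits = 3
count = 1 << n_bits

# Array t as described in the method: a random permutation of 0 .. 2^n - 1.
rng = random.Random(0)
t = list(range(count))
rng.shuffle(t)

# Inverse of t: for each hash value, the sub-interval in which flows with
# that hash value are measured.
measured_in = {match: i for i, match in enumerate(t)}

for value in range(count):
    print(format(value, "03b"), "-> sub-interval", measured_in[value])
```

Each flow is therefore observed completely for a fraction 1/2^n of the measurement interval, and no flow subspace is left out.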
Description of drawings
Fig. 1 is the network flow distribution chart measured over 5 minutes at a router of CERNET (China Education and Research Network);
Fig. 2 is a schematic diagram of the packet measurement method based on flow-aggregated random sampling according to the present invention;
Fig. 3 is a flow chart of the packet measurement method based on flow-aggregated random sampling according to the present invention.
Embodiment
Fig. 1 shows the prior art and has been discussed in the background section above.
An embodiment of the invention is given with reference to Figs. 2 and 3. In Fig. 2 the flow ID is L bits long, so the flow ID space ranges from 0 to 2^L - 1. A hash function maps each flow ID of length L to a hash value of length n, so the hash space ranges from 0 to 2^n - 1, and each flow ID is mapped to one node of the hash space. The measurement time T is divided into 2^n equal parts, each time granule has length T/2^n, and each time granule is randomly mapped to one node of the hash space.
Let the packet sequence be:
A1 B1 B2 C1 B3 D1 A2 A3 C2
A1 denotes a packet in which the letter A is the flow identifier and 1 indicates the first packet of flow A; B1 denotes the first packet of flow B, A3 the third packet of flow A, and so on.
1 (Step 1): Set the initial parameters
Set the length of the sampling measurement interval to T = 4 and divide the measurement interval T into 2^1 = 2 equal parts (n = 1), so that each sub-interval has length T/2^n = 4/2 = 2; let the starting sub-interval sequence number itime equal 0;
Set up an array t of size 2^1 = 2 in which each element t(i) is an n = 1 bit number, where i is greater than or equal to 0 and less than or equal to 2^1 - 1 = 1, and randomly assign the 2 numbers from 0 to 1 to the elements of the array t of size 2: t(0) = 0, t(1) = 1;
Select a hash function hash whose input is the flow identifier ID (flows A, B, C, D, etc.) and whose generated hash value is 1 bit long, with a range greater than or equal to 0 and less than 2: hash(A) = 0, hash(B) = 1, hash(C) = 1, hash(D) = 0;
Set the packet memory space size S = 2;
Let the start time of the current measurement interval be current = 0 and its end time be end = current + T = 0 + 4 = 4; go to 2 (Step 2);
2 (Step 2): Calculate the end time of the current sub-interval
The end time, time, of the current sub-interval is the start time of the sub-interval, 0, plus the sub-interval length 4/2 = 2: time = current + itime*T/2^n + T/2^n = 0 + 0*2 + 2 = 2, where time is the end time of the current sub-interval, current is the start time of the current measurement interval, itime = 0 is the sequence number of the current sub-interval, T = 4 is the length of the measurement interval, n = 1, T/2^n is the length of a sub-interval, and itime*T/2^n is the offset of the start of the current sub-interval from current; go to 3 (Step 3);
3 (Step 3): Check whether the current sub-interval has ended
The current time of the measurement device is 0, which is less than the end time of the current sub-interval, time = 2; go to 4 (Step 4);
4 (Step 4): Sample the packets arriving at the measurement device
Wait for a packet to arrive at the measurement device. Packet A1 arrives; extract its flow identifier ID = A and use the hash function hash to calculate its hash value, value = hash(A) = 0. Since value = 0 equals the preset n-bit number t(0) = 0 of the current sub-interval, go to 5 (Step 5);
5 (Step 5): Process the sampled packet
Record the information of the sampled packet A1 in the packet memory space M and increase the number of packet records m in the packet memory space by 1, i.e. m = m + 1 = 0 + 1 = 1. Since m = 1 is less than S = 2, return to 6 (Step 3);
6 (Step 3): Check whether the current sub-interval has ended
The current time of the measurement device is 1, which is less than the end time of the current sub-interval, time = 2; go to 7 (Step 4);
7 (Step 4): Sample the packets arriving at the measurement device
Wait for a packet to arrive at the measurement device. Packet B1 arrives; extract its flow identifier ID = B and use the hash function hash to calculate its hash value, value = hash(B) = 1. Since value = 1 does not equal the preset n-bit number t(0) = 0 of the current sub-interval, go to 8 (Step 3);
8 (Step 3): Check whether the current sub-interval has ended
The current time of the measurement device is 2, which equals the end time of the current sub-interval, time = 2. Write the data A1 in the packet memory space M to hard disk, clear the packet records in M, set the number of packet records m in M to 0, and go to 9 (Step 7);
9 (Step 7): Check whether the measurement has ended
The current sub-interval sequence number is 0, which is less than the total number of sub-intervals (2) minus 1. Set the new current sub-interval sequence number itime = itime + 1 = 0 + 1 = 1 and go to 10 (Step 2);
10 (Step 2): Calculate the end time of the current sub-interval
The end time, time, of the current sub-interval is the start time of the sub-interval, 2, plus the sub-interval length 4/2 = 2: time = current + itime*T/2^n + T/2^n = 0 + 1*2 + 2 = 4, where time is the end time of the current sub-interval, current is the start time of the current measurement interval, itime = 1 is the sequence number of the current sub-interval, T = 4 is the length of the measurement interval, n = 1, T/2^n is the length of a sub-interval, and itime*T/2^n is the offset of the start of the current sub-interval from current; go to 11 (Step 3);
11 (Step 3): Check whether the current sub-interval has ended
The current time of the measurement device is 2, which is less than the end time of the current sub-interval, time = 4; go to 12 (Step 4);
12 (Step 4): Sample the packets arriving at the measurement device
Wait for a packet to arrive at the measurement device. Packet B2 arrives; extract its flow identifier ID = B and use the hash function hash to calculate its hash value, value = hash(B) = 1. Since value = 1 equals the preset n-bit number t(1) = 1 of the current sub-interval, go to 13 (Step 5);
13 (Step 5): Process the sampled packet
Record the information of the sampled packet B2 in the packet memory space M and increase the number of packet records m in the packet memory space by 1, i.e. m = m + 1 = 0 + 1 = 1. Since m = 1 is less than S = 2, return to 14 (Step 3);
14 (Step 3): Check whether the current sub-interval has ended
The current time of the measurement device is 3, which is less than the end time of the current sub-interval, time = 4; go to 15 (Step 4);
15 (Step 4): Sample the packets arriving at the measurement device
Wait for a packet to arrive at the measurement device. Packet C1 arrives; extract its flow identifier ID = C and use the hash function hash to calculate its hash value, value = hash(C) = 1. Since value = 1 equals the preset n-bit number t(1) = 1 of the current sub-interval, go to 16 (Step 5);
16 (Step 5): Process the sampled packet
Record the information of the sampled packet C1 in the packet memory space M and increase the number of packet records m in the packet memory space by 1, i.e. m = m + 1 = 1 + 1 = 2. Since m = 2 equals S = 2, go to 17 (Step 6);
17 (Step 6): Output the records in the packet memory space
Write the packet records B2 and C1 in the packet memory space M to hard disk, clear the records in the packet memory space, set the number of packet records m to 0, and return to 18 (Step 3).
18 (Step 3): Check whether the current sub-interval has ended
The current time of the measurement device is 4, which equals the end time of the current sub-interval, time = 4. The packet memory space M holds no packet records; set the number of packet records m in M to 0 and go to 19 (Step 7);
19 (Step 7): Check whether the measurement has ended
The current sub-interval sequence number is 1, which equals the total number of sub-intervals (2) minus 1; stop measuring.
The packets sampled in this example are therefore A1, B2 and C1.
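The outcome of this embodiment can be checked with a compact Python sketch. It assumes, as the trace above suggests, that one packet arrives per time unit starting at time 0, and it omits the buffer handling (M, m, S) because buffering does not change which packets are sampled; the variable names are illustrative.

```python
# Parameters of the embodiment.
HASH = {"A": 0, "B": 1, "C": 1, "D": 0}   # hash(A)=0, hash(B)=1, hash(C)=1, hash(D)=0
t = [0, 1]                                # t(0)=0, t(1)=1
T, n = 4, 1
width = T / 2 ** n                        # sub-interval length, 2

# Assumed arrival pattern: the i-th packet arrives at device time i.
arrivals = ["A1", "B1", "B2", "C1", "B3", "D1", "A2", "A3", "C2"]

sampled = []
for clock, packet in enumerate(arrivals):
    if clock >= T:                        # measurement interval has ended
        break
    itime = int(clock // width)           # current sub-interval number
    if HASH[packet[0]] == t[itime]:       # flow hash equals match bit string
        sampled.append(packet)

print(sampled)  # ['A1', 'B2', 'C1']
```

Under this arrival assumption, packets B3 onward reach the device after the measurement interval has ended and are not considered.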

Claims (4)

1. A packet measurement method based on flow-aggregated random sampling, characterized in that the measurement interval is divided into several sub-intervals, the number of sub-intervals being the inverse of the sampling ratio; a different match bit string is assigned to each sub-interval; within each sub-interval packets are sampled by random flow sampling, in which the match bit string assigned to that sub-interval is matched against the hash value of the network flow identifier; one hash function processes all packet flow identifiers to generate the hash bit strings, the input of the hash function being the packet's flow identifier and its output being a hash bit string of the same length as the match bit string; the match bit string assigned to the sub-interval is compared with the output hash bit string, and if the two bit strings are identical the packet is sampled, otherwise the packet is discarded.
2. The packet measurement method based on flow-aggregated random sampling according to claim 1, characterized in that the measurement interval is divided into sub-intervals as follows: let the length of the match bit string be n bits, where n is a positive integer; the value space of an n-bit string has size 2^n, so the sampling probability of random flow sampling with an n-bit match bit string is 1/2^n; the measurement device divides the measurement interval T in advance into as many equal parts as the inverse of the sampling ratio, i.e. 2^n equal parts; the sub-intervals are numbered in chronological order, the first sub-interval being numbered 0 and the last being numbered 2^n - 1.
3. The packet measurement method based on flow-aggregated random sampling according to claim 1 or 2, characterized in that a different match bit string is assigned to each sub-interval as follows: an array t of size 2^n is set up to record the n-bit match bit string assigned to each sub-interval; each element t(i) of the array is an n-bit string, where i is the number of the sub-interval and i is greater than or equal to 0 and less than or equal to 2^n - 1; the 2^n distinct numbers from 0 to 2^n - 1 are randomly assigned to the elements of this array of size 2^n, and the number assigned to a sub-interval is the n-bit match bit string of that sub-interval.
4. according to the described packet measuring method of claim 3, it is characterized in that the method for measurement step is specific as follows based on stream aggregation arbitrary sampling:
The first step: initial parameter is set
It is T that sampling measurement time interval length is set, and interval T is divided into 2 with Measuring Time nEqual portions, each subinterval time span is T/2 n, the interval sequence number itime of sub-time of measurement that establishes beginning equals 0;
It is 2 that a size is set nArray t, each the element t (i) among the array t is the number of a n bit, i is more than or equal to 0 and smaller or equal to 2 n-1, with 0 to 2 nBetween-1 2 nIt is 2 that number is assigned randomly to size nEach element of array t in;
Select a hash function hash, the inlet flow sign ID of hash function, the cryptographic Hash value that hash function generates is a n bit length, its span is more than or equal to 0 and less than 2 n
Message memory headroom size S is set;
If the interval time started of current Measuring Time is current, concluding time end=current+T;
Second step: calculate the interval concluding time of sub-time of current measurement
Interval concluding time time of sub-time of current measurement adds interval big or small T/2 of sub-time of measurement for measuring the interval time started of sub-time n, time=current+itime*T/2 n+ T/2 n, wherein time is the interval concluding time of sub-time of current measurement, and current is the interval time started of current Measuring Time, and itime is an interval sequence number of current sub-time, and T is the Measuring Time siding-to-siding block length, n is the positive integer more than or equal to 0, T/2 nBe the time span in subinterval, itime*T/2 nBe the time started in current sub-time interval, entered for the 3rd step;
Step 3: Determine whether the current measurement sub-time interval has ended
If the current time of the measuring device is greater than or equal to the end time time of the current measurement sub-time interval, output the data in the message memory space M and store it on the hard disk, clear the message records in the message memory space M, set the number m of message records in the message memory space M to 0, and go to step 7; otherwise, if the current time of the measuring device is less than the end time time of the current measurement sub-time interval, go to step 4;
Step 4: Sample the messages arriving at the measuring device
Wait for a message to arrive at the measuring device; when a message arrives, extract its flow identifier ID and use the hash function hash to calculate its hash value, value = hash(ID), where value is an n-bit number; if value equals the preset n-bit number t(itime) that corresponds to the current sub-time interval (where itime is the sequence number of the current sub-time interval and t is the array of size 2^n), go to step 5; otherwise return to step 3;
Step 5: Process the sampled message
Record the information of the sampled message in the message memory space M and increase the number m of message records in the message memory space by 1, i.e. m = m + 1; if the number m of message records in the message memory space is less than S, return to step 3; otherwise go to step 6;
Step 6: Output the records in the message memory space
Output the message records in the message memory space M to the hard disk, clear the records in the message memory space at the same time, set the number m of message records in the message memory space to 0, and return to step 3.
Step 7: Determine whether the measurement has ended
If the sequence number itime of the measurement sub-time interval equals 2^n - 1, that is, the total number of measurement sub-time intervals minus one, stop measuring; otherwise set the sequence number of the new current sub-time interval to itime = itime + 1 and go to step 2.
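The seven steps of claim 4 can be sketched in Python as a single loop. This is a simplified illustration under stated assumptions, not the patented implementation: it replays a list of (timestamp, flow identifier) pairs instead of reading a live clock, uses SHA-256 as a stand-in for the unspecified hash function, keeps the "hard disk" as an in-memory list, and skips flushing an empty memory space. The names `flow_hash`, `measure`, `disk` and `seed` are hypothetical.

```python
import hashlib
import random


def flow_hash(flow_id, n):
    """Map a flow identifier to an n-bit value in [0, 2^n).

    SHA-256 is an assumption; the claims only require an n-bit hash.
    """
    digest = hashlib.sha256(flow_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % (2 ** n)


def measure(packets, start, T, n, S, seed=None):
    """Replay (timestamp, flow_id) pairs, sorted by timestamp, through steps 1-7.

    Returns (t, disk), where disk is a list of (itime, records) batches.
    """
    # Step 1: initial parameters, including the random permutation t.
    rng = random.Random(seed)
    t = list(range(2 ** n))
    rng.shuffle(t)
    sub_len = T / 2 ** n
    disk = []  # stands in for the hard disk
    M = []     # message memory space; len(M) plays the role of m
    itime = 0
    # Step 2: end time of the current sub-time interval.
    time_end = start + itime * sub_len + sub_len

    for ts, flow_id in packets:
        # Steps 3 and 7: close every sub-time interval that has ended.
        while itime < 2 ** n and ts >= time_end:
            if M:  # assumption: an empty memory space is not written out
                disk.append((itime, M))
                M = []
            itime += 1
            time_end = start + itime * sub_len + sub_len  # step 2 again
        if itime >= 2 ** n:
            break  # the whole measurement time interval is over
        if ts < start:
            continue  # message arrived before the measurement began
        # Step 4: sample only flows whose hash matches t[itime].
        if flow_hash(flow_id, n) == t[itime]:
            M.append((ts, flow_id))  # step 5: record the message
            if len(M) >= S:          # step 6: memory space is full
                disk.append((itime, M))
                M = []
    if M:  # write out whatever remains when the input ends
        disk.append((itime, M))
    return t, disk


# Example: 800 messages over 8 seconds, 4 sub-intervals, buffer of 16 records.
packets = [(i * 0.01, f"10.0.0.{i % 50}:80") for i in range(800)]
t, disk = measure(packets, start=0.0, T=8.0, n=2, S=16, seed=7)
```

Driving the loop from message timestamps keeps the sketch deterministic and testable; a live measuring device would compare against its own clock in step 3, as the claim describes.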
CNB2007101901880A 2007-11-20 2007-11-20 Packet measuring method based on stream aggregation arbitrary sampling Expired - Fee Related CN100558058C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2007101901880A CN100558058C (en) 2007-11-20 2007-11-20 Packet measuring method based on stream aggregation arbitrary sampling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2007101901880A CN100558058C (en) 2007-11-20 2007-11-20 Packet measuring method based on stream aggregation arbitrary sampling

Publications (2)

Publication Number Publication Date
CN101163058A true CN101163058A (en) 2008-04-16
CN100558058C CN100558058C (en) 2009-11-04

Family

ID=39297892

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2007101901880A Expired - Fee Related CN100558058C (en) 2007-11-20 2007-11-20 Packet measuring method based on stream aggregation arbitrary sampling

Country Status (1)

Country Link
CN (1) CN100558058C (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8392434B1 (en) 2011-09-16 2013-03-05 International Business Machines Corporation Random sampling from distributed streams
CN104468276A (en) * 2014-12-18 2015-03-25 东南大学 Network traffic identification method based on random sampling multiple classifiers
CN104468276B (en) * 2014-12-18 2017-07-28 东南大学 Network flow identification method based on random sampling multi-categorizer
CN108282265A (en) * 2018-01-19 2018-07-13 广东工业大学 Error correction/encoding method, device, equipment and computer readable storage medium
CN108282265B (en) * 2018-01-19 2020-11-03 广东工业大学 Error correction encoding method, apparatus, device and computer readable storage medium
CN108811036A (en) * 2018-05-24 2018-11-13 上海连尚网络科技有限公司 Method and apparatus for showing wireless access point information
CN108811036B (en) * 2018-05-24 2020-07-31 上海连尚网络科技有限公司 Method and apparatus for displaying wireless access point information
CN110995694A (en) * 2019-11-28 2020-04-10 新华三半导体技术有限公司 Network message detection method, device, network security equipment and storage medium
CN110995694B (en) * 2019-11-28 2021-10-12 新华三半导体技术有限公司 Network message detection method, device, network security equipment and storage medium
CN112532444A (en) * 2020-11-26 2021-03-19 上海阅维科技股份有限公司 Data flow sampling method, system, medium and terminal for network mirror flow
CN112532444B (en) * 2020-11-26 2023-02-24 上海阅维科技股份有限公司 Data flow sampling method, system, medium and terminal for network mirror flow

Also Published As

Publication number Publication date
CN100558058C (en) 2009-11-04

Similar Documents

Publication Publication Date Title
CN100558058C (en) Packet measuring method based on stream aggregation arbitrary sampling
JP4480900B2 (en) System and method for measuring transfer time and loss rate in high capacity communication networks
US8391157B2 (en) Distributed flow analysis
US8923152B2 (en) Random data stream sampling
Wang et al. A data streaming method for monitoring host connection degrees of high-speed links
KR20110080465A (en) Flow data analyze method by parallel computation
Alshammari et al. Investigating two different approaches for encrypted traffic classification
MX2010006844A (en) Method of resolving network address to host names in network flows for network device.
CN111953552B (en) Data flow classification method and message forwarding equipment
CN101227318A (en) Method for overtrick real-time detection of high speed network flow quantity
CN110049061A (en) Lightweight ddos attack detection device and detection method on high speed network
CN106330611A (en) Anonymous protocol classification method based on statistical feature classification
Fusy et al. Estimating the number of active flows in a data stream over a sliding window
Canini et al. Per flow packet sampling for high-speed network monitoring
US7756128B2 (en) System and method for network analysis
CN101834763A (en) Multiple-category large-flow parallel measuring method under high speed network environment
Mori et al. Flow analysis of internet traffic: World Wide Web versus peer‐to‐peer
US7715317B2 (en) Flow generation method for internet traffic measurement
CN110932971A (en) Inter-domain path analysis method based on layer-by-layer reconstruction of request information
Dimitropoulos et al. The eternal sunshine of the sketch data structure
CN104836700B (en) NAT host number detection methods based on IPID and probability statistics model
CN112235254A (en) Rapid identification method for Tor network bridge in high-speed backbone network
Wang et al. Virtual indexing based methods for estimating node connection degrees
CN105610655A (en) Router flow monitoring and analyzing method
Kumar et al. Machine learning based traffic classification using low level features and statistical analysis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20091104

Termination date: 20121120