WO2006130830A2 - System and method for measuring traffic and flow matrices - Google Patents

System and method for measuring traffic and flow matrices

Info

Publication number
WO2006130830A2
Authority
WO
WIPO (PCT)
Prior art keywords
sketch
data packet
network
node
bitmap
Prior art date
Application number
PCT/US2006/021447
Other languages
English (en)
Other versions
WO2006130830A3 (fr)
Inventor
Qi Zhao
Abhishek Kumar
Jun Xu
Original Assignee
Georgia Tech Research Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Georgia Tech Research Corporation
Publication of WO2006130830A2
Publication of WO2006130830A3


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 43/00: Arrangements for monitoring or testing data switching networks
    • H04L 43/02: Capturing of monitoring data
    • H04L 43/022: Capturing of monitoring data by sampling
    • H04L 43/024: Capturing of monitoring data by adaptive sampling
    • H04L 43/026: Capturing of monitoring data using flow identification
    • H04L 43/06: Generation of reports
    • H04L 43/067: Generation of reports using time frame reporting
    • H04L 43/10: Active monitoring, e.g. heartbeat, ping or trace-route
    • H04L 43/106: Active monitoring using time related information in packets, e.g. by adding timestamps
    • H04L 43/16: Threshold monitoring

Definitions

  • Figure 1 illustrates an example of a network monitoring device
  • Figure 2 illustrates another example of a network monitoring device
  • Figure 3 illustrates an example of a network that includes the network monitoring device in accordance with the invention
  • Figures 4A and 4B illustrate an example of a counter array sketch data structure used with the network monitoring device
  • Figures 5A and 5B illustrate an example of a bit map sketch data structure used with the network monitoring device
  • Figure 6 illustrates a network in which the traffic matrix and/or the flow matrix for a network link can be measured using the network device
  • Figure 7 illustrates more details of each node of the network shown in Figure 6;
  • Figure 8 illustrates a method for monitoring a network link in accordance with the invention
  • Figure 9 illustrates a method for estimating a traffic matrix in accordance with the invention.
  • Figure 10 illustrates an example of a flow measurement interval
  • Figure 11 compares an observed average error with the predicted error for various load factors
  • Figure 12 illustrates a method for estimating a flow matrix by matching counter values in accordance with the invention
  • Figures 13A and 13B compare an estimated traffic matrix using the bit map sketch and counter array sketch, respectively, to an original traffic matrix estimates using the NLANR traces;
  • Figure 14 illustrates the impact of varying the threshold on the relative error of a traffic matrix estimation
  • Figure 15 illustrates a comparison of an average error of the bitmap scheme to the known sampling scheme
  • Figure 16 illustrates a flow matrix estimation error for various thresholds
  • Figure 17 illustrates a cumulative distribution of traffic with certain average errors.
  • TM: traffic matrix
  • a node could be a link, router, point of presence (PoP), or an autonomous system (AS)
  • the traffic matrix can be measured between any two nodes of the network, such as between two routers.
  • the total traffic volume traversing the network from the ingress node i ∈ {1, 2, ..., m} to the egress node j ∈ {1, 2, ..., n} is TM_ij.
  • the TM of a high-speed network is estimated over a specified measurement interval.
  • Figure 1 illustrates an example of a network monitoring device 20 that monitors a stream of data packets 22 over a link (not shown) of a communications network.
  • the device may include a collection unit 24 and an analysis unit 26.
  • the collection unit 24 collects sampled data packets or generates a sketch of the data packet stream over the link during a period of time while the analysis unit 26 analyzes the sampled data packets or the sketches.
  • the collection unit and the analysis unit may be implemented in software, but may also be implemented in hardware or in a combination of hardware and software.
  • the collection unit and analysis unit may be co-located or may be located at different physical locations and may be executed on the same piece of hardware or different pieces of hardware.
  • the sketch is a data structure that stores information about each data packet in a packet stream wherein the sketch is typically a smaller size than the actual data packets.
  • Two examples of a sketch that can be used with the network monitoring device are shown in Figures 4A-4B and 5A-5B, respectively.
  • the collection unit 24 may further comprise one or more of a selection process 28, an online streaming module 32 and a reporting process 30 which are each a piece of software code that implements the functions and methods described below.
  • the selection process performs a sampling of the packet stream and selects the sampled data packets that are communicated to the reporting process 30 that aggregates the sampled data packets and generates a report based on the sampled data packets.
  • the online streaming module monitors the packet stream and generates one or more sketches (shown in Figures 4A, 4B and/or 5 A, 5B) based on the data packets in the packet stream. As shown, the collection unit may communicate flow records/reports and the sketches to the analysis unit 26.
  • the analysis unit 26 may further comprise a collector 34, a digest collector 36 and one or more network monitoring applications 38, such as applications 38₁, 38₂ and 38₃.
  • the collector 34 receives the flow records/reports from the reporting process 30 while the digest collector 36 receives the sketches generated by the online streaming module 32.
  • the flow records/reports and/or the sketches may then be input into the monitoring applications 38 that perform different network monitoring functions.
  • one application can generate a data packet volume estimate over a link between a first and second node of a network
  • another application may generate a flow estimate between a flow source and a flow destination.
  • the network monitoring device 20 is a platform on which a plurality of different monitoring applications can be executed to perform various network monitoring functions and operations.
  • One or more examples of the monitoring applications that may be executed by the network monitoring device are described in more detail below.
  • FIG. 2 illustrates another example of a network monitoring device 20 that includes the collection unit 24 and the analysis unit 26.
  • the collection unit 24 includes only the online streaming module 32 that generates the sketches and periodically communicates the sketches to the analysis unit 26 that includes the offline processing module 38.
  • the network monitoring device 20 generates sketches and then analyzes those sketches to perform a network monitoring function.
  • Each data packet 22₁, 22₂, 22₃ and 22₄ may include a header 23₁, 23₂, 23₃ and 23₄ that may be used by the online streaming module 32 to generate the sketches as described in more detail below.
  • a user may submit a query, such as the traffic volume of a particular link, to the monitoring application and the monitoring application returns a result to the user, such as the traffic volume of the particular link based on the sketches generated by the online streaming module.
  • FIG. 3 illustrates an example of a network 40 that includes the network monitoring device in accordance with the invention.
  • the network may include one or more nodes 42, such as routers 42₁, 42₂, 42₃, 42₄ and 42₅, wherein the network may include one or more ingress routers and one or more egress routers.
  • the routers form the network that permits data packets to be communicated across the network.
  • the collection unit 24 is physically located at each link interface at each node 42 and is a piece of software executed by each router.
  • the analysis unit 26 may be physically located at a central monitoring unit 44, such as a server computer, that is remote from the nodes of the network and the collection unit is a piece of software executed by the central monitoring unit 44.
  • Each collection unit 24 may generate one or more sketches for its link during a particular time period and then communicate those sketches to the central monitoring unit as shown by the dotted lines in Figure 3.
  • FIGs 4A and 4B illustrate an example of a counter array sketch data structure 50 used with the network monitoring device.
  • the sketch data structure may include one or more counters 52 (Cl to Cb for example) in an array (known as a counter array) wherein each counter has an index number (1 to b in the example shown in Figure 4A) associated with the counter.
  • Each counter can be incremented based on the scanning of the data packets in the data stream performed by the online streaming module shown above.
  • each counter is associated with a particular set of one or more packet flow label attributes.
  • Each data packet flow label may include a field containing a source node address (an address of the source of the particular data packet), a field containing a destination node address (an address of the destination of the particular data packet), a field containing a source port (the application at the source from which the particular data packet is generated), a field containing a destination port (the application at the destination to which the particular data packet is being sent) and a field containing a protocol designation that identifies the type of protocol being used for the particular data packet, such as HTTP, UDP, SNMP, etc.
  • one counter for a particular network may be assigned to count all data packets (during a predetermined time interval) that are sent from a particular source node while another counter may be assigned to count all data packets (during a predetermined time interval) that are sent to a particular application in a particular destination node.
  • the assignment of each counter in the counter array is configurable depending on the particular network and the particular user and what the particular user needs to monitor in the network.
  • Figure 4B illustrates an example of a piece of pseudocode 54 that implements the counter array data structure shown in Figure 4A.
  • the pseudocode shows that, during an initialization process 56 (which may occur when the monitoring is started or to reset the sketch data structure when the predetermined time period (an epoch period such as 5 minutes) has expired), each counter in the counter array is reset to a default value that may be zero.
  • a hash function is performed on the flow label of the data packet (illustrated as h(pkt.flow_label) in the pseudocode) which generates an index value (ind) into the counter array and the counter at that index location is incremented by one to indicate that a data packet with the particular set of one or more packet flow label attributes was monitored by the particular online streaming module.
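As a concrete rendering of the update path just described (initialize, hash the flow label, increment the indexed counter), here is a minimal Python sketch. The CounterArraySketch class and the use of SHA-1 as the index hash are illustrative assumptions; the patent itself contemplates the H₃ hash family described below.

```python
import hashlib

class CounterArraySketch:
    """Counter array sketch of Figures 4A/4B: one counter per hash bucket."""

    def __init__(self, b):
        self.b = b                   # number of counters C1..Cb
        self.counters = [0] * b      # initialization: every counter reset to 0

    def _index(self, flow_label):
        # Illustrative stand-in for the patent's H3 hash: map the flow label
        # to an index in [0, b).
        digest = hashlib.sha1(flow_label.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.b

    def update(self, flow_label):
        # One hash, one read, one write per packet, as in the Figure 4B loop.
        self.counters[self._index(flow_label)] += 1

    def reset(self):
        # Invoked when the counter epoch (e.g., 5 minutes) expires.
        self.counters = [0] * self.b

# Usage: the flow label here is the 5-tuple from the packet header.
sketch = CounterArraySketch(b=1 << 20)
sketch.update("10.0.0.1|10.0.0.2|1234|80|TCP")
```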
  • the hash function used by the counter array may be the known H₃ family of hash functions described in an article by J. Carter and M. Wegman entitled "Universal classes of hash functions", Journal of Computer and System Sciences, pages 143-154 (1979), which is incorporated herein by reference.
  • Each hash function in H₃ is specified by a fixed r × w matrix Q defined over GF(2). Multiplication and addition in GF(2) are boolean AND (denoted ∘) and XOR (denoted ⊕), respectively.
  • For a w-bit input A = a_1 a_2 ... a_w, each bit of the r-bit output B is calculated as: B_i = (q_{i,1} ∘ a_1) ⊕ (q_{i,2} ∘ a_2) ⊕ ... ⊕ (q_{i,w} ∘ a_w), for i = 1, ..., r, where q_{i,j} is the (i, j) entry of Q.
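To make the GF(2) matrix definition concrete, the following sketch builds one member of the H₃ family, assuming the r × w matrix Q is drawn at random and then fixed. The bit-by-bit loop is written for clarity rather than speed; fast implementations typically precompute XOR tables per input byte.

```python
import random

def make_h3(r, w, seed=0):
    """One member of the H3 family: a fixed random r x w matrix Q over GF(2),
    with each row stored as a w-bit integer."""
    rng = random.Random(seed)
    rows = [rng.getrandbits(w) for _ in range(r)]

    def h(key):
        # key: a w-bit integer. Output bit i is XOR_j (q_ij AND a_j), i.e.,
        # the parity of the bitwise AND of row i with the key.
        out = 0
        for i, row in enumerate(rows):
            parity = bin(row & key).count("1") & 1
            out |= parity << i
        return out

    return h

# Usage: hash a 104-bit flow label down to a 20-bit counter array index.
h = make_h3(r=20, w=104)
index = h(123456789)   # an index in [0, 2**20)
```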
  • the bit map sketch and the counter array sketch may also use other hash functions, such as SHA1 or MD5, and the invention is not limited to any particular hash function. Now, an example of another sketch data structure is described.
  • Figures 5A and 5B illustrate an example of a bitmap sketch data structure 60 used with the network monitoring device.
  • the sketch data structure may include one or more bit positions 62 (1 to b for example) in an array (known as a bit map sketch) wherein each bit position has an index number (1 to b in the example shown in Figure 5A) associated with the bit position.
  • Each bit position can have a value of "0" or "1” and be set to "1" based on the scanning of the data packets in the data stream performed by the online streaming module shown above.
  • each bit position is associated with a particular data packet characteristic that uniquely identifies the data packet wherein that portion of the data packet is input to the hash function.
  • the invariant portion of a packet used as the input to the hash function must uniquely represent the packet and by definition should remain the same as the packet travels from one router to another. At the same time, it is desirable to make its size reasonably small to allow for fast hash processing. Therefore, the invariant portion of a packet consists of the packet header, where the variant fields (e.g., TTL, ToS, and checksum) are marked as 0's, and the first 8 bytes of the payload if there is any. As is known, these 28 bytes are sufficient to differentiate almost all non-identical packets.
  • Figure 5B illustrates an example of a piece of pseudocode 64 that implements the bit map data structure shown in Figure 5A.
  • the pseudocode shows that, during an initialization process 66 (which may occur when the monitoring is started or to reset the sketch data structure when the predetermined time period (an epoch period such as 5 minutes) has expired), each bit in the bit map is reset to a default value that may be zero.
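A minimal Python counterpart of the Figure 5B pseudocode, under the same caveat as before: SHA-1 stands in for the patent's H₃ hash, and the 28-byte invariant portion is passed in as raw bytes.

```python
import hashlib

class BitmapSketch:
    """Bitmap sketch of Figures 5A/5B: one bit per hash bucket."""

    def __init__(self, b):
        self.b = b
        self.bits = bytearray(b)     # initialization: all b bits reset to 0

    def _index(self, invariant):
        # Illustrative stand-in for the H3 hash over the invariant bytes.
        digest = hashlib.sha1(invariant).digest()
        return int.from_bytes(digest[:8], "big") % self.b

    def update(self, invariant):
        # invariant: the 28 invariant bytes (header with variant fields
        # zeroed, plus the first 8 payload bytes).
        self.bits[self._index(invariant)] = 1

    def zero_count(self):
        # U: the number of "0" bits, used by the estimator described later.
        return self.b - sum(self.bits)
```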
  • the hash function used for the bitmap sketch may be the same hash function used for the counter array sketch.
  • the bit map, once it reaches a threshold level of fullness, is stored in a memory or disk and then communicated to the analysis unit as described above. The bit map has the same performance characteristics as the counter array.
  • the sketch data structure trades off complete information about each data packet for a limited amount of information about each data packet in the link.
  • the sketches store some information about each data packet due to the hash function and the bitmap or counter array.
  • one of the monitoring applications that may be resident on the network monitoring device is an application that, based on monitoring of the data packets in a packet stream to generate a sketch by the collection unit, generates a traffic matrix or a flow matrix.
  • the traffic matrix provides an estimate of the volume of data packets over a link while the flow matrix provides an estimate of the volume of the data packets for a particular flow over a link.
  • FIG. 6 illustrates a network 70 in which the traffic matrix and/or the flow matrix for a network link can be measured using the network device.
  • the network 70 may include a source node 72 1 and a destination node 72 2 that may, for example, be a server computer in Seattle with a particular IP address and a server computer in Atlanta with a particular IP address.
  • the source node 72₁ may run an email application that generates data packets over a particular port using the SMTP protocol that are destined for an email application of the destination node 72₂.
  • a data packet from the source node to the destination node may pass through one or more intermediate nodes 74 that may be routers in the example shown in Figure 6 and the data packets can pass over various different links between the nodes during the transit of the data packets from the source node to the destination node.
  • This network 70 may include the central monitoring unit 44 that houses the analysis unit 26 (not shown) that includes the monitoring application that generates traffic matrices and flow matrices for the network.
  • the monitoring application may be a piece of software executed by the central monitoring unit that is a computer-based device.
  • each node 74 may include the collection unit 24 associated with each link interface of each node.
  • a node connected to two different communications links would have a collection unit associated with each link or may have a single collection unit with two online streaming modules.
  • the collection unit 24 as shown in Figure 7 may be integrated into the node 74 or the collection unit may be a separate piece of hardware.
  • the traffic matrix or flow matrix is measured between a first observation point and a second observation point wherein the volume of data packets between the first and second observation points is estimated (traffic matrix) or the distribution of flow sizes (data packets in each flow) within the volume of data packets between the first and second observation points is estimated (the flow matrix).
  • the first observation point may be a link or node and the second observation point may also be a node or a link.
  • the flow matrix between the node in Seattle and the node in Atlanta may be estimated using the methods described herein.
  • Figure 8 illustrates a method 80 for monitoring a network link in accordance with the invention using a novel data streaming technique that processes a long stream of data items in one pass using a small working memory in order to answer a class of queries regarding the stream.
  • the method may use the counter array sketch or the bit map sketch described above to generate a traffic matrix or a flow matrix based on the data packets in the data stream monitored at the communications link.
  • the monitoring method shown in Figure 8 may be implemented by the collection unit and in particular the online streaming module, which is a plurality of lines of computer code executed by the node processor (the node being the hardware on which the online streaming module runs) that implements the steps described below.
  • In step 82, the online streaming module waits for a packet on the link and, when there is a packet, extracts the invariant portion (or the flow label information when the counter array sketch is used) from the data packet in step 84, as described above.
  • In step 86, the online streaming module performs a hash operation on the extracted data packet information and, as described above, generates an index into the sketch based on the data packet information so that, in step 88, the sketch position identified by the index is incremented (or a bit position is changed to "1" for the bit map sketch).
  • the counter array sketch or bit map sketch data structure may be stored in a smaller amount of memory than the actual data packets.
  • In step 90, the online streaming module determines if the sketch period has been exceeded (either the epoch period of the counter array is exceeded or the bit map sketch has exceeded a threshold level of fullness). If the sketch period is not exceeded, the method loops back to step 82 to check for the next data packet on the link. If the sketch period is exceeded, then in step 92, the sketch is stored on the node, the sketch data structure is reset, the new sketch data structure is filled with data, and the method loops back to step 82.
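Putting steps 82 through 92 together for the bitmap variant, a hedged sketch of the online streaming loop might look as follows. It reuses the BitmapSketch class from above; the fullness threshold and the page list are illustrative parameters, not values prescribed by the patent.

```python
def online_streaming(packet_source, sketch, threshold, pages):
    """Sketch of method 80 (Figure 8), bitmap variant: hash each packet's
    invariant bytes into the sketch and open a new page when the fullness
    threshold is reached."""
    set_bits = 0
    for invariant in packet_source:           # steps 82/84: next packet's bytes
        idx = sketch._index(invariant)        # step 86: hash to an index
        if not sketch.bits[idx]:              # step 88: set the indexed bit
            sketch.bits[idx] = 1
            set_bits += 1
        if set_bits / sketch.b >= threshold:  # step 90: page full?
            pages.append(bytes(sketch.bits))  # step 92: store the page locally
            sketch.bits = bytearray(sketch.b)
            set_bits = 0

# Usage with the BitmapSketch class above:
pages = []
online_streaming([b"pkt-1", b"pkt-2"], BitmapSketch(1 << 20), 0.7, pages)
```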
  • the sketches may be communicated to the central monitoring unit on demand when an estimate is requested by a user or periodically communicated to the central monitoring unit.
  • the monitoring application analyzes the sketches and generates an estimate of the volume of the data packets over the link based on the one or more bit map sketches (a traffic matrix element) or generates an estimate of the volume of data packets for a particular flow over the link based on the one or more counter array sketches (a flow matrix).
  • To generate an estimate of a traffic matrix element TM_ij, two bitmap sketches are collected from the corresponding nodes i and j and fed to the monitoring application, which estimates the traffic matrix element as described below in more detail. Since only the bitmap sketches from the two nodes i and j are needed, the monitoring application can estimate a submatrix using the minimum amount of information possible, namely only the bitmaps from the rows and columns of the submatrix. The estimation of the submatrix allows large ISP networks to focus on particular portions of their network. It also permits incremental deployment of the network monitoring device, since the existence of non-participating nodes does not affect the estimation of the traffic submatrix between all participating ingress and egress nodes.
  • the counter array sketch is used.
  • the counter array sketch permits the volume of data packets from a plurality of different flows (based on the flow labels) to be estimated.
  • the counter array sketch may also be used to estimate the traffic matrix, even though it is less cost-effective than the bitmap sketch for that purpose.
  • the network 40 has one or more nodes 42 that may each have the online streaming module (in the collection unit 24 shown in Figure 3) that generates the sketches (either the bitmap sketches or the counter array sketches) that are thousands of times smaller than the raw data packet traffic.
  • the sketches may be stored locally for a period of time, and will be shipped to a central monitoring unit 44 on demand.
  • the data analysis unit 26 running at the central monitoring unit 44 obtains the sketches needed for estimating the traffic matrix and flow matrix through queries.
  • the online streaming module (within the collection unit 24) and the analysis unit 26 are used in combination with the bitmap sketch.
  • the method for generating the bitmap sketch in the online streaming module was discussed above with respect to Figure 5B.
  • the bitmap sketch is stored in the online streaming module (at each node using the same hash function and same bitmap size b) until the sketch is filled to a threshold percentage over a time interval wherein the time interval may be known as a "bitmap epoch".
  • the bitmap sketch data structure is reset and then again filled with data.
  • FIG 9 illustrates a method 100 for estimating a traffic matrix in accordance with the invention that may be implemented by the monitoring application (that may be software executed by a processor that is part of the central monitoring unit) within the analysis unit 26 that may be within the central monitoring unit 44.
  • the monitoring application may determine if a traffic matrix estimate (TM_ij) between two nodes (i and j) during an interval t has been requested. If a traffic matrix estimate is requested, then in step 104, the monitoring application requests the bitmap(s) from the two nodes that are contained in or partly contained in the time interval t.
  • An example of the bitmaps from the two nodes over a time interval requested by the monitoring application is shown in Figure 10.
  • the analysis unit 26 receives the requested bitmaps from the nodes.
  • the monitoring application estimates the traffic matrix element (TM_ij) for the volume of data packets between the two nodes given the bitmaps delivered from the nodes. For purposes of an example, it is assumed that both node i and node j produce exactly one bitmap during the time interval (the measurement interval) when the traffic matrix is estimated.
  • the estimator may be adapted from the article "A Linear-time Probabilistic Counting Algorithm for Database Applications" by K.-Y. Whang et al., ACM Transactions on Database Systems, pages 208-229 (June 1990), which is incorporated herein by reference, in which the estimator is used for databases.
  • Let T_i denote the set of packets arriving at the ingress node i during the measurement interval, B_Ti the resulting bitmap, U_Ti the number of bits in B_Ti that are still "0", and b the size of the bitmap.
  • The estimator of D_Ti, the number of distinct elements (packets) in T_i, is: D̂_Ti = b ln(b / U_Ti).
  • TM_ij (the quantity to be estimated) is |T_i ∩ T_j|.
  • Let B_{Ti∪Tj} denote the result of hashing the set of packets T_i ∪ T_j into a single bitmap; it is computed as the bitwise OR of B_Ti and B_Tj. It can be shown that D̂_Ti + D̂_Tj - D̂_{Ti∪Tj} is an accurate estimator of TM_ij.
  • the computational complexity of estimating each element of the matrix is O(b) for the bitwise operation of the two bitmaps.
  • the overall complexity of estimating the entire m × n matrix is therefore O(mnb). Note that the bitmaps from other nodes are not needed when we are only interested in estimating TM_ij. This poses a significant advantage in computational complexity over existing indirect measurement approaches, in which the whole traffic matrix needs to be estimated even if we are only interested in a small subset of the matrix elements, due to the holistic nature of the inference method.
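A compact Python version of this estimation chain, assuming (as required) that the two bitmaps were built with the same hash function and the same size b: first the per-node distinct-packet estimate b·ln(b/U) from Whang et al., then the intersection estimate via the bitwise-OR union bitmap.

```python
import math

def distinct_estimate(bitmap, b):
    """Linear-counting estimator of Whang et al.: D_hat = b * ln(b / U),
    where U is the number of zero bits in the page."""
    u = b - sum(bitmap)
    if u == 0:
        raise ValueError("bitmap saturated: no zero bits left")
    return b * math.log(b / u)

def tm_element_estimate(bitmap_i, bitmap_j, b):
    """Estimate TM_ij = |T_i intersect T_j| from two same-sized bitmaps:
    D_hat(Ti) + D_hat(Tj) - D_hat(Ti union Tj)."""
    union = bytearray(x | y for x, y in zip(bitmap_i, bitmap_j))  # bitwise OR
    return (distinct_estimate(bitmap_i, b)
            + distinct_estimate(bitmap_j, b)
            - distinct_estimate(union, b))
```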
  • The first assumption is that the measurement interval is exactly one bitmap epoch. Practically, some network management tasks such as capacity planning and routing configuration need traffic matrices on long time scales such as tens of minutes or a few hours. Each epoch in our measurements is typically much smaller, especially for high speed links. Therefore the scheme needs to be extended to support arbitrary time scales.
  • The second assumption is that the bitmap epochs between nodes i and j are well aligned. Traffic going through different nodes can have rates orders of magnitude apart, resulting in some bitmaps being filled up very fast (hence a short bitmap epoch) and others very slowly (hence a long bitmap epoch). We refer to this phenomenon as heterogeneity. Because of heterogeneity, the bitmap epochs on different nodes may not be well aligned.
  • the above estimation was for the ideal case.
  • the general case for the estimation is explained.
  • Suppose the measurement interval spans exactly bitmap epochs 1, 2, ..., k1 at node i and bitmap epochs 1, 2, ..., k2 at node j, respectively.
  • the traffic matrix element TM_ij can then be estimated as: TM̂_ij = Σ_{q=1..k1} Σ_{r=1..k2} N̂_{q,r} · overlap(q, r),
  • where N̂_{q,r} is the estimate of the common traffic between the bitmap (also known as a page) q at node i and the page r at node j, and overlap(q, r) is 1 when the page q at node i overlaps temporally with the page r at node j and is 0 otherwise.
  • the timestamps of their starting times will be stored along with the pages in a process known as "multipaging".
  • the multipaging process eliminates the assumptions set forth above. For example, the multipaging process supports the measurements over multiple epochs so that the first assumption is eliminated.
  • an exemplary measurement interval 110 corresponds to the rear part of epoch 1, epochs 2 and 3, and the front part of epoch 4 at node i.
  • the exemplary measurement interval also corresponds to the rear part of epoch 1, epoch 2, and the front part of epoch 3 at node j.
  • the overlapping page pairs therefore contribute the estimates N̂_{1,1}, N̂_{2,1}, N̂_{2,2}, N̂_{3,2}, N̂_{3,3}, and N̂_{4,3}, based on their temporal overlap.
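A hedged sketch of the multipaging aggregation: each page carries the stored start timestamp (its end being the next page's start), page pairs are tested for temporal overlap, and the per-pair estimates are summed. The Page record is an illustrative assumption, and tm_element_estimate is reused from the earlier sketch.

```python
from collections import namedtuple

# A page carries the timestamps stored by the multipaging process.
Page = namedtuple("Page", ["start", "end", "bitmap"])

def overlap(q, r):
    """1 if page q overlaps temporally with page r, 0 otherwise."""
    return int(q.start < r.end and r.start < q.end)

def tm_element_multipage(pages_i, pages_j, b):
    """TM_ij = sum over page pairs (q, r) of N_hat(q, r) * overlap(q, r),
    with N_hat computed by tm_element_estimate from the sketch above."""
    total = 0.0
    for q in pages_i:
        for r in pages_j:
            if overlap(q, r):
                total += tm_element_estimate(q.bitmap, r.bitmap, b)
    return total
```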
  • In some situations, it is desirable to store the bitmaps for a long period of time for later troubleshooting, which could result in huge storage complexity for very high speed links, but sampling can be used to reduce this requirement significantly.
  • To sample the data packets, the impact on the accuracy should be minimized. It is desirable to use DRAM to conduct online streaming for very high speed links (e.g., beyond OC-192), and it is therefore important to sample only a certain percentage of the packets so that the DRAM speed can keep up with the data stream speed.
  • For example, the constraint can be one bitmap of 4 Mbits per second, and suppose 40 million packets arrive within one second.
  • One option is that the process does no sampling, but hashes all these packets into the bitmap, referred to as "squeezing". But the resulting high load factor of approximately 10 would lead to high estimation error.
  • An alternative option is to sample only a certain percentage p of the packets to be squeezed into the bitmap, and many different p values can be chosen. For example, we can sample 50% of the packets and thereby squeeze 20 million sampled packets into the bitmap, or we can sample and squeeze only 25% of them, so it is necessary to determine an optimal value of p. At one extreme, if we sample at a very low rate, the bitmap will only be lightly loaded and the error of estimating the total sampled traffic as well as its common traffic with another node (a traffic matrix element) becomes lower, but the variance introduced by the sampling process itself grows; at the other extreme, a high sampling rate keeps the sampling variance small but loads the bitmap heavily, raising its estimation error.
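To put numbers on this tradeoff using the constraint above: with b = 4 Mbits per page and T = 40 million packets per second, hashing everything gives a load factor t = T/b of about 10, while sampling with probability p gives t = pT/b. Hitting the default load factor t* = 0.7 adopted later in this document therefore requires p = t*·b/T = 0.7 × (4×10^6)/(4×10^7) ≈ 0.07, i.e., sampling roughly 7% of the packets.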
  • the optimal value of p may be determined based on the following principle:
  • each overlapping page pair may have its own optimal t* that achieves the best accuracy for estimating its common traffic. It is therefore impossible to adapt t* to satisfy every other node, as their needs (the t* for optimal accuracy) conflict with each other. Instead, a default t* is identified for every node such that the estimation accuracy for the most common cases is high.
  • the optimal p* and t* between pages α and β are determined given the expected traffic demand in a bitmap epoch. In fact, only one of them needs to be determined, since the other follows from Principle 1.
  • a sampling technique called consistent sampling is used to significantly reduce the estimation error. With consistent sampling, X̂, the estimator of the common traffic X, is the estimate of the sampled common traffic scaled up by the sampling rate: X̂ = X̂_s / p.
  • The variance of X̂, which is an almost unbiased estimator of X, is given by: Var[X̂] = (1/p²) Var[X̂_s] + X(1 - p)/p.
  • The above formula consists of two terms.
  • The first term corresponds to the variance from estimating the sampled traffic (equation 2 above) scaled by 1/p² (to compensate for the sampling), and the second term corresponds to the variance of the sampling process. Since these two errors are orthogonal to each other, the total variance is the sum of the individual variances.
  • Since the optimal t* value is a function of T and X, setting it according to some global default value may not be optimal all the time. Fortunately, we observe through extensive experiments that t*, the optimal load factor, does not vary much for different T and X values. In addition, we can observe from Figure 11 that the curve is quite flat in a large range around the optimal load factor. For example, the average errors corresponding to any load factor between 0.09 and 1.0 only fluctuate between around 0.012 and 0.015. Combining these two observations, we conclude that by setting a global default load factor t* according to some typical parameter settings, the average error will stay very close to the optimal values. Throughout this work the default load factor is set to 0.7.
  • the consistent sampling scheme works by fixing a hash function h' (different from the aforementioned h, which is used to generate the bitmap) that maps the invariant portion of a packet to an l-bit binary number.
  • the range of the hash function h' is {0, 1, ..., 2^l - 1}. A packet is sampled when h'(pkt) < p · 2^l, so that every node makes the identical sampling decision for the same packet.
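A minimal Python illustration of consistent sampling under these definitions. The choice of l = 16 and of MD5 as the stand-in for h' are assumptions; the essential property is only that every node applies the identical rule, so any given packet is sampled at all nodes or at none.

```python
import hashlib

L_BITS = 16  # l: output width of h' (an illustrative choice)

def h_prime(invariant):
    """Second hash h', distinct from the bitmap hash h, mapping the packet's
    invariant bytes to an l-bit value in {0, 1, ..., 2**L_BITS - 1}."""
    return int.from_bytes(hashlib.md5(invariant).digest()[:2], "big")

def consistently_sampled(invariant, p):
    """Sample iff h'(pkt) < p * 2**l. Every node applies this identical rule,
    so a given packet is sampled at all nodes or at none."""
    return h_prime(invariant) < p * (1 << L_BITS)

def scale_up(sampled_estimate, p):
    """X_hat = X_hat_s / p: scale the sampled-traffic estimate back up."""
    return sampled_estimate / p
```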
  • the flow matrix contains finer grained information than the traffic matrix and the counter array sketch described above may be used to estimate the flow matrix.
  • the flow matrix is the traffic matrix combined with the information on how each origin-destination (OD) element is split into flows of different sizes.
  • a flow matrix element FM_ij is the set of sizes of flows that travel from node i to node j during a measurement interval.
  • the counter array sketch can be used to estimate the traffic matrix as well.
  • the online streaming module (within the collection unit 24) and the analysis unit 26 are used in combination with the counter array sketch.
  • the method for generating the counter array sketch (using the flow label described above) in the online streaming module was discussed above with respect to Figure 4B.
  • the counter array sketch is stored in the online streaming module (at each node using the same hash function and the same array size b). Since this process requires only one hash operation, one memory read and one memory write (to the same location) per packet, the online streaming module can operate at OC-768 (40 Gbps) speed with off-the-shelf 10 ns SRAM and an efficient hardware implementation of the H₃ family of hash functions.
  • the counter array scheme is holistic in the sense that all ingress and egress nodes have to participate.
  • the counter epochs in this scheme need to be aligned with each other; that is, all counter epochs in all ingress and egress nodes need to start and end at approximately the same time.
  • the practical implication is that the counter array size b needs to be large enough to accommodate the highest link speed among all nodes (i.e., the worst case). Similar to the definition of "bitmap epoch", the amount of time the highest-speed link takes to fill up the counter array to a threshold percentage is referred to as a "counter epoch", or epoch for short.
  • the memory and storage complexities of the online streaming module for the counter array scheme are explored.
  • the counter epoch ranges from one to a few tens of seconds, and very accurate estimates can be achieved by setting the number of counters in the array to around the same order as the number of flows during an epoch. Therefore, for an OC-192 or an OC-768 link, one to a few million counters need to be employed for a measurement interval of one to a few seconds. If each counter has a "safe" size of 64 bits to prevent overflow, the memory requirement would be quite high.
  • Huffman type of compression can easily reduce the storage complexity to only a few bits per counter. Since the average flow length is about 10 packets (observed in our evaluation described below), the average storage cost per packet is amortized to less than 1 bit.
  • the counter arrays during that interval need to be shipped to the central monitoring unit for analysis. If the measurement interval spans more than one epoch, the sketches in each epoch are processed independently.
  • For each counter index k, the ingress and egress counter sums satisfy Σ_{i=1..m} C_{Ii}[k] ≈ Σ_{j=1..n} C_{Ej}[k], where C_{Ii} and C_{Ej} denote the counter arrays at ingress node i and egress node j. The approximation comes from the fact that the clock is not perfectly synchronized at all nodes and the packet traversal time from an ingress node to an egress node is non-zero. Both factors have only marginal impact on the accuracy of the estimation. In addition, the impact of this approximation on the accuracy of the data analysis process is further alleviated by the "elephant matching" nature of the data analysis process described below.
  • Figure 12 is a piece of pseudocode that illustrates a method 120 for estimating the flow matrix by matching counter values at index k, wherein the large and medium flows are matched. For each index k in all the counter arrays, the steps shown in lines 2-13 of Figure 12 are executed. In the matching process, the largest ingress counter value C_{I,max_i}[k] is matched with the largest egress counter value C_{E,max_j}[k]. The smaller of the two values is considered a flow from max_i to max_j (determined in lines 6 and 10 of the pseudocode), and this value is subtracted from both counter values (see lines 7 and 11 of the pseudocode), which reduces the smaller counter value to 0.
  • the computational complexity of the process shown in Figure 12 is O((m + n - 1)(log m + log n)) because the binary search operation (lines 3 and 4 of the pseudocode, which determine the largest ingress and egress counter values) dominates the complexity of each iteration and there are at most m + n - 1 iterations.
  • the overall complexity of estimating the flow matrix is O(b(m + n - 1)(log m + log n)).
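The matching loop of Figure 12 can be sketched in Python as follows. Plain max() scans replace the binary search that gives the O(log m + log n) per-iteration bound in the text (a heap would recover that bound), and the dict-of-arrays layout is an illustrative assumption.

```python
def match_counters(ingress, egress, k):
    """Figure 12 matching at counter index k. ingress/egress map node ids to
    counter arrays; returns (src, dst, size) estimates for medium/large flows."""
    in_vals = {i: arr[k] for i, arr in ingress.items()}
    out_vals = {j: arr[k] for j, arr in egress.items()}
    flows = []
    while True:
        max_i = max(in_vals, key=in_vals.get)    # largest ingress counter
        max_j = max(out_vals, key=out_vals.get)  # largest egress counter
        size = min(in_vals[max_i], out_vals[max_j])
        if size <= 0:
            break
        flows.append((max_i, max_j, size))       # a flow from max_i to max_j
        in_vals[max_i] -= size                   # the smaller counter hits 0
        out_vals[max_j] -= size
    return flows

# Usage: two ingress and two egress nodes, arrays of one counter each (k = 0).
print(match_counters({"i1": [70], "i2": [30]}, {"e1": [60], "e2": [40]}, 0))
# -> [('i1', 'e1', 60), ('i2', 'e2', 30), ('i1', 'e2', 10)]
```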
  • an exact flow matrix can be used to indicate intrusions such as DDoS attacks.
  • the estimation process in Figure 12 provides accurate estimation of the medium and large flow matrix elements, while some typical intrusions (e.g., DDoS attacks) consist of a large number of small flows.
  • the process shown in Figure 12 can still be used to provide valuable information about intrusions.
  • the flow label of the process can be selected to provide the valuable information.
  • the flow label can be the destination IP address of a packet so that the traffic of a DDoS attack becomes a large flow going through the network instead of a large number of small ones.
  • the method may include other known sampling or streaming processes as set forth in "New Directions in Traffic Measurement and Accounting", C. Estan and G. Varghese, Proceedings of ACM SIGCOMM, August 2002, which is incorporated herein by reference.
  • the traffic matrix can also be obtained by adding up the sizes of all the flows determined to go from node i to node j using the above process. This is in fact a fairly accurate estimation of the traffic matrix since the process tracks kangaroos (medium flows) and elephants (large flows) very accurately and thus accounts for the majority of traffic.
  • Two accuracy metrics are used: the Root Mean Squared Error, RMSE = sqrt((1/N) Σ_{i=1..N} (x̂_i - x_i)²), and the Root Mean Squared Relative Error, RMSRE = sqrt((1/N_T) Σ_{x_i > T} ((x̂_i - x_i)/x_i)²), where x_i are the true matrix elements and x̂_i their estimates.
  • The RMSE provides an overall measure of the absolute errors in the estimates, while the RMSRE provides a relative measure. The relative errors for small matrix elements are usually not very important for network engineering, so only matrix elements greater than some threshold T are used in the computation of the RMSRE (properly normalized). In the above equation, N_T refers to the number of matrix elements greater than T, i.e., N_T = |{ x_i : x_i > T, i = 1, 2, ..., N }|.
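The two metrics translate directly into Python; the threshold argument of rmsre implements the restriction to matrix elements greater than T.

```python
import math

def rmse(estimates, truths):
    n = len(truths)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths)) / n)

def rmsre(estimates, truths, threshold):
    """Relative error restricted to matrix elements greater than T."""
    pairs = [(e, t) for e, t in zip(estimates, truths) if t > threshold]
    if not pairs:
        return float("nan")
    return math.sqrt(sum(((e - t) / t) ** 2 for e, t in pairs) / len(pairs))
```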
  • NLANR trace-driven evaluation: the set of traces used consists of 16 publicly available packet header traces from NLANR.
  • the number of flows in these traces varies from 170K to 320K and the number of packets varies from 1.8M to 3.5M.
  • a synthetic scenario is constructed that appears as if these traces were collected simultaneously at all ingress nodes of a network.
  • the challenge in constructing this scenario lies in assigning the flows in the input stream at an ingress node to 16 different egress nodes such that the generated matrix will reflect some properties of real traffic matrices.
  • bitmap and the counter array For simplicity, we configure the size of the bitmap and the counter array to fit the data set size without adopting the enhancement techniques (i.e., multipaging and sampling).
  • Figures 13A and 13B compare the estimated traffic matrix elements using the bitmap scheme (Figure 13A) and the counter array scheme (Figure 13B) with the original traffic matrix elements.
  • the solid diagonal line in each figure denotes a perfect estimation, while the dashed lines denote an estimation error of ⁇ 5% so that points closer to the diagonal mean a more accurate estimate.
  • both schemes are very accurate, and the bitmap scheme is more accurate than the counter array scheme.
  • Figure 14 shows the impact of varying T on the RMSRE.
  • The results above reflect relative accuracy on a small time scale (one to several seconds for high speed routers), and they should not be directly compared with other reported results since those results are on much larger time scales.
  • the schemes usually can achieve much higher relative accuracy on larger time scales (e.g., tens of minutes) as shown below.
  • Figure 16 shows the RMSREs for various thresholds T. We observe a sharp downward trend in the value of the RMSRE for increasing threshold values. When the threshold is equal to 10 packets, the error drops to below 15%. The accurate estimation of these flows is very important since, in this trace, flows of size 10 and above (71,345 of them) account for 87% of the total traffic.
  • a one-hour router-level traffic matrix from a tier-1 ISP network is obtained to analytically evaluate the accuracy of the bitmap scheme.
  • traffic volume between each pair of backbone routers is evenly distributed over the one hour time period.
  • An hour's traffic is too large (we assume a conservative average packet size of 200 bytes) to fit in a single bitmap, and therefore the aforementioned multipaging technique is used.
  • Given a traffic matrix we split the traffic on each ingress/egress node into multiple pages of 4Mbits (i.e., 512KB) with load factor 0.7 (the default load factor described above). Then, we compute the standard deviation for each pair of overlapped pages using Theorem 1.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Measuring Pulse, Heart Rate, Blood Pressure Or Blood Flow (AREA)
  • Traffic Control Systems (AREA)

Abstract

A system and method for measuring traffic and flow matrices that, for a very high-speed traffic stream, can provide accurate estimates of the traffic flow volume using traffic digests that are orders of magnitude smaller than the traffic stream itself. The system and method may also include sampling methodologies.
PCT/US2006/021447 2005-06-02 2006-06-02 System and method for measuring traffic and flow matrices WO2006130830A2 (fr)

Applications Claiming Priority (10)

Application Number Priority Date Filing Date Title
US68656005P 2005-06-02 2005-06-02
US68657005P 2005-06-02 2005-06-02
US60/686,560 2005-06-02
US60/686,570 2005-06-02
US68965105P 2005-06-10 2005-06-10
US60/689,651 2005-06-10
US70919105P 2005-08-17 2005-08-17
US70919805P 2005-08-17 2005-08-17
US60/709,198 2005-08-17
US60/709,191 2005-08-17

Publications (2)

Publication Number Publication Date
WO2006130830A2 (fr) 2006-12-07
WO2006130830A3 WO2006130830A3 (fr) 2007-08-30

Family

ID=37482345

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/US2006/021512 WO2006130840A2 (fr) 2005-06-02 2006-06-02 System and method for data streaming
PCT/US2006/021447 WO2006130830A2 (fr) 2005-06-02 2006-06-02 System and method for measuring traffic and flow matrices

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/US2006/021512 WO2006130840A2 (fr) 2005-06-02 2006-06-02 System and method for data streaming

Country Status (1)

Country Link
WO (2) WO2006130840A2 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833134A (zh) * 2012-09-04 2012-12-19 PLA University of Science and Technology Load-adaptive network data stream traffic measurement method
WO2013155021A3 (fr) * 2012-04-09 2014-01-03 Cisco Technology, Inc. Distributed demand matrix computations
US9979613B2 (en) 2014-01-30 2018-05-22 Hewlett Packard Enterprise Development Lp Analyzing network traffic in a computer network

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8799754B2 (en) 2009-12-07 2014-08-05 At&T Intellectual Property I, L.P. Verification of data stream computations using third-party-supplied annotations
JP5937990B2 (ja) * 2013-03-12 2016-06-22 Nippon Telegraph and Telephone Corp. Traffic distribution estimation device, traffic distribution estimation system, and traffic distribution estimation method
US10084752B2 (en) 2016-02-26 2018-09-25 Microsoft Technology Licensing, Llc Hybrid hardware-software distributed threat analysis
US10608992B2 (en) 2016-02-26 2020-03-31 Microsoft Technology Licensing, Llc Hybrid hardware-software distributed threat analysis
US10656960B2 (en) 2017-12-01 2020-05-19 At&T Intellectual Property I, L.P. Flow management and flow modeling in network clouds

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030105976A1 (en) * 2000-11-30 2003-06-05 Copeland John A. Flow-based detection of network intrusions
US20040218529A1 (en) * 2000-11-01 2004-11-04 Robert Rodosek Traffic flow optimisation system
US20050039086A1 (en) * 2003-08-14 2005-02-17 Balachander Krishnamurthy Method and apparatus for sketch-based detection of changes in network traffic
US6873600B1 (en) * 2000-02-04 2005-03-29 At&T Corp. Consistent sampling for network traffic measurement

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE495500T1 (de) * 1999-06-30 2011-01-15 Apptitude Inc Method and device for monitoring traffic in a network
US6807156B1 (en) * 2000-11-07 2004-10-19 Telefonaktiebolaget Lm Ericsson (Publ) Scalable real-time quality of service monitoring and analysis of service dependent subscriber satisfaction in IP networks

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6873600B1 (en) * 2000-02-04 2005-03-29 At&T Corp. Consistent sampling for network traffic measurement
US20040218529A1 (en) * 2000-11-01 2004-11-04 Robert Rodosek Traffic flow optimisation system
US20030105976A1 (en) * 2000-11-30 2003-06-05 Copeland John A. Flow-based detection of network intrusions
US20050039086A1 (en) * 2003-08-14 2005-02-17 Balachander Krishnamurthy Method and apparatus for sketch-based detection of changes in network traffic

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013155021A3 (fr) * 2012-04-09 2014-01-03 Cisco Technology, Inc. Distributed demand matrix computations
US9106510B2 (en) 2012-04-09 2015-08-11 Cisco Technology, Inc. Distributed demand matrix computations
US9237075B2 (en) 2012-04-09 2016-01-12 Cisco Technology, Inc. Route convergence monitoring and diagnostics
US9479403B2 (en) 2012-04-09 2016-10-25 Cisco Technology, Inc. Network availability analytics
CN102833134A (zh) * 2012-09-04 2012-12-19 PLA University of Science and Technology Load-adaptive network data stream traffic measurement method
US9979613B2 (en) 2014-01-30 2018-05-22 Hewlett Packard Enterprise Development Lp Analyzing network traffic in a computer network

Also Published As

Publication number Publication date
WO2006130840A2 (fr) 2006-12-07
WO2006130830A3 (fr) 2007-08-30
WO2006130840A3 (fr) 2007-07-19

Similar Documents

Publication Publication Date Title
US9781427B2 (en) Methods and systems for estimating entropy
WO2006130830A2 (fr) System and method for measuring traffic and flow matrices
Katabi et al. A passive approach for detecting shared bottlenecks
US7779143B2 (en) Scalable methods for detecting significant traffic patterns in a data network
Kompella et al. Every microsecond counts: tracking fine-grain latencies with a lossy difference aggregator
Li et al. Low-complexity multi-resource packet scheduling for network function virtualization
Zhao et al. Data streaming algorithms for accurate and efficient measurement of traffic and flow matrices
Xu et al. ELDA: Towards efficient and lightweight detection of cache pollution attacks in NDN
US20090303879A1 (en) Algorithms and Estimators for Summarization of Unaggregated Data Streams
Chefrour One-way delay measurement from traditional networks to sdn: A survey
US11706114B2 (en) Network flow measurement method, network measurement device, and control plane device
Basat et al. Routing oblivious measurement analytics
Duffield et al. Trajectory sampling with unreliable reporting
Callegari et al. When randomness improves the anomaly detection performance
Zheng et al. Unbiased delay measurement in the data plane
Shahzad et al. Noise can help: Accurate and efficient per-flow latency measurement without packet probing and time stamping
Kong et al. Time-out bloom filter: A new sampling method for recording more flows
Wang et al. A new virtual indexing method for measuring host connection degrees
Singh et al. Hh-ipg: Leveraging inter-packet gap metrics in p4 hardware for heavy hitter detection
Cao et al. A quasi-likelihood approach for accurate traffic matrix estimation in a high speed network
Shahzad et al. Accurate and efficient per-flow latency measurement without probing and time stamping
JP7174303B2 (ja) Topology estimation system, traffic generation device, and traffic generation method
Zhang et al. Chat: Accurate network latency measurement for 5g e2e networks
Marold et al. Probabilistic parallel measurement of network traffic at multiple locations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06771942

Country of ref document: EP

Kind code of ref document: A2