CN116055362A - Two-stage Hash-Sketch network flow measurement method based on time window - Google Patents


Publication number
CN116055362A
Authority
CN
China
Prior art keywords
data packet
flow
hash
measurement
node position
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310035107.9A
Other languages
Chinese (zh)
Inventor
闫林林
葛俊成
卢东辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab
Priority to CN202310035107.9A
Publication of CN116055362A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/08Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876Network utilisation, e.g. volume of load or congestion level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2255Hash tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477Temporal data queries
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/02Capturing of monitoring data
    • H04L43/028Capturing of monitoring data by filtering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/04Processing captured monitoring data, e.g. for logfile generation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/06Generation of reports
    • H04L43/067Generation of reports using time frame reporting
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/50Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a two-stage hash-Sketch network flow measurement method based on a time window, which comprises the steps of firstly obtaining a data packet needing flow measurement through matching and filtering, then carrying out primary and secondary hash table flow characteristic recording and measuring flow, namely measuring separated large flows by using a two-stage hash table, and measuring flow ID data packets, namely small flows, which are expelled by the primary and secondary hash tables by adopting the existing CM-Sketch algorithm. The flow measurement mechanism can more accurately distinguish large flows from small flows, so that the large flows are measured in the first-level hash table and the second-level hash table, and the small flows are measured in the CM-Sketch, thus reducing measurement errors caused by collision of the large flows and the small flows, realizing fine-grained measurement of network flow and improving measurement accuracy.

Description

Two-stage Hash-Sketch network flow measurement method based on time window
Technical Field
The invention belongs to the technical field of network management, and particularly relates to a two-stage hash-Sketch network flow measurement method based on a time window.
Background
With the rapid development of network technology, today's networks are complex systems composed of various network devices. On the basis of the complex network infrastructure, various large-scale distributed services with strict requirements on QoS are also constructed. At the same time, in order to meet the requirements of different applications, the network configuration is also very complex, the traffic generated by each application is bursty and varies greatly with time, which makes traffic difficult to predict and manage. The network measurement is taken as an important link of network management, can provide a large amount of basic information for a network manager, and is an important reference basis for decision making of the network management. Thus, it is important for network operators to achieve a fine-grained measurement of the network.
The network measurement can be divided into topology measurement, flow measurement and performance measurement according to different measurement objects, wherein the network topology measurement is to measure the topological relation between routers in the network and between routers and subnetworks; network performance measurement refers to measuring performance indicators in a network, such as time delay, packet loss rate, network bandwidth, network throughput, and the like; the network flow measurement is a network technology for carrying out statistics and analysis on the number, the size, the distribution and other characteristics of the network flow on network nodes such as a switch and the like, and is an important premise for analyzing network performance, understanding user behaviors, detecting network management behaviors such as network security events and the like. Therefore, related research on network traffic measurement (especially measurement on high-speed links) technology is widely focused at home and abroad. In high-speed networks, however, the massive network traffic that continues to arrive on the high-speed links presents great difficulties in measuring and analyzing network traffic. The numerous flows and packets present on the high-speed link pose a dual challenge to the computational and memory resources of conventional flow measurement methods. On one hand, network nodes such as a switch, a router and the like have limited high-speed memory and cannot accommodate huge network data; on the other hand, the processing speed of the packet cannot be matched to the ever-increasing port rate. This inevitably reduces the accuracy of the high speed link measurements by conventional measurement methods.
Current research on network flow measurement covers three main types of methods: 1) methods based on the switch flow table; 2) Counter-based methods; 3) Sketch-based methods.
The main principle of the switch-flow-table method is to use the counters in the switch flow table directly to count the size of each forwarded flow in the data plane; the control plane calculates the size of each flow by periodically requesting the counter values of the data plane flow table. However, the number of flow table entries in a switch is very limited (a common switch typically has only a few thousand), while the number of flows to be measured is very large, so fine-grained flows cannot be measured directly with the flow table.
The basic principle of Counter-based network flow measurement methods is to maintain K counters within the switch using an efficient data structure (e.g., a heap), each counter being responsible for counting the size of one flow. Since the number of flows is typically much larger than the number of counters, K counters can record information for at most K flows at a time. The number of large flows in an actual network is small, but this small number of large flows generates most of the network traffic. Counter-based methods therefore typically use their limited counters to store information about large flows only. A specific implementation may use a preemption mechanism to ensure that a large flow can preempt a counter occupied by a small flow. Counter-based methods can basically meet the requirements of a traffic planning algorithm for flow information, but their computational cost is high, which makes them difficult to apply to a high-speed data plane.
The basic principle of Sketch-based network flow measurement methods is to store statistical information about flows using various Sketches (e.g., the CM-Sketch). These methods rely on hash mapping, in which a hash function maps one range of a data field onto another. For example, a CM-Sketch can be regarded as a two-dimensional counter array in which each row corresponds to one hash function; each flow is hashed by its ID into one counter of each row, so that a massive amount of data to be measured can be mapped into a limited number of counters. However, such hash mapping inevitably produces hash collisions, which contaminate the data stored in the counters and cause measurement errors; these errors are particularly apparent when small flows and large flows collide.
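The two-dimensional counter array described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name CountMinSketch, the default depth and width, and the use of a salted BLAKE2b hash to emulate one hash function per row are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """A depth-by-width counter array; each row uses its own hash function."""

    def __init__(self, depth=5, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, flow_id):
        # Assumption: salting one hash with the row number stands in for
        # `depth` independent hash functions.
        digest = hashlib.blake2b(
            flow_id.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, flow_id, count=1):
        # Every flow increments exactly one counter in each row.
        for r in range(self.depth):
            self.rows[r][self._index(r, flow_id)] += count

    def query(self, flow_id):
        # Collisions can only inflate a counter, so the minimum over the
        # rows is the least contaminated estimate.
        return min(self.rows[r][self._index(r, flow_id)] for r in range(self.depth))
```

With a width of 1, every flow shares a single counter per row, so a query for a one-packet flow returns the combined count of all flows. This is the contamination of small flows by large flows that the paragraph above describes.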
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a two-stage hash-Sketch network flow measurement method based on a time window, which addresses the loss of measurement accuracy caused by the limitations of the Sketch structure while also working within the limited calculation and storage resources of network nodes.
In order to achieve the above object, the two-stage hash-Sketch network flow measurement method based on a time window of the present invention is characterized by comprising the steps of:
(1) Matched filtering
For the arrived data packet, the network node sends the arrived data packet to a matching filter, and the matching filter filters the measurement data packet according to a matching filtering rule issued by a control plane;
(2) The first-level hash table records the current time window flow measurement characteristics and measures the flow
The filtered data packet enters a primary hash table, a stream ID is set as a key value of the hash table, and when the data packet reaches the primary hash table, the node position corresponding to the inserted hash table is calculated according to the stream ID hash;
in the first-level hash table, if the node position calculated by the data packet stream ID hash is empty, the data packet is the first data packet reaching the first-level hash table in the belonged stream, the data packet stream ID is directly inserted, and the time stamp and the measurement characteristic in the data packet are recorded;
if the node position calculated by the data packet stream ID hash is not null, the stream ID of the data packet is matched against the stream ID recorded at that node position; if the two stream IDs are the same, no hash collision has occurred and the measurement characteristics are counted directly; if they are different, a hash collision has occurred, and two sub-conditions are considered according to the time window: if the difference between the time stamp cur of the data packet and the time t recorded at the node position calculated by the hash, namely the conflict node position, is greater than or equal to a threshold value θ, namely cur-t ≥ θ, the next time window has been entered, so the stream ID and the measurement characteristics recorded at the conflict node position are evicted to the secondary hash table for recording, and the stream ID of the data packet, its time stamp cur (as the new time t) and its measurement characteristics are recorded at the conflict node position; otherwise, if cur-t < θ, the data packet is evicted to the secondary hash table for recording;
obtaining the flow of the corresponding flow ID data packet according to the measurement characteristics recorded by the first-level hash table;
(3) The secondary hash table records all time window flow measurement characteristics and measures flow
For the data packets evicted to the secondary hash table under the condition cur-t < θ, hash calculation is carried out on the stream ID of the data packet again to obtain its node position in the secondary hash table; if that node position is empty, the data packet stream ID is inserted directly and the measurement characteristics in the data packet are recorded; if the node position is not empty, the stream ID of the data packet is matched against the stream ID recorded at the node position; if they are the same, no hash collision has occurred and the measurement characteristics are counted directly; if they are different, a hash collision has occurred, the data packet is regarded as belonging to a small flow, and it is evicted to a CM-Sketch (Count-Min Sketch) module for flow measurement to obtain the flow of the data packet corresponding to the stream ID;
for the data packet stream ID and measurement characteristics evicted to the secondary hash table under the condition cur-t ≥ θ, hash calculation is carried out on the data packet stream ID again to obtain its node position in the secondary hash table; if that node position is empty, the data packet stream ID and the measurement characteristics are inserted directly; if the node position is not empty, the evicted stream ID is matched against the stream ID recorded at the node position; if they are the same, no hash collision has occurred, the evicted measurement characteristics are added to the measurement characteristics recorded at the node position, and the node position is updated with the sum; if they are different, a hash collision has occurred, the evicted measurement characteristics are compared with those recorded at the node position, the stream ID with the smaller measurement characteristic value and its measurement characteristics are evicted to the CM-Sketch (Count-Min Sketch) module for flow measurement to obtain the flow of the corresponding stream ID data packet, and the stream ID with the larger measurement characteristic value and its measurement characteristics are recorded at the conflict node position;
and obtaining the flow of the corresponding flow ID data packet according to the measurement characteristics recorded by the secondary hash table.
The object of the invention is achieved as follows:
in order to solve the problem that network node calculation and storage resources are limited and the problem that measurement accuracy is difficult to meet the requirement due to the limitation of a Sketch structure, the invention provides a two-stage Hash-Sketch network flow measurement method based on a time window. The flow measurement mechanism can more accurately distinguish large flows from small flows, so that the large flows are measured in the first-level hash table and the second-level hash table, and the small flows are measured in the CM-Sketch, thus reducing measurement errors caused by collision of the large flows and the small flows, realizing fine-grained measurement of network flow and improving measurement accuracy.
Drawings
FIG. 1 is a schematic diagram of a two-stage hash-Sketch network flow measurement method based on a time window of the present invention;
FIG. 2 is a flow chart of one embodiment of a two-stage hash-Sketch network flow measurement method based on a time window of the present invention;
FIG. 3 is a schematic diagram of the first-level hash table recording the current time window flow measurement characteristics;
FIG. 4 is a schematic diagram of the second-level hash table recording the accumulated flow measurement characteristics of all time windows.
Detailed Description
The following description of the embodiments of the invention is presented in conjunction with the accompanying drawings to give those skilled in the art a better understanding of the invention. It is to be expressly noted that in the description below, detailed descriptions of known functions and designs are omitted where they might obscure the present invention.
The existing network flow measurement method has a certain limitation, and the limited network node calculation and storage resources limit the selection and improvement of the network flow measurement method, so that the network node is difficult to accurately measure the flow in the network.
Analysis of traffic traces captured in a research network shows that the size distribution of flows is very close to a Zipf distribution, i.e., a very small number of large flows contribute most of the traffic in the network. The invention therefore adopts a time window mechanism: flows in the same time window are inserted into the first-level hash table, and when a data packet of the next time window arrives, the flow originally recorded in the first-level hash table is evicted to the second-level hash table and replaced by the packet of the new time window. Because large flows tend to arrive continuously in the network while small flows are interspersed among them, dividing network data packets periodically by time window effectively reduces the probability that a large flow is evicted into the Sketch by a small flow, and thus reduces the measurement error caused by hash collisions between large and small flows.
In addition, in order to measure large flows more accurately, the invention designs a two-stage hash table mechanism. A data packet that collides in the primary hash table is not evicted to the Sketch immediately, but is instead evicted to the secondary hash table. If it collides again in the secondary hash table and its flow measurement characteristic (for example, the number of packets) is smaller than that of the original node in the secondary hash table, it is considered a small flow and is evicted to the Sketch for measurement.
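The claim that a few large flows carry most of the traffic can be checked numerically. The snippet below is only an illustration under assumed parameters (10,000 flows, a Zipf exponent of 1, flow size proportional to 1/rank); none of these values come from the patent, and top_share is a hypothetical helper.

```python
def top_share(num_flows, top_fraction, exponent=1.0):
    """Fraction of total traffic carried by the largest flows under a Zipf law."""
    # Flow of rank r has size proportional to 1 / r**exponent.
    sizes = [1.0 / rank ** exponent for rank in range(1, num_flows + 1)]
    top_k = max(1, int(num_flows * top_fraction))
    return sum(sizes[:top_k]) / sum(sizes)


share = top_share(num_flows=10_000, top_fraction=0.01)
print(f"top 1% of flows carry {share:.0%} of the traffic")
```

Under these assumptions the largest 1% of flows account for roughly half of all traffic, which is why keeping large flows out of the Sketch matters for accuracy.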
Compared with the traditional network flow measurement method, the invention has the following characteristics:
1. providing a mechanism based on a time window to periodically count network traffic;
2. two-stage hash-Sketch mechanisms are proposed to accurately measure large and small flows, respectively.
In order to achieve the above purpose, the two-stage hash-Sketch network flow measurement method based on the time window mainly comprises the following steps:
(1) Primary hash table for recording current time window flow characteristics
(1.1) Design of the primary hash table
(1.2) Design of the time window
(2) Secondary hash table for recording all time window flow measurement characteristic accumulated value
(2.1) Design of the secondary hash table
(2.2) Design of small flow eviction
Fig. 1 is a schematic diagram of a two-stage hash-Sketch network flow measurement method based on a time window of the present invention.
As shown in Fig. 1, the innovative part of the present invention is mainly divided into three parts, whose functions are as follows:
1. matched filter
The data packet first passes through a matched filter, which filters the measurement packet according to rules issued by the control plane.
2. Primary hash table
The flow characteristics of the current time window are mainly recorded, and the flow ID, the arrival time of the first packet of the flow, and the characteristics of the flow, such as the number of packets, are mainly recorded in the hash table.
3. Secondary hash table
The cumulative value of the flow characteristics for all time windows is recorded.
The specific implementation steps are as follows:
In the network flow measurement scheme, the two-stage hash table is responsible for large flow measurement, while the CM-Sketch is responsible for small flow measurement. The data packets first pass through a matched filter, which filters the measurement packets according to rules issued by the control plane, for example measuring only flows from the 10 network segment or flows to the 20 network segment.
The data packet then enters the primary hash table, which records the flow characteristics of the current time window. The hash table mainly records the stream ID, the arrival time of the first packet of the stream, and the measurement characteristics of the stream, such as the number of packets. Data packets that collide in the primary hash table are evicted to the secondary hash table, which is responsible for recording the cumulative value of the flow characteristics for all time windows. A stream evicted by the secondary hash table is identified as a small stream and is measured using the CM-Sketch.
Fig. 2 is a flowchart of one embodiment of a two-stage hash-Sketch network traffic measurement method based on a time window of the present invention.
In this embodiment, as shown in fig. 2, the two-stage hash-Sketch network traffic measurement method based on the time window of the present invention includes the following steps:
step S1: matched filtering
For an arriving packet, the network node sends it to a match filter, which filters the measurement packet according to match filter rules issued by the control plane. In this embodiment, as shown in FIG. 1, for example, only a flow of 10 segments (Src: 10.0.0.0/8) as a source or a flow of 20 segments (Dst: 20.0.0.0/16) as a destination is measured.
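The matched filtering step can be sketched as below, using the two example rules from this embodiment. The rule representation, the dictionary-based packet, and the function name should_measure are hypothetical; the patent does not specify how the control plane encodes its rules.

```python
import ipaddress

# Hypothetical rule format: (packet field, network prefix).
# A packet is selected for measurement if any rule matches.
RULES = [
    ("src", ipaddress.ip_network("10.0.0.0/8")),
    ("dst", ipaddress.ip_network("20.0.0.0/16")),
]


def should_measure(packet, rules=RULES):
    """Return True if the packet matches at least one filtering rule."""
    return any(
        ipaddress.ip_address(packet[field]) in network for field, network in rules
    )
```

For example, a packet with source 10.1.2.3 is selected whatever its destination, while a packet from 1.1.1.1 to 20.1.0.1 is not, because 20.1.0.1 lies outside 20.0.0.0/16.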
Step S2: the first-level hash table records the current time window flow measurement characteristics and measures the flow
The filtered data packet enters a primary hash table, a stream ID is set as a key value of the hash table, and when the data packet reaches the primary hash table, the node position corresponding to the inserted hash table is calculated according to the stream ID hash. The stream ID may be a five-tuple of the stream, source destination IP, etc.
In the first-level hash table, if the node position calculated by the data packet stream ID hash is null, the data packet is the first data packet reaching the first-level hash table in the belonged stream, the data packet stream ID is directly inserted, and the time stamp and the measurement characteristic in the data packet are recorded. In this embodiment, the measurement feature in the recording packet is the packet number Count value set to 1, as shown in (1) in fig. 3.
If the node position calculated by the data packet stream ID hash is not null, the stream ID of the data packet is matched against the stream ID recorded at that node position. If the two are the same, no hash collision has occurred, and the measurement characteristics are counted directly. In the present embodiment, counting the measurement characteristics directly means adding 1 to the packet Count value, as shown in (2) in fig. 3. If the two are different, a hash collision has occurred, and two sub-conditions are considered according to the time window: if the difference between the time stamp cur of the data packet and the time t recorded at the node position calculated by the hash, namely the conflict node position, is greater than or equal to the threshold value θ, namely cur-t ≥ θ, the next time window has been entered, so the stream ID and the measurement characteristics recorded at the conflict node position are evicted to the secondary hash table for recording, and the stream ID of the data packet, its time stamp cur (as the new time t) and its measurement characteristics are recorded at the conflict node position, as shown in (3) in fig. 3; otherwise, if cur-t < θ, the data packet is evicted to the secondary hash table for recording, as shown in (4) in fig. 3.
And obtaining the flow of the corresponding flow ID data packet according to the measurement characteristics recorded by the first-level hash table.
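The four cases of step S2 can be summarized in a short Python sketch. It is an illustration under assumptions rather than the patented implementation: the names Node, PrimaryTable and insert, the tuple returned for evictions, the use of Python's built-in hash, and the initial Count of 1 for the packet that takes over a node are all choices made here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    flow_id: str
    t: float    # time stamp recorded when the flow took this node
    count: int  # measurement feature: packet count


class PrimaryTable:
    def __init__(self, size, theta):
        self.size = size
        self.theta = theta  # time window threshold
        self.nodes = [None] * size

    def insert(self, flow_id, cur):
        """Process one packet; return what must go to the secondary table, if anything."""
        i = hash(flow_id) % self.size
        node = self.nodes[i]
        if node is None:
            # Case (1): empty node, first packet of the flow.
            self.nodes[i] = Node(flow_id, cur, 1)
            return None
        if node.flow_id == flow_id:
            # Case (2): same flow, no collision.
            node.count += 1
            return None
        if cur - node.t >= self.theta:
            # Case (3): a new time window has started; evict the old record
            # and let the arriving packet take over the node.
            self.nodes[i] = Node(flow_id, cur, 1)
            return ("record", node.flow_id, node.count)
        # Case (4): collision inside the same window; evict the packet itself.
        return ("packet", flow_id, 1)
```

With a single-node table and θ = 10, two packets of flow A at times 0 and 1 are counted in place, a packet of flow B at time 2 is evicted as a packet, and a packet of flow B at time 12 evicts the record (A, 2) and takes over the node.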
Step S3: the secondary hash table records all time window flow measurement characteristics and measures flow
According to the working mechanism of the primary hash table, there are two cases in which a data packet, or a recorded data packet stream ID and its measurement characteristics, is evicted into the secondary hash table. In the first case, the data packet collides in the first-level hash table and the difference between the time stamp of the data packet and the time recorded in the original node is smaller than the threshold value, namely cur-t < θ, so the data packet is evicted to the second-level hash table for recording. For a data packet evicted to the secondary hash table under the condition cur-t < θ, hash calculation is carried out on its stream ID again to obtain its node position in the secondary hash table. If that node position is null, the data packet stream ID is inserted directly and the measurement characteristics in the data packet are recorded. If the node position is not null, the stream ID of the data packet is matched against the stream ID recorded at the node position. If the two are the same, no hash collision has occurred, and the measurement characteristics are counted directly; in this embodiment, this means adding 1 to the packet Count value, as shown in (4) in fig. 4, where the packet Count value becomes cnt_s1+1. If the two are different, a hash collision has occurred, the data packet is regarded as belonging to a small stream, and it is evicted to the CM-Sketch (Count-Min Sketch) module for flow measurement to obtain the flow of the corresponding stream ID data packet.
In the second case, the difference between the timestamp cur of the data packet and the time t recorded in the primary hash table at the node position calculated by the hash, namely the conflict node position, is greater than or equal to the threshold theta, that is, cur-t is greater than or equal to theta; the flow ID and measurement feature recorded at the conflict node position are then evicted to the secondary hash table. For a flow ID and measurement feature evicted into the secondary hash table under the condition that cur-t is greater than or equal to theta, the flow ID is hashed again to obtain the corresponding node position in the secondary hash table. If that node position is empty, the flow ID and the measurement feature are inserted directly. If that node position is not empty, the evicted flow ID is matched against the flow ID stored at that node position. If the two are the same, the measurement feature evicted to the secondary hash table is added to the measurement feature stored at the node position calculated by the hash (for example, the packet Count value plus 1), and the stored measurement feature is updated with the sum. If the two differ, the two measurement features are compared: the flow ID with the smaller measurement feature value, together with its measurement feature, is evicted to the CM-Sketch (Count-Min Sketch) module, which yields the flow of the data packets with that flow ID, and the flow ID with the larger measurement feature value, together with its measurement feature, is recorded at the node position. In the present embodiment, as shown in (5) of Fig. 4, the flow ID Fk with the smaller measurement feature value and its measurement feature cnt_s3 are evicted to the CM-Sketch module, and the flow ID F1 with the larger measurement feature value and its measurement feature cnt1 are recorded at the conflict node position.
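The merge of an evicted (flow ID, measurement feature) pair into the secondary hash table can be sketched as below. The names `Entry`, `crc_slot`, and `merge_evicted_entry`, the CRC32 hash, and the callback `to_cm_sketch` are assumptions made for illustration. The source does not say what happens when the two measurement features are equal; keeping the resident entry on a tie is an assumption of this sketch.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Entry:
    flow_id: str
    count: int  # measurement feature: packet count (as in the embodiment)


def crc_slot(flow_id: str, size: int) -> int:
    # Hypothetical hash choice; the source does not name a hash function.
    return zlib.crc32(flow_id.encode()) % size


def merge_evicted_entry(
    table: List[Optional[Entry]],
    flow_id: str,
    count: int,
    to_cm_sketch: Callable[[Tuple[str, int]], None],
    slot: Callable[[str, int], int] = crc_slot,
) -> None:
    """Merge a (flow ID, feature) pair evicted when cur-t >= theta."""
    i = slot(flow_id, len(table))
    node = table[i]
    if node is None:
        # Empty node position: insert flow ID and feature directly.
        table[i] = Entry(flow_id, count)
    elif node.flow_id == flow_id:
        # Same flow: add the evicted feature to the stored one.
        node.count += count
    elif count > node.count:
        # Collision, incoming flow is larger: it takes the node and the
        # smaller resident flow goes to the CM-Sketch module.
        table[i] = Entry(flow_id, count)
        to_cm_sketch((node.flow_id, node.count))
    else:
        # Collision, resident flow is larger (or equal: assumed tie rule).
        to_cm_sketch((flow_id, count))


# Usage: F1 accumulates across windows, then the smaller Fk is pushed out.
table: List[Optional[Entry]] = [None] * 4
sketched: List[Tuple[str, int]] = []
always_zero = lambda fid, size: 0
merge_evicted_entry(table, "F1", 10, sketched.append, always_zero)
merge_evicted_entry(table, "F1", 5, sketched.append, always_zero)
merge_evicted_entry(table, "Fk", 3, sketched.append, always_zero)
```

Keeping the larger of the two colliding flows in the table is what lets the secondary table converge on large flows while small flows drain into the sketch.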
The flow of the data packets with the corresponding flow ID is then obtained from the measurement features recorded in the secondary hash table.
As shown in Fig. 1, the CM-Sketch module hashes the flow ID five times; the measurement features at the five resulting node positions are 1545, 2134, 1297, 2341, and 1441, and the smallest of these, 1297, is selected as the flow measurement. Flow measurement by the CM-Sketch module belongs to the prior art and is not described in detail here.
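For readers unfamiliar with the prior-art structure, a generic Count-Min Sketch with five hash rows can be sketched as follows. This is a textbook-style illustration, not the module used in the patent: the class name `CountMinSketch`, the width of 1024 counters per row, and the SHA-256 based row hashes are assumptions.

```python
import hashlib
from typing import List


class CountMinSketch:
    def __init__(self, depth: int = 5, width: int = 1024) -> None:
        # depth = number of hash functions (five in the Fig. 1 example);
        # width is an arbitrary illustrative choice.
        self.depth = depth
        self.width = width
        self.rows: List[List[int]] = [[0] * width for _ in range(depth)]

    def _index(self, row: int, flow_id: str) -> int:
        # One hash per row, derived by prefixing the row number (assumption).
        digest = hashlib.sha256(f"{row}:{flow_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, flow_id: str, amount: int = 1) -> None:
        # Increment one counter in every row.
        for r in range(self.depth):
            self.rows[r][self._index(r, flow_id)] += amount

    def estimate(self, flow_id: str) -> int:
        # The minimum over the rows is the least-inflated counter.
        return min(self.rows[r][self._index(r, flow_id)]
                   for r in range(self.depth))


# The selection step from the Fig. 1 example: take the smallest counter.
counters = [1545, 2134, 1297, 2341, 1441]
flow_measurement = min(counters)  # 1297
```

Because counters are only ever incremented, each row can overestimate a flow when other flows share its counter but can never underestimate it, which is why the minimum is taken.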
The flow measurement mechanism of the invention can more accurately distinguish large flow from small flow, so that the large flow is measured in the hash table, and the small flow is measured in the CM-Sketch, thereby reducing measurement errors caused by the collision of large flow and small flow and realizing a flow measurement mechanism with fine granularity.
Illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments; all changes that fall within the spirit and scope of the invention as defined by the appended claims are protected.

Claims (2)

1. A two-stage Hash-Sketch network flow measurement method based on a time window, characterized by comprising the following steps:
(1) Matched filtering
The network node sends each arriving data packet to a matching filter, and the matching filter selects the data packets to be measured according to a matching filtering rule issued by the control plane;
(2) The first-level hash table records the current time window flow measurement characteristics and measures the flow
The filtered data packet enters the primary hash table, with the flow ID set as the key of the hash table; when a data packet reaches the primary hash table, the node position at which it is to be inserted is calculated by hashing its flow ID;
in the primary hash table, if the node position calculated by hashing the flow ID of the data packet is empty, the data packet is the first data packet of its flow to reach the primary hash table; the flow ID is inserted directly, and the timestamp and the measurement feature of the data packet are recorded;
if the node position calculated by hashing the flow ID of the data packet is not empty, the flow ID of the data packet is matched against the flow ID stored at that node position; if the two are the same, no hash collision has occurred and the measurement feature is counted directly; if the two differ, a hash collision has occurred, and two sub-cases are considered according to the time window: if the difference between the timestamp cur of the data packet and the time t recorded at the node position calculated by the hash, namely the conflict node position, is greater than or equal to a threshold theta, that is, cur-t is greater than or equal to theta, the next time window has been entered; the flow ID and measurement feature recorded at the conflict node position are evicted to the secondary hash table for recording, and the flow ID of the data packet, the timestamp cur as the time t, and the measurement feature of the data packet are recorded at the conflict node position; otherwise, that is, if cur-t is less than theta, the data packet is evicted to the secondary hash table for recording;
obtaining the flow of the corresponding flow ID data packet according to the measurement characteristics recorded by the first-level hash table;
(3) The secondary hash table records all time window flow measurement characteristics and measures flow
For a data packet evicted into the secondary hash table under the condition that cur-t is less than theta, the flow ID of the data packet is hashed again to obtain the corresponding node position in the secondary hash table; if that node position is empty, the flow ID is inserted directly and the measurement feature in the data packet is recorded; if that node position is not empty, the flow ID of the data packet is matched against the flow ID stored at that node position; if the two are the same, no hash collision has occurred and the measurement feature is counted directly; if the two differ, a hash collision has occurred, and the data packet is evicted to a CM-Sketch module for flow measurement to obtain the flow of the data packets with the corresponding flow ID;
for a flow ID and measurement feature evicted into the secondary hash table under the condition that cur-t is greater than or equal to theta, the flow ID is hashed again to obtain the corresponding node position in the secondary hash table; if that node position is empty, the flow ID and the measurement feature are inserted directly; if that node position is not empty, the evicted flow ID is matched against the flow ID stored at that node position; if the two are the same, no hash collision has occurred, the measurement feature evicted to the secondary hash table is added to the measurement feature stored at the node position calculated by the hash, and the stored measurement feature is updated with the sum; if the two differ, a hash collision has occurred, and the two measurement features are compared: the flow ID with the smaller measurement feature value, together with its measurement feature, is evicted to the CM-Sketch module to obtain the flow of the data packets with the corresponding flow ID, and the flow ID with the larger measurement feature value, together with its measurement feature, is recorded at the conflict node position;
and obtaining the flow of the corresponding flow ID data packet according to the measurement characteristics recorded by the secondary hash table.
2. The two-stage Hash-Sketch network flow measurement method based on a time window according to claim 1, wherein the measurement feature is the number of data packets.
CN202310035107.9A 2023-01-10 2023-01-10 Two-stage Hash-Sketch network flow measurement method based on time window Pending CN116055362A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310035107.9A CN116055362A (en) 2023-01-10 2023-01-10 Two-stage Hash-Sketch network flow measurement method based on time window

Publications (1)

Publication Number Publication Date
CN116055362A true CN116055362A (en) 2023-05-02

Family

ID=86123280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310035107.9A Pending CN116055362A (en) 2023-01-10 2023-01-10 Two-stage Hash-Sketch network flow measurement method based on time window

Country Status (1)

Country Link
CN (1) CN116055362A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117811951A (en) * 2024-02-29 2024-04-02 苏州大学 Network flow size measuring method based on Sketch
CN117811951B (en) * 2024-02-29 2024-05-31 苏州大学 Network flow size measuring method based on Sketch
CN117827851A (en) * 2024-03-06 2024-04-05 苏州元澄科技股份有限公司 Data processing structure for measuring flow base number and application thereof
CN117827851B (en) * 2024-03-06 2024-05-10 苏州元澄科技股份有限公司 Data processing structure for measuring flow base number and application thereof
CN118138496A (en) * 2024-04-30 2024-06-04 苏州元脑智能科技有限公司 Method and device for transmitting network measurement information and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination