CN113839835B - Top-k flow accurate monitoring system based on small flow filtration - Google Patents

Publication number
CN113839835B
Authority
CN
China
Prior art keywords
flow
stream
small
filter
hash
Prior art date
Legal status
Active
Application number
CN202111133411.4A
Other languages
Chinese (zh)
Other versions
CN113839835A (en)
Inventor
罗可
周国徽
熊兵
Current Assignee
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Changsha University of Science and Technology
Priority to CN202111133411.4A
Publication of CN113839835A
Application granted
Publication of CN113839835B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • H04L43/0888 Throughput

Landscapes

  • Engineering & Computer Science (AREA)
  • Environmental & Geological Engineering (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a Top-k flow accurate monitoring system based on small flow filtration, which comprises a small flow filter and a large flow monitoring table. The small flow filter filters out the small flows in network traffic, which reduces the storage overhead that exact storage of small-flow information would incur, improves the accuracy of flow size estimation, and solves the problem of conventional filters failing to filter small flows. The large flow monitoring table stores large-flow information exactly in order to track and count the packets of large flows, which improves Top-k flow identification accuracy; it comprises a single-hash multi-mapping algorithm and a probabilistic replacement strategy. The single-hash multi-mapping algorithm first calculates a fingerprint of the flow from the flow identifier, and then repeatedly selects a subset of bits from the fingerprint and rearranges them to generate multiple hash values, which reduces the cost of hash computation and gives Top-k flows enough candidate positions for storage. The probabilistic replacement strategy finds the smallest flow in the mapped buckets and generates a replacement probability from its packet count to decide whether to evict it, thereby freeing a storage position for a relatively larger flow. The invention filters small flows at a small storage cost and then monitors large flows precisely so that they are counted accurately; it has high space utilization and achieves a high Top-k flow identification rate at a small space cost.

Description

Top-k flow accurate monitoring system based on small flow filtration
Technical Field
The invention relates to the field of network measurement, and in particular to a high-accuracy Top-k flow identification system and a method for filtering small flows out of network traffic.
Background
The tasks of network measurement include Top-k flow identification, flow size estimation, flow count statistics, and so on; they provide key information for analyzing the characteristics of network traffic and are the basis of network management and monitoring. Top-k flow identification is generally defined as finding the k largest flows in network traffic, where flow size is defined as the number of packets in a flow. In general, a network measurement program assigns a counter to each arriving flow to track its size in order to identify the Top-k flows, but with millions of network flows it is difficult to maintain a counter for every flow. Meanwhile, to identify the Top-k flows correctly, the measurement program's flow size estimation error must be kept within a very small range. Therefore, finding a high-accuracy, low-overhead Top-k flow identification method while preserving processing speed has become an important challenge.
Currently, the main Top-k flow identification methods fall into three categories. The first category is sketch-based and comes in two structures: a sketch plus a min-heap, and a sketch plus a hash table. The first structure counts the sizes of all flows in a two-dimensional counter array and identifies the Top-k flows by tracking them in a min-heap. The second structure stores small flows in the sketch and monitors large flows in the hash table to reduce storage overhead, using a replacement algorithm to evict small flows from the hash table so that large flows are stored exactly. The second category is counter-based: it estimates flow sizes accurately by allocating counters to the large flows held in a cache. The third category is based on filtering: a filter removes the small flows in network traffic and the large flows are then extracted, which avoids the influence of small flows on the exact counting of large flows and improves the accuracy of flow size estimation.
However, these methods face the following problems:
1. They cannot meet the requirements of high accuracy and low memory overhead at the same time
With the increasing number of network devices on the Internet, the number of network flows has already reached the millions, and flow sizes follow a heavy-tailed distribution, i.e., a small number of large flows account for most of the packets in network traffic, while a large number of small flows account for only a small share of the packets. Consequently, a sketch-based method must use enough counters to reduce hash collisions, and each counter must have enough bits to avoid overflow, so storage overhead cannot be reduced. A counter-based method must likewise allocate enough counters to track a huge number of flows, and it can misjudge a small flow as a large flow, which degrades Top-k flow identification accuracy.
2. Over-filtering and filtering failure in conventional small flow filters
A conventional filter uses a two-dimensional counter array to record the number of packets of each flow that have arrived; once all the counters to which a flow maps reach a threshold T, the flow is allowed to pass the filter. However, after a period of time most of the counters in the filter will have reached the threshold T, so that all flows pass directly through the filter and the filter can no longer filter out small flows (the filtering failure problem). Existing filters reset their counters at a fixed period to prevent the counters from remaining at the threshold indefinitely. However, after the counters are reset, a large flow must raise its counters in the filter again before it can pass, so some of its packets are unexpectedly consumed by the filter and the packet count of the large flow is underestimated (the over-filtering problem).
Reference document: CN111262756A discloses a method for accurately measuring elephant flows in high-speed networks, which filters small flows out of the traffic through a sketch-based filter and extracts the large flows through an extractor based on cuckoo hashing, reducing the storage overhead spent on small flows and tracking large flows accurately so as to improve the large-flow identification rate. The scheme of the reference document does not solve the filtering failure problem, so after its small flow filter has worked for a period of time it can no longer filter out the small flows in network traffic, and its cuckoo-hash-based extraction method suffers from low large-flow identification accuracy.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a sketch-based small flow filter data structure that combines dual counters with periodic updating of the counters in the filter, with an update strategy designed for each of the two kinds of counter, so that the arrival behavior of each flow in each period is recorded accurately and the large and small flows in network traffic can be distinguished. Meanwhile, a large flow monitoring table is designed around a single-hash multi-mapping hash algorithm, which ensures that Top-k flows have enough candidate storage positions; a probabilistic replacement strategy evicts the smallest flow so that large flows are stored exactly, improving Top-k flow identification accuracy.
In order to solve the technical problems, the invention adopts the following technical scheme:
The invention provides a Top-k flow accurate monitoring system based on small flow filtration, which comprises:
a small flow filter, which distinguishes large flows from small flows in network traffic so that the large flows can be extracted and their packet counts tracked accurately; the small flow filter uses pairs of small counters that jointly record different packet-count information about a flow, which keeps memory overhead low, and the counters in the small flow filter are updated periodically; the two small counters record, respectively, the average number of packets of the flow arriving per period and the number of packets of the flow arriving in the current period;
a large flow monitoring table, which monitors the large flows in network traffic precisely and counts their packets exactly; the large flow monitoring table is a hash table composed of hash buckets, each of which can store multiple flows; each flow is mapped to multiple candidate hash buckets by a single-hash multi-mapping algorithm to ensure that Top-k flows have enough positions for storage, and a probabilistic replacement strategy is used so that large flows are monitored accurately; the single-hash multi-mapping algorithm generates multiple hash values from a single hash computation in order to map a flow to multiple hash buckets; the probabilistic replacement strategy replaces the smallest flow among all candidate positions with a certain probability when none of the candidate positions is empty;
the small flow filter consists of d arrays, each array consists of w buckets, and each bucket contains two counters, namely a new counter and an old counter; the new counter records the number of packets of the flow that have arrived in the current period; the old counter records the average number of packets of the flow that arrived per period in past periods;
the large flow monitoring table consists of r hash buckets, each of which contains c slots; each slot stores the fingerprint FP of one flow and a packet counter, i.e., each slot stores one flow.
The invention also provides a method based on the system, which comprises the following steps:
When a data packet arrives, the small flow filter maps the flow, according to its flow identifier, to one bucket in each of the d arrays through d pairwise-independent hash functions, obtains the smallest new counter value and the smallest old counter value among the d buckets, takes the smallest new counter value as the number of packets of the flow that have arrived in the current period, and takes the smallest old counter value as the average number of packets of the flow that arrived per period in the past. When the minimum new counter value of a flow reaches the threshold T, the flow is considered a newly arrived large flow and is allowed to pass through the filter into the large flow monitoring table. When the minimum old counter value of a flow reaches the threshold T, the flow is considered a continuously arriving large flow and is likewise allowed to pass through the filter into the large flow monitoring table.
What matters for the two counters is only whether they have reached the threshold T, and T is usually small, so each counter needs only a few bits, which keeps the filter small and its overhead low.
When a data packet of a flow arrives, the large flow monitoring table first calculates a fingerprint FP from the flow identifier through a hash function, then several times selects a fixed number of bits from the fingerprint FP at randomly chosen positions and arranges them, generating multiple sub-hash values that map the flow to multiple hash buckets. The large flow monitoring table then checks all mapped buckets: if the flow is already stored, the counter of the flow is incremented by 1; if the flow is not stored but there is an empty slot, the flow is inserted into an empty slot; if the flow is not stored and there is no empty slot, the smallest flow in the mapped buckets is found, and a replacement probability generated from the packet count of the smallest flow decides whether the newly arrived flow replaces it.
Further, the Top-k flow identification system performs the following operations:
1. Small flow filter insertion and reporting;
The small flow filter maps each arriving packet to one bucket in each counter array, reports whether the packet may pass the filter according to the minimum new counter value and the minimum old counter value, and decides whether to update the new counters.
2. Periodic updating of the counters in the small flow filter;
After the small flow filter has measured a certain number of data packets, all of its counters are updated. The new counter is reset directly to 0, and the old counter follows a halving update strategy, i.e., it is updated to the average of the new counter's value and the old counter's own value from the previous period (the average uses the new counter's value from before it is cleared).
3. Insertion into the large flow monitoring table;
When a data packet is passed to the large flow monitoring table, the fingerprint of the flow is first calculated from the flow identifier of the flow to which the packet belongs, the monitoring table is then queried with the fingerprint, and different update steps are carried out depending on the query result and on whether the hash buckets have an empty slot.
4. Replacement in the large flow monitoring table;
When a packet of a flow arrives, all hash buckets to which the flow maps are full, and the flow is not recorded in any of them, a replacement probability 1/(C_min + 1) is first generated from the packet count C_min of the smallest flow in the buckets and is then compared with a real number drawn at random between 0 and 1. If the replacement probability is greater than the random number, the smallest flow is replaced; otherwise, the packet of the arriving flow is discarded.
5. Top-k flow reporting by the large flow monitoring table;
The large flow monitoring table first sorts all flows in descending order of packet count, extracts the first k flows, adds the threshold T of the small flow filter to each count to obtain the final packet count of the flow, and reports these k flows to the server as the Top-k flows.
The invention has the following beneficial effects:
1. The invention builds the small flow filter from two small counters and updates the counters in the filter periodically. One counter records the number of packets of a flow arriving in the current period in order to identify newly arriving large flows, and the other records the average number of packets of the flow arriving per period in the past in order to identify continuously arriving large flows. Combining the two small counters to identify the large flows in network traffic reduces the storage space wasted on small flows, solves the problem of conventional filters failing to filter small flows, and improves Top-k flow identification accuracy.
2. The invention uses a single-hash multi-mapping algorithm to design a low-overhead, high-accuracy Top-k flow identification method. A fingerprint is first calculated from the flow identifier, and bits are then selected from the fingerprint and recombined into hash values that map into the hash table, which reduces the cost of hash computation. Meanwhile, each flow has multiple candidate hash buckets, which ensures that Top-k flows have enough storage positions to choose from, avoids the problem of a Top-k flow's size going unmonitored because no position is available to store it, and improves Top-k flow identification accuracy.
Drawings
In order to illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some embodiments of the invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a block diagram of a Top-k flow accurate monitoring system based on small flow filtration in the method of the present invention.
Fig. 2 is a data structure diagram of a small flow filter in the method of the present invention.
Fig. 3 is a data structure diagram of a large flow monitoring table in the method of the present invention.
Fig. 4 is a flow chart of the insertion and release of packets by a small flow filter in the method of the present invention.
Fig. 5 is a flow chart of the periodic updating of the counter in the small flow filter in the method of the present invention.
Fig. 6 is a flow chart of a large flow monitoring table packet insertion in the method of the present invention.
Fig. 7 is a flow chart of a stream replacement in a large stream monitoring table in the method of the present invention.
Fig. 8 is a flow chart of a large flow monitor table reporting Top-k flows in the method of the present invention.
Detailed Description
In order to better illustrate the invention, the invention is further verified by the following specific examples. The examples are presented herein only to more directly describe the invention and are merely a part of the invention and should not be construed as limiting the invention in any way.
As shown in fig. 1, an embodiment of the present invention provides a Top-k flow accurate monitoring system based on small flow filtering, including:
a small flow filter, which distinguishes large flows from small flows in network traffic so that the large flows can be extracted and their packet counts tracked accurately; the small flow filter uses pairs of small counters that jointly record different packet-count information about a flow, which keeps memory overhead low, and the counters in the small flow filter are updated periodically; the two small counters record, respectively, the average number of packets of the flow arriving per period and the number of packets of the flow arriving in the current period;
As shown in fig. 2, the small flow filter is composed of d arrays, each of which is composed of w buckets, and each bucket contains a pair of counters, i.e., a new counter and an old counter; both small counters are 4 bits in size.
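As a rough illustration of why 4-bit counters keep the filter small, its footprint can be estimated as d × w buckets × 2 counters × 4 bits. The sketch below is only an arithmetic aid; the function name and the values of d and w in the usage line are hypothetical and are not taken from the patent.

```python
def filter_memory_bytes(d: int, w: int, counter_bits: int = 4,
                        counters_per_bucket: int = 2) -> float:
    """Memory used by d arrays of w buckets, each holding a new and an old counter."""
    total_bits = d * w * counters_per_bucket * counter_bits
    return total_bits / 8


# Hypothetical sizing: 3 arrays of 65536 buckets each.
print(filter_memory_bytes(3, 65536))
```

With these assumed dimensions the filter occupies 196608 bytes (192 KiB), since each bucket packs its two 4-bit counters into a single byte.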
a large flow monitoring table, which monitors the large flows in network traffic precisely and counts their packets exactly; the large flow monitoring table is a hash table composed of hash buckets, each of which can store multiple flows; each flow is mapped to multiple candidate hash buckets by a single-hash multi-mapping algorithm to ensure that Top-k flows have enough positions for storage, and a probabilistic replacement strategy is used so that large flows are monitored accurately; the single-hash multi-mapping algorithm generates multiple hash values from a single hash computation in order to map a flow to multiple hash buckets; the probabilistic replacement strategy replaces the smallest flow among all candidate positions with a certain probability when none of the candidate positions is empty;
As shown in fig. 3, the large flow monitoring table is composed of r hash buckets, each of which contains c slots; each slot stores the fingerprint FP of one flow and a packet counter, i.e., each slot stores one flow. When a data packet P_fid arrives, the flow fingerprint FP is calculated by the hash function H(), and the flow is mapped to multiple buckets by the sub-hash functions subH_i(), one bucket per sub-hash function. The hash value of a sub-hash function subH_i() is calculated in two steps: (1) n bits at fixed positions are selected from the fingerprint FP, and the bits at the same positions are selected every time thereafter; (2) the n selected bit values are arranged to produce a new hash value.
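The two-step sub-hash computation can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the 32-bit fingerprint width, the use of BLAKE2b as H(), the choice n = 8, and the concrete bit positions in `BIT_POSITIONS` are all hypothetical.

```python
import hashlib


def fingerprint(flow_id: bytes, bits: int = 32) -> int:
    """The single hash computation per packet: derive a fixed-width fingerprint FP."""
    digest = hashlib.blake2b(flow_id, digest_size=8).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


# Hypothetical fixed bit positions, one tuple per sub-hash function subH_i.
# They are chosen once and then reused for every packet.
BIT_POSITIONS = [
    (0, 3, 6, 9, 12, 15, 18, 21),
    (1, 4, 7, 10, 13, 16, 19, 22),
    (2, 5, 8, 11, 14, 17, 20, 23),
]


def sub_hash(fp: int, positions: tuple) -> int:
    """Step 1: pick the bits of fp at `positions`; step 2: pack them into a new value."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((fp >> pos) & 1)
    return value


def candidate_buckets(flow_id: bytes, r: int) -> list:
    """Map a flow to one candidate bucket per sub-hash function, using one real hash."""
    fp = fingerprint(flow_id)
    return [sub_hash(fp, positions) % r for positions in BIT_POSITIONS]
```

Only `fingerprint` performs a real hash computation; each additional candidate bucket costs a handful of shifts and masks, which is the saving the single-hash multi-mapping algorithm aims for.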
The embodiment also provides a method based on the system, which comprises the following steps:
When a data packet arrives, the small flow filter first maps the flow, according to its flow identifier, to one bucket in each of the d arrays through d pairwise-independent hash functions, obtains the smallest new counter value and the smallest old counter value among the d buckets, takes the smallest new counter value as the number of packets of the flow that have arrived in the current period, and takes the smallest old counter value as the average number of packets of the flow that arrived per period in the past. When the minimum new counter value of a flow reaches the threshold T, the flow is considered a newly arriving large flow and is allowed to pass through the filter. When the minimum old counter value of a flow reaches the threshold T, the flow is considered a continuously arriving large flow and is allowed to pass through the filter.
When a data packet of a flow arrives, the large flow monitoring table calculates a fingerprint FP from the flow identifier through a hash function, then several times selects a fixed number of bits from the fingerprint FP at randomly chosen positions and arranges them, generating multiple sub-hash values that map the flow to multiple hash buckets. The large flow monitoring table then checks all mapped buckets: if the flow is already stored, the counter of the flow is incremented by 1; if the flow is not stored but there is an empty slot, the flow is inserted into an empty slot; if the flow is not stored and there is no empty slot, the smallest flow in the mapped buckets is found, and a replacement probability generated from the packet count of the smallest flow decides whether the newly arrived flow replaces it.
1. Insertion and release of data packets by the small flow filter;
As shown in fig. 4, the small flow filter maps each arriving packet to one bucket in each counter array, reports whether the packet may pass the filter according to the minimum new counter value and the minimum old counter value, and decides whether to update the new counters.
First, the header of the data packet is parsed and the flow identifier is extracted; then the flow is mapped by the d hash functions to one bucket in each of the d arrays of the small flow filter, the minimum new counter value and the minimum old counter value are obtained, and these minima are compared with the threshold T. When the minimum new value has reached the threshold, the packet is allowed to pass through the filter into the large flow monitoring table. Otherwise, the new values in the d mapped buckets are updated, and it is determined whether the minimum old value has reached the threshold. If it has, the packet is allowed to pass through the filter into the large flow monitoring table.
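The decision flow of fig. 4 might be sketched as below. Several details are assumptions made for illustration only: the d hash functions are simulated by salting BLAKE2b with the array index, "updating the new values" is taken to mean incrementing every mapped new counter with saturation at the 4-bit maximum, and "reaches the threshold" is implemented as `>= T`.

```python
import hashlib


class SmallFlowFilter:
    def __init__(self, d: int = 3, w: int = 1024, threshold: int = 8,
                 counter_bits: int = 4):
        self.d, self.w, self.T = d, w, threshold
        self.max_value = (1 << counter_bits) - 1   # 15 for 4-bit counters
        self.new = [[0] * w for _ in range(d)]     # packets in the current period
        self.old = [[0] * w for _ in range(d)]     # average packets in past periods

    def _index(self, flow_id: bytes, row: int) -> int:
        # Assumption: one salted hash per array stands in for d independent functions.
        h = hashlib.blake2b(flow_id, digest_size=8, salt=bytes([row])).digest()
        return int.from_bytes(h, "big") % self.w

    def insert(self, flow_id: bytes) -> bool:
        """Process one packet; return True if it may pass to the monitoring table."""
        idx = [self._index(flow_id, i) for i in range(self.d)]
        min_new = min(self.new[i][j] for i, j in enumerate(idx))
        min_old = min(self.old[i][j] for i, j in enumerate(idx))
        if min_new >= self.T:          # newly arrived large flow
            return True
        for i, j in enumerate(idx):    # otherwise count the packet in the filter
            if self.new[i][j] < self.max_value:
                self.new[i][j] += 1
        return min_old >= self.T       # continuously arriving large flow
```

Under these assumptions a fresh flow has its first T packets absorbed by the filter and every later packet in the same period released, while a flow whose old counters already sit at T is released immediately.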
2. Periodic updating of the counters in the small flow filter;
As shown in fig. 5, after the small flow filter has measured a certain number of packets, all of its counters are updated. The filter updates the counters bucket by bucket, starting from the first bucket of each array and continuing until the last bucket has been updated.
The new counter is reset directly to 0, and the old counter follows a halving update strategy, i.e., it is updated to the average of the new counter's value and the old counter's own value from the previous period (the average uses the new counter's value from before it is cleared).
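A minimal sketch of this period update, assuming the counters are held in two d × w lists of integers and that the average is rounded down (the rounding rule is not stated in the text):

```python
def periodic_update(new: list, old: list) -> None:
    """Halve-and-merge update applied to every bucket at the end of a period."""
    for row_new, row_old in zip(new, old):
        for j in range(len(row_new)):
            # The old counter must be updated before the new counter is cleared.
            row_old[j] = (row_old[j] + row_new[j]) // 2
            row_new[j] = 0


# Example with d = 2 arrays of w = 2 buckets.
new = [[6, 0], [3, 15]]
old = [[2, 4], [0, 15]]
periodic_update(new, old)
print(new, old)
```

After the call `new` is all zeros and `old` is `[[4, 2], [1, 15]]`. Because the old counter is an exponentially decaying average, a flow that stops sending sees its old counter halve each period and soon fall below T, while a flow that keeps arriving stays above it.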
3. Insertion into the large flow monitoring table;
As shown in fig. 6, when a data packet is passed to the large flow monitoring table, the fingerprint of the flow is first calculated from the flow identifier of the flow to which the packet belongs, the monitoring table is then queried with the fingerprint, and different update steps are carried out depending on the query result and on whether the hash buckets have an empty slot.
First, a fingerprint FP is generated from the flow identifier fid, the positions of the k candidate hash buckets are then obtained by the single-hash multi-mapping algorithm, and the flows in these buckets are queried in turn. If an empty slot is encountered, the flow is inserted there and the operation ends. If the flow itself is found, its counter is incremented by 1 and the operation ends. Otherwise, a new flow awaiting replacement is created and the flow replacement operation is entered.
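The insertion step can be sketched as below. The table layout (a list of buckets, each a list of slots that are either `None` or a `[fingerprint, count]` pair), the string return codes, and the initial count of 1 are assumptions for illustration. The sketch also searches all candidate buckets for the flow before looking for an empty slot, following the order given in the method steps, so that a flow is never stored twice.

```python
def insert_packet(table: list, fp: int, buckets: list) -> str:
    """Record one packet of the flow with fingerprint fp.

    Returns "hit" if the flow was found, "inserted" if it took an empty slot,
    or "full" if the replacement operation has to decide what happens next.
    """
    # 1. Look for the flow in every candidate bucket.
    for b in buckets:
        for slot in table[b]:
            if slot is not None and slot[0] == fp:
                slot[1] += 1
                return "hit"
    # 2. Otherwise take the first empty slot among the candidates.
    for b in buckets:
        for s, slot in enumerate(table[b]):
            if slot is None:
                table[b][s] = [fp, 1]
                return "inserted"
    # 3. No match and no vacancy: hand over to the replacement strategy.
    return "full"


# Hypothetical table with r = 4 buckets of c = 2 slots.
table = [[None, None] for _ in range(4)]
print(insert_packet(table, 0xAB, [1, 3]), insert_packet(table, 0xAB, [1, 3]))
```

The usage line prints `inserted hit`: the first packet claims a slot in bucket 1 and the second increments its counter to 2.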
4. Replacement in the large flow monitoring table;
As shown in fig. 7, when a packet of a flow arrives, all hash buckets to which the flow maps are full, and the flow is not recorded in any of them, a replacement probability 1/(C_min + 1) is first generated from the packet count C_min of the smallest flow in the buckets and is then compared with a real number drawn at random between 0 and 1. If the replacement probability is greater than the random number, the smallest flow is replaced; otherwise, the packet of the arriving flow is discarded.
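A minimal sketch of the probabilistic replacement follows, assuming every candidate slot holds a `[fingerprint, count]` pair (the table is full at this point) and that the replacing flow starts with a count of 1; the text does not say what count the new flow receives, so that initial value is an assumption. The `rng` parameter is a hypothetical hook that lets the random source be swapped out.

```python
import random


def try_replace(table: list, fp: int, buckets: list, rng=random) -> bool:
    """Evict the smallest candidate flow with probability 1 / (C_min + 1)."""
    # Locate the slot with the smallest packet count among all candidate buckets.
    b_min, s_min = min(
        ((b, s) for b in buckets for s in range(len(table[b]))),
        key=lambda pos: table[pos[0]][pos[1]][1],
    )
    c_min = table[b_min][s_min][1]
    if 1.0 / (c_min + 1) > rng.random():
        table[b_min][s_min] = [fp, 1]   # assumed initial count for the new flow
        return True
    return False                        # the arriving packet is discarded
```

The larger the smallest resident flow already is, the less likely a newcomer is to evict it: a resident with 3 packets is replaced with probability 1/4, one with 99 packets with probability 1/100. Small flows that slipped past the filter are therefore pushed out quickly, while established large flows are rarely disturbed.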
5. Top-k flow reporting by the large flow monitoring table;
As shown in fig. 8, the large flow monitoring table first sorts all flows in descending order of packet count, extracts the first k flows, adds the threshold T of the small flow filter to each count to obtain the final packet count of the flow, and then reports these k flows to the server as the Top-k flows.
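The report step can be sketched as follows, again assuming a table of buckets whose slots are `None` or `[fingerprint, count]` pairs. Adding T presumably compensates for the packets a flow spent in the filter before it was admitted; that reading is an interpretation, not a statement from the text.

```python
def report_top_k(table: list, k: int, threshold: int) -> list:
    """Return the k largest flows as (fingerprint, estimated packet count) pairs."""
    flows = [slot for bucket in table for slot in bucket if slot is not None]
    flows.sort(key=lambda slot: slot[1], reverse=True)
    return [(fp, count + threshold) for fp, count in flows[:k]]


# Hypothetical table contents and a filter threshold T = 4.
table = [[[1, 9], [2, 3]], [[3, 5], None]]
print(report_top_k(table, 2, 4))
```

For this example the call returns `[(1, 13), (3, 9)]`. When k is much smaller than the table, `heapq.nlargest` from the standard library would avoid sorting every flow.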
The specific embodiments described herein are offered by way of example only to illustrate the spirit of the invention. Those skilled in the art may make various modifications or additions to the described embodiments or substitutions thereof without departing from the spirit of the invention or exceeding the scope of the invention as defined in the accompanying claims.

Claims (6)

1. A Top-k flow accurate monitoring system based on small flow filtration, characterized by comprising:
a small flow filter, which distinguishes large flows from small flows in network traffic so that the large flows can be extracted and their packet counts tracked accurately; the small flow filter uses pairs of small counters that jointly record different packet-count information about a flow, which keeps memory overhead low, and the counters in the small flow filter are updated periodically; the two small counters record, respectively, the average number of packets of the flow arriving per period and the number of packets of the flow arriving in the current period;
a large flow monitoring table, which monitors the large flows in network traffic precisely and counts their packets exactly; the large flow monitoring table is a hash table composed of hash buckets, each of which can store multiple flows; each flow is mapped to multiple candidate hash buckets by a single-hash multi-mapping algorithm to ensure that Top-k flows have enough positions for storage, and a probabilistic replacement strategy is used so that large flows are monitored accurately; the single-hash multi-mapping algorithm generates multiple hash values from a single hash computation in order to map a flow to multiple hash buckets; the probabilistic replacement strategy replaces the smallest flow among all candidate positions with a certain probability when none of the candidate positions is empty;
the small flow filter consists of d arrays, each array consists of w buckets, and each bucket comprises a pair of counters, namely a new counter and an old counter; the new counter records the number of packets of the flow that have arrived in the current period; the old counter records the average number of packets of the flow that arrived per period in past periods;
the large flow monitoring table consists of r hash buckets, each of which contains c slots; each slot stores the fingerprint FP of one flow and a packet counter, i.e., each slot stores one flow.
2. A method based on the system of claim 1, comprising the steps of:
when a data packet arrives, the small flow filter first maps the flow, according to its flow identifier, to one bucket in each of the d arrays through d pairwise-independent hash functions, obtains the smallest new counter value and the smallest old counter value among the d buckets, takes the smallest new counter value as the number of packets of the flow that have arrived in the current period, and takes the smallest old counter value as the average number of packets of the flow that arrived per period in the past; when the minimum new counter value of a flow reaches the threshold T, the flow is considered a newly arrived large flow and is allowed to pass through the filter into the large flow monitoring table; when the minimum old counter value of a flow reaches the threshold T, the flow is considered a continuously arriving large flow and is allowed to pass through the filter into the large flow monitoring table;
when a data packet of a flow arrives, the large flow monitoring table first calculates a fingerprint FP from the flow identifier through a hash function, then several times selects a fixed number of bits from the fingerprint FP at randomly chosen positions and arranges them, generating multiple sub-hash values that map the flow to multiple hash buckets; the large flow monitoring table then checks all mapped buckets: if the flow is already stored, the counter of the flow is incremented by 1; if the flow is not stored but there is an empty slot, the flow is inserted into an empty slot; if the flow is not stored and there is no empty slot, the smallest flow in the mapped buckets is found, and a replacement probability generated from the packet count of the smallest flow decides whether the newly arrived flow replaces it.
3. The method according to claim 2, wherein the Top-k flow accurate monitoring system is composed of two modules, namely a small flow filter and a large flow monitoring table, and specifically comprises the following operations, of which a and b belong to the small flow filter module and c, d and e belong to the large flow monitoring table module:
a. small flow filter insertion and reporting;
the small flow filter maps each arriving data packet to one bucket in each counter array, reports whether the data packet can pass the filter according to the minimum new counter value and the minimum old counter value, and decides whether to update the new counters for that packet;
b. periodic updating of the counters in the small flow filter;
when the small flow filter has measured a set number of data packets, all counters in the small flow filter are updated; the new counter is reset to 0, and the old counter adopts a halving update strategy, i.e. it is set to the average of the new counter value (as it stood before the reset) and the old counter value from the previous period;
c. insertion into the large flow monitoring table;
when a data packet enters the large flow monitoring table, the fingerprint value of the flow is first computed from the flow identifier of the flow to which the data packet belongs; the monitoring table is then queried with this fingerprint value, and different update steps are carried out according to the query result and whether the hash buckets have an empty slot;
d. replacement in the large flow monitoring table;
when a data packet of a flow arrives, all hash buckets mapped by the flow are full and the flow is not recorded in any of them, a replacement probability 1/(C_min + 1) is first generated from the packet count C_min of the smallest flow in those buckets and then compared with a real number randomly generated between 0 and 1; if the replacement probability is larger than the real number, the smallest flow is replaced; otherwise the data packet of the flow is discarded;
e. Top-k flow reporting by the large flow monitoring table;
the large flow monitoring table first sorts all flows in descending order of packet count, extracts the first k flows, adds the threshold T of the small flow filter to each recorded count to obtain the final packet count of the flow, and reports these k flows to a server as the Top-k flows.
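Operations c, d and e can be sketched in Python as follows. This is an illustrative sketch under assumptions rather than the claimed implementation: the class name `LargeFlowTable`, the way candidate buckets are taken from shifted copies of the fingerprint, the starting count of 1 for a flow that wins a replacement, and the seeded random generator are all hypothetical choices.

```python
import hashlib
import random


class LargeFlowTable:
    """r hash buckets with c slots each; a slot is [fingerprint, packet_count]."""

    def __init__(self, r=64, c=4, n_maps=2, fp_bits=32, seed=None):
        self.r, self.c, self.n_maps, self.fp_bits = r, c, n_maps, fp_bits
        self.buckets = [[] for _ in range(r)]
        self.rng = random.Random(seed)

    def _fingerprint(self, flow_id):
        digest = hashlib.blake2b(flow_id.encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") & ((1 << self.fp_bits) - 1)

    def _candidates(self, fp):
        # Assumption: shifted copies of the one fingerprint pick the buckets.
        raw = [(fp >> (8 * i)) % self.r for i in range(self.n_maps)]
        return list(dict.fromkeys(raw))  # drop repeated bucket indices

    def insert(self, flow_id):
        fp = self._fingerprint(flow_id)
        cand = self._candidates(fp)
        # Case 1: the flow is already stored, so its counter grows by 1.
        for b in cand:
            for slot in self.buckets[b]:
                if slot[0] == fp:
                    slot[1] += 1
                    return "incremented"
        # Case 2: the flow is not stored but a candidate bucket has a vacancy.
        for b in cand:
            if len(self.buckets[b]) < self.c:
                self.buckets[b].append([fp, 1])
                return "inserted"
        # Case 3: all candidates are full; replace the smallest flow with
        # probability 1 / (C_min + 1), otherwise discard the packet.
        smallest = min(
            (slot for b in cand for slot in self.buckets[b]),
            key=lambda s: s[1],
        )
        if 1.0 / (smallest[1] + 1) > self.rng.random():
            smallest[0], smallest[1] = fp, 1  # assumed starting count
            return "replaced"
        return "discarded"

    def top_k(self, k, threshold):
        """Return the k largest (fingerprint, count + T) pairs, largest first."""
        slots = [slot for bucket in self.buckets for slot in bucket]
        slots.sort(key=lambda s: s[1], reverse=True)
        return [(fp, count + threshold) for fp, count in slots[:k]]
```

Because the replacement probability falls as C_min grows, a flow that has already accumulated many packets is rarely evicted by a single packet of a new flow, while a flow with a count of 1 is replaced about half the time.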
4. A method according to claim 3, characterized in that two small counters are used to build the small flow filter and the counters in the small flow filter are updated periodically; one counter records the number of packets of a flow that arrive in the current period, to identify newly arriving large flows, and the other records the average number of packets of the flow that arrived in past periods, to identify continuously arriving large flows; the two small counters are combined to identify large flows in network traffic, which reduces the storage space wasted on small flows, solves the problem that a traditional filter fails at small flow filtering, and improves the identification accuracy of Top-k flows.
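The periodic update of the two counters can be made concrete with a small Python example. It is a sketch only: the function names `end_of_period` and `decay_trace` are hypothetical, and rounding the average down with integer division is an assumption, since the claims do not state a rounding rule.

```python
def end_of_period(new, old):
    """Update parallel lists of new/old counters in place at a period boundary."""
    for i in range(len(new)):
        # Average the finished period's count with the previous old value
        # before the new counter is cleared.
        old[i] = (new[i] + old[i]) // 2
        new[i] = 0


def decay_trace(burst, periods):
    """Old-counter values after each period for a flow that sends one burst."""
    new, old = [burst], [0]
    trace = []
    for _ in range(periods):
        end_of_period(new, old)
        trace.append(old[0])
    return trace


new, old = [6, 0, 3], [2, 4, 0]
end_of_period(new, old)
# new is now [0, 0, 0] and old is now [4, 2, 1]
```

`decay_trace(8, 4)` returns `[4, 2, 1, 0]`: a flow that sends 8 packets in one period and then stops sees its old counter halve every period, so a one-off burst fades out while a flow that keeps arriving holds its old counter near its per-period packet count.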
5. A method according to claim 3, characterized in that a low-overhead, high-precision Top-k flow identification method is designed in combination with a single-hash multi-mapping algorithm: a fingerprint value is first calculated from the flow identifier, and bits are then selected from the fingerprint value and recombined into hash values that map into the hash table, so as to reduce the overhead of hash calculation.
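The single-hash multi-mapping idea can be sketched in Python as below. This is a hedged illustration, not the patented procedure: the function names, the 32-bit fingerprint, the 8-bit sub-hash width, and in particular the choice to draw the random bit positions once at setup (so that every packet of a flow reaches the same buckets) are assumptions.

```python
import hashlib
import random


def make_bit_selections(fp_bits=32, sub_bits=8, n_maps=3, seed=1):
    """Pick, once at setup, which fingerprint bit positions form each sub-hash."""
    rng = random.Random(seed)
    return [rng.sample(range(fp_bits), sub_bits) for _ in range(n_maps)]


def fingerprint(flow_id, fp_bits=32):
    """The only real hash computation per packet (BLAKE2b is an assumed choice)."""
    digest = hashlib.blake2b(flow_id.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") & ((1 << fp_bits) - 1)


def sub_hashes(fp, selections):
    """Rebuild one sub-hash per selection by concatenating the chosen bits."""
    out = []
    for positions in selections:
        value = 0
        for pos in positions:
            value = (value << 1) | ((fp >> pos) & 1)
        out.append(value)
    return out


selections = make_bit_selections()
fp = fingerprint("10.0.0.1:80->10.0.0.2:5000")
bucket_indices = sub_hashes(fp, selections)  # each index lies in [0, 2**8)
```

For example, `sub_hashes(0b0110, [[0, 1], [1, 0], [3, 2]])` gives `[1, 2, 1]`. Each extra candidate bucket costs only shifts and masks on the stored fingerprint, which is where the saving over computing several independent hashes comes from.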
6. A method according to claim 3, wherein, since each flow has a plurality of candidate hash buckets, Top-k flows are ensured enough storage locations to choose from, which avoids the problem that the size of a Top-k flow cannot be monitored because no storage location is available and improves the identification accuracy of Top-k flows; meanwhile, when the smallest flow is to be replaced, it is selected from a plurality of candidate positions, so the smallest flow can be chosen for eviction more accurately.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111133411.4A CN113839835B (en) 2021-09-27 2021-09-27 Top-k flow accurate monitoring system based on small flow filtration

Publications (2)

Publication Number Publication Date
CN113839835A CN113839835A (en) 2021-12-24
CN113839835B CN113839835B (en) 2023-09-26

Family

ID=78970546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111133411.4A Active CN113839835B (en) 2021-09-27 2021-09-27 Top-k flow accurate monitoring system based on small flow filtration

Country Status (1)

Country Link
CN (1) CN113839835B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115914011B (en) * 2021-12-28 2024-08-02 Changsha University of Science and Technology Top-k flow elasticity measurement method supporting software definition
CN114785707B (en) * 2022-05-16 2023-06-20 University of Electronic Science and Technology of China Hierarchical large-flow collaborative monitoring method
CN115102907B (en) * 2022-06-17 2024-01-26 Changsha University of Science and Technology Active large flow accurate identification method and system based on small flow filtering
CN115460111B (en) * 2022-07-26 2023-07-25 Xidian University Top-k stream statistical method and system based on HINOC protocol

Citations (9)

Publication number Priority date Publication date Assignee Title
US4492281A (en) * 1982-03-01 1985-01-08 Scans Associates, Inc. Weigh scale
CN102025563A (en) * 2010-11-30 2011-04-20 Southeast University Network flow identification method based on Hash collision compensation
CN104348740A (en) * 2013-07-31 2015-02-11 International Business Machines Corporation Data package processing method and system
CN105427631A (en) * 2015-12-18 2016-03-23 Tianjin Tongxiang Intelligent Transportation System Co., Ltd. System and method for optimizing multilevel self-adapted disturbance attenuation traffic signal
CN105745870A (en) * 2013-07-15 2016-07-06 Telefonaktiebolaget LM Ericsson (publ) Removing lead filter from serial multiple-stage filter used to detect large flows in order to purge flows for prolonged operation
CN111200542A (en) * 2020-01-03 2020-05-26 State Grid Shandong Electric Power Company Electric Power Research Institute Network flow management method and system based on deterministic replacement strategy
CN111262756A (en) * 2020-01-20 2020-06-09 Changsha University of Science and Technology High-speed network elephant flow accurate measurement method and structure
CN111865635A (en) * 2019-04-29 2020-10-30 China Mobile Group Guizhou Co., Ltd. Method and device for determining out-of-limit time of ring network capacity
CN112671611A (en) * 2020-12-23 2021-04-16 Tsinghua University Sketch-based large stream detection method and device

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN110768856B (en) * 2018-07-27 2022-01-14 Huawei Technologies Co., Ltd. Network flow measuring method, network measuring equipment and control plane equipment

Non-Patent Citations (6)

Title
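"A fair sampling algorithm based on differentiated counting of large and small flows"; Wang Jing et al.; Journal of Electronics & Information Technology; Vol. 36 (No. 10); 2350-2356 *
Mengkun Wu, et al. "ActiveKeeper: An Accurate and Efficient Algorithm for Finding Top-k Elephant Flows". IEEE Communications Letters. 2021, Vol. 25 (No. 8), full text. *
LRU-based large flow detection algorithm; Wang Hongbo; Pei Yujie; Lin Yu; Cheng Shiduan; Jin Yuehui; Journal of Electronics & Information Technology (No. 10); full text *
Multi-agent-based dynamic intelligent coordinated control of arterial traffic; Kong Xiangjie; Shen Guojiang; Sun Youxian; Journal of PLA University of Science and Technology (Natural Science Edition) (No. 05); full text *
Frequent item mining algorithm for network flows based on hashing and counting; Zhao Xiaohuan; Xia Jingbo; Fu Kai; Journal of Huazhong University of Science and Technology (Natural Science Edition) (No. 09); full text *
Deng Qi. "Research on fine-grained measurement techniques for software-defined networks". China Master's Theses Full-text Database (Information Science and Technology). 2020, full text. *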

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant