CN102833134A - Workload adaptation method for measuring flow of network data stream - Google Patents

Workload adaptation method for measuring flow of network data stream

Info

Publication number
CN102833134A
Authority
CN
China
Prior art keywords
fingerprint
flow
bucket
data flow
stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012103236290A
Other languages
Chinese (zh)
Inventor
张进
黄清杉
赵文栋
吴泽民
彭来献
田畅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PLA University of Science and Technology
Original Assignee
PLA University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PLA University of Science and Technology
Priority to CN2012103236290A
Publication of CN102833134A
Legal status: Pending

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a load-adaptive method for measuring the traffic of network data flows. The method comprises three steps. Memory organization: the whole memory area is divided into several blocks of equal length, each block is further divided into several buckets of equal length, each bucket is given a bucket load counter, and the flow fingerprints and flow counters of several data flows are stored in each bucket. Flow value update: at the start of a measurement period, all bucket load counters, flow fingerprints and flow counters are initialized to 0; each time a data packet arrives, its flow fingerprint is computed from its flow identifier and fingerprint matching is performed. Flow value query: the flow fingerprint is computed from the flow identifier of the data flow being queried, and fingerprint matching yields the value of the corresponding flow counter. These three steps complete the traffic measurement of the data flows. The novelty lies in a dynamic fingerprint-shrinking mechanism that makes the flow fingerprints as long as possible when the load is light, thereby reducing the measurement error.

Description

Load-adaptive method for measuring the traffic of network data flows
Technical field
The present invention is a data flow traffic measurement method with good load adaptivity, and belongs to the technical field of computer networks.
Background
Network management tasks such as accounting, service control, network anomaly detection and network security monitoring require statistics on and analysis of network traffic. As network data rates rise rapidly, analysing the raw traffic packet by packet is too costly to be practical, whereas flow-level traffic strikes a good balance between the information it carries and the volume of data to be processed. A data flow is a set of packets that share the same flow identifier and in which the arrival interval between adjacent packets does not exceed a given upper bound. Different applications may define different flow identifiers; the well-known five-tuple <source IP address, destination IP address, protocol type, source port, destination port> is commonly used. Recording the flow identifier, arrival time, end time and traffic volume of each data flow is called flow measurement. Flow measurement is currently supported by mainstream network equipment vendors such as Cisco and Juniper, and the IETF has set up the IPFIX working group specifically to develop flow measurement standards.
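The flow definition above can be illustrated with a minimal sketch. It assumes packets arrive as (timestamp, flow identifier, size) tuples sorted by time; the function name `split_into_flows`, the record fields and the `timeout` parameter (standing in for the upper bound on the arrival interval) are hypothetical and not taken from the patent.

```python
def split_into_flows(packets, timeout):
    """Group time-sorted (timestamp, flow_id, size) packets into flow records.

    A new record is started for a flow_id whenever the gap to the previous
    packet of the same flow_id exceeds `timeout` (assumed parameter).
    """
    active = {}     # flow_id -> record of the flow currently open for that id
    finished = []   # records closed because the gap exceeded the timeout
    for ts, flow_id, size in packets:
        rec = active.get(flow_id)
        if rec is not None and ts - rec["end"] > timeout:
            finished.append(rec)
            rec = None
        if rec is None:
            rec = {"id": flow_id, "start": ts, "end": ts, "packets": 0, "bytes": 0}
            active[flow_id] = rec
        rec["end"] = ts
        rec["packets"] += 1
        rec["bytes"] += size
    finished.extend(active.values())
    return finished
```

Here a flow identifier can be any hashable value, for example a five-tuple such as `("10.0.0.1", "10.0.0.2", "TCP", 1234, 80)`.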
The difficulty of flow measurement lies in measuring the traffic of each data flow accurately. On high-speed backbone links, packet inter-arrival times are extremely short and the number of concurrent flows is huge, so measuring every flow with full accuracy is very costly. Approximate measurement offers a better performance-to-cost ratio, at the price of some error in the result. When a measurement algorithm is implemented, memory is usually allocated to it according to the maximum load. Analysis of measured network traffic shows, however, that the traffic load varies considerably between measurement periods and is below the maximum most of the time. The existing approximate measurement algorithms, chiefly the Counting Bloom Filter (CBF) [1] and the d-left Counting Bloom Filter (dlCBF) [2], do not consider load adaptivity: when the actual load is below the maximum, they cannot make full use of the allocated resources to reduce the measurement error as far as possible.
References
[1] L. Fan, P. Cao, J. Almeida, and A. Z. Broder. Summary Cache: a Scalable Wide-area Web Cache Sharing Protocol [J]. IEEE/ACM Transactions on Networking, 2000, 8(3): 281-293.
[2] A. Pagh, R. Pagh, and S. Rao. An Optimal Bloom Filter Replacement [C]. In: Proc. of the Sixteenth Annual ACM-SIAM Workshop on Discrete Algorithms, Maryland, 2005, 823-829.
Summary of the invention
Technical problem: To address the poor load adaptivity of existing approximate methods for measuring data flow traffic, the present invention provides a load-adaptive method for measuring the traffic of network data flows. At full load, the measurement error of this method matches that of existing methods; below full load, the method makes full use of the allocated resources and effectively reduces the measurement error.
Technical solution: Existing approximate methods for measuring data flow traffic are mainly based on the CBF and the dlCBF. Existing research shows that the dlCBF is more space-efficient than the CBF. In the dlCBF, the measurement error depends mainly on the length of the element fingerprint. Because the dlCBF uses fixed-length element fingerprints, its measurement error is insensitive to the network traffic load. The present invention provides a d-left Counting Bloom Filter whose element fingerprints shrink by halves (Binary-Shrinking d-left Counting Bloom Filter, BSdlCBF). At full load, the measurement error of the BSdlCBF matches that of the dlCBF; at light load, the BSdlCBF makes full use of the storage space by lengthening the element fingerprints, and therefore reduces the measurement error more effectively than the dlCBF.
The load-adaptive method for measuring the traffic of network data flows comprises three steps: memory organization, flow value update and flow value query.
Memory organization: the whole memory area is divided into several blocks of equal length, and each block is further divided into several buckets of equal length. Each bucket has a bucket load counter and stores the flow fingerprints and flow counters of several data flows.
Flow value update: at the start of a measurement period, all bucket load counters, flow fingerprints and flow counters are initialized to 0. Then, each time a data packet arrives, its flow fingerprint is computed from its flow identifier and fingerprint matching is performed. If the match succeeds, the corresponding flow counter is incremented by 1. If the match fails, the data flow is inserted into the memory area and the bucket load counter of the bucket that receives it is incremented by 1.
Flow value query: the flow fingerprint is computed from the flow identifier of the data flow being queried, and fingerprint matching yields the value of the corresponding flow counter. The three steps of memory organization, flow value update and flow value query together complete the traffic measurement of the data flows.
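The memory organization and the per-period initialization can be sketched as follows. This is an illustrative layout only: the `Bucket` class, its field names and the helpers `organise` and `reset` are hypothetical, and a real implementation would pack fingerprints and counters into a fixed number of bits per bucket.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bucket:
    load: int = 0                                           # bucket load counter
    fingerprints: List[int] = field(default_factory=list)   # flow fingerprint slots
    counters: List[int] = field(default_factory=list)       # flow counters


def organise(d: int, buckets_per_block: int, depth: int) -> List[List[Bucket]]:
    """Split the memory area into d equal blocks of equal-sized buckets."""
    return [
        [Bucket(0, [0] * depth, [0] * depth) for _ in range(buckets_per_block)]
        for _ in range(d)
    ]


def reset(blocks: List[List[Bucket]]) -> None:
    """Start of a measurement period: zero every counter and fingerprint."""
    for block in blocks:
        for bucket in block:
            bucket.load = 0
            bucket.fingerprints = [0] * len(bucket.fingerprints)
            bucket.counters = [0] * len(bucket.counters)
```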
The flow fingerprint is computed by a hash function whose input is the flow identifier of the data flow and whose output is the flow fingerprint. The length of the output fingerprint equals the length of the longest flow fingerprint that a bucket can store.
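A minimal sketch of such a fingerprint function is shown below. The patent does not name a hash function; BLAKE2b from Python's standard `hashlib` is used here purely as an assumption, and `flow_fingerprint` and `five_tuple_key` are hypothetical helper names.

```python
import hashlib


def five_tuple_key(src_ip, dst_ip, proto, src_port, dst_port) -> bytes:
    """Serialise a five-tuple flow identifier into bytes (illustrative encoding)."""
    return f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()


def flow_fingerprint(flow_id: bytes, nbits: int) -> int:
    """Return the leading `nbits` bits of a 128-bit hash of the flow identifier."""
    if not 1 <= nbits <= 128:
        raise ValueError("nbits must be between 1 and 128")
    digest = hashlib.blake2b(flow_id, digest_size=16).digest()
    return int.from_bytes(digest, "big") >> (128 - nbits)
```

Because the fingerprint is taken from the leading bits, a shorter fingerprint is always a prefix of a longer one, which is the property a shrinking fingerprint relies on.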
The specific steps for inserting a data flow into the memory area are as follows. First, the flow fingerprint is computed from the flow identifier of the data flow. Next, assuming the memory area is divided into d blocks, d hash operations select one bucket in each block, and the most lightly loaded of these buckets becomes the target bucket for the insertion. Then, fingerprint space is allocated to the newly inserted flow according to the load of the target bucket: if the bucket load is a and the maximum fingerprint space of the bucket is b bits, the newly inserted flow is allocated b/(a+1) bits. Finally, the flow counter of the data flow is set to 1. Here a, b and d are positive integers.
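The target selection and the b/(a+1) allocation rule can be sketched as below. The salted BLAKE2b hash, the function names and the `loads` layout are assumptions made for illustration; integer division is used for b/(a+1), and ties are resolved in favour of the leftmost block, following the d-left strategy described in the embodiment.

```python
import hashlib


def bucket_index(flow_id: bytes, block: int, buckets_per_block: int) -> int:
    """One hash per block: the block number is used as a salt (assumed scheme)."""
    digest = hashlib.blake2b(
        flow_id, digest_size=8, salt=block.to_bytes(8, "big")
    ).digest()
    return int.from_bytes(digest, "big") % buckets_per_block


def choose_target(loads, flow_id: bytes):
    """loads[k][j] is the load counter of bucket j in block k.

    Returns (block, bucket) of the most lightly loaded candidate; min() keeps
    the first minimum, so ties go to the leftmost block.
    """
    candidates = [
        (k, bucket_index(flow_id, k, len(loads[k]))) for k in range(len(loads))
    ]
    return min(candidates, key=lambda kb: loads[kb[0]][kb[1]])


def allocated_bits(load_a: int, max_bits_b: int) -> int:
    """Fingerprint space for a flow inserted into a bucket that already holds a flows."""
    return max_bits_b // (load_a + 1)
```

For a 64-bit bucket, the first flow would receive 64 bits, the second 32 bits, and the fourth 16 bits under this rule.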
The specific steps of fingerprint matching are as follows. First, the flow fingerprint is computed from the flow identifier of the data flow. Next, assuming the memory area is divided into d blocks, d hash operations determine all d buckets into which the data flow may have been inserted. Then the flow fingerprint is looked up in each bucket in turn to determine whether a stored fingerprint matches that of the current data flow. If one does, the match succeeds; otherwise the match fails.
The specific steps for looking up a flow fingerprint in a bucket are as follows. First, a suitable matching length is determined from the bucket load: if the bucket load is a and the maximum fingerprint space of the bucket is b, the first b/a bits of the fingerprint being looked up are taken as the comparison field. Then the comparison field is compared one by one with the flow fingerprints stored in the bucket; if an identical one is found, the lookup succeeds.
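The in-bucket lookup can be sketched as a prefix comparison. The representation below is an assumption: each stored entry is a `(fingerprint, counter)` pair whose fingerprint is kept as a b-bit integer with only its leading bits significant, and `lookup_in_bucket` is a hypothetical name.

```python
def lookup_in_bucket(entries, fp: int, max_bits: int):
    """Compare the first b/a bits of `fp` with every stored fingerprint.

    entries  -- list of (stored_fp, counter) pairs; len(entries) is the load a
    fp       -- b-bit fingerprint of the flow being looked up
    max_bits -- b, the maximum fingerprint space of the bucket in bits
    Returns the matching counter, or None if no stored prefix matches.
    Assumes 1 <= a <= b so that the comparison field is at least one bit wide.
    """
    load = len(entries)
    if load == 0:
        return None
    width = max_bits // load      # length of the comparison field
    shift = max_bits - width      # drop everything after the first `width` bits
    for stored_fp, counter in entries:
        if stored_fp >> shift == fp >> shift:
            return counter
    return None
```

Since only a prefix is compared, two different flows can match the same entry; that residual false-match probability is the measurement error the method aims to reduce.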
Beneficial effects: Compared with existing measurement methods, the proposed method has the same measurement error at full load, but a clearly lower measurement error at light load. Fig. 1 compares the measurement error probabilities of the dlCBF and the BSdlCBF at load factors of 2.5/3, 2/3, 1.5/3 and 1/3. Fig. 2 compares the measurement error probabilities of the dlCBF and the BSdlCBF on real network traffic data; the number of data flows in each period of that traffic data is shown in Fig. 3. Figs. 1 and 2 show that at full load the measurement error probabilities of the dlCBF and the BSdlCBF are close, but as the load factor falls, the measurement error probability of the BSdlCBF drops markedly faster than that of the dlCBF. At light load, the measurement error probability of the BSdlCBF is several orders of magnitude lower than that of the dlCBF. The proposed flow measurement method therefore has a clear advantage over existing flow measurement methods in load adaptivity.
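Why longer fingerprints help can be seen from a standard back-of-the-envelope estimate, which is not the patent's own analysis and does not reproduce the figures. Assuming fingerprints are uniform and independent, a flow that was never inserted falsely matches at least one of n stored r-bit fingerprints with probability 1 - (1 - 2^-r)^n. The function name below is hypothetical.

```python
def false_match_probability(fp_bits: int, compared: int) -> float:
    """Probability that a random fp_bits-bit fingerprint equals at least one of
    `compared` independent, uniformly distributed stored fingerprints."""
    return 1.0 - (1.0 - 2.0 ** -fp_bits) ** compared


# Illustrative comparison: the same bucket space used as 8-bit or 32-bit fingerprints.
short_fp = false_match_probability(8, 4)    # four stored flows, 8-bit fingerprints
long_fp = false_match_probability(32, 1)    # one stored flow, 32-bit fingerprint
```

Under this estimate, each additional fingerprint bit roughly halves the false-match probability, so a structure that lengthens fingerprints at light load gains accuracy quickly.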
Description of drawings
Fig. 1: Comparison of the measurement error probabilities of the dlCBF and the BSdlCBF at different load factors.
Fig. 2: Comparison of the measurement error probabilities of the dlCBF and the BSdlCBF on real network traffic.
Fig. 3: Number of data flows in each measurement period of the real network traffic.
Fig. 4: Structure of the dlCBF.
Fig. 5: Bucket space utilization of the BSdlCBF and the dlCBF under different bucket loads.
Fig. 6: Query process of the BSdlCBF.
Fig. 7: Update process of the BSdlCBF.
Embodiment
The embodiment of the proposed flow measurement method is described in detail below from three aspects: the structure of the BSdlCBF, the flow update process and the flow query process.
1) Structure of the BSdlCBF
To explain the structure of the BSdlCBF, the structure of the dlCBF is described first. The dlCBF is built on the d-left hash table. A d-left hash table divides the memory area into d blocks, and each block is further divided into several buckets of equal capacity. Each block can be regarded as a bucket vector, denoted from left to right BV_1, BV_2, ..., BV_d. For example, the memory area of the d-left hash table in Fig. 4 is divided into 4 blocks of 5 buckets each, and each bucket has a depth of 4. When an element e is inserted, d independent hash functions compute the bucket address of e in each block, denoted h_1(e), h_2(e), ..., h_d(e). Then e is inserted into the most lightly loaded of the buckets BV_1(h_1(e)), BV_2(h_2(e)), ..., BV_d(h_d(e)). If several buckets are tied for the lightest load, the leftmost one is chosen. In Fig. 4, for example, element e is inserted into bucket BV_1[4]. This selection strategy keeps the bucket loads fairly even, so giving each bucket a small amount of extra space beyond the average load is enough to make the bucket overflow probability extremely low, which yields high space efficiency.
When a dlCBF is built on the d-left hash table, each bucket cell stores the fingerprint and the flow counter of a data flow, as shown in Fig. 4. Each time a packet P arrives, its flow identifier f is obtained first, its flow fingerprint is computed, and the fingerprint is matched against the fingerprints already stored in buckets BV_1(h_1(f)), BV_2(h_2(f)), ..., BV_d(h_d(f)). On a match, the flow counter of the corresponding bucket cell is incremented by 1; if several bucket cells match at once, one of them is chosen at random and its flow counter is incremented by 1. If there is no match, data flow f is inserted, following the d-left insertion strategy described above, into the most lightly loaded and leftmost of the buckets BV_1(h_1(f)), BV_2(h_2(f)), ..., BV_d(h_d(f)), and its flow counter is set to 1. The length of the flow fingerprint determines the query error probability of the dlCBF: the longer the fingerprint, the lower the probability of a false match.
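The dlCBF baseline described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class name `DLeftCBF`, the salted BLAKE2b hashes and the default parameters are hypothetical, and when several cells match the sketch increments the first one found rather than choosing one at random.

```python
import hashlib


class DLeftCBF:
    """Fixed-length-fingerprint d-left counting Bloom filter (illustrative sketch)."""

    def __init__(self, d=4, buckets=5, depth=4, fp_bits=16):
        self.d, self.buckets, self.depth, self.fp_bits = d, buckets, depth, fp_bits
        # table[k][j] is bucket j of block BV_(k+1); each cell is [fingerprint, counter].
        self.table = [[[] for _ in range(buckets)] for _ in range(d)]

    @staticmethod
    def _hash(flow_id: bytes, salt: bytes) -> int:
        digest = hashlib.blake2b(flow_id, digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "big")

    def _fingerprint(self, flow_id: bytes) -> int:
        return self._hash(flow_id, b"fp") & ((1 << self.fp_bits) - 1)

    def _candidates(self, flow_id: bytes):
        # One candidate bucket per block, addressed by h_1 ... h_d.
        return [
            self.table[k][self._hash(flow_id, b"blk%d" % k) % self.buckets]
            for k in range(self.d)
        ]

    def update(self, flow_id: bytes) -> None:
        fp = self._fingerprint(flow_id)
        candidates = self._candidates(flow_id)
        for bucket in candidates:
            for cell in bucket:
                if cell[0] == fp:          # fingerprint already stored: count the packet
                    cell[1] += 1
                    return
        target = min(candidates, key=len)  # lightest bucket, leftmost on ties
        if len(target) >= self.depth:
            raise OverflowError("all candidate buckets are full")
        target.append([fp, 1])

    def query(self, flow_id: bytes) -> int:
        fp = self._fingerprint(flow_id)
        for bucket in self._candidates(flow_id):
            for stored_fp, counter in bucket:
                if stored_fp == fp:
                    return counter
        return 0
```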
Like the dlCBF, the BSdlCBF is built on the d-left hash table. Unlike the dlCBF, the BSdlCBF uses variable-length flow fingerprints. When the load is light, the BSdlCBF uses long flow fingerprints; as the load factor grows, the element fingerprints are shortened step by step. For ease of implementation, the element fingerprints in the BSdlCBF shrink by halves, as shown in Fig. 5. Note that in the BSdlCBF the start positions at which flow fingerprints and flow counters are inserted into a bucket differ from those in the dlCBF. In the BSdlCBF, each bucket has a load counter that records the number of data flows currently inserted into the bucket.
2) Flow query process
The processing flow of a BSdlCBF query is shown in Fig. 6. First, the flow identifier F is extracted from the arriving packet p. Next, d parallel hash operations on F yield the bucket addresses A_1~A_d of the d buckets that may hold the flow count of F; one further hash operation computes the flow fingerprint fp from F. Then the contents of buckets B(A_1)~B(A_d) and their bucket load counts are read, and fp is matched against the flow fingerprints in B(A_1)~B(A_d). On a hit, the corresponding flow count C(F) is returned; on a miss, 0 is returned.
Let the length of a unit fingerprint be l bits and the bucket depth be b, where b must be a positive integer power of 2. For each data flow f, the BSdlCBF generates an element fingerprint F_f[1:b] consisting of b unit fingerprints of l bits each. Suppose the bucket load at query time is i; then fingerprint matching only needs the first L(i) unit fingerprints F_f[1:L(i)] of F_f[1:b] as the match object. L(i) is given by the following formula:
[Formula image BDA00002097042600051: definition of L(i); not reproduced in this text]
The flow fingerprints in the bucket cells are then taken in groups of L(i) unit fingerprints, and each group is matched against F_f[1:L(i)].
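The formula for L(i) is not reproduced in this text, so the sketch below rests on an assumption: consistent with the "shrink by halves" description and the power-of-2 bucket depth, it takes L(i) = b / 2^ceil(log2 i), that is, the fingerprint is halved each time the load passes a power of two. The helper names and the use of a non-zero counter to mark an occupied group are likewise hypothetical.

```python
def match_length(load: int, depth: int) -> int:
    """ASSUMED L(i): number of unit fingerprints compared at bucket load i >= 1."""
    return depth >> (load - 1).bit_length()


def match_groups(units, counters, load: int, fp_units):
    """Match F_f[1:L(i)] against the bucket, taken in groups of L(i) units.

    units    -- the b unit fingerprints currently stored in the bucket
    counters -- flow counter per start unit (0 marks an unused group; assumption)
    fp_units -- the b unit fingerprints F_f[1:b] of the flow being looked up
    Returns the 0-based start unit of the matching group, or None.
    """
    if load == 0:
        return None
    depth = len(units)
    n = match_length(load, depth)
    for start in range(0, depth, n):
        if counters[start] > 0 and units[start:start + n] == fp_units[:n]:
            return start
    return None
```

With b = 8, this assumed rule gives L(1..8) = 8, 4, 2, 2, 1, 1, 1, 1.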
3) Flow update process
The processing flow of a BSdlCBF update is shown in Fig. 7. An update first performs a query to obtain the current flow count of the data flow being updated. If the query result C(F) is 0, the flow count of data flow F has not yet been inserted into the BSdlCBF; the data flow must then be inserted, C(F) set to 1, and the bucket load counter updated. If C(F) > 0, only C(F) needs to be updated.
Suppose the flow counter of data flow F may be stored in B(A_1)~B(A_d). When F is inserted, it is placed in the most lightly loaded of the buckets B(A_1)~B(A_d), say B(A_k). If F is the i-th (1 ≤ i ≤ b) data flow inserted into B(A_k), the start address of the flow fingerprint and flow counter of F is given by the following formula:
[Formula image BDA00002097042600052: definition of the start address A(i); not reproduced in this text]
The flow fingerprint of data flow F is stored in the L(i) units starting at A(i).
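The query and update processes can be combined into one end-to-end sketch. The formulas for L(i) and A(i) are not reproduced in this text, so both are assumptions here: L(i) = b / 2^ceil(log2 i), and A(i) places each new flow in the half freed when an earlier fingerprint shrinks (0-based start units 0, b/2, b/4, 3b/4, ...). The class and helper names, the salted BLAKE2b hashes and the per-start-unit counter array are also hypothetical.

```python
import hashlib


def units_per_fingerprint(load: int, depth: int) -> int:
    """ASSUMED L(i): units kept per fingerprint when the bucket holds `load` flows."""
    return depth >> (load - 1).bit_length()


def start_unit(i: int, depth: int) -> int:
    """ASSUMED A(i): 0-based start unit of the i-th flow inserted into a bucket."""
    if i == 1:
        return 0
    step = units_per_fingerprint(i, depth)
    j = i - (1 << ((i - 1).bit_length() - 1))   # rank of flow i within its halving round
    return (2 * j - 1) * step


class Bucket:
    def __init__(self, depth: int):
        self.load = 0                 # bucket load counter
        self.units = [0] * depth      # l-bit unit fingerprints
        self.counts = [0] * depth     # flow counter kept at each flow's start unit


class BSdlCBF:
    def __init__(self, d=4, buckets=8, depth=4, unit_bits=8):
        assert depth >= 1 and depth & (depth - 1) == 0, "depth must be a power of 2"
        self.d, self.buckets, self.depth, self.unit_bits = d, buckets, depth, unit_bits
        self.blocks = [[Bucket(depth) for _ in range(buckets)] for _ in range(d)]

    @staticmethod
    def _hash(flow_id: bytes, salt: bytes) -> int:
        digest = hashlib.blake2b(flow_id, digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "big")

    def _fingerprint(self, flow_id: bytes):
        # F_f[1:b]: b unit fingerprints of unit_bits bits each.
        mask = (1 << self.unit_bits) - 1
        return [self._hash(flow_id, b"fp%d" % u) & mask for u in range(self.depth)]

    def _candidates(self, flow_id: bytes):
        return [self.blocks[k][self._hash(flow_id, b"blk%d" % k) % self.buckets]
                for k in range(self.d)]

    def _find(self, bucket: Bucket, fp):
        if bucket.load == 0:
            return None
        n = units_per_fingerprint(bucket.load, self.depth)
        for i in range(1, bucket.load + 1):      # only slots that hold a flow
            a = start_unit(i, self.depth)
            if bucket.units[a:a + n] == fp[:n]:
                return a
        return None

    def query(self, flow_id: bytes) -> int:
        fp = self._fingerprint(flow_id)
        for bucket in self._candidates(flow_id):
            a = self._find(bucket, fp)
            if a is not None:
                return bucket.counts[a]
        return 0

    def update(self, flow_id: bytes) -> None:
        fp = self._fingerprint(flow_id)
        candidates = self._candidates(flow_id)
        for bucket in candidates:
            a = self._find(bucket, fp)
            if a is not None:                    # C(F) > 0: just count the packet
                bucket.counts[a] += 1
                return
        target = min(candidates, key=lambda bk: bk.load)   # lightest, leftmost on ties
        if target.load >= self.depth:
            raise OverflowError("all candidate buckets are full")
        target.load += 1
        n = units_per_fingerprint(target.load, self.depth)
        a = start_unit(target.load, self.depth)
        target.units[a:a + n] = fp[:n]           # overwrites the tail of an older fingerprint
        target.counts[a] = 1
```

Under these assumptions, a bucket of depth 8 fills its start units in the order 0, 4, 2, 6, 1, 3, 5, 7, so each insertion only overwrites units that earlier flows no longer use for matching.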

Claims (5)

1. A load-adaptive method for measuring the traffic of network data flows, characterized in that the method comprises three steps: memory organization, flow value update and flow value query;
Memory organization: the whole memory area is divided into several blocks of equal length, and each block is further divided into several buckets of equal length; each bucket has a bucket load counter and stores the flow fingerprints and flow counters of several data flows;
Flow value update: at the start of a measurement period, all bucket load counters, flow fingerprints and flow counters are initialized to 0; then, each time a data packet arrives, its flow fingerprint is computed from its flow identifier and fingerprint matching is performed; if the match succeeds, the corresponding flow counter is incremented by 1; if the match fails, the data flow is inserted into the memory area and the bucket load counter of the bucket that receives it is incremented by 1;
Flow value query: the flow fingerprint is computed from the flow identifier of the data flow being queried, and fingerprint matching yields the value of the corresponding flow counter; the three steps of memory organization, flow value update and flow value query together complete the traffic measurement of the data flows.
2. The load-adaptive method for measuring the traffic of network data flows according to claim 1, characterized in that the flow fingerprint is computed by a hash function whose input is the flow identifier of the data flow and whose output is the flow fingerprint; the length of the output fingerprint equals the length of the longest flow fingerprint that a bucket can store.
3. The load-adaptive method for measuring the traffic of network data flows according to claim 1 or 2, characterized in that the specific steps for inserting a data flow into the memory area are as follows: first, the flow fingerprint is computed from the flow identifier of the data flow; next, assuming the memory area is divided into d blocks, d hash operations select one bucket in each block, and the most lightly loaded of these buckets becomes the target bucket for the insertion; then, fingerprint space is allocated to the newly inserted flow according to the load of the target bucket: if the bucket load is a and the maximum fingerprint space of the bucket is b bits, the newly inserted flow is allocated b/(a+1) bits; finally, the flow counter of the data flow is set to 1; wherein a, b and d are positive integers.
4. The load-adaptive method for measuring the traffic of network data flows according to claim 1 or 2, characterized in that the specific steps of fingerprint matching are as follows: first, the flow fingerprint is computed from the flow identifier of the data flow; next, assuming the memory area is divided into d blocks, d hash operations determine all d buckets into which the data flow may have been inserted; then the flow fingerprint is looked up in each bucket in turn to determine whether a stored fingerprint matches that of the current data flow; if one does, the match succeeds; otherwise the match fails.
5. The load-adaptive method for measuring the traffic of network data flows according to claim 4, characterized in that the specific steps for looking up a flow fingerprint in a bucket are as follows: first, a suitable matching length is determined from the bucket load: if the bucket load is a and the maximum fingerprint space of the bucket is b, the first b/a bits of the fingerprint being looked up are taken as the comparison field; then the comparison field is compared one by one with the flow fingerprints stored in the bucket; if an identical one is found, the lookup succeeds.
CN2012103236290A 2012-09-04 2012-09-04 Workload adaptation method for measuring flow of network data stream Pending CN102833134A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012103236290A CN102833134A (en) 2012-09-04 2012-09-04 Workload adaptation method for measuring flow of network data stream

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012103236290A CN102833134A (en) 2012-09-04 2012-09-04 Workload adaptation method for measuring flow of network data stream

Publications (1)

Publication Number Publication Date
CN102833134A true CN102833134A (en) 2012-12-19

Family

ID=47336110

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012103236290A Pending CN102833134A (en) 2012-09-04 2012-09-04 Workload adaptation method for measuring flow of network data stream

Country Status (1)

Country Link
CN (1) CN102833134A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006130830A2 (en) * 2005-06-02 2006-12-07 Georgia Tech Research Corporation System and method for measuring traffic and flow matrices
CN101119246A (en) * 2007-09-20 2008-02-06 杭州华三通信技术有限公司 Data packet sampling statistic method and apparatus
CN102315956A (en) * 2010-07-02 2012-01-11 中兴通讯股份有限公司 Method and device for supervising flow
CN102025563A (en) * 2010-11-30 2011-04-20 东南大学 Network flow identification method based on Hash collision compensation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张进等: "4种计数型Bloom Filter的性能分析与比较", 《软件学报》, vol. 21, no. 5, 21 June 2010 (2010-06-21) *
董永吉等: "一种基于分段模式的统计计数结构", 《计算机应用研究》, vol. 26, no. 10, 30 November 2009 (2009-11-30), pages 1 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111541617A (en) * 2020-04-17 2020-08-14 网络通信与安全紫金山实验室 Data flow table processing method and device for high-speed large-scale concurrent data flow
CN111541617B (en) * 2020-04-17 2021-11-02 网络通信与安全紫金山实验室 Data flow table processing method and device for high-speed large-scale concurrent data flow
WO2023134574A1 (en) * 2022-01-12 2023-07-20 华为技术有限公司 Method for flow statistics, device, and system

Similar Documents

Publication Publication Date Title
JP7039685B2 (en) Traffic measurement methods, devices, and systems
US20110167149A1 (en) Internet flow data analysis method using parallel computations
CN104298680B (en) Data statistical approach and data statistics device
CN103401777B (en) The parallel search method and system of Openflow
CN104102543B (en) The method and apparatus of adjustment of load in a kind of cloud computing environment
US9576073B2 (en) Distance queries on massive networks
CN106452868A (en) Network traffic statistics implement method supporting multi-dimensional aggregation classification
CN104778079B (en) Device and method and distributed system for dispatching, executing
CN105975433B (en) A kind of message processing method and device
CN103810223B (en) A kind of memory data organization querying method based on packet
CN106233256B (en) Utilize the scalable storage of the load balance of optimization module
CN106209967A (en) A kind of video monitoring cloud resource prediction method and system
CN102054000A (en) Data querying method, device and system
CN110535825A (en) A kind of data identification method of character network stream
CN110532307A (en) A kind of date storage method and querying method flowing sliding window
Hu et al. Improved heuristic job scheduling method to enhance throughput for big data analytics
CN103078754B (en) A kind of network data flow statistical method based on attribute bloom filter
CN102833134A (en) Workload adaptation method for measuring flow of network data stream
CN100493001C (en) Automatic clustering method for multi-particle size network under G bit flow rate
CN106897458A (en) A kind of storage and search method towards electromechanical equipment data
CN109976879A (en) A kind of cloud computing virtual machine placement method using curve complementation based on resource
CN104778088A (en) Method and system for optimizing parallel I/O (input/output) by reducing inter-progress communication expense
CN105120008A (en) Layering-based distributed cloud computing centre load balancing method
CN105516016A (en) Flow-based data packet filtering system and data packet filtering method by using Tilera multi-core accelerator card
CN110516119A (en) A kind of organizational scheduling method, device and the storage medium of natural resources contextual data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20121219