CN104394091B - A network redundant traffic identification method based on uniform sampling - Google Patents

A network redundant traffic identification method based on uniform sampling

Info

Publication number
CN104394091B
Authority
CN
China
Prior art keywords
characteristic fingerprint
sampling
data block
omega
fingerprint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410730071.7A
Other languages
Chinese (zh)
Other versions
CN104394091A (en
Inventor
马强
张琦
邢玲
杨国海
何燕玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology filed Critical Southwest University of Science and Technology
Priority to CN201410730071.7A priority Critical patent/CN104394091B/en
Publication of CN104394091A publication Critical patent/CN104394091A/en
Application granted granted Critical
Publication of CN104394091B publication Critical patent/CN104394091B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a network redundant traffic identification method based on uniform sampling. The method combines two mechanisms. The first is uniform sampling of characteristic fingerprints: a fixed-size window slides continuously over the fingerprint sequence, and the largest characteristic fingerprint in each window is stored in the characteristic fingerprint library as a sampled characteristic fingerprint. The second is dynamic tracking of sampled characteristic fingerprints: while the library is searched to identify redundant data blocks, each matched characteristic fingerprint in the library is updated so that it points to (is mapped to) the matching packet payload in the buffer. This prevents buffer refreshing from removing the characteristic fingerprints mapped to high-frequency redundant packet payloads, and so keeps redundant traffic identification sustainable.

Description

A network redundant traffic identification method based on uniform sampling
Technical field
The invention belongs to the technical field of network traffic management and, more specifically, relates to a network redundant traffic identification method based on uniform sampling, used to identify the redundant portion of network traffic.
Background
Driven by user interest, users in an edge network who share the same interests access network resources on similar or identical topics. This inevitably causes large amounts of duplicate data to be transmitted and produces redundant traffic on specific links. Redundant traffic not only wastes link bandwidth but also degrades the experience of users accessing network resources and, to some extent, dampens their enthusiasm.
Effectively identifying redundant traffic in the network is the key to studying the causes of redundant traffic and the series of problems that accompany it. Traditional web caching techniques identify redundant traffic at the object level, but each application requires its own caching details to be redesigned, so the approach lacks flexibility.
In recent years, the packet-level methods MODP, MAXP, SAMPBYTE and DYNABYTE have been proposed in succession and have achieved good identification efficiency. MODP uses the Rabin polynomial method to compute fingerprints of consecutive data blocks and samples as characteristic fingerprints those whose value modulo is 0; it suffers from uneven sampling and from intervals with zero samples. MAXP divides the fingerprint sequence uniformly with a fixed-size window and selects the maximum in each window as the sampled characteristic fingerprint, which overcomes MODP's uneven sampling, but it cannot track well the dynamic behavior of high-frequency redundant data blocks in real traffic. SAMPBYTE and DYNABYTE take a statistical approach: they use training samples to select typical first bytes of redundant blocks as sampling features. Compared with SAMPBYTE, DYNABYTE adds dynamic adjustment of the sampling features and thereby achieves, to some extent, dynamic tracking of high-frequency redundant blocks in real traffic. However, because SAMPBYTE and DYNABYTE select characteristic fingerprints through sample training, they are strongly affected by the choice of sample data, which limits deployment flexibility. None of the above methods handles both uniform sampling and dynamic tracking of high-frequency redundant blocks well at the same time.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art and provide a network redundant traffic identification method based on uniform sampling that identifies redundant traffic in real network environments while combining uniform sampling of characteristic fingerprints with dynamic tracking of high-frequency redundant blocks, so as to improve the effectiveness and rate of redundant traffic identification.
To achieve the above object, the network redundant traffic identification method based on uniform sampling of the invention is characterized by comprising the following steps:
(1) Uniform sampling of characteristic fingerprints
1.1) For the first received packet payload t_1, t_2, t_3, ..., t_n, slide a window of size Ω from the start position in steps of one byte to divide the payload into n-Ω+1 consecutive data blocks of size Ω: t_1, t_2, t_3, ..., t_Ω; t_2, t_3, t_4, ..., t_{Ω+1}; ...; t_{n-Ω+1}, t_{n-Ω+2}, t_{n-Ω+3}, ..., t_n, where n is the number of bytes in the packet payload;
1.2) For the n-Ω+1 data blocks of size Ω, compute the characteristic fingerprint mapped to each data block using the Rabin polynomial. The mapping between data blocks and characteristic fingerprints is, in order:
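$$
\begin{aligned}
H_1 &= RF(t_1, t_2, t_3, \dots, t_\Omega) = \left(t_1 p^{\Omega-1} + t_2 p^{\Omega-2} + \dots + t_{\Omega-1} p^{1} + t_\Omega p^{0}\right) \bmod M \\
H_2 &= RF(t_2, t_3, t_4, \dots, t_{\Omega+1}) = \left(\left(RF(t_1, t_2, t_3, \dots, t_\Omega) - t_1 p^{\Omega-1}\right) \cdot p + t_{\Omega+1} p^{0}\right) \bmod M \\
&\;\;\vdots \\
H_{n-\Omega+1} &= RF(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \dots, t_n) = \left(\left(RF(t_{n-\Omega}, t_{n-\Omega+1}, t_{n-\Omega+2}, \dots, t_{n-1}) - t_{n-\Omega} p^{\Omega-1}\right) \cdot p + t_n p^{0}\right) \bmod M
\end{aligned}
\tag{1}
$$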
where H_1, H_2, ..., H_{n-Ω+1} are the characteristic fingerprints of the n-Ω+1 data blocks, mod is the remainder operation, M is a constant chosen according to the specific situation, and RF denotes the mapping operation;
The characteristic fingerprint H_1 mapped to the first data block t_1, t_2, t_3, ..., t_Ω is computed directly from formula (1). Then a look-up table T is consulted with the value of the single byte t_i as the look-up index to obtain t_i p^{Ω-1}, i = 1, 2, ..., n-Ω. Finally, formula (1) is used to compute the characteristic fingerprints H_2, ..., H_{n-Ω+1} of the data blocks t_2, t_3, t_4, ..., t_{Ω+1}; ...; t_{n-Ω+1}, t_{n-Ω+2}, t_{n-Ω+3}, ..., t_n. The look-up table T has look-up indexes 0 to 255, and the output value for each index is the product of that index and p^{Ω-1};
1.3) Arrange the characteristic fingerprints obtained in step 1.2) in order to form the characteristic fingerprint sequence H_1, H_2, ..., H_{n-Ω+1}. Slide a window of size w over this sequence from the start position in steps of one characteristic fingerprint. At each window position, select the maximum value in the window and store it in the characteristic fingerprint library as a sampled characteristic fingerprint. When the last window position has been processed, characteristic fingerprint sampling of the input packet is complete;
When different window positions select the same sampled characteristic fingerprint because they overlap, only the first selection is stored;
(2) Dynamic tracking of sampled characteristic fingerprints
2.1) Set up a buffer, store the first input packet payload in it, and map the sampled characteristic fingerprints in the characteristic fingerprint library to the first packet payload;
2.2) For the second received packet payload, first store it in the buffer, then extract its sampled characteristic fingerprints using the method of step (1), match them one by one against the characteristic fingerprint library, and perform dynamic tracking: if a sampled characteristic fingerprint is matched, map the matched sampled characteristic fingerprint in the library to the second packet payload; if it is not matched, store the extracted sampled characteristic fingerprint in the library and map it to the second packet payload;
2.3) Process each subsequently received packet payload by the method of step 2.2). When the buffer is full of packet payloads, refresh it using the first in, first out (FIFO) aging mechanism to make room for subsequently arriving packet payloads. During refreshing, the sampled characteristic fingerprints in the library that are mapped to the removed packet payloads are cleared;
(3) Redundant traffic identification
For each sampled characteristic fingerprint extracted in step (2) that is successfully matched in the characteristic fingerprint library, use the maximum content matching method: starting from the data block corresponding to the sampled characteristic fingerprint, match the received packet payload against the mapped packet payload in the buffer, and output the number of matched bytes as the redundant data block size;
Sum the sizes of the redundant data blocks in each unit of time to obtain the amount of redundant traffic, thereby identifying the redundant traffic.
The object of the invention is achieved as follows:
The network redundant traffic identification method based on uniform sampling of the invention combines two mechanisms. The first is uniform sampling of characteristic fingerprints: a fixed-size window slides continuously over the fingerprint sequence, and the largest characteristic fingerprint in each window is stored in the characteristic fingerprint library as a sampled characteristic fingerprint. The second is dynamic tracking of sampled characteristic fingerprints: while the library is searched to identify redundant data blocks, each matched characteristic fingerprint in the library is updated so that it points to (is mapped to) the matching packet payload in the buffer. This prevents buffer refreshing from removing the characteristic fingerprints mapped to high-frequency redundant packet payloads, and so keeps redundant traffic identification sustainable.
Compared with the prior art, the invention has the following four beneficial effects:
(1) Uniform sampling of characteristic fingerprints based on a continuously sliding window makes the samples more representative of each interval, which ensures the effectiveness of redundant traffic identification;
(2) Dynamic tracking of sampled characteristic fingerprints solves the problem of sampled characteristic fingerprints becoming invalid when the buffer is aged (refreshed), effectively ensures dynamic tracking and sustained identification of high-frequency redundant data blocks, and further improves the redundant traffic identification rate;
(3) The invention operates on packets and is not constrained by application-layer protocols, so it has high application flexibility;
(4) The invention requires no sample training, and its uniform sampling and dynamic tracking of characteristic fingerprints can adapt to any network node environment, so deployment is flexible.
Brief description of the drawings
Fig. 1 is a flow chart of one embodiment of the network redundant traffic identification method based on uniform sampling of the invention;
Fig. 2 is a schematic diagram of dividing a packet payload into data blocks and mapping them to characteristic fingerprints;
Fig. 3 is a schematic diagram of uniform sampling of characteristic fingerprints;
Fig. 4 is a flow chart of maximum content matching;
Fig. 5 shows the record format output by redundant traffic identification;
Fig. 6 is a schematic diagram of dynamic tracking of characteristic fingerprints.
Detailed description of the embodiments
Specific embodiments of the invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the main content of the invention.
Fig. 1 is a flow chart of one embodiment of the network redundant traffic identification method based on uniform sampling of the invention.
In this embodiment, as shown in Fig. 1, the payload of the input packet is first divided into consecutive data blocks of a preset fixed size Ω. If the packet payload has n bytes, t_1, t_2, t_3, ..., t_n, it can be divided into n-Ω+1 consecutive data blocks: t_1, t_2, t_3, ..., t_Ω; t_2, t_3, t_4, ..., t_{Ω+1}; ...; t_{n-Ω+1}, t_{n-Ω+2}, t_{n-Ω+3}, ..., t_n.
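The block division can be illustrated with a minimal Python sketch. The function name `sliding_blocks` and the sample payload are hypothetical and are not taken from the patent; the sketch only shows that a window of size Ω advanced one byte at a time yields n-Ω+1 overlapping blocks.

```python
def sliding_blocks(payload: bytes, omega: int) -> list[bytes]:
    """Return every omega-byte block obtained by sliding one byte at a time."""
    n = len(payload)
    if omega <= 0 or n < omega:
        return []  # assumption: a payload shorter than the window yields no block
    # Block i covers bytes i .. i+omega-1, so there are n - omega + 1 blocks.
    return [payload[i:i + omega] for i in range(n - omega + 1)]


blocks = sliding_blocks(b"abcdefg", 4)   # n = 7, omega = 4
print(blocks)        # [b'abcd', b'bcde', b'cdef', b'defg']
print(len(blocks))   # 4, i.e. n - omega + 1
```

In practice the blocks need not be materialized as separate byte strings; they are listed here only to make the count and the overlap visible.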
Then the characteristic fingerprint H_x, x ∈ [1, n-Ω+1], of each data block is computed with the Rabin polynomial, giving H_1, H_2, ..., H_{n-Ω+1}. Fig. 2 shows the mapping between characteristic fingerprints and data blocks.
The characteristic fingerprint H_1 mapped to the first data block t_1, t_2, t_3, ..., t_Ω is computed directly from formula (1). For the second and later data blocks, the look-up table T is first consulted with the value of the single byte t_i as the look-up index to obtain t_i p^{Ω-1}, i = 1, 2, ..., n-Ω; formula (1) is then used to compute the characteristic fingerprints H_2, ..., H_{n-Ω+1} of the data blocks t_2, t_3, t_4, ..., t_{Ω+1}; ...; t_{n-Ω+1}, t_{n-Ω+2}, t_{n-Ω+3}, ..., t_n. The look-up table T has look-up indexes 0 to 255, and the output value for each index is the product of that index and p^{Ω-1}. Using the table greatly improves computational efficiency.
In this embodiment, p is set to 2 for ease of calculation, and M is set to 0x100000000 (4294967296 in decimal) so that the computed characteristic fingerprint values stay within the 32-bit range.
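The incremental computation in formula (1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, and the look-up table stores each product already reduced modulo M, which gives the same result as storing the raw product because the final value is taken modulo M anyway.

```python
P = 2             # base p, as chosen in this embodiment
M = 0x100000000   # modulus M = 2**32, keeps fingerprints within 32 bits


def direct_fingerprint(block: bytes, p: int = P, m: int = M) -> int:
    """H = (t_1*p^(k-1) + t_2*p^(k-2) + ... + t_k*p^0) mod m, via Horner's rule."""
    h = 0
    for byte in block:
        h = (h * p + byte) % m
    return h


def rolling_fingerprints(payload: bytes, omega: int,
                         p: int = P, m: int = M) -> list[int]:
    """Fingerprints H_1 .. H_(n-omega+1) of all omega-byte blocks of payload."""
    n = len(payload)
    if omega <= 0 or n < omega:
        return []
    top = pow(p, omega - 1, m)
    # Look-up table T: index 0..255 -> index * p^(omega-1), reduced mod m (assumption).
    table = [(b * top) % m for b in range(256)]
    h = direct_fingerprint(payload[:omega], p, m)    # H_1 computed directly
    fingerprints = [h]
    for i in range(n - omega):
        outgoing = payload[i]            # byte leaving the window
        incoming = payload[i + omega]    # byte entering the window
        h = ((h - table[outgoing]) * p + incoming) % m
        fingerprints.append(h)
    return fingerprints


payload = bytes(range(40))
rolled = rolling_fingerprints(payload, 8)
direct = [direct_fingerprint(payload[i:i + 8]) for i in range(len(payload) - 8 + 1)]
print(rolled == direct)   # True: the rolling update agrees with the direct formula
```

Each later fingerprint costs one table look-up, one subtraction, one multiplication and one addition, instead of Ω multiplications, which is the efficiency gain the look-up table is meant to provide.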
Fig. 3 is a schematic diagram of uniform sampling of characteristic fingerprints.
In this embodiment, as shown in Fig. 3, the characteristic fingerprints obtained are arranged in order to form the characteristic fingerprint sequence H_1, H_2, ..., H_{n-Ω+1}. A window of size w slides over this sequence from the start position in steps of one characteristic fingerprint. At each window position, the maximum value in the window is selected and stored in the characteristic fingerprint library as a sampled characteristic fingerprint. When the last window position has been processed, characteristic fingerprint sampling of the input packet is complete.
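The window-maximum sampling can be sketched as below. The function name `sample_fingerprints` and the example values are hypothetical. Two choices are assumptions of this sketch rather than statements of the patent: ties are broken in favor of the leftmost maximum, and a maximum selected again by an overlapping window is recognized by its position in the sequence, following the rule in the summary that only the first selection is stored.

```python
def sample_fingerprints(fps: list[int], w: int) -> list[tuple[int, int]]:
    """Return (position, fingerprint) of the maximum in every w-wide window.

    A maximum chosen again by an overlapping window is recorded only once.
    """
    if w <= 0:
        raise ValueError("window size w must be positive")
    sampled = []
    seen = set()   # positions already stored as sampled fingerprints
    for start in range(len(fps) - w + 1):
        # Leftmost position of the largest fingerprint in the current window.
        pos = max(range(start, start + w), key=lambda k: fps[k])
        if pos not in seen:
            seen.add(pos)
            sampled.append((pos, fps[pos]))
    return sampled


print(sample_fingerprints([3, 9, 2, 7, 5, 8, 1], 3))
# [(1, 9), (3, 7), (5, 8)]
```

This brute-force version rescans each window and costs on the order of n·w comparisons; a monotonic queue would reduce that to linear time, but the simpler form makes the selection rule easier to follow.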
The sampled characteristic fingerprints obtained from the first packet payload are stored in the characteristic fingerprint library, the first packet payload is stored in the buffer that has been set up, and the sampled characteristic fingerprints in the library are mapped to the first packet payload.
For the second or any subsequently received packet payload, the payload is stored in the buffer and, at the same time, the sampled characteristic fingerprints obtained from it are matched against those in the characteristic fingerprint library. If a match is found, the matched sampled characteristic fingerprint in the library is mapped to that packet payload; if not, the extracted sampled characteristic fingerprint is stored in the library and mapped to that packet payload.
Meanwhile, if a match is found, the received packet payload is matched against the mapped packet payload in the buffer, and the number of matched bytes is output as the redundant data block size for further redundant traffic identification. Specifically, the maximum content matching method starts matching from the positions, in the received packet payload and in the mapped packet payload in the buffer, of the data block corresponding to (associated with) the sampled characteristic fingerprint. This not only resolves potential hash collisions in Rabin polynomial fingerprint computation but also identifies redundant traffic of as many bytes as possible, which to some extent improves on the identification efficiency of approaches that use a fixed data block size. The maximum content matching flow shown in Fig. 4 comprises the following steps:
3.1) Align the data block corresponding to (associated with) the sampled characteristic fingerprint in the characteristic fingerprint library with the left and right boundary positions to which it is mapped in the buffer;
3.2) Check whether the buffer data within the aligned left and right boundaries exactly matches the content of the current data block to be matched;
3.3) If it matches, continue with the subsequent matching steps; otherwise, end the maximum content matching flow;
3.4) Starting from the aligned left boundary, compare byte by byte the buffer data to the left of the boundary with the data to the left of the left boundary of the current data block to be matched, until a mismatch occurs;
3.5) Starting from the aligned right boundary, compare byte by byte the buffer data to the right of the boundary with the data to the right of the right boundary of the current data block to be matched, until a mismatch occurs;
3.6) Add the numbers of bytes matched by the left and right extensions in steps 3.4) and 3.5) to the number of matched bytes of the current data block to be matched. The resulting MatchByte is the size of the successfully matched redundant data block.
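Steps 3.1) to 3.6) can be sketched in Python as follows. The function name `max_content_match`, its parameters and the example strings are hypothetical. The sketch assumes that the buffered payload and the received payload are each available as one contiguous byte string, and that the offsets of the aligned block in both are already known from the fingerprint library.

```python
def max_content_match(buf: bytes, buf_pos: int,
                      pkt: bytes, pkt_pos: int, omega: int) -> int:
    """Return MatchByte, the matched byte count, or 0 if the aligned blocks differ."""
    # 3.1)-3.3): align the two omega-byte blocks and require an exact match,
    # which filters out fingerprint (hash) collisions.
    if (buf_pos < 0 or pkt_pos < 0
            or buf_pos + omega > len(buf) or pkt_pos + omega > len(pkt)):
        return 0
    if buf[buf_pos:buf_pos + omega] != pkt[pkt_pos:pkt_pos + omega]:
        return 0

    # 3.4): extend to the left, byte by byte, until a mismatch or a boundary.
    left = 0
    while (buf_pos - left - 1 >= 0 and pkt_pos - left - 1 >= 0
           and buf[buf_pos - left - 1] == pkt[pkt_pos - left - 1]):
        left += 1

    # 3.5): extend to the right in the same way.
    right = 0
    while (buf_pos + omega + right < len(buf)
           and pkt_pos + omega + right < len(pkt)
           and buf[buf_pos + omega + right] == pkt[pkt_pos + omega + right]):
        right += 1

    # 3.6): left extension + aligned block + right extension.
    return left + omega + right


buf = b"AAthe quick brown foxBB"
pkt = b"CCCthe quick brown foxDD"
size = max_content_match(buf, buf.index(b"quick"), pkt, pkt.index(b"quick"), 5)
print(size)   # 19, the length of b"the quick brown fox"
```

Because the aligned blocks are compared byte for byte before any extension, two different blocks that happen to share a fingerprint are rejected at step 3.3) rather than counted as redundant.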
Redundant traffic identification in the invention is completed by statistical analysis of the records output by the maximum content matching stage; the record format is shown in Fig. 5. Each record represents one successfully identified redundant data block. Its len field gives the number of bytes of the redundant data block computed after characteristic fingerprint matching and maximum content matching, and its sec field gives the capture time, accurate to the second, of the packet to which the data block belongs. From these two fields, the amount of redundant traffic present in the network traffic in each second can be calculated. The other fields of the record can be used for more complex analysis of redundant traffic attributes.
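The per-second statistic can be illustrated with a short Python sketch. Only the len and sec fields are described in the text, so representing each record as a dictionary with those two keys is an assumption made here for illustration; the actual record layout is the one shown in Fig. 5, and the values below are made up.

```python
from collections import defaultdict


def redundant_bytes_per_second(records):
    """Sum the len field of all records that share the same sec value."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["sec"]] += rec["len"]
    return dict(totals)


# Hypothetical records: two redundant blocks captured in second 100, one in 101.
records = [
    {"sec": 100, "len": 64},
    {"sec": 100, "len": 128},
    {"sec": 101, "len": 32},
]
print(redundant_bytes_per_second(records))   # {100: 192, 101: 32}
```

Dividing each per-second total by the total traffic captured in the same second would give a per-second redundancy ratio, provided the total traffic volume is recorded separately.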
Next, check whether the buffer has enough remaining space to store the packet payload currently being processed. If not, refresh a portion of the buffer of the configured size using the first in, first out (FIFO) aging mechanism to free enough buffer space. Then store the packet payload in the freed buffer space.
Finally, dynamic tracking of the sampled characteristic fingerprints is performed:
A. If a sampled characteristic fingerprint is matched, map the matched sampled characteristic fingerprint in the characteristic fingerprint library to the second or subsequently received packet payload; if it is not matched, store the extracted sampled characteristic fingerprint in the library and map it to the second or subsequently received packet payload;
B. Process each subsequently received packet payload by the method of step 2.2). When the buffer is full of packet payloads, refresh it using the first in, first out (FIFO) aging mechanism to make room for subsequently arriving packet payloads. During refreshing, the sampled characteristic fingerprints in the library that are mapped to the removed packet payloads are cleared.
The invention uses a fixed-size buffer to store the packet payloads in which redundant traffic is to be identified. Each sampled characteristic fingerprint in the characteristic fingerprint library is mapped to a specific offset within the corresponding packet in the buffer. As packet payloads accumulate, once the buffer is full the FIFO aging mechanism refreshes an aging region of a specific size, freeing new space for subsequently arriving packet payloads. When the aging region is refreshed, the mappings from the library to packet payloads in that region become invalid, and all characteristic fingerprints associated with those packet payloads are removed from the library at the same time.
The dynamic tracking of characteristic fingerprints shown in Fig. 6 addresses the defect that refreshing the aging region may remove the identifiable characteristic fingerprints of some high-frequency redundant data blocks. In practice, dynamic tracking iteratively updates each matched characteristic fingerprint in the library so that it maps to the start position in the most recently identified packet. Even if the aging mechanism removes the data region to which a fingerprint was previously mapped, the identification of subsequent high-frequency redundant data blocks is not seriously affected. Applying dynamic tracking to every sampled characteristic fingerprint used for matching thus preserves the invention's ability to track and sustainably identify high-frequency redundant data blocks.
In Fig. 6, the buffer enters the aging stage and is refreshed after it has stored m packets. The sampled characteristic fingerprint H_{s3} in the characteristic fingerprint library is initially mapped to packet payload 1 in the buffer and, through successive iterations, is finally mapped to packet m+1. Although the FIFO aging mechanism refreshes the buffer region holding packet payloads 1, 2 and 3, this does not prevent the method from identifying the data block in packet m+1 that maps to H_{s3}.
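The interaction between FIFO aging and dynamic tracking can be sketched with a small Python model. Everything named here (`RedundancyTracker`, `receive`, the `track` flag, the packet sequence) is hypothetical. The model makes simplifying assumptions: the buffer capacity is counted in packets, one packet is aged out at a time, payload bytes are not stored, and sampled fingerprints are supplied by the caller as (offset, fingerprint) pairs. The `track` flag exists only to contrast behavior with and without the remapping step.

```python
from collections import deque


class RedundancyTracker:
    """Toy model of the fingerprint library, FIFO buffer and dynamic tracking."""

    def __init__(self, capacity: int, track: bool = True):
        self.capacity = capacity   # buffer size in packets (assumption)
        self.track = track         # remap matched fingerprints to the newest packet
        self.buffer = deque()      # packet ids, oldest first
        self.library = {}          # fingerprint -> (packet id, offset)

    def receive(self, pkt_id, sampled):
        """Process one packet; return the fingerprints that matched the library."""
        matched = [fp for _, fp in sampled if fp in self.library]
        # FIFO aging: evict the oldest packet and every fingerprint mapped to it.
        while len(self.buffer) >= self.capacity:
            old = self.buffer.popleft()
            self.library = {fp: loc for fp, loc in self.library.items()
                            if loc[0] != old}
        self.buffer.append(pkt_id)
        for offset, fp in sampled:
            # New fingerprints are always stored; matched ones are remapped
            # to the newest packet only when dynamic tracking is enabled.
            if fp not in self.library or self.track:
                self.library[fp] = (pkt_id, offset)
        return matched


def run(track: bool):
    tracker = RedundancyTracker(capacity=3, track=track)
    a, b, c = 0xA, 0xB, 0xC
    packets = [[a], [a], [b], [c], [a]]   # fingerprint a recurs in packets 1, 2, 5
    return [tracker.receive(i, [(0, fp) for fp in fps])
            for i, fps in enumerate(packets, start=1)]


print(run(track=True))    # [[], [10], [], [], [10]]
print(run(track=False))   # [[], [10], [], [], []]
```

Without tracking, fingerprint 0xA stays mapped to packet 1 and is cleared when packet 1 is aged out, so packet 5 no longer matches even though packet 2, which contains the same block, is still buffered. With tracking, the fingerprint was remapped to packet 2 and survives the eviction.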
To illustrate the beneficial effects of the invention, a comparative experiment was designed to compare the redundant traffic identification capability of the MAXP method, which is based on maximum sampled characteristic fingerprints, with that of the method of the invention.
Table 1 lists the sample data of the experiment. Sample sets A, B and C are the uninterrupted bidirectional inbound and outbound traffic of the campus network access link between 10:00 and 11:00 in the morning on the three days from November 13 to November 15, 2013. Sample sets D and E are the uninterrupted bidirectional inbound and outbound traffic of the campus network complex building link between 2:00 and 3:00 in the afternoon on the two days November 16 and November 17, 2013.
| Sample set | Source | Description | Date | Total volume (GB) |
|---|---|---|---|---|
| A | Campus network access link | Uninterrupted bidirectional traffic | November 13, 2013 | 37.8 |
| B | Campus network access link | Uninterrupted bidirectional traffic | November 14, 2013 | 27.7 |
| C | Campus network access link | Uninterrupted bidirectional traffic | November 15, 2013 | 31.6 |
| D | Campus network complex building link | Uninterrupted bidirectional traffic | November 16, 2013 | 19.5 |
| E | Campus network complex building link | Uninterrupted bidirectional traffic | November 17, 2013 | 13.4 |

Table 1
The experimental results are shown in Table 2, which gives the average redundant traffic identification rate of the MAXP method and of the method of the invention on the five sample sets.
| Sample set | MAXP method average identification rate | Method of the invention average identification rate |
|---|---|---|
| A | 17.5% | 21.6% |
| B | 21.3% | 22.8% |
| C | 23.6% | 20.9% |
| D | 21.2% | 21.6% |
| E | 20.4% | 21.2% |

Table 2
According to the statistics in Table 2, the MAXP method achieves an average redundant traffic identification rate of 20.8% over the five sample sets, while the method of the invention reaches 21.6% on the same sample sets. The method of the invention therefore slightly improves the redundant traffic identification capability over the existing method. In addition, the results in Table 2 show that the average identification rate of the MAXP method varies widely across sample sets, with a spread of 6.1 percentage points, whereas that of the method of the invention varies relatively little, with a spread of 1.9 percentage points. Further analysis shows that the variance of the average identification rate is 3.86 for the MAXP method and 0.42 for the method of the invention, so the method of the invention gives a more stable average identification rate.
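The summary figures quoted above can be reproduced from the Table 2 values with a few lines of Python. The reported variances agree with the population variance (division by the number of sample sets), which is why `pvariance` is used here; that reading is inferred from the numbers and is not stated in the text.

```python
from statistics import mean, pvariance

# Average identification rates (%) from Table 2, sample sets A to E.
maxp = [17.5, 21.3, 23.6, 21.2, 20.4]
invention = [21.6, 22.8, 20.9, 21.6, 21.2]

for name, rates in (("MAXP", maxp), ("invention", invention)):
    spread = max(rates) - min(rates)
    print(name,
          round(mean(rates), 2),        # average over the five sample sets
          round(spread, 1),             # max minus min, in percentage points
          round(pvariance(rates), 2))   # population variance
# MAXP 20.8 6.1 3.86
# invention 21.62 1.9 0.42
```

The computed mean of 21.62% for the method of the invention corresponds to the 21.6% quoted in the text after rounding to one decimal place.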
The above experiment and analysis show that the method of the invention is readily implementable and combines the uniform interval sampling of the MAXP method with the DYNABYTE method's ability to dynamically track changes in redundant traffic. Compared with existing redundant traffic identification methods, the method of the invention maintains the redundant traffic identification capability while also improving the stability of that capability across different samples, so it can adapt to different real network environments and is easy to deploy. The identification capability achieved by the method benefits from two factors: uniform sampling of characteristic fingerprints ensures that the sampled characteristic fingerprints are representative of the buffered payloads and cover them evenly, and dynamic tracking effectively compensates for the intermittent invalidation of characteristic fingerprints mapped to high-frequency redundant blocks in the library that is caused when the buffer aging mechanism refreshes the buffer. The method of the invention has reference value for studying the utilization of network bandwidth resources and for the rational planning of network resources.
Although illustrative embodiments of the invention have been described above so that those skilled in the art can understand the invention, it should be clear that the invention is not limited to the scope of these embodiments. To those of ordinary skill in the art, various changes are obvious as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all innovations that make use of the inventive concept are protected.

Claims (2)

1. A network redundant traffic identification method based on uniform sampling, characterized by comprising the following steps:
(1) Uniform sampling of characteristic fingerprints
1.1) For the first received packet payload t_1, t_2, t_3, ..., t_n, slide a window of size Ω from the start position in steps of one byte to divide the payload into n-Ω+1 consecutive data blocks of size Ω: t_1, t_2, t_3, ..., t_Ω; t_2, t_3, t_4, ..., t_{Ω+1}; ...; t_{n-Ω+1}, t_{n-Ω+2}, t_{n-Ω+3}, ..., t_n, where n is the number of bytes in the packet payload;
1.2) For the n-Ω+1 data blocks, compute the characteristic fingerprint mapped to each data block using the Rabin polynomial. The mapping between data blocks and characteristic fingerprints is, in order:
$$
\begin{aligned}
H_1 &= RF(t_1, t_2, t_3, \dots, t_\Omega) = \left(t_1 p^{\Omega-1} + t_2 p^{\Omega-2} + \dots + t_{\Omega-1} p^{1} + t_\Omega p^{0}\right) \bmod M \\
H_2 &= RF(t_2, t_3, t_4, \dots, t_{\Omega+1}) = \left(\left(RF(t_1, t_2, t_3, \dots, t_\Omega) - t_1 p^{\Omega-1}\right) \cdot p + t_{\Omega+1} p^{0}\right) \bmod M \\
&\;\;\vdots \\
H_{n-\Omega+1} &= RF(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \dots, t_n) = \left(\left(RF(t_{n-\Omega}, t_{n-\Omega+1}, t_{n-\Omega+2}, \dots, t_{n-1}) - t_{n-\Omega} p^{\Omega-1}\right) \cdot p + t_n p^{0}\right) \bmod M
\end{aligned}
\tag{1}
$$
where H_1, H_2, ..., H_{n-Ω+1} are the characteristic fingerprints of the n-Ω+1 data blocks, mod is the remainder operation, M is a constant chosen according to the specific situation, and RF denotes the mapping operation;
The characteristic fingerprint H_1 mapped to the first data block t_1, t_2, t_3, ..., t_Ω is computed directly from formula (1). Then a look-up table T is consulted with the value of the single byte t_i as the look-up index to obtain t_i p^{Ω-1}, i = 1, 2, ..., n-Ω. Finally, formula (1) is used to compute the characteristic fingerprints H_2, ..., H_{n-Ω+1} of the data blocks t_2, t_3, t_4, ..., t_{Ω+1}; ...; t_{n-Ω+1}, t_{n-Ω+2}, t_{n-Ω+3}, ..., t_n. The look-up table T has look-up indexes 0 to 255, and the output value for each index is the product of that index and p^{Ω-1};
1.3) Arrange the characteristic fingerprints obtained in step 1.2) in order to form the characteristic fingerprint sequence H_1, H_2, ..., H_{n-Ω+1}. Slide a window of size w over this sequence from the start position in steps of one characteristic fingerprint. At each window position, select the maximum value in the window and store it in the characteristic fingerprint library as a sampled characteristic fingerprint. When the last window position has been processed, characteristic fingerprint sampling of the input packet is complete;
When different window positions select the same sampled characteristic fingerprint because they overlap, only the first selection is stored;
(2) Dynamic tracking of sampled characteristic fingerprints
2.1) Set up a buffer, store the first input packet payload in it, and map the sampled characteristic fingerprints in the characteristic fingerprint library to the first packet payload;
2.2) For the second received packet payload, first store it in the buffer, then extract its sampled characteristic fingerprints using the method of step (1), match them one by one against the characteristic fingerprint library, and perform dynamic tracking: if a sampled characteristic fingerprint is matched, map the matched sampled characteristic fingerprint in the library to the second packet payload; if it is not matched, store the extracted sampled characteristic fingerprint in the library and map it to the second packet payload;
2.3) Process each subsequently received packet payload by the method of step 2.2). When the buffer is full of packet payloads, refresh it using the first in, first out (FIFO) aging mechanism to make room for subsequently arriving packet payloads. During refreshing, the sampled characteristic fingerprints in the library that are mapped to the removed packet payloads are cleared;
(3) Redundant traffic identification
For each sampled characteristic fingerprint extracted in step (2) that is successfully matched in the characteristic fingerprint library, use the maximum content matching method: starting from the data block corresponding to the sampled characteristic fingerprint, match the received packet payload against the mapped packet payload in the buffer, and output the number of matched bytes as the redundant data block size;
Described greatest content matching method is:
3.1), in alignment feature fingerprint base, the corresponding data block of sampling characteristic fingerprint is mapped in the right boundary position of buffer area;
3.2), check buffering area alignment right boundary limit in the range of data whether and current data block contents to be matched it is complete Matching;
3.3), if it does, then continuing executing with subsequent match process, greatest content matching flow is otherwise terminated;
3.4) buffering area left margin, is matched after alignment successively by byte with left data and current data block left margin to be matched with a left side Data, until it fails to match;
3.5) buffering area right margin, is matched after alignment successively by byte with right data and current data block right margin to be matched with the right side Data, until it fails to match;
3.6), accumulating step 3.4), step 3.5) left and right extension byte number and current data block to be matched MatchByte, Succeed the MatchByte i.e. redundant data block size of matching;
Each redundant data block size sum of unit interval is counted, redundant flow size is obtained and identifies redundant flow.
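Steps 3.1) to 3.6) can be sketched in Python as below. The function name `greatest_content_match` and its parameters (start offsets of the data block in the buffered payload and in the current payload, plus the block length) are assumptions for illustration; the claim does not prescribe how the boundary positions are stored.

```python
def greatest_content_match(buffered, buf_start, current, cur_start, block_len):
    """Return MatchByte, or 0 when the aligned data blocks differ."""
    buf_end = buf_start + block_len
    cur_end = cur_start + block_len
    # Steps 3.1) to 3.3): align the block boundaries and require the
    # block contents to match completely before extending.
    if buf_end > len(buffered) or cur_end > len(current):
        return 0
    if buffered[buf_start:buf_end] != current[cur_start:cur_end]:
        return 0
    # Step 3.4): extend to the left, byte by byte, until a mismatch.
    left = 0
    while (buf_start - left > 0 and cur_start - left > 0
           and buffered[buf_start - left - 1] == current[cur_start - left - 1]):
        left += 1
    # Step 3.5): extend to the right, byte by byte, until a mismatch.
    right = 0
    while (buf_end + right < len(buffered) and cur_end + right < len(current)
           and buffered[buf_end + right] == current[cur_end + right]):
        right += 1
    # Step 3.6): extensions plus the block itself give MatchByte.
    return left + block_len + right
```

The redundant flow size for a unit of time would then be the sum of the MatchByte values returned during that interval, for example `sum(sizes)` over a list of results.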
2. The network redundancy method for recognizing flux according to claim 1, characterized in that, in step 1.2), p takes the value 2 and M takes the value 0x100000000 (decimal 4294967296), so that the computed characteristic fingerprint values are limited to the 32-bit range.
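The effect of these parameter values can be checked with a short sketch. The polynomial form used in `window_fingerprint` is an assumption, since the formula of step 1.2) is not reproduced in this part of the text; only the values of p and M come from the claim.

```python
P = 2
M = 0x100000000  # equals 2**32, i.e. decimal 4294967296


def window_fingerprint(window_bytes):
    """Polynomial fingerprint of one window of bytes, reduced modulo M.

    Hypothetical form: each step multiplies the running value by P,
    adds the next byte, and reduces modulo M.
    """
    h = 0
    for b in window_bytes:
        h = (h * P + b) % M
    return h
```

Since M is 2**32, reducing modulo M keeps only the low 32 bits of the value, so every fingerprint produced this way fits in an unsigned 32-bit integer.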
CN201410730071.7A 2014-12-04 2014-12-04 A kind of network redundancy method for recognizing flux based on uniform sampling Active CN104394091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410730071.7A CN104394091B (en) 2014-12-04 2014-12-04 A kind of network redundancy method for recognizing flux based on uniform sampling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410730071.7A CN104394091B (en) 2014-12-04 2014-12-04 A kind of network redundancy method for recognizing flux based on uniform sampling

Publications (2)

Publication Number Publication Date
CN104394091A CN104394091A (en) 2015-03-04
CN104394091B true CN104394091B (en) 2017-07-18

Family

ID=52611927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410730071.7A Active CN104394091B (en) 2014-12-04 2014-12-04 A kind of network redundancy method for recognizing flux based on uniform sampling

Country Status (1)

Country Link
CN (1) CN104394091B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105357071B (en) * 2015-11-12 2018-08-31 成都科来软件有限公司 A kind of network complexity method for recognizing flux and identifying system
CN110083743B (en) * 2019-03-28 2021-11-16 哈尔滨工业大学(深圳) Rapid similar data detection method based on unified sampling
CN110031701B (en) * 2019-04-15 2021-05-25 杭州拓深科技有限公司 Electrical appliance characteristic detection method based on current fingerprint technology

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216791A (en) * 2008-01-04 2008-07-09 华中科技大学 File backup method based on fingerprint
CN103514250A (en) * 2013-06-20 2014-01-15 易乐天 Method and system for deleting global repeating data and storage device
CN103888317A (en) * 2014-03-31 2014-06-25 西南科技大学 Protocol-independent network redundant flow eliminating method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8315984B2 (en) * 2007-05-22 2012-11-20 Netapp, Inc. System and method for on-the-fly elimination of redundant data

Also Published As

Publication number Publication date
CN104394091A (en) 2015-03-04

Similar Documents

Publication Publication Date Title
CN104394091B (en) A kind of network redundancy method for recognizing flux based on uniform sampling
US8391164B2 (en) Computing time-decayed aggregates in data streams
CN103078709B (en) Data redundancy recognition methods
CN104484673B (en) The Supplementing Data method of real-time stream application of pattern recognition
CN101447928B (en) Method and device for processing fragment information
CN103888317B (en) A kind of unrelated network redundancy flow removing method of agreement
CN104426770A (en) Routing lookup method, routing lookup device and method for constructing B-Tree tree structure
CN108243241A (en) A kind of storage mode of block chain transaction and queueing form
CN102932479B (en) Virtual network mapping method for realizing topology awareness based on historical data
CN101188477A (en) A data packet sequence receiving method and device
CN110110936A (en) Estimation method, estimation device, storage medium and the electronic equipment of order duration
CN108734361A (en) Share-car order processing method and apparatus
CN110288044A (en) A kind of trajectory simplification method divided based on track with Priority Queues
CN106528844B (en) A kind of data request method and device and data-storage system
CN104778193B (en) Data duplicate removal method and device
CN113064730A (en) Block chain transaction execution method, block chain node and control device
CN104461774B (en) Asynchronous replication method, apparatus and system
CN108712337B (en) Multipath bandwidth scheduling method in high-performance network
CN109858559B (en) Self-adaptive traffic analysis road network simplification method based on traffic flow macroscopic basic graph
CN109359735B (en) Data input device and method for accelerating deep neural network hardware
CN112883067A (en) Block chain transaction execution method, block chain node and control device
CN105812204B (en) A kind of recurrence name server online recognition method based on Connected degree estimation
WO2015078007A1 (en) Quick human face alignment method
CN106100921A (en) The dynamic streaming figure parallel samples method synchronized based on dot information
CN112988818B (en) Block chain transaction execution method, block chain node and control device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Ma Qiang, Zhang Qi, Xing Ling, Yang Guohai, He Yanling

Inventor before: Xing Ling, He Yanling, Ma Qiang, Yang Guohai

GR01 Patent grant