CN104394091A - Uniform sampling based network redundancy traffic identification method - Google Patents

Publication number: CN104394091A
Authority: CN (China)
Prior art keywords: characteristic fingerprint, sampling, characteristic, data block, fingerprint
Legal status: Granted, active (the legal status is an assumption and is not a legal conclusion)
Application number: CN201410730071.7A
Other languages: Chinese (zh)
Other versions: CN104394091B (en)
Inventors: 邢玲, 何燕玲, 马强, 杨国海
Current assignee: Southwest University of Science and Technology (the listed assignees may be inaccurate)
Original assignee: Southwest University of Science and Technology
Classification: Data Exchanges In Wide-Area Networks

Abstract

The invention discloses a method for identifying redundant network traffic based on uniform sampling. The method samples characteristic fingerprints uniformly: a window of fixed size slides continuously over the fingerprint sequence, and the largest characteristic fingerprint in each window is selected as a sampled characteristic fingerprint and stored in a characteristic fingerprint database. The method also tracks the sampled characteristic fingerprints dynamically: when a fingerprint in the database is matched, it is updated to point to (be mapped to) the matching packet payload in the buffer. This prevents fingerprints mapped to high-frequency redundant packet payloads from being cleared from the database when the buffer is refreshed, and so maintains the continuity of redundant traffic identification.

Description

A network redundant traffic identification method based on uniform sampling
Technical field
The invention belongs to the technical field of network traffic management and, more specifically, relates to a network redundant traffic identification method based on uniform sampling, used to identify the redundant portion of network traffic.
Background
Driven by user interest patterns, users in an edge network who share the same interests access Internet resources on similar or identical topics. This inevitably causes a large amount of repeated data transmission and creates redundant traffic on particular links. Redundant traffic not only wastes link bandwidth but also degrades users' experience of accessing network resources and, to some extent, dampens their enthusiasm.
Effectively identifying redundant traffic in the network is the key to studying its causes and the series of problems it brings. Traditional web caching identifies redundant traffic at the object layer, but each application requires its caching details to be redesigned, so the approach lacks flexibility.
In recent years the packet-level methods MODP, MAXP, SAMPBYTE and DYNABYTE have been proposed in succession and have achieved good identification efficiency. MODP computes Rabin polynomial fingerprints of consecutive data blocks and samples the fingerprints whose value modulo a chosen constant is 0; its sampling is non-uniform and can yield no samples at all (the zero-sampling defect). MAXP divides the fingerprint sequence evenly with windows of fixed size and selects the maximum in each window as the sampled characteristic fingerprint. This overcomes the non-uniform sampling of MODP, but MAXP cannot track the dynamic behavior of high-frequency redundant data blocks in real traffic well. SAMPBYTE and DYNABYTE take a statistical approach: they use training samples to select representative first bytes of redundant blocks as sampling features. Compared with SAMPBYTE, DYNABYTE adds dynamic adjustment of the sampling features and so achieves, to some extent, dynamic tracking of high-frequency redundant blocks in real traffic. However, because both methods select characteristic fingerprints through sample training, they are strongly affected by the choice of sample data, which limits deployment flexibility. None of the above methods achieves both uniform sampling and dynamic tracking of high-frequency redundant blocks.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art and provide a network redundant traffic identification method based on uniform sampling. While solving the problem of identifying redundant traffic in real network environments, the method combines uniform sampling of characteristic fingerprints with dynamic tracking of high-frequency redundant blocks, improving the effectiveness and the identification rate of redundant traffic identification.
To achieve the above object, the network redundant traffic identification method based on uniform sampling of the invention is characterized by comprising the following steps:
(1) Uniform sampling of characteristic fingerprints
1.1) For the first received packet payload $t_1, t_2, t_3, \ldots, t_n$, slide a window of size $\Omega$ from the start position in steps of one byte to divide the payload into $n-\Omega+1$ consecutive data blocks of size $\Omega$: $(t_1, t_2, t_3, \ldots, t_\Omega)$, $(t_2, t_3, t_4, \ldots, t_{\Omega+1})$, ..., $(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \ldots, t_n)$, where $n$ is the number of bytes in the packet payload;
1.2) For the $n-\Omega+1$ data blocks of size $\Omega$, compute the characteristic fingerprint mapped to each data block with the Rabin polynomial. The mapping between data blocks and characteristic fingerprints is, in order:
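$$
\begin{aligned}
H_1 &= RF(t_1, t_2, t_3, \ldots, t_\Omega) = \left(t_1 p^{\Omega-1} + t_2 p^{\Omega-2} + \cdots + t_{\Omega-1} p^{1} + t_\Omega p^{0}\right) \bmod M \\
H_2 &= RF(t_2, t_3, t_4, \ldots, t_{\Omega+1}) = \left(\left(RF(t_1, t_2, t_3, \ldots, t_\Omega) - t_1 p^{\Omega-1}\right) \cdot p + t_{\Omega+1} p^{0}\right) \bmod M \\
&\;\;\vdots \\
H_{n-\Omega+1} &= RF(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \ldots, t_n) = \left(\left(RF(t_{n-\Omega}, t_{n-\Omega+1}, t_{n-\Omega+2}, \ldots, t_{n-1}) - t_{n-\Omega} p^{\Omega-1}\right) \cdot p + t_n p^{0}\right) \bmod M
\end{aligned}
\tag{1}
$$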
where $H_1, H_2, \ldots, H_{n-\Omega+1}$ are the characteristic fingerprints of the $n-\Omega+1$ data blocks, mod is the remainder operation, $M$ is a constant chosen according to the specific circumstances, and $RF$ denotes the mapping operation;
First, compute the characteristic fingerprint $H_1$ mapped to data block $(t_1, t_2, t_3, \ldots, t_\Omega)$ by formula (1). Then, using the lookup table T with the value of byte $t_i$ as the lookup index, obtain the value $t_i p^{\Omega-1}$ for $i = 1, 2, \ldots, n-\Omega$. Finally, compute by formula (1) the characteristic fingerprints $H_2, \ldots, H_{n-\Omega+1}$ of data blocks $(t_2, t_3, t_4, \ldots, t_{\Omega+1})$, ..., $(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \ldots, t_n)$. The lookup table T contains the lookup indexes 0 to 255, and the output value for each index is the product of that index and $p^{\Omega-1}$;
1.3) Arrange the characteristic fingerprints obtained in step 1.2) in order to form the characteristic fingerprint sequence $H_1, H_2, \ldots, H_{n-\Omega+1}$. Slide a window of size $w$ over this sequence from the start position in steps of one characteristic fingerprint. At each position, select the maximum value in the window as a sampled characteristic fingerprint and store it in the characteristic fingerprint database. When the last window has been processed, characteristic fingerprint sampling of the input packet is complete;
When different sliding windows select the same sampled characteristic fingerprint because they overlap, only the first selection is stored;
(2) Dynamic tracking of sampled characteristic fingerprints
2.1) Set up a buffer, store the first input packet payload in it, and map the sampled characteristic fingerprints in the characteristic fingerprint database to the first packet payload;
2.2) For the second received packet payload, first store it in the buffer, then extract its sampled characteristic fingerprints by the method of step (1), match them one by one against the characteristic fingerprint database, and perform dynamic tracking: if a sampled characteristic fingerprint is matched, map the matched fingerprint in the database to the second packet payload; if it is not matched, store the extracted sampled characteristic fingerprint in the database and map it to the second packet payload;
2.3) Process each packet payload received subsequently by the method of step 2.2). Once the buffer is full of packet payloads, refresh it using a first-in-first-out (FIFO) aging mechanism to make room for packet payloads that arrive later. During refreshing, the sampled characteristic fingerprints in the database that are mapped to the removed packet payloads are cleared;
(3) Redundant traffic identification
For each sampled characteristic fingerprint extracted in step (2), if it is matched successfully in the characteristic fingerprint database, apply maximal content matching: starting from the data block corresponding to the sampled characteristic fingerprint, match the received packet payload against the mapped packet payload in the buffer, and output the number of matched bytes, that is, the redundant data block size;
Sum the sizes of the redundant data blocks per unit time to obtain the redundant traffic volume, which identifies the redundant traffic.
The object of the invention is achieved as follows:
The network redundant traffic identification method based on uniform sampling of the invention samples characteristic fingerprints uniformly: a window of fixed size slides continuously, and the largest characteristic fingerprint in each window is selected as a sampled characteristic fingerprint and stored in the characteristic fingerprint database. It also tracks the sampled characteristic fingerprints dynamically: while the database is searched to identify redundant data blocks, each matched fingerprint in the database is updated to point to (be mapped to) the matching packet payload in the buffer. This prevents buffer refreshing from clearing the fingerprints mapped to high-frequency redundant packet payloads in the database, and so keeps redundant traffic identification sustainable.
Compared with the prior art, the invention has the following four beneficial effects:
(1) Uniform sampling of characteristic fingerprints with a continuously sliding window makes each sample strongly representative of its interval, which ensures that the invention identifies redundant traffic effectively;
(2) Dynamic tracking of sampled characteristic fingerprints solves the problem of sampled fingerprints becoming invalid when the buffer is aged (refreshed). It ensures dynamic tracking and sustained identification of high-frequency redundant data blocks and further improves the redundant traffic identification rate;
(3) The invention processes packet-level data and is not restricted by application-layer protocols, so it is highly flexible in application;
(4) The invention needs no sample training. Its uniform sampling and dynamic tracking of characteristic fingerprints adapt to any network node environment, so deployment is flexible.
Brief description of the drawings
Fig. 1 is a flow chart of one embodiment of the network redundant traffic identification method based on uniform sampling of the invention;
Fig. 2 is a schematic diagram of dividing a packet payload into data blocks and mapping them to characteristic fingerprints;
Fig. 3 is a schematic diagram of uniform sampling of characteristic fingerprints;
Fig. 4 is a flow chart of maximal content matching;
Fig. 5 shows the record format output by redundant traffic identification;
Fig. 6 is a schematic diagram of dynamic tracking of characteristic fingerprints;
Fig. 7 is a schematic diagram of the campus network payload rate and the identified redundant traffic ratio.
Detailed description of the embodiments
Specific embodiments of the invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the main content of the invention.
Fig. 1 is a flow chart of one embodiment of the network redundant traffic identification method based on uniform sampling of the invention.
In this embodiment, as shown in Fig. 1, the input packet payload is first divided into consecutive data blocks of a preset fixed size $\Omega$. A packet payload of $n$ bytes, $t_1, t_2, t_3, \ldots, t_n$, yields $n-\Omega+1$ consecutive data blocks $(t_1, t_2, t_3, \ldots, t_\Omega)$, $(t_2, t_3, t_4, \ldots, t_{\Omega+1})$, ..., $(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \ldots, t_n)$.
Then the characteristic fingerprint $H_x$, $x \in [1, n-\Omega+1]$, of each data block is computed with the Rabin polynomial, giving $H_1, H_2, \ldots, H_{n-\Omega+1}$. Fig. 2 shows the mapping between characteristic fingerprints and data blocks.
The characteristic fingerprint $H_1$ mapped to the first data block $(t_1, t_2, t_3, \ldots, t_\Omega)$ is computed directly by formula (1). For the second and later data blocks, the lookup table T is consulted first, with the value of byte $t_i$ as the lookup index, to obtain $t_i p^{\Omega-1}$ for $i = 1, 2, \ldots, n-\Omega$. The characteristic fingerprints $H_2, \ldots, H_{n-\Omega+1}$ of data blocks $(t_2, t_3, t_4, \ldots, t_{\Omega+1})$, ..., $(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \ldots, t_n)$ are then computed by formula (1). The lookup table T contains the lookup indexes 0 to 255, and the output value for each index is the product of that index and $p^{\Omega-1}$. Precomputing these products greatly improves computational efficiency.
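The rolling computation described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `rabin_fingerprints` is hypothetical, the defaults p = 2 and M = 2^32 are the values this embodiment names, and reducing the lookup table entries modulo M is an assumption that leaves the result of formula (1) unchanged.

```python
def rabin_fingerprints(payload: bytes, omega: int, p: int = 2, m: int = 1 << 32) -> list[int]:
    """Return H_1 .. H_{n-omega+1}, one fingerprint per omega-byte block of payload."""
    n = len(payload)
    if omega <= 0 or n < omega:
        return []

    # Lookup table T: byte value b -> b * p^(omega-1), reduced mod m (assumption).
    top = pow(p, omega - 1, m)
    table = [(b * top) % m for b in range(256)]

    # H_1: direct evaluation of the polynomial (Horner's rule).
    h = 0
    for b in payload[:omega]:
        h = (h * p + b) % m
    fingerprints = [h]

    # H_2 onward: drop the outgoing byte, shift by p, add the incoming byte.
    for i in range(omega, n):
        outgoing = payload[i - omega]
        h = ((h - table[outgoing]) * p + payload[i]) % m
        fingerprints.append(h)
    return fingerprints
```

Each fingerprint after the first costs one table lookup, one multiplication and one addition, regardless of the block size.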
In this embodiment, for ease of computation, $p$ is 2 and $M$ is 0x100000000 (4294967296 in decimal), which limits the computed characteristic fingerprint values to 32 bits.
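As a small worked check of formula (1), assume an illustrative block size $\Omega = 3$ and payload bytes $(1, 2, 3, 4)$; both are chosen here for the example and do not come from the text. With $p = 2$ and $M = 2^{32}$:

$$
\begin{aligned}
H_1 &= \left(1 \cdot 2^{2} + 2 \cdot 2^{1} + 3 \cdot 2^{0}\right) \bmod 2^{32} = 11 \\
H_2 &= \left(\left(H_1 - 1 \cdot 2^{2}\right) \cdot 2 + 4 \cdot 2^{0}\right) \bmod 2^{32} = (7 \cdot 2 + 4) \bmod 2^{32} = 18
\end{aligned}
$$

Evaluating the second block directly gives $2 \cdot 2^{2} + 3 \cdot 2^{1} + 4 \cdot 2^{0} = 18$, so the rolling update agrees with the direct polynomial.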
Fig. 3 is a schematic diagram of uniform sampling of characteristic fingerprints.
In this embodiment, as shown in Fig. 3, the characteristic fingerprints obtained are arranged in order to form the characteristic fingerprint sequence $H_1, H_2, \ldots, H_{n-\Omega+1}$. A window of size $w$ slides over this sequence from the start position in steps of one characteristic fingerprint. At each position, the maximum value in the window is selected as a sampled characteristic fingerprint and stored in the characteristic fingerprint database. When the last window has been processed, characteristic fingerprint sampling of the input packet is complete.
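The window-maximum sampling above can be sketched as follows. The function name `sample_fingerprints` and the handling of sequences shorter than the window are assumptions made for this illustration; the scan is the straightforward O(n·w) version rather than an optimized one.

```python
def sample_fingerprints(fps: list[int], w: int) -> list[tuple[int, int]]:
    """Pick the largest fingerprint in every w-wide window, sliding one step at a time.

    Returns (position, fingerprint) pairs. A maximum that wins several
    overlapping windows is recorded only the first time it is chosen.
    """
    if w <= 0:
        raise ValueError("window size w must be positive")
    samples: list[tuple[int, int]] = []
    last_pos = -1
    # Assumption: a sequence shorter than w yields no window and no samples.
    for start in range(len(fps) - w + 1):
        # Leftmost maximum of the window (max returns the first maximal item).
        pos = max(range(start, start + w), key=lambda j: fps[j])
        if pos != last_pos:
            samples.append((pos, fps[pos]))
            last_pos = pos
    return samples
```

Because every window contributes its maximum, any run of w consecutive fingerprints contains at least one sample, which is the uniform coverage property the method relies on.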
For the first packet payload, the sampled characteristic fingerprints are stored in the characteristic fingerprint database, the payload itself is stored in the buffer that has been set up, and the sampled fingerprints in the database are mapped to the first packet payload.
The second and each subsequently received packet payload is stored in the buffer, and at the same time the sampled characteristic fingerprints extracted from it are matched against those in the characteristic fingerprint database. If a fingerprint is matched, the matched fingerprint in the database is mapped to the newly received packet payload; if not, the extracted fingerprint is stored in the database and mapped to the newly received packet payload.
Meanwhile, if a fingerprint is matched, the received packet payload is matched against the mapped packet payload in the buffer, and the number of matched bytes, that is, the redundant data block size, is output for further redundant traffic identification. Specifically, maximal content matching starts from the positions, in the received packet payload and in the mapped packet payload in the buffer, of the data block associated with the sampled characteristic fingerprint. This not only resolves potential hash collisions among characteristic fingerprints computed with the Rabin polynomial, but also identifies as many redundant bytes as possible, which to some extent improves on identifying redundant traffic with a fixed data block size. Maximal content matching follows the flow shown in Fig. 4 and comprises the following steps:
3.1) Align the left and right boundary positions, in the buffer, of the data block associated with the sampled characteristic fingerprint in the characteristic fingerprint database;
3.2) Check whether the data within the aligned left and right boundaries in the buffer exactly matches the content of the current data block to be matched;
3.3) If it matches, continue with the subsequent matching steps; otherwise end the maximal content matching flow;
3.4) After alignment, compare byte by byte the data to the left of the left boundary in the buffer with the data to the left of the left boundary of the current data block, until a mismatch occurs;
3.5) After alignment, compare byte by byte the data to the right of the right boundary in the buffer with the data to the right of the right boundary of the current data block, until a mismatch occurs;
3.6) Add the numbers of bytes gained by the left and right expansion in steps 3.4) and 3.5) to the byte count of the current data block to obtain the number of successfully matched bytes, MatchByte, that is, the redundant data block size.
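Steps 3.1) to 3.6) can be sketched as follows. The function name `maximal_match` and its parameters are hypothetical: `c_off` and `p_off` stand for the offsets of the fingerprinted block in the cached payload and in the received payload, and returning 0 for a fingerprint collision is a convention chosen for this example.

```python
def maximal_match(cached: bytes, c_off: int, current: bytes, p_off: int, omega: int) -> int:
    """Return MatchByte, the number of matching bytes around an aligned block.

    Returns 0 when the aligned blocks differ, i.e. the fingerprints collided.
    """
    # Steps 3.1-3.3: the aligned omega-byte blocks must be identical.
    block = current[p_off:p_off + omega]
    if len(block) != omega or cached[c_off:c_off + omega] != block:
        return 0

    # Step 3.4: extend to the left, byte by byte, until a mismatch or an edge.
    left = 0
    while (c_off - left > 0 and p_off - left > 0
           and cached[c_off - left - 1] == current[p_off - left - 1]):
        left += 1

    # Step 3.5: extend to the right in the same way.
    c_end, p_end = c_off + omega, p_off + omega
    right = 0
    while (c_end + right < len(cached) and p_end + right < len(current)
           and cached[c_end + right] == current[p_end + right]):
        right += 1

    # Step 3.6: block size plus both expansions.
    return left + omega + right
```

For example, with `cached = b"xxHELLO WORLDyy"`, `current = b"abHELLO WORLDcd"` and a 4-byte block at offset 5 in both payloads, the function returns 11, the length of the shared run `HELLO WORLD`.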
Redundant traffic identification in the invention is completed by statistical analysis of the records output by the maximal content matching stage; the record format is shown in Fig. 5. Each record represents one successfully identified redundant data block. Its len field is the number of bytes in the redundant data block, computed after characteristic fingerprint matching and maximal content matching, and its sec field is the capture time, accurate to the second, of the packet to which the data block belongs. These two fields are sufficient to compute the redundant traffic volume present in the network in each second. The other fields in the record can be used for more complex analysis of redundant traffic attributes.
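The per-second statistic can be sketched as below. Only the `len` and `sec` fields are named in the text; representing each record as a Python dict, and the function name `redundant_bytes_per_second`, are assumptions for illustration, since the full record layout of Fig. 5 is not reproduced here.

```python
from collections import defaultdict


def redundant_bytes_per_second(records: list[dict]) -> dict[int, int]:
    """Sum the len field of all records that share the same sec value."""
    totals: defaultdict[int, int] = defaultdict(int)
    for record in records:
        totals[record["sec"]] += record["len"]
    return dict(totals)


# Hypothetical records: two redundant blocks in second 100, one in second 101.
example = [
    {"sec": 100, "len": 64},
    {"sec": 100, "len": 36},
    {"sec": 101, "len": 128},
]
per_second = redundant_bytes_per_second(example)
```

Dividing each per-second total by the payload bytes captured in the same second would give a redundancy ratio of the kind plotted in Fig. 7.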
Next, check whether the buffer has enough free space to store the packet payload currently being processed. If not, refresh the buffer by the configured size using the first-in-first-out (FIFO) aging mechanism to reserve enough space. Then store the packet payload in the reserved buffer space.
Finally, perform dynamic tracking of the sampled characteristic fingerprints:
A. If a sampled characteristic fingerprint is matched, map the matched fingerprint in the characteristic fingerprint database to the second, or subsequently received, packet payload; if it is not matched, store the extracted sampled characteristic fingerprint in the database and map it to the second, or subsequently received, packet payload;
B. Process each packet payload received subsequently by the method of step 2.2). Once the buffer is full of packet payloads, refresh it using the first-in-first-out (FIFO) aging mechanism to make room for packet payloads that arrive later. During refreshing, the sampled characteristic fingerprints in the database that are mapped to the removed packet payloads are cleared.
The invention uses a buffer of fixed size to store the packet payloads in which redundant traffic is to be identified, and each sampled characteristic fingerprint in the database is mapped to a specific offset of the corresponding packet in the buffer. As packet payloads accumulate, once the buffer is full a first-in-first-out (FIFO) aging mechanism refreshes an aging region of a specific size, freeing space for packet payloads that arrive later. When the aging region is refreshed, the packet payloads in that region to which fingerprints in the database are mapped become invalid, and all characteristic fingerprints associated with those packet payloads are removed from the database at the same time.
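A fixed-size buffer with FIFO aging might be sketched as follows. The class `PayloadBuffer` and its method names are hypothetical. One simplification is an assumption of this sketch: it evicts only as many of the oldest packets as needed to fit the new payload, whereas the text refreshes an aging region of a configured size.

```python
from collections import deque


class PayloadBuffer:
    """Fixed-capacity packet payload store with FIFO aging."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self.packets: deque[tuple[int, bytes]] = deque()  # (packet_id, payload), oldest first
        self.fingerprint_db: dict[int, tuple[int, int]] = {}  # fingerprint -> (packet_id, offset)
        self._next_id = 0

    def store(self, payload: bytes) -> int:
        """Store a payload, aging out the oldest packets if space is short."""
        while self.packets and self.used + len(payload) > self.capacity:
            self._evict_oldest()
        packet_id = self._next_id
        self._next_id += 1
        self.packets.append((packet_id, payload))
        self.used += len(payload)
        return packet_id

    def map_fingerprint(self, fingerprint: int, packet_id: int, offset: int) -> None:
        """Point a sampled fingerprint at an offset inside a buffered packet."""
        self.fingerprint_db[fingerprint] = (packet_id, offset)

    def _evict_oldest(self) -> None:
        old_id, old_payload = self.packets.popleft()
        self.used -= len(old_payload)
        # Clear every fingerprint that is still mapped to the evicted packet.
        stale = [fp for fp, (pid, _) in self.fingerprint_db.items() if pid == old_id]
        for fp in stale:
            del self.fingerprint_db[fp]
```

Scanning the whole database on each eviction keeps the sketch short; a per-packet index of fingerprints would avoid the scan in a larger system.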
The dynamic tracking of characteristic fingerprints shown in Fig. 6 addresses the defect that refreshing the aging region of the buffer may remove the characteristic fingerprints by which some high-frequency redundant data blocks can be identified. In practice, dynamic tracking iteratively updates each matched characteristic fingerprint in the database so that it maps to the most recently identified packet. Even if the aging mechanism then clears the data region to which the fingerprint was previously mapped, identification of subsequent high-frequency redundant data blocks is not seriously affected. Applying dynamic tracking to every matched sampled characteristic fingerprint in the database thus preserves the invention's ability to track and continuously identify high-frequency redundant data blocks.
In Fig. 6, the buffer enters the aging stage and is refreshed after it has stored m packets. The sampled characteristic fingerprint $H_{s3}$ in the database is initially mapped to packet payload 1 in the buffer and, through successive iterations, is finally mapped to packet m+1. Although the FIFO aging mechanism refreshes the buffer region holding packet payloads 1, 2 and 3, this does not affect the method's identification, in packet m+1, of the data block mapped to $H_{s3}$.
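The scenario of Fig. 6 can be replayed with a small sketch. The functions `track` and `age_out`, the fingerprint constants and the packet numbering are all hypothetical; the point is only that remapping a fingerprint on every match lets it survive the eviction of the packet it first pointed to.

```python
def track(fingerprint_db: dict, samples: list[tuple[int, int]], packet_id: int) -> list:
    """Match sampled (offset, fingerprint) pairs, then map each one to packet_id.

    Returns (fingerprint, previous_mapping) for every match, so the caller
    can run content matching against the previously mapped payload.
    """
    matched = []
    for offset, fp in samples:
        if fp in fingerprint_db:
            matched.append((fp, fingerprint_db[fp]))
        # Match: remap to the newest packet. Miss: insert a new entry.
        fingerprint_db[fp] = (packet_id, offset)
    return matched


def age_out(fingerprint_db: dict, evicted_ids: set[int]) -> None:
    """Clear fingerprints that are still mapped to evicted packets."""
    for fp in [fp for fp, (pid, _) in fingerprint_db.items() if pid in evicted_ids]:
        del fingerprint_db[fp]


HS1, HS2, HS3 = 0xA1, 0xB2, 0xC3   # illustrative fingerprint values
db: dict[int, tuple[int, int]] = {}
track(db, [(0, HS3), (8, HS1)], packet_id=1)
track(db, [(4, HS2)], packet_id=2)
hits = track(db, [(16, HS3)], packet_id=4)   # HS3 recurs and is remapped
age_out(db, {1, 2, 3})                        # FIFO aging removes packets 1-3
```

After aging, `db` holds only `HS3`, now mapped to packet 4, while `HS1` and `HS2`, which never recurred, are cleared together with their packets.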
To illustrate the beneficial effects of the invention, a comparative experiment was designed that compares the redundant traffic identification capability of the MAXP method, which samples the maximum characteristic fingerprint, with that of the inventive method.
Table 1 lists the sample data for this experiment. Sample sets A, B and C were taken from the uninterrupted bidirectional (inbound and outbound) traffic on the campus network access link between 10 and 11 a.m. on the three days from 13 to 15 November 2013. Sample sets D and E were taken from the uninterrupted bidirectional traffic on the campus network complex building link between 2 and 3 p.m. on 16 and 17 November 2013.
| Sample set | Source | Description | Date | Total volume (GB) |
|---|---|---|---|---|
| A | Campus network access link | Uninterrupted bidirectional traffic | 13 November 2013 | 37.8 |
| B | Campus network access link | Uninterrupted bidirectional traffic | 14 November 2013 | 27.7 |
| C | Campus network access link | Uninterrupted bidirectional traffic | 15 November 2013 | 31.6 |
| D | Campus network complex building link | Uninterrupted bidirectional traffic | 16 November 2013 | 19.5 |
| E | Campus network complex building link | Uninterrupted bidirectional traffic | 17 November 2013 | 13.4 |

Table 1
The experiment produced the comparison results shown in Table 2, which reports the average redundant traffic identification rate of the MAXP method and of the inventive method for the five sample sets.
| Sample set | MAXP method average identification rate | Inventive method average identification rate |
|---|---|---|
| A | 17.5% | 21.6% |
| B | 21.3% | 22.8% |
| C | 23.6% | 20.9% |
| D | 21.2% | 21.6% |
| E | 20.4% | 21.2% |

Table 2
According to Table 2, the average redundant traffic identification rate of the MAXP method over the five sample sets is 20.8%, and that of the inventive method over the same sample sets is 21.6%. The inventive method therefore slightly improves the overall identification capability while retaining that of the existing method, although on sample set C its rate (20.9%) is below that of MAXP (23.6%). In addition, the results in Table 2 show that the average identification rate of the MAXP method varies considerably between sample sets, with a spread of 6.1 percentage points between the highest and lowest values. By contrast, the spread for the inventive method is relatively small, at 1.9 percentage points. Further analysis shows that the variance of the average identification rate is 3.86 for the MAXP method and 0.42 for the inventive method, so the inventive method identifies redundant traffic at a comparatively stable rate.
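The summary figures quoted above can be reproduced from the Table 2 values with a few lines of Python. The use of the population variance (dividing by 5 rather than 4) is inferred from the fact that it reproduces the quoted 3.86 and 0.42; the text does not say which variance was used.

```python
from statistics import mean, pvariance

# Average identification rates (%) for sample sets A-E, copied from Table 2.
maxp = [17.5, 21.3, 23.6, 21.2, 20.4]
proposed = [21.6, 22.8, 20.9, 21.6, 21.2]


def summarize(rates: list[float]) -> dict[str, float]:
    """Mean, max-min spread and population variance of a list of rates."""
    return {
        "mean": round(mean(rates), 1),
        "range": round(max(rates) - min(rates), 1),
        "variance": round(pvariance(rates), 2),
    }


maxp_summary = summarize(maxp)          # mean 20.8, range 6.1, variance 3.86
proposed_summary = summarize(proposed)  # mean 21.6, range 1.9, variance 0.42
```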
Fig. 7 shows the test result of the inventive method on sample set B. The blue curve is the payload rate of the campus network traffic after the header information below the application layer has been removed, and the green curve is the redundant traffic ratio identified by the inventive method. The horizontal axis is time in seconds, and the vertical axis is traffic volume in bytes. The other sample sets gave similar results, which are not shown individually.
The above experiments and analysis show that the inventive method is practical and combines the uniform interval sampling of the MAXP method with the DYNABYTE method's ability to track changes in redundant traffic dynamically. Compared with existing redundant traffic identification methods, the inventive method maintains identification capability while improving its stability across different samples; it adapts to different real network environments and is convenient to deploy. Its identification capability benefits from two things. Uniform sampling ensures that the sampled characteristic fingerprints are representative of, and uniformly cover, the payload data in the buffer. Dynamic tracking compensates for the intermittent invalidation of the fingerprints mapped to high-frequency redundant blocks in the database that the aging mechanism causes when it refreshes the buffer. The inventive method has some reference value for studying the use of network bandwidth and for planning network resources rationally.
Although illustrative embodiments of the invention are described above so that those skilled in the art can understand the invention, it should be clear that the invention is not limited to the scope of those embodiments. To those of ordinary skill in the art, various changes are apparent as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all innovations that make use of the inventive concept are protected.

Claims (2)

1. A network redundant traffic identification method based on uniform sampling, characterized by comprising the following steps:
(1) Uniform sampling of characteristic fingerprints
1.1) For the first received packet payload $t_1, t_2, t_3, \ldots, t_n$, slide a window of size $\Omega$ from the start position in steps of one byte to divide the payload into $n-\Omega+1$ consecutive data blocks of size $\Omega$: $(t_1, t_2, t_3, \ldots, t_\Omega)$, $(t_2, t_3, t_4, \ldots, t_{\Omega+1})$, ..., $(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \ldots, t_n)$, where $n$ is the number of bytes in the packet payload;
1.2) For the $n-\Omega+1$ data blocks, compute the characteristic fingerprint mapped to each data block with the Rabin polynomial. The mapping between data blocks and characteristic fingerprints is, in order:
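$$
\begin{aligned}
H_1 &= RF(t_1, t_2, t_3, \ldots, t_\Omega) = \left(t_1 p^{\Omega-1} + t_2 p^{\Omega-2} + \cdots + t_{\Omega-1} p^{1} + t_\Omega p^{0}\right) \bmod M \\
H_2 &= RF(t_2, t_3, t_4, \ldots, t_{\Omega+1}) = \left(\left(RF(t_1, t_2, t_3, \ldots, t_\Omega) - t_1 p^{\Omega-1}\right) \cdot p + t_{\Omega+1} p^{0}\right) \bmod M \\
&\;\;\vdots \\
H_{n-\Omega+1} &= RF(t_{n-\Omega+1}, t_{n-\Omega+2}, t_{n-\Omega+3}, \ldots, t_n) = \left(\left(RF(t_{n-\Omega}, t_{n-\Omega+1}, t_{n-\Omega+2}, \ldots, t_{n-1}) - t_{n-\Omega} p^{\Omega-1}\right) \cdot p + t_n p^{0}\right) \bmod M
\end{aligned}
\tag{1}
$$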
Wherein, H 1, H 2..., H n-Ω+1for n-Ω+1 data block characteristic of correspondence fingerprint;
First formula (1) calculated data block t is pressed 1, t 2, t 3..., t Ωthe characteristic fingerprint H mapped 1, then according to look-up table T, with byte t ielement value, as searching index, obtains t ip Ω-1value, i=1,2 ..., n-Ω; Last according to formula (1), calculate data block t 2, t 3, t 4..., t Ω+1..., t n-Ω+1, t n-Ω+2, t n-Ω+3..., t ncharacteristic fingerprint H 2..., H n-Ω+1, wherein, look-up table T comprise 0 ~ 255 search index, each output valve corresponding to index of searching searches index and p for this Ω-1product;
1.3), step 1.2) characteristic fingerprint that obtains carries out order arrangement, morphogenesis characters fingerprint sequence H 1, H 2..., H n-Ω+1; By the sliding window of w size, slide from original position, a characteristic fingerprint is stepping, to characteristic fingerprint sequence H 1, H 2..., H n-Ω+1divide, each slip all to choose in sliding window maximum as sampling characteristic fingerprint stored in characteristic fingerprint storehouse, to last sliding window, completes the characteristic fingerprint sampling of input packet;
Different sliding window because of lap choose same sampling characteristic fingerprint time, only stored in the sampling characteristic fingerprint that first time is chosen;
(2) characteristic fingerprint of, sampling dynamically is followed the tracks of
2.1), set up a buffering area, by first packet load of input stored in, and the sampling characteristic fingerprint in characteristic fingerprint storehouse is mapped in first packet load;
2.2), to second packet load received, first stored in buffering area, then sampling characteristic fingerprint is extracted according to the method in step (1), and mate in characteristic fingerprint storehouse one by one, Mobile state of going forward side by side is followed the tracks of: if match sampling characteristic fingerprint, then the sampling characteristic fingerprint matched in characteristic fingerprint storehouse is mapped in second packet load, if do not matched, then by the sampling characteristic fingerprint that extracts stored in characteristic fingerprint storehouse, and be mapped in second packet load;
2.3), to the packet load received subsequently, according to step 2.2) method processes; After in buffering area, packet load is filled with, adopt first in first out (First In First Out, FIFO) aging mechanism flush buffers, to store the follow-up packet load reached, during refreshing, be mapped in the sampling characteristic fingerprint being moved out of packet load in characteristic fingerprint storehouse and be eliminated;
(3) Redundant traffic identification
For each sampling characteristic fingerprint extracted in step (2) that is successfully matched in the characteristic fingerprint database, the maximum content matching method is applied: starting from the data block corresponding to the sampling characteristic fingerprint, the received packet load is matched against the packet load it is mapped to in the buffer, and the number of matched bytes and the redundant data block size are output;
Sum the sizes of all redundant data blocks per unit time to obtain the redundant traffic size, which identifies the redundant traffic.
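The matching in step (3) can be sketched as below, assuming that "maximum content matching" means extending the matched data block byte by byte to the left and right until the two payloads differ. The function name `expand_match` and its parameters are hypothetical, and the initial block comparison is an added guard against fingerprint collisions.

```python
def expand_match(new: bytes, new_off: int, old: bytes, old_off: int, block: int) -> int:
    """Return the size in bytes of the maximal matching region around a matched block."""
    # The block must lie fully inside both payloads.
    if new_off + block > len(new) or old_off + block > len(old):
        return 0
    # Equal fingerprints do not guarantee equal bytes, so verify the block itself.
    if new[new_off:new_off + block] != old[old_off:old_off + block]:
        return 0
    # Extend to the left while the preceding bytes agree.
    left = 0
    while (new_off - left > 0 and old_off - left > 0
           and new[new_off - left - 1] == old[old_off - left - 1]):
        left += 1
    # Extend to the right while the following bytes agree.
    right = 0
    new_end, old_end = new_off + block, old_off + block
    while (new_end + right < len(new) and old_end + right < len(old)
           and new[new_end + right] == old[old_end + right]):
        right += 1
    return left + block + right
```

Summing the returned sizes over all matches observed in a unit of time gives the redundant traffic size described above.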
2. The network redundant traffic identification method according to claim 1, characterized in that, in step 1.2), p is 2 and M is 0x100000000 (4294967296 in decimal), so that the computed characteristic fingerprint values are limited to 32 bits.
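The effect of these parameter values can be illustrated with a rolling fingerprint computation. The formula of step 1.2) is not reproduced in this section, so the sketch assumes a Rabin-style polynomial fingerprint over Ω-byte blocks, H = (t_1·p^(Ω-1) + t_2·p^(Ω-2) + ... + t_Ω) mod M; the function name `block_fingerprints` and the parameter `omega` are hypothetical.

```python
from typing import List

P = 2            # p value from claim 2
M = 0x100000000  # 2**32, keeps every fingerprint within 32 bits


def block_fingerprints(payload: bytes, omega: int) -> List[int]:
    """Return the n - omega + 1 fingerprints of all omega-byte blocks in payload."""
    n = len(payload)
    if omega <= 0 or n < omega:
        return []
    # Fingerprint of the first block, computed by Horner's rule.
    h = 0
    for byte in payload[:omega]:
        h = (h * P + byte) % M
    result = [h]
    top = pow(P, omega - 1, M)  # weight of the byte that leaves the block
    for i in range(omega, n):
        # Roll the block one byte: drop payload[i - omega], append payload[i].
        h = ((h - payload[i - omega] * top) * P + payload[i]) % M
        result.append(h)
    return result
```

An n-byte payload yields n - Ω + 1 values, which matches the length of the sequence H_1, ..., H_{n-Ω+1} used in step 1.3).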
CN201410730071.7A 2014-12-04 2014-12-04 Network redundant traffic identification method based on uniform sampling Active CN104394091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410730071.7A CN104394091B (en) 2014-12-04 2014-12-04 Network redundant traffic identification method based on uniform sampling

Publications (2)

Publication Number Publication Date
CN104394091A true CN104394091A (en) 2015-03-04
CN104394091B CN104394091B (en) 2017-07-18

Family

ID=52611927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410730071.7A Active CN104394091B (en) 2014-12-04 2014-12-04 Network redundant traffic identification method based on uniform sampling

Country Status (1)

Country Link
CN (1) CN104394091B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216791A (en) * 2008-01-04 2008-07-09 华中科技大学 File backup method based on fingerprint
US20080294696A1 (en) * 2007-05-22 2008-11-27 Yuval Frandzel System and method for on-the-fly elimination of redundant data
CN103514250A (en) * 2013-06-20 2014-01-15 易乐天 Method and system for deleting global repeating data and storage device
CN103888317A (en) * 2014-03-31 2014-06-25 西南科技大学 Protocol-independent network redundant flow eliminating method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105357071A (en) * 2015-11-12 2016-02-24 成都科来软件有限公司 Identification method and identification system for network complex traffic
CN105357071B (en) * 2015-11-12 2018-08-31 成都科来软件有限公司 A kind of network complexity method for recognizing flux and identifying system
CN110083743A (en) * 2019-03-28 2019-08-02 哈尔滨工业大学(深圳) A kind of quick set of metadata of similar data detection method based on uniform sampling
CN110031701A (en) * 2019-04-15 2019-07-19 杭州拓深科技有限公司 A kind of electric appliance characteristic detection method based on electric current fingerprint technique

Also Published As

Publication number Publication date
CN104394091B (en) 2017-07-18

Similar Documents

Publication Publication Date Title
CN104199832B (en) Banking network based on comentropy transaction community discovery method extremely
CN103078709B (en) Data redundancy recognition methods
CN103888317B (en) A kind of unrelated network redundancy flow removing method of agreement
CN104598621B (en) A kind of trace compression method based on sliding window
CN104462141B (en) Method, system and the storage engines device of a kind of data storage and inquiry
CN105989129A (en) Real-time data statistic method and device
CN104394091A (en) Uniform sampling based network redundancy traffic identification method
CN105515997B (en) The higher efficiency range matching process of zero scope expansion is realized based on BF_TCAM
CN106850750A (en) A kind of method and apparatus of real time propelling movement information
CN109033141B (en) Space-time trajectory compression method based on trajectory dictionary
CN102110171A (en) Method for inquiring and updating Bloom filter based on tree structure
EP4345634A1 (en) Message matching method and apparatus, storage medium and electronic apparatus
CN107862074A (en) Big data quantity parameter rapid read-write method
CN104951403A (en) Low-overhead and error-free cold and hot data recognition method
CN110532307A (en) A kind of date storage method and querying method flowing sliding window
Minsky et al. Practical set reconciliation
CN104536700B (en) Quick storage/the read method and system of a kind of bit stream data
CN104391910B (en) A kind of taxation statistics form based on HBase stores and the method calculated
CN102722557B (en) Self-adaption identification method for identical data blocks
CN106844541A (en) A kind of on-line analytical processing method and device
CN106998472A (en) The compression method and system of a kind of holding target information
CN102685008A (en) Pipeline-based rapid stream identification method and equipment
CN102184214B (en) Data grouping quick search positioning mode
CN103020182B (en) A kind of data search method based on HASH algorithm
CN103458032B (en) The method and system of a kind of spatial data accessing rule dynamic statistics and Information Compression

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Ma Qiang; Zhang Qi; Xing Ling; Yang Guohai; He Yanling
Inventor before: Xing Ling; He Yanling; Ma Qiang; Yang Guohai

GR01 Patent grant