CN109815234A - A kind of multiple cuckoo filter under streaming computing model - Google Patents
- Publication number
- CN109815234A (application number CN201811635873.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- cuckoo
- cuckoo filter
- sliding window
- filter
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a multiple cuckoo filter under a streaming computing model. The multiple cuckoo filter consists mainly of standard cuckoo filters, one for each data stream, so the number of filters equals the number of streams. Each standard cuckoo filter processes its own data stream, which decomposes representing and querying the multi-dataset of the streams into representing and querying several single datasets. A sliding window is established for each cuckoo filter; all standard cuckoo filters are queried at the same time, and the sliding windows split the data streams along time boundaries to check whether the same specified object is present in different streams simultaneously. The invention retains the advantages of the cuckoo filter and, when processing massive data streams, substantially reduces computation and space usage, lowers the false positive rate, and facilitates accurate data queries. The technical effect is significant.
Description
Technical field
The present invention relates to cuckoo filters in the field of big data computing, and more particularly to a multiple cuckoo filter for indexing massive multidimensional data under a streaming computing model.
Background art
With the rapid development of industries such as the mobile Internet, Web 2.0, and smart devices, the volume of data produced by humans is growing exponentially. Massive data increasingly exhibits the characteristics of big data: huge volume, diverse types, and high velocity. Its multidimensional nature is also becoming more pronounced, and storing massive multidimensional data, analyzing it in real time, and indexing and searching it at scale pose severe challenges to information systems.
Unlike low-dimensional data, multidimensional data lets a system record a large amount of comprehensive information and, through applications, provide richer services to users. However, distributed systems that handle multidimensional data degrade sharply in aspects such as indexing performance, and in particular the memory they occupy grows rapidly as the number of dimensions increases.
Summary of the invention
The main object of the present invention is to provide a multiple cuckoo filter for indexing massive multidimensional data under a streaming computing model, laying the operational foundation for establishing association relationships among multiple data streams.
The technical solution adopted by the present invention is as follows:
The present invention designs a multiple cuckoo filter data structure based on the cuckoo filter. The multiple cuckoo filter consists mainly of standard cuckoo filters equal in number to the data streams: one standard cuckoo filter is set up for each data stream to be processed, and each standard cuckoo filter handles its own stream. Representing and querying the multi-dataset of the streams is thereby decomposed into representing and querying several single datasets, and the false positive rate of the multiple cuckoo filter is reduced by increasing the fingerprint size f stored for each data element in the index.
For a query at any moment, the standard cuckoo filters of all data streams are queried simultaneously to determine whether the same specified object is present in the different streams at the same time; if so, True is returned, otherwise False is returned.
A sliding window is established for each cuckoo filter. Starting from the head of its data stream, each sliding window splits the stream along time boundaries, so that in any time period one segment of source data is obtained per stream, i.e. as many segments as there are streams. The fingerprints in the sliding windows are then compared to check whether a specified element is present in multiple data streams at the same time.
Each sliding window stores the indices of the entries of its cuckoo filter and uses a queue data structure. To check whether several sliding windows contain element x at the same time, the final storage location of the hash-mapping result of x in every standard cuckoo filter is obtained first; then, as the sliding windows keep shifting over time, the comparison checks whether that location is among the position indices stored in the corresponding sliding window.
The standard cuckoo filter of each data stream starts from the head of the stream.
The specified object is a data segment, or the feature value of a data segment after processing.
The number of standard cuckoo filters is generated dynamically according to the number of data streams, or is set in advance.
The main idea of the invention is to design the multiple cuckoo filter on the basis of the cuckoo filter data structure and algorithm, decomposing the representation and querying of a multi-dataset into the representation and querying of several single datasets.
In the present invention, the query algorithm of the multiple cuckoo filter is implemented on the basis of the C++ source code of the cuckoo filter data structure. At any moment, the comparison checks whether the hash-mapping result of the specified object is among the position indices stored in the sliding windows of the respective standard cuckoo filters.
The invention implements and analyzes the false positive rate. The false positive rate of the multiple cuckoo filter depends on the bucket size, the number of cuckoo filters, the total number of set elements, the sliding window size, the step by which the window moves each time, and the fingerprint size.
A fingerprint in the present invention is a digital fingerprint, i.e. a unique feature value of a data segment, such as an MD5 value.
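As an illustration of such a fingerprint, the following minimal Python sketch derives an f-bit value from the MD5 digest of a data segment. The function name `fingerprint`, the default of 16 bits, and the choice to truncate the digest to its low bits are assumptions made for this example; the text only states that a value such as MD5 may serve as the fingerprint.

```python
import hashlib


def fingerprint(segment: bytes, f_bits: int = 16) -> int:
    """Return an f-bit fingerprint of a data segment (hypothetical helper).

    The MD5 digest is interpreted as an integer and reduced to its low
    f_bits bits. Zero is mapped to 1 so that 0 stays free for table
    layouts that use it to mark an empty entry.
    """
    digest = hashlib.md5(segment).digest()
    value = int.from_bytes(digest, "big")
    fp = value & ((1 << f_bits) - 1)
    return fp or 1
```

A larger `f_bits` makes accidental fingerprint collisions between different segments less likely, at the cost of more space per stored entry.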
The proposed multiple cuckoo filter decomposes representing and querying multiple data streams into representing and querying several single data streams. However many data streams are generated, the same number of standard cuckoo filters respectively represent and query the objects in each stream.
The input data streams are not limited to large-capacity datasets; file streams, for example, may also be used.
The program code that builds the multiple cuckoo filter data structure is not limited to C++, and the script that invokes the program is not limited to the Linux shell; a Python script, for example, may be used. The hash function used in the implementation is not limited to MurmurHash; BobHash, SuperFastHash, MD5Hash, or SHA1Hash, for example, may be used.
In a specific implementation, the number of standard cuckoo filters need not be generated dynamically from the number of data streams. Alternatively, k cuckoo filters may be allocated in advance; when insertion into one cuckoo filter fails, the remaining elements of the set continue to be inserted into the next cuckoo filter, and after all elements have been inserted the space occupied by idle cuckoo filters is released.
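The pre-allocation variant can be sketched as follows. This is only an illustration: `BoundedFilter` is a hypothetical stand-in that fails an insert once a fixed capacity is reached, where a real cuckoo filter would fail after too many relocations, and the names `insert_with_spillover`, `k`, and `capacity` are chosen for this example.

```python
class BoundedFilter:
    """Stand-in for a standard cuckoo filter: insert fails when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = set()

    def insert(self, item) -> bool:
        if len(self.items) >= self.capacity:
            return False  # models an insertion failure
        self.items.add(item)
        return True


def insert_with_spillover(elements, k: int, capacity: int):
    """Insert elements into k pre-allocated filters, spilling over on failure."""
    filters = [BoundedFilter(capacity) for _ in range(k)]
    current = 0
    for element in elements:
        # On failure, move on to the next pre-allocated filter.
        while current < k and not filters[current].insert(element):
            current += 1
        if current == k:
            raise OverflowError("all pre-allocated filters are full")
    # Release idle filters: keep only those that actually hold elements.
    return [f for f in filters if f.items]
```

With ten elements, five pre-allocated filters, and a capacity of four, the first two filters fill up, the third holds the remainder, and the last two are released.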
In the present invention, the sliding window stores the indices of the entries of the cuckoo filter; that is, the buckets and entries of the cuckoo filter are numbered starting from 0, which substantially reduces computation and space usage.
The beneficial effects of the present invention are:
The multiple cuckoo filter is designed on the basis of the cuckoo filter and inherits its advantages: support for dynamic insertion and reliable deletion of elements, better query performance, locality of storage locations, and lower space usage under certain conditions.
The invention not only substantially reduces computation and space usage but also greatly lowers the false positive rate, facilitating accurate data queries.
Compared with earlier cuckoo filters, the multiple cuckoo filter supports finding objects that are present in several data streams at the same time and satisfy a specified relationship, and therefore has broader application prospects than existing cuckoo filters.
Brief description of the drawings
Fig. 1 is a schematic diagram of the query logic of the multiple cuckoo filter data structure of the present invention;
Fig. 2 shows the relationship between Checkup query time and sliding window size;
Fig. 3 shows the relationship between Checkup query time and the total number of set elements.
Detailed description of embodiments
The present invention is described in further detail below with reference to the drawings and specific embodiments.
The present invention proposes an adaptive cuckoo filter under a streaming computing model that changes dynamically with the number of data streams. To make the purpose, technical solution, and effects of the invention clearer, the invention is further described below with reference to the drawings and examples.
1. Data structure
As shown in Fig. 1, the multiple cuckoo filter is composed of standard cuckoo filters equal in number to the data streams. Each data stream corresponds to one standard cuckoo filter, so that representing and querying a multi-dataset is decomposed into representing and querying several single datasets. However many data streams are generated, the same number of standard cuckoo filters respectively represent and query the objects in each stream.
Table 1: Symbols used for the multiple cuckoo filter
A massive data-stream environment is simulated, and the multiple cuckoo filter data structure of the present invention is designed as shown in Fig. 1.
Assume there are n data streams, treated as n datasets, each containing at least millions of data objects as elements. That is, n sets with the same number of elements, at least in the millions, are designed, and n corresponding standard cuckoo filters are generated dynamically.
As shown in Fig. 1, each cuckoo filter is windowed as follows, which addresses the technical problem that an unbounded data source never ends. A sliding window is established for each cuckoo filter, starting from the head of its data stream. The sliding windows split the streams along time boundaries, so that n segments of source data (the query data inside the sliding windows) are obtained in every time period. The fingerprints in the sliding windows are then compared to check whether a specified element is present in multiple data streams at the same time. Elements with identical fingerprints are regarded as the same specified element.
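The one-filter-per-stream structure can be sketched in Python as below. This is a simplified illustration rather than the C++ implementation the text refers to: the class names, the MD5-based hashing, the 32-bit default fingerprint, and the list-based buckets are all assumptions of this example, and sliding windows are left out.

```python
import hashlib
import random


class CuckooFilter:
    """Minimal standard cuckoo filter using partial-key cuckoo hashing."""

    def __init__(self, num_buckets=1024, bucket_size=4, f_bits=32,
                 max_kicks=500, seed=0):
        # A power-of-two bucket count makes the XOR-based alternate
        # index its own inverse.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.m, self.b, self.f_bits = num_buckets, bucket_size, f_bits
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.md5(data).digest()[:8], "big")

    def _fingerprint(self, item: bytes) -> int:
        fp = self._hash(b"fp:" + item) & ((1 << self.f_bits) - 1)
        return fp or 1  # keep fingerprints non-zero

    def _alt(self, index: int, fp: int) -> int:
        return (index ^ self._hash(fp.to_bytes(8, "big"))) % self.m

    def insert(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._hash(item) % self.m
        i2 = self._alt(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.b:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.b)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.b:
                self.buckets[i].append(fp)
                return True
        return False  # the table is considered full

    def contains(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._hash(item) % self.m
        return fp in self.buckets[i1] or fp in self.buckets[self._alt(i1, fp)]


class MultipleCuckooFilter:
    """One standard cuckoo filter per data stream."""

    def __init__(self, num_streams: int, **filter_kwargs):
        self.filters = [CuckooFilter(**filter_kwargs) for _ in range(num_streams)]

    def insert(self, stream_id: int, item: bytes) -> bool:
        return self.filters[stream_id].insert(item)

    def contains_in_all(self, item: bytes) -> bool:
        # True only if every stream's filter reports the item.
        return all(f.contains(item) for f in self.filters)


mcf = MultipleCuckooFilter(num_streams=3)
for stream_id in range(3):
    mcf.insert(stream_id, b"shared-object")
mcf.insert(0, b"only-in-stream-0")
```

Querying `mcf.contains_in_all(b"shared-object")` consults all three filters; an object inserted into only one stream is reported as absent unless a fingerprint collision produces a false positive.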
2. Query algorithm
The element query procedure of the multiple cuckoo filter algorithm is as follows.
Input: num and x, the number of data streams and the element to be queried, respectively.
The query proceeds as follows:
Step 1. Data is added to the i-th cuckoo filter. At the same time, the position of the element x to be queried in that filter is recorded and stored in item_index[i]. A sliding window is generated for the filter; its size is generated randomly, and the data at the corresponding positions of the cuckoo filter are inserted into the sliding window. The same operation is then performed on the next filter, until every filter has completed these steps.
Step 2. The query starts.
Case 1: for a single filter, if the current sliding window contains element x, the filter contains x and true is returned;
Case 2: for a single filter, if the current sliding window does not yet contain element x and the window has not yet slid to the last entry in the filter, the sliding window moves down and the query continues;
Case 3: for a single filter, if the current sliding window does not yet contain element x and the window has already reached the last entry in the filter, x is not present in the filter and false is returned;
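The three cases for a single filter can be sketched as follows. The sketch assumes that the entries of the filter are numbered 0 to `num_entries - 1`, that the window is a queue of those entry numbers, and that `item_index` is the recorded position of x (or `None` if x was never stored); the function name and the returned move counter are hypothetical.

```python
from collections import deque


def scan_single_filter(num_entries, item_index, window_size, step):
    """Slide one window over a filter's entry indices.

    Returns (found, moves), where moves counts how often the window was
    moved down before the scan ended.
    """
    window = deque(range(min(window_size, num_entries)))
    moves = 0
    while True:
        if item_index in window:            # case 1: x is in the window
            return True, moves
        if window[-1] >= num_entries - 1:   # case 3: last entry reached
            return False, moves
        for _ in range(step):               # case 2: move the window down
            nxt = window[-1] + 1
            if nxt >= num_entries:
                break
            window.popleft()
            window.append(nxt)
        moves += 1
```

For example, with 100 entries, a window of 10, and a step of 5, an element recorded at position 37 is reached after six moves, when the window covers entries 30 to 39.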
The sliding window stores the indices of the entries of the cuckoo filter; the indexed data are the fingerprints generated by hashing, and the sliding window uses a queue data structure. To check whether several sliding windows contain element x at the same time, the final storage location of the hash-mapping result of x in every standard cuckoo filter is obtained first. Then, as the sliding windows keep shifting over time, the comparison checks whether that location is among the position indices stored in the corresponding sliding window. Timing the successful queries allows performance to be assessed.
If the hash-mapping result is not among the position indices stored in the corresponding sliding windows, the query fails and false is returned. This failure does not mean that element x was not successfully inserted into the cuckoo filters; it means that, as the sliding windows move over time, there is no moment at which the sliding windows of all the cuckoo filters contain the element simultaneously.
If the hash-mapping result is among the position indices stored in the corresponding sliding windows, the query succeeds and true is returned, and the same element x is considered present. A false positive is possible here: the hash fingerprint of another element may collide with that of x and happen to fall within the sliding window. The false positive rate is analyzed below.
3. False positive rate analysis
For a standard cuckoo filter, consider the worst-case query, a lookup of an element that is not in the set. Such a query must examine all 2b entries in two buckets.
For each entry, the probability of matching a stored fingerprint and returning a false hit is at most $1/2^f$. After 2b fingerprint comparisons, the upper bound on the fingerprint false positive rate is:
$$\epsilon_{CF} = 1 - (1 - 1/2^f)^{2b} \approx 2b/2^f$$
Ignoring the dynamic windows, a set query on the multiple cuckoo filter requires every element lookup to examine all the cuckoo filters. The false positive rate of the multiple cuckoo filter is the probability that at least one of the cuckoo filters misjudges x.
The false positive rate of each cuckoo filter is $\epsilon_{CF}$, so the probability that none of the s cuckoo filters misjudges is $(1 - \epsilon_{CF})^s$. The upper bound on the joint false positive rate of s cuckoo filters is:
$$1 - (1 - \epsilon_{CF})^s = 1 - (1 - 1/2^f)^{2bs} \approx 2bs/2^f$$
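To get a feel for these bounds, the short Python sketch below evaluates the exact expressions and the linear approximations side by side. The function names are chosen for this example, and the sample values used in the note that follows (f = 12 bits, b = 4 entries per bucket, s = 3 filters) are illustrative only and are not taken from the experiments.

```python
def eps_cf(f_bits: int, b: int) -> float:
    """Upper bound for one standard cuckoo filter: 1 - (1 - 1/2^f)^(2b)."""
    return 1 - (1 - 2.0 ** -f_bits) ** (2 * b)


def eps_joint(f_bits: int, b: int, s: int) -> float:
    """Upper bound for s independent filters: 1 - (1 - eps_cf)^s."""
    return 1 - (1 - eps_cf(f_bits, b)) ** s


def eps_approx(f_bits: int, b: int, s: int = 1) -> float:
    """Linear approximation 2bs / 2^f used in the text."""
    return 2 * b * s / 2 ** f_bits
```

For f = 12, b = 4, and s = 3 the approximation gives 24/4096, roughly 0.59 percent, and the exact expression is slightly smaller; each additional fingerprint bit halves both values.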
If the dynamic window changes of the cuckoo filters are taken into account, assume a dataset has m elements (data objects) in total, the sliding window size is w, and the window moves k elements at a time; the number of sliding windows generated is then [formula omitted in source].
Across the dynamic window changes of the s cuckoo filters, the total number of comparisons is [formula omitted in source], and the false positive rate of the multiple cuckoo filter is calculated as: [formula omitted in source]
From this relationship, the false positive rate of the multiple cuckoo filter depends on the bucket size, the number of cuckoo filters, the total number of dataset elements, the sliding window size, the step by which the window moves each time, and the fingerprint size. Here, "set" refers to all the data in the dynamic window.
Specifically, increasing the fingerprint size f markedly reduces the false positive rate $\epsilon_{MCF}$; for larger datasets, choosing a larger fingerprint size f lowers the false positive rate of the multiple cuckoo filter.
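As a rough illustration of how the fingerprint size could be chosen, the sketch below inverts the window-free approximation $2bs/2^f$ to find the smallest f that meets a target rate. It deliberately ignores the sliding-window factor, whose formula is not available here, so the result should be read as a lower estimate under that assumption; the function name and parameters are hypothetical.

```python
import math


def min_fingerprint_bits(b: int, s: int, target_rate: float) -> int:
    """Smallest f with 2*b*s / 2**f <= target_rate (windowing ignored)."""
    if not 0 < target_rate < 1:
        raise ValueError("target_rate must be between 0 and 1")
    return max(1, math.ceil(math.log2(2 * b * s / target_rate)))
```

With b = 4, s = 4, and a target of 0.1 percent, this yields 15 bits, since 32/2^15 is just under 0.001 while 32/2^14 is not.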
Thus, in the multiple cuckoo filter of the invention, the standard cuckoo filters are independent of one another, and the false positive rate can be smaller than that of a standard cuckoo filter.
In the specific implementation, the multiple cuckoo filter is realized using the C++ source code of the cuckoo filter data structure.
The experiments are divided into three groups, which analyze the relationship of Checkup query time to the total number of cuckoo filters, the sliding window size, and the total number of set elements, respectively. As shown in Fig. 2, the total number of elements is initially set to 1,000,000, the initial sliding window size is kept between 50,000 and 100,000, and the step by which the sliding window moves each time is set to 2,000.
The cuckoo filters move their sliding windows down simultaneously. If at some moment the specified element is found in all the corresponding sliding windows, True is returned; otherwise False is returned. When one of the sliding windows reaches its maximum position, that window is held fixed while the other windows continue moving, one after another, to the end of their filters.
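The lockstep movement can be sketched as follows. The sketch assumes that all filters have the same number of entries, numbered from 0, that every window advances by the same step on each tick, and that a window which has reached the end stays pinned there; `item_index[i]` is the recorded position of x in filter i (or `None`). All names are hypothetical.

```python
def window_range(tick, window_size, step, num_entries):
    """Half-open entry range [lo, hi) covered by a window after `tick` moves."""
    last_start = max(num_entries - window_size, 0)
    lo = min(tick * step, last_start)  # pinned once the window hits the end
    return lo, min(lo + window_size, num_entries)


def query_all_windows(item_index, window_sizes, step, num_entries):
    """Return (found, tick); found is True if, at one tick, every window
    contains the recorded position of x in its filter."""
    spans = [max(num_entries - w, 0) for w in window_sizes]
    last_tick = max((span + step - 1) // step for span in spans)
    for tick in range(last_tick + 1):
        all_hit = True
        for idx, w in zip(item_index, window_sizes):
            lo, hi = window_range(tick, w, step, num_entries)
            if idx is None or not (lo <= idx < hi):
                all_hit = False
                break
        if all_hit:
            return True, tick
    return False, last_tick
```

With two filters of 100 entries, windows of 10 and 20, and a step of 5, positions 37 and 40 are covered together at tick 6. Positions 5 and 95 with two windows of size 10 are never covered at the same tick, so the query returns False even though each filter individually stores the element.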
The measured timings are given in Tables 2 to 6 below. It was found that once the number of data streams exceeds 4, it is hard to find a moment at which all the dynamic windows contain the fingerprint of a given element simultaneously. The number of cuckoo filters was therefore increased from 1 to 5, and the tests measured the time and the number of queries needed to find a moment at which the sliding windows of all the cuckoo filters contain the specified element.
Tables 2 to 5 show that as the number of cuckoo filters increases from 1 to 4, the number of successful queries drops substantially while the query time grows linearly. As Table 6 shows, with 5 or more cuckoo filters it is hard to find any moment at which all the sliding windows contain the fingerprint of the specified element.
Table 2: 1 cuckoo filter, 50 queries per group
Table 3: 2 cuckoo filters, 100 queries per group
Table 4: 3 cuckoo filters, 100 queries per group
Table 5: 4 cuckoo filters, 100 queries per group
Table 6: 5 cuckoo filters, 100 queries per group
Further experiments examine the relationship between Checkup query time and sliding window size; here the sliding window sizes are not generated randomly. With 2 cuckoo filters, the sliding window of one filter is set to the maximum, and the window size of the other ranges from 20,000 to 160,000 in increments of 20,000. With 3 cuckoo filters, the first sliding window is set to the maximum, and the other two windows are increased in turn in steps of 10,000. As Fig. 2 shows, increasing the number of cuckoo filters leads to longer query times; in general, for the same number of cuckoo filters, the larger the sliding window, the more the query time tends to decrease.
In addition, the total number of dataset elements is gradually increased to test its effect on Checkup query time. As Fig. 3 shows, the larger the dataset, the longer the query takes.
The above implementation shows that the invention retains the advantages of the cuckoo filter and, when processing massive data streams, substantially reduces computation and space usage, lowers the false positive rate, and facilitates accurate data queries. The technical effect is significant.
Claims (7)
1. A multiple cuckoo filter under a streaming computing model, characterized in that: the multiple cuckoo filter consists mainly of standard cuckoo filters equal in number to the data streams; one standard cuckoo filter is set up for each data stream, and each standard cuckoo filter processes its own data stream, so that representing and querying the multi-dataset of the streams is decomposed into representing and querying several single datasets; and the false positive rate of the multiple cuckoo filter is reduced by increasing the fingerprint size f stored for each data element in the index.
2. The multiple cuckoo filter under a streaming computing model according to claim 1, characterized in that: for a query at any moment, the standard cuckoo filters of all data streams are queried simultaneously to determine whether the same specified object is present in the different data streams at the same time; if so, True is returned, otherwise False is returned.
3. The multiple cuckoo filter under a streaming computing model according to claim 2, characterized in that: a sliding window is established for each cuckoo filter; starting from the head of its data stream, each sliding window splits the stream along time boundaries, so that in any time period one segment of source data is obtained per stream, i.e. as many segments as there are streams; and the fingerprints in the sliding windows are compared to check whether a specified element is present in multiple data streams at the same time.
4. The multiple cuckoo filter under a streaming computing model according to claim 2, characterized in that: the sliding window stores the indices of the entries of the cuckoo filter and uses a queue data structure; to check whether several sliding windows contain element x at the same time, the final storage location of the hash-mapping result of x in every standard cuckoo filter is obtained first, and then, as the sliding windows keep shifting over time, the comparison checks whether that location is among the position indices stored in the corresponding sliding window.
5. The multiple cuckoo filter under a streaming computing model according to claim 1, characterized in that: the standard cuckoo filter of each data stream starts from the head of the data stream.
6. The multiple cuckoo filter under a streaming computing model according to claim 1, characterized in that: the specified object is a data segment, or the feature value of a data segment after processing.
7. The multiple cuckoo filter under a streaming computing model according to claim 1, characterized in that: the number of standard cuckoo filters is generated dynamically according to the number of data streams, or is set in advance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811635873.4A CN109815234B (en) | 2018-12-29 | 2018-12-29 | Multiple cuckoo filter under STREAMING computational model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109815234A true CN109815234A (en) | 2019-05-28 |
CN109815234B CN109815234B (en) | 2021-01-08 |
Family
ID=66602770
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811635873.4A Active CN109815234B (en) | 2018-12-29 | 2018-12-29 | Multiple cuckoo filter under STREAMING computational model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109815234B (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111339058A (en) * | 2020-03-24 | 2020-06-26 | 中国人民解放军国防科技大学 | Set synchronization method and device |
CN111478769A (en) * | 2020-03-18 | 2020-07-31 | 西安电子科技大学 | Distributed credible identity authentication method, system, storage medium and terminal |
CN111552692A (en) * | 2020-04-30 | 2020-08-18 | 南方科技大学 | Plus-minus cuckoo filter |
CN111552693A (en) * | 2020-04-30 | 2020-08-18 | 南方科技大学 | Tag cuckoo filter |
CN111858651A (en) * | 2020-09-22 | 2020-10-30 | 中国人民解放军国防科技大学 | Data processing method and data processing device |
CN112149416A (en) * | 2020-09-09 | 2020-12-29 | 南京大学 | Method for detecting hot spot academic research topic in distributed academic data warehouse |
CN112507689A (en) * | 2021-01-20 | 2021-03-16 | 中国地质大学(武汉) | Spatial range-keyword query method under distributed subscription and release mode |
CN112597345A (en) * | 2020-10-30 | 2021-04-02 | 深圳市检验检疫科学研究院 | Laboratory data automatic acquisition and matching method |
CN113535706A (en) * | 2021-08-03 | 2021-10-22 | 重庆赛渝深科技有限公司 | Two-stage cuckoo filter and repeated data deleting method based on two-stage cuckoo filter |
CN114844638A (en) * | 2022-07-03 | 2022-08-02 | 浙江九州量子信息技术股份有限公司 | Big data volume secret key duplication removing method and system based on cuckoo filter |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103116599A (en) * | 2012-11-30 | 2013-05-22 | 浙江工商大学 | Urban mass data flow fast redundancy elimination method based on improved Bloom filter structure |
US20160134503A1 (en) * | 2014-11-07 | 2016-05-12 | Arbor Networks, Inc. | Performance enhancements for finding top traffic patterns |
CN105989061A (en) * | 2015-02-09 | 2016-10-05 | 中国科学院信息工程研究所 | Rapid indexing method for repeated detection of multi-dimensional data under sliding window |
Non-Patent Citations (1)
Title |
---|
BIN FAN ET AL.: "Cuckoo Filter: Practically Better Than Bloom", 《PROCEEDINGS OF THE 10TH ACM INTERNATIONAL ON CONFERENCE ON EMERGING NETWORKING EXPERIMENTS AND TECHNOLOGIES》 * |
Also Published As
Publication number | Publication date |
---|---|
CN109815234B (en) | 2021-01-08 |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CP03 | Change of name, title or address | Address after: 310000 Room 501, building 9, No. 20, kekeyuan Road, Baiyang street, Hangzhou Economic and Technological Development Zone, Zhejiang Province. Patentee after: Hangzhou Zhongke advanced technology development Co.,Ltd. Address before: 310026 Room 501, building 9, 20 kejiyuan Road, Baiyang street, Hangzhou Economic and Technological Development Zone, Zhejiang Province. Patentee before: HANGZHOU ZHONGKE ADVANCED TECHNOLOGY RESEARCH INSTITUTE Co.,Ltd. |