CN113886391B - Data processing method of double-fingerprint storage cuckoo filter based on discrete type - Google Patents


Info

Publication number
CN113886391B
CN113886391B CN202111181649.4A CN202111181649A CN113886391B CN 113886391 B CN113886391 B CN 113886391B CN 202111181649 A CN202111181649 A CN 202111181649A CN 113886391 B CN113886391 B CN 113886391B
Authority
CN
China
Prior art keywords
index table
storage unit
fingerprint
data
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111181649.4A
Other languages
Chinese (zh)
Other versions
CN113886391A (en
Inventor
邓显辉
李斌勇
赵兰
蒋娜
张小辉
宋学江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Tianma Technology Co ltd
Chengdu University of Information Technology
Original Assignee
Chengdu Tianma Technology Co ltd
Chengdu University of Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Tianma Technology Co ltd and Chengdu University of Information Technology
Priority to CN202111181649.4A
Publication of CN113886391A
Application granted
Publication of CN113886391B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a data processing method of a discrete dual-fingerprint storage cuckoo filter, which comprises: establishing and initializing a main index table of the discrete dual-fingerprint storage cuckoo filter; inserting, querying and/or deleting data according to the current instruction type; and judging whether to continue acquiring the current instruction type, and if so, continuing to process data, otherwise ending the data processing of the discrete dual-fingerprint storage cuckoo filter. By combining dynamic change of the storage space with dynamic growth and shrinkage of the stored data, the method lets the storage space of the cuckoo filter expand and contract dynamically and speeds up construction of the data structure; it saves storage space, improves the accuracy of member queries, and reduces the probability of erroneous member deletion; and it effectively avoids cyclic filling of data when a relocation operation occurs, improving the efficiency of the cuckoo filter.

Description

Data processing method of double-fingerprint storage cuckoo filter based on discrete type
Technical Field
The invention relates to the field of computer information representation and information retrieval, and in particular to a data processing method of a discrete dual-fingerprint storage cuckoo filter.
Background
Inserting, querying and deleting set members are three common data processing problems, and handling all three while meeting key requirements such as low storage overhead and fast queries is a considerable challenge. Researchers currently use Bloom filters and their variants, and cuckoo filters and their variants, to address these problems, but Bloom filters and their variants cannot fully solve them. For example, a standard Bloom filter does not support deleting set members, and a counting Bloom filter supports member deletion only at the cost of a sharp increase in space overhead. A cuckoo filter is a filter based on the cuckoo hashing algorithm; it is essentially a cuckoo hash table that stores hash values of the stored items. It overcomes the Bloom filter's lack of support for member deletion while significantly reducing the high storage overhead of Bloom filter variants. However, the standard cuckoo filter cannot adapt to data that changes dynamically at high speed, and it is prone to filling cycles and to erroneous deletions caused by hash collisions when deleting data. Existing cuckoo filter variants likewise cannot adapt to dynamic high-speed data changes, and their storage space cannot be changed once it has been generated. Moreover, the erroneous deletion rate keeps rising as the number of members grows, and existing schemes rarely offer a good solution to the filling cycles that are likely to occur.
Disclosure of Invention
In view of the above deficiencies in the prior art, the data processing method based on the discrete dual-fingerprint storage cuckoo filter provided by the invention solves the problems that the prior art cannot adapt to dynamic high-speed data changes, cannot insert data reliably, and has a high probability of cyclic data filling.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that:
A data processing method based on the discrete dual-fingerprint storage cuckoo filter is provided, comprising the following steps:
S1, establishing and initializing the main index table of the discrete dual-fingerprint storage cuckoo filter;
S2, acquiring the current instruction type; if the instruction type is data insertion, going to step S3; if the instruction type is data query, performing the data query; if the instruction type is data deletion, performing the data deletion;
S3, calculating the fingerprint information of the data member to be inserted, selecting the first k bits of the fingerprint information as the front fingerprint and the last k-1 bits as the rear fingerprint;
S4, judging whether the fingerprint information of the data member to be inserted is already in the dual-fingerprint storage cuckoo filter; if so, ending the data insertion and going to step S13; otherwise, going to step S5;
S5, calculating the positions of the two candidate buckets of the data member to be inserted, and selecting one of them as the candidate bucket to be inserted into;
S6, judging whether the address stored at the main index pointer of the candidate bucket to be inserted into is empty; if so, going to step S7; otherwise, going to step S8;
S7, generating a main index storage unit, setting its relocation identification value to 0, filling the front fingerprint and the rear fingerprint selected in step S3 into the main index storage unit, pointing the main index pointer of the corresponding candidate bucket to the main index storage unit, ending the data insertion and going to step S13;
S8, judging whether the address stored in the main index of the candidate bucket to be inserted into points to a main index storage unit; if so, going to step S9; otherwise, going to step S10;
S9, retaining the main index storage unit, generating a primary index table, setting its No. 0 address position to NULL, and pointing the main index pointer of the corresponding candidate bucket to the primary index table; generating two secondary index storage units for the primary index table, pointing the two corresponding secondary index storage unit address storage positions in the primary index table to the two generated units, and setting the relocation identification values of both units to 0; inserting the front fingerprint of the data member to be inserted and the front fingerprint in the retained main index storage unit, in order, into front fingerprint positions of the primary index table; inserting the rear fingerprint of the data member to be inserted and the rear fingerprint in the retained main index storage unit, in order, into the two generated secondary index storage units; after the insertion, cancelling the retained main index storage unit and decreasing the remaining address count of the primary index table by 2; ending the data insertion and going to step S13;
S10, judging whether the remaining address count of the current index table is 0; if so, going to step S12; otherwise, going to step S11;
S11, generating a secondary index storage unit for the current index table, setting its relocation identification value to 0, pointing the corresponding secondary index storage unit address storage position in the current index table to the generated unit, inserting the front fingerprint of the data member to be inserted into a front fingerprint position of the current index table, inserting the rear fingerprint of the data member to be inserted into the generated secondary index storage unit, decreasing the remaining address count of the index table by 1, ending the data insertion and going to step S13;
S12, judging whether a next-level index table exists; if so, going to step S10; otherwise, performing the relocation operation and going to step S13;
S13, judging whether to continue acquiring the current instruction type; if so, returning to step S2; otherwise, ending the fingerprint filtering of the cuckoo filter.
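The branching in steps S6 to S12 can be pictured with a small Python sketch. It is only an illustration under simplifying assumptions, not the patented implementation: the names `Unit`, `Table`, `insert_into_bucket` and the capacity `N` are hypothetical, a bucket is modelled as `None`, a single `Unit`, or a chain of `Table` objects, and the relocation operation itself is left out (the function only reports that it would be needed).

```python
from dataclasses import dataclass, field
from typing import Optional, Union

N = 3  # assumed number of usable entries per index table (hypothetical value)

@dataclass
class Unit:
    front: int             # front fingerprint
    rear: int              # rear fingerprint
    relocations: int = 0   # relocation identification value

@dataclass
class Table:
    slots: list = field(default_factory=list)   # holds at most N units
    next: Optional["Table"] = None               # stands in for address No. 0

    @property
    def remaining(self) -> int:
        return N - len(self.slots)

Bucket = Union[None, Unit, Table]

def insert_into_bucket(bucket: Bucket, front: int, rear: int):
    """Return (new_bucket, status), where status is 'stored' or 'relocate'."""
    new = Unit(front, rear)
    if bucket is None:                   # S6 to S7: empty bucket, store one unit
        return new, "stored"
    if isinstance(bucket, Unit):         # S8 to S9: promote the unit to a table
        return Table(slots=[new, bucket]), "stored"
    table = bucket
    while True:
        if table.remaining > 0:          # S10 to S11: room in the current table
            table.slots.append(new)
            return bucket, "stored"
        if table.next is None:           # S12: no next-level table exists
            return bucket, "relocate"
        table = table.next               # S12 back to S10 on the next level
```

With `N = 3`, the first insertion yields a single unit, the second promotes it to a table holding two entries, the third fills the table, and the fourth reports that relocation is required.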
Further: when step S1 is initialized, the whole cuckoo filter system has only one index table, whose address positions are all empty, namely the main index table; the minimum storage unit pointed to by an address of the main index table is a main index storage unit, which consists of a front fingerprint storage position, a rear fingerprint storage position and a relocation identification value; the two fingerprint storage positions occupy the same amount of space, and the relocation identification value occupies 1 bit and has three states: 0, 1 and n; where 0 indicates a newly inserted fingerprint, 1 indicates that one relocation operation has been performed on the storage unit, and n indicates that n relocation operations have been performed on the storage unit.
Further: the primary index table consists of a remaining address count storage position, a No. 0 address position and secondary index table storage positions; the remaining address count storage position stores the number of unused secondary index table storage positions of the index table, is initialized to n, and does not count the No. 0 address position; each secondary index table storage position comprises a front fingerprint position and a secondary index storage unit address storage position, and the latter points to the secondary index storage unit generated for it; the secondary index table storage positions are filled in reverse order starting from position n, and once position 1 has been used the remaining address count of the index table is 0, that is, the index table is fully loaded; when the index table is fully loaded, a next-level index table, whose structure is the same as that of the primary index table, is inserted; the No. 0 address position is initialized to NULL and, when a next-level index table exists, stores the address of the next-level index table.
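As a rough illustration of this layout, the following Python sketch models one index table with n usable entries plus address No. 0. The class name `IndexTable` and its methods are hypothetical, and the sketch deliberately ignores deletions, so the next free entry can be derived directly from the remaining address count.

```python
class IndexTable:
    """Sketch of one index table: n usable entries plus address No. 0."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.remaining = n                  # remaining address count, starts at n
        # slots[0] plays the role of address No. 0 (link to the next level);
        # slots[1..n] hold (front_fingerprint, storage_unit) pairs.
        self.slots: list = [None] * (n + 1)

    def is_full(self) -> bool:
        return self.remaining == 0

    def add(self, front: int, unit: object) -> int:
        """Store an entry in the highest free position and return its index."""
        if self.is_full():
            raise OverflowError("index table is fully loaded")
        index = self.remaining              # fills n, n-1, ..., 1 in that order
        self.slots[index] = (front, unit)
        self.remaining -= 1
        return index

    def link_next(self) -> "IndexTable":
        """Create a next-level table of the same shape and record it at No. 0."""
        nxt = IndexTable(self.n)
        self.slots[0] = nxt
        return nxt
```

Filling from position n downward means the remaining address count doubles as the index of the next free entry, which is one plausible reason for the reverse order; the source does not state its motivation.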
Further: in step S3, k is less than or equal to 1/2 of the total number of bits of the fingerprint information calculated by the hash function for the data member $\xi_X$ to be inserted.
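A minimal Python sketch of the fingerprint split in step S3 follows. Several details are assumptions made for illustration only: SHA-256 stands in for the unspecified hash function, "first" bits are read as the most significant bits, and k is required to be greater than 1 so that the rear fingerprint is not empty. The function names are hypothetical.

```python
import hashlib

def fingerprint(item: bytes, total_bits: int = 16) -> int:
    """Hypothetical fingerprint: the low total_bits of a SHA-256 digest."""
    digest = hashlib.sha256(item).digest()
    return int.from_bytes(digest, "big") & ((1 << total_bits) - 1)

def split_fingerprint(fp: int, total_bits: int, k: int) -> tuple[int, int]:
    """Return (front, rear): the first k bits and the last k-1 bits of fp."""
    if not 1 < k <= total_bits // 2:
        raise ValueError("k must satisfy 1 < k <= total_bits / 2")
    front = fp >> (total_bits - k)        # most significant k bits
    rear = fp & ((1 << (k - 1)) - 1)      # least significant k-1 bits
    return front, rear
```

For a 16-bit fingerprint and k = 8, the front fingerprint is the top 8 bits and the rear fingerprint is the bottom 7 bits.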
Further: in step S4, the two candidate bucket positions $\chi$ and $\gamma$ of the data member $\xi_X$ to be inserted are $\chi = h_1(\xi_X)$ and $\gamma = \chi \oplus h_1(f_X)$ respectively, where $\oplus$ is the exclusive-or operation; $f_X$ is the fingerprint information of the data member to be inserted; $h_1(\cdot)$ is a hash function; a data member has two candidate buckets, and its data is stored in only one of them.
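The candidate bucket computation can be sketched in Python as below. The sketch assumes the usual cuckoo filter form, in which the second bucket is the exclusive-or of the first bucket and a hash of the fingerprint; the hash `h1` (truncated SHA-256), the table size `NUM_BUCKETS` and the function names are hypothetical choices, not taken from the source.

```python
import hashlib

NUM_BUCKETS = 1 << 10  # assumed power of two so that XOR stays in range

def h1(data: bytes) -> int:
    """Hypothetical stand-in for the hash function h_1."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % NUM_BUCKETS

def candidate_buckets(item: bytes, fp: int) -> tuple[int, int]:
    """Return (chi, gamma) for an item with fingerprint fp."""
    chi = h1(item)
    gamma = chi ^ h1(fp.to_bytes(4, "big"))
    return chi, gamma

def other_bucket(bucket: int, fp: int) -> int:
    """Given one candidate bucket and the fingerprint, return the other one."""
    return bucket ^ h1(fp.to_bytes(4, "big"))
```

Because exclusive-or is its own inverse, either bucket can be recovered from the other using only the stored fingerprint, which is what a relocation step needs when the original item is no longer available.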
Further, the specific method of the relocation operation in step S12 includes the following sub-steps:
S12-1, generating an intermediate variable temp in the format of a main index storage unit, randomly selecting an existing data member in the original primary index table and filling all of its fingerprint information into temp; after the filling, emptying the secondary index table storage position of that data member and cancelling its original secondary index storage unit;
S12-2, generating a new secondary index storage unit in the primary index table, setting its relocation identification value to 0, and pointing the emptied secondary index storage unit address storage position in the primary index table to the new unit; inserting the front fingerprint of the data member $\xi_X$ to be inserted into the emptied front fingerprint position in the primary index table, and inserting the rear fingerprint of $\xi_X$ into the new secondary index storage unit;
S12-3, judging whether the relocation identification value of the intermediate variable temp is 0; if so, going to step S12-4; otherwise, going to step S12-11;
S12-4, calculating the position of the other candidate bucket of the intermediate variable temp, and judging whether the address stored at the main index pointer of that candidate bucket is empty; if so, going to step S12-5; otherwise, going to step S12-6;
S12-5, setting the relocation identification value of the intermediate variable temp to 1, pointing the main index pointer of that candidate bucket to temp, ending the relocation operation and going to step S13;
S12-6, judging whether the main index pointer of the other candidate bucket of the intermediate variable temp points to a main index storage unit; if so, going to step S12-7; otherwise, going to step S12-9;
S12-7, retaining the main index storage unit, generating a primary index table, setting its No. 0 address position to NULL, and pointing the main index pointer of the corresponding candidate bucket to the primary index table;
S12-8, generating two secondary index storage units for the primary index table, setting the relocation identification value of one to 0 and of the other to 1; inserting the front fingerprint of the intermediate variable temp and the front fingerprint of the retained main index storage unit, in order, into front fingerprint positions of secondary index table storage positions of the primary index table; pointing the secondary index storage unit address storage position that accompanies the front fingerprint of temp to the unit whose relocation identification value is 1, and pointing the one that accompanies the front fingerprint of the retained main index storage unit to the unit whose relocation identification value is 0; filling the rear fingerprint of temp into the unit whose relocation identification value is 1, and the rear fingerprint of the retained main index storage unit into the unit whose relocation identification value is 0; after the insertion, cancelling temp and the retained main index storage unit, decreasing the remaining address count of the primary index table by 2, ending the relocation operation and going to step S13;
S12-9, judging whether the remaining address count of the index table currently pointed to by the main index pointer of the other candidate bucket of the intermediate variable temp is 0; if so, going to step S12-1; otherwise, going to step S12-10;
S12-10, generating a secondary index storage unit for the current index table and setting its relocation identification value to 1; inserting the front fingerprint of the intermediate variable temp into the front fingerprint position of a secondary index table storage position of the current index table, pointing the secondary index storage unit address storage position of that storage position to the generated unit, and inserting the rear fingerprint of temp into the generated unit; decreasing the remaining address count of the current index table by 1, cancelling temp, ending the relocation operation and going to step S13;
S12-11, finding the current last-level index table of the current candidate bucket of the intermediate variable temp, and judging whether the remaining address count of that index table is 0; if so, going to step S12-12; otherwise, going to step S12-14;
S12-12, generating a next-level index table, setting its No. 0 address position to NULL, and pointing the No. 0 address position of the current last-level index table to the next-level index table;
S12-13, generating a secondary index storage unit for the next-level index table, filling the rear fingerprint of the intermediate variable temp and its relocation identification value increased by 1 into the unit, inserting the front fingerprint of temp into the front fingerprint position of a secondary index table storage position, and pointing the secondary index storage unit address storage position of that storage position to the generated unit; after the insertion, cancelling temp, decreasing the remaining address count of the next-level index table by 1, ending the relocation operation and going to step S13;
S12-14, generating a secondary index storage unit for the current last-level index table, filling the rear fingerprint of the intermediate variable temp and its relocation identification value increased by 1 into the unit, inserting the front fingerprint of temp into the front fingerprint position of a secondary index table storage position, and pointing the secondary index storage unit address storage position of that storage position to the generated unit; after the insertion, cancelling temp, decreasing the remaining address count of the current last-level index table by 1, ending the relocation operation and going to step S13.
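The role of the relocation identification value in sub-steps S12-3 to S12-14 can be sketched in Python as follows. This is a simplified model under stated assumptions rather than the method itself: each bucket is a list of tables, each table a list of at most `N` units, the single main index storage unit case is folded into a one-table chain, and only the first table of the other candidate bucket is checked for room. The names `Unit`, `place_victim` and `N` are hypothetical.

```python
from dataclasses import dataclass

N = 2  # assumed usable entries per index table (hypothetical value)

@dataclass
class Unit:
    front: int
    rear: int
    relocations: int = 0  # relocation identification value

def place_victim(victim: Unit, own_chain: list, alt_chain: list) -> bool:
    """Try to re-home an evicted unit.

    Returns False when the other candidate bucket has no room, meaning a
    further relocation round would be needed there (S12-9 back to S12-1).
    """
    if victim.relocations == 0:
        # S12-4 to S12-10: first relocation, try the other candidate bucket.
        if not alt_chain:
            alt_chain.append([])
        if len(alt_chain[0]) >= N:
            return False
        victim.relocations = 1
        alt_chain[0].append(victim)
        return True
    # S12-11 to S12-14: a unit that was already relocated is not bounced
    # back; it is appended at the end of its current bucket's chain.
    if not own_chain or len(own_chain[-1]) >= N:
        own_chain.append([])          # S12-12: create a next-level table
    victim.relocations += 1
    own_chain[-1].append(victim)
    return True
```

The design point this illustrates is that a unit moves between its two candidate buckets at most once; after that its bucket grows by another table instead, which is how the scheme avoids evicting the same fingerprints back and forth indefinitely.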
Further, the specific method for querying data in step S2 includes the following steps:
S2-1-1, calculating the fingerprint information $f_Y$ of the data member $\xi_Y$ to be queried, and obtaining the front fingerprint and the rear fingerprint of the member;
S2-1-2, calculating the two candidate bucket positions $\chi'$ and $\gamma'$ of the data member $\xi_Y$ to be queried;
S2-1-3, judging whether the front fingerprint of the member exists in candidate bucket $\chi'$ or $\gamma'$; if so, going to step S2-1-4; otherwise, ending the data query and going to step S13;
S2-1-4, judging whether the rear fingerprint stored with that front fingerprint in the candidate bucket matches the rear fingerprint of the member; if so, judging that the query succeeds, ending the data query and going to step S13; otherwise, judging that the query fails, ending the data query and going to step S13.
Further: in step S2-1-1, the two candidate bucket positions $\chi'$ and $\gamma'$ of the data member $\xi_Y$ to be queried are $\chi' = h_1(\xi_Y)$ and $\gamma' = \chi' \oplus h_1(f_Y)$ respectively, where $\oplus$ is the exclusive-or operation; $h_1(\cdot)$ is a hash function.
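The two-stage comparison of the query procedure can be sketched in Python as below. The flat dictionary of fingerprint pairs is an assumed stand-in for the index table structure, and the function name `query` is hypothetical.

```python
def query(buckets: dict, candidates: tuple, front: int, rear: int) -> bool:
    """Look up a member by its front and rear fingerprints.

    buckets maps a bucket index to a list of (front, rear) pairs;
    candidates holds the two candidate bucket indices of the member.
    """
    for index in candidates:
        for stored_front, stored_rear in buckets.get(index, []):
            if stored_front != front:
                continue              # S2-1-3: front fingerprint not matched
            if stored_rear == rear:
                return True           # S2-1-4: rear fingerprint also matches
    return False
```

A member whose front fingerprint collides with a stored entry is still rejected when the rear fingerprints differ, which is where the second fingerprint lowers the false positive rate compared with a single short fingerprint.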
Further, the specific method for deleting data in step S2 includes the following steps:
S2-2-1, calculating the fingerprint information of the member $\xi_{del}$ to be deleted, and obtaining the front fingerprint and the rear fingerprint of the member;
S2-2-2, calculating the two candidate bucket positions $\chi_{del}$ and $\gamma_{del}$ of the member $\xi_{del}$ to be deleted;
S2-2-3, judging whether the member $\xi_{del}$ to be deleted exists in candidate bucket $\chi_{del}$ or $\gamma_{del}$; if so, going to step S2-2-4; otherwise, ending the data deletion and going to step S13;
S2-2-4, judging whether the storage unit holding the member $\xi_{del}$ to be deleted is a main index storage unit; if so, going to step S2-2-5; otherwise, going to step S2-2-6;
S2-2-5, cancelling the main index storage unit and setting the address position of the candidate bucket in the main index table to NULL; ending the data deletion and going to step S13;
S2-2-6, cancelling the secondary index storage unit, and setting the values of the corresponding secondary index table storage position in the current index table to NULL;
S2-2-7, judging whether the remaining address count of the current index table is n-1, that is, whether the table is empty; if so, going to step S2-2-8; otherwise, going to step S2-2-9;
S2-2-8, setting the No. 0 address position of the upper-level index table of this table to NULL, cancelling the current index table, ending the data deletion and going to step S13;
S2-2-9, finding the last-level index table, migrating fingerprint data from the last-level index table to the current index table, and increasing the remaining address count of the last-level index table by 1;
S2-2-10, taking the last-level index table as the current index table, and judging whether its remaining address count is n, that is, whether the table is empty; if so, going to step S2-2-8; otherwise, ending the data deletion and going to step S13.
Further: in step S2-2-1, the two candidate bucket positions $\chi_{del}$ and $\gamma_{del}$ of the member $\xi_{del}$ to be deleted are $\chi_{del} = h_1(\xi_{del})$ and $\gamma_{del} = \chi_{del} \oplus h_1(f_{del})$ respectively, where $\oplus$ is the exclusive-or operation, $f_{del}$ is the fingerprint information of the member to be deleted, and $h_1(\cdot)$ is a hash function.
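The compaction performed in deletion steps S2-2-6 to S2-2-10 can be sketched in Python as follows. The model is an assumption for illustration: a bucket's index tables are a list of lists ordered from the first level to the last level, and an emptied non-last table is backfilled rather than unlinked. The function name `delete_entry` is hypothetical.

```python
def delete_entry(chain: list, table_idx: int, entry) -> None:
    """Remove entry from chain[table_idx] and keep the chain compact.

    chain is a list of tables (plain lists), first level first.
    """
    chain[table_idx].remove(entry)                # S2-2-6: cancel the entry
    last = len(chain) - 1
    if table_idx != last and chain[last]:
        # S2-2-9: backfill the hole with an entry from the last-level table.
        chain[table_idx].append(chain[last].pop())
    if not chain[last]:
        # S2-2-8 / S2-2-10: an empty last-level table is cancelled.
        chain.pop()
```

Backfilling from the last-level table keeps every table except the last one full, so storage shrinks as members are deleted instead of leaving partially empty tables in the middle of a chain.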
The beneficial effects of the invention are as follows: by combining dynamic change of the storage space with dynamic growth and shrinkage of the stored data, and by exploiting the properties of index tables and pointers, the storage space of the cuckoo filter can expand and contract dynamically and adapt to various dynamically changing data streams, which further speeds up construction of the data structure; the member insertion, member query and member deletion methods of this scheme effectively address the waste of large amounts of storage space in existing schemes, dual-fingerprint storage improves the accuracy of member queries, and the probability of erroneous member deletion is effectively reduced; while providing high-precision member queries and reliable member deletion, the scheme uses data compaction to achieve efficient use of the storage space and avoid wasted space; and adding a relocation identification value to each storage unit effectively avoids cyclic filling of data during relocation operations and improves the efficiency of the cuckoo filter.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of the index table of the present invention;
FIG. 3 is a diagram of the minimum storage unit of the main index table of the present invention;
FIG. 4 is a diagram of the minimum storage unit of the secondary index table of the present invention.
Detailed Description
The following description of specific embodiments is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments; to those skilled in the art, various changes are apparent as long as they fall within the spirit and scope of the invention as defined in the appended claims, and all inventions and creations that make use of the inventive concept are protected.
As shown in fig. 1, the data processing method based on the discrete double-fingerprint storage cuckoo filter includes the following steps:
s1, establishing a main index table for storing a cuckoo filter based on discrete double fingerprints and initializing the table;
s2, acquiring a current instruction type, and if the instruction type is data insertion, entering a step S3; if the instruction type is data query, performing data query; if the instruction type is data deletion, deleting the data;
s3, calculating fingerprint information of a data member to be inserted, selecting the front k position of the fingerprint information as a front fingerprint, and selecting the rear k-1 position of the fingerprint information as a rear fingerprint;
s4, judging whether the fingerprint information of the member to be inserted into the data is in the double-fingerprint storage cuckoo filter or not, if so, ending the data insertion and entering the step S13; otherwise, entering step S5;
s5, calculating the positions of two candidate buckets of a data member to be inserted, and selecting one of the candidate buckets as a candidate bucket to be inserted;
s6, judging whether an address stored in a main index pointed by a main index pointer of the candidate bucket to be inserted corresponding to the data member to be inserted is empty or not, and if yes, entering the step S7; otherwise, sending to step S8;
s7, generating a main index storage unit, setting the relocation identification value to be 0, filling the preposed fingerprints and the postposition fingerprints selected in the step S3 into the main index storage unit, pointing the main index pointer of the corresponding candidate bucket to the main index storage unit, finishing the data insertion and entering the step S13;
s8, judging whether the address stored in the main index of the candidate bucket to be inserted corresponding to the data member to be inserted is a main index storage unit, if so, entering the step S9, otherwise, entering the step S10;
S9, retaining the main index storage unit, generating a primary index table, setting its No. 0 address bit to NULL, and pointing the main index pointer of the corresponding candidate bucket to the primary index table; generating two secondary index storage units for the primary index table, pointing the two secondary index storage unit address storage bits used for the data in the primary index table to the two generated units, and setting the relocation identification values of both units to 0; inserting, in order, the pre-fingerprint of the data member to be inserted and the pre-fingerprint held in the retained main index storage unit into the pre-fingerprint bits of the primary index table, and inserting, in order, the post-fingerprint of the data member to be inserted and the post-fingerprint held in the retained main index storage unit into the two generated secondary index storage units; after the insertion is finished, releasing the retained main index storage unit and decreasing the remaining address count of the primary index table by 2; ending the data insertion and entering step S13;
S10, judging whether the remaining address count of the current index table is 0; if so, entering step S12; otherwise, entering step S11;
S11, generating a secondary index storage unit for the current index table, setting its relocation identification value to 0, pointing the secondary index storage unit address storage bit used for the data in the current index table to the generated unit, inserting the pre-fingerprint of the data member to be inserted into the pre-fingerprint bit of the current index table, inserting the post-fingerprint of the data member to be inserted into the generated secondary index storage unit, decreasing the remaining address count of the index table by 1, ending the data insertion and entering step S13;
S12, judging whether a next-level index table exists; if so, entering step S10; otherwise, performing the relocation operation and entering step S13;
S13, judging whether to continue acquiring the current instruction type; if so, returning to step S2; otherwise, ending the fingerprint filtering of the cuckoo filter.
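Steps S10 to S12 can be read as a walk along the chain of index tables that hangs off a candidate bucket. The following is a minimal sketch under stated assumptions, not the patented implementation: the class `IndexTable`, the field `next_table` (standing in for the No. 0 address bit) and the function `insert_into_chain` are hypothetical names, and each entry is simplified to a `(pre_fingerprint, post_fingerprint, relocation_flag)` tuple instead of separate secondary index storage units.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IndexTable:
    """Hypothetical index table holding up to `capacity` entries."""
    capacity: int
    slots: list = field(default_factory=list)   # (pre_fp, post_fp, relocation_flag)
    next_table: Optional["IndexTable"] = None   # stands in for address bit No. 0

    @property
    def remaining(self) -> int:
        # Remaining address count of this table.
        return self.capacity - len(self.slots)


def insert_into_chain(table: IndexTable, pre_fp: int, post_fp: int) -> bool:
    """Walk the chain as in S10-S12; return False when relocation is required."""
    current = table
    while True:
        if current.remaining > 0:                 # S10 -> S11: room in this table
            current.slots.append((pre_fp, post_fp, 0))
            return True
        if current.next_table is None:            # S12: no next-level index table
            return False                          # caller performs the relocation
        current = current.next_table              # S12 -> S10 on the next level
```

A caller would treat a `False` return value as the signal to start the relocation operation of step S12.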
At initialization in step S1, the whole cuckoo filter system contains only one index table, the main index table, whose address bits are all empty. The smallest storage unit pointed to by an address of the main index table is the main index storage unit, which consists of a pre-fingerprint storage bit, a post-fingerprint storage bit and a relocation identification value. The two fingerprint storage bits occupy the same amount of space; the relocation identification value occupies 1 bit and has three states, 0, 1 and n, where 0 indicates a newly inserted fingerprint, 1 indicates that one relocation operation has occurred for the storage unit, and n indicates that n relocation operations have occurred for the storage unit.
The primary index table consists of a remaining address count storage bit, a No. 0 address bit and secondary index table storage bits. The remaining address count storage bit stores the number of unused secondary index table storage bits of the index table; it is initialized to n and does not count the No. 0 address bit. Each secondary index table storage bit comprises a pre-fingerprint bit and a secondary index storage unit address storage bit, the latter pointing to the secondary index storage unit generated for that entry. Entries are stored in reverse order starting from bit n; once bit 1 has been used, the remaining address count is 0 and the index table is fully loaded. When the index table is fully loaded, a next-level index table is inserted, whose structure is the same as that of the primary index table. The No. 0 address bit is initialized to NULL; when a next-level index table exists, this bit stores the address of the next-level index table.
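The layout described above can be illustrated with a small sketch. It is only one possible reading of the text: the names `SecondaryUnit`, `PrimaryIndexTable`, `entries` and `store` are hypothetical, and Python object references stand in for the address storage bits.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SecondaryUnit:
    """Secondary index storage unit: post-fingerprint plus relocation value."""
    post_fp: int
    relocation_flag: int = 0


@dataclass
class PrimaryIndexTable:
    n: int                                     # number of usable entries
    remaining: int = field(init=False)         # remaining address count
    entries: List[object] = field(init=False)  # index 0 = address bit No. 0

    def __post_init__(self) -> None:
        self.remaining = self.n
        # entries[0] is NULL (None) until a next-level table exists;
        # entries[1..n] hold (pre_fp, SecondaryUnit) pairs.
        self.entries = [None] * (self.n + 1)

    def is_full(self) -> bool:
        return self.remaining == 0

    def store(self, pre_fp: int, post_fp: int, relocation_flag: int = 0) -> int:
        """Store one entry in reverse order (slot n first, slot 1 last)."""
        if self.is_full():
            raise OverflowError("index table is fully loaded")
        slot = self.remaining
        self.entries[slot] = (pre_fp, SecondaryUnit(post_fp, relocation_flag))
        self.remaining -= 1
        return slot
```

With `n = 3`, successive calls to `store` fill slots 3, 2 and 1, after which `is_full()` holds and a next-level table would be linked through `entries[0]`.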
In step S3, k is less than or equal to 1/2 of the total number of bits of the fingerprint information that the hash function computes for the data member $\xi_X$ to be inserted.
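As an illustration of this constraint, the sketch below splits an integer fingerprint into a pre-fingerprint (the first k bits) and a post-fingerprint (the last bits). The function name `split_fingerprint` and the parameter `post_bits` are assumptions introduced here; the width of the post-fingerprint is left as an explicit parameter because it is a design choice of the scheme rather than something this sketch can fix.

```python
def split_fingerprint(fp: int, total_bits: int, k: int, post_bits: int):
    """Return (pre_fingerprint, post_fingerprint) of a total_bits-wide value."""
    if not 0 < k <= total_bits // 2:
        raise ValueError("k must satisfy 0 < k <= total_bits / 2")
    if not 0 < post_bits <= total_bits - k:
        raise ValueError("post_bits must not overlap the pre-fingerprint")
    pre = (fp >> (total_bits - k)) & ((1 << k) - 1)   # first k bits
    post = fp & ((1 << post_bits) - 1)                # last post_bits bits
    return pre, post
```

For an 8-bit fingerprint `0b10110110` with `k = 4` and `post_bits = 4`, the pre-fingerprint is `0b1011` and the post-fingerprint is `0b0110`.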
In step S4, the two candidate bucket positions $\chi$ and $\gamma$ of the data member $\xi_X$ to be inserted are $\chi = h_1(\xi_X)$ and $\gamma = \chi \oplus h_1(f_X)$, where $\oplus$ is the exclusive-or operation, $f_X$ is the fingerprint information of the data member to be inserted, and $h_1(\cdot)$ is a hash function. A data member has two candidate buckets, but its data information is stored in only one of them.
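The sketch below assumes that the second candidate bucket is obtained by XOR-ing the first bucket index with a hash of the fingerprint, as in standard partial-key cuckoo hashing. The hash `h1` is implemented with SHA-256 purely for illustration, and the requirement that the number of buckets be a power of two is an assumption that keeps the XOR result inside the table.

```python
import hashlib


def h1(data: bytes, num_buckets: int) -> int:
    """Illustrative hash function mapping bytes to a bucket index."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def alternate_bucket(bucket: int, fingerprint: int, num_buckets: int) -> int:
    """Other candidate bucket, derived from one bucket and the fingerprint."""
    return bucket ^ h1(fingerprint.to_bytes(8, "big"), num_buckets)


def candidate_buckets(member: bytes, fingerprint: int, num_buckets: int):
    """Return the two candidate bucket positions (chi, gamma)."""
    if num_buckets <= 0 or num_buckets & (num_buckets - 1):
        raise ValueError("num_buckets must be a power of two")
    chi = h1(member, num_buckets)
    gamma = alternate_bucket(chi, fingerprint, num_buckets)
    return chi, gamma
```

Because XOR is its own inverse, applying `alternate_bucket` to either candidate yields the other one, so a stored entry can be relocated without access to the original data member.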
The specific method of the relocation operation in step S12 includes the following sub-steps:
S12-1, generating an intermediate variable temp in the format of a main index storage unit, randomly selecting an existing data member in the original primary index table and filling all of its fingerprint information into temp; after the filling is finished, clearing the secondary index table storage bit of that data member and releasing its original secondary index storage unit;
S12-2, generating a new secondary index storage unit for the primary index table, setting its relocation identification value to 0, and pointing the secondary index storage unit address storage bit of the vacated position in the primary index table to the new unit; inserting the pre-fingerprint of the data member $\xi_X$ to be inserted into the vacated pre-fingerprint bit of the primary index table, and inserting the post-fingerprint of $\xi_X$ into the new secondary index storage unit;
S12-3, judging whether the relocation identification value of temp is 0; if so, entering step S12-4; otherwise, entering step S12-11;
S12-4, calculating the position of the other candidate bucket of temp, and judging whether the address stored in the main index entry pointed to by the main index pointer of that candidate bucket is empty; if so, entering step S12-5; otherwise, entering step S12-6;
S12-5, setting the relocation identification value of temp to 1, pointing the main index pointer of that candidate bucket to temp, ending the relocation operation and entering step S13;
S12-6, judging whether the main index pointer of the other candidate bucket of temp points to a main index storage unit; if so, entering step S12-7; otherwise, entering step S12-9;
S12-7, retaining that main index storage unit, generating a primary index table, setting its No. 0 address bit to NULL, and pointing the main index pointer of the corresponding candidate bucket to the primary index table;
S12-8, generating two secondary index storage units for the primary index table, one with its relocation identification value set to 0 and the other with its relocation identification value set to 1; inserting, in order, the pre-fingerprint of temp and the pre-fingerprint of the retained main index storage unit into the pre-fingerprint bits of the secondary index table storage bits of the primary index table; pointing the secondary index storage unit address storage bit associated with the pre-fingerprint of temp to the unit whose relocation identification value is 1, and pointing the address storage bit associated with the pre-fingerprint of the retained main index storage unit to the unit whose relocation identification value is 0; filling the post-fingerprint of temp into the unit whose relocation identification value is 1, and inserting the post-fingerprint of the retained main index storage unit into the unit whose relocation identification value is 0; after the insertion is finished, releasing temp and the retained main index storage unit, decreasing the remaining address count of the primary index table by 2, ending the relocation operation and entering step S13;
S12-9, judging whether the remaining address count of the index table currently pointed to by the main index pointer of the other candidate bucket of temp is 0; if so, entering step S12-1; otherwise, entering step S12-10;
S12-10, generating a secondary index storage unit for the current index table and setting its relocation identification value to 1; inserting the pre-fingerprint of temp into the pre-fingerprint bit of a secondary index table storage bit of the current index table, pointing the corresponding secondary index storage unit address storage bit to the generated unit, and inserting the post-fingerprint of temp into the generated unit; decreasing the remaining address count of the current index table by 1, releasing temp, ending the relocation operation and entering step S13;
S12-11, finding the current last-level index table of the current candidate bucket of temp, and judging whether its remaining address count is 0; if so, entering step S12-12; otherwise, entering step S12-14;
S12-12, generating a next-level index table, setting its No. 0 address bit to NULL, and pointing the No. 0 address bit of the current last-level index table to the next-level index table;
S12-13, generating a secondary index storage unit for the next-level index table, and filling into it the post-fingerprint information of temp together with the relocation identification value of temp increased by 1; inserting the pre-fingerprint information of temp into the pre-fingerprint bit of a secondary index table storage bit, and pointing the corresponding secondary index storage unit address storage bit to the generated unit; after the insertion is finished, releasing temp, decreasing the remaining address count of the next-level index table by 1, ending the relocation operation and entering step S13;
S12-14, generating a secondary index storage unit for the current last-level index table, and filling into it the post-fingerprint information of temp together with the relocation identification value of temp increased by 1; inserting the pre-fingerprint information of temp into the pre-fingerprint bit of a secondary index table storage bit, and pointing the corresponding secondary index storage unit address storage bit to the generated unit; after the insertion is finished, releasing temp, decreasing the remaining address count of the current last-level index table by 1, ending the relocation operation and entering step S13.
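The role of the relocation identification value in sub-steps S12-3 to S12-14 can be sketched as follows. This is a deliberately simplified model, not the full procedure: each bucket is represented as a list of tables in the hypothetical dictionary `chains`, each table is a plain list of `(pre_fp, post_fp, relocation_flag)` tuples, and the case where the other candidate bucket is full is only reported to the caller instead of starting another eviction round.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, int, int]  # (pre_fp, post_fp, relocation_flag)


def place_evicted(temp: Entry, chains: Dict[int, List[List[Entry]]],
                  current: int, alternate: int, capacity: int) -> bool:
    """Place the evicted entry `temp`; return False if another eviction
    round (back to S12-1) would be needed."""
    pre_fp, post_fp, flag = temp
    if flag == 0:
        # S12-4 .. S12-10: a never-relocated entry moves to its other bucket
        # and is marked as relocated once.
        table = chains.setdefault(alternate, [[]])[0]
        if len(table) < capacity:
            table.append((pre_fp, post_fp, 1))
            return True
        return False
    # S12-11 .. S12-14: an entry that was already relocated stays in its
    # current bucket and is appended to the last-level table, which is
    # extended by a next-level table when it is full.
    tables = chains.setdefault(current, [[]])
    if len(tables[-1]) >= capacity:
        tables.append([])
    tables[-1].append((pre_fp, post_fp, flag + 1))
    return True
```

In this model an entry is moved between its two candidate buckets at most once; afterwards it is only appended within one bucket, which is how the relocation identification value prevents an endless cycle of evictions.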
The specific method for querying data in step S2 includes the following steps:
S2-1-1, calculating the fingerprint information $f_Y$ of the data member $\xi_Y$ to be queried, and obtaining the pre-fingerprint and the post-fingerprint of the member;
S2-1-2, calculating the two candidate bucket positions $\chi'$ and $\gamma'$ of the data member $\xi_Y$ to be queried;
S2-1-3, judging whether the pre-fingerprint exists in candidate bucket $\chi'$ or $\gamma'$; if so, entering step S2-1-4; otherwise, ending the data query and entering step S13;
S2-1-4, judging whether the post-fingerprint stored for that pre-fingerprint in the candidate bucket is the same as the post-fingerprint of the member to be queried; if so, judging that the query succeeds, ending the data query and entering step S13; otherwise, judging that the query fails, ending the data query and entering step S13.
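The two-stage comparison of steps S2-1-3 and S2-1-4 can be sketched as follows. The function `query` and the representation of a bucket as a list of `(pre_fp, post_fp)` pairs are assumptions made for illustration; in the scheme itself the post-fingerprint would be reached through the address stored next to the pre-fingerprint.

```python
from typing import Iterable, Tuple


def query(pre_fp: int, post_fp: int,
          bucket_chi: Iterable[Tuple[int, int]],
          bucket_gamma: Iterable[Tuple[int, int]]) -> bool:
    """Return True only if both halves of the fingerprint match one entry."""
    for bucket in (bucket_chi, bucket_gamma):
        for stored_pre, stored_post in bucket:
            if stored_pre != pre_fp:        # S2-1-3: pre-fingerprint check
                continue
            if stored_post == post_fp:      # S2-1-4: post-fingerprint check
                return True
    return False
```

An entry that shares only the pre-fingerprint with the queried member is rejected by the second comparison, which is the effect the double-fingerprint storage relies on to reduce false positives.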
In step S2-1-1, the two candidate bucket positions $\chi'$ and $\gamma'$ of the data member $\xi_Y$ to be queried are $\chi' = h_1(\xi_Y)$ and $\gamma' = \chi' \oplus h_1(f_Y)$, where $\oplus$ is the exclusive-or operation and $h_1(\cdot)$ is a hash function.
The specific method for deleting data in step S2 comprises the following steps:
S2-2-1, calculating the fingerprint information of the member $\xi_{del}$ to be deleted, and obtaining the pre-fingerprint and the post-fingerprint of the member;
S2-2-2, calculating the two candidate bucket positions $\chi_{del}$ and $\gamma_{del}$ of the member $\xi_{del}$ to be deleted;
S2-2-3, judging whether the member $\xi_{del}$ to be deleted exists in candidate bucket $\chi_{del}$ or $\gamma_{del}$; if so, entering step S2-2-4; otherwise, ending the data deletion and entering step S13;
S2-2-4, judging whether the storage unit of the member $\xi_{del}$ to be deleted is a main index storage unit; if so, entering step S2-2-5; otherwise, entering step S2-2-6;
S2-2-5, releasing the main index storage unit and setting the address bit of the candidate bucket in the main index table to NULL; ending the data deletion and entering step S13;
S2-2-6, releasing the secondary index storage unit and setting the values of the corresponding secondary index table storage bit in the current index table to NULL;
S2-2-7, judging whether the remaining address count of the current index table is n-1, that is, whether the table is unloaded; if so, entering step S2-2-8; otherwise, entering step S2-2-9;
S2-2-8, setting the No. 0 address bit of the upper-level index table of this table to NULL, releasing the current index table, ending the data deletion and entering step S13;
S2-2-9, finding the last-level index table, migrating fingerprint data from the last-level index table into the current index table, and increasing the remaining address count of the last-level index table by 1;
S2-2-10, taking the last-level index table as the current index table, and judging whether its remaining address count is n, that is, whether the table is unloaded; if so, entering step S2-2-8; otherwise, ending the data deletion and entering step S13.
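The compaction idea behind steps S2-2-6 to S2-2-10 can be sketched in a simplified model: the chain of index tables of one bucket is a list of lists, the gap left by a deleted entry is filled with an entry taken from the last-level table, and the last-level table is released once it is empty. The function name `delete_and_compact` and the list-based representation are assumptions for illustration only.

```python
from typing import List, Tuple

Entry = Tuple[int, int]  # (pre_fp, post_fp)


def delete_and_compact(tables: List[List[Entry]], table_idx: int,
                       entry: Entry) -> bool:
    """Delete `entry` from tables[table_idx] and keep the chain compact."""
    current = tables[table_idx]
    if entry not in current:
        return False
    current.remove(entry)                 # S2-2-6: release the stored entry
    last = tables[-1]
    if current is not last and last:
        current.append(last.pop())        # S2-2-9: migrate from the last level
    if not last:
        tables.pop()                      # S2-2-8: release an unloaded table
    return True
```

For the chain `[[(1, 1), (2, 2)], [(3, 3)]]`, deleting `(1, 1)` from the first table moves `(3, 3)` forward and releases the second table, leaving `[[(2, 2), (3, 3)]]`.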
In step S2-2-1, the two candidate bucket positions $\chi_{del}$ and $\gamma_{del}$ of the member $\xi_{del}$ to be deleted are $\chi_{del} = h_1(\xi_{del})$ and $\gamma_{del} = \chi_{del} \oplus h_1(f_{del})$, where $\oplus$ is the exclusive-or operation, $f_{del}$ is the fingerprint information of the member to be deleted, and $h_1(\cdot)$ is a hash function.
As shown in Fig. 2, the discrete dual-fingerprint storage cuckoo filter of the present invention includes a main index table and S index tables, wherein an entry of the main index table points to a primary index table or to a main index storage unit, an index table points to secondary index storage units, and when an index table is full it points to the next-level index table. In the figure, $\xi_X$ is a data member, $h_1(x)$ and $h_2(x)$ are the two different candidate bucket computations, and the two fingerprint symbols denote the pre-fingerprint information and the post-fingerprint information of the data member $\xi_X$, respectively; the subscript X is replaced by different lowercase letters to denote different members.
As shown in Fig. 3, the main index storage unit, which is the smallest storage unit of the main index table of the discrete dual-fingerprint storage cuckoo filter of the present invention, includes a pre-fingerprint bit, a post-fingerprint bit and a relocation identification value.
As shown in Fig. 4, the secondary index storage unit, which is the smallest storage unit of the secondary index table of the discrete dual-fingerprint storage cuckoo filter of the present invention, includes a post-fingerprint bit and a relocation identification value.
By combining dynamic conversion of the storage space with dynamic growth and shrinkage of the stored data, the method uses index tables and pointers to expand the storage space of the cuckoo filter dynamically, adapts to a variety of dynamically changing data streams, and speeds up construction of the data structure. The member insertion, member query and member deletion methods of this scheme effectively address the waste of a large amount of storage space in existing schemes; double-fingerprint storage improves the accuracy of member queries and effectively reduces the probability of deleting a member by mistake. While providing high-precision member queries and reliable member deletion, the scheme uses data compaction to achieve efficient utilization of the storage space and avoid wasted space. Adding the relocation identification value to the storage unit effectively avoids cyclic loading of data when relocation operations occur and improves the efficiency of the cuckoo filter.

Claims (9)

1. A data processing method based on a discrete double-fingerprint storage cuckoo filter, characterized by comprising the following steps:
S1, establishing a main index table of the discrete double-fingerprint storage cuckoo filter and initializing the table;
S2, acquiring the current instruction type; if the instruction type is data insertion, entering step S3; if the instruction type is data query, performing data query; if the instruction type is data deletion, performing data deletion;
S3, calculating the fingerprint information of the data member to be inserted, selecting the first k bits of the fingerprint information as the pre-fingerprint, and selecting the last k-1 bits of the fingerprint information as the post-fingerprint;
S4, judging whether the fingerprint information of the data member to be inserted is already in the double-fingerprint storage cuckoo filter; if so, ending the data insertion and entering step S13; otherwise, entering step S5;
S5, calculating the positions of the two candidate buckets of the data member to be inserted, and selecting one of them as the candidate bucket to be inserted into;
S6, judging whether the address stored in the main index entry pointed to by the main index pointer of the candidate bucket to be inserted into is empty; if so, entering step S7; otherwise, entering step S8;
S7, generating a main index storage unit, setting its relocation identification value to 0, filling the pre-fingerprint and the post-fingerprint selected in step S3 into the main index storage unit, pointing the main index pointer of the corresponding candidate bucket to the main index storage unit, ending the data insertion and entering step S13;
S8, judging whether the address stored in the main index entry of the candidate bucket to be inserted into points to a main index storage unit; if so, entering step S9; otherwise, entering step S10;
S9, retaining the main index storage unit, generating a primary index table, setting its No. 0 address bit to NULL, and pointing the main index pointer of the corresponding candidate bucket to the primary index table; generating two secondary index storage units for the primary index table, pointing the two secondary index storage unit address storage bits used for the data in the primary index table to the two generated units, and setting the relocation identification values of both units to 0; inserting, in order, the pre-fingerprint of the data member to be inserted and the pre-fingerprint held in the retained main index storage unit into the pre-fingerprint bits of the primary index table, and inserting, in order, the post-fingerprint of the data member to be inserted and the post-fingerprint held in the retained main index storage unit into the two generated secondary index storage units; after the insertion is finished, releasing the retained main index storage unit and decreasing the remaining address count of the primary index table by 2; ending the data insertion and entering step S13;
S10, judging whether the remaining address count of the current index table is 0; if so, entering step S12; otherwise, entering step S11;
S11, generating a secondary index storage unit for the current index table, setting its relocation identification value to 0, pointing the secondary index storage unit address storage bit used for the data in the current index table to the generated unit, inserting the pre-fingerprint of the data member to be inserted into the pre-fingerprint bit of the current index table, inserting the post-fingerprint of the data member to be inserted into the generated secondary index storage unit, decreasing the remaining address count of the index table by 1, ending the data insertion and entering step S13;
S12, judging whether a next-level index table exists; if so, entering step S10; otherwise, performing the relocation operation and entering step S13; the relocation operation comprises the following sub-steps:
S12-1, generating an intermediate variable temp in the format of a main index storage unit, randomly selecting an existing data member in the original primary index table and filling all of its fingerprint information into temp; after the filling is finished, clearing the secondary index table storage bit of that data member and releasing its original secondary index storage unit;
S12-2, generating a new secondary index storage unit for the primary index table, setting its relocation identification value to 0, and pointing the secondary index storage unit address storage bit of the vacated position in the primary index table to the new unit; inserting the pre-fingerprint of the data member $\xi_X$ to be inserted into the vacated pre-fingerprint bit of the primary index table, and inserting the post-fingerprint of $\xi_X$ into the new secondary index storage unit;
S12-3, judging whether the relocation identification value of temp is 0; if so, entering step S12-4; otherwise, entering step S12-11;
S12-4, calculating the position of the other candidate bucket of temp, and judging whether the address stored in the main index entry pointed to by the main index pointer of that candidate bucket is empty; if so, entering step S12-5; otherwise, entering step S12-6;
S12-5, setting the relocation identification value of temp to 1, pointing the main index pointer of that candidate bucket to temp, ending the relocation operation and entering step S13;
S12-6, judging whether the main index pointer of the other candidate bucket of temp points to a main index storage unit; if so, entering step S12-7; otherwise, entering step S12-9;
S12-7, retaining that main index storage unit, generating a primary index table, setting its No. 0 address bit to NULL, and pointing the main index pointer of the corresponding candidate bucket to the primary index table;
S12-8, generating two secondary index storage units for the primary index table, one with its relocation identification value set to 0 and the other with its relocation identification value set to 1; inserting, in order, the pre-fingerprint of temp and the pre-fingerprint of the retained main index storage unit into the pre-fingerprint bits of the secondary index table storage bits of the primary index table; pointing the secondary index storage unit address storage bit associated with the pre-fingerprint of temp to the unit whose relocation identification value is 1, and pointing the address storage bit associated with the pre-fingerprint of the retained main index storage unit to the unit whose relocation identification value is 0; filling the post-fingerprint of temp into the unit whose relocation identification value is 1, and inserting the post-fingerprint of the retained main index storage unit into the unit whose relocation identification value is 0; after the insertion is finished, releasing temp and the retained main index storage unit, decreasing the remaining address count of the primary index table by 2, ending the relocation operation and entering step S13;
S12-9, judging whether the remaining address count of the index table currently pointed to by the main index pointer of the other candidate bucket of temp is 0; if so, entering step S12-1; otherwise, entering step S12-10;
S12-10, generating a secondary index storage unit for the current index table and setting its relocation identification value to 1; inserting the pre-fingerprint of temp into the pre-fingerprint bit of a secondary index table storage bit of the current index table, pointing the corresponding secondary index storage unit address storage bit to the generated unit, and inserting the post-fingerprint of temp into the generated unit; decreasing the remaining address count of the current index table by 1, releasing temp, ending the relocation operation and entering step S13;
S12-11, finding the current last-level index table of the current candidate bucket of temp, and judging whether its remaining address count is 0; if so, entering step S12-12; otherwise, entering step S12-14;
S12-12, generating a next-level index table, setting its No. 0 address bit to NULL, and pointing the No. 0 address bit of the current last-level index table to the next-level index table;
S12-13, generating a secondary index storage unit for the next-level index table, and filling into it the post-fingerprint information of temp together with the relocation identification value of temp increased by 1; inserting the pre-fingerprint information of temp into the pre-fingerprint bit of a secondary index table storage bit, and pointing the corresponding secondary index storage unit address storage bit to the generated unit; after the insertion is finished, releasing temp, decreasing the remaining address count of the next-level index table by 1, ending the relocation operation and entering step S13;
S12-14, generating a secondary index storage unit for the current last-level index table, and filling into it the post-fingerprint information of temp together with the relocation identification value of temp increased by 1; inserting the pre-fingerprint information of temp into the pre-fingerprint bit of a secondary index table storage bit, and pointing the corresponding secondary index storage unit address storage bit to the generated unit; after the insertion is finished, releasing temp, decreasing the remaining address count of the current last-level index table by 1, ending the relocation operation and entering step S13;
S13, judging whether to continue acquiring the current instruction type; if so, returning to step S2; otherwise, ending the fingerprint filtering of the cuckoo filter.
2. The data processing method based on the discrete double-fingerprint storage cuckoo filter as claimed in claim 1, wherein: at initialization in step S1, the whole cuckoo filter system contains only one index table, namely the main index table, whose address bits are all empty; the smallest storage unit pointed to by an address of the main index table is the main index storage unit, which consists of a pre-fingerprint storage bit, a post-fingerprint storage bit and a relocation identification value; the two fingerprint storage bits occupy the same amount of space, and the relocation identification value occupies 1 bit and has three states, 0, 1 and n, where 0 indicates a newly inserted fingerprint, 1 indicates that one relocation operation has occurred for the storage unit, and n indicates that n relocation operations have occurred for the storage unit.
3. The data processing method based on the discrete double-fingerprint storage cuckoo filter as claimed in claim 1, wherein: the primary index table consists of a remaining address count storage bit, a No. 0 address bit and secondary index table storage bits; the remaining address count storage bit stores the number of unused secondary index table storage bits of the index table, is initialized to n, and does not count the No. 0 address bit; each secondary index table storage bit comprises a pre-fingerprint bit and a secondary index storage unit address storage bit, the latter pointing to the secondary index storage unit generated for that entry; entries are stored in reverse order starting from bit n, and once bit 1 has been used the remaining address count of the index table is 0, that is, the index table is fully loaded; when the index table is fully loaded, a next-level index table is inserted, whose structure is the same as that of the primary index table; the No. 0 address bit is initialized to NULL, and when a next-level index table exists, this bit stores the address of the next-level index table.
4. The data processing method based on the discrete double-fingerprint storage cuckoo filter as claimed in claim 1, wherein: in step S3, k is less than or equal to 1/2 of the total number of bits of the fingerprint information that the hash function computes for the data member $\xi_X$ to be inserted.
5. The data processing method based on the discrete double-fingerprint storage cuckoo filter as claimed in claim 1, wherein: in step S4, the two candidate bucket positions $\chi$ and $\gamma$ of the data member $\xi_X$ to be inserted are $\chi = h_1(\xi_X)$ and $\gamma = \chi \oplus h_1(f_X)$, where $\oplus$ is the exclusive-or operation, $f_X$ is the fingerprint information of the data member to be inserted, and $h_1(\cdot)$ is a hash function; a data member has two candidate buckets, but its data information is stored in only one of them.
6. The data processing method based on the discrete double-fingerprint storage cuckoo filter as claimed in claim 1, wherein the specific method for performing data query in step S2 comprises the following steps:
S2-1-1, calculating the fingerprint information $f_Y$ of the data member $\xi_Y$ to be queried, and obtaining the pre-fingerprint and the post-fingerprint $f_Y^B$ of the member;
S2-1-2, calculating the two candidate bucket positions $\chi'$ and $\gamma'$ of the data member $\xi_Y$ to be queried;
S2-1-3, judging whether the pre-fingerprint exists in the candidate buckets $\chi'$ and $\gamma'$; if so, entering step S2-1-4; otherwise, ending the data query and entering step S13;
S2-1-4, judging whether the post-fingerprint stored with the pre-fingerprint in the candidate bucket is the same as the post-fingerprint $f_Y^B$; if so, judging that the query is successful, ending the data query and entering step S13; otherwise, judging that the query fails, ending the data query and entering step S13.
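The two-stage comparison in steps S2-1-3 and S2-1-4 can be sketched as below. This is a simplified model rather than the patented storage layout: each bucket is represented as a plain list of (pre-fingerprint, post-fingerprint) pairs held in a dictionary, and the function name `query` and its parameters are hypothetical. Bucket positions and fingerprints are passed in already computed.

```python
def query(buckets, chi, gamma, pre, post):
    """Return True if (pre, post) is found in candidate bucket chi or gamma.

    buckets: dict mapping a bucket position to a list of (pre, post) pairs.
    """
    # S2-1-3: look for the pre-fingerprint in both candidate buckets.
    matches = [
        entry
        for position in (chi, gamma)
        for entry in buckets.get(position, [])
        if entry[0] == pre
    ]
    if not matches:
        return False  # pre-fingerprint absent: the query ends here
    # S2-1-4: compare the stored post-fingerprint with the queried one.
    return any(stored_post == post for _, stored_post in matches)
```

Checking the short pre-fingerprint first means most negative queries stop after the first stage, and the post-fingerprint comparison is only reached when the pre-fingerprint already matches.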
7. The data processing method based on the discrete double-fingerprint storage cuckoo filter as claimed in claim 6, wherein: in step S2-1-1, the two candidate bucket positions $\chi'$ and $\gamma'$ of the data member $\xi_Y$ to be queried are $\chi' = h_1(\xi_Y)$ and $\gamma' = \chi' \oplus h_1(f_Y)$, respectively, wherein $\oplus$ is the exclusive-or operation; $h_1(\cdot)$ is a hash function.
8. The data processing method based on the discrete double-fingerprint storage cuckoo filter as claimed in claim 1, wherein the specific method for deleting data in step S2 comprises the following steps:
S2-2-1, calculating the fingerprint information of the member $\xi_{del}$ to be deleted, and obtaining the pre-fingerprint and the post-fingerprint of the member;
S2-2-2, calculating the two candidate bucket positions $\chi_{del}$ and $\gamma_{del}$ of the member $\xi_{del}$ to be deleted;
S2-2-3, judging whether the member $\xi_{del}$ to be deleted exists in candidate bucket $\chi_{del}$ or $\gamma_{del}$; if so, entering step S2-2-4; otherwise, ending the data deletion and entering step S13;
S2-2-4, judging whether the storage unit of the member $\xi_{del}$ to be deleted is a main index storage unit; if so, entering step S2-2-5; otherwise, entering step S2-2-6;
S2-2-5, canceling the main index storage unit, and setting the candidate bucket address bit of the main index table to NULL; ending the data deletion and entering step S13;
S2-2-6, canceling the secondary index storage unit, and setting the value of the corresponding secondary index table storage bit in the current index table to NULL;
S2-2-7, judging whether the remaining address count of the current index table is n-1, i.e., the table carries no load; if so, entering step S2-2-8; otherwise, entering step S2-2-9;
S2-2-8, setting the No. 0 address bit of the upper-level index table of the current index table to NULL, canceling the current index table, ending the data deletion and entering step S13;
S2-2-9, searching for the last-level index table, migrating fingerprint data in the last-level index table to the current index table, and incrementing the remaining address count of the last-level index table by 1;
S2-2-10, taking the last-level index table as the current index table, and judging whether its remaining address count is n, i.e., the table carries no load; if so, entering step S2-2-8; otherwise, ending the data deletion and entering step S13.
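The secondary-index branch of the deletion procedure (steps S2-2-6 to S2-2-10) can be sketched as follows. This is an illustrative model under several assumptions: index tables fill positions n down to 1 and are chained through a `next_table` attribute standing in for the No. 0 address bit; the upper-level table is found by walking from the first-level table; and the entry migrated from the last-level table is its most recently filled one. All names are hypothetical, and the main-index branch (S2-2-5) is omitted.

```python
class IndexTable:
    def __init__(self, n):
        self.n = n
        self.remaining = n              # free positions among 1..n
        self.next_table = None          # stands in for the No. 0 address bit
        self.slots = [None] * (n + 1)   # positions 1..n; index 0 unused


def append(head, entry):
    """Demo helper: fill positions n, n-1, ..., 1, chaining tables when full."""
    table = head
    while table.remaining == 0:
        if table.next_table is None:
            table.next_table = IndexTable(table.n)
        table = table.next_table
    table.slots[table.remaining] = entry
    table.remaining -= 1


def _unlink(head, table):
    """S2-2-8: clear the upper-level link to `table` (the head is kept)."""
    upper = head
    while upper is not None and upper.next_table is not table:
        upper = upper.next_table
    if upper is not None:
        upper.next_table = None


def delete_secondary(head, table, pos):
    """Delete the entry at table.slots[pos] and keep the chain compact."""
    table.slots[pos] = None                       # S2-2-6
    if table.remaining == table.n - 1:            # S2-2-7: it was the only entry
        table.remaining = table.n
        _unlink(head, table)                      # S2-2-8
        return
    last = head                                   # S2-2-9: find last-level table
    while last.next_table is not None:
        last = last.next_table
    src = last.remaining + 1                      # most recently filled position
    if not (last is table and src == pos):
        table.slots[pos] = last.slots[src]        # migrate into the hole
        last.slots[src] = None
    last.remaining += 1
    if last.remaining == last.n:                  # S2-2-10: last table is empty
        _unlink(head, last)                       # S2-2-8
```

Backfilling the hole from the last-level table keeps every table except the last fully loaded, so an emptied last-level table can simply be released.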
9. The data processing method based on the discrete double-fingerprint storage cuckoo filter as claimed in claim 8, wherein: in step S2-2-1, the two candidate bucket positions $\chi_{del}$ and $\gamma_{del}$ of the member $\xi_{del}$ to be deleted are $\chi_{del} = h_1(\xi_{del})$ and $\gamma_{del} = \chi_{del} \oplus h_1(f_{del})$, respectively, wherein $\oplus$ is the exclusive-or operation; $f_{del}$ is the fingerprint information of the member to be deleted; $h_1(\cdot)$ is a hash function.
CN202111181649.4A 2021-10-11 2021-10-11 Data processing method of double-fingerprint storage cuckoo filter based on discrete type Active CN113886391B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111181649.4A CN113886391B (en) 2021-10-11 2021-10-11 Data processing method of double-fingerprint storage cuckoo filter based on discrete type

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111181649.4A CN113886391B (en) 2021-10-11 2021-10-11 Data processing method of double-fingerprint storage cuckoo filter based on discrete type

Publications (2)

Publication Number Publication Date
CN113886391A CN113886391A (en) 2022-01-04
CN113886391B true CN113886391B (en) 2023-03-28

Family

ID=79006040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111181649.4A Active CN113886391B (en) 2021-10-11 2021-10-11 Data processing method of double-fingerprint storage cuckoo filter based on discrete type

Country Status (1)

Country Link
CN (1) CN113886391B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105630955B (en) * 2015-12-24 2019-01-29 华中科技大学 A kind of data acquisition system member management method of high-efficiency dynamic
US10515064B2 (en) * 2016-07-11 2019-12-24 Microsoft Technology Licensing, Llc Key-value storage system including a resource-efficient index
CN113360516B (en) * 2021-08-11 2021-11-26 成都信息工程大学 Collection member management method

Also Published As

Publication number Publication date
CN113886391A (en) 2022-01-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant