CN103092885A - Method and device for creating sparse indexes, sparse index and query method and device - Google Patents

Info

Publication number: CN103092885A
Authority: CN (China)
Prior art keywords: blocks, value, files, data recording, key
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Application number: CN2011103476374A
Other languages: Chinese (zh)
Inventors: 周大, 钱岭, 郭磊涛, 齐骥
Current Assignee: China Mobile Communications Group Co Ltd
Original Assignee: China Mobile Communications Group Co Ltd

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for creating a sparse index. For each data record to be processed, the same hash function is used to calculate the hash value of its key value, and the data record is saved into the corresponding partition according to the calculated hash value, so that all data records saved in the same partition have the same hash value. Each partition is initially empty; when the saved data records meet a preset requirement, they are used to form a file block; when the saved data records that do not yet belong to a file block meet the preset requirement again, they are used to form another file block, and so on. An index entry is created for every file block formed. The method and the device can speed up the creation of a sparse index. The invention also discloses a sparse index, and a query method and device based on the sparse index.

Description

Method and device for creating a sparse index, sparse index, and query method and device
Technical field
The present invention relates to data processing technology, and in particular to a method and a device for creating a sparse index, a sparse index, and a query method and device based on the sparse index.
Background art
When data is loaded, an index is usually created for the data records to facilitate subsequent queries. The index may be a dense index, a sparse index, or the like.
A dense index requires one index entry for each data record, whereas a sparse index requires only one index entry for each group, where each group contains several data records.
In the prior art, a sparse index is usually created as follows: the data records to be processed, that is, the data records to be loaded, are sorted according to some rule, for example in ascending order of key value; the sorted data records are split into several groups; and one index entry is created for each group. Each index entry contains a key value and a pointer: the key value is typically that of the first data record in the group, and the pointer points to the start position of the first data record in the group.
Fig. 1 is a schematic diagram of a sparse index created in the existing manner. As shown in Fig. 1, 010101, 020101, and so on are key values, each row indicated by a thick arrow is one data record, the first 3 data records form one group, and the last 4 data records form another group.
In practice, however, this approach has a problem: all data records must be sorted before any subsequent processing can take place, and the sorting process is very complex to implement, so the sparse index is created very slowly.
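The prior-art procedure described in this background section can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the fixed `group_size` (Fig. 1 uses groups of unequal size), the `(key, payload)` record layout, and the assumption of unique keys are all illustrative choices.

```python
from bisect import bisect_right

def build_sorted_sparse_index(records, group_size):
    """Prior-art style build: sort all records, cut them into groups,
    and keep one (first key, start position) entry per group."""
    ordered = sorted(records, key=lambda r: r[0])   # the costly sorting step
    index = [(ordered[start][0], start)
             for start in range(0, len(ordered), group_size)]
    return ordered, index

def lookup(ordered, index, key, group_size):
    """Find the group whose first key is the largest one <= key, then scan it.
    Assumes keys are unique, so a key never straddles two groups."""
    first_keys = [k for k, _ in index]
    pos = bisect_right(first_keys, key) - 1
    if pos < 0:
        return []
    start = index[pos][1]
    return [r for r in ordered[start:start + group_size] if r[0] == key]

records = [("020101", "b"), ("010101", "a"), ("040101", "d"), ("030101", "c")]
ordered, index = build_sorted_sparse_index(records, group_size=2)
```

The sketch makes the drawback visible: no index entry can be produced until `sorted` has consumed every record.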
Summary of the invention
In view of this, the present invention provides a method and a device for creating a sparse index that can speed up the creation of the sparse index.
The present invention also provides a sparse index, and a query method and device based on the sparse index.
To achieve the above objects, the technical solution of the present invention is implemented as follows:
A method for creating a sparse index, comprising:
for each data record to be processed, calculating the hash value of its key value using the same hash function, and saving the data record into the corresponding partition according to the calculated hash value, wherein the data records saved in the same partition have the same hash value;
for any partition, which is initially empty: when the saved data records meet a preset requirement, forming a file block from the saved data records; when the saved data records that do not yet belong to a file block meet the preset requirement again, forming another file block from those data records; and so on; and, each time a file block is formed, creating an index entry for that file block.
A device for creating a sparse index, comprising:
a computing module, configured to calculate, for each data record to be processed, the hash value of its key value using the same hash function, and to send the data record and the calculated hash value to a creation module;
the creation module, configured to save each received data record into the corresponding partition according to the received hash value, wherein the data records saved in the same partition have the same hash value; for any partition, which is initially empty: when the saved data records meet a preset requirement, to form a file block from the saved data records; when the saved data records that do not yet belong to a file block meet the preset requirement again, to form another file block from those data records; and so on; and, each time a file block is formed, to create an index entry for that file block.
A sparse index, comprising:
one index entry for each file block in each partition, wherein each partition has a number different from that of every other partition, and each file block has a number different from that of every other file block in the same partition;
each index entry comprises: a maximum key value, a minimum key value, a partition number, a file block number, and a hash function name; wherein
the maximum key value is the largest of the key values of the data records in the file block corresponding to the index entry;
the minimum key value is the smallest of the key values of the data records in the file block corresponding to the index entry;
the partition number is the number of the partition to which the file block corresponding to the index entry belongs;
the file block number is the number of the file block corresponding to the index entry.
A query method based on the above sparse index, comprising:
receiving a key value to be queried, finding among the index entries those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, and taking the index entries found as candidate index entries;
for each candidate index entry, using the hash function identified by the hash function name in that entry to calculate the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in the candidate index entry; and, if the hash value of the key value to be queried equals the hash value of the minimum key value or the maximum key value in the candidate index entry, taking the candidate index entry as a result index entry;
traversing the data records in the file block corresponding to each result index entry to obtain the data records corresponding to the key value to be queried.
A query device based on the above sparse index, comprising:
a receiving module, configured to receive a key value to be queried and send it to a processing module;
the processing module, configured to find among the index entries those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, and to take the index entries found as candidate index entries; for each candidate index entry, to use the hash function identified by the hash function name in that entry to calculate the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in the candidate index entry, and, if the hash value of the key value to be queried equals the hash value of the minimum key value or the maximum key value in the candidate index entry, to take the candidate index entry as a result index entry; and to traverse the data records in the file block corresponding to each result index entry to obtain the data records corresponding to the key value to be queried.
It can be seen that, with the technical solution of the present invention, a sparse index can be created without sorting the data records to be processed, which speeds up the creation of the sparse index, and data queries can be completed based on this sparse index.
Brief description of the drawings
Fig. 1 is a schematic diagram of a sparse index created in the existing manner.
Fig. 2 is a flowchart of an embodiment of the method for creating a sparse index according to the present invention.
Fig. 3 is a schematic diagram of the process of creating a sparse index according to the present invention.
Fig. 4 is a schematic diagram of a sparse index created in the manner of the present invention.
Fig. 5 is a schematic structural diagram of an embodiment of the device for creating a sparse index according to the present invention.
Fig. 6 is a schematic structural diagram of an embodiment of the query device for a sparse index according to the present invention.
Detailed description of the embodiments
To address the problems in the prior art, the present invention proposes an improved scheme for creating a sparse index that does not require sorting the data records to be processed, thereby speeding up the creation of the sparse index.
To make the technical solution of the present invention clearer, the solution is described in further detail below with reference to the accompanying drawings and embodiments.
Fig. 2 is a flowchart of an embodiment of the method for creating a sparse index according to the present invention. As shown in Fig. 2, the method comprises the following steps:
Step 21: for each data record to be processed, calculate the hash value of its key value using the same hash function, and save the data record into the corresponding partition according to the calculated hash value; the data records saved in the same partition have the same hash value.
An index usually needs to be created when data is loaded, so the data records to be processed are typically the data records to be loaded.
In this step, each data record to be processed is obtained in turn, and each obtained data record is processed as follows:
1) calculate the hash value of the key value of the data record using the hash function;
2) save the data record into the corresponding partition according to the calculated hash value; the data records saved in the same partition have the same hash value.
Which hash function is used to calculate the hash value can be decided according to actual requirements. The number of partitions equals the number of distinct hash values that the hash function can produce.
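Step 21 can be sketched in a few lines of Python. This is an illustrative sketch only: the patent leaves the choice of hash function open, so the modulo hash `mod4`, the integer keys, and the `(key, payload)` record layout are assumptions.

```python
from collections import defaultdict

def mod4(key):
    # Hypothetical hash function: 4 distinct hash values, hence 4 partitions.
    return key % 4

def partition_records(records, hash_fn):
    """Save each record into the partition selected by the hash of its key.
    No sorting is involved; records are handled one at a time as they arrive."""
    partitions = defaultdict(list)
    for key, payload in records:
        partitions[hash_fn(key)].append((key, payload))
    return partitions

records = [(9, "a"), (4, "b"), (13, "c"), (6, "d"), (8, "e")]
partitions = partition_records(records, mod4)
```

Every record in a given partition shares one hash value, which is the property the query method later relies on.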
Step 22: for any partition, which is initially empty: when the saved data records meet a preset requirement, form a file block from the saved data records; when the saved data records that do not yet belong to a file block meet the preset requirement again, form another file block from those data records; and so on. Each time a file block is formed, create an index entry for that file block.
Each partition is initially empty, and file blocks are produced one after another as more data records are saved. That is, when the saved data records meet the preset requirement, a file block is formed from them; afterwards, when the saved data records that do not yet belong to a file block meet the preset requirement again, another file block is formed from those data records; and so on. As a special case, when no new data records remain to be saved, that is, all data records to be processed have been handled, but the saved data records that do not yet belong to a file block do not meet the preset requirement, those data records are nevertheless formed into a file block.
For example, suppose that meeting the preset requirement means reaching a predetermined number, and that the predetermined number is 100. Then, for any partition, when 100 data records have been saved, these 100 data records form a file block; after that, once another 100 data records have been saved, these 100 newly saved data records form a new file block; and so on. When no new data records remain to be saved but 50 of the saved data records do not yet belong to any file block, these 50 data records form a file block.
It should be noted that meeting the preset requirement is not limited to reaching a predetermined number; it may also mean meeting another requirement, for example that the total amount of data reaches a predetermined threshold.
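The block-forming rule of step 22, including the final flush of leftover records, can be sketched as follows. The class name `Partition` and its methods are hypothetical, and the preset requirement is assumed to be a record count, as in the example above.

```python
class Partition:
    """One partition: starts empty and emits a file block whenever the
    records not yet in a block reach the preset count."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.pending = []   # saved records that do not yet belong to a file block
        self.blocks = []    # file blocks in the order they were formed

    def save(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.block_size:
            self._form_block()

    def finish(self):
        # Special case: no more input, so leftover records still form a block.
        if self.pending:
            self._form_block()

    def _form_block(self):
        self.blocks.append(self.pending)
        self.pending = []

partition = Partition(block_size=100)
for record in range(250):
    partition.save(record)
partition.finish()
block_sizes = [len(block) for block in partition.blocks]
```

With 250 records and a threshold of 100, the partition ends up with two full blocks and one block holding the remaining 50 records, mirroring the example in the text.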
In practice, each partition may be given a number different from that of every other partition; for example, N partitions may be numbered partition 1 to partition N. Likewise, each file block may be given a number different from that of every other file block in the same partition; for example, the M file blocks in a partition may be numbered file block 1 to file block M. M and N are positive integers greater than 1. Usually, the earlier a file block is formed, the smaller its number: the first file block formed is file block 1, the next is file block 2, then file block 3, and so on.
Each time a file block is formed, an index entry is created for it, comprising 5 attribute values: maximum key value, minimum key value, partition number, file block number, and hash function name.
The maximum key value is the largest of the key values of the data records in the file block corresponding to the index entry.
The minimum key value is the smallest of the key values of the data records in the file block corresponding to the index entry.
The partition number is the number of the partition to which the file block corresponding to the index entry belongs.
The file block number is the number of the file block corresponding to the index entry.
The hash function name is the name of the hash function used to calculate the hash values.
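The five attributes of an index entry map naturally onto a small record type. The sketch below is illustrative: the field names, the helper `make_index_entry`, and the hash function name `"by_hour"` are assumptions, not identifiers from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexEntry:
    max_key: str        # largest key value in the file block
    min_key: str        # smallest key value in the file block
    partition_no: int   # number of the partition the block belongs to
    block_no: int       # number of the block within that partition
    hash_name: str      # name of the hash function used for partitioning

def make_index_entry(block_keys, partition_no, block_no, hash_name):
    """Create the entry for a freshly formed file block. Only a min/max scan
    of the block's keys is needed; the block itself stays unsorted."""
    return IndexEntry(max(block_keys), min(block_keys),
                      partition_no, block_no, hash_name)

entry = make_index_entry(["020101", "010101", "030101"],
                         partition_no=1, block_no=1, hash_name="by_hour")
```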
Based on the above description, the process of creating a sparse index can be represented as shown in Fig. 3. Each data record to be processed is handled in the manner shown in Fig. 3, finally yielding several file blocks, each with one corresponding index entry.
Fig. 4 is a schematic diagram of a sparse index created in the manner of the present invention. As shown in Fig. 4, each row indicated by an arrow is one index entry.
The present invention also provides a query method based on the above sparse index, comprising:
1) Receive a key value to be queried, find among the index entries those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, and take the index entries found as candidate index entries.
That is, all index entries are searched, and those satisfying the condition "minimum key value ≤ key value to be queried ≤ maximum key value" are taken as candidate index entries.
2) For each candidate index entry, use the hash function identified by the hash function name in that entry to calculate the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in the candidate index entry; if the hash value of the key value to be queried equals the hash value of the minimum key value or the maximum key value in the candidate index entry, take the candidate index entry as a result index entry.
3) Traverse the data records in the file block corresponding to each result index entry to obtain the data records corresponding to the key value to be queried.
Steps 1) and 2) yield only the result index entries. To obtain the actual data records, the file block corresponding to each result index entry must also be located using its file block number and partition number, and these file blocks must then be traversed.
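The three query steps can be sketched as follows. The dictionary-based index entries, the `HASH_FUNCTIONS` registry that resolves a hash function name to a callable, the `blocks` mapping keyed by (partition number, file block number), and the modulo hash are all assumptions made for illustration.

```python
# Hypothetical registry: hash function name -> callable.
HASH_FUNCTIONS = {"mod4": lambda key: key % 4}

def query(index, blocks, key):
    # Step 1: range filter on min/max key values gives the candidate entries.
    candidates = [e for e in index if e["min_key"] <= key <= e["max_key"]]
    # Step 2: keep candidates whose block shares the hash value of the key.
    # All records in a block have the same hash, so comparing against the
    # hash of the block's minimum key is sufficient.
    results = []
    for e in candidates:
        hash_fn = HASH_FUNCTIONS[e["hash_name"]]
        if hash_fn(key) == hash_fn(e["min_key"]):
            results.append(e)
    # Step 3: locate each result block by partition and block number, scan it.
    found = []
    for e in results:
        for record in blocks[(e["partition_no"], e["block_no"])]:
            if record[0] == key:
                found.append(record)
    return found

index = [
    {"min_key": 4, "max_key": 12, "partition_no": 1, "block_no": 1, "hash_name": "mod4"},
    {"min_key": 1, "max_key": 9, "partition_no": 2, "block_no": 1, "hash_name": "mod4"},
]
blocks = {
    (1, 1): [(4, "a"), (12, "b"), (8, "c")],
    (2, 1): [(1, "d"), (9, "e"), (5, "f")],
}
```

In this data both entries pass the range filter for key 8, but only the first survives the hash comparison, so only one block is scanned.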
The above process of creating a sparse index and the corresponding query process can be illustrated as a whole as follows:
In the telecommunications field, the key value of each data record usually carries the generation time of that data record, which may include the year, month, day, hour, minute, and so on.
A hash function that partitions by hour can therefore be used to save the data records into different partitions. For data record X, if its key value shows that it was generated at hour 0 (ignoring the year, month, day, and minute), it is saved into partition 1; for data record Y, if its key value shows that it was generated at hour 1, it is saved into partition 2; for data record Z, if its key value shows that it was generated at hour 2, it is saved into partition 3; and so on. There are thus 24 partitions in total, which can be numbered partition 1 to partition 24.
Each partition contains several file blocks. Suppose each partition contains 5 file blocks (in practice the number of file blocks usually differs from partition to partition); these 5 file blocks can then be numbered file block 1 to file block 5. An index entry is created for each file block, comprising: maximum key value, minimum key value, partition number, file block number, and hash function name. The maximum key value and the minimum key value are both complete time information, that is, they include the year, month, day, hour, minute, and so on. For any two key values A and B, if the time in key value A is closer to the current time than the time in key value B, key value A is considered greater than key value B.
Subsequently, when a key value to be queried is received, the candidate index entries are determined from the complete time information in the key value to be queried and the maximum and minimum key values in each index entry. The file blocks corresponding to these candidate index entries may come from different partitions. Index entries that do not qualify are filtered out by calculating hash values, and those remaining are the result index entries. Finally, the data records in the file block corresponding to each result index entry are traversed to obtain the data records corresponding to the key value to be queried, that is, the data records whose key value equals the key value to be queried.
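The telecom example can be worked end to end in a short Python sketch. The key layout `YYYYMMDDHHMM` (chosen so that string comparison matches chronological order), the hash function name `by_hour`, the tuple layout of index entries, and the block size of 2 are all assumptions made for illustration.

```python
from collections import defaultdict

def by_hour(key):
    # Hypothetical hash function: the hour field of a "YYYYMMDDHHMM" key.
    return int(key[8:10])

def build(records, block_size):
    pending = defaultdict(list)      # partition number -> records not yet in a block
    block_count = defaultdict(int)   # partition number -> blocks formed so far
    blocks, index = {}, []

    def form_block(part_no):
        block_count[part_no] += 1
        recs = pending.pop(part_no)
        keys = [k for k, _ in recs]
        blocks[(part_no, block_count[part_no])] = recs
        index.append((max(keys), min(keys), part_no, block_count[part_no], "by_hour"))

    for key, payload in records:
        part_no = by_hour(key) + 1   # hour 0 -> partition 1, ..., hour 23 -> partition 24
        pending[part_no].append((key, payload))
        if len(pending[part_no]) >= block_size:
            form_block(part_no)
    for part_no in list(pending):    # final flush of leftover records
        form_block(part_no)
    return blocks, index

def query(blocks, index, key):
    hits = []
    for max_key, min_key, part_no, block_no, _hash_name in index:
        if min_key <= key <= max_key and by_hour(key) == by_hour(min_key):
            hits += [r for r in blocks[(part_no, block_no)] if r[0] == key]
    return hits

records = [("201101010005", "X"), ("201101010110", "Y"),
           ("201101010230", "Z"), ("201101020015", "W")]
blocks, index = build(records, block_size=2)
```

Here records X and W share hour 0 and form file block 1 of partition 1, whose key range spans a whole day. A query for Y's key falls inside that range, but the hour comparison rules the block out, so only the block in partition 2 is scanned.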
Based on the above description, Fig. 5 is a schematic structural diagram of an embodiment of the device for creating a sparse index according to the present invention. As shown in Fig. 5, the device comprises:
a computing module, configured to calculate, for each data record to be processed, the hash value of its key value using the same hash function, and to send the data record and the calculated hash value to a creation module;
the creation module, configured to save each received data record into the corresponding partition according to the received hash value, wherein the data records saved in the same partition have the same hash value; for any partition, which is initially empty: when the saved data records meet a preset requirement, to form a file block from the saved data records; when the saved data records that do not yet belong to a file block meet the preset requirement again, to form another file block from those data records; and so on; and, each time a file block is formed, to create an index entry for that file block.
The creation module may be further configured, for any partition, to form a file block from the saved data records that do not yet belong to a file block when no new data records remain to be saved but those data records do not meet the preset requirement.
Meeting the preset requirement typically means reaching a predetermined number.
Each partition has a number different from that of every other partition, and each file block has a number different from that of every other file block in the same partition.
Each index entry comprises: a maximum key value, a minimum key value, a partition number, a file block number, and a hash function name; wherein
the maximum key value is the largest of the key values of the data records in the file block corresponding to the index entry;
the minimum key value is the smallest of the key values of the data records in the file block corresponding to the index entry;
the partition number is the number of the partition to which the file block corresponding to the index entry belongs;
the file block number is the number of the file block corresponding to the index entry;
the hash function name is the name of the hash function used to calculate the hash values.
The device shown in Fig. 5 may further comprise:
a query module, configured to receive a key value to be queried, to find among the index entries those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, and to take the index entries found as candidate index entries; for each candidate index entry, to use the hash function identified by the hash function name in that entry to calculate the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in the candidate index entry, and, if the hash value of the key value to be queried equals the hash value of the minimum key value or the maximum key value in the candidate index entry, to take the candidate index entry as a result index entry; and to traverse the data records in the file block corresponding to each result index entry to obtain the data records corresponding to the key value to be queried.
Fig. 6 is a schematic structural diagram of an embodiment of the query device for a sparse index according to the present invention. As shown in Fig. 6, the device comprises:
a receiving module, configured to receive a key value to be queried and send it to a processing module;
the processing module, configured to find among the index entries those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, and to take the index entries found as candidate index entries; for each candidate index entry, to use the hash function identified by the hash function name in that entry to calculate the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in the candidate index entry, and, if the hash value of the key value to be queried equals the hash value of the minimum key value or the maximum key value in the candidate index entry, to take the candidate index entry as a result index entry; and to traverse the data records in the file block corresponding to each result index entry to obtain the data records corresponding to the key value to be queried.
For the structure of the index entry, refer to the foregoing description; it is not repeated here.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (13)

1. A method for creating a sparse index, characterized by comprising:
for each data record to be processed, calculating the hash value of its key value using the same hash function, and saving the data record into the corresponding partition according to the calculated hash value, wherein the data records saved in the same partition have the same hash value;
for any partition, which is initially empty: when the saved data records meet a preset requirement, forming a file block from the saved data records; when the saved data records that do not yet belong to a file block meet the preset requirement again, forming another file block from those data records; and so on; and, each time a file block is formed, creating an index entry for that file block.
2. The method according to claim 1, characterized in that the method further comprises: for any partition, when no new data records remain to be saved but the saved data records that do not yet belong to a file block do not meet the preset requirement, forming a file block from those data records.
3. The method according to claim 1 or 2, characterized in that meeting the preset requirement comprises: reaching a predetermined number.
4. The method according to claim 1, characterized in that
each partition has a number different from that of every other partition, and each file block has a number different from that of every other file block in the same partition;
each index entry comprises: a maximum key value, a minimum key value, a partition number, a file block number, and a hash function name; wherein
the maximum key value is the largest of the key values of the data records in the file block corresponding to the index entry;
the minimum key value is the smallest of the key values of the data records in the file block corresponding to the index entry;
the partition number is the number of the partition to which the file block corresponding to the index entry belongs;
the file block number is the number of the file block corresponding to the index entry;
the hash function name is the name of the hash function used to calculate the hash values.
5. The method according to claim 4, characterized in that, after the sparse index has been created, the method further comprises:
receiving a key value to be queried, finding among the index entries those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, and taking the index entries found as candidate index entries;
for each candidate index entry, using the hash function identified by the hash function name in that entry to calculate the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in the candidate index entry; and, if the hash value of the key value to be queried equals the hash value of the minimum key value or the maximum key value in the candidate index entry, taking the candidate index entry as a result index entry;
traversing the data records in the file block corresponding to each result index entry to obtain the data records corresponding to the key value to be queried.
6. the apparatus for establishing of a sparse index, is characterized in that, comprising:
Computing module is used for for each pending data recording, utilizes respectively same hash function to calculate the hashed value of its key assignments, and this data recording and the hashed value that calculates are sent to sets up module;
The described module of setting up is used for according to the hashed value that receives, the data recording that receives being saved in corresponding subregion, and the data recording that is saved in same subregion has identical hashed value; For arbitrary subregion, starting stage, content wherein is empty, when the data recording of preserving reaches pre-provisioning request, utilize the data recording of preserving to form a blocks of files, when the data recording of the not composing document piece of preserving reaches pre-provisioning request again, utilize not that the data recording of composing document piece forms another blocks of files, the like; Blocks of files of every composition is set up an index entry for this document piece.
7. device according to claim 6, it is characterized in that, the described module of setting up is further used for, for arbitrary subregion, need to preserve when no longer including new data recording, but when the data recording of the not composing document piece of preserving did not reach pre-provisioning request, utilizing not, the data recording of composing document piece formed a blocks of files.
8. The apparatus according to claim 6 or 7, characterized in that meeting the predetermined requirement comprises: reaching a predetermined number.
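The creation procedure of claims 6 to 8 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the modulo hash `mod_hash`, the constants `BLOCK_SIZE` and `NUM_PARTITIONS`, the dictionary layout of an index entry, and the choice to use the hash value directly as the partition number are all assumptions made for illustration.

```python
from collections import defaultdict

BLOCK_SIZE = 3       # hypothetical "predetermined number" of records per file block
NUM_PARTITIONS = 4   # hypothetical number of partitions


def mod_hash(key):
    """Hypothetical hash function: maps an integer key to a partition number."""
    return key % NUM_PARTITIONS


def build(records, hash_fn=mod_hash, block_size=BLOCK_SIZE):
    """Partition (key, value) records by hash value and cut them into file blocks.

    Returns (blocks, index): blocks maps (partition_no, block_no) to a list of
    records, and index holds one entry per file block.
    """
    pending = defaultdict(list)       # records not yet formed into a block, per partition
    next_block_no = defaultdict(int)  # next free block number within each partition
    blocks, index = {}, []

    def seal(partition_no):
        # Form the pending records of one partition into a file block and index it.
        block = pending.pop(partition_no)
        block_no = next_block_no[partition_no]
        next_block_no[partition_no] += 1
        blocks[(partition_no, block_no)] = block
        keys = [key for key, _ in block]
        index.append({
            "min_key": min(keys),
            "max_key": max(keys),
            "partition_no": partition_no,
            "block_no": block_no,
            "hash_name": hash_fn.__name__,
        })

    for key, value in records:
        partition_no = hash_fn(key)   # assumption: hash value doubles as partition number
        pending[partition_no].append((key, value))
        if len(pending[partition_no]) == block_size:
            seal(partition_no)

    # No more input: form the leftover records of each partition into a final,
    # possibly smaller, file block (the case described in claim 7).
    for partition_no in sorted(pending):
        seal(partition_no)
    return blocks, index
```

Because records are only appended to their partition and never sorted, the sketch does a constant amount of work per record plus one pass over each block when it is sealed.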
9. The apparatus according to claim 6, characterized in that:
Each partition has a number that distinguishes it from every other partition, and each file block has a number that distinguishes it from every other file block in the same partition;
Each index entry comprises: a maximum key value, a minimum key value, a partition number, a file block number, and a hash function name; wherein,
The maximum key value is the largest of the key values of the data records in the file block corresponding to the index entry;
The minimum key value is the smallest of the key values of the data records in the file block corresponding to the index entry;
The partition number is the number of the partition to which the file block corresponding to the index entry belongs;
The file block number is the number of the file block corresponding to the index entry;
The hash function name is the name of the hash function used to compute the hash values.
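The five fields of an index entry map naturally onto a small record type. The following Python sketch is one possible representation, not the patent's; the class name `IndexEntry`, the helper `make_entry`, the field names, and the use of integer keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    max_key: int       # largest key value among the records in the file block
    min_key: int       # smallest key value among the records in the file block
    partition_no: int  # number of the partition the file block belongs to
    block_no: int      # number of the file block, unique within its partition
    hash_name: str     # name of the hash function used to compute hash values


def make_entry(block, partition_no, block_no, hash_name):
    """Summarize one file block, given as a list of (key, value) records."""
    keys = [key for key, _ in block]
    return IndexEntry(max(keys), min(keys), partition_no, block_no, hash_name)
```

The index is sparse in the sense that it stores one such entry per file block rather than one per data record.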
10. The apparatus according to claim 9, characterized in that the apparatus further comprises:
A query module, configured to receive a key value to be queried, and to find, among the index entries, those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, taking the index entries found as candidate index entries; for each candidate index entry, to use the hash function identified by the hash function name in that entry to compute the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in that entry, and, if the hash value of the key value to be queried equals the hash value of the minimum or maximum key value in the candidate index entry, to take the candidate index entry as a result index entry; and to traverse each data record in the file block corresponding to each result index entry to obtain the data record corresponding to the key value to be queried.
11. A sparse index, characterized by comprising:
An index entry corresponding to each file block in each partition; each partition has a number that distinguishes it from every other partition, and each file block has a number that distinguishes it from every other file block in the same partition;
Each index entry comprises: a maximum key value, a minimum key value, a partition number, a file block number, and a hash function name; wherein,
The maximum key value is the largest of the key values of the data records in the file block corresponding to the index entry;
The minimum key value is the smallest of the key values of the data records in the file block corresponding to the index entry;
The partition number is the number of the partition to which the file block corresponding to the index entry belongs;
The file block number is the number of the file block corresponding to the index entry.
12. A query method based on the sparse index according to claim 11, characterized by comprising:
Receiving a key value to be queried, and finding, among the index entries, those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, taking the index entries found as candidate index entries;
For each candidate index entry, using the hash function identified by the hash function name in that entry to compute the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in that entry; if the hash value of the key value to be queried equals the hash value of the minimum or maximum key value in the candidate index entry, taking the candidate index entry as a result index entry;
Traversing each data record in the file block corresponding to each result index entry to obtain the data record corresponding to the key value to be queried.
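The three query steps (range filter, hash comparison, block traversal) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the registry `HASH_FUNCTIONS`, the hash name `mod4`, the sample `BLOCKS` and `INDEX`, and the dictionary layout of an index entry are all hypothetical.

```python
# Hypothetical registry mapping a stored hash function name to the function itself.
HASH_FUNCTIONS = {"mod4": lambda key: key % 4}

# Hypothetical sample data: three file blocks, one in each of three partitions.
BLOCKS = {
    (0, 0): [(0, "a"), (4, "b"), (8, "c")],
    (1, 0): [(1, "d"), (5, "e"), (9, "f")],
    (2, 0): [(2, "g"), (6, "h")],
}
INDEX = [
    {"min_key": 0, "max_key": 8, "partition_no": 0, "block_no": 0, "hash_name": "mod4"},
    {"min_key": 1, "max_key": 9, "partition_no": 1, "block_no": 0, "hash_name": "mod4"},
    {"min_key": 2, "max_key": 6, "partition_no": 2, "block_no": 0, "hash_name": "mod4"},
]


def query(index, blocks, wanted_key):
    """Return every (key, value) record whose key equals wanted_key."""
    # Step 1: candidate entries are those whose key range covers the wanted key.
    candidates = [e for e in index if e["min_key"] <= wanted_key <= e["max_key"]]

    # Step 2: keep a candidate only if the wanted key hashes to the same value
    # as the entry's minimum key, i.e. it belongs to the same partition.
    results = []
    for entry in candidates:
        hash_fn = HASH_FUNCTIONS[entry["hash_name"]]
        if hash_fn(wanted_key) == hash_fn(entry["min_key"]):
            results.append(entry)

    # Step 3: traverse only the file blocks of the result entries.
    found = []
    for entry in results:
        for key, value in blocks[(entry["partition_no"], entry["block_no"])]:
            if key == wanted_key:
                found.append((key, value))
    return found
```

With the sample data, the range test alone leaves all three entries as candidates for key 5, and the hash comparison narrows them to the single block in partition 1, so only that block is traversed.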
13. A query apparatus based on the sparse index according to claim 11, characterized by comprising:
A receiving module, configured to receive a key value to be queried and send it to a processing module;
The processing module, configured to find, among the index entries, those whose minimum key value is less than or equal to the key value to be queried and whose maximum key value is greater than or equal to the key value to be queried, taking the index entries found as candidate index entries; for each candidate index entry, to use the hash function identified by the hash function name in that entry to compute the hash value of the key value to be queried and the hash value of the minimum key value or the maximum key value in that entry, and, if the hash value of the key value to be queried equals the hash value of the minimum or maximum key value in the candidate index entry, to take the candidate index entry as a result index entry; and to traverse each data record in the file block corresponding to each result index entry to obtain the data record corresponding to the key value to be queried.
CN2011103476374A 2011-11-07 2011-11-07 Method and device for creating sparse indexes, sparse index and query method and device Pending CN103092885A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011103476374A CN103092885A (en) 2011-11-07 2011-11-07 Method and device for creating sparse indexes, sparse index and query method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011103476374A CN103092885A (en) 2011-11-07 2011-11-07 Method and device for creating sparse indexes, sparse index and query method and device

Publications (1)

Publication Number Publication Date
CN103092885A true CN103092885A (en) 2013-05-08

Family

ID=48205462

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011103476374A Pending CN103092885A (en) 2011-11-07 2011-11-07 Method and device for creating sparse indexes, sparse index and query method and device

Country Status (1)

Country Link
CN (1) CN103092885A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408151A (en) * 2014-12-03 2015-03-11 天津南大通用数据技术股份有限公司 User-defined column database function index building method and device
CN106164898A (en) * 2014-10-11 2016-11-23 华为技术有限公司 Data processing method and device
CN108052649A (en) * 2017-12-26 2018-05-18 广州泼墨神网络科技有限公司 The data managing method and its system of a kind of distributed file system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1464451A (en) * 2002-06-26 2003-12-31 联想(北京)有限公司 A sorting method of data record
US7113957B1 (en) * 2001-12-20 2006-09-26 Ncr Corporation Row hash match scan join using summary contexts for a partitioned database system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
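Sun Shaoling et al.: "Research on High-Performance Query Technology for Cloud Data Warehouses", Designing Techniques of Posts and Telecommunications *
Zhang Xinjian et al.: "Research and Application of Oracle Database Partition Optimization Technology", Command Information System and Technology *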

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106164898A (en) * 2014-10-11 2016-11-23 华为技术有限公司 Data processing method and device
CN106164898B (en) * 2014-10-11 2018-06-26 华为技术有限公司 Data processing method and device
US11003719B2 (en) 2014-10-11 2021-05-11 Huawei Technologies Co., Ltd. Method and apparatus for accessing a storage disk
CN104408151A (en) * 2014-12-03 2015-03-11 天津南大通用数据技术股份有限公司 User-defined column database function index building method and device
CN108052649A (en) * 2017-12-26 2018-05-18 广州泼墨神网络科技有限公司 The data managing method and its system of a kind of distributed file system

Similar Documents

Publication Publication Date Title
JP6716727B2 (en) Streaming data distributed processing method and apparatus
CN104572727A (en) Data querying method and device
CN102129425B (en) The access method of big object set table and device in data warehouse
CN109086456B (en) Data indexing method and device
CN103699585A (en) Methods, devices and systems for file metadata storage and file recovery
CN105653537A (en) Paging query method and device for database application system
CN105589894B (en) Document index establishing method and device and document retrieval method and device
CN110727702B (en) Data query method, device, terminal and computer readable storage medium
CN105099729A (en) User ID (Identification) recognition method and device
CN109388636A (en) Business datum is inserted into database method, apparatus, computer equipment and storage medium
CN105005567B (en) Interest point query method and system
CN103092885A (en) Method and device for creating sparse indexes, sparse index and query method and device
CN104346347A (en) Data storage method, device, server and system
CN107562762B (en) Data index construction method and device
CN106649385B (en) Data reordering method and device based on HBase database
CN110109867B (en) Method, apparatus and computer program product for improving online mode detection
CN109388644B (en) Data updating method and device
CN110908995A (en) Data processing method, device and equipment
CN102831181B (en) Directory refreshing method for cache files
CN112765155A (en) Block chain-based key value storage method and device, terminal equipment and medium
CN104239538A (en) Method, system and device for compressing snapshot log
CN108763381B (en) Table dividing method and device based on consistent Hash algorithm
CN112231398A (en) Data storage method, device, equipment and storage medium
CN103995831A (en) Object processing method, system and device based on similarity among objects
CN103714121A (en) Index record management method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130508