
CN105426408A - Multi-index data processing method and apparatus

Info

Publication number: CN105426408A
Application number: CN201510731581.0A
Authority: CN
Inventors: 肖冰
Original assignee: Beijing Ruian Technology Co Ltd
Current assignee: Beijing Ruian Technology Co Ltd
Priority date: 2015-11-02
Filing date: 2015-11-02
Publication date: 2016-03-23
Anticipated expiration: 2035-11-02
Also published as: CN105426408B

Abstract

The present invention discloses a multi-index data processing method and apparatus. The method comprises: extracting each index value of to-be-stored data; according to each association position of each index value in a pointer array, establishing each associated index chained list node; determining an associated storage bucket area of the to-be-stored data, and creating a mapping relationship between the to-be-stored data and each associated storage block contained by the associated storage bucket area in a header of the associated storage bucket area; and according to the created mapping relationship between the to-be-stored data and each associated storage block, storing the to-be-stored data, and allocating a unique identifier of the to-be-stored data to each index chained list node. By adopting the scheme, multiple indexes can be rapidly established for the stored data, rapid management of the multiple indexes and the stored data can be implemented, and by means of controllable space loss, the speed of space allocation, recycle and collation is improved, so that the effect of replacing space with time is achieved.

Description

Multi-index data processing method and apparatus

Technical field

The present invention relates to the field of caching technology, and in particular to a multi-index data processing method and apparatus.

Background art

In-memory caching is commonly used to improve the performance of client data queries and to reduce query response time. It is widely applied in heavily visited websites and in database systems that hold large volumes of data.

Database systems and the like usually have their own internal caching mechanisms, and there is also dedicated memory caching software, each suited to its own application scenarios. Such software, however, targets general-purpose memory caching needs and generally cannot manage memory, solid-state drives, and similar media as a single unified address space. Moreover, because existing caching mechanisms are built to meet general requirements, the systems are relatively complex. In a custom multi-index random-access scenario, it is not easy to implement unified management of cache space allocation and deletion in a simple way. Repeatedly allocating and freeing space produces a large amount of memory fragmentation; defragmentation is generally time-consuming and is often accompanied by pauses in system access, which significantly affects the performance of high-performance real-time online systems.

Summary of the invention

In view of this, embodiments of the present invention provide a multi-index data processing method and apparatus, to solve the technical problem of improper management of multi-index data in the prior art.

In a first aspect, an embodiment of the present invention provides a multi-index data processing method, comprising:

extracting each index value of data to be stored;

establishing each associated index linked-list node according to the associated position of each index value in a pointer array;

determining an associated storage bucket area of the data to be stored, and creating, in a header of the associated storage bucket area, a mapping relationship between the data to be stored and each associated storage block contained in the associated storage bucket area; and

storing the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and allocating a unique identifier of the data to be stored to each index linked-list node.

In a second aspect, an embodiment of the present invention further provides a multi-index data processing apparatus, comprising:

an index value extraction module, configured to extract each index value of data to be stored;

a node establishment module, configured to establish each associated index linked-list node according to the associated position of each index value in a pointer array;

a bucket area determination module, configured to determine an associated storage bucket area of the data to be stored, and to create, in a header of the associated storage bucket area, a mapping relationship between the data to be stored and each associated storage block contained in the associated storage bucket area; and

a data storage module, configured to store the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and to allocate a unique identifier of the data to be stored to each index linked-list node.

Embodiments of the present invention provide a multi-index data processing method and apparatus. Each index value is computed from the data to be stored; the associated position of each index value is located in the pointer array and each associated index linked-list node is established; the associated storage bucket area of the data to be stored is determined, and the mapping relationship between the data to be stored and the storage blocks is created in the header of that storage bucket area; the data to be stored is stored according to the mapping relationship, and its unique identifier is allocated to each index linked-list node. With this scheme, indexes can be established quickly and, at the cost of a controllable loss of space, the speed of space allocation, reclamation, and defragmentation is improved, achieving the effect of trading space for time. Storage space is allocated reasonably according to the size of the data to be stored, and the optimal number of blocks for the data is calculated, which keeps the loss of storage space controllable.

Brief description of the drawings

Other features, objects, and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:

Fig. 1 is a flowchart of a multi-index data processing method provided by embodiment one of the present invention;

Fig. 2 is a diagram of the multi-index establishment areas provided by embodiment one of the present invention;

Fig. 3 is a flowchart of a multi-index data processing method provided by embodiment two of the present invention;

Fig. 4 is a flowchart of a multi-index data processing method provided by embodiment three of the present invention;

Fig. 5 is a flowchart of a multi-index data processing method provided by embodiment four of the present invention;

Fig. 6 is a schematic diagram of storage block movement provided by embodiment four of the present invention;

Fig. 7 is a flowchart of a multi-index data processing method provided by embodiment five of the present invention;

Fig. 8 is a schematic diagram of storing data to be stored, provided by embodiment five of the present invention;

Fig. 9 is a schematic diagram of a multi-index data processing apparatus provided by embodiment six of the present invention.

Detailed description

The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the full content.

Embodiment one

Fig. 1 is a flowchart of a multi-index data processing method provided by embodiment one of the present invention. The method of this embodiment applies to the case in which a terminal quickly establishes multiple indexes for data to be stored. It may be performed by a multi-index data processing apparatus, which may be implemented in software and/or hardware and is integrated in a terminal capable of establishing multiple indexes. As shown in Fig. 1, the method comprises:

S110: Extract each index value of the data to be stored.

For example, each index value is extracted and constructed from the data to be stored according to preconfigured index construction rules. The index construction rules may be set according to actual conditions and are not specifically limited here.

S120: Establish each associated index linked-list node according to the associated position of each index value in the pointer array.

For example, a hash is computed for each index value. A high-performance hash function may be used, for instance the BKDRHash function. The resulting integer is taken modulo the length of the pointer array. The pointer array stores singly linked list head pointers, and its length is greater than or equal to the number of index values the terminal can store. The corresponding slot of the pointer array is located from the result of the modulo operation, and it is determined whether that slot is empty.

If the slot is empty, a new singly linked list head pointer is set up in the pointer array. After the head pointer is created for the index value, a new index linked-list node is established. The index linked-list nodes reached from each head pointer in the pointer array form a singly linked list, and the pointer array together with the singly linked lists forms the index area. An index linked-list node is divided into three parts. The first part stores the index value and is the region that the pointer array points to. The second part is initially empty; once the storage bucket area for the data has been determined, it stores information about the header of that storage bucket area. The third part is set to empty and is used to handle a hash conflict if one occurs.

If the slot is not empty, another index value already occupies it, that is, a hash conflict has occurred. The conflict is then handled by chaining: starting from the occupied slot, the singly linked list is traversed to its end, a new index linked-list node is created after the last index linked-list node, and the third part of the last node is pointed to the first part of the new node, which resolves the conflict.
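The index area described in S120 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `bkdr_hash`, `IndexNode`, and `IndexArea` are hypothetical, the seed 131 and the 31-bit truncation are common BKDRHash conventions rather than values given in the text, and Python object references stand in for pointers.

```python
from dataclasses import dataclass
from typing import List, Optional


def bkdr_hash(key: str, seed: int = 131) -> int:
    """BKDR-style multiply-and-add string hash, truncated to 31 bits."""
    h = 0
    for byte in key.encode("utf-8"):
        h = (h * seed + byte) & 0x7FFFFFFF
    return h


@dataclass
class IndexNode:
    index_value: str                    # part 1: the index value
    data_id: Optional[str] = None       # part 2: empty until the data is stored
    next: Optional["IndexNode"] = None  # part 3: next node after a hash conflict


class IndexArea:
    """Pointer array of list heads plus the singly linked lists behind them."""

    def __init__(self, length: int) -> None:
        self.slots: List[Optional[IndexNode]] = [None] * length

    def add(self, index_value: str) -> IndexNode:
        pos = bkdr_hash(index_value) % len(self.slots)
        node = IndexNode(index_value)
        tail = self.slots[pos]
        if tail is None:
            # Empty slot: the new node becomes the head of the list.
            self.slots[pos] = node
            return node
        # Occupied slot (hash conflict): walk to the end and append.
        while tail.next is not None:
            tail = tail.next
        tail.next = node
        return node
```

Appending at the tail mirrors the traversal described above; inserting at the head would avoid the walk, but that is a different design from the one the text describes.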

S130: Determine the associated storage bucket area of the data to be stored, and create, in the header of the associated storage bucket area, the mapping relationship between the data to be stored and each associated storage block contained in the associated storage bucket area.

For example, the basic unit of the data area that actually stores data is the storage block. Storage blocks of the same capacity logically form a storage bucket area: storage blocks of 4K capacity form the 4K storage bucket area, and storage blocks of 8K capacity form the 8K storage bucket area. Each storage bucket area contains at least one bucket, which is an actual storage bucket. The storage size of each bucket is an integer multiple of the storage block size. Buckets in the same storage bucket area are logically contiguous and may also be physically contiguous. Each storage bucket area contains one unique header, which is divided into two parts. One part holds the mapping between the storage bucket area and its buckets, the mapping between buckets and storage blocks, the bucket addresses, the storage block addresses, and the storage block state markers. The other part holds the mapping between data to be stored and storage blocks; this mapping preferably comprises the unique identifier of the data to be stored and the corresponding storage block addresses. When data is stored into storage blocks, the unique identifier of the data and the corresponding storage block addresses are written into the corresponding header, and the state markers of the corresponding storage blocks are changed to the stored state. The method of computing the unique identifier of the data to be stored may be set according to actual conditions and is not limited here.

Further, the associated storage bucket area of the data to be stored is determined.

Before the associated storage bucket area is determined, a calculation is preferably performed on the data to be stored. The data is first compressed with a configured compression encoder; LZ4 compression may be used. After compression, the size of the data to be stored is divided by a configured optimal block count, the result is rounded up to obtain the closest storage block size, and the storage bucket area corresponding to that storage block size is taken as the associated storage bucket area of the data to be stored. The optimal block count is set according to actual conditions. Preferably, it is first determined whether that storage bucket area exists. If it exists, it is used directly as the associated storage bucket area; if not, the corresponding storage bucket area, including a header, is created in the storage space. For example, if the configured optimal block count is 10 and the size of the data to be stored is 38K, dividing 38K by 10 gives 3.8K, which rounds up to 4K. The 4K storage block is therefore the closest storage block, and the 4K storage bucket area corresponding to it is the associated storage bucket area. It is then determined whether the 4K storage bucket area exists; if it does, it is used as the associated storage bucket area, and if not, a 4K storage bucket area is created in the storage space as the associated storage bucket area.
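The selection rule can be illustrated with a short sketch. It assumes sizes are expressed in K, omits the compression step, and uses the hypothetical names `choose_block_size_k` and `get_or_create_area`; the dictionary layout of a storage bucket area is likewise an assumption made only for illustration.

```python
import math
from typing import Dict


def choose_block_size_k(data_size_k: float, optimal_block_count: int) -> int:
    """Divide the data size by the optimal block count and round up."""
    return math.ceil(data_size_k / optimal_block_count)


def get_or_create_area(areas: Dict[int, dict], block_size_k: int) -> dict:
    """Return the bucket area for this block size, creating it with a header if absent."""
    if block_size_k not in areas:
        areas[block_size_k] = {"header": {}, "buckets": []}
    return areas[block_size_k]
```

With the figures from the example above, 38K of data and an optimal block count of 10 select the 4K storage bucket area.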

Further, the data to be stored is split into pieces of sub-data according to the size of the storage blocks in the associated storage bucket area.

After the associated storage bucket area is determined, the data to be stored is split into pieces of sub-data. Specifically, the size of the data to be stored is divided by the size of the closest storage block and the result is rounded up to obtain the actual number of blocks. The data is then split into that number of blocks. If, after splitting, the last piece of sub-data is smaller than the storage block, at least one padding code is appended to it so that its size equals the size of the storage block. Each padding code is 1 byte and is preferably the 0 code. After splitting, each piece of sub-data has the same size as a single storage block in the associated storage bucket area. For example, the data to be stored is 38K and its associated storage bucket area is the 4K storage bucket area. Dividing 38 by 4 gives 9.5, which rounds up to 10, so the data is split into 10 pieces of sub-data of 4K each. The actual size of the last piece after splitting is 2K, so padding code 0 is used to pad it to 4K.
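A minimal sketch of the split-and-pad step follows. The function name `split_and_pad` is hypothetical, and the sketch assumes the "0 code" is the zero byte.

```python
import math
from typing import List


def split_and_pad(data: bytes, block_size: int, pad: bytes = b"\x00") -> List[bytes]:
    """Split data into block_size pieces and pad the last piece to full size."""
    count = math.ceil(len(data) / block_size)
    pieces = [data[i * block_size:(i + 1) * block_size] for i in range(count)]
    if pieces and len(pieces[-1]) < block_size:
        # Fill the tail with 1-byte padding codes up to the block size.
        pieces[-1] += pad * (block_size - len(pieces[-1]))
    return pieces
```

Because the padding is indistinguishable from trailing zero bytes in the payload, a reader of the stored blocks would need the original length from somewhere else; the text does not say where that length is kept.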

Further, each piece of sub-data is placed, in order, into the empty storage blocks contained in the associated storage bucket area.

After the data to be stored has been split, each piece of sub-data is placed in order into the empty storage blocks of the associated storage bucket area. Preferably, before storing, it is first determined whether the number of empty storage blocks in the associated storage bucket area is greater than the number of pieces of sub-data. If so, the pieces are placed in order into the empty storage blocks of the associated storage bucket area. If not, and the free storage space of the associated storage bucket area is greater than the storage space the data requires, a new bucket is created in the associated storage bucket area and the pieces are placed in order into the empty storage blocks of the associated storage bucket area. If the storage space of the storage bucket area is smaller than the space the data requires, a program exception is reported and the operation is rolled back. For example, suppose the data requires 10 storage blocks of 4K and its storage bucket area is the 4K storage bucket area. It is determined whether that area has 10 or more empty storage blocks; if so, the sub-data is stored in order into the empty storage blocks. If there are fewer than 10, it is determined whether the 4K storage bucket area has enough remaining space to store the data; if so, a new empty bucket containing at least 10 empty storage blocks is created and the sub-data is written in order into the empty storage blocks. If the 4K storage bucket area does not have enough remaining space, a program exception is reported and the operation is rolled back.
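The placement logic might look like the sketch below. The class `BucketArea`, its `capacity_blocks` limit, and the `(bucket, slot)` address form are assumptions made for illustration; the text does not specify how remaining space is measured. The capacity check runs before anything is modified, so raising the exception leaves the area unchanged, which stands in for the rollback.

```python
from typing import List, Optional, Tuple


class StorageError(Exception):
    """Raised when the bucket area cannot hold the data."""


class BucketArea:
    def __init__(self, capacity_blocks: int, blocks_per_bucket: int) -> None:
        self.capacity_blocks = capacity_blocks      # most blocks the area may hold
        self.blocks_per_bucket = blocks_per_bucket  # default size of a new bucket
        self.buckets: List[List[Optional[bytes]]] = []

    def empty_slots(self) -> List[Tuple[int, int]]:
        return [(b, i) for b, bucket in enumerate(self.buckets)
                for i, block in enumerate(bucket) if block is None]

    def allocated_blocks(self) -> int:
        return sum(len(bucket) for bucket in self.buckets)

    def store(self, pieces: List[bytes]) -> List[Tuple[int, int]]:
        free = self.empty_slots()
        if len(free) < len(pieces):
            # Not enough empty blocks: add one bucket that can hold all pieces.
            new_size = max(self.blocks_per_bucket, len(pieces))
            if self.allocated_blocks() + new_size > self.capacity_blocks:
                raise StorageError("not enough space in the bucket area")
            self.buckets.append([None] * new_size)
            free = self.empty_slots()
        placed = free[:len(pieces)]
        for (b, i), piece in zip(placed, pieces):
            self.buckets[b][i] = piece
        return placed
```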

S140: Store the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and allocate the unique identifier of the data to be stored to each index linked-list node.

For example, once the associated storage bucket area of the sub-data has been determined, each piece of sub-data is stored in order into the empty storage blocks of the associated storage bucket area. The mapping between the data to be stored and each associated storage block is stored in the header of the associated storage bucket area; the mapping preferably comprises the unique identifier of the data to be stored and the corresponding storage block addresses. The state marker of each associated storage block is changed to the stored state. The mapping stored in the header may then be fed back to the corresponding index linked-list nodes, or the unique identifier of the data to be stored that is kept in the header may be allocated to each index linked-list node; allocating the unique identifier is preferred. The unique identifier fed back by the header of the associated storage bucket area is stored in the second part of the index linked-list node. All index linked-list nodes corresponding to the same data obtain the same unique identifier, that is, different index values correspond to the same storage location, which saves storage space.

For example, the data to be stored is placed in 5 storage blocks of 4K each. In the header, the storage block state markers "0" and "1" denote the not-stored and stored states respectively. The addresses of the 5 storage blocks are 0010, 0011, 0100, 0101, and 0110, and the computed unique identifier is 01000100010001000. The header then stores the sequence 0100010001000100000100011010001010110, that is, the unique identifier followed by the five storage block addresses, and the state markers of the 5 storage blocks in the header are changed from "0" to "1". The unique identifier 01000100010001000 is allocated to the second part of each index linked-list node. When a user enters an index value and the corresponding index linked-list node is found, the corresponding header is found from 01000100010001000 and the addresses 0010, 0011, 0100, 0101, and 0110 of the corresponding storage blocks are obtained; the corresponding buckets, and then the storage blocks, are found by looking up the mapping between storage blocks and buckets.
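The header bookkeeping in this example can be sketched as follows. The dictionary layout and the names `record_mapping`, `serialize_entry`, and `lookup_blocks` are hypothetical; the text fixes only what the header holds (state markers, identifier, block addresses), not how it is laid out.

```python
from typing import Dict, List


def record_mapping(header: Dict[str, dict], data_id: str, addresses: List[str]) -> None:
    """Write identifier -> block addresses into the header and mark the blocks as stored."""
    header["mappings"][data_id] = list(addresses)
    for address in addresses:
        header["block_state"][address] = "1"


def serialize_entry(data_id: str, addresses: List[str]) -> str:
    """Flatten one entry as the identifier followed by its block addresses."""
    return data_id + "".join(addresses)


def lookup_blocks(header: Dict[str, dict], data_id: str) -> List[str]:
    """Resolve an identifier taken from an index node to its block addresses."""
    return header["mappings"].get(data_id, [])
```

Because every index linked-list node of the same data carries the same identifier, a lookup through any of the index values reaches the same header entry.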

Fig. 2 is a diagram of the multi-index establishment areas provided by embodiment one of the present invention. As shown in Fig. 2, the index establishment region is logically divided into an index area 10 and a data area 20. The index area 10 is a hash table formed by a pointer array 101 and the singly linked lists 102 corresponding to the pointer array 101; each singly linked list 102 is formed by the index linked-list nodes 103 reached from the same slot of the pointer array 101. The length of the pointer array 101 is greater than the estimated number of index values to be stored. An index linked-list node 103 comprises a first part 1031, which stores the index value; a second part 1032, which stores the unique identifier of the indexed data; and a third part 1033, which handles hash conflicts. The pointer array 101 points to the first part 1031 of the first index linked-list node 103 of a singly linked list 102. The second part 1032 of an index linked-list node 103 points to the corresponding header 202. When a hash conflict occurs, the third part 1033 of an index linked-list node 103 points to the first part 1031 of the new index linked-list node 103 in the same singly linked list 102. A hash conflict means that the slot of the pointer array 101 already points to an index linked-list node 103; in that case the singly linked list 102 is traversed and a new index linked-list node 103 is established at its end.

The basic unit of the data area 20 is the storage block 204. According to the size of the storage blocks 204, the data area 20 is logically divided into storage bucket areas 201, and all storage blocks 204 within one storage bucket area 201 have the same size. A storage bucket area 201 is divided into several actual buckets 203. The capacity of each bucket 203 is an integer multiple of the capacity of a storage block 204. Data to be stored is written in order into the storage blocks 204 of the corresponding storage bucket area 201. Each storage bucket area comprises one unique header 202, which is divided into two parts. The first part holds the mapping among the storage bucket area 201, the buckets 203, and the storage blocks 204, the addresses of the buckets 203, the addresses of the storage blocks 204, and the state markers of the storage blocks 204. The second part holds the unique identifier of the data to be stored and the addresses of the storage blocks 204 in which it is stored. After data to be stored has been written into the corresponding empty storage blocks 204, the unique identifier of the new data and the addresses of the storage blocks 204 holding it are written into the second part of the header 202, and the unique identifier is fed back to the index linked-list nodes 103 and stored in their second part 1032. This completes the establishment of the index.

In the multi-index data processing method provided by embodiment one of the present invention, each index value of the data to be stored is obtained, the associated index linked-list nodes are created, a calculation on the data to be stored determines the associated storage bucket area, the data is written in order into the empty storage blocks of the associated storage bucket area, and the unique identifier of the data to be stored is allocated to each index linked-list node. With this method, indexes can be created quickly, and because the data to be stored is split reasonably, waste of storage space is effectively avoided and the utilization of storage space is improved.

Embodiment two

Fig. 3 is a flowchart of a multi-index data processing method provided by embodiment two of the present invention. On the basis of embodiment one, this embodiment adds the steps of querying with the indexes that have been created. As shown in Fig. 3, the method comprises:

S210: When a data query event is detected, obtain the index value to be queried sent by the client.

For example, when a user needs to query stored data by a certain index value, the user enters that index value, and the index value to be queried is obtained through the client.

S220: Determine the index linked-list node pointed to by the associated position of the index value to be queried in the pointer array.

For example, a hash is computed for the index value to be queried and the result is taken modulo the length of the pointer array. The corresponding slot of the pointer array is located from the result, and the slot maps to the entry of the corresponding singly linked list; this determines the index linked-list node pointed to by the associated position of the index value to be queried in the pointer array. The time complexity of this step is O(1). If the index value stored in the index linked-list node that the slot points to does not match the index value to be queried, a conflict has occurred; all index linked-list nodes in the singly linked list are then traversed to find the node that matches the index value to be queried. The probability of a conflict in a singly linked list is very small, and the time needed to resolve it is a very small constant.

S230: Determine the header of the storage bucket area corresponding to the index linked-list node pointed to.

For example, the header of the storage bucket area corresponding to the stored data to be queried is determined from the identifier held in the second part of the index linked-list node.

S240: According to the header of the corresponding storage bucket area, look up the storage blocks corresponding to the index value to be queried, and merge the data in those storage blocks as the access data corresponding to the index value to be queried.

For example, the addresses of the storage blocks are determined from the unique identifier of the data in the header of the corresponding storage bucket area. All storage blocks holding the data are found through the mapping between storage blocks and buckets. The data in the storage blocks is read out, merged back into the originally compressed stored data, and returned to the client as the access data corresponding to the index value to be queried.

In the multi-index data processing method provided by embodiment two of the present invention, the index linked-list node is determined from the index value entered by the user; the header of the corresponding storage bucket area is found through the unique identifier of the data held in the index linked-list node; the corresponding storage blocks are determined from the header of the storage bucket area; and the stored data in those storage blocks is merged to obtain the access data being queried. With this query method, the stored data corresponding to an index value can be obtained quickly and accurately, improving the user experience.

Embodiment three

Fig. 4 is a flowchart of a multi-index data processing method provided by embodiment three of the present invention. On the basis of embodiment one, this embodiment adds the steps of deleting indexes that have been established. As shown in Fig. 4, the method comprises:

S310: When a data deletion event is detected, obtain the data to be deleted sent by the client.

For example, the data to be deleted sent by the client is obtained. Alternatively, an index value to be deleted sent by the client may be obtained; the index linked-list node pointed to by the associated position of that index value in the pointer array is determined; and, according to the header of the storage bucket area corresponding to that node, the data of the corresponding storage blocks is obtained and merged to serve as the data to be deleted.

S320: Extract each index value to be deleted from the data to be deleted according to preset index rules.

For example, each index value is extracted and constructed from the data to be deleted according to preconfigured index construction rules. The index construction rules may be set according to actual conditions and are not specifically limited here.

S330: Determine each index linked-list node pointed to by the associated position of each index value to be deleted in the pointer array.

For example, a calculation is performed on each index value to be deleted to determine each index linked-list node pointed to by its associated position in the pointer array. Specifically, a hash is computed for each index value to be deleted, the resulting integer is taken modulo the length of the pointer array, the corresponding slot of the pointer array is located from the result, and the index linked-list node that the slot points to is determined. If the index value of that node does not match the index value to be deleted, the singly linked list for that slot is traversed to find the matching index linked-list node.

S340: Determine the header of the storage bucket area corresponding to each index linked-list node pointed to.

For example, because all of these index linked-list nodes point to the header of the same storage bucket area, the header of the corresponding storage bucket area can be determined from the second part of any index linked-list node of the data to be deleted.

S350: Delete from the header the mapping relationship between the data and its associated storage blocks, and delete each of the index linked-list nodes.

For example, after the header of the corresponding storage bucket area is found from the index linked-list nodes, the unique identifier of the data to be deleted and its stored storage block addresses are deleted from the header, and the stored-state markers of the corresponding storage blocks are deleted or, alternatively, changed to the not-stored state. All index linked-list nodes of the data to be deleted are also deleted. The data in the storage blocks need not be deleted: the next time other data is stored into those storage blocks, it overwrites the old data. Alternatively, the corresponding stored data in the storage blocks may be deleted.
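Steps S330 to S350 can be sketched as below. For brevity each chain is modeled as a Python list of dictionaries rather than a linked list, `slot_of` is a stand-in hash (the text suggests BKDRHash), and all names and the header layout are hypothetical.

```python
from typing import Dict, List


def slot_of(index_value: str, length: int) -> int:
    # Stand-in hash for illustration only: sum of bytes modulo the array length.
    return sum(index_value.encode("utf-8")) % length


def delete_entry(slots: List[List[dict]], header: Dict[str, dict],
                 index_values: List[str]) -> None:
    """Remove every index node of one data item and release its header entry."""
    data_id = None
    for value in index_values:
        chain = slots[slot_of(value, len(slots))]
        for node in chain:
            if node["index_value"] == value:
                data_id = node["data_id"]  # every node carries the same identifier
                chain.remove(node)
                break
    if data_id is None:
        return
    # Drop the mapping and flip the block markers back to not-stored ("0").
    for address in header["mappings"].pop(data_id, []):
        header["block_state"][address] = "0"
```

The block contents are left in place, matching the option described above in which old data is simply overwritten by the next write.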

Further, the last access time of the stored sub-data in the storage blocks marked in the header of the storage bucket area is obtained. If the interval between the last access time and the current time is greater than a preset interval, the sub-data in the corresponding storage blocks is merged into the stored data, all index values of that stored data are computed, each corresponding index linked-list node is determined, and all of those index linked-list nodes are deleted. The unique identifier of the stored data and the corresponding storage block addresses are deleted from the header, and the stored-state markers of the corresponding storage blocks are deleted or, alternatively, changed to the not-stored state. The preset time interval may be set according to actual conditions.

In the multi-index data processing method provided by embodiment three of the present invention, the data to be deleted is obtained, each corresponding index value to be deleted is extracted, each corresponding index linked-list node is found, the header of the storage bucket area corresponding to those nodes is determined, the mapping relationship of the stored data is deleted from the header, each of the index linked-list nodes is deleted, and the state markers of the storage blocks are changed to the not-stored state. With this method, index linked-list nodes can be deleted quickly, improving the efficiency of index deletion.

Embodiment four

Fig. 5 is a flowchart of a multi-index data processing method provided by Embodiment four of the present invention. On the basis of Embodiment one, this embodiment adds the step of locking a bucket whose utilization rate is lower than a preset utilization threshold. As shown in Fig. 5, the method comprises:

S410, when the utilization rate of any bucket is detected to be lower than the preset utilization threshold, mark the bucket as locked.

For example, the utilization rate of the buckets in the storage area is checked. When the utilization rate of any bucket is detected to be lower than the preset utilization threshold, the bucket is marked as locked; a locked bucket cannot be used to store, query, or delete data. The check can run automatically at a set time interval or be triggered manually. Fig. 6 is a schematic diagram of storage block movement provided by Embodiment four of the present invention. As shown in Fig. 6, bucket 501, whose utilization rate is lower than the preset utilization threshold, is marked as locked. The preset utilization threshold can be set according to actual conditions.

S420, transfer the storage blocks contained in this bucket to buckets other than this bucket.

For example, the other buckets in the bucket region to which this bucket belongs are determined, and the storage blocks contained in this bucket are transferred to those buckets, preferably to buckets that are logically earlier in the bucket region. As shown in Fig. 6, storage blocks 5011, 5012, and 5013 in bucket 501 are moved in turn into bucket 502. If the number of empty storage blocks in bucket 502 is less than the number of storage blocks that need to be moved out of bucket 501, the remaining storage blocks can be moved into other buckets after bucket 502 has been filled. The storage block addresses recorded under the corresponding unique identifier of the stored data in the bucket region header are then updated, the status flags of the storage blocks are changed, the locked state is released, and the storage space of this bucket is freed.
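Steps S410 and S420 can be sketched as follows, assuming each bucket is modeled as a list of slots (`None` for an empty storage block) and the header maps a unique identifier to a list of `(bucket, slot)` addresses. The names `utilization` and `migrate_bucket` are hypothetical, and locking is left to the caller.

```python
def utilization(bucket):
    """Fraction of slots in the bucket that currently hold a block."""
    return sum(slot is not None for slot in bucket) / len(bucket)

def migrate_bucket(buckets, header, src):
    """Move every used block of buckets[src] into free slots of other buckets."""
    for slot, payload in enumerate(buckets[src]):
        if payload is None:
            continue
        # Prefer the logically earliest bucket that still has a free slot.
        target = next(((b, s)
                       for b, other in enumerate(buckets) if b != src
                       for s, value in enumerate(other) if value is None), None)
        if target is None:
            raise RuntimeError("no free block outside the locked bucket")
        b, s = target
        buckets[b][s] = payload
        buckets[src][slot] = None
        # Rewrite the block address recorded in the header.
        for addrs in header.values():
            for i, addr in enumerate(addrs):
                if addr == (src, slot):
                    addrs[i] = (b, s)
```

A caller would compare `utilization(bucket)` with the preset threshold, mark the bucket as locked, call `migrate_bucket`, and then unlock and release the bucket. The sketch does not roll back blocks already moved if it runs out of free slots midway.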

In the multi-index data processing method provided by Embodiment four of the present invention, when the utilization rate of a bucket is detected to be lower than the preset utilization threshold, the bucket is marked as locked and the storage blocks in it are moved into other buckets. This method can defragment storage periodically, use storage space sensibly, and improve storage space utilization.

Embodiment five

Fig. 7 is a flowchart of a multi-index data processing method provided by Embodiment five of the present invention. This embodiment is a preferred example of creating indexes. As shown in Fig. 7, the method specifically comprises:

S610, start.

For example, the flow starts when the data to be stored (value) is obtained.

S620, extract each index value (key) from the data to be stored according to the index rule.

S630, hash each index value, take the result modulo the length of the pointer array (P[n]), and map it to the corresponding position.

For example, the hash function is BKDRHash. The length of P[n] is greater than the estimated number of keys to be stored.
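A minimal sketch of S630 using the conventional form of BKDRHash. The seed 131 and the 32-bit masking are common choices for this hash and are assumptions here, since the text does not state them; `bkdr_hash` and `slot_for` are hypothetical names.

```python
def bkdr_hash(key: bytes, seed: int = 131) -> int:
    """BKDR string hash: h = h * seed + byte, kept within 32 bits."""
    h = 0
    for byte in key:
        h = (h * seed + byte) & 0xFFFFFFFF
    return h & 0x7FFFFFFF

def slot_for(key: bytes, table_len: int) -> int:
    """Map a key to a position in the pointer array P[n] by taking the modulus."""
    return bkdr_hash(key) % table_len
```

Choosing `table_len` larger than the estimated number of keys, as the text suggests, keeps the chains at each position short.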

S640, judge whether the corresponding position already holds a value.

For example, judge whether the position in the pointer array already holds an index value; if so, perform S660; if not, perform S650.

S650, create a new singly linked list head pointer, and jump to S670.

S660, handle the conflict according to the open addressing method, and continue with S670.

For example, traverse all index linked-list nodes of the corresponding singly linked list, create a new index linked-list node at the end, and store the key in it.

S670, create the corresponding index linked-list node, and jump to S6180.

For example, after the singly linked list head pointer is created, the corresponding index linked-list node is created. The index linked-list node has three parts. The first part stores the key; that is, the singly linked list head pointer points to the first part of the index linked-list node. The second part stores the unique identifier of the data to be stored that the header records in S6160. The third part is empty for the time being and is used to point to the next index linked-list node when a conflict occurs.
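The three-part node and the S640 to S670 branch can be sketched as follows. The sketch follows the traversal-and-append behavior described for S660; the names `IndexNode` and `insert_key` and the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IndexNode:
    key: str                            # part 1: the index value (key)
    uid: Optional[int] = None           # part 2: unique id, filled in at S6180
    next: Optional["IndexNode"] = None  # part 3: next node, used on a conflict

def insert_key(table, slot, key):
    """Append a node for key to the chain at table[slot] and return it."""
    node = IndexNode(key)
    if table[slot] is None:             # S650: empty position, start a new list
        table[slot] = node
        return node
    tail = table[slot]                  # S660: walk to the end of the chain
    while tail.next is not None:
        tail = tail.next
    tail.next = node
    return node
```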

S680, compress the data to be stored, and continue with S690.

For example, the default compression method is LZ4.

S690, according to the preset optimal number of splits, calculate the appropriate size and number of the sub-data items obtained by splitting the data to be stored, and continue with S6100.

S6100, judge whether a bucket region corresponding to the sub-data exists.

For example, the size of the storage blocks in the bucket region is the same as the size of the sub-data. If the bucket region exists, perform S6110; if not, perform S6120.

S6110, judge whether the buckets of this bucket region have enough vacant storage block positions.

For example, if there are enough block positions, perform S6150; if not, perform S6130.

S6120, create a bucket region corresponding to the size of the sub-data, and continue with S6130.

For example, the size of the storage blocks in the created bucket region is the same as the size of the sub-data.

S6130, judge whether the bucket region has enough storage space.

For example, if there is enough space, perform S6140; if not, perform S6170.

S6140, create an empty bucket in the bucket region, and continue with S6150.

For example, the number of storage blocks contained in the bucket is greater than the number of sub-data items.

S6150, insert the data to be stored in order, starting from the first empty storage block, and continue with S6160.

For example, Fig. 8 is a schematic diagram of storing the data to be stored, provided by Embodiment five of the present invention. As shown in Fig. 8, data to be stored 701 is split into sub-data 7011, 7012, 7013, 7014, and 7015. If the data to be stored cannot fill 7015, the remainder of 7015 is padded with zeros; the padded part is 7016. The sub-data items are assigned in order to the empty storage blocks of bucket 702, which yields bucket 703 with the stored data written.
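The split-and-pad step illustrated by Fig. 8 can be sketched as follows, assuming the data is a byte string and the block size is already known. The function name `split_and_pad` is hypothetical.

```python
from typing import List

def split_and_pad(data: bytes, block_size: int) -> List[bytes]:
    """Split data into block_size pieces and zero-fill the last piece."""
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if chunks and len(chunks[-1]) < block_size:
        # The filler corresponds to part 7016 in Fig. 8.
        chunks[-1] = chunks[-1].ljust(block_size, b"\x00")
    return chunks
```

Because the padding is indistinguishable from trailing zero bytes in the data, a real implementation would presumably also record the original length; the text does not say how this is handled.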

S6160, store the unique identifier of the data to be stored and the corresponding storage block addresses in the header, change the status flags of the storage blocks to stored, and jump to S6180.

S6170, abnormal termination and rollback.

S6180, assign the unique identifier of the data to be stored, as recorded in the header, to the index linked-list nodes, and continue with S6190.

For example, S610-S670 and S680-S6170 are performed at the same time. After both lines of processing have completed, the unique identifier of the data to be stored in the header is assigned to the second part of each index linked-list node.
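One way to picture the two parallel lines of processing is a pair of tasks joined before the final assignment. In this sketch `build_index_nodes` and `write_blocks` are hypothetical callables standing in for S610-S670 and S680-S6170, and a thread pool is just one possible way to run them side by side.

```python
from concurrent.futures import ThreadPoolExecutor

def store_with_indexes(value, build_index_nodes, write_blocks):
    """Run index creation and block storage concurrently, then link them."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        nodes_future = pool.submit(build_index_nodes, value)  # S610-S670
        uid_future = pool.submit(write_blocks, value)         # S680-S6170
        nodes = nodes_future.result()
        uid = uid_future.result()   # re-raises if storage failed (S6170)
    for node in nodes:              # S6180: fill in part 2 of every node
        node["uid"] = uid
    return nodes
```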

S6190, end of assignment.

In the multi-index data processing method provided by Embodiment five of the present invention, the data to be stored is processed and written into the bucket region while the index linked-list nodes are being created, which speeds up index creation and improves data storage efficiency.

Embodiment six

Fig. 9 is a schematic diagram of a multi-index data processing apparatus provided by Embodiment six of the present invention. As shown in Fig. 9, the apparatus comprises: an index value extraction module 801, a node creation module 802, a bucket region determination module 803, and a data storage module 804.

Specifically, the index value extraction module 801 is configured to extract each index value of the data to be stored; the node creation module 802 is configured to create each associated index linked-list node according to the corresponding position of each index value in the pointer array; the bucket region determination module 803 is configured to determine the associated bucket region of the data to be stored and to create, in the header of the associated bucket region, the mapping relationship between the data to be stored and each associated storage block contained in the associated bucket region; the data storage module 804 is configured to store the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and to distribute the unique identifier of the data to be stored to each index linked-list node.

Further, the bucket region determination module 803 also comprises: a bucket determining unit, a splitting unit, and a storing unit.

Specifically, the bucket determining unit is configured to determine the associated bucket region of the data to be stored; the splitting unit is configured to split the data to be stored into sub-data items according to the size of the storage blocks in the associated bucket region; the storing unit is configured to place the sub-data items in order into the empty storage blocks contained in the associated bucket region.

Preferably, the storing unit also comprises: an empty-block judging subunit and an allocation subunit.

Specifically, the empty-block judging subunit is configured to determine whether the number of empty storage blocks contained in the associated bucket region is greater than the number of sub-data items; the allocation subunit is configured to, if so, assign the sub-data items in order to the empty storage blocks of the associated bucket region; and if not, when the free storage space of the associated bucket region is greater than the storage space required by the data to be stored, create a new bucket in the associated bucket region and assign the sub-data items in order to the empty storage blocks of the associated bucket region.
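The decision made by the two subunits can be sketched as follows, assuming a bucket region is a list of buckets, each a list of slots with `None` marking an empty storage block. The function `allocate` and its parameters `bucket_blocks` and `region_capacity` are hypothetical; the sketch also treats an exact fit as sufficient, whereas the text says "greater than".

```python
def allocate(region, chunks, bucket_blocks, region_capacity):
    """Place chunks in order into empty blocks, adding one bucket if needed."""
    free = [(b, s) for b, bucket in enumerate(region)
            for s, value in enumerate(bucket) if value is None]
    if len(free) < len(chunks):
        total_blocks = sum(len(bucket) for bucket in region)
        fits_region = total_blocks + bucket_blocks <= region_capacity
        fits_chunks = len(free) + bucket_blocks >= len(chunks)
        if not (fits_region and fits_chunks):
            raise MemoryError("bucket region has no room for the data")
        region.append([None] * bucket_blocks)      # create a new empty bucket
        free += [(len(region) - 1, s) for s in range(bucket_blocks)]
    addrs = free[:len(chunks)]
    for (b, s), chunk in zip(addrs, chunks):
        region[b][s] = chunk
    return addrs                                   # addresses to record in the header
```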

On the basis of the above embodiment, the apparatus also comprises: a query index value acquisition module, a query node determination module, a header query module, and a data access module.

Specifically, the query index value acquisition module is configured to obtain the index value to be queried sent by the client when a data query event is detected; the query node determination module is configured to determine the index linked-list node pointed to by the corresponding position of the index value to be queried in the pointer array; the header query module is configured to determine the header of the bucket region corresponding to the index linked-list node pointed to; the data access module is configured to look up, according to the header of the corresponding bucket region, the storage blocks corresponding to the index value to be queried, and to merge the data in those storage blocks as the accessed data corresponding to the index value to be queried.
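The query path through these four modules can be sketched as follows, assuming index nodes are dictionaries with `key`, `uid`, and `next` fields, the header maps a unique identifier to an ordered list of block addresses, and the position in the pointer array has already been computed. All names are hypothetical.

```python
def query(table, header, blocks, key, slot):
    """Follow the chain at table[slot], then merge the blocks listed in the header."""
    node = table[slot]
    while node is not None and node["key"] != key:
        node = node["next"]          # step over nodes that collided on this slot
    if node is None:
        return None                  # key is not indexed
    return b"".join(blocks[addr] for addr in header[node["uid"]])
```

The merged result still contains any zero padding and compression applied at storage time; undoing those is outside this sketch.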

Further, the apparatus also comprises: a to-be-deleted data acquisition module, a to-be-deleted index value acquisition module, a to-be-deleted node determination module, a header determination module, and a deletion module.

Specifically, the to-be-deleted data acquisition module is configured to obtain the data to be deleted sent by the client when a data deletion event is detected; the to-be-deleted index value acquisition module is configured to extract each index value to be deleted from the data to be deleted according to the preset index rule; the to-be-deleted node determination module is configured to determine each index linked-list node pointed to by the corresponding position of each index value to be deleted in the pointer array; the header determination module is configured to determine the header of the bucket region corresponding to each index linked-list node pointed to; the deletion module is configured to delete, from the header, the mapping relationship between the data to be deleted and the associated storage blocks, and to delete each of the index linked-list nodes.

Preferably, the to-be-deleted data acquisition module also comprises: an index value acquiring unit, a node determining unit, and a data acquiring unit.

Specifically, the index value acquiring unit is configured to obtain the index value to be deleted sent by the client when a data deletion event is detected; the node determining unit is configured to determine the index linked-list node pointed to by the corresponding position of the index value to be deleted in the pointer array; the data acquiring unit is configured to obtain and merge the data of the corresponding storage blocks, according to the header of the bucket region corresponding to the index linked-list node pointed to, as the data to be deleted.

Further, the apparatus also comprises: a locking module and a moving module.

Specifically, the locking module is configured to mark a bucket as locked when the utilization rate of that bucket is detected to be lower than the preset utilization threshold; the moving module is configured to transfer the storage blocks contained in that bucket to buckets other than that bucket.

The multi-index data processing apparatus provided by Embodiment six of the present invention obtains the index values of the data to be stored, creates the associated index linked-list nodes, determines the associated bucket region by calculation on the data to be stored, writes the data in order into the empty storage blocks of the associated bucket region, and distributes the unique identifier of the data to be stored to each index linked-list node. With this apparatus, indexes can be created quickly, and because the data to be stored is split sensibly, waste of storage space is effectively avoided and storage space utilization is improved.

The multi-index data processing apparatus provided by the embodiments of the present invention is used to perform the multi-index data processing method provided by the embodiments of the present invention, and has the corresponding functions and beneficial effects.

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include further equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the appended claims.

Claims

1. A multi-index data processing method, characterized by comprising:

extracting each index value of data to be stored;

2. The method according to claim 1, characterized in that determining the associated bucket region of the data to be stored, and creating, in the header of the associated bucket region, the mapping relationship between the data to be stored and each associated storage block contained in the associated bucket region, comprises:

determining the associated bucket region of the data to be stored;

splitting the data to be stored into sub-data items according to the size of the storage blocks in the associated bucket region;

placing the sub-data items in order into the empty storage blocks contained in the associated bucket region.

3. The method according to claim 2, characterized in that placing the sub-data items in order into the empty storage blocks contained in the associated bucket region comprises:

determining whether the number of empty storage blocks contained in the associated bucket region is greater than the number of sub-data items;

if so, placing the sub-data items in order into the empty storage blocks of the associated bucket region; if not, when the free storage space of the associated bucket region is greater than the storage space required by the data to be stored, creating a new bucket in the associated bucket region and placing the sub-data items in order into the empty storage blocks of the associated bucket region.

4. The method according to claim 1, characterized by further comprising:

when a data query event is detected, obtaining the index value to be queried sent by the client;

determining the index linked-list node pointed to by the corresponding position of the index value to be queried in the pointer array;

determining the header of the bucket region corresponding to the index linked-list node pointed to;

according to the header of the corresponding bucket region, looking up the storage blocks corresponding to the index value to be queried, and merging the data in those storage blocks as the accessed data corresponding to the index value to be queried.

5. The method according to claim 1, characterized by further comprising:

when a data deletion event is detected, obtaining the data to be deleted sent by the client;

extracting each index value to be deleted from the data to be deleted according to the preset index rule;

determining each index linked-list node pointed to by the corresponding position of each index value to be deleted in the pointer array;

determining the header of the bucket region corresponding to each index linked-list node pointed to;

deleting, from the header, the mapping relationship between the data to be deleted and the associated storage blocks, and deleting each of the index linked-list nodes.

6. The method according to claim 5, characterized in that, when a data deletion event is detected, obtaining the data to be deleted sent by the client comprises:

when a data deletion event is detected, obtaining the index value to be deleted sent by the client;

determining the index linked-list node pointed to by the corresponding position of the index value to be deleted in the pointer array;

according to the header of the bucket region corresponding to the index linked-list node pointed to, obtaining and merging the data of the corresponding storage blocks as the data to be deleted.

7. The method according to claim 1, characterized by further comprising:

when the utilization rate of any bucket is detected to be lower than the preset utilization threshold, marking the bucket as locked;

transferring the storage blocks contained in the bucket to buckets other than that bucket.

8. A multi-index data processing apparatus, characterized by comprising:

9. The apparatus according to claim 8, characterized by further comprising:

a query index value acquisition module, configured to obtain the index value to be queried sent by the client when a data query event is detected;

a query node determination module, configured to determine the index linked-list node pointed to by the corresponding position of the index value to be queried in the pointer array;

a header query module, configured to determine the header of the bucket region corresponding to the index linked-list node pointed to;

a data access module, configured to look up, according to the header of the corresponding bucket region, the storage blocks corresponding to the index value to be queried, and to merge the data in those storage blocks as the accessed data corresponding to the index value to be queried.

10. The apparatus according to claim 8, characterized by further comprising:

a to-be-deleted data acquisition module, configured to obtain the data to be deleted sent by the client when a data deletion event is detected;

a to-be-deleted index value acquisition module, configured to extract each index value to be deleted from the data to be deleted according to the preset index rule;

a to-be-deleted node determination module, configured to determine each index linked-list node pointed to by the corresponding position of each index value to be deleted in the pointer array;

a header determination module, configured to determine the header of the bucket region corresponding to each index linked-list node pointed to;

a deletion module, configured to delete, from the header, the mapping relationship between the data to be deleted and the associated storage blocks, and to delete each of the index linked-list nodes.