CN105426408B

CN105426408B - Multi-index data processing method and device

Info

Publication number: CN105426408B
Application number: CN201510731581.0A
Authority: CN
Inventors: 肖冰
Original assignee: Beijing Ruian Technology Co Ltd
Current assignee: Beijing Ruian Technology Co Ltd
Priority date: 2015-11-02
Filing date: 2015-11-02
Publication date: 2019-03-08
Anticipated expiration: 2035-11-02
Also published as: CN105426408A

Abstract

The invention discloses a multi-index data processing method and device. The method comprises: extracting each index value of data to be stored; establishing each associated index linked-list node according to the relative position of each index value in a pointer array; determining the associated bucket region of the data to be stored, and creating, in the header of the associated bucket region, the mapping relationship between the data to be stored and each associated storage block contained in the associated bucket region; and storing the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and distributing the unique identifier of the data to be stored to each index linked-list node. With this scheme, multiple indexes can be established quickly for stored data, and the indexes and the stored data can be managed quickly; by accepting a controllable loss of space, the speed of space allocation, reclamation and defragmentation is improved, achieving the effect of trading space for time.

Description

Multi-index data processing method and device

Technical field

The present invention relates to the field of caching technology, and in particular to a multi-index data processing method and device.

Background art

In-memory caching is commonly used to improve the performance of client data queries and to reduce query response time. It is widely applied in websites with heavy traffic and in database systems with large data volumes.

Database systems and the like usually have their own internal caching mechanisms, and there is also dedicated memory caching software, each with its own suitable application scenarios. However, such software targets general-purpose memory caching needs and generally cannot manage memory, solid-state drives and the like as a unified address space. Moreover, because existing caching mechanisms are designed to meet general requirements, the systems are relatively complex. In customized multi-index random access scenarios, it is not easy to implement unified management of cache space allocation and deletion in a simple way. After space has been allocated and freed many times, many memory fragments are produced; defragmentation is generally time-consuming and is often accompanied by a pause in system access, which has a significant impact on the performance of high-performance real-time online systems.

Summary of the invention

In view of this, embodiments of the present invention provide a multi-index data processing method and device, to solve the technical problem of improper management of multi-index data in the prior art.

In a first aspect, an embodiment of the present invention provides a multi-index data processing method, comprising:

extracting each index value of data to be stored;

establishing each associated index linked-list node according to the relative position of each index value in a pointer array;

determining the associated bucket region of the data to be stored, and creating, in the header of the associated bucket region, a mapping relationship between the data to be stored and each associated storage block contained in the associated bucket region;

storing the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and distributing the unique identifier of the data to be stored to each index linked-list node.

In a second aspect, an embodiment of the present invention further provides a multi-index data processing device, comprising:

an index value extraction module, configured to extract each index value of data to be stored;

a node establishment module, configured to establish each associated index linked-list node according to the relative position of each index value in a pointer array;

a bucket region determination module, configured to determine the associated bucket region of the data to be stored, and to create, in the header of the associated bucket region, a mapping relationship between the data to be stored and each associated storage block contained in the associated bucket region;

a data storage module, configured to store the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and to distribute the unique identifier of the data to be stored to each index linked-list node.

The embodiments of the present invention provide a multi-index data processing method and device. Each index value is calculated from the data to be stored; the relative position of each index value is found in the pointer array and each associated index linked-list node is established; the associated bucket region of the data to be stored is determined, and the mapping relationship between the data to be stored and the storage blocks is created in the header of that bucket region; the data to be stored is stored according to the mapping relationship, and its unique identifier is distributed to each index linked-list node. With this scheme, indexes can be established quickly, and by accepting a controllable loss of space the speed of space allocation, reclamation and defragmentation is improved, achieving the effect of trading space for time. The storage region is allocated according to the size of the data to be stored, and the optimal number of blocks for the data is calculated, which keeps the loss of storage space controllable.

Brief description of the drawings

Other features, objects and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:

Fig. 1 is a flow chart of a multi-index data processing method provided by Embodiment one of the present invention;

Fig. 2 is a diagram of the multi-index establishment regions provided by Embodiment one of the present invention;

Fig. 3 is a flow chart of a multi-index data processing method provided by Embodiment two of the present invention;

Fig. 4 is a flow chart of a multi-index data processing method provided by Embodiment three of the present invention;

Fig. 5 is a flow chart of a multi-index data processing method provided by Embodiment four of the present invention;

Fig. 6 is a schematic diagram of storage block movement provided by Embodiment four of the present invention;

Fig. 7 is a flow chart of a multi-index data processing method provided by Embodiment five of the present invention;

Fig. 8 is a schematic diagram of the storage of data to be stored provided by Embodiment five of the present invention;

Fig. 9 is a schematic diagram of a multi-index data processing device provided by Embodiment six of the present invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it. It should also be noted that, for ease of description, the drawings show only some, rather than all, of the content related to the present invention.

Embodiment one

Fig. 1 is a flow chart of a multi-index data processing method provided by Embodiment one of the present invention. The method of this embodiment applies to the case where a terminal quickly establishes multiple indexes for data to be stored. The method may be executed by a multi-index data processing device, which may be implemented in software and/or hardware and integrated in a terminal capable of establishing multiple indexes. As shown in Fig. 1, the method comprises:

S110: extract each index value of the data to be stored.

Illustratively, each index value is extracted and constructed from the data to be stored according to a preconfigured index construction rule. The index construction rule may be set according to actual conditions and is not specifically limited here.

S120: establish each associated index linked-list node according to the relative position of each index value in the pointer array.

Illustratively, a hash calculation is performed on each index value. A high-performance hash function may be used, for example the BKDRHash function. The integer obtained from the calculation is taken modulo the length of the pointer array, where the pointer array stores the head pointers of singly linked lists and its length is greater than or equal to the number of index values the terminal can store. The corresponding position in the pointer array is found from the value obtained by the modulo calculation, and it is determined whether that position is empty. If it is empty, a new singly-linked-list head pointer is established in the pointer array. After the head pointer has been created according to the index value, a new index linked-list node is established. The index linked-list nodes reached from each head pointer in the pointer array form a singly linked list, and the pointer array and the singly linked lists together form the index area. An index linked-list node is divided into three parts. The first part stores the index value and is the region to which the pointer array points. The second part is temporarily empty; after the bucket region of the data to be stored has been determined, it stores the information of the header of that bucket region. The third part is initially empty and is used to handle a hash conflict if one occurs. If the position indicated in the pointer array is not empty, another index value already occupies it, that is, a hash conflict has occurred. In this case the conflict is handled using the open-address method: starting from the occupied pointer-array position, the singly linked list is traversed to its end, a new index linked-list node is created after the end node, and the third part of the end node is made to point to the first part of the new node, which resolves the conflict.
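The lookup-and-append procedure of S120 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `IndexNode`, `IndexArea` and `data_id`, the hash seed 131 and the 32-bit truncation are all chosen for the example. Appending a colliding node to the end of the list that hangs off the occupied position, as the text describes, is what is commonly called chaining.

```python
from dataclasses import dataclass
from typing import List, Optional


def bkdr_hash(key: str, seed: int = 131) -> int:
    """BKDR-style multiply-and-add string hash, truncated to 32 bits (assumed width)."""
    h = 0
    for byte in key.encode("utf-8"):
        h = (h * seed + byte) & 0xFFFFFFFF
    return h


@dataclass
class IndexNode:
    index_value: str                     # part 1: the index value
    data_id: Optional[str] = None        # part 2: empty until the header is known
    next: Optional["IndexNode"] = None   # part 3: used only after a hash conflict


class IndexArea:
    def __init__(self, length: int) -> None:
        # Each position holds the head pointer of a singly linked list, or None.
        self.slots: List[Optional[IndexNode]] = [None] * length

    def slot_of(self, index_value: str) -> int:
        return bkdr_hash(index_value) % len(self.slots)

    def add(self, index_value: str) -> IndexNode:
        node = IndexNode(index_value)
        pos = self.slot_of(index_value)
        head = self.slots[pos]
        if head is None:
            # Empty position: the new node becomes the head of a new list.
            self.slots[pos] = node
            return node
        # Occupied position: walk to the end of the list and append there.
        tail = head
        while tail.next is not None:
            tail = tail.next
        tail.next = node
        return node
```

With a pointer array of length 1 every index value lands on the same position, which makes the conflict path easy to observe; a realistic array would be at least as long as the number of index values expected, as the text requires.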

S130: determine the associated bucket region of the data to be stored, and create, in the header of the associated bucket region, the mapping relationship between the data to be stored and each associated storage block contained in the associated bucket region.

Illustratively, the basic unit of the data area that actually stores data is the storage block. Storage blocks of the same capacity logically form a bucket region; for example, storage blocks of 4K capacity form the 4K bucket region and storage blocks of 8K capacity form the 8K bucket region. Each bucket region contains at least one bucket, and a bucket is the actual storage container. The storage size of each bucket is an integer multiple of the storage block size. The buckets in the same bucket region are logically contiguous and may also be physically contiguous. Each bucket region contains exactly one header, which is divided into two parts. One part stores the mapping relationship between the bucket region and its buckets, the mapping relationship between buckets and storage blocks, the addresses of the buckets, the addresses of the storage blocks and the status flags of the storage blocks. The other part stores the mapping relationship between data to be stored and storage blocks; this mapping relationship preferably includes the unique identifier of the data to be stored and the corresponding storage block addresses. When data to be stored is written to storage blocks, its unique identifier and the corresponding storage block addresses are written into the header, and the status flags of the corresponding storage blocks are changed to the stored flag. The method of calculating the unique identifier of the data to be stored may be set according to actual conditions and is not limited here.

Further, the associated bucket region of the data to be stored is determined.

Before the associated bucket region is determined, a calculation is preferably performed on the data to be stored. The data to be stored is first compressed with a configured compression encoder; the LZ4 compression method may be used. After compression, the size of the data to be stored is divided by the configured optimal block count, and the result is rounded up to the nearest storage block size; the bucket region corresponding to that storage block size is taken as the associated bucket region of the data to be stored. The optimal block count is set according to actual conditions. Preferably, it is first determined whether this bucket region exists. If it exists, it is used directly as the associated bucket region; if it does not exist, the corresponding bucket region, containing one header, is established in the storage space. For example, if the configured optimal block count is 10 and the size of the data to be stored is 38K, dividing 38K by the optimal block count gives 3.8K, which rounds up to 4K. The 4K storage block is therefore the nearest storage block, and the 4K bucket region corresponding to it is determined as the associated bucket region. It is then determined whether a 4K bucket region exists: if so, it is used as the associated bucket region; if not, a 4K bucket region is established in the storage space and used as the associated bucket region.
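The selection rule above can be sketched as follows. The set of available storage block sizes (4K, 8K, 16K here) and the dictionary layout of a bucket region are assumptions for illustration; the source only states that the per-block size is rounded up to the nearest storage block and that a missing region is created together with its header.

```python
from typing import Dict, Sequence

KB = 1024


def nearest_block_size(data_size: int, optimal_blocks: int,
                       block_sizes: Sequence[int]) -> int:
    """Divide the data size by the optimal block count and round up to a block size."""
    per_block = data_size / optimal_blocks
    for size in sorted(block_sizes):
        if size >= per_block:
            return size
    raise ValueError("data too large for the configured block sizes")


def associated_region(regions: Dict[int, dict], data_size: int,
                      optimal_blocks: int, block_sizes: Sequence[int]) -> dict:
    """Return the bucket region for the data, creating it (with one header) if absent."""
    size = nearest_block_size(data_size, optimal_blocks, block_sizes)
    if size not in regions:
        regions[size] = {"block_size": size, "header": {}, "buckets": []}
    return regions[size]
```

For the 38K example with an optimal block count of 10, `nearest_block_size(38 * KB, 10, [4 * KB, 8 * KB, 16 * KB])` selects the 4K block size, and a second call to `associated_region` with the same arguments reuses the region created by the first.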

Further, the data to be stored is split into pieces of sub-data to be stored according to the size of the storage blocks in the associated bucket region.

After the associated bucket region has been determined, the data to be stored is split into pieces of sub-data to be stored. Specifically, the size of the data to be stored is divided by the nearest storage block size and the result is rounded up to give the actual number of blocks. Once the actual number of blocks has been determined, the data to be stored is split. If, after splitting, the last piece of sub-data is smaller than the storage block, at least one padding code is appended to it so that its size equals the size of the storage block. Each padding code is 1 byte in size and is preferably the code 0. After splitting, each piece of sub-data to be stored is equal in size to a single storage block in the associated bucket region. For example, if the data to be stored is 38K in size and its associated bucket region is the 4K bucket region, 38 divided by 4 gives 9.5, which rounds up to 10. The data to be stored is therefore split into 10 pieces of sub-data, each 4K in size. The actual size of the last piece after splitting is 2K, so it is padded with the padding code 0 up to 4K.
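A minimal sketch of the split-and-pad step, assuming the data is held as a Python `bytes` object and the padding code is the zero byte; the function name `split_into_blocks` is hypothetical.

```python
import math
from typing import List


def split_into_blocks(data: bytes, block_size: int, pad: bytes = b"\x00") -> List[bytes]:
    """Split data into block_size pieces and pad the last piece with 1-byte codes."""
    count = max(1, math.ceil(len(data) / block_size))  # actual number of blocks
    blocks = [data[i * block_size:(i + 1) * block_size] for i in range(count)]
    # Only the last piece can be short; fill it up to the full block size.
    blocks[-1] = blocks[-1] + pad * (block_size - len(blocks[-1]))
    return blocks
```

For 38K of data and 4K blocks this yields 10 pieces, the last holding 2K of data followed by 2K of padding. The source does not say how the padding is removed on read; a reader of this sketch would need the original length recorded somewhere or a compressed format that carries its own length.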

Further, the pieces of sub-data to be stored are placed, in order, into the empty storage blocks contained in the associated bucket region.

After the data to be stored has been split, the pieces of sub-data are placed in order into empty storage blocks in the associated bucket region. Preferably, before storing, it is first determined whether the number of empty storage blocks in the associated bucket region is greater than or equal to the number of pieces of sub-data. If so, the pieces are placed in order into the empty storage blocks of the associated bucket region. If not, and the free storage space of the associated bucket region is greater than the storage space required by the data to be stored, a new bucket is created in the associated bucket region and the pieces are placed in order into its empty storage blocks. If the storage space of the bucket region is smaller than the space required by the data to be stored, a program exception is reported and the operation is rolled back. For example, suppose the data requires 10 storage blocks of 4K and its bucket region has been determined to be the 4K bucket region. It is determined whether that bucket region has 10 or more empty storage blocks; if so, the pieces of sub-data are stored in order in the empty storage blocks. If it has fewer than 10, it is determined whether the 4K bucket region has enough remaining space to store the data. If so, a new empty bucket containing at least 10 empty storage blocks is established and the pieces are written in order into the empty storage blocks; if not, a program exception is reported and the operation is rolled back.
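The capacity check and placement can be sketched as below. Several simplifications are assumptions of the example rather than statements of the source: buckets have a fixed number of blocks, the region's remaining space is modelled as a cap on the number of buckets (`max_buckets`), existing empty blocks are used before those of a newly created bucket, and a storage block address is a `(bucket, block)` pair.

```python
from typing import List, Optional, Tuple

Address = Tuple[int, int]  # (bucket number, block number inside the bucket)


class BucketRegion:
    def __init__(self, blocks_per_bucket: int, max_buckets: int) -> None:
        self.blocks_per_bucket = blocks_per_bucket
        self.max_buckets = max_buckets
        self.buckets: List[List[Optional[bytes]]] = []  # None marks an empty block

    def empty_addresses(self) -> List[Address]:
        return [(b, i) for b, bucket in enumerate(self.buckets)
                for i, block in enumerate(bucket) if block is None]

    def place(self, sub_blocks: List[bytes]) -> List[Address]:
        """Write the pieces, in order, into empty blocks; add buckets when short."""
        missing = max(0, len(sub_blocks) - len(self.empty_addresses()))
        new_buckets = -(-missing // self.blocks_per_bucket)  # ceiling division
        if len(self.buckets) + new_buckets > self.max_buckets:
            # Not enough room: nothing has been written yet, so raising here
            # leaves the region unchanged (the "rollback" of this sketch).
            raise MemoryError("bucket region has no room for the data")
        for _ in range(new_buckets):
            self.buckets.append([None] * self.blocks_per_bucket)
        targets = self.empty_addresses()[:len(sub_blocks)]
        for (b, i), payload in zip(targets, sub_blocks):
            self.buckets[b][i] = payload
        return targets
```

Checking capacity before the first write keeps the failure path trivial; an implementation that writes first would need to undo partial writes to honour the rollback the text calls for.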

S140: store the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and distribute the unique identifier of the data to be stored to each index linked-list node.

Illustratively, once the associated bucket region of the sub-data has been determined, the pieces of sub-data are stored in order in empty storage blocks of the associated bucket region. The mapping relationship between the data to be stored and each associated storage block is stored in the header of the associated bucket region; it preferably includes the unique identifier of the data to be stored and the corresponding storage block addresses, and the status flag of each associated storage block is changed to the stored flag. Afterwards, either the mapping relationship stored in the header may be fed back to the corresponding index linked-list nodes, or the unique identifier of the data to be stored may be distributed to each index linked-list node; preferably, the unique identifier is distributed to each index linked-list node. The unique identifier fed back by the header of the associated bucket region is stored in the second part of the index linked-list node. All index linked-list nodes of the same data to be stored receive the same unique identifier, that is, different index values correspond to the same storage location, which saves storage space. For example, suppose the data to be stored is placed in 5 storage blocks of 4K each. In the header, the status flag of a storage block is "0" or "1", denoting the not-stored flag and the stored flag respectively. The addresses of the 5 storage blocks are 0010, 0011, 0100, 0101 and 0110, and the calculated unique identifier of the data to be stored is 01000100010001000. The sequence 0100010001000100000100011010001010110 is then stored in the header, and the status flags of the 5 storage blocks in the header are changed from "0" to "1". The unique identifier 01000100010001000 is distributed to the second part of each index linked-list node. When a user enters an index value and the corresponding index linked-list node has been found, the corresponding header is located according to 01000100010001000, the storage block addresses 0010, 0011, 0100, 0101 and 0110 are obtained, the corresponding bucket is found by looking up the mapping relationship between storage blocks and buckets, and the storage blocks are thereby found.
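The bookkeeping in this example can be reproduced with a small sketch. The class name `RegionHeader`, the function `distribute` and the use of bit strings for identifiers and addresses are assumptions made to mirror the numbers in the text; the stored sequence is modelled simply as the unique identifier followed by the block addresses.

```python
from typing import Dict, List


class RegionHeader:
    def __init__(self, block_addresses: List[str]) -> None:
        # Part 1 (simplified): one status flag per block, "0" = not stored, "1" = stored.
        self.status: Dict[str, str] = {addr: "0" for addr in block_addresses}
        # Part 2: unique identifier -> ordered addresses of the blocks holding the data.
        self.mappings: Dict[str, List[str]] = {}

    def record(self, data_id: str, addresses: List[str]) -> str:
        """Store the mapping, flip the status flags and return the stored sequence."""
        self.mappings[data_id] = list(addresses)
        for addr in addresses:
            self.status[addr] = "1"
        return data_id + "".join(addresses)


def distribute(data_id: str, index_nodes: List[dict]) -> None:
    """Write the same unique identifier into the second part of every index node."""
    for node in index_nodes:
        node["data_id"] = data_id
```

Recording identifier 01000100010001000 against addresses 0010, 0011, 0100, 0101 and 0110 yields the 37-bit sequence quoted in the example, and every index node of that data ends up holding the same identifier rather than its own copy of the address list.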

Fig. 2 is a diagram of the multi-index establishment regions provided by Embodiment one of the present invention. As shown in Fig. 2, the region is logically divided into an index area 10 and a data area 20. The index area 10 consists of a hash table formed by the pointer array 101 and the singly linked lists 102 corresponding to the pointer array 101; each singly linked list 102 consists of the index linked-list nodes 103 reached from the same position of the pointer array 101. The length of the pointer array 101 is greater than the number of index values expected to be stored. An index linked-list node 103 comprises a first part 1031, which stores the index value; a second part 1032, which stores the unique identifier of the indexed data; and a third part 1033, which handles hash conflicts. The pointer array 101 points to the first part 1031 of the first index linked-list node 103 of a singly linked list 102; the second part 1032 of an index linked-list node 103 points to the corresponding header 202; and, when a hash conflict occurs, the third part 1033 of an index linked-list node 103 points to the first part 1031 of the new index linked-list node 103 in the same singly linked list 102. A hash conflict means that the position in the pointer array 101 already points to an index linked-list node 103; the singly linked list 102 is then traversed and a new index linked-list node 103 is established at its end. The basic unit of the data area 20 is the storage block 204. According to the different sizes of the storage blocks 204, the data area 20 is logically divided into bucket regions 201, and the storage blocks 204 within each bucket region 201 are of the same size. A bucket region 201 is divided into several actual buckets 203. The capacity of each bucket 203 is an integer multiple of the capacity of a storage block 204. Data to be stored is stored in order in the storage blocks 204 of the corresponding bucket region 201. Each bucket region contains exactly one header 202, which is divided into two parts. The first part stores the mapping relationships among the bucket region 201, the buckets 203 and the storage blocks 204, the addresses of the buckets 203, the addresses of the storage blocks 204 and the status flags of the storage blocks 204. The second part stores the unique identifier of the data to be stored and the addresses of the storage blocks 204 in which it is stored. After data to be stored has been written into the corresponding empty storage blocks 204, the unique identifier of the new data and the addresses of the storage blocks 204 in which it is stored are written into the second part of the header 202, and the unique identifier is fed back to the index linked-list nodes 103 and stored in their second part 1032. The establishment of the index is then complete.

In the multi-index data processing method provided by Embodiment one of the present invention, each index value of the data to be stored is obtained, the associated index linked-list nodes are created, a calculation on the data to be stored determines the associated bucket region, the data is written in order into empty storage blocks of the associated bucket region, and the unique identifier of the data to be stored is distributed to each index linked-list node. With this method, indexes can be created quickly, and because the data to be stored is split sensibly, waste of storage space is effectively avoided and the utilization of storage space is improved.

Embodiment two

Fig. 3 is a flow chart of a multi-index data processing method provided by Embodiment two of the present invention. On the basis of Embodiment one, this embodiment adds the step of querying by means of the created indexes. As shown in Fig. 3, the method comprises:

S210: when a data query event is detected, obtain the index value to be queried sent by the client.

Illustratively, when a user needs to query stored data by some index value, the user enters the corresponding index value, and the client obtains the index value to be queried entered by the user.

S220: determine the index linked-list node pointed to from the relative position of the index value to be queried in the pointer array.

Illustratively, a hash calculation and a modulo operation are performed on the index value to be queried entered by the user. The corresponding position in the pointer array is found from the result, and that position leads to the entry of the corresponding singly linked list, which determines the index linked-list node pointed to from the relative position of the index value to be queried in the pointer array. The time complexity of this part is O(1). If the index value stored in the index linked-list node pointed to by the pointer array does not match the index value to be queried, a conflict has occurred, and all index linked-list nodes in the singly linked list are traversed to find the node that matches the index value to be queried. The probability of a conflict in a singly linked list is very small, and the time needed to resolve it is a very small constant.

S230: determine the header of the bucket region corresponding to the index linked-list node that is pointed to.

Illustratively, the header of the bucket region corresponding to the stored data to be queried is determined according to the identifier stored in the second part of the index linked-list node.

S240: according to the header of the corresponding bucket region, look up the storage blocks corresponding to the index value to be queried, and merge the data in those storage blocks to serve as the access data corresponding to the index value to be queried.

Illustratively, the storage block addresses are determined from the unique identifier of the data to be stored in the header of the corresponding bucket region; all storage blocks holding the data are found according to the mapping relationship between storage blocks and buckets; and the data in the storage blocks is taken out and merged into the original compressed stored data, which is returned to the client as the access data corresponding to the index value to be queried.
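Steps S210 to S240 can be sketched end to end as follows. The dictionary-based node layout (`index_value`, `data_id`, `next`), the `header` mapping from identifier to ordered block addresses and the `blocks` mapping from address to contents are assumptions of the example; the hash is a BKDR-style function with an assumed seed of 131. The result is the merged block contents, which in the described scheme would still be compressed and padded.

```python
from typing import Dict, List, Optional


def bkdr_hash(key: str, seed: int = 131) -> int:
    h = 0
    for byte in key.encode("utf-8"):
        h = (h * seed + byte) & 0xFFFFFFFF
    return h


def query(index_value: str,
          slots: List[Optional[dict]],
          header: Dict[str, List[str]],
          blocks: Dict[str, bytes]) -> Optional[bytes]:
    """Follow index value -> index node -> header mapping -> storage blocks."""
    # S220: hash, take the modulus, then walk the list until the value matches.
    node = slots[bkdr_hash(index_value) % len(slots)]
    while node is not None and node["index_value"] != index_value:
        node = node["next"]
    if node is None:
        return None  # no such index value
    # S230: the second part of the node identifies the data in the header.
    addresses = header[node["data_id"]]
    # S240: fetch the blocks in order and merge them.
    return b"".join(blocks[addr] for addr in addresses)
```

The list walk is only taken past the first node when two index values share a position, which is why the text treats the lookup as constant time in the common case.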

In the multi-index data processing method provided by Embodiment two of the present invention, the index linked-list node is determined from the index value entered by the user; the header of the corresponding bucket region is found through the unique identifier of the data stored in the index linked-list node; the corresponding storage blocks are determined according to the header of the bucket region; and the data stored in those storage blocks is merged to obtain the corresponding access data. With this query method, the stored data corresponding to an index value can be obtained quickly and accurately, which improves the user experience.

Embodiment three

Fig. 4 is a flow chart of a multi-index data processing method provided by Embodiment three of the present invention. On the basis of Embodiment one, this embodiment adds the step of deleting established indexes. As shown in Fig. 4, the method comprises:

S310: when a data deletion event is detected, obtain the data to be deleted sent by the client.

Illustratively, the data to be deleted sent by the client is obtained. Alternatively, an index value to be deleted sent by the client may be obtained; the index linked-list node pointed to from the relative position of that index value in the pointer array is determined; and, according to the header of the bucket region corresponding to that node, the data of the corresponding storage blocks is obtained and merged to serve as the data to be deleted.

S320: extract each index value to be deleted from the data to be deleted according to a preset index rule.

Illustratively, each index value is extracted and constructed from the data to be deleted according to the preconfigured index construction rule. The index construction rule may be set according to actual conditions and is not specifically limited here.

S330: determine each index linked-list node pointed to from the relative position of each index value to be deleted in the pointer array.

Illustratively, after each index value to be deleted has been calculated, the index linked-list node pointed to from the relative position of each index value to be deleted in the pointer array is determined. Specifically, a hash calculation is performed on each index value to be deleted, the resulting integer is taken modulo the length of the pointer array, the corresponding position in the pointer array is found from the result, and the index linked-list node pointed to from that position is determined. If the index value of that node does not match the index value to be deleted, the singly linked list at that position is traversed to find the matching index linked-list node.

S340: determine the header of the bucket region corresponding to each index linked-list node that is pointed to.

Illustratively, since all of these index linked-list nodes point to the header of the same bucket region, the header of the corresponding bucket region can be determined from the second part of any one of the index linked-list nodes of the data to be deleted.

S350: delete the mapping relationship in the header between the data and its associated storage blocks, and delete each index linked-list node.

Illustratively, after the header of the corresponding bucket region has been found from an index linked-list node, the unique identifier of the data to be deleted and the storage block addresses recorded for it are deleted from the header of the bucket region, and the stored flags of the corresponding storage blocks are deleted or, alternatively, changed to the not-stored flag. All index linked-list nodes of the data to be deleted are also deleted. The data in the storage blocks may be left in place: the next time other data to be stored is written to those storage blocks, it replaces the old data. Alternatively, the corresponding stored data in the storage blocks may also be deleted.
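A minimal sketch of the deletion path, under the same assumed node layout as a dictionary with `index_value`, `data_id` and `next` fields. The helper names `unlink` and `delete_data`, and the `slot_of` callable standing in for hash-plus-modulo, are hypothetical. The sketch unlinks the index nodes first and then clears the header entry, and it leaves block contents in place so that later writes simply overwrite them.

```python
from typing import Callable, Dict, List, Optional


def unlink(slots: List[Optional[dict]], pos: int, index_value: str) -> Optional[dict]:
    """Remove the node holding index_value from the list at slots[pos]."""
    prev, node = None, slots[pos]
    while node is not None and node["index_value"] != index_value:
        prev, node = node, node["next"]
    if node is None:
        return None
    if prev is None:
        slots[pos] = node["next"]     # the removed node was the head
    else:
        prev["next"] = node["next"]   # bypass the removed node
    return node


def delete_data(index_values: List[str],
                slots: List[Optional[dict]],
                mappings: Dict[str, List[str]],
                status: Dict[str, str],
                slot_of: Callable[[str], int]) -> bool:
    """Delete every index node of one piece of data, then its header entry."""
    data_id = None
    for value in index_values:
        node = unlink(slots, slot_of(value), value)
        if node is not None:
            data_id = node["data_id"]  # all nodes of the data share this identifier
    if data_id is None:
        return False
    for addr in mappings.pop(data_id, []):
        status[addr] = "0"             # mark the block as not stored
    return True
```

Because every index node of the same data carries the same identifier, any one of them is enough to locate the header entry, as S340 notes.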

Further, the last access time of the sub-data stored in each storage block, as marked in the header of the bucket region, is obtained. If the interval between the last access time and the current time is greater than a preset interval, the sub-data in the corresponding storage blocks is merged into the stored data, all index values corresponding to the stored data are calculated, and the index linked-list node corresponding to each index value is determined. All of these index linked-list nodes are deleted, the unique identifier of the stored data and the corresponding storage block addresses are deleted from the header, and the stored flags of the corresponding storage blocks are deleted or, alternatively, changed to the not-stored flag. The preset time interval may be set according to actual conditions.

In the multi-index data processing method provided by Embodiment three of the present invention, the data to be deleted is obtained, each corresponding index value to be deleted is extracted, the corresponding index linked-list nodes are found, the header of the bucket region corresponding to those nodes is determined, the mapping relationship of the stored data is deleted from the header, each index linked-list node is deleted, and the status flags of the storage blocks are changed to the not-stored flag. With this method, index linked-list nodes can be deleted quickly, which improves the efficiency of index deletion.

Embodiment four

Fig. 5 is a flowchart of a multi-index data processing method provided by Embodiment four of the present invention. On the basis of Embodiment one, this embodiment adds a step of locking any bucket whose utilization rate is lower than a preset utilization threshold. As shown in Fig. 5, the method comprises:

S410, when the utilization rate of any bucket is detected to be lower than the preset utilization threshold, mark that bucket as locked.

Illustratively, the utilization rate of the buckets in the storage region is detected. When the utilization rate of any bucket is detected to be lower than the preset utilization threshold, that bucket is marked as locked; a locked bucket cannot be used for storing, querying or deleting data. Utilization detection may be performed automatically at a set time interval or triggered manually. Fig. 6 is a schematic diagram of storage-block movement provided by Embodiment four of the present invention. As shown in Fig. 6, bucket 501, whose utilization rate is lower than the preset utilization threshold, is marked as locked. The preset utilization threshold can be set according to the actual situation.

S420, transfer the storage blocks contained in the bucket into other buckets.

Illustratively, the other buckets in the bucket region to which the locked bucket belongs are determined, and the storage blocks contained in the locked bucket are transferred into those other buckets, preferably into buckets that are logically earlier in the bucket region. As shown in Fig. 6, storage block 5011, storage block 5012 and storage block 5013 in bucket 501 are moved in turn into bucket 502. If the number of free storage blocks in bucket 502 is smaller than the number of storage blocks that need to be moved out of bucket 501, the remaining storage blocks are moved into other buckets once bucket 502 is full. The storage block addresses that follow the unique identifier of the corresponding stored data in the bucket-region header are then updated, the status flags of the storage blocks are changed, the lock is released, and the storage space of this bucket is freed.
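The lock, move and remap sequence of S410 and S420 can be sketched as below. This is an illustrative sketch only: representing a bucket as a list of optional byte blocks, the header as a dictionary from unique identifier to block addresses, and the names `utilization` and `evacuate` are assumptions made for the example and do not come from the embodiment.

```python
from typing import Dict, List, Optional, Set, Tuple

Address = Tuple[int, int]  # (bucket index, block index); hypothetical layout
Bucket = List[Optional[bytes]]  # None marks an empty storage block


def utilization(bucket: Bucket) -> float:
    """Fraction of blocks in a non-empty bucket that currently hold data."""
    return sum(block is not None for block in bucket) / len(bucket)


def evacuate(
    buckets: List[Bucket],
    header: Dict[int, List[Address]],
    locked: Set[int],
    victim: int,
) -> None:
    """Lock the victim bucket, move its blocks elsewhere, then unlock it."""
    locked.add(victim)
    # Free slots in unlocked buckets, logically earliest buckets first.
    free = [
        (b, s)
        for b, bucket in enumerate(buckets)
        if b not in locked
        for s, block in enumerate(bucket)
        if block is None
    ]
    used = [s for s, block in enumerate(buckets[victim]) if block is not None]
    if len(free) < len(used):
        locked.discard(victim)
        raise RuntimeError("not enough free blocks to evacuate the bucket")
    moves: Dict[Address, Address] = {}
    for s, (dest_b, dest_s) in zip(used, free):
        buckets[dest_b][dest_s] = buckets[victim][s]
        buckets[victim][s] = None
        moves[(victim, s)] = (dest_b, dest_s)
    # Rewrite the block addresses recorded in the header.
    for data_id in list(header):
        header[data_id] = [moves.get(addr, addr) for addr in header[data_id]]
    locked.discard(victim)
```

A caller would typically run `evacuate` only for buckets whose `utilization` falls below the chosen threshold.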

In the multi-index data processing method provided by Embodiment four of the present invention, if the utilization rate of a bucket is detected to be lower than the preset utilization threshold, the bucket is marked as locked and the storage blocks in that bucket are moved into other buckets. With this method, fragmentation can be cleaned up periodically, storage space is used sensibly, and the utilization of storage space is improved.

Embodiment five

Fig. 7 is a flowchart of a multi-index data processing method provided by Embodiment five of the present invention. This embodiment is a preferred example of establishing an index. As shown in Fig. 7, the method specifically comprises:

S610, start.

For example, the process starts when the data to be stored (value) is obtained.

S620, extract each index value (key) from the data to be stored according to the index rule.

S630, hash each index value and take the result modulo the length of the pointer array (P[n]) to map it to the corresponding position.

Illustratively, the hash function is BKDRHash. The length of P[n] is greater than the expected number of keys to be stored.
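The mapping in S630 can be sketched as follows. The text names BKDRHash but does not give its parameters, so the seed 131, the 32-bit truncation, the UTF-8 encoding of keys and the helper name `slot_for` are assumptions chosen for illustration.

```python
def bkdr_hash(key: str, seed: int = 131) -> int:
    """BKDR string hash: h = h * seed + byte, kept to 32 bits (assumed width)."""
    h = 0
    for byte in key.encode("utf-8"):
        h = (h * seed + byte) & 0xFFFFFFFF
    return h


def slot_for(key: str, table_len: int) -> int:
    """Map a key to a position in the pointer array P[n] by modulo."""
    return bkdr_hash(key) % table_len
```

Choosing `table_len` larger than the expected number of keys, as the text requires, keeps the linked lists behind each position short.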

S640, judge whether the corresponding position already holds a value.

Illustratively, judge whether the position in the pointer array already holds an index value; if so, execute S660, otherwise execute S650.

S650, establish a new singly linked list head pointer, then jump to S670.

S660, handle the conflict according to the open addressing method, then continue to S670.

Illustratively, all index linked-list nodes of the corresponding singly linked list are traversed, a new index linked-list node is created at the end of the list, and the key is stored in it.

S670, establish the corresponding index linked-list node, then jump to S6180.

Illustratively, after the singly linked list head pointer is established, the corresponding index linked-list node is created. An index linked-list node contains three parts. The first part stores the key; that is, the singly linked list head pointer points to the first part of the index linked-list node. The second part stores the unique identifier of the data to be stored that is recorded in the header in S6160. The third part is initially empty and, when a conflict occurs, points to the next index linked-list node.
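The three-part node and the append-at-tail conflict handling described in S640 to S670 can be sketched as follows. The field names `key`, `data_id` and `next`, and the function `insert_key`, are hypothetical; the slot is passed in directly so that the sketch does not depend on any particular hash function.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexNode:
    key: str                            # part 1: the index value
    data_id: Optional[int] = None       # part 2: filled in once the header has the id
    next: Optional["IndexNode"] = None  # part 3: next node when keys collide


def insert_key(table: List[Optional[IndexNode]], slot: int, key: str) -> IndexNode:
    """Create a node for key at table[slot], appending to the list on conflict."""
    node = IndexNode(key)
    head = table[slot]
    if head is None:  # S650: position is empty, start a new list
        table[slot] = node
        return node
    tail = head       # S660: walk to the end of the existing list
    while tail.next is not None:
        tail = tail.next
    tail.next = node
    return node
```

The returned node keeps `data_id` empty until the storage branch has recorded the unique identifier in the header.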

S680, compress the data to be stored, then continue to S690.

Illustratively, the default compression method is LZ4.

S690, according to the configured optimal split count, calculate the appropriate size and number of the sub-data pieces into which the data to be stored is split, then continue to S6100.

S6100, judge whether a bucket region corresponding to the sub-data to be stored exists.

Illustratively, the size of the storage blocks in that bucket region is the same as the size of the sub-data to be stored. If it exists, execute S6110; otherwise execute S6120.

S6110, judge whether the buckets in this bucket region have enough vacant storage block positions.

Illustratively, if there are enough block positions, execute S6150; otherwise execute S6130.

S6120, establish a bucket region corresponding to the size of the sub-data to be stored, then continue to S6130.

Illustratively, the size of the storage blocks in the newly established bucket region is the same as the size of the sub-data to be stored.

S6130, judge whether the bucket region has enough storage space.

Illustratively, if there is enough space, execute S6140; otherwise execute S6170.

S6140, establish an empty bucket in the bucket region, then continue to S6150.

Illustratively, the number of storage blocks contained in the bucket is greater than the number of sub-data pieces to be stored.

S6150, fill in the data to be stored sequentially, starting from the first empty storage block, then continue to S6160.

Illustratively, Fig. 8 is a schematic diagram of storing the data to be stored, provided by Embodiment five of the present invention. As shown in Fig. 8, data to be stored 701 is divided into sub-data 7011, sub-data 7012, sub-data 7013, sub-data 7014 and sub-data 7015. If the data to be stored cannot fill 7015, the remainder of 7015 is padded with 0; the padded portion is 7016. Each sub-data piece is assigned in order to the blank storage blocks of bucket 702; after assignment, the bucket holding the written data is bucket 703.
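The split-and-pad step illustrated by Fig. 8 can be sketched as follows. The function name `split_into_blocks` is hypothetical, and the block size is taken as a given parameter because the text does not spell out how the optimal split count is computed.

```python
from typing import List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut data into block_size pieces, zero-padding the last piece."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        # Pad a short final chunk with 0 bytes, like region 7016 in Fig. 8.
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks
```

Because the padding bytes are indistinguishable from real zero bytes, a reader would need the original length, or a self-delimiting compressed format, to strip them again; how the embodiment handles this is not stated.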

S6160, store the unique identifier of the data to be stored and the corresponding storage block addresses in the header, change the status flag of the storage blocks to "stored", then jump to S6180.

S6170, terminate abnormally and roll back.

S6180, once the data has been stored, assign the unique identifier of the data to be stored, as recorded in the header, to the index linked-list node, then continue to S6190.

Illustratively, S610-S670 and S680-S6170 are carried out concurrently. After both branches have completed, the unique identifier of the data to be stored in the header is assigned to the second part of the index linked-list node.
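The two-branch structure (index building alongside compression and storage, joined at S6180) can be sketched as below. Everything here is a stand-in chosen for illustration: nodes are plain dictionaries, the header is a dictionary keyed by a counter-based identifier, and `zlib` from the standard library replaces LZ4, which the text names as the default but which is not part of the Python standard library.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def build_nodes(keys: List[str]) -> List[dict]:
    """Index branch: one node per key, identifier not yet known."""
    return [{"key": k, "data_id": None} for k in keys]


def store_value(value: bytes, header: Dict[int, bytes]) -> int:
    """Storage branch: compress, record in the header, return the new id."""
    data_id = len(header) + 1  # hypothetical id scheme
    header[data_id] = zlib.compress(value)  # stand-in for LZ4
    return data_id


def put(keys: List[str], value: bytes, header: Dict[int, bytes]) -> List[dict]:
    """Run both branches concurrently, then assign the id to every node."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        nodes_future = pool.submit(build_nodes, keys)
        id_future = pool.submit(store_value, value, header)
        nodes = nodes_future.result()
        data_id = id_future.result()
    for node in nodes:  # S6180: fill the second part of each node
        node["data_id"] = data_id
    return nodes
```

In CPython, threads mainly illustrate the control flow; whether the two branches actually overlap in time depends on the runtime and on where the work is done.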

S6190, assignment completed.

The multi-index data processing method provided by Embodiment five of the present invention processes the data to be stored and writes it into the bucket region while the index linked-list nodes are being established, which speeds up index creation and improves data storage efficiency.

Embodiment six

Fig. 9 is a schematic diagram of a multi-index data processing device provided by Embodiment six of the present invention. As shown in Fig. 9, the device includes: an index value extraction module 801, a node establishing module 802, a bucket region determining module 803 and a data storage module 804.

Wherein, the index value extraction module 801 is used for extracting each index value of the data to be stored; the node establishing module 802 is used for establishing each associated index linked-list node according to the relative position of each index value in the pointer array; the bucket region determining module 803 is used for determining the associated storage bucket region of the data to be stored, and creating, in the header of the associated storage bucket region, the mapping relationship between the data to be stored and each associated storage block contained in the associated storage bucket region; the data storage module 804 is used for storing the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and assigning the unique identifier of the data to be stored to each index linked-list node.

Further, the bucket region determining module 803 further includes: a bucket determining unit, a splitting unit and a storing unit.

Wherein, the bucket determining unit is used for determining the associated storage bucket region of the data to be stored; the splitting unit is used for splitting the data to be stored into sub-data pieces according to the size of the storage blocks in the associated storage bucket region; the storing unit is used for placing each sub-data piece in order into the empty storage blocks contained in the associated storage bucket region.

Preferably, the storing unit further includes: an empty-block judging subunit and an allocation subunit.

Wherein, the empty-block judging subunit is used for determining whether the number of empty storage blocks contained in the associated storage bucket region is greater than the number of sub-data pieces to be stored; the allocation subunit is used for, if so, assigning each sub-data piece in order to the empty storage blocks in the associated storage bucket region; and if not, when the empty storage space of the associated storage bucket region is greater than the storage space required by the data to be stored, creating a new bucket in the associated storage bucket region and assigning each sub-data piece in order to the empty storage blocks in the associated storage bucket region.

On the basis of the above embodiments, the device further includes: a query index value obtaining module, a query node determining module, a header query module and a data access module.

Wherein, the query index value obtaining module is used for obtaining the index value to be queried sent by the client when a data query event is detected; the query node determining module is used for determining the index linked-list node pointed to by the relative position of the index value to be queried in the pointer array; the header query module is used for determining the header of the bucket region corresponding to the index linked-list node pointed to; the data access module is used for looking up, according to the header of the corresponding bucket region, the storage blocks corresponding to the index value to be queried, and merging the data in the corresponding storage blocks as the access data corresponding to the index value to be queried.

Further, the device further includes: a to-be-deleted data obtaining module, a to-be-deleted index value obtaining module, a to-be-deleted node determining module, a header determining module and a deletion module.

Wherein, the to-be-deleted data obtaining module is used for obtaining the data to be deleted sent by the client when a data deletion event is detected; the to-be-deleted index value obtaining module is used for extracting each index value to be deleted from the data to be deleted according to the preset index rule; the to-be-deleted node determining module is used for determining each index linked-list node pointed to by the relative position of each index value to be deleted in the pointer array; the header determining module is used for determining the header of the bucket region corresponding to each index linked-list node pointed to; the deletion module is used for deleting the mapping relationship in the header between the data to be deleted and the associated storage blocks, and deleting each index linked-list node.

Preferably, the to-be-deleted data obtaining module further includes: an index value obtaining unit, a node determining unit and a data obtaining unit.

Wherein, the index value obtaining unit is used for obtaining the index value to be deleted sent by the client when a data deletion event is detected; the node determining unit is used for determining the index linked-list node pointed to by the relative position of the index value to be deleted in the pointer array; the data obtaining unit is used for obtaining and merging, according to the header of the bucket region corresponding to the index linked-list node pointed to, the data of the corresponding storage blocks as the data to be deleted.

Further, the device further includes: a locking module and a moving module.

Wherein, the locking module is used for marking a bucket as locked when the utilization rate of that bucket is detected to be lower than the preset utilization threshold; the moving module is used for transferring the storage blocks contained in that bucket into other buckets.

The multi-index data processing device provided by Embodiment six of the present invention obtains the index values of the data to be stored, creates the associated index linked-list nodes, determines the associated bucket region for the data to be stored by calculation, writes the data in order into the empty storage blocks of the associated bucket region, and assigns the unique identifier of the data to be stored to each index linked-list node. With this device, indexes can be created quickly, and because the data to be stored is split sensibly, waste of storage space is effectively avoided and the utilization of storage space is improved.

The multi-index data processing device provided by the embodiments of the present invention is used to execute the multi-index data processing method provided by the embodiments of the present invention, and has the corresponding functions and beneficial effects.

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to those embodiments; without departing from the inventive concept, it may include further equivalent embodiments, and the scope of the invention is determined by the scope of the appended claims.

Claims

1. A multi-index data processing method, characterized by comprising:

Extracting each index value of the data to be stored;

Determining the associated storage bucket region of the data to be stored, and creating, in the header of the associated storage bucket region, the mapping relationship between the data to be stored and each associated storage block contained in the associated storage bucket region;

Storing the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and assigning the unique identifier of the data to be stored to each index linked-list node;

Establishing each associated index linked-list node according to the relative position of each index value in the pointer array, comprising:

Establishing a new index linked-list node, wherein the first part of the index linked-list node is used for storing the index value, and the second part of the index linked-list node is initially empty and is used for storing information of the header in the bucket region after the bucket region of the data to be stored has been determined;

Determining the associated storage bucket region of the data to be stored, and creating, in the header of the associated storage bucket region, the mapping relationship between the data to be stored and each associated storage block contained in the associated storage bucket region, comprising:

Determining the associated storage bucket region of the data to be stored;

Splitting the data to be stored into sub-data pieces according to the size of the storage blocks in the associated storage bucket region;

Placing each sub-data piece in order into the empty storage blocks contained in the associated storage bucket region.

2. The method according to claim 1, characterized in that placing each sub-data piece in order into the empty storage blocks contained in the associated storage bucket region comprises:

Determining whether the number of empty storage blocks contained in the associated storage bucket region is greater than the number of sub-data pieces to be stored;

If so, placing each sub-data piece in order into the empty storage blocks in the associated storage bucket region; if not, when the empty storage space of the associated storage bucket region is greater than the storage space required by the data to be stored, creating a new bucket in the associated storage bucket region and placing each sub-data piece in order into the empty storage blocks in the associated storage bucket region.

3. The method according to claim 1, characterized by further comprising:

When a data query event is detected, obtaining the index value to be queried sent by the client;

Determining the index linked-list node pointed to by the relative position of the index value to be queried in the pointer array;

Determining the header of the bucket region corresponding to the index linked-list node pointed to;

Looking up, according to the header of the corresponding bucket region, the storage blocks corresponding to the index value to be queried, and merging the data in the corresponding storage blocks as the access data corresponding to the index value to be queried.

4. The method according to claim 1, characterized by further comprising:

When a data deletion event is detected, obtaining the data to be deleted sent by the client;

Extracting each index value to be deleted from the data to be deleted according to the preset index rule;

Determining each index linked-list node pointed to by the relative position of each index value to be deleted in the pointer array;

Determining the header of the bucket region corresponding to each index linked-list node pointed to;

Deleting the mapping relationship in the header between the data to be deleted and the associated storage blocks, and deleting each index linked-list node.

5. The method according to claim 4, characterized in that obtaining the data to be deleted sent by the client when a data deletion event is detected comprises:

When a data deletion event is detected, obtaining the index value to be deleted sent by the client;

Determining the index linked-list node pointed to by the relative position of the index value to be deleted in the pointer array;

Obtaining and merging, according to the header of the bucket region corresponding to the index linked-list node pointed to, the data of the corresponding storage blocks as the data to be deleted.

6. The method according to claim 1, characterized by further comprising:

When the utilization rate of any bucket is detected to be lower than the preset utilization threshold, marking that bucket as locked;

Transferring the storage blocks contained in that bucket into other buckets.

7. A multi-index data processing device, characterized by comprising:

A node establishing module, used for establishing each associated index linked-list node according to the relative position of each index value in the pointer array;

A bucket region determining module, used for determining the associated storage bucket region of the data to be stored, and creating, in the header of the associated storage bucket region, the mapping relationship between the data to be stored and each associated storage block contained in the associated storage bucket region;

A data storage module, used for storing the data to be stored according to the created mapping relationship between the data to be stored and each associated storage block, and assigning the unique identifier of the data to be stored to each index linked-list node;

The node establishing module is specifically used for establishing a new index linked-list node, wherein the first part of the index linked-list node is used for storing the index value, and the second part of the index linked-list node is initially empty and is used for storing information of the header in the bucket region after the bucket region of the data to be stored has been determined;

The bucket region determining module further includes:

A bucket determining unit, used for determining the associated storage bucket region of the data to be stored;

A splitting unit, used for splitting the data to be stored into sub-data pieces according to the size of the storage blocks in the associated storage bucket region;

A storing unit, used for placing each sub-data piece in order into the empty storage blocks contained in the associated storage bucket region.

8. The device according to claim 7, characterized by further comprising:

A query index value obtaining module, used for obtaining the index value to be queried sent by the client when a data query event is detected;

A query node determining module, used for determining the index linked-list node pointed to by the relative position of the index value to be queried in the pointer array;

A header query module, used for determining the header of the bucket region corresponding to the index linked-list node pointed to;

A data access module, used for looking up, according to the header of the corresponding bucket region, the storage blocks corresponding to the index value to be queried, and merging the data in the corresponding storage blocks as the access data corresponding to the index value to be queried.

9. The device according to claim 7, characterized by further comprising:

A to-be-deleted data obtaining module, used for obtaining the data to be deleted sent by the client when a data deletion event is detected;

A to-be-deleted index value obtaining module, used for extracting each index value to be deleted from the data to be deleted according to the preset index rule;

A to-be-deleted node determining module, used for determining each index linked-list node pointed to by the relative position of each index value to be deleted in the pointer array;

A header determining module, used for determining the header of the bucket region corresponding to each index linked-list node pointed to;

A deletion module, used for deleting the mapping relationship in the header between the data to be deleted and the associated storage blocks, and deleting each index linked-list node.