CN107679212A

CN107679212A - A data query optimization method applied to a skip list data structure

Info

Publication number: CN107679212A
Application number: CN201710968153.9A
Authority: CN
Inventors: 汪俊锋; 刘罡; 张巧云; 戴平
Original assignee: Anhui Wisdom Gold Tong Technology Co Ltd
Current assignee: Anhui Wisdom Gold Tong Technology Co Ltd
Priority date: 2017-10-17
Filing date: 2017-10-17
Publication date: 2018-02-09

Abstract

The present invention discloses a data query optimization method applied to a skip list data structure and belongs to the technical field of data indexing. The method comprises: creating a skip list with a total of N index data items and n index levels, storing the index data of each level of the skip list sequentially in an array T[N], and storing in an array Id[N] the address of each corresponding index data item in the next level; initializing the query start address index of the first-level index data in the skip list to 0; reading index data from array T[N] starting at address index, comparing the index data one by one with the search data M, finding the first index data item greater than or equal to M, and obtaining the address index of that item; obtaining an updated address index from array Id[N] according to the obtained address index, and continuing the search in array T[N] or in the data list according to the updated address index; and, in the data list, traversing forward along the list from address index and returning the address once the position of M is found. The method can improve the cache hit rate during data queries.

Description

A data query optimization method applied to a skip list data structure

Technical field

The present invention relates to the technical field of data indexing, and more particularly to a data query optimization method applied to a skip list data structure.

Background art

An in-memory database is a database that keeps all of its data in memory and operates on it there. Compared with a database that uses disk as its main storage medium, an in-memory database is mainly characterized by high speed and high throughput. In current big data processing systems, in-memory databases are commonly used to cache data so as to improve the processing performance for frequently used data.

A skip list is a data structure that adds index data to an ordered list in a randomized manner. With this additional index data, a lookup can quickly skip over parts of the list, which speeds up the search. Skip lists are structurally simple and efficient. As an index data structure, the skip list is widely used in mainstream in-memory databases such as Redis and MemSQL.

The main operation of the skip list lookup algorithm is to start searching the index data at the top level, then use the index data address found there to search the index data of the next level, and so on until the search of the last level of index data is finished; the search then continues in the ordered list from the index address obtained last. However, because the index data is not stored contiguously, existing skip list lookup algorithms often suffer from a high cache miss rate and low memory bandwidth utilization, which degrades their performance.

Summary of the invention

An object of the present invention is to provide a data query optimization method applied to a skip list data structure, so as to improve the cache hit rate during data queries.

To achieve the above object, the present invention adopts the following technical solution: a data query optimization method applied to a skip list data structure, comprising:

S1, creating a skip list with a total of N index data items and n levels, storing the index data of each level of the skip list sequentially in an array T[N], and storing in an array Id[N] the address of each corresponding index data item in the next level;

S2, initializing the query start address index of the first-level index data in the skip list to 0;

S3, reading index data from array T[N] starting at address index, comparing the index data one by one with the search data M, finding the first index data item greater than or equal to M, and obtaining the address index of that item;

S4, according to the obtained address index, obtaining an updated address index from array Id[N], and then continuing the search in array T[N] or in the data list according to the updated address index;

S5, in the data list, traversing forward along the list from address index, and returning the address once the position of M is found.

Step S1 specifically comprises:

storing the index data in array T[N], wherein the index data of each level of the skip list is stored sequentially in array T[N];

storing the index addresses in array Id[N], wherein array Id[N] stores the address in the next level of each corresponding index data item in array T[N].

Step S3 specifically comprises:

S31, calculate mod = index % K, where K is the data width read by a SIMD instruction, i.e. the number of data items read at once;

S32, determine whether mod is 0; if so, perform step S36, otherwise perform step S33;

S33, read the data T[index] at index index from array T[N] and denote it as variable H;

S34, determine whether H >= M holds; if so, perform step S313, otherwise perform step S35;

S35, update the values of variables mod and index by performing mod = mod - 1 and index = index + 1, then perform step S32;

S36, using a SIMD instruction, read K data items from array T[N] starting at index index and denote them as array H[K];

S37, initialize variable k to 0;

S38, determine whether H[k] >= M holds; if so, perform step S312, otherwise perform step S39;

S39, update the value of variable k by performing k = k + 1;

S310, determine whether k >= K holds; if so, perform step S311, otherwise perform step S38;

S311, update the value of variable index by performing index = index + K, then perform step S36;

S312, update the value of index by performing index = index + k, then perform step S313;

S313, the flow ends.
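Steps S31 to S313 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: Python has no SIMD load, so a list slice of K items stands in for the SIMD read of step S36; the function name `find_first_ge`, the default K = 4, the sample array, and the `math.inf` sentinel that guarantees a match are all hypothetical.

```python
import math

def find_first_ge(T, index, M, K=4):
    """Steps S31-S313: address of the first T[i] >= M with i >= index."""
    mod = index % K                       # S31
    while mod != 0:                       # S32
        H = T[index]                      # S33: scalar read
        if H >= M:                        # S34
            return index                  # S313
        mod -= 1                          # S35
        index += 1
    while index < len(T):
        H = T[index:index + K]            # S36: slice stands in for one SIMD load
        for k, h in enumerate(H):         # S37, S39, S310: k runs over the block
            if h >= M:                    # S38
                return index + k          # S312, S313
        index += K                        # S311
    # Assumption: a sentinel >= M always exists, so this is not reached.
    raise ValueError("no entry >= M found")

# Hypothetical array holding two levels, each closed by a sentinel.
T = [1, 9, math.inf, 1, 5, 9, 13, math.inf]
print(find_first_ge(T, 0, 7))   # search the first level from address 0
print(find_first_ge(T, 3, 7))   # search the second level from address 3
```

In a compiled language the slice would be replaced by a vector load and a vector comparison, which is where the bandwidth benefit described in the text would come from.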

Step S4 specifically comprises:

according to the obtained address index, obtaining the updated address index from array Id[N];

determining whether the updated index is the address of index data in the skip list; if so, performing step S3, otherwise performing step S5.

Compared with the prior art, the present invention has the following technical effects: the index data is stored in a contiguous space, and a query proceeds through the index data of each level of the skip list, which improves the cache hit rate during data queries. A SIMD instruction can read multiple contiguous data items in one instruction, making full use of the memory bandwidth provided by the processor. Thus, when querying the index data of each level, SIMD instructions can be used to read contiguous data, improving the utilization of the processor's memory bandwidth.

Brief description of the drawings

The specific embodiments of the present invention are described in detail below with reference to the accompanying drawings:

Fig. 1 is a schematic flowchart of the data query optimization method applied to a skip list data structure according to the present invention;

Fig. 2 is a schematic diagram of the skip list index data storage structure according to the present invention;

Fig. 3 is a schematic flowchart of searching index data in the skip list according to the present invention.

Detailed description of the embodiments

To further illustrate the features of the present invention, refer to the following detailed description and the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the scope of protection of the present invention.

As shown in Fig. 1, this embodiment provides a data query optimization method applied to a skip list data structure, comprising steps S1 to S5 set out above. The following notes apply to the individual steps.

It should be noted that the skip list is created to store the index data of the data list. The skip list has n levels, and the total number of index data items across all levels is N. Arrays T[N] and Id[N] store the index data and the index addresses, respectively. The index data of each level of the skip list is stored sequentially in array T[N]. Array Id[N] stores, for each index data item, its address in the next level. For index data in the last level, array Id[N] stores its address in the data list.

Fig. 2 depicts the index data storage structure of a two-level skip list. The two levels of index data are stored sequentially in array T[N]. Array Id[N] stores, for each data item in array T[N], its index address in the next level. When querying data, the first-level index table Level1 is searched first to locate a position in the second-level index table Level2, and the query result from Level2 is then used to find the exact position of the data in the data list.
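A two-level layout of the kind Fig. 2 describes can be sketched in Python as follows. The concrete keys, the helper name `build_arrays`, the `starts` list of level start addresses, and the `math.inf` sentinel that closes each level are illustrative assumptions and do not come from the patent. The sketch also assumes that addresses are plain array indices and that every key of a level also appears in the level below it.

```python
import math

def build_arrays(levels, data):
    """Flatten index levels (top level first) into arrays T and Id.

    Assumptions (not from the patent): each level is a sorted list of keys,
    every key of a level also occurs in the level below it (or in `data`
    for the last level), and a math.inf sentinel closes each level.
    """
    T, Id, starts = [], [], []
    for level in levels:
        starts.append(len(T))                 # start address of this level in T
        T.extend(level + [math.inf])
    for depth, level in enumerate(levels):
        last = depth == len(levels) - 1
        below = data if last else levels[depth + 1]
        base = 0 if last else starts[depth + 1]
        for key in level:
            # address of the same key in the next level, or in the data list
            Id.append(base + below.index(key))
        Id.append(-1)                         # sentinel entry, never followed
    return T, Id, starts

data = [1, 3, 5, 7, 9, 11, 13, 15]            # ordered data list
level1 = [1, 9]                               # first (top) index level
level2 = [1, 5, 9, 13]                        # second (last) index level
T, Id, starts = build_arrays([level1, level2], data)
print(T)
print(Id)
```

Here T holds Level1 followed by Level2 in one contiguous list, and Id holds, position by position, where each entry of T continues in the next level or in the data list.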

It should be noted that the variable index records the current index search address in real time. A skip list query starts from the first-level index data, so the initial value of index should be the start address of the first-level index data in array T[N]. The index search address index is therefore initialized to 0.

It should be noted that the present invention stores the index data in a contiguous space, so that a query over the index data of each level of the skip list improves the cache hit rate during data queries; reading contiguous data with SIMD instructions improves the utilization of the processor's memory bandwidth.

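Further, step S1 specifically comprises:

storing the index data in array T[N], wherein the index data of each level of the skip list is stored sequentially in array T[N];

storing the index addresses in array Id[N], wherein array Id[N] stores the address in the next level of each corresponding index data item in array T[N].

It should be noted that storing all index data sequentially in one contiguous space helps improve the cache hit rate during data queries.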

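Further, as shown in Fig. 3, step S3 specifically comprises the following steps:

S31, calculate mod = index % K, where K is the data width read by a SIMD instruction, i.e. the number of data items read at once;

S32, determine whether mod is 0; if so, perform step S36, otherwise perform step S33;

S33, read the data T[index] at index index from array T[N] and denote it as variable H;

S34, determine whether H >= M holds; if so, perform step S313, otherwise perform step S35;

S35, update the values of variables mod and index by performing mod = mod - 1 and index = index + 1, then perform step S32;

S36, using a SIMD instruction, read K data items from array T[N] starting at index index and denote them as array H[K];

S37, initialize variable k to 0;

S38, determine whether H[k] >= M holds; if so, perform step S312, otherwise perform step S39;

S39, update the value of variable k by performing k = k + 1;

S310, determine whether k >= K holds; if so, perform step S311, otherwise perform step S38;

S311, update the value of variable index by performing index = index + K, then perform step S36;

S312, update the value of index by performing index = index + k, then perform step S313;

S313, the flow ends.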

Further, step S4 specifically comprises:

according to the obtained address index, obtaining the updated address index from array Id[N];

determining whether the updated index is the address of index data; if so, performing step S3, otherwise performing step S5.

The specific flow is as follows:

(1) Determine whether index is the start address of the index data of the current level; if so, jump to sub-step (4); otherwise jump to sub-step (2);

(2) Determine whether T[index] equals M; if so, jump to sub-step (4); otherwise jump to sub-step (3);

(3) Update the value of index by performing index = index - 1;

(4) Determine whether the current level is the last level of index data; if so, jump to sub-step (5); otherwise jump to sub-step (6);

(5) Update the value of index by performing index = Id[index]; the resulting index is an address in the data list. Jump to step S5 to query the data in the data list;

(6) Update the value of index by performing index = Id[index]; the resulting index is the start search address in the index data of the next level. Jump to step S3 to continue searching in the next level of index data.
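The interplay of steps S2 to S4, including sub-steps (1) to (6), can be sketched in Python as follows. The sketch rests on several assumptions that the text does not spell out: the function name `descend` and the `starts` list of level start addresses are hypothetical, each level is closed by a `math.inf` sentinel, each level begins with the smallest key, and step S3 is written as a plain scalar scan for brevity.

```python
import math

def descend(T, Id, starts, M):
    """Steps S2-S4: walk the index levels and return an address in the data list."""
    index = 0                                   # S2
    level = 0
    while True:
        while T[index] < M:                     # S3, scalar form for brevity
            index += 1
        # Sub-steps (1)-(3): step back by one entry unless index is the
        # start of the current level or T[index] is an exact match.
        if index != starts[level] and T[index] != M:
            index -= 1
        last = level == len(starts) - 1         # sub-step (4)
        index = Id[index]                       # sub-steps (5) and (6)
        if last:
            return index                        # hand over to step S5
        level += 1                              # back to step S3 in the next level

# Hypothetical two-level example; addresses are array indices.
T = [1, 9, math.inf, 1, 5, 9, 13, math.inf]     # Level1 then Level2, with sentinels
Id = [3, 5, -1, 0, 2, 4, 6, -1]                 # next-level or data-list addresses
starts = [0, 3]                                 # start address of each level in T
print(descend(T, Id, starts, 7))
```

Stepping back in sub-step (3) moves index to the last entry smaller than M, so the search in the next level starts no later than the position where M could appear.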

Further, in step S5, the index address index obtained by searching for data M in the last level of index data of the skip list is an index address in the data list. Starting from this index address index, the data list is traversed forward, and the address is returned once the position of M is found. If M is not in the data list, the address -1 is returned. The specific flow is as follows:

a, Determine whether the data pointed to by index is M; if so, jump to sub-step d; otherwise jump to sub-step b;

b, Determine whether the data pointed to by index is greater than M; if so, update the value of index to -1 and jump to sub-step d; otherwise jump to sub-step c;

c, Update the value of index to the address of the next data item in the data list, then jump to sub-step a;

d, Return the address index, which is the address of data M in the data list; the flow ends.
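Sub-steps a to d can be sketched in Python as follows. The data list is modeled here as a Python list whose addresses are list indices, so the "next data item" of sub-step c is simply index + 1; this modeling, the function name `find_in_list`, and the end-of-list check are assumptions added for the sketch.

```python
def find_in_list(data, index, M):
    """Sub-steps a-d of step S5: return the address of M in data, or -1."""
    while index < len(data):          # end-of-list check is an added assumption
        if data[index] == M:          # a: found M
            return index              # d
        if data[index] > M:           # b: passed the place where M would be
            return -1                 # d
        index += 1                    # c: move to the next data item
    return -1                         # ran off the end of the list: M is absent

data = [1, 3, 5, 7, 9, 11, 13, 15]    # hypothetical ordered data list
print(find_in_list(data, 2, 7))       # start from address 2, look for 7
print(find_in_list(data, 2, 8))       # 8 is not stored in the list
```

Because the list is ordered, the traversal can stop at the first item greater than M, which is what sub-step b relies on.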

It should be noted that the data query optimization method applied to a skip list data structure disclosed in this embodiment has the following beneficial effects:

(1) High processor cache hit rate:

The present invention stores the index data of the skip list in an array occupying a contiguous space. When the skip list is queried level by level, the index data of each level is stored contiguously. The processor cache holds contiguous data blocks that have been used recently and frequently. Reading contiguous data in this way greatly improves the cache hit rate during data queries.

(2) High memory bandwidth utilization:

The present invention stores the index data in a contiguous space, and a skip list data query is a query over the index data of each level of the skip list. A SIMD instruction can read multiple contiguous data items in one instruction, making full use of the memory bandwidth provided by the processor. Thus, when querying the index data of each level, SIMD instructions can be used to read contiguous data, improving the utilization of the processor's memory bandwidth.

The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A data query optimization method applied to a skip list data structure, characterized by comprising:

S1, creating a skip list with a total of N index data items and n levels, storing the index data of each level of the skip list sequentially in an array T[N], and storing in an array Id[N] the address of each corresponding index data item in the next level;

S2, initializing the query start address index of the first-level index data in the skip list to 0;

S3, reading index data from array T[N] starting at address index, comparing the index data one by one with the search data M, finding the first index data item greater than or equal to M, and obtaining the address index of that item;

S4, according to the obtained address index, obtaining an updated address index from array Id[N], and then continuing the search in array T[N] or in the data list according to the updated address index;

S5, in the data list, traversing forward along the list from address index, and returning the address once the position of M is found.
2. The method according to claim 1, characterized in that step S1 specifically comprises:

storing the index data in array T[N], wherein the index data of each level of the skip list is stored sequentially in array T[N];

storing the index addresses in array Id[N], wherein array Id[N] stores the address in the next level of each corresponding index data item in array T[N].
3. The method according to claim 1, characterized in that step S3 specifically comprises:

S31, calculate mod = index % K, where K is the data width read by a SIMD instruction, i.e. the number of data items read at once;

S32, determine whether mod is 0; if so, perform step S36, otherwise perform step S33;

S33, read the data T[index] at index index from array T[N] and denote it as variable H;

S34, determine whether H >= M holds; if so, perform step S313, otherwise perform step S35;

S35, update the values of variables mod and index by performing mod = mod - 1 and index = index + 1, then perform step S32;

S36, using a SIMD instruction, read K data items from array T[N] starting at index index and denote them as array H[K];

S37, initialize variable k to 0;

S38, determine whether H[k] >= M holds; if so, perform step S312, otherwise perform step S39;

S39, update the value of variable k by performing k = k + 1;

S310, determine whether k >= K holds; if so, perform step S311, otherwise perform step S38;

S311, update the value of variable index by performing index = index + K, then perform step S36;

S312, update the value of index by performing index = index + k, then perform step S313;

S313, the flow ends.
4. The method according to claim 1, characterized in that step S4 specifically comprises:

according to the obtained address index, obtaining the updated address index from array Id[N];

determining whether the updated index is the address of index data in the skip list; if so, performing step S3, otherwise performing step S5.