CN107679212A - A data query optimization method applied to the skip list data structure
- Publication number
- CN107679212A (application number CN201710968153.9A)
- Authority
- CN
- China
- Prior art keywords
- index
- data
- array
- list
- address
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
Abstract
The present invention discloses a data query optimization method applied to the skip list data structure, in the technical field of data indexing. The method comprises: creating a skip list with N index entries in total and n index levels, using array T[N] to store the index data of every level of the skip list sequentially and array Id[N] to store the address of each corresponding index entry in the next level; initializing the query start address index of the first-level index data in the skip list to 0; reading index data from array T[N] starting at address index, comparing each entry in turn with the search key M, finding the first index entry greater than or equal to M, and taking its address as index; using that address index to obtain an updated address index from array Id[N], and continuing the search at the updated address in array T[N] or in the data list; and, in the data list, traversing forward along the list from address index and returning the address once the position of M is found. The method can improve the cache hit rate during data queries.
Description
Technical field
The present invention relates to the technical field of data indexing, and in particular to a data query optimization method applied to the skip list data structure.
Background
An in-memory database is a database that keeps all of its data in main memory and operates on it there. Compared with a database that uses disk as its primary storage medium, an in-memory database is chiefly characterized by high speed and high throughput. Current big data processing systems commonly use an in-memory database to cache data and so improve the processing performance of frequently used data.
A skip list is a data structure that adds index data to an ordered list in a randomized manner. With this additional index data, a search can quickly skip over parts of the list, which speeds up lookup. Skip lists are simple in structure and efficient, and as an index data structure they are widely used in mainstream in-memory databases such as Redis and MemSQL.
The skip list lookup algorithm begins by searching the index data of the top level, then uses the address of the index entry it finds to search the index data of the next level down. After the search of the last index level finishes, it searches the ordered list starting from the last index address obtained. However, because the index data is not stored contiguously, lookups often suffer from a high cache miss rate and low memory bandwidth utilization, which limits the performance of existing skip list lookup algorithms.
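For reference, the conventional pointer-based lookup described above can be sketched as follows. This is only an illustrative Python sketch, not part of the original disclosure: the `Node` class, the `build_level` helper, and the sample keys are assumptions, and the levels are built by hand rather than by randomization. Each step follows a `right` or `down` pointer to a separately allocated node, which is the non-contiguous access pattern the background criticizes.

```python
class Node:
    """One skip list node; `right` links within a level, `down` links to the level below."""
    def __init__(self, key, right=None, down=None):
        self.key, self.right, self.down = key, right, down


def build_level(keys, below=None):
    """Link `keys` left to right; `below` maps a key to its node one level down (hypothetical helper)."""
    nodes = [Node(k, down=below[k] if below else None) for k in keys]
    for a, b in zip(nodes, nodes[1:]):
        a.right = b
    return nodes[0], {n.key: n for n in nodes}


def search(head, M):
    """Start at the top level, move right while the next key is <= M, then drop down one level."""
    node = head
    while True:
        while node.right is not None and node.right.key <= M:
            node = node.right
        if node.down is None:                 # reached the ordered data list
            return node if node.key == M else None
        node = node.down


# Hand-built example: the bottom level is the ordered data list itself.
bottom_head, bottom = build_level([3, 7, 12, 18, 25, 31, 40, 52])
mid_head, mid = build_level([3, 12, 25, 40], bottom)
top_head, _ = build_level([3, 25], mid)
```

Calling `search(top_head, 18)` walks 3 (top), 3 and 12 (middle), then 12 and 18 (bottom); every hop dereferences a pointer to a different object.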
Summary of the invention
The object of the present invention is to provide a data query optimization method applied to the skip list data structure, so as to improve the cache hit rate during data queries.
To achieve this object, the present invention adopts the following technical solution: a data query optimization method applied to the skip list data structure, comprising:
S1: create a skip list with N index entries in total and n levels, using array T[N] to store the index data of every level of the skip list sequentially and array Id[N] to store the address of each corresponding index entry in the next level;
S2: initialize the query start address index of the first-level index data in the skip list to 0;
S3: read index data from array T[N] starting at address index, compare each entry in turn with the search key M, find the first index entry greater than or equal to M, and take its address as index;
S4: using the address index thus obtained, obtain an updated address index from array Id[N], then continue the search at the updated address in array T[N] or in the data list;
S5: in the data list, traverse forward along the list from address index and return the address once the position of M is found.
Step S1 specifically comprises:
storing the index data in array T[N], with the index data of each level of the skip list stored sequentially in T[N];
storing the index addresses in array Id[N], where Id[N] holds, for each index entry in T[N], its address in the next level.
Step S3 specifically comprises:
S31: compute mod = index % K, where K is the number of data items read by one SIMD instruction;
S32: determine whether mod is 0; if so, go to step S36; otherwise go to step S33;
S33: read the entry T[index] from array T[N] and denote it as variable H;
S34: determine whether H >= M; if so, go to step S313; otherwise go to step S35;
S35: update mod and index by performing mod = mod - 1 and index = index + 1, then go to step S32;
S36: using a SIMD instruction, read K entries from array T[N] starting at index, denoted as array H[K];
S37: initialize variable k to 0;
S38: determine whether H[k] >= M; if so, go to step S312; otherwise go to step S39;
S39: update k by performing k = k + 1;
S310: determine whether k >= K; if so, go to step S311; otherwise go to step S38;
S311: update index by performing index = index + K, then go to step S36;
S312: update index by performing index = index + k, then go to step S313;
S313: the flow ends.
Step S4 specifically comprises:
using the address index thus obtained, obtaining an updated address index from array Id[N];
determining whether the updated index is the address of index data in the skip list; if so, performing step S3; if not, performing step S5.
Compared with the prior art, the present invention has the following technical effects. The index data is stored in contiguous memory and each level of the skip list's index data is queried there, which can improve the cache hit rate during data queries. A SIMD instruction can read multiple contiguous data items in a single instruction, making full use of the memory bandwidth the processor provides. Thus, when querying each index level, SIMD instructions can be used to read contiguous data, improving the utilization of the processor's memory bandwidth.
Brief description of the drawings
Embodiments of the present invention are described in detail below with reference to the accompanying drawings:
Fig. 1 is a schematic flow diagram of the data query optimization method applied to the skip list data structure according to the present invention;
Fig. 2 is a schematic diagram of the skip list index data storage structure according to the present invention;
Fig. 3 is a schematic flow diagram of searching the index data in the skip list according to the present invention.
Detailed description of the embodiments
To further illustrate the features of the present invention, refer to the following detailed description and the accompanying drawings. The drawings are provided for reference and explanation only and are not intended to limit the scope of protection of the present invention.
As shown in Fig. 1, this embodiment provides a data query optimization method applied to the skip list data structure, comprising the following steps:
S1: create a skip list with N index entries in total and n levels, using array T[N] to store the index data of every level of the skip list sequentially and array Id[N] to store the address of each corresponding index entry in the next level;
It should be noted that the skip list is created to store the index data of the data list. The skip list is set to have n levels, and the total number of index entries across all levels is N. Arrays T[N] and Id[N] store the index data and the index addresses respectively. The index data of each level of the skip list is stored sequentially in array T[N]. Array Id[N] stores, for each index entry, its address in the next level. For index entries in the last level, array Id[N] stores their addresses in the data list.
Fig. 2 shows the index data storage structure of a 2-level skip list. The T[N] array stores the index data of the 2 levels sequentially. Id[N] stores the index address in the next level of the corresponding entry in the T[N] array. A query first searches the first-level index table Level1 to locate its position in the second-level index table Level2, and then uses the query result from Level2 to find the exact position of the data in the data list.
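The storage layout just described can be illustrated with a small Python sketch. It is not taken from the original disclosure: the function name `build_arrays`, the `starts` array of level offsets, the sample keys, and the use of list positions as "addresses" in the data list are all assumptions made for illustration. The sketch also assumes each level is a sorted subset of the level below it.

```python
def build_arrays(layers, data):
    """Flatten index levels (top first) into T and Id.

    T holds every level's keys back to back. For each entry, Id holds the
    position in T of the same key one level down or, for the last level,
    the position of the key in the data list. `starts[i]` is the offset of
    level i in T and `starts[-1]` equals len(T) (hypothetical bookkeeping).
    """
    T, Id, starts = [], [], []
    offset = 0
    for keys in layers:
        starts.append(offset)
        offset += len(keys)
    starts.append(offset)

    for depth, keys in enumerate(layers):
        T.extend(keys)
        if depth + 1 < len(layers):
            below, base = layers[depth + 1], starts[depth + 1]
            Id.extend(base + below.index(k) for k in keys)
        else:
            Id.extend(data.index(k) for k in keys)
    return T, Id, starts


# A 2-level example in the spirit of Fig. 2 (values are illustrative only).
data = [3, 7, 12, 18, 25, 31, 40, 52]          # ordered data list
layers = [[3, 25], [3, 12, 25, 40]]            # Level1, Level2
T, Id, starts = build_arrays(layers, data)
# T      == [3, 25, 3, 12, 25, 40]
# Id     == [2, 4, 0, 2, 4, 6]
# starts == [0, 2, 6]
```

Here `Id[1] == 4` because key 25 of Level1 sits at position 4 of `T` in Level2, and `Id[3] == 2` because key 12 of Level2 sits at position 2 of the data list.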
S2: initialize the query start address index of the first-level index data in the skip list to 0;
It should be noted that the variable index records the current search address in real time. A skip list query starts from the first-level index data. The initial value of index should be the start address of the skip list's first-level index data in array T[N], so the search address index is initialized to 0.
S3: read index data from array T[N] starting at address index, compare each entry in turn with the search key M, find the first index entry greater than or equal to M, and take its address as index;
S4: using the address index thus obtained, obtain an updated address index from array Id[N], then continue the search at the updated address in array T[N] or in the data list;
S5: in the data list, traverse forward along the list from address index and return the address once the position of M is found.
It should be noted that the present invention stores the index data in contiguous memory, so that queries over the index data of each level of the skip list achieve a higher cache hit rate during data queries; reading contiguous data with SIMD instructions can improve the utilization of the processor's memory bandwidth.
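Taken together, steps S2 to S5 can be sketched end to end as below. This is a hedged Python illustration rather than the patented implementation: the function name `query`, the `starts` array marking where each level begins in `T`, the sample arrays, and the modelling of data list addresses as list positions are assumptions. The sketch also assumes every level begins with the smallest key of the data list, and it uses a plain scalar scan for S3.

```python
def query(T, Id, starts, data, M):
    """Return the position of M in `data`, or -1 if M is absent."""
    index = starts[0]                              # S2: start of the first level
    for level in range(len(starts) - 1):
        begin, end = starts[level], starts[level + 1]
        # S3: first entry >= M at or after `index`, bounded by the level end
        while index < end and T[index] < M:
            index += 1
        # S4: unless the match is exact or we are at the level start,
        # step back to the last entry smaller than M, then follow Id down
        if index != begin and (index == end or T[index] != M):
            index -= 1
        index = Id[index]
    # S5: walk the ordered data list from the address obtained
    while index < len(data) and data[index] < M:
        index += 1
    return index if index < len(data) and data[index] == M else -1


# Illustrative 2-level layout (assumed values).
data = [3, 7, 12, 18, 25, 31, 40, 52]
T = [3, 25, 3, 12, 25, 40]      # Level1 = T[0:2], Level2 = T[2:6]
Id = [2, 4, 0, 2, 4, 6]
starts = [0, 2, 6]
```

For `M = 18`, the first level stops at 25, steps back to 3 and descends to `T[2]`; the second level stops at 25, steps back to 12 and descends to `data[2]`; the list walk then reaches 18 at position 3.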
Further, step S1 specifically comprises:
storing the index data in array T[N], with the index data of each level of the skip list stored sequentially in T[N];
It should be noted that storing all index data sequentially in one contiguous region of memory helps improve the cache hit rate during data queries.
storing the index addresses in array Id[N], where Id[N] holds, for each index entry in T[N], its address in the next level.
Further, as shown in Fig. 3, step S3 specifically comprises the following steps:
S31: compute mod = index % K, where K is the number of data items read by one SIMD instruction;
S32: determine whether mod is 0; if so, go to step S36; otherwise go to step S33;
S33: read the entry T[index] from array T[N] and denote it as variable H;
S34: determine whether H >= M; if so, go to step S313; otherwise go to step S35;
S35: update mod and index by performing mod = mod - 1 and index = index + 1, then go to step S32;
S36: using a SIMD instruction, read K entries from array T[N] starting at index, denoted as array H[K];
S37: initialize variable k to 0;
S38: determine whether H[k] >= M; if so, go to step S312; otherwise go to step S39;
S39: update k by performing k = k + 1;
S310: determine whether k >= K; if so, go to step S311; otherwise go to step S38;
S311: update index by performing index = index + K, then go to step S36;
S312: update index by performing index = index + k, then go to step S313;
S313: the flow ends.
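The two phases of step S3 (a scalar scan until index is K-aligned, followed by K-wide block reads) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented code: Python has no SIMD instructions, so a list slice stands in for one K-wide load; the `end` bound that stops the scan at the end of the current level is not part of the steps above; and the sketch re-tests `index % K` on each scalar step instead of maintaining the `mod` counter of S35. The function name `scan_layer` is hypothetical.

```python
def scan_layer(T, index, M, end, K=4):
    """Return the position of the first entry >= M in T[index:end], or `end` if none."""
    # Scalar phase (S32 to S35): advance one entry at a time until K-aligned.
    while index < end and index % K != 0:
        if T[index] >= M:                     # S34
            return index
        index += 1
    # Block phase (S36 to S312): examine K entries per load.
    while index < end:
        H = T[index:min(index + K, end)]      # stands in for one SIMD load (S36)
        for k, h in enumerate(H):             # S37 to S310
            if h >= M:
                return index + k              # S312
        index += K                            # S311
    return end


T = [3, 25, 3, 12, 25, 40]    # illustrative: Level1 = T[0:2], Level2 = T[2:6]
```

With `K = 4`, scanning Level2 from position 2 for `M = 18` checks `T[2]` and `T[3]` one at a time, reaches the aligned position 4, loads the block `[25, 40]`, and returns 4.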
Further, step S4 specifically comprises:
using the address index thus obtained, obtaining an updated address index from array Id[N];
determining whether the updated index is the address of index data; if so, performing step S3; if not, performing step S5.
The specific flow is as follows:
(1) Determine whether index is the start address of the current level's index data; if so, go to sub-step (4); if not, go to sub-step (2);
(2) Determine whether T[index] equals M; if so, go to sub-step (4); if not, go to sub-step (3);
(3) Update index by performing index = index - 1;
(4) Determine whether the current level is the last index level; if so, go to sub-step (5); if not, go to sub-step (6);
(5) Update index by performing index = Id[index]; the resulting index is an address in the data list. Go to step S5 to query the data in the data list;
(6) Update index by performing index = Id[index]; the resulting index is the start search address in the next level's index data. Go to step S3 to continue the search in the next level's index data.
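Sub-steps (1) to (6) amount to an optional one-entry step back followed by a lookup in Id. The Python sketch below illustrates this under assumptions that are not in the original text: the function name `step_down`, the explicit `layer_start` and `layer_end` bounds, the guard for a scan that ran to the end of a level, and the sample arrays are all hypothetical. Whether the returned address refers to T or to the data list depends on whether the current level is the last one, which the caller is assumed to track.

```python
def step_down(T, Id, index, M, layer_start, layer_end):
    """Given the position found by step S3, return the address to continue from."""
    at_start = index == layer_start                    # sub-step (1)
    exact = index < layer_end and T[index] == M        # sub-step (2), guarded
    if not at_start and not exact:
        index -= 1                                     # sub-step (3)
    return Id[index]                                   # sub-steps (5) and (6)


# Illustrative 2-level layout (assumed values).
T = [3, 25, 3, 12, 25, 40]      # Level1 = T[0:2], Level2 = T[2:6]
Id = [2, 4, 0, 2, 4, 6]
```

For `M = 18`, step S3 stops at position 1 (key 25) in Level1; `step_down(T, Id, 1, 18, 0, 2)` steps back to key 3 and returns 2, the start address for the search in Level2.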
Further, in step S5, the index address index obtained by searching for M in the last index level of the skip list is an index address in the data list. Starting from this address, the data list is traversed forward, and the address is returned once the position of M is found. If M is not in the data list, the address -1 is returned. The specific flow is as follows:
a. Determine whether the data pointed to by index is M; if so, go to sub-step d; if not, go to sub-step b;
b. Determine whether the data pointed to by index is greater than M; if so, set index to -1 and go to sub-step d; if not, go to sub-step c;
c. Update index to the address of the next data item in the data list, then go to sub-step a;
d. Return the address index, which is the address of M in the data list; the flow ends.
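Sub-steps a to d can be sketched as a short loop. In this illustrative Python version the data list is modelled as a sorted Python list whose positions serve as addresses, so "the address of the next data item" is simply `index + 1`; that modelling, the function name `find_in_list`, and the end-of-list check (which the sub-steps above do not spell out) are assumptions.

```python
def find_in_list(data, index, M):
    """Walk the ordered data list from `index`; return the position of M or -1."""
    while index < len(data):       # assumed guard for running off the end of the list
        if data[index] == M:       # sub-step a
            return index           # sub-step d
        if data[index] > M:        # sub-step b: passed the place where M would be
            return -1
        index += 1                 # sub-step c
    return -1


data = [3, 7, 12, 18, 25, 31, 40, 52]   # illustrative values
```

Starting at position 2, a search for 18 inspects 12 and then 18 and returns 3, while a search for 20 stops at 25 and returns -1.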
It should be noted that the data query optimization method applied to the skip list data structure disclosed in this embodiment has the following advantages:
(1) High processor cache hit rate:
The present invention stores the index data of the skip list in an array occupying one contiguous region of memory. As a query descends through the skip list level by level, the index data of each level is stored contiguously. A processor cache holds recently and frequently used contiguous blocks of data, so reading contiguous data in this way greatly improves the cache hit rate during data queries.
(2) High memory bandwidth utilization:
The present invention stores the index data in contiguous memory, and a skip list query is a query over the index data of each level of the skip list. A SIMD instruction can read multiple contiguous data items in a single instruction, making full use of the memory bandwidth the processor provides. Thus, when querying each index level, SIMD instructions can be used to read contiguous data, improving the utilization of the processor's memory bandwidth.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (4)
- 1. A data query optimization method applied to the skip list data structure, characterized by comprising:
  S1: creating a skip list with N index entries in total and n levels, using array T[N] to store the index data of every level of the skip list sequentially and array Id[N] to store the address of each corresponding index entry in the next level;
  S2: initializing the query start address index of the first-level index data in the skip list to 0;
  S3: reading index data from array T[N] starting at address index, comparing each entry in turn with the search key M, finding the first index entry greater than or equal to M, and taking its address as index;
  S4: using the address index thus obtained, obtaining an updated address index from array Id[N], then continuing the search at the updated address in array T[N] or in the data list;
  S5: in the data list, traversing forward along the list from address index and returning the address once the position of M is found.
- 2. The method of claim 1, characterized in that step S1 specifically comprises:
  storing the index data in array T[N], with the index data of each level of the skip list stored sequentially in T[N];
  storing the index addresses in array Id[N], where Id[N] holds, for each index entry in T[N], its address in the next level.
- 3. The method of claim 1, characterized in that step S3 specifically comprises:
  S31: computing mod = index % K, where K is the number of data items read by one SIMD instruction;
  S32: determining whether mod is 0; if so, going to step S36; otherwise going to step S33;
  S33: reading the entry T[index] from array T[N] and denoting it as variable H;
  S34: determining whether H >= M; if so, going to step S313; otherwise going to step S35;
  S35: updating mod and index by performing mod = mod - 1 and index = index + 1, then going to step S32;
  S36: using a SIMD instruction, reading K entries from array T[N] starting at index, denoted as array H[K];
  S37: initializing variable k to 0;
  S38: determining whether H[k] >= M; if so, going to step S312; otherwise going to step S39;
  S39: updating k by performing k = k + 1;
  S310: determining whether k >= K; if so, going to step S311; otherwise going to step S38;
  S311: updating index by performing index = index + K, then going to step S36;
  S312: updating index by performing index = index + k, then going to step S313;
  S313: ending the flow.
- 4. The method of claim 1, characterized in that step S4 specifically comprises:
  using the address index thus obtained, obtaining an updated address index from array Id[N];
  determining whether the updated index is the address of index data in the skip list; if so, performing step S3; if not, performing step S5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710968153.9A CN107679212A (en) | 2017-10-17 | 2017-10-17 | A data query optimization method applied to the skip list data structure |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710968153.9A CN107679212A (en) | 2017-10-17 | 2017-10-17 | A data query optimization method applied to the skip list data structure |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107679212A true CN107679212A (en) | 2018-02-09 |
Family
ID=61139699
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710968153.9A Pending CN107679212A (en) | 2017-10-17 | 2017-10-17 | A data query optimization method applied to the skip list data structure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107679212A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1152365A (en) * | 1994-06-06 | 1997-06-18 | 诺基亚电信公司 | Method for storing and retrieving data and memory arrangement |
CN1613066A (en) * | 2001-11-09 | 2005-05-04 | 瑞迪西斯迈克维尔通讯软件分公司 | Routing and forwarding table management for network processor architectures |
US20060047719A1 (en) * | 2004-08-30 | 2006-03-02 | Hywire Ltd. | Database storage and maintenance using row index ordering |
CN102682116A (en) * | 2012-05-14 | 2012-09-19 | 中兴通讯股份有限公司 | Method and device for processing table items based on Hash table |
CN103544300A (en) * | 2013-10-31 | 2014-01-29 | 云南大学 | Method for realizing extensible storage index structure in cloud environment |
CN104431497A (en) * | 2014-12-01 | 2015-03-25 | 许冠安 | Young parrot powder feed and preparation method thereof |
US20170091244A1 (en) * | 2015-09-24 | 2017-03-30 | Microsoft Technology Licensing, Llc | Searching a Data Structure |
CN106209645A (en) * | 2016-07-29 | 2016-12-07 | 北京邮电大学 | The initial lookup node of a kind of packet determines method and device |
Non-Patent Citations (1)
Title |
---|
MYSHELL: ""跳表skiplist"", 《HTTPS://SEGMENTFAULT.COM/A/1190000006024984》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108763413A (en) * | 2018-05-23 | 2018-11-06 | 唐山高新技术产业园区兴荣科技有限公司 | Data memory format and its data search localization method |
CN108763413B (en) * | 2018-05-23 | 2021-07-23 | 唐山高新技术产业园区兴荣科技有限公司 | Data searching and positioning method based on data storage format |
CN111046034A (en) * | 2018-10-12 | 2020-04-21 | 第四范式(北京)技术有限公司 | Method and system for managing memory data and maintaining data in memory |
CN111046034B (en) * | 2018-10-12 | 2024-02-13 | 第四范式(北京)技术有限公司 | Method and system for managing memory data and maintaining data in memory |
CN109344303A (en) * | 2018-11-30 | 2019-02-15 | 广州虎牙信息科技有限公司 | A kind of data structure switching method, device, equipment and storage medium |
CN109344303B (en) * | 2018-11-30 | 2020-12-29 | 广州虎牙信息科技有限公司 | Data structure switching method, device, equipment and storage medium |
CN112214503A (en) * | 2020-10-10 | 2021-01-12 | 深圳壹账通智能科技有限公司 | Data processing method and device, electronic equipment and storage medium |
CN112597152A (en) * | 2020-12-04 | 2021-04-02 | 国创新能源汽车智慧能源装备创新中心(江苏)有限公司 | Indexing method and indexing device for characteristic time sequence data based on skip list |
CN112597152B (en) * | 2020-12-04 | 2022-08-23 | 国创移动能源创新中心(江苏)有限公司 | Indexing method and indexing device for characteristic time sequence data based on skip list |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107679212A (en) | A kind of data query optimization method for being applied to jump list data structure | |
CN109376156B (en) | Method for reading hybrid index with storage awareness | |
US20190179752A1 (en) | Multi-level caching method and multi-level caching system for enhancing graph processing performance | |
US7558802B2 (en) | Information retrieving system | |
KR101467589B1 (en) | Dynamic fragment mapping | |
US8325721B2 (en) | Method for selecting hash function, method for storing and searching routing table and devices thereof | |
CN104809182B (en) | Based on the web crawlers URL De-weight method that dynamically can divide Bloom Filter | |
CN105975587B (en) | A kind of high performance memory database index organization and access method | |
CN104063330B (en) | Data prefetching method and device | |
CN104035925B (en) | Date storage method, device and storage system | |
EP2750053A1 (en) | Data storage program, data retrieval program, data retrieval apparatus, data storage method and data retrieval method | |
CN105468298B (en) | A kind of key assignments storage method based on log-structured merging tree | |
US20060271540A1 (en) | Method and apparatus for indexing in a reduced-redundancy storage system | |
CN102484610A (en) | Routing table construction method and device and routing table lookup method and device | |
CN106980656B (en) | A kind of searching method based on two-value code dictionary tree | |
CN101655861A (en) | Hashing method based on double-counting bloom filter and hashing device | |
CN109542339B (en) | Data layered access method and device, multilayer storage equipment and storage medium | |
CN106777003A (en) | A kind of search index method and system towards Key Value storage systems | |
CN107330094A (en) | The Bloom Filter tree construction and key-value pair storage method of dynamic memory key-value pair | |
JP2013502020A (en) | Method and device for improving scalability of longest prefix match | |
CN109245879A (en) | A kind of double hash algorithms of storage and lookup IP address mapping relations | |
CN106569963A (en) | Buffering method and buffering device | |
CN113468080B (en) | Caching method, system and related device for full-flash metadata | |
CN105359142A (en) | Hash join method, device and database management system | |
CN109325022A (en) | A kind of data processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | ||
CB02 | Change of applicant information |
Address after: 230000 Yafu Park, Juchao Economic Development Zone, Chaohu City, Hefei City, Anhui Province Applicant after: ANHUI HUISHI JINTONG TECHNOLOGY Co.,Ltd. Address before: 102, room 602, C District, Hefei National University, Mount Huangshan Road, 230000 Hefei Road, Anhui, China Applicant before: ANHUI HUISHI JINTONG TECHNOLOGY Co.,Ltd. |
|
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20180209 |