CN107679212A - A data query optimization method for skip list data structures - Google Patents

A data query optimization method for skip list data structures

Info

Publication number
CN107679212A
Authority
CN
China
Prior art keywords
index
data
array
list
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710968153.9A
Other languages
Chinese (zh)
Inventor
汪俊锋
刘罡
张巧云
戴平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Wisdom Gold Tong Technology Co Ltd
Original Assignee
Anhui Wisdom Gold Tong Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Wisdom Gold Tong Technology Co Ltd
Priority to CN201710968153.9A
Publication of CN107679212A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation

Abstract

The present invention discloses a data query optimization method for skip list data structures, belonging to the technical field of data indexing. The method comprises: creating a skip list with N index entries in total and n index levels, using array T[N] to store the index data of each level of the skip list sequentially and array Id[N] to store the address of each corresponding index entry in the next level; initializing the query start address index of the first-level index data in the skip list to 0; reading index data from array T[N] starting at address index, comparing the entries one by one with the search key M, finding the first entry greater than or equal to M, and taking the address of that entry as index; using the index address obtained to fetch an updated address index from array Id[N], and continuing the search at the updated address in array T[N] or in the data list; and, in the data list, traversing from address index toward the tail of the list and returning the address once the position of M is found. The method can improve the cache hit rate during data queries.

Description

A data query optimization method for skip list data structures
Technical field
The present invention relates to the technical field of data indexing, and in particular to a data query optimization method for skip list data structures.
Background
An in-memory database is a database that keeps all of its data in main memory and operates on it there. Compared with databases that use disk as the primary storage medium, in-memory databases are chiefly characterized by high speed and high throughput. Big data processing systems today commonly use an in-memory database to cache data and so improve the processing performance for frequently used data.
A skip list is a data structure that adds index data, in a randomized manner, on top of an ordered list. During a search of the list, these additional index entries make it possible to skip over parts of the list quickly and so speed up the lookup. Skip lists are structurally simple and efficient. As an index data structure, the skip list is widely used in mainstream in-memory databases such as Redis and MemSQL.
The skip list lookup algorithm starts by searching the index data at the top level, then uses the address of the index entry it finds to search the index data of the next level down, and so on until the search of the last index level finishes; it then searches the ordered list starting from the last index address obtained. However, because the index data is not stored contiguously, existing skip list lookup algorithms often suffer from a high cache miss rate and low memory bandwidth utilization, which degrades their performance.
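For contrast with the method proposed later, the conventional lookup described above can be sketched with pointer-linked nodes. This is a minimal illustration of the well-known skip list search, not code from the patent; the names `Node`, `build`, and `search` are hypothetical, and the node heights are fixed by hand rather than randomized so the example is reproducible.

```python
class Node:
    """One skip list node; forward[i] is the next node on level i (0 = bottom)."""

    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height


def build(keys_with_heights):
    """Link nodes for (key, height) pairs given in ascending key order."""
    max_height = max(h for _, h in keys_with_heights)
    head = Node(float("-inf"), max_height)
    last = [head] * max_height          # last node linked so far on each level
    for key, height in keys_with_heights:
        node = Node(key, height)
        for lvl in range(height):
            last[lvl].forward[lvl] = node
            last[lvl] = node
    return head


def search(head, target):
    """Start at the top level, move right while keys are smaller, then drop down."""
    node = head
    for lvl in reversed(range(len(head.forward))):
        while node.forward[lvl] is not None and node.forward[lvl].key < target:
            # Each hop follows a pointer to a separately allocated node,
            # which is the non-contiguous access pattern the text refers to.
            node = node.forward[lvl]
    node = node.forward[0]
    return node if node is not None and node.key == target else None


head = build([(3, 3), (7, 1), (12, 2), (19, 1), (25, 3), (31, 1), (40, 2), (52, 1)])
found = search(head, 31)      # node holding key 31
missing = search(head, 30)    # None, because 30 is not stored
```

Every step of `search` dereferences a pointer, so consecutive index entries may sit in unrelated memory locations.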
Summary of the invention
The object of the present invention is to provide a data query optimization method for skip list data structures that improves the cache hit rate during data queries.
To achieve this object, the present invention adopts the following technical solution: a data query optimization method for skip list data structures, comprising:
S1, create a skip list with N index entries in total and n levels, use array T[N] to store the index data of each level of the skip list sequentially, and use array Id[N] to store the address of each corresponding index entry in the next level;
S2, initialize the query start address index of the first-level index data in the skip list to 0;
S3, read index data from array T[N] starting at address index, compare the entries one by one with the search key M, find the first entry greater than or equal to M, and take the address of that entry as index;
S4, using the index address thus obtained, fetch an updated address index from array Id[N], then continue the search at the updated index address in array T[N] or in the data list;
S5, in the data list, traverse from address index toward the tail of the list, and return the address once the position of M is found.
Step S1 specifically comprises:
Storing the index data in array T[N], with the index data of each level of the skip list stored sequentially in T[N];
Storing the index addresses in array Id[N], where Id[N] holds, for each index entry in T[N], the address of that entry in the next level.
Step S3 specifically comprises:
S31, compute mod = index % K, where K is the number of data items read by one SIMD instruction;
S32, determine whether mod is 0; if so, perform step S36, otherwise perform step S33;
S33, read the entry T[index] at position index from array T[N] and store it in variable H;
S34, determine whether H >= M holds; if so, perform step S313, otherwise perform step S35;
S35, update the values of mod and index by performing mod = mod - 1 and index = index + 1, then perform step S32;
S36, use a SIMD instruction to read K entries from array T[N] starting at position index, and store them in array H[K];
S37, initialize variable k to 0;
S38, determine whether H[k] >= M holds; if so, perform step S312, otherwise perform step S39;
S39, update the value of k by performing k = k + 1;
S310, determine whether k >= K; if so, perform step S311, otherwise perform step S38;
S311, update the value of index by performing index = index + K, then perform step S36;
S312, update the value of index by performing index = index + k, then perform step S313;
S313, the flow ends.
Step S4 specifically comprises:
Using the index address obtained, fetch the updated address index from array Id[N];
Determine whether the updated index is the address of index data in the skip list; if so, perform step S3, otherwise perform step S5.
Compared with the prior art, the present invention has the following technical effects: the index data is stored in a contiguous space and a query proceeds through the index data of each level of the skip list, which can improve the cache hit rate during data queries. A SIMD instruction can read several contiguous data items in a single instruction, making full use of the memory bandwidth the processor provides. Thus, when the index data of each level is queried, SIMD instructions can be used to read contiguous data, improving the utilization of the processor's memory bandwidth.
Brief description of the drawings
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings:
Fig. 1 is a flow chart of a data query optimization method for skip list data structures according to the present invention;
Fig. 2 is a schematic diagram of the skip list index data storage structure according to the present invention;
Fig. 3 is a flow chart of searching index data in the skip list according to the present invention.
Detailed description of the embodiments
To further illustrate the features of the present invention, refer to the following detailed description and the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the scope of protection of the present invention.
As shown in Fig. 1, this embodiment provides a data query optimization method for skip list data structures, comprising the following steps:
S1, create a skip list with N index entries in total and n levels, use array T[N] to store the index data of each level of the skip list sequentially, and use array Id[N] to store the address of each corresponding index entry in the next level;
It should be noted that the skip list is created to store the index data of the data list. The skip list is set to have n levels, and the total number of index entries over all levels is N. Arrays T[N] and Id[N] store the index data and the index addresses respectively. The index data of each level of the skip list is stored sequentially in T[N]. Array Id[N] stores, for each index entry, its address in the next level. For the index data of the last level, Id[N] stores its address in the data list.
Fig. 2 depicts the index data storage structure of a 2-level skip list. The T[N] array stores the two levels of index data sequentially. Id[N] stores, for each entry in T[N], its index address in the next level. A query proceeds from the first-level index table Level1 to the second-level index table Level2, and then uses the query result from Level2 to find the exact position of the data in the data list.
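The layout described for Fig. 2 can be illustrated with a small sketch that flattens two index levels into `T` and `Id`. The sample keys, the helper name `build_index_arrays`, the `level_start` list, and the `INF` sentinel that closes each level are assumptions made for this illustration; the source does not say how the end of a level is marked, and here the data list is a plain Python list addressed by position.

```python
INF = float("inf")  # assumed end-of-level sentinel; not described in the source


def build_index_arrays(levels, data):
    """Flatten index levels (top level first) into T and Id.

    levels: sorted key lists; every key of a level also appears in the
            level below it, and last-level keys appear in data.
    data:   the sorted data list, addressed here by position.
    """
    # Start address of each level inside T (one extra slot per sentinel).
    level_start, pos = [], 0
    for keys in levels:
        level_start.append(pos)
        pos += len(keys) + 1

    T, Id = [], []
    for depth, keys in enumerate(levels):
        is_last = depth == len(levels) - 1
        below = data if is_last else levels[depth + 1]
        base = 0 if is_last else level_start[depth + 1]
        for key in keys:
            T.append(key)
            Id.append(base + below.index(key))  # address of the same key one level down
        T.append(INF)
        Id.append(-1)                           # the sentinel has no entry below it
    return T, Id, level_start


data = [3, 7, 12, 19, 25, 31, 40, 52]
levels = [[3, 25], [3, 12, 25, 40]]             # Level1, Level2
T, Id, level_start = build_index_arrays(levels, data)
# T           == [3, 25, INF, 3, 12, 25, 40, INF]
# Id          == [3, 5, -1, 0, 2, 4, 6, -1]
# level_start == [0, 3]
```

In this layout `Id[1] == 5` means that the Level1 entry 25 continues at `T[5]` in Level2, and `Id[5] == 4` means that the Level2 entry 25 sits at position 4 of the data list.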
S2, initialize the query start address index of the first-level index data in the skip list to 0;
It should be noted that variable index records the current index search address. A data query on the skip list starts from the first-level index data. The initial value of index should be the start address of the first-level index data of the skip list in array T[N], so the index search address index is initialized to 0.
S3, read index data from array T[N] starting at address index, compare the entries one by one with the search key M, find the first entry greater than or equal to M, and take the address of that entry as index;
S4, using the index address thus obtained, fetch an updated address index from array Id[N], then continue the search at the updated index address in array T[N] or in the data list;
S5, in the data list, traverse from address index toward the tail of the list, and return the address once the position of M is found.
It should be noted that the present invention stores the index data in a contiguous space; because a skip list query is a query over the index data of each level, this improves the cache hit rate during data queries. Reading contiguous data with SIMD instructions can also improve the utilization of the processor's memory bandwidth.
Further, step S1 specifically comprises:
Storing the index data in array T[N], with the index data of each level of the skip list stored sequentially in T[N];
It should be noted that storing all index data sequentially in one contiguous space helps improve the cache hit rate during data queries.
Storing the index addresses in array Id[N], where Id[N] holds, for each index entry in T[N], the address of that entry in the next level.
Further, as shown in Fig. 3, step S3 specifically comprises the following steps:
S31, compute mod = index % K, where K is the number of data items read by one SIMD instruction;
S32, determine whether mod is 0; if so, perform step S36, otherwise perform step S33;
S33, read the entry T[index] at position index from array T[N] and store it in variable H;
S34, determine whether H >= M holds; if so, perform step S313, otherwise perform step S35;
S35, update the values of mod and index by performing mod = mod - 1 and index = index + 1, then perform step S32;
S36, use a SIMD instruction to read K entries from array T[N] starting at position index, and store them in array H[K];
S37, initialize variable k to 0;
S38, determine whether H[k] >= M holds; if so, perform step S312, otherwise perform step S39;
S39, update the value of k by performing k = k + 1;
S310, determine whether k >= K; if so, perform step S311, otherwise perform step S38;
S311, update the value of index by performing index = index + K, then perform step S36;
S312, update the value of index by performing index = index + k, then perform step S313;
S313, the flow ends.
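Steps S31 to S313 can be traced with the following sketch. Python has no SIMD instructions, so a list slice of up to K entries stands in for the K-wide load of step S36; the function name `find_first_ge`, the default `K=4`, and the bounds checks that return `len(T)` when no entry qualifies are assumptions added for the illustration and are not part of the described flow.

```python
def find_first_ge(T, index, M, K=4):
    """Return the address of the first entry at or after index with T[entry] >= M.

    Returns len(T) if no such entry exists (guard added for this sketch).
    """
    mod = index % K                      # S31
    # S32-S35: scalar reads until mod has counted down to 0. As written in the
    # steps, the chunk loop then starts wherever index has reached.
    while mod != 0:
        if index >= len(T):              # guard, not in the source
            return len(T)
        H = T[index]                     # S33
        if H >= M:                       # S34
            return index                 # S313
        mod -= 1                         # S35
        index += 1

    # S36-S312: read K entries at a time and test them in order.
    while index < len(T):
        H = T[index:index + K]           # S36: stands in for one K-wide SIMD load
        for k, value in enumerate(H):    # S37-S310
            if value >= M:               # S38
                return index + k         # S312, then S313
        index += K                       # S311
    return len(T)                        # guard, not in the source


INF = float("inf")
T = [3, 25, INF, 3, 12, 25, 40, INF]     # hypothetical two-level layout with sentinels
top = find_first_ge(T, 0, 31)            # 2: the sentinel closing the first level
lower = find_first_ge(T, 5, 31)          # 6: the entry 40 in the second level
```

On hardware with SIMD support, the inner `for` loop would typically be replaced by a single vector comparison, but the sketch keeps the element-by-element test of steps S37 to S310.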
Further, step S4 specifically comprises:
Using the index address obtained, fetch the updated address index from array Id[N];
Determine whether the updated index is the address of index data; if so, perform step S3, otherwise perform step S5.
The specific flow is as follows:
(1) Determine whether index is the start address of the index data of the current level; if so, go to sub-step (4); if not, go to sub-step (2);
(2) Determine whether T[index] equals M; if so, go to sub-step (4); if not, go to sub-step (3);
(3) Update the value of index by performing index = index - 1, then continue with sub-step (4);
(4) Determine whether the current level is the last level of index data; if so, go to sub-step (5); if not, go to sub-step (6);
(5) Update the value of index by performing index = Id[index]; the resulting index is an address in the data list. Go to step S5 to query the data in the data list;
(6) Update the value of index by performing index = Id[index]; the resulting index is the start search address in the next level of index data. Go to step S3 to continue the search in the next level of index data.
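Sub-steps (1) to (6) can be sketched as a single function, with a small driver that alternates it with a plain linear scan standing in for step S3. The arrays below are a hypothetical 2-level layout; the `INF` sentinels, the `level_start` list, and the names `descend` and `locate` are assumptions for this illustration rather than details given in the source.

```python
INF = float("inf")  # assumed end-of-level sentinel; not described in the source

# Hypothetical layout: Level1 = [3, 25], Level2 = [3, 12, 25, 40],
# data list = [3, 7, 12, 19, 25, 31, 40, 52], addressed by position.
T = [3, 25, INF, 3, 12, 25, 40, INF]
Id = [3, 5, -1, 0, 2, 4, 6, -1]
level_start = [0, 3]                     # start address of each level inside T


def descend(level, index, M):
    """Apply sub-steps (1)-(6); return (next_step, level, index)."""
    # (1)-(3): step back to the preceding entry unless index is the start of
    # the level or the entry found equals M.
    if index != level_start[level] and T[index] != M:
        index -= 1
    # (4): is this the last level of index data?
    if level == len(level_start) - 1:
        return "S5", level, Id[index]    # (5): address in the data list
    return "S3", level + 1, Id[index]    # (6): start address in the next level


def locate(M):
    """Return the data list address where the traversal of step S5 would begin."""
    level, index = 0, 0                  # S2
    while True:
        while T[index] < M:              # plain scan in place of S3; relies on INF
            index += 1
        step, level, index = descend(level, index, M)
        if step == "S5":
            return index


start_for_31 = locate(31)                # 4: the position of 25 in the data list
```

For M = 31 the first level stops at its sentinel, steps back to the entry 25, and follows `Id[1]` to `T[5]`; the second level stops at 40, steps back to 25, and follows `Id[5]` to data list position 4.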
Further, step S5 works as follows: the index address index obtained by searching for M in the last level of index data of the skip list is an index address in the data list. Starting from this address index, the data list is traversed toward its tail, and the address is returned once the position of M is found. If M is not in the data list, the address -1 is returned. The specific flow is as follows:
a, Determine whether the data pointed to by index is M; if so, go to sub-step d; if not, go to sub-step b;
b, Determine whether the data pointed to by index is greater than M; if so, set index to -1 and go to sub-step d; if not, go to sub-step c;
c, Update the value of index to the address of the next data item in the data list, then go to sub-step a;
d, Return the address index, which is the address of data M in the data list, or -1 if M is not present; the flow ends.
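Sub-steps a to d amount to a bounded forward scan. In the sketch below the data list is modeled as a Python list whose positions serve as addresses, and the end-of-list check is an added guard; both choices, like the name `find_in_data_list`, are assumptions for illustration.

```python
def find_in_data_list(data, index, M):
    """Scan the sorted data list from index; return the address of M or -1."""
    while index < len(data):     # end-of-list guard, not in the source
        if data[index] == M:     # a: found M
            return index         # d
        if data[index] > M:      # b: passed the place where M would be
            return -1            # d, with index set to -1
        index += 1               # c: move to the next data item
    return -1


data = [3, 7, 12, 19, 25, 31, 40, 52]
hit = find_in_data_list(data, 4, 31)     # 5
miss = find_in_data_list(data, 4, 30)    # -1
```

Because the traversal starts from the address produced by the last index level, it normally touches only the few data items between two neighboring index entries.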
It should be noted that the data query optimization method for skip list data structures disclosed in this embodiment has the following advantages:
(1) High processor cache hit rate:
The present invention stores the index data of the skip list in an array occupying one contiguous space. As a query proceeds level by level through the skip list, the index data of each level is stored contiguously. A processor cache holds recently and frequently used blocks of contiguous data, so reading contiguous data in this way greatly improves the cache hit rate during data queries.
(2) High memory bandwidth utilization:
The present invention stores the index data in a contiguous space, and a skip list query is a query over the index data of each level. A SIMD instruction can read several contiguous data items in a single instruction, making full use of the memory bandwidth the processor provides. Thus, when the index data of each level is queried, SIMD instructions can be used to read contiguous data, improving the utilization of the processor's memory bandwidth.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (4)

  1. A data query optimization method for skip list data structures, characterized by comprising:
    S1, creating a skip list with N index entries in total and n levels, using array T[N] to store the index data of each level of the skip list sequentially, and using array Id[N] to store the address of each corresponding index entry in the next level;
    S2, initializing the query start address index of the first-level index data in the skip list to 0;
    S3, reading index data from array T[N] starting at address index, comparing the entries one by one with the search key M, finding the first entry greater than or equal to M, and taking the address of that entry as index;
    S4, using the index address thus obtained, fetching an updated address index from array Id[N], then continuing the search at the updated index address in array T[N] or in the data list;
    S5, in the data list, traversing from address index toward the tail of the list, and returning the address once the position of M is found.
  2. The method according to claim 1, characterized in that step S1 specifically comprises:
    Storing the index data in array T[N], with the index data of each level of the skip list stored sequentially in T[N];
    Storing the index addresses in array Id[N], where Id[N] holds, for each index entry in T[N], the address of that entry in the next level.
  3. The method according to claim 1, characterized in that step S3 specifically comprises:
    S31, computing mod = index % K, where K is the number of data items read by one SIMD instruction;
    S32, determining whether mod is 0; if so, performing step S36, otherwise performing step S33;
    S33, reading the entry T[index] at position index from array T[N] and storing it in variable H;
    S34, determining whether H >= M holds; if so, performing step S313, otherwise performing step S35;
    S35, updating the values of mod and index by performing mod = mod - 1 and index = index + 1, then performing step S32;
    S36, using a SIMD instruction to read K entries from array T[N] starting at position index, and storing them in array H[K];
    S37, initializing variable k to 0;
    S38, determining whether H[k] >= M holds; if so, performing step S312, otherwise performing step S39;
    S39, updating the value of k by performing k = k + 1;
    S310, determining whether k >= K; if so, performing step S311, otherwise performing step S38;
    S311, updating the value of index by performing index = index + K, then performing step S36;
    S312, updating the value of index by performing index = index + k, then performing step S313;
    S313, the flow ends.
  4. The method according to claim 1, characterized in that step S4 specifically comprises:
    Using the index address obtained, fetching the updated address index from array Id[N];
    Determining whether the updated index is the address of index data in the skip list; if so, performing step S3, otherwise performing step S5.
CN201710968153.9A 2017-10-17 2017-10-17 A data query optimization method for skip list data structures Pending CN107679212A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710968153.9A CN107679212A (en) 2017-10-17 2017-10-17 A data query optimization method for skip list data structures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710968153.9A CN107679212A (en) 2017-10-17 2017-10-17 A data query optimization method for skip list data structures

Publications (1)

Publication Number Publication Date
CN107679212A (en) 2018-02-09

Family

ID=61139699

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710968153.9A Pending CN107679212A (en) 2017-10-17 2017-10-17 A data query optimization method for skip list data structures

Country Status (1)

Country Link
CN (1) CN107679212A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1152365A (en) * 1994-06-06 1997-06-18 诺基亚电信公司 Method for storing and retrieving data and memory arrangement
CN1613066A (en) * 2001-11-09 2005-05-04 瑞迪西斯迈克维尔通讯软件分公司 Routing and forwarding table management for network processor architectures
US20060047719A1 (en) * 2004-08-30 2006-03-02 Hywire Ltd. Database storage and maintenance using row index ordering
CN102682116A (en) * 2012-05-14 2012-09-19 中兴通讯股份有限公司 Method and device for processing table items based on Hash table
CN103544300A (en) * 2013-10-31 2014-01-29 云南大学 Method for realizing extensible storage index structure in cloud environment
CN104431497A (en) * 2014-12-01 2015-03-25 许冠安 Young parrot powder feed and preparation method thereof
US20170091244A1 (en) * 2015-09-24 2017-03-30 Microsoft Technology Licensing, Llc Searching a Data Structure
CN106209645A (en) * 2016-07-29 2016-12-07 北京邮电大学 The initial lookup node of a kind of packet determines method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MYSHELL: ""跳表skiplist"", 《HTTPS://SEGMENTFAULT.COM/A/1190000006024984》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763413A (en) * 2018-05-23 2018-11-06 唐山高新技术产业园区兴荣科技有限公司 Data memory format and its data search localization method
CN108763413B (en) * 2018-05-23 2021-07-23 唐山高新技术产业园区兴荣科技有限公司 Data searching and positioning method based on data storage format
CN111046034A (en) * 2018-10-12 2020-04-21 第四范式(北京)技术有限公司 Method and system for managing memory data and maintaining data in memory
CN111046034B (en) * 2018-10-12 2024-02-13 第四范式(北京)技术有限公司 Method and system for managing memory data and maintaining data in memory
CN109344303A (en) * 2018-11-30 2019-02-15 广州虎牙信息科技有限公司 A kind of data structure switching method, device, equipment and storage medium
CN109344303B (en) * 2018-11-30 2020-12-29 广州虎牙信息科技有限公司 Data structure switching method, device, equipment and storage medium
CN112214503A (en) * 2020-10-10 2021-01-12 深圳壹账通智能科技有限公司 Data processing method and device, electronic equipment and storage medium
CN112597152A (en) * 2020-12-04 2021-04-02 国创新能源汽车智慧能源装备创新中心(江苏)有限公司 Indexing method and indexing device for characteristic time sequence data based on skip list
CN112597152B (en) * 2020-12-04 2022-08-23 国创移动能源创新中心(江苏)有限公司 Indexing method and indexing device for characteristic time sequence data based on skip list

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 230000 Yafu Park, Juchao Economic Development Zone, Chaohu City, Hefei City, Anhui Province

Applicant after: ANHUI HUISHI JINTONG TECHNOLOGY Co.,Ltd.

Address before: 102, room 602, C District, Hefei National University, Mount Huangshan Road, 230000 Hefei Road, Anhui, China

Applicant before: ANHUI HUISHI JINTONG TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20180209