CN106777343A - Incremental distributed index system and method - Google Patents

Incremental distributed index system and method

Info

Publication number
CN106777343A
Authority
CN
China
Prior art keywords
data
querying condition
index
secondary index
subregion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710028299.5A
Other languages
Chinese (zh)
Inventor
张韶峰
陈浪仙
陈贺巍
邹迎春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bai Rong (beijing) Financial Information Service Ltd By Share Ltd
Original Assignee
Bai Rong (beijing) Financial Information Service Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bai Rong (beijing) Financial Information Service Ltd By Share Ltd
Priority to CN201710028299.5A
Publication of CN106777343A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2228: Indexing structures
    • G06F16/2272: Management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

Embodiments of the invention provide an incremental distributed index system and method. The method includes: obtaining the data stored in an HBase database, together with the primary index value and attribute value of each data item; generating a secondary index Key for each data item; and generating a secondary index table from the secondary index Keys, so that queries retrieve from the secondary index table partition by partition, where the partitions are divided according to secondary index Key values.

Description

Incremental distributed index system and method
Technical field
The invention belongs to the field of computer software technology, and in particular relates to an incremental distributed index system and method.
Background art
As society develops, data volumes are growing rapidly, and much useful information can be obtained by analyzing massive data, so data mining for massive data has become a hot technology in recent years. To find the data that satisfies a condition more quickly at big-data scale, the most common current practice is to preprocess the data, create an index on it, and then query through the index.
An index is a collection of the values of one or more columns of a table, together with a logical list of pointers to the data pages that physically identify those values. When a value is indexed, it can be found by searching the index, and the row containing it can then be reached by following the pointer. Suppose we want to find the string "Aardvark" in the data shown in Fig. 1. Without an index, every DataPage must be scanned; with the index shown in Fig. 1, only the first and second IndexPage need to be scanned to find the row containing "Aardvark", which speeds up the query. An index therefore gives fast access to data and effectively improves the efficiency of queries that sort or group data. The schemes most commonly used today for storing data and creating indexes are Lucene and HBase:
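The difference between a full scan and an index lookup can be sketched in a few lines of Python. This is only an illustration of the general idea described above, not the structure of Fig. 1: the table contents and the names `rows`, `full_scan` and `index_lookup` are assumptions made for the example, and a sorted list with binary search stands in for the index pages.

```python
import bisect

# Hypothetical table: each row is (row_id, name), stored in no particular order.
rows = [(0, "Zebra"), (1, "Aardvark"), (2, "Moose"), (3, "Badger")]

def full_scan(rows, name):
    """Without an index: every row has to be inspected."""
    return [row_id for row_id, value in rows if value == name]

# The index: (value, row_id) pairs kept in sorted order, so a lookup can
# binary-search instead of reading every row.
index = sorted((value, row_id) for row_id, value in rows)

def index_lookup(index, name):
    """With an index: jump to the first entry for `name`, then follow the pointers."""
    pos = bisect.bisect_left(index, (name,))
    hits = []
    while pos < len(index) and index[pos][0] == name:
        hits.append(index[pos][1])
        pos += 1
    return hits
```

Both functions return the same row ids; the indexed version touches only the entries around the searched value.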
Lucene is an open-source project of the Apache Software Foundation that provides a Java API for full-text indexing and retrieval. Lucene consists of two parts, an index engine and a search engine. For a document (Document) containing multiple fields (Field), the index engine tokenizes the text content of the document fields and builds a keyword index. Once the index has been built, the search engine can run keyword-based queries against specific fields. Lucene supports several query types, including fuzzy queries and grouped queries. It ranks query results with a ranking algorithm based on the vector space model. Lucene's advantages are that it supports many kinds of query and that queries are fast. Its drawbacks are that it only supports updating a document as a whole, not partial document updates, and that creating and merging indexes is time-consuming.
HBase is a column-oriented distributed storage system built on HDFS; its processing and storage capacity can be increased by adding inexpensive PCs. A single HBase table can hold billions of rows and millions of columns. HBase supports column (family) oriented storage and access control, and columns (families) can be retrieved independently. The data in each cell can have multiple versions; by default the version number is assigned automatically and is the timestamp at which the cell was inserted. HBase is well suited to storing structured or semi-structured data and to highly concurrent key-value queries. Its drawback is that it only supports queries by row key and by row key range, and cannot query by column attribute.
Summary of the invention
In view of the evident shortcomings of the prior-art Lucene and HBase schemes, the purpose of the embodiments of the invention is to provide an effective and efficient incremental distributed index system and method.
To solve the above problem, an embodiment of the invention proposes an incremental distributed index method, including:
Step 1: obtain the data stored in an HBase database, together with the primary index value and attribute value of each data item; for each data item, generate a secondary index Key in the following format:
{start Key value}_{original attribute value}_{original Key value};
where the start Key value is the initial value of the primary index values of all data items, the original attribute value is the attribute value of the data item, and the original Key value is the primary index value of the data item.
Step 2: generate a secondary index table from the secondary index Key of each data item, so that queries retrieve from the secondary index table partition by partition; the partitions are divided according to secondary index Key values.
The secondary index Keys and the secondary index table are stored in the HBase database.
The method further includes:
Step 3: receive a query request expressed in SQL, and generate from it a query condition for the secondary index table of the HBase database; query the secondary index table according to the query condition and return the data that satisfies it, so that the client can collate, merge and output the data returned by each HBase Region.
Step 3 specifically includes:
Step 31: when a query is received, determine its type; if it is an aggregate (Count) export query, jump to Step 32; if it is a paging query against a single partition (Region), jump to Step 33.
Step 32: from the query condition of the aggregate export query, generate a query condition for each partition, so that each partition is queried with its own condition; then merge the data returned by the partitions and return the result; the procedure ends.
Step 33: for a paging query against a single partition, determine the partition targeted by the request; query the partition that satisfies the query condition and return the query result; the procedure ends.
A partition is queried as follows:
Step a: from the query condition and the id of the last record, regenerate the query condition and build the query condition syntax tree.
Step b: determine whether the number of results obtained so far has reached the required number; if so, return the original data corresponding to all row keys in the query result, and the procedure ends; otherwise jump to Step c.
Step c: obtain the next row key according to the query condition syntax tree and determine whether it is empty; if it is not empty, add it to the retrieval result and jump to Step b.
An embodiment of the invention also proposes an incremental distributed index system, including:
A secondary index Key generation module, configured to obtain the data stored in an HBase database, together with the primary index value and attribute value of each data item, and to generate a secondary index Key for each data item in the following format:
{start Key value}_{original attribute value}_{original Key value};
where the start Key value is the initial value of the primary index values of all data items, the original attribute value is the attribute value of the data item, and the original Key value is the primary index value of the data item.
A secondary index table generation module, configured to generate a secondary index table from the secondary index Key of each data item, so that queries retrieve from the secondary index table partition by partition; the partitions are divided according to secondary index Key values.
The secondary index Keys and the secondary index table are stored in the HBase database.
The system further includes:
A query module, configured to receive a query request expressed in SQL and generate from it a query condition for the secondary index table of the HBase database, and to query the secondary index table according to the query condition and return the data that satisfies it, so that the client can collate, merge and output the data returned by each HBase Region.
The query module performs the following operations:
Step 31: when a query is received, determine its type; if it is an aggregate (Count) export query, jump to Step 32; if it is a paging query against a single partition (Region), jump to Step 33.
Step 32: from the query condition of the aggregate export query, generate a query condition for each partition, so that each partition is queried with its own condition; then merge the data returned by the partitions and return the result; the procedure ends.
Step 33: for a paging query against a single partition, determine the partition targeted by the request; query the partition that satisfies the query condition and return the query result; the procedure ends.
A partition is queried as follows:
Step a: from the query condition and the id of the last record, regenerate the query condition and build the query condition syntax tree.
Step b: determine whether the number of results obtained so far has reached the required number; if so, return the original data corresponding to all row keys in the query result, and the procedure ends; otherwise jump to Step c.
Step c: obtain the next row key according to the query condition syntax tree and determine whether it is empty; if it is not empty, add it to the retrieval result and jump to Step b.
The beneficial effect of the above technical solution is as follows: it proposes an incremental distributed index system and method that builds a secondary index on top of the primary index of the HBase database, thereby supporting queries by Key and by Key value range and further improving retrieval performance.
Brief description of the drawings
Fig. 1 is an example of an index in the prior art;
Fig. 2a and Fig. 2b are an example of a typical primary index and secondary index;
Fig. 3 is a flow chart of the overall query process of an embodiment of the invention;
Fig. 4 is a flow chart of the server-side query.
Detailed description of the embodiments
To make the technical problem to be solved, the technical solution and the advantages of the invention clearer, a detailed description is given below with reference to the drawings and specific embodiments.
For databases with large data volumes, complex query conditions and incrementally updated data, neither of the existing Lucene and HBase schemes meets the requirements well; the embodiment of the invention therefore uses a scheme of HBase + secondary index + SQL.
An embodiment of the invention proposes an incremental distributed index method, including:
Step 1: obtain the data stored in an HBase database, together with the primary index value and attribute value of each data item; for each data item, generate a secondary index Key in the following format:
{start Key value}_{original attribute value}_{original Key value};
where the start Key value is the initial value of the primary index values of all data items, the original attribute value is the attribute value of the data item, and the original Key value is the primary index value of the data item.
Step 2: generate a secondary index table from the secondary index Key of each data item, so that queries retrieve from the secondary index table partition by partition; the partitions are divided according to secondary index Key values.
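Dividing the secondary index table into partitions by Key value can be illustrated with a small Python sketch. It assumes, as in HBase, that each partition (Region) owns a contiguous, sorted range of keys; the split points in `SPLIT_POINTS` and the function names are hypothetical and chosen only for this example.

```python
import bisect

# Hypothetical split points: region 0 holds keys below "001_B",
# region 1 holds ["001_B", "001_C"), region 2 holds keys from "001_C" upward.
SPLIT_POINTS = ["001_B", "001_C"]

def region_of(index_key, split_points=SPLIT_POINTS):
    """Return the 0-based number of the region whose key range contains index_key."""
    return bisect.bisect_right(split_points, index_key)

def partition(index_keys, split_points=SPLIT_POINTS):
    """Distribute secondary index keys over regions according to their key value."""
    regions = [[] for _ in range(len(split_points) + 1)]
    for key in sorted(index_keys):
        regions[region_of(key, split_points)].append(key)
    return regions
```

Because the attribute value sits near the front of the secondary index Key, entries with the same attribute value end up next to each other, typically in the same region, which is what lets a query be directed at specific partitions.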
The secondary index Keys and the secondary index table are stored in the HBase database.
The method further includes:
Step 3: receive a query request expressed in SQL, and generate from it a query condition for the secondary index table of the HBase database; query the secondary index table according to the query condition and return the data that satisfies it, so that the client can collate, merge and output the data returned by each HBase Region.
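How a SQL condition might be turned into a query condition on the secondary index table can be sketched as follows. The patent does not specify its SQL grammar or parser, so this Python example assumes a deliberately tiny grammar (a single equality condition) and a hypothetical function `sql_to_scan_range`; the start Key "001" and the `~` range terminator follow the worked example given later in the description.

```python
import re

# Toy grammar (an assumption): SELECT * FROM <table> WHERE <column> = '<value>'
_PATTERN = re.compile(
    r"^\s*select\s+\*\s+from\s+(\w+)\s+where\s+([\w:]+)\s*=\s*'([^']*)'\s*;?\s*$",
    re.IGNORECASE,
)

def sql_to_scan_range(sql, start_key="001"):
    """Translate an equality condition into a key range on the secondary index table."""
    match = _PATTERN.match(sql)
    if match is None:
        raise ValueError("unsupported query: " + sql)
    table, column, value = match.groups()
    prefix = f"{start_key}_{value}"
    # "~" sorts after "_" and after digits and letters in ASCII, so it closes the range.
    return {"table": table, "column": column,
            "start_row": prefix, "stop_row": prefix + "~"}
```

A real implementation would build a full syntax tree and handle AND/OR, ranges and aggregates; the sketch only shows the mapping from one attribute condition to one key range.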
Step 3 specifically includes:
Step 31: when a query is received, determine its type; if it is an aggregate (Count) export query, jump to Step 32; if it is a paging query against a single partition (Region), jump to Step 33.
Step 32: from the query condition of the aggregate export query, generate a query condition for each partition, so that each partition is queried with its own condition; then merge the data returned by the partitions and return the result; the procedure ends.
Step 33: for a paging query against a single partition, determine the partition targeted by the request; query the partition that satisfies the query condition and return the query result; the procedure ends.
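The dispatch in Steps 31 to 33 can be sketched in Python as below. The query representation (a dict with `type`, `start_row`, `stop_row`, `region` and `limit`) and the in-memory lists standing in for Regions are assumptions made for illustration; the patent does not define these structures.

```python
def matching_keys(region_keys, start_row, stop_row):
    """Index keys of one region that fall in [start_row, stop_row), in key order."""
    return [k for k in sorted(region_keys) if start_row <= k < stop_row]

def handle_query(query, regions):
    """Step 31: branch on the query type."""
    if query["type"] == "count":
        # Step 32: send the condition to every region, then merge the partial results.
        partial_counts = [
            len(matching_keys(region, query["start_row"], query["stop_row"]))
            for region in regions
        ]
        return sum(partial_counts)
    if query["type"] == "page":
        # Step 33: only the region named in the request is queried.
        region_keys = regions[query["region"]]
        hits = matching_keys(region_keys, query["start_row"], query["stop_row"])
        return hits[: query["limit"]]
    raise ValueError("unknown query type: " + str(query["type"]))
```

The point of the split is that an aggregate query must touch every partition, while a paging query can be answered by a single one.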
A partition is queried as follows:
Step a: from the query condition and the id of the last record, regenerate the query condition and build the query condition syntax tree.
Step b: determine whether the number of results obtained so far has reached the required number; if so, return the original data corresponding to all row keys in the query result, and the procedure ends; otherwise jump to Step c.
Step c: obtain the next row key according to the query condition syntax tree and determine whether it is empty; if it is not empty, add it to the retrieval result and jump to Step b.
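The loop formed by Steps a to c can be sketched in Python. The translated text is ambiguous about the exact branch conditions, so this sketch assumes the natural reading: the loop ends when the page is full or when no further row key is available. The function name `query_region`, its parameters, and the generator that stands in for the query condition syntax tree are all hypothetical; the sketch returns index keys and leaves out the final fetch of the original data.

```python
def query_region(index_keys, start_row, stop_row, last_key=None, limit=2):
    """Return up to `limit` index keys in [start_row, stop_row) that sort after last_key."""
    # Step a: regenerate the condition from the query range and the last record id,
    # so the scan resumes just after the previous page. The generator below is a
    # stand-in for walking the query condition syntax tree.
    def matches(key):
        in_range = start_row <= key < stop_row
        after_last = last_key is None or key > last_key
        return in_range and after_last

    candidates = (key for key in sorted(index_keys) if matches(key))
    results = []
    while True:
        # Step b: stop once the page holds the required number of results.
        if len(results) >= limit:
            return results
        # Step c: fetch the next matching row key; an empty result ends the scan.
        next_key = next(candidates, None)
        if next_key is None:
            return results
        results.append(next_key)
```

Passing the last key of one page as `last_key` for the next call yields the following page without rescanning the rows already returned.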
A specific example is described below with reference to Fig. 2a and Fig. 2b, which show a typical primary index and secondary index. Fig. 2a is a typical primary index table (Primary User Table) containing a row key column (rowkey) and an attribute column (cf1:col). As Fig. 2a shows, the original attribute of row key 001 is A, of row key 002 is B, of row key 003 is C, of row key 004 is A, of row key 005 is A, of row key 006 is B, and of row key 007 is B. Fig. 2b is the secondary index table (Secondary User Table), that is, the secondary index generated for the attribute column (cf1:col) of Fig. 2a. As stated above, the rule for generating a secondary index Key is:
{start key value}_{original attribute value}_{original key value}
As Fig. 2a shows, the start key value of every data item is 001, the original attribute value is the item's value in the attribute column (cf1:col), and the original key value is the item's value in the row key column (rowkey) of Fig. 2a. The secondary index generated for the table of Fig. 2a is therefore Fig. 2b.
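The key generation rule can be applied to the Fig. 2a data in a short Python sketch. The row keys and attribute values are taken from the description above; the function names, and the choice of the smallest row key as the start key, are assumptions made for the example (the smallest row key is 001, which matches the start key value stated in the text).

```python
# Primary table from the Fig. 2a example: rowkey -> value of attribute column cf1:col.
primary_table = {
    "001": "A", "002": "B", "003": "C", "004": "A",
    "005": "A", "006": "B", "007": "B",
}

def secondary_key(start_key, attr_value, row_key):
    """Build {start key value}_{original attribute value}_{original key value}."""
    return f"{start_key}_{attr_value}_{row_key}"

def build_secondary_index(table):
    # Assumption: the start key is the smallest rowkey in the table ("001" here).
    start_key = min(table)
    # Sorting mimics HBase, which stores rows ordered by row key.
    return sorted(secondary_key(start_key, attr, rk) for rk, attr in table.items())

secondary_index = build_secondary_index(primary_table)
```

After sorting, all entries for attribute A come first, then B, then C, so rows sharing an attribute value form one contiguous key range.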
As can be seen, the bottom layer of the embodiment is an HBase database, and secondary indexing plus SQL querying is implemented as a custom HBase plug-in. Because the bottom layer is HBase, the system can easily be scaled horizontally, and data can easily be updated through HBase PUT and DELETE operations. Creating secondary indexes on HBase column attributes ensures fast responses for queries that go through the secondary index. SQL statements are simple to use and flexible, and readily support complex conditional queries and aggregate statistical queries.
The principle of the secondary index is as follows: an index is built on an attribute of each data item in an HBase table, and the index is itself stored in the HBase database as a key, generated by the rule {start key value}_{original attribute value}_{original key value}. Because HBase supports lookups by key and range queries over keys, a range query over the index keys quickly locates the rows of the corresponding original data, which speeds up the query. In Fig. 2a, the Primary User Table is the original data table, which has the attribute cf1:col1. To query the data with cf1:col1=A before a secondary index has been created, all data in the original table must be scanned. Once the secondary index table has been created, it is only necessary to scan the index table over the key range 001_A to 001_A~ to obtain every rowkey that satisfies the condition.
With SQL parsing, user-group profiling can support more complex conditional queries, rather than only queries by key, key prefix and key range. SQL syntax is also widely known, so it is simple to use and easy to learn. To build the secondary index on top of HBase and to query data with SQL, custom HBase coprocessor and observer interfaces must be implemented. When a client sends a query request to HBase, HBase loads the custom plug-in implementation, parses the SQL into a SQL syntax tree, queries the secondary index, and returns the data that satisfies the condition; finally the client collates and merges the data returned by each HBase node and outputs it. The overall flow is shown in Fig. 3.
After the HBase plug-in receives a SQL query, it generates the query condition for the HBase secondary index from the query condition of the SQL statement; the secondary index yields all rowkeys that satisfy the condition, and the original data values can then easily be fetched by rowkey. This flow is shown in Fig. 4.
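The two-stage lookup described above (range scan over the index keys, then fetch by rowkey) can be sketched in Python. A sorted list with binary search stands in for the HBase index table, and a dict stands in for the original table; the names `scan_index` and `query_by_attribute` are hypothetical, and no real HBase client API is used.

```python
import bisect

# Original table (rowkey -> attribute value) and its secondary index keys, kept sorted.
primary_table = {
    "001": "A", "002": "B", "003": "C", "004": "A",
    "005": "A", "006": "B", "007": "B",
}
index_keys = sorted(f"001_{attr}_{rk}" for rk, attr in primary_table.items())

def scan_index(index_keys, start_row, stop_row):
    """Range scan: all index keys in [start_row, stop_row), located by binary search."""
    lo = bisect.bisect_left(index_keys, start_row)
    hi = bisect.bisect_left(index_keys, stop_row)
    return index_keys[lo:hi]

def query_by_attribute(value):
    """Scan the range 001_<value> to 001_<value>~, then fetch original rows by rowkey."""
    hits = scan_index(index_keys, f"001_{value}", f"001_{value}~")
    # The original rowkey is the last segment of each secondary index key.
    row_keys = [key.rsplit("_", 1)[1] for key in hits]
    return [(rk, primary_table[rk]) for rk in row_keys]
```

For attribute value A the scan covers only the three index entries in the range 001_A to 001_A~, instead of all seven rows of the original table.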
The above is the preferred embodiment of the invention. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the invention.

Claims (10)

1. An incremental distributed index method, characterized by comprising:
Step 1: obtaining the data stored in an HBase database, together with the primary index value and attribute value of each data item; and generating, for each data item, a secondary index Key in the following format:
{start Key value}_{original attribute value}_{original Key value};
wherein the start Key value is the initial value of the primary index values of all data items, the original attribute value is the attribute value of the data item, and the original Key value is the primary index value of the data item;
Step 2: generating a secondary index table from the secondary index Key of each data item, so that queries retrieve from the secondary index table partition by partition, wherein the partitions are divided according to secondary index Key values.
2. The incremental distributed index method according to claim 1, characterized in that the secondary index Keys and the secondary index table are stored in the HBase database.
3. The incremental distributed index method according to claim 1, characterized in that the method further comprises:
Step 3: receiving a query request expressed in SQL, and generating from it a query condition for the secondary index table of the HBase database; querying the secondary index table according to the query condition and returning the data that satisfies it, so that the client collates, merges and outputs the data returned by each HBase Region.
4. The incremental distributed index method according to claim 3, characterized in that Step 3 specifically comprises:
Step 31: when a query is received, determining its type; if it is an aggregate (Count) export query, jumping to Step 32; if it is a paging query against a single partition (Region), jumping to Step 33;
Step 32: from the query condition of the aggregate export query, generating a query condition for each partition, so that each partition is queried with its own condition; then merging the data returned by the partitions and returning the result, whereupon the procedure ends;
Step 33: for a paging query against a single partition, determining the partition targeted by the request; querying the partition that satisfies the query condition and returning the query result, whereupon the procedure ends.
5. The incremental distributed index method according to claim 4, characterized in that a partition is queried as follows:
Step a: from the query condition and the id of the last record, regenerating the query condition and building the query condition syntax tree;
Step b: determining whether the number of results obtained so far has reached the required number; if so, returning the original data corresponding to all row keys in the query result, whereupon the procedure ends; otherwise jumping to Step c;
Step c: obtaining the next row key according to the query condition syntax tree and determining whether it is empty; if it is not empty, adding it to the retrieval result and jumping to Step b.
6. An incremental distributed index system, characterized by comprising:
a secondary index Key generation module, configured to obtain the data stored in an HBase database, together with the primary index value and attribute value of each data item, and to generate, for each data item, a secondary index Key in the following format:
{start Key value}_{original attribute value}_{original Key value};
wherein the start Key value is the initial value of the primary index values of all data items, the original attribute value is the attribute value of the data item, and the original Key value is the primary index value of the data item;
a secondary index table generation module, configured to generate a secondary index table from the secondary index Key of each data item, so that queries retrieve from the secondary index table partition by partition, wherein the partitions are divided according to secondary index Key values.
7. The incremental distributed index system according to claim 6, characterized in that the secondary index Keys and the secondary index table are stored in the HBase database.
8. The incremental distributed index system according to claim 6, characterized in that the system further comprises:
a query module, configured to receive a query request expressed in SQL and generate from it a query condition for the secondary index table of the HBase database, and to query the secondary index table according to the query condition and return the data that satisfies it, so that the client collates, merges and outputs the data returned by each HBase Region.
9. The incremental distributed index system according to claim 8, characterized in that the query module is configured to perform the following operations:
Step 31: when a query is received, determining its type; if it is an aggregate (Count) export query, jumping to Step 32; if it is a paging query against a single partition (Region), jumping to Step 33;
Step 32: from the query condition of the aggregate export query, generating a query condition for each partition, so that each partition is queried with its own condition; then merging the data returned by the partitions and returning the result, whereupon the procedure ends;
Step 33: for a paging query against a single partition, determining the partition targeted by the request; querying the partition that satisfies the query condition and returning the query result, whereupon the procedure ends.
10. The incremental distributed index system according to claim 9, characterized in that a partition is queried as follows:
Step a: from the query condition and the id of the last record, regenerating the query condition and building the query condition syntax tree;
Step b: determining whether the number of results obtained so far has reached the required number; if so, returning the original data corresponding to all row keys in the query result, whereupon the procedure ends; otherwise jumping to Step c;
Step c: obtaining the next row key according to the query condition syntax tree and determining whether it is empty; if it is not empty, adding it to the retrieval result and jumping to Step b.
CN201710028299.5A 2017-01-16 2017-01-16 increment distributed index system and method Pending CN106777343A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710028299.5A CN106777343A (en) 2017-01-16 2017-01-16 increment distributed index system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710028299.5A CN106777343A (en) 2017-01-16 2017-01-16 increment distributed index system and method

Publications (1)

Publication Number Publication Date
CN106777343A true CN106777343A (en) 2017-05-31

Family

ID=58945647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710028299.5A Pending CN106777343A (en) 2017-01-16 2017-01-16 increment distributed index system and method

Country Status (1)

Country Link
CN (1) CN106777343A (en)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
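Xia Chaojun (夏超俊): "Research on Improving HBase Retrieval Speed Based on the Coprocessor Mechanism", Wanfang Data Knowledge Service Platform (《万方数据知识服务平台》) *
Cui Dan (崔丹) et al.: "A Method for Implementing HBase Secondary Indexes Based on Redis", Software (《软件》) *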

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273556A (en) * 2017-08-23 2017-10-20 上海点融信息科技有限责任公司 Blockchain data indexing method and device
CN108334613A (en) * 2018-02-07 2018-07-27 掌阅科技股份有限公司 Real-time arrangement method, computing device and storage medium based on massive user data
WO2019174558A1 (en) * 2018-03-13 2019-09-19 华为技术有限公司 Data indexing method and device
CN108829649A (en) * 2018-05-31 2018-11-16 西安交通大学 Method for implementing a complex-type coding sequence algorithm based on HBASE key-value index
CN108829649B (en) * 2018-05-31 2020-04-10 西安交通大学 Method for realizing complex type coding sequence algorithm based on HBASE key value index
CN110888870A (en) * 2018-09-11 2020-03-17 北京奇虎科技有限公司 Data storage table query method, partition server and electronic equipment
CN112148731A (en) * 2020-08-13 2020-12-29 新华三大数据技术有限公司 Data paging query method, device and storage medium
CN112148731B (en) * 2020-08-13 2022-05-27 新华三大数据技术有限公司 Data paging query method, device and storage medium
CN114625798A (en) * 2020-12-14 2022-06-14 金篆信科有限责任公司 Data retrieval method and device, electronic equipment and storage medium
CN112307753A (en) * 2020-12-29 2021-02-02 启业云大数据(南京)有限公司 Word segmentation method supporting large word stock, computer readable storage medium and system
CN113806376A (en) * 2021-11-09 2021-12-17 阿里云计算有限公司 Index construction method and device
CN113806376B (en) * 2021-11-09 2022-05-13 阿里云计算有限公司 Index construction method and device

Similar Documents

Publication Publication Date Title
CN106777343A (en) increment distributed index system and method
CN107463632B (en) Distributed NewSQL database system and data query method
Ma et al. Big graph search: challenges and techniques
US20100299367A1 (en) Keyword Searching On Database Views
CN106294695A (en) An implementation method for a big-data-oriented search engine
CN111506621A (en) Data statistical method and device
CN104391908B (en) Multi-keyword indexing method on graphs based on locality-sensitive hashing
US10372736B2 (en) Generating and implementing local search engines over large databases
CN102314464B (en) Lyrics searching method and lyrics searching engine
Lian et al. Keyword search over probabilistic RDF graphs
CN106484694B (en) Full-text search method and system based on a distributed database
CN106484815B (en) Automatic identification and optimization method for SQL-like retrieval scenarios over massive data
Braga et al. Joining the results of heterogeneous search engines
CN109446293A (en) A parallel high-dimensional nearest neighbor
CN101719162A (en) Multi-version open geographic information service access method and system based on fragment pattern matching
CN108804580B (en) Method for querying keywords in a federated RDF database
De Virgilio et al. Cluster-based exploration for effective keyword search over semantic datasets
WO2021248319A1 (en) Database management system and method for graph view selection for relational-graph database
CN105868406A (en) Multi-database based patent retrieval system
KR100434718B1 (en) Method and system for indexing document
Deng et al. LAF: a new XML encoding and indexing strategy for keyword‐based XML search
Song et al. Discussions on subgraph ranking for keyworded search
CN110019993B (en) Method for realizing sequencing optimization algorithm technology based on massive standard literature data
Ma et al. Matching query processing in high-dimensional space
KR20100067764A (en) Ontology based products information service system and method in e-commerce

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication; application publication date: 20170531