CN107506464A - Method for implementing HBase secondary indexes based on ES - Google Patents
- Publication number
- CN107506464A (application number CN201710763058.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- secondary index
- index table
- row key
- query
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2255—Hash tables
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for implementing HBase secondary indexes based on ES, and relates to the field of big data technology. The method is: 1. according to the query service, build secondary index tables in ES on the relevant data columns, where one basic query service corresponds to one secondary index table and one complex query service corresponds to multiple secondary index tables; 2. when querying, first query the index table to obtain the row keys of the matching data, then query the data table by row key to obtain the data; 3. when the relevant columns of the data table are updated, update the corresponding secondary index tables at the same time. By introducing the ES distributed search engine, each data operation involves only a few Regions, which fundamentally reduces the load on the cluster, lightens the network traffic, lowers the dependence on high-performance servers, improves working efficiency and stability, offers good scalability, and has good practical value.
Description
Technical field
The present invention relates to the field of big data technology, and in particular to a method for implementing HBase secondary indexes based on ES.
Background art
With the arrival of the big data era, the volume of data in public security systems is growing geometrically. Massive data challenges the storage and retrieval performance of traditional database technology, and computing statistics over each dimension becomes correspondingly harder. The conventional approach today is to write MapReduce jobs or to use tools such as Hive and Pig; these perform full table scans, consume considerable cluster resources and network bandwidth, and are unsuitable for ultra-large data volumes. Upgrading physical hardware or optimizing code alone cannot keep pace with the growth of information and the demand for processing efficiency, so researchers have begun to explore new data statistics methods. How to solve this problem has become a difficult point.
HBase, a database running on the Hadoop platform, is a highly reliable, high-performance, column-oriented, scalable distributed storage system. HBase is an open-source NoSQL database suited to storing and managing all kinds of unstructured and semi-structured loosely organized data. With HBase database technology, a large-scale storage cluster can be built on inexpensive servers, which can meet the storage needs of public security big data. However, an HBase-based big data storage scheme does not fully solve the problem of efficient data retrieval. In practice, retrieval is often needed on a specific field or on a combination of several fields. Especially given the complex and flexible query requirements of public security big data, a single row key cannot always satisfy the query needs of the business, so a big data retrieval method that can satisfy them is urgently needed.
ES, in full ElasticSearch, makes it easy to build indexes over data. An index can be divided into multiple index shards (the number of shards can be specified by the user and defaults to 5), and the shards are then distributed evenly across all available nodes of the cluster, forming a distributed structure that lightens the load on any single node. In an ElasticSearch cluster, replicas can also be configured for each index shard (the number of replicas can likewise be specified by the user and defaults to 1); when an index shard fails, its replica can be used to recover the data promptly. ElasticSearch also has automatic node discovery and fast data recovery mechanisms: when a new node joins the cluster, ElasticSearch detects it and rebalances the load automatically, allocating data to the new node; when a node fails, it likewise automatically reallocates data among the available nodes.
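The shard and replica counts described above map onto the `number_of_shards` and `number_of_replicas` index settings of Elasticsearch. The following is a minimal sketch of the settings body that would be sent when creating an index; the helper name `index_settings` is hypothetical, and the defaults of 5 and 1 are simply the values quoted in the text (the shard default differs in later Elasticsearch releases).

```python
import json


def index_settings(shards: int = 5, replicas: int = 1) -> dict:
    """Build the settings body for creating an Elasticsearch index.

    The defaults mirror the values quoted in the text, not necessarily
    the defaults of any particular Elasticsearch version.
    """
    return {
        "settings": {
            "number_of_shards": shards,      # how many shards the index is split into
            "number_of_replicas": replicas,  # replicas kept for each shard
        }
    }


# JSON body that could be sent in an index-creation request.
body = json.dumps(index_settings())
```

Setting both values explicitly, rather than relying on defaults, keeps the index layout independent of the Elasticsearch version in use.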
Summary of the invention
The purpose of the present invention is to solve the above problems of the prior art by providing a method for implementing HBase secondary indexes based on ES.
The purpose of the present invention is achieved as follows:
Specifically, the method comprises the following steps:
1. According to the query service, build secondary index tables in ES on the relevant data columns; one basic query service corresponds to one secondary index table, and one complex query service corresponds to multiple secondary index tables;
A. Create the secondary index table in ES according to the operation type
For a selection query operation, the M data columns involved in the selection query are stored in M secondary index tables, one per column, where M is greater than or equal to 1. The row key R of each secondary index table consists of three parts, in order: QUALIFIER, VALUE and ROWKEY, where QUALIFIER is the identifier of the data column in the data table, VALUE is the value of the data column in the data table, and ROWKEY is the row key of the data table;
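The three-part row key for selection queries can be sketched as follows. This is only an illustration: the source does not say how the parts are joined, so the NUL separator `SEP` and both function names are assumptions.

```python
SEP = "\x00"  # assumed separator; the source does not specify one


def selection_index_key(qualifier: str, value: str, rowkey: str) -> str:
    """Compose the index row key R as QUALIFIER, VALUE, ROWKEY, in that order."""
    return SEP.join((qualifier, value, rowkey))


def parse_selection_index_key(key: str) -> tuple:
    """Split an index row key back into (QUALIFIER, VALUE, ROWKEY)."""
    qualifier, value, rowkey = key.split(SEP, 2)
    return qualifier, value, rowkey
```

Because QUALIFIER and VALUE come first, all entries for the same column value sort next to each other, which is what allows the range scan described in step 2.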
B. Generate secondary index entries from the data columns and insert them into the secondary index table
For a join query operation, the N data columns involved in the join query are stored in a single secondary index table, where N is greater than or equal to 2. The row key R of the secondary index table consists of three parts, in order: PREFIX, VALUE and QUALIFIER, where PREFIX is generated by a hash function and distinguishes the join query group, VALUE is the value of the data column in the data table, and QUALIFIER is the identifier of the data column in the data table;
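A possible shape for the join-query row key is sketched below. The source only states that PREFIX comes from a hash function; the choice of MD5, the 8-character truncation, the canonical sorting of the group's columns and the NUL separator are all assumptions made for illustration.

```python
import hashlib

SEP = "\x00"  # assumed separator; the source does not specify one


def join_group_prefix(group_columns) -> str:
    """Derive a PREFIX shared by every column of one join query group.

    Sorting the column identifiers first makes the prefix independent of
    the order in which the group's columns are listed (an assumption).
    """
    canonical = ",".join(sorted(group_columns))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()[:8]


def join_index_key(group_columns, value: str, qualifier: str) -> str:
    """Compose the index row key R as PREFIX, VALUE, QUALIFIER, in that order."""
    return SEP.join((join_group_prefix(group_columns), value, qualifier))
```

With this layout, entries of the same group that carry the same value are adjacent in key order, differing only in the trailing QUALIFIER.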
The value of the data column in the secondary index table is the ROWKEY of the corresponding data table; the value of the data column in the secondary index table and the row key R of the secondary index table together form one entry of the secondary index table. The secondary index table is created in ES, and the association between each data column and its secondary index table is stored in a metadata table. The row key of the metadata table consists of, in order: the table name of the data table, the column family name, the column name, the operation type of the secondary index table, and the timestamp. The value corresponding to the row key of the metadata table is: the operation type of the secondary index table and the name of the secondary index table;
The operation types of the secondary index table include: selection query operation and join query operation;
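The metadata table described above can be modeled in memory as a mapping from a five-part key to a two-part value. The sketch below is an assumption-laden illustration: the class and field names are hypothetical, and returning the entry with the newest timestamp on lookup is a design choice the source does not state.

```python
from typing import NamedTuple, Optional


class MetaKey(NamedTuple):
    table: str      # table name of the data table
    family: str     # column family name
    column: str     # column name
    op_type: str    # "select" or "join"
    timestamp: int


class MetaValue(NamedTuple):
    op_type: str
    index_table: str  # name of the secondary index table


class MetadataTable:
    """In-memory stand-in for the metadata table."""

    def __init__(self) -> None:
        self._rows = {}

    def register(self, key: MetaKey, index_table: str) -> None:
        self._rows[key] = MetaValue(key.op_type, index_table)

    def lookup(self, table: str, family: str, column: str, op_type: str) -> Optional[str]:
        """Return the index table name for a column and operation type, if any."""
        matches = [
            (k.timestamp, v.index_table)
            for k, v in self._rows.items()
            if k[:4] == (table, family, column, op_type)
        ]
        # Assumption: when several entries match, the newest one wins.
        return max(matches)[1] if matches else None
```

For example, registering `MetaKey("person", "cf", "city", "select", 1)` with index table `"idx_city"` lets a later selection query on `city` find `"idx_city"` by operation type.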
2. When querying, first query the index table to obtain the row keys of the matching data, then query the data table by row key to obtain the data;
A. Scan the secondary index table to obtain the row keys of the data to be queried;
For each of the M data columns involved in a selection query service, query the metadata table by operation type to obtain the name of the corresponding secondary index table, then query that secondary index table. The specific query process is: using the condition value of the selection query, position directly at the first entry that satisfies the condition and continue scanning until an entry that does not satisfy the condition is found; the scanned entries that satisfy the condition make up the set of ROWKEYs satisfying the query condition on the current data column. If M equals 1, this ROWKEY set is the ROWKEY set of the data to be queried. If M is greater than 1, apply to the ROWKEY sets of the different columns the set operations corresponding to the logical relations among the M data columns in the query service: logical AND corresponds to set intersection and logical OR corresponds to set union; the result of the operation is the ROWKEY set of the data to be queried;
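The scan-then-combine procedure above can be sketched over sorted in-memory key lists. This is an illustration only: each index table is modeled as a sorted list of row keys, the NUL separator is an assumption, and `scan_rowkeys` and `combine` are hypothetical names.

```python
from bisect import bisect_left

SEP = "\x00"  # assumed separator between QUALIFIER, VALUE and ROWKEY


def scan_rowkeys(sorted_index_keys, qualifier: str, value: str) -> set:
    """Seek to the first matching entry, then scan until one stops matching."""
    prefix = qualifier + SEP + value + SEP
    start = bisect_left(sorted_index_keys, prefix)  # direct positioning
    rowkeys = set()
    for key in sorted_index_keys[start:]:
        if not key.startswith(prefix):
            break  # first entry that no longer satisfies the condition
        rowkeys.add(key[len(prefix):])
    return rowkeys


def combine(rowkey_sets, logic: str) -> set:
    """Logical AND maps to intersection, logical OR to union."""
    op = set.intersection if logic == "and" else set.union
    return op(*rowkey_sets)


city_index = sorted(["city\x00Wuhan\x00r1", "city\x00Wuhan\x00r2", "city\x00Xian\x00r3"])
age_index = sorted(["age\x0030\x00r2", "age\x0030\x00r3"])
in_wuhan = scan_rowkeys(city_index, "city", "Wuhan")
aged_30 = scan_rowkeys(age_index, "age", "30")
both = combine([in_wuhan, aged_30], "and")
either = combine([in_wuhan, aged_30], "or")
```

The scan touches only the contiguous run of matching entries, so its cost depends on the number of matches rather than on the size of the index table.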
B. Query the data table using the ROWKEY set of the data to be queried
For the N data columns involved in a join query service, query the metadata table by operation type to obtain the name of the corresponding secondary index table; the N columns correspond to the same secondary index table. Then query that secondary index table. The specific query process is: from the row key format of the secondary index table it follows that the entries of N data columns with the same value are arranged consecutively in the secondary index table. If the number of consecutive index entries with the same data column value is N, the ROWKEYs of these N entries form an N-tuple <R1, R2, ..., RN> that satisfies the query condition. Scanning the whole secondary index table yields the set of all N-tuples <R1, R2, ..., RN> that satisfy the condition, and this set is the ROWKEY set of the data to be queried. Using the ROWKEY set of the data to be queried, the corresponding data values are obtained from the data table through the Get interface method provided by HBase;
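The N-tuple collection for join queries can be sketched as a single pass over the sorted index entries. The sketch assumes each entry is a pair of a (PREFIX, VALUE, QUALIFIER) key and a ROWKEY, and that a given value occurs at most once per column; `join_tuples` is a hypothetical name, and the final Get against the data table is left out.

```python
from itertools import groupby


def join_tuples(index_entries, n: int) -> list:
    """Collect N-tuples of ROWKEYs from consecutive entries with equal value.

    index_entries: iterable of ((prefix, value, qualifier), rowkey) pairs.
    n: number of data columns taking part in the join.
    """
    results = []
    ordered = sorted(index_entries)  # index tables keep entries in key order
    # Entries sharing PREFIX and VALUE are adjacent after sorting.
    for _, group in groupby(ordered, key=lambda entry: entry[0][:2]):
        rowkeys = [rowkey for _, rowkey in group]
        if len(rowkeys) == n:  # all N columns carry this value
            results.append(tuple(rowkeys))
    return results


entries = [
    (("g1", "42", "a.id"), "ra1"),
    (("g1", "42", "b.id"), "rb7"),
    (("g1", "43", "a.id"), "ra2"),  # no partner in b.id, so no tuple
]
matched = join_tuples(entries, 2)
```

Each tuple in `matched` holds the row keys that would then be passed to the data table lookups.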
3. When the relevant columns of the data table are updated, the corresponding secondary index tables need to be updated at the same time
Determine whether the data table has been updated; if so, update the secondary index table; if not, do not update it;
The method of updating the secondary index table comprises the following steps:
I. Update the data table: through the Put method interface provided by HBase on the Hadoop platform, submit the value of the data column, the row key, the column family and the column identifier to complete the update of the data table;
II. Generate secondary index entries: for the data column currently being updated, query the metadata table to obtain the secondary index tables that need to be updated and the operation type of each; select the format of the corresponding secondary index table according to the operation type, and use the updated data in the data table to generate entries that conform to that secondary index table's format;
III. Update the secondary index table: through the interface methods provided by Coprocessor in HBase on the Hadoop platform, submit the value, row key, column family and column identifier for the secondary index table in the format of the secondary index entries generated in step II, completing the update of the secondary index table.
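Steps I to III can be traced with a small in-memory model. This is not HBase or Coprocessor code: the `Store` class, its dictionaries and the NUL separator are assumptions, only the selection-query index format is shown, and removal of stale index entries is not modeled.

```python
SEP = "\x00"  # assumed separator; the source does not specify one


class Store:
    """In-memory stand-in for a data table plus its selection index tables."""

    def __init__(self, indexed_columns: dict) -> None:
        # Plays the role of the metadata table: column qualifier -> index table name.
        self.indexed_columns = indexed_columns
        self.data = {}
        self.index_tables = {name: {} for name in indexed_columns.values()}

    def put(self, rowkey: str, qualifier: str, value: str):
        # Step I: update the data table.
        self.data.setdefault(rowkey, {})[qualifier] = value
        # Step II: consult the metadata and generate the index entry.
        index_table = self.indexed_columns.get(qualifier)
        if index_table is None:
            return None  # column is not indexed, nothing more to do
        index_key = SEP.join((qualifier, value, rowkey))
        # Step III: write the entry; its value is the data table ROWKEY.
        self.index_tables[index_table][index_key] = rowkey
        return index_key
```

Calling `Store({"city": "idx_city"}).put("r1", "city", "Wuhan")` writes the data cell and one index entry in the same call, which mirrors the requirement that both tables are updated together.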
This method can perform basic update operations without placing heavy load on the Hadoop cluster, and can implement join queries and selection queries between data tables relatively efficiently for each specific service, thereby supporting complex business requirements and enabling statistics on daily incremental data and on the total data volume.
This method has the following features:
1) secondary index tables are simple to create;
2) index files are written together with data files, ensuring consistency;
3) data statistics time is greatly reduced.
The present invention has the following advantages and positive effects:
By introducing the ES distributed search engine, each data operation involves only a few Regions, which fundamentally reduces the load on the cluster, lightens the network traffic, lowers the dependence on high-performance servers, improves working efficiency and stability, offers good scalability, and has good practical value.
Brief description of the drawings
Fig. 1 is the overall flow chart of the method;
Fig. 2 is the selection query flow chart of step 2 of the method;
Fig. 3 is the join query flow chart of step 2 of the method.
Explanation of English terms:
1. ES: in full ElasticSearch, an open-source, distributed, RESTful search engine built on Lucene. It is designed for the cloud and provides real-time search that is stable, reliable, fast and easy to install. It supports indexing data as JSON over HTTP.
We build a website or an application and want to add search to it, and then it hits us: getting search working is hard. We want our search solution to be fast, we want a zero-configuration setup and a completely free search schema, we want to be able to index data simply using JSON over HTTP, we want our search server to be always available, we want to be able to start with one machine and scale to hundreds, we want real-time search, we want simple multi-tenancy, and we want a solution that is built for the cloud. Elasticsearch aims to solve all these problems and more.
2. HBase: an open-source, non-relational distributed database (NoSQL) modeled after Google's BigTable and implemented in Java. It is part of the Apache Software Foundation's Hadoop project and runs on top of the HDFS file system, providing BigTable-like capabilities for Hadoop. HBase implements, on a per-column basis, the compression algorithms, in-memory operation and Bloom filters mentioned in the BigTable paper. HBase tables can serve as the input and output of MapReduce jobs, and the data can be accessed through the Java API as well as through REST, Avro or Thrift APIs. Although HBase's performance has improved significantly, it cannot directly replace an SQL database. It is now used by many data-driven websites.
Detailed description of the embodiments
The invention is described in detail below with reference to the accompanying drawings and embodiments:
1. Method (overall)
As shown in Fig. 1, the overall flow is:
First, the secondary index table is built on the indexed columns; then, on each call, determine whether the call updates data or queries data;
If it updates data, the secondary index table is updated at the same time as the data table;
If it queries data, the secondary index is queried first, and the key values retrieved from the secondary index are then used to fetch the relevant data rows of the data table.
2. Step 2
1) Selection query
As shown in Fig. 2, the workflow of a selection query is:
For a compound selection query service, the compound selection condition is first split into single query conditions; the set of entries satisfying each single condition is then obtained through the row keys of the index table; finally, set operations are applied to the entry sets of the individual conditions, yielding all secondary index entries that satisfy the compound query condition, from which the row keys of all matching data table rows are extracted. When obtaining the set of secondary index entries that satisfy a single condition, the row key of the index table is used to position directly at the first matching entry and scan downward until an entry that does not match is found; the scanned entries are then merged into the set of secondary index entries satisfying that single condition.
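Splitting a compound condition into single conditions and recombining their results can be sketched as a small recursive evaluator. The nested-tuple condition format and the `lookup` callback are assumptions introduced for illustration; `lookup` stands in for the per-condition index scan.

```python
def evaluate(condition, lookup) -> set:
    """Evaluate a compound selection condition to a set of data table row keys.

    condition: ("eq", qualifier, value) for a single condition, or
               ("and" | "or", left, right) for a compound one.
    lookup:    callable (qualifier, value) -> iterable of row keys.
    """
    op = condition[0]
    if op == "eq":
        _, qualifier, value = condition
        return set(lookup(qualifier, value))
    left = evaluate(condition[1], lookup)
    right = evaluate(condition[2], lookup)
    if op == "and":
        return left & right  # intersection of the two entry sets
    if op == "or":
        return left | right  # union of the two entry sets
    raise ValueError("unknown operator: " + repr(op))


# Toy index: (qualifier, value) -> row keys, standing in for index scans.
toy_index = {
    ("city", "Wuhan"): {"r1", "r2"},
    ("age", "30"): {"r2", "r3"},
    ("sex", "F"): {"r3"},
}
query = ("or", ("and", ("eq", "city", "Wuhan"), ("eq", "age", "30")), ("eq", "sex", "F"))
rows = evaluate(query, lambda q, v: toy_index.get((q, v), set()))
```

Only the leaves touch the index; the inner nodes work purely on sets of row keys, so the data table is read once, after the final set is known.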
2) Join query
As shown in Fig. 3, the workflow of a join query is:
For a compound join query service, the query can be divided into two join query groups. When the data columns of the same join query group are inserted into the index table, the hash function produces the same PREFIX value for them; the value corresponding to row key R is the row key of that column's row in the data table. At query time, the secondary index table is scanned in full and the sets of qualifying tuples are recorded; set operations are then applied to these tuple sets to obtain the row key values of the qualifying data. While recording the qualifying tuple sets, a tuple is added to the tuple set only when it is formed from consecutive entries and satisfies the condition of the join query group.
Claims (3)
- 1. A method for implementing HBase secondary indexes based on ES, characterized in that:
  Step 1: according to the query service, secondary index tables are built in ES on the relevant data columns; one basic query service corresponds to one secondary index table, and one complex query service corresponds to multiple secondary index tables;
  A. Create the secondary index table in ES according to the operation type. For a selection query operation, the M data columns involved in the selection query are stored in M secondary index tables, one per column, where M is greater than or equal to 1; the row key R of each secondary index table consists of three parts, in order: QUALIFIER, VALUE and ROWKEY, where QUALIFIER is the identifier of the data column in the data table, VALUE is the value of the data column in the data table, and ROWKEY is the row key of the data table;
  B. Generate secondary index entries from the data columns and insert them into the secondary index table. For a join query operation, the N data columns involved in the join query are stored in a single secondary index table, where N is greater than or equal to 2; the row key R of the secondary index table consists of three parts, in order: PREFIX, VALUE and QUALIFIER, where PREFIX is generated by a hash function and distinguishes the join query group, VALUE is the value of the data column in the data table, and QUALIFIER is the identifier of the data column in the data table;
  The value of the data column in the secondary index table is the ROWKEY of the corresponding data table; the value of the data column in the secondary index table and the row key R of the secondary index table together form one entry of the secondary index table; the secondary index table is created in ES, and the association between each data column and its secondary index table is stored in a metadata table; the row key of the metadata table consists of, in order: the table name of the data table, the column family name, the column name, the operation type of the secondary index table, and the timestamp; the value corresponding to the row key of the metadata table is: the operation type of the secondary index table and the name of the secondary index table;
  The operation types of the secondary index table include: selection query operation and join query operation;
  Step 2: when querying, first query the index table to obtain the row keys of the matching data, then query the data table by row key to obtain the data;
  A. Scan the secondary index table to obtain the row keys of the data to be queried. For each of the M data columns involved in a selection query service, query the metadata table by operation type to obtain the name of the corresponding secondary index table, then query that secondary index table; the specific query process is: using the condition value of the selection query, position directly at the first entry that satisfies the condition and continue scanning until an entry that does not satisfy the condition is found; the scanned entries that satisfy the condition make up the set of ROWKEYs satisfying the query condition on the current data column; if M equals 1, this ROWKEY set is the ROWKEY set of the data to be queried; if M is greater than 1, apply to the ROWKEY sets of the different columns the set operations corresponding to the logical relations among the M data columns in the query service: logical AND corresponds to set intersection and logical OR corresponds to set union; the result of the operation is the ROWKEY set of the data to be queried;
  B. Query the data table using the ROWKEY set of the data to be queried. For the N data columns involved in a join query service, query the metadata table by operation type to obtain the name of the corresponding secondary index table, the N columns corresponding to the same secondary index table; then query that secondary index table; the specific query process is: from the row key format of the secondary index table it follows that the entries of N data columns with the same value are arranged consecutively in the secondary index table; if the number of consecutive index entries with the same data column value is N, the ROWKEYs of these N entries form an N-tuple <R1, R2, ..., RN> that satisfies the query condition; scanning the whole secondary index table yields the set of all N-tuples <R1, R2, ..., RN> that satisfy the condition, and this set is the ROWKEY set of the data to be queried; using the ROWKEY set of the data to be queried, the corresponding data values are obtained from the data table through the Get interface method provided by HBase;
  Step 3: when the relevant columns of the data table are updated, the corresponding secondary index tables need to be updated at the same time. Determine whether the data table has been updated; if so, update the secondary index table; if not, do not update it; the method of updating the secondary index table comprises the following steps:
  I. Update the data table: through the Put method interface provided by HBase on the Hadoop platform, submit the value of the data column, the row key, the column family and the column identifier to complete the update of the data table;
  II. Generate secondary index entries: for the data column currently being updated, query the metadata table to obtain the secondary index tables that need to be updated and the operation type of each; select the format of the corresponding secondary index table according to the operation type, and use the updated data in the data table to generate entries that conform to that secondary index table's format;
  III. Update the secondary index table: through the interface methods provided by Coprocessor in HBase on the Hadoop platform, submit the value, row key, column family and column identifier for the secondary index table in the format of the secondary index entries generated in step II, completing the update of the secondary index table.
- 2. The method for implementing HBase secondary indexes based on ES according to claim 1, characterized in that the workflow of the selection query in step 2 is: for a compound selection query service, the compound selection condition is first split into single query conditions; the set of entries satisfying each single condition is then obtained through the row keys of the index table; finally, set operations are applied to the entry sets of the individual conditions, yielding all secondary index entries that satisfy the compound query condition, from which the row keys of all matching data table rows are extracted; when obtaining the set of secondary index entries that satisfy a single condition, the row key of the index table is used to position directly at the first matching entry and scan downward until an entry that does not match is found, and the scanned entries are then merged into the set of secondary index entries satisfying that single condition.
- 3. The method for implementing HBase secondary indexes based on ES according to claim 1, characterized in that the workflow of the join query in step 2 is: for a compound join query service, the query can be divided into two join query groups; when the data columns of the same join query group are inserted into the index table, the hash function produces the same PREFIX value for them; the value corresponding to row key R is the row key of that column's row in the data table; at query time, the secondary index table is scanned in full and the sets of qualifying tuples are recorded, and set operations are then applied to these tuple sets to obtain the row key values of the qualifying data; while recording the qualifying tuple sets, a tuple is added to the tuple set only when it is formed from consecutive entries and satisfies the condition of the join query group.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710763058.5A CN107506464A (en) | 2017-08-30 | 2017-08-30 | Method for implementing HBase secondary indexes based on ES |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710763058.5A CN107506464A (en) | 2017-08-30 | 2017-08-30 | Method for implementing HBase secondary indexes based on ES |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107506464A (en) | 2017-12-22 |
Family
ID=60694149
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710763058.5A Withdrawn CN107506464A (en) | Method for implementing HBase secondary indexes based on ES | 2017-08-30 | 2017-08-30 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107506464A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109271097A (en) * | 2017-12-28 | 2019-01-25 | 新华三大数据技术有限公司 | Data processing method, data processing equipment and server |
CN109299102A (en) * | 2018-10-23 | 2019-02-01 | 中国电子科技集团公司第二十八研究所 | A kind of HBase secondary index system and method based on Elastcisearch |
CN109299110A (en) * | 2018-11-09 | 2019-02-01 | 东软集团股份有限公司 | Data query method, apparatus, storage medium and electronic equipment |
CN109800222A (en) * | 2018-12-11 | 2019-05-24 | 中国科学院信息工程研究所 | A kind of HBase secondary index adaptive optimization method and system |
CN110502524A (en) * | 2019-08-15 | 2019-11-26 | 济南浪潮数据技术有限公司 | Phoenix index data asynchronous updating method and device |
CN110737692A (en) * | 2018-07-19 | 2020-01-31 | 杭州海康威视数字技术股份有限公司 | data retrieval method, index database establishment method and device |
CN111159185A (en) * | 2019-12-27 | 2020-05-15 | 紫光云(南京)数字技术有限公司 | Hive index method based on conditional push-down elastic search |
CN111753045A (en) * | 2020-07-01 | 2020-10-09 | 浪潮云信息技术股份公司 | Hive secondary full-text index technical method and system based on elastic search |
CN112597191A (en) * | 2020-12-29 | 2021-04-02 | 拉卡拉支付股份有限公司 | Data processing method, data processing apparatus, electronic device, storage medium, and program product |
CN112805695A (en) * | 2019-03-20 | 2021-05-14 | 谷歌有限责任公司 | Co-sharding and randomized co-sharding |
CN114372064A (en) * | 2022-03-22 | 2022-04-19 | 飞狐信息技术(天津)有限公司 | Data processing apparatus, method, computer readable medium and processor |
WO2024022180A1 (en) * | 2022-07-28 | 2024-02-01 | 天津联想协同科技有限公司 | Network disk document indexing method and apparatus, and network disk and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104834688A (en) * | 2015-04-20 | 2015-08-12 | 北京奇艺世纪科技有限公司 | Secondary index establishment method and device |
CN106503243A (en) * | 2016-11-08 | 2017-03-15 | 国网山东省电力公司电力科学研究院 | Electric power big data querying method and system based on HBase secondary indexs |
CN106682073A (en) * | 2016-11-14 | 2017-05-17 | 上海轻维软件有限公司 | HBase fuzzy retrieval system based on Elastic Search |
- 2017-08-30: CN application CN201710763058.5A (publication CN107506464A), status: Withdrawn
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109271097A (en) * | 2017-12-28 | 2019-01-25 | 新华三大数据技术有限公司 | Data processing method, data processing equipment and server |
CN110737692A (en) * | 2018-07-19 | 2020-01-31 | 杭州海康威视数字技术股份有限公司 | data retrieval method, index database establishment method and device |
CN109299102B (en) * | 2018-10-23 | 2020-11-13 | 中国电子科技集团公司第二十八研究所 | HBase secondary index system and method based on Elastcissearch |
CN109299102A (en) * | 2018-10-23 | 2019-02-01 | 中国电子科技集团公司第二十八研究所 | A kind of HBase secondary index system and method based on Elastcisearch |
CN109299110A (en) * | 2018-11-09 | 2019-02-01 | 东软集团股份有限公司 | Data query method, apparatus, storage medium and electronic equipment |
CN109800222A (en) * | 2018-12-11 | 2019-05-24 | 中国科学院信息工程研究所 | A kind of HBase secondary index adaptive optimization method and system |
CN109800222B (en) * | 2018-12-11 | 2021-06-01 | 中国科学院信息工程研究所 | HBase secondary index self-adaptive optimization method and system |
CN112805695A (en) * | 2019-03-20 | 2021-05-14 | 谷歌有限责任公司 | Co-sharding and randomized co-sharding |
CN110502524A (en) * | 2019-08-15 | 2019-11-26 | 济南浪潮数据技术有限公司 | Phoenix index data asynchronous updating method and device |
CN111159185A (en) * | 2019-12-27 | 2020-05-15 | 紫光云(南京)数字技术有限公司 | Hive index method based on conditional push-down elastic search |
CN111753045A (en) * | 2020-07-01 | 2020-10-09 | 浪潮云信息技术股份公司 | Hive secondary full-text index technical method and system based on elastic search |
CN111753045B (en) * | 2020-07-01 | 2024-09-10 | 浪潮云信息技术股份公司 | Hive two-level full-text index technical method and system based on elastic search |
CN112597191A (en) * | 2020-12-29 | 2021-04-02 | 拉卡拉支付股份有限公司 | Data processing method, data processing apparatus, electronic device, storage medium, and program product |
CN112597191B (en) * | 2020-12-29 | 2024-06-11 | 拉卡拉支付股份有限公司 | Data processing method, device, electronic equipment, storage medium and program product |
CN114372064A (en) * | 2022-03-22 | 2022-04-19 | 飞狐信息技术(天津)有限公司 | Data processing apparatus, method, computer readable medium and processor |
CN114372064B (en) * | 2022-03-22 | 2022-07-12 | 飞狐信息技术(天津)有限公司 | Data processing apparatus, method, computer readable medium and processor |
WO2024022180A1 (en) * | 2022-07-28 | 2024-02-01 | 天津联想协同科技有限公司 | Network disk document indexing method and apparatus, and network disk and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107506464A (en) | Method for implementing HBase secondary indexes based on ES | |
CN109299102B (en) | HBase secondary index system and method based on Elastcissearch | |
US11816126B2 (en) | Large scale unstructured database systems | |
US9396018B2 (en) | Low latency architecture with directory service for integration of transactional data system with analytical data structures | |
US10783124B2 (en) | Data migration in a networked computer environment | |
Zhao et al. | Modeling MongoDB with relational model | |
US7577637B2 (en) | Communication optimization for parallel execution of user-defined table functions | |
US10565199B2 (en) | Massively parallel processing database middleware connector | |
US9923901B2 (en) | Integration user for analytical access to read only data stores generated from transactional systems | |
JP6964384B2 (en) | Methods, programs, and systems for the automatic discovery of relationships between fields in a mixed heterogeneous data source environment. | |
US20160103914A1 (en) | Offloading search processing against analytic data stores | |
US20140236889A1 (en) | Site-based search affinity | |
CN110032604A (en) | Data storage device, transfer device and data bank access method | |
CN106030573A (en) | Implementation of semi-structured data as a first-class database element | |
CN106294695A (en) | A kind of implementation method towards the biggest data search engine | |
EP2680151A1 (en) | Distributed data base system and data structure for distributed data base | |
WO2020077027A1 (en) | Method and system for executing queries on indexed views | |
US20190057133A1 (en) | Systems and methods of bounded scans on multi-column keys of a database | |
CN111221791A (en) | Method for importing multi-source heterogeneous data into data lake | |
CN106503243A (en) | Electric power big data querying method and system based on HBase secondary indexs | |
Borkar et al. | Have your data and query it too: From key-value caching to big data management | |
CN103646051A (en) | Big-data parallel processing system and method based on column storage | |
CN105069151A (en) | HBase secondary index construction apparatus and method | |
WO2024001493A1 (en) | Visual data analysis method and device | |
Mehmood et al. | Distributed real-time ETL architecture for unstructured big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20171222 |