WO2017092583A1 - Index establishment method and device - Google Patents

Index establishment method and device

Info

Publication number
WO2017092583A1
WO2017092583A1 PCT/CN2016/106581 CN2016106581W WO2017092583A1 WO 2017092583 A1 WO2017092583 A1 WO 2017092583A1 CN 2016106581 W CN2016106581 W CN 2016106581W WO 2017092583 A1 WO2017092583 A1 WO 2017092583A1
Authority
WO
WIPO (PCT)
Prior art keywords
index
column
time threshold
determining
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2016/106581
Other languages
English (en)
French (fr)
Chinese (zh)
Inventor
郑博文
潘岳
魏闯先
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to JP2018524442A (patent JP6898320B2, ja)
Priority to EP16869893.4A (patent EP3385864B1, en)
Publication of WO2017092583A1 (zh)
Priority to US15/996,237 (patent US11003649B2, en)
Legal status: Ceased (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/319 Inverted lists

Definitions

  • The present invention relates to the field of communications technologies, and in particular to an index establishing method. The present application also relates to an index establishing device.
  • An index is a structure that sorts the values of one or more columns in a database table. You can use indexes to quickly access specific information in a database table. The index provides pointers to the data values stored in the specified columns of the table, and then sorts the pointers according to the sort order specified by the user. When you need to use an index in your database, first search the index to find a specific value, then follow the pointer to find the row that contains the value.
  • Traditional databases offer little variety in index types and rely mainly on the B-Tree index. Not all data characteristics suit a B-Tree index; for example, lookups in Key-Value mode are very slow.
  • SQL: Structured Query Language
  • A single SQL statement may perform multiple lookups. If each lookup goes through a B-Tree, the performance of join SQL is seriously affected.
  • The invention provides an index establishing method, which optimizes the index establishment process to improve the efficiency of data retrieval while reducing manpower consumption.
  • The method includes the following steps:
  • When it is determined that the column needs to be indexed, the index type is determined according to the data information of the column, and the column is indexed according to the index type.
  • The method further comprises:
  • Determining whether the column needs to be indexed according to the index state information of the column in the database within a preset time threshold specifically includes:
  • If the number of times the column uses the index within the time threshold is not less than the number-of-times threshold, it is determined that an index needs to be established for the column.
  • The index type includes at least a B-Tree index, a Hash index, and a Bitmap index, and determining the index type according to the data information of the column specifically includes:
  • If the column is a continuous numeric type, determining that the index type is the B-Tree index;
  • the index type is the bitmap index.
  • The method further comprises:
  • When a retrieval expression sent by the user is received, the retrieval expression is split into a plurality of sub-expressions;
  • If a retrieval result corresponding to a sub-expression exists in the cache, a retrieval response for returning to the user is generated according to that retrieval result and the retrieval results of the other sub-expressions;
  • If no retrieval result corresponding to the sub-expression exists, the sub-expression is retrieved using the index of the column, and after the retrieval response for returning to the user is generated according to this retrieval result and the retrieval results of the other sub-expressions, the retrieval result is stored in the cache.
  • Before determining whether the column needs to be indexed according to the index state information of the column in the database within the preset time threshold, the method further includes:
  • After initialization of the database is completed, an index is built for each column in the database according to a default index type, and the index of each column is rebuilt when a preset time is reached.
  • An index establishing device, including:
  • A determining module, which determines, according to index state information of a column in the database within a preset time threshold, whether the column needs to be indexed;
  • An establishing module, which determines an index type according to the data information of the column when the determining module determines that the column needs to be indexed, and establishes an index for the column according to the index type.
  • When the determining module determines that the column does not need to be indexed, the establishing module further determines, after the time threshold ends, whether the column needs to be indexed according to the index usage status information of the column within the next time threshold.
  • The determining module is specifically configured to:
  • If the number of times the column uses the index within the time threshold is not less than the number-of-times threshold, determine that an index needs to be established for the column.
  • The index type includes at least a B-Tree index, a Hash index, and a Bitmap index, and the establishing module determines the index type according to the data information of the column, specifically:
  • If the column is a continuous numeric type, determining that the index type is the B-Tree index;
  • the index type is the bitmap index.
  • The device further comprises:
  • A splitting module, which, when receiving a retrieval expression sent by a user, splits the retrieval expression into a plurality of sub-expressions;
  • A query module, which queries the cache for a retrieval result corresponding to each of the sub-expressions;
  • A processing module, which, when a retrieval result corresponding to a sub-expression exists, generates a retrieval response for returning to the user according to that retrieval result and the retrieval results of the other sub-expressions; and which, when no retrieval result corresponding to the sub-expression exists, retrieves the sub-expression using the index of the column and, after generating the retrieval response for returning to the user according to this retrieval result and the retrieval results of the other sub-expressions, stores the retrieval result in the cache.
  • The device further comprises:
  • An initialization module, which, after initialization of the database is completed, builds an index for each column in the database according to a default index type, and rebuilds the index for each column when a preset time is reached.
  • In the technical solution of the present application, it is first determined, according to the index state information of a column in the database within a preset time threshold, whether the column needs to be indexed. When it is determined that the column needs to be indexed, the index type is determined according to the data information of the column, and the column is indexed according to that index type.
  • FIG. 1 is a schematic flowchart diagram of an index establishing method according to the present application
  • FIG. 2 is a schematic diagram of a data structure in a specific embodiment of the present application.
  • FIG. 3 is a schematic diagram of a flow result merging process according to a specific embodiment of the present application.
  • FIG. 4 is a schematic diagram of an index structure according to a specific embodiment of the present application.
  • FIG. 5 is a schematic diagram of an index establishment process according to a specific embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of an index establishing device according to the present application.
  • The present application proposes an index establishing method.
  • The index usage of each column in the database is used to determine whether the column needs to be indexed, and the index type is chosen according to the data information of the column when the index is built. Retrieval efficiency is thereby improved while saving hardware resources and manpower.
  • As shown in FIG. 1, the index establishment method proposed by the present application includes the following steps:
  • S101: Determine, according to index state information of a column in the database within a preset time threshold, whether an index needs to be established for the column.
  • If yes, step S102 is performed;
  • If no, step S101 is performed again to determine whether the column needs to be indexed within the time interval corresponding to the next preset time threshold.
  • The present application introduces index state information for each column in the database, and determines whether it is necessary to index the column based on the index state information over a period of time.
  • The index status information of each column is updated and evaluated after each time threshold elapses, for example whether the column used the index within the time threshold, and whether the index was used at least a certain number of times.
  • The time threshold is a preset time interval that defines the statistical window for characterizing the index usage of the corresponding column. It may be adjusted according to the capacity of the database and the number of users. It can be set to one day, and a skilled person may extend or shorten it on that basis, which remains within the scope of protection of the present application.
  • Step a) Acquire the index state information of the column within the time threshold.
  • Step b) Determine, according to the index status information, whether the column uses the index within the time threshold, and if so, determine whether the number of times the column uses the index within the time threshold is not less than a preset number-of-times threshold.
  • If the column does not use the index within the time threshold, or the number of times the column uses the index within the time threshold is less than the number-of-times threshold, it is confirmed that there is no need to index the column; if the number of times the column uses the index within the time threshold is not less than the number-of-times threshold, it is determined that an index needs to be established for the column.
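  • The decision in steps a) and b) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `IndexStateInfo` record, its field names, and the threshold value of 10 are hypothetical and do not appear in the application.

```python
from dataclasses import dataclass


@dataclass
class IndexStateInfo:
    """Hypothetical per-column index state gathered over one time threshold."""
    column: str
    index_used: bool  # did any query use this column's index in the window?
    use_count: int    # how many times the index was used in the window


def needs_index(state: IndexStateInfo, count_threshold: int) -> bool:
    """Return True only if the index was used at least count_threshold times."""
    if not state.index_used:
        return False  # index never used in the window: no index needed
    return state.use_count >= count_threshold


# Example window of statistics (made-up values for illustration only).
window = [
    IndexStateInfo("user_id", True, 120),
    IndexStateInfo("remark", False, 0),
    IndexStateInfo("amount", True, 3),
]
decisions = {s.column: needs_index(s, count_threshold=10) for s in window}
```

    A column for which `needs_index` returns False is simply evaluated again with the statistics of the next time threshold, matching the "If no" branch of step S101.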
  • After initialization of the database is completed, the application builds an index for each column in the database according to the default index type, and rebuilds the index for each column when the preset time is reached. The index type uniformly used in this process can be preset by a technician.
  • S102: Determine an index type according to the data information of the column, and establish an index for the column according to the index type.
  • the index type can be determined according to the data information of the column, so that the user can be provided with a personalized retrieval service based on different situations, thereby greatly reducing all users in the process of querying the current database.
  • In step S101, when it is determined that there is no need to index the column, the technical solution of the present application will, after the time interval corresponding to the current time threshold ends, determine whether it is necessary to index the column based on the index usage status information of the column in the time interval corresponding to the next time threshold. That is, step S101 is re-executed, so that the decision can be adjusted flexibly according to actual conditions.
  • the index type is specifically determined by the following conditions:
  • the index type is the B-Tree index.
  • Join is one of the important operations of a relational database system.
  • The commonly used joins in SQL Server include inner join, outer join, and cross join. If data must be retrieved from one or more tables by matching rows in another table, the Join operation should be considered, because Join can query a table or a function.
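  • To illustrate why a Hash index is a natural fit for join columns, the following Python sketch performs an inner equi-join through a hash lookup. It is an illustrative assumption about how such an index might be used; the function names and the sample tables are hypothetical and not taken from the application.

```python
from collections import defaultdict


def build_hash_index(rows, key):
    """Map each value of the join column to the row ids that contain it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[key]].append(row_id)
    return index


def hash_join(left_rows, right_rows, key):
    """Inner equi-join: one constant-time hash probe per left row."""
    index = build_hash_index(right_rows, key)
    return [
        (left, right_rows[row_id])
        for left in left_rows
        for row_id in index.get(left[key], [])
    ]


# Made-up sample data for illustration.
orders = [{"user_id": 1, "item": "book"}, {"user_id": 3, "item": "pen"}]
users = [{"user_id": 1, "name": "An"}, {"user_id": 2, "name": "Bo"}]
joined = hash_join(orders, users, "user_id")
```

    Each probe costs a single hash lookup instead of a tree descent, which is the effect the text alludes to when it says B-Tree lookups slow down join SQL.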
  • the number of words in the column is specifically the number of variable values included in the column.
  • When a column of data is of a numeric type and is continuous (e.g., an amount), a field of this type generally only needs range queries. As the data structure diagram in FIG. 2 shows, the intermediate nodes of a structure such as a B-Tree inherently carry range attributes, so for a column of continuous numeric type, range lookup with a B-Tree index is far more efficient than with an inverted index or a bitmap index.
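  • The benefit of an ordered structure for range queries can be shown with a small Python sketch. A sorted list searched with the standard `bisect` module stands in for the ordered leaf level of a B-Tree here; this is a deliberate simplification, not a B-Tree implementation, and the function names and sample amounts are hypothetical.

```python
import bisect


def build_sorted_index(values):
    """Return (value, row_id) pairs sorted by value, like ordered B-Tree leaves."""
    return sorted((value, row_id) for row_id, value in enumerate(values))


def range_query(index, low, high):
    """Row ids whose value lies in [low, high], located by two binary searches."""
    start = bisect.bisect_left(index, (low, -1))
    stop = bisect.bisect_right(index, (high, float("inf")))
    return sorted(row_id for _, row_id in index[start:stop])


# Made-up amounts; the row id is the position in the list.
amounts = [19.9, 250.0, 5.0, 99.0, 1200.0, 75.5]
amount_index = build_sorted_index(amounts)
rows_in_range = range_query(amount_index, 50, 300)
```

    Only the two boundaries are searched and the matching entries are contiguous, whereas a bitmap or inverted index would have to combine one entry per distinct value in the range.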
  • In the above preferred embodiment, the word number threshold can specifically be set to 32.
  • The word number threshold can later be modified based on other algorithms, which does not affect the scope of protection of the present application.
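  • A possible shape for the index type selection is sketched below in Python. Only the continuous-numeric rule (B-Tree) and the threshold value 32 are stated in the text above; the rules that join columns get a Hash index, that columns with few distinct values get a Bitmap index, and that the remaining columns get an inverted index are assumptions made for illustration, as are all names in the code.

```python
from enum import Enum


class IndexType(Enum):
    HASH = "hash"
    BTREE = "b-tree"
    BITMAP = "bitmap"
    INVERTED = "inverted"


WORD_NUMBER_THRESHOLD = 32  # value given for the preferred embodiment


def choose_index_type(values, used_in_join=False):
    """Pick an index type from the column's data (assumed rules, see lead-in)."""
    if used_in_join:
        return IndexType.HASH  # assumption: join columns use a Hash index
    is_numeric = bool(values) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if is_numeric:
        return IndexType.BTREE  # stated in the text: continuous numeric type
    # Assumption: "number of words" is read as the number of distinct values,
    # and a small number of them favours a bitmap.
    if len(set(values)) <= WORD_NUMBER_THRESHOLD:
        return IndexType.BITMAP
    return IndexType.INVERTED
```

    For example, `choose_index_type([19.9, 250.0])` returns `IndexType.BTREE`, while a string column with 100 distinct values falls through to `IndexType.INVERTED` under these assumed rules.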
  • A preferred embodiment of the present application proposes splitting the retrieval expression into a plurality of sub-expressions when a retrieval expression sent by the user is received, and querying the cache for a retrieval result corresponding to each sub-expression.
  • Each sub-expression is then processed based on the following conditions:
  • If a retrieval result corresponding to the sub-expression exists in the cache, it is used directly; if not, the sub-expression is retrieved using the index of the column, and after the retrieval response returned to the user is generated according to this retrieval result and the retrieval results of the other sub-expressions, the retrieval result is stored in the cache.
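  • The cache behaviour described above might look as follows in Python. This is a sketch under simplifying assumptions: the where expression is a flat conjunction joined by AND, the index is represented by a hypothetical `index_lookup` callable, and results are sets of row ids. None of these names come from the application.

```python
import re


def split_where(expr):
    """Split a flat 'a AND b AND c' expression into atomic sub-expressions."""
    parts = re.split(r"\s+AND\s+", expr, flags=re.IGNORECASE)
    return [p.strip() for p in parts if p.strip()]


class SubExpressionCache:
    def __init__(self, index_lookup):
        self._lookup = index_lookup  # hypothetical: sub-expression -> row ids
        self._cache = {}
        self.index_lookups = 0       # counts how often the index was queried

    def retrieve(self, expr):
        results, fresh = [], {}
        for sub in split_where(expr):
            if sub in self._cache:
                results.append(self._cache[sub])  # cache hit: skip the index
            else:
                if sub not in fresh:
                    self.index_lookups += 1
                    fresh[sub] = frozenset(self._lookup(sub))
                results.append(fresh[sub])
        # Generate the response first, then store the new results in the cache.
        response = set(results[0]).intersection(*results[1:]) if results else set()
        self._cache.update(fresh)
        return response


# Made-up index contents for illustration.
rows = {"city = 'HZ'": {1, 2, 3}, "age > 30": {2, 3, 4}, "vip = 1": {3, 4}}
cache = SubExpressionCache(lambda sub: rows[sub])
first = cache.retrieve("city = 'HZ' AND age > 30")
second = cache.retrieve("age > 30 AND vip = 1")
```

    The second query shares `age > 30` with the first, so only `vip = 1` reaches the index; `cache.index_lookups` ends at 3 rather than 4.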
  • QueryBuilder (the query creation component) generates an EngineQuery object tree based on the where expression.
  • A RowidSet (row identifier set) is attached to each leaf node, and each intermediate node is a logical operation combiner.
  • This particular embodiment is based on the fact that different SQL statements may have the same where subexpression, so the query performance of the index is accelerated by caching the atomic expression.
  • The column-level independent index structure automatically selects the most suitable index type according to cost, automatically adjusts the index type using historical statistics (HBO), and caches atomic expressions during processing, thereby saving storage cost and also speeding up the query performance of the index.
  • HBO: historical statistics
  • The specific embodiment implements the column-level indexing architecture shown in FIG. 4 based on the principles and characteristics of the inverted index, the bitmap index, the Hash index, and the B-Tree index, and can support these four index types simultaneously.
  • The index type is transparent to the user and does not need to be specified externally by the user. Instead, the index type is automatically selected based on the data characteristics.
  • Multiple index types use the same set of data structures for Boolean operations. The engine layer does not need to know which index produced the result of a where sub-expression.
  • the index can be automatically optimized based on historical statistics without manual intervention.
  • The index structure diagram mainly includes the following three parts:
  • Streamed Merger: responsible for unifying the interaction between the results of different index queries and the computing layer.
  • The results of different index queries are stored in bitmaps; a streamed merge tree is then generated according to the Boolean operations of the where expression, and the merger outputs the row numbers satisfying the where condition one by one.
  • Index Manager: responsible for managing indexes, selecting index types, and automatically optimizing the indexing process.
  • Sub-expression cache: responsible for caching where sub-expressions, so that different SQL statements sharing the same sub-expression no longer need to query the index and can read the result directly from the cache.
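  • The merger described above can be approximated with a small expression tree over bitmaps. In this Python sketch, which is an illustration under assumptions rather than the actual engine, a Python integer serves as the bitmap (bit i set means row i matches), leaves hold per-index results, inner nodes are Boolean combiners, and a generator emits matching row numbers one at a time. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, Union


def to_bitmap(row_ids):
    """Encode a collection of row ids as an integer bitmap."""
    bitmap = 0
    for row_id in row_ids:
        bitmap |= 1 << row_id
    return bitmap


@dataclass
class Leaf:
    bitmap: int  # result of one index query, whatever index produced it


@dataclass
class And:
    left: "Node"
    right: "Node"


@dataclass
class Or:
    left: "Node"
    right: "Node"


Node = Union[Leaf, And, Or]


def evaluate(node):
    """Combine the leaf bitmaps according to the Boolean tree."""
    if isinstance(node, Leaf):
        return node.bitmap
    left, right = evaluate(node.left), evaluate(node.right)
    return left & right if isinstance(node, And) else left | right


def stream_rows(node) -> Iterator[int]:
    """Yield the row numbers that satisfy the where condition, one by one."""
    bitmap, row = evaluate(node), 0
    while bitmap:
        if bitmap & 1:
            yield row
        bitmap >>= 1
        row += 1


# where a AND (b OR c), with made-up per-index results
tree = And(Leaf(to_bitmap([1, 2, 3, 8])),
           Or(Leaf(to_bitmap([2, 8])), Leaf(to_bitmap([3]))))
matching_rows = list(stream_rows(tree))
```

    Because every leaf is just a bitmap, the combiner nodes never need to know whether a result came from a B-Tree, Hash, Bitmap, or inverted index. A production merger would likely combine the bitmaps lazily in chunks rather than evaluating the whole tree up front as this sketch does.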
  • An index is first established for each column in the database; the index is then rebuilt every day and historical SQL statistics are acquired. Based on the historical data, it is determined whether each column uses the index and whether the number of times the index is used exceeds the threshold.
  • If so, an index type is selected based on the type of data in that column. The scheme can therefore not only support different index types for different column types (inverted index, B-Tree index, Bitmap index, and Hash index), but also make the selection of the index type imperceptible to the user, greatly saving storage cost and increasing query speed.
  • the present application also provides an index establishing device, as shown in FIG. 6, including:
  • The determining module 610 determines, according to the index state information of a column in the database within a preset time threshold, whether an index needs to be established for the column;
  • The establishing module 620 determines an index type according to the data information of the column when the determining module determines that the column needs to be indexed, and indexes the column according to the index type.
  • When the determining module determines that the column does not need to be indexed, the establishing module further determines, after the time threshold ends, whether the column needs to be indexed according to the index usage status information of the column within the next time threshold.
  • The determining module is specifically configured to:
  • If the number of times the column uses the index within the time threshold is not less than the number-of-times threshold, determine that an index needs to be established for the column.
  • The index type includes at least a B-Tree index, a Hash index, and a Bitmap index.
  • The establishing module determines an index type according to the data information of the column, specifically:
  • If the column is a continuous numeric type, determining that the index type is the B-Tree index;
  • the index type is the bitmap index.
  • A splitting module, which, when receiving a retrieval expression sent by a user, splits the retrieval expression into a plurality of sub-expressions;
  • A query module, which queries the cache for a retrieval result corresponding to each of the sub-expressions;
  • A processing module, which, when a retrieval result corresponding to a sub-expression exists, generates a retrieval response for returning to the user according to that retrieval result and the retrieval results of the other sub-expressions; and which, when no retrieval result corresponding to the sub-expression exists, retrieves the sub-expression using the index of the column and, after generating the retrieval response for returning to the user according to this retrieval result and the retrieval results of the other sub-expressions, stores the retrieval result in the cache.
  • An initialization module, which, after initialization of the database is completed, builds an index for each column in the database according to a default index type, and rebuilds the index for each column when a preset time is reached.
  • The present invention can be implemented by hardware, or by software plus a necessary general-purpose hardware platform.
  • The technical solution of the present invention may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a mobile hard disk) and includes several instructions for causing a computer device (such as a personal computer, server, or network device) to perform the methods described in the various implementation scenarios of the present invention.
  • The modules of the apparatus in an implementation scenario may be distributed in the apparatus as described for that scenario, or may be located, with corresponding changes, in one or more apparatuses different from that of the scenario.
  • the modules of the above implementation scenarios may be combined into one module, or may be further split into multiple sub-modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/CN2016/106581 2015-12-01 2016-11-21 Index establishment method and device Ceased WO2017092583A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2018524442A JP6898320B2 (ja) 2015-12-01 2016-11-21 Method and device for establishing index
EP16869893.4A EP3385864B1 (en) 2015-12-01 2016-11-21 Method and device for establishing index
US15/996,237 US11003649B2 (en) 2015-12-01 2018-06-01 Index establishment method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510868254.X 2015-12-01
CN201510868254.XA CN106815260B (zh) 2015-12-01 2015-12-01 Index establishment method and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/996,237 Continuation US11003649B2 (en) 2015-12-01 2018-06-01 Index establishment method and device

Publications (1)

Publication Number Publication Date
WO2017092583A1 (zh) 2017-06-08

Family

ID=58796259

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/106581 Ceased WO2017092583A1 (zh) 2015-12-01 2016-11-21 Index establishment method and device

Country Status (5)

Country Link
US (1) US11003649B2 (en)
EP (1) EP3385864B1 (en)
JP (1) JP6898320B2 (ja)
CN (1) CN106815260B (zh)
WO (1) WO2017092583A1 (zh)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11023439B2 (en) * 2016-09-01 2021-06-01 Morphick, Inc. Variable cardinality index and data retrieval
CN110851438B (zh) * 2018-08-20 2025-03-18 北京京东尚科信息技术有限公司 Method and device for database index optimization suggestion and verification
CN110874358B (zh) * 2018-08-30 2023-05-05 阿里巴巴集团控股有限公司 Storage and retrieval method and device for multi-attribute columns, and electronic device
US10545960B1 (en) * 2019-03-12 2020-01-28 The Governing Council Of The University Of Toronto System and method for set overlap searching of data lakes
CN113297454B (zh) * 2020-04-14 2025-01-03 阿里巴巴集团控股有限公司 Retrieval method, query method, device, system, electronic device, and computer storage medium
CN113535733B (zh) * 2021-07-26 2024-08-06 北京锐安科技有限公司 Data storage and query method, device, computer equipment, and storage medium
CN114168800B (zh) * 2021-11-26 2024-09-13 哈尔滨工程大学 Conflict detection method based on a fusion tree of B+ tree and bitmap index
US12182093B2 (en) * 2022-09-27 2024-12-31 Ocient Holdings LLC Applying range-based filtering during query execution based on utilizing an inverted index structure
US12321387B2 (en) * 2023-03-10 2025-06-03 Equifax Inc. Automatically generating search indexes for expediting searching of a computerized database
CN116719843B (zh) * 2023-05-31 2025-11-07 中电科金仓(北京)科技股份有限公司 Query method for a database system, storage medium, and device
CN117573680B (zh) * 2024-01-17 2024-04-12 深圳市进择科技有限公司 Big-data-based positioning data transmission management system and method
CN118467669B (zh) * 2024-05-09 2025-02-25 深圳计算科学研究院 Index construction method, field search method, apparatus, device, and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
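US7640244B1 (en) * 2004-06-07 2009-12-29 Teradata Us, Inc. Dynamic partition enhanced joining using a value-count index
CN103390066A (zh) * 2013-08-08 2013-11-13 上海新炬网络技术有限公司 Global automated database optimization and early-warning device and processing method thereof
CN103810212A (zh) * 2012-11-14 2014-05-21 阿里巴巴集团控股有限公司 Method and system for automatically creating database indexes
CN105045851A (zh) * 2015-07-07 2015-11-11 福建天晴数码有限公司 Method and system for automatically creating database indexes based on log analysis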

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63201716A (ja) * 1987-02-17 1988-08-19 Nec Corp Index maintenance method
US5404510A (en) 1992-05-21 1995-04-04 Oracle Corporation Database index design based upon request importance and the reuse and modification of similar existing indexes
JPH0785093A (ja) * 1993-09-16 1995-03-31 Nissan Motor Co Ltd Automatic index setting method
US5907837A (en) * 1995-07-17 1999-05-25 Microsoft Corporation Information retrieval system in an on-line network including separate content and layout of published titles
US7392266B2 (en) * 2005-03-17 2008-06-24 International Business Machines Corporation Apparatus and method for monitoring usage of components in a database index
JP2007122405A (ja) * 2005-10-28 2007-05-17 Hitachi Ltd Performance tuning system for a database management system
JP5162215B2 (ja) * 2007-11-22 2013-03-13 株式会社エヌ・ティ・ティ・データ Data processing device, data processing method, and program
JP4237813B2 (ja) * 2008-05-26 2009-03-11 株式会社東芝 Structured document management system
US8489565B2 (en) * 2009-03-24 2013-07-16 Microsoft Corporation Dynamic integrated database index management
CN101609460B (zh) * 2009-07-22 2011-12-14 中国科学院地理科学与资源研究所 Retrieval method and retrieval system supporting heterogeneous geoscience data resources
US8655867B2 (en) * 2010-05-13 2014-02-18 Salesforce.Com, Inc. Method and system for optimizing queries in a multi-tenant database environment
US8412701B2 (en) * 2010-09-27 2013-04-02 Computer Associates Think, Inc. Multi-dataset global index
CN102467572B (zh) * 2010-11-17 2013-10-02 英业达股份有限公司 Data block query method supporting a data deduplication procedure
US8396858B2 (en) * 2011-08-11 2013-03-12 International Business Machines Corporation Adding entries to an index based on use of the index
CN102779180B (zh) * 2012-06-29 2015-09-09 华为技术有限公司 Operation processing method for a data storage system, and data storage system
US8825664B2 (en) * 2012-08-17 2014-09-02 Splunk Inc. Indexing preview
US20140317093A1 (en) * 2013-04-22 2014-10-23 Salesforce.Com, Inc. Facilitating dynamic creation of multi-column index tables and management of customer queries in an on-demand services environment
US20150032720A1 (en) * 2013-07-23 2015-01-29 Yahoo! Inc. Optimizing database queries
CN104714984A (zh) 2013-12-17 2015-06-17 中国移动通信集团湖南有限公司 Database optimization method and device
CN104112011B (zh) * 2014-07-16 2017-09-15 深圳国泰安教育技术股份有限公司 Method and device for extracting massive data
CN104182460B (zh) * 2014-07-18 2017-06-13 浙江大学 Time series similarity query method based on an inverted index
US9846746B2 (en) * 2014-11-20 2017-12-19 Facebook, Inc. Querying groups of users based on user attributes for social analytics
CN104834736A (zh) * 2015-05-19 2015-08-12 深圳证券信息有限公司 Method and device for building an index library, and retrieval method, device, and system
CN106815260B (zh) 2015-12-01 2021-05-04 阿里巴巴集团控股有限公司 Index establishment method and device
US10601593B2 (en) * 2016-09-23 2020-03-24 Microsoft Technology Licensing, Llc Type-based database confidentiality using trusted computing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3385864A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11003649B2 (en) 2015-12-01 2021-05-11 Alibaba Group Holding Limited Index establishment method and device
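CN111046130A (zh) * 2019-11-08 2020-04-21 杭州安恒信息技术股份有限公司 Association retrieval method combining ElasticSearch and FSM
CN111046130B (zh) * 2019-11-08 2023-05-23 杭州安恒信息技术股份有限公司 Association retrieval method combining ElasticSearch and FSM
CN116383144A (zh) * 2023-03-23 2023-07-04 中科星图股份有限公司 Multi-source heterogeneous remote sensing data storage method and device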

Also Published As

Publication number Publication date
JP2019502980A (ja) 2019-01-31
US20180276264A1 (en) 2018-09-27
US11003649B2 (en) 2021-05-11
EP3385864B1 (en) 2024-01-03
EP3385864A1 (en) 2018-10-10
CN106815260A (zh) 2017-06-09
CN106815260B (zh) 2021-05-04
EP3385864A4 (en) 2018-10-10
JP6898320B2 (ja) 2021-07-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16869893

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2018524442

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2016869893

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2016869893

Country of ref document: EP

Effective date: 20180702