CN107943922B - Method and device for retrieving information based on solr

Method and device for retrieving information based on solr

Info

Publication number
CN107943922B
Authority
CN
China
Prior art keywords
data
query
solr
loading
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711164079.1A
Other languages
Chinese (zh)
Other versions
CN107943922A (en)
Inventor
谢永恒
孟宪奎
火一莽
万月亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruian Technology Co Ltd
Original Assignee
Beijing Ruian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ruian Technology Co Ltd
Priority to CN201711164079.1A
Publication of CN107943922A
Application granted
Publication of CN107943922B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/319 Inverted lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a method and a device for Solr-based information retrieval. The method comprises: receiving an information retrieval request, obtaining the parameters in the request, and parsing and identifying the parameters; starting distributed query control, and starting row interruption control or query timeout control according to a trigger condition; and loading data by reverse loading of segment files and reverse loading of inverted lists, executing custom scoring, and responding to the request. The key points of the method and device are avoiding excessive index file loading and applying the principle of spatial locality, which ensures effective use of system resources and improves information retrieval performance.

Description

Method and device for retrieving information based on solr
Technical Field
The embodiment of the invention relates to the technical field of information retrieval, in particular to a method and a device for retrieving information based on solr.
Background
As big data technology is adopted across industries, querying massive data sets faces unprecedented challenges. In the big data field, only NoSQL databases such as HBase meet the requirements of high concurrency, high performance, and large storage capacity. However, HBase can only be queried by rowkey, so it cannot accommodate changing business requirements while preserving performance, concurrency, and storage. Rowkey design is strongly coupled to the specific business. Among secondary-index designs, one architecture uses the Solr search engine as the query entry point and HBase as the storage layer, which solves the problem that the rowkey constraint of HBase is too strong.
In the conventional way of using Solr, results are sorted by relevance by default. Sorting by relevance inevitably requires the Solr search engine to load all index files and score them. With massive data, index files are read frequently, system memory is reclaimed excessively often, CPU utilization is too high, and the system load stays at the warning threshold for long periods, so overall query performance and concurrency cannot be effectively improved. In particular, when Solr data is split into tables, the number of concurrent threads is directly determined by the number of tables. Furthermore, Solr cluster nodes frequently go offline, so the system cannot be used normally.
As the conventional use of Solr shows, loading too many index files is the root cause that prevents performance and concurrency from being effectively improved.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for retrieving information based on solr, aiming at improving the information retrieval efficiency.
To achieve the purpose, the embodiment of the invention adopts the following technical scheme:
in a first aspect, a method for Solr-based information retrieval is provided, the method comprising:
receiving an information retrieval request, obtaining the parameters in the request, and parsing and identifying the parameters;
starting distributed query control, and starting row interruption control or query timeout control according to a trigger condition;
and loading data by reverse loading of segment files and reverse loading of inverted lists, executing custom scoring, and responding to the request.
Optionally, the distributed query control includes:
during index writing, distributing data uniformly to each shard of the table by hashing;
over time, the total amount of data written to each shard is approximately equal.
Optionally, the query timeout control includes:
with Solr in its default configuration, setting a timeout;
starting a timer and running the query under timing;
checking the execution time;
if the query has timed out, interrupting the query;
if the query has not timed out, executing the complete query.
Optionally, the reverse loading of segment files includes:
using horizontal table partitioning to physically isolate data horizontally and, relying on the fact that data is read in the order it was written, querying by loading the data of the latest table and applying data read interruption;
Solr's internal default control logic loads segment files in order from smallest to largest; by extending Solr's default implementation interface, segment files are instead loaded in order from largest to smallest.
Optionally, the data read interruption comprises:
defining the number of hits for a collector and, while looping over documents during collection, intercepting to check whether the defined expected number has been reached;
if so, executing interrupt control and responding to the request directly, so that the next segment file is not scanned.
Optionally, the reverse loading of the inverted list includes:
during document collection, using a priority min-heap queue of a defined size, putting each matching record into the queue, and moving records in and out of the queue by a priority algorithm;
after a segment has been scanned, returning directly if the required number of records has been reached;
if the required number of records has not been reached, continuing to scan the next segment until the expected number is reached.
Optionally, executing the custom scoring comprises:
on the premise that relevance scoring is not applied, extending Solr's scoring with a custom similarity, weight, or scorer, managed as a singleton.
In a second aspect, an apparatus for Solr-based information retrieval is provided, the apparatus comprising:
an analysis module, configured to receive an information retrieval request, obtain the parameters in the request, and parse and identify the parameters;
a starting module, configured to start distributed query control, and to start row interruption control or query timeout control according to a trigger condition;
and a loading module, configured to load data by reverse loading of segment files and reverse loading of inverted lists, execute custom scoring, and respond to the request.
Optionally, the starting module is specifically configured to:
during index writing, distribute data uniformly to each shard of the table by hashing;
over time, keep the total amount of data written to each shard approximately equal;
with Solr in its default configuration, set a timeout;
start a timer and run the query under timing;
check the execution time;
if the query has timed out, interrupt the query;
if the query has not timed out, execute the complete query.
Optionally, the loading module is specifically configured to:
use horizontal table partitioning to physically isolate data horizontally and, relying on the fact that data is read in the order it was written, query by loading the data of the latest table and applying data read interruption;
where Solr's internal default control logic loads segment files in order from smallest to largest, load segment files in order from largest to smallest by extending Solr's default implementation interface;
the data read interruption comprises:
defining the number of hits for a collector and, while looping over documents during collection, intercepting to check whether the defined expected number has been reached;
if so, executing interrupt control and responding to the request directly, so that the next segment file is not scanned;
during document collection, use a priority min-heap queue of a defined size, put each matching record into the queue, and move records in and out of the queue by a priority algorithm;
after a segment has been scanned, return directly if the required number of records has been reached;
if the required number of records has not been reached, continue to scan the next segment until the expected number is reached;
on the premise that relevance scoring is not applied, extend Solr's scoring with a custom similarity, weight, or scorer, managed as a singleton.
The beneficial effects of the embodiment of the invention are as follows: by avoiding excessive index file loading and applying the principle of spatial locality, the method and device ensure effective use of system resources and improve information retrieval performance.
Drawings
Fig. 1 is a schematic flowchart of a method for retrieving information based on solr according to an embodiment of the present invention;
Fig. 2 is a schematic functional module diagram of an apparatus for retrieving information based on solr according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.
Referring to fig. 1, fig. 1 is a schematic flowchart of a method for retrieving information based on solr according to an embodiment of the present invention. As shown in fig. 1, the method includes:
step 110, receiving an information retrieval request, obtaining the parameters in the request, and parsing and identifying the parameters;
step 120, starting distributed query control, and starting row interruption control or query timeout control according to a trigger condition;
wherein the distributed query control comprises:
during index writing, distributing data uniformly to each shard of the table by hashing;
over time, the total amount of data written to each shard is approximately equal.
Illustratively, during index writing, data is distributed uniformly to each shard of the table by hashing, so that over time the total amount of data written to each shard is approximately equal. Reading mirrors this process: the rows requested by each query are divided evenly among the shards, and each shard reads only its share, that is, the requested row count divided by the number of shards, rounded up. This keeps the number of records queried per shard relatively small and reduces the volume of data transferred over the network, which greatly improves query performance and reduces resource consumption.
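The write and read sides of distributed query control can be sketched as follows. This is a minimal illustration in Python rather than Solr code: the function names, the use of CRC32 as the hash, and the in-memory lists standing in for shards are all assumptions made for the example, not details taken from the patent.

```python
import math
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard for a document by hashing its id (CRC32 is an illustrative choice)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def rows_per_shard(rows: int, num_shards: int) -> int:
    """Per-shard row quota: the requested rows split evenly, rounded up."""
    return math.ceil(rows / num_shards)


def distributed_query(shards, predicate, rows):
    """Ask every shard for at most its quota, then merge and trim to `rows`."""
    quota = rows_per_shard(rows, len(shards))
    merged = []
    for shard in shards:
        hits = [doc for doc in shard if predicate(doc)][:quota]
        merged.extend(hits)
    return merged[:rows]


# Write side: hash 1000 hypothetical documents into 4 in-memory "shards".
NUM_SHARDS = 4
shards = [[] for _ in range(NUM_SHARDS)]
for i in range(1000):
    doc_id = f"doc-{i}"
    shards[shard_for(doc_id, NUM_SHARDS)].append(doc_id)

# Read side: 10 rows over 4 shards means each shard returns at most 3.
result = distributed_query(shards, lambda d: True, rows=10)
```

The per-shard quota only yields a full result when matching documents are spread roughly evenly, which is what hashing on write is meant to provide.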
The query timeout control includes:
with Solr in its default configuration, setting a timeout;
starting a timer and running the query under timing;
checking the execution time;
if the query has timed out, interrupting the query;
if the query has not timed out, executing the complete query.
In the invention, taking into account factors such as row interruption and reverse reading, the functions of the system are extended so that timeout and interruption are combined, giving two-way control over both. Under the dual control of row interruption and timeout interruption, valid query data is guaranteed to be available.
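The dual control described above can be sketched as a scan loop that stops on whichever limit is reached first. This is a hedged Python illustration, not Solr's implementation: `controlled_query`, its parameters, and the injectable `clock` are hypothetical names introduced for the example.

```python
import time


def controlled_query(docs, predicate, max_rows, timeout_s, clock=time.monotonic):
    """Scan `docs`, stopping on whichever comes first: enough rows or the deadline.

    Returns (hits, reason), where reason is "rows", "timeout" or "complete".
    """
    deadline = clock() + timeout_s  # start the timer
    hits = []
    for doc in docs:
        if clock() >= deadline:
            return hits, "timeout"  # timeout interruption: return the partial hits
        if predicate(doc):
            hits.append(doc)
            if len(hits) >= max_rows:
                return hits, "rows"  # row interruption: enough hits collected
    return hits, "complete"  # neither limit was reached: the full query ran
```

Returning the hits collected so far in the timeout case is one way to read the statement that valid query data remains available under both kinds of interruption.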
Step 130, loading data by reverse loading of segment files and reverse loading of inverted lists, executing custom scoring, and responding to the request.
Wherein, the reverse loading of segment files includes:
using horizontal table partitioning to physically isolate data horizontally and, relying on the fact that data is read in the order it was written, querying by loading the data of the latest table and applying data read interruption;
Solr's internal default control logic loads segment files in order from smallest to largest; by extending Solr's default implementation interface, segment files are instead loaded in order from largest to smallest.
Illustratively, the purpose of horizontal table partitioning is to physically isolate data horizontally, which prevents any single table from growing too large and makes horizontal scaling easier. At the same time, relying on the business characteristic that data is read in the order it was written, queries are served by loading the data of the latest table and applying the data read interruption technique. Solr's internal default control logic loads segment files in order from smallest to largest, which conflicts with this business requirement; therefore, segment files are loaded in order from largest to smallest by extending Solr's default implementation interface. The detailed process is shown in fig. 2. Together, these two strategies ensure that the latest data is read.
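A minimal Python sketch of reverse segment loading follows. It assumes, for illustration only, that segments are held in a list ordered oldest-first and that each segment is a list of documents in write order; `search_newest_first` is a hypothetical helper, not a Solr interface.

```python
def search_newest_first(segments, predicate, wanted):
    """Scan segments from the most recently written to the oldest.

    `segments` is ordered oldest-first (the default order the text describes).
    Returns (hits, segments_scanned), with hits ordered newest first.
    """
    hits = []
    scanned = 0
    for segment in reversed(segments):
        scanned += 1
        # Each segment is read front to back, then its matches are reversed
        # so that the newest documents come first in the result.
        matches = [doc for doc in segment if predicate(doc)]
        hits.extend(reversed(matches))
        if len(hits) >= wanted:
            break  # enough hits: older segments are never opened
    return hits[:wanted], scanned
```

For example, with three segments and `wanted=2`, a query that matches every document is answered from the newest segment alone, which is the saving in index loading that the passage aims for.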
Wherein the data read interruption comprises:
defining the number of hits for a collector and, while looping over documents during collection, intercepting to check whether the defined expected number has been reached;
if so, executing interrupt control and responding to the request directly, so that the next segment file is not scanned.
Illustratively, data read interruption is implemented by extending the collector that Solr provides. The main design principle is to define the number of hits for the collector and, while looping over documents during collection, to intercept and check whether the defined expected number has been reached. If so, interrupt control is executed and the request is answered directly, so that the next segment file is not scanned.
During document collection, the read and query control of the inverted list is also extended. The main reason lies in the structure of the inverted list: the most recently written data is placed at the end of the list, and reading starts from the beginning (in the latest version of Solr, an inverted list cannot be read from the end). To obtain the latest data, the whole inverted list must therefore be read and the last matching documents, up to the required number, retained. The implementation uses a priority min-heap queue of a defined size: each matching record is put into the queue, and records move in and out of the queue by a priority algorithm (the min-heap queue provided by Solr is used). After a segment has been scanned, the result is returned directly if the required number of records has been reached; otherwise the next segment is scanned, until the expected number is reached.
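The bounded min-heap and the segment-level interruption can be sketched together. The Python below is an illustration under stated assumptions: document ids are taken to increase with write order, each postings list is a plain iterable read front to back, and Python's `heapq` stands in for the min-heap queue that the text says Solr provides. Both function names are hypothetical.

```python
import heapq


def last_n_matches(postings, predicate, n):
    """Keep the n highest doc ids that satisfy `predicate` in one forward pass.

    `postings` is only iterated from the start, mirroring an inverted list
    that cannot be read backwards. The heap never holds more than n entries.
    """
    if n <= 0:
        return []
    heap = []  # min-heap: heap[0] is the oldest doc id currently kept
    for doc_id in postings:
        if not predicate(doc_id):
            continue
        if len(heap) < n:
            heapq.heappush(heap, doc_id)
        elif doc_id > heap[0]:
            heapq.heapreplace(heap, doc_id)  # evict the oldest, admit the newer
    return sorted(heap, reverse=True)  # newest first


def collect(segments_newest_first, predicate, n):
    """Scan segment by segment and stop as soon as n records are collected."""
    collected = []
    for postings in segments_newest_first:
        collected.extend(last_n_matches(postings, predicate, n - len(collected)))
        if len(collected) >= n:
            break  # expected count met: the next segment is not scanned
    return collected
```

Because the heap is capped at the requested size, memory use stays bounded even though the whole inverted list of a segment has to be traversed.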
Wherein, the reverse loading of the inverted list includes:
during document collection, using a priority min-heap queue of a defined size, putting each matching record into the queue, and moving records in and out of the queue by a priority algorithm;
after a segment has been scanned, returning directly if the required number of records has been reached;
if the required number of records has not been reached, continuing to scan the next segment until the expected number is reached.
Wherein executing the custom scoring comprises:
on the premise that relevance scoring is not applied, extending Solr's scoring with a custom similarity, weight, or scorer, managed as a singleton.
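One possible reading of singleton-managed custom scoring is sketched below in Python. The patent does not say what the custom scorer computes; the constant score used here, the class name `ConstantScorer`, and its `score` signature are assumptions chosen only to show a scorer that needs no index statistics and is created once.

```python
class ConstantScorer:
    """Scorer that skips relevance computation and gives every hit the same score."""

    _instance = None

    def __new__(cls):
        # Singleton: build the scorer once and hand the same object to every query.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def score(self, doc_id, query_terms):
        # No term statistics are consulted, so scoring needs no extra index data.
        return 1.0
```

Every call to `ConstantScorer()` returns the same object, so queries share one scorer instead of constructing a new one per request.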
The beneficial effects of the embodiment of the invention are as follows: by avoiding excessive index file loading and applying the principle of spatial locality, the method ensures effective use of system resources and improves information retrieval performance.
Referring to fig. 2, fig. 2 is a functional module schematic diagram of an apparatus for retrieving information based on solr according to an embodiment of the present invention. As shown in fig. 2, the apparatus includes:
an analysis module 210, configured to receive an information retrieval request, obtain the parameters in the request, and parse and identify the parameters;
a starting module 220, configured to start distributed query control, and to start row interruption control or query timeout control according to a trigger condition;
and a loading module 230, configured to load data by reverse loading of segment files and reverse loading of inverted lists, execute custom scoring, and respond to the request.
Optionally, the starting module 220 is specifically configured to:
during index writing, distribute data uniformly to each shard of the table by hashing;
over time, keep the total amount of data written to each shard approximately equal;
with Solr in its default configuration, set a timeout;
start a timer and run the query under timing;
check the execution time;
if the query has timed out, interrupt the query;
if the query has not timed out, execute the complete query.
Optionally, the loading module 230 is specifically configured to:
use horizontal table partitioning to physically isolate data horizontally and, relying on the fact that data is read in the order it was written, query by loading the data of the latest table and applying data read interruption;
where Solr's internal default control logic loads segment files in order from smallest to largest, load segment files in order from largest to smallest by extending Solr's default implementation interface;
the data read interruption comprises:
defining the number of hits for a collector and, while looping over documents during collection, intercepting to check whether the defined expected number has been reached;
if so, executing interrupt control and responding to the request directly, so that the next segment file is not scanned;
during document collection, use a priority min-heap queue of a defined size, put each matching record into the queue, and move records in and out of the queue by a priority algorithm;
after a segment has been scanned, return directly if the required number of records has been reached;
if the required number of records has not been reached, continue to scan the next segment until the expected number is reached;
on the premise that relevance scoring is not applied, extend Solr's scoring with a custom similarity, weight, or scorer, managed as a singleton.
The beneficial effects of the embodiment of the invention are as follows: by avoiding excessive index file loading and applying the principle of spatial locality, the apparatus ensures effective use of system resources and improves information retrieval performance.
The technical principle of the embodiment of the present invention is described above in conjunction with the specific embodiments. The description is only intended to explain the principles of embodiments of the invention and should not be taken in any way as limiting the scope of the embodiments of the invention. Based on the explanations herein, those skilled in the art will be able to conceive of other embodiments of the present invention without inventive step, and these embodiments will fall within the scope of the present invention.

Claims (9)

1. A method for Solr-based information retrieval, the method comprising:
receiving an information retrieval request, obtaining the parameters in the request, and parsing and identifying the parameters;
starting distributed query control, and starting row interruption control or query timeout control according to a trigger condition;
loading data by reverse loading of segment files and reverse loading of inverted lists, executing custom scoring, and responding to the request;
wherein the reverse loading of segment files comprises:
using horizontal table partitioning to physically isolate data horizontally and, relying on the fact that data is read in the order it was written, querying by loading the data of the latest table and applying data read interruption;
Solr's internal default control logic loads segment files in order from smallest to largest; by extending Solr's default implementation interface, segment files are instead loaded in order from largest to smallest.
2. The method of claim 1, wherein the distributed query control comprises:
during index writing, distributing data uniformly to each shard of the table by hashing;
over time, the total amount of data written to each shard is approximately equal.
3. The method of claim 1, wherein the query timeout control comprises:
with Solr in its default configuration, setting a timeout;
starting a timer and running the query under timing;
checking the execution time;
if the query has timed out, interrupting the query;
if the query has not timed out, executing the complete query.
4. The method of claim 1, wherein the data read interruption comprises:
defining the number of hits for a collector and, while looping over documents during collection, intercepting to check whether the defined expected number has been reached;
if so, executing interrupt control and responding to the request directly, so that the next segment file is not scanned.
5. The method of claim 1, wherein the reverse loading of the inverted list comprises:
during document collection, using a priority min-heap queue of a defined size, putting each matching record into the queue, and moving records in and out of the queue by a priority algorithm;
after a segment has been scanned, returning directly if the required number of records has been reached;
if the required number of records has not been reached, continuing to scan the next segment until the expected number is reached.
6. The method of claim 1, wherein executing the custom scoring comprises:
on the premise that relevance scoring is not applied, extending Solr's scoring with a custom similarity, weight, or scorer, managed as a singleton.
7. An apparatus for Solr-based information retrieval, the apparatus comprising:
an analysis module, configured to receive an information retrieval request, obtain the parameters in the request, and parse and identify the parameters;
a starting module, configured to start distributed query control, and to start row interruption control or query timeout control according to a trigger condition;
a loading module, configured to load data by reverse loading of segment files and reverse loading of inverted lists, execute custom scoring, and respond to the request;
wherein the loading module is specifically configured to:
use horizontal table partitioning to physically isolate data horizontally and, relying on the fact that data is read in the order it was written, query by loading the data of the latest table and applying data read interruption;
where Solr's internal default control logic loads segment files in order from smallest to largest, load segment files in order from largest to smallest by extending Solr's default implementation interface.
8. The apparatus according to claim 7, wherein the starting module is specifically configured to:
during index writing, distribute data uniformly to each shard of the table by hashing;
over time, keep the total amount of data written to each shard approximately equal;
with Solr in its default configuration, set a timeout;
start a timer and run the query under timing;
check the execution time;
if the query has timed out, interrupt the query;
if the query has not timed out, execute the complete query.
9. The apparatus of claim 7, wherein the data read interruption comprises:
defining the number of hits for a collector and, while looping over documents during collection, intercepting to check whether the defined expected number has been reached;
if so, executing interrupt control and responding to the request directly, so that the next segment file is not scanned;
during document collection, using a priority min-heap queue of a defined size, putting each matching record into the queue, and moving records in and out of the queue by a priority algorithm;
after a segment has been scanned, returning directly if the required number of records has been reached;
if the required number of records has not been reached, continuing to scan the next segment until the expected number is reached;
on the premise that relevance scoring is not applied, extending Solr's scoring with a custom similarity, weight, or scorer, managed as a singleton.
CN201711164079.1A 2017-11-21 2017-11-21 Method and device for retrieving information based on solr Active CN107943922B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711164079.1A CN107943922B (en) 2017-11-21 2017-11-21 Method and device for retrieving information based on solr

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711164079.1A CN107943922B (en) 2017-11-21 2017-11-21 Method and device for retrieving information based on solr

Publications (2)

Publication Number Publication Date
CN107943922A (en) 2018-04-20
CN107943922B (en) 2020-08-25

Family

ID=61929501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711164079.1A Active CN107943922B (en) 2017-11-21 2017-11-21 Method and device for retrieving information based on solr

Country Status (1)

Country Link
CN (1) CN107943922B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455619A (en) * 2013-09-12 2013-12-18 焦点科技股份有限公司 Grading treatment method and system based on Lucene fragmentation structure
CN104408065A (en) * 2014-10-29 2015-03-11 中国建设银行股份有限公司 Trade information on-line inquiry method and device
CN106776929A (en) * 2016-11-30 2017-05-31 北京锐安科技有限公司 A kind of method for information retrieval and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100217665A1 (en) * 2009-02-25 2010-08-26 Vishal Naresh Sharma Method and system for launching an advertising campaign
US20130151534A1 (en) * 2011-12-08 2013-06-13 Digitalsmiths, Inc. Multimedia metadata analysis using inverted index with temporal and segment identifying payloads

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
建立基于Solr平台的质量信息检索系统;牛涛;《电子科学技术》;20160910(第05期);590-593 *

Also Published As

Publication number Publication date
CN107943922A (en) 2018-04-20

Similar Documents

Publication Publication Date Title
US9298761B2 (en) Adaptive merging in database indexes
US9372867B2 (en) Similarity analysis method, apparatus, and system
RU2663358C2 (en) Clustering storage method and device
US7921085B2 (en) Method and system for quantifying a data page repetition pattern for a database index in a database management system
WO2019184618A1 (en) Method and device for storing data, server, and storage medium
CN106919675B (en) Data storage method and device
CN113836084A (en) Data storage method, device and system
CN105378716B (en) A kind of conversion method and device of data memory format
US20190236201A1 (en) Techniques for processing database tables using indexes
US20200349113A1 (en) File storage method, deletion method, server and storage medium
CN105786918B (en) Data query method and device based on data loading storage space
CN104239377A (en) Platform-crossing data retrieval method and device
CN107515931B (en) Repeated data detection method based on clustering
CN109299101B (en) Data retrieval method, device, server and storage medium
CN103377292B (en) Database result set caching method and device
CN109033295B (en) Method and device for merging super-large data sets
Chai et al. Adaptive lower-level driven compaction to optimize LSM-tree key-value stores
Wang et al. PLSM: a highly efficient LSM-tree index supporting real-time big data analysis
KR101666440B1 (en) Data processing method in In-memory Database System based on Circle-Queue
Shi et al. ByteSeries: an in-memory time series database for large-scale monitoring systems
CN114281819A (en) Data query method, device, equipment and storage medium
CN107943922B (en) Method and device for retrieving information based on solr
CN113760190A (en) Small file merging system and method based on Ceph storage
CN113282618A (en) Optimization scheme and system for retrieval of active clusters of Elasticissearch
WO2008085358A1 (en) Accelerating queries using temporary enumeration representation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant