CN111597212A - Data retrieval method and device - Google Patents


Info

Publication number
CN111597212A
Authority
CN
China
Prior art keywords
retrieval
request
data
data retrieval
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010445888.5A
Other languages
Chinese (zh)
Other versions
CN111597212B (en)
Inventor
崔大鹏
白宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Supertool Internet Technology Ltd
Original Assignee
Beijing Supertool Internet Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Supertool Internet Technology Ltd
Priority to CN202010445888.5A
Publication of CN111597212A
Application granted
Publication of CN111597212B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2453: Query optimisation
    • G06F16/24532: Query optimisation of parallel queries
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a data retrieval method and a data retrieval device in the field of data processing. The method comprises the following steps: receiving a data retrieval request; splitting the data retrieval request according to a preset request splitting rule to obtain a plurality of split requests; performing data retrieval on each of the split requests to obtain a plurality of retrieval results in one-to-one correspondence with the split requests; and summarizing the plurality of retrieval results to obtain a data retrieval result corresponding to the data retrieval request. This implementation improves retrieval speed and retrieval efficiency while avoiding occupying a large amount of the retrieval server's memory, and thereby improves retrieval performance.

Description

Data retrieval method and device
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data retrieval method and apparatus.
Background
With the development of modern network and communication technology, the volume of information resources is growing rapidly, and people increasingly need to extract the data required for their daily work from large amounts of information by means of retrieval. Full-text retrieval technology therefore urgently needs to develop quickly. Existing data retrieval methods generally perform, for every retrieval request input by a user, a comprehensive search in a full database that contains all of the original data, obtain a retrieval result, and return it to the user. In practice, however, because the full database contains the full raw data, running a full search for every retrieval consumes a large amount of the retrieval server's memory, which makes retrieval slow and inefficient.
Disclosure of Invention
An object of the embodiments of the present application is to provide a data retrieval method and apparatus, which can improve retrieval speed and retrieval efficiency while avoiding occupying a large amount of memory of a retrieval server, thereby improving retrieval performance.
A first aspect of an embodiment of the present application provides a data retrieval method, where the data retrieval method includes:
receiving a data retrieval request;
splitting the data retrieval request according to a preset request splitting rule to obtain a plurality of split requests;
respectively carrying out data retrieval on the plurality of split requests to obtain a plurality of retrieval results corresponding to the plurality of split requests one by one;
and summarizing the plurality of retrieval results to obtain a data retrieval result corresponding to the data retrieval request.
In the implementation process, the method first receives the data retrieval request and then splits it according to a preset request splitting rule to obtain a plurality of split requests. Data retrieval is then performed for each split request to obtain a plurality of retrieval results in one-to-one correspondence with the split requests. Finally, the retrieval results are summarized to obtain the final data retrieval result. By splitting one large retrieval into multiple sub-retrievals, this implementation avoids occupying a large amount of the retrieval server's memory; by processing the sub-retrievals with multiple threads, it greatly improves retrieval efficiency; and because the retrieval is simplified, with a corresponding retrieval triggered for each part of the request, retrieval performance is improved.
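The split, retrieve, and summarize flow can be sketched as follows. This is a minimal illustration only, not the patented implementation: the function `retrieve`, its parameters, and the toy `split_range` helper are hypothetical names, and a thread pool is assumed as one possible form of the multi-thread processing mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

Req = TypeVar("Req")
Res = TypeVar("Res")
Out = TypeVar("Out")


def retrieve(
    request: Req,
    split: Callable[[Req], Sequence[Req]],
    search: Callable[[Req], Res],
    summarize: Callable[[List[Res]], Out],
    max_workers: int = 4,
) -> Out:
    """Split a request, search each part in a worker thread, then summarize."""
    split_requests = split(request)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so results[i] belongs to
        # split_requests[i] (the one-to-one correspondence).
        results = list(pool.map(search, split_requests))
    return summarize(results)


# Toy usage (hypothetical): the "request" is a range of integers that is
# split into chunks of three; each chunk is "searched" by summing it.
def split_range(r: range) -> List[range]:
    return [r[i:i + 3] for i in range(0, len(r), 3)]


total = retrieve(range(10), split_range, sum, sum)
```

In a real deployment `search` would issue one query per split request, so the peak memory needed by any single query is bounded by the size of one split rather than the whole request.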
Further, the step of performing data retrieval on the multiple split requests respectively to obtain multiple retrieval results corresponding to the multiple split requests one to one includes:
classifying the plurality of split requests according to a preset request classification rule to obtain a full retrieval request set and a surface retrieval request set;
performing full-scale retrieval processing in a preset full-scale database according to the full-scale retrieval request set to obtain a full-scale retrieval result;
performing surface retrieval processing in a preset surface database according to the surface retrieval request set to obtain a surface retrieval result;
and determining a plurality of retrieval results which correspond to the plurality of split requests one by one according to the full retrieval result and the surface retrieval result.
In the implementation process, to obtain the plurality of retrieval results, the method first classifies the split requests according to a preset request classification rule into two sets: a full retrieval request set, made up of full retrieval requests, and a surface retrieval request set, made up of surface retrieval requests. Full retrieval processing is then performed in a preset full database according to the full retrieval request set to obtain a full retrieval result, and surface retrieval processing is performed in a preset surface database according to the surface retrieval request set to obtain a surface retrieval result. Once both results are available, the plurality of retrieval results are determined from them. Classifying the split requests in this way divides the retrieval into two distinct kinds, which makes each retrieval more targeted and improves retrieval precision and effectiveness.
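The classification step could look like the sketch below. The text does not specify the preset request classification rule, so the rule used here is an assumption: a split request is treated as a full retrieval request if its DSL-like body contains any full-text query clause, and as a surface retrieval request otherwise. The names `FULL_TEXT_CLAUSES`, `needs_full_text`, and `classify`, and the field names in the sample requests, are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Assumption: these Elasticsearch clause names indicate word-segmented
# (full-text) matching, which only the full database can serve.
FULL_TEXT_CLAUSES = {"match", "match_phrase", "multi_match", "query_string"}


def needs_full_text(node: Any) -> bool:
    """Return True if any nested dict key is a full-text clause."""
    if isinstance(node, dict):
        return any(
            key in FULL_TEXT_CLAUSES or needs_full_text(value)
            for key, value in node.items()
        )
    if isinstance(node, list):
        return any(needs_full_text(item) for item in node)
    return False


def classify(
    split_requests: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Partition split requests into (full set, surface set)."""
    full_set: List[Dict[str, Any]] = []
    surface_set: List[Dict[str, Any]] = []
    for request in split_requests:
        (full_set if needs_full_text(request) else surface_set).append(request)
    return full_set, surface_set


requests = [
    {"query": {"bool": {"must": [{"match": {"body": "graph search"}}]}}},
    {"query": {"bool": {"filter": [{"term": {"author_id": "a-17"}}]}}},
]
full_set, surface_set = classify(requests)
```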
Further, the step of performing a surface layer search process in a preset surface layer database according to the surface layer search request set to obtain a surface layer search result includes:
determining a mapping retrieval request set corresponding to the surface layer retrieval request set in a preset request mapping library;
determining a last update time set formed by the last update time corresponding to each split request in the surface layer retrieval request set according to the mapping retrieval request set;
and performing surface retrieval processing in a preset surface database according to the last updating time set and the surface retrieval request set to obtain a surface retrieval result.
In the implementation process, to obtain the surface retrieval result, the method first determines, in a preset request mapping library, the mapping retrieval request set corresponding to the surface retrieval request set. The mapping retrieval request set is then used to determine the last update time set, made up of the last update time corresponding to each split request in the surface retrieval request set. Surface retrieval processing is then performed in a preset surface database according to the last update time set and the surface retrieval request set to obtain the surface retrieval result. Mapping the retrieval requests in this way addresses the cost of full-text retrieval and reduces retrieval pressure, while introducing the data update time effectively guards against errors caused by data changes and thereby improves retrieval accuracy.
Further, the step of performing a surface retrieval process in a preset surface database according to the last update time set and the surface retrieval request set to obtain a surface retrieval result includes:
acquiring the last database updating time of a preset surface database;
splitting the surface layer retrieval request set according to the last updating time set to obtain a first request subset and a second request subset; the last update time corresponding to each split request in the first request subset is less than the database update time, and the last update time corresponding to each split request in the second request subset is greater than or equal to the database update time;
determining a mapping retrieval request subset corresponding to the second request subset in the mapping retrieval request set;
performing surface retrieval processing in the surface database according to the mapping retrieval request subset to obtain a second retrieval result;
and aggregating the second retrieval result to obtain a surface retrieval result.
In the implementation process, the method obtains the database update time and compares it with the last update time of each retrieval request to identify stable data, that is, the data covered by the second request subset. Surface retrieval processing is performed on this part to obtain an accurate second retrieval result. In other words, the method judges data stability from the database update time and the update time of each retrieval request, chooses the retrieval type according to that stability, and then performs the corresponding retrieval efficiently and accurately to obtain an accurate surface retrieval result.
Further, after the step of splitting the surface layer retrieval request set according to the last update time set to obtain a first request subset and a second request subset, the method further includes:
according to the first request subset, carrying out full-scale retrieval processing in the full-scale database to obtain a first retrieval result;
the step of aggregating the second search results to obtain a surface layer search result comprises:
and aggregating the first retrieval result and the second retrieval result to obtain a surface retrieval result.
In the implementation process, the method also obtains the first request subset, whose data is in a more active state, performs full retrieval processing according to it to obtain a first retrieval result, and then aggregates the first and second retrieval results to obtain a complete surface retrieval result. A secondary retrieval is thus completed within the surface retrieval process, which improves retrieval precision.
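The routing by update time can be sketched as below. Only the comparison rule comes from the text (last update time less than the database update time goes to the first subset; greater than or equal goes to the second); everything else is assumed for illustration: requests are represented by string identifiers, and `partition_by_update_time`, `surface_retrieve`, `full_search`, and `surface_search` are hypothetical names.

```python
from datetime import datetime
from typing import Callable, Dict, List, Tuple


def partition_by_update_time(
    requests: List[str],
    last_update: Dict[str, datetime],
    db_update_time: datetime,
) -> Tuple[List[str], List[str]]:
    """Split requests into (first subset, second subset) by update time."""
    first: List[str] = []
    second: List[str] = []
    for req in requests:
        if last_update[req] < db_update_time:
            first.append(req)   # older than the database update: full retrieval
        else:
            second.append(req)  # at or after the database update: surface retrieval
    return first, second


def surface_retrieve(
    requests: List[str],
    last_update: Dict[str, datetime],
    mapping: Dict[str, str],
    db_update_time: datetime,
    full_search: Callable[[str], str],
    surface_search: Callable[[str], str],
) -> List[str]:
    first, second = partition_by_update_time(requests, last_update, db_update_time)
    first_results = [full_search(req) for req in first]
    # Only the second subset is looked up in the mapping retrieval requests.
    second_results = [surface_search(mapping[req]) for req in second]
    return first_results + second_results  # aggregated surface retrieval result


db_update_time = datetime(2020, 5, 10)
last_update = {
    "q-2020-03": datetime(2020, 4, 1),
    "q-2020-04": datetime(2020, 5, 10),
    "q-2020-05": datetime(2020, 5, 20),
}
mapping = {req: f"mapped:{req}" for req in last_update}
results = surface_retrieve(
    list(last_update), last_update, mapping, db_update_time,
    full_search=lambda req: f"full({req})",
    surface_search=lambda mapped: f"surface({mapped})",
)
```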
Further, after the receiving a data retrieval request, the method further comprises:
storing the data retrieval request;
acquiring storage time for storing the data retrieval request;
performing hotspot query analysis processing on the data retrieval request to obtain an analysis result;
and updating a preset request mapping library according to the storage time and the analysis result.
In the implementation process, after receiving the data retrieval request, the method stores it, records the storage time, and performs hotspot query analysis on it to obtain an analysis result, so that the request mapping library can be updated according to the storage time and the analysis result. Each incoming data retrieval request thus triggers an update of the request mapping library, which keeps the library current and improves the overall data retrieval effect.
Further, the updating the preset request mapping library according to the storage time and the analysis result includes:
judging whether a preset request mapping library stores a target mapping request matched with the data retrieval request or not according to the analysis result;
if so, acquiring the target mapping request and the target last updating time corresponding to the target mapping request;
acquiring a target identifier matched with the data retrieval request according to the storage time and the target last updating time;
and updating the request mapping library according to the target identification, the data retrieval request and the target last updating time.
In the implementation process, the method first checks for a target mapping request. When one is found, it acquires the target last update time of that mapping request, obtains the target identifier matching the data retrieval request according to the target last update time and the storage time of the data retrieval request, and updates the request mapping library according to the target identifier, the data retrieval request, and the target last update time. Data updates are thus completed accurately and in real time, which ensures the accuracy of data retrieval.
A second aspect of the embodiments of the present application provides a data retrieval apparatus, including:
a receiving unit, configured to receive a data retrieval request;
a splitting unit, configured to split the data retrieval request according to a preset request splitting rule to obtain a plurality of split requests;
a retrieval unit, configured to perform data retrieval on the plurality of split requests respectively to obtain a plurality of retrieval results in one-to-one correspondence with the plurality of split requests;
and a summarizing unit, configured to summarize the plurality of retrieval results to obtain the data retrieval result corresponding to the data retrieval request.
In the implementation process, the data retrieval device uses a plurality of units to receive the data retrieval request, split it, retrieve each split request, and summarize the retrieval results, so that receiving and summarizing are integrated in one device. The data retrieval device can therefore retrieve automatically, which improves the efficiency of data retrieval and ensures its precision.
A third aspect of embodiments of the present application provides an electronic device, including a memory and a processor, where the memory is used to store a computer program, and the processor runs the computer program to enable the electronic device to execute the data retrieval method according to any one of the first aspect of embodiments of the present application.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium, which stores computer program instructions, and when the computer program instructions are read and executed by a processor, the computer program instructions perform the data retrieval method according to any one of the first aspect of the embodiments of the present application.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic flowchart of a data retrieval method according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of another data retrieval method according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a data retrieval device according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of another data retrieval device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Example 1
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a data retrieval method according to an embodiment of the present application. The method can be applied to various retrieval scenarios, and in particular to scenarios that use the Elasticsearch search engine. The data retrieval method comprises the following steps:
S101, receiving a data retrieval request.
In this embodiment, the data retrieval request may be a query aggregation request of a word segmentation search class.
In this embodiment, the data retrieval request may be expressed in DSL (Domain Specific Language), the JSON-formatted query language supported by Elasticsearch. A DSL request generally includes a query and filtering part, an aggregation statistics part, and auxiliary parts such as highlighting. In the basic query process, the query and filtering part finds the hit data, and the aggregation statistics part computes results over the hit data.
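As an illustration of the three parts, a DSL request body might be built as the Python dictionary below and serialized to JSON. The top-level keys `query`, `aggs`, and `highlight` are standard Elasticsearch request-body keys; the field names (`body`, `publish_date`, `category_id`), the aggregation name, and the search terms are hypothetical and not taken from the patent.

```python
import json

dsl_request = {
    # Query and filtering part: selects the hit data.
    "query": {
        "bool": {
            "must": [{"match": {"body": "data retrieval"}}],
            "filter": [
                {"range": {"publish_date": {"gte": "2020-01-01", "lt": "2020-06-01"}}}
            ],
        }
    },
    # Aggregation statistics part: computed over the hit data.
    "aggs": {"papers_per_category": {"terms": {"field": "category_id"}}},
    # Auxiliary part: highlight matched fragments of the text field.
    "highlight": {"fields": {"body": {}}},
}

request_body = json.dumps(dsl_request)
```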
S102, splitting the data retrieval request according to a preset request splitting rule to obtain a plurality of split requests.
In this embodiment, the preset request splitting rule may be a monthly splitting rule, by which the data retrieval request is split into a plurality of split requests.
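One way to realize a monthly splitting rule is to cut the requested time range at calendar-month boundaries, producing one sub-range per split request. The sketch below assumes the request carries a half-open date interval; `split_by_month` is a hypothetical helper name.

```python
from datetime import date
from typing import List, Tuple


def split_by_month(start: date, end: date) -> List[Tuple[date, date]]:
    """Split the half-open interval [start, end) at calendar-month boundaries."""
    ranges: List[Tuple[date, date]] = []
    cursor = start
    while cursor < end:
        # First day of the month following `cursor`.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        upper = min(next_month, end)
        ranges.append((cursor, upper))
        cursor = upper
    return ranges


parts = split_by_month(date(2019, 11, 15), date(2020, 2, 10))
```

Each `(lower, upper)` pair can then be placed in the range filter of one split request, so that no two split requests cover the same data.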
S103, data retrieval is carried out on the split requests respectively, and a plurality of retrieval results corresponding to the split requests one by one are obtained.
In this embodiment, the search results are in one-to-one correspondence with the split requests, that is, one split request corresponds to one search result.
S104, summarizing the plurality of retrieval results to obtain a data retrieval result corresponding to the data retrieval request.
In this embodiment, the plurality of retrieval results are collected together to obtain a comprehensive data retrieval result.
In this embodiment, the split requests are processed in parallel, so data retrieval is more efficient.
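The summarizing step can be sketched for the simple case where every split result carries a hit count and per-bucket counts. That result shape, and the names `summarize`, `total_hits`, and `buckets`, are assumptions made for this example; the text does not define the result format.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-split results by adding hit totals and bucket counts."""
    total_hits = 0
    buckets: Counter = Counter()
    for result in results:
        total_hits += result["total_hits"]
        buckets.update(result["buckets"])  # adds counts key by key
    return {"total_hits": total_hits, "buckets": dict(buckets)}


monthly_results = [
    {"total_hits": 12, "buckets": {"physics": 7, "biology": 5}},
    {"total_hits": 9, "buckets": {"physics": 4, "chemistry": 5}},
]
merged = summarize(monthly_results)
```

Merging by addition is only valid for additive statistics such as counts and sums, and only when the split requests cover disjoint data; statistics such as averages or distinct counts would need to be recomputed from suitable partial results.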
Although basic hardware overhead cannot be avoided in this embodiment, the key difference from the conventional scheme is that only data of non-Text datatypes is stored. Compared with other schemes, this greatly reduces hardware cost: the overall hardware consumption is only 5% of that of the original ESC-M.
In this embodiment, an ESC (Elasticsearch cluster) is a cluster service composed of different Elasticsearch nodes that supports full-text search and large-scale data warehousing. The full database is an ESC that stores the full raw data; compared with the full database, the surface database does not store any Field of the Text datatype. For example, if the data retrieval method of this embodiment is applied to paper retrieval, the full database stores all data of every paper, including the paper's surface data (paper title, paper author identifier, paper classification identifier, paper comment data, and so on) and its body text; the surface database stores only the surface data.
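The relationship between a full-database document and its surface-database counterpart can be illustrated as follows. The field names and their datatypes in `FIELD_TYPES` are hypothetical, chosen to mirror the paper example; the only rule taken from the text is that fields of the Text datatype are not stored in the surface database.

```python
from typing import Any, Dict

# Hypothetical field-to-datatype table for a paper index.
FIELD_TYPES: Dict[str, str] = {
    "title": "keyword",
    "author_id": "keyword",
    "category_id": "keyword",
    "comment_count": "integer",
    "body": "text",
}


def to_surface_document(doc: Dict[str, Any], field_types: Dict[str, str]) -> Dict[str, Any]:
    """Drop every field whose declared datatype is "text".

    Fields missing from `field_types` are kept, since they are not known
    to be of the Text datatype.
    """
    return {
        name: value
        for name, value in doc.items()
        if field_types.get(name) != "text"
    }


full_doc = {
    "title": "Query optimisation of parallel queries",
    "author_id": "a-17",
    "category_id": "c-3",
    "comment_count": 4,
    "body": "Long paper text that dominates storage ...",
}
surface_doc = to_surface_document(full_doc, FIELD_TYPES)
```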
In this embodiment, the data retrieval method is based on Elasticsearch, a full-text search engine. Elasticsearch is a distributed, highly scalable, near-real-time search and data analysis engine that makes it convenient to search, analyze, and explore large amounts of data.
In this embodiment, the preset request splitting rule includes splitting the data retrieval request according to a preset time threshold.
In this embodiment, the basic unit in which Elasticsearch stores data is the Document. Each Document is composed of different Fields, belongs to an Index, and has a unique identifier, that is, two Documents in the same Index do not have the same identifier. Once the target identifier matching the data retrieval request is obtained, the unique Document, Index, and Field can be determined from it.
In this embodiment, the execution subject may be a computer device with a computing and searching capability, such as a computer and a server, and is not limited in this embodiment.
In this embodiment, the execution subject may also be an intelligent electronic device such as a mobile phone and a tablet, which is not limited in this embodiment.
It can be seen that, by implementing the data retrieval method described in Fig. 1, the data retrieval request is first received and then split according to the preset request splitting rule to obtain a plurality of split requests. Data retrieval is then performed for each split request to obtain a plurality of retrieval results in one-to-one correspondence with the split requests. Finally, the retrieval results are summarized to obtain the final data retrieval result. By splitting one large retrieval into multiple sub-retrievals, this implementation avoids occupying a large amount of the retrieval server's memory; by processing the sub-retrievals with multiple threads, it greatly improves retrieval efficiency; and because the retrieval is simplified, with a corresponding retrieval triggered for each part of the request, retrieval performance is improved.
Example 2
Referring to Fig. 2, Fig. 2 is a schematic flowchart of another data retrieval method according to an embodiment of the present application. The flow of the data retrieval method depicted in Fig. 2 is an improvement on the flow depicted in Fig. 1. The data retrieval method comprises the following steps:
S201, receiving a data retrieval request.
In this embodiment, the data retrieval request may be a query aggregation request of a word segmentation search class.
In this embodiment, the data retrieval request may be expressed in DSL (Domain Specific Language), the JSON-formatted query language supported by Elasticsearch. A DSL request generally includes a query and filtering part, an aggregation statistics part, and auxiliary parts such as highlighting. In the basic query process, the query and filtering part finds the hit data, and the aggregation statistics part computes results over the hit data. Elasticsearch is a Lucene-based search server that provides a distributed, multi-user-capable full-text search engine.
As an optional implementation, after receiving the data retrieval request, the method may further include:
storing the data retrieval request;
acquiring the storage time at which the data retrieval request is stored;
performing hotspot query analysis processing on the data retrieval request to obtain an analysis result;
and updating the preset request mapping library according to the storage time and the analysis result.
By implementing the embodiment, the request mapping library can be prompted to be updated through the input of the data retrieval request, so that the real-time performance of the request mapping library is realized, and the overall data retrieval effect is improved.
In this embodiment, the data retrieval request may be a search query fusion request.
In this embodiment, the storage time of the data retrieval request is the storage time calculated when the data retrieval request is stored.
In this embodiment, performing hotspot query analysis on the data retrieval request to obtain an analysis result may be understood as periodically analyzing new hotspot queries or updating the mappings of existing hotspot queries, and splitting the request according to a preset rule.
As an optional implementation, the hotspot query analysis is executed asynchronously: the hotspot query statements in the data retrieval request are analyzed at a time that avoids the peak period of full retrieval, and a mapping from surface retrieval to full retrieval is established. When the same hotspot data retrieval request is received again, the mapping request corresponding to it can be obtained directly and surface retrieval performed according to that mapping request. This greatly reduces the cluster performance pressure caused by keyword retrieval during full retrieval. Moreover, because surface retrieval does not depend on word segmentation, it can respond faster and in a more timely manner than full retrieval.
As a further optional implementation, the step of updating the preset request mapping library according to the storage time and the analysis result may include:
judging whether a preset request mapping library stores a target mapping request matched with the data retrieval request or not according to the analysis result;
if so, acquiring a target mapping request and the target last updating time corresponding to the target mapping request;
acquiring a target identifier matched with the data retrieval request according to the storage time and the target last updating time;
and updating the request mapping library according to the target identification, the data retrieval request and the target last updating time.
By implementing the implementation mode, the data updating can be accurately finished in real time, so that the accuracy of data retrieval is ensured.
In this embodiment, judging, according to the analysis result, whether the preset request mapping library stores a target mapping request matching the data retrieval request may be understood as checking whether a previously established conversion mapping relationship exists for the current month.
In this embodiment, if a previously established conversion mapping relationship exists, data for it is already in the database, so the existing data must be deleted before the update is written; otherwise, the update is written directly, without deletion.
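The delete-then-update behaviour can be sketched with an in-memory dictionary standing in for the request mapping library. Much of this is assumed for illustration: the text does not say how the target identifier is derived, so a key made of a request fingerprint plus the storage month is used here, and `MappingEntry`, `fingerprint`, and `update_mapping_library` are hypothetical names.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Tuple


@dataclass
class MappingEntry:
    mapped_request: dict    # segmentation-free request used for surface retrieval
    last_update: datetime   # last update time of this mapping


def fingerprint(request: dict) -> str:
    """Stable short hash of a request (assumed way to match identical requests)."""
    canonical = json.dumps(request, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def update_mapping_library(
    library: Dict[Tuple[str, str], MappingEntry],
    request: dict,
    mapped_request: dict,
    storage_time: datetime,
) -> str:
    # Assumed target identifier: request fingerprint plus storage month.
    key = (fingerprint(request), storage_time.strftime("%Y-%m"))
    existed = key in library
    if existed:
        del library[key]  # a mapping for this month exists: delete it first
    library[key] = MappingEntry(mapped_request, storage_time)
    return "replaced" if existed else "created"


library: Dict[Tuple[str, str], MappingEntry] = {}
request = {"query": {"match": {"body": "data retrieval"}}}
mapped = {"query": {"terms": {"paper_id": ["p-1", "p-2"]}}}
first_status = update_mapping_library(library, request, mapped, datetime(2020, 5, 3))
second_status = update_mapping_library(library, request, mapped, datetime(2020, 5, 20))
```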
S202, splitting the data retrieval request according to a preset request splitting rule to obtain a plurality of split requests.
In this embodiment, the preset request splitting rule may be a monthly splitting rule, by which the data retrieval request is split into a plurality of split requests.
S203, classifying the plurality of split requests according to a preset request classification rule to obtain a full-scale retrieval request set and a surface layer retrieval request set.
In this embodiment, the full-volume search request set is a set of a plurality of full-volume search requests, where the full-volume search requests are used for full-text search (i.e., full-data search).
In this embodiment, the surface layer search request set is a set of a plurality of surface layer search requests, where the surface layer search requests are used for performing non-text search.
S204, performing full-scale retrieval processing in a preset full-scale database according to the full-scale retrieval request set to obtain a full-scale retrieval result.
In this embodiment, the full-scale database refers to a database that holds all of the data.
S205, determining a mapping retrieval request set corresponding to the surface layer retrieval request set in a preset request mapping library.
In this embodiment, the request mapping library refers to a retrieval request library that does not depend on word segmentation and that corresponds to a large number of surface layer retrieval requests.
In this embodiment, a mapping request in the request mapping library can directly return the historical query result corresponding to a surface layer retrieval request without retrieving again, which saves time and improves efficiency.
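One way to picture the request mapping library is as an exact-match lookup table from a request to a stored historical result. The sketch below is an assumption-laden illustration, not the application's data model: `MappingEntry`, `RequestMappingLibrary`, and their fields are hypothetical, and a real library would presumably live in a database rather than in memory.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class MappingEntry:
    identifier: str        # hypothetical identifier of the stored mapping
    last_update: datetime  # last update time of this mapping
    cached_result: Any     # historical query result for the request


class RequestMappingLibrary:
    """In-memory stand-in for the request mapping library."""

    def __init__(self) -> None:
        self._entries: Dict[str, MappingEntry] = {}

    def put(self, request_key: str, entry: MappingEntry) -> None:
        self._entries[request_key] = entry

    def lookup(self, request_key: str) -> Optional[MappingEntry]:
        # Exact-key match: no word segmentation is involved in the lookup.
        return self._entries.get(request_key)
```

A hit returns the stored entry, so the historical result can be reused; a miss returns `None`, and the caller would fall back to an actual retrieval.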
S206, determining a last update time set formed by the last update time corresponding to each split request in the surface layer retrieval request set according to the mapping retrieval request set.
In this embodiment, the last update time in the last update time set corresponds to each split request one to one.
S207, performing surface layer retrieval processing in a preset surface layer database according to the last update time set and the surface layer retrieval request set to obtain a surface layer retrieval result.
In this embodiment, the last update time set is used to make sure the latest data is retrieved, thereby improving the accuracy of retrieval.
As an optional implementation manner, the step of performing the surface layer search processing in the preset surface layer database according to the last update time set and the surface layer search request set to obtain the surface layer search result may include:
acquiring the last database updating time of a preset surface database;
splitting the surface layer retrieval request set according to the last update time set to obtain a first request subset and a second request subset; the last update time corresponding to each split request in the first request subset is less than the database update time, and the last update time corresponding to each split request in the second request subset is greater than or equal to the database update time;
determining a mapping retrieval request subset corresponding to the second request subset in the mapping retrieval request set;
performing surface retrieval processing in a surface database according to the mapping retrieval request subset to obtain a second retrieval result;
and aggregating the second search results to obtain a surface layer search result.
With this implementation, the stability of the data can be judged by comparing the update time of the database with the last update time of each retrieval request; the retrieval type is then determined according to that stability, and the corresponding retrieval is carried out efficiently and accurately, so that an accurate surface layer retrieval result is obtained.
As an optional implementation manner, after the step of splitting the surface layer retrieval request set according to the last update time set to obtain the first request subset and the second request subset, the method further includes:
according to the first request subset, carrying out full-scale retrieval processing in a full-scale database to obtain a first retrieval result;
the step of aggregating the second search results to obtain the surface layer search results comprises:
and aggregating the first search result and the second search result to obtain a surface layer search result.
With this embodiment, a secondary retrieval can be completed during the surface layer retrieval, which improves the precision of the retrieval.
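The split into a first and a second request subset, followed by the two retrievals and their aggregation, can be sketched as below. The comparison follows the text (strictly less than the database update time goes to the first subset; greater than or equal goes to the second). Everything else is an assumption for illustration: requests are represented by string keys, and `full_search` and `surface_search` are hypothetical callables standing in for retrieval against the full-scale and surface layer databases.

```python
from datetime import datetime
from typing import Any, Callable, Dict, List, Tuple


def partition_by_update_time(
    last_update: Dict[str, datetime],
    db_update_time: datetime,
) -> Tuple[List[str], List[str]]:
    """Return (first subset, second subset) of request keys."""
    # First subset: mapping last updated before the database's last update.
    first = [k for k, t in last_update.items() if t < db_update_time]
    # Second subset: mapping updated at or after the database's last update.
    second = [k for k, t in last_update.items() if t >= db_update_time]
    return first, second


def surface_retrieve(
    last_update: Dict[str, datetime],
    db_update_time: datetime,
    full_search: Callable[[str], Any],
    surface_search: Callable[[str], Any],
) -> Dict[str, Any]:
    """Aggregate the first and second retrieval results into one result map."""
    first, second = partition_by_update_time(last_update, db_update_time)
    results = {k: full_search(k) for k in first}             # first retrieval result
    results.update({k: surface_search(k) for k in second})   # second retrieval result
    return results
```

The design point the sketch tries to capture is that only requests whose mapping predates the latest database update pay the cost of a full-scale retrieval.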
Steps S205 to S207 may together be understood as performing surface layer retrieval processing in a preset surface layer database according to the surface layer retrieval request set to obtain a surface layer retrieval result. With this embodiment, mapping the retrieval request set addresses the load problem of full-text retrieval and so reduces retrieval pressure; at the same time, introducing the data update time effectively addresses errors caused by data changes, which improves retrieval accuracy.
S208, determining a plurality of retrieval results corresponding to the plurality of split requests one by one according to the full-scale retrieval result and the surface layer retrieval result.
In this embodiment, the plurality of search results are included in either the full search result or the surface search result, which is not described in detail in this embodiment.
S209, summarizing the plurality of retrieval results to obtain a data retrieval result corresponding to the data retrieval request.
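Steps S208 and S209 can be sketched as looking up each split request's result in whichever result set contains it, then concatenating in the original split order. The representation is assumed for illustration: results are keyed by a split-request key, each result is a list of rows, and `summarize` is a hypothetical name.

```python
from typing import Dict, Hashable, List, Sequence


def summarize(
    split_order: Sequence[Hashable],
    full_results: Dict[Hashable, List[dict]],
    surface_results: Dict[Hashable, List[dict]],
) -> List[dict]:
    """Merge per-request results into one data retrieval result."""
    merged: List[dict] = []
    for key in split_order:
        # Each split request's result is held in exactly one of the two result sets.
        if key in full_results:
            merged.extend(full_results[key])
        elif key in surface_results:
            merged.extend(surface_results[key])
        else:
            raise KeyError(f"no retrieval result for split request {key!r}")
    return merged
```

Raising on a missing key is a choice made for this sketch so that a lost sub-result is not silently dropped from the final result.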
As an optional implementation manner, the step of updating the request mapping library according to the target identifier, the data retrieval request, and the target last update time may include the following steps:
deleting mapping data corresponding to the target identifier in the request mapping library;
constructing a mapping relation between a target identification and the data retrieval request;
and adding the mapping relation into the request mapping library to update the request mapping library.
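The three update steps above (delete, construct, add) can be sketched as follows. The storage layout is an assumption: the library is modeled as a dictionary from identifier to a (request, time) pair, and `update_mapping` is a hypothetical name. Which timestamp is stored with the new mapping is also an assumption of this sketch.

```python
from datetime import datetime
from typing import Dict, Tuple

# identifier -> (request text, update time); a hypothetical in-memory layout
MappingLibrary = Dict[str, Tuple[str, datetime]]


def update_mapping(
    library: MappingLibrary,
    target_id: str,
    request: str,
    update_time: datetime,
) -> None:
    """Replace whatever is stored under target_id with a fresh mapping."""
    # Step 1: delete mapping data already stored under the target identifier, if any.
    library.pop(target_id, None)
    # Step 2: construct the mapping relation between the identifier and the request.
    relation = (request, update_time)
    # Step 3: add the mapping relation to the library.
    library[target_id] = relation
```

When no earlier mapping exists, the deletion step is a no-op and the new relation is simply added.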
It can be seen that, by implementing the data retrieval method described in fig. 2, a data retrieval request is first obtained and then split according to the preset request splitting rule to obtain a plurality of split requests; data retrieval is then performed for each split request to obtain a plurality of retrieval results in one-to-one correspondence with the split requests; finally, the plurality of retrieval results are summarized to obtain the final data retrieval result. With this implementation, one large retrieval is split into multiple sub-retrievals, which avoids occupying a large amount of memory on the retrieval server. Retrieval efficiency can also be greatly improved through multi-threaded processing. In addition, the retrieval is simplified into smaller parts, with the corresponding retrieval triggered for each part, which improves retrieval performance.
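The multi-threaded processing mentioned above can be sketched with Python's standard thread pool. This is an illustration only: the application does not say how threads are managed, and `retrieve_all`, `retrieve_one`, and `max_workers` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

Req = TypeVar("Req")
Res = TypeVar("Res")


def retrieve_all(
    split_requests: Sequence[Req],
    retrieve_one: Callable[[Req], List[Res]],
    max_workers: int = 4,
) -> List[Res]:
    """Run each split request on a worker thread and summarize the results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs,
        # so the results stay one-to-one with the split requests.
        per_request = list(pool.map(retrieve_one, split_requests))
    # Summarize: flatten the per-request result lists into one list.
    return [row for rows in per_request for row in rows]
```

Because each worker handles only one split request at a time, no single retrieval has to hold the result of the whole original request in memory.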
Example 3
Referring to fig. 3, fig. 3 is a schematic structural diagram of a data retrieval device according to an embodiment of the present application. The data retrieval device comprises:
A receiving unit 310, configured to receive a data retrieval request.
A splitting unit 320, configured to split the data retrieval request according to a preset request splitting rule to obtain a plurality of split requests.
A retrieving unit 330, configured to perform data retrieval on the plurality of split requests respectively to obtain a plurality of retrieval results corresponding to the plurality of split requests one to one.
A summarizing unit 340, configured to summarize the plurality of retrieval results to obtain a data retrieval result corresponding to the data retrieval request.
In this embodiment, for the explanation of the data retrieval device, reference may be made to the description in embodiment 1 or embodiment 2, and details are not repeated in this embodiment.
It can be seen that, with the data retrieval device described in fig. 3, operations such as acquiring the data retrieval request, splitting it, retrieving the split requests one by one, and summarizing the retrieval results in a unified way are completed by a plurality of units, so that a single device covers the whole process from acquisition to summarization. With this embodiment, the data retrieval device can perform retrieval automatically, thereby improving the efficiency of data retrieval and ensuring its precision.
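The way the four units cooperate can be sketched as a small composition of callables. This is purely illustrative: the application describes functional units, not classes, and `DataRetrievalDevice` and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class DataRetrievalDevice:
    split: Callable[[Any], List[Any]]      # role of splitting unit 320
    retrieve: Callable[[Any], Any]         # role of retrieving unit 330
    summarize: Callable[[List[Any]], Any]  # role of summarizing unit 340

    def receive(self, request: Any) -> Any:
        """Role of receiving unit 310: accept a request and drive the other units."""
        parts = self.split(request)
        # One retrieval result per split request, in the same order.
        results = [self.retrieve(part) for part in parts]
        return self.summarize(results)
```

For example, with `split=lambda r: r.split(",")`, `retrieve=len`, and `summarize=sum`, calling `receive("ab,cde,f")` splits the string into three parts, measures each, and adds the lengths together.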
Example 4
Referring to fig. 4, fig. 4 is a schematic structural diagram of another data retrieval device according to an embodiment of the present application. The structure of the data retrieval device depicted in fig. 4 is an improvement on that of the data retrieval device depicted in fig. 3. The retrieving unit 330 includes:
A classifying subunit 331, configured to classify the plurality of split requests according to a preset request classification rule to obtain a full-scale retrieval request set and a surface layer retrieval request set.
A retrieval subunit 332, configured to perform full-scale retrieval processing in a preset full-scale database according to the full-scale retrieval request set to obtain a full-scale retrieval result.
The retrieving subunit 332 is further configured to perform a surface layer retrieval process in a preset surface layer database according to the surface layer retrieval request set, so as to obtain a surface layer retrieval result.
The determining subunit 333 is configured to determine, according to the full-size search result and the surface layer search result, a plurality of search results that correspond to the plurality of split requests one to one.
As an optional implementation manner, when performing surface layer retrieval processing in the preset surface layer database according to the surface layer retrieval request set to obtain the surface layer retrieval result, the retrieval subunit 332 specifically performs the following: determining, in the preset request mapping library, a mapping retrieval request set corresponding to the surface layer retrieval request set;
determining a last update time set formed by the last update time corresponding to each split request in the surface layer retrieval request set according to the mapping retrieval request set;
and performing surface retrieval processing in a preset surface database according to the last update time set and the surface retrieval request set to obtain a surface retrieval result.
As an optional implementation manner, when performing surface layer retrieval processing in the preset surface layer database according to the last update time set and the surface layer retrieval request set to obtain the surface layer retrieval result, the retrieval subunit 332 specifically performs the following: acquiring the last database update time of the preset surface layer database;
splitting the surface layer retrieval request set according to the last update time set to obtain a first request subset and a second request subset; the last update time corresponding to each split request in the first request subset is less than the database update time, and the last update time corresponding to each split request in the second request subset is greater than or equal to the database update time;
according to the first request subset, carrying out full-scale retrieval processing in a full-scale database to obtain a first retrieval result;
determining a mapping retrieval request subset corresponding to the second request subset in the mapping retrieval request set;
performing surface retrieval processing in a surface database according to the mapping retrieval request subset to obtain a second retrieval result;
and aggregating the first search result and the second search result to obtain a surface layer search result.
As an optional implementation, the data retrieval apparatus may further include:
A storage unit 350, configured to store the data retrieval request.
An obtaining unit 360, configured to obtain the storage time at which the data retrieval request is stored.
An analysis unit 370, configured to perform hotspot query analysis processing on the data retrieval request to obtain an analysis result.
An updating unit 380, configured to update the preset request mapping library according to the storage time and the analysis result.
As an optional implementation, the updating unit 380 includes:
A determining subunit 381, configured to determine, according to the analysis result, whether a target mapping request matching the data retrieval request is stored in the preset request mapping library.
An obtaining subunit 382, configured to, if the determination result of the determining subunit 381 is yes, obtain the target mapping request and the target last update time corresponding to the target mapping request.
The obtaining subunit 382 is further configured to obtain the target identifier matching the data retrieval request according to the storage time and the target last update time.
An updating subunit 383, configured to update the request mapping library according to the target identifier, the data retrieval request, and the target last update time.
In this embodiment, for the explanation of the data retrieval device, reference may be made to the description in embodiment 1 or embodiment 2, and details are not repeated in this embodiment.
It can be seen that, with the data retrieval device described in fig. 4, operations such as acquiring the data retrieval request, splitting it, retrieving the split requests one by one, and summarizing the retrieval results in a unified way are completed by a plurality of units, so that a single device covers the whole process from acquisition to summarization. With this embodiment, the data retrieval device can perform retrieval automatically, thereby improving the efficiency of data retrieval and ensuring its precision.
An embodiment of the present application provides an electronic device, which includes a memory and a processor, where the memory is used to store a computer program, and the processor runs the computer program to enable the electronic device to execute the data retrieval method in embodiment 1 or embodiment 2 of the present application.
An embodiment of the present application provides a computer-readable storage medium, which stores computer program instructions, and when the computer program instructions are read and executed by a processor, the computer program instructions execute the data retrieval method according to any one of embodiment 1 or embodiment 2 of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that substantially contributes to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A data retrieval method, characterized in that the data retrieval method comprises:
receiving a data retrieval request;
splitting the data retrieval request according to a preset request splitting rule to obtain a plurality of splitting requests;
respectively carrying out data retrieval on the plurality of split requests to obtain a plurality of retrieval results corresponding to the plurality of split requests one by one;
and summarizing the plurality of retrieval results to obtain a data retrieval result corresponding to the data retrieval request.
2. The data retrieval method of claim 1, wherein the step of performing data retrieval on the plurality of split requests respectively to obtain a plurality of retrieval results corresponding to the plurality of split requests one to one comprises:
classifying the plurality of split requests according to a preset request classification rule to obtain a full retrieval request set and a surface retrieval request set;
performing full-scale retrieval processing in a preset full-scale database according to the full-scale retrieval request set to obtain a full-scale retrieval result;
performing surface retrieval processing in a preset surface database according to the surface retrieval request set to obtain a surface retrieval result;
and determining a plurality of retrieval results which correspond to the plurality of split requests one by one according to the full retrieval result and the surface retrieval result.
3. The data retrieval method of claim 2, wherein the step of performing a surface retrieval process in a preset surface database according to the surface retrieval request set to obtain a surface retrieval result comprises:
determining a mapping retrieval request set corresponding to the surface layer retrieval request set in a preset request mapping library;
determining a last update time set formed by the last update time corresponding to each split request in the surface layer retrieval request set according to the mapping retrieval request set;
and performing surface retrieval processing in a preset surface database according to the last updating time set and the surface retrieval request set to obtain a surface retrieval result.
4. The data retrieval method of claim 3, wherein the step of performing a surface retrieval process in a preset surface database according to the last update time set and the surface retrieval request set to obtain a surface retrieval result comprises:
acquiring the last database updating time of a preset surface database;
splitting the surface layer retrieval request set according to the last updating time set to obtain a first request subset and a second request subset; the last update time corresponding to each split request in the first request subset is less than the database update time, and the last update time corresponding to each split request in the second request subset is greater than or equal to the database update time;
determining a mapping retrieval request subset corresponding to the second request subset in the mapping retrieval request set;
performing surface retrieval processing in the surface database according to the mapping retrieval request subset to obtain a second retrieval result;
and aggregating the second retrieval result to obtain a surface retrieval result.
5. The data retrieval method of claim 4, wherein after the step of splitting the surface layer retrieval request set according to the last update time set to obtain a first request subset and a second request subset, the method further comprises:
according to the first request subset, carrying out full-scale retrieval processing in the full-scale database to obtain a first retrieval result;
the step of aggregating the second retrieval result to obtain a surface retrieval result comprises:
and aggregating the first retrieval result and the second retrieval result to obtain a surface retrieval result.
6. The data retrieval method of claim 1, wherein after the receiving a data retrieval request, the method further comprises:
storing the data retrieval request;
acquiring storage time for storing the data retrieval request;
performing hotspot query analysis processing on the data retrieval request to obtain an analysis result;
and updating a preset request mapping library according to the storage time and the analysis result.
7. The data retrieval method of claim 6, wherein the updating the preset request mapping library according to the storage time and the analysis result includes:
judging whether a preset request mapping library stores a target mapping request matched with the data retrieval request or not according to the analysis result;
if so, acquiring the target mapping request and the target last updating time corresponding to the target mapping request;
acquiring a target identifier matched with the data retrieval request according to the storage time and the target last updating time;
and updating the request mapping library according to the target identification, the data retrieval request and the target last updating time.
8. A data retrieval device, characterized in that the data retrieval device comprises:
a receiving unit, configured to receive a data retrieval request;
a splitting unit, configured to split the data retrieval request according to a preset request splitting rule to obtain a plurality of splitting requests;
a retrieval unit, configured to respectively carry out data retrieval on the plurality of split requests to obtain a plurality of retrieval results which are in one-to-one correspondence with the plurality of split requests;
and a summarizing unit, configured to summarize the plurality of retrieval results to obtain the data retrieval result corresponding to the data retrieval request.
9. An electronic device, comprising a memory for storing a computer program and a processor for executing the computer program to cause the electronic device to perform the data retrieval method of any one of claims 1 to 7.
10. A readable storage medium having stored thereon computer program instructions which, when read and executed by a processor, perform the data retrieval method of any one of claims 1 to 7.
CN202010445888.5A 2020-05-22 2020-05-22 Data retrieval method and device Active CN111597212B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010445888.5A CN111597212B (en) 2020-05-22 2020-05-22 Data retrieval method and device

Publications (2)

Publication Number Publication Date
CN111597212A true CN111597212A (en) 2020-08-28
CN111597212B CN111597212B (en) 2024-03-08

Family

ID=72186400

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010445888.5A Active CN111597212B (en) 2020-05-22 2020-05-22 Data retrieval method and device

Country Status (1)

Country Link
CN (1) CN111597212B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6678674B1 (en) * 1998-07-09 2004-01-13 Informex, Inc. Data retrieving method and apparatus data retrieving system and storage medium
CN1987853A (en) * 2005-12-23 2007-06-27 北大方正集团有限公司 Searching method for relational data base and full text searching combination
CN107870985A (en) * 2017-10-12 2018-04-03 深圳市金立通信设备有限公司 A kind of method, server and computer-readable recording medium for retrieving information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Lu Qun: "The battle for China's Internet gateway is quietly under way" *

Also Published As

Publication number Publication date
CN111597212B (en) 2024-03-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 15-5, 1st Floor, Building 4, No. 15 Haidian Middle Street, Haidian District, Beijing, 100082

Applicant after: Beijing minglue Zhaohui Technology Co.,Ltd.

Address before: Room 2020, 2nd floor, building 27, 25 North Third Ring Road West, Haidian District, Beijing

Applicant before: BEIJING SUPERTOOL INTERNET TECHNOLOGY LTD.

Country or region before: China

GR01 Patent grant