CN111767309A - Method for optimizing retrieval based on switch design mode - Google Patents

Method for optimizing retrieval based on switch design mode

Publication number
CN111767309A
Authority
CN
China
Prior art keywords
solr
data
interface
search
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010644255.7A
Other languages
Chinese (zh)
Other versions
CN111767309B (en)
Inventor
王旭锋
刘涛
刘磊
魏帮财
蒋永录
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Wanwei Information Technology Co Ltd
Original Assignee
China Telecom Wanwei Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Wanwei Information Technology Co Ltd
Priority to CN202010644255.7A
Publication of CN111767309A
Application granted
Publication of CN111767309B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The method relates to the technical field of big data search, and in particular to a method for optimizing retrieval based on a switch design mode, suitable for automatically selecting a big data search entry so as to optimize retrieval. When a user searches, the method automatically selects the search entry, combining the respective search strengths of Elasticsearch and Solr and thereby improving search efficiency. In practical application, the method supports cluster expansion: as the number of search engine cluster nodes grows, the strategy for selecting the search entry can be reconfigured, so the method is highly adaptable.

Description

Method for optimizing retrieval based on switch design mode
Technical Field
The method relates to the technical field of big data search, and in particular to a method for optimizing retrieval based on a switch design mode (that is, a switch-style design pattern), suitable for automatically selecting a big data search entry so as to optimize retrieval.
Background Art
Big data search is a search mode in which an index program uses a big data search engine to search the engine's database according to the input query conditions and returns the results that satisfy those conditions to the user. Two search engines are used herein: Elasticsearch and Solr.
Elasticsearch is a distributed, highly scalable, real-time search and data analysis engine. It makes large volumes of data easy to search, analyze, and explore, and its horizontal scalability makes data more valuable in a production environment. Its working principle is broadly as follows: the user submits data to Elasticsearch; a tokenizer segments the submitted text into terms; the terms and their weights are stored in the index; when the user searches, the results are scored and ranked according to those weights; and the ranked results are returned to the user.
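The index-then-score flow described above can be illustrated with a toy inverted index. This is only a sketch under stated assumptions: the `ToyIndex` class, the regular-expression tokenizer, and the use of raw term counts as weights are all hypothetical simplifications chosen for illustration, and they do not reproduce how Elasticsearch actually analyzes text or computes relevance.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase word tokens (a stand-in for a real tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndex:
    """Toy inverted index mapping term -> {doc_id: weight}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        # Index time: segment the text and store one weight per (term, document).
        # The weight is the raw term count, an assumption made for illustration.
        for term, count in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = count

    def search(self, query):
        # Query time: sum the stored weights of the matching terms, then rank
        # by descending score (ties broken by document id).
        scores = Counter()
        for term in tokenize(query):
            for doc_id, weight in self.postings.get(term, {}).items():
                scores[doc_id] += weight
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


index = ToyIndex()
index.add("d1", "Solr search engine")
index.add("d2", "search search search analytics")
print(index.search("search engine"))  # [('d2', 3), ('d1', 2)]
```

Document d2 ranks first because the term "search" occurs three times in it, while d1 matches both query terms once each.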
Solr is an open-source enterprise search engine whose main functions include full-text retrieval, hit highlighting, faceted search, dynamic clustering, and database integration. Solr is highly scalable and provides distributed search and index replication.
With the server configuration, the number of cluster nodes, the data content (data volume), and the query conditions all held the same, the performance of Elasticsearch and Solr compares as follows:
[Figure: performance comparison table of Elasticsearch and Solr; image not reproduced]
As the test results show, when the data volume is below 20 million and no data is being inserted into the search engine database, Solr searches noticeably faster than Elasticsearch; when the data volume exceeds 20 million or data is being inserted into the search engine database, Elasticsearch searches noticeably more efficiently than Solr. The strengths of the two engines can therefore be combined in a method that intelligently selects the search engine entry so as to improve search performance.
Disclosure of Invention
To this end, the invention provides a method for automatically selecting a big data search entry to optimize retrieval based on a switch design mode.
The technical solution adopted by the invention to solve the technical problem is as follows:
A method for optimizing retrieval based on a switch design mode comprises the following steps:
A. monitoring the Solr data insertion interface by means of a data bus architecture, and acquiring the real-time data volume in Solr as soon as data flowing into Solr is detected;
B. creating two interfaces that query Elasticsearch and Solr respectively, and keeping the formats of the input parameters and return values of the two interfaces consistent;
C. using a switch design mode, creating a switch interface that manages both interfaces created in step B and switches to the Elasticsearch query interface or the Solr query interface according to the conditions passed in;
D. creating a user search interface that serves as the entry point for user queries, which externally receives the user's query conditions, obtains the real-time data volume from step A, passes the obtained result to the switch interface of step C, and returns the retrieved data to the user;
in step C, the real-time data volume acquired is evaluated:
the condition passed in is solrInsert = true or num > n;
solrInsert = true indicates that data is being inserted into Solr;
num denotes the current data volume in Solr and n denotes the threshold beyond which Solr performance degrades; when the condition holds, the switch interface immediately switches the query to the Elasticsearch search entry; otherwise it switches to the Solr search entry, and the interface returns the data queried through the Elasticsearch or Solr interface.
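The switching condition stated above reduces to a single predicate. The following is a minimal sketch, assuming the function name `use_elasticsearch` (hypothetical) and a threshold of 20,000,000 taken from the test results reported in the background section; the method itself leaves n as a configurable value.

```python
# Threshold n: an assumption based on the 20 million figure in the reported
# test results; in practice it would be tuned for the actual cluster.
N = 20_000_000


def use_elasticsearch(solr_insert: bool, num: int, n: int = N) -> bool:
    """Return True when the query should go to Elasticsearch.

    solr_insert: whether data is currently being inserted into Solr
    num:         current data volume in Solr
    n:           threshold beyond which Solr performance degrades
    """
    return solr_insert or num > n


# The four combinations of the two inputs.
for solr_insert, num in [(False, 1_000), (True, 1_000),
                         (False, 30_000_000), (True, 30_000_000)]:
    engine = "elasticsearch" if use_elasticsearch(solr_insert, num) else "solr"
    print(solr_insert, num, engine)
```

Only the first combination (no insertion in progress and a data volume at or below the threshold) routes the query to Solr.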
The beneficial effects of the invention are as follows:
With this method, the search entry is selected automatically when a user searches, combining the respective search strengths of Elasticsearch and Solr and thereby improving search efficiency. In practical application, the method supports cluster expansion: as the number of search engine cluster nodes grows, the strategy for selecting the search entry can be reconfigured, so the method is highly adaptable.
Drawings
FIG. 1 is a logic diagram for implementing the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
A method for optimizing retrieval based on a switch design mode comprises the following steps:
A. monitoring the Solr data insertion interface by means of a data bus architecture, and acquiring the real-time data volume in Solr as soon as data flowing into Solr is detected;
B. creating two interfaces that query Elasticsearch and Solr respectively, and keeping the formats of the input parameters and return values of the two interfaces consistent;
C. using a switch design mode, creating a switch interface that manages both interfaces created in step B and switches to the Elasticsearch query interface or the Solr query interface according to the conditions passed in;
D. creating a user search interface that serves as the entry point for user queries, which externally receives the user's query conditions, obtains the real-time data volume from step A, passes the obtained result to the switch interface of step C, and returns the retrieved data to the user;
in step C, the real-time data volume acquired is evaluated:
the condition passed in is solrInsert = true or num > n;
solrInsert = true indicates that data is being inserted into Solr;
num denotes the current data volume in Solr and n denotes the threshold beyond which Solr performance degrades; when the condition holds, the switch interface immediately switches the query to the Elasticsearch search entry; otherwise it switches to the Solr search entry, and the interface returns the data queried through the Elasticsearch or Solr interface.
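Steps B and C above can be sketched as two query interfaces sharing one signature and one return format, wrapped by a switch. Everything named here is an assumption made for illustration: `SearchResult`, `QueryInterface`, `InMemoryQuery`, and `SearchSwitch` are hypothetical, and the in-memory backend merely stands in for real Elasticsearch and Solr clients, which are not called.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass
class SearchResult:
    """Shared return format for both engines (field names are assumptions)."""
    engine: str
    total: int
    hits: List[str] = field(default_factory=list)


class QueryInterface(ABC):
    """Step B: one input signature and one return type for both engines."""

    @abstractmethod
    def query(self, keyword: str, size: int = 10) -> SearchResult:
        ...


class InMemoryQuery(QueryInterface):
    """Stand-in backend; a real one would call the engine's client library."""

    def __init__(self, engine: str, docs: List[str]):
        self.engine = engine
        self.docs = docs

    def query(self, keyword: str, size: int = 10) -> SearchResult:
        matches = [d for d in self.docs if keyword.lower() in d.lower()]
        return SearchResult(self.engine, len(matches), matches[:size])


class SearchSwitch:
    """Step C: holds both query interfaces and picks one per request."""

    def __init__(self, es: QueryInterface, solr: QueryInterface, n: int):
        self.es, self.solr, self.n = es, solr, n

    def query(self, keyword: str, solr_insert: bool, num: int) -> SearchResult:
        # solrInsert = true or num > n selects Elasticsearch; otherwise Solr.
        backend = self.es if (solr_insert or num > self.n) else self.solr
        return backend.query(keyword)


docs = ["policy document 2020", "office memo", "policy notice"]
switch = SearchSwitch(InMemoryQuery("elasticsearch", docs),
                      InMemoryQuery("solr", docs), n=20_000_000)
print(switch.query("policy", solr_insert=False, num=1_000).engine)  # solr
print(switch.query("policy", solr_insert=True, num=1_000).engine)   # elasticsearch
```

Because both backends return the same `SearchResult` type, the caller does not need to know which engine served the request.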
The present invention is specifically described as follows:
A. The Solr data insertion interface is managed and monitored by means of a data bus architecture. Whenever data flows into Solr, the monitoring program detects it immediately, and it can also obtain the real-time data volume in Solr.
B. Two interfaces are created to query Elasticsearch and Solr respectively; the formats of their input parameters and return values must be kept consistent.
C. Using a switch design mode, a switch interface is created that manages both interfaces created in step B and decides, according to the conditions passed in, whether to use the Elasticsearch query interface or the Solr query interface. For example, when solrInsert = true or num > n is passed in (solrInsert = true indicates that data is being inserted into Solr, num denotes the current data volume in Solr, and n denotes the threshold beyond which Solr performance degrades), the switch interface immediately switches the query to the Elasticsearch search entry; otherwise it switches to the Solr search entry. The interface returns the data queried through the Elasticsearch or Solr interface.
D. A user search interface is created as the entry point for user queries. Externally it receives the user's query conditions; internally it calls the Solr monitoring interface to obtain the Solr data volume and the data insertion status (that is, whether solrInsert is true), passes the result to the switch interface of step C, and returns the retrieved data to the user.
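The monitoring in step A can be sketched with a small in-process publish/subscribe bus. This is an illustrative assumption, not the architecture actually used: the `DataBus` and `SolrMonitor` classes and the topic names `solr.insert.started` and `solr.insert.finished` are hypothetical, and the sketch tracks only one insertion at a time.

```python
from collections import defaultdict


class DataBus:
    """Minimal in-process publish/subscribe bus (hypothetical)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in self._subscribers[topic]:
            handler(payload)


class SolrMonitor:
    """Tracks whether an insert is in progress and the current data volume."""

    def __init__(self, bus, initial_count=0):
        self.num = initial_count
        self.solr_insert = False
        bus.subscribe("solr.insert.started", self._on_started)
        bus.subscribe("solr.insert.finished", self._on_finished)

    def _on_started(self, batch_size):
        # Data is flowing into Solr: raise the flag immediately.
        self.solr_insert = True

    def _on_finished(self, batch_size):
        # The insert has completed: update the volume and clear the flag.
        self.num += batch_size
        self.solr_insert = False

    def snapshot(self):
        """Return the state that the user search interface would read."""
        return {"solrInsert": self.solr_insert, "num": self.num}


bus = DataBus()
monitor = SolrMonitor(bus, initial_count=100)
bus.publish("solr.insert.started", 50)
print(monitor.snapshot())   # {'solrInsert': True, 'num': 100}
bus.publish("solr.insert.finished", 50)
print(monitor.snapshot())   # {'solrInsert': False, 'num': 150}
```

A production monitor would need to handle overlapping inserts and read the true document count from Solr rather than accumulating batch sizes.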
Through the above steps, Solr is monitored by means of a data bus architecture and the two search entries are encapsulated behind a switch, so that the big data search entry is switched automatically and search efficiency is improved by combining the respective strengths of Elasticsearch and Solr.
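How the user search interface of step D ties the monitor and the switch together can be sketched as a thin wrapper. The names `make_user_search`, `stub_switch`, and the dictionary-based state are assumptions introduced for illustration; the stubs replace the real monitoring interface and switch interface so that the example stays self-contained.

```python
from typing import Any, Callable, Dict

N = 20_000_000  # assumed threshold n, taken from the reported test results


def make_user_search(
    get_solr_state: Callable[[], Dict[str, Any]],
    switch_query: Callable[[str, bool, int], Dict[str, Any]],
) -> Callable[[str], Dict[str, Any]]:
    """Build the user-facing search entry (step D)."""

    def user_search(keyword: str) -> Dict[str, Any]:
        # Internally: read the insertion flag and data volume from the monitor.
        state = get_solr_state()
        # Pass the user's query and the monitored state to the switch.
        return switch_query(keyword, bool(state["solrInsert"]), int(state["num"]))

    return user_search


# Stubs standing in for the monitor (step A) and the switch (step C).
state = {"solrInsert": False, "num": 5_000_000}


def stub_switch(keyword: str, solr_insert: bool, num: int) -> Dict[str, Any]:
    engine = "elasticsearch" if solr_insert or num > N else "solr"
    return {"engine": engine, "keyword": keyword}


search = make_user_search(lambda: state, stub_switch)
print(search("policy")["engine"])   # solr
state["solrInsert"] = True          # an insert begins
print(search("policy")["engine"])   # elasticsearch
```

The caller only ever invokes `search(keyword)`; the choice of engine changes as soon as the monitored state changes, with no change on the caller's side.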
Application cases
Case one: A government website's document library holds a large number of policy documents and related office documents; to query them quickly, a search engine can be used as the document retrieval tool. Because the files in the document library are updated periodically rather than in real time, the search entry provided by the method switches quickly to the Solr search engine when no data is being inserted and the data volume is below the set threshold; when data is being inserted or the data volume exceeds the set threshold, the method switches to Elasticsearch, thereby improving search efficiency.
Case two: A system's file management function stores a large amount of user file information and must serve frequent data queries, while data writes are relatively infrequent. To improve query performance, the method provided herein can be adopted: the search engine entry is switched automatically and the search strategy is optimized, so that the strengths of both the Solr and Elasticsearch search engines are fully exploited and data query efficiency is improved.

Claims (2)

1. A method for optimizing retrieval based on a switch design mode, characterized by comprising the following steps:
A. monitoring the Solr data insertion interface by means of a data bus architecture, and acquiring the real-time data volume in Solr as soon as data flowing into Solr is detected;
B. creating two interfaces that query Elasticsearch and Solr respectively, and keeping the formats of the input parameters and return values of the two interfaces consistent;
C. using a switch design mode, creating a switch interface that manages both interfaces created in step B and switches to the Elasticsearch query interface or the Solr query interface according to the conditions passed in;
D. creating a user search interface that serves as the entry point for user queries, which externally receives the user's query conditions, obtains the real-time data volume from step A, passes the obtained result to the switch interface of step C, and returns the retrieved data to the user.
2. The method of claim 1, wherein in step C the real-time data volume acquired is evaluated:
the condition passed in is solrInsert = true or num > n;
solrInsert = true indicates that data is being inserted into Solr;
num denotes the current data volume in Solr and n denotes the threshold beyond which Solr performance degrades; when the condition holds, the switch interface immediately switches the query to the Elasticsearch search entry; otherwise it switches to the Solr search entry, and the interface returns the data queried through the Elasticsearch or Solr interface.
CN202010644255.7A 2020-07-07 2020-07-07 Method for optimizing retrieval based on switch design mode Active CN111767309B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010644255.7A CN111767309B (en) 2020-07-07 2020-07-07 Method for optimizing retrieval based on switch design mode

Publications (2)

Publication Number Publication Date
CN111767309A true CN111767309A (en) 2020-10-13
CN111767309B CN111767309B (en) 2022-06-24

Family

ID=72723927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010644255.7A Active CN111767309B (en) 2020-07-07 2020-07-07 Method for optimizing retrieval based on switch design mode

Country Status (1)

Country Link
CN (1) CN111767309B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018036235A1 (en) * 2016-08-22 2018-03-01 中兴通讯股份有限公司 Solr data migration method and apparatus
US20180089316A1 (en) * 2016-09-26 2018-03-29 Twiggle Ltd. Seamless integration of modules for search enhancement
US20180150487A1 (en) * 2016-11-28 2018-05-31 Atlassian Pty Ltd Systems and methods for indexing source code in a search engine
CN108121709A (en) * 2016-11-28 2018-06-05 中兴通讯股份有限公司 A kind of search processing method and device
CN109189800A (en) * 2018-08-16 2019-01-11 北京中科梧桐网络科技有限公司 A kind of data Layer paging query model and querying method
CN109241080A (en) * 2018-09-29 2019-01-18 焦点科技股份有限公司 A kind of the building application method and its system of FQL query language
CN109299102A (en) * 2018-10-23 2019-02-01 中国电子科技集团公司第二十八研究所 A kind of HBase secondary index system and method based on Elasticsearch

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANTON FIRSOV: "Traditional IR Meets Ontology Engineering in Search for Data", 《SIGIR"17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL》 *
吴雨晨等: "改进的大数据检索自适应性切换搜索算法", 《西安工业大学学报》 *
李传根: "Elasticsearch数据存储策略研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Also Published As

Publication number Publication date
CN111767309B (en) 2022-06-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant