CN111767309A - Method for optimizing retrieval based on switch design mode - Google Patents
Method for optimizing retrieval based on switch design mode
- Publication number
- CN111767309A (application CN202010644255.7A)
- Authority
- CN
- China
- Prior art keywords
- solr
- data
- interface
- search
- query
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of big data search, and in particular to a method for optimizing retrieval based on a switch design mode, in which the big data search entry is selected automatically. When a user searches, the method automatically selects the search entry and combines the respective search strengths of Elasticsearch and Solr, thereby improving search efficiency. In practical use the method supports cluster expansion: as search engine cluster nodes are added, the strategy for selecting the search entry can be reconfigured, so the method is highly configurable.
Description
Technical Field
The invention relates to the technical field of big data search, and in particular to a method for optimizing retrieval based on a switch design mode, in which the big data search entry is selected automatically.
Background Art
Big data search is a search mode in which an indexing program uses a big data search engine to look up the input query conditions in the search engine database and returns the results that satisfy those conditions to the user. Both the Elasticsearch and Solr search engines are used herein.
Elasticsearch is a distributed, highly scalable, highly real-time search and data analysis engine. It makes large volumes of data easy to search, analyze, and explore, and its horizontal scalability makes data more valuable in a production environment. Its implementation principle is as follows: the user first submits data to Elasticsearch; a tokenizer then segments the submitted text into terms, and the term weights and segmentation results are stored with the data; when the user searches, the results are scored and ranked according to those weights and then returned to the user.
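The tokenize, weight, and rank flow described above can be illustrated with a toy inverted index. This is only a sketch of the general idea, not Elasticsearch's implementation: the names `tokenize` and `ToyIndex` are hypothetical, the tokenizer simply splits on whitespace, and the weight is a raw term count, whereas real engines use language-aware analyzers and more elaborate relevance scoring.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    # Whitespace tokenizer (assumption); real engines use configurable analyzers.
    return text.lower().split()


class ToyIndex:
    def __init__(self) -> None:
        # term -> {doc_id: weight}; the weight here is the raw term count.
        self.postings: Dict[str, Dict[int, int]] = defaultdict(dict)

    def add(self, doc_id: int, text: str) -> None:
        # Segment the text and store a per-document weight for each term.
        for term in tokenize(text):
            self.postings[term][doc_id] = self.postings[term].get(doc_id, 0) + 1

    def search(self, query: str) -> List[Tuple[int, int]]:
        # Sum the stored weights of every query term for each matching document.
        scores: Dict[int, int] = defaultdict(int)
        for term in tokenize(query):
            for doc_id, weight in self.postings.get(term, {}).items():
                scores[doc_id] += weight
        # Highest score first; ties are broken by document id for a stable order.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = ToyIndex()
index.add(1, "policy document on data search")
index.add(2, "search engine search tips")
print(index.search("search"))  # [(2, 2), (1, 1)]
```

Document 2 ranks first because the query term occurs in it twice.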
Solr is an open-source enterprise search engine whose main features include full-text retrieval, hit highlighting, faceted search, dynamic clustering, and database integration. Solr is highly scalable and provides distributed search and index replication.
When the server configuration, the number of cluster nodes, the data content (data volume), and the query conditions are all the same, the performance of Elasticsearch and Solr compares as follows (the comparison table is not reproduced in this text):
As the test results show, when the data volume is below 20 million and no data is being inserted into the search engine database, Solr searches noticeably faster than Elasticsearch; when the data volume exceeds 20 million or data is being inserted into the search engine database, Elasticsearch searches noticeably more efficiently than Solr. The strengths of the two engines can therefore be combined in a method that automatically selects the search engine entry, so as to improve search performance.
Disclosure of Invention
To this end, the invention provides a method that optimizes retrieval by automatically selecting the big data search entry based on a switch design mode.
The technical solution adopted by the invention to solve the technical problem is as follows:
A method for optimizing retrieval based on a switch design mode comprises the following steps:
A. monitoring the Solr data insertion interface using a data bus architecture, and obtaining the real-time data volume in Solr as soon as data flowing into Solr is detected;
B. creating two interfaces that query Elasticsearch and Solr respectively, and keeping the formats of the input parameters and return values of the two interfaces consistent;
C. using the switch design mode, creating a switch interface that manages both interfaces created in step B and switches between the Elasticsearch query interface and the Solr query interface according to the conditions passed in;
D. creating a user search interface that serves as the entry for user queries, receives the user's query conditions externally, obtains the real-time data volume from step A, passes the result to the switch interface of step C, and returns the retrieved data to the user;
in step C, the data volume obtained in real time is evaluated:
the condition passed in is solrInsert = true or num > n;
solrInsert = true indicates that data is being inserted into Solr;
num is the current data volume in Solr and n is the threshold above which Solr performance degrades; when the condition holds, the switch interface immediately switches the query to the Elasticsearch search entry; otherwise it switches to the Solr search entry, and the switch interface returns the data queried through the Elasticsearch or Solr interface as its return value.
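The switching condition in step C can be written as a single predicate. The sketch below is illustrative only: the function name `choose_engine` and the returned strings are hypothetical, and the threshold value simply mirrors the 20 million figure from the background section rather than a value prescribed by the method.

```python
def choose_engine(solr_insert: bool, num: int, n: int) -> str:
    """Return the name of the search entry to use for the next query."""
    # Elasticsearch is chosen while data is being inserted into Solr, or once
    # the Solr data volume exceeds the threshold n; otherwise Solr is chosen.
    if solr_insert or num > n:
        return "elasticsearch"
    return "solr"


N = 20_000_000  # assumed threshold, taken from the background comparison

print(choose_engine(False, 5_000_000, N))   # solr
print(choose_engine(True, 5_000_000, N))    # elasticsearch
print(choose_engine(False, 25_000_000, N))  # elasticsearch
print(choose_engine(False, N, N))           # solr (num > n is a strict comparison)
```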
The beneficial effects of the invention are as follows:
With this method, the search entry is selected automatically when a user searches, and the respective search strengths of Elasticsearch and Solr are combined, thereby improving search efficiency. In practical use the method supports cluster expansion: as search engine cluster nodes are added, the strategy for selecting the search entry can be reconfigured, so the method is highly configurable.
Drawings
FIG. 1 is a logic diagram for implementing the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
A method for optimizing retrieval based on a switch design mode comprises the following steps:
A. monitoring the Solr data insertion interface using a data bus architecture, and obtaining the real-time data volume in Solr as soon as data flowing into Solr is detected;
B. creating two interfaces that query Elasticsearch and Solr respectively, and keeping the formats of the input parameters and return values of the two interfaces consistent;
C. using the switch design mode, creating a switch interface that manages both interfaces created in step B and switches between the Elasticsearch query interface and the Solr query interface according to the conditions passed in;
D. creating a user search interface that serves as the entry for user queries, receives the user's query conditions externally, obtains the real-time data volume from step A, passes the result to the switch interface of step C, and returns the retrieved data to the user;
in step C, the data volume obtained in real time is evaluated:
the condition passed in is solrInsert = true or num > n;
solrInsert = true indicates that data is being inserted into Solr;
num is the current data volume in Solr and n is the threshold above which Solr performance degrades; when the condition holds, the switch interface immediately switches the query to the Elasticsearch search entry; otherwise it switches to the Solr search entry, and the switch interface returns the data queried through the Elasticsearch or Solr interface as its return value.
The invention is described in detail as follows:
A. The Solr data insertion interface is managed and monitored using a data bus architecture. Whenever data flows into Solr, the monitoring program detects it immediately and also obtains the real-time data volume in Solr.
B. Two interfaces are created to query Elasticsearch and Solr respectively; the formats of their input parameters and return values must be kept consistent.
C. Using the switch design mode, a switch interface is created that manages both interfaces created in step B and decides, according to the conditions passed in, whether to use the Elasticsearch query interface or the Solr query interface. For example, when solrInsert = true or num > n is passed in (solrInsert = true indicates that data is being inserted into Solr, num is the current data volume in Solr, and n is the threshold above which Solr performance degrades), the switch interface immediately switches the query to the Elasticsearch search entry; otherwise it switches to the Solr search entry. The switch interface returns the data queried through the Elasticsearch or Solr interface as its return value.
D. A user search interface is created as the entry for user queries. Externally it receives the user's query conditions; internally it calls the Solr monitoring interface to obtain the Solr data volume and the insertion status (that is, whether solrInsert is true), passes the result to the switch interface of step C, and returns the retrieved data to the user.
Through the above steps, Solr is monitored using a data bus architecture and the two search entries are wrapped in a switch, so that the big data search entry is switched automatically and search efficiency is improved by combining the strengths of Elasticsearch and Solr in their respective areas.
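Steps A to D can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `DataBus` is a tiny in-process publish/subscribe stand-in for a real data bus, the topic names and the `count` event field are invented, the two query functions are placeholders that do not contact any search engine, and the assumption that solrInsert is reset when an insert finishes is not stated in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class DataBus:
    """Minimal in-process publish/subscribe bus (stand-in for a data bus)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._handlers.get(topic, []):
            handler(event)


@dataclass
class SolrMonitor:
    """Step A: tracks solrInsert and num from events seen on the bus."""
    solr_insert: bool = False
    num: int = 0

    def attach(self, bus: DataBus) -> None:
        bus.subscribe("solr.insert.started", self._started)
        bus.subscribe("solr.insert.finished", self._finished)

    def _started(self, event: dict) -> None:
        self.solr_insert = True

    def _finished(self, event: dict) -> None:
        # Assumption: the flag is cleared once the insert completes.
        self.solr_insert = False
        self.num += event["count"]


# Step B: both query functions share one signature (keyword in, list of hits out).
Query = Callable[[str], List[str]]


def solr_query(keyword: str) -> List[str]:
    return [f"solr:{keyword}"]  # placeholder for a real Solr call


def es_query(keyword: str) -> List[str]:
    return [f"es:{keyword}"]  # placeholder for a real Elasticsearch call


class SearchSwitch:
    """Step C: picks one of the two query interfaces for each request."""

    def __init__(self, es: Query, solr: Query, n: int) -> None:
        self.es, self.solr, self.n = es, solr, n

    def search(self, keyword: str, solr_insert: bool, num: int) -> List[str]:
        backend = self.es if (solr_insert or num > self.n) else self.solr
        return backend(keyword)


def user_search(keyword: str, monitor: SolrMonitor, switch: SearchSwitch) -> List[str]:
    """Step D: read the monitored state and delegate to the switch."""
    return switch.search(keyword, monitor.solr_insert, monitor.num)


bus = DataBus()
monitor = SolrMonitor()
monitor.attach(bus)
switch = SearchSwitch(es_query, solr_query, n=5)  # tiny n, for the demo only

print(user_search("policy", monitor, switch))  # ['solr:policy']
bus.publish("solr.insert.started", {})
print(user_search("policy", monitor, switch))  # ['es:policy']
bus.publish("solr.insert.finished", {"count": 10})
print(user_search("policy", monitor, switch))  # ['es:policy']
```

The last call still goes to Elasticsearch even though the insert has finished, because the data volume (10) now exceeds the demo threshold (5). Keeping the two query functions behind one signature is what lets the switch swap them without the caller noticing.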
Application case
Case one: The document library of a government website holds a large number of policy documents and related office documents. To query these documents quickly, a search engine can be used as the document retrieval tool. Because files in the document library are updated periodically rather than in real time, the search entry provided by this method switches to the Solr search engine when no data is being inserted and the data volume is below the set threshold; when data is being inserted or the data volume exceeds the set threshold, it switches to Elasticsearch, thereby improving search efficiency.
Case two: A system's file management function stores a large amount of user file information and must support frequent query operations, while write operations are relatively rare. To improve query performance, the method described herein can be used to switch the search engine entry automatically and optimize the search strategy, making full use of the strengths of both Solr and Elasticsearch and improving query efficiency.
Claims (2)
1. A method for optimizing retrieval based on a switch design mode, characterized by comprising the following steps:
A. monitoring the Solr data insertion interface using a data bus architecture, and obtaining the real-time data volume in Solr as soon as data flowing into Solr is detected;
B. creating two interfaces that query Elasticsearch and Solr respectively, and keeping the formats of the input parameters and return values of the two interfaces consistent;
C. using the switch design mode, creating a switch interface that manages both interfaces created in step B and switches between the Elasticsearch query interface and the Solr query interface according to the conditions passed in;
D. creating a user search interface that serves as the entry for user queries, receives the user's query conditions externally, obtains the real-time data volume from step A, passes the result to the switch interface of step C, and returns the retrieved data to the user.
2. The method of claim 1, wherein in step C the data volume obtained in real time is evaluated:
the condition passed in is solrInsert = true or num > n;
solrInsert = true indicates that data is being inserted into Solr;
num is the current data volume in Solr and n is the threshold above which Solr performance degrades; when the condition holds, the switch interface immediately switches the query to the Elasticsearch search entry; otherwise it switches to the Solr search entry, and the switch interface returns the data queried through the Elasticsearch or Solr interface as its return value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010644255.7A CN111767309B (en) | 2020-07-07 | 2020-07-07 | Method for optimizing retrieval based on switch design mode |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010644255.7A CN111767309B (en) | 2020-07-07 | 2020-07-07 | Method for optimizing retrieval based on switch design mode |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111767309A (en) | 2020-10-13 |
CN111767309B (en) | 2022-06-24 |
Family
ID=72723927
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010644255.7A Active CN111767309B (en) | 2020-07-07 | 2020-07-07 | Method for optimizing retrieval based on switch design mode |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111767309B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018036235A1 (en) * | 2016-08-22 | 2018-03-01 | 中兴通讯股份有限公司 | Solr data migration method and apparatus |
US20180089316A1 (en) * | 2016-09-26 | 2018-03-29 | Twiggle Ltd. | Seamless integration of modules for search enhancement |
US20180150487A1 (en) * | 2016-11-28 | 2018-05-31 | Atlassian Pty Ltd | Systems and methods for indexing source code in a search engine |
CN108121709A (en) * | 2016-11-28 | 2018-06-05 | 中兴通讯股份有限公司 | A kind of search processing method and device |
CN109189800A (en) * | 2018-08-16 | 2019-01-11 | 北京中科梧桐网络科技有限公司 | A kind of data Layer paging query model and querying method |
CN109241080A (en) * | 2018-09-29 | 2019-01-18 | 焦点科技股份有限公司 | A kind of the building application method and its system of FQL query language |
CN109299102A (en) * | 2018-10-23 | 2019-02-01 | 中国电子科技集团公司第二十八研究所 | A kind of HBase secondary index system and method based on Elastcisearch |
Non-Patent Citations (3)
Title |
---|
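ANTON FIRSOV: "Traditional IR Meets Ontology Engineering in Search for Data", SIGIR '17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL * |
WU YUCHEN et al.: "Improved adaptive switching search algorithm for big data retrieval", Journal of Xi'an Technological University * |
LI CHUANGEN: "Research on Elasticsearch data storage strategy", China Master's Theses Full-text Database, Information Science and Technology * |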
Also Published As
Publication number | Publication date |
---|---|
CN111767309B (en) | 2022-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108304444B (en) | Information query method and device | |
CN107729336B (en) | Data processing method, device and system | |
US8775410B2 (en) | Method for using dual indices to support query expansion, relevance/non-relevance models, blind/relevance feedback and an intelligent search interface | |
US7254580B1 (en) | System and method for selectively searching partitions of a database | |
CN108897761B (en) | Cluster storage method and device | |
JP2021531591A (en) | Association recommendation method, equipment, computer equipment and storage media | |
Reinanda et al. | Mining, ranking and recommending entity aspects | |
WO2009003050A2 (en) | System and method for measuring the quality of document sets | |
JP2012003740A (en) | Retrieval result generation method, retrieval result generation program and retrieval system | |
CN104166651A (en) | Data searching method and device based on integration of data objects in same classes | |
CN112307366B (en) | Information display method and device and computer storage medium | |
CN102915381B (en) | Visual network retrieval based on multi-dimensional semantic presents system and presents control method | |
WO2013056192A1 (en) | Presenting search results based upon subject-versions | |
KR20160053933A (en) | Smart search refinement | |
US20070168346A1 (en) | Method and system for implementing two-phased searching | |
Bouramoul et al. | PRESY: A Context based query reformulation tool for information retrieval on the Web | |
WO2020248378A1 (en) | Service query method and apparatus, and storage medium and computer device | |
US20120130999A1 (en) | Method and Apparatus for Searching Electronic Documents | |
EP2073131A1 (en) | Method and apparatus for processing a search query for text content items | |
CN114218211A (en) | Data processing system, method, computer device and readable storage medium | |
CN111767309B (en) | Method for optimizing retrieval based on switch design mode | |
Tsukuda et al. | Estimating intent types for search result diversification | |
CN103034709A (en) | System and method for resequencing search results | |
US8875007B2 (en) | Creating and modifying an image wiki page | |
KR20150096848A (en) | Apparatus for searching data using index and method for using the apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||