CN111125244A - Search engine inspection method and equipment - Google Patents

Search engine inspection method and equipment

Info

Publication number
CN111125244A
Authority
CN
China
Prior art keywords
data
data set
search engine
index
storing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911146662.9A
Other languages
Chinese (zh)
Inventor
徐培培
王兆贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN201911146662.9A priority Critical patent/CN111125244A/en
Publication of CN111125244A publication Critical patent/CN111125244A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for checking a search engine. The method comprises the following steps: importing quantitative data into a database of the search engine; acquiring index data in the quantitative data from the database, and storing the index data in a fixed format to form a first data set; acquiring the index data in the quantitative data through a visualization platform of the search engine, and storing the index data in the fixed format to form a second data set; comparing each piece of data in the first data set with the data in the second data set; and, in response to the first data set and the second data set being identical, judging that the search engine functions normally. The method can enhance the visualization effect and display, improve product quality, ensure data rigor and accuracy under large data volumes, improve data inspection efficiency, save human resources, and reduce the time cost of comparison.

Description

Search engine inspection method and equipment
Technical Field
The present invention relates to the field of computers, and more particularly to a search engine checking method and apparatus.
Background
Elasticsearch is a distributed full-text search engine. Kibana is an open source analysis and visualization platform designed for Elasticsearch; it can monitor, search, view and interact with data stored in an Elasticsearch index in real time, and can effectively present advanced data analysis and visualization using various charts, tables, maps and the like. Given the 4V+1O characteristics of big data, when Elasticsearch data is searched through Kibana, the consistency of the data plays a key role in the final search results and in the effect and accuracy of the visual display.
Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a method and a device for checking a search engine, which can enhance the visualization effect and display, improve product quality, ensure data rigor and accuracy under large data volumes, improve data checking efficiency, save human resources, and reduce the time cost of comparison.
In view of the above object, an aspect of embodiments of the present invention provides a method for verifying a search engine, including the steps of:
importing quantitative data into a database of a search engine;
acquiring index data in quantitative data from a database, and storing the index data in a fixed format to form a first data set;
acquiring index data in the quantitative data through a visualization platform of a search engine, and storing the index data in a fixed format to form a second data set;
comparing each piece of data in the first data set with the data in the second data set respectively;
and judging that the search engine functions normally in response to the first data set and the second data set being identical.
According to an embodiment of the present invention, further comprising: in response to the first data set not being identical to the second data set, storing different data in a results file and issuing a warning on the visualization platform.
According to an embodiment of the invention, comparing each piece of data in the first data set with data in the second data set comprises: and respectively searching the attribute keywords of each piece of data in the first data set in the second data set.
According to one embodiment of the invention, the search engine comprises Elasticsearch and the visualization platform comprises Kibana.
According to one embodiment of the invention, the fixed format is a json format.
In another aspect of an embodiment of the present invention, there is also provided an inspection apparatus for a search engine, including:
at least one processor; and
a memory storing program code executable by the processor, the program code when executed by the processor performing the steps of:
importing quantitative data into a database of a search engine;
acquiring index data in quantitative data from a database, and storing the index data in a fixed format to form a first data set;
acquiring index data in the quantitative data through a visualization platform of a search engine, and storing the index data in a fixed format to form a second data set;
comparing each piece of data in the first data set with the data in the second data set respectively;
and judging that the search engine functions normally in response to the first data set and the second data set being identical.
According to an embodiment of the invention, the program code further performs the following steps when executed by the processor: in response to the first data set not being identical to the second data set, storing different data in a results file and issuing a warning on the visualization platform.
According to an embodiment of the invention, comparing each piece of data in the first data set with data in the second data set comprises: and respectively searching the attribute keywords of each piece of data in the first data set in the second data set.
According to one embodiment of the invention, the search engine comprises Elasticsearch and the visualization platform comprises Kibana.
According to one embodiment of the invention, the fixed format is a json format.
The invention has the following beneficial technical effects. The method for checking a search engine provided by the embodiments of the invention imports quantitative data into a database of the search engine; acquires index data in the quantitative data from the database and stores it in a fixed format to form a first data set; acquires the index data in the quantitative data through a visualization platform of the search engine and stores it in the fixed format to form a second data set; compares each piece of data in the first data set with the data in the second data set; and judges that the search engine functions normally in response to the first data set being identical to the second data set. This technical scheme can enhance the visualization effect and display, improve product quality, ensure data rigor and accuracy under large data volumes, improve data inspection efficiency, save human resources, and reduce the time cost of comparison.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other embodiments can be obtained by using the drawings without creative efforts.
FIG. 1 is a schematic flow diagram of a method of verification of a search engine according to one embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
In view of the above objects, a first aspect of embodiments of the present invention proposes an embodiment of a method for verifying a search engine. Fig. 1 shows a schematic flow diagram of the method.
As shown in fig. 1, the method may include the steps of:
S1, importing quantitative data into a database of a search engine, wherein the data may comprise thousands of records, and at least 1000 records should be used to ensure test accuracy;
S2, acquiring index data in the quantitative data from the database, and storing the index data in a fixed format to form a first data set, wherein the first data set is acquired directly from the database;
S3, acquiring the index data in the quantitative data through a visualization platform of the search engine, and storing the index data in the fixed format to form a second data set, wherein the visualization platform can display content searched in the database, and the second data set is acquired from the database through the visualization platform;
S4, comparing each piece of data in the first data set with the data in the second data set, so as to check whether the data is changed when it passes through the visualization platform, or whether the search function of the search engine is normal;
S5, in response to the first data set being identical to the second data set, judging that the search engine functions normally; that is, if the two data sets are exactly the same, the search function can be judged to be normal.
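Steps S4 and S5 can be illustrated with a minimal sketch. It assumes both data sets have already been loaded as lists of JSON-like dictionaries; the helper names `canonical` and `data_sets_identical` and the sample records are hypothetical and are not taken from the patent.

```python
import json


def canonical(record):
    """Serialize a record with sorted keys so equal records give equal strings."""
    return json.dumps(record, sort_keys=True, ensure_ascii=False)


def data_sets_identical(first, second):
    """Return True when both data sets hold exactly the same records, ignoring order."""
    return sorted(map(canonical, first)) == sorted(map(canonical, second))


# Hypothetical sample data: the same two records, with different key and record order.
first_set = [
    {"account_number": 1, "balance": 39225},
    {"account_number": 6, "balance": 5686},
]
second_set = [
    {"balance": 5686, "account_number": 6},
    {"account_number": 1, "balance": 39225},
]

engine_ok = data_sets_identical(first_set, second_set)
print("search engine functions normally" if engine_ok else "search engine function is abnormal")
```

Serializing with sorted keys is one possible design choice: it makes the comparison independent of key order and record order, which may differ between the two acquisition paths.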
Through the technical scheme, the visualization effect and display can be enhanced, the product quality is improved, the data rigidness and the accuracy under large data volume are ensured, the data inspection efficiency is improved, the human resources are saved, and the time cost of comparison is reduced.
In a preferred embodiment of the present invention, the method further comprises: in response to the first data set not being identical to the second data set, storing the differing data in a results file and issuing a warning on the visualization platform. If one or more pieces of data are found to be inconsistent during the comparison, the function of the search engine is abnormal. In that case the inconsistent information needs to be stored for analysis, and a warning is issued on the visualization platform to inform the user that the current search engine is not trustworthy.
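A minimal sketch of the results-file step follows. The file name `consistency_result.json`, the record layout, and the use of Python's `logging` module as a stand-in for the warning on the visualization platform are all assumptions made for illustration; the patent does not specify how the warning is raised.

```python
import json
import logging
import tempfile
from pathlib import Path


def report_differences(differences, result_path):
    """Persist differing records; return the warning text, or None if nothing differs."""
    if not differences:
        return None
    Path(result_path).write_text(json.dumps(differences, indent=2), encoding="utf-8")
    message = f"{len(differences)} inconsistent record(s); search results are not trustworthy"
    logging.warning(message)  # stand-in for the warning on the visualization platform
    return message


# Hypothetical mismatch: the balance of document 6 differs between the two data sets.
result_file = Path(tempfile.gettempdir()) / "consistency_result.json"
sample_differences = [{"_id": "6", "field": "balance", "es": 5686, "kibana": 5680}]
warning = report_differences(sample_differences, result_file)
```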
In a preferred embodiment of the present invention, comparing each piece of data in the first data set with data in the second data set comprises: searching the second data set for the attribute keywords of each piece of data in the first data set. Alternatively, each complete piece of data in the first data set may be searched for in the second data set, but searching by attribute keyword can be faster than searching by the full record.
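One way to read the attribute keyword search is as a keyed lookup: index the second data set by one attribute, then look up each record of the first data set by that attribute. The sketch below assumes `account_number` is the keyword and is unique per record; the function names and sample data are hypothetical.

```python
def build_keyword_index(data_set, key="account_number"):
    """Map the keyword attribute of each record to the record itself."""
    return {record[key]: record for record in data_set}


def find_missing_or_changed(first, second, key="account_number"):
    """Return records of `first` that are absent from, or differ in, `second`."""
    index = build_keyword_index(second, key)
    return [record for record in first if index.get(record[key]) != record]


# Hypothetical sample data: account 6 has a different balance in the second set.
first_set = [
    {"account_number": 1, "balance": 39225},
    {"account_number": 6, "balance": 5686},
]
second_set = [
    {"account_number": 6, "balance": 5680},
    {"account_number": 1, "balance": 39225},
]

changed = find_missing_or_changed(first_set, second_set)
print(changed)
```

With a dictionary index each lookup takes roughly constant time, whereas scanning the second data set for every full record grows with the product of the two set sizes.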
In a preferred embodiment of the present invention, the search engine comprises an Elasticsearch and the visualization platform comprises Kibana.
In a preferred embodiment of the invention, the fixed format is a json format.
Examples
The search engine is Elasticsearch (ES for short), and the visualization platform is Kibana.
1. ES Bulk import of data
Use the _bulk API that comes with ES to import a large volume of data into ES in batches:
curl -XPOST "http://100.7.45.80:9200/bank/account/_bulk?pretty" -H 'Content-Type: application/x-ndjson' --data-binary @/usr/hdp/3.0.1.0-187/elasticsearch/data/accounts.json
Here accounts.json is the location of the bulk data file; the classic bank data set is used as an example.
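The _bulk endpoint expects newline-delimited JSON in which each document is preceded by an action line and the body ends with a newline. The following sketch shows how such a body could be built; the function name `build_bulk_body`, the choice of `account_number` as the document `_id`, and the sample records are assumptions, not details from the patent.

```python
import json


def build_bulk_body(records, id_field="account_number"):
    """Build an NDJSON bulk body: one action line, then one source line, per record."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_id": str(record[id_field])}}))
        lines.append(json.dumps(record))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample records in the style of the bank data set.
sample_records = [
    {"account_number": 1, "balance": 39225, "firstname": "Amber"},
    {"account_number": 6, "balance": 5686, "firstname": "Hattie"},
]
bulk_body = build_bulk_body(sample_records)
print(bulk_body, end="")
```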
2. ES obtains the index data and stores it in JSON format
Switch to the Elasticsearch user on the back end and use curl to query the bank index and check that it contains all the data. The bank index holds 1000 records with fields such as account_number, balance, firstname, lastname, age, gender and employee. After the query, write the result in JSON format to the file es_search_result.
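A sketch of the "store in a fixed format" step is shown below. It assumes the query response has the usual Elasticsearch search shape (`hits.hits[]` entries carrying `_id` and `_source`) and reduces it to a dictionary keyed by `_id`. The helper names, the `.json` file extension, and the sample response are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def hits_to_records(response):
    """Reduce a search response to {_id: _source} so both sides share one layout."""
    return {hit["_id"]: hit["_source"] for hit in response["hits"]["hits"]}


def save_fixed_format(records, path):
    """Write records as JSON with sorted keys so the file layout is stable."""
    Path(path).write_text(json.dumps(records, sort_keys=True, indent=2), encoding="utf-8")


# Hypothetical response fragment for the bank index.
sample_response = {
    "hits": {
        "hits": [
            {"_index": "bank", "_type": "account", "_id": "1", "_score": 1.0,
             "_source": {"account_number": 1, "balance": 39225}},
            {"_index": "bank", "_type": "account", "_id": "6", "_score": 1.0,
             "_source": {"account_number": 6, "balance": 5686}},
        ]
    }
}

es_records = hits_to_records(sample_response)
es_file = Path(tempfile.gettempdir()) / "es_search_result.json"
save_fixed_format(es_records, es_file)
```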
3. The Kibana interface automatically obtains the bank index data and stores it in JSON format
First, import the Selenium-related packages (Python + Selenium + XPath). Then modify the max_result_window parameter, that is, set the query depth to 10000, and query the complete information of all 1000 records in the bank index. Finally, store the final response in JSON format in the file Kibana_search_result.
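The patent does not show how max_result_window is modified. One common route is the index settings API (`PUT /<index>/_settings`), and the sketch below only constructs such a request without sending it. The function name is hypothetical, the host is reused from the curl example above, and the Selenium-driven Kibana query itself is not reproduced here.

```python
import json
from urllib.request import Request


def max_result_window_request(base_url, index, window=10000):
    """Build (but do not send) a request that sets index.max_result_window."""
    body = json.dumps({"index": {"max_result_window": window}}).encode("utf-8")
    return Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumption: same host and index as in the bulk import example.
settings_request = max_result_window_request("http://100.7.45.80:9200", "bank")
print(settings_request.get_method(), settings_request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that call is left out so the sketch has no network side effects.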
4. Consistency check between the index data obtained from ES and the index data obtained automatically through the Kibana interface
First, loop over es_search_result and Kibana_search_result and obtain the attributes of each _id entry, such as account_number, balance, firstname, lastname, age, gender and employee. Then judge whether each account_number, balance, firstname, lastname, age, gender and employee value returned by the ES query is consistent with the response obtained through the Kibana interface search. Finally, screen out the field data that fails the consistency check.
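The per-field check can be sketched as follows, assuming both result files have been loaded into dictionaries keyed by `_id`. The field tuple is a subset of the attributes named above, and the function name, mismatch layout, and sample values are hypothetical.

```python
FIELDS = ("account_number", "balance", "firstname", "lastname", "age", "gender")


def compare_by_id(es_records, kibana_records, fields=FIELDS):
    """Compare each field of each _id and collect the mismatches."""
    mismatches = []
    for doc_id, es_source in es_records.items():
        kibana_source = kibana_records.get(doc_id)
        if kibana_source is None:
            # The document is missing entirely on the Kibana side.
            mismatches.append({"_id": doc_id, "field": None, "es": es_source, "kibana": None})
            continue
        for field in fields:
            es_value = es_source.get(field)
            kibana_value = kibana_source.get(field)
            if es_value != kibana_value:
                mismatches.append(
                    {"_id": doc_id, "field": field, "es": es_value, "kibana": kibana_value}
                )
    return mismatches


# Hypothetical data: document 6 has a different age, document 13 is missing in Kibana.
es_data = {
    "1": {"account_number": 1, "balance": 39225, "age": 32},
    "6": {"account_number": 6, "balance": 5686, "age": 36},
    "13": {"account_number": 13, "balance": 32838, "age": 28},
}
kibana_data = {
    "1": {"account_number": 1, "balance": 39225, "age": 32},
    "6": {"account_number": 6, "balance": 5686, "age": 63},
}

mismatches = compare_by_id(es_data, kibana_data)
for item in mismatches:
    print(item)
```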
5. Screening out inconsistent data information
The method focuses on checking data consistency between ES and Kibana. Finally, the index field data found to be inconsistent by the consistency check is screened out, including _index, _type, _id, _score and the key fields account_number, balance, firstname, lastname, age, gender, employee and the like.
The example is built around the consistency check between the ES back-end data and the data visualized on the Kibana front-end page. First, a large amount of data is imported in batches through ES Bulk. Then the ES index (bank) data is obtained with curl and stored in JSON format. Next, the index data in ES is obtained automatically through Kibana using Python + Selenium + XPath, and the response is stored in JSON format. Finally, the index data obtained from ES is checked for consistency against the index data obtained automatically through the Kibana interface, and the inconsistent information is screened out. Given the volume, variety, real-time updating and value of big data, this effectively reduces the time and effort of manual comparison, improves working efficiency, solves the data consistency check problem, ensures data rigor and accuracy under large data volumes, and enhances the consistency of search results and the effectiveness of the visual display.
It should be noted that, as will be understood by those skilled in the art, all or part of the processes in the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program, and the above programs may be stored in a computer-readable storage medium, and when executed, the programs may include the processes of the embodiments of the methods as described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.
Furthermore, the method disclosed according to an embodiment of the present invention may also be implemented as a computer program executed by a CPU, and the computer program may be stored in a computer-readable storage medium. The computer program, when executed by the CPU, performs the above-described functions defined in the method disclosed in the embodiments of the present invention.
In view of the above object, a second aspect of an embodiment of the present invention provides an inspection apparatus for a search engine, characterized by comprising:
at least one processor; and
a memory storing program code executable by the processor, the program code when executed by the processor performing the steps of:
importing quantitative data into a database of a search engine;
acquiring index data in quantitative data from a database, and storing the index data in a fixed format to form a first data set;
acquiring index data in the quantitative data through a visualization platform of a search engine, and storing the index data in a fixed format to form a second data set;
comparing each piece of data in the first data set with the data in the second data set respectively;
and judging that the search engine functions normally in response to the first data set and the second data set being identical.
In a preferred embodiment of the invention, the program code further performs the following steps when executed by the processor: in response to the first data set not being identical to the second data set, storing different data in a results file and issuing a warning on the visualization platform.
In a preferred embodiment of the present invention, comparing each piece of data in the first data set with data in the second data set comprises: and respectively searching the attribute keywords of each piece of data in the first data set in the second data set.
In a preferred embodiment of the present invention, the search engine comprises an Elasticsearch and the visualization platform comprises Kibana.
In a preferred embodiment of the invention, the fixed format is a json format.
It should be particularly noted that the apparatus embodiment described above uses the method embodiment described above to explain the working process of each module, and those skilled in the art can readily conceive of applying these modules to other embodiments of the method.
Further, the above-described method steps and system elements or modules may also be implemented using a controller and a computer-readable storage medium for storing a computer program for causing the controller to implement the functions of the above-described steps or elements or modules.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
The embodiments described above, particularly any "preferred" embodiments, are possible examples of implementations and are presented merely to clearly understand the principles of the invention. Many variations and modifications may be made to the above-described embodiments without departing from the spirit and principles of the technology described herein. All such modifications are intended to be included within the scope of this disclosure and protected by the following claims.

Claims (10)

1. A method for checking a search engine, comprising the steps of:
importing quantitative data into a database of the search engine;
acquiring index data in the quantitative data from the database, and storing the index data in a fixed format to form a first data set;
obtaining the index data in the quantitative data via a visualization platform of the search engine and storing the index data in the fixed format to form a second data set;
comparing each piece of data in the first data set with data in the second data set respectively;
and, in response to the first data set and the second data set being identical, judging that the search engine functions normally.
2. The method of claim 1, further comprising:
in response to the first data set not being identical to the second data set, storing different data in a results file and issuing an alert on the visualization platform.
3. The method of claim 1, wherein comparing each piece of data in the first data set with data in the second data set comprises:
and searching the attribute keywords of each piece of data in the first data set in the second data set respectively.
4. The method of claim 1, wherein the search engine comprises an Elasticsearch and the visualization platform comprises Kibana.
5. The method of claim 1, wherein the fixed format is a json format.
6. An inspection apparatus of a search engine, the apparatus comprising:
at least one processor; and
a memory storing program code executable by the processor, the program code, when executed by the processor, performing the steps of:
importing quantitative data into a database of the search engine;
acquiring index data in the quantitative data from the database, and storing the index data in a fixed format to form a first data set;
obtaining the index data in the quantitative data via a visualization platform of the search engine and storing the index data in the fixed format to form a second data set;
comparing each piece of data in the first data set with data in the second data set respectively;
and, in response to the first data set and the second data set being identical, judging that the search engine functions normally.
7. The apparatus of claim 6, wherein the program code, when executed by the processor, further performs the steps of:
in response to the first data set not being identical to the second data set, storing different data in a results file and issuing an alert on the visualization platform.
8. The apparatus of claim 6, wherein comparing each piece of data in the first data set with data in the second data set comprises:
and searching the attribute keywords of each piece of data in the first data set in the second data set respectively.
9. The apparatus of claim 6, wherein the search engine comprises an Elasticsearch and the visualization platform comprises Kibana.
10. The apparatus of claim 6, wherein the fixed format is a json format.
CN201911146662.9A 2019-11-21 2019-11-21 Search engine inspection method and equipment Withdrawn CN111125244A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911146662.9A CN111125244A (en) 2019-11-21 2019-11-21 Search engine inspection method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911146662.9A CN111125244A (en) 2019-11-21 2019-11-21 Search engine inspection method and equipment

Publications (1)

Publication Number Publication Date
CN111125244A true CN111125244A (en) 2020-05-08

Family

ID=70495893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911146662.9A Withdrawn CN111125244A (en) 2019-11-21 2019-11-21 Search engine inspection method and equipment

Country Status (1)

Country Link
CN (1) CN111125244A (en)

Similar Documents

Publication Publication Date Title
US11847574B2 (en) Systems and methods for enriching modeling tools and infrastructure with semantics
US20200118311A1 (en) Systems and interactive user interfaces for dynamic retrieval, analysis, and triage of data items
US10216846B2 (en) Combinatorial business intelligence
US20210042866A1 (en) Method and apparatus for the semi-autonomous management, analysis and distribution of intellectual property assets between various entities
US8880440B2 (en) Automatic combination and mapping of text-mining services
Ali et al. Trust-based requirements traceability
US9779135B2 (en) Semantic related objects
EP2889788A1 (en) Accessing information content in a database platform using metadata
US20220107980A1 (en) Providing an object-based response to a natural language query
CN111782824A (en) Information query method, device, system and medium
CN112882702A (en) Information processing method and device for report configuration
CN111210321B (en) Risk early warning method and system based on contract management
CN109460363B (en) Automatic testing method and device, electronic equipment and computer readable medium
US20190114639A1 (en) Anomaly detection in data transactions
US11328005B2 (en) Machine learning (ML) based expansion of a data set
CN111125244A (en) Search engine inspection method and equipment
CN109344300A (en) The data query of natural language is intended to determine method, apparatus and computer equipment
US20210357401A1 (en) Automatic frequency recommendation for time series data
CN113961811A (en) Conversational recommendation method, device, equipment and medium based on event map
US20140236940A1 (en) System and method for organizing search results
US11868341B2 (en) Identification of content gaps based on relative user-selection rates between multiple discrete content sources
CN110737851B (en) Hyper-link semantization method, device, equipment and computer readable storage medium
US20230141506A1 (en) Pre-constructed query recommendations for data analytics
CN117421415A (en) Data processing method, device, electronic equipment and storage medium
JP6999400B2 (en) Text analyzer, text analysis method, and text analysis program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200508