CN113849499A - Data query method and device, storage medium and electronic device - Google Patents

Publication number
CN113849499A
Authority
CN
China
Legal status
Pending
Application number
CN202110984634.5A
Other languages
Chinese (zh)
Inventor
曹可
Current Assignee
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Original Assignee
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24557 Efficient disk access during query execution

Abstract

The invention discloses a data query method, a data query device, a storage medium and an electronic device, wherein the method comprises the following steps: searching target index data matched with a field to be retrieved in index data of a database search engine, wherein the index data comprises: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database; determining a target index field in the target index data under the condition that the target index data is found; and acquiring complete original data corresponding to the target index field in the database, and taking the complete original data as a query result of the field to be retrieved. The method solves the problems that the database search engine cannot perform data query under the condition of limited corresponding storage space and the like, improves the use efficiency of the search engine, avoids the problem that a large amount of storage space is needed during operation, realizes the quick search of the stored mass data through the linkage of the search engine and the database, and saves the cost of a data search framework.

Description

Data query method and device, storage medium and electronic device
Technical Field
The present invention relates to the field of communications, and in particular, to a method and an apparatus for querying data, a storage medium, and an electronic apparatus.
Background
In the prior art, when storage resources are limited, mass data is generally stored in HBase, but HBase query performance is low and cannot satisfy real-time fast queries. Elasticsearch, used as a search engine, offers very high query performance but requires solid-state disk (SSD) storage, so its storage cost is higher. Taking the Internet of Things as an example, collected data, logs, pictures and the like are stored in Elasticsearch as mass data, which requires a large amount of storage space. Long text fields and picture fields of logs are generally returned only as query results and are not used as retrieval conditions, so storing them in Elasticsearch is wasteful. In short, for mass data under limited storage resources, Elasticsearch as a search engine needs a large amount of SSD storage space, and the storage cost is high.
No effective solution has yet been proposed for problems in the related art such as a database search engine being unable to perform data queries when its storage space is limited.
Disclosure of Invention
The embodiment of the invention provides a data query method, a data query device, a storage medium and an electronic device, which at least solve the problems that in the related technology, a database search engine cannot perform data query under the condition of limited corresponding storage space and the like.
According to an aspect of the embodiments of the present invention, there is provided a method for querying data, including: searching target index data matched with a field to be retrieved in index data of a database search engine, wherein the index data comprises: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database; under the condition that the target index data is found, determining a target index field in the target index data; and acquiring complete original data corresponding to the target index field in the database, and taking the complete original data as a query result of the field to be retrieved.
In an exemplary embodiment, before searching the index data of the database search engine for the target index data matching the field to be retrieved, the method further comprises: acquiring query content input by a target object in the database search engine, wherein the query content is used for indicating text content required to be contained in original data to be acquired by the target object; and performing field extraction on the query content according to a preset field type to obtain a corresponding field to be retrieved, wherein the preset field type is used for indicating the part of speech of the original data query of the database search engine.
In an exemplary embodiment, finding target index data matched with a field to be retrieved in index data of a database search engine comprises: determining an index type corresponding to each field to be retrieved; and determining a corresponding index field in the index data according to the index type, and determining the data containing the index field as target index data matched with the field to be retrieved.
In an exemplary embodiment, before searching the index data of the database search engine for the target index data matching the field to be retrieved, the method further comprises: setting the index data according to a preset rule, wherein the preset rule comprises at least one of the following: determining a partition date corresponding to the original data, and writing the partition date into a corresponding index field to obtain index data of a date index corresponding to the original data; determining a data type corresponding to the original data, and writing the data type into a corresponding index field to obtain index data of a name index corresponding to the original data; and determining a fragment identifier corresponding to the original data, and writing the fragment identifier into a corresponding index field to obtain index data of a fragment index corresponding to the original data.
In an exemplary embodiment, obtaining complete original data corresponding to the target index field in the database, and using the complete original data as a query result of the field to be retrieved, includes: acquiring text content in the database according to the target index field, wherein the text content is the complete original data after replacement processing, and the replacement processing replaces words in the complete original data with identifiers by means of a preset inverted index; determining the words corresponding to the identifiers in the text content according to the preset inverted index, so as to obtain the complete original data corresponding to the target index field; and, when the semantics of the complete original data are determined to be correct, taking the complete original data as the query result of the field to be retrieved.
In an exemplary embodiment, obtaining text content in the database according to the target index field includes: determining the index attribute of the target index field and the row key corresponding to the target index field; inquiring in a database according to the index attribute and the row key; and under the condition that the query result corresponds to the service field in the target index field, taking the data corresponding to the query result as the text content corresponding to the target index field.
According to another aspect of the embodiments of the present invention, there is also provided a data query apparatus, including: the search module is used for searching target index data matched with a field to be searched in index data of a database search engine, wherein the index data comprises: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database; the determining module is used for determining a target index field in the target index data under the condition that the target index data is found; and the acquisition module is used for acquiring complete original data corresponding to the target index field in the database and taking the complete original data as a query result of the field to be retrieved.
In an exemplary embodiment, the apparatus further includes: the extraction module is used for acquiring query contents input by a target object in the database search engine, wherein the query contents are used for indicating text contents required to be contained in original data to be acquired by the target object; and performing field extraction on the query content according to a preset field type to obtain a corresponding field to be retrieved, wherein the preset field type is used for indicating the part of speech of the original data query of the database search engine.
In an exemplary embodiment, the search module is further configured to determine an index type corresponding to each field to be retrieved; and determining a corresponding index field in the index data according to the index type, and determining the data containing the index field as target index data matched with the field to be retrieved.
In an exemplary embodiment, the apparatus further includes: a setting module, configured to set the index data according to a preset rule, where the preset rule includes at least one of: determining a partition date corresponding to the original data, and writing the partition date into a corresponding index field to obtain index data of a date index corresponding to the original data; determining a data type corresponding to the original data, and writing the data type into a corresponding index field to obtain index data of a name index corresponding to the original data; and determining a fragment identifier corresponding to the original data, and writing the fragment identifier into a corresponding index field to obtain index data of a fragment index corresponding to the original data.
In an exemplary embodiment, the obtaining module is further configured to: obtain text content in the database according to the target index field, where the text content is the complete original data after replacement processing, and the replacement processing replaces words in the complete original data with identifiers by means of a preset inverted index; determine the words corresponding to the identifiers in the text content according to the preset inverted index, so as to obtain the complete original data corresponding to the target index field; and, when the semantics of the complete original data are determined to be correct, take the complete original data as the query result of the field to be retrieved.
In an exemplary embodiment, the obtaining module is further configured to determine an index attribute of the target index field and a row key corresponding to the target index field; inquiring in a database according to the index attribute and the row key; and under the condition that the query result corresponds to the service field in the target index field, taking the data corresponding to the query result as the text content corresponding to the target index field.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to execute the above data query method when running.
According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the method for querying data through the computer program.
In the embodiment of the invention, target index data matched with a field to be retrieved is searched in index data of a database search engine, wherein the index data comprises the following components: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database; determining a target index field in the target index data under the condition that the target index data is found; and acquiring complete original data corresponding to the target index field in the database, and taking the complete original data as a query result of the field to be retrieved. The method and the device have the advantages that index data with small data capacity are stored in the search engine, target index data matched with a field to be retrieved are searched in the index data, and then corresponding original data are found in the database according to corresponding content information in the target index data, so that the problems that in the related technology, the database search engine cannot perform data query under the condition that corresponding storage space is limited and the like are solved, the use efficiency of the search engine is improved, the problem that a large amount of storage space is needed in operation is solved, the stored mass data can be quickly searched through linkage of the search engine and the database, and the cost of a data search framework is saved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a block diagram of a hardware configuration of a computer terminal of a data query method according to an embodiment of the present invention;
FIG. 2 is a flow diagram of a method of querying data according to an embodiment of the invention;
FIG. 3 is a schematic flow diagram of an instant query store in accordance with an alternative embodiment of the present invention;
FIG. 4 is an illustration of a lookup of raw data according to corresponding search fields in accordance with an alternative embodiment of the present invention;
FIG. 5 is a block diagram of a data query apparatus according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method provided by the embodiment of the application can be executed in a computer terminal, a device terminal or a similar operation device. Taking the example of being operated on a computer terminal, fig. 1 is a hardware structure block diagram of a computer terminal of a data query method according to an embodiment of the present invention. As shown in fig. 1, the computer terminal may include one or more (only one shown in fig. 1) processors 102 (the processors 102 may include, but are not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data, and in an exemplary embodiment, may also include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the computer terminal. For example, the computer terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration with equivalent functionality to that shown in FIG. 1 or with more functionality than that shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the data query method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, so as to implement the above-mentioned method. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to a computer terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a network interface controller (NIC), which can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a radio frequency (RF) module, which is used for communicating with the internet wirelessly.
In this embodiment, a data query method is provided, and fig. 2 is a flowchart of a data query method according to an embodiment of the present invention, where the flowchart includes the following steps:
step S202, searching target index data matched with a field to be retrieved in index data of a database search engine, wherein the index data comprises: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database;
step S204, determining a target index field in the target index data under the condition that the target index data is found;
step S206, acquiring complete original data corresponding to the target index field in the database, and taking the complete original data as a query result of the field to be retrieved.
Through the above steps S202 to S206, target index data matched with the field to be retrieved is searched in the index data of the database search engine, where the index data includes: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database; determining a target index field in the target index data under the condition that the target index data is found; and acquiring complete original data corresponding to the target index field in the database, and taking the complete original data as a query result of the field to be retrieved. The method and the device have the advantages that index data with small data capacity are stored in the search engine, target index data matched with a field to be retrieved are searched in the index data, and then corresponding original data are found in the database according to corresponding content information in the target index data, so that the problems that in the related technology, the database search engine cannot perform data query under the condition that corresponding storage space is limited and the like are solved, the use efficiency of the search engine is improved, the problem that a large amount of storage space is needed in operation is solved, the stored mass data can be quickly searched through linkage of the search engine and the database, and the cost of a data search framework is saved.
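The two-stage lookup in steps S202 to S206 can be illustrated with a minimal sketch. Here plain Python containers stand in for the search engine and the database, and every name and record (`INDEX_DATA`, `DATABASE`, `find_target_index`, `query`, the device fields) is hypothetical and chosen only for illustration; this is not the patented implementation.

```python
from typing import Optional

# Stand-in for the search engine: small index records that hold only the
# searchable index fields plus the row key of the full record (hypothetical data).
INDEX_DATA = [
    {"device_id": "fridge-01", "event": "door_open", "rowkey": "r1"},
    {"device_id": "washer-07", "event": "cycle_done", "rowkey": "r2"},
]

# Stand-in for the database: complete original data keyed by row key,
# including large fields that are never used as search conditions.
DATABASE = {
    "r1": {"device_id": "fridge-01", "event": "door_open", "log": "long text ..."},
    "r2": {"device_id": "washer-07", "event": "cycle_done", "log": "other text ..."},
}


def find_target_index(fields: dict) -> Optional[dict]:
    """S202: return the first index entry matching every field to be retrieved."""
    for entry in INDEX_DATA:
        if all(entry.get(name) == value for name, value in fields.items()):
            return entry
    return None


def query(fields: dict) -> Optional[dict]:
    """Run S202 to S206 and return the complete original data, or None."""
    target = find_target_index(fields)   # S202: search the small index
    if target is None:
        return None
    rowkey = target["rowkey"]            # S204: row key of the target index field
    return DATABASE.get(rowkey)          # S206: fetch the complete original data
```

Only the small index records are searched; the large `log` field is read once, by row key, after a match is found.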
In an exemplary embodiment, before searching the index data of the database search engine for the target index data matching the field to be retrieved, the method further comprises: acquiring query content input by a target object in the database search engine, wherein the query content is used for indicating text content required to be contained in original data to be acquired by the target object; and performing field extraction on the query content according to a preset field type to obtain a corresponding field to be retrieved, wherein the preset field type is used for indicating the part of speech of the original data query of the database search engine.
In short, before searching corresponding original data for a target object according to query content input by the target object, in order to ensure the accuracy of the data, field extraction is performed on the query content through a preset field type, a field to be retrieved is determined, and the searching efficiency in index data is accelerated.
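A minimal sketch of such field extraction follows. The text does not specify the query syntax or the preset field types, so the `name:value` token format, the `PRESET_FIELD_TYPES` set, and the function name `extract_fields` are all assumptions made for illustration.

```python
# Hypothetical preset field types; the text does not enumerate them.
PRESET_FIELD_TYPES = {"device_id", "event", "date"}


def extract_fields(query_content: str) -> dict:
    """Keep only name:value tokens whose name is a preset field type."""
    fields = {}
    for token in query_content.split():
        name, sep, value = token.partition(":")
        if sep and value and name in PRESET_FIELD_TYPES:
            fields[name] = value
    return fields
```

Tokens that do not match a preset field type are dropped, so only usable fields to be retrieved reach the index lookup.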
In an exemplary embodiment, finding target index data matched with a field to be retrieved in index data of a database search engine comprises: determining an index type corresponding to each field to be retrieved; and determining a corresponding index field in the index data according to the index type, and determining the data containing the index field as target index data matched with the field to be retrieved.
It should be noted that the index type is used to indicate a type of a field to be retrieved when the field is searched in index data, a search direction may be determined in a classified data type in the index data according to the index type, and then data including the index field is determined as target index data matched with the field to be retrieved.
In an exemplary embodiment, before searching the index data of the database search engine for the target index data matching the field to be retrieved, the method further comprises: setting the index data according to a preset rule, wherein the preset rule comprises at least one of the following: determining a partition date corresponding to the original data, and writing the partition date into a corresponding index field to obtain index data of a date index corresponding to the original data; determining a data type corresponding to the original data, and writing the data type into a corresponding index field to obtain index data of a name index corresponding to the original data; and determining a fragment identifier corresponding to the original data, and writing the fragment identifier into a corresponding index field to obtain index data of a fragment index corresponding to the original data.
For example, when the data is mass data, the index is generally split into multiple indexes by month, named table_name-yyyMM-dd to dd, where yyyMM is the year and month and dd to dd are the start and end dates. Data is written into the index field of the corresponding date range according to its partition date. Data of the same type can be combined through the index alias table_name_rd, which provides the external query service. As for the number of shards per index instance (instances are currently divided by time range, and multiple shards are stored on different servers), ES recommends 5 to 15 shards: if the calculated value is less than 5, use 5 shards, although the calculated value may be used when the data volume is particularly small. ES recommends a maximum JVM heap of 30 to 32 GB, so the maximum capacity of a shard is limited to 30 GB.
In an exemplary embodiment, obtaining complete original data corresponding to the target index field in the database, and using the complete original data as a query result of the field to be retrieved, includes: acquiring text content in the database according to the target index field, wherein the text content is the complete original data after replacement processing, and the replacement processing replaces words in the complete original data with identifiers by means of a preset inverted index; determining the words corresponding to the identifiers in the text content according to the preset inverted index, so as to obtain the complete original data corresponding to the target index field; and, when the semantics of the complete original data are determined to be correct, taking the complete original data as the query result of the field to be retrieved.
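The shard-count rule can be sketched as below. The text does not say how the "calculated value" is obtained; this sketch assumes it is the index size divided by the 30 GB shard limit, rounded up. The function name and the `allow_small` flag are hypothetical.

```python
import math

MAX_SHARD_GB = 30  # from the text: shard capacity is limited to 30 GB
MIN_SHARDS = 5     # from the text: use 5 shards if the calculated value is below 5


def shard_count(index_size_gb: float, allow_small: bool = False) -> int:
    """Assumed calculation: ceil(index size / 30 GB), raised to 5 unless allow_small."""
    calculated = max(1, math.ceil(index_size_gb / MAX_SHARD_GB))
    if calculated < MIN_SHARDS and not allow_small:
        return MIN_SHARDS
    return calculated
```

The sketch does not cap the result at 15; an index that would need more shards than that is presumably a candidate for splitting into narrower date ranges, which the text describes but does not quantify.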
In an exemplary embodiment, obtaining text content in the database according to the target index field includes: determining the index attribute of the target index field and the row key corresponding to the target index field; inquiring in a database according to the index attribute and the row key; and under the condition that the query result corresponds to the service field in the target index field, taking the data corresponding to the query result as the text content corresponding to the target index field.
For example, to improve the storage efficiency of the original data, the words in the original data are converted into identifiers through the preset inverted index before storage. Therefore, when the corresponding original data is found in the database according to the target index field, the identifiers need to be looked up again in the preset inverted index and converted back into the corresponding words, yielding the complete original data that the search engine can display as the query result. In other words, the stored original data differs from the complete original data in that part or all of it has been converted through the preset inverted index.
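The word-to-identifier replacement and its reversal can be sketched with a pair of lookup tables. The vocabulary, the `#n` identifier format, and the function names are hypothetical; the text only states that a preset inverted index maps words to identifiers and back.

```python
# Hypothetical preset mapping between words and compact identifiers.
WORD_TO_ID = {"temperature": "#1", "compressor": "#2", "warning": "#3"}
ID_TO_WORD = {ident: word for word, ident in WORD_TO_ID.items()}


def replace_words(text: str) -> str:
    """Replacement processing: swap known words for identifiers before storage."""
    return " ".join(WORD_TO_ID.get(token, token) for token in text.split())


def restore_words(text: str) -> str:
    """Reverse step at query time: swap identifiers back into words."""
    return " ".join(ID_TO_WORD.get(token, token) for token in text.split())
```

The round trip is exact only under the sketch's assumptions: words are separated by single spaces and the raw text never contains a token that looks like an identifier.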
In order to better understand the process of the data query method, the following describes a flow of the data query method with reference to several alternative embodiments.
As an optional implementation, Elasticsearch works in the following main steps: first, a user submits data to Elasticsearch; next, a tokenizer splits the corresponding sentences into words, and the weights and tokenization results are stored together with the data; then, when a user searches, the results are scored and ranked according to the weights, and the ranked results are returned to the user.
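The tokenize, weight, and rank sequence described above can be illustrated with a toy inverted index. Raw term frequency is used as the weight purely for illustration; it is not Elasticsearch's actual relevance scoring, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(docs: dict) -> dict:
    """Tokenize each document on whitespace and store term -> {doc_id: weight}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(text.lower().split()).items():
            index[term][doc_id] = freq  # weight = term frequency (illustrative)
    return index


def search(index: dict, query: str) -> list:
    """Sum the weights of matching terms and return doc ids, best score first."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += weight
    return [doc_id for doc_id, _ in scores.most_common()]
```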
In an optional embodiment of the invention, a half-index-based instant query storage design is mainly provided, and by combining a distributed full-text retrieval Search server ElasticSearch with a database HBase and adopting a half-index (HBase + ElasticSearch) storage design, the storage requirement space of the ElasticSearch is reduced, the query performance is improved, and the optimization of the storage space required by the ElasticSearch is realized.
Optionally, fig. 3 is a schematic flowchart of an instant query storage according to an alternative embodiment of the present invention, as shown in fig. 3, including:
step S302, determining query conditions;
step S304, searching the index data stored on the Elasticsearch search server for fields that meet the query condition, wherein the index data consists of the fields that need to be retrieved and is stored in an Elasticsearch cluster;
optionally, FIG. 4 is a schematic diagram of looking up original data according to the corresponding search field; before searching, the index data is built by processing the original data according to the preset inverted index, which yields the index data corresponding to the original data.
Step S306, after the target index data is determined, analyzing the target index data to determine a corresponding row key (rowkey), wherein the row key is a unique identifier of a data row in a table and is used as a main key of a retrieval record;
optionally, the rowkey design sets the row key as yyyMM + segment + other service fields, where the yyyMM + segment corresponds to the split key (segment index) of the region, and determines which region the data enters;
step S308, finding out corresponding data partitions in the database HBase according to the row keys to obtain corresponding original data, wherein the original data are full fields and comprise certain overlong text data, such as photos, and are stored in an HBase cluster;
step S310, the original data is returned as the query result.
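The row key layout described for step S306 (yyyMM + segment + other service fields) can be sketched as follows. The four-digit year, the two-digit zero-padded segment number, the `|` separator, and both function names are assumptions made for illustration; the text does not fix any of them.

```python
from datetime import date


def build_rowkey(partition_date: date, segment: int, *service_fields: str) -> str:
    """Concatenate year-month + segment (the split-key prefix) + service fields."""
    prefix = f"{partition_date:%Y%m}{segment:02d}"  # assumed widths and format
    return "|".join([prefix, *service_fields])


def split_key(rowkey: str) -> str:
    """Return the prefix that decides which region the row is routed to."""
    return rowkey.split("|", 1)[0]
```

Because the year-month and segment come first, rows from the same month and segment share a prefix and land in the same region, which is what lets step S308 locate the data partition directly from the row key.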
It should be noted that, to implement the above query scheme, a storage design is needed for the HBase database: regions are divided according to the type of data to be stored, where a region is the basic unit in which HBase stores and manages data.
Optionally, each region is sized between 10 GB and 30 GB, with 10 GB as the usual default.
It should be noted that the smaller the regions, the more evenly load is balanced across the HBase cluster, and the faster and more stable compaction is; however, too many regions lengthen the recovery time after a node goes down. A large number of small regions triggers more frequent flushes, which produce many small files and cause unnecessary compaction. In special scenarios, once the number of regions exceeds a threshold, a flush at the level of the whole region server is triggered, which severely blocks user reads and writes, and the region management overhead becomes large. Larger regions favor fast restart and crash recovery of the HBase cluster, reduce the number of RPCs, and produce fewer, larger flushes; however, compaction performs worse, which causes large write jitter and poorer stability, and load balancing across the cluster suffers.
Optionally, at design time, the number of regions per month = single record size (KB) × total number of records per day × 30 / 1024 / 1024 / 20 GB (region size);
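As a rough illustration of the sizing rule above, the following Python sketch evaluates the formula. The function name, the rounding up to a whole region, and the floor of one region are assumptions added here; the 30-day month and the 20 GB region size come from the formula itself.

```python
import math


def regions_per_month(record_kb: float, records_per_day: int,
                      region_gb: float = 20.0, days: int = 30) -> int:
    """Number of regions needed for one month of data.

    record_kb * records_per_day * days is the monthly volume in KB;
    dividing by 1024 twice converts KB to GB.
    """
    monthly_gb = record_kb * records_per_day * days / 1024 / 1024
    # Rounding up and never returning 0 are assumptions of this sketch.
    return max(1, math.ceil(monthly_gb / region_gb))


# Hypothetical workload: 2 KB records, 10 million records per day.
print(regions_per_month(2, 10_000_000))
```

For this hypothetical workload the monthly volume is about 572 GB, so 29 regions of 20 GB are needed.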
As an alternative embodiment, the storage design of Elasticsearch is as follows:
1. Indexes and shards. For mass data, the index is split by month into several indexes named table_name-yyyyMM-dd to dd, where yyyyMM is the year and month, and dd to dd are the start and end dates. Data is written to the index whose start and end dates cover its partition date. Data of the same type can be combined through the index alias table_name_rd to provide a single query service externally. Each index instance (currently divided by time) is split into shards, and multiple shards are stored on different servers. ES recommends 5 to 15 shards, so if the calculated value is less than 5, use 5 shards; if the data volume is particularly small, the calculated value may be used instead. The maximum JVM heap space recommended by ES is 30 to 32 GB, so the maximum shard capacity is limited to 30 GB.
Optionally, monthly index count × 30 GB > single record size (KB) × total number of records per day × 30 / 1024 / 1024.
2. The _source field. By default, the _source field stores the original document. By setting the rowkey and a small number of fields as includes and a large number of fields as excludes, the fields stored from the original document are reduced, which reduces the Elasticsearch storage size while still meeting the query requirements.
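The sizing inequality in item 1 can be sketched as below. The source does not say clearly whether the count refers to indexes or to shards, so this sketch simply treats it as the number of 30 GB units needed for one month; the function name and the `allow_small` flag are assumptions introduced for illustration.

```python
import math

MAX_UNIT_GB = 30   # capacity limit per shard stated in the text
MIN_SHARDS = 5     # lower end of the 5-15 range stated in the text


def units_for_month(record_kb: float, records_per_day: int,
                    days: int = 30, allow_small: bool = False) -> int:
    """Smallest count such that count * 30 GB > monthly data volume in GB."""
    monthly_gb = record_kb * records_per_day * days / 1024 / 1024
    needed = math.floor(monthly_gb / MAX_UNIT_GB) + 1  # strict inequality
    if needed < MIN_SHARDS and not allow_small:
        # Below the recommended minimum: fall back to 5.
        return MIN_SHARDS
    return needed


print(units_for_month(2, 10_000_000))                  # large workload
print(units_for_month(0.5, 100_000))                   # small, clamped to 5
print(units_for_month(0.5, 100_000, allow_small=True)) # small, calculated value
```

A result above 15 suggests splitting the month into more date-range indexes, which is consistent with the dd to dd naming scheme, although the source does not spell out that step.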
Through this optional embodiment, after the large fields are removed from Elasticsearch and only the retained fields are configured as includes, each Elasticsearch document becomes smaller, so more data can be stored in the storage space allocated to Elasticsearch. This reduces Elasticsearch storage and improves query performance, and the storage and query of mass data are realized through semi-indexed data storage that combines Elasticsearch and HBase.
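A minimal sketch of such an index definition is shown below, built as a plain Python dictionary. The `_source` `includes` and `excludes` mapping options are Elasticsearch features; the field names (`rowkey`, `device_id`, `event_time`, `photo`, `raw_text`) and the helper names are hypothetical and only illustrate the split between small indexed fields and large fields kept in HBase.

```python
import json
from typing import Dict, Iterable


def build_index_body(kept_fields: Iterable[str], large_fields: Iterable[str]) -> Dict:
    """Index body whose _source keeps only the small fields and drops the large ones."""
    return {
        "mappings": {
            "_source": {
                "includes": list(kept_fields),
                "excludes": list(large_fields),
            }
        }
    }


def to_index_doc(full_record: Dict, kept_fields: Iterable[str]) -> Dict:
    """Project a full record onto the fields that go to the search engine."""
    keep = set(kept_fields)
    return {k: v for k, v in full_record.items() if k in keep}


kept = ["rowkey", "device_id", "event_time"]
body = build_index_body(kept, ["photo", "raw_text"])
print(json.dumps(body, indent=2))

record = {"rowkey": "20210801device42", "device_id": "device42",
          "event_time": "2021-08-25T10:00:00", "photo": "<large blob>"}
print(to_index_doc(record, kept))
```

The full record, including the large `photo` field, would still be written to HBase under the same row key.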
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a data query apparatus is further provided. The apparatus is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated here. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 5 is a block diagram of a data query apparatus according to an embodiment of the present invention, as shown in fig. 5, the apparatus including:
the searching module 52 is configured to search, in the index data of the database search engine, target index data matched with the field to be retrieved, where the index data includes: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database;
a determining module 54, configured to determine a target index field in the target index data when the target index data is found;
an obtaining module 56, configured to obtain complete original data corresponding to the target index field in the database, and use the complete original data as a query result of the field to be retrieved.
Through the above apparatus, target index data matching the field to be retrieved is searched for in the index data of the database search engine, where the index data includes an index field and a row key corresponding to the index field, and the index field corresponds to original data stored in a database; when the target index data is found, a target index field in the target index data is determined; and the complete original data corresponding to the target index field is obtained from the database and used as the query result for the field to be retrieved. That is, index data with a small data volume is stored in the search engine, the target index data matching the field to be retrieved is looked up in that index data, and the corresponding original data is then found in the database according to the content of the target index data. This solves problems in the related art such as the database search engine being unable to perform data queries when its storage space is limited, improves the utilization efficiency of the search engine, avoids the need for a large amount of storage space at run time, allows stored mass data to be searched quickly through the linkage between the search engine and the database, and reduces the cost of the data search architecture.
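The interplay of the three modules can be sketched with in-memory stand-ins. This is not the patent's implementation: `INDEX_DATA` plays the role of the search engine, `DATABASE` plays the role of the HBase table keyed by row key, and all field names and sample values are hypothetical.

```python
from typing import Dict, List

# Stand-in for the search engine: small documents (index fields + row key).
INDEX_DATA: List[Dict[str, str]] = [
    {"device_id": "fridge-01", "event": "door_open", "rowkey": "20210801fridge-01#0001"},
    {"device_id": "washer-07", "event": "cycle_done", "rowkey": "20210802washer-07#0002"},
]

# Stand-in for the database: row key -> complete original data.
DATABASE: Dict[str, Dict[str, str]] = {
    "20210801fridge-01#0001": {"device_id": "fridge-01", "event": "door_open",
                               "photo": "<large blob>"},
    "20210802washer-07#0002": {"device_id": "washer-07", "event": "cycle_done",
                               "photo": "<large blob>"},
}


def search_index(fields: Dict[str, str]) -> List[Dict[str, str]]:
    """Searching module: find index documents matching every field to be retrieved."""
    return [doc for doc in INDEX_DATA
            if all(doc.get(name) == value for name, value in fields.items())]


def query(fields: Dict[str, str]) -> List[Dict[str, str]]:
    """Determining + obtaining modules: follow each hit's row key into the database."""
    results = []
    for hit in search_index(fields):
        record = DATABASE.get(hit["rowkey"])
        if record is not None:
            results.append(record)
    return results


print(query({"device_id": "fridge-01"}))
```

Only the small index documents are scanned during matching; the large `photo` field is read once, by row key, for the hits that are actually returned.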
In an exemplary embodiment, the apparatus further includes: the extraction module is used for acquiring query contents input by a target object in the database search engine, wherein the query contents are used for indicating text contents required to be contained in original data to be acquired by the target object; and performing field extraction on the query content according to a preset field type to obtain a corresponding field to be retrieved, wherein the preset field type is used for indicating the part of speech of the original data query of the database search engine.
In short, before searching corresponding original data for a target object according to query content input by the target object, in order to ensure the accuracy of the data, field extraction is performed on the query content through a preset field type, a field to be retrieved is determined, and the searching efficiency in index data is accelerated.
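Field extraction of this kind can be sketched as follows. The source does not specify how the preset field types are defined, so this sketch substitutes simple regular expressions; the type names `date` and `device_id`, the patterns, and the sample query are all assumptions.

```python
import re
from typing import Dict

# Hypothetical preset field types, each recognised by a regular expression.
PRESET_FIELD_TYPES = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "device_id": re.compile(r"\b[a-z]+-\d+\b"),
}


def extract_fields(query_content: str) -> Dict[str, str]:
    """Return the first match of each preset field type found in the query content."""
    fields = {}
    for field_type, pattern in PRESET_FIELD_TYPES.items():
        match = pattern.search(query_content)
        if match:
            fields[field_type] = match.group(0)
    return fields


print(extract_fields("photos from fridge-01 on 2021-08-25"))
```

The extracted dictionary is what would then be matched against the index data, so free-text words that fit no preset type are ignored.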
In an exemplary embodiment, the search module is further configured to determine an index type corresponding to each field to be retrieved; and determining a corresponding index field in the index data according to the index type, and determining the data containing the index field as target index data matched with the field to be retrieved.
It should be noted that the index type is used to indicate a type of a field to be retrieved when the field is searched in index data, a search direction may be determined in a classified data type in the index data according to the index type, and then data including the index field is determined as target index data matched with the field to be retrieved.
In an exemplary embodiment, the apparatus further includes: a setting module, configured to set the index data according to a preset rule, where the preset rule includes at least one of: determining a partition date corresponding to the original data, and writing the partition date into a corresponding index field to obtain index data of a date index corresponding to the original data; determining a data type corresponding to the original data, and writing the data type into a corresponding index field to obtain index data of a name index corresponding to the original data; and determining a fragment identifier corresponding to the original data, and writing the fragment identifier into a corresponding index field to obtain index data of a fragment index corresponding to the original data.
For example, when the data is mass data, the index is generally split by month into several indexes named table_name-yyyyMM-dd to dd, where yyyyMM is the year and month, and dd to dd are the start and end dates. Data is written to the index field of the corresponding start and end dates according to its partition date. Data of the same type can be combined through the index alias table_name_rd to provide a single query service externally. Each index instance (currently divided by time) is split into shards, and multiple shards are stored on different servers. ES recommends 5 to 15 shards, so if the calculated value is less than 5, use 5 shards; if the data volume is particularly small, the calculated value may be used instead. The maximum JVM heap space recommended by ES is 30 to 32 GB, so the maximum shard capacity is limited to 30 GB.
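The naming scheme can be illustrated with a short sketch that routes a record to an index by its partition date. The exact separator between the start and end days, the day ranges, and the table name `device_log` are assumptions; the `_rd` alias suffix follows the text.

```python
from datetime import date
from typing import Sequence, Tuple


def index_name(table: str, partition_date: date,
               day_ranges: Sequence[Tuple[int, int]]) -> str:
    """Pick the index whose start/end days cover the partition date."""
    for start, end in day_ranges:
        if start <= partition_date.day <= end:
            # "to" as the separator is an assumption of this sketch.
            return f"{table}-{partition_date:%Y%m}-{start:02d}to{end:02d}"
    raise ValueError(f"no index range covers day {partition_date.day}")


def read_alias(table: str) -> str:
    """Alias that combines all indexes of one data type for querying."""
    return f"{table}_rd"


# Hypothetical split of each month into three date-range indexes.
ranges = [(1, 10), (11, 20), (21, 31)]
print(index_name("device_log", date(2021, 8, 25), ranges))
print(read_alias("device_log"))
```

Writes go to the concrete date-range index, while readers only need the alias and are unaffected when the ranges change.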
In an exemplary embodiment, the obtaining module is further configured to: obtain text content from the database according to the target index field, where the text content is the complete original data after replacement processing, and the replacement processing replaces words in the complete original data with identifiers by means of a preset inverted index; determine the words corresponding to the identifiers in the text content according to the preset inverted index, so as to obtain the complete original data corresponding to the target index field; and, when the semantics of the complete original data are determined to be correct, use the complete original data as the query result for the field to be retrieved.
In an exemplary embodiment, the obtaining module is further configured to determine an index attribute of the target index field and a row key corresponding to the target index field; inquiring in a database according to the index attribute and the row key; and under the condition that the query result corresponds to the service field in the target index field, taking the data corresponding to the query result as the text content corresponding to the target index field.
For example, to improve the storage efficiency of the original data, words in the original data are converted into identifiers through a preset inverted index before storage. Therefore, when the corresponding original data is found in the database according to the target index field, the identifiers need to be looked up again in the preset inverted index and converted back into the corresponding words, yielding the complete original data that the search engine can display as the query result. That is, the stored original data differs from the complete original data in that part or all of it has been converted through the preset inverted index.
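The replacement and restoration steps can be sketched as below. The source calls the mapping a preset inverted index without describing its structure; this sketch models it as a plain two-way dictionary between words and integer identifiers, which is an assumption, and the vocabulary is hypothetical.

```python
from typing import List, Union

# Hypothetical preset mapping between words and identifiers.
WORD_TO_ID = {"door": 1, "open": 2, "fridge": 3}
ID_TO_WORD = {identifier: word for word, identifier in WORD_TO_ID.items()}

Token = Union[int, str]


def encode(text: str) -> List[Token]:
    """Replace known words with identifiers; unknown words are stored unchanged."""
    return [WORD_TO_ID.get(word, word) for word in text.split()]


def decode(tokens: List[Token]) -> str:
    """Restore the complete original text from the stored tokens."""
    return " ".join(ID_TO_WORD[t] if isinstance(t, int) else t for t in tokens)


stored = encode("fridge door open now")
print(stored)
print(decode(stored))
```

Leaving unknown words untouched mirrors the statement that part or all of the data may be converted.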
It should be noted that the above modules may be implemented by software or hardware; for the latter, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in different processors in any combination.
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, searching target index data matched with the field to be retrieved in the index data of the database search engine, wherein the index data comprises: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database;
s2, determining a target index field in the target index data under the condition that the target index data is found;
s3, acquiring complete original data corresponding to the target index field in the database, and taking the complete original data as the query result of the field to be retrieved.
Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to, various media capable of storing computer programs, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
In an exemplary embodiment, the processor may be configured to execute the following steps by a computer program:
s1, searching target index data matched with the field to be retrieved in the index data of the database search engine, wherein the index data comprises: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database;
s2, determining a target index field in the target index data under the condition that the target index data is found;
s3, acquiring complete original data corresponding to the target index field in the database, and taking the complete original data as the query result of the field to be retrieved.
It will be apparent to those skilled in the art that the various modules or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented using program code executable by the computing devices, such that they may be stored in a memory device and executed by the computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for querying data, comprising:
searching target index data matched with a field to be retrieved in index data of a database search engine, wherein the index data comprises: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database;
under the condition that the target index data is found, determining a target index field in the target index data;
and acquiring complete original data corresponding to the target index field in the database, and taking the complete original data as a query result of the field to be retrieved.
2. The method for querying data according to claim 1, wherein before searching the index data of the database search engine for the target index data matching the field to be retrieved, the method further comprises:
acquiring query content input by a target object in the database search engine, wherein the query content is used for indicating text content required to be contained in original data to be acquired by the target object;
and performing field extraction on the query content according to a preset field type to obtain a corresponding field to be retrieved, wherein the preset field type is used for indicating the part of speech of the original data query of the database search engine.
3. The method for querying data according to claim 1, wherein searching the index data of the database search engine for the target index data matching the field to be retrieved comprises:
determining an index type corresponding to each field to be retrieved;
and determining a corresponding index field in the index data according to the index type, and determining the data containing the index field as target index data matched with the field to be retrieved.
4. The method for querying data according to claim 1, wherein before searching the index data of the database search engine for the target index data matching the field to be retrieved, the method further comprises:
setting the index data according to a preset rule, wherein the preset rule comprises at least one of the following:
determining a partition date corresponding to the original data, and writing the partition date into a corresponding index field to obtain index data of a date index corresponding to the original data;
determining a data type corresponding to the original data, and writing the data type into a corresponding index field to obtain index data of a name index corresponding to the original data;
and determining a fragment identifier corresponding to the original data, and writing the fragment identifier into a corresponding index field to obtain index data of a fragment index corresponding to the original data.
5. The method according to claim 1, wherein obtaining complete original data corresponding to the target index field in the database, and using the complete original data as a query result of the field to be retrieved, comprises:
acquiring text content in the database according to the target index field, wherein the text content is the complete original data after replacement processing, and the replacement processing replaces words in the complete original data with identifiers through a preset inverted index;
determining words corresponding to the identifiers in the text content according to a preset inverted index so as to obtain complete original data corresponding to the target index field;
and under the condition that the semantics of the complete original data are determined to be correct, taking the complete original data as the query result of the field to be retrieved.
6. The method for querying data according to claim 5, wherein obtaining text content in the database according to the target index field comprises:
determining the index attribute of the target index field and the row key corresponding to the target index field;
inquiring in a database according to the index attribute and the row key;
and under the condition that the query result corresponds to the service field in the target index field, taking the data corresponding to the query result as the text content corresponding to the target index field.
7. An apparatus for querying data, comprising:
the search module is used for searching target index data matched with a field to be retrieved in index data of a database search engine, wherein the index data comprises: index field and row key corresponding to the index field; the index field corresponds to original data stored in a database;
the determining module is used for determining a target index field in the target index data under the condition that the target index data is found;
and the acquisition module is used for acquiring complete original data corresponding to the target index field in the database and taking the complete original data as a query result of the field to be retrieved.
8. The apparatus for querying data according to claim 7, wherein the apparatus further comprises: the extraction module is used for acquiring query contents input by a target object in the database search engine, wherein the query contents are used for indicating text contents required to be contained in original data to be acquired by the target object; and performing field extraction on the query content according to a preset field type to obtain a corresponding field to be retrieved, wherein the preset field type is used for indicating the part of speech of the original data query of the database search engine.
9. A computer-readable storage medium, comprising a stored program, wherein the program is operable to perform the method of any one of claims 1 to 6.
10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 6 by means of the computer program.
CN202110984634.5A 2021-08-25 2021-08-25 Data query method and device, storage medium and electronic device Pending CN113849499A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110984634.5A CN113849499A (en) 2021-08-25 2021-08-25 Data query method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110984634.5A CN113849499A (en) 2021-08-25 2021-08-25 Data query method and device, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN113849499A true CN113849499A (en) 2021-12-28

Family

ID=78976352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110984634.5A Pending CN113849499A (en) 2021-08-25 2021-08-25 Data query method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN113849499A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115168409A (en) * 2022-09-05 2022-10-11 金蝶软件(中国)有限公司 Data query method and device for database sub-tables and computer equipment
CN115168409B (en) * 2022-09-05 2023-02-28 金蝶软件(中国)有限公司 Data query method and device for database sub-tables and computer equipment
CN116561434A (en) * 2023-06-28 2023-08-08 平安银行股份有限公司 Data retrieval recommendation method, device, storage medium and equipment

Similar Documents

Publication Publication Date Title
CN106657213B (en) File transmission method and device
CN108197296B (en) Data storage method based on Elasticsearch index
US9195745B2 (en) Dynamic query master agent for query execution
CN111506569B (en) Data storage method and device and electronic device
CN113849499A (en) Data query method and device, storage medium and electronic device
CN109299101B (en) Data retrieval method, device, server and storage medium
WO2023143095A1 (en) Method and system for data query
CN114116762A (en) Offline data fuzzy search method, device, equipment and medium
CN114398520A (en) Data retrieval method, system, device, electronic equipment and storage medium
CN111625600B (en) Data storage processing method, system, computer equipment and storage medium
CN113051460A (en) Elasticsearch-based data retrieval method and system, electronic device and storage medium
CN110515979B (en) Data query method, device, equipment and storage medium
CN111125158B (en) Data table processing method, device, medium and electronic equipment
CN116450607A (en) Data processing method, device and storage medium
CN114880329A (en) Data query method and device, storage medium and computer equipment
CN111061719B (en) Data collection method, device, equipment and storage medium
CN114064729A (en) Data retrieval method, device, equipment and storage medium
CN114201496A (en) Data updating method and device, electronic equipment, system and storage medium
CN112527824B (en) Paging query method, paging query device, electronic equipment and computer-readable storage medium
CN115794876A (en) Fragment processing method, device, equipment and storage medium for service data packet
CN112527900A (en) Method, device, equipment and medium for database multi-copy reading consistency
CN112199463A (en) Data query method, device and equipment
CN111427910A (en) Data processing method and device
CN117725074A (en) Database updating method and device for storage file, storage medium and electronic equipment
CN117251521A (en) Content searching method, content searching device, computer equipment, storage medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination