CN113886426A - Data query method and device, computer equipment and storage medium - Google Patents
- Publication number
- CN113886426A (application CN202111227991.3A)
- Authority
- CN
- China
- Prior art keywords
- query
- candidate
- data
- statement
- engine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2477—Temporal data queries
Abstract
The present disclosure provides a data query method, apparatus, computer device and storage medium. The method includes: receiving a data query request, the data query request including an identifier of a query statement; determining, according to the identifier, whether preloading has been completed for the query statement; if preloading is completed, obtaining related result data corresponding to the query statement from a cache database; and if preloading is not completed, querying a target query engine according to the query statement to obtain the related result data. When preloading is completed, the related result data has already been preloaded into the cache database, so the cache database assists the data query and the cost of repeated data query requests on query performance is effectively reduced. When preloading is not completed, the related result data is obtained by querying the target query engine, so the success rate of data query is increased, and both the data query performance and the flexibility of data query are effectively improved.
Description
Technical Field
The present disclosure relates to the field of big data technologies, and in particular, to a data query method and apparatus, a computer device, and a storage medium.
Background
In the field of big data technology, data analysis mainly consists of query analysis, through a data query language, on databases in a big data platform.
In the related art, a data processing node of an open-source data query engine typically parses the data query language, generates a data query execution plan, and distributes execution tasks to task sub-nodes, which are responsible for the actual query tasks.
In this way, the response time of a data query is long and the concurrent processing capability is poor, so the query performance cannot meet the service requirements of actual data queries.
Disclosure of Invention
The present disclosure is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, the present disclosure aims to provide a data query method, apparatus, computer device, and storage medium. When preloading is completed, the related result data has already been preloaded into a cache database, so the cache database assists the data query and the cost of repeated data query requests on query performance is effectively reduced; when preloading is not completed, querying a target query engine for the related result data is supported, so the success rate of data query is increased, and both the data query performance and the flexibility of data query are effectively improved.
The data query method provided by the embodiment of the first aspect of the disclosure includes: receiving a data query request, the data query request including an identifier of a query statement; determining, according to the identifier, whether preloading has been completed for the query statement; if preloading is completed, obtaining related result data corresponding to the query statement from a cache database, where the related result data in the cache database is obtained by starting an asynchronous thread, before the data query request is received, to acquire a preloaded query statement set, the query statement set including a plurality of candidate query statements, and executing a corresponding query task for each candidate query statement; and if preloading is not completed, querying a target query engine according to the query statement to obtain the related result data.
In the data query method provided in the embodiment of the first aspect of the present disclosure, a data query request carrying an identifier of a query statement is received, and whether preloading has been completed for the query statement is determined according to the identifier. If preloading is completed, related result data corresponding to the query statement is obtained from a cache database; the related result data in the cache database is obtained by starting an asynchronous thread, before the data query request is received, to acquire a preloaded query statement set that includes a plurality of candidate query statements, and executing a corresponding query task for each candidate query statement. If preloading is not completed, the related result data is obtained by querying a target query engine according to the query statement. Because the related result data is preloaded into the cache database, the cache database assists the data query and the cost of repeated data query requests on query performance is effectively reduced; because the target query engine serves queries that miss the cache, the success rate of data query is increased, and both the data query performance and the flexibility of data query are effectively improved.
The data query device provided by the embodiment of the second aspect of the present disclosure includes: a receiving module configured to receive a data query request, the data query request including an identifier of a query statement; a judging module configured to determine, according to the identifier, whether preloading has been completed for the query statement; a first obtaining module configured to, if preloading is completed, obtain related result data corresponding to the query statement from a cache database, where the related result data in the cache database is obtained by starting an asynchronous thread, before the data query request is received, to acquire a preloaded query statement set that includes a plurality of candidate query statements, and executing a corresponding query task for each candidate query statement; and a query module configured to, if preloading is not completed, query a target query engine according to the query statement to obtain the related result data.
The data query apparatus provided in the embodiment of the second aspect of the present disclosure receives a data query request carrying an identifier of a query statement and determines, according to the identifier, whether preloading has been completed for the query statement. If preloading is completed, related result data corresponding to the query statement is obtained from a cache database; the related result data in the cache database is obtained by starting an asynchronous thread, before the data query request is received, to acquire a preloaded query statement set that includes a plurality of candidate query statements, and executing a corresponding query task for each candidate query statement. If preloading is not completed, the related result data is obtained by querying a target query engine according to the query statement. Because the related result data is preloaded into the cache database, the cache database assists the data query and the cost of repeated data query requests on query performance is effectively reduced; because the target query engine serves queries that miss the cache, the success rate of data query is increased, and both the data query performance and the flexibility of data query are effectively improved.
An embodiment of a third aspect of the present disclosure provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the data query method as set forth in the embodiment of the first aspect of the present disclosure.
A fourth aspect of the present disclosure provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the data query method as set forth in the first aspect of the present disclosure.
An embodiment of a fifth aspect of the present disclosure provides a computer program product including instructions which, when executed by a processor, perform the data query method as set forth in the embodiment of the first aspect of the present disclosure.
Drawings
The foregoing and/or additional aspects and advantages of the present disclosure will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a data query method according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a Presto-based real-time data query engine architecture in an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a link trace implementation in an embodiment of the disclosure;
FIG. 4 is a diagram of a database access record display interface in the related art;
FIG. 5 is a database access record display interface diagram in an embodiment of the present disclosure;
FIG. 6 is a flowchart illustrating a data query method according to another embodiment of the disclosure;
FIG. 7 is a schematic diagram of the daily concurrency pressure distribution of the target query engine in an embodiment of the disclosure;
FIG. 8 is a diagram illustrating a structure of a lookup record table in an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of performance metrics for a query engine in an embodiment of the disclosure;
FIG. 10 is a flowchart illustrating a data query method according to another embodiment of the disclosure;
FIG. 11 is a flowchart illustrating a data query method according to another embodiment of the disclosure;
FIG. 12 is an architectural schematic of a data analysis system in an embodiment of the disclosure;
FIG. 13 is a schematic diagram of a preload handling mechanism in an embodiment of the disclosure;
FIG. 14 is a schematic diagram of a cluster structure of Presto-based real-time data query engine in an embodiment of the present disclosure;
fig. 15 is a schematic structural diagram of a data query apparatus according to an embodiment of the disclosure;
fig. 16 is a schematic structural diagram of a data query device according to another embodiment of the present disclosure;
FIG. 17 illustrates a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
Reference will now be made in detail to the embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary, are intended only to illustrate the present disclosure, and should not be construed as limiting it. On the contrary, the embodiments of the disclosure cover all changes, modifications and equivalents falling within the spirit and scope of the appended claims.
Fig. 1 is a schematic flow chart of a data query method according to an embodiment of the present disclosure.
It should be noted that the execution subject of the data query method of this embodiment is a data query apparatus. The apparatus may be implemented by software and/or hardware and may be configured in a computer device, and the computer device may include, but is not limited to, a terminal, a server, and the like.
As shown in fig. 1, the data query method includes:
s101: receiving a data query request, the data query request comprising: identification of the query statement.
A query statement may be used to execute a query task in the database and return a data query result; for example, the data query may be performed using a Structured Query Language (SQL) query statement.
The identifier of the query statement may be used to uniquely identify the query statement; it may be, for example, a Universally Unique Identifier (UUID) of the SQL statement, or an identifier of another data query statement, which is not limited herein.
In the embodiment of the present disclosure, a data query interface may be configured on the data query apparatus. A data query request (for example, a HyperText Transfer Protocol (HTTP) request, or a request using any other possible data transfer protocol, which is not limited herein) is received through the data query interface, the identifier of the query statement is obtained by parsing the data query request, and the subsequent data query processing logic is triggered, as described in the following embodiments.
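The disclosure leaves open how the identifier is actually derived from the statement. As a hedged sketch (not the patented scheme), one deterministic option is a name-based UUID over the normalized SQL text, so that the same statement always maps to the same identifier; the whitespace/case normalization rule here is an assumption:

```python
import uuid

def query_statement_id(sql: str) -> str:
    """Derive a deterministic identifier for a SQL query statement.

    The disclosure only requires that the identifier uniquely identify the
    statement; a name-based UUID (version 5) over the normalized SQL text is
    one possible scheme, shown here purely for illustration.
    """
    normalized = " ".join(sql.split()).lower()  # collapse whitespace, ignore case
    return str(uuid.uuid5(uuid.NAMESPACE_OID, normalized))
```

With this scheme, repeated requests for the same statement carry the same identifier, which is what lets the cache lookup in the later steps work.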
S102: and judging whether the preloading is finished aiming at the query statement or not according to the identification.
In the embodiment of the present disclosure, a preloading mechanism may be supported. For example, before a data query task starts, some frequently used query statements and their related result data may be cached in the database in advance. In an actual data query, whether a query statement exists in the database may be determined according to its identifier; if it exists, it is determined that preloading has been completed for the query statement, the preloaded related result data is called directly, and the data query is assisted by the cache database, which effectively reduces the cost of repeated data query requests on query performance.
In the embodiment of the present disclosure, preloading may be triggered in any time period, or in an adaptively configured time period; the adaptively configured time period may be, for example, an idle period of the data query apparatus, which is not limited herein.
When query statements are preloaded, an asynchronous thread may be started to acquire the preloaded query statement set (the set may include a plurality of query statements to be preloaded; a corresponding query task is then executed for each query statement to obtain its related result data).
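As an illustrative sketch (not the patented implementation), the asynchronous preloading described above can be modeled as a background thread that executes each candidate statement and fills an in-memory cache keyed by statement identifier; here `execute_query` is a hypothetical callable standing in for the target query engine:

```python
import threading

class Preloader:
    """Preload query results on a thread outside the main query path."""

    def __init__(self, execute_query):
        self._execute_query = execute_query  # callable: sql -> related result data
        self.cache = {}                      # statement identifier -> result data
        self._lock = threading.Lock()

    def preload(self, statements):
        """Start an asynchronous thread that runs the query task for each
        (identifier, sql) pair and caches the related result data."""
        worker = threading.Thread(target=self._run, args=(list(statements),))
        worker.start()
        return worker  # caller may join(); the main query path need not wait

    def _run(self, statements):
        for ident, sql in statements:
            result = self._execute_query(sql)
            with self._lock:
                self.cache[ident] = result
```

Because the thread runs outside the main data query path, the preload work does not block incoming requests; requests arriving before a statement is cached simply fall through to the engine.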
S103: if the preloading is finished, acquiring relevant result data corresponding to the query statement from a cache database, wherein the relevant result data in the cache database is obtained by starting an asynchronous thread to acquire a preloaded query statement set before receiving a data query request, and the query statement set comprises: and executing corresponding query task query aiming at each candidate query statement to obtain the query result.
The cache database may be used to store the preloaded query statements and their corresponding related result data.
The asynchronous thread that acquires the preloaded query statement set may be a thread started outside the system's main data query thread.
In the embodiment of the disclosure, before a data query request is received, an asynchronous thread may be started to acquire the preloaded query statement set, which includes a plurality of candidate query statements. A corresponding query task is executed for each candidate query statement, and the related result data of those candidate query statements is cached in the database; subsequently, when a data query statement hits the cache, the data query can be served from the cache database.
When data is queried according to a data query request, the identifiers of the cached query statements can be read from the cache database, and the identifier carried in the data query request is looked up among them. If that identifier exists in the cache database, preloading has been completed for the query statement carried in the request, and the corresponding related result data can be obtained from the cache database.
In the embodiment of the present disclosure, when the related result data corresponding to the query statement is obtained from the cache database, a hash lookup may be performed with the identifier of the query statement as the key, a tree-table lookup may be used, or any other possible data search algorithm may be used to read the related result data from the cache database, which is not limited herein.
S104: and if the preloading is not finished, inquiring from the target inquiry engine according to the inquiry statement to obtain relevant result data.
The target query engine is used to assist real-time data query: it parses the query statement and returns the related result data corresponding to the query statement.
For example, a Presto-based real-time data query engine may be used as the target query engine. As shown in fig. 2, which is a schematic diagram of a Presto-based real-time data query engine architecture in the embodiment of the present disclosure, data query is implemented with open-source Presto, a query engine based on in-memory computation. When preloading has not been completed for the query statement carried in the data query request, i.e., the cache is missed, the central query node of the open-source Presto engine may parse the query statement and generate a query execution plan, and then schedule execution nodes to execute the data query task according to the plan, thereby obtaining the related result data corresponding to the query statement.
When preloading is not completed, the target query engine parses the query statement, generates a query execution plan, and executes the data query task to obtain the related result data. Supporting queries through the target query engine in this way increases the success rate of data query and effectively improves both the data query performance and the flexibility of data query.
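Steps S101 to S104 condense into a small dispatch routine. The sketch below is a simplified illustration in which a plain dict stands in for the cache database and a callable stands in for the target query engine; neither is the disclosure's concrete implementation:

```python
def handle_data_query(cache, statement_id, sql, engine_query):
    """Serve a data query request: return cached related result data when
    preloading has completed for the statement (S102/S103), otherwise fall
    back to the target query engine, e.g. a Presto query (S104)."""
    if statement_id in cache:        # identifier found -> preloading completed
        return cache[statement_id]
    return engine_query(sql)         # cache miss -> query the target engine
```

The fallback branch is what keeps the success rate of data query high: a statement that was never preloaded is still answered, just without the cache acceleration.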
Optionally, in some embodiments, the target query engine corresponds to database connection information, the database connection information has a plurality of corresponding access user identifiers, and the data query request includes a query user identifier. After the related result data is obtained according to the identifier of the query statement carried in the data query request, a target access user identifier corresponding to the query user identifier may be determined from the plurality of access user identifiers, the query user identifier and the target access user identifier may be associated, and the associated identifiers may be displayed, so that each data query record is associated with a specific accessing user.
In the embodiment of the present disclosure, passing the query user identifier through to the target query engine side may be implemented with code-intrusion-free link tracing based on implicit parameter passing. For example, as shown in fig. 3, which is a schematic diagram of the link tracing implementation in the embodiment of the disclosure, after each server completes the configuration of its information table, a link tracing component (a jar package) on each server collects the query user identifier without code intrusion.
When the target query engine obtains the related result data for a data query statement, it may establish a connection with the relevant database and query the data there; for example, a Hadoop Distributed File System (HDFS) may be used. As shown in fig. 2, the open-source Presto data query engine serves as the target query engine and HDFS serves as the database connected to it: the engine parses the query statement and queries the data according to the parsed result, obtaining the related result data corresponding to the query statement.
The access user identifier uniquely identifies a user accessing the database; it may be the user identity (ID) of the current query, or any other form of identifier, which is not limited herein.
In the embodiment of the disclosure, when the target query engine connects to the database for data query, the data query request may further include a query user identifier, which uniquely identifies the user issuing the current query; the query user identifier and the identifier of the query statement together form the data query request.
In the embodiment of the disclosure, when the target query engine connects to the database, each query access leaves a corresponding access record. The database connection information can be parsed from these access records, and the plurality of access user identifiers can be parsed from the database connection information. To determine the target access user identifier corresponding to the query user identifier among them, the identifier of the query statement may be fed into a matching algorithm. Once the target access user identifier is obtained, it may be associated with the query user identifier, and the associated identifiers may be displayed to form a complete database access record.
For example, fig. 4 is a database access record display interface in the related art. As can be seen from fig. 4, because a data query record is not associated with a specific accessing user in the related art, the record only shows the fixed user identifier used for database login. Fig. 5 is a database access record display interface in the embodiment of the present disclosure: after the query user identifier and the target access user identifier are associated, the interface can display the associated identifiers.
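The association step can be sketched as below. The dictionary field names and the equality-style matching predicate are illustrative assumptions, since the disclosure only specifies that a matching algorithm selects the target access user identifier from those parsed out of the database connection information:

```python
def associate_query_user(query_user_id, access_user_ids, matches):
    """Pick the target access user identifier corresponding to the querying
    user and return the associated pair for display in the access record
    interface; `matches` is a hypothetical matching predicate."""
    for access_id in access_user_ids:
        if matches(query_user_id, access_id):
            return {"query_user": query_user_id, "access_user": access_id}
    return None  # no target access user identifier matched
```

The returned pair is what lets the access record interface show who actually issued the query rather than only the fixed database login user.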
In this embodiment, a data query request carrying an identifier of a query statement is received, and whether preloading has been completed for the query statement is determined according to the identifier. If preloading is completed, related result data corresponding to the query statement is obtained from a cache database; the related result data in the cache database is obtained by starting an asynchronous thread, before the data query request is received, to acquire a preloaded query statement set that includes a plurality of candidate query statements, and executing a corresponding query task for each candidate query statement. If preloading is not completed, the related result data is obtained by querying a target query engine according to the query statement. Because the related result data is preloaded into the cache database, the cache database assists the data query and the cost of repeated data query requests on query performance is effectively reduced; because the target query engine serves queries that miss the cache, the success rate of data query is increased, and both the data query performance and the flexibility of data query are effectively improved.
Fig. 6 is a flowchart illustrating a data query method according to another embodiment of the disclosure.
As shown in fig. 6, the data query method includes:
S601: acquiring candidate query statements, where the candidate query statements have corresponding candidate identifiers.
A candidate query statement is a query statement selected for preloading. The candidate identifier uniquely identifies the candidate query statement and may be, for example, a Universally Unique Identifier of an SQL data query statement, or any other identifier, which is not limited herein.
The candidate query statements may be high-frequency query statements obtained by sampling the historical query statements, or may be determined from the historical query statements based on any other possible rule, which is not limited herein.
In the embodiment of the disclosure, before the data query apparatus receives a data query request, some query statements may be acquired as candidate query statements for preloading; a corresponding query task is then executed for each candidate query statement, and the related result data of those statements is cached in the database.
Optionally, in some embodiments, when the candidate query statement is obtained, the current time may be detected, and if the current time is within a set time range, the candidate query statement is obtained, so that the resource idle time of the data query device may be used to support the pre-loading processing mechanism, and the resource utilization rate of the target query engine is greatly improved.
The set time range may be a time period for assisting in determining the most suitable time for acquiring the candidate query statement, the set time range may be preset, and the set time range may also support adaptive adjustment, for example, the set time range may be set as a time period with a smaller resource usage pressure of the target query engine, which is not limited in this regard.
For example, as shown in fig. 7, fig. 7 is a schematic diagram illustrating the distribution of the daily concurrency pressure of the target query engine in the embodiment of the present disclosure. Since the concurrency pressure of the target query engine is low in the early morning of each day, the set time range may be set to 0:00 to 5:00 each day, or any period within the early morning may be taken as the set time range, which is not limited here.
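The time-range check described here can be sketched as follows. This is an illustrative Python sketch rather than the patent's implementation; the 0:00–5:00 window is taken from the example above, and the function and constant names are assumptions:

```python
from datetime import datetime, time

# Hypothetical low-pressure window (0:00 to 5:00), per the example above.
PRELOAD_WINDOW = (time(0, 0), time(5, 0))

def within_preload_window(now: datetime, window=PRELOAD_WINDOW) -> bool:
    """Return True if candidate query statements should be acquired now."""
    start, end = window
    return start <= now.time() < end
```

A device would call this with the detected current time and only proceed to acquire candidate statements when it returns True.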
In the embodiment of the present disclosure, when the candidate query statements are obtained, the current time may be detected; if the current time is within the set time range, the candidate query statements are obtained, and a corresponding query task is then executed for each candidate query statement to obtain its related result data, which is stored in the cache database, as detailed in the following embodiments.
S602: candidate related result data corresponding to the candidate query statement is generated.
When the candidate related result data corresponding to the candidate query statement is generated, a data source may be configured in advance, and the target query engine is used to obtain the candidate related result data of the candidate query statement from the pre-configured data source according to a historical query record, where the historical query record records query statements and their corresponding related result data.
Optionally, in some embodiments, when the candidate related result data is generated, some of the multiple candidate query statements may be selected according to priority information, and candidate related result data is generated for the selected statements. In this way, query statements with a higher priority can be selected from the query statement set as candidate query statements for preloading, and the priority information can be adjusted in real time according to the query performance of the target query engine. This ensures that the query statements and related result data in the cache database are updated in real time, improving the cache hit rate and the data query efficiency.
The priority information represents the order of priority in which query statements are selected as candidate query statements. The priority information may be configured by combining the information in a query record table in the database with a priority policy; once configured based on the priority policy, it represents the sampling priority of each candidate query statement in the dimension of that policy. The priority policy may be configured adaptively according to the requirements of the actual data query, for example, a priority policy representing the query frequency, a priority policy representing the importance of the queried data, and the like, which is not limited here.
For example, as shown in fig. 8, fig. 8 is a schematic structural diagram of a query record table in the embodiment of the present disclosure. The query record table includes a query timestamp, a resource consumption amount, and whether the query succeeded. The query timestamp may represent the query frequency of a data query statement over the last n days, the resource consumption amount may represent how many resources the data query statement consumed, and the success field represents whether the data query in the record succeeded. A priority policy may be configured using the query frequency over the last n days, the resource consumption, and the success information, and may be adjusted in real time according to the observed data query performance: for example, a data query statement that has been queried more frequently in the last n days, that consumes fewer resources, or that succeeds more often may be given a higher priority. Alternatively, the priority policy may be set in combination with other information in the query record, for example, the result data amount and the remaining amount of cache space, which is not limited here.
In the embodiment of the disclosure, when the candidate query statements are obtained in order to generate their corresponding candidate related result data, a corresponding priority policy may be set, some query statements are selected from the query statement set as candidate query statements according to the priority information, and a corresponding query task is then executed for each candidate query statement to obtain its candidate related result data.
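The priority-based selection can be sketched as below. The top-k selection and the data shapes are illustrative assumptions, not the patent's exact policy:

```python
def select_candidates(statements, priority, k):
    """Pick the top-k query statements by priority for preloading.

    statements: list of (candidate_identifier, sql) pairs
    priority:   dict mapping candidate_identifier -> score (higher preloads first)
    """
    ranked = sorted(statements, key=lambda s: priority.get(s[0], 0), reverse=True)
    return ranked[:k]
```

The `priority` scores would come from the priority policy above and can be recomputed as query performance changes.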
S603: store the candidate identifier and the candidate related result data correspondingly into a cache database.
That is, some query statements are selected as candidate query statements according to the priority information, the target query engine is used to obtain their candidate related result data according to the historical query record, and each candidate identifier is stored together with its candidate related result data in the cache database. After a data query request is received, whether the query statement exists in the cache database is judged according to the identifier of the query statement; if it exists, preloading has been completed for the query statement, the candidate identifiers in the cache database are matched against the identifier of the query statement, and when the matching succeeds, the candidate related result data corresponding to the matched candidate identifier is determined as the related result data of the query statement, as detailed in the subsequent embodiments.
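The preloading step — execute each candidate statement and store the result data under the candidate identifier — might look like this sketch. An in-memory dict stands in for the cache database, and `run_query` stands in for the target query engine; both are assumptions for illustration:

```python
def preload(candidates, run_query, cache):
    """Execute each candidate query statement and cache its related result
    data under the candidate identifier (dict stands in for the cache DB)."""
    for candidate_id, sql in candidates:
        cache[candidate_id] = run_query(sql)
    return cache
```

In production the dict would typically be replaced by an external cache store keyed by the candidate identifier.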
S604: receive a data query request, where the data query request includes the identifier of a query statement.
S605: judge, according to the identifier, whether preloading has been completed for the query statement.
For the description of S604-S605, reference may be made to the above embodiments, which are not described herein again.
S606: if the preloading is completed, determine the candidate identifier matching the identifier.
In the embodiment of the present disclosure, after it is confirmed according to the identifier that preloading has been completed for the query statement, a candidate identifier matching the identifier of the query statement may be determined from the cache database. When determining this candidate identifier, a data matching algorithm may be used: the identifier of the query statement is input to the data matching algorithm, and the matching candidate identifier is determined according to the output of the algorithm.
S607: take the candidate related result data corresponding to the matched candidate identifier in the cache database as the related result data corresponding to the query statement.
After the candidate identifier matching the identifier is determined, the candidate related result data corresponding to the matching candidate identifier in the cache database may be read as the related result data corresponding to the query statement.
In the embodiment of the disclosure, a candidate query statement is obtained, candidate related result data corresponding to the candidate query statement is generated, the candidate identifier and the candidate related result data are correspondingly stored in a cache database, when a data query request is received, the query request is analyzed to obtain the query identifier of the query statement, the query identifier of the query statement is queried in the cache database, and the candidate related result data corresponding to the candidate identifier matched with the query identifier of the query statement is used as the related result data corresponding to the query statement, so that data query can be performed by using a cache mechanism, and the data query efficiency is improved.
S608: and if the preloading is not finished, obtaining the engine performance information of the target query engine.
The engine performance information is used for representing the engine performance condition when the target query engine executes the query task currently.
In the embodiment of the present disclosure, the performance obtaining tool may be used to obtain the engine performance information of the target query engine, for example, a network service tool named Thor may be used as the performance obtaining tool to obtain the engine performance information of the target query engine in real time and analyze the engine performance information of the target query engine, where the network service tool named Thor is also a performance analysis tool in the engine acceleration web service in fig. 2, or any other performance obtaining tool may also be used to obtain and analyze the engine performance information, which is not limited herein.
Optionally, in some embodiments, the engine performance information is any one or a combination of: the total number of tasks, the usage amount of engine memory resources, the task execution time length and the amount of memory resources consumed by the tasks during execution can be analyzed according to the engine performance information, so that the resource usage condition and the data query performance of the target query engine can be analyzed, the data query task in the target query engine can be adjusted in real time according to the engine performance information, the phenomenon that the data query is terminated by a system due to the fact that resources are insufficient is avoided, and the occurrence of data query failure events is effectively avoided.
And the total task quantity in execution is used for representing the quantity of the data query tasks which are currently and concurrently executed in the target query engine.
The usage amount of the engine memory resources is used for representing the condition that each data query task currently executed in the target query engine occupies the memory resources.
The task execution time length is used for representing the time length of each data query task currently executed in the target query engine from the task establishment.
The task consumes the memory resource amount, and is used for representing the system memory resource amount of each data query task minor letter executed in the target query engine.
In the embodiment of the present disclosure, when engine performance information such as a total number of tasks, an amount of used engine memory resources, a task execution time, and a task consumed memory resource amount is obtained, engine performance information such as a total number of tasks, an amount of used engine memory resources, a task execution time, and a task consumed memory resource amount may be monitored in real time by reconstructing a querytjddbcttem plate function of a DataBase application program interface (JDBC) and a self-determined engine state monitoring function presturementmonitor (progressmonitor), as shown in fig. 9, fig. 9 is a performance index schematic diagram of an inquiry engine in the embodiment of the present disclosure.
S609: and if the engine performance information meets the set conditions, inquiring from the target inquiry engine according to the inquiry statement to obtain relevant result data.
The set condition may be a check condition composed of any one or more of a total number of tasks in execution, an engine memory resource usage amount, a task execution time length, and a task consumed memory resource amount, and the check condition may be set to be any one or a combination of any one or more of a total number of tasks less than a target query engine task number threshold, an engine memory resource usage amount less than a target query engine total resource amount, a task execution time length less than a data query task time consumption amount threshold, and a task consumed memory resource amount less than a system total resource amount.
After the engine performance information of the target query engine is obtained, the engine performance information can be triggered to be checked, if the engine performance information meets the set conditions, relevant result data are obtained by querying from the target query engine according to the query statement, and due to the fact that the engine performance information meets the set conditions, resources of the target query engine can be fully utilized, and data query efficiency is improved.
S610: and if the engine performance information does not meet the set condition, stopping the query aiming at the target query engine.
In the embodiment of the present disclosure, after obtaining the engine performance information of the target query engine, the engine performance information may be triggered to be checked, if the engine performance information does not satisfy the setting condition, it represents that the target query engine currently has a situation where the total number of tasks exceeds a number threshold, and the usage amount of the engine memory resource is excessive, and at this time, if the data query task continues, the query may fail due to poor performance of the target query engine, and the query to the target query engine may be triggered to stop, for example, a user-defined cancellation callback function regression monitor.
In this embodiment, by receiving a data query request, the data query request includes: the method comprises the steps of inquiring an identifier of a statement, judging whether preloading is finished aiming at the inquiry statement according to the identifier, if the preloading is finished, obtaining relevant result data corresponding to the inquiry statement from a cache database, if the preloading is not finished, obtaining the relevant result data from a target inquiry engine according to the inquiry statement, preloading the relevant result data into the cache database when the preloading is finished, and accordingly carrying out data inquiry based on the assistance of the cache database, effectively reducing the consumption of repeated data inquiry requests on inquiry performance, supporting the inquiry in a joint target inquiry engine to obtain the relevant result data when the preloading is not finished, improving the success rate of data inquiry, effectively improving the flexibility of data inquiry performance and data inquiry, obtaining candidate inquiry statements within a set time range, and supporting a preloading processing mechanism by utilizing the resource idle time of a data inquiry device, the resource utilization rate of the target query engine is improved to a great extent, the candidate query sentences are selected according to the priority information, so that the query sentences with higher priorities can be selected from the query sentence set to be used as the candidate query sentences for preloading processing, the priority information can be adjusted in real time according to the query performance of the target data engine, the query sentences and related result data in the cache database can be updated in real time, the cache hit rate is improved, the data query efficiency is improved, the resource use condition and the data query performance of the target query engine are analyzed according to the engine performance information, the 
data query tasks in the target query engine are adjusted in real time according to the engine performance information, the phenomenon that the system is stopped due to overtime caused by insufficient resources is avoided, and data query failure events are effectively avoided.
Fig. 10 is a flowchart illustrating a data query method according to another embodiment of the disclosure.
As shown in fig. 10, the data query method includes:
s1001: receiving a data query request, the data query request comprising: identification of the query statement.
S1002: and judging whether the preloading is finished aiming at the query statement or not according to the identification.
S1003: and if the preloading is finished, acquiring relevant result data corresponding to the query statement from the cache database.
For description of S1001 to S1003, reference may be made to the above embodiments, and details are not repeated here.
S1004: and if the preloading is not finished, timing the query duration to obtain the query time.
In the embodiment of the present disclosure, when the preloading is not completed, the system timer may be utilized to time the continuous execution time of the data query request to obtain the query time, for example, the time may be triggered to be timed when the data query request is received to obtain the corresponding query time.
In the embodiment of the present disclosure, after the query duration is timed to obtain the query time, the failure probability value corresponding to the query statement may be determined according to the query time, and whether the data query request needs to be terminated is determined, which may specifically refer to the following embodiments.
S1005: a failure probability value corresponding to the query statement is determined.
The failure probability value is used for representing the probability that the query task corresponding to the query statement is terminated.
In the embodiment of the present disclosure, when determining the failure probability value corresponding to the query statement, the failure probability value may be calculated by using an overtime mechanism of the target engine according to the engine performance information of the target engine.
For example, the failure probability value corresponding to the predicted query statement may be analyzed by a calculation rule based on a timeout mechanism of the open-source Presto data query engine.
S1006: and when the query time does not reach the first time threshold value, querying from the target query engine to obtain relevant result data.
The first time threshold is a threshold of the query time when the probability value of the failure of the query statement reaches 100%.
In the embodiment of the present disclosure, after the query duration is timed to obtain the query time, the query time may be triggered to be checked, and when the query time does not reach the first time threshold, the target query engine is used to analyze the query statement to generate a query execution plan, execute the data query task, and query the target query engine to obtain the relevant result data.
S1007: and when the failure probability value is smaller than the probability threshold value, inquiring from the target inquiry engine to obtain relevant result data.
The probability threshold may be a probability threshold for determining that a query failure event may occur when the query is performed on the query statement, where the probability threshold is, for example, 50%, that is, when the failure probability value is less than 50%, the representation is performed based on the query statement, and the probability of the query failure event is small, and conversely, the representation is performed based on the query statement, and the probability of the query failure event is large.
In the embodiment of the present disclosure, after the failure probability value corresponding to the query statement is determined, the test on the failure probability value may be triggered, and when the failure probability value is smaller than the probability threshold, the query statement is analyzed by using the target query engine to generate a query execution plan, execute a data query task, and obtain related result data by querying in the target query engine.
S1008: and when the query time reaches a first time threshold, the query time does not reach a second time threshold, and the failure probability value is smaller than a probability threshold, querying from the target query engine to obtain related result data, wherein the first time threshold is smaller than the second time threshold.
The second time threshold is a timeout time for executing the query task set by the target query engine, and the second time threshold may be configured according to the performance information of the target query engine, for example, when the number of the target query engine tasks is large, the second time threshold may be set to 5 minutes, or the second time threshold may be set by using other comprehensive mechanisms, which is not limited thereto.
In the embodiment of the disclosure, when the query time reaches the first time threshold, the query time does not reach the second time threshold, and the failure probability value is smaller than the probability threshold, the target query engine is used to analyze the query statement, generate a query execution plan, execute a data query task, and query from the target query engine to obtain related result data.
Optionally, in some embodiments, when the query time reaches the first time threshold and the failure probability value is greater than the probability threshold, the query to the target query engine is stopped, or when the query time reaches the second time threshold, the query to the target query engine is stopped, so that a phenomenon that a query statement occupies resources of the target query engine for a long time to cause a congestion of a query task of the target query engine can be avoided, the query task is effectively prevented from being terminated due to insufficient resources, robustness and robustness of data query are effectively improved, and utilization rate of resources of the target query engine is effectively improved.
In the embodiment of the present disclosure, when the query time reaches the first time threshold and the failure probability value is greater than the probability threshold, or when the query time reaches the second time threshold, the query statement represents that the target query engine occupies more resources and the resource consumption pressure of the target query engine is higher, the operation of terminating the query on the target query engine may be executed, and the resources of the target query engine may be released in time.
In this embodiment, by receiving a data query request, the data query request includes: the method comprises the steps of inquiring an identifier of a statement, judging whether preloading is finished aiming at the inquiry statement according to the identifier, if the preloading is finished, obtaining relevant result data corresponding to the inquiry statement from a cache database, if the preloading is not finished, inquiring from a target inquiry engine according to the inquiry statement to obtain the relevant result data, and preloading and caching the relevant result data into the cache database when the preloading is finished, so that data inquiry is assisted based on the cache database, the consumption of repeated data inquiry requests on inquiry performance can be effectively reduced, and when the preloading is not finished, the inquiry in a combined target inquiry engine is supported to obtain the relevant result data, so that the success rate of data inquiry can be improved, the data inquiry performance and the flexibility of data inquiry can be effectively improved, and a first time threshold, a second time threshold and failure probability are utilized to carry out real-time adjustment on inquiry tasks in the target inquiry engine, therefore, the phenomenon that query tasks of the target query engine are congested due to the fact that the query sentences occupy resources of the target query engine for a long time can be avoided, the query tasks are effectively prevented from being terminated due to insufficient resources, robustness and robustness of data query are effectively improved, and utilization rate of the resources of the target query engine is effectively improved.
Fig. 11 is a flowchart illustrating a data query method according to another embodiment of the disclosure.
As shown in fig. 11, the data query method includes:
s1101: and acquiring a configuration information table.
In this embodiment of the present disclosure, before the configuration information table is obtained, a data analysis system may be used to configure a data source, a configuration information table, and the like, as shown in fig. 12, fig. 12 is a schematic diagram of an architecture of the data analysis system in this embodiment of the present disclosure, an application area of the data analysis system may configure the information table, and a ghost.
The configuration information table can be used for characterizing the storage distribution of the data set in the data analysis system.
S1102: and acquiring query configuration corresponding to the configuration information table.
The configuration information table may correspond to one or more query configurations, where the query configuration may be a data source for configuring data query, or a specific field for data query, and the like, which is not limited herein.
In the embodiment of the present disclosure, after the configuration information table is obtained, the link trace may be used to obtain the query configuration corresponding to the configuration information table, and the query configuration may be converted into a query statement.
S1103: candidate query statements are generated according to the query configuration.
In the embodiment of the present disclosure, after the query configuration corresponding to the configuration information table is obtained, the candidate query statement may be generated according to the query configuration, and the candidate query statement may be generated according to the query data source, the query request, and other information in the query configuration.
Optionally, in some embodiments, the configuration information table includes any one or a combination of more than one of the following: a data set configuration information table, a graph configuration information table, and a portal configuration information table, the query configuration includes any one or a combination of: a data query configuration corresponding to the data set configuration information table, a graph query configuration corresponding to the graph configuration information table, and a portal query configuration corresponding to the portal configuration information table, since the generation of candidate query statements is aided according to any one or a combination of data query configurations corresponding to the data set configuration information table and graph query configurations corresponding to the graph configuration information table, thereby enabling the generated candidate query statement to cover the distribution characteristics of the query statement as diversified as possible, effectively improving the quality of the candidate query statement, avoiding the feature of the candidate query statement from being too single, when candidate query statements are generated from a query configuration, employed to assist in a preload mechanism, the auxiliary effect of the candidate query statement on the aspect of the data query effect can be effectively assisted and improved.
The data set configuration information table is used for storing the storage structure information of the data.
The chart configuration information table is used for storing configuration information of the database data table.
The portal configuration information table is used for storing information of a data query user.
In the embodiment of the present disclosure, a configuration information table may be composed of one or more of a data set configuration information table, a graph configuration information table, and a portal configuration information table, and accordingly, the query configuration may include: the data query configuration corresponding to the data set configuration information table, the chart query configuration corresponding to the chart configuration information table, and the portal query configuration corresponding to the portal configuration information table.
In this disclosure, after the configuration information table is obtained, candidate query statements may be generated according to the query configuration, and the candidate query statements may be used for a preloading mechanism, for example, as shown in fig. 13, fig. 13 is a schematic view of a preloading processing mechanism in this disclosure, and after the query configuration corresponding to the configuration information table is obtained, query requests corresponding to the candidate query statements may be generated, and the query requests sequentially enter a request queue to be executed, so as to generate candidate related result data corresponding to the candidate query statements and cache the candidate related result data in a database.
Optionally, in some embodiments, after generating the candidate query statement according to the query configuration, a query information table may be further obtained, where the query information table includes: and generating priority information corresponding to the candidate identification according to the historical query information, and describing corresponding candidate query statements according to the priority information, so that the query task executed in the target query engine can be adjusted in real time by using the priority information, and the data query efficiency is improved.
The query information table is used for storing historical query information corresponding to the candidate identification, and the historical query information records multiple times of query information corresponding to query sentences corresponding to the candidate identification.
In the embodiment of the present disclosure, when the query information table is obtained, the query information table may be obtained according to the universal unique identifier of the query information table.
Optionally, in some embodiments, the historical query information includes any one or a combination of: and inquiring frequency, resource consumption blocks, result data quantity, residual cache space and success or failure information.
The query frequency is used for representing the query times of the query statement in the target query engine, the resource consumption block is used for representing the number of resource blocks consumed by the query statement when executed in the target query engine, the result data volume is used for representing the result data volume queried after the query statement is executed in the target query engine, the residual amount of the cache space is used for representing the occupation condition of the cache space when the query statement is executed in the target query engine, and the query success or failure information is used for representing the success or failure condition distribution of the query statement when executed in the target query engine for several times.
In the embodiment of the disclosure, one or more of the query frequency, the resource consumption block, the result data amount, the remaining amount of cache space and the query success or failure information may be obtained according to the performance condition of the query engine and combined, the historical query information may be used to generate the priority information of the query statement, and the priority information may be adjusted in real time according to the performance condition of the query engine.
In the embodiment of the present disclosure, when generating the priority information corresponding to the candidate identifier according to the historical query information, the priority information may be generated according to a certain policy. For example, a higher query frequency may yield a higher priority for the candidate identifier, and more remaining cache space corresponding to the candidate identifier may also yield a higher priority; alternatively, the priority information corresponding to the candidate identifier may be determined from the sum of the query frequency and the remaining cache space.
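The sum-based policy above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the field names (`query_frequency`, `remaining_cache_space`) and the equal weighting are assumptions.

```python
def priority_for(history: dict) -> float:
    """One example policy: priority is the sum of query frequency and
    remaining cache space, so frequent queries with ample cache headroom
    are preloaded first. Field names are illustrative assumptions."""
    return history.get("query_frequency", 0) + history.get("remaining_cache_space", 0)


def rank_candidates(query_info_table: dict) -> list:
    """Order candidate identifiers by descending priority."""
    return sorted(query_info_table,
                  key=lambda cid: priority_for(query_info_table[cid]),
                  reverse=True)
```

Under this sketch, a candidate queried 10 times with 5 units of cache headroom scores 15, and would rank below a candidate scoring 23.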
In the embodiment of the disclosure, after the priority information corresponding to the candidate identifier is generated according to the historical query information, the candidate query statements corresponding to the candidate identifier may be annotated with the priority information; the annotated candidate query statements enter the request queue according to their priority and are then executed in the target query engine.
S1104: candidate related result data corresponding to the candidate query statement is generated.
S1105: and correspondingly storing the candidate identification and the candidate related result data into a cache database.
S1106: receiving a data query request, the data query request comprising: identification of the query statement.
S1107: and judging whether the preloading is finished aiming at the query statement or not according to the identification.
S1108: and if the preloading is finished, acquiring relevant result data corresponding to the query statement from the cache database.
For the description of S1104 to S1108, reference may be made to the above embodiments, which are not described herein again.
S1109: and if the preloading is not finished, inquiring from the target inquiry engine according to the inquiry statement to obtain relevant result data.
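Steps S1106 to S1109 can be summarized as a cache-or-engine decision. The sketch below assumes a simple dictionary as the cache database and a callable as the target query engine; both interfaces are illustrative assumptions, not the disclosed implementation.

```python
def handle_query(identifier: str, statement: str, cache_db: dict, engine):
    """Serve a data query request (S1106-S1109): if preloading has
    completed for this identifier, return the cached relevant result
    data; otherwise fall back to querying the target query engine."""
    if identifier in cache_db:        # preloading finished for this statement
        return cache_db[identifier]   # S1108: hit the cache database
    return engine(statement)          # S1109: query the target engine
```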
In this disclosure, a query engine cluster may be used to implement real-time data query. For example, fig. 14 is a schematic structural diagram of a Presto-based real-time data query engine cluster in this disclosure; the link tracking, data preloading, intelligent fusing, performance index analysis, and caching mechanism in this cluster may be used to implement real-time data query.
In this embodiment, a data query request is received, the data query request including an identifier of a query statement; whether preloading has been completed for the query statement is judged according to the identifier; if preloading is completed, the relevant result data corresponding to the query statement is obtained from the cache database; and if preloading is not completed, the relevant result data is obtained by querying the target query engine according to the query statement. Because the relevant result data is preloaded and cached into the cache database when preloading is completed, data query is assisted by the cache database, which effectively reduces the consumption of query performance by repeated data query requests; and when preloading is not completed, querying the target query engine for the relevant result data is still supported, which improves the success rate of data query and effectively improves both the data query performance and the flexibility of data query.
In addition, any one or a combination of the data query configuration corresponding to the data set configuration information table, the chart query configuration corresponding to the chart configuration information table, and the portal query configuration corresponding to the portal configuration information table is used to assist in generating candidate query statements, so that the generated candidate query statements cover query statement distribution characteristics that are as diverse as possible. This effectively improves the quality of the candidate query statements and avoids candidate query statements whose characteristics are too uniform. When candidate query statements generated according to the query configurations are used in the preloading mechanism, the contribution of the candidate query statements to the data query effect is effectively improved; moreover, the query tasks executed in the target query engine can be adjusted in real time using the priority information, which improves data query efficiency.
Fig. 15 is a schematic structural diagram of a data query apparatus according to an embodiment of the present disclosure.
As shown in fig. 15, the data query device 140 includes:
a receiving module 1401, configured to receive a data query request, where the data query request includes: an identification of the query statement;
a judging module 1402, configured to judge whether preloading is completed for the query statement according to the identifier;
a first obtaining module 1403, configured to obtain, when the preloading is completed, relevant result data corresponding to the query statement from the cache database, where the relevant result data in the cache database is obtained by starting an asynchronous thread to obtain a preloaded query statement set before receiving a data query request, the query statement set includes a plurality of candidate query statements, and a corresponding query task is executed for each candidate query statement;
and the query module 1404 is configured to query the target query engine according to the query statement to obtain relevant result data when the preloading is not completed.
In some embodiments of the present disclosure, the query module 1404 is specifically configured to:
timing the query duration to obtain query time;
determining a failure probability value corresponding to the query statement;
when the query time does not reach a first time threshold value, querying from a target query engine to obtain related result data;
or when the failure probability value is smaller than the probability threshold value, related result data are obtained by querying in the target query engine;
or when the query time reaches a first time threshold, the query time does not reach a second time threshold, and the failure probability value is smaller than the probability threshold, querying from the target query engine to obtain related result data, wherein the first time threshold is smaller than the second time threshold.
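The intelligent-fusing conditions above amount to a continue-or-stop decision over the query time and the failure probability value. The sketch below is an assumed reading of those conditions (with the first time threshold smaller than the second), not the disclosed implementation.

```python
def should_continue(query_time: float, failure_prob: float,
                    t1: float, t2: float, p_threshold: float) -> bool:
    """Fusing decision sketch: keep querying while the query time is
    under the first time threshold; between the two thresholds, keep
    querying only while the failure probability value stays below the
    probability threshold; stop once the second threshold is reached."""
    if query_time < t1:
        return True          # first threshold not reached: keep querying
    if query_time >= t2:
        return False         # second threshold reached: stop unconditionally
    return failure_prob < p_threshold
```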
In some embodiments of the present disclosure, the query module 1404 is further configured to:
stopping the query aiming at the target query engine when the query time reaches a first time threshold and the failure probability value is greater than a probability threshold;
or stopping the query to the target query engine when the query time reaches a second time threshold.
In some embodiments of the present disclosure, the query module 1404 is further configured to:
obtaining engine performance information of a target query engine;
if the engine performance information meets the set conditions, inquiring from the target inquiry engine according to the inquiry statement to obtain related result data;
and if the engine performance information does not meet the set condition, stopping the query aiming at the target query engine.
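The engine-performance gate can be sketched as a predicate over the metrics listed below; the specific thresholds and metric names here are illustrative assumptions, since the disclosure leaves the set conditions open.

```python
def engine_healthy(perf: dict, max_tasks: int = 100,
                   max_mem_ratio: float = 0.8) -> bool:
    """Assumed set condition: dispatch the query only if the total number
    of tasks in execution and the engine memory usage ratio are both
    within bounds; otherwise the query to the target engine is stopped."""
    mem_ratio = perf.get("memory_used", 0) / perf.get("memory_total", 1)
    return perf.get("running_tasks", 0) <= max_tasks and mem_ratio <= max_mem_ratio
```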
In some embodiments of the disclosure, the engine performance information is any one or a combination of:
the total number of tasks in execution, the usage amount of engine memory resources, the execution time length of the tasks, and the amount of memory resources consumed by the tasks.
In some embodiments of the present disclosure, as shown in fig. 16, further comprising:
a second obtaining module 1405, configured to, before the receiving of the data query request, obtain a candidate query statement, where the candidate query statement has a corresponding candidate identifier;
a generating module 1406 for generating candidate related result data corresponding to the candidate query statement;
a storing module 1407, configured to correspondingly store the candidate identifier and the candidate related result data into the cache database;
wherein the first obtaining module 1403 is further configured to:
determining candidate identifications matched with the identifications;
and taking the candidate relevant result data corresponding to the matched candidate identification in the cache database as relevant result data corresponding to the query statement.
In some embodiments of the present disclosure, the second obtaining module 1405 is specifically configured to:
acquiring a configuration information table;
acquiring query configuration corresponding to the configuration information table;
candidate query statements are generated according to the query configuration.
In some embodiments of the present disclosure, the configuration information table comprises any one or a combination of:
a data set configuration information table, a chart configuration information table, and a portal configuration information table;
the query configuration includes any one or a combination of:
the data query configuration corresponding to the data set configuration information table, the chart query configuration corresponding to the chart configuration information table, and the portal query configuration corresponding to the portal configuration information table.
In some embodiments of the present disclosure, the second obtaining module 1405 is further configured to:
after generating the candidate query statement according to the query configuration, obtaining a query information table, the query information table comprising: historical query information corresponding to the candidate identifiers;
generating priority information corresponding to the candidate identification according to the historical query information;
and describing the corresponding candidate query statement according to the priority information.
In some embodiments of the present disclosure, the generating module 1406 is specifically configured to:
selecting a part of candidate query sentences from the plurality of candidate query sentences according to the priority information;
candidate related result data corresponding to a portion of the candidate query statements is generated.
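The selection step above, picking only part of the candidate query statements by priority before generating their result data, can be sketched as follows. The dictionary-based priority lookup is an assumption for illustration.

```python
def select_top_candidates(candidates: list, priorities: dict, k: int) -> list:
    """Select the k highest-priority candidate query statements, so that
    only part of the candidates are preloaded when resources are limited."""
    ranked = sorted(candidates, key=lambda s: priorities.get(s, 0), reverse=True)
    return ranked[:k]
```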
In some embodiments of the present disclosure, the historical query information includes any one or a combination of:
query frequency, resource consumption blocks, result data amount, remaining cache space, and query success or failure information.
In some embodiments of the present disclosure, the second obtaining module 1405 is further configured to:
detecting the current time;
and if the current time is within the set time range, acquiring the candidate query statement.
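The time-window check can be sketched as below. The specific window (an off-peak 01:00 to 05:00) is purely an assumption; the disclosure only states that candidate query statements are acquired when the current time falls within a set time range.

```python
from datetime import datetime, time

def in_preload_window(now: datetime,
                      start: time = time(1, 0),
                      end: time = time(5, 0)) -> bool:
    """Trigger candidate acquisition only inside the set time range,
    e.g. an assumed off-peak window when engine load is low."""
    return start <= now.time() <= end
```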
In some embodiments of the present disclosure, the target query engine corresponds to database connection information, the database connection information has a corresponding plurality of access user identifiers, and the data query request includes: querying the user identification, further comprising:
a determining module 1408, configured to determine a target access user identifier corresponding to the query user identifier from the multiple access user identifiers;
an association module 1409, configured to perform association processing on the query user identifier and the target access user identifier;
and a display module 1410, configured to display the query user identifier and the target access user identifier after the association processing.
Corresponding to the data query method provided in the embodiments of fig. 1 to 14, the present disclosure also provides a data query apparatus. Since the data query apparatus provided in the embodiments of the present disclosure corresponds to the data query method provided in the embodiments of fig. 1 to 14, the implementation of the data query method also applies to the data query apparatus and will not be described in detail again here.
In this embodiment, by receiving a data query request, the data query request includes: the method comprises the steps of inquiring an identifier of a statement, judging whether preloading is finished aiming at the inquiry statement or not according to the identifier, and if the preloading is finished, obtaining relevant result data corresponding to the inquiry statement from a cache database, wherein the relevant result data in the cache database is obtained by starting an asynchronous thread to obtain a preloaded inquiry statement set before receiving a data inquiry request, and the inquiry statement set comprises the following steps: the method comprises the steps that a plurality of candidate query sentences are obtained, corresponding query tasks are executed for each candidate query sentence, if preloading is not completed, relevant result data are obtained through querying from a target query engine according to the query sentences, and when the preloading is completed, the relevant result data are preloaded and cached in a cache database, so that data query is assisted based on the cache database, the consumption of repeated data query requests on query performance can be effectively reduced, and when the preloading is not completed, the relevant result data are obtained through query in a combined target query engine, so that the success rate of data query can be improved, and the data query performance and the flexibility of data query can be effectively improved.
In order to implement the foregoing embodiments, the present disclosure also provides a computer device, including: the data query method comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the program, the data query method is realized according to the embodiment of the disclosure.
In order to implement the above embodiments, the present disclosure also proposes a non-transitory computer-readable storage medium on which a computer program is stored, which when executed by a processor implements the data query method as proposed by the foregoing embodiments of the present disclosure.
In order to implement the foregoing embodiments, the present disclosure further provides a computer program product; when instructions in the computer program product are executed by a processor, the data query method as set forth in the foregoing embodiments of the present disclosure is performed.
FIG. 17 illustrates a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present disclosure. The computer device 12 shown in fig. 17 is only an example and should not bring any limitations to the functionality or scope of use of the embodiments of the present disclosure.
As shown in FIG. 17, computer device 12 is embodied in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Although not shown in FIG. 17, a disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact disk Read Only Memory (CD-ROM), a Digital versatile disk Read Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the disclosure.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally perform the functions and/or methodologies of the embodiments described in this disclosure.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, implementing the data query method mentioned in the foregoing embodiments.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
It should be noted that, in the description of the present disclosure, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present disclosure, "a plurality" means two or more unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present disclosure includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present disclosure.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present disclosure have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present disclosure, and that changes, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present disclosure.
Claims (28)
1. A method for data query, the method comprising:
receiving a data query request, the data query request comprising: an identification of the query statement;
judging whether preloading is finished aiming at the query statement or not according to the identification;
if the preloading is finished, acquiring relevant result data corresponding to the query statement from a cache database;
if the preloading is not finished, inquiring from a target inquiry engine according to the inquiry statement to obtain the relevant result data;
wherein the relevant result data in the cache database is obtained by starting an asynchronous thread to obtain a preloaded query statement set before the data query request is received, the query statement set including: a plurality of candidate query statements, and by executing a corresponding query task for each candidate query statement.
2. The method of claim 1, wherein said querying from a target query engine according to the query statement results in the relevant result data, comprising:
timing the query duration to obtain query time;
determining a failure probability value corresponding to the query statement;
when the query time does not reach a first time threshold value, the relevant result data are queried from the target query engine; or
When the failure probability value is smaller than a probability threshold value, the relevant result data are obtained by inquiring from the target inquiry engine; or
And when the query time reaches the first time threshold, the query time does not reach a second time threshold, and the failure probability value is smaller than a probability threshold, querying the target query engine to obtain the relevant result data, wherein the first time threshold is smaller than the second time threshold.
3. The method of claim 2, wherein the method further comprises:
stopping querying for the target query engine when the query time reaches the first time threshold and the failure probability value is greater than the probability threshold; or
Stopping querying to the target query engine when the query time reaches the second time threshold.
4. The method of claim 1, wherein said querying from a target query engine according to the query statement results in the relevant result data, comprising:
obtaining engine performance information of the target query engine;
if the engine performance information meets set conditions, inquiring from a target inquiry engine according to the inquiry statement to obtain the relevant result data;
stopping the query to the target query engine if the engine performance information does not satisfy the set condition.
5. The method of claim 4, wherein the engine performance information is any one or a combination of:
the total number of tasks in execution, the usage amount of engine memory resources, the execution time length of the tasks, and the amount of memory resources consumed by the tasks.
6. The method of claim 1, prior to said receiving a data query request, further comprising:
obtaining candidate query statements, the candidate query statements having corresponding candidate identifiers;
generating candidate related result data corresponding to the candidate query statement;
correspondingly storing the candidate identification and the candidate related result data into the cache database;
wherein the obtaining of the relevant result data corresponding to the query statement from the cache database includes:
determining candidate identifications matched with the identifications;
and taking the candidate relevant result data corresponding to the matched candidate identification in the cache database as the relevant result data corresponding to the query statement.
7. The method of claim 6, wherein the obtaining candidate query statements comprises:
acquiring a configuration information table;
acquiring query configuration corresponding to the configuration information table;
and generating the candidate query statement according to the query configuration.
8. The method of claim 7, wherein the configuration information table comprises any one or a combination of:
a data set configuration information table, a chart configuration information table, and a portal configuration information table;
the query configuration includes any one or a combination of:
the data query configuration corresponding to the data set configuration information table, the chart query configuration corresponding to the chart configuration information table, and the portal query configuration corresponding to the portal configuration information table.
9. The method of claim 7, after the generating the candidate query statement according to the query configuration, further comprising:
obtaining a query information table, wherein the query information table comprises: historical query information corresponding to the candidate identifiers;
generating priority information corresponding to the candidate identification according to the historical query information;
and describing the corresponding candidate query statement according to the priority information.
10. The method of claim 9, wherein the generating candidate relevant result data corresponding to the candidate query statement comprises:
selecting a part of candidate query sentences from the candidate query sentences according to the priority information;
candidate relevant result data corresponding to the partial candidate query statement is generated.
11. The method of claim 9, wherein the historical query information comprises any one or a combination of:
query frequency, resource consumption blocks, result data amount, remaining cache space, and query success or failure information.
12. The method of claim 6, wherein the obtaining candidate query statements comprises:
detecting the current time;
and if the current time is within a set time range, acquiring a candidate query statement.
13. The method of any of claims 1-12, wherein the target query engine corresponds to database connectivity information having a corresponding plurality of access user identifications, the data query request comprising: querying the user identification, the method further comprising:
determining a target access user identifier corresponding to the query user identifier from the plurality of access user identifiers;
performing association processing on the query user identifier and the target access user identifier;
and displaying the query user identification and the target access user identification after the association processing.
14. A data query apparatus, characterized in that the apparatus comprises:
a receiving module, configured to receive a data query request, where the data query request includes: an identification of the query statement;
the judging module is used for judging whether the preloading is finished aiming at the query statement or not according to the identification;
a first obtaining module, configured to obtain, when the preloading is completed, relevant result data corresponding to the query statement from a cache database;
the query module is used for querying from a target query engine according to the query statement to obtain the relevant result data when the preloading is not completed;
wherein the relevant result data in the cache database is obtained by starting an asynchronous thread to obtain a preloaded query statement set before the data query request is received, the query statement set including: a plurality of candidate query statements, and by executing a corresponding query task for each candidate query statement.
15. The apparatus of claim 14, wherein the query module is specifically configured to:
timing the query duration to obtain query time;
determining a failure probability value corresponding to the query statement;
when the query time does not reach a first time threshold value, the relevant result data are queried from the target query engine; or
When the failure probability value is smaller than a probability threshold value, the relevant result data are obtained by inquiring from the target inquiry engine; or
And when the query time reaches the first time threshold, the query time does not reach a second time threshold, and the failure probability value is smaller than a probability threshold, querying the target query engine to obtain the relevant result data, wherein the first time threshold is smaller than the second time threshold.
16. The apparatus of claim 15, wherein the query module is further configured to:
stopping querying for the target query engine when the query time reaches the first time threshold and the failure probability value is greater than the probability threshold; or
Stopping querying to the target query engine when the query time reaches the second time threshold.
17. The apparatus of claim 14, wherein the query module is further configured to:
obtaining engine performance information of the target query engine;
if the engine performance information meets set conditions, inquiring from a target inquiry engine according to the inquiry statement to obtain the relevant result data;
stopping the query to the target query engine if the engine performance information does not satisfy the set condition.
18. The apparatus of claim 17, wherein the engine performance information is any one or combination of:
the total number of tasks in execution, the usage amount of engine memory resources, the execution time length of the tasks, and the amount of memory resources consumed by the tasks.
19. The apparatus of claim 14, further comprising:
a second obtaining module, configured to obtain a candidate query statement before the data query request is received, where the candidate query statement has a corresponding candidate identifier;
a generating module, configured to generate candidate related result data corresponding to the candidate query statement;
the storage module is used for correspondingly storing the candidate identification and the candidate related result data into the cache database;
wherein the first obtaining module is further configured to:
determining candidate identifications matched with the identifications;
and taking the candidate relevant result data corresponding to the matched candidate identification in the cache database as the relevant result data corresponding to the query statement.
20. The apparatus of claim 19, wherein the second obtaining module is specifically configured to:
acquiring a configuration information table;
acquiring query configuration corresponding to the configuration information table;
and generating the candidate query statement according to the query configuration.
21. The apparatus of claim 20, wherein the configuration information table comprises any one or a combination of:
a data set configuration information table, a chart configuration information table, and a portal configuration information table;
the query configuration includes any one or a combination of:
a data query configuration corresponding to the data set configuration information table, a chart query configuration corresponding to the chart configuration information table, and a portal query configuration corresponding to the portal configuration information table.
22. The apparatus of claim 20, wherein the second obtaining module is further configured to:
after the candidate query statement is generated according to the query configuration, obtaining a query information table, the query information table comprising: historical query information corresponding to the candidate identifiers;
generating priority information corresponding to the candidate identifiers according to the historical query information;
and sorting the corresponding candidate query statements according to the priority information.
23. The apparatus of claim 22, wherein the generation module is specifically configured to:
selecting a part of the candidate query statements from the candidate query statements according to the priority information;
and generating candidate related result data corresponding to the part of the candidate query statements.
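Claims 22-23 together describe ranking candidates by historical query information and precomputing only the top ones. A sketch under assumed metrics and weights (the scoring formula is illustrative; the patent only lists the kinds of historical information):

```python
# Hypothetical scoring for claims 22-23: rank candidate statements by
# historical query info, then keep only the best preload targets.
def prioritize(candidates, history, top_n):
    """candidates: list of (candidate_identifier, statement) pairs.
    history: candidate_identifier -> dict of historical metrics."""
    def score(ident):
        h = history.get(ident, {})
        # Frequently run but cheap queries make the best preload targets;
        # the weight 0.1 is an arbitrary illustrative choice.
        return h.get("frequency", 0) - 0.1 * h.get("resource_cost", 0)
    ranked = sorted(candidates, key=lambda c: score(c[0]), reverse=True)
    return ranked[:top_n]
```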
24. The apparatus of claim 22, wherein the historical query information comprises any one or a combination of:
query frequency, resource consumption, result data amount, remaining cache space, and query success or failure information.
25. The apparatus of claim 19, wherein the second obtaining module is further configured to:
detecting the current time;
and if the current time is within a set time range, obtaining the candidate query statement.
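The time-window trigger of claim 25 is a simple membership test; the sketch below also handles a window that crosses midnight (e.g. an off-peak preload period), which the claim language leaves open. All window bounds are hypothetical configuration values:

```python
from datetime import datetime, time

# Illustrative check for claim 25: fetch candidate statements only while
# the current time lies inside a configured window.
def in_preload_window(now: datetime, start: time, end: time) -> bool:
    t = now.time()
    if start <= end:
        return start <= t <= end   # same-day window, e.g. 02:00-05:00
    return t >= start or t <= end  # wrap-around window, e.g. 22:00-04:00
```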
26. The apparatus of any one of claims 14-25, wherein the target query engine corresponds to database connection information, the database connection information has a corresponding plurality of access user identifiers, and the data query request comprises a query user identifier; the apparatus further comprising:
a determining module, configured to determine, from the multiple access user identifiers, a target access user identifier corresponding to the query user identifier;
an association module, configured to associate the query user identifier with the target access user identifier; and
a display module, configured to display the query user identifier and the target access user identifier after the association processing.
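The determining and association steps of claim 26 reduce to mapping the querying user onto one of the connection's access user identifiers. A sketch with a hypothetical mapping table and fallback account (neither is specified by the patent):

```python
# Sketch of claim 26: one database connection carries several access user
# identifiers; each querying user is associated with one of them.
def associate_access_user(query_user_id: str, mapping: dict):
    """mapping: query user identifier -> access user identifier; the
    'default' fallback account is an illustrative assumption."""
    target = mapping.get(query_user_id, mapping.get("default"))
    # The claimed apparatus would display this pair; here we return it.
    return (query_user_id, target)
```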
27. A computer device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-13.
28. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-13.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111227991.3A CN113886426A (en) | 2021-10-21 | 2021-10-21 | Data query method and device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111227991.3A CN113886426A (en) | 2021-10-21 | 2021-10-21 | Data query method and device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113886426A true CN113886426A (en) | 2022-01-04 |
Family
ID=79004152
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111227991.3A Pending CN113886426A (en) | 2021-10-21 | 2021-10-21 | Data query method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113886426A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114491178A (en) * | 2022-02-16 | 2022-05-13 | 中银金融科技有限公司 | Data preloading method, device, electronic equipment, medium and product |
CN114519582A (en) * | 2022-02-21 | 2022-05-20 | 中国邮政储蓄银行股份有限公司 | Service preheating method, preheating device and service system |
CN114547095A (en) * | 2022-02-10 | 2022-05-27 | 深圳市翼海云峰科技有限公司 | Data rapid query method and device, electronic equipment and storage medium |
CN115334135A (en) * | 2022-08-01 | 2022-11-11 | 北京神州云合数据科技发展有限公司 | Multi-cloud api asynchronous processing method, device and equipment based on event bus |
CN117591480A (en) * | 2023-09-27 | 2024-02-23 | 行吟信息科技(上海)有限公司 | Data query method, device, electronic equipment and storage medium |
CN118445317A (en) * | 2023-09-28 | 2024-08-06 | 荣耀终端有限公司 | Application data acquisition method and application server |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113886426A (en) | Data query method and device, computer equipment and storage medium | |
US7292961B2 (en) | Capturing session activity as in-memory snapshots using a time-based sampling technique within a database for performance tuning and problem diagnosis | |
CN107077477B (en) | Structured data stream for enhanced event processing | |
EP2563062A1 (en) | Long connection management apparatus and link resource management method for long connection communication | |
CN111177161B (en) | Data processing method, device, computing equipment and storage medium | |
CN107918562A (en) | A kind of unified interface management method and system | |
CN101196912A (en) | Method and apparatus for application state synchronization | |
CN113407623B (en) | Data processing method, device and server | |
EP3384391B1 (en) | Real-time change data from disparate sources | |
CN111314158B (en) | Big data platform monitoring method, device, equipment and medium | |
CN110147470B (en) | Cross-machine-room data comparison system and method | |
CN113238815B (en) | Interface access control method, device, equipment and storage medium | |
AU2021244852A1 (en) | Offloading statistics collection | |
CN113760677A (en) | Abnormal link analysis method, device, equipment and storage medium | |
US8732323B2 (en) | Recording medium storing transaction model generation support program, transaction model generation support computer, and transaction model generation support method | |
CN112783711A (en) | Method and storage medium for analyzing program memory on NodeJS | |
CN111522870A (en) | Database access method, middleware and readable storage medium | |
CN114416849A (en) | Data processing method and device, electronic equipment and storage medium | |
CN113946491A (en) | Microservice data processing method, microservice data processing device, computer equipment and storage medium | |
CN112711606A (en) | Database access method and device, computer equipment and storage medium | |
CN115657625B (en) | Monitoring method, program product, system, device and readable storage medium | |
US11507485B2 (en) | Universal profiling device and method for simulating performance monitoring unit | |
CN115525392A (en) | Container monitoring method and device, electronic equipment and storage medium | |
US11475017B2 (en) | Asynchronous data enrichment for an append-only data store | |
CN114036179A (en) | Processing method and device for slow query operation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||