CN117370381A - Data query method, device, electronic equipment and storage medium - Google Patents

Data query method, device, electronic equipment and storage medium

Info

Publication number
CN117370381A
Authority
CN
China
Prior art keywords
data
query statement
query
data table
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311205473.0A
Other languages
Chinese (zh)
Inventor
胡知强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinzhuan Xinke Co Ltd
Original Assignee
Jinzhuan Xinke Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinzhuan Xinke Co Ltd filed Critical Jinzhuan Xinke Co Ltd
Priority to CN202311205473.0A priority Critical patent/CN117370381A/en
Publication of CN117370381A publication Critical patent/CN117370381A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a data query method, a data query device, electronic equipment and a storage medium. The method comprises the following steps: acquiring a target query statement, wherein the target query statement is associated with a plurality of data tables in a distributed database; determining, according to the type of the target query statement, whether the target query statement meets a preset execution condition; and, in the case that the target query statement meets the preset execution condition, sequentially querying data in a first data table and the remaining data tables, and determining a target query result of the target query statement according to the queried data, wherein the first data table is the data table with the smallest data volume among the plurality of data tables, and the remaining data tables are the data tables other than the first data table among the plurality of data tables. In this way, the first data table with the smallest data volume is queried first and the remaining data tables with larger data volumes are queried afterwards, so that the communication time and the storage space required during the query are reduced, the query efficiency is improved, and the resource consumption is reduced.

Description

Data query method, device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of distributed database technologies, and in particular, to a data query method, a data query device, an electronic device, and a storage medium.
Background
With the development of the economy and of technology, conventional centralized databases find it increasingly difficult to cope with ever-growing data volumes. To support the digital economy, distributed databases offering high concurrency, high performance, and high availability have emerged. Data querying is one of the most frequently used operations in a distributed database. Because a distributed database must associate data across data nodes, a query incurs extra communication time and storage space; optimizing data query efficiency is therefore particularly important in a distributed environment.
However, in the prior art, when a data query is performed in a distributed database, the data tables are generally queried according to the original execution logic of the query statement. When a data table that is queried early is large, executing the whole query statement takes a large amount of communication time and storage space, so that query efficiency is low and resource consumption is high.
Disclosure of Invention
The application provides a data query method, a data query device, electronic equipment and a storage medium, which are used for solving the problems of low query efficiency and high resource consumption when data query is performed in a distributed database in the prior art.
In a first aspect, the present application provides a data query method, the method including:
acquiring a target query statement, wherein the target query statement is associated with a plurality of data tables in a distributed database;
determining whether the target query statement meets preset execution conditions according to the type of the target query statement, wherein the preset execution conditions corresponding to the target query statement of different types are different;
and in the case that the target query statement meets the preset execution condition, sequentially querying data in a first data table and the remaining data tables, and determining a target query result of the target query statement according to the queried data, wherein the first data table is the data table with the smallest data volume among the plurality of data tables, and the remaining data tables are the data tables other than the first data table among the plurality of data tables.
Further, the target query statement comprises a main query statement and sub-query statements nested in the main query statement;
the determining whether the target query statement meets the preset execution condition according to the type of the target query statement comprises:
determining whether the first data table is associated with the main query statement in the case that the type of the target query statement is a first query type;
and under the condition that the first data table is associated with the main query statement, determining that the target query statement meets the preset execution condition.
Further, the target query statement comprises a main query statement and sub-query statements nested in the main query statement;
the determining whether the target query statement meets the preset execution condition according to the type of the target query statement comprises:
determining whether the first data table is associated with the main query statement or whether the first data table is associated with the sub-query statement in the case that the type of the target query statement is the second query type or the third query type;
and under the condition that the first data table is associated with the main query statement or the first data table is associated with the sub query statement, determining that the target query statement meets the preset execution condition.
Further, the method further comprises:
determining an associated connection type of the target query statement under the condition that the type of the target query statement is the third query type, wherein the associated connection type comprises an inner connection type, a left connection type, a right connection type and a full connection type;
in the case that the association connection type is the inner connection type, the first data table is any data table associated with the main query statement or any data table associated with the sub-query statement;
in the case that the association connection type is the left connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement;
in the case that the association connection type is the left connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement or any data table associated with the sub-query statement;
in the case that the association connection type is the right connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement;
in the case that the association connection type is the right connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement or any data table associated with the sub-query statement;
and in the case that the association connection type is the full connection type, the first data table is any data table associated with the main query statement.
Further, the sequentially querying the data in the first data table and the remaining data tables, and determining the target query result of the target query statement according to the queried data, includes:
sending a first data query request to a data node storing the first data table to obtain a first intermediate result, wherein the first data query request is used for requesting to query the first data table;
based on the first intermediate result, sending a second data query request to a data node storing the remaining data tables to obtain a second intermediate result, wherein the second data query request is used for requesting to query the remaining data tables;
and determining a target query result of the target query statement according to the first intermediate result and the second intermediate result.
Further, the sending of a second data query request to the data node storing the remaining data tables based on the first intermediate result, to obtain a second intermediate result, includes:
if the target query statement comprises a plurality of sub-query statements and the first data table is associated with any one of the sub-query statements, sending a third data query request, based on the first intermediate result, to a data node storing a second data table to obtain a third intermediate result, wherein the second data table is a data table, among the remaining data tables, that is in the same sub-query statement as the first data table, and the third data query request is used for requesting to query the second data table;
and based on the third intermediate result, sending a fourth data query request to a data node storing a third data table to obtain the second intermediate result, wherein the third data table is a data table, among the remaining data tables, that is in a different sub-query statement from the first data table, and the fourth data query request is used for requesting to query the third data table.
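The ordering described above (first data table, then tables in the same sub-query statement, then tables elsewhere) can be sketched as follows. This is a minimal illustration and not the claimed implementation: the `TableRef` type, the `scope` labels, and the choice to group main-query tables with the tables of other sub-queries are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    name: str       # table name
    scope: str      # hypothetical label: "main" or a sub-query id such as "sub1"
    row_count: int  # estimated data volume of the table


def plan_query_order(tables):
    """Return the tables in the order in which they would be queried."""
    # First data table: the one with the smallest data volume.
    first = min(tables, key=lambda t: t.row_count)
    rest = [t for t in tables if t is not first]
    # Second data table(s): remaining tables in the same sub-query as the first.
    same_scope = [t for t in rest if t.scope == first.scope]
    # Third data table(s): remaining tables outside that sub-query (assumption:
    # main-query tables are treated the same way as other sub-queries here).
    other_scope = [t for t in rest if t.scope != first.scope]
    return [first] + same_scope + other_scope


tables = [
    TableRef("t1", "main", 1_000_000),
    TableRef("t2", "sub1", 50),
    TableRef("t3", "sub1", 20_000),
    TableRef("t4", "sub2", 500_000),
]
order = [t.name for t in plan_query_order(tables)]
print(order)  # ['t2', 't3', 't1', 't4']
```

Each step in the resulting order would use the intermediate result of the previous step to narrow its own data query request.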
Further, after sequentially querying the data in the first data table and the remaining data tables and determining the target query result of the target query statement according to the queried data, the method further includes:
and outputting and displaying the target query result of the target query statement.
In a second aspect, the present application provides a data query apparatus, the apparatus comprising:
the acquisition module is used for acquiring a target query statement, wherein the target query statement is associated with a plurality of data tables in the distributed database;
the first determining module is used for determining whether the target query statement meets preset execution conditions according to the type of the target query statement, wherein the preset execution conditions corresponding to the target query statement of different types are different;
and the second determining module is used for sequentially querying the data in the first data table and the remaining data tables in the case that the target query statement meets the preset execution condition, and determining the target query result of the target query statement according to the queried data, wherein the first data table is the data table with the smallest data volume among the plurality of data tables, and the remaining data tables are the data tables other than the first data table among the plurality of data tables.
In a third aspect, the present application provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
the memory is used for storing a computer program;
and the processor is used for realizing the steps of the data query method when executing the program stored in the memory.
In a fourth aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data query method of any of the above.
Compared with the prior art, the technical scheme provided by the embodiments of the application has the following advantages. According to the method provided by the embodiments of the application, a target query statement is acquired, wherein the target query statement is associated with a plurality of data tables in the distributed database; whether the target query statement meets a preset execution condition is determined according to the type of the target query statement, wherein different types of target query statement correspond to different preset execution conditions; and, in the case that the target query statement meets the preset execution condition, data in a first data table and the remaining data tables are queried sequentially, and a target query result of the target query statement is determined according to the queried data, wherein the first data table is the data table with the smallest data volume among the plurality of data tables, and the remaining data tables are the data tables other than the first data table among the plurality of data tables. In this way, when a data query must be performed on a plurality of data tables in the distributed database and the target query statement meets the preset execution condition, the first data table with the smallest data volume is queried first and the remaining data tables with larger data volumes are queried afterwards, so that the communication time and the storage space of the whole query process are reduced, the query efficiency is improved, and the resource consumption is reduced.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which the figures of the drawings are not to be taken in a limiting sense, unless otherwise indicated.
Fig. 1 is a schematic flow chart of a data query method provided in an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a data query device according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present application based on the embodiments herein.
The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the invention. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
Referring to fig. 1, fig. 1 is a flow chart of a data query method according to an embodiment of the present application. As shown in fig. 1, the data query method includes the following steps:
Step 101, acquiring a target query statement, wherein the target query statement is associated with a plurality of data tables in a distributed database.
It should be noted that, the data query method provided by the embodiment of the application can be used in the fields of databases, distributed databases, middleware, distributed storage and the like. For ease of understanding, the following embodiments will be described with reference to a distributed database.
Specifically, the target query statement refers to a statement that queries data in the distributed database, where the target query statement may be associated with a plurality of data tables in the distributed database, and a final query result of the target query statement is obtained by analyzing a data query result of the plurality of data tables.
The target query statement may be obtained by receiving an input operation of a user, or may be obtained by receiving a data query request and analyzing the data query request.
Step 102, determining whether the target query statement meets preset execution conditions according to the type of the target query statement, wherein the preset execution conditions corresponding to the target query statement of different types are different.
Specifically, the types of the target query statement may include, but are not limited to, a first query type (e.g., a query type including a SELECT LIST sub-query statement), a second query type (e.g., a query type including a WHERE sub-query statement), a third query type (e.g., a query type including an ON sub-query statement), and so on. Because different types of target query statement correspond to different preset execution conditions, the preset execution condition corresponding to the target query statement is first determined according to its type, and it is then determined whether the target query statement meets that condition. If the target query statement meets the preset execution condition, step 103 is executed; if it does not, the data tables are queried according to the original execution logic to obtain the query result.
Step 103, in the case that the target query statement meets the preset execution condition, sequentially querying data in a first data table and the remaining data tables, and determining a target query result of the target query statement according to the queried data, wherein the first data table is the data table with the smallest data volume among the plurality of data tables, and the remaining data tables are the data tables other than the first data table among the plurality of data tables.
In this step, when the target query statement meets the preset execution condition, the first data table with the smallest data amount in the multiple data tables associated with the target query statement may be queried preferentially to obtain the data in the first data table, and then other remaining data tables with larger data amounts in the multiple data tables are queried, so as to finally obtain the target query result of the target query statement.
In this embodiment, when data query is required for a plurality of data tables in the distributed database and the target query statement satisfies a preset execution condition, the first data table with the smallest data amount may be queried preferentially, and then the remaining data tables with larger data amount may be queried, so that the communication time and the storage space of the entire query process may be reduced, thereby improving the query efficiency and reducing the resource consumption.
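To make the claimed saving concrete, the following sketch counts how many rows would have to be transferred under two strategies. It is a toy model under stated assumptions: the in-memory lists stand in for tables on separate data nodes, and the "original" strategy is assumed, for comparison only, to pull both tables in full; the source does not specify what the original execution logic transfers.

```python
# Hypothetical in-memory stand-ins for tables stored on two data nodes.
small = [{"a": k} for k in (1, 2, 3)]                     # first data table
large = [{"a": k % 1000, "b": k} for k in range(10_000)]  # remaining data table


def rows_fetched_original(small, large):
    """Assumed baseline: pull both tables in full, then associate them."""
    return len(small) + len(large)


def rows_fetched_small_first(small, large):
    """Pull the small table, then only the large-table rows with matching keys."""
    keys = {row["a"] for row in small}
    matching = [row for row in large if row["a"] in keys]
    return len(small) + len(matching)


print(rows_fetched_original(small, large))     # 10003
print(rows_fetched_small_first(small, large))  # 33
```

In this toy data set each key value occurs ten times in the large table, so querying the small table first reduces the transferred rows from 10003 to 33.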
Further, the target query statement comprises a main query statement and sub-query statements nested in the main query statement;
step 102, determining whether the target query statement meets a preset execution condition according to the type of the target query statement, which specifically includes:
determining whether the first data table is associated with the main query statement if the type of the target query statement is the first query type;
and in the case that the first data table is associated with the main query statement, determining that the target query statement meets the preset execution condition.
Specifically, the first query type may be understood as a query type in which the sub-query statement nested in the main query statement is a SELECT LIST sub-query statement.
In an embodiment, when the type of the target query statement is the first query type, it may be determined whether the first data table is associated with the main query statement, and if the first data table is associated with the main query statement, it may be determined that the target query statement satisfies a preset execution condition; if the first data table is not associated with the main query statement, it may be determined that the target query statement does not satisfy the preset execution condition.
For example, let the target query statement be select t1.*, (select b from t2 where t2.a = t1.a) from t1, where the t1 table is the small table (i.e., the first data table) and the t2 table is the large table. In this target query statement, the first data table t1 is associated with the main query statement, and thus the target query statement satisfies the preset execution condition. The execution logic of the target query statement is then as follows:
Step 1, querying the t1 table by issuing the following query statement to the data node: select * from t1;
Step 2, querying the required t2 table data by using the association condition and the t1 table data obtained in step 1, issuing the following query statement to the data node: select * from t2 where t2.a in (values of field a from the t1 table);
Step 3, evaluating the sub-query and outputting the result.
In this example, t1 and t2 are associated by an equality condition. Non-equality associations can be handled in a similar way, the core idea likewise being to reduce the amount of data queried from the t2 table; this is not described in detail here. Conversely, assuming that the t1 table is the large table and the t2 table is the small table (i.e., the first data table), the first data table t2 is not associated with the main query statement, so the target query statement does not satisfy the preset execution condition in that case.
In this embodiment, for a target query statement of the first query type, the number of rows in the result set is determined by the main query. The condition is therefore satisfied only when the first data table is associated with the main query statement; the sub-query is then driven by the main query result, that is, the small table drives the large table, which reduces the communication time and the storage space of the whole query process.
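The three steps of the example above can be simulated with two in-memory SQLite databases standing in for two data nodes. This is an illustrative sketch only: the column `c`, the sample rows, and the use of SQLite are assumptions, and the sketch assumes at most one t2 row per key so that the scalar sub-query is well defined.

```python
import sqlite3

# Two in-memory databases stand in for two data nodes (illustrative assumption).
node1 = sqlite3.connect(":memory:")
node2 = sqlite3.connect(":memory:")
node1.execute("CREATE TABLE t1 (a INTEGER, c TEXT)")
node1.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, "x"), (2, "y"), (9, "z")])
node2.execute("CREATE TABLE t2 (a INTEGER, b INTEGER)")
node2.executemany("INSERT INTO t2 VALUES (?, ?)", [(k, k * 10) for k in range(1, 6)])

# Step 1: query the small table t1 in full.
t1_rows = node1.execute("SELECT a, c FROM t1 ORDER BY a").fetchall()

# Step 2: fetch only the t2 rows whose key appears in the t1 result.
keys = sorted({a for a, _ in t1_rows})
placeholders = ",".join("?" for _ in keys)
t2_rows = node2.execute(
    f"SELECT a, b FROM t2 WHERE a IN ({placeholders})", keys
).fetchall()

# Step 3: evaluate the scalar sub-query locally for every t1 row.
b_by_a = dict(t2_rows)  # assumes at most one t2 row per key
result = [(a, c, b_by_a.get(a)) for a, c in t1_rows]
print(result)  # [(1, 'x', 10), (2, 'y', 20), (9, 'z', None)]
```

The row with `a = 9` has no match in t2, so its sub-query value is `None`, mirroring the NULL that a scalar sub-query returns when it finds no row.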
Further, the target query statement comprises a main query statement and sub-query statements nested in the main query statement;
step 102, determining whether the target query statement meets a preset execution condition according to the type of the target query statement, which specifically includes:
determining whether the first data table is associated with the main query statement or whether the first data table is associated with the sub-query statement in the case that the type of the target query statement is the second query type or the third query type;
and under the condition that the first data table is associated with the main query statement or the first data table is associated with the sub query statement, determining that the target query statement meets the preset execution condition.
Specifically, the second query type may be understood as a query type in which the sub-query statement nested in the main query statement is a WHERE sub-query statement. The third query type may be understood as a query type in which sub-query statements nested in the main query statement are ON sub-query statements.
In an embodiment, when the type of the target query statement is the second query type or the third query type, it may be determined whether the first data table is associated with the main query statement or whether the first data table is associated with the sub-query statement, and if it is determined that the first data table is associated with the main query statement or the first data table is associated with the sub-query statement, it may be determined that the target query statement satisfies a preset execution condition; if the first data table is associated with neither the main query statement nor the sub-query statement, it may be determined that the target query statement does not satisfy the preset execution condition.
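The type-dependent check described so far can be sketched as a small predicate. The enum, the function name, and the representation of the main query and sub-query as sets of table names are hypothetical; for the third query type the additional connection-type rules of the method are not modelled here.

```python
from enum import Enum


class QueryType(Enum):
    SELECT_LIST = 1  # first query type
    WHERE = 2        # second query type
    ON = 3           # third query type


def meets_preset_condition(query_type, first_table, main_tables, sub_tables):
    """Return True if the statement may be executed small-table-first."""
    in_main = first_table in main_tables
    in_sub = first_table in sub_tables
    if query_type is QueryType.SELECT_LIST:
        # First query type: the smallest table must belong to the main query.
        return in_main
    # Second and third query types: main query or sub-query both qualify.
    return in_main or in_sub


print(meets_preset_condition(QueryType.SELECT_LIST, "t2", {"t1"}, {"t2"}))  # False
print(meets_preset_condition(QueryType.WHERE, "t2", {"t1"}, {"t2"}))        # True
```

When the predicate returns False, the statement would fall back to its original execution logic, as described for step 102.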
For example, assume that the target query statement of the second query type is select * from t1 where exists (select b from t2 where t1.a = t2.a), and that the t1 table is the small table (i.e., the first data table) and the t2 table is the large table. In this target query statement, the first data table t1 is associated with the main query statement, and thus the target query statement satisfies the preset execution condition. The execution logic of the target query statement is then as follows:
Step 1, querying the t1 table by issuing the following query statement to the data node: select * from t1;
Step 2, querying the required t2 table data by using the association condition and the t1 table data obtained in step 1, issuing the following query statement to the data node: select * from t2 where t2.a in (values of field a from the t1 table);
Step 3, evaluating the exists sub-query and outputting the result.
Continuing with the illustration, assuming that the t1 table is a large table and the t2 table is a small table (i.e., the first data table), in the target query statement, the first data table t2 is associated with the sub-query statement, so that the target query statement satisfies the preset execution condition. At this time, the execution logic of the target query statement is as follows:
Step 1, querying the t2 table by issuing the following query statement to the data node: select a from t2;
Step 2, querying the required t1 table data by using the association condition and the t2 table data obtained in step 1, issuing the following query statement to the data node: select * from t1 where t1.a in (values of field a from the t2 table);
Step 3, directly outputting the data queried in step 2 as the result.
For another example, assume that the target query statement of the third query type is select * from t1 join t2 on t1.a = t2.a and exists (select c from t3 where t3.b = t1.b). Because the join is an inner join (i.e., the inner connection type), the on condition can be moved into the where condition. The execution logic of this target query statement is therefore the same as that of the target query statement of the second query type in the above example, and is not described again here.
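The statement that an ON condition can be moved into the WHERE condition for an inner join can be checked on a small data set. The sketch below uses SQLite with made-up rows (an assumption for illustration) and also shows that the same rewrite changes the result for a left join, which is why the connection type matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER);
    CREATE TABLE t2 (a INTEGER);
    CREATE TABLE t3 (b INTEGER, c INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO t2 VALUES (1), (2), (4);
    INSERT INTO t3 VALUES (10, 100), (30, 300);
""")

SUB = "EXISTS (SELECT c FROM t3 WHERE t3.b = t1.b)"


def run(join_kw, sub_in_on):
    """Run the query with the EXISTS sub-query placed in ON or in WHERE."""
    if sub_in_on:
        sql = f"SELECT t1.a FROM t1 {join_kw} t2 ON t1.a = t2.a AND {SUB}"
    else:
        sql = f"SELECT t1.a FROM t1 {join_kw} t2 ON t1.a = t2.a WHERE {SUB}"
    return conn.execute(sql + " ORDER BY t1.a").fetchall()


inner_on, inner_where = run("JOIN", True), run("JOIN", False)
left_on, left_where = run("LEFT JOIN", True), run("LEFT JOIN", False)
print(inner_on, inner_where)  # [(1,)] [(1,)]
print(left_on, left_where)    # [(1,), (2,), (3,)] [(1,), (3,)]
```

For the inner join both placements return the same rows, whereas for the left join the ON placement keeps every left-table row and the WHERE placement filters them.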
In this embodiment, a target query statement of the second query type or the third query type supports driving in both directions, inner to outer and outer to inner. If the first data table is associated with the sub-query statement, the sub-query is executed first and then the main query, with the sub-query result used to filter the main query, either all at once or in batches. If the first data table is associated with the main query statement, the main query is executed first and then the sub-query, with the main query result used to filter the sub-query, either all at once or in batches. In this way the small table drives the large table, which effectively reduces the overall communication time and storage space of the query process.
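The two driving directions for an exists sub-query can be sketched as follows, again with in-memory SQLite databases standing in for data nodes. The helper names, the two-column table layout, and the sample rows are assumptions; batching and NULL keys are not handled.

```python
import sqlite3


def make_node(table, rows):
    """Create an in-memory 'data node' holding one table with columns (a, b)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (a INTEGER, b INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    return conn


def fetch_in(conn, table, columns, keys):
    """Issue SELECT <columns> FROM <table> WHERE a IN (<keys>) to one node."""
    if not keys:
        return []
    marks = ",".join("?" for _ in keys)
    sql = f"SELECT {columns} FROM {table} WHERE a IN ({marks})"
    return sorted(conn.execute(sql, sorted(keys)).fetchall())


def exists_main_first(n1, n2):
    """t1 is the small table: the main query drives the sub-query."""
    t1_rows = sorted(n1.execute("SELECT a, b FROM t1").fetchall())
    t2_keys = {a for (a,) in fetch_in(n2, "t2", "a", {a for a, _ in t1_rows})}
    return [row for row in t1_rows if row[0] in t2_keys]


def exists_sub_first(n1, n2):
    """t2 is the small table: the sub-query drives the main query."""
    t2_keys = {a for (a,) in n2.execute("SELECT a FROM t2")}
    return fetch_in(n1, "t1", "a, b", t2_keys)


n1 = make_node("t1", [(1, 10), (2, 20), (3, 30)])
n2 = make_node("t2", [(2, 200), (3, 300), (3, 301), (7, 700)])
print(exists_main_first(n1, n2))  # [(2, 20), (3, 30)]
print(exists_sub_first(n1, n2))   # [(2, 20), (3, 30)]
```

Both directions return the same rows of t1; which one transfers less data depends on which table is smaller, which is the point of choosing the first data table by data volume.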
Further, the method further comprises:
determining an associated connection type of the target query statement under the condition that the type of the target query statement is a third query type, wherein the associated connection type comprises an inner connection type, a left connection type, a right connection type and a full connection type;
in the case that the associated connection type is an internal connection type, the first data table is any data table associated with a main query statement or any data table associated with a sub-query statement;
under the condition that the associated connection type is a left connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement;
under the condition that the associated connection type is a left connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement or any data table associated with the sub-query statement;
in the case that the associated connection type is a right connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement;
under the condition that the associated connection type is a right connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement or any data table associated with the sub-query statement;
In the case where the association connection type is a full connection type, the first data table is any data table associated with the main query statement.
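The rules listed above can be read as a lookup from the connection type and the side the sub-query is associated with to the set of tables allowed to act as the first data table. The following is a minimal sketch of that reading; the function `first_table_candidates` and its parameters are hypothetical names introduced for illustration.

```python
def first_table_candidates(join_type, subquery_side, left, right,
                           main_tables, sub_tables):
    """Return the tables that may serve as the first (driving) data table.

    join_type: "inner", "left", "right" or "full".
    subquery_side: "left" or "right", i.e. the main-query table the sub-query
    is associated with (ignored for inner and full joins).
    """
    if join_type == "inner":
        return set(main_tables) | set(sub_tables)
    if join_type == "full":
        return set(main_tables)
    if join_type not in ("left", "right") or subquery_side not in ("left", "right"):
        raise ValueError("unsupported join type or sub-query side")
    side_table = {"left": left, "right": right}
    if subquery_side == join_type:
        # Sub-query associated with the preserved side (left table of a left
        # join, right table of a right join): only that table may drive.
        return {side_table[join_type]}
    # Sub-query associated with the other side: that table or any sub-query
    # table may drive.
    return {side_table[subquery_side]} | set(sub_tables)
```

A statement would then satisfy the preset execution condition, under this sketch, only if its smallest table is in the returned set.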
In an embodiment, the associated connection types of the target query statement of the third query type may include an inner connection type, a left connection type, a right connection type, and a full connection type. The inner connection type refers to inner join, which returns only the rows whose join fields are equal in the two tables; the left connection type refers to left join, which returns all records in the left table together with the records in the right table whose join fields are equal; the right connection type refers to right join, which returns all records in the right table together with the records in the left table whose join fields are equal; the full connection type refers to full join, which returns all records in both the left and right tables. The first data table differs for different associated connection types.
Specifically, when the associated connection type is the inner connection type, the first data table may be any data table associated with the main query statement or any data table associated with the sub-query statement. For the execution of such a target query statement, reference may be made to the example given for the second query type, which is not repeated here.
When the associated connection type is a left connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement.
For example, assume that the target query statement is select * from t1 left join t2 on t1.a=t2.a and exists (select c from t3 where t3.b=t1.b), where t1 is a small table (i.e., the first data table) and t2 and t3 are large tables. In this case, the execution logic of the target query statement is as follows:
Step 1: query the t1 table, issuing the data node query statement: select * from t1;
Step 2: using the association condition and the t1 table data found in step 1, query the required t2 table data, issuing the data node query statement: select * from t2 where t2.a in (t1 table field a data);
Step 3: using the queried t1 table data and the association condition, query the required t3 table data, issuing the data node query statement: select * from t3 where t3.b in (t1 table field b data);
Step 4: compute the left join and the sub-query, and output the result.
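The four steps above can be simulated as follows. This is a sketch under stated assumptions: a single in-memory SQLite database stands in for the data nodes, the table contents are made up for illustration, and `fetch_in` is a hypothetical helper that builds the in (...) filter from previously fetched values.

```python
import sqlite3

node = sqlite3.connect(":memory:")  # stand-in for the data nodes (assumption)
node.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER);
    CREATE TABLE t2 (a INTEGER, b INTEGER);
    CREATE TABLE t3 (b INTEGER, c INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20);
    INSERT INTO t2 VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO t3 VALUES (10, 7), (30, 8);
""")


def fetch_in(table, column, values):
    """Fetch only the rows of `table` whose `column` is among `values`.

    Table and column names are trusted constants in this sketch.
    """
    values = sorted(set(values))
    if not values:
        return []
    marks = ",".join("?" * len(values))
    sql = f"SELECT * FROM {table} WHERE {column} IN ({marks})"
    return node.execute(sql, values).fetchall()


# Step 1: read the small table t1 in full.
t1_rows = node.execute("SELECT * FROM t1").fetchall()
# Step 2: fetch only the t2 rows whose a appears in the t1 data.
t2_rows = fetch_in("t2", "a", [a for a, _ in t1_rows])
# Step 3: fetch only the t3 rows whose b appears in the t1 data.
t3_rows = fetch_in("t3", "b", [b for _, b in t1_rows])
# Step 4: compute the left join and the exists sub-query locally.
t3_b = {b for b, _ in t3_rows}
result = []
for a, b in t1_rows:
    matches = [r for r in t2_rows if r[0] == a and b in t3_b]
    if matches:
        result.extend((a, b) + r for r in matches)
    else:
        result.append((a, b, None, None))  # left join keeps unmatched t1 rows
```

Only the t2 and t3 rows that can match a t1 row leave the data node, which is the saving the embodiment attributes to driving the large tables with the small one.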
For another example, assume that the target query statement is select * from t1 left join t2 on t1.a=t2.a and exists (select c from t3 where t2.b=t3.b), where t1 is a small table (i.e., the first data table) and t2 and t3 are large tables. In this case, the execution logic of the target query statement is as follows:
Step 1: query the t1 table, issuing the data node query statement: select * from t1;
Step 2: using the association condition and the t1 table data found in step 1, query the required t2 table data, issuing the data node query statement: select * from t2 where t2.a in (t1 table field a data);
Step 3: using the queried t2 table data and the association condition, query the required t3 table data, issuing the data node query statement: select b from t3 where t3.b in (t2 table field b data);
Step 4: compute the left join and the sub-query, and output the result.
Note that left join indicates that the association connection type of the target query statement is a left connection type, where t1 is a left table and t2 is a right table. Because the sub-query is located in the on condition of left join and the sub-query statement is associated with the left table in the main query statement, the sub-query can only be driven by the main query; otherwise, the query result is wrong. Therefore, the first data table in this example is the left table in the main query statement, i.e., the t1 table must be the small table. If the condition is not satisfied, the data query method provided by the application cannot be used for optimization.
When the associated connection type is a left connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement or any data table associated with the sub-query statement.
For example, for a target query statement whose sub-query is associated with the right table t2 (i.e., exists (select c from t3 where t2.b=t3.b)), let t1 be a large table, t2 be a large table, and t3 be a small table (i.e., the first data table). In this case, the execution logic of the target query statement is as follows:
Step 1: query the t1 table, issuing the data node query statement: select * from t1;
Step 2: query the t3 table, issuing the data node query statement: select * from t3;
Step 3: using the queried t3 table data and the association condition, query the required t2 table data, issuing the data node query statement: select * from t2 where t2.b in (t3 table field b data);
Step 4: compute the left join and the sub-query, and output the result.
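The case in which the sub-query table t3 drives the right table t2 can be sketched in the same way. The statement assumed here is select * from t1 left join t2 on t1.a=t2.a and exists (select c from t3 where t2.b=t3.b); the in-memory SQLite database, the sample rows and the helper `fetch_in` are illustrative assumptions rather than part of the application.

```python
import sqlite3

node = sqlite3.connect(":memory:")  # stand-in for the data nodes (assumption)
node.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER);
    CREATE TABLE t2 (a INTEGER, b INTEGER);
    CREATE TABLE t3 (b INTEGER, c INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO t2 VALUES (1, 100), (2, 200), (3, 300), (4, 100);
    INSERT INTO t3 VALUES (100, 7);
""")


def fetch_in(table, column, values):
    """Fetch only the rows of `table` whose `column` is among `values`."""
    values = sorted(set(values))
    if not values:
        return []
    marks = ",".join("?" * len(values))
    sql = f"SELECT * FROM {table} WHERE {column} IN ({marks})"
    return node.execute(sql, values).fetchall()


# Step 1: the left table is preserved by the left join, so t1 is read in full.
t1_rows = node.execute("SELECT * FROM t1").fetchall()
# Step 2: read the small sub-query table t3 in full.
t3_rows = node.execute("SELECT * FROM t3").fetchall()
# Step 3: fetch only the t2 rows whose b appears in the t3 data.
t2_rows = fetch_in("t2", "b", [b for b, _ in t3_rows])
# Step 4: compute the left join; the exists condition already holds for every
# fetched t2 row because of the filter applied in step 3.
result = []
for a, b in t1_rows:
    matches = [r for r in t2_rows if r[0] == a]
    result.extend([(a, b) + r for r in matches] or [(a, b, None, None)])
```

Because the on condition requires a matching t3 row for every joined t2 row, discarding the other t2 rows at the data node does not change the join result, while every t1 row is still returned.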
Note that left join indicates that the associated connection type of the target query statement is the left connection type, where t1 is the left table and t2 is the right table. Because the sub-query is located in the on condition of the left join and is associated with the right table of the left join, the sub-query may be filtered with the right table, or the right table may be filtered with the sub-query. Therefore, the first data table may be the t2 table in the main query statement or, as in this example, the t3 table in the sub-query statement.
Similarly, when a sub-query is located in the on-condition of right join and the sub-query is associated with the right join right table, the sub-query can only be driven with the main query. Otherwise, the query result is wrong. Therefore, when the association connection type is a right connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement. If the condition is not satisfied, the method in this patent cannot be used for optimization.
Similarly, the sub-query is located in the on condition of right join, and the sub-query is associated with the right join left table, where the sub-query may be filtered with the left table or the left table may be filtered with the sub-query. Therefore, when the association connection type is the right connection type and the sub-query statement is associated with the left table in the main query statement, the first data table is the left table in the main query statement or any data table associated with the sub-query statement.
When the associated connection type is a full connection type, the first data table is any data table associated with the main query statement since the sub-query can only be driven with the main query.
In this embodiment, whether the first data table should be a data table in the main query statement or a data table in the sub-query statement may be further determined according to the associated connection type of the target query statement of the third query type, so as to further determine whether the target query statement of the third query type meets the preset execution condition, thereby ensuring the accuracy of the query result.
Further, the step 103 of sequentially querying the data in the first data table and the remaining data tables, and determining a target query result of the target query statement according to the queried data specifically includes:
sending a first data query request to a data node storing a first data table to obtain a first intermediate result, wherein the first data query request is used for requesting to query the first data table;
based on the first intermediate result, sending a second data query request to a data node storing the residual data table to obtain a second intermediate result, wherein the second data query request is used for requesting to query the residual data table;
and determining a target query result of the target query statement according to the first intermediate result and the second intermediate result.
In an embodiment, when the target query statement meets the preset execution condition, a first data query request may be sent to the data node storing the first data table to obtain a first intermediate result. A second data query request is then sent, based on the first intermediate result, to the data node storing the remaining data table to obtain a second intermediate result, and the target query result of the target query statement is determined according to the first intermediate result and the second intermediate result. In this way, the first data table with the smallest data volume is queried first, and the remaining data tables with larger data volumes are queried afterwards, which reduces the communication time and storage space of the whole query process, improves query efficiency, and reduces resource consumption.
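The two-request flow described above can be sketched generically. In this sketch the data nodes are modelled as plain in-memory lists, the row counts stand in for whatever data-volume statistics the system keeps, and the names `split_tables` and `query_in_order` are hypothetical.

```python
def split_tables(row_counts):
    """Return (first_table, remaining_tables) from a table -> row-count map."""
    first = min(row_counts, key=row_counts.get)  # smallest data volume
    return first, [t for t in row_counts if t != first]


def query_in_order(nodes, row_counts, key):
    """Query the smallest table first, then filter the others with its result.

    nodes: table name -> list of row dicts (stand-in for the data nodes).
    key: the column shared by the association condition (assumed single column).
    """
    first, remaining = split_tables(row_counts)
    # First data query request: read the first data table in full.
    first_result = list(nodes[first])
    wanted = {row[key] for row in first_result}
    # Second data query request: only rows matching the first intermediate result.
    second_result = {
        table: [row for row in nodes[table] if row[key] in wanted]
        for table in remaining
    }
    return first, first_result, second_result
```

With a two-row table and a thousand-row table, only the rows of the large table that match the two key values form the second intermediate result.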
Further, the step of sending a second data query request to the data node storing the remaining data table based on the first intermediate result to obtain a second intermediate result specifically includes:
when the target query statement comprises a plurality of sub-query statements and the first data table is associated with any sub-query statement, a third data query request is sent to a data node storing a second data table based on the first intermediate result to obtain a third intermediate result, wherein the second data table is a data table in the same sub-query statement as the first data table in the remaining data tables, and the third data query request is used for requesting to query the second data table;
and based on the third intermediate result, sending a fourth data query request to a data node storing a third data table to obtain a second intermediate result, wherein the third data table is a data table, among the remaining data tables, that is in a different sub-query statement from the first data table, and the fourth data query request is used for requesting to query the third data table.
In an embodiment, when the target query statement includes a plurality of sub-query statements and the first data table is associated with any sub-query statement, after a first data query request is sent to a data node storing the first data table to obtain a first intermediate result, a third data query request may be sent to a data node storing a second data table in the same sub-query statement as the first data table based on the first intermediate result to obtain a third intermediate result, and then a fourth data query request may be sent to a data node storing a third data table in a different sub-query statement from the first data table based on the third intermediate result to obtain a second intermediate result. The second data table and the third data table herein may represent one data table or may represent a plurality of data tables, which is not limited herein.
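The ordering described above (first data table, then the tables sharing a sub-query statement with it, then the tables of other sub-query statements) can be sketched as follows. The function `query_order` is a hypothetical name, and representing each sub-query statement by the list of tables it references, including correlated main-query tables, is an assumption of this sketch.

```python
def query_order(first_table, subquery_tables, all_tables):
    """Order tables: first table, same-sub-query tables, then the rest.

    subquery_tables: one list of referenced table names per sub-query statement.
    """
    same = []  # "second data tables": share a sub-query with the first table
    for tables in subquery_tables:
        if first_table in tables:
            for t in tables:
                if t != first_table and t not in same:
                    same.append(t)
    # "third data tables": everything else still to be queried
    other = [t for t in all_tables if t != first_table and t not in same]
    return [first_table] + same + other
```

For a statement with sub-queries referencing (t2, t1) and (t3, t1) and t2 as the first data table, the sketch yields the order t2, t1, t3.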
For example, assume that the target query statement is select * from t1 where exists (select * from t2 where t1.a=t2.a) and exists (select * from t3 where t3.b=t1.b), where t1 is a large table, t2 is a small table (i.e., the first data table), and t3 is a large table. It can be seen that the target query statement contains two sub-query statements, and that the first data table t2 is associated with the t1 table in the first sub-query statement. Therefore, after the t2 table is queried, the t1 table is queried first, and then the t3 table is queried.
At this time, the execution logic of the target query statement is as follows:
Step 1: query the t2 table, issuing the data node query statement: select * from t2;
Step 2: using the association condition and the t2 table data found in step 1, query the required t1 table data, issuing the data node query statement: select * from t1 where t1.a in (t2 table field a data);
Step 3: using the queried t1 table data and the association condition, query the required t3 table data, issuing the data node query statement: select * from t3 where t3.b in (t1 table field b data);
Step 4: compute the sub-queries and output the result.
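The chained execution above, in which each intermediate result filters the next request, can be simulated as follows. As before, this is only a sketch: one in-memory SQLite database stands in for the data nodes, the rows are invented, and `fetch_in` is a hypothetical helper.

```python
import sqlite3

node = sqlite3.connect(":memory:")  # stand-in for the data nodes (assumption)
node.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER);
    CREATE TABLE t2 (a INTEGER);
    CREATE TABLE t3 (b INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO t2 VALUES (1), (2);
    INSERT INTO t3 VALUES (10), (30);
""")


def fetch_in(table, column, values):
    """Fetch only the rows of `table` whose `column` is among `values`."""
    values = sorted(set(values))
    if not values:
        return []
    marks = ",".join("?" * len(values))
    sql = f"SELECT * FROM {table} WHERE {column} IN ({marks})"
    return node.execute(sql, values).fetchall()


# Step 1: read the small table t2 in full (first intermediate result).
t2_rows = node.execute("SELECT * FROM t2").fetchall()
# Step 2: fetch the t1 rows matching the t2 data (third intermediate result).
t1_rows = fetch_in("t1", "a", [a for (a,) in t2_rows])
# Step 3: fetch the t3 rows matching the already reduced t1 data.
t3_rows = fetch_in("t3", "b", [b for _, b in t1_rows])
# Step 4: evaluate both exists sub-queries locally and output the result.
t2_a = {a for (a,) in t2_rows}
t3_b = {b for (b,) in t3_rows}
result = [(a, b) for a, b in t1_rows if a in t2_a and b in t3_b]
```

Querying t1 before t3 means the filter sent to the t3 data node is built from t1 rows that have already been reduced by t2, so fewer t3 rows are returned than if t3 were filtered by the full t1 table.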
Thus, the query efficiency can be effectively improved, and the accuracy of the query result can be effectively ensured.
Further, after sequentially querying the data in the first data table and the remaining data tables and determining the target query result of the target query statement according to the queried data in step 103, the method further includes:
and outputting and displaying the target query result of the target query statement.
In an embodiment, after the data in the first data table and the remaining data tables are queried in sequence and the target query result of the target query statement is determined according to the queried data, the target query result may further be output and displayed, so that the user can conveniently view it and use it in subsequent calculations.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a data query device according to an embodiment of the present application. As shown in fig. 2, the data query apparatus 200 may include:
an obtaining module 201, configured to obtain a target query statement, where the target query statement is associated with a plurality of data tables in a distributed database;
a first determining module 202, configured to determine, according to a type of a target query statement, whether the target query statement meets a preset execution condition, where preset execution conditions corresponding to different types of target query statements are different;
The second determining module 203 is configured to query, in order, data in a first data table and a remaining data table when the target query statement meets a preset execution condition, and determine a target query result of the target query statement according to the queried data, where the first data table is a data table with a minimum data amount in the multiple data tables, and the remaining data table is other data tables of the multiple data tables except the first data table.
Further, the target query statement comprises a main query statement and sub-query statements nested in the main query statement; the first determination module 202 includes:
a first determining sub-module, configured to determine, if the type of the target query statement is a first query type, whether the first data table is associated with the main query statement;
and the second determining submodule is used for determining that the target query statement meets the preset execution condition under the condition that the first data table is associated with the main query statement.
Further, the target query statement comprises a main query statement and sub-query statements nested in the main query statement; the first determination module 202 further includes:
a third determining sub-module, configured to determine whether the first data table is associated with the main query statement or whether the first data table is associated with the sub-query statement if the type of the target query statement is the second query type or the third query type;
And the fourth determining sub-module is used for determining that the target query statement meets the preset execution condition under the condition that the first data table is associated with the main query statement or the first data table is associated with the sub-query statement.
Further, under the condition that the type of the target query statement is a third query type, determining an associated connection type of the target query statement, wherein the associated connection type comprises an inner connection type, a left connection type, a right connection type and a full connection type;
in the case that the associated connection type is an inner connection type, the first data table is any data table associated with a main query statement or any data table associated with a sub-query statement;
under the condition that the associated connection type is a left connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement;
under the condition that the associated connection type is a left connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement or any data table associated with the sub-query statement;
in the case that the associated connection type is a right connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement;
Under the condition that the associated connection type is a right connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement or any data table associated with the sub-query statement;
in the case where the association connection type is a full connection type, the first data table is any data table associated with the main query statement.
Further, the second determining module 203 includes:
the first sending submodule is used for sending a first data query request to a data node storing the first data table to obtain a first intermediate result, wherein the first data query request is used for requesting to query the first data table;
the second sending sub-module is used for sending a second data query request to the data nodes storing the residual data tables based on the first intermediate result to obtain a second intermediate result, wherein the second data query request is used for requesting to query the residual data tables;
and the fifth determining submodule is used for determining a target query result of the target query statement according to the first intermediate result and the second intermediate result.
Further, the second transmitting submodule includes:
the first sending unit is used for sending a third data query request to a data node storing a second data table based on the first intermediate result to obtain a third intermediate result when the target query statement comprises a plurality of sub-query statements and the first data table is associated with any sub-query statement, wherein the second data table is a data table in the same sub-query statement as the first data table in the remaining data tables, and the third data query request is used for requesting to query the second data table;
The second sending unit is configured to send a fourth data query request to a data node storing a third data table based on the third intermediate result, to obtain a second intermediate result, where the third data table is a data table, among the remaining data tables, that is in a different sub-query statement from the first data table, and the fourth data query request is used for requesting to query the third data table.
Further, the apparatus 200 further comprises:
and the output and display module is used for outputting and displaying the target query result of the target query statement.
It should be noted that, the apparatus 200 may implement the steps of the data query method of any one of the above embodiments, and achieve the same technical effects, which is not described herein again.
As shown in fig. 3, the embodiment of the present application further provides an electronic device, which includes a processor 311, a communication interface 312, a memory 313, and a communication bus 314, where the processor 311, the communication interface 312, and the memory 313 communicate with each other through the communication bus 314;
a memory 313 for storing a computer program;
in one embodiment of the present application, the processor 311 is configured to implement the data query method provided in any one of the foregoing method embodiments when executing the program stored in the memory 313, and the method includes:
Acquiring a target query statement, wherein the target query statement is associated with a plurality of data tables in a distributed database;
determining whether the target query statement meets preset execution conditions according to the types of the target query statements, wherein the preset execution conditions corresponding to the target query statements of different types are different;
and under the condition that the target query statement meets the preset execution condition, sequentially querying data in a first data table and a residual data table, and determining a target query result of the target query statement according to the queried data, wherein the first data table is the data table with the minimum data quantity in the plurality of data tables, and the residual data table is other data tables except the first data table in the plurality of data tables.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data query method provided by any of the method embodiments described above.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a general purpose hardware platform, or may be implemented by hardware. Based on such understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the related art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the method described in the respective embodiments or some parts of the embodiments.
It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "includes," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless an order of performance is explicitly stated. It should also be appreciated that additional or alternative steps may be used.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of querying data, the method comprising:
acquiring a target query statement, wherein the target query statement is associated with a plurality of data tables in a distributed database;
determining whether the target query statement meets preset execution conditions according to the type of the target query statement, wherein the preset execution conditions corresponding to the target query statement of different types are different;
and under the condition that the target query statement meets the preset execution condition, sequentially querying data in a first data table and a residual data table, and determining a target query result of the target query statement according to the queried data, wherein the first data table is the data table with the smallest data quantity among the plurality of data tables, and the residual data table is the other data tables in the plurality of data tables except the first data table.
2. The method of claim 1, wherein the target query statement comprises a main query statement and sub-query statements nested in the main query statement;
the determining whether the target query statement meets the preset execution condition according to the type of the target query statement comprises:
determining whether the first data table is associated with the main query statement if the type of the target query statement is a first query type;
and under the condition that the first data table is associated with the main query statement, determining that the target query statement meets the preset execution condition.
3. The method of claim 1, wherein the target query statement comprises a main query statement and sub-query statements nested in the main query statement;
the determining whether the target query statement meets the preset execution condition according to the type of the target query statement comprises:
determining whether the first data table is associated with the main query statement or whether the first data table is associated with the sub-query statement in the case that the type of the target query statement is the second query type or the third query type;
And under the condition that the first data table is associated with the main query statement or the first data table is associated with the sub query statement, determining that the target query statement meets the preset execution condition.
4. A method according to claim 3, characterized in that the method further comprises:
determining an associated connection type of the target query statement under the condition that the type of the target query statement is the third query type, wherein the associated connection type comprises an inner connection type, a left connection type, a right connection type and a full connection type;
in the case that the association connection type is the inner connection type, the first data table is any data table associated with the main query statement or any data table associated with the sub query statement;
in the case that the association connection type is the left connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement;
in the case that the association connection type is the left connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement or any data table associated with the sub-query statement;
In the case that the association connection type is the right connection type and the sub-query statement is associated with a right table in the main query statement, the first data table is the right table in the main query statement;
in the case that the association connection type is the right connection type and the sub-query statement is associated with a left table in the main query statement, the first data table is the left table in the main query statement or any data table associated with the sub-query statement;
and in the case that the association connection type is the full connection type, the first data table is any data table associated with the main query statement.
5. The method according to claim 1, wherein sequentially querying the data in the first data table and the remaining data tables and determining the target query result of the target query statement according to the queried data comprises:
sending a first data query request to a data node storing the first data table to obtain a first intermediate result, wherein the first data query request is used for requesting to query the first data table;
based on the first intermediate result, sending a second data query request to a data node storing the residual data table to obtain a second intermediate result, wherein the second data query request is used for requesting to query the residual data table;
And determining a target query result of the target query statement according to the first intermediate result and the second intermediate result.
6. The method of claim 5, wherein sending a second data query request to the data node storing the remaining data table based on the first intermediate result, to obtain a second intermediate result, comprises:
if the target query statement comprises a plurality of sub-query statements and the first data table is associated with any sub-query statement, a third data query request is sent to a data node storing a second data table based on the first intermediate result to obtain a third intermediate result, wherein the second data table is a data table in the same sub-query statement as the first data table in the remaining data tables, and the third data query request is used for requesting to query the second data table;
and based on the third intermediate result, sending a fourth data query request to a data node storing a third data table to obtain the second intermediate result, wherein the third data table is a data table, among the remaining data tables, that is in a different sub-query statement from the first data table, and the fourth data query request is used for requesting to query the third data table.
7. The method of claim 1, wherein after sequentially querying the data in the first data table and the remaining data tables and determining the target query result of the target query statement based on the queried data, the method further comprises:
and outputting and displaying the target query result of the target query statement.
8. A data querying device, the device comprising:
the acquisition module is used for acquiring a target query statement, wherein the target query statement is associated with a plurality of data tables in the distributed database;
the first determining module is used for determining whether the target query statement meets preset execution conditions according to the type of the target query statement, wherein the preset execution conditions corresponding to the target query statement of different types are different;
and the second determining module is used for sequentially querying the data in the first data table and the residual data table under the condition that the target query statement meets the preset execution condition, and determining the target query result of the target query statement according to the queried data, wherein the first data table is the data table with the smallest data volume among the plurality of data tables, and the residual data table is the other data tables in the plurality of data tables except the first data table.
9. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the steps of the data query method of any one of claims 1 to 7 when executing the program stored in the memory.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the data query method of any one of claims 1 to 7.
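
The table ordering recited in the claims can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes that "data volume" can be approximated by a row count, that each table carries a hypothetical sub-query label (`None` for the outer statement), and that tables within a group keep their input order, which the claims do not specify. The names `TableInfo` and `plan_query_order` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical description of one table referenced by the query."""
    name: str
    row_count: int            # assumed stand-in for "data volume"
    subquery: Optional[str]   # assumed sub-query label; None = outer statement


def plan_query_order(tables: list[TableInfo]) -> list[str]:
    """Return table names in the order sketched from the claims.

    1. The table with the smallest data volume is queried first.
    2. Remaining tables in the same sub-query as that table come next.
    3. Remaining tables in other sub-queries come last.
    """
    if not tables:
        return []
    first = min(tables, key=lambda t: t.row_count)
    remaining = [t for t in tables if t is not first]
    same = [t for t in remaining if t.subquery == first.subquery]
    other = [t for t in remaining if t.subquery != first.subquery]
    return [first.name] + [t.name for t in same] + [t.name for t in other]


tables = [
    TableInfo("orders", 1000, "sq1"),
    TableInfo("regions", 10, "sq1"),
    TableInfo("customers", 500, "sq2"),
    TableInfo("items", 200, "sq1"),
]
order = plan_query_order(tables)
print(order)
```

In this sketch each position in the returned list corresponds to one data query request, with the intermediate result of the previous request passed to the next; the request and node-selection logic itself is left out because the claims above do not fix it.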
CN202311205473.0A 2023-09-18 2023-09-18 Data query method, device, electronic equipment and storage medium Pending CN117370381A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311205473.0A CN117370381A (en) 2023-09-18 2023-09-18 Data query method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311205473.0A CN117370381A (en) 2023-09-18 2023-09-18 Data query method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117370381A true CN117370381A (en) 2024-01-09

Family

ID=89401230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311205473.0A Pending CN117370381A (en) 2023-09-18 2023-09-18 Data query method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117370381A (en)

Similar Documents

Publication Publication Date Title
US10558659B2 (en) Techniques for dictionary based join and aggregation
CN112559554B (en) Query statement optimization method and device
US11599535B2 (en) Query translation for searching complex structures of objects
US7451136B2 (en) System and method for searching multiple disparate search engines
US10853361B2 (en) Scenario based insights into structure data
US7730055B2 (en) Efficient hash based full-outer join
US7233944B2 (en) Determining query cost based on subquery filtering factor
US7953727B2 (en) Handling requests for data stored in database tables
US7606827B2 (en) Query optimization using materialized views in database management systems
CN104123374A (en) Method and device for aggregate query in distributed databases
CN109791543B (en) Control method for executing multi-table connection operation and corresponding device
CN107251013A (en) Method, device and the Database Systems of data query
CN107102995B (en) Method and device for determining SQL execution plan
US10726006B2 (en) Query optimization using propagated data distinctness
CN111897824B (en) Data operation method, device, equipment and storage medium
WO2016122891A1 (en) Workload aware data placement for join-based query processing in a cluster
CN113918605A (en) Data query method, device, equipment and computer storage medium
CN112732752A (en) Query statement optimization method, device, equipment and storage medium
CN108140022A (en) Data query method and Database Systems
CN107735781A (en) Store method and apparatus, the computing device of Query Result
US7774353B2 (en) Search templates
CN117708169A (en) Database query optimization method and device, electronic equipment and storage medium
US7680787B2 (en) Database query generation method and system
Wu et al. Answering XML queries using materialized views revisited
US7155446B2 (en) Performing recursive database operations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination