CN112199390B - Data query method, device, equipment and storage medium in database - Google Patents

Data query method, device, equipment and storage medium in database Download PDF

Info

Publication number
CN112199390B
CN112199390B CN202011065950.4A CN202011065950A CN112199390B CN 112199390 B CN112199390 B CN 112199390B CN 202011065950 A CN202011065950 A CN 202011065950A CN 112199390 B CN112199390 B CN 112199390B
Authority
CN
China
Prior art keywords
query
statement
data table
target data
query statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011065950.4A
Other languages
Chinese (zh)
Other versions
CN112199390A (en
Inventor
韩朱忠
朱仲颖
孟正凌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dameng Database Co Ltd filed Critical Shanghai Dameng Database Co Ltd
Priority to CN202011065950.4A priority Critical patent/CN112199390B/en
Publication of CN112199390A publication Critical patent/CN112199390A/en
Application granted granted Critical
Publication of CN112199390B publication Critical patent/CN112199390B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24564Applying rules; Deductive queries
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a data query method, a device, equipment and a storage medium in a database, wherein the method comprises the following steps: parsing the received query statement; when a window filtering condition exists in the query statement, determining the query quantity of data tables in a to-be-queried table set of the query statement; when the query number is within the preset number range, aiming at a target data table selected from a set of to-be-queried tables, optimizing the statement associated with the target data table in the query statement according to preset statement conversion conditions and statement conversion rules to obtain an optimized query statement, so that data query according to the optimized query statement is realized, the problem that when a plurality of data tables are queried, the number of the data tables exceeds the upper limit of the number of the tables of the set, the plurality of sets are required to be divided for query is solved, the data can be quickly queried without dividing the data tables into the plurality of sets, multiple cost estimation is not required, the processing complexity is reduced, the resources are saved, and the query speed is improved.

Description

Data query method, device, equipment and storage medium in database
Technical Field
The embodiment of the invention relates to the technical field of databases, in particular to a method, a device, equipment and a storage medium for data query in a database.
Background
When two or more tables are involved in a structured query language (Structured Query Language, SQL) query statement, a join query is required for those tables when the data query operation is performed in the database. The connection operator connects two tables at a time, and when N tables participate in the connection, the connection operation of the two tables is performed N-1 times. The order in which the tables participate in the connection is determined by the optimizer based on the cost estimate. As the number of tables involved in connections increases, the number of connection patterns that may be generated increases in a permutation and combination, and when a huge number of tables are involved in connections, the cost of wanting to analyze these connection patterns one by one necessarily consumes a great deal of resources.
At present, when a large number of data tables are processed to participate in the connection query, the tables are divided into a plurality of sets, the optimal connection mode of the sets is found in the analysis cost inside the sets, and then the sets are connected. The number of TABLES per set is controlled by the parameter max_opt_n_tabs. For example, when max_opt_n_tabs is set to 6, if a total of 15 TABLES participate in the connection, the 15 TABLES are divided into 6+6+3 sets, and the optimal connection mode is selected according to the cost estimation in each set, and then the three sets are connected.
However, this approach may create the following problems: when max_opt_n_tabs is set to 6, when the number of TABLES involved in connection is 7 and 8, the TABLES are divided into two sets of 6+1 and 6+2, an optimal connection mode is selected by cost estimation in each set, and then the sets are connected, so that the complexity of system processing is increased and system resources are wasted.
Disclosure of Invention
The invention provides a data query method, a device, equipment and a storage medium in a database, so as to realize quick data query, reduce data processing complexity and save resources.
In a first aspect, an embodiment of the present invention provides a method for querying data in a database, where the method for querying data in the database includes:
parsing the received query statement;
when a window filtering condition exists in the query statement, determining the query quantity of data tables in a to-be-queried table set of the query statement;
when the query quantity is within a preset quantity range, optimizing the sentences associated with the target data table in the query sentences according to preset sentence conversion conditions and sentence conversion rules aiming at the target data table selected from the to-be-queried table set to obtain optimized query sentences, so as to realize data query according to the optimized query sentences.
In a second aspect, an embodiment of the present invention further provides a data query device in a database, where the data query device in the database includes:
the analysis module is used for analyzing the received query statement;
the determining module is used for determining the query quantity of the data tables in the to-be-queried table set of the query statement when the window filtering condition exists in the query statement;
and the optimizing module is used for optimizing the sentences associated with the target data table in the query sentences according to the preset sentence conversion conditions and sentence conversion rules aiming at the target data table selected from the to-be-queried table set when the query number is within the preset number range so as to obtain optimized query sentences, so that data query according to the optimized query sentences is realized.
In a third aspect, an embodiment of the present invention further provides an apparatus, including:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement a method for querying data in a database according to any of the embodiments of the present invention.
In a fourth aspect, embodiments of the present invention further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for querying data in a database according to any of the embodiments of the present invention.
The embodiment of the invention provides a data query method, a device, equipment and a storage medium in a database, which are characterized in that received query sentences are analyzed; when a window filtering condition exists in the query statement, determining the query quantity of data tables in a to-be-queried table set of the query statement; when the query quantity is within the preset quantity range, aiming at the target data table selected from the query statement set, the statement related to the target data table in the query statement is optimized according to the preset statement conversion condition and statement conversion rule to obtain the optimized query statement, so that the data query is carried out according to the optimized query statement, the problem that the number of the data tables must be divided into a plurality of sets to carry out the query when the number of the data tables exceeds the upper limit of the number of the tables of the sets is solved, the statement related to the target data table in the query statement is optimized according to the preset statement conversion condition and statement conversion rule, the data query is carried out according to the optimized query statement, the quantity of the sets when the data table is divided is reduced, the quick query of the data is realized, the cost estimation times is reduced, the complexity of system processing is reduced, the system resource is saved, and the data query speed is improved.
Drawings
FIG. 1 is a flow chart of a method for data polling in a database according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a method for data polling in a database according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of a data query device in a database according to a third embodiment of the present invention;
fig. 4 is a schematic structural view of an apparatus according to a fourth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Example 1
Fig. 1 is a flowchart of a data query method in a database according to an embodiment of the present invention, where the method may be applied to a case of data query in a database, and the method may be performed by a data query device in the database, and specifically includes the following steps:
step S110, analyzing the received query statement.
In this embodiment, a query statement may be understood as a statement generated from a database query and programming language, for example, a select count (x) from t1, used to perform a data query.
When a user has a query requirement, namely, when the user wants to acquire information from a database, a query sentence is input into an application program interface (database software), and a database management system carries out grammar and semantic analysis on the query sentence.
Step S120, judging whether a window filtering condition exists in the query statement, if so, executing step S130; otherwise, step S160 is performed.
When a window filtering condition exists in the query statement, the query statement is possibly a target which needs to be optimized, and the subsequent steps are executed to further judge; if the window filtering condition does not exist in the query statement, the condition that the query statement does not have the query condition required by optimization is indicated, so that the query statement is not required to be optimized and is directly connected.
Illustratively, the query statement is "select count (x) from t1, t2, t3, t4, t5, t6, t7, t8; "when there is no where filtering condition, no optimization is required. The query statement is "select count (x) from t1, t2, t3, t4, t5, t6, t7, t8where t1. C1=t2.c1 and t 3.c1=8 and t4.c3=t5.c3 and t5.c2=t6.c3 and t7.c4=t8.c4 group by t4.c4; when' is, a where filtering condition exists, and a subsequent step needs to be executed, so that the optimization of the query statement is realized.
Step S130, determining the query quantity of the data tables in the set of the tables to be queried of the query statement.
In this embodiment, the set of tables to be queried may be understood as a set formed by each data table to be queried for data determined from a query statement; the number of queries may be understood as the number of data tables contained in the set of tables to be queried.
Illustratively, the query statement is "select count (x) from t1, t2, t3, t4, t5, t6, t7, t8where t1. C1=t2.c1 and t 3.c1=8 and t 4.c3=t5.c3 and t 5.c2=t6.c3 and t 7.c4=t8.c4 group by t4.c4; the set of the to-be-queried tables is a set formed by t1, t2, t3, t4, t5, t6, t7 and t8 in the from item; the number of queries was 8.
Step S140, judging whether the query quantity is within a preset quantity range, if so, executing step S150; otherwise, step S160 is performed.
In this embodiment, the number range may be understood as a preset number range, and may be determined according to an upper limit of a data table in a set when a plurality of data tables are queried to form the set, for example, 7-8, 11-12, more than 6, etc.
The prior art, when querying a large number of data tables, divides them into multiple sets, with an upper limit on the number of data tables in each set. That is, if there are 7 data tables in each set, the data query is performed in the form of dividing the set, and in order to perform the data query without dividing the set, an optimization target may be set first, where the optimization target is generally an integer multiple of the number of data tables in each set, for example, when the upper limit of the number of data tables in the set is 6, the optimization target may be set to 6,12 or 18. When the optimization target is 6, the number range to be optimized can be set to be larger than 6, at the moment, the optimization can be performed when the number of the data tables exceeds 6, and the attempt is made to reduce the number of the data tables until the number of the data tables is smaller than or equal to 6, so that the optimization of query sentences is directly performed to perform data query, and the data query is prevented from being performed in a set division mode. If the optimization target cannot be achieved, the query is performed according to the original query statement. When the number of tables is far greater than the number of optimization targets, the possibility of reaching the expected targets by optimization is low, if the target is not reached at last, but the cost is increased due to the fact that the redundant optimization process is performed, the number range can be set to be [7,8], when the number of data tables exceeds 8, the optimization of query sentences is not attempted, and only when the number of data tables is 7 or 8, the optimization is performed, and the number of data tables is attempted to be reduced until the target number of 6 is reached, so that data query is performed. The advantage of this upper limit is that this unnecessary optimization can be effectively reduced when the number of data tables exceeds 8. Or, setting the optimization target as a set {6,12,18}, that is, the number of queries should reach a certain value in the set of the optimization target during optimization, and setting the corresponding number range as [7,11] [13,17] [19,23] according to the optimization target, that is, the number of data tables in the query statement should be 6,12 or 18 after optimization; alternatively, the number range is set to be [7,8] U.S. 13,14] U.S. 19,20], and the number of optimized data tables should be 6,12 or 18, and in this case, there may be a case that no optimization is performed, for example, the number of queries is 17.
Step S150, optimizing the sentences associated with the target data table in the query sentences according to preset sentence conversion conditions and sentence conversion rules aiming at the target data table selected from the set of the to-be-queried tables to obtain optimized query sentences, so as to realize data query according to the optimized query sentences.
In this embodiment, the target data table may be understood as a data table that needs to be optimized when the query number of the data table is within the preset range of the number to be optimized, and the target data table is not included in the optimized data table to be queried, where the data table to be queried may be any data table in the set of the data tables to be queried. Statement conversion conditions can be understood as conditions under which the target data table can be converted and optimized, namely, when the target data table meets the statement conversion conditions, the statement associated with the target data table in the query statement can be optimized; the term conversion rule may be understood as a rule that converts a term associated with a target data table to obtain a corresponding rewritten term. For example, the sentence conversion rule may be to rewrite tm.cp=tn.cq in the where Tm represents another table different from Tn to tm.cp in (select Cq from Tn), cp represents any one column of Tm, and Cq represents any one column of Tn.
And selecting any one data table from the to-be-queried set as a target data table, judging whether the target data table meets statement conversion conditions, and if so, converting the statement associated with the target data table in the query statement according to the statement conversion rules to optimize the query statement, thereby obtaining an optimized query statement, and conveniently performing data query according to the optimized query statement during subsequent query. The target data table is reduced in the set of the to-be-queried tables of the optimized query statement, so that the query number of the data tables in the set of the to-be-queried tables of the optimized query statement is judged during re-query, and if the query number is still within the range of the number, one target data table is selected again for optimizing the query statement. Or after the optimization, determining the query quantity of the data tables in the set of the to-be-queried tables of the optimized query statement, and judging whether the optimization needs to be continued or not, wherein the optimization mode is the same as that described above.
Step S160, the operation is ended.
The embodiment of the invention provides a data query method in a database, which is implemented by analyzing received query sentences; when a window filtering condition exists in the query statement, determining the query quantity of data tables in a to-be-queried table set of the query statement; when the query quantity is within a preset quantity range, aiming at a target data table selected from the query statement set, optimizing the statement related to the target data table in the query statement according to preset statement conversion conditions and statement conversion rules to obtain an optimized query statement, so that data query is performed according to the optimized query statement, the problem that the number of the data tables must be divided into a plurality of sets to perform query when the number of the data tables exceeds the upper limit of the number of the tables of the sets is solved, the statement related to the target data table in the query statement is optimized according to the preset statement conversion conditions and statement conversion rules, the number of the data tables involved in connection is reduced according to the optimized query statement, the number of the sets when the data tables are divided is reduced, the quick query of the data is realized, the cost estimation times are reduced, the complexity of system processing is reduced, the system resources are saved, and the data query speed is improved.
Example two
Fig. 2 is a flowchart of a data query method in a database according to a second embodiment of the present invention. The technical scheme of the embodiment is further refined on the basis of the technical scheme, and specifically mainly comprises the following steps:
step S210, analyzing the received query statement.
Step S220, judging whether a window filtering condition exists in the query statement, if so, executing step S230; otherwise, step S280 is performed.
Step S230, determining the query quantity of the data tables in the set of the tables to be queried of the query statement.
Step S240, judging whether the query quantity is within a preset quantity range, if so, executing step S250; otherwise, step S280 is performed.
Step S250, judging whether the target data table meets statement conversion conditions or not according to the target data table selected from the to-be-queried table set, if yes, executing step S260; otherwise, step S280 is performed.
In the present embodiment, the sentence conversion conditions include: the number of the window filtering condition expressions associated with the target data table is one, the target data table only exists in the set of tables to be queried and the window filtering condition, the window filtering condition expressions associated with the target data table are equivalent condition expressions of columns between two data tables, at least one column of the window filtering condition expressions associated with the target data table has unique constraint, and the window filtering condition expressions associated with the target data table and the rest of the window filtering condition expressions in the window filtering condition are connected in series through logic and connectors.
Illustratively, the query statement takes "select count (x) from t1, t2, t3, t4, t5, t6, t7, t8where t1. C1=t2.c1 and t 3.c1=8 and t 4.c3=t5.c3 and t 5.c2=t6.c3 and t7.c4=t8.c4 group by t4.c4" as an example, the number range is greater than 6, the target data table takes t1 as an example, and the columns of the data table involved all have unique constraints to illustrate how to determine whether the target data table satisfies the statement conversion condition: t1 occurs only in the where filtering condition, except from term (set of look-up tables); the sphere filtering conditional expression associated with t1 is only t1.c1=t2.c1, and the condition that the number is one is satisfied; the where filter conditional expression of t1 is t1.c1=t2.c1, i.e. the equivalent conditional expression of the column between two data tables (in the form of tm.cp=tn.cq); the column c1 of the data table t1 and the column c1 of the data table t2 in the where filtering conditional expression of t1 have unique constraints; the other of the other where filter condition expressions (t3.c1=8, t4.c3=t5.c3, t5.c2=t6.c3, t7.c4=t8.c4) of the where filter condition expression t1 t1.c1=t2.c1 and where the remainder of the where filter condition expressions are all logically concatenated with the connector (and), i.e. all where the where filter condition expressions in the where filter condition are all concatenated by and; t1 satisfies all the conditions, namely, satisfies the sentence conversion condition, and sentence conversion can be performed.
And step S260, determining an optimized sub-query statement according to the statement conversion rule and the sphere filtering conditional expression associated with the target data table.
In this embodiment, the optimizing sub-query statement may be understood as a query statement obtained by optimizing a sphere filtering conditional expression of the target data table according to a statement conversion rule.
Illustratively, the statement transformation rules are: converting tm.cp=tn.cq to tm.cp in (select Cq from Tn), the sphere filter condition expression of the target data table is: c1=t2.c1, and the optimized sub-query statement obtained by optimizing (converting) is: t1.c1 in (select c1 from t 2).
Further, determining an optimized sub-query statement according to the statement transformation rule and the sphere filtering conditional expression associated with the target data table may be implemented by:
when only one column of the data table exists in the sphere filtering conditional expression associated with the target data table and has the unique constraint, converting the column of the data table with the unique constraint into an in sub-query statement, and forming an optimized sub-query statement according to the in sub-query statement and the columns of the rest data tables.
When forming the optimized sub-query statement, the sphere filtering conditional expression associated with the target data table is an equivalent conditional expression of a column between two data tables, the columns of one data table are arranged around the equation of the equivalent conditional expression, when tm.cp=tn.cq is converted into tm.cp in (select Cq from Tn) by optimizing the query statement, the unique constraint is required on Cq, and since the two sides of the equation of the equivalent conditional expression can be interchanged, the columns of the data table (i.e. tn.cq) converted into the in sub-query statement are ensured to have the unique constraint.
Illustratively, the query statement takes "select count (x) from t1, t2, t3, t4, t5, t6, t7, t8where t1. C1=t2.c1 and t 3.c1=8 and t 4.c3=t5.c3 and t 5.c2=t6.c3 and t 7.c4=t8.c4 group by t4.c4" as an example, the target data table takes t1 as an example, t1.c1 has an unique constraint, and t2.c1 does not have an unique constraint. The in sub-query statement is: select c1 from t1; the optimizing sub-query statement is: c1 in (select c1 from t 1).
Further, determining an optimized sub-query statement according to the statement transformation rule and the sphere filtering conditional expression associated with the target data table may be implemented by:
when columns of two data tables in the sphere filtering conditional expression associated with the target data table have unique constraint, converting the column of any one data table into an in sub-query statement, and forming an optimized sub-query statement according to the in sub-query statement and the columns of the rest data tables.
Illustratively, the query statement takes "select count (x) from t1, t2, t3, t4, t5, t6, t7, t8where t1. C1=t2.c1 and t 3.c1=8 and t 4.c3=t5.c3 and t 5.c2=t6.c3 and t7.c4=t8.c4 group by t4.c4" as an example, and the target data table takes t1 as an example, with t1.c1, t2.c1 each having an unique constraint. The in sub-query statement is: select c1 from t1, optimize sub-query statement as: c1 in (select c1 from t 1); alternatively, the in sub-query statement is: select c1 from t2; the optimizing sub-query statement is: t1.c1 in (select c1 from t 2).
Step S270, removing a target data table in a to-be-queried table set of the query statement, and replacing a sphere filtering condition expression associated with the target data table in the query statement with the optimized sub-query statement to form an optimized query statement.
The optimized query statement is: select count from t2, t3, t4, t5, t6, t7, t8where t2.C1 in (select c1 from t 1) and t3.c1=8 and t4.c3=t5.c3 and t5.c2=t6.c3 and t7.c4=t8.c4 group by t4.c4.
Step S280, the operation is ended.
After the optimized query statement is formed, the optimization is completed, the operation is finished, the number of queries of the data tables in the set of to-be-queried tables (from items) after the optimization is 7, the number of queries is still greater than 6, and other data tables are optimized by adopting the same method continuously. Exemplary description of optimization of other data tables: the t 2-related where filter conditional expression has been rewritten into the form of an in sub-query and cannot be optimized any more. t3 does not have equivalent conditions like tm.cp=tn.cq and cannot be optimized. t4 appears in the group by in addition to the where filtering condition, and cannot be optimized. There are two equivalent expressions in the where filtering conditions that involve t5, nor is t5 optimized. t6 meets the statement conversion condition, and can be optimized, and the query statement is continuously optimized as follows: select count from t2, t3, t4, t5, t7, t8where t2.C1 in (select c1 from t 1) and t3.c1=8 and t4.c3=t5.c3 and t5.c2 in (select c3 from t 6) and t7.c4=t8.c4 group by t4.c4. At this time, only 6 data tables are searched in the to-be-searched table set (from item), and the optimization is completed without considering t7 and t8. Generally, if the original data table cannot be optimized to the target number finally, the original query statement is adopted for query.
For example, the upper limit of the number of data tables in each set is 6, the optimization target can be set to be the set {6,12,18}, the number of data tables to be optimized can be set to be in the range of [7,8 ]. U.S. 13,14 ]. U.S. 19,20], the number of data tables in the set to be queried is 13, and 13 tables in the prior art must be divided into three sets of 6+6+1 according to the threshold 6. The 12 data tables can be divided into 6+6 sets, so that the purpose of reducing the sets is achieved.
Exemplary, the embodiment of the invention provides a verification method, which verifies that the optimization does not change the query result set in theory: taking the general form of a query statement as "select sel_item from Tn, tm, t_other where tn.cp=tm.cq and other_condition" as an example, where sel_item represents a list of query terms and all query terms are independent of Tn, t_other is the product of all Other tables except Tn, tm, and other_condition is Other filtering condition independent of Tn. For example, tn is T1, tm is T2, T_other is T3, T4, T5, T6, T7, T8.
The query statement is filtered for tn.cp=tm.cq and Other condition conditions on tn×tm×t_other, and finally projection of sel_item is performed. Since the Other condition is independent of Tn, the filtering of the Other condition can be performed on tm×t_other first, and the intermediate result set obtained here is denoted as t_temp, so that the original sentence becomes the filtering of tn.cp=tm.cq on tn×t_temp, that is, tn and t_temp are equivalently connected on the condition of tn.cp=tm.cq. Also, since the selection list (sel_item) is independent of Tn, the final projection will discard all other columns on the Tn table, leaving only the Tn.Cp column, which is equivalent to (select Cp from Tn). This connection then corresponds to the retention of only the items in T_temp for which all Tm.Cq and any Tn.Cp have the same value, and thus corresponds to the selection sel_item from T_temp window Tm.Cq in (select Cp from Tn).
It should be noted that if Tn.Cp has a repetition value, the result of the concatenation will also be repeated, and this is not completely equivalent before and after overwriting. Therefore, when performing statement conversion (converting into in sub-query statement), at least one column of the data table in the where the statement conversion condition includes the where filtering condition expression of the target data table has a unique constraint, i.e. a unique constraint on tn.cp.
The embodiment of the invention provides a data query method in a database, which is implemented by analyzing received query sentences; when a window filtering condition exists in the query statement, determining the query quantity of data tables in a to-be-queried table set of the query statement; when the query quantity is within the preset quantity range, aiming at the target data table selected from the query statement set, the statement related to the target data table in the query statement is optimized according to the preset statement conversion condition and statement conversion rule to obtain the optimized query statement, so that the data query is carried out according to the optimized query statement, the problem that the number of the data tables is required to be divided into a plurality of sets to carry out the query when the number of the data tables exceeds the upper limit of the number of the tables of the sets is solved, the statement related to the target data table in the query statement is optimized according to the preset statement conversion condition and statement conversion rule, the data query is carried out according to the optimized query statement, the number of the data tables involved in connection is reduced, the number of the sets when the data tables are divided is reduced, the quick query of the data is realized, the cost estimation times are reduced, the complexity of system processing is reduced, the system resource is saved, the number of the connected data tables in the statement is reduced on the premise that the query result is not changed, and the data query speed is improved.
Example III
Fig. 3 is a schematic structural diagram of a data query device in a database according to a third embodiment of the present invention, where the device includes: a parsing module 31, a determining module 32 and an optimizing module 33.
Wherein, the parsing module 31 is configured to parse the received query sentence; a determining module 32, configured to determine, when a where filtering condition exists in the query statement, a query number of data tables in a set of tables to be queried of the query statement; and the optimizing module 33 is configured to optimize, when the number of queries is within a preset number range, a statement associated with a target data table in the query statement according to a preset statement conversion condition and a statement conversion rule, for a target data table selected from the set of tables to be queried, so as to obtain an optimized query statement, so as to implement data query according to the optimized query statement.
The embodiment of the invention provides a data query device in a database, which solves the problem that when a plurality of data tables are queried, the number of the data tables exceeds the upper limit of the number of the tables of a set and must be divided into a plurality of sets for query, optimizes the statement related to the target data table in the query statement according to the preset statement conversion condition and statement conversion rule, queries the data according to the optimized query statement, reduces the number of the data tables participating in connection, reduces the number of the sets when the data tables are divided into sets, realizes quick query of data, reduces cost estimation times, reduces the complexity of system processing, saves system resources and improves the data query speed.
Further, the sentence conversion condition includes: the number of the window filtering condition expressions associated with the target data table is one, the target data table only exists in a set of tables to be queried and the window filtering condition, the window filtering condition expressions associated with the target data table are equivalent condition expressions of columns between two data tables, at least one column of the window filtering condition expressions associated with the target data table has unique constraint, and the window filtering condition expressions associated with the target data table and the rest of the window filtering condition expressions in the window filtering condition are connected in series through logic and connectors.
Further, the optimizing module 33 includes:
the statement determining unit is used for determining an optimized sub-query statement according to the statement conversion rule and a sphere filtering condition expression associated with the target data table when the target data table meets the statement conversion condition;
and the replacing unit is used for removing the target data table in the to-be-queried table set of the query statement, and replacing the sphere filtering conditional expression associated with the target data table in the query statement with the optimized sub-query statement to form an optimized query statement.
Further, the sentence determining unit is specifically configured to: when only one column of the data table exists in the sphere filtering conditional expression associated with the target data table and has the unique constraint, converting the column of the data table with the unique constraint into an in sub-query statement, and forming an optimized sub-query statement according to the in sub-query statement and the columns of the rest data tables.
Further, the sentence determining unit is specifically configured to: when columns of two data tables in the sphere filtering conditional expression associated with the target data table have unique constraint, converting the column of any one data table into an in sub-query statement, and forming an optimized sub-query statement according to the in sub-query statement and the columns of the rest data tables.
The data query device in the database provided by the embodiment of the invention can execute the data query method in the database provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 4 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention, and as shown in fig. 4, the apparatus includes a processor 40, a memory 41, an input device 42 and an output device 43; the number of processors 40 in the device may be one or more, one processor 40 being taken as an example in fig. 4; the processor 40, the memory 41, the input means 42 and the output means 43 in the device may be connected by a bus or other means, in fig. 4 by way of example.
The memory 41 is used as a computer readable storage medium for storing a software program, a computer executable program, and modules, such as program instructions/modules corresponding to the data query method in the database (e.g., the parsing module 31, the determining module 32, and the optimizing module 33 in the data query device in the database) in the embodiment of the present invention. The processor 40 executes various functional applications of the device and data processing, i.e., implements the data query method in the database described above, by running software programs, instructions and modules stored in the memory 41.
The memory 41 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the terminal, etc. In addition, memory 41 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 41 may further include memory located remotely from processor 40, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 42 may be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output means 43 may comprise a display device such as a display screen.
Example five
A fifth embodiment of the present invention also provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are for performing a method of data querying in a database, the method comprising:
parsing the received query statement;
when a window filtering condition exists in the query statement, determining the query quantity of data tables in a to-be-queried table set of the query statement;
when the query quantity is within a preset quantity range, optimizing the sentences associated with the target data table in the query sentences according to preset sentence conversion conditions and sentence conversion rules aiming at the target data table selected from the to-be-queried table set to obtain optimized query sentences, so as to realize data query according to the optimized query sentences.
Of course, the storage medium containing the computer executable instructions provided in the embodiments of the present invention is not limited to the method operations described above, and may also perform the related operations in the data query method in the database provided in any embodiment of the present invention.
From the above description of embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software and necessary general purpose hardware, but of course also by means of hardware, although in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, etc., and include several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments of the present invention.
It should be noted that, in the embodiment of the data query device in the database, each unit and module included are only divided according to the functional logic, but not limited to the above division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (8)

1. A method for querying data in a database, comprising:
parsing the received query statement;
when a window filtering condition exists in the query statement, determining the query quantity of data tables in a to-be-queried table set of the query statement;
when the query quantity is within a preset quantity range, optimizing sentences associated with the target data table in the query sentences according to preset sentence conversion conditions and sentence conversion rules aiming at the target data table selected from the to-be-queried table set to obtain optimized query sentences so as to realize data query according to the optimized query sentences;
the optimizing the statement associated with the target data table in the query statement according to the preset statement conversion condition and the statement conversion rule to obtain the optimized query statement comprises the following steps:
when the target data table meets the statement conversion conditions, determining an optimized sub-query statement according to the statement conversion rules and a sphere filtering condition expression associated with the target data table;
and removing a target data table in the to-be-queried table set of the query statement, and replacing a sphere filtering condition expression associated with the target data table in the query statement with the optimized sub-query statement to form an optimized query statement.
2. The method of claim 1, wherein the statement transformation condition comprises: the number of the window filtering condition expressions associated with the target data table is one, the target data table only exists in a set of tables to be queried and the window filtering condition, the window filtering condition expressions associated with the target data table are equivalent condition expressions of columns between two data tables, at least one column of the window filtering condition expressions associated with the target data table has unique constraint, and the window filtering condition expressions associated with the target data table and the rest of the window filtering condition expressions in the window filtering condition are connected in series through logic and connectors.
3. The method of claim 1, wherein the determining an optimized sub-query statement based on the statement transformation rule and a sphere filtering conditional expression associated with the target data table comprises:
when only one column of the data table exists in the sphere filtering conditional expression associated with the target data table and has the unique constraint, converting the column of the data table with the unique constraint into an in sub-query statement, and forming an optimized sub-query statement according to the in sub-query statement and the columns of the rest data tables.
4. The method of claim 1, wherein the determining an optimized sub-query statement based on the statement transformation rule and a sphere filtering conditional expression associated with the target data table comprises:
when columns of two data tables in the sphere filtering conditional expression associated with the target data table have unique constraint, converting the column of any one data table into an in sub-query statement, and forming an optimized sub-query statement according to the in sub-query statement and the columns of the rest data tables.
5. A data querying device in a database, comprising:
the analysis module is used for analyzing the received query statement;
the determining module is used for determining the query quantity of the data tables in the to-be-queried table set of the query statement when the window filtering condition exists in the query statement;
the optimizing module is used for optimizing the sentences associated with the target data table in the query sentences according to preset sentence conversion conditions and sentence conversion rules aiming at the target data table selected from the to-be-queried table set when the query number is within a preset number range so as to obtain optimized query sentences, so that data query according to the optimized query sentences is realized;
the optimization module comprises:
the statement determining unit is used for determining an optimized sub-query statement according to the statement conversion rule and a sphere filtering condition expression associated with the target data table when the target data table meets the statement conversion condition;
and the replacing unit is used for removing the target data table in the to-be-queried table set of the query statement, and replacing the sphere filtering conditional expression associated with the target data table in the query statement with the optimized sub-query statement to form an optimized query statement.
6. The apparatus of claim 5, wherein the sentence conversion condition comprises: the number of the window filtering condition expressions associated with the target data table is one, the target data table only exists in a set of tables to be queried and the window filtering condition, the window filtering condition expressions associated with the target data table are equivalent condition expressions of columns between two data tables, at least one column of the window filtering condition expressions associated with the target data table has unique constraint, and the window filtering condition expressions associated with the target data table and the rest of the window filtering condition expressions in the window filtering condition are connected in series through logic and connectors.
7. An apparatus, the apparatus comprising:
one or more processors;
storage means for storing one or more programs,
when executed by the one or more processors, causes the one or more processors to implement a method of data querying in a database as claimed in any of claims 1 to 4.
8. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements a method of data querying in a database according to any of claims 1-4.
CN202011065950.4A 2020-09-30 2020-09-30 Data query method, device, equipment and storage medium in database Active CN112199390B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011065950.4A CN112199390B (en) 2020-09-30 2020-09-30 Data query method, device, equipment and storage medium in database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011065950.4A CN112199390B (en) 2020-09-30 2020-09-30 Data query method, device, equipment and storage medium in database

Publications (2)

Publication Number Publication Date
CN112199390A CN112199390A (en) 2021-01-08
CN112199390B true CN112199390B (en) 2023-05-30

Family

ID=74013673

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011065950.4A Active CN112199390B (en) 2020-09-30 2020-09-30 Data query method, device, equipment and storage medium in database

Country Status (1)

Country Link
CN (1) CN112199390B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010596B (en) * 2021-03-19 2024-02-23 上海达梦数据库有限公司 Method, device, equipment and storage medium for constructing dynamic performance view
CN114780554B (en) * 2022-06-22 2023-04-18 北京奥星贝斯科技有限公司 Method and device for processing database query statement

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599052A (en) * 2016-11-15 2017-04-26 上海跬智信息技术有限公司 Data query system based on ApacheKylin, and method thereof
CN106933893A (en) * 2015-12-31 2017-07-07 北京国双科技有限公司 The querying method and device of multi-dimensional data
CN108241692A (en) * 2016-12-26 2018-07-03 北京国双科技有限公司 The querying method and device of data
CN109791543A (en) * 2016-09-30 2019-05-21 华为技术有限公司 Execute the control method and corresponding intrument of multi-table join operation
CN109947804A (en) * 2019-03-20 2019-06-28 上海达梦数据库有限公司 Optimization method, device, server and the storage medium of data acquisition system inquiry
CN110399377A (en) * 2019-08-30 2019-11-01 北京东软望海科技有限公司 Optimization method, device, electronic equipment and the computer readable storage medium of SQL
CN110515973A (en) * 2019-08-30 2019-11-29 上海达梦数据库有限公司 A kind of optimization method of data query, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9275103B2 (en) * 2013-02-28 2016-03-01 International Business Machines Corporation Optimization of JOIN queries for related data
US10838981B2 (en) * 2017-08-30 2020-11-17 Sap Se Database optimization using special partition specifications for replicas

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106933893A (en) * 2015-12-31 2017-07-07 北京国双科技有限公司 The querying method and device of multi-dimensional data
CN109791543A (en) * 2016-09-30 2019-05-21 华为技术有限公司 Execute the control method and corresponding intrument of multi-table join operation
CN106599052A (en) * 2016-11-15 2017-04-26 上海跬智信息技术有限公司 Data query system based on ApacheKylin, and method thereof
CN108241692A (en) * 2016-12-26 2018-07-03 北京国双科技有限公司 The querying method and device of data
CN109947804A (en) * 2019-03-20 2019-06-28 上海达梦数据库有限公司 Optimization method, device, server and the storage medium of data acquisition system inquiry
CN110399377A (en) * 2019-08-30 2019-11-01 北京东软望海科技有限公司 Optimization method, device, electronic equipment and the computer readable storage medium of SQL
CN110515973A (en) * 2019-08-30 2019-11-29 上海达梦数据库有限公司 A kind of optimization method of data query, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
数据仓库中的一种提高多表连接效率的有效方法;文娟等;《计算机研究与发展》;20051128(第11期);第182-189页 *
汇总型多表连接查询的一种优化方法;杨武等;《计算机系统应用》;20000105(第01期);第 60-61页 *

Also Published As

Publication number Publication date
CN112199390A (en) 2021-01-08

Similar Documents

Publication Publication Date Title
CN110515973B (en) Data query optimization method, device, equipment and storage medium
US8396852B2 (en) Evaluating execution plan changes after a wakeup threshold time
US8965918B2 (en) Decomposed query conditions
CN112199390B (en) Data query method, device, equipment and storage medium in database
CN109508339B (en) Data query method and device, terminal equipment and storage medium
CN110968593B (en) Database SQL statement optimization method, device, equipment and storage medium
EP3385864B1 (en) Method and device for establishing index
CN110502532B (en) Method, device, equipment and storage medium for optimizing remote database object
CN110362593B (en) Data query method, device, equipment and storage medium
CN109299101B (en) Data retrieval method, device, server and storage medium
WO2021259217A1 (en) Data association query method and apparatus, and device and storage medium
CN111309753B (en) Optimization method, device, equipment and storage medium of structured query statement
CN112732752A (en) Query statement optimization method, device, equipment and storage medium
CN110704472A (en) Data query statistical method and device
CN112100198B (en) Database SQL statement optimization method, device, equipment and storage medium
CN108959571B (en) SQL statement operation method and device, terminal equipment and storage medium
CN107679107B (en) Graph database-based power grid equipment reachability query method and system
WO2022253131A1 (en) Data parsing method and apparatus, computer device, and storage medium
CN110162574B (en) Method and device for determining data redistribution mode, server and storage medium
CN115586953A (en) Hive-based task concurrent execution method and related device
CN110895529B (en) Processing method of structured query language and related device
CN113535756B (en) Data query method, device, equipment and storage medium
CN111159228A (en) Data query method and device
CN113076330A (en) Query processing method and device, database system, electronic equipment and storage medium
CN110263055B (en) Parameter prompting method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant