CN115878654A - Data query method, device, equipment and storage medium - Google Patents

Publication number
CN115878654A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211327887.6A
Other languages
Chinese (zh)
Inventor
刘琦帆
王桃
王国平
朱涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Oceanbase Technology Co Ltd
Original Assignee
Beijing Oceanbase Technology Co Ltd

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present specification discloses a method, an apparatus, a device, and a storage medium for data query. According to the operation type of each set operation contained in a data query statement input by a user and the filtering conditions contained in that statement, query branches that have no practical significance can be screened out and removed, so that the data query statement is optimized and the efficiency with which the database executes it to query data is improved.

Description

Data query method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for querying data.
Background
With the development of database technology, the storage and reading modes of data in a database gradually develop towards diversification, and data set operation is widely used as a common data operation, where the set operation refers to performing intersection, union, difference and other operations on query results of different data query statements.
However, a set operation may contain query branches whose query result sets are empty, and the database still has to execute these branches when it performs the set operation, which reduces its execution efficiency. For example, in the data query statement "select * from t1 where 1=2 union all select * from t2;", "select * from t1 where 1=2" and "select * from t2" are two query branches, and the set operation takes the union of the result sets of the two branches. When the database executes the first query branch, it traverses the data in table t1 and looks for rows that satisfy the query condition "where 1=2". The condition 1=2 obviously never holds, so the result set of the first query branch is an empty set, yet the database still performs the unnecessary query operation (i.e., traverses table t1), which reduces its execution efficiency.
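The effect described above can be reproduced with a minimal sketch. SQLite (through Python's standard sqlite3 module) is used here only as a convenient stand-in and is not the database discussed in this specification; the table contents are hypothetical.

```python
import sqlite3

# In-memory database with two small hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1 (id integer);
    create table t2 (id integer);
    insert into t1 values (1), (2);
    insert into t2 values (3), (4);
""")

# The first branch alone: its condition can never hold.
first_branch = conn.execute("select * from t1 where 1=2").fetchall()

# The full statement and the second branch alone.
full = conn.execute(
    "select * from t1 where 1=2 union all select * from t2"
).fetchall()
second_only = conn.execute("select * from t2").fetchall()
```

Because the first branch contributes no rows, the full statement and the second branch alone return the same rows, which is why the first branch can be dropped without changing the result.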
Disclosure of Invention
The present specification provides a method, an apparatus, a device and a storage medium for data query, so as to partially solve the problems in the prior art.
The technical scheme adopted by the specification is as follows:
the present specification provides a method of data query, comprising:
acquiring a data query statement;
analyzing the data query statement to determine a set operation involved in the data query statement and each query branch involved in the set operation, wherein different query branches are used for obtaining different query result sets;
determining abnormal query branches from the query branches according to filtering conditions contained in each query branch, wherein the filtering conditions are used for filtering the query results of the query branches;
determining each target branch from each query branch according to the operation type corresponding to the set operation and the abnormal query branch;
and generating an execution plan aiming at the data query statement according to other query branches except the target branch in each query branch, and performing data query according to the execution plan.
Optionally, determining a set operation involved in the data query statement and query branches involved in the set operation specifically includes:
determining each sentence keyword contained in the data query sentence;
determining the set operation related in the data query statement according to the statement keywords;
and determining each query branch involved in the set operation according to the position of the statement keyword corresponding to the set operation in the data query statement.
Optionally, determining abnormal query branches from the query branches according to the filtering conditions contained in each query branch specifically includes:
aiming at each query branch, analyzing the filtering condition contained in the query branch, and judging whether the filtering condition contained in the query branch is an abnormal filtering condition or not;
if yes, determining the query branch as an abnormal query branch.
Optionally, determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch, specifically including:
for each set operation, if the set operation belongs to a first type of set operation, when the number of abnormal query branches involved in the set operation is equal to the number of query branches involved in the set operation, arbitrarily selecting one abnormal query branch from the abnormal query branches involved in the set operation to retain, and using the remaining abnormal query branches involved in the set operation as target branches, wherein the first type of set operation is used for taking a union of the query result sets respectively obtained through the query branches.
Optionally, determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch, specifically including:
for each set operation, if the set operation belongs to a first type of set operation, when the number of abnormal query branches involved in the set operation is less than the number of query branches involved in the set operation, using the abnormal query branches involved in the set operation as target branches, wherein the first type of set operation is used for taking a union of the query result sets respectively obtained through the query branches.
Optionally, determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch, specifically including:
for each set operation, if the set operation belongs to a second type of set operation, determining a main query branch and a secondary query branch from the query branches related to the set operation, wherein the second type of set operation is used for taking a difference set between query result sets respectively obtained through the query branches;
and if the query branches involved in the set operation include an abnormal query branch, taking the secondary query branch as the target branch.
Optionally, determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch, specifically including:
for each set operation, if the set operation belongs to a third type of set operation, arbitrarily selecting one abnormal query branch from the abnormal query branches involved in the set operation to retain, and using the remaining abnormal query branches involved in the set operation as target branches, wherein the third type of set operation is used for taking an intersection of the query result sets respectively obtained through the query branches.
The present specification provides an apparatus for data query, comprising:
the acquisition module is used for acquiring a data query statement;
the analysis module is used for analyzing the data query statement to determine the set operation involved in the data query statement and all query branches involved in the set operation, wherein different query branches are used for obtaining different query result sets;
the first determining module is used for determining abnormal query branches from the query branches according to filtering conditions contained in each query branch, and the filtering conditions are used for filtering query results of the query branches;
a second determining module, configured to determine, according to the operation type corresponding to the set operation and the abnormal query branch, each target branch from the query branches;
and the execution module is used for generating an execution plan aiming at the data query statement according to other query branches except the target branch in each query branch, and carrying out data query according to the execution plan.
Optionally, the parsing module is specifically configured to determine each statement keyword included in the data query statement; determining set operation related to the data query statement according to the statement keywords; and determining each query branch involved in the set operation according to the position of the statement keyword corresponding to the set operation in the data query statement.
Optionally, the first determining module is specifically configured to, for each query branch, analyze the filtering condition included in the query branch, and determine whether the filtering condition included in the query branch is an abnormal filtering condition; if yes, determining the query branch as an abnormal query branch.
Optionally, the second determining module is specifically configured to, for each set operation, if the set operation belongs to a set operation of a first type, arbitrarily select one abnormal query branch to retain from the abnormal query branches involved in the set operation when the number of the abnormal query branches involved in the set operation is equal to the number of the query branches involved in the set operation, and use the remaining abnormal query branches involved in the set operation as target branches, where the set operation of the first type is used to take a union of the query result sets respectively obtained through the query branches.
Optionally, the second determining module is specifically configured to, for each set operation, if the set operation belongs to a set operation of a first type, take, as a target branch, each abnormal query branch involved in the set operation when the number of each abnormal query branch involved in the set operation is smaller than the number of each query branch involved in the set operation, where the set operation of the first type is used to take a union between query result sets obtained by each query branch respectively.
Optionally, the second determining module is specifically configured to, for each set operation, determine, if the set operation belongs to a set operation of a second type, a main query branch and a secondary query branch from query branches involved in the set operation, where the set operation of the second type is used to obtain a difference set between query result sets obtained through the query branches, respectively; and if each query branch related to the set operation contains an abnormal query branch, taking the secondary query branch as the target branch.
Optionally, the second determining module is specifically configured to, for each set operation, if the set operation belongs to a set operation of a third type, arbitrarily select one abnormal query branch from the abnormal query branches related to the set operation to be reserved, and use the remaining abnormal query branches related to the set operation as target branches, where the set operation of the third type is used to take intersections between query result sets obtained through the query branches, respectively.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described method of data query.
The present specification provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the above data query method when executing the program.
The technical scheme adopted by the specification can achieve the following beneficial effects:
the data query method provided by the present specification includes first obtaining a data query statement, parsing the data query statement to determine a set operation related in the data query statement and query branches related in the set operation, where different query branches are used to obtain different query result sets, determining an abnormal query branch from the query branches according to a filtering condition included in each query branch, where the filtering condition is used to filter query results of the query branches, determining a target branch from the query branches according to an operation type corresponding to the set operation and the abnormal query branch, generating an execution plan for the data query statement according to other query branches except the target branch in the query branches, and performing data query according to the execution plan.
It can be seen from the above method that query branches that are not of practical significance in the data query statement can be screened and removed according to the operation type of each set operation included in the data query statement input by the user and the filtering condition included in the data query statement input by the user, so that the data query statement can be optimized, and the efficiency of the database for executing the data query statement to query data can be improved.
Drawings
The accompanying drawings are included to provide a further understanding of the specification and constitute a part of it. The illustrative embodiments of the specification and their description serve to explain the specification and do not unduly limit it. In the drawings:
FIG. 1 is a flow chart illustrating a method for querying data provided herein;
FIG. 2 is a schematic diagram of a collective operation provided herein;
FIG. 3 is a schematic diagram of an apparatus for querying data provided herein;
FIG. 4 is a schematic diagram of an electronic device corresponding to FIG. 1 provided herein.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more clear, the technical solutions of the present disclosure will be clearly and completely described below with reference to the specific embodiments of the present disclosure and the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without making any creative effort belong to the protection scope of the present specification.
The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a data query method provided in this specification, including the following steps:
S100: Acquire a data query statement.
Set operations are a common way of reading data and are widely used when reading data from a database. Some data query statements containing set operations, however, are generated automatically while an application program or a website program is running.
For example, suppose that after a user enters a login account and a login password on the login interface of a website program, the login interface generates a corresponding data query statement such as: select * from user where user name = 'mr. Zhang' and password = '1234567'. This statement reads from the database the account information that matches the account and password entered by the user, thereby implementing the user's login function. The conditional statement of the generated data query statement requires that the user name be the user name 'mr. Zhang' entered by the user and that the password be the password '1234567' entered by the user.
In the above, the reason why the application program or the website program can automatically generate the data query statement according to the input of the user is that a developer writes a template of the data query statement in advance, a condition statement of the data query statement in the template is temporarily left in a variable form, and then when the template needs to be used, the data input by the user can be directly filled in the reserved variable, so that the complete data query statement is generated.
For example, the template for the data query statement select * from user where name = 'mr. Zhang' and password = '1234567' may be select * from user where name = ? and password = ?. When a query statement is generated from the template, the data can be filled in directly.
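A template of this kind can be illustrated with a minimal sketch. It uses SQLite through Python's standard sqlite3 module as a stand-in, and it fills the template by parameter binding rather than by substituting text into the statement, which is an assumption about mechanism and not something this specification states. The table layout and the helper name login are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical user table matching the columns named in the template.
conn.execute("create table user (name text, password text)")
conn.execute("insert into user values ('mr. Zhang', '1234567')")

# Template written in advance; each ? is a reserved variable.
TEMPLATE = "select * from user where name = ? and password = ?"

def login(name, password):
    """Fill the template with the user's input and run the query."""
    return conn.execute(TEMPLATE, (name, password)).fetchall()
```

Calling login('mr. Zhang', '1234567') returns the matching account row, while a wrong password returns an empty list.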
However, because the parameters generated while the application program or the website program is running may be wrong, an automatically generated data query statement may contain a query branch whose query result set is empty, and executing such a query branch still consumes database resources.
For example, the template of the data query statement may be: select * from t1 where a = b union select * from t2, where select * from t1 where a = b and select * from t2 are the two query branches of the set operation union. In the query branch select * from t1 where a = b, the filtering condition is where a = b, in which a and b are two variables. While the application program or the website program is running, the numbers 1 and 2 may be filled into these variables, so that the query result set of the query branch select * from t1 where a = b is empty. Therefore, when executing the query statement select * from t1 where 1=2 union select * from t2, the database could skip select * from t1 where 1=2 entirely and execute only select * from t2. In practice, however, the database still executes select * from t1 where 1=2, obtaining the data of table t1 and filtering it by the condition, which reduces the efficiency of executing the query statement.
Based on this, in this specification, the service platform may obtain the data query statement used for reading data from the database, so that optimization may be performed in the execution process of the obtained data query statement, so as to reduce the number of unnecessary data query operations executed by the database, thereby improving the efficiency of executing the data query statement by the database, and reducing the consumption of database resources.
In this specification, the execution subject of the method for implementing data query may refer to a designated device such as a server installed on a service platform, or may refer to a terminal device such as a laptop or a desktop, or of course, may also be an optimizer deployed in a database, and when the optimizer is used as the execution subject, part of operations in the data query method provided in this specification may be executed, and the rest of operations may be executed by an executor deployed in the database.
S102: Parse the data query statement to determine the set operation involved in the data query statement and each query branch involved in the set operation, wherein different query branches are used for obtaining different query result sets.
Further, the server may parse the data query statement, for example through lexical analysis, syntactic analysis, and semantic analysis, and generate a syntax parse tree, so that the set operations involved in the data query statement and the query branches involved in each set operation can be determined, wherein different query branches are used for obtaining different query result sets.
Specifically, the server may determine each statement keyword contained in the data query statement and determine the set operation involved in the data query statement according to those keywords. A statement keyword here is a specific keyword in the data query statement that corresponds to a set operation: the keywords union and union all correspond to the union set operation, the keywords except and minus correspond to the difference set operation, and the keyword intersect corresponds to the intersection set operation. The server can therefore determine the set operation involved in the data query statement from the correspondence between statement keywords and set operations.
Furthermore, each query branch involved in the set operation may be determined according to the position, in the data query statement, of the statement keyword corresponding to the set operation.
For example, suppose that the data query statement is select * from t1 union select * from t2. The set operation contained in this statement takes the union of two query result sets, and the corresponding statement keyword is union. The set operation takes the union of the query result set of select * from t1 and the query result set of select * from t2, so select * from t1 and select * from t2 are each a query branch involved in the set operation.
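The keyword-based splitting described above can be sketched as follows. This is a simplified illustration and not the parser of this specification: it handles only a flat statement, ignores nesting, subqueries, and string literals, and the names SET_KEYWORDS and split_branches are hypothetical. A real implementation would work on the syntax parse tree.

```python
import re

# Longer keywords come first so "union all" is not matched as plain "union".
SET_KEYWORDS = ["union all", "union", "except", "minus", "intersect"]
PATTERN = re.compile(
    r"\b(" + "|".join(k.replace(" ", r"\s+") for k in SET_KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def split_branches(statement):
    """Split a flat statement into query branches at set-operation keywords.

    Returns (branches, operators); operators[i] sits between
    branches[i] and branches[i + 1].
    """
    parts = PATTERN.split(statement.strip().rstrip(";"))
    # re.split with one capturing group alternates text and matched keyword.
    branches = [p.strip() for p in parts[0::2]]
    operators = [" ".join(p.lower().split()) for p in parts[1::2]]
    return branches, operators
```

For the statement select * from t1 union select * from t2, the function returns the two branches select * from t1 and select * from t2 together with the single operator union.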
In addition, in practical applications, a query branch involved in one set operation may itself contain another set operation. For example, assume that the intersection c of a query result set a and a query result set b needs to be taken, and then the union of the intersection c and a query result set d. The data query statement then involves two set operations, an intersection and a union. For the union set operation, one query branch is the data query statement corresponding to the intersection c of the query result set a and the query result set b, and the other query branch is the data query statement corresponding to the query result set d. For the intersection set operation, one query branch is the data query statement corresponding to the query result set a, and the other query branch is the data query statement corresponding to the query result set b. To explain this in detail, the present specification further provides a schematic diagram of the case where a query branch involved in one set operation contains another set operation, as shown in Fig. 2.
Fig. 2 is a schematic diagram of a collective operation provided in the present specification.
As can be seen from Fig. 2, when one query branch of the set operation union contains another set operation intersect, that other set operation is treated as one query branch of the union. In this case, when optimizing the query branches, the server may first remove query branches of the innermost set operation and then remove branches of the outer set operation. For example, in Fig. 2 the server may first judge whether any of the query branches select * from t1 and select * from t2 of the set operation intersect needs to be removed; after removal, the remaining query branches serve as new query branches of the set operation union, whose query branches are then processed.
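The inner-first processing order for nested set operations can be sketched with a post-order traversal. The classes Branch and SetOp, the function processing_order, and the third branch select * from t3 are hypothetical and serve only to illustrate the order; the sketch does not perform any branch removal.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Branch:
    """A leaf query branch (hypothetical representation)."""
    sql: str

@dataclass
class SetOp:
    """A set operation whose children are branches or nested set operations."""
    keyword: str
    children: List[Union["SetOp", Branch]] = field(default_factory=list)

def processing_order(node, order=None):
    """Return set-operation keywords in the order they would be processed:
    nested (inner) set operations before the operation that contains them."""
    order = [] if order is None else order
    if isinstance(node, SetOp):
        for child in node.children:
            processing_order(child, order)
        order.append(node.keyword)
    return order

# A union whose first query branch is itself an intersect.
tree = SetOp("union", [
    SetOp("intersect", [Branch("select * from t1"), Branch("select * from t2")]),
    Branch("select * from t3"),  # hypothetical second branch of the union
])
```

For this tree, processing_order(tree) lists intersect before union, matching the inner-first order described above.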
S104: Determine abnormal query branches from the query branches according to the filtering conditions contained in each query branch, wherein the filtering conditions are used for filtering the query results of the query branches.
After determining each set operation involved in the data query statement and each query branch involved in each set operation, the server may analyze, for each query branch, the filtering condition included in the query branch, determine whether the filtering condition included in the query branch is an abnormal filtering condition, and if so, determine that the query branch is an abnormal query branch.
The filtering conditions here include at least one of a conditional statement and a limit statement. A conditional statement, namely a where statement or a having statement, is used for judging whether the data in the queried result set meets a preset condition; if so, the data is retained, and if not, it is removed from the queried result set. For example, in a data query statement on table t1 whose conditional statement is where user name = 'zhou', the database first executes from t1, that is, obtains all the data in table t1, and then, according to the conditional statement, screens out the data whose user name is zhou from all the data in table t1.
It should be noted that the difference between the where statement and the having statement is that the where statement applies its condition before the query result set preliminarily returned by the query statement is grouped, whereas the having statement applies its condition after that result set is grouped.
For example, suppose the highest salary of each department of a company needs to be queried. The corresponding data query statement may be select deptno, max(sal) from employee group by deptno. When the database executes it, the salaries of all employees are first obtained from the employee table, the employees are then grouped by department according to deptno, and for each department group the salary of its highest-paid employee is determined.
Suppose instead that, among the highest salaries of the departments, those greater than 5000 need to be queried. The corresponding data query statement may be select deptno, max(sal) from employee group by deptno having max(sal) > 5000. When the database executes this statement, it first obtains the salaries of all employees from the employee table, then groups the employees by department according to deptno, determines for each department group the salary of its highest-paid employee, and finally screens out, from the highest salaries of the departments, those that are greater than 5000.
If the same query is written with a where statement, the corresponding data query statement may be select deptno, max(sal) from employee where sal > 5000 group by deptno. When the database executes this statement, it first obtains the salaries of all employees from the employee table, then screens out the employees whose salary is greater than 5000, groups the remaining employees by department according to deptno, and determines for each department group the salary of its highest-paid employee.
From the above, it can be seen that although both the where statement and the having statement can filter the preliminary query result set, the execution order in the database is different.
The limit selection statement, i.e., the limit statement, is used for selecting, in order, a number of rows from the data in the queried result set. For example, in the data query statement select * from t1 limit 10, the database first executes from t1, that is, obtains all the data in table t1, and then takes the first ten rows of that data according to the limit statement limit 10.
As described above, the server may judge whether each filtering condition contained in each query branch is an abnormal filtering condition. It first judges whether the filtering condition is a conditional statement. If the conditional statement is a where statement, the server judges whether the condition in it is always false (in other words, whether the condition can never hold) and whether the query branch containing it includes an aggregate function (for example SUM, COUNT, or AVG). If the condition is always false and the query branch does not include an aggregate function, the filtering condition is determined to be an abnormal filtering condition. For example, in select * from t1 where 1=2, the conditional statement where 1=2 can never hold, so the conditional statement of this data query statement is an abnormal filtering condition.
If the conditional statement is a having statement, the server judges whether the condition in it is always false, and if so, determines that the filtering condition is an abnormal filtering condition.
It should be noted that, as can be seen from the foregoing, although the where statement and the having statement are both conditional statements, the database applies them at different points. The where statement filters by its condition before the aggregate function is executed, so even if the condition in the where statement is always false, the query result set of the query branch still contains a result after the aggregate function is applied, rather than being an empty set. Therefore, if the query branch containing the where statement includes an aggregate function, the query branch cannot be regarded as an abnormal query branch. The having statement, by contrast, filters by its condition after the aggregate function is executed, so if the condition in the having statement is always false, the query result set of the query branch is always an empty set, and whether an aggregate function exists does not need to be considered for the having statement.
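The difference between the two cases can be observed with a minimal sketch, again using SQLite through Python's sqlite3 module as a stand-in rather than the database of this specification. The employee rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (deptno integer, sal integer);
    insert into employee values (10, 6000), (10, 3000), (20, 4000);
""")

# where with an always-false condition plus an aggregate function:
# the aggregate still produces one row, so the result set is not empty.
where_rows = conn.execute(
    "select count(*) from employee where 1=2"
).fetchall()

# having with an always-false condition: every group is filtered out
# after aggregation, so the result set is empty.
having_rows = conn.execute(
    "select deptno, max(sal) from employee group by deptno having 1=2"
).fetchall()
```

Here where_rows contains a single row holding the count 0, while having_rows is empty, so only the second branch could be treated as abnormal.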
In addition, if the filtering condition is a limit statement, the server determines whether the limit value in the limit statement is less than or equal to 0, and if so, determines that the filtering condition is an abnormal filtering condition. For example, suppose the data query statement is select * from t1 limit 0, where limit is used to take the first several rows of the query result set. Here 0 rows are taken, so whatever data select * from t1 would obtain, none of it is returned. Therefore, the limit statement of this data query statement is an abnormal filtering condition.
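The three checks described above (where, having, and limit) can be summarized in a short sketch. It assumes the query branch has already been analyzed into a few flags; the QueryBranch class, its field names, and the function is_abnormal are hypothetical and are not taken from this specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryBranch:
    """Hypothetical, already-analyzed summary of one query branch."""
    where_always_false: bool = False   # where condition can never hold
    having_always_false: bool = False  # having condition can never hold
    has_aggregate: bool = False        # branch uses SUM, COUNT, AVG, ...
    limit: Optional[int] = None        # None means there is no limit


def is_abnormal(branch: QueryBranch) -> bool:
    """Return True if the branch is guaranteed to yield an empty result set."""
    # A constantly false where only empties the result when no aggregation
    # function runs after the filtering.
    if branch.where_always_false and not branch.has_aggregate:
        return True
    # A constantly false having filters after aggregation, so the presence
    # of an aggregation function does not matter.
    if branch.having_always_false:
        return True
    # A limit of 0 or less keeps no rows.
    if branch.limit is not None and branch.limit <= 0:
        return True
    return False


print(is_abnormal(QueryBranch(where_always_false=True)))
print(is_abnormal(QueryBranch(where_always_false=True, has_aggregate=True)))
```

Deciding whether a condition such as 1=2 is constantly false is left outside this sketch; a real optimizer would derive these flags from the parsed statement.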
S106: Determine each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch.
Further, the server may determine each target branch from each query branch according to an operation type corresponding to the set operation and the abnormal query branch.
Specifically, for each set operation, if the server determines that the set operation is a set operation of a first type, then, when the number of abnormal query branches involved in the set operation is equal to the number of query branches involved in the set operation, the server may arbitrarily select one of those abnormal query branches to retain and take the remaining abnormal query branches involved in the set operation as target branches. If the set operation is a set operation of the first type and the number of abnormal query branches involved in the set operation is less than the number of query branches involved in the set operation, the server may take each abnormal query branch involved in the set operation as a target branch. Here, a set operation of the first type takes the union of the query result sets obtained through the respective query branches.
It should be noted that, if the server determines that the number of abnormal query branches involved in the union set operation is equal to the number of query branches involved in it, the query result sets of all these query branches are empty sets, and the union of these query result sets is therefore also an empty set. Only one query branch needs to be kept, chosen arbitrarily; preferably, the first query branch may be retained by default.
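The rule for the first type of set operation can be sketched as follows. The function name union_target_branches and the representation of branches as a list of boolean flags are assumptions made for illustration.

```python
from typing import List


def union_target_branches(abnormal: List[bool]) -> List[int]:
    """Return the indexes of the branches to remove for a union set operation.

    abnormal[i] is True when branch i is an abnormal query branch, that is,
    its query result set is always empty.
    """
    abnormal_idx = [i for i, flag in enumerate(abnormal) if flag]
    if len(abnormal_idx) == len(abnormal):
        # Every branch is empty: keep one (here the first, as the default)
        # and remove the rest.
        return abnormal_idx[1:]
    # Otherwise every abnormal branch can be removed.
    return abnormal_idx


print(union_target_branches([True, True, True]))
print(union_target_branches([False, True, False]))
```

Keeping the first branch when all branches are abnormal leaves the statement with a valid, if empty, query to execute.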
In addition, for each set operation, if the server determines that the set operation belongs to a set operation of a second type, a main query branch and a secondary query branch may be determined from the query branches involved in the set operation, where the set operation of the second type is used to obtain a difference set between the query result sets obtained by the query branches, and if the query branches involved in the set operation include an abnormal query branch, the secondary query branch may be used as a target branch.
It should be noted that, in the difference set operation except/minus, the query results of the secondary query branch are removed from the query results of the main query branch. For example, assuming that the query result set of the main query branch is 1, 2 and 3 and the query result set of the secondary query branch is 2 and 3, the query result set obtained by taking the difference set is 1, that is, 2 and 3 are removed. Therefore, if the query result set of the main query branch is empty, the difference set is always empty, and only the main query branch needs to be retained. If the query result set of the secondary query branch is empty, taking the difference set removes no data from the query result set of the main query branch, that is, the result is still the query result set of the main query branch, so again only the main query branch needs to be retained.
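The reasoning for the difference set can be illustrated with ordinary Python sets. This is a sketch only: the function except_target_branches and the branch labels "main" and "secondary" are hypothetical names.

```python
from typing import List


def except_target_branches(main_abnormal: bool, secondary_abnormal: bool) -> List[str]:
    """Return the branches to remove for a difference set operation."""
    # If either branch is always empty, the secondary branch contributes
    # nothing to the final result, so it becomes the target branch.
    if main_abnormal or secondary_abnormal:
        return ["secondary"]
    return []


# Set arithmetic behind the rule.
normal_case = {1, 2, 3} - {2, 3}     # only 1 survives
empty_main = set() - {1, 2}          # empty main branch: always empty
empty_secondary = {1, 2, 3} - set()  # empty secondary: main is unchanged

print(normal_case, empty_main, empty_secondary)
print(except_target_branches(False, True))
```

In both abnormal cases the result equals the result of the main query branch alone, which is why only the main query branch is kept.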
Further, for each set operation, if the server determines that the set operation is a set operation of a third type, the server may arbitrarily select one of the abnormal query branches involved in the set operation to retain and take the remaining abnormal query branches involved in the set operation as target branches. Here, a set operation of the third type takes the intersection of the query result sets obtained through the respective query branches.
It should be noted that, in the intersection set operation intersect, the server takes the intersection of the result sets of the query branches. For example, assuming that the query result sets of the two query branches involved in the intersection set operation are 1, 2, 3 and 1, 4, 5 respectively, the intersection of the two query result sets is 1. Therefore, when the result set of any query branch is an empty set, the finally obtained query result set is also an empty set, so only one abnormal query branch needs to be retained. For example, if of the two query branches involved in the intersection set operation one has the query result set 1 and 2 and the other is an abnormal query branch whose query result set is an empty set, the intersection of the two query result sets is also an empty set, so only the abnormal query branch needs to be retained.
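A minimal sketch of the rule for the third type of set operation follows. It implements the rule as stated for target branches (retain one abnormal query branch, remove the other abnormal ones) and does not model any further pruning of normal branches; the function name intersect_target_branches and the flag list are assumptions.

```python
from typing import List


def intersect_target_branches(abnormal: List[bool]) -> List[int]:
    """Return the indexes of the branches to remove for an intersection."""
    abnormal_idx = [i for i, flag in enumerate(abnormal) if flag]
    # Retain one abnormal branch (here the first one found) and treat the
    # remaining abnormal branches as target branches.
    return abnormal_idx[1:]


# Set arithmetic behind the rule.
normal_case = {1, 2, 3} & {1, 4, 5}  # the common element 1
with_empty = {1, 2} & set()          # any empty branch empties the result

print(normal_case, with_empty)
print(intersect_target_branches([False, True, True]))
```

One abnormal branch must survive because it is what guarantees that the overall intersection stays empty.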
S108: Generate an execution plan for the data query statement according to the query branches other than the target branches, and perform the data query according to the execution plan.
After removing the target branch, the server may generate an execution plan for the data query statement according to other query branches except the target branch in each query branch, and perform data query according to the execution plan.
It should be noted that there are two kinds of union set operations. The first kind removes duplicate data from the resulting union and is usually expressed by the statement keyword union; the second kind does not remove duplicate data from the resulting union and is usually expressed by the statement keyword union all. For the first kind of union set operation, once the server removes query branches involved in the set operation, the deduplication that the union would have performed may no longer take place. Therefore, when generating the execution plan for the data query statement, the server also needs to perform a deduplication (distinct) operation on the query branches other than the target branches involved in the union set operation, so as to ensure that the data query statement with the query branches removed yields the same query result set as the data query statement before the removal.
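Why the extra deduplication is needed can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module; the table t1, its contents, and the helper values are assumptions made for the example.

```python
import sqlite3

# Example table containing a duplicate value (assumed for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (1,), (2,)])


def values(sql):
    """Run a single-column query and return its values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))


# Original statement: union with an abnormal (always-empty) second branch.
original = values("SELECT a FROM t1 UNION SELECT a FROM t1 WHERE 1=2")

# Naive rewrite: drop the abnormal branch and nothing else.
naive = values("SELECT a FROM t1")

# Rewrite that also deduplicates the remaining branch.
rewritten = values("SELECT DISTINCT a FROM t1")

# With union all, no deduplication is expected in the first place.
original_all = values("SELECT a FROM t1 UNION ALL SELECT a FROM t1 WHERE 1=2")

print(original, naive, rewritten, original_all)
```

In this sketch the naive rewrite matches only the union all statement, while the rewrite with distinct matches the union statement.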
As can be seen from the above, the server can screen out the query branches without practical meaning in the data query statement according to the operation type of each set operation included in the data query statement input by the user and the filtering condition included in the data query statement input by the user, and remove the query branches, so that the data query statement can be optimized, and the efficiency of the database executing the data query statement to query data can be improved.
Based on the same idea as the data query method described above, one or more embodiments of the present specification further provide a corresponding data query apparatus, as shown in fig. 3.
Fig. 3 is a schematic diagram of an apparatus for querying data provided in the present specification, including:
an obtaining module 301, configured to obtain a data query statement;
an analysis module 302, configured to analyze the data query statement to determine a set operation involved in the data query statement and query branches involved in the set operation, where different query branches are used to obtain different query result sets;
a first determining module 303, configured to determine an abnormal query branch from the query branches according to a filtering condition included in each query branch, where the filtering condition is used to filter a query result of the query branch;
a second determining module 304, configured to determine, according to the operation type corresponding to the set operation and the abnormal query branch, each target branch from the query branches;
an execution module 305, configured to generate an execution plan for the data query statement according to other query branches except the target branch in each query branch, and perform data query according to the execution plan.
Optionally, the analysis module 302 is specifically configured to determine each statement keyword contained in the data query statement; determine the set operation involved in the data query statement according to the statement keywords; and determine each query branch involved in the set operation according to the position, in the data query statement, of the statement keyword corresponding to the set operation.
Optionally, the first determining module 303 is specifically configured to, for each query branch, analyze the filtering condition contained in the query branch and determine whether the filtering condition contained in the query branch is an abnormal filtering condition; and if so, determine that the query branch is an abnormal query branch.
Optionally, the second determining module 304 is specifically configured to, for each set operation, if the set operation is a set operation of a first type, when the number of abnormal query branches involved in the set operation is equal to the number of query branches involved in the set operation, arbitrarily select one of the abnormal query branches involved in the set operation to retain, and take the remaining abnormal query branches involved in the set operation as target branches, where the set operation of the first type takes the union of the query result sets obtained through the respective query branches.
Optionally, the second determining module 304 is specifically configured to, for each set operation, if the set operation is a set operation of a first type, take each abnormal query branch involved in the set operation as a target branch when the number of abnormal query branches involved in the set operation is less than the number of query branches involved in the set operation, where the set operation of the first type takes the union of the query result sets obtained through the respective query branches.
Optionally, the second determining module 304 is specifically configured to, for each set operation, if the set operation belongs to a set operation of a second type, determine a main query branch and a secondary query branch from query branches involved in the set operation, where the set operation of the second type is used to obtain a difference set between query result sets obtained through the query branches, respectively; and if each query branch related to the set operation contains an abnormal query branch, taking the secondary query branch as the target branch.
Optionally, the second determining module 304 is specifically configured to, for each set operation, if the set operation is a set operation of a third type, arbitrarily select one of the abnormal query branches involved in the set operation to retain, and take the remaining abnormal query branches involved in the set operation as target branches, where the set operation of the third type takes the intersection of the query result sets obtained through the respective query branches.
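The keyword-based analysis attributed to the analysis module 302 above can be illustrated with a deliberately simplified sketch. It assumes a flat statement without parentheses, nested queries, or string literals containing the keywords, and it ignores operator precedence; the function split_branches and the regular expression are hypothetical and are not the parsing method of this specification.

```python
import re
from typing import List, Tuple

# "union all" is listed before "union" so that the longer keyword wins.
SET_OP = re.compile(r"\b(union\s+all|union|intersect|except|minus)\b", re.IGNORECASE)


def split_branches(sql: str) -> Tuple[List[str], List[str]]:
    """Split a flat statement into query branches and set operation keywords."""
    # re.split with one capturing group alternates text and matched keywords.
    parts = SET_OP.split(sql)
    branches = [part.strip() for part in parts[0::2]]
    operators = [" ".join(part.lower().split()) for part in parts[1::2]]
    return branches, operators


branches, operators = split_branches(
    "select * from t1 where 1=2 union all select * from t2 intersect select * from t3"
)
print(branches)
print(operators)
```

The position of each keyword determines which text belongs to which query branch; a production parser would work on a syntax tree rather than on raw text.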
The present specification also provides a computer-readable storage medium storing a computer program, where the computer program can be used to execute the data query method shown in fig. 1.
The present specification also provides a schematic block diagram, shown in fig. 4, of an electronic device corresponding to fig. 1. As shown in fig. 4, at the hardware level the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may of course also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it to implement the data query method of fig. 1. Of course, besides a software implementation, the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the processing flow is not limited to logic units and may also be hardware or logic devices.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, as technology has advanced, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer programs a digital system onto a single PLD without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained merely by slightly logic-programming the method flow in one of the hardware description languages described above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included in it for performing the various functions may also be regarded as structures within the hardware component. The means for performing the various functions may even be regarded both as software modules for performing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described as being divided into various units by function. Of course, when implementing the present specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present specification is described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner; for parts that are the same or similar among the embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively briefly, and for the relevant points reference may be made to the corresponding description of the method embodiment.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims (16)

1. A method of data querying, the method comprising:
acquiring a data query statement;
analyzing the data query statement to determine a set operation involved in the data query statement and each query branch involved in the set operation, wherein different query branches are used for obtaining different query result sets;
determining abnormal query branches from the query branches according to filtering conditions contained in each query branch, wherein the filtering conditions are used for filtering the query results of the query branches;
determining each target branch from each query branch according to the operation type corresponding to the set operation and the abnormal query branch;
and generating an execution plan aiming at the data query statement according to other query branches except the target branch in each query branch, and performing data query according to the execution plan.
2. The method of claim 1, wherein determining the set operation involved in the data query statement and the query branches involved in the set operation specifically comprises:
determining each statement keyword contained in the data query statement;
determining set operation related to the data query statement according to the statement keywords;
and determining each query branch involved in the set operation according to the position of the statement keyword corresponding to the set operation in the data query statement.
3. The method according to claim 2, wherein determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch specifically includes:
aiming at each query branch, analyzing the filtering conditions contained in the query branch, and judging whether the filtering conditions contained in the query branch are abnormal filtering conditions or not;
if yes, determining the query branch as an abnormal query branch.
4. The method according to claim 1, wherein determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch specifically comprises:
for each set operation, if the set operation belongs to a set operation of a first type, when the number of abnormal query branches involved in the set operation is equal to the number of query branches involved in the set operation, arbitrarily selecting one abnormal query branch from the abnormal query branches involved in the set operation to retain, and taking the remaining abnormal query branches involved in the set operation as target branches, wherein the set operation of the first type is used for taking a union between the query result sets respectively obtained through the query branches.
5. The method according to claim 1, wherein determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch specifically comprises:
and for each set operation, if the set operation belongs to a first type of set operation, taking the abnormal query branches related to the set operation as target branches when the number of the abnormal query branches related to the set operation is less than that of the query branches related to the set operation, wherein the first type of set operation is used for taking a union between query result sets respectively obtained by the query branches.
6. The method according to claim 1, wherein determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch specifically comprises:
for each set operation, if the set operation belongs to a second type of set operation, determining a main query branch and a secondary query branch from the query branches related to the set operation, wherein the second type of set operation is used for taking a difference set between query result sets respectively obtained through the query branches;
and if each query branch related to the set operation comprises an abnormal query branch, taking the secondary query branch as the target branch.
7. The method according to claim 1, wherein determining each target branch from the query branches according to the operation type corresponding to the set operation and the abnormal query branch specifically comprises:
for each set operation, if the set operation belongs to a set operation of a third type, one abnormal query branch is selected from the abnormal query branches related to the set operation arbitrarily and reserved, and the rest abnormal query branches related to the set operation are used as target branches, wherein the set operation of the third type is used for taking intersections among query result sets respectively obtained through the query branches.
8. An apparatus for data querying, comprising:
the acquisition module is used for acquiring a data query statement;
the analysis module is used for analyzing the data query statement to determine the set operation involved in the data query statement and all query branches involved in the set operation, wherein different query branches are used for obtaining different query result sets;
the first determining module is used for determining abnormal query branches from the query branches according to filtering conditions contained in each query branch, and the filtering conditions are used for filtering query results of the query branches;
a second determining module, configured to determine, according to the operation type corresponding to the set operation and the abnormal query branch, each target branch from the query branches;
and the execution module is used for generating an execution plan aiming at the data query statement according to other query branches except the target branch in each query branch, and carrying out data query according to the execution plan.
9. The apparatus according to claim 8, wherein the analysis module is specifically configured to determine each statement keyword contained in the data query statement; determine the set operation involved in the data query statement according to the statement keywords; and determine each query branch involved in the set operation according to the position, in the data query statement, of the statement keyword corresponding to the set operation.
10. The apparatus according to claim 9, wherein the second determining module is specifically configured to, for each query branch, analyze the filtering condition included in the query branch, and determine whether the filtering condition included in the query branch is an abnormal filtering condition; if yes, determining the query branch as an abnormal query branch.
11. The apparatus according to claim 8, wherein the second determining module is specifically configured to, for each set operation, if the set operation belongs to a set operation of a first type, when the number of abnormal query branches involved in the set operation is equal to the number of query branches involved in the set operation, arbitrarily select one abnormal query branch from the abnormal query branches involved in the set operation to retain, and take the remaining abnormal query branches involved in the set operation as target branches, where the set operation of the first type is used to take a union between the query result sets respectively obtained through the query branches.
12. The apparatus according to claim 8, wherein the second determining module is specifically configured to, for each set operation, if the set operation belongs to a set operation of a first type, take each abnormal query branch involved in the set operation as a target branch when the number of abnormal query branches involved in the set operation is less than the number of query branches involved in the set operation, where the set operation of the first type is used to take a union between the query result sets respectively obtained through the query branches.
13. The apparatus according to claim 8, wherein the second determining module is specifically configured to, for each set operation, if the set operation is a set operation of a second type, determine a main query branch and a secondary query branch from the query branches involved in the set operation, wherein the set operation of the second type is used to take the difference set of the query result sets respectively obtained through the query branches; and if the query branches involved in the set operation include an abnormal query branch, take the secondary query branch as the target branch.
14. The apparatus according to claim 8, wherein the second determining module is specifically configured to, for each set operation, if the set operation is a set operation of a third type, arbitrarily select one abnormal query branch to retain from the abnormal query branches involved in the set operation, and take the remaining abnormal query branches involved in the set operation as target branches, wherein the set operation of the third type is used to take the intersection of the query result sets respectively obtained through the query branches.
15. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the method of any one of the preceding claims 1 to 7.
16. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of the preceding claims 1 to 7 when executing the program.
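
As a rough illustration of how the second determining module of claims 11 to 14 might select target branches, the following Python sketch is offered under stated assumptions. The names `SetOp`, `Branch`, and `select_target_branches` are hypothetical and do not appear in the specification. Each query branch is assumed to already carry an abnormal flag determined as in claim 10, and a difference-set operation is assumed to involve exactly two branches, the first being the main query branch and the second the secondary query branch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class SetOp(Enum):
    """Hypothetical labels for the three set operation types in the claims."""
    UNION = "first type"          # union of the branch result sets
    DIFFERENCE = "second type"    # difference set of the branch result sets
    INTERSECTION = "third type"   # intersection of the branch result sets


@dataclass
class Branch:
    name: str
    abnormal: bool  # assumed to be set beforehand, e.g. per claim 10


def select_target_branches(op: SetOp, branches: List[Branch]) -> List[Branch]:
    """Return the branches to remove (the target branches) for one set operation."""
    abnormal = [b for b in branches if b.abnormal]
    if not abnormal:
        return []  # nothing to remove

    if op is SetOp.UNION:
        if len(abnormal) == len(branches):
            # Claim 11: every branch is abnormal, so retain one and
            # remove the rest. Retaining the first is an arbitrary choice.
            return abnormal[1:]
        # Claim 12: some branch is normal, so every abnormal branch is removed.
        return abnormal

    if op is SetOp.DIFFERENCE:
        # Claim 13, under the two-branch assumption stated in the lead-in:
        # branches[0] is the main branch, branches[1] the secondary branch.
        if len(branches) != 2:
            raise ValueError("this sketch only handles two-branch differences")
        return [branches[1]]

    if op is SetOp.INTERSECTION:
        # Claim 14: retain one abnormal branch, remove the other abnormal ones.
        return abnormal[1:]

    raise ValueError(f"unknown set operation: {op}")


if __name__ == "__main__":
    qs = [Branch("q1", True), Branch("q2", True), Branch("q3", False)]
    for op in (SetOp.UNION, SetOp.INTERSECTION):
        print(op.name, [b.name for b in select_target_branches(op, qs)])
```

Which abnormal branch is retained in the claim 11 and claim 14 cases is arbitrary according to the claims; the sketch simply keeps the first one it encounters.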
CN202211327887.6A 2022-10-27 2022-10-27 Data query method, device, equipment and storage medium Pending CN115878654A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211327887.6A CN115878654A (en) 2022-10-27 2022-10-27 Data query method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211327887.6A CN115878654A (en) 2022-10-27 2022-10-27 Data query method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115878654A (en) 2023-03-31

Family

ID=85759019

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211327887.6A Pending CN115878654A (en) 2022-10-27 2022-10-27 Data query method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115878654A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117271576A (en) * 2023-10-19 2023-12-22 北京人大金仓信息技术股份有限公司 Query optimization method, storage medium and computer equipment

Similar Documents

Publication Publication Date Title
CN110096513B (en) Data query and fund check method and device
CN107622080B (en) Data processing method and equipment
CN111400681B (en) Data authority processing method, device and equipment
CN112307509B (en) Desensitization processing method, equipment, medium and electronic equipment
US20240256613A1 (en) Data processing method and apparatus, readable storage medium, and electronic device
CN113434533A (en) Data tracing tool construction method, data processing method, device and equipment
CN115756449A (en) Page multiplexing method and device, storage medium and electronic equipment
CN115878654A (en) Data query method, device, equipment and storage medium
CN109656946B (en) Multi-table association query method, device and equipment
CN108804563B (en) Data labeling method, device and equipment
CN117935915A (en) Gene expression quantity detection data management method and device
CN116341642B (en) Data processing method and device, storage medium and electronic equipment
CN116521705A (en) Data query method and device, storage medium and electronic equipment
CN112148746A (en) Method and device for generating database table structure document, electronic device and storage medium
CN116010419A (en) Method and device for creating unique index and optimizing logic deletion
CN116822606A (en) Training method, device, equipment and storage medium of anomaly detection model
CN115391426A (en) Data query method and device, storage medium and electronic equipment
CN115934161A (en) Code change influence analysis method, device and equipment
CN114817212A (en) Database optimization method and optimization device
CN116185617A (en) Task processing method and device
CN115658732A (en) Method and device for optimizing query of SQL (structured query language) statements, electronic equipment and medium
CN115563116A (en) Database table scanning method, device and equipment
CN115757302B (en) Data analysis method, device, equipment and storage medium
CN110032565A (en) A kind of method, system and electronic equipment generating statistical information
CN118193802A (en) Data query method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination