CN106611044A - SQL optimization method and device - Google Patents

SQL optimization method and device Download PDF

Info

Publication number
CN106611044A
CN106611044A CN201611109489.1A CN201611109489A CN106611044A CN 106611044 A CN106611044 A CN 106611044A CN 201611109489 A CN201611109489 A CN 201611109489A CN 106611044 A CN106611044 A CN 106611044A
Authority
CN
China
Prior art keywords
sql statement
row
sql
essential information
statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611109489.1A
Other languages
Chinese (zh)
Other versions
CN106611044B (en
Inventor
陈振强
熊仲健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transwarp Technology Shanghai Co Ltd
Original Assignee
Star Link Information Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Star Link Information Technology (shanghai) Co Ltd filed Critical Star Link Information Technology (shanghai) Co Ltd
Priority to CN201611109489.1A priority Critical patent/CN106611044B/en
Publication of CN106611044A publication Critical patent/CN106611044A/en
Application granted granted Critical
Publication of CN106611044B publication Critical patent/CN106611044B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention aims to provide an SQL optimization method and device. The method comprises the steps of analyzing and extracting at least two obtained SQL statements to obtain basic information of the SQL statements, tables corresponding to the SQL statements and a corresponding relationship of columns of the tables; based on the basic information of the SQL statements, the tables corresponding to the SQL statements and the corresponding relationship of the columns of the tables, deleting useless tables in the SQL statements to obtain SQL statements after preliminary filtering of the tables; and deleting useless columns corresponding to the tables in the SQL statements after preliminary filtering of the tables to obtain optimized SQL statements. The SQL statements can be pre-optimized based on the at least two SQL statements, the tables corresponding to the SQL statements and the corresponding relationship of the columns of the tables, so that the purpose of optimizing all the SQL statements is achieved, the overall SQL optimization effect is improved, and the resource additional cost of pre-calculation in the prior art is reduced.

Description

A kind of SQL optimization methods and equipment
Technical field
The application is related to computer realm, more particularly to a kind of SQL optimization methods and equipment.
Background technology
With the arrival in big data epoch, database application is more and more, and the data volume of process is also increasing, and process is patrolled Collect and also become increasingly complex, common application has no longer been several simple SQL (Structured Query Language, knot Structure query language) sentence operation, large-scale application usually contains thousands of SQL operation, how quickly to process magnanimity number According to a difficult problem for having become industry.
SQL optimisation techniques of the prior art are focused primarily on the optimization of the implement plan of single SQL statement, such as join Reorder (adding sequence) and predicate pushdown (pushing away under predicate) etc.;Prior art carries out excellent to single SQL statement Change is pre-defined by the annexation being likely to occur or aggregation between the multiple tables in SQL statement, And the corresponding operating result of each table defined under SQL statement is precalculated, to wait to need to having annexation or aggregation to close When the table of system is operated accordingly, the operating result for pre-defining can be quickly obtained, and then reaches the property for improving SQL The purpose of energy.But precalculate and cannot cover all SQL statement, only single SQL statement is optimized, and precalculate and be Carried out based on out-of-date table, after these tables are changed, the operating result for pre-defining originally cannot just react true Real situation, and then the demand to real-time table can not be met, in fact it could happen that precalculate the table for having optimized and take less than or fail, from And cause the waste of resource;Again as single SQL statement is without information is connected with the context between other SQL statement, cause Global optimization effect on driving birds is not good.
Therefore, it is pre-optimized by carrying out to single SQL statement in prior art, cause pre-optimized SQL statement pair The table answered takes less than or optimizes failure, causes global optimization effect on driving birds is not good and the wasting of resources.
The content of the invention
One purpose of the application is to provide a kind of SQL optimization methods and equipment, by single in solution prior art SQL statement carry out it is pre-optimized, cause the corresponding table of pre-optimized SQL statement take less than or optimize failure problem, while Solve the problems, such as global optimization effect on driving birds is not good and the wasting of resources.
According to the one side of the application, there is provided a kind of SQL optimization methods, the method include:
At least two SQL statement are analyzed and are extracted, the essential information of SQL statement, each SQL language is obtained The corresponding relation of the row of the corresponding table of sentence and its table;
Based on the corresponding relation of the row of the essential information of the SQL statement, the corresponding table of the SQL statement and its table, delete Useless table in except the SQL statement obtains the SQL statement after preliminary filter table;
Delete the useless row in the SQL statement after the preliminary filter table, the SQL statement after being optimized.
Further, it is in said method, described that at least two SQL statement are analyzed and are extracted, obtain SQL statement The corresponding relation of the row of the corresponding table of essential information, each SQL statement and its table, including:
Obtain at least two SQL statement;
Each SQL statement is analyzed by morphology, syntax and semantics analysis, obtains including each SQL language The intermediate representation information of the corresponding relation of the row of the essential information, the corresponding table of SQL statement and its table of sentence;
The intermediate representation information is extracted, essential information, the SQL statement for obtaining each SQL statement is corresponding The corresponding relation of the row of table and its table.
Further, in said method, the corresponding relation of the row of the corresponding table of the SQL statement and its table includes:
The essential information of the corresponding table of the SQL statement;
The essential information of the row of the table.
Further, it is in said method, described based on the essential information of the SQL statement, the corresponding table of the SQL statement And its corresponding relation of the row of table, the useless table deleted in the SQL statement obtains the SQL statement after preliminary filter table, including:
The row of the essential information, the corresponding table of the SQL statement and its table of life cycle and the SQL statement based on table Corresponding relation, by the interim table without adduction relationship and/or not predefined global table from the corresponding table of the SQL statement Middle deletion, obtains the SQL statement after preliminary filter table.
Further, it is in said method, described based on the essential information of the SQL statement, the corresponding table of the SQL statement And its corresponding relation of the row of table, after deleting the SQL statement after the useless table in the SQL statement obtains preliminary filter table, Also include:
Data-flow analysis is carried out to each SQL statement based on data-flow equations group, each SQL statement is obtained corresponding Currently available all tables;
Detect table that can be containing in the corresponding currently available all tables of the SQL statement which is containing to uniquely drawing With point, the SQL statement after further filter table is obtained.
Further, in said method, the data-flow equations group includes following at least any one:
The corresponding equation of output table set of previous SQL statement;
The current corresponding equation of table set for creating or changing;
The corresponding equation of table set of table and the current table deleted before being changed;
It is current to export the corresponding equation of table set.
Further, in said method, in the corresponding currently available all tables of the detection SQL statement can in The table held back by its containing to unique invocation point, obtains the SQL statement after further filter table, including by the SQL statement pair The table of the only one of which invocation point in the currently available all tables answered is initialized in table Work List, is repeated below step, Until the table Work List for sky after, by determine the labelling containing to the unique invocation point of containing table after, by institute The containing table for stating labelling deletes the SQL statement after obtaining further filter table:
A table is taken out from the table Work List every time, and simultaneously table described in labelling is containing table;
Corresponding reference list set when obtaining the invocation point of the containing table and its containing table establishment;
If having at least one reference list to there is no the containing table in corresponding reference list set when the containing table is created The corresponding previous SQL statement of invocation point input table set in, then containing table described in update mark be can not be by containing Table;
If all there is drawing for the containing table in all references table when the containing table is created in corresponding reference list set In output table set with the corresponding previous SQL statement of point, it is determined that the containing table of the labelling.
Further, in said method, the useless row in the SQL statement deleted after the preliminary filter table are obtained SQL statement after optimization, including:
By all unreferenced of the table of the corresponding establishment of SQL statement after after the preliminary filter table or further filtering Row initialization in row Work List, be repeated below step, until the row Work List for sky after, by the nothing of labelling Deleted with row, the SQL statement after being optimized:
Take out from the row Work List every time useless row are classified as described in a row labelling;
After subtracting 1 by the reference number of times of all references row cited in the useless row, if the reference of the referenced column Number of times is zero, then by it is described reference number of times be zero when corresponding referenced column add into the row Work List.
According to the another aspect of the application, a kind of SQL optimizations equipment is additionally provided, the equipment includes:
Analysis extraction element, at least two SQL statement are analyzed and are extracted, obtains the basic letter of SQL statement The corresponding relation of the row of the corresponding table of breath, each SQL statement and its table;
Meter apparatus are filtered, for based on the essential information of the SQL statement, the corresponding table of the SQL statement and its table The corresponding relation of row, the useless table deleted in the SQL statement obtain the SQL statement after preliminary filter table;
Array apparatus are filtered, for deleting the useless row in the SQL statement after the preliminary filter table, after being optimized SQL statement.
Further, in the said equipment, the analysis extraction element is used for:
Obtain at least two SQL statement;
Each SQL statement is analyzed by morphology, syntax and semantics analysis, obtains including each SQL language The intermediate representation information of the corresponding relation of the row of the essential information, the corresponding table of SQL statement and its table of sentence;
The intermediate representation information is extracted, essential information, the SQL statement for obtaining each SQL statement is corresponding The corresponding relation of the row of table and its table.
Further, in the said equipment, the corresponding relation of the row of the corresponding table of the SQL statement and its table includes:
The essential information of the corresponding table of the SQL statement;
The essential information of the row of the table.
Further, in the said equipment, the filtration meter apparatus are used for:
The row of the essential information, the corresponding table of the SQL statement and its table of life cycle and the SQL statement based on table Corresponding relation, by the interim table without adduction relationship and/or not predefined global table from the corresponding table of the SQL statement Middle deletion, obtains the SQL statement after preliminary filter table.
Further, in the said equipment, the filtration meter apparatus are additionally operable to:
Data-flow analysis is carried out to each SQL statement based on data-flow equations group, each SQL statement is obtained corresponding Currently available all tables;
Detect table that can be containing in the corresponding currently available all tables of the SQL statement which is containing to uniquely drawing With point, the SQL statement after further filter table is obtained.
Further, in the said equipment, the data-flow equations group includes following at least any one:
The corresponding equation of output table set of previous SQL statement;
The current corresponding equation of table set for creating or changing;
The corresponding equation of table set of table and the current table deleted before being changed;
It is current to export the corresponding equation of table set.
Further, in the said equipment, the filtration meter apparatus are used for:
The table of the only one of which invocation point in the SQL statement corresponding currently available all tables is initialized in table work Make, in list, to be repeated below step, until after the table Work List is for sky, will be the containing table of the labelling for determining containing extremely After unique invocation point, the containing table of the labelling is deleted and obtains the SQL statement after further filter table:
A table is taken out from the table Work List every time, and simultaneously table described in labelling is containing table;
Corresponding reference list set when obtaining the invocation point of the containing table and its containing table establishment;
If having at least one reference list to there is no the containing table in corresponding reference list set when the containing table is created The corresponding previous SQL statement of invocation point input table set in, then containing table described in update mark be can not be by containing Table;
If all there is drawing for the containing table in all references table when the containing table is created in corresponding reference list set In output table set with the corresponding previous SQL statement of point, it is determined that the containing table of the labelling.
Further, in the said equipment, the filtration array apparatus are used for:
By all unreferenced of the table of the corresponding establishment of SQL statement after after the preliminary filter table or further filtering Row initialization in row Work List, be repeated below step, until the row Work List for sky after, by the nothing of labelling Deleted with row, the SQL statement after being optimized:
Take out from the row Work List every time useless row are classified as described in a row labelling;
After subtracting 1 by the reference number of times of all references row cited in the useless row, if the reference of the referenced column Number of times is zero, then by it is described reference number of times be zero when corresponding referenced column add into the row Work List.
Compared with prior art, the application obtains at least two SQL statement at least two SQL languages to obtaining first Sentence is analyzed and extracts, and obtains the correspondence of the row of the essential information, the corresponding table of each SQL statement and its table of SQL statement Relation;Then, based on the essential information of the SQL statement, the corresponding table of SQL statement and its table row corresponding relation, delete institute The useless table in SQL statement is stated, the SQL statement after preliminary filter table is obtained;Finally, delete the SQL statement after preliminary filter table In the corresponding useless row of table, the SQL statement after being optimized realizes corresponding from least two SQL statement and SQL statement It is on the basis of the corresponding relation of the row of table and its table, pre-optimized to carry out to a plurality of SQL statement, do not only reach to each The purpose that SQL statement is optimized, also improves overall SQL effect of optimization, and then saves of the prior art precalculating Resource overhead.
Description of the drawings
By reading the detailed description made to non-limiting example made with reference to the following drawings, the application other Feature, objects and advantages will become more apparent upon:
Fig. 1 illustrates a kind of schematic flow sheet of the SQL optimization methods according to the application one side;
Fig. 2 illustrates a kind of overall flow schematic diagram of the SQL optimization methods according to the application one side;
Fig. 3 illustrates the structural representation for optimizing equipment according to a kind of SQL of the application one side.
In accompanying drawing, same or analogous reference represents same or analogous part.
Specific embodiment
Below in conjunction with the accompanying drawings the application is described in further detail.
In one typical configuration of the application, terminal, the equipment of service network and trusted party include one or more Processor (CPU), input/output interface, network interface and internal memory.
Internal memory potentially includes the volatile memory in computer-readable medium, random access memory (RAM) and/or The forms such as Nonvolatile memory, such as read only memory (ROM) or flash memory (flash RAM).Internal memory is computer-readable medium Example.
Computer-readable medium includes that permanent and non-permanent, removable and non-removable media can be by any method Or technology is realizing information Store.Information can be computer-readable instruction, data structure, the module of program or other data. The example of the storage medium of computer includes, but are not limited to phase transition internal memory (PRAM), static RAM (SRAM), moves State random access memory (DRAM), other kinds of random access memory (RAM), read only memory (ROM), electric erasable Programmable read only memory (EEPROM), fast flash memory bank or other memory techniques, read-only optical disc read only memory (CD-ROM), Digital versatile disc (DVD) or other optical storages, magnetic cassette tape, magnetic disk storage or other magnetic storage apparatus or Any other non-transmission medium, can be used to store the information that can be accessed by a computing device.Define according to herein, computer Computer-readable recording medium does not include non-temporary computer readable media (transitory media), the such as data signal and carrier wave of modulation.
Fig. 1 illustrates a kind of schematic flow sheet of SQL optimization methods of the application one side, and the method is applied to include The application apparatus end of at least two SQL statement, specifically includes:Step S11, step S12 and step S13, wherein, step described first At least two SQL statement are analyzed and are extracted by rapid S11, obtain the essential information of SQL statement, each SQL statement The corresponding relation of the row of corresponding table and its table;Then step S12, based on the essential information of the SQL statement, described The corresponding relation of the row of the corresponding table of SQL statement and its table, after the useless table in the deletion SQL statement obtains preliminary filter table SQL statement;Last step S13, deletes the useless row in the SQL statement after the preliminary filter table, after being optimized SQL statement, realize the basis of the corresponding relation of row from least two SQL statement and the corresponding table of SQL statement and its table On, it is pre-optimized to carry out to a plurality of SQL statement, the purpose being optimized to each SQL statement is not only reached, has also been improved Overall SQL effect of optimization, and then save precalculated resource overhead of the prior art.
It should be noted that the essential information of the SQL can be including the numbering of SQL statement, SQL statement type, SQL languages The row that the table and SQL statement that the modification of sentence correspondence or the table for creating, SQL statement are quoted is quoted;Wherein, the SQL statement class Type includes creating type, modification type and query type etc..Currently, other existing or SQL for being likely to occur from now on If essential information can be suitably used for the application, should also be included in the application.
In one embodiment of the application, for the convenience that large-scale SQL applications are analyzed, debug and are safeguarded, prior art In often introduce many interim tables and be used for preserving the intermediate result of process.During business migration, because anthropic factor (example Such as, in order to write, code is convenient, and developer only has part during the row of needs, but the usual table are often selected from a table Row be it is useful, or during modification of program, some interim tables in unnecessary centre are not deleted in time), SQL business sheets Body be not optimised (the interim table for for example increasing to debug can cause unnecessary data write operation and/or read operation etc.) and Cannot be optimized due to the global information lacked between global all SQL business, poorly efficient the making of the corresponding tables of a large amount of SQL can be caused With.In order to ensure that the application is in step S11 at least two SQL statement by the effective use of the table after pre-optimized It is analyzed and extracts, obtains the correspondence of the row of the essential information, the corresponding table of each SQL statement and its table of SQL statement Relation, and then global optimization can be carried out to SQL statement from the basis of global SQL statement.
Further, step S11 is analyzed and is extracted at least two SQL statement, obtains the basic of SQL statement The corresponding relation of the row of the corresponding table of information, each SQL statement and its table, including:
Obtain at least two SQL statement;
Each SQL statement is analyzed by morphology, syntax and semantics analysis, obtains including each SQL language The intermediate representation information of the corresponding relation of the row of the essential information, the corresponding table of SQL statement and its table of sentence;
The intermediate representation information is extracted, essential information, the SQL statement for obtaining each SQL statement is corresponding The corresponding relation of the row of table and its table.
In one embodiment of the application, step S11 obtains at least two SQL statement of the whole SQL applications of reaction first, connects , each SQL statement be analyzed by morphology, syntax and semantics analysis, obtain facilitating application program operation one in Between represent (Internal Representation) information, wherein, the intermediate representation packet contains the original of SQL statement Corresponding relation of row of the corresponding table of the essential information of information, such as SQL statement, SQL statement and its table etc.;Due to the centre Expression information is to the optimization between multiple SQL and unfriendly, therefore needs further to extract the intermediate representation information, To obtain the corresponding relation of the row of the essential information, the corresponding table of SQL statement and its table of each SQL statement, for follow-up right It is optimized between multiple SQL statement, realizes analysis and extraction at least two SQL statement.
Further, the corresponding relation of the row of the corresponding table of the SQL statement and its table includes:
The essential information of the corresponding table of the SQL statement;
The essential information of the row of the table.
It should be noted that the essential information of the SQL statement correspondence table can include:Table type (global table or Interim table), the corresponding all row of table, table be located database name, the another name of table name and table, modified by other SQL statement Operation and;(inquiry operation) etc. is quoted by other SQL statement.Wherein, the type of the table includes predefined global table, not Predefined global table and interim table etc., it is described carried out by other SQL statement modification operation include create (create) operate, Delete (drop) operation and update (update) operation etc., being quoted by other SQL statement includes inquiry operation.Certainly, other If the essential information that the SQL statement that is existing or being likely to occur from now on corresponds to table can be suitably used for the application, should also be included in In the application.
It should be noted that the essential information of the row of the corresponding table of the SQL statement can include:Which belongs to Reference information (such as e.g.udf of table, the index (index) in table, the another name quoted by which SQL statement, arranged and row (a, b, c) as id) etc..Certainly, if other essential informations of the row of the table that are existing or being likely to occur from now on can be suitably used for The application, should also be included in the application.
Then embodiments herein, for the ease of being subsequently optimized to SQL statement, obtains institute in step S11 After the corresponding relation of the row for stating the essential information, the corresponding table of each SQL statement and its table of SQL statement, by SQL languages The abstract variable for structure (struct) type of the corresponding table of sentence, while by the abstract domain for structure of row of the table (fields), to be subsequently optimized to multiple SQL.Certainly, the variable of structure (struct) type is only the application Embodiment in the corresponding table of the SQL statement a kind of form of expression, the domain (fields) of the structure is only the application Embodiment in the table row a kind of form of expression, other, other described SQL languages that are existing or being likely to occur from now on If the form of expression of the row of the form of expression and the table of the corresponding table of sentence can be suitably used for the application, the application should be also included in It is interior.
Further, at least two SQL languages in step S11 by morphology, syntax and semantics analysis to acquisition Sentence is analyzed and has been extracted, and obtains the row of the essential information, the corresponding table of the SQL statement and its table of the SQL statement After corresponding relation, step S12 is based on the essential information of the SQL statement, the corresponding table of the SQL statement and its table The corresponding relation of row, the useless table deleted in the SQL statement obtain the SQL statement after preliminary filter table, including:
The row of the essential information, the corresponding table of the SQL statement and its table of life cycle and the SQL statement based on table Corresponding relation, by the interim table without adduction relationship and/or not predefined global table from the corresponding table of the SQL statement Middle deletion, obtains the SQL statement after preliminary filter table.
In one embodiment of the application, the life cycle of the table starts from establishment table (create table) and deletes finally table (drop table).As long as interim table can find create table and drop table, just can determine that the interim table Life cycle, for example, if from create table abc, centre by multiple SQL statement:SQL1, SQL2 ..., Sn institutes Quote, until the whole process of drop table abc, then the whole process is used to represent the life cycle of the interim table abc. In the life cycle of the interim table abc of optimizing phase, if no SQL statement uses the interim table abc, then the interim table Abc is exactly dead, need not be created, i.e., embodiments herein is in the base based on the life cycle in table and the SQL statement During the corresponding relation of the row of this information, the corresponding table of the SQL statement and its table is optimized to SAL sentences, It is interim that create table abc refer to this up to the none of SQL statement of whole process of drop table abc Table abc, then illustrate our this table abc be it is dead, then SQL optimization during by the interim table without adduction relationship from institute After the SQL statement after obtaining preliminary filter table is deleted in stating the corresponding table of SQL statement, in actual execution SQL statement During need not then create the interim table abc with save create the interim table abc time.
Then above-described embodiment of the application, for global table, in analytical type application, SQL statement is dynamic input, Therefore all of SQL statement cannot be analyzed, and then can not directly judge whether global table is useful.Due to having externally in common application The interface table for providing interface is fixed, therefore can define predefined global table living by predefined mode, and is not made a reservation for Justice global table due to SQL statement perform terminate after, will not be by other SQL statement, the corresponding table of other SQL statement and its row The object information of current SQL sentence execution is used, then the life cycle of not predefined global table starts from establishment global table and terminates in At the ending of the current SQL statement of analysis, i.e., add drop table at the corresponding ending of SQL statement, with reach based on The corresponding relation of the row of the essential information of the life cycle of table and the SQL statement, the corresponding table of the SQL statement and its table, The purpose of the not predefined global table in the corresponding table of the SQL statement is deleted, the SQL languages after preliminary filter table are obtained Sentence.
Then above-described embodiment of the application, if in the optimizing phase of at least two SQL statement, not only existing , also there is not predefined global table in the interim table of system, then can be based on the life cycle in table and the basic letter of the SQL statement The corresponding relation of breath, the row of the corresponding table of the SQL statement and its table, by the interim table without adduction relationship and it is not predefined entirely Office's deletion of the table from the corresponding table of the SQL statement, to obtain the SQL statement after preliminary filter table, realizes to SQL languages The purpose of the preliminary filter table between sentence.
Further, step S12 based on the essential information of the SQL statement, the corresponding table of the SQL statement and its The corresponding relation of the row of table, after deleting the SQL statement after the useless table in the SQL statement obtains preliminary filter table, also wraps Include:
Data-flow analysis is carried out to each SQL statement based on data-flow equations group, each SQL statement is obtained corresponding Currently available all tables;
Detect table that can be containing in the corresponding currently available all tables of the SQL statement which is containing to uniquely drawing With point, the SQL statement after further filter table is obtained.
It should be noted that the data-flow equations group includes following at least any one:The output table of previous SQL statement Gather corresponding equation;The current corresponding equation of table set for creating or changing;Table and the current table deleted before being changed The corresponding equation of table set;It is current to export the corresponding equation of table set.For example, the output table set of the previous SQL statement Corresponding equation is:IN (s)={ out of previous SQL statement correspondence table }, wherein, during s represents at least two SQL statement The numbering of SQL statement s, is analyzed the output that obtains previous SQL statement to each SQL statement based on equation IN (s) Table set;The current establishment or the corresponding equation of table set changed are:GEN (s)={ table for creating or changing }, based on this Equation GEN (s) is analyzed the table set for obtaining the corresponding current establishments of presently described SQL or changing to each SQL statement; It is described changed before table and the corresponding equation of table set of the current table deleted be:KILL (s)={ table before being changed }+ { table of drop } }, each SQL statement is analyzed based on equation KILL (s) and obtains that presently described SQL is corresponding to be repaiied The table set of table and the current table deleted before changing;The corresponding equation of the current output table set is:OUT (s)=IN (s)- KILL (s)+GEN (s), is analyzed to each SQL statement based on equation OUT (s) that to obtain each SQL statement corresponding Currently available all tables.
In one embodiment of the application, in order to be optimized to SQL statement, also need after step S12 based on construction Data-flow equations group carries out data-flow analysis to each SQL statement, corresponding currently available to obtain each SQL statement All tables.For example, data-flow analysis is carried out to SQL statement S1, S2 and S3 based on the data-flow equations group of construction, wherein, the Equation IN (s) of one SQL statement S1 includes all predefined global tables used, by being based on to each SQL statement Data-flow equations group analysis just can calculate the corresponding set of data-flow equations group one time, below with predefined interface table G and Illustrate as a example by being done as follows to SQL statement S1, S2 and S3 respectively.
For example, SQL statement S1:CREATE TABLE A...;
SQL statement S2:CREATE TABLE B...;
SQL statement S3:UPDATE TABLE A...;
Data-flow analysis is carried out to above-mentioned SQL statement S1, S2 and S3 based on data-flow equations group as follows:
IN (S1)={ G.v0 }, wherein, G.v0 represents working as the interface table G in the output table set of previous SQL statement Front version is v0;
GEN (S1)={ A.v0 }, for indicating that the table that current SQL sentence S1 is created is A, its version number is v0;
KILL (S1 }={ }, for indicate current SQL sentence S1 changed before table and the current table deleted be sky;
OUT (S1)={ G.v0, A.v0 }, for indicate the output table set of current SQL sentence S1 include interface table G.v0 and The Table A .v0 of establishment;
IN (S2)={ G.v0, A.v0 }, for indicating that the input table collection of current SQL sentence S2 is combined into previous SQL statement Output table set OUT (S1) of S1;
GEN (S2)={ B.v0 }, for indicating that the table that current SQL sentence S2 is created is B, its version number is v0;
KILL (S2 }={ }, for indicate current SQL sentence S2 changed before table and the current table deleted be sky;
OUT (S2)={ G.v0, A.v0, B.v0 }, for indicating that the output table set of current SQL sentence S2 includes interface table The table B.v0 of G.v0, Table A .v0 and establishment;
IN (S3)={ G.v0, A.v0, B.v0 }, for indicating that the input table collection of current SQL sentence S3 is combined into previous SQL Output table set OUT (S2) of sentence S2;
GEN (S3)={ A.v1 }, for indicating that the table of current SQL sentence S3 modification is A, its version number is v1;
KILL (S3 }={ A.v0 }, for indicate current SQL sentence S3 changed before table be A.v0;
OUT (S3)={ G.v0, A.v1, B.v0 }, for indicating that the output table set of current SQL sentence S3 includes interface table The table B.v0 of G.v0, Table A .v1 and establishment.
Then above-described embodiment of the application, after the corresponding currently available all tables of each SQL statement are obtained, Step S12 then detects the table that can be containing in the corresponding currently available all tables of the SQL statement which is containing To unique invocation point, obtain the SQL statement after further filter table, realize by a plurality of SQL statement can be with containing table Containing, to increase during reducing establishment table IO (input and output) loads are carried out, the concurrency for processing SQL statement is improved.
Further, in the corresponding currently available all tables of the detection SQL statement in step S12 can in The table held back by its containing to unique invocation point, obtains the SQL statement after further filter table, including by the SQL statement pair The table of the only one of which invocation point in the currently available all tables answered is initialized in table Work List, is repeated below step, Until the table Work List for sky after, by determine the labelling containing to the unique invocation point of containing table after, by institute The containing table for stating labelling deletes the SQL statement after obtaining further filter table:
A table is taken out from the table Work List every time, and simultaneously table described in labelling is containing table;
Corresponding reference list set when obtaining the invocation point of the containing table and its containing table establishment;
If having at least one reference list to there is no the containing table in corresponding reference list set when the containing table is created The corresponding previous SQL statement of invocation point input table set in, then containing table described in update mark be can not be by containing Table;
If all there is drawing for the containing table in all references table when the containing table is created in corresponding reference list set In output table set with the corresponding previous SQL statement of point, it is determined that the containing table of the labelling.
In one embodiment of the application, it is containing before the corresponding SQL statement of table be following two SQL with:create temporary table abc as select*from XYZ;select*from abc;Corresponding SQL statement after containing For:Select*from (select*from XYZ), realizes to the containing of table abc.
For example, creating the corresponding table of SQL statement can increase extra I/O load, reduce concurrency.If a table only has One invocation point, then need not create this table, but the table not necessarily being capable of inline (containing) SQL in invocation point.Detection is only Whether the table for having an invocation point can be needed using the corresponding set of above-mentioned data-flow equations group with inline (containing).Detection In SQL can be as follows in containing table method:
Initialization:The table of the only one of which invocation point in the corresponding currently available all tables of the SQL statement is found, and All tables are initialized in a table Work List (Tworklist);
The taking-up of table is carried out successively to above-mentioned table Work List (Tworklist) and is detected, until the table Work List After for sky, before any one table rb in table set { sclist } is existed the containing table tab invocation point sr is corresponding In input table set IN (sr) of one SQL statement, it is determined that containing table tabs of the containing table tab for labelling, then by determination After containing to the unique invocation point of containing table of the labelling, the containing table of the labelling is deleted and obtains further filter table SQL statement afterwards, realize to SQL statement in can be containing table optimization.
For example:Following analysis is done to SQL statement S4, S5, S6 and S7:
S4:A=1;
S5:X=a+b;// table x is labeled as into containing table, the table set quoted during the corresponding establishment of current SQL sentence S5 { sclist } is { av1, b }, and wherein av1 represents that the current version of table a is v1
S6:A=4;// now a changed, then OUT (S3)={ av2, b }, wherein av2 represent table a by amended current Version is v2
S7:Y=x+1;The input table collection of // current SQL sentence S7 is combined into the output table set of previous SQL statement S6, i.e., IN (S6)=IN (y)={ av2, b }
There is a reference list in corresponding reference list set { sclist }={ av1, b } when creating due to the containing table x Av1 is not present in input table set IN (y)={ av2, the b } of the corresponding previous SQL statement of invocation point y of the containing table x, Then containing table x described in update mark be can not be by containing table.
In one embodiment of the application, as useless row refer to the row for not having any code to use, moreover useless row can increase The time of table is created, is increased the burden of disk I/O (input and output) and network, is had a strong impact on the execution efficiency of SQL statement.It is existing Only wall scroll SQL statement is optimized in technology, it is impossible to which Statistical error goes out useless row, therefore in the embodiment of the present application by extremely Few two SQL statement analysis detects these useless row, with the SQL statement after being optimized, specifically as step S13 is deleted The useless row in the SQL statement after preliminary filter table, the SQL statement after being optimized, including:
By all unreferenced of the table of the corresponding establishment of SQL statement after after the preliminary filter table or further filtering Row initialization in row Work List, be repeated below step, until the row Work List for sky after, by the nothing of labelling Deleted with row, the SQL statement after being optimized:
Take out from the row Work List every time useless row are classified as described in a row labelling;
After subtracting 1 by the reference number of times of all references row cited in the useless row, if the reference of the referenced column Number of times is zero, then by it is described reference number of times be zero when corresponding referenced column add into the row Work List.
In one embodiment of the application, it is a recurrence that the useless row at least two SQL statement are searched in step S13 Process, detailed process are as follows:
Initialization:The table of the corresponding establishment of SQL statement after filtering after searching the preliminary filter table or further All unreferenced row initialization in (create table) is in row Work List (Cworklist)
The taking-up of useless row is carried out successively to above-mentioned row Work List (Cworklist) and is detected, until the row work After list is for sky, the described useless row of labelling are deleted, the SQL statement after being optimized is realized to currently may be used in SQL statement The optimization of the row of table.
For example, arrange q=a+b;// row q is labeled as into useless row, wherein all references cited in useless row q is classified as a and b
When row x is not used by any code, then by the reference number of times of all references row a and b cited in useless row q Subtract 1;
If the reference number of times of the row (a and/or b) in all references row cited in useless row q is 0, a and/or b are added Adding in row Work List, and repeat one row of taking-up from row Work List carries out aforesaid operations, until row working line After table is for sky, useless row are deleted, the SQL statement after being optimized realizes the optimization at least two SQL statement, enters And overall SQL effect of optimization is improve, while also saving the resource for precalculating and creating useless row of the prior art Overhead.
It should be noted that deletion nothing can not only be done in embodiments herein at least two SQL statement drawing Interim table and/or not predefined interim table, by can with containing containing to the unique invocation point of table and delete containing table with And delete the optimization of useless row, can also realize the common sub-queries carried out by least two SQL statement delete (CSE) and SQL performs the optimization such as parallelization.
In one embodiment of the application, Fig. 2 illustrates a kind of overall flow of SQL optimization methods of the one side of the application Schematic diagram, the method comprising the steps of S21, step S22, step S23, step S24 and step S25, wherein,
Step S21, obtains at least two original SQL statement (SQLs);
Step S22, at least two SQL statement (SQLs) to obtaining carry out this method, syntax and semantics analysis and carry Take, obtain the corresponding relation of the row of the essential information, the corresponding table of each SQL statement and its table of each SQL statement;
Step S23, the row based on the essential information of the SQL statement, the corresponding table of the SQL statement and its table At least two SQL statement (SQLs) are carried out the analysis across SQL by corresponding relation, and the useless table deleted in the SQL statement is obtained SQL statement after preliminary filter table;
Step S24, the SQL statement after filtering to preliminary filter table or further carry out across SQL optimization, delete described The useless row in SQL statement after preliminary filter table, the SQL statement after being optimized;
Step S25, at least two SQL statement (SQLs) after output optimization.
Fig. 3 illustrates that a kind of SQL of the application one side optimizes the structural representation of equipment, and the equipment application is in including The application apparatus end of at least two SQL statement, the equipment 1 specifically include:Analysis extraction element 11, filtration meter apparatus 12 and filtration Array apparatus 13, wherein, analysis extraction element 11 described first, at least two SQL statement are analyzed and are extracted, is obtained The corresponding relation of the row of the corresponding table of the essential information of SQL statement, each SQL statement and its table;Then the filtration Meter apparatus 12, the correspondence for the row based on the essential information of the SQL statement, the corresponding table of the SQL statement and its table are closed System, the useless table deleted in the SQL statement obtain the SQL statement after preliminary filter table;The last filtration array apparatus 13, use The useless row in SQL statement after the preliminary filter table is deleted, the SQL statement after being optimized are realized from least two It is on the basis of the corresponding relation of the row of bar SQL statement and the corresponding table of SQL statement and its table, pre- to carry out to a plurality of SQL statement First optimize, do not only reached the purpose being optimized to each SQL statement, also improve overall SQL effect of optimization, Jin Erjie Precalculated resource overhead of the prior art is saved.
It should be noted that the essential information of the SQL can be including the numbering of SQL statement, SQL statement type, SQL languages The row that the table and SQL statement that the modification of sentence correspondence or the table for creating, SQL statement are quoted is quoted;Wherein, the SQL statement class Type includes creating type, modification type and query type etc..Currently, other existing or SQL for being likely to occur from now on If essential information can be suitably used for the application, should also be included in the application.
In one embodiment of the application, for the convenience that large-scale SQL applications are analyzed, debug and are safeguarded, prior art In often introduce many interim tables and be used for preserving the intermediate result of process.During business migration, because anthropic factor (example Such as, in order to write, code is convenient, and developer only has part during the row of needs, but the usual table are often selected from a table Row be it is useful, or during modification of program, some interim tables in unnecessary centre are not deleted in time), SQL business sheets Body be not optimised (the interim table for for example increasing to debug can cause unnecessary data write operation and/or read operation etc.) and Cannot be optimized due to the global information lacked between global all SQL business, poorly efficient the making of the corresponding tables of a large amount of SQL can be caused With.In order to ensure that the application is in the analysis extraction element 11 at least two by the effective use of the table after pre-optimized SQL statement is analyzed and extracts, to obtain the row of the essential information, the corresponding table of each SQL statement and its table of SQL statement Corresponding relation, and then global optimization can be carried out to SQL statement from the basis of global SQL statement.
Further, the analysis extraction element 11 is used for:
Obtain at least two SQL statement;
Each SQL statement is analyzed by morphology, syntax and semantics analysis, obtains including each SQL language The intermediate representation information of the corresponding relation of the row of the essential information, the corresponding table of SQL statement and its table of sentence;
The intermediate representation information is extracted, essential information, the SQL statement for obtaining each SQL statement is corresponding The corresponding relation of the row of table and its table.
In one embodiment of the application, at least two SQL languages that extraction element 11 obtains the whole SQL applications of reaction first are analyzed Sentence, then, is analyzed to each SQL statement by morphology, syntax and semantics analysis, obtains facilitating application program operation One intermediate representation (Internal Representation) information, wherein, the intermediate representation packet contains SQL statement Corresponding relation of row of the corresponding table of the essential information of raw information, such as SQL statement, SQL statement and its table etc.;Due to described Intermediate representation information is to the optimization between multiple SQL and unfriendly, therefore needs further to carry out the intermediate representation information Extract, to obtain the corresponding relation of the row of the essential information, the corresponding table of SQL statement and its table of each SQL statement, after being used for Continue to being optimized between multiple SQL statement, realize analysis and extraction at least two SQL statement.
Further, the corresponding relation of the row of the corresponding table of the SQL statement and its table includes:
The essential information of the corresponding table of the SQL statement;
The essential information of the row of the table.
It should be noted that the essential information of the SQL statement correspondence table can include:Table type (global table or Interim table), the corresponding all row of table, table be located database name, the another name of table name and table, modified by other SQL statement Operation and;(inquiry operation) etc. is quoted by other SQL statement.Wherein, the type of the table includes predefined global table, not Predefined global table and interim table etc., it is described carried out by other SQL statement modification operation include create (create) operate, Delete (drop) operation and update (update) operation etc., being quoted by other SQL statement includes inquiry operation.Certainly, other If the essential information that the SQL statement that is existing or being likely to occur from now on corresponds to table can be suitably used for the application, should also be included in In the application.
It should be noted that the essential information of the row of the corresponding table of the SQL statement can include:Which belongs to Reference information (such as e.g.udf of table, the index (index) in table, the another name quoted by which SQL statement, arranged and row (a, b, c) as id) etc..Certainly, if other essential informations of the row of the table that are existing or being likely to occur from now on can be suitably used for The application, should also be included in the application.
Then embodiments herein, for the ease of being subsequently optimized to SQL statement, analyzes extraction element 11 described After the corresponding relation of the row for obtaining the essential information, the corresponding table of each SQL statement and its table of the SQL statement, By the SQL statement corresponding table abstract variable for structure (struct) type, while by the abstract domain for structure of row of the table (fields), to be subsequently optimized to multiple SQL.Certainly, the variable of structure (struct) type is only the application Embodiment in the corresponding table of the SQL statement a kind of form of expression, the domain (fields) of the structure is only the application Embodiment in the table row a kind of form of expression, other, other described SQL languages that are existing or being likely to occur from now on If the form of expression of the row of the form of expression and the table of the corresponding table of sentence can be suitably used for the application, the application should be also included in It is interior.
Further, the analysis extraction element 11 is used at least two by morphology, syntax and semantics analysis to acquisition Bar SQL statement is analyzed and has been extracted, and obtains essential information, the corresponding table of the SQL statement and its table of the SQL statement Row corresponding relation after, it is described filtration meter apparatus 12 be used for:
The row of the essential information, the corresponding table of the SQL statement and its table of life cycle and the SQL statement based on table Corresponding relation, by the interim table without adduction relationship and/or not predefined global table from the corresponding table of the SQL statement Middle deletion, obtains the SQL statement after preliminary filter table.
In one embodiment of the application, the life cycle of the table starts from establishment table (create table) and deletes finally table (drop table).As long as interim table can find create table and drop table, just can determine that the interim table Life cycle, for example, if from create table abc, centre by multiple SQL statement:SQL1, SQL2 ..., Sn institutes Quote, until the whole process of drop table abc, then the whole process is used to represent the life cycle of the interim table abc. In the life cycle of the interim table abc of optimizing phase, if no SQL statement uses the interim table abc, then the interim table Abc is exactly dead, need not be created, i.e., embodiments herein is in the base based on the life cycle in table and the SQL statement During the corresponding relation of the row of this information, the corresponding table of the SQL statement and its table is optimized to SAL sentences, It is interim that create table abc refer to this up to the none of SQL statement of whole process of drop table abc Table abc, then illustrate our this table abc be it is dead, then SQL optimization during by the interim table without adduction relationship from institute After the SQL statement after obtaining preliminary filter table is deleted in stating the corresponding table of SQL statement, in actual execution SQL statement During need not then create the interim table abc with save create the interim table abc time.
Then above-described embodiment of the application, for global table, in analytical type application, SQL statement is dynamic input, Therefore all of SQL statement cannot be analyzed, and then can not directly judge whether global table is useful.Due to having externally in common application The interface table for providing interface is fixed, therefore can define predefined global table living by predefined mode, and is not made a reservation for Justice global table due to SQL statement perform terminate after, will not be by other SQL statement, the corresponding table of other SQL statement and its row The object information of current SQL sentence execution is used, then the life cycle of not predefined global table starts from establishment global table and terminates in At the ending of the current SQL statement of analysis, i.e., add drop table at the corresponding ending of SQL statement, with reach based on The corresponding relation of the row of the essential information of the life cycle of table and the SQL statement, the corresponding table of the SQL statement and its table, The purpose of the not predefined global table in the corresponding table of the SQL statement is deleted, the SQL languages after preliminary filter table are obtained Sentence.
Then above-described embodiment of the application, if in the optimizing phase of at least two SQL statement, not only existing , also there is not predefined global table in the interim table of system, then can be based on the life cycle in table and the basic letter of the SQL statement The corresponding relation of breath, the row of the corresponding table of the SQL statement and its table, by the interim table without adduction relationship and it is not predefined entirely Office's deletion of the table from the corresponding table of the SQL statement, to obtain the SQL statement after preliminary filter table, realizes to SQL languages The purpose of the preliminary filter table between sentence.
Further, the filtration meter apparatus 12 are additionally operable to:
Data-flow analysis is carried out to each SQL statement based on data-flow equations group, each SQL statement is obtained corresponding Currently available all tables;
Detect table that can be containing in the corresponding currently available all tables of the SQL statement which is containing to uniquely drawing With point, the SQL statement after further filter table is obtained.
It should be noted that the data-flow equations group includes following at least any one:The output table of previous SQL statement Gather corresponding equation;The current corresponding equation of table set for creating or changing;Table and the current table deleted before being changed The corresponding equation of table set;It is current to export the corresponding equation of table set.For example, the output table set of the previous SQL statement Corresponding equation is:IN (s)={ out of previous SQL statement correspondence table }, wherein, during s represents at least two SQL statement The numbering of SQL statement s, is analyzed the output that obtains previous SQL statement to each SQL statement based on equation IN (s) Table set;The current establishment or the corresponding equation of table set changed are:GEN (s)={ table for creating or changing }, based on this Equation GEN (s) is analyzed the table set for obtaining the corresponding current establishments of presently described SQL or changing to each SQL statement; It is described changed before table and the corresponding equation of table set of the current table deleted be:KILL (s)={ table before being changed }+ { table of drop } }, each SQL statement is analyzed based on equation KILL (s) and obtains that presently described SQL is corresponding to be repaiied The table set of table and the current table deleted before changing;The corresponding equation of the current output table set is:OUT (s)=IN (s)- KILL (s)+GEN (s), is analyzed to each SQL statement based on equation OUT (s) that to obtain each SQL statement corresponding Currently available all tables.
In one embodiment of the application, in order to be optimized to SQL statement, also need based on structure after the filtration meter apparatus 12 The data-flow equations group made carries out data-flow analysis to each SQL statement, and to obtain, each SQL statement is corresponding currently may be used All tables.For example, data-flow analysis is carried out to SQL statement S1, S2 and S3 based on the data-flow equations group of construction, its In, equation IN (s) of first SQL statement S1 comprising all predefined global tables used, by each SQL statement Just the corresponding set of data-flow equations group can be calculated one time based on data-flow equations group analysis, below with predefined interface table G and illustrate as a example by being done as follows to SQL statement S1, S2 and S3 respectively.
For example, SQL statement S1:CREATE TABLE A...;
SQL statement S2:CREATE TABLE B...;
SQL statement S3:UPDATE TABLE A...;
Data-flow analysis is carried out to above-mentioned SQL statement S1, S2 and S3 based on data-flow equations group as follows:
IN (S1)={ G.v0 }, wherein, G.v0 represents working as the interface table G in the output table set of previous SQL statement Front version is v0;
GEN (S1)={ A.v0 }, for indicating that the table that current SQL sentence S1 is created is A, its version number is v0;
KILL (S1 }={ }, for indicate current SQL sentence S1 changed before table and the current table deleted be sky;
OUT (S1)={ G.v0, A.v0 }, for indicate the output table set of current SQL sentence S1 include interface table G.v0 and The Table A .v0 of establishment;
IN (S2)={ G.v0, A.v0 }, for indicating that the input table collection of current SQL sentence S2 is combined into previous SQL statement Output table set OUT (S1) of S1;
GEN (S2)={ B.v0 }, for indicating that the table that current SQL sentence S2 is created is B, its version number is v0;
KILL (S2 }={ }, for indicate current SQL sentence S2 changed before table and the current table deleted be sky;
OUT (S2)={ G.v0, A.v0, B.v0 }, for indicating that the output table set of current SQL sentence S2 includes interface table The table B.v0 of G.v0, Table A .v0 and establishment;
IN (S3)={ G.v0, A.v0, B.v0 }, for indicating that the input table collection of current SQL sentence S3 is combined into previous SQL Output table set OUT (S2) of sentence S2;
GEN (S3)={ A.v1 }, for indicating that the table of current SQL sentence S3 modification is A, its version number is v1;
KILL (S3 }={ A.v0 }, for indicate current SQL sentence S3 changed before table be A.v0;
OUT (S3)={ G.v0, A.v1, B.v0 }, for indicating that the output table set of current SQL sentence S3 includes interface table The table B.v0 of G.v0, Table A .v1 and establishment.
Then above-described embodiment of the application, after the corresponding currently available all tables of each SQL statement are obtained, It is described filter meter apparatus 12 then detect in the corresponding currently available all tables of the SQL statement can be containing table and by its Containing to unique invocation point, obtains the SQL statement after further filter table, realize by a plurality of SQL statement can be with containing Table carry out containing, the IO (input and output) that increases during reducing establishment table load, improve the concurrency for processing SQL statement.
Further, the filtration meter apparatus 12 are used for:By in the SQL statement corresponding currently available all tables The table of only one of which invocation point is initialized in table Work List, is repeated below step, until after the table Work List is for sky, After by containing to the unique invocation point of containing table of the labelling for determining, the containing table of the labelling is deleted and is obtained into one SQL statement after step filter table:
A table is taken out from the table Work List every time, and simultaneously table described in labelling is containing table;
Corresponding reference list set when obtaining the invocation point of the containing table and its containing table establishment;
If having at least one reference list to there is no the containing table in corresponding reference list set when the containing table is created The corresponding previous SQL statement of invocation point input table set in, then containing table described in update mark be can not be by containing Table;
If all there is drawing for the containing table in all references table when the containing table is created in corresponding reference list set In output table set with the corresponding previous SQL statement of point, it is determined that the containing table of the labelling.
In one embodiment of the application, it is containing before the corresponding SQL statement of table be following two SQL with:create temporary table abc as select*from XYZ;select*from abc;Corresponding SQL statement after containing For:Select*from (select*from XYZ), realizes to the containing of table abc.
For example, creating the corresponding table of SQL statement can increase extra I/O load, reduce concurrency.If a table only has One invocation point, then need not create this table, but the table not necessarily being capable of inline (containing) SQL in invocation point.Detection is only Whether the table for having an invocation point can be needed using the corresponding set of above-mentioned data-flow equations group with inline (containing).Detection In SQL can be as follows in containing table method:
Initialization:The table of the only one of which invocation point in the corresponding currently available all tables of the SQL statement is found, and All tables are initialized in a table Work List (Tworklist);
The taking-up of table is carried out successively to above-mentioned table Work List (Tworklist) and is detected, until the table Work List After for sky, before any one table rb in table set { sclist } is existed the containing table tab invocation point sr is corresponding In input table set IN (sr) of one SQL statement, it is determined that containing table tabs of the containing table tab for labelling, then by determination After containing to the unique invocation point of containing table of the labelling, the containing table of the labelling is deleted and obtains further filter table SQL statement afterwards, realize to SQL statement in can be containing table optimization.
For example:Following analysis is done to SQL statement S4, S5, S6 and S7:
S4:A=1;
S5:X=a+b;// table x is labeled as into containing table, the table set quoted during the corresponding establishment of current SQL sentence S5 { sclist } is { av1, b }, and wherein av1 represents that the current version of table a is v1
S6:A=4;// now a changed, then OUT (S3)={ av2, b }, wherein av2 represent table a by amended current Version is v2
S7:Y=x+1;The input table collection of // current SQL sentence S7 is combined into the output table set of previous SQL statement S6, i.e., IN (S6)=IN (y)={ av2, b }
There is a reference list in corresponding reference list set { sclist }={ av1, b } when creating due to the containing table x Av1 is not present in input table set IN (y)={ av2, the b } of the corresponding previous SQL statement of invocation point y of the containing table x, Then containing table x described in update mark be can not be by containing table.
In one embodiment of the application, as useless row refer to the row for not having any code to use, moreover useless row can increase The time of table is created, is increased the burden of disk I/O (input and output) and network, is had a strong impact on the execution efficiency of SQL statement.It is existing Only wall scroll SQL statement is optimized in technology, it is impossible to which Statistical error goes out useless row, therefore in the embodiment of the present application by extremely Few two SQL statement analysis detects these useless row, with the SQL statement after being optimized, the concrete filtration array apparatus 13 For:
By all unreferenced of the table of the corresponding establishment of SQL statement after after the preliminary filter table or further filtering Row initialization in row Work List, be repeated below step, until the row Work List for sky after, by the nothing of labelling Deleted with row, the SQL statement after being optimized:
Take out from the row Work List every time useless row are classified as described in a row labelling;
After subtracting 1 by the reference number of times of all references row cited in the useless row, if the reference of the referenced column Number of times is zero, then by it is described reference number of times be zero when corresponding referenced column add into the row Work List.
In one embodiment of the application, the useless row searched at least two SQL statement in the filtration array apparatus 13 are individual The process of recurrence, detailed process are as follows:
Initialization:The table of the corresponding establishment of SQL statement after filtering after searching the preliminary filter table or further All unreferenced row initialization in (create table) is in row Work List (Cworklist)
The taking-up of useless row is carried out successively to above-mentioned row Work List (Cworklist) and is detected, until the row work After list is for sky, the described useless row of labelling are deleted, the SQL statement after being optimized is realized to currently may be used in SQL statement The optimization of the row of table.
For example, arrange q=a+b;// row q is labeled as into useless row, wherein all references cited in useless row q is classified as a and b
When row x is not used by any code, then by the reference number of times of all references row a and b cited in useless row q Subtract 1;
If the reference number of times of the row (a and/or b) in all references row cited in useless row q is 0, a and/or b are added Adding in row Work List, and repeat one row of taking-up from row Work List carries out aforesaid operations, until row working line After table is for sky, useless row are deleted, the SQL statement after being optimized realizes the optimization at least two SQL statement, enters And overall SQL effect of optimization is improve, while also saving the resource for precalculating and creating useless row of the prior art Overhead.
It should be noted that deletion nothing can not only be done in embodiments herein at least two SQL statement drawing Interim table and/or not predefined interim table, by can with containing containing to the unique invocation point of table and delete containing table with And delete the optimization of useless row, can also realize the common sub-queries carried out by least two SQL statement delete (CSE) and SQL performs the optimization such as parallelization.
In sum, the application is by obtaining at least two SQL statement at least two SQL statement to acquisition first It is analyzed and extracts, obtains the correspondence pass of the row of the essential information, the corresponding table of each SQL statement and its table of SQL statement System;Then, based on the essential information of the SQL statement, the corresponding table of SQL statement and its table row corresponding relation, delete described Useless table in SQL statement, obtains the SQL statement after preliminary filter table;Finally, delete in the SQL statement after preliminary filter table The corresponding useless row of table, the SQL statement after being optimized realized from least two SQL statement and the corresponding table of SQL statement And its on the basis of the corresponding relation of the row of table, it is pre-optimized to carry out to a plurality of SQL statement, do not only reach to each The purpose that SQL statement is optimized, also improves overall SQL effect of optimization, and then saves of the prior art precalculating Resource overhead.
Obviously, those skilled in the art can carry out the essence of various changes and modification without deviating from the application to the application God and scope.So, if these modifications of the application and modification belong to the scope of the application claim and its equivalent technologies Within, then the application is also intended to comprising these changes and modification.
It should be noted that the application can be carried out in the assembly of software and/or software with hardware, for example, can adopt Realized with special IC (ASIC), general purpose computer or any other similar hardware device.In one embodiment In, the software program of the application can pass through computing device to realize steps described above or function.Similarly, the application Software program (including related data structure) can be stored in computer readable recording medium storing program for performing, for example, RAM memory, Magnetically or optically driver or floppy disc and similar devices.In addition, some steps or function of the application can employ hardware to realize, example Such as, as coordinating so as to perform the circuit of each step or function with processor.
In addition, the part of the application can be applied to computer program, such as computer program instructions, when its quilt When computer is performed, by the operation of the computer, can call or provide according to the present processes and/or technical scheme. And the programmed instruction of the present processes is called, it is possibly stored in fixed or moveable recording medium, and/or passes through Data flow in broadcast or other signal bearing medias and be transmitted, and/or be stored according to described program instruction operation In the working storage of computer equipment.Here, including a device according to one embodiment of the application, the device includes using In the memorizer and the processor for execute program instructions of storage computer program instructions, wherein, when the computer program refers to When order is by the computing device, method and/or skill of the plant running based on aforementioned multiple embodiments according to the application are triggered Art scheme.
It is obvious to a person skilled in the art that the application is not limited to the details of above-mentioned one exemplary embodiment, Er Qie In the case of without departing substantially from spirit herein or basic feature, the application can be realized in other specific forms.Therefore, no matter From the point of view of which point, embodiment all should be regarded as exemplary, and be nonrestrictive, scope of the present application is by appended power Profit is required rather than described above is limited, it is intended that all in the implication and scope of the equivalency of claim by falling Change is included in the application.Any reference in claim should not be considered as and limit involved claim.This Outward, it is clear that " including ", a word was not excluded for other units or step, and odd number is not excluded for plural number.That what is stated in device claim is multiple Unit or device can also be realized by software or hardware by a unit or device.The first, the second grade word is used for table Show title, and be not offered as any specific order.

Claims (16)

1. a kind of SQL optimization methods, wherein, methods described includes:
At least two SQL statement are analyzed and are extracted, the essential information of SQL statement, each SQL statement pair is obtained The table answered and its corresponding relation of the row of table;
Based on the corresponding relation of the row of the essential information of the SQL statement, the corresponding table of the SQL statement and its table, institute is deleted State the useless table in SQL statement and obtain the SQL statement after preliminary filter table;
Delete the useless row in the SQL statement after the preliminary filter table, the SQL statement after being optimized.
2. method according to claim 1, wherein, it is described that at least two SQL statement are analyzed and are extracted, obtain The corresponding relation of the row of the corresponding table of the essential information of SQL statement, each SQL statement and its table, including:
Obtain at least two SQL statement;
Each SQL statement is analyzed by morphology, syntax and semantics analysis, obtains including each SQL statement The intermediate representation information of the corresponding relation of the row of the corresponding table of essential information, SQL statement and its table;
The intermediate representation information is extracted, obtain the essential information of each SQL statement, the corresponding table of SQL statement and The corresponding relation of the row of its table.
3. method according to claim 1 and 2, wherein, the corresponding relation of the row of the corresponding table of the SQL statement and its table Including:
The essential information of the corresponding table of the SQL statement;
The essential information of the row of the table.
4. method according to claim 1, wherein, it is described based on the essential information of the SQL statement, the SQL statement The corresponding relation of the row of corresponding table and its table, the useless table deleted in the SQL statement obtain the SQL languages after preliminary filter table Sentence, including:
The row of the essential information, the corresponding table of the SQL statement and its table of life cycle and the SQL statement based on table it is right Should be related to, the interim table without adduction relationship and/or not predefined global table are deleted from the corresponding table of the SQL statement Remove, obtain the SQL statement after preliminary filter table.
5. method according to claim 1, wherein, it is described based on the essential information of the SQL statement, the SQL statement The corresponding relation of the row of corresponding table and its table, the useless table deleted in the SQL statement obtain the SQL languages after preliminary filter table After sentence, also include:
Data-flow analysis is carried out to each SQL statement based on data-flow equations group, each SQL statement is obtained corresponding current Available all tables;
Detect in the corresponding currently available all tables of the SQL statement can be containing table and which containing to unique is quoted Point, obtains the SQL statement after further filter table.
6. method according to claim 5, wherein, the data-flow equations group includes following at least any one:
The corresponding equation of output table set of previous SQL statement;
The current corresponding equation of table set for creating or changing;
The corresponding equation of table set of table and the current table deleted before being changed;
It is current to export the corresponding equation of table set.
7. method according to claim 6, wherein, the corresponding currently available all tables of the detection SQL statement In can be containing table and by its containing to unique invocation point, obtain the SQL statement after further filter table, including will be described The table of the only one of which invocation point in the corresponding currently available all tables of SQL statement is initialized in table Work List, is repeated Following steps, until after the table Work List is for sky, by containing to the unique invocation point of containing table of the labelling for determining Afterwards, the containing table of the labelling is deleted and obtains the SQL statement after further filter table:
A table is taken out from the table Work List every time, and simultaneously table described in labelling is containing table;
Corresponding reference list set when obtaining the invocation point of the containing table and its containing table establishment;
If having at least one reference list to there is no drawing for the containing table in corresponding reference list set when the containing table is created With putting in the input table set of corresponding previous SQL statement, then containing table described in update mark be can not be by containing table;
If all there is the invocation point of the containing table in all references table when the containing table is created in corresponding reference list set In the output table set of corresponding previous SQL statement, it is determined that the containing table of the labelling.
8. method according to claim 1 or 5, wherein, the nothing in the SQL statement deleted after the preliminary filter table With row, the SQL statement after being optimized, including:
By all unreferenced row of the table of the corresponding establishment of SQL statement after after the preliminary filter table or further filtering Initialization is repeated below step in row Work List, until after the row Work List is for sky, by the described useless row of labelling Delete, the SQL statement after being optimized:
Take out from the row Work List every time useless row are classified as described in a row labelling;
After subtracting 1 by the reference number of times of all references row cited in the useless row, if the reference number of times of the referenced column Be zero, then by it is described reference number of times be zero when corresponding referenced column add into the row Work List.
9. a kind of SQL optimizes equipment, wherein, the equipment includes:
Analysis extraction element, at least two SQL statement are analyzed and are extracted, obtains the essential information of SQL statement, every The corresponding relation of the row of the corresponding table of one SQL statement and its table;
Meter apparatus are filtered, for the row based on the essential information of the SQL statement, the corresponding table of the SQL statement and its table Corresponding relation, the useless table deleted in the SQL statement obtain the SQL statement after preliminary filter table;
Filter array apparatus, the SQL languages for deleting the useless row in the SQL statement after the preliminary filter table, after being optimized Sentence.
10. equipment according to claim 9, wherein, the analysis extraction element is used for:
Obtain at least two SQL statement;
Each SQL statement is analyzed by morphology, syntax and semantics analysis, obtains including each SQL statement The intermediate representation information of the corresponding relation of the row of the corresponding table of essential information, SQL statement and its table;
The intermediate representation information is extracted, obtain the essential information of each SQL statement, the corresponding table of SQL statement and The corresponding relation of the row of its table.
11. equipment according to claim 9 or 10, wherein, the correspondence pass of the row of the corresponding table of the SQL statement and its table System includes:
The essential information of the corresponding table of the SQL statement;
The essential information of the row of the table.
12. equipment according to claim 9, wherein, the filtration meter apparatus are used for:
The row of the essential information, the corresponding table of the SQL statement and its table of life cycle and the SQL statement based on table it is right Should be related to, the interim table without adduction relationship and/or not predefined global table are deleted from the corresponding table of the SQL statement Remove, obtain the SQL statement after preliminary filter table.
13. equipment according to claim 9, wherein, the filtration meter apparatus are additionally operable to:
Data-flow analysis is carried out to each SQL statement based on data-flow equations group, each SQL statement is obtained corresponding current Available all tables;
Detect in the corresponding currently available all tables of the SQL statement can be containing table and which containing to unique is quoted Point, obtains the SQL statement after further filter table.
14. equipment according to claim 13, wherein, the data-flow equations group includes following at least any one:
The corresponding equation of output table set of previous SQL statement;
The current corresponding equation of table set for creating or changing;
The corresponding equation of table set of table and the current table deleted before being changed;
It is current to export the corresponding equation of table set.
15. equipment according to claim 14, wherein, the filtration meter apparatus are used for:
The table of the only one of which invocation point in the SQL statement corresponding currently available all tables is initialized in table working line In table, step is repeated below, until after the table Work List is for sky, will be the containing table of the labelling for determining containing to unique Invocation point after, the containing table of the labelling is deleted and obtains the SQL statement after further filter table:
A table is taken out from the table Work List every time, and simultaneously table described in labelling is containing table;
Corresponding reference list set when obtaining the invocation point of the containing table and its containing table establishment;
If having at least one reference list to there is no drawing for the containing table in corresponding reference list set when the containing table is created With putting in the input table set of corresponding previous SQL statement, then containing table described in update mark be can not be by containing table;
If all there is the invocation point of the containing table in all references table when the containing table is created in corresponding reference list set In the output table set of corresponding previous SQL statement, it is determined that the containing table of the labelling.
16. equipment according to claim 9 or 13, wherein, the filtration array apparatus are used for:
By all unreferenced row of the table of the corresponding establishment of SQL statement after after the preliminary filter table or further filtering Initialization is repeated below step in row Work List, until after the row Work List is for sky, by the described useless row of labelling Delete, the SQL statement after being optimized:
Take out from the row Work List every time useless row are classified as described in a row labelling;
After subtracting 1 by the reference number of times of all references row cited in the useless row, if the reference number of times of the referenced column Be zero, then by it is described reference number of times be zero when corresponding referenced column add into the row Work List.
CN201611109489.1A 2016-12-02 2016-12-02 SQL optimization method and equipment Active CN106611044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611109489.1A CN106611044B (en) 2016-12-02 2016-12-02 SQL optimization method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611109489.1A CN106611044B (en) 2016-12-02 2016-12-02 SQL optimization method and equipment

Publications (2)

Publication Number Publication Date
CN106611044A true CN106611044A (en) 2017-05-03
CN106611044B CN106611044B (en) 2020-05-08

Family

ID=58636612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611109489.1A Active CN106611044B (en) 2016-12-02 2016-12-02 SQL optimization method and equipment

Country Status (1)

Country Link
CN (1) CN106611044B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704511A (en) * 2017-08-31 2018-02-16 星环信息科技(上海)有限公司 A kind of SQL optimization methods and equipment
CN108170775A (en) * 2017-12-26 2018-06-15 上海新炬网络技术有限公司 A kind of database SQL indexes dynamic optimization method
CN109800240A (en) * 2018-12-13 2019-05-24 平安科技(深圳)有限公司 SQL statement classifying method, device, computer equipment and storage medium
CN110019313A (en) * 2017-12-30 2019-07-16 中国移动通信集团四川有限公司 Method for optimizing resources, device, equipment and medium
CN110119403A (en) * 2019-04-03 2019-08-13 北京三快在线科技有限公司 SQL optimization method, device, electronic equipment and readable storage medium storing program for executing
CN110555035A (en) * 2018-05-31 2019-12-10 阿里巴巴集团控股有限公司 Method and device for optimizing query statement
CN110895529A (en) * 2018-08-23 2020-03-20 马上消费金融股份有限公司 Processing method of structured query language and related device
CN110955662A (en) * 2019-11-29 2020-04-03 车智互联(北京)科技有限公司 Method, computing device and storage medium for maintaining data table association relation
CN110968593A (en) * 2019-12-10 2020-04-07 上海达梦数据库有限公司 Database SQL statement optimization method, device, equipment and storage medium
CN115552391A (en) * 2020-05-12 2022-12-30 谷歌有限责任公司 Zero copy optimization of Select query

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103714058A (en) * 2012-09-28 2014-04-09 Sap股份公司 Database query optimization
CN104765731A (en) * 2014-01-02 2015-07-08 国际商业机器公司 Database query optimization method and equipment
US20160042068A1 (en) * 2014-08-08 2016-02-11 Software Ag Service oriented query and service query language framework
CN105701128A (en) * 2014-11-28 2016-06-22 华为技术有限公司 Query statement optimization method and apparatus

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103714058A (en) * 2012-09-28 2014-04-09 Sap股份公司 Database query optimization
CN104765731A (en) * 2014-01-02 2015-07-08 国际商业机器公司 Database query optimization method and equipment
US20160042068A1 (en) * 2014-08-08 2016-02-11 Software Ag Service oriented query and service query language framework
CN105701128A (en) * 2014-11-28 2016-06-22 华为技术有限公司 Query statement optimization method and apparatus

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704511B (en) * 2017-08-31 2019-09-06 星环信息科技(上海)有限公司 A kind of SQL optimization method and equipment
CN107704511A (en) * 2017-08-31 2018-02-16 星环信息科技(上海)有限公司 A kind of SQL optimization methods and equipment
CN108170775A (en) * 2017-12-26 2018-06-15 上海新炬网络技术有限公司 A kind of database SQL indexes dynamic optimization method
CN110019313B (en) * 2017-12-30 2021-09-28 中国移动通信集团四川有限公司 Resource optimization method, device, equipment and medium
CN110019313A (en) * 2017-12-30 2019-07-16 中国移动通信集团四川有限公司 Method for optimizing resources, device, equipment and medium
CN110555035A (en) * 2018-05-31 2019-12-10 阿里巴巴集团控股有限公司 Method and device for optimizing query statement
CN110895529A (en) * 2018-08-23 2020-03-20 马上消费金融股份有限公司 Processing method of structured query language and related device
CN110895529B (en) * 2018-08-23 2021-03-30 马上消费金融股份有限公司 Processing method of structured query language and related device
CN109800240A (en) * 2018-12-13 2019-05-24 平安科技(深圳)有限公司 SQL statement classifying method, device, computer equipment and storage medium
CN109800240B (en) * 2018-12-13 2024-03-22 平安科技(深圳)有限公司 SQL sentence classifying method, device, computer equipment and storage medium
CN110119403A (en) * 2019-04-03 2019-08-13 北京三快在线科技有限公司 SQL optimization method, device, electronic equipment and readable storage medium storing program for executing
CN110955662A (en) * 2019-11-29 2020-04-03 车智互联(北京)科技有限公司 Method, computing device and storage medium for maintaining data table association relation
CN110968593B (en) * 2019-12-10 2023-10-03 上海达梦数据库有限公司 Database SQL statement optimization method, device, equipment and storage medium
CN110968593A (en) * 2019-12-10 2020-04-07 上海达梦数据库有限公司 Database SQL statement optimization method, device, equipment and storage medium
CN115552391A (en) * 2020-05-12 2022-12-30 谷歌有限责任公司 Zero copy optimization of Select query
CN115552391B (en) * 2020-05-12 2023-08-25 谷歌有限责任公司 Zero-copy optimization of Select queries

Also Published As

Publication number Publication date
CN106611044B (en) 2020-05-08

Similar Documents

Publication Publication Date Title
CN106611044A (en) SQL optimization method and device
US10509804B2 (en) Method and apparatus for storing sparse graph data as multi-dimensional cluster
CN107038207B (en) Data query method, data processing method and device
CN110019218B (en) Data storage and query method and equipment
US8700605B1 (en) Estimating rows returned by recursive queries using fanout
US8977626B2 (en) Indexing and searching a data collection
US9372890B2 (en) Methods, systems, and computer-readable media for providing a query layer for cloud databases
US10762087B2 (en) Database search
WO2015138497A2 (en) Systems and methods for rapid data analysis
CN110109910A (en) Data processing method and system, electronic equipment and computer readable storage medium
CN104765731A (en) Database query optimization method and equipment
CN102760165B (en) Full text retrieval method using bitmap index and device
US11995059B2 (en) Database index and database query processing method, apparatus, and device
CN107704511A (en) A kind of SQL optimization methods and equipment
CN105630934A (en) Data statistic method and system
CN106815256A (en) Set up the method and device of laws and regulations bar fund incidence relation
Shanoda et al. JOMR: Multi-join optimizer technique to enhance map-reduce job
CN104714956A (en) Comparison method and device for isomerism record sets
CN110990423B (en) SQL statement execution method, device, equipment and storage medium
CN109697234B (en) Multi-attribute information query method, device, server and medium for entity
CN116610700A (en) Query statement detection method and device and storage medium
CN112231531A (en) Data display method, equipment and medium based on openstb
CN111125216A (en) Method and device for importing data into Phoenix
CN115809294A (en) Rapid ETL method based on Spark SQL temporary view
Li et al. FACC: a novel finite automaton based on cloud computing for the multiple longest common subsequences search

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 200233 11-12 / F, building B, 88 Hongcao Road, Xuhui District, Shanghai

Patentee after: Star link information technology (Shanghai) Co.,Ltd.

Address before: 200233 11-12 / F, building B, 88 Hongcao Road, Xuhui District, Shanghai

Patentee before: TRANSWARP TECHNOLOGY (SHANGHAI) Co.,Ltd.

CP01 Change in the name or title of a patent holder
PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A SQL Optimization Method and Equipment

Effective date of registration: 20230616

Granted publication date: 20200508

Pledgee: Bank of China Limited by Share Ltd. Shanghai Xuhui branch

Pledgor: Star link information technology (Shanghai) Co.,Ltd.

Registration number: Y2023310000252

PC01 Cancellation of the registration of the contract for pledge of patent right
PC01 Cancellation of the registration of the contract for pledge of patent right

Granted publication date: 20200508

Pledgee: Bank of China Limited by Share Ltd. Shanghai Xuhui branch

Pledgor: Star link information technology (Shanghai) Co.,Ltd.

Registration number: Y2023310000252

PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A SQL optimization method and device

Granted publication date: 20200508

Pledgee: Bank of China Limited by Share Ltd. Shanghai Xuhui branch

Pledgor: Star link information technology (Shanghai) Co.,Ltd.

Registration number: Y2024310000787