CN111723104A - Method, device and system for syntax analysis in data processing system - Google Patents

Method, device and system for syntax analysis in data processing system

Info

Publication number
CN111723104A
Authority
CN
China
Prior art keywords
operator
syntax analysis
generated
parsing
sql
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910222559.1A
Other languages
Chinese (zh)
Inventor
温绍锦
周祥
王烨
蔡利军
马文博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201910222559.1A priority Critical patent/CN111723104A/en
Publication of CN111723104A publication Critical patent/CN111723104A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/242: Query formulation
    • G06F16/2433: Query languages

Abstract

The application discloses a method, a device and a system for syntax analysis in a data processing system. The method comprises the following steps: performing syntax analysis on the statement to be executed; and, during syntax analysis, calling a corresponding operator processing program by setting a callback interface for the generated operator operation. With this syntax analysis method, the operator processing program that the data processing system would otherwise run on the operator operations of a finished abstract syntax tree can be moved forward to the abstract syntax tree generation stage. This avoids generating a large number of intermediate abstract syntax tree objects, so that the processing efficiency of the data processing system is improved and the occupation of system resources is reduced.

Description

Method, device and system for syntax analysis in data processing system
Technical Field
The present application relates to the field of data analysis, and in particular, to a method, an apparatus, and a system for parsing in a data processing system. In addition, the application also relates to a syntax analysis method and a syntax analysis device for the SQL statement in the data storage system.
Background
In recent years, with the rapid development of network technology, the development demand of data processing systems is changing day by day, and the SQL statements that the data processing systems need to execute are also becoming more and more complex. How to improve the syntax analysis efficiency of the SQL statement and optimize the resource occupation of the data processing system becomes a technical goal pursued by those skilled in the art.
In the prior art, the analysis of a statement to be executed is generally divided into two stages. The first stage is a syntax analysis stage, in which an abstract syntax tree is generated from the statement to be executed. The second stage is an abstract syntax tree processing stage, in which processing logic is executed for each node of the generated abstract syntax tree. Structured Query Language (SQL) is a special-purpose database query and programming language used to access data and to query, update, and manage a relational data processing system. It does not require the user to specify how the data is stored, nor to know the specific storage method, so data processing systems with completely different underlying structures can use the same structured query language as the interface for data input and management.
However, as users send more and more statements to the data processing system through applications, the prior-art approach of first generating the abstract syntax tree and then analyzing it usually produces a large number of intermediate objects. This increases system resource occupation, lowers the efficiency of syntax analysis in the data processing system, and increasingly fails to meet the actual requirements of users.
Disclosure of Invention
The application provides a method, a device and a system for syntax analysis in a data processing system, which aim to solve the problem of low syntax analysis efficiency in the data processing system in the prior art. In addition, the application also provides a syntax analysis method and a syntax analysis device for the SQL statement in the data storage system.
The method for syntax analysis in a data processing system provided by the application comprises the following steps: performing syntax analysis on the statement to be executed; and, during syntax analysis, calling a corresponding operator processing program by setting a callback interface for the generated operator operation.
Optionally, the method further includes: determining whether the syntax analysis is finished and whether the called operator processing program has finished executing, and if so, generating a syntax analysis result that carries the processing result of the operator processing program.
Optionally, the data processing system includes a data storage system, and correspondingly, the statement to be executed includes an SQL statement.
Optionally, calling the corresponding operator processing program during syntax analysis by setting a callback interface for the generated operator operation includes: at the stage of forming an execution plan while parsing the SQL statement to be executed in the data storage system, calling the corresponding operator processing program by setting a callback interface for the generated operator operation of the SQL statement, wherein the execution plan is an abstract syntax tree that is generated in the data storage system from the SQL statement and consists of a plurality of operator operations arranged in a set logical order.
Optionally, the operator operation is a basic operation program generated by analyzing the SQL statement in the data storage system.
Optionally, the operator processing program is configured to perform a preprocessing operation on a data structure of an operator operation generated in a process of performing syntax analysis on an executed SQL statement in the data storage system; wherein the preprocessing operation is a processing operation that needs to be executed by a processing module after the execution plan is generated.
Optionally, before the step of performing the preprocessing operation on the data structure of the operator operation generated in the process of performing the syntax analysis on the executed SQL statement, the following steps are performed: carrying out legality check on a data structure of operator operation generated in the process of carrying out syntax analysis on the executed SQL statement; and the legality test is to judge whether the data structure of the operator operation of the SQL statement conforms to a preset SQL grammar rule.
Optionally, the syntax analysis result includes an operator operation processing result generated after the execution of the corresponding operator processing program is called through the callback interface, and an operator operation not calling the callback interface.
Accordingly, the present application also provides an apparatus for parsing in a data processing system, comprising: the first syntax analysis unit and the first calling unit; the first syntax analysis unit is used for performing syntax analysis on the statement to be executed; and the first calling unit is used for calling the corresponding operator processing program in a mode of setting a callback interface for the generated operator operation in the syntax analysis process.
Optionally, the apparatus further includes: a first judgment unit, configured to determine whether the syntax analysis is finished and whether the called operator processing program has finished executing, and if so, to generate a syntax analysis result that carries the processing result of the operator processing program.
Optionally, the data processing system includes a data storage system, and correspondingly, the statement to be executed includes an SQL statement.
Optionally, the first calling unit is specifically configured to: at the stage of forming an execution plan while parsing the SQL statement to be executed in the data storage system, call the corresponding operator processing program by setting a callback interface for the generated operator operation of the SQL statement, wherein the execution plan is an abstract syntax tree that is generated in the data storage system from the SQL statement and consists of a plurality of operator operations arranged in a set logical order.
Optionally, the operator operation is a basic operation program generated by analyzing the SQL statement in the data storage system.
Optionally, the operator processing program is configured to perform a preprocessing operation on a data structure of an operator operation generated in a process of performing syntax analysis on an executed SQL statement in the data storage system; wherein the preprocessing operation is a processing operation that needs to be executed by a processing module after the execution plan is generated.
Optionally, before the step of performing the preprocessing operation on the data structure of the operator operation generated in the process of performing the syntax analysis on the executed SQL statement, the following steps are performed: carrying out legality check on a data structure of operator operation generated in the process of carrying out syntax analysis on the executed SQL statement; and the legality test is to judge whether the data structure of the operator operation of the SQL statement conforms to a preset SQL grammar rule.
Optionally, the syntax analysis result includes an operator operation processing result generated after the execution of the corresponding operator processing program is called through the callback interface, and an operator operation not calling the callback interface.
Correspondingly, the application also provides a syntax analysis method for the SQL statement in the data storage system, which comprises the following steps: receiving an input SQL statement; carrying out syntactic analysis on SQL sentences to be executed in the data storage system; in the process of syntax analysis, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation; judging whether the syntax analysis is finished and whether the execution of the called operator processing program is finished, if so, generating a syntax analysis result which brings back the processing result of the operator processing program; and processing the output content of the grammar analysis result.
Optionally, in the syntax analysis process, the method of setting a callback interface for the generated operator operation calls a corresponding operator processing program, and includes: and in the process of generating an execution plan by syntax analysis, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation, wherein the execution plan is an abstract syntax tree which is generated in a data storage system according to the SQL statement and consists of a plurality of operator operations according to a set logic sequence.
Correspondingly, the present application also provides a syntax analysis device for SQL statements in a data storage system, which is characterized by comprising: the receiving unit, the second syntax analysis unit, the second calling unit, the second judging unit and the processing unit; the receiving unit is used for receiving the input SQL statement; the second syntax analysis unit is used for performing syntax analysis on SQL sentences needing to be executed in the data storage system; the second calling unit is used for calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation in the process of generating an execution plan by carrying out syntax analysis on the SQL statement in the data storage system, wherein the execution plan is an abstract syntax tree which is generated in the data storage system according to the SQL statement and consists of a plurality of operator operations according to a set logic sequence; the second judging unit is used for judging whether the syntax analysis is finished and whether the execution of the called operator processing program is finished, and if the judgment is yes, generating a syntax analysis result which brings back the processing result of the operator processing program; and the processing unit is used for processing the content of the output syntax analysis result.
Optionally, the second calling unit is specifically configured to: and in the process of generating an execution plan by syntax analysis, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation, wherein the execution plan is an abstract syntax tree which is generated in a data storage system according to the SQL statement and consists of a plurality of operator operations according to a set logic sequence.
Correspondingly, the present application also provides a syntax analysis system in a data processing system, comprising: the syntax analysis device in the data processing system and the syntax analysis device aiming at the SQL statement in the data storage system are provided.
Correspondingly, the present application also provides an electronic device, comprising: a processor and a memory for storing a program of a parsing method in a data processing system, the apparatus performing the following steps after being powered on and running the program of the parsing method in the data processing system by the processor: syntax analysis is carried out on the sentences needing to be executed; and in the process of syntax analysis, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation.
Accordingly, the present application also provides a storage device storing a program of a syntax parsing method in a data processing system, the program being executed by a processor to perform the steps of: syntax analysis is carried out on the sentences needing to be executed; and in the process of syntax analysis, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation.
Compared with the prior art, the method has the following advantages:
With the syntax analysis method in the data processing system, the operator processing program that would otherwise process the operator operations of a finished abstract syntax tree can be moved forward to the abstract syntax tree generation stage. This avoids generating a large number of intermediate abstract syntax tree objects, so that the syntax analysis efficiency of the data processing system is improved and the occupation of system resources is reduced.
Drawings
FIG. 1 is a flow chart of a method of parsing in a data processing system according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating an apparatus for parsing in a data processing system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an electronic device according to an embodiment of the present invention;
FIG. 4 is a flowchart of a syntax parsing method for SQL statements in a data storage system according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a syntax parsing apparatus for SQL statements in a data storage system according to an embodiment of the present invention;
FIG. 6 is a flow diagram of a prior art parsing process within a data processing system;
FIG. 7 is a flow chart of a parsing process in a data processing system according to an embodiment of the present invention;
FIG. 8 is a complete flow diagram of a method for parsing in a data processing system in accordance with an embodiment of the present invention;
FIG. 9 is a schematic diagram of setting a callback interface in a syntax analysis process based on an SQL statement according to an embodiment of the present invention;
FIG. 10 is a prior art flow diagram of parsing based on INSERT statements within a data processing system;
FIG. 11 is a flowchart of an embodiment of parsing based on INSERT statements in a data processing system.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.
The following describes an embodiment of the syntax analysis method in the data processing system according to the present invention in detail. Please refer to fig. 1, which is a flowchart illustrating a syntax parsing method in a data processing system according to an embodiment of the present invention.
The embodiment of the present invention may be implemented based on an SQL Parser (SQL Parser) in a conventional data storage system, and specific implementation steps thereof are described below with reference to fig. 1.
Step S101: perform syntax analysis on the statement to be executed.
In the embodiment of the present invention, the data processing system may refer to a data storage system. The statement to be executed may refer to an SQL statement used for storing object query and management in the data storage system. For example, the SQL statement may specifically include: a SELECT statement (SELECT data operation statement) to retrieve a row and column of data from a table of the data storage system; an INSERT statement (INSERT data operation statement) for adding a new row of data to a data storage system table; a DELETE statement (DELETE data operation statement) for deleting a data line from a data storage system table; an UPDATE statement (UPDATE data operation statement) for updating data in the data storage system table, and the like.
In the embodiment of the present invention, performing syntax analysis on the statement to be executed means parsing the SELECT, INSERT, DELETE and UPDATE statements among the SQL statements to generate an AST (Abstract Syntax Tree) containing a large number of operator operations. An operator operation is a basic operation program generated by analyzing the SQL statement in the data storage system.
Fig. 6 is a flow diagram of a prior art parsing process within a data processing system. In the prior art, the SQL Parser (SQL parsing module) is an indispensable component of a database. At present, SQL Parser is usually the first processing link in the overall flow of SQL statement processing. After an SQL statement is input into the data storage system, syntax analysis generates the AST, and a subsequent processing module in the data storage system then performs processing based on the AST. The process generally follows a waterfall model: SQL Parser generates the AST first, and the subsequent data processing module then traverses the operator operations stored in each node of the AST. It should be noted that the subsequent data processing module that runs after SQL Parser has parsed the statement may be, for example, a Processing module or an Optimizer Processing module.
The conventional processing approach described above typically generates a large number of AST intermediate objects, resulting in a large amount of data storage system resources being occupied. To this end, the present invention provides a method of parsing in a data processing system.
Fig. 7 is a flowchart illustrating a parsing process in a data processing system according to an embodiment of the present invention. First, the SQL statement to be executed is parsed. That is, SQL Parser invokes a callback interface so that the operator processing operations of the operator processing module are called from within SQL Parser while it parses the input SQL statement and generates the abstract syntax tree (AST). In the course of generating the syntax tree, the processing that would otherwise be performed later on the operator operations stored in each AST node is moved forward into the parsing stage. The flow of parsing and processing an SQL statement therefore becomes more compact, fewer AST intermediate objects are generated, and the occupation of system resources is optimized.
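The difference between the flows of Fig. 6 and Fig. 7 can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration: the names `TableCallback`, `ParseFlowComparison` and `TableNode` do not come from the application, and the whitespace tokenizer only stands in for a real SQL parser. The sketch counts how many intermediate node objects each flow creates for the same statement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback type standing in for an operator processing program.
interface TableCallback {
    void onTable(String tableName);
}

final class ParseFlowComparison {
    // Stand-in for an AST node; the counter records how many are created.
    static final class TableNode {
        static int created = 0;
        final String name;
        TableNode(String name) { this.name = name; created++; }
    }

    // Prior-art flow (Fig. 6), stage 1: materialize a node per table reference.
    static List<TableNode> parseToNodes(String sql) {
        List<TableNode> nodes = new ArrayList<>();
        String[] tokens = sql.trim().split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i++) {
            String keyword = tokens[i].toUpperCase();
            if (keyword.equals("FROM") || keyword.equals("JOIN")) {
                nodes.add(new TableNode(tokens[i + 1]));
            }
        }
        return nodes;
    }

    // Prior-art flow, stage 2: a later module traverses the finished nodes.
    static List<String> waterfall(String sql) {
        List<String> tables = new ArrayList<>();
        for (TableNode node : parseToNodes(sql)) {
            tables.add(node.name);
        }
        return tables;
    }

    // Callback flow (Fig. 7): the handler runs while the parser is scanning,
    // so no TableNode is created in this path.
    static void parseWithCallback(String sql, TableCallback callback) {
        String[] tokens = sql.trim().split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i++) {
            String keyword = tokens[i].toUpperCase();
            if (keyword.equals("FROM") || keyword.equals("JOIN")) {
                callback.onTable(tokens[i + 1]);
            }
        }
    }
}
```

Both paths report the same table names; only the first one allocates intermediate objects, which is the saving the application attributes to the callback approach.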
Step S102: during syntax analysis, call a corresponding operator processing program by setting a callback interface for the generated operator operation.
Step S101 performs syntax analysis on the statement to be executed and thereby prepares for this step, in which the corresponding operator processing program is called during syntax analysis by setting a callback interface for the generated operator operation. In step S102, different callback interfaces may be set for the generated operator operations according to each parsing stage in SQL Parser, so as to call the corresponding operator processing program.
Specifically, in the embodiment of the present invention, at the stage of forming an execution plan in the process of parsing the SQL statement to be executed by SQL Parser in the data storage system, a corresponding operator handler is called in a manner of setting a callback interface for the generated operator operation of the SQL statement, where the execution plan is an AST abstract syntax tree which is generated in the data storage system according to the SQL statement and is composed of a plurality of operator operations according to a set logical order. The operator operation is a basic operation program generated by analyzing the SQL statement in the data storage system. The operator processing program is used for carrying out preprocessing operation on a data structure of operator operation generated in the process of carrying out syntactic analysis on the executed SQL statement in the data storage system. The preprocessing operation is a processing operation required to be executed by a processing module after the execution plan is generated.
It should be noted that, before the step of performing the preprocessing operation on the data structure of the operator operation generated in the process of performing the syntax analysis on the executed SQL statement, the following steps are performed: carrying out legality check on a data structure of operator operation generated in the process of carrying out syntax analysis on the executed SQL statement; and the legality test is to judge whether the data structure of the operator operation of the SQL statement conforms to a preset SQL grammar rule.
The SQL parser generates an AST abstract syntax tree for a corresponding SQL statement, then traverses operator operations stored on each node of the abstract syntax tree, and calls a callback API (callback interface) corresponding to each operator operation when the operator operations are traversed.
Fig. 9 is a schematic diagram of setting a callback interface in an SQL statement-based syntax analysis process according to an embodiment of the present invention.
The operator operations and the callback interfaces in the embodiment of the invention are implemented programmatically. For example, for the operator operations generated while parsing a SELECT statement, corresponding callback APIs are defined, including: a table callback interface for table operator operations, a hint callback interface for hint operator operations, a where callback interface for predicate operator operations, a join callback interface for join operator operations, a group by callback interface for group by operator operations, an order by callback interface for order by operator operations, a having callback interface for having operator operations, a limit callback interface for limit operator operations, and a set op callback interface for set operator operations.
After generating the abstract syntax tree for the SELECT statement, each node on the abstract syntax tree contains various operator operations, the abstract syntax tree is traversed, in the traversing process, when the related operator operations are encountered, a callback API corresponding to the operator operations is called, and an operator processing program which needs to be processed by a subsequent data processing module based on the AST is prepositioned to a stage of generating the operator operations in syntax analysis. The callback interface may specifically include:
interface SelectHandler {
    void processTable(SQLTable table);        // callback interface for table operator operations
    void processHint(SQLHint hint);           // callback interface for hint operator operations
    void processWhere(SQLWhere where);        // callback interface for predicate operator operations
    void processJoin(SQLJoin join);           // callback interface for join operator operations
    void processGroupBy(SQLGroupBy groupBy);  // callback interface for GROUP BY clause operator operations
    void processOrderBy(SQLOrderBy orderBy);  // callback interface for ORDER BY clause operator operations
    void processHaving(SQLHaving having);     // callback interface for HAVING clause operator operations
    void processLimit(SQLLimit limit);        // callback interface for LIMIT clause operator operations
    void processSetOp(SQLSetOp setOp);        // callback interface for set operator operations such as UNION, INTERSECT, MINUS
}
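As an illustration of how such a callback interface might be implemented, the following Java sketch defines a handler that checks table references and tracks the smallest row limit. It is a sketch under stated assumptions: the interface is reduced to two of the callbacks listed above, the classes `SQLTable` and `SQLLimit` are hypothetical stand-ins with invented fields, and `AccessCheckHandler` is not part of the application.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-ins for the operator data structures passed to callbacks.
final class SQLTable {
    final String name;
    SQLTable(String name) { this.name = name; }
}

final class SQLLimit {
    final long rowCount;
    SQLLimit(long rowCount) { this.rowCount = rowCount; }
}

// Reduced form of the callback interface: only two callbacks are kept.
interface SelectHandler {
    void processTable(SQLTable table);
    void processLimit(SQLLimit limit);
}

// Example handler doing work that a later module might otherwise do by
// traversing the finished abstract syntax tree.
final class AccessCheckHandler implements SelectHandler {
    private final Set<String> knownTables;
    final Set<String> referencedTables = new LinkedHashSet<>();
    long effectiveLimit = Long.MAX_VALUE;

    AccessCheckHandler(Set<String> knownTables) {
        this.knownTables = knownTables;
    }

    @Override
    public void processTable(SQLTable table) {
        // Reject the statement as soon as an unknown table is parsed.
        if (!knownTables.contains(table.name)) {
            throw new IllegalArgumentException("unknown table: " + table.name);
        }
        referencedTables.add(table.name);
    }

    @Override
    public void processLimit(SQLLimit limit) {
        // Keep the smallest limit seen so far.
        effectiveLimit = Math.min(effectiveLimit, limit.rowCount);
    }
}
```

A parser would invoke `processTable` and `processLimit` at the moment it recognizes the corresponding clause, so an invalid table reference stops processing before the rest of the statement is parsed.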
Fig. 8 is a flowchart illustrating the complete parsing method in a data processing system according to an embodiment of the present invention. The embodiment of the present invention may further include step S103: determining whether the syntax analysis is finished and whether the called operator processing program has finished executing, and if so, generating a syntax analysis result that carries the processing result of the operator processing program.
In step S102, after the corresponding operator processing program is called in the syntax analysis process by setting the callback interface for the generated operator operation, the syntax analysis result that has brought back the processing result of the operator processing program can be generated by determining whether the syntax analysis process is completed and whether the called operator processing program is executed. The operator processing program is used for carrying out preprocessing operation on a data structure of operator operation generated in the process of carrying out syntactic analysis on SQL sentences to be executed in the data storage system. The preprocessing operation is a processing operation required to be executed by a processing module after the execution plan is generated.
The syntax analysis result comprises an operator operation processing result generated after the execution of the corresponding operator processing program is called through the callback interface, and an operator operation without calling the callback interface. The parsing process may refer to a process of generating an AST abstract syntax tree in the data storage system by parsing the SQL statements to be executed.
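The composition of the syntax analysis result described here can be sketched in Java as follows. The sketch assumes a simplified model in which each operator operation is a name with a raw operand string and each registered callback is a `Function`; the types `ParseResult` and `ResultAssembler` are hypothetical and only illustrate the split between handled and unhandled operator operations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical result type: its shape is an assumption for illustration.
final class ParseResult {
    // Operator name -> output of the operator processing program.
    final Map<String, Object> handled = new LinkedHashMap<>();
    // Operator operations for which no callback interface was called.
    final List<String> unhandled = new ArrayList<>();
}

final class ResultAssembler {
    // operators: operator name -> raw operand text produced during parsing.
    // handlers:  operator name -> callback registered for that operator.
    static ParseResult assemble(Map<String, String> operators,
                                Map<String, Function<String, Object>> handlers) {
        ParseResult result = new ParseResult();
        for (Map.Entry<String, String> op : operators.entrySet()) {
            Function<String, Object> handler = handlers.get(op.getKey());
            if (handler != null) {
                // The handler returns before the result is assembled, so the
                // result always carries a completed processing result.
                result.handled.put(op.getKey(), handler.apply(op.getValue()));
            } else {
                result.unhandled.add(op.getKey());
            }
        }
        return result;
    }
}
```

In this simplified model the check of step S103 is implicit: `assemble` returns only after every operator has been visited and every invoked handler has returned.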
In the syntax analysis process, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation, wherein the method comprises the following steps:
carrying out legality check on a data structure of operator operation generated in the process of carrying out syntax analysis on the executed SQL statement; and the legality test is to judge whether the data structure of the operator operation of the SQL statement conforms to a preset SQL grammar rule. And in the stage of forming an execution plan in the process of carrying out syntactic analysis on the SQL statement to be executed in the data storage system, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation of the SQL statement, wherein the execution plan is an abstract syntax tree which is generated in the data storage system according to the SQL statement and consists of a plurality of operator operations according to a set logic sequence. The operator operation is a basic operation program generated by analyzing the SQL statement in the data storage system.
Fig. 10 and 11 are a flowchart illustrating parsing based on an INSERT statement in a data processing system according to the prior art and a flowchart illustrating parsing based on an INSERT statement in a data processing system according to an embodiment of the present invention, respectively.
The invention provides a specific embodiment, which takes the syntax analysis process of an INSERT statement in an SQL statement as an example for explanation, and the common processing steps comprise the validity verification by combining operator operation, the updating of a corresponding data structure after the verification, and the like. Such as: INSERT intot TVALUES (1, 'aa'), (2, 'bb'), (3, 'cc'); the treatment process is as follows:
initializing a data structure according to an INSERT statement required by a subsequent data processing module, for example:
INSERT column_1_values[…];
column_2_values[…];
……
column_N_values[…];
in this embodiment, the subsequent data processing module performs its processing on data organized in this way, which is why the data structure is defined in the above form;
the SQL parser parses the INSERT statement and, on reaching the target table t, queries the operator operation information:
first, confirming that table t is an existing target table; if table t does not exist, an error is reported and processing ends;
then acquiring the operator operation information of the columns of table t; in this embodiment table t has two columns, COL_1 and COL_2, and the information acquired for each column includes its column name and data type;
the SQL parser continues parsing to the VALUES clause, which is followed by three rows of records, so the data structure becomes:
INSERT column_1_values[?,?,?];
column_2_values[?,?,?];
performing a validity check on the record data in VALUES using the position and data type of each column; for example, COL_1 is an integer type and COL_2 is a varchar type, so 1, 2 and 3 pass the check as integers, and 'aa', 'bb' and 'cc' pass the check as varchar values;
the final processed data structure is as follows:
INSERT column_1_values[1,2,3];
column_2_values['aa','bb','cc'].
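The INSERT walk-through above can be sketched in Java as follows. This is only an illustrative sketch under stated assumptions: the names InsertBuffer, ColumnType, addRow and column are hypothetical and do not come from the original text, and only the two column types used in the example (integer and varchar) are modeled. Each parsed row is checked by column position before its values are appended to the per-column lists.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical column-oriented buffer filled while the VALUES clause is parsed. */
class InsertBuffer {
    /** Column types used in the example: COL_1 is an integer, COL_2 is a varchar. */
    enum ColumnType { INTEGER, VARCHAR }

    private final ColumnType[] types;
    private final List<List<Object>> columns = new ArrayList<>();

    InsertBuffer(ColumnType... types) {
        this.types = types;
        for (int i = 0; i < types.length; i++) {
            columns.add(new ArrayList<>());
        }
    }

    /** Called once per parsed row; validates every value by position before storing any. */
    void addRow(Object... row) {
        if (row.length != types.length) {
            throw new IllegalArgumentException(
                "expected " + types.length + " values, got " + row.length);
        }
        for (int i = 0; i < row.length; i++) {
            boolean ok = (types[i] == ColumnType.INTEGER)
                ? row[i] instanceof Integer
                : row[i] instanceof String;
            if (!ok) {
                throw new IllegalArgumentException(
                    "value " + row[i] + " does not match type " + types[i]);
            }
        }
        // Store the values only after the whole row has passed the check.
        for (int i = 0; i < row.length; i++) {
            columns.get(i).add(row[i]);
        }
    }

    /** Returns the values collected so far for the column at the given position. */
    List<Object> column(int index) {
        return columns.get(index);
    }
}
```

Calling addRow(1, "aa"), addRow(2, "bb") and addRow(3, "cc") on a buffer created with INTEGER and VARCHAR leaves the two column lists in the final form shown in the example, without any per-row tree node being kept.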
With the syntax analysis method in the data processing system described above, the operator processing programs that the data processing system would otherwise run on the finished abstract syntax tree can be moved forward into the abstract syntax tree generation stage. This avoids generating a large number of intermediate abstract syntax tree objects, which improves the syntax analysis efficiency of the data processing system and reduces its use of system resources.
Corresponding to the syntax analysis method in the data processing system, the invention also provides a syntax analysis device in the data processing system. Since the device embodiment is similar to the method embodiment described above, it is described only briefly; for the relevant points, reference may be made to the description of the method embodiment, and the following description of the device embodiment is only illustrative. Please refer to fig. 2, which is a diagram illustrating a syntax parsing apparatus in a data processing system according to an embodiment of the present invention.
The syntax analysis device in the data processing system of the embodiment of the invention comprises the following parts:
a first syntax analysis unit 201, configured to perform syntax analysis on a statement to be executed.
In the embodiment of the present invention, the data processing system may refer to a data storage system. The statement to be executed may refer to an SQL statement used to query and manage stored objects in the data storage system. For example, the SQL statement may specifically include: a SELECT statement (SELECT data operation statement), which retrieves rows and columns of data from a table of the data storage system; an INSERT statement (INSERT data operation statement), which adds a new row of data to a table of the data storage system; a DELETE statement (DELETE data operation statement), which deletes rows of data from a table of the data storage system; an UPDATE statement (UPDATE data operation statement), which updates data in a table of the data storage system; and the like.
In the embodiment of the present invention, performing syntax analysis on the statement to be executed means parsing the SELECT, INSERT, DELETE and UPDATE statements among the SQL statements to generate an AST (Abstract Syntax Tree) containing a large number of operator operations. An operator operation is a basic operation program generated in the data storage system by analyzing the SQL statement.
Fig. 6 is a flow diagram of a prior-art parsing process within a data processing system. In the prior art, the SQL Parser (SQL parsing module) is an indispensable component of a database. At present, the SQL Parser is usually the first processing stage in the overall flow of SQL statement processing. After an SQL statement is input into the data storage system, it is parsed to generate an AST, and the subsequent processing modules in the data storage system then work on that AST. The process generally follows a waterfall model: the SQL Parser first generates the AST, and the subsequent data processing module then traverses the operator operations stored in each AST node. It should be noted that the subsequent data processing module that runs after the SQL Parser has parsed the statement to be executed may refer to a Processing module or an Optimizer Processing module.
The conventional processing approach described above typically generates a large number of AST intermediate objects, resulting in a large amount of data storage system resources being occupied. To this end, the present invention provides a method of parsing in a data processing system.
Fig. 7 is a flowchart illustrating a parsing process in a data processing system according to an embodiment of the present invention. First, the SQL statement to be executed is parsed: through a callback interface, the SQL Parser calls the operator processing operations of the operator processing module from within the parser while it parses the input SQL statement and generates the abstract syntax tree (AST). Because the stage that analyzes and processes the operator operations stored at each AST node is thereby moved forward into syntax tree generation, the flow of parsing and processing an SQL statement becomes more compact, fewer intermediate AST objects are generated, and the use of system resources is reduced.
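The difference between the waterfall flow of fig. 6 and the callback flow of fig. 7 can be illustrated with a deliberately small Java sketch. It is an assumption-laden toy rather than the parser of the embodiment: the class CallbackParser and its methods parse and tables are hypothetical names, and the "grammar" recognizes only the table names between FROM and WHERE. The point it shows is that the handler runs at the moment the parser recognizes a table, so no tree node has to be kept for a later traversal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy parser: reports table names through a callback instead of storing them in tree nodes. */
class CallbackParser {
    /** Scans a very small subset of SELECT and invokes onTable for each table after FROM. */
    static void parse(String sql, Consumer<String> onTable) {
        String[] tokens = sql.trim().split("[\\s,;]+");
        boolean inFrom = false;
        for (String token : tokens) {
            if (token.equalsIgnoreCase("FROM")) {
                inFrom = true;
            } else if (token.equalsIgnoreCase("WHERE")) {
                inFrom = false;
            } else if (inFrom) {
                // The operator handler runs here, during parsing.
                onTable.accept(token);
            }
        }
    }

    /** Convenience wrapper that collects the reported table names into a list. */
    static List<String> tables(String sql) {
        List<String> found = new ArrayList<>();
        parse(sql, found::add);
        return found;
    }
}
```

In a waterfall design the same information would first be stored in AST nodes and read back in a second pass; here the consumer receives it directly.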
A first calling unit 202, configured to call a corresponding operator processing program during the syntax analysis by setting a callback interface for the generated operator operation.
Specifically, in the embodiment of the present invention, at the stage of forming an execution plan while the SQL Parser parses the SQL statement to be executed in the data storage system, a corresponding operator processing program is called by setting a callback interface for the generated operator operation of the SQL statement. The execution plan is an abstract syntax tree (AST) that is generated in the data storage system from the SQL statement and consists of a plurality of operator operations arranged in a set logical order. An operator operation is a basic operation program generated in the data storage system by analyzing the SQL statement. The operator processing program performs a preprocessing operation on the data structure of the operator operation generated while the SQL statement to be executed is parsed in the data storage system. The preprocessing operation is a processing operation that a processing module would otherwise have to execute after the execution plan is generated.
It should be noted that, before the preprocessing operation is performed on the data structure of the operator operation generated while the SQL statement to be executed is parsed, the following step is performed: a validity check is carried out on that data structure, where the validity check determines whether the data structure of the operator operation of the SQL statement conforms to preset SQL grammar rules.
The SQL parser generates an abstract syntax tree (AST) for the corresponding SQL statement, then traverses the operator operations stored on each node of the abstract syntax tree, and calls the callback API (callback interface) corresponding to each operator operation as that operator operation is reached.
Fig. 9 is a schematic diagram of setting a callback interface in an SQL statement-based syntax analysis process according to an embodiment of the present invention.
The operator operations and the callback interfaces in the embodiment of the invention are implemented programmatically. For example, for the operator operations generated while a SELECT statement is parsed, a corresponding callback API is defined, which includes: a table callback interface for the table operator operation, a hint callback interface for the hint operator operation, a where callback interface for the predicate operator operation, a join callback interface for the join operator operation, a group by callback interface for the group by operator operation, an order by callback interface for the order by operator operation, a having callback interface for the having operator operation, a limit callback interface for the limit operator operation, and a set op callback interface for the set operator operation.
After the abstract syntax tree is generated for the SELECT statement, each node of the tree holds an operator operation. The tree is traversed and, whenever a relevant operator operation is encountered, the callback API corresponding to that operator operation is called. In this way, the operator processing programs that the subsequent data processing module would run on the AST are moved forward to the stage of syntax analysis in which the operator operations are generated. The callback interface may specifically include:
interface SelectHandler {
    void processTable(SQLTable table); // callback interface for the table operator operation
    void processHint(SQLHint hint); // callback interface for the hint operator operation
    void processWhere(SQLWhere where); // callback interface for the predicate operator operation
    void processJoin(SQLJoin join); // callback interface for the join operator operation
    void processGroupBy(SQLGroupBy groupBy); // callback interface for the GROUP BY clause operator operation
    void processOrderBy(SQLOrderBy orderBy); // callback interface for the ORDER BY clause operator operation
    void processHaving(SQLHaving having); // callback interface for the HAVING clause operator operation
    void processLimit(SQLLimit limit); // callback interface for the LIMIT clause operator operation
    void processSetOp(SQLSetOp setOp); // callback interface for set operator operations such as UNION, INTERSECT and MINUS
}
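As a hedged illustration of how such a handler interface might be consumed, the following Java sketch defines a reduced handler and one implementation. Everything in it is an assumption made for the example: SimpleSelectHandler and CollectingHandler are hypothetical names, only three of the callbacks are modeled, and the clause payloads are plain strings and numbers because the original text does not define the parameter types of the interface above.

```java
import java.util.ArrayList;
import java.util.List;

/** Reduced, hypothetical handler: clause payloads are plain values in this sketch. */
interface SimpleSelectHandler {
    // Default no-op bodies let an implementation override only the callbacks it needs.
    default void processTable(String table) {}
    default void processWhere(String predicate) {}
    default void processLimit(long limit) {}
}

/** Records what a later processing module would otherwise read back from the AST. */
class CollectingHandler implements SimpleSelectHandler {
    final List<String> tables = new ArrayList<>();
    String predicate;   // stays null if no WHERE clause is reported
    long limit = -1;    // -1 means no LIMIT clause was reported

    @Override
    public void processTable(String table) {
        tables.add(table);
    }

    @Override
    public void processWhere(String predicate) {
        this.predicate = predicate;
    }

    @Override
    public void processLimit(long limit) {
        this.limit = limit;
    }
}
```

A parser that accepts a SimpleSelectHandler would call processTable, processWhere and processLimit as it recognizes each clause; after parsing, the CollectingHandler already holds the preprocessed information.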
The embodiment of the invention may further include a first judging unit 203. The first judging unit 203 is configured to judge whether the syntax analysis is finished and whether execution of the called operator processing program is finished and, if so, to generate a syntax analysis result that carries the processing result of the operator processing program.
The syntax analysis result comprises the operator operation processing results generated by the operator processing programs that were called through the callback interface, together with the operator operations for which no callback interface was called. The parsing process may refer to the process of generating an abstract syntax tree (AST) in the data storage system by parsing the SQL statement to be executed.
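The two parts of the syntax analysis result described above can be sketched in Java as follows. This is a minimal sketch under assumptions that are not in the original text: operator operations are represented as a map from an operator kind to a raw string payload, callbacks are ordinary functions on strings, and the names ParseResult, Dispatcher, register and run are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical result: handled operators carry a processing result, the rest are listed as-is. */
class ParseResult {
    final Map<String, String> processed = new LinkedHashMap<>();
    final List<String> unprocessed = new ArrayList<>();
}

/** Hypothetical dispatcher that calls a callback only for operator kinds that have one. */
class Dispatcher {
    private final Map<String, Function<String, String>> callbacks = new LinkedHashMap<>();

    void register(String operatorKind, Function<String, String> callback) {
        callbacks.put(operatorKind, callback);
    }

    /** operators maps an operator kind to its raw payload as produced by the parser. */
    ParseResult run(Map<String, String> operators) {
        ParseResult result = new ParseResult();
        for (Map.Entry<String, String> op : operators.entrySet()) {
            Function<String, String> callback = callbacks.get(op.getKey());
            if (callback != null) {
                result.processed.put(op.getKey(), callback.apply(op.getValue()));
            } else {
                result.unprocessed.add(op.getKey());
            }
        }
        return result;
    }
}
```

An operator kind with a registered callback ends up in processed with the callback's return value; any other kind is only recorded in unprocessed.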
Calling the corresponding operator processing program during the syntax analysis by setting a callback interface for the generated operator operation comprises the following:
performing a validity check on the data structure of the operator operation generated while the SQL statement to be executed is parsed, where the validity check determines whether that data structure conforms to preset SQL grammar rules; and, at the stage of forming an execution plan while the SQL statement to be executed is parsed in the data storage system, calling the corresponding operator processing program by setting a callback interface for the generated operator operation of the SQL statement. The execution plan is an abstract syntax tree that is generated in the data storage system from the SQL statement and consists of a plurality of operator operations arranged in a set logical order. An operator operation is a basic operation program generated in the data storage system by analyzing the SQL statement.
With the syntax analysis device in the data processing system described above, the operator processing programs that the data processing system would otherwise run on the finished abstract syntax tree can be moved forward into the abstract syntax tree generation stage. This avoids generating a large number of intermediate abstract syntax tree objects, which improves the syntax analysis efficiency of the data processing system and reduces its use of system resources.
Corresponding to the syntax analysis method in the data processing system, the invention also provides an electronic device. Fig. 3 is a schematic view of an electronic device according to an embodiment of the invention.
The electronic device provided by the invention specifically comprises a processor 302 and a memory 301. The memory 301 stores a program of the syntax analysis method in the data processing system. After the device is powered on and the processor 302 runs that program, the following steps are executed. Step one: performing syntax analysis on the statement to be executed. Step two: during the syntax analysis, calling the corresponding operator processing program by setting a callback interface for the generated operator operation.
Corresponding to the syntax analysis method in the data processing system, the present invention further provides a storage device that stores a program of the syntax analysis method in the data processing system. When run by a processor, the program performs the following steps. Step one: performing syntax analysis on the statement to be executed. Step two: during the syntax analysis, calling the corresponding operator processing program by setting a callback interface for the generated operator operation.
Corresponding to the syntax analysis method in the data processing system, the invention also provides a syntax analysis method for SQL statements in the data storage system. Since this method embodiment is similar to the method embodiment described above, it is described only briefly; for the relevant points, reference may be made to the description of the above method embodiment, and the method embodiment described below is only illustrative. Fig. 4 is a flowchart of a syntax parsing method for SQL statements in a data storage system according to an embodiment of the present invention.
Step S401, receiving the input SQL statement.
Step S402, performing syntax analysis on the SQL statement to be executed in the data storage system.
Step S403, in the syntax analysis process, a corresponding operator processing program is called in a manner of setting a callback interface for the generated operator operation.
In the embodiment of the present invention, the calling of the corresponding operator processing program by setting the callback interface for the generated operator operation is specifically implemented by: and in the process of generating an execution plan by syntax analysis, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation, wherein the execution plan is an abstract syntax tree which is generated in a data storage system according to the SQL statement and consists of a plurality of operator operations according to a set logic sequence.
Step S404, judging whether the syntax analysis is finished and whether execution of the called operator processing program is finished and, if so, generating a syntax analysis result that carries the processing result of the operator processing program.
Step S405, processing the content of the output syntax analysis result.
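The completion check of step S404 can be sketched in Java as a small piece of bookkeeping. The sketch is an assumption, not the embodiment's implementation: the class ParseSession and its methods are hypothetical names, the handlers are assumed to report their own start and end, and everything is assumed to run on a single thread.

```java
/** Hypothetical bookkeeping for step S404: the result is ready only when both conditions hold. */
class ParseSession {
    private boolean parsingDone;
    private int pendingHandlers;

    /** Called when the parser invokes an operator processing program through a callback. */
    void handlerStarted() {
        pendingHandlers++;
    }

    /** Called when that operator processing program has finished executing. */
    void handlerFinished() {
        if (pendingHandlers == 0) {
            throw new IllegalStateException("no handler is running");
        }
        pendingHandlers--;
    }

    /** Called when the parser has consumed the whole statement. */
    void markParsingDone() {
        parsingDone = true;
    }

    /** True only when syntax analysis is finished and every called handler has finished. */
    boolean resultReady() {
        return parsingDone && pendingHandlers == 0;
    }
}
```

Only when resultReady() returns true would the syntax analysis result carrying the handler results be generated and passed on to step S405.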
With the syntax analysis method for SQL statements in the data storage system described above, the operator processing programs that the data storage system would otherwise run on the finished abstract syntax tree can be moved forward into the abstract syntax tree generation stage. This avoids generating a large number of intermediate abstract syntax tree objects, which improves the syntax analysis efficiency of the data storage system and reduces its use of system resources.
Corresponding to the syntax analysis method for SQL statements in the data storage system, the invention also provides a syntax analysis device for SQL statements in the data storage system. Since the device embodiment is similar to the above method embodiment, it is described only briefly; for the relevant points, reference may be made to the description of the above method embodiment, and the following description of the device embodiment is only illustrative. Fig. 5 is a schematic diagram of a syntax parsing apparatus for SQL statements in a data storage system according to an embodiment of the present invention.
The syntax analysis device for the SQL statement in the data storage system comprises the following parts:
the receiving unit 501 is configured to receive an input SQL statement.
The second syntax analysis unit 502 is configured to perform syntax analysis on an SQL statement that needs to be executed in the data storage system.
The second calling unit 503 is configured to, in the syntax analysis process, call a corresponding operator processing program in a manner of setting a callback interface for the generated operator operation.
In the embodiment of the present invention, the calling of the corresponding operator processing program by setting the callback interface for the generated operator operation is specifically implemented by: and in the process of generating an execution plan by syntax analysis, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation, wherein the execution plan is an abstract syntax tree which is generated in a data storage system according to the SQL statement and consists of a plurality of operator operations according to a set logic sequence.
The second judging unit 504 is configured to judge whether the syntax analysis is finished and whether execution of the called operator processing program is finished and, if so, to generate a syntax analysis result that carries the processing result of the operator processing program.
The processing unit 505 is configured to process the content of the output parsing result.
With the syntax analysis device for SQL statements in the data storage system described above, the operator processing programs that the data storage system would otherwise run on the finished abstract syntax tree can be moved forward into the abstract syntax tree generation stage. This avoids generating a large number of intermediate abstract syntax tree objects, which improves the syntax analysis efficiency of the data storage system and reduces its use of system resources.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to limit the present invention, and those skilled in the art can make variations and modifications without departing from the spirit and scope of the present invention.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (15)

1. A method of parsing in a data processing system, comprising:
performing syntax analysis on a statement to be executed; and
during the syntax analysis, calling a corresponding operator processing program by setting a callback interface for the generated operator operation.
2. The method of parsing in a data processing system of claim 1, further comprising:
judging whether the syntax analysis is finished and whether execution of the called operator processing program is finished and, if so, generating a syntax analysis result that carries the processing result of the operator processing program.
3. The method of parsing in a data processing system according to claim 1 wherein said data processing system comprises a data storage system and wherein said statements to be executed comprise SQL statements, accordingly.
4. A method for parsing in a data processing system according to claim 3, wherein said invoking the corresponding operator handler by setting a callback interface for the generated operator operation during parsing comprises:
in the process of performing syntax analysis on the SQL statement to be executed in the data storage system to form an execution plan, calling the corresponding operator processing program by setting a callback interface for the generated operator operation of the SQL statement;
the execution plan is an abstract syntax tree which is generated in the data storage system according to the SQL statement and is composed of a plurality of operator operations according to a set logic sequence.
5. The method of parsing in a data processing system of claim 1 wherein said operator operation is a basic operation procedure in a data storage system generated by analyzing said SQL statement.
6. The method according to claim 1, wherein the operator handler is configured to perform a preprocessing operation on a data structure of an operator operation generated during the parsing of the SQL statement to be executed; wherein the preprocessing operation is a processing operation that needs to be executed by a processing module after the execution plan is generated.
7. The method of claim 6, wherein before the step of performing the preprocessing operation on the data structure of the operator operation generated during the parsing process of the SQL statement to be executed, the following steps are performed:
performing a validity check on the data structure of the operator operation generated during the syntax analysis of the SQL statement to be executed, wherein the validity check determines whether the data structure of the operator operation of the SQL statement conforms to a preset SQL grammar rule.
8. The method of claim 3, wherein the syntax analysis result includes an operator operation processing result generated after the execution of the corresponding operator processing program has been called through the callback interface, and an operator operation not calling the callback interface.
9. An apparatus for parsing in a data processing system, comprising: the first syntax analysis unit and the first calling unit;
the first syntax analysis unit is used for performing syntax analysis on the statement to be executed;
and the first calling unit is used for calling the corresponding operator processing program in a mode of setting a callback interface for the generated operator operation in the syntax analysis process.
10. A syntax analysis method for SQL statements in a data storage system is characterized by comprising the following steps:
receiving an input SQL statement;
performing syntax analysis on the SQL statement to be executed in the data storage system;
during the syntax analysis, calling a corresponding operator processing program by setting a callback interface for the generated operator operation;
judging whether the syntax analysis is finished and whether execution of the called operator processing program is finished and, if so, generating a syntax analysis result that carries the processing result of the operator processing program; and
processing the content of the output syntax analysis result.
11. The method for parsing an SQL statement in a data storage system according to claim 10, wherein in the parsing process, the corresponding operator handler is called by setting a callback interface for the generated operator operation, and the method includes:
in the process of generating an execution plan by syntactic analysis, calling a corresponding operator processing program in a mode of setting a callback interface for the generated operator operation; the execution plan is an abstract syntax tree which is generated in the data storage system according to the SQL statement and is composed of a plurality of operator operations according to a set logic sequence.
12. A parsing apparatus for SQL statements in a data storage system, comprising: the receiving unit, the second syntax analysis unit, the second calling unit, the second judging unit and the processing unit;
the receiving unit is used for receiving the input SQL statement;
the second syntax analysis unit is used for performing syntax analysis on the SQL statement to be executed in the data storage system;
the second calling unit is used for calling a corresponding operator processing program during the syntax analysis by setting a callback interface for the generated operator operation;
the second judging unit is used for judging whether the syntax analysis is finished and whether execution of the called operator processing program is finished and, if so, generating a syntax analysis result that carries the processing result of the operator processing program;
and the processing unit is used for processing the content of the output syntax analysis result.
13. A system for parsing in a data processing system, comprising: the apparatus for parsing in a data processing system as claimed in claim 9, and the apparatus for parsing SQL statements in a data storage system as claimed in claim 12.
14. An electronic device, comprising:
a processor; and
a memory for storing a program of a parsing method in a data processing system, wherein, after the electronic device is powered on and the processor runs the program of the parsing method in the data processing system, the following steps are performed:
performing syntax analysis on a statement to be executed; and
during the syntax analysis, calling a corresponding operator processing program by setting a callback interface for the generated operator operation.
15. A storage device storing a program for a parsing method in a data processing system, the program being executed by a processor and performing the steps of:
performing syntax analysis on a statement to be executed; and
during the syntax analysis, calling a corresponding operator processing program by setting a callback interface for the generated operator operation.
CN201910222559.1A 2019-03-22 2019-03-22 Method, device and system for syntax analysis in data processing system Pending CN111723104A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910222559.1A CN111723104A (en) 2019-03-22 2019-03-22 Method, device and system for syntax analysis in data processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910222559.1A CN111723104A (en) 2019-03-22 2019-03-22 Method, device and system for syntax analysis in data processing system

Publications (1)

Publication Number Publication Date
CN111723104A true CN111723104A (en) 2020-09-29

Family

ID=72562763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910222559.1A Pending CN111723104A (en) 2019-03-22 2019-03-22 Method, device and system for syntax analysis in data processing system

Country Status (1)

Country Link
CN (1) CN111723104A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115421701A (en) * 2022-11-07 2022-12-02 北京滴普科技有限公司 Method and system for generating cypher statement based on model

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115421701A (en) * 2022-11-07 2022-12-02 北京滴普科技有限公司 Method and system for generating cypher statement based on model
CN115421701B (en) * 2022-11-07 2023-01-10 北京滴普科技有限公司 Method and system for generating cypher statement based on model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination