CN116226169A - SQL sentence processing method, device and equipment - Google Patents

SQL sentence processing method, device and equipment

Info

Publication number
CN116226169A
CN116226169A CN202310201045.4A CN202310201045A CN116226169A CN 116226169 A CN116226169 A CN 116226169A CN 202310201045 A CN202310201045 A CN 202310201045A CN 116226169 A CN116226169 A CN 116226169A
Authority
CN
China
Prior art keywords
node
target node
nodes
expression
ast
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310201045.4A
Other languages
Chinese (zh)
Inventor
苏宁宁
王赛
王帅
吴烨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202310201045.4A priority Critical patent/CN116226169A/en
Publication of CN116226169A publication Critical patent/CN116226169A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a method, apparatus, and device for processing SQL statements, in which an SQL statement is converted into an AST structure and a differential privacy algorithm is then executed by using that structure. The method comprises: acquiring an SQL statement; converting the SQL statement into an AST according to the AST conversion rule corresponding to the type of the SQL statement, where the node types of the AST comprise statement nodes, block nodes and expression nodes, a statement node comprises at least one block node, and a block node comprises at least one expression node; searching the expression nodes of the AST for a first target node belonging to an aggregate expression; and calling a sensitivity calculation function of an internal node of the first target node to obtain the sensitivity of that internal node, where the sensitivity is used for executing a differential privacy algorithm for the first target node.

Description

SQL sentence processing method, device and equipment
Technical Field
The present invention relates to the field of database technologies, and in particular, to a method, an apparatus, and a device for processing an SQL (Structured Query Language) statement.
Background
Differential privacy (DP for short) is a privacy protection technique mainly used for protecting aggregated statistical data; it protects the sensitive information of individuals while keeping the overall statistical characteristics of the data stable. Its main approach is to add an appropriate amount of noise to the statistical result, so that modifying any single record in the data cannot significantly influence that result. It aims to solve the problem of user privacy disclosure when statistics about the data are published.
At present, a data table in a database can be queried with SQL statements, but a differential privacy algorithm cannot be executed directly on an SQL statement, which is only a character string.
Disclosure of Invention
In view of this, the embodiments of the present application provide a method, apparatus, and device for processing an SQL statement, which can convert the SQL statement into an intermediate representation structure, and then implement execution of a differential privacy algorithm by using the structure.
In order to solve the above problems, the technical solution provided in the embodiments of the present application is as follows:
in a first aspect, an embodiment of the present application provides a method for processing a structured query language SQL statement, where the method includes:
acquiring SQL sentences;
converting the SQL sentence into an AST according to an abstract syntax tree AST conversion rule corresponding to the type of the SQL sentence; the node types of the AST comprise statement nodes, block nodes and expression nodes, wherein the statement nodes comprise at least one block node, and the block nodes comprise at least one expression node;
searching a first target node belonging to an aggregation expression in the expression nodes of the AST;
and calling a sensitivity calculation function of the internal node of the first target node to obtain the sensitivity of the internal node of the first target node, wherein the sensitivity of the internal node of the first target node is used for executing a differential privacy algorithm aiming at the first target node.
In a second aspect, an embodiment of the present application provides a processing apparatus for a structured query language SQL statement, where the apparatus includes:
the first acquisition unit is used for acquiring SQL sentences;
the conversion unit is used for converting the SQL sentence into an AST according to an abstract syntax tree AST conversion rule corresponding to the type of the SQL sentence; the node types of the AST comprise statement nodes, block nodes and expression nodes, wherein the statement nodes comprise at least one block node, and the block nodes comprise at least one expression node;
a first searching unit, configured to search for a first target node belonging to an aggregation expression in the expression nodes of the AST;
the calling unit is used for calling a sensitivity calculation function of the internal node of the first target node to obtain the sensitivity of the internal node of the first target node, and the sensitivity of the internal node of the first target node is used for executing a differential privacy algorithm on the first target node.
In a third aspect, an embodiment of the present application provides a processing device for a structured query language SQL statement, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method for processing an SQL statement described above when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing instructions which, when executed on a terminal device, cause the terminal device to execute the method for processing an SQL statement described above.
From this, the embodiment of the application has the following beneficial effects:
According to the embodiment of the application, AST (Abstract Syntax Tree) conversion rules corresponding to different SQL statement types are established in advance, so that SQL statements of different types are supported. An SQL statement is converted into an AST, and the AST is represented by a three-level node structure of statement nodes, block nodes and expression nodes. A first target node that belongs to an aggregate expression, and on which the differential privacy algorithm needs to be executed, is then determined from the expression nodes in the AST, and a sensitivity calculation function of an internal node of the first target node is called to calculate the sensitivity of that internal node. The differential privacy algorithm can then be executed on the first target node by using this sensitivity. In this way, a differential privacy algorithm can be executed once an SQL statement has been input.
Drawings
FIG. 1 is a schematic diagram of an exemplary application scenario provided in an embodiment of the present application;
FIG. 2 is a flowchart of a method for processing SQL statements according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of AST generation provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a processing device for SQL statements according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the above objects, features and advantages of the present application more comprehensible, embodiments accompanied with figures and detailed description are described in further detail below.
In order to facilitate understanding and explanation of the technical solutions provided by the embodiments of the present application, the background art of the present application will be described first.
Differential privacy (Differential Privacy, DP for short) is a privacy protection means, which is mainly used for protecting aggregated statistical data, and can protect sensitive information of individuals under the condition of keeping the overall statistical characteristics of the data stable.
For example, assume that there is a user information record database in which a Boolean value records whether each person has a certain user characteristic: the value is 1 for a person who has the characteristic and 0 for a person who does not. Assume that a malicious user (usually called an attacker) wants to know whether a particular person has the characteristic, and that the attacker knows which row of the database holds that person, for example row 10. The attacker queries the sum of the Boolean values of the first 10 rows and the sum of the first 9 rows, and then takes the difference between the two results, which reveals whether that person has the characteristic. This is a differential (differencing) attack. If the result of querying 9 persons and the result of querying 10 persons are indistinguishable, the attacker has no way to determine the information of the 10th person; this is differential privacy protection.
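The attack described above can be reproduced in a few lines. The following is a minimal sketch, assuming that 1 marks presence of the characteristic and 0 its absence; the table contents and the function names `sum_first_rows` and `differencing_attack` are hypothetical and serve only as illustration.

```python
# Hypothetical table: entry i-1 holds 1 if the person in row i has the
# characteristic, else 0.
has_feature = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1]


def sum_first_rows(rows, n):
    """Aggregate query: sum of the Boolean values in the first n rows."""
    return sum(rows[:n])


def differencing_attack(rows, target_row):
    """Recover the value in 1-based target_row from two aggregate queries."""
    return sum_first_rows(rows, target_row) - sum_first_rows(rows, target_row - 1)


# The attacker never reads row 10 directly, yet learns its value.
leaked = differencing_attack(has_feature, 10)
print(leaked)
```

Neither query touches an individual record on its own, which is why protecting only row-level access does not stop this attack.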
The main method of differential privacy protection is to add a proper amount of noise into the aggregate statistics result so as to ensure that one individual record in the modified data cannot cause significant influence on the statistics result, and aim to solve the problem of user privacy disclosure in the statistics publishing process of the data.
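As an illustration of adding noise to an aggregate result, the sketch below uses the Laplace mechanism. This is an assumption: the text only says that an appropriate amount of noise is added and does not name a mechanism. The function names and the `epsilon` parameter (the privacy budget) are likewise hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_sum(values, sensitivity, epsilon, rng):
    """Return sum(values) plus Laplace noise with scale sensitivity / epsilon."""
    return sum(values) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
samples = [noisy_sum([1, 0, 1], sensitivity=1.0, epsilon=1.0, rng=rng)
           for _ in range(20000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))   # close to the true sum on average
```

A single answer is perturbed, but the average over many answers stays near the true sum, which is the sense in which the overall statistics remain stable. The noise scale grows with the sensitivity, which is why the method described below needs a sensitivity value for each aggregate expression.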
At present, an SQL statement can be used to query a data table in a database to obtain an aggregated query result. However, an SQL statement is a character string; a complex SQL statement cannot be presented in a structured manner, so it is not possible to find out which queries in it require a differential privacy algorithm. Therefore, a differential privacy algorithm cannot currently be executed directly on an SQL statement in string form.
Based on this, the embodiment of the application provides a method, a device and equipment for processing an SQL sentence, which convert the SQL sentence into an intermediate representation structure AST for representation, and then implement the execution of a differential privacy algorithm by using the structure.
In order to facilitate understanding of the method for processing the SQL statement provided in the embodiment of the present application, the following description is made with reference to the scenario example shown in fig. 1. Referring to fig. 1, the diagram is a schematic diagram of an exemplary application scenario provided in an embodiment of the present application.
The differential privacy protection system of the embodiment of the application takes an SQL statement as input and converts it into an AST that comprises three levels of nodes: statement (Statement) nodes, block (Block) nodes, and expression (Expr) nodes. When the differential privacy algorithm is executed, a first target node that belongs to an aggregate expression and requires the differential privacy algorithm, such as the shaded expression node in FIG. 1, is found among the expression nodes of the AST. The sensitivity calculation function of the internal node of the first target node is then called to calculate the sensitivity of that internal node, so that a differential privacy algorithm such as noise addition can be executed for the first target node according to this sensitivity.
Those skilled in the art will appreciate that the frame diagram shown in fig. 1 is but one example in which embodiments of the present application may be implemented. The scope of applicability of the embodiments of the application is not limited in any way by the framework.
In order to facilitate understanding of the present application, a method for processing an SQL statement provided in an embodiment of the present application is described below with reference to the accompanying drawings.
Referring to FIG. 2, which is a flowchart of a method for processing an SQL statement according to an embodiment of the present application, the method may include S201 to S204:
S201: An SQL statement is obtained.
The data table in the database can be queried through SQL statements. In practical applications there are different types of SQL statements, that is, various SQL dialects, such as Mysql-type SQL statements, Hive-type SQL statements, and so on. Each type of SQL statement has its own grammar structure, and SQL statements can also be written quite freely, which makes parsing them challenging.
S202: the SQL statement is converted into AST according to an abstract syntax tree AST conversion rule corresponding to the type of the SQL statement. The node types of the AST comprise statement nodes, block nodes and expression nodes, wherein the statement nodes comprise at least one block node, and the block nodes comprise at least one expression node.
To facilitate the subsequent understanding, the structure of an AST provided in the embodiments of the present application will be first described. AST includes three levels of nodes: statement (Statement) nodes, block (Block) nodes, and expression (Expr) nodes, the Statement nodes including at least one Block node, the Block node including at least one expression node.
A statement node represents one statement. Because different types of SQL statements differ in grammar, the content a statement may contain also differs, so a different statement node is constructed for each type of SQL statement. A statement node is composed of block nodes, and different statement nodes may contain different block nodes. For example, the statement nodes include ClickHouseStatement (the statement node corresponding to ClickHouse-type SQL statements), HiveStatement (the statement node corresponding to Hive-type SQL statements), MysqlStatement (the statement node corresponding to Mysql-type SQL statements), SparkStatement (the statement node corresponding to Spark-type SQL statements), OtherStatement (the statement node corresponding to other types of SQL statements), and the like. A ClickHouse-type SQL statement may contain a Prewhere syntax block, whereas the grammar of Hive-type SQL statements does not support it; a Prewhere block node can therefore be filled in when a ClickHouseStatement is created but not when a HiveStatement is created. That is, ClickHouseStatement may include a Prewhere block node and HiveStatement does not. A statement node can contain at most one block node of each kind.
A block node represents a relatively independent block structure in an SQL statement, similar to a clause (one of the parts of a sentence separated by commas). In the current AST representation system, in order to support different types of SQL statements, the syntax portions with distinct characteristics in each type of SQL statement are encapsulated into independent block nodes. For example, block nodes include SelectBlock (Select block node), FromBlock (From block node), WhereBlock (Where block node), GroupByBlock (GroupBy block node), HavingBlock (Having block node), OrderByBlock (OrderBy block node), LimitBlock (Limit block node, the Limit syntax block), TeaLimitBlock (TeaLimit block node), WithBlock (With block node), PrewhereBlock (Prewhere block node), SettingBlock (Setting block node), SampleBlock (Sample block node), and the like.
One block node may also include other block nodes; for example, WithBlock includes WithFromExpressionBlock (WithFromExpression block node) and withasequeryblock (withasequery block node). Each block node is composed internally of expression nodes. The expression nodes inside a block node can be combined freely, and one block node may contain several expression nodes of the same kind.
An expression node represents a single field or a compound field expression, and is the most varied kind of node in the current AST representation system. Expression nodes fall into several broad expression types, each of which is further divided into lower-level expression nodes. For example, the expression nodes include BaseExpr (Base-type expression node), AggFunctionExpr (AggFunction-type expression node), ArithmeticExpr (Arithmetic-type expression node), CondExpr (Cond-type expression node), FunctionExpr (Function-type expression node), LogicalExpr (Logical-type expression node), BoolExpr (Bool-type expression node), and the like. Taking BaseExpr as an example, it can be further divided into different expression nodes, such as LiteralExpr (constant expression node), IdentifierExpr (identifier expression node), NumberExpr (numerical expression node), and the like.
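The three-level structure described above can be modelled with a few small classes. The following is only a sketch under assumptions: the class names `Expr`, `Block` and `Statement`, their fields, and the `add_block` method are hypothetical and are not taken from the application. The dictionary keyed by block kind reflects the rule that a statement node holds at most one block node of each kind, while the list of expressions allows repeats.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Expr:
    """Expression node: a single field or a compound expression."""
    kind: str                                   # e.g. "IdentifierExpr"
    name: str
    children: List["Expr"] = field(default_factory=list)


@dataclass
class Block:
    """Block node: holds expression nodes; the same kind may repeat."""
    kind: str                                   # e.g. "SelectBlock"
    exprs: List[Expr] = field(default_factory=list)


@dataclass
class Statement:
    """Statement node: holds at most one block node of each kind."""
    kind: str                                   # e.g. "MysqlStatement"
    blocks: Dict[str, Block] = field(default_factory=dict)

    def add_block(self, block: Block) -> None:
        if block.kind in self.blocks:
            raise ValueError(f"{self.kind} already has a {block.kind}")
        self.blocks[block.kind] = block


stmt = Statement("MysqlStatement")
stmt.add_block(Block("SelectBlock", [Expr("IdentifierExpr", "age")]))
stmt.add_block(Block("FromBlock", [Expr("IdentifierExpr", "table")]))
```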
In the embodiment of the application, in order to parse SQL statements correctly, AST conversion rules corresponding to the different types of SQL statements are established in advance so that an SQL statement can be converted into an AST. The AST conversion rule of each type of SQL statement can be understood as the set of nodes that an AST of that type of SQL statement may include, together with the hierarchical relationships between those nodes. For example, the AST conversion rule for ClickHouse-type SQL statements indicates that such a statement may include a ClickHouseStatement, that the ClickHouseStatement may include SelectBlock, FromBlock, PrewhereBlock, and so on, and which expression nodes each block node may in turn include. By matching the SQL statement against the corresponding AST conversion rule, it can be identified which nodes the SQL statement includes and how those nodes are related hierarchically, thereby establishing the AST corresponding to the SQL statement.
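One simple way to picture such a conversion rule is a table listing which block nodes each statement node may contain. The sketch below is an assumption made for illustration: the `ALLOWED_BLOCKS` table, its exact contents, and the function `unsupported_blocks` are hypothetical. Only the fact that ClickHouseStatement may contain a Prewhere block while HiveStatement may not comes from the text.

```python
# Hypothetical rule table: statement node kind -> block node kinds it may hold.
ALLOWED_BLOCKS = {
    "ClickHouseStatement": {"SelectBlock", "FromBlock", "PrewhereBlock", "WhereBlock"},
    "HiveStatement": {"SelectBlock", "FromBlock", "WhereBlock"},
}


def unsupported_blocks(statement_kind, block_kinds):
    """Return the block kinds that the given statement kind does not allow."""
    allowed = ALLOWED_BLOCKS[statement_kind]
    return [kind for kind in block_kinds if kind not in allowed]


query_blocks = ["SelectBlock", "FromBlock", "PrewhereBlock"]
print(unsupported_blocks("ClickHouseStatement", query_blocks))
print(unsupported_blocks("HiveStatement", query_blocks))
```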
In one possible implementation, S202 (converting the SQL statement into an AST according to the abstract syntax tree AST conversion rule corresponding to the type of the SQL statement) may include:
A1: The SQL statement is parsed into a stream of SQL statement words.
The SQL sentence is a string of character strings, and the grammar rule parser corresponding to the type of the SQL sentence can identify an SQL sentence word stream corresponding to the SQL sentence, wherein the SQL sentence word stream comprises a plurality of words. For example, the SQL statement "select age from table", can be parsed to obtain a word stream comprising 4 words "select, age, from, table". The grammar rule parser may be an SQL statement parsing tool in actual application.
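Step A1 can be sketched with a small regular-expression tokenizer. This stands in for the grammar rule parser only as an illustration; the pattern and the function name `to_word_stream` are hypothetical, and a real dialect-specific parser would also handle string literals, quoted identifiers and comments.

```python
import re

# Identifiers/keywords, integer literals, or any single punctuation character.
WORD_PATTERN = re.compile(r"[A-Za-z_][A-Za-z_0-9]*|\d+|[^\sA-Za-z_0-9]")


def to_word_stream(sql):
    """Split an SQL string into a flat list of words."""
    return WORD_PATTERN.findall(sql)


words = to_word_stream("select age from table")
print(words)   # ['select', 'age', 'from', 'table']
```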
A2: matching the words in the SQL sentence word stream with abstract syntax tree AST conversion rules corresponding to the types of the SQL sentences, and determining the node types of the words and node hierarchical relations among the words.
A3: the SQL statement is converted into AST according to the node type of the word and the node hierarchical relation among the words.
By matching the words in the SQL statement word stream against the corresponding AST conversion rule, it can be determined which type of AST node each word belongs to and how the nodes are related hierarchically. The words are then assembled according to their node types and hierarchical relationships to obtain the AST corresponding to the SQL statement. For example, in the Mysql-type SQL statement "Select age From table", "age" is an expression node, "table" is an expression node, "Select" is a Select block node, and "From" is a From block node. The age expression node belongs to the Select block node, the table expression node belongs to the From block node, and the Select block node and the From block node both belong to the MysqlStatement node, which yields the AST of the SQL statement.
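Steps A2 and A3 for this example can be sketched as follows. This is a deliberately minimal illustration, not the builder of the application: the `BLOCK_KEYWORDS` table, the function `build_ast`, and the dictionary layout of the result are hypothetical, and only the Select and From keywords are handled.

```python
# Hypothetical rule: keyword word -> block node kind it opens.
BLOCK_KEYWORDS = {"select": "SelectBlock", "from": "FromBlock"}


def build_ast(words, statement_kind="MysqlStatement"):
    """Assemble a word stream into statement -> block -> expression nodes."""
    ast = {"type": statement_kind, "blocks": {}}
    current = None                      # expression list of the open block
    for word in words:
        block_kind = BLOCK_KEYWORDS.get(word.lower())
        if block_kind is not None:
            current = ast["blocks"].setdefault(block_kind, [])
        elif current is None:
            raise ValueError(f"expression {word!r} appears before any block")
        else:
            current.append({"type": "IdentifierExpr", "name": word})
    return ast


ast = build_ast(["Select", "age", "From", "table"])
print(ast)
```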
Referring to fig. 3, a schematic diagram of the process of converting an SQL statement into an AST in practical application is shown.
For each input type of SQL statement, the grammar rule parser corresponding to that type identifies each word in the SQL statement to form an SQL statement word stream, and the AST builder corresponding to that type then assembles the words into an AST of the current design. The corresponding AST conversion rules are stored in the AST builder. When a new type of SQL statement needs to be supported, only a corresponding grammar rule parser and AST builder need to be implemented, so that SQL statements of different types are all converted into a unified AST by their respective AST builders. For example, when a Mysql-type SQL statement is obtained, the Mysql grammar rule parser parses it into an SQL statement word stream, the word stream is input into the Mysql AST builder, and the SQL statement is converted into an AST. The same procedure applies to other types of SQL statements and is not described again here.
In this way, in the embodiment of the application, a different statement node is constructed in the AST for each type of SQL statement, and each type of SQL statement is parsed directly into its corresponding statement node. Relatively independent clauses and grammar features are encapsulated into block nodes, and different combinations of block nodes are used to fill different statement nodes. Finally, the various basic expressions are encapsulated into expression nodes, and combinations of expression nodes are used to fill other expression nodes and block nodes. The upper-level AST nodes thus handle the grammar differences, while the lower-level AST nodes serve as general-purpose building blocks, which solves the problem of representing different types of SQL statements in one AST.
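The pairing of one parser and one builder per SQL type can be pictured as a small registry. The sketch below is an assumption made for illustration: the names `REGISTRY`, `register` and `to_ast` are hypothetical, and the parser and builder registered here are trivial stand-ins rather than real dialect implementations.

```python
from typing import Callable, Dict, List, Tuple

Parser = Callable[[str], List[str]]      # SQL string -> word stream
Builder = Callable[[List[str]], dict]    # word stream -> AST

REGISTRY: Dict[str, Tuple[Parser, Builder]] = {}


def register(sql_type: str, parser: Parser, builder: Builder) -> None:
    """Adding a new SQL type only requires registering its parser and builder."""
    REGISTRY[sql_type] = (parser, builder)


def to_ast(sql_type: str, sql: str) -> dict:
    """Run the parser and then the builder registered for this SQL type."""
    if sql_type not in REGISTRY:
        raise KeyError(f"no parser/builder registered for {sql_type!r}")
    parser, builder = REGISTRY[sql_type]
    return builder(parser(sql))


# Stand-in implementations, for illustration only.
register("mysql", str.split, lambda words: {"type": "MysqlStatement", "words": words})

result = to_ast("mysql", "select age from table")
print(result["type"])
```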
S203: searching a first target node belonging to the aggregation expression in the expression nodes of the AST.
In its implementation, the differential privacy algorithm requires a sensitivity computation for the internal nodes of aggregate expressions. To support this, a list of aggregate expressions may be preset, and the expression nodes of the AST are then searched for nodes belonging to an aggregate expression, which serve as first target nodes. For example, if it is specified in advance that a differential privacy algorithm must be executed for sum expressions, then the sum expression is an aggregate expression, and each sum expression node found among the expression nodes of the AST is taken as a first target node.
Since the first target node belongs to an aggregate expression, it also contains internal nodes, which may form one level or several levels. For example, the sum(age) part of an SQL statement is converted into a sum expression node and an age expression node in the AST, where the sum expression node is the first target node and its internal node is the age expression node. As another example, the sum(id+age) part of an SQL statement is converted into a sum expression node, an addition expression node, an id expression node, and an age expression node in the AST, where the sum expression node is the first target node, its internal node is the addition expression node, and the internal nodes of the addition expression node are the id expression node and the age expression node.
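The search in S203 amounts to a tree walk that collects aggregate expression nodes. The following sketch assumes a dictionary-based node layout with `type`, `name` and `children` keys; this layout, the preset list `AGGREGATE_NAMES`, and the function name are hypothetical.

```python
AGGREGATE_NAMES = {"sum"}   # preset list of aggregate expressions (assumed)


def find_first_target_nodes(node):
    """Collect every aggregate expression node in the subtree rooted at node."""
    found = []
    if node["type"] == "AggFunctionExpr" and node["name"] in AGGREGATE_NAMES:
        found.append(node)
    for child in node.get("children", []):
        found.extend(find_first_target_nodes(child))
    return found


# AST fragment for "select sum(id+age), age".
id_node = {"type": "IdentifierExpr", "name": "id"}
age_node = {"type": "IdentifierExpr", "name": "age"}
add_node = {"type": "ArithmeticExpr", "name": "add", "children": [id_node, age_node]}
sum_node = {"type": "AggFunctionExpr", "name": "sum", "children": [add_node]}
select_block = {"type": "SelectBlock", "name": "select",
                "children": [sum_node, age_node]}

targets = find_first_target_nodes(select_block)
print([t["name"] for t in targets])   # ['sum']
```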
S204: and calling a sensitivity calculation function of the internal node of the first target node to obtain the sensitivity of the internal node of the first target node, wherein the sensitivity of the internal node of the first target node is used for executing a differential privacy algorithm on the first target node.
After the first target node is determined, its internal node can be determined from the AST. In the embodiment of the application, a corresponding sensitivity calculation function is preset for the internal node of the first target node, and the sensitivity of that internal node is obtained by executing the logic of the function, so that the differential privacy algorithm can be executed on the first target node by using the sensitivity of the internal node of the first target node.
In one possible implementation manner, S204 calls a sensitivity calculation function of an internal node of the first target node, and the specific implementation of obtaining the sensitivity of the internal node of the first target node may include:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the metadata information is obtained by querying a data table through SQL statements.
When calculating the sensitivity of the internal node of the first target node, metadata information of that internal node is needed; it can be obtained by querying the data table corresponding to the internal node. Specifically, the metadata information may include the data type, maximum value, minimum value, frequency of occurrence of data values, and the like in the data table corresponding to the internal node. For example, continuing the above example, from sum(age) it is known that the sum expression node is the first target node and the corresponding internal node is the age expression node. The age expression node queries the age column in the data table, and the metadata information of the age expression node can be obtained from the data table, including the data type, maximum value, minimum value, and the like of the age column. When the sensitivity calculation function of the age expression node is called, the sensitivity of the age expression node can be calculated from this metadata information, so that a differential privacy algorithm is executed on the query result corresponding to the sum expression node according to the sensitivity of the age expression node.
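The text does not give the formula that turns metadata into a sensitivity. As a hedged illustration only, the sketch below assumes the common choice for a sum query, namely the largest absolute value a single row can contribute. The class `ColumnMetadata`, the function `sum_sensitivity`, and the column bounds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ColumnMetadata:
    """Metadata for one column, as obtained by querying the data table."""
    data_type: str
    minimum: float
    maximum: float


def sum_sensitivity(meta: ColumnMetadata) -> float:
    """Assumed rule: one row changes a sum by at most max(|min|, |max|)."""
    return max(abs(meta.minimum), abs(meta.maximum))


# Hypothetical bounds for the age column.
age_meta = ColumnMetadata(data_type="int", minimum=0, maximum=120)
print(sum_sensitivity(age_meta))   # 120
```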
In some possible implementations, since the first target node may include multiple levels of internal nodes, invoking the sensitivity calculation function of the internal node of the first target node to obtain metadata information of the internal node of the first target node, and calculating the sensitivity of the internal node of the first target node using the metadata information may include:
calling a sensitivity calculation function of the first-level internal node of the first target node, acquiring metadata information of the second-level internal node of the first target node, and calculating the sensitivity of the first-level internal node by using the metadata information of the second-level internal node.
That is, when the first target node includes several levels of internal nodes, a second-level internal node is a node one level below a first-level internal node. Calling the sensitivity calculation function of the first-level internal node acquires the metadata information of the internal nodes one level below it (the second-level internal nodes), from which the sensitivity of the first-level internal node can be calculated.
When the first target node includes three or more levels of internal nodes, the computation can start from the last level: the sensitivity calculation function of the second-to-last-level internal node is called, the metadata information of the last-level internal node is obtained, and the sensitivity of the second-to-last-level internal node is calculated from it. The sensitivity of the second-to-last-level internal node can then serve as the metadata information for the third-to-last-level internal node. The sensitivity calculation functions of the third-to-last-level internal node and of each level above it are called in turn, each using the metadata information of the level below. Finally, the sensitivity of the first-level internal node (the node directly below the first target node) is obtained, so that a differential privacy algorithm can be executed on the query result of the first target node.
For example, for sum(id+age) in the above example, the sum expression node is the first target node, the addition expression node is the first-level internal node, and the id expression node and the age expression node are second-level internal nodes. The sensitivity calculation function of the addition expression node is called to acquire metadata information of the id expression node and the age expression node. According to the preset sensitivity calculation function of the addition expression node, the minimum value corresponding to the id expression node and the minimum value corresponding to the age expression node are added to give the minimum value of the addition expression node, the maximum value corresponding to the id expression node and the maximum value corresponding to the age expression node are added to give the maximum value of the addition expression node, and the sensitivity of the addition expression node is thereby calculated.
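The addition rule described above amounts to adding the two ranges endpoint by endpoint. The worked values below (id between 1 and 100, age between 0 and 120) are made up for illustration and do not come from the application.

```latex
\[
[\min\nolimits_{\mathrm{add}},\ \max\nolimits_{\mathrm{add}}]
  = [\min\nolimits_{\mathrm{id}} + \min\nolimits_{\mathrm{age}},\
     \max\nolimits_{\mathrm{id}} + \max\nolimits_{\mathrm{age}}]
\]
% Illustrative bounds only: id in [1, 100], age in [0, 120]
\[
[1 + 0,\ 100 + 120] = [1, 220]
\]
```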
For another example, for the sum(power(salary, 2)) part of an SQL statement, the sum expression node is the first target node, the first-level internal node is the power (exponentiation) expression node, and the second-level internal node is the salary expression node. The sensitivity calculation function of the power expression node is called to acquire metadata information of the salary expression node. According to the preset sensitivity calculation function of the power expression node: when the minimum value corresponding to the salary expression node is non-negative, the square of that minimum value is taken as the minimum value of the power expression node, and the square of the maximum value corresponding to the salary expression node is taken as the maximum value of the power expression node; when the maximum value corresponding to the salary expression node is non-positive, the square of that maximum value is taken as the minimum value of the power expression node, and the square of the minimum value corresponding to the salary expression node is taken as the maximum value of the power expression node; when the minimum value corresponding to the salary expression node is negative and the maximum value is positive, the minimum value of the power expression node is 0, and the larger of the square of the minimum value and the square of the maximum value corresponding to the salary expression node is taken as the maximum value of the power expression node. The sensitivity calculation of the power expression node is thereby completed.
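The three cases of the squaring rule above can be written as a small function. This is a minimal sketch, assuming the exponent is fixed at 2 as in the example; the function name `square_range` and the sample bounds are hypothetical.

```python
from typing import Tuple

Range = Tuple[float, float]  # (minimum, maximum)


def square_range(lo: float, hi: float) -> Range:
    """Range of x**2 when x lies in [lo, hi], following the three cases in the text."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    if lo >= 0:
        # Non-negative minimum: squaring keeps the order of the bounds.
        return (lo * lo, hi * hi)
    if hi <= 0:
        # Non-positive maximum: squaring reverses the order of the bounds.
        return (hi * hi, lo * lo)
    # Range straddles zero: the minimum is 0, the maximum is the larger square.
    return (0, max(lo * lo, hi * hi))


# Made-up bounds, one per case.
print(square_range(2, 5))
print(square_range(-5, -2))
print(square_range(-3, 2))
```

A range whose bounds have mixed signs is the case that is easiest to get wrong: simply squaring both endpoints of [-3, 2] would give a minimum of 4, although the value 0 is reachable.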
In the embodiments of the present application, sensitivity calculation logic is encapsulated in advance for each expression node that requires sensitivity calculation. The specific logic differs from one kind of expression node to another, and the differentially private result of the first target node is output on the basis of that logic.
Based on the description of S201-S204, AST conversion rules corresponding to different SQL statement types are established in advance, so that SQL statements of different types are supported. An SQL statement is converted into an AST that is represented by a three-level node structure of statement nodes, block nodes and expression nodes. A first target node that belongs to an aggregation expression and on which the differential privacy algorithm needs to be executed is then determined among the expression nodes of the AST, and the sensitivity calculation function of the internal node of the first target node is called to calculate the sensitivity of that internal node. With this sensitivity, the differential privacy algorithm can be executed on the first target node. In this way, the differential privacy algorithm is executed once an SQL statement has been input.
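The three-level node structure and the search for first target nodes can be pictured with the following sketch. The class names, the field names and the set of aggregate function names in `AGGREGATES` are assumptions for illustration; the application does not list which functions count as aggregation expressions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

# Assumed set of aggregate function names (not taken from the application).
AGGREGATES = {"sum", "count", "avg", "min", "max"}


@dataclass
class ExpressionNode:
    name: str
    children: List["ExpressionNode"] = field(default_factory=list)


@dataclass
class BlockNode:
    kind: str  # e.g. "Select" or "From"
    expressions: List[ExpressionNode] = field(default_factory=list)


@dataclass
class StatementNode:
    kind: str  # e.g. "Query"
    blocks: List[BlockNode] = field(default_factory=list)


def walk(expr: ExpressionNode) -> Iterator[ExpressionNode]:
    """Yield an expression node and all of its descendants, parents first."""
    yield expr
    for child in expr.children:
        yield from walk(child)


def find_first_target_nodes(stmt: StatementNode) -> List[ExpressionNode]:
    """Collect every expression node whose name is an aggregate function."""
    return [
        node
        for block in stmt.blocks
        for top in block.expressions
        for node in walk(top)
        if node.name.lower() in AGGREGATES
    ]


# AST for: select name, sum(id + age) from table1
stmt = StatementNode("Query", [
    BlockNode("Select", [
        ExpressionNode("name"),
        ExpressionNode("sum", [
            ExpressionNode("+", [ExpressionNode("id"), ExpressionNode("age")]),
        ]),
    ]),
    BlockNode("From", [ExpressionNode("table1")]),
])

targets = find_first_target_nodes(stmt)
print([t.name for t in targets])
```

The children of each node returned by `find_first_target_nodes` are the internal nodes whose sensitivity calculation functions would be called next.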
Based on the above embodiment, in order to calculate the sensitivity of the internal node of the first target node, metadata information of a corresponding database, data table, data column, etc. needs to be queried from the data table corresponding to the internal node of the first target node. In order to improve the efficiency of acquiring metadata information and reduce the number of times of querying a database, the embodiment of the application can bind the metadata information with expression nodes in an AST in advance.
In one possible implementation manner, based on the foregoing embodiment, the embodiment of the present application may further include:
B1: searching for a second target node of a preset type among the expression nodes of the AST.
The second target node can be understood as an expression node that may need to acquire metadata information. There are three types of second target node in the AST: the table type, the column type and the Map expression type. A second target node of the table type indicates that the expression node needs to query a data table, a second target node of the column type indicates that the expression node needs to query a data column, and a second target node of the Map expression type indicates that the expression node needs to query a Map expression. Second target nodes of the preset types, namely the table type, the column type and the Map expression type, are searched for among the expression nodes of the AST.
B2: acquiring metadata information in the data table corresponding to the second target node according to the execution order of the SQL statement.
The data table, data column or corresponding node of each second target node is queried according to the execution order of the SQL statement, thereby obtaining the metadata information of that second target node. For example, for the SQL statement select name, sum(age) as sa, flow_params{'_slot_param_1'} as fp1 from table1, table1 is parsed into a second target node of the table type, name and age are parsed into second target nodes of the column type, and flow_params{'_slot_param_1'} is parsed into a second target node of the Map expression type.
According to the execution order of the SQL statement, in the above example the table1 expression node in the From block node acquires its metadata information first, and the three expression nodes in the Select block node then acquire theirs.
In one possible implementation manner, when the second target node is of the table type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding table;
when the second target node is of the column type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding column in that data table;
when the second target node is of the Map expression type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding node in that data table.
In this embodiment of the present application, when the second target node is of a table type, metadata information (including metadata information of all columns of the data table) of a data table corresponding to the second target node is queried. And when the second target node is of a column type, inquiring metadata information of a data column corresponding to the second target node. And when the second target node is of the Map expression type, inquiring metadata information of the corresponding node in the data table corresponding to the second target node.
B3: binding the metadata information corresponding to the second target node to the second target node in the AST.
Finally, the metadata information of each second target node is bound to that second target node, so that when the metadata information of a second target node is needed, it can be acquired directly from the AST without querying the database again.
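Steps B1 to B3 can be sketched as a single pass over the AST in execution order. Everything below is an assumption made for illustration: the `Expr` and `Catalog` classes, the layout of the metadata dictionaries, the fixed order in `BLOCK_ORDER`, and the choice to resolve column and Map expression nodes from the table metadata fetched for the From block rather than from separate lookups.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional

PRESET_TYPES = {"table", "column", "map"}
BLOCK_ORDER = ["From", "Select"]  # assumed execution order: From before Select


@dataclass
class Expr:
    node_type: str             # "table", "column", "map", or e.g. "sum"
    text: str                  # table name, column name, or function name
    key: Optional[str] = None  # map key, only used by "map" nodes
    children: List["Expr"] = field(default_factory=list)
    metadata: Optional[Dict[str, Any]] = None


class Catalog:
    """Stand-in for the metadata store; counts how often it is queried."""

    def __init__(self, tables: Dict[str, Dict[str, Any]]) -> None:
        self.tables = tables
        self.lookups = 0

    def table(self, name: str) -> Dict[str, Any]:
        self.lookups += 1
        return self.tables[name]


def walk(node: Expr) -> Iterator[Expr]:
    yield node
    for child in node.children:
        yield from walk(child)


def bind_metadata(blocks: Dict[str, List[Expr]], catalog: Catalog) -> None:
    columns: Dict[str, Dict[str, Any]] = {}  # filled while the From block is processed
    for block_name in BLOCK_ORDER:
        for top in blocks.get(block_name, []):
            for node in walk(top):
                if node.node_type not in PRESET_TYPES:
                    continue  # B1: only nodes of a preset type are second target nodes
                if node.node_type == "table":
                    node.metadata = catalog.table(node.text)        # B2 + B3
                    columns.update(node.metadata["columns"])
                elif node.node_type == "column":
                    node.metadata = columns[node.text]              # B2 + B3
                else:
                    node.metadata = columns[node.text]["keys"][node.key]


# select name, sum(age) as sa, flow_params{'_slot_param_1'} as fp1 from table1
catalog = Catalog({"table1": {"columns": {
    "name": {"type": "string"},
    "age": {"type": "int", "min": 0, "max": 120},  # made-up bounds
    "flow_params": {"type": "map", "keys": {"_slot_param_1": {"type": "string"}}},
}}})
age = Expr("column", "age")
blocks = {
    "Select": [
        Expr("column", "name"),
        Expr("sum", "sum", children=[age]),
        Expr("map", "flow_params", key="_slot_param_1"),
    ],
    "From": [Expr("table", "table1")],
}
bind_metadata(blocks, catalog)
print(age.metadata, catalog.lookups)
```

In this sketch the metadata store is queried once, for table1, and every later request for the bounds of age reads the dictionary already attached to the node.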
In one possible implementation manner, the specific implementation of S204, namely calling a sensitivity calculation function of an internal node of the first target node to obtain the sensitivity of the internal node of the first target node, may include:
invoking a sensitivity calculation function of the internal node of the first target node, acquiring metadata information of the internal node of the first target node from AST, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the internal node of the first target node is matched with the second target node.
After metadata information is bound in the AST, when the metadata information of the internal node of the first target node needs to be acquired, if the internal node belongs to the second target node, namely the metadata information is bound, the metadata information of the internal node can be directly acquired from the AST without querying a database, and the implementation efficiency is improved.
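The lookup described above can be sketched as follows: metadata bound to the node is used directly, and the database is queried only when nothing is bound. The names `InternalNode`, `node_bounds` and `query_database` are hypothetical, and storing the result of a fallback query on the node is an extra assumption, not something the application states.

```python
from typing import Callable, Dict, List, Optional, Tuple

Range = Tuple[float, float]


class InternalNode:
    def __init__(self, name: str, metadata: Optional[Dict[str, float]] = None) -> None:
        self.name = name
        self.metadata = metadata  # set when metadata was bound to the AST beforehand


def node_bounds(node: InternalNode,
                query_database: Callable[[str], Dict[str, float]]) -> Range:
    """Return (min, max) for a node, querying the database only if nothing is bound."""
    if node.metadata is None:
        # Fallback path: no metadata bound, so query once and keep the result.
        node.metadata = query_database(node.name)
    return (node.metadata["min"], node.metadata["max"])


calls: List[str] = []


def query_database(column: str) -> Dict[str, float]:
    # Stand-in for a real metadata query; records which columns were requested.
    calls.append(column)
    return {"min": 0, "max": 10}  # made-up bounds


bound = InternalNode("age", {"min": 0, "max": 120})  # metadata already bound
unbound = InternalNode("id")                         # nothing bound

print(node_bounds(bound, query_database), calls)
print(node_bounds(unbound, query_database), calls)
```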
Based on the method for processing the SQL statement provided in the foregoing method embodiment, the embodiment of the present application further provides a device for processing the SQL statement, which will be described with reference to the accompanying drawings.
Fig. 4 is a schematic structural diagram of a device for processing an SQL statement provided in an embodiment of the present application. As shown in fig. 4, the device for processing an SQL statement includes:
a first obtaining unit 401, configured to obtain an SQL statement;
a conversion unit 402, configured to convert the SQL statement into an AST according to an abstract syntax tree AST conversion rule corresponding to a type of the SQL statement; the node types of the AST comprise statement nodes, block nodes and expression nodes, wherein the statement nodes comprise at least one block node, and the block nodes comprise at least one expression node;
a first searching unit 403, configured to search for a first target node belonging to an aggregation expression in the expression nodes of the AST;
a calling unit 404, configured to call a sensitivity calculation function of an internal node of the first target node, to obtain a sensitivity of the internal node of the first target node, where the sensitivity of the internal node of the first target node is used to execute a differential privacy algorithm for the first target node.
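How a sensitivity feeds a differential privacy algorithm can be illustrated with the Laplace mechanism, which is one common choice. The application does not say which mechanism is used, so the mechanism itself, the rule `max(|min|, |max|)` for the sensitivity of a sum, and the names `sum_sensitivity`, `laplace_noise` and `private_sum` are all assumptions of this sketch.

```python
import random


def sum_sensitivity(lo: float, hi: float) -> float:
    # Assumed rule: one row can change a sum by at most the largest absolute value.
    return max(abs(lo), abs(hi))


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_sum(true_sum: float, lo: float, hi: float,
                epsilon: float, rng: random.Random) -> float:
    """Add noise with scale sensitivity / epsilon to the true aggregate."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sum_sensitivity(lo, hi) / epsilon
    return true_sum + laplace_noise(scale, rng)


rng = random.Random(42)  # fixed seed only to make the example repeatable
# Made-up values: the internal node id + age ranges over [1, 220].
print(private_sum(true_sum=15000.0, lo=1, hi=220, epsilon=1.0, rng=rng))
```

A wider range for the internal node gives a larger sensitivity and therefore more noise, which is why the range propagation through the internal nodes matters for the accuracy of the result.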
In one possible implementation, the conversion unit includes:
a parsing subunit, configured to parse the SQL statement into an SQL statement word stream;
a matching subunit, configured to match the words in the SQL statement word stream against the abstract syntax tree (AST) conversion rule corresponding to the type of the SQL statement, and determine the node types of the words and the node hierarchical relations among the words;
and a conversion subunit, configured to convert the SQL statement into the AST according to the node types of the words and the node hierarchical relations among the words.
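The three subunits above correspond to splitting the statement into words, matching the words against rules chosen by statement type, and assembling the nodes. The toy sketch below handles only a flat select statement; the `RULES` table, the regular expression and the dictionary-based AST layout are assumptions for illustration and are far simpler than real conversion rules.

```python
import re
from typing import Any, Dict, List

# Hypothetical conversion rules: statement type -> keywords that open a block node.
RULES: Dict[str, List[str]] = {"select": ["select", "from", "where"]}


def tokenize(sql: str) -> List[str]:
    """Split a statement into identifiers, numbers and single punctuation marks."""
    return re.findall(r"[A-Za-z_][A-Za-z_0-9]*|\d+|[^\sA-Za-z_0-9]", sql)


def to_ast(sql: str) -> Dict[str, Any]:
    words = tokenize(sql)
    if not words or words[0].lower() not in RULES:
        raise ValueError("unsupported statement type")
    stmt_type = words[0].lower()
    keywords = RULES[stmt_type]
    blocks: List[Dict[str, Any]] = []
    current: List[str] = []  # words of the expression being collected
    depth = 0                # parenthesis nesting depth

    def flush() -> None:
        # Close the expression collected so far and attach it to the open block.
        if current:
            blocks[-1]["expressions"].append(list(current))
            current.clear()

    for word in words:
        if depth == 0 and word.lower() in keywords:
            flush()
            blocks.append({"block": word.lower(), "expressions": []})
        elif depth == 0 and word == ",":
            flush()
        else:
            if word == "(":
                depth += 1
            elif word == ")":
                depth -= 1
            current.append(word)
    flush()
    return {"statement": stmt_type, "blocks": blocks}


ast = to_ast("select name, sum(age) from table1")
print(ast)
```

The result has one statement node, one block node per keyword and one expression (kept here as a list of words) per comma-separated item, mirroring the statement, block and expression levels.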
In a possible implementation manner, the calling unit is specifically configured to:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the metadata information is obtained by querying a data table through the SQL statement.
In a possible implementation manner, when the first target node includes multiple levels of internal nodes, the calling unit is specifically configured to:
call the sensitivity calculation function of the first-level internal node of the first target node, acquire metadata information of the second-level internal node of the first target node, and calculate the sensitivity of the first-level internal node by using the metadata information of the second-level internal node.
In one possible implementation, the apparatus further includes:
a second searching unit, configured to search for a second target node of a preset type among the expression nodes of the AST;
a second acquisition unit, configured to acquire metadata information in the data table corresponding to the second target node according to the execution order of the SQL statement;
and a binding unit, configured to bind the metadata information corresponding to the second target node to the second target node in the AST.
In one possible implementation manner, the preset type is a table type, a column type and a Map expression type;
when the second target node is of a table type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding table in the data table corresponding to the second target node;
when the second target node is of a column type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding column in the data table corresponding to the second target node;
and when the second target node is of the Map expression type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding node in the data table corresponding to the second target node.
In a possible implementation manner, the calling unit is specifically configured to:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node from the AST, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the internal node of the first target node is matched with the second target node.
Based on the method for processing an SQL statement provided by the foregoing method embodiment, the present application further provides an electronic device, comprising: one or more processors; and a storage device configured to store one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method for processing an SQL statement according to any of the foregoing embodiments.
Referring now to fig. 5, a schematic structural diagram of an electronic device 1300 suitable for implementing embodiments of the present application is shown. The terminal devices in the embodiments of the present application may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (e.g., car navigation terminals), and fixed terminals such as digital TVs and desktop computers. The electronic device shown in fig. 5 is only an example and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.
As shown in fig. 5, the electronic device 1300 may include a processing means (e.g., a central processor, a graphics processor, etc.) 1301, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1302 or a program loaded from a storage means 1306 into a Random Access Memory (RAM) 1303. In the RAM 1303, various programs and data necessary for the operation of the electronic device 1300 are also stored. The processing means 1301, the ROM 1302, and the RAM 1303 are connected to each other through a bus 1304. An input/output (I/O) interface 1305 is also connected to the bus 1304.
In general, the following devices may be connected to the I/O interface 1305: input devices 1306 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 1307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 1306 including, for example, magnetic tape, hard disk, etc.; and communication means 1309. The communication means 1309 may allow the electronic device 1300 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 shows an electronic device 1300 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present application, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 1309, or installed from the storage device 1306, or installed from the ROM 1302. When the computer program is executed by the processing device 1301, the functions defined above in the methods of the embodiments of the present application are performed.
The electronic device provided in the embodiment of the present application and the method for processing an SQL statement provided in the foregoing embodiment belong to the same inventive concept, and technical details that are not described in detail in the present embodiment may be referred to the foregoing embodiment, and the present embodiment has the same beneficial effects as the foregoing embodiment.
Based on the method for processing the SQL statement provided in the foregoing method embodiment, the embodiment of the present application provides a computer readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method for processing the SQL statement described in any one of the foregoing embodiments.
It should be noted that the computer readable medium described in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, a computer-readable signal medium may include a data signal that propagates in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer-readable medium carries one or more programs that, when executed by the electronic device, cause the electronic device to perform the method of processing the SQL statement.
Computer program code for carrying out operations of embodiments of the present application may be written in one or more programming languages, including, but not limited to, object oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software or by hardware. In some cases, the name of a unit/module does not constitute a limitation on the unit itself; for example, the first acquisition unit may also be described as a "first acquisition module".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of embodiments of the present application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present application, [Example 1] provides a method for processing an SQL statement, the method comprising:
acquiring an SQL statement;
converting the SQL statement into an AST according to the abstract syntax tree (AST) conversion rule corresponding to the type of the SQL statement; the node types of the AST comprise statement nodes, block nodes and expression nodes, wherein the statement nodes comprise at least one block node, and the block nodes comprise at least one expression node;
searching for a first target node belonging to an aggregation expression among the expression nodes of the AST;
and calling a sensitivity calculation function of the internal node of the first target node to obtain the sensitivity of the internal node of the first target node, wherein the sensitivity of the internal node of the first target node is used for executing a differential privacy algorithm on the first target node.
According to one or more embodiments of the present application, [Example 2] provides a method for processing an SQL statement, wherein converting the SQL statement into an AST according to the abstract syntax tree (AST) conversion rule corresponding to the type of the SQL statement includes:
parsing the SQL statement into an SQL statement word stream;
matching the words in the SQL statement word stream against the AST conversion rule corresponding to the type of the SQL statement, and determining the node types of the words and the node hierarchical relations among the words;
and converting the SQL statement into the AST according to the node types of the words and the node hierarchical relations among the words.
According to one or more embodiments of the present application, [Example 3] provides a method for processing an SQL statement, wherein calling a sensitivity calculation function of an internal node of the first target node to obtain the sensitivity of the internal node of the first target node includes:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the metadata information is obtained by querying a data table through the SQL statement.
According to one or more embodiments of the present application, [Example 4] provides a method for processing an SQL statement, wherein, when the first target node includes multiple levels of internal nodes, calling the sensitivity calculation function of the internal node of the first target node, acquiring metadata information of the internal node of the first target node, and calculating the sensitivity of the internal node of the first target node by using the metadata information includes:
calling the sensitivity calculation function of the first-level internal node of the first target node, acquiring metadata information of the second-level internal node of the first target node, and calculating the sensitivity of the first-level internal node by using the metadata information of the second-level internal node.
According to one or more embodiments of the present application, [Example 5] provides a method for processing an SQL statement, the method further comprising:
searching for a second target node of a preset type among the expression nodes of the AST;
acquiring metadata information in a data table corresponding to the second target node according to the execution sequence of the SQL statement;
and binding the metadata information corresponding to the second target node with the second target node in the AST.
According to one or more embodiments of the present application, [Example 6] provides a method for processing an SQL statement, wherein the preset types are a table type, a column type and a Map expression type;
when the second target node is of a table type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding table in the data table corresponding to the second target node;
when the second target node is of a column type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding column in the data table corresponding to the second target node;
and when the second target node is of the Map expression type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding node in the data table corresponding to the second target node.
According to one or more embodiments of the present application, [Example 7] provides a method for processing an SQL statement, wherein calling a sensitivity calculation function of an internal node of the first target node to obtain the sensitivity of the internal node of the first target node includes:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node from the AST, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the internal node of the first target node is matched with the second target node.
According to one or more embodiments of the present application, [Example 8] provides an apparatus for processing an SQL statement, the apparatus comprising:
a first acquisition unit, configured to acquire an SQL statement;
a conversion unit, configured to convert the SQL statement into an AST according to the abstract syntax tree (AST) conversion rule corresponding to the type of the SQL statement; the node types of the AST comprise statement nodes, block nodes and expression nodes, wherein the statement nodes comprise at least one block node, and the block nodes comprise at least one expression node;
a first searching unit, configured to search for a first target node belonging to an aggregation expression among the expression nodes of the AST;
and a calling unit, configured to call a sensitivity calculation function of the internal node of the first target node to obtain the sensitivity of the internal node of the first target node, wherein the sensitivity of the internal node of the first target node is used for executing a differential privacy algorithm on the first target node.
According to one or more embodiments of the present application, [example nine] provides an apparatus for processing an SQL statement, the conversion unit including:
a parsing subunit, configured to parse the SQL statement into a word stream of the SQL statement;
a matching subunit, configured to match the words in the word stream against the AST conversion rule corresponding to the type of the SQL statement, and to determine the node types of the words and the node hierarchy among the words;
a conversion subunit, configured to convert the SQL statement into the AST according to the node types of the words and the node hierarchy among the words.
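The three subunits above (parse into a word stream, match words against conversion rules, build the tree) can be sketched as follows. This is a deliberately tiny illustration, not the conversion rules of the application: the regular expression, the `BLOCK_KEYWORDS` and `AGGREGATES` rule tables, and the dictionary-based node layout are all assumptions, and only single-argument aggregates in simple SELECT statements are handled.

```python
import re

# Words are identifiers, integers, or a few punctuation marks (assumed grammar).
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z_0-9]*|\d+|[(),*=<>])")

# Hypothetical conversion rules: keyword -> block node type.
BLOCK_KEYWORDS = {"SELECT": "select_block", "FROM": "from_block", "WHERE": "where_block"}
AGGREGATES = {"SUM", "COUNT", "AVG", "MIN", "MAX"}


def tokenize(sql):
    """Parse the SQL text into a flat word stream."""
    return TOKEN_RE.findall(sql)


def to_ast(tokens):
    """Match words against the rules and build statement -> block -> expression nodes."""
    ast = {"type": "statement", "blocks": []}
    block = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        upper = tok.upper()
        if upper in BLOCK_KEYWORDS:
            # A clause keyword opens a new block node under the statement node.
            block = {"type": BLOCK_KEYWORDS[upper], "exprs": []}
            ast["blocks"].append(block)
            i += 1
        elif block is None:
            raise ValueError(f"unexpected word before any clause keyword: {tok!r}")
        elif (upper in AGGREGATES and i + 3 < len(tokens)
              and tokens[i + 1] == "(" and tokens[i + 3] == ")"):
            # AGG ( arg ) becomes an aggregate expression with one internal node.
            block["exprs"].append({
                "type": "aggregate",
                "name": upper,
                "children": [{"type": "column", "name": tokens[i + 2]}],
            })
            i += 4
        elif tok == ",":
            i += 1
        else:
            block["exprs"].append({"type": "identifier", "name": tok})
            i += 1
    return ast


ast = to_ast(tokenize("SELECT SUM(salary) FROM emp"))
```

Here the node type of each word comes from the rule tables, and the node hierarchy comes from which block is open when the word is consumed.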
According to one or more embodiments of the present application, there is provided a processing apparatus of an SQL statement [ example ten ], the calling unit being specifically configured to:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the metadata information is obtained by querying a data table through the SQL statement.
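As a rough illustration of obtaining metadata by querying the data table, the sketch below uses Python's standard `sqlite3` module. The text does not say what the metadata information contains; treating it as the minimum and maximum of a column, and deriving a SUM sensitivity as the largest absolute bound, are assumptions. The function names are hypothetical.

```python
import sqlite3


def fetch_column_bounds(conn, table, column):
    """Query the data table for a column's value bounds (assumed metadata)."""
    # Identifiers cannot be bound as SQL parameters; this sketch assumes
    # `table` and `column` come from a trusted AST, not from user input.
    query = f'SELECT MIN("{column}"), MAX("{column}") FROM "{table}"'
    lo, hi = conn.execute(query).fetchone()
    return {"min": lo, "max": hi}


def sum_sensitivity(meta):
    """Assumed rule: one row changes a SUM by at most the largest absolute bound."""
    return max(abs(meta["min"]), abs(meta["max"]))


# Small in-memory table for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (salary REAL)")
conn.executemany("INSERT INTO emp VALUES (?)", [(10.0,), (-30.0,), (20.0,)])

bounds = fetch_column_bounds(conn, "emp", "salary")
sensitivity = sum_sensitivity(bounds)
```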
According to one or more embodiments of the present application, [example eleven] provides an apparatus for processing an SQL statement, wherein, when the first target node includes multiple levels of internal nodes, the calling unit is specifically configured to:
call a sensitivity calculation function of a first-level internal node of the first target node, acquire metadata information of a second-level internal node of the first target node, and calculate the sensitivity of the first-level internal node of the first target node by using the metadata information of the second-level internal node of the first target node.
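The multi-level case can be illustrated with an aggregate such as SUM(a + b): the addition is the first-level internal node and the columns a and b are second-level internal nodes. The sketch below is an assumption-laden example, not the method of the application: the `Node` class, the `add_sensitivity` function, and the interval arithmetic used to combine bounds are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    kind: str
    name: str = ""
    children: List["Node"] = field(default_factory=list)
    metadata: Dict[str, float] = field(default_factory=dict)


def add_sensitivity(node: Node) -> float:
    """Sensitivity of a first-level `a + b` node from its children's metadata.

    Assumption: each second-level node carries "min"/"max" bounds, and the
    bounds of a sum are the sums of the bounds.
    """
    lo = sum(child.metadata["min"] for child in node.children)
    hi = sum(child.metadata["max"] for child in node.children)
    return max(abs(lo), abs(hi))


# SUM(a + b): `plus` is the first-level internal node of the aggregate,
# `a` and `b` are its second-level internal nodes.
a = Node("column", "a", metadata={"min": 0.0, "max": 10.0})
b = Node("column", "b", metadata={"min": -5.0, "max": 5.0})
plus = Node("add", "+", children=[a, b])
agg = Node("aggregate", "SUM", children=[plus])

first_level_sensitivity = add_sensitivity(agg.children[0])
```

With these bounds, a + b lies in [-5, 15], so the first-level node's sensitivity under the assumed rule is 15.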
According to one or more embodiments of the present application, [example twelve] provides an apparatus for processing an SQL statement, the apparatus further comprising:
a second searching unit, configured to search the expression nodes of the AST for a second target node of a preset type;
a second acquisition unit, configured to acquire metadata information in a data table corresponding to the second target node according to the execution order of the SQL statement;
a binding unit, configured to bind the metadata information corresponding to the second target node to the second target node in the AST.
According to one or more embodiments of the present application, [example thirteen] provides an apparatus for processing an SQL statement, wherein the preset types are a table type, a column type, and a Map expression type;
when the second target node is of the table type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding table in the data table corresponding to the second target node;
when the second target node is of the column type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding column in the data table corresponding to the second target node;
and when the second target node is of the Map expression type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding node in the data table corresponding to the second target node.
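Binding metadata to table, column and Map expression nodes in execution order can be sketched as below. The `CATALOG` dictionary, its key format, the `EXECUTION_ORDER` list, and the dictionary-based node layout are hypothetical stand-ins; in particular, what "metadata of the corresponding node" means for a Map expression is not spelled out in the text, so a per-key entry is assumed.

```python
# Hypothetical metadata store keyed by table, table.column, or table.column[key].
CATALOG = {
    "emp": {"row_count": 3},
    "emp.salary": {"min": 0, "max": 100},
    "emp.props[age]": {"min": 18, "max": 65},
}

# Assumed logical execution order: FROM is resolved before WHERE and SELECT,
# so the table is known by the time its columns are visited.
EXECUTION_ORDER = ["from", "where", "select"]


def bind_metadata(blocks, catalog):
    """Attach catalog metadata to table, column and map nodes in place."""
    current_table = None
    for kind in EXECUTION_ORDER:
        for node in blocks.get(kind, []):
            if node["type"] == "table":
                current_table = node["name"]
                node["metadata"] = catalog[node["name"]]
            elif node["type"] == "column":
                node["metadata"] = catalog[f"{current_table}.{node['name']}"]
            elif node["type"] == "map":
                key = f"{current_table}.{node['name']}[{node['key']}]"
                node["metadata"] = catalog[key]
    return blocks


# Roughly: SELECT salary, props['age'] FROM emp (blocks listed in textual order).
blocks = {
    "select": [
        {"type": "column", "name": "salary"},
        {"type": "map", "name": "props", "key": "age"},
    ],
    "from": [{"type": "table", "name": "emp"}],
}
bind_metadata(blocks, CATALOG)
```

Because the metadata ends up stored on the nodes themselves, a later sensitivity calculation can read it from the AST without another lookup.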
According to one or more embodiments of the present application, [example fourteen] provides an apparatus for processing an SQL statement, wherein the calling unit is specifically configured to:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node from the AST, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the internal node of the first target node is matched with the second target node.
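A minimal sketch of this last step, assuming metadata has already been bound to the AST nodes: the sensitivity function reads the bounds from the internal node, and the result scales the noise of a differential privacy mechanism. The text does not name a mechanism; the Laplace mechanism used here is a common choice and is an assumption, as are the function names and the dictionary node layout.

```python
import random


def sum_sensitivity_from_ast(agg_node):
    """Read bounds bound on the aggregate's internal node and derive a sensitivity."""
    meta = agg_node["children"][0]["metadata"]
    # Assumption: one row changes a SUM by at most the largest absolute bound.
    return max(abs(meta["min"]), abs(meta["max"]))


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_sum(true_sum, agg_node, epsilon, rng):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    sensitivity = sum_sensitivity_from_ast(agg_node)
    return true_sum + laplace_noise(sensitivity / epsilon, rng)


# Aggregate node whose internal column node already carries bound metadata.
agg = {
    "type": "aggregate",
    "name": "SUM",
    "children": [
        {"type": "column", "name": "salary", "metadata": {"min": 0.0, "max": 100.0}},
    ],
}

noisy = private_sum(250.0, agg, epsilon=1.0, rng=random.Random(0))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy.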
According to one or more embodiments of the present application, [example fifteen] provides a device for processing an SQL statement, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for processing an SQL statement according to any one of [example one] to [example seven].
According to one or more embodiments of the present application, [example sixteen] provides a computer-readable storage medium, wherein the computer-readable storage medium has instructions stored therein which, when run on a terminal device, cause the terminal device to perform the method for processing an SQL statement according to any one of [example one] to [example seven].
It should be noted that the embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Because the system or apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may represent: only A, only B, or both A and B, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" or a similar expression means any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or plural.
It is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for processing a structured query language (SQL) statement, the method comprising:
acquiring an SQL statement;
converting the SQL statement into an abstract syntax tree (AST) according to an AST conversion rule corresponding to the type of the SQL statement; the node types of the AST comprise statement nodes, block nodes and expression nodes, wherein a statement node comprises at least one block node, and a block node comprises at least one expression node;
searching the expression nodes of the AST for a first target node belonging to an aggregate expression;
and calling a sensitivity calculation function of an internal node of the first target node to obtain the sensitivity of the internal node of the first target node, wherein the sensitivity of the internal node of the first target node is used for executing a differential privacy algorithm on the first target node.
2. The method of claim 1, wherein converting the SQL statement into an AST according to an abstract syntax tree AST conversion rule corresponding to a type of the SQL statement, comprises:
parsing the SQL statement into a word stream of the SQL statement;
matching the words in the word stream against the AST conversion rule corresponding to the type of the SQL statement, and determining the node types of the words and the node hierarchy among the words;
and converting the SQL statement into the AST according to the node types of the words and the node hierarchy among the words.
3. The method of claim 1, wherein the invoking the sensitivity calculation function of the internal node of the first target node to obtain the sensitivity of the internal node of the first target node comprises:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the metadata information is obtained by querying a data table through the SQL statement.
4. The method of claim 3, wherein, when the first target node comprises multiple levels of internal nodes, the invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node, and calculating the sensitivity of the internal node of the first target node by using the metadata information comprises:
calling a sensitivity calculation function of a first-level internal node of the first target node, acquiring metadata information of a second-level internal node of the first target node, and calculating the sensitivity of the first-level internal node of the first target node by using the metadata information of the second-level internal node of the first target node.
5. The method according to any one of claims 1-4, further comprising:
searching a second target node of a preset type in the expression nodes of the AST;
acquiring metadata information in a data table corresponding to the second target node according to the execution sequence of the SQL statement;
and binding the metadata information corresponding to the second target node with the second target node in the AST.
6. The method of claim 5, wherein the preset types are a table type, a column type, and a Map expression type;
when the second target node is of the table type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding table in the data table corresponding to the second target node;
when the second target node is of the column type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding column in the data table corresponding to the second target node;
and when the second target node is of the Map expression type, the metadata information in the data table corresponding to the second target node is the metadata information of the corresponding node in the data table corresponding to the second target node.
7. The method of claim 5, wherein the invoking the sensitivity calculation function of the internal node of the first target node to obtain the sensitivity of the internal node of the first target node comprises:
invoking a sensitivity calculation function of an internal node of the first target node, acquiring metadata information of the internal node of the first target node from the AST, and calculating the sensitivity of the internal node of the first target node by using the metadata information; the internal node of the first target node is matched with the second target node.
8. An apparatus for processing a structured query language (SQL) statement, the apparatus comprising:
a first acquisition unit, configured to acquire an SQL statement;
a conversion unit, configured to convert the SQL statement into an abstract syntax tree (AST) according to an AST conversion rule corresponding to the type of the SQL statement; the node types of the AST comprise statement nodes, block nodes and expression nodes, wherein a statement node comprises at least one block node, and a block node comprises at least one expression node;
a first searching unit, configured to search the expression nodes of the AST for a first target node belonging to an aggregate expression;
a calling unit, configured to call a sensitivity calculation function of an internal node of the first target node to obtain the sensitivity of the internal node of the first target node, wherein the sensitivity of the internal node of the first target node is used for executing a differential privacy algorithm on the first target node.
9. A processing device for structured query language SQL statements, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the method of processing an SQL statement of any one of claims 1-7 when the computer program is executed.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein instructions, which when run on a terminal device, cause the terminal device to execute the method of processing an SQL statement according to any of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310201045.4A CN116226169A (en) 2023-03-03 2023-03-03 SQL sentence processing method, device and equipment

Publications (1)

Publication Number Publication Date
CN116226169A true CN116226169A (en) 2023-06-06

Family

ID=86576553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310201045.4A Pending CN116226169A (en) 2023-03-03 2023-03-03 SQL sentence processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN116226169A (en)

Similar Documents

Publication Publication Date Title
CN109582691B (en) Method and apparatus for controlling data query
CN112395303A (en) Query execution method and device, electronic equipment and computer readable medium
CN111314388B (en) Method and apparatus for detecting SQL injection
WO2024021790A1 (en) Data lake-based virtual column construction method and data query method
CN109189395B (en) Data analysis method and device
WO2023029854A1 (en) Data query method and apparatus, storage medium, and electronic device
CN107463671B (en) Method and device for path query
CN113220710A (en) Data query method and device, electronic equipment and storage medium
WO2023065937A1 (en) Data processing method and apparatus, and readable medium and electronic device
CN116226169A (en) SQL sentence processing method, device and equipment
CN111737571B (en) Searching method and device and electronic equipment
CN113468342B (en) Knowledge graph-based data model construction method, device, equipment and medium
CN115827676A (en) SQL sub-query execution method, device, terminal equipment and medium
US20240078387A1 (en) Text chain generation method and apparatus, device, and medium
CN113157695B (en) Data processing method and device, readable medium and electronic equipment
CN113393288A (en) Order processing information generation method, device, equipment and computer readable medium
CN112307061A (en) Method and device for querying data
US20240135196A1 (en) Method and apparatus for knowledge graph construction, storage medium, and electronic device
CN114780107B (en) Grammar analysis method and device of rule running file and decision engine
CN115994151B (en) Data request changing method, device, electronic equipment and computer readable medium
CN117493375A (en) Structured query statement similarity detection method, device and equipment
CN116737762B (en) Structured query statement generation method, device and computer readable medium
CN113760905A (en) Database index processing method and device, electronic equipment and computer readable medium
CN114764406B (en) Database query method and related device
CN116166856A (en) Processing method, device, equipment and storage medium of table data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination