CN111522839A - Natural language query method based on deep learning - Google Patents


Info

Publication number
CN111522839A
Authority
CN
China
Prior art keywords
natural language
feature
query
vector
connection layer
Prior art date
Legal status
Granted
Application number
CN202010336575.6A
Other languages
Chinese (zh)
Other versions
CN111522839B (en)
Inventor
李玉华
李相臣
李瑞轩
辜希武
Current Assignee
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN202010336575.6A
Publication of CN111522839A
Application granted
Publication of CN111522839B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/243Natural language query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • G06F16/244Grouping and aggregation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • G06F16/2445Data retrieval commands; View definitions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2452Query translation
    • G06F16/24522Translation of natural language queries to structured queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a natural language query method based on deep learning, comprising the following steps. S1, input the natural language question Q into a pre-trained sentence vector model to obtain the corresponding sentence vector. S2, retrieve, from the sentence vector space of the sentence vector model, the nearest neighbour of that sentence vector together with its natural language question, and take the relational database table of the nearest-neighbour question as the target relational database table of question Q. S3, splice the natural language question Q together with the header of the target relational database table and input the result into a pre-trained conversion model to obtain the corresponding SQL query statement. S4, query the target relational database table of question Q with the obtained SQL statement to obtain the query result. Given a natural language question, the method quickly determines the target relational database table and achieves high accuracy of query results in real scenarios.

Description

Natural language query method based on deep learning
Technical Field
The invention belongs to the field of natural language processing, and particularly relates to a natural language query method based on deep learning.
Background
With the popularization and development of information systems, the volume and importance of data keep growing, and so does people's demand for it. Finding the needed data is often not easy: most data is stored in databases, querying a database requires a dedicated query language, and learning and using such a language has a real threshold, so the learning cost for ordinary users is high. Research on natural language query methods is therefore of significant value.
Researchers have proposed many natural language query methods, from early approaches based on syntactic analysis to the latest approaches based on deep learning. Early work typically used grammar analysis to build customized natural language interfaces to databases; such methods usually require domain expert knowledge and per-system customization, and their generalization ability is limited. With the rise of deep learning and the rapid growth of computing power, research on deep-learning-based natural language query has increased, and scholars have successively proposed methods such as Seq2SQL, SQLNet, SQLova, TypeSQL, X-SQL, Coarse2Fine, Pointer-SQL, Annotated Seq2Seq, the Multitask Question Answering Network (MQAN), and execution-guided decoding for the NL2SQL task. These methods generalize comparatively well but need large amounts of manually labeled training data, and producing training data by manual labeling is expensive. Some researchers therefore propose generating training data with weakly supervised learning, for example Memory Augmented Policy Optimization (MAPO) and Meta Reward Learning (MeRL), and then training deep models on the generated data. However, most of the above methods assume the target relational database table of the current natural language question is given and study only how to generate the SQL query statement over that table, without considering how to locate the target relational database table from the natural language question itself.
In practice, locating the relational database table from the natural language question is unavoidable and directly determines whether a correct SQL query statement can be generated, so existing methods achieve low accuracy of query results in real scenarios. Moreover, most current research concentrates on the English WikiSQL dataset, and work on Chinese natural language query interfaces is scarce.
Disclosure of Invention
In view of the above defects or improvement needs of the prior art, the invention provides a natural language query method based on deep learning, to solve the technical problem that existing methods achieve low accuracy of query results in real scenarios because they do not consider locating the target relational database table from the natural language question.
In order to achieve the above object, in a first aspect, the present invention provides a deep learning-based natural language query method, including the following steps:
S1, inputting the natural language question Q into a pre-trained sentence vector model to obtain the corresponding sentence vector;
S2, retrieving, from the sentence vector space of the sentence vector model, the nearest neighbour of that sentence vector together with its natural language question, and taking the relational database table of the nearest-neighbour question as the target relational database table of question Q;
S3, splicing the natural language question Q together with the header of the target relational database table and inputting the result into a pre-trained conversion model to obtain the corresponding SQL query statement;
S4, querying the target relational database table of question Q with the obtained SQL statement to obtain the query result;
wherein the conversion model comprises: a BERT pre-training model and a first, second, third, fourth, fifth, sixth, and seventh fully connected layer.
Further preferably, in step S3 the natural language question Q and the header of the target relational database table are spliced together into the input list of the conversion model, specifically: [[CLS], Q, [SEP], first header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP], second header, feature 1, feature 2, feature 3, feature 4, [SEP], ..., C-th header, feature 1, feature 2, feature 3, feature 4, [SEP]]; each character of the natural language question Q is an element of the input list, the position of each feature of each header is a reserved position, and C is the number of columns of the database table.
Further preferably, a BERT pre-training model (Bidirectional Encoder Representations from Transformers) is used to derive, from the input list, the vector V_CLS corresponding to [CLS], the vector V_l corresponding to each character of the natural language question Q, the vector V_c corresponding to each header of the database table, and the vectors corresponding to feature 1, feature 2, feature 3, and feature 4 of each header, denoted V_c1, V_c2, V_c3, and V_c4 respectively. V_CLS is input into the first, second, and third fully connected layers; V_c1 is input into the fourth fully connected layer; V_c2, V_c3, and V_c4 are input in parallel into the fifth and sixth fully connected layers; and V_c2, V_c3, V_c4 together with V_l are input into the seventh fully connected layer; c = 1, 2, ..., C, where C is the number of columns of the database table.
The first fully connected layer predicts, from the vector V_CLS corresponding to [CLS], the connectors between the conditions of the WHERE clause in the SQL query statement;
the second fully connected layer predicts, from V_CLS, the number of columns involved in the SELECT clause of the SQL query statement;
the third fully connected layer predicts, from V_CLS, the number of columns involved in the WHERE clause of the SQL query statement;
the fourth fully connected layer predicts, from the vector V_c1 corresponding to feature 1 of each header, whether each column in the SQL query statement is a SELECT column and its corresponding aggregation symbol;
the fifth fully connected layer predicts, from the vectors V_c2, V_c3, and V_c4 corresponding to features 2, 3, and 4 of each header, whether each column in the SQL query statement is a condition column of the WHERE clause;
the sixth fully connected layer predicts, from V_c2, V_c3, and V_c4, the comparison symbols of the columns involved in the WHERE clause of the SQL query statement;
the seventh fully connected layer predicts, from the vector V_l corresponding to each character of the natural language question Q and the vectors V_c2, V_c3, and V_c4 of features 2, 3, and 4 of each header, the start and end position in question Q of the value of each condition of the WHERE clause.
Further preferably, the sizes of the first through seventh fully connected layers are, respectively: hidden_size × 3, hidden_size × 4, hidden_size × 4, hidden_size × 7, hidden_size × 1, hidden_size × 4, and hidden_size × 2, where hidden_size is the dimension of the output vector of the BERT pre-training model.
Further preferably, the sentence vector model training method includes the following steps:
S011, collect a query data training set {(Q_i, SQL_i)}, where Q_i and SQL_i are the i-th natural language question and its corresponding SQL query statement; each natural language question carries its corresponding relational database table; 1 ≤ i ≤ N, and N is the number of items in the query data training set;
S012, compute the pairwise similarity between the SQL query statements corresponding to the natural language questions in the training set to obtain a triple data set {(Q_i, Q_j, Sim_ij)}, where Sim_ij is the similarity between the SQL query statements corresponding to questions Q_i and Q_j, i ≠ j; only the SELECT columns and the WHERE condition columns are considered when computing the similarity of SQL query statements;
S013, input the triple data set into the sentence vector model and train it to obtain the pre-trained sentence vector model, then input all natural language questions of the query data training set into the pre-trained model to obtain the sentence vector space of the sentence vector model.
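A minimal sketch of the SQL-statement similarity of step S012. The patent says only the SELECT columns and the WHERE condition columns are compared; the Jaccard formulation below is an assumption, since the patent does not name a concrete metric.

```python
# Hypothetical similarity between two SQL queries, comparing only the
# SELECT columns and the WHERE condition columns. The Jaccard
# formulation is an assumption, not taken from the patent.
def sql_similarity(sel_a, where_a, sel_b, where_b):
    """Jaccard similarity over the tagged union of SELECT and WHERE columns."""
    # Tag the two roles so a column used in SELECT does not collide with
    # the same column used as a WHERE condition.
    a = {("sel", c) for c in sel_a} | {("where", c) for c in where_a}
    b = {("sel", c) for c in sel_b} | {("where", c) for c in where_b}
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Two questions whose SQL uses identical SELECT and WHERE columns score 1.0; fully disjoint column sets score 0.0, which is the range the interval mapping of the embodiment then discretizes.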
Further preferably, the training method of the conversion model includes:
For the query data training set {(Q_i, SQL_i)}, splice each natural language question with the header of its corresponding database table to obtain (Q_i, Header_i); with (Q_i, Header_i) as input and SQL_i as output, train the conversion model to obtain the pre-trained conversion model, where 1 ≤ i ≤ N and N is the number of items in the query data training set.
Further preferably, the loss function of the conversion model is:

Loss = loss_1 + loss_2 + loss_3 + loss_4 + loss_5 + loss_6 + loss_7

where loss_1 compares the connectors between the conditions of the WHERE clause in the SQL query statements of the training set with the connectors produced by the first fully connected layer of the conversion model; loss_2 compares the number of columns involved in the SELECT clause with the number produced by the second fully connected layer; loss_3 compares the number of columns involved in the WHERE clause with the number produced by the third fully connected layer; loss_4 compares whether each column is a SELECT column and its corresponding aggregation symbol with the prediction of the fourth fully connected layer; loss_5 compares whether each column is a condition column of the WHERE clause with the prediction of the fifth fully connected layer; loss_6 compares the comparison symbols of the columns involved in the WHERE clause with those produced by the sixth fully connected layer; and loss_7 compares the start and end positions, within the natural language question, of the values of the conditions of the WHERE clause with the positions produced by the seventh fully connected layer.
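The loss function above combines seven subtask losses. A minimal sketch, assuming an unweighted sum (the original formula appears only as an image placeholder in the patent text, so the absence of per-subtask weights is an assumption):

```python
# Hedged sketch: total training loss of the conversion model, taken as
# the unweighted sum of the seven subtask losses loss_1 .. loss_7.
def total_loss(sub_losses):
    """Sum the seven subtask losses into one scalar training objective."""
    if len(sub_losses) != 7:
        raise ValueError("the conversion model defines exactly seven subtask losses")
    return sum(sub_losses)
```

In joint training, each sub-loss would be the cross-entropy (or span) loss of its fully connected layer, and one backward pass through the shared BERT encoder updates all subtasks together.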
In a second aspect, the present invention also provides a storage medium, which when read by a computer, causes the computer to execute the deep learning based natural language query method provided by the first aspect of the present invention.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
1. The invention provides a natural language query method based on deep learning. An arbitrary natural language question is turned into a sentence vector by the sentence vector model; the nearest neighbour of that vector is then retrieved from the sentence vector space, and the relational database table of the nearest neighbour is taken as the target relational database table of the current question. These steps map natural language questions into a vector space of fixed dimension in which similar questions lie close together, so that, given a question, the database table of its nearest neighbour, that is, its target relational database table, can be found quickly. An SQL statement is then generated from the question and executed against the obtained target table, and the accuracy of the query results in real scenarios is high.
2. In the process of converting natural language into an SQL query statement, the conversion model divides the overall SQL prediction task into different subtasks: the number of SELECT columns, the SELECT columns and their aggregators, the number of condition columns in the WHERE clause, the condition columns themselves, their comparison symbols, their values, and the connectors between different conditions. All subtasks are trained jointly and their predictions are finally combined, giving the model high accuracy.
3. For any natural language question in a real application scenario, the method quickly completes both processes of target database table location and conversion to the target SQL query statement; its processing speed is high and meets the requirements of practical applications.
4. The method transfers well to other domains: when moving to a different field, only the training data of the new field needs to be fed into the model for one round of retraining.
5. Because the conversion model divides the task into subtasks, the conversion process is fine-grained: when evaluating the model, one can see directly which subtask is its weak point, which facilitates targeted modification and improvement of that subtask and yields higher model accuracy.
Drawings
Fig. 1 is a flowchart of a natural language query method based on deep learning according to embodiment 1 of the present invention;
fig. 2 is a schematic structural diagram of a conversion model provided in embodiment 1 of the present invention;
fig. 3 is a flowchart of predicting the connectors between the conditions of the WHERE clause with the first fully connected layer according to embodiment 1 of the present invention;
fig. 4 is a flowchart of predicting the number of columns in the SELECT clause with the second fully connected layer according to embodiment 1 of the present invention;
fig. 5 is a flowchart of predicting the number of condition columns in the WHERE clause with the third fully connected layer according to embodiment 1 of the present invention;
fig. 6 is a flowchart of predicting the columns and aggregation symbols of the SELECT clause with the fourth fully connected layer according to embodiment 1 of the present invention;
fig. 7 is a flowchart of predicting each condition column of the WHERE clause with the fifth fully connected layer according to embodiment 1 of the present invention;
fig. 8 is a flowchart of predicting the comparison symbols of the WHERE clause with the sixth fully connected layer according to embodiment 1 of the present invention;
fig. 9 is a flowchart of predicting the start and end positions of each condition value of the WHERE clause with the seventh fully connected layer according to embodiment 1 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Embodiment 1
A natural language query method based on deep learning, as shown in fig. 1, includes the following steps:
S1, inputting the natural language question Q into a pre-trained sentence vector model to obtain the corresponding sentence vector;
specifically, the sentence vector model training method comprises the following steps:
S011, collect a query data training set {(Q_i, SQL_i)}, where Q_i and SQL_i are the i-th natural language question and its corresponding SQL query statement; each natural language question carries its corresponding relational database table; 1 ≤ i ≤ N, and N is the number of items in the query data training set;
S012, compute the pairwise similarity between the SQL query statements corresponding to the natural language questions in the training set to obtain a triple data set {(Q_i, Q_j, Sim_ij)}, where Sim_ij is the similarity between the SQL query statements corresponding to questions Q_i and Q_j, i ≠ j; only the SELECT columns and the WHERE condition columns are considered when computing the similarity of SQL query statements;
Specifically, [0, 1] is divided evenly into 4 similarity intervals of equal size: [0, 0.25), [0.25, 0.5), [0.5, 0.75), and [0.75, 1.0]. Each computed similarity is assigned to its interval and taken down to the interval boundary, mapping it into the set {0, 0.25, 0.5, 0.75, 1.0}.
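A minimal sketch of this interval mapping. Treating exactly 1.0 as its own label is one reading of the rule, assumed here because the target set {0, 0.25, 0.5, 0.75, 1.0} contains five values while rounding down alone would only ever produce four:

```python
import math

def discretize_similarity(sim):
    """Map a similarity in [0, 1] to a label in {0, 0.25, 0.5, 0.75, 1.0}
    by taking it down to the lower boundary of its quarter-interval."""
    if not 0.0 <= sim <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    if sim == 1.0:  # the closed right end of [0.75, 1.0]
        return 1.0
    return math.floor(sim * 4) / 4
```

The five resulting labels are then used as the similarity categories that the softmax layer of the sentence vector model is trained to predict.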
S013, input the triple data set into the sentence vector model and train it to obtain the pre-trained sentence vector model, then input all natural language questions of the query data training set into the pre-trained model to obtain the sentence vector space of the sentence vector model.
Specifically, when computing sentence vectors, the sentence vector model consists only of a BERT pre-training model and a pooling layer connected in sequence, and the output of the pooling layer is the sentence vector of the natural language question. During training, the sentence vector model consists of a BERT pre-training model, a pooling layer, a splicing module, and a softmax layer connected in sequence; its loss function is the cross-entropy loss. Specifically, two questions are input into the BERT pre-training model and the output word vectors are average-pooled; the two sentence vectors u and v are then spliced as [u, v, |u - v|] to obtain a vector of dimension 3 × n, where n is the dimension of a sentence vector. This 3 × n vector is input into the softmax layer to obtain the similarity category of the two natural language questions; the cross-entropy loss is then computed, and the parameters of the sentence vector model are updated from the resulting loss value.
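The pooling and splicing steps above can be sketched with plain float lists standing in for BERT word vectors (a stdlib-only sketch; the real model would operate on tensors):

```python
def mean_pool(word_vectors):
    """Average a list of equal-length word vectors into one sentence vector
    (the average pooling operation applied to the BERT outputs)."""
    n = len(word_vectors)
    return [sum(dim) / n for dim in zip(*word_vectors)]

def pair_features(u, v):
    """Splice two sentence vectors as [u, v, |u - v|], giving the
    3 * n-dimensional input of the softmax layer."""
    return u + v + [abs(a - b) for a, b in zip(u, v)]
```

In training, `pair_features(u, v)` would feed a softmax classifier over the similarity categories, with cross-entropy against the discretized similarity label.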
S2, retrieving, from the sentence vector space of the sentence vector model, the nearest neighbour of the sentence vector of question Q together with its natural language question, and taking the relational database table of the nearest-neighbour question as the target relational database table of the natural language question Q;
S3, splicing the natural language question Q and the header of the target relational database table into the input list of the conversion model, specifically: [[CLS], Q, [SEP], first header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP], second header, feature 1, feature 2, feature 3, feature 4, [SEP], ..., C-th header, feature 1, feature 2, feature 3, feature 4, [SEP]], and inputting it into the pre-trained conversion model to obtain the corresponding SQL query statement; each character of the natural language question Q is an element of the input list, the position of each feature of each header is a reserved position, and C is the number of columns of the database table.
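The splicing in step S3 can be sketched as follows. The literal placeholder tokens `[FEAT1]`..`[FEAT4]` for the reserved feature positions are an assumption; the patent only says each feature occupies a reserved position in the list.

```python
def build_input_list(question, headers, n_features=4):
    """Splice [CLS], the question's characters, and each of the C headers
    followed by its reserved feature slots, separated by [SEP]."""
    tokens = ["[CLS]"] + list(question) + ["[SEP]"]
    for header in headers:
        tokens.append(header)
        # Reserved positions: the BERT output vectors at these slots
        # become V_c1 .. V_c4 for this header.
        tokens.extend("[FEAT%d]" % k for k in range(1, n_features + 1))
        tokens.append("[SEP]")
    return tokens
```

For a two-character question and a two-column table this yields 16 elements: one [CLS], the two characters, and for each header one header token, four feature slots, and a [SEP].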
Specifically, as shown in fig. 2, the conversion model comprises: a BERT pre-training model and a first, second, third, fourth, fifth, sixth, and seventh fully connected layer;
wherein, in this embodiment, the sizes of the first through seventh fully connected layers are, respectively: hidden_size × 3, hidden_size × 4, hidden_size × 4, hidden_size × 7, hidden_size × 1, hidden_size × 4, and hidden_size × 2, where hidden_size is the dimension of the output vector of the BERT pre-training model. Before input into the conversion model, the natural language question Q and the header of the target relational database table are spliced into the input list of the conversion model, specifically: [[CLS], Q, [SEP], first header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP], second header, feature 1, feature 2, feature 3, feature 4, [SEP], ..., C-th header, feature 1, feature 2, feature 3, feature 4, [SEP]], where each character of the natural language question Q is an element of the input list, the position of each feature of each header is a reserved position whose word vector is taken from the BERT pre-training model's output at that position, and C is the number of columns of the database table.
The BERT pre-training model is used for obtaining [ CLS ] based on an input list]Corresponding vector VCLSVector V corresponding to each character in natural language question QlThe vector Vc corresponding to each list head of the database, and the vectors corresponding to the characteristic 1, the characteristic 2, the characteristic 3 and the characteristic 4 of each list head are respectively marked as Vc1、Vc2、Vc3 andVc4(ii) a And will VCLSRespectively inputting the V into a first full connection layer, a second full connection layer and a third full connection layerc1Inputting into a fourth fully-connected layer, and adding Vc2、Vc3And Vc4Respectively inputting the V into a fifth full connection layer and a sixth full connection layer in parallelc2、Vc3And Vc4And VlInputting the integrated data into a seventh full connection layer; l is 1, 2, theC is the column number of the database table; first full connection layer for [ CLS-based]Corresponding vector VCLSAnd predicting connectors between the conditions of the WHERE clause in the SQL query statement, wherein the connectors between the conditions of the WHERE clause in the embodiment include "and", "or" and no connector, as shown in fig. 3. Second fully connected layer for [ CLS ] based]Corresponding vector VCLSThe number of columns involved in the SELECT clause in the SQL query statement is predicted, as shown in fig. 4. Third fully connected layer for [ CLS ] based]Corresponding vector VCLSThe number of columns involved in the WHERE clause in the SQL query statement is predicted, as shown in fig. 5. The fourth full connection layer is used for generating a vector V corresponding to the characteristic 1 of each list headc1And predicting whether each column in the SQL query statement is a SELECT column and a corresponding aggregation symbol, as shown in fig. 6. 
The fifth fully connected layer predicts, from the vectors V_c2, V_c3 and V_c4 corresponding to feature 2, feature 3 and feature 4 of each header, whether each column of the SQL query statement is a condition column of the WHERE clause, as shown in fig. 7. The sixth fully connected layer predicts, from V_c2, V_c3 and V_c4, the comparison symbol of each column involved in the WHERE clause of the SQL query statement, as shown in fig. 8. The seventh fully connected layer predicts, from the vector V_l corresponding to each character of the natural language question Q together with the vectors V_c2, V_c3 and V_c4 of each header, the start position and the end position, within the natural language question Q, of the value of each condition of the WHERE clause of the SQL query statement, as shown in fig. 9.
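The seven heads above can be sketched in numpy (the value hidden_size = 768 and the random weights are illustrative assumptions; output widths follow the layer sizes and subtask class counts stated in this embodiment): each head is one linear map from the hidden dimension to its number of prediction classes.

```python
import numpy as np

hidden_size = 768  # dimension of the BERT output vector (assumed value)
head_out_dims = {
    "where_connector": 3,  # subtask 1: "and" / "or" / no connector
    "n_select_cols": 4,    # subtask 2: 1..4 SELECT columns
    "n_where_cols": 4,     # subtask 3: 1..4 WHERE columns
    "select_agg": 7,       # subtask 4: not selected / "" / AVG / MAX / MIN / SUM / COUNT
    "is_where_col": 1,     # subtask 5: likelihood the column is a condition column
    "where_op": 4,         # subtask 6: four comparison symbols
    "value_span": 2,       # subtask 7: start / end position scores
}
rng = np.random.default_rng(0)
heads = {k: rng.standard_normal((hidden_size, n)) for k, n in head_out_dims.items()}

v_cls = rng.standard_normal(hidden_size)        # stands in for V_CLS
connector_logits = v_cls @ heads["where_connector"]  # 3 connector logits
```

In the real model the weights are trained jointly with BERT; here the random matrices only show the shapes of the seven predictions.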
Further, the training method of the conversion model comprises the following steps:
For each item of the query data training set {(Q_i, SQL_i)}, splice the natural language question with the header of its corresponding database table to obtain (Q_i, Header_i); take (Q_i, Header_i) as input and SQL_i as output, and train the conversion model to obtain the pre-trained conversion model, wherein 1 ≤ i ≤ N and N is the number of items in the query data training set. Specifically, the method comprises the following steps:
S031, let i = 1;
S032, splice the natural language question Q_i and the header columns of its corresponding database table together to obtain the input list of the conversion model;
specifically, the input list is: [[CLS], Q_i, [SEP], first header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP], second header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP], ......, [SEP], C-th header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP]], wherein each character of the natural language question Q_i is one element of the input list, the position of each feature of each header of the target relational database table is a reserved position, and C is the number of columns of the database table.
S033, input the input list into the BERT pre-training model of the conversion model to obtain the vector V_CLS corresponding to [CLS], the vector V_l corresponding to each character of the natural language question Q_i, the vector V_c corresponding to each header of the database table, and the vectors corresponding to feature 1, feature 2, feature 3 and feature 4 of each header, denoted V_c1, V_c2, V_c3 and V_c4 respectively; l = 1, 2, ..., L, where L is the length of the natural language question, c = 1, 2, ..., C, and C is the number of columns of the database table;
S034, input the vector V_CLS corresponding to [CLS] into the first fully connected layer of the conversion model, and predict the connector between the conditions of the WHERE clause of the SQL query statement;
S035, input the vector V_CLS corresponding to [CLS] into the second and third fully connected layers of the conversion model respectively, and predict the number of columns involved in the SELECT clause and in the WHERE clause of the SQL query statement;
S036, input the vector V_c1 corresponding to feature 1 of each header into the fourth fully connected layer, and predict whether each column of the SQL query statement is a SELECT column and its aggregation symbol;
S037, input the vectors V_c2, V_c3 and V_c4 corresponding to feature 2, feature 3 and feature 4 of each header into the fifth and sixth fully connected layers respectively, and predict the condition columns involved in the WHERE clause of the SQL query statement and the comparison symbol of each condition column;
S038, input the vector V_l corresponding to each character of the natural language question Q_i together with the vectors V_c2, V_c3 and V_c4 corresponding to feature 2, feature 3 and feature 4 of each header into the seventh fully connected layer, and predict the start position and the end position, within the natural language question Q_i, of the value of each condition of the WHERE clause of the SQL query statement;
S039, preprocess each element of the SQL query statement of the training data into a vector of the specified size to serve as the label, compute the loss function between each label and the corresponding result of steps S034-S038, add the per-part loss functions to obtain the total loss function, and update the parameter values of the conversion model by minimizing the total loss function;
Specifically, processing each element of the SQL query statement of the training data into a label vector of the specified size and computing the corresponding loss functions proceeds as follows:
(1) For subtask 1, the connector between the conditions of the WHERE clause is predicted: the first fully connected layer outputs a prediction vector v13 of size 3, whose components correspond to the three connectors; the actual connector between the conditions of the WHERE clause of the SQL query statement is converted into a label vector v11 of size 1. This subtask uses a cross entropy loss function:
loss1=cross_entropy(v13,v11)
(2) For subtask 2, the number of columns involved in the SELECT clause is predicted: the second fully connected layer outputs a prediction vector v24 of size 4, whose components correspond to 1, 2, 3 and 4 columns respectively; the actual number of columns of the SELECT clause of the SQL query statement is converted into a label vector v21 of size 1. This subtask uses a cross entropy loss function:
loss2=cross_entropy(v24,v21)
(3) For subtask 3, the number of columns involved in the WHERE clause of the SQL query statement is predicted: the third fully connected layer outputs a prediction vector v34 of size 4, whose components correspond to 1, 2, 3 and 4 columns respectively; the actual number of columns of the WHERE clause of the SQL query statement is converted into a label vector v31 of size 1. This subtask uses a cross entropy loss function:
loss3=cross_entropy(v34,v31)
(4) For subtask 4, the column and aggregation symbol information of the SELECT clause of the SQL query statement is predicted, i.e. whether each column of the header is a column of the SELECT clause and, if so, its aggregation symbol: the fourth fully connected layer outputs a prediction vector v47 of size 7, whose components correspond to the seven cases that the current column is not a column of the SELECT clause, or is a column of the SELECT clause with aggregation symbol "" (none), AVG, MAX, MIN, SUM or COUNT; the actual SELECT clause information of the SQL query statement is converted into a label vector v41 of size 1. This subtask uses a cross entropy loss function:
loss4=cross_entropy(v47,v41)
(5) For subtask 5, the column information of the WHERE clause is predicted, i.e. whether each column of the header is a column of the WHERE clause: the fifth fully connected layer outputs a prediction vector v51 of size 1 representing the likelihood that the current column is a column of the WHERE clause; the actual columns of the WHERE clause are converted into a label vector v51' of size 1. This subtask measures the error between the two using the KL divergence:
loss5=kl_loss(softmax(v51),v51′)
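A toy numpy illustration of this subtask-5 loss (the direction of the KL divergence and the epsilon smoothing are implementation assumptions): the per-column scores v51 are softmaxed over the C columns and compared against the 0/1 label distribution v51'.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def kl_loss(p_label, q_pred, eps=1e-9):
    # KL(p || q) = sum_i p_i * log(p_i / q_i); eps guards against log(0)
    p = np.clip(p_label, eps, 1.0)
    p = p / p.sum()
    q = np.clip(q_pred, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

scores = np.array([2.0, -1.0, 0.5])   # v51 scores for a 3-column table
label = np.array([1.0, 0.0, 0.0])     # column 0 is the WHERE condition column
loss5 = kl_loss(label, softmax(scores))
```

The loss is small when the highest-scoring column matches the labelled condition column and grows when the label falls on a low-scoring column.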
(6) For subtask 6, the comparison symbol of each condition column of the WHERE clause is predicted: the sixth fully connected layer outputs a prediction vector v64 of size 4, whose components correspond to the four comparison symbols; the actual comparison symbol of each condition column of the WHERE clause is converted into a label vector v61 of size 1. This subtask uses a cross entropy loss function:
loss6=cross_entropy(v64,v61)
(7) For subtask 7, the value of each condition of the WHERE clause is predicted, i.e. the start position and the end position of each condition value within the natural language question: the seventh fully connected layer outputs a prediction vector v732L of size 3 × 2 × L representing the possible start and end positions of the current condition value and their probabilities, where L is the length of the natural language question; the actual start and end positions, within the natural language question, of the condition values of the WHERE clause are converted into a matrix v732 of size 3 × 2. This task uses a cross entropy loss function:
loss7=cross_entropy(v732L,v732)
Combining the subtasks, the loss function of the conversion model is:
loss = loss1 + loss2 + loss3 + loss4 + loss5 + loss6 + loss7
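A toy sketch of combining the seven subtask losses into the single training objective (the cross-entropy helper and the logits/label values are illustrative, not real model outputs; subtask 5 would use the KL loss instead):

```python
import numpy as np

def cross_entropy(logits, label_idx):
    # softmax followed by negative log-likelihood of the labelled class
    e = np.exp(logits - logits.max())
    probs = e / e.sum()
    return float(-np.log(probs[label_idx]))

# one toy logits/label pair per subtask
sub_losses = [cross_entropy(np.array([1.0, 0.2, -0.5]), 0) for _ in range(7)]
total_loss = sum(sub_losses)   # loss = loss1 + loss2 + ... + loss7
```

Minimizing the summed loss updates the shared BERT parameters and all seven heads jointly.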
S040, let i = i + 1;
S041, repeat steps S032-S040 until i is greater than the number N of items in the query data training set.
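The outer loop S031-S041 amounts to one pass over the N training items; as a sketch (the step function below is a stand-in for the forward pass, loss computation and parameter update of steps S032-S039):

```python
def train_epoch(step_fn, dataset):
    """dataset: list of (question, header, sql_label) items, i = 1 .. N.
    step_fn performs one model update and returns the loss for the item."""
    losses = []
    for question, header, sql_label in dataset:  # stops once i exceeds N
        losses.append(step_fn(question, header, sql_label))
    return sum(losses) / len(losses)

toy_data = [("q1", ["col_a"], "sql1"), ("q2", ["col_b"], "sql2")]
avg_loss = train_epoch(lambda q, h, s: 0.5, toy_data)
```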
S4, query the target relational database table of the natural language question Q with the obtained SQL statement to obtain the query result.
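Step S4 can be illustrated with Python's built-in sqlite3 (the table name, columns and generated SQL below are hypothetical, chosen only to show the execution step):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city_weather (city TEXT, high_temp REAL)")
conn.executemany("INSERT INTO city_weather VALUES (?, ?)",
                 [("Wuhan", 31.0), ("Beijing", 28.5)])

# stands in for the SQL statement produced by the conversion model in S3
generated_sql = "SELECT MAX(high_temp) FROM city_weather WHERE city = 'Wuhan'"
result = conn.execute(generated_sql).fetchall()
```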
Embodiment 2
A storage medium storing instructions which, when read by a computer, cause the computer to execute the deep-learning-based natural language query method provided in Embodiment 1 of the present invention.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A natural language query method based on deep learning is characterized by comprising the following steps:
S1, inputting the natural language question Q into a pre-trained sentence vector model to obtain a corresponding sentence vector;
S2, retrieving, from the sentence vector space corresponding to the sentence vector model, the nearest neighbor of the sentence vector corresponding to the natural language question Q together with its natural language question, and taking the relational database table of the natural language question of the obtained nearest neighbor sentence vector as the target relational database table of the natural language question Q;
S3, splicing the natural language question Q and the header of the target relational database table together, and inputting them into a pre-trained conversion model to obtain a corresponding SQL query statement;
S4, querying the target relational database table of the natural language question Q with the obtained SQL statement to obtain a query result;
wherein the conversion model comprises: the device comprises a BERT pre-training model, a first full connection layer, a second full connection layer, a third full connection layer, a fourth full connection layer, a fifth full connection layer, a sixth full connection layer and a seventh full connection layer.
2. The deep learning-based natural language query method according to claim 1, wherein in step S3, the natural language question Q and the header of the target relational database table are spliced together to obtain the input list of the conversion model, specifically: [[CLS], Q, [SEP], first header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP], second header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP], ......, [SEP], C-th header of the target relational database table, feature 1, feature 2, feature 3, feature 4, [SEP]]; each character of the natural language question Q is one element of the input list, the position of each feature of each header of the target relational database table is a reserved position, and C is the number of columns of the database table.
3. The deep learning-based natural language query method according to claim 2, wherein the BERT pre-training model is used to obtain, from the input list, the vector V_CLS corresponding to [CLS], the vector V_l corresponding to each character of the natural language question Q, the vector V_c corresponding to each header of the database table, and the vectors corresponding to feature 1, feature 2, feature 3 and feature 4 of each header, denoted V_c1, V_c2, V_c3 and V_c4 respectively; V_CLS is input into the first, second and third fully connected layers respectively; V_c1 is input into the fourth fully connected layer; V_c2, V_c3 and V_c4 are input in parallel into the fifth and sixth fully connected layers; and V_c2, V_c3, V_c4 together with V_l are input into the seventh fully connected layer; l = 1, 2, ..., L, where L is the length of the natural language question Q, c = 1, 2, ..., C, and C is the number of columns of the database table;
The first full connection layer is used for being based on [ CLS]Corresponding vector VCLSPredicting connectors among conditions of WHERE clauses in an SQL query statement;
the second fully connected layer is used for being based on [ CLS]Corresponding vector VCLSPredicting the number of columns involved in the SELECT clause in the SQL query statement;
the third fully connected layer is used for being based on [ CLS]Corresponding vector VCLSPredicting the number of columns involved in a WHERE clause in an SQL query statement;
the fourth full connection layer is used for generating a vector V corresponding to the characteristic 1 of each list headc1Predicting whether each column in the SQL query statement is a SELECT column and a corresponding aggregation symbol;
the fifth full connection layer is used for generating a vector V corresponding to the characteristic 2, the characteristic 3 and the characteristic 4 of each list headc2、Vc3、Vc4Predicting whether each column in the SQL query statement is a condition column in the WHERE clause;
the sixth full connection layer is used for generating a vector V corresponding to the feature 2, the feature 3 and the feature 4 of each list headc2、Vc3、Vc4Predicting comparison symbols of columns related to the WHERE clause in the SQL query statement;
the seventh full connection layer is used for solving the problem that the vector V corresponding to each character in the natural language question Q islAnd a vector V corresponding to the feature 2, the feature 3, and the feature 4 of each list headerc2、Vc3、Vc4And predicting the starting position and the ending position of the value of each condition in the WHERE clause in the SQL query statement in the natural language question Q.
4. The deep learning-based natural language query method according to claim 1 or 3, wherein the sizes of the first fully connected layer, the second fully connected layer, the third fully connected layer, the fourth fully connected layer, the fifth fully connected layer, the sixth fully connected layer and the seventh fully connected layer are respectively: hidden_size × 3, hidden_size × 4, hidden_size × 4, hidden_size × 7, hidden_size × 1, hidden_size × 4 and hidden_size × 2; wherein hidden_size is the dimension of the output vector of the BERT pre-training model.
5. The deep learning-based natural language query method according to claim 1, wherein the sentence vector model training method comprises the following steps:
S011, collecting a query data training set {(Q_i, SQL_i)}, wherein Q_i and SQL_i are the i-th natural language question and its corresponding SQL query statement respectively; each natural language question carries its corresponding relational database table; 1 ≤ i ≤ N, and N is the number of items in the query data training set;
S012, calculating the similarity between the SQL query statements corresponding to every two natural language questions of the query data training set to obtain a triple data set {(Q_i, Q_j, Sim_ij)}, wherein Sim_ij is the similarity between the SQL query statements corresponding to the natural language questions Q_i and Q_j, and i ≠ j; only the SELECT columns and the WHERE condition columns are considered when calculating the similarity of the SQL query statements;
S013, inputting the triple data set into the sentence vector model for training to obtain the pre-trained sentence vector model, and inputting all natural language questions of the query data training set into the pre-trained sentence vector model to obtain the sentence vector space corresponding to the sentence vector model.
6. The deep learning based natural language query method according to claim 1 or 3, wherein the training method of the conversion model comprises:
for each item of the query data training set {(Q_i, SQL_i)}, splicing the natural language question with the header of its corresponding database table to obtain (Q_i, Header_i); taking (Q_i, Header_i) as input and SQL_i as output, training the conversion model to obtain the pre-trained conversion model, wherein 1 ≤ i ≤ N, and N is the number of items in the query data training set.
7. The deep learning based natural language query method of claim 6, wherein the loss function of the conversion model is:
loss = loss1 + loss2 + loss3 + loss4 + loss5 + loss6 + loss7
wherein loss1 is the loss function between the connectors between the conditions of the WHERE clauses of the SQL query statements of the query data training set and the connectors between the conditions of the WHERE clauses obtained by the first fully connected layer of the conversion model; loss2 is the loss function between the number of columns involved in the SELECT clauses of the SQL query statements of the query data training set and the number of columns involved in the SELECT clauses obtained by the second fully connected layer of the conversion model; loss3 is the loss function between the number of columns involved in the WHERE clauses of the SQL query statements of the query data training set and the number of columns involved in the WHERE clauses obtained by the third fully connected layer of the conversion model; loss4 is the loss function between whether each column of the SQL query statements of the query data training set is a SELECT column with its corresponding aggregation symbol and the corresponding prediction obtained by the fourth fully connected layer of the conversion model; loss5 is the loss function between whether each column of the SQL query statements of the query data training set is a condition column of the WHERE clause and the corresponding prediction obtained by the fifth fully connected layer of the conversion model; loss6 is the loss function between the comparison symbols of the columns involved in the WHERE clauses of the SQL query statements of the query data training set and the comparison symbols obtained by the sixth fully connected layer of the conversion model; loss7 is the loss function between the start and end positions, within the natural language questions, of the values of the conditions of the WHERE clauses of the SQL query statements of the query data training set and the start and end positions, within the natural language questions, of the values of the conditions of the WHERE clauses obtained by the seventh fully connected layer of the conversion model.
8. A storage medium, which when read by a computer, causes the computer to execute the deep learning based natural language query method according to any one of claims 1 to 7.
CN202010336575.6A 2020-04-25 2020-04-25 Deep learning-based natural language query method Active CN111522839B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010336575.6A CN111522839B (en) 2020-04-25 2020-04-25 Deep learning-based natural language query method

Publications (2)

Publication Number Publication Date
CN111522839A true CN111522839A (en) 2020-08-11
CN111522839B CN111522839B (en) 2023-09-01

Family

ID=71904930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010336575.6A Active CN111522839B (en) 2020-04-25 2020-04-25 Deep learning-based natural language query method

Country Status (1)

Country Link
CN (1) CN111522839B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181420A (en) * 2020-08-27 2021-01-05 天津大学 Compiler defect positioning method based on reinforcement learning
CN112507098A (en) * 2020-12-18 2021-03-16 北京百度网讯科技有限公司 Question processing method, question processing device, electronic equipment, storage medium and program product
CN112559552A (en) * 2020-12-03 2021-03-26 北京百度网讯科技有限公司 Data pair generation method and device, electronic equipment and storage medium
CN112559690A (en) * 2020-12-21 2021-03-26 广东珠江智联信息科技股份有限公司 Natural language intelligent data modeling technology
CN112783921A (en) * 2021-01-26 2021-05-11 中国银联股份有限公司 Database operation method and device
CN113011136A (en) * 2021-04-02 2021-06-22 中国人民解放军国防科技大学 SQL (structured query language) analysis method and device based on correlation judgment and computer equipment
CN113282724A (en) * 2021-05-21 2021-08-20 北京京东振世信息技术有限公司 Interaction method and device for intelligent customer service
CN113743539A (en) * 2021-11-03 2021-12-03 南京云问网络技术有限公司 Form retrieval method based on deep learning
CN113761178A (en) * 2021-08-11 2021-12-07 北京三快在线科技有限公司 Data display method and device
CN114168619A (en) * 2022-02-09 2022-03-11 阿里巴巴达摩院(杭州)科技有限公司 Training method and device of language conversion model
CN116467347A (en) * 2023-03-22 2023-07-21 天云融创数据科技(北京)有限公司 Stock questioning and answering method

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090112835A1 (en) * 2007-10-24 2009-04-30 Marvin Elder Natural language database querying
US20110093271A1 (en) * 2005-01-24 2011-04-21 Bernard David E Multimodal natural language query system for processing and analyzing voice and proximity-based queries
CN106033466A (en) * 2015-03-20 2016-10-19 华为技术有限公司 Database query method and device
CN108717423A (en) * 2018-04-24 2018-10-30 南京航空航天大学 A kind of code segment recommendation method excavated based on deep semantic
CA3062071A1 (en) * 2017-05-18 2018-11-22 Salesforce.Com, Inc. Neural network based translation of natural language queries to database queries
WO2019081776A1 (en) * 2017-10-27 2019-05-02 Babylon Partners Limited A computer implemented determination method and system
CA3099828A1 (en) * 2018-06-27 2020-01-02 Bitdefender Ipr Management Ltd Systems and methods for translating natural language sentences into database queries
CN110727695A (en) * 2019-09-29 2020-01-24 浙江大学 Natural language query analysis method for novel power supply urban rail train data operation and maintenance
CN110888897A (en) * 2019-11-12 2020-03-17 杭州世平信息科技有限公司 Method and device for generating SQL (structured query language) statement according to natural language
CN110990536A (en) * 2019-12-06 2020-04-10 重庆邮电大学 CQL generation method based on BERT and knowledge graph perception

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110093271A1 (en) * 2005-01-24 2011-04-21 Bernard David E Multimodal natural language query system for processing and analyzing voice and proximity-based queries
US20090112835A1 (en) * 2007-10-24 2009-04-30 Marvin Elder Natural language database querying
CN106033466A (en) * 2015-03-20 2016-10-19 华为技术有限公司 Database query method and device
CA3062071A1 (en) * 2017-05-18 2018-11-22 Salesforce.Com, Inc. Neural network based translation of natural language queries to database queries
CN110945495A (en) * 2017-05-18 2020-03-31 易享信息技术有限公司 Conversion of natural language queries to database queries based on neural networks
WO2019081776A1 (en) * 2017-10-27 2019-05-02 Babylon Partners Limited A computer implemented determination method and system
CN108717423A (en) * 2018-04-24 2018-10-30 南京航空航天大学 A kind of code segment recommendation method excavated based on deep semantic
CA3099828A1 (en) * 2018-06-27 2020-01-02 Bitdefender Ipr Management Ltd Systems and methods for translating natural language sentences into database queries
CN110727695A (en) * 2019-09-29 2020-01-24 浙江大学 Natural language query analysis method for novel power supply urban rail train data operation and maintenance
CN110888897A (en) * 2019-11-12 2020-03-17 杭州世平信息科技有限公司 Method and device for generating SQL (structured query language) statement according to natural language
CN110990536A (en) * 2019-12-06 2020-04-10 重庆邮电大学 CQL generation method based on BERT and knowledge graph perception

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIAOYU ZHANG, FENGJING YIN, GUOJIE MA, BIN GE, AND WEIDONG XIAO: "M-SQL: Multi-Task Representation Learning for Single-Table Text2SQL Generation", vol. 8, pages 43159-43167 *
SONG YANJING: "Design and Implementation of an Intelligent Cognitive System Based on MVC Architecture", Information Science and Technology *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181420A (en) * 2020-08-27 2021-01-05 天津大学 Compiler defect positioning method based on reinforcement learning
CN112181420B (en) * 2020-08-27 2022-06-28 天津大学 Compiler defect positioning method based on reinforcement learning
CN112559552A (en) * 2020-12-03 2021-03-26 北京百度网讯科技有限公司 Data pair generation method and device, electronic equipment and storage medium
US11748340B2 (en) 2020-12-03 2023-09-05 Beijing Baidu Netcom Science And Technology Co., Ltd. Data pair generating method, apparatus, electronic device and storage medium
CN112559552B (en) * 2020-12-03 2023-07-25 北京百度网讯科技有限公司 Data pair generation method and device, electronic equipment and storage medium
CN112507098B (en) * 2020-12-18 2022-01-28 北京百度网讯科技有限公司 Question processing method, question processing device, electronic equipment, storage medium and program product
CN112507098A (en) * 2020-12-18 2021-03-16 北京百度网讯科技有限公司 Question processing method, question processing device, electronic equipment, storage medium and program product
CN112559690A (en) * 2020-12-21 2021-03-26 广东珠江智联信息科技股份有限公司 Natural language intelligent data modeling technology
CN112783921A (en) * 2021-01-26 2021-05-11 中国银联股份有限公司 Database operation method and device
CN113011136A (en) * 2021-04-02 2021-06-22 中国人民解放军国防科技大学 SQL (structured query language) analysis method and device based on correlation judgment and computer equipment
CN113282724A (en) * 2021-05-21 2021-08-20 北京京东振世信息技术有限公司 Interaction method and device for intelligent customer service
CN113761178A (en) * 2021-08-11 2021-12-07 北京三快在线科技有限公司 Data display method and device
CN113743539A (en) * 2021-11-03 2021-12-03 南京云问网络技术有限公司 Form retrieval method based on deep learning
CN114168619A (en) * 2022-02-09 2022-03-11 阿里巴巴达摩院(杭州)科技有限公司 Training method and device of language conversion model
CN116467347A (en) * 2023-03-22 2023-07-21 天云融创数据科技(北京)有限公司 Stock questioning and answering method

Also Published As

Publication number Publication date
CN111522839B (en) 2023-09-01

Similar Documents

Publication Publication Date Title
CN111522839A (en) Natural language query method based on deep learning
CN113239700A (en) Text semantic matching device, system, method and storage medium for improving BERT
CN111666427B (en) Entity relationship joint extraction method, device, equipment and medium
CN110134946B (en) Machine reading understanding method for complex data
CN110309511B (en) Shared representation-based multitask language analysis system and method
CN112199532B (en) Zero sample image retrieval method and device based on Hash coding and graph attention machine mechanism
CN112163429B (en) Sentence correlation obtaining method, system and medium combining cyclic network and BERT
CN113743119B (en) Chinese named entity recognition module, method and device and electronic equipment
CN115858788A (en) Visual angle level text emotion classification system based on double-graph convolutional neural network
CN111241209A (en) Method and apparatus for generating information
CN114638228A (en) Chinese named entity recognition method based on word set self-attention
CN113449489B (en) Punctuation mark labeling method, punctuation mark labeling device, computer equipment and storage medium
Lin et al. Research on named entity recognition method of metro on-board equipment based on multiheaded self-attention mechanism and CNN-BiLSTM-CRF
CN112905750A (en) Generation method and device of optimization model
CN113392191B (en) Text matching method and device based on multi-dimensional semantic joint learning
CN113792120B (en) Graph network construction method and device, reading and understanding method and device
CN112487811B (en) Cascading information extraction system and method based on reinforcement learning
CN114842301A (en) Semi-supervised training method of image annotation model
CN113901813A (en) Event extraction method based on topic features and implicit sentence structure
CN113051920A (en) Named entity recognition method and device, computer equipment and storage medium
CN112560487A (en) Entity relationship extraction method and system based on domestic equipment
CN112052685A (en) End-to-end text entity relationship identification method based on two-dimensional time sequence network
CN114626378A (en) Named entity recognition method and device, electronic equipment and computer readable storage medium
CN115408506B (en) NL2SQL method combining semantic analysis and semantic component matching
CN114548325B (en) Zero sample relation extraction method and system based on dual contrast learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant