CN112487135B - Method and device for converting text into structured query language

Method and device for converting text into structured query language

Info

Publication number
CN112487135B
CN112487135B
Authority
CN
China
Prior art keywords: information, representation, abstract, database, question
Prior art date
Legal status: Active
Application number
CN202011502186.2A
Other languages
Chinese (zh)
Other versions
CN112487135A (en)
Inventor
俞凯
陈志�
Current Assignee
Sipic Technology Co Ltd
Original Assignee
Sipic Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Sipic Technology Co Ltd
Priority to CN202011502186.2A
Publication of CN112487135A
Application granted
Publication of CN112487135B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/31 - Indexing; Data structures therefor; Storage structures
    • G06F 16/313 - Selection or weighting of terms for indexing
    • G06F 16/33 - Querying
    • G06F 16/332 - Query formulation
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for converting text into a structured query language, which comprises the following steps: determining an abstract question representation and an abstract database information representation according to a user question text and corresponding database information; inputting the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation; and determining a syntax tree structure corresponding to the unified information representation, so as to obtain the structured query language corresponding to the user question text. The invention exploits the fact that a database carries both domain information and structural information, and adopts a graph projection model to separate out the domain information. With the semantic information of the database as a springboard, the question is updated using the structural information of the database, the representation of the question is abstracted step by step, and the question is gradually stripped of the domain information in the database, finally yielding representations of the abstract question and the database that contain no specific semantic information. In this way, the domain migration capability of the model is improved.

Description

Method and device for converting text into structured query language
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method and a device for converting a text into a structured query language.
Background
The purpose of the text-to-SQL (Structured Query Language) task is to convert a natural language question into a corresponding executable SQL statement. Traditional text-to-SQL approaches are based on the intermediate-representation text-to-SQL parsing network (IRNet) and the relation-aware transformer text-to-SQL model (RATSQL).
IRNet: using abstract-syntax-tree techniques, a set of intermediate grammars is designed for executable SQL statements, and all SQL statements can be represented with this grammar. Compared with raw SQL statements, the intermediate grammar abstracts away the SQL keywords, which greatly reduces the search space. During parsing, only the intermediate grammar, with its smaller search space, needs to be produced; it is then restored into the SQL statement.
RATSQL: the relation-aware transformer text-to-SQL model encodes the database information and the user question information jointly, fully considers the relations between them, and fuses the relation information into the representations of the question and the database information. This unified representation achieves better results on domain migration tasks.
However, neither of the above methods considers the influence of domain information on the text-to-SQL parsing task, even though domain migration capability is of practical significance for this task. The influence of domain information on performance needs to be taken into account, and how to eliminate this influence is not solved by previous methods.
Disclosure of Invention
The embodiments of the invention provide a method and a device for converting text into a structured query language, which are used to solve at least one of the above technical problems.
In a first aspect, an embodiment of the present invention provides a method for converting text into a structured query language, including:
determining an abstract question representation and an abstract database information representation according to a user question text and corresponding database information;

inputting the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation;

and determining a syntax tree structure corresponding to the unified information representation, so as to obtain a structured query language corresponding to the user question text.
In a second aspect, an embodiment of the present invention provides an apparatus for converting text into a structured query language, including:
a projection layer program module, used for determining an abstract question representation and an abstract database information representation according to the user question text and the corresponding database information;

a first converter program module, used for inputting the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation;

and a decoder program module, used for determining the syntax tree structure corresponding to the unified information representation, so as to obtain the structured query language corresponding to the user question text.
In a third aspect, an embodiment of the present invention provides a storage medium, where one or more programs including execution instructions are stored, where the execution instructions can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device, etc.) to perform any one of the above methods for converting text into a structured query language of the present invention.
In a fourth aspect, an electronic device is provided, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform any of the above methods of converting text to structured query language of the present invention.
In a fifth aspect, an embodiment of the present invention further provides a computer program product, which includes a computer program stored on a storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to execute any one of the above methods for converting text into a structured query language.
The embodiments of the invention have the following beneficial effects: the method exploits the fact that a database carries both domain information and structural information, and adopts a graph projection model to separate out the domain information. With the semantic information of the database as a springboard, the question is updated using the structural information of the database, the representation of the question is abstracted step by step, and the question is gradually stripped of the domain information in the database, finally yielding representations of the abstract question and the database that contain no specific semantic information. In this way, the domain migration capability of the model is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flow diagram of one embodiment of a method of converting text into a structured query language in accordance with the present invention;
FIG. 2 is a flow diagram of another embodiment of a method of converting text into a structured query language in accordance with the present invention;
FIG. 3 is a flow diagram of yet another embodiment of a method of converting text into a structured query language in accordance with the present invention;
FIG. 4 is a functional block diagram of an apparatus for converting text to a structured query language in accordance with the present invention;
FIG. 5 is a schematic structural diagram of the ShadowGNN according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
As used in this disclosure, "module," "device," "system," and the like are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, or software in execution. In particular, for example, an element may be, but is not limited to being, a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. Also, an application or script running on a server, or a server, may be an element. One or more elements may be in a process and/or thread of execution and an element may be localized on one computer and/or distributed between two or more computers and can be operated by various computer-readable media. The elements may also communicate by way of local and/or remote processes in accordance with a signal having one or more data packets, e.g., signals from data interacting with another element in a local system, distributed system, and/or across a network of the internet with other systems by way of the signal.
Finally, it should also be noted that, in this document, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Furthermore, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As shown in FIG. 1, an embodiment of the present invention provides a method for converting text into a structured query language, comprising:
S10, determining an abstract question representation and an abstract database information representation according to the user question text and the corresponding database information.

Illustratively, the user question text and the corresponding database information are input into a graph projection neural network to obtain the abstract question representation and the abstract database information representation.

In some embodiments, the database information includes database information with domain information and database information with structure information; and inputting the user question text and the corresponding database information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises: inputting the user question text, the database information with domain information, and the database information with structure information into a pre-constructed graph projection neural network to obtain the abstract question representation and the abstract database information representation.

S20, inputting the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation.

S30, determining the syntax tree structure corresponding to the unified information representation, so as to obtain the structured query language corresponding to the user question text.

In the embodiment of the invention, the fact that a database carries both domain information and structural information is exploited, and a graph projection model is adopted to separate out the domain information. With the semantic information of the database as a springboard, the question is updated using the structural information of the database, the representation of the question is abstracted step by step, and the question is gradually stripped of the domain information in the database, finally yielding representations of the abstract question and the database that contain no specific semantic information. In this way, the domain migration capability of the model is improved.
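By way of illustration only, the following minimal PyTorch-style sketch shows how the three steps S10-S30 can be composed; all sub-modules and the to_sql() helper are hypothetical placeholders rather than the exact implementation of this disclosure:

    import torch.nn as nn

    class TextToSQL(nn.Module):
        """Three-stage pipeline: graph projection encoder -> relation-aware
        transformer -> syntax-tree decoder (all sub-modules are stand-ins)."""
        def __init__(self, gpnn: nn.Module, rat: nn.Module, decoder: nn.Module):
            super().__init__()
            self.gpnn, self.rat, self.decoder = gpnn, rat, decoder

        def forward(self, question_tokens, db_schema, relations):
            # S10: abstract question / database representations via graph projection
            abs_question, abs_db = self.gpnn(question_tokens, db_schema)
            # S20: unified information representation via the first converter
            unified = self.rat(abs_question, abs_db, relations)
            # S30: decode a syntax tree, then restore it into an SQL string
            tree = self.decoder(unified)
            return tree.to_sql()  # hypothetical helper on the decoded tree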
As shown in FIG. 2, a flowchart of another embodiment of the present invention, inputting the user question text, the database information with domain information, and the database information with structure information into the pre-constructed graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:

S11, obtaining an attention weight matrix according to the user question text and the database information with domain information;

S12, updating the user question text according to the attention weight matrix and the abstract database information representation, and inputting the updated user question text into a second converter to obtain the abstract question representation;

and S13, updating the database information with domain information and the database information with structure information according to the attention weight matrix and the user question text, and inputting the updated database information into a graph convolution network to obtain the abstract database representation.

In this embodiment, the attention weight matrix is first obtained from the user question text and the database information with domain information; the user question is updated according to the attention weight matrix and the abstract database information, and the other two database views (the database with domain information and the database with structure information) are updated according to the attention weight matrix and the user question. Finally, the database representation passes through a layer of relation-based graph convolution network, and the user question representation passes through a layer of transformer. One such update is called a graph projection step; after several such steps, the abstract database representation and the abstract question representation are obtained. The abstract database representation and the abstract question representation are then unified using a relation-aware transformer.
As shown in FIG. 3, a flowchart of yet another embodiment of the present invention, updating the database information with domain information and the database information with structure information according to the attention weight matrix and the user question text comprises:

S131, determining a first text representation corresponding to the user question text from the view of the database with domain information;

S132, determining a second text representation corresponding to the user question text from the view of the database with structure information;

S133, updating the database information with domain information according to the attention weight matrix and the first text representation;

and S134, updating the database information with structure information according to the attention weight matrix and the second text representation.

In this embodiment, the update process of the database and question representations is as follows: the attention weight matrix is first obtained from the user question and the database with domain information; the user question is updated according to the attention weight matrix and the abstract database information, and each of the other two database views is updated according to the attention weight matrix and the user question representation under that view. Finally, the database representation passes through a layer of relation-based graph convolution network, and the user question representation passes through a layer of transformer. One such update constitutes a layer of the graph projection neural network; after several such updates, the abstract database representation and the abstract question representation are obtained. The abstract database representation and the abstract question representation are then turned into a unified representation using a relation-aware transformer, as sketched below.
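By way of illustration, the following compact, self-contained sketch shows one such update round (steps S11-S13); the linear schema encoder is a stand-in for the relation-based graph convolution network, and all names and dimensions are illustrative assumptions:

    import math
    import torch
    import torch.nn as nn

    class ProjectionRound(nn.Module):
        """One update round: attention between question and the domain view,
        cross-updates of all three inputs, then refinement layers."""
        def __init__(self, hidden: int):  # hidden must be divisible by nhead
            super().__init__()
            self.q_proj = nn.Linear(hidden, hidden, bias=False)
            self.k_proj = nn.Linear(hidden, hidden, bias=False)
            self.question_enc = nn.TransformerEncoderLayer(hidden, nhead=8, batch_first=True)
            self.schema_enc = nn.Linear(hidden, hidden)  # stand-in for a relational GCN

        def forward(self, question, domain_db, struct_db):
            # S11: attention weight matrix between question and domain database info
            attn = self.q_proj(question) @ self.k_proj(domain_db).T / math.sqrt(question.size(-1))
            # S12: the question is updated from the abstract (structure-only) view
            question = question + attn.softmax(-1) @ struct_db
            question = self.question_enc(question.unsqueeze(0)).squeeze(0)
            # S13: both database views are updated from the question
            s2q = attn.T.softmax(-1)
            domain_db = torch.relu(self.schema_enc(domain_db + s2q @ question))
            struct_db = torch.relu(self.schema_enc(struct_db + s2q @ question))
            return question, domain_db, struct_db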
In some embodiments, determining the syntax tree structure corresponding to the unified information representation to obtain the structured query language corresponding to the user question text comprises:

representing the structured query language in the form of an abstract syntax tree in advance, using the IRNet decoding scheme;

determining the syntax tree structure corresponding to the unified information representation;

and determining the structured query language corresponding to the syntax tree structure as the structured query language corresponding to the user question text.
As shown in fig. 4, an embodiment of the present invention further provides an apparatus 400 for converting text into a structured query language, which in this embodiment includes:
a projection layer program module 410, configured to determine an abstract question representation and an abstract database information representation according to a user question text and corresponding database information;
a first converter program module 420, configured to input the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation;
a decoder program module 430, configured to determine a syntax tree structure corresponding to the unified information representation, so as to obtain a structured query language corresponding to the user question text.
In some embodiments, determining the abstract question representation and the abstract database information representation according to the user question text and the corresponding database information comprises: inputting the user question text and the corresponding database information into a graph projection neural network to obtain the abstract question representation and the abstract database information representation.

In some embodiments, the database information includes database information with domain information and database information with structure information;

inputting the user question text and the corresponding database information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:

inputting the user question text, the database information with domain information, and the database information with structure information into a pre-constructed graph projection neural network to obtain the abstract question representation and the abstract database information representation.

In some embodiments, inputting the user question text, the database information with domain information, and the database information with structure information into the pre-constructed graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:

obtaining an attention weight matrix according to the user question text and the database information with domain information;

updating the user question text according to the attention weight matrix and the abstract database information representation, and inputting the updated user question text into a second converter to obtain the abstract question representation;

and updating the database information with domain information and the database information with structure information according to the attention weight matrix and the user question text, and inputting the updated database information into a graph convolution network to obtain the abstract database representation.
In some embodiments, updating the database information with domain information and the database information with structure information based on the attention weight matrix and the user question text comprises:
determining a first text representation corresponding to the user question text from the view of the database with domain information;

determining a second text representation corresponding to the user question text from the view of the database with structure information;
updating database information with domain information according to the attention weight matrix and the first text representation;
updating the database information with structural information based on the attention weight matrix and the second textual representation.
In some embodiments, determining the syntax tree structure corresponding to the unified information representation to obtain the structured query language corresponding to the user question text comprises:

representing the structured query language in the form of an abstract syntax tree in advance, using the IRNet decoding scheme;

determining the syntax tree structure corresponding to the unified information representation;

and determining the structured query language corresponding to the syntax tree structure as the structured query language corresponding to the user question text.
In order to more clearly describe the technical solutions of the present invention, and to demonstrate more directly the feasibility and benefits of the present invention compared with the prior art, the inventive process, the technical background, the technical solutions, and the experiments performed are described in more detail below.
Abstract
In order to improve the generalization capability of the model, a new parsing framework, called ShadowGNN, is proposed from the two perspectives of database structure and database semantics. The abstract schema removes the semantic information from the database representation, and this abstract view, combined with the graph projection neural network we design, yields delexicalized question and database representations. On top of these abstract representations, we further use a relation-aware transformer to obtain a unified representation of the question and the database. Finally, we incorporate a context-free intermediate grammar for decoding. On the challenging Text-to-SQL task Spider, our proposed model outperforms the baseline models we implemented. Combined with a pre-trained model (ELECTRA), ShadowGNN achieves results comparable to the current state of the art.
1. Introduction
Recently, Text-to-SQL has attracted wide attention in the semantic parsing community. The ability to query databases with natural language (NL) allows users unfamiliar with SQL to access large databases. Many neural methods have been proposed to translate questions into executable SQL queries. On some published Text-to-SQL benchmarks, the exact match accuracy even exceeds 80%. However, the cross-domain problem of Text-to-SQL is a real challenge that was ignored by previous datasets. It should be noted that a database schema is regarded as a domain. The domain information consists of two parts: the semantic information of the schema components (e.g., table names) and the structural information of the schema (e.g., the primary-key relations between tables and columns).
The recently released Spider dataset hides the database schemas of the test set, which are quite different from those of the training set. Under such a cross-domain setting, domain adaptation is challenging for two main reasons. First, the semantic information of the domains in the test and development sets is unseen in the training set: on the development set, 35% of the words in the database schemas do not appear in the schemas of the training set, so it is hard to match the domain expressions in the questions with those in the schemas. Second, there are considerable differences between the structures of the database schemas; in particular, database schemas always carry semantic information, which makes it difficult to obtain a unified representation of a database schema. For the reasons above, the crux in both cases is the domain-specific semantic information.
In this work, we attempt to mitigate the impact of domain information under the cross-domain setting. It is important to clarify what role the semantic information of the schema components plays during the conversion of an NL question into an SQL query. For a Text-to-SQL model, the basic task is to find all the mentioned columns (e.g., name) and tables (e.g., team, season) by looking them up in the schema carrying semantic information (called the semantic schema). Once the columns and tables mentioned in the NL question exactly match the schema components, we can abstract the NL question and the semantic schema by replacing the specific schema components with their generic component types. We can still infer the structure of the SQL query from the abstract NL question and the abstract schema structure. Through the correspondence between the semantic schema and the abstract schema, we can then restore the abstract query to an executable SQL query with domain information. Inspired by this observation, we decompose the encoder of the Text-to-SQL model into two modules. First, the present invention proposes a graph projection neural network (GPNN) to abstract the NL question and the semantic schema, removing as much domain information as possible. The invention then uses a relation-aware transformer to obtain a unified representation of the abstract NL question and the abstract schema.
The method of the invention is evaluated on the challenging cross-domain Text-to-SQL dataset Spider. The contributions are summarized as follows:

To the best of our knowledge, this is the first work to mitigate the effect of domain information by abstracting the representations of the NL question and the SQL query. It is a meaningful approach that can be applied to similar cross-domain tasks.

To remove the domain information contained in NL questions and schemas, the invention proposes the GPNN to obtain abstract representations of NL questions and schemas.

Empirical results show that the method of the invention achieves performance comparable to state-of-the-art methods on the challenging Spider benchmark. An ablation study further demonstrates that the GPNN is important for abstracting the representations of NL questions and schemas.
2. Related work
text-to-SQL: the model recently evaluated on Spider points out several interesting directions for text-to-SQL studies. An AST-based decoder is first proposed that decodes a more abstract Intermediate Representation (IR) using a similar AST-based decoder and then converts it to an SQL query. RAT-SQL introduces a relationship-aware transcoder coder to improve joint coding of question and pattern and achieve optimal performance on the Spider dataset. EditSQL takes into account the dialog context when converting utterances to SQL queries in a context-to-SQL benchmark that converts context to context.
Graph neural networks: graph neural networks (GNNs) have been used to encode schemas in a more structured way. Previous work constructed a directed graph from the foreign-key relations in the schema and then used a GNN to obtain the corresponding schema representation. Global-GNN also employs a GNN to derive the schema representation and to select a set of schema nodes that may appear in the output query; it then discriminatively reranks the top-K queries output by the generative decoder. We propose a graph projection neural network (GPNN) that can extract abstract representations of NL questions and semantic schemas.
Domain adaptation: domain adaptation has drawn the interest of researchers, since a model with good adaptation capability can be transferred to data with similar properties but from different domains. Several recently proposed models focus on improving domain adaptation capability, and thanks to the challenging Spider dataset we now have a better way to assess it. IRNet uses a coarse-to-fine decoding strategy, in which a domain-independent sketch is generated before domain-dependent node filling. RYANSQL likewise generates a detailed sketch for complex SELECT statements. RATSQL transfers to new databases by combining schema linking and a relation-aware transformer for domain-independent encoding. GNN and Global-GNN incorporate graph neural networks into schema encoding, which captures structural information and migrates easily to other databases. We are the first to focus on abstracting the schema in Text-to-SQL to improve domain adaptation.
3. Background
In this section, we first introduce the relational graph convolutional network (R-GCN), which is the basis of the GPNN proposed in the next section. We then introduce the relation-aware transformer, a transformer variant that takes relation information into account when computing attention weights.
3.1 Relational graph convolutional network
Before describing the details of the R-GCN, we first give the notation for a relational directed graph. We denote the graph as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where the nodes (schema components) are $v_i \in \mathcal{V}$ and each labeled directed edge is a triple $(v_i, r, v_j)$, in which $v_i$ is the source node, $v_j$ is the target node, and $r \in \mathcal{R}$ is the type of the edge from $v_i$ to $v_j$. $\mathcal{N}_i^r$ denotes the neighbors of node $v_i$ under relation $r$, where $v_i$ acts as the target node.

Each node of the graph has an input feature $x_i$, which can be regarded as the initial hidden state of the R-GCN, $h_i^{(0)} = x_i$. The hidden state of each node in the graph is updated layer by layer through the following steps.

Sending messages: at the $l$-th R-GCN layer, each edge $(v_i, r, v_j)$ of the graph sends a message from the source node $v_i$ to the target node $v_j$. The message is computed as

$$m_{i \to j}^{(l)} = W_r^{(l)} h_i^{(l-1)}, \qquad (1)$$

where $r$ is the relation from $v_i$ to $v_j$ and $W_r^{(l)}$ is a trainable linear transformation. According to Equation 1, the number of parameters for computing messages is proportional to the number of relation types. To improve scalability, the R-GCN regularizes the message-computation parameters with a basis decomposition, defined as

$$W_r^{(l)} = \sum_{b=1}^{B} a_{rb}^{(l)} V_b^{(l)}, \qquad (2)$$

where $B$ is the number of bases, $V_b^{(l)}$ are the basis transformations, and $a_{rb}^{(l)}$ are their coefficients. The basis transformations are shared across edge types; only the coefficients $a_{rb}^{(l)}$ depend on $r$.

Aggregating messages: after the message-passing step, all incoming messages of each node are aggregated. The R-GCN simply averages the incoming messages:

$$\bar{m}_i^{(l)} = \frac{1}{|\mathcal{N}_i|} \sum_{r \in \mathcal{R}} \sum_{j \in \mathcal{N}_i^r} m_{j \to i}^{(l)}, \qquad (3)$$

where $\mathcal{N}_i = \bigcup_{r \in \mathcal{R}} \mathcal{N}_i^r$.

Updating states: after aggregating messages, each node updates its hidden state from $h_i^{(l-1)}$ to $h_i^{(l)}$:

$$h_i^{(l)} = \sigma\big(\bar{m}_i^{(l)} + W_0^{(l)} h_i^{(l-1)}\big), \qquad (4)$$

where $\sigma$ is an activation function (i.e., ReLU) and $W_0^{(l)}$ is a weight matrix. For each R-GCN layer, the update process can be simply expressed as

$$H^{(l)} = \text{R-GCN}\big(H^{(l-1)}, \mathcal{G}\big),$$

where $H^{(l)} \in \mathbb{R}^{|\mathcal{V}| \times d}$, $|\mathcal{V}|$ is the number of nodes, and $\mathcal{G}$ is the graph structure.
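For concreteness, a minimal PyTorch sketch of one R-GCN layer with basis decomposition (Equations 1-4) is given below; the edge-list input format and the initialization are our own assumptions:

    import torch
    import torch.nn as nn

    class RGCNLayer(nn.Module):
        """One R-GCN layer with basis decomposition (Equations 1-4)."""
        def __init__(self, hidden: int, num_relations: int, num_bases: int):
            super().__init__()
            # B shared basis transforms V_b and per-relation coefficients a_rb (Eq. 2)
            self.bases = nn.Parameter(torch.empty(num_bases, hidden, hidden))
            self.coeffs = nn.Parameter(torch.empty(num_relations, num_bases))
            self.self_loop = nn.Linear(hidden, hidden, bias=False)  # W_0
            nn.init.xavier_uniform_(self.bases)
            nn.init.xavier_uniform_(self.coeffs)

        def forward(self, h, edges):
            # h: (num_nodes, hidden); edges: iterable of (src, rel, dst) triples
            w_rel = torch.einsum("rb,bio->rio", self.coeffs, self.bases)  # W_r (Eq. 2)
            agg = torch.zeros_like(h)
            deg = torch.zeros(h.size(0), 1, device=h.device)
            for src, rel, dst in edges:
                agg[dst] += h[src] @ w_rel[rel]  # message from source to target (Eq. 1)
                deg[dst] += 1
            agg = agg / deg.clamp(min=1)  # average the incoming messages (Eq. 3)
            return torch.relu(agg + self.self_loop(h))  # state update (Eq. 4)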
3.2 Relation-aware transformer
With the success of large-scale language models, the transformer architecture has been widely used in natural language processing (NLP) tasks, leveraging the self-attention mechanism to encode a sequence $X = [x_i]_{i=1}^{n}$. A transformer is a stack of self-attention layers, where each layer transforms $x_i$ into $y_i$ with $H$ heads as follows:

$$e_{ij}^{(h)} = \frac{x_i W_Q^{(h)} \big(x_j W_K^{(h)}\big)^{\top}}{\sqrt{d_z}}, \qquad (5)$$

$$\alpha_{ij}^{(h)} = \operatorname{softmax}_j \big\{ e_{ij}^{(h)} \big\}, \qquad (6)$$

$$z_i^{(h)} = \sum_{j=1}^{n} \alpha_{ij}^{(h)} x_j W_V^{(h)}, \qquad (7)$$

$$\tilde{y}_i = \operatorname{LayerNorm}\big(x_i + \operatorname{Concat}\big(z_i^{(1)}, \dots, z_i^{(H)}\big)\big), \qquad (8)$$

$$y_i = \operatorname{LayerNorm}\big(\tilde{y}_i + \operatorname{FC}\big(\operatorname{ReLU}(\operatorname{FC}(\tilde{y}_i))\big)\big), \qquad (9)$$

where $h$ is the head index, $d_z$ is the hidden dimension of $z_i^{(h)}$, $\alpha_{ij}^{(h)}$ is the attention probability, Concat denotes the concatenation operation, LayerNorm is layer normalization, and FC is a fully connected layer. The transformer function can be simply expressed as

$$Y = \operatorname{Transformer}(X), \qquad (10)$$

where $Y \in \mathbb{R}^{|X| \times d}$ and $|X|$ is the sequence length.

The relation-aware transformer (RAT) is an important extension of the traditional transformer, which regards the input sequence as a labeled, directed, fully connected graph. The pairwise relations between input elements are taken into account in the RAT, which incorporates the relation information into Equations 5 and 7. The relation from element $x_i$ to element $x_j$ is represented by vectors $r_{ij}^{K}$ and $r_{ij}^{V}$, which are incorporated into the self-attention layer as biases:

$$e_{ij}^{(h)} = \frac{x_i W_Q^{(h)} \big(x_j W_K^{(h)} + r_{ij}^{K}\big)^{\top}}{\sqrt{d_z}}, \qquad (11)$$

$$z_i^{(h)} = \sum_{j=1}^{n} \alpha_{ij}^{(h)} \big(x_j W_V^{(h)} + r_{ij}^{V}\big), \qquad (12)$$

where $r_{ij}^{K}$ and $r_{ij}^{V}$ are shared across the attention heads. For each RAT layer, the update process can be simply expressed as

$$Y = \operatorname{RAT}(X, R), \qquad (13)$$

where $R$ is the relation matrix between sequence tokens.
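For illustration, a single-head PyTorch sketch of relation-aware self-attention (Equations 11-12) follows; representing each relation type by a learned embedding is an assumption consistent with common RAT implementations:

    import math
    import torch
    import torch.nn as nn

    class RelationAwareAttention(nn.Module):
        """Single-head relation-aware self-attention (Equations 11-12)."""
        def __init__(self, hidden: int, num_relation_types: int):
            super().__init__()
            self.q = nn.Linear(hidden, hidden, bias=False)
            self.k = nn.Linear(hidden, hidden, bias=False)
            self.v = nn.Linear(hidden, hidden, bias=False)
            # learned vectors r_ij^K and r_ij^V, one per relation type
            self.rel_k = nn.Embedding(num_relation_types, hidden)
            self.rel_v = nn.Embedding(num_relation_types, hidden)

        def forward(self, x, rel):
            # x: (n, hidden); rel: (n, n) integer relation matrix R
            q, k, v = self.q(x), self.k(x), self.v(x)
            rk, rv = self.rel_k(rel), self.rel_v(rel)  # (n, n, hidden)
            # e_ij = q_i . (k_j + r_ij^K) / sqrt(d)  (Eq. 11)
            scores = (q.unsqueeze(1) * (k.unsqueeze(0) + rk)).sum(-1) / math.sqrt(x.size(-1))
            alpha = scores.softmax(dim=-1)
            # z_i = sum_j alpha_ij (v_j + r_ij^V)  (Eq. 12)
            return torch.einsum("ij,ijd->id", alpha, v.unsqueeze(0) + rv)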
The R-GCN and the RAT have been successfully applied to Text-to-SQL tasks. Bogin, Berant, and Gardner encode the structure of the semantic schema with an R-GCN to obtain a global representation of its nodes. RATSQL considers not only the schema structure but also the schema linking between the schema and the NL question, and proposes a unified framework to model the schema and question representations with the RAT. However, these works do not explicitly address the influence of domain information. In the next section, we introduce our proposed GPNN and illustrate how to obtain abstract representations of the schema and the question with it.
4. Method

The Text-to-SQL model takes an NL question $Q = (q_1, \dots, q_{|Q|})$ and a semantic schema $\mathcal{G}^{s}$ as input. In our proposed ShadowGNN, the encoder is decomposed into two modules. The first module filters out domain-specific information with a well-designed graph projection neural network (GPNN). The second module further obtains a unified representation of the question and the schema with a relation-aware transformer. With this two-stage encoder, ShadowGNN simulates the human reasoning process of converting a question into an SQL query under a cross-domain setting: abstraction and inference.
FIG. 5 is a schematic structural diagram of the ShadowGNN according to an embodiment of the present invention. In this embodiment, the ShadowGNN has three inputs: the abstract schema (carrying only the structural information of the schema), the semantic schema, and the natural language question. The encoder of the ShadowGNN consists of two modules: a stack of graph projection layers and a stack of relation-aware self-attention layers.
4.1 Graph projection neural network
In this subsection, we introduce the structure of the GPNN. As discussed, a database schema is characterized by both its structure and its domain semantic information. There are therefore two views of the schema: abstract and semantic. The abstract schema $\mathcal{G}^{a}$ consists only of the types (table or column) of the schema nodes, without any domain information, and can be regarded as a projection of the semantic schema. The input of the semantic schema includes the domain information; in addition, the NL question always contains domain information.

The traditional R-GCN method takes only the NL question and the semantic schema as input. As shown in FIG. 5, the GPNN additionally takes the abstract schema as input. The main motivation of the GPNN is to obtain abstract representations of the question and the schema. The abstract schema is already an abstraction of the semantic schema; what remains is to obtain an abstract question representation. The idea of the GPNN is to use the semantic schema as a bridge: the question updates its representation with the abstract schema, while the attention information is computed with the vectors of the semantic schema. In each graph projection layer, the attention between the NL question and the semantic schema is first computed as

$$s_{ij}^{(l)} = \frac{\big(q_i^{(l-1)} W_Q^{(l)}\big)\big(g_j^{s,(l-1)} W_K^{(l)}\big)^{\top}}{\sqrt{d}}, \qquad (14)$$

where $W_Q^{(l)}$ and $W_K^{(l)}$ are the attention weights of the $l$-th projection layer and $S^{(l)} = [s_{ij}^{(l)}]$ is the weight score matrix. When updating the question representation, we take the abstract schema $g_j^{a,(l)}$ as the attention value of the $l$-th GPNN layer:

$$\alpha_{ij}^{(l)} = \operatorname{softmax}_j \big\{ s_{ij}^{(l)} \big\}, \qquad (15)$$

$$\tilde{q}_i^{(l)} = \sum_j \alpha_{ij}^{(l)} \big(g_j^{a,(l-1)} W_V^{(l)}\big), \qquad (16)$$

$$q_i^{(l)} = \operatorname{gate}\big(\tilde{q}_i^{(l)}\big) \odot q_i^{(l-1)} + \big(1 - \operatorname{gate}(\tilde{q}_i^{(l)})\big) \odot \tilde{q}_i^{(l)}, \qquad (17)$$

where gate(·) = sigmoid(FC(·)) and $W_V^{(l)}$ is a trainable weight. When updating the semantic schema, we take the transpose of the above attention matrix as the schema-to-question attention:

$$\beta_{ji}^{(l)} = \operatorname{softmax}_i \big\{ s_{ij}^{(l)} \big\}. \qquad (18)$$

Similar to the question update in Equations 15-17, the semantic schema $g_j^{s,(l)}$ is updated with $\beta_{ji}^{(l)}$ as the attention score and $q_i^{(l)}$ as the attention value. Note that only the abstract schema is used to update the question representation; in this way, the domain information contained in the question representation is progressively removed.

The abstract schema is updated in the same way as the semantic schema, except that their attention weights over the question $q_i^{(l)}$ are not shared. Before the abstract schema is updated with the attention mechanism, there is one important operation: we first compute the maximum of the attention probabilities,

$$u_j = \max_i \big\{ \alpha_{ij}^{(l)} \big\}, \qquad (19)$$

where the physical meaning of $u_j$ is the maximum probability that the question mentions the $j$-th schema component. We multiply the initial representation of the abstract schema $g^{a,(l)}$ by $u$ in a broadcast manner to distinguish the schema components mentioned by the question. After the attention mechanism has updated the three representations, we further encode the schemas and the question with the R-GCN(·) and Transformer(·) functions, respectively, combining the graph structure of the schemas with the sequential nature of the NL question, as shown in FIG. 5. This completes the description of one projection layer; a graph projection neural network (GPNN) is a stack of such projection layers.
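The following simplified PyTorch sketch illustrates one projection layer; the gating and residual details of Equations 14-19 are approximated, so it should be read as an assumption-laden illustration rather than the precise formulation:

    import math
    import torch
    import torch.nn as nn

    class GraphProjectionLayer(nn.Module):
        """One projection step: the semantic schema only routes attention,
        while the abstract schema supplies the values (cf. Equations 14-19)."""
        def __init__(self, hidden: int):
            super().__init__()
            self.wq = nn.Linear(hidden, hidden, bias=False)
            self.wk = nn.Linear(hidden, hidden, bias=False)
            self.wv = nn.Linear(hidden, hidden, bias=False)
            self.gate = nn.Linear(hidden, hidden)

        def forward(self, question, sem_schema, abs_schema):
            # attention between question tokens and *semantic* schema nodes (Eq. 14)
            scores = self.wq(question) @ self.wk(sem_schema).T / math.sqrt(question.size(-1))
            q2s = scores.softmax(dim=-1)  # question-to-schema attention (Eq. 15)
            # the question reads from the *abstract* schema (Eqs. 16-17),
            # so domain words never enter the question representation
            cand = q2s @ self.wv(abs_schema)
            g = torch.sigmoid(self.gate(cand))
            question = g * question + (1 - g) * cand
            # schema updates use the transposed attention, question as value (Eq. 18)
            s2q = scores.T.softmax(dim=-1)
            sem_schema = sem_schema + s2q @ question
            # u_j: maximum probability that the question mentions node j (Eq. 19)
            u = q2s.max(dim=0).values.unsqueeze(-1)
            abs_schema = abs_schema * u + s2q @ question
            return question, sem_schema, abs_schema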
4.2 Schema linking and RAT

Schema linking can be regarded as prior knowledge, in which the relative relations between the question and the schema are labeled according to the degree of matching. There are 7 tags: table exact match, table partial match, column exact match, column partial match, column value exact match, column value partial match, and no match. The column values are stored in the database. If the schema is divided into tables and columns, there are three kinds of input: question, table, and column. RATSQL unifies the representations of the three inputs with a relation-aware transformer. RATSQL defines the set $R$ of all relations among the three inputs, and the RAT(·) function obtains a unified representation of the question and the schema. The schema linking relations are a subset of $R$. In this work, we further unify, with the RAT, the abstract representations of the question and the schema generated by the preceding GPNN module.
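As an illustration of schema linking, a simplified tagger assigning one of the 7 tags to a question n-gram might look as follows; the exact matching and normalization rules are assumptions:

    from enum import Enum

    class LinkTag(Enum):
        TABLE_EXACT = 0
        TABLE_PARTIAL = 1
        COLUMN_EXACT = 2
        COLUMN_PARTIAL = 3
        VALUE_EXACT = 4
        VALUE_PARTIAL = 5
        NONE = 6

    def link_tag(ngram: str, tables: set, columns: set, values: set) -> LinkTag:
        """Assign one of the 7 schema-linking tags to a question n-gram."""
        ngram = ngram.lower()
        if ngram in tables: return LinkTag.TABLE_EXACT
        if any(ngram in t for t in tables): return LinkTag.TABLE_PARTIAL
        if ngram in columns: return LinkTag.COLUMN_EXACT
        if any(ngram in c for c in columns): return LinkTag.COLUMN_PARTIAL
        if ngram in values: return LinkTag.VALUE_EXACT
        if any(ngram in v for v in values): return LinkTag.VALUE_PARTIAL
        return LinkTag.NONE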
4.3 decoder with SemQL syntax
To effectively limit the search space during synthesis, IRNet designs a context-free SemQL grammar as an intermediate representation between the NL question and SQL, which is essentially an abstract syntax tree (AST). SemQL recovers the tree-like nature of SQL. To simplify the syntax tree, SemQL does not cover all SQL keywords. For example, the columns in a GROUP BY clause can be inferred from the SELECT clause or from the primary key of a table to which an aggregation function is applied.

Combining the characteristics of SemQL queries, IRNet decomposes the decoding process of a SemQL query into two stages with a coarse-to-fine method. The first stage predicts the skeleton of the SemQL query with a skeleton decoder. A detail decoder then fills the missing details into the skeleton by selecting columns and tables. In this work, we directly use the IRNet decoder, whose source code has been published, for our Text-to-SQL model.
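For intuition, a hypothetical and highly simplified SemQL-style tree and its skeleton are sketched below; the actual SemQL grammar defined by IRNet differs in its node inventory and details:

    # Hypothetical, simplified SemQL-style tree for:
    #   SELECT name FROM singer WHERE age > 30
    # SQL keywords such as FROM are abstracted away; decoding happens in two
    # passes: a skeleton pass (node types) and a detail pass (columns/tables).
    semql = ("Z",
             ("R",  # single-rooted query (no set operation)
              ("Select", ("A", "none", ("C", "name"), ("T", "singer"))),
              ("Filter", ">", ("A", "none", ("C", "age"), ("T", "singer")))))
    skeleton = "Z R Select A C T Filter A C T"  # predicted first, details filled in later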
5. Experiments
In this section, we evaluate the effectiveness of our proposed ShadowGNN against other recent models. We further ablate our design choices to understand their contributions.
5.1 Experimental setup
Dataset and metrics: we conduct experiments on Spider, a large-scale, complex, and cross-domain Text-to-SQL benchmark. The databases on Spider are split into 146 for training, 20 for development, and 40 for testing. The manually annotated question-SQL query pairs are split into 8625/1034/2147 for training/development/testing. As with other competition challenges, the test set is not publicly available. We report results with the same metrics as previous work: exact match accuracy and component match accuracy.
Baselines: our main contribution lies in the encoder of the Text-to-SQL model; the decoder of our evaluated models is borrowed directly from IRNet. First, the SQL query is represented as an abstract syntax tree (AST) according to a well-designed grammar. The AST is then flattened into a sequence (named the SemQL query) by depth-first search (DFS). During decoding, an LSTM decoder predicts the sequence token by token. IRNet further uses a coarse-to-fine decoding approach: a skeleton decoder first outputs the skeleton of the SemQL query, and a detail decoder then fills the missing details into the skeleton by selecting columns and tables. R-GCN and RATSQL are two other powerful baselines that improve the representation ability of Text-to-SQL encoders.
Table 1: Exact match accuracy on the development and test sets. The methods at the top of the table are augmented only with BERT-based pre-trained models; the methods at the bottom incorporate other pre-trained models.
Pre-trained models: language model pre-training has proven effective for learning contextualized natural language representations. For comparison with our baseline methods, we initialize the embedded representations of the NL question and the schema components with a BERT-based model. A separator [SEP] is placed between the question tokens and the schema tokens. If a token contains multiple words (e.g., the column "player id"), an average pooling layer is applied after the last layer of BERT. To further evaluate the effectiveness of our method, we also use the more powerful ELECTRA-large, which has been widely used in machine reading tasks, to encode the questions and schemas.
Implementation: we implement ShadowGNN and the baseline methods with PyTorch. We use the pre-trained models BERT and ELECTRA from the PyTorch transformers repository, fine-tuning the BERT-based models on a 1080Ti GPU and the ELECTRA-based models on a Titan GPU. We use Adam with default hyperparameters for optimization. The learning rate is set to 1e-4, while the learning rate of the pre-trained model is decayed by a factor of 0.1. The hidden size of the GPNN layers and RAT layers is set to 512. The dropout rate is 0.3. The batch size is set to 16. Due to the limitations of our GPU devices, we do not search for the best settings over a hyperparameter grid as RATSQL does; this holds for both the ELECTRA-based and the BERT-based experiments. The numbers of GPNN and RAT layers in the ShadowGNN encoder are both set to 4.
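A sketch of the optimizer setup implied by these settings is shown below; the "bert" parameter-name prefix and the helper itself are illustrative assumptions:

    import torch
    from torch import nn

    def build_optimizer(model: nn.Module) -> torch.optim.Adam:
        # "bert" is a placeholder for however the pre-trained encoder's
        # parameters are named in a concrete implementation.
        pretrained = [p for n, p in model.named_parameters() if n.startswith("bert")]
        scratch = [p for n, p in model.named_parameters() if not n.startswith("bert")]
        return torch.optim.Adam([
            {"params": scratch, "lr": 1e-4},
            {"params": pretrained, "lr": 1e-4 * 0.1},  # decayed lr for the pre-trained model
        ])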
5.2 Experimental results
Table 1 lists, at the top, the three models with the highest exact match accuracy among those augmented with BERT-based pre-trained models, and, at the bottom, four models combined with other pre-trained models. Compared with the baselines we implemented, our proposed ShadowGNN achieves absolute improvements of 3.8% and 1.5% on the development set, and reaches performance comparable to the state-of-the-art methods. Among the IRNet variants on the Spider leaderboard, IRNet++, as its name suggests, obtains the best performance. Combined with ELECTRA, ShadowGNN achieves absolute improvements of 6.6% and 4.8% on the development and test sets, respectively.
The gold performance is obtained by converting the annotated SemQL tree directly into an SQL query; it is the upper bound of our proposed model, and in fact of the IRNet model as well. The gold performance depends only on the designed SemQL grammar. The SemQL grammar used in IRNet performs worst on GROUP BY clauses and complex IUEN operations. The gold exact match accuracy on the development set is only 89.6%, which is the main limitation of our Text-to-SQL model. We can see that the lower the gold match accuracy of a clause, the lower the component match accuracy of our proposed model on that clause. On the GROUP BY clause alone, the performance of ShadowGNN with ELECTRA is lower than that with BERT; as discussed above, the GROUP BY clause is inferred from other clauses, which is unrelated to the specific model.
We further design an experiment to validate the effectiveness of the graph projection neural network (GPNN). Consider a preprocessed question: "What are the name and capacity of the stadium that held the most concerts?", where "name" and "capacity" are column names. We swap the positions of the two columns in the schema and compute the cosine similarity between the representations at the final GPNN layer. Interestingly, we find that the swapped "name" is most similar to the original "capacity": the representation of a column name appears to depend only on the position where it occurs, as if its semantics has been removed. This indicates that the GPNN is indeed effective.
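A sketch of this probing procedure, under the assumption of an encode(question, schema) function that returns one vector per schema node, could be:

    import torch.nn.functional as F

    def probe_column_swap(encode, question, schema, i, j):
        """Re-encode the schema with columns i and j swapped and check whether
        node i's new representation now matches node j's original one."""
        original = encode(question, schema)  # (num_nodes, hidden)
        swapped_schema = list(schema)
        swapped_schema[i], swapped_schema[j] = swapped_schema[j], swapped_schema[i]
        swapped = encode(question, swapped_schema)
        # high similarity => the representation follows position, not semantics
        return F.cosine_similarity(swapped[i], original[j], dim=0)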
Table 2: Exact match accuracy of the ablated methods at four difficulty levels.
5.3 Ablation study
We perform ablation studies on the BERT-based ShadowGNN to analyze the contribution of the well-designed graph projection neural network (GPNN). We implement four ablation models: R-GCN, GPNN, RAT, and R-GCN+RAT. First, we describe the implementation of these ablation models.
R-GCN: we delete the projection part directly in GPNN. When updating the question representation, we use the representation of the semantic schema as the focus value, rather than the abstract representation. We can find that there is no architectural linking information. It is not fair to compare directly with the main model. We compute the a priori score pn m by architecturally chaining the inputs using the linear layers. We add pn m as a priori knowledge on the attention score. At each layer, the a priori score is shared.
GPNN: compared with ShadowGNN, the GPNN model directly removes the relation-aware transformer; there are only four projection layers in the encoder.
RAT: this model replaces the four projection layers with four more relation-aware self-attention layers, giving a total of eight relation-aware self-attention layers in the encoder, consistent with the RATSQL setup.
R-GCN+RAT: this model has four R-GCN layers and four relation-aware self-attention layers. For comparability, the initial input to the R-GCN is the sum of the semantic schema and abstract schema representations.
The decoder of these four ablation models is the same as that of ShadowGNN. We evaluate the ablation models at four difficulty levels on the development set. As shown in Table 2, ShadowGNN achieves the best performance at all difficulty levels. Implemented with the SemQL grammar, our R-GCN baseline achieves higher performance than the original R-GCN. The GPNN is a graph neural network whose focus is to obtain abstract representations of the question and the schema; interestingly, with BERT augmentation, the GPNN achieves the best performance among the GNN-based models, which suggests the effectiveness of the graph projection mechanism. ShadowGNN improves absolute accuracy by 3.6% over the RAT model, and by an absolute 5.7% on hard-level data in particular. Hard-level data accounts for only around 15% of the training set, which indicates that the domain adaptation capability of ShadowGNN is superior to that of the RAT model. ShadowGNN also outperforms the R-GCN+RAT model even though the initial input information is exactly the same, which demonstrates the necessity and effectiveness of explicitly abstracting the question and schema representations.
5.4 Error analysis
To understand the sources of error, we analyze the 288 examples on which the ELECTRA-enhanced ShadowGNN fails on the development set. We identify three main causes of SQL query errors. (1) 16% of the failed queries are equivalent implementations of the NL intent with different SQL syntax; for example, the MAX operation can be rewritten as ORDER BY C DESC LIMIT 1. (2) 13% of the failed examples make errors on operators, because predicting the correct operator requires domain knowledge; some examples are difficult to label even for SQL experts. Consider a question asking about the "average weight and year": for this phrase, it is hard to determine whether the "average" also needs to be computed over the year. (3) 25% of the failed examples select a wrong table column; most of them select the wrong table for a column that is a foreign key of two tables sharing the same column name.
5.5 Discussion and future work
From the above results, the main limitation of our proposed ShadowGNN is the incompleteness of the SemQL grammar, in which some important clauses are inferred by the Text-to-SQL model rather than predicted. In future work, we will improve the SemQL grammar. On the other hand, to verify the generality of the proposed graph projection neural network (GPNN), we will adapt it to similar tasks under cross-domain settings, such as dialogue state tracking (DST).
6. Conclusion
In this work, we attempt to mitigate the impact of domain information on the cross-domain Text-to-SQL task. We propose a graph projection neural network (GPNN) that abstracts the representations of the question and the schema with a simple attention mechanism. We further unify the abstract question and schema representations output by the GPNN with a relation-aware transformer (RAT). Experiments show that our proposed ShadowGNN achieves excellent performance on the challenging Text-to-SQL task. The ablation study further demonstrates the effectiveness of the proposed GPNN.
It should be noted that for simplicity of explanation, the foregoing method embodiments are described as a series of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated ordering of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art will appreciate that the embodiments described in this specification are presently preferred and that no acts or modules are required by the invention. In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
In some embodiments, the present invention provides a non-transitory computer-readable storage medium storing one or more programs that include executable instructions, which can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device) to perform any of the above methods for converting text into a structured query language.
In some embodiments, the present invention further provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform any of the above methods for converting text into a structured query language.
In some embodiments, the present invention further provides an electronic device, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of converting text into a structured query language.
In some embodiments, the present invention further provides a storage medium having a computer program stored thereon, which, when executed by a processor, implements a method for converting text into a structured query language.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general hardware platform, and certainly can also be implemented by hardware. Based on such understanding, the essence of the above technical solutions, or the part that contributes to the related art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the embodiments or of some parts of the embodiments.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present application.

Claims (8)

1. A method of converting text to a structured query language, comprising:
inputting a user question text and corresponding database information into a graph projection neural network to obtain an abstract question representation and an abstract database information representation;
inputting the abstract question representation and the abstract database information representation into a first transformer to obtain a unified information representation;
and determining a syntax tree structure corresponding to the unified information representation, so as to obtain a structured query language corresponding to the user question text.
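(For illustration only, the following minimal Python sketch mirrors the three steps of claim 1. The linear projections standing in for the graph projection neural network, the stock transformer encoder standing in for the first transformer, and all module names and dimensions are hypothetical assumptions, not the patented implementation.)

# Hypothetical sketch of the claim-1 pipeline: question + database info
# -> graph projection network -> abstract representations
# -> transformer -> unified representation (then syntax-tree decoding).
import torch
import torch.nn as nn

class TextToSQLPipeline(nn.Module):
    def __init__(self, dim=256, n_heads=8):
        super().__init__()
        # Stand-ins for the graph projection neural network (GPNN).
        self.project_q = nn.Linear(dim, dim)   # abstracts the question
        self.project_db = nn.Linear(dim, dim)  # abstracts the database info
        # Stand-in for the "first transformer" that unifies both representations.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.unifier = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, question_emb, db_emb):
        # question_emb: (batch, q_len, dim); db_emb: (batch, s_len, dim)
        abstract_q = torch.relu(self.project_q(question_emb))
        abstract_db = torch.relu(self.project_db(db_emb))
        # Concatenate and encode jointly to obtain the unified representation.
        unified = self.unifier(torch.cat([abstract_q, abstract_db], dim=1))
        return unified  # would be fed to a grammar-based decoder (see claim 5)

pipeline = TextToSQLPipeline()
q = torch.randn(1, 12, 256)   # toy question token embeddings
db = torch.randn(1, 20, 256)  # toy schema-item embeddings
print(pipeline(q, db).shape)  # torch.Size([1, 32, 256])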
2. The method of claim 1, wherein the database information comprises database information with domain information and database information with structure information;
the step of inputting the user question text and the corresponding database information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:
inputting the user question text, the database information with domain information, and the database information with structure information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation.
3. The method of claim 2, wherein inputting the user question text, the database information with the domain information, and the database information with the structure information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:
obtaining an attention weight matrix according to the user question text and the database information with domain information;
updating the user question text according to the attention weight matrix and the abstract database information representation, and inputting the updated user question text into a second transformer to obtain the abstract question representation;
and updating the database information with domain information and the database information with structure information according to the attention weight matrix and the user question text, and inputting the updated database information into a graph convolutional network to obtain the abstract database representation.
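(A hedged Python sketch of the attention step of claim 3 follows; the single-head scaled dot-product attention, the function name projection_attention, and all tensor shapes are illustrative assumptions, not the patented formulation.)

# Illustrative single-head version of claim 3: derive an attention weight
# matrix between question tokens and domain-aware schema items, then use it
# (with the abstract schema representation) to update the question.
import torch

def projection_attention(question, schema_domain, schema_abstract):
    # question:        (q_len, dim)  user question token states
    # schema_domain:   (s_len, dim)  database info carrying domain information
    # schema_abstract: (s_len, dim)  abstract database representation
    scores = question @ schema_domain.T                        # (q_len, s_len)
    attn = torch.softmax(scores / question.shape[-1] ** 0.5, dim=-1)
    # Project the question onto the abstract schema space.
    updated_question = attn @ schema_abstract                  # (q_len, dim)
    return attn, updated_question

attn, uq = projection_attention(torch.randn(12, 64), torch.randn(20, 64), torch.randn(20, 64))
print(attn.shape, uq.shape)  # torch.Size([12, 20]) torch.Size([12, 64])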
4. The method of claim 3, wherein updating the database information with domain information and the database information with structure information based on the attention weight matrix and the user question text comprises:
determining a first text representation corresponding to the user question text from the perspective of the database information with domain information;
determining a second text representation corresponding to the user question text from the perspective of the database information with structure information;
updating the database information with domain information according to the attention weight matrix and the first text representation;
and updating the database information with structure information according to the attention weight matrix and the second text representation.
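(A companion sketch for the two-view update of claim 4, under the same assumptions as the previous sketch; the linear maps w_domain and w_structure are hypothetical stand-ins for deriving the first and second text representations.)

# Illustrative counterpart of claim 4: reuse the transposed attention matrix
# so each schema item aggregates the question tokens that attend to it,
# updating the domain view and the structure view separately.
import torch

def update_schema_views(attn, question, schema_domain, schema_structure,
                        w_domain, w_structure):
    # attn: (q_len, s_len); question: (q_len, dim)
    first_text = question @ w_domain        # question seen from the domain view
    second_text = question @ w_structure   # question seen from the structure view
    # Hypothetical residual updates of the two database-information views.
    new_domain = schema_domain + attn.T @ first_text
    new_structure = schema_structure + attn.T @ second_text
    return new_domain, new_structure

d, s = update_schema_views(
    torch.softmax(torch.randn(12, 20), dim=-1), torch.randn(12, 64),
    torch.randn(20, 64), torch.randn(20, 64),
    torch.randn(64, 64), torch.randn(64, 64))
print(d.shape, s.shape)  # torch.Size([20, 64]) torch.Size([20, 64])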
5. The method according to any one of claims 1-4, wherein determining the syntax tree structure corresponding to the unified information representation to obtain the structured query language corresponding to the user question text comprises:
expressing, in advance, the structured query language in the form of an abstract syntax tree by adopting an IRNet decoding mode;
determining the syntax tree structure corresponding to the unified information representation;
and determining the structured query language corresponding to the syntax tree structure as the structured query language corresponding to the user question text.
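(The syntax-tree decoding of claim 5 can be pictured with the toy Python sketch below; the three-rule grammar is a hypothetical stand-in for the SemQL/IRNet grammar, and a real decoder would predict each production rule and each schema pointer rather than reading them from a fixed table.)

# Toy grammar-based decoding: expand an abstract-syntax-tree root symbol
# through production rules, then print the finished tree back as SQL.
TOY_GRAMMAR = {
    "Query": ["SELECT", "Col", "FROM", "Tab"],
    "Col":   ["name"],    # terminal choices would be predicted by
    "Tab":   ["singer"],  # pointing into the database schema
}

def expand(symbol):
    """Depth-first expansion of one grammar symbol into terminal tokens."""
    if symbol not in TOY_GRAMMAR:  # terminal symbol
        return [symbol]
    tokens = []
    for child in TOY_GRAMMAR[symbol]:
        tokens.extend(expand(child))
    return tokens

print(" ".join(expand("Query")))  # SELECT name FROM singer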
6. An apparatus for converting text into a structured query language, comprising:
a projection layer program module for inputting a user question text and corresponding database information into a graph projection neural network to obtain an abstract question representation and an abstract database information representation;
a first transformer program module for inputting the abstract question representation and the abstract database information representation into the first transformer to obtain a unified information representation;
and a decoder program module for determining a syntax tree structure corresponding to the unified information representation so as to obtain a structured query language corresponding to the user question text.
7. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1-5.
8. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN202011502186.2A 2020-12-18 2020-12-18 Method and device for converting text into structured query language Active CN112487135B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011502186.2A CN112487135B (en) 2020-12-18 2020-12-18 Method and device for converting text into structured query language

Publications (2)

Publication Number Publication Date
CN112487135A CN112487135A (en) 2021-03-12
CN112487135B true CN112487135B (en) 2022-07-15

Family

ID=74914796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011502186.2A Active CN112487135B (en) 2020-12-18 2020-12-18 Method and device for converting text into structured query language

Country Status (1)

Country Link
CN (1) CN112487135B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11726750B1 (en) * 2021-11-17 2023-08-15 Outsystems—Software Em Rede, S.A. Constrained decoding and ranking of language models for code generation
CN115982336B (en) * 2023-02-15 2023-05-23 创意信息技术股份有限公司 Dynamic dialogue state diagram learning method, device, system and storage medium
CN116991877B (en) * 2023-09-25 2024-01-02 城云科技(中国)有限公司 Method, device and application for generating structured query statement
CN117591543B (en) * 2024-01-19 2024-04-02 成都工业学院 SQL sentence generation method and device for Chinese natural language

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109933602A (en) * 2019-02-28 2019-06-25 武汉大学 A kind of conversion method and device of natural language and structured query language
US20200134032A1 (en) * 2018-10-31 2020-04-30 Microsoft Technology Licensing, Llc Constructing structured database query language statements from natural language questions
CN111813802A (en) * 2020-09-11 2020-10-23 杭州量之智能科技有限公司 Method for generating structured query statement based on natural language

Similar Documents

Publication Publication Date Title
CN112487135B (en) Method and device for converting text into structured query language
Lin et al. Bridging textual and tabular data for cross-domain text-to-SQL semantic parsing
Shi et al. Learning contextual representations for semantic parsing with generation-augmented pre-training
Yin et al. Neural enquirer: Learning to query tables with natural language
Hui et al. Dynamic hybrid relation exploration network for cross-domain context-dependent semantic parsing
Cao et al. Semantic parsing with dual learning
US20220164626A1 (en) Automated merge conflict resolution with transformers
US20220308848A1 (en) Semi-supervised translation of source code programs using neural transformers
CN109933602B (en) Method and device for converting natural language and structured query language
CN111382574B (en) Semantic parsing system combining syntax under virtual reality and augmented reality scenes
CN111930906A (en) Knowledge graph question-answering method and device based on semantic block
US20220129450A1 (en) System and method for transferable natural language interface
CN115048447B (en) Database natural language interface system based on intelligent semantic completion
CN110084323A (en) End-to-end semanteme resolution system and training method
Luz et al. Semantic parsing natural language into SPARQL: improving target language representation with neural attention
CN115374270A (en) Legal text abstract generation method based on graph neural network
Jhunjhunwala et al. Multi-action dialog policy learning with interactive human teaching
CN116561251A (en) Natural language processing method
Huang et al. Relation aware semi-autoregressive semantic parsing for nl2sql
Cao et al. Improving and evaluating complex question answering over knowledge bases by constructing strongly supervised data
Sun et al. Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data
CN116661852B (en) Code searching method based on program dependency graph
CN116432637A (en) Multi-granularity extraction-generation hybrid abstract method based on reinforcement learning
CN114579605B (en) Table question-answer data processing method, electronic equipment and computer storage medium
Wang et al. Knowledge base question answering system based on knowledge graph representation learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Co.,Ltd.

GR01 Patent grant