CN112487135B - Method and device for converting text into structured query language

Method and device for converting text into structured query language

Info

Publication number
CN112487135B
CN112487135B
Authority
CN
China
Prior art keywords: information, representation, abstract, database, question
Prior art date
Legal status: Active
Application number
CN202011502186.2A
Other languages
Chinese (zh)
Other versions
CN112487135A (en)
Inventor
俞凯
陈志�
Current Assignee
Sipic Technology Co Ltd
Original Assignee
Sipic Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Sipic Technology Co Ltd
Priority to CN202011502186.2A
Publication of CN112487135A
Application granted
Publication of CN112487135B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/31 - Indexing; Data structures therefor; Storage structures
    • G06F 16/313 - Selection or weighting of terms for indexing
    • G06F 16/33 - Querying
    • G06F 16/332 - Query formulation
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for converting text into a structured query language, which comprises the following steps: determining an abstract question representation and an abstract database information representation according to a user question text and corresponding database information; inputting the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation; and determining a syntax tree structure corresponding to the unified information representation, so as to obtain the structured query language corresponding to the user question text. The invention exploits the fact that a database carries both domain information and structural information, and adopts a graph projection model to separate out the domain information. With the semantic information of the database as a springboard, the question is updated using the structural information of the database, the representation of the question is abstracted step by step, and the question is gradually stripped of the domain information in the database, finally yielding representations of the abstract question and the database that contain no specific semantic information. In this way, the domain migration capability of the model is improved.

Description

Method and device for converting text into structured query language
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method and a device for converting a text into a structured query language.
Background
The purpose of the text-to-SQL (Structured Query Language) task is to convert a natural language question into a corresponding executable SQL statement. Traditional text-to-SQL approaches are based on the intermediate-representation text-to-SQL parsing network (IRNet) and the relation-aware transformer text-to-SQL model (RATSQL).
IRNet: using abstract-syntax-tree techniques, a set of intermediate grammars is designed for executable SQL statements, and all SQL statements can be represented with this grammar. Compared with raw SQL statements, the intermediate grammar abstracts away the SQL keywords, which greatly reduces the search space. During parsing, only the intermediate grammar, with its smaller search space, needs to be produced; it is then restored into the SQL statement.
RATSQL: the relation-aware transformer text-to-SQL model encodes the database information and the user question information jointly, fully considers the relations between them, and fuses the relation information into the representations of the question and the database information. This unified representation achieves better results on domain migration tasks.
However, neither of the above methods considers the influence of domain information on the text-to-SQL parsing task, even though domain migration capability is of practical significance for this task. The influence of domain information on performance needs to be taken into account, and how to eliminate this influence is not solved by previous methods.
Disclosure of Invention
The embodiments of the invention provide a method and a device for converting text into a structured query language, which are used to solve at least one of the above technical problems.
In a first aspect, an embodiment of the present invention provides a method for converting text into a structured query language, including:
determining an abstract question representation and an abstract database information representation according to a user question text and corresponding database information;

inputting the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation;

and determining a syntax tree structure corresponding to the unified information representation, so as to obtain a structured query language corresponding to the user question text.
In a second aspect, an embodiment of the present invention provides an apparatus for converting text into a structured query language, including:
a projection layer program module, used for determining an abstract question representation and an abstract database information representation according to the user question text and the corresponding database information;

a first converter program module, used for inputting the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation;

and a decoder program module, used for determining the syntax tree structure corresponding to the unified information representation, so as to obtain the structured query language corresponding to the user question text.
In a third aspect, an embodiment of the present invention provides a storage medium, where one or more programs including execution instructions are stored, where the execution instructions can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device, etc.) to perform any one of the above methods for converting text into a structured query language of the present invention.
In a fourth aspect, an electronic device is provided, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform any of the above methods of converting text to structured query language of the present invention.
In a fifth aspect, an embodiment of the present invention further provides a computer program product, which includes a computer program stored on a storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to execute any one of the above methods for converting text into a structured query language.
The embodiments of the invention have the following beneficial effects: the method exploits the fact that a database carries both domain information and structural information, and adopts a graph projection model to separate out the domain information. With the semantic information of the database as a springboard, the question is updated using the structural information of the database, the representation of the question is abstracted step by step, and the question is gradually stripped of the domain information in the database, finally yielding representations of the abstract question and the database that contain no specific semantic information. In this way, the domain migration capability of the model is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flow diagram of one embodiment of a method of converting text into a structured query language in accordance with the present invention;
FIG. 2 is a flow diagram of another embodiment of a method of converting text into a structured query language in accordance with the present invention;
FIG. 3 is a flow diagram of yet another embodiment of a method of converting text into a structured query language in accordance with the present invention;
FIG. 4 is a functional block diagram of an apparatus for converting text to a structured query language in accordance with the present invention;
FIG. 5 is a schematic structural diagram of the ShadowGNN according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
As used in this disclosure, "module," "device," "system," and the like are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, or software in execution. In particular, for example, an element may be, but is not limited to being, a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. Also, an application or script running on a server, or a server, may be an element. One or more elements may be in a process and/or thread of execution and an element may be localized on one computer and/or distributed between two or more computers and can be operated by various computer-readable media. The elements may also communicate by way of local and/or remote processes in accordance with a signal having one or more data packets, e.g., signals from data interacting with another element in a local system, distributed system, and/or across a network of the internet with other systems by way of the signal.
Finally, it should also be noted that, in this document, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Furthermore, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As shown in FIG. 1, an embodiment of the present invention provides a method for converting text into a structured query language, comprising:
S10, determining an abstract question representation and an abstract database information representation according to the user question text and the corresponding database information.

Illustratively, the user question text and the corresponding database information are input into a graph projection neural network to obtain the abstract question representation and the abstract database information representation.

In some embodiments, the database information includes database information with domain information and database information with structure information; and inputting the user question text and the corresponding database information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises: inputting the user question text, the database information with domain information, and the database information with structure information into a pre-constructed graph projection neural network to obtain the abstract question representation and the abstract database information representation.

S20, inputting the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation.

S30, determining the syntax tree structure corresponding to the unified information representation, so as to obtain the structured query language corresponding to the user question text.

In the embodiment of the invention, the fact that a database carries both domain information and structural information is exploited, and a graph projection model is adopted to separate out the domain information. With the semantic information of the database as a springboard, the question is updated using the structural information of the database, the representation of the question is abstracted step by step, and the question is gradually stripped of the domain information in the database, finally yielding representations of the abstract question and the database that contain no specific semantic information. In this way, the domain migration capability of the model is improved.
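By way of illustration only, the following minimal PyTorch-style sketch shows how the three steps S10-S30 can be composed; all sub-modules and the to_sql() helper are hypothetical placeholders rather than the exact implementation of this disclosure:

    import torch.nn as nn

    class TextToSQL(nn.Module):
        """Three-stage pipeline: graph projection encoder -> relation-aware
        transformer -> syntax-tree decoder (all sub-modules are stand-ins)."""
        def __init__(self, gpnn: nn.Module, rat: nn.Module, decoder: nn.Module):
            super().__init__()
            self.gpnn, self.rat, self.decoder = gpnn, rat, decoder

        def forward(self, question_tokens, db_schema, relations):
            # S10: abstract question / database representations via graph projection
            abs_question, abs_db = self.gpnn(question_tokens, db_schema)
            # S20: unified information representation via the first converter
            unified = self.rat(abs_question, abs_db, relations)
            # S30: decode a syntax tree, then restore it into an SQL string
            tree = self.decoder(unified)
            return tree.to_sql()  # hypothetical helper on the decoded tree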
As shown in FIG. 2, a flowchart of another embodiment of the present invention, inputting the user question text, the database information with domain information, and the database information with structure information into the pre-constructed graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:

S11, obtaining an attention weight matrix according to the user question text and the database information with domain information;

S12, updating the user question text according to the attention weight matrix and the abstract database information representation, and inputting the updated user question text into a second converter to obtain the abstract question representation;

and S13, updating the database information with domain information and the database information with structure information according to the attention weight matrix and the user question text, and inputting the updated database information into a graph convolution network to obtain the abstract database representation.

In this embodiment, the attention weight matrix is first obtained from the user question text and the database information with domain information; the user question is updated according to the attention weight matrix and the abstract database information, and the other two database views (the database with domain information and the database with structure information) are updated according to the attention weight matrix and the user question. Finally, the database representation passes through a layer of relation-based graph convolution network, and the user question representation passes through a layer of transformer. One such update is called a graph projection step; after several such steps, the abstract database representation and the abstract question representation are obtained. The abstract database representation and the abstract question representation are then unified using a relation-aware transformer.
As shown in FIG. 3, a flowchart of yet another embodiment of the present invention, updating the database information with domain information and the database information with structure information according to the attention weight matrix and the user question text comprises:

S131, determining a first text representation corresponding to the user question text from the view of the database with domain information;

S132, determining a second text representation corresponding to the user question text from the view of the database with structure information;

S133, updating the database information with domain information according to the attention weight matrix and the first text representation;

and S134, updating the database information with structure information according to the attention weight matrix and the second text representation.

In this embodiment, the update process of the database and question representations is as follows: the attention weight matrix is first obtained from the user question and the database with domain information; the user question is updated according to the attention weight matrix and the abstract database information, and each of the other two database views is updated according to the attention weight matrix and the user question representation under that view. Finally, the database representation passes through a layer of relation-based graph convolution network, and the user question representation passes through a layer of transformer. One such update constitutes a layer of the graph projection neural network; after several such updates, the abstract database representation and the abstract question representation are obtained. The abstract database representation and the abstract question representation are then turned into a unified representation using a relation-aware transformer, as sketched below.
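By way of illustration, the following compact, self-contained sketch shows one such update round (steps S11-S13); the linear schema encoder is a stand-in for the relation-based graph convolution network, and all names and dimensions are illustrative assumptions:

    import math
    import torch
    import torch.nn as nn

    class ProjectionRound(nn.Module):
        """One update round: attention between question and the domain view,
        cross-updates of all three inputs, then refinement layers."""
        def __init__(self, hidden: int):  # hidden must be divisible by nhead
            super().__init__()
            self.q_proj = nn.Linear(hidden, hidden, bias=False)
            self.k_proj = nn.Linear(hidden, hidden, bias=False)
            self.question_enc = nn.TransformerEncoderLayer(hidden, nhead=8, batch_first=True)
            self.schema_enc = nn.Linear(hidden, hidden)  # stand-in for a relational GCN

        def forward(self, question, domain_db, struct_db):
            # S11: attention weight matrix between question and domain database info
            attn = self.q_proj(question) @ self.k_proj(domain_db).T / math.sqrt(question.size(-1))
            # S12: the question is updated from the abstract (structure-only) view
            question = question + attn.softmax(-1) @ struct_db
            question = self.question_enc(question.unsqueeze(0)).squeeze(0)
            # S13: both database views are updated from the question
            s2q = attn.T.softmax(-1)
            domain_db = torch.relu(self.schema_enc(domain_db + s2q @ question))
            struct_db = torch.relu(self.schema_enc(struct_db + s2q @ question))
            return question, domain_db, struct_db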
In some embodiments, determining the syntax tree structure corresponding to the unified information representation to obtain the structured query language corresponding to the user question text comprises:

representing the structured query language in the form of an abstract syntax tree in advance, using the IRNet decoding scheme;

determining the syntax tree structure corresponding to the unified information representation;

and determining the structured query language corresponding to the syntax tree structure as the structured query language corresponding to the user question text.
As shown in fig. 4, an embodiment of the present invention further provides an apparatus 400 for converting text into a structured query language, which in this embodiment includes:
a projection layer program module 410, configured to determine an abstract question representation and an abstract database information representation according to a user question text and corresponding database information;
a first converter program module 420, configured to input the abstract question representation and the abstract database information representation into a first converter to obtain a unified information representation;
a decoder program module 430, configured to determine a syntax tree structure corresponding to the unified information representation, so as to obtain a structured query language corresponding to the user question text.
In some embodiments, determining the abstract question representation and the abstract database information representation according to the user question text and the corresponding database information comprises: inputting the user question text and the corresponding database information into a graph projection neural network to obtain the abstract question representation and the abstract database information representation.

In some embodiments, the database information includes database information with domain information and database information with structure information;

inputting the user question text and the corresponding database information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:

inputting the user question text, the database information with domain information, and the database information with structure information into a pre-constructed graph projection neural network to obtain the abstract question representation and the abstract database information representation.

In some embodiments, inputting the user question text, the database information with domain information, and the database information with structure information into the pre-constructed graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:

obtaining an attention weight matrix according to the user question text and the database information with domain information;

updating the user question text according to the attention weight matrix and the abstract database information representation, and inputting the updated user question text into a second converter to obtain the abstract question representation;

and updating the database information with domain information and the database information with structure information according to the attention weight matrix and the user question text, and inputting the updated database information into a graph convolution network to obtain the abstract database representation.
In some embodiments, updating the database information with domain information and the database information with structure information based on the attention weight matrix and the user question text comprises:
determining a first text representation corresponding to the user question text from the view of the database with domain information;

determining a second text representation corresponding to the user question text from the view of the database with structure information;
updating database information with domain information according to the attention weight matrix and the first text representation;
updating the database information with structural information based on the attention weight matrix and the second textual representation.
In some embodiments, determining the syntax tree structure corresponding to the unified information representation to obtain the structured query language corresponding to the user question text comprises:

representing the structured query language in the form of an abstract syntax tree in advance, using the IRNet decoding scheme;

determining the syntax tree structure corresponding to the unified information representation;

and determining the structured query language corresponding to the syntax tree structure as the structured query language corresponding to the user question text.
In order to more clearly describe the technical solutions of the present invention, and to demonstrate more directly the feasibility and benefits of the present invention compared with the prior art, the inventive process, the technical background, the technical solutions, and the experiments performed are described in more detail below.
Abstract
In order to improve the generalization capability of the model, a new parsing framework, called ShadowGNN, is proposed from the two perspectives of database structure and database semantics. The abstract schema removes the semantic information from the database representation, and this abstract view, combined with the graph projection neural network we design, yields delexicalized question and database representations. On top of these abstract representations, we further use a relation-aware transformer to obtain a unified representation of the question and the database. Finally, we incorporate a context-free intermediate grammar for decoding. On the challenging Text-to-SQL task Spider, our proposed model outperforms the baseline models we implemented. Combined with a pre-trained model (ELECTRA), ShadowGNN achieves results comparable to the current state of the art.
1. Introduction
Recently, Text-to-SQL has attracted wide attention in the semantic parsing community. The ability to query databases with natural language (NL) allows users unfamiliar with SQL to access large databases. Many neural methods have been proposed to translate questions into executable SQL queries. On some published Text-to-SQL benchmarks, the exact match accuracy even exceeds 80%. However, the cross-domain problem of Text-to-SQL is a real challenge that was ignored by previous datasets. It should be noted that a database schema is regarded as a domain. The domain information consists of two parts: the semantic information of the schema components (e.g., table names) and the structural information of the schema (e.g., the primary-key relations between tables and columns).
The recently released Spider dataset hides the database schemas of the test set, which are quite different from those of the training set. Under such a cross-domain setting, domain adaptation is challenging for two main reasons. First, the semantic information of the domains in the test and development sets is unseen in the training set: on the development set, 35% of the words in the database schemas do not appear in the schemas of the training set, so it is hard to match the domain expressions in the questions with those in the schemas. Second, there are considerable differences between the structures of the database schemas; in particular, database schemas always carry semantic information, which makes it difficult to obtain a unified representation of a database schema. For the reasons above, the crux in both cases is the domain-specific semantic information.
In this work, we attempt to mitigate the impact of domain information under the cross-domain setting. It is important to clarify what role the semantic information of the schema components plays during the conversion of an NL question into an SQL query. For a Text-to-SQL model, the basic task is to find all the mentioned columns (e.g., name) and tables (e.g., team, season) by looking them up in the schema carrying semantic information (called the semantic schema). Once the columns and tables mentioned in the NL question exactly match the schema components, we can abstract the NL question and the semantic schema by replacing the specific schema components with their generic component types. We can still infer the structure of the SQL query from the abstract NL question and the abstract schema structure. Through the correspondence between the semantic schema and the abstract schema, we can then restore the abstract query to an executable SQL query with domain information. Inspired by this observation, we decompose the encoder of the Text-to-SQL model into two modules. First, the present invention proposes a graph projection neural network (GPNN) to abstract the NL question and the semantic schema, removing as much domain information as possible. The invention then uses a relation-aware transformer to obtain a unified representation of the abstract NL question and the abstract schema.
The method of the invention is evaluated on the challenging cross-domain Text-to-SQL dataset Spider. The contributions are summarized as follows:

To the best of our knowledge, this is the first work to mitigate the effect of domain information by abstracting the representations of the NL question and the SQL query. It is a meaningful approach that can be applied to similar cross-domain tasks.

To remove the domain information contained in NL questions and schemas, the invention proposes the GPNN to obtain abstract representations of NL questions and schemas.

Empirical results show that the method of the invention achieves performance comparable to state-of-the-art methods on the challenging Spider benchmark. An ablation study further demonstrates that the GPNN is important for abstracting the representations of NL questions and schemas.
2. Related work
text-to-SQL: the model recently evaluated on Spider points out several interesting directions for text-to-SQL studies. An AST-based decoder is first proposed that decodes a more abstract Intermediate Representation (IR) using a similar AST-based decoder and then converts it to an SQL query. RAT-SQL introduces a relationship-aware transcoder coder to improve joint coding of question and pattern and achieve optimal performance on the Spider dataset. EditSQL takes into account the dialog context when converting utterances to SQL queries in a context-to-SQL benchmark that converts context to context.
Graph neural networks: graph neural networks (GNNs) have been used to encode schemas in a more structured way. Previous work constructed a directed graph from the foreign-key relations in the schema and then used a GNN to obtain the corresponding schema representation. Global-GNN also employs a GNN to derive the schema representation and to select a set of schema nodes that may appear in the output query; it then discriminatively reranks the top-K queries output by the generative decoder. We propose a graph projection neural network (GPNN) that can extract abstract representations of NL questions and semantic schemas.
Domain adaptation: domain adaptation has drawn the interest of researchers, since a model with good adaptation capability can be transferred to data with similar properties but from different domains. Several recently proposed models focus on improving domain adaptation capability, and thanks to the challenging Spider dataset we now have a better way to assess it. IRNet uses a coarse-to-fine decoding strategy, in which a domain-independent sketch is generated before domain-dependent node filling. RYANSQL likewise generates a detailed sketch for complex SELECT statements. RATSQL transfers to new databases by combining schema linking and a relation-aware transformer for domain-independent encoding. GNN and Global-GNN incorporate graph neural networks into schema encoding, which captures structural information and migrates easily to other databases. We are the first to focus on abstracting the schema in Text-to-SQL to improve domain adaptation.
3. Background
In this section, we first introduce the relational graph convolutional network (R-GCN), which is the basis of the GPNN proposed in the next section. We then introduce the relation-aware transformer, a transformer variant that takes relation information into account when computing attention weights.
3.1 Relational graph convolutional network
Before describing the details of the R-GCN, we first give the notation for a relational directed graph. We denote the graph as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where the nodes (schema components) are $v_i \in \mathcal{V}$ and each labeled directed edge is a triple $(v_i, r, v_j)$, in which $v_i$ is the source node, $v_j$ is the target node, and $r \in \mathcal{R}$ is the type of the edge from $v_i$ to $v_j$. $\mathcal{N}_i^r$ denotes the neighbors of node $v_i$ under relation $r$, where $v_i$ acts as the target node.

Each node of the graph has an input feature $x_i$, which can be regarded as the initial hidden state of the R-GCN, $h_i^{(0)} = x_i$. The hidden state of each node in the graph is updated layer by layer through the following steps.

Sending messages: at the $l$-th R-GCN layer, each edge $(v_i, r, v_j)$ of the graph sends a message from the source node $v_i$ to the target node $v_j$. The message is computed as

$$m_{i \to j}^{(l)} = W_r^{(l)} h_i^{(l-1)}, \qquad (1)$$

where $r$ is the relation from $v_i$ to $v_j$ and $W_r^{(l)}$ is a trainable linear transformation. According to Equation 1, the number of parameters for computing messages is proportional to the number of relation types. To improve scalability, the R-GCN regularizes the message-computation parameters with a basis decomposition, defined as

$$W_r^{(l)} = \sum_{b=1}^{B} a_{rb}^{(l)} V_b^{(l)}, \qquad (2)$$

where $B$ is the number of bases, $V_b^{(l)}$ are the basis transformations, and $a_{rb}^{(l)}$ are their coefficients. The basis transformations are shared across edge types; only the coefficients $a_{rb}^{(l)}$ depend on $r$.

Aggregating messages: after the message-passing step, all incoming messages of each node are aggregated. The R-GCN simply averages the incoming messages:

$$\bar{m}_i^{(l)} = \frac{1}{|\mathcal{N}_i|} \sum_{r \in \mathcal{R}} \sum_{j \in \mathcal{N}_i^r} m_{j \to i}^{(l)}, \qquad (3)$$

where $\mathcal{N}_i = \bigcup_{r \in \mathcal{R}} \mathcal{N}_i^r$.

Updating states: after aggregating messages, each node updates its hidden state from $h_i^{(l-1)}$ to $h_i^{(l)}$:

$$h_i^{(l)} = \sigma\big(\bar{m}_i^{(l)} + W_0^{(l)} h_i^{(l-1)}\big), \qquad (4)$$

where $\sigma$ is an activation function (i.e., ReLU) and $W_0^{(l)}$ is a weight matrix. For each R-GCN layer, the update process can be simply expressed as

$$H^{(l)} = \text{R-GCN}\big(H^{(l-1)}, \mathcal{G}\big),$$

where $H^{(l)} \in \mathbb{R}^{|\mathcal{V}| \times d}$, $|\mathcal{V}|$ is the number of nodes, and $\mathcal{G}$ is the graph structure.
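For concreteness, a minimal PyTorch sketch of one R-GCN layer with basis decomposition (Equations 1-4) is given below; the edge-list input format and the initialization are our own assumptions:

    import torch
    import torch.nn as nn

    class RGCNLayer(nn.Module):
        """One R-GCN layer with basis decomposition (Equations 1-4)."""
        def __init__(self, hidden: int, num_relations: int, num_bases: int):
            super().__init__()
            # B shared basis transforms V_b and per-relation coefficients a_rb (Eq. 2)
            self.bases = nn.Parameter(torch.empty(num_bases, hidden, hidden))
            self.coeffs = nn.Parameter(torch.empty(num_relations, num_bases))
            self.self_loop = nn.Linear(hidden, hidden, bias=False)  # W_0
            nn.init.xavier_uniform_(self.bases)
            nn.init.xavier_uniform_(self.coeffs)

        def forward(self, h, edges):
            # h: (num_nodes, hidden); edges: iterable of (src, rel, dst) triples
            w_rel = torch.einsum("rb,bio->rio", self.coeffs, self.bases)  # W_r (Eq. 2)
            agg = torch.zeros_like(h)
            deg = torch.zeros(h.size(0), 1, device=h.device)
            for src, rel, dst in edges:
                agg[dst] += h[src] @ w_rel[rel]  # message from source to target (Eq. 1)
                deg[dst] += 1
            agg = agg / deg.clamp(min=1)  # average the incoming messages (Eq. 3)
            return torch.relu(agg + self.self_loop(h))  # state update (Eq. 4)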
3.2 Relation-aware transformer
With the success of large-scale language models, the transformer architecture has been widely used in natural language processing (NLP) tasks, leveraging the self-attention mechanism to encode a sequence $X = [x_i]_{i=1}^{n}$. A transformer is a stack of self-attention layers, where each layer transforms $x_i$ into $y_i$ with $H$ heads as follows:

$$e_{ij}^{(h)} = \frac{x_i W_Q^{(h)} \big(x_j W_K^{(h)}\big)^{\top}}{\sqrt{d_z}}, \qquad (5)$$

$$\alpha_{ij}^{(h)} = \operatorname{softmax}_j \big\{ e_{ij}^{(h)} \big\}, \qquad (6)$$

$$z_i^{(h)} = \sum_{j=1}^{n} \alpha_{ij}^{(h)} x_j W_V^{(h)}, \qquad (7)$$

$$\tilde{y}_i = \operatorname{LayerNorm}\big(x_i + \operatorname{Concat}\big(z_i^{(1)}, \dots, z_i^{(H)}\big)\big), \qquad (8)$$

$$y_i = \operatorname{LayerNorm}\big(\tilde{y}_i + \operatorname{FC}\big(\operatorname{ReLU}(\operatorname{FC}(\tilde{y}_i))\big)\big), \qquad (9)$$

where $h$ is the head index, $d_z$ is the hidden dimension of $z_i^{(h)}$, $\alpha_{ij}^{(h)}$ is the attention probability, Concat denotes the concatenation operation, LayerNorm is layer normalization, and FC is a fully connected layer. The transformer function can be simply expressed as

$$Y = \operatorname{Transformer}(X), \qquad (10)$$

where $Y \in \mathbb{R}^{|X| \times d}$ and $|X|$ is the sequence length.

The relation-aware transformer (RAT) is an important extension of the traditional transformer, which regards the input sequence as a labeled, directed, fully connected graph. The pairwise relations between input elements are taken into account in the RAT, which incorporates the relation information into Equations 5 and 7. The relation from element $x_i$ to element $x_j$ is represented by vectors $r_{ij}^{K}$ and $r_{ij}^{V}$, which are incorporated into the self-attention layer as biases:

$$e_{ij}^{(h)} = \frac{x_i W_Q^{(h)} \big(x_j W_K^{(h)} + r_{ij}^{K}\big)^{\top}}{\sqrt{d_z}}, \qquad (11)$$

$$z_i^{(h)} = \sum_{j=1}^{n} \alpha_{ij}^{(h)} \big(x_j W_V^{(h)} + r_{ij}^{V}\big), \qquad (12)$$

where $r_{ij}^{K}$ and $r_{ij}^{V}$ are shared across the attention heads. For each RAT layer, the update process can be simply expressed as

$$Y = \operatorname{RAT}(X, R), \qquad (13)$$

where $R$ is the relation matrix between sequence tokens.
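For illustration, a single-head PyTorch sketch of relation-aware self-attention (Equations 11-12) follows; representing each relation type by a learned embedding is an assumption consistent with common RAT implementations:

    import math
    import torch
    import torch.nn as nn

    class RelationAwareAttention(nn.Module):
        """Single-head relation-aware self-attention (Equations 11-12)."""
        def __init__(self, hidden: int, num_relation_types: int):
            super().__init__()
            self.q = nn.Linear(hidden, hidden, bias=False)
            self.k = nn.Linear(hidden, hidden, bias=False)
            self.v = nn.Linear(hidden, hidden, bias=False)
            # learned vectors r_ij^K and r_ij^V, one per relation type
            self.rel_k = nn.Embedding(num_relation_types, hidden)
            self.rel_v = nn.Embedding(num_relation_types, hidden)

        def forward(self, x, rel):
            # x: (n, hidden); rel: (n, n) integer relation matrix R
            q, k, v = self.q(x), self.k(x), self.v(x)
            rk, rv = self.rel_k(rel), self.rel_v(rel)  # (n, n, hidden)
            # e_ij = q_i . (k_j + r_ij^K) / sqrt(d)  (Eq. 11)
            scores = (q.unsqueeze(1) * (k.unsqueeze(0) + rk)).sum(-1) / math.sqrt(x.size(-1))
            alpha = scores.softmax(dim=-1)
            # z_i = sum_j alpha_ij (v_j + r_ij^V)  (Eq. 12)
            return torch.einsum("ij,ijd->id", alpha, v.unsqueeze(0) + rv)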
The R-GCN and the RAT have been successfully applied to Text-to-SQL tasks. Bogin, Berant, and Gardner encode the structure of the semantic schema with an R-GCN to obtain a global representation of its nodes. RATSQL considers not only the schema structure but also the schema linking between the schema and the NL question, and proposes a unified framework to model the schema and question representations with the RAT. However, these works do not explicitly address the influence of domain information. In the next section, we introduce our proposed GPNN and illustrate how to obtain abstract representations of the schema and the question with it.
4. Method

The Text-to-SQL model takes an NL question $Q = (q_1, \dots, q_{|Q|})$ and a semantic schema $\mathcal{G}^{s}$ as input. In our proposed ShadowGNN, the encoder is decomposed into two modules. The first module filters out domain-specific information with a well-designed graph projection neural network (GPNN). The second module further obtains a unified representation of the question and the schema with a relation-aware transformer. With this two-stage encoder, ShadowGNN simulates the human reasoning process of converting a question into an SQL query under a cross-domain setting: abstraction and inference.
FIG. 5 is a schematic structural diagram of the ShadowGNN according to an embodiment of the present invention. In this embodiment, the ShadowGNN has three inputs: the abstract schema (carrying only the structural information of the schema), the semantic schema, and the natural language question. The encoder of the ShadowGNN consists of two modules: a stack of graph projection layers and a stack of relation-aware self-attention layers.
4.1 Graph projection neural network
In this subsection, we introduce the structure of the GPNN. As discussed, a database schema is characterized by both its structure and its domain semantic information. There are therefore two views of the schema: abstract and semantic. The abstract schema $\mathcal{G}^{a}$ consists only of the types (table or column) of the schema nodes, without any domain information, and can be regarded as a projection of the semantic schema. The input of the semantic schema includes the domain information; in addition, the NL question always contains domain information.

The traditional R-GCN method takes only the NL question and the semantic schema as input. As shown in FIG. 5, the GPNN additionally takes the abstract schema as input. The main motivation of the GPNN is to obtain abstract representations of the question and the schema. The abstract schema is already an abstraction of the semantic schema; what remains is to obtain an abstract question representation. The idea of the GPNN is to use the semantic schema as a bridge: the question updates its representation with the abstract schema, while the attention information is computed with the vectors of the semantic schema. In each graph projection layer, the attention between the NL question and the semantic schema is first computed as

$$s_{ij}^{(l)} = \frac{\big(q_i^{(l-1)} W_Q^{(l)}\big)\big(g_j^{s,(l-1)} W_K^{(l)}\big)^{\top}}{\sqrt{d}}, \qquad (14)$$

where $W_Q^{(l)}$ and $W_K^{(l)}$ are the attention weights of the $l$-th projection layer and $S^{(l)} = [s_{ij}^{(l)}]$ is the weight score matrix. When updating the question representation, we take the abstract schema $g_j^{a,(l)}$ as the attention value of the $l$-th GPNN layer:

$$\alpha_{ij}^{(l)} = \operatorname{softmax}_j \big\{ s_{ij}^{(l)} \big\}, \qquad (15)$$

$$\tilde{q}_i^{(l)} = \sum_j \alpha_{ij}^{(l)} \big(g_j^{a,(l-1)} W_V^{(l)}\big), \qquad (16)$$

$$q_i^{(l)} = \operatorname{gate}\big(\tilde{q}_i^{(l)}\big) \odot q_i^{(l-1)} + \big(1 - \operatorname{gate}(\tilde{q}_i^{(l)})\big) \odot \tilde{q}_i^{(l)}, \qquad (17)$$

where gate(·) = sigmoid(FC(·)) and $W_V^{(l)}$ is a trainable weight. When updating the semantic schema, we take the transpose of the above attention matrix as the schema-to-question attention:

$$\beta_{ji}^{(l)} = \operatorname{softmax}_i \big\{ s_{ij}^{(l)} \big\}. \qquad (18)$$

Similar to the question update in Equations 15-17, the semantic schema $g_j^{s,(l)}$ is updated with $\beta_{ji}^{(l)}$ as the attention score and $q_i^{(l)}$ as the attention value. Note that only the abstract schema is used to update the question representation; in this way, the domain information contained in the question representation is progressively removed.

The abstract schema is updated in the same way as the semantic schema, except that their attention weights over the question $q_i^{(l)}$ are not shared. Before the abstract schema is updated with the attention mechanism, there is one important operation: we first compute the maximum of the attention probabilities,

$$u_j = \max_i \big\{ \alpha_{ij}^{(l)} \big\}, \qquad (19)$$

where the physical meaning of $u_j$ is the maximum probability that the question mentions the $j$-th schema component. We multiply the initial representation of the abstract schema $g^{a,(l)}$ by $u$ in a broadcast manner to distinguish the schema components mentioned by the question. After the attention mechanism has updated the three representations, we further encode the schemas and the question with the R-GCN(·) and Transformer(·) functions, respectively, combining the graph structure of the schemas with the sequential nature of the NL question, as shown in FIG. 5. This completes the description of one projection layer; a graph projection neural network (GPNN) is a stack of such projection layers.
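The following simplified PyTorch sketch illustrates one projection layer; the gating and residual details of Equations 14-19 are approximated, so it should be read as an assumption-laden illustration rather than the precise formulation:

    import math
    import torch
    import torch.nn as nn

    class GraphProjectionLayer(nn.Module):
        """One projection step: the semantic schema only routes attention,
        while the abstract schema supplies the values (cf. Equations 14-19)."""
        def __init__(self, hidden: int):
            super().__init__()
            self.wq = nn.Linear(hidden, hidden, bias=False)
            self.wk = nn.Linear(hidden, hidden, bias=False)
            self.wv = nn.Linear(hidden, hidden, bias=False)
            self.gate = nn.Linear(hidden, hidden)

        def forward(self, question, sem_schema, abs_schema):
            # attention between question tokens and *semantic* schema nodes (Eq. 14)
            scores = self.wq(question) @ self.wk(sem_schema).T / math.sqrt(question.size(-1))
            q2s = scores.softmax(dim=-1)  # question-to-schema attention (Eq. 15)
            # the question reads from the *abstract* schema (Eqs. 16-17),
            # so domain words never enter the question representation
            cand = q2s @ self.wv(abs_schema)
            g = torch.sigmoid(self.gate(cand))
            question = g * question + (1 - g) * cand
            # schema updates use the transposed attention, question as value (Eq. 18)
            s2q = scores.T.softmax(dim=-1)
            sem_schema = sem_schema + s2q @ question
            # u_j: maximum probability that the question mentions node j (Eq. 19)
            u = q2s.max(dim=0).values.unsqueeze(-1)
            abs_schema = abs_schema * u + s2q @ question
            return question, sem_schema, abs_schema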
4.2 Schema linking and RAT

Schema linking can be regarded as prior knowledge, in which the relative relations between the question and the schema are labeled according to the degree of matching. There are 7 tags: table exact match, table partial match, column exact match, column partial match, column value exact match, column value partial match, and no match. The column values are stored in the database. If the schema is divided into tables and columns, there are three kinds of input: question, table, and column. RATSQL unifies the representations of the three inputs with a relation-aware transformer. RATSQL defines the set $R$ of all relations among the three inputs, and the RAT(·) function obtains a unified representation of the question and the schema. The schema linking relations are a subset of $R$. In this work, we further unify, with the RAT, the abstract representations of the question and the schema generated by the preceding GPNN module.
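As an illustration of schema linking, a simplified tagger assigning one of the 7 tags to a question n-gram might look as follows; the exact matching and normalization rules are assumptions:

    from enum import Enum

    class LinkTag(Enum):
        TABLE_EXACT = 0
        TABLE_PARTIAL = 1
        COLUMN_EXACT = 2
        COLUMN_PARTIAL = 3
        VALUE_EXACT = 4
        VALUE_PARTIAL = 5
        NONE = 6

    def link_tag(ngram: str, tables: set, columns: set, values: set) -> LinkTag:
        """Assign one of the 7 schema-linking tags to a question n-gram."""
        ngram = ngram.lower()
        if ngram in tables: return LinkTag.TABLE_EXACT
        if any(ngram in t for t in tables): return LinkTag.TABLE_PARTIAL
        if ngram in columns: return LinkTag.COLUMN_EXACT
        if any(ngram in c for c in columns): return LinkTag.COLUMN_PARTIAL
        if ngram in values: return LinkTag.VALUE_EXACT
        if any(ngram in v for v in values): return LinkTag.VALUE_PARTIAL
        return LinkTag.NONE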
4.3 decoder with SemQL syntax
To effectively limit the search space during synthesis, IRNet designs a context-free SemQL grammar as an intermediate representation between the NL question and SQL, which is essentially an abstract syntax tree (AST). SemQL recovers the tree-like nature of SQL. To simplify the syntax tree, SemQL does not cover all SQL keywords. For example, the columns in a GROUP BY clause can be inferred from the SELECT clause or from the primary key of a table to which an aggregation function is applied.

Combining the characteristics of SemQL queries, IRNet decomposes the decoding process of a SemQL query into two stages with a coarse-to-fine method. The first stage predicts the skeleton of the SemQL query with a skeleton decoder. A detail decoder then fills the missing details into the skeleton by selecting columns and tables. In this work, we directly use the IRNet decoder, whose source code has been published, for our Text-to-SQL model.
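For intuition, a hypothetical and highly simplified SemQL-style tree and its skeleton are sketched below; the actual SemQL grammar defined by IRNet differs in its node inventory and details:

    # Hypothetical, simplified SemQL-style tree for:
    #   SELECT name FROM singer WHERE age > 30
    # SQL keywords such as FROM are abstracted away; decoding happens in two
    # passes: a skeleton pass (node types) and a detail pass (columns/tables).
    semql = ("Z",
             ("R",  # single-rooted query (no set operation)
              ("Select", ("A", "none", ("C", "name"), ("T", "singer"))),
              ("Filter", ">", ("A", "none", ("C", "age"), ("T", "singer")))))
    skeleton = "Z R Select A C T Filter A C T"  # predicted first, details filled in later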
5. Experiments
In this section, we evaluate the effectiveness of our proposed ShadowGNN against other recent models. We further ablate our design choices to understand their contributions.
5.1 Experimental setup
Dataset and metrics: we conduct experiments on Spider, a large-scale, complex, and cross-domain Text-to-SQL benchmark. The databases on Spider are split into 146 for training, 20 for development, and 40 for testing. The manually annotated question-SQL query pairs are split into 8625/1034/2147 for training/development/testing. As with other competition challenges, the test set is not publicly available. We report results with the same metrics as previous work: exact match accuracy and component match accuracy.
Baselines: our main contribution lies in the encoder of the Text-to-SQL model; the decoder of our evaluated models is borrowed directly from IRNet. First, the SQL query is represented as an abstract syntax tree (AST) according to a well-designed grammar. The AST is then flattened into a sequence (named the SemQL query) by depth-first search (DFS). During decoding, an LSTM decoder predicts the sequence token by token. IRNet further uses a coarse-to-fine decoding approach: a skeleton decoder first outputs the skeleton of the SemQL query, and a detail decoder then fills the missing details into the skeleton by selecting columns and tables. R-GCN and RATSQL are two other powerful baselines that improve the representation ability of Text-to-SQL encoders.
Table 1: Exact match accuracy on the development and test sets. The methods at the top of the table are augmented only with BERT-based pre-trained models; the methods at the bottom incorporate other pre-trained models.
Pre-trained models: language model pre-training has proven effective for learning contextualized natural language representations. For comparison with our baseline methods, we initialize the embedded representations of the NL question and the schema components with a BERT-based model. A separator [SEP] is placed between the question tokens and the schema tokens. If a token contains multiple words (e.g., the column "player id"), an average pooling layer is applied after the last layer of BERT. To further evaluate the effectiveness of our method, we also use the more powerful ELECTRA-large, which has been widely used in machine reading tasks, to encode the questions and schemas.
Implementation: we implement ShadowGNN and the baseline methods with PyTorch. We use the pre-trained models BERT and ELECTRA from the PyTorch transformers repository, fine-tuning the BERT-based models on a 1080Ti GPU and the ELECTRA-based models on a Titan GPU. We use Adam with default hyperparameters for optimization. The learning rate is set to 1e-4, while the learning rate of the pre-trained model is decayed by a factor of 0.1. The hidden size of the GPNN layers and RAT layers is set to 512. The dropout rate is 0.3. The batch size is set to 16. Due to the limitations of our GPU devices, we do not search for the best settings over a hyperparameter grid as RATSQL does; this holds for both the ELECTRA-based and the BERT-based experiments. The numbers of GPNN and RAT layers in the ShadowGNN encoder are both set to 4.
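A sketch of the optimizer setup implied by these settings is shown below; the "bert" parameter-name prefix and the helper itself are illustrative assumptions:

    import torch
    from torch import nn

    def build_optimizer(model: nn.Module) -> torch.optim.Adam:
        # "bert" is a placeholder for however the pre-trained encoder's
        # parameters are named in a concrete implementation.
        pretrained = [p for n, p in model.named_parameters() if n.startswith("bert")]
        scratch = [p for n, p in model.named_parameters() if not n.startswith("bert")]
        return torch.optim.Adam([
            {"params": scratch, "lr": 1e-4},
            {"params": pretrained, "lr": 1e-4 * 0.1},  # decayed lr for the pre-trained model
        ])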
5.2 Experimental results
Table 1 lists, at the top, the three models with the highest exact match accuracy among those augmented with BERT-based pre-trained models, and, at the bottom, four models combined with other pre-trained models. Compared with the baselines we implemented, our proposed ShadowGNN achieves absolute improvements of 3.8% and 1.5% on the development set, and reaches performance comparable to the state-of-the-art methods. Among the IRNet variants on the Spider leaderboard, IRNet++, as its name suggests, obtains the best performance. Combined with ELECTRA, ShadowGNN achieves absolute improvements of 6.6% and 4.8% on the development and test sets, respectively.
The gold performance is obtained by converting the annotated SemQL tree directly into an SQL query; it is the upper bound of our proposed model, and in fact of the IRNet model as well. The gold performance depends only on the designed SemQL grammar. The SemQL grammar used in IRNet performs worst on GROUP BY clauses and complex IUEN operations. The gold exact match accuracy on the development set is only 89.6%, which is the main limitation of our Text-to-SQL model. We can see that the lower the gold match accuracy of a clause, the lower the component match accuracy of our proposed model on that clause. On the GROUP BY clause alone, the performance of ShadowGNN with ELECTRA is lower than that with BERT; as discussed above, the GROUP BY clause is inferred from other clauses, which is unrelated to the specific model.
We further design an experiment to validate the effectiveness of the graph projection neural network (GPNN). Consider a preprocessed question: "What are the name and capacity of the stadium that held the most concerts?", where "name" and "capacity" are column names. We swap the positions of the two columns in the schema and compute the cosine similarity between the representations at the final GPNN layer. Interestingly, we find that the swapped "name" is most similar to the original "capacity": the representation of a column name appears to depend only on the position where it occurs, as if its semantics has been removed. This indicates that the GPNN is indeed effective.
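A sketch of this probing procedure, under the assumption of an encode(question, schema) function that returns one vector per schema node, could be:

    import torch.nn.functional as F

    def probe_column_swap(encode, question, schema, i, j):
        """Re-encode the schema with columns i and j swapped and check whether
        node i's new representation now matches node j's original one."""
        original = encode(question, schema)  # (num_nodes, hidden)
        swapped_schema = list(schema)
        swapped_schema[i], swapped_schema[j] = swapped_schema[j], swapped_schema[i]
        swapped = encode(question, swapped_schema)
        # high similarity => the representation follows position, not semantics
        return F.cosine_similarity(swapped[i], original[j], dim=0)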
Table 2: Exact match accuracy of the ablated methods at four difficulty levels.
5.3 Ablation study
We perform ablation studies on the BERT-based ShadowGNN to analyze the contribution of the well-designed graph projection neural network (GPNN). We implement four ablation models: R-GCN, GPNN, RAT, and R-GCN+RAT. First, we describe the implementation of these ablation models.
R-GCN: we delete the projection part directly in GPNN. When updating the question representation, we use the representation of the semantic schema as the focus value, rather than the abstract representation. We can find that there is no architectural linking information. It is not fair to compare directly with the main model. We compute the a priori score pn m by architecturally chaining the inputs using the linear layers. We add pn m as a priori knowledge on the attention score. At each layer, the a priori score is shared.
GPNN: compared with ShadowGNN, the GPNN model directly removes the relation-aware transformer; there are only four projection layers in the encoder.
RAT: this model replaces the four projection layers with four more relation-aware self-attention layers, giving a total of eight relation-aware self-attention layers in the encoder, consistent with the RATSQL setup.
R-GCN+RAT: this model has four R-GCN layers and four relation-aware self-attention layers. For comparability, the initial input to the R-GCN is the sum of the semantic schema and abstract schema representations.
The decoder of these four ablation models is the same as that of ShadowGNN. We evaluate the ablation models at four difficulty levels on the development set. As shown in Table 2, ShadowGNN achieves the best performance at all difficulty levels. Implemented with the SemQL grammar, our R-GCN baseline achieves higher performance than the original R-GCN. The GPNN is a graph neural network whose focus is to obtain abstract representations of the question and the schema; interestingly, with BERT augmentation, the GPNN achieves the best performance among the GNN-based models, which suggests the effectiveness of the graph projection mechanism. ShadowGNN improves absolute accuracy by 3.6% over the RAT model, and by an absolute 5.7% on hard-level data in particular. Hard-level data accounts for only around 15% of the training set, which indicates that the domain adaptation capability of ShadowGNN is superior to that of the RAT model. ShadowGNN also outperforms the R-GCN+RAT model even though the initial input information is exactly the same, which demonstrates the necessity and effectiveness of explicitly abstracting the question and schema representations.
5.4 Error analysis
To understand the sources of error, we analyze the 288 examples on which the ELECTRA-enhanced ShadowGNN fails on the development set. We identify three main causes of SQL query errors. (1) 16% of the failed queries are equivalent implementations of the NL intent with different SQL syntax; for example, the MAX operation can be rewritten as ORDER BY C DESC LIMIT 1. (2) 13% of the failed examples make errors on operators, because predicting the correct operator requires domain knowledge; some examples are difficult to label even for SQL experts. Consider a question asking about the "average weight and year": for this phrase, it is hard to determine whether the "average" also needs to be computed over the year. (3) 25% of the failed examples select a wrong table column; most of them select the wrong table for a column that is a foreign key of two tables sharing the same column name.
5.5 Discussion and future work
From the above results, the main limitation of our proposed ShadowGNN is the incompleteness of the SemQL grammar, in which some important clauses are inferred by the Text-to-SQL model rather than predicted. In future work, we will improve the SemQL grammar. On the other hand, to verify the generality of the proposed graph projection neural network (GPNN), we will adapt it to similar tasks under cross-domain settings, such as dialogue state tracking (DST).
6. Conclusion
In this work, we attempt to mitigate the impact of domain information on the cross-domain Text-to-SQL task. We propose a graph projection neural network (GPNN) that abstracts the representations of the question and the schema with a simple attention mechanism. We further unify the abstract question and schema representations output by the GPNN with a relation-aware transformer (RAT). Experiments show that our proposed ShadowGNN achieves excellent performance on the challenging Text-to-SQL task. The ablation study further demonstrates the effectiveness of the proposed GPNN.
It should be noted that for simplicity of explanation, the foregoing method embodiments are described as a series of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated ordering of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art will appreciate that the embodiments described in this specification are presently preferred and that no acts or modules are required by the invention. In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
In some embodiments, the present invention provides a non-transitory computer-readable storage medium storing one or more programs that include executable instructions, which can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device) to perform any of the above methods for converting text into a structured query language.
In some embodiments, the present invention further provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform any of the above methods for converting text into a structured query language.
In some embodiments, the present invention further provides an electronic device, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of converting text into a structured query language.
In some embodiments, the present invention further provides a storage medium having a computer program stored thereon, which, when executed by a processor, implements a method for converting text into a structured query language.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general hardware platform, and certainly can also be implemented by hardware. Based on such understanding, the essence of the above technical solutions, or the part that contributes to the related art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the embodiments or of some parts of the embodiments.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present application.

Claims (8)

1. A method of converting text to a structured query language, comprising:
inputting a user question text and corresponding database information into a graph projection neural network to obtain an abstract question representation and an abstract database information representation;
inputting the abstract question representation and the abstract database information representation into a first transformer to obtain a unified information representation;
and determining a syntax tree structure corresponding to the unified information representation, so as to obtain a structured query language corresponding to the user question text.
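(For illustration only, the following minimal Python sketch mirrors the three steps of claim 1. The linear projections standing in for the graph projection neural network, the stock transformer encoder standing in for the first transformer, and all module names and dimensions are hypothetical assumptions, not the patented implementation.)

# Hypothetical sketch of the claim-1 pipeline: question + database info
# -> graph projection network -> abstract representations
# -> transformer -> unified representation (then syntax-tree decoding).
import torch
import torch.nn as nn

class TextToSQLPipeline(nn.Module):
    def __init__(self, dim=256, n_heads=8):
        super().__init__()
        # Stand-ins for the graph projection neural network (GPNN).
        self.project_q = nn.Linear(dim, dim)   # abstracts the question
        self.project_db = nn.Linear(dim, dim)  # abstracts the database info
        # Stand-in for the "first transformer" that unifies both representations.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.unifier = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, question_emb, db_emb):
        # question_emb: (batch, q_len, dim); db_emb: (batch, s_len, dim)
        abstract_q = torch.relu(self.project_q(question_emb))
        abstract_db = torch.relu(self.project_db(db_emb))
        # Concatenate and encode jointly to obtain the unified representation.
        unified = self.unifier(torch.cat([abstract_q, abstract_db], dim=1))
        return unified  # would be fed to a grammar-based decoder (see claim 5)

pipeline = TextToSQLPipeline()
q = torch.randn(1, 12, 256)   # toy question token embeddings
db = torch.randn(1, 20, 256)  # toy schema-item embeddings
print(pipeline(q, db).shape)  # torch.Size([1, 32, 256])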
2. The method of claim 1, wherein the database information comprises database information with domain information and database information with structure information;
the step of inputting the user question text and the corresponding database information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:
inputting the user question text, the database information with domain information, and the database information with structure information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation.
3. The method of claim 2, wherein inputting the user question text, the database information with the domain information, and the database information with the structure information into the graph projection neural network to obtain the abstract question representation and the abstract database information representation comprises:
obtaining an attention weight matrix according to the user question text and the database information with domain information;
updating the user question text according to the attention weight matrix and the abstract database information representation, and inputting the updated user question text into a second transformer to obtain the abstract question representation;
and updating the database information with domain information and the database information with structure information according to the attention weight matrix and the user question text, and inputting the updated database information into a graph convolutional network to obtain the abstract database representation.
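(A hedged Python sketch of the attention step of claim 3 follows; the single-head scaled dot-product attention, the function name projection_attention, and all tensor shapes are illustrative assumptions, not the patented formulation.)

# Illustrative single-head version of claim 3: derive an attention weight
# matrix between question tokens and domain-aware schema items, then use it
# (with the abstract schema representation) to update the question.
import torch

def projection_attention(question, schema_domain, schema_abstract):
    # question:        (q_len, dim)  user question token states
    # schema_domain:   (s_len, dim)  database info carrying domain information
    # schema_abstract: (s_len, dim)  abstract database representation
    scores = question @ schema_domain.T                        # (q_len, s_len)
    attn = torch.softmax(scores / question.shape[-1] ** 0.5, dim=-1)
    # Project the question onto the abstract schema space.
    updated_question = attn @ schema_abstract                  # (q_len, dim)
    return attn, updated_question

attn, uq = projection_attention(torch.randn(12, 64), torch.randn(20, 64), torch.randn(20, 64))
print(attn.shape, uq.shape)  # torch.Size([12, 20]) torch.Size([12, 64])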
4. The method of claim 3, wherein updating the database information with domain information and the database information with structure information based on the attention weight matrix and the user question text comprises:
determining a first text representation corresponding to the user question text from the perspective of the database information with domain information;
determining a second text representation corresponding to the user question text from the perspective of the database information with structure information;
updating the database information with domain information according to the attention weight matrix and the first text representation;
and updating the database information with structure information according to the attention weight matrix and the second text representation.
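(A companion sketch for the two-view update of claim 4, under the same assumptions as the previous sketch; the linear maps w_domain and w_structure are hypothetical stand-ins for deriving the first and second text representations.)

# Illustrative counterpart of claim 4: reuse the transposed attention matrix
# so each schema item aggregates the question tokens that attend to it,
# updating the domain view and the structure view separately.
import torch

def update_schema_views(attn, question, schema_domain, schema_structure,
                        w_domain, w_structure):
    # attn: (q_len, s_len); question: (q_len, dim)
    first_text = question @ w_domain        # question seen from the domain view
    second_text = question @ w_structure   # question seen from the structure view
    # Hypothetical residual updates of the two database-information views.
    new_domain = schema_domain + attn.T @ first_text
    new_structure = schema_structure + attn.T @ second_text
    return new_domain, new_structure

d, s = update_schema_views(
    torch.softmax(torch.randn(12, 20), dim=-1), torch.randn(12, 64),
    torch.randn(20, 64), torch.randn(20, 64),
    torch.randn(64, 64), torch.randn(64, 64))
print(d.shape, s.shape)  # torch.Size([20, 64]) torch.Size([20, 64])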
5. The method according to any one of claims 1-4, wherein determining the syntax tree structure corresponding to the unified information representation to obtain the structured query language corresponding to the user question text comprises:
expressing, in advance, the structured query language in the form of an abstract syntax tree by adopting an IRNet decoding mode;
determining the syntax tree structure corresponding to the unified information representation;
and determining the structured query language corresponding to the syntax tree structure as the structured query language corresponding to the user question text.
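(The syntax-tree decoding of claim 5 can be pictured with the toy Python sketch below; the three-rule grammar is a hypothetical stand-in for the SemQL/IRNet grammar, and a real decoder would predict each production rule and each schema pointer rather than reading them from a fixed table.)

# Toy grammar-based decoding: expand an abstract-syntax-tree root symbol
# through production rules, then print the finished tree back as SQL.
TOY_GRAMMAR = {
    "Query": ["SELECT", "Col", "FROM", "Tab"],
    "Col":   ["name"],    # terminal choices would be predicted by
    "Tab":   ["singer"],  # pointing into the database schema
}

def expand(symbol):
    """Depth-first expansion of one grammar symbol into terminal tokens."""
    if symbol not in TOY_GRAMMAR:  # terminal symbol
        return [symbol]
    tokens = []
    for child in TOY_GRAMMAR[symbol]:
        tokens.extend(expand(child))
    return tokens

print(" ".join(expand("Query")))  # SELECT name FROM singer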
6. An apparatus for converting text into a structured query language, comprising:
a projection layer program module for inputting a user question text and corresponding database information into a graph projection neural network to obtain an abstract question representation and an abstract database information representation;
a first transformer program module for inputting the abstract question representation and the abstract database information representation into the first transformer to obtain a unified information representation;
and a decoder program module for determining a syntax tree structure corresponding to the unified information representation so as to obtain a structured query language corresponding to the user question text.
7. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1-5.
8. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN202011502186.2A 2020-12-18 2020-12-18 Method and device for converting text into structured query language Active CN112487135B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011502186.2A CN112487135B (en) 2020-12-18 2020-12-18 Method and device for converting text into structured query language

Publications (2)

Publication Number Publication Date
CN112487135A CN112487135A (en) 2021-03-12
CN112487135B true CN112487135B (en) 2022-07-15

Family

ID=74914796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011502186.2A Active CN112487135B (en) 2020-12-18 2020-12-18 Method and device for converting text into structured query language

Country Status (1)

Country Link
CN (1) CN112487135B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11726750B1 (en) * 2021-11-17 2023-08-15 Outsystems—Software Em Rede, S.A. Constrained decoding and ranking of language models for code generation
CN115982336B (en) * 2023-02-15 2023-05-23 创意信息技术股份有限公司 Dynamic dialogue state diagram learning method, device, system and storage medium
CN116991877B (en) * 2023-09-25 2024-01-02 城云科技(中国)有限公司 Method, device and application for generating structured query statement
CN117591543B (en) * 2024-01-19 2024-04-02 成都工业学院 SQL sentence generation method and device for Chinese natural language

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109933602A (en) * 2019-02-28 2019-06-25 武汉大学 A kind of conversion method and device of natural language and structured query language
US20200134032A1 (en) * 2018-10-31 2020-04-30 Microsoft Technology Licensing, Llc Constructing structured database query language statements from natural language questions
CN111813802A (en) * 2020-09-11 2020-10-23 杭州量之智能科技有限公司 Method for generating structured query statement based on natural language

Similar Documents

Publication Publication Date Title
CN112487135B (en) Method and device for converting text into structured query language
Lin et al. Bridging textual and tabular data for cross-domain text-to-SQL semantic parsing
Shi et al. Learning contextual representations for semantic parsing with generation-augmented pre-training
Yin et al. Neural enquirer: Learning to query tables with natural language
Hui et al. Dynamic hybrid relation exploration network for cross-domain context-dependent semantic parsing
Cao et al. Semantic parsing with dual learning
US20220164626A1 (en) Automated merge conflict resolution with transformers
US20220308848A1 (en) Semi-supervised translation of source code programs using neural transformers
CN109933602B (en) Method and device for converting natural language and structured query language
CN111382574B (en) Semantic parsing system combining syntax under virtual reality and augmented reality scenes
CN111930906A (en) Knowledge graph question-answering method and device based on semantic block
US20220129450A1 (en) System and method for transferable natural language interface
CN115048447B (en) Database natural language interface system based on intelligent semantic completion
CN110084323A (en) End-to-end semanteme resolution system and training method
Luz et al. Semantic parsing natural language into SPARQL: improving target language representation with neural attention
CN115374270A (en) Legal text abstract generation method based on graph neural network
Jhunjhunwala et al. Multi-action dialog policy learning with interactive human teaching
CN116561251A (en) Natural language processing method
Huang et al. Relation aware semi-autoregressive semantic parsing for nl2sql
Cao et al. Improving and evaluating complex question answering over knowledge bases by constructing strongly supervised data
Sun et al. Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data
CN116661852B (en) Code searching method based on program dependency graph
CN116432637A (en) Multi-granularity extraction-generation hybrid abstract method based on reinforcement learning
CN114579605B (en) Table question-answer data processing method, electronic equipment and computer storage medium
Wang et al. Knowledge base question answering system based on knowledge graph representation learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Co.,Ltd.

GR01 Patent grant