CN114610752B - Reply generation and model training method, device and equipment based on table question answering - Google Patents

Reply generation and model training method, device and equipment based on table question answering

Info

Publication number
CN114610752B
CN114610752B (application CN202210501373.1A)
Authority
CN
China
Prior art keywords
graph structure
node
layer
reply
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210501373.1A
Other languages
Chinese (zh)
Other versions
CN114610752A (en)
Inventor
耿瑞莹
李亮
石翔
黎槟华
李永彬
孙健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202210501373.1A
Publication of CN114610752A
Application granted
Publication of CN114610752B
Legal status: Active

Classifications

    • G06F16/2433 — Physics; Computing; Electric digital data processing; Information retrieval of structured data, e.g. relational data; Querying; Query formulation; Query languages
    • G06F16/2282 — Physics; Computing; Electric digital data processing; Information retrieval of structured data, e.g. relational data; Indexing; Data structures therefor; Tablespace storage structures; Management thereof
    • G06F40/35 — Physics; Computing; Electric digital data processing; Handling natural language data; Semantic analysis; Discourse or dialogue representation

Abstract

The embodiment of the application provides a reply generation and model training method, device and equipment based on table question answering. In the embodiment of the application, the SQL statement of a user question is first converted into an abstract syntax tree and a graph structure of the table to be accessed by the SQL statement is constructed; the abstract syntax tree and the table graph structure are then merged into a unified heterogeneous graph structure; finally, a reply generation model trained on unified sample heterogeneous graph structures performs table question answering on the heterogeneous graph structure of the user question. This better bridges the semantic gap between the SQL statement and the table data and improves the accuracy of reply content based on table data.

Description

Reply generation and model training method, device and equipment based on table question answering
Technical Field
The application relates to the technical field of natural language processing, and in particular to a reply generation and model training method, device and equipment based on table question answering.
Background
Because table data has a clear structure and is easy to maintain, it is structured data commonly used across industries and is also an important answer source for intelligent dialogue systems, search engines, and the like. The advent of table question answering (TableQA) technology has made large-scale application of table data possible. TableQA converts a natural-language question into an SQL (Structured Query Language) statement and interacts with the table data directly through the SQL statement to obtain reply content based on the table data. Since the accuracy of that reply content directly affects the large-scale application of table data, improving it has become a research hotspot in the field of natural language processing.
Disclosure of Invention
Aspects of the present application provide a reply generation and model training method, device and equipment based on table question answering, so as to improve the accuracy of reply content based on table data.
The embodiment of the application provides a reply generation method based on table question answering, which comprises the following steps: converting a Structured Query Language (SQL) statement of a user question into an abstract syntax tree; constructing a graph structure of a target table to be accessed by the SQL statement, wherein the graph structure takes the words in the target table as nodes and reflects the association relations between the words in the target table; and creating connecting edges between the same nodes in the abstract syntax tree and the graph structure, so as to merge the abstract syntax tree and the graph structure into a heterogeneous graph structure;
and inputting the heterogeneous graph structure into a pre-trained reply generation model to generate reply contents for replying the user question based on the target form.
The embodiment of the present application further provides a model training method, including: obtaining a plurality of sample heterogeneous graph structures; and performing model training using the plurality of sample heterogeneous graph structures to obtain a reply generation model.
An embodiment of the present application further provides a reply generation device based on table question answering, including: a conversion module for converting the Structured Query Language (SQL) statement of a user question into an abstract syntax tree; a construction module for constructing a graph structure of a target table to be accessed by the SQL statement, the graph structure taking the words in the target table as nodes and reflecting the association relations between the words in the target table; a merging module for creating connecting edges between the same nodes in the abstract syntax tree and the graph structure, so as to merge them into a heterogeneous graph structure; and a generating module for inputting the heterogeneous graph structure into a pre-trained reply generation model and generating reply content for replying to the user question based on the target table.
The embodiment of the present application further provides a model training device, including: an acquisition module for acquiring a plurality of sample heterogeneous graph structures; and a training module for performing model training using the plurality of sample heterogeneous graph structures to obtain a reply generation model.
An embodiment of the present application further provides an electronic device, including: a memory and a processor; the memory is used for storing a computer program; the processor is coupled to the memory and executes the computer program to perform the steps in the reply generation method based on table question answering or in the model training method.
Embodiments of the present application also provide a computer storage medium storing a computer program which, when executed by a processor, causes the processor to implement the steps in the reply generation method based on table question answering or in the model training method.
In the embodiment of the application, the SQL statement of a user question is first converted into an abstract syntax tree and a graph structure of the table to be accessed by the SQL statement is constructed; the abstract syntax tree and the table graph structure are then merged into a unified heterogeneous graph structure; finally, a reply generation model trained on unified sample heterogeneous graph structures performs table question answering on the heterogeneous graph structure of the user question. This better bridges the semantic gap between the SQL statement and the table data and improves the accuracy of reply content based on table data.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic view of an application scenario to which the reply generation method based on table question answering provided in the embodiment of the present application is applied;
fig. 2 is a flowchart of a reply generation method based on table question answering according to an embodiment of the present application;
FIG. 3 is an exemplary heterogeneous graph structure;
FIG. 4 is a model structure of an exemplary reply generation model;
FIG. 5 is a flowchart of a model training method provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a reply generation apparatus based on table question answering according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
First, terms related to embodiments of the present application are described:
Graph: a graph is formed by a plurality of nodes and connecting edges (edges), each connecting two nodes, and is used to describe the association relations between different nodes.
Heterogeneous graph structure: a graph that includes multiple types of nodes or multiple types of connecting edges.
Graph Neural Networks (GNNs): neural networks that act directly on graph structures, broadly including the following categories: Graph Convolutional Networks (GCNs), Graph Attention Networks (GATs), Graph Autoencoders, Graph Generative Networks (GGNs), and Graph Spatial-Temporal Networks.
Currently, the advent of table question answering (TableQA) technology has made large-scale application of table data possible. Since the accuracy of reply content based on table data directly affects that large-scale application, improving it has become a research hotspot in the field of natural language processing. Therefore, the embodiment of the application provides a reply generation and model training method, device and equipment based on table question answering. In the embodiment of the application, the SQL statement of a user question is first converted into an abstract syntax tree and a graph structure of the table to be accessed by the SQL statement is constructed; the abstract syntax tree and the table graph structure are then merged into a unified heterogeneous graph structure; finally, a reply generation model trained on unified sample heterogeneous graph structures performs table question answering on the heterogeneous graph structure of the user question. This better bridges the semantic gap between the SQL statement and the table data and improves the accuracy of reply content based on table data.
Fig. 1 is a schematic view of an application scenario in which the reply generation method based on table question answering provided in the embodiment of the present application is applied. In the intelligent dialogue scenario shown in fig. 1, the conversation robot supports intelligent conversation based on Natural Language Processing (NLP). Specifically, the conversation robot first receives a user voice containing a user question; it then parses the user question in text form from the voice and converts it into an SQL query; next, it executes the SQL query to interact with the database, generates reply content for replying to the user question based on the table data retrieved from the database, and finally displays the reply content for the user to view. Conversation robots include, for example, but are not limited to: various terminal devices such as mobile phones, tablet computers, wearable devices, and vehicle-mounted devices; in fig. 1 the conversation robot is shown as a mobile phone.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 2 is a flowchart of a reply generation method based on table question answering according to an embodiment of the present application. The method may be performed by a reply generation apparatus based on table question answering, which may be implemented by means of software and/or hardware and may generally be integrated in an electronic device. Referring to fig. 2, the method may include the following steps:
201. Convert the Structured Query Language (SQL) statement of the user question into an abstract syntax tree.
202. Construct a graph structure of the target table to be accessed by the SQL statement, where the graph structure takes the words in the target table as nodes and reflects the association relations between the words in the target table.
203. Create connecting edges between the same nodes in the abstract syntax tree and the graph structure, so as to merge the abstract syntax tree and the graph structure into a heterogeneous graph structure.
204. Input the heterogeneous graph structure into a pre-trained reply generation model to generate reply content for replying to the user question based on the target table.
In this embodiment, a user question is a question posed by a user in natural-language form that reflects a user requirement. In practice, the user may pose the question by voice interaction or enter it as text; this is not limited here. After the user question is obtained, Text-to-SQL technology can be used to convert the natural-language user question into a corresponding SQL query statement; Text-to-SQL aims to convert a natural-language description into the corresponding SQL query statement and assist people in querying massive databases. Specifically, semantic parsing from natural language processing is applied to the user question to obtain a semantic parsing result, and the SQL statement corresponding to the user question is generated from that result.
After the SQL statement corresponding to the user question is obtained, syntax analysis is performed on it to convert it into an Abstract Syntax Tree (AST). The abstract syntax tree is a graph structure that takes the words in the SQL statement as nodes and reflects the association relations among those words.
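As an illustration only, not the patented parser, the flattening of an abstract syntax tree into word nodes and connecting edges can be sketched in Python. The query, the node labels, and the helper `ast_to_graph` are all hypothetical:

```python
def ast_to_graph(node, parent=None, nodes=None, edges=None):
    """Flatten a nested (label, children) AST into node and edge lists."""
    if nodes is None:
        nodes, edges = [], []
    label, children = node
    nodes.append(label)
    if parent is not None:
        edges.append((parent, label))       # parent-child connecting edge
    for child in children:
        ast_to_graph(child, label, nodes, edges)
    return nodes, edges

# Hand-built AST for an illustrative query:
#   SELECT market_value FROM company WHERE name = 'A'
ast = ("select",
       [("column", [("market_value", [])]),
        ("from", [("company", [])]),
        ("where", [("=", [("name", []), ("A", [])])])])

nodes, edges = ast_to_graph(ast)
```

In practice a real SQL parser would produce the nested tree; only the flattening into a word-node graph is shown here.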
In this embodiment, in addition to converting the SQL statement corresponding to the user question into the abstract syntax tree, a graph structure of the target table to be accessed by the SQL statement needs to be constructed. The graph structure of the target table takes the words in the target table as nodes and reflects the association relations between the words in the target table. Note that the target table includes a plurality of cells, and the words in each cell are abstracted into one node in the graph structure. Therefore, when constructing the graph structure of the target table to be accessed by the SQL statement, the words in each cell of the target table may be abstracted into the nodes of the graph structure; every two words with an association relation are determined according to the attribute information of each word and/or the position information of each word in the target table; and, for every such pair of words, a connecting edge is created in the graph structure between the two corresponding nodes.
In this embodiment, based on the position information of each word in the target table, it can be determined which words come from cells in the header, which come from cells in the table regions other than the header, which words belong to the same column, which belong to the same row, and so on; connecting edges between nodes are created from this information. For example, within a column, the word in each cell other than the header cell has an association relation with the word in the header, and the words in two vertically adjacent cells in the column have an association relation. Within a row, association relations exist between the words corresponding to the cells of that row. Taking the company market value table shown in fig. 3 as an example: each company node in the same column has an association relation with the company-name header node, and a corresponding connecting edge is created; corresponding connecting edges are created between two adjacent nodes in the same column; and corresponding connecting edges are created between two adjacent nodes in the same row.
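The position-based edge rules can be sketched as follows; the helper `table_to_graph` and the sample table contents are illustrative assumptions, and attribute-based edges are omitted here:

```python
def table_to_graph(table):
    """table: list of rows; row 0 is the header. Returns (nodes, edges).
    Edges follow the position rules: header-to-column-cell, vertically
    adjacent cells in a column, and horizontally adjacent cells in a row."""
    nodes = [word for row in table for word in row]
    edges = set()
    n_rows, n_cols = len(table), len(table[0])
    for c in range(n_cols):
        for r in range(1, n_rows):
            edges.add((table[0][c], table[r][c]))        # header <-> cell
            if r + 1 < n_rows:
                edges.add((table[r][c], table[r + 1][c]))  # adjacent in column
    for r in range(n_rows):
        for c in range(n_cols - 1):
            edges.add((table[r][c], table[r][c + 1]))      # same row
    return nodes, edges

table = [["company_name", "market_value"],
         ["A", "100"],
         ["D", "40"]]
nodes, edges = table_to_graph(table)
```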
In this embodiment, the attribute information of a word refers to the attribute information of the data object recorded in the target table. The data objects and their attributes differ across application scenarios, and a data object may have one or more attributes. Likewise, when determining association relations between words based on attribute information, different attributes have their own adapted determination policies, which depend on the actual application requirements. Taking the company market value table shown in fig. 3 as an example, the attribute information of a company includes, but is not limited to: which subsidiaries the company owns, which country it belongs to, its company type, and so on. Based on this attribute information, company D is determined to be a subsidiary of company A, so an association relation exists between the company A node and the company D node and a corresponding connecting edge is created; a connecting edge is also created between the company A node and the market value node corresponding to the company D node.
It is worth noting that there may be more than one connecting edge between two nodes; in particular, when multiple kinds of attribute information exist, two nodes may have association relations related to several attributes, and each attribute-related association relation corresponds to one connecting edge.
In this embodiment, the abstract syntax tree of the SQL statement corresponding to the user question and the graph structure of the target table are merged into a unified heterogeneous graph structure. Specifically, connecting edges are created between the same nodes in the abstract syntax tree and the graph structure to merge the two into a heterogeneous graph structure. Referring to fig. 3, the abstract syntax tree in fig. 3 is that of the SQL statement in fig. 1, and the graph structure of the target table in fig. 3 corresponds to the company market value table in fig. 1. Both the abstract syntax tree and the graph structure of the company market value table contain company-name nodes and market value nodes; a new connecting edge is created between the company-name node in the abstract syntax tree and the company-name node in the table graph, and another between the market value node in the abstract syntax tree and the market value node in the table graph, thereby merging the abstract syntax tree and the table graph into a new heterogeneous graph structure.
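A minimal sketch of this merging step; the helper `merge_graphs` and the sample node/edge lists are hypothetical. Nodes are tagged with their source so the merged graph remains heterogeneous:

```python
def merge_graphs(ast_nodes, ast_edges, tbl_nodes, tbl_edges):
    """Merge an AST graph and a table graph by adding a connecting edge
    between every pair of identically-worded nodes."""
    nodes = [("ast", w) for w in ast_nodes] + [("tbl", w) for w in tbl_nodes]
    edges = ([(("ast", a), ("ast", b)) for a, b in ast_edges] +
             [(("tbl", a), ("tbl", b)) for a, b in tbl_edges])
    for w in set(ast_nodes) & set(tbl_nodes):
        edges.append((("ast", w), ("tbl", w)))   # new cross-graph edge
    return nodes, edges

# Illustrative inputs, loosely modelled on the fig. 3 example
ast_nodes = ["select", "company_name", "market_value"]
ast_edges = [("select", "company_name"), ("select", "market_value")]
tbl_nodes = ["company_name", "market_value", "A", "100"]
tbl_edges = [("company_name", "A"), ("market_value", "100")]
nodes, edges = merge_graphs(ast_nodes, ast_edges, tbl_nodes, tbl_edges)
```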
After the heterogeneous graph structure corresponding to the user question is obtained, it is input into a pre-trained reply generation model to generate reply content for replying to the user question based on the target table. Referring to fig. 1, the conversation robot generates reply content based on the company market value table and displays it on a display interface so that the user can view it intuitively.
When training the reply generation model, a plurality of sample heterogeneous graph structures are first prepared. Each sample heterogeneous graph structure is obtained by merging the abstract syntax tree corresponding to a sample SQL statement with the graph structure of a sample table, where the sample table is the table accessed by that sample SQL statement. Then a graph neural network is trained using the plurality of sample heterogeneous graph structures, and the trained graph neural network serves as the reply generation model. For the training of the graph neural network, see the related art.
The reply generation method based on table question answering provided by the embodiment of the application first converts the SQL statement of a user question into an abstract syntax tree and constructs a graph structure of the table to be accessed by the SQL statement, then merges the abstract syntax tree and the table graph structure into a unified heterogeneous graph structure, and finally performs table question answering on the heterogeneous graph structure of the user question using a reply generation model trained on unified sample heterogeneous graph structures. This better bridges the semantic gap between the SQL statement and the table data and improves the accuracy of reply content based on table data.
The embodiment of the present application does not limit the network structure of the reply generation model. As an example, the reply generation model includes a plurality of neural network layers, such as an encoding layer for the encoding (Encode) process, a graph attention layer for the attention mechanism, and a decoding layer for the decoding (Decode) process. To some extent, the reply generation model is essentially a Transformer model with an Encoder-Decoder structure. Thus, a specific implementation of step 204 is: input the node sequence of the heterogeneous graph structure into the encoding layer for encoding to obtain the initial vector representation of each node in the node sequence; input the initial vector representation of each node into the graph attention layer to obtain the first intermediate vector representation of each node; fuse the initial vector representation and the first intermediate vector representation of each node to obtain the final vector representation of each node; and input the final vector representations of the nodes into the decoding layer for decoding to obtain the reply content.
Specifically, the nodes in the heterogeneous graph structure are traversed in order starting from its root node, and the node sequence of the heterogeneous graph structure is generated from the traversal order. Note that each node in the heterogeneous graph structure corresponds to a word, so the node sequence can be regarded as a word sequence composed of a plurality of words.
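The traversal can be sketched as a depth-first preorder walk; the adjacency map and the helper `node_sequence` are illustrative assumptions (the patent does not fix a traversal order beyond starting at the root):

```python
def node_sequence(root, adjacency):
    """Depth-first preorder traversal from the root; the resulting word
    sequence is what gets fed to the encoding layer."""
    seq, stack, seen = [], [root], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        seq.append(node)
        # push children reversed so the first child is visited first
        stack.extend(reversed(adjacency.get(node, [])))
    return seq

adjacency = {"select": ["column", "where"],
             "column": ["market_value"],
             "where": ["="]}
seq = node_sequence("select", adjacency)
```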
After the node sequence of the heterogeneous graph structure is obtained by traversal, referring to fig. 4, the word embedding of each node in the node sequence is first input into the encoding layer for encoding, yielding the initial vector representation of each node. Then, the initial vector representation of each node is input into the graph attention layer for attention-based processing to obtain the first intermediate vector representation of each node, and the initial and first intermediate vector representations of each node are fused to obtain its final vector representation. Here, the fusion may be a vector addition operation, which is what the addition symbol in fig. 4 denotes. Finally, the final vector representations of the nodes are input into the decoding layer for decoding, and the decoding result is the reply content for replying to the user question based on table question answering.
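A toy numeric sketch of the attention-then-fuse step, with no learned weights (a real graph attention layer uses trained projection matrices); the vectors and adjacency below are made up for illustration:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def graph_attention(vectors, neighbors):
    """Toy attention: each node attends to its graph neighbours with plain
    dot-product scores."""
    out = []
    for i, v in enumerate(vectors):
        idx = neighbors[i]
        weights = softmax([sum(a * b for a, b in zip(v, vectors[j]))
                           for j in idx])
        out.append([sum(w * vectors[j][d] for w, j in zip(weights, idx))
                    for d in range(len(v))])
    return out

def fuse(initial, intermediate):
    """The fusion step described above: element-wise vector addition."""
    return [[a + b for a, b in zip(u, v)]
            for u, v in zip(initial, intermediate)]

initial = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # initial vector representations
neighbors = [[1, 2], [0, 2], [0, 1]]             # adjacency from the graph
final = fuse(initial, graph_attention(initial, neighbors))
```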
Further optionally, in order to further improve the accuracy with which the reply generation model replies to user questions based on table question answering, referring to fig. 4, an embedding layer may be added to the reply generation model. The embedding layer is a neural network connected between the encoding layer and the graph attention layer, used to fuse each node's initial vector representation, node type vector, and source identification vector.
In this embodiment, the node type vector is a vectorized representation of the node type, and the source identification vector is a vectorized representation of the source identifier. The node type and source identifier of any node can be obtained during traversal of the heterogeneous graph structure.
In this embodiment, node types include, but are not limited to: the abstract syntax tree type, the header type, and the non-header type. Nodes of the abstract syntax tree type are nodes on the abstract syntax tree. Nodes of the header type are nodes on the graph structure corresponding to the table whose words come from cells in the table header. Nodes of the non-header type are nodes on the graph structure corresponding to the table whose words come from cells in rows other than the header. In fig. 4, in the dashed box corresponding to the node type vector, S denotes the abstract syntax tree type, H the header type, and T the non-header type.
In this embodiment, the source identifier identifies the source of a node, i.e., whether a node of the heterogeneous graph structure comes from the abstract syntax tree or from the graph structure of the target table. In fig. 4, in the dashed box corresponding to the source identification vector, each number identifies the source of a node; which nodes belong to the abstract syntax tree and which belong to the target table is known information, so the source of a node can be determined from its number.
Then, further optionally, the first intermediate vector representation of each node may be obtained as follows: obtain the node type vector and source identification vector of each node in the heterogeneous graph structure; input each node's initial vector representation, node type vector, and source identification vector into the embedding layer for fusion to obtain its second intermediate vector representation; and input each node's second intermediate vector representation into the graph attention layer to obtain its first intermediate vector representation.
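The embedding layer's three-way fusion can be sketched as a per-dimension sum; the lookup tables below are illustrative constants, not learned embeddings:

```python
# Illustrative lookup tables: S = abstract syntax tree type, H = header
# type, T = non-header type; sources are the AST or the target table.
NODE_TYPE = {"S": [1.0, 0.0], "H": [0.0, 1.0], "T": [0.5, 0.5]}
SOURCE    = {"ast": [0.1, 0.0], "table": [0.0, 0.1]}

def embed(initial, node_type, source):
    """Fuse the initial vector representation, node type vector, and source
    identification vector into the second intermediate representation."""
    return [i + t + s for i, t, s in
            zip(initial, NODE_TYPE[node_type], SOURCE[source])]

second = embed([0.2, 0.3], "S", "ast")
```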
As a further option, to further improve the model performance of the reply generation model, referring to fig. 4, the reply generation model also includes Layer Normalization connected between the encoding layer and the embedding layer. In that case, before the initial vector representation, node type vector, and source identification vector of each node are input into the embedding layer for fusion to obtain the second intermediate vector representation, the initial vector representation of each node is input into the layer normalization for processing.
In order to better understand the reply generation method based on table question answering provided by the application, several scenario embodiments are introduced below.
Scenario example 1:
The conversation robot is a robot that supports voice interaction for logistics queries. When a user needs a logistics query, the user speaks a sentence such as "I want to check the logistics status of a certain express package" to the conversation robot. The conversation robot receives and parses the user's voice, determines the user question in text form, and converts it into an SQL query statement based on Text-to-SQL technology. The conversation robot sends the SQL query statement to a cloud server. The cloud server converts the SQL query statement into an abstract syntax tree and constructs the graph structure of the logistics table to be accessed by the SQL query statement, the logistics table recording the logistics status of each express package. The cloud server merges the abstract syntax tree and the logistics table graph structure into a heterogeneous graph structure, inputs it into the reply generation model to obtain the logistics query result, and returns the result to the conversation robot. The conversation robot announces the logistics query result by voice broadcast or displays it as text so that the user can view it intuitively.
Scenario example 2:
the conversation robot is a robot that supports intelligent shopping guidance. When a user has a shopping demand, the user speaks a voice input to the conversation robot, such as "I want to buy clothes of a certain brand". The conversation robot receives and parses the user's voice to determine the user question in text form, converts the user question into an SQL query statement based on Text-to-SQL technology, converts the SQL query statement into an abstract syntax tree, and constructs a graph structure of the shopping guide table to be accessed by the SQL query statement, where the shopping guide table records the commodity information of each shop. The conversation robot merges the abstract syntax tree and the graph structure of the shopping guide table to obtain a heterogeneous graph structure, and inputs the heterogeneous graph structure into the reply generation model for processing to obtain a shopping guide query result. The conversation robot broadcasts the shopping guide query result by voice or displays it as text, so that the user can intuitively learn the shopping guide query result.
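The Text-to-SQL step in both scenarios produces an SQL statement that is then turned into an abstract syntax tree. A hand-built toy tree is shown below; the query, node labels, and tree shape are hypothetical, chosen only to match Scenario 1, and are not the grammar used by the patented system:

```python
from dataclasses import dataclass, field

@dataclass
class AstNode:
    # One node of the abstract syntax tree; label holds a keyword,
    # column, table, or literal word.
    label: str
    children: list = field(default_factory=list)

def sql_to_ast():
    # Hand-built tree for a hypothetical query matching Scenario 1:
    #   SELECT status FROM logistics WHERE package = 'X123'
    condition = AstNode("=", [AstNode("package"), AstNode("'X123'")])
    return AstNode("SELECT", [
        AstNode("status"),
        AstNode("FROM", [AstNode("logistics")]),
        AstNode("WHERE", [condition]),
    ])

def count_nodes(node):
    # Total number of nodes in the tree.
    return 1 + sum(count_nodes(c) for c in node.children)

ast = sql_to_ast()
```

In practice a real SQL parser would produce this tree; the hand-built version only illustrates the node-and-children shape the later graph-merging step operates on.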
Fig. 5 is a flowchart of a model training method according to an embodiment of the present application. Referring to fig. 5, the model training method may include the steps of:
501. Acquire a plurality of sample heterogeneous graph structures.
502. Perform model training using the plurality of sample heterogeneous graph structures to obtain a reply generation model.
Further optionally, the reply generation model comprises: an encoding layer, a graph attention layer, and a decoding layer.
Further optionally, the reply generation model further includes: an embedding layer connected between the coding layer and the graph attention layer.
Further optionally, the reply generation model further includes: layer normalization connected between the coding layer and the embedding layer.
The model training method and the components of the reply generation model have been described in the foregoing method embodiments and are not repeated here.
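Steps 501 and 502 amount to a supervised loop over (sample heterogeneous graph, reference reply) pairs. The stub model and its loss below are placeholders, since the patent does not fix a particular network or training objective at this level of description:

```python
class StubReplyModel:
    # Stand-in for the graph neural network; the real encoding,
    # graph attention, and decoding layers are described in the
    # foregoing method embodiments.
    def loss(self, graph, reference_reply):
        # Placeholder loss; a real objective would compare the
        # decoded reply against the reference reply.
        return 1.0 / (1 + len(graph["nodes"]))

def train(samples, model, epochs=2):
    # Step 501 supplies `samples`; step 502 is this loop over
    # (sample heterogeneous graph, reference reply) pairs.
    # The optimizer update is elided.
    losses = []
    for _ in range(epochs):
        for graph, reference in samples:
            losses.append(model.loss(graph, reference))
    return losses

samples = [({"nodes": ["SELECT", "status", "logistics"]},
            "Your package is in transit.")]
history = train(samples, StubReplyModel())
```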
Fig. 6 is a schematic structural diagram of a reply generation device based on a form question and answer according to an embodiment of the present application. Referring to fig. 6, the apparatus may include the following modules:
the conversion module 61 is used for converting the structured query language SQL statement of the user question into an abstract syntax tree;
the building module 62 is configured to build a graph structure of a target table to be accessed by the SQL statement, where the graph structure takes words in the target table as nodes and reflects an association relationship between the words in the target table;
a merging module 63, configured to create a connecting edge between the same nodes in the abstract syntax tree and the graph structure, so as to merge the abstract syntax tree and the graph structure to obtain a heterogeneous graph structure;
and the generating module 64 is configured to input the heterogeneous graph structure into a pre-trained reply generation model and generate reply content for replying to the user question based on the target table.
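The merging module's operation can be sketched as follows, under two stated assumptions: nodes carry a source tag so the merged graph stays heterogeneous, and "the same nodes" means nodes holding the same word:

```python
def merge_graphs(ast_nodes, ast_edges, table_nodes, table_edges):
    # Keep a source tag ("ast" / "table") on every node; create a
    # connecting edge between the two copies of every word that
    # appears in both the syntax tree and the table graph.
    nodes = [("ast", w) for w in ast_nodes] + \
            [("table", w) for w in table_nodes]
    edges = [(("ast", a), ("ast", b)) for a, b in ast_edges]
    edges += [(("table", a), ("table", b)) for a, b in table_edges]
    for word in sorted(set(ast_nodes) & set(table_nodes)):
        edges.append((("ast", word), ("table", word)))
    return nodes, edges

nodes, edges = merge_graphs(
    ast_nodes=["SELECT", "status", "logistics"],
    ast_edges=[("SELECT", "status"), ("SELECT", "logistics")],
    table_nodes=["status", "in transit"],
    table_edges=[("status", "in transit")],
)
```

The shared word "status" yields the single cross-graph connecting edge; everything else keeps its original within-graph edges.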
Further optionally, the reply generation model includes: an encoding layer, a graph attention layer, and a decoding layer. When the generation module 64 inputs the heterogeneous graph structure into the reply generation model to generate reply content for replying to the user question based on the target table, the generation module is specifically configured to: input the node sequence of the heterogeneous graph structure into the encoding layer for encoding processing to obtain an initial vector representation of each node in the node sequence; input the initial vector representation of each node into the graph attention layer for processing to obtain a first intermediate vector representation of each node; perform fusion processing on the initial vector representation and the first intermediate vector representation of each node to obtain a final vector representation of each node; and input the final vector representations of the plurality of nodes into the decoding layer for decoding to obtain the reply content.
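The attention-and-fusion steps above can be sketched with a simplified single-head graph attention pass. The dot-product scoring and the residual addition used for the fusion step are assumptions; the patent does not fix the exact attention mechanism or fusion operation:

```python
import numpy as np

def graph_attention(h, adj):
    # Simplified single-head attention: dot-product scores, masked
    # to the graph's edges (self-loops included), softmax-normalized
    # row by row, then used to mix neighbour representations.
    scores = h @ h.T
    scores = np.where(adj > 0, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ h

rng = np.random.default_rng(0)
h0 = rng.normal(size=(3, 4))      # initial vector representations
adj = np.array([[1, 1, 0],        # adjacency with self-loops
                [1, 1, 1],
                [0, 1, 1]])
h1 = graph_attention(h0, adj)     # first intermediate representations
h_final = h0 + h1                 # residual fusion of the two
```

A production model would add learned projection matrices, multiple heads, and a trained decoder on top of `h_final`.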
Further optionally, the reply generation model further includes: an embedding layer connected between the coding layer and the graph attention layer; the generating module 64 inputs the initial vector representation of each node to the graph attention layer for processing, and when obtaining the first intermediate vector representation of each node, is specifically configured to: acquiring a node type vector and a source identification vector of each node in the heterogeneous graph structure; inputting the initial vector representation, the node type vector and the source identification vector of each node into an embedding layer for fusion processing to obtain a second intermediate vector representation of each node; and inputting the second intermediate vector representation of each node into the graph attention layer for processing to obtain the first intermediate vector representation of each node.
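One plausible reading of the embedding-layer fusion is an element-wise sum of the three vectors, analogous to how token, segment, and position embeddings are combined in Transformer-style encoders. The lookup tables, type names, and values below are illustrative assumptions:

```python
import numpy as np

dim = 4
# Hypothetical lookup tables: one embedding per node type and one
# per source ("ast" vs "table"); values are illustrative only.
type_emb = {"keyword": np.ones(dim), "cell": np.zeros(dim)}
source_emb = {"ast": np.full(dim, 0.5), "table": np.full(dim, -0.5)}

def fuse(initial_vec, node_type, source):
    # Element-wise sum of the initial vector representation, the
    # node type vector, and the source identification vector to
    # produce the second intermediate vector representation.
    return initial_vec + type_emb[node_type] + source_emb[source]

v = fuse(np.array([1.0, 2.0, 3.0, 4.0]), "keyword", "ast")
```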
Further optionally, the reply generation model further includes: layer normalization connected between the coding layer and the embedding layer; before the generating module 64 inputs the initial vector representation, the node type vector, and the source identification vector of each node into the embedding layer for fusion processing to obtain the second intermediate vector representation of each node, the generating module is further configured to: input the initial vector representation of each node into the layer normalization layer for normalization processing.
Further optionally, when the building module 62 builds the graph structure of the target table to be accessed by the SQL statement, the building module is specifically configured to: abstracting words included in each cell in the target table into each node in the graph structure; determining every two words with association relation according to the position information of each word in the target table; for every two words, a connecting edge between two nodes corresponding to the two words is created in the graph structure.
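The building module's three steps can be sketched as follows, assuming position-based association means "same row or same column", which is consistent with, but not stated verbatim in, the text:

```python
def build_table_graph(table):
    # table: list of rows, each row a list of cell words. Each cell
    # word becomes a node; words in the same row or the same column
    # are treated as associated and get a connecting edge.
    positions = {(r, c): w for r, row in enumerate(table)
                 for c, w in enumerate(row)}
    nodes = list(positions.values())
    edges = set()
    for p1, w1 in positions.items():
        for p2, w2 in positions.items():
            if p1 < p2 and (p1[0] == p2[0] or p1[1] == p2[1]):
                # Key edges by (position, word) so duplicate words
                # in different cells stay distinct nodes.
                edges.add(((p1, w1), (p2, w2)))
    return nodes, edges

nodes, edges = build_table_graph([["package", "status"],
                                  ["X123", "in transit"]])
```

For the 2x2 toy table this yields four nodes and four edges: two row edges and two column edges.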
Further optionally, before the building module 62 creates a connecting edge between two nodes corresponding to the two words in the graph structure, the building module is further configured to: and determining every two words with the association relation according to the attribute information of each word.
Further optionally, the apparatus further comprises: an acquisition module, configured to acquire a plurality of sample heterogeneous graph structures; and a training module, configured to perform model training using the plurality of sample heterogeneous graph structures to obtain a reply generation model.
The reply generation device based on the form question and answer shown in fig. 6 may execute the reply generation method based on the form question and answer shown in the embodiment shown in fig. 2, and the implementation principle and the technical effect thereof are not described again. The specific manner in which each module and unit of the apparatus shown in fig. 6 in the above-described embodiment perform operations has been described in detail in the embodiment related to the method, and will not be described in detail herein.
Fig. 7 is a schematic structural diagram of a model training device according to an embodiment of the present application. Referring to fig. 7, the apparatus may include the following modules:
an obtaining module 70, configured to obtain a plurality of sample heterogeneous graph structures.
And the training module 71 is configured to perform model training using the plurality of sample heterogeneous graph structures to obtain a reply generation model.
Further optionally, the reply generation model includes: an encoding layer, a graph attention layer, and a decoding layer.
Further optionally, the reply generation model further includes: an embedding layer connected between the coding layer and the graph attention layer.
Further optionally, the reply generation model further includes: layer normalization connected between the coding layer and the embedding layer.
The model training apparatus shown in fig. 7 may execute the model training method of the embodiment shown in fig. 5, and the implementation principle and the technical effect are not repeated. The specific manner in which each module and unit of the apparatus shown in fig. 7 in the above-described embodiment perform operations has been described in detail in the embodiment related to the method, and will not be described in detail herein.
It should be noted that the execution subjects of the steps of the methods provided in the above embodiments may be the same device, or different devices may be used as the execution subjects of the methods. For example, the execution subjects of step 201 to step 204 may be device a; for another example, the execution subject of steps 201 and 202 may be device a, and the execution subject of steps 203 and 204 may be device B; and so on.
In addition, in some of the flows described in the above embodiments and the drawings, a plurality of operations occurring in a specific order are included, but it should be clearly understood that these operations may be executed out of order or in parallel as they appear herein, and the sequence numbers of the operations, such as 201, 202, etc., are used merely to distinguish various operations, and the sequence numbers themselves do not represent any execution order. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first", "second", etc. in this document are used for distinguishing different messages, devices, modules, etc., and do not represent a sequential order, nor limit the types of "first" and "second" to be different.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 8, the electronic apparatus includes: a memory 81 and a processor 82;
memory 81 is used to store computer programs and may be configured to store other various data to support operations on the computing platform. Examples of such data include instructions for any application or method operating on the computing platform, contact data, phonebook data, messages, pictures, videos, and so forth.
The memory 81 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
A processor 82, coupled to the memory 81, for executing the computer program in the memory 81 to: convert a Structured Query Language (SQL) statement of a user question into an abstract syntax tree; construct a graph structure of a target table to be accessed by the SQL statement, where the graph structure takes words in the target table as nodes and reflects the association relation between the words in the target table; create connecting edges between the same nodes in the abstract syntax tree and the graph structure, so as to merge the abstract syntax tree and the graph structure to obtain a heterogeneous graph structure; and input the heterogeneous graph structure into a pre-trained reply generation model to generate reply content for replying to the user question based on the target table.
Further optionally, the reply generation model includes: an encoding layer, a graph attention layer, and a decoding layer. When the processor 82 inputs the heterogeneous graph structure into the reply generation model to generate reply content for replying to the user question based on the target table, the processor is specifically configured to: input the node sequence of the heterogeneous graph structure into the encoding layer for encoding processing to obtain an initial vector representation of each node in the node sequence; input the initial vector representation of each node into the graph attention layer for processing to obtain a first intermediate vector representation of each node; perform fusion processing on the initial vector representation and the first intermediate vector representation of each node to obtain a final vector representation of each node; and input the final vector representations of the nodes into the decoding layer for decoding to obtain the reply content.
Further optionally, the reply generation model further includes: an embedding layer connected between the coding layer and the graph attention layer; the processor 82 inputs the initial vector representation of each node to the graph attention layer for processing, and when obtaining the first intermediate vector representation of each node, is specifically configured to: acquiring a node type vector and a source identification vector of each node in the heterogeneous graph structure; inputting the initial vector representation, the node type vector and the source identification vector of each node into an embedding layer for fusion processing to obtain a second intermediate vector representation of each node; and inputting the second intermediate vector representation of each node into the graph attention layer for processing to obtain the first intermediate vector representation of each node.
Further optionally, the reply generation model further includes: layer normalization connected between the coding layer and the embedding layer; before the processor 82 inputs the initial vector representation, the node type vector, and the source identification vector of each node into the embedding layer for fusion processing to obtain the second intermediate vector representation of each node, the processor is further configured to: input the initial vector representation of each node into the layer normalization layer for normalization processing.
Further optionally, when the processor 82 constructs the graph structure of the target table to be accessed by the SQL statement, the processor is specifically configured to: abstracting words included in each cell in the target table into each node in the graph structure; determining every two words with association relation according to the position information of each word in the target table; for every two words, a connecting edge between two nodes corresponding to the two words is created in the graph structure.
Further optionally, before creating a connecting edge between two nodes corresponding to the two words in the graph structure, the processor 82 is further configured to: and determining every two words with the association relation according to the attribute information of each word.
Further optionally, the processor 82 is further configured to: acquire a plurality of sample heterogeneous graph structures; and perform model training using the plurality of sample heterogeneous graph structures to obtain a reply generation model.
Further, as shown in fig. 8, the electronic device further includes: communication components 83, display 84, power components 85, audio components 86, and the like. Only some of the components are schematically shown in fig. 8, and the electronic device is not meant to include only the components shown in fig. 8. In addition, the components within the dashed line frame in fig. 8 are optional components, not necessary components, and may be determined according to the product form of the electronic device. The electronic device of this embodiment may be implemented as a terminal device such as a desktop computer, a notebook computer, a smart phone, or an IOT device, or may be a server device such as a conventional server, a cloud server, or a server array. If the electronic device of this embodiment is implemented as a terminal device such as a desktop computer, a notebook computer, or a smart phone, the electronic device may include components within a dashed line frame in fig. 8; if the electronic device of this embodiment is implemented as a server device such as a conventional server, a cloud server, or a server array, the components in the dashed box in fig. 8 may not be included.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program, where the computer program is capable of implementing the steps that can be executed by the electronic device in the foregoing method embodiments when executed.
Accordingly, embodiments of the present application also provide a computer program product, which includes a computer program/instruction, when the computer program/instruction is executed by a processor, the processor is enabled to implement the steps that can be executed by an electronic device in the foregoing method embodiments.
The communication component is configured to facilitate wired or wireless communication between the device in which the communication component is located and other devices. The device where the communication component is located can access a wireless network based on a communication standard, such as a WiFi, a 2G, 3G, 4G/LTE, 5G and other mobile communication networks, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The display includes a screen, which may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The power supply assembly provides power for various components of the device in which the power supply assembly is located. The power components may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which the power component is located.
The audio component may be configured to output and/or input an audio signal. For example, the audio component includes a Microphone (MIC) configured to receive an external audio signal when the device in which the audio component is located is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in a memory or transmitted via a communication component. In some embodiments, the audio assembly further comprises a speaker for outputting audio signals.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (12)

1. A reply generation method based on table question answering is characterized by comprising the following steps:
converting a Structured Query Language (SQL) statement of a user question into an abstract syntax tree;
constructing a graph structure of a target table to be accessed by the SQL statement, wherein the graph structure takes words in the target table as nodes and reflects the incidence relation between the words in the target table;
creating connecting edges between the abstract syntax tree and the same nodes in the graph structure so as to combine the abstract syntax tree and the graph structure to obtain a heterogeneous graph structure;
inputting the heterogeneous graph structure into a pre-trained reply generation model to generate reply contents for replying the user question based on the target form; wherein, the reply generation model is obtained by training according to the model training method of claim 6;
wherein constructing the graph structure of the target table to be accessed by the SQL statement comprises: abstracting words included in each cell in the target table into nodes in the graph structure; determining every two words having an association relation according to the position information of each word in the target table; and for every two such words, creating a connecting edge between the two nodes corresponding to the two words in the graph structure.
2. The method of claim 1, wherein the reply generation model comprises: an encoding layer, a graph attention layer, and a decoding layer; and inputting the heterogeneous graph structure into the pre-trained reply generation model to generate reply content for replying to the user question based on the target table comprises the following steps:
inputting the node sequence of the heterogeneous graph structure into a coding layer for coding processing to obtain initial vector representation of each node in the node sequence;
inputting the initial vector representation of each node into a graph attention layer for processing to obtain a first intermediate vector representation of each node;
carrying out fusion processing on the initial vector representation and the first intermediate vector representation of each node to obtain final vector representation of each node;
and inputting the final vector representations of the nodes into a decoding layer for decoding processing to obtain the reply content.
3. The method of claim 2, wherein the reply generation model further comprises: the embedding layer is connected between the coding layer and the graph attention layer; inputting the initial vector representation of each node into the graph attention layer for processing to obtain a first intermediate vector representation of each node, wherein the method comprises the following steps:
acquiring a node type vector and a source identification vector of each node in the heterogeneous graph structure;
inputting the initial vector representation, the node type vector and the source identification vector of each node into an embedding layer for fusion processing to obtain a second intermediate vector representation of each node;
and inputting the second intermediate vector representation of each node into the graph attention layer for processing to obtain the first intermediate vector representation of each node.
4. The method of claim 3, wherein the reply generation model further comprises: a layer normalization connected between the coding layer and the embedding layer; before inputting the initial vector representation, the node type vector and the source identification vector of each node into the embedding layer for fusion processing to obtain a second intermediate vector representation of each node, the method further comprises the following steps:
the initial vector representation for each node is input to the layer normalization for normalization processing.
5. The method of claim 1, prior to creating a connecting edge between two nodes corresponding to the two words in the graph structure, further comprising:
and determining every two words with the association relation according to the attribute information of each word.
6. A method of model training, comprising:
acquiring a plurality of sample heterogeneous graph structures, wherein each sample heterogeneous graph structure is obtained by merging an abstract syntax tree corresponding to a sample SQL statement and a graph structure of a sample table, and the sample table is a table accessed by the sample SQL statement;
training a graph neural network using the plurality of sample heterogeneous graph structures to obtain a reply generation model, wherein the reply generation model is used for generating, according to an input heterogeneous graph structure, reply content for replying to a user question based on a target table, the heterogeneous graph structure is obtained by merging an abstract syntax tree corresponding to an SQL statement of the user question and a graph structure of the target table, and the target table is the table to be accessed by the SQL statement.
7. The method of claim 6, wherein the reply generation model comprises:
an encoding layer, a graph attention layer, and a decoding layer.
8. The method of claim 7, wherein the reply generation model further comprises: an embedding layer connected between the coding layer and the attention layer.
9. A reply generation apparatus based on a form question and answer, comprising:
the conversion module is used for converting the Structured Query Language (SQL) sentences of the user questions into an abstract syntax tree;
the construction module is used for constructing a graph structure of a target table to be accessed by the SQL sentence, and the graph structure takes the words in the target table as nodes and reflects the incidence relation among the words in the target table;
the merging module is used for creating a connecting edge between the abstract syntax tree and the same node in the graph structure so as to merge the abstract syntax tree and the graph structure to obtain a heterogeneous graph structure;
a generating module, configured to input the heterogeneous graph structure into a pre-trained reply generation model, and generate reply content for replying the user question based on the target table, where the reply generation model is obtained by training according to the model training method of claim 6;
wherein the building block is specifically configured to: abstracting words included in each cell in the target table into each node in the graph structure; determining every two words with association relation according to the position information of each word in the target table; for each two words, a connecting edge between two nodes corresponding to the two words is created in the graph structure.
10. A model training apparatus, comprising:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring a plurality of sample heterogeneous graph structures, each sample heterogeneous graph structure is obtained by merging an abstract syntax tree corresponding to a sample SQL statement and a graph structure of a sample table, and the sample table is a table accessed by the sample SQL statement;
a training module, configured to train a graph neural network using the plurality of sample heterogeneous graph structures to obtain a reply generation model, where the reply generation model is used for generating, according to an input heterogeneous graph structure, reply content for replying to a user question based on a target table, the heterogeneous graph structure is obtained by merging an abstract syntax tree corresponding to an SQL statement of the user question and a graph structure of the target table, and the target table is the table to be accessed by the SQL statement.
11. An electronic device, comprising: a memory and a processor; the memory is configured to store a computer program; and the processor, coupled to the memory, is configured to execute the computer program to perform the steps of the method of any one of claims 1-8.
12. A computer storage medium storing a computer program, wherein the computer program, when executed by a processor, causes the processor to perform the steps of the method of any one of claims 1 to 8.
CN202210501373.1A 2022-05-10 2022-05-10 Reply generation and model training method, device and equipment based on form question answering Active CN114610752B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210501373.1A CN114610752B (en) 2022-05-10 2022-05-10 Reply generation and model training method, device and equipment based on form question answering


Publications (2)

Publication Number Publication Date
CN114610752A CN114610752A (en) 2022-06-10
CN114610752B true CN114610752B (en) 2022-09-30

Family

ID=81870065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210501373.1A Active CN114610752B (en) 2022-05-10 2022-05-10 Reply generation and model training method, device and equipment based on form question answering

Country Status (1)

Country Link
CN (1) CN114610752B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112417118A (en) * 2020-11-19 2021-02-26 上海交通大学 Dialog generation method based on marked text and neural network
CN113254619A (en) * 2021-06-21 2021-08-13 北京沃丰时代数据科技有限公司 Automatic reply method and device for user query and electronic equipment
CN114116994A (en) * 2021-06-30 2022-03-01 同济人工智能研究院(苏州)有限公司 Welcome robot dialogue method
CN114281968A (en) * 2021-12-20 2022-04-05 北京百度网讯科技有限公司 Model training and corpus generation method, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11567932B2 (en) * 2020-10-26 2023-01-31 Oracle International Corporation Efficient compilation of graph queries on top of SQL based relational engine
CN113961679A (en) * 2021-09-18 2022-01-21 北京百度网讯科技有限公司 Intelligent question and answer processing method and system, electronic equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Table-based Automatic Question Answering: Research and Prospects" ("基于表格的自动问答研究与展望"); Li Zhi et al.; Computer Engineering and Applications (《计算机工程与应用》); 2021-04-28; pp. 67-76 *

Also Published As

Publication number Publication date
CN114610752A (en) 2022-06-10

Similar Documents

Publication Publication Date Title
US10943072B1 (en) Contextual and intent based natural language processing system and method
US20190108273A1 (en) Data Processing Method, Apparatus and Electronic Device
US11960513B2 (en) User-customized question-answering system based on knowledge graph
US8874600B2 (en) System and method for building a cloud aware massive data analytics solution background
CN110096584B (en) Response method and device
CN108427746B (en) Intelligent voice interaction system and method for production management
CN105159977A (en) Information interaction processing method and apparatus
US20120197937A1 (en) Method and system for providing detailed information in an interactive manner in a short message service (sms) environment
US11250053B2 (en) Systems and methods for transcript processing
CN107451103B (en) Template display and modification method and device
WO2023051021A1 (en) Human-machine conversation method and apparatus, device, and storage medium
EP4060517A1 (en) System and method for designing artificial intelligence (ai) based hierarchical multi-conversation system
US10540155B1 (en) Platform-agnostic predictive models based on database management system instructions
CN114610752B (en) Reply generation and model training method, device and equipment based on form question answering
US20220327147A1 (en) Method for updating information of point of interest, electronic device and storage medium
US11842372B2 (en) Systems and methods for real-time processing of audio feedback
CN115686455A (en) Application development method, device and equipment based on spreadsheet and storage medium
CN109739970B (en) Information processing method and device and electronic equipment
CN115114281A (en) Query statement generation method and device, storage medium and electronic equipment
CN113536742A (en) Method and device for generating description text based on knowledge graph and electronic equipment
CN113761136A (en) Dialogue processing method, information processing method, model training method, information processing apparatus, model training apparatus, and storage medium
Bose et al. A solution for a mobile computing device along with supporting infrastructure for the needs of illiterate users in rural areas
CN114064862A (en) Question answering method, device and equipment
CN117094690A (en) Information processing method, electronic device, and storage medium
CN110866606B (en) Processing method and device for data information and ordering voice instruction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant