CN110609849A - Natural language generation method based on SQL syntax tree node type - Google Patents

Natural language generation method based on SQL syntax tree node type Download PDF

Info

Publication number
CN110609849A
Authority
CN
China
Prior art keywords
node
natural language
sql
vector
syntax tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910796688.1A
Other languages
Chinese (zh)
Other versions
CN110609849B (en)
Inventor
蔡瑞初
梁智豪
许柏炎
郝志峰
温雯
李梓健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201910796688.1A priority Critical patent/CN110609849B/en
Publication of CN110609849A publication Critical patent/CN110609849A/en
Application granted granted Critical
Publication of CN110609849B publication Critical patent/CN110609849B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Devices For Executing Special Programs (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of natural language processing, and in particular to a natural language generation method based on SQL syntax tree node types. The invention requires neither extensive manual work nor handcrafted templates that restrict the generated natural language to a few fixed sentence patterns. Compared with natural language generation methods based on sequence-to-sequence learning, the method captures not only the text of the SQL statement but also, by combining the tree-structured data of the SQL syntax tree with a tree-structured long short-term memory network, its syntactic structure more fully. It therefore has practical application value: it avoids the need to manually search development documents and online material, greatly reduces time and labor costs, and improves working efficiency.

Description

Natural language generation method based on SQL syntax tree node type
Technical Field
The invention relates to the field of natural language, in particular to a natural language generation method based on SQL syntax tree node types.
Background
Structured Query Language (SQL) is a non-procedural programming language for operating relational databases: it lets a user query data interactively over a high-level data structure while keeping the concrete storage layout of the data transparent to the user. SQL is widely used in database manipulation tasks. Since SQL is a programming language, it can be converted into an Abstract Syntax Tree (AST) by means of an Abstract Syntax Description Language (ASDL), a language used in compilers to describe tree-shaped data structures. An abstract syntax tree represents the syntactic structure of SQL in the form of a tree without exposing the concrete details of the SQL text. The AST of SQL is an abstract representation of the SQL language; by representing every SQL statement as an abstract syntax tree, the syntactic structure of each statement can be obtained easily and clearly.
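As a concrete illustration of this tree representation, the following minimal Python sketch builds a simplified AST for a small query. The Node class, its field names, and the example query are illustrative assumptions, not the ASDL grammar actually used here:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """One node of a simplified SQL abstract syntax tree (hypothetical)."""
    node_type: str                      # grammar-rule label, e.g. "Select"
    text: str = ""                      # surface text carried by the node
    children: List["Node"] = field(default_factory=list)

# Hypothetical AST for: SELECT name FROM students WHERE age > 18
ast = Node("Select", "SELECT", [
    Node("Column", "name"),
    Node("From", "FROM", [Node("Table", "students")]),
    Node("Where", "WHERE", [
        Node("Comparison", ">", [Node("Column", "age"), Node("Literal", "18")]),
    ]),
])
```

Each node carries both a type label and a text fragment, which is exactly the pair of inputs the encoder described later consumes.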
SQL is widely applied in all kinds of projects and products to satisfy diverse data operations and database requirements, and such systems contain large numbers of SQL statements supporting most of their data operations. To ease future maintenance, these SQL statements need clear natural language comments; likewise, when SQL statements are updated, developers must consult development documents and online material to understand the functional requirements the statements implement, which costs considerable time and effort. Faced with this practical need, a method that converts SQL into natural language is necessary. Several ideas address the problem. The first converts SQL into natural language according to pre-designed manual rules and templates; its drawbacks are that the generated language is highly similar, the sentence patterns lack diversity, and only a limited range of SQL statements can be supported, since the approach ultimately rests on hand-designed templates. The second idea treats the conversion of SQL into natural language as a sequence-to-sequence translation problem: an SQL statement and a natural language description are both viewed as sequences, a neural network encodes the SQL sequence to extract an overall representation, and the natural language sequence is generated from that representation.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a natural language generation method based on SQL syntax tree node types. The invention requires neither extensive manual work nor handcrafted templates that restrict the generated natural language to a few fixed sentence patterns. Compared with natural language generation methods based on sequence-to-sequence learning, the method captures the text of the SQL statement and, by combining the tree-structured data of the SQL syntax tree with a tree-structured long short-term memory network, acquires the syntactic structure of the SQL statement more fully, so it has practical application value.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A natural language generation method based on SQL syntax tree node types includes the following steps:
step S1: construct a natural language generation model comprising a language encoder and a language decoder built on long short-term memory networks;
step S2: collect a natural language dataset of SQL texts paired with natural language descriptions, and traverse each SQL abstract syntax tree breadth-first to obtain the tree T = {node_1, ..., node_n} with n nodes and the corresponding natural language sequence X = {x_1, ..., x_m}, where node_i denotes the i-th node of the SQL abstract syntax tree T and x_j denotes the j-th word of the natural language sentence X;
step S3: use the language encoder of the natural language generation model to compute the node state vector s_{node_i} of each node_i in the SQL abstract syntax tree;
step S4: select the state vector s_{node_1} of the root node of the SQL abstract syntax tree as the initial hidden state vector h_0 and input it into the language decoder of the natural language generation model;
step S5: at each time step t, the language decoder takes the hidden state vector h_{t-1} of the previous time step and the previously predicted word x_{t-1} as input and computes a new hidden state vector h_t;
step S6: compute the attention vector attn_t of the current time step from the hidden state vector h_t and the node state vectors s_{node_i} of the SQL syntax tree;
step S7: feed the attention vector attn_t as input into the language decoder of the natural language generation model;
step S8: based on the input attention vector attn_t, the language decoder executes a copy operation or a generation operation to generate the corresponding natural language sequence;
step S9: train the natural language generation model by gradient descent, determine its model parameters θ, and obtain the optimized natural language generation model.
Preferably, the language decoder of step S1 further includes a binary discriminator; the language encoder consists of a node-type-based tree-structured long short-term memory (LSTM) network, the language decoder consists of an LSTM network, and the binary discriminator consists of a fully connected network. The node-type-based tree LSTM is a variant of the tree LSTM. The tree LSTM is organized over a root node, parent nodes, and child nodes, where a parent node may contain several child nodes. If a node node_i of an SQL syntax tree has K child nodes, the node-type-based tree LSTM consists of 1 input gate (InputGate), K forget gates (ForgetGate), and 1 output gate (OutputGate). The network takes the text vector x_i and node type τ_i of node_i, together with the state vectors s_k, node types τ_k, and cell states c_k of its K child nodes (k = 1, ..., K), and computes the node state vector s_{node_i} through the input gate, forget gates, and output gate according to the following formulas:

i = sigmoid( W_{τ_i}^{(i)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(i)} s_k + b^{(i)} )  (3)
f_k = sigmoid( W_{τ_i}^{(f)} x_i + U_{τ_k}^{(f)} s_k + b^{(f)} ), k = 1, ..., K  (4)
o = sigmoid( W_{τ_i}^{(o)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(o)} s_k + b^{(o)} )  (5)
u = tanh( W_{τ_i}^{(u)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(u)} s_k + b^{(u)} )  (6)
c_{node_i} = i ⊙ u + Σ_{k=1}^{K} f_k ⊙ c_k  (7)
s_{node_i} = o ⊙ tanh( c_{node_i} )  (8)

where b is a bias term, and W_τ^{(·)} and U_τ^{(·)} are the learnable parameters of the node-type-based tree LSTM, with different parameter values selected according to the node type τ; sigmoid(·) and tanh(·) are nonlinear activation functions with the specific formulas:

sigmoid(z) = 1 / (1 + e^{-z})  (9)
tanh(z) = (e^{z} - e^{-z}) / (e^{z} + e^{-z})  (10)
preferably, the natural language data set in step S2 is collected by manual or machine statistics, the natural language data set includes structured query language SQL and natural language sequence pairs, and the data set is split into a training set, a verification set and a test set according to a proportion for training the reliability of the natural language generation model.
Preferably, the time step in step S5 is an input unit of the long-short term memory network when processing the sequence data.
Preferably, in step S5, x is 0 when t-1 ist-1Is a special symbol that indicates the beginning.
Preferably, the specific steps of computing the attention vector attn_t in step S6 are as follows:
first, weights are computed for the node state vectors from the hidden state vector h_t and the node state vectors s_{node_i} of the SQL syntax tree; the node state vectors are then weighted and summed according to these weights to obtain a context vector ctx_t; finally, the attention vector attn_t is computed from the context vector ctx_t and the hidden state vector h_t. The specific formulas are:

α_t = softmax( S h_t )  (11)
ctx_t = S^T α_t  (12)
attn_t = tanh( [ctx_t ; h_t] )  (13)

where S is an n × d real matrix whose rows are the n node state vectors s_{node_1}, ..., s_{node_n} of the SQL syntax tree, and d is the dimension of the vectors; softmax(·) and tanh(·) are nonlinear activation functions, where the specific formula of softmax(·) is:

softmax(z)_i = e^{z_i} / Σ_j e^{z_j}  (14)
preferably, in step S7, attention vector attn is addedtThe binary arbiter is input into a binary arbiter of the speech decoder, which is a fully connected network with an output dimension of 2, i.e.:
P(action|x1,...,xt-1,T)=W×attnt;W∈R2×d (15)
wherein W ∈ R2×dIs a fully-connected network trainable parameter, d is an attention vector attntDimension (d);
the binary arbiter outputs 2 probabilities P (action ═ copy | x)1,...,xt-1T) and P (action | x)1,...,xt-1,T),P(action=copy|x1,...,xt-1T) represents the probability of executing a copy operation, P (action | x)1,...,xt-1And T) represents the probability of executing the generating operation, the sizes of the two probability values are compared, and the operation with the higher probability is selected to be executed.
Preferably, in step S8, if the binary discriminator decides on the copy operation, the probability P(x_t | x_1, ..., x_{t-1}, T) of each node being copied is computed from the attention vector attn_t and the state vector s_{node_i} of each node i of the SQL syntax tree; the node with the highest probability is selected and its node text is copied as the output x_t of the current time step. In the copy mechanism, the probability that each node is copied is computed from attn_t and s_{node_i} by the specific formulas:

u_t^i = (s_{node_i})^T attn_t  (16)

where (s_{node_i})^T denotes the transpose of the state vector of the i-th node of the SQL syntax tree, and u_t^i is a scalar associated with the i-th node at time step t that represents the similarity between the state vector s_{node_i} and the attention vector attn_t;

P(x_t | x_1, ..., x_{t-1}, T) = softmax(u_t)  (17)

If the binary discriminator decides on the generation operation, the attention vector attn_t is input into a fully connected network whose output dimension is the size of the target dictionary, yielding the probability P(x_t | x_1, ..., x_{t-1}, T) of each word in the target dictionary; the word with the highest probability is selected as the output x_t of the current time step. Steps S5-S8 are repeated until the corresponding natural language sequence has been generated.
Preferably, the gradient descent algorithm in step S9 includes the following steps:
step S201: assume an objective function J(θ) with respect to the model parameters θ of the natural language generation model;
step S202: compute the gradient ∇_θ J(θ) of J(θ);
step S203: update the parameters θ with an update step size α (α > 0): θ ← θ - α ∇_θ J(θ).
preferably, in step S9, in the training of the natural language generation model, the model parameter θ is trained by an objective function or a loss function until the model converges, where the objective function is:
wherein, P (x)t,action=copy|x1,...,xt-1T) represents the probability that the text will perform a copy operation, P (x)t,action=generate|x1,...,xt-1T) represents the probability of the text performing the generating operation;
the corresponding loss function L is:
L=-logP(X|T)
=-∑tlog(P(xt|x1,...,xt-1,T))
=-∑tlog(P(xt,action=copy|x1,...,xt-1,T)
+P(xt,action=generate|x1,...,xt-1,T))
=-∑tlog(P(xt|x1,...,xt-1,T)×P(action=copy|x1,...,xt-1,T)
+P(xt|x1,...,xt-1,T)
×P(action=generate|x1,...,xt-1,T)) (2)
wherein X represents a natural language sentence, each sentence being X1,...,xmThe word sequence of (1); t represents an abstract syntax tree, each tree is a node1,...,nodenP (X | T) represents the conditional probability of X given the syntax tree T, P (X | T)t,action=copy|x1,...,xt-1T) represents the probability that the text will perform a copy operation, P (x)t,action=generate|x1,...,xt-1And T) represents the probability of the text performing the generating operation.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the present invention does not require extensive manual operations and does not require that natural language must support multiple patterns. Compared with a natural language generating method based on sequence-to-sequence learning, the method can acquire the text information of the SQL language, and can be used by combining the tree-shaped structured data of the SQL syntax tree and the tree-shaped long and short term memory network to more fully acquire the syntax structure information of the SQL sentences, thereby having practical application significance, avoiding the defect that the development document and the network data are searched and searched manually, greatly reducing the time cost and the labor cost and improving the working efficiency.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic diagram of the present invention;
FIG. 3 is a schematic diagram of a tree-like long short term memory network;
fig. 4 is a schematic diagram of a replication mechanism.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, a natural language generation method based on SQL syntax tree node types includes the following steps:
step S1: construct a natural language generation model comprising a language encoder and a language decoder built on long short-term memory networks;
step S2: collect a natural language dataset of SQL texts paired with natural language descriptions, and traverse each SQL abstract syntax tree breadth-first to obtain the tree T = {node_1, ..., node_n} with n nodes and the corresponding natural language sequence X = {x_1, ..., x_m}, where node_i denotes the i-th node of the SQL abstract syntax tree T and x_j denotes the j-th word of the natural language sentence X (a minimal sketch of this breadth-first traversal is given after this step list);
step S3: use the language encoder of the natural language generation model to compute the node state vector s_{node_i} of each node_i in the SQL abstract syntax tree;
step S4: select the state vector s_{node_1} of the root node of the SQL abstract syntax tree as the initial hidden state vector h_0 and input it into the language decoder of the natural language generation model;
step S5: at each time step t, the language decoder takes the hidden state vector h_{t-1} of the previous time step and the previously predicted word x_{t-1} as input and computes a new hidden state vector h_t;
step S6: compute the attention vector attn_t of the current time step from the hidden state vector h_t and the node state vectors s_{node_i} of the SQL syntax tree;
step S7: feed the attention vector attn_t as input into the language decoder of the natural language generation model;
step S8: based on the input attention vector attn_t, the language decoder executes a copy operation or a generation operation to generate the corresponding natural language sequence;
step S9: train the natural language generation model by gradient descent, determine its model parameters θ, and obtain the optimized natural language generation model.
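The breadth-first traversal referenced in step S2 can be sketched as follows; the Node class is a hypothetical stand-in for the actual AST structure, and the root-first numbering is an assumption consistent with the breadth-first order described above:

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    node_type: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)

def bfs_nodes(root: Node) -> List[Node]:
    """Flatten an AST breadth-first into [node_1, ..., node_n] (step S2)."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)            # nodes are numbered in visit order
        queue.extend(node.children)   # children are enqueued left to right
    return order

# order[0] is the root node_1, whose state vector later seeds h_0 (step S4).
```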
As shown in fig. 2, as a preferred embodiment, the language decoder of step S1 further includes a binary discriminator; the language encoder consists of a node-type-based tree-structured long short-term memory (LSTM) network, the language decoder consists of an LSTM network, and the binary discriminator consists of a fully connected network. The node-type-based tree LSTM is a variant of the tree LSTM, and its specific structure is shown in fig. 3. The tree LSTM is organized over a root node, parent nodes, and child nodes, where a parent node may contain several child nodes. If a node node_i of an SQL syntax tree has K child nodes, the node-type-based tree LSTM consists of 1 input gate (InputGate), K forget gates (ForgetGate), and 1 output gate (OutputGate). The network takes the text vector x_i and node type τ_i of node_i, together with the state vectors s_k, node types τ_k, and cell states c_k of its K child nodes (k = 1, ..., K), and computes the node state vector s_{node_i} through the input gate, forget gates, and output gate according to the following formulas:

i = sigmoid( W_{τ_i}^{(i)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(i)} s_k + b^{(i)} )  (3)
f_k = sigmoid( W_{τ_i}^{(f)} x_i + U_{τ_k}^{(f)} s_k + b^{(f)} ), k = 1, ..., K  (4)
o = sigmoid( W_{τ_i}^{(o)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(o)} s_k + b^{(o)} )  (5)
u = tanh( W_{τ_i}^{(u)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(u)} s_k + b^{(u)} )  (6)
c_{node_i} = i ⊙ u + Σ_{k=1}^{K} f_k ⊙ c_k  (7)
s_{node_i} = o ⊙ tanh( c_{node_i} )  (8)

where b is a bias term, and W_τ^{(·)} and U_τ^{(·)} are the learnable parameters of the node-type-based tree LSTM, with different parameter values selected according to the node type τ; sigmoid(·) and tanh(·) are nonlinear activation functions with the specific formulas:

sigmoid(z) = 1 / (1 + e^{-z})  (9)
tanh(z) = (e^{z} - e^{-z}) / (e^{z} + e^{-z})  (10)
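A minimal PyTorch sketch of such a node-type-conditioned cell is given below. It is not the reference implementation: keeping one linear map per node type in a ModuleList, summing the child contributions inside the i/o/u gates, and packing the four gate pre-activations into one projection are all illustrative choices.

```python
import torch
import torch.nn as nn

class NodeTypeTreeLSTMCell(nn.Module):
    """Sketch of a tree-LSTM cell whose gate parameters depend on node type."""

    def __init__(self, num_types: int, dim: int):
        super().__init__()
        # One projection per node type; the 4*dim outputs are the packed
        # pre-activations of the i, o, u gates plus the per-child forget gate f.
        self.W = nn.ModuleList(nn.Linear(dim, 4 * dim) for _ in range(num_types))
        self.U = nn.ModuleList(nn.Linear(dim, 4 * dim, bias=False)
                               for _ in range(num_types))

    def forward(self, x, x_type, child_states, child_cells, child_types):
        # x: (dim,) text vector of node_i; child_* are lists over its K children.
        dim = x.shape[-1]
        wx = self.W[x_type](x)                         # W selected by node type
        us = sum((self.U[t](s) for t, s in zip(child_types, child_states)),
                 torch.zeros_like(wx))                 # child contributions summed
        i, o, u, _ = (wx + us).split(dim)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        c = i * u
        for t, s, c_k in zip(child_types, child_states, child_cells):
            f_k = torch.sigmoid((wx + self.U[t](s)).split(dim)[3])  # one forget gate per child
            c = c + f_k * c_k                          # cell state accumulates children
        return o * torch.tanh(c), c                    # node state s_{node_i}, cell state
```

For a leaf node the child lists are empty, and the cell reduces to a plain LSTM-style gating of the node's own text vector.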
as a preferred embodiment, the natural language data set in step S2 is collected by human or machine statistics, the natural language data set includes structured query language SQL and natural language sequence pairs, and the data set is proportionally split into a training set, a verification set and a test set for training the reliability of the natural language generation model.
As a preferred embodiment, the time step described in step S5 is an input unit of the long-short term memory network when processing the sequence data.
As a preferred embodiment, in step S5, when t-1 is 0, xt-1Is a special symbol that indicates the beginning.
As a preferred embodiment, the specific steps of computing the attention vector attn_t in step S6 are as follows:
first, weights are computed for the node state vectors from the hidden state vector h_t and the node state vectors s_{node_i} of the SQL syntax tree; the node state vectors are then weighted and summed according to these weights to obtain a context vector ctx_t; finally, the attention vector attn_t is computed from the context vector ctx_t and the hidden state vector h_t. The specific formulas are:

α_t = softmax( S h_t )  (11)
ctx_t = S^T α_t  (12)
attn_t = tanh( [ctx_t ; h_t] )  (13)

where S is an n × d real matrix whose rows are the n node state vectors s_{node_1}, ..., s_{node_n} of the SQL syntax tree, and d is the dimension of the vectors; softmax(·) and tanh(·) are nonlinear activation functions, where the specific formula of softmax(·) is:

softmax(z)_i = e^{z_i} / Σ_j e^{z_j}  (14)
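A sketch of this attention step, assuming the node state vectors have been stacked into the matrix S and that the equation numbering above holds:

```python
import torch

def attention_vector(S: torch.Tensor, h_t: torch.Tensor) -> torch.Tensor:
    """S: (n, d) node state matrix; h_t: (d,) decoder hidden state."""
    weights = torch.softmax(S @ h_t, dim=0)        # (n,)  weights, eq. (11)
    ctx_t = weights @ S                            # (d,)  weighted sum, eq. (12)
    return torch.tanh(torch.cat([ctx_t, h_t]))     # (2d,) attn_t, eq. (13)

# Toy usage: 7 nodes with 16-dimensional states.
S, h = torch.randn(7, 16), torch.randn(16)
print(attention_vector(S, h).shape)                # torch.Size([32])
```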
as a preferred embodiment, in step S7, attention vector attn is addedtThe binary arbiter is input into a binary arbiter of the speech decoder, which is a fully connected network with an output dimension of 2, i.e.:
P(action|x1,...,xt-1,T)=W×attnt;W∈R2×d (15)
wherein W ∈ R2×dIs a fully-connected network trainable parameter, d is an attention vector attntDimension (d);
the binary arbiter outputs 2 probabilities P (action ═ copy | x)1,...,xt-1T) and P (action | x)1,...,xt-1,T),P(action=copy|x1,...,xt-1T) represents the probability of executing a copy operation, P (action | x)1,...,xt-1T) stands for execution GenerationAnd comparing the sizes of the two probability values, and selecting the operation with higher probability to execute.
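The binary discriminator is a single fully connected layer. The sketch below adds a softmax to turn the two logits of eq. (15) into the two probabilities the text describes; this normalization is an assumption, since (15) itself writes only the linear map.

```python
import torch
import torch.nn as nn

class CopyGenerateDiscriminator(nn.Module):
    """Two-way action classifier over the attention vector (cf. eq. (15))."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, 2, bias=False)   # W in R^{2 x d}

    def forward(self, attn_t: torch.Tensor) -> str:
        # softmax added here as an assumption to obtain two probabilities
        p_copy, p_generate = torch.softmax(self.proj(attn_t), dim=-1)
        return "copy" if p_copy > p_generate else "generate"
```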
As a preferred embodiment, in step S8, if the binary discriminator decides on the copy operation, the probability P(x_t | x_1, ..., x_{t-1}, T) of each node being copied is computed from the attention vector attn_t and the state vector s_{node_i} of each node i of the SQL syntax tree; the node with the highest probability is selected and its node text is copied as the output x_t of the current time step. In the copy mechanism, the probability that each node is copied is computed from attn_t and s_{node_i} by the specific formulas:

u_t^i = (s_{node_i})^T attn_t  (16)

where (s_{node_i})^T denotes the transpose of the state vector of the i-th node of the SQL syntax tree, and u_t^i is a scalar associated with the i-th node at time step t that represents the similarity between the state vector s_{node_i} and the attention vector attn_t;

P(x_t | x_1, ..., x_{t-1}, T) = softmax(u_t)  (17)

If the binary discriminator decides on the generation operation, the attention vector attn_t is input into a fully connected network whose output dimension is the size of the target dictionary, yielding the probability P(x_t | x_1, ..., x_{t-1}, T) of each word in the target dictionary; the word with the highest probability is selected as the output x_t of the current time step. Steps S5-S8 are repeated until the corresponding natural language sequence has been generated.
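One decoding step (S8) could then be sketched as below; it assumes attn_t has the same dimension d as the node state vectors so that the dot product of eq. (16) is defined, and vocab_proj and the other argument names are illustrative:

```python
import torch

def emit_word(action, attn_t, S, node_texts, vocab_proj, vocab):
    """Sketch of step S8: emit one word by copying or generating.

    S: (n, d) node state matrix; node_texts: surface strings of the n nodes;
    vocab_proj: a torch.nn.Linear mapping attn_t to target-dictionary logits.
    """
    if action == "copy":
        u_t = S @ attn_t                               # similarities, eq. (16)
        probs = torch.softmax(u_t, dim=0)              # eq. (17)
        return node_texts[int(probs.argmax())]         # copy the best node's text
    probs = torch.softmax(vocab_proj(attn_t), dim=0)   # distribution over dictionary
    return vocab[int(probs.argmax())]                  # generate the best word
```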
As a preferred embodiment, the gradient descent algorithm in step S9 includes the following steps:
step S201: assume an objective function J(θ) with respect to the model parameters θ of the natural language generation model;
step S202: compute the gradient ∇_θ J(θ) of J(θ);
step S203: update the parameters θ with an update step size α (α > 0): θ ← θ - α ∇_θ J(θ).
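Steps S201-S203 are plain gradient descent; on a toy scalar objective the loop looks like this (the objective and the step size are illustrative, not from the method itself):

```python
def gradient_descent(theta: float, grad_J, alpha: float = 0.1, steps: int = 50) -> float:
    """Steps S201-S203: repeatedly apply theta <- theta - alpha * grad J(theta)."""
    for _ in range(steps):
        theta -= alpha * grad_J(theta)
    return theta

# Toy objective J(theta) = (theta - 3)^2, so grad J(theta) = 2 * (theta - 3):
print(gradient_descent(0.0, lambda t: 2 * (t - 3)))   # converges toward 3.0
```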
As a preferred embodiment, in step S9, during training of the natural language generation model the model parameters θ are trained through an objective function, or equivalently a loss function, until the model converges; the objective function is:

P(X | T) = Π_t P(x_t | x_1, ..., x_{t-1}, T)
= Π_t ( P(x_t, action = copy | x_1, ..., x_{t-1}, T) + P(x_t, action = generate | x_1, ..., x_{t-1}, T) )  (1)

where P(x_t, action = copy | x_1, ..., x_{t-1}, T) represents the probability that the text is produced by a copy operation, and P(x_t, action = generate | x_1, ..., x_{t-1}, T) represents the probability that the text is produced by a generation operation;
the corresponding loss function L is:

L = -log P(X | T)
= -Σ_t log( P(x_t | x_1, ..., x_{t-1}, T) )
= -Σ_t log( P(x_t, action = copy | x_1, ..., x_{t-1}, T) + P(x_t, action = generate | x_1, ..., x_{t-1}, T) )
= -Σ_t log( P(x_t | x_1, ..., x_{t-1}, T) × P(action = copy | x_1, ..., x_{t-1}, T) + P(x_t | x_1, ..., x_{t-1}, T) × P(action = generate | x_1, ..., x_{t-1}, T) )  (2)

where X represents a natural language sentence, each sentence being the word sequence x_1, ..., x_m; T represents an abstract syntax tree, each tree being the node sequence node_1, ..., node_n; P(X | T) represents the conditional probability of X given the syntax tree T; P(x_t, action = copy | x_1, ..., x_{t-1}, T) represents the probability that the text is produced by a copy operation, and P(x_t, action = generate | x_1, ..., x_{t-1}, T) represents the probability that the text is produced by a generation operation.
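A sketch of loss (2) for one training sentence, assuming the per-step probabilities have already been produced by the decoder (the argument names are illustrative):

```python
import torch

def sentence_loss(p_copy_word, p_gen_word, p_copy_action, p_gen_action):
    """Negative log-likelihood of eq. (2), summed over the m time steps.

    p_copy_word[t]: P(x_t | ..., T) under the copy distribution, eq. (17);
    p_gen_word[t]:  P(x_t | ..., T) under the dictionary distribution;
    p_copy_action[t], p_gen_action[t]: the discriminator's two probabilities.
    All entries are assumed to be scalar torch tensors.
    """
    loss = torch.zeros(())
    for pc, pg, ac, ag in zip(p_copy_word, p_gen_word, p_copy_action, p_gen_action):
        loss = loss - torch.log(pc * ac + pg * ag)   # one summand of eq. (2)
    return loss
```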
Example 2
As shown in fig. 4, in this embodiment a detailed text is input into the language encoder of the natural language generation model, and words summarizing the input content are output by the language decoder. A concrete example:
Input: Xiaoming goes to the Guangzhou Restaurant for lunch, orders 3 dishes, and eats very happily.
Output: Xiaoming eats lunch very happily.
If the word "Xiaoming" is not in the constructed dictionary, an "unknown" word would be generated in the absence of a copy mechanism; with a copy mechanism, the word "Xiaoming" can be copied from the input to the output. The copy mechanism is implemented based on a Pointer Network. The pointer network builds on the language encoder-language decoder framework: assume the input is X = {x_1, ..., x_n} and the output is Y = {y_1, ..., y_m}. At a time step i of the decoding stage, the language decoder hidden state vector d_i is combined with the hidden state vector e_j of each input time step j ∈ (1, ..., n) of the language encoder to yield, for each input time step, a probability P(y_i | y_1, ..., y_{i-1}, X) representing the possibility of copying that input word at this time step; the input word with the highest probability is selected for copying. The specific formula is:

P(y_i | y_1, ..., y_{i-1}, X) = softmax(u_i)

where softmax(·) is a nonlinear activation function with the specific formula softmax(z)_i = e^{z_i} / Σ_j e^{z_j}.
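The text gives only the softmax form of the copy probability; the additive scoring below is the standard pointer-network formulation of Vinyals et al., included as a plausible sketch rather than the exact operation used here:

```python
import torch
import torch.nn as nn

class PointerScorer(nn.Module):
    """Pointer-network copy distribution over the n input positions."""

    def __init__(self, dim: int):
        super().__init__()
        self.W1 = nn.Linear(dim, dim, bias=False)   # encoder-side projection
        self.W2 = nn.Linear(dim, dim, bias=False)   # decoder-side projection
        self.v = nn.Linear(dim, 1, bias=False)      # scoring vector v^T

    def forward(self, E: torch.Tensor, d_i: torch.Tensor) -> torch.Tensor:
        # E: (n, dim) encoder states e_1..e_n; d_i: (dim,) decoder state.
        u_i = self.v(torch.tanh(self.W1(E) + self.W2(d_i))).squeeze(-1)  # (n,)
        return torch.softmax(u_i, dim=0)            # P(y_i | y_1..y_{i-1}, X)
```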
the terms describing positional relationships in the drawings are for illustrative purposes only and are not to be construed as limiting the patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. A natural language generation method based on SQL syntax tree node types, characterized by comprising the following steps:
step S1: construct a natural language generation model comprising a language encoder and a language decoder built on long short-term memory networks;
step S2: collect a natural language dataset of SQL texts paired with natural language descriptions, and traverse each SQL abstract syntax tree breadth-first to obtain the tree T = {node_1, ..., node_n} with n nodes and the corresponding natural language sequence X = {x_1, ..., x_m}, where node_i denotes the i-th node of the SQL abstract syntax tree T and x_j denotes the j-th word of the natural language sentence X;
step S3: use the language encoder of the natural language generation model to compute the node state vector s_{node_i} of each node_i in the SQL abstract syntax tree;
step S4: select the state vector s_{node_1} of the root node of the SQL abstract syntax tree as the initial hidden state vector h_0 and input it into the language decoder of the natural language generation model;
step S5: at each time step t, the language decoder takes the hidden state vector h_{t-1} of the previous time step and the previously predicted word x_{t-1} as input and computes a new hidden state vector h_t;
step S6: compute the attention vector attn_t of the current time step from the hidden state vector h_t and the node state vectors s_{node_i} of the SQL syntax tree;
step S7: feed the attention vector attn_t as input into the language decoder of the natural language generation model;
step S8: based on the input attention vector attn_t, the language decoder executes a copy operation or a generation operation to generate the corresponding natural language sequence;
step S9: train the natural language generation model by gradient descent, determine its model parameters θ, and obtain the optimized natural language generation model.
2. The method according to claim 1, wherein the language decoder of step S1 further includes a binary discriminator; the language encoder consists of a node-type-based tree-structured long short-term memory (LSTM) network, the language decoder consists of an LSTM network, and the binary discriminator consists of a fully connected network; the node-type-based tree LSTM is a variant of the tree LSTM; the tree LSTM is organized over a root node, parent nodes, and child nodes, where a parent node may contain several child nodes; if a node node_i of an SQL syntax tree has K child nodes, the node-type-based tree LSTM consists of 1 input gate (InputGate), K forget gates (ForgetGate), and 1 output gate (OutputGate); the network takes the text vector x_i and node type τ_i of node_i, together with the state vectors s_k, node types τ_k, and cell states c_k of its K child nodes (k = 1, ..., K), and computes the node state vector s_{node_i} through the input gate, forget gates, and output gate according to the following formulas:

i = sigmoid( W_{τ_i}^{(i)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(i)} s_k + b^{(i)} )  (3)
f_k = sigmoid( W_{τ_i}^{(f)} x_i + U_{τ_k}^{(f)} s_k + b^{(f)} ), k = 1, ..., K  (4)
o = sigmoid( W_{τ_i}^{(o)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(o)} s_k + b^{(o)} )  (5)
u = tanh( W_{τ_i}^{(u)} x_i + Σ_{k=1}^{K} U_{τ_k}^{(u)} s_k + b^{(u)} )  (6)
c_{node_i} = i ⊙ u + Σ_{k=1}^{K} f_k ⊙ c_k  (7)
s_{node_i} = o ⊙ tanh( c_{node_i} )  (8)

where b is a bias term, and W_τ^{(·)} and U_τ^{(·)} are the learnable parameters of the node-type-based tree LSTM, with different parameter values selected according to the node type τ; sigmoid(·) and tanh(·) are nonlinear activation functions with the specific formulas:

sigmoid(z) = 1 / (1 + e^{-z})  (9)
tanh(z) = (e^{z} - e^{-z}) / (e^{z} + e^{-z})  (10)
3. The method according to claim 2, wherein the natural language dataset in step S2 is collected by manual or machine statistics; it consists of pairs of structured query language (SQL) statements and natural language sequences, and is split proportionally into a training set, a validation set, and a test set for training the natural language generation model and verifying its reliability.
4. The method according to claim 3, wherein the time step in step S5 is the input unit of the long short-term memory network when it processes sequence data.
5. The method according to claim 4, wherein in step S5, when t - 1 = 0, x_{t-1} is a special symbol indicating the beginning of the sequence.
6. The method according to claim 5, wherein the specific steps of computing the attention vector attn_t in step S6 are as follows:
first, weights are computed for the node state vectors from the hidden state vector h_t and the node state vectors s_{node_i} of the SQL syntax tree; the node state vectors are then weighted and summed according to these weights to obtain a context vector ctx_t; finally, the attention vector attn_t is computed from the context vector ctx_t and the hidden state vector h_t; the specific formulas are:

α_t = softmax( S h_t )  (11)
ctx_t = S^T α_t  (12)
attn_t = tanh( [ctx_t ; h_t] )  (13)

where S is an n × d real matrix whose rows are the n node state vectors s_{node_1}, ..., s_{node_n} of the SQL syntax tree, and d is the dimension of the vectors; softmax(·) and tanh(·) are nonlinear activation functions, where the specific formula of softmax(·) is:

softmax(z)_i = e^{z_i} / Σ_j e^{z_j}  (14)
7. The method according to claim 6, wherein in step S7, the attention vector attn_t is input into the binary discriminator of the language decoder; the binary discriminator is a fully connected network with an output dimension of 2, i.e.:

P(action | x_1, ..., x_{t-1}, T) = W × attn_t;  W ∈ R^{2×d}  (15)

where W ∈ R^{2×d} is a trainable parameter of the fully connected network and d is the dimension of the attention vector attn_t;
the binary discriminator outputs 2 probabilities, P(action = copy | x_1, ..., x_{t-1}, T) and P(action = generate | x_1, ..., x_{t-1}, T): P(action = copy | x_1, ..., x_{t-1}, T) represents the probability of executing the copy operation, and P(action = generate | x_1, ..., x_{t-1}, T) represents the probability of executing the generation operation; the two probability values are compared and the operation with the higher probability is executed.
8. The method according to claim 7, wherein in step S8, if the binary discriminator decides on the copy operation, the probability P(x_t | x_1, ..., x_{t-1}, T) of each node being copied is computed from the attention vector attn_t and the state vector s_{node_i} of each node i of the SQL syntax tree; the node with the highest probability is selected and its node text is copied as the output x_t of the current time step; in the copy mechanism, the probability that each node is copied is computed from attn_t and s_{node_i} by the specific formulas:

u_t^i = (s_{node_i})^T attn_t  (16)

where (s_{node_i})^T denotes the transpose of the state vector of the i-th node of the SQL syntax tree, and u_t^i is a scalar associated with the i-th node at time step t that represents the similarity between the state vector s_{node_i} and the attention vector attn_t;

P(x_t | x_1, ..., x_{t-1}, T) = softmax(u_t)  (17)

if the binary discriminator decides on the generation operation, the attention vector attn_t is input into a fully connected network whose output dimension is the size of the target dictionary, yielding the probability P(x_t | x_1, ..., x_{t-1}, T) of each word in the target dictionary; the word with the highest probability is selected as the output x_t of the current time step; steps S5-S8 are repeated until the corresponding natural language sequence has been generated.
9. The method according to claim 8, wherein the gradient descent algorithm in step S9 comprises the following steps:
step S201: assume an objective function J(θ) with respect to the model parameters θ of the natural language generation model;
step S202: compute the gradient ∇_θ J(θ) of J(θ);
step S203: update the parameters θ with an update step size α (α > 0): θ ← θ - α ∇_θ J(θ).
10. The method according to claim 9, wherein in step S9, during training of the natural language generation model the model parameters θ are trained through an objective function, or equivalently a loss function, until the model converges; the objective function is:

P(X | T) = Π_t P(x_t | x_1, ..., x_{t-1}, T)
= Π_t ( P(x_t, action = copy | x_1, ..., x_{t-1}, T) + P(x_t, action = generate | x_1, ..., x_{t-1}, T) )  (1)

where P(x_t, action = copy | x_1, ..., x_{t-1}, T) represents the probability that the text is produced by a copy operation, and P(x_t, action = generate | x_1, ..., x_{t-1}, T) represents the probability that the text is produced by a generation operation;
the corresponding loss function L is:

L = -log P(X | T)
= -Σ_t log( P(x_t | x_1, ..., x_{t-1}, T) )
= -Σ_t log( P(x_t, action = copy | x_1, ..., x_{t-1}, T) + P(x_t, action = generate | x_1, ..., x_{t-1}, T) )
= -Σ_t log( P(x_t | x_1, ..., x_{t-1}, T) × P(action = copy | x_1, ..., x_{t-1}, T) + P(x_t | x_1, ..., x_{t-1}, T) × P(action = generate | x_1, ..., x_{t-1}, T) )  (2)

where X represents a natural language sentence, each sentence being the word sequence x_1, ..., x_m; T represents an abstract syntax tree, each tree being the node sequence node_1, ..., node_n; P(X | T) represents the conditional probability of X given the syntax tree T; P(x_t, action = copy | x_1, ..., x_{t-1}, T) represents the probability that the text is produced by a copy operation, and P(x_t, action = generate | x_1, ..., x_{t-1}, T) represents the probability that the text is produced by a generation operation.
CN201910796688.1A 2019-08-27 2019-08-27 Natural language generation method based on SQL syntax tree node type Active CN110609849B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910796688.1A CN110609849B (en) 2019-08-27 2019-08-27 Natural language generation method based on SQL syntax tree node type

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910796688.1A CN110609849B (en) 2019-08-27 2019-08-27 Natural language generation method based on SQL syntax tree node type

Publications (2)

Publication Number Publication Date
CN110609849A true CN110609849A (en) 2019-12-24
CN110609849B CN110609849B (en) 2022-03-25

Family

ID=68890463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910796688.1A Active CN110609849B (en) 2019-08-27 2019-08-27 Natural language generation method based on SQL syntax tree node type

Country Status (1)

Country Link
CN (1) CN110609849B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581946A (en) * 2020-04-21 2020-08-25 上海爱数信息技术股份有限公司 Language sequence model decoding method
CN112487020A (en) * 2020-12-18 2021-03-12 苏州思必驰信息科技有限公司 Method and system for converting graph of SQL to text into natural language statement
CN113254581A (en) * 2021-05-25 2021-08-13 深圳市图灵机器人有限公司 Financial text formula extraction method and device based on neural semantic analysis
CN113553411A (en) * 2021-06-30 2021-10-26 北京百度网讯科技有限公司 Query statement generation method and device, electronic equipment and storage medium
JP2022089166A (en) * 2020-12-03 2022-06-15 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Method for generating data pair, apparatus, electronic device, and storage medium
CN114692208A (en) * 2022-05-31 2022-07-01 中建电子商务有限责任公司 Processing method of data query service authority
CN116089476A (en) * 2023-04-07 2023-05-09 北京宝兰德软件股份有限公司 Data query method and device and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5805832A (en) * 1991-07-25 1998-09-08 International Business Machines Corporation System for parametric text to text language translation
CN110059100A (en) * 2019-03-20 2019-07-26 广东工业大学 Based on performer-reviewer's network SQL statement building method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5805832A (en) * 1991-07-25 1998-09-08 International Business Machines Corporation System for parametric text to text language translation
CN110059100A (en) * 2019-03-20 2019-07-26 广东工业大学 Based on performer-reviewer's network SQL statement building method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郝亮等 (Hao Liang et al.), "一种数据库汉语查询接口的设计与实现" [Design and Implementation of a Chinese Query Interface for Databases], 《计算机技术与发展》 (Computer Technology and Development) *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581946A (en) * 2020-04-21 2020-08-25 上海爱数信息技术股份有限公司 Language sequence model decoding method
CN111581946B (en) * 2020-04-21 2023-10-13 上海爱数信息技术股份有限公司 Language sequence model decoding method
JP7266658B2 (en) 2020-12-03 2023-04-28 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド DATA PAIR GENERATION METHOD, APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM
JP2022089166A (en) * 2020-12-03 2022-06-15 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Method for generating data pair, apparatus, electronic device, and storage medium
US11748340B2 (en) 2020-12-03 2023-09-05 Beijing Baidu Netcom Science And Technology Co., Ltd. Data pair generating method, apparatus, electronic device and storage medium
CN112487020A (en) * 2020-12-18 2021-03-12 苏州思必驰信息科技有限公司 Method and system for converting graph of SQL to text into natural language statement
CN112487020B (en) * 2020-12-18 2022-07-12 思必驰科技股份有限公司 Method and system for converting graph of SQL to text into natural language statement
CN113254581A (en) * 2021-05-25 2021-08-13 深圳市图灵机器人有限公司 Financial text formula extraction method and device based on neural semantic analysis
CN113553411A (en) * 2021-06-30 2021-10-26 北京百度网讯科技有限公司 Query statement generation method and device, electronic equipment and storage medium
CN113553411B (en) * 2021-06-30 2023-08-29 北京百度网讯科技有限公司 Query statement generation method and device, electronic equipment and storage medium
CN114692208B (en) * 2022-05-31 2022-09-27 中建电子商务有限责任公司 Processing method of data query service authority
CN114692208A (en) * 2022-05-31 2022-07-01 中建电子商务有限责任公司 Processing method of data query service authority
CN116089476A (en) * 2023-04-07 2023-05-09 北京宝兰德软件股份有限公司 Data query method and device and electronic equipment

Also Published As

Publication number Publication date
CN110609849B (en) 2022-03-25

Similar Documents

Publication Publication Date Title
CN110609849B (en) Natural language generation method based on SQL syntax tree node type
CN109284506B (en) User comment emotion analysis system and method based on attention convolution neural network
CN107273355B (en) Chinese word vector generation method based on word and phrase joint training
CN109086270B (en) Automatic poetry making system and method based on ancient poetry corpus vectorization
CN109902159A (en) A kind of intelligent O&M statement similarity matching process based on natural language processing
CN106126507A (en) A kind of based on character-coded degree of depth nerve interpretation method and system
CN111782961B (en) Answer recommendation method oriented to machine reading understanding
CN108427665A (en) A kind of text automatic generation method based on LSTM type RNN models
CN110532395B (en) Semantic embedding-based word vector improvement model establishing method
CN111368082A (en) Emotion analysis method for domain adaptive word embedding based on hierarchical network
CN113821635A (en) Text abstract generation method and system for financial field
CN111400494A (en) Sentiment analysis method based on GCN-Attention
CN114925195A (en) Standard content text abstract generation method integrating vocabulary coding and structure coding
CN113157919A (en) Sentence text aspect level emotion classification method and system
CN113255366A (en) Aspect-level text emotion analysis method based on heterogeneous graph neural network
CN114254645A (en) Artificial intelligence auxiliary writing system
CN115687609A (en) Zero sample relation extraction method based on Prompt multi-template fusion
CN108875024B (en) Text classification method and system, readable storage medium and electronic equipment
CN116720519B (en) Seedling medicine named entity identification method
CN111259106A (en) Relation extraction method combining neural network and feature calculation
CN110705274A (en) Fusion type word meaning embedding method based on real-time learning
CN109815323B (en) Human-computer interaction training question-answer generation algorithm
CN110442693B (en) Reply message generation method, device, server and medium based on artificial intelligence
CN113901758A (en) Relation extraction method for knowledge graph automatic construction system
Lee et al. A two-level recurrent neural network language model based on the continuous Bag-of-Words model for sentence classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant