CN114201506B - Context-dependent semantic analysis method - Google Patents

Info

Publication number
CN114201506B
CN114201506B CN202111524256A
Authority
CN
China
Prior art keywords
alignment matrix
current
alignment
text
round
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111524256.9A
Other languages
Chinese (zh)
Other versions
CN114201506A (en)
Inventor
陈观林
余皆毅
李甜
杨武剑
翁文勇
Current Assignee
Zhejiang University City College ZUCC
Original Assignee
Zhejiang University City College ZUCC
Priority date
Filing date
Publication date
Application filed by Zhejiang University City College ZUCC filed Critical Zhejiang University City College ZUCC
Priority to CN202111524256.9A priority Critical patent/CN114201506B/en
Publication of CN114201506A publication Critical patent/CN114201506A/en
Application granted granted Critical
Publication of CN114201506B publication Critical patent/CN114201506B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a context-dependent semantic analysis method, which comprises the following steps: encoding the data with ERNIE, arranging the question text and the tables and columns of the database in sequence as the initial input of the Alignment-Rect-Augmented NL2SQL model; updating attention using the current text encoding and the text encodings of the previous rounds; and performing an attention operation using the obtained question-text encoding and the alignment matrix of the previous round, and computing the cross entropy between the generated state-tracking alignment matrix and the true labels. The beneficial effects of the invention are as follows: a Chinese context-dependent NL2SQL model based on alignment-matrix enhancement is provided; on the basis of the RATSQL model, it uses the pre-trained language model BERT to obtain character-level encodings of the question text and the database schema, yielding word vectors based on context information, and introduces state-tracking attention and correlation features to strengthen the model's alignment between the question text and the database schema.

Description

Context-dependent semantic analysis method
Technical Field
The invention belongs to the technical field of context-dependent semantic parsing, and in particular relates to a context-dependent semantic parsing method.
Background
As digital data has grown alongside the internet, conventional relational databases are typically used to store such data for ease of administration, operation, and maintenance. How to query the necessary information from these relational databases through natural language, that is, how to convert a human natural-language query description into an executable SQL database query statement, has become one of the most popular research directions in the field of natural language processing.
The natural language query interface aims to let a user complete human-computer interaction with a relational database through natural language and obtain the desired data, and it is a component for building an automated intelligent database query system. The most important task in implementing a natural language query interface is constructing SQL statements from natural language queries, known as the NL2SQL task.
The core of NL2SQL is eliminating the differences in representation and structure among the natural language query, the structure and content of the data tables in the database, and the SQL statement. The problem is how to map the query intent of a natural language query onto the canonical description of the database so as to form an accurate, executable SQL statement; here the NL2SQL operation is constrained by a syntax tree, and each part of the SQL statement is handled in its own subtree. This eliminates the inconsistencies between Chinese text data and column names, and between the natural language query description and the data stored in the database.
Existing NL2SQL research methods can be broadly divided into context-independent and context-dependent methods. A context-independent method is single-round NL2SQL: given a text and a database schema, a model learns the relation between them to generate the desired SQL statement. The disadvantage of this approach is that it can only process a single round of information, whereas people habitually converse in simple sentences, and it is demanding to supply all the desired information at once. A context-dependent method is multi-round NL2SQL: through multiple rounds of dialogue, the dialogue content of each round is aligned with the database schema, and the corresponding SQL statement of each round is obtained by stepwise updating. Compared with context-independent methods, context-dependent methods can capture the more complex and varied semantic information described in natural language text.
The Chinese invention patent (application number CN202110737345.5, titled "Semantic analysis method, device, electronic equipment and storage medium") encodes the input question and the corresponding database, then generates the SQL query statement from the encoding result, performing the following processing on each SQL clause: determining the question segment corresponding to the SQL clause in the question, then generating the SQL clause from that question segment, which can improve the accuracy of the generated SQL query statement. However, that invention is context-independent NL2SQL and performs poorly on context-dependent NL2SQL tasks.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art and provides a context-dependent semantic analysis method.
The context-dependent semantic parsing method specifically comprises the following steps:
S1, encoding the data with ERNIE, arranging the question text and the tables and columns of the database in sequence as the initial input of the Alignment-Rect-Augmented NL2SQL model, and obtaining the encoded representation of the question text and the database schema;

x_input = (q_1, q_2, ..., q_i, t_1, t_2, ..., t_j, c_1, c_2, ..., c_k)

in the above, q_i represents each character element of the question text, i being the total number of character elements of the question text; t_j represents the name of each table in the database, j being the total number of tables in the database; c_k represents the column names under the corresponding tables in the database, k being the total number of column names in the database;
S2, updating attention using the current text encoding and the text encodings of the previous rounds;
S3, performing an attention operation using the question-text encoding obtained in step S2 and the alignment matrix of the previous round, and computing the cross entropy between the generated state-tracking alignment matrix and the true labels; the alignment matrix is the co-occurrence information between the question text and the database schema, i.e., a matrix of correspondences that appear both in the question and in the database schema; it aligns the relation between the question text and the database schema, and the dialogue information of the current round is added to correct alignment errors in the current round's alignment matrix;
S4, calculating the correlation of the alignment matrix in an n_gram manner and acquiring it through correlation Schema-Linking to obtain the correlation alignment matrix: first, determining the n_gram length, searching the question text, and listing the words or phrases within the n_gram length in the question text; then performing correlation calculations between the candidate fragments and the table names and column names respectively, and comparing each correlation result with a set threshold; if the correlation result is greater than the threshold, candidate fragments related to table names are labeled CES, candidate fragments related to column names are labeled TES, and candidate fragments related to neither are labeled NONE;
S5, performing a fusion calculation on the state-tracking alignment matrix of step S3 and the correlation alignment matrix of step S4 to obtain the current alignment matrix;
S6, replacing the alignment matrix in the NL2SQL neural network model with the alignment matrix obtained in step S5, and substituting it into step S3 to train the resulting Alignment-Rect-Augmented NL2SQL model.
Preferably, the step S2 specifically includes:
the historical-information tracking attention mechanism captures the interrelation between the current round's question text and the historical question texts; in the current round t, the attention weights between the current round's question-text encoding and the historical question-text encodings are calculated by dot-product attention, and the historical question-text encodings are then added to the current question-text encoding by weighted average:

s_i = (q^t)^T W_turn-att q^(t-i)

α_turn = softmax(s_i)

q̃^t = q^t + Σ_i α_turn,i · q^(t-i)

in the above, q^(t-i) is the word embedding of the previous rounds, q^t is the word embedding of the current round, W_turn-att is the parameter to be learned, α_turn is the learned attention weight of the previous rounds' question-text encodings with respect to the current round, and q̃^t is the question-text encoding updated by historical attention; q̃^t is used to describe the current context-dependent information.
Preferably, the step S3 specifically includes the steps of:
S3.1, based on the schema_linking annotated in the data, performing attention calculation using the question-text features extracted in this round and the word embedding of the previous round's alignment matrix, and establishing a parameter model that captures the guiding effect of the previous round's alignment matrix on the current round, to guide the update of the current round's alignment matrix and align the relation between the question text and the database schema more accurately;

R_current ← R_current + a_i · R_last

in the above, the word embedding of the previous round's alignment matrix R_last and the word embedding q_current of the current round's dialogue are combined through the weights W and U to be learned with a tanh activation to produce the attention weight a_i; R_current is the alignment matrix of the current round, R_last is the alignment matrix of the previous round, and the updated R_current is obtained once the weights are learned;
S3.2, correcting alignment errors in the current round's alignment matrix by adding the dialogue information of the current round: taking the dialogue information as the key and the word embedding of the previous round's alignment matrix as the query, attention is computed between the key and the query to measure the importance of the previous round's alignment matrix to the current alignment matrix under the current dialogue information; the current round's alignment matrix is then updated with the weights computed from the previous round's alignment matrix and this attention; finally, the schema-linking annotation information is taken as the true label, and a loss function is added to the original Alignment-Rect-Augmented NL2SQL model and trained jointly as a supervision signal to update the current alignment matrix; the loss function is:

L_align = -(1/(m·n)) Σ_{i=1}^{m} Σ_{j=1}^{n} Σ_{k=1}^{q} y_k · log p(x_k)

in the above, y_k is the label of the sample, p(x_k) is the probability that the sample is predicted as the positive class, m and n are the dimensions of the alignment matrix, and q is the number of classes of the alignment-matrix labels.
Preferably, the step S5 specifically includes the steps of:
S5.1, performing a weighted summation of the explicit alignment matrix and the implicit alignment matrix, setting a parameter to be learned, and letting the Alignment-Rect-Augmented NL2SQL model back-propagate by gradient descent to learn the weight relation between the explicit and implicit alignment matrices;
S5.2, adding a ReLU function as the activation function to give the parameters to be learned nonlinear capability:
R̂ = ReLU(W [R̂_st ; R̂_rel])

e_ij = (x_i W^Q)(x_j W^K + r_ij)^T / √d_z,  α_ij = softmax_j(e_ij),  z_i = Σ_j α_ij (x_j W^V + r_ij)

in the above, R̂_st is the current explicit alignment-matrix encoding, R̂_rel is the current correlation alignment-matrix encoding, ReLU is the activation function, W is the parameter to be learned, and R̂ is the learned final alignment matrix; x_i, x_j are the encoded representations of the input, W^Q, W^K, W^V are the parameters to be learned for the attention operation, r_ij is the relative positional relationship of the operation, α_ij is the attention weight, and z_i is the final result obtained with the parameters to be learned.
The beneficial effects of the invention are as follows:
Aiming at the lack of a matched context-dependent model in context-independent NL2SQL technology, the invention provides a Chinese context-dependent NL2SQL model based on alignment-matrix enhancement, Alignment-Rect-Augmented NL2SQL. On the basis of the RATSQL model, it uses the pre-trained language model BERT to obtain character-level encodings of the question text and the database schema, yielding word vectors based on context information, and then introduces state-tracking attention and correlation features to strengthen the model's alignment between the question text and the database schema.
Experimental results on the CHASE dataset show that the Alignment-Rect-Augmented NL2SQL model outperforms other multi-round Chinese NL2SQL models on the reported indices. Alignment-Rect-Augmented NL2SQL aligns the relation between the question text and the database schema better and learns the preset relations, thereby improving the accuracy of the SQL-generation task.
Drawings
FIG. 1 is a schematic flow diagram of the alignment-matrix-enhanced Alignment-Rect-Augmented NL2SQL model proposed by the present invention;
FIG. 2 is a diagram of the historical-information tracking attention.
Detailed Description
The invention is further described below with reference to examples. The following examples are presented only to aid in the understanding of the invention. It should be noted that it will be apparent to those skilled in the art that modifications can be made to the present invention without departing from the principles of the invention, and such modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.
Example 1
An embodiment of the present application provides a context-dependent semantic parsing method as shown in fig. 1:
S1, encoding the data with ERNIE, arranging the question text and the tables and columns of the database in sequence as the initial input of the Alignment-Rect-Augmented NL2SQL model, and obtaining the encoded representation of the question text and the database schema;

x_input = (q_1, q_2, ..., q_i, t_1, t_2, ..., t_j, c_1, c_2, ..., c_k)

in the above, q_i represents each character element of the question text, i being the total number of character elements of the question text; t_j represents the name of each table in the database, j being the total number of tables in the database; c_k represents the column names under the corresponding tables in the database, k being the total number of column names in the database.
S2, updating attention using the current text encoding and the text encodings of the previous rounds;
S3, performing an attention operation using the question-text encoding obtained in step S2 and the alignment matrix of the previous round, and computing the cross entropy between the generated state-tracking alignment matrix and the true labels; the alignment matrix is the co-occurrence information between the question text and the database schema, i.e., a matrix of correspondences that appear both in the question and in the database schema; it aligns the relation between the question text and the database schema, and the dialogue information of the current round is added to correct alignment errors in the current round's alignment matrix;
S4, calculating the correlation of the alignment matrix in an n_gram manner and acquiring it through correlation Schema-Linking to obtain the correlation alignment matrix: first, determining the n_gram length, searching the question text, and listing the words or phrases within the n_gram length in the question text; then performing correlation calculations between the candidate fragments and the table names and column names respectively, and comparing each correlation result with a set threshold; if the correlation result is greater than the threshold, candidate fragments related to table names are labeled CES, candidate fragments related to column names are labeled TES, and candidate fragments related to neither are labeled NONE;
S5, performing a fusion calculation on the state-tracking alignment matrix of step S3 and the correlation alignment matrix of step S4 to obtain the current alignment matrix;
S6, replacing the alignment matrix in the NL2SQL neural network model with the alignment matrix obtained in step S5, and substituting it into step S3 to train the resulting Alignment-Rect-Augmented NL2SQL model.
Example two
Based on the first embodiment, the second embodiment of the present application details the method of the alignment-matrix-enhanced Chinese context-dependent NL2SQL model Alignment-Rect-Augmented NL2SQL of the first embodiment, comprising the following steps:
S1, acquiring the question-text and database-schema encodings;
ERNIE is used here to encode the data; the model takes the question together with the table and column sequence of the database as its initial input. The specific input format is as follows:

x_input = (q_1, q_2, ..., q_i, t_1, t_2, ..., t_j, c_1, c_2, ..., c_k)

in the above, q_i represents each character element of the question text, i being the total number of character elements of the question text; t_j represents the name of each table in the database, j being the total number of tables in the database; c_k represents the column names under the corresponding tables in the database, k being the total number of column names in the database.
This input is then fed into the model to obtain the encoded representation of the question text and the database schema.
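The serialization above can be sketched as follows. This is an illustrative Python sketch, not the patent's code; the character-level split of the question and the absence of separator tokens are assumptions, since the text specifies only the ordering q_1..q_i, t_1..t_j, c_1..c_k.

```python
def build_nl2sql_input(question, schema):
    """Flatten a question and a database schema into the sequence
    x_input = (q_1..q_i, t_1..t_j, c_1..c_k) described above.

    `schema` maps each table name to its list of column names.
    Splitting the question into single characters mirrors the
    character-level encoding used for Chinese text (an assumption).
    """
    tokens = list(question)              # q_1..q_i: question characters
    for table, columns in schema.items():
        tokens.append(table)             # t_j: table name
        tokens.extend(columns)           # c_k: columns of that table
    return tokens
```

In practice this token sequence would then be passed through the ERNIE tokenizer and encoder to obtain contextual representations; that step is omitted here.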
S2, updating by using the current text codes and the attention operation of the previous rounds;
as shown in FIG. 2, the historical information tracking attention mechanism captures the problem and historical problem correlations for the current turn. In the current round t, the attention weight of the problem code and the history problem code of the current round is calculated by using a dot product attention mode, and then the history problem code is added with the current problem code by a weighted average mode to obtainMay be used to describe the current context-bearing information. The formula is as follows:
α turn =softmax(s i )
in the above-mentioned description of the invention,word embedding of the first few rounds, < ->Word embedding, W, of the current round turn-att Is a parameter which can be learned, alpha turn Is the attention weight of the learned question code of the previous round to the current round,/for the previous round>Is the problem code updated by the history attention.
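A minimal NumPy sketch of this history-tracking attention, under the assumption that the score s_i is a bilinear dot product between the current encoding and each historical encoding through W_turn-att (the exact formula images are not reproduced in the record):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def history_tracking_attention(q_t, history, w_turn_att):
    """Update the current-round question encoding q_t (shape [d]) with a
    weighted average of the previous rounds' encodings `history`
    (a list of [d] vectors), as described above."""
    s = np.array([q_t @ w_turn_att @ h for h in history])  # s_i: dot-product scores
    alpha = softmax(s)                                     # alpha_turn = softmax(s_i)
    return q_t + alpha @ np.stack(history)                 # weighted history added to q_t
```

With w_turn_att as the identity the update reduces to plain dot-product attention, which makes the mechanism easy to check by hand.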
S3, performing attention operation by using the problem text code obtained in the step S2 and the alignment matrix of the previous round, and performing cross entropy calculation on the generated state tracking alignment matrix and the real label;
the alignment matrix is co-occurrence information between the problem and the database schema, i.e., a corresponding matrix of occurrences in the problem and occurrences in the database schema.
Based on the schema_linking annotated in the data, attention is computed using the question features extracted in this round and the word embedding of the previous round's alignment matrix, establishing a parameter model that captures the guiding effect of the previous round's alignment matrix on the current round; this model guides the update of the current round's alignment matrix so that the relation between the question and the database schema is aligned better. The formula is as follows:
R_current ← R_current + a_i · R_last

in the above, the word embedding of the previous round's alignment matrix R_last and the word embedding q_current of the current round's dialogue are combined through the learnable weights W and U with a tanh activation to produce the attention weight a_i; R_current is the alignment matrix of the current round, R_last is the alignment matrix of the previous round, and the updated R_current is obtained once the weights are learned.
In the model, possible alignment errors in the current round's alignment matrix can be corrected by adding knowledge of the current round's dialogue information. The dialogue information is taken as the key and the word embedding of the previous round's alignment matrix as the query; attention is computed between the key and the query to measure the importance of the previous round's alignment matrix to the current alignment matrix under the current dialogue information. The model then updates the current round's alignment matrix with the weights computed from the previous round's alignment matrix and this attention; finally, the schema-linking annotation information is taken as the true label, and a loss function is added to the original model and trained jointly, updating the current alignment matrix as a supervision signal. The formula is as follows:
L_align = -(1/(m·n)) Σ_{i=1}^{m} Σ_{j=1}^{n} Σ_{k=1}^{q} y_k · log p(x_k)

in the above, y_k is the label of the sample, p(x_k) is the probability that the sample is predicted as the positive class, m and n are the dimensions of the alignment matrix, and q is the number of classes of the alignment-matrix labels.
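The cross-entropy supervision over the alignment matrix can be sketched as below; averaging over the m·n matrix cells and the one-hot label layout are assumptions consistent with the symbol roles described above.

```python
import numpy as np

def alignment_cross_entropy(pred, labels, eps=1e-9):
    """Cross entropy between a predicted alignment matrix `pred`
    (shape [m, n, q]: per-cell probabilities over q link classes)
    and one-hot labels `labels` of the same shape, averaged over
    the m*n matrix cells. `eps` guards against log(0)."""
    m, n = pred.shape[0], pred.shape[1]
    return float(-(labels * np.log(pred + eps)).sum() / (m * n))
```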
S4, acquiring the correlation of the alignment matrix by using a correlation scheme-Linking;
as shown in table 1 below, the present embodiment uses the n_gram method to calculate the correlation of the alignment matrix. First, the length of the n_gram is determined, where the value is typically set to 5; secondly, searching the question text after the n_gram length, and listing possible words or sentences within the n_gram length in the question text; then, the candidate segment and the table name are respectively used for carrying out correlation calculation with the column name, and the segment of the candidate segment is marked as the following condition according to the set threshold value: 1) When the number is greater than a certain threshold, the table name is referred to as "CES", the column name is referred to as "TES", and the table name is referred to as "NONE".
Table 1 correlation-based name matching algorithm table
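Since the table itself is not reproduced in the record, the n_gram candidate search and threshold labeling described above can be sketched as follows; the character-overlap similarity stands in for the patent's unspecified correlation measure, and the 0.5 threshold is an assumption.

```python
def ngrams(text, max_n=5):
    """All substrings of the question text up to max_n characters
    (the embodiment typically sets the n_gram length to 5)."""
    return {text[i:i + n] for n in range(1, max_n + 1) for i in range(len(text) - n + 1)}

def char_overlap(a, b):
    """A simple similarity: shared-character ratio over the union of
    characters. A stand-in for the unspecified correlation measure."""
    return len(set(a) & set(b)) / max(len(set(a) | set(b)), 1)

def tag_candidates(question, tables, columns, threshold=0.5):
    """Label each candidate fragment CES if it correlates with a table
    name, TES if with a column name, NONE otherwise (labels as in the
    patent text)."""
    tags = {}
    for frag in ngrams(question):
        if any(char_overlap(frag, t) > threshold for t in tables):
            tags[frag] = "CES"
        elif any(char_overlap(frag, c) > threshold for c in columns):
            tags[frag] = "TES"
        else:
            tags[frag] = "NONE"
    return tags
```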
S5, performing fusion calculation on the state tracking alignment matrix and the correlation alignment matrix by utilizing the step S3 and the step S4 to obtain a current alignment matrix;
In this module, a fusion calculation is performed on the explicit alignment matrix and the implicit alignment matrix. Simply adding them is not a good choice, because the magnitude of each part's effect on the model is uncertain. They are therefore weighted and summed: a learnable parameter is set, and the model learns the weight relation between them through back-propagation by gradient descent. Furthermore, the relation between the explicit and implicit alignment matrices is not necessarily linear and may be a nonlinear combination. Because the explicit relation and the correlation are not necessarily evaluated in the same dimension, an activation function is added on top of the parameterized combination, giving the parameter learning nonlinear capability. Since the required output is not a probability, the sigmoid function is unsuitable; since the output range is not necessarily (-1, +1), the tanh function is also unsuitable for the alignment-matrix-enhanced encoder; the ReLU function is therefore chosen as the activation function for this part. The formula is as follows:
R̂ = ReLU(W [R̂_st ; R̂_rel])

e_ij = (x_i W^Q)(x_j W^K + r_ij)^T / √d_z,  α_ij = softmax_j(e_ij),  z_i = Σ_j α_ij (x_j W^V + r_ij)

in the above, R̂_st is the current explicit alignment-matrix encoding, R̂_rel is the current correlation alignment-matrix encoding, ReLU is the activation function, W is the learnable parameter, and R̂ is the learned final alignment matrix; x_i, x_j are the encoded representations of the input, W^Q, W^K, W^V are the learnable parameters for the attention operation, r_ij is the relative positional relationship of the operation, α_ij is the attention weight, and z_i is the final result obtained with the learnable parameters.
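The fusion step can be sketched as a weighted sum followed by ReLU; the scalar weights w1 and w2 stand in for the learnable parameter, whose exact parameterization the patent leaves to training.

```python
import numpy as np

def fuse_alignments(r_state, r_rel, w1, w2):
    """Fuse the explicit (state-tracking) alignment matrix r_state and
    the implicit (correlation) alignment matrix r_rel by a learnable
    weighted sum; ReLU supplies the nonlinear capability argued for
    above."""
    return np.maximum(w1 * r_state + w2 * r_rel, 0.0)
```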
S6, replacing the alignment matrix in the NL2SQL neural network model with the alignment matrix obtained in step S5, and substituting it into step S3 to train the resulting model.
The several deep learning models of this embodiment are all trained by fine-tuning the pre-trained model ERNIE plus task-specific network layers for the downstream tasks, with the tokenizer using the ERNIE version. To prevent an overly large learning rate from damaging the knowledge already learned by the pre-trained model during fine-tuning, the learning rate for the pre-trained part is set to 1e-5; the downstream tasks are decoded with an LSTM, and since the LSTM is not pre-trained, its learning rate is set to 2e-3 for better learning. lstm_hidden_size is the hidden dimension of the BiLSTM output, train_epochs is the total number of training iterations, and batch_size is the batch size used in training. word_embedding_size is 768 dimensions, the encoder's lstm_hidden_size is 400 dimensions, and the decoder's lstm_hidden_size is 300 dimensions; because multi-round NL2SQL has a sequential dependency between rounds, batch_size can only be 1. train_epochs is 50.
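The training hyperparameters of this embodiment, collected into one place for reference; the dict layout and key names are illustrative and not part of the patent, while the values are those stated in the text.

```python
# Training configuration collected from the embodiment above.
CONFIG = {
    "pretrained_model": "ernie",
    "lr_pretrained": 1e-5,            # small LR protects ERNIE's pre-trained knowledge
    "lr_decoder": 2e-3,               # the LSTM decoder is trained from scratch
    "word_embedding_size": 768,       # embedding dimension
    "encoder_lstm_hidden_size": 400,  # BiLSTM hidden dimension, encoder side
    "decoder_lstm_hidden_size": 300,  # BiLSTM hidden dimension, decoder side
    "train_epochs": 50,
    "batch_size": 1,                  # rounds depend on earlier rounds, so one dialogue per batch
}
```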

Claims (4)

1. A method of context-dependent semantic parsing, comprising the steps of:
s1, encoding data by using ERNIE, sequentially arranging the tables and columns of a problem text and a database, and obtaining an encoding representation of the problem text and the database mode by using the table and the columns as initial input of an Alignment-Rect-Augmented NL2SQL model;
x input =(q 1 ,q 2 ,...,q i ,t 1 ,t 2 ,...,t j ,c 1 ,c 2 ,...,c k )
in the above, q i Each character element representing a question text, i being the total number of character elements of the question text; t is t j Representing the name of each table in the database, j being the total number of tables in the database; c k Representing the column names under the corresponding tables in the database, k being the total number of column names present in the database;
s2, updating attention by using the current text codes and the text codes of the previous rounds;
s3, performing attention operation by using the problem text code obtained in the step S2 and the alignment matrix of the previous round, and performing cross entropy calculation on the generated state tracking alignment matrix and the real label; the alignment matrix is shared information between the problem text and the database mode, the relation between the problem text and the database mode is aligned, and dialogue information of the current round is added to correct the alignment problem of the alignment matrix of the current round;
s4, calculating the correlation of the alignment matrix by using an n_gram mode, and acquiring the correlation of the alignment matrix by using a correlation Schema-Linking to obtain a correlation alignment matrix: firstly, determining the length of an n_gram, searching a question text, and listing words or sentences existing in the length of the n_gram in the question text; performing correlation calculation by using the candidate fragments, the table names and the column names respectively, and comparing a correlation calculation result with a set threshold value; if the correlation calculation result is greater than the threshold value, setting the candidate fragments related to the table names as CES, setting the candidate fragments related to the column names as TES, and setting the candidate fragments not related to the column names as NONE;
S5, performing a fusion calculation on the state-tracking alignment matrix of step S3 and the relevance alignment matrix of step S4 to obtain the current alignment matrix;
S6, replacing the alignment matrix in the NL2SQL neural network model with the alignment matrix obtained in step S5 and feeding it back into step S3 to train the resulting Alignment-Rect-Augmented NL2SQL model.
2. The context-dependent semantic parsing method according to claim 1, wherein step S2 is specifically:
the history-tracking attention mechanism captures the interrelation between the current round's question text and the historical question texts: in the current round t, the attention weights between the current round's question-text encoding and the historical question-text encodings are computed by dot-product attention, and the historical encodings are then added to the current encoding as a weighted average, giving
s_i = (q_i^hist)^T · W_turn-att · q_t
α_turn = softmax(s_i)
q̃_t = q_t + Σ_i (α_turn)_i · q_i^hist
In the above, q_i^hist is the word embedding of the previous rounds, q_t is the word embedding of the current round, W_turn-att is the parameter to be learned, α_turn is the learned attention weight of the previous rounds' question-text encodings with respect to the current round, and q̃_t is the question-text encoding updated by the historical attention; q̃_t is used to describe the current context-relevant information.
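The history-tracking attention of step S2 can be sketched in numpy as follows. The shapes and the learned matrix W are stand-ins for the patent's W_turn-att, and the toy vectors replace real ERNIE encodings; this is an assumption-laden sketch, not the claimed implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def history_attention(q_cur, q_hist, W):
    """q_cur: (d,) current-round encoding; q_hist: (T, d) one row per
    previous round; W: (d, d) learned attention parameter.
    Returns the current encoding plus a weighted average of history."""
    scores = q_hist @ (W @ q_cur)   # s_i: dot-product score per history turn
    alpha = softmax(scores)         # alpha_turn: attention weights
    return q_cur + alpha @ q_hist   # history-updated question encoding
```

With W set to the identity, the history turn most similar to the current encoding receives the largest weight, which is the behaviour the claim describes.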
3. The context-dependent semantic parsing method according to claim 2, wherein step S3 specifically comprises the steps of:
S3.1, based on the schema linking annotated in the data, attention is computed between the question-text features extracted in the current round and the word embedding of the previous round's alignment matrix; a parametric model capturing the guidance of the previous round's alignment matrix on the current round is established to guide the update of the current round's alignment matrix, aligning the relation between the question text and the database schema more accurately;
a_i = tanh(W · q_current + U · r_last)
R_current = R_current + a_i · R_last
In the above, r_last is the word embedding of the previous round's alignment matrix, q_current is the word embedding of the current round's dialogue, tanh is the activation, W and U are the weights to be learned, a_i is the learned attention weight, R_current is the alignment matrix of the current round and R_last is the alignment matrix of the previous round; the updated R_current is obtained after the weights are learned.
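A minimal numpy sketch of the step S3.1 update, under assumptions: the previous alignment matrix is pooled to a vector before the tanh gate, and W and U are flat weight vectors producing a scalar gate a_i. The pooling choice and shapes are not specified by the patent and are illustrative only.

```python
import numpy as np

def update_alignment(R_cur, R_last, q_cur, W, U):
    """R_cur, R_last: (m, n) alignment matrices; q_cur: (d,) dialogue
    embedding; W: (d,), U: (m,) learned weights.
    Gate a_i = tanh(W.q_cur + U.pool(R_last)) mixes R_last into R_cur."""
    a = np.tanh(W @ q_cur + U @ R_last.mean(axis=1))  # scalar gate a_i
    return R_cur + a * R_last                         # guided update
```

When the learned weights are zero the gate is tanh(0) = 0 and the previous round contributes nothing, which gives a simple sanity check on the update rule.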
S3.2, correcting the alignment errors of the current round's alignment matrix by adding the current round's dialogue information: the dialogue information serves as the key and the word embedding of the previous round's alignment matrix as the query, and attention is computed between the key and the query; this yields the importance of the previous round's alignment matrix for the current alignment matrix under the current dialogue information, and the attention weights thus computed are used to update the current round's alignment matrix. The schema-linking annotations are then used as ground-truth labels, a loss function is added to the original Alignment-Rect-Augmented NL2SQL model for joint training, and the current alignment matrix is updated under this supervision signal; the loss function is:
Loss = -(1 / (m·n)) · Σ_{i=1}^{m} Σ_{j=1}^{n} Σ_{k=1}^{q} y_k · log p(x_k)
In the above, y_k is the label of the sample, p(x_k) is the probability that the sample is predicted as the positive class, m and n are the dimensions of the alignment matrix, and q is the number of classes of the alignment-matrix labels.
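The supervision of step S3.2 amounts to an elementwise cross-entropy over the m × n alignment matrix with q label classes. The numpy sketch below assumes the labels are given as integer class indices and averages over all matrix cells; this averaging convention is an assumption consistent with the claim's y_k, p(x_k), m, n and q, not a quotation of the patent's code.

```python
import numpy as np

def alignment_cross_entropy(probs, labels):
    """probs: (m, n, q) per-cell class probabilities (rows of softmax
    outputs); labels: (m, n) integer class indices.
    Returns the mean negative log-likelihood over all m*n cells."""
    m, n, _ = probs.shape
    # Fancy indexing picks the probability of the true class in each cell.
    picked = probs[np.arange(m)[:, None], np.arange(n)[None, :], labels]
    return -np.log(picked).mean()
```

A perfectly confident correct prediction gives zero loss; lower confidence on the true class increases it, as with the standard cross-entropy.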
4. A context-dependent semantic parsing method according to claim 3, wherein step S5 specifically comprises the steps of:
S5.1, performing a weighted summation of the explicit alignment matrix and the implicit alignment matrix, with a parameter to be learned, allowing the Alignment-Rect-Augmented NL2SQL model to back-propagate via gradient descent and learn the weight relation between the explicit and implicit alignment matrices;
S5.2, adding a ReLU function as the activation function:
R_final = ReLU(W · [R_track ; R_rel])
e_ij = (x_i W_Q)(x_j W_K + r_ij)^T / √d
α_ij = softmax(e_ij)
z_i = Σ_j α_ij (x_j W_V + r_ij)
In the above, R_track is the current explicit (state-tracking) alignment-matrix encoding, R_rel is the current relevance alignment-matrix encoding, ReLU is the activation function, W is the parameter to be learned, and R_final is the final learned alignment matrix; x_i and x_j are the encoded representations of the input, W_Q, W_K and W_V are the parameters to be learned for the attention operation, r_ij is the relative positional relation of the operation, α_ij is the attention weight, and z_i is the final result.
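The fusion of step S5 can be sketched as a learned weighted sum of the state-tracking (explicit) and relevance (implicit) alignment matrices passed through a ReLU. The single scalar weight w is a simplifying assumption standing in for the patent's learned fusion parameters.

```python
import numpy as np

def fuse_alignments(R_track, R_rel, w):
    """Weighted sum of the two alignment matrices followed by ReLU.
    R_track, R_rel: (m, n) matrices; w: learned mixing weight in [0, 1]."""
    return np.maximum(0.0, w * R_track + (1.0 - w) * R_rel)  # ReLU
```

In training, w (or a full weight matrix in its place) would be updated by gradient descent together with the rest of the Alignment-Rect-Augmented model, as step S5.1 describes.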
CN202111524256.9A 2021-12-14 2021-12-14 Context-dependent semantic analysis method Active CN114201506B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111524256.9A CN114201506B (en) 2021-12-14 2021-12-14 Context-dependent semantic analysis method

Publications (2)

Publication Number Publication Date
CN114201506A CN114201506A (en) 2022-03-18
CN114201506B true CN114201506B (en) 2024-03-29

Family

ID=80653451


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019229769A1 (en) * 2018-05-28 2019-12-05 Thottapilly Sanjeev An auto-disambiguation bot engine for dynamic corpus selection per query
WO2021010636A1 (en) * 2019-07-17 2021-01-21 에스케이텔레콤 주식회사 Method and device for tracking dialogue state in goal-oriented dialogue system
CN112988785A (en) * 2021-05-10 2021-06-18 浙江大学 SQL conversion method and system based on language model coding and multitask decoding
CN113011136A (en) * 2021-04-02 2021-06-22 中国人民解放军国防科技大学 SQL (structured query language) analysis method and device based on correlation judgment and computer equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Semantic-aware abstractive summarization model for Chinese short texts; Ni Haiqing; Liu Dan; Shi Mengyu; Computer Science; 2020, Issue 06; full text *
Research on natural-language generation of multi-table SQL query statements; Cao Jinchao; Huang Tao; Chen Gang; Wu Xiaofan; Chen Ke; Journal of Frontiers of Computer Science and Technology; 2020, Issue 07; full text *


Similar Documents

Publication Publication Date Title
CN110119765B (en) Keyword extraction method based on Seq2Seq framework
CN108519890B (en) Robust code abstract generation method based on self-attention mechanism
CN111241294B (en) Relationship extraction method of graph convolution network based on dependency analysis and keywords
CN110929030A (en) Text abstract and emotion classification combined training method
US20030046078A1 (en) Supervised automatic text generation based on word classes for language modeling
CN111858932A (en) Multiple-feature Chinese and English emotion classification method and system based on Transformer
CN115048447B (en) Database natural language interface system based on intelligent semantic completion
CN110390049B (en) Automatic answer generation method for software development questions
CN109992775A (en) A kind of text snippet generation method based on high-level semantics
CN112883175B (en) Meteorological service interaction method and system combining pre-training model and template generation
CN115658729A (en) Method for converting natural language into SQL (structured query language) statement based on pre-training model
CN111666764A (en) XLNET-based automatic summarization method and device
CN115392252A (en) Entity identification method integrating self-attention and hierarchical residual error memory network
CN113535897A (en) Fine-grained emotion analysis method based on syntactic relation and opinion word distribution
CN114510946B (en) Deep neural network-based Chinese named entity recognition method and system
CN112925918A (en) Question-answer matching system based on disease field knowledge graph
CN114429132A (en) Named entity identification method and device based on mixed lattice self-attention network
CN115952263A (en) Question-answering method fusing machine reading understanding
CN116821168A (en) Improved NL2SQL method based on large language model
CN114356990A (en) Base named entity recognition system and method based on transfer learning
CN111581365B (en) Predicate extraction method
CN114201506B (en) Context-dependent semantic analysis method
CN111813907A (en) Question and sentence intention identification method in natural language question-answering technology
CN115203236B (en) text-to-SQL generating method based on template retrieval
CN116522165A (en) Public opinion text matching system and method based on twin structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant