CN111382253A - Semantic parsing method and semantic parser - Google Patents

Semantic parsing method and semantic parser

Info

Publication number
CN111382253A
CN111382253A (application CN202010135354.2A)
Authority
CN
China
Prior art keywords
sentence
statement
natural
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010135354.2A
Other languages
Chinese (zh)
Other versions
CN111382253B (en)
Inventor
俞凯
曹瑞升
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AI Speech Ltd
Original Assignee
AI Speech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AI Speech Ltd filed Critical AI Speech Ltd
Priority to CN202010135354.2A priority Critical patent/CN111382253B/en
Publication of CN111382253A publication Critical patent/CN111382253A/en
Application granted granted Critical
Publication of CN111382253B publication Critical patent/CN111382253B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis

Abstract

The invention discloses a semantic parsing method and a semantic parser. The method comprises: receiving a natural sentence to be parsed; determining a canonical statement corresponding to the natural sentence to be parsed; and inputting the canonical statement into a pre-trained naive semantic parser to obtain the logical form corresponding to the canonical statement. In the embodiment of the invention, the natural sentence is first converted into a canonical statement with the same meaning, and that canonical statement is then parsed by a naive semantic parser trained in advance on a (canonical statement, logical form) sample set, which alleviates the poor generalization observed when a naive semantic parser parses natural sentences directly.

Description

Semantic parsing method and semantic parser
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a semantic parsing method and a semantic parser.
Background
Semantic parsing refers to the task of converting a natural language question into a logical form. A logical form is a structured semantic representation, usually an executable statement such as a Lambda expression or an SQL query, which a program can execute directly to retrieve an answer from a database. Because it is tightly coupled with knowledge bases, semantic parsing is widely applied to automatic question answering over knowledge graphs or databases.
In order to build a semantic parser in a new domain, researchers first need to obtain a large amount of training data, usually starting by writing template rules that produce (canonical question, logical form) tuples.
Template rules: the researcher manually writes grammar rules that map a question directly into a logical form. For example, the question template "the graduation school of ${#person} is (where|which|what)" corresponds to a logical-form frame (taking SQL as an example) such as "select university from person where name = ${#person}". The question part may be a regular expression, and the logical form is a deterministic semantic representation with some semantic slots to be filled. Because questions written by rules are rigid in form and not colloquial enough, they are usually called canonical statements (canonical utterances), while the questions actually asked by users are called natural statements (natural language utterances).
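As a purely illustrative sketch (not taken from the patent), the following Python snippet shows how such a template rule could pair a question pattern with an SQL frame and fill its slot; the pattern, slot name and SQL frame are hypothetical examples:

```python
import re
from typing import Optional

# Hypothetical template rule: a canonical-question pattern paired with an SQL frame.
RULE = {
    "pattern": re.compile(r"the graduation school of (?P<person>.+) is (where|which|what)"),
    "sql_frame": "select university from person where name = '{person}'",
}

def apply_rule(question: str) -> Optional[str]:
    """Return the logical form (SQL) if the question matches the template, else None."""
    match = RULE["pattern"].match(question)
    if match is None:
        return None
    return RULE["sql_frame"].format(person=match.group("person"))

print(apply_rule("the graduation school of Alice is where"))
# -> select university from person where name = 'Alice'
```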
In one prior art, all regularly written (canonical question, logical form) tuples are used as training samples, and a semantic parser is directly trained without using any additional data source or label.
However, because the training corpus is generated only by template rules, there is an obvious difference in data distribution between canonical statements and natural sentences; the naive semantic parser trained in this way therefore performs poorly on real questions (natural language questions) and generalizes badly.
Disclosure of Invention
The embodiment of the invention provides a semantic parsing method and a semantic parser, which are used for solving at least one of the technical problems.
In a first aspect, an embodiment of the present invention provides a semantic parsing method, including:
receiving a natural sentence to be analyzed;
determining a standard sentence corresponding to the natural sentence to be analyzed;
and inputting the standard statement into a pre-trained naive semantic parser to obtain a logic expression corresponding to the standard statement.
In some embodiments, determining a canonical statement corresponding to the natural language to be parsed comprises: and inputting the natural sentence to be analyzed into a pre-trained sentence conversion model to obtain a standard sentence corresponding to the natural sentence to be analyzed.
In some embodiments, the step of pre-training the sentence transformation model comprises: and pre-training by adopting an unsupervised training method based on the natural sentence sample set to obtain a sentence conversion model.
In some embodiments, pre-training the sentence conversion model based on the natural sentence sample set by using an unsupervised training method includes:
initializing a statement conversion model;
and performing a reverse translation task and a dual reinforcement learning task on the initialized sentence conversion model to obtain a pre-trained sentence conversion model.
In some embodiments, the sentence conversion model comprises a shared encoder, a first decoder and a second decoder, wherein the shared encoder and the first decoder form a natural sentence reconstruction model, and the shared encoder and the second decoder form a canonical statement reconstruction model;
initializing the sentence conversion model includes:
training the natural sentence reconstruction model with a reconstruction loss function as the training objective and a noise-corrupted sample natural sentence as input;
and training the canonical statement reconstruction model with the reconstruction loss function as the training objective and a noise-corrupted sample canonical statement as input.
In some embodiments, the step of adding noise to the sample natural sentences and the sample canonical statements comprises: deleting at least one word in the sample natural sentence and the sample canonical statement, and/or
inserting at least one word into the sample natural sentence and the sample canonical statement, and/or
transforming the order of at least one word in the sample natural sentence and the sample canonical statement.
In some embodiments, the encoder and the second decoder in the sentence conversion model after training are configured to convert the received natural sentences into corresponding canonical sentences.
In a second aspect, an embodiment of the present invention provides a semantic parser, including: a canonical statement determination module, configured to determine a canonical statement corresponding to a natural sentence to be parsed; and a naive semantic parser, configured to parse the canonical statement to obtain the corresponding logical form.
In some embodiments, the canonical statement determination module includes a shared encoder, a first decoder, and a second decoder, wherein the shared encoder and the first decoder form a natural statement reconstruction model, and the encoder and the second decoder form a canonical statement reconstruction model.
In a third aspect, an embodiment of the present invention provides a storage medium, where one or more programs including execution instructions are stored, where the execution instructions can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device, etc.) to perform any one of the semantic parsing methods of the present invention.
In a fourth aspect, an electronic device is provided, comprising: the semantic analysis system comprises at least one processor and a memory which is connected with the at least one processor in a communication mode, wherein the memory stores instructions which can be executed by the at least one processor, and the instructions are executed by the at least one processor so as to enable the at least one processor to execute any one of the semantic analysis methods.
In a fifth aspect, an embodiment of the present invention further provides a computer program product, where the computer program product includes a computer program stored on a storage medium, and the computer program includes program instructions, which, when executed by a computer, cause the computer to execute any one of the semantic parsing methods described above.
The embodiments of the invention have the following beneficial effects: before parsing, the natural sentence is converted into a canonical statement with the same meaning, and that canonical statement is parsed by a naive semantic parser trained in advance on a (canonical statement, logical form) sample set, which alleviates the poor generalization observed when a naive semantic parser parses natural sentences directly.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flow chart of an embodiment of a semantic parsing method of the present invention;
FIG. 2 is an architecture diagram of the semantic parsing system of the present invention;
FIG. 3 is a diagram of one embodiment of a pre-trained sentence transformation model;
FIG. 4 is a diagram illustrating an embodiment of closed-loop learning of a sentence transfer model according to the present invention;
FIG. 5 is a semi-supervised result using labeled data of different proportions in the experiment of the present invention;
FIG. 6 is a functional block diagram of an embodiment of a semantic parser of the present invention;
fig. 7 is a schematic structural diagram of an embodiment of an electronic device according to the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
As used in this disclosure, "module," "device," "system," and the like are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, or software in execution. In particular, for example, an element may be, but is not limited to being, a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. Also, an application or script running on a server, or a server, may be an element. One or more elements may be in a process and/or thread of execution and an element may be localized on one computer and/or distributed between two or more computers and may be operated by various computer-readable media. The elements may also communicate by way of local and/or remote processes based on a signal having one or more data packets, e.g., from a data packet interacting with another element in a local system, distributed system, and/or across a network in the internet with other systems by way of the signal.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
As shown in fig. 1, an embodiment of the present invention provides a semantic parsing method, including:
s10, receiving a natural sentence to be analyzed;
and S20, determining a standard statement corresponding to the natural statement to be analyzed.
Illustratively, the determining a canonical statement corresponding to the natural sentence to be parsed includes: and inputting the natural sentence to be analyzed to a pre-trained sentence conversion model to obtain a standard sentence corresponding to the natural sentence to be analyzed.
And S30, inputting the canonical statement to a pre-trained naive semantic parser to obtain a logic expression corresponding to the canonical statement.
In the embodiment of the invention, the natural sentence is converted into a canonical statement with the same meaning before parsing, and that canonical statement is parsed by a naive semantic parser trained in advance on a (canonical statement, logical form) sample set, which alleviates the poor generalization observed when a naive semantic parser parses natural sentences directly.
Illustratively, the step of pre-training the sentence transformation model comprises: and pre-training by adopting an unsupervised training method based on the natural sentence sample set to obtain the sentence conversion model. Illustratively, the pre-training of the sentence conversion model by using an unsupervised training method based on the natural sentence sample set includes: initializing the statement conversion model; and performing a reverse translation task and a dual reinforcement learning task on the initialized sentence conversion model to obtain the pre-trained sentence conversion model.
The invention does not directly construct (natural statement, logical form) training samples from the (canonical statement, logical form) sample set to train the semantic parser, because the inventors found during implementation that: first, a large number of labeled natural sentences would have to be obtained, which requires manual annotation at an extremely high cost; second, the canonical statements would only be used to obtain the (natural statement, logical form) training pairs. Once the data set is constructed, the canonical statements are discarded and no longer participate in building the semantic parser, so the prior knowledge contained in the canonical statements is poorly utilized.
The method of the invention adopts an unsupervised training method to train the sentence conversion model in advance, thereby avoiding huge cost overhead caused by marking a large number of natural sentences.
As shown in fig. 2, the architecture of the semantic parsing system of the present invention performs semantic parsing of a natural sentence in two stages. The first stage employs an unsupervised rewrite model to rewrite the natural sentence into a canonical statement with the same meaning, e.g.,
the natural sentence is x: a total number of players;
the canonical statement is z: the number of players;
logical form y: select count (player) from basketball.
In the second stage, the naive neural-network semantic parser P_nsp parses the rewritten canonical statement ẑ and obtains the parsed logical form ŷ.
The unsupervised rewrite model includes an encoder, a decoder D_x, and a decoder D_z; the decoder D_x is only used during the training phase, while during the use phase the encoder and the decoder D_z are active.
The method divides the semantic parsing task of the traditional single stage into two stages, wherein the first stage realizes the conversion and rewriting from natural sentences to standard sentences, and the second stage utilizes a naive semantic parser to map the converted standard sentences into the logic form of the target. The two-stage parsing gradually reduces the difference between the input natural sentence and the target semantic representation and more fully utilizes the canonical sentence written by the rules. The naive semantic parser in the second stage is directly trained on the (standard statement, logic form) tuple generated by the rule; and the first stage rewriting model autonomously learns the conversion from the natural sentences to the standard sentences through unsupervised closed-loop learning, thereby greatly reducing the dependence on the labels.
The naive semantic parser of the second stage is obtained in the pre-training stage by traditional supervised learning on (canonical statement, logical form) training samples, and the details are not repeated here. The rewrite model of the first stage comprises a shared encoder E (for encoding both natural and canonical sentences) and two independent decoders, D_x and D_z, which generate natural sentences and canonical statements respectively. The natural-sentence decoder D_x is only used in the training stage. In the test and evaluation stage, a natural sentence x is input; the rewrite model first generates an intermediate canonical statement ẑ = D_z∘E(x), which is then passed through the naive parser P_nsp to generate the final target logical form ŷ = P_nsp(ẑ).
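For illustration only, the following Python sketch wires the two stages together under hypothetical interfaces; the class and method names are placeholders and not the implementation described in this embodiment:

```python
# Minimal sketch of the two-stage pipeline (hypothetical interfaces).
class TwoStageSemanticParser:
    def __init__(self, encoder, canonical_decoder, naive_parser):
        self.encoder = encoder                       # shared encoder E
        self.canonical_decoder = canonical_decoder   # decoder D_z
        self.naive_parser = naive_parser             # pre-trained P_nsp

    def parse(self, natural_sentence: str) -> str:
        # Stage 1: rewrite the natural sentence into a canonical statement.
        hidden = self.encoder.encode(natural_sentence)
        canonical = self.canonical_decoder.decode(hidden)   # z_hat = D_z(E(x))
        # Stage 2: map the canonical statement to a logical form.
        return self.naive_parser.parse(canonical)            # y_hat = P_nsp(z_hat)

# Usage, following the example in the description:
#   parser.parse("a total number of players")
#   -> canonical "the number of players" -> "select count(player) from basketball"
```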
The algorithm focuses on how to train the rewrite model in an unsupervised manner, and mainly comprises two parts: pre-training and closed-loop learning.
FIG. 3 is a diagram illustrating an embodiment of pre-training the sentence conversion model. A reasonable initialization of the model is obtained through a pre-training task, namely a denoising auto-encoder (DAE). For example, given a natural sentence x, a noisy input x̃ is obtained through the noisy channel N_x, and the model D_x∘E attempts to reconstruct the original input x from x̃; similarly, the model D_z∘E attempts to reconstruct the original input z from its noisy version z̃. The training objective at this stage is the loss function of the denoising auto-encoder task. In addition, other auxiliary models, such as the naive neural-network semantic parser and the models used to compute reward signals, are also pre-trained at this stage and are fixed for subsequent use.
In some embodiments, the sentence conversion model comprises a shared encoder, a first decoder, and a second decoder, wherein the shared encoder and the first decoder form a natural sentence reconstruction model and the shared encoder and the second decoder form a canonical statement reconstruction model;
initializing the sentence conversion model includes:
training the natural sentence reconstruction model with the reconstruction loss function as the training objective and a noise-corrupted sample natural sentence as input;
and training the canonical statement reconstruction model with the reconstruction loss function as the training objective and a noise-corrupted sample canonical statement as input. Illustratively, the shared encoder and the second decoder of the trained sentence conversion model are configured to convert received natural sentences into corresponding canonical statements.
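As a rough illustration of this initialization step, the following PyTorch-style sketch trains a shared encoder with two decoders on a denoising reconstruction objective; the architecture, the sizes and the simple word-dropping noise are assumptions for the example only, not the patent's exact implementation:

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, PAD, BOS = 1000, 64, 128, 0, 1

class SharedEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=PAD)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
    def forward(self, tokens):                      # (batch, seq) -> (1, batch, HID)
        _, h = self.rnn(self.emb(tokens))
        return h

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=PAD)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)
    def forward(self, inputs, h0):                  # teacher-forced decoding
        out, _ = self.rnn(self.emb(inputs), h0)
        return self.out(out)                        # (batch, seq, VOCAB)

def add_noise(tokens, drop_prob=0.2):
    """Toy word-dropping noise (the embodiment also inserts and shuffles words)."""
    noised = []
    for seq in tokens.tolist():
        kept = [t for t in seq if random.random() > drop_prob] or seq[:1]
        noised.append(kept)
    width = max(len(s) for s in noised)
    return torch.tensor([s + [PAD] * (width - len(s)) for s in noised])

encoder, dec_x, dec_z = SharedEncoder(), Decoder(), Decoder()

def dae_loss(x, z):
    """Reconstruct the clean x and z from their noised versions (DAE objective)."""
    total = 0.0
    for decoder, clean in ((dec_x, x), (dec_z, z)):
        h = encoder(add_noise(clean))
        shifted = torch.cat(
            [torch.full((clean.size(0), 1), BOS, dtype=torch.long), clean[:, :-1]], dim=1)
        logits = decoder(shifted, h)
        total = total + F.cross_entropy(
            logits.reshape(-1, VOCAB), clean.reshape(-1), ignore_index=PAD)
    return total
```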
A noisy channel, as shown in fig. 3, is used to corrupt the original input natural or canonical sentences. The method mainly comprises three noising operations: importance-based deletion, addition from a mixed input source, and bigram-based shuffling.
In some embodiments, the step of adding noise to the sample natural sentences and the sample canonical statements comprises: deleting at least one word in the sample natural sentence and the sample canonical statement, and/or
inserting at least one word into the sample natural sentence and the sample canonical statement, and/or
transforming the order of at least one word in the sample natural sentence and the sample canonical statement.
For example, the deletion operation deletes each word in a sentence with a certain probability. Conventionally, each word is deleted with equal probability; here, deletion is instead performed according to the corpus frequency of each word in the sentence: words with high frequency have a larger deletion probability, and words with low frequency have a smaller deletion probability.
Illustratively, the adding operation, taking an input natural sentence as an example, selects a target-source canonical statement and randomly samples 10%-20% of its words to insert at arbitrary positions in the input sentence. To encourage the use of more relevant target-source sentences, we pick from the candidate set the canonical statement that has the smallest Word Mover's Distance (WMD) to the input sentence.
Illustratively, the shuffling operation scrambles the word order of the input sentence; the shuffling is performed in the channel at the bigram level, so that some bigram-level semantic information is retained.
FIG. 4 is a diagram illustrating an embodiment of closed-loop learning of the sentence conversion model according to the present invention. This phase includes two tasks, back-translation (BT) and dual reinforcement learning (DRL), as shown in fig. 4. Taking an input natural language question x as an example, the model D_z∘E generates an intermediate canonical statement ẑ by greedy decoding, so that the pseudo training sample (ẑ, x) can be used to train the model D_x∘E in a supervised manner, the training objective being the back-translation loss function. Meanwhile, canonical statements z̃ are obtained from the model D_z∘E by sampling; by computing the reward signal R(z̃), the model D_z∘E is trained with policy-gradient descent. The training procedure for an input canonical statement z is similar.
Illustratively, the reward signal design: the reward signal is designed to evaluate the quality of the sampled statement, so that a policy gradient can be learned. We measure it mainly from three aspects: fluency, style, and relevance.
Fluency: measures whether a sampled sentence is smooth and fluent; the general method is to train a separate language model for each of the two sentence types. In addition, we provide an extra 0/1 indicator signal for canonical statements: whether the logical form ŷ generated by the downstream naive semantic parser P_nsp can be executed by the executor without errors, which provides a certain level of supervision signal.
Style: natural sentences are more colloquial and diverse, while canonical statements are generated by rules and are more rigid; different types of sentences thus have different styles. A CNN binary sentence classifier P_dis is trained to score the style of the sampled sentences.
Relevance: in order to make the sampled sentence retain as much of the important content of the input sentence as possible, i.e. keep the semantics unchanged, the probability of reconstructing the original input from the sampled sentence is computed with the dual model and used as the relevance measure.
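The following small Python sketch shows one plausible way the three reward aspects could be combined into a single scalar; the plain summation and the scorer callables (language model, executability check, P_dis classifier, dual model) are illustrative assumptions, not the exact formulas of the embodiment:

```python
from typing import Callable

def total_reward(natural: str, canonical: str,
                 lm_score: Callable[[str], float],
                 executable: Callable[[str], bool],
                 p_dis: Callable[[str], float],
                 dual_loglik: Callable[[str, str], float]) -> float:
    """Combine the fluency, style and relevance signals described above."""
    fluency = lm_score(canonical) + float(executable(canonical))  # LM_z + 0/1 executability
    style = p_dis(canonical)                                      # CNN classifier P_dis
    relevance = dual_loglik(natural, canonical)                   # log P(natural | canonical)
    return fluency + style + relevance

# Toy usage with stub scorers:
r = total_reward("a total number of players", "the number of players",
                 lm_score=lambda s: -0.5, executable=lambda s: True,
                 p_dis=lambda s: 0.9, dual_loglik=lambda n, c: -1.2)
```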
The invention divides the traditional semantic parser into two steps, unsupervised rewriting and naive parsing, so that completely unsupervised training can be carried out using unlabeled natural sentences, and the prior knowledge in the canonical statements written from grammar rules is utilized more fully.
Furthermore, the two-stage semantic parser uses the canonical statement as an intermediate bridge, which reduces the gap between natural language and the target semantic representation. A canonical statement is generated by rules: it is formally similar to natural language, but in expression it is closer to a logical form and carries the inherent logic of the specification. The rewrite model from natural language to canonical statement performs semantic normalization, while the naive semantic parser from canonical statement to logical form learns the intrinsic mapping. The feasibility of unsupervised learning in the first stage greatly reduces the dependence on annotated data.
In order to solve the problems in the prior art, the invention provides a two-stage semantic parsing framework, in which the first stage uses an unsupervised rewrite model to convert unlabeled natural language into pseudo-language (canonical statements). A downstream naive semantic parser receives the intermediate pseudo-language and returns the corresponding logical form. We further design three types of noise tailored to the denoising auto-encoder task to initialize the rewrite model. Next, a back-translation technique and a dual reinforcement learning technique based on reward signals from different aspects are used in the closed-loop learning phase for self-training. The experimental results on the benchmark dataset OVERNIGHT show that our framework is not only effective but also compatible with supervised training methods, as detailed below:
1. Introduction
Semantic parsing is the task of converting a natural language question into a structured semantic representation (usually a logical form). An important method of building semantic parsers from scratch generally follows the following process:
1. (canonical statement, logical form) pairs are automatically generated from a domain-general grammar and a domain-specific lexicon.
2. Researchers use crowdsourcing to paraphrase these canonical statements into natural language expressions.
3. The semantic parser is built based on collected (natural language question, logical form) pairs.
However, this method has two disadvantages: (1) dependence on non-trivial manual work and (2) inefficient use of canonical statements.
A more advanced method reduces the paraphrasing task (step 2) in the data collection process to distinguishing correct paraphrases from irrelevant ones. Although efficient, it still relies heavily on non-trivial human labor. The exact meaning of a canonical statement may also be difficult for an annotator to understand.
Canonical statements are pseudo-language statements that are automatically generated from grammatical rules that one can understand but that sound unnatural. Once the semantic parsing dataset is constructed, the canonical statements are discarded, resulting in under-utilization. Although their effectiveness as intermediate outputs to bridge the gap between natural language question and logical form has been reported, they have experimented in a fully supervised fashion where human annotation is essential.
In this work, inspired by unsupervised neural machine translation, we propose a two-stage semantic parsing framework. The first stage converts the natural language question into a corresponding canonical statement using an unsupervised rewrite model. Then, using supervised training, a naive neural semantic parser is built on the automatically generated (canonical statement, logical form) pairs. The two models are connected into one pipeline (see fig. 2).
Rewriting aims to perform semantic normalization and reduce the diversity of natural language, while the naive neural semantic parser learns the mapping between canonical statements and logical forms as well as the structural constraints.
The unsupervised rewrite model consists of a shared encoder and two separate decoders for natural language and canonical statements. To warm up the model without any annotated data, we design three types of noise tailored to the denoising auto-encoder task in the pre-training phase, namely importance-based deletion, mixed-source addition and bigram-based shuffling. This task aims to reconstruct the original input sentence from a corrupted version. After a good initialization point is obtained, we further combine back-translation and dual reinforcement learning in a closed-loop learning phase. At this stage, each encoder-decoder submodel acts as an environment that provides pseudo samples and a reward signal for the other model. They teach and regularize each other, ultimately balancing exploitation and exploration. Rewards evaluated from different aspects, including fluency, style and relevance, are taken into account.
We performed extensive experiments on the benchmark OVERNIGHT in both unsupervised and semi-supervised settings. The results show that our method achieves significant improvements over various baselines, reaching an accuracy of 60.7% without supervision. With the full labeled data, we reach state-of-the-art performance (79.3%), not counting previous methods that use additional data sources.
The major contributions to this work can be summarized as follows:
1) A two-stage semantic parser in an unsupervised setting is presented. No supervision is provided between the input natural language question and the intermediate canonical statement. This work is a pioneering attempt at unsupervised semantic parsing.
2) We performed experiments on the dataset OVERNIGHT and achieved an accuracy of 60.7%, which is 1.9% higher than the previous traditional supervised method (Wang et al., 2015). Our model is also compatible with supervised training and can be further improved with labeled data.
3) The framework also provides another way to reduce annotations in data collections. The annotator can ask any questions at will, thereby getting rid of the hassle of trying to understand what the canonical statement means.
2. Method of the invention
2.1 problem definition
In the rest of the discussion, we denote a natural language question by x, a canonical statement by z, and a logical form by y. We denote by X, Z and Y the sets of all possible natural language questions, canonical statements and logical forms, respectively. There is a mapping function f: Z → Y governed by the grammar rules.
We can train an attention-based naive neural semantic parser P_nsp from the labeled samples {(z, y), z ∈ Z, y ∈ Y}. It can be pre-trained and stored for later use.
As for the rewrite model (see fig. 2), it consists of a shared encoder E and two independent decoders: D_x for natural language and D_z for canonical statements. The notation ∘ denotes module composition. We omit the model implementation details here, as they are not the main focus.
Given an input statement x ∈ X, the rewrite model D_z∘E converts it into a possible canonical statement ẑ, which is then passed to the pre-trained naive parser P_nsp to obtain the predicted logical form ŷ.
The other rewrite model D_x∘E is used only as an auxiliary model during training.
2.2 unsupervised training program
To train the unsupervised rewrite model with no labeled data between X and Z, we divide the whole training process into two stages: pre-training and closed-loop learning. First, D_x∘E and D_z∘E are trained as denoising auto-encoders (DAE); given the ill-posed nature of the rewrite task, this initialization phase plays an important role in accelerating convergence. Next, we use both back-translation (BT) and dual reinforcement learning (DRL) strategies for self-training and exploration in the closed-loop learning phase.
2.2.1 Pre-training phase
FIG. 3 is a diagram of an embodiment of pre-training the sentence conversion model. At this stage, we initialize the rewrite model through the denoising auto-encoder task. All auxiliary models involved in calculating the reward are also pre-trained.
Denoising auto-encoder: given a natural language sentence x, we pass it through the noisy channel N_x and obtain a corrupted version x̃ = N_x(x). The model D_x∘E then attempts to reconstruct the original input x from x̃, see fig. 3. Symmetrically, the model D_z∘E attempts to reconstruct the original canonical statement z from the corrupted input N_z(z). The training objective can be expressed as

L_DAE = E_x[-log P(x | N_x(x))] + E_z[-log P(z | N_z(z))],

where the expectations are taken over natural sentences x and canonical statements z, respectively.
2.2.2 Closed-loop learning phase
Fig. 4 is a schematic diagram of an embodiment of closed-loop learning of the sentence conversion model according to the present invention. The training framework so far is only a noisy copying model. To improve on this, we adopt two schemes in the closed-loop learning phase, back-translation (BT) and dual reinforcement learning (DRL), and train the model toward the final objective (see fig. 4).
Back-translation: in this task, the shared encoder E aims to map different types of input statements into the same underlying semantic space, and the decoders need to decompose that representation into a statement of the other type. More specifically, given a natural language question x, we use the rewrite model D_z∘E in evaluation mode with greedy decoding to convert x into a canonical statement ẑ, from which we obtain a pseudo training sample (ẑ, x) for the rewrite model D_x∘E. Similarly, given a canonical statement z, the model D_x∘E can synthesize a pseudo natural sentence x̃
to form the pair (x̃, z). Next, we train the rewrite models on these pseudo-parallel samples and update the parameters by minimizing

L_BT = E_x[-log P(x | D_z∘E(x))] + E_z[-log P(z | D_x∘E(z))],

where the first and the second term update the parameters of D_x∘E and D_z∘E, respectively. The updated models generate better paraphrases as the iterations proceed.
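A minimal sketch of one back-translation step is given below, assuming hypothetical rewrite-model objects with a greedy_decode method and a generic sequence-level negative log-likelihood helper; none of these interfaces are prescribed by the paper:

```python
# Sketch of one back-translation step (hypothetical interfaces).
def back_translation_step(x_batch, z_batch, rewrite_x2z, rewrite_z2x, nll_loss):
    # Greedy-decode pseudo canonical statements from natural sentences (no gradient).
    pseudo_z = [rewrite_x2z.greedy_decode(x) for x in x_batch]
    # Greedy-decode pseudo natural sentences from canonical statements (no gradient).
    pseudo_x = [rewrite_z2x.greedy_decode(z) for z in z_batch]
    # Train each direction on the pseudo-parallel pairs produced by the other.
    loss_x = nll_loss(rewrite_z2x, inputs=pseudo_z, targets=x_batch)   # (z_hat, x) pairs
    loss_z = nll_loss(rewrite_x2z, inputs=pseudo_x, targets=z_batch)   # (x_tilde, z) pairs
    return loss_x + loss_z
```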
Back-translation places more weight on exploiting the knowledge already learned by the dual models, which may lead to local optima. To encourage more exploration in closed-loop learning, we introduce a dual reinforcement learning strategy and optimize the system through policy gradients.
Starting from a natural language sentence x, we sample a canonical statement z̃ from D_z∘E. Then we evaluate z̃ from different aspects and receive a total reward R(z̃). Similarly, for a sampled natural language sentence x̃ we compute a reward R(x̃). To cope with the high variance of the reward signal, we use a sample size of K and redefine the reward signal with a baseline b(·) to stabilize learning; taking R(z̃) as an example, the reward of the k-th sample becomes R̃(z̃_k) = R(z̃_k) - b(·).
We investigated different baseline choices (e.g., a running mean, the cumulative historical mean, and the reward of the greedy-decoding prediction); the method works best when we use the average reward of the samples within the beam search of each input, especially when the sample size is large. The training objective is the sum of negative expected rewards, and its gradient is estimated with the REINFORCE algorithm:

∇L_DRL ≈ -(1/K) Σ_{k=1..K} R̃(z̃_k) ∇ log P(z̃_k | x), and symmetrically for the sampled natural sentences x̃.
the final loss function of the closed-loop learning phase is the sum of the cross-entropy loss and the policy gradient loss: l isCycle=LBT+LDRL
3. Details of training
In this section we will give details on the different types of noise used in the experiments and the reward design in dual reinforcement learning.
3.1 Noise channels
We introduce three types of noise to deliberately corrupt the input statements.
Importance-based deletion: conventional word dropping discards each word of the input with the same probability p_wd. During reconstruction, the decoder must recover those words from the context. We further introduce a bias toward removing words with higher frequency (e.g. function words) rather than words with lower frequency (e.g. content words). Each word x_i in a natural language question x is independently discarded with a probability that grows with its count w(x_i) in x and is capped at the maximum deletion probability p_max (in our experiments, p_max = 0.2). We apply this method to canonical statements in the same way.
Mixed-source addition: any given original input is either a natural language statement or a canonical statement, and this discourages the shared encoder E from learning a common representation space. Therefore, we propose to insert extraneous words from the other source into the input sentence. For the natural language noisy channel N_x(·), we first select a candidate canonical statement z; then 10%-20% of the words are randomly drawn from z and inserted at arbitrary positions in x.
To select a candidate z with high relevance, we use a heuristic: we randomly draw C canonical statements as candidates (C = 50) and select the z that has the smallest Word Mover's Distance to x. The noising method for the channel N_z is similar.
Bigram shuffling: we also use bigram-based shuffling in the noisy channels, which has proven helpful in preventing the encoder from relying too much on word order. Instead of shuffling individual words, we first split the input sentence into n-grams and then shuffle at the n-gram level (bigrams in our experiments). To account for the added words, we shuffle the input sentence after it has been merged with words from the other source.
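The three noise operations can be sketched in plain Python as follows; the exact deletion probability, the WMD-based candidate selection and the insertion ratio are simplified assumptions (candidate selection is replaced by a random choice here):

```python
import random
from collections import Counter

def importance_based_deletion(words, p_max=0.2):
    """Drop more frequent words with higher probability, capped at p_max (sketch)."""
    counts = Counter(words)
    total = sum(counts.values())
    return [w for w in words
            if random.random() > min(p_max, counts[w] / total)] or words[:1]

def mixed_source_addition(words, candidate_canonicals, ratio=0.15):
    """Insert 10%-20% of the words of a related canonical statement at random positions.
    (WMD-based candidate selection is replaced here by a random choice.)"""
    source = random.choice(candidate_canonicals).split()
    k = max(1, int(ratio * len(source)))
    noised = list(words)
    for w in random.sample(source, min(k, len(source))):
        noised.insert(random.randrange(len(noised) + 1), w)
    return noised

def bigram_shuffle(words):
    """Shuffle at the bigram level so that some bigram-level semantics are preserved."""
    bigrams = [words[i:i + 2] for i in range(0, len(words), 2)]
    random.shuffle(bigrams)
    return [w for pair in bigrams for w in pair]

noisy = bigram_shuffle(
    mixed_source_addition(
        importance_based_deletion("what is the total number of players".split()),
        ["the number of players", "the number of teams"]))
```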
3.2 reward design
To provide more useful reward signals and to improve the performance of DRL tasks, we introduce various rewards from different aspects.
Fluency: the fluency of a sentence is evaluated with a length-normalized language model. For each type of statement we use a separate language model (LM_x and LM_z). For a sampled natural language question x̃, the fluency reward is its length-normalized log-probability under LM_x. For canonical statements, we additionally include a 0/1 signal from the downstream naive semantic parser P_nsp indicating whether the sampled canonical statement z̃ yields a well-formed (executable) logical form, and add it to the length-normalized log-probability under LM_z.
Style: natural language statements are diverse, arbitrary and flexible, whereas canonical statements are usually rigid, regular, and limited to particular forms induced by the grammar rules. To distinguish these characteristics, we incorporate another reward signal that indicates the style of the sampled statement. This is provided by a CNN discriminator: the style reward of a sampled canonical statement z̃ is P_dis(z̃), where P_dis(·) is a pre-trained style classifier that gives the probability of an input sentence being a canonical statement.
Relevance: a relevance reward measures how much of the content is preserved after rewriting. Following common practice, we use the log-likelihood of reconstructing the original input from the sampled sentence under the dual model. Other options include the cosine similarity of sentence vectors or the BLEU score between the original input and the reconstructed sentence; nevertheless, we found the log-likelihood method to perform better in our experiments. The total reward of a sampled canonical statement z̃ or natural language question x̃ is then obtained by combining its fluency, style and relevance rewards.
4. Experiments
In this section, we evaluate our system on the benchmark OVERNIGHT in unsupervised and semi-supervised settings and present ablation studies.
The dataset OVERNIGHT contains natural language paraphrases paired with logical forms in 8 domains. We follow the traditional 80%/20% train/validation split to select the best model. The canonical statements are generated with the tool SEMPRE, paired with the target logical forms. Because of the limited number of grammar rules and their coarse granularity, there is only one canonical statement per logical form, and on average only 8 natural language paraphrases per canonical statement. For example, to describe the concept of "larger", a natural language question may use many synonyms, such as "greater than", "higher", "at least", while in a canonical statement the expression is grammatically restricted.
4.1 Experimental setup
Throughout the experiments, unless otherwise noted, word vectors were initialized with Glove6B (average coverage 93.3%) and fine-tuned during training. Out-of-vocabulary words are replaced with <unk>. The batch size is fixed at 16 and the sample size K in the DRL task is 6. During evaluation, we use a beam search of size 5. For all experiments we use the Adam optimizer. All auxiliary models are pre-trained and fixed for later use.
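For convenience, the hyperparameters stated above can be collected into a single configuration object; the field names below are illustrative, only the values come from the text:

```python
from dataclasses import dataclass

@dataclass
class ExperimentConfig:
    word_vectors: str = "Glove6B"   # fine-tuned during training; OOV words -> <unk>
    batch_size: int = 16
    drl_sample_size: int = 6        # K samples in the dual reinforcement learning task
    beam_size: int = 5              # beam search during evaluation
    optimizer: str = "Adam"

config = ExperimentConfig()
```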
We report the accuracy of the final representation level of the logical form in different settings.
Supervised setting: this is the traditional case, where (natural language, logical form) pairs are used to directly train the single-stage parser, and (natural language, canonical statement) and (canonical statement, logical form) pairs are used to train the two parts of the two-stage parser, respectively.
Unsupervised setting: we classify methods into two categories, single-stage and two-stage. In a single-stage parser, a naive semantic parser is trained using only (canonical statement, logical form) pairs, but evaluation is performed on natural language questions. The pre-trained word vectors ELMo and Bert-base-uncased may also replace the original word vector layer. The Word-Mover's-Distance-based faked-sample method labels each natural language sentence with the most similar logical form (single-stage) or canonical statement (two-stage) and trains on these faked samples in a supervised manner. The multi-tasking baseline handles natural language statements in a single-stage parser with an additional decoder performing the same DAE task discussed previously. The dual rewrite model may or may not share the encoder E (+Shared), and may or may not include the closed-loop learning phase tasks (+Cycle). The downstream parser P_nsp of the unsupervised rewrite model is Naive + Glove6B and is fixed for all two-stage models.
Semi-supervised setting: to further validate our framework, we also perform semi-supervised experiments (including the pre-training and closed-loop learning phases) by gradually adding part of the supervised labeled pairs during training, starting from the best model in the unsupervised setting (Shared + DAE + Cycle).
4.2 results and analysis
As shown in Table 1, in the unsupervised setting: (1) two-stage semantic parsers are preferable to single-stage ones, because canonical statements bridge the large gap between natural language questions and logical forms. Even in the supervised experiments, the pipeline is still useful (76.4% compared with 76.0%). (2) Not surprisingly, model performance is sensitive to word vector initialization. Using the original Glove6B word vectors directly gives the worst performance (19.7%) among all baselines; thanks to the pre-trained embeddings ELMo or Bert, the accuracy improves significantly (to 26.2% and 32.7%, respectively). (3) When we share the encoder modules in the single-stage parser of the multi-tasking method (DAE), performance does not improve significantly and is even slightly below Naive + Bert (31.9% vs. 32.7%). We suspect that the semantic parser and the denoising auto-encoder exploit the input sentence in different ways and attend to different regions of the representation space. In the rewrite model, however, sharing the encoder, i.e. mapping into the same representation space, is more appropriate (from 57.5% to 60.7%), because the input and output statements are fully symmetric. Furthermore, the effectiveness of the DAE pre-training task (44.9% accuracy on the target task) can be partly explained by the closeness of natural language and canonical statements. (4) The Faked Samples method is a lower bound: it is easy to implement, but generalizes poorly and has an obvious ceiling. Our system can train itself through closed-loop learning and improves performance from the initial 44.9% to 60.7%, which is even 1.9% better than the traditional supervised method.
Table 1: representation-level accuracy of logical forms on the dataset OVERNIGHT. Supervised methods marked with a superscript do not consider cross-domain or additional data sources.
For the semi-supervised results in Table 1: (1) with only 5% of the labeled data added, performance already improves significantly from 60.7% to 68.4%. (2) With 30% annotation, our system is competitive with the network model under fully supervised training (75.0%). (3) Compared with the prior art, our system already outperforms it when using 50% of the labeled data; when using all the labeled data it reaches 79.3% and achieves state-of-the-art performance (not counting results that use other data sources or multi-domain advantages).
Fig. 5 shows the semi-supervised results using different proportions of labeled data. The two baselines are the single-stage and two-stage models trained with supervision only. From the experimental results and fig. 5 (the abscissa is the percentage of labeled data, the ordinate the accuracy), we can safely conclude that: (1) our proposed method solves the difficult cold-start problem of training a semantic parser without any annotated data; (2) it is also compatible with traditional supervised training and scales to handle more labeled data.
4.3 ablation study
In this section, we will analyze the impact of each noise type on the DAE task and different combinations of solutions in the closed loop learning phase. By default, the encoder is shared.
4.3.1 Pre-training noise channels in DAEs
Table 2: ablation study of different noise channels.
From the results in Table 2: (1) interestingly, even without any noise, in which case the denoising auto-encoder degrades to a simple copying model, the rewrite model can still make some useful predictions (26.9%). This observation can be attributed to the encoder being shared by the two types of statements. (2) Generalization capability continues to increase as we gradually make the DAE task harder by adding more noise types. (3) In general, importance-based deletion and mixed-source addition are more useful for this task than bigram shuffling.
4.3.2 strategy for closed-loop learning
Table 3: Ablation study of schemes in closed-loop learning.
The most striking observation in Table 3 is that performance drops by 1.5% when we add the DAE task to the closed-loop learning phase (BT + DRL). A possible explanation is that the model has already reached the bottleneck of the DAE task after pre-training, so the task does not contribute to the closed-loop learning process. Another possible factor may be the conflicting objectives of the different tasks: if we keep adding this DAE regularization objective, the exploration of the DRL task may be hindered.
4.4 case study
In Table 4, we compare the intermediate canonical statements generated by our model with those created by the WMD-based faked-sample method. In the basketball domain, our system successfully interprets the constraint "at least 3" as "3 or more". This finding supports the assumption that unsupervised models can learn such fine-grained semantics, e.g. phrase alignments. For the calendar domain, the baseline system cannot identify the query object, which should be a "meeting" rather than a "person". Our model correctly understands the intent, but is somewhat clumsy in doing unnecessary work: the constraint about weekly participants is repeated. This may be due to the less controlled closed-loop learning process, since we encourage the model to take risky steps in search of better solutions.
Table 4: Example outputs. NL: natural language sentence; CU: canonical utterance.
5. Related work
Annotation for semantic parsing: semantic parsing always requires a large amount of data. However, annotation for semantic parsing is not user-friendly. Many researchers have tried to alleviate the burden of human annotation, for example through training from weak supervision (Krishnamurthy and Mitchell, 2012; Berant et al., 2013; Liang et al., 2017; Goldman et al., 2018), semi-supervised learning (Yin et al., 2018; Guo et al., 2018; Cao et al., 2019), online learning (Iyer et al., 2017; Lawrence and Riezler, 2018), and relying on multilingual (Zou and Lu, 2018) or cross-domain datasets (Herzig and Berant, 2017). In this work, we try to avoid the heavy annotation work by using canonical statements as intermediate results and building an unsupervised rewrite model.
Unsupervised NMT: the task most similar to ours is that of Lample et al., who use monolingual data for unsupervised neural machine translation (NMT). However, they rely heavily on pre-trained cross-lingual word embeddings for initialization, as noted by Lample et al., and they focus mainly on learning phrase alignments or word mappings. In contrast, we delve into sentence-level semantics and improve semantic parsing using the dual structure of an unsupervised rewrite model.
6. Conclusion
In this work, to reduce annotation, we propose a two-stage semantic parsing framework. The first stage rewrites the input statement using an unsupervised rewrite model with a dual structure. We further design three types of noise for the denoising auto-encoder task in pre-training. Back-translation and dual reinforcement learning, based on rewards evaluated from different aspects, are used to further improve performance. Experimental results show that our framework is effective and compatible with supervised training.
It should be noted that for simplicity of explanation, the foregoing method embodiments are described as a series of acts or combination of acts, but those skilled in the art will appreciate that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention. In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
As shown in fig. 6, an embodiment of the present invention further provides a semantic parser 600, including: a canonical statement determination module 610, configured to determine a canonical statement corresponding to a natural statement to be parsed; a naive semantic parser 620, configured to parse the canonical statement to obtain a corresponding logical expression.
In the embodiment of the invention, the natural sentence is converted into a canonical statement with the same meaning before parsing, and that canonical statement is parsed by a naive semantic parser trained in advance on a (canonical statement, logical form) sample set, which alleviates the poor generalization observed when a naive semantic parser parses natural sentences directly.
In some embodiments, the canonical statement determination module includes a shared encoder, a first decoder, and a second decoder, wherein the shared encoder and the first decoder form a natural statement reconstruction model, and the encoder and the second decoder form a canonical statement reconstruction model.
In some embodiments, the determining a canonical statement corresponding to the natural language to be parsed comprises: and inputting the natural sentence to be analyzed to a pre-trained sentence conversion model to obtain a standard sentence corresponding to the natural sentence to be analyzed.
In some embodiments, the step of pre-training the sentence transformation model comprises: and pre-training by adopting an unsupervised training method based on the natural sentence sample set to obtain the sentence conversion model.
In some embodiments, pre-training the sentence conversion model based on the natural sentence sample set by using an unsupervised training method includes:
initializing the statement conversion model;
and performing a reverse translation task and a dual reinforcement learning task on the initialized sentence conversion model to obtain the pre-trained sentence conversion model.
In some embodiments, the sentence conversion model comprises a shared encoder, a first decoder, and a second decoder, wherein the shared encoder and the first decoder form a natural sentence reconstruction model and the encoder and the second decoder form a canonical sentence reconstruction model;
initializing the statement conversion model includes:
training the natural sentence reconstruction model by taking a loss function as a training target and taking a natural sentence of a noise sample as input;
and training the canonical statement reconstruction model by taking the loss function as a training target and taking the standard statement of the noise-added sample as input.
In some embodiments, the step of adding noise to the sample natural sentences and the sample canonical statements comprises: deleting at least one word in the sample natural sentence and the sample canonical statement, and/or inserting at least one word into the sample natural sentence and the sample canonical statement, and/or transforming the order of at least one word in the sample natural sentence and the sample canonical statement.
In some embodiments, after training, the shared encoder and the second decoder of the sentence conversion model are used to convert a received natural sentence into the corresponding canonical sentence.
In some embodiments, the present invention provides a non-transitory computer readable storage medium, in which one or more programs including executable instructions are stored, and the executable instructions can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device, etc.) to perform any of the semantic parsing methods of the present invention.
In some embodiments, the present invention further provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform any of the semantic parsing methods described above.
In some embodiments, the present invention further provides an electronic device, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a semantic parsing method.
In some embodiments, the present invention further provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements a semantic parsing method.
The semantic parser of the embodiment of the present invention may be used to execute the semantic parsing method of the embodiment of the present invention and accordingly achieves the technical effects of that method, which are not repeated here. In the embodiment of the present invention, the relevant functional modules may be implemented by a hardware processor.
Fig. 7 is a schematic hardware structure diagram of an electronic device for performing a semantic parsing method according to another embodiment of the present application, and as shown in fig. 7, the electronic device includes:
one or more processors 710 and a memory 720, one processor 710 being illustrated in fig. 7.
The apparatus for performing the semantic parsing method may further include: an input device 730 and an output device 740.
The processor 710, the memory 720, the input device 730, and the output device 740 may be connected by a bus or other means, such as the bus connection in fig. 7.
The memory 720, which is a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as program instructions/modules corresponding to the semantic parsing method in the embodiments of the present application. The processor 710 executes various functional applications of the server and data processing by running nonvolatile software programs, instructions and modules stored in the memory 720, so as to implement the semantic parsing method of the above method embodiment.
The memory 720 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function, and the storage data area may store data created according to use of the semantic parsing device, and the like. Further, the memory 720 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 720 may optionally include memory located remotely from the processor 710, which may be connected to the semantic parsing device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 730 may receive input numeric or character information and generate signals related to user settings and function control of the semantic parsing device. The output device 740 may include a display device such as a display screen.
The one or more modules are stored in the memory 720 and, when executed by the one or more processors 710, perform the semantic parsing method of any of the method embodiments described above.
The above product can execute the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to that method. For technical details not described in this embodiment, reference may be made to the methods provided in the embodiments of the present application.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) mobile communication devices, which are characterized by mobile communication capabilities and are primarily targeted at providing voice and data communications. Such terminals include smart phones (e.g., iphones), multimedia phones, functional phones, and low-end phones, among others.
(2) The ultra-mobile personal computer equipment belongs to the category of personal computers, has calculation and processing functions and generally has the characteristic of mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as ipads.
(3) Portable entertainment devices: such devices can display and play multimedia content, and include audio and video players (e.g., ipods), handheld game consoles, electronic books, smart toys, and portable car navigation devices.
(4) Servers: devices similar in architecture to general-purpose computers, but with higher requirements on processing capability, stability, reliability, security, scalability, manageability, and the like, because they must provide highly reliable services.
(5) Other electronic devices with data interaction functions.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general hardware platform, and certainly can also be implemented by hardware. Based on such understanding, the above technical solutions, in essence or in the part contributing to the related art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in each embodiment or in some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A semantic parsing method, comprising:
receiving a natural sentence to be parsed;
determining a canonical sentence corresponding to the natural sentence to be parsed; and
inputting the canonical sentence into a pre-trained naive semantic parser to obtain a logical expression corresponding to the canonical sentence.
2. The method of claim 1, wherein determining the canonical sentence corresponding to the natural sentence to be parsed comprises:
inputting the natural sentence to be parsed into a pre-trained sentence conversion model to obtain the canonical sentence corresponding to the natural sentence to be parsed.
3. The method of claim 2, wherein the step of pre-training the sentence conversion model comprises: pre-training with an unsupervised training method based on a natural sentence sample set to obtain the sentence conversion model.
4. The method of claim 3, wherein pre-training the sentence conversion model based on the natural sentence sample set with an unsupervised training method comprises:
initializing the sentence conversion model; and
performing a back-translation task and a dual reinforcement learning task on the initialized sentence conversion model to obtain the pre-trained sentence conversion model.
5. The method of claim 4, wherein the sentence conversion model comprises a shared encoder, a first decoder, and a second decoder, wherein the shared encoder and the first decoder constitute a natural sentence reconstruction model and the shared encoder and the second decoder constitute a canonical sentence reconstruction model; and
initializing the sentence conversion model comprises:
training the natural sentence reconstruction model with a loss function as the training objective and a noised sample natural sentence as input; and
training the canonical sentence reconstruction model with the loss function as the training objective and a noised sample canonical sentence as input.
6. The method of claim 4, wherein the step of noising the sample natural sentences and the sample canonical sentences comprises:
deleting at least one word from the sample natural sentence and the sample canonical sentence, and/or
inserting at least one word into the sample natural sentence and the sample canonical sentence, and/or
shuffling the order of at least one word in the sample natural sentence and the sample canonical sentence.
7. A semantic parser, comprising: a canonical sentence determination module configured to determine a canonical sentence corresponding to a natural sentence to be parsed; and a naive semantic parser configured to parse the canonical sentence to obtain a corresponding logical expression.
8. The semantic parser according to claim 7, wherein the canonical sentence determination module comprises a shared encoder, a first decoder, and a second decoder, wherein the shared encoder and the first decoder constitute a natural sentence reconstruction model, and the shared encoder and the second decoder constitute a canonical sentence reconstruction model.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1-6.
10. A storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202010135354.2A 2020-03-02 2020-03-02 Semantic parsing method and semantic parser Active CN111382253B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010135354.2A CN111382253B (en) 2020-03-02 2020-03-02 Semantic parsing method and semantic parser

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010135354.2A CN111382253B (en) 2020-03-02 2020-03-02 Semantic parsing method and semantic parser

Publications (2)

Publication Number Publication Date
CN111382253A true CN111382253A (en) 2020-07-07
CN111382253B CN111382253B (en) 2022-07-15

Family

ID=71217031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010135354.2A Active CN111382253B (en) 2020-03-02 2020-03-02 Semantic parsing method and semantic parser

Country Status (1)

Country Link
CN (1) CN111382253B (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103177008A (en) * 2011-12-22 2013-06-26 北大方正集团有限公司 Method and system used for generating and executing structured query language (SQL) statement
CN109408526A (en) * 2018-10-12 2019-03-01 平安科技(深圳)有限公司 SQL statement generation method, device, computer equipment and storage medium

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859990B (en) * 2020-07-30 2023-11-21 韩朝晖 Method and system for processing question-answer data based on semantic logic uniqueness judgment
CN111859990A (en) * 2020-07-30 2020-10-30 威海微法信息科技有限责任公司 Method and system for processing question and answer data based on semantic logic uniqueness judgment
CN112099764B (en) * 2020-08-13 2022-03-15 南京航空航天大学 Formal conversion rule-based avionics field requirement standardization method
CN112099764A (en) * 2020-08-13 2020-12-18 南京航空航天大学 Formal conversion rule-based avionics field requirement standardization method
CN112115723A (en) * 2020-09-14 2020-12-22 中国船舶重工集团公司第七0九研究所 Weak supervision semantic analysis method based on false positive sample detection
CN111858900A (en) * 2020-09-21 2020-10-30 杭州摸象大数据科技有限公司 Method, device, equipment and storage medium for generating question semantic parsing rule template
CN112466291A (en) * 2020-10-27 2021-03-09 北京百度网讯科技有限公司 Language model training method and device and electronic equipment
CN112287093B (en) * 2020-12-02 2022-08-12 上海交通大学 Automatic question-answering system based on semi-supervised learning and Text-to-SQL model
CN112287093A (en) * 2020-12-02 2021-01-29 上海交通大学 Automatic question-answering system based on semi-supervised learning and Text-to-SQL model
CN112765993A (en) * 2021-01-20 2021-05-07 上海德拓信息技术股份有限公司 Semantic parsing method, system, device and readable storage medium
CN112800778A (en) * 2021-02-07 2021-05-14 北京智通云联科技有限公司 Intention identification method and system based on word string length and storage medium
CN112800778B (en) * 2021-02-07 2023-07-18 北京智通云联科技有限公司 Intent recognition method, system and storage medium based on word string length
WO2023056823A1 (en) * 2021-10-05 2023-04-13 International Business Machines Corporation Neuro-symbolic reinforcement learning with first-order logic
CN115938470A (en) * 2023-01-04 2023-04-07 抖音视界有限公司 Protein characteristic pretreatment method, device, medium and equipment
CN115938470B (en) * 2023-01-04 2024-01-19 抖音视界有限公司 Protein characteristic pretreatment method, device, medium and equipment

Also Published As

Publication number Publication date
CN111382253B (en) 2022-07-15

Similar Documents

Publication Publication Date Title
CN111382253B (en) Semantic parsing method and semantic parser
CN113228030B (en) Multilingual text generation system and method
CN108962224B (en) Joint modeling method, dialogue method and system for spoken language understanding and language model
Zhu et al. Knowledge-based question answering by tree-to-sequence learning
CN110427629B (en) Semi-supervised text simplified model training method and system
CN110084323B (en) End-to-end semantic analysis system and training method
CN107656921B (en) Short text dependency analysis method based on deep learning
Onan SRL-ACO: A text augmentation framework based on semantic role labeling and ant colony optimization
CN111597800B (en) Method, device, equipment and storage medium for obtaining synonyms
US20180075016A1 (en) System and method for automatic, unsupervised paraphrase generation using a novel framework that learns syntactic construct while retaining semantic meaning
Yang et al. Improving tree-based neural machine translation with dynamic lexicalized dependency encoding
CN111651998A (en) Weakly supervised deep learning semantic analysis method under virtual reality and augmented reality scenes
Hao et al. Can shuffling video benefit temporal bias problem: A novel training framework for temporal grounding
US9984063B2 (en) System and method for automatic, unsupervised paraphrase generation using a novel framework that learns syntactic construct while retaining semantic meaning
Cao et al. Visual question answering research on multi-layer attention mechanism based on image target features
Ke English synchronous real-time translation method based on reinforcement learning
CN113064985A (en) Man-machine conversation method, electronic device and storage medium
US11880664B2 (en) Identifying and transforming text difficult to understand by user
CN116266268A (en) Semantic analysis method and device based on contrast learning and semantic perception
CN113011141A (en) Buddha note model training method, Buddha note generation method and related equipment
CN113157932A (en) Metaphor calculation and device based on knowledge graph representation learning
Poncelas Improving transductive data selection algorithms for machine translation
CN117057173B (en) Bionic design method and system supporting divergent thinking and electronic equipment
Wu et al. Incorporating semantic consistency for improved semi-supervised image captioning
CN114328848B (en) Text processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Co.,Ltd.

GR01 Patent grant