CN111160512A

CN111160512A - Method for constructing a dual-discriminator dialogue generation model based on a generative adversarial network

Info

Publication number: CN111160512A
Application number: CN201911224148.2A
Authority: CN
Inventors: 贺樑; 张凉; 朱频频; 杨燕; 陈成才
Original assignee: East China Normal University; Shanghai Xiaoi Robot Technology Co Ltd
Current assignee: East China Normal University; Shanghai Xiaoi Robot Technology Co Ltd
Priority date: 2019-12-04
Filing date: 2019-12-04
Publication date: 2020-05-15
Anticipated expiration: 2039-12-04
Also published as: CN111160512B

Abstract

The invention discloses a method for constructing a dual-discriminator dialogue generation model based on a generative adversarial network. The method first processes the corpus to obtain a quadruple corpus containing similar dialogue information. It then pre-trains a rewriting model and a discrimination model: the former rewrites a matched similar reply into a reply that better fits the current context, and the latter judges whether a sentence comes from the corpus or from the rewriting model. Finally, the rewriting model and the discrimination model are trained adversarially, and the best rewriting effect is obtained through the game between the two. By introducing two discriminators, the invention improves the generation model from multiple angles and makes considerable progress in the grammar, context relevance and other aspects of the generated sentences.

Description

Method for constructing a dual-discriminator dialogue generation model based on a generative adversarial network

Technical Field

The invention relates to natural language processing, deep learning and dialogue systems, and in particular to a method for constructing a dual-discriminator dialogue generation model based on a generative adversarial network (GAN).

Background

With the development of smartphones and smart homes, interaction between people and machines is becoming more and more frequent, and users' expectations for the quality of conversation with machines keep rising: they expect a smooth, fluent and varied communication experience. The template-based dialogue systems commonly used in industry today struggle to meet these expectations. In the template-based approach, a large number of dialogues are manually collected and defined as templates, and a user's utterance is matched against the predefined templates to obtain a fixed reply; this approach covers few topics and has a high labor cost. With the development of big data technology, deep learning technology and computer hardware, academia has explored automatic dialogue generation extensively: big data technology provides large amounts of analyzable corpora, deep learning provides a framework for complex computation, and computer hardware provides high-speed computation. Together, the three have driven the development of dialogue systems.

In a generative dialogue system, the reply the user receives is created by the system itself. No one tells the system a fixed answer in advance; the system only needs to be fed a large amount of data and learn on its own, and this self-learning process uses deep learning. Most current generative dialogue systems learn with a seq2seq model: the model learns how to encode the input and then decode it into a reply, and it is improved by continuously reducing the difference between the generated sentence and the real sentence. However, such a model is "lazy": it only learns simple generation, that is, it tends to produce generic replies such as "I don't know", "good" and "OK". Moreover, because the corpora used by these traditional methods are pairs of the form "context + reply", no reply sentence intervenes in the generation process, so the model does not know what a correct sentence looks like and sometimes generates utterances with wrong grammar and unclear semantics. A method is therefore needed to improve the diversity, context relevance and grammatical accuracy of automatic dialogue generation systems.

Disclosure of Invention

The invention aims to provide a method for constructing a dual-discriminator dialogue generation model based on a generative adversarial network, addressing the shortcomings of existing models for generative dialogue systems.

The specific technical scheme for realizing the aim of the invention is as follows:

A method for constructing a dual-discriminator dialogue generation model based on a generative adversarial network comprises the following specific steps:

step 1: corpus processing

According to the current context C, a text matching algorithm is used to match a similar context C' in the corpus, thereby obtaining the reply R' under the similar context and forming a quadruple corpus <C, R, C', R'>;

step 2: initializing the rewriting model

The quadruple data obtained in step 1 are used for training in a seq2seq framework to obtain a preliminary rewriting model, which rewrites R' by combining the contexts C and C' to generate a reply R; at this stage the rewriting model has not yet reached the desired effect and its loss is still large;

step 3: initializing the discrimination model

The two discriminators of the discrimination model are trained using the reply generated in step 2 and the real reply R. The two discriminators judge from two aspects, rewriting effect and context relevance, and the specific process is as follows:

discriminator_1 aims to judge the quality of the rewriting effect; its input is the sentences before and after rewriting, where the sentence before rewriting has the category "True" and the sentence after rewriting has the category "False", and discriminator_1 learns to separate the two categories as well as possible;

discriminator_2 aims to judge the quality of the context relevance, so its inputs are "current context + generated reply" and "current context + real reply"; the latter clearly has the strongest context relevance and is labeled "True", the former is labeled "False", and discriminator_2 learns to separate the two categories as well as possible;

step 4: adversarial training of the rewriting model and the discrimination model

The rewriting model updates its parameters according to the feedback of the discrimination model; the updated rewriting model then passes its generated sentences to the discrimination model, and the discrimination model updates its parameters according to its accuracy in distinguishing real sentences from fake ones. During the adversarial process, the losses of the two models trend downward until they stabilize, at which point the adversarial training is terminated; the rewriting model obtained is the optimal model, namely the dialogue generation model.

During adversarial training, the lower the accuracy of a discriminator, the better the rewriting model is performing, because the discriminator has been "confused". However, as the rewriting model improves and their own accuracy falls, the discriminators also strengthen their discriminative ability; this is the so-called "adversarial" process. When the loss of the generation model and the accuracy of the discrimination model have decreased to a certain level and stabilized, the two models are considered evenly matched and the adversarial training can stop. The generation model obtained at this point is the optimal model, and this rewriting model is taken as the final dialogue generation model.
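The stopping condition described above (generator loss and discriminator accuracy both levelling off) can be sketched as a simple check over recorded training history. This is only an illustrative sketch: the function names, the window size and the tolerance are assumptions, and the source does not specify how stability is measured.

```python
from typing import Sequence


def has_stabilized(values: Sequence[float], window: int = 5, tol: float = 0.01) -> bool:
    """Return True when the last `window` values span a range of at most `tol`.

    `window` and `tol` are hypothetical hyperparameters, not values from the source.
    """
    if len(values) < window:
        return False
    recent = values[-window:]
    return max(recent) - min(recent) <= tol


def should_stop(gen_losses: Sequence[float],
                disc_accuracies: Sequence[float],
                window: int = 5,
                tol: float = 0.01) -> bool:
    """Stop adversarial training once both curves have levelled off."""
    return (has_stabilized(gen_losses, window, tol)
            and has_stabilized(disc_accuracies, window, tol))


if __name__ == "__main__":
    losses = [2.0, 1.5, 1.2, 1.0, 0.9, 0.9, 0.9, 0.9, 0.9]
    accuracies = [0.9, 0.8, 0.7, 0.6, 0.52, 0.51, 0.5, 0.5, 0.51]
    print(should_stop(losses, accuracies, window=4, tol=0.02))
```

In a real training loop the two lists would be appended to once per adversarial round, and the check would be evaluated after each round.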

Compared with the prior art, the invention has the following advantages:

1) correct grammar: compared with generating a sentence from scratch, rewriting an existing sentence provides a good grammatical basis;

2) semantic fluency: the method rewrites replies taken from similar contexts, which provides a better contextual basis, so the resulting replies fit the context better;

3) strong self-learning ability: through adversarial learning between the rewriting model and the discriminators, each model updates its parameters according to the performance of its opponent, not only according to its own loss.

Drawings

Fig. 1 is an overall framework diagram of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the following embodiment and the accompanying drawing. Except where specifically mentioned below, the procedures, conditions, experimental methods and the like for carrying out the invention are common general knowledge in the art, and the invention places no particular limitation on them.

Examples

Referring to Fig. 1, the present invention provides a method for constructing a dual-discriminator dialogue generation model based on a generative adversarial network, that is, a method that replaces creation from scratch with rewriting and trains the rewriting model through game-style learning against discriminators. As shown in the figure, given the current context C, a similar context C' and its reply R' are obtained by a text matching algorithm. The left dashed box in the figure is the rewriting model of this embodiment, which is based on the seq2seq framework: the encoder encodes R', and the decoder decodes while introducing the difference diff(C, C') between C and C', finally producing the generated reply. The right dashed box is the discrimination model of this embodiment: discriminator_1 learns to distinguish the reply produced by the rewriting model from the real reply R, discriminator_2 learns to distinguish the fake dialogue (C + generated reply) from the real dialogue (C + R), and the feedback from both discriminators is returned to the rewriting model. The embodiment specifically comprises the following steps:

Step 1: data pre-processing

Observation and experience suggest that modifying a template is easier than authoring from scratch, so this embodiment obtains a dialogue reply by rewriting an existing sentence; the first requirement is to process the corpus. Existing dialogue corpora are paired, that is, each current context corresponds to one reply (response). To give rewriting a better foundation, the reply to be rewritten should come from a similar context. Similar contexts are therefore obtained with a matching algorithm, and their replies are extracted to form a quadruple corpus <C, R, C', R'>. This embodiment directly calls a text matching algorithm and takes the 10 contexts C' with the highest matching scores against C to form the required corpus. After the quadruple corpus is obtained, the data are further cleaned: the maximum sentence length is set to 50 words, and words beyond this threshold are discarded. The corpus is then divided into a training set, a validation set and a test set in the ratio 7:2:1.
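The pre-processing pipeline above can be sketched as follows. The source does not name its text matching algorithm, so word-level Jaccard similarity is used here purely as a stand-in; the function names and the fixed random seed are likewise assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]            # (context, reply)
Quad = Tuple[str, str, str, str]  # (C, R, C', R')


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a stand-in for the unspecified matching algorithm."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def truncate(sentence: str, max_len: int = 50) -> str:
    """Keep at most `max_len` words and drop the rest."""
    return " ".join(sentence.split()[:max_len])


def build_quads(pairs: Sequence[Pair], top_k: int = 10, max_len: int = 50) -> List[Quad]:
    """For each (C, R), pair it with the top_k most similar (C', R') from the corpus."""
    quads: List[Quad] = []
    for i, (c, r) in enumerate(pairs):
        scored = [(jaccard(c, c2), j) for j, (c2, _) in enumerate(pairs) if j != i]
        scored.sort(key=lambda t: (-t[0], t[1]))  # best score first, stable on ties
        for _, j in scored[:top_k]:
            c2, r2 = pairs[j]
            quads.append((truncate(c, max_len), truncate(r, max_len),
                          truncate(c2, max_len), truncate(r2, max_len)))
    return quads


def split_7_2_1(items: Sequence, seed: int = 0):
    """Shuffle and split into train / validation / test in a 7:2:1 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10
    n_val = len(shuffled) * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


if __name__ == "__main__":
    corpus = [("how are you today", "fine thanks"),
              ("how are you", "good"),
              ("what time is it", "noon")]
    for quad in build_quads(corpus, top_k=1):
        print(quad)
```

The pairwise comparison here is quadratic in corpus size; a production system would more plausibly use an index-based retrieval engine, which the source leaves open.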

Step 2: pre-training rewrite model

The rewriting model is based on seq2seq, a framework commonly used for generation models. The input of the framework is R' and the target output is R; the framework generates the reply word by word, and an attention mechanism is introduced in the decoding process. The attention is computed over the edit vector diff(C, C'), R' and the word sequence generated so far, where the edit vector is the sequence of word vectors for the words that differ between C and C'. The model is trained on the training set obtained in step 1; a model is obtained after each round of training and evaluated on the validation set obtained in step 1, and the best-performing model is selected as the initial rewriting model for the adversarial process.
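As a rough illustration of the word-level difference underlying diff(C, C'), the sketch below collects the words that appear in one context but not in the other. How the source actually computes the difference is not stated; the set-based comparison and the names `diff_words`, `only_in_c` and `only_in_similar` are assumptions. In a full model these words would then be mapped to word vectors before being fed to the attention mechanism.

```python
from typing import List, Tuple


def diff_words(context: str, similar_context: str) -> Tuple[List[str], List[str]]:
    """Return (words only in C, words only in C'), preserving original order.

    This set-based comparison is an assumed, simplified reading of diff(C, C').
    """
    c_words = context.split()
    s_words = similar_context.split()
    c_set, s_set = set(c_words), set(s_words)
    only_in_c = [w for w in c_words if w not in s_set]
    only_in_similar = [w for w in s_words if w not in c_set]
    return only_in_c, only_in_similar


if __name__ == "__main__":
    print(diff_words("i like green tea", "i like black coffee"))
```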

Step 3: pre-training the discrimination model

Each discriminator is built on a neural network, and its function is to determine whether an utterance or a dialogue is genuine. Before a sentence is input into a discriminator it is first labeled: a sentence from the corpus is labeled 1, and a sentence generated by the rewriting model is labeled 0. The discriminator is then trained to separate the two classes of sentences as well as possible. To judge sentences from different angles, this embodiment uses two discriminators, whose inputs are shown in Fig. 1: discriminator_1 is trained on R and R' to judge sentence grammaticality, and discriminator_2 is trained on C + generated reply and C + R to judge context relevance.
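The labeling scheme above (1 for corpus text, 0 for generated text) can be sketched as two small helpers that build training examples for the two discriminators. The separator token and the function names are hypothetical, and which corpus sentence is passed to the first helper is left to the caller.

```python
from typing import List, Tuple

Example = Tuple[str, int]  # (input text, label): 1 = from corpus, 0 = generated

SEP = " [SEP] "  # hypothetical separator between context and reply


def d1_examples(corpus_reply: str, generated_reply: str) -> List[Example]:
    """Sentence-level examples for discriminator_1."""
    return [(corpus_reply, 1), (generated_reply, 0)]


def d2_examples(context: str, real_reply: str, generated_reply: str) -> List[Example]:
    """Context + reply examples for discriminator_2."""
    return [(context + SEP + real_reply, 1),
            (context + SEP + generated_reply, 0)]


if __name__ == "__main__":
    print(d1_examples("i am fine thanks", "i fine am"))
    print(d2_examples("how are you", "i am fine thanks", "it is noon"))
```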

Step 4: adversarial learning between the rewriting model and the discrimination model

In the adversarial learning process, the objective of the rewriting model is to generate realistic replies that deceive the discrimination model, and the objective of the discrimination model is to distinguish the replies generated by the rewriting model from real replies. The rewriting model and the discrimination model thus form a dynamic adversarial game. In the ideal state, the rewriting model G generates replies realistic enough to pass for real ones, so the discrimination model D can hardly determine whether a reply generated by the rewriting model is real, and the accuracy of the discriminator is D(G) = 0.5. During the adversarial process, the parameters are continuously adjusted so that the accuracy of the discrimination model approaches 0.5 while the loss of the rewriting model decreases and finally stabilizes. The replies generated by the resulting dialogue generation model are improved in both grammar and context relevance, with scores of 0.629, 0.755 and 0.682 on the vector cosine similarity metrics Greedy, Average and Extrema, respectively. The result is a generation model that can be used to generate appropriate replies.
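For readers unfamiliar with the three embedding-based metrics named above, the following is a minimal pure-Python sketch of how Greedy, Average and Extrema scores are commonly computed from word vectors. The toy embedding table, the function names and the handling of unknown words are assumptions; the source does not describe its evaluation setup or which embeddings it used.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def average_vec(vectors: List[Vector]) -> Vector:
    """Mean of the word vectors, dimension by dimension."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def extrema_vec(vectors: List[Vector]) -> Vector:
    """Per dimension, keep the value with the largest absolute magnitude."""
    return [max(col, key=abs) for col in zip(*vectors)]


def greedy_one_way(a: List[Vector], b: List[Vector]) -> float:
    """Match each word in `a` to its most similar word in `b`, then average."""
    return sum(max(cosine(x, y) for y in b) for x in a) / len(a)


def embed(sentence: str, table: Dict[str, Vector]) -> List[Vector]:
    """Look up known words; unknown words are skipped (an assumption)."""
    return [table[w] for w in sentence.split() if w in table]


def embedding_scores(generated: str, reference: str,
                     table: Dict[str, Vector]) -> Dict[str, float]:
    g, r = embed(generated, table), embed(reference, table)
    if not g or not r:  # nothing to compare
        return {"greedy": 0.0, "average": 0.0, "extrema": 0.0}
    return {
        "greedy": (greedy_one_way(g, r) + greedy_one_way(r, g)) / 2,
        "average": cosine(average_vec(g), average_vec(r)),
        "extrema": cosine(extrema_vec(g), extrema_vec(r)),
    }


if __name__ == "__main__":
    toy = {"good": [1.0, 0.0], "fine": [1.0, 0.0],
           "bad": [-1.0, 0.0], "day": [0.0, 1.0]}
    print(embedding_scores("good day", "fine day", toy))
```

All three scores are cosine similarities, so they fall in the range from -1 to 1, with higher values indicating a generated reply closer to the reference.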

Claims

1. A method for constructing a dual-discriminator dialogue generation model based on a generative adversarial network, characterized by comprising the following specific steps:

step 1: corpus processing

step 2: initializing the rewriting model

The quadruple data obtained in step 1 are used for training in a seq2seq framework to obtain a preliminary rewriting model, which rewrites R' by combining the contexts C and C' to generate a reply R;

step 3: initializing the discrimination model

discriminator_1, which aims to judge the rewriting effect; its input is the sentences before and after rewriting, where the sentence before rewriting has the category "True" and the sentence after rewriting has the category "False", and discriminator_1 distinguishes the two categories;

discriminator_2, which aims to judge context relevance; its inputs are "current context + generated reply" and "current context + real reply", where the latter has the strongest context relevance and is labeled "True" and the former is labeled "False", and discriminator_2 distinguishes the two categories;

step 4: adversarial training of the rewriting model and the discrimination model

The rewriting model updates its parameters according to the feedback of the discrimination model; the updated rewriting model passes its generated sentences to the discrimination model, and the discrimination model updates its parameters according to its accuracy in distinguishing real sentences from fake ones. During the adversarial process, the losses of the two models trend downward until they stabilize, at which point the adversarial training is terminated; the rewriting model obtained is the optimal model, namely the dialogue generation model.