CN111428014A - Non-autoregressive conversational speech generation method and model based on maximum mutual information - Google Patents

Non-autoregressive conversational speech generation method and model based on maximum mutual information

Info

Publication number
CN111428014A
CN111428014A (application CN202010185621.7A)
Authority
CN
China
Prior art keywords
sentence
probability
reply
autoregressive
mutual information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010185621.7A
Other languages
Chinese (zh)
Inventor
韩庆宏 (Han Qinghong)
李纪为 (Li Jiwei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiangnong Huiyu Technology Co ltd
Original Assignee
Beijing Xiangnong Huiyu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiangnong Huiyu Technology Co ltd filed Critical Beijing Xiangnong Huiyu Technology Co ltd
Priority to CN202010185621.7A priority Critical patent/CN111428014A/en
Publication of CN111428014A publication Critical patent/CN111428014A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Algebra (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a non-autoregressive dialog generation method and model based on maximum mutual information, belonging to the technical field of machine dialog. The non-autoregressive dialog generation method based on maximum mutual information comprises the following steps: encoding an input first previous sentence through a forward encoder to obtain a first feature vector of the first previous sentence; decoding the first feature vector through a forward decoder to obtain reply sentences of the previous sentence, and calculating a first probability of each reply sentence; encoding each reply sentence through a backward encoder to obtain a second feature vector; decoding the second feature vector through a backward decoder to obtain a second previous sentence of the reply sentence, and calculating a second probability that the first previous sentence appears as the second previous sentence; and calculating the sum of the first probability and the second probability, and selecting the reply sentence for which the sum is maximum. By combining a non-autoregressive method with the maximum mutual information criterion, the invention achieves a balance between efficiency and effect in the dialog generation process.

Description

Non-autoregressive conversational speech generation method and model based on maximum mutual information
Technical Field
The invention relates to the technical field of machine dialog, and in particular to a non-autoregressive dialog generation method and model based on maximum mutual information.
Background
In the prior art, most dialog generation has used an "autoregressive" generation mode: in the dialog generation process, dialog content is generated word by word, with the current word generated based on all previously generated words, until a sentence is formed. For example, to generate the sentence "I like cats", an autoregressive model first generates "I", then generates "like" based on "I", and finally generates "cats" based on "I like". Expressed in terms of probability:

$p(\text{I like cats}) = p(\text{I}) \cdot p(\text{like} \mid \text{I}) \cdot p(\text{cats} \mid \text{I, like})$

A clear disadvantage of this method is that generation is particularly slow when the sentences to be generated are long, since only one word can be generated at a time. This drawback is especially pronounced in dialog generation.
The non-autoregressive generation method instead generates several, or even all, words at once. For example, when generating the sentence "I like cats", the three words can be generated in a single step, represented by the probability

$p(\text{I like cats}) = p(\text{I}) \cdot p(\text{like}) \cdot p(\text{cats})$

where the generation of each word does not depend on the other words, so the model can emit the whole sentence in one pass. The non-autoregressive generation method can thus greatly improve generation efficiency, but it has the disadvantage that the generated words are uncorrelated, so the generated sentences can be very poor and fail to meet the accuracy requirements of dialog generation; for example, "I I I" or "like like like" may be generated instead of the correct sentence "I like cats".
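As a minimal sketch of the two factorizations (the `log_prob` interface below is an illustrative assumption, not something specified by the invention):

```python
def autoregressive_log_prob(model, words):
    """log p(w_1) + log p(w_2 | w_1) + ...: one word at a time,
    each conditioned on all previously generated words."""
    total = 0.0
    for t, word in enumerate(words):
        total += model.log_prob(word, prefix=words[:t])
    return total

def non_autoregressive_log_prob(model, words):
    """log p(w_1) + log p(w_2) + ...: every word scored independently,
    so the whole sentence can be emitted in one parallel step."""
    return sum(model.log_prob(word, prefix=[]) for word in words)
```

The speed gain of the second form comes precisely from dropping the prefix dependency, which is also what lets it produce incoherent outputs such as "like like like".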
Disclosure of Invention
The invention mainly solves the technical problem of providing a non-autoregressive dialog generation method and model based on maximum mutual information, which accelerate dialog generation, improve generation efficiency, strengthen the correlation between the preceding text and the reply during dialog generation, and improve the accuracy of dialog generation.
In order to achieve the above object, the first technical solution adopted by the present invention is to provide a non-autoregressive dialog generation method based on maximum mutual information, comprising the following steps: encoding an input first previous sentence through a first encoder to obtain a first feature vector of the first previous sentence; decoding the first feature vector through a first decoder to obtain reply sentences of the previous sentence, and calculating a first probability of each reply sentence; encoding each reply sentence through a second encoder to obtain a second feature vector; decoding the second feature vector through a second decoder to obtain a second previous sentence of the reply sentence, and calculating a second probability that the first previous sentence appears as the second previous sentence; and calculating the sum of the first probability and the second probability, and selecting the reply sentence for which the sum is maximum.
In order to achieve the above object, the second technical solution adopted by the present invention is to provide a non-autoregressive dialog generation model based on maximum mutual information, comprising: a reply sentence generation module, which generates reply sentences from an input first previous sentence through first encoding and first decoding; a previous sentence generation module, which generates a second previous sentence from each reply sentence through second encoding and second decoding; a probability operation module, which calculates the first probability of each reply sentence generated from the first previous sentence, calculates the second probability that the second previous sentence generated from the reply sentence is the first previous sentence, and computes the sum of the two; and a reply sentence extraction module, which compares the sums and selects the reply sentence corresponding to the maximum sum.
The invention has the following beneficial effects: in application, a non-autoregressive dialog generation mode is used, which improves dialog generation efficiency, while maximum mutual information is used to capture the correlation in dialog generation, improving generation quality and achieving a balance between efficiency and effect.
Drawings
FIG. 1 is a flow chart diagram of a non-autoregressive dialog generation method based on maximum mutual information according to the present invention;
FIG. 2 is a structural diagram of the non-autoregressive dialog generating model based on maximum mutual information according to the present invention.
Detailed Description
The following detailed description of preferred embodiments of the present invention, taken in conjunction with the accompanying drawings, is intended to make the advantages and features of the invention easier for those skilled in the art to understand, and to define the scope of protection of the invention more clearly.
It is noted that the terms "first" and "second" in the claims and the description of the present application are used to distinguish between similar elements and do not necessarily describe a particular sequential or chronological order.
In one embodiment of the present invention, as shown in FIG. 1, the non-autoregressive dialog generation method based on maximum mutual information according to the present invention includes the following steps:
Step S101, generating a reply sentence.
In one embodiment of the present invention, the forward encoder encodes the preceding dialog into a corresponding feature vector, and the forward decoder decodes that feature vector to generate a reply sentence. The reply sentence is generated in a non-autoregressive mode, i.e. all words constituting the reply sentence are generated simultaneously in one step, so the reply can be produced quickly even when the dialog is long, improving generation efficiency. For comparison, when generating the sentence "I like cats", an autoregressive model first generates "I", then generates "like" based on "I", and finally generates "cats" based on "I like". Expressed in terms of probability:

$p(\text{I like cats}) = p(\text{I}) \cdot p(\text{like} \mid \text{I}) \cdot p(\text{cats} \mid \text{I, like})$

A clear disadvantage of this approach is that generation is particularly slow when the sentences to be generated are long. In the non-autoregressive mode, the three words can be generated at once, i.e.

$p(\text{I like cats}) = p(\text{I}) \cdot p(\text{like}) \cdot p(\text{cats})$

where each word is generated independently of the others, so the model can emit them all in a single pass. The non-autoregressive generation mode therefore greatly improves generation efficiency.
In one embodiment of the present invention, when a previous sentence X is input, a number of candidate reply sentences Y are generated by the forward encoder, the forward decoder and the non-autoregressive generation mode. According to the probability $p(Y \mid X)$ of each generated reply, K sentences are sampled, denoted

$Y_1, Y_2, \ldots, Y_K$

with probabilities expressed as

$p(Y_1 \mid X), p(Y_2 \mid X), \ldots, p(Y_K \mid X)$

In summary, when a previous sentence X is input, the probability of obtaining a reply sentence is expressed as $p(Y_i \mid X)$, where the subscript i denotes the i-th reply sentence.
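A sketch of this forward step, assuming a hypothetical `forward_model` object whose one-shot `sample` method encodes X and decodes an entire reply non-autoregressively, returning the reply together with its log-probability (this interface is illustrative, not specified by the patent):

```python
def sample_reply_candidates(forward_model, previous_sentence_x, k=10):
    """Draw K candidate replies Y_1..Y_K and keep log p(Y_i | X) for each."""
    candidates = []
    for _ in range(k):
        # one-shot, non-autoregressive draw of a full reply
        reply_y, log_p_y_given_x = forward_model.sample(previous_sentence_x)
        candidates.append((reply_y, log_p_y_given_x))
    return candidates
```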
Step S102, generating the previous sentence.
In one embodiment of the present invention, the generated reply sentence is encoded by the backward encoder into a corresponding feature vector, and the backward decoder decodes that feature vector to generate a previous sentence. The probability that the newly generated previous sentence is the original one is then calculated.
In one embodiment of the invention, when the reply sentence $Y_1$ is input, several previous sentences

$X_1, X_2, X_3$

are generated by the backward encoder, the backward decoder and the non-autoregressive generation mode. The probability that each newly generated sentence $X_1, X_2, X_3$ is the original sentence X is calculated, expressed as $p(X \mid Y_1)$. Similarly, when the reply sentence $Y_2$ is input, several previous sentences are generated in the same way, and the probability that each of them is the original sentence X is expressed as $p(X \mid Y_2)$. By analogy, when the i-th reply sentence $Y_i$ is input, the probability that the generated previous sentence is the original one is expressed as $p(X \mid Y_i)$.
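The backward step can be sketched the same way, assuming a hypothetical `backward_model` with a `log_prob(x, y)` method that scores the original previous sentence X given a candidate reply (again an illustrative interface):

```python
def backward_scores(backward_model, previous_sentence_x, candidates):
    """For each candidate reply Y_i, compute log p(X | Y_i): how well the
    backward encoder-decoder recovers the original previous sentence X."""
    return [
        backward_model.log_prob(previous_sentence_x, reply_y)
        for reply_y, _ in candidates
    ]
```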
Step S103, extracting the reply sentence.
In an embodiment of the present invention, the probabilities from step S101 (generating the reply sentence) and step S102 (generating the previous sentence) are summed, and the reply sentence with the largest sum is the one closest to the preceding dialog. Concretely, the probability $p(X \mid Y_i)$ and the probability $p(Y_i \mid X)$ are logarithmized and summed:

$\log p(Y_i \mid X) + \log p(X \mid Y_i)$

The reply sentence $Y_i$ that maximizes this sum, i.e. the sentence that maximizes the mutual information, is selected as the final generated reply.
When a non-autoregressive sentence generation mode is used alone, the sentences are less accurate; for example, when the sentence "I like cats" is to be generated, "I I I" or "like like like" may be generated instead. By applying the maximum mutual information criterion in the dialog generation process, the problem of poor sentence reliability in non-autoregressive generation is mitigated: the dialog generation process remains fast while accuracy is preserved, achieving a balance between efficiency and effect in dialog generation.
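Putting the three steps together, the selection rule of step S103 can be sketched as follows; both model objects are the hypothetical stand-ins assumed above, and the log-sum follows claims 3 and 9:

```python
def select_reply_mmi(forward_model, backward_model, previous_sentence_x, k=10):
    """Return the candidate reply maximizing log p(Y_i | X) + log p(X | Y_i)."""
    best_reply, best_score = None, float("-inf")
    for _ in range(k):
        # forward: draw one candidate reply with its log p(Y_i | X)
        reply_y, log_p_y_given_x = forward_model.sample(previous_sentence_x)
        # backward: score log p(X | Y_i) for that candidate
        log_p_x_given_y = backward_model.log_prob(previous_sentence_x, reply_y)
        # maximum mutual information criterion: keep the largest log-sum
        score = log_p_y_given_x + log_p_x_given_y
        if score > best_score:
            best_reply, best_score = reply_y, score
    return best_reply
```

The backward term penalizes generic or incoherent candidates such as "like like like", since such replies make it hard to recover the original previous sentence X.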
In one embodiment of the present invention, as shown in FIG. 2, the structure of the non-autoregressive dialog generation model based on maximum mutual information according to the present invention comprises a reply sentence generation module, which generates reply sentences from an input first previous sentence through the first encoding and first decoding. In one embodiment, when a previous sentence X is input, several reply sentences Y are generated by the forward encoder, the forward decoder and the non-autoregressive generation mode, denoted

$Y_1, Y_2, \ldots, Y_K$

and the probability of generating each sentence is expressed as $p(Y_i \mid X)$.
In this embodiment, the non-autoregressive dialog generation model based on maximum mutual information according to the invention further comprises a previous sentence generation module, which generates a second previous sentence from each reply sentence through the second encoding and second decoding. In one embodiment, when the reply sentence $Y_1$ is input, several previous sentences

$X_1, X_2, X_3$

are generated by the backward encoder, the backward decoder and the non-autoregressive generation mode, and the probability that each newly generated sentence $X_1, X_2, X_3$ is the original sentence X is calculated, expressed as $p(X \mid Y_1)$. Similarly, when the reply sentence $Y_2$ is input, several previous sentences are generated in the same way, and the probability that each of them is the original sentence X is expressed as $p(X \mid Y_2)$. By analogy, when the i-th reply sentence $Y_i$ is input, the probability that the generated previous sentence is the original one is expressed as $p(X \mid Y_i)$.
In this embodiment, as shown in FIG. 2, the non-autoregressive dialog generation model based on maximum mutual information further comprises a reply sentence extraction module, which sums the generation probability of each reply sentence and the probability that the newly generated previous sentence is the original one, and selects the reply sentence corresponding to the maximum sum. In one embodiment, the probability $p(X \mid Y_i)$ and the probability $p(Y_i \mid X)$ are logarithmized and summed:

$\log p(Y_i \mid X) + \log p(X \mid Y_i)$

The reply sentence $Y_i$ that maximizes this sum, i.e. the sentence that maximizes the mutual information, is selected as the final generated reply.
The invention uses a non-autoregressive dialog generation mode to generate all words of each reply sentence at once, which greatly increases generation speed and improves the efficiency of dialog generation. At the same time, it applies the maximum mutual information criterion to fully model the correlation between the preceding text and the reply, capturing the relevance of dialog generation, improving its quality, and achieving a balance between the efficiency and the effect of dialog generation.

Claims (9)

1. A non-autoregressive dialog generation method based on maximum mutual information is characterized by comprising the following steps:
coding an input first previous sentence through a first encoder to obtain a first feature vector of the first previous sentence;
decoding the first feature vector through a first decoder to obtain reply sentences of the first previous sentence, and calculating a first probability of each reply sentence;
encoding each reply sentence through a second encoder to obtain a second feature vector;
decoding the second feature vector through a second decoder to obtain a second previous sentence of the reply sentence, and calculating a second probability that the first previous sentence appears as the second previous sentence;
and calculating the sum of the first probability and the second probability, and selecting the reply sentence corresponding to the maximum sum.
2. The maximum mutual information based non-autoregressive dialog generation method of claim 1, wherein, during generation of the reply sentence by the first decoder, all words in the reply sentence are generated simultaneously in one step.
3. The maximum mutual information based non-autoregressive dialog generation method of claim 1, wherein the first probability and the second probability are each logarithmized when computing the sum of the first probability and the second probability.
4. The maximum mutual information based non-autoregressive dialog generation method of claim 1, wherein the first encoder is a forward encoder, the first decoder is a forward decoder, the second encoder is a backward encoder, and the second decoder is a backward decoder.
5. A non-autoregressive dialog generation model based on maximum mutual information, comprising:
a reply sentence generation module, which generates reply sentences from an input first previous sentence through first encoding and first decoding, and calculates a first probability of each reply sentence generated from the first previous sentence;
a previous sentence generation module, which generates a second previous sentence from each reply sentence through second encoding and second decoding, and calculates a second probability that the second previous sentence generated from the reply sentence is the first previous sentence; and
a reply sentence extraction module, which sums the first probability and the second probability and selects the reply sentence corresponding to the maximum sum.
6. The maximum mutual information based non-autoregressive dialog generation model of claim 5, wherein the reply sentence generation module uses a forward encoder for said first encoding and a forward decoder for said first decoding.
7. The maximum mutual information based non-autoregressive dialog generation model of claim 5, wherein the reply sentence generation module generates the reply sentence using a non-autoregressive method, with all words in the reply sentence generated simultaneously in one step.
8. The maximum mutual information based non-autoregressive dialog generation model of claim 5, wherein the previous sentence generation module performs the second encoding using a backward encoder and the second decoding using a backward decoder.
9. The maximum mutual information based non-autoregressive dialog generation model of claim 5, wherein, in the reply sentence extraction module, the first probability and the second probability are each logarithmized and then summed.
CN202010185621.7A 2020-03-17 2020-03-17 Non-autoregressive conversational speech generation method and model based on maximum mutual information Pending CN111428014A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010185621.7A CN111428014A (en) 2020-03-17 2020-03-17 Non-autoregressive conversational speech generation method and model based on maximum mutual information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010185621.7A CN111428014A (en) 2020-03-17 2020-03-17 Non-autoregressive conversational speech generation method and model based on maximum mutual information

Publications (1)

Publication Number Publication Date
CN111428014A (en) 2020-07-17

Family

ID=71548016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010185621.7A Pending CN111428014A (en) 2020-03-17 2020-03-17 Non-autoregressive conversational speech generation method and model based on maximum mutual information

Country Status (1)

Country Link
CN (1) CN111428014A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012094075A (en) * 2010-10-28 2012-05-17 Toshiba Corp Interaction device
CN106710596A (en) * 2016-12-15 2017-05-24 腾讯科技(上海)有限公司 Answer statement determination method and device
US20180285348A1 (en) * 2016-07-19 2018-10-04 Tencent Technology (Shenzhen) Company Limited Dialog generation method, apparatus, and device, and storage medium
CN109635093A (en) * 2018-12-17 2019-04-16 北京百度网讯科技有限公司 Method and apparatus for generating revert statement
CN109710915A (en) * 2017-10-26 2019-05-03 华为技术有限公司 Repeat sentence generation method and device
US20190198014A1 (en) * 2017-12-21 2019-06-27 Ricoh Company, Ltd. Method and apparatus for ranking responses of dialog model, and non-transitory computer-readable recording medium
CN110222155A (en) * 2019-06-13 2019-09-10 北京百度网讯科技有限公司 Dialogue generation method, device and the terminal of knowledge-chosen strategy
CN110852116A (en) * 2019-11-07 2020-02-28 腾讯科技(深圳)有限公司 Non-autoregressive neural machine translation method, device, computer equipment and medium
CN110851574A (en) * 2018-07-27 2020-02-28 北京京东尚科信息技术有限公司 Statement processing method, device and system

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012094075A (en) * 2010-10-28 2012-05-17 Toshiba Corp Interaction device
US20180285348A1 (en) * 2016-07-19 2018-10-04 Tencent Technology (Shenzhen) Company Limited Dialog generation method, apparatus, and device, and storage medium
CN106710596A (en) * 2016-12-15 2017-05-24 腾讯科技(上海)有限公司 Answer statement determination method and device
US20190220513A1 (en) * 2016-12-15 2019-07-18 Tencent Technology (Shenzhen) Company Limited Method and apparatus for determining a reply statement
CN109710915A (en) * 2017-10-26 2019-05-03 华为技术有限公司 Repeat sentence generation method and device
US20190198014A1 (en) * 2017-12-21 2019-06-27 Ricoh Company, Ltd. Method and apparatus for ranking responses of dialog model, and non-transitory computer-readable recording medium
CN110851574A (en) * 2018-07-27 2020-02-28 北京京东尚科信息技术有限公司 Statement processing method, device and system
CN109635093A (en) * 2018-12-17 2019-04-16 北京百度网讯科技有限公司 Method and apparatus for generating revert statement
CN110222155A (en) * 2019-06-13 2019-09-10 北京百度网讯科技有限公司 Dialogue generation method, device and the terminal of knowledge-chosen strategy
CN110852116A (en) * 2019-11-07 2020-02-28 腾讯科技(深圳)有限公司 Non-autoregressive neural machine translation method, device, computer equipment and medium

Similar Documents

Publication Publication Date Title
KR102648306B1 (en) Speech recognition error correction method, related devices, and readable storage medium
CN110299131B (en) Voice synthesis method and device capable of controlling prosodic emotion and storage medium
US11908451B2 (en) Text-based virtual object animation generation method, apparatus, storage medium, and terminal
CN111477216B (en) Training method and system for voice and meaning understanding model of conversation robot
CN108170686B (en) Text translation method and device
CN108710704B (en) Method and device for determining conversation state, electronic equipment and storage medium
CN110875035A (en) Novel multi-task combined speech recognition training framework and method
JP2023545988A (en) Transformer transducer: One model that combines streaming and non-streaming speech recognition
JP2023547847A (en) Cascading encoder for simplified streaming and non-streaming ASR
CN112348073A (en) Polyphone recognition method and device, electronic equipment and storage medium
CN112735377B (en) Speech synthesis method, device, terminal equipment and storage medium
CN117099157A (en) Multitasking learning for end-to-end automatic speech recognition confidence and erasure estimation
CN111667828B (en) Speech recognition method and apparatus, electronic device, and storage medium
CN111428014A (en) Non-autoregressive conversational speech generation method and model based on maximum mutual information
CN117079637A (en) Mongolian emotion voice synthesis method based on condition generation countermeasure network
CN114783405B (en) Speech synthesis method, device, electronic equipment and storage medium
CN113257221B (en) Voice model training method based on front-end design and voice synthesis method
CN115346520A (en) Method, apparatus, electronic device and medium for speech recognition
CN114881010A (en) Chinese grammar error correction method based on Transformer and multitask learning
CN114974218A (en) Voice conversion model training method and device and voice conversion method and device
CN113077785A (en) End-to-end multi-language continuous voice stream voice content identification method and system
CN112395832B (en) Text quantitative analysis and generation method and system based on sequence-to-sequence
CN112562686B (en) Zero-sample voice conversion corpus preprocessing method using neural network
JP7490804B2 System and method for streaming end-to-end speech recognition with asynchronous decoders
KR102637025B1 (en) Multilingual rescoring models for automatic speech recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination