CN114692569A - Sentence generation method and device - Google Patents
Sentence generation method and device
- Publication number
- CN114692569A CN114692569A CN202011626095.XA CN202011626095A CN114692569A CN 114692569 A CN114692569 A CN 114692569A CN 202011626095 A CN202011626095 A CN 202011626095A CN 114692569 A CN114692569 A CN 114692569A
- Authority
- CN
- China
- Prior art keywords
- sentence
- template
- generation
- generating
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 56
- 238000012549 training Methods 0.000 claims description 57
- 238000009826 distribution Methods 0.000 claims description 25
- 238000012614 Monte-Carlo sampling Methods 0.000 claims description 4
- 238000002372 labelling Methods 0.000 claims description 4
- 230000007547 defect Effects 0.000 abstract description 8
- 238000011156 evaluation Methods 0.000 description 8
- 238000010586 diagram Methods 0.000 description 5
- 238000005070 sampling Methods 0.000 description 4
- 238000013519 translation Methods 0.000 description 2
- 238000007794 visualization technique Methods 0.000 description 2
- 241000282414 Homo sapiens Species 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 238000003058 natural language processing Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000001052 transient effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Machine Translation (AREA)
Abstract
The application discloses a sentence generation method and device in which a variational autoencoder learns sentence templates without supervision, overcoming the lack of labeled template data; meanwhile, the learned templates are used to control sentence generation, remedying the classical sequence-to-sequence model's lack of control information during decoding. The content of a sentence is thereby controlled accurately and finely, and the usability of the generated sentence is ensured.
Description
Technical Field
The present application relates to, but is not limited to, natural language processing, and more particularly to a sentence generation method and apparatus.
Background
In natural language generation tasks, the decoder is not constrained in any way during decoding and can generate sentences freely; that is, Sequence-to-Sequence methods easily produce erroneous, meaningless, or inappropriate sentences at decoding time.
In practical applications, it is often desirable to exclude unsuitable sentences, thereby making the generated sentences more usable in practical scenarios.
Disclosure of Invention
The application provides a sentence generation method and device, which can generate usable sentences.
An embodiment of the invention provides a sentence generation method in which sentences are generated using a sentence generation model comprising an encoder, a template matching module, and a decoder; the method comprises the following steps:
inputting the structured data into an encoder to obtain the code of the structured data;
inputting the code of the structured data into a template matching module, and obtaining a matched sentence generation template for the structured data;
inputting the encoding of the structured data and the sentence generation template into a decoder such that the decoder generates a sentence for the structured data from the sentence generation template.
In one illustrative example, the sentence generation model further comprises: a template generation module; the method further comprises the following steps:
inputting a training sample sentence into an encoder to obtain sentence representation of the training sample sentence;
and inputting the sentence expression of the training sample sentence into the template generation module, so that the template generation module performs modeling for the training sample sentence according to the sentence expression to obtain the sentence generation template of the training sample sentence.
In an exemplary embodiment, modeling the training sample sentence according to the sentence representation to obtain the sentence generation template of the training sample sentence includes:
using a hidden state sequence z to represent a sentence generation template of the training sample sentence;
performing probability modeling on the hidden state sequence z to obtain the template posterior distribution of the hidden state sequence z;
and determining a sentence generation template of the training sample sentence according to the posterior distribution of the template of the hidden state sequence z.
In one illustrative example, the method further comprises: and labeling the sentence generation template to obtain a labeled sentence generation template.
In one illustrative example, the method further comprises:
and visually displaying the sentence generation template.
The present application further provides an apparatus for generating a sentence, comprising a memory and a processor, wherein the memory stores the following instructions executable by the processor: for performing the steps of the sentence generation method of any of the above.
The present application further provides a sentence generation method, including:
determining a template of a sentence from the set of templates according to the input data structure;
and generating a sentence from the input data structure by taking the determined template as a condition for generating the sentence.
In an exemplary embodiment, the determining the template of the sentence further comprises:
coding the sentences in the input training set, and outputting the representation of the sentences;
modeling a sentence template to obtain a template set according to the representation of the input sentence;
according to the sentence template, with the state z_t of each step of the template as input, each word of the input sentence is reconstructed from left to right to obtain the sentence.
In one illustrative example, the obtaining the set of templates includes:
respectively coding each sentence x in the given training set, modeling the obtained continuous hidden state sequence z, and obtaining the variational posterior distribution q(z|x);
and carrying out Monte Carlo sampling on the continuous hidden state sequence z to obtain a hidden state sequence sample, and forming a template set by posterior distribution of templates corresponding to each sentence in the training set.
In an exemplary embodiment, the generating a sentence from the input data structure with the determined template as a condition for generating the sentence includes:
taking the determined template as the condition for generating a sentence, inputting the state z_t of the template at each step, and outputting the word corresponding to that state; and composing the obtained words from left to right into the sentence corresponding to the input data structure.
During decoding, the embodiment of the application uses the learned template as the condition for generating a sentence to control the generation of the new sentence, accurately and finely controlling the content of the sentence and ensuring the usability of the generated sentence.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the claimed subject matter and are incorporated in and constitute a part of this specification, illustrate embodiments of the subject matter and together with the description serve to explain the principles of the subject matter and not to limit the subject matter.
FIG. 1 is a schematic diagram of a structure of a variational autoencoder according to an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating an embodiment of a sentence generation method in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an embodiment of a variational autoencoder in the embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more apparent, embodiments of the present application will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
In one exemplary configuration of the present application, a computing device includes one or more processors (CPUs), input/output interfaces, a network interface, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
The steps illustrated in the flow charts of the figures may be performed in a computer system, for example as a set of computer-executable instructions. Also, while a logical order is shown in the flow diagrams, in some cases the steps shown or described may be performed in a different order.
The inventor of the present application finds that common deep-learning sentence generation depends on sequence-to-sequence methods, which generate from left to right with no constraints at any step and no explicit sentence or grammar information, so the generated sentences often do not fit human expectations. In order to accurately and finely control the content of a sentence and improve the usability of the generated sentence, the embodiment of the application provides a variational autoencoder.
In the embodiment of the present application, the apparatus for generating a sentence is a structured variational autoencoder. As shown in FIG. 1, which is a schematic diagram of the constituent structure of the variational autoencoder in the embodiment of the present application, the variational autoencoder comprises an encoder, a template generation module, a template matching module, and a decoder; wherein,
an encoder is arranged to encode an input sentence and to output a representation of the sentence.
In an exemplary embodiment, the encoder may be a Bidirectional Long Short-Term Memory network (Bi-LSTM).
In one illustrative example, a sentence input to the encoder is used to describe a given data structure. In other words, the sentence input to the encoder is a sentence generated according to a given data structure, such as a set, a key-value table, or a word graph of a knowledge graph, to describe that data structure. A sentence is represented as a sequence, each step of which is a continuous vector, i.e., a continuous hidden state sequence z.
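The following is a minimal sketch only, assuming the Bi-LSTM encoder described above; PyTorch is used for illustration, and the class name and all hyperparameters are placeholders rather than values from the patent.

```python
import torch
import torch.nn as nn

# Illustrative Bi-LSTM encoder; vocabulary size and dimensions are
# placeholder values, not taken from the patent.
class SentenceEncoder(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              bidirectional=True, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) word indices. Each step of the output
        # is one continuous vector of the sentence representation.
        outputs, _ = self.bilstm(self.embed(token_ids))
        return outputs  # (batch, seq_len, 2 * hidden_dim)

# Example: encode a batch of two 7-token sentences.
reps = SentenceEncoder()(torch.randint(0, 10000, (2, 7)))
```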
A template generation module configured to: modeling the templates of the sentences according to the representation of the input sentences (in the training process) to obtain a set of templates;
in an illustrative example, the template generation module outputs a sequence of states with a representation of a sentence as input. A sentence template is a discrete hidden state sequence z, requiring each step ztAre all a different state. For example, z istCorresponding to the syntactic or semantic components of the generated sentence, such as: first few steps ztPossibly representing the subject of a sentence, middle step ztPossibly representing a predicate of a sentence, last few steps ztPossibly representing the object of a sentence, the entire sequence of z represents a sentence of a main-predicate-object structure.
In one illustrative example, the template generation module may be a Gumbel-CRF model, i.e., a conditional random field based on the Gumbel distribution, which is an extreme-value distribution. A Conditional Random Field (CRF) is a probabilistic undirected graph model that solves for a conditional probability p(y|x) given an input random variable x.
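As a sketch of the sampling idea behind Gumbel-CRF-style models, the standard Gumbel-max trick is shown below: adding i.i.d. Gumbel(0, 1) noise to unnormalized log-probabilities and taking the argmax yields an exact sample from the corresponding categorical distribution. This shows only the per-step trick; the patent's Gumbel-CRF applies the idea to whole state sequences.

```python
import torch

# Gumbel-max trick: argmax(logits + Gumbel noise) is an exact categorical
# sample. The clamp avoids log(0) for numerical safety.
def gumbel_max_sample(logits: torch.Tensor) -> torch.Tensor:
    uniform = torch.rand_like(logits).clamp(min=1e-10)
    gumbel_noise = -torch.log(-torch.log(uniform))
    return torch.argmax(logits + gumbel_noise, dim=-1)
```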
A template matching module configured to: (in use) find, from the template set and according to the input data structure, the template of the sentence most semantically similar to the input data structure, and output that template to the decoder as the template of the newly generated sentence.
In an illustrative example, the data structure may include, for example, a set, a key-value pair table, a word graph of a knowledge graph, and the like. The template generation module in the embodiment of the application serves generation from such structured data to sentences.
In an exemplary example, how to determine the similarity between two sentences may be implemented with related art, and the specific implementation does not limit the scope of the present application. What the embodiment of the present application emphasizes is that the best sentence template for generating a sentence is found for the input data structure.
A decoder configured to: reconstructing the input sentence (during training) from the template of sentences from the template generation module; or (in the using process) the matched template is taken as a condition for generating the sentence, and the sentence is generated by the input data structure.
In an illustrative example, the decoder may be a Long Short-Term Memory network (LSTM).
In one illustrative example, reconstructing the input sentence at the decoder may include, for example: taking each step z_t of the template as input to the decoder and reconstructing each word of the input sentence from left to right to obtain the sentence.
In an exemplary example, generating a sentence in the decoder with the input template as the condition for generation may include: taking the input template as the sentence-generating condition and generating the sentence corresponding to the input data structure from left to right. Each state z_t of the template sequence corresponds to a word in the sentence, and the structure of the generated sentence is thus controlled by the template.
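A hedged sketch of this template-conditioned generation loop follows; `decoder_step` and `embed_state` are hypothetical callables standing in for the decoder's step function and state embedding, not APIs from the patent.

```python
# At each step the decoder receives the discrete template state z_t and
# emits the word for that step, so the template controls sentence structure.
def decode_with_template(template_states, decoder_step, embed_state):
    words, hidden = [], None
    for z_t in template_states:
        # Each state z_t conditions exactly one output word.
        word, hidden = decoder_step(embed_state(z_t), hidden)
        words.append(word)
    return " ".join(words)  # compose the words from left to right
```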
The variational autoencoder in the embodiment of the application learns sentence templates without supervision, overcoming the lack of labeled template data; meanwhile, it uses the learned templates to control sentence generation, remedying the classical sequence-to-sequence model's lack of control information during decoding. The content of the sentence is thereby controlled accurately and finely, and the usability of the generated sentence is ensured.
In one illustrative example, the template generation module is further configured to:
and labeling the sentence template, and performing supervised learning by using the labeled data.
In one illustrative example, the template generation module is further configured to:
and visualizing the generated template.
In one illustrative example, visualizing the generated template may include: outputting the number of each state z together with the word corresponding to that state, so that a user can observe the correspondence between the state sequence and the sentence sequence. Visualization also verifies whether the learned template encodes enough linguistic information, further remedying the sequence-to-sequence model's lack of interpretability.
In an illustrative example, the present application may use a normal variational autoencoder training method, in which,
the encoder is specifically configured to: code each sentence x in a given training set to obtain the representation of the sentence;
the template generation module is specifically configured to: model the obtained sentence representation, i.e., the continuous hidden state sequence z, and obtain the variational posterior distribution q(z|x); perform Monte Carlo sampling on the continuous hidden state sequence z to obtain hidden state sequence samples, the template posterior distributions (i.e., variational posteriors) corresponding to the sentences in the training set forming a set of template distributions;
the decoder is specifically configured to: reconstruct the input sentence x according to the input hidden state sequence samples.
After training is completed, the template posterior distribution corresponding to each sentence in the training set can be obtained. Collecting from each posterior the sequence samples with higher sampling probability (for example, probability greater than a preset threshold) yields the set of templates, i.e., the result of the template generation module's modeling of sentence templates.
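A sketch of this template-collection step is given below, under stated assumptions: for simplicity it samples each step independently from per-step probabilities, whereas the patent's CRF posterior samples whole sequences jointly; `posterior_probs`, the sample count, and the threshold are all illustrative assumptions.

```python
import torch

# Draw Monte Carlo samples from a sentence's template posterior and keep
# the high-probability ones as templates.
def collect_templates(posterior_probs, num_samples=16, threshold=0.01):
    # posterior_probs: (seq_len, num_states), rows summing to 1.
    dist = torch.distributions.Categorical(probs=posterior_probs)
    templates = set()
    for _ in range(num_samples):
        z = dist.sample()                  # one discrete state per step
        seq_prob = dist.log_prob(z).sum().exp()
        if seq_prob.item() > threshold:    # keep high-probability samples
            templates.add(tuple(z.tolist()))
    return templates
```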
In one illustrative example, during use,
the template matching module is specifically configured to: according to the input data structure, find from the set of templates the sentence corresponding to the input data structure, for example the most semantically similar sentence, and take its template as the template of the newly generated sentence;
the decoder is specifically configured to: take the determined template as the condition for generating a sentence, input the state z_t of the template at each step of the decoder (e.g., an LSTM), output the word corresponding to that state, and then compose the obtained words from left to right into the whole sentence.
With the variational autoencoder provided by the embodiment of the application, the task of generating sentences from structured data is realized with the structured data as the input condition of the decoder.
In particular, during both the training and use phases, the decoder takes as input not only the template but also a continuous encoding of the structured data; in the use phase, given a piece of structured data, this data is used as a query to search the training set for the training data closest to it, and the template of that training data serves as the template for generating the new sentence. That is, the variational autoencoder in the embodiment of the present application controls the generation of a new sentence by using the learned template as the condition for generating a sentence during decoding, so that the generation process has better interpretability and controllability, the content of the sentence is controlled accurately and finely, and the usability of the generated sentence is ensured.
In practical application, the variational autoencoder provided by the embodiment of the application, applied to the task of paraphrase (synonymous sentence) generation, improves the Bilingual Evaluation Understudy (BLEU) score by 14 points over a traditional Gaussian variational autoencoder; applied to the task of generating sentences from structured data, it improves BLEU by 19 points over a rule-based model and also improves the N-gram Co-occurrence Scores (NIST), Recall-Oriented Understudy for Gisting Evaluation (ROUGE), Consensus-based Image Description Evaluation (CIDEr), and Metric for Evaluation of Translation with Explicit ORdering (METEOR).
According to the variational autoencoder shown in FIG. 1, the embodiment of the present application provides a sentence generation method which generates a sentence using a sentence generation model. In one embodiment, the sentence generation model may include, as shown in FIG. 1, an encoder, a template matching module, and a decoder, and the sentence generation method accordingly comprises the following steps:
inputting the structured data into an encoder to obtain the code of the structured data;
inputting the code of the structured data into a template matching module, and obtaining a matched sentence generation template for the structured data;
the resulting encoding of the structured data and sentence generation template are input to a decoder such that the decoder generates a sentence for the structured data from the input sentence generation template.
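The three steps above can be summarized in a minimal end-to-end sketch; `Encoder`, `TemplateMatcher`, and `Decoder` are hypothetical components standing in for the model's modules, since the patent does not prescribe concrete interfaces.

```python
# End-to-end inference sketch of the sentence generation model.
def generate_sentence(structured_data, encoder, template_matcher, decoder):
    encoding = encoder.encode(structured_data)    # step 1: encode the data
    template = template_matcher.match(encoding)   # step 2: match a template
    return decoder.generate(encoding, template)   # step 3: decode a sentence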
In an exemplary instance, the sentence generation model can further include a template generation module as shown in FIG. 1; thus, the sentence generation method may further include:
inputting a training sample sentence into an encoder to obtain sentence representation of the training sample sentence;
and inputting the sentence expression of the training sample sentence into a template generation module, so that the template generation module performs modeling for the training sample sentence according to the sentence expression, and a sentence generation template of the training sample sentence is obtained.
In one illustrative example, modeling a training sample sentence from its sentence representation to obtain the sentence generation template of the training sample sentence comprises:
using a hidden state sequence z to represent a sentence generation template of a training sample sentence;
performing probability modeling on the hidden state sequence z to obtain the template posterior distribution of the hidden state sequence z;
and determining sentence generation templates of training sample sentences according to the obtained template posterior distribution of the hidden state sequence z.
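One way to write this probability modeling step is the standard variational-autoencoder objective below; this form is assumed rather than quoted from the patent, with q(z|x) the template posterior, p(z) the prior over templates, and p(x|z) the decoder's reconstruction likelihood.

```latex
\mathcal{L}(x) = \mathbb{E}_{q(z \mid x)}\left[\log p(x \mid z)\right]
               - \mathrm{KL}\!\left(q(z \mid x)\,\middle\|\,p(z)\right)
```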
In an exemplary instance, the sentence generation method may further include:
and marking the sentence generation template to obtain the marked sentence generation template.
In an exemplary instance, the sentence generation method may further include:
and visually displaying the sentence generation template.
An embodiment of the present application further provides an apparatus for generating a sentence, which includes a memory and a processor, where the memory stores the following instructions executable by the processor: for performing the steps of the sentence generation method of any of the above.
Fig. 2 is a schematic flow chart of an embodiment of a sentence generation method in the embodiment of the present application, as shown in fig. 2, including:
step 200: a template for the sentence is determined from the set of templates based on the input data structure.
In one illustrative example, the step may include:
and according to the input data structure, finding a sentence which is most similar to the semanteme of the input data structure from the set of templates, and taking the template of the sentence as the template of the newly generated sentence.
In an exemplary embodiment, step 200 may be preceded by:
coding the sentences in the input training set, and outputting the representation of the sentences;
modeling a sentence template to obtain a template set according to the representation of the input sentence;
according to the sentence template, with the state z_t of each step of the template as input, each word of the input sentence is reconstructed from left to right to obtain the sentence.
In an illustrative example, a sentence template is a discrete hidden state sequence z, in which each step z_t is a distinct state. For example, z_t corresponds to a syntactic or semantic component of the generated sentence: the first few steps z_t may represent the subject of a sentence, the middle steps z_t the predicate, and the last few steps z_t the object, so that the whole sequence z represents a sentence with a subject-predicate-object structure.
In one illustrative example, an input sentence is encoded, and a representation of the sentence is output; modeling a template of a sentence to obtain a set of templates from a representation of an input sentence may include:
each sentence x in the given training set is respectively coded, and the obtained representation of the sentence, namely the continuous hidden state sequence z, is modeled to obtain the variational posterior distribution q(z|x);
Monte Carlo sampling is carried out on the continuous hidden state sequence z to obtain hidden state sequence samples, and the template posterior distributions (namely the variational posteriors) corresponding to the sentences in the training set form a set of template distributions.
After training is completed, the template posterior distribution corresponding to each sentence in the training set can be obtained. Collecting from each posterior the sequence samples with higher sampling probability (for example, probability greater than a preset threshold) yields the set of templates, i.e., the result of the template generation module's modeling of sentence templates.
The embodiment of the application uses the variational autoencoder to learn sentence templates without supervision, overcoming the lack of labeled template data.
In one illustrative example, the input sentence is used to describe a given data structure. In other words, the sentence input to the encoder is a sentence generated according to a given data structure, such as a set, a key-value table, or a word graph of a knowledge graph, to describe that data structure.
Step 201: and generating a sentence from the input data structure by taking the determined template as a condition for generating the sentence.
In one illustrative example, the step may include:
using the determined template as the condition for generating a sentence, the state z_t of the template is input at each step of the decoder (e.g., an LSTM) and the word corresponding to that state is output; the obtained words are then combined from left to right into the whole sentence corresponding to the input data structure.
The embodiment of the application uses the learned template to control sentence generation, remedying the classical sequence-to-sequence model's lack of control information during decoding. The content of the sentence is thereby controlled accurately and finely, and the usability of the generated sentence is ensured.
In an exemplary instance, the method for generating a sentence of the present application may further include:
and labeling the sentence template, and performing supervised learning by using the labeled data.
In an exemplary embodiment, the method for generating a sentence may further include:
and visualizing the generated template. And whether the learned template encodes enough linguistic information is verified through a visualization method, so that the defect that the sequence-to-sequence model is lack of interpretability is further overcome.
By the method for generating the sentence, the task of generating the sentence from the structured data is realized by using the structured data as the input condition of the decoder.
In particular, during both the training and use phases, the decoder takes as input not only the template but also a continuous encoding of the structured data; in the use phase, given a piece of structured data, this data is used as a query to search the training set for the training data closest to it, and the template of that training data serves as the template for generating the new sentence. That is, the variational autoencoder in the embodiment of the present application controls the generation of a new sentence by using the learned template as the condition for generating a sentence during decoding, so that the generation process has better interpretability and controllability, the content of the sentence is controlled accurately and finely, and the usability of the generated sentence is ensured.
FIG. 3 is a schematic structural diagram of an embodiment of the variational autoencoder in the embodiment of the present application. As shown in FIG. 3, at training time the Encoder obtains a representation of a sentence from the input sentence; the template generation module, shown as Gumbel-CRF in FIG. 3, samples a template according to the representation of the sentence; and the Decoder takes each step of the template as input and outputs each word of the generated sentence according to the state of that step. At test time, the template matching module finds, from the set of templates and according to the input data structure, the training sample (i.e., the sentence) most semantically similar to the input data structure, and takes the template of that sentence as the template of the newly generated sentence; the Decoder, with the input template as the condition for generating a sentence from the input data structure, generates words from left to right to compose the generated sentence, so that the structure of the generated sentence is controlled by the template. Each state z_t of the template sequence corresponds to a word in the sentence. In the example shown in FIG. 3, z_t = 1 denotes the restaurant name, z_t = 2 the predicate, and z_t = 3 the restaurant rating, corresponding to the description of a restaurant, with the structure of the generated sentence controlled by the state sequence z of the template. Compared with sentence generation methods in the related art, this achieves accurate control over the structure of the generated sentence through the state sequence z of the template, so that the generation process has better interpretability and controllability, the content of the sentence is controlled accurately and finely, and the usability of the generated sentence is ensured.
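As a toy illustration of the state-to-word alignment just described (the words and the exact state assignment are our own example, not taken from FIG. 3):

```python
# Hypothetical alignment between template states and output words, mirroring
# the FIG. 3 restaurant example; words and state ids are illustrative only.
template_states = [1, 1, 2, 3]   # 1: restaurant name, 2: predicate, 3: rating
words = ["Blue", "Spice", "has", "five-star-ratings"]
for z_t, word in zip(template_states, words):
    print(f"z_t = {z_t} -> {word}")  # each state controls one word
```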
Application scenarios of the present application may include, but are not limited to, automatically generating descriptions for goods on an e-commerce platform. Specifically, according to the commodity attributes input by a store, the variational autoencoder provided by the application generates, from different templates, sentences of different structure that describe the same commodity, completing the controllable generation of commodity-describing sentences.
The application scenarios of the present application may further include, for example: finance, insurance, weather forecast, news, air quality reports/forecasts, epidemic reports, medical diagnosis, enterprise financial reports, annual reports, and the like.
Although the embodiments disclosed in the present application are described above, the descriptions are only for the convenience of understanding the present application, and are not intended to limit the present application. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims.
Claims (10)
1. A sentence generation method for generating a sentence using a sentence generation model, the sentence generation model comprising an encoder, a template matching module, and a decoder; the method comprising the following steps:
inputting the structured data into an encoder to obtain the code of the structured data;
inputting the code of the structured data into a template matching module, and obtaining a matched sentence generation template for the structured data;
inputting the encoding of the structured data and the sentence generation template into a decoder such that the decoder generates a sentence for the structured data from the sentence generation template.
2. The sentence generation method of claim 1, wherein the sentence generation model further comprises: a template generation module; the method further comprises the following steps:
inputting a training sample sentence into an encoder to obtain sentence representation of the training sample sentence;
and inputting the sentence expression of the training sample sentence into the template generation module, so that the template generation module performs modeling for the training sample sentence according to the sentence expression to obtain the sentence generation template of the training sample sentence.
3. The sentence generation method of claim 2, wherein said modeling said training sample sentence from said sentence representation to obtain a sentence generation template of said training sample sentence comprises:
using a hidden state sequence z to represent a sentence generation template of the training sample sentence;
performing probability modeling on the hidden state sequence z to obtain the template posterior distribution of the hidden state sequence z;
and determining a sentence generation template of the training sample sentence according to the posterior distribution of the template of the hidden state sequence z.
4. A sentence generation method according to any one of claims 1 to 3, wherein the method further comprises: and labeling the sentence generation template to obtain a labeled sentence generation template.
5. The sentence generation method of claim 1, wherein the method further comprises:
and visually displaying the sentence generation template.
6. An apparatus for generating a sentence, comprising a memory and a processor, wherein the memory has stored therein the following instructions executable by the processor: for performing the steps of the sentence generation method of any one of claims 1 to 5.
7. A sentence generation method, comprising:
determining a template of a sentence from the set of templates according to the input data structure;
and generating a sentence from the input data structure by taking the determined template as a condition for generating the sentence.
8. The method of claim 7, the determining a template for a sentence further comprising:
coding the sentences in the input training set, and outputting the representation of the sentences;
modeling a sentence template to obtain a template set according to the representation of the input sentence;
according to the sentence template, with the state z_t of each step of the template as input, each word of the input sentence is reconstructed from left to right to obtain the sentence.
9. The method of claim 8, wherein the obtaining the set of templates comprises:
respectively coding each sentence x in the given training set, modeling the obtained continuous hidden state sequence z, and obtaining a variational posterior distribution q(z|x);
and carrying out Monte Carlo sampling on the continuous hidden state sequence z to obtain a hidden state sequence sample, and forming a template set by posterior distribution of templates corresponding to each sentence in the training set.
10. The method of claim 7, wherein the generating a sentence from the input data structure using the determined template as a condition for generating the sentence comprises:
taking the determined template as the condition for generating a sentence, inputting the state z_t of the template at each step, and outputting the word corresponding to that state; and composing the obtained words from left to right into the sentence corresponding to the input data structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011626095.XA CN114692569A (en) | 2020-12-31 | 2020-12-31 | Sentence generation method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011626095.XA CN114692569A (en) | 2020-12-31 | 2020-12-31 | Sentence generation method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114692569A true CN114692569A (en) | 2022-07-01 |
Family
ID=82135260
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011626095.XA Pending CN114692569A (en) | 2020-12-31 | 2020-12-31 | Sentence generation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114692569A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115062613A (en) * | 2022-07-12 | 2022-09-16 | 阿里巴巴(中国)有限公司 | Text processing method, electronic device and computer storage medium |
-
2020
- 2020-12-31 CN CN202011626095.XA patent/CN114692569A/en active Pending
Non-Patent Citations (1)
Title |
---|
YAO FU et al.: "Latent Template Induction with Gumbel-CRFs", NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems, 6 December 2020 (2020-12-06), page 20259 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111090987A (en) | Method and apparatus for outputting information | |
US20210034817A1 (en) | Request paraphrasing system, request paraphrasing model and request determining model training method, and dialogue system | |
CN111161740A (en) | Intention recognition model training method, intention recognition method and related device | |
WO2022218186A1 (en) | Method and apparatus for generating personalized knowledge graph, and computer device | |
CN112668338B (en) | Clarification problem generation method and device and electronic equipment | |
CN110263340B (en) | Comment generation method, comment generation device, server and storage medium | |
KR20190136911A (en) | method and device for retelling text, server and storage medium | |
CN110517767B (en) | Auxiliary diagnosis method, auxiliary diagnosis device, electronic equipment and storage medium | |
CN110442880B (en) | Translation method, device and storage medium for machine translation | |
JP2021033995A (en) | Text processing apparatus, method, device, and computer-readable storage medium | |
CN113836271B (en) | Method and product for natural language processing | |
US20230042683A1 (en) | Identifying and transforming text difficult to understand by user | |
CN115129826B (en) | Electric power field model pre-training method, fine tuning method, device and equipment | |
JP2020135456A (en) | Generation device, learning device, generation method and program | |
CN114722833B (en) | Semantic classification method and device | |
JP2017032996A (en) | Provision of adaptive electronic reading support | |
CN112711943B (en) | Uygur language identification method, device and storage medium | |
JP2020135457A (en) | Generation device, learning device, generation method and program | |
CN114692569A (en) | Sentence generation method and device | |
CN109117471B (en) | Word relevancy calculation method and terminal | |
Osuji et al. | A Systematic Review of Data-to-Text NLG | |
CN116432631A (en) | Entity relation extraction method and device based on syntactic information and attention mechanism | |
Posedaru et al. | Artificial Intelligence Text Processing Using Retrieval-Augmented Generation: Applications in Business and Education Fields | |
JPWO2019044583A1 (en) | Confusion network distributed representation generation device, confusion network classification device, confusion network distributed representation generation method, confusion network classification method, program | |
CN111368526B (en) | Sequence labeling method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |