CN113255372A - Information generation method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113255372A
Authority
CN
China
Prior art keywords
vector
target
word
dimension
conversion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110534852.9A
Other languages
Chinese (zh)
Inventor
崔志
张嘉益
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd and Beijing Xiaomi Pinecone Electronic Co Ltd
Priority to CN202110534852.9A
Publication of CN113255372A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The disclosure provides an information generation method and apparatus, an electronic device, and a storage medium. The method includes the following steps: after a dialog input is received, determining a target reference vector suited to the current dialog input according to the similarity between each reference vector in a reference vector set and the hidden vector corresponding to the dialog input; selecting a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector; and generating a reply sentence from the target word. Because the target reference vectors suited to different dialog inputs differ, different target words can be determined by combining different target reference vectors, so different reply sentences are ultimately determined, achieving diversity in the reply sentences.

Description

Information generation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer communication technologies, and in particular, to an information generating method and apparatus, an electronic device, and a storage medium.
Background
With the development of natural language processing technology, human-machine interaction can support human-machine dialogue, allowing people to obtain answers to questions quickly. After receiving the dialogue content input by a user, an electronic device performs semantic analysis on it, then determines and outputs a reply sentence.
At present, the reply sentences output by machines in the human-machine interaction field are monotonous. How to enrich the diversity of the reply sentences output by electronic devices and improve the user experience is a technical problem urgently awaiting a solution by those skilled in the art.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides an information generating method and apparatus.
According to a first aspect of the embodiments of the present disclosure, there is provided an information generating method, the method including:
after receiving a dialog input, determining a hidden vector corresponding to the dialog input;
determining a target reference vector according to the similarity between the reference vector in the reference vector set and the hidden vector;
selecting a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector;
and generating a reply sentence according to the target word.
In some embodiments, the word library is presented in vector form; selecting a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector includes:
splicing the hidden vector and the target reference vector to obtain a combined vector;
inputting the combined vector into a decoder so that the decoder outputs an intermediate vector;
performing dimension conversion on the intermediate vector to obtain a target vector, wherein the dimension of the target vector is less than or equal to the dimension of the word library;
and selecting the target word from the plurality of words included in the word library according to the element values of the elements in the target vector.
In some embodiments, the generating a reply statement from the target word includes:
inputting the target word into the decoder such that the decoder outputs a new intermediate vector;
performing dimension conversion on the new intermediate vector to obtain a new target vector, wherein the dimension of the new target vector is less than or equal to the dimension of the word library;
selecting a new target word from a plurality of words included in the word library according to the element value size of an element in the new target vector;
and after cyclically inputting each newly selected target word into the decoder, generating the reply sentence from the target words selected from the word library over the multiple iterations.
In some embodiments, the method further comprises:
extracting a sub-word library from the word library according to the combined vector;
the performing dimension conversion on the intermediate vector to obtain a target vector includes:
performing dimension conversion on the intermediate vector to obtain the target vector with the same dimension as the dimension of the sub-word library;
the selecting the target word from a plurality of words included in the word library according to element value sizes of elements in the target vector includes:
and selecting the target word from a plurality of words included in the sub-word library according to the element value size of the element in the target vector.
In some embodiments, said extracting a sub-word library from the word library according to the combination vector includes:
performing dimension conversion on the combined vector to obtain a conversion vector with the same dimension as the word library;
acquiring an element set of which the element values in the conversion vector meet an element value condition;
acquiring a word set positioned at the same position in the word library according to the position of the element set in the conversion vector;
and generating the sub-word library according to the word set.
In some embodiments, the word library is presented in the form of a 1×N dimensional vector, and the sub-word library is presented in the form of a 1×n dimensional vector, where n is less than N; the dimension of the intermediate vector is 1×M.
performing dimension conversion on the intermediate vector to obtain the target vector having the same dimension as the subword library, including:
determining a designated vector corresponding to each word in the sub-word library, wherein the position of the non-zero element in a designated vector is the same as the position of the corresponding word in the word library, and the dimension of the designated vector is the same as the dimension of the word library;
combining the designated vectors corresponding to all the words in the sub-word library to obtain a matrix with dimension n×N;
and determining the target vector with dimension 1×n according to the global conversion matrix with dimension M×N and the transpose of the matrix with dimension n×N.
In some embodiments, determining the target vector with dimension 1×n from the global conversion matrix with dimension M×N and the transpose of the matrix with dimension n×N includes:
determining a local conversion matrix with dimension M×n from the product of the global conversion matrix and the transposed matrix;
and determining the target vector with dimension 1×n from the product of the intermediate vector and the local conversion matrix.
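As a concrete illustration of the dimension conversion above, the following numpy sketch uses toy sizes (M = 4, N = 6, n = 2); the random global conversion matrix and the chosen sub-word positions are illustrative assumptions, not values from the disclosure. It also checks that multiplying by the transpose of the designated-vector matrix is equivalent to selecting the sub-word library's columns from the full-library conversion.

```python
import numpy as np

M, N, n = 4, 6, 2   # intermediate-vector dim, word-library dim, sub-library dim
rng = np.random.default_rng(1)

global_conversion = rng.standard_normal((M, N))   # global conversion matrix, M x N

# Designated (one-hot) vectors: suppose the sub-word library keeps the
# words at positions 2 and 5 of the full word library.
designated = np.zeros((n, N))
designated[0, 2] = 1.0
designated[1, 5] = 1.0                            # n x N matrix of designated vectors

local_conversion = global_conversion @ designated.T   # M x n local conversion matrix
intermediate = rng.standard_normal((1, M))            # decoder output, 1 x M
target = intermediate @ local_conversion              # 1 x n target vector

# Equivalently, the local conversion just selects columns 2 and 5 of the
# full-library conversion, so the smaller multiplication loses nothing:
full_target = intermediate @ global_conversion        # 1 x N
```

Because the decoder only ever needs scores for the n sub-library words, the 1×n product replaces the 1×N one at every generation step.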
In some embodiments, the method is performed by an information generating network comprising an encoder, a reference vector determination sub-network, and the decoder; the information generation network is obtained by training the following steps:
inputting a sample dialog input into the information generation network so that the decoder in the information generation network cyclically outputs sample intermediate vectors;
in response to each sample intermediate vector output by the decoder, performing dimension conversion on that sample intermediate vector to obtain a corresponding sample target vector;
in the word library, replacing the word currently to be determined of the sample reply sentence with a first identifier and replacing the words other than the word currently to be determined with a second identifier, to obtain a first standard vector;
determining the difference between the sample target vector obtained this time and the first standard vector;
and after the decoder has output sample intermediate vectors multiple times, adjusting the parameters in the information generation network according to the differences between the sample target vectors obtained each time and the corresponding first standard vectors.
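A minimal sketch of how the first standard vector described above could be constructed, assuming the first identifier is 1 and the second identifier is 0 (the disclosure does not fix these values); the toy word library is hypothetical.

```python
import numpy as np

# Hypothetical word library; not the library of the disclosure.
word_library = ["hello", "how", "are", "you", "today"]

def first_standard_vector(current_word, library, first_id=1.0, second_id=0.0):
    """Replace the word currently to be determined with the first identifier
    and every other word in the library with the second identifier."""
    v = np.full(len(library), second_id)
    v[library.index(current_word)] = first_id
    return v

# If the word currently to be determined in the sample reply sentence is "are":
standard = first_standard_vector("are", word_library)
```

With identifiers 1 and 0 the first standard vector is a one-hot target, so the per-step difference against the sample target vector behaves like an ordinary classification loss over the word library.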
In some embodiments, the information generation network further includes a sub-word library generation sub-network; training the information generation network further includes:
after the sample dialogue input is input into the information generation network, obtaining a sample conversion vector determined by the sub-word library generation sub-network;
in the word library, replacing words included in the sample reply sentence with third identifications, and replacing words not included in the sample reply sentence with fourth identifications to obtain a second standard vector;
adjusting parameters in the information generating network according to a difference between the sample conversion vector and the second standard vector.
According to a second aspect of the embodiments of the present disclosure, there is provided an information generating apparatus, the apparatus including:
a hidden vector determination module configured to determine a hidden vector corresponding to a dialog input after receiving the dialog input;
a reference vector determination module configured to determine a target reference vector according to similarity of a reference vector in a set of reference vectors and the hidden vector;
a target word selecting module configured to select a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector;
and the reply sentence generation module is configured to generate a reply sentence according to the target word.
In some embodiments, the library of words is presented in vector form; the target word selecting module comprises:
a combined vector obtaining submodule configured to splice the hidden vector and the target reference vector to obtain a combined vector;
a decoder use sub-module configured to input the combined vector to a decoder such that the decoder outputs an intermediate vector;
the dimension conversion sub-module is configured to perform dimension conversion on the intermediate vector to obtain a target vector, and the dimension of the target vector is less than or equal to the dimension of the word library;
a target word selection sub-module configured to select the target word from a plurality of words included in the word library according to element value sizes of elements in the target vector.
In some embodiments, the reply statement generation module includes:
a target word input submodule configured to input the target word into the decoder such that the decoder outputs a new intermediate vector;
the vector dimension conversion sub-module is configured to perform dimension conversion on the new intermediate vector to obtain a new target vector, and the dimension of the new target vector is less than or equal to the dimension of the word library;
a target word selection sub-module configured to select a new target word from a plurality of words included in the word library according to element value sizes of elements in the new target vector;
and the reply sentence generation submodule is configured to generate the reply sentence according to the target words selected from the word library for a plurality of times after the newly selected target words are circularly input into the decoder.
In some embodiments, the apparatus further comprises:
the sub-word library extraction module is configured to extract a sub-word library from the word library according to the combination vector;
the dimension conversion sub-module is configured to perform dimension conversion on the intermediate vector to obtain the target vector with the same dimension as the subword library;
the target word selection submodule is configured to select the target word from a plurality of words included in the sub-word library according to the element value size of an element in the target vector.
In some embodiments, the subword library extraction module includes:
a combined vector dimension conversion submodule configured to perform dimension conversion on the combined vector to obtain a conversion vector having the same dimension as the word bank;
an element set obtaining submodule configured to obtain an element set in which element values in the conversion vector satisfy an element value condition;
the word set acquisition submodule is configured to acquire a word set located at the same position in the word library according to the position of the element set in the conversion vector;
and the sub-word library generating sub-module is configured to generate the sub-word library according to the word set.
In some embodiments, the word library is presented in the form of a 1×N dimensional vector, and the sub-word library is presented in the form of a 1×n dimensional vector, where n is less than N; the dimension of the intermediate vector is 1×M.
the dimension conversion submodule comprises:
a designated vector obtaining unit configured to determine a designated vector corresponding to each word in the sub-word library, where the position of the non-zero element in a designated vector is the same as the position of the corresponding word in the word library, and the dimension of the designated vector is the same as the dimension of the word library;
a designated vector combining unit configured to combine the designated vectors corresponding to all the words in the sub-word library into a matrix with dimension n×N;
a target vector determination unit configured to determine the target vector with dimension 1×n from a global conversion matrix with dimension M×N and the transpose of the matrix with dimension n×N.
In some embodiments, the target vector determination unit includes:
a matrix multiplication subunit configured to determine a local conversion matrix with dimension M×n from the product of the global conversion matrix and the transposed matrix;
a vector multiplication subunit configured to determine the target vector with dimension 1×n from the product of the intermediate vector and the local conversion matrix.
In some embodiments, the apparatus operates through an information generation network comprising an encoder, a reference vector determination sub-network, and the decoder; the apparatus further comprises:
a sample acquisition submodule configured to acquire a sample dialogue input and a sample reply sentence;
an information generation network usage submodule configured to input the sample dialog input into the information generation network so that the decoder in the information generation network cyclically outputs sample intermediate vectors;
a sample vector dimension conversion sub-module configured to, in response to each sample intermediate vector output by the decoder, perform dimension conversion on that sample intermediate vector to obtain a corresponding sample target vector;
a first standard vector obtaining submodule configured to replace, in the term library, a term to be currently determined in the sample reply sentence with a first identifier, and replace terms other than the term to be currently determined with a second identifier, so as to obtain a first standard vector;
a difference determination submodule configured to determine a difference between the sample target vector obtained this time and the first standard vector;
a first parameter adjusting sub-module configured to, after the decoder has output sample intermediate vectors multiple times, adjust the parameters in the information generation network according to the differences between the sample target vectors obtained each time and the corresponding first standard vectors.
In some embodiments, the information generation network further includes a sub-word library generation sub-network, and the apparatus further includes:
a sample conversion vector acquisition sub-module configured to acquire a sample conversion vector determined by the subword library generation sub-network after the sample dialogue input is input into the information generation network;
a second standard vector obtaining sub-module configured to replace, in the term library, terms included in the sample reply statement with a third identifier, replace terms not included in the sample reply statement with a fourth identifier, and obtain a second standard vector;
a second parameter adjustment submodule configured to adjust a parameter in the information generation network according to a difference between the sample conversion vector and the second standard vector.
According to a third aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of the embodiments of the first aspect above.
According to a fourth aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of any of the first aspect above.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
In the embodiments of the disclosure, after receiving a dialog input, the electronic device determines a target reference vector suited to the current dialog input according to the similarity between each reference vector in a reference vector set and the hidden vector corresponding to the dialog input, selects a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector, and generates a reply sentence from the target word. Because the target reference vectors suited to different dialog inputs differ, different target words can be determined by combining different target reference vectors, so different reply sentences are ultimately determined, achieving diversity in reply-sentence generation.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow chart illustrating a method of generating information according to an exemplary embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating a method of training an information generating network according to an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an information generating network shown in accordance with an exemplary embodiment of the present disclosure;
FIG. 4 is a block diagram illustrating an information generating apparatus according to an exemplary embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "upon", "when", or "in response to determining", depending on the context.
With the development of natural language processing technology, human-machine interaction can support human-machine dialogue, allowing people to obtain answers to questions quickly. After receiving a question sentence input by a user, the electronic device performs semantic analysis on it, then determines and outputs a reply sentence.
At present, the reply sentences output by electronic devices in the human-machine interaction field are monotonous. How to enrich the diversity of the reply sentences output by electronic devices and improve the user experience is a technical problem urgently awaiting a solution by those skilled in the art.
The inventors found in research that the common practice in the related art is to generate dialogue with a sequence-to-sequence (seq2seq) model, which includes an encoder and a decoder. The encoder encodes the user's query to obtain the hidden vector for that query, and the decoder decodes the corresponding reply from the hidden vector.
However, the seq2seq model described above has several problems. First, generation requires selecting from the full vocabulary, and a vocabulary typically contains between 30,000 and 50,000 words; matrix operations at that scale generally make generation slow. Second, the encoder usually loses part of the information when encoding the query, and without additional optimization this directly degrades the generated reply. Third, seq2seq generation often produces monotonous, uninteresting replies, and it is difficult to generate different replies for the same query.
Based on this, the embodiments of the present disclosure provide an information generation method applicable to an electronic device: after a dialog input is received, a target reference vector applicable to the current dialog input is determined according to the similarity between each reference vector in a reference vector set and the hidden vector corresponding to the dialog input; a target word is selected from a plurality of words included in a preset word library according to the hidden vector and the target reference vector; and a reply sentence is generated from the target word. Because the target reference vectors applicable to different dialog inputs differ, different target words can be determined by combining different target reference vectors, so different reply sentences can ultimately be determined, achieving diversity in reply-sentence generation.
Fig. 1 is a flowchart illustrating an information generation method according to an exemplary embodiment. The method illustrated in Fig. 1 is applied to an electronic device, for example a mobile device such as a smartphone, tablet computer, smart watch, smart bracelet, smart head-mounted display device, smart speaker, or PDA (Personal Digital Assistant), or a fixed device such as a desktop computer or a smart television.
The method shown in fig. 1 comprises:
in step 101, after receiving a dialog input, a hidden vector corresponding to the dialog input is determined.
In some embodiments, the information generating method is performed by an information generating network, the information generating network comprising an encoder. The dialog input is input into the encoder such that the encoder outputs a hidden vector corresponding to the dialog input.
The hidden vector output by the encoder is used for abstracting and expressing the semantics of the dialog input.
The network structure of the encoder may be set as desired, for example, the encoder may be composed of at least one unidirectional GRU network.
For example, a dialog input consists of a sequence of words, query = [x1, x2, x3, ..., xm], where xi represents the i-th word in the query. [x1, x2, x3, ..., xm] is input into the unidirectional GRU network, which converts [x1, x2, x3, ..., xm] into [h1, h2, h3, ..., hm], where hm is the hidden vector corresponding to the dialog input.
Alternatively, the hidden vector corresponding to the dialog input can also be determined by methods in the related art.
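A minimal numpy sketch of a unidirectional GRU encoder of the kind described above; the weight shapes, random initialisation, and word embeddings are illustrative assumptions, not the encoder of the disclosure.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_encode(xs, Wz, Wr, Wh, Uz, Ur, Uh):
    """Run a unidirectional GRU over word vectors xs = [x1, ..., xm] and
    return the hidden states [h1, ..., hm]; the last state hm abstracts
    the semantics of the whole dialog input."""
    h = np.zeros(Uz.shape[0])
    hs = []
    for x in xs:
        z = sigmoid(Wz @ x + Uz @ h)              # update gate
        r = sigmoid(Wr @ x + Ur @ h)              # reset gate
        h_cand = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
        h = (1 - z) * h + z * h_cand              # new hidden state
        hs.append(h)
    return hs

# Illustrative sizes: m = 4 words, 6-dim embeddings, 5-dim hidden state.
rng = np.random.default_rng(0)
xs = [rng.standard_normal(6) for _ in range(4)]
Wz, Wr, Wh = (rng.standard_normal((5, 6)) * 0.1 for _ in range(3))
Uz, Ur, Uh = (rng.standard_normal((5, 5)) * 0.1 for _ in range(3))
hs = gru_encode(xs, Wz, Wr, Wh, Uz, Ur, Uh)
hidden_vector = hs[-1]   # hm: the hidden vector of the dialog input
```

In practice the encoder would be a trained recurrent network; the sketch only shows how [x1, ..., xm] maps to [h1, ..., hm] and how hm is read off as the hidden vector.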
In step 102, a target reference vector is determined according to the similarity between the reference vector in the reference vector set and the hidden vector.
For a given input query, different reference vectors lead to different replies; the reference vector set may include multiple reference vectors, representing multiple possible replies, thereby enabling diversified replies.
In some embodiments, the set of reference vectors may be pre-set, or the set of reference vectors may be randomly introduced. The number of reference vectors included in the reference vector set may be set as desired.
In some embodiments, for each reference vector in the set of reference vectors, a similarity between the reference vector and the hidden vector is determined, and a target reference vector with a similarity satisfying a similarity condition is selected from the set of reference vectors.
For example, from the set of reference vectors, the target reference vector with the greatest similarity is selected.
The similarity in this embodiment may be cosine similarity or other types of similarity.
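A short sketch of this selection step, assuming the similarity condition is "maximum cosine similarity" and using hand-picked three-dimensional vectors purely for illustration.

```python
import numpy as np

def pick_target_reference(hidden, reference_set):
    """Select the reference vector whose cosine similarity with the
    hidden vector is largest."""
    sims = [np.dot(hidden, r) / (np.linalg.norm(hidden) * np.linalg.norm(r))
            for r in reference_set]
    best = int(np.argmax(sims))
    return reference_set[best], sims[best]

hidden = np.array([1.0, 0.0, 1.0])
reference_set = [
    np.array([0.0, 1.0, 0.0]),   # orthogonal to the hidden vector
    np.array([2.0, 0.0, 2.0]),   # same direction as the hidden vector
    np.array([1.0, 1.0, 0.0]),
]
target_reference, similarity = pick_target_reference(hidden, reference_set)
# target_reference is [2.0, 0.0, 2.0]: it points in the same direction
# as the hidden vector, so its cosine similarity is the maximum (1.0)
```

Other similarity measures (e.g. dot product or negative Euclidean distance) would slot into the same structure.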
In some embodiments, the dimension of the reference vectors may be the same as the dimension of the hidden vector. For example, the dimensions of the reference vectors and the hidden vector are both 1×s, where s is a positive integer.
In some embodiments, the information generation method may be performed by an information generation network, which may include a reference vector determination sub-network.
The reference vector set and the hidden vector may be input into the reference vector determination sub-network, so that the sub-network determines the target reference vector based on the similarity between the reference vectors in the set and the hidden vector, and outputs the target reference vector.
In step 103, a target word is selected from a plurality of words included in a preset word library according to the hidden vector and the target reference vector.
In some embodiments, the corpus of words is presented in vector form. For example, the word library is presented in a vector form of 1 × N dimensions.
The target word may be selected from the plurality of words included in the preset word library according to the hidden vector and the target reference vector as follows. First: splice the hidden vector and the target reference vector to obtain a combined vector. Second: input the combined vector into a decoder so that the decoder outputs an intermediate vector. Third: perform dimension conversion on the intermediate vector to obtain a target vector whose dimension is less than or equal to the dimension of the word library. Fourth: select a target word from the plurality of words included in the word library according to the element values of the elements in the target vector.
For the first step, suppose, for example, that the hidden vector is [a1, a2, a3, ..., a10] and the target reference vector is [b1, b2, b3, ..., b10]; splicing them yields the combined vector [a1, a2, a3, ..., a10, b1, b2, b3, ..., b10].
For the second step, the information generating method is performed by an information generating network including a decoder. The network structure of the decoder may be set as desired, for example, the decoder may be composed of at least one unidirectional GRU network.
The dimension of the intermediate vector may be the same as the dimension of the hidden vector.
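A single update step of a unidirectional GRU cell, written out in plain NumPy as a sketch; the weight shapes and random values here are assumptions, and a real decoder would use a trained GRU implementation from a deep learning framework:

```python
import numpy as np

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU update: input x (D_in,), previous hidden state h (D_h,)."""
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    z = sig(Wz @ x + Uz @ h)                   # update gate
    r = sig(Wr @ x + Ur @ h)                   # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate hidden state
    return (1.0 - z) * h + z * h_tilde         # new hidden state
```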
For the third step, the intermediate vector is the vector output by the decoder in each cycle, and represents the semantics of the word to be output at the current step. To obtain the current word, dimension conversion is performed on the intermediate vector through a fully connected layer to obtain a target vector, the position of the largest value in the target vector is determined, and the word at the same position is retrieved from the word library as the current word.
For the fourth step, the first case: when the dimension of the target vector is equal to the dimension of the word library, the position of the element with the largest element value in the target vector is determined, and the target word at the same position is selected from the plurality of words in the word library.
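In this first case the selection reduces to an argmax over the target vector, e.g.:

```python
import numpy as np

def pick_word(target_vector, word_library):
    """Select the word at the position of the largest element value.

    word_library is assumed to be a list of words whose positions are
    aligned with the elements of the target vector.
    """
    return word_library[int(np.argmax(target_vector))]
```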
The second case: after the electronic device obtains the combined vector, a sub-word library can be extracted from the word library according to the combined vector.
In this case, the electronic device may perform dimension conversion on the intermediate vector to obtain a target vector having the same dimension as the sub-word library, and select a target word from the plurality of words included in the sub-word library according to the element values of the elements in the target vector.
For example, the position of the element with the largest element value in the target vector is determined, and the target word at the same position is selected from a plurality of words included in the sub-word library.
In this case, a matrix multiplication producing the target vector is performed in each target word generation step, so reducing the dimensionality of the target vector effectively speeds up generating the target vector and thus improves the efficiency of the overall decoding.
The following describes an example implementation of extracting the sub-word library from the word library according to the combined vector in the second case.
Firstly, dimension conversion is carried out on the combined vector to obtain a conversion vector with the same dimension as the word library.
Dimension conversion may be performed on the combined vector through a fully connected layer to obtain the conversion vector.
The information generating method provided in this embodiment may be performed by an information generating network, where the information generating network may include a sub-word library generation sub-network. The combined vector may be input into the sub-word library generation sub-network, so that the sub-word library generation sub-network performs dimension conversion on the combined vector, obtains a conversion vector having the same dimension as the word library, and outputs the conversion vector.
Secondly, an element set of which the element values in the conversion vector satisfy the element value condition is obtained.
The element values of the elements in the translation vector may be scores.
The element value condition may define an element value threshold; when an element value reaches the threshold, it is determined to satisfy the condition. Alternatively, the element value condition may define an element value range; when an element value falls within the range, it is determined to satisfy the condition. Alternatively, the element value condition may require that an element value rank among the top Q; when an element value is one of the Q largest, it is determined to satisfy the condition.
For example, if Q is 1000, the 1000 largest element values are determined to satisfy the element value condition, and these 1000 element values are combined to obtain the element set.
Thirdly, a word set located at the same positions in the word library is acquired according to the positions of the element set in the conversion vector.
The word stock is presented in vector form, and the conversion vector and the word stock have the same dimension.
The position of the set of elements in the translation vector can be understood as: in the translation vector, the position of each element in the set of elements.
Finally, the sub-word library is generated according to the word set.
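The three sub-steps above (dimension conversion of the combined vector, selecting elements that satisfy the element value condition, and gathering the words at the same positions) can be sketched as follows; the fully connected weights W are a stand-in for trained parameters, and the "top Q by score" condition is one of the alternatives described above:

```python
import numpy as np

def extract_sub_library(combined, W, word_library, q):
    """Extract a sub-word library of q words from the full library.

    combined: (C,) combined vector; W: (C, N) fully connected weights;
    word_library: list of N words aligned with the conversion vector.
    """
    conversion = combined @ W                # (N,) conversion vector
    positions = np.argsort(conversion)[-q:]  # indices of the q largest scores
    positions = np.sort(positions)           # keep original vocabulary order
    sub_library = [word_library[i] for i in positions]
    return sub_library, positions
```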
The following describes an example of the process of performing dimension conversion on the intermediate vector in the second case to obtain a target vector having the same dimension as the sub-word library.
Suppose that the word library is presented in the form of a 1 × N-dimensional vector, the sub-word library is presented in the form of a 1 × n-dimensional vector, n is smaller than N, and the dimension of the intermediate vector is 1 × M.
A designated vector corresponding to each word in the sub-word library is determined, where the element position of the non-zero element in the designated vector is the same as the position of the corresponding word in the word library, and the dimension of the designated vector is the same as the dimension of the word library, i.e., 1 × N. For example, the non-zero element is 1.
After the designated vector corresponding to each word in the sub-word library is obtained, the designated vectors corresponding to all the words in the sub-word library are combined to obtain a matrix with dimension n × N, and a target vector with dimension 1 × n is determined according to a global conversion matrix with dimension M × N and the transpose of that matrix.
For example, the global conversion matrix with dimension M × N may be denoted matrix A, and the matrix with dimension n × N denoted matrix B. A local conversion matrix with dimension M × n is determined from the product of matrix A and the transpose of matrix B, and a target vector with dimension 1 × n is determined from the product of the intermediate vector and the local conversion matrix.
The dimensions of the target vector and the subword library obtained by the method are both 1 × n.
Illustratively, the word library is presented in the form of a 1 × 20,000-dimensional vector, the sub-word library is presented in the form of a 1 × 1000-dimensional vector (so n = 1000 is less than N = 20,000), and the dimension of the intermediate vector is 1 × 512. The dimension of the designated vector corresponding to each word in the sub-word library is 1 × 20,000. The designated vectors corresponding to all the words in the sub-word library are combined to obtain a matrix with dimension 1000 × 20,000. The global conversion matrix with dimension 512 × 20,000 is denoted matrix A, and the matrix with dimension 1000 × 20,000 is denoted matrix B. Matrix A is multiplied by the transpose of matrix B to obtain a local conversion matrix with dimension 512 × 1000. The intermediate vector with dimension 1 × 512 is multiplied by the local conversion matrix with dimension 512 × 1000 to obtain the target vector with dimension 1 × 1000.
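In matrix terms, stacking the one-hot designated vectors makes B's transpose a column-selection matrix, so A multiplied by the transpose of B simply picks out the n columns of A at the sub-library positions. A small-scale sketch, with 8, 50, 5 standing in for 512, 20,000, 1000:

```python
import numpy as np

M, N, n = 8, 50, 5                        # small stand-ins for 512, 20000, 1000
rng = np.random.default_rng(0)
A = rng.standard_normal((M, N))           # global conversion matrix, M x N
positions = np.array([3, 7, 11, 20, 42])  # sub-library word positions (illustrative)
B = np.zeros((n, N))
B[np.arange(n), positions] = 1.0          # stacked one-hot designated vectors, n x N
local = A @ B.T                           # local conversion matrix, M x n
h = rng.standard_normal((1, M))           # intermediate vector, 1 x M
target = h @ local                        # target vector, 1 x n
```

Note that `local` equals `A[:, positions]`, which is why the local conversion matrix gives the same scores as the global one restricted to the sub-library.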
Because the local conversion matrix has a smaller dimension than the global conversion matrix, performing dimension conversion on the intermediate vector with the local conversion matrix requires less computation, so the target vector can be generated quickly and the generation rate of the reply sentence is improved.
Likewise, because the sub-word library contains far fewer words than the full word library, the target vector obtained with the local conversion matrix has a smaller dimension than one obtained with the global conversion matrix. Selecting the target word according to the element values of the elements in that target vector therefore also requires less computation, further improving the generation rate of the reply sentence.
By adopting the method, the calculation process is simplified, the target vector can be quickly determined, and the generation rate of the reply sentence is improved.
In step 104, a reply statement is generated from the target word.
In some embodiments, for some types of decoders, the decoder may output a plurality of intermediate vectors at a time according to an input combination vector, determine all target words according to the plurality of intermediate vectors, and combine all target words to obtain a reply sentence.
In this embodiment, all target words are determined in step 103.
In some embodiments, for some types of decoders, for example, a decoder composed of a GRU network, the decoder may output one or several intermediate vectors at a time based on the input combined vector, and determine some of all the target words based on the one or several intermediate vectors.
In this case, after step 103 is executed, the target word determined in step 103 may be input into the decoder so that the decoder outputs a new intermediate vector. Dimension conversion is performed on the new intermediate vector to obtain a new target vector, whose dimension is less than or equal to the dimension of the word library. A new target word is selected from the plurality of words included in the word library according to the element values of the elements in the new target vector. After the newly selected target word is input into the decoder in a loop, the reply sentence is generated from the target words selected from the word library over the multiple iterations.
After the newly selected target word is input into the decoder, further target words may be selected in the same manner as described above.
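The decoding loop described above can be sketched as follows; `decoder_step` is a hypothetical callable (not named in the text) that maps the previously selected word to the next target vector, and the stop token `[EOS]` matches the one mentioned below:

```python
import numpy as np

def generate_reply(decoder_step, word_library, eos="[EOS]", max_len=20):
    """Greedy autoregressive decoding until the stop token appears."""
    words, previous = [], None
    for _ in range(max_len):
        target_vector = decoder_step(previous)             # new target vector
        word = word_library[int(np.argmax(target_vector))] # new target word
        if word == eos:                                    # stop token ends loop
            break
        words.append(word)
        previous = word                                    # feed word back in
    return " ".join(words)
```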
The information generating method provided by this embodiment may be performed by an information generating network, and the information generating network may include a reference vector determination sub-network. After the reference vector determination sub-network determines the target reference vector based on the similarity between the reference vectors in the reference vector set and the hidden vector, a new target word may be selected from the plurality of words included in the word library and output.
During training of the information generating network, a stop token, e.g., [EOS], may be added to the training data to instruct the information generating network to stop inputting the newly selected target word into the decoder and end the loop. After training, the information generating network can recognize when the loop should end.
In the embodiment of the disclosure, after receiving a dialog input, the electronic device determines a target reference vector suited to the current dialog input according to the similarity between the reference vectors in the reference vector set and the hidden vector corresponding to the dialog input, selects a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector, and generates a reply sentence according to the target word. Because the target reference vectors suited to different dialog inputs differ, different target words can be determined by combining different target reference vectors, so different reply sentences can ultimately be generated, achieving diversity of reply sentences.
In some embodiments, the information generating method provided by the present embodiment is performed by an information generating network including an encoder, a reference vector determination subnetwork, and a decoder.
Before using the information generating network, the information generating network needs to be trained.
Fig. 2 is a flowchart illustrating a training method of an information generating network according to an exemplary embodiment, and referring to fig. 2, the training method of the information generating network includes:
in step 201, a sample dialog input and a sample reply statement are obtained.
For example, the sample dialog input is "Have you eaten today?" and the sample reply sentence is "Yes, I ate and I'm very full."
In step 202, the sample dialog input is input into the information generating network, so that the decoder in the information generating network cyclically outputs sample intermediate vectors.
In step 203, in response to the decoder outputting the sample intermediate vector once, the sample intermediate vector output once is subjected to dimension conversion to obtain a corresponding sample target vector.
In step 204, in the word library, the word to be determined currently in the sample reply sentence is replaced with a first identifier, and the words other than the word to be determined currently are replaced with a second identifier, so as to obtain a first standard vector.
For example, in the word library, the word to be currently determined in the sample reply sentence is replaced with 1, and the words other than the word to be currently determined are replaced with 0, so that the first standard vector is obtained.
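Building the first standard vector is effectively one-hot encoding the current target word against the word library:

```python
import numpy as np

def first_standard_vector(word_library, current_word):
    """One-hot standard vector: 1 at the current word's position, 0 elsewhere."""
    v = np.zeros(len(word_library))
    v[word_library.index(current_word)] = 1.0
    return v
```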
In step 205, the difference between the sample target vector obtained this time and the first standard vector is determined.
In step 206, after the decoder outputs the sample intermediate vector for a plurality of times, parameters in the information generating network are adjusted according to a difference between the sample target vector obtained for a plurality of times and the first standard vector.
The differences determined over the multiple iterations are aggregated, for example by direct addition, weighted addition, or another calculation, and the parameters in the information generating network are adjusted according to the aggregated result.
There are various ways to adjust the parameters in the information generating network according to the differences between the sample target vectors obtained over multiple iterations and the first standard vectors: for example, adjusting the parameters a preset number of times, or adjusting the parameters until the difference is less than or equal to a difference threshold, or until the difference is minimized.
In some embodiments, the information generating network may further include a sub-word library generation sub-network, where the sub-word library generation sub-network is configured to perform dimension conversion on the combined vector to obtain a conversion vector having the same dimension as the word library, obtain an element set whose element values in the conversion vector satisfy the element value condition, obtain the word set at the same positions in the word library according to the positions of the element set in the conversion vector, and generate the sub-word library according to the word set.
Fig. 3 is a schematic diagram illustrating an information generating network according to an exemplary embodiment, the information generating network illustrated in fig. 3 including an encoder, a reference vector determination sub-network, a sub-word library generation sub-network, and a decoder, and arrow directions in fig. 3 indicate transmission directions of data.
Under this network structure, the method for training the information generating network may further include: after the sample dialog input is input into the information generating network, obtaining a sample conversion vector determined by the sub-word library generation sub-network; in the word library, replacing the words included in the sample reply sentence with a third identifier and the words not included in the sample reply sentence with a fourth identifier, to obtain a second standard vector; and adjusting the parameters in the information generating network according to the difference between the sample conversion vector and the second standard vector.
For example, in the word library, a word included in the sample reply sentence is replaced with 1, a word not included in the sample reply sentence is replaced with 0, and the second standard vector is obtained.
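The second standard vector is thus a multi-hot encoding of the whole sample reply sentence over the word library:

```python
import numpy as np

def second_standard_vector(word_library, reply_words):
    """Multi-hot vector: 1 for every library word that appears in the reply."""
    reply = set(reply_words)
    return np.array([1.0 if w in reply else 0.0 for w in word_library])
```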
There are various ways to adjust the parameters in the information generating network according to the difference between the sample conversion vector and the second standard vector: for example, adjusting the parameters a preset number of times, or adjusting the parameters until the difference is less than or equal to a difference threshold, or until the difference is minimized.
In the process of training the network, parameters in the information generation network can be adjusted according to the difference between the sample target vector and the first standard vector obtained for multiple times and the difference between the sample conversion vector and the second standard vector, so that the optimization of the information generation network is better realized.
In this embodiment, the multi-classification loss of the sub-word library is optimized during network training, so that the sub-word library can include all the words required to generate the reply sentence. As a result, the target word only needs to be selected from one determined sub-word library rather than from different sub-word libraries, which improves the generation rate of the reply sentence.
In some embodiments, an appropriate target reference vector may be selected from the reference vector set by Gumbel-Softmax. Gumbel-Softmax is chosen because, when multiple discretized reference vectors are introduced, it keeps the entire training end-to-end with all parameter gradients differentiable.
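A NumPy sketch of Gumbel-Softmax sampling over reference-vector scores; in a deep learning framework one would use the built-in (e.g. torch.nn.functional.gumbel_softmax), and the temperature value here is illustrative:

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Differentiable approximation to sampling a discrete index.

    Adds Gumbel(0, 1) noise to the logits, then applies a temperature-scaled
    softmax; as tau -> 0 the output approaches a one-hot vector.
    """
    rng = rng or np.random.default_rng()
    u = rng.uniform(low=1e-9, high=1.0, size=np.shape(logits))
    g = -np.log(-np.log(u))               # Gumbel(0, 1) noise
    y = (np.asarray(logits) + g) / tau
    e = np.exp(y - y.max())               # numerically stable softmax
    return e / e.sum()
```

The resulting weights can then combine the reference vectors, e.g. `target_ref = weights @ reference_set`, while keeping the selection differentiable during training.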
In some embodiments, a sequence-to-sequence network in the related art may be modified to obtain the information generation network used in the above embodiments.
While, for purposes of simplicity of explanation, the foregoing method embodiments have been described as a series of acts or combination of acts, it will be appreciated by those skilled in the art that the present disclosure is not limited by the order of acts, as some steps may, in accordance with the present disclosure, occur in other orders and concurrently.
Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that acts and modules referred to are not necessarily required by the disclosure.
Corresponding to the embodiment of the application function implementation method, the disclosure also provides an embodiment of an application function implementation device and corresponding electronic equipment.
Fig. 4 is a block diagram illustrating an information generating apparatus according to an example embodiment, the apparatus comprising:
a hidden vector determining module 31 configured to determine, after receiving a dialog input, a hidden vector corresponding to the dialog input;
a reference vector determination module 32 configured to determine a target reference vector according to similarity between a reference vector in the reference vector set and the hidden vector;
a target word selecting module 33 configured to select a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector;
a reply sentence generation module 34 configured to generate a reply sentence according to the target word.
In some embodiments, on the basis of the information generating apparatus shown in fig. 4, the word library is presented in a vector form; the target word selecting module 33 may include:
a combined vector obtaining submodule configured to splice the hidden vector and the target reference vector to obtain a combined vector;
a decoder use sub-module configured to input the combined vector to a decoder such that the decoder outputs an intermediate vector;
the dimension conversion sub-module is configured to perform dimension conversion on the intermediate vector to obtain a target vector, and the dimension of the target vector is smaller than or equal to the dimension of the word bank;
a target word selection sub-module configured to select the target word from a plurality of words included in the word library according to element value sizes of elements in the target vector.
In some embodiments, the reply statement generation module 34 may include:
a target word input submodule configured to input the target word into the decoder such that the decoder outputs a new intermediate vector;
the vector dimension conversion sub-module is configured to perform dimension conversion on the new intermediate vector to obtain a new target vector, and the dimension of the new target vector is smaller than or equal to that of the word bank;
a target word selection sub-module configured to select a new target word from a plurality of words included in the word library according to element value sizes of elements in the new target vector;
and the reply sentence generation submodule is configured to generate the reply sentence according to the target words selected from the word library for a plurality of times after the newly selected target words are circularly input into the decoder.
In some embodiments, the apparatus may further comprise:
the sub-word library extraction module is configured to extract a sub-word library from the word library according to the combination vector;
the dimension conversion sub-module is configured to perform dimension conversion on the intermediate vector to obtain the target vector with the same dimension as the subword library;
the target word selection submodule is configured to select the target word from a plurality of words included in the sub-word library according to the element value size of an element in the target vector.
In some embodiments, the subword library extraction module may include:
a combined vector dimension conversion submodule configured to perform dimension conversion on the combined vector to obtain a conversion vector having the same dimension as the word bank;
an element set obtaining submodule configured to obtain an element set in which element values in the conversion vector satisfy an element value condition;
the word set acquisition submodule is configured to acquire a word set located at the same position in the word library according to the position of the element set in the conversion vector;
and the sub-word library generating sub-module is configured to generate the sub-word library according to the word set.
In some embodiments, the word library is presented in the form of a 1 × N-dimensional vector, the sub-word library is presented in the form of a 1 × n-dimensional vector, n being less than N; the dimension of the intermediate vector is 1 × M;
the dimension conversion sub-module may include:
a designated vector obtaining unit configured to determine a designated vector corresponding to each word in the sub-word library, where the element position of the non-zero element in the designated vector is the same as the position of the corresponding word in the word library, and the dimension of the designated vector is the same as the dimension of the word library;
a designated vector combination unit configured to combine the designated vectors corresponding to all the words in the sub-word library to obtain a matrix with dimension n × N;
a target vector determination unit configured to determine the target vector of dimension 1 × n from a global conversion matrix of dimension M × N and the transpose of the matrix of dimension n × N.
In some embodiments, the target vector determination unit may include:
a matrix multiplication subunit configured to obtain a local conversion matrix with dimension M × n according to a product of the global conversion matrix and the transposed matrix;
a vector multiplication subunit configured to obtain the target vector having a dimension of 1 × n according to a product of the intermediate vector and the local conversion matrix.
In some embodiments, the method is performed by an information generating network comprising an encoder, a reference vector determination sub-network, and the decoder; the apparatus may further include:
a sample acquisition submodule configured to acquire a sample dialogue input and a sample reply sentence;
an information generation network usage submodule configured to input the sample dialog input into the information generation network such that the decoders in the information generation network cyclically output sample intermediate vectors;
a sample vector dimension conversion sub-module, configured to respond to a sample intermediate vector output by the decoder at a time, perform dimension conversion on the sample intermediate vector output at the time, and obtain a corresponding sample target vector;
a first standard vector obtaining submodule configured to replace, in the word library, the word to be currently determined in the sample reply sentence with a first identifier, and replace the words other than the word to be currently determined with a second identifier, to obtain a first standard vector;
a difference determination submodule configured to determine a difference between the sample target vector obtained this time and the first standard vector;
a first parameter adjusting sub-module configured to adjust a parameter in the information generating network according to a difference between a sample target vector obtained multiple times and a first standard vector after the decoder outputs a sample intermediate vector multiple times.
In some embodiments, the network training module may further specifically include:
a sample conversion vector acquisition sub-module configured to acquire a sample conversion vector determined by the subword library generation sub-network after the sample dialogue input is input into the information generation network;
a second standard vector obtaining sub-module configured to replace, in the word library, the words included in the sample reply sentence with a third identifier, replace the words not included in the sample reply sentence with a fourth identifier, and obtain a second standard vector;
a second parameter adjustment submodule configured to adjust a parameter in the information generation network according to a difference between the sample conversion vector and the second standard vector.
Fig. 5 is a schematic diagram illustrating a structure of an electronic device 1600 according to an example embodiment. For example, the electronic device 1600 may be a user device, which may be embodied as a mobile phone, a computer, a digital broadcast electronic device, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, a wearable device such as a smart watch, smart glasses, smart band, smart running shoe, and the like.
Referring to fig. 5, electronic device 1600 may include one or more of the following components: processing component 1602, memory 1604, power component 1606, multimedia component 1608, audio component 1610, input/output (I/O) interface 1612, sensor component 1614, and communications component 1616.
The processing component 1602 generally controls overall operation of the electronic device 1600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1602 may include one or more processors 1620 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 1602 can include one or more modules that facilitate interaction between the processing component 1602 and other components. For example, the processing component 1602 can include a multimedia module to facilitate interaction between the multimedia component 1608 and the processing component 1602.
The memory 1604 is configured to store various types of data to support operation at the device 1600. Examples of such data include instructions for any application or method operating on the electronic device 1600, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1604 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 1606 provides power to the various components of the electronic device 1600. The power components 1606 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 1600.
The multimedia component 1608 includes a screen that provides an output interface between the electronic device 1600 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of the touch or slide action but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1608 comprises a front-facing camera and/or a rear-facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 1600 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 1610 is configured to output and/or input an audio signal. For example, the audio component 1610 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 1600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 1604 or transmitted via the communications component 1616. In some embodiments, audio component 1610 further includes a speaker for outputting audio signals.
The I/O interface 1612 provides an interface between the processing component 1602 and peripheral interface modules, such as keyboards, click wheels, buttons, and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
Sensor assembly 1614 includes one or more sensors for providing various aspects of status assessment for electronic device 1600. For example, sensor assembly 1614 may detect an open/closed state of electronic device 1600, the relative positioning of components, such as a display and keypad of electronic device 1600, a change in position of electronic device 1600 or a component of electronic device 1600, the presence or absence of user contact with electronic device 1600, orientation or acceleration/deceleration of electronic device 1600, and a change in temperature of electronic device 1600. The sensor assembly 1614 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 1614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communications component 1616 is configured to facilitate communications between the electronic device 1600 and other devices in a wired or wireless manner. The electronic device 1600 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1616 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the aforementioned communication component 1616 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 1600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium, such as the memory 1604 including instructions that, when executed by the processor 1620 of the electronic device 1600, enable the electronic device 1600 to perform an information generation method, the method comprising: after receiving a dialog input, determining a hidden vector corresponding to the dialog input; determining a target reference vector according to the similarity between the reference vector in the reference vector set and the hidden vector; selecting a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector; and generating a reply sentence according to the target word.
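The similarity-based selection of a target reference vector described above can be sketched as follows. This is an illustrative toy in plain Python, assuming cosine similarity as the measure (the claims do not fix a particular similarity function); the vectors and all names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def select_target_reference(hidden, references):
    """Pick the reference vector from the set most similar to the hidden vector."""
    return max(references, key=lambda r: cosine_similarity(hidden, r))

# Toy data: a hidden vector derived from the dialog input, and a reference set.
hidden = [1.0, 0.0, 0.0]
refs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
target_ref = select_target_reference(hidden, refs)  # the second vector wins
```

The remaining steps (word selection and reply assembly) would then consume `hidden` and `target_ref` together, as the claims below spell out.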
The non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (12)

1. An information generating method, characterized in that the method comprises:
after receiving a dialog input, determining a hidden vector corresponding to the dialog input;
determining a target reference vector according to the similarity between the reference vector in the reference vector set and the hidden vector;
selecting a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector;
and generating a reply sentence according to the target word.
2. The method of claim 1, wherein the word library is presented in vector form; and the selecting a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector comprises:
splicing the hidden vector and the target reference vector to obtain a combined vector;
inputting the combined vector to a decoder such that the decoder outputs an intermediate vector;
performing dimension conversion on the intermediate vector to obtain a target vector, wherein the dimension of the target vector is less than or equal to the dimension of the word library;
selecting the target word from the plurality of words included in the word library according to the magnitudes of the element values in the target vector.
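The splicing, dimension-conversion, and word-selection steps of claim 2 can be sketched roughly as below. The conversion matrix, toy word library, and all names are hypothetical, and the plain matrix product stands in for whatever learned projection follows the decoder in the actual network.

```python
def concat(hidden, target_ref):
    """Splice the hidden vector and the target reference vector into one combined vector."""
    return hidden + target_ref

def dimension_convert(vec, matrix):
    """Project a 1 x M vector through an M x N matrix to obtain a 1 x N target vector."""
    m, n = len(matrix), len(matrix[0])
    assert len(vec) == m
    return [sum(vec[i] * matrix[i][j] for i in range(m)) for j in range(n)]

def select_word(target_vector, word_library):
    """Pick the word at the position of the largest element value."""
    best = max(range(len(target_vector)), key=target_vector.__getitem__)
    return word_library[best]

word_library = ["hello", "world", "ok"]
intermediate = [0.5, -0.2]                  # pretend decoder output (1 x M, M = 2)
W = [[0.1, 2.0, 0.0], [0.3, 0.0, 1.0]]      # hypothetical M x N conversion matrix
target = dimension_convert(intermediate, W)  # dimension matches the word library
word = select_word(target, word_library)
```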
3. The method of claim 2, wherein generating a reply statement from the target term comprises:
inputting the target word into the decoder such that the decoder outputs a new intermediate vector;
performing dimension conversion on the new intermediate vector to obtain a new target vector, wherein the dimension of the new target vector is less than or equal to the dimension of the word library;
selecting a new target word from the plurality of words included in the word library according to the magnitudes of the element values in the new target vector;
and after cyclically inputting each newly selected target word into the decoder, generating the reply sentence according to the target words selected from the word library over the multiple selections.
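The loop of claim 3 — feeding each selected word back into the decoder until the sentence is complete — can be illustrated with a toy stand-in for the decoder. The transition table, stop token, and length cap are all invented for the example.

```python
def generate_reply(first_word, decoder_step, select_word, max_len=5, eos="<eos>"):
    """Feed each selected word back to the decoder until EOS or the length cap."""
    words = [first_word]
    while len(words) < max_len:
        new_intermediate = decoder_step(words[-1])  # decoder emits a new intermediate
        new_word = select_word(new_intermediate)    # dimension conversion + argmax
        if new_word == eos:
            break
        words.append(new_word)
    return " ".join(words)

# Toy "decoder": a fixed transition table standing in for the real network.
transitions = {"how": "are", "are": "you", "you": "<eos>"}
reply = generate_reply(
    "how",
    decoder_step=lambda w: transitions.get(w, "<eos>"),
    select_word=lambda v: v,  # identity: here the "intermediate vector" is already a word
)
```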
4. A method according to claim 2 or 3, characterized in that the method further comprises:
extracting a sub-word library from the word library according to the combined vector;
the performing dimension conversion on the intermediate vector to obtain a target vector includes:
performing dimension conversion on the intermediate vector to obtain the target vector with the same dimension as the dimension of the sub-word library;
the selecting the target word from a plurality of words included in the word library according to the magnitudes of the element values in the target vector comprises:
and selecting the target word from a plurality of words included in the sub-word library according to the magnitudes of the element values in the target vector.
5. The method of claim 4, wherein extracting a sub-term library from the term library according to the combined vector comprises:
performing dimension conversion on the combined vector to obtain a conversion vector with the same dimension as the word library;
acquiring an element set of which the element values in the conversion vector meet an element value condition;
acquiring a word set positioned at the same position in the word library according to the position of the element set in the conversion vector;
and generating the sub-word library according to the word set.
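Claim 5's extraction of a sub-word library — keeping the words at the positions where the conversion vector's elements satisfy an element-value condition — might look like the following sketch. The threshold condition and all values are hypothetical; the claim does not fix a particular condition.

```python
def extract_subword_library(conversion_vector, word_library, threshold=0.5):
    """Keep the words whose positions in the conversion vector pass the threshold."""
    positions = [i for i, v in enumerate(conversion_vector) if v >= threshold]
    return [word_library[i] for i in positions], positions

word_library = ["a", "b", "c", "d"]
conversion = [0.9, 0.1, 0.7, 0.2]   # same dimension as the word library
sub_library, kept = extract_subword_library(conversion, word_library)
```

The retained positions (`kept`) are what later steps use to build the designated vectors of claim 6.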
6. The method of claim 4, wherein the word library is presented in the form of a 1×N-dimensional vector, the sub-word library is presented in the form of a 1×n-dimensional vector, and n is less than N; and the dimension of the intermediate vector is 1×M;
performing dimension conversion on the intermediate vector to obtain the target vector having the same dimension as the subword library, including:
determining a designated vector corresponding to each word in the sub-word library, wherein the element position of a non-zero element in the designated vector is the same as the position of the corresponding word in the sub-word library, and the dimension of the designated vector is the same as the dimension of the word library;
combining the designated vectors corresponding to all the words in the sub-word library to obtain a matrix with a dimension of n multiplied by N;
and determining the target vector with a dimension of 1 multiplied by n according to the global conversion matrix with a dimension of M multiplied by N and the transpose matrix of the matrix with the dimension of n multiplied by N.
7. The method of claim 6, wherein the determining the target vector with the dimension of 1 multiplied by n according to the global conversion matrix with the dimension of M multiplied by N and the transpose matrix of the matrix with the dimension of n multiplied by N comprises:
determining a local conversion matrix with dimension of M multiplied by n according to the product of the global conversion matrix and the transposition matrix;
and determining the target vector with the dimension of 1 multiplied by n according to the product of the intermediate vector and the local conversion matrix.
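The matrix construction of claims 6 and 7 can be checked numerically: stacking one-hot designated vectors gives an n×N matrix S, and multiplying the M×N global conversion matrix by the transpose of S yields the M×n local conversion matrix, which simply selects the columns of the global matrix corresponding to the sub-library words. A plain-Python sketch with toy dimensions (M=2, N=4, n=2); all values are invented:

```python
def one_hot(position, dim):
    """Designated vector: a single non-zero element at the word's library position."""
    v = [0.0] * dim
    v[position] = 1.0
    return v

def matmul(A, B):
    """Naive matrix product of A (p x q) and B (q x r)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

N, M = 4, 2
kept = [0, 2]                            # positions of the sub-library words (n = 2)
S = [one_hot(p, N) for p in kept]        # n x N matrix of designated vectors
G = [[1.0, 2.0, 3.0, 4.0],               # hypothetical M x N global conversion matrix
     [5.0, 6.0, 7.0, 8.0]]
local = matmul(G, transpose(S))          # M x n local conversion matrix
intermediate = [[0.5, 0.5]]              # 1 x M decoder output
target = matmul(intermediate, local)[0]  # 1 x n target vector over the sub-library
```

Note that `local` here is just columns 0 and 2 of `G`, so in practice the product with the transpose can be implemented as a column gather rather than a full matrix multiplication.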
8. The method of claim 5, wherein the method is performed by an information generating network comprising an encoder, a reference vector determination sub-network, and the decoder; the information generation network is obtained by training the following steps:
inputting a sample dialog input into the information generating network such that the decoders in the information generating network cyclically output sample intermediate vectors;
responding to the sample intermediate vector output by the decoder at one time, and performing dimension conversion on the sample intermediate vector output at one time to obtain a corresponding sample target vector;
in the word library, replacing the current word to be determined in the sample reply sentence with a first identifier and replacing words other than the current word to be determined with a second identifier, to obtain a first standard vector;
determining the difference between the sample target vector obtained this time and the first standard vector;
and after the decoder outputs the sample intermediate vector for multiple times, adjusting the parameters in the information generation network according to the difference between the sample target vector obtained for multiple times and the first standard vector.
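The first standard vector of claim 8 is effectively a one-hot target over the word library. A toy sketch follows, using squared error as a stand-in for the (unspecified) training loss and hypothetical identifier values 1.0/0.0:

```python
def first_standard_vector(word_library, current_word, first_id=1.0, second_id=0.0):
    """Mark the word to be determined with the first identifier, all others with the second."""
    return [first_id if w == current_word else second_id for w in word_library]

def squared_error(pred, standard):
    """Illustrative difference measure between a sample target vector and the standard."""
    return sum((p - s) ** 2 for p, s in zip(pred, standard))

word_library = ["hi", "there", "friend"]
standard = first_standard_vector(word_library, "there")  # one-hot at position 1
sample_target = [0.1, 0.8, 0.1]                          # pretend sample target vector
loss = squared_error(sample_target, standard)
```

In practice such per-step differences would be accumulated over all decoder outputs before adjusting the network parameters, as the claim states.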
9. The method of claim 8, wherein the information generating network further comprises a sub-word library generating sub-network; and the training of the information generation network further comprises:
after the sample dialogue input is input into the information generation network, obtaining a sample conversion vector determined by the sub-word library generation sub-network;
in the word library, replacing words included in the sample reply sentence with third identifications, and replacing words not included in the sample reply sentence with fourth identifications to obtain a second standard vector;
adjusting parameters in the information generating network according to a difference between the sample conversion vector and the second standard vector.
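The second standard vector of claim 9 is a multi-hot target marking every word of the sample reply sentence; a minimal sketch with hypothetical identifier values 1.0/0.0:

```python
def second_standard_vector(word_library, reply_words, third_id=1.0, fourth_id=0.0):
    """Mark words appearing in the sample reply with the third identifier,
    and all other library words with the fourth identifier."""
    reply_set = set(reply_words)
    return [third_id if w in reply_set else fourth_id for w in word_library]

word_library = ["a", "b", "c", "d"]
standard = second_standard_vector(word_library, ["b", "d"])  # multi-hot target
```

The sub-word library generating sub-network's conversion vector would then be pushed toward this multi-hot target during training.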
10. An information generating apparatus, characterized in that the apparatus comprises:
a hidden vector determination module configured to determine a hidden vector corresponding to a dialog input after receiving the dialog input;
a reference vector determination module configured to determine a target reference vector according to similarity of a reference vector in a set of reference vectors and the hidden vector;
a target word selecting module configured to select a target word from a plurality of words included in a preset word library according to the hidden vector and the target reference vector;
and the reply sentence generation module is configured to generate a reply sentence according to the target word.
11. A non-transitory computer readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the method of any one of claims 1-9.
12. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of any one of claims 1-9.
CN202110534852.9A 2021-05-17 2021-05-17 Information generation method and device, electronic equipment and storage medium Pending CN113255372A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110534852.9A CN113255372A (en) 2021-05-17 2021-05-17 Information generation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113255372A true CN113255372A (en) 2021-08-13


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018058994A1 (en) * 2016-09-30 2018-04-05 华为技术有限公司 Dialogue method, apparatus and device based on deep learning
CN109543005A (en) * 2018-10-12 2019-03-29 平安科技(深圳)有限公司 The dialogue state recognition methods of customer service robot and device, equipment, storage medium
WO2019072166A1 (en) * 2017-10-10 2019-04-18 腾讯科技(深圳)有限公司 Semantic analysis method, device, and storage medium
CN110222164A (en) * 2019-06-13 2019-09-10 腾讯科技(深圳)有限公司 A kind of Question-Answering Model training method, problem sentence processing method, device and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WANG Mengyu; YU Dingyao; YAN Rui; HU Wenpeng; ZHAO Dongyan: "Research on Chinese Multi-turn Dialogue Methods Based on the HRED Model", Journal of Chinese Information Processing, no. 08 *
ZHAO Yuqing; XIANG Yang: "Dialogue Generation via Deep Reinforcement Learning Based on Hierarchical Encoding", Journal of Computer Applications, no. 10 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination