CN108959388B - Information generation method and device - Google Patents

Information generation method and device

Info

Publication number
CN108959388B
Authority
CN
China
Prior art keywords
reply
text
key content
vector
word segment
Prior art date
Legal status
Active
Application number
CN201810551680.4A
Other languages
Chinese (zh)
Other versions
CN108959388A (en)
Inventor
马文涛
崔一鸣
陈致鹏
何苏
王士进
胡国平
刘挺
Current Assignee
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN201810551680.4A
Publication of CN108959388A
Application granted
Publication of CN108959388B
Legal status: Active

Abstract

The embodiment of the invention provides an information generation method and device, belonging to the technical field of natural language processing. The method comprises the following steps: inputting a query text and a reply text matched with the query text into a key content calculation model, and outputting the key content in the reply text; and inputting the query text and the key content into a reply generation model, and outputting the reply information obtained after the key content is adjusted. Because the reply generation model can adjust the key content, content that is not directly related to the user's question can be filtered out and the information related to the question can be mined more deeply, which ensures the accuracy of the reply information. In addition, the reply generation model can adjust the way the key content is expressed, so that the adjusted reply information reads more naturally and the subsequent user interaction experience is improved.

Description

Information generation method and device
Technical Field
The embodiment of the invention relates to the technical field of natural language processing, in particular to an information generation method and device.
Background
In recent years, with the development of disciplines related to artificial intelligence, in particular computational linguistics, various question answering systems and dialogue robots have come into use, and people can communicate with devices in natural language to acquire required information. In the related art, a relevant answer to the user's question is typically retrieved directly and used as the reply information. Because the retrieved answer may contain content that is not directly related to the user's question, the accuracy of the reply information is low.
Disclosure of Invention
In order to solve the above problems, embodiments of the present invention provide an information generating method and apparatus that overcome the above problems or at least partially solve the above problems.
According to a first aspect of embodiments of the present invention, there is provided an information generating method, including:
inputting the query text and the reply text matched with the query text into a key content calculation model, and outputting key content in the reply text;
inputting the query text and the key content into a reply generation model, and outputting reply information obtained after the key content is adjusted; the reply generation model is obtained by training based on the sample query text, the sample key content corresponding to the sample query text and the sample reply information.
According to the method provided by the embodiment of the invention, the query text and the reply text matched with the query text are input into the key content calculation model, and the key content in the reply text is output. The query text and the key content are then input into the reply generation model, and the reply information obtained after the key content is adjusted is output. Because the reply generation model can adjust the key content, content that is not directly related to the user's question can be filtered out and the information related to the question can be mined more deeply, which ensures the accuracy of the reply information. In addition, the reply generation model can adjust the way the key content is expressed, so that the adjusted reply information reads more naturally and the subsequent user interaction experience is improved.
According to a second aspect of embodiments of the present invention, there is provided an information generating apparatus including:
the first output module is used for inputting the query text and the reply text matched with the query text into the key content calculation model and outputting the key content in the reply text;
the second output module is used for inputting the query text and the key content into the reply generation model and outputting reply information obtained after the key content is adjusted; the reply generation model is obtained by training based on the sample query text, the sample key content corresponding to the sample query text and the sample reply information.
According to a third aspect of embodiments of the present invention, there is provided an information generating apparatus including:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the information generating method provided in any of the various possible implementations of the first aspect.
According to a fourth aspect of the present invention, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the information generating method provided in any one of the various possible implementations of the first aspect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of embodiments of the invention.
Drawings
Fig. 1 is a schematic flow chart of an information generating method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of an information generating method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a key content calculation model according to an embodiment of the present invention;
fig. 4 is a schematic flowchart of an information generating method according to an embodiment of the present invention;
fig. 5 is a schematic flowchart of an information generating method according to an embodiment of the present invention;
fig. 6 is a schematic flowchart of an information generating method according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a reply generation model according to an embodiment of the present invention;
fig. 8 is a block diagram of an information generating apparatus according to an embodiment of the present invention;
fig. 9 is a block diagram of an information generating apparatus according to an embodiment of the present invention.
Detailed Description
The following describes embodiments of the present invention in further detail with reference to the drawings and examples. The following examples are intended to illustrate the examples of the present invention, but are not intended to limit the scope of the examples of the present invention.
At present, people can communicate with devices in natural language to acquire required information. In the related art, a relevant answer to the user's question is typically retrieved directly and used as the reply information. Because the retrieved answer may contain content that is not directly related to the user's question, the accuracy of the reply information is low. For example, suppose the user's question is "how to adjust the temperature of the in-vehicle air conditioner" and the retrieved answer is "operate the temperature adjustment knob; clockwise rotation increases the temperature, counterclockwise rotation decreases it". If this answer is used directly as the reply information, its accuracy is low because it may contain content irrelevant to the user's question. In addition, the wording of the reply information may not conform to the habits of conversation between people, i.e. it is not natural enough, so the user experience is poor.
In view of the above situation, an embodiment of the present invention provides an information generating method. The method can be used in an intelligent question answering scenario, and also in other scenarios that need an intelligent question answering function, such as a driving scenario or a shopping scenario; this is not specifically limited in the embodiment of the present invention. The method may be executed by different devices depending on the usage scenario, which is likewise not limited in the embodiment of the present invention. For example, if the method is used in a driving scenario, the execution subject of the method may be an in-vehicle device; if the method is used in a shopping scenario, the execution subject may be a mobile terminal. Referring to fig. 1, the method includes:
101. Inputting the query text and the reply text matched with the query text into the key content calculation model, and outputting the key content in the reply text.
Before the above process is executed, voice data of the user's question can be obtained and subjected to speech recognition to obtain the query text; alternatively, text input by the user may be obtained directly and used as the query text, which is not specifically limited in the embodiment of the present invention. The reply text matched with the query text contains the content that answers the question corresponding to the query text. Specifically, if the query text asks how a certain function of a product is used, the reply text matched with the query text may be the description document of the product. Further, considering that a product usually has multiple functions and its description document usually describes all of them, the description document may be divided in advance into a plurality of structured texts, one per function; if the query text then asks about a certain function, the reply text matched with it can be the structured text corresponding to that function. If the query text asks for the definition of a technical term, the reply text may be a technical dictionary containing that definition; this is not specifically limited in the embodiment of the present invention.
Since the reply text may contain redundant information unrelated to the question corresponding to the query text, the key content in the reply text is output by the key content calculation model. The key content may be the content that answers the question corresponding to the query text; this is not specifically limited in the embodiment of the present invention.
In addition, before the above process is executed, the key content calculation model may be trained in advance, specifically as follows: first, a large number of sample query texts and the sample reply texts corresponding to them are collected, where the sample key content in each sample reply text is predetermined and is the content that answers the question corresponding to the sample query text. The initial model is then trained on the sample query texts, sample reply texts and sample key contents to obtain the key content calculation model. The initial model may be a single neural network model or a combination of several neural network models; the embodiment of the present invention does not specifically limit its type and structure.
102. Inputting the query text and the key content into the reply generation model, and outputting the reply information obtained after the key content is adjusted.
As can be seen from the above, the key content may not conform, in its wording, to the habits of conversation between people, i.e. it is not natural enough; alternatively, the key content may still include some content unrelated to the question corresponding to the query text. Therefore, in the embodiment of the present invention, the key content is not fed back (e.g., broadcast) to the user directly; instead, it is adjusted by the reply generation model, and the adjusted reply information is what is subsequently fed back to the user. Before the above process is executed, the reply generation model may also be trained in advance, specifically as follows: first, a large number of sample query texts, the sample key contents corresponding to them, and the sample reply information are collected, where the sample reply information is obtained by adjusting the sample key content in advance. The sample query text and sample key content are then used as the input of the initial model and the sample reply information as its output, and the initial model is trained to obtain the reply generation model. The initial model may be a single neural network model or a combination of several neural network models; the embodiment of the present invention does not specifically limit its type and structure.
According to the method provided by the embodiment of the invention, the query text and the reply text matched with the query text are input into the key content calculation model, and the key content in the reply text is output. The query text and the key content are then input into the reply generation model, and the reply information obtained after the key content is adjusted is output. Because the reply generation model can adjust the key content, content that is not directly related to the user's question can be filtered out and the information related to the question can be mined more deeply, which ensures the accuracy of the reply information. In addition, the reply generation model can adjust the way the key content is expressed, so that the adjusted reply information reads more naturally and the subsequent user interaction experience is improved.
As can be seen from the above embodiments, the key content calculation model may be a single neural network model or a combination of several neural network models. The process by which the key content calculation model outputs the key content in the reply text is now explained, taking a combination of several neural network models as an example. Accordingly, based on the contents of the above embodiments, as an alternative embodiment, the embodiment of the present invention does not specifically limit the manner in which the query text and the reply text matched with it are input into the key content calculation model and the key content in the reply text is output. Referring to fig. 2, the manner includes but is not limited to:
201. Inputting the query text and the reply text respectively into the text representation layer in the key content calculation model, and outputting the word-level text representation matrix corresponding to the query text and the sentence-level text representation matrix corresponding to the reply text.
The text representation layer is mainly used for learning the word-level question representation corresponding to the query text and the sentence-level document representation corresponding to the reply text. The text representation layer may be a single neural network model, such as a bidirectional long short-term memory network, or a combination of several neural network models, such as a bidirectional long short-term memory network combined with a convolutional neural network. The word-level text representation matrix corresponding to the query text is composed of the word vectors of the word segments in the query text: its number of rows equals the number of word segments in the query text, and each row corresponds to the word vector of one word segment. The sentence-level text representation matrix corresponding to the reply text is composed of the sentence vectors of the clauses in the reply text: its number of rows equals the number of clauses in the reply text, and each row corresponds to the sentence vector of one clause.
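As a shape-level illustration of these two matrices, the following is a minimal numpy sketch. The toy embedding table and the mean-pooling used for clause vectors are assumptions standing in for the learned networks (such as a bidirectional LSTM) mentioned above; only the row semantics (one row per query word segment, one row per reply clause) follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
# Hypothetical embedding table standing in for learned word vectors.
vocab = {w: rng.standard_normal(DIM) for w in
         "how to adjust temperature operate the knob clockwise raises".split()}

def word_level_matrix(tokens):
    """One row per query word segment (its word vector)."""
    return np.stack([vocab[t] for t in tokens])

def sentence_level_matrix(clauses):
    """One row per reply clause; here a clause vector is simply the mean of
    its word vectors (the patent would use a learned sentence encoder)."""
    return np.stack([np.stack([vocab[t] for t in c]).mean(axis=0) for c in clauses])

query = ["how", "to", "adjust", "temperature"]
reply = [["operate", "the", "knob"], ["clockwise", "raises", "temperature"]]
Q = word_level_matrix(query)      # shape (4, DIM): 4 word segments in the query
S = sentence_level_matrix(reply)  # shape (2, DIM): 2 clauses in the reply text
```

The point of the sketch is the bookkeeping: the row counts of `Q` and `S` track the word segments of the query and the clauses of the reply respectively, which is what the later layers rely on.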
202. Inputting the sentence-level text representation matrix into the context representation layer in the key content calculation model, and outputting the context representation matrix corresponding to the reply text.
The context representation layer is mainly used for learning the sentence-level context information representation corresponding to the reply text. The context representation layer may be a long short-term memory network or a bidirectional long short-term memory network; the embodiment of the present invention does not specifically limit the type of neural network model used. The context representation matrix is composed of the context-aware vectors of the clauses in the reply text: its number of rows equals the number of clauses in the reply text, and each row corresponds to the context-aware vector of one clause.
203. Inputting the word-level text representation matrix and the context representation matrix into the attention layer in the key content calculation model, and outputting the information association matrix between the query text and the reply text.
The attention layer is mainly used for weighting the word-level question representation and integrating it into the sentence-level context information representation. Specifically, the attention layer first calculates the question attention representation of the query text with respect to the reply text, that is, it determines the correlation between each word in the query text and each sentence in the reply text, and then combines the determined correlations with both the word-level text representation matrix and the context representation matrix to obtain the information association matrix between the query text and the reply text. The information association matrix is composed of the question-attention vectors of the clauses in the reply text: its number of rows equals the number of clauses in the reply text, its number of columns equals the sum of the number of columns of the word-level text representation matrix and the number of columns of the context representation matrix, and each row corresponds to the question-attention vector of one clause.
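The column bookkeeping of the information association matrix can be sketched as follows. The dot-product attention score and the softmax here are assumptions, since the patent does not fix the score function; what the sketch preserves is the stated shape: one row per reply clause, and a column count equal to the columns of the word-level matrix plus the columns of the context matrix.

```python
import numpy as np

def information_association(Q, C):
    """Q: (m, d) word-level matrix of the query; C: (n, d) context matrix of
    the reply clauses. For each clause, softmax attention weights over the
    query words give a weighted query summary, which is concatenated with
    the clause's context vector, yielding shape (n, d + d)."""
    scores = C @ Q.T                                  # (n, m) clause/word correlations
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                 # softmax over query words
    attended = w @ Q                                  # (n, d) question attention representation
    return np.concatenate([attended, C], axis=1)      # (n, 2d)

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 query word segments
C = rng.standard_normal((3, 8))   # 3 reply clauses with context information
M = information_association(Q, C) # rows = clauses, columns = 8 + 8
```

Because the context matrix is concatenated unchanged, the right half of each row of `M` is exactly the clause's context vector, matching the description of the rows above.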
204. Inputting the information association matrix into the output layer in the key content calculation model, and outputting the key content in the reply text.
It should be noted that steps 201 to 204 above involve the text representation layer, the context representation layer, the attention layer and the output layer; that is, the key content calculation model may be composed of these four layers. For the specific layered structure, refer to fig. 3.
According to the method provided by the embodiment of the invention, the key content in the reply text is determined by the key content calculation model from several angles, such as the word segments of the query text, the clauses and context of the reply text, and the correlations among them, and this key content serves as the basis of the reply to the query text. This improves the reliability and accuracy of the reply content, and thereby the user experience during query and reply interaction with the device.
As can be seen from the above embodiments, the reply generation model may be a single neural network composed of several neural network layers. Accordingly, based on the content of the above embodiment, as an optional embodiment, the embodiment of the present invention does not specifically limit the manner in which the query text and the key content are input into the reply generation model and the reply information obtained after the key content is adjusted is output. Referring to fig. 4, the manner includes but is not limited to:
401. Inputting the query text and the key content into the vector representation layer, and outputting the first word vector matrix corresponding to the query text and the second word vector matrix corresponding to the key content.
402. Inputting the first word vector matrix and the second word vector matrix into the coding layer and outputting the coding vector; inputting the coding vector into the decoding layer and outputting the decoding vector.
403. Inputting the decoding vector into the output layer, outputting the word segments of the reply information one by one, and splicing all the output word segments in their output order to obtain the reply information.
The vector representation layer is mainly used for converting the query text and the key content into the corresponding word vector matrices. The coding layer combines the query text and the key content on the basis of the first word vector matrix and the second word vector matrix to obtain a coding vector that contains the key information of both. The decoding layer decodes the coding vector to obtain the decoding vector, and the output layer converts the decoding vector into the reply information.
Based on the contents of the above embodiments, as an alternative embodiment, the vector representation layer includes an input layer and an embedding layer; accordingly, the embodiment of the present invention does not specifically limit the manner in which the query text and the key content are respectively input to the vector representation layer and the first word vector matrix corresponding to the query text and the second word vector matrix corresponding to the key content are output. Referring to fig. 5, including but not limited to:
4011. Inputting the first word segment sequence corresponding to the query text and the second word segment sequence corresponding to the key content into the input layer, and outputting the word identifier of each word segment in the first word segment sequence and of each word segment in the second word segment sequence.
In this step, the word identifier of a word segment may be a number; this is not specifically limited in the embodiment of the present invention. After the first and second word segment sequences are input into the input layer, the word identifier of each word segment in them can be determined through a word identifier dictionary, which is likewise not specifically limited in the embodiment of the present invention. Accordingly, the word identifier dictionary is established in advance, before the word identifiers are obtained.
Specifically, all word segments encountered in practice can be sorted from high to low by frequency of occurrence, and a preset number of the most frequent ones selected to construct the word identifier dictionary, in which the word identifier of every word segment is determined. After the word identifier dictionary is established, the word identifier of each word segment in the first and second word segment sequences can be looked up in it.
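A minimal sketch of this frequency-based construction; the function name and the toy corpus are illustrative:

```python
from collections import Counter

def build_word_id_dict(corpus_segments, preset_size):
    """Sort word segments by occurrence frequency, high to low, keep the
    preset number of them, and assign each a numeric word identifier."""
    ranked = [w for w, _ in Counter(corpus_segments).most_common(preset_size)]
    return {w: i for i, w in enumerate(ranked)}

segments = ["temperature", "knob", "temperature", "adjust", "knob", "temperature"]
word_ids = build_word_id_dict(segments, preset_size=2)
# "temperature" (3 occurrences) and "knob" (2) are kept; "adjust" (1) is cut.
```

Looking up a word segment then amounts to `word_ids["knob"]`; segments outside the preset size would typically map to an unknown-word identifier, a detail the embodiment leaves open.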
4012. Inputting the word identifier of each word segment in the first word segment sequence and of each word segment in the second word segment sequence into the embedding layer, and outputting the first word vector matrix and the second word vector matrix.
In this step, after the word identifier of each word segment in the first and second word segment sequences is input into the embedding layer, the first and second word vector matrices can be determined through a word vector dictionary. Accordingly, the word vector dictionary is established in advance, before the two matrices are obtained.
Specifically, a word vector corresponding to the word identifier of each word segment in the word identifier dictionary may be predetermined, and a word vector dictionary established accordingly. After the word vector dictionary is established, the word vector corresponding to the word identifier of each word segment in the first and second word segment sequences can be looked up in it. The word vectors of the word segments in the first word segment sequence are combined to obtain the first word vector matrix, and the word vectors of the word segments in the second word segment sequence are combined to obtain the second word vector matrix.
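The identifier-to-vector lookup can be sketched as follows; the random embedding table stands in for the predetermined word vector dictionary, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 4
word_ids = {"how": 0, "adjust": 1, "temperature": 2, "knob": 3}  # word identifier dictionary
embedding = rng.standard_normal((len(word_ids), DIM))            # word vector dictionary

def to_word_vector_matrix(segment_sequence):
    """Map each word segment to its identifier, look up the identifier's
    word vector, and stack the vectors row-wise into a matrix."""
    ids = [word_ids[w] for w in segment_sequence]
    return embedding[ids]

first = to_word_vector_matrix(["how", "adjust", "temperature"])  # first word vector matrix
second = to_word_vector_matrix(["adjust", "knob"])               # second word vector matrix
```

Note that the same word segment ("adjust") yields the same row in both matrices, since both lookups go through the one shared word vector dictionary.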
According to the method provided by the embodiment of the invention, each word segment is converted into a word identifier, each word identifier into a word vector, and the word vectors of the word segments are fused into a word vector matrix, which facilitates the subsequent processing.
Based on the content of the foregoing embodiments, as an alternative embodiment, in order to correlate historical information with future information when obtaining the coding vector through the coding layer, the coding layer may include a forward long short-term memory network and a reverse long short-term memory network. Accordingly, the embodiment of the present invention does not specifically limit the manner in which the first word vector matrix and the second word vector matrix are input into the coding layer and the coding vector is output. Referring to fig. 6, the manner includes but is not limited to:
4021. Inputting the first word vector matrix into the forward long short-term memory network and outputting the first forward vector; inputting the second word vector matrix into the forward long short-term memory network and outputting the second forward vector.
4022. Inputting the first word vector matrix into the reverse long short-term memory network and outputting the first reverse vector; inputting the second word vector matrix into the reverse long short-term memory network and outputting the second reverse vector.
4023. Splicing the first forward vector, the first reverse vector, the second forward vector and the second reverse vector to obtain the coding vector.
The forward long short-term memory network and the reverse long short-term memory network may contain the same number of nodes; this is not limited in the embodiments of the present invention. Taking k nodes in each as an example, inputting the first word vector matrix into the forward long short-term memory network yields a first forward vector of length k; likewise, the first reverse vector, the second forward vector and the second reverse vector are all of length k. Splicing the first forward vector with the first reverse vector gives a vector of length 2k, namely the coding vector corresponding to the query text. Splicing the second forward vector with the second reverse vector gives a vector of length 2k, namely the coding vector corresponding to the key content. Splicing the coding vector corresponding to the query text with the coding vector corresponding to the key content gives a vector of length 4k, namely the final coding vector.
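The splicing arithmetic (k per direction, 2k per text, 4k in total) can be checked with a small sketch in which the two LSTMs are replaced by arbitrary length-k summaries; only the concatenation bookkeeping follows the text, and all names are illustrative.

```python
import numpy as np

k = 5  # number of nodes in each of the forward and reverse networks

def toy_directional_summaries(word_matrix, rng):
    """Stand-ins for the forward/reverse LSTM outputs: each is a length-k
    vector derived from the word vector matrix (a mean plus a random
    projection, purely so the shapes match the text)."""
    W = rng.standard_normal((word_matrix.shape[1], k))
    forward = word_matrix.mean(axis=0) @ W
    return forward, forward[::-1]  # "forward" and "reverse" vectors, length k each

rng = np.random.default_rng(2)
first_matrix = rng.standard_normal((4, 8))   # first word vector matrix (query text)
second_matrix = rng.standard_normal((6, 8))  # second word vector matrix (key content)

f1, r1 = toy_directional_summaries(first_matrix, rng)
f2, r2 = toy_directional_summaries(second_matrix, rng)
query_code = np.concatenate([f1, r1])                   # length 2k: query coding vector
key_code = np.concatenate([f2, r2])                     # length 2k: key content coding vector
coding_vector = np.concatenate([query_code, key_code])  # length 4k: final coding vector
```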
The method provided by the embodiment of the invention encodes the query text and the key content through a bidirectional long short-term memory network to obtain a coding vector containing the key information of both, thereby ensuring the accuracy of the reply information.
Based on the content of the foregoing embodiments, as an alternative embodiment, the embodiment of the present invention does not specifically limit the manner in which the decoding vector is input into the output layer and the word segments of the reply information are output one by one; the manner includes but is not limited to: based on the decoding vector, determining the word segments of the reply information one by one from a preset dictionary according to the likelihood score of each word segment in the preset dictionary being a word segment of the reply information, until a preset condition is met. The preset condition is that the determined word segment is a preset end symbol, or that the total number of determined word segments reaches a preset threshold.
Specifically, the likelihood score of each word segment in the preset dictionary being the first word segment of the reply information may be determined based on the decoding vector, and the word segment with the largest likelihood score selected from the preset dictionary as the first word segment of the reply information. The following word segments are determined analogously, one at a time. When determining the likelihood score of each word segment in the preset dictionary, the computation may be performed through a softmax function whose parameters include at least the decoding vector; this is not specifically limited in the embodiment of the present invention.
Suppose this process has been repeated up to the Nth step, with N-1 word segments already determined. If the word segment determined from the preset dictionary at the Nth step is the preset end symbol, such as <end>, the N-1 previously determined word segments are spliced in the order in which they were determined to obtain the reply information. Alternatively, if after the Nth word segment is determined the total number of word segments just reaches the preset threshold, i.e. the preset threshold is also N, the N word segments are spliced in the order in which they were determined to obtain the reply information.
It should be noted that, in an actual implementation, the likelihood score of each word identifier in a word identifier dictionary when taken as the word identifier of a participle in the reply information may be determined; after the word identifier with the largest likelihood score is found, the corresponding participle is determined from the correspondence between word identifiers and participles, and each participle in the reply information is determined one by one in this way. Alternatively, the likelihood score of each word vector in a word vector dictionary when taken as the word vector of a participle in the reply information may be determined; after the word vector with the largest likelihood score is found, the corresponding participle is determined from the correspondence between word vectors and participles, and each participle in the reply information is determined one by one accordingly. In addition, the word vector dictionary, the word identifier dictionary and the preset dictionary mentioned in the above embodiments may or may not cover the same participles, which is not specifically limited in the embodiment of the present invention.
Considering that, in an actual implementation, when the participles in the reply information are determined one by one, the most recently determined participle has a certain guiding significance for the next participle to be determined. Based on this principle and the content of the foregoing embodiments, as an alternative embodiment, the embodiment of the present invention does not specifically limit the manner of determining each participle in the reply information one by one from the preset dictionary, based on the decoding vector, according to the likelihood score of each participle in the preset dictionary when taken as a participle in the reply information. The manner includes, but is not limited to: if the first participle in the reply information currently needs to be determined, determining the likelihood score of each participle in the preset dictionary when taken as the first participle based on the decoding vector, and selecting the participle with the largest likelihood score from the preset dictionary as the first participle; if the Nth participle in the reply information currently needs to be determined, determining the likelihood score of each participle in the preset dictionary when taken as the Nth participle based on the decoding vector and the (N-1)th participle, and selecting the participle with the largest likelihood score from the preset dictionary as the Nth participle, where N is a positive integer greater than 1.
Each participle in the reply information may be determined by a hidden node of the output layer: the first participle in the reply information is determined by the first hidden node of the output layer, and the Kth participle by the Kth hidden node. In the embodiment of the present invention, the likelihood score of each participle in the preset dictionary when taken as the Nth participle may be determined based on the decoding vector and the (N-1)th participle, or based on the decoding vector, the (N-1)th participle and the intermediate information of the (N-1)th hidden node. The intermediate information of a hidden node may be the output vector of that hidden node, which is not specifically limited in the embodiment of the present invention.
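A single step of this conditioned decoding can be sketched as follows. All parameter names, shapes and the tanh cell are illustrative assumptions standing in for the patent's hidden nodes; only the data flow (decoding vector + previous participle + previous hidden node output → scores over the preset dictionary) reflects the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, HID = 5, 4, 6            # toy sizes, chosen for illustration

# Hypothetical parameters; only the shapes matter for the sketch.
E  = rng.normal(size=(VOCAB, EMB))   # embeddings of the preset dictionary
Wx = rng.normal(size=(HID, EMB))     # weights for the (N-1)th participle
Wh = rng.normal(size=(HID, HID))     # weights for the (N-1)th hidden node output
Wc = rng.normal(size=(HID, HID))     # weights for the decoding vector
Wo = rng.normal(size=(VOCAB, HID))   # projection to scores over the dictionary

def decode_step(decoding_vec, prev_word_id, prev_hidden):
    """One output-layer step: combine the decoding vector, the previous
    participle and the previous hidden node's output vector, then score
    every participle in the dictionary and take the argmax."""
    x = E[prev_word_id]
    h = np.tanh(Wx @ x + Wh @ prev_hidden + Wc @ decoding_vec)
    scores = Wo @ h
    return int(np.argmax(scores)), h
```

Feeding the returned hidden state and word index back into the next call reproduces the node-by-node generation described above.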
Taking as an example a reply generation model that sequentially includes an input layer, an embedding layer, an encoding layer, a decoding layer and an output layer, a flow diagram of the information generation method provided in the embodiment of the present invention is shown in fig. 7.
It should be noted that all the above alternative embodiments may be combined arbitrarily to form further alternative embodiments of the present invention, which are not described in detail herein.
Based on the content of the above embodiments, an embodiment of the present invention provides an information generating apparatus for executing the information generating method provided in the above method embodiment. Referring to fig. 8, the apparatus includes:
a first output module 801, configured to input the query text and the reply text matching the query text into the key content calculation model, and output key content in the reply text;
a second output module 802, configured to input the query text and the key content into the reply generation model, and output reply information obtained after adjusting the key content; the reply generation model is obtained by training based on the sample query text, the sample key content corresponding to the sample query text and the sample reply information.
As an alternative embodiment, the first output module 801 is configured to input the query text and the reply text to the text representation layer in the key content calculation model, and output a word-level text representation matrix corresponding to the query text and a sentence-level text representation matrix corresponding to the reply text; inputting the sentence-level text representation matrix to a context representation layer in a key content calculation model, and outputting a context representation matrix corresponding to the reply text; inputting the word-level text representation matrix and the context representation matrix into an attention layer in a key content calculation model, and outputting an information association matrix between the query text and the reply text; and inputting the information association matrix into an output layer in the key content calculation model, and outputting the key content in the reply text.
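The attention layer of the key content calculation model can be sketched with simple dot-product attention; the patent does not fix a particular attention formula, so the scoring function below is an assumption chosen for illustration of the information association matrix.

```python
import numpy as np

def softmax_rows(x):
    # row-wise, numerically stable softmax
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def information_association(word_level_mat, context_mat):
    """Dot-product attention between the query text's word-level rows and
    the reply text's context rows; each output row holds the alignment
    weights of one query participle over the reply text."""
    scores = word_level_mat @ context_mat.T
    return softmax_rows(scores)
```

The resulting matrix has one row per query participle and one column per reply position, matching the "information association matrix between the query text and the reply text" above.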
As an alternative embodiment, the second output module 802 includes:
the vector representation unit is used for respectively inputting the query text and the key content to the vector representation layer and outputting a first word vector matrix corresponding to the query text and a second word vector matrix corresponding to the key content;
the encoding unit is used for inputting the first word vector matrix and the second word vector matrix into the encoding layer and outputting encoding vectors;
a decoding unit for inputting the encoded vector to a decoding layer and outputting a decoded vector;
and the output unit is used for inputting the decoding vectors into the output layer, outputting each participle in the reply information one by one, and splicing all the output participles according to the output sequence of each participle to obtain the reply information.
As an alternative embodiment, the vector representation layer includes an input layer and an embedding layer; correspondingly, the vector representation unit is used for inputting a first word segmentation sequence corresponding to the query text and a second word segmentation sequence corresponding to the key content into the input layer, and outputting a word identifier of each word segmentation in the first word segmentation sequence and a word identifier of each word segmentation in the second word segmentation sequence; and inputting the word identifier of each participle in the first participle sequence and the word identifier of each participle in the second participle sequence into the embedding layer, and outputting a first word vector matrix and a second word vector matrix.
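The two-stage conversion performed by the input layer and the embedding layer (participle → word identifier → word vector) can be sketched as follows; the toy vocabulary, the `<unk>` fallback and the embedding dimension are assumptions for demonstration only.

```python
import numpy as np

# Toy vocabulary and embedding table; both are assumptions for the sketch.
word2id = {"<unk>": 0, "weather": 1, "today": 2, "sunny": 3}
embedding = np.random.default_rng(1).normal(size=(len(word2id), 8))

def to_word_vector_matrix(participle_sequence):
    """Input layer: participles -> word identifiers.
    Embedding layer: word identifiers -> one word vector per row."""
    ids = [word2id.get(w, word2id["<unk>"]) for w in participle_sequence]
    return embedding[ids]
```

Applying this to the first participle sequence of the query text and the second participle sequence of the key content yields the first and second word vector matrices, respectively.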
As an alternative embodiment, the coding layer includes a forward long short term memory network and a reverse long short term memory network; correspondingly, the encoding unit is used for inputting the first word vector matrix into the forward long-short term memory network, outputting the first forward vector, inputting the second word vector matrix into the forward long-short term memory network, and outputting the second forward vector; inputting the first word vector matrix into a reverse long-short term memory network, outputting a first reverse vector, inputting the second word vector matrix into the reverse long-short term memory network, and outputting a second reverse vector; and splicing the first forward vector, the first backward vector, the second forward vector and the second backward vector to obtain a coding vector.
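The encoding procedure can be sketched as below. A plain tanh recurrence stands in for the long short-term memory cells, and for brevity the same toy weights are reused in both directions, whereas the patent describes separate forward and reverse networks; the four resulting vectors are spliced in the order stated above.

```python
import numpy as np

def run_rnn(seq, W, U):
    """Plain tanh recurrence standing in for an LSTM; returns the final
    hidden state after reading seq row by row."""
    h = np.zeros(W.shape[0])
    for x in seq:
        h = np.tanh(W @ x + U @ h)
    return h

def encode(first_mat, second_mat, W, U):
    """Run both word vector matrices forward and reversed, then splice
    first-forward, first-backward, second-forward, second-backward."""
    f1 = run_rnn(first_mat, W, U)         # query text, forward
    b1 = run_rnn(first_mat[::-1], W, U)   # query text, reversed
    f2 = run_rnn(second_mat, W, U)        # key content, forward
    b2 = run_rnn(second_mat[::-1], W, U)  # key content, reversed
    return np.concatenate([f1, b1, f2, b2])
```

With a hidden size of H, the spliced encoding vector has length 4H regardless of the two input sequence lengths.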
As an alternative embodiment, the output unit is configured to determine, based on the decoding vector, each participle in the reply information one by one from a preset dictionary according to a likelihood score when the participle in the preset dictionary is taken as a participle in the reply information until a preset condition is satisfied; the preset condition is that the obtained word segmentation is determined to be a preset ending symbol or the total number of the obtained word segmentation reaches a preset threshold value.
As an alternative embodiment, the output unit is configured to, if the first participle in the reply information currently needs to be determined, determine the likelihood score of each participle in the preset dictionary when taken as the first participle based on the decoding vector, and select the participle with the largest likelihood score from the preset dictionary as the first participle; and, if the Nth participle in the reply information currently needs to be determined, determine the likelihood score of each participle in the preset dictionary when taken as the Nth participle based on the decoding vector and the (N-1)th participle, and select the participle with the largest likelihood score from the preset dictionary as the Nth participle, where N is a positive integer greater than 1.
The device provided by the embodiment of the invention inputs the query text and the reply text matched with the query text into the key content calculation model and outputs the key content in the reply text, then inputs the query text and the key content into the reply generation model and outputs the reply information obtained after the key content is adjusted. Because the reply generation model can adjust the key content, content not directly related to the user's question can be filtered out and the mining of information related to the user's question can be deepened, thereby ensuring the accuracy of the reply information. In addition, the reply generation model can adjust the way the key content is expressed, so that the adjusted reply information is more natural and the subsequent user interaction experience is improved.
Secondly, the key content in the reply text is determined by the key content calculation model from multiple perspectives, such as the participles of the query text, the participles and context of the reply text, and the correlations among them; taking this key content as the basis of the reply to the query text improves the reliability and accuracy of the reply content and the user's experience during question-and-answer interaction with the device.
In addition, participles are converted into word identifiers, the word identifiers are converted into word vectors, and the word vectors of the participles are combined into a word vector matrix, which facilitates subsequent processing.
Finally, the query text and the key content are encoded by a bidirectional long short-term memory network to obtain an encoding vector containing the key information of both, which ensures the accuracy of the reply information.
An embodiment of the present invention provides an information generation device. Referring to fig. 9, the device includes: a processor (processor)901, a memory (memory)902, and a bus 903;
the processor 901 and the memory 902 communicate with each other through the bus 903; the processor 901 is configured to call the program instructions in the memory 902 to execute the information generation method provided by the foregoing embodiments, for example, including: inputting the query text and the reply text matched with the query text into a key content calculation model, and outputting key content in the reply text; inputting the query text and the key content into a reply generation model, and outputting reply information obtained after the key content is adjusted; the reply generation model is obtained by training based on the sample query text, the sample key content corresponding to the sample query text and the sample reply information.
An embodiment of the present invention provides a non-transitory computer-readable storage medium, which stores computer instructions, where the computer instructions cause a computer to execute the information generating method provided in the foregoing embodiment, for example, including: inputting the query text and the reply text matched with the query text into a key content calculation model, and outputting key content in the reply text; inputting the query text and the key content into a reply generation model, and outputting reply information obtained after the key content is adjusted; the reply generation model is obtained by training based on the sample query text, the sample key content corresponding to the sample query text and the sample reply information.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by a program instructing relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
The above-described embodiments of the information generating apparatus and the like are merely illustrative, where units illustrated as separate components may or may not be physically separate, and components displayed as units may or may not be physical units, may be located in one place, or may also be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the various embodiments or some parts of the methods of the embodiments.
Finally, it should be noted that the above embodiments are merely preferred embodiments and are not intended to limit the scope of the embodiments of the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the embodiments of the present invention shall fall within the protection scope of the embodiments of the present invention.

Claims (10)

1. An information generating method, comprising:
inputting a query text and a reply text matched with the query text into a key content calculation model, and outputting key content in the reply text;
inputting the query text and the key content into a reply generation model, and outputting reply information obtained after the key content is adjusted;
the reply generation model is obtained by training based on a sample query text, sample key content corresponding to the sample query text and sample reply information; the key content is the reply content for the question corresponding to the query text.
2. The method of claim 1, wherein inputting a query text and a reply text matching the query text into a key content calculation model, and outputting key content in the reply text, comprises:
respectively inputting the query text and the reply text to a text representation layer in the key content calculation model, and outputting a word level text representation matrix corresponding to the query text and a sentence level text representation matrix corresponding to the reply text;
inputting the sentence-level text representation matrix to a context representation layer in the key content calculation model, and outputting a context representation matrix corresponding to the reply text;
inputting the word-level text representation matrix and the context representation matrix into an attention layer in the key content calculation model, and outputting an information association matrix between the query text and the reply text;
and inputting the information incidence matrix to an output layer in the key content calculation model, and outputting the key content in the reply text.
3. The method of claim 1, wherein inputting the query text and the key content into a reply generation model, and outputting a reply message obtained by adjusting the key content comprises:
inputting the query text and the key content to a vector representation layer respectively, and outputting a first word vector matrix corresponding to the query text and a second word vector matrix corresponding to the key content;
inputting the first word vector matrix and the second word vector matrix into an encoding layer, outputting an encoding vector, inputting the encoding vector into a decoding layer, and outputting a decoding vector;
and inputting the decoding vectors to an output layer, outputting each participle in the reply information one by one, and splicing all output participles according to the output sequence of each participle to obtain the reply information.
4. The method of claim 3, wherein the vector representation layer comprises an input layer and an embedding layer; the step of inputting the query text and the key content to a vector representation layer respectively, and outputting a first word vector matrix corresponding to the query text and a second word vector matrix corresponding to the key content includes:
inputting a first word segmentation sequence corresponding to the query text and a second word segmentation sequence corresponding to the key content into the input layer, and outputting a word identifier of each word segmentation in the first word segmentation sequence and a word identifier of each word segmentation in the second word segmentation sequence;
and inputting the word identifier of each participle in the first participle sequence and the word identifier of each participle in the second participle sequence into the embedding layer, and outputting the first word vector matrix and the second word vector matrix.
5. The method of claim 3, wherein the coding layer comprises a forward long short term memory network and a reverse long short term memory network; correspondingly, the inputting the first word vector matrix and the second word vector matrix into an encoding layer and outputting an encoded vector includes:
inputting the first word vector matrix into the forward long-short term memory network, outputting a first forward vector, inputting the second word vector matrix into the forward long-short term memory network, and outputting a second forward vector;
inputting the first word vector matrix into the reverse long-short term memory network, outputting a first reverse vector, inputting the second word vector matrix into the reverse long-short term memory network, and outputting a second reverse vector;
and splicing the first forward vector, the first backward vector, the second forward vector and the second backward vector to obtain the coding vector.
6. The method of claim 3, wherein inputting the decoding vector to an output layer, and outputting each participle in the reply message one by one, comprises:
determining each participle in the reply information one by one from a preset dictionary based on the decoding vector according to a likelihood score when the participle in the preset dictionary is taken as the participle in the reply information until a preset condition is met; the preset condition is that the obtained participles are determined to be preset end symbols or the total number of the obtained participles is determined to reach a preset threshold value.
7. The method according to claim 6, wherein the determining each participle in the reply information one by one from the preset dictionary according to a likelihood score when the participle in the preset dictionary is used as a participle in the reply information based on the decoding vector comprises:
if the first participle in the reply information needs to be determined currently, determining a likelihood score when each participle in the preset dictionary is used as the first participle based on the decoding vector, and selecting the participle with the maximum likelihood score from the preset dictionary as the first participle;
if the Nth participle in the reply information needs to be determined currently, determining the likelihood score of each participle in the preset dictionary when taken as the Nth participle based on the decoding vector and the (N-1)th participle, and selecting the participle with the maximum likelihood score from the preset dictionary as the Nth participle; wherein N is a positive integer greater than 1.
8. An information generating apparatus, characterized by comprising:
the first output module is used for inputting a query text and a reply text matched with the query text into a key content calculation model and outputting key content in the reply text;
the second output module is used for inputting the query text and the key content into the reply generation model and outputting reply information obtained after the key content is adjusted; the reply generation model is obtained by training based on a sample query text, sample key content corresponding to the sample query text and sample reply information; the key content is the reply content for the question corresponding to the query text.
9. An information generating apparatus characterized by comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 7.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 7.
CN201810551680.4A 2018-05-31 2018-05-31 Information generation method and device Active CN108959388B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810551680.4A CN108959388B (en) 2018-05-31 2018-05-31 Information generation method and device

Publications (2)

Publication Number Publication Date
CN108959388A CN108959388A (en) 2018-12-07
CN108959388B true CN108959388B (en) 2020-09-11

Family

ID=64493167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810551680.4A Active CN108959388B (en) 2018-05-31 2018-05-31 Information generation method and device

Country Status (1)

Country Link
CN (1) CN108959388B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109857909B (en) * 2019-01-22 2020-11-20 杭州一知智能科技有限公司 Method for solving video conversation task by multi-granularity convolution self-attention context network
CN109992771B (en) * 2019-03-13 2020-05-05 北京三快在线科技有限公司 Text generation method and device
CN110222144B (en) * 2019-04-17 2023-03-28 深圳壹账通智能科技有限公司 Text content extraction method and device, electronic equipment and storage medium
CN110750630A (en) * 2019-09-25 2020-02-04 北京捷通华声科技股份有限公司 Generating type machine reading understanding method, device, equipment and storage medium
CN116501851A (en) * 2023-06-27 2023-07-28 阿里健康科技(杭州)有限公司 Answer text sending method, answer text generating method, answer text sending device, answer text generating equipment and answer text medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN106844368A (en) * 2015-12-03 2017-06-13 华为技术有限公司 For interactive method, nerve network system and user equipment
CN107562792A (en) * 2017-07-31 2018-01-09 同济大学 A kind of question and answer matching process based on deep learning
CN107679224A (en) * 2017-10-20 2018-02-09 竹间智能科技(上海)有限公司 It is a kind of towards the method and system without structure text intelligent answer
CN107844533A (en) * 2017-10-19 2018-03-27 云南大学 A kind of intelligent Answer System and analysis method

Also Published As

Publication number Publication date
CN108959388A (en) 2018-12-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant