CN114118100A - Method, apparatus, device, medium and program product for generating dialogue statements - Google Patents


Info

Publication number
CN114118100A
CN114118100A (application CN202111405739.7A)
Authority
CN
China
Prior art keywords
corpus
sample
sentence
determining
reply
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111405739.7A
Other languages
Chinese (zh)
Inventor
王文彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Fangjianghu Technology Co Ltd
Original Assignee
Beijing Fangjianghu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Fangjianghu Technology Co Ltd filed Critical Beijing Fangjianghu Technology Co Ltd
Priority to CN202111405739.7A
Publication of CN114118100A
Legal status: Pending

Classifications

    All classifications fall under G (Physics), G06 (Computing; Calculating or Counting):
    • G06F 40/35: Handling natural language data; Semantic analysis; Discourse or dialogue representation
    • G06F 16/313: Information retrieval of unstructured textual data; Indexing; Selection or weighting of terms for indexing
    • G06F 16/353: Information retrieval of unstructured textual data; Clustering; Classification into predefined classes
    • G06F 18/214: Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/2415: Pattern recognition; Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F 40/284: Natural language analysis; Lexical analysis, e.g. tokenisation or collocates
    • G06F 40/289: Natural language analysis; Phrasal analysis, e.g. finite state techniques or chunking
    • G06N 3/044: Neural networks; Recurrent networks, e.g. Hopfield networks
    • G06N 3/045: Neural networks; Combinations of networks
    • G06N 3/049: Neural networks; Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08: Neural networks; Learning methods

Abstract

Embodiments of the disclosure provide a method, apparatus, device, medium, and program product for generating dialogue sentences, wherein the method includes: in response to a current sentence sent by a user, generating a word set of the current sentence; determining a first importance weight of each word element; determining a second importance weight of the word element based on the semantic relationships of the historical dialogue sentences in the current dialogue scene; determining an importance parameter of the word element based on the first importance weight and the second importance weight; determining a corpus recall pool for the current sentence, where the candidate reply corpora in the corpus recall pool comprise a first corpus retrieved from a first index library based on the importance parameters and a second corpus retrieved from a second index library based on the similarity between the historical dialogue sentences and preset corpora; and determining the candidate reply corpus in the corpus recall pool with the highest degree of match to the current sentence as the target reply sentence. The contextual information in the historical conversation record is thus fully utilized, improving the conversation quality.

Description

Method, apparatus, device, medium and program product for generating dialogue statements
Technical Field
The present disclosure relates to a method, an apparatus, an electronic device, a storage medium, and a computer program product for generating dialogue sentences.
Background
Intelligent dialogue systems (such as customer-service robots and chatbots) are becoming increasingly common in daily life; for example, they can serve industry scenarios such as home companionship, medical care, education, government affairs, banking, and tourism. Generally, after receiving a sentence sent by a user, an intelligent dialogue system automatically generates a corresponding reply, realizing a conversation between a human and a machine. In this process, the degree to which the reply generated by the intelligent dialogue system matches the sentence sent by the user directly determines the quality of the human-machine conversation.
In the related art, an intelligent conversation system usually retrieves a corresponding reply sentence from a pre-constructed corpus for a sentence sent by a user in a single-turn conversation, so as to implement an intelligent conversation.
Disclosure of Invention
The embodiments of the disclosure provide a method, an apparatus, an electronic device, a storage medium, and a computer program product for generating dialogue sentences, so as to improve the pertinence of dialogue sentences in an intelligent dialogue system.
In one aspect of the disclosed embodiments, a method for generating dialogue sentences is provided, including: in response to a current sentence sent by a user, generating a word set of the current sentence, where the word elements in the word set include words obtained by segmenting the current sentence and phrases constructed from those words; determining a first importance weight of each word element; determining a second importance weight of the word element based on the semantic relationships of the historical dialogue sentences in the current dialogue scene; determining an importance parameter of the word element based on the first importance weight and the second importance weight; determining a corpus recall pool for the current sentence, where the candidate reply corpora in the corpus recall pool include a first corpus retrieved from a first index library based on the importance parameters and a second corpus retrieved from a second index library based on the similarity between the historical dialogue sentences and preset corpora; and determining the candidate reply corpus in the corpus recall pool with the highest degree of match to the current sentence as the target reply sentence.
In some embodiments, determining the candidate reply corpus in the corpus recall pool with the highest degree of match to the current sentence as the target reply sentence includes: inputting the candidate reply corpus, the current sentence, and the historical dialogue sentences into at least one pre-trained corpus determination model to determine a first feature vector and, for the candidate reply corpus, a second feature vector and a third feature vector, where the first feature vector represents the sentence vector of the sentence obtained by splicing the current sentence with the historical dialogue sentences, the second feature vector represents the sentence vector of the sentence obtained by splicing the candidate reply corpus with the historical dialogue sentences, and the third feature vector represents the sentence vector of the candidate reply corpus itself; splicing the first, second, and third feature vectors to obtain a target feature vector for the candidate reply corpus; inputting the target feature vector into a fully connected layer to estimate the confidence of the candidate reply corpus for each preset priority label, where a priority label represents the degree of match between the candidate reply corpus and the current sentence; inputting the confidences into a pre-constructed classifier to determine the priority label of the candidate reply corpus; and determining the target reply sentence from the corpus recall pool based on the priority labels.
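As an illustration of this scoring step, the sketch below concatenates three toy sentence vectors, applies a single fully connected layer, and picks the priority label with the highest confidence. The layer weights, label names, and vector dimensions are illustrative assumptions, not values from the disclosure (a trained model would supply real weights and an embedding model would supply real sentence vectors):

```python
import math

def dense(vec, weights, biases):
    """A toy fully connected layer: one output per priority label."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Turn the layer outputs into confidences that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def priority_label(f1, f2, f3, weights, biases, labels):
    """Concatenate the three sentence vectors into the target feature
    vector, run the fully connected layer, and return the priority
    label with the highest confidence."""
    target = f1 + f2 + f3  # feature concatenation (splicing)
    confidences = softmax(dense(target, weights, biases))
    return labels[confidences.index(max(confidences))]
```
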
In some embodiments, the at least one corpus determination model is trained by: extracting sample sentences from a sample dialogue log; determining, from the sample dialogue log and based on dialogue order, a first preset number of reply sentences and a second preset number of sample historical dialogue sentences for each sample sentence, where the first preset number of reply sentences occur after, and immediately follow, the sample sentence in dialogue order, and the second preset number of sample historical dialogue sentences occur before the sample sentence; labelling the first preset number of reply sentences with sample tags based on the preset priority labels to obtain a first preset number of sample reply sentences; constructing sample corpora from the sample sentences, the sample reply sentences, and the sample historical dialogue sentences to obtain a sample set; and inputting the sample corpora in the sample set into at least one pre-constructed initial corpus determination model, taking the sample tags of the sample reply sentences as the expected output, and training the initial corpus determination model to obtain the at least one corpus determination model.
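The sample-construction procedure above can be sketched as follows. Using the reply's proximity rank as a stand-in sample tag is an assumption for illustration, as are the parameter names; the disclosure only requires that tags be assigned based on the preset priority labels:

```python
def build_samples(dialog_log, n_replies=2, n_history=2):
    """Sketch: for each sentence in an ordered dialog log, take the next
    `n_replies` sentences as labelled reply candidates and the previous
    `n_history` sentences as historical context."""
    samples = []
    for i, sentence in enumerate(dialog_log):
        replies = dialog_log[i + 1:i + 1 + n_replies]
        history = dialog_log[max(0, i - n_history):i]
        # Proximity rank as an illustrative priority tag (0 = closest).
        labelled = [(r, rank) for rank, r in enumerate(replies)]
        samples.append({"sentence": sentence,
                        "replies": labelled,
                        "history": history})
    return samples
```
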
In some embodiments, inputting the sample corpora in the sample set into at least one pre-constructed initial corpus determination model, taking the sample tags of the sample reply sentences as the expected output, and training the initial corpus determination model to obtain at least one corpus determination model includes: determining the semantic type of each sample sentence; dividing the sample set into a plurality of sample subsets based on those semantic types, where each sample subset contains only sample corpora whose sample sentences share one semantic type; and training a pre-constructed initial corpus determination model on each sample subset to obtain a plurality of corpus determination models in one-to-one correspondence with the sample subsets, each corpus determination model corresponding to one semantic type.
In some embodiments, inputting the candidate reply corpus, the current sentence, and the historical dialogue sentences into at least one pre-constructed corpus determination model includes: determining the semantic type of the candidate reply corpus; taking, from the at least one corpus determination model, the corpus determination model corresponding to the semantic type of the candidate reply corpus as the target corpus determination model; and inputting the candidate reply corpus, the current sentence, and the historical dialogue sentences into the target corpus determination model.
In some embodiments, the word elements further include synonyms of the words.
In some embodiments, generating the word set of the current sentence includes: detecting an information-pointing word in the current sentence; determining, from the historical dialogue sentences, the historical information to which the information-pointing word refers; replacing the information-pointing word in the current sentence with the historical information to obtain an updated current sentence; performing word segmentation on the updated current sentence to obtain the corresponding word set; combining words in the word set into phrases based on their semantic relationships to obtain the phrase set corresponding to the updated current sentence; and merging the word set and the phrase set to obtain the word set of the current sentence.
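A minimal sketch of the information-pointing-word replacement step, assuming the detection result and the coreference mapping are given as simple lookup tables; a real system would derive both from the historical dialogue sentences:

```python
def resolve_and_update(sentence, pointers, history_info):
    """Replace information-pointing words (e.g. the pronoun "it") with
    the historical information they refer to. `pointers` is the set of
    detected pointing words and `history_info` maps each pointing word
    to the history entity it refers to; both are illustrative stand-ins
    for real detection and coreference resolution."""
    words = sentence.split()
    return " ".join(history_info.get(w, w) if w in pointers else w
                    for w in words)
```
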
In some embodiments, the method further includes: deleting the word elements in the word set that meet a preset condition, where the preset condition is that the occurrence frequency is greater than a first preset threshold or less than a second preset threshold, the second preset threshold being less than the first preset threshold.
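A minimal sketch of this pruning rule: word elements that occur too often carry little discriminative signal, and word elements that occur too rarely are unreliable. The threshold values are illustrative assumptions:

```python
def prune_word_set(word_counts, high=100, low=2):
    """Keep only word elements whose occurrence frequency lies within
    [low, high]; elements above `high` (the first preset threshold) or
    below `low` (the second preset threshold) are deleted."""
    return {w for w, c in word_counts.items() if low <= c <= high}
```
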
In another aspect of the disclosed embodiments, an electronic device is provided, including: a memory for storing a computer program; a processor for executing the computer program stored in the memory, and the computer program, when executed, implements the method for generating a dialogue statement in any of the above embodiments.
In yet another aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor, implements the method for generating a dialogue statement in any of the above embodiments.
In yet another aspect of the disclosed embodiments, there is provided a computer program product comprising computer programs/instructions which, when executed by a processor, implement the method for generating dialog statements of any of the above embodiments.
The method for generating dialogue sentences provided by the embodiments of the disclosure, in response to a current sentence sent by a user, generates a word set of the current sentence; then determines a first importance weight of each word element in the word set, and determines a second importance weight of the word element from the semantic relationships of the historical dialogue sentences; next determines an importance parameter of the word element based on the first importance weight and the second importance weight; then retrieves a first corpus and a second corpus from a first index library and a second index library, respectively, based on the importance parameters and on the similarity between the historical dialogue sentences and the preset corpora, and constructs a corpus recall pool from the first corpus and the second corpus; and finally determines, from the corpus recall pool, the target reply sentence that best matches the current sentence. Because the second importance weight of the word elements is determined from the semantic relationships of the historical dialogue sentences, and the second corpus is retrieved based on the similarity between the historical dialogue sentences and the preset corpora, the context information is fully utilized, the content consistency between the target reply sentence and the current sentence can be improved, and the dialogue quality is improved.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of an application scenario of the method for generating conversational utterances according to the present disclosure;
FIG. 2 is a flow diagram of one embodiment of a method for generating conversational utterances of the present disclosure;
FIG. 3 is a flow diagram illustrating the determination of a target reply statement in one embodiment of the method for generating dialog statements according to the present disclosure;
FIG. 4 is a schematic flow chart illustrating training corpus determination models in an embodiment of a method for generating conversational utterances according to the present disclosure;
FIG. 5 is a schematic flow chart illustrating the application of a corpus determination model in yet another embodiment of a method for generating conversational utterances according to the present disclosure;
FIG. 6 is a flow diagram illustrating the generation of a set of words in one embodiment of a method for generating conversational utterances according to the present disclosure;
FIG. 7 is a block diagram illustrating an embodiment of an apparatus for generating conversational utterances according to the present disclosure;
fig. 8 is a schematic structural diagram of an embodiment of an application of the electronic device of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure describes only an association relationship between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" in the present disclosure generally indicates that the former and latter associated objects are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The disclosed embodiments may be applied to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with electronic devices, such as terminal devices, computer systems, servers, and the like, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
In the process of implementing the present disclosure, the inventors found that, in the related art, an intelligent dialog system generally retrieves a corresponding reply sentence from a pre-constructed corpus for a sentence sent by a user in a single round of dialog, and the intelligent dialog system of this type can be applied to a simple question-and-answer dialog.
However, in some dialogue scenarios with many dialogue turns (e.g., instant messaging scenarios), such a system ignores the contextual information in the dialogue, so the content of the reply sentence it generates is poorly consistent with the sentence sent by the user.
A method for generating dialogue statements of the present disclosure is exemplarily described below with reference to fig. 1. In the scenario shown in fig. 1, the server 110 is a server of the intelligent dialogue system, and the mobile terminal 120 may be preloaded with a client of the intelligent dialogue system, or the intelligent dialogue system may be a module or unit embedded in an application, such as an automatic customer-service module of a shopping application. A user may converse with the server 110 through the mobile terminal 120. After the user sends the current sentence 130 to the server of the intelligent dialogue system, the server 110 may generate a word set 140 corresponding to the current sentence 130, and determine a first importance weight 150 and a second importance weight 160 of each word element in the word set 140, where the second importance weight 160 is determined based on the semantic relationships of the historical dialogue sentences 170 in the current dialogue scene. Thereafter, an importance parameter 180 for the word element may be determined based on the first importance weight 150 and the second importance weight 160. Then, a corpus recall pool 190 corresponding to the current sentence 130 is determined, where the candidate reply corpora in the corpus recall pool 190 include a first corpus and a second corpus: the first corpus is retrieved from a first index library (for example, Elasticsearch) based on the importance parameter 180, and the second corpus is retrieved from a second index library (for example, Faiss) based on the similarity between the historical dialogue sentences 170 and the preset corpora.
Finally, the server 110 selects the target reply sentence 191 with the highest matching degree with the current sentence 130 from the corpus recall pool, and sends the target reply sentence 191 to the mobile terminal 120 for presentation to the user, thereby realizing automatic generation of the dialogue sentence.
In the process of generating the target reply sentence, the second importance weight of the word element is determined through the semantic relation of the historical dialogue sentence, and the second corpus is retrieved based on the similarity between the historical dialogue sentence and the preset corpus, so that the context information is fully utilized, the content consistency between the target reply sentence and the current sentence can be improved, and the dialogue quality is improved.
Exemplary method
Referring next to fig. 2, fig. 2 is a flowchart illustrating an embodiment of a method for generating a dialogue statement of the present disclosure, as shown in fig. 2, the flowchart includes the following steps.
Step 210, responding to the current sentence sent by the user, and generating a word set of the current sentence.
The word elements in the word set comprise words obtained by segmenting the current sentence and phrases constructed based on the words.
In a specific example, the execution subject may perform word segmentation on the current sentence with a segmentation tool to obtain a plurality of words, and delete the stop words among them, i.e., conjunctions and pronouns with no practical meaning, such as "what". Then, the words are combined according to their semantic relationships to obtain a plurality of phrases. The resulting words and phrases together constitute the word set of the current sentence.
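A toy sketch of this word-set construction, with a whitespace tokenizer, a tiny stop-word list, and an adjacency rule standing in for a real segmentation tool (e.g. jieba for Chinese) and a real semantic-relation model:

```python
# All names and rules here are illustrative stand-ins, not the
# disclosure's actual segmentation or phrase-construction method.
STOP_WORDS = {"the", "a", "what", "is", "of"}

def segment(sentence):
    """Toy word segmentation: whitespace split, stop words removed."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]

def build_phrases(words, related_pairs):
    """Combine semantically related adjacent words into phrases;
    `related_pairs` stands in for a semantic-relation model."""
    return [w1 + " " + w2 for w1, w2 in zip(words, words[1:])
            if (w1, w2) in related_pairs]

def word_set(sentence, related_pairs):
    """Words plus phrases: the word set of the current sentence."""
    words = segment(sentence)
    return words + build_phrases(words, related_pairs)
```
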
Step 220, determining a first importance weight of the word element.
In this embodiment, the first importance weight may characterize the importance of a word element from its own characteristics (e.g., part of speech).
As an example, the execution subject may first obtain a dialog record in the current dialog scene, and then determine the first importance weight of each word element using a preset TF-IDF (term frequency-inverse document frequency) algorithm or a BM25 algorithm based on the dialog record.
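For illustration, a minimal TF-IDF over a tokenized dialogue record might look like the following. The smoothing in the IDF term is an assumed convention (real toolkits differ in detail), and the dialogue record is treated as a list of tokenized documents:

```python
import math

def tf_idf(term, doc, docs):
    """Toy TF-IDF: term frequency within one tokenized document times
    inverse document frequency over the dialogue record `docs`.
    The +1 smoothing is an illustrative choice."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in docs if term in d)
    idf = math.log(len(docs) / (1 + df)) + 1
    return tf * idf
```
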
Step 230, determining a second importance weight of the word element based on the semantic relationship of the historical dialogue statement in the current dialogue scene.
In practice, the importance of a word in a dialogue scene may change dynamically as the dialogue continues. In this embodiment, the second importance weight represents the importance of the word element in context: for example, the earlier a word element appears in the dialogue, the lower its importance and the smaller its second importance weight; conversely, the more often a word element occurs in the dialogue, the higher its importance and the larger its second importance weight.
As an example, the execution subject may parse the historical dialogue sentences in the current dialogue scene based on their semantic relationships, determine the target historical dialogue sentences related to the word element, and then assign a weight to each target historical dialogue sentence according to a time-attenuation mechanism, taking this weight as the second importance weight of the word element.
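The time-attenuation idea can be sketched as an exponential decay over dialogue turns, so that occurrences in older turns contribute less and repeated occurrences contribute more. The half-life parameterization is an illustrative choice, not specified by the disclosure:

```python
import math

def second_importance_weight(turn_indices, current_turn, half_life=3.0):
    """Time-decayed second importance weight: sum an exponentially
    decaying contribution from each dialogue turn in which the word
    element appeared. `half_life` (in turns) is an assumed parameter."""
    decay = math.log(2) / half_life
    return sum(math.exp(-decay * (current_turn - t)) for t in turn_indices)
```
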
As another example, the execution subject may determine the second importance weight of the word element via a neural network model, for example an attention-based model or an LSTM (Long Short-Term Memory) network.
Step 240, determining an importance parameter of the word element based on the first importance weight and the second importance weight.
In this embodiment, the importance parameter can characterize the importance of the word in both dimensions of its own features and context.
As an example, a weighted sum of the first importance weight and the second importance weight of the word may be determined as the importance parameter of the word.
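A minimal sketch of this combination step; the mixing coefficient `alpha` is an assumed tunable hyperparameter, not a value given by the disclosure:

```python
def importance_parameter(first_weight, second_weight, alpha=0.5):
    """Weighted sum of the two importance weights: `alpha` balances the
    word's own features against its contextual importance."""
    return alpha * first_weight + (1 - alpha) * second_weight
```
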
And step 250, determining a corpus recall pool of the current sentence.
The candidate reply corpora in the corpus recall pool include a first corpus retrieved from a first index library based on the importance parameter and a second corpus retrieved from a second index library based on the similarity between the historical dialogue sentences and the preset corpora.
In a specific example, the first index library may be a corpus built on ES (Elasticsearch). The execution subject may first map the importance parameter of a word element to its index value in the ES based on a preset mapping relationship, and then search the ES with that index value to obtain the first corpus corresponding to the word element.
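Rather than reproduce the Elasticsearch client API, the sketch below mimics the retrieval idea with a small in-memory inverted index, scoring entries by the summed importance of the query terms they contain, analogous to boosted term queries. All names are illustrative:

```python
def build_inverted_index(corpus):
    """Map each term to the indices of the corpus entries containing it."""
    index = {}
    for i, entry in enumerate(corpus):
        for term in set(entry.split()):
            index.setdefault(term, set()).add(i)
    return index

def retrieve_first_corpus(word_weights, corpus, index, top_k=2):
    """Score each entry by the summed importance parameter of the query
    terms it contains (a stand-in for ES term boosting), then return
    the top_k highest-scoring entries."""
    scores = {}
    for term, weight in word_weights.items():
        for i in index.get(term, ()):
            scores[i] = scores.get(i, 0.0) + weight
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return [corpus[i] for i in ranked[:top_k]]
```
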
In some optional implementations of this embodiment, the word elements in step 210 further include synonyms of the words, and the first corpus further includes corpora retrieved based on those synonyms.
In this implementation, the execution subject may add the synonyms of the words obtained after segmentation to the word set, thereby increasing the number of first corpora obtained from the first index library.
In another specific example, the second index library may be a corpus built on Faiss, in which the preset corpora are stored in advance in vector form. The execution subject may first convert the historical dialogue sentence into a vector representation, and then query the Faiss index with that vector to obtain the second corpus corresponding to the historical dialogue sentence.
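A pure-Python stand-in for the Faiss lookup: brute-force cosine similarity over the preset corpus vectors. A real Faiss index would perform (approximate) nearest-neighbour search efficiently at scale; the vectors here are illustrative:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve_second_corpus(query_vec, preset, top_k=1):
    """Return the top_k preset corpus entries whose vectors are most
    similar to the historical-dialogue-sentence vector `query_vec`.
    `preset` maps corpus text -> vector (a stand-in for a Faiss index)."""
    ranked = sorted(preset, key=lambda s: -cosine(query_vec, preset[s]))
    return ranked[:top_k]
```
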
And step 260, determining the candidate reply corpus with the highest matching degree with the current sentence in the corpus recall pool as the target reply sentence.
In this embodiment, the execution subject may determine, based on a preset filtering policy, a candidate reply corpus having a highest matching degree with the current sentence from the corpus recall pool as the target reply sentence.
For example, the execution subject may estimate the matching degree between a candidate reply corpus and the current sentence from the number of times the candidate appears in the corpus recall pool: the more often it appears, the higher the matching degree, so the candidate reply corpus with the largest number of occurrences may be determined as the target reply sentence.
For another example, the matching degree between a candidate reply corpus and the current sentence may be characterized by the sum of the importance parameters of the word elements it contains: the more such word elements a candidate contains, the larger the sum and the higher the matching degree, so the execution subject may select the candidate reply corpus with the highest matching degree as the target reply sentence.
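Both heuristics may be sketched together; the pool contents and importance values are illustrative assumptions:

```python
from collections import Counter

def rank_by_occurrence(recall_pool):
    """Heuristic 1: the candidate appearing most often in the pool wins."""
    return Counter(recall_pool).most_common(1)[0][0]

def rank_by_importance_sum(candidates, importance):
    """Heuristic 2: the candidate whose word elements carry the largest
    summed importance parameter wins."""
    def score(sentence):
        return sum(importance.get(tok, 0.0) for tok in sentence.split())
    return max(candidates, key=score)

pool = ["reply A", "reply B", "reply A"]
top_by_count = rank_by_occurrence(pool)

imp = {"price": 0.9, "area": 0.7, "hello": 0.1}
top_by_weight = rank_by_importance_sum(["hello there", "price per area"], imp)
```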
In the method for generating a dialogue sentence provided by this embodiment, a word set of the current sentence is generated in response to the current sentence sent by a user; a first importance weight of each word element in the word set is determined, and a second importance weight of the word element is determined according to the semantic relations of the historical dialogue sentences; an importance parameter of the word element is then determined based on the first importance weight and the second importance weight; a first corpus and a second corpus are then retrieved from the first index library and the second index library based on the importance parameters and on the similarity between the historical dialogue sentences and the preset corpora, respectively, and a corpus recall pool is constructed from the first corpus and the second corpus; finally, the target reply sentence with the highest matching degree with the current sentence is determined from the corpus recall pool. In this way, the context information in the historical dialogue records is fully utilized and the content consistency between the target reply sentence and the current sentence is improved, thereby improving the dialogue quality.
Referring next to fig. 3, fig. 3 is a schematic flow chart illustrating the determination of the target reply sentence in one embodiment of the method for generating the dialogue sentence according to the present disclosure. In some alternative implementations of the present embodiment, the step 260 may adopt a process shown in fig. 3, where the process includes the following steps.
Step 310, inputting the candidate reply corpus, the current sentence and the historical dialogue sentences into at least one pre-trained corpus determination model, and determining first, second and third feature vectors corresponding to the candidate reply corpus.
The first feature vector represents the sentence vector of the sentence obtained by splicing the current sentence with the historical dialogue sentences, the second feature vector represents the sentence vector of the sentence obtained by splicing the candidate reply corpus with the historical dialogue sentences, and the third feature vector represents the sentence vector of the candidate reply corpus itself.
And step 320, splicing the first, second and third feature vectors corresponding to the candidate reply corpus to obtain a target feature vector of the candidate reply corpus.
In this implementation, the first feature vector may represent associated features of a current sentence and a context in a current dialog scene, the second feature vector may represent associated features of a candidate reply corpus and a context in a current dialog scene, and the third feature vector may represent features of the candidate reply corpus itself.
In a specific example, the execution subject may determine the first, second and third feature vectors corresponding to the candidate reply corpus with a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model or GRU (Gated Recurrent Unit). Taking BERT as an example: the execution subject may input the current sentence and the historical dialogue sentences into a BERT model, which fuses the features of the historical dialogue sentences into those of the current sentence to obtain the first feature vector; the execution subject may then input the candidate reply corpus and the historical dialogue sentences into the BERT model to obtain the second feature vector; finally, the execution subject may input the candidate reply corpus alone into the BERT model to obtain the third feature vector.
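As a shape-level sketch of steps 310 and 320, the three encodings and their concatenation may be written as follows; the encoder is a toy stand-in for BERT or a GRU, and its two-dimensional output is an assumption:

```python
def encode(sentences):
    """Toy sentence encoder standing in for BERT/GRU; a real system would
    return a learned sentence vector."""
    text = " ".join(sentences)
    return [len(text) / 100.0,
            sum(map(ord, text)) / (100.0 * max(len(text), 1))]

def target_feature_vector(current, history, candidate):
    """Steps 310-320: three feature vectors, then their concatenation."""
    v1 = encode([current] + history)    # current sentence + context
    v2 = encode([candidate] + history)  # candidate reply + context
    v3 = encode([candidate])            # candidate reply alone
    return v1 + v2 + v3                 # spliced target feature vector

vec = target_feature_vector("How big is it?", ["Hi"],
                            "About 90 square meters.")
```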
Step 330, inputting the target feature vector into a fully-connected layer, and estimating the confidence of the candidate reply corpus for each preset priority label.
In this implementation, the priority label represents the matching degree between the candidate reply corpus and the current sentence: the higher the matching degree, the higher the priority. In one specific example, the priority labels from high to low may include: good reply, normal reply, bad reply, no reply within the same dialog, no reply in a different dialog. "No reply within the same dialog" indicates that the candidate reply corpus and the current sentence belong to the same dialog scene, but the candidate reply corpus is not a reply to the current sentence. "No reply in a different dialog" indicates that the candidate reply corpus belongs to a different dialog scene from the current sentence and is not a reply to it.
The execution subject may input the target feature vector of each candidate reply corpus into the fully-connected layer in turn, and determine in turn the confidence of each target feature vector for each priority label.
And step 340, inputting each confidence coefficient into a pre-constructed classifier, and determining the priority label of the candidate reply corpus.
As an example, softmax may be used as the classifier; the execution subject may input the confidences of the target feature vector obtained in step 330 into the classifier and determine the priority label of the target feature vector, which is the priority label of the candidate reply corpus.
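Steps 330 and 340 may be sketched together as follows; the label set follows the example above, while the layer weights and feature dimensions are illustrative assumptions:

```python
import math

LABELS = ["good reply", "normal reply", "bad reply",
          "no reply (same dialog)", "no reply (different dialog)"]

def fully_connected(x, weights, bias):
    """One dense layer producing a confidence per priority label."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def priority_label(feature_vec, weights, bias):
    """Steps 330-340: confidences from the FC layer, label via softmax."""
    probs = softmax(fully_connected(feature_vec, weights, bias))
    return LABELS[max(range(len(probs)), key=probs.__getitem__)]

# Toy weights favouring the first label for this feature vector
W = [[1.0, 0.0], [0.2, 0.1], [0.0, 0.0], [-0.5, 0.0], [-1.0, 0.0]]
b = [0.0] * 5
label = priority_label([2.0, 0.5], W, b)
```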
And step 350, determining a target reply statement from the corpus recall pool based on the priority label.
As an example, the execution subject may sort the candidate reply corpuses according to the priority tags, and then select the target reply sentence from the candidate reply corpuses with the highest priority.
In this implementation, the three feature vectors depict the candidate reply corpus from three dimensions, the priority labels represent the matching degree between the candidate reply corpus and the current sentence, and the target reply sentence is determined according to this matching degree. Context information from the dialog scene is thus coupled into the screening of the reply corpora, improving the matching degree between the target reply sentence and the current sentence.
Referring next to fig. 4, fig. 4 is a schematic flow chart illustrating training corpus determination models in an embodiment of a method for generating conversational utterances according to the present disclosure. As shown in fig. 4, the flow includes the following steps.
Step 410, sample statements are extracted from the sample dialog log.
As an example, the executing agent may extract a historical dialog log from the intelligent dialog system and then extract a user dialog record therefrom as a sample statement.
Step 420, determining a first preset number of reply sentences and a second preset number of sample historical conversation sentences corresponding to each sample sentence from the sample conversation log based on the conversation order.
The first preset number of reply sentences occur after the sample sentence and are adjacent to it in the dialogue order, and the second preset number of sample historical dialogue sentences occur before the sample sentence.
In practice, the intelligent dialog system may exhibit dialog-sentence misalignment, that is, the current reply sentence corresponds not to the current user sentence but to a user sentence before it.
To avoid the adverse effect of this misalignment on the dialog quality of the intelligent dialog system, this embodiment may obtain several reply sentences after the sample sentence at once, ensuring that they include the reply actually corresponding to the sample sentence.
The execution subject may first determine the position of the sample sentence in the sample dialog log, then take the three replies after and closest to the sample sentence as its reply sentences, and at the same time extract a second preset number of dialogue sentences before that position from the sample dialog log as sample historical dialogue sentences.
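The extraction step may be sketched as follows, assuming the sample dialog log is a flat list of (speaker, text) turns; the log contents and role tags are illustrative:

```python
def build_sample(dialog_log, sample_idx, n_replies=3, n_history=2):
    """Extract the sample sentence, the n_replies system replies that
    follow it, and the n_history sentences preceding it."""
    sample = dialog_log[sample_idx][1]
    replies = [text for speaker, text in dialog_log[sample_idx + 1:]
               if speaker == "system"][:n_replies]
    history = [text for _, text
               in dialog_log[max(0, sample_idx - n_history):sample_idx]]
    return sample, replies, history

log = [("user", "hi"), ("system", "hello"),
       ("user", "price?"), ("system", "2 million"),
       ("system", "negotiable"), ("system", "per unit")]
sample, replies, history = build_sample(log, 2, n_replies=3, n_history=2)
```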
Step 430, marking sample labels for the first preset number of reply sentences respectively, based on the preset priority labels, to obtain a first preset number of sample reply sentences.
Step 440, constructing a sample corpus based on the sample sentences, the first preset number of sample reply sentences and the second preset number of sample historical dialogue sentences to obtain a sample set.
Step 450, inputting the sample corpora in the sample set into at least one pre-constructed initial corpus determination model, taking the sample label of the sample reply sentence as expected output, and training the initial corpus determination model to obtain at least one corpus determination model.
In this embodiment, a plurality of replies after the sample statement may be determined as reply statements corresponding to the sample statement, and each reply statement is marked with a priority tag, so as to construct the sample corpus. And then training the initial corpus determining model based on the sample corpus to obtain the trained corpus determining model, so that the adverse effect of the misalignment of conversation sentences on the accuracy of the corpus determining model can be avoided, and the accuracy of the corpus determining model is improved.
Referring next to fig. 5, fig. 5 is a schematic flow chart illustrating the training of corpus determination models in another embodiment of the method for generating dialogue sentences according to the present disclosure. In some alternative implementations of the embodiment shown in fig. 4, step 450 may include the following steps.
Step 510, determining the semantic type of the sample statement.
In this implementation, the semantic type characterizes the semantic features of the sample sentence and may include, for example, statement, query and answer types.
The execution subject may determine the semantic type from the content of the sample sentence.
Step 520, a plurality of sample subsets are determined from the sample set based on semantic types of the sample statements.
Wherein each sample subset comprises sample corpora formed by sample sentences of only one semantic type.
As an example, the execution subject may extract a sample corpus of which the sample statement is of a "statement" type from the sample set, and construct a first sample subset; and extracting sample corpora of which the sample sentences are of the type of inquiry from the sample set, and constructing a second sample subset. By analogy, a sample subset can be constructed for each semantic type.
Step 530, training a pre-constructed initial corpus determining model based on each sample subset to obtain a plurality of corpus determining models corresponding to the plurality of sample subsets one to one.
In this implementation, each corpus determination model corresponds to a semantic type.
Continuing with the example in step 520, the execution subject may construct two initial corpus determination models at the same time, train the first initial corpus determination model based on the first sample subset, and obtain a first corpus determination model corresponding to the "statement" type; and training a second initial corpus determining model based on the second sample subset to obtain a second corpus determining model corresponding to the type of the query.
In the implementation manner, the corresponding sample subset can be constructed according to the semantic type of the sample sentence, and the corresponding corpus determination model is trained according to the sample subset, so that the recognition accuracy of the corpus determination model to the semantic type can be improved, and the accuracy of determining the candidate reply corpus priority is improved.
In an optional example of this implementation, inputting the candidate reply corpus, the current sentence and the historical dialogue sentences into the at least one pre-trained corpus determination model may further include: determining the semantic type of the candidate reply corpus; taking the corpus determination model corresponding to that semantic type among the at least one corpus determination model as the target corpus determination model; and inputting the candidate reply corpus, the current sentence and the historical dialogue sentences into the target corpus determination model.
In this example, the corpus determination model corresponding to the semantic type of the candidate reply corpus may be used to identify the candidate reply corpus to determine the priority of the candidate reply corpus, so that the matching degree between the candidate reply corpus and the current sentence may be determined more accurately.
Referring next to fig. 6, in some alternative implementations of the embodiment shown in fig. 2, step 210 may also adopt the flow shown in fig. 6, and the flow further includes the following steps.
Step 610, information pointing words are detected from the current sentence.
Step 620, determining historical information pointed to by the information pointing words from the historical dialogue sentences.
In a dialog scene with multiple rounds of dialog, a later turn will typically refer to specific information in an earlier turn with information pointing words.
In this implementation, an information pointing word refers to historical information mentioned in the historical dialogue sentences. For example, it may characterize an ordering, or it may characterize features of an object.
As an example, when multiple objects are involved in a dialog scene, a specific object may be referred to by its order of appearance, e.g., "the second one".
As another example, when objects of several different colors are involved in a dialog scene, a specific object may be referred to by information composed of a color adjective and a pronoun, such as "the yellow one".
And 630, replacing the information directional words in the current sentence with historical information to obtain an updated current sentence.
As an example, if the current sentence is "I want to buy the second set of houses", and the historical information pointed to by "the second set" is "three rooms and one hall", the updated current sentence is "I want to buy the house with three rooms and one hall".
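Steps 610 to 630 may be sketched as a simple substitution; the pointing-word mapping is assumed to come from the detection step on the dialogue history:

```python
def resolve_pointing_words(current, pointing_map):
    """Replace each detected information pointing word in the current
    sentence with the historical information it points to."""
    for word, info in pointing_map.items():
        current = current.replace(word, info)
    return current

# Illustrative mapping assumed to be produced by steps 610-620
updated = resolve_pointing_words(
    "I want to buy the second one",
    {"the second one": "the three-bedroom apartment"})
```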
And 640, segmenting words of the updated current sentence to obtain a word set corresponding to the updated current sentence.
And 650, combining the words in the word set into word groups based on semantic relations to obtain a word group set corresponding to the updated current sentence.
And 660, combining the word set and the phrase set to obtain a word set of the current sentence.
In this implementation, the historical information from the preceding dialog is substituted into the current sentence, and the corresponding word set is then constructed from the updated current sentence. This increases the coupling between the context information and the generation of the target reply sentence, further improving the matching degree between the target reply sentence and the current conversation.
In addition, the method may further comprise deleting the word elements in the word set that satisfy a preset condition, where the preset condition is: the occurrence frequency is greater than a first preset threshold or less than a second preset threshold, and the second preset threshold is less than the first preset threshold.
In this implementation, word elements satisfying the preset condition are those of low importance with little influence on the target reply sentence. Deleting them reduces redundant data in the word set and improves operating efficiency while preserving dialogue quality.
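The filtering rule may be sketched as follows; the thresholds and frequency table are illustrative assumptions:

```python
def filter_word_set(word_set, freqs, high=1000, low=2):
    """Keep only word elements whose occurrence frequency lies within
    [low, high]; overly common and overly rare elements are dropped."""
    return [w for w in word_set if low <= freqs.get(w, 0) <= high]

# Toy frequency table: "the" is too common, "zxq" too rare
freq = {"the": 5000, "house": 120, "zxq": 1}
kept = filter_word_set(["the", "house", "zxq"], freq)
```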
Exemplary devices
Referring next to fig. 7, fig. 7 is a schematic structural diagram of an embodiment of an apparatus for generating a dialogue sentence according to the present disclosure. As shown in fig. 7, the apparatus includes: a set generating unit 710 configured to generate a word set of a current sentence in response to the current sentence sent by a user, wherein the word elements in the word set include words obtained by segmenting the current sentence and word groups constructed based on the words; a first weight unit 720 configured to determine a first importance weight of a word element; a second weight unit 730 configured to determine a second importance weight of the word element based on the semantic relations of the historical dialogue sentences in the current dialog scene; a parameter determining unit 740 configured to determine an importance parameter of the word element based on the first importance weight and the second importance weight; a corpus recall unit 750 configured to determine a corpus recall pool of the current sentence, wherein the candidate reply corpora in the corpus recall pool include a first corpus retrieved from a first index library based on the importance parameter and a second corpus retrieved from a second index library based on the similarity between the historical dialogue sentences and the preset corpora; and a sentence determining unit 760 configured to determine the candidate reply corpus in the corpus recall pool having the highest matching degree with the current sentence as the target reply sentence.
In this embodiment, the sentence determining unit 760 further includes: a feature extraction module configured to input the candidate reply corpus, the current sentence and the historical dialogue sentences into at least one pre-trained corpus determination model, and determine first, second and third feature vectors corresponding to the candidate reply corpus, wherein the first feature vector represents the sentence vector of the sentence obtained by splicing the current sentence with the historical dialogue sentences, the second feature vector represents the sentence vector of the sentence obtained by splicing the candidate reply corpus with the historical dialogue sentences, and the third feature vector represents the sentence vector of the candidate reply corpus; a vector splicing module configured to splice the first, second and third feature vectors corresponding to the candidate reply corpus to obtain a target feature vector of the candidate reply corpus; a confidence determining module configured to input the target feature vector into a fully-connected layer and estimate the confidence of the candidate reply corpus for each preset priority label, where the priority label represents the matching degree between the candidate reply corpus and the current sentence; a priority determining module configured to input each confidence into a pre-constructed classifier and determine the priority label of the candidate reply corpus; and a corpus determination module configured to determine the target reply sentence from the corpus recall pool based on the priority labels.
In this embodiment, the apparatus further includes a model training unit, and the model training unit includes: a sample sentence extraction module configured to extract sample sentences from a sample dialog log; a reply sentence extraction module configured to determine, from the sample dialog log based on the dialogue order, a first preset number of reply sentences and a second preset number of sample historical dialogue sentences corresponding to each sample sentence, wherein the first preset number of reply sentences occur after the sample sentence and adjacent to it in the dialogue order, and the second preset number of sample historical dialogue sentences occur before the sample sentence; a marking module configured to mark sample labels for the first preset number of reply sentences respectively, based on the preset priority labels, to obtain a first preset number of sample reply sentences; a sample construction module configured to construct sample corpora based on the sample sentences, the first preset number of sample reply sentences and the second preset number of sample historical dialogue sentences to obtain a sample set; and a training module configured to input the sample corpora in the sample set into at least one pre-constructed initial corpus determination model, take the sample labels of the sample reply sentences as expected output, and train the initial corpus determination model to obtain at least one corpus determination model.
In this embodiment, the training module is further configured to: determining the semantic type of the sample statement; determining a plurality of sample subsets from the sample set based on semantic types of the sample sentences, wherein each sample subset only comprises a sample corpus consisting of sample sentences of one semantic type; training a pre-constructed initial corpus determining model based on each sample subset to obtain a plurality of corpus determining models corresponding to the sample subsets one by one, wherein each corpus determining model corresponds to a semantic type.
In this embodiment, the feature extraction module further includes: a type determination submodule configured to determine the semantic type of the candidate reply corpus; a model determination submodule configured to take the corpus determination model corresponding to the semantic type of the candidate reply corpus among the at least one corpus determination model as the target corpus determination model; and an input submodule configured to input the candidate reply corpus, the current sentence and the historical dialogue sentences into the target corpus determination model.
In this embodiment, the word element further includes synonyms of words.
In this embodiment, the set generating unit further includes: an information detection module configured to detect information pointing words from the current sentence; an information determination module configured to determine, from the historical dialogue sentences, the historical information to which an information pointing word points; an information substitution module configured to replace the information pointing words in the current sentence with the historical information to obtain an updated current sentence; a word segmentation module configured to segment the updated current sentence to obtain a word set corresponding to the updated current sentence; a word group construction module configured to combine words in the word set into word groups based on semantic relations to obtain a word group set corresponding to the updated current sentence; and a merging module configured to merge the word set and the word group set to obtain the word set of the current sentence.
In this embodiment, the apparatus further includes a filtering unit configured to delete a word element satisfying a preset condition in the word set, where the preset condition is: the occurrence frequency is greater than a first preset threshold or less than a second preset threshold, and the second preset threshold is less than the first preset threshold.
In addition, an embodiment of the present disclosure also provides an electronic device, including:
a memory for storing a computer program;
a processor for executing the computer program stored in the memory, and when the computer program is executed, implementing the method for generating a dialogue statement according to any of the above embodiments of the present disclosure.
Fig. 8 is a schematic structural diagram of an embodiment of an application of the electronic device of the present disclosure. Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 8. The electronic device may be either or both of the first device and the second device, or a stand-alone device separate from them, which stand-alone device may communicate with the first device and the second device to receive the acquired input signals therefrom.
As shown in fig. 8, the electronic device includes one or more processors and memory.
The processor may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
The memory may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by a processor to implement the methods for generating dialog statements of the various embodiments of the present disclosure described above and/or other desired functionality.
In one example, the electronic device may further include: an input device and an output device, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
The input device may include, for example, a keyboard, a mouse, and the like.
The output means may output various information including determined distance information, direction information, and the like to the outside. The output devices may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.
Of course, for simplicity, only some of the components of the electronic device relevant to the present disclosure are shown in fig. 8, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device may include any other suitable components, depending on the particular application.
In addition to the above methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps in the method for generating conversational utterances according to the various embodiments of the present disclosure described in the above sections of the specification.
The computer program product may write program code for carrying out operations for embodiments of the present disclosure in any combination of one or more programming languages, including an object oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the steps in the method for generating dialog statements according to various embodiments of the present disclosure described in the above sections of the present specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
The methods and apparatus of the present disclosure may be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware, and firmware. The order described above for the steps of the methods is for illustration only; the steps of the methods of the present disclosure are not limited to that order unless specifically stated otherwise. Further, in some embodiments the present disclosure may be embodied as a program recorded in a recording medium, the program comprising machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the methods according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (11)

1. A method for generating dialogue statements, comprising:
in response to a current sentence sent by a user, generating a word set of the current sentence, wherein word elements in the word set comprise words obtained by segmenting the current sentence and phrases constructed based on the words;
determining a first importance weight for the word element;
determining a second importance weight of the word element based on semantic relationships of historical dialogue sentences in the current dialogue scene;
determining an importance parameter for the word element based on the first importance weight and the second importance weight;
determining a corpus recall pool of the current sentence, wherein candidate reply corpuses in the corpus recall pool comprise a first corpus retrieved from a first index library based on the importance parameter and a second corpus retrieved from a second index library based on the similarity between the historical conversation sentence and a preset corpus;
and determining the candidate reply corpus with the highest matching degree with the current sentence in the corpus recall pool as a target reply sentence.
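As an illustrative, non-limiting sketch of the recall-and-rank flow of claim 1, the following toy Python implementation substitutes naive token overlap for the claimed semantic relationships and an IDF-style weight for the first importance weight; every function name, weighting scheme, and index structure here is an assumption made for demonstration only, not the claimed implementation:

```python
import math

def word_set(sentence):
    # Toy segmentation: whitespace tokens plus adjacent-word bigram "phrases".
    words = sentence.split()
    phrases = [" ".join(p) for p in zip(words, words[1:])]
    return words + phrases

def importance(elements, history, doc_freq, n_docs):
    # First weight: an IDF-style corpus weight; second weight: a fixed boost
    # for elements tied to the historical dialogue (naive token overlap
    # stands in for a real semantic relationship).
    hist_tokens = set(" ".join(history).split())
    params = {}
    for e in elements:
        w1 = math.log((1 + n_docs) / (1 + doc_freq.get(e, 0)))
        w2 = 2.0 if set(e.split()) & hist_tokens else 1.0
        params[e] = w1 * w2
    return params

def recall_pool(params, history, first_index, second_index):
    # First corpus: keyword retrieval from the first index library, driven
    # by the importance parameters; second corpus: preset corpora retrieved
    # by similarity between their keys and the historical dialogue.
    first = [r for kw in params for r in first_index.get(kw, [])]
    hist_tokens = set(" ".join(history).split())
    second = [r for key, r in second_index if set(key.split()) & hist_tokens]
    return first + second

def best_reply(pool, sentence):
    # The candidate with the highest matching degree (toy token overlap) wins.
    toks = set(word_set(sentence))
    return max(pool, key=lambda c: len(toks & set(word_set(c))))
```

For example, with a one-entry keyword index and one preset corpus keyed on "school district", a query "what is the price" recalls both candidates and ranks the price reply highest by overlap.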
2. The method according to claim 1, wherein determining, as the target reply sentence, the candidate reply corpus in the corpus recall pool having the highest matching degree with the current sentence comprises:
inputting the candidate reply corpus, the current sentence, and the historical dialogue sentence into at least one corpus determination model trained in advance, and determining a first feature vector, a second feature vector, and a third feature vector corresponding to the candidate reply corpus, wherein the first feature vector represents a sentence vector of a sentence obtained by splicing the current sentence and the historical dialogue sentence, the second feature vector represents a sentence vector of a sentence obtained by splicing the candidate reply corpus and the historical dialogue sentence, and the third feature vector represents a sentence vector of the candidate reply corpus;
splicing the first feature vector, the second feature vector and the third feature vector to obtain a target feature vector of the candidate reply corpus;
inputting the target feature vector into a fully connected layer, and estimating a confidence that the candidate reply corpus corresponds to each preset priority label, wherein a priority label represents the matching degree between the candidate reply corpus and the current sentence;
inputting each confidence into a pre-constructed classifier, and determining the priority label of the candidate reply corpus; and
determining the target reply sentence from the corpus recall pool based on the priority label.
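The scoring step of claim 2 can be sketched as follows. This is a minimal stand-alone illustration, not the trained model: the hand-rolled `sentence_vector` is a deterministic toy stand-in for a learned sentence encoder, and the weights of the fully connected layer would in practice come from training rather than being passed in:

```python
import math

def sentence_vector(text, dim=8):
    # Deterministic toy encoder: bucket each token by character sum into a
    # fixed-size bag-of-tokens vector.
    v = [0.0] * dim
    for tok in text.split():
        v[sum(ord(c) for c in tok) % dim] += 1.0
    return v

def rank_candidate(candidate, current, history, weights, biases):
    # Three vectors, mirroring claim 2: current + history, candidate +
    # history, and the candidate alone, concatenated into one target vector.
    f1 = sentence_vector(current + " " + history)
    f2 = sentence_vector(candidate + " " + history)
    f3 = sentence_vector(candidate)
    target = f1 + f2 + f3  # list concatenation = vector concatenation here

    # Fully connected layer: one logit per preset priority label.
    logits = [sum(w * x for w, x in zip(row, target)) + b
              for row, b in zip(weights, biases)]

    # Softmax over the logits gives one confidence per label; the arg-max
    # label is taken as the candidate's priority.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    confs = [e / sum(exps) for e in exps]
    return confs.index(max(confs)), confs
```

With 3 priority labels and an 8-dimensional encoder, `weights` is a 3 x 24 matrix; the returned confidences sum to 1 and the returned label is the arg-max.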
3. The method according to claim 2, wherein said at least one corpus determination model is trained by:
extracting sample statements from the sample dialog log;
determining, from the sample dialogue log based on dialogue order, a first preset number of reply sentences and a second preset number of sample historical dialogue sentences corresponding to each sample sentence, wherein the first preset number of reply sentences occur after the sample sentence and are adjacent to it in the dialogue order, and the second preset number of sample historical dialogue sentences occur before the sample sentence;
labeling each of the first preset number of reply sentences with a sample tag based on preset priority labels, to obtain a first preset number of sample reply sentences;
constructing a sample corpus to obtain a sample set based on the sample sentences, the first preset number of sample reply sentences and the second preset number of sample historical conversation sentences;
and inputting the sample corpora in the sample set into at least one pre-constructed initial corpus determination model, taking the sample labels of the sample reply sentences as expected output, and training the initial corpus determination model to obtain the at least one corpus determination model.
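The sample-construction steps of claim 3 can be sketched as below. The labeling scheme is an assumption for illustration: tag 0 (highest priority) is given to the reply immediately adjacent to the sample sentence, with later replies receiving lower-priority tags:

```python
def build_samples(dialog_log, first_n=2, history_n=2):
    # dialog_log: the utterances of one dialogue, in dialogue order. For each
    # sample sentence, the `first_n` utterances immediately after it become
    # candidate replies, and the `history_n` utterances before it become the
    # sample historical dialogue sentences.
    samples = []
    for i, sentence in enumerate(dialog_log):
        replies = dialog_log[i + 1 : i + 1 + first_n]
        if len(replies) < first_n:
            continue  # not enough following turns to label
        history = dialog_log[max(0, i - history_n) : i]
        # Assumed priority labeling: adjacent reply gets tag 0, later ones
        # progressively lower priority.
        tagged = list(enumerate(replies))  # [(tag, reply), ...]
        samples.append({"sentence": sentence,
                        "replies": tagged,
                        "history": history})
    return samples
```

Each returned dict is one sample corpus; the tags then serve as the expected output when training the initial corpus determination model.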
4. The method according to claim 3, wherein the step of inputting the sample corpus in the sample set into at least one pre-constructed initial corpus determination model, using the sample label of the sample reply sentence as an expected output, training the initial corpus determination model to obtain the at least one corpus determination model comprises:
determining a semantic type of the sample statement;
determining a plurality of sample subsets from the sample set based on semantic types of the sample sentences, wherein each sample subset only comprises a sample corpus consisting of sample sentences of one semantic type;
training a pre-constructed initial corpus determining model based on each sample subset to obtain a plurality of corpus determining models corresponding to the sample subsets one by one, wherein each corpus determining model corresponds to a semantic type.
5. The method of claim 4, wherein inputting the candidate reply corpus, the current sentence, and the historical dialogue sentence into at least one corpus determination model trained in advance comprises:
determining the semantic type of the candidate reply corpus;
taking, as a target corpus determination model, the corpus determination model in the at least one corpus determination model that corresponds to the semantic type of the candidate reply corpus; and
inputting the candidate reply corpus, the current sentence, and the historical dialogue sentence into the target corpus determination model.
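The routing logic of claims 4 and 5 amounts to a dispatch table keyed by semantic type. A minimal sketch, in which the per-type models are placeholder scorers and the semantic-type classifier is a deliberately naive stand-in:

```python
def route_candidate(candidate, current, history, models, classify_type):
    # Classify the candidate's semantic type, select the corpus
    # determination model trained for that type, and score with it.
    sem_type = classify_type(candidate)
    model = models[sem_type]
    return sem_type, model(candidate, current, history)

# Placeholder per-type scorers; real models would be trained per claim 4.
models = {
    "question": lambda c, q, h: 0.2,
    "statement": lambda c, q, h: 0.8,
}

# Naive stand-in classifier: punctuation-based typing, for illustration only.
def classify_type(sentence):
    return "question" if sentence.endswith("?") else "statement"
```

The one-to-one correspondence between sample subsets and models in claim 4 is what makes this dictionary lookup well defined: every semantic type has exactly one model.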
6. The method according to any one of claims 1 to 5, wherein the word elements further comprise synonyms of the words.
7. The method of any one of claims 1 to 6, wherein generating the word set of the current sentence comprises:
detecting an information-pointing word in the current sentence;
determining, from the historical dialogue sentences, historical information pointed to by the information-pointing word;
replacing the information-pointing word in the current sentence with the historical information to obtain an updated current sentence;
segmenting words of the updated current sentence to obtain a word set corresponding to the updated current sentence;
combining, based on semantic relationships, the words in the word set into phrases to obtain a phrase set corresponding to the updated current sentence;
and combining the word set and the phrase set to obtain the word set of the current sentence.
8. The method of claim 7, further comprising:
deleting word elements in the word set that meet a preset condition, wherein the preset condition is that the occurrence frequency is greater than a first preset threshold or less than a second preset threshold, the second preset threshold being less than the first preset threshold.
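Claims 7 and 8 together can be sketched as one word-set builder. The pieces not specified by the claims are loudly hypothetical here: `pointer_map` maps an information-pointing word (e.g. a pronoun) to a resolver over the history, and the resolver used in the test naively takes the last token of the last history turn as a stand-in for real coreference resolution; adjacent-word bigrams stand in for combining words by semantic relation:

```python
def build_word_set(current, history, pointer_map, freq, hi=1000, lo=2):
    # Replace each detected information-pointing word with the historical
    # information its (hypothetical) resolver returns.
    resolved = [pointer_map[t](history) if t in pointer_map else t
                for t in current.split()]
    # Phrase construction: adjacent-word bigrams as a toy semantic pairing.
    phrases = [" ".join(p) for p in zip(resolved, resolved[1:])]
    elements = resolved + phrases
    # Claim 8 filter: drop elements whose corpus frequency exceeds the first
    # threshold (too common) or falls below the second (too rare); unseen
    # elements count as frequency 0 and are therefore dropped.
    return [e for e in elements if lo <= freq.get(e, 0) <= hi]
```

For "how big is it" with history ["I like the garden flat"], the resolver maps "it" to "flat", and the frequency filter removes the over-common "is" along with unseen bigrams.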
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program stored in the memory, wherein the computer program, when executed, implements the method of any one of claims 1-8.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of the preceding claims 1 to 8.
11. A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the method of any of the preceding claims 1-8.
CN202111405739.7A 2021-11-24 2021-11-24 Method, apparatus, device, medium and program product for generating dialogue statements Pending CN114118100A (en)

Priority Applications (1)

CN202111405739.7A, priority date 2021-11-24, filing date 2021-11-24: Method, apparatus, device, medium and program product for generating dialogue statements


Publications (1)

CN114118100A, published 2022-03-01

Family

ID=80372283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111405739.7A Pending CN114118100A (en) 2021-11-24 2021-11-24 Method, apparatus, device, medium and program product for generating dialogue statements

Country Status (1)

Country Link
CN (1) CN114118100A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN114638231A * | 2022-03-21 | 2022-06-17 | 马上消费金融股份有限公司 | Entity linking method and device and electronic equipment
CN114638231B * | 2022-03-21 | 2023-07-28 | 马上消费金融股份有限公司 | Entity linking method and device and electronic equipment
CN115186092A * | 2022-07-11 | 2022-10-14 | 贝壳找房(北京)科技有限公司 | Online interaction processing method and apparatus, storage medium, and program product
CN115292467A * | 2022-08-10 | 2022-11-04 | 北京百度网讯科技有限公司 | Information processing and model training method, apparatus, device, medium, and program product
CN115292467B * | 2022-08-10 | 2023-10-27 | 北京百度网讯科技有限公司 | Information processing and model training method, device, equipment, medium and program product

Similar Documents

Publication Publication Date Title
US10437929B2 (en) Method and system for processing an input query using a forward and a backward neural network specific to unigrams
WO2021068352A1 (en) Automatic construction method and apparatus for faq question-answer pair, and computer device and storage medium
CA3065765C (en) Extracting domain-specific actions and entities in natural language commands
CN114118100A (en) Method, apparatus, device, medium and program product for generating dialogue statements
CA3065764C (en) Extracting domain-specific actions and entities in natural language commands
CN111026840B (en) Text processing method, device, server and storage medium
CN112100354A (en) Man-machine conversation method, device, equipment and storage medium
US11861918B2 (en) Image analysis for problem resolution
WO2021073179A1 (en) Named entity identification method and device, and computer-readable storage medium
US20230072171A1 (en) System and method for training and refining machine learning models
US11170765B2 (en) Contextual multi-channel speech to text
CN114218945A (en) Entity identification method, device, server and storage medium
CN111639162A (en) Information interaction method and device, electronic equipment and storage medium
CN114090885A (en) Product title core word extraction method, related device and computer program product
CN114548083B (en) Title generation method, device, equipment and medium
CN115248846B (en) Text recognition method, device and medium
US11934794B1 (en) Systems and methods for algorithmically orchestrating conversational dialogue transitions within an automated conversational system
US11893352B2 (en) Dependency path reasoning for measurement extraction
CN116738973B (en) Search intention recognition method, method for constructing prediction model and electronic equipment
CN113032540B (en) Man-machine interaction method, device, equipment and storage medium
US20240126991A1 (en) Automated interaction processing systems
CN114676241A (en) Intention recognition method and device for multi-turn conversation, computer equipment and storage medium
CN116521824A (en) Method, system and electronic equipment for enhancing sample by using keywords
CN114219012A (en) Sample data processing method, device, computer program product and storage medium
CN116127331A (en) Text processing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination