CN111241250B - Emotion dialogue generation system and method - Google Patents


Info

Publication number
CN111241250B
Authority
CN
China
Prior art keywords
emotion
model
reply
words
generation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010074840.8A
Other languages
Chinese (zh)
Other versions
CN111241250A (en)
Inventor
窦志成 (Dou Zhicheng)
Current Assignee
Renmin University of China
Original Assignee
Renmin University of China
Priority date
Filing date
Publication date
Application filed by Renmin University of China filed Critical Renmin University of China
Priority to CN202010074840.8A priority Critical patent/CN111241250B/en
Publication of CN111241250A publication Critical patent/CN111241250A/en
Application granted granted Critical
Publication of CN111241250B publication Critical patent/CN111241250B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G06F 16/338 Presentation of query results
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to an emotion dialogue generation system and method comprising an emotion dialogue generation module and a reordering module. The emotion dialogue generation module comprises a basic reply generation module, which generates a semantically correct basic reply; a multi-model emotion reply generation module, which builds an emotion model for each emotion category through hierarchical training and obtains emotion replies from these models; and a single-model emotion reply generation module, which trains a single emotion model that takes the emotion category as input and outputs emotion replies accordingly. The reordering module receives the replies output by the three sub-modules of the dialogue generation module, scores them, and reorders them by score; the reply with the highest score is the final emotion reply. During man-machine dialogue the machine can thus generate replies that carry a specified emotion while remaining grammatically fluent, semantically relevant and emotionally consistent, which improves the user experience of man-machine interaction.

Description

Emotion dialogue generation system and method
Technical Field
The application relates to an emotion dialogue generation system and method, and belongs to the technical field of artificial intelligence.
Background
With the advent of man-machine interaction products such as the Siri and Google Home personal assistants and the Tmall Genie smart speaker, man-machine interaction has received more and more attention from industry and academia and has a growing influence on people's daily lives. Many methods for implementing a dialogue system already exist, but most such systems are task-oriented, i.e., man-machine interaction is performed by issuing commands. With the development of the technology, however, purely task-oriented dialogue no longer satisfies people's needs: people hope that a robot's replies can be more fluent, conform to human speaking habits, distinguish the emotional color of language and respond appropriately. This gave rise to the man-machine chit-chat mode of dialogue.
Chit-chat conversations usually have no fixed topic area, which makes it difficult for a machine to select appropriate reply content. Some researchers convert the problem into a matching problem: candidate replies are obtained through a retrieval system, and the appropriate reply is then chosen with a text-matching algorithm. Other researchers focus on making the machine's replies conform to the context of daily conversation and human expression habits. Retrieval-based or template-based dialogue models avoid irrelevant answers and grammatical errors, but in practical application they are limited by the quality of the matching model and by the fact that no system can contain all reply sentences or templates. As a result, the machine's replies are often stiff and formulaic, semantically unsound, and without emotional color. Especially for newer vocabulary and expressions, and for inputs with obvious emotional color, no semantically acceptable answer can be given.
Disclosure of Invention
In view of the above shortcomings of the prior art, the purpose of the present application is to provide an emotion dialogue generation system and method that enable a machine to generate dialogue replies carrying a specified emotion during man-machine dialogue, so that the replies are at once grammatically fluent, semantically relevant and emotionally consistent, thereby improving both the quality of machine-generated replies and the user experience of man-machine interaction.
In order to achieve the above object, the present application provides an emotion dialogue generation system comprising an emotion dialogue generation module and a reordering module. The emotion dialogue generation module comprises a basic reply generation module, which generates a semantically correct basic reply by detecting the entity words of the input sentence; a multi-model emotion reply generation module, which builds emotion models corresponding to the various emotions through hierarchical training and fine-tunes the basic reply based on these emotion models to obtain an emotion reply; and a single-model emotion reply generation module, which trains an emotion model that takes the emotion category as input and outputs an emotion reply. The reordering module receives the replies output by the three sub-modules of the dialogue generation module, scores them, and reorders them by score; the reply with the highest score is the final emotion reply.
Further, the basic reply generation module identifies entity words with practical meaning in the input sentence based on a rule generation model in which manually written reply templates are embedded; it selects a reply template according to the entity words and determines the semantically correct basic reply according to that template.
Further, the multi-model emotion reply generation module and the single-model emotion reply generation module generate models with emotion categories through a Seq2Seq model comprising an encoder and a decoder. The encoder converts the input sentence X into a sequence of intermediate-state vectors H = (h_1, h_2, ..., h_n), and the decoder decodes this intermediate representation H into the output sentence Y of the emotion model.
Further, an attention mechanism is introduced into the Seq2Seq model to enrich the input information of the decoder. With attention, the decoder decodes using the following formulas:
e_ij = s_{i-1}^T W_a h_j
α_ij = exp(e_ij) / Σ_{k=1}^{n} exp(e_ik)
c_i = Σ_{j=1}^{n} α_ij h_j
s_i = GRU_decoder(y_{i-1}, s_{i-1}, c_i)
where i indexes decoder time steps and j encoder time steps; s_i is the hidden state of the decoder at time i in the decoding process; h_j is the intermediate state of the encoder at time j in the encoding process; e_ij is the attention importance computed from the decoder hidden state s_{i-1} at the previous time step and the encoder intermediate states h_j at the different times j, with W_a a learned parameter matrix; α_ij is the normalized attention weight assigned to the encoder intermediate state at time j; n is the length of the input; c_i is the context vector obtained as the attention-weighted sum of all encoder intermediate states; and y_i is the word vector of the word generated at time i.
Further, the multi-model emotion reply generation module comprises a plurality of Seq2Seq generation models corresponding to the emotion categories, generated as follows. First, a general model is trained on the whole corpus. Second, the corpus is divided into positive-emotion and negative-emotion corpora according to emotion category, and the general model is fine-tuned on each, yielding a positive emotion model and a negative emotion model. Finally, the positive corpus is divided into happy and like corpora, on which the positive emotion model is fine-tuned into a happy model and a like model; the negative corpus is divided into disgust, sad and anger corpora, on which the negative emotion model is fine-tuned into a disgust model, a sad model and an anger model.
Further, the single-model emotion reply generation module comprises only one Seq2Seq model, in which the emotion category i is converted into an emotion vector e_i; e_i is added into the decoding process of the decoder so that the Seq2Seq model contains the information of the emotion category.
Further, the Seq2Seq model adopts a copy mechanism to raise the generation probability of emotion words during reply generation. The specific process is as follows: the words obtained from the input are divided into emotion words and non-emotion words; the emotion words are converted into emotion word vectors, which interact with the current hidden state s_i of the decoder to give the generation probability of the emotion words; this probability is then added to the ordinary generation probability, so that the probability of generating emotion words in the reply increases. The generation probability of the decoder is expressed as follows:
p(y_i | s_i) = softmax(p_ori(y_i | s_i) + p_copy(y_i | s_i, E))
p_copy(y_i | s_i, E) = softmax(E W_e s_i)
where E is the matrix of word vectors of all emotion words; W_e is a learned parameter matrix; y_i is the index of the word generated at time i; s_i is the intermediate state vector of the decoder at time i; p_ori is the generation probability of word y_i under state s_i for the original decoder; and p_copy is the additional copy probability of emotion word y_i under state s_i, which is 0 when y_i is not an emotion word.
Further, the scoring mechanism of the reordering module comprises an emotion consistency score and a semantic consistency score.
For the emotion consistency score, a different emotion dictionary is constructed for each emotion category; each dictionary gives the emotion words under that category and the emotion score of each word. The emotion consistency score is calculated as follows:
E(y) = Σ_{m=1}^{M} γ_m · E_m, with E_m = w_m · Π_{j ∈ ind(m-1, m)} ω_j
where M is the number of emotion words; E(y) and E_m are the emotion scores of the candidate reply y and of emotion word m respectively; ind(m-1, m) is the span from the previous emotion word to the current emotion word m; ω_j is the weight of a degree adverb y_j within that span; w_m is the weight score of emotion word m in the emotion dictionary; and γ_m indicates whether emotion word m is in the dictionary of the target emotion category: it is set to 1 if m is in that dictionary, meaning m contributes positively to expressing the target emotion, and -1 otherwise, meaning m contributes negatively;
the formula for the semantic consistency score is as follows:
T(y)=Count(x,y)
wherein Count (·) is the number of identical terms of two sentences.
Further, the reordering module combines the emotion consistency score E(y) and the semantic consistency score T(y) into a total score:
Φ(y) = λ · E(y) + (1 - λ) · T(y)
where Φ(y) is the total score used for ranking and λ is a weight that balances the two scores.
The application also discloses an emotion dialogue generation method comprising the following steps: S1, generating a semantically correct basic reply by detecting the entity words of the input sentence; S2, establishing emotion models corresponding to the various emotions through hierarchical training, and fine-tuning the basic reply based on these emotion models to obtain replies containing the emotions; S3, training an emotion model that takes the emotion category as input and outputs an emotion reply; S4, receiving the replies output in steps S1-S3, scoring them and reordering them by score, the reply with the highest score being the final emotion reply.
Due to the adoption of the above technical scheme, the application has the following advantages:
(1) During man-machine dialogue, the machine can generate replies that carry a specified emotion and are at once grammatically fluent, semantically relevant and emotionally consistent, which improves both the quality of machine-generated replies and the user experience of man-machine interaction.
(2) Dialogue generation under different emotions is considered, specifically the five emotions like, sad, anger, disgust and happy. A basic reply generation module, a multi-model emotion reply generation module and a single-model emotion reply generation module each produce candidate replies, and the best one is selected by the reordering module as the final reply.
(3) An end-to-end model that directly generates each specified emotion is provided: emotion factors steer the model toward generating sentences in the direction of the specified emotion, while a copy mechanism explicitly raises the generation of emotion words, making the emotion expressed by the sentences richer.
(4) For the emotion dialogue problem, a way of measuring the emotion score of a sentence is designed. It can be combined with other sentence-level features to reorder the results of different emotion dialogue models.
Drawings
FIG. 1 is a diagram illustrating the structure of an emotion dialogue generation system according to an embodiment of the present application;
FIG. 2 is a logic diagram of the multi-model emotion reply generation module according to an embodiment of the present application.
Detailed Description
The present application will be described in detail below with reference to specific embodiments so that those skilled in the art can better understand its technical direction. It should be understood, however, that the detailed description is presented only to provide a better understanding of the application and should not be taken to limit it. In the description of the present application, the terminology used is for the purpose of description only and is not to be interpreted as indicating or implying relative importance.
Example 1
The embodiment discloses an emotion dialogue generation system, as shown in fig. 1, including: the emotion dialogue generation module and the reordering module;
the emotion dialogue generation module comprises a basic reply generation module which generates a basic reply with correct semantics by detecting entity words of an input sentence; the multi-model emotion response generation module is used for establishing emotion models corresponding to various emotions through hierarchical training, and performing fine adjustment on basic response based on the emotion models to obtain emotion response; the single model emotion response generation module takes emotion types as input training emotion models and outputs emotion responses according to the emotion models;
the reordering module receives replies output by three sub-modules in the dialogue generating module, scores the replies output, reorders the replies output by each module according to the score, and the reply with the highest score is the final emotion reply.
With this system, dialogue replies carrying a specified emotion can be generated during man-machine dialogue, so that the replies are grammatically fluent, semantically relevant and emotionally consistent, which improves both the quality of machine-generated replies and the user experience of man-machine interaction.
The emotion dialogue generation process in this embodiment is as follows. The user's voice, video or typed input is first converted into machine-recognizable text X = (x_1, x_2, ..., x_n), where x_1 to x_n are the constituent elements of the text and may represent paragraphs, sentences, words, etc. The emotion dialogue generation module selects an emotion category for the text and, based on the selected category, generates an emotion reply Y = (y_1, y_2, ..., y_m) conforming to it. The emotion categories are {like, sad, anger, disgust, happy, other}. The generated emotion reply Y must be consistent with the emotion of the selected category, and must also be grammatically fluent and semantically relevant.
The emotion dialogue generation module comprises a basic reply generation module, a multi-model emotion reply generation module and a single-model emotion reply generation module.
The basic reply generation module identifies entity words with practical meaning in the input sentence based on a rule generation model in which manually written reply templates are embedded; it selects a reply template according to the entity words and determines the semantically correct basic reply according to that template. Because the reply templates are written by hand, the resulting basic reply conforms well to grammatical fluency, semantic relevance and emotion consistency. In this embodiment, the RUCNLP tool is used to extract the entity words in the sentence, and the detection of an entity word is used as the trigger point of the rule generation model; once triggered, the output of the rule generation model is used directly as the reply, which reduces the amount of computation in actual use.
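The trigger-and-template flow just described can be sketched as follows. The entity list, templates and naive substring extractor here are illustrative stand-ins only; the patent uses the RUCNLP tool and its own manually written templates:

```python
# Minimal sketch of the basic reply generation module: a reply template is
# selected only when an entity word is detected in the input (the trigger
# point); otherwise the rule model stays silent and the neural modules answer.
# The entity vocabulary and templates below are illustrative, not from the patent.
ENTITY_TEMPLATES = {
    "weather": "I heard the weather is nice today. What about {entity}?",
    "movie": "Oh, {entity}? I have wanted to watch that movie for a while!",
}

def extract_entities(sentence, vocab=ENTITY_TEMPLATES):
    # Stand-in for the RUCNLP entity extractor: naive substring matching.
    return [w for w in vocab if w in sentence]

def basic_reply(sentence):
    entities = extract_entities(sentence)
    if not entities:  # no trigger: fall back to the generative modules
        return None
    entity = entities[0]
    return ENTITY_TEMPLATES[entity].format(entity=entity)
```

In use, a `None` result signals that the rule model did not trigger and the Seq2Seq modules should produce the reply instead.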
The multi-model emotion reply generation module and the single-model emotion reply generation module generate models with emotion categories through a Seq2Seq model comprising an encoder and a decoder. The encoder converts the input sentence X into a sequence of intermediate-state vectors H = (h_1, h_2, ..., h_n), and the decoder decodes this intermediate representation H into the output sentence Y of the emotion model. This is typically implemented with long short-term memory units (LSTM) or gated recurrent units (GRU); this embodiment is described using a GRU as an example. The GRU is controlled by an update gate and a reset gate, computed as follows:
z = σ(W_z x_t + U_z h_{t-1})
r = σ(W_r x_t + U_r h_{t-1})
s = tanh(W_s x_t + U_s (h_{t-1} ⊙ r))
h_t = (1 - z) ⊙ h_{t-1} + z ⊙ s
where z is the update gate output; r is the reset gate output; s is the candidate cell state vector; tanh(·) and σ(·) are activation functions; ⊙ denotes the element-wise product of vectors; and W_z, W_r, W_s, U_z, U_r, U_s are the parameter matrices of the different gates, which map the input vector x_t at time t and the intermediate state h_{t-1} at the previous time into the same semantic space. Word vectors are randomly initialized and trained together with the model.
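The gate equations can be turned into a minimal pure-Python GRU step. For readability this sketch uses per-dimension (diagonal) weights instead of full parameter matrices, and it assumes the standard GRU candidate-state and interpolation equations for the parts the text only names through W_s and U_s:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x_t, h_prev, W, U):
    """One GRU update with per-dimension (diagonal) weights.

    W and U are dicts with keys 'z', 'r', 's' holding per-dimension
    weights, a simplification of the full matrices W_z, U_z, and so on.
    """
    n = len(h_prev)
    z = [sigmoid(W['z'][k] * x_t[k] + U['z'][k] * h_prev[k]) for k in range(n)]
    r = [sigmoid(W['r'][k] * x_t[k] + U['r'][k] * h_prev[k]) for k in range(n)]
    # candidate state s: the reset gate decides how much history leaks in
    s = [math.tanh(W['s'][k] * x_t[k] + U['s'][k] * (r[k] * h_prev[k]))
         for k in range(n)]
    # update gate interpolates between the old state and the candidate
    return [(1.0 - z[k]) * h_prev[k] + z[k] * s[k] for k in range(n)]
```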
The encoder and decoder computation can be expressed as:
h_t = GRU_encoder(x_t, h_{t-1})
s_t = GRU_decoder(y_{t-1}, s_{t-1})
p(y_t | s_t) = softmax(W_o s_t)
where p(y_t | s_t) is the generation probability over word vectors at decoder time t, and the word with the maximum probability is taken as the currently generated word y_t; h_t and s_t are the intermediate hidden states of the encoder and the decoder at time t; and W_o is a parameter matrix that maps the decoder state s_t into the vocabulary space at output time.
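The encoder-decoder loop can be sketched as a greedy decoding routine; `gru_encoder`, `gru_decoder` and `output_proj` are hypothetical stand-ins for the trained components:

```python
# Greedy Seq2Seq decoding as described: encode x_1..x_n, then at each step
# feed the previously generated word back in and pick the argmax of
# softmax(W_o s_t). The three callables are placeholders for trained parts.
def greedy_decode(x_seq, gru_encoder, gru_decoder, output_proj,
                  start_token, end_token, max_len=20):
    h = None
    for x_t in x_seq:                 # encoder pass over the input
        h = gru_encoder(x_t, h)
    s, y, out = h, start_token, []    # decoder starts from the final h_n
    for _ in range(max_len):
        s = gru_decoder(y, s)
        probs = output_proj(s)        # softmax(W_o s_t) over the vocabulary
        y = max(range(len(probs)), key=probs.__getitem__)
        if y == end_token:
            break
        out.append(y)
    return out
```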
Since the encoding process uses only the last encoder output h_n as the representation of the input sentence, and since at each time t the decoder output depends only on the previous state s_{t-1} and the word vector y_{t-1} of the previously generated word, the other information of the input sentence is neither fully utilized nor fully expressed. An attention mechanism is therefore introduced to enrich the input information of the decoding process. With attention, the decoder decodes using the following formulas:
e_ij = s_{i-1}^T W_a h_j
α_ij = exp(e_ij) / Σ_{k=1}^{n} exp(e_ik)
c_i = Σ_{j=1}^{n} α_ij h_j
s_i = GRU_decoder(y_{i-1}, s_{i-1}, c_i)
where i indexes decoder time steps and j encoder time steps; s_i is the hidden state of the decoder at time i in the decoding process; h_j is the intermediate state of the encoder at time j in the encoding process; e_ij is the attention importance computed from the decoder hidden state s_{i-1} at the previous time step and the encoder intermediate states h_j at the different times j, with W_a a learned parameter matrix; α_ij is the normalized attention weight assigned to the encoder intermediate state at time j; n is the length of the input; c_i is the context vector obtained as the attention-weighted sum of all encoder intermediate states; and y_i is the word vector of the word generated at time i.
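A sketch of the attention computation follows. The exact form of the score e_ij is not fully spelled out in the text, so this assumes the common bilinear form e_ij = s_{i-1}^T W_a h_j:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [v / total for v in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_context(s_prev, H, W_a):
    """Bilinear attention: e_ij = s_{i-1}^T W_a h_j.

    s_prev : previous decoder state s_{i-1} (list of floats)
    H      : encoder states [h_1, ..., h_n]
    W_a    : parameter matrix as a list of rows
    """
    # score every encoder state against the previous decoder state
    Wh = [[dot(row, h_j) for row in W_a] for h_j in H]  # W_a h_j
    e = [dot(s_prev, wh) for wh in Wh]                  # e_ij
    alpha = softmax(e)                                  # normalized weights
    # context c_i: attention-weighted sum of all encoder states
    dim = len(H[0])
    c = [sum(alpha[j] * H[j][k] for j in range(len(H))) for k in range(dim)]
    return alpha, c
```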
As shown in fig. 2, the multi-model emotion reply generation module includes a plurality of Seq2Seq models corresponding to the emotion categories, generated as follows. First, a general model is trained on the whole corpus, and different generation models are then obtained by fine-tuning it on different subdivided data sets. Second, on the basis of the general model, data sets of the two emotion polarities are used for training, yielding a positive emotion model and a negative emotion model. Finally, on the basis of the positive and negative emotion models respectively, data sets of the individual emotion categories are used for training, yielding one model per category: the positive emotion model is fine-tuned into a happy model and a like model, and the negative emotion model into a disgust model, a sad model and an anger model.
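The three-level fine-tuning tree can be sketched as follows; `train` and `finetune` are hypothetical stand-ins for real Seq2Seq training, and a "model" is represented simply by the list of corpora it has seen so that each emotion model's lineage is visible:

```python
# Sketch of the hierarchical fine-tuning of the multi-model module.
def train(corpus):
    return [corpus]

def finetune(model, corpus):
    return model + [corpus]

def build_emotion_models(corpora):
    """corpora maps emotion name -> corpus tag for the 5 emotion categories."""
    general = train("all")                    # level 1: whole corpus
    positive = finetune(general, "positive")  # level 2: happy + like data
    negative = finetune(general, "negative")  # level 2: disgust + sad + anger
    return {                                  # level 3: one model per emotion
        "happy":   finetune(positive, corpora["happy"]),
        "like":    finetune(positive, corpora["like"]),
        "disgust": finetune(negative, corpora["disgust"]),
        "sad":     finetune(negative, corpora["sad"]),
        "anger":   finetune(negative, corpora["anger"]),
    }
```

Each leaf model thus inherits general fluency from level 1 and polarity-specific style from level 2 before seeing its own small emotion corpus.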
Compared with earlier models, the model in this embodiment effectively improves the accuracy of replies for emotion categories whose data quality is low (such as anger and disgust). Compared with replies of positive emotions such as happy and like, replies with negative emotions such as anger or disgust not only have less data but also express emotion more subtly. For a computer it is therefore easier to learn to generate replies with positive emotions than replies with emotions such as anger or disgust, since the positive-emotion data is larger and its emotion expression more explicit. For this reason it is notable that the model in this embodiment achieves good performance in generating replies for negative emotions, especially the anger and disgust categories.
Traditional Seq2Seq models tend to generate generalized, generic replies; although such sentences have some fluency, they cannot give the reply a specific emotion in an emotion dialogue. A single-model emotion reply generation module is therefore introduced in this embodiment. It comprises only one Seq2Seq model, in which the emotion category i is converted into an emotion vector e_i; e_i is added into the decoding process of the decoder, so that every reply generated by the decoder carries the emotion-category information and the reply develops toward a certain emotion during generation. The emotion vector is randomly initialized and continuously updated during learning, so that the Seq2Seq model comes to contain the emotion-category information. The corresponding decoder computation is:
s_t = GRU_decoder(y_{t-1}, s_{t-1}, c_t, e_i)
After the emotion vector is added, the model can perceive the emotion-category information when generating a reply, but emotion expression is usually carried by specific emotion words, so a copy mechanism is adopted to raise the generation probability of emotion words during generation. The specific process is as follows: the words obtained from the input are divided into emotion words and non-emotion words; the emotion words are converted into emotion word vectors, which interact with the current hidden state s_t of the decoder to give the generation probability of the emotion words; this probability is then added to the ordinary generation probability, so that the probability of generating emotion words in the reply increases. The generation probability of the decoder is expressed as follows:
p(y_t | s_t) = softmax(p_ori(y_t | s_t) + p_copy(y_t | s_t, E))
p_copy(y_t | s_t, E) = softmax(E W_e s_t)
where E is the matrix of word vectors of all emotion words; W_e is a learned parameter matrix; y_t is the index of the word generated at time t; s_t is the intermediate state vector of the decoder at time t; p_ori is the generation probability of word y_t under state s_t for the original decoder; and p_copy is the additional copy probability of emotion word y_t under state s_t, which is 0 when y_t is not an emotion word.
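A hedged sketch of the copy-augmented generation probability. It reads p_ori as the decoder's pre-softmax vocabulary scores, which is one possible interpretation of the formulas above; all names and shapes are illustrative:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    s = sum(exps)
    return [v / s for v in exps]

def copy_augmented_probs(ori_scores, s_t, E, W_e, emotion_ids):
    """Combine ordinary decoder scores with the copy-mechanism bonus.

    ori_scores  : decoder scores over the vocabulary (before softmax)
    s_t         : current decoder state
    E           : emotion-word vectors, one row per emotion word
    W_e         : parameter matrix rows, so that p_copy = softmax(E W_e s_t)
    emotion_ids : vocabulary index of each emotion word (row of E)
    """
    # p_copy over the emotion words only
    We_s = [sum(W_e[r][k] * s_t[k] for k in range(len(s_t)))
            for r in range(len(W_e))]
    copy_scores = [sum(e_row[r] * We_s[r] for r in range(len(We_s)))
                   for e_row in E]
    p_copy_emotion = softmax(copy_scores)
    # scatter the copy probability to vocabulary positions; non-emotion
    # words get 0, as p_copy is 0 when y_t is not an emotion word
    p_copy = [0.0] * len(ori_scores)
    for idx, p in zip(emotion_ids, p_copy_emotion):
        p_copy[idx] = p
    combined = [o + c for o, c in zip(ori_scores, p_copy)]
    return softmax(combined)
```

With equal base scores, the emotion word receives the extra copy mass and ends up the most probable token.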
Given a text and an emotion as input, the model of the single-model emotion reply generation module thus tends to generate a reply carrying that emotion.
A reply candidate database is established for all replies generated by the emotion dialogue generation module, including the basic reply and the replies with emotion. To select the best reply from this database for output, this embodiment introduces a reordering module.
The scoring mechanism of the reordering module includes an emotion consistency score and a semantic consistency score.
The specific determination process for emotion consistency scores is as follows:
based on the emotion vocabulary ontology library released by university of Connect and the result of chi-square clustering according to different emotion text data, different emotion dictionaries are constructed for different emotion types. Each emotion dictionary gives emotion words under the emotion category and emotion scores corresponding to the emotion words. The score of the emotion word combines the weight given by the emotion vocabulary ontology library and the word frequency in the data set, and reflects the importance degree of the emotion word for expressing the emotion. In general, explicit emotion words have a higher score than implicit emotion words. For example, in the emotion dictionary of the happy emotion category, "happy" has a higher score than "winning". According to the emotion dictionary, emotion scores corresponding to each emotion word can be obtained. The emotion score of a sentence is the sum of emotion scores of emotion words appearing in the sentence.
Furthermore, the expression of emotion may be enhanced, weakened or reversed by degree adverbs such as "very", "a bit" and "not". To reflect the impact of these adverbs on emotion expression, this embodiment classifies them by degree level and gives them different weights. Degree words that enhance emotion expression, such as "very", have a weight greater than 1 and increase the emotion score after multiplication; words that weaken emotion expression, such as "a bit", have a weight less than 1 and decrease the score; words that reverse emotion expression, such as "not", have a weight of -1 and flip the sign of the score. A double negative multiplies -1 by -1 and leaves the emotion score unchanged, i.e., a double negative amounts to an affirmation.
In summary, the emotion consistency score is calculated according to the following formula:

E(y) = Σ_{m=1}^{M} E_m, with E_m = γ_m · w_m · Π_{j∈index(m-1,m)} q_{y_j},

where M is the number of emotion words; E(y) and E_m represent the emotion scores of the candidate reply y and of the emotion word m, respectively; index(m-1, m) represents the span from the previous emotion word to the current emotion word; q_{y_j} represents the weight score of a degree adverb y_j within that span; w_m represents the weight score of the emotion word m in the emotion dictionary; γ_m indicates whether the emotion word m is in the corresponding emotion dictionary: it is set to 1 if the word is in the emotion dictionary of the corresponding emotion category, indicating a positive contribution to expressing that category (such as "happiness" appearing in the happy category), and to -1 otherwise, indicating a negative contribution to expressing that category (such as "sadness" appearing in the happy category).
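As a minimal sketch of this scoring rule, the following Python snippet accumulates adverb weights between successive emotion words and multiplies them into each emotion word's dictionary weight. The dictionary contents and adverb weights below are illustrative stand-ins (the real values come from the emotion vocabulary ontology library and corpus word frequencies), and γ_m is fixed to +1, i.e. scoring against a single target-emotion dictionary.

```python
EMOTION_DICT = {"happy": 2.0, "winning": 1.0}  # w_m: emotion-word weights (illustrative)
ADVERB_WEIGHTS = {"very": 1.5, "slightly": 0.5, "not": -1.0}  # q: degree-adverb weights

def emotion_score(tokens, emotion_dict=EMOTION_DICT, adverb_weights=ADVERB_WEIGHTS):
    """Sum over emotion words m of (product of adverb weights seen since
    the previous emotion word) * w_m; gamma_m is fixed to +1 here."""
    score, adverb_product = 0.0, 1.0
    for tok in tokens:
        if tok in emotion_dict:
            score += adverb_product * emotion_dict[tok]
            adverb_product = 1.0  # reset the adverb span for the next emotion word
        elif tok in adverb_weights:
            adverb_product *= adverb_weights[tok]
    return score

# "not very happy": (-1.0) * 1.5 * 2.0 = -3.0 (reversal then enhancement)
print(emotion_score(["not", "very", "happy"]))
```

Note how the double negative works out automatically: "not not happy" scores (-1) × (-1) × 2.0 = 2.0, the same as "happy" alone.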
Semantic consistency score
The present embodiment uses term overlap as the semantic consistency score, encouraging the model to generate replies whose content is consistent with the input. Specifically, the number of terms shared by the two sentences is taken as the semantic consistency score, computed by the following formula:
T(y)=Count(x,y)
wherein Count(x, y) is the number of terms shared by the input sentence x and the candidate reply y.
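A one-function sketch of this overlap count follows. Counting over unique terms is an assumption here; the text does not specify how repeated terms are handled.

```python
def count_shared_terms(x_tokens, y_tokens):
    """Count(x, y): number of terms shared by input x and candidate reply y,
    counted over unique terms (an assumption not fixed by the text)."""
    return len(set(x_tokens) & set(y_tokens))

print(count_shared_terms(["i", "love", "sunny", "days"],
                         ["sunny", "days", "are", "great"]))  # shares "sunny", "days"
```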
The reordering module combines the emotion consistency score E(y) and the semantic consistency score T(y) into a total score:
Φ(y)=λ·E(y)+(1-λ)·T(y)
where Φ (y) is the total score used for ranking, λ is the weight value that adjusts both scores.
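The linear combination and the reranking step can be sketched as follows; the candidate triples and the λ value are illustrative.

```python
def total_score(E_y, T_y, lam=0.5):
    """Phi(y) = lambda * E(y) + (1 - lambda) * T(y)."""
    return lam * E_y + (1 - lam) * T_y

def rerank(candidates, lam=0.5):
    """candidates: list of (reply, E(y), T(y)) triples.
    Returns the reply with the highest total score."""
    return max(candidates, key=lambda c: total_score(c[1], c[2], lam))[0]

candidates = [("base reply", 0.0, 3.0),
              ("emotional reply", 4.0, 1.0)]
# With lam=0.7 the emotion score dominates: 0.7*4 + 0.3*1 = 3.1 > 0.9
print(rerank(candidates, lam=0.7))
```

Raising λ favors emotionally consistent candidates; lowering it favors candidates that echo more of the input's content.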
Example two
Based on the same inventive concept, the embodiment discloses an emotion dialogue generation method, which comprises the following steps:
s1, generating basic replies with correct semantics by detecting entity words of input sentences;
s2, establishing emotion models corresponding to various emotions through hierarchical training, and fine-tuning basic replies based on the emotion models to obtain replies containing the emotions;
s3, training a single emotion model that takes the emotion type as input, and outputting emotion replies according to the emotion model;
s4, receiving the replies output in the steps S1-S3, scoring the output replies, and rearranging according to the scores, wherein the reply with the highest score is the final emotion reply.
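The pipeline S1–S4 can be wired together as below. The three generator functions are placeholders for the rule-based module and the two Seq2Seq-based modules described above; their replies are illustrative.

```python
def generate_basic_reply(sentence):               # S1 (placeholder)
    return "That sounds interesting."

def generate_multi_model_reply(sentence, emotion):   # S2 (placeholder)
    return "I am so happy to hear that!"

def generate_single_model_reply(sentence, emotion):  # S3 (placeholder)
    return "Great news, congratulations!"

def emotion_dialogue(sentence, emotion, score_fn):
    """S4: collect the three candidates and return the highest-scoring one.
    score_fn stands in for the reordering module's total score Phi."""
    candidates = [
        generate_basic_reply(sentence),
        generate_multi_model_reply(sentence, emotion),
        generate_single_model_reply(sentence, emotion),
    ]
    return max(candidates, key=score_fn)
```

In the actual system, `score_fn` would be the combined emotion/semantic consistency score rather than a generic callable.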
The foregoing is merely illustrative of the present application and does not limit it; any variation or substitution readily conceivable by a person skilled in the art within the technical scope disclosed herein shall fall within the scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (9)

1. An emotion conversation generation system, comprising: an emotion dialogue generation module and a reordering module,
the emotion dialogue generation module comprises a basic reply generation module, which generates a basic reply with correct semantics by detecting entity words of an input sentence; a multi-model emotion reply generation module, which establishes emotion models corresponding to the various emotions through hierarchical training and fine-tunes the basic reply based on the emotion models to obtain emotion replies; and a single-model emotion reply generation module, which trains an emotion model that takes the emotion type as input and outputs emotion replies according to the emotion model;
the reordering module receives the replies output by the three sub-modules of the dialogue generation module, scores the output replies, and reorders them according to the scores; the reply with the highest score is the final emotion reply;
the multi-model emotion response generation module comprises a plurality of Seq2Seq models corresponding to emotion categories, wherein the single-model emotion response generation module only comprises one Seq2Seq model, and in the Seq2Seq model, a copying mechanism is adopted to improve the generation probability of emotion words in the response generation process, and the specific process is as follows: dividing words obtained from all input semantics into emotion words and non-emotion words, converting the emotion words into emotion word vectors, and combining all emotion word vectors with the current implicit state s of a decoder t Interaction is carried out, the generation probability of the emotion words is obtained, and then the generation probability is added with the additionally increased probability generated by the copying mechanism, so that the generation probability of the emotion words in the reply generation process is improved, and the generation probability of the decoder is expressed as the following formula:
p(y_t | s_t) = softmax(p_ori(y_t | s_t) + p_copy(y_t | s_t, E))
p_copy(y_t | s_t, E) = softmax(E W_e s_t)
wherein E is the matrix of word vectors of all emotion words; W_e is a learned parameter matrix; y_t is the index of the word generated at time t; s_t is the intermediate state vector of the decoder at time t; p_ori is the original generation probability of the word y_t given s_t in the decoder; p_copy is the additional copy probability of the emotion word y_t given s_t, and p_copy is 0 when y_t is not an emotion word.
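Purely as an illustration (not part of the claim), the copy-style boost above can be sketched in NumPy. The shapes and the restriction of p_copy to emotion-word vocabulary indices are assumptions consistent with the formulas.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def decode_step_with_copy(p_ori_logits, E, W_e, s_t, emotion_ids, vocab_size):
    """One decoding step: p(y_t|s_t) = softmax(p_ori + p_copy), where
    p_copy = softmax(E @ W_e @ s_t) is spread over the emotion-word ids
    and is zero elsewhere. Assumed shapes: E (k, d_e), W_e (d_e, d_s), s_t (d_s,)."""
    p_copy_emotion = softmax(E @ W_e @ s_t)   # (k,) over the k emotion words
    p_copy = np.zeros(vocab_size)
    p_copy[emotion_ids] = p_copy_emotion      # 0 for non-emotion words
    return softmax(p_ori_logits + p_copy)

# Toy check: boost emotion-word ids 1 and 3 in a 5-word vocabulary.
rng = np.random.default_rng(0)
p = decode_step_with_copy(rng.normal(size=5),
                          rng.normal(size=(2, 4)),   # k=2 emotion words, d_e=4
                          rng.normal(size=(4, 3)),   # d_s=3
                          rng.normal(size=3),
                          emotion_ids=[1, 3], vocab_size=5)
print(p.sum())  # a valid distribution over the vocabulary
```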
2. The emotion conversation generation system of claim 1, wherein the basic reply generation module identifies entity words having actual meaning in the input sentence based on a rule generation model in which manually composed reply templates are embedded; a reply template is determined according to the entity words, and a semantically correct basic reply is determined from the reply template.
3. The emotion conversation generation system of claim 1 or 2, wherein said multi-model emotion reply generation module and single-model emotion reply generation module each generate emotion-category models through a Seq2Seq model, said Seq2Seq model comprising an Encoder and a Decoder, the Encoder converting the input sentence into a dense intermediate state vector H = (h_1, h_2, …, h_n), and the Decoder decoding this intermediate state vector H into the output sentence Y of the emotion model.
4. The emotion dialog generation system of claim 3, wherein an attention mechanism is introduced in the Seq2Seq model for enriching the input information of the decoder, and the decoder with the attention mechanism decodes using the following formulas:

s_i = GRU_decoder(y_{i-1}, s_{i-1}, c_i)
e_ij = s_{i-1} W_a h_j
α_ij = exp(e_ij) / Σ_{k=1}^{n} exp(e_ik)
c_i = Σ_{j=1}^{n} α_ij h_j

where i indexes the decoder time steps and j the encoder time steps; s_i is the hidden state of the decoder at each time i during decoding; h_j is the vector representation at time j of the intermediate states H produced during encoding; e_ij is the attention importance computed from the decoder hidden state s_{i-1} at the previous time step and the encoder intermediate state h_j, with W_a a learned parameter matrix; α_ij is the weight assigned to the encoder intermediate state at each time step, obtained by normalizing the importances through the attention mechanism; n is the length of the input; c_i is the vector representation of the context information, computed by weighting and summing all encoder intermediate states with the attention weights; and y_i is the word vector of the word generated at time i.
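As an illustration only, the bilinear attention step above can be sketched in NumPy; the shapes are assumptions matching the variable descriptions.

```python
import numpy as np

def attention_context(s_prev, H, W_a):
    """Bilinear attention: e_ij = s_{i-1} W_a h_j, alpha = softmax_j(e_ij),
    c_i = sum_j alpha_ij h_j. Assumed shapes: s_prev (d_s,), H (n, d_h),
    W_a (d_s, d_h)."""
    e = H @ (W_a.T @ s_prev)              # (n,) attention importances e_ij
    e = e - e.max()                       # numerical stability
    alpha = np.exp(e) / np.exp(e).sum()   # (n,) normalized weights alpha_ij
    c = alpha @ H                         # (d_h,) context vector c_i
    return alpha, c

rng = np.random.default_rng(1)
alpha, c = attention_context(rng.normal(size=3),        # d_s = 3
                             rng.normal(size=(4, 5)),   # n = 4 encoder steps, d_h = 5
                             rng.normal(size=(3, 5)))
print(alpha.sum())  # attention weights sum to 1
```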
5. The emotion conversation generation system of claim 3, wherein said multi-model emotion reply generation module comprises a plurality of Seq2Seq models corresponding to the emotion categories, said Seq2Seq models being generated as follows: first, a general model is trained on the whole corpus; second, the corpus is divided by emotion type into a positive emotion corpus and a negative emotion corpus, and the general model is fine-tuned on each, yielding a positive emotion model and a negative emotion model; finally, the positive emotion corpus is divided into a happy emotion corpus and a liking emotion corpus, and the positive emotion model is fine-tuned on each, yielding a model corresponding to happiness and a model corresponding to liking; the negative emotion corpus is divided into disgust, sadness, and anger emotion corpora, and the negative emotion model is fine-tuned on each, yielding a model corresponding to disgust, a model corresponding to sadness, and a model corresponding to anger.
6. The emotion conversation generation system of claim 4, wherein only one Seq2Seq model is included in said single-model emotion reply generation module, said Seq2Seq model converting the emotion category i into an emotion vector e_i and adding the emotion vector e_i into the decoding process of the decoder, so that the Seq2Seq model contains emotion category information.
7. The emotional dialog generation system of claim 1 or 2, wherein the scoring mechanism of the reordering module comprises an emotion consistency score and a semantic consistency score,
the emotion consistency score builds a different emotion dictionary for each emotion type, each emotion dictionary giving the emotion words under its emotion category and the emotion score corresponding to each emotion word; the emotion consistency score is calculated as follows:
E(y) = Σ_{m=1}^{M} E_m, with E_m = γ_m · w_m · Π_{j∈index(m-1,m)} q_{y_j},

wherein M is the number of emotion words; E(y) and E_m represent the emotion scores of the candidate reply y and of the emotion word m, respectively; index(m-1, m) represents the span from the previous emotion word to the current emotion word; q_{y_j} represents the weight score of a degree adverb y_j within said span; w_m represents the weight score of the emotion word m in the emotion dictionary; γ_m indicates whether the emotion word m is in the corresponding emotion dictionary, being set to 1 if the word is in the emotion dictionary of the corresponding emotion category, indicating a positive contribution to the expression of that category, and to -1 otherwise, indicating a negative contribution to the expression of that category;
the formula of the semantic consistency score is as follows:
T(y)=Count(x,y)
wherein Count (·) is the number of identical terms of two sentences.
8. The emotion conversation generation system of claim 7, wherein the reordering module combines the emotion consistency score E(y) and the semantic consistency score T(y) into a total score, the total score being:
Φ(y)=λ·E(y)+(1-λ)·T(y)
where Φ (y) is the total score used for ranking, λ is the weight value that adjusts both scores.
9. An emotion dialogue generation method, characterized by being used for an emotion dialogue generation system according to any one of claims 1 to 8, comprising the steps of:
s1, generating basic replies with correct semantics by detecting entity words of input sentences;
s2, establishing emotion models corresponding to various emotions through hierarchical training, and fine-tuning the basic reply based on the emotion models to obtain a reply containing emotion;
s3, training a single emotion model that takes the emotion type as input, and outputting emotion replies according to the emotion model;
s4, receiving the replies output in the steps S1-S3, scoring the output replies, and rearranging according to the scores, wherein the reply with the highest score is the final emotion reply.
CN202010074840.8A 2020-01-22 2020-01-22 Emotion dialogue generation system and method Active CN111241250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010074840.8A CN111241250B (en) 2020-01-22 2020-01-22 Emotion dialogue generation system and method


Publications (2)

Publication Number Publication Date
CN111241250A CN111241250A (en) 2020-06-05
CN111241250B (en) 2023-10-24

Family

ID=70866275

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010074840.8A Active CN111241250B (en) 2020-01-22 2020-01-22 Emotion dialogue generation system and method

Country Status (1)

Country Link
CN (1) CN111241250B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112417125B (en) * 2020-12-01 2023-03-24 南开大学 Open domain dialogue reply method and system based on deep reinforcement learning
CN112530415B (en) * 2021-02-10 2021-07-16 北京百度网讯科技有限公司 Negative reply recognition model acquisition and negative reply recognition method and device
CN112818090B (en) * 2021-02-24 2023-10-03 中国人民大学 Method and system for generating answer questions and questions based on harmonic words
CN112905776B (en) * 2021-03-17 2023-03-31 西北大学 Emotional dialogue model construction method, emotional dialogue system and method
CN113139042B (en) * 2021-04-25 2022-04-29 内蒙古工业大学 Emotion controllable reply generation method using fine-tuning and reordering strategy
CN113360614A (en) * 2021-05-31 2021-09-07 多益网络有限公司 Method, device, terminal and medium for controlling reply emotion of generating type chat robot
CN114610861B (en) * 2022-05-11 2022-08-26 之江实验室 End-to-end dialogue method integrating knowledge and emotion based on variational self-encoder

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960277A (en) * 2017-05-19 2018-12-07 百度(美国)有限责任公司 Cold fusion is carried out to sequence to series model using language model
CN109635253A (en) * 2018-11-13 2019-04-16 平安科技(深圳)有限公司 Text style conversion method, device and storage medium, computer equipment
CN109800295A (en) * 2019-01-11 2019-05-24 南京信息工程大学 The emotion session generation method being distributed based on sentiment dictionary and Word probability
CN109977201A (en) * 2019-01-28 2019-07-05 平安科技(深圳)有限公司 Machine chat method, device, computer equipment and storage medium with emotion
CN110427490A (en) * 2019-07-03 2019-11-08 华中科技大学 A kind of emotion dialogue generation method and device based on from attention mechanism
WO2019235103A1 (en) * 2018-06-07 2019-12-12 日本電信電話株式会社 Question generation device, question generation method, and program



Similar Documents

Publication Publication Date Title
CN111241250B (en) Emotion dialogue generation system and method
CN110782870B (en) Speech synthesis method, device, electronic equipment and storage medium
Yao et al. An improved LSTM structure for natural language processing
Cahn CHATBOT: Architecture, design, & development
Wu et al. Emotion recognition from text using semantic labels and separable mixture models
CN111145718B (en) Chinese mandarin character-voice conversion method based on self-attention mechanism
WO2019000170A1 (en) Generating responses in automated chatting
CN111460132B (en) Generation type conference abstract method based on graph convolution neural network
Colombo Learning to represent and generate text using information measures
CN112417894A (en) Conversation intention identification method and system based on multi-task learning
CN109308316B (en) Adaptive dialog generation system based on topic clustering
CN112131367A (en) Self-auditing man-machine conversation method, system and readable storage medium
CN112818106A (en) Evaluation method of generating type question and answer
CN113239666A (en) Text similarity calculation method and system
CN114911932A (en) Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement
Bird et al. Optimisation of phonetic aware speech recognition through multi-objective evolutionary algorithms
Lin Reinforcement learning and bandits for speech and language processing: Tutorial, review and outlook
CN112948558B (en) Method and device for generating context-enhanced problems facing open domain dialog system
CN114328866A (en) Strong anthropomorphic intelligent dialogue robot with smooth and accurate response
CN111949762B (en) Method and system for context-based emotion dialogue and storage medium
CN110046239B (en) Dialogue method based on emotion editing
Kondurkar et al. Modern Applications With a Focus on Training ChatGPT and GPT Models: Exploring Generative AI and NLP
CN116303966A (en) Dialogue behavior recognition system based on prompt learning
Dilawari et al. Neural attention model for abstractive text summarization using linguistic feature space
CN112199503B (en) Feature-enhanced unbalanced Bi-LSTM-based Chinese text classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant