CN112925896A - Topic extension emotional dialogue generation method based on joint decoding - Google Patents
- Publication number
- CN112925896A (application CN202110364233.XA)
- Authority
- CN
- China
- Prior art keywords
- emotion
- content
- subject
- decoder
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a topic-extension emotional dialogue generation method based on joint decoding, belonging to the field of natural language processing. The method comprises the following steps: a joint content-and-topic attention vector is sent to the content decoding unit of the decoder, which improves the topic consistency of the generated reply; the specified emotion category is embedded as an additional input into the emotion decoding unit of the decoder, so that the generated reply contains richer emotional wording while the loss of content relevance is alleviated to a certain extent. The invention can not only generate dialogue of a specific emotion category but also ensure that the generated reply stays under the same topic as the input dialogue.
Description
Technical Field
The invention relates to the technical field of non-task-oriented dialogue generation systems, and in particular to a topic-extension emotional dialogue reply generation method based on joint decoding.
Background
In recent years, as human-machine dialogue systems have advanced, their range of application has expanded. Early task-oriented dialogue systems helped customers meet specific goals but could not respond to users on a deeper emotional level. Today, deep learning techniques are used to build open-domain, non-task-oriented dialogue systems, and researchers treat emotion perception as an important component: the agent should chat with the user in daily conversation, perceive the user's emotion, and respond with an appropriate emotional reaction. Zhou et al. generated various emotional responses under emotion supervision in dialogue, showing that emotion is a high-level abstract expression and that a dialogue system with emotion improves user experience and satisfaction; user needs have accordingly shifted from rich reply content toward deeper, mental-level communication. Furthermore, daily chat shows that conversation between people involves not only the content of each utterance but also consistency of the conversation topic. Many problems remain open, so improving the human-likeness and conversational ability of dialogue generation models is still a main research direction for dialogue systems.
In 2017, Zhou et al. first proposed the Emotional Chatting Machine (ECM), based on a memory network, which can generate different emotional responses according to a specified emotion category. On this basis, variants of the seq2seq dialogue generation model were later proposed: Peng et al. used a TE-ECG model with emotion computing to dynamically generate emotions for a specified emotion category; Yang et al. used a fusion module to expand topic words and thereby improve content richness; Liang et al. predicted appropriate emotions through a heterogeneous neural network combined with multimodal data; and Zhang et al. used multiple embedding fusion layers to generate high-quality content. However, these works neglect the weakening of emotional expression caused by adding topic information at the encoder. To solve this problem, we propose a method that both ensures the diversity of the chat topic and generates rich content under a specified emotion.
Disclosure of Invention
The invention aims to provide a topic-extension emotional dialogue generation method based on joint decoding, designed to overcome the defects of the prior art. First, topic words for the input sequence are obtained with a Twitter LDA model. Then, the joint content-and-topic attention is used as the input of the content decoding unit, ensuring that the generated content and the input content are consistent in topic relevance. Finally, the specified emotion category is embedded as an additional input into the emotion decoding unit of the decoder, reducing the degradation of content expression caused by adding emotion to the model. The invention is simple in design, concise, convenient and practical, and satisfies two requirements of a dialogue generation system: it can generate dialogue of a specific emotion category, and it ensures that the generated reply stays under the same topic as the input dialogue.
To achieve the above object, the present invention provides a new dialogue generation method, characterized by a generative model composed of an encoder, a joint attention mechanism and a decoder; generating a specific reply comprises the following steps.
Step one: semantic information of the input dialogue is obtained with a BiLSTM encoder, topic words are extracted with a Twitter LDA model, and the input content and topic words are weighted with an attention mechanism. When extracting topic words, the parameters of the Twitter LDA topic model are first estimated with a Gibbs sampling algorithm; the model then assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected (m is set to 10), and meaningless common words such as 'good' and 'our' are deleted.
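The keyword-selection part of this step can be sketched as follows. This is an illustrative example only: the word-probability table below is hand-made, not real Twitter LDA output, and `select_topic_words` is a hypothetical helper name.

```python
# Sketch of the topic-word selection step: given a word-probability
# distribution for the topic assigned to the source sequence, keep the
# top-m words and drop meaningless common words such as 'good' and 'our'.

def select_topic_words(topic_word_probs, m=10, stopwords=frozenset({"good", "our"})):
    """Return up to m highest-probability words, excluding stopwords."""
    ranked = sorted(topic_word_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [w for w, _ in ranked if w not in stopwords][:m]

# Hand-made probabilities standing in for an estimated topic-word distribution.
probs = {"music": 0.21, "good": 0.18, "concert": 0.15, "our": 0.12,
         "guitar": 0.10, "band": 0.08, "song": 0.07, "stage": 0.05}
print(select_topic_words(probs, m=5))
# → ['music', 'concert', 'guitar', 'band', 'song']
```

In the actual method the distribution would come from a Twitter LDA model whose parameters are estimated by Gibbs sampling; only the filtering and top-m ranking are shown here.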
Step two: the class of the specified emotion is embedded as an additional input into the emotion independent unit of the decoder. Each randomly initialized emotion category is represented using a low-dimensional vector one-hot.
Step three: and combining the attention mechanism of the content and the theme with the hidden state output by the content independent unit of the decoder and the output at the moment of i-1, splicing the hidden state and the output together, and sending the spliced hidden state and the output to the content independent unit of the decoder to ensure that the content related to the theme is output, and finally smoothly fusing the emotion independent unit and the content independent unit.
The invention has the beneficial effects that:
1) Practicality: by introducing topic information into the content decoding unit of the decoder, the generated dialogue is better aligned in topic relevance with the input dialogue. Meanwhile, emotion information is fed as additional input to the emotion decoding unit of the decoder, and finally the emotion decoding unit and the content decoding unit are smoothly fused. This not only ensures the diversity of chat content but also generates replies of the specified emotion category.
2) Correctness: the emotion decoding unit and the content decoding unit of the decoder carry the emotion category and the content of the dialogue respectively, so the generated reply contains richer emotional wording, and the loss of content relevance is alleviated to a certain extent.
3) Simplicity: the design is simple and the content concise, giving the method broad practical significance.
Drawings
FIG. 1 is a schematic diagram of a new generative model in an embodiment of the method of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, the dialog generating method of the present embodiment includes the following steps:
Step one: semantic information of the input dialogue is obtained with a BiLSTM encoder, topic words are extracted with a Twitter LDA model, and the input content and topic words are weighted with an attention mechanism. When extracting topic words, the parameters of the Twitter LDA topic model are first estimated with a Gibbs sampling algorithm; the model then assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected (m is set to 10), and meaningless common words such as 'good' and 'our' are deleted;
A given source-sequence dialogue message X = {x_1, x_2, ..., x_m} is mapped by word embedding to a target-sequence reply Y = {y_1, y_2, ..., y_m}; the generation probability of the target sequence Y is p(y_1, y_2, ..., y_m | x_1, x_2, ..., x_m). The forward LSTM encodes the word vectors into forward hidden states and the backward LSTM encodes them into backward hidden states; the two are spliced to obtain the final hidden state h_i. Meanwhile, unlike the traditional Seq2Seq model, attention is computed over each hidden state output by the encoder to obtain the content attention vector c_i;
To better account for topic relevance and diversity, a Twitter LDA topic model is introduced at the encoder to extract topic words as additional model input. Compared with the traditional attention mechanism, a dynamic attention mechanism combining topic and content is adopted, which strengthens the weight of topic words in reply generation, makes the topic vector more relevant to the content of the input message, and thus increases the probability of topic words appearing in the generated reply. The parameters of the Twitter LDA topic model are estimated with a Gibbs sampling algorithm; the model then assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected (m is set to 10), and meaningless common words such as 'good' and 'our' are deleted. Because a vector representation of each topic word is needed during learning, this distribution is used as the topic-word vector representation. Meanwhile, the word vectors of the topic words {topic_1, ..., topic_m} are passed through an attention computation to obtain the topic attention vector o_i. Finally, the splice of the content attention vector c_i and the topic attention vector o_i is sent to the decoder as the input of the content decoding unit;
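The joint content-and-topic attention input can be sketched as follows. Again an illustrative shape-level example: encoder states, topic-word vectors, and the shared query are random stand-ins, and sharing one query across both attentions is an assumption.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
d = 8
H = rng.normal(size=(6, d))       # encoder hidden states h_1..h_T
K = rng.normal(size=(10, d))      # topic-word vectors topic_1..topic_m (m = 10)
query = rng.normal(size=d)        # previous decoder state as the shared query

c_i = softmax(H @ query) @ H      # content attention vector
o_i = softmax(K @ query) @ K      # topic attention vector

# Splice of the two attention vectors: the input of the content decoding unit.
decoder_input = np.concatenate([c_i, o_i])
print(decoder_input.shape)        # → (16,)
```

Computing `o_i` over the topic-word vectors alongside `c_i` is what raises the weight of topic words at each decoding step relative to content-only attention.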
step two: embedding the category of the appointed emotion into an emotion independent unit of a decoder as an additional input, wherein each randomly initialized emotion category is represented by a low-dimensional vector one-hot;
While ensuring that the topic in the generated response stays consistent with the input via c_i, the generated reply should also carry emotion. Unlike the content decoding unit, the emotion decoding unit additionally takes the emotion category embedding d_(i-1) as input. A BiLSTM emotion classifier is used to label the dialogue data with six emotion categories: happiness, anger, sadness, like, disgust and other. Each emotion category is represented by a randomly initialized low-dimensional vector selected by its one-hot index, and the emotion category of the desired response is fed as additional input, so that the emotion decoding unit learns a high-level abstract representation of emotion under the guidance of the specified emotion category, reducing the loss of content and topic accuracy caused by adding emotional factors to the model;
step three: combining the attention mechanism of the content and the theme with the hidden state output by the content independent unit of the decoder and the output at the moment of i-1, splicing the hidden state and the output at the moment of i-1, and sending the spliced hidden state and the output to the content independent unit of the decoder so as to ensure that the content related to the theme is output, and finally smoothly fusing the emotion independent unit and the content independent unit;
The model uses two layers of GRUs as the joint decoder of content and emotion: the first-layer GRU splits the decoded hidden state into a content module and an emotion module, and the second-layer GRU fuses the hidden states of the two modules;
The first-layer decoding unit of the model consists of two GRUs: a content decoding unit, which keeps the content of the response consistent with the content of the input dialogue, and an emotion decoding unit, which merges all emotion categories into one unit so that different emotions can be distinguished by the specified emotion-category input and the ability to express emotion in the response can be learned. Finally, the decoder splices the hidden state s_g(i) of the content decoding unit and the hidden state s_a(i) of the emotion decoding unit at the i-th time step into the hidden state s_1(i) of the first-layer decoding unit;
The second-layer decoding unit of the model smoothly fuses the hidden states of the first-layer decoding unit and updates the state: the hidden state of the second-layer decoder at the i-th time step is computed by a GRU from its hidden state at step i-1 and the first-layer hidden state s_1(i); the final hidden state of the decoder is denoted s_2(i);
Finally, the decoder obtains the probability distribution over the target sequence through a fully connected layer followed by a Softmax function.
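The two-layer fusion and output projection described above can be sketched with a minimal NumPy GRU cell. All weights and states below are random stand-ins at illustrative sizes; the GRU gating follows the standard formulation, which the patent does not spell out.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gru_cell(x, h_prev, p):
    """Minimal standard GRU update; p holds the weight matrices."""
    z = 1 / (1 + np.exp(-(p["Wz"] @ x + p["Uz"] @ h_prev)))   # update gate
    r = 1 / (1 + np.exp(-(p["Wr"] @ x + p["Ur"] @ h_prev)))   # reset gate
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev))   # candidate state
    return (1 - z) * h_prev + z * h_tilde

rng = np.random.default_rng(3)
d, vocab = 4, 20
# W* act on the spliced 2d-dim input, U* on the d-dim recurrent state.
p = {k: rng.normal(scale=0.1, size=(d, 2 * d if k[0] == "W" else d))
     for k in ("Wz", "Wr", "Wh", "Uz", "Ur", "Uh")}

s_g = rng.normal(size=d)              # content decoding unit state s_g(i)
s_a = rng.normal(size=d)              # emotion decoding unit state s_a(i)
s1 = np.concatenate([s_g, s_a])       # first-layer state s_1(i)
s2_prev = rng.normal(size=d)          # second-layer state at step i-1
s2 = gru_cell(s1, s2_prev, p)         # fused final state s_2(i)

W_out = rng.normal(scale=0.1, size=(vocab, d))
probs = softmax(W_out @ s2)           # distribution over the vocabulary
print(s2.shape, probs.shape)
```

The splice-then-fuse structure lets the second-layer GRU trade off the content and emotion signals smoothly instead of choosing between two separate decoders at output time.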
Claims (2)
1. A new dialogue generation method, characterized in that the generated content not only carries the specified emotion category but also keeps its topic consistent with the input dialogue, and richer content can be generated in the response;
the method comprises the following steps:
Step one: semantic information of the input dialogue is obtained with a BiLSTM encoder, topic words are extracted with a Twitter LDA model, and the input content and topic words are weighted with an attention mechanism; when extracting topic words, the parameters of the Twitter LDA topic model are first estimated with a Gibbs sampling algorithm, the model then assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected (m = 10), and meaningless common words such as 'good' and 'our' are deleted;
a given source-sequence dialogue message is mapped by word embedding to a target-sequence response, and the hidden states of the forward-LSTM and backward-LSTM word vectors are spliced to obtain the final hidden state; meanwhile, unlike the traditional Seq2Seq model, attention is computed over each hidden state output by the encoder to obtain the content attention vector;
to better account for topic relevance and diversity, a Twitter LDA topic model is introduced at the encoder to extract topic words as additional model input; compared with the traditional attention mechanism, a dynamic attention mechanism combining topic and content is adopted, which strengthens the weight of topic words in reply generation, makes the topic vector more relevant to the content of the input message, and thus increases the probability of topic words appearing in the generated reply; the topic model assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected, and meaningless common words are deleted; because a vector representation of each topic word is needed during learning, this distribution is used as the topic-word vector representation; meanwhile, the topic attention vector is obtained from the topic-word vectors through an attention computation; finally, the spliced content attention vector and topic attention vector are sent to the decoder as the input of the content decoding unit;
step two: embedding the category of the appointed emotion into an emotion independent unit of a decoder as an additional input, wherein each randomly initialized emotion category is represented by a low-dimensional vector one-hot;
when the theme in the generated response is ensured to be consistent with the input, the generated reply is also provided with emotion, which is different from the content decoding unit that the emotion decoding unit adds emotion categories, and a BilSTM emotion classifier is used for labeling the dialogue data, and the emotion is divided into six types: happiness, anger, sadness, likes, dislikes and others; each randomly initialized emotion category is represented by using a low-dimensional vector one-hot, and the emotion category generating response is used as additional input, so that the emotion decoding unit learns the high-level abstract expression capability of the emotion under the guidance of the specified emotion category, and the inaccuracy of response to content and theme due to the addition of emotional factors in the model is reduced;
step three: combining the attention mechanism of the content and the theme with the hidden state output by the content independent unit of the decoder and the output at the moment of i-1, splicing the hidden state and the output at the moment of i-1, and sending the spliced hidden state and the output to the content independent unit of the decoder so as to ensure that the content related to the theme is output, and finally smoothly fusing the emotion independent unit and the content independent unit;
the model uses two layers of GRUs as joint decoders of content and emotion respectively, firstly, the first layer of GRU divides a decoded hidden state into two modules of content and emotion, and secondly, the second layer of GRU fuses the hidden states of the two modules;
the first layer decoding unit of the model is composed of two GRUs, one is a content decoding unit which enables the content in the response to be consistent with the content of the input dialogue, the other is an emotion decoding unit which synthesizes all kinds of emotions into one unit, and different emotions can be distinguished through the input of appointed emotion types, and the ability of emotion expression in the response can be learned; finally, the decoder splices the hidden states of the content decoding unit and the emotion decoding unit of the ith time step into the hidden state of the first layer decoding unit of the ith time step;
the second layer decoding unit of the model smoothly fuses the hidden states of the first layer decoding unit and updates the states, and the hidden state of the second layer decoder at the ith time step is obtained by calculating the hidden state at the step i-1 and the hidden state of the first layer decoder through a neural network GRU;
and finally, sequentially obtaining the probability distribution of the target sequence by the decoder through full connection and a Softmax function.
2. The dialogue generation method according to claim 1, characterized in that the degradation of content expression caused by added factors such as topic and emotion is reduced; the specific steps include:
weighting the input content and topic words with an attention mechanism; splicing the content attention vector, the topic attention vector, the final hidden state output by the decoder and the output at time i-1 together and sending them into the content decoding unit of the decoder, so that the weight of topic words in reply generation is strengthened, the topic vector is made more relevant to the content of the input message, and the probability of topic words appearing in the generated reply is further increased.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202110364233.XA (CN112925896A) | 2021-04-04 | 2021-04-04 | Topic extension emotional dialogue generation method based on joint decoding
Publications (1)
Publication Number | Publication Date
---|---
CN112925896A | 2021-06-08
Family
ID=76174094
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110364233.XA Pending CN112925896A (en) | 2021-04-04 | 2021-04-04 | Topic extension emotional dialogue generation method based on joint decoding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112925896A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180129931A1 (en) * | 2016-11-04 | 2018-05-10 | Salesforce.Com, Inc. | Quasi-recurrent neural network based encoder-decoder model |
US20180285348A1 (en) * | 2016-07-19 | 2018-10-04 | Tencent Technology (Shenzhen) Company Limited | Dialog generation method, apparatus, and device, and storage medium |
US20180300400A1 (en) * | 2017-04-14 | 2018-10-18 | Salesforce.Com, Inc. | Deep Reinforced Model for Abstractive Summarization |
CN111522924A (en) * | 2020-03-31 | 2020-08-11 | 华东师范大学 | Emotional chat type reply generation method with theme perception |
CN111949761A (en) * | 2020-07-06 | 2020-11-17 | 合肥工业大学 | Dialogue question generation method and system considering emotion and theme, and storage medium |
- 2021-04-04: CN application CN202110364233.XA filed; publication CN112925896A, status Pending
Non-Patent Citations (2)
Title |
---|
Peng Yehong: "Research on Emotional Dialogue Generation Technology Based on Topic Models and Variational Autoencoders" (基于主题模型与变分自编码的情感对话生成技术研究), China Master's Theses Full-text Database, 15 January 2020 (2020-01-15), pages 21-27 *
Li Meng: "Research on Emotional Dialogue Generation Models Based on Deep Learning" (基于深度学习的情感对话生成模型研究), China Master's Theses Full-text Database, 15 January 2020 (2020-01-15), pages 22-27 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Han et al. | Adversarial training in affective computing and sentiment analysis: Recent advances and perspectives | |
CN110717017B (en) | Method for processing corpus | |
Cassell et al. | Beat: the behavior expression animation toolkit | |
JP6889281B2 (en) | Analyzing electronic conversations for presentations in alternative interfaces | |
Cavazza et al. | Dialogue generation in character-based interactive storytelling | |
CN107870977A (en) | Chat robots output is formed based on User Status | |
Deldjoo et al. | Towards multi-modal conversational information seeking | |
CN113762322A (en) | Video classification method, device and equipment based on multi-modal representation and storage medium | |
CN113407663B (en) | Image-text content quality identification method and device based on artificial intelligence | |
GB2581943A (en) | Interactive systems and methods | |
CN113392261B (en) | Conversational music recommendation method based on film and television theme | |
Kao et al. | Model of multi-turn dialogue in emotional chatbot | |
CN116028846A (en) | Multi-mode emotion analysis method integrating multi-feature and attention mechanisms | |
CN112819933A (en) | Data processing method and device, electronic equipment and storage medium | |
Petrova | Meme language, its impact on digital culture and collective thinking | |
CN116894085A (en) | Dialog generation method and device, electronic equipment and storage medium | |
CN111522924A (en) | Emotional chat type reply generation method with theme perception | |
US20220253609A1 (en) | Social Agent Personalized and Driven by User Intent | |
CN117173497A (en) | Image generation method and device, electronic equipment and storage medium | |
CN117011875A (en) | Method, device, equipment, medium and program product for generating multimedia page | |
CN112925896A (en) | Topic extension emotional dialogue generation method based on joint decoding | |
CN116415596A (en) | Emotion support man-machine conversation method and system based on emotion strategy matching | |
KR20230130580A (en) | Autonomous generation, deployment, and personalization of real-time interactive digital agents | |
Jbene et al. | User sentiment analysis in conversational systems based on augmentation and attention-based bilstm | |
Wanner et al. | Towards a multimedia knowledge-based agent with social competence and human interaction capabilities |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication |
Application publication date: 20210608 |