CN112925896A - Topic extension emotional dialogue generation method based on joint decoding - Google Patents

Topic extension emotional dialogue generation method based on joint decoding

Info

Publication number
CN112925896A
CN112925896A (application CN202110364233.XA)
Authority
CN
China
Prior art keywords
emotion
content
subject
decoder
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110364233.XA
Other languages
Chinese (zh)
Inventor
肖乐
段梦诗
李清
杨卫东
李家馨
岳思雯
轩辕敏峥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University of Technology
Original Assignee
Henan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University of Technology filed Critical Henan University of Technology
Priority to CN202110364233.XA priority Critical patent/CN112925896A/en
Publication of CN112925896A publication Critical patent/CN112925896A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/3329 - Natural language query formulation or dialogue systems
    • G06F 16/3344 - Query execution using natural language analysis
    • G06F 16/35 - Clustering; Classification
    • G06F 40/35 - Discourse or dialogue representation
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
    • G06N 3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a topic-expansion emotional dialogue generation method based on joint decoding, belonging to the field of natural language processing. The method comprises the following steps: the joint content-and-topic attention vector is fed into the content independent unit of the decoder, which improves the topic consistency of the generated reply; the category of the specified emotion is embedded as an additional input into the emotion independent unit of the decoder, so that the generated reply carries a richer expression of emotional words while the loss of content relevance is alleviated to a certain extent. The invention can not only generate dialogue of a specific emotion category, but also ensure that the generated reply stays under the same dialogue topic as the input dialogue.

Description

Topic extension emotional dialogue generation method based on joint decoding
Technical Field
The invention relates to the technical field of non-task-oriented dialogue generation systems, and in particular to a topic-extension emotional dialogue reply generation method based on joint decoding.
Background
In recent years, as human-machine dialogue systems have advanced, their range of application has expanded. Early task-oriented dialogue systems helped customers meet specific goals but could not provide a deeper emotional response to users. Today, deep learning techniques are used to build open-domain non-task-oriented dialogue systems, and researchers treat emotion perception as an important component of such systems, hoping that a robot can chat with a user in daily life, perceive the user's emotion during the chat, and make a corresponding emotional response. Zhou et al. generated various emotional responses under emotional guidance in supervised dialogue, showing that emotion is a high-level abstract expression and that a dialogue system with emotion can improve user experience and satisfaction; user needs have accordingly shifted from richness of the reply content toward deeper, mental-level communication. Furthermore, everyday chat shows that conversation between people involves not only the content of the dialogue but also the consistency of its topic. Research in this area still faces many problems, so improving the human-likeness and conversational ability of dialogue generation models remains a main research direction for dialogue systems.
In 2017, Zhou et al. first proposed the Emotional Chatting Machine (ECM) based on a memory network, which can generate different emotional responses according to a designated emotion category. On this basis, variants of seq2seq dialogue generation models were later proposed: Peng et al. used a TE-ECG model with emotion computation to dynamically generate the specified emotion class; Yang et al. used a fusion module to expand the topic words and thereby improve the richness of the content; Liang et al. predicted appropriate emotions through a heterogeneous neural network combined with multimodal data; and Zhang et al. used multiple embedding fusion layers to generate high-quality content. However, these works neglect the weakening of emotion expression caused by adding topic information in the encoder. To solve this problem, we propose a method that both ensures the diversity of the chat topic and generates rich content under the designated emotion.
Disclosure of Invention
The invention aims to provide a topic-extension emotional dialogue generation method based on joint decoding, designed to overcome the defects of the prior art. Firstly, the topic words of the input sequence are obtained with a Twitter LDA model; then the joint content-and-topic attention vector is used as the input of the content decoder, ensuring that the generated content and the input content are consistent in topic relevance; finally, the category of the specified emotion is embedded as an additional input into the emotion independent unit of the decoder, reducing the degree to which adding emotion to the model harms the expression of the content. The invention is simple in design, concise, convenient and practical, and meets two requirements of a dialogue generation system: it can generate dialogue of a specific emotion category, and it ensures that the generated reply stays under the same dialogue topic as the input dialogue.
In order to achieve the above object, the present invention provides a new dialogue generation method, characterized by a generation model composed of an encoder, a joint attention mechanism and a decoder; generating a specific reply comprises the following steps.
Step one: semantic information of the input dialogue is obtained through a BiLSTM encoder, topic words are extracted through a Twitter LDA model, and the input content and the topic words are weighted with an attention mechanism. When the topic model is used to extract topic words, the parameters of the Twitter LDA model are estimated with a Gibbs sampling algorithm; the model then assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected (m is set to 10), and meaningless common words such as 'good' and 'our' are deleted.
Step two: the category of the specified emotion is embedded as an additional input into the emotion independent unit of the decoder. Each emotion category, given as a one-hot vector, is mapped to a randomly initialized low-dimensional embedding.
Step three: the content-and-topic attention vectors are spliced together with the hidden state output by the content independent unit of the decoder and the output at time step i-1, and the result is fed into the content independent unit of the decoder to ensure that topic-related content is output; finally, the emotion independent unit and the content independent unit are smoothly fused.
The invention has the beneficial effects that:
1) Practicality: by introducing topic information into the content independent unit of the decoder, the generated dialogue is better related in topic to the input dialogue. Meanwhile, emotion information is fed as an additional input to the emotion independent unit of the decoder, and the emotion independent unit and the content independent unit are finally smoothly fused. This not only ensures diversity of the chat content but also generates replies of the specified emotion category.
2) Correctness: the emotion independent unit and the content independent unit of the decoder respectively hold the emotion category and the content of the dialogue, so the generated reply carries a richer expression of emotional words while the loss of content relevance is alleviated to a certain extent.
3) The design is simple and the content concise, giving the method wide practical significance.
Drawings
FIG. 1 is a schematic diagram of a new generative model in an embodiment of the method of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, the dialog generating method of the present embodiment includes the following steps:
Step one: semantic information of the input dialogue is obtained through a BiLSTM encoder, topic words are extracted through a Twitter LDA model, and the input content and the topic words are weighted with an attention mechanism. When the topic model is used to extract topic words, the parameters of the Twitter LDA model are first estimated with a Gibbs sampling algorithm; the model then assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected (m is set to 10), and meaningless common words such as 'good' and 'our' are deleted;
A given source-sequence dialogue message X = {x1, x2, ..., xm} is mapped through word embedding to a target-sequence response Y = {y1, y2, ..., ym}; the generation probability of the model's target sequence Y is then p(y1, y2, ..., ym | x1, x2, ..., xm) = ∏t p(yt | y1, ..., yt-1, X). The forward LSTM encodes the word vectors into hidden-layer states h→i and the backward LSTM encodes them into hidden-layer states h←i; the two hidden states are spliced to obtain the final hidden state hi = [h→i; h←i]. Meanwhile, unlike the conventional Seq2Seq model, attention is computed over each hidden state output by the encoder to obtain a content attention vector ci.
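The splicing of forward and backward states and the computation of the content attention vector ci can be illustrated with a toy NumPy sketch. The dot-product score function and all dimensions are illustrative assumptions (the patent does not specify the score function), and the random values stand in for trained BiLSTM outputs:

```python
# Sketch: h_i = [h_fwd_i ; h_bwd_i], then c_i = attention-weighted sum of
# the encoder states. All values are random stand-ins for a trained model.
import numpy as np

rng = np.random.default_rng(0)
T, d = 4, 3                      # sequence length, per-direction hidden size
h_fwd = rng.normal(size=(T, d))  # forward LSTM hidden states
h_bwd = rng.normal(size=(T, d))  # backward LSTM hidden states

H = np.concatenate([h_fwd, h_bwd], axis=1)   # spliced states, shape (T, 2d)

def content_attention(query, H):
    """c = sum_j a_j * h_j with a = softmax(H @ query) (dot-product scoring)."""
    scores = H @ query
    a = np.exp(scores - scores.max())
    a /= a.sum()
    return a @ H                              # weighted sum, shape (2d,)

s_prev = rng.normal(size=2 * d)               # decoder state acts as the query
c_i = content_attention(s_prev, H)
print(c_i.shape)  # (6,)
```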
To better account for topic relevance and diversity, a Twitter LDA topic model is introduced at the encoder to extract topic words as an additional input to the model. Compared with the traditional attention mechanism, a dynamic attention mechanism combining topic and content is adopted: it enhances the weight of topic words in reply generation and makes the topic vector more relevant to the content of the input message, further increasing the probability of topic words appearing in the generated reply. The parameters of the Twitter LDA topic model are estimated with a Gibbs sampling algorithm; the model then assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected (m is set to 10), and meaningless common words such as 'good' and 'our' are deleted. Because a vector representation of each topic word is needed during learning, this distribution is used as the vector representation of the topic words. Meanwhile, the word vectors of the topic words {topic1, ..., topicm} are passed through the attention mechanism to obtain the topic attention vector oi. Finally, the spliced content attention vector ci and topic attention vector oi are sent to the decoder as the input of the decoder's content independent unit;
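The joint content-and-topic attention described above can be sketched as follows; dot-product scoring, the toy dimensions, and the random stand-in vectors are all illustrative assumptions:

```python
# Sketch: compute a content attention vector over encoder states and a topic
# attention vector over the m topic-word embeddings, then splice them as the
# input of the decoder's content independent unit. Toy sizes, random values.
import numpy as np

def attention(query, keys):
    """softmax(keys @ query)-weighted sum of keys (dot-product attention)."""
    s = keys @ query
    a = np.exp(s - s.max())
    a /= a.sum()
    return a @ keys

rng = np.random.default_rng(1)
d = 4
H = rng.normal(size=(5, d))         # encoder hidden states (content)
K = rng.normal(size=(10, d))        # word vectors of the m = 10 topic words
s_prev = rng.normal(size=d)         # decoder hidden state at step i-1

c_i = attention(s_prev, H)          # content attention vector
o_i = attention(s_prev, K)          # topic attention vector
joint = np.concatenate([c_i, o_i])  # [c_i ; o_i] -> content unit input
print(joint.shape)  # (8,)
```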
step two: embedding the category of the appointed emotion into an emotion independent unit of a decoder as an additional input, wherein each randomly initialized emotion category is represented by a low-dimensional vector one-hot;
topic and c in guarantee generation responseiWhen the input is consistent, the generated reply is also provided with emotion, and the emotion decoding unit adds the emotion type d, which is different from the content decoding uniti-1And labeling the dialogue data by using a BilSTM emotion classifier, wherein the emotion classifier is divided into six types of emotions: happiness, anger, sadness, likes, dislikes and others; each randomly initialized emotion category is represented by using a low-dimensional vector one-hot, and the emotion category generating response is used as additional input, so that the emotion decoding unit learns the high-level abstract expression capability of the emotion under the guidance of the specified emotion category, and the inaccuracy of response to content and theme due to the addition of emotional factors in the model is reduced;
Step three: the content-and-topic attention vectors are spliced together with the hidden state output by the content independent unit of the decoder and the output at time step i-1, and the result is fed into the content independent unit of the decoder to ensure that topic-related content is output; finally, the emotion independent unit and the content independent unit are smoothly fused;
the model uses two layers of GRUs as joint decoders of content and emotion respectively, firstly, the first layer of GRU divides a decoded hidden state into two modules of content and emotion, and secondly, the second layer of GRU fuses the hidden states of the two modules;
the first layer decoding unit of the model is composed of two GRUs, one is a content decoding unit which enables the content in the response to be consistent with the content of the input dialogue, the other is an emotion decoding unit which synthesizes all kinds of emotions into one unit, and different emotions can be distinguished through the input of appointed emotion types, and the ability of emotion expression in the response can be learned; finally, the decoder decodes the content of the ith time step into a unit sg(i)And emotion decoding unit sa(i)The hidden states of (a) are spliced into the hidden state s of the first layer decoding unit of the ith time step1(i)
The second layer decoding unit of the model smoothly fuses the hidden states of the first layer decoding unit and updates the states, and the hidden state of the second layer decoder at the ith time step is composed of the hidden state at the step i-1 and the hidden state s of the first layer decoder1(i)Calculating through a neural network GRU; the final hidden state of the decoder is denoted s2(i)
And finally, sequentially obtaining the probability distribution of the target sequence by the decoder through full connection and a Softmax function.
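One step of the two-layer joint decoder (content unit, emotion unit, second-layer fusion, Softmax output) might look like the minimal NumPy sketch below. All weights are random stand-ins, the gate formulation is a generic GRU rather than the patent's exact parameterization, and every dimension is toy-sized:

```python
# Sketch of one joint-decoding step: two first-layer GRUs (content s_g,
# emotion s_a) whose spliced state s1 is fused by a second-layer GRU into
# s2, followed by a fully connected layer + Softmax over the vocabulary.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, W, U, b):
    """Minimal GRU cell; W, U, b stack update / reset / candidate params."""
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])      # update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])      # reset gate
    n = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])  # candidate state
    return (1 - z) * h + z * n

def make_params(dx, dh):
    return (rng.normal(size=(3, dh, dx)),
            rng.normal(size=(3, dh, dh)),
            np.zeros((3, dh)))

dx, dh, vocab = 6, 4, 20
P_content = make_params(dx, dh)
P_emotion = make_params(dx, dh)
P_fuse = make_params(2 * dh, dh)
W_out = rng.normal(size=(vocab, dh))             # fully connected output layer

x_content = rng.normal(size=dx)  # stand-in for [c_i ; o_i] and prev output
x_emotion = rng.normal(size=dx)  # stand-in for the emotion embedding input
h_g = h_a = h2 = np.zeros(dh)    # unit states at step i-1

s_g = gru_cell(x_content, h_g, *P_content)   # content decoding unit
s_a = gru_cell(x_emotion, h_a, *P_emotion)   # emotion decoding unit
s1 = np.concatenate([s_g, s_a])              # first-layer state s_1(i)
s2 = gru_cell(s1, h2, *P_fuse)               # second-layer fusion s_2(i)

logits = W_out @ s2
p = np.exp(logits - logits.max())
p /= p.sum()                                 # Softmax over the vocabulary
print(p.shape, round(float(p.sum()), 6))  # (20,) 1.0
```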

Claims (2)

1. A new dialogue generation method, characterized in that the generated content not only carries the designated emotion category but also stays on a topic consistent with the input dialogue, so that a richer reply can be generated;
the method comprises the following steps:
Step one: semantic information of the input dialogue is obtained through a BiLSTM encoder, topic words are extracted through a Twitter LDA model, and the input content and the topic words are weighted with an attention mechanism; when the topic model is used to extract topic words, the parameters of the Twitter LDA model are first estimated with a Gibbs sampling algorithm, the model then assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected (m is set to 10), and meaningless common words such as 'good' and 'our' are deleted;
a given source-sequence dialogue message is mapped through word embedding to a target-sequence response, and the hidden-layer states of the forward LSTM and backward LSTM word vectors are spliced to obtain the final hidden state; meanwhile, unlike the traditional Seq2Seq model, attention is computed over each hidden state output by the encoder to obtain a content attention vector;
to better account for topic relevance and diversity, a Twitter LDA topic model is introduced at the encoder to extract topic words as an additional input to the model; compared with the traditional attention mechanism, a dynamic attention mechanism combining topic and content is adopted, which enhances the weight of topic words in reply generation, makes the topic vector more relevant to the content of the input message, and further increases the probability of topic words appearing in the generated reply; the topic model assigns the extracted topic to the source sequence, the top m keywords with the highest probability under that topic are selected, and meaningless common words are deleted; because a vector representation of each topic word is needed during learning, this distribution is used as the vector representation of the topic words; meanwhile, the word vectors of the topic words are passed through the attention mechanism to obtain the topic attention vector; finally, the spliced content attention vector and topic attention vector are sent to the decoder as the input of the decoder's content independent unit;
step two: embedding the category of the appointed emotion into an emotion independent unit of a decoder as an additional input, wherein each randomly initialized emotion category is represented by a low-dimensional vector one-hot;
when the theme in the generated response is ensured to be consistent with the input, the generated reply is also provided with emotion, which is different from the content decoding unit that the emotion decoding unit adds emotion categories, and a BilSTM emotion classifier is used for labeling the dialogue data, and the emotion is divided into six types: happiness, anger, sadness, likes, dislikes and others; each randomly initialized emotion category is represented by using a low-dimensional vector one-hot, and the emotion category generating response is used as additional input, so that the emotion decoding unit learns the high-level abstract expression capability of the emotion under the guidance of the specified emotion category, and the inaccuracy of response to content and theme due to the addition of emotional factors in the model is reduced;
step three: combining the attention mechanism of the content and the theme with the hidden state output by the content independent unit of the decoder and the output at the moment of i-1, splicing the hidden state and the output at the moment of i-1, and sending the spliced hidden state and the output to the content independent unit of the decoder so as to ensure that the content related to the theme is output, and finally smoothly fusing the emotion independent unit and the content independent unit;
the model uses two layers of GRUs as the joint decoder of content and emotion: the first GRU layer divides the decoding hidden state into a content module and an emotion module, and the second GRU layer fuses the hidden states of the two modules;
the first decoding layer of the model consists of two GRUs: a content decoding unit, which keeps the content of the response consistent with the content of the input dialogue, and an emotion decoding unit, which gathers all emotion categories into one unit so that different emotions can be distinguished through the input of the specified emotion category and the ability to express emotion in the response can be learned; at the i-th time step, the hidden states of the content decoding unit and the emotion decoding unit are spliced into the hidden state of the first decoding layer;
the second decoding layer smoothly fuses and updates the hidden states of the first layer, the hidden state of the second-layer decoder at the i-th time step being computed by a GRU from its hidden state at step i-1 and the hidden state of the first layer;
finally, the decoder obtains the probability distribution of the target sequence through a fully connected layer and a Softmax function.
2. The dialogue generation method according to claim 1, characterized in that the degree to which the added topic and emotion factors harm the expression of the content is reduced, the specific steps comprising:
the input content and the topic words are weighted with an attention mechanism; the content attention vector, the topic attention vector, the final hidden state output by the decoder and the output at time step i-1 are spliced together and fed into the content independent unit of the decoder, so that the weight of topic words in reply generation is enhanced, the topic vector is more relevant to the content of the input message, and the probability of topic words appearing in the generated reply is further increased.
CN202110364233.XA 2021-04-04 2021-04-04 Topic extension emotional dialogue generation method based on joint decoding Pending CN112925896A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110364233.XA CN112925896A (en) 2021-04-04 2021-04-04 Topic extension emotional dialogue generation method based on joint decoding

Publications (1)

Publication Number Publication Date
CN112925896A 2021-06-08

Family

ID=76174094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110364233.XA Pending CN112925896A (en) 2021-04-04 2021-04-04 Topic extension emotional dialogue generation method based on joint decoding

Country Status (1)

Country Link
CN (1) CN112925896A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180129931A1 (en) * 2016-11-04 2018-05-10 Salesforce.Com, Inc. Quasi-recurrent neural network based encoder-decoder model
US20180285348A1 (en) * 2016-07-19 2018-10-04 Tencent Technology (Shenzhen) Company Limited Dialog generation method, apparatus, and device, and storage medium
US20180300400A1 (en) * 2017-04-14 2018-10-18 Salesforce.Com, Inc. Deep Reinforced Model for Abstractive Summarization
CN111522924A (en) * 2020-03-31 2020-08-11 华东师范大学 Emotional chat type reply generation method with theme perception
CN111949761A (en) * 2020-07-06 2020-11-17 合肥工业大学 Dialogue question generation method and system considering emotion and theme, and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
彭叶红 (Peng Yehong): "Research on Emotional Dialogue Generation Technology Based on Topic Models and Variational Autoencoders", China Master's Theses Full-text Database, 15 January 2020 (2020-01-15), pages 21-27 *
李孟 (Li Meng): "Research on Emotional Dialogue Generation Models Based on Deep Learning", China Master's Theses Full-text Database, 15 January 2020 (2020-01-15), pages 22-27 *

Similar Documents

Publication Publication Date Title
Han et al. Adversarial training in affective computing and sentiment analysis: Recent advances and perspectives
CN110717017B (en) Method for processing corpus
Cassell et al. Beat: the behavior expression animation toolkit
JP6889281B2 (en) Analyzing electronic conversations for presentations in alternative interfaces
Cavazza et al. Dialogue generation in character-based interactive storytelling
CN107870977A (en) Chat robots output is formed based on User Status
Deldjoo et al. Towards multi-modal conversational information seeking
CN113762322A (en) Video classification method, device and equipment based on multi-modal representation and storage medium
CN113407663B (en) Image-text content quality identification method and device based on artificial intelligence
GB2581943A (en) Interactive systems and methods
CN113392261B (en) Conversational music recommendation method based on film and television theme
Kao et al. Model of multi-turn dialogue in emotional chatbot
CN116028846A (en) Multi-mode emotion analysis method integrating multi-feature and attention mechanisms
CN112819933A (en) Data processing method and device, electronic equipment and storage medium
Petrova Meme language, its impact on digital culture and collective thinking
CN116894085A (en) Dialog generation method and device, electronic equipment and storage medium
CN111522924A (en) Emotional chat type reply generation method with theme perception
US20220253609A1 (en) Social Agent Personalized and Driven by User Intent
CN117173497A (en) Image generation method and device, electronic equipment and storage medium
CN117011875A (en) Method, device, equipment, medium and program product for generating multimedia page
CN112925896A (en) Topic extension emotional dialogue generation method based on joint decoding
CN116415596A (en) Emotion support man-machine conversation method and system based on emotion strategy matching
KR20230130580A (en) Autonomous generation, deployment, and personalization of real-time interactive digital agents
Jbene et al. User sentiment analysis in conversational systems based on augmentation and attention-based bilstm
Wanner et al. Towards a multimedia knowledge-based agent with social competence and human interaction capabilities

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210608