CN114911932A - Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement - Google Patents


Publication number
CN114911932A
Authority
CN
China
Prior art keywords
sentence
graph
word
emotion
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210429360.8A
Other languages
Chinese (zh)
Inventor
马廷淮
俞慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202210429360.8A
Publication of CN114911932A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement, which comprises the following steps: performing emotion word embedding on the input conversation to convert it from natural language into emotion-bearing vector representations; constructing a syntactic dependency graph from the dependency relations, with the words of each utterance as nodes, and inputting the graph into a graph convolutional neural network to update the node information, yielding semantically enhanced word vectors and the corresponding sentence representation vectors; constructing a topic extraction model that extracts the topic of each utterance, yielding topic-enhanced sentence representations; clustering the utterances by topic similarity, constructing dialogue subgraphs from the sentence topic information and the temporal order, and building a heterogeneous dialogue graph whose nodes are the sentence representations of the utterances, the graph nodes being updated with a graph recurrent network; and obtaining a classification result. The invention takes full account of the interaction among speakers and improves the prediction accuracy of emotion analysis in multi-speaker conversations.

Description

Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement.
Background
Emotion recognition in conversation is an emerging task in natural language processing that aims to recognize the emotion of each utterance in a conversation. It can be seen as an extension of traditional text emotion detection, or as a component of dialogue systems that helps the machine understand the emotion a user expresses and so generate emotionally aware replies. Traditional research on dialogue emotion analysis focuses on two-party dialogue and pays little attention to multi-speaker settings. Compared with two-party dialogue, information exchange in multi-speaker conversation is more frequent and more varied, which makes emotion recognition more complicated.
First, an individual's statements are usually logical and coherent, whereas group chat jumps between subjects, and this unpredictable jumping sharply increases the difficulty of emotion analysis; attention must therefore be paid to context, and in particular to topic changes during the chat. Second, group chat is an exchange of information among several people, and the interaction is a continuous process rather than something fixed within a short time. This fundamentally changes how emotion should be judged. In non-interactive sentiment analysis, such as product reviews, the value is realized once the sentiment of a review has been determined, so the task is a clear-cut classification problem. In a dialogue, however, the emotional state changes continuously, and a single isolated analysis has little meaning. Finally, interaction hides part of the state information. In a chat, each party usually assumes that the other already knows a great deal, for example the relationship between the participants, each other's needs and purposes, their social relations, common sense, and one another's character. The problem becomes more complex in group chat, where the personality and characteristics of each speaker, and their reactions when they meet different viewpoints, must be considered. Some speakers hold firmly to their position while others are easily swayed, so their emotions evolve differently. How to effectively model the degree to which emotion spreads between speakers, and how strongly a speaker holds to his or her own emotion, is also an important and challenging problem.
Patent CN113656564A provides a method for detecting emotion in power grid service dialogue data based on a graph neural network, comprising: step 1, extracting a conversation set and constructing a sentence-level self-influence and mutual-influence relation graph and a feature extraction model; step 2, constructing a word-level undirected graph and a feature extraction model; step 3, constructing an undirected relation graph between topic words and context words and a feature extraction model; and step 4, fusing the sentence-level features, the word graph features, and the relation features between topic words and context words from steps 1, 2 and 3, and calculating the conversation emotion. That method is stated to greatly improve the accuracy of interactive emotion analysis and thereby to provide technical support for building human-machine interaction systems such as question-answering systems, chat robots and public service robots. However, that invention does not address emotion analysis in multi-speaker chat scenarios.
Disclosure of Invention
The technical problem to be solved is as follows: the invention addresses emotion analysis in multi-speaker conversation, and mainly the problems of semantic modeling and of interaction among speakers. Specifically: (1) when performing semantic modeling to capture the semantic content of an input sentence, how to provide more emotional features to facilitate subsequent emotion classification while removing as much redundant information as possible, so that it does not interfere with the model's judgment; (2) when the emotion of a sentence cannot be judged from its semantic content alone, how to judge the emotion of the current utterance from the contextual information and emotion, particularly where a speaker's emotion shifts during the conversation as a result of information exchange.
The technical scheme is as follows:
A heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement comprises the following steps:
S10, performing emotion word embedding on the input conversation to convert it from natural language into emotion-bearing vector representations;
S20, for each sentence input in step S10, constructing a syntactic dependency graph from the dependency relations, with the words of the sentence as nodes, and inputting the syntactic dependency graph into a graph convolutional neural network to update the node information, so as to obtain semantically enhanced word vectors and the corresponding sentence representation vectors;
S30, constructing a topic extraction model from the semantically enhanced word vectors obtained in step S20, extracting the topic of each utterance, and obtaining topic-enhanced sentence representations;
S40, taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, clustering them by topic similarity, constructing dialogue subgraphs from the sentence topic information and the temporal order, and building a heterogeneous dialogue graph whose nodes are the sentence representations of the utterances, the graph nodes being updated with a graph recurrent network;
S50, inputting the graph nodes obtained in step S40 into a classifier to obtain the classification result, namely the emotion category.
Further, in step S10, the process of performing emotion word embedding operation on the input dialog to convert it from human language to vector representation with emotion includes the following sub-steps:
S11, let the input be a multi-speaker dialogue text D containing N utterances:
Figure BDA0003609473210000021
where u_i denotes the i-th utterance in the dialogue text D,
Figure BDA0003609473210000022
denotes the speaker corresponding to utterance u_i, and u_i = {w_{i,1}, w_{i,2}, ..., w_{i,n}} indicates that the utterance consists of n words; i = 1, 2, …, N and j = 1, 2, …, M, where M is the total number of speakers and N is the total number of utterances;
each input utterance u_i is encoded in temporal order using word2vec to obtain the basic vector representation word2vec(w_i) of each word;
S12, obtaining the emotion vector representation of each word from the external emotion dictionary VAD, the word being mapped to the emotion dictionary by the following formula:
W2AV(w) = VAD(l(w)), if l(w) is in the emotion dictionary; W2AV(w) = [5, 1, 5], otherwise
where l(w) denotes the lemmatized form of each word; when a word carries actual emotional meaning, its emotion vector W2AV has a corresponding real value in each dimension of VAD; otherwise, when the word carries no emotional meaning, its emotion vector is uniformly set to [5, 1, 5], the three values respectively representing extremely weak valence (V), medium arousal (A) and extremely weak dominance (D);
S13, the emotion word embedding W2AV of the word w obtained in step S12 is concatenated with the basic word2vec word vector obtained in step S11 to obtain the final word vector representation
w′_i = word2vec(w_i) ⊕ W2AV(w_i)
where ⊕ denotes concatenation; w′_i serves as the initial input to the encoder and is called the emotion word embedding.
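Sub-steps S11 to S13 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the applicant's implementation: the lexicon `TOY_VAD`, its numeric values, the `lemmatize` stand-in for l(w), and the two-dimensional base vectors are all hypothetical; only the [5, 1, 5] fallback and the concatenation come from the text.

```python
# Hypothetical toy VAD lexicon; the values are invented for illustration.
TOY_VAD = {
    "happy": [8.5, 6.0, 7.0],
    "angry": [2.5, 7.0, 5.5],
}
# Fallback stated in the text for words without emotional meaning.
NEUTRAL_VAD = [5.0, 1.0, 5.0]


def lemmatize(word):
    """Stand-in for l(w): lowercasing only. A real system would use a lemmatizer."""
    return word.lower()


def w2av(word, lexicon=TOY_VAD):
    """Return the 3-dimensional VAD vector of a word, or the neutral fallback."""
    return list(lexicon.get(lemmatize(word), NEUTRAL_VAD))


def emotion_word_embedding(word, base_vectors, lexicon=TOY_VAD):
    """Concatenate the base (word2vec-style) vector with the VAD vector."""
    return list(base_vectors[word]) + w2av(word, lexicon)


# Hypothetical 2-dimensional base vectors standing in for word2vec output.
base = {"Happy": [0.1, -0.2], "the": [0.0, 0.3]}
emb_happy = emotion_word_embedding("Happy", base)
emb_the = emotion_word_embedding("the", base)
```

With these toy inputs, a word found in the lexicon gains its three VAD values, while a stop word such as "the" gains the neutral triple, so every word vector grows by exactly three dimensions.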
Further, in step S20, the process of building a syntactic dependency graph according to dependency syntactic relations for each utterance input in step S10, where the nodes are words in the utterance, inputting the syntactic dependency graph into a graph convolutional neural network to update node information, and obtaining semantically enhanced word vectors and corresponding sentence representation vectors includes the following sub-steps:
S21, constructing a syntactic dependency tree for each sentence input in step S10 and analyzing the dependency relations between words: for the utterance u_i, a relation tree is constructed to represent the syntactic structure of the sentence, where the initial nodes are the word vector representations w′_i obtained in step S10, and the core verb of the sentence is the root node of the tree, referred to as the head word; the root node may govern other components and is not governed by any other component;
S22, finding, according to grammar rules, the dependent words that have a dependency relation with the head word, and placing them into the tree as left or right nodes according to the temporal order of the head word and the dependent words, until all words have been examined, thereby completing the syntactic dependency graph;
S23, inputting the syntactic dependency graph into the graph convolutional neural network to update the node information, obtaining the semantically enhanced word vectors:
w″_i = GRU(w′_1, w′_2, ..., w′_n);
the sentence representation vector v_i of each sentence is obtained by taking the mean of its word vectors.
Further, in step S22, finding the dependent words having dependency relationship with the core word according to the grammar rule, and distinguishing left and right nodes of the tree according to the time sequence order of the core word and the dependent words to put the dependent words into the tree until all words are detected, and the process of completing the syntactic dependency graph includes the following sub-steps:
S221, constructing the syntactic dependency graph using a stack and a queue containing the words to be processed; the stack and the queue are initialized so that the stack initially contains only the root node and all words of the sentence are placed in the queue;
S222, selecting and executing the appropriate operation according to the current state using an Oracle function, the operation being one of three types:
when the word at the top of the stack and the word below it form a dependency relation and the head word is the top element, both words are popped from the stack, the dependency relation is added to the parsed data structure, and the head word is pushed back onto the stack;
when the word at the top of the stack and the word below it form a dependency relation and the head word is the lower element, both words are popped from the stack, the dependency relation is added to the parsed data structure, and the head word is pushed back onto the stack;
otherwise, a word from the queue is pushed onto the top of the stack;
S223, repeating step S222 until only the root node remains in the stack and the queue is empty.
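The stack-and-queue procedure of S221 to S223 corresponds to a standard arc-standard transition parser. The sketch below is one hedged way to realize it in Python: the text only says that an Oracle function selects the operation, so the static oracle used here (it reads the correct head of every token from a supplied `heads` mapping) is an assumption, as are the function name and the example sentence. The operation names LEFT-ARC, RIGHT-ARC and SHIFT follow the detailed description.

```python
from collections import deque


def parse_arc_standard(heads):
    """Build dependency arcs with an arc-standard transition system.

    heads maps each token index (1..n) to its head index; 0 is the root.
    Returns (actions, arcs), where each arc is a (head, dependent) pair.
    """
    n = len(heads)
    stack = [0]                     # initial state: only the root on the stack
    queue = deque(range(1, n + 1))  # all words wait in the queue
    actions, arcs = [], []
    # Number of dependents each token still has to collect.
    pending = {i: 0 for i in range(n + 1)}
    for head in heads.values():
        pending[head] += 1

    while not (stack == [0] and not queue):
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            if below != 0 and heads[below] == top:
                # LEFT-ARC: the head is the top element.
                stack.pop()
                stack.pop()
                stack.append(top)
                arcs.append((top, below))
                pending[top] -= 1
                actions.append("LEFT-ARC")
                continue
            if heads[top] == below and pending[top] == 0:
                # RIGHT-ARC: the head is the lower element; the dependent
                # is only removed once it has collected its own dependents.
                stack.pop()
                stack.pop()
                stack.append(below)
                arcs.append((below, top))
                pending[below] -= 1
                actions.append("RIGHT-ARC")
                continue
        if not queue:
            raise ValueError("no valid transition; the tree may be non-projective")
        # SHIFT: move the next word from the queue onto the stack.
        stack.append(queue.popleft())
        actions.append("SHIFT")
    return actions, arcs


# "She likes cats": token 2 ("likes") heads tokens 1 and 3 and attaches to the root.
actions, arcs = parse_arc_standard({1: 2, 2: 0, 3: 2})
```

For the three-word example the parser shifts twice, attaches "She" to "likes", shifts "cats", attaches it, and finally attaches "likes" to the root, ending with only the root on the stack and an empty queue.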
Further, in step S30, the process of constructing a topic extraction model according to the semantically enhanced word vector obtained in step S20, extracting the topic of each sentence of the dialog, and obtaining a topic enhanced sentence representation includes the following steps:
S31, a variational autoencoder forms the topic extraction module; the semantically enhanced word vectors w″_i obtained in step S20 are input recursively, in temporal order, into the topic extraction module for training, and the output is a latent vector of the topic discussed in the sentence; the latent vector constrains the coherent topic of a single dialogue through a recurrent hidden state, and the variational distribution approximating the posterior is:
Figure BDA0003609473210000041
where h_{n-1} = f_τ(z_{n-1}, w″_{n-1}), n > 1;
Figure BDA0003609473210000042
and
Figure BDA0003609473210000043
are both fully connected layers; f_τ(·) is a recurrent unit that uses the multi-head attention mechanism of the Transformer, with the previous latent variable z_{n-1} as the input query:
Figure BDA0003609473210000044
where
Figure BDA0003609473210000045
denotes the output of
Figure BDA0003609473210000046
given the input x_n, and
Figure BDA0003609473210000047
denotes the underlying network of the language model that precedes the topic layer;
S32, the latent variable z_i trained by the topic extraction module is treated as the topic vector of the current utterance u_i; z_i and the sentence representation vector v_i obtained in step S20 are concatenated to obtain the topic-enhanced sentence representation vector ve_i.
Further, in step S40, the process of taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, clustering them by topic similarity, constructing dialogue subgraphs from the sentence topic information and the temporal order, building a heterogeneous dialogue graph whose nodes are the sentence representations of the utterances, and updating the graph nodes with a graph recurrent network includes the following sub-steps:
S41, taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, clustering the utterances of the multi-speaker conversation by similarity of topic information, and connecting nodes according to their spatial distance: if two nodes are close in spatial distance and adjacent in time, an edge is constructed between them, so that the conversation is finally segmented into several dialogue subgraphs; each dialogue subgraph indicates that the subject under discussion within that time span is the same and that the emotions are connected;
S42, processing the nodes in each dialogue subgraph, where the nodes of the graph are the utterance representations ve_i and the corresponding speaker is denoted p_{ve_i}; a heterogeneous graph is constructed for each dialogue subgraph, with two types of edge: the same-speaker edge e_1 and the different-speaker edge e_0; the nodes are arranged in temporal order, and the rule by which node i constructs edges is as follows: ve_i constructs edges with the nodes that follow it; if the speaker of a following node ve_j satisfies
Figure BDA0003609473210000048
then the two nodes are connected, denoted
Figure BDA0003609473210000049
until a node with the same speaker is detected; let this node be ve_k, k > i; when
Figure BDA00036094732100000410
an edge is constructed, denoted
Figure BDA00036094732100000411
and detection stops; edge construction then starts for node ve_{i+1} and its subsequent nodes, until all nodes have been examined and the final initial graph of the subgraph is obtained;
S43, updating the nodes of the subgraph initial graph obtained in step S42 using a graph convolution network, where each node is updated according to its preceding nodes and the edges connecting it to them, and edges of different relation types carry different weights; the updated nodes are the sentence nodes vse_i that have undergone sentence semantic enhancement and emotion interaction processing.
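The edge-construction rule of S42 can be written as a short scan over the speakers of one subgraph. The following Python sketch is illustrative only; the function name and the encoding of edges as (source, target, type) triples, with type 0 for different speakers and type 1 for the same speaker, are assumptions made for the example.

```python
def build_dialogue_edges(speakers):
    """Construct typed edges for one dialogue subgraph.

    speakers[i] is the speaker of the i-th utterance, in temporal order.
    Returns (source, target, edge_type) triples, where edge_type 0 links
    utterances of different speakers and edge_type 1 links node i to the
    next utterance by the same speaker.
    """
    edges = []
    for i, p_i in enumerate(speakers):
        for j in range(i + 1, len(speakers)):
            if speakers[j] != p_i:
                edges.append((i, j, 0))   # different speaker: keep scanning
            else:
                edges.append((i, j, 1))   # same speaker found: stop scanning
                break
    return edges


edges = build_dialogue_edges(["A", "B", "C", "A", "B"])
```

In this example the first utterance by A is linked to the following utterances by B and C with type-0 edges and to A's next utterance with a type-1 edge, after which scanning for that node stops.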
Further, in step S51, the final sentence representation vse_i obtained in step S43 is input into a classifier to classify the emotion of the utterance; the classifier uses one fully connected layer:
h_i = ReLU(W_h vse_i + b_h)
l_i = softmax(W_l h_i + b_l)
Figure BDA0003609473210000051
where k denotes the number of emotion category labels,
Figure BDA0003609473210000052
is the final predicted emotion label, and W_h, b_h, W_l, b_l are learnable parameters.
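The two classifier equations can be traced numerically with a small pure-Python sketch. The weights, biases and input below are hypothetical toy values, and taking the arg max of l_i as the predicted label is an assumption about the formula image that is not reproduced in the text.

```python
import math


def affine(W, x, b):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v):
    return [max(0.0, x) for x in v]


def softmax(v):
    m = max(v)                      # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def classify(vse, W_h, b_h, W_l, b_l):
    """h = ReLU(W_h vse + b_h); l = softmax(W_l h + b_l); label = arg max of l."""
    h = relu(affine(W_h, vse, b_h))
    probs = softmax(affine(W_l, h, b_l))
    label = max(range(len(probs)), key=probs.__getitem__)  # assumed arg max step
    return probs, label


# Toy parameters: 2-dimensional node representation, 3 emotion classes.
W_h, b_h = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
W_l, b_l = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]], [0.0, 0.0, 0.0]
probs, label = classify([1.0, -2.0], W_h, b_h, W_l, b_l)
```

Here the ReLU zeroes the negative component, the logits become 2, 0 and 1, and the first class receives the highest probability.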
Further, in step S52, the whole method is trained end to end, and the overall loss function is defined as follows:
Figure BDA0003609473210000053
where λ is a learnable parameter,
Figure BDA0003609473210000054
is the loss function of the topic module, and
Figure BDA0003609473210000055
is the loss function of the classifier;
Figure BDA0003609473210000056
is calculated using the evidence lower bound,
Figure BDA0003609473210000057
Figure BDA0003609473210000058
is calculated using the cross-entropy loss function.
Advantageous effects:
First, the heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement constructs a syntactic dependency graph and a topic extraction model, which improves the semantic representation of a single sentence. In a multi-speaker setting the chat content jumps between subjects, so the topic features of each utterance must be extracted; the method tracks the topic shifts of the multi-speaker chat, determines whom each speaker is addressing and how the chat state changes, identifies the local context on which the current utterance depends, and models a topic-enhanced semantic representation, laying the foundation for the next step.
Second, the method clusters utterances with similar semantic topics and then constructs local dependency subgraphs in temporal order, with the construction of the edges reflecting the emotional influence among speakers. Utterances that share a topic and are temporally related are thus grouped into the same subgraph, where the continuity and contagion of emotion are more evident, which improves the accuracy of emotion analysis in multi-speaker settings.
Third, the method constructs a syntactic dependency graph and a topic extraction model to address the feature extraction problem: useful features carrying emotion information are retained and redundant information is removed, which facilitates the subsequent emotion classification. The invention also constructs a heterogeneous dialogue subgraph module to model emotion interaction; it simplifies a multi-speaker chat task involving a large amount of data by dividing it into several topic-consistent dialogue subgraphs, which speeds up the model and improves the accuracy of the classifier.
Drawings
FIG. 1 is a flowchart of a topic semantic enhancement-based heterogeneous graph structure multi-conversation person emotion analysis method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of an overall system architecture corresponding to the emotion analysis method according to the embodiment of the present invention.
Detailed Description
The following examples will give the skilled person a more complete understanding of the present invention, but do not limit the invention in any way.
The embodiment adopts an emotion word embedding and multi-task collaborative interaction graph network structure model to analyze the conversation emotion.
As shown in fig. 1 and 2, the method comprises the following steps:
Step 1) Perform word embedding on the input dialogue to obtain the initial input of the encoder.
Step 101) The word embedding operation maps words to real-valued vectors of fixed dimension. Assume the input is a multi-speaker dialogue text D containing N utterances, where u_i denotes the i-th utterance in D,
Figure BDA0003609473210000061
denotes the speaker corresponding to utterance u_i, and u_i = {w_{i,1}, w_{i,2}, ..., w_{i,n}} indicates that the utterance consists of n words; i = 1, 2, …, N and j = 1, 2, …, M, where M is the total number of speakers and N is the total number of utterances in the conversation. Given an input dialogue sequence, word2vec is first applied to the sequence to obtain word vectors carrying the basic semantics of the words.
Step 102) The dialogue sequence is looked up in an external cognitive-engineering emotion dictionary, and the traditional word embedding is enhanced with a three-dimensional emotion space to obtain a three-dimensional emotion vector for each word; words are mapped to the emotion dictionary as follows:
W2AV(w) = VAD(l(w)), if l(w) is in the emotion dictionary; W2AV(w) = [5, 1, 5], otherwise
where l(w) denotes the lemmatized form of each word. When a word carries actual emotional meaning, its emotion vector W2AV has a corresponding real value in each dimension of VAD; otherwise, when the word carries no emotional meaning, as with stop words, its emotion vector is uniformly set to [5, 1, 5], the three values respectively representing extremely weak valence (V), medium arousal (A) and extremely weak dominance (D).
Step 103) The vectors obtained in steps 101) and 102) are concatenated to obtain the final emotion word vector
w′_i = word2vec(w_i) ⊕ W2AV(w_i)
where ⊕ denotes concatenation; this vector is called the emotion word embedding and serves as the input to the subsequent encoder.
Step 2) Construct a syntactic dependency graph for the input sentence, to improve dialogue feature extraction as far as possible and obtain semantically enhanced sentence and word representations. The word vectors w′_i obtained in step 1) serve as the initial nodes, and dependency relations are detected in the temporal order of the words, thereby constructing the syntactic dependency graph.
Step 201) The syntactic dependency graph is constructed using a stack and a queue containing the words to be processed. Initially the stack contains only the root node and the queue contains all words of the sentence. Each step performs one of two kinds of operation: moving an element from the queue onto the stack, or merging (reduce) the two elements at the top of the stack into one.
Step 202) Starting from the initial state, in which the stack holds only the root node (root) and the queue holds all the words, the procedure loops until the end state is reached. In each iteration, the appropriate operation is selected with the Oracle function according to the current state and then executed. There are three operations:
LEFT-ARC: the word at the top of the stack and the word below it form a dependency relation and the head word is the top element; both words are popped from the stack, the dependency relation is added to the parsed data structure, and the head word is pushed back onto the stack.
RIGHT-ARC: the word at the top of the stack and the word below it form a dependency relation and the head word is the lower element; both words are popped from the stack, the dependency relation is added to the parsed data structure, and the head word is pushed back onto the stack.
SHIFT: a word from the queue is pushed onto the top of the stack.
When only root remains in the stack and the queue is empty, the end state has been reached, indicating that the syntactic dependency graph of the sentence is complete.
Step 203) The syntactic dependency graph and the initial graph nodes are input into a graph convolutional neural network (GRU) to update the node information, obtaining the semantically enhanced word vectors:
w″_i = GRU(w′_1, w′_2, ..., w′_n)
The sentence representation vector v_i of the sentence is obtained by taking the mean of its word vectors.
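The text gives neither the weights nor the exact layer form of the network used in step 203), so the sketch below does not reproduce it. As a stand-in, it applies one parameter-free neighbour-averaging step over the dependency arcs (a simplification of a graph-convolution layer, chosen here purely for illustration) and then computes the sentence vector as the mean of the updated word vectors, which is the pooling the text does specify. All names and toy values are hypothetical.

```python
def neighbor_average(vectors, arcs):
    """One message-passing step: each word becomes the mean of itself and
    the words it is linked to by a dependency arc (direction ignored)."""
    n = len(vectors)
    neighbors = {i: {i} for i in range(n)}  # self-loop keeps the word's own vector
    for head, dep in arcs:
        neighbors[head].add(dep)
        neighbors[dep].add(head)
    dim = len(vectors[0])
    updated = []
    for i in range(n):
        group = [vectors[j] for j in sorted(neighbors[i])]
        updated.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return updated


def mean_pool(vectors):
    """Sentence representation as the element-wise mean of the word vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Toy 2-dimensional vectors for a three-word utterance; word 1 heads words 0 and 2.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
arcs = [(1, 0), (1, 2)]
enhanced = neighbor_average(words, arcs)
sentence_vector = mean_pool(enhanced)
```

The head word mixes information from both of its dependents, while each dependent mixes only with the head, which is the kind of syntax-guided smoothing the dependency graph is meant to provide.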
Step 3) Because multi-speaker conversations have many turns and their text content is abundant and complex, a topic extraction model is constructed for the input sentences to extract the topic of each utterance.
Step 301) The topic extraction module consists mainly of a variational autoencoder (VAE); the semantically enhanced word vectors w″_i obtained in step 2) are input recursively, in temporal order, into the VAE for training.
Step 302) The output of the VAE is a latent vector of the topic discussed in the sentence. The latent vector constrains the coherent topic of a single conversation through a recurrent hidden state, and the variational distribution approximating the posterior is:
Figure BDA0003609473210000071
where h_{n-1} = f_τ(z_{n-1}, w″_{n-1}) for n > 1.
Figure BDA0003609473210000072
and
Figure BDA0003609473210000073
are both fully connected layers; f_τ(·) is a recurrent unit that uses the multi-head attention mechanism of the Transformer, with the previous latent variable z_{n-1} as the input query:
Figure BDA0003609473210000074
Step 303) The latent variable z_i trained by the topic module is treated as the topic vector of the current utterance u_i; z_i and the sentence representation vector v_i obtained in step 2) are concatenated to obtain the topic-enhanced sentence representation vector ve_i.
Step 4) Taking the topic-enhanced sentence representations obtained in step 3) as initial nodes, construct dialogue subgraphs to cope with the wide-ranging content of multi-speaker discussion.
Step 401) Group chat conversations carry a large amount of information over a variable span; if the whole conversation were built directly into a single graph structure, useful information would easily be filtered out during information propagation, so long conversations need to be segmented. Taking the topic-enhanced sentence representations obtained in step 3) as initial nodes, the utterances of the multi-speaker conversation are clustered by similarity of topic information and the nodes are connected according to their spatial distance: if two nodes are close in spatial distance and adjacent in time, an edge is constructed between them, so that the conversation is finally segmented into several dialogue subgraphs.
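One simple way to picture the segmentation of step 401) is to cut the time-ordered utterances wherever two consecutive topic vectors stop being similar. The sketch below does exactly that in Python; the use of cosine similarity, the threshold of 0.8, and the restriction to consecutive utterances are assumptions for illustration, since the text does not fix the distance measure or the clustering procedure.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment_by_topic(topic_vectors, threshold=0.8):
    """Group consecutive utterances whose topic vectors are similar.

    Returns a list of segments, each a list of utterance indices in time order.
    The threshold is a hypothetical tuning parameter.
    """
    if not topic_vectors:
        return []
    segments = [[0]]
    for i in range(1, len(topic_vectors)):
        if cosine(topic_vectors[i - 1], topic_vectors[i]) >= threshold:
            segments[-1].append(i)   # close in topic space and adjacent in time
        else:
            segments.append([i])     # topic shift: start a new subgraph
    return segments


topics = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
segments = segment_by_topic(topics)
```

With these toy topic vectors the first two utterances form one subgraph and the last two form another, since the topic changes sharply between the second and third utterance.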
Step 402) Each subgraph represents a time period in which the same matter is discussed and the emotions are connected. First, the nodes in each subgraph are processed: each node is an utterance ve_i, and the corresponding speaker is denoted p_i. A heterogeneous graph is constructed for each subgraph, with two types of edges: the same-speaker edge e_1 and the different-speaker edge e_0. The nodes are arranged in chronological order, and the rule for constructing edges from node i is as follows: ve_i constructs edges with the nodes that follow it; if the speaker of a following node ve_j satisfies p_j ≠ p_i, the two nodes are connected, denoted as
Figure BDA0003609473210000081
This continues until a node with p_k = p_i is detected, for which the constructed edge is
Figure BDA0003609473210000082
Detection then stops, and construction of the edges between node ve_{i+1} and its subsequent nodes begins. Once all nodes have been processed, the initial graph of the subgraph is obtained.
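The edge-construction rule of step 402) can be sketched as follows. The tuple layout `(source, target, edge_type)` and the constant names are illustrative assumptions; the rule itself (connect to later nodes of other speakers, then stop at the first later node of the same speaker) follows the text.

```python
from typing import List, Tuple

DIFF_SPEAKER = 0  # edge type e_0: the two utterances have different speakers
SAME_SPEAKER = 1  # edge type e_1: the two utterances have the same speaker


def build_edges(speakers: List[str]) -> List[Tuple[int, int, int]]:
    """Build typed edges for one dialogue subgraph.

    speakers[i] is the speaker p_i of the i-th utterance in time order.
    """
    edges = []
    for i, p_i in enumerate(speakers):
        for j in range(i + 1, len(speakers)):
            if speakers[j] != p_i:
                edges.append((i, j, DIFF_SPEAKER))
            else:
                # First later utterance by the same speaker: add e_1 and stop.
                edges.append((i, j, SAME_SPEAKER))
                break
    return edges


edges = build_edges(["A", "B", "C", "A", "B"])
```

For the five-utterance toy dialogue, node 0 (speaker A) links to nodes 1 and 2 with type e_0 and to node 3 with type e_1, after which scanning moves on to node 1.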
Step 403) The nodes of the subgraph obtained in step 402) are updated using a graph convolutional network (GCN). The update of each node is determined by its preceding nodes and the edges connecting it to them; edges of different relation types carry different weights. The updated node vse_i is the sentence node after sentence semantic enhancement and emotion interaction.
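One way to realize an update in which each edge type has its own weights is a relational graph convolution layer. The sketch below is an assumption about the form of the update (a self-loop term plus a per-relation mean over messages from preceding nodes, followed by ReLU); the text only states that a GCN is used and that weights differ by relation type. All names are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def relational_gcn_layer(
    h: List[Vector],
    edges: List[Tuple[int, int, int]],
    w_rel: Dict[int, Matrix],
    w_self: Matrix,
) -> List[Vector]:
    """Update every node from its preceding neighbours, one weight matrix per edge type.

    edges holds (source, target, relation) with source earlier in time than target.
    """
    updated = []
    for i, h_i in enumerate(h):
        acc = matvec(w_self, h_i)  # self-loop contribution
        incoming = [(src, rel) for src, dst, rel in edges if dst == i]
        for rel, w in w_rel.items():
            sources = [src for src, r in incoming if r == rel]
            for src in sources:
                msg = matvec(w, h[src])
                # Mean-normalise messages within each relation type.
                acc = [a + m / len(sources) for a, m in zip(acc, msg)]
        updated.append([max(0.0, a) for a in acc])  # ReLU
    return updated


identity = [[1.0, 0.0], [0.0, 1.0]]
half = [[0.5, 0.0], [0.0, 0.5]]
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
typed_edges = [(0, 1, 0), (0, 2, 1), (1, 2, 0)]
vse = relational_gcn_layer(nodes, typed_edges, {0: half, 1: identity}, identity)
```

In the toy call, different-speaker edges (type 0) are down-weighted by `half` while same-speaker edges (type 1) pass through unchanged, which shows how the relation type changes a neighbour's influence.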
Step 5) The graph nodes are input into a classifier to obtain the classification result, i.e. the emotion category.
Step 501) The final sentence representation vse_i obtained in step 403) is fed into a classifier, which classifies the emotion of the utterance. The classifier uses a fully connected layer:
h_i = ReLU(W_h vse_i + b_h)
l_i = softmax(W_l h_i + b_l)
Figure BDA0003609473210000083
where k is the number of emotion category labels,
Figure BDA0003609473210000084
is the final predicted emotion label, and W_h, b_h, W_l, b_l are learnable parameters.
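The classifier equations can be sketched as a forward pass. The two written formulas (ReLU layer, then softmax) come from the text; taking the arg-max of l_i as the predicted label is an assumption, since that formula survives only as an image placeholder. Weights in the example are arbitrary toy values.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine(w: Matrix, x: Vector, b: Vector) -> Vector:
    """Compute w @ x + b."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i for row, b_i in zip(w, b)]


def softmax(x: Vector) -> Vector:
    """Numerically stable softmax."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def classify(vse_i: Vector, w_h: Matrix, b_h: Vector, w_l: Matrix, b_l: Vector) -> Tuple[int, Vector]:
    """Return (predicted label index, class probabilities l_i) for one utterance."""
    h_i = [max(0.0, v) for v in affine(w_h, vse_i, b_h)]  # h_i = ReLU(W_h vse_i + b_h)
    l_i = softmax(affine(w_l, h_i, b_l))                  # l_i = softmax(W_l h_i + b_l)
    y_hat = max(range(len(l_i)), key=l_i.__getitem__)     # assumed arg-max decision
    return y_hat, l_i


label, probs = classify(
    [1.0, -2.0],
    [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
    [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]], [0.0, 0.0, 0.0],
)
```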
Step 502) The whole method is trained end to end, and the overall loss function is defined as follows:
Figure BDA0003609473210000085
where λ is a learnable parameter,
Figure BDA0003609473210000086
is the loss function of the topic module, and
Figure BDA0003609473210000087
is the loss function of the classifier;
Figure BDA0003609473210000088
is calculated using the evidence lower bound (ELBO),
Figure BDA0003609473210000089
Figure BDA00036094732100000810
is calculated using the cross-entropy loss function.
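The two loss terms can be sketched as below. The exact way the terms are combined is not recoverable from the text (the formula is an image placeholder), so the weighted sum in `total_loss` is an assumption, as are the diagonal Gaussian posterior and standard normal prior used for the KL term; λ is a plain float here rather than a learned parameter.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[Sequence[float]], gold: Sequence[int]) -> float:
    """Mean negative log-likelihood of the gold labels (classifier loss)."""
    return -sum(math.log(p[y]) for p, y in zip(probs, gold)) / len(gold)


def gaussian_kl(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), assuming a standard normal prior."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def negative_elbo(recon_nll: float, mu: Sequence[float], log_var: Sequence[float]) -> float:
    """Topic-module loss: reconstruction NLL plus KL term (the negated ELBO)."""
    return recon_nll + gaussian_kl(mu, log_var)


def total_loss(l_topic: float, l_cls: float, lam: float) -> float:
    """Assumed combination: topic loss plus lambda-weighted classifier loss."""
    return l_topic + lam * l_cls


l_cls = cross_entropy([[0.5, 0.5]], [0])
l_topic = negative_elbo(1.0, [0.0, 0.0], [0.0, 0.0])
loss = total_loss(l_topic, l_cls, lam=0.5)
```

Minimizing the negated ELBO is equivalent to maximizing the evidence lower bound, which is why the topic term enters the total with a positive sign.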
The method is primarily aimed at emotion recognition. A syntactic dependency graph is constructed for each sentence, the sentence is reconstructed with a variational autoencoder network, and the latent variable is trained to represent the topic of the sentence, which makes it easier to track topic shifts in the group chat afterwards and improves the extraction of dialogue features. Meanwhile, a heterogeneous graph network is used to encode the sentences, and similar nodes sharing a topic are clustered, so that a long multi-turn conversation with multiple topics, multiple speakers and disordered information is cut into easily classified dialogue subgraphs with clear topics and concise information; the heterogeneous graph also makes it possible to distinguish the emotional changes of different speakers under external influence. To improve the accuracy of conversational emotion recognition, the invention considers the semantic modeling task and the emotion interaction task of multi-party dialogue together and combines the two. Because topic changes in open conversation affect the continuity of emotional interaction, the invention also incorporates the topic detection task into learning. The invention uses a topic extraction model to model the jumps in chat content and combines it with dependency syntactic analysis to jointly capture the information of the input dialogue. The invention further designs a heterogeneous graph network to update the feature information of the dialogue nodes and to model the emotional interaction among different speakers for better emotion analysis. By performing topic segmentation, the method simplifies the multi-party dialogue emotion analysis task and improves both the speed and the accuracy of emotion recognition.
In summary, the invention fully considers the interaction information among the speakers and improves the prediction accuracy of multi-party dialogue emotion analysis.

Claims (8)

1. A heterogeneous graph structure multi-conversation person emotion analysis method based on topic semantic enhancement, characterized by comprising the following steps:
S10, performing an emotion word embedding operation on the input dialogue, converting it from human language into a vector representation carrying emotion;
S20, for each sentence input in step S10, constructing a syntactic dependency graph according to the dependency syntactic relations, wherein the nodes are the words in the sentence, and inputting the syntactic dependency graph into a graph convolutional neural network to update the node information, obtaining semantically enhanced word vectors and the corresponding sentence representation vector;
S30, constructing a topic extraction model from the semantically enhanced word vectors obtained in step S20 and extracting the topic of each sentence of the dialogue, obtaining topic-enhanced sentence representations;
S40, taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, clustering them according to topic similarity, constructing dialogue subgraphs according to sentence topic information and temporal relations, and constructing a heterogeneous dialogue graph whose nodes are the sentence representations of each sentence, the graph nodes being updated using a graph loop network;
S50, inputting the graph nodes obtained in step S40 into a classifier to obtain the classification result, i.e. the emotion category.
2. The heterogeneous graph structure multi-conversation person emotion analysis method based on topic semantic enhancement according to claim 1, characterized in that, in step S10, the process of performing an emotion word embedding operation on the input dialogue to convert it from human language into a vector representation carrying emotion comprises the following sub-steps:
S11, let the input be a multi-speaker dialogue text D containing N rounds of dialogue:
Figure FDA0003609473200000011
where u_i denotes the i-th utterance in the dialogue text D, p_{u_i} denotes the speaker of utterance u_i, and u_i = {w_{i,1}, w_{i,2}, ..., w_{i,n}} indicates that the sentence is composed of n words; i = 1, 2, …, N; j = 1, 2, …, M; M is the total number of speakers; N is the total number of utterances;
for each input sentence u_i, vector encoding is performed in temporal order using word2vec to obtain the basic vector representation word2vec(w_i) of each word;
S12, the emotion vector representation of each word is obtained from the external emotion dictionary VAD, the word being mapped to the emotion dictionary by the following formula:
Figure FDA0003609473200000012
where l(w) denotes the lemmatized form of each word; when a word carries actual emotional meaning, its emotion vector W2AV has a corresponding real value in each dimension of VAD; otherwise, when the word carries no emotional meaning, its emotion vector is uniformly expressed as [5,1,5], the three values respectively representing extremely weak emotional valence V, medium emotional arousal A and extremely weak emotional dominance D;
S13, the emotion vector W2AV of word w obtained in step S12 is concatenated with the basic word2vec word vector obtained in step S11 to obtain the final word vector representation
Figure FDA0003609473200000021
w′_i, used as the initial input to the encoder, is called the emotion word embedding.
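The lookup-and-concatenate procedure of sub-steps S12 and S13 can be sketched as follows. The lemmatizer is a placeholder (lower-casing only), the toy word2vec and VAD entries are made-up values for illustration, and placing the word2vec part before the VAD part is an assumption, since the claim does not fix the order.

```python
from typing import Dict, List

# Default emotion vector for words that are absent from the VAD lexicon (from S12).
NEUTRAL_VAD: List[float] = [5.0, 1.0, 5.0]


def lemmatize(word: str) -> str:
    """Placeholder for l(w); a real system would use a proper lemmatizer."""
    return word.lower()


def w2av(word: str, vad_lexicon: Dict[str, List[float]]) -> List[float]:
    """Map a word to its VAD emotion vector, falling back to the neutral vector."""
    return list(vad_lexicon.get(lemmatize(word), NEUTRAL_VAD))


def emotion_word_embedding(
    word: str,
    word2vec: Dict[str, List[float]],
    vad_lexicon: Dict[str, List[float]],
) -> List[float]:
    """Concatenate the basic word vector with the emotion vector (assumed order)."""
    return list(word2vec[word]) + w2av(word, vad_lexicon)


# Hypothetical toy data, not taken from any real lexicon or trained model.
toy_word2vec = {"Happy": [0.1, 0.2], "the": [0.3, 0.4]}
toy_vad = {"happy": [8.0, 6.0, 7.0]}

happy_vec = emotion_word_embedding("Happy", toy_word2vec, toy_vad)
the_vec = emotion_word_embedding("the", toy_word2vec, toy_vad)
```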
3. The heterogeneous graph structure multi-conversation person emotion analysis method based on topic semantic enhancement according to claim 1, characterized in that, in step S20, the process of constructing a syntactic dependency graph for each sentence input in step S10 according to the dependency syntactic relations, with the words of the sentence as nodes, and inputting the syntactic dependency graph into a graph convolutional neural network to update the node information and obtain semantically enhanced word vectors and the corresponding sentence representation vector comprises the following sub-steps:
S21, constructing a syntactic dependency tree for the sentence input in step S10 and analyzing the dependency relations between words: for sentence u_i, a relation tree is constructed to represent the syntactic structure of the sentence, wherein the initial nodes are the word vector representations w′_i obtained in step S10, and the core verb of the sentence is the root node of the tree, referred to as the head word; the root node may govern other constituents but is not governed by any other constituent;
S22, finding the dependent words that have a dependency relation with the head word according to grammar rules, distinguishing the left and right nodes of the tree according to the temporal order of the head word and the dependent words, and placing the dependent words into the tree until all words have been processed, completing the syntactic dependency graph;
S23, inputting the syntactic dependency graph into the graph convolutional neural network to update the node information, obtaining the semantically enhanced word vectors:
w″_i = GRU(w′_1, w′_2, ..., w′_n);
the sentence representation vector v_i of each sentence is obtained by averaging over its words.
4. The heterogeneous graph structure multi-conversation person emotion analysis method based on topic semantic enhancement according to claim 3, characterized in that, in step S22, the process of finding the dependent words that have a dependency relation with the head word according to grammar rules, distinguishing the left and right nodes of the tree according to the temporal order of the head word and the dependent words, and placing the dependent words into the tree until all words have been processed to complete the syntactic dependency graph comprises the following sub-steps:
S221, constructing the syntactic dependency graph using a stack and a queue containing the words to be processed; the stack and the queue are initialized and the stack is emptied, so that in the initial state the stack contains only the root node; all words of the sentence are placed in the queue;
S222, using an Oracle function to select and execute the operation appropriate to the current state, the operations being of three types:
when the word at the top of the stack and the word below it form a dependency relation and the head word is the top element, the two words are popped from the stack, the dependency relation is added to the parsed data structure, and the head word is pushed back onto the stack;
when the word at the top of the stack and the word below it form a dependency relation and the head word is the lower element, the two words are popped from the stack, the dependency relation is added to the parsed data structure, and the head word is pushed back onto the stack;
otherwise, one word from the queue is pushed onto the top of the stack;
S223, repeating step S222 until the stack contains only the root node and the queue is empty.
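The stack-and-queue procedure of sub-steps S221 to S223 matches the shape of an arc-standard transition parser, which can be sketched as follows. The oracle here is a static one driven by known gold heads, which is an assumption made so the example is self-contained; in practice the Oracle function would typically be a trained model. Word indices start at 1 and index 0 stands for the root node.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

ROOT = 0  # index of the artificial root node


def oracle(stack: List[int], heads: Dict[int, int], attached: Set[int]) -> str:
    """Pick the next operation from gold heads (a static oracle; an assumption)."""
    if len(stack) >= 2:
        top, below = stack[-1], stack[-2]
        if below != ROOT and heads[below] == top:
            return "LEFT_ARC"   # head word is the stack-top element
        children_done = all(c in attached for c, h in heads.items() if h == top)
        if heads[top] == below and children_done:
            return "RIGHT_ARC"  # head word is the element below the top
    return "SHIFT"


def parse(n_words: int, heads: Dict[int, int]) -> List[Tuple[int, int]]:
    """Return (head, dependent) arcs for a projective gold tree."""
    stack = [ROOT]                         # initial state: only the root on the stack
    queue = deque(range(1, n_words + 1))   # all words of the sentence
    arcs: List[Tuple[int, int]] = []
    attached: Set[int] = set()
    while queue or len(stack) > 1:
        action = oracle(stack, heads, attached)
        if action == "SHIFT":
            if not queue:
                raise ValueError("gold tree is not projective")
            stack.append(queue.popleft())
        else:
            top = stack.pop()
            below = stack.pop()
            head, dep = (top, below) if action == "LEFT_ARC" else (below, top)
            arcs.append((head, dep))       # record the dependency relation
            attached.add(dep)
            stack.append(head)             # push the head word back
    return arcs


# "She likes tea": 1=She, 2=likes, 3=tea; "likes" is the core verb under the root.
arcs = parse(3, {1: 2, 2: 0, 3: 2})
```

The right-arc operation waits until the dependent has collected all of its own children; without that check a word could be removed from the stack before its dependents are attached.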
5. The heterogeneous graph structure multi-conversation person emotion analysis method based on topic semantic enhancement according to claim 1, characterized in that, in step S30, the process of constructing a topic extraction model from the semantically enhanced word vectors obtained in step S20 and extracting the topic of each sentence of the dialogue to obtain topic-enhanced sentence representations comprises the following sub-steps:
S31, a variational autoencoder is used to form the topic extraction module; the semantically enhanced word vectors w″_i obtained in step S20 are input recursively, according to their temporal information, into the topic extraction module for training, and the output of the module is the latent vector of the topic discussed in the sentence; the latent vector constrains the coherent topic of a single dialogue through a recurrent hidden state, and the variational distribution approximating its posterior is:
Figure FDA0003609473200000031
where h_{n-1} = f_τ(z_{n-1}, w″_{n-1}), n > 1;
Figure FDA0003609473200000032
and
Figure FDA0003609473200000033
are both fully connected layers; f_τ(·) is a recurrent unit that adopts the multi-head attention mechanism of the Transformer, with the previous latent variable z_{n-1} as the input query:
Figure FDA0003609473200000034
In the formula,
Figure FDA0003609473200000035
denotes, for a given input x_n, the output of
Figure FDA0003609473200000036
and
Figure FDA0003609473200000037
denotes the underlying network of the language model that precedes the topic layer;
S32, the latent variable z_i trained by the topic extraction module is taken as the topic vector of the current sentence u_i, and z_i is concatenated with the sentence representation vector v_i obtained in step S20 to give the topic-enhanced sentence representation vector ve_i.
6. The heterogeneous graph structure multi-conversation person emotion analysis method based on topic semantic enhancement according to claim 1, characterized in that, in step S40, the process of taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, clustering them according to topic similarity, constructing dialogue subgraphs according to sentence topic information and temporal relations, constructing a heterogeneous dialogue graph whose nodes are the sentence representations of each sentence, and updating the graph nodes using a graph loop network comprises the following sub-steps:
S41, taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, the utterances of the multiple speakers are clustered by topic similarity and the nodes are connected according to their spatial distance; an edge is constructed between two nodes if they are close in space and adjacent in time, finally yielding a plurality of dialogue subgraphs obtained by segmentation; each dialogue subgraph represents a time period in which the same matter is discussed and the emotions are connected;
S42, the nodes in each dialogue subgraph are processed, each node of the graph being an utterance ve_i, with the corresponding speaker denoted as
Figure FDA00036094732000000312
A heterogeneous graph is constructed for each dialogue subgraph, with two types of edges: the same-speaker edge e_1 and the different-speaker edge e_0; the nodes are arranged in chronological order, and the rule for constructing edges from node i is as follows: ve_i constructs edges with the nodes that follow it; if the speaker of a following node ve_j satisfies
Figure FDA0003609473200000038
the two nodes are connected, denoted as
Figure FDA0003609473200000039
This continues until a node of the same speaker, denoted ve_k with k > i, is detected; when
Figure FDA00036094732000000310
holds, the constructed edge is
Figure FDA00036094732000000311
Detection then stops, and construction of the edges between node ve_{i+1} and its subsequent nodes begins; once all nodes have been processed, the initial graph of the subgraph is obtained;
S43, the nodes of the initial subgraph graph obtained in step S42 are updated using a graph convolutional network, wherein the update of each node is determined by its preceding nodes and the edges connecting it to them, and edges of different relation types carry different weights; the updated node vse_i is the sentence node after sentence semantic enhancement and emotion interaction.
7. The heterogeneous graph structure multi-conversation person emotion analysis method based on topic semantic enhancement according to claim 6, characterized in that, in step S51, the final sentence representation vse_i obtained in step S43 is input into a classifier, which classifies the emotion of the utterance; the classifier uses a fully connected layer:
h_i = ReLU(W_h vse_i + b_h)
l_i = softmax(W_l h_i + b_l)
Figure FDA0003609473200000041
where k is the number of emotion category labels,
Figure FDA0003609473200000042
is the final predicted emotion label, and W_h, b_h, W_l, b_l are learnable parameters.
8. The heterogeneous graph structure multi-conversation person emotion analysis method based on topic semantic enhancement according to claim 7, characterized in that, in step S52, the whole method is trained end to end, and the overall loss function is defined as follows:
Figure FDA0003609473200000043
where λ is a learnable parameter,
Figure FDA0003609473200000044
is the loss function of the topic module, and
Figure FDA0003609473200000045
is the loss function of the classifier;
Figure FDA0003609473200000046
is calculated using the evidence lower bound (ELBO),
Figure FDA0003609473200000047
Figure FDA0003609473200000048
is calculated using the cross-entropy loss function.
CN202210429360.8A 2022-04-22 2022-04-22 Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement Pending CN114911932A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210429360.8A CN114911932A (en) 2022-04-22 2022-04-22 Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210429360.8A CN114911932A (en) 2022-04-22 2022-04-22 Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement

Publications (1)

Publication Number Publication Date
CN114911932A true CN114911932A (en) 2022-08-16

Family

ID=82765606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210429360.8A Pending CN114911932A (en) 2022-04-22 2022-04-22 Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement

Country Status (1)

Country Link
CN (1) CN114911932A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115841119A (en) * 2023-02-21 2023-03-24 中国科学技术大学 Emotional cause extraction method based on graph structure
CN116258134A (en) * 2023-04-24 2023-06-13 中国科学技术大学 Dialogue emotion recognition method based on convolution joint model
CN116484004A (en) * 2023-05-26 2023-07-25 大连理工大学 Dialogue emotion recognition and classification method
CN116662554A (en) * 2023-07-26 2023-08-29 之江实验室 Infectious disease aspect emotion classification method based on heterogeneous graph convolution neural network
CN117493490A (en) * 2023-11-17 2024-02-02 南京信息工程大学 Topic detection method, device, equipment and medium based on heterogeneous multi-relation graph

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115841119A (en) * 2023-02-21 2023-03-24 中国科学技术大学 Emotional cause extraction method based on graph structure
CN115841119B (en) * 2023-02-21 2023-06-16 中国科学技术大学 Emotion cause extraction method based on graph structure
CN116258134A (en) * 2023-04-24 2023-06-13 中国科学技术大学 Dialogue emotion recognition method based on convolution joint model
CN116258134B (en) * 2023-04-24 2023-08-29 中国科学技术大学 Dialogue emotion recognition method based on convolution joint model
CN116484004A (en) * 2023-05-26 2023-07-25 大连理工大学 Dialogue emotion recognition and classification method
CN116484004B (en) * 2023-05-26 2024-06-07 大连理工大学 Dialogue emotion recognition and classification method
CN116662554A (en) * 2023-07-26 2023-08-29 之江实验室 Infectious disease aspect emotion classification method based on heterogeneous graph convolution neural network
CN116662554B (en) * 2023-07-26 2023-11-14 之江实验室 Infectious disease aspect emotion classification method based on heterogeneous graph convolution neural network
CN117493490A (en) * 2023-11-17 2024-02-02 南京信息工程大学 Topic detection method, device, equipment and medium based on heterogeneous multi-relation graph
CN117493490B (en) * 2023-11-17 2024-05-14 南京信息工程大学 Topic detection method, device, equipment and medium based on heterogeneous multi-relation graph

Similar Documents

Publication Publication Date Title
CN108597541B (en) Speech emotion recognition method and system for enhancing anger and happiness recognition
CN110427617B (en) Push information generation method and device
CN111159368B (en) Reply generation method of personalized dialogue
CN114911932A (en) Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement
CN106448670A (en) Dialogue automatic reply system based on deep learning and reinforcement learning
CN110297887B (en) Service robot personalized dialogue system and method based on cloud platform
CN114722838A (en) Conversation emotion recognition method based on common sense perception and hierarchical multi-task learning
Merdivan et al. Dialogue systems for intelligent human computer interactions
KR20210070213A (en) Voice user interface
CN113435211B (en) Text implicit emotion analysis method combined with external knowledge
Li et al. Learning fine-grained cross modality excitement for speech emotion recognition
CN113987179A (en) Knowledge enhancement and backtracking loss-based conversational emotion recognition network model, construction method, electronic device and storage medium
US11132994B1 (en) Multi-domain dialog state tracking
CN112417894A (en) Conversation intention identification method and system based on multi-task learning
CN112131359A (en) Intention identification method based on graphical arrangement intelligent strategy and electronic equipment
CN111899766B (en) Speech emotion recognition method based on optimization fusion of depth features and acoustic features
WO2023226239A1 (en) Object emotion analysis method and apparatus and electronic device
CN117349427A (en) Artificial intelligence multi-mode content generation system for public opinion event coping
CN112183106A (en) Semantic understanding method and device based on phoneme association and deep learning
CN116303966A (en) Dialogue behavior recognition system based on prompt learning
Ai et al. A Two-Stage Multimodal Emotion Recognition Model Based on Graph Contrastive Learning
CN116108856B (en) Emotion recognition method and system based on long and short loop cognition and latent emotion display interaction
Yang [Retracted] Design of Service Robot Based on User Emotion Recognition and Environmental Monitoring
Du et al. Multimodal emotion recognition based on feature fusion and residual connection
Jiang et al. An affective chatbot with controlled specific emotion expression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination