CN114911932A

CN114911932A - Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement

Info

Publication number: CN114911932A
Application number: CN202210429360.8A
Authority: CN
Inventors: 马廷淮; 俞慧
Original assignee: Nanjing University of Information Science and Technology
Current assignee: Nanjing University of Information Science and Technology
Priority date: 2022-04-22
Filing date: 2022-04-22
Publication date: 2022-08-16

Abstract

The invention discloses a heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement, which comprises the following steps: performing an emotion word embedding operation on the input conversation to convert it from human language into a vector representation that carries emotion; constructing a syntactic dependency graph according to the dependency syntax relations, wherein the nodes are the words in a sentence, and inputting the syntactic dependency graph into a graph convolution neural network to update node information, obtaining semantically enhanced word vectors and the corresponding sentence representation vectors; constructing a topic extraction model, extracting the topic of each sentence of the conversation, and obtaining topic-enhanced sentence representations; clustering according to topic similarity, constructing dialogue subgraphs according to sentence topic information and the time sequence relation, and constructing a heterogeneous dialogue graph in which each node is the representation of one sentence, the graph nodes being updated with a graph loop network; and obtaining a classification result. The invention fully considers the interactive information among the speakers and improves the prediction accuracy of emotion analysis in multi-speaker conversations.

Description

Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement

Technical Field

The invention belongs to the technical field of natural language processing, and particularly relates to a heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement.

Background

Emotion recognition in conversation is an emerging task in natural language processing that aims to recognize the emotion of each utterance in a conversation. It can be seen as an extension of traditional text emotion detection, or as a problem in dialog systems: helping the machine understand the emotion the user expresses so that it can generate an emotionally aware reply. Traditional research on dialogue emotion analysis focuses on two-party dialogue and pays little attention to multi-speaker settings. Compared with a two-party conversation, information interaction in a multi-speaker conversation is more frequent and more diverse, which makes emotion recognition more complicated.

First, an individual's statements are usually logical and coherent, whereas chat jumps between subjects, and this unpredictable jumping greatly increases the difficulty of emotion analysis; it is therefore necessary to attend to the context, and in particular to topic changes during the chat. Second, group chat is information interaction among multiple persons, and the interaction is a continuous process rather than something fixed within a short time. This fundamentally changes how emotion judgment should be evaluated. In non-interactive sentiment analysis, such as product reviews, the application value is realized once the sentiment of a review has been judged correctly, so the task is a clear-cut classification task. In a dialogue, however, the emotional state changes continuously, and a single isolated analysis has little meaning. Finally, interaction hides part of the state information. In a chat, both parties usually assume that the other already knows a great deal: the relationship between the people and things being discussed, each other's needs and purposes, their social relationship, common sense, and each other's character. This problem is more complex in group chat, where the personality and characteristics of each speaker must be considered, as well as how each speaker reacts when encountering different viewpoints. Some people hold firmly to their emotions while others are easily swayed, so their emotions evolve differently. How to effectively model the degree to which emotion spreads between speakers, and how strongly a speaker adheres to his or her own emotion, is also an important and challenging problem.

The invention with patent number CN113656564A provides a power grid service dialogue data emotion detection method based on a graph neural network, which comprises the following steps: step 1, extracting a conversation set, and constructing a statement-level self-influence and mutual-influence relation graph and a feature extraction model; step 2, constructing a word-level undirected graph and a feature extraction model; step 3, constructing an undirected relation graph between the subject vocabulary and the context words, and a feature extraction model; and step 4, fusing the sentence-level features, the word-graph features and the relation features between the subject vocabulary and the context words from steps 1, 2 and 3, and calculating the conversation emotion. The method can greatly improve the accuracy of interactive emotion analysis, thereby providing important technical support for constructing human interactive systems such as question-answering systems, chat robots, public service robots and the like. However, that invention does not address emotion analysis in multi-speaker chat scenarios.

Disclosure of Invention

The technical problem to be solved is as follows: the invention addresses emotion analysis in multi-speaker conversation, and mainly solves the problems of semantic modeling and of interaction among speakers. Specifically: (1) when performing semantic modeling to acquire the semantic content of an input sentence, how to provide more emotional features to facilitate subsequent emotion classification while removing as much redundant information as possible, so that it does not interfere with the model's judgment; (2) when the emotion of a sentence cannot be judged from its semantic content alone, how to judge the emotion of the current utterance from the context and its emotions, particularly when a speaker's emotion shifts because of information interaction during the conversation.

The technical scheme is as follows:

A topic semantic enhancement based heterogeneous graph structure multi-conversation person emotion analysis method comprises the following steps:

S10, performing an emotion word embedding operation on the input conversation, converting it from human language into a vector representation with emotion;

S20, for each sentence input in step S10, constructing a syntactic dependency graph according to the dependency syntactic relations, wherein the nodes are the words in the sentence, and inputting the syntactic dependency graph into a graph convolution neural network to update node information, so as to obtain semantically enhanced word vectors and the corresponding sentence representation vectors;

S30, constructing a topic extraction model from the semantically enhanced word vectors obtained in step S20, extracting the topic of each sentence of the dialogue, and obtaining topic-enhanced sentence representations;

S40, taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, clustering them according to topic similarity, constructing dialogue subgraphs according to sentence topic information and the time sequence relation, and constructing a heterogeneous dialogue graph in which each node is the representation of one sentence, the graph nodes being updated with a graph loop network;

S50, inputting the graph nodes obtained in step S40 into a classifier to obtain a classification result, namely the emotion category.

Further, in step S10, the process of performing emotion word embedding operation on the input dialog to convert it from human language to vector representation with emotion includes the following sub-steps:

S11, let the input be a multi-speaker text D containing N rounds of conversation, in which each utterance is paired with its speaker;

wherein $u_i$ represents the i-th utterance in the dialog text D and $p_{u_i}$ represents the speaker of utterance $u_i$; $u_i = \{w_{i,1}, w_{i,2}, \ldots, w_{i,n}\}$ means that the utterance is composed of n words; i = 1, 2, …, N and j = 1, 2, …, M; M is the total number of speakers; N is the total number of utterances;

each input utterance $u_i$ is vector-encoded in time order using word2vec to obtain the basic vector representation $\mathrm{word2vec}(w_i)$ of each word;

S12, obtaining the emotion vector representation of each word from the external emotion dictionary VAD, the word being mapped to the emotion dictionary by the following formula:

$$\mathrm{W2AV}(w) = \begin{cases} \mathrm{VAD}(l(w)), & l(w) \in \mathrm{VAD} \\ [5, 1, 5], & \text{otherwise} \end{cases}$$

wherein l(w) represents the lemmatization of each word; when a word has actual emotional meaning, its emotion vector W2AV has a corresponding real value in each dimension of VAD; otherwise, when the word has no emotional meaning, its emotion vector is uniformly expressed as [5, 1, 5], the three values respectively representing extremely weak valence (V), medium arousal (A) and extremely weak dominance (D);

S13, the emotion vector W2AV(w) of the word w obtained in step S12 is concatenated with the basic word2vec word vector obtained in step S11 to obtain the final word vector representation

$$w'_i = \mathrm{word2vec}(w_i) \oplus \mathrm{W2AV}(w_i)$$

$w'_i$ serves as the initial input to the encoder and is called the emotion word embedding.
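As an illustration of steps S11 to S13, the following minimal sketch looks up a VAD triple for a lemmatized word, falls back to the neutral triple [5, 1, 5] for words outside the lexicon, and concatenates the result with a base word vector. The lexicon entry, the two-dimensional base vectors, the zero vector for unknown words, and the helper names (`lemmatize`, `w2av`, `emotion_embedding`) are hypothetical stand-ins; a real system would load trained word2vec vectors and a published VAD lexicon.

```python
from typing import Dict, List

# Default triple for words without emotional meaning, as stated in the text.
NEUTRAL_VAD = [5.0, 1.0, 5.0]


def lemmatize(word: str) -> str:
    """Crude stand-in for l(w): lowercase and strip a trailing plural 's'."""
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


def w2av(word: str, vad_lexicon: Dict[str, List[float]]) -> List[float]:
    """Return the VAD triple of the lemma, or the neutral triple if absent."""
    return list(vad_lexicon.get(lemmatize(word), NEUTRAL_VAD))


def emotion_embedding(word: str,
                      base_vectors: Dict[str, List[float]],
                      vad_lexicon: Dict[str, List[float]],
                      dim: int) -> List[float]:
    """Concatenate the base word vector with the 3-dimensional emotion vector."""
    # Assumption: unknown words get a zero base vector of length `dim`.
    base = base_vectors.get(word.lower(), [0.0] * dim)
    return list(base) + w2av(word, vad_lexicon)


# Toy data, made up for the example.
toy_vad = {"happy": [8.5, 6.0, 7.2]}
toy_base = {"happy": [0.1, 0.2], "the": [0.3, 0.4]}
happy_vec = emotion_embedding("Happy", toy_base, toy_vad, dim=2)
the_vec = emotion_embedding("the", toy_base, toy_vad, dim=2)
```

Each output vector is three dimensions longer than the base vector, so downstream layers must be sized for `dim + 3` inputs.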

Further, in step S20, the process of building a syntactic dependency graph according to dependency syntactic relations for each utterance input in step S10, where the nodes are words in the utterance, inputting the syntactic dependency graph into a graph convolutional neural network to update node information, and obtaining semantically enhanced word vectors and corresponding sentence representation vectors includes the following sub-steps:

S21, constructing a syntactic dependency tree for each sentence input in step S10 by analyzing the dependency relations between its words: for sentence $u_i$, a relation tree is constructed to represent the syntactic structure of the sentence, wherein the initial nodes are the word vector representations $w'_i$ obtained in step S10; the core verb of the sentence is the root node of the tree and is called the central word; the root node governs other components and is not governed by any other component;

S22, finding the dependent words that have a dependency relation with the central word according to the grammar rules, and placing each dependent word into the tree as a left or right node according to the order in which the central word and the dependent word occur in the sentence, until all words have been examined, thereby completing the syntactic dependency graph;

S23, inputting the syntactic dependency graph into the graph convolution neural network to update node information and obtain the semantically enhanced word vectors:

$$w''_i = \mathrm{GRU}(w'_1, w'_2, \ldots, w'_n);$$

the sentence representation vector $v_i$ of each sentence is obtained by averaging its word vectors.

Further, in step S22, finding the dependent words having dependency relationship with the core word according to the grammar rule, and distinguishing left and right nodes of the tree according to the time sequence order of the core word and the dependent words to put the dependent words into the tree until all words are detected, and the process of completing the syntactic dependency graph includes the following sub-steps:

S221, constructing the syntactic dependency graph using a stack and a queue containing the words to be processed; initializing the stack and the queue so that the stack in its initial state holds only the root node and the queue holds all the words of the sentence;

S222, selecting and executing an operation according to the current state using an Oracle function, wherein there are three types of operation:

when the stack top and the word below it form a dependency relation and the central word is the stack-top element, popping the two words from the stack, adding the dependency relation to the parsed data structure, and finally pushing the central word back onto the stack;

when the stack top and the word below it form a dependency relation and the central word is the lower element, popping the two words from the stack, adding the dependency relation to the parsed data structure, and finally pushing the central word back onto the stack;

otherwise, moving one word from the queue to the top of the stack;

S223, repeating step S222 until only the root node remains in the stack and the queue is empty.
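The stack-and-queue procedure of steps S221 to S223 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the Oracle function is taken to be a static oracle that reads gold head indices (a trained classifier would replace it at inference time), the tree is assumed to be projective, and the names `oracle` and `parse` are illustrative.

```python
from typing import List, Tuple


def oracle(stack: List[int], heads: List[int], pending: List[int]) -> str:
    """Choose the next operation from the gold heads (index 0 is the root)."""
    if len(stack) >= 2:
        top, below = stack[-1], stack[-2]
        # Central word is the stack top: attach the lower word to it.
        if below != 0 and heads[below] == top:
            return "LEFT-ARC"
        # Central word is the lower element; only attach the top word once
        # all of its own dependents have already been attached.
        if heads[top] == below and pending[top] == 0:
            return "RIGHT-ARC"
    return "SHIFT"


def parse(gold_heads: List[int]) -> Tuple[List[Tuple[int, int]], List[str]]:
    """gold_heads[i] is the 1-based head of word i+1; 0 means the root."""
    n = len(gold_heads)
    heads = [0] + list(gold_heads)
    pending = [0] * (n + 1)            # dependents not yet attached, per head
    for dep in range(1, n + 1):
        pending[heads[dep]] += 1
    stack, queue = [0], list(range(1, n + 1))
    arcs, actions = [], []
    while queue or len(stack) > 1:
        action = oracle(stack, heads, pending)
        if action == "SHIFT":
            if not queue:
                raise ValueError("gold tree is not projective")
            stack.append(queue.pop(0))
        else:
            top, below = stack.pop(), stack.pop()
            head, dep = (top, below) if action == "LEFT-ARC" else (below, top)
            arcs.append((head, dep))   # record the dependency relation
            pending[head] -= 1
            stack.append(head)         # push the central word back
        actions.append(action)
    return arcs, actions


# "She loves cats": She -> loves, loves -> root, cats -> loves
arcs, actions = parse([2, 0, 2])
```

For the example sentence the loop ends with only the root on the stack and an empty queue, which matches the end state described in step S223.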

Further, in step S30, the process of constructing a topic extraction model according to the semantically enhanced word vector obtained in step S20, extracting the topic of each sentence of the dialog, and obtaining a topic enhanced sentence representation includes the following steps:

S31, a variational autoencoder is used to form the topic extraction module; the semantically enhanced word vectors $w''_i$ obtained in step S20 are fed recursively, in time order, into the topic extraction module for training, and the output is a latent vector of the topic discussed in the sentence; the latent vector constrains the coherent topic of a single dialog through a recurrent hidden state, and the variational distribution of its posterior approximation is defined in terms of the hidden state

$$h_{n-1} = f_\tau(z_{n-1}, w''_{n-1}), \quad n > 1;$$

the two mapping functions used in this distribution are both fully connected layers; $f_\tau(\cdot)$ is a recurrent unit that adopts the multi-head attention mechanism of a Transformer, the query of its input being the previous latent variable $z_{n-1}$; in the formula, the output for a given input $x_n$ is taken from the underlying network of the language model that precedes the topic layer;

S32, the latent variable $z_i$ trained by the topic extraction module is treated as the topic vector of the current utterance $u_i$; $z_i$ and the sentence representation vector $v_i$ obtained in step S20 are concatenated to obtain the topic-enhanced sentence representation vector $ve_i$.

Further, in step S40, clustering the sentence representations with enhanced topics obtained in step S30 as initial nodes according to topic similarity, constructing a dialogue subgraph according to sentence topic information and a time sequence relationship, constructing a heterogeneous dialogue graph, where the nodes are sentence representations of each sentence, and the process of updating graph nodes using a graph loop network includes the following sub-steps:

S41, taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, clustering the utterances of the multi-speaker conversation according to the similarity of their topic information, and connecting nodes according to spatial distance: if two nodes are close in spatial distance and adjacent in time, an edge is constructed between them, finally yielding a number of dialogue subgraphs obtained by segmentation; each dialogue subgraph represents a time period in which the same thing is discussed and the emotions are connected;

S42, processing the nodes in each dialogue subgraph, wherein the nodes of the graph are the representations $ve_i$ of the utterances and the corresponding speaker is denoted $p_{ve_i}$; a heterogeneous graph is constructed for each dialogue subgraph, with two types of edges: the same-speaker edge $e^1$ and the different-speaker edge $e^0$; the nodes are arranged in time order, and node i constructs edges by the following rule: $ve_i$ constructs edges with the nodes after it; if the speaker of a later node $ve_j$ differs from the speaker of $ve_i$, the two nodes are connected by an edge of type $e^0$; this continues until a node $ve_k$, k > i, with the same speaker as $ve_i$ is detected, at which point an edge of type $e^1$ is constructed between $ve_i$ and $ve_k$ and detection stops; the edges of node $ve_{i+1}$ and its subsequent nodes are then constructed in the same way, until all nodes have been processed, giving the final initial graph of the subgraph;

S43, updating the nodes of the initial subgraph obtained in step S42 using a graph convolution network, wherein the update of a node is determined by its preceding nodes and the edges connecting them, edges of different relation types carrying different weights; the updated nodes are the sentence nodes $vse_i$ that have undergone sentence semantic enhancement and emotion interaction processing.
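The edge-construction rule of step S42 can be sketched as a short double loop. In this minimal sketch the speakers are given as a time-ordered list of labels, edge type 0 stands for a different-speaker edge and type 1 for a same-speaker edge; the function name `build_edges` and the tuple layout `(source, target, type)` are assumptions made for illustration.

```python
from typing import List, Tuple


def build_edges(speakers: List[str]) -> List[Tuple[int, int, int]]:
    """Build typed edges from each node to the nodes that follow it in time."""
    edges = []
    for i in range(len(speakers)):
        for j in range(i + 1, len(speakers)):
            if speakers[j] != speakers[i]:
                edges.append((i, j, 0))   # different-speaker edge
            else:
                edges.append((i, j, 1))   # same-speaker edge
                break                     # stop at the speaker's next utterance
    return edges


# Four utterances in one dialogue subgraph, spoken by A, B, A, C.
edges = build_edges(["A", "B", "A", "C"])
```

Stopping at the speaker's own next utterance keeps each node's outgoing edges local: node 0 never links to node 3, because speaker A has already spoken again at node 2.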

Further, in step S51, the final sentence representation $vse_i$ obtained in step S43 is input into a classifier to classify the emotion of the utterance; the classifier uses one fully connected layer:

$$h_i = \mathrm{ReLU}(W_h\, vse_i + b_h)$$

$$l_i = \mathrm{softmax}(W_l\, h_i + b_l)$$

wherein k represents the number of emotion category labels, the final predicted emotion label is the category with the highest probability in $l_i$, and $W_h$, $b_h$, $W_l$, $b_l$ are learnable parameters.
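The two classifier equations can be traced with a small pure-Python sketch. The weights below are toy values chosen by hand, and picking the label by arg max over the softmax output is an assumption; the names `matvec` and `classify` are hypothetical.

```python
import math
from typing import List, Tuple


def matvec(matrix: List[List[float]], vec: List[float]) -> List[float]:
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def classify(vse: List[float],
             w_h: List[List[float]], b_h: List[float],
             w_l: List[List[float]], b_l: List[float]) -> Tuple[List[float], int]:
    """h = ReLU(W_h vse + b_h); l = softmax(W_l h + b_l); label = argmax l."""
    h = [max(0.0, s + b) for s, b in zip(matvec(w_h, vse), b_h)]
    logits = [s + b for s, b in zip(matvec(w_l, h), b_l)]
    shift = max(logits)                       # subtract max for stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs, probs.index(max(probs))


# Toy parameters: 2-dimensional input, 2 hidden units, k = 3 emotion labels.
probs, label = classify(
    vse=[1.0, -1.0],
    w_h=[[1.0, 0.0], [0.0, 1.0]], b_h=[0.0, 0.0],
    w_l=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]], b_l=[0.0, 0.0, 0.0],
)
```

Here ReLU zeroes the negative component of the input, so the logits are 1, 0 and -1 and the first label receives the highest probability.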

Further, in step S52, the whole method is trained in an end-to-end manner, and the total loss function of the method is defined from a learnable parameter λ and two component losses: the loss function of the topic module, which is calculated using the evidence lower bound, and the loss function of the classifier, which is calculated using the cross-entropy loss function.

Advantageous effects:

First, the method constructs a syntactic dependency graph and a topic extraction model, which improve the semantic representation of a single sentence. In a multi-speaker setting the chat content jumps between subjects, so the method extracts the topic feature of each utterance, tracks topic shifts across the conversation, determines whom each speaker is addressing and how the speaker's chat state changes, locates the local context on which the current utterance depends, and models a topic-enhanced semantic representation, laying the foundation for the next step.

Second, the method clusters utterances with similar semantic topics and then constructs local dependency subgraphs in time order, with the construction of the edges reflecting the emotional influence among users. Sentences that share a topic and are close in time are thus clustered into the same subgraph, where the continuity and contagion of their emotions are more evident, which helps improve emotion analysis accuracy in multi-speaker settings.

Third, the method uses the syntactic dependency graph and the topic extraction model to improve feature extraction: useful features carrying emotion information are retained and redundant information is removed, which facilitates subsequent emotion classification. The invention also constructs a heterogeneous dialogue subgraph module to model emotion interaction; it simplifies a multi-user chat task with a large amount of data by dividing it into several dialogue subgraphs with consistent topics, thereby speeding up the model and improving the accuracy of the classifier.

Drawings

FIG. 1 is a flowchart of a topic semantic enhancement-based heterogeneous graph structure multi-conversation person emotion analysis method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the overall system architecture corresponding to the emotion analysis method according to the embodiment of the present invention.

Detailed Description

The following examples will give the skilled person a more complete understanding of the present invention, but do not limit the invention in any way.

The embodiment adopts an emotion word embedding and multi-task collaborative interaction graph network structure model to analyze the conversation emotion.

As shown in fig. 1 and 2, the method comprises the following steps:

Step 1) performing a word embedding operation on the input dialog to obtain the initial input of the encoder.

Step 101) The word embedding operation maps words to real-valued vectors of fixed dimension. Assume the input is a multi-speaker dialog text D containing N utterances, where $u_i$ represents the i-th utterance in D and $p_{u_i}$ represents the speaker of utterance $u_i$; $u_i = \{w_{i,1}, w_{i,2}, \ldots, w_{i,n}\}$ means that the utterance is composed of n words; i = 1, 2, …, N and j = 1, 2, …, M; M is the total number of speakers; N is the total number of utterances. Given an input dialog sequence, word2vec is first applied to the sequence to obtain word vectors carrying the basic semantics of each word.

Step 102) The dialog sequence is looked up in an external cognitive-engineering emotion dictionary, and the traditional word embedding is enhanced with a three-dimensional emotion space to obtain a three-dimensional emotion vector for each word; words are mapped to the emotion dictionary as follows:

$$\mathrm{W2AV}(w) = \begin{cases} \mathrm{VAD}(l(w)), & l(w) \in \mathrm{VAD} \\ [5, 1, 5], & \text{otherwise} \end{cases}$$

where l(w) represents the lemmatization of each word. When a word has actual emotional meaning, its emotion vector W2AV has a corresponding real value in each dimension of VAD; otherwise, when the word has no emotional meaning (a stop word, for example), its emotion vector is uniformly expressed as [5, 1, 5], the three values respectively representing extremely weak valence (V), medium arousal (A) and extremely weak dominance (D).

Step 103) The vectors obtained in steps 101) and 102) are concatenated to obtain the final emotion word vector, called the emotion word embedding, which serves as the input of the subsequent encoder.

Step 2) constructing a syntactic dependency graph for each input sentence to improve the extraction of dialogue features as far as possible, and obtaining semantically enhanced sentence and word representations. The word vectors $w'_i$ obtained in step 1) serve as the initial nodes, and dependency relations are detected in the order in which the words occur, thereby constructing the syntactic dependency graph.

Step 201) The syntactic dependency graph is constructed using a stack and a queue containing the words to be processed. At the start the stack holds no words and the queue contains all the words in the sentence. Each step performs one of two kinds of operation: moving an element from the queue onto the stack (shift), or merging the two elements at the top of the stack into one element (reduce).

Step 202) In the initial state the stack holds only the root node (root) and the queue holds all the words; the procedure then loops until the end state is reached. Each iteration uses the Oracle function to select the appropriate operation for the current state and then performs it. There are three operations:

LEFT-ARC: the stack top and the word below it form a dependency relation and the central word is the stack-top element; the two words are popped from the stack, the dependency relation is added to the parsed data structure, and the central word is pushed back onto the stack.

RIGHT-ARC: the stack top and the word below it form a dependency relation and the central word is the lower element; the two words are popped from the stack, the dependency relation is added to the parsed data structure, and the central word is pushed back onto the stack.

SHIFT: one word from the queue is moved to the top of the stack.

The end state is reached when only root remains in the stack and the queue is empty, which indicates that the syntactic dependency graph of the sentence has been fully constructed.

Step 203) The syntactic dependency graph and the initial graph nodes are input into a graph convolutional neural network (GRU) to update node information and obtain the semantically enhanced word vectors:

$$w''_i = \mathrm{GRU}(w'_1, w'_2, \ldots, w'_n)$$

The sentence representation vector $v_i$ of the sentence is obtained by averaging its word vectors.
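One possible reading of the node update in step 203) is a single graph-convolution layer over the dependency arcs, followed by mean pooling to obtain the sentence vector. The sketch below assumes undirected arcs with self-loops, mean aggregation over neighbours, a ReLU activation and a caller-supplied weight matrix; none of these choices is specified in the text, and the function names `gcn_layer` and `mean_pool` are hypothetical.

```python
from typing import List, Tuple


def gcn_layer(node_vecs: List[List[float]],
              arcs: List[Tuple[int, int]],
              weight: List[List[float]]) -> List[List[float]]:
    """Average each node with its dependency neighbours, then apply W and ReLU."""
    n = len(node_vecs)
    neighbours = [{i} for i in range(n)]          # self-loops
    for head, dep in arcs:                        # treat arcs as undirected
        neighbours[head].add(dep)
        neighbours[dep].add(head)
    d_in, d_out = len(weight), len(weight[0])
    updated = []
    for i in range(n):
        agg = [sum(node_vecs[j][k] for j in neighbours[i]) / len(neighbours[i])
               for k in range(d_in)]
        out = [max(0.0, sum(agg[k] * weight[k][c] for k in range(d_in)))
               for c in range(d_out)]
        updated.append(out)
    return updated


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Sentence vector as the mean of its word vectors."""
    n = len(vectors)
    return [sum(v[c] for v in vectors) / n for c in range(len(vectors[0]))]


# Three words (0-based indices); word 1 is the head of words 0 and 2.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
enhanced = gcn_layer(words, [(1, 0), (1, 2)], identity)
sentence_vec = mean_pool(enhanced)
```

With the identity weight matrix the layer reduces to neighbourhood averaging, which makes the effect of the dependency structure easy to inspect by hand.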

Step 3) Considering that multi-person conversation has many turns and text content that is large and complex, a topic extraction model is constructed for the input sentences to extract the topic of each utterance.

Step 301) The topic extraction module consists mainly of a variational autoencoder (VAE); the semantically enhanced word vectors $w''_i$ obtained in step 2) are fed recursively into the VAE in time order for training.

Step 302) The output of the VAE is a latent vector of the topic discussed in the sentence. The latent vector constrains the coherent topic of a single conversation through a recurrent hidden state, and the variational distribution of its posterior approximation is defined in terms of the hidden state

$$h_{n-1} = f_\tau(z_{n-1}, w''_{n-1}), \quad n > 1.$$

The two mapping functions used in this distribution are both fully connected layers; $f_\tau(\cdot)$ is a recurrent unit that adopts the multi-head attention mechanism of a Transformer, with the previous latent variable $z_{n-1}$ as the query of its input.

Step 303) The latent variable $z_i$ trained by the topic module is regarded as the topic vector of the current utterance $u_i$; $z_i$ and the sentence representation vector $v_i$ obtained in step 2) are concatenated to obtain the topic-enhanced sentence representation vector $ve_i$.
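To make step 303) concrete, the sketch below draws a topic vector from a diagonal Gaussian with the usual reparameterization trick and concatenates it with the sentence vector. The Gaussian form, the `mu` and `log_var` inputs (which the fully connected layers would produce in a trained model), and the function names are assumptions for illustration only.

```python
import math
import random
from typing import List


def sample_topic(mu: List[float], log_var: List[float],
                 rng: random.Random) -> List[float]:
    """Reparameterized sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def topic_enhanced(sentence_vec: List[float], topic_vec: List[float]) -> List[float]:
    """Concatenate the sentence vector v_i with the topic vector z_i."""
    return list(sentence_vec) + list(topic_vec)


rng = random.Random(0)
# A very small variance makes the sample practically equal to the mean.
z = sample_topic(mu=[0.5, -0.5], log_var=[-60.0, -60.0], rng=rng)
ve = topic_enhanced([1.0, 2.0, 3.0], z)
```

Sampling through `mu + sigma * eps` keeps the draw differentiable with respect to `mu` and `log_var`, which is what allows a VAE to be trained end to end with the rest of a model.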

Step 4) Taking the topic-enhanced sentence representations obtained in step 3) as initial nodes, dialogue subgraphs are constructed to cope with the tangled discussion content of multi-person dialogue.

Step 401) Because a group chat carries a large amount of information over a variable span, building a single graph from the whole conversation makes it easy for useful information to be lost during message passing, so long conversations need to be split. Taking the topic-enhanced sentence representations obtained in step 3) as initial nodes, the utterances of the multi-speaker conversation are clustered according to the similarity of their topic information, and nodes are connected according to spatial distance: if two nodes are close in spatial distance and adjacent in time, an edge is constructed between them, finally yielding a number of dialogue subgraphs obtained by segmentation.
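The segmentation in step 401) can be sketched as a single pass over time-ordered topic vectors: consecutive utterances stay in the same subgraph while their topic vectors remain close, and a new subgraph starts when they diverge. Cosine similarity as the measure of spatial distance, the threshold of 0.8, and the function names are assumptions; the text does not fix the distance measure or the cut-off.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def segment(topic_vecs: List[List[float]], threshold: float = 0.8) -> List[List[int]]:
    """Group temporally adjacent utterances whose topics are similar."""
    segments = [[0]] if topic_vecs else []
    for i in range(1, len(topic_vecs)):
        if cosine(topic_vecs[i - 1], topic_vecs[i]) >= threshold:
            segments[-1].append(i)     # same topic: extend current subgraph
        else:
            segments.append([i])       # topic shift: open a new subgraph
    return segments


# Two utterances about one topic, then two about another.
subgraphs = segment([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
```

Each returned list holds the utterance indices of one dialogue subgraph, ready for the edge construction of the next step.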

Step 402) Each subgraph represents a time period in which the same thing is discussed and the emotions are connected. First, the nodes in each subgraph are processed; the nodes of the subgraph are the utterance representations $ve_i$, and the corresponding speaker is denoted $p_i$. A heterogeneous graph is constructed for each subgraph, with two types of edges: the same-speaker edge $e^1$ and the different-speaker edge $e^0$. The nodes are arranged in time order, and node i constructs edges by the following rule: $ve_i$ constructs edges with the nodes after it; if the speaker of a later node $ve_j$ satisfies $p_j \neq p_i$, the two nodes are connected by an edge of type $e^0$; this continues until a node with $p_k = p_i$ is detected, at which point an edge of type $e^1$ is constructed between $ve_i$ and $ve_k$ and detection stops. The edges of node $ve_{i+1}$ and its subsequent nodes are then constructed in the same way. The final initial graph of the subgraph is obtained once all nodes have been processed.

Step 403) The nodes of the subgraph obtained in step 402) are updated using a graph convolution network (GCN), wherein the update of a node is determined by its preceding nodes and the edges connecting them, and edges of different relation types carry different weights. The updated nodes are the sentence nodes $vse_i$ that have undergone sentence semantic enhancement and emotion interaction processing.
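A relation-aware update of the kind described in step 403) could look like the sketch below, in which each node combines its own vector with messages from preceding nodes, using a separate weight matrix per edge type. The self-connection, the per-relation mean normalization, the ReLU activation, the toy weights, and the names `matvec` and `relational_update` are assumptions; the text only states that different edge types carry different weights.

```python
from typing import Dict, List, Tuple


def matvec(matrix: List[List[float]], vec: List[float]) -> List[float]:
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def relational_update(nodes: List[List[float]],
                      edges: List[Tuple[int, int, int]],
                      w_self: List[List[float]],
                      w_rel: Dict[int, List[List[float]]]) -> List[List[float]]:
    """Update each node from its predecessors, one weight matrix per edge type."""
    updated = []
    for target, vec in enumerate(nodes):
        total = matvec(w_self, vec)
        for rel, weight in w_rel.items():
            preds = [nodes[src] for (src, tgt, r) in edges
                     if tgt == target and r == rel]
            for pred in preds:
                msg = matvec(weight, pred)
                # Mean-normalize messages within each relation type.
                total = [t + m / len(preds) for t, m in zip(total, msg)]
        updated.append([max(0.0, t) for t in total])
    return updated


# Edges are (source, target, type): type 0 = different speaker, 1 = same speaker.
toy_nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_edges = [(0, 1, 0), (0, 2, 1), (1, 2, 0)]
eye = [[1.0, 0.0], [0.0, 1.0]]
weights = {0: [[0.5, 0.0], [0.0, 0.5]], 1: [[2.0, 0.0], [0.0, 2.0]]}
vse = relational_update(toy_nodes, toy_edges, eye, weights)
```

In this toy setting the same-speaker weight is larger than the different-speaker weight, so a speaker's own earlier utterance influences the update more than another speaker's does.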

Step 5) inputting the graph nodes into a classifier to obtain a classification result, namely the emotion category.

Step 501) The final sentence representation $vse_i$ obtained in step 403) is fed into a classifier to classify the emotion of the utterance. The classifier uses one fully connected layer:

$$h_i = \mathrm{ReLU}(W_h\, vse_i + b_h)$$

$$l_i = \mathrm{softmax}(W_l\, h_i + b_l)$$

wherein k represents the number of emotion category labels, and $W_h$, $b_h$, $W_l$, $b_l$ are learnable parameters.

Step 502) The whole method is trained in an end-to-end manner, and the total loss function of the method is defined from a learnable parameter λ and two component losses: the loss function of the topic module, which is calculated using the evidence lower bound, and the loss function of the classifier, which is calculated using the cross-entropy loss function.
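How the two losses might be combined is sketched below. The exact form of the total loss is not recoverable from the text, so the additive combination `cls_loss + lam * topic_loss` is purely an assumption, as are the function names; λ is treated as a plain float here, whereas in training it would be a learnable parameter.

```python
import math
from typing import List


def cross_entropy(probs: List[float], gold: int) -> float:
    """Classifier loss: negative log-probability of the gold label."""
    return -math.log(probs[gold])


def negative_elbo(recon_log_likelihood: float, kl: float) -> float:
    """Topic-module loss: negative evidence lower bound."""
    return -(recon_log_likelihood - kl)


def total_loss(cls_loss: float, topic_loss: float, lam: float) -> float:
    """Assumed combination of the two losses, weighted by lambda."""
    return cls_loss + lam * topic_loss


cls = cross_entropy([0.5, 0.5], gold=0)
topic = negative_elbo(recon_log_likelihood=-2.0, kl=0.5)
loss = total_loss(1.0, topic, lam=0.2)
```

Minimizing the negative ELBO is equivalent to maximizing the evidence lower bound, which is why the topic term enters the total loss with a positive sign.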

The method mainly aims to perform emotion recognition. A syntactic dependency graph is constructed for each sentence, the sentence is reconstructed with a variational autoencoder network, and the latent variables are trained to represent the topic of the sentence, which makes it easier to track subsequent shifts in the group-chat topic and improves the extraction of conversation features as far as possible. A heterogeneous graph network is used to encode the sentences: nodes with the same topic are clustered, and long conversations with multiple topics, multiple speakers and disordered information are cut into dialogue subgraphs that have clear topics and concise information and are easy to classify; the heterogeneous graph also makes it possible to distinguish the emotion changes that different speakers undergo under outside influence. To improve the accuracy of conversational emotion recognition, the invention considers both semantic modeling and emotion interaction in multi-person conversation and combines the two tasks. Because topic changes in open conversation affect the continuity of emotional interaction, the invention also incorporates topic detection into learning. The invention uses the topic extraction model to model the jumps in chat information and combines it with dependency syntax analysis to jointly capture the information of the input dialog. The invention further designs a heterogeneous graph network to update the feature information of the conversation nodes and to model the emotional interaction among different speakers for better emotion analysis. The method simplifies the multi-user dialogue emotion analysis task by performing topic segmentation, improving both the speed and the accuracy of emotion recognition.
In summary, the invention fully considers the interactive information among the speakers and improves the prediction accuracy of emotion analysis in multi-speaker dialogue.

Claims

1. A topic semantic enhancement based heterogeneous graph structure multi-conversation person emotion analysis method is characterized by comprising the following steps:

2. The method for analyzing emotion of multiple conversations based on topic semantics enhanced heterogeneous graph structure according to claim 1, wherein in step S10, the process of performing emotion word embedding operation on the input dialogue to convert it from human language to vector representation with emotion comprises the following sub-steps:

wherein u_i represents the i-th utterance in the dialogue text D, p_{u_i} represents the speaker corresponding to utterance u_i, and u_i = {w_{i,1}, w_{i,2}, ..., w_{i,n}} means that the utterance is composed of n words; i = 1, 2, ..., N; j = 1, 2, ..., M; M is the total number of speakers; N is the total number of utterances;

wherein l(w) represents the lemmatized form of each word; when a word carries actual emotional meaning, its emotion vector W2AV has corresponding real values in each dimension of VAD; otherwise, when the word carries no emotional meaning, its emotion vector is uniformly set to [5, 1, 5], the three values respectively expressing extremely weak valence V, medium arousal A and extremely weak dominance D;

S13, concatenating the emotion word embedding W2AV of the word w obtained in step S12 with the basic word2vec word vector obtained in step S11 to obtain the final word vector representation

W'_i serves as the initial input to the encoder and is called the emotion word embedding.
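The concatenation in step S13 can be sketched as follows. This is a minimal illustration, not the patented implementation: the toy lexicon `VAD_LEXICON`, its numeric values, the `lemmatize` stand-in for l(w), and the two-dimensional base vectors are all hypothetical; only the default vector [5, 1, 5] and the concatenation itself come from the claim.

```python
# Hypothetical toy lexicon: lemma -> (valence, arousal, dominance).
# The numeric values are illustrative only.
VAD_LEXICON = {"happy": (8.5, 6.0, 7.2), "sad": (2.1, 3.5, 3.8)}

# Default from the claim for words without emotional meaning.
NEUTRAL_VAD = (5.0, 1.0, 5.0)


def lemmatize(word):
    """Stand-in for l(w); a real system would use a proper lemmatizer."""
    return word.lower()


def w2av(word):
    """Return the VAD emotion vector of a word, or the neutral default."""
    return VAD_LEXICON.get(lemmatize(word), NEUTRAL_VAD)


def emotion_word_embedding(word, base_vectors):
    """Concatenate the base word vector with the VAD emotion vector."""
    return list(base_vectors[word]) + list(w2av(word))
```

With a base table such as `{"Happy": [0.1, 0.2]}`, the call `emotion_word_embedding("Happy", table)` yields the two base dimensions followed by the three VAD dimensions.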

3. The method for analyzing emotion of multiple conversations based on topic semantic enhanced heterogeneous graph structure according to claim 1, wherein in step S20, the process of constructing a syntactic dependency graph according to dependency syntactic relations for each sentence input in step S10, the nodes being words in the sentences, inputting the syntactic dependency graph into a graph convolution neural network to update node information, and obtaining a semantic enhanced word vector and a corresponding sentence representation vector comprises the following sub-steps:

S22, finding the dependent words that have a dependency relation with the central word according to grammar rules, distinguishing the left and right nodes of the tree according to the temporal order of the central word and the dependent word, and adding the dependent words to the tree until all words have been examined, thereby completing the syntactic dependency graph;

w″_i = GRU(w′_1, w′_2, ..., w′_n);

4. The method for analyzing emotion of multiple conversations based on topic semantics enhanced heterogeneous graph structure according to claim 3, wherein in step S22, the dependent word having dependency relationship with the central word is found according to grammar rules, the left and right nodes of the tree are distinguished according to the time sequence order of the central word and the dependent word, and the dependent word is contained in the tree until all words are detected, and the process of completing the syntactic dependency graph includes the following sub-steps:

otherwise, pushing the next word in the queue onto the top of the stack;

S223, repeating step S222 until only the root node remains in the stack and the queue is empty.
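The stack-and-queue loop of steps S222 and S223 resembles a transition-based (shift-reduce) parser. The sketch below assumes an arc-standard scheme and replaces the grammar rules with a given list of head indices `heads`, because the text of steps S221 and S222 is not fully reproduced here; the function name and the arc format are hypothetical.

```python
from collections import deque


def build_dependency_arcs(words, heads):
    """Build (head, dependent, side) arcs with a stack and a queue.

    words: list of tokens in sentence order.
    heads: heads[i] is the index of the head of word i, or -1 for the root.
    """
    n = len(words)
    # Number of dependents of each word that are not yet attached.
    remaining = [0] * n
    for h in heads:
        if h >= 0:
            remaining[h] += 1

    stack, queue, arcs = [], deque(range(n)), []
    # Loop until only the root is on the stack and the queue is empty.
    while queue or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # The dependent precedes its head: attach it as a left node.
            if heads[below] == top and remaining[below] == 0:
                arcs.append((top, below, "left"))
                remaining[top] -= 1
                del stack[-2]
                continue
            # The dependent follows its head: attach it as a right node.
            if heads[top] == below and remaining[top] == 0:
                arcs.append((below, top, "right"))
                remaining[below] -= 1
                stack.pop()
                continue
        if not queue:
            raise ValueError("heads do not describe a projective tree")
        # Otherwise push the next word of the queue onto the stack.
        stack.append(queue.popleft())
    return arcs
```

For the sentence "She likes green apples" with `heads = [1, -1, 3, 1]`, the loop attaches "She" to the left of "likes", "green" to the left of "apples", and "apples" to the right of "likes", leaving only the root on the stack.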

5. The method for analyzing emotion of multiple conversations based on topic semantics enhanced heterogeneous graph structure according to claim 1, wherein in step S30, the process of constructing a topic extraction model according to the semantically enhanced word vectors obtained in step S20 and extracting the topic of each sentence to obtain topic-enhanced sentence representations comprises the following steps:

S31, a variational autoencoder is adopted to form the topic extraction module; the semantically enhanced word vectors w″_i obtained in step S20 are input into the topic extraction module recursively according to the timing information for training, and the output of the module is the latent vector of the topic discussed in the sentence; the latent vector constrains the coherent topic of a single dialogue through a recurrent hidden state, and the variational distribution approximating its posterior is:

wherein h_{n-1} = f_τ(z_{n-1}, w″_{n-1}), n > 1;

and

are all fully connected layers; f_τ(·) is a recurrent unit that adopts the multi-head attention mechanism of the Transformer, with the previous latent variable z_{n-1} as the input query:

wherein

represents the output for a given input x_n,
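The recurrence h_{n-1} = f_τ(z_{n-1}, w″_{n-1}) with the previous latent variable as the attention query can be illustrated with a stripped-down sketch. It assumes a single attention head, no learned projection matrices, and that the keys and values are the word vectors of the previous sentence; none of these details is confirmed by the claim, and the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_state(z_prev, word_vectors):
    """Scaled dot-product attention with z_prev as the query.

    z_prev: previous latent topic vector z_{n-1}.
    word_vectors: vectors of the previous sentence, used as keys and values.
    Returns a weighted sum of the word vectors, standing in for h_{n-1}.
    """
    d = len(z_prev)
    scores = [
        sum(q * k for q, k in zip(z_prev, w)) / math.sqrt(d)
        for w in word_vectors
    ]
    weights = softmax(scores)
    return [
        sum(a * w[j] for a, w in zip(weights, word_vectors))
        for j in range(d)
    ]
```

A full Transformer-style unit would add learned query, key and value projections and several heads; the sketch only shows how the previous topic vector selects which words contribute to the hidden state.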

6. The method for analyzing emotion of multiple conversations based on topic semantics enhanced heterogeneous graph structure according to claim 1, wherein in step S40, the topic enhanced sentence representations obtained in step S30 are used as initial nodes, clustering is performed according to topic similarity, a conversation sub-graph is constructed according to sentence topic information and time sequence relations, a heterogeneous conversation graph is constructed, the nodes are sentence representations of each sentence, and the process of updating graph nodes by using a graph loop network includes the following sub-steps:

S41, taking the topic-enhanced sentence representations obtained in step S30 as initial nodes, clustering the utterances of the multiple speakers according to topic similarity, and connecting the nodes according to spatial distance: if two nodes are close in spatial distance and adjacent in time, an edge is constructed between them, finally yielding a plurality of segmented dialogue subgraphs; each dialogue subgraph indicates that the matter discussed in that time period is the same and the emotions are connected;

S42, processing the nodes in each dialogue subgraph, wherein the nodes of the graph are the utterances ve_i of each speaker, and the corresponding speaker is represented as

constructing a heterogeneous graph for each dialogue subgraph, wherein the edges are divided into two types: same-speaker edges e^1 and different-speaker edges e^0; the nodes are arranged in time order, and the rule by which node i constructs edges is as follows: ve_i constructs edges with the nodes that follow it; if the speaker of a following node ve_j satisfies

the two nodes are connected, denoted as

until a node of the same speaker is detected, denoted ve_k, k > i; when

the edge constructed is

S43, using a graph convolutional network to update the nodes of the initial subgraph obtained in step S42, wherein the update of a node is determined by its preceding nodes and the edges connecting it to them, and edges of different relation types carry different weights; the updated nodes are the sentence nodes vse_i that have undergone sentence semantic enhancement and emotion interaction.
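The edge rule of step S42 can be sketched as follows. The conditions in the claim are given as formula images that are not reproduced here, so the sketch assumes that a following node of a different speaker receives an edge of type 0 and that the scan for node i stops at the first following node of the same speaker, which receives an edge of type 1. The function name and the tuple format are hypothetical.

```python
def build_speaker_edges(speakers):
    """Construct typed edges for one dialogue subgraph.

    speakers: speakers[i] is the speaker of node ve_i, nodes in time order.
    Returns a list of (i, j, edge_type) with edge_type 1 for a same-speaker
    edge and 0 for a different-speaker edge.
    """
    edges = []
    for i, p_i in enumerate(speakers):
        for j in range(i + 1, len(speakers)):
            if speakers[j] == p_i:
                # First later utterance by the same speaker: link and stop.
                edges.append((i, j, 1))
                break
            # Later utterance by another speaker: different-speaker edge.
            edges.append((i, j, 0))
    return edges
```

For the speaker sequence A, B, A, C, node 0 is linked to node 1 with type 0 and to node 2 with type 1, after which the scan for node 0 stops. A relation-aware graph network can then use a separate weight for each edge type when aggregating from preceding nodes.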

7. The method for analyzing emotion of multiple conversations based on topic semantics enhanced heterogeneous graph structure according to claim 6, wherein in step S51, the final sentence representation vse_i obtained in step S43 is input into a classifier to classify the emotion of the utterance; the classifier uses one fully connected layer:

h_i = ReLU(W_h vse_i + b_h)

l_i = softmax(W_l h_i + b_l)

wherein k represents the number of emotion category labels,
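The two classifier equations can be traced with a small pure-Python sketch. The weight matrices, biases and dimensions used in any call are hypothetical; in the method they would be learned during training.

```python
import math


def affine(W, x, b):
    """Compute W x + b for a matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]


def softmax(v):
    """Numerically stable softmax."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def classify(vse, W_h, b_h, W_l, b_l):
    """h = ReLU(W_h vse + b_h); l = softmax(W_l h + b_l)."""
    h = relu(affine(W_h, vse, b_h))
    return softmax(affine(W_l, h, b_l))
```

The returned list has one probability per emotion label, so its length equals k, and the predicted label is the index of the largest entry.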

8. The method for analyzing emotion of multiple conversations based on topic semantics enhanced heterogeneous graph structure according to claim 7, wherein in step S52, the whole method is trained end to end, and the total loss function of the method is defined as follows:

wherein λ is a learnable parameter,

is the loss function of the topic module,

is the loss function of the classifier;

is calculated using the evidence lower bound (ELBO), and

is calculated using the cross-entropy loss function.
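The classifier loss and one possible way of combining it with the topic loss can be sketched as below. The cross-entropy function follows its standard definition. The weighted sum in `total_loss` is purely an assumption for illustration, because the combination formula is an image not reproduced in this text; the topic loss (a negative ELBO) is passed in as a plain number, and λ is treated as a fixed value rather than a learned one.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of a predicted distribution against a true label index."""
    return -math.log(probs[label])


def total_loss(topic_loss, cls_loss, lam):
    """Hypothetical combination: lam * topic loss + (1 - lam) * classifier loss.

    The weighting scheme is an assumption, not the formula from the claim.
    """
    return lam * topic_loss + (1.0 - lam) * cls_loss
```

In end-to-end training both terms would be differentiated together, so gradients from the emotion labels also shape the topic module.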