CN113947074A - Deep collaborative interaction emotion reason joint extraction method - Google Patents

Deep collaborative interaction emotion reason joint extraction method

Info

Publication number
CN113947074A
Authority
CN
China
Prior art keywords
emotion
reason
representation
representing
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111188307.5A
Other languages
Chinese (zh)
Inventor
史树敏
邬成浩
黄河燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN202111188307.5A
Publication of CN113947074A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/237 - Lexical tools
    • G06F 40/247 - Thesauruses; Synonyms
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 - Details of database functions independent of the retrieved data types
    • G06F 16/95 - Retrieval from the web
    • G06F 16/953 - Querying, e.g. by the use of web search engines
    • G06F 16/9536 - Search customisation based on social or collaborative filtering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/049 - Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR SUCH PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/01 - Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a deep collaborative interaction emotion reason joint extraction method and belongs to the technical field of emotion analysis in natural language processing. The method uses pre-trained word feature vectors to obtain the vectorized representation of each word in a text sequence, and uses a bidirectional long short-term memory network to perform sentence-level text encoding of the word representations fused with external knowledge. An attention mechanism determines the importance of each word in the representation learning process, yielding a shallow emotion representation and a candidate reason representation. A stack of multi-layer collaborative attention networks then models the association between the emotion representation and the reason representation and outputs deeply interactive emotion and reason representations. Finally, the emotion probability vector and the reason probability vector are computed simultaneously through joint learning. The method better captures the characteristics of both the emotion and its reason in a text, can be applied to emotion reason extraction for explicit emotion texts as well as implicit emotion texts, and realizes synchronous joint extraction of emotions and their reasons.

Description

Deep collaborative interaction emotion reason joint extraction method
Technical Field
The invention relates to a deep collaborative interaction emotion reason joint extraction method, in particular to a method that deeply models the latent relation between emotion and reason information through collaborative interaction and efficiently extracts the emotions and reasons contained in a text through joint learning, and belongs to the technical field of emotion analysis in natural language processing.
Background
In recent years, with the development of the internet and social networks, text emotion reason extraction has become one of the most popular research directions in the field of natural language processing. It can comprehensively and accurately understand the emotion expressed in a text and mine the reasons behind that emotion, and it can be applied in many scenarios such as social public opinion analysis, customer feedback tracking, system supervision, and medical health monitoring, thereby generating broad social benefits.
Emotion reason extraction is a recently emerged technique for deep text analysis and understanding that identifies the causes of the emotions expressed in a text. Existing emotion reason extraction methods mainly identify and extract reasons when the text has obvious emotional characteristics, that is, when it contains explicit emotion words.
However, in daily expression, the human emotions reflected in objective experiences of things and behaviors are rich and abstract. Besides expressing emotion with specific explicit emotion words, people also express emotion implicitly through objective statements or rhetorical means. Implicit emotion is defined as "a language segment (sentence, clause or phrase) that does not contain explicit emotion words but expresses subjective emotion". Because implicit emotional text is more obscure and harder to understand than explicit emotional text, existing methods cannot adequately capture the deep link between implicit emotion information and reason information when extracting emotion reasons from implicit emotional text, and therefore cannot extract the emotion reasons effectively.
Therefore, further research on emotion reason extraction is needed, so that it can be applied to both explicit and implicit emotion texts and improve text information mining more comprehensively and accurately. Such research positively promotes work on natural language understanding, text representation learning and joint learning, and further drives the rapid development of applications and industries based on text emotion analysis.
Disclosure of Invention
The technical problem addressed by the invention is that the prior art cannot fully capture the deep connection between the emotion information and the reason information of a text and, in particular, cannot be applied to implicit emotion text. The invention better solves the problems of a single application scenario and of perceiving the connection between emotion and reason while extracting both simultaneously; in particular, it makes emotion reason extraction applicable to explicit emotion texts and implicit emotion texts at the same time, and creatively provides a deep collaborative interaction emotion reason joint extraction method.
First, the relevant concepts are explained:
definition 1: emotional cause corpus
A document set to be extracted, together with the corresponding specific emotions and candidate reasons provided for the emotion reason extraction task; the documents in the corpus include explicit emotion texts and implicit emotion texts. Candidate reasons can be labeled at any granularity, such as the clause level or tuple level.
Definition 2: text sequence s
It is expressed as s = {w_1, w_2, ..., w_N} and denotes a sentence requiring emotion analysis. The sentence contains N words w_1, w_2, ..., w_N, where the subscript N is the length of the sentence word sequence and w denotes a word.
Definition 3: word feature vector for input text sequence
The pre-training vector used for vectorizing the input text sequence comprises a semantic vector and a position vector, wherein the semantic vector refers to semantic feature information of a current word, and the position vector refers to position feature information of the current word in the text sequence.
Definition 4: Attention
Refers to the phenomenon in which humans, in order to make reasonable use of limited visual information processing resources, select a specific portion of the visual field and then focus on it.
Artificial intelligence exploits this phenomenon to give neural networks the ability to select specific inputs. In this method, attention means: if a word in a sentence is more relevant to the representation of the current sentence, that word is given a higher weight.
Definition 5: deep collaborative interaction network model
Refers to a network model that captures deep connections between two representations through a stack of multiple collaborative interaction layer models.
The invention is realized by adopting the following technical scheme.
A deep collaborative interaction emotion reason joint extraction method comprises the following steps:
Step 1: Acquire an emotion reason corpus, and obtain the text documents to be extracted and their emotion candidate reason texts.
Step 2: Perform word segmentation on the input document requiring emotion reason extraction and on its candidate reason text, obtaining the text sequence s1 of the document and the text sequence s2 of the candidate reason.
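As a minimal illustration of this step, the sketch below uses the jieba tokenizer for Chinese word segmentation (the patent does not name a specific segmentation tool); the example strings are illustrative renderings of the worked example given later, not text quoted from the patent.

```python
# Sketch of Step 2: word segmentation with jieba (assumed tooling).
import jieba

# Illustrative texts (paraphrases of the worked example, not quoted from the patent).
document = "我在食堂吃炸鸡，突然被邀请去好朋友的葬礼"
candidate_cause = "突然被邀请去好朋友的葬礼"

s1 = jieba.lcut(document)         # word sequence of the document
s2 = jieba.lcut(candidate_cause)  # word sequence of the candidate reason
```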
Step 3: Use pre-trained word feature vectors to obtain the vectorized representation of each word in text sequences s1 and s2, yielding the word feature representations.
The sum of the semantic representation and the position representation of each word is taken as its word feature vector, giving the feature vector corresponding to each word in the text sequence.
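A minimal sketch of this step in PyTorch is given below (module and parameter names are illustrative assumptions, not taken from the patent): each word's feature vector is formed by summing its pre-trained semantic embedding and a position embedding.

```python
import torch
import torch.nn as nn

class WordFeatureEmbedder(nn.Module):
    """Sketch of Step 3: word feature = semantic vector + position vector."""
    def __init__(self, pretrained_weights, max_len=512):
        super().__init__()
        # semantic vectors, e.g. pre-trained Word2Vec, frozen here by assumption
        self.semantic = nn.Embedding.from_pretrained(pretrained_weights, freeze=True)
        dim = pretrained_weights.size(1)
        # position vectors, assumed here to be learned embeddings
        self.position = nn.Embedding(max_len, dim)

    def forward(self, token_ids):                                    # (batch, L)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.semantic(token_ids) + self.position(positions)   # (batch, L, dim)
```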
Step 4: Use a bidirectional long short-term memory (LSTM) network to perform sentence-level text encoding of the word representations of the text sequences s1 and s2 obtained in Step 3.
Specifically, the LSTM network computes the node states according to formulas 1 to 5:
i^(t) = δ(U_i·x^(t) + W_i·h^(t-1) + b_i)   (1)
f^(t) = δ(U_f·x^(t) + W_f·h^(t-1) + b_f)   (2)
o^(t) = δ(U_o·x^(t) + W_o·h^(t-1) + b_o)   (3)
c^(t) = f^(t) ⊙ c^(t-1) + i^(t) ⊙ tanh(U_c·x^(t) + W_c·h^(t-1) + b_c)   (4)
h^(t) = o^(t) ⊙ tanh(c^(t))   (5)
where x^(t), i^(t), f^(t), o^(t), c^(t) and h^(t) respectively denote the input vector, input gate state, forget gate state, output gate state, memory cell state and hidden layer state of the LSTM at time t; c^(t-1) denotes the memory cell state of the LSTM at time t-1 and h^(t-1) the hidden layer state at time t-1; W, U and b respectively denote the recurrent weight parameters, the input weight parameters and the bias parameters; the symbol ⊙ denotes the element-wise product, and the sequence representation vector of each sentence at time t is obtained from the output of the model; δ() denotes the sigmoid function; b_i, b_f, b_o and b_c denote the bias term parameters used when computing the input gate, forget gate, output gate and memory cell states, respectively; U_i, U_f, U_o and U_c denote the weight matrices of the input vector used when computing the input gate, forget gate, output gate and memory cell states, respectively; W_i, W_f, W_o and W_c denote the weight matrices of the hidden layer state used when computing the input gate, forget gate, output gate and memory cell states, respectively.
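For illustration, a single time step of formulas (1) to (5) can be written out directly; the sketch below is a literal transcription in PyTorch (the parameter containers and their names are assumptions for readability, not part of the patent).

```python
import torch

def lstm_step(x_t, h_prev, c_prev, U, W, b):
    """One LSTM time step following formulas (1)-(5).
    U, W, b are dicts of weight matrices / bias vectors keyed by gate
    name ('i', 'f', 'o', 'c'); this layout is an assumption."""
    i_t = torch.sigmoid(x_t @ U['i'].T + h_prev @ W['i'].T + b['i'])  # input gate, (1)
    f_t = torch.sigmoid(x_t @ U['f'].T + h_prev @ W['f'].T + b['f'])  # forget gate, (2)
    o_t = torch.sigmoid(x_t @ U['o'].T + h_prev @ W['o'].T + b['o'])  # output gate, (3)
    c_t = f_t * c_prev + i_t * torch.tanh(
        x_t @ U['c'].T + h_prev @ W['c'].T + b['c'])                  # memory cell, (4)
    h_t = o_t * torch.tanh(c_t)                                       # hidden state, (5)
    return h_t, c_t
```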
To obtain information about the words preceding and following each word, the sequence is encoded with a bidirectional LSTM, and the representation vectors of the i-th element produced by the forward and backward LSTMs are concatenated to obtain the final representation h_i:
h_i^→ = LSTM^→(d_i), i ∈ [1, L]   (6)
h_i^← = LSTM^←(d_i), i ∈ [L, 1]   (7)
h_i = [h_i^→ ; h_i^←]   (8)
where h_i^→ is the hidden state representation of the i-th element output by the forward LSTM network, LSTM^→ denotes the forward LSTM network, d_i is the representation of the i-th element in the input sequence, L is the number of words in the sentence, h_i^← is the hidden state representation of the i-th element output by the backward LSTM network, and LSTM^← denotes the backward LSTM network.
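In practice the bidirectional encoding of formulas (6) to (8) can rely on an off-the-shelf bidirectional LSTM, which already concatenates the forward and backward hidden states; a minimal PyTorch sketch (names and sizes are assumptions) follows.

```python
import torch.nn as nn

class SentenceEncoder(nn.Module):
    """Sketch of Step 4: sentence-level encoding with a bidirectional LSTM."""
    def __init__(self, dim, hidden):
        super().__init__()
        # bidirectional=True makes PyTorch return [h_forward ; h_backward],
        # matching the concatenation in formula (8)
        self.bilstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, word_features):          # (batch, L, dim)
        h, _ = self.bilstm(word_features)      # (batch, L, 2 * hidden)
        return h
```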
Step 5: Determine the importance of each word from Step 4 in the representation learning process through an Attention mechanism, and compute the shallow emotion representation E of text sequence s1 and the shallow reason representation C of text sequence s2.
Specifically, the attention score a_i of each word is calculated as:
a_i = exp(W_a·h_i) / Σ_{j=1}^{L} exp(W_a·h_j)   (9)
where h_i is the representation of the i-th element in the input sequence, h_j is the representation of the j-th element, L is the number of words in the sentence, and W_a is the attention weight parameter. Each sentence is represented as a sequence of words and further as the weighted average of the word representations, where the weights are the attention values calculated by formula 9.
The shallow emotion representation E of text sequence s1 takes the form:
E = Σ_{i=1}^{L_1} a_i·h_i   (10)
where L_1 is the number of words in text sequence s1.
The shallow reason representation C of text sequence s2 takes the form:
C = Σ_{i=1}^{L_2} a_i·h_i   (11)
where L_2 is the number of words in text sequence s2.
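A minimal sketch of this attention pooling in PyTorch is shown below; the exact parameterization of the score in formula (9) is not fully specified in the text, so a single learned scoring vector is assumed here.

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Sketch of Step 5: attention-weighted average of the word states,
    producing a shallow representation such as E or C."""
    def __init__(self, hidden2):
        super().__init__()
        # assumed form of W_a in formula (9): one learned scoring vector
        self.w_a = nn.Linear(hidden2, 1, bias=False)

    def forward(self, h):                                      # (batch, L, 2*hidden)
        a = torch.softmax(self.w_a(h).squeeze(-1), dim=-1)     # attention weights a_i
        return torch.bmm(a.unsqueeze(1), h).squeeze(1)         # (batch, 2*hidden)
```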
Step 6: Take the shallow emotion representation E and the shallow reason representation C output in Step 5 as the input of the deep collaborative interaction network model, and output the deep emotion representation and the deep reason representation.
Specifically, a stack of M collaborative attention network layers is used to model the association between the emotion representation and the reason representation, outputting the deeply interactive emotion representation and reason representation:
E^(m+1) = E^m + Softmax(E^m·(C^m)^T)·C^m   (12)
C^(m+1) = C^m + Softmax(C^m·(E^m)^T)·E^m   (13)
where E^(m+1) denotes the emotion representation at layer m+1 of the collaborative interaction network, E^m the emotion representation at layer m, C^(m+1) the reason representation at layer m+1, C^m the reason representation at layer m, and T denotes matrix transposition.
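A direct transcription of formulas (12) and (13) is sketched below; the shapes of E^m and C^m are not pinned down by the text, so they are assumed here to be matrices whose rows are word-level representations (with a single vector per sentence the softmax term degenerates).

```python
import torch

def co_attention_layer(E, C):
    """One collaborative-interaction layer following formulas (12)-(13).
    E: (L_e, d), C: (L_c, d); these shapes are an assumption, not from the patent."""
    E_new = E + torch.softmax(E @ C.T, dim=-1) @ C   # formula (12)
    C_new = C + torch.softmax(C @ E.T, dim=-1) @ E   # formula (13)
    return E_new, C_new

def deep_interaction(E, C, num_layers=3):
    """Stack M collaborative-interaction layers (M = 3 in the worked example)."""
    for _ in range(num_layers):
        E, C = co_attention_layer(E, C)
    return E, C
```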
Step 7: Pass the deep emotion representation E^M and the deep reason representation C^M obtained in Step 6 through a Softmax layer to jointly compute the corresponding emotion probability vector y^E and reason probability vector y^C.
Specifically, the emotion probability vector y^E gives the probability of each specific emotion of the corresponding document:
y_i^E = exp(w_i^T·E^M + b_i) / Σ_{j=1}^{K} exp(w_j^T·E^M + b_j)   (14)
The reason probability vector y^C gives the probability that the current candidate reason is the cause of the text emotion:
y_i^C = exp(w_i^T·C^M + b_i) / Σ_{j=1}^{K} exp(w_j^T·C^M + b_j)   (15)
where y_i^E denotes the probability that the document contains the i-th emotion, y_i^C denotes the probability that the candidate reason is or is not the reason for the document's emotion, w_i denotes the i-th weight parameter, w_j the j-th weight parameter, b_i the i-th bias term parameter, b_j the j-th bias term parameter, K the dimension of the probability vector, and T denotes matrix transposition.
After obtaining the emotion probability vector and the reason probability vector, the sum of their cross-entropy losses is used as the loss function, and the parameters are updated by gradient descent to minimize the joint prediction error of the model.
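The joint prediction and the summed cross-entropy objective can be sketched as follows (layer sizes, label sets and the pooling of the deep representations into single vectors are assumptions; the patent does not specify them).

```python
import torch
import torch.nn as nn

class JointHead(nn.Module):
    """Sketch of Step 7: two Softmax classifiers trained jointly with the
    sum of their cross-entropy losses."""
    def __init__(self, dim, num_emotions, num_cause_labels=2):
        super().__init__()
        self.emotion_fc = nn.Linear(dim, num_emotions)     # logits for formula (14)
        self.cause_fc = nn.Linear(dim, num_cause_labels)   # logits for formula (15)
        self.ce = nn.CrossEntropyLoss()

    def forward(self, E_deep, C_deep, emotion_gold=None, cause_gold=None):
        # E_deep, C_deep: (batch, dim) vectors pooled from the deep representations
        e_logits = self.emotion_fc(E_deep)
        c_logits = self.cause_fc(C_deep)
        loss = None
        if emotion_gold is not None and cause_gold is not None:
            # joint objective: sum of the two cross-entropies
            loss = self.ce(e_logits, emotion_gold) + self.ce(c_logits, cause_gold)
        return torch.softmax(e_logits, -1), torch.softmax(c_logits, -1), loss
```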
At this point, through Steps 1 to 7, the emotion probability of the given document and the judgment of the candidate reason have been obtained, completing the deep collaborative interaction emotion reason joint extraction.
Advantageous effects
Compared with the prior art, the method of the invention has the following advantages:
1. The method can fully capture the deep connection between the emotion information and the reason information of a text, deeply model emotion and reason, and better capture the characteristics of the emotion and the reason in the text.
2. The method can be applied simultaneously to emotion reason extraction for explicit emotion texts and implicit emotion texts, better solving the problems of a single application scenario and of perceiving the connection between emotion and reason while extracting both at the same time, and realizing synchronous joint extraction of emotions and their reasons.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The method of the present invention will be described in further detail below with reference to the accompanying drawings and examples.
As shown in fig. 1, a method for extracting emotion reason combination of deep collaborative interaction includes the following steps:
Step A: obtain the document text to be extracted and the candidate reason text.
Specifically, this embodiment performs the same operation as Step 1 of the summary of the invention.
Step B: obtain the text sequence representations.
Specifically, in this embodiment, the document text sequence and the candidate reason text sequence are obtained and initialized with pre-trained word feature representations to obtain the corresponding text sequence representations, the same as Steps 2 to 3 of the summary of the invention.
Step C: obtain the shallow emotion semantic representation and the shallow reason semantic representation.
Specifically, in this embodiment, the document text sequence representation and the candidate reason text sequence representation are each encoded by the bidirectional LSTM and the attention network to obtain the shallow emotion semantic representation and the shallow reason semantic representation, the same as Steps 4 to 5 of the summary of the invention.
Step D: obtain the deeply interactive emotion representation and reason representation.
Specifically, in this embodiment, the shallow emotion semantic representation and the shallow reason semantic representation are passed through the multi-layer stacked collaborative attention network to capture the deep connection between them, yielding the deeply interactive emotion representation and reason representation, the same as Step 6 of the summary of the invention.
Step E: jointly compute the emotion and reason probabilities.
Specifically, in this embodiment, the emotion and reason probabilities are computed simultaneously through joint learning, the same as Step 7 of the summary of the invention.
Examples
Taking as an embodiment the emotion reason extraction corpus sentence "I was eating fried chicken in the dining hall when I was suddenly invited to my best friend's funeral", the specific operation steps of the method are explained in detail below.
As shown in fig. 1, a method for extracting emotion reason combination of deep collaborative interaction includes the following steps:
Step A: obtain the document text to be extracted and the candidate reason text.
Specifically, in this embodiment, the document text "I was eating fried chicken in the dining hall when I was suddenly invited to my best friend's funeral" and the candidate reason text "suddenly invited to my best friend's funeral" are obtained.
Step B: obtain the text sequence representations.
Specifically, in this embodiment, the document text and the candidate reason text are segmented into words, and each word's Word2Vec feature vector, pre-trained on a large-scale text data set, is looked up to obtain the sequence representation of the document text "I was eating fried chicken in the dining hall when I was suddenly invited to my best friend's funeral" and of the candidate reason text "suddenly invited to my best friend's funeral".
Step C: obtain the shallow emotion semantic representation and the shallow reason semantic representation.
Specifically, in this embodiment, the document text sequence and the reason text sequence are input into the bidirectional LSTM and attention network for encoding, yielding the shallow emotion semantic representation of the document text and the shallow reason semantic representation of the candidate reason text.
Step D: obtain the deeply interactive emotion representation and reason representation.
Specifically, in this embodiment, the shallow emotion semantic representation and the shallow reason semantic representation are input together into the 3-layer stacked collaborative attention network to obtain the deeply interactive emotion representation and reason representation.
Step E: jointly compute the emotion and reason probabilities.
Specifically, in this embodiment, the deeply interactive emotion representation and reason representation are each input into the Softmax layer to compute the corresponding probability vectors, yielding the emotion the document most likely expresses and whether the candidate reason is the reason for the document's emotion.

Claims (5)

1. A deep collaborative interaction emotion reason joint extraction method is characterized by comprising the following steps:
step 1: acquiring an emotion reason corpus, and obtaining the text documents to be extracted and their emotion candidate reason texts;
the emotion reason corpus is a document set to be extracted together with the corresponding specific emotions and candidate reasons provided for the emotion reason extraction task, wherein the documents in the corpus include explicit emotion texts and implicit emotion texts; the candidate reasons can be labeled at any granularity;
step 2: performing word segmentation on the input document requiring emotion reason extraction and on its candidate reason text, obtaining the text sequence s1 of the document and the text sequence s2 of the candidate reason;
wherein a text sequence s is expressed as s = {w_1, w_2, ..., w_N} and denotes a sentence to be analyzed containing N words w_1, w_2, ..., w_N, where the subscript N is the length of the sentence word sequence and w denotes a word;
step 3: using pre-trained word feature vectors to obtain the vectorized representation of each word in text sequences s1 and s2, yielding the word feature representations; the sum of the semantic representation and the position representation of each word is taken as its word feature vector, giving the feature vector corresponding to each word in the text sequence;
the word feature vector is a pre-trained vector used to vectorize the input text sequence and comprises a semantic vector and a position vector, wherein the semantic vector carries the semantic feature information of the current word and the position vector carries the position feature information of the current word in the text sequence;
step 4: using a bidirectional long short-term memory (LSTM) network to perform sentence-level text encoding of the word representations of the text sequences s1 and s2 obtained in step 3;
step 5: determining the importance of each word from step 4 in the representation learning process through an Attention mechanism, and computing the shallow emotion representation E of text sequence s1 and the shallow reason representation C of text sequence s2;
step 6: taking the shallow emotion representation E and the shallow reason representation C output in step 5 as the input of the deep collaborative interaction network model, and outputting the deep emotion representation and the deep reason representation;
step 7: passing the deep emotion representation E^M and the deep reason representation C^M obtained in step 6 through a Softmax layer to jointly compute the corresponding emotion probability vector y^E and reason probability vector y^C;
after obtaining the emotion probability vector and the reason probability vector, using the sum of their cross-entropy losses as the loss function and updating the parameters by gradient descent to minimize the joint prediction error of the model.
2. The deep collaborative interaction emotion reason joint extraction method according to claim 1, wherein in step 4 the long short-term memory (LSTM) network computes the node states according to formulas 1 to 5:
i^(t) = δ(U_i·x^(t) + W_i·h^(t-1) + b_i)   (1)
f^(t) = δ(U_f·x^(t) + W_f·h^(t-1) + b_f)   (2)
o^(t) = δ(U_o·x^(t) + W_o·h^(t-1) + b_o)   (3)
c^(t) = f^(t) ⊙ c^(t-1) + i^(t) ⊙ tanh(U_c·x^(t) + W_c·h^(t-1) + b_c)   (4)
h^(t) = o^(t) ⊙ tanh(c^(t))   (5)
where x^(t), i^(t), f^(t), o^(t), c^(t) and h^(t) respectively denote the input vector, input gate state, forget gate state, output gate state, memory cell state and hidden layer state of the LSTM at time t; c^(t-1) denotes the memory cell state of the LSTM at time t-1 and h^(t-1) the hidden layer state at time t-1; W, U and b respectively denote the recurrent weight parameters, the input weight parameters and the bias parameters; the symbol ⊙ denotes the element-wise product, and the sequence representation vector of each sentence at time t is obtained from the output of the model; δ() denotes the sigmoid function; b_i, b_f, b_o and b_c denote the bias term parameters used when computing the input gate, forget gate, output gate and memory cell states, respectively; U_i, U_f, U_o and U_c denote the weight matrices of the input vector used when computing the input gate, forget gate, output gate and memory cell states, respectively; W_i, W_f, W_o and W_c denote the weight matrices of the hidden layer state used when computing the input gate, forget gate, output gate and memory cell states, respectively;
to obtain information about the words preceding and following each word, the sequence is encoded with a bidirectional LSTM, and the representation vectors of the i-th element produced by the forward and backward LSTMs are concatenated to obtain the final representation h_i:
h_i^→ = LSTM^→(d_i), i ∈ [1, L]   (6)
h_i^← = LSTM^←(d_i), i ∈ [L, 1]   (7)
h_i = [h_i^→ ; h_i^←]   (8)
where h_i^→ is the hidden state representation of the i-th element output by the forward LSTM network, LSTM^→ denotes the forward LSTM network, d_i is the representation of the i-th element in the input sequence, L is the number of words in the sentence, h_i^← is the hidden state representation of the i-th element output by the backward LSTM network, and LSTM^← denotes the backward LSTM network.
3. The deep collaborative interaction emotion reason joint extraction method according to claim 1, wherein in step 5 the attention score a_i of each word is calculated as
a_i = exp(W_a·h_i) / Σ_{j=1}^{L} exp(W_a·h_j)   (9)
where h_i is the representation of the i-th element in the input sequence, h_j is the representation of the j-th element, L is the number of words in the sentence, and W_a is the attention weight parameter; each sentence is represented as a sequence of words and further as the weighted average of the word representations, the weights being the attention values calculated by formula 9;
the shallow emotion representation E of text sequence s1 takes the form
E = Σ_{i=1}^{L_1} a_i·h_i   (10)
where L_1 is the number of words in text sequence s1;
the shallow reason representation C of text sequence s2 takes the form
C = Σ_{i=1}^{L_2} a_i·h_i   (11)
where L_2 is the number of words in text sequence s2.
4. The deep collaborative interaction emotion reason joint extraction method according to claim 1, wherein in step 6 the association between the emotion representation and the reason representation is modeled by a stack of M collaborative attention network layers, outputting the deeply interactive emotion representation and reason representation:
E^(m+1) = E^m + Softmax(E^m·(C^m)^T)·C^m   (12)
C^(m+1) = C^m + Softmax(C^m·(E^m)^T)·E^m   (13)
where E^(m+1) denotes the emotion representation at layer m+1 of the collaborative interaction network, E^m the emotion representation at layer m, C^(m+1) the reason representation at layer m+1, C^m the reason representation at layer m, and T denotes matrix transposition.
5. The deep collaborative interaction emotion reason joint extraction method according to claim 1, wherein in step 7 the emotion probability vector y^E gives the probability of each specific emotion of the corresponding document:
y_i^E = exp(w_i^T·E^M + b_i) / Σ_{j=1}^{K} exp(w_j^T·E^M + b_j)   (14)
and the reason probability vector y^C gives the probability that the current candidate reason is the cause of the text emotion:
y_i^C = exp(w_i^T·C^M + b_i) / Σ_{j=1}^{K} exp(w_j^T·C^M + b_j)   (15)
where y_i^E denotes the probability that the document contains the i-th emotion, y_i^C denotes the probability that the candidate reason is or is not the reason for the document's emotion, w_i denotes the i-th weight parameter, w_j the j-th weight parameter, b_i the i-th bias term parameter, b_j the j-th bias term parameter, K the dimension of the probability vector, and T denotes matrix transposition.
CN202111188307.5A 2021-10-12 2021-10-12 Deep collaborative interaction emotion reason joint extraction method Pending CN113947074A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111188307.5A CN113947074A (en) 2021-10-12 2021-10-12 Deep collaborative interaction emotion reason joint extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111188307.5A CN113947074A (en) 2021-10-12 2021-10-12 Deep collaborative interaction emotion reason joint extraction method

Publications (1)

Publication Number Publication Date
CN113947074A true CN113947074A (en) 2022-01-18

Family

ID=79330226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111188307.5A Pending CN113947074A (en) 2021-10-12 2021-10-12 Deep collaborative interaction emotion reason joint extraction method

Country Status (1)

Country Link
CN (1) CN113947074A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117455497A (en) * 2023-11-12 2024-01-26 北京营加品牌管理有限公司 Transaction risk detection method and device
CN117787267A (en) * 2023-12-29 2024-03-29 广东外语外贸大学 Emotion cause pair extraction method and system based on neural network
CN117787267B (en) * 2023-12-29 2024-06-07 广东外语外贸大学 Emotion cause pair extraction method and system based on neural network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination