CN115017912A - Double-target entity emotion analysis method for multi-task learning - Google Patents

Double-target entity emotion analysis method for multi-task learning

Info

Publication number
CN115017912A
Authority
CN
China
Prior art keywords
emotion
context
target entity
clause
sentence
Prior art date
Legal status
Pending
Application number
CN202210054948.XA
Other languages
Chinese (zh)
Inventor
文瑜
旷中洁
朱新华
Current Assignee
Guilin Tourism University
Original Assignee
Guilin Tourism University
Priority date
Filing date
Publication date
Application filed by Guilin Tourism University filed Critical Guilin Tourism University
Priority to CN202210054948.XA priority Critical patent/CN115017912A/en
Publication of CN115017912A publication Critical patent/CN115017912A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a dual-target entity emotion analysis method for multi-task learning. First, through multi-task learning of sentence context-breaker recognition and left and right entity emotion polarity classification, a neural network model with automatic sentence context-breaker recognition and automatic dual-target entity emotion polarity classification is jointly trained. Second, the context breaker in an emotion sentence is identified using the trained neural network model. Then, the semantic representation of the emotion sentence is separated at the position of the identified context breaker into a left-clause semantic representation and a right-clause semantic representation, emotion analysis is performed on each of them, and the emotion polarities of the dual target entities are finally obtained. Because the emotional expressions of the two target entities in the emotion sentence are separated from each other by the context breaker, the aspect-level emotion analysis problem is solved in a more effective way.

Description

Double-target entity emotion analysis method for multi-task learning
Technical Field
The invention relates to aspect-level emotion analysis in natural language understanding, and in particular to a dual-target entity emotion analysis method for multi-task learning, which can be widely applied to aspect-level emotion analysis tasks in various fields.
Background
The purpose of aspect-level sentiment classification is to predict the polarities of multiple target entities in a sentence or document. It is a fine-grained sentiment analysis task that differs from the traditional sentiment analysis task in that polarity analysis (generally over three categories: positive, negative and neutral) is carried out separately for each target entity. Aspect-level sentiment classification is commonly applied to review sentences, such as mall shopping reviews, restaurant reviews and movie reviews. In the aspect-level setting, a sentence usually contains two aspect words with their associated sentiment orientations; for example, the sentence "Prices are higher to dine in but their food is good" is negative for the target entity "Prices" but positive for the target entity "food".
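For concreteness, a dual-target sample of this kind can be pictured as the following record (a minimal Python sketch; the field names are hypothetical and used only for illustration):

    # A hypothetical dual-target sample: one review sentence, two target
    # entities, and one emotion polarity label per entity.
    sample = {
        "sentence": "Prices are higher to dine in but their food is good",
        "targets": ["Prices", "food"],
        "polarities": ["negative", "positive"],
    }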
With the continuous development of artificial neural network technology, various neural networks, such as Long Short-Term Memory (LSTM), the Deep Memory Network, and the Bidirectional Encoder Representations from Transformers (BERT) language model proposed by Google AI Language, have been applied to aspect polarity classification, providing end-to-end classification methods that require no feature-engineering work. However, when there are multiple target entities in a sentence, the aspect polarity classification task needs to distinguish the emotions of the different aspects. The task is therefore more complex than document-level sentiment analysis, which has only one overall sentiment orientation, and its main challenge is: when performing emotion analysis on different target entities, how to highlight the emotional expressions related to each target entity and suppress the emotional expressions unrelated to it. To achieve this goal, deep learning methods for aspect polarity classification have proposed various aspect-centered emotion semantic learning methods, for example: attention-based semantic learning, position attenuation, left-right semantic learning, aspect connection and global semantic learning, but each of these methods suffers to some extent from the influence of irrelevant emotional expressions. In order to thoroughly eliminate the influence of irrelevant emotional expressions in multi-target emotion analysis, the invention provides a dual-target entity emotion analysis method for multi-task learning, in which the emotional expressions of the two target entities in an emotion sentence are separated from each other by a context breaker.
Disclosure of Invention
The invention discloses a double-target entity emotion analysis method for multi-task learning, in which a neural network model with automatic sentence context-breaker recognition and dual-target entity emotion polarity classification is jointly trained through multi-task learning of sentence context-breaker recognition and left and right entity emotion polarity classification, solving the aspect-level emotion analysis problem in a more effective way.
To achieve this purpose, the technical scheme of the invention is as follows:
A double-target entity emotion analysis method for multi-task learning, characterized by comprising the following steps:
S1, through multi-task learning of sentence context-breaker recognition and left and right entity emotion polarity classification, jointly training a neural network model with automatic sentence context-breaker recognition and automatic dual-target entity emotion polarity classification;
S2, identifying the context breaker in the emotion sentence using the neural network model trained in step S1;
S3, in the neural network model trained in step S1, separating the semantic representation of the emotion sentence at the position corresponding to the context breaker obtained in step S2 to obtain a left-clause semantic representation and a right-clause semantic representation, and then performing emotion analysis on the left-clause and right-clause semantic representations respectively, finally obtaining the emotion polarities of the dual target entities;
wherein the emotion sentence is a multi-emotion-expression sentence comprising a left target entity and a right target entity;
the context breaker is a word located between the left target entity and the right target entity in the emotion sentence that separates the emotional expressions of the two target entities from each other;
the neural network model is a neural network structure based on the BERT language model; the BERT language model refers to the Bidirectional Encoder Representations from Transformers (BERT) language model proposed by Google AI Language.
Further, the step S1 specifically includes:
S1.1 The input sequence s of the BERT language model is composed of the emotion sentence Sen = {…, t_1, w_1, w_2, …, w_n, t_2, …} together with the BERT coding symbols, as follows:

s = \{[\mathrm{CLS}], \ldots, t_1, w_1, w_2, \ldots, w_n, t_2, \ldots, [\mathrm{SEP}]\} \in \mathbb{R}^{m \times d_w}   (1)

Mid = \{w_1, w_2, \ldots, w_n\}   (2)

wherein [CLS] is the encoding of the BERT classifier, [SEP] is the encoding of the BERT terminator, t_1 is the left target entity to be analyzed, t_2 is the right target entity to be analyzed, Mid = {w_1, w_2, …, w_n} is the intermediate word sequence between the left and right target entities t_1 and t_2, "…" represents an omitted word sequence, m is the length of the input sequence s, d_w is the dimension of the character encoding in BERT, n is the length of the intermediate word sequence Mid, and a word refers to a language fragment produced by the BERT Tokenizer;
S1.2 The input sequence s is fed into the BERT language model for processing, yielding the sentence semantic representation C_Sen of the emotion sentence Sen, as follows:

C_{Sen} = \mathrm{BERT}(s) = \{h_1, h_2, \ldots, h_m\} \in \mathbb{R}^{m \times d_b}   (3)

wherein BERT(·) represents the BERT language model, h_i ∈ ℝ^{d_b} is the i-th hidden state of the BERT language model, and d_b is the number of hidden units of the BERT language model;
S1.3 According to the positional correspondence, the intermediate semantic representation C_Mid corresponding to the intermediate word sequence Mid = {w_1, w_2, …, w_n} is extracted from C_Sen, as follows:

C_{Mid} = \mathrm{Extract}(C_{Sen}, Mid) = \{c_1, c_2, \ldots, c_n\} \in \mathbb{R}^{n \times d_b}   (4)

wherein Extract(·) represents the extraction of the intermediate semantics, and c_i ∈ ℝ^{d_b} is the hidden state corresponding to the i-th intermediate word w_i in C_Sen;
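Steps S1.1-S1.3 can be sketched with the Hugging Face transformers library as follows (an assumed toolchain the patent does not prescribe; the target-position lookup is simplified for illustration):

    import torch
    from transformers import BertModel, BertTokenizer  # assumed toolchain

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    bert = BertModel.from_pretrained("bert-base-uncased")

    sentence = "Prices are higher to dine in but their food is good"
    enc = tokenizer(sentence, return_tensors="pt")  # S1.1: [CLS] ... [SEP]
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])

    # Intermediate word sequence Mid between t1 ("prices") and t2 ("food"),
    # located here by a naive token lookup purely for illustration.
    i1, i2 = tokens.index("prices"), tokens.index("food")
    mid_slice = slice(i1 + 1, i2)

    # S1.2: sentence semantic representation C_Sen = BERT(s)
    with torch.no_grad():
        c_sen = bert(**enc).last_hidden_state[0]  # shape (m, d_b)

    # S1.3: intermediate semantic representation C_Mid extracted from C_Sen
    c_mid = c_sen[mid_slice]                      # shape (n, d_b)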
S1.4 A softmax linear transformation is performed on the intermediate semantic representation C_Mid to identify the context breaker, calculated as follows:

u = C_{Mid} \cdot W_{sp} + b_{sp}   (5)

P(w \mid C_{Mid}, \theta) = \frac{\exp(u_w)}{\sum_{w' \in Mid} \exp(u_{w'})}   (6)

w^{*} = \arg\max_{w \in Mid} P(w \mid C_{Mid}, \theta)   (7)

wherein equations (5) and (6) represent the computation of the softmax linear transformation performed on the intermediate semantic representation C_Mid, W_sp ∈ ℝ^{d_b} is a learnable parameter vector for context-breaker identification, b_sp is a bias parameter, "·" represents the vector dot-product operation, u ∈ ℝ^n is the context-breaker confidence score vector corresponding to the intermediate word sequence Mid, w is an intermediate word, P(w | C_Mid, θ) represents the predicted probability that the intermediate word w is the context breaker, argmax returns the intermediate word that maximizes P(w | C_Mid, θ), w* is the computed context breaker, θ is the set of all learnable parameters, and exp(·) represents the exponential function with base e;
S1.5 Taking the context breaker w_sp as a separator, two mask matrices consisting of 1s and 0s are formed, and the sentence semantic representation C_Sen is separated into the left-clause semantic representation C_left and the right-clause semantic representation C_right, calculated as follows:

mask_l = [mask_l^1, mask_l^2, \ldots, mask_l^m] \in \mathbb{R}^{m \times d_b}   (8)

mask_l^i = \begin{cases} \mathbf{1}, & i < \mathrm{pos}(w_{sp}) \\ \mathbf{0}, & \text{otherwise} \end{cases}   (9)

mask_r = [mask_r^1, mask_r^2, \ldots, mask_r^m] \in \mathbb{R}^{m \times d_b}   (10)

mask_r^j = \begin{cases} \mathbf{1}, & j > \mathrm{pos}(w_{sp}) \\ \mathbf{0}, & \text{otherwise} \end{cases}   (11)

C_{left} = C_{Sen} \odot mask_l   (12)

C_{right} = C_{Sen} \odot mask_r   (13)

wherein mask_l is the mask matrix for separating the left-clause semantics, mask_r is the mask matrix for separating the right-clause semantics, 1 ∈ ℝ^{d_b} is an all-1 vector, 0 ∈ ℝ^{d_b} is an all-0 vector, token_i ∈ Sen is the i-th word in the sentence Sen, the function pos(·) gives the position number of a specified word in the sentence Sen, mask_l^i is the i-th column vector of mask_l, i ∈ [1, m] an integer, mask_r^j is the j-th column vector of mask_r, j ∈ [1, m] an integer, and ⊙ represents element-wise multiplication;
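Under these definitions the masking of S1.5 reduces to zeroing the rows of C_Sen on one side of the breaker position; a PyTorch sketch (assumed framework):

    import torch

    def split_by_breaker(c_sen: torch.Tensor, sp_pos: int):
        """Build the two 0/1 mask matrices around position sp_pos and
        split C_Sen into left/right clause representations (eqs. 8-13)."""
        m, d_b = c_sen.shape
        idx = torch.arange(m).unsqueeze(-1)             # shape (m, 1)
        mask_l = (idx < sp_pos).float().expand(m, d_b)  # 1s before breaker
        mask_r = (idx > sp_pos).float().expand(m, d_b)  # 1s after breaker
        return c_sen * mask_l, c_sen * mask_r           # element-wise product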
S1.6 A multi-head self-attention encoding process is performed on the left-clause semantic representation C_left and the right-clause semantic representation C_right respectively, yielding the left-clause semantic encoding C'_left and the right-clause semantic encoding C'_right, calculated as follows:

C'_{left} = \mathrm{MHSA}(C_{left})   (14)

C'_{right} = \mathrm{MHSA}(C_{right})   (15)

wherein MHSA(X) represents the multi-head self-attention encoding MHA(Q = X, K = X, V = X) of an input X ∈ ℝ^{m × d_b};
S1.7 An average pooling operation is performed on the left-clause semantic encoding C'_left and the right-clause semantic encoding C'_right respectively, yielding the left-clause emotion vector Z_L and the right-clause emotion vector Z_r, calculated as follows:

Z_L = \mathrm{AvePooling}(C'_{left})   (16)

Z_r = \mathrm{AvePooling}(C'_{right})   (17)

wherein AvePooling(C) represents a column-wise average pooling operation on the parameter C ∈ ℝ^{m × d_b};
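Steps S1.6-S1.7 amount to one self-attention pass followed by a mean over positions; a PyTorch sketch (assumed framework; the head count is illustrative):

    import torch

    mhsa = torch.nn.MultiheadAttention(embed_dim=768, num_heads=12,
                                       batch_first=True)

    def encode_and_pool(c_clause: torch.Tensor) -> torch.Tensor:
        """MHSA with Q = K = V = input (eqs. 14-15), then average
        pooling over positions (eqs. 16-17); returns Z of shape (d_b,)."""
        x = c_clause.unsqueeze(0)              # add a batch dimension
        attn_out, _ = mhsa(x, x, x)            # C' = MHSA(C)
        return attn_out.mean(dim=1).squeeze(0)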
S1.8 A softmax linear transformation is performed on the left-clause emotion vector Z_L and the right-clause emotion vector Z_r, the probabilities of the emotion polarities are computed, and the final emotion polarities are obtained, calculated as follows:

u_L = Z_L \cdot W_p + b_p   (18)

u_r = Z_r \cdot W_p + b_p   (19)

P(y \mid Z_L, \theta) = \frac{\exp(u_L^{y})}{\sum_{y' \in Y} \exp(u_L^{y'})}   (20)

P(y \mid Z_r, \theta) = \frac{\exp(u_r^{y})}{\sum_{y' \in Y} \exp(u_r^{y'})}   (21)

y_L = \arg\max_{y \in Y} P(y \mid Z_L, \theta)   (22)

y_r = \arg\max_{y \in Y} P(y \mid Z_r, \theta)   (23)

wherein W_p ∈ ℝ^{d_b × d_k} is the emotion polarity representation matrix, b_p ∈ ℝ^{d_k} is a bias vector, d_k is the number of emotion polarities, Y is the set of emotion polarities, y is an emotion polarity, u_L, u_r ∈ ℝ^{d_k} are the emotion polarity confidence score vectors corresponding to Z_L and Z_r, P(y | Z_L, θ) and P(y | Z_r, θ) respectively represent the predicted probabilities of Z_L and Z_r for emotion polarity y, y_L and y_r are the finally assessed left and right emotion polarities, argmax returns the emotion polarity that maximizes the corresponding probability, θ is the set of all learnable parameters, and exp(·) represents the exponential function with base e.
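The polarity head of S1.8 is a single linear layer shared by the left and right clause vectors; a PyTorch sketch (assumed framework):

    import torch

    d_b, d_k = 768, 3                # d_k polarities: pos./neg./neutral
    w_p = torch.nn.Linear(d_b, d_k)  # W_p and b_p of eqs. (18)-(19)

    def classify(z: torch.Tensor) -> int:
        """Softmax over the emotion polarities (eqs. 20-23)."""
        p = torch.softmax(w_p(z), dim=-1)  # P(y | Z, theta)
        return int(p.argmax())             # y = argmax_y P(y | Z, theta)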
Further, in step S1, the joint training method for jointly training the neural network model with automatic sentence context-breaker recognition and automatic dual-target entity emotion polarity classification is as follows:
(1) The loss function for context-breaker recognition and the loss functions for dual-target entity emotion analysis are each calculated with the cross-entropy loss error, as follows:

\Psi_{Mid}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(w_{sp}^{i} \mid C_{Mid}^{i}, \theta)   (24)

\Psi_{L}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(y_{L}^{i} \mid Z_{L}^{i}, \theta)   (25)

\Psi_{r}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(y_{r}^{i} \mid Z_{r}^{i}, \theta)   (26)

wherein Ω is the set of training sentences of the dual-target entity emotion analysis task, |Ω| represents the size of the set Ω, w_sp^i is the context-breaker word label of the i-th training sentence in Ω, C_Mid^i is the intermediate semantic representation of the i-th training sentence in Ω, y_L^i and y_r^i are respectively the left and right emotion polarity labels of the i-th training sentence in Ω, Z_L^i and Z_r^i are respectively the left-clause and right-clause emotion vectors of the i-th training sentence in Ω, Ψ_Mid(θ) is the loss function used in context-breaker recognition training, Ψ_L(θ) is the loss function used in left-target-entity emotion analysis training, and Ψ_r(θ) is the loss function used in right-target-entity emotion analysis training;

(2) The joint loss function Ψ(θ) for jointly training sentence context-breaker recognition and dual-target entity emotion polarity classification is calculated using equation (27) below:

\Psi(\theta) = \Psi_{Mid}(\theta) + \alpha_1 \Psi_{L}(\theta) + \alpha_2 \Psi_{r}(\theta)   (27)

wherein α_1 and α_2 are two weight parameters;
(3) the joint training objective is to minimize the joint loss error calculated by equation (27).
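Under these definitions the joint objective is three cross-entropy terms combined with the two weights; a PyTorch sketch (assumed framework; the default weights are illustrative, not values fixed by the invention):

    import torch.nn.functional as F

    def joint_loss(u_mid, sp_label, u_left, y_left, u_right, y_right,
                   alpha1=1.0, alpha2=1.0):
        """Eqs. (24)-(27): cross-entropy for breaker recognition plus
        weighted cross-entropy for left/right polarity classification."""
        loss_mid = F.cross_entropy(u_mid, sp_label)  # Psi_Mid, eq. (24)
        loss_l = F.cross_entropy(u_left, y_left)     # Psi_L, eq. (25)
        loss_r = F.cross_entropy(u_right, y_right)   # Psi_r, eq. (26)
        return loss_mid + alpha1 * loss_l + alpha2 * loss_r  # eq. (27)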
In order to thoroughly eliminate the influence of irrelevant emotional expressions in multi-target emotion analysis, the invention provides a dual-target entity emotion analysis method for multi-task learning, in which the emotional expressions of the two target entities in an emotion sentence are separated from each other by a context breaker. First, a neural network model with automatic sentence context-breaker recognition and automatic dual-target entity emotion polarity classification is jointly trained through the joint learning of sentence context-breaker recognition and left and right entity emotion polarity classification. Second, the context breaker in an emotion sentence is identified using the trained neural network model. Then, the semantic representation of the emotion sentence is separated at the position corresponding to the identified context breaker into a left-clause semantic representation and a right-clause semantic representation, and emotion analysis is performed on each of them to obtain the emotion polarities of the dual target entities.
The invention has the following advantages:
(1) Dynamically encoding the emotion sentences with a BERT language model that has extensive pre-training and task fine-tuning effectively alleviates the problem that aspect-level emotion analysis corpora are too small;
(2) the emotional expressions of the two target entities in the emotion sentence are separated from each other by the context breaker, thoroughly eliminating the influence of irrelevant emotional expressions in multi-target emotion analysis;
(3) the context breaker converts dual-target entity emotion analysis into two independent single-target entity emotion analyses, greatly improving the performance of dual-target entity emotion analysis;
(4) by converting emotion sentences containing more target entities into multiple dual-target emotion sentences, the method can be applied to various types of aspect-level emotion analysis tasks.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
The present invention is further illustrated by the following specific examples, but the scope of the present invention is not limited to the following examples.
Let Sen = {…, t_1, w_1, w_2, …, w_n, t_2, …} be an emotion sentence containing the left target entity t_1 and the right target entity t_2; the emotions of the dual target entities t_1 and t_2 are analyzed by the following steps:
S1, through multi-task learning of sentence context-breaker recognition and left and right entity emotion polarity classification, jointly training a neural network model with automatic sentence context-breaker recognition and automatic dual-target entity emotion polarity classification;
S2, identifying the context breaker in the emotion sentence using the neural network model trained in step S1;
S3, in the neural network model trained in step S1, separating the semantic representation of the emotion sentence at the position corresponding to the context breaker obtained in step S2 to obtain a left-clause semantic representation and a right-clause semantic representation, and then performing emotion analysis on the left-clause and right-clause semantic representations respectively, finally obtaining the emotion polarities of the dual target entities;
wherein the emotion sentence is a multi-emotion-expression sentence comprising a left target entity and a right target entity;
the context breaker is a word located between the left target entity and the right target entity in the emotion sentence that separates the emotional expressions of the two target entities from each other;
the neural network model is a neural network structure based on the BERT language model; the BERT language model refers to the Bidirectional Encoder Representations from Transformers (BERT) language model proposed by Google AI Language.
Further, the step S1 specifically includes:
S1.1 The input sequence s of the BERT language model is composed of the emotion sentence Sen = {…, t_1, w_1, w_2, …, w_n, t_2, …} together with the BERT coding symbols, as follows:

s = \{[\mathrm{CLS}], \ldots, t_1, w_1, w_2, \ldots, w_n, t_2, \ldots, [\mathrm{SEP}]\} \in \mathbb{R}^{m \times d_w}   (1)

Mid = \{w_1, w_2, \ldots, w_n\}   (2)

wherein [CLS] is the encoding of the BERT classifier, [SEP] is the encoding of the BERT terminator, t_1 is the left target entity to be analyzed, t_2 is the right target entity to be analyzed, Mid = {w_1, w_2, …, w_n} is the intermediate word sequence between the left and right target entities t_1 and t_2, "…" represents an omitted word sequence, m is the length of the input sequence s, d_w is the dimension of the character encoding in BERT, n is the length of the intermediate word sequence Mid, and a word refers to a language fragment produced by the BERT Tokenizer;
S1.2 The input sequence s is fed into the BERT language model for processing, yielding the sentence semantic representation C_Sen of the emotion sentence Sen, as follows:

C_{Sen} = \mathrm{BERT}(s) = \{h_1, h_2, \ldots, h_m\} \in \mathbb{R}^{m \times d_b}   (3)

wherein BERT(·) represents the BERT language model, h_i ∈ ℝ^{d_b} is the i-th hidden state of the BERT language model, and d_b is the number of hidden units of the BERT language model;
S1.3 According to the positional correspondence, the intermediate semantic representation C_Mid corresponding to the intermediate word sequence Mid = {w_1, w_2, …, w_n} is extracted from C_Sen, as follows:

C_{Mid} = \mathrm{Extract}(C_{Sen}, Mid) = \{c_1, c_2, \ldots, c_n\} \in \mathbb{R}^{n \times d_b}   (4)

wherein Extract(·) represents the extraction of the intermediate semantics, and c_i ∈ ℝ^{d_b} is the hidden state corresponding to the i-th intermediate word w_i in C_Sen;
S1.4 A softmax linear transformation is performed on the intermediate semantic representation C_Mid to identify the context breaker, calculated as follows:

u = C_{Mid} \cdot W_{sp} + b_{sp}   (5)

P(w \mid C_{Mid}, \theta) = \frac{\exp(u_w)}{\sum_{w' \in Mid} \exp(u_{w'})}   (6)

w^{*} = \arg\max_{w \in Mid} P(w \mid C_{Mid}, \theta)   (7)

wherein equations (5) and (6) represent the computation of the softmax linear transformation performed on the intermediate semantic representation C_Mid, W_sp ∈ ℝ^{d_b} is a learnable parameter vector for context-breaker identification, b_sp is a bias parameter, "·" represents the vector dot-product operation, u ∈ ℝ^n is the context-breaker confidence score vector corresponding to the intermediate word sequence Mid, w is an intermediate word, P(w | C_Mid, θ) represents the predicted probability that the intermediate word w is the context breaker, argmax returns the intermediate word that maximizes P(w | C_Mid, θ), w* is the computed context breaker, θ is the set of all learnable parameters, and exp(·) represents the exponential function with base e;
S1.5 Taking the context breaker w_sp as a separator, two mask matrices consisting of 1s and 0s are formed, and the sentence semantic representation C_Sen is separated into the left-clause semantic representation C_left and the right-clause semantic representation C_right, calculated as follows:

mask_l = [mask_l^1, mask_l^2, \ldots, mask_l^m] \in \mathbb{R}^{m \times d_b}   (8)

mask_l^i = \begin{cases} \mathbf{1}, & i < \mathrm{pos}(w_{sp}) \\ \mathbf{0}, & \text{otherwise} \end{cases}   (9)

mask_r = [mask_r^1, mask_r^2, \ldots, mask_r^m] \in \mathbb{R}^{m \times d_b}   (10)

mask_r^j = \begin{cases} \mathbf{1}, & j > \mathrm{pos}(w_{sp}) \\ \mathbf{0}, & \text{otherwise} \end{cases}   (11)

C_{left} = C_{Sen} \odot mask_l   (12)

C_{right} = C_{Sen} \odot mask_r   (13)

wherein mask_l is the mask matrix for separating the left-clause semantics, mask_r is the mask matrix for separating the right-clause semantics, 1 ∈ ℝ^{d_b} is an all-1 vector, 0 ∈ ℝ^{d_b} is an all-0 vector, token_i ∈ Sen is the i-th word in the sentence Sen, the function pos(·) gives the position number of a specified word in the sentence Sen, mask_l^i is the i-th column vector of mask_l, i ∈ [1, m] an integer, mask_r^j is the j-th column vector of mask_r, j ∈ [1, m] an integer, and ⊙ represents element-wise multiplication;
S1.6 A multi-head self-attention encoding process is performed on the left-clause semantic representation C_left and the right-clause semantic representation C_right respectively, yielding the left-clause semantic encoding C'_left and the right-clause semantic encoding C'_right, calculated as follows:

C'_{left} = \mathrm{MHSA}(C_{left})   (14)

C'_{right} = \mathrm{MHSA}(C_{right})   (15)

wherein MHSA(X) represents the multi-head self-attention encoding MHA(Q = X, K = X, V = X) of an input X ∈ ℝ^{m × d_b};
S1.7 An average pooling operation is performed on the left-clause semantic encoding C'_left and the right-clause semantic encoding C'_right respectively, yielding the left-clause emotion vector Z_L and the right-clause emotion vector Z_r, calculated as follows:

Z_L = \mathrm{AvePooling}(C'_{left})   (16)

Z_r = \mathrm{AvePooling}(C'_{right})   (17)

wherein AvePooling(C) represents a column-wise average pooling operation on the parameter C ∈ ℝ^{m × d_b};
S1.8 A softmax linear transformation is performed on the left-clause emotion vector Z_L and the right-clause emotion vector Z_r, the probabilities of the emotion polarities are computed, and the final emotion polarities are obtained, calculated as follows:

u_L = Z_L \cdot W_p + b_p   (18)

u_r = Z_r \cdot W_p + b_p   (19)

P(y \mid Z_L, \theta) = \frac{\exp(u_L^{y})}{\sum_{y' \in Y} \exp(u_L^{y'})}   (20)

P(y \mid Z_r, \theta) = \frac{\exp(u_r^{y})}{\sum_{y' \in Y} \exp(u_r^{y'})}   (21)

y_L = \arg\max_{y \in Y} P(y \mid Z_L, \theta)   (22)

y_r = \arg\max_{y \in Y} P(y \mid Z_r, \theta)   (23)

wherein W_p ∈ ℝ^{d_b × d_k} is the emotion polarity representation matrix, b_p ∈ ℝ^{d_k} is a bias vector, d_k is the number of emotion polarities, Y is the set of emotion polarities, y is an emotion polarity, u_L, u_r ∈ ℝ^{d_k} are the emotion polarity confidence score vectors corresponding to Z_L and Z_r, P(y | Z_L, θ) and P(y | Z_r, θ) respectively represent the predicted probabilities of Z_L and Z_r for emotion polarity y, y_L and y_r are the finally assessed left and right emotion polarities, argmax returns the emotion polarity that maximizes the corresponding probability, θ is the set of all learnable parameters, and exp(·) represents the exponential function with base e.
Further, in step S1, the joint training method for jointly training the neural network model with automatic sentence context-breaker recognition and automatic dual-target entity emotion polarity classification is as follows:
(1) The loss function for context-breaker recognition and the loss functions for dual-target entity emotion analysis are each calculated with the cross-entropy loss error, as follows:

\Psi_{Mid}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(w_{sp}^{i} \mid C_{Mid}^{i}, \theta)   (24)

\Psi_{L}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(y_{L}^{i} \mid Z_{L}^{i}, \theta)   (25)

\Psi_{r}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(y_{r}^{i} \mid Z_{r}^{i}, \theta)   (26)

wherein Ω is the set of training sentences of the dual-target entity emotion analysis task, |Ω| represents the size of the set Ω, w_sp^i is the context-breaker word label of the i-th training sentence in Ω, C_Mid^i is the intermediate semantic representation of the i-th training sentence in Ω, y_L^i and y_r^i are respectively the left and right emotion polarity labels of the i-th training sentence in Ω, Z_L^i and Z_r^i are respectively the left-clause and right-clause emotion vectors of the i-th training sentence in Ω, Ψ_Mid(θ) is the loss function used in context-breaker recognition training, Ψ_L(θ) is the loss function used in left-target-entity emotion analysis training, and Ψ_r(θ) is the loss function used in right-target-entity emotion analysis training;

(2) The joint loss function Ψ(θ) for jointly training sentence context-breaker recognition and dual-target entity emotion polarity classification is calculated using equation (27) below:

\Psi(\theta) = \Psi_{Mid}(\theta) + \alpha_1 \Psi_{L}(\theta) + \alpha_2 \Psi_{r}(\theta)   (27)

wherein α_1 and α_2 are two weight parameters;
(3) the joint training objective is to minimize the joint loss error calculated by equation (27).
In this embodiment, the emotional expressions of the two target entities in the emotion sentence are separated from each other by the context breaker, thoroughly eliminating the influence of irrelevant emotional expressions in multi-target emotion analysis.
Examples of the applications
1. Example Environment
This example uses the BERT-BASE version, proposed and developed by Google AI Language in the literature "Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of the 2019 Conference of NAACL, pp 4171-4186", as the pre-training model of the BERT coding layer; it comprises 12 Transformer layers, 768 hidden units, 12 attention heads, and 110M parameters in total. The multi-head attention adopted in this example is derived from the literature "Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention Is All You Need. In: 31st Conference on Neural Information Processing Systems (NIPS 2017), pp 5998-6008". To minimize the loss value, this example uses the Adam optimizer and sets the learning rate to 2e-5 and the batch size to 16; during training, this example sets the number of epochs to 10.
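These reported settings map onto a conventional PyTorch training configuration as follows (a sketch under the assumption of a PyTorch implementation; `model` is a stand-in for the jointly trained network):

    import torch

    model = torch.nn.Linear(768, 3)  # stand-in for the jointly trained model
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)  # Adam, lr 2e-5
    batch_size = 16
    epochs = 10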
2. Data set
This example uses the internationally widely used SemEval-2014 Task 4 dataset, published in 2014 at the 8th International Workshop on Semantic Evaluation. It provides two sets of review data, from the restaurant (Rest) and laptop (Lap) domains. Each sample in the SemEval-2014 Task 4 dataset consists of a review sentence, some opinion targets, and the corresponding sentiment polarity toward each opinion target. The dataset details are shown in Table 1.
Table 1 data set details
3. Comparison method
This example compares the model of the invention with 5 non-BERT methods and 4 BERT-based methods; the compared methods are as follows:
(1) non-BERT method
MemNet [1] uses a multi-layer memory network combined with attention to capture the contribution of each context word to aspect polarity classification.
IAN [2] adopts two LSTM networks to obtain the features of the aspect and of the context respectively, then interactively generates their attention vectors, and finally concatenates the two attention vectors for aspect polarity classification.
TNet-LF [3] uses a CNN to extract important features from word representations produced by a bidirectional LSTM, and proposes a relevance-based mechanism to generate target-specific representations of the words in a sentence. The model also employs a position-attenuation technique.
MCRF-SA [4] proposes a compact attention model based on multiple CRFs that can extract aspect-specific opinion spans. The model also employs position-attenuation and aspect-connection techniques.
MAN [5] builds two attention mechanisms with a position function on top of a multi-layer Transformer encoder: an interactive attention to model the relationship between context and aspects, and an aspect-to-context local attention on the Transformer encoder.
(2) BERT-based methods
BERT-BASE [6] is the base version of BERT developed by Google AI Language, which performs aspect polarity classification using the single-sentence input "[CLS] + review sentence + [SEP]".
BERT-SPC [7] is the application of a pre-trained BERT model to the sentence-pair classification (SPC) task. When applied to the aspect polarity classification task, its input takes the form "[CLS] + review sentence + [SEP] + aspect target + [SEP]".
AEN-BERT [7] constructs two multi-head attention mechanisms on top of the BERT encoder: a multi-head self-attention mechanism to model the context, and an aspect-to-context multi-head attention mechanism to model the aspect targets.
MAN-BERT is a variant of the MAN [5] model; this example replaces the Transformer encoder in MAN [5] with the BERT model.
The documents cited above are:
1. Tang D, Qin B, Liu T (2016) Aspect Level Sentiment Classification with Deep Memory Network. In: Empirical Methods in Natural Language Processing, pp 214-224
2. Ma D, Li S, Zhang X, Wang H (2017) Interactive attention networks for aspect-level sentiment classification. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19-25 August 2017, pp 4068-4074
3. Li X, Bing L, Lam W, Shi B (2018) Transformation Networks for Target-Oriented Sentiment Classification. In: Proceedings of ACL, pp 946-956
4. Xu L, Bing L, Lu W, Huang F (2020) Aspect Sentiment Classification with Aspect-Specific Opinion Spans. In: Proceedings of EMNLP 2020, pp 3561-3567
5. Xu Q, Zhu L, Dai T, Yan C (2020) Aspect-based sentiment classification with multi-attention network. Neurocomputing, 388(3):135-143
6. Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of the 2019 Conference of NAACL, pp 4171-4186
7. Song Y, Wang J, Jiang T, Liu Z, Rao Y (2019) Attentional encoder network for targeted sentiment classification. arXiv preprint arXiv:1902.09314
4. examples comparative results
This example evaluates the various models by reporting accuracy (Acc) and Macro-average F1 (M-F1) on the datasets.
Table 2 Experimental results, where marked results are taken from the original articles or from document [5] as indicated by the corresponding symbols, the others are from our experiments, and bold values indicate the best results
The experimental results in Table 2 show that the dual-target entity emotion analysis method for multi-task learning provided by the invention achieves the best accuracy (Acc) and Macro-average F1 (M-F1) results on both the laptop and restaurant datasets, significantly exceeding all comparable methods, which fully demonstrates that the method of the invention is feasible and excellent.
5. Examples of the invention
For the emotion sentence "Prices are higher to dine in but their food is good", which contains the two target entities "Prices" and "food", the model of this example first identifies the context breaker "but", then obtains the semantics of the left clause "Prices are higher to dine in" and of the right clause "their food is good", and finally performs emotion analysis on the left-clause and right-clause semantics respectively, obtaining the emotion polarity "negative" for the left target entity "Prices" and "positive" for the right target entity "food".

Claims (3)

1. A double-target entity emotion analysis method for multi-task learning, characterized by comprising the following steps:
S1, through multi-task learning of sentence context-breaker recognition and left and right entity emotion polarity classification, jointly training a neural network model with automatic sentence context-breaker recognition and automatic dual-target entity emotion polarity classification;
S2, identifying the context breaker in the emotion sentence using the neural network model trained in step S1;
S3, in the neural network model trained in step S1, separating the semantic representation of the emotion sentence at the position corresponding to the context breaker obtained in step S2 to obtain a left-clause semantic representation and a right-clause semantic representation, and then performing emotion analysis on the left-clause and right-clause semantic representations respectively, finally obtaining the emotion polarities of the dual target entities;
wherein the emotion sentence is a multi-emotion-expression sentence comprising a left target entity and a right target entity;
the context breaker is a word located between the left target entity and the right target entity in the emotion sentence that separates the emotional expressions of the two target entities from each other;
the neural network model is a neural network structure based on the BERT language model; the BERT language model refers to the Bidirectional Encoder Representations from Transformers (BERT) language model proposed by Google AI Language.
2. The double-target entity emotion analysis method for multi-task learning according to claim 1, characterized in that:
the step S1 specifically includes:
S1.1 The input sequence s of the BERT language model is composed of the emotion sentence Sen = {…, t_1, w_1, w_2, …, w_n, t_2, …} together with the BERT coding symbols, as follows:

s = \{[\mathrm{CLS}], \ldots, t_1, w_1, w_2, \ldots, w_n, t_2, \ldots, [\mathrm{SEP}]\} \in \mathbb{R}^{m \times d_w}   (1)

Mid = \{w_1, w_2, \ldots, w_n\}   (2)

wherein [CLS] is the encoding of the BERT classifier, [SEP] is the encoding of the BERT terminator, t_1 is the left target entity to be analyzed, t_2 is the right target entity to be analyzed, Mid = {w_1, w_2, …, w_n} is the intermediate word sequence between the left and right target entities t_1 and t_2, "…" represents an omitted word sequence, m is the length of the input sequence s, d_w is the dimension of the character encoding in BERT, n is the length of the intermediate word sequence Mid, and a word refers to a language fragment produced by the BERT Tokenizer;
S1.2 The input sequence s is fed into the BERT language model for processing, yielding the sentence semantic representation C_Sen of the emotion sentence Sen, as follows:

C_{Sen} = \mathrm{BERT}(s) = \{h_1, h_2, \ldots, h_m\} \in \mathbb{R}^{m \times d_b}   (3)

wherein BERT(·) represents the BERT language model, h_i ∈ ℝ^{d_b} is the i-th hidden state of the BERT language model, and d_b is the number of hidden units of the BERT language model;
S1.3 According to the positional correspondence, the intermediate semantic representation C_Mid corresponding to the intermediate word sequence Mid = {w_1, w_2, …, w_n} is extracted from C_Sen, as follows:

C_{Mid} = \mathrm{Extract}(C_{Sen}, Mid) = \{c_1, c_2, \ldots, c_n\} \in \mathbb{R}^{n \times d_b}   (4)

wherein Extract(·) represents the extraction of the intermediate semantics, and c_i ∈ ℝ^{d_b} is the hidden state corresponding to the i-th intermediate word w_i in C_Sen;
S1.4 A softmax linear transformation is performed on the intermediate semantic representation C_Mid to identify the context breaker, calculated as follows:

u = C_{Mid} \cdot W_{sp} + b_{sp}   (5)

P(w \mid C_{Mid}, \theta) = \frac{\exp(u_w)}{\sum_{w' \in Mid} \exp(u_{w'})}   (6)

w^{*} = \arg\max_{w \in Mid} P(w \mid C_{Mid}, \theta)   (7)

wherein equations (5) and (6) represent the computation of the softmax linear transformation performed on the intermediate semantic representation C_Mid, W_sp ∈ ℝ^{d_b} is a learnable parameter vector for context-breaker identification, b_sp is a bias parameter, "·" represents the vector dot-product operation, u ∈ ℝ^n is the context-breaker confidence score vector corresponding to the intermediate word sequence Mid, w is an intermediate word, P(w | C_Mid, θ) represents the predicted probability that the intermediate word w is the context breaker, argmax returns the intermediate word that maximizes P(w | C_Mid, θ), w* is the computed context breaker, θ is the set of all learnable parameters, and exp(·) represents the exponential function with base e;
S1.5 Taking the context breaker w_sp as a separator, two mask matrices consisting of 1s and 0s are formed, and the sentence semantic representation C_Sen is separated into the left-clause semantic representation C_left and the right-clause semantic representation C_right, calculated as follows:

mask_l = [mask_l^1, mask_l^2, \ldots, mask_l^m] \in \mathbb{R}^{m \times d_b}   (8)

mask_l^i = \begin{cases} \mathbf{1}, & i < \mathrm{pos}(w_{sp}) \\ \mathbf{0}, & \text{otherwise} \end{cases}   (9)

mask_r = [mask_r^1, mask_r^2, \ldots, mask_r^m] \in \mathbb{R}^{m \times d_b}   (10)

mask_r^j = \begin{cases} \mathbf{1}, & j > \mathrm{pos}(w_{sp}) \\ \mathbf{0}, & \text{otherwise} \end{cases}   (11)

C_{left} = C_{Sen} \odot mask_l   (12)

C_{right} = C_{Sen} \odot mask_r   (13)

wherein mask_l is the mask matrix for separating the left-clause semantics, mask_r is the mask matrix for separating the right-clause semantics, 1 ∈ ℝ^{d_b} is an all-1 vector, 0 ∈ ℝ^{d_b} is an all-0 vector, token_i ∈ Sen is the i-th word in the sentence Sen, the function pos(·) gives the position number of a specified word in the sentence Sen, mask_l^i is the i-th column vector of mask_l, i ∈ [1, m] an integer, mask_r^j is the j-th column vector of mask_r, j ∈ [1, m] an integer, and ⊙ represents element-wise multiplication;
S1.6 A multi-head self-attention encoding process is performed on the left-clause semantic representation C_left and the right-clause semantic representation C_right respectively, yielding the left-clause semantic encoding C'_left and the right-clause semantic encoding C'_right, calculated as follows:

C'_{left} = \mathrm{MHSA}(C_{left})   (14)

C'_{right} = \mathrm{MHSA}(C_{right})   (15)

wherein MHSA(X) represents the multi-head self-attention encoding MHA(Q = X, K = X, V = X) of an input X ∈ ℝ^{m × d_b};
S1.7 An average pooling operation is performed on the left-clause semantic encoding C'_left and the right-clause semantic encoding C'_right respectively, yielding the left-clause emotion vector Z_L and the right-clause emotion vector Z_r, calculated as follows:

Z_L = \mathrm{AvePooling}(C'_{left})   (16)

Z_r = \mathrm{AvePooling}(C'_{right})   (17)

wherein AvePooling(C) represents a column-wise average pooling operation on the parameter C ∈ ℝ^{m × d_b};
S1.8 A softmax linear transformation is performed on the left-clause emotion vector Z_L and the right-clause emotion vector Z_r, the probabilities of the emotion polarities are computed, and the final emotion polarities are obtained, calculated as follows:

u_L = Z_L \cdot W_p + b_p   (18)

u_r = Z_r \cdot W_p + b_p   (19)

P(y \mid Z_L, \theta) = \frac{\exp(u_L^{y})}{\sum_{y' \in Y} \exp(u_L^{y'})}   (20)

P(y \mid Z_r, \theta) = \frac{\exp(u_r^{y})}{\sum_{y' \in Y} \exp(u_r^{y'})}   (21)

y_L = \arg\max_{y \in Y} P(y \mid Z_L, \theta)   (22)

y_r = \arg\max_{y \in Y} P(y \mid Z_r, \theta)   (23)

wherein W_p ∈ ℝ^{d_b × d_k} is the emotion polarity representation matrix, b_p ∈ ℝ^{d_k} is a bias vector, d_k is the number of emotion polarities, Y is the set of emotion polarities, y is an emotion polarity, u_L, u_r ∈ ℝ^{d_k} are the emotion polarity confidence score vectors corresponding to Z_L and Z_r, P(y | Z_L, θ) and P(y | Z_r, θ) respectively represent the predicted probabilities of Z_L and Z_r for emotion polarity y, y_L and y_r are the finally assessed left and right emotion polarities, argmax returns the emotion polarity that maximizes the corresponding probability, θ is the set of all learnable parameters, and exp(·) represents the exponential function with base e.
3. The double-target entity emotion analysis method for multi-task learning according to claim 1, characterized in that:
in step S1, the joint training method for jointly training the neural network model with automatic sentence context-breaker recognition and automatic dual-target entity emotion polarity classification is as follows:
(1) The loss function for context-breaker recognition and the loss functions for dual-target entity emotion analysis are each calculated with the cross-entropy loss error, as follows:

\Psi_{Mid}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(w_{sp}^{i} \mid C_{Mid}^{i}, \theta)   (24)

\Psi_{L}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(y_{L}^{i} \mid Z_{L}^{i}, \theta)   (25)

\Psi_{r}(\theta) = -\frac{1}{|\Omega|}\sum_{i=1}^{|\Omega|} \log P(y_{r}^{i} \mid Z_{r}^{i}, \theta)   (26)

wherein Ω is the set of training sentences of the dual-target entity emotion analysis task, |Ω| represents the size of the set Ω, w_sp^i is the context-breaker word label of the i-th training sentence in Ω, C_Mid^i is the intermediate semantic representation of the i-th training sentence in Ω, y_L^i and y_r^i are respectively the left and right emotion polarity labels of the i-th training sentence in Ω, Z_L^i and Z_r^i are respectively the left-clause and right-clause emotion vectors of the i-th training sentence in Ω, Ψ_Mid(θ) is the loss function used in context-breaker recognition training, Ψ_L(θ) is the loss function used in left-target-entity emotion analysis training, and Ψ_r(θ) is the loss function used in right-target-entity emotion analysis training;

(2) The joint loss function Ψ(θ) for jointly training sentence context-breaker recognition and dual-target entity emotion polarity classification is calculated using equation (27) below:

\Psi(\theta) = \Psi_{Mid}(\theta) + \alpha_1 \Psi_{L}(\theta) + \alpha_2 \Psi_{r}(\theta)   (27)

wherein α_1 and α_2 are two weight parameters;
(3) the joint training objective is to minimize the joint loss error calculated by equation (27).
CN202210054948.XA 2022-01-18 2022-01-18 Double-target entity emotion analysis method for multi-task learning Pending CN115017912A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210054948.XA CN115017912A (en) 2022-01-18 2022-01-18 Double-target entity emotion analysis method for multi-task learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210054948.XA CN115017912A (en) 2022-01-18 2022-01-18 Double-target entity emotion analysis method for multi-task learning

Publications (1)

Publication Number Publication Date
CN115017912A true CN115017912A (en) 2022-09-06

Family

ID=83066454

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210054948.XA Pending CN115017912A (en) 2022-01-18 2022-01-18 Double-target entity emotion analysis method for multi-task learning

Country Status (1)

Country Link
CN (1) CN115017912A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115618884A (en) * 2022-11-16 2023-01-17 华南师范大学 Language analysis method, device and equipment based on multi-task learning
CN117633239A (en) * 2024-01-23 2024-03-01 中国科学技术大学 End-to-end face emotion recognition method combining combined category grammar

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765769A (en) * 2019-08-27 2020-02-07 电子科技大学 Entity attribute dependency emotion analysis method based on clause characteristics
CN111144130A (en) * 2019-12-26 2020-05-12 辽宁工程技术大学 Context-aware-based fine-grained emotion classification method for hybrid neural network
WO2021073116A1 (en) * 2019-10-18 2021-04-22 平安科技(深圳)有限公司 Method and apparatus for generating legal document, device and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765769A (en) * 2019-08-27 2020-02-07 电子科技大学 Entity attribute dependency emotion analysis method based on clause characteristics
WO2021073116A1 (en) * 2019-10-18 2021-04-22 平安科技(深圳)有限公司 Method and apparatus for generating legal document, device and storage medium
CN111144130A (en) * 2019-12-26 2020-05-12 辽宁工程技术大学 Context-aware-based fine-grained emotion classification method for hybrid neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DUYU TANG ET AL.: "Aspect Level Sentiment Classification with Deep Memory Network", PROCEEDINGS OF THE 2016 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, 1 January 2016 (2016-01-01), pages 214 - 224 *
杨鹏; 杨青; 李志斌; 王扬: "Application of an interactive neural network model based on attention mechanism in fine-grained sentiment classification", Computer Applications and Software, no. 07, 12 July 2020 (2020-07-12), pages 130-135 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115618884A (en) * 2022-11-16 2023-01-17 华南师范大学 Language analysis method, device and equipment based on multi-task learning
CN115618884B (en) * 2022-11-16 2023-03-10 华南师范大学 Language analysis method, device and equipment based on multi-task learning
CN117633239A (en) * 2024-01-23 2024-03-01 中国科学技术大学 End-to-end face emotion recognition method combining combined category grammar
CN117633239B (en) * 2024-01-23 2024-05-17 中国科学技术大学 End-to-end face emotion recognition method combining combined category grammar

Similar Documents

Publication Publication Date Title
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
Shi et al. Deep adaptively-enhanced hashing with discriminative similarity guidance for unsupervised cross-modal retrieval
CN111259127B (en) Long text answer selection method based on transfer learning sentence vector
CN115017912A (en) Double-target entity emotion analysis method for multi-task learning
CN113051380B (en) Information generation method, device, electronic equipment and storage medium
CN115796182A (en) Multi-modal named entity recognition method based on entity-level cross-modal interaction
CN116561305A (en) False news detection method based on multiple modes and transformers
CN115687567A (en) Method for searching similar long text by short text without marking data
Barse et al. Cyber-Trolling Detection System
Jin et al. A review of text sentiment analysis methods and applications
CN115795037B (en) Multi-label text classification method based on label perception
Ronghui et al. Application of Improved Convolutional Neural Network in Text Classification.
CN114911906A (en) Aspect-level emotion analysis method based on hybrid neural network
Zouidine et al. A comparative study of pre-trained word embeddings for Arabic sentiment analysis
Syaputra et al. Improving mental health surveillance over Twitter text classification using word embedding techniques
Prajapati et al. Automatic Question Tagging using Machine Learning and Deep learning Algorithms
Affi et al. Arabic named entity recognition using variant deep neural network architectures and combinatorial feature embedding based on cnn, lstm and bert
Sharma et al. A framework for image captioning based on relation network and multilevel attention mechanism
Ranjan et al. An Optimized Deep ConvNet Sentiment Classification Model with Word Embedding and BiLSTM Technique
Gao et al. Label Smoothing for Enhanced Text Sentiment Classification
Chen et al. Multi-Label Text Classification Based on BERT and Label Attention Mechanism
Vemulapalli et al. A comparative study of twitfeel and transformer-based techniques for the analysis of text data for sentiment classification
Mouthami et al. Text Sentiment Analysis of Film Reviews Using Bi-LSTM and GRU
Pokhrel et al. Automatic Extractive Text Summarization for Text in Nepali Language with Bidirectional Encoder Representation Transformers and K-Mean Clustering
CN117933254B (en) Chinese entity relation extraction method based on multi-feature fusion and progressive comparison

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination