CN115017912A - Double-target entity emotion analysis method for multi-task learning
- Publication number
- CN115017912A (application CN202210054948.XA)
- Authority
- CN
- China
- Prior art keywords
- emotion
- context
- target entity
- clause
- sentence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Databases & Information Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a dual-target entity emotion analysis method for multi-task learning. First, a neural network model that automatically identifies sentence context disjunctors and classifies the emotion polarity of two target entities is jointly trained through multi-task learning of sentence context disjunctor identification and left and right entity emotion polarity classification. Second, the trained neural network model is used to identify the context disjunctor in an emotion sentence. Then the semantic representation of the emotion sentence is separated at the identified context disjunctor into a left clause semantic representation and a right clause semantic representation, and emotion analysis is performed on each of them to obtain the emotion polarities of the dual-target entities. Because the context disjunctor separates the emotional expressions of the two target entities in the emotion sentence from each other, the aspect-level emotion analysis problem is solved more effectively.
Description
Technical Field
The invention relates to aspect-level emotion analysis in natural language understanding, and in particular to a dual-target entity emotion analysis method for multi-task learning, which can be widely applied to aspect-level emotion analysis tasks in various fields.
Background
The purpose of aspect-level sentiment classification is to predict the polarity of multiple target entities in a sentence or a document. It is a fine-grained sentiment analysis task that differs from traditional sentiment analysis in that polarity analysis is carried out on each target entity (generally in three categories: positive, negative and neutral). Aspect-level sentiment classification is commonly applied to review sentences, such as mall shopping reviews, restaurant reviews and movie reviews. A sentence in this setting usually contains two aspect words and their associated emotional orientations. For example, the sentence "Prices are high to dine here but the food is good" is negative for the target entity "Prices" but positive for the target entity "food".
With the continuous development of artificial neural network technology, various neural networks, such as Long Short-Term Memory (LSTM), the Deep Memory Network and the Bidirectional Encoder Representations from Transformers (BERT) language model proposed by Google AI Language, have been applied to aspect polarity classification, providing end-to-end classification without any feature engineering work. However, when a sentence contains multiple target entities, the aspect polarity classification task must distinguish the emotions of the different aspects. The task is therefore more complex than document-level sentiment analysis, which has only one overall sentiment orientation, and its main challenge is how to highlight the emotional expressions related to a given target entity and suppress the unrelated ones when performing emotion analysis on different target entities. To achieve this goal, deep learning methods for aspect polarity classification have proposed various aspect-centered emotion semantic learning methods, for example attention-based semantic learning, position attenuation, left-right semantic learning, aspect connection and global semantic learning, but each of them remains affected by irrelevant emotional expressions to a certain extent. To thoroughly eliminate the influence of irrelevant emotion expressions in multi-target emotion analysis, the invention provides a dual-target entity emotion analysis method for multi-task learning, in which the emotion expressions of the two target entities in an emotion sentence are separated from each other by a context disjunctor.
Disclosure of Invention
The invention discloses a double target entity emotion analysis method for multi-task learning, which is characterized in that a neural network model with automatic sentence context disjunction identification and double target entity emotion polarity classification is jointly trained through multi-task learning of sentence context disjunction identification and left and right entity emotion polarity classification, and the problem of aspect level emotion analysis is solved in a more effective way.
In order to achieve the purpose, the technical scheme of the invention is as follows:
A dual-target entity emotion analysis method for multi-task learning, characterized by comprising the following steps:
S1, through multi-task learning of sentence context disjunctor identification and left and right entity emotion polarity classification, jointly training a neural network model with automatic sentence context disjunctor identification and automatic dual-target entity emotion polarity classification;
S2, identifying the context disjunctor in an emotion sentence by using the neural network model trained in step S1;
S3, in the neural network model trained in step S1, separating the semantic representation of the emotion sentence at the position corresponding to the context disjunctor obtained in step S2 to obtain a left clause semantic representation and a right clause semantic representation, and then performing emotion analysis on the left clause semantic representation and the right clause semantic representation respectively to finally obtain the emotion polarities of the dual-target entities;
the emotion sentence is a multi-emotion expression sentence comprising a left target entity and a right target entity;
the context disjunctor is a word that is positioned between the left target entity and the right target entity in the emotion sentence and separates the emotional expressions of the two target entities from each other;
the neural network model is a neural network structure based on the BERT language model; the BERT language model refers to the Bidirectional Encoder Representations from Transformers (BERT) language model proposed by Google AI Language.
Further, the step S1 specifically includes:
S1.1 The input sequence $s$ of the BERT language model is composed of the emotion sentence $Sen = \{\ldots, t_1, w_1, w_2, \ldots, w_n, t_2, \ldots\}$ and the BERT encoding symbols, as follows:
$$s = \{[CLS], \ldots, t_1, w_1, w_2, \ldots, w_n, t_2, \ldots, [SEP]\} \in \mathbb{R}^{m \times d_w} \quad (1)$$
$$Mid = \{w_1, w_2, \ldots, w_n\} \quad (2)$$
where [CLS] is the encoding of the BERT classification symbol, [SEP] is the encoding of the BERT terminator, $t_1$ is the left target entity to be analyzed, $t_2$ is the right target entity to be analyzed, $Mid = \{w_1, w_2, \ldots, w_n\}$ is the intermediate word sequence between the left and right target entities $t_1$ and $t_2$, "…" represents an omitted word sequence, $m$ is the length of the input sequence $s$, $d_w$ is the dimension of the character encoding in BERT, $n$ is the length of the intermediate word sequence $Mid$, and a "word" refers to a language fragment produced by the BERT tokenizer.
S1.2 The input sequence $s$ is fed into the BERT language model for processing to obtain the sentence semantic representation $C_{Sen}$ of the emotion sentence $Sen$, as follows:
$$C_{Sen} = \mathrm{BERT}(s) = \{h_1, h_2, \ldots, h_m\} \in \mathbb{R}^{m \times d_b} \quad (3)$$
where $\mathrm{BERT}(\cdot)$ denotes the BERT language model, $h_i \in \mathbb{R}^{d_b}$ is the $i$-th hidden state of the BERT language model, and $d_b$ is the number of hidden units of the BERT language model.
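S1.3 According to the correspondence between positions, the intermediate semantic representation $C_{Mid}$ corresponding to the intermediate word sequence $Mid = \{w_1, w_2, \ldots, w_n\}$ is extracted from $C_{Sen}$, as follows:
$$C_{Mid} = \mathrm{extract}(C_{Sen}) = \{h_{w_1}, h_{w_2}, \ldots, h_{w_n}\} \in \mathbb{R}^{n \times d_b} \quad (4)$$
where $\mathrm{extract}(\cdot)$ denotes the extraction of the intermediate semantics, and $h_{w_i}$ is the hidden state in $C_{Sen}$ corresponding to the $i$-th intermediate word $w_i$.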
S1.4 A linear transformation followed by softmax is performed on the intermediate semantic representation $C_{Mid}$ to identify the context disjunctor, calculated as follows:
$$o = C_{Mid} \cdot W_{sp} + b_{sp} \in \mathbb{R}^{n} \quad (5)$$
$$P(w \mid C_{Mid}, \theta) = \frac{\exp(o_w)}{\sum_{w' \in Mid} \exp(o_{w'})} \quad (6)$$
$$w^{*} = \arg\max_{w \in Mid} P(w \mid C_{Mid}, \theta) \quad (7)$$
where equations (5) and (6) represent the softmax linear transformation performed on the intermediate semantic representation $C_{Mid}$, $W_{sp} \in \mathbb{R}^{d_b}$ is a learnable parameter vector for context disjunctor identification, $b_{sp}$ is an offset parameter, "$\cdot$" denotes the vector dot product, $o$ is the context disjunctor confidence score vector corresponding to the intermediate word sequence $Mid$, $w$ is an intermediate word, $P(w \mid C_{Mid}, \theta)$ is the predicted probability that the intermediate word $w$ is the context disjunctor, $\arg\max$ returns the intermediate word for which $P(w \mid C_{Mid}, \theta)$ is maximal, $w^{*}$ is the calculated context disjunctor, $\theta$ is the set of all learnable parameters, and $\exp(\cdot)$ is the exponential function with base $e$.
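The scoring in step S1.4 can be sketched in plain Python as one linear score per intermediate word, a softmax over those scores, and an argmax. The names `disjunctor_probs`, `w` and `b`, and all numeric values, are hypothetical and chosen only so the example can be followed by hand; in the method the hidden states come from BERT and the parameters are learned.

```python
import math
from typing import List


def disjunctor_probs(c_mid: List[List[float]], w: List[float], b: float) -> List[float]:
    """Softmax over one linear score per intermediate word."""
    scores = [sum(h_k * w_k for h_k, w_k in zip(h, w)) + b for h in c_mid]
    top = max(scores)                       # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


mid = ["are", "high", "but", "the"]
c_mid = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8], [0.0, 0.1]]  # toy hidden states, d_b = 2
w, b = [2.0, 1.0], 0.0                                     # toy parameter vector and offset
probs = disjunctor_probs(c_mid, w, b)
w_star = mid[max(range(len(mid)), key=probs.__getitem__)]  # word with the highest probability
print(w_star)
```

With these toy values the third word receives the largest score, so the sketch selects "but" as the context disjunctor.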
S1.5 Using the context disjunctor $w_{sp}$ as the separator, two mask matrices consisting of 1s and 0s are formed, and the sentence semantic representation $C_{Sen}$ is separated into the left clause semantic representation $C_{left}$ and the right clause semantic representation $C_{right}$. The calculation process is as follows:
$$mask_L^{i} = \begin{cases} \vec{1}, & pos(token_i) < pos(w_{sp}) \\ \vec{0}, & \text{otherwise} \end{cases}$$
$$mask_r^{j} = \begin{cases} \vec{1}, & pos(token_j) > pos(w_{sp}) \\ \vec{0}, & \text{otherwise} \end{cases}$$
$$C_{left} = C_{Sen} \otimes mask_L$$
$$C_{right} = C_{Sen} \otimes mask_r$$
where $mask_L$ is the mask matrix for separating the left clause semantics, $mask_r$ is the mask matrix for separating the right clause semantics, $\vec{1}$ is an all-ones vector, $\vec{0}$ is an all-zeros vector, $token_i \in Sen$ is the $i$-th word in the sentence $Sen$, the function $pos(\cdot)$ returns the position number of the specified word in the sentence $Sen$, $mask_L^{i}$ is the $i$-th column vector of $mask_L$ with $i \in [1, m]$ an integer, $mask_r^{j}$ is the $j$-th column vector of $mask_r$ with $j \in [1, m]$ an integer, and $\otimes$ denotes element-by-element multiplication.
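A minimal sketch of the masking in step S1.5 follows. It stores one hidden vector per token as a row (the text describes the masks by columns, which is the same operation transposed), and it assumes that the position of the disjunctor itself is kept in neither clause; that boundary choice and the name `split_by_disjunctor` are assumptions of this example.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def split_by_disjunctor(c_sen: Matrix, sp_pos: int) -> Tuple[Matrix, Matrix]:
    """Zero out everything right of sp_pos for the left clause, and left of it for the right clause.

    ASSUMPTION: the row at sp_pos (the disjunctor) is zeroed in both clauses.
    """
    m, d_b = len(c_sen), len(c_sen[0])
    # Each mask row is all ones or all zeros, constant across a token's hidden vector.
    mask_l = [[1.0 if i < sp_pos else 0.0] * d_b for i in range(m)]
    mask_r = [[1.0 if i > sp_pos else 0.0] * d_b for i in range(m)]
    # Element-by-element multiplication of C_Sen with each mask.
    c_left = [[x * k for x, k in zip(row, mrow)] for row, mrow in zip(c_sen, mask_l)]
    c_right = [[x * k for x, k in zip(row, mrow)] for row, mrow in zip(c_sen, mask_r)]
    return c_left, c_right


c_sen = [[float(i + 1), float(i + 1)] for i in range(5)]  # toy C_Sen with m = 5, d_b = 2
c_left, c_right = split_by_disjunctor(c_sen, sp_pos=2)
print(c_left)
print(c_right)
```

Both outputs keep the full length m, so the two clauses can be passed through the same downstream layers without reshaping.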
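S1.6 A multi-head self-attention encoding process is performed on the left clause semantic representation $C_{left}$ and the right clause semantic representation $C_{right}$ respectively to obtain the left clause semantic code $C'_{left}$ and the right clause semantic code $C'_{right}$. The calculation process is as follows:
$$C'_{left} = \mathrm{MHSA}(C_{left}), \qquad C'_{right} = \mathrm{MHSA}(C_{right})$$
where $\mathrm{MHSA}(\cdot)$ denotes the multi-head self-attention encoding.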
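For readers unfamiliar with multi-head self-attention, the following is a small pure-Python sketch of the standard scaled dot-product formulation with per-head query, key and value projections and an output projection. The random weights, the dimensions and all function names are assumptions for illustration; the example does not reproduce the trained parameters, and it does not exclude zeroed (masked) positions from attention, since the text does not say whether the method does so.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x: Matrix, wq: Matrix, wk: Matrix, wv: Matrix) -> Matrix:
    """Scaled dot-product self-attention for one head."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    return matmul([softmax(r) for r in scores], v)


def multi_head_self_attention(x: Matrix, heads: List[Tuple[Matrix, Matrix, Matrix]],
                              wo: Matrix) -> Matrix:
    outs = [attention_head(x, wq, wk, wv) for wq, wk, wv in heads]
    # Concatenate the head outputs token by token, then apply the output projection.
    concat = [sum((o[i] for o in outs), []) for i in range(len(x))]
    return matmul(concat, wo)


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_b, n_heads = 4, 2                      # toy sizes; BERT-BASE would use 768 and 12
d_head = d_b // n_heads
heads = [tuple(rand_matrix(d_b, d_head, rng) for _ in range(3)) for _ in range(n_heads)]
wo = rand_matrix(d_b, d_b, rng)
c_left = rand_matrix(3, d_b, rng)        # stand-in for a left clause representation
c_left_enc = multi_head_self_attention(c_left, heads, wo)
print(len(c_left_enc), len(c_left_enc[0]))
```

The encoded output has the same shape as its input, so the same routine can be applied to the right clause representation.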
S1.7 An average pooling operation is performed on the left clause semantic code $C'_{left}$ and the right clause semantic code $C'_{right}$ respectively to obtain the left clause emotion vector $Z_L$ and the right clause emotion vector $Z_r$. The calculation process is as follows:
$$Z_L = \mathrm{avePooling}(C'_{left}), \qquad Z_r = \mathrm{avePooling}(C'_{right})$$
where $\mathrm{avePooling}(C)$ denotes a pooling operation that averages the parameter $C$ by column.
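Column-wise average pooling, as used in step S1.7, reduces a sequence of hidden vectors to a single vector. A minimal sketch, with the hypothetical name `ave_pooling` and toy values; it averages over every row, including any zeroed rows, which is an assumption of this example:

```python
from typing import List


def ave_pooling(c: List[List[float]]) -> List[float]:
    """Average each column of c over all rows (one row per token)."""
    n = len(c)
    return [sum(col) / n for col in zip(*c)]


z = ave_pooling([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(z)  # [3.0, 4.0]
```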
S1.8 A linear transformation followed by softmax is performed on the left clause emotion vector $Z_L$ and the right clause emotion vector $Z_r$ to calculate the emotion polarity probabilities and obtain the final emotion polarities. The calculation process is as follows:
$$o_L = M Z_L + b, \qquad o_r = M Z_r + b$$
$$P(y \mid Z_L, \theta) = \frac{\exp(o_{L,y})}{\sum_{y' \in Y} \exp(o_{L,y'})}, \qquad P(y \mid Z_r, \theta) = \frac{\exp(o_{r,y})}{\sum_{y' \in Y} \exp(o_{r,y'})}$$
$$y_L = \arg\max_{y \in Y} P(y \mid Z_L, \theta), \qquad y_r = \arg\max_{y \in Y} P(y \mid Z_r, \theta)$$
where $M \in \mathbb{R}^{d_k \times d_b}$ is the representation matrix of the emotion polarities, $b \in \mathbb{R}^{d_k}$ is an offset vector, $d_k$ is the number of emotion polarity classes, $Y$ is the set of emotion polarity classes, $y$ is one emotion polarity, $o_L$ and $o_r$ are the emotion polarity confidence score vectors corresponding to $Z_L$ and $Z_r$ respectively, $P(y \mid Z_L, \theta)$ and $P(y \mid Z_r, \theta)$ are the predicted probabilities of $Z_L$ and $Z_r$ for emotion polarity $y$, $y_L$ and $y_r$ are the finally assessed left and right emotion polarities, $\arg\max$ returns the emotion polarity for which $P(y \mid Z_L, \theta)$ or $P(y \mid Z_r, \theta)$ is maximal, $\theta$ is the set of all learnable parameters, and $\exp(\cdot)$ is the exponential function with base $e$.
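The polarity classification of step S1.8 can be illustrated with a short sketch that applies one linear layer and a softmax to each clause emotion vector and returns the most probable polarity. The label order in `POLARITIES`, the sharing of one matrix and offset between both clauses, and the toy numbers are assumptions of this example.

```python
import math
from typing import List, Tuple

POLARITIES = ["positive", "negative", "neutral"]  # assumed label order


def classify(z: List[float], m: List[List[float]], b: List[float]) -> Tuple[str, List[float]]:
    """Linear scores, softmax probabilities, and the most probable polarity."""
    scores = [sum(w * x for w, x in zip(row, z)) + bias for row, bias in zip(m, b)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return POLARITIES[best], probs


m = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy polarity matrix, d_k = 3, d_b = 2
b = [0.0, 0.0, 0.0]
y_l, p_l = classify([2.0, 0.0], m, b)     # scores 2.0, 0.0, 1.0
y_r, p_r = classify([0.0, 3.0], m, b)     # scores 0.0, 3.0, 1.5
print(y_l, y_r)
```

With these values the left clause is labelled "positive" and the right clause "negative".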
Further, in step S1, the joint training method for jointly training the neural network model with automatic sentence context disjunctor identification and automatic dual-target entity emotion polarity classification comprises:
(1) calculating the loss function of context disjunctor identification and the loss functions of dual-target entity emotion analysis using the cross-entropy loss error, as follows:
$$\Psi_{Mid}(\theta) = -\sum_{i=1}^{|\Omega|} \log P(w_i^{sp} \mid C_{Mid}^{i}, \theta)$$
$$\Psi_{L}(\theta) = -\sum_{i=1}^{|\Omega|} \log P(y_i^{L} \mid Z_L^{i}, \theta)$$
$$\Psi_{r}(\theta) = -\sum_{i=1}^{|\Omega|} \log P(y_i^{r} \mid Z_r^{i}, \theta)$$
where $\Omega$ is the set of training sentences of the dual-target entity emotion analysis task, $|\Omega|$ is the size of the set $\Omega$, $w_i^{sp}$ is the word label of the context disjunctor of the $i$-th training sentence in $\Omega$, $C_{Mid}^{i}$ is the intermediate semantic representation of the $i$-th training sentence in $\Omega$, $y_i^{L}$ and $y_i^{r}$ are the left and right emotion polarity labels of the $i$-th training sentence in $\Omega$, $Z_L^{i}$ and $Z_r^{i}$ are the left and right clause emotion vectors of the $i$-th training sentence in $\Omega$, $\Psi_{Mid}(\theta)$ is the loss function used in context disjunctor identification training, $\Psi_L(\theta)$ is the loss function used in left target entity emotion analysis training, and $\Psi_r(\theta)$ is the loss function used in right target entity emotion analysis training;
(2) calculating, using equation (27), the joint loss function for jointly training sentence context disjunctor identification and dual-target entity emotion polarity classification, which combines $\Psi_{Mid}(\theta)$, $\Psi_L(\theta)$ and $\Psi_r(\theta)$ using two weight parameters $\alpha_1$ and $\alpha_2$;
(3) the joint training objective is to minimize the joint loss error calculated by equation (27).
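The joint objective can be sketched as three cross-entropy terms combined with the two weights. The exact form of equation (27) is not legible in this text, so the weighting below (alpha1 on the disjunctor loss, alpha2 on the sum of the two polarity losses) is only one possible reading and is an assumption of this example, as are the function names and the toy probabilities.

```python
import math
from typing import Sequence


def cross_entropy(prob_rows: Sequence[Sequence[float]], gold: Sequence[int]) -> float:
    """Summed negative log-probability of the gold label of each training sentence."""
    return -sum(math.log(probs[g]) for probs, g in zip(prob_rows, gold))


def joint_loss(loss_mid: float, loss_l: float, loss_r: float,
               alpha1: float, alpha2: float) -> float:
    # ASSUMPTION: alpha1 weights the disjunctor loss, alpha2 both polarity losses.
    return alpha1 * loss_mid + alpha2 * (loss_l + loss_r)


# Toy predicted distributions for two training sentences (hypothetical numbers).
p_mid = [[0.1, 0.7, 0.2], [0.25, 0.25, 0.5]]    # over intermediate words
p_left = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2]]     # over polarities, left target
p_right = [[0.3, 0.3, 0.4], [0.9, 0.05, 0.05]]  # over polarities, right target

loss_mid = cross_entropy(p_mid, [1, 2])
loss_l = cross_entropy(p_left, [0, 1])
loss_r = cross_entropy(p_right, [2, 0])
total = joint_loss(loss_mid, loss_l, loss_r, alpha1=1.0, alpha2=0.5)
print(round(total, 4))
```

In training, an optimizer would update the shared parameters to reduce `total`, so that errors in disjunctor identification and in both polarity classifications all contribute to the gradient.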
To thoroughly eliminate the influence of irrelevant emotion expressions in multi-target emotion analysis, the invention provides a dual-target entity emotion analysis method for multi-task learning, in which the emotion expressions of the two target entities in an emotion sentence are separated from each other by a context disjunctor. First, a neural network model with automatic sentence context disjunctor identification and automatic dual-target entity emotion polarity classification is jointly trained through joint learning of sentence context disjunctor identification and left and right entity emotion polarity classification. Second, the trained neural network model is used to identify the context disjunctor in an emotion sentence. Then the semantic representation of the emotion sentence is separated at the position corresponding to the identified context disjunctor to obtain a left clause semantic representation and a right clause semantic representation, and emotion analysis is performed on each of them to obtain the emotion polarities of the dual-target entities.
The invention has the following advantages:
(1) the emotion sentences are dynamically encoded by a BERT language model that has been extensively pre-trained and is fine-tuned on the task, which effectively addresses the problem that aspect-level emotion analysis corpora are too small;
(2) the emotion expressions of the two target entities in an emotion sentence are separated from each other by the context disjunctor, which thoroughly eliminates the influence of irrelevant emotion expressions in multi-target emotion analysis;
(3) the context disjunctor converts dual-target entity emotion analysis into two independent single-target entity emotion analyses, which greatly improves the performance of dual-target entity emotion analysis;
(4) by converting emotion sentences containing more than two target entities into several dual-target entity emotion sentences, the method can be applied to various types of aspect-level emotion analysis tasks.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
The present invention is further illustrated by the following specific examples, but the scope of the present invention is not limited to the following examples.
Let the emotion sentence containing the left target entity $t_1$ and the right target entity $t_2$ be $Sen = \{\ldots, t_1, w_1, w_2, \ldots, w_n, t_2, \ldots\}$. The emotions of the dual target entities $t_1$ and $t_2$ are analyzed by steps S1 to S3 exactly as set out in the Disclosure of Invention above, including sub-steps S1.1 to S1.8 of step S1 and the joint training procedure that minimizes the joint loss of equation (27).
According to the embodiment, the emotional expressions of two target entities in the emotional sentences are separated from each other through the context disjunction symbol, and the influence of irrelevant emotional expressions in multi-target emotional analysis is thoroughly solved.
Examples of the applications
1. Example Environment
This example uses the BERT-BASE version, proposed and developed by Google AI Language in "Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of the 2019 Conference of NAACL, pp 4171-4186", as the pre-trained model for the BERT coding layer; it comprises 12 Transformer layers, 768 hidden units and 12 attention heads, with 110M parameters in total. The multi-head attention adopted in this example is taken from "Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention Is All You Need. In: 31st Conference on Neural Information Processing Systems (NIPS 2017), pp 5998-". To minimize the loss value, this example uses the Adam optimizer with a learning rate of 2e-5 and a batch size of 16; during training, the number of epochs is set to 10.
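The settings above can be collected in a small configuration object, sketched below. The class name `TrainConfig`, its field names and the helper `steps_per_epoch` are hypothetical and serve only to keep the stated values in one place; the training-set size used in the usage line is likewise a made-up number.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    pretrained: str = "BERT-BASE"
    num_layers: int = 12          # Transformer layers
    hidden_size: int = 768        # hidden units
    num_heads: int = 12           # attention heads
    optimizer: str = "Adam"
    learning_rate: float = 2e-5
    batch_size: int = 16
    epochs: int = 10


def steps_per_epoch(num_sentences: int, cfg: TrainConfig) -> int:
    """Number of optimizer steps needed to see every training sentence once."""
    return math.ceil(num_sentences / cfg.batch_size)


cfg = TrainConfig()
# 1000 is a hypothetical training-set size, not a figure from the dataset.
total_steps = steps_per_epoch(1000, cfg) * cfg.epochs
print(total_steps)
```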
2. Data set
This example uses the internationally widely used SemEval-2014 Task 4 dataset, published in 2014 at the eighth International Workshop on Semantic Evaluation. It provides two sets of review data, from the restaurant (Rest) and laptop (Lap) domains. Each sample in the SemEval-2014 Task 4 dataset consists of one review sentence, some opinion targets and the corresponding sentiment polarity toward each opinion target. The dataset details are shown in Table 1.
Table 1 data set details
3. Comparison method
This example compares the model of the invention with 5 non-BERT methods and 4 BERT-based methods, the comparative methods are shown below:
(1) non-BERT method
MenNet [1] uses a multi-layer memory network in conjunction with attention to capture the contribution of each context word to the aspect polarity classification.
IAN [2] adopts two LSTM networks to acquire the features of specific aspects and contexts respectively, then generates their attention vectors interactively, and finally connects the two attention vectors for aspect polarity classification.
TNet-LF [3] uses the CNN network to extract important features from the bi-directional LSTM network based word representation and proposes a relevance-based mechanism to generate a specific target representation of the words in the sentence. The model also employs a position attenuation technique.
MCRF-SA [4] proposes a compact attention model based on multiple CRFs, which can extract aspect-specific opinion spans. The model also employs position attenuation and facet joining techniques.
MAN [5] builds two attentions with a position function on top of the multi-layer transducer encoder: an interactive attention for generating context and relationship between aspects, and a local attention based on the aspect to context of the transducer encoder.
(2) BERT-based methods
BERT-BASE [6] is the BERT-base version of BERT developed by Google AI Language. It performs aspect polarity classification with a single-sentence input: "[CLS] + review sentence + [SEP]".
BERT-SPC [7] applies the pre-trained BERT model to the sentence-pair classification (SPC) task. When BERT-SPC is applied to the aspect polarity classification task, the input takes the form: "[CLS] + review sentence + [SEP] + aspect target + [SEP]".
AEN-BERT [7] builds two multi-head attention mechanisms on top of the BERT encoder: a multi-head self-attention that models the context, and an aspect-to-context multi-head attention that models the aspect target.
MAN-BERT is a variant of the MAN [5] model in which this example replaces the Transformer encoder in MAN [5] with the BERT model.
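The two input formats quoted for BERT-BASE and BERT-SPC can be illustrated with a short Python sketch. The function names and the example tokens are hypothetical, and whole words stand in for the WordPiece token ids that a real BERT tokenizer would produce.

```python
def bert_base_input(sentence_tokens):
    """Single-sentence format: [CLS] + review sentence + [SEP]."""
    return ["[CLS]"] + list(sentence_tokens) + ["[SEP]"]


def bert_spc_input(sentence_tokens, aspect_tokens):
    """Sentence-pair format: [CLS] + review sentence + [SEP] + aspect target + [SEP]."""
    return (["[CLS]"] + list(sentence_tokens) + ["[SEP]"]
            + list(aspect_tokens) + ["[SEP]"])


# Hypothetical review sentence and aspect target, for illustration only.
tokens = "the food is good".split()
single = bert_base_input(tokens)
pair = bert_spc_input(tokens, ["food"])
```

The pair format simply appends the aspect target as a second segment, so the same encoder can attend from the aspect to the whole sentence.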
The references cited above are:
1. Tang D, Qin B, Liu T (2016) Aspect Level Sentiment Classification with Deep Memory Network. In: Empirical Methods in Natural Language Processing, pp 214-224
2. Ma D, Li S, Zhang X, Wang H (2017) Interactive Attention Networks for Aspect-Level Sentiment Classification. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19-25 August 2017, pp 4068-4074
3. Li X, Bing L, Lam W, Shi B (2018) Transformation Networks for Target-Oriented Sentiment Classification. In: Proceedings of ACL, pp 946-956
4. Xu L, Bing L, Lu W, Huang F (2020) Aspect Sentiment Classification with Aspect-Specific Opinion Spans. In: Proceedings of EMNLP 2020, pp 3561-3567
5. Xu Q, Zhu L, Dai T, Yan C (2020) Aspect-based sentiment classification with multi-attention network. Neurocomputing, 388(3):135-143
6. Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of the 2019 Conference of NAACL, pp 4171-4186
7. Song Y, Wang J, Jiang T, Liu Z, Rao Y (2019) Attentional encoder network for targeted sentiment classification. In: arXiv preprint arXiv:1902.09314
4. Comparative results
This example evaluates the models by reporting accuracy (Acc) and macro-averaged F1 (M-F1) on the datasets.
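For reference, the two metrics can be computed as in the following minimal Python sketch. The label names and the toy predictions are made up for illustration and are not results from the experiments; macro-averaged F1 is taken here as the unweighted mean of per-class F1 over all classes that appear in the gold or predicted labels.

```python
def accuracy(gold, pred):
    """Fraction of samples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data, for illustration only.
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "positive"]
acc = accuracy(gold, pred)
m_f1 = macro_f1(gold, pred)
```

Because every class contributes equally, macro-averaged F1 penalizes a model that ignores a rare class (here "neutral") more strongly than accuracy does.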
Table 2. Experimental results. Results marked with the symbol "+" are taken from the original article, results marked with the symbol "+" are taken from reference [5], and the others are from our experiments; bold values indicate the best results.
The experimental results in Table 2 show that the dual-target entity emotion analysis method for multi-task learning proposed by the invention achieves the best accuracy (Acc) and macro-averaged F1 (M-F1) on both the laptop and restaurant datasets and clearly exceeds the results of all compared methods, which demonstrates that the method of the invention is feasible and effective.
5. Example
For the emotion sentence "price area high to two key things this person food is good", which contains the two target entities "price" and "food", the example model first identifies the context disjunctor as "but", then obtains the semantics of the left clause "price area high to two" and the semantics of the right clause "the person food good", and finally performs emotion analysis on the left clause semantics and the right clause semantics separately, finding that the emotion polarity of the left target entity "price" is "negative" and the emotion polarity of the right target entity "food" is "positive".
Claims (3)
1. A dual-target entity emotion analysis method for multi-task learning, characterized by comprising the following steps:
S1. through multi-task learning of sentence context disjunctor recognition and left and right entity emotion polarity classification, jointly training a neural network model capable of automatic sentence context disjunctor recognition and automatic dual-target entity emotion polarity classification;
S2. identifying the context disjunctor in the emotion sentence by using the neural network model trained in step S1;
S3. in the neural network model trained in step S1, separating the semantic representation of the emotion sentence at the position corresponding to the context disjunctor obtained in step S2 to obtain a left clause semantic representation and a right clause semantic representation, and then performing emotion analysis on the left clause semantic representation and the right clause semantic representation respectively to finally obtain the emotion polarities of the dual target entities;
the emotion sentence is a multi-emotion expression sentence comprising a left target entity and a right target entity;
the context disjunctor is a word that is located between the left target entity and the right target entity in the emotion sentence and that separates the emotional expressions of the two target entities from each other;
the neural network model is a neural network structure based on the BERT language model; the BERT language model refers to the Bidirectional Encoder Representations from Transformers (BERT) language model proposed by Google AI Language.
2. The dual-target entity emotion analysis method for multitask learning according to claim 1, characterized in that:
the step S1 specifically includes:
S1.1 The input sequence $s$ of the BERT language model is composed of the emotion sentence $Sen = \{\ldots, t_1, w_1, w_2, \ldots, w_n, t_2, \ldots\}$ together with the BERT coding symbols, as follows:
$$s = \{[CLS], \ldots, t_1, w_1, w_2, \ldots, w_n, t_2, \ldots, [SEP]\} \in \mathbb{R}^{m \times d_w} \quad (1)$$
$$Mid = \{w_1, w_2, \ldots, w_n\} \quad (2)$$
where [CLS] is the code of the BERT classification symbol, [SEP] is the code of the BERT end symbol, $t_1$ is the left target entity to be analyzed, $t_2$ is the right target entity to be analyzed, $Mid = \{w_1, w_2, \ldots, w_n\}$ is the intermediate word sequence between the left and right target entities $t_1$ and $t_2$, "…" represents an omitted word sequence, $m$ is the length of the input sequence $s$, $d_w$ is the dimension of the character encoding in BERT, $n$ is the length of the intermediate word sequence $Mid$, and a word refers to a language fragment produced by the BERT tokenizer;
S1.2 The input sequence $s$ is fed into the BERT language model for processing to obtain the sentence semantic representation $C_{Sen}$ of the emotion sentence Sen, as follows:
$$C_{Sen} = \mathrm{BERT}(s) = \{h_1, h_2, \ldots, h_m\} \in \mathbb{R}^{m \times d_b}$$
where $\mathrm{BERT}(\cdot)$ represents the BERT language model, $h_i \in \mathbb{R}^{d_b}$ is the i-th hidden state of the BERT language model, and $d_b$ is the number of hidden units of the BERT language model;
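S1.3 According to the correspondence between words and hidden states, extract from $C_{Sen}$ the intermediate semantic representation $C_{Mid}$ corresponding to the intermediate word sequence $Mid = \{w_1, w_2, \ldots, w_n\}$, as follows:
$$C_{Mid} = \mathrm{extract}(C_{Sen}) = \{h_{w_1}, h_{w_2}, \ldots, h_{w_n}\} \in \mathbb{R}^{n \times d_b}$$
where $\mathrm{extract}(\cdot)$ represents the extraction of the intermediate semantics, and $h_{w_i} \in \mathbb{R}^{d_b}$ is the hidden state in $C_{Sen}$ corresponding to the i-th intermediate word $w_i$;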
S1.4 Apply a linear transformation followed by softmax to the intermediate semantic representation $C_{Mid}$ to identify the context disjunctor, calculated as follows:
$$o_{Mid} = C_{Mid} \cdot W_{Mid} + b_{Mid}$$
$$P(w \mid C_{Mid}, \theta) = \frac{\exp(o_{Mid}[w])}{\sum_{w' \in Mid} \exp(o_{Mid}[w'])}$$
$$w^{*} = \arg\max_{w \in Mid} P(w \mid C_{Mid}, \theta)$$
where $W_{Mid} \in \mathbb{R}^{d_b}$ is a learnable parameter vector for context disjunctor identification, $b_{Mid} \in \mathbb{R}$ is an offset parameter, "$\cdot$" represents the vector dot product operation, $o_{Mid} \in \mathbb{R}^{n}$ is the context-disjunction confidence score vector corresponding to the intermediate word sequence $Mid$, $w$ is an intermediate word, $P(w \mid C_{Mid}, \theta)$ represents the predicted probability that the intermediate word $w$ is the context disjunctor, $\arg\max_{w \in Mid}$ returns the intermediate word for which $P(w \mid C_{Mid}, \theta)$ is maximal, $w^{*}$ is the computed context disjunctor, $\theta$ is the set of all learnable parameters, and $\exp(\cdot)$ represents the exponential function with base e;
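The scoring in step S1.4 can be sketched in plain Python as a dot product per intermediate word, a softmax over the resulting scores, and an argmax. The hidden states, the parameter vector, and the bias below are made-up numbers rather than trained values, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def identify_disjunctor(mid_tokens, hidden_states, weight, bias):
    """Score each intermediate word and return the most probable one."""
    scores = [sum(h_k * w_k for h_k, w_k in zip(h, weight)) + bias
              for h in hidden_states]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return mid_tokens[best], probs


# Toy inputs: four intermediate words with 2-dimensional hidden states.
mid_tokens = ["is", "high", "but", "the"]
hidden_states = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8], [0.0, 0.1]]
word, probs = identify_disjunctor(mid_tokens, hidden_states, [1.0, 1.0], 0.0)
```

Because the softmax runs only over the intermediate words, exactly one word between the two targets is selected for every sentence.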
S1.5 Using the context disjunctor $w_{sp}$ as the separator, form two mask matrices consisting of 1s and 0s, and separate the sentence semantic representation $C_{Sen}$ into the left clause semantic representation $C_{left}$ and the right clause semantic representation $C_{right}$. The calculation process is as follows:
$$C_{left} = C_{Sen} \otimes mask_L, \qquad C_{right} = C_{Sen} \otimes mask_r$$
where $mask_L$ is the mask matrix for separating the left clause semantics and $mask_r$ is the mask matrix for separating the right clause semantics; each column vector of a mask matrix is either the all-1 vector or the all-0 vector, determined by comparing the position of the corresponding word with the position of $w_{sp}$; $token_i \in Sen$ is the i-th word in the sentence Sen; the position function returns the position number of a specified word in the sentence Sen; the i-th column vector of $mask_L$ is indexed by an integer $i \in [1, m]$ and the j-th column vector of $mask_r$ by an integer $j \in [1, m]$; and $\otimes$ represents element-by-element multiplication;
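The masking in step S1.5 can be illustrated with the sketch below. It stores the sentence representation as a list of per-token rows and assumes that tokens strictly before the context disjunctor belong to the left clause, tokens strictly after it belong to the right clause, and the disjunctor itself is zeroed in both; that boundary convention, the function name, and the toy numbers are assumptions, not details confirmed by the text.

```python
def split_by_disjunctor(c_sen, sp_pos):
    """Mask a row-major sentence representation into left and right clauses.

    c_sen  : list of m rows, one hidden-state vector per token
    sp_pos : index of the context disjunctor within the sentence
    """
    m, d = len(c_sen), len(c_sen[0])
    # One all-1 or all-0 vector per token position (assumed boundary rule).
    mask_l = [[1.0 if i < sp_pos else 0.0] * d for i in range(m)]
    mask_r = [[1.0 if i > sp_pos else 0.0] * d for i in range(m)]
    # Element-by-element multiplication keeps the sentence length unchanged.
    c_left = [[x * k for x, k in zip(row, mrow)]
              for row, mrow in zip(c_sen, mask_l)]
    c_right = [[x * k for x, k in zip(row, mrow)]
               for row, mrow in zip(c_sen, mask_r)]
    return c_left, c_right


# Toy representation: 4 tokens, 2 hidden units, disjunctor at index 2.
c_sen = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
c_left, c_right = split_by_disjunctor(c_sen, sp_pos=2)
```

Masking rather than slicing keeps both clause representations the same shape as the full sentence, which lets the later layers process them in fixed-size batches.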
S1.6 Perform a multi-head self-attention encoding process on the left clause semantic representation $C_{left}$ and the right clause semantic representation $C_{right}$ respectively to obtain the left clause semantic code $C'_{left}$ and the right clause semantic code $C'_{right}$. The calculation process is as follows:
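To make the multi-head self-attention step concrete, the sketch below implements a simplified variant in plain Python: the hidden dimension is split into equal chunks, each chunk runs scaled dot-product self-attention with queries, keys, and values all equal to the input, and the head outputs are concatenated. The learned projection matrices of a full multi-head attention layer are deliberately omitted, so this is an illustration of the mechanism under that assumption, not the layer used by the model.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x):
    """Scaled dot-product self-attention with Q = K = V = x (rows are tokens)."""
    d = len(x[0])
    output = []
    for query in x:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
                  for key in x]
        weights = softmax(scores)
        output.append([sum(w * value[j] for w, value in zip(weights, x))
                       for j in range(d)])
    return output


def multi_head_self_attention(x, num_heads):
    """Split the hidden dimension into heads, attend per head, concatenate."""
    d = len(x[0])
    if d % num_heads:
        raise ValueError("hidden size must be divisible by num_heads")
    size = d // num_heads
    heads = [attention_head([row[h * size:(h + 1) * size] for row in x])
             for h in range(num_heads)]
    return [[v for head in heads for v in head[i]] for i in range(len(x))]


# Toy clause representation: 2 tokens, 4 hidden units, 2 heads.
x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
encoded = multi_head_self_attention(x, num_heads=2)
```

Each output row is a weighted mixture of all input rows, so every token's code reflects the rest of its clause.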
S1.7 Perform an average pooling operation on the left clause semantic code $C'_{left}$ and the right clause semantic code $C'_{right}$ to obtain the left clause emotion vector $Z_L$ and the right clause emotion vector $Z_r$. The calculation process is as follows:
$$Z_L = \mathrm{avePooling}(C'_{left}), \qquad Z_r = \mathrm{avePooling}(C'_{right})$$
where $\mathrm{avePooling}(C)$ represents a pooling operation that averages the parameter $C$ by column;
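Column-wise average pooling reduces a clause code with one row per token to a single vector with one value per hidden unit. A minimal sketch, with a hypothetical function name and toy numbers:

```python
def average_pool(matrix):
    """Average each column of a row-major matrix (rows are tokens)."""
    rows = len(matrix)
    return [sum(column) / rows for column in zip(*matrix)]


# Toy clause code: 2 tokens, 2 hidden units.
z = average_pool([[1.0, 2.0], [3.0, 4.0]])

# Rows that were zeroed by a mask still count in the denominator here.
z_masked = average_pool([[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0]])
```

Under this reading of "averaging by column", the pooled vector of a masked clause is scaled down by the number of masked-out positions; whether the original model compensates for that is not stated.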
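S1.8 Apply a linear transformation followed by softmax to the left clause emotion vector $Z_L$ and the right clause emotion vector $Z_r$ to calculate the emotion polarity probabilities and obtain the final emotion polarities. The calculation process is as follows:
$$o_L = M Z_L + b, \qquad o_r = M Z_r + b$$
$$P(y \mid Z_L, \theta) = \frac{\exp(o_L[y])}{\sum_{y' \in Y} \exp(o_L[y'])}, \qquad P(y \mid Z_r, \theta) = \frac{\exp(o_r[y])}{\sum_{y' \in Y} \exp(o_r[y'])}$$
$$y_L = \arg\max_{y \in Y} P(y \mid Z_L, \theta), \qquad y_r = \arg\max_{y \in Y} P(y \mid Z_r, \theta)$$
where $M \in \mathbb{R}^{d_k \times d_b}$ is the representation matrix of the emotion polarities, $b \in \mathbb{R}^{d_k}$ is an offset vector, $d_k$ is the number of emotion polarity classes, $Y$ is the set of emotion polarity classes, $y$ is an emotion polarity, $o_L, o_r \in \mathbb{R}^{d_k}$ are the emotion polarity confidence score vectors corresponding to $Z_L$ and $Z_r$ respectively, $P(y \mid Z_L, \theta)$ and $P(y \mid Z_r, \theta)$ represent the predicted probabilities of $Z_L$ and $Z_r$ for the emotion polarity $y$, $y_L$ and $y_r$ are the finally assessed left emotion polarity and right emotion polarity respectively, $\arg\max_{y \in Y}$ returns the emotion polarity for which $P(y \mid Z_L, \theta)$ or $P(y \mid Z_r, \theta)$ is maximal, $\theta$ is the set of all learnable parameters, and $\exp(\cdot)$ represents the exponential function with base e.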
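The polarity classification of step S1.8 applies the same shared matrix and offset to both clause emotion vectors, followed by softmax and argmax. In the sketch below the label set, the matrix, the offset, and the clause vectors are all illustrative assumptions rather than trained parameters.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_polarity(z, matrix, bias, labels):
    """Linear transformation plus softmax over the emotion polarity classes."""
    scores = [sum(m * x for m, x in zip(row, z)) + b
              for row, b in zip(matrix, bias)]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


# Assumed label set and toy parameters (one matrix row per class).
LABELS = ["positive", "negative", "neutral"]
M = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
B = [0.0, 0.0, 0.0]

y_left, p_left = classify_polarity([1.0, -1.0], M, B, LABELS)
y_right, p_right = classify_polarity([-1.0, 1.0], M, B, LABELS)
```

Sharing one classifier across both clauses means the left and right predictions differ only through the clause emotion vectors they receive.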
3. The dual-target entity emotion analysis method for multitask learning according to claim 1, characterized in that:
in step S1, the joint training method for jointly training a neural network model capable of automatic sentence context disjunctor recognition and automatic dual-target entity emotion polarity classification comprises:
(1) using the cross-entropy loss error to calculate the loss function for context disjunctor recognition and the loss functions for dual-target entity emotion analysis respectively, the calculation process being as follows:
where $\Omega$ is the set of training sentences of the dual-target entity emotion analysis task, $|\Omega|$ represents the size of the set $\Omega$, $w_{sp}^{(i)}$ is the word label of the context disjunctor of the i-th training sentence in $\Omega$, $C_{Mid}^{(i)}$ is the intermediate semantic representation of the i-th training sentence in $\Omega$, $y_L^{(i)}$ and $y_r^{(i)}$ are respectively the left emotion polarity label and the right emotion polarity label of the i-th training sentence in $\Omega$, $Z_L^{(i)}$ and $Z_r^{(i)}$ are respectively the left clause emotion vector and the right clause emotion vector of the i-th training sentence in $\Omega$, $\Psi_{Mid}(\theta)$ is the loss function used in context disjunctor recognition training, $\Psi_L(\theta)$ is the loss function used in left target entity emotion analysis training, and $\Psi_r(\theta)$ is the loss function used in right target entity emotion analysis training;
(2) calculating the joint loss function for jointly training sentence context disjunctor recognition and dual-target entity emotion polarity classification using equation (27) below:
where $\alpha_1$ and $\alpha_2$ are two weight parameters;
(3) the joint training objective is to minimize the joint loss error calculated by equation (27).
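A hedged sketch of the joint training objective follows. Each task loss is taken as the mean negative log-probability that the model assigns to the gold label, and the three losses are combined as alpha1 times the disjunctor loss plus alpha2 times the sum of the two polarity losses. Both the averaging over the training set and this particular way of applying the two weights are assumptions, since the exact form of equation (27) is not recoverable from the text; the function names and numbers are hypothetical.

```python
import math


def cross_entropy(gold_probs):
    """Mean negative log-probability assigned to the gold labels.

    gold_probs holds, for each training sentence, the predicted probability
    of its gold label; every value must be greater than zero.
    """
    return -sum(math.log(p) for p in gold_probs) / len(gold_probs)


def joint_loss(p_disjunctor, p_left, p_right, alpha1=1.0, alpha2=1.0):
    """Assumed combination of the three task losses (see the note above)."""
    loss_mid = cross_entropy(p_disjunctor)
    loss_left = cross_entropy(p_left)
    loss_right = cross_entropy(p_right)
    return alpha1 * loss_mid + alpha2 * (loss_left + loss_right)


# Toy probabilities of the gold labels for two training sentences.
loss = joint_loss([0.9, 0.8], [0.7, 0.6], [0.95, 0.5], alpha1=1.0, alpha2=0.5)
```

Minimizing a single weighted sum lets gradient updates from the disjunctor task and the two polarity tasks flow into the same shared BERT parameters.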
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210054948.XA CN115017912A (en) | 2022-01-18 | 2022-01-18 | Double-target entity emotion analysis method for multi-task learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115017912A (en) | 2022-09-06 |
Family
ID=83066454
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210054948.XA Pending CN115017912A (en) | 2022-01-18 | 2022-01-18 | Double-target entity emotion analysis method for multi-task learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115017912A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115618884A (en) * | 2022-11-16 | 2023-01-17 | 华南师范大学 | Language analysis method, device and equipment based on multi-task learning |
CN117633239A (en) * | 2024-01-23 | 2024-03-01 | 中国科学技术大学 | End-to-end face emotion recognition method combining combined category grammar |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110765769A (en) * | 2019-08-27 | 2020-02-07 | 电子科技大学 | Entity attribute dependency emotion analysis method based on clause characteristics |
CN111144130A (en) * | 2019-12-26 | 2020-05-12 | 辽宁工程技术大学 | Context-aware-based fine-grained emotion classification method for hybrid neural network |
WO2021073116A1 (en) * | 2019-10-18 | 2021-04-22 | 平安科技(深圳)有限公司 | Method and apparatus for generating legal document, device and storage medium |
Non-Patent Citations (2)
Title |
---|
DUYU TANG ET AL.: "Aspect Level Sentiment Classification with Deep Memory Network", PROCEEDINGS OF THE 2016 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, 1 January 2016 (2016-01-01), pages 214 - 224 * |
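Yang Peng; Yang Qing; Li Zhibin; Wang Yang: "Application of an attention-based interactive neural network model in fine-grained sentiment classification", Computer Applications and Software, no. 07, 12 July 2020 (2020-07-12), pages 130 - 135 * |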
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115618884A (en) * | 2022-11-16 | 2023-01-17 | 华南师范大学 | Language analysis method, device and equipment based on multi-task learning |
CN115618884B (en) * | 2022-11-16 | 2023-03-10 | 华南师范大学 | Language analysis method, device and equipment based on multi-task learning |
CN117633239A (en) * | 2024-01-23 | 2024-03-01 | 中国科学技术大学 | End-to-end face emotion recognition method combining combined category grammar |
CN117633239B (en) * | 2024-01-23 | 2024-05-17 | 中国科学技术大学 | End-to-end face emotion recognition method combining combined category grammar |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||