CN113705238B

CN113705238B - Method and system for analyzing aspect level emotion based on BERT and aspect feature positioning model

Info

Publication number: CN113705238B
Application number: CN202110670846.6A
Authority: CN
Inventors: 庞光垚; 陆科达; 玉振明; 彭子真; 朱肖颖; 黄宏本; 莫智懿; 农健; 冀肖榆
Original assignee: Wuzhou University
Current assignee: Wuzhou University
Priority date: 2021-06-17
Filing date: 2021-06-17
Publication date: 2022-11-08
Anticipated expiration: 2041-06-17
Also published as: CN113705238A

Abstract

The invention relates to an aspect level emotion analysis method and model based on BERT and an aspect feature positioning model. The method comprises the following steps: first, a BERT model is used to obtain high-quality context information representations and aspect information representations, preserving the integrity of the text information; next, an attention encoder based on a multi-head attention mechanism is constructed to learn the interaction between the aspect representation and the context representation, integrating the relationship between the aspect words and the context and further distinguishing the contributions of different sentences and aspect words to the classification result; then, an aspect feature positioning model is constructed to capture aspect information during sentence modeling and to integrate the complete information of the aspect into the interactive semantics, which reduces the influence of interfering words unrelated to the aspect words and improves the integrity of the aspect word information; finally, the target-related context and the important target information are fused, and an emotion predictor predicts the probabilities of the different emotion polarities from the fused information. The method models the implicit relationships within the context better, makes better use of aspect word information, and reduces interference from information unrelated to the aspect words, thereby achieving higher accuracy and macro F1.

Description

Method and system for analyzing aspect level emotion based on BERT and aspect feature positioning model

Technical Field

The invention belongs to the technical field of aspect level emotion analysis, and particularly relates to an aspect level emotion analysis method and system (ALM-BERT) based on a BERT and an aspect feature positioning model.

Background

Electronic commerce is a rapidly developing industry, and the importance of electronic commerce to global economy is increasing day by day. In particular, with the rapid development of social media and the continuous popularization of social networking platforms, more and more users begin to express comments with emotion on various networking platforms. These comments reflect the mood of the user and consumer, providing the seller and government alike with a lot of valuable feedback information about the quality of the goods or services. For example: before purchasing a product, a user may browse through a large number of reviews on the product on an e-commerce platform to determine whether the product is worth purchasing. Also, governments and enterprises can collect a large amount of public comments directly from the internet, analyze the opinions and satisfaction of users, and further meet their needs. Therefore, sentiment analysis has attracted a great deal of attention from the theoretical and practical world as a fundamental and critical task of natural language processing.

However, common emotion analysis tasks (e.g., sentence-level emotion analysis) can only determine the user's emotion polarity (e.g., positive, negative, or neutral) toward a product or event from the sentence as a whole, and cannot determine the emotion polarity of a particular aspect within the sentence. In contrast, aspect level emotion analysis is a finer-grained classification task that identifies the emotion polarity of each aspect in a sentence. For example, FIG. 9 contrasts sentence-level and aspect-based sentiment analysis using a consumer review with three aspect terms. In the review text "it does not have any accompanying software installed outside the windows media, but for price i are very satisfied with its condition and overall product", the emotion polarity of the aspect term "software" is negative, that of "windows media" is neutral, and that of "price" is positive, as signaled by "very satisfied".

In prior studies, researchers have proposed various methods to accomplish aspect level emotion analysis tasks. Most of the methods are based on supervised machine learning algorithm, and certain effect is achieved. However, these statistical methods require careful design of manual features on large-scale datasets, resulting in significant labor and time costs. In view of the ability of neural network models to automatically learn low-dimensional representations of aspects and contexts from comment text without relying on artificial feature engineering, neural networks have gained increasing attention in recent years in aspect-level sentiment analysis tasks.

Unfortunately, most existing methods directly use a Recurrent Neural Network (RNN) or a Convolutional Neural Network (CNN) to model the semantic information of the aspect words and of their context independently, but ignore the fact that these networks lack sensitivity to the position of key components. In practice, researchers have demonstrated that the emotion polarity of an aspect word is highly correlated with the aspect word information and with word order information, which means that the emotion polarity of an aspect word is more strongly influenced by context words that are closer to it. In addition, it is difficult for such neural networks to capture long-term dependencies between aspect words and context, resulting in the loss of valuable information.

Disclosure of Invention

In order to solve the above problems in the prior art, the present invention provides an aspect level emotion analysis method based on BERT and an aspect feature positioning model, which makes better use of aspect word information and reduces interference from information unrelated to the aspect words, thereby obtaining higher accuracy and macro F1, as well as a system based on this method.

In order to solve the technical problems, the invention adopts the following technical scheme:

The invention provides an aspect level emotion analysis method based on BERT and an aspect feature positioning model, which comprises the following steps:

S1, obtaining high-quality context information representation and aspect information representation by using a BERT model so as to keep the integrity of text information;

S2, constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect word representation and the context representation, integrating the relationship between the aspect words and the context, and further distinguishing the contributions of different sentences and aspect words to the classification result;

S3, constructing an aspect feature positioning model to capture aspect information during sentence modeling, and integrating the complete information of the aspect into the interactive semantics so as to reduce the influence of interfering words unrelated to the aspect words and improve the integrity of the aspect word information;

S4, fusing the target-related context and the important target information, and predicting the probabilities of different emotion polarities by using the emotion predictor on the basis of the fused information.

Further, "obtaining high-quality context information representation and aspect information representation by using a BERT model" means that a pre-trained BERT model is used as the text vectorization mechanism to generate high-quality text feature vector representations. The BERT model is a pre-trained language representation model, and the text vectorization mechanism maps each word to a high-dimensional vector space. Specifically, the BERT model generates text representations using a deep bidirectional Transformer encoder; special tokens are added at the beginning and the end of the input sequence to divide the given word sequence into segments; token embeddings, segment embeddings and position embeddings are generated for the segments; and finally the comment text and the aspect words are converted separately to obtain the context information representation and the aspect information representation.

The division of the word sequence into segments, the generation of the three kinds of embedding, and the conversion of the comment text and the aspect words specifically include:

The BERT model adds the special tokens [CLS] and [SEP] at the beginning and the end of the input sequence, respectively, to divide the given word sequence into segments, and generates token embeddings, segment embeddings and position embeddings for the segments, so that the embedded representation of the input sequence contains the information of all three kinds of embedding. Finally, the comment text and the aspect word are converted into "[CLS] + comment text + [SEP]" and "[CLS] + target + [SEP]", respectively, and passed through the BERT model to obtain the context representation $E_c$ and the aspect representation $E_a$:

$$E_c = \{we_{\mathrm{[CLS]}}, we_1, we_2, \ldots, we_{\mathrm{[SEP]}}\}$$

$$E_a = \{ae_{\mathrm{[CLS]}}, ae_1, ae_2, \ldots, ae_{\mathrm{[SEP]}}\}$$

where $we_{\mathrm{[CLS]}}$ and $ae_{\mathrm{[CLS]}}$ denote the vectors of the classification token [CLS], and $we_{\mathrm{[SEP]}}$ and $ae_{\mathrm{[SEP]}}$ denote the vectors of the separator token [SEP].
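The two input formats described above can be illustrated with a minimal sketch. It only builds the token sequences that would be fed to BERT; it does not run BERT itself. The whitespace split is an assumption standing in for BERT's own subword tokenizer, and the helper names `build_bert_input` and `embedding_inputs` as well as the example review are hypothetical.

```python
# Assumption: whitespace splitting stands in for BERT's subword tokenizer.
CLS, SEP = "[CLS]", "[SEP]"

def build_bert_input(text):
    """Wrap a token sequence as [CLS] + tokens + [SEP]."""
    return [CLS] + text.split() + [SEP]

def embedding_inputs(tokens):
    """Return (token, segment id, position id) triples for a single-segment input.

    The three fields correspond to the token, segment and position embeddings
    that are combined into the input representation.
    """
    return [(tok, 0, pos) for pos, tok in enumerate(tokens)]

comment = "the battery life is great"   # hypothetical review text
aspect = "battery life"                 # hypothetical aspect word (target)

context_tokens = build_bert_input(comment)   # input whose encoding gives E_c
aspect_tokens = build_bert_input(aspect)     # input whose encoding gives E_a
```

Encoding the two sequences separately is what lets the later steps treat the context representation and the aspect representation as distinct inputs.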

Further, "constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect word representation and the context representation and integrate the relationship between the aspect words and the context" means that the extraction of important features for aspect level emotion analysis is implemented on the basis of the multi-head attention mechanism, extracting the important information of the context and of the target. Specifically: first, a Transformer encoder is introduced, which is a feature extractor based on a multi-head attention mechanism and a position-wise feed-forward network; it can learn different important information in different feature representation subspaces and directly capture long-term correlations in a sequence. Then, the Transformer encoder extracts interactive semantics from the aspect information representation and the context information representation generated by the BERT model and determines which context is most important for determining the emotion polarity of the aspect words; meanwhile, the long-term dependency information of the context and the context-aware information are used as input to the position-wise feed-forward network to generate the respective hidden states, and after a mean pooling operation the final interactive hidden state of the context and the final interactive hidden state of the context and the aspect words are obtained.

The extraction of interactive semantics, the generation of the hidden states and the mean pooling operation described above specifically include:

S201, in the Transformer encoder, the several self-attention mechanisms that make up the multi-head attention mechanism map the aspect information representation and the context information representation generated by the BERT model to a query sequence (Q) and a set of key (K)-value (V) pairs, so as to capture different important information in parallel subspaces;

S202, calculating an attention score for each piece of captured important information through the attention scoring function

$$f_s(Q, K, V) = \sigma(f_e(Q, K))\,V$$

where $\sigma(\cdot)$ denotes the normalized exponential function (softmax) and $f_e(Q, K)$ is an energy function for learning the correlation features between K and Q, calculated by the following formula:

$$f_e(Q, K) = \frac{QK^{T}}{\sqrt{d_k}}$$

where $\frac{1}{\sqrt{d_k}}$ is the scale factor and $d_k$ is the dimensionality of the query Q and key K vectors;

S203, inputting the context representation and the aspect representation into the multi-head attention function

$$f_{mh}(Q, K, V) = [a^1; a^2; \ldots; a^i; \ldots; a^{n_{head}}]\,W_d$$

to obtain the long-term dependency information $c_{cc}$ of the context and the context-aware information $t_{ca}$, respectively, so as to capture the long-term dependencies of the context and to determine which context is most important for determining the emotion polarity of the aspect words; here $a^i$ denotes the attention score of the i-th piece of captured important information, $[a^1; a^2; \ldots; a^i; \ldots; a^{n_{head}}]$ denotes the concatenated vector, $W_d$ is an attention weight matrix, and

$$c_{cc} = f_{mh}(E_c, E_c), \qquad t_{ca} = f_{mh}(E_c, E_a);$$

S204, the Transformer encoder uses $c_{cc}$ and $t_{ca}$ as the input data of the position-wise feed-forward network to generate the hidden states $h_c$ and $h_a$. The position-wise feed-forward network PFN(h) is a variant of the multi-layer perceptron, and $h_c$ and $h_a$ are defined as follows:

$$h_c = \mathrm{PFN}(c_{cc})$$

$$h_a = \mathrm{PFN}(t_{ca})$$

$$\mathrm{PFN}(h) = \zeta(hW_1 + b_1)W_2 + b_2$$

where $\zeta(\cdot)$ is the rectified linear unit (ReLU), $b_1$ and $b_2$ are bias values, and $W_1$ and $W_2$ are learnable weight parameters;

S205, after a mean pooling operation is performed on the hidden states $h_c$ and $h_a$, the final interactive hidden state $h_{cm}$ of the context and the final interactive hidden state $h_{am}$ of the context and the aspect words are obtained.
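Steps S201 to S205 can be illustrated with a small, dependency-free sketch. Several points are assumptions rather than statements from the text: the energy function is taken to be the scaled dot product Q K^T / sqrt(d_k), which is one reading consistent with the scale-factor description; in the two-argument form f_mh(E_c, E_a), the first argument is taken to supply keys and values and the second to supply queries; the two branches share weights purely to keep the example short; and all weights are random toy values rather than trained parameters.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Row-wise normalized exponential function (sigma in the text)."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention(q, k, v):
    """f_s(Q, K, V) = softmax(f_e(Q, K)) V, assuming f_e = Q K^T / sqrt(d_k)."""
    d_k = len(k[0])
    energy = [[e / math.sqrt(d_k) for e in row] for row in matmul(q, transpose(k))]
    return matmul(softmax_rows(energy), v)

def multi_head(key_src, query_src, heads, w_d):
    """f_mh: run every head, concatenate the head outputs, project with W_d."""
    outs = []
    for w_q, w_k, w_v in heads:
        q = matmul(query_src, w_q)
        k = matmul(key_src, w_k)
        v = matmul(key_src, w_v)
        outs.append(attention(q, k, v))
    concat = [sum((o[i] for o in outs), []) for i in range(len(query_src))]
    return matmul(concat, w_d)

def pfn(h, w1, b1, w2, b2):
    """PFN(h) = ReLU(h W1 + b1) W2 + b2, applied to every position."""
    hidden = [[max(0.0, x + b) for x, b in zip(row, b1)] for row in matmul(h, w1)]
    return [[x + b for x, b in zip(row, b2)] for row in matmul(hidden, w2)]

def mean_pool(h):
    """Average the hidden states over the sequence dimension."""
    return [sum(col) / len(h) for col in zip(*h)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, d_head, n_head = 4, 2, 2
e_c = rand_matrix(5, d, rng)   # toy context representation: 5 tokens
e_a = rand_matrix(2, d, rng)   # toy aspect representation: 2 tokens
heads = [tuple(rand_matrix(d, d_head, rng) for _ in range(3)) for _ in range(n_head)]
w_d = rand_matrix(n_head * d_head, d, rng)
w1, b1 = rand_matrix(d, 8, rng), [0.0] * 8
w2, b2 = rand_matrix(8, d, rng), [0.0] * d

c_cc = multi_head(e_c, e_c, heads, w_d)       # context attending to itself
t_ca = multi_head(e_c, e_a, heads, w_d)       # aspect queries over the context
h_cm = mean_pool(pfn(c_cc, w1, b1, w2, b2))   # pooled context state
h_am = mean_pool(pfn(t_ca, w1, b1, w2, b2))   # pooled context-aspect state
```

Under these assumptions the context branch keeps one row per context token and the context-aspect branch keeps one row per aspect token; mean pooling reduces both to fixed-size vectors regardless of sentence length.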

Further, the working process of the aspect feature positioning model is as follows (Algorithm 1):

Specifically, according to the position and length of the aspect word, the feature positioning algorithm extracts the information af most relevant to the aspect word from the context representation $E_c$; max pooling is then applied to af to obtain the most important feature AF, a dropout operation is performed on AF, and the important feature $h_{af}$ of the aspect word within the context representation $E_c$ is obtained.
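The feature positioning steps described above (slice by position and length, max pooling, dropout) can be sketched as follows. This is an illustrative reading, not the text of Algorithm 1: the function name `locate_aspect_feature`, the element-wise max over the aspect span, the use of inverted dropout with rescaling, and the toy values are all assumptions.

```python
import random

def locate_aspect_feature(e_c, start, length, drop_rate=0.0, rng=None):
    """Slice the aspect span from E_c, max-pool over its tokens, apply dropout."""
    af = e_c[start:start + length]            # rows of E_c covering the aspect word
    pooled = [max(col) for col in zip(*af)]   # AF: element-wise max over the span
    if drop_rate <= 0.0:
        return pooled
    rng = rng or random.Random()
    keep = 1.0 - drop_rate
    # Assumed inverted dropout: zero a feature with probability drop_rate,
    # rescale the surviving features by 1 / keep.
    return [x / keep if rng.random() < keep else 0.0 for x in pooled]

# Toy context representation with three features per token.
e_c = [
    [0.1, 0.9, 0.0],   # [CLS]
    [0.4, 0.2, 0.7],   # "battery"
    [0.6, 0.1, 0.3],   # "life"
    [0.0, 0.5, 0.2],   # [SEP]
]
h_af = locate_aspect_feature(e_c, start=1, length=2)   # -> [0.6, 0.2, 0.7]
```

Because only the rows inside the aspect span are pooled, tokens outside the span cannot contribute to the resulting feature, which matches the stated aim of reducing interference from unrelated words.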

Further, "fusing the target-related context and the important target information, and predicting the probabilities of different emotion polarities by using the emotion predictor on the basis of the fused information" specifically includes:

S401, concatenating $h_{cm}$, $h_{am}$ and $h_{af}$ by vector concatenation to obtain the overall feature r:

$$r = [h_{cm}; h_{am}; h_{af}]$$

S402, preprocessing r with a linear function, namely:

$$x = W_u r + b_u$$

where $W_u$ is a weight matrix and $b_u$ is a bias value;

S403, calculating the probability Pr(a = p) that the emotion polarity of the aspect word a in the sentence is p by using the softmax function:

$$\Pr(a = p) = \frac{\exp(x_p)}{\sum_{i=1}^{C} \exp(x_i)}$$

where p represents a candidate emotion polarity and C is the number of categories of emotion polarities.
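Steps S401 to S403 amount to a concatenation, a linear layer and a softmax. The sketch below shows that computation on toy numbers; the function name `predict_polarity`, the weight values, the two-dimensional features and the label order are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def predict_polarity(h_cm, h_am, h_af, w_u, b_u):
    """Concatenate the three features, apply x = W_u r + b_u, then softmax."""
    r = h_cm + h_am + h_af                       # r = [h_cm; h_am; h_af]
    x = [sum(w * v for w, v in zip(row, r)) + b for row, b in zip(w_u, b_u)]
    mx = max(x)                                  # subtract the max for stability
    exps = [math.exp(v - mx) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

# Toy values: two-dimensional features and C = 3 candidate polarities.
h_cm, h_am, h_af = [0.2, -0.1], [0.5, 0.3], [0.0, 0.4]
w_u = [[0.1] * 6, [0.0] * 6, [-0.1] * 6]         # hypothetical C x 6 weight matrix
b_u = [0.0, 0.0, 0.0]

probs = predict_polarity(h_cm, h_am, h_af, w_u, b_u)
labels = ["positive", "neutral", "negative"]     # assumed label order
best = labels[probs.index(max(probs))]
```

The softmax output sums to one, so each entry can be read directly as the probability of the corresponding candidate polarity.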

Further, the aspect level emotion analysis method based on BERT and the aspect feature positioning model further comprises training with cross entropy plus L2 regularization as the loss function, defined as follows:

$$\mathcal{L}(\theta) = -\sum_{j \in D} \sum_{i=1}^{C} \hat{y}_i^{\,j} \log y_i^{\,j} + \lambda \lVert \theta \rVert^2$$

where D represents all training data, j and i are the indices of the training data samples and emotion classes, respectively, λ represents the factor for L2 regularization, θ represents the set of parameters of the model, y represents the predicted emotion polarity, and $\hat{y}$ represents the correct emotion polarity.
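A cross-entropy loss with L2 regularization of the kind described above can be sketched as follows. The exact form of the regularization term (here λ times the sum of squared parameters) and the one-hot encoding of the correct polarity are assumptions; the function name `training_loss` is hypothetical.

```python
import math

def training_loss(pred_probs, true_onehot, params, lam):
    """Cross entropy summed over samples and classes, plus lam * ||theta||^2."""
    ce = 0.0
    for y, y_hat in zip(pred_probs, true_onehot):   # j: training samples
        for y_i, y_hat_i in zip(y, y_hat):          # i: emotion classes
            if y_hat_i:                             # only the correct class contributes
                ce -= y_hat_i * math.log(y_i)
    l2 = sum(p * p for p in params)                 # squared L2 norm of theta
    return ce + lam * l2

# One sample, two classes, correct class predicted with probability 0.5.
example = training_loss([[0.5, 0.5]], [[1, 0]], params=[1.0, 2.0], lam=0.1)
```

In this toy call the cross-entropy part equals log 2 and the regularization part equals 0.1 × (1 + 4) = 0.5, which shows how λ trades data fit against parameter magnitude.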

The invention also provides an aspect level emotion analysis system, which comprises:

a text vectorization mechanism, which obtains high-quality context information representation and aspect information representation by using a BERT model so as to keep the integrity of text information;

a feature extraction model for aspect level emotion analysis, which is used for learning the interaction between the aspect word representation and the context representation, integrating the relationship between the aspect words and the context to distinguish the contributions of different sentences and aspect words to the classification result, capturing aspect information during sentence modeling, and integrating the complete information of the aspect into the interactive semantics to reduce the influence of interfering words unrelated to the aspect words and improve the integrity of the aspect word information;

and an emotion predictor, which is used for fusing the target-related context and the important target information and predicting the probabilities of different emotion polarities on the basis of the fused information.

Further, the BERT model is a pre-trained language representation model that generates text representations using a deep multi-layer bidirectional Transformer encoder; special tokens are added at the beginning and the end of the input sequence to divide the given word sequence into segments; token embeddings, segment embeddings and position embeddings are generated for the segments; and finally the comment text and the aspect words are converted separately to obtain the context information representation and the aspect information representation;

the feature extraction model for aspect level emotion analysis comprises an important feature extraction model and an aspect feature positioning model; the important feature extraction model is an attention encoder based on a multi-head attention mechanism and is used for learning the interaction between the aspect word representation and the context representation and integrating the relationship between the aspect words and the context to distinguish the contributions of different sentences and aspect words to the classification result; the aspect feature positioning model is used for capturing aspect information during sentence modeling and integrating the complete information of the aspect into the interactive semantics so as to reduce the influence of interfering words unrelated to the aspect words and improve the integrity of the aspect word information;

the emotion predictor concatenates the final interactive hidden state of the context, the final interactive hidden state of the context and the aspect words, and the important features of the aspect words by vector concatenation to obtain the overall feature, then preprocesses the overall feature with a linear function, and finally uses a softmax function to calculate the probability that the emotion polarity of the aspect word in the sentence is each candidate emotion polarity.

The invention has the beneficial effects that:

According to the technical scheme, the Transformer encoder models the implicit relationships within the context better, and the aspect feature positioning model makes better use of the aspect word information and reduces interference from information unrelated to the aspect words, so that higher accuracy and macro F1 are obtained (on sentences of different lengths, the average accuracy and the average macro F1 are 3.1% and 6.56% higher, respectively); at the same time, the feasibility and effectiveness of the BERT model and of aspect information in aspect level emotion analysis tasks are verified.

Drawings

FIG. 1 is a flow diagram of an embodiment of the aspect level emotion analysis method based on BERT and the aspect feature positioning model according to the present invention;

FIG. 2 is a schematic structural diagram of an embodiment of an aspect level sentiment analysis system based on BERT and aspect feature localization models according to the present invention;

FIG. 3 is a graph of experimental results of dropout rate parameter optimization in an evaluation experiment of the aspect level emotion analysis method based on BERT and the aspect feature positioning model according to the present invention;

FIG. 4 is a graph of experimental results of learning rate parameter optimization in an evaluation experiment of the aspect level emotion analysis method based on BERT and the aspect feature localization model according to the present invention;

FIG. 5 is a graph of the experimental results of L2 regularization parameter optimization in an evaluation experiment of the aspect level emotion analysis method based on BERT and the aspect feature localization model according to the present invention;

FIG. 6 is a graph of ROUGE scores (ROUGE-1) of different lengths of source text for a BERT and aspect feature localization model-based aspect level emotion analysis method and TD-LSTM validation experiment according to the present invention;

FIG. 7 is a graph of the ROUGE score (ROUGE-2) of different length source texts of a TD-LSTM verification experiment and an aspect level emotion analysis method based on BERT and an aspect feature localization model according to the present invention;

FIG. 8 is a graph of ROUGE scores (ROUGE-L) of different lengths of source text for a BERT and aspect feature localization model-based aspect level emotion analysis method and TD-LSTM validation experiment in accordance with the present invention;

FIG. 9 is an example of prior art aspect level sentiment analysis.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

As shown in fig. 1, an aspect-level sentiment analysis method based on BERT and aspect feature localization models according to an embodiment of the present invention includes the following steps:

S1, obtaining high-quality context information representation and aspect information representation by using a BERT model so as to keep the integrity of text information. Specifically, a pre-trained BERT model is used as the text vectorization mechanism to generate high-quality text feature vector representations; the BERT model is a pre-trained language representation model, and the text vectorization mechanism maps each word to a high-dimensional vector space. The BERT model generates text representations using a deep bidirectional Transformer encoder; special tokens are added at the beginning and the end of the input sequence to divide the given word sequence into segments; token embeddings, segment embeddings and position embeddings are generated for the segments; and finally the comment text and the aspect words are converted separately to obtain the context information representation and the aspect information representation.

S2, constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect word representation and the context representation, integrating the relationship between the aspect words and the context, and further distinguishing the contributions of different sentences and aspect words to the classification result. Specifically, the extraction of important features for aspect level emotion analysis is implemented on the basis of the multi-head attention mechanism, extracting the important information of the context and of the target: first, a Transformer encoder is introduced, which is a feature extractor based on a multi-head attention mechanism and a position-wise feed-forward network; it can learn different important information in different feature representation subspaces and directly capture long-term correlations in a sequence. Then, the Transformer encoder extracts interactive semantics from the aspect information representation and the context information representation generated by the BERT model and determines which context is most important for determining the emotion polarity of the aspect words; meanwhile, the long-term dependency information of the context and the context-aware information are used as input to the position-wise feed-forward network to generate the respective hidden states, and after a mean pooling operation the final interactive hidden state of the context and the final interactive hidden state of the context and the aspect words are obtained.

S3, constructing an aspect feature positioning model to capture aspect information during sentence modeling, and integrating the complete information of the aspect into the interactive semantics to reduce the influence of interfering words unrelated to the aspect words and improve the integrity of the aspect word information. The aspect feature positioning model is constructed on the basis of a max pooling function: the extracted aspect words and their context hidden features are divided into several regions, and the maximum value in each region is selected to represent that region, so that the core features are located. In its working process, the feature positioning algorithm extracts from the context representation $E_c$, according to the position and length of the aspect word, the information af most relevant to the aspect word; max pooling is then applied to af to obtain the most important feature AF, a dropout operation is performed on AF, and the important feature $h_{af}$ of the aspect word in the context representation $E_c$ is obtained.

S4, fusing the target-related context and the important target information, and predicting the probabilities of different emotion polarities by using the emotion predictor on the basis of the fused information. Specifically, the final interactive hidden state of the context, the final interactive hidden state of the context and the aspect words, and the important features of the aspect words are concatenated by vector concatenation to obtain the overall feature; the overall feature is then preprocessed with a linear function; and finally a softmax function is used to calculate the probability that the emotion polarity of the aspect word in the sentence is each candidate emotion polarity.

As shown in FIG. 2, the present invention further provides an aspect level emotion analysis model, which includes a text vectorization mechanism 100, a feature extraction model 200 for aspect level emotion analysis, and an emotion predictor 300.

The text vectorization mechanism 100 is a multi-angle text vectorization mechanism that obtains high-quality context information representation and aspect information representation by using a BERT model to maintain the integrity of text information. The BERT model is a pre-trained language representation model that generates text representations using a deep bidirectional Transformer encoder; special tokens are added at the beginning and the end of the input sequence to divide the given word sequence into segments; token embeddings, segment embeddings and position embeddings are generated for the segments; and finally the comment text and the aspect words are converted separately to obtain the context information representation and the aspect information representation.

The feature extraction model 200 for aspect level emotion analysis is used for learning the interaction between the aspect word representation and the context representation, integrating the relationship between the aspect words and the context to distinguish the contributions of different sentences and aspect words to the classification result, capturing aspect information during sentence modeling, and integrating the complete information of the aspect into the interactive semantics. Specifically, the feature extraction model 200 comprises an important feature extraction model and an aspect feature positioning model. The important feature extraction model is an attention encoder based on a multi-head attention mechanism and is used for learning the interaction between the aspect word representation and the context representation and integrating the relationship between the aspect words and the context so as to distinguish the contributions of different sentences and aspect words to the classification result. The aspect feature positioning model is used for capturing aspect information during sentence modeling and integrating the complete information of the aspect into the interactive semantics, which reduces the influence of interfering words unrelated to the aspect words and improves the integrity of the aspect word information.

The emotion predictor 300 is used for fusing the target-related context and the important target information and predicting the probabilities of different emotion polarities on the basis of the fused information. Specifically, the final interactive hidden state of the context, the final interactive hidden state of the context and the aspect words, and the important features of the aspect words are concatenated by vector concatenation to obtain the overall feature; the overall feature is then preprocessed with a linear function; and finally a softmax function is used to calculate the probability that the emotion polarity of the aspect word in the sentence is each candidate emotion polarity.
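The data flow between components 100, 200 and 300 can be sketched as a simple composition. The class name `AspectSentimentSystem` and its field names are hypothetical, and the three lambdas are placeholder stubs that only demonstrate the wiring; they are not BERT, the attention encoder or the trained predictor.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[Vector]

@dataclass
class AspectSentimentSystem:
    """Wires components 100, 200 and 300 together as pluggable callables."""
    vectorize: Callable[[str, str], Tuple[Matrix, Matrix]]               # 100 -> (E_c, E_a)
    extract: Callable[[Matrix, Matrix], Tuple[Vector, Vector, Vector]]   # 200 -> (h_cm, h_am, h_af)
    predict: Callable[[Vector, Vector, Vector], Vector]                  # 300 -> probabilities

    def __call__(self, comment: str, aspect: str) -> Vector:
        e_c, e_a = self.vectorize(comment, aspect)
        h_cm, h_am, h_af = self.extract(e_c, e_a)
        return self.predict(h_cm, h_am, h_af)

# Placeholder stubs: word lengths stand in for embeddings, and the
# predictor returns a fixed distribution.
system = AspectSentimentSystem(
    vectorize=lambda c, a: (
        [[float(len(w))] for w in c.split()],
        [[float(len(w))] for w in a.split()],
    ),
    extract=lambda e_c, e_a: ([e_c[0][0]], [e_a[0][0]], [max(r[0] for r in e_a)]),
    predict=lambda h_cm, h_am, h_af: [1.0, 0.0, 0.0],
)
probs = system("great battery life", "battery life")
```

Keeping each stage behind a callable interface mirrors the description of three separable components: any one of them can be replaced without touching the other two.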

In general, aspect level emotion analysis refers to a process of taking a sentence and some predefined aspect words as input data, and finally outputting the emotion polarity of each aspect word in the sentence. Here we use some practical review examples to illustrate the aspect level sentiment analysis task.

As shown in Table 1, each example sentence contains two aspect words, and each aspect word takes one of four emotion polarities: positive, neutral, negative, or conflicting. Aspect level emotion analysis is then defined as follows:

Table 1: Some examples of aspect level sentiment analysis

Definition 1: Formally, a comment sentence $S = \{w_1, w_2, \ldots, w_n\}$ is given, where n is the total number of words in S. An aspect word list $A = \{a_1, \ldots, a_i, \ldots, a_m\}$ has length m, where $a_i$ denotes the i-th aspect word in the aspect word list A, which is a subsequence of sentence S. $P = \{p_1, \ldots, p_j, \ldots, p_C\}$ denotes the candidate emotion polarities, where C is the number of categories of emotion polarity and $p_j$ denotes the j-th emotion polarity.

Problem: The goal of the aspect level emotion analysis model is to predict the most likely emotion polarity for a particular aspect word, which can be expressed as:

where φ denotes a function that quantifies the degree of matching between the aspect word $a_i$ and the emotion polarity $p_j$ in the sentence S. Finally, the model outputs the emotion polarity with the highest matching degree as the classification result. Table 2 summarizes the symbols used in the model and their descriptions.

Table 2: symbols used and their description

The invention relates to an aspect level emotion analysis method based on a BERT and aspect feature positioning model, which comprises the following steps: firstly, generating a high-quality sequence word vector by utilizing a pre-training BERT model, and providing effective support for the subsequent steps; then, in the feature extraction method of aspect-level emotion analysis, an important feature extraction module is realized based on a multi-head attention mechanism, and important information of context and a target is extracted; then, providing an aspect feature positioning model, and comprehensively considering the important features of the target words to obtain target related features; and finally, fusing context related to the target and important target information, and predicting the probability of different emotion polarities by using the emotion prediction factor on the basis of the fused information. The specific method and principle are as follows:

1. multi-angle text vectorization mechanism

The text vectorization mechanism essentially maps each word to a high-dimensional vector space. Two context-based word embedding models, Word2vec and GloVe, are widely used for text vectorization and achieve good performance in aspect-level sentiment analysis tasks. However, research has shown that these two word embedding models cannot capture enough information from the text, which leads to insufficient classification accuracy and reduced performance. Therefore, a high-quality word embedding model has an important influence on improving the accuracy of the classification result.

The key to implementing aspect-level sentiment analysis is effective natural language understanding, which usually depends heavily on large-scale, high-quality labeled text. Fortunately, the BERT model is a language pre-training model that can make effective use of unlabeled text: by randomly masking part of the vocabulary, it uses a deep multi-layer bidirectional Transformer encoder to learn a general natural language understanding model from massive unlabeled text, and it is then fine-tuned with a small amount of labeled data, so that high-quality text feature vector representations can be generated. Inspired by this, in the ALM-BERT method proposed by the present invention, the special markers [CLS] and [SEP] are added at the beginning and end of the input sequence, respectively, for a given word sequence, in order to divide the sequence into different segments. The input word embedding vector therefore includes the token embedding, segment embedding, and position embedding generated for the different segments. Specifically, the comment text and the aspect word are converted into "[CLS] + comment text + [SEP]" and "[CLS] + target + [SEP]", respectively, yielding the context representation $E_c$ and the aspect representation $E_a$:

$$E_c = \{we_{[CLS]}, we_1, we_2, \ldots, we_{[SEP]}\} \tag{2}$$

$$E_a = \{ae_{[CLS]}, ae_1, ae_2, \ldots, ae_{[SEP]}\} \tag{3}$$
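The input formatting described above can be sketched as follows. This is a minimal illustration, not the tokenizer used by BERT: whitespace splitting stands in for BERT's subword tokenization, and the function name `build_bert_inputs` is hypothetical.

```python
def build_bert_inputs(comment_text, aspect):
    """Wrap the comment text and the aspect word with [CLS] and [SEP] markers."""
    # Assumption: whitespace splitting replaces BERT's real subword tokenizer.
    context_tokens = ["[CLS]"] + comment_text.split() + ["[SEP]"]
    aspect_tokens = ["[CLS]"] + aspect.split() + ["[SEP]"]
    return context_tokens, aspect_tokens


context_tokens, aspect_tokens = build_bert_inputs(
    "the food was great but the service was slow", "service"
)
print(context_tokens)
print(aspect_tokens)
```

In a full pipeline, each token sequence would then be passed through the BERT model to obtain the vectors that make up $E_c$ and $E_a$.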

2. Feature extraction method for aspect-level emotion analysis

In order to extract hidden features of an aspect word and its context, and to give particular weight to the auxiliary information contained in the aspect word, a Transformer encoder is introduced and an aspect word feature positioning module is provided. The basic idea is to model the context and the target word interactively so as to fully integrate the information of the aspect words and the context. In addition, acquiring the feature information of the aspect words in the context can improve the emotion classification accuracy.

2.1 important feature extraction model

A Transformer encoder is a novel feature extractor based on a multi-head attention mechanism and a position-wise feed-forward network. It can learn different important information in different feature representation subspaces. Moreover, the Transformer encoder can directly capture long-term dependencies in the sequence and is easier to parallelize than recurrent and convolutional neural networks, which greatly reduces training time. The invention uses the Transformer encoder to extract interactive semantics from the aspect information representation and the context information representation generated by the BERT model, and to determine which context is most important for determining the emotion of the aspect words. The long-term dependency information and the context-aware information of the context are then used as input data of the position-wise feed-forward network to generate the respective hidden states, and after a mean pooling operation the final interactive hidden state of the context interaction and the final interactive hidden state of the context and the aspect words are obtained.

Intuitively, a multi-head attention mechanism is composed of multiple self-attention mechanisms, each of which maps a query sequence (Q) and a series of keys (K) and values (V) to an output, so that different important information is captured in parallel subspaces. The attention score function $f_s(\cdot)$ of the self-attention mechanism is calculated as follows:

$$f_s(Q, K, V) = \sigma(f_e(Q, K))V \tag{4}$$

where $\sigma(\cdot)$ denotes the normalized exponential function, and $f_e(\cdot)$ is an energy function that learns the correlation features between K and Q, which can be calculated using the following formula:

$$f_e(Q, K) = \frac{QK^{T}}{\sqrt{d_k}} \tag{5}$$

where $\sqrt{d_k}$ represents the scale factor, and $d_k$ is the dimensionality of the query Q and the key vector K.

The attention score function $f_{mh}(\cdot)$ of the multi-head attention mechanism is obtained by concatenating the attention scores of the self-attention mechanisms:

$$f_{mh}(Q, K, V) = [a^1; a^2; \ldots; a^i; \ldots; a^{n\text{-}head}]W_d \tag{6}$$

where $a^i$ is the attention score representing the $i$-th piece of important information captured, $[a^1; a^2; \ldots; a^i; \ldots; a^{n\text{-}head}]$ represents the concatenated vector, and $W_d$ is the attention weight matrix.
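The attention computation can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the energy function is taken to be the scaled dot product (suggested by the scale factor and $d_k$ in the text), each head simply operates on its own slice of the feature dimension with no learned per-head projection, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Normalized exponential function sigma(.) applied to one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(q, k, v):
    """f_s(Q, K, V) = sigma(f_e(Q, K)) V, assuming f_e(Q, K) = Q K^T / sqrt(d_k)."""
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    energy = matmul(q, k_t)
    weights = [softmax([e / math.sqrt(d_k) for e in row]) for row in energy]
    return matmul(weights, v)


def multi_head_attention(q, k, v, n_head, w_d):
    """f_mh(Q, K, V) = [a^1; ...; a^n_head] W_d, one feature slice per head."""
    size = len(q[0]) // n_head
    heads = []
    for i in range(n_head):
        lo, hi = i * size, (i + 1) * size
        heads.append(self_attention([r[lo:hi] for r in q],
                                    [r[lo:hi] for r in k],
                                    [r[lo:hi] for r in v]))
    # Concatenate the head outputs token by token, then apply W_d.
    concat = [[x for head in heads for x in head[t]] for t in range(len(q))]
    return matmul(concat, w_d)


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.9, 0.1]]
values = [[1.0, 2.0, 3.0, 4.0]] * 3
out = multi_head_attention(tokens, tokens, values, n_head=2, w_d=identity)
print(out)
```

Because every value row is identical in this toy input, each output row reproduces that row regardless of the attention weights, which makes the result easy to check by hand.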

As shown in equations (8) and (9) below, the context representation and the aspect representation are input into the multi-head attention mechanism to capture the long-term dependencies of the context and to determine which context is most important for determining the emotion of the aspect words.

$$c_{cc} = f_{mh}(E_c, E_c) \tag{8}$$

$$t_{ca} = f_{mh}(E_c, E_a) \tag{9}$$

where $c_{cc}$ and $t_{ca}$ are the long-term dependency information and the context-aware information of the context, respectively.

Then, the Transformer encoder takes $c_{cc}$ and $t_{ca}$, respectively, as input data of the position-wise feed-forward network to generate the hidden states $h_c$ and $h_a$. The position-wise feed-forward network PFN(h) is a variant of the multi-layer perceptron. Formally, PFN, $h_c$, and $h_a$ are defined as follows:

$$h_c = PFN(c_{cc}) \tag{10}$$

$$h_a = PFN(t_{ca}) \tag{11}$$

$$PFN(h) = \zeta(hW_1 + b_1)W_2 + b_2 \tag{12}$$

where $\zeta(hW_1 + b_1)$ is a rectified linear unit (ReLU), $b_1$ and $b_2$ are bias values, and $W_1$ and $W_2$ represent learnable weight parameters.

After a mean pooling operation is performed on $h_c$ and $h_a$, the final interactive hidden state $h_{cm}$ of the context interaction and the final interactive hidden state $h_{am}$ of the context and the aspect words are obtained.
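The position-wise feed-forward network and the mean pooling step can be sketched as follows. This is a minimal illustration with toy weights; the function names and the identity weight matrices are assumptions chosen so that the result is easy to verify by hand.

```python
def linear(h, w, b):
    """Apply h W + b row by row; w has shape (d_in, d_out)."""
    return [[sum(x * wi for x, wi in zip(row, col)) + bj
             for col, bj in zip(zip(*w), b)] for row in h]


def pfn(h, w1, b1, w2, b2):
    """PFN(h) = ReLU(h W1 + b1) W2 + b2, applied to every token position."""
    hidden = [[max(0.0, v) for v in row] for row in linear(h, w1, b1)]
    return linear(hidden, w2, b2)


def mean_pool(h):
    """Average over the token dimension, giving one vector per sequence."""
    return [sum(col) / len(h) for col in zip(*h)]


eye = [[1.0, 0.0], [0.0, 1.0]]
c_cc = [[1.0, -2.0], [3.0, 4.0]]           # toy stand-in for the attention output
h_c = pfn(c_cc, eye, [0.0, 0.0], eye, [0.0, 0.0])
h_cm = mean_pool(h_c)
print(h_c)   # the negative entry is clipped by the ReLU
print(h_cm)
```

The same two functions would be applied to $t_{ca}$ to obtain $h_a$ and $h_{am}$.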

2.2 aspect feature localization model

The Transformer encoder captures the long-term dependencies of the context and generates semantic information about the interaction between the aspect words and the context. In order to highlight the importance of different aspect words, the invention establishes an aspect word feature positioning model. Its main idea is to select information related to the aspect words from the context feature representation and, by capturing feature representation vectors that contain the aspect information, to integrate the aspect information better, thereby improving the accuracy of aspect-level emotion classification. The working process of the aspect feature positioning model is shown in Algorithm 1:

Specifically, according to the position and length of the aspect word, the feature localization algorithm extracts from the context representation $E_c$ the most important related information $af$ of the aspect word; max pooling is then applied to $af$ to obtain the most important feature $AF$, as follows:

$$AF = \mathrm{Maxpooling}(af, dim = 0) \tag{13}$$

Thereafter, a dropout operation is performed on the most important feature $AF$, yielding the important feature $h_{af}$ of the aspect word in the context representation $E_c$.
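The localization step described above can be sketched as follows. This is an illustration under assumptions rather than a reproduction of Algorithm 1: the aspect position is taken as a row index into $E_c$ (any offset for the [CLS] marker is assumed to be already included), dropout is implemented as inverted dropout, and the function name `locate_aspect_features` is hypothetical.

```python
import random


def locate_aspect_features(e_c, start, length, dropout=0.0, rng=None):
    """Slice the aspect rows out of E_c, max-pool them, then apply dropout."""
    af = e_c[start:start + length]              # rows of E_c covering the aspect word
    pooled = [max(col) for col in zip(*af)]     # Maxpooling(af, dim=0)
    if dropout > 0.0:
        # Assumption: inverted dropout, active only during training.
        rng = rng or random.Random(0)
        keep = 1.0 - dropout
        pooled = [v / keep if rng.random() < keep else 0.0 for v in pooled]
    return pooled


e_c = [[0.1, 0.9], [0.5, 0.2], [0.7, 0.4], [0.3, 0.3]]  # toy context representation
h_af = locate_aspect_features(e_c, start=1, length=2)
print(h_af)
```

With `dropout=0.0` the function returns the element-wise maximum over the two aspect rows, which is how it would behave at inference time.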

3. Emotion predictor

Firstly, $h_{cm}$, $h_{am}$, and $h_{af}$ are concatenated by vector splicing to obtain the overall feature $r$:

$$r = [h_{cm}; h_{am}; h_{af}] \tag{14}$$

Then, a linear function is applied to $r$ for data preprocessing, namely:

$$x = W_u r + b_u \tag{15}$$

where $W_u$ is a weight matrix and $b_u$ is a bias value.

Finally, the softmax function is used to calculate the probability $Pr(a = p)$ that the emotion polarity of the aspect word $a$ in the sentence is $p$:

$$Pr(a = p) = \frac{\exp(x_p)}{\sum_{i=1}^{C} \exp(x_i)} \tag{16}$$

where $p$ represents a candidate emotion polarity and $C$ is the number of categories of emotion polarity.
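The three steps of the emotion predictor (concatenation, linear function, softmax) can be sketched as follows. The weights are toy values chosen for illustration, and the function name `predict_polarity` is hypothetical.

```python
import math


def predict_polarity(h_cm, h_am, h_af, w_u, b_u):
    """Concatenate the three features, apply x = W_u r + b_u, then softmax."""
    r = h_cm + h_am + h_af                      # vector splicing: r = [h_cm; h_am; h_af]
    x = [sum(w * v for w, v in zip(row, r)) + b for row, b in zip(w_u, b_u)]
    peak = max(x)                               # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example with C = 3 polarities and one-dimensional features.
w_u = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [-2.0, 0.0, 0.0]]
b_u = [0.0, 0.0, 0.0]
probs = predict_polarity([1.0], [0.0], [0.0], w_u, b_u)
print(probs)
```

The index of the largest probability is taken as the predicted polarity; in this toy case the first polarity wins because its score $x$ is the largest.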

In summary, the aspect-level sentiment analysis method based on BERT and the aspect feature positioning model of the present invention is an end-to-end process. Furthermore, to optimize the parameters of the method, the loss between the predicted emotion polarity $y$ and the correct emotion polarity is minimized. Training uses cross entropy with L2 regularization as the loss function, defined as:

where $D$ represents all training data, $j$ and $i$ are the indices of the training data samples and the emotion classes, respectively, $\lambda$ represents the L2 regularization factor, $\theta$ represents the parameter set of the model, and $y$ represents the predicted emotion polarity, which is compared against the correct emotion polarity.
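A loss of the kind described here, cross entropy plus an L2 penalty, can be sketched as follows. This is a generic formulation given as an assumption, not a transcription of the patent's equation: the one-hot encoding of the correct polarity, the use of the squared L2 norm without an extra constant, and the function name `training_loss` are all choices made for this example.

```python
import math


def training_loss(pred_probs, true_one_hot, params, lam):
    """Cross entropy summed over samples and classes, plus lam * ||theta||^2."""
    ce = 0.0
    for probs, target in zip(pred_probs, true_one_hot):   # index j over samples
        for p, t in zip(probs, target):                   # index i over classes
            if t > 0:                                     # skip zero targets, avoids log(0)
                ce -= t * math.log(p)
    l2 = lam * sum(w * w for w in params)
    return ce + l2


loss = training_loss(
    pred_probs=[[0.5, 0.5]],      # one sample, two classes
    true_one_hot=[[1, 0]],        # the first class is correct
    params=[1.0, 2.0],            # toy parameter set theta
    lam=0.01,
)
print(loss)
```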

4. Evaluation test

In order to evaluate the rationality and effectiveness of the BERT and aspect feature positioning model-based aspect level emotion analysis method and the model, the analysis is carried out through the following evaluation experiments.

4.1 data set and evaluation index

We conducted the evaluation experiments on three public English review data sets, whose details are shown in Table 3. The Restaurant and Laptop data sets are provided by SemEval (reference: Pontiki M, Galanis D, Pavlopoulos J, et al. SemEval-2014 Task 4); the Twitter data set consists of user comments on Twitter collected by Dong et al. (reference: Dong L, Wei F, Tan C, et al. Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification [C]// Meeting of the Association for Computational Linguistics. 2014.), with emotion polarities labeled as positive, negative, and neutral. These three data sets are currently popular review data sets and are widely used in aspect-level sentiment analysis tasks.

Table 3: Statistical information of the data sets

In addition, in order to objectively evaluate the performance of the aspect-level sentiment analysis method and model based on BERT and the aspect feature positioning model, two evaluation indexes commonly used in aspect-level sentiment analysis tasks are adopted: macro F1 (macro-F1) and accuracy (Acc). Accuracy is defined as follows:

$$Acc = \frac{SC}{N} \tag{18}$$

where SC represents the number of correctly classified samples and N represents the total number of samples. In general, the higher the accuracy, the better the performance of the model.

In addition, macro F1 (macro-F1) is used to reflect the performance of the model more faithfully: it is the F1 score of each emotion polarity, that is, the harmonic mean of precision and recall, averaged over all emotion polarities. Macro-F1 is calculated according to the following formula:

where TP is the number of samples correctly classified as emotion polarity $i$, FP is the number of samples misclassified as emotion polarity $i$, FN is the number of samples of emotion polarity $i$ that are misclassified as other emotion polarities, $C$ is the number of categories of emotion polarity, and the precision and the recall are computed for each emotion polarity $i$. In our experiments, to evaluate the performance of our model more fully, we considered two settings for the emotion polarity classes: 3C = {positive, neutral, negative} and 4C = {positive, neutral, negative, conflict}.
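The two evaluation indexes can be sketched as follows. This is a minimal illustration using the standard definitions of per-class precision, recall, and F1; returning 0 when a denominator is zero is a convention assumed for this example, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Acc = SC / N: the share of correctly classified samples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(classes)


y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
classes = ["positive", "neutral", "negative"]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, classes)
print(acc, f1)
```

In this toy example the neutral class is never predicted, so its F1 is 0 and it pulls the macro average down even though the accuracy is 0.75. This is why macro F1 is reported alongside accuracy on data sets with unbalanced classes.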

4.2 parameter optimization

During model training, we use the BERT model to generate vector representations of the context and the aspect words. Specifically, we use the standard $BERT_{BASE}$ configuration of the BERT model to complete the model training. In $BERT_{BASE}$, the number of Transformer blocks, the number of hidden neurons, and the number of self-attention heads are 12, 768, and 12, respectively. Furthermore, to analyze the optimal hyper-parameter settings, we provide several important hyper-parameter setting examples.

First, the dropout rate refers to the probability of dropping some neurons during the training of the neural network, in order to mitigate overfitting and enhance the generalization ability of the model. We initialize dropout to 0.3 and then search for the best value at intervals of 0.1. As shown in FIG. 3, when dropout is 0.5, the accuracy and macro F1 of the aspect-level sentiment analysis method and model based on BERT and the aspect feature positioning model of the present invention are the best on the three data sets.

Second, the learning rate determines whether and when the objective function converges to a local minimum. In our experiments, we use the Adam optimization algorithm to update the parameters of the model and explore the optimal learning rate within the range $[10^{-5}, 0.1]$. As shown in FIG. 4, when the learning rate is $2 \times 10^{-5}$, the performance of the aspect-level sentiment analysis method and model based on BERT and the aspect feature positioning model is the best.

Finally, the L2 regularization parameter is a hyper-parameter that helps prevent the model from over-fitting. As shown in FIG. 5, when the L2 regularization parameter is set to 0.01, the performance of the aspect-level sentiment analysis method and model based on BERT and the aspect feature positioning model of the present invention is the best. In addition, the weights of the model are initialized with the Glorot parameter initialization method, the batch size is set to 16, and training runs for a total of 10 iterations.
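For reference, the settings reported in this section can be collected into a single configuration. The dictionary below is only a convenience sketch: the key names are hypothetical, and only the values come from the text above.

```python
# Hypothetical key names; the values are the settings reported in this section.
TRAINING_CONFIG = {
    "bert_variant": "BERT_BASE",
    "num_transformer_blocks": 12,
    "hidden_size": 768,
    "num_attention_heads": 12,
    "dropout": 0.5,
    "optimizer": "Adam",
    "learning_rate": 2e-5,
    "l2_regularization": 0.01,
    "weight_init": "Glorot",
    "batch_size": 16,
    "num_iterations": 10,
}

for name, value in TRAINING_CONFIG.items():
    print(f"{name}: {value}")
```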

4.3 comparison Algorithm

In order to verify the effectiveness of the aspect-level sentiment analysis method and model based on BERT and the aspect feature positioning model of the present invention, they are compared with a number of popular aspect-level sentiment analysis models, as follows:

TD-LSTM is a classical classification model, which integrates related information of the aspect words and their contexts into the LSTM-based classification model, improving the classification accuracy.

ATAE-LSTM is a classification model that feeds the embedded representation of the aspect words into the model together with the embedded representation of the sentence, and then applies an attention mechanism to compute weights in order to achieve high-accuracy emotion classification.

MemNet is a data-driven classification model that uses multiple attention-based models to capture the importance of each context word to complete emotion classification.

IAN is an interactive attention network that models the aspect words and their contexts, respectively, and generates an associative representation of the target and context.

RAM builds a multi-attention mechanism-based framework to capture distant features in the text, enhancing the representation capability of the model.

TNet uses a bidirectional LSTM to generate hidden representations of the context and the aspect words, and uses a CNN layer instead of an attention mechanism to extract important features from the hidden representations.

Cabasc uses two attention-enhancing mechanisms that focus on the aspect words and the context separately, and comprehensively considers the correlation between the context and the aspect words.

AOA constructs a dual attention module that links emotion words to aspect words. The dual attention module automatically generates mutual attention weights from aspect to text and from text to aspect.

MGAN is a multi-grained attention model that captures the interaction information between aspect words and context from coarse to fine.

AEN-BERT is a model based on attention mechanism and BERT, showing good performance in the aspect-level sentiment analysis task.

BERT-base is an aspect-level sentiment analysis model based on pre-trained BERT, with a fully connected layer and a softmax layer for the classification task.

In order to measure the performance of the model more accurately, the AOA, IAN, and MemNet models are extended by replacing their embedding layers with the BERT model, giving the AOA-BERT, IAN-BERT, and MemNet-BERT models. The rest of each model's structure is kept unchanged.

4.4 evaluation test analysis

Table 4 below shows the results of emotion classification with C = 3 emotion polarities. It can easily be observed from the table that the accuracy and macro F1 of the BERT-based models (aspect-level sentiment analysis methods based on BERT pre-training) are significantly higher than those of the models based on GloVe and word2vec. In particular, on the Restaurant data set, the accuracy and macro F1 of the aspect-level sentiment analysis method and model based on BERT and the aspect feature positioning model are 12.77% and 30.97% higher, respectively, than those of the classical IAN model. This shows that BERT can better express the semantic and grammatical features of text, and that the aspect-level sentiment analysis method and model based on BERT and the aspect feature positioning model of the present invention achieve the best classification performance on the three data sets. Specifically, on the Restaurant data set, the accuracy and macro F1 of the method of the present invention are improved by 4.2% and 8.81%, respectively, compared with the AEN method. In addition, on the Laptop data set, the accuracy and macro F1 of the method of the present invention are 3.29% and 3.15% higher, respectively, than those of the BERT-base model, which shows that the aspect feature positioning module plays a positive role in aspect-level sentiment analysis.

Table 4: Experimental evaluation results for the compared methods

From the perspective of capturing long-term dependency relationships in comment texts, a series of verification experiments are constructed on texts with different lengths.

As shown in FIGS. 6-8, the aspect-level sentiment analysis method and model based on BERT and the aspect feature positioning model of the present invention generally achieve higher accuracy and macro F1 than TD-LSTM, which means that the Transformer encoder we built can better model the implicit relationship between contexts than an LSTM-based encoder. Furthermore, as shown in FIG. 7, we also note that the mean accuracy and mean macro F1 of the ALM-BERT model over sentences of different lengths are 3.1% and 6.56% higher, respectively, than those of AEN, because the method and model of the present invention make better use of the aspect word information than AEN and reduce the interference of information unrelated to the aspect words.

In conclusion, the experiments show that the BERT and aspect feature positioning model-based aspect-level emotion analysis method and model can obtain higher accuracy and macro F1, and further verify the feasibility and effectiveness of the BERT model and aspect information in aspect-level emotion analysis tasks.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. An aspect level sentiment analysis method based on BERT and an aspect feature localization model is characterized by comprising the following steps:

S2, constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect word representation and the context representation, integrating the relationship between the aspect words and the context, and further distinguishing the contributions of different sentences and aspect words to the classification result; the "constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect word representation and the context representation and integrate the relationship between the aspect words and the context" means that the important feature extraction of the aspect-level sentiment analysis is realized based on the multi-head attention mechanism, and the important information of the context and the target is extracted, specifically: firstly, introducing a Transformer encoder, wherein the Transformer encoder is a novel feature extractor based on a multi-head attention mechanism and a position-wise feed-forward network, and can learn different important information in different feature representation subspaces and directly capture long-term dependencies in a sequence; then, interactive semantics are extracted from the aspect information representation and the context information representation generated by the BERT model through the Transformer encoder, the context which is most important for determining the emotion of the aspect words is determined, meanwhile, the long-term dependency information and the context-aware information of the context are used as input data of the position-wise feed-forward network, hidden states are respectively generated, and the final interactive hidden state of the context interaction and the final interactive hidden state of the context and the aspect words are obtained after a mean pooling operation; wherein,

the "extracting interactive semantics from the aspect information representation and the context information representation generated by the BERT model through the Transformer encoder, determining the context which is most important for determining the emotion of the aspect words, meanwhile using the long-term dependency information and the context-aware information of the context as input data of the position-wise feed-forward network to respectively generate hidden states, and obtaining the final interactive hidden state of the context interaction and the final interactive hidden state of the context and the aspect words after a mean pooling operation" specifically includes:

S201, through the plurality of self-attention mechanisms forming the multi-head attention mechanism in the Transformer encoder, mapping, from the aspect information representation and the context information representation generated by the BERT model, a query sequence Q and a series of keys K and values V that capture different important information in parallel subspaces;

wherein

S203, inputting the context representation and the aspect representation into the attention score function $f_{mh}(Q, K, V) = [a^1; a^2; \ldots; a^i; \ldots; a^{n\text{-}head}]W_d$ to obtain, respectively, the long-term dependency information $c_{cc}$ of the context and the context-aware information $t_{ca}$, so as to capture the long-term dependencies of the context and to determine which context is most important for determining the emotion of the aspect words; wherein $a^i$ is the attention score representing the $i$-th piece of important information captured, $[a^1; a^2; \ldots; a^i; \ldots; a^{n\text{-}head}]$ denotes the concatenated vector, $W_d$ is the attention weight matrix, $c_{cc} = f_{mh}(E_c, E_c)$, and $t_{ca} = f_{mh}(E_c, E_a)$;

S204, the Transformer encoder takes $c_{cc}$ and $t_{ca}$, respectively, as input data of the position-wise feed-forward network to generate the hidden states $h_c$ and $h_a$; the position-wise feed-forward network is a variant of the multi-layer perceptron and is denoted as PFN(h), and PFN(h), $h_c$, and $h_a$ are defined as follows:

$h_c = PFN(c_{cc})$;

$h_a = PFN(t_{ca})$;

$PFN(h) = \zeta(hW_1 + b_1)W_2 + b_2$;

wherein $\zeta(hW_1 + b_1)$ is a rectified linear unit, $b_1$ and $b_2$ are bias values, and $W_1$ and $W_2$ represent learnable weight parameters;

S205, after a mean pooling operation is performed on the hidden states $h_c$ and $h_a$, the final interactive hidden state $h_{cm}$ of the context interaction and the final interactive hidden state $h_{am}$ of the context and the aspect words are obtained;

2. The method according to claim 1, wherein the "obtaining high-quality context information representation and aspect information representation using BERT model" means generating high-quality text feature vector representations using a pre-trained BERT model as the text vectorization mechanism, wherein the BERT model is a pre-trained language representation model, and the text vectorization mechanism maps each word to a high-dimensional vector space, specifically: the BERT model generates the text representation by using a deep bidirectional Transformer encoder, divides a given word sequence into different segments by adding special markers at the beginning and the end of the input sequence respectively, generates token embedding, segment embedding, and position embedding for the different segments, and finally converts the comment text and the aspect words respectively to obtain the context information representation and the aspect information representation.

3. The method according to claim 2, wherein said "dividing a given word sequence into different segments by adding special segmentation markers at the beginning and end of the input sequence, respectively, generating marker embedding, segment embedding and position embedding for the different segments, and finally converting the annotation text and the facet words, respectively, to obtain the context information representation and the facet information representation", specifically:

the BERT model adds the special markers [CLS] and [SEP] at the beginning and the end of an input sequence respectively, divides a given word sequence into different segments, and generates token embedding, segment embedding, and position embedding for the different segments, so that the embedded representation of the input sequence contains all the information of the three embeddings; finally, the comment text and the aspect words are converted in the BERT model into "[CLS] + comment text + [SEP]" and "[CLS] + target + [SEP]" respectively, to obtain the context representation $E_c$ and the aspect representation $E_a$:

$E_c = \{we_{[CLS]}, we_1, we_2, \ldots, we_{[SEP]}\}$;

$E_a = \{ae_{[CLS]}, ae_1, ae_2, \ldots, ae_{[SEP]}\}$;

wherein $we_{[CLS]}$ and $ae_{[CLS]}$ represent the vectors of the classification marker [CLS], and $we_{[SEP]}$ and $ae_{[SEP]}$ represent the vectors of the delimiter [SEP].

4. The method of claim 3, wherein the aspect feature localization model works as the following algorithm 1:

specifically, according to the position and length of the aspect word, the feature localization algorithm extracts from the context representation $E_c$ the most important related information $af$ of the aspect word; max pooling is applied to $af$ to obtain the most important feature $AF$, and a dropout operation is then performed on the most important feature $AF$ to obtain the important feature $h_{af}$ of the aspect word in the context representation $E_c$.

5. The method according to claim 4, wherein the fusing context and target importance information related to the target and predicting probabilities of different emotion polarities using the emotion prediction factors on the basis of the fused information specifically comprises:

$r = [h_{cm}; h_{am}; h_{af}]$;

S402, a linear function is applied to $r$ for data preprocessing, namely:

$x = W_u r + b_u$, wherein $W_u$ is a weight matrix and $b_u$ is a bias value;

6. The method according to any one of claims 1-5, further comprising: training is performed by using cross entropy and L2 regularization as loss functions, and the training is defined as:

indicating the correct emotional polarity.

7. An aspect-level sentiment analysis system, comprising:

a text vectorization mechanism, which utilizes a BERT model to obtain high-quality context information representation and aspect information representation so as to maintain the integrity of text information;

the emotion predictor is used for fusing context related to the target and important target information and predicting the probability of different emotion polarities by utilizing the emotion prediction factors on the basis of the fused information;

the BERT model is a pre-trained language representation model, the representation of a text is generated by using a deep-layer multi-layer bidirectional converter encoder, meanwhile, a given word sequence is divided into different fragments by respectively adding special word segmentation marks at the beginning and the end of an input sequence, mark embedding, segmentation embedding and position embedding are generated for the different fragments, and finally, a comment text and an aspect word are respectively converted to obtain context information representation and aspect information representation;

the feature extraction model of the aspect-level sentiment analysis comprises an important feature extraction model and an aspect feature positioning model; the important feature extraction model is an attention encoder based on a multi-head attention mechanism and is used for learning the interaction between the aspect word representation and the context representation and integrating the relationship between the aspect words and the context, so as to distinguish the contributions of different sentences and aspect words to the classification result; the aspect feature positioning model is used for capturing aspect information during sentence modeling and integrating the complete information of the aspects into the interactive semantics, so as to reduce the influence of interference words irrelevant to the aspect words and improve the integrity of the aspect word information;

the emotion predictor concatenates the final interactive hidden state of the context interaction, the final interactive hidden state of the context and the aspect words, and the important features of the aspect words by vector splicing to obtain the overall feature, then applies a linear function to the overall feature for data preprocessing, and finally uses the softmax function to calculate the probability that the emotion polarity of the aspect word in the sentence is each candidate emotion polarity;

interactive semantics are extracted from the aspect information representation and the context information representation generated by the BERT model through the Transformer encoder, the context which is most important for determining the emotion of the aspect words is determined, meanwhile, the long-term dependency information and the context-aware information of the context are used as input data of the position-wise feed-forward network to respectively generate hidden states, and the final interactive hidden state of the context interaction and the final interactive hidden state of the context and the aspect words are obtained after a mean pooling operation, specifically comprising the following steps:

wherein $\sqrt{d_k}$ denotes the scale factor, and $d_k$ is the dimensionality of the query Q and the key vector K;

S203, inputting the context representation and the aspect representation into the attention score function $f_{mh}(Q, K, V) = [a^1; a^2; \ldots; a^i; \ldots; a^{n\text{-}head}]W_d$ to obtain, respectively, the long-term dependency information $c_{cc}$ of the context and the context-aware information $t_{ca}$, so as to capture the long-term dependencies of the context and to determine which context is most important for determining the emotion of the aspect words; wherein $a^i$ is the attention score representing the $i$-th piece of important information captured, $[a^1; a^2; \ldots; a^i; \ldots; a^{n\text{-}head}]$ denotes the concatenated vector, $W_d$ is the attention weight matrix, $c_{cc} = f_{mh}(E_c, E_c)$, and $t_{ca} = f_{mh}(E_c, E_a)$;

S204, the Transformer encoder takes $c_{cc}$ and $t_{ca}$, respectively, as input data of the position-wise feed-forward network to generate the hidden states $h_c$ and $h_a$; the position-wise feed-forward network is a variant of the multi-layer perceptron and is denoted as PFN(h), and PFN(h), $h_c$, and $h_a$ are defined as follows:

$h_c = PFN(c_{cc})$;

$h_a = PFN(t_{ca})$;

$PFN(h) = \zeta(hW_1 + b_1)W_2 + b_2$;

The working process of the aspect feature positioning model is as follows, algorithm 1: