CN116541579A - Aspect-level emotion analysis based on local context focus mechanism and conversational attention - Google Patents

Aspect-level emotion analysis based on local context focus mechanism and conversational attention

Info

Publication number
CN116541579A
Authority
CN
China
Prior art keywords
context
local context
local
features
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310548728.7A
Other languages
Chinese (zh)
Inventor
李弼程
林煌
林正超
康智勇
王华珍
皮慧娟
王成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaqiao University
Original Assignee
Huaqiao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaqiao University filed Critical Huaqiao University
Priority to CN202310548728.7A priority Critical patent/CN116541579A/en
Publication of CN116541579A publication Critical patent/CN116541579A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 - Details of database functions independent of the retrieved data types
    • G06F 16/95 - Retrieval from the web
    • G06F 16/953 - Querying, e.g. by the use of web search engines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 - Complex mathematical operations
    • G06F 17/16 - Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides aspect-level sentiment analysis based on a local context focus mechanism and conversational (talking-heads) attention, comprising: step S1, constructing an analysis model; step S2, a BERT pre-training layer separately models the words of a local context form sequence and a global context form sequence to obtain preliminary local context features and preliminary global context features; step S3, at a feature extraction layer, further extracting local context features with the local context focus mechanism, combining a context-features dynamic masking technique with the talking-heads attention mechanism, and extracting global context features with the talking-heads attention mechanism; step S4, at a feature learning layer, fusing the local and global context features into a fusion vector and extracting the features of the fusion vector with the talking-heads attention mechanism; and step S5, at an output layer, obtaining the aspect-level sentiment analysis result from the features of the fusion vector. The invention better captures the sentiment expressed toward different aspects.

Description

Aspect-level Sentiment Analysis Based on Local Context Focus Mechanism and Conversational Attention

Technical Field

The present invention relates to the field of opinion mining, and in particular to aspect-level sentiment analysis based on a local context focus mechanism and conversational attention (talking-heads attention).

Background Art

With the rapid development of the Internet, online platforms of all kinds have emerged one after another, from news sites and blogs to forums, and user participation keeps rising. Users express their own views and attitudes toward current events while browsing trending information online. At the same time, all kinds of products and entertainment are presented to users through the Internet, and after purchasing and experiencing them, users post large numbers of comments expressing their opinions on products and services. Such opinionated text is a very important data resource, and analyzing it is highly valuable. For example, merchants can mine this data for users' preferences and complaints about a product, find better directions for improvement, and increase sales; after an emergency, analyzing people's comments on the event helps track the direction of public opinion; and when the government introduces a new policy, analyzing netizens' opinions helps judge whether the policy is effective, so that adjustments can be made.

With the continuous advance of social media, sentiment analysis has high theoretical significance and application value in natural language processing. Sentiment analysis classifies the different emotional expressions carried by a text. Most previous sentiment analysis research is coarse-grained and cannot satisfy the need for more accurate, fine-grained analysis, for example analyzing which aspects of a given product are strengths and which are weaknesses. Aspect-level sentiment analysis differs from such coarse-grained work in that it analyzes the sentiment polarity of each aspect mentioned in a sentence, and it has therefore become an important research direction within sentiment analysis. For example, the review "The price of this house is good, but the location is terrible" contains two aspect words, "price" and "location", whose corresponding opinion words are "good" and "terrible"; "good" expresses positive sentiment and "terrible" negative sentiment. In such cases, aspect-level sentiment analysis captures the sentiment expressed toward each aspect much more fully. Research on aspect-level sentiment analysis thus has great significance and value.

Since pre-trained models were introduced, pre-trained models led by BERT have received wide attention, and more and more researchers have applied them to aspect-level sentiment analysis, demonstrating that the BERT pre-trained model is feasible for the task. However, prior-art aspect-level sentiment analysis predicts the sentiment polarity of the different aspects in a sentence without considering the relationship between sentiment polarity and the local context. In addition, most studies build on a single attention head or on multi-head attention, but the heads of multi-head attention operate independently of one another, so prior-art aspect-level sentiment analysis methods still leave room for improvement in the language processing model.

Summary of the Invention

The purpose of the present invention is to propose aspect-level sentiment analysis based on a local context focus mechanism and conversational attention (talking-heads attention, THA), which relates sentiment polarity to the local context and links the otherwise independent heads of multi-head attention, yielding a stronger attention design, better results in the language processing model, and a better grasp of the sentiment expressed toward different aspects.

The present invention is realized through the following technical solutions:

Aspect-level sentiment analysis based on a local context focus mechanism and talking-heads attention, comprising the following steps:

Step S1: construct an analysis model comprising a BERT pre-training layer, a feature extraction layer, a feature learning layer, and an output layer.

Step S2: the BERT pre-training layer processes the corpus to be analyzed into a local context form sequence and a global context form sequence, and models the words of each sequence separately, yielding preliminary local context features $B^l$ and preliminary global context features $B^g$.

Step S3: at the feature extraction layer, use the local context focus mechanism, combining the context-features dynamic masking (CDM) technique with the talking-heads attention mechanism, to further extract local context features, and use the talking-heads attention mechanism to extract global context features. This specifically includes:

Step S31: compute the semantic relative distance of the local context form sequence as $D_i = |i - F_a| - n/2$, where $i$ is the position of a word in the local context form sequence, $F_a$ is the position of the aspect word in the local context form sequence, and $n$ is the length of the aspect word in the local context form sequence.

Step S32: use the context-features dynamic masking technique to help the model capture local context features, obtaining the local context features $O^l_{CDM} = B^l \cdot M$, where $M = [V_1, V_2, \ldots, V_n]$ is a masking matrix for screening out non-local context features, $V_i$ is the mask vector of the $i$-th context word in the local context form sequence ($V_i = E$ if $D_i \le a$, otherwise $V_i = O$), $i = 1, 2, \ldots, n$, $a$ is the semantic relative distance threshold, $E$ is the all-ones vector of length $n$, and $O$ is the all-zeros vector of length $n$.

Step S33: use the talking-heads attention mechanism to further extract the local context features, $O^l = THA(O^l_{CDM})$, and to extract the global context features, $O^g = THA(B^g)$.

Step S4: at the feature learning layer, fuse the local context features $O^l$ and the global context features $O^g$ into a fusion vector, and extract the features $O^{lg}_{THA}$ of the fusion vector with the talking-heads attention mechanism.

Step S5: at the output layer, obtain the aspect-level sentiment analysis result from the features $O^{lg}_{THA}$ of the fusion vector.

Further, in step S2, preprocessing the corpus to be analyzed into a local context form sequence and a global context form sequence specifically means processing the corpus into the local context form sequence $X^l$ = [CLS] + context + [SEP] and the global context form sequence $X^g$ = [CLS] + context + [SEP] + aspect word + [SEP], where [CLS] can serve as the semantic representation of the whole sentence.

Further, in step S2, the BERT pre-training layer models $X^l$ with a first BERT training model BERT$^l$ to obtain $B^l$, and models $X^g$ with a second BERT training model BERT$^g$ to obtain $B^g$, where the first BERT training model BERT$^l$ and the second BERT training model BERT$^g$ are independent of each other.

Further, in step S4, the features of the fusion vector are $O^{lg}_{THA} = THA(O^{lg}_{dense})$, where $O^{lg}_{dense} = W^{lg} \cdot O^{lg} + b^{lg}$ is a fully connected layer, $O^{lg} = [O^l; O^g]$, $W^{lg}$ is a weight coefficient matrix, and $b^{lg}$ is a bias vector.

Further, in step S5, the output layer is a nonlinear layer; the features $O^{lg}_{THA}$ of the fusion vector are input to the nonlinear layer and the prediction is made with the softmax function: $\hat{y} = \mathrm{softmax}(W_o \cdot O^{lg}_{THA} + b_o)$, where $\hat{y}$ is the analysis result, $W_o$ is a weight matrix, and $b_o$ is a bias vector.

The present invention has the following beneficial effects:

The present invention fully accounts for the importance of the local context to sentiment polarity. It uses two mutually independent BERT training models to extract local and global context features, further captures local context features through the context-features dynamic masking layer of the local context focus mechanism combined with the talking-heads attention mechanism, and then fuses them with the global information before feeding the result to the nonlinear layer for sentiment analysis. This realizes a stronger attention design, achieves better results in the language processing model, and better captures the sentiment expressed toward different aspects.

Brief Description of the Drawings

The present invention is described in further detail below with reference to the accompanying drawings.

Fig. 1 is a flowchart of the present invention.

Fig. 2 is a diagram of the analysis model of the present invention.

Fig. 3 is a diagram of context-features dynamic masking in the present invention.

Fig. 4 shows the influence of the semantic relative distance on analysis accuracy in the present invention.

Fig. 5 shows the influence of the semantic relative distance on the MF1 value in the present invention.

Fig. 6 shows the influence of each model component of the present invention on analysis accuracy.

Fig. 7 shows the influence of each model component of the present invention on the MF1 value.

Detailed Description of the Embodiments

As shown in Fig. 1 and Fig. 2, aspect-level sentiment analysis based on a local context focus mechanism and talking-heads attention comprises the following steps:

Step S1: construct an analysis model (the LCFTHA model) comprising a BERT pre-training layer, a feature extraction layer, a feature learning layer, and an output layer.

Step S2: the BERT pre-training layer processes the corpus to be analyzed into a local context form sequence and a global context form sequence, and models the words of each sequence separately, yielding preliminary local context features $B^l$ and preliminary global context features $B^g$.

Specifically, the corpus to be analyzed is processed into the local context form sequence $X^l$ = [CLS] + context + [SEP] and the global context form sequence $X^g$ = [CLS] + context + [SEP] + aspect word + [SEP], where [CLS] can serve as the semantic representation of the whole sentence and [SEP] is the separator of the sentences processed by the BERT pre-training layer. A minimal sketch of this preprocessing is given below.
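For illustration only, the following minimal sketch builds the two input forms with the open-source HuggingFace transformers tokenizer for bert-base-uncased (the checkpoint named in the experiments below); the function and variable names are illustrative assumptions, not part of the claimed method.

```python
# A minimal sketch of step S2's input construction, assuming the HuggingFace
# `transformers` tokenizer. Passing a text pair yields [CLS] text [SEP] pair [SEP].
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def build_inputs(context: str, aspect: str, max_len: int = 85):
    # Local context form sequence:  [CLS] context [SEP]
    x_l = tokenizer(context, padding="max_length", truncation=True,
                    max_length=max_len, return_tensors="pt")
    # Global context form sequence: [CLS] context [SEP] aspect [SEP]
    x_g = tokenizer(context, aspect, padding="max_length", truncation=True,
                    max_length=max_len, return_tensors="pt")
    return x_l, x_g

x_l, x_g = build_inputs(
    "The price of this house is good, but the location is terrible", "price")
```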

The first BERT training model BERT$^l$ models $X^l$ to obtain $B^l = \mathrm{BERT}^l(X^l)$, and the second BERT training model BERT$^g$ models $X^g$ to obtain $B^g = \mathrm{BERT}^g(X^g)$; the two models are independent of each other.

Step S3: at the feature extraction layer, use the local context focus mechanism, combining the context-features dynamic masking technique with the talking-heads attention mechanism, to further extract local context features, and use the talking-heads attention mechanism to extract global context features. This specifically includes:

Step S31: compute the semantic relative distance (SRD) of the local context form sequence as $D_i = |i - F_a| - n/2$. The semantic relative distance is based on the concept of a Token-Aspect pair, i.e. the position of a word in the sentence and the position of the aspect word; it describes the distance between the token and the aspect, and can be understood as how many words separate the two. Here $i$ is the position of a word in the local context form sequence, $F_a$ is the position of the aspect word in the local context form sequence, $n$ is the length of the aspect word in the local context form sequence, and $D_i$ is the distance between the position of the $i$-th word and the target aspect. A minimal sketch of this computation follows.
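For illustration only, a minimal sketch of the SRD computation under the reconstructed formula $D_i = |i - F_a| - n/2$; the names and the 0-based indexing are assumptions.

```python
# A minimal sketch of step S31's semantic relative distance (SRD).
# Names are illustrative, not from the patent.
def semantic_relative_distance(seq_len: int, aspect_pos: int, aspect_len: int):
    """Distance of every token to the aspect span: D_i = |i - F_a| - n/2."""
    return [abs(i - aspect_pos) - aspect_len / 2 for i in range(seq_len)]

# Tokens whose SRD does not exceed the threshold a form the local context.
srd = semantic_relative_distance(seq_len=12, aspect_pos=3, aspect_len=1)
local = [d <= 2 for d in srd]   # threshold a = 2 (reported optimal on Restaurant14)
```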

Step S32: use the context-features dynamic masking (CDM) technique to help the model capture local context features, obtaining the local context features $O^l_{CDM} = B^l \cdot M$, where $M = [V_1, V_2, \ldots, V_n]$ is a masking matrix for screening out non-local context features, and $V_i$ is the mask vector of the $i$-th context word in the local context form sequence: $V_i = E$ when $D_i \le a$ (in which case $V_i$ marks local context) and $V_i = O$ otherwise, $i = 1, 2, \ldots, n$, where $a$ is the semantic relative distance threshold, $E$ is the all-ones vector of length $n$, and $O$ is the all-zeros vector of length $n$.

As shown in Fig. 3, besides keeping the local context features, the context-features dynamic masking layer shields the non-local context features learned by the layer: the features at output positions pointed to by dotted arrows are masked, the features at output positions pointed to by solid arrows are kept, POS denotes an output position, and CDM sets the features at all non-local-context positions to the zero vector. A sketch of this masking follows.
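For illustration only, a minimal sketch of CDM in PyTorch, assuming the preliminary features $B^l$ have shape (batch, seq_len, hidden) and each token's SRD is precomputed; names are illustrative.

```python
# A minimal sketch of step S32's CDM masking: zero out the feature vectors of
# all tokens whose semantic relative distance exceeds the threshold a.
import torch

def cdm_mask(B_l: torch.Tensor, srd: torch.Tensor, a: float) -> torch.Tensor:
    # srd: (batch, seq_len); keep = 1 for local context (V_i = E), 0 otherwise (V_i = O)
    keep = (srd <= a).to(B_l.dtype)
    return B_l * keep.unsqueeze(-1)        # broadcast over the hidden dimension

B_l = torch.randn(2, 12, 768)
srd = torch.tensor([[abs(i - 3) - 0.5 for i in range(12)]] * 2)
O_cdm = cdm_mask(B_l, srd, a=2)            # O^l_CDM
```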

Step S33: use the talking-heads attention (THA) mechanism to further extract the local context features, $O^l = THA(O^l_{CDM})$, and use the talking-heads attention mechanism to extract the global context features, $O^g = THA(B^g)$.

The individual heads of the multi-head attention mechanism are linked by re-fusing the heads with a parameter matrix, forming several mixed attentions, each of which blends the attention of all the original heads: $O_{THA} = \mathrm{concat}[\{O^{(1)}, O^{(2)}, \ldots, O^{(h)}\}] \cdot W_{WH}$, where concat[] joins two or more arrays; $O^{(h)} = P^{(h)} V^{(h)}$ with $P^{(h)} = \mathrm{softmax}(J^{(h)})$; $J^{(h)}$ denotes the linear mapping introduced between the attention heads before the softmax operation, reconstructed here as $J^{(h)} = \sum_{h'} \lambda_{hh'} \, Q^{(h')} K^{(h')\top} / \sqrt{d_k}$; $O^{(h)}$ plays the role of $\mathrm{Attention}(Q^{(h)}, K^{(h)}, V^{(h)})$; $\sqrt{d_k}$ is the scaling factor; $h$ is the number of attention heads; $W_{WH}$ is a weight matrix; and $\lambda_{hh'}$ is a trainable parameter matrix.

The talking-heads attention mechanism can rebalance the masked local context features, avoiding an unbalanced feature distribution after context-features dynamic masking; the THA() in the formula $O^l = THA(O^l_{CDM})$ is computed as $O_{THA} = \mathrm{concat}[\{O^{(1)}, \ldots, O^{(h)}\}] \cdot W_{WH}$, the detailed computation being prior art. A sketch of the mechanism is given below.
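For illustration only, a minimal sketch of talking-heads attention in PyTorch: the per-head logits are mixed by a trainable matrix $\lambda$ before the softmax and the concatenated head outputs are re-fused by $W_{WH}$, following the reconstructed formulas above. Shapes and names are assumptions.

```python
# A minimal sketch of step S33's talking-heads attention (THA).
import math
import torch
import torch.nn as nn

class TalkingHeadsAttention(nn.Module):
    def __init__(self, d_model: int = 768, h: int = 12):
        super().__init__()
        self.h, self.d_k = h, d_model // h
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.lam = nn.Parameter(torch.eye(h))     # lambda: trainable head-mixing matrix
        self.w_wh = nn.Linear(d_model, d_model)   # W_WH: re-fuses the concatenated heads

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        split = lambda t: t.view(b, n, self.h, self.d_k).transpose(1, 2)
        q, k, v = split(self.q(x)), split(self.k(x)), split(self.v(x))
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)    # (b, h, n, n)
        # J^(h): linear map across heads before the softmax
        j = torch.einsum("gh,bhij->bgij", self.lam, scores)
        p = torch.softmax(j, dim=-1)                               # P^(h)
        o = (p @ v).transpose(1, 2).reshape(b, n, -1)              # concat of O^(h)
        return self.w_wh(o)                                        # O_THA

tha = TalkingHeadsAttention()
out = tha(torch.randn(2, 12, 768))
```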

Step S4: at the feature learning layer, fuse the local context features $O^l$ and the global context features $O^g$ into a fusion vector, and extract the features $O^{lg}_{THA}$ of the fusion vector with the talking-heads attention mechanism.

Specifically, the features of the fusion vector are $O^{lg}_{THA} = THA(O^{lg}_{dense})$, where $O^{lg}_{dense} = W^{lg} \cdot O^{lg} + b^{lg}$ is a fully connected layer, $O^{lg} = [O^l; O^g]$, $W^{lg}$ is a weight coefficient matrix, $b^{lg}$ is a bias vector, and $d_n$ and $d_h$ denote the numbers of rows and columns of $O^l$ and $O^g$ respectively. A sketch of this layer follows.
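For illustration only, a minimal sketch of the feature learning layer in PyTorch, reusing the TalkingHeadsAttention sketch above and assuming $O^l$ and $O^g$ share the shape (batch, seq_len, hidden); names are illustrative.

```python
# A minimal sketch of step S4: concatenate O^l and O^g, project them with a
# fully connected layer (W_lg, b_lg), and apply THA to the result.
import torch
import torch.nn as nn

class FeatureLearningLayer(nn.Module):
    def __init__(self, hidden: int = 768, heads: int = 12):
        super().__init__()
        self.dense = nn.Linear(2 * hidden, hidden)        # W_lg, b_lg
        self.tha = TalkingHeadsAttention(hidden, heads)   # sketch defined above

    def forward(self, O_l: torch.Tensor, O_g: torch.Tensor) -> torch.Tensor:
        O_lg = torch.cat([O_l, O_g], dim=-1)   # O^lg = [O^l; O^g]
        return self.tha(self.dense(O_lg))      # O^lg_THA = THA(W_lg . O^lg + b_lg)
```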

Step S5: at the output layer, obtain the aspect-level sentiment analysis result from the features $O^{lg}_{THA}$ of the fusion vector.

Specifically, the output layer is a nonlinear layer; the features $O^{lg}_{THA}$ of the fusion vector are input to the nonlinear layer and the prediction is made with the softmax function: $\hat{y} = \mathrm{softmax}(W_o \cdot O^{lg}_{THA} + b_o)$, where $\hat{y}$ is the analysis result, $W_o$ is a weight matrix, and $b_o$ is a bias vector. A sketch follows.
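For illustration only, a minimal sketch of the output layer in PyTorch; pooling the [CLS] position as the sentence representation is an assumption not specified in the text.

```python
# A minimal sketch of step S5's output layer: a linear map (W_o, b_o) followed
# by a softmax over the three polarity classes.
import torch
import torch.nn as nn

class OutputLayer(nn.Module):
    def __init__(self, hidden: int = 768, num_classes: int = 3):
        super().__init__()
        self.proj = nn.Linear(hidden, num_classes)    # W_o, b_o

    def forward(self, O_lg_tha: torch.Tensor) -> torch.Tensor:
        cls = O_lg_tha[:, 0]                          # assumed [CLS] pooling
        return torch.softmax(self.proj(cls), dim=-1)  # y_hat
```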

Experimentally, the present invention verifies the validity of the model using the Restaurant14 and Laptop14 datasets and the Twitter dataset. In the review texts of these three public datasets, the aspect words in each sentence correspond to three different sentiment polarities; the data details are shown in Table 1.

Table 1. Dataset distribution

Some hyperparameter settings of the LCFTHA model: the BERT-layer learning rate is 0.00002, the number of training iterations is 20, the batch size is 32, the maximum sentence length is 85, the number of talking-heads attention heads is 12, the L2 regularization weight is 0.01, and the BERT model is bert-base-uncased (an open-source BERT pre-trained model).
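For reference, the reported hyperparameters collected into a single configuration sketch; the key names are illustrative assumptions.

```python
# The reported LCFTHA hyperparameters as one config dict.
config = {
    "bert_model": "bert-base-uncased",
    "learning_rate": 2e-5,        # BERT-layer learning rate
    "epochs": 20,
    "batch_size": 32,
    "max_seq_len": 85,
    "attention_heads": 12,        # talking-heads attention heads
    "l2_weight": 0.01,            # L2 regularization weight
    "srd_threshold": {"Restaurant14": 2, "Laptop14": 7, "Twitter": 7},
}
```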

The present invention selects seven baseline models for comparative experiments to verify the validity of the LCFTHA model. Two evaluation metrics commonly used in aspect-level sentiment analysis tasks are adopted. The first is accuracy, the proportion of correctly predicted samples (whether positive or negative) among all samples: $\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$,

where TP denotes correctly predicted positive samples, TN correctly predicted negative samples, FP incorrectly predicted positive samples, and FN incorrectly predicted negative samples.

The second is the macro-averaged F1 value, $\mathrm{MF1} = \frac{1}{C}\sum_{c=1}^{C} \frac{2 P_c R_c}{P_c + R_c}$, where $P$ denotes the precision for a sentiment category, $R$ the recall for a sentiment category, and $C$ the total number of sentiment categories.
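For illustration only, both metrics can be computed with scikit-learn, whose accuracy_score and f1_score (with average="macro") match the accuracy and MF1 definitions above; the label arrays are illustrative.

```python
# A minimal sketch of the two evaluation metrics, assuming integer class labels.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 1, 0]   # illustrative gold polarities
y_pred = [0, 1, 1, 1, 0]   # illustrative model predictions

acc = accuracy_score(y_true, y_pred)
mf1 = f1_score(y_true, y_pred, average="macro")
```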

Experiments on the three public datasets Restaurant14, Laptop14, and Twitter, against the seven baseline models, give the results shown in Table 2.

Table 2. Model accuracy and macro average (%)

On all three public datasets (Restaurant14, Laptop14, and Twitter), the LCFTHA model proposed here outperforms the baseline models in both accuracy and MF1 value, with a large improvement in accuracy over the other models.

In the LCFTHA model, the semantic relative distance is an important factor affecting local context feature extraction, so the present invention analyzes its influence on the three datasets experimentally. As Fig. 4 and Fig. 5 show, the Restaurant14 dataset reaches its best analysis accuracy and MF1 value (MF1 denotes the model's average performance across sentiment categories) when the semantic relative distance threshold is 2. For the Laptop14 dataset, accuracy and MF1 are best when the threshold is 7. For the Twitter dataset the optimal threshold is the same as for Laptop14: accuracy and MF1 are best at a threshold of 7.

Finally, the present invention performs ablation experiments on the three public datasets to verify the importance of each module in the LCFTHA model, where w/o abbreviates "without" and FLL abbreviates "feature learning layer"; the results are shown in Table 3.

Table 3. Ablation experiment results (%)

Table 3 shows that combining the global and local module components works better than a single local or global module. On this basis, adding the FLL clearly improves performance further; the experimental results show that the FLL layer matters greatly to the model's design. The importance of each model component is shown more intuitively as bar charts in Fig. 6 and Fig. 7.

In summary, the present invention proposes LCFTHA, an aspect-level sentiment analysis model based on a local context focus mechanism and a talking-heads attention mechanism. First, two BERT pre-trained models extract local and global context features; then the preliminary local features extracted by BERT are passed through the CDM layer of the local context focus mechanism, combined with the talking-heads attention mechanism, to further capture local context features, while a talking-heads attention encoder is deployed to learn the global context features; finally, the feature learning layer fuses the local and global context features and feeds them to the nonlinear layer for sentiment analysis. Experiments on the three public datasets Restaurant14, Laptop14, and Twitter demonstrate that the LCFTHA model outperforms the baseline models on the aspect-level sentiment analysis task.

The above is only a preferred embodiment of the present invention and therefore cannot limit the scope of its implementation; equivalent changes and modifications made according to the scope of the patent claims and the content of the specification shall all remain within the scope covered by the patent of the present invention.

Claims (5)

1. Aspect-level sentiment analysis based on a local context focus mechanism and talking-heads attention, characterized by comprising the following steps:

Step S1: construct an analysis model comprising a BERT pre-training layer, a feature extraction layer, a feature learning layer, and an output layer;

Step S2: the BERT pre-training layer processes the corpus to be analyzed into a local context form sequence and a global context form sequence, and models the words of each sequence separately, yielding preliminary local context features $B^l$ and preliminary global context features $B^g$;

Step S3: at the feature extraction layer, use the local context focus mechanism, combining the context-features dynamic masking (CDM) technique with the talking-heads attention mechanism, to further extract local context features, and use the talking-heads attention mechanism to extract global context features, specifically including:

Step S31: compute the semantic relative distance of the local context form sequence as $D_i = |i - F_a| - n/2$, where $i$ is the position of a word in the local context form sequence, $F_a$ is the position of the aspect word in the local context form sequence, and $n$ is the length of the aspect word in the local context form sequence;

Step S32: use the CDM technique to help the model capture local context features, obtaining the local context features $O^l_{CDM} = B^l \cdot M$, where $M = [V_1, V_2, \ldots, V_n]$ is a masking matrix for screening out non-local context features, $V_i$ is the mask vector of the $i$-th context word in the local context form sequence ($V_i = E$ if $D_i \le a$, otherwise $V_i = O$), $i = 1, 2, \ldots, n$, $a$ is the semantic relative distance threshold, $E$ is the all-ones vector of length $n$, and $O$ is the all-zeros vector of length $n$;

Step S33: use the talking-heads attention mechanism to further extract the local context features, $O^l = THA(O^l_{CDM})$, and to extract the global context features, $O^g = THA(B^g)$;

Step S4: at the feature learning layer, fuse the local context features $O^l$ and the global context features $O^g$ into a fusion vector, and extract the features $O^{lg}_{THA}$ of the fusion vector with the talking-heads attention mechanism;

Step S5: at the output layer, obtain the aspect-level sentiment analysis result from the features $O^{lg}_{THA}$ of the fusion vector.

2. The aspect-level sentiment analysis based on a local context focus mechanism and talking-heads attention according to claim 1, characterized in that in step S2, preprocessing the corpus to be analyzed into a local context form sequence and a global context form sequence specifically means processing the corpus into the local context form sequence $X^l$ = [CLS] + context + [SEP] and the global context form sequence $X^g$ = [CLS] + context + [SEP] + aspect word + [SEP], where [CLS] can serve as the semantic representation of the whole sentence.

3. The aspect-level sentiment analysis based on a local context focus mechanism and talking-heads attention according to claim 2, characterized in that in step S2, the BERT pre-training layer models $X^l$ with a first BERT training model BERT$^l$ to obtain $B^l$ and models $X^g$ with a second BERT training model BERT$^g$ to obtain $B^g$, the first BERT training model BERT$^l$ and the second BERT training model BERT$^g$ being independent of each other.

4. The aspect-level sentiment analysis based on a local context focus mechanism and talking-heads attention according to claim 1, 2 or 3, characterized in that in step S4, the features of the fusion vector are $O^{lg}_{THA} = THA(O^{lg}_{dense})$, where $O^{lg}_{dense} = W^{lg} \cdot O^{lg} + b^{lg}$ is a fully connected layer, $O^{lg} = [O^l; O^g]$, $W^{lg}$ is a weight coefficient matrix, and $b^{lg}$ is a bias vector.

5. The aspect-level sentiment analysis based on a local context focus mechanism and talking-heads attention according to claim 1, 2 or 3, characterized in that in step S5, the output layer is a nonlinear layer; the features $O^{lg}_{THA}$ of the fusion vector are input to the nonlinear layer and the prediction is made with the softmax function: $\hat{y} = \mathrm{softmax}(W_o \cdot O^{lg}_{THA} + b_o)$, where $\hat{y}$ is the analysis result, $W_o$ is a weight matrix, and $b_o$ is a bias vector.
CN202310548728.7A 2023-05-16 2023-05-16 Aspect-level emotion analysis based on local context focus mechanism and conversational attention Pending CN116541579A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310548728.7A CN116541579A (en) 2023-05-16 2023-05-16 Aspect-level emotion analysis based on local context focus mechanism and conversational attention

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310548728.7A CN116541579A (en) 2023-05-16 2023-05-16 Aspect-level emotion analysis based on local context focus mechanism and conversational attention

Publications (1)

Publication Number Publication Date
CN116541579A true CN116541579A (en) 2023-08-04

Family

ID=87450334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310548728.7A Pending CN116541579A (en) 2023-05-16 2023-05-16 Aspect-level emotion analysis based on local context focus mechanism and conversational attention

Country Status (1)

Country Link
CN (1) CN116541579A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117891900A (en) * 2024-03-18 2024-04-16 腾讯科技(深圳)有限公司 Text processing method and text processing model training method based on artificial intelligence


Similar Documents

Publication Publication Date Title
Yang et al. Image-text multimodal emotion classification via multi-view attentional network
Ren et al. Sarcasm detection with sentiment semantics enhanced multi-level memory network
CN113255755B (en) Multi-modal emotion classification method based on heterogeneous fusion network
Ding et al. DialogueINAB: an interaction neural network based on attitudes and behaviors of interlocutors for dialogue emotion recognition
Wang et al. Sentiment analysis from Customer-generated online videos on product review using topic modeling and Multi-attention BLSTM
CN109766557A (en) A sentiment analysis method, device, storage medium and terminal device
CN114648031A (en) Text aspect level emotion recognition method based on bidirectional LSTM and multi-head attention mechanism
CN113449204A (en) Social event classification method and device based on local aggregation graph attention network
CN111368082A (en) Emotion analysis method for domain adaptive word embedding based on hierarchical network
Gandhi et al. Multimodal sentiment analysis: review, application domains and future directions
Wei et al. GP-GCN: Global features of orthogonal projection and local dependency fused graph convolutional networks for aspect-level sentiment classification
CN113987167A (en) Aspect-level sentiment classification method and system based on dependency-aware graph convolutional network
CN115906816A (en) Text emotion analysis method of two-channel Attention model based on Bert
Gan et al. DHF-Net: A hierarchical feature interactive fusion network for dialogue emotion recognition
Zhang et al. DeepBlue: Bi-layered LSTM for tweet popUlarity Estimation
Chaudhuri Visual and text sentiment analysis through hierarchical deep learning networks
Sun et al. Rumour detection technology based on the BiGRU_capsule network
Chaurasia et al. A multi-model attention based CNN-BiLSTM model for personality traits prediction based on user behavior on social media
CN116541579A (en) Aspect-level emotion analysis based on local context focus mechanism and conversational attention
Xu et al. Research on depression tendency detection based on image and text fusion
Wang et al. Multimodal sentiment analysis based on multiple attention
Lei et al. Multimodal sentiment analysis based on composite hierarchical fusion
CN114443846A (en) A classification method, device and electronic device based on multi-level text heterogeneous graph
CN113191135A (en) Multi-category emotion extraction method fusing facial characters
CN118585883A (en) Multimodal sarcasm detection method based on common sense collaborative perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination