CN115169429A - Lightweight aspect-level text emotion analysis method - Google Patents

Lightweight aspect-level text emotion analysis method

Info

Publication number
CN115169429A
CN115169429A (application CN202210390699.1A)
Authority
CN
China
Prior art keywords
model
emotion
context
meta
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210390699.1A
Other languages
Chinese (zh)
Inventor
Cao Xiaopeng (曹小鹏)
Liang Hao (梁浩)
Wang Kaili (王凯丽)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an University of Posts and Telecommunications
Original Assignee
Xi'an University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an University of Posts and Telecommunications
Priority to CN202210390699.1A
Publication of CN115169429A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/10 Text processing
    • G06F 40/12 Use of codes for handling textual entities
    • G06F 40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a lightweight aspect-level text emotion analysis method, which addresses the problem that DNN-based methods only analyze the correlation between the global context and emotion polarity before recognizing the emotion polarity of a target aspect from global context features. The technical scheme mainly comprises the following steps: (1) input the data set, after data cleaning, into a DistilRoBERTa pre-training model; (2) feed the vectorized word vectors into an SRU++ feature extraction network; (3) set the SRD threshold according to the data set and extract local context features; (4) perform interactive learning on the local and global context features with a multi-head attention mechanism, and predict the emotion polarity with a Softmax function to obtain the probability distribution over the emotion categories of the corresponding aspect words. The method is mainly applied to text emotion analysis.

Description

Lightweight aspect-level text emotion analysis method
Technical Field
The invention belongs to the field of computer natural language processing, and particularly relates to a lightweight aspect-level text sentiment analysis method.
Background
In the big-data era, the digital information exchanged on the Internet grows exponentially, and user comments account for a considerable share of it. However, analyzing user information from large-scale comment data by hand is time-consuming and labor-intensive, so large-scale extraction and analysis by computer has gradually become the mainstream means of information analysis.
Emotion analysis obtains the viewpoints, emotions and attitudes expressed by users by analyzing and extracting the emotional information in their comments. According to the granularity of the text studied, emotion analysis can be divided into document-level, sentence-level and aspect-level emotion classification. Aspect-based sentiment analysis (ABSA), with its fine text granularity, can more accurately judge the emotion polarities of different aspects within a sentence and has become an important research direction in the field of emotion analysis [1]. For example, in the sentence "While the food is good, the waiting can be a nightmare," the emotion toward "food" is positive while the emotion toward "waiting" is negative. Since the two aspect terms express opposite emotions, assigning a single sentence-level emotion polarity is inappropriate; analyzing the sentence as a whole cannot accurately extract the user's emotional information about the various aspects and attributes of a product. This problem therefore requires fine-grained emotion analysis, i.e., aspect-level emotion analysis, which mines the emotional information of different aspects of the comment text.
The first layer of an aspect-level emotion analysis model is the word embedding layer, whose purpose is to map each input word into a low-dimensional vector; choosing an embedding tool appropriate to the downstream task is important. The mainstream Word2vec embedding model comes in two variants, skip-gram and Continuous Bag of Words (CBOW). The word vectors it produces can express the similarity between different words but do not incorporate the overall context semantics. A pre-training model, obtained from large-scale data through self-supervised learning independently of any specific task, better reflects the semantic representation of a word in a specific context. The GloVe embeddings, trained on a large-scale corpus, construct vectors from co-occurrence information among words; in practice, however, GloVe word vectors also fail to incorporate context information effectively. The later ELMo and BERT models fuse context information during word embedding and effectively resolve polysemy. When Hoang et al. and Gao et al. applied BERT classification models to the aspect-level emotion analysis task, the classification results surpassed models built on other word embedding tools. The BERT model captures context information efficiently, and the word embeddings it generates effectively resolve polysemy. Song et al. proposed BERT-SPC based on BERT, which prepares the input sequence by appending the aspect to the context, treating the context and the aspect as two segments. Li et al. introduced a new approach named GBCN, which uses a gating mechanism with context-aware aspect embeddings to enhance and control the BERT representation for aspect-based sentiment analysis.
Long Short-Term Memory (LSTM) networks are well suited to serialized data, but using LSTM directly causes the aspect words to be ignored, yielding only the overall sentiment of the sentence; attention mechanisms were therefore introduced to this task. Wang et al. proposed AT-LSTM, which encodes the context with an LSTM to extract semantic information and then applies an attention mechanism to highlight the features that help discriminate emotion; however, it ignores the influence of the aspect words on emotion prediction. Wang et al. further proposed ATAE-LSTM on the basis of AT-LSTM, which concatenates the representations of the aspect words with the context and uses attention to model aspects and context jointly. Considering that an aspect term may be a phrase of several words and that none of the previous models modeled the aspect words separately, Ma et al. proposed IAN (Interactive Attention Networks), which models the aspect words and the context separately and links them through an attention mechanism.
Conventional DNN-based approaches focus only on analyzing the correlation between the global context and emotion polarity before identifying the emotion polarity of the target aspect from global context features. Zeng et al. proposed the Local Context Focus (LCF) model, which differs from the global-context approaches above. The LCF model uses the word distance in the sentence sequence as the Semantic-Relative Distance (SRD) to obtain a local context representation. The model observes that the emotion polarity of an aspect is affected more by context words close to it, while context words far from the aspect may even hurt the prediction accuracy for that aspect's polarity. In the example above, the opinion word "good" is closer to "food" in the sentence sequence and "nightmare" is closer to "waiting," so the farther a word is from an aspect, the less it influences that aspect's emotion polarity. However, the LCF baseline model uses a self-attention network for encoding, so it converges slowly during training, and on small data sets an efficient representation cannot be trained. In addition, the local and global context representations of LCF-BERT are obtained by two BERT (Bidirectional Encoder Representations from Transformers) models, which greatly increases the parameter count. For these reasons the LGLFF model is proposed.
Disclosure of Invention
The invention provides a multi-feature-fusion text emotion analysis method that can efficiently and accurately analyze the emotion polarity of sentences in a text. The technical scheme mainly comprises the following steps:
1. perform word embedding and encoding of the global context with a DistilRoBERTa pre-training model; 2. extract features with an SRU++ network to obtain global features; 3. adjust the Semantic-Relative Distance (SRD) threshold for each data set and mask the global context representation to obtain the local context representation; 4. model the global and local context features with multi-head attention for interactive learning; 5. after obtaining the interactive features learned by the multi-head attention mechanism, apply pooling for dimensionality reduction and aggregate the learned representations. Finally, a Softmax layer predicts the emotion polarity, yielding the probability distribution over the emotion categories of the corresponding aspect words.
The invention has the following effects: applying the method to the SemEval-2014 Laptop and Restaurant review data sets and the ACL-14 Twitter public social data set, the best accuracy and F1 on the Restaurant data set are 87.23% and 81.78%, on the Laptop data set 81.46% and 78.31%, and on the Twitter data set 77.11% and 76.2%, respectively. The emotion analysis effect is superior to that of traditional models.
Drawings
FIG. 1 model structure diagram
FIG. 2 Pre-training model structure diagram
FIG. 3 Pre-training model vectorized representation
FIG. 4 feature extraction network diagram
Detailed Description
The specific implementation of the invention is divided into five steps: 1. input the data set, after data cleaning, into the DistilRoBERTa pre-training model; 2. feed the vectorized word vectors into the SRU++ feature extraction network; 3. set the SRD threshold according to the data set and extract local context features; 4. perform interactive learning on the local and global context features with a multi-head attention mechanism; 5. predict the emotion polarity with a Softmax layer to obtain the probability distribution over the emotion categories of the corresponding aspect words. First, the pre-training model maps each input word into a low-dimensional vector; then features are extracted from these vectors; next, the SRD value is determined per data set; the local and global features are learned interactively; and finally the emotion polarity is predicted.
The structure of the method is shown in figure 1:
(1) Text vectorization
For the aspect-level emotion classification task, the input sequence prepared for the model generally consists of a context sequence and an aspect sequence, which lets the model learn the relevance of the context and the aspect. Let $s = \{w_0, w_1, \dots, w_n\}$ be the input context sequence containing the aspect, i.e., a sequence of n words that includes the aspect target, and let $s^t = \{w_i, \dots, w_{i+m-1}\}$ be the target aspect sequence, a subsequence of $s$ consisting of m (m ≥ 1) words.
The first layer of the LGLFF model is the input layer, which converts the context text sequence into a serialized vector representation; it consists mainly of a word embedding layer and an encoding layer. The LGLFF model uses the BERT-SPC input proposed by Song et al., a sentence-pair classification scheme for the pre-trained BERT model that prepares the input sequence by appending the aspect to the context, treating the context and the aspect as two segments. The global input sequence that BERT-SPC constructs for the ABSA task is "[CLS] + s + [SEP] + s^t + [SEP]".
DistilRoBERTa is a pre-training model designed with knowledge distillation on top of the Transformer architecture. Through knowledge distillation, much of the knowledge encoded in the large-scale teacher model can be transferred to a small-scale student model, reducing the model size; the DistilRoBERTa model thus speeds up inference and shrinks the parameter count while retaining accuracy.
The DistilRoBERTa pre-training model takes the sequence $w_1, w_2, w_3, \dots, w_n$ as input and outputs the vector representations $T_1, T_2, T_3, \dots, T_n$, where each vector $T_i$ corresponds to the word $w_i$ in the sequence. DistilRoBERTa learns the context information of each word in the input sequence with a Transformer encoder, which generates context embeddings using self-attention. The context embeddings extracted for the words are concatenated into a vector representing the semantic information of the input sequence.
The pre-training model structure is shown in fig. 2:
the pre-training process is shown in fig. 3:
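To make the vectorization step concrete, the following is a minimal sketch using the Hugging Face transformers library. The distilroberta-base checkpoint and the example sentence are illustrative assumptions; the patent does not name a specific checkpoint. A RoBERTa tokenizer uses <s>/</s> as its special tokens, so the pair encoding below mirrors, rather than literally reproduces, the BERT-SPC layout "[CLS] + s + [SEP] + s^t + [SEP]".

```python
# Minimal sketch: encode a context/aspect pair with DistilRoBERTa.
# The checkpoint name and the sentence are assumptions for illustration.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModel.from_pretrained("distilroberta-base")

context = "While the food is good, the waiting can be a nightmare."
aspect = "waiting"

# Pair encoding: context and aspect as two segments (BERT-SPC style input).
inputs = tokenizer(context, aspect, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector T_i per input token w_i.
token_embeddings = outputs.last_hidden_state  # shape: (1, seq_len, 768)
print(token_embeddings.shape)
```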
(2) Global feature extraction
The RNN model handles the temporal relationships in a sequence through a memory unit and is one of the most common models for sequence analysis. However, in conventional RNN models such as LSTM and GRU, the output state at the current time step can only be computed after the output state at the previous time step is fully available, so the computation cannot be parallelized. This dependency between adjacent time steps makes recurrent networks much slower than other models and limits their efficiency. The Simple Recurrent Unit (SRU) removes this limitation, and this model adopts SRU++, an improvement of the SRU, for feature extraction. SRU++ adds an attention mechanism to the SRU to better learn the dependencies between the current word and other words.
Part of the SRU++ computation is as follows:
$$f_t = \sigma(U[t,0] + v \odot c_{t-1} + b)$$
$$r_t = \sigma(U[t,1] + v' \odot c_{t-1} + b')$$
$$c_t = f_t \odot c_{t-1} + (1 - f_t) \odot U[t,2]$$
$$H^g = h_t = r_t \odot c_t + (1 - r_t) \odot x_t$$
where $U[t,0]$, $U[t,1]$ and $U[t,2]$ replace $Wx_t$, $W'x_t$ and $W''x_t$ of the SRU, improving the SRU at the code level to increase parallelism; $\sigma$ denotes the sigmoid function; $\odot$ denotes elementwise multiplication; $W$, $W'$, $v$ and $v'$ are learnable weight matrices; $b$ and $b'$ are bias values; $f_t$, $c_t$, $r_t$ and $h_t$ denote the forget gate, the hidden state at time t, the reset gate and the state output at time t, respectively; and $x_t$, the word vector input at time t, is a row of the word-vector matrix $T$. $H^g$ is the extracted global context feature representation. Since the simple recurrent unit no longer depends on the previous output $h_{t-1}$, the computation can be parallelized.
The process of feature extraction is as shown in fig. 4:
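To make the recurrence concrete, here is a minimal PyTorch sketch of the elementwise SRU computation defined by the equations above. In SRU++ the projections $U[t,\cdot]$ additionally pass through an attention sub-layer; in this sketch $U$ comes from a single linear map, so it shows the SRU core that SRU++ extends rather than the full module, and all dimensions are illustrative.

```python
import torch

def sru_layer(x, U, v, v2, b, b2):
    """x: (T, d) input word vectors; U: (T, 3, d) precomputed projections."""
    T, d = x.shape
    c = torch.zeros(d)                                # c_0
    h_out = []
    for t in range(T):                                # only elementwise ops here
        f = torch.sigmoid(U[t, 0] + v * c + b)        # forget gate f_t
        r = torch.sigmoid(U[t, 1] + v2 * c + b2)      # reset gate r_t
        c = f * c + (1 - f) * U[t, 2]                 # hidden state c_t
        h = r * c + (1 - r) * x[t]                    # output h_t (highway skip)
        h_out.append(h)
    return torch.stack(h_out)                         # H_g: global features (T, d)

T, d = 12, 8
x = torch.randn(T, d)
W = torch.randn(d, 3 * d)
U = (x @ W).reshape(T, 3, d)    # the only matrix multiply, parallel over all t
H_g = sru_layer(x, U, torch.randn(d), torch.randn(d),
                torch.zeros(d), torch.zeros(d))
```

Because the loop body never reads $h_{t-1}$ and the heavy matrix multiplication is hoisted out of the time loop, the per-step work is cheap elementwise arithmetic, which is the parallelism argument made above.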
(3) Local feature extraction
1. Semantic-relative distance: the model uses the semantic-relative distance (SRD) to determine whether a context word belongs to the local context of a particular aspect.
2. Context dynamic weighting (CDW): context features semantically related to the target aspect are preserved, while weakly related context features are attenuated by weighting; features of context words far from the target aspect are attenuated according to their SRD. For each semantically weakly related context word, the CDW constructs a weight vector $W_i$ to weight its features, and these vectors form a mask matrix $M$.
Given the mask matrix $M$ and the global context representation $H^g$, the local context representation $H^l$ is computed as $H^l = H^g \cdot M$, as sketched below.
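The sketch shows one way to build the SRD-based CDW mask, following the weighting commonly used in the LCF literature the patent builds on: tokens within the SRD threshold keep full weight and farther tokens are linearly attenuated. The exact weighting function and index conventions are assumptions, as the patent does not spell them out.

```python
import torch

def cdw_mask(n, aspect_start, aspect_len, srd_threshold, hidden):
    center = aspect_start + aspect_len // 2
    weights = torch.empty(n)
    for i in range(n):
        srd = abs(i - center) - aspect_len // 2           # semantic-relative distance
        if srd <= srd_threshold:
            weights[i] = 1.0                              # local context: keep fully
        else:
            weights[i] = 1.0 - (srd - srd_threshold) / n  # linear attenuation
    return weights.unsqueeze(1).expand(n, hidden)         # mask matrix M, (n, hidden)

n, hidden = 12, 8
H_g = torch.randn(n, hidden)              # global features from the SRU++ layer
M = cdw_mask(n, aspect_start=5, aspect_len=1, srd_threshold=3, hidden=hidden)
H_l = H_g * M                             # H_l = H_g · M (elementwise weighting)
```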
(4) Feature interactive learning
The model uses multi-head self-attention for emotion classification over the extracted features; the multiple heads are in essence several independent self-attention computations, which lets the model learn relevant information in different representation subspaces. The global context representation $H^g$ and the local context representation $H^l$ obtained through the word embedding and feature extraction layers are concatenated, and interactive learning via multi-head self-attention extracts the aspect-relevant features in the context while suppressing useless features.
$$O_d = W_d([H^l; H^g]) + b_d, \qquad O_m = \mathrm{MHSA}(O_d)$$
where $W_d$ and $b_d$ are the parameter matrix and bias vector of the linear layer, respectively.
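A minimal PyTorch sketch of this interaction step: concatenate the local and global representations, project them with a linear layer (the $W_d$ and $b_d$ above), and run multi-head self-attention. The head count and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

n, hidden = 12, 8
H_g = torch.randn(1, n, hidden)           # global context features (batch of 1)
H_l = torch.randn(1, n, hidden)           # local (CDW-weighted) context features

dense = nn.Linear(2 * hidden, hidden)     # O_d = W_d([H_l ; H_g]) + b_d
mhsa = nn.MultiheadAttention(embed_dim=hidden, num_heads=2, batch_first=True)

O_d = dense(torch.cat([H_l, H_g], dim=-1))
O_m, _ = mhsa(O_d, O_d, O_d)              # O_m = MHSA(O_d)
print(O_m.shape)                          # (1, n, hidden)
```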
(5) Emotional polarity prediction
After the interactive features learned by the multi-head attention mechanism are obtained, pooling is applied to reduce dimensionality and aggregate the learned representations. Finally, a Softmax layer predicts the emotion polarity, yielding the probability distribution over the emotion categories of the corresponding aspect words; the category with the maximum probability is the emotion category of the aspect.
$$Y = \mathrm{softmax}(O_m)_k = \frac{\exp(O_{m,k})}{\sum_{j=1}^{K} \exp(O_{m,j})}$$
where K is the number of classification labels (three here) and Y is the emotion polarity prediction output by the model.
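The prediction head can be sketched as follows. Mean pooling is assumed here as the dimension-reducing pooling step, since the patent does not specify the pooling variant; a linear layer then projects to the K = 3 polarity classes and Softmax yields the class probabilities.

```python
import torch
import torch.nn as nn

K, n, hidden = 3, 12, 8
O_m = torch.randn(1, n, hidden)           # output of the interaction layer

pooled = O_m.mean(dim=1)                  # pooling: (1, n, hidden) -> (1, hidden)
classifier = nn.Linear(hidden, K)
Y = torch.softmax(classifier(pooled), dim=-1)   # probabilities over K classes
pred = Y.argmax(dim=-1)                   # max-probability class = predicted polarity
print(Y, pred)
```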
Evaluation metrics:
To verify the performance of the proposed model, classification accuracy (Acc) and the F1 score are used for evaluation; they are computed as follows.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
where TP means both the predicted label and the true label are positive; FP means the predicted label is positive and the true label negative; TN means both are negative; FN means the predicted label is negative and the true label positive; Precision denotes precision and Recall denotes recall.
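As a sketch, the same statistics can be computed with scikit-learn; the label vectors are made-up illustrations (0/1/2 for negative/neutral/positive), and macro averaging is assumed for the multi-class F1.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [2, 0, 1, 2, 2, 0, 1]            # made-up gold labels
y_pred = [2, 0, 1, 1, 2, 0, 0]            # made-up predictions

acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average="macro")   # macro-averaged F1 over 3 classes
print(f"Acc = {acc:.4f}, F1 = {f1:.4f}")
```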
Table 1: results of the experiment
(Table 1 lists the accuracy (Acc) and F1 of the baseline models and the proposed LGLFF model on the Restaurant, Laptop and Twitter data sets; the figures discussed below are drawn from it.)
Compared with LSTM, AT-LSTM improves accuracy (Acc) by 2.23%, 2.47% and 1.91% and F1 by 1.09%, 2.30% and 1.87% on the three public data sets, respectively, showing that the attention mechanism's further extraction of important emotion features from the context effectively improves the model's emotion judgment. ATAE-LSTM connects the aspect-word encoding with the context encoding on the basis of AT-LSTM, preliminarily associating aspect-word information with the context; it raises Acc on the Rest14 and Twitter data sets by 0.24% and 0.69%, respectively, but a simple connection cannot reflect the effect of fine-grained aspect words on emotion prediction. The IAN model further considers the interaction between aspect words and targets; compared with ATAE-LSTM it improves Acc by 3.85%, 3.35% and 2.06% and F1 by 4.28%, 3.45% and 3.07% on the three public data sets. Fusing the interaction information between context and aspect words into the model yields a quite obvious improvement, which again reflects the importance of that interaction information. For BERT-SPC and BERT-PT, the introduction of external knowledge largely preserves the context semantic information during text vectorization; with BERT-SPC's sentence-pair framing of ABSA, the context semantics are better retained during word-vector generation, improving Acc by 1.03% and 2.18% and F1 by 1.83% and 2.33% on the three public data sets, respectively. The LCF-BERT model introduces external knowledge and an attention mechanism and also proposes the Local Context Focus (LCF) mechanism, which effectively improves the capture of locally important semantic information. After the global and local context features are obtained, the feature-interaction learning layer uses a multi-head attention mechanism to learn the aspect-related context features effectively, i.e., it emphasizes the aspect-related local context without ignoring emotion information beyond the SRD threshold.
In conclusion, the LGLFF model is effective for the aspect-level emotion analysis task: the DistilRoBERTa module's vectorized representation and context-semantics extraction, the SRU++ module's feature extraction, and the learning of the interaction between local/global context and aspect terms all contribute to improving emotion judgment.
The above examples merely illustrate the present invention and should not be construed as limiting its scope; any design similar or equivalent to the present invention falls within the protection scope of the claims of this application.

Claims (1)

1. A multi-model dynamic collaborative semantic matching method, characterized by comprising the following steps:
(1) Define the meta-model and domain models: according to model scale and processing task, define a unique meta-model and several domain models, where domain models can be dynamically added and configured;
(2) Multi-model pre-training: train word vectors separately for different tasks; pre-train the meta-model with a general-domain data set to obtain general word vectors, and pre-train each domain model with its domain data set to obtain domain word vectors;
(3) Compute the text similarity of each model: the meta-model and the domain models each obtain their matching similarity through the MatchPyramid model;
(4) Set collaboration rules and compute the text similarity: combine the matching similarities of the meta-model and the domain models according to the set rules to compute the final text similarity.
CN202210390699.1A 2022-04-14 2022-04-14 Lightweight aspect-level text emotion analysis method Pending CN115169429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210390699.1A CN115169429A (en) 2022-04-14 2022-04-14 Lightweight aspect-level text emotion analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210390699.1A CN115169429A (en) 2022-04-14 2022-04-14 Lightweight aspect-level text emotion analysis method

Publications (1)

Publication Number Publication Date
CN115169429A true CN115169429A (en) 2022-10-11

Family

ID=83483369

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210390699.1A Pending CN115169429A (en) 2022-04-14 2022-04-14 Lightweight aspect-level text emotion analysis method

Country Status (1)

Country Link
CN (1) CN115169429A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561592A (en) * 2023-07-11 2023-08-08 航天宏康智能科技(北京)有限公司 Training method of text emotion recognition model, text emotion recognition method and device
CN116561592B (en) * 2023-07-11 2023-09-29 航天宏康智能科技(北京)有限公司 Training method of text emotion recognition model, text emotion recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination