CN113705197A - Fine-grained emotion analysis method based on position enhancement - Google Patents
Fine-grained emotion analysis method based on position enhancement
- Publication number
- CN113705197A (application number CN202111000430.XA)
- Authority
- CN
- China
- Prior art keywords
- word
- layer
- vector
- words
- fine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a fine-grained emotion analysis method based on position enhancement, which addresses the low accuracy of fine-grained emotion analysis of text in the prior art. The text is first preprocessed and then analyzed by a fine-grained emotion analysis model comprising an embedding layer, a semantic representation layer, an information interaction layer, and an output layer. The embedding layer maps sentences into context word embeddings and aspect word embeddings; the semantic representation layer strengthens the model's text semantic representation through a position-enhanced attention mechanism; the information interaction layer strengthens the interaction between aspect words and their contexts through a memory network, using the aspect-based semantically enhanced context representation as the external memory unit that interacts with the aspect, so that the external memory unit can learn semantic information in complex text; finally, the output layer predicts the emotion. By reasonably delimiting the context range that expresses emotion toward an aspect, the invention improves the accuracy of fine-grained emotion analysis.
Description
Technical Field
The invention belongs to the technical field of information processing, and relates to a fine-grained emotion analysis method based on position enhancement.
Background
The rapid development of social networks and e-commerce platforms allows people to voice opinions online more conveniently, generating a large amount of text data containing user emotion information of great practical value. Because a product has multidimensional attributes, consumers comment on products and services from different angles, such as quality, price, and service. Traditional text emotion analysis generally produces a single overall emotion judgment and cannot judge the emotional tendency toward different aspects mentioned in a review, so the granularity of text emotion analysis needs to be refined. For example, for the sentence "The food in this restaurant is delicious, but the service quality is poor.", the fine-grained emotion analysis task aims to judge the emotion tendencies (food, positive) and (service, negative) of the aspects "food" and "service". In this example, "food" and "service" are called aspects, and the remaining text that does not belong to any aspect is referred to as the context. This task is a popular research direction in the field of natural language processing; it benefits consumer choice and enterprise product decisions, and has broad commercial prospects and application value.
As deep learning technology has matured, it has been applied effectively to text emotion analysis. Research on fine-grained text emotion analysis has focused mainly on improving the basic structure of neural networks. Thanks to the attention mechanism, deep learning models have gained stronger text representation and data processing capabilities and have made good progress on fine-grained emotion analysis tasks. However, building text feature representations for specific aspects remains challenging due to the complexity of linguistic structure and implicit expressions of emotional attributes.
Disclosure of Invention
To solve the problem of low accuracy in fine-grained emotion analysis of text in the prior art, the invention provides a fine-grained emotion analysis method based on position enhancement. The method preprocesses the text, constructs word vectors, and performs emotion analysis with a fine-grained emotion analysis model consisting of an embedding layer, a semantic representation layer, an information interaction layer, and an output layer. First, the embedding layer maps the sentences of the text into context word embedding and aspect word embedding representations according to the word vectors. Second, the semantic representation layer enhances the model's text semantic representation capability through a position-enhanced attention mechanism. Third, the information interaction layer enhances the interaction between aspect words and their contexts through a memory network, taking the aspect-specific semantically enhanced context representation as the external memory unit that interacts with the aspect, so that the external memory unit can learn semantic information in complex text. Finally, the output layer predicts the emotion of the specific aspect words.
To achieve this purpose, the invention adopts the following technical scheme:
A fine-grained emotion analysis method based on position enhancement comprises the following steps:
step 1 text preprocessing
(1) Case conversion: all upper-case letters are converted to lower-case letters.
(2) Word segmentation: the text data set is segmented into words using a general-purpose word segmentation module.
(3) Removing stop words: words with no practical meaning are removed from the text data.
(4) Constructing a position weight matrix M for the text data using a masking mechanism, with the weight computed from the positions of words in the sentence:

M_ij = 1 - |i - j| / h_max, if |i - j| ≤ h_max / 2; otherwise M_ij = 0 (formula 1)

where h_max is the maximum length of the input sentence. The masking mechanism computes the position weight from the relative position of the aspect word and the context word in the sentence; M_ij denotes the position weight of the word pair centered on word w_i, and i and j are the position indices of the words. Word pairs within a distance of h_max/2 are weighted according to their distance; otherwise M_ij is set to 0.
Step 2, constructing word vectors
Each word of the preprocessed text data is mapped into a vector space, so that each word corresponds to a vector of dimension d. For each word in the text data, if the word exists in the pre-trained word vector table, the corresponding vector in the table is used as its word vector; if not, a vector randomly initialized from a normal distribution is used.
Step 3, constructing a fine-grained emotion analysis model and using it to perform aspect-level fine-grained emotion prediction on the text to be analyzed. The model is composed as follows:
3.1 embedding layer
Each sentence in the text data contains one or more aspects, and each aspect has a corresponding emotion value. According to the word vectors constructed in step 2, each sentence is mapped into low-dimensional dense word embedding representations. A sentence consists of aspects and their context, where the context refers to the parts of the sentence that do not belong to any aspect; correspondingly, the word embedding representation is divided into context word embedding and aspect word embedding. If an aspect is composed of multiple words, the word embedding vectors of those words are average-pooled as the vector representation of the aspect.
3.2 semantic representation layer
The semantic representation layer extracts a high-level abstract representation of the text. It is composed of Blocks connected in series and obtains deeper abstract features H of the text through continuous iterative computation. Each Block combines a position weight fusion mechanism and a feedforward neural network through residual connections and layer normalization. The output of each of the two submodules in a single Block can be expressed formally as follows:
output = LayerNorm(x + Sublayer(x)) (formula 2)
where LayerNorm(·) is layer normalization and Sublayer(·) is the function implemented by the submodule itself. Each Block is designed in detail as follows:
(1) A self-attention mechanism is used. The context word embedding E is linearly mapped into three different spaces to obtain the corresponding query matrix Q, key matrix K, and value matrix V, and a key-value-pair attention mechanism computes the interdependencies between context word pairs from the query and key vectors. The i-th word embedding e_i (i = 1, 2, …, n) in E is mapped to a query vector q_i, a key vector k_i, and a value vector v_i, where n denotes the number of context words. The linear mapping process is expressed as follows:

Q = E W_q, K = E W_k, V = E W_v (formula 3)

where W_q, W_k, and W_v are the parameter matrices of the linear mappings, and Q, K, and V are the matrices formed by the query vectors q_i, key vectors k_i, and value vectors v_i, respectively.
(2) A position weight fusion mechanism is used. In the self-attention mechanism, weight enhancement and weight masking are applied to the relevance weight matrix through the position weight matrix M, strengthening information highly relevant to the aspect words and weakening the influence of irrelevant or erroneous emotion information. Whether a weight is enhanced or masked is measured by the distance of the context word from the aspect.
First, each query vector q_i is scored against each key vector k_j with a compatibility function f, yielding the association score matrix S ∈ R^(n×n) between word pairs. The association score S_ij of query vector q_i and key vector k_j is expressed as follows:

S_ij = f(q_i, k_j) = w^T σ(W q_i + V k_j), i, j ∈ {1, 2, …, n} (formula 4)

where W, V, and w are parameters to be trained, σ is the Sigmoid activation function, and n is the number of context words.
The position weight M_ij is then added to the association score f(q_i, k_j) to strengthen or weaken the influence of the context on the aspect, and the value vectors v_i are aggregated to extract the features head = [h_1, h_2, …, h_n]:

A_m = softmax(S + βM) (formula 5)

h_i = Σ_{j=1}^{n} A_m[i, j] v_j (formula 6)

where A_m ∈ R^(n×n) is the position-fused weight matrix constructed from S and βM, M_ij is the position weight of the word pair, β is the expansion coefficient of the position weight, and n is the number of context words.
(3) A multi-head self-attention mechanism is used to capture different aspects of the textual information. The calculation is as follows:

LF-MultiHead(Q, K, V) = Concat(head_1, …, head_h) W_O (formula 7)

where Concat(·) denotes vector concatenation, W_O is a linear compression transformation that compresses the matrix formed by the multi-head attention mechanism back to the original dimension, h is the number of heads, and head_i is the feature extracted by the self-attention mechanism of the i-th head;
(4) Residual connection and layer normalization operations are used. The residual connection takes the context word embedding as input, fuses it with the result of the multi-head self-attention mechanism, and applies layer normalization to obtain the output of the position weight fusion mechanism.
(5) The output of the position weight fusion mechanism is taken as the input of the feedforward neural network layer, and after another residual connection and layer normalization the result is the final output of the Block. The feedforward neural network is implemented as a fully connected layer with a Relu activation function.
3.3 information interaction layer
A memory network is used for interaction, enhancing the relation between the abstract features H and the aspect and ensuring the interactivity of the aspect and its context. The memory network takes the abstract text features H = [h_1, h_2, …, h_n] obtained by the semantic representation layer as memory units. The first computation layer takes the aspect word embedding v_aspect and the weighted combination r of the memory units as initial input; the output of each layer serves as the input of the next computation layer, and the layers are computed iteratively in turn. The weighted combination formula is as follows:

r = Σ_{i=1}^{n} α_i h_i (formula 8)

where n is the memory size and α_i ∈ [0, 1] is the weight of memory unit h_i, with Σ_{i=1}^{n} α_i = 1. The weight α_i reflects the semantic relevance of the context to the aspect, computed with a feedforward neural network as follows:

ω_i = tanh(W_att[h_i; v_aspect] + b_att) (formula 9)

where W_att ∈ R^(1×2n) is a parameter vector and b_att ∈ R^(1×1) is the bias; α_i, the weight assigned to memory unit h_i, is obtained by normalizing the scores ω_i so that they sum to 1.
3.4 output layer
After the aspect information has interacted with the memory units several times, the resulting final representation serves as the emotion feature for the specific aspect and is fed into the softmax function to obtain the emotion distribution and predict the aspect emotion.
The model needs to be trained before prediction, as follows:
the fine-grained emotion analysis model is trained by minimizing the cross-entropy loss function and the L2 regularization term, with the entire loss function optimizing the model parameters by gradient descent. The loss function is as follows:
wherein D is a training data set, c represents the context of the sentence, e represents the aspect of the sentence, l represents the emotion label of the aspect, S is an emotion category set, ysFor one-hot codes generated according to emotion classes, fsAnd (c, e and theta) are prediction emotion distribution of the model, lambda is a regularization term coefficient, strength of control regularization is strong or weak, and theta is a model weight coefficient.
Advantageous effects
(1) The influence of position information on emotion prediction for a specific aspect is fully considered: the position-enhanced attention mechanism delimits the context range that expresses emotion toward an aspect, strengthening the model's aspect-specific text semantic representation. By reasonably delimiting this context range, the invention enhances the semantic representation and information interaction of the emotion analysis system and improves fine-grained emotion analysis accuracy.
(2) The aspect-specific semantically enhanced context representation serves as the external memory unit interacting with the aspect, overcoming the difficulty that a memory network whose external memory is built on single word vectors struggles to learn the semantic information of more complex text.
(3) The invention enables parallel computation during model training, improving training efficiency.
Drawings
FIG. 1 is a flow chart of a model structure;
FIG. 2 is a diagram of a location embedding model architecture;
FIG. 3 is a structure of a position-enhanced fine-grained sentiment analysis model.
Detailed Description
The following examples are intended to illustrate the present invention but are not intended to limit the scope of the invention.
To verify the effectiveness of the model, the invention uses the public laptop-domain and restaurant-domain data sets from the SemEval-2014 Task 4 semantic evaluation competition and the multi-aspect emotion data set MAMS as training corpora. The specific implementation steps are as follows:
step 1 text preprocessing
Firstly, text preprocessing is carried out on a training corpus, and the processing steps are as follows:
(1) Case conversion: all upper-case letters are converted to lower-case letters.
(2) Word segmentation: the text data set is segmented into words using a general-purpose word segmentation module.
(3) Removing stop words: words with no practical meaning are removed from the text data.
(4) Constructing a position weight matrix M: the weight is computed from the relative positions of words in the sentence:

M_ij = 1 - |i - j| / h_max, if |i - j| ≤ h_max / 2; otherwise M_ij = 0 (formula 1)

where h_max is the maximum length of the input sentence. The masking mechanism computes the position weight from the relative positions of the aspect words and the context words in the sentence; M_ij denotes the position weight of the word pair centered on word w_i, and i and j are the position indices of the words. Word pairs within a distance of h_max/2 are weighted according to their distance; otherwise M_ij is set to 0.
Take the sentence "The food in this restaurant is delicious, but the service is really poor." as an example of constructing the position weight matrix M. The sentence has length 13, with aspects "food" and "service". After removing stop words, the sentence is represented as s = {(food, 2), (restaurant, 5), (delicious, 7), (service, 10), (really, 12), (poor, 13)}. The position weight of the word "poor" relative to the aspect "service" is M_{10,13} = 1 - |10 - 13| / 13 = 0.769.
The aspect words and context words are already annotated in the training corpus; automatic recognition of aspect words is outside the research scope of the invention.
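To make the construction concrete, the following Python sketch (illustrative only, not part of the patent; the function name and 0-based indexing are choices of this sketch) builds M and reproduces the 0.769 weight from the example above.

```python
import numpy as np

def position_weight_matrix(h_max: int) -> np.ndarray:
    """Position weight matrix M of formula 1 (a sketch).

    M[i, j] = 1 - |i - j| / h_max if |i - j| <= h_max / 2, else 0.
    Indices are 0-based here; the example above counts from 1, which
    leaves the distance |i - j| unchanged.
    """
    M = np.zeros((h_max, h_max))
    for i in range(h_max):
        for j in range(h_max):
            if abs(i - j) <= h_max / 2:
                M[i, j] = 1.0 - abs(i - j) / h_max
    return M

M = position_weight_matrix(13)
# "service" at position 10 and "poor" at position 13 (1-based):
print(round(M[9, 12], 3))  # 0.769, matching the worked example
```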
Step 2: word vector construction
For the preprocessed training samples, each word is mapped into the vector space through GloVe word vectors, so each word corresponds to a vector of dimension d = 300. For each word in a training sample, if the word exists in the pre-trained word vector table, the vector in the table is used as its word vector; if not, the word vector is randomly initialized from the uniform distribution U(-0.25, 0.25).
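A minimal sketch of this lookup, assuming the pre-trained GloVe vectors have already been loaded into a Python dict mapping each word to its vector (the loading step and all names here are illustrative):

```python
import numpy as np

def build_word_vectors(vocab, glove, dim=300, seed=0):
    """Assign each vocabulary word a d=300 vector (a sketch).

    Words found in the pre-trained table keep their GloVe vector;
    out-of-vocabulary words get a U(-0.25, 0.25) random
    initialization, as described above.
    """
    rng = np.random.default_rng(seed)
    return {w: glove[w] if w in glove
            else rng.uniform(-0.25, 0.25, size=dim)
            for w in vocab}
```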
Step 3, constructing the fine-grained emotion analysis model and using the trained model to perform fine-grained emotion analysis on the text to be analyzed.
3.1 embedding layer
First, for a sentence s = [w_1, w_2, …, w_n], each word w_i (i = 1, 2, …, n) is mapped by the GloVe word embedding technique to a low-dimensional dense vector as its word embedding representation. The word embeddings are divided into the context word embedding E = [e_1, e_2, …, e_n] and the aspect word embedding v_aspect; if an aspect is composed of multiple words, the vector representations of those words are average-pooled as the vector representation of the aspect.
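For a multi-word aspect, the pooling is simply the element-wise mean of the word embeddings; in sketch form (names illustrative):

```python
import numpy as np

def aspect_embedding(word_vectors):
    """Average-pool the embeddings of a multi-word aspect (a sketch)."""
    return np.mean(np.stack(word_vectors), axis=0)

# e.g. v_aspect = aspect_embedding([vectors["service"], vectors["quality"]])
```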
3.2 semantic representation layer
The semantic representation layer is composed of 6 Blocks. Each Block connects two submodules in series, a position weight fusion mechanism and a feedforward neural network, each followed by a residual connection and layer normalization. The output of each sublayer can be expressed formally as follows:

output = LayerNorm(x + Sublayer(x)) (formula 2)

where LayerNorm(·) is layer normalization, Sublayer(·) is the function implemented by the sublayer itself, and x is the input feature.
The specific design steps of the Block are as follows:
First, the context word embedding E is linearly mapped into three different spaces; each word embedding e_i (i = 1, 2, …, n) yields the corresponding query vector q_i, key vector k_i, and value vector v_i. The linear mapping process is expressed as follows:

Q = E W_q, K = E W_k, V = E W_v (formula 3)

where W_q, W_k, and W_v are the parameter matrices of the linear mappings, and Q, K, and V are the matrices formed by the query vectors q_i, key vectors k_i, and value vectors v_i, respectively.
Second, position weight fusion is performed. Weight enhancement and weight masking are added to the self-attention mechanism; whether a weight is enhanced or masked is measured by its distance from the aspect words, strengthening information highly relevant to the aspect words and weakening the influence of irrelevant or erroneous emotion information. First, each query vector q_i is scored against each key vector k_j with a compatibility function f, yielding the association score matrix S ∈ R^(n×n) between word pairs, expressed as follows:

S_ij = f(q_i, k_j) = w^T σ(W q_i + V k_j), i, j ∈ {1, 2, …, n} (formula 4)

where W, V, and w are parameters to be trained and σ is the Sigmoid activation function.
The position weight M_ij is then added to the association score f(q_i, k_j) to strengthen or weaken the influence of the context on the aspect, and the value vectors v_i are aggregated to extract the features head = [h_1, h_2, …, h_n]:

A_m = softmax(S + βM) (formula 5)

h_i = Σ_{j=1}^{n} A_m[i, j] v_j (formula 6)

where A_m ∈ R^(n×n) is the position-fused weight matrix, M_ij is the relative position weight of the word pair, and β is the expansion coefficient of the position weight, set to 10.
To capture different aspects of the textual information, the self-attention mechanism is expanded into a multi-head self-attention mechanism, expressed as follows:

LF-MultiHead(Q, K, V) = Concat(head_1, …, head_h) W_O (formula 7)

where Concat(·) denotes vector concatenation, W_O is a linear compression transformation, h is the number of heads, and head_i is the feature extracted by the self-attention mechanism of the i-th head;
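The following sketch puts formulas 3-6 together for a single attention head. It is illustrative: parameter shapes are assumptions, and the softmax over the position-fused scores is a reconstruction of the garbled formulas 5-6 rather than a verbatim quote of the patent. Formula 7's multi-head version would run h such heads with separate parameters, concatenate their outputs, and compress with W_O.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def position_fused_attention(E, M, Wq, Wk, Wv, W, V, w, beta=10.0):
    """One head of the position weight fusion mechanism (a sketch).

    E: n x d context embeddings; M: n x n position weight matrix.
    Formula 3 maps E to queries/keys/values, formula 4 scores word
    pairs additively, and the scaled position weights beta * M are
    fused into the scores before normalization (formulas 5-6).
    """
    Q, K, Vm = E @ Wq, E @ Wk, E @ Wv            # formula 3
    n = E.shape[0]
    S = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            S[i, j] = w @ sigmoid(W @ Q[i] + V @ K[j])  # formula 4
    A = softmax(S + beta * M, axis=-1)           # position-fused weights
    return A @ Vm                                # aggregated features h_i
```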
Finally, the output obtained by the position weight fusion mechanism from the word embeddings is used as the input of the feedforward neural network layer, and the output of the Block is obtained after residual connection and layer normalization. The feedforward neural network is implemented as a fully connected layer with a Relu activation function, expressed as follows:

FFN(x) = Relu(x W_1 + b_1) W_2 + b_2 (formula 8)

where W_1 and W_2 are training parameters and b_1 and b_2 are bias terms.
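One Block then chains the two sublayers with formula 2's residual-plus-layer-normalization wrapper; a sketch under the same illustrative assumptions (the layer normalization here omits the learned gain and bias for brevity):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Per-position layer normalization (no learned gain/bias here)."""
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def ffn(x, W1, b1, W2, b2):
    """Formula 8: FFN(x) = Relu(x W1 + b1) W2 + b2."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def block(x, attention_fn, ffn_params):
    """One Block: each sublayer wrapped as LayerNorm(x + Sublayer(x))."""
    x = layer_norm(x + attention_fn(x))      # position weight fusion sublayer
    x = layer_norm(x + ffn(x, *ffn_params))  # feedforward sublayer
    return x
```

Stacking six such calls, each Block consuming the previous one's output, gives the iterative computation of the abstract feature H described below.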
The whole semantic representation layer is continuously calculated in an iterative mode through the serial connection among a plurality of Block layers, so that a deeper abstract feature H of the text is obtained, and emotion semantic information aiming at a specific aspect is deeply mined.
3.3 information interaction layer
The information interaction layer uses a memory network to ensure the interactivity of aspects and their contexts. The text features H = [h_1, h_2, …, h_n] extracted by the position-enhanced semantic representation layer serve as memory units, and the layer consists of 3 computation layers. The aspect word vector v_aspect is the initial input; a weighted combination r is adaptively selected from the context hidden states H as the input of the next computation layer. The weighted combination formula is as follows:

r = Σ_{i=1}^{n} α_i h_i (formula 9)

where n is the memory size and α_i ∈ [0, 1] is the weight of memory unit h_i, with Σ_{i=1}^{n} α_i = 1. Since different parts of the context affect the emotion judgment of a particular target differently, the semantic relevance between the aspect and the context is computed with a feedforward neural network, and weights are adaptively assigned to each memory unit h_i according to its semantic relation to the aspect. The scoring function is computed as follows:

ω_i = tanh(W_att[h_i; v_aspect] + b_att) (formula 10)

where W_att ∈ R^(1×2n) is a parameter vector and b_att ∈ R^(1×1) is the bias; α_i, the weight assigned to memory unit h_i, is obtained by normalizing the scores ω_i so that they sum to 1.
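A sketch of the three computation layers follows. It is illustrative: the patent fixes Σα_i = 1 but does not spell out the normalizer, so the softmax here, and carrying the retrieved summary r forward as the next layer's query, are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_hops(H, v_aspect, W_att, b_att, hops=3):
    """Information interaction layer as 3 memory-network hops (a sketch).

    H is the n x d memory [h_1, ..., h_n]; formula 10 scores each
    memory unit against the current aspect representation, the scores
    are normalized into weights alpha, and formula 9 retrieves the
    weighted combination r used as the next layer's input.
    """
    v = v_aspect
    for _ in range(hops):
        omega = np.array([np.tanh(W_att @ np.concatenate([h, v]) + b_att)
                          for h in H]).reshape(-1)
        alpha = softmax(omega)   # memory-unit weights, sum to 1
        v = alpha @ H            # weighted combination r (formula 9)
    return v                     # final aspect-specific representation
```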
3.4 output layer
After the aspect information has interacted with the memory units several times, the resulting final representation serves as the emotion feature for the specific target and is input to the softmax layer to predict the aspect emotion.
Before emotion analysis, the fine-grained emotion analysis model must be trained. It is trained by minimizing a cross-entropy loss function with an L2 regularization term, and the whole loss function optimizes the model parameters by gradient descent:

Loss = -Σ_{(c,e,l)∈D} Σ_{s∈S} y_s log f_s(c; e; θ) + λ‖θ‖^2 (formula 11)

where S is the set of emotion categories, D is the training data set, y_s is the one-hot code generated from the emotion category, f_s(c; e; θ) is the predicted emotion distribution of the model, λ is the regularization coefficient controlling the strength of regularization (set to 0.001), and θ are the model weights. The L2 regularization attenuation coefficient is set to 10e-4. All weight matrices are randomly initialized from the uniform distribution U(-0.01, 0.01), and the biases are initialized to 0.
Step 4 Experimental analysis
To verify the performance of the model, experiments were carried out on the three data sets Restaurant, Laptop, and MAMS, and the results were compared with other baseline models to verify the effectiveness of the method.
TABLE 1 comparison of the results
Claims (6)
1. A fine-grained emotion analysis method based on position enhancement is characterized by comprising the following steps:
step 1, text preprocessing;
step 2, word vector construction: mapping each word in the preprocessed text data to a vector space to obtain a word vector of each word;
step 3, performing aspect-level fine-grained emotion prediction on the text to be analyzed using a fine-grained emotion analysis model, wherein the fine-grained emotion analysis model comprises an embedding layer, a semantic representation layer, an information interaction layer, and an output layer,
the specific prediction process is as follows:
firstly, the embedding layer maps sentences in the text into context word embedding and aspect word embedding according to the word vectors obtained in the step 2;
then, the semantic representation layer applies a self-attention mechanism to the context word embedding, enhanced with aspect word position information to strengthen the model's text semantic representation capability;
next, enhancing the interactivity of the aspect words and the contexts thereof through an information interaction layer;
and finally, the output layer predicts the emotion of the aspect words.
2. The fine-grained emotion analysis method based on location enhancement according to claim 1, wherein:
step 1 the text preprocessing comprises the following steps:
(1) case conversion: converting all existing upper-case letters into lower-case letters;
(2) word segmentation: performing word segmentation on the text data by adopting a general language word segmentation module;
(3) removing stop words: removing some words without practical meaning in the text data;
(4) constructing a position weight matrix M for the text data using a masking mechanism, with the calculation formula:

M_ij = 1 - |i - j| / h_max, if |i - j| ≤ h_max / 2; otherwise M_ij = 0 (formula 1)

where h_max is the maximum length of the input sentence; the masking mechanism computes the position weight from the relative positions of the aspect words and the context words in the sentence, M_ij denotes the position weight of the word pair centered on word w_i, and i and j are the position indices of the words; word pairs within a distance of h_max/2 are weighted according to their distance, and otherwise M_ij is set to 0.
3. The fine-grained emotion analysis method based on location enhancement according to claim 1, characterized in that the word vector in step 2 is obtained as follows: for each word in the text data, if the word exists in the pre-trained word vector table, the vector in the table is used as its word vector; if not, a vector randomly initialized from a normal distribution is used as its word vector.
4. The fine-grained emotion analysis method based on location enhancement according to claim 1, wherein: the specific steps of each layer of the fine-grained emotion analysis model are as follows:
the embedding layer obtains the word vector of each word according to step 2 and maps the sentences of the text into word embedding representations in the form of low-dimensional dense vectors; a sentence with annotated aspects is regarded as being composed of aspects and their context, and the corresponding word embedding representation is divided into context word embedding and aspect word embedding; if an aspect is composed of multiple words, the vector representations of those words are average-pooled as the vector representation of the aspect;
the semantic representation layer extracts a high-level abstract representation of the text; its network structure is composed of K Blocks connected in series, and deeper abstract features H of the text are obtained through continuous iterative computation, wherein each Block adds position weights to the attention mechanism and applies residual connection, layer normalization, and a feedforward neural network layer;
the information interaction layer uses a memory network for interaction, enhancing the relation between the abstract features H and the aspect and ensuring the interactivity between the aspect and its context; the memory network takes the abstract text features H = [h_1, h_2, …, h_n] obtained by the semantic representation layer as memory units and is composed of L computation layers, wherein the first computation layer takes the aspect word embedding v_aspect and the weighted combination r of the memory units as initial input, the output of each memory layer serves as the input of the next computation layer, and the layers are computed iteratively in turn; the weighted combination formula is as follows:

r = Σ_{i=1}^{n} α_i h_i (formula 2)

where n is the memory size and α_i ∈ [0, 1] is the weight of memory unit h_i, with Σ_{i=1}^{n} α_i = 1; the weight α_i reflects the semantic relevance of the context to the aspect, computed with a feedforward neural network as follows:

ω_i = tanh(W_att[h_i; v_aspect] + b_att) (formula 3)

where W_att ∈ R^(1×2n) is a parameter vector and b_att ∈ R^(1×1) is the bias; α_i, the weight assigned to memory unit h_i, is obtained by normalizing the scores ω_i;
the output layer takes the result of the information interaction layer as input and predicts the emotion of the aspect using the softmax function.
5. The fine-grained emotion analysis method based on location enhancement according to claim 4, characterized by comprising the steps of: the Block of the semantic representation layer in the step 3 is designed as follows:
(1) using a self-attention mechanism: the context word embedding E is linearly mapped into three different spaces to obtain the corresponding query matrix Q, key matrix K, and value matrix V, wherein the i-th word embedding e_i (i = 1, 2, …, n) in E is mapped to a query vector q_i, a key vector k_i, and a value vector v_i, where n denotes the number of context words; the linear mapping process is expressed as follows:

q_i = W_q e_i, k_i = W_k e_i, v_i = W_v e_i (formula 4)

Q = [q_1, …, q_n], K = [k_1, …, k_n], V = [v_1, …, v_n] (formula 5)

where W_q, W_k, and W_v are the parameter matrices of the linear mappings, and Q, K, and V are the matrices formed by the query vectors q_i, key vectors k_i, and value vectors v_i, respectively;
(2) using a position weight fusion mechanism: weight enhancement and weight masking are added to the self-attention mechanism to strengthen information highly relevant to the aspect words and weaken the influence of irrelevant or erroneous emotion information; whether a weight is enhanced or masked is measured by the distance between the context word and the aspect word, specifically as follows:
first, each query vector q_i is scored against each key vector k_j with a compatibility function f, yielding the association score matrix S ∈ R^(n×n) between word pairs; the association score S_ij of query vector q_i and key vector k_j is expressed as follows:

S_ij = f(q_i, k_j) = w^T σ(W q_i + V k_j), i, j ∈ {1, 2, …, n} (formula 6)

where W, V, and w are parameters to be trained, σ is the Sigmoid activation function, and n is the number of context words;
the position weight M_ij is then added to the association score f(q_i, k_j) to strengthen or weaken the influence of the context on the aspect, and the value vectors v_i are aggregated to extract the features head = [h_1, h_2, …, h_n]:

A_m = softmax(S + βM) (formula 7)

h_i = Σ_{j=1}^{n} A_m[i, j] v_j (formula 8)

where A_m ∈ R^(n×n) is the position-fused weight matrix, M_ij is the relative position weight of the word pair, n is the number of context words, and β is the expansion coefficient of the position weight;
(3) using a multi-head self-attention mechanism to capture different aspects of the textual information, calculated as follows:

LF-MultiHead(Q, K, V) = Concat(head_1, …, head_h) W_O (formula 9)

where Concat(·) denotes vector concatenation, W_O is a linear compression transformation that compresses the matrix formed by the multi-head attention mechanism back to the original dimension, h is the number of heads, and head_i is the feature extracted by the self-attention mechanism of the i-th head;
(4) using residual connection and layer normalization operations: the residual connection takes the context word embedding as input, fuses it with the result of the multi-head self-attention mechanism, and applies layer normalization to obtain the output of the position weight fusion mechanism;
(5) taking the output of the position weight fusion mechanism as the input of the feedforward neural network layer and applying residual connection and layer normalization again, the result being the final output of the Block; the feedforward neural network is implemented as a fully connected layer with a Relu activation function.
6. The fine-grained emotion analysis method based on location enhancement according to claim 1, characterized by training the fine-grained emotion analysis model as follows: the model is trained by minimizing a cross-entropy loss function with an L2 regularization term, and the whole loss function optimizes the model parameters by gradient descent; the loss function is as follows:

Loss = -Σ_{(c,e,l)∈D} Σ_{s∈S} y_s log f_s(c; e; θ) + λ‖θ‖^2 (formula 11)

where D is the training data set, c denotes the context of a sentence, e denotes an aspect of the sentence, l denotes the emotion label of the aspect, S is the set of emotion categories, y_s is the one-hot code generated from the emotion category, f_s(c; e; θ) is the predicted emotion distribution of the model, λ is the regularization coefficient controlling the strength of regularization, and θ are the model weights.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111000430.XA CN113705197B (en) | 2021-08-30 | 2021-08-30 | Fine granularity emotion analysis method based on position enhancement |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111000430.XA CN113705197B (en) | 2021-08-30 | 2021-08-30 | Fine granularity emotion analysis method based on position enhancement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113705197A true CN113705197A (en) | 2021-11-26 |
CN113705197B CN113705197B (en) | 2024-04-02 |
Family
ID=78656373
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111000430.XA Active CN113705197B (en) | 2021-08-30 | 2021-08-30 | Fine granularity emotion analysis method based on position enhancement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113705197B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109522548A (en) * | 2018-10-26 | 2019-03-26 | 天津大学 | A kind of text emotion analysis method based on two-way interactive neural network |
US20200356724A1 (en) * | 2019-05-06 | 2020-11-12 | University Of Electronic Science And Technology Of China | Multi-hop attention and depth model, method, storage medium and terminal for classification of target sentiments |
CN110472042A (en) * | 2019-07-02 | 2019-11-19 | 桂林电子科技大学 | A kind of fine granularity sensibility classification method |
WO2021164199A1 (en) * | 2020-02-20 | 2021-08-26 | 齐鲁工业大学 | Multi-granularity fusion model-based intelligent semantic chinese sentence matching method, and device |
CN112100376A (en) * | 2020-09-11 | 2020-12-18 | 湖南大学 | Mutual enhancement conversion network for fine-grained emotion analysis |
Non-Patent Citations (3)
Title |
---|
XINYU CAO 等: "Microblog-oriented Multi-scale CNN Multi-label Sentiment Classification Model", 2020 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT), 24 June 2021 (2021-06-24) * |
YINGHONG SUN 等: "A High-Dimensional and Multi-granularity Feature Selection Method Based on CNN and RF", ADVANCES IN NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, 31 January 2020 (2020-01-31) * |
徐德华 等: "基于深度记忆网络的在线评论细粒度情感分类", 电子制作, no. 01, 1 January 2020 (2020-01-01) * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118013045A (en) * | 2024-04-02 | 2024-05-10 | 深圳市奥福德电子科技有限公司 | Sentence emotion detection method and device based on artificial intelligence |
CN118013045B (en) * | 2024-04-02 | 2024-06-18 | 深圳市奥福德电子科技有限公司 | Sentence emotion detection method and device based on artificial intelligence |
Also Published As
Publication number | Publication date |
---|---|
CN113705197B (en) | 2024-04-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |