CN111259153B - Attribute-level emotion analysis method of complete attention mechanism - Google Patents


Info

Publication number
CN111259153B
CN111259153B (application CN202010072375.4A)
Authority
CN
China
Prior art keywords
level
attribute
matrix
information
vocabulary
Prior art date
Legal status
Active
Application number
CN202010072375.4A
Other languages
Chinese (zh)
Other versions
CN111259153A (en
Inventor
林煜明
傅裕
李优
周娅
张敬伟
Current Assignee
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN202010072375.4A priority Critical patent/CN111259153B/en
Publication of CN111259153A publication Critical patent/CN111259153A/en
Application granted granted Critical
Publication of CN111259153B publication Critical patent/CN111259153B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses an attribute-level sentiment analysis method built entirely on attention mechanisms. It combines a self-attention network (SAM-NN) and an aspect-specific attention network (AAM-NN) to generate word-level and sentence-level semantic features respectively, and finally computes the sentiment polarity of the comment sentence contents through a fully connected neural network (FC-NN) output layer. The proposed method has a parallel structure in its implementation, and each network computing module incorporates aspect-specific information features, ensuring that the method analyses, according to the specific aspect information, the sentiment polarity of the user comment information with respect to specific attributes of the target object. Compared with the prior art, the proposed method not only effectively improves the accuracy of aspect-specific sentiment analysis tasks but also effectively reduces model training time.

Description

Attribute-level emotion analysis method of complete attention mechanism
Technical Field
The invention relates to the technical field of deep learning, in particular to an attribute level emotion analysis method of a complete attention mechanism.
Background
The rapid development and popularization of Internet technology has produced a huge volume of online comment information on platforms such as e-commerce sites and social networks, and effectively mining useful information from this mass of comments — including the task of opinion mining in user comments — has become a research hotspot in natural language processing in recent years. The sentiment expressed in an online comment is often diverse: a single user comment may contain different sentiments towards multiple attributes of the comment target. For example, a restaurant review "The noodles are delicious, but the soup is awful" contains feature information for the two attributes "noodles" and "soup", towards which the user expresses a "positive" and a "negative" opinion respectively. Clearly it is unreasonable to analyse the sentiment tendency of such a composite-sentiment comment sentence as a whole; the opinion expressed by the user must be analysed at a finer granularity, per attribute. Compared with earlier sentiment analysis tasks, the attribute-level sentiment analysis task provides more reference value for users and enterprises and is receiving growing attention from industry and academia.
With the success of deep learning techniques in the field of natural language processing, researchers have attempted to apply them to sentiment analysis tasks. Deep learning methods can automatically extract text features, effectively avoiding the large amount of time and effort spent on manual feature engineering. In sentiment analysis, sentiment recognition is generally converted into a multi-class classification task: each user sentiment is regarded as a category, and the sentiment a user intends to express is recognized by determining the category to which the user comment belongs. Existing research mainly comprises methods based on Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). However, both RNN-based and CNN-based approaches have their own limitations, the main problems being as follows:
Because an RNN is composed of sequential computing units, text data must be processed in order when handling natural sentences and text information. In aspect-specific sentiment analysis, every word in a user comment sentence must pass through the network computing unit in sequence, and this sequential structure inevitably increases the computing time overhead of the whole network model. In addition, RNN-based methods process text in the sequential order of the words. When a user comment sentence is long, two words that are far apart in the sentence suffer information loss because the information-transfer distance is too great, and the semantic dependency between them is lost.
CNN-based methods acquire textual semantic information in parallel through convolution windows, and this parallel structure can effectively improve the computational efficiency of the model. However, due to the limited size of the convolution window, CNN-based methods can only capture semantic dependencies within the window, and the global dependencies of individual words in the comment sentence are lost.
Disclosure of Invention
The invention provides an attribute-level sentiment analysis method with a complete attention mechanism, aiming at the problems of high time overhead and low accuracy when deep learning is used to perform sentiment analysis on user comments in the prior art.
In order to solve the problems, the invention is realized by the following technical scheme:
an attribute level emotion analysis method of a complete attention mechanism specifically comprises the following steps:
step 1, a given user comment sentence with a real emotion mark classification is used as sample data, and the sample data is preprocessed;
step 2, firstly, extracting the characteristics of the preprocessed sample data to extract word embedding characteristics and attribute information word embedding characteristics of the words of the user comment sentences; then, the word embedding characteristics of the vocabulary are used as an input characteristic matrix of sample data, and the attribute aspect information word embedding characteristics are used as attribute level aspect information of the sample data;
step 3, initializing a network model based entirely on the attention mechanism, and inputting the input feature matrix of the sample data obtained in step 2 and the attribute-level aspect information of the sample data into the initialized model for feature learning; in the initialized model, first learning vocabulary-level semantic features with the SAM-NN module, then learning sentence-level semantic features with the AAM-NN module, and then identifying the semantic sentiment tendency of the comment with the FC-NN module, thereby obtaining the predicted emotion mark classification of the user comment sentences;
step 4, training the network model completely based on the attention mechanism by using the real emotion mark classification of the user comment sentences in the step 1 and the predicted emotion mark classification obtained in the step 3, and optimizing model parameters by minimizing a loss function to obtain the trained network model completely based on the attention mechanism;
step 5, taking actually measured user comment sentences without real emotion mark classification as measured data, and preprocessing the measured data with the same method as step 1;
step 6, performing feature extraction on the preprocessed actual measurement data by adopting the same method as the step 2 to obtain an input feature matrix of the actual measurement data and attribute level aspect information of the actual measurement data;
and 7, sending the input feature matrix of the measured data obtained in the step 6 and the attribute level aspect information of the measured data into the network model which is trained in the step 4 and is completely based on the attention mechanism, and obtaining the emotion mark classification of the measured data.
In the above steps 1 and 5, the process of preprocessing the sample data and the measured data includes word segmentation, writing standardization and part of speech tagging.
In the above steps 2 and 6, the embedding features corresponding to the part-of-speech information of the words and/or the position information of the words in the user comment sentences of the sample data and the measured data may further be extracted; the embedding features corresponding to the part-of-speech information and/or the position information of the words are fused into the word embedding features of the vocabulary, and the result is jointly used as the input feature matrix.
In the above scheme, when the embedding features corresponding to the part-of-speech information of the words are fused with the word embedding features of the vocabulary, the two are fused by splicing (concatenation); when the position information of the words is fused with the word embedding features of the vocabulary, the two are fused by summation.
In the step 3, the process of learning the vocabulary level semantic features by using the SAM-NN module is as follows:
step 3.1.1, mapping the input characteristic matrix of the sample data obtained in the step 2 into a query matrix, a key matrix and a value matrix;
step 3.1.2, performing matrix linear transformation on the attribute level aspect information of the sample data obtained in the step 2 to obtain hidden layer representation of attribute level characteristics;
step 3.1.3, fusing the hidden layer representation obtained in the step 3.1.2 into a query matrix and a key matrix respectively to obtain a query matrix fused with attribute information and a key matrix fused with attribute information;
step 3.1.4, performing linear transformation m times on the attribute-fused query matrix and attribute-fused key matrix obtained in step 3.1.3 and the value matrix obtained in step 3.1.1, obtaining m triples of query, key and value matrices;
step 3.1.5, performing self-attention calculation on each of the m triples obtained in step 3.1.4 to obtain m vocabulary-level embedded representations fused with the attribute aspect information;
step 3.1.6, splicing the m vocabulary level embedded representations obtained in the step 3.1.5 together, and obtaining hidden layer representation of vocabulary level embedded characteristics through a matrix linear mapping network;
wherein m is a positive integer.
In the step 3, the process of learning sentence-level semantic features by using the AAM-NN module is as follows:
step 3.2.1, splicing the hidden-layer representation of the vocabulary-level embedding features obtained by the SAM-NN module with the hidden-layer representation of the attribute-level features, and obtaining the intermediate variable corresponding to each vocabulary through a linear mapping;
step 3.2.2, performing a softmax regression operation on the intermediate variable of each vocabulary obtained in step 3.2.1 to obtain the weight value of each vocabulary-level embedded hidden-layer representation;
and 3.2.3, multiplying each vocabulary-level embedded hidden-layer representation obtained in step 3.2.1 by the corresponding weight value obtained in step 3.2.2, and accumulating all products within each user comment sentence to obtain the hidden-layer representation of the sentence-level embedding features.
In the step 3, the process of identifying the comment semantic emotional tendency by using the FC-NN module is as follows:
step 3.3.1, splicing hidden layer representation of sentence-level embedded features and hidden layer representation of attribute-level features obtained by learning sentence-level semantic features by using an AAM-NN module, and obtaining output layer hidden layer representation through linear mapping;
and 3.3.2, performing classification processing on the hidden layer representation of the output layer obtained in the step 3.3.1 by utilizing a classifier to obtain the predicted emotion mark classification of the comment sentence of the user.
The invention provides an attribute-level sentiment analysis method with a complete attention mechanism, which combines a Self-Attention Mechanism Network (SAM-NN) and an Aspect-specific Attention Mechanism Network (AAM-NN) to generate vocabulary-level and sentence-level semantic features respectively, and finally calculates the sentiment polarity of the comment sentence contents through a Fully Connected Neural Network (FC-NN) output layer. The proposed method has a parallel structure in its implementation, and each network computing module incorporates aspect-specific information features, ensuring that the method analyses, as far as possible according to the specific aspect information, the sentiment polarity of the user comment information with respect to specific attributes of the target object.
Compared with the prior art, the method provided by the invention not only effectively improves the accuracy of emotion analysis tasks in specific aspects, but also effectively reduces the cost on model training time.
Drawings
FIG. 1 is a flow diagram of a method for attribute level sentiment analysis for a full attention mechanism.
FIG. 2 is a schematic diagram of the overall structure of the SA-NET model proposed by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings in conjunction with specific examples.
Referring to fig. 1, a method for attribute-level sentiment analysis of a full attention mechanism includes the following steps:
1) training a network model completely based on attention mechanism by using sample data:
step 1, sample data preprocessing
And taking the given user comment sentences with the real emotion mark classification as sample data, and preprocessing the sample data.
The emotion polarity mark content is the emotion polarity of a user comment sentence under corresponding characteristics, and specifically comprises three emotions: positive emotions, negative emotions, neutral emotions.
The purpose of data preprocessing is to standardize data and construct a training sample data set with the same format. The data preprocessing work of the invention mainly comprises word segmentation, writing standardization and part of speech tagging.
(1) Word segmentation: for comment data in the Chinese data set, use a word-segmentation tool to recombine continuous character sequences into word sequences according to a given standard;
(2) Writing standardization: remove special characters from the data set and unify letter case, converting upper-case English characters to lower case;
(3) Part-of-speech tagging: use a syntactic parsing tool to identify the part of speech of each word in the user comment sentences, and build a word/part-of-speech association file so that the parts of speech of words in the comment sentences can be looked up quickly.
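As an illustration, the preprocessing pipeline above can be sketched as follows. The patent names no specific segmentation or parsing tool, so the segmenter and POS tagger are passed in as hypothetical callables (e.g. a jieba-style tokenizer would fit the `segment` slot); only the writing standardization is implemented concretely:

```python
import re

def normalize(text):
    """Writing standardization: strip special characters, lower-case Latin letters.
    CJK characters are preserved (Unicode \\w matches them in Python 3)."""
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation/special chars
    return text.lower()

def preprocess(sentence, segment, pos_tag):
    """Pipeline sketch: segmentation and POS tagging are injected as callables,
    since the patent does not prescribe a particular tool."""
    words = segment(normalize(sentence))
    tags = pos_tag(words)
    return list(zip(words, tags))  # word/part-of-speech association
```

A usage example with a whitespace segmenter and a dummy tagger: `preprocess("Noodles ARE Delicious!", str.split, lambda ws: ["n"] * len(ws))` yields lower-cased, punctuation-free word/tag pairs.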
Step 2, feature extraction of sample data
The input features of the sample data include necessary input features and optional additional features. The necessary input features include the Word Embedding (WE) of the words in the user comment sentence and the Aspect Embedding (AE) of the attribute-aspect information words of the user comment sentence. The optional additional features include the embedding features (POSE) corresponding to the part-of-speech information of the words in the user comment sentence and the position information (PE) of the words in the user comment sentence.
(1) The user reviews the Word Embedding feature (WE) of the words in the sentence:
the invention uses WE as an essential characteristic of model input, the characteristic is obtained by training a word embedding generation tool, and the word embedding characteristic of each word can be expressed as:
w = [e1, e2, …, ed]  (1)
where ei ∈ R represents the i-th position feature code of the word embedding feature, and d represents the length of the feature code, i.e. the dimension of the feature vector.
After the word embedding expressions of the words are obtained, the word embedding features corresponding to all words in a comment sentence are spliced in order as the input features of the comment sentence, so a user comment sentence can be expressed as:
WE = [w1, w2, …, wn]  (2)
where n represents the length of the word sequence and WE ∈ Rn×d.
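A minimal sketch of how a comment sentence is turned into the matrix WE of Eq. (2). The toy vocabulary and the randomly initialized table E are illustrative stand-ins for the output of a word-embedding generation tool:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"noodles": 0, "are": 1, "delicious": 2}  # toy vocabulary (illustrative)
d = 8                                             # embedding dimension
E = rng.normal(size=(len(vocab), d))              # stands in for pretrained embeddings

def sentence_matrix(words):
    """Stack the per-word embeddings row-wise: WE in R^{n x d} (Eq. 2)."""
    return np.stack([E[vocab[w]] for w in words])

WE = sentence_matrix(["noodles", "are", "delicious"])  # n = 3 words
```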
(2) Embedding characteristics (AE) of attribute-side information of a user comment sentence:
the present invention uses AE as an essential feature of the model input, which is acquired in a manner consistent with WE acquisition. The word embedding of the word in each attribute-level information may be expressed as:
a=[e1,e2,…,ed] (3)
after obtaining the word-embedded representation of the vocabulary, the input characteristics of the attribute-level facet information may be expressed as:
AE=[a1,a2,…,at] (4)
wherein t represents the sequence length of the facet information, and t ∈ [1, n ].
(3) The user reviews the embedded characteristics (POSE) corresponding to the Part-Of-Speech Embedding Of the words in the sentence:
the invention uses POSE as the optional characteristic of model input, the representation mode of the characteristic is consistent with WE representation mode, the characteristic is obtained by the random initialization of the model, and the characteristic embedding content is optimized in the model training process.
Pose=[po1,po2,…,pod′] (5)
Wherein, poiE R represents the ith bit feature code of the embedded feature, and d' represents the length of the feature code, namely the dimension of the feature vector.
After the word embedding expression is obtained, the word embedding characteristics are obtained by mapping the word information to a d' dimensional vector space, the characteristics are obtained by random initialization and are gradually optimized in the training process, and the word characteristic of a user comment sentence can be expressed as follows:
POSE=[Pose1,Pose2,…,Posen] (6)
wherein n represents the length of the part-of-speech sequence of the total words of the comment sentences, namely the length of the word sequence, and POSE belongs to Rn*d′
(4) User reviews the position information of words in the sentence (PE)
The invention uses PE as an optional feature of the model input. The PE feature is expressed as:
Pe = [p1, p2, …, pd″]  (7)
where pi ∈ R represents the i-th bit feature code of the embedded feature and d″ represents the length of the feature code, i.e. the dimension of the feature vector. After obtaining the word position features, the position features of one user comment sentence can be expressed as:
PE = [Pe1, Pe2, …, Pen]  (8)
where n represents the length of the part-of-speech sequence of all words of the comment sentence, i.e. the length of the word sequence, and PE ∈ Rn×d″. The PE features are calculated as follows:
PE(t, 2i) = sin(t / 10000^(2i/d″)),  PE(t, 2i+1) = cos(t / 10000^(2i/d″))  (9)
where t represents the position of the word in the comment sentence, i represents the position of the i-th dimension, and d″ represents the length of the feature code, i.e. the dimension of the feature vector.
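Assuming Eq. (9) is the standard Transformer-style sinusoidal encoding (sine on even dimensions, cosine on odd ones, as the symbols t, i and d″ suggest), the PE features can be computed as:

```python
import numpy as np

def positional_encoding(n, d):
    """Sinusoidal position features PE in R^{n x d''} (Eq. 9).
    Assumes an even dimension d; each even/odd index pair shares one frequency."""
    pe = np.zeros((n, d))
    pos = np.arange(n)[:, None]            # t: word position in the sentence
    i = np.arange(0, d, 2)[None, :]        # even dimension indices 2i
    angle = pos / np.power(10000.0, i / d)
    pe[:, 0::2] = np.sin(angle)            # PE(t, 2i)
    pe[:, 1::2] = np.cos(angle)            # PE(t, 2i+1)
    return pe
```

Because the encoding depends only on n and d″, it can be precomputed once and summed into WE wherever the PE feature is enabled.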
In summary, the feature extraction of the sample data in the present invention includes the following four cases:
(1) When only the necessary input features are extracted (the word embedding features WE and the attribute-aspect word embedding features AE), WE is used directly as the input feature matrix X of the model input, and AE is used directly as the attribute-level aspect information of the model input.
(2) When the extracted features include the necessary input features plus the optional embedding features POSE corresponding to the part-of-speech information of the words, WE and POSE are spliced together as the input feature matrix X of the model input, and AE is used directly as the attribute-level aspect information of the model input.
(3) When the extracted features include the necessary input features plus the optional position information PE of the words in the user comment sentences, the sum of WE and PE is used as the input feature matrix X of the model input, and AE is used directly as the attribute-level aspect information of the model input.
(4) When the extracted features include the necessary input features plus both the optional features POSE and PE, WE and PE are first summed, the result is spliced with POSE to form the input feature matrix X of the model input, and AE is used directly as the attribute-level aspect information of the model input.
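The four combination cases collapse into one small helper: the position features are summed into WE and the part-of-speech features are spliced on, each step optional. A sketch:

```python
import numpy as np

def build_input(WE, POSE=None, PE=None):
    """Build the input feature matrix X for the four cases above:
    PE (position info) is summed, POSE (part-of-speech info) is concatenated."""
    X = WE if PE is None else WE + PE          # cases (3)/(4): summation
    if POSE is not None:
        X = np.concatenate([X, POSE], axis=1)  # cases (2)/(4): splicing
    return X

rng = np.random.default_rng(0)
WE, POSE, PE = rng.normal(size=(5, 8)), rng.normal(size=(5, 3)), rng.normal(size=(5, 8))
```

Note that summation requires PE to share WE's dimension (d″ = d), while splicing lets POSE use its own dimension d′.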
Step 3, characteristic learning of sample data
Initializing a network model completely based on an attention machine system, inputting the input feature matrix of the sample data obtained in the step 2 and the attribute level aspect information of the sample data into the initialized network model completely based on the attention machine system for feature learning, in the initialized network model completely based on the attention machine system, firstly learning vocabulary level semantic features by using an SAM-NN module, then learning sentence level semantic features by using an AAM-NN module, and then identifying comment semantic emotion tendencies by using an FC-NN module so as to obtain the predicted emotion mark classification of the comment sentences of the user.
Referring to fig. 2, the attribute-level emotion analysis model based on attention mechanism completely provided by the present invention sequentially includes three modules: the SAM-NN module is used for learning vocabulary level semantic features, the AAM-NN module is used for learning sentence level semantic features, and the FC-NN module is used for identifying comment semantic emotional tendency. The specific processing contents are as follows:
(1) SAM-NN module learning vocabulary level semantic features
The SAM-NN module is responsible for learning a hidden layer representation of the vocabulary level embedding features of the user comment sentences further according to the attribute aspect information on the basis of the input word embedding features.
(1.1) First map the input feature matrix X into a query matrix Q, a key matrix K and a value matrix V:
Q = Wq·X + bq
K = Wk·X + bk  (10)
V = Wv·X + bv
where Wq, Wk, Wv represent the network learning weight matrices of the query, key and value matrices respectively, and bq, bk, bv represent the bias values of the query, key and value matrices respectively.
(1.2) performing matrix linear transformation on input attribute-level aspect information AE to obtain a hidden layer expression A of attribute-level characteristics, wherein the formula is as follows:
A = Wa·AE + ba  (11)
where Wa represents the network learning weight matrix of the attribute-level features and ba represents the bias value of the attribute-level features.
(1.3) Fuse the attribute-level features A into the query matrix Q and the key matrix K to obtain the attribute-fused query matrix Qa and the attribute-fused key matrix Ka:
Qa = Fuse(Q, A) = λq·Q + (1 − λq)·A
Ka = Fuse(K, A) = λk·K + (1 − λk)·A  (12)
where Fuse(*) represents the fusion of attribute-level feature information, for which methods such as vector splicing and vector summation can be used. In the implementation, the fusion of the attribute-level feature information uses gating logic, as shown in formula (12), where λq and λk are the gating coefficients of the corresponding matrices and features, λq ∈ [0,1], λk ∈ [0,1].
(1.4) Linearly map the attribute-fused query matrix Qa, the attribute-fused key matrix Ka and the value matrix V of the comment sentence m times each, obtaining m triples of query, key and value matrices (Qai, Kai, Vi):
Qai = Wqi·Qa + bqi
Kai = Wki·Ka + bki  (13)
Vi = Wvi·V + bvi
where i ∈ [1, m], m represents the number of linear mappings, Wqi, Wki, Wvi represent the network learning weight matrices of the query, key and value matrices required for the i-th linear mapping, and bqi, bki, bvi represent the corresponding bias values.
(1.5) After obtaining the m triple matrices (Qai, Kai, Vi), perform self-attention calculation on each triple to obtain m vocabulary-level embedded representations Hi fused with the attribute aspect information:
Hi = softmax(Qai·KaiT / √d)·Vi  (14)
where d represents the calculation dimension of the triple matrices, and softmax(x) is calculated as:
softmax(xj) = e^xj / Σc e^xc  (15)
where C represents the number of samples and xj represents the j-th sample instance.
(1.6) Splice all the resulting vocabulary-level embedded representations Hi together, and obtain the hidden-layer representation H of the vocabulary-level embedding features through a matrix linear mapping network:
H′ = H1 ⊕ H2 ⊕ … ⊕ Hm
H = Wc·H′ + bc  (16)
where H′ represents the spliced vocabulary-level embeddings, H represents the linear mapping result corresponding to one user comment, Wc represents the network learning weight matrix of the vocabulary-level embedding features, bc represents its bias value, and ⊕ represents the splicing operation.
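A compact sketch of steps (1.1)–(1.6). Random matrices stand in for the learned weights, biases are omitted, and the aspect vector is fused into Q and K by simple addition — one admissible instance of the Fuse(*) operation, not the patent's exact gating:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stable softmax (Eq. 15)
    return e / e.sum(axis=axis, keepdims=True)

def sam_nn(X, A_vec, m, d, rng):
    """SAM-NN sketch: m aspect-fused self-attention heads, spliced and remapped."""
    n, dx = X.shape
    Wq, Wk, Wv = (rng.normal(size=(dx, d)) for _ in range(3))
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # Eq. (10), biases omitted
    Qa, Ka = Q + A_vec, K + A_vec             # fuse aspect info into Q, K (Eq. 12)
    heads = []
    for _ in range(m):                        # m linear mappings -> m triples (Eq. 13)
        Wqi, Wki, Wvi = (rng.normal(size=(d, d)) for _ in range(3))
        Hi = softmax((Qa @ Wqi) @ (Ka @ Wki).T / np.sqrt(d)) @ (V @ Wvi)  # Eq. (14)
        heads.append(Hi)
    Hcat = np.concatenate(heads, axis=1)      # splice the m representations
    Wc = rng.normal(size=(m * d, d))
    return Hcat @ Wc                          # Eq. (16): H in R^{n x d}

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 8))      # one comment: n = 6 words, input dimension 8
aspect = rng.normal(size=16)     # aspect hidden representation A (dimension d = 16)
H = sam_nn(X, aspect, m=4, d=16, rng=rng)
```

Because every head attends over the whole sentence at once, the module captures global word-to-word dependencies in parallel, which is exactly the RNN/CNN limitation the patent targets.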
(2) AAM-NN module learning sentence-level semantic features
The AAM-NN module is responsible for learning the hidden layer representation of the sentence-level embedded features of the user comment sentences according to the attribute aspect information on the basis of the hidden layer representation of the vocabulary-level embedded features.
(2.1) for the vocabulary level embedded characteristic H corresponding to one user comment, taking HiE.g. H represents the word level embedding characteristics corresponding to the ith word in the user comment sentence, and the word level embedding characteristics H corresponding to the ith word in the user comment sentenceiSplicing with the attribute-level feature A to obtain an intermediate temporary variable hiThen, the intermediate temporary variable hiObtaining an intermediate variable e corresponding to each vocabulary through linear mappingi
Figure BDA0002377622630000084
Wherein, WeNetwork learning weight matrix representing intermediate variables, beAn offset value indicating an intermediate variable,. indicates a splicing operation.
(2.2) From the intermediate variable e_i of each word, the weight β_i of each vocabulary-level embedded hidden-layer representation is calculated; the formula is as follows:

β_i = exp(e_i) / Σ_{j=1}^{n} exp(e_j)
where n represents the length of one user comment sentence.
(2.3) Each vocabulary-level embedded hidden-layer representation is multiplied by its corresponding weight β_i, and all the products within one user comment sentence are accumulated to obtain the hidden-layer representation S of the sentence-level embedded features; the formula is as follows:

S = Σ_{i=1}^{n} β_i · H_i
where n represents the length of one user comment sentence.
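Steps (2.1)–(2.3) together form an attribute-aware attention pooling. Below is a minimal NumPy sketch with hypothetical dimensions; note that the weighted sum is taken over the vocabulary-level features H_i, which is one reading of the formulas above:

```python
import numpy as np

def aspect_attention_pool(H, A, W_e, b_e):
    # h_i = H_i (+) A : splice each word feature with the attribute feature A
    h = np.concatenate([H, np.tile(A, (H.shape[0], 1))], axis=1)
    e = h @ W_e + b_e                        # e_i : per-word score via linear map
    beta = np.exp(e) / np.exp(e).sum()       # beta_i = exp(e_i) / sum_j exp(e_j)
    S = (beta[:, None] * H).sum(axis=0)      # S = sum_i beta_i * H_i
    return S, beta

n, d, da = 5, 8, 4                           # hypothetical sizes
rng = np.random.default_rng(2)
H = rng.standard_normal((n, d))
A = rng.standard_normal(da)
W_e = rng.standard_normal(d + da)
b_e = 0.0
S, beta = aspect_attention_pool(H, A, W_e, b_e)
print(S.shape, round(beta.sum(), 6))         # (8,) 1.0
```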
(3) FC-NN module identifies comment semantic emotional trends
The FC-NN module is responsible for performing sentiment analysis on the user comment sentence at the semantic level and identifying the emotion category of the comment through multi-class classification. In the specific implementation, emotions are divided into three categories: positive, negative and neutral.
(3.1) The hidden-layer representation S of the sentence-level embedded features is spliced with the attribute feature A to obtain an intermediate temporary variable O', and O' is then linearly transformed to obtain the output-layer hidden representation O; the formula is as follows:

O' = S ⊕ A,  O = W_o · O' + b_o

wherein W_o represents the network-learned weight matrix of the output layer, b_o represents the bias of the output layer, and ⊕ denotes the splicing operation.
(3.2) calculating the probability distribution of the emotion categories to which the user comment sentences belong by using a softmax classifier to obtain predicted emotion mark classification, wherein the formula is as follows:
Y=softmax(O) (21)
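The FC-NN head of steps (3.1)–(3.2) reduces to a concatenation, a linear map and a softmax over the three emotion categories; a sketch with assumed dimensions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def classify(S, A, W_o, b_o):
    # O' = S (+) A,  O = W_o O' + b_o,  Y = softmax(O)   (formula 21)
    O_prime = np.concatenate([S, A])
    O = O_prime @ W_o + b_o
    return softmax(O)

d, da = 8, 4
rng = np.random.default_rng(3)
S, A = rng.standard_normal(d), rng.standard_normal(da)
W_o = rng.standard_normal((d + da, 3))   # 3 classes: positive / neutral / negative
b_o = np.zeros(3)
Y = classify(S, A, W_o, b_o)
print(Y.shape, round(Y.sum(), 6))        # (3,) 1.0
```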
step 4, model training
The network model completely based on the attention mechanism is trained using the real emotion mark classification of the user comment sentences from step 1 and the predicted emotion mark classification obtained in step 3, and the model parameters are optimized by minimizing a loss function to obtain the trained network model completely based on the attention mechanism.
The invention adopts the Adam optimization method to train the model by minimizing a cross-entropy loss function; the minimization objective is defined as follows:

L(θ) = − Σ_{(X,A)} Σ_j Ŷ_j · log Y_j  (22)

wherein Ŷ represents the real emotion mark classification, Y represents the model prediction result, X represents the feature input of the user comment sentence, A represents the feature input of the attribute-level aspect information, and θ represents the model training parameters.
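The cross-entropy objective can be checked numerically. The sketch below assumes one-hot true labels Ŷ and a batch-averaged loss (the averaging convention and the sample values are assumptions for illustration):

```python
import numpy as np

def cross_entropy(Y_true, Y_pred, eps=1e-12):
    # L = -sum_j Yhat_j * log(Y_j), averaged over the batch of sentences
    return -np.mean(np.sum(Y_true * np.log(Y_pred + eps), axis=1))

# One-hot real emotion marks and predicted probabilities for 3 sentences
Y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
Y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]])
print(round(cross_entropy(Y_true, Y_pred), 4))  # 0.3635
```

An Adam optimizer would then update θ against the gradient of this loss, as stated above.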
Specifically, in the training implementation, the user emotion marks are divided into three categories: positive emotion, neutral emotion and negative emotion. The real emotion mark classification is a vector Ŷ = [ŷ_0, ŷ_1, ŷ_2] in which each column represents an emotion category: ŷ_0 represents positive emotion, ŷ_1 represents neutral emotion, and ŷ_2 represents negative emotion. The real emotion mark of a sample assigns the values of Ŷ as follows: when the real emotion mark of the sample data (user comment sentence) is classified as positive emotion, ŷ_0 is 1 and ŷ_1 and ŷ_2 are 0, i.e. Ŷ = [1, 0, 0]; when the real emotion mark is classified as neutral emotion, ŷ_1 is 1 and ŷ_0 and ŷ_2 are 0, i.e. Ŷ = [0, 1, 0]; when the real emotion mark is classified as negative emotion, ŷ_2 is 1 and ŷ_0 and ŷ_1 are 0, i.e. Ŷ = [0, 0, 1].
The model prediction Y = [y_0, y_1, y_2] gives the predicted probability of each emotion category: y_0 represents positive emotion, y_1 represents neutral emotion, and y_2 represents negative emotion. Since the solved y_0, y_1, y_2 are probability values, each may be an integer or a decimal.
The invention iterates over the training data set multiple times and terminates training when the iteration count meets a set condition, at which point the model parameters are solidified and saved. Specifically, the invention iterates 50 times over the training data set, and the model parameters of the last 30 iterations are solidified and saved for testing the model performance.
2) Emotion mark classification of the measured data using the trained network model completely based on the attention mechanism:
step 5, taking actually measured user comment sentences which are not classified with real emotion marks as actually measured data, and preprocessing the actually measured data by adopting the same method as the step 1;
step 6, performing feature extraction on the preprocessed actual measurement data by adopting the same method as the step 2 to obtain an input feature matrix of the actual measurement data and attribute level aspect information of the actual measurement data;
Step 7, the input feature matrix of the measured data obtained in step 6 and the attribute-level aspect information of the measured data are fed into the trained network model completely based on the attention mechanism from step 4 to obtain the probability distribution over the emotion categories of the comment sentence; the category with the maximum probability is taken as the final predicted emotion classification result, with the formula:
y_predict = argmax(Y)  (23)
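Formula (23) simply selects the highest-probability column of Y; for example (the probability values are hypothetical):

```python
import numpy as np

# y_predict = argmax(Y): pick the class with the largest predicted probability
Y = np.array([0.12, 0.75, 0.13])          # [positive, neutral, negative]
y_predict = int(np.argmax(Y))
labels = ["positive", "neutral", "negative"]
print(labels[y_predict])                  # neutral
```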
To verify the effectiveness of the method of the present invention, experiments were performed on 3 datasets: Restaurant, Laptop and Twitter. During the experiments, it was found that the model gradually stabilizes after 10 training iterations, so 50 iterations were performed on each dataset and the experimental results of the last 30 iterations were counted. The experimental results include accuracy statistics and model computation-time statistics, shown in Tables 1 and 2 respectively, wherein the bottom 4 rows of each table are the methods provided by the present invention and the top 7 rows are the reference methods used for comparison.
TABLE 1 Attribute-level Emotion analysis accuracy results
(Table 1 is rendered as an image in the original document.)
In table 1, max represents the result of the highest accuracy of the test set in the last 30 times of iterative training, avg represents the result of the average accuracy of the test set in the last 30 times of iterative training, and var represents the variance of the accuracy of the test set in the last 30 times of iterative training. The accuracy rate can reflect the identification capability of the model about the attribute emotion, and the variance can reflect the stability of the model.
TABLE 2 computation time (units: seconds/s) spent for attribute level emotion analysis
(Table 2 is rendered as an image in the original document.)
The time spent by the model on each iteration of the data set is counted in table 2, where Train represents the time spent training the data and Test represents the time spent testing the data. It can be seen that the method provided by the present invention has a significant improvement in computational efficiency compared to the prior art.
As can be seen from tables 1 and 2, the attribute-level emotion analysis method of the full attention mechanism provided by the invention not only ensures the accuracy and good stability of the model in the attribute-level emotion analysis task, but also improves the training efficiency, thereby fully explaining the effectiveness of the method provided by the invention.
It should be noted that, although the above-mentioned embodiments of the present invention are illustrative, the present invention is not limited thereto, and thus the present invention is not limited to the above-mentioned embodiments. Other embodiments, which can be made by those skilled in the art in light of the teachings of the present invention, are considered to be within the scope of the present invention without departing from its principles.

Claims (5)

1. An attribute-level emotion analysis method of a complete attention mechanism is characterized in that,
step 1, a given user comment sentence with a real emotion mark classification is used as sample data, and the sample data is preprocessed;
step 2, firstly, extracting the characteristics of the preprocessed sample data to extract word embedding characteristics and attribute information word embedding characteristics of the words of the user comment sentences; then, the word embedding characteristics of the vocabulary are used as an input characteristic matrix of sample data, and the attribute aspect information word embedding characteristics are used as attribute level aspect information of the sample data;
step 3, initializing a network model completely based on an attention mechanism, and inputting the input feature matrix of the sample data obtained in step 2 and the attribute-level aspect information of the sample data into the initialized network model completely based on the attention mechanism for feature learning; in the initialized network model completely based on the attention mechanism, vocabulary-level semantic features are first learned using an SAM-NN module, sentence-level semantic features are then learned using an AAM-NN module, and the comment semantic emotion tendency is then identified using an FC-NN module, so as to obtain the predicted emotion mark classification of the user comment sentences;
the process of learning vocabulary level semantic features by using the SAM-NN module is as follows:
step 3.1.1, mapping the input characteristic matrix of the sample data obtained in the step 2 into a query matrix, a key matrix and a value matrix;
step 3.1.2, performing matrix linear transformation on the attribute level aspect information of the sample data obtained in the step 2 to obtain hidden layer representation of attribute level characteristics;
step 3.1.3, fusing the hidden layer representation obtained in the step 3.1.2 into a query matrix and a key matrix respectively to obtain a query matrix fused with attribute information and a key matrix fused with attribute information;
step 3.1.4, respectively performing linear transformation on the query matrix fused with the attribute information and the key matrix fused with the attribute information obtained in the step 3.1.3 and the value matrix obtained in the step 3.1.1 for m times to obtain m query matrix triples, key matrix triples and value matrix triples;
step 3.1.5, performing self-attention calculation on the m query matrix triplets, the key matrix triplets and the value matrix triplets obtained in the step 3.1.4 respectively to obtain m vocabulary level embedded representations of information in the aspect of fusion attributes;
step 3.1.6, splicing the m vocabulary level embedded representations obtained in the step 3.1.5 together, and obtaining hidden layer representation of vocabulary level embedded characteristics through a matrix linear mapping network;
wherein m is a set positive integer;
the process of learning sentence-level semantic features by using the AAM-NN module is as follows:
step 3.2.1, the hidden layer representation of the vocabulary level embedded features and the hidden layer representation of the attribute level features, which are obtained by utilizing the SAM-NN module to learn the vocabulary level semantic features, are spliced, and intermediate variables corresponding to each vocabulary are obtained through linear mapping;
step 3.2.2, performing softmax regression operation on the intermediate variable corresponding to each vocabulary obtained in the step 3.2.1 to obtain a weight value represented by the hidden layer embedded in each vocabulary level;
step 3.2.3, performing product operation on the hidden layer embedded in each vocabulary level obtained in the step 3.2.1 and the corresponding weight value represented by the hidden layer embedded in each vocabulary level obtained in the step 3.2.2, and accumulating all product operation results in each user comment sentence to obtain hidden layer representation of sentence level embedding characteristics;
step 4, training the network model completely based on the attention mechanism by using the real emotion mark classification of the user comment sentences in the step 1 and the predicted emotion mark classification obtained in the step 3, and optimizing model parameters by minimizing a loss function to obtain the trained network model completely based on the attention mechanism;
step 5, taking actually measured user comment sentences which are not classified with real emotion marks as actually measured data, and preprocessing the actually measured data by adopting the same method as the step 1;
step 6, performing feature extraction on the preprocessed actual measurement data by adopting the same method as the step 2 to obtain an input feature matrix of the actual measurement data and attribute level aspect information of the actual measurement data;
and 7, sending the input feature matrix of the measured data obtained in the step 6 and the attribute level aspect information of the measured data into the network model which is trained in the step 4 and is completely based on the attention mechanism, and obtaining the emotion mark classification of the measured data.
2. The method for analyzing emotion in attribute level of full attention mechanism as claimed in claim 1, wherein in steps 1 and 5, the pre-processing procedure of the sample data and the measured data includes word segmentation, writing standardization and part of speech tagging.
3. The method for analyzing emotion in attribute level of full attention mechanism as claimed in claim 1, wherein, in steps 2 and 6, further extracting embedded features and/or position information of words corresponding to the part of speech information of words in user comment sentences of sample data and measured data; and the embedding characteristics corresponding to the part-of-speech information of the word and/or the position information of the word are fused to the word embedding characteristics of the vocabulary, and the word embedding characteristics are jointly used as an input characteristic matrix.
4. The method for analyzing emotion in attribute level of complete attention mechanism as claimed in claim 3, wherein when the embedding features corresponding to the part of speech information of a word are fused with the word embedding features of a vocabulary, the two are fused in a splicing manner; when the position information of the word is fused with the word embedding characteristics of the vocabulary, the position information of the word and the word embedding characteristics are fused in a summing mode.
5. The method for attribute-level sentiment analysis of a full attention mechanism according to claim 1, wherein in step 3, the process of identifying the comment semantic sentiment tendency by using the FC-NN module is as follows:
step 3.3.1, splicing hidden layer representation of sentence-level embedded features and hidden layer representation of attribute-level features obtained by learning sentence-level semantic features by using an AAM-NN module, and obtaining output layer hidden layer representation through linear mapping;
and 3.3.2, performing classification processing on the hidden layer representation of the output layer obtained in the step 3.3.1 by utilizing a classifier to obtain the predicted emotion mark classification of the comment sentence of the user.
CN202010072375.4A 2020-01-21 2020-01-21 Attribute-level emotion analysis method of complete attention mechanism Active CN111259153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010072375.4A CN111259153B (en) 2020-01-21 2020-01-21 Attribute-level emotion analysis method of complete attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010072375.4A CN111259153B (en) 2020-01-21 2020-01-21 Attribute-level emotion analysis method of complete attention mechanism

Publications (2)

Publication Number Publication Date
CN111259153A CN111259153A (en) 2020-06-09
CN111259153B true CN111259153B (en) 2021-06-22

Family

ID=70949099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010072375.4A Active CN111259153B (en) 2020-01-21 2020-01-21 Attribute-level emotion analysis method of complete attention mechanism

Country Status (1)

Country Link
CN (1) CN111259153B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859957B (en) * 2020-07-15 2023-11-07 中南民族大学 Emotion reason clause label extraction method, device, equipment and storage medium
CN111858944B (en) * 2020-07-31 2022-11-22 电子科技大学 Entity aspect level emotion analysis method based on attention mechanism
CN112329474B (en) * 2020-11-02 2022-10-04 山东师范大学 Attention-fused aspect-level user comment text emotion analysis method and system
CN113313140B (en) * 2021-04-14 2022-11-01 中国海洋大学 Three-dimensional model classification and retrieval method and device based on deep attention
CN113469184B (en) * 2021-04-21 2022-08-12 华东师范大学 Character recognition method for handwritten Chinese character based on multi-mode data

Citations (3)

Publication number Priority date Publication date Assignee Title
WO2019204186A1 (en) * 2018-04-18 2019-10-24 Sony Interactive Entertainment Inc. Integrated understanding of user characteristics by multimodal processing
CN110517121A (en) * 2019-09-23 2019-11-29 重庆邮电大学 Method of Commodity Recommendation and the device for recommending the commodity based on comment text sentiment analysis
CN110597991A (en) * 2019-09-10 2019-12-20 腾讯科技(深圳)有限公司 Text classification method and device, computer equipment and storage medium

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
US11514333B2 (en) * 2018-04-30 2022-11-29 Meta Platforms, Inc. Combining machine-learning and social data to generate personalized recommendations
CN108763326B (en) * 2018-05-04 2021-01-12 南京邮电大学 Emotion analysis model construction method of convolutional neural network based on feature diversification
CN110046353B (en) * 2019-04-22 2022-05-13 重庆理工大学 Aspect level emotion analysis method based on multi-language level mechanism
CN109948165B (en) * 2019-04-24 2023-04-25 吉林大学 Fine granularity emotion polarity prediction method based on mixed attention network
CN110210032B (en) * 2019-05-31 2023-10-31 鼎富智能科技有限公司 Text processing method and device
CN110347831A (en) * 2019-06-28 2019-10-18 西安理工大学 Based on the sensibility classification method from attention mechanism
CN110569508A (en) * 2019-09-10 2019-12-13 重庆邮电大学 Method and system for classifying emotional tendencies by fusing part-of-speech and self-attention mechanism

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
WO2019204186A1 (en) * 2018-04-18 2019-10-24 Sony Interactive Entertainment Inc. Integrated understanding of user characteristics by multimodal processing
CN110597991A (en) * 2019-09-10 2019-12-20 腾讯科技(深圳)有限公司 Text classification method and device, computer equipment and storage medium
CN110517121A (en) * 2019-09-23 2019-11-29 重庆邮电大学 Method of Commodity Recommendation and the device for recommending the commodity based on comment text sentiment analysis

Also Published As

Publication number Publication date
CN111259153A (en) 2020-06-09

Similar Documents

Publication Publication Date Title
CN111259153B (en) Attribute-level emotion analysis method of complete attention mechanism
US20220147836A1 (en) Method and device for text-enhanced knowledge graph joint representation learning
CN110059188B (en) Chinese emotion analysis method based on bidirectional time convolution network
CN111209401A (en) System and method for classifying and processing sentiment polarity of online public opinion text information
CN111931506B (en) Entity relationship extraction method based on graph information enhancement
CN110287323B (en) Target-oriented emotion classification method
CN112001187A (en) Emotion classification system based on Chinese syntax and graph convolution neural network
CN112001186A (en) Emotion classification method using graph convolution neural network and Chinese syntax
CN112667818B (en) GCN and multi-granularity attention fused user comment sentiment analysis method and system
CN112052684A (en) Named entity identification method, device, equipment and storage medium for power metering
CN112069312B (en) Text classification method based on entity recognition and electronic device
CN113360582B (en) Relation classification method and system based on BERT model fusion multi-entity information
CN110969023B (en) Text similarity determination method and device
CN113705238B (en) Method and system for analyzing aspect level emotion based on BERT and aspect feature positioning model
CN111832293A (en) Entity and relation combined extraction method based on head entity prediction
CN112101014B (en) Chinese chemical industry document word segmentation method based on mixed feature fusion
CN115759119B (en) Financial text emotion analysis method, system, medium and equipment
CN114881043B (en) Deep learning model-based legal document semantic similarity evaluation method and system
CN112800184A (en) Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction
CN111858933A (en) Character-based hierarchical text emotion analysis method and system
CN111178080A (en) Named entity identification method and system based on structured information
CN115169349A (en) Chinese electronic resume named entity recognition method based on ALBERT
CN112862569B (en) Product appearance style evaluation method and system based on image and text multi-modal data
CN113159831A (en) Comment text sentiment analysis method based on improved capsule network
CN114911940A (en) Text emotion recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20200609

Assignee: Guilin Zhongchen Information Technology Co.,Ltd.

Assignor: GUILIN University OF ELECTRONIC TECHNOLOGY

Contract record no.: X2022450000215

Denomination of invention: An attribute level emotion analysis method based on complete attention mechanism

Granted publication date: 20210622

License type: Common License

Record date: 20221206
