CN111259153B - Attribute-level emotion analysis method of complete attention mechanism - Google Patents
- Publication number: CN111259153B (application number CN202010072375.4A)
- Authority
- CN
- China
- Prior art keywords
- level
- attribute
- matrix
- information
- vocabulary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses an attribute-level sentiment analysis method with a complete attention mechanism, which combines a self-attention mechanism network (SAM-NN) and an aspect-specific attention mechanism network (AAM-NN) to generate vocabulary-level and sentence-level semantic features respectively, and finally computes the sentiment polarity of the comment sentence content through a fully connected neural network (FC-NN) output layer. The method is parallel in structure, and each network computing module integrates aspect-specific information features, ensuring that the method analyzes the sentiment polarity of the user comment about a specific attribute of the target object according to that aspect information as far as possible. Compared with the prior art, the method not only effectively improves the accuracy of aspect-specific sentiment analysis tasks, but also effectively reduces the cost in model training time.
Description
Technical Field
The invention relates to the technical field of deep learning, in particular to an attribute level emotion analysis method of a complete attention mechanism.
Background
The rapid development and popularization of Internet technology have produced a vast amount of online comment information on platforms such as e-commerce sites and social networks, and how to effectively mine useful information from these comments has become a research hotspot in natural language processing in recent years, including the task of opinion mining in user comments. The sentiment expressed in online comments is often diverse, and a single user comment may contain different sentiments toward multiple attributes of the comment target. For example, consider a restaurant comment: "The noodles are delicious, but the soup is awful." The comment contains feature information for the two attributes "noodles" and "soup", on which the user expresses a "positive" and a "negative" opinion respectively. Clearly, it is unreasonable to analyze the sentiment tendency of such a comment with compound sentiment as a whole; the opinions expressed by the user must be further analyzed at a fine granularity according to each piece of attribute information. Compared with traditional sentiment analysis tasks, the attribute-level sentiment analysis task can provide more reference value for users and enterprises, and has drawn increasing attention from industry and academia.
With the success of deep learning techniques in natural language processing, researchers have applied deep learning to sentiment analysis tasks. Deep learning methods can extract text features automatically, avoiding the great deal of time and effort spent on manual feature engineering. In sentiment analysis, emotion recognition is generally cast as a multi-class classification task: each user sentiment is treated as a category, and the sentiment a user expresses is recognized by determining the category to which the comment belongs. Existing research work mainly includes methods based on recurrent neural networks (RNN) and convolutional neural networks (CNN). However, both RNN-based and CNN-based approaches have their own limitations, and their main problems are as follows:
Because an RNN is composed of sequential computing units, it must process text data sequentially when handling natural sentences. In an aspect-specific sentiment analysis task, each word in the user comment must pass through the network's computing unit in turn, and this sequential structure undoubtedly increases the computing time overhead of the whole network model. In addition, RNN-based methods process text data following the order of the words in the user comment sentence. When a comment sentence is long, two words far apart in the sentence suffer information loss because the information transfer distance is too great, and the semantic dependency between them is lost.
CNN-based methods acquire text semantic information in parallel through convolution windows, and this parallel structure effectively improves the computational efficiency of the model. However, because of the limited size of the convolution window, CNN-based methods can only capture semantic dependencies within the window, and the global dependencies of individual words in the comment sentence are lost.
Disclosure of Invention
Aiming at the problems of high time overhead and low accuracy when sentiment analysis of user comments is performed with existing deep learning methods, the invention provides an attribute-level sentiment analysis method with a complete attention mechanism.
In order to solve the problems, the invention is realized by the following technical scheme:
an attribute level emotion analysis method of a complete attention mechanism specifically comprises the following steps:
step 1, taking given user comment sentences with real emotion mark classification as sample data, and preprocessing the sample data;
step 2, performing feature extraction on the preprocessed sample data to obtain an input feature matrix of the sample data and attribute-level aspect information of the sample data;
step 3, initializing a network model completely based on the attention mechanism, and inputting the input feature matrix of the sample data obtained in step 2 and the attribute-level aspect information of the sample data into the initialized model for feature learning; in the initialized network model, vocabulary-level semantic features are first learned with the SAM-NN module, sentence-level semantic features are then learned with the AAM-NN module, and the semantic emotion tendency of the comment is then identified with the FC-NN module, so as to obtain the predicted emotion mark classification of the user comment sentences;
step 4, training the network model completely based on the attention mechanism by using the real emotion mark classification of the user comment sentences in the step 1 and the predicted emotion mark classification obtained in the step 3, and optimizing model parameters by minimizing a loss function to obtain the trained network model completely based on the attention mechanism;
step 5, taking user comment sentences to be analyzed, which carry no real emotion mark classification, as measured data, and preprocessing the measured data with the same method as step 1;
step 6, performing feature extraction on the preprocessed measured data with the same method as step 2 to obtain an input feature matrix of the measured data and attribute-level aspect information of the measured data;
step 7, feeding the input feature matrix of the measured data obtained in step 6 and the attribute-level aspect information of the measured data into the network model completely based on the attention mechanism trained in step 4, so as to obtain the emotion mark classification of the measured data.
In the above steps 1 and 5, the process of preprocessing the sample data and the measured data includes word segmentation, writing standardization and part of speech tagging.
In the above steps 2 and 6, embedded features corresponding to the part-of-speech information of the words and/or position information of the words in the user comment sentences of the sample data and the measured data are further extracted; the embedded features corresponding to the part-of-speech information and/or the position information of the words are fused with the word embedding features of the vocabulary, and together serve as the input feature matrix.
In the scheme, when the embedding characteristics corresponding to the part-of-speech information of the word are fused with the word embedding characteristics of the vocabulary, the two are fused in a splicing mode; when the position information of the word is fused with the word embedding characteristics of the vocabulary, the position information of the word and the word embedding characteristics are fused in a summing mode.
In the step 3, the process of learning the vocabulary level semantic features by using the SAM-NN module is as follows:
step 3.1.1, mapping the input characteristic matrix of the sample data obtained in the step 2 into a query matrix, a key matrix and a value matrix;
step 3.1.2, performing matrix linear transformation on the attribute level aspect information of the sample data obtained in the step 2 to obtain hidden layer representation of attribute level characteristics;
step 3.1.3, fusing the hidden layer representation obtained in the step 3.1.2 into a query matrix and a key matrix respectively to obtain a query matrix fused with attribute information and a key matrix fused with attribute information;
step 3.1.4, performing linear transformation m times on the query matrix fused with attribute information and the key matrix fused with attribute information obtained in step 3.1.3 and on the value matrix obtained in step 3.1.1, to obtain m (query matrix, key matrix, value matrix) triples;
step 3.1.5, performing self-attention calculation on each of the m matrix triples obtained in step 3.1.4, to obtain m vocabulary-level embedded representations fusing attribute-aspect information;
step 3.1.6, splicing the m vocabulary level embedded representations obtained in the step 3.1.5 together, and obtaining hidden layer representation of vocabulary level embedded characteristics through a matrix linear mapping network;
wherein m is a positive integer.
In the step 3, the process of learning sentence-level semantic features by using the AAM-NN module is as follows:
step 3.2.1, splicing the hidden-layer representation of the vocabulary-level embedded features, obtained by learning vocabulary-level semantic features with the SAM-NN module, with the hidden-layer representation of the attribute-level features, and obtaining an intermediate variable corresponding to each vocabulary through linear mapping;
step 3.2.2, performing a softmax regression operation on the intermediate variable corresponding to each vocabulary obtained in step 3.2.1, to obtain a weight value for each vocabulary-level embedded hidden-layer representation;
step 3.2.3, multiplying each vocabulary-level embedded hidden-layer representation obtained in step 3.2.1 by its corresponding weight value obtained in step 3.2.2, and accumulating all product results within each user comment sentence, to obtain the hidden-layer representation of the sentence-level embedded features.
In the step 3, the process of identifying the comment semantic emotional tendency by using the FC-NN module is as follows:
step 3.3.1, splicing hidden layer representation of sentence-level embedded features and hidden layer representation of attribute-level features obtained by learning sentence-level semantic features by using an AAM-NN module, and obtaining output layer hidden layer representation through linear mapping;
and 3.3.2, performing classification processing on the hidden layer representation of the output layer obtained in the step 3.3.1 by utilizing a classifier to obtain the predicted emotion mark classification of the comment sentence of the user.
The invention provides an attribute-level emotion analysis method with a complete attention mechanism, which combines a Self-Attention Mechanism Network (SAM-NN) and an Aspect-specific Attention Mechanism Network (AAM-NN) to generate vocabulary-level and sentence-level semantic features respectively, and finally calculates the emotion polarity of the comment sentence content through a Fully Connected Neural Network (FC-NN) output layer. The method is parallel in structure, and each network computing module integrates aspect-specific information features, ensuring that the method analyzes the emotion polarity of the user comment about specific attributes of the target object according to that aspect information as far as possible.
Compared with the prior art, the method provided by the invention not only effectively improves the accuracy of emotion analysis tasks in specific aspects, but also effectively reduces the cost on model training time.
Drawings
FIG. 1 is a flow diagram of a method for attribute level sentiment analysis for a full attention mechanism.
FIG. 2 is a schematic diagram of the overall structure of the SA-NET model proposed by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings in conjunction with specific examples.
Referring to fig. 1, a method for attribute-level sentiment analysis of a full attention mechanism includes the following steps:
1) training a network model completely based on attention mechanism by using sample data:
step 1, sample data preprocessing
The given user comment sentences with real emotion mark classification are taken as sample data, and the sample data are preprocessed.
The emotion polarity mark content is the emotion polarity of a user comment sentence under corresponding characteristics, and specifically comprises three emotions: positive emotions, negative emotions, neutral emotions.
The purpose of data preprocessing is to standardize data and construct a training sample data set with the same format. The data preprocessing work of the invention mainly comprises word segmentation, writing standardization and part of speech tagging.
(1) Word segmentation: for comment data in the Chinese data set, a word segmentation tool is used to segment the text, recombining continuous character sequences into word sequences according to a given standard;
(2) Writing normalization: special characters in the data set are removed, and English text is made uniformly lowercase by converting uppercase characters to lowercase;
(3) Part-of-speech tagging: a grammar parsing tool is used to recognize the part of speech corresponding to each word in the user comment sentences, and a word/part-of-speech association file is built so that the part of speech of a word in a comment sentence can be recognized quickly.
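As a concrete illustration, the three preprocessing steps can be sketched as below. This is a minimal sketch, not the patent's actual tooling: whitespace splitting stands in for a real word segmentation tool, and the `pos_lexicon` dictionary is a hypothetical stand-in for the word/part-of-speech association file.

```python
import re

def preprocess(sentence, pos_lexicon):
    """Minimal sketch of the three preprocessing steps.

    Word segmentation is approximated by whitespace splitting; a real
    Chinese pipeline would use a dedicated segmentation tool.  The POS
    lookup table stands in for the word/part-of-speech association file
    built with a parsing tool.
    """
    # (2) writing normalization: strip special characters, lowercase English
    cleaned = re.sub(r"[^\w\s]", " ", sentence).lower()
    # (1) word segmentation (whitespace stand-in)
    tokens = cleaned.split()
    # (3) part-of-speech tagging via the association file
    tags = [pos_lexicon.get(tok, "UNK") for tok in tokens]
    return tokens, tags

tokens, tags = preprocess("The noodles are DELICIOUS!",
                          {"noodles": "NN", "delicious": "JJ"})
```

Every sample then arrives at the feature-extraction stage as a uniform (token sequence, tag sequence) pair.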
Step 2, feature extraction of sample data
The input features of the sample data include necessary input features and optional additional features. The necessary input features include the Word Embedding (WE) of the words in the user comment sentence and the Aspect Embedding (AE) of the attribute-aspect information words of the user comment sentence. The optional additional features include the embedded feature (POSE) corresponding to the part-of-speech information of a word in the user comment sentence and the position information (PE) of the word in the user comment sentence.
(1) Word Embedding feature (WE) of the words in the user comment sentence:
The invention uses WE as an essential feature of the model input. The feature is obtained by training a word-embedding generation tool, and the word embedding feature of each word can be expressed as:
w = [e1, e2, …, ed] (1)
where e_i ∈ R represents the i-th feature code of the word embedding feature, and d represents the length of the feature code, i.e., the dimension of the feature vector.
After the word embedding representation of each word is obtained, the word embedding features of all the words in a comment sentence are spliced in order as the input features of the comment sentence; a user comment sentence can then be expressed as:
WE = [w1, w2, …, wn] (2)
where n represents the length of the word sequence, and WE ∈ R^(n×d).
(2) Embedded feature (AE) of the attribute-aspect information of the user comment sentence:
The invention uses AE as an essential feature of the model input, acquired in the same manner as WE. The word embedding of the words in each piece of attribute-level aspect information may be expressed as:
a = [e1, e2, …, ed] (3)
After the word-embedding representation of the vocabulary is obtained, the input features of the attribute-level aspect information may be expressed as:
AE = [a1, a2, …, at] (4)
where t represents the sequence length of the aspect information, and t ∈ [1, n].
(3) Embedded feature (POSE) corresponding to the part-of-speech information of the words in the user comment sentence:
The invention uses POSE as an optional feature of the model input; its representation is consistent with that of WE. The feature is randomly initialized by the model, and the embedded content is optimized during model training.
Pose = [po1, po2, …, pod′] (5)
where po_i ∈ R represents the i-th feature code of the embedded feature, and d′ represents the length of the feature code, i.e., the dimension of the feature vector.
The part-of-speech embedding is obtained by mapping the part-of-speech information into a d′-dimensional vector space; the feature is obtained by random initialization and gradually optimized during training. The part-of-speech features of a user comment sentence can be expressed as:
POSE = [Pose1, Pose2, …, Posen] (6)
where n represents the length of the part-of-speech sequence of the words of the comment sentence, i.e., the length of the word sequence, and POSE ∈ R^(n×d′).
(4) Position information (PE) of the words in the user comment sentence:
The invention uses PE as an optional feature of the model input, and the PE feature is expressed as:
Pe = [p1, p2, …, pd″] (7)
where p_i ∈ R represents the i-th feature code of the embedded feature, and d″ represents the length of the feature code, i.e., the dimension of the feature vector. After the word position features are obtained, the position features of one user comment sentence can be expressed as:
PE = [Pe1, Pe2, …, Pen] (8)
where n represents the length of the word sequence of the comment sentence, and PE ∈ R^(n×d″). The PE features are calculated as follows:
PE(t, 2i) = sin(t / 10000^(2i/d″)), PE(t, 2i+1) = cos(t / 10000^(2i/d″)) (9)
where t represents the position of the word in the comment sentence, i represents the position of the i-th dimension, and d″ represents the length of the feature code, i.e., the dimension of the feature vector.
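A minimal NumPy sketch of the position features follows, assuming the standard sinusoidal Transformer form that matches the t/i/d″ variables defined above (the exact formula layout is a reconstruction, so treat this as an assumption of the sketch):

```python
import numpy as np

def position_encoding(n, d):
    """Sinusoidal position features PE in R^(n x d): sine on even
    feature dimensions, cosine on odd ones."""
    pe = np.zeros((n, d))
    t = np.arange(n)[:, None]            # word position in the sentence
    i = np.arange(0, d, 2)[None, :]      # even feature dimensions 2i
    angle = t / np.power(10000.0, i / d)
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

PE = position_encoding(n=6, d=8)  # one row of position features per word
```

Because the features are fixed trigonometric functions of the position, no training is needed to produce them, unlike the POSE features above.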
In summary, the feature extraction of the sample data in the invention covers the following four cases:
First, when the features extracted from the sample data are only the necessary input features (word embedding features WE of the words and attribute-aspect information word embedding features AE), the word embedding features WE are directly used as the input feature matrix X of the model input, and the attribute-aspect information word embedding features AE are directly used as the attribute-level aspect information AE of the model input.
Second, when the extracted features include the necessary input features and the optional embedded features POSE corresponding to the part-of-speech information of the words, the word embedding features WE and the embedded features POSE are spliced as the input feature matrix X of the model input, and AE is directly used as the attribute-level aspect information of the model input.
Third, when the extracted features include the necessary input features and the optional position information PE of the words in the user comment sentences, the sum of the word embedding features WE and the position information PE is used as the input feature matrix X of the model input, and AE is directly used as the attribute-level aspect information of the model input.
Fourth, when the extracted features include the necessary input features and both optional additional features (the embedded features POSE corresponding to the part-of-speech information and the position information PE of the words), the word embedding features WE and the position features PE are summed, the result is spliced with the embedded features POSE to form the input feature matrix X of the model input, and AE is directly used as the attribute-level aspect information of the model input.
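The four input-assembly cases can be sketched in NumPy as below; the dimensions are illustrative, and random vectors stand in for the trained embeddings. Note that summation requires PE to share the word-embedding dimension d, while splicing POSE grows the feature dimension:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, d_pos = 6, 8, 4               # sentence length, WE/PE dim, POSE dim (illustrative)
WE = rng.normal(size=(n, d))        # word embeddings
PE = rng.normal(size=(n, d))        # position features (same dim as WE, fused by summing)
POSE = rng.normal(size=(n, d_pos))  # part-of-speech embeddings (fused by splicing)

X1 = WE                                       # case 1: necessary features only
X2 = np.concatenate([WE, POSE], axis=1)       # case 2: splice POSE onto WE
X3 = WE + PE                                  # case 3: sum PE into WE
X4 = np.concatenate([WE + PE, POSE], axis=1)  # case 4: sum PE, then splice POSE
```

In every case, the aspect embedding AE passes through unchanged as the second model input.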
Step 3, characteristic learning of sample data
Initializing the network model completely based on the attention mechanism, and inputting the input feature matrix of the sample data obtained in step 2 and the attribute-level aspect information of the sample data into the initialized model for feature learning; in the initialized network model, vocabulary-level semantic features are first learned with the SAM-NN module, sentence-level semantic features are then learned with the AAM-NN module, and the semantic emotion tendency of the comment is then identified with the FC-NN module, so as to obtain the predicted emotion mark classification of the user comment sentences.
Referring to fig. 2, the attribute-level emotion analysis model based on attention mechanism completely provided by the present invention sequentially includes three modules: the SAM-NN module is used for learning vocabulary level semantic features, the AAM-NN module is used for learning sentence level semantic features, and the FC-NN module is used for identifying comment semantic emotional tendency. The specific processing contents are as follows:
(1) SAM-NN module learning vocabulary level semantic features
The SAM-NN module is responsible for learning a hidden layer representation of the vocabulary level embedding features of the user comment sentences further according to the attribute aspect information on the basis of the input word embedding features.
(1.1) First, the input feature matrix X is mapped into a query matrix Q, a key matrix K and a value matrix V:
Q = Wq·X + bq, K = Wk·X + bk, V = Wv·X + bv (10)
where Wq, Wk and Wv represent the network learning weight matrices of the query matrix, key matrix and value matrix respectively, and bq, bk and bv represent the bias values of the query matrix, key matrix and value matrix respectively.
(1.2) performing matrix linear transformation on input attribute-level aspect information AE to obtain a hidden layer expression A of attribute-level characteristics, wherein the formula is as follows:
A=Wa·AE+ba (11)
wherein, WaNetwork learning weight matrix representing attribute-level features, baA bias value representing a property level characteristic.
(1.3) The attribute-level features A are fused into the query matrix Q and the key matrix K to obtain the query matrix Qa fused with attribute information and the key matrix Ka fused with attribute information:
Qa = Fuse(Q, A), Ka = Fuse(K, A) (12)
where Fuse(*) represents the fusion of attribute-level feature information, for which methods such as vector splicing and vector summation can be used. In the implementation, the fusion of attribute-level feature information uses gating logic, i.e. Qa = λq·Q + (1−λq)·A and Ka = λk·K + (1−λk)·A, where λq and λk are the gating coefficients of the query matrix and key matrix respectively, λq ∈ [0, 1] and λk ∈ [0, 1].
(1.4) The query matrix Qa fused with attribute information, the key matrix Ka fused with attribute information and the value matrix V of the comment sentence are each linearly mapped m times, giving m triples of query, key and value matrices Qi, Ki and Vi:
Qi = Wqi·Qa + bqi, Ki = Wki·Ka + bki, Vi = Wvi·V + bvi (13)
where i ∈ [1, m], m represents the number of linear mappings, Wqi, Wki and Wvi represent the network learning weight matrices of the query, key and value matrices required for the i-th linear mapping, and bqi, bki and bvi represent the corresponding bias values.
(1.5) After the m matrix triples (Qi, Ki, Vi) are obtained, self-attention is calculated on each triple to obtain m vocabulary-level embedded representations Hi fusing attribute-aspect information:
Hi = softmax(Qi·Ki^T / √d)·Vi (14)
where d represents the calculation dimension of the triple matrices, and softmax(x) is calculated as:
softmax(xj) = e^(xj) / Σ_{c=1}^{C} e^(xc) (15)
where C represents the number of samples and xj represents the j-th sample instance.
(1.6) All the resulting vocabulary-level embedded representations Hi are spliced together, and the hidden-layer representation H of the vocabulary-level embedded features is obtained through a matrix linear mapping network:
H′ = H1 ⊕ H2 ⊕ … ⊕ Hm, H = Wc·H′ + bc (16)
where H′ represents the spliced vocabulary-level embeddings, H represents the linear mapping result corresponding to one user comment, Wc represents the network learning weight matrix of the vocabulary-level embedded features, bc represents the bias value of the vocabulary-level embedded features, and ⊕ represents the splicing operation.
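Steps (1.1)–(1.6) can be sketched in NumPy as follows. The weights are randomly initialized stand-ins for the learned matrices, the aspect rows are mean-pooled so the fusion broadcasts over the n words, and the convex gated fusion λ·Q + (1−λ)·A is one plausible reading of the gating logic; all three choices are assumptions of this sketch, not the patent's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sam_nn(X, A, m, lam_q=0.5, lam_k=0.5, seed=0):
    """Sketch of the SAM-NN module: aspect-fused multi-head self-attention."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = lambda cols=d: rng.normal(scale=d ** -0.5, size=(d, cols))
    Q, K, V = X @ W(), X @ W(), X @ W()        # (1.1) map X to Q, K, V
    a = (A @ W()).mean(axis=0)                 # (1.2) pooled attribute hidden rep
    Qa = lam_q * Q + (1 - lam_q) * a           # (1.3) gated fusion into Q
    Ka = lam_k * K + (1 - lam_k) * a           # (1.3) gated fusion into K
    dk = d // m
    heads = []
    for _ in range(m):                         # (1.4)-(1.5) m mappings + attention
        Qi, Ki, Vi = Qa @ W(dk), Ka @ W(dk), V @ W(dk)
        heads.append(softmax(Qi @ Ki.T / np.sqrt(dk)) @ Vi)
    return np.concatenate(heads, axis=1) @ W() # (1.6) splice heads, map to H

H = sam_nn(X=np.ones((5, 8)), A=np.ones((2, 8)), m=2)
```

Because every word attends to every other word in one matrix product, the module is parallel over the sentence, which is the claimed advantage over RNN-style sequential units.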
(2) AAM-NN module learning sentence-level semantic features
The AAM-NN module is responsible for learning the hidden layer representation of the sentence-level embedded features of the user comment sentences according to the attribute aspect information on the basis of the hidden layer representation of the vocabulary-level embedded features.
(2.1) For the vocabulary-level embedded features H corresponding to one user comment, let Hi ∈ H denote the vocabulary-level embedded feature corresponding to the i-th word in the user comment sentence. Hi is spliced with the attribute-level features A to obtain an intermediate temporary variable hi, and the intermediate variable ei corresponding to each vocabulary is then obtained through linear mapping:
hi = Hi ⊕ A, ei = We·hi + be (17)
where We represents the network learning weight matrix of the intermediate variable, be represents the bias value of the intermediate variable, and ⊕ represents the splicing operation.
(2.2) From the intermediate variable ei corresponding to each vocabulary, the weight value βi of each vocabulary-level embedded hidden-layer representation is calculated:
βi = exp(ei) / Σ_{j=1}^{n} exp(ej) (18)
where n represents the length of one user comment sentence.
(2.3) Each obtained vocabulary-level embedded hidden-layer representation hi is multiplied by its corresponding weight value βi, and all the products within one user comment sentence are accumulated to obtain the hidden-layer representation S of the sentence-level embedded features:
S = Σ_{i=1}^{n} βi·hi (19)
where n represents the length of one user comment sentence.
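Steps (2.1)–(2.3) amount to an aspect-conditioned attention pooling, sketched below with a random scoring vector standing in for the learned We (an assumption of this sketch):

```python
import numpy as np

def aam_nn(H, a, seed=1):
    """Sketch of the AAM-NN module: splice each word vector with the
    attribute features, score it, softmax the scores into weights beta,
    and sum the weighted vectors into a sentence vector S."""
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    h = np.concatenate([H, np.tile(a, (n, 1))], axis=1)     # (2.1) h_i = H_i spliced with A
    We = rng.normal(scale=0.1, size=h.shape[1])
    e = h @ We                                              # (2.1) intermediate e_i
    beta = np.exp(e - e.max()) / np.exp(e - e.max()).sum()  # (2.2) softmax weights
    S = (beta[:, None] * h).sum(axis=0)                     # (2.3) weighted accumulation
    return S, beta

S, beta = aam_nn(H=np.arange(15.0).reshape(5, 3), a=np.ones(3))
```

The weights β form a probability distribution over the words, so S is a convex combination that emphasizes the words most relevant to the given aspect.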
(3) FC-NN module identifies comment semantic emotional trends
And the FC-NN module is responsible for carrying out sentiment analysis on the user comment sentences at a semantic level and identifying the sentiment belonging category of the user comment through multi-classification. In a specific implementation process, the emotions are divided into three categories, namely positive emotions, negative emotions and neutral emotions.
(3.1) The hidden-layer representation S of the sentence-level embedded features is spliced with the attribute features A to obtain an intermediate temporary variable O′, and O′ is then linearly transformed to obtain the output-layer hidden representation O:
O′ = S ⊕ A, O = Wo·O′ + bo (20)
where Wo represents the network learning weight matrix of the output layer, bo represents the bias value of the output layer, and ⊕ represents the splicing operation.
(3.2) calculating the probability distribution of the emotion categories to which the user comment sentences belong by using a softmax classifier to obtain predicted emotion mark classification, wherein the formula is as follows:
Y=softmax(O) (21)
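The FC-NN output layer of steps (3.1)–(3.2) reduces to one concatenation, one linear map to three logits, and a softmax over the emotion categories; a sketch with randomly initialized stand-in weights (an assumption of the sketch):

```python
import numpy as np

def fc_nn(S, a, seed=2):
    """Sketch of the FC-NN module: O' = S spliced with A, O = Wo*O' + bo,
    Y = softmax(O) over positive/neutral/negative."""
    rng = np.random.default_rng(seed)
    o_prime = np.concatenate([S, a])                 # (3.1) splice S with A
    Wo = rng.normal(scale=0.1, size=(3, o_prime.size))
    bo = np.zeros(3)
    O = Wo @ o_prime + bo                            # (3.1) output hidden representation
    e = np.exp(O - O.max())
    return e / e.sum()                               # (3.2) softmax classifier

Y = fc_nn(S=np.ones(6), a=np.ones(3))
```

The predicted category is simply the argmax of Y, while the full probability vector feeds the cross-entropy loss during training.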
step 4, model training
And (3) training the network model completely based on the attention mechanism by using the real emotion mark classification of the user comment sentences in the step (1) and the predicted emotion mark classification obtained in the step (3), and optimizing model parameters by minimizing a loss function to obtain the trained network model completely based on the attention mechanism.
The invention adopts the Adam optimization method to minimize a cross-entropy loss function to train the model, and the minimization target equation is defined as follows:
loss = −Σ Ŷ log Y(X, A; θ)   (22)
wherein Ŷ represents the real emotion label classification, Y represents the model prediction result, X represents the feature input of the user comment sentence, A represents the feature input of the attribute-level aspect information, and θ represents the model training parameters.
Specifically, in the training implementation process, the emotion marks of the user are divided into three categories, namely positive emotion, neutral emotion and negative emotion.
The real label is a vector Ŷ = [ŷ_0, ŷ_1, ŷ_2], where each column represents an emotion category: ŷ_0 represents positive emotion, ŷ_1 represents neutral emotion, and ŷ_2 represents negative emotion. Values are assigned in the real emotion label classification as follows: when the real emotion label of the sample data (user comment sentence) is positive, ŷ_0 is 1 and the other ŷ_1 and ŷ_2 are 0, i.e. Ŷ = [1, 0, 0]; when the real emotion label is neutral, ŷ_1 is 1 and the other ŷ_0 and ŷ_2 are 0, i.e. Ŷ = [0, 1, 0]; when the real emotion label is negative, ŷ_2 is 1 and the other ŷ_0 and ŷ_1 are 0, i.e. Ŷ = [0, 0, 1].
The prediction Y = [y_0, y_1, y_2] gives, in each column, the probability of the corresponding emotion category: y_0 represents positive emotion, y_1 represents neutral emotion, and y_2 represents negative emotion. Since y_0, y_1, y_2 are probability values, they may be integers or decimals.
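The one-hot encoding of the real labels and the cross-entropy loss of equation (22) can be illustrated as follows (a minimal sketch for a single sample; the example prediction values are assumptions):

```python
import numpy as np

# one-hot real labels as described: columns = [positive, neutral, negative]
LABELS = {"positive": np.array([1.0, 0.0, 0.0]),
          "neutral":  np.array([0.0, 1.0, 0.0]),
          "negative": np.array([0.0, 0.0, 1.0])}

def cross_entropy(y_hat, y_pred, eps=1e-12):
    """loss = -sum_c y_hat_c * log(y_pred_c) for one sample;
    eps guards against log(0)."""
    return -np.sum(y_hat * np.log(y_pred + eps))

y_hat = LABELS["positive"]               # real label: positive emotion
y_pred = np.array([0.7, 0.2, 0.1])       # model probability output Y (assumed)
loss = cross_entropy(y_hat, y_pred)
# with a one-hot target, the loss reduces to -log of the true class probability
assert np.isclose(loss, -np.log(0.7))
```

In practice this per-sample loss is averaged over a mini-batch and minimized with Adam, as stated above.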
The invention iterates over the training data set many times and terminates training when the number of iterations meets a preset condition, at which point the model parameters are frozen and saved. Specifically, the invention performs 50 iterations over the training data set, and the model parameters of the last 30 iterations are frozen and saved for testing model performance.
2) Classifying the emotion labels of the measured data using the trained fully attention-based network model:
step 5, taking actually measured user comment sentences which are not classified with real emotion marks as actually measured data, and preprocessing the actually measured data by adopting the same method as the step 1;
step 6, performing feature extraction on the preprocessed actual measurement data by adopting the same method as the step 2 to obtain an input feature matrix of the actual measurement data and attribute level aspect information of the actual measurement data;
step 7, sending the input feature matrix of the measured data obtained in step 6 and the attribute-level aspect information of the measured data into the fully attention-based network model trained in step 4 to obtain the probability distribution over the emotion categories of the comment sentence; the category corresponding to the maximum probability is taken as the final predicted emotion classification result, wherein the formula is as follows:
ypredict=argmax(Y) (23)
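The argmax decision of equation (23) is straightforward; a small sketch (the class-name mapping follows the label order defined above):

```python
import numpy as np

def predict_label(Y, classes=("positive", "neutral", "negative")):
    """y_predict = argmax(Y): pick the emotion category with the
    highest predicted probability."""
    return classes[int(np.argmax(Y))]

assert predict_label(np.array([0.1, 0.3, 0.6])) == "negative"
assert predict_label(np.array([0.8, 0.1, 0.1])) == "positive"
```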
To verify the effectiveness of the method of the present invention, experiments were performed on 3 data sets: Restaurant, Laptop, and Twitter. During the experiments it was found that the model gradually stabilizes after 10 training iterations, so 50 iterations were performed on each data set and the experimental results of the last 30 were counted. The experimental results include accuracy statistics and model computation-time statistics, shown in Tables 1 and 2 respectively, wherein the bottom 4 rows in each table are the methods provided by the present invention and the top 7 rows are the baseline methods used for comparison.
TABLE 1 Attribute-level Emotion analysis accuracy results
In Table 1, max denotes the highest test-set accuracy over the last 30 training iterations, avg denotes the average test-set accuracy over the last 30 training iterations, and var denotes the variance of the test-set accuracy over the last 30 training iterations. Accuracy reflects the model's ability to identify attribute-level emotion, and variance reflects the stability of the model.
TABLE 2 Computation time spent for attribute-level emotion analysis (unit: seconds, s)
Table 2 reports the time the model spends on each iteration over the data set, where Train denotes the time spent on training data and Test denotes the time spent on test data. It can be seen that the method provided by the present invention offers a significant improvement in computational efficiency over the prior art.
As can be seen from Tables 1 and 2, the attribute-level emotion analysis method with a complete attention mechanism provided by the present invention not only achieves high accuracy and good stability on the attribute-level emotion analysis task, but also improves training efficiency, fully demonstrating the effectiveness of the proposed method.
It should be noted that, although the above-mentioned embodiments of the present invention are illustrative, the present invention is not limited thereto, and thus the present invention is not limited to the above-mentioned embodiments. Other embodiments, which can be made by those skilled in the art in light of the teachings of the present invention, are considered to be within the scope of the present invention without departing from its principles.
Claims (5)
1. An attribute-level emotion analysis method of a complete attention mechanism is characterized in that,
step 1, a given user comment sentence with a real emotion mark classification is used as sample data, and the sample data is preprocessed;
step 2, firstly, extracting the characteristics of the preprocessed sample data to extract word embedding characteristics and attribute information word embedding characteristics of the words of the user comment sentences; then, the word embedding characteristics of the vocabulary are used as an input characteristic matrix of sample data, and the attribute aspect information word embedding characteristics are used as attribute level aspect information of the sample data;
step 3, initializing a network model completely based on an attention mechanism, inputting the input feature matrix of the sample data obtained in the step 2 and the attribute level aspect information of the sample data into the initialized network model completely based on the attention mechanism for feature learning, in the initialized network model completely based on the attention mechanism, firstly learning vocabulary level semantic features by using an SAM-NN module, then learning sentence level semantic features by using an AAM-NN module, and then identifying comment semantic emotion tendencies by using an FC-NN module so as to obtain prediction emotion mark classification of the comment sentences of the user;
the process of learning vocabulary level semantic features by using the SAM-NN module is as follows:
step 3.1.1, mapping the input characteristic matrix of the sample data obtained in the step 2 into a query matrix, a key matrix and a value matrix;
step 3.1.2, performing matrix linear transformation on the attribute level aspect information of the sample data obtained in the step 2 to obtain hidden layer representation of attribute level characteristics;
step 3.1.3, fusing the hidden layer representation obtained in the step 3.1.2 into a query matrix and a key matrix respectively to obtain a query matrix fused with attribute information and a key matrix fused with attribute information;
step 3.1.4, respectively performing linear transformation on the query matrix fused with the attribute information and the key matrix fused with the attribute information obtained in the step 3.1.3 and the value matrix obtained in the step 3.1.1 for m times to obtain m query matrix triples, key matrix triples and value matrix triples;
step 3.1.5, performing self-attention calculation on the m query matrix triplets, the key matrix triplets and the value matrix triplets obtained in the step 3.1.4 respectively to obtain m vocabulary level embedded representations of information in the aspect of fusion attributes;
step 3.1.6, splicing the m vocabulary level embedded representations obtained in the step 3.1.5 together, and obtaining hidden layer representation of vocabulary level embedded characteristics through a matrix linear mapping network;
wherein m is a set positive integer;
the process of learning sentence-level semantic features by using the AAM-NN module is as follows:
step 3.2.1, the hidden layer representation of the vocabulary level embedded features and the hidden layer representation of the attribute level features, which are obtained by utilizing the SAM-NN module to learn the vocabulary level semantic features, are spliced, and intermediate variables corresponding to each vocabulary are obtained through linear mapping;
step 3.2.2, performing softmax regression operation on the intermediate variable corresponding to each vocabulary obtained in the step 3.2.1 to obtain a weight value represented by the hidden layer embedded in each vocabulary level;
step 3.2.3, performing product operation on the hidden layer embedded in each vocabulary level obtained in the step 3.2.1 and the corresponding weight value represented by the hidden layer embedded in each vocabulary level obtained in the step 3.2.2, and accumulating all product operation results in each user comment sentence to obtain hidden layer representation of sentence level embedding characteristics;
step 4, training the network model completely based on the attention mechanism by using the real emotion mark classification of the user comment sentences in the step 1 and the predicted emotion mark classification obtained in the step 3, and optimizing model parameters by minimizing a loss function to obtain the trained network model completely based on the attention mechanism;
step 5, taking actually measured user comment sentences which are not classified with real emotion marks as actually measured data, and preprocessing the actually measured data by adopting the same method as the step 1;
step 6, performing feature extraction on the preprocessed actual measurement data by adopting the same method as the step 2 to obtain an input feature matrix of the actual measurement data and attribute level aspect information of the actual measurement data;
and 7, sending the input feature matrix of the measured data obtained in the step 6 and the attribute level aspect information of the measured data into the network model which is trained in the step 4 and is completely based on the attention mechanism, and obtaining the emotion mark classification of the measured data.
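The SAM-NN procedure of steps 3.1.1 through 3.1.6 can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the patent does not fix how the attribute hidden representation is fused into the query and key matrices, so additive fusion is assumed here, and the m per-head linear transformations are approximated by slicing a single projection for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sam_nn(X, A, params, m=4):
    """Multi-head self-attention fusing attribute-level aspect information.
    X: (n, d) word-embedding input matrix; A: (d_a,) attribute embedding."""
    n, d = X.shape
    # 3.1.1: map the input feature matrix to query, key, and value matrices
    Q, K, V = X @ params["Wq"], X @ params["Wk"], X @ params["Wv"]
    # 3.1.2: linear transformation of attribute information -> hidden repr.
    H_A = A @ params["Wa"]
    # 3.1.3: fuse attribute repr. into Q and K (additive fusion, assumed)
    Q, K = Q + H_A, K + H_A
    # 3.1.4-3.1.5: m heads, each with scaled dot-product self-attention
    d_h = d // m
    heads = []
    for i in range(m):
        sl = slice(i * d_h, (i + 1) * d_h)
        Qi, Ki, Vi = Q[:, sl], K[:, sl], V[:, sl]
        attn = softmax(Qi @ Ki.T / np.sqrt(d_h))   # (n, n) attention weights
        heads.append(attn @ Vi)                    # (n, d_h) head output
    # 3.1.6: concatenate the m embeddings and apply a linear mapping
    return np.concatenate(heads, axis=-1) @ params["Wo"]

rng = np.random.default_rng(1)
n, d, d_a = 5, 8, 6                                # toy sizes (assumed)
params = {"Wq": rng.standard_normal((d, d)),
          "Wk": rng.standard_normal((d, d)),
          "Wv": rng.standard_normal((d, d)),
          "Wa": rng.standard_normal((d_a, d)),
          "Wo": rng.standard_normal((d, d))}
H = sam_nn(rng.standard_normal((n, d)), rng.standard_normal(d_a), params)
assert H.shape == (n, d)   # one hidden vector per vocabulary position
```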
2. The method for analyzing emotion in attribute level of full attention mechanism as claimed in claim 1, wherein in steps 1 and 5, the pre-processing procedure of the sample data and the measured data includes word segmentation, writing standardization and part of speech tagging.
3. The method for analyzing emotion in attribute level of full attention mechanism as claimed in claim 1, wherein, in steps 2 and 6, further extracting embedded features and/or position information of words corresponding to the part of speech information of words in user comment sentences of sample data and measured data; and the embedding characteristics corresponding to the part-of-speech information of the word and/or the position information of the word are fused to the word embedding characteristics of the vocabulary, and the word embedding characteristics are jointly used as an input characteristic matrix.
4. The method for analyzing emotion in attribute level of complete attention mechanism as claimed in claim 3, wherein when the embedding features corresponding to the part of speech information of a word are fused with the word embedding features of a vocabulary, the two are fused in a splicing manner; when the position information of the word is fused with the word embedding characteristics of the vocabulary, the position information of the word and the word embedding characteristics are fused in a summing mode.
5. The method for attribute-level sentiment analysis of a full attention mechanism according to claim 1, wherein in step 3, the process of identifying the comment semantic sentiment tendency by using the FC-NN module is as follows:
step 3.3.1, splicing hidden layer representation of sentence-level embedded features and hidden layer representation of attribute-level features obtained by learning sentence-level semantic features by using an AAM-NN module, and obtaining output layer hidden layer representation through linear mapping;
and 3.3.2, performing classification processing on the hidden layer representation of the output layer obtained in the step 3.3.1 by utilizing a classifier to obtain the predicted emotion mark classification of the comment sentence of the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010072375.4A CN111259153B (en) | 2020-01-21 | 2020-01-21 | Attribute-level emotion analysis method of complete attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010072375.4A CN111259153B (en) | 2020-01-21 | 2020-01-21 | Attribute-level emotion analysis method of complete attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111259153A CN111259153A (en) | 2020-06-09 |
CN111259153B true CN111259153B (en) | 2021-06-22 |
Family
ID=70949099
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010072375.4A Active CN111259153B (en) | 2020-01-21 | 2020-01-21 | Attribute-level emotion analysis method of complete attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111259153B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111859957B (en) * | 2020-07-15 | 2023-11-07 | 中南民族大学 | Emotion reason clause label extraction method, device, equipment and storage medium |
CN111858944B (en) * | 2020-07-31 | 2022-11-22 | 电子科技大学 | Entity aspect level emotion analysis method based on attention mechanism |
CN112329474B (en) * | 2020-11-02 | 2022-10-04 | 山东师范大学 | Attention-fused aspect-level user comment text emotion analysis method and system |
CN113313140B (en) * | 2021-04-14 | 2022-11-01 | 中国海洋大学 | Three-dimensional model classification and retrieval method and device based on deep attention |
CN113469184B (en) * | 2021-04-21 | 2022-08-12 | 华东师范大学 | Character recognition method for handwritten Chinese character based on multi-mode data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019204186A1 (en) * | 2018-04-18 | 2019-10-24 | Sony Interactive Entertainment Inc. | Integrated understanding of user characteristics by multimodal processing |
CN110517121A (en) * | 2019-09-23 | 2019-11-29 | 重庆邮电大学 | Method of Commodity Recommendation and the device for recommending the commodity based on comment text sentiment analysis |
CN110597991A (en) * | 2019-09-10 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Text classification method and device, computer equipment and storage medium |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11514333B2 (en) * | 2018-04-30 | 2022-11-29 | Meta Platforms, Inc. | Combining machine-learning and social data to generate personalized recommendations |
CN108763326B (en) * | 2018-05-04 | 2021-01-12 | 南京邮电大学 | Emotion analysis model construction method of convolutional neural network based on feature diversification |
CN110046353B (en) * | 2019-04-22 | 2022-05-13 | 重庆理工大学 | Aspect level emotion analysis method based on multi-language level mechanism |
CN109948165B (en) * | 2019-04-24 | 2023-04-25 | 吉林大学 | Fine granularity emotion polarity prediction method based on mixed attention network |
CN110210032B (en) * | 2019-05-31 | 2023-10-31 | 鼎富智能科技有限公司 | Text processing method and device |
CN110347831A (en) * | 2019-06-28 | 2019-10-18 | 西安理工大学 | Based on the sensibility classification method from attention mechanism |
CN110569508A (en) * | 2019-09-10 | 2019-12-13 | 重庆邮电大学 | Method and system for classifying emotional tendencies by fusing part-of-speech and self-attention mechanism |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019204186A1 (en) * | 2018-04-18 | 2019-10-24 | Sony Interactive Entertainment Inc. | Integrated understanding of user characteristics by multimodal processing |
CN110597991A (en) * | 2019-09-10 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Text classification method and device, computer equipment and storage medium |
CN110517121A (en) * | 2019-09-23 | 2019-11-29 | 重庆邮电大学 | Method of Commodity Recommendation and the device for recommending the commodity based on comment text sentiment analysis |
Also Published As
Publication number | Publication date |
---|---|
CN111259153A (en) | 2020-06-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111259153B (en) | Attribute-level emotion analysis method of complete attention mechanism | |
US20220147836A1 (en) | Method and device for text-enhanced knowledge graph joint representation learning | |
CN110059188B (en) | Chinese emotion analysis method based on bidirectional time convolution network | |
CN111209401A (en) | System and method for classifying and processing sentiment polarity of online public opinion text information | |
CN111931506B (en) | Entity relationship extraction method based on graph information enhancement | |
CN110287323B (en) | Target-oriented emotion classification method | |
CN112001187A (en) | Emotion classification system based on Chinese syntax and graph convolution neural network | |
CN112001186A (en) | Emotion classification method using graph convolution neural network and Chinese syntax | |
CN112667818B (en) | GCN and multi-granularity attention fused user comment sentiment analysis method and system | |
CN112052684A (en) | Named entity identification method, device, equipment and storage medium for power metering | |
CN112069312B (en) | Text classification method based on entity recognition and electronic device | |
CN113360582B (en) | Relation classification method and system based on BERT model fusion multi-entity information | |
CN110969023B (en) | Text similarity determination method and device | |
CN113705238B (en) | Method and system for analyzing aspect level emotion based on BERT and aspect feature positioning model | |
CN111832293A (en) | Entity and relation combined extraction method based on head entity prediction | |
CN112101014B (en) | Chinese chemical industry document word segmentation method based on mixed feature fusion | |
CN115759119B (en) | Financial text emotion analysis method, system, medium and equipment | |
CN114881043B (en) | Deep learning model-based legal document semantic similarity evaluation method and system | |
CN112800184A (en) | Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction | |
CN111858933A (en) | Character-based hierarchical text emotion analysis method and system | |
CN111178080A (en) | Named entity identification method and system based on structured information | |
CN115169349A (en) | Chinese electronic resume named entity recognition method based on ALBERT | |
CN112862569B (en) | Product appearance style evaluation method and system based on image and text multi-modal data | |
CN113159831A (en) | Comment text sentiment analysis method based on improved capsule network | |
CN114911940A (en) | Text emotion recognition method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract |
Application publication date: 20200609 Assignee: Guilin Zhongchen Information Technology Co.,Ltd. Assignor: GUILIN University OF ELECTRONIC TECHNOLOGY Contract record no.: X2022450000215 Denomination of invention: An attribute level emotion analysis method based on complete attention mechanism Granted publication date: 20210622 License type: Common License Record date: 20221206 |