CN114595693A - Text emotion analysis method based on deep learning - Google Patents

Text emotion analysis method based on deep learning

Info

Publication number
CN114595693A
CN114595693A
Authority
CN
China
Prior art keywords
deep learning
text
model
attention
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011428365.6A
Other languages
Chinese (zh)
Inventor
蔡颖凯
王楚
王忠锋
张冶
曹世龙
关艳
高曦莹
宋纯贺
李力刚
赵洪莹
邹云峰
夏靖怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Jiangsu Electric Power Co ltd Marketing Service Center
Marketing Service Center Of State Grid Liaoning Electric Power Co ltd
Shenyang Institute of Automation of CAS
Original Assignee
State Grid Jiangsu Electric Power Co ltd Marketing Service Center
Marketing Service Center Of State Grid Liaoning Electric Power Co ltd
Shenyang Institute of Automation of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Jiangsu Electric Power Co ltd Marketing Service Center, Marketing Service Center Of State Grid Liaoning Electric Power Co ltd, Shenyang Institute of Automation of CAS filed Critical State Grid Jiangsu Electric Power Co ltd Marketing Service Center
Priority to CN202011428365.6A
Publication of CN114595693A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/237 Lexical tools
    • G06F 40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a text emotion analysis method based on deep learning, comprising the following steps. Step one: preprocess text sample data and manually label emotion assessment grades in advance. Step two: construct a Self-Attention deep learning model for online text emotion analysis and train it with training set data; at each iteration, compute the loss function and the gradients of the output-layer neurons, and perform forward and back propagation to update the parameter values of every layer until a stopping condition is reached, yielding the optimized Self-Attention deep learning model and its network parameters. Step three: acquire actual text corpus data and process it with the optimized Self-Attention deep learning model to obtain the online text emotion analysis result.

Description

Text emotion analysis method based on deep learning
Technical Field
The invention belongs to the field of machine learning and text mining, and particularly relates to a text emotion analysis method based on deep learning.
Background
Online interactive platforms have greatly changed how people communicate and think, driving explosive growth of user-generated content. In recent years, the large volume of text generated by users has become one of the most representative sources of big data, and mining and analyzing user-generated information has become an important component of social development research. As an emerging information processing technology, emotion analysis of social media text, which analyzes, processes, summarizes, and reasons over subjective text carrying emotion, has received much attention in both academia and industry and has been widely applied across the internet. It also has a wide range of applications in daily life, for example in user interaction with service robots in electric power business halls. Traditional emotion analysis research focuses mainly on the emotion of the text itself but ignores individual differences among users in how they express emotion, which degrades the quality of the analysis results. To address these problems, the present invention targets the personalized emotion analysis problem. Considering the wide application of BP neural network technology in social media processing, the invention proposes several models based on the BP neural network to apply the personalized social media text emotion analysis method to online commodity reviews.
Sentiment Analysis (SA) is the process of analyzing, processing, summarizing, and reasoning over subjective text with sentiment expressions (e.g., microblogs, online reviews, online news). The history of sentiment analysis research is not long: it began to receive wide attention around 2000 and developed rapidly, gradually becoming a hot topic in natural language processing and text mining. There are many alternative names and similar techniques, such as opinion mining, emotion mining, and subjectivity analysis, all of which can be studied under sentiment analysis. For example, for movie reviews, the user's evaluation of the movie can be identified and analyzed; for product reviews of a digital camera, the user's emotional tendency toward indicators such as "price", "size", and "zoom" can be analyzed. Sentiment analysis has become a comprehensive research field spanning natural language processing, information retrieval, computational linguistics, machine learning, and artificial intelligence. Existing text emotion analysis algorithms mainly address the user's viewpoints and opinions about the text; because they lack any interpretation of the user characteristics behind the text, it is difficult for these algorithms to fully and accurately reflect the user's true emotional expression. To overcome these deficiencies, the invention provides a personalized text emotion analysis method by introducing the influence of users and even product features. Although emotion analysis of text is widely studied and performs well in many public evaluation tasks, there has been little research on practically usable text emotion analysis tools, particularly personalized ones, which have to some extent been overlooked by scholars and industry. Meanwhile, dynamically capturing the personalized emotion preferences of users poses a new challenge for emotion analysis. The invention first introduces the Attention Model idea: an Attention Model can compute the importance of each part of the input, tracking the object to be classified and focusing on the classification targets of interest. An LSTM deep neural network based on the Attention model can effectively alleviate text information redundancy, information loss, and other long-term dependency problems. To design a lighter network structure, the invention also improves on the Attention-LSTM network model to obtain an Attention-GRU network model for personalized emotion classification.
Disclosure of Invention
In order to overcome the above deficiencies in the prior art, the technical scheme adopted by the invention is as follows:
A text emotion analysis method based on deep learning comprises the following steps:
Step one: preprocess text sample data and manually label emotion assessment grades in advance;
Step two: construct a Self-Attention deep learning model for online text emotion analysis and train it with training set data; at each iteration, compute the loss function and the gradients of the output-layer neurons, and perform forward and back propagation to update the network parameter values of every layer until a stopping condition is reached, yielding the optimized Self-Attention deep learning model and its network parameters;
Step three: collect actual text corpus data and process it with the optimized Self-Attention deep learning model to obtain the online text emotion analysis result.
In step two, constructing the Self-Attention deep learning model for online text emotion analysis and training it with training set data comprises the following steps:
Modeling is carried out with a BERT deep learning model under the Self-Attention framework.
The word embeddings of all words of an input sentence serve as the semantic representation of the sentence. From this input representation, the hidden-layer semantic representation is obtained with the linear operation of matrix multiplication and a nonlinear activation function. The hidden-layer representation is then reduced in dimension to obtain a sentence-level semantic representation. The sentence representation is combined with the user representation and fed into the classification layer, incorporating the effect of user information at the sentence level. The classification layer maps the resulting vector into a two-dimensional emotion space and performs emotion classification with softmax.
Model input: training set D = {(x1, ..., xk), uk}, k = 1, ..., M, where M is the number of training samples;
Model output: the trained, optimized BERT deep learning model and its network parameters;
where (x1, ..., xk) are word vectors and uk is the manually pre-labeled emotion rating representing the user's characteristics.
The training set D is obtained after the sentences in the original online text sample data undergo word segmentation and stop-word removal.
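As an illustration of this preprocessing step, the following is a minimal Python sketch, assuming the jieba segmenter for Chinese text and a hypothetical stop-word file stopwords.txt (neither is specified in the original):

```python
# A minimal preprocessing sketch for step one: word segmentation with jieba
# and stop-word removal; "stopwords.txt" is a hypothetical resource file.
import jieba

def load_stopwords(path="stopwords.txt"):
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

def preprocess(sentence, stopwords):
    # Segment the sentence into words, then drop stop words and whitespace.
    return [w for w in jieba.cut(sentence) if w.strip() and w not in stopwords]

stopwords = load_stopwords()
tokens = preprocess("这家餐厅的服务态度非常好", stopwords)
# tokens is the word sequence x1, ..., xn fed to the embedding layer;
# the emotion rating uk is attached to each sample by manual labeling.
```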
The sample data are divided into a training set and a verification set; the BERT deep learning model is trained with the data in the training set and verified with the data in the verification set.
The construction of the BERT deep learning model for text emotion analysis specifically comprises the following steps:
1) The Encoder-Decoder (coding-decoding) module of the Attention model is used for the following processing:
The input natural sequence or feature matrix sequence is machine-encoded in the Encoder module, expressed as:
C = F(x1, x2, x3, ..., xn)
where x1, x2, x3, ..., xn are the word vectors obtained after word segmentation and stop-word removal, F is the Encoder encoding function, and C is the word-vector form encoded according to that function;
The Encoder-encoded natural sequence or feature matrix sequence is decoded in the Decoder module, expressed as:
yi = G(C, y1, y2, y3, ..., yi-1)
where y1, y2, y3, ..., yi-1, yi are the decoded word vectors, i is the index, and G is the Decoder decoding function;
2) Segment the emotion text and obtain the word representation T by fine-tuning the BERT model:
Ti = BERT(yi)
A new user vector representation is obtained by embedding the user vector:
Ni = Ti ⊕ Eu
where Eu is the newly added user vector and ⊕ denotes combining the word vector with the user vector;
3) For each new user vector Ni, the Bi-LSTM layer computes the forward propagation result hi(forward) and the backward propagation result hi(backward), and the latest representation of the i-th word of the text is obtained by concatenating the two:
hi = [hi(forward); hi(backward)]
the high level of feature acquisition by Attention indicates that the classification of text is predicted at the classification level, as follows:
ci=softmax(Wwhi+bw)
wherein the content of the first and second substances,
Figure BDA0002819969560000046
to classify the final parameters of the network, bw∈RlC represents probability distribution belonging to the current ith class for the classification result bias information;
the network is trained using a cross entropy loss function as follows:
Figure BDA0002819969560000047
wherein u iskA truth label representing an emotion sample i, ciPredicting class y for a sampleiThe probability of (c). Network parameters are optimized using random gradient descent (SGD).
The verification data are input into the BERT deep learning model to obtain corresponding evaluation grade results, which are compared against the manually pre-labeled emotion evaluation grades; if the results fall within the error range, iteration stops and the optimized BERT deep learning model and its network parameters are obtained.
The method further comprises: automatically mapping and associating the text emotion analysis results with the user's original actual text corpus data according to the emotion rating, labeling the preference degree, and visualizing the result to display the user's preference degree intuitively.
The method colors and labels the actual text corpus data of different users with a color gradient sequence running from strong and dark to light and weak.
The invention has the advantages that:
text emotion analysis is handled in the present invention using a deep neural network based approach. The deep learning method is completely different from the traditional machine learning method, is mainly based on a neural network method, can autonomously perform feature representation learning of discrimination, does not need to design and train features and dictionaries of texts in advance, and can perform deep capture on semantic information. The method is used for text emotion analysis.
Drawings
FIG. 1 is the general framework of the Encoder-Decoder model;
FIG. 2 is a schematic diagram of the Self-Attention structure;
FIG. 3 is the BERT model structure;
FIG. 4 is a block diagram of the BERT-Attention-BiLSTM network;
FIG. 5 is the Bi-LSTM layer;
FIG. 6 is the classification layer;
FIG. 7 is a comparison on Yelp2013 of the algorithm with conventional classification methods;
FIG. 8 is a comparison on Yelp2014 of the algorithm with conventional classification methods;
FIG. 9 is a comparison on Yelp2013 of the algorithm with other deep learning models;
FIG. 10 is a comparison on Yelp2014 of the algorithm with other deep learning models;
Detailed Description
The Attention Model simulates how the human brain attends to objects of interest: it can focus recognition on regions of interest. Its core idea draws on the way the human brain, at a specific moment and in a specific region of a specific scene, concentrates more attention on certain things while ignoring secondary or uninteresting parts; it is thus a model of the optimal allocation of the brain's resources. The principle is to allocate more attention to key and interesting parts and little or none to the rest, using computational resources rationally and removing the influence of non-key or interfering factors. The Attention Model was first applied in computer vision for tasks such as image recognition, classification, and object detection, with good results. Attention models were later used for image-to-text conversion, i.e., converting pictures into natural language descriptions that humans can easily understand, making traditional models more effective in this respect.
The data sets employed by the invention are Yelp2013 and Yelp2014. The Yelp data set includes 4.7 million user reviews, information on more than 150,000 merchants, 200,000 pictures, and 12 metropolitan areas. It additionally covers over 1 million tips from more than 1.1 million users, over 1.2 million merchant attributes (such as business hours, whether there is a parking lot, whether reservations can be made, and environment information), and the aggregated number of user check-ins at each merchant over time. Reviews in the data set are divided into 5 levels, labeled in English: "Eek! Methinks not.", "Meh. I've experienced better.", "A-OK.", "Yay! I'm a fan.", and "Woohoo! As good as it gets!". As shown in Table 1, the more stars a customer gives a merchant, the better the customer's evaluation.
TABLE 1 evaluation of a Sandwich restaurant on the Yelp Web site and its rating examples
(Table 1 is provided as an image in the original publication.)
The invention mainly describes the Attention Model in the field of Natural Language Processing (NLP), where it is often used together with the Encoder-Decoder model. The invention explains the working principle and effect of the Attention model through the usage and flow of its Encoder-Decoder module, and introduces the application method and experimental results of the Attention model in personalized emotion classification.
The Encoder-Decoder model is a classic encoding-decoding natural language processing model: a group of natural sequences or feature matrix sequences is input and converted into another, transformed (encoded) group of sequences. Its core idea is that the Encoder module encodes an input natural sequence (which may be a matrix sequence after word vectorization, or features obtained from a deep neural network) into a form that is easy for a computer to calculate and process, yielding an encoded natural sequence or feature matrix sequence; this encoded sequence is then input to the Decoder module for interpretation (decoding), and finally the transformed natural sequence or feature matrix sequence, now easy to identify and classify, is output. The Encoder-Decoder model is highly general and usable and can be conveniently combined with various traditional and deep neural network models; a variety of coding models can be used in the Encoder module to encode natural or feature matrix sequences. FIG. 1 shows the resulting general framework of the Encoder-Decoder model.
The input is typically a natural sequence or feature matrix sequence X = (x1, x2, x3, ..., xn), and the output is the decoded natural sequence or feature matrix sequence Y = (y1, y2, y3, ..., yn). The input sequence undergoes specific machine encoding in the Encoder module; the invention uses C to denote the encoded natural sequence or feature matrix sequence:
C = F(x1, x2, x3, ..., xn)
where x1, x2, x3, ..., xn are the word vectors obtained after word segmentation and stop-word removal, F is the Encoder encoding function, and C is the encoded word-vector form.
The Encoder-encoded natural sequence or feature matrix sequence is decoded in the Decoder module. For example, to output yi, the computer uses C and the previously generated y1, y2, y3, ..., yi-1, so yi is computed as:
yi = G(C, y1, y2, y3, ..., yi-1)
where y1, y2, y3, ..., yi-1, yi are the decoded word vectors, i is the index, and G is the Decoder decoding function.
It can thus be seen that when the Decoder computes each output yi, the semantic information used is the same: always the code generated from x1, x2, x3, ..., xn by the Encoder. That is, every element of the input sequence exerts the same influence on the output sequence, whereas the magnitude of these influences should be determined by the position of each element in the input sequence and its relevance to the target. Furthermore, for long input sequences, even though some codec models can effectively preserve the influence of historical data, part of the information relevant to the result is lost because of the dimensional limits of the semantic Encoder vector in natural language. From this analysis, the invention observes that for sequences encoded under the same rule, the decoded data generated in the decoding stage influence the output identically, and this encoding-decoding mechanism differs greatly from an optimal way of allocating attention. Therefore, to give machine encoding the same or a similar effect as the human brain's attention, an Attention model mechanism mimicking human attention is introduced. Its principle is to compute, at the decoder stage, an attention probability distribution of the input natural sequence or feature matrix sequence with respect to the current output yi: a unique semantic code corresponding to the target of interest is computed for each output, integrating the attention probability distribution of the encoded input features over the current output, which optimizes the current output result.
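To make the Encoder-Decoder flow concrete, the following is a minimal PyTorch sketch of C = F(x1, ..., xn) and yi = G(C, y1, ..., yi-1); the GRU cells and layer sizes are illustrative assumptions, not the patent's prescribed architecture:

```python
# A minimal GRU-based Encoder-Decoder sketch: the Encoder compresses the
# input sequence into a code C, and the Decoder predicts the next yi from
# C and the previously generated outputs.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)

    def forward(self, x):                 # x: (batch, n) token ids
        _, h = self.gru(self.emb(x))      # h plays the role of the code C
        return h

class Decoder(nn.Module):
    def __init__(self, vocab, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, y_prev, C):         # y_prev: tokens y1..y(i-1)
        o, _ = self.gru(self.emb(y_prev), C)
        return self.out(o[:, -1])         # distribution over the next yi
```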
The Attention-Model structure of the invention adopts the most popular Self-Attention structure form to process data.
The structure of Self-Attention is shown in FIG. 2. The three matrices Q (query), K (key), and V (value) all come from the same input. The invention computes the dot product of Q and K and divides it by the square root of the dimension of the query and key vectors to prevent the result from becoming too large, normalizes the result into a probability distribution with softmax, and multiplies by the matrix V to obtain the weighted-sum representation. The attention weights of the output elements are computed as:
Attention(Q, K, V) = softmax(QK^T / sqrt(dk)) · V
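A minimal PyTorch sketch of this scaled dot-product Self-Attention computation; the projection matrices Wq, Wk, Wv are assumed inputs:

```python
# A sketch of the Self-Attention weight computation described above:
# softmax(Q K^T / sqrt(dk)) V, with Q, K, V derived from the same input.
import math
import torch
import torch.nn.functional as F

def self_attention(x, Wq, Wk, Wv):
    # x: (batch, n, d); Wq/Wk/Wv: (d, dk) projection matrices
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.transpose(-2, -1) / math.sqrt(K.size(-1))
    weights = F.softmax(scores, dim=-1)   # attention probability distribution
    return weights @ V                    # weighted-sum representation
```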
most of the existing models use Word2Vec pre-training Word vectors, however, the Word vectors trained by using the models have a problem when the Word vectors required by the invention are generated. Because the word vector acquired by the method belongs to one of static codes, the same word is still expressed in different context environments, and the semantic understanding of the model to different situations is deviated. To address this problem, the present invention selects the BERT pre-training language model as the word vector generation model. The BERT processing structure is a new natural language representation model, the BERT model can better represent the spatial interrelation of semantic information of sentences in the whole text, namely represent different expression meanings and information of the same text information in different contexts, on one hand, the sentences in the same or similar contexts have similar expression meanings, and theoretically, the distances of the sentences in the space are closer than the distances of the sentences in the space. On the other hand, the BERT model uses an operation method substantially similar to the understanding of the human brain when processing vector operations between sentences, and its model structure is shown in fig. 3.
The input to the BERT model is the sum of 3 vectors. For each input word, its representation comprises 3 parts: a word vector (token embeddings), a segment vector (segment embeddings), and a position vector (position embeddings). The word vector encodes the current word, the segment vector encodes the position of the sentence containing the current word, and the position vector encodes the position of the current word; each sentence uses CLS and SEP as its beginning and end markers.
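As a sketch of this three-part input, the following sums token, segment, and position embeddings in PyTorch; the vocabulary and dimension sizes are assumptions in the spirit of BERT-base:

```python
# A sketch of the BERT input: the representation of each position is the
# sum of token, segment, and position embeddings; 21128/512/768 are the
# assumed BERT-base-Chinese sizes, and 101/102 are the [CLS]/[SEP] ids.
import torch
import torch.nn as nn

vocab_size, max_len, d = 21128, 512, 768
tok_emb = nn.Embedding(vocab_size, d)      # token embeddings
seg_emb = nn.Embedding(2, d)               # segment embeddings
pos_emb = nn.Embedding(max_len, d)         # position embeddings

ids  = torch.tensor([[101, 2769, 1599, 3614, 102]])   # [CLS] ... [SEP]
segs = torch.zeros_like(ids)                          # single-sentence input
pos  = torch.arange(ids.size(1)).unsqueeze(0)
x = tok_emb(ids) + seg_emb(segs) + pos_emb(pos)       # (1, 5, 768)
```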
The most important part of the BERT model is the bidirectional Transformer coding layer, which performs text feature extraction using the Transformer's Encoder feature extractor. The Encoder consists of a self-attention mechanism and a feed-forward neural network. Its core is self-attention, which can find the degree of association between each word and every other word in the sentence without distance limits; relations between words dozens or even hundreds of positions apart can still be found, so the left and right context of each word is fully mined and a bidirectional representation of the word is obtained.
The model is built around three structural components: a BERT layer, a BiLSTM layer, and an Attention layer; its structure is shown in FIG. 4. In total the model is divided into five layers: a BERT layer, a user vector embedding layer, a Bi-LSTM layer, an attention layer, and a classification layer.
BERT layer: first the emotion text is segmented into words, and the word representation T is obtained by fine-tuning the BERT model. That is, given text D = {x1, x2, x3, ..., xn}, where xi represents a word in the given text, the word-vector representation of each word is obtained from the BERT word-vector model:
Ti = BERT(yi)
where Ti ∈ R^d is the word-vector representation of each word and d is the dimension of the word vector.
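In practice, such contextual word vectors Ti can be obtained, for example, with the Hugging Face transformers library; the bert-base-chinese checkpoint below is an assumption, since the original does not name one:

```python
# A sketch of obtaining Ti = BERT(.) contextual word vectors with the
# transformers library; the checkpoint name is an assumed example.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")

enc = tokenizer("服务很周到", return_tensors="pt")
with torch.no_grad():
    out = model(**enc)
T = out.last_hidden_state          # (1, seq_len, d): one vector Ti per token
```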
User vector embedding layer: each user is represented as a vector Eu ∈ R^d. The invention then obtains a new user vector representation:
Ni = Ti ⊕ Eu
where ⊕ denotes combining the word vector Ti with the user vector Eu.
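A minimal sketch of the user vector embedding layer, assuming the combination Ni = Ti ⊕ Eu is concatenation and with illustrative sizes:

```python
# A sketch of the user vector embedding layer: one learned vector Eu per
# user, combined (here: concatenated, an assumption) with each word vector.
import torch
import torch.nn as nn

num_users, d = 10000, 768                 # illustrative sizes
user_emb = nn.Embedding(num_users, d)     # one learned vector Eu per user

def combine(T, user_ids):
    # T: (batch, n, d) word vectors; user_ids: (batch,)
    Eu = user_emb(user_ids).unsqueeze(1).expand(-1, T.size(1), -1)
    return torch.cat([T, Eu], dim=-1)     # Ni: (batch, n, 2d)
```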
Bi-LSTM layer: two operations are performed on each new user vector Ni, computing a forward propagation result and a backward propagation result:
hi(forward) = LSTM(forward)(Ni)
hi(backward) = LSTM(backward)(Ni)
where hi(forward), hi(backward) ∈ R^dh and dh is the number of hidden neurons in each LSTM. This yields the latest representation of the i-th word of the text:
hi = [hi(forward); hi(backward)]
The text representation matrix H = [h1; h2; h3; ...; hn] is then obtained, where hi ∈ R^(2dh).
The Bi-LSTM layer network structure is shown in figure 5.
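A minimal PyTorch sketch of this layer; the input size matches the concatenated Ni above, and dh is an illustrative assumption:

```python
# A sketch of the Bi-LSTM layer: forward and backward passes over Ni are
# concatenated into hi, so the output dimension is 2 * dh.
import torch
import torch.nn as nn

d, dh = 768, 256                          # illustrative sizes
bilstm = nn.LSTM(input_size=2 * d, hidden_size=dh,
                 batch_first=True, bidirectional=True)

N = torch.randn(1, 20, 2 * d)             # (batch, n, 2d) user-aware vectors
H, _ = bilstm(N)                          # H: (1, 20, 2*dh), hi = [fwd; bwd]
```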
Attention layer: in this part, the invention provides a new form of attention: the attention between the i-th word and all other words is computed in order to find emotion-related words. The attention weight aij between words i and j is computed from the Bi-LSTM feature representations hi and hj of the different words and normalized over j with softmax. The final hi is then defined as the attention-weighted sum:
hi = Σj aij · hj
and this final hi serves as the high-level feature representation of the emotional sentence.
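A minimal sketch of this word-to-word attention; the dot-product score is an assumption, since the original gives the scoring formula only as an image:

```python
# A sketch of the word-to-word attention layer: each word's score against
# every other word is softmax-normalized and used to re-weight the Bi-LSTM
# features; the dot-product scoring function is an assumed choice.
import torch
import torch.nn.functional as F

def word_attention(H):
    # H: (batch, n, 2*dh) Bi-LSTM features hi
    scores = H @ H.transpose(-2, -1)          # a(i, j) for every word pair
    alpha = F.softmax(scores, dim=-1)         # attention distribution per word
    return alpha @ H                          # final hi: weighted sum over hj
```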
Classification layer: based on the high-level feature representation obtained by the Attention layer, the text category is predicted at the classification layer:
ci = softmax(Ww hi + bw)
where Ww is the final parameter matrix of the classification network, bw ∈ R^l is the classification bias, and c represents the probability distribution. The layer structure is shown in fig. 6.
This part again trains the network with the cross-entropy loss function:
L = - Σk uk · log(ck)
where uk is the ground-truth label of emotion sample k and ck is the probability that the sample is predicted as its class. The optimizer uses stochastic gradient descent (SGD) to optimize the network parameters.
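A single training-step sketch for the classification layer and loss; sizes and the learning rate are illustrative assumptions:

```python
# A sketch of one training step: softmax classification over Ww hi + bw,
# cross-entropy loss, and a stochastic gradient descent update.
import torch
import torch.nn as nn

dh, num_classes = 256, 5
clf = nn.Linear(2 * dh, num_classes)          # Ww and bw
opt = torch.optim.SGD(clf.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()               # applies log-softmax internally

h = torch.randn(32, 2 * dh)                   # sentence-level features
labels = torch.randint(0, num_classes, (32,)) # manually labeled ratings uk
loss = loss_fn(clf(h), labels)
opt.zero_grad()
loss.backward()                               # gradients at the output layer
opt.step()                                    # update the network parameters
```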
During model training, the method is trained and tested with a word-vector classification model and a sentence-vector classification model, respectively. The word-vector classification model segments a document into a sequence of words and feeds them into the deep neural network for training; the sentence-vector classification model feeds word vectors into the deep neural network as whole sentences. When computing word vectors, special characters such as repeated punctuation and garbled codes are filtered out and deleted. Sentence vectors are formed by splicing word vectors to a fixed length before being sent to the network for training, as sketched below. The hyper-parameters used during model training are listed in Table 2.
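A minimal sketch of forming fixed-length sentence inputs as described above; max_len is an illustrative assumption:

```python
# A sketch of the sentence-vector construction: word vectors (after
# filtering special characters) are truncated or zero-padded to max_len.
import torch

def to_fixed_length(word_vecs, max_len=128):
    # word_vecs: (n, d) word vectors of one sentence
    n, d = word_vecs.shape
    if n >= max_len:
        return word_vecs[:max_len]
    pad = torch.zeros(max_len - n, d)
    return torch.cat([word_vecs, pad], dim=0)  # spliced to fixed length
```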
TABLE 2 Hyper-parameter settings
(Table 2 is provided as an image in the original publication.)
The model training procedure is shown in table 3:
TABLE 3 Algorithm flow
(Table 3 is provided as an image in the original publication.)
To evaluate the performance of the proposed model, the following evaluation criteria are used. Emotion classification is generally divided into binomial classification and multinomial classification.
For binomial classification with a class C in the sample:
TABLE 4 Binomial classification result matrix

                 Predicted as C    Predicted as not C
Actual C              TP                  FN
Actual not C          FP                  TN
For multinomial classification, assume there are classes Ci, i ∈ [1 ... Nc] (Nc is the number of categories):
TABLE 5 Multinomial classification result matrix

                 Predicted C1   Predicted C2   ...   Predicted CNc
Actual C1            N11            N12        ...       N1Nc
Actual C2            N21            N22        ...       N2Nc
...                  ...            ...        ...       ...
Actual CNc           NNc1           NNc2       ...       NNcNc

where Nij represents the number of samples of class i predicted as class j.
The evaluation indexes are then:
(1) Accuracy
Accuracy is a common evaluation index for neural network models, defined as the ratio of the number of samples correctly classified by the classifier to the total number of samples for a given data set. Accuracy can, to a certain extent, indicate whether a classifier is valid, but it cannot always evaluate a classifier effectively. As a simple example, consider a severely unbalanced set with 90% positive and 10% negative samples: predicting every sample as positive already yields 90% accuracy, yet says nothing about whether the classifier is valid. This shows that high accuracy is inflated when the samples are unbalanced.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
(2) Precision
Precision is the probability that a sample predicted as positive is actually positive; its calculation formula is given below. Accuracy and precision look somewhat similar but are completely different concepts: precision measures how correct the predictions are within the positive predictions, while accuracy measures the overall prediction correctness across both positive and negative samples.
Precision = TP / (TP + FP)
(3) Recall
Recall is defined as the probability that an actual positive sample is predicted as positive; its calculation formula is given below. An application scenario for recall: in website comments, users who give bad reviews are of more concern than users who give good ones, since misclassifying too many bad-review users may mislead subsequent users' judgment. The higher the recall, the higher the probability that a genuinely bad-review user is identified.
Recall = TP / (TP + FN)
(4) F-score
From the formulas alone, Precision and Recall have no necessary correlation; on large-scale data sets, however, the two criteria tend to constrain each other. Ideally both indexes would be high, but in general, the higher the precision, the lower the recall, so in practice a trade-off is made according to the specific situation. To balance the two indexes comprehensively, the F-score is therefore introduced; it is the harmonic value that jointly considers precision and recall:
F-score = 2 · Precision · Recall / (Precision + Recall)
(5) RMSE
The root mean square error, also called the standard error, measures the deviation of the predicted values; the best fit is achieved when RMSE = 0. It is another comprehensive indicator for error analysis. Its calculation formula is:
RMSE = sqrt( (1/N) · Σi (ŷi - yi)^2 )
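The five criteria can be computed, for example, with scikit-learn; macro averaging for the multi-class case is an assumption:

```python
# A sketch of the five evaluation criteria on toy 5-level rating data.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, mean_squared_error)

y_true = np.array([5, 4, 3, 5, 1])   # manually labeled ratings
y_pred = np.array([5, 4, 4, 5, 2])   # model predictions

acc  = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec  = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1   = f1_score(y_true, y_pred, average="macro", zero_division=0)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
```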
The experiments use the Python 3.7 programming language with the PyTorch 1.5 deep learning framework on the Jupyter Notebook platform. The data sets used are "Yelp13" and "Yelp14". Each data set is split with 80% as the training set, 10% as the validation set used to save the model, and 10% as the test set used to measure the model's performance. This section compares the three models with the baseline models. The performance evaluation criteria are accuracy, precision, recall, F-score, and RMSE.
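A minimal sketch of this 80/10/10 split, assuming samples is a list of (text, user, rating) tuples:

```python
# A sketch of the 80/10/10 train/validation/test split described above.
import random

def split_dataset(samples, seed=42):
    random.seed(seed)
    random.shuffle(samples)
    n = len(samples)
    train = samples[:int(0.8 * n)]
    val   = samples[int(0.8 * n):int(0.9 * n)]   # for saving/selecting models
    test  = samples[int(0.9 * n):]               # for final evaluation
    return train, val, test
```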
TABLE 6 Yelp2013 data set Algorithm comparison results
(Table 6 is provided as an image in the original publication.)
TABLE 7 Yelp2014 data set Algorithm comparison results
(Table 7 is provided as an image in the original publication.)
As Tables 6 and 7 and the bar charts in FIGS. 7 and 8 show, Bert-Attention achieves the best results on all evaluation criteria. The Bert model has been applied successfully to many natural language processing tasks, and word representations obtained with Bert capture word semantics well. In Table 6, the Bert-Attention model is nearly 3% higher than the Attention-LSTM model on the F-score index, and in Table 7 it is nearly 4% higher. Attention-LSTM and Attention-GRU perform similarly on both data sets because they are structurally similar and have comparable ability to capture semantic features. The tables show that all three models outperform the other baseline models, verifying again that the deep learning models designed by the invention perform better in personalized emotion analysis and can further improve the accuracy, recall, and precision of emotion classification. The invention designs an attention-based deep neural network emotion analysis model for user personalization; after the attention structure is added to the deep learning model, relevant feature information at different positions of the text can be obtained effectively and similar semantic features can be extracted effectively. As the bar charts show, the attention-based deep learning models achieve lower RMSE, indicating that in personalized emotion analysis the deep learning models are more stable than traditional machine learning models.
To test whether the attention mechanism of the model is effective, the invention designs an ablation experiment comparing the experimental results of the models with and without attention, as shown in Tables 8 and 9.
TABLE 8 Yelp2013 data set Algorithm comparison results
(Table 8 is provided as an image in the original publication.)
TABLE 9 Yelp2014 data set Algorithm comparison results
(Table 9 is provided as an image in the original publication.)
As Tables 8 and 9 and the bar charts in FIGS. 9 and 10 show, the structures with the attention model achieve higher F-score and accuracy than the structures without it. From Table 8, Attention-LSTM is 1.41% higher in accuracy than the LSTM model, the Attention-GRU model is 1.67% higher than the GRU model on F-score, and the Bert-Attention model is 1.23% higher than the Bert model. These results illustrate the importance of the attention model in personalized emotion analysis. The attention model captures the important features of a sentence well, locates key information more accurately, filters out unimportant information, extracts important individual emotional features, and thereby further improves the classification performance of the model.
From the mechanism and principle of the Attention model and the computation of its attention-weight formula, it can be seen that by computing an attention probability distribution, the Attention model highlights the effect of the key input natural sequences or accumulated feature matrices on the analysis of the output feature coding sequence, and it optimizes traditional deep network models well. Analysis of the Attention model's applications across various fields shows that its idea has a wide range of application and performs well on tasks such as text classification and emotion analysis in the current natural language processing field.
Traditional models depend heavily on an emotion dictionary, and an accurate emotion dictionary is difficult to construct. These shortcomings led researchers to seek more convenient solutions, and machine-learning-based analysis methods emerged. Machine-learning-based methods are very effective, but for different data an appropriate classifier and text-feature extraction method must be chosen to obtain good analysis results. With the research and application of deep learning and neural networks, emotion analysis based on them has become a general-purpose solution. A model combining deep learning with an attention mechanism generalizes strongly across different data sets and can be applied to emotion analysis on many data sets without the model selection a traditional machine learning model requires. The model provided by the invention captures the personalized differences between users and latent personalization factors such as language habits, user personality, and opinion bias. It also solves the user cold-start problem simply and effectively. Most importantly, the model provided by the invention far outperforms traditional machine learning models on every evaluation index and offers a better solution to the emotion analysis problem.

Claims (9)

1. A text emotion analysis method based on deep learning, characterized by comprising the following steps:
Step one: preprocess text sample data and manually label emotion assessment grades in advance;
Step two: construct a Self-Attention deep learning model for online text emotion analysis and train it with training set data; at each iteration, compute the loss function and the gradients of the output-layer neurons, and perform forward and back propagation to update the network parameter values of every layer until a stopping condition is reached, yielding the optimized Self-Attention deep learning model and its network parameters;
Step three: acquire actual text corpus data and process it with the optimized Self-Attention deep learning model to obtain the online text emotion analysis result.
2. The method of claim 1, wherein in step two, constructing the Self-Attention deep learning model for online text emotion analysis and training it with training set data comprises:
modeling with a BERT deep learning model under the Self-Attention framework;
taking the word embeddings of all words of an input sentence as the semantic representation of the sentence; obtaining the hidden-layer semantic representation from the input representation with the linear operation of matrix multiplication and a nonlinear activation function; obtaining a sentence-level semantic representation from the hidden-layer representation with a dimension-reduction operation; combining the sentence representation with the user representation and feeding the combination into the classification layer, incorporating the effect of user information at the sentence level; the classification layer maps the resulting vector into a two-dimensional emotion space and performs emotion classification with softmax;
model input: training set D = {(x1, ..., xk), uk}, k = 1, ..., M, where M is the number of training samples;
model output: the trained, optimized BERT deep learning model and its network parameters;
where (x1, ..., xk) are word vectors and uk is the manually pre-labeled emotion rating representing the user's characteristics.
3. The text emotion analysis method based on deep learning of claim 2, wherein the training set D is obtained after the sentences in the original online text sample data undergo word segmentation and stop-word removal.
4. The method according to claim 2, wherein the sample data is divided into a training set and a verification set, the BERT deep learning model is trained by using the data in the training set, and the verification is performed by using the data in the verification set.
5. The emotion analysis method based on deep learning of claim 2, wherein constructing the BERT deep learning model for text emotion analysis specifically comprises:
1) using the Encoder-Decoder (coding-decoding) module of the Attention model for the following processing:
machine-encoding the input natural sequence or feature matrix sequence in the Encoder module, expressed as:
C = F(x1, x2, x3, ..., xn)
where x1, x2, x3, ..., xn are the word vectors obtained after word segmentation and stop-word removal, F is the Encoder encoding function, and C is the word-vector form encoded according to that function;
decoding the Encoder-encoded natural sequence or feature matrix sequence in the Decoder module, expressed as:
yi = G(C, y1, y2, y3, ..., yi-1)
where y1, y2, y3, ..., yi-1, yi are the decoded word vectors, i is the index, and G is the Decoder decoding function;
2) segmenting the emotion text and obtaining the word representation T by fine-tuning the BERT model:
Ti = BERT(yi)
obtaining a new user vector representation by embedding the user vector:
Ni = Ti ⊕ Eu
where Eu is the newly added user vector and ⊕ denotes combining the word vector with the user vector;
3) using the Bi-LSTM layer to compute, for each new user vector Ni, the forward propagation result hi(forward) and the backward propagation result hi(backward), and obtaining the latest representation of the i-th word of the text by concatenation:
hi = [hi(forward); hi(backward)]
6. The method of claim 5, wherein the high-level features obtained by the Attention layer predict the text category at the classification layer as follows:
ci = softmax(Ww hi + bw)
where Ww is the final parameter matrix of the classification network, bw ∈ R^l is the classification bias, and ci represents the probability distribution of belonging to the current i-th class;
the network is trained with the cross-entropy loss function:
L = - Σk uk · log(ck)
where uk is the ground-truth label of emotion sample k and ck is the probability that the sample is predicted as its class; the network parameters are optimized with stochastic gradient descent (SGD).
7. The text emotion analysis method based on deep learning of claim 1, wherein verification data are input into the BERT deep learning model to obtain corresponding evaluation grade results, which are compared against the manually pre-labeled emotion evaluation grades; if the results fall within the error range, iteration stops, and the optimized BERT deep learning model and its network parameters are obtained.
8. The text emotion analysis method based on deep learning of any one of claims 1-7, further comprising: automatically mapping and associating the text emotion analysis results with the user's original actual text corpus data according to the emotion rating level, labeling the preference degree, and visualizing the result to display the user's preference degree intuitively.
9. The text emotion analysis method based on deep learning of claim 8, wherein the actual text corpus data of different users are colored and labeled with a color gradient sequence running from strong and dark to light and weak.
CN202011428365.6A 2020-12-07 2020-12-07 Text emotion analysis method based on deep learning Pending CN114595693A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011428365.6A CN114595693A (en) 2020-12-07 2020-12-07 Text emotion analysis method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011428365.6A CN114595693A (en) 2020-12-07 2020-12-07 Text emotion analysis method based on deep learning

Publications (1)

Publication Number Publication Date
CN114595693A (en) 2022-06-07

Family

ID=81803248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011428365.6A Pending CN114595693A (en) 2020-12-07 2020-12-07 Text emotion analysis method based on deep learning

Country Status (1)

Country Link
CN (1) CN114595693A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115811630A (en) * 2023-02-09 2023-03-17 成都航空职业技术学院 Education informatization method based on artificial intelligence
CN115811630B (en) * 2023-02-09 2023-05-02 成都航空职业技术学院 Education informatization method based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN109740148B (en) Text emotion analysis method combining BiLSTM with Attention mechanism
CN111241837B (en) Theft case legal document named entity identification method based on anti-migration learning
CN107862087B (en) Emotion analysis method and device based on big data and deep learning and storage medium
CN110287320A (en) A kind of deep learning of combination attention mechanism is classified sentiment analysis model more
CN111914096A (en) Public transport passenger satisfaction evaluation method and system based on public opinion knowledge graph
CN110287323B (en) Target-oriented emotion classification method
CN113051916B (en) Interactive microblog text emotion mining method based on emotion offset perception in social network
CN108563638B (en) Microblog emotion analysis method based on topic identification and integrated learning
CN107688870B (en) Text stream input-based hierarchical factor visualization analysis method and device for deep neural network
CN113505204B (en) Recall model training method, search recall device and computer equipment
CN110929034A (en) Commodity comment fine-grained emotion classification method based on improved LSTM
CN111259153B (en) Attribute-level emotion analysis method of complete attention mechanism
CN112905739B (en) False comment detection model training method, detection method and electronic equipment
CN109101490B (en) Factual implicit emotion recognition method and system based on fusion feature representation
CN112069320B (en) Span-based fine-grained sentiment analysis method
CN112256866A (en) Text fine-grained emotion analysis method based on deep learning
CN112101040A (en) Ancient poetry semantic retrieval method based on knowledge graph
CN115952292B (en) Multi-label classification method, apparatus and computer readable medium
CN116737922A (en) Tourist online comment fine granularity emotion analysis method and system
CN116680363A (en) Emotion analysis method based on multi-mode comment data
CN113704459A (en) Online text emotion analysis method based on neural network
CN112862569B (en) Product appearance style evaluation method and system based on image and text multi-modal data
CN114356990A (en) Base named entity recognition system and method based on transfer learning
CN112925983A (en) Recommendation method and system for power grid information
CN114595693A (en) Text emotion analysis method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination