CN110287320A - A deep learning multi-classification sentiment analysis model combining an attention mechanism - Google Patents
A deep learning multi-classification sentiment analysis model combining an attention mechanism
- Publication number
- CN110287320A CN110287320A CN201910553755.7A CN201910553755A CN110287320A CN 110287320 A CN110287320 A CN 110287320A CN 201910553755 A CN201910553755 A CN 201910553755A CN 110287320 A CN110287320 A CN 110287320A
- Authority
- CN
- China
- Prior art keywords
- word
- cnn
- text
- model
- feature extraction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The present invention relates to a deep learning multi-classification sentiment analysis model combining an attention mechanism, and belongs to the field of natural language processing. The invention analyzes the weaknesses of existing CNN and LSTM networks in text sentiment analysis and proposes a deep learning multi-classification sentiment analysis model that combines an attention mechanism. The model uses an attention mechanism to fuse the local features extracted by the CNN network with the word-order features extracted by the LSTM model, and applies the idea of an ensemble model at the classification layer: the sentiment features extracted by the CNN network and the LSTM network are concatenated to form the final sentiment features extracted by the model. Comparative experiments show that the accuracy of the model is significantly improved.
Description
Technical Field
The invention belongs to the field of text information processing, and relates to a deep learning multi-classification emotion analysis model combining an attention mechanism.
Background
With the continuous rise of social networks such as microblogs and Twitter, the internet is not only a source of daily information but also an indispensable platform for people to express their opinions. As people comment on hot events, review films, and describe product experiences in online communities, a large amount of text with emotional color (joy, anger, sadness, and so on) is generated. Effective sentiment analysis of this text gives a better picture of users' interests and concerns. However, as attention to network information grows, online communities generate massive amounts of emotionally colored text every day, and manual labeling alone is far from sufficient for the task; text sentiment analysis has therefore become a research hotspot in the field of natural language processing.
With the successful application of deep learning in computer vision, more and more deep learning techniques are also being applied to natural language processing. The advantages of deep learning are that it can automatically extract features from text and has strong expressive power on big data. At present, the mainstream deep-learning methods for text sentiment analysis are the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN); sentiment analysis models based on these two methods suffer from low accuracy, mainly for the following reasons:
Firstly, in text sentiment analysis a convolutional neural network effectively captures sentiment information at different positions by enlarging the convolution kernel, thereby obtaining local sentiment features of the text. In the process of convolution, however, the context between word sequences is ignored. In text sentiment analysis the precedence relationship of the word order is very important, and without this word-order feature information the results show a certain deviation.
Secondly, a recurrent neural network effectively models the sequential nature of text through its forward and backward dependencies and can extract the word-order relationships and semantic information of the text, so it can achieve good results in text sentiment analysis. However, when the samples are long or the language scene is complex, the useful span of sentiment information varies greatly in position and length, and the performance of Long Short-Term Memory (LSTM) networks is also limited.
The invention makes full use of the attention mechanism, the CNN network, and the LSTM network, and proposes and implements a deep learning multi-classification emotion analysis model combining an attention mechanism. The model can effectively improve the accuracy of text emotion analysis.
Disclosure of Invention
The invention provides a deep learning multi-classification emotion analysis model based on an attention mechanism. The model combines a CNN network and an LSTM network to fuse emotional features. First, the local features of the text to be analyzed are extracted with the multi-scale convolution kernels of a CNN network; the local features extracted by the CNN are then fused into an LSTM network through an attention mechanism. Finally, following the idea of an ensemble model, the pooling-layer result of the CNN network and the feature extraction result of the LSTM network are concatenated as the final model output. Experiments show that the accuracy of the model in text emotion analysis is remarkably improved.
In order to achieve the purpose, the invention adopts the following technical scheme:
1. A deep learning multi-classification emotion analysis method combining an attention mechanism, characterized by comprising the following steps:
step (1) data preprocessing
Let the emotion data set be expressed as G = [(segtxt_1, y_1), (segtxt_2, y_2), ..., (segtxt_N, y_N)], where segtxt_i denotes the i-th sample, y_i its corresponding emotion category label, and N the number of samples in data set G. The samples in G are subjected to data preprocessing.
The preprocessed data set is denoted G' = [(seg_1, y_1), (seg_2, y_2), ..., (seg_M, y_M)], where seg_i is the i-th sample in data set G', y_i its corresponding emotion category label, and M the number of samples in data set G';
step (2) input of the constructed model
For any sample (seg, y) to be analyzed in data set G', it is further detailed as:
seg = [w_1, w_2, w_3, ..., w_d]^T (1)
y = [0, 0, 1, ..., 0] (2)
where w_i ∈ R^ε is the one-hot encoding of the i-th word of the text to be analyzed according to the vocabulary wordList, ε is the size of the vocabulary wordList, and d is the sentence length of the text. y ∈ R^p is the one-hot encoding of the emotion category, and p is the number of classes the model is to distinguish. The word-vector embedding matrix of the sample can then be represented as:
X = seg * E^T (3)
where X ∈ R^(d×m), X = [x_1, x_2, ..., x_d]^T is the word-vector matrix representation of the text to be analyzed, m is the dimension of the word vectors, x_i ∈ R^m is the word vector of the i-th word in the text, and E denotes the word-vector embedding layer;
step (3) constructing a deep learning multi-classification emotion analysis model
The deep learning multi-classification emotion analysis model comprises a CNN-based local feature extraction stage and an LSTM-based word-order feature extraction stage. The pooling-layer result C_Cnn of the CNN-based local feature extraction stage and the result C'_Rnn of the LSTM-based word-order feature extraction stage are concatenated, i.e. the vector [C_Cnn; C'_Rnn] is taken as the feature vector finally extracted by the model. The feature vector [C_Cnn; C'_Rnn] is then passed through a fully connected layer to obtain the final model output vector ŷ ∈ R^p, where p is the number of classes the model is to distinguish.
The local feature extraction stage based on the CNN network comprises the following contents:
inputting a word vector matrix representation X of the text to be analyzed of a formula 3 in a local feature extraction stage;
the local feature extraction stage is based on a CNN network and comprises two layers in total, namely a convolutional layer and a pooling layer, wherein:
The convolution layer convolves the text to be analyzed with n convolution kernels of different scales, with k filters (i.e. k neurons) per scale;
In the pooling layer, the vectors obtained by convolution are down-sampled with max pooling, selecting the locally optimal features; each filter is thus reduced by the max-pooling layer to a single scalar, which represents the optimal emotional feature found by that filter;
The output of the local feature extraction module is C_Cnn = [c_1, c_2, ..., c_nk]: the optimal features selected in the pooling layer by the filters of different sizes are concatenated into C_Cnn, which is taken as the output of this module, where C_Cnn ∈ R^(nk) and nk is the total number of filters in the convolution layer;
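A minimal sketch of this stage is given below (PyTorch), assuming the scale set [2, 3, 4, 5] and k = 128 filters per scale stated in the embodiment; the ReLU activation matches formula (8) later in the text.

```python
import torch
import torch.nn as nn

class LocalFeatureCNN(nn.Module):
    """CNN-based local feature extraction: n convolution scales,
    k filters per scale, followed by global max pooling."""
    def __init__(self, m=256, scales=(2, 3, 4, 5), k=128):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(in_channels=m, out_channels=k, kernel_size=s)
            for s in scales)

    def forward(self, X):                        # X: (batch, d, m)
        X = X.transpose(1, 2)                    # Conv1d expects (batch, m, d)
        pooled = []
        for conv in self.convs:
            z = torch.relu(conv(X))              # (batch, k, d-s+1)
            pooled.append(z.max(dim=2).values)   # one scalar per filter
        return torch.cat(pooled, dim=1)          # C_Cnn: (batch, n*k)

C_cnn = LocalFeatureCNN()(torch.randn(8, 64, 256))
print(C_cnn.shape)                               # torch.Size([8, 512])
```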
The LSTM-based word-order feature extraction stage comprises the following contents:
Multi-scale CNN local feature extraction: the convolution results of the k filters of each convolution scale in the CNN-based local feature extraction stage are concatenated to obtain the set Z_Cnn; each vector Z_i in Z_Cnn is then fed into a GLU mechanism (a gated convolutional network), and the results are denoted {π_1, π_2, ..., π_n}, completing the multi-scale CNN local feature extraction,
where Z_Cnn = {Z_1, Z_2, ..., Z_n} and Z_i is the concatenation of the convolution results of the k filters of scale i;
π_i = (Z_i · W_1 + b_1) ⊗ σ(Z_i · W_2 + b_2)
where Z_i denotes the concatenation of the convolution results of the k filters of one scale, W_1, W_2 ∈ R^(λ×q) are weight matrices, λ is the dimension of the corresponding weight matrix, b_1, b_2 ∈ R^q are offsets, σ denotes the sigmoid function, π_i ∈ R^q, and q is the output dimension of the LSTM network;
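A sketch of the GLU step might look as follows (PyTorch); the input dimension λ is illustrative here, since it depends on how the k convolution results are concatenated into Z_i:

```python
import torch
import torch.nn as nn

class GLU(nn.Module):
    """Gated linear unit: pi_i = (Z_i·W1 + b1) ⊗ σ(Z_i·W2 + b2)."""
    def __init__(self, lam, q):
        super().__init__()
        self.linear = nn.Linear(lam, q)   # W1, b1
        self.gate = nn.Linear(lam, q)     # W2, b2

    def forward(self, Z_i):
        return self.linear(Z_i) * torch.sigmoid(self.gate(Z_i))

glu = GLU(lam=1024, q=256)                # lam is a placeholder dimension
pi_i = glu(torch.randn(8, 1024))          # pi_i ∈ R^q per sample
```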
The local feature extraction results {π_1, π_2, ..., π_n} of the multi-scale CNN network are then integrated into the LSTM network through an attention mechanism, yielding the output C'_Rnn of the LSTM-based word-order feature extraction stage, i.e.
C'_Rnn = [h_d^fw ; h_1^bw]
where h_d^fw is the output of the LSTM module corresponding to the last word of the text to be analyzed in the forward direction, and h_1^bw is the output of the LSTM module corresponding to the first word in the backward direction; the invention adopts a bidirectional LSTM, i.e. a BiLSTM model.
Forward propagation is computed as follows:
d is the length of the text to be analyzed, and each word of the text corresponds in turn to one LSTM module.
In forward propagation, let the output of the (t-1)-th LSTM module be h_(t-1). The output h_t of the t-th LSTM module is then computed as:
e_(t,i) = h_(t-1) · π_i
where e_(t,i) is the dot product of the two vectors, also called the scoring function, used to measure the similarity between the output h_(t-1) of the LSTM for the previous word and the current local feature vector;
α_(t,i) = exp(e_(t,i)) / Σ_j exp(e_(t,j))
where α_(t,i) ∈ R is the weight of feature π_i;
s_(t-1) = Σ_i α_(t,i) · π_i
where s_(t-1) ∈ R^q is the weighted combination of the convolution features; s_(t-1) is used in place of h_(t-1) and, combined with the word vector x_t of the current word, is used to compute the output h_t of the current LSTM module:
h_t = LSTM(s_(t-1), x_t)
Backward propagation follows the same calculation as forward propagation and is not repeated here;
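The forward recurrence above can be sketched as follows (PyTorch). This is one reading of the patent's wiring, in which the attention context s_(t-1) replaces h_(t-1) as the recurrent state fed to the LSTM cell together with the current word vector x_t; it is an illustrative sketch, not the authoritative implementation.

```python
import torch
import torch.nn as nn

def attentive_lstm_step(cell, h_prev, c_prev, x_t, pi):
    """One forward step: score h_{t-1} against the local features pi,
    softmax into weights alpha, form the context s_{t-1}, and compute
    h_t = LSTM(s_{t-1}, x_t)."""
    e = pi @ h_prev.squeeze(0)                          # scores e_{t,i}, shape (n,)
    alpha = torch.softmax(e, dim=0)                     # weights alpha_{t,i}
    s = (alpha.unsqueeze(1) * pi).sum(0, keepdim=True)  # s_{t-1}: (1, q)
    # s_{t-1} is used in place of h_{t-1} as the recurrent state.
    return cell(x_t.unsqueeze(0), (s, c_prev))

q, m, n = 256, 256, 4                              # embodiment dimensions
cell = nn.LSTMCell(input_size=m, hidden_size=q)
h, c = torch.zeros(1, q), torch.zeros(1, q)
pi = torch.randn(n, q)                             # local features pi_1..pi_n
for x_t in torch.randn(64, m):                     # one pass over d = 64 words
    h, c = attentive_lstm_step(cell, h, c, x_t, pi)
```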
Step (4), model training: the training data are input into the multi-classification emotion analysis model, the parameters are adjusted with a cross-entropy loss function combined with the back-propagation (BP) algorithm, and softmax regression is used as the classification algorithm to complete the training;
Step (5), model analysis: the text to be analyzed is input into the trained model, which finally outputs the emotion classification result of the analyzed text.
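Steps (4) and (5) can be sketched as follows (PyTorch). The patent specifies cross-entropy loss, back-propagation, and softmax classification; the optimiser choice and the `loader` / `model` objects here are assumptions for illustration.

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=10, lr=1e-3):
    """Step (4): cross-entropy loss minimised by back-propagation.
    The Adam optimiser is an assumption; the patent only names BP."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    # CrossEntropyLoss applies the softmax internally, so the fully
    # connected layer outputs raw scores over the p classes.
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for X, y in loader:               # word indices, integer labels
            opt.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()               # back-propagation (BP)
            opt.step()

def analyse(model, X):
    """Step (5): emotion category of a preprocessed text."""
    with torch.no_grad():
        return model(X).argmax(dim=1)     # predicted class index
```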
The preprocessing process comprises the following steps:
1) Word segmentation, stop-word removal, conversion of English uppercase to lowercase, and conversion of traditional Chinese characters to simplified Chinese.
2) Select the words whose frequency in data set G is greater than or equal to σ, and construct the vocabulary wordList = {word_1, word_2, ..., word_ε}, where word_i is the i-th word in the vocabulary wordList and ε is the total number of words in data set G whose frequency reaches σ.
3) For each sample in data set G, if its length is greater than d the sample is deleted, and if its length is less than d it is padded with the symbol </>.
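A sketch of these three steps in Python follows; `tokenize`, `STOPWORDS` and `to_simplified` are placeholders for a word segmenter, a stop-word list and a traditional-to-simplified converter, none of which the patent names.

```python
from collections import Counter

def preprocess(samples, sigma=2, d=64, pad="</>"):
    """samples: iterable of (text, label). Returns the vocabulary
    wordList and the cleaned, fixed-length samples."""
    tokenized = []
    for text, label in samples:
        # step 1): segment, drop stop words, lowercase, simplify
        words = [w.lower() for w in tokenize(to_simplified(text))
                 if w not in STOPWORDS]
        tokenized.append((words, label))
    # step 2): vocabulary of words with frequency >= sigma
    freq = Counter(w for words, _ in tokenized for w in words)
    word_list = [w for w, n in freq.items() if n >= sigma]
    # step 3): samples longer than d are deleted; shorter ones padded
    cleaned = [(words + [pad] * (d - len(words)), label)
               for words, label in tokenized if len(words) <= d]
    return word_list, cleaned
```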
The convolution-layer calculation formula of the CNN-based local feature extraction module is:
z = f(Σ W^T * x_(i:i+s-1) + b) (8)
where z is the feature vector obtained by convolving one neuron over the text to be analyzed, f(·) is the activation function, W ∈ R^(s×m) is the weight matrix of the neuron (parameters are shared within the same neuron), s×m is the size of the convolution kernel, b is the bias, and x_(i:i+s-1) denotes the word vectors of the i-th to (i+s-1)-th words in the text sentence.
The training data is preprocessed data.
The convolution layer of the CNN-based local feature extraction stage adopts 4 convolution kernels of different scales. Training ends when the accuracy no longer changes or the set number of iterations is reached.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a structural diagram of a deep learning multi-classification emotion analysis model combined with an attention mechanism.
Detailed Description
The following describes the embodiments of the present invention in further detail with reference to the drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The method provided by the invention is realized by the following steps in sequence:
step (1) data preprocessing
The emotion corpus data set is expressed as G = [(segtxt_1, y_1), (segtxt_2, y_2), ..., (segtxt_N, y_N)], where segtxt_i denotes the i-th sample and y_i its corresponding emotion category label. N is the number of samples in data set G; the emotion labels fall into four categories (joy, anger, disgust, and depression), and N = 80000, with 20000 samples per category. Data preprocessing of the samples in G comprises the following steps:
1) Word segmentation, stop-word removal, conversion of English uppercase to lowercase, and conversion of traditional Chinese characters to simplified Chinese.
2) Select the words whose frequency in data set G is greater than or equal to σ, and construct the vocabulary wordList = {word_1, word_2, ..., word_ε}, where word_i is the i-th word in the vocabulary and ε is the total number of words in data set G whose frequency reaches σ. With σ = 2, the final data set G contains 41763 words of frequency ≥ 2, i.e. ε = 41763.
3) After the above processing, for each sample in data set G, if its length is greater than d the sample is deleted, and if its length is less than d it is padded with the symbol </>; d is set to 64.
The preprocessed data set is denoted G' = [(seg_1, y_1), (seg_2, y_2), ..., (seg_M, y_M)], where seg_i is the i-th sample in data set G' and y_i its corresponding emotion category label; M is the number of samples in data set G'. The final data set G' contains 73150 samples; the number of samples per emotion category is shown in Table 1:
TABLE 1. Number of samples of each class after preprocessing (table content not reproduced in this text)
Step (2) constructing the model input
For any sample (seg, y) to be analyzed in data set G', it is further detailed as:
seg = [w_1, w_2, w_3, ..., w_d]^T (1)
y = [0, 0, 1, ..., 0] (2)
where w_i ∈ R^ε is the one-hot encoding of the i-th word of the text to be analyzed according to the vocabulary wordList, ε is the size of the vocabulary wordList, and the sentence length d of the text is 64. y ∈ R^p is the one-hot encoding of the emotion category, p is the number of classes the model is to distinguish, and p = 4. The word-vector embedding matrix of the sample can then be represented as:
X = seg * E^T (3)
where X ∈ R^(d×m), X = [x_1, x_2, ..., x_d]^T is the word-vector matrix representation of the text to be analyzed, and the word-vector dimension m is 256. x_i ∈ R^m is the word-vector representation of the i-th word in the text, and E denotes the word-vector embedding layer, for which the Wikipedia open-source word2vec word vectors are adopted; X is then used as the input of the network model.
Step (3) constructing a deep learning multi-classification emotion analysis model
The deep learning multi-classification emotion analysis model comprises a CNN-based local feature extraction stage and an LSTM-based word-order feature extraction stage. The pooling-layer result C_Cnn of the CNN-based local feature extraction stage and the result C'_Rnn of the LSTM-based word-order feature extraction stage are concatenated, i.e. the vector [C_Cnn; C'_Rnn] is taken as the feature vector finally extracted by the model. The feature vector [C_Cnn; C'_Rnn] is then passed through a fully connected layer to obtain the final model output vector ŷ ∈ R^p, where p is the number of classes the model is to distinguish.
The local feature extraction stage based on the CNN network comprises the following contents:
inputting a word vector matrix representation X of the text to be analyzed of a formula 3 in a local feature extraction stage;
the local feature extraction stage is based on a CNN network and comprises two layers in total, namely a convolutional layer and a pooling layer, wherein:
The convolution layer convolves the text to be analyzed with n convolution kernels of different scales, with k filters (i.e. k neurons) per scale; in the invention n and k are 4 and 128 respectively.
In the pooling layer, the vector obtained by convolution is down-sampled by adopting a maximum pooling layer method, and local optimal features are selected, so that each filter becomes a scalar through the maximum pooling layer, and the scalar represents the optimal emotional features in the filter;
The output of the local feature extraction module is C_Cnn = [c_1, c_2, ..., c_nk]: the optimal features selected in the pooling layer by the filters of different sizes are concatenated into C_Cnn, which is taken as the output of this module, where C_Cnn ∈ R^(nk) and nk, the total number of filters in the convolution layer, is 512;
The LSTM-based word-order feature extraction stage comprises the following contents:
Multi-scale CNN local feature extraction: the convolution results of the k filters of each convolution scale in the CNN-based local feature extraction stage are concatenated to obtain the set Z_Cnn; each vector Z_i in Z_Cnn is then fed into a GLU mechanism (a gated convolutional network), and the results are denoted {π_1, π_2, ..., π_n}, completing the multi-scale CNN local feature extraction,
where Z_Cnn = {Z_1, Z_2, ..., Z_n} and Z_i is the concatenation of the convolution results of the k filters of scale i;
π_i = (Z_i · W_1 + b_1) ⊗ σ(Z_i · W_2 + b_2)
where Z_i denotes the concatenation of the convolution results of the k filters of one scale, W_1, W_2 ∈ R^(λ×q) are weight matrices, λ is the dimension of the corresponding weight matrix, b_1, b_2 ∈ R^q are offsets, σ denotes the sigmoid function, π_i ∈ R^q, and q, the output dimension of the LSTM network, is 256;
The local feature extraction results {π_1, π_2, ..., π_n} of the multi-scale CNN network are then integrated into the LSTM network through an attention mechanism, yielding the output C'_Rnn of the LSTM-based word-order feature extraction stage, i.e.
C'_Rnn = [h_d^fw ; h_1^bw]
where h_d^fw is the output of the LSTM module corresponding to the last word of the text to be analyzed in the forward direction, and h_1^bw is the output of the LSTM module corresponding to the first word in the backward direction; the invention adopts a bidirectional LSTM, i.e. a BiLSTM model.
Forward propagation is computed as follows:
d is the length of the text to be analyzed, and each word of the text corresponds in turn to one LSTM module.
In forward propagation, let the output of the (t-1)-th LSTM module be h_(t-1). The output h_t of the t-th LSTM module is then computed as:
e_(t,i) = h_(t-1) · π_i
where e_(t,i) is the dot product of the two vectors, also called the scoring function, used to measure the similarity between the output h_(t-1) of the LSTM for the previous word and the current local feature vector;
α_(t,i) = exp(e_(t,i)) / Σ_j exp(e_(t,j))
where α_(t,i) ∈ R is the weight of feature π_i;
s_(t-1) = Σ_i α_(t,i) · π_i
where s_(t-1) ∈ R^q is the weighted combination of the convolution features; s_(t-1) is used in place of h_(t-1) and, combined with the word vector x_t of the current word, is used to compute the output h_t of the current LSTM module:
h_t = LSTM(s_(t-1), x_t)
Backward propagation follows the same calculation as forward propagation and is not repeated here;
Step (4), model training: the training data are input into the multi-classification emotion analysis model, the parameters are adjusted with a cross-entropy loss function combined with the back-propagation (BP) algorithm, and softmax regression is used as the classification algorithm to complete the training.
Step (5), model analysis: the text to be analyzed is input into the trained model, which finally outputs the emotion classification result of the analyzed text.
The convolution-layer calculation formula of the CNN-based local feature extraction module is:
z = f(Σ W^T * x_(i:i+s-1) + b) (8)
where z is the feature vector obtained by convolving one neuron over the text to be analyzed, f(·) is the activation function, W ∈ R^(s×m) is the weight matrix of the neuron (parameters are shared within the same neuron), s×m is the size of the convolution kernel, b is the bias, and x_(i:i+s-1) denotes the word vectors of the i-th to (i+s-1)-th words in the text sentence; s takes the four different convolution sizes [2, 3, 4, 5], and f(·) is the ReLU activation function.
The training data is preprocessed data.
The convolution layer of the CNN-based local feature extraction stage adopts 4 convolution kernels of different scales. Training ends when the accuracy no longer changes or the set number of iterations is reached.
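Putting the embodiment's parameters together, an end-to-end sketch of the network might look as follows (PyTorch). The word-order branch is simplified to a plain BiLSTM here, with the per-step attention wiring sketched earlier omitted for brevity; it is an illustrative assembly, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class AttentionCnnLstm(nn.Module):
    """Sketch with the embodiment's values: d=64, m=256, scales
    [2,3,4,5], k=128 (nk=512), q=256, p=4."""
    def __init__(self, eps=41763, m=256, scales=(2, 3, 4, 5),
                 k=128, q=256, p=4):
        super().__init__()
        self.q = q
        self.embed = nn.Embedding(eps, m)
        self.convs = nn.ModuleList(nn.Conv1d(m, k, s) for s in scales)
        self.bilstm = nn.LSTM(m, q, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(len(scales) * k + 2 * q, p)

    def forward(self, seg):                          # seg: (batch, d) indices
        X = self.embed(seg)                          # (batch, d, m)
        Xc = X.transpose(1, 2)
        c_cnn = torch.cat([torch.relu(conv(Xc)).max(dim=2).values
                           for conv in self.convs], dim=1)  # (batch, nk)
        out, _ = self.bilstm(X)                      # (batch, d, 2q)
        # C'_Rnn: last forward output and first backward output
        c_rnn = torch.cat([out[:, -1, :self.q], out[:, 0, self.q:]], dim=1)
        return self.fc(torch.cat([c_cnn, c_rnn], dim=1))   # (batch, p)

logits = AttentionCnnLstm()(torch.randint(0, 41763, (8, 64)))
print(logits.shape)                                  # torch.Size([8, 4])
```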
1. Experimental analysis
In the testing stage, 2000 samples of each of the four emotion categories (joy, anger, disgust, and depression) are selected. Accuracy (Acc) is used as the evaluation index, the model parameters are kept unchanged during testing, and the test-set results are shown in Table 2:
TABLE 2. Comparison of emotion analysis results (table content not reproduced in this text)
Table 2 compares the test results of several models: experiment 1 is an ordinary single-scale CNN network model with convolution kernel size 3, experiment 2 is an ordinary LSTM network, and experiment 3 is the attention-based text emotion analysis model proposed herein.
Compared with the ordinary CNN and LSTM networks, the attention-based emotion analysis model proposed by the invention achieves a clearly higher accuracy, which shows that the method effectively extracts both the local feature information of the CNN network and the word-order feature information of the LSTM network, demonstrating its effectiveness.
Claims (6)
1. A deep learning multi-classification emotion analysis method combining an attention mechanism, characterized by comprising the following steps:
step (1) data preprocessing
Let the emotion data set be expressed as G = [(segtxt_1, y_1), (segtxt_2, y_2), ..., (segtxt_N, y_N)], where segtxt_i denotes the i-th sample, y_i its corresponding emotion category label, and N the number of samples in data set G; the samples in G are subjected to data preprocessing,
and the preprocessed data set is denoted G' = [(seg_1, y_1), (seg_2, y_2), ..., (seg_M, y_M)], where seg_i is the i-th sample in data set G', y_i its corresponding emotion category label, and M the number of samples in data set G';
step (2) input of the constructed model
For any sample (seg, y) to be analyzed in data set G', it is further detailed as:
seg = [w_1, w_2, ..., w_i, ..., w_d]^T (1)
y = [0, 0, 1, ..., 0] (2)
where w_i ∈ R^ε is the one-hot encoding of the i-th word of the text to be analyzed according to the vocabulary wordList, ε is the size of the vocabulary wordList, d is the sentence length of the text, y ∈ R^p is the one-hot encoding of the emotion category, and p is the number of classes the model is to distinguish; the word-vector embedding matrix of the sample can be represented as:
X = seg * E^T (3)
where X ∈ R^(d×m), X = [x_1, x_2, ..., x_d]^T is the word-vector matrix representation of the text to be analyzed, m is the dimension of the word vectors, x_i ∈ R^m is the word vector of the i-th word in the text, and E denotes the word-vector embedding layer;
step (3) constructing a deep learning multi-classification emotion analysis model
The deep learning multi-classification emotion analysis model comprises a CNN-based local feature extraction stage and an LSTM-based word-order feature extraction stage; the pooling-layer result C_Cnn of the CNN-based local feature extraction stage and the result C'_Rnn of the LSTM-based word-order feature extraction stage are concatenated, i.e. the vector [C_Cnn; C'_Rnn] is taken as the feature vector finally extracted by the model, and the feature vector [C_Cnn; C'_Rnn] is then passed through a fully connected layer to obtain the final model output vector ŷ ∈ R^p, where p is the number of classes the model is to distinguish,
the local feature extraction stage based on the CNN network comprises the following contents:
inputting a word vector matrix representation X of the text to be analyzed of a formula 3 in a local feature extraction stage;
the local feature extraction stage is based on a CNN network and comprises two layers in total, namely a convolutional layer and a pooling layer, wherein:
The convolution layer convolves the text to be analyzed with n convolution kernels of different scales, with k filters (i.e. k neurons) per scale;
In the pooling layer, the vectors obtained by convolution are down-sampled with max pooling, selecting the locally optimal features; each filter is thus reduced by the max-pooling layer to a single scalar, which represents the optimal emotional feature found by that filter;
The output of the local feature extraction module is C_Cnn = [c_1, c_2, ..., c_nk]: the optimal features selected in the pooling layer by the filters of different sizes are concatenated into C_Cnn, which is taken as the output of this module, where C_Cnn ∈ R^(nk) and nk is the total number of filters in the convolution layer;
The LSTM-based word-order feature extraction stage comprises the following contents:
Multi-scale CNN local feature extraction: the convolution results of the k filters of each convolution scale in the CNN-based local feature extraction stage are concatenated to obtain the set Z_Cnn; each vector Z_i in Z_Cnn is then fed into a GLU mechanism (a gated convolutional network), and the results are denoted {π_1, π_2, ..., π_n}, completing the multi-scale CNN local feature extraction,
where Z_Cnn = {Z_1, Z_2, ..., Z_n} and Z_i is the concatenation of the convolution results of the k filters of scale i;
π_i = (Z_i · W_1 + b_1) ⊗ σ(Z_i · W_2 + b_2)
where Z_i denotes the concatenation of the convolution results of the k filters of one scale, W_1, W_2 ∈ R^(λ×q) are weight matrices, λ is the dimension of the corresponding weight matrix, b_1, b_2 ∈ R^q are offsets, σ denotes the sigmoid function, π_i ∈ R^q, and q is the output dimension of the LSTM network;
The local feature extraction results {π_1, π_2, ..., π_n} of the multi-scale CNN network are then integrated into the LSTM network through an attention mechanism, yielding the output C'_Rnn of the LSTM-based word-order feature extraction stage, i.e.
C'_Rnn = [h_d^fw ; h_1^bw]
where h_d^fw is the output of the LSTM module corresponding to the last word of the text to be analyzed in the forward direction, and h_1^bw is the output of the LSTM module corresponding to the first word in the backward direction; a bidirectional LSTM, i.e. a BiLSTM model, is used,
Forward propagation is computed as follows:
d is the length of the text to be analyzed, and each word of the text corresponds in turn to one LSTM module.
In forward propagation, let the output of the (t-1)-th LSTM module be h_(t-1). The output h_t of the t-th LSTM module is then computed as:
e_(t,i) = h_(t-1) · π_i
where e_(t,i) is the dot product of the two vectors, also called the scoring function, used to measure the similarity between the output h_(t-1) of the LSTM for the previous word and the current local feature vector;
α_(t,i) = exp(e_(t,i)) / Σ_j exp(e_(t,j))
where α_(t,i) ∈ R is the weight of feature π_i;
s_(t-1) = Σ_i α_(t,i) · π_i
where s_(t-1) ∈ R^q is the weighted combination of the convolution features; s_(t-1) is used in place of h_(t-1) and, combined with the word vector x_t of the current word, is used to compute the output h_t of the current LSTM module:
h_t = LSTM(s_(t-1), x_t)
Backward propagation follows the same calculation as forward propagation and is not repeated here;
Step (4), model training: the training data are input into the multi-classification emotion analysis model, the parameters are adjusted with a cross-entropy loss function combined with the back-propagation (BP) algorithm, and softmax regression is used as the classification algorithm to complete the training;
Step (5), model analysis: the text to be analyzed is input into the trained model, which finally outputs the emotion classification result of the analyzed text.
2. The deep learning multi-classification emotion analysis method combining an attention mechanism as claimed in claim 1, wherein the preprocessing process comprises the following steps:
1) word segmentation, stop-word removal, conversion of English uppercase to lowercase, and conversion of traditional Chinese characters to simplified Chinese,
2) selecting the words whose frequency in data set G is greater than or equal to σ, and constructing the vocabulary wordList = {word_1, word_2, ..., word_ε}, where word_i is the i-th word in the vocabulary wordList and ε is the total number of words in data set G whose frequency reaches σ,
3) for each sample in data set G, if its length is greater than d the sample is deleted, and if its length is less than d it is padded with the symbol </>.
3. The deep learning multi-classification emotion analysis method combining an attention mechanism as claimed in claim 1, wherein the convolution-layer calculation formula of the CNN-based local feature extraction module is:
z = f(Σ W^T * x_(i:i+s-1) + b) (8)
where z is the feature vector obtained by convolving one neuron over the text to be analyzed, f(·) is the activation function, W ∈ R^(s×m) is the weight matrix of the neuron (parameters are shared within the same neuron), s×m is the size of the convolution kernel, b is the bias, and x_(i:i+s-1) denotes the word vectors of the i-th to (i+s-1)-th words in the text sentence.
4. The method as claimed in claim 1, wherein the training data is preprocessed data.
5. The method for deep learning multi-classification emotion analysis combined with attention mechanism as claimed in claim 1, wherein the convolutional layer of the local feature extraction stage based on the CNN network employs 4 convolutional kernels with different scales.
6. The method as claimed in claim 1, wherein the end condition of the training is that the accuracy is not changed or the set number of iterations is reached.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910553755.7A CN110287320B (en) | 2019-06-25 | 2019-06-25 | Deep learning multi-classification emotion analysis model combining attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910553755.7A CN110287320B (en) | 2019-06-25 | 2019-06-25 | Deep learning multi-classification emotion analysis model combining attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110287320A true CN110287320A (en) | 2019-09-27 |
CN110287320B CN110287320B (en) | 2021-03-16 |
Family
ID=68005491
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910553755.7A Active CN110287320B (en) | 2019-06-25 | 2019-06-25 | Deep learning multi-classification emotion analysis model combining attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287320B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108460089A (en) * | 2018-01-23 | 2018-08-28 | 哈尔滨理工大学 | Diverse characteristics based on Attention neural networks merge Chinese Text Categorization |
CN109670169A (en) * | 2018-11-16 | 2019-04-23 | 中山大学 | A kind of deep learning sensibility classification method based on feature extraction |
CN109710761A (en) * | 2018-12-21 | 2019-05-03 | 中国标准化研究院 | The sentiment analysis method of two-way LSTM model based on attention enhancing |
Non-Patent Citations (3)
Title |
---|
MING-HSIANG SU et al.: "LSTM-based Text Emotion Recognition Using Semantic and Emotional Word Vectors", 2018 First Asian Conference on Affective Computing and Intelligent Interaction |
THITITORN SENEEWONG NA AYUTTHAYA et al.: "Thai Sentiment Analysis via Bidirectional LSTM-CNN Model with Embedding Vectors and Sentic Features", 2018 International Joint Symposium on Artificial Intelligence and Natural Language Processing |
GUAN Pengfei et al.: "Attention-enhanced bidirectional LSTM for sentiment analysis", Journal of Chinese Information Processing |
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866113A (en) * | 2019-09-30 | 2020-03-06 | 浙江大学 | Text classification method based on sparse self-attention mechanism fine-tuning Bert model |
CN110866113B (en) * | 2019-09-30 | 2022-07-26 | 浙江大学 | Text classification method based on sparse self-attention mechanism fine-tuning burt model |
CN110855474B (en) * | 2019-10-21 | 2022-06-17 | 广州杰赛科技股份有限公司 | Network feature extraction method, device, equipment and storage medium of KQI data |
CN110855474A (en) * | 2019-10-21 | 2020-02-28 | 广州杰赛科技股份有限公司 | Network feature extraction method, device, equipment and storage medium of KQI data |
CN111079547A (en) * | 2019-11-22 | 2020-04-28 | 武汉大学 | Pedestrian moving direction identification method based on mobile phone inertial sensor |
CN111079985A (en) * | 2019-11-26 | 2020-04-28 | 昆明理工大学 | Criminal case criminal period prediction method based on BERT and fused with distinguishable attribute features |
CN111914084A (en) * | 2020-01-09 | 2020-11-10 | 北京航空航天大学 | Deep learning-based emotion label text generation and evaluation system |
CN111339768B (en) * | 2020-02-27 | 2024-03-05 | 携程旅游网络技术(上海)有限公司 | Sensitive text detection method, system, electronic equipment and medium |
CN111339768A (en) * | 2020-02-27 | 2020-06-26 | 携程旅游网络技术(上海)有限公司 | Sensitive text detection method, system, electronic device and medium |
WO2021174922A1 (en) * | 2020-03-02 | 2021-09-10 | 平安科技(深圳)有限公司 | Statement sentiment classification method and related device |
CN111291832A (en) * | 2020-03-11 | 2020-06-16 | 重庆大学 | Sensor data classification method based on Stack integrated neural network |
CN111402953A (en) * | 2020-04-02 | 2020-07-10 | 四川大学 | Protein sequence classification method based on hierarchical attention network |
CN111402953B (en) * | 2020-04-02 | 2022-05-03 | 四川大学 | Protein sequence classification method based on hierarchical attention network |
US20230160942A1 (en) * | 2020-04-22 | 2023-05-25 | Qingdao Topscomm Communication Co., Ltd | Fault arc signal detection method using convolutional neural network |
US11860216B2 (en) * | 2020-04-22 | 2024-01-02 | Qingdao Topscomm Communication Co., Ltd | Fault arc signal detection method using convolutional neural network |
CN111582397B (en) * | 2020-05-14 | 2023-04-07 | 杭州电子科技大学 | CNN-RNN image emotion analysis method based on attention mechanism |
CN111582397A (en) * | 2020-05-14 | 2020-08-25 | 杭州电子科技大学 | CNN-RNN image emotion analysis method based on attention mechanism |
CN111881262A (en) * | 2020-08-06 | 2020-11-03 | 重庆邮电大学 | Text emotion analysis method based on multi-channel neural network |
CN111881262B (en) * | 2020-08-06 | 2022-05-20 | 重庆邮电大学 | Text emotion analysis method based on multi-channel neural network |
CN112598065B (en) * | 2020-12-25 | 2023-05-30 | 天津工业大学 | Memory-based gating convolutional neural network semantic processing system and method |
CN112597279A (en) * | 2020-12-25 | 2021-04-02 | 北京知因智慧科技有限公司 | Text emotion analysis model optimization method and device |
CN112598065A (en) * | 2020-12-25 | 2021-04-02 | 天津工业大学 | Memory-based gated convolutional neural network semantic processing system and method |
CN112818123A (en) * | 2021-02-08 | 2021-05-18 | 河北工程大学 | Emotion classification method for text |
CN113268592A (en) * | 2021-05-06 | 2021-08-17 | 天津科技大学 | Short text object emotion classification method based on multi-level interactive attention mechanism |
CN113377901B (en) * | 2021-05-17 | 2022-08-19 | 内蒙古工业大学 | Mongolian text emotion analysis method based on multi-size CNN and LSTM models |
CN113377901A (en) * | 2021-05-17 | 2021-09-10 | 内蒙古工业大学 | Mongolian text emotion analysis method based on multi-size CNN and LSTM models |
CN113239199A (en) * | 2021-05-18 | 2021-08-10 | 重庆邮电大学 | Credit classification method based on multi-party data set |
CN113379818B (en) * | 2021-05-24 | 2022-06-07 | 四川大学 | Phase analysis method based on multi-scale attention mechanism network |
CN113379818A (en) * | 2021-05-24 | 2021-09-10 | 四川大学 | Phase analysis method based on multi-scale attention mechanism network |
CN113177111A (en) * | 2021-05-28 | 2021-07-27 | 中国人民解放军国防科技大学 | Chinese text emotion analysis method and device, computer equipment and storage medium |
CN114298025A (en) * | 2021-12-01 | 2022-04-08 | 国家电网有限公司华东分部 | Emotion analysis method based on artificial intelligence |
CN114547299A (en) * | 2022-02-18 | 2022-05-27 | 重庆邮电大学 | Short text sentiment classification method and device based on composite network model |
CN114662547A (en) * | 2022-04-07 | 2022-06-24 | 天津大学 | MSCRNN emotion recognition method and device based on electroencephalogram signals |
CN114897078A (en) * | 2022-05-19 | 2022-08-12 | 辽宁大学 | Short text similarity calculation method based on deep learning and topic model |
CN115116448A (en) * | 2022-08-29 | 2022-09-27 | 四川启睿克科技有限公司 | Voice extraction method, neural network model training method, device and storage medium |
CN115116448B (en) * | 2022-08-29 | 2022-11-15 | 四川启睿克科技有限公司 | Voice extraction method, neural network model training method, device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110287320B (en) | 2021-03-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110287320B (en) | Deep learning multi-classification emotion analysis model combining attention mechanism | |
CN107608956B (en) | Reader emotion distribution prediction algorithm based on CNN-GRNN | |
CN106650813B (en) | A kind of image understanding method based on depth residual error network and LSTM | |
CN111126386B (en) | Sequence domain adaptation method based on countermeasure learning in scene text recognition | |
CN108334605B (en) | Text classification method and device, computer equipment and storage medium | |
CN109241255B (en) | Intention identification method based on deep learning | |
CN108614875B (en) | Chinese emotion tendency classification method based on global average pooling convolutional neural network | |
CN107609009B (en) | Text emotion analysis method and device, storage medium and computer equipment | |
CN110059188B (en) | Chinese emotion analysis method based on bidirectional time convolution network | |
CN109933664B (en) | Fine-grained emotion analysis improvement method based on emotion word embedding | |
CN106980683B (en) | Blog text abstract generating method based on deep learning | |
CN110083700A (en) | A kind of enterprise's public sentiment sensibility classification method and system based on convolutional neural networks | |
CN109740148A (en) | A kind of text emotion analysis method of BiLSTM combination Attention mechanism | |
CN109284506A (en) | A kind of user comment sentiment analysis system and method based on attention convolutional neural networks | |
CN107818084B (en) | Emotion analysis method fused with comment matching diagram | |
CN106886580B (en) | Image emotion polarity analysis method based on deep learning | |
CN110414009B (en) | Burma bilingual parallel sentence pair extraction method and device based on BilSTM-CNN | |
CN111127146B (en) | Information recommendation method and system based on convolutional neural network and noise reduction self-encoder | |
CN107832663A (en) | A kind of multi-modal sentiment analysis method based on quantum theory | |
CN112364638B (en) | Personality identification method based on social text | |
CN110188195B (en) | Text intention recognition method, device and equipment based on deep learning | |
CN107247703A (en) | Microblog emotional analysis method based on convolutional neural networks and integrated study | |
CN110472245B (en) | Multi-label emotion intensity prediction method based on hierarchical convolutional neural network | |
CN110263174B (en) | Topic category analysis method based on focus attention | |
CN110046356B (en) | Label-embedded microblog text emotion multi-label classification method |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |