CN110287320A - Deep learning multi-classification emotion analysis model combining an attention mechanism - Google Patents

Deep learning multi-classification emotion analysis model combining an attention mechanism Download PDF

Info

Publication number
CN110287320A
Authority
CN
China
Prior art keywords
word
cnn
text
model
feature extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910553755.7A
Other languages
Chinese (zh)
Other versions
CN110287320B (en)
Inventor
刘磊
孙应红
陈浩
李静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN201910553755.7A priority Critical patent/CN110287320B/en
Publication of CN110287320A publication Critical patent/CN110287320A/en
Application granted granted Critical
Publication of CN110287320B publication Critical patent/CN110287320B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a deep learning multi-classification emotion analysis model combining an attention mechanism, and belongs to the technical field of natural language processing. The invention analyzes the weaknesses of existing CNN and LSTM networks in text emotion analysis and proposes a deep learning multi-classification emotion analysis model that combines an attention mechanism. The model uses the attention mechanism to fuse the local features extracted by the CNN network with the word-order features extracted by the LSTM network, and adopts the idea of an ensemble model at the classification layer: the emotional features extracted by the CNN network and by the LSTM network are concatenated to form the emotional features finally extracted by the model. Comparative experiments show that the accuracy of the model is significantly improved.

Description

Deep learning multi-classification emotion analysis model combining attention mechanism
Technical Field
The invention belongs to the field of text information processing, and relates to a deep learning multi-classification emotion analysis model combined with an attention mechanism.
Background
With the continuous rise of social networks such as microblogs and Twitter, the internet is not only a source of daily information but also an indispensable platform for people to express their opinions. When people comment on trending events, review films, and describe product experiences in online communities, a large amount of text with emotional color (such as joy, anger, and sadness) is generated; effective emotion analysis of this text gives better insight into users' interests and concerns. However, as attention to online information grows, online communities produce massive volumes of emotional text every day, far more than manual annotation alone can handle, so text emotion analysis has become a research hotspot in the field of natural language processing.
Following the successful application of deep learning in computer vision, more and more deep learning techniques are also being applied to natural language processing. The advantages of deep learning are that it can automatically extract textual features and has strong expressive power on big data. The current mainstream deep-learning methods for text emotion analysis are the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN); emotion analysis models based on these two methods have limited accuracy, mainly for the following reasons:
firstly, in text emotion analysis a convolutional neural network captures emotional information at different positions by enlarging the convolution kernel, thereby obtaining local emotional features of the text. During convolution, however, the contextual relationship between word sequences is ignored. Since the order of words matters greatly in text emotion analysis, results obtained without word-order feature information show a certain bias.
Secondly, a recurrent neural network models the sequential nature of text data through forward and backward dependencies and can extract the word-order relationships and semantic information of the text, so it can perform well in text emotion analysis. However, when the samples are long or the language scenario is complex, the useful span of emotional information varies in size and length, and the performance of the Long Short-Term Memory (LSTM) network is also limited.
The invention makes full use of the attention mechanism, the CNN network, and the LSTM network, and proposes and implements a deep learning multi-classification emotion analysis model combining an attention mechanism. The model can effectively improve the accuracy of text emotion analysis.
Disclosure of Invention
The invention provides a deep learning multi-classification emotion analysis model based on an attention mechanism. The model combines a CNN network and an LSTM network for emotional feature fusion. First, the local features of the text to be analyzed are extracted with the multi-scale convolution kernels of the CNN network; the attention mechanism then fuses the local features extracted by the CNN network into an LSTM network. Finally, following the idea of an ensemble model, the pooling-layer result of the CNN network is concatenated with the feature extraction result of the LSTM network as the final model output. Experiments show that the model significantly improves the accuracy of text emotion analysis. A minimal sketch of this architecture follows.
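To make the data flow concrete, the following PyTorch sketch outlines the architecture just described under assumed hyperparameters; it is illustrative only, and it substitutes a plain BiLSTM for the attention-fused BiLSTM detailed in step (3) below.

import torch
import torch.nn as nn

class CnnLstmAttentionModel(nn.Module):
    # Minimal sketch of the model's data flow; names and sizes are assumptions.
    def __init__(self, vocab_size, emb_dim=256, num_filters=128,
                 kernel_sizes=(2, 3, 4, 5), lstm_dim=256, num_classes=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # One 1-D convolution per kernel scale: the multi-scale local features.
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, num_filters, ks) for ks in kernel_sizes)
        self.bilstm = nn.LSTM(emb_dim, lstm_dim, bidirectional=True,
                              batch_first=True)
        nk = num_filters * len(kernel_sizes)          # total number of filters
        self.fc = nn.Linear(nk + 2 * lstm_dim, num_classes)

    def forward(self, tokens):                        # tokens: (batch, d)
        x = self.embedding(tokens)                    # (batch, d, m)
        conv_in = x.transpose(1, 2)                   # channels-first for Conv1d
        # Max-pool each filter's feature map to one scalar and concatenate: C_Cnn.
        c_cnn = torch.cat(
            [conv(conv_in).relu().max(dim=2).values for conv in self.convs],
            dim=1)
        # Simplification: a plain BiLSTM stands in for the attention-fused
        # BiLSTM (C'_Rnn) that step (3) constructs.
        _, (h_n, _) = self.bilstm(x)
        c_rnn = torch.cat([h_n[0], h_n[1]], dim=1)    # [forward h_d ; backward h_1]
        return self.fc(torch.cat([c_cnn, c_rnn], dim=1))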
In order to achieve the purpose, the invention adopts the following technical scheme:
1. a deep learning multi-classification emotion analysis method combined with an attention mechanism is characterized by comprising the following steps:
step (1) data preprocessing
Let the emotion data set be expressed as G = [(segtxt_1, y_1), (segtxt_2, y_2), ..., (segtxt_N, y_N)], where segtxt_i denotes the i-th sample and y_i the corresponding emotion category label, and N is the number of samples in data set G; the samples in G are subjected to data preprocessing,
the preprocessed data set is denoted G' = [(seg_1, y_1), (seg_2, y_2), ..., (seg_M, y_M)], where seg_i is the i-th sample in G', y_i is the corresponding emotion category label, and M is the number of samples in data set G';
step (2) input of the constructed model
For any sample data to be analyzed (seg, y) in the data set G', it is further detailed as:

seg = [w_1, w_2, w_3, ..., w_d]^T   (1)

y = [0, 0, 1, ..., 0]   (2)

where w_i ∈ R^ε is the one-hot encoding of the i-th word of the text to be analyzed according to the vocabulary wordList, ε is the size of wordList, and d is the sentence length of the text; y ∈ R^p is the one-hot encoding of the emotion category, and p is the number of classes the model distinguishes. The word-vector embedding matrix of the sample can be represented as:

X = seg · E^T   (3)

where X ∈ R^{d×m} and X = [x_1, x_2, ..., x_d]^T is the word-vector matrix representation of the text to be analyzed, m is the dimension of the word vectors, x_i ∈ R^m is the word vector of the i-th word in the text, and E is the word-vector embedding layer;
step (3) constructing a deep learning multi-classification emotion analysis model
The deep learning multi-classification emotion analysis model comprises a local feature extraction stage based on the CNN network and a word-order relation feature extraction stage based on the LSTM network. The pooling-layer result C_Cnn of the CNN-based local feature extraction stage and the result C'_Rnn of the LSTM-based word-order relation feature extraction stage are concatenated, i.e. the vector [C_Cnn; C'_Rnn] serves as the feature vector finally extracted by the model. The feature vector [C_Cnn; C'_Rnn] then passes through a fully connected layer to obtain the final model output vector ŷ ∈ R^p, where p is the number of classes the model distinguishes.
The local feature extraction stage based on the CNN network comprises the following contents:
the input of the local feature extraction stage is the word-vector matrix representation X of the text to be analyzed from equation (3);
the local feature extraction stage is based on a CNN network and comprises two layers in total, namely a convolutional layer and a pooling layer, wherein:
the convolution layer uses n convolution kernels of different scales to convolve the text to be analyzed, with k filters (i.e. k neurons) per kernel scale;
in the pooling layer, the vectors obtained by convolution are down-sampled by max pooling and the locally optimal features are selected, so that each filter is reduced by the max-pooling layer to a scalar representing the best emotional feature captured by that filter;
the output of the local feature extraction module is C_Cnn = [c_1, c_2, ..., c_{nk}]: the optimal features selected by the filters of different sizes in the pooling layer are concatenated into C_Cnn = [c_1, c_2, ..., c_{nk}] as the output of this module, where C_Cnn ∈ R^{nk} and nk is the total number of filters in the convolutional layer;
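A minimal sketch of this stage, assuming PyTorch and the filter counts given later in the embodiment (n = 4 scales, k = 128 filters per scale):

import torch
import torch.nn as nn

d, m, k = 64, 256, 128                       # sentence length, word-vector dim, filters
scales = [2, 3, 4, 5]                        # n = 4 convolution kernel scales
X = torch.randn(1, m, d)                     # word-vector matrix, channels-first

convs = nn.ModuleList(nn.Conv1d(m, k, s) for s in scales)
pooled = [conv(X).relu().max(dim=2).values   # max pooling: one scalar per filter
          for conv in convs]
C_cnn = torch.cat(pooled, dim=1)             # C_Cnn, shape (1, n*k) = (1, 512)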
the language order relation feature extraction stage based on the LSTM network comprises the following contents:
multi-scale CNN network local feature extraction: k filtering of convolution layer with same convolution scale in partial feature extraction stage based on CNN networkSplicing convolution results of the devices to obtain a set ZCnnThen set ZCnnEach vector Z in (a)iInputting the result into a GLU mechanism, namely, gating a convolution network, and marking the obtained result as { pi12,...,πnAnd finishing the extraction of the local features of the multi-scale CNN network.
Wherein Z isCnn={Z1,Z2,...,Zn},ZiSplicing convolution results of a plurality of filters with the scale i;
wherein,Zistitching of convolution results of k filters representing a certain scale, W1,W2∈Rλ×qB being a weight matrix, λ representing the dimension of the corresponding weight matrix1,b2∈RqFor the offset, σ denotes the sigmoid function, πi∈RqQ is the output dimension of the LSTM network;
then, extracting the local feature extraction result { pi ] of the multi-scale CNN network by using an attention mechanism12,...,πnIntegrating the language sequence relationship feature extraction stage into an LSTM network to obtain an output result C 'of the language sequence relationship feature extraction stage based on the LSTM network'RnnI.e. by
Wherein,represents the output of the LSTM module corresponding to the last word in the text to be analyzed,representing the output of the LSTM module corresponding to the first word in the text to be analyzed, the invention uses a bi-directional LSTM model, i.e. a BiLSTM model,
for forward propagation, the specific calculation process is as follows:

d is the length of the text to be analyzed, and each word in the text corresponds in turn to an LSTM module.

In the forward pass, let the output of the (t-1)-th LSTM module be h_{t-1}^→; the output h_t^→ of the t-th LSTM module is then computed as:

e_{t,i} = h_{t-1}^→ · π_i

where e_{t,i}, the dot product of the two vectors (also called the scoring function), measures the similarity between the output of the LSTM for the previous word and the current local feature vector;

α_{t,i} = exp(e_{t,i}) / Σ_{j=1}^{n} exp(e_{t,j})

where α_{t,i} ∈ R is the weight of feature π_i;

s_{t-1} = Σ_{i=1}^{n} α_{t,i} · π_i

where s_{t-1} ∈ R^q is the weighted combination of the several convolution features; s_{t-1} is used in place of h_{t-1}^→ and, combined with the word vector x_t of the current word, evaluates the output of the current LSTM module:

h_t^→ = LSTM(x_t, s_{t-1})
the backward propagation is adopted, the specific calculation process is the same as the forward propagation, and the details are not repeated here;
step (4), model training: the training data are input into the multi-classification emotion analysis model, the parameters are adjusted using a cross-entropy loss function combined with the back-propagation (BP) algorithm, and softmax regression is used as the classification algorithm to complete training;
step (5), model analysis: the text to be analyzed is input into the trained model, which finally outputs the emotion classification result of the analyzed text.
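A minimal training-step sketch for step (4), reusing the CnnLstmAttentionModel sketch above; the Adam optimizer and the data handling are assumptions:

import torch
import torch.nn as nn

model = CnnLstmAttentionModel(vocab_size=41763)   # the illustrative model above
optimizer = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()                   # softmax + cross-entropy loss

def train_step(tokens, labels):
    # labels: integer class indices (the argmax of the one-hot vectors y).
    optimizer.zero_grad()
    logits = model(tokens)             # (batch, p) class scores
    loss = loss_fn(logits, labels)
    loss.backward()                    # BP algorithm: backpropagate the gradients
    optimizer.step()
    return loss.item()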
The preprocessing process comprises the following steps (a minimal sketch follows the list):
1) word segmentation, stop-word removal, conversion of uppercase English letters to lowercase, and conversion of traditional Chinese characters to simplified Chinese.
2) Words occurring with frequency ≥ σ in data set G are selected to construct the vocabulary wordList = {word_1, word_2, ..., word_ε}, where word_i is the i-th word in wordList and ε is the total number of words in data set G whose frequency reaches σ.
3) For each sample in data set G, if its length is greater than d the sample is deleted; if its length is less than d it is padded with the symbol </>.
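The sketch below illustrates steps 2) and 3), assuming the samples are already tokenized; σ, d, and the padding symbol follow the values defined above:

from collections import Counter

def preprocess(samples, sigma=2, d=64, pad="</>"):
    # samples: list of (token_list, label) pairs after word segmentation.
    freq = Counter(w for seg, _ in samples for w in seg)
    word_list = [w for w, c in freq.items() if c >= sigma]   # the vocabulary wordList
    kept = []
    for seg, y in samples:
        if len(seg) > d:
            continue                       # samples longer than d are deleted
        kept.append((seg + [pad] * (d - len(seg)), y))       # pad shorter samples
    return word_list, kept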
The convolution-layer calculation formula of the CNN-based local feature extraction module is as follows:

z = f(Σ W^T * x_{i:i+s-1} + b)   (8)

where z is the feature vector obtained by convolving one neuron over the text to be analyzed, f(·) is the activation function, W ∈ R^{s×m} is the weight matrix of the neuron (shared by the same neuron across positions), s×m is the size of the convolution kernel, b is the threshold, and x_{i:i+s-1} is the word-vector matrix of the i-th through (i+s-1)-th words of the text sentence.
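A direct numpy instantiation of equation (8) for a single neuron of scale s, with toy sizes and a ReLU activation assumed:

import numpy as np

s, m = 3, 4                          # kernel covers s words of dimension m
W = np.random.randn(s, m)            # weight matrix of the neuron, shared over windows
b = 0.1                              # threshold
x = np.random.randn(10, m)           # word-vector matrix of a 10-word sentence

relu = lambda v: np.maximum(v, 0.0)  # f(·)
z = np.array([relu(np.sum(W * x[i:i + s]) + b)       # slide the s-word window
              for i in range(x.shape[0] - s + 1)])   # feature vector z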
The training data is preprocessed data.
The convolutional layer of the CNN-based local feature extraction stage uses 4 convolution kernels of different scales. Training ends when the accuracy no longer changes or the set number of iterations is reached.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a structural diagram of a deep learning multi-classification emotion analysis model combined with an attention mechanism.
Detailed Description
The following describes the embodiments of the present invention in further detail with reference to the drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The method provided by the invention is realized by the following steps in sequence:
step (1) data preprocessing
The emotion corpus data set is represented as G = [(segtxt_1, y_1), (segtxt_2, y_2), ..., (segtxt_N, y_N)], where segtxt_i denotes the i-th sample and y_i the corresponding emotion category label. N is the number of samples in data set G; the emotion labels fall into the four categories joy, anger, disgust, and depression, and N = 80000, with 20000 emotion samples in each of the four categories. The data preprocessing of the samples in G comprises the following steps:
1) word segmentation, stop-word removal, conversion of uppercase English letters to lowercase, and conversion of traditional Chinese characters to simplified Chinese.
2) Words occurring with frequency ≥ σ in data set G are selected to construct the vocabulary wordList = {word_1, word_2, ..., word_ε}, where word_i is the i-th word in the vocabulary and ε is the total number of words in data set G whose frequency reaches σ. Here σ = 2, and the final data set G contains 41763 words with frequency ≥ 2, i.e. ε = 41763.
3) After the above processing, each sample in data set G whose length is greater than d is deleted, and each sample whose length is less than d is padded with the symbol </>; here d = 64.
The preprocessed data set is denoted G' = [(seg_1, y_1), (seg_2, y_2), ..., (seg_M, y_M)], where seg_i is the i-th sample in G' and y_i the corresponding emotion category label; M is the number of samples in data set G'. The final data set G' contains 73150 samples, and the number of samples per emotion category is shown in Table 1:
Table 1. Number of samples of each class after preprocessing
Step (2) constructing the model input
For any sample data to be analyzed (seg, y) in the data set G', it is further detailed as:

seg = [w_1, w_2, w_3, ..., w_d]^T   (1)

y = [0, 0, 1, ..., 0]   (2)

where w_i ∈ R^ε is the one-hot encoding of the i-th word of the text to be analyzed according to the vocabulary wordList, ε is the size of wordList, and the sentence length of the text is d = 64; y ∈ R^p is the one-hot encoding of the emotion category, and p, the number of classes the model distinguishes, is 4. The word-vector embedding matrix of the sample can be represented as:

X = seg · E^T   (3)

where X ∈ R^{d×m} and X = [x_1, x_2, ..., x_d]^T is the word-vector matrix representation of the text to be analyzed; the word-vector dimension m is 256. x_i ∈ R^m is the word vector of the i-th word in the text, and E is the word-vector embedding layer, for which open-source word2vec word vectors trained on Wikipedia are adopted; X then serves as the input of the network model.
Step (3) constructing a deep learning multi-classification emotion analysis model
The deep learning multi-classification emotion analysis model comprises a local feature extraction stage based on the CNN network and a word-order relation feature extraction stage based on the LSTM network. The pooling-layer result C_Cnn of the CNN-based local feature extraction stage and the result C'_Rnn of the LSTM-based word-order relation feature extraction stage are concatenated, i.e. the vector [C_Cnn; C'_Rnn] serves as the feature vector finally extracted by the model. The feature vector [C_Cnn; C'_Rnn] then passes through a fully connected layer to obtain the final model output vector ŷ ∈ R^p, where p is the number of classes the model distinguishes.
The local feature extraction stage based on the CNN network comprises the following contents:
the input of the local feature extraction stage is the word-vector matrix representation X of the text to be analyzed from equation (3);
the local feature extraction stage is based on a CNN network and comprises two layers in total, namely a convolutional layer and a pooling layer, wherein:
the convolution layer uses n convolution kernels of different scales to convolve the text to be analyzed, with k filters (i.e. k neurons) per kernel scale; in the invention n = 4 and k = 128.
In the pooling layer, the vectors obtained by convolution are down-sampled by max pooling and the locally optimal features are selected, so that each filter is reduced by the max-pooling layer to a scalar representing the best emotional feature captured by that filter;
the output of the local feature extraction module is C_Cnn = [c_1, c_2, ..., c_{nk}]: the optimal features selected by the filters of different sizes in the pooling layer are concatenated into C_Cnn = [c_1, c_2, ..., c_{nk}] as the output of this module, where C_Cnn ∈ R^{nk} and nk, the total number of filters in the convolutional layer, is 512;
the language order relation feature extraction stage based on the LSTM network comprises the following contents:
multi-scale CNN network local feature extraction: splicing convolution results of k filters with the same convolution scale of convolution layers in the local feature extraction stage based on the CNN network to obtain a set ZCnnThen set ZCnnEach vector Z in (a)iInputting the result into a GLU mechanism, namely, gating a convolution network, and marking the obtained result as { pi12,...,πnAnd finishing the extraction of the local features of the multi-scale CNN network.
Wherein Z isCnn={Z1,Z2,...,Zn},ZiSplicing convolution results of a plurality of filters with the scale i;
wherein,Zistitching of convolution results of k filters representing a certain scale, W1,W2∈Rλ×qFor the weight matrix, λ represents the dimension of the corresponding weight matrix, b1,b2∈RqFor the offset, σ denotes the sigmoid function, πi∈RqQ is the output dimension of the LSTM network, and q is 256;
then, extracting the local feature extraction result { pi ] of the multi-scale CNN network by using an attention mechanism12,...,πnIntegrating the language sequence relationship feature extraction stage into an LSTM network to obtain an output result C 'of the language sequence relationship feature extraction stage based on the LSTM network'RnnI.e. by
Wherein,represents the output of the LSTM module corresponding to the last word in the text to be analyzed,representing the first of the text to be analyzedThe output of LSTM module corresponding to the word, the invention adopts bidirectional LSTM model, i.e. BiLSTM model,
for forward propagation, the specific calculation process is as follows:

d is the length of the text to be analyzed, and each word in the text corresponds in turn to an LSTM module.

In the forward pass, let the output of the (t-1)-th LSTM module be h_{t-1}^→; the output h_t^→ of the t-th LSTM module is then computed as:

e_{t,i} = h_{t-1}^→ · π_i

where e_{t,i}, the dot product of the two vectors (also called the scoring function), measures the similarity between the output of the LSTM for the previous word and the current local feature vector;

α_{t,i} = exp(e_{t,i}) / Σ_{j=1}^{n} exp(e_{t,j})

where α_{t,i} ∈ R is the weight of feature π_i;

s_{t-1} = Σ_{i=1}^{n} α_{t,i} · π_i

where s_{t-1} ∈ R^q is the weighted combination of the several convolution features; s_{t-1} is used in place of h_{t-1}^→ and, combined with the word vector x_t of the current word, evaluates the output of the current LSTM module:

h_t^→ = LSTM(x_t, s_{t-1})
the backward propagation is adopted, the specific calculation process is the same as the forward propagation, and the details are not repeated here;
step (4), model training: the training data are input into the multi-classification emotion analysis model, the parameters are adjusted using a cross-entropy loss function combined with the back-propagation (BP) algorithm, and softmax regression is used as the classification algorithm to complete training.
Step (5), model analysis: the text to be analyzed is input into the trained model, which finally outputs the emotion classification result of the analyzed text.
The convolution-layer calculation formula of the CNN-based local feature extraction module is as follows:

z = f(Σ W^T * x_{i:i+s-1} + b)   (8)

where z is the feature vector obtained by convolving one neuron over the text to be analyzed, f(·) is the activation function, W ∈ R^{s×m} is the weight matrix of the neuron (shared by the same neuron across positions), s×m is the size of the convolution kernel, b is the threshold, and x_{i:i+s-1} is the word-vector matrix of the i-th through (i+s-1)-th words of the text sentence; s takes the four different convolution sizes [2, 3, 4, 5], and f(·) is the ReLU activation function.
The training data is preprocessed data.
The convolutional layer of the CNN-based local feature extraction stage uses 4 convolution kernels of different scales. Training ends when the accuracy no longer changes or the set number of iterations is reached.
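Plugging the embodiment's concrete values into the CnnLstmAttentionModel sketch from the summary section (ε = 41763, d = 64, m = 256, n = 4 scales with s ∈ {2, 3, 4, 5}, k = 128, nk = 512, q = 256, p = 4) gives, illustratively:

import torch

model = CnnLstmAttentionModel(vocab_size=41763, emb_dim=256, num_filters=128,
                              kernel_sizes=(2, 3, 4, 5), lstm_dim=256,
                              num_classes=4)
tokens = torch.randint(0, 41763, (8, 64))   # a batch of 8 padded 64-word sentences
logits = model(tokens)                      # shape (8, 4): one score per emotion class
probs = torch.softmax(logits, dim=1)        # emotion-category probabilities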
1. Analysis of experiments
In the testing stage, 2000 emotion corpus samples of the four categories joy, anger, disgust, and depression are selected. Accuracy (Acc) is used as the evaluation index; the model parameters are kept unchanged in the testing stage, and the test-set results are shown in Table 2:
Table 2. Comparison of emotion analysis results
Table 2 compares the test results of several models, where experiment 1 is a common single-scale CNN network model with convolution kernel size 3, experiment 2 is a common LSTM network, and experiment 3 is the attention-based text emotion analysis model proposed herein.
Compared with the common CNN network and LSTM network, the attention-based emotion analysis model proposed by the invention achieves a clearly higher accuracy, showing that the proposed method can effectively extract both the local feature information of the CNN network and the word-order feature information of the LSTM network, which demonstrates its effectiveness.

Claims (6)

1. A deep learning multi-classification emotion analysis method combining an attention mechanism, characterized by comprising the following steps:
step (1) data preprocessing
Let the emotion data set be expressed as G = [(segtxt_1, y_1), (segtxt_2, y_2), ..., (segtxt_N, y_N)], where segtxt_i denotes the i-th sample and y_i the corresponding emotion category label, and N is the number of samples in data set G; the samples in G are subjected to data preprocessing,
the preprocessed data set is denoted G' = [(seg_1, y_1), (seg_2, y_2), ..., (seg_M, y_M)], where seg_i is the i-th sample in G', y_i is the corresponding emotion category label, and M is the number of samples in data set G';
step (2) input of the constructed model
For any sample data to be analyzed (seg, y) in the data set G', it is further detailed as:

seg = [w_1, w_2, ..., w_i, ..., w_d]^T   (1)

y = [0, 0, 1, ..., 0]   (2)

where w_i ∈ R^ε is the one-hot encoding of the i-th word of the text to be analyzed according to the vocabulary wordList, ε is the size of wordList, d is the sentence length of the text, y ∈ R^p is the one-hot encoding of the emotion category, and p is the number of classes the model distinguishes; the word-vector embedding matrix of the sample can be represented as:

X = seg · E^T   (3)

where X ∈ R^{d×m} and X = [x_1, x_2, ..., x_d]^T is the word-vector matrix representation of the text to be analyzed, m is the dimension of the word vectors, x_i ∈ R^m is the word vector of the i-th word in the text, and E is the word-vector embedding layer;
step (3) constructing a deep learning multi-classification emotion analysis model
The deep learning multi-classification emotion analysis model comprises a local feature extraction stage based on the CNN network and a word-order relation feature extraction stage based on the LSTM network; the pooling-layer result C_Cnn of the CNN-based local feature extraction stage and the result C'_Rnn of the LSTM-based word-order relation feature extraction stage are concatenated, i.e. the vector [C_Cnn; C'_Rnn] serves as the feature vector finally extracted by the model, and the feature vector [C_Cnn; C'_Rnn] then passes through the fully connected layer to obtain the final model output vector ŷ ∈ R^p, where p is the number of classes the model distinguishes,
the local feature extraction stage based on the CNN network comprises the following contents:
the input of the local feature extraction stage is the word-vector matrix representation X of the text to be analyzed from equation (3);
the local feature extraction stage is based on a CNN network and comprises two layers in total, namely a convolutional layer and a pooling layer, wherein:
the convolution layer uses n convolution kernels of different scales to convolve the text to be analyzed, with k filters (i.e. k neurons) per kernel scale;
in the pooling layer, the vectors obtained by convolution are down-sampled by max pooling and the locally optimal features are selected, so that each filter is reduced by the max-pooling layer to a scalar representing the best emotional feature captured by that filter;
the output of the local feature extraction module is C_Cnn = [c_1, c_2, ..., c_{nk}]: the optimal features selected by the filters of different sizes in the pooling layer are concatenated into C_Cnn = [c_1, c_2, ..., c_{nk}] as the output of this module, where C_Cnn ∈ R^{nk} and nk is the total number of filters in the convolutional layer;
the language order relation feature extraction stage based on the LSTM network comprises the following contents:
multi-scale CNN network local feature extraction: splicing convolution results of k filters with the same convolution scale of convolution layers in the local feature extraction stage based on the CNN network to obtain a set ZCnnThen set ZCnnEach vector Z in (a)iInputting the result into a GLU mechanism, namely, gating a convolution network, and marking the obtained result as { pi12,...,πnFinishing the extraction of the local features of the multi-scale CNN network,
wherein Z isCnn={Z1,Z2,...,Zn},ZiSplicing convolution results of a plurality of filters with the scale i;
wherein,Zistitching of convolution results of k filters representing a certain scale, W1,W2∈Rλ×qFor the weight matrix, λ represents the dimension of the corresponding weight matrix, b1,b2∈RqFor the offset, σ denotes the sigmoid function, πi∈RqQ is the output dimension of the LSTM network;
then, extracting the local feature extraction result { pi ] of the multi-scale CNN network by using an attention mechanism12,...,πnIntegrating the language sequence relationship feature extraction stage into an LSTM network to obtain an output result C 'of the language sequence relationship feature extraction stage based on the LSTM network'RnnI.e. by
Wherein,represents the output of the LSTM module corresponding to the last word in the text to be analyzed,representing the output of the LSTM module corresponding to the first word in the text to be analyzed, the invention uses a bi-directional LSTM model, i.e. a BiLSTM model,
for forward propagation, the specific calculation process is as follows:

d is the length of the text to be analyzed, and each word in the text corresponds in turn to an LSTM module.

In the forward pass, let the output of the (t-1)-th LSTM module be h_{t-1}^→; the output h_t^→ of the t-th LSTM module is then computed as:

e_{t,i} = h_{t-1}^→ · π_i

where e_{t,i}, the dot product of the two vectors (also called the scoring function), measures the similarity between the output of the LSTM for the previous word and the current local feature vector;

α_{t,i} = exp(e_{t,i}) / Σ_{j=1}^{n} exp(e_{t,j})

where α_{t,i} ∈ R is the weight of feature π_i;

s_{t-1} = Σ_{i=1}^{n} α_{t,i} · π_i

where s_{t-1} ∈ R^q is the weighted combination of the several convolution features; s_{t-1} is used in place of h_{t-1}^→ and, combined with the word vector x_t of the current word, evaluates the output of the current LSTM module:

h_t^→ = LSTM(x_t, s_{t-1})
the backward propagation is adopted, the specific calculation process is the same as the forward propagation, and the details are not repeated here;
step (4), model training: the training data are input into the multi-classification emotion analysis model, the parameters are adjusted using a cross-entropy loss function combined with the back-propagation (BP) algorithm, and softmax regression is used as the classification algorithm to complete training;
step (5), model analysis: the text to be analyzed is input into the trained model, which finally outputs the emotion classification result of the analyzed text.
2. The method for deep learning multi-classification emotion analysis combined with attention mechanism as claimed in claim 1, wherein the preprocessing process comprises the following steps:
1) word segmentation, stop-word removal, conversion of uppercase English letters to lowercase, and conversion of traditional Chinese characters to simplified Chinese,
2) words occurring with frequency ≥ σ in data set G are selected to construct the vocabulary wordList = {word_1, word_2, ..., word_ε}, where word_i is the i-th word in wordList and ε is the total number of words in data set G whose frequency reaches σ,
3) for each sample in data set G, if its length is greater than d the sample is deleted; if its length is less than d it is padded with the symbol </>.
3. The method for deep learning multi-classification emotion analysis combined with an attention mechanism as claimed in claim 1, wherein the convolution-layer calculation formula of the CNN-based local feature extraction module is as follows:

z = f(Σ W^T * x_{i:i+s-1} + b)   (8)

where z is the feature vector obtained by convolving one neuron over the text to be analyzed, f(·) is the activation function, W ∈ R^{s×m} is the weight matrix of the neuron (shared by the same neuron across positions), s×m is the size of the convolution kernel, b is the threshold, and x_{i:i+s-1} is the word-vector matrix of the i-th through (i+s-1)-th words of the text sentence.
4. The method as claimed in claim 1, wherein the training data is preprocessed data.
5. The method for deep learning multi-classification emotion analysis combined with attention mechanism as claimed in claim 1, wherein the convolutional layer of the local feature extraction stage based on the CNN network employs 4 convolutional kernels with different scales.
6. The method as claimed in claim 1, wherein the training end condition is that the accuracy no longer changes or the set number of iterations is reached.
CN201910553755.7A 2019-06-25 2019-06-25 Deep learning multi-classification emotion analysis model combining attention mechanism Active CN110287320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910553755.7A CN110287320B (en) 2019-06-25 2019-06-25 Deep learning multi-classification emotion analysis model combining attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910553755.7A CN110287320B (en) 2019-06-25 2019-06-25 Deep learning multi-classification emotion analysis model combining attention mechanism

Publications (2)

Publication Number Publication Date
CN110287320A true CN110287320A (en) 2019-09-27
CN110287320B CN110287320B (en) 2021-03-16

Family

ID=68005491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910553755.7A Active CN110287320B (en) 2019-06-25 2019-06-25 Deep learning multi-classification emotion analysis model combining attention mechanism

Country Status (1)

Country Link
CN (1) CN110287320B (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110855474A (en) * 2019-10-21 2020-02-28 广州杰赛科技股份有限公司 Network feature extraction method, device, equipment and storage medium of KQI data
CN110866113A (en) * 2019-09-30 2020-03-06 浙江大学 Text classification method based on sparse self-attention mechanism fine-tuning Bert model
CN111079985A (en) * 2019-11-26 2020-04-28 昆明理工大学 Criminal case criminal period prediction method based on BERT and fused with distinguishable attribute features
CN111079547A (en) * 2019-11-22 2020-04-28 武汉大学 Pedestrian moving direction identification method based on mobile phone inertial sensor
CN111291832A (en) * 2020-03-11 2020-06-16 重庆大学 Sensor data classification method based on Stack integrated neural network
CN111339768A (en) * 2020-02-27 2020-06-26 携程旅游网络技术(上海)有限公司 Sensitive text detection method, system, electronic device and medium
CN111402953A (en) * 2020-04-02 2020-07-10 四川大学 Protein sequence classification method based on hierarchical attention network
CN111582397A (en) * 2020-05-14 2020-08-25 杭州电子科技大学 CNN-RNN image emotion analysis method based on attention mechanism
CN111881262A (en) * 2020-08-06 2020-11-03 重庆邮电大学 Text emotion analysis method based on multi-channel neural network
CN111914084A (en) * 2020-01-09 2020-11-10 北京航空航天大学 Deep learning-based emotion label text generation and evaluation system
CN112597279A (en) * 2020-12-25 2021-04-02 北京知因智慧科技有限公司 Text emotion analysis model optimization method and device
CN112598065A (en) * 2020-12-25 2021-04-02 天津工业大学 Memory-based gated convolutional neural network semantic processing system and method
CN112818123A (en) * 2021-02-08 2021-05-18 河北工程大学 Emotion classification method for text
CN113177111A (en) * 2021-05-28 2021-07-27 中国人民解放军国防科技大学 Chinese text emotion analysis method and device, computer equipment and storage medium
CN113239199A (en) * 2021-05-18 2021-08-10 重庆邮电大学 Credit classification method based on multi-party data set
CN113268592A (en) * 2021-05-06 2021-08-17 天津科技大学 Short text object emotion classification method based on multi-level interactive attention mechanism
CN113377901A (en) * 2021-05-17 2021-09-10 内蒙古工业大学 Mongolian text emotion analysis method based on multi-size CNN and LSTM models
CN113379818A (en) * 2021-05-24 2021-09-10 四川大学 Phase analysis method based on multi-scale attention mechanism network
WO2021174922A1 (en) * 2020-03-02 2021-09-10 平安科技(深圳)有限公司 Statement sentiment classification method and related device
CN114298025A (en) * 2021-12-01 2022-04-08 国家电网有限公司华东分部 Emotion analysis method based on artificial intelligence
CN114547299A (en) * 2022-02-18 2022-05-27 重庆邮电大学 Short text sentiment classification method and device based on composite network model
CN114662547A (en) * 2022-04-07 2022-06-24 天津大学 MSCRNN emotion recognition method and device based on electroencephalogram signals
CN114897078A (en) * 2022-05-19 2022-08-12 辽宁大学 Short text similarity calculation method based on deep learning and topic model
CN115116448A (en) * 2022-08-29 2022-09-27 四川启睿克科技有限公司 Voice extraction method, neural network model training method, device and storage medium
US20230160942A1 (en) * 2020-04-22 2023-05-25 Qingdao Topscomm Communication Co., Ltd Fault arc signal detection method using convolutional neural network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460089A (en) * 2018-01-23 2018-08-28 哈尔滨理工大学 Diverse characteristics based on Attention neural networks merge Chinese Text Categorization
CN109670169A (en) * 2018-11-16 2019-04-23 中山大学 A kind of deep learning sensibility classification method based on feature extraction
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 The sentiment analysis method of two-way LSTM model based on attention enhancing

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460089A (en) * 2018-01-23 2018-08-28 哈尔滨理工大学 Diverse characteristics based on Attention neural networks merge Chinese Text Categorization
CN109670169A (en) * 2018-11-16 2019-04-23 中山大学 A kind of deep learning sensibility classification method based on feature extraction
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 The sentiment analysis method of two-way LSTM model based on attention enhancing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MING-HSIANG SU et al.: "LSTM-based Text Emotion Recognition Using Semantic and Emotional Word Vectors", 2018 First Asian Conference on Affective Computing and Intelligent Interaction *
THITITORN SENEEWONG NA AYUTTHAYA et al.: "Thai Sentiment Analysis via Bidirectional LSTM-CNN Model with Embedding Vectors and Sentic Features", 2018 International Joint Symposium on Artificial Intelligence and Natural Language Processing *
GUAN Pengfei et al.: "Attention-enhanced bidirectional LSTM for sentiment analysis" (注意力增强的双向LSTM情感分析), Journal of Chinese Information Processing (中文信息学报) *

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866113A (en) * 2019-09-30 2020-03-06 浙江大学 Text classification method based on sparse self-attention mechanism fine-tuning Bert model
CN110866113B (en) * 2019-09-30 2022-07-26 浙江大学 Text classification method based on sparse self-attention mechanism fine-tuning burt model
CN110855474B (en) * 2019-10-21 2022-06-17 广州杰赛科技股份有限公司 Network feature extraction method, device, equipment and storage medium of KQI data
CN110855474A (en) * 2019-10-21 2020-02-28 广州杰赛科技股份有限公司 Network feature extraction method, device, equipment and storage medium of KQI data
CN111079547A (en) * 2019-11-22 2020-04-28 武汉大学 Pedestrian moving direction identification method based on mobile phone inertial sensor
CN111079985A (en) * 2019-11-26 2020-04-28 昆明理工大学 Criminal case criminal period prediction method based on BERT and fused with distinguishable attribute features
CN111914084A (en) * 2020-01-09 2020-11-10 北京航空航天大学 Deep learning-based emotion label text generation and evaluation system
CN111339768B (en) * 2020-02-27 2024-03-05 携程旅游网络技术(上海)有限公司 Sensitive text detection method, system, electronic equipment and medium
CN111339768A (en) * 2020-02-27 2020-06-26 携程旅游网络技术(上海)有限公司 Sensitive text detection method, system, electronic device and medium
WO2021174922A1 (en) * 2020-03-02 2021-09-10 平安科技(深圳)有限公司 Statement sentiment classification method and related device
CN111291832A (en) * 2020-03-11 2020-06-16 重庆大学 Sensor data classification method based on Stack integrated neural network
CN111402953A (en) * 2020-04-02 2020-07-10 四川大学 Protein sequence classification method based on hierarchical attention network
CN111402953B (en) * 2020-04-02 2022-05-03 四川大学 Protein sequence classification method based on hierarchical attention network
US20230160942A1 (en) * 2020-04-22 2023-05-25 Qingdao Topscomm Communication Co., Ltd Fault arc signal detection method using convolutional neural network
US11860216B2 (en) * 2020-04-22 2024-01-02 Qingdao Topscomm Communication Co., Ltd Fault arc signal detection method using convolutional neural network
CN111582397B (en) * 2020-05-14 2023-04-07 杭州电子科技大学 CNN-RNN image emotion analysis method based on attention mechanism
CN111582397A (en) * 2020-05-14 2020-08-25 杭州电子科技大学 CNN-RNN image emotion analysis method based on attention mechanism
CN111881262A (en) * 2020-08-06 2020-11-03 重庆邮电大学 Text emotion analysis method based on multi-channel neural network
CN111881262B (en) * 2020-08-06 2022-05-20 重庆邮电大学 Text emotion analysis method based on multi-channel neural network
CN112598065B (en) * 2020-12-25 2023-05-30 天津工业大学 Memory-based gating convolutional neural network semantic processing system and method
CN112597279A (en) * 2020-12-25 2021-04-02 北京知因智慧科技有限公司 Text emotion analysis model optimization method and device
CN112598065A (en) * 2020-12-25 2021-04-02 天津工业大学 Memory-based gated convolutional neural network semantic processing system and method
CN112818123A (en) * 2021-02-08 2021-05-18 河北工程大学 Emotion classification method for text
CN113268592A (en) * 2021-05-06 2021-08-17 天津科技大学 Short text object emotion classification method based on multi-level interactive attention mechanism
CN113377901B (en) * 2021-05-17 2022-08-19 内蒙古工业大学 Mongolian text emotion analysis method based on multi-size CNN and LSTM models
CN113377901A (en) * 2021-05-17 2021-09-10 内蒙古工业大学 Mongolian text emotion analysis method based on multi-size CNN and LSTM models
CN113239199A (en) * 2021-05-18 2021-08-10 重庆邮电大学 Credit classification method based on multi-party data set
CN113379818B (en) * 2021-05-24 2022-06-07 四川大学 Phase analysis method based on multi-scale attention mechanism network
CN113379818A (en) * 2021-05-24 2021-09-10 四川大学 Phase analysis method based on multi-scale attention mechanism network
CN113177111A (en) * 2021-05-28 2021-07-27 中国人民解放军国防科技大学 Chinese text emotion analysis method and device, computer equipment and storage medium
CN114298025A (en) * 2021-12-01 2022-04-08 国家电网有限公司华东分部 Emotion analysis method based on artificial intelligence
CN114547299A (en) * 2022-02-18 2022-05-27 重庆邮电大学 Short text sentiment classification method and device based on composite network model
CN114662547A (en) * 2022-04-07 2022-06-24 天津大学 MSCRNN emotion recognition method and device based on electroencephalogram signals
CN114897078A (en) * 2022-05-19 2022-08-12 辽宁大学 Short text similarity calculation method based on deep learning and topic model
CN115116448A (en) * 2022-08-29 2022-09-27 四川启睿克科技有限公司 Voice extraction method, neural network model training method, device and storage medium
CN115116448B (en) * 2022-08-29 2022-11-15 四川启睿克科技有限公司 Voice extraction method, neural network model training method, device and storage medium

Also Published As

Publication number Publication date
CN110287320B (en) 2021-03-16

Similar Documents

Publication Publication Date Title
CN110287320B (en) Deep learning multi-classification emotion analysis model combining attention mechanism
CN107608956B (en) Reader emotion distribution prediction algorithm based on CNN-GRNN
CN106650813B (en) A kind of image understanding method based on depth residual error network and LSTM
CN111126386B (en) Sequence domain adaptation method based on countermeasure learning in scene text recognition
CN108334605B (en) Text classification method and device, computer equipment and storage medium
CN109241255B (en) Intention identification method based on deep learning
CN108614875B (en) Chinese emotion tendency classification method based on global average pooling convolutional neural network
CN107609009B (en) Text emotion analysis method and device, storage medium and computer equipment
CN110059188B (en) Chinese emotion analysis method based on bidirectional time convolution network
CN109933664B (en) Fine-grained emotion analysis improvement method based on emotion word embedding
CN106980683B (en) Blog text abstract generating method based on deep learning
CN110083700A (en) A kind of enterprise&#39;s public sentiment sensibility classification method and system based on convolutional neural networks
CN109740148A (en) A kind of text emotion analysis method of BiLSTM combination Attention mechanism
CN109284506A (en) A kind of user comment sentiment analysis system and method based on attention convolutional neural networks
CN107818084B (en) Emotion analysis method fused with comment matching diagram
CN106886580B (en) Image emotion polarity analysis method based on deep learning
CN110414009B (en) Burma bilingual parallel sentence pair extraction method and device based on BilSTM-CNN
CN111127146B (en) Information recommendation method and system based on convolutional neural network and noise reduction self-encoder
CN107832663A (en) A kind of multi-modal sentiment analysis method based on quantum theory
CN112364638B (en) Personality identification method based on social text
CN110188195B (en) Text intention recognition method, device and equipment based on deep learning
CN107247703A (en) Microblog emotional analysis method based on convolutional neural networks and integrated study
CN110472245B (en) Multi-label emotion intensity prediction method based on hierarchical convolutional neural network
CN110263174B (en) Topic category analysis method based on focus attention
CN110046356B (en) Label-embedded microblog text emotion multi-label classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant