CN110879938A - Text emotion classification method, device, equipment and storage medium - Google Patents
- Publication number
- CN110879938A (application number CN201911110950.9A)
- Authority
- CN
- China
- Prior art keywords
- feature
- feature representation
- context
- vector
- text data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention provides a text emotion classification method, apparatus, device, and storage medium. The method comprises the following steps: acquiring word vectors from the text data to be processed and extracting feature vectors corresponding to the word vectors; extracting a context feature representation from the feature vectors using a bidirectional long short-term memory (Bi-LSTM) network; applying an Attention mechanism to the extracted context feature representation and then introducing top-k max pooling to fully extract the text feature representation; and feeding the extracted features to a classifier to obtain higher accuracy. The method of the embodiments of the invention improves the accuracy of text emotion classification and achieves a good classification effect.
Description
Technical Field
The invention relates to the field of computer technology, and in particular to a text emotion classification method, apparatus, device, and storage medium.
Background
With the development of the internet and the growth in the number of internet users, network users generate a large amount of text on the internet, such as comments on a commodity, a movie, or a shop; extracting useful information from this text benefits merchants, consumers, and others. Text emotion tendency analysis (i.e., text emotion classification), a branch of natural language processing (NLP), has therefore become increasingly important. Traditional text emotion classification methods do not consider the context information of words or the word order of the text, require a large amount of manual effort to extract text features, and may fail to extract deeper important features of the text.
In recent years, with the development of deep learning, recurrent neural network (RNN) and convolutional neural network (CNN) models have been proposed. The CNN model mainly uses convolutional and downsampling layers for feature extraction, while in the RNN model the state of the current node (or of the preceding nodes) affects the state of the next node, and the state of the last node is used as the feature. However, these models still yield a poor text emotion classification effect.
Disclosure of Invention
The invention provides a text emotion classification method, apparatus, device, and storage medium for improving the text emotion classification effect.
In a first aspect, the present invention provides a text emotion classification method, including:
acquiring a word vector in text data to be processed, and extracting a feature vector corresponding to the word vector by using convolution operation;
extracting a context feature representation from the feature vectors using a bidirectional long short-term memory (Bi-LSTM) network;
determining semantic codes corresponding to the context feature representations according to the extracted context feature representations;
performing maximum pooling on the semantic codes corresponding to the context feature representations, and concatenating the max-pooled semantic codes to obtain a concatenated feature representation;
and classifying the concatenated feature representation to obtain the emotion category corresponding to the text data.
In a second aspect, the present invention provides a text emotion classification apparatus, including:
the extraction module is used for acquiring word vectors in the text data to be processed and extracting the feature vectors corresponding to the word vectors;
the extraction module is further used for extracting a context feature representation from the feature vectors using a bidirectional long short-term memory (Bi-LSTM) network;
the determining module is used for determining semantic codes corresponding to the context feature representations according to the extracted context feature representations;
the processing module is used for performing maximum pooling on the semantic codes corresponding to the context feature representations and concatenating the max-pooled semantic codes to obtain a concatenated feature representation;
and the classification module is used for classifying the concatenated feature representations to obtain the emotion categories corresponding to the text data.
In a third aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the method described in any one of the first aspect.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of the first aspects via execution of the executable instructions.
The text emotion classification method, apparatus, device, and storage medium provided by the embodiments of the invention acquire word vectors from the text data to be processed and extract feature vectors corresponding to the word vectors by a convolution operation; extract a context feature representation from the feature vectors using a bidirectional long short-term memory (Bi-LSTM) network; obtain the importance of different features with an Attention mechanism applied to the extracted context feature representation, then feed the representation into a top-k max pooling layer to extract the k most important features, thereby determining the semantic code corresponding to the context feature representation; and classify the semantic codes to obtain the emotion category of the text data. The Bi-LSTM model can fully capture the context features of the words in the text data, and the semantic coding distinguishes important features from unimportant ones and filters the latter out, so that important features receive higher weight. This improves the accuracy of text emotion classification and yields a better classification effect.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flowchart illustrating a text emotion classification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a text emotion classification effect method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the principle of pooling according to one embodiment of the method of the present invention;
FIG. 4 is a schematic diagram of Bi-LSTM model principle of an embodiment of the method provided by the present invention;
FIG. 5 is a schematic illustration of an attention mechanism according to an embodiment of the method of the present invention;
FIG. 6 is a schematic structural diagram of an embodiment of a text emotion classification apparatus provided in the present invention;
fig. 7 is a schematic structural diagram of an embodiment of an electronic device provided in the present invention.
With the foregoing drawings in mind, certain embodiments of the disclosure have been shown and described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terms "comprising" and "having," and any variations thereof, in the description and claims of this invention and the drawings described herein are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
First, the application scenario of the invention is introduced:
the text emotion classification method provided by the embodiment of the invention is applied to a scene for carrying out emotion classification on text data so as to improve classification accuracy.
Emotion classification here judges whether a piece of text data expresses a positive or a negative emotion, for example in online comments such as purchase reviews, movie reviews, and microblog comments.
The method provided by the invention can be implemented by an electronic device, such as a processor executing corresponding software code, or by the electronic device exchanging data with a server while executing that code, for example with the server performing part of the operations and controlling the electronic device to execute the method.
The following embodiments are all described with electronic devices as the executing bodies.
The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
FIG. 1 is a flowchart illustrating a text emotion classification method according to an embodiment of the present invention. As shown in fig. 1, the method provided by this embodiment includes:
Step 101: acquire word vectors from the text data to be processed, and extract feature vectors corresponding to the word vectors using a convolution operation.
Specifically, the word vectors may be trained in advance, for example on a 30 GB Sogou news corpus. The corpus may be segmented with the Jieba tokenizer under Python, and the word vectors trained with the CBOW model of word2vec. The parameters may be set as follows: context window length 5; learning rate alpha at its default of 0.025; minimum frequency min-count at its default of 5 (i.e., a word occurring fewer than 5 times in the corpus is discarded); and word vector dimension 100.
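As a toy illustration of the min-count rule described above (words occurring fewer than 5 times are discarded before word-vector training), the filtering step can be sketched in plain Python; in practice a library such as gensim applies this filter internally via its `min_count` parameter. The function and variable names below are our own, not the patent's:

```python
from collections import Counter

def filter_rare_words(tokenized_docs, min_count=5):
    """Drop every word that occurs fewer than min_count times in the corpus."""
    counts = Counter(w for doc in tokenized_docs for w in doc)
    return [[w for w in doc if counts[w] >= min_count] for doc in tokenized_docs]

docs = [["good", "movie"], ["good", "film"], ["good", "good", "good"]]
kept = filter_rare_words(docs, min_count=3)
# "good" occurs 5 times and survives; "movie" and "film" occur once and are dropped
```

The surviving vocabulary is then what the CBOW model is trained on, with the window length, learning rate, and vector dimension set as described above.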
And segmenting the text data to be processed to obtain a plurality of words, and converting the words in the text data into word vectors according to the trained word vector model. The text data includes, for example, a plurality of sentences, each sentence corresponding to a plurality of word vectors. For example by the word embedding layer shown in fig. 2.
Further, as shown in fig. 2, after text data is converted into corresponding word vectors, preliminary feature vectors are extracted through one layer of one-dimensional convolution layer.
Step 102: extract a context feature representation from the feature vectors using a bidirectional long short-term memory (Bi-LSTM) network.
Specifically, in the method of the embodiment of the invention, the feature vectors obtained by the convolution operation (the output of the convolutional layer) are fed into the Bi-LSTM model (as shown in fig. 2), which can fully extract the text features.
A bidirectional long short-term memory network has clear advantages in processing sequential data (and text data is sequential), so in this embodiment the feature vectors produced by the first convolution layer are fed into a Bi-LSTM model. Compared with a traditional recurrent neural network (RNN), a long short-term memory (LSTM) network does not suffer from vanishing or exploding gradients and performs well in natural language processing. To let the LSTM fuse the vocabulary information of the current time step with all of its context, this embodiment uses a Bi-LSTM model that reads the text in both directions.
The context feature representation is extracted from the convolved feature vectors through the Bi-LSTM model.
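The two-direction reading can be sketched as follows. This is a minimal illustration, not the patent's model: a plain tanh recurrence stands in for the full LSTM cell, and all names and shapes are our own assumptions. The key point is that the forward and backward hidden states at each position are concatenated, so every position's feature carries both its preceding and following context:

```python
import numpy as np

def bidirectional_context(feats, W, U, hidden=4):
    """Run a simple recurrence forward and backward over the feature sequence
    and concatenate the two hidden states at each time step."""
    def scan(seq):
        h = np.zeros(hidden)
        out = []
        for x in seq:
            h = np.tanh(W @ h + U @ x)  # toy stand-in for an LSTM cell update
            out.append(h)
        return out

    fwd = scan(feats)            # left-to-right pass
    bwd = scan(feats[::-1])[::-1]  # right-to-left pass, realigned to positions
    return np.array([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])
```

With a hidden size of 4, each position in the output carries an 8-dimensional context feature (forward state plus backward state).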
Step 103: determine the semantic codes corresponding to the context feature representations from the extracted context feature representations.
Specifically, as shown in fig. 2, the semantic code corresponding to the context feature representation is determined by an Attention mechanism: by calculating the attention distribution probability, the influence of important features on the text emotion classification can be highlighted, since different keywords in a sentence affect the classification result differently.
For example, when reading a review of a product it is impossible to remember the entire description; only some keywords such as "good", "not good", or "like" are remembered, and these words are important for expressing the emotional tendency of the text. Different features in the text data therefore have different effects on the classification result.
The semantic code is calculated from the output of the Bi-LSTM layer (i.e., the context feature representation) and the probability weights.
Step 104: perform maximum pooling on the semantic codes corresponding to the context feature representations, and concatenate the max-pooled semantic codes to obtain the concatenated feature representation.
To further reduce the data dimensionality, a maximum pooling operation, k-max pooling, can be performed after the semantic codes are generated: a fixed sliding window selects the k largest values in the generated semantic coding result, extracting the k most important features and filtering out the unimportant ones. This reduces the dimensionality and thereby improves the convergence speed and prediction accuracy of the model.
Step 105: classify the concatenated feature representation to obtain the emotion category corresponding to the text data.
Specifically, the text data is classified according to the semantic codes after the top-k max pooling, yielding the emotion category corresponding to the text data. The classification may be performed by a preset classification function, and different classifiers may be used.
For example, for a comment on a certain product, if binary classification is performed, the comment can be classified as a good review or a bad review. Good reviews may include words such as "good"; bad reviews may include statements about a poor experience or advice not to purchase.
The method of this embodiment acquires word vectors from the text data to be processed and extracts the corresponding feature vectors; extracts a context feature representation from the feature vectors using a bidirectional long short-term memory (Bi-LSTM) network; determines the semantic codes corresponding to the extracted context feature representations; and classifies those semantic codes to obtain the emotion category of the text data. The Bi-LSTM model can fully capture the context features of the words in the text data, and the semantic coding distinguishes important features and filters out unimportant ones, so that important features receive higher weight, improving the accuracy and effect of text emotion classification.
On the basis of the foregoing embodiment, optionally, step 104 may be specifically implemented by:
selecting the k largest semantic codes within a preset sliding window, obtaining the k largest semantic codes for each of a plurality of sliding windows;
and concatenating the k largest semantic codes of the plurality of sliding windows to obtain the concatenated feature representation.
Specifically, as shown in fig. 2, k-max pooling is performed after semantic coding; as shown in fig. 3, the top-k calculation formula is:

top-k = max_k{c_1, c_2, c_3, …, c_p}

where k is the number of largest values to take; c_1, c_2, …, c_p are the semantic code values; p is the size of the sliding window; and ⊕ denotes concatenation of the vectors. The step size of the sliding window may be k − 1.
That is, the k largest semantic code values are selected from p semantic code values at a time; the sliding window then moves right by k − 1 positions, and the next group of p semantic code values is processed.
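The sliding-window selection just described can be sketched in Python. This is a toy illustration under the stated assumptions (window size p, top-k per window, stride k − 1, results concatenated); the function and parameter names are our own:

```python
def top_k_max_pool(codes, k=3, p=5):
    """Slide a window of size p with stride k-1 over the semantic code values;
    keep the k largest values per window and concatenate the results."""
    out = []
    step = k - 1
    for start in range(0, len(codes) - p + 1, step):
        window = codes[start:start + p]
        out.extend(sorted(window, reverse=True)[:k])  # k largest, descending
    return out

pooled = top_k_max_pool([1, 5, 2, 8, 3, 7, 4], k=3, p=5)
# windows [1,5,2,8,3] and [2,8,3,7,4] yield [8,5,3] and [8,7,4]
```

Whether the k values are kept in descending order or in their original positions is not specified in the source; descending order is assumed here.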
Finally, the concatenated feature representation is classified with a preset classification function to obtain the emotion category corresponding to the text data.
On the basis of the foregoing embodiment, optionally, the extracting of the feature vector corresponding to the word vector in step 101 may specifically be implemented in the following manner:
inputting the word vectors into the convolutional layer to obtain a feature matrix F:
each row of the F matrix represents the feature vector generated by convolving the word vectors of the text data with a convolution window of a given size;
where the entries of row i are c_ij = ReLU(s_j · f_i + θ), with ReLU the activation function; f_i ∈ R^{k×D} a filter of convolution length k (i.e., convolution window size k) applied to D-dimensional word vectors; θ the bias; and s_j = [w_j, w_{j+1}, …, w_{j+k−1}] the word vector matrix formed by k consecutive words starting from the j-th word of the text data, where w_j ∈ R^D is the D-dimensional word vector of the j-th word. The index i ranges from 1 to m, where m is the number of convolution window types; j ranges from 1 to n, where n is the number of words after segmentation of the text data.
For example, with m = 3 there are three window types: the type-1 convolution window has size k = 2, the type-2 window k = 3, and the type-3 window k = 4.
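One row of the feature matrix can be sketched as follows. This is a minimal NumPy illustration of c_ij = ReLU(s_j · f + θ) under the definitions above, with one filter; names and shapes are our own:

```python
import numpy as np

def conv_row(word_vecs, filt, theta=0.0):
    """Slide a window of k consecutive D-dim word vectors over the sentence
    and compute ReLU(sum(s_j * f) + theta) at each position j."""
    k, D = filt.shape
    n = word_vecs.shape[0]
    return np.array([max(0.0, float(np.sum(word_vecs[j:j + k] * filt) + theta))
                     for j in range(n - k + 1)])
```

Stacking m such rows (one per filter / window size) gives the feature matrix F the text describes; here a sentence of n words and a window of size k yield n − k + 1 feature values per filter.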
Further, as shown in fig. 4, step 102 may be specifically implemented by:
determining the preceding-context feature representation corresponding to the feature vector according to formula (2);
determining the following-context feature representation corresponding to the feature vector according to formula (3);
obtaining the context feature representation corresponding to the feature vector from the preceding- and following-context feature representations using formula (4);
wherein h istA hidden state h corresponding to the t-th word in the text datat=ot⊙tanh(ct),ot=δ(Wo·X+bo),ct=ft⊙t-1+it⊙tanh(Wc·X+bc),ft=δ(Wf·X+bf),,it=δ(Wi·X+bi);
Wherein, Wf、Wi、Wo、WcWeight matrix for LSTM, bf、bi、bo、bcOffset of LSTM, w1tColumn vectors of the t-th column of the F matrix, delta (·) is an activation function, ⊙ is a dot product operation of the matrix, and n is the number of word vectors.
In particular, δ (·) may be an activation function sigmoid.
The Bi-LSTM layer can fuse the current vocabulary information and all the context information thereof together to obtain the characteristic representation of the context.
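One step of the gate equations above can be sketched in NumPy. This is an illustrative sketch, not the patent's implementation: X is assumed to be the concatenation of the previous hidden state and the current input, each gate has its own weight matrix and bias (collected in dicts here), and δ is the sigmoid:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step following the gate equations in the text."""
    X = np.concatenate([h_prev, x])
    f = sigmoid(W["f"] @ X + b["f"])                    # forget gate f_t
    i = sigmoid(W["i"] @ X + b["i"])                    # input gate i_t
    o = sigmoid(W["o"] @ X + b["o"])                    # output gate o_t
    c = f * c_prev + i * np.tanh(W["c"] @ X + b["c"])   # cell state c_t
    h = o * np.tanh(c)                                  # hidden state h_t
    return h, c
```

Running this step left-to-right and right-to-left over the columns of F, and concatenating the two hidden states per position, gives the Bi-LSTM context features.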
Further, as shown in fig. 5, step 103 may be specifically implemented as follows:
determining the semantic code corresponding to the context feature representation by formula (5), according to the context feature representations extracted by the Bi-LSTM model and the probability weights;
where out_t is the t-th context feature representation and a_t denotes the importance, i.e., the probability weight, of the t-th feature representation, calculated by formula (6):

a_t = exp(r_t) / Σ_{l=1..n} exp(r_l)

where r_t = v^T tanh(W_A·out_t + b); W_A is a parameter matrix; b is a bias term; and v^T is the transpose of the randomly initialized vector v.
The index l ranges from 1 to n, and n is the number of word vectors.
The semantic code corresponding to the context feature representation is calculated from the output of the Bi-LSTM layer and the probability weights, and the Attention mechanism highlights the influence of the important features.
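The attention weighting can be sketched as follows. A minimal NumPy illustration under the definitions above: scores r_t = v^T tanh(W_A·out_t + b), softmax weights a_t, and a semantic code formed as the weighted sum of the Bi-LSTM outputs (the source does not spell out the pooling form of formula (5); the weighted sum is an assumption). Names are our own:

```python
import numpy as np

def attention_code(outs, W_A, v, b):
    """Compute attention weights over Bi-LSTM outputs and the weighted code."""
    r = np.array([v @ np.tanh(W_A @ o + b) for o in outs])  # scores r_t
    e = np.exp(r - r.max())                                  # stable softmax
    a = e / e.sum()                                          # weights a_t
    return a, (a[:, None] * outs).sum(axis=0)                # semantic code
```

When all positions carry identical features the weights are uniform; distinctive keyword positions receive larger scores and hence larger weight, which is exactly the "important features get higher weight" behavior the text describes.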
The method of the embodiment of the invention inserts the Bi-LSTM model as a single layer between the convolutional layer and the pooling layer. The convolutional layer first performs preliminary feature extraction on the text; the feature vectors are then fed into the Bi-LSTM model to fully capture the contextual features of the words in the text data. To distinguish important features and filter out unimportant ones, an attention mechanism and top-k max pooling are applied to the output of the Bi-LSTM model: the attention mechanism gives important features higher weight, and the stride of the top-k max pooling sliding window can be k − 1, which mimics the N-gram operation of natural language processing.
To sum up, in the method of the embodiment of the invention, after the convolutional layer computes the feature vectors, the results are placed into the feature matrix F in calculation order; to prevent a pooling layer from disturbing the order of the original sentence, the Bi-LSTM is introduced directly after the convolutional layer; the Bi-LSTM output is passed through an Attention mechanism so that important features receive higher weight; top-k max pooling is introduced to reduce the feature dimensionality and improve classification accuracy; and finally the extracted feature representations are sent to a strong classifier, yielding a better classification effect.
Based on the method of the embodiments above, the CBLTK model of the embodiment of the present invention is built by merging the CNN and Bi-LSTM models, using deep learning's ability to extract deeper features of the text. To improve the accuracy of the final classification, several commonly used strong classifiers are compared below, and the extracted features are sent to these strong classifiers, such as a support vector machine (SVM) or a random forest (RF), for classification:
The text data used below is, for example, movie review data in which each review is already labeled with the number of stars the user gave (five stars meaning good and one star meaning poor). Five-star and one-star reviews are extracted for the text emotion classification study (i.e., binary classification). Thirty thousand texts per class (positive and negative) are used as training data and twenty thousand per class as test data. On statistical average each review contains 45 words, so the model of the embodiment fixes the sentence length at 45 words: sentences longer than 45 words are truncated, and sentences shorter than 45 words are padded with nulls. The Jieba tokenizer under Python is used for word segmentation, and TensorFlow 1.6 is used to build the CBLTK model of the embodiment. The word-vector training corpus may be the 30 GB Sogou news corpus.
Because of the sequential relationship among the words of text data, the embodiment uses only one convolution layer as the first layer of the CBLTK model, followed, after the attention mechanism, by a max-pooling layer and a classification layer such as softmax. During training, dropout can be set to 50% and L2 regularization is applied. The minibatch size can be set to 100, with three sizes of convolution windows and 150 convolution kernels per size; the best of the three window sizes is selected.
TABLE 1 selection of convolution window size
From the results in Table 1, convolution windows of lengths 2, 3, and 4 are selected for the convolution operation, and 150 hidden-layer neurons can be used for each LSTM in the Bi-LSTM layer. For the k-max pooling layer, values of k from 1 to 6 were tried; from the results in the table below, k = 3 can be selected.
TABLE 2 selection of k values
Top-k value | Accuracy
1 | 71.2% |
2 | 74.5% |
3 | 77.6% |
4 | 75.3% |
5 | 76.8% |
6 | 75.6% |
The CBLTK model of the embodiment mainly uses the final classification accuracy as the evaluation index. Several groups of comparison experiments were run, classifying with softmax and with strong classifiers respectively, and the classification result of a traditional text emotion classification method is added as a reference. The specific results are as follows:
TABLE 3 use of softmax classifier
Model | Accuracy rate |
CNN | 80.3% |
LSTM | 81.1% |
CNN+LSTM | 82.8% |
LSTM+CNN | 83.3% |
CBLTK | 85.2% |
TABLE 4 Use of a strong classifier (SVM)
It can be seen from tables 3 and 4 that a conventional text emotion classification method that does not use deep learning (for example, the term frequency-inverse document frequency (Tf-idf) algorithm) achieves lower accuracy, because it can only extract shallow text features, and that using LSTM first and then CNN performs better than using CNN first and then LSTM. It can be seen from table 4 that although the CBLTK model proposed in the embodiment of the present invention uses an SVM, the improvement is not large, so the following experiments combine the model with four commonly used strong classifiers: SVM, random forest (RF), naive Bayes and GBDT, to find the combination with the best classification effect.
TABLE 5 combination of different classifiers
Combination mode | Accuracy rate |
CBLTK+SVM | 86.1% |
CBLTK+RF | 87.6% |
CBLTK+GBDT | 89.3% |
CBLTK + naive Bayes | 85.3% |
The results in table 5 are related to factors other than the currently selected data, such as the amount of data and whether the features extracted by the current model suit the current classifier. It can be seen from the results in table 5 that using a strong classifier is not necessarily always effective; for example, the naive Bayes combination does not perform as well as expected, which may be related to the characteristics of the data used and of naive Bayes: naive Bayes assumes that the features are mutually independent, whereas words and phrases in text classification are strongly correlated. The results also suggest that GBDT outperforms random forest (RF) probably because RF uses the bagging idea of ensemble learning while GBDT belongs to the boosting idea, which samples according to the error rate, that is, during training a weak classifier with a large classification error is given a relatively low weight. The training process of GBDT is therefore similar to that of a deep learning model integrating the Attention mechanism, in which weight values are used to highlight important features.
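The boosting behaviour invoked above (reweighting according to the error rate so that misclassified samples receive more attention) can be illustrated with a one-step AdaBoost-style weight update; this is a generic sketch of the boosting idea, not the GBDT variant actually combined with CBLTK.

```python
import numpy as np

def boosting_reweight(sample_weights, correct):
    """One AdaBoost-style round: compute the weighted error of a weak
    classifier, then raise the weights of misclassified samples."""
    err = np.sum(sample_weights[~correct]) / np.sum(sample_weights)
    alpha = 0.5 * np.log((1 - err) / err)        # classifier weight: low when err is high
    w = sample_weights * np.exp(np.where(correct, -alpha, alpha))
    return alpha, w / w.sum()                    # renormalized sample weights

w0 = np.full(4, 0.25)
correct = np.array([True, True, True, False])    # one misclassified sample
alpha, w1 = boosting_reweight(w0, correct)
# the misclassified sample now carries more weight than each correct one
```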
Fig. 6 is a structural diagram of an embodiment of a text emotion classification device provided in the present invention, and as shown in fig. 6, the text emotion classification device of the present embodiment includes:
the extraction module 601 is configured to obtain a word vector in text data to be processed, and extract a feature vector corresponding to the word vector;
the extraction module 601 is further configured to extract a context feature representation from the feature vector by using a bidirectional long short-term memory network (Bi-LSTM) model;
a determining module 602, configured to determine, according to the extracted context feature representation, a semantic code corresponding to the context feature representation;
a processing module 603, configured to perform maximal pooling on the semantic codes corresponding to the context feature representations, and splice multiple semantic codes after the maximal pooling to obtain a spliced feature representation;
and classifying the spliced feature representation to acquire the emotion type corresponding to the text data.
In a possible implementation manner, the extracting module 601 is specifically configured to:
inputting the word vector into the convolutional layer to obtain a characteristic matrix as follows:
each row of the F matrix is the feature vector generated by convolving the word vectors in the text data with a convolution window of one of the different sizes;

wherein each element of a row satisfies c_ij = ReLU(s_j · f + θ), where ReLU is the activation function, f ∈ R^{k×D} denotes a filter with convolution window length k performing the convolution operation on D-dimensional word vectors, θ denotes the offset, and s_j is the word vector matrix composed of the k consecutive words starting from the jth word in the text data, s_j = [w_j, w_{j+1}, …, w_{j+k-1}], where w_j ∈ R^D is the word vector of the jth word in the text data and has dimension D; the value range of i is 1 to m, m being the number of convolution window sizes, and the value range of j is 1 to n, n being the number of words after word segmentation of the text data.
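The convolution feature extraction above can be sketched in numpy for a single filter; the dimensions D and k and the random values are illustrative assumptions, not the patent's parameters.

```python
import numpy as np

def conv_features(words, f, theta):
    """Slide a filter f (k x D) over the word-vector matrix words (n x D):
    c_j = ReLU(sum(s_j * f) + theta), with s_j the k words starting at j."""
    k, D = f.shape
    n = words.shape[0]
    out = []
    for j in range(n - k + 1):
        s_j = words[j:j + k]                                  # k consecutive word vectors
        out.append(max(0.0, float(np.sum(s_j * f) + theta)))  # ReLU activation
    return np.array(out)

rng = np.random.default_rng(0)
words = rng.standard_normal((45, 8))           # 45 words, D = 8
f = rng.standard_normal((3, 8))                # window length k = 3
c = conv_features(words, f, theta=0.1)         # one row of the F matrix
```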
In a possible implementation manner, the extracting module 601 is specifically configured to:
determining the above feature representation corresponding to the feature vector according to the following formula (2);
determining the following feature representation corresponding to the feature vector according to the following formula (3);
obtaining the feature representation of the context corresponding to the feature vector by using the following formula (4) according to the feature representation and the following feature representation;
wherein h_t is the hidden state corresponding to the t-th word in the text data, h_t = o_t ⊙ tanh(c_t), o_t = δ(W_o·X + b_o), c_t = f_t ⊙ c_{t-1} + i_t ⊙ tanh(W_c·X + b_c), f_t = δ(W_f·X + b_f), i_t = δ(W_i·X + b_i);

wherein W_f, W_i, W_o and W_c are the weight matrices of the LSTM, b_f, b_i, b_o and b_c are the offsets of the LSTM, w_t is the column vector of the t-th column of the F matrix, δ(·) is an activation function, ⊙ is the element-wise product of matrices, n is the number of word vectors, and the value range of t is 1 to n.
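A single step of the gate equations above can be sketched in numpy, taking δ(·) as the sigmoid and X as the concatenation of the previous hidden state with the current column of F; this wiring of X is a common reading of the formulas, and the exact arrangement in the patent may differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(w_t, h_prev, c_prev, W, b):
    """One LSTM step: f_t, i_t, o_t gates and cell update as in the
    equations above. W maps the concatenated input X = [h_prev; w_t]."""
    X = np.concatenate([h_prev, w_t])
    f_t = sigmoid(W["f"] @ X + b["f"])                  # forget gate
    i_t = sigmoid(W["i"] @ X + b["i"])                  # input gate
    o_t = sigmoid(W["o"] @ X + b["o"])                  # output gate
    c_t = f_t * c_prev + i_t * np.tanh(W["c"] @ X + b["c"])
    h_t = o_t * np.tanh(c_t)                            # hidden state for word t
    return h_t, c_t

rng = np.random.default_rng(1)
H, D = 4, 3                                             # hidden size, word-vector size
W = {g: rng.standard_normal((H, H + D)) for g in "fioc"}
b = {g: np.zeros(H) for g in "fioc"}
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, b)
```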
In a possible implementation manner, the determining module 602 is specifically configured to:
determining semantic codes corresponding to the feature representation of the context by using the following formula (5) according to the feature representation of the context extracted by the Bi-LSTM model and the probability weight;
wherein out_t is the feature representation of the context and a_t represents the degree of importance of the t-th feature representation, i.e., the probability weight of the t-th feature representation; a_t is calculated from the following equation (6):

wherein r_t = v^T tanh(W_A out_t + b), W_A is a parameter matrix, b is a bias term, and v^T is the transpose of the randomly initialized matrix v.
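The attention weighting of equations (5) and (6) — score each context feature out_t as r_t = v^T tanh(W_A out_t + b), softmax the scores into probability weights a_t, then form the weighted semantic code — can be sketched as follows; the feature dimensions and the weighted-sum pooling at the end are illustrative assumptions.

```python
import numpy as np

def attention_pool(outs, W_A, b, v):
    """outs: (n, d) context feature representations.
    Returns the probability weights a_t and the weighted semantic code."""
    r = np.tanh(outs @ W_A.T + b) @ v          # r_t = v^T tanh(W_A out_t + b)
    a = np.exp(r - r.max())
    a /= a.sum()                               # softmax -> probability weights a_t
    return a, a @ outs                         # weighted sum of the features

rng = np.random.default_rng(2)
outs = rng.standard_normal((45, 6))            # 45 words, feature dimension 6
W_A = rng.standard_normal((6, 6))
a, code = attention_pool(outs, W_A, np.zeros(6), rng.standard_normal(6))
```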
In a possible implementation manner, the processing module 603 is specifically configured to:
selecting the first k largest semantic codes in the sliding window by using a preset sliding window to obtain the first k largest semantic codes corresponding to a plurality of sliding windows;
and splicing the first k maximum semantic codes corresponding to the plurality of sliding windows to obtain the spliced feature representation.
In a possible implementation manner, the processing module 603 is specifically configured to:
and classifying the spliced feature representation by using a preset classification function to obtain the emotion classification corresponding to the text data.
The apparatus of this embodiment may be configured to implement the technical solutions of the above method embodiments, and the implementation principles and technical effects are similar, which are not described herein again.
Fig. 7 is a structural diagram of an embodiment of an electronic device provided in the present invention, and as shown in fig. 7, the electronic device includes:
a processor 701, and a memory 702 for storing executable instructions for the processor 701.
Optionally, the method may further include: a communication interface 703 for enabling communication with other devices.
The above components may communicate over one or more buses.
The processor 701 is configured to execute the corresponding method in the foregoing method embodiment by executing the executable instructions, and for the specific implementation process reference may be made to the foregoing method embodiment, which is not described herein again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method in the foregoing method embodiment is implemented.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
Claims (10)
1. A text emotion classification method is characterized by comprising the following steps:
acquiring a word vector in text data to be processed, and extracting a feature vector corresponding to the word vector by using convolution operation;
extracting a context feature representation from the feature vector by adopting a bidirectional long short-term memory network Bi-LSTM model;
determining semantic codes corresponding to the context feature representations according to the extracted context feature representations;
performing maximum pooling on semantic codes corresponding to the context feature representations, and splicing the semantic codes subjected to maximum pooling to obtain spliced feature representations;
and classifying the spliced feature representations to acquire emotion categories corresponding to the text data.
2. The method according to claim 1, wherein the extracting the feature vector corresponding to the word vector comprises:
inputting the word vector into the convolutional layer to obtain a characteristic matrix as follows:
each row of the F matrix is the feature vector generated by convolving the word vectors in the text data with a convolution window of one of the different sizes;

wherein each element of a row satisfies c_ij = ReLU(s_j · f + θ), where ReLU is the activation function, f ∈ R^{k×D} denotes a filter with convolution window length k performing the convolution operation on D-dimensional word vectors, θ denotes the offset, and s_j is the word vector matrix composed of the k consecutive words starting from the jth word in the text data, s_j = [w_j, w_{j+1}, …, w_{j+k-1}], where w_j ∈ R^D is the word vector of the jth word in the text data and has dimension D; the value range of i is 1 to m, m being the number of convolution window sizes, and the value range of j is 1 to n, n being the number of words after word segmentation of the text data.
3. The method of claim 2, wherein the extracting a context feature representation from the feature vector by adopting a bidirectional long short-term memory network Bi-LSTM model comprises:
determining the above feature representation corresponding to the feature vector according to the following formula (2);
determining the following feature representation corresponding to the feature vector according to the following formula (3);
obtaining the feature representation of the context corresponding to the feature vector by using the following formula (4) according to the feature representation and the following feature representation;
wherein h_t is the hidden state corresponding to the t-th word in the text data, h_t = o_t ⊙ tanh(c_t), o_t = δ(W_o·X + b_o), c_t = f_t ⊙ c_{t-1} + i_t ⊙ tanh(W_c·X + b_c), f_t = δ(W_f·X + b_f), i_t = δ(W_i·X + b_i);

wherein W_f, W_i, W_o and W_c are the weight matrices of the LSTM, b_f, b_i, b_o and b_c are the offsets of the LSTM, w_t is the column vector of the t-th column of the F matrix, δ(·) is an activation function, ⊙ is the element-wise product of matrices, n is the number of word vectors, and the value range of t is 1 to n.
4. The method according to claim 3, wherein the determining semantic coding corresponding to the context feature representation according to the extracted context feature representation comprises:
determining semantic codes corresponding to the feature representation of the context by using the following formula (5) according to the feature representation and the probability weight of the context extracted by the Bi-LSTM model;
wherein out_t is the feature representation of the context and a_t represents the degree of importance of the t-th feature representation, i.e., the probability weight of the t-th feature representation; a_t is calculated from the following equation (6):

wherein r_t = v^T tanh(W_A out_t + b), W_A is a parameter matrix, b is a bias term, and v^T is the transpose of the randomly initialized matrix v.
5. The method according to claim 1, wherein performing maximal pooling on semantic codes corresponding to the context feature representations and splicing the multiple semantic codes after the maximal pooling to obtain a spliced feature representation comprises:
selecting the first k largest semantic codes in the sliding window by using a preset sliding window to obtain the first k largest semantic codes corresponding to a plurality of sliding windows;
and splicing the first k maximum semantic codes corresponding to the plurality of sliding windows to obtain the spliced feature representation.
6. The method according to any one of claims 1 to 5, wherein the classifying the spliced feature representation comprises:
and classifying the spliced feature representations by using different preset classification models to obtain emotion categories corresponding to the text data.
7. A text emotion classification device, comprising:
the extraction module is used for acquiring word vectors in the text data to be processed and extracting the feature vectors corresponding to the word vectors;
the extraction module is also used for extracting a context feature representation from the feature vector by adopting a bidirectional long short-term memory network Bi-LSTM model;
the determining module is used for determining, by utilizing an Attention mechanism, semantic codes corresponding to the context feature representations according to the extracted context feature representations;
the processing module is used for performing maximum pooling processing on the semantic codes corresponding to the context feature representation and splicing the semantic codes subjected to maximum pooling processing to obtain spliced feature representation;
and classifying the spliced feature representation to acquire the emotion type corresponding to the text data.
8. The apparatus of claim 7, wherein the processing module is specifically configured to:
selecting the first k largest semantic codes in the sliding window by using a preset sliding window to obtain the first k largest semantic codes corresponding to a plurality of sliding windows;
and splicing the first k maximum semantic codes corresponding to the plurality of sliding windows to obtain the spliced feature representation.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1-6.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1-6 via execution of the executable instructions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911110950.9A CN110879938A (en) | 2019-11-14 | 2019-11-14 | Text emotion classification method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911110950.9A CN110879938A (en) | 2019-11-14 | 2019-11-14 | Text emotion classification method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110879938A true CN110879938A (en) | 2020-03-13 |
Family
ID=69730444
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911110950.9A Pending CN110879938A (en) | 2019-11-14 | 2019-11-14 | Text emotion classification method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110879938A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111538835A (en) * | 2020-03-30 | 2020-08-14 | 东南大学 | Social media emotion classification method and device based on knowledge graph |
CN111737467A (en) * | 2020-06-22 | 2020-10-02 | 华南师范大学 | Object-level emotion classification method based on segmented convolutional neural network |
CN111930938A (en) * | 2020-07-06 | 2020-11-13 | 武汉卓尔数字传媒科技有限公司 | Text classification method and device, electronic equipment and storage medium |
CN112069307A (en) * | 2020-08-25 | 2020-12-11 | 中国人民大学 | Legal law citation information extraction system |
CN113361252A (en) * | 2021-05-27 | 2021-09-07 | 山东师范大学 | Text depression tendency detection system based on multi-modal features and emotion dictionary |
CN113393276A (en) * | 2021-06-25 | 2021-09-14 | 食亨(上海)科技服务有限公司 | Comment data classification method and device and computer readable medium |
CN113469365A (en) * | 2021-06-30 | 2021-10-01 | 上海寒武纪信息科技有限公司 | Inference and compilation method based on neural network model and related products thereof |
CN114168730A (en) * | 2021-11-26 | 2022-03-11 | 一拓通信集团股份有限公司 | Consumption tendency analysis method based on BilSTM and SVM |
CN114298019A (en) * | 2021-12-29 | 2022-04-08 | 中国建设银行股份有限公司 | Emotion recognition method, emotion recognition apparatus, emotion recognition device, storage medium, and program product |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107092596A (en) * | 2017-04-24 | 2017-08-25 | 重庆邮电大学 | Text emotion analysis method based on attention CNNs and CCR |
CN108363753A (en) * | 2018-01-30 | 2018-08-03 | 南京邮电大学 | Comment text sentiment classification model is trained and sensibility classification method, device and equipment |
WO2019079922A1 (en) * | 2017-10-23 | 2019-05-02 | 腾讯科技(深圳)有限公司 | Session information processing method and device, and storage medium |
CN109710761A (en) * | 2018-12-21 | 2019-05-03 | 中国标准化研究院 | The sentiment analysis method of two-way LSTM model based on attention enhancing |
WO2019085328A1 (en) * | 2017-11-02 | 2019-05-09 | 平安科技(深圳)有限公司 | Enterprise relationship extraction method and device, and storage medium |
CN109740148A (en) * | 2018-12-16 | 2019-05-10 | 北京工业大学 | A kind of text emotion analysis method of BiLSTM combination Attention mechanism |
US20190188260A1 (en) * | 2017-12-14 | 2019-06-20 | Qualtrics, Llc | Capturing rich response relationships with small-data neural networks |
CN110162636A (en) * | 2019-05-30 | 2019-08-23 | 中森云链(成都)科技有限责任公司 | Text mood reason recognition methods based on D-LSTM |
Non-Patent Citations (2)
Title |
---|
CHENG Lu: "Research on an attention-mechanism-based bidirectional LSTM model for sentiment classification of Chinese product reviews" * |
BAI Jing; LI Fei; JI Donghong: "An attention-based BiLSTM-CNN model for Chinese microblog stance detection" * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111538835A (en) * | 2020-03-30 | 2020-08-14 | 东南大学 | Social media emotion classification method and device based on knowledge graph |
CN111538835B (en) * | 2020-03-30 | 2023-05-23 | 东南大学 | Social media emotion classification method and device based on knowledge graph |
CN111737467B (en) * | 2020-06-22 | 2023-05-23 | 华南师范大学 | Object-level emotion classification method based on segmented convolutional neural network |
CN111737467A (en) * | 2020-06-22 | 2020-10-02 | 华南师范大学 | Object-level emotion classification method based on segmented convolutional neural network |
CN111930938A (en) * | 2020-07-06 | 2020-11-13 | 武汉卓尔数字传媒科技有限公司 | Text classification method and device, electronic equipment and storage medium |
CN112069307A (en) * | 2020-08-25 | 2020-12-11 | 中国人民大学 | Legal law citation information extraction system |
CN113361252A (en) * | 2021-05-27 | 2021-09-07 | 山东师范大学 | Text depression tendency detection system based on multi-modal features and emotion dictionary |
CN113393276A (en) * | 2021-06-25 | 2021-09-14 | 食亨(上海)科技服务有限公司 | Comment data classification method and device and computer readable medium |
CN113393276B (en) * | 2021-06-25 | 2023-06-16 | 食亨(上海)科技服务有限公司 | Comment data classification method, comment data classification device and computer-readable medium |
CN113469365A (en) * | 2021-06-30 | 2021-10-01 | 上海寒武纪信息科技有限公司 | Inference and compilation method based on neural network model and related products thereof |
CN113469365B (en) * | 2021-06-30 | 2024-03-19 | 上海寒武纪信息科技有限公司 | Reasoning and compiling method based on neural network model and related products thereof |
CN114168730A (en) * | 2021-11-26 | 2022-03-11 | 一拓通信集团股份有限公司 | Consumption tendency analysis method based on BilSTM and SVM |
CN114298019A (en) * | 2021-12-29 | 2022-04-08 | 中国建设银行股份有限公司 | Emotion recognition method, emotion recognition apparatus, emotion recognition device, storage medium, and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20200313 |