CN115544252A - Text emotion classification method based on attention static routing capsule network - Google Patents
Text emotion classification method based on attention static routing capsule network
- Publication number
- CN115544252A (application CN202211152911.7A)
- Authority
- CN
- China
- Prior art keywords
- text
- attention
- layer
- vector
- capsule
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A text sentiment classification method based on an attention static routing capsule network, comprising: collecting unlabeled text data of a target language; training on the unlabeled text data with the word2vec method to obtain a word vector representation of the target language; collecting labeled text data of the target language; constructing a classification model based on an attention static routing capsule network; performing supervised training of the classification model on the labeled text data of the target language; and evaluating the trained classification model using accuracy, precision, recall and F1 score to obtain a text sentiment classification model that meets the requirements, which is then used to classify input text. The method improves the ability to extract text features and to model the relationships among them, and ultimately improves the accuracy of text emotion classification.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence and text emotion classification, and particularly relates to a text emotion classification method based on an attention static routing capsule network.
Background
Text sentiment classification is one of the most basic and important tasks in the field of machine learning. Conventionally, term frequency-inverse document frequency (TF-IDF) is used as the feature representation of a text, and a general-purpose classifier such as a Support Vector Machine (SVM) or logistic regression is then used for text emotion classification.
However, in recent years the continued development of deep learning methods has made it possible to learn distributed representations of words and documents efficiently, which has further improved the accuracy of text emotion classification. The deep learning models used for text emotion classification are mainly based on Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) and the Transformer architecture, which has become very popular in recent years. In 2017, Hinton proposed the capsule network to address the shortcomings of convolutional neural networks and applied it to image processing, showing that it is effective at capturing spatial relationships in high-level data. Researchers later applied the capsule network to text processing with good results, showing that it also has advantages for processing text information. In a traditional capsule network, information is passed between capsule layers by a dynamic routing mechanism, which must iteratively recompute the weights of the capsules for every input, so the process is very time-consuming.
Therefore, how to reduce the time spent on routing between capsules in the capsule network without reducing the accuracy of the model becomes an urgent problem to be solved in the field.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a text emotion classification method based on an attention static routing capsule network, so as to improve the extraction capability of text features and the relation modeling capability among the text features, and finally improve the accuracy of text emotion classification.
In order to achieve the purpose, the invention adopts the technical scheme that:
a text emotion classification method based on an attention static routing capsule network comprises the following steps:
step 1, collecting unlabeled text data of a target language, the target language being the language in which the text emotion classification task is ultimately performed;
step 2, training on the unlabeled text data of step 1 with the word2vec method to obtain a word vector representation of the target language;
step 3, collecting labeled text data of the target language;
step 4, constructing a classification model based on the attention static routing capsule network;
step 5, performing supervised training of the classification model of step 4 on the labeled text data of the target language obtained in step 3;
step 6, evaluating the classification model trained in step 5 using accuracy, precision, recall and F1 score to obtain a text emotion classification model that meets the requirements, and classifying input text with that model.
In one embodiment, in step 1, text data is collected and cleaned, and useless text and non-text content are removed to obtain the unlabeled text data, wherein the unlabeled text contains no fewer than 1,000,000 words in total.
In one embodiment, in step 2, word embedding pre-training is performed with the continuous bag-of-words (CBOW) model of word2vec to obtain real-valued vectors for all words in the target language, that is, the word vector representation.
In one embodiment, in step 3, text data is collected and cleaned, useless text and non-text content are removed, and the emotional tendency of each text is then labeled manually.
In one embodiment, the components of the classification model comprise: a word2vec word embedding layer, a two-dimensional convolution layer, a fully connected layer, a squash (extrusion) pooling layer, a primary capsule layer, an intermediate capsule layer, a high-level capsule layer and a classification capsule layer;
the word2vec word embedding layer maps a text into a sequence of word vectors; the word vector sequence forms a real-valued matrix that is input into the two-dimensional convolution layer as a single-channel image, and the two-dimensional convolution layer extracts multi-scale features of the text using multi-scale convolution and converts them into vector capsules;
the fully connected layer unifies the dimensions of the multi-scale features extracted by the two-dimensional convolution layer, after which the dimension-unified multi-scale features are fused based on attention weights;
the squash pooling layer compresses the fused features into vectors whose norm (modulus) lies between 0 and 1, which serve as the input of the primary capsule layer;
the primary capsule layer, the intermediate capsule layer, the high-level capsule layer and the classification capsule layer pass the original semantic information extracted by the convolution layer step by step to the model output using attention static routing, thereby yielding the emotion category of the text.
In one embodiment, the step 5 training process is as follows:
1) The text data to be classified, $T = \{w_1, w_2, \ldots, w_n\}$, is input into the word2vec word embedding layer, and each word $w_i$ is mapped to a real vector $v_i \in \mathbb{R}^d$, so that the entire text becomes a matrix $D = \{v_1, v_2, \ldots, v_n\} \in \mathbb{R}^{d \times n}$, where $d$ is the dimension of the word vectors and $n$ is the length of the text;
2) The matrix D is input into the two-dimensional convolution layer as a single-channel image, and features are extracted from it with multi-scale convolution kernels to obtain multi-scale features, wherein the vertical length $o_h$ of the output is given by:
$$o_h = \left\lfloor \frac{n_h - k_h + p_h + s_h}{s_h} \right\rfloor$$
where $\lfloor \cdot \rfloor$ denotes rounding down, $n_h$ is the vertical length of the matrix D, $k_h$ is the vertical length of the convolution kernel, $p_h$ is the vertical padding, and $s_h$ is the vertical stride.
3) The multi-scale features are brought to the same dimension by the fully connected layer, giving the multi-scale output features $g_i$;
4) The multi-scale output features $g_i$ on the same output channel are fused by an attention-weighted sum to obtain the fused feature $s_i$;
5) In the squash pooling layer, the fused feature $s_i$ is compressed by a squash (extrusion) operation into a vector $c$ whose norm (modulus) does not exceed 1, which is then input into the subsequent capsule layer, wherein the formula of the squash operation is as follows:
6) The primary capsule layer, the intermediate capsule layer, the high-level capsule layer and the classification capsule layer are all connected, and the routing mode among the capsules adopts an attention static routing mechanism.
In one embodiment, the two-dimensional convolution layer uses 5 multi-scale convolution kernels of sizes $1 \times d$, $3 \times d$, $5 \times d$, $7 \times d$ and $9 \times d$, with vertical stride $s_h = 1$, vertical padding $p_h = 0$ and 256 output channels each; the resulting output shapes of the multi-scale convolution are $o_1 \in \mathbb{R}^{n \times 1}$, $o_2 \in \mathbb{R}^{(n-2) \times 1}$, $o_3 \in \mathbb{R}^{(n-4) \times 1}$, $o_4 \in \mathbb{R}^{(n-6) \times 1}$ and $o_5 \in \mathbb{R}^{(n-8) \times 1}$; the fully connected layers are $W_1 \in \mathbb{R}^{e \times n}$, $W_2 \in \mathbb{R}^{e \times (n-2)}$, $W_3 \in \mathbb{R}^{e \times (n-4)}$, $W_4 \in \mathbb{R}^{e \times (n-6)}$ and $W_5 \in \mathbb{R}^{e \times (n-8)}$; the dimensions are unified as $W_i o_i = g_i \in \mathbb{R}^{e \times 1}$, where $g_i$ are the multi-scale output features of the same dimension and $e$ is the unified dimension of the multi-scale features.
In one embodiment, the weighted fusion method is as follows:
on each channel the multi-scale output features are m vectors $g_i \in \mathbb{R}^e$, and we let $g_i = k_i = v_i \in \mathbb{R}^e$; given a query vector $q \in \mathbb{R}^e$ used to query the importance of the semantic features and m key-value pairs $(k_1, v_1), \ldots, (k_m, v_m)$, the fusion of the multi-scale features based on attention weights is expressed as:
$$s = f\big(q, (k_1, v_1), \ldots, (k_m, v_m)\big) = \sum_{i=1}^{m} \alpha(q, k_i)\, v_i, \qquad s \in \mathbb{R}^e$$
where q denotes the query, k the key and v the value, and $q, k_1 \ldots k_m, v_1 \ldots v_m$ are the inputs of the function $f$.
The attention weight $\alpha(q, k_i)$ of $q$ and $k_i$ is obtained by mapping the vectors $q$ and $k_i$ to a scalar with the attention scoring function $a$ and then applying softmax, which yields a real-valued weight between 0 and 1:
$$\alpha(q, k_i) = \mathrm{softmax}\big(a(q, k_i)\big) = \frac{\exp\big(a(q, k_i)\big)}{\sum_{j=1}^{m} \exp\big(a(q, k_j)\big)}, \qquad \alpha(q, k_i) \in \mathbb{R}$$
The attention scoring function $a$ is computed with additive attention. Given a vector $q \in \mathbb{R}^e$, a vector $k_i \in \mathbb{R}^e$, learnable parameter matrices $W_q \in \mathbb{R}^{e \times e}$ and $W_k \in \mathbb{R}^{e \times e}$, and a learnable parameter vector $w_v \in \mathbb{R}^{1 \times e}$, the product of $W_q$ and $q$ is added to the product of $W_k$ and $k_i$, the sum is passed through the tanh function as a nonlinear transformation, and the row vector $w_v$ is multiplied with the result to obtain the attention score, which is a real number:
$$a(q, k_i) = w_v \tanh(W_q q + W_k k_i) \in \mathbb{R}$$
In one embodiment, the attention static routing mechanism assigns a weight to each vector by means of learnable parameter matrices and an attention mechanism: for lower-level capsules 1, 2 and 3, which output the vectors $v_1$, $v_2$ and $v_3$ respectively, each output vector is scored with the additive attention scoring function $a$, the attention scores are passed through a softmax operation to obtain the corresponding weights, $v_1$, $v_2$ and $v_3$ are summed with these weights to obtain a vector $y$, and $y$ is then squashed to obtain a vector $v_i$ whose norm (modulus) lies between 0 and 1, which is fed into the capsule of the next layer.
Compared with the prior art, the invention has the beneficial effects that:
First, the invention designs a new model structure, the capsule network based on attention static routing (CapsNet-ASR), whose overall structure is: word embedding layer, convolution layer, primary capsule layer, intermediate capsule layer, high-level capsule layer and classification capsule layer. Second, the dynamic routing between capsule layers is replaced by the attention static routing mechanism of the invention, so that the network learns during training how to assign routing weights to the lower-level capsules, which improves routing efficiency. Third, the convolution layer of the model uses multi-scale convolution kernels to extract text information more effectively, and the multi-scale convolution features on the same output channel are fused by an attention-weighted sum. Finally, the usual pooling operation of convolutional neural networks is replaced by a squash (extrusion) operation to improve the modeling of relationships among semantic features. These improvements effectively strengthen the extraction of text features and the modeling of relationships among them, and ultimately improve the accuracy of text emotion classification.
Drawings
FIG. 1 is a diagram of a classification model architecture for an attention-based static routing capsule network.
FIG. 2 is a schematic diagram of fusion of multi-scale features based on attention weights.
Fig. 3 is an attention static routing diagram.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
Compared with existing text emotion classification methods, the method of the invention replaces the dynamic routing process of the capsule network with static routing based on an attention mechanism, so that the network learns during training how to assign routing weights to the lower-level capsules; it also replaces the usual pooling operation of convolutional neural networks with a squash (extrusion) operation. The overall structure of the network model is: a word embedding layer, a two-dimensional convolution layer, a fully connected layer, a squash pooling layer, a primary capsule layer, an intermediate capsule layer, a high-level capsule layer and a classification capsule layer.
Specifically, the method comprises the following steps:
Step 1, collecting and organizing unlabeled text data of a target language.
This step mainly consists of collecting text data suited to the specific task; the target language is the language in which the text emotion classification task is ultimately performed. The purpose of word embedding is to map each word to a real-valued vector so that the subsequent neural network can recognize similarity between words. Word embedding training relies only on the context of each word, so only unlabeled text of the target language needs to be collected.
For example, if the target task is sentiment analysis of microblog comments, a large amount of unlabeled Chinese comment text is collected from the microblog platform and organized. Illustratively, in this step the text data is collected and cleaned: useless text and non-text content such as hyperlinks, symbols and emoticons are removed to obtain the unlabeled text data. For instance, various articles on the internet or various microblog texts (topic texts, comment texts, etc.) are collected, and irrelevant hyperlinks, symbols, emoticons and the like are then removed from them. To ensure the quality of the word vectors, the more unlabeled text is used for pre-training the better; in general the total should be no fewer than 1,000,000 words.
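The cleaning described in step 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the patent does not specify how hyperlinks or emoticons are recognized, so the regular expressions below (HTTP(S) URLs, bracketed microblog-style codes such as `[doge]`, and a range of Unicode emoji) and the function name `clean_text` are hypothetical choices, and punctuation is deliberately left untouched.

```python
import re

# Illustrative patterns (assumptions, not specified by the patent).
URL_RE = re.compile(r"https?://\S+")
BRACKET_EMOTICON_RE = re.compile(r"\[[^\[\]]{1,8}\]")  # e.g. "[doge]"
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Remove hyperlinks and emoticons, then normalize whitespace."""
    text = URL_RE.sub(" ", raw)
    text = BRACKET_EMOTICON_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


sample = "great post http://t.cn/abc [doge] \U0001F600 thanks"
cleaned = clean_text(sample)
```

In practice the patterns would be adapted to the platform the text is collected from; the same function can be reused for the labeled data of step 3.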
Step 2, training on the unlabeled text data of step 1 with the word2vec method to obtain a word vector representation of the target language.
Specifically, word frequency statistics are computed on the unlabeled text data collected and organized in step 1 (for example, microblog comment text), and a vocabulary is built in which each word corresponds to a real vector $w_i \in \mathbb{R}^d$ to be trained. Word embedding training is then performed. The word2vec method includes two models: the Skip-Gram model and the continuous bag-of-words (CBOW) model. The invention uses self-supervised training with the CBOW model of word2vec for word embedding pre-training, obtaining real-valued vectors for all words in the target language, that is, the word vector representation.
The continuous bag-of-words model assumes that a center word is generated from the context words surrounding it in the text sequence. For example, in the text sequence "i", "people", "love", "self", "have", "ancestor", "country", with "love" as the center word and a context window of 2, the continuous bag-of-words model considers the conditional probability of generating the center word "love" from the context words "i", "people", "self" and "have", i.e. P("love" | "i", "people", "self", "have").
Using maximum likelihood estimation over all words in the organized text, with maximizing this conditional probability as the objective, the word vectors $w_i \in \mathbb{R}^d$ are updated iteratively.
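The conditional probability used by CBOW can be illustrated with a small sketch. It assumes the standard CBOW formulation, in which the context vectors are averaged and a softmax over the vocabulary scores each candidate center word; the patent does not spell out this formula, and the function name `cbow_probability`, the separate input and output vector tables, and the toy two-dimensional vectors are all hypothetical.

```python
import math


def cbow_probability(center, context, in_vecs, out_vecs):
    """P(center | context) under the standard CBOW softmax (assumed form)."""
    dim = len(next(iter(in_vecs.values())))
    # Average the input vectors of the context words.
    mean = [sum(in_vecs[w][j] for w in context) / len(context) for j in range(dim)]
    # Score every vocabulary word against the averaged context vector.
    scores = {w: sum(u * m for u, m in zip(vec, mean)) for w, vec in out_vecs.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exp_scores = {w: math.exp(s - top) for w, s in scores.items()}
    return exp_scores[center] / sum(exp_scores.values())


# Toy vocabulary and vectors (illustrative values only).
vocab = ["i", "people", "love", "self", "have"]
in_vecs = {w: [0.1 * (i + 1), -0.05 * i] for i, w in enumerate(vocab)}
out_vecs = {w: [0.2 * i, 0.1] for i, w in enumerate(vocab)}
context = ["i", "people", "self", "have"]
p_love = cbow_probability("love", context, in_vecs, out_vecs)
```

Training then adjusts the vectors so that this probability is as large as possible for the center words actually observed in the corpus.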
Step 3, collecting and organizing labeled text data of the target language.
To carry out the text emotion classification task, the model must be trained with supervision, so labeled text data needs to be collected, in which each training sample consists of a text and a category label. For microblog comment emotion classification, for example, a sample is a comment text and its corresponding emotion label.
Illustratively, as in step 1, text data is collected and cleaned to remove useless text and non-text content, after which the emotional tendency (positive, negative or neutral) of each text is labeled manually. Taking microblog sentiment analysis as an example: microblog comment texts are collected, emoticons, hyperlinks and other non-text content are removed, the emotional tendency of each comment is labeled manually, and each comment thus becomes a text-emotion label pair. For example, the collected comment "this blogger is really young, we should learn from you" is manually labeled with the emotion label "positive emotion". This yields one labeled text sample for microblog comment sentiment classification.
Step 4, constructing a classification model based on the attention static routing capsule network.
The model code for the classification model of the invention can be written with the PyTorch framework, which is currently widely used in academia. Referring to FIG. 1, the model components include: a word2vec word embedding layer, a two-dimensional convolution layer, a fully connected layer, a squash (extrusion) pooling layer, a primary capsule layer, an intermediate capsule layer, a high-level capsule layer and a classification capsule layer, wherein:
the word2vec word embedding layer maps the text into a sequence of word vectors; the word vector sequence forms a real-valued matrix that is input into the two-dimensional convolution layer as a single-channel image, and the two-dimensional convolution layer extracts multi-scale features of the text using multi-scale convolution and converts them into vector capsules;
the fully connected layer unifies the dimensions of the multi-scale features extracted by the two-dimensional convolution layer, after which the dimension-unified multi-scale features are fused based on attention weights;
the squash pooling layer compresses the fused features into vectors whose norm (modulus) lies between 0 and 1, which serve as the input of the primary capsule layer;
the primary capsule layer, the intermediate capsule layer, the high-level capsule layer and the classification capsule layer pass the original semantic information extracted by the convolution layer step by step to the model output using attention static routing, thereby yielding the emotion category of the text.
The overall sequence of the classification model during training is as follows:
1) The text data to be classified is input into the word2vec word embedding layer. The input is $T = \{w_1, w_2, \ldots, w_n\}$; each word $w_i$ is mapped to a real vector $v_i \in \mathbb{R}^d$, so that the entire text becomes a matrix $D = \{v_1, v_2, \ldots, v_n\} \in \mathbb{R}^{d \times n}$, where $d$ is the dimension of the word vectors and $n$ is the length of the text.
As shown in FIG. 1, after passing through the word embedding layer, the text "this blogger is really young" is mapped to a real matrix $D \in \mathbb{R}^{d \times n}$, with text length $n = 10$ and hyperparameter $d = 64$.
2) The matrix D is input into the two-dimensional convolution layer as a single-channel image, and features are extracted from it with multi-scale convolution kernels to obtain multi-scale features. The vertical length $o_h$ of the output is determined by the following formula:
$$o_h = \left\lfloor \frac{n_h - k_h + p_h + s_h}{s_h} \right\rfloor$$
where $\lfloor \cdot \rfloor$ denotes rounding down, $n_h$ is the vertical length of the matrix D, $k_h$ is the vertical length of the convolution kernel, $p_h$ is the vertical padding, and $s_h$ is the vertical stride.
Illustratively, there are 5 multi-scale convolution kernels, of sizes $1 \times d$, $3 \times d$, $5 \times d$, $7 \times d$ and $9 \times d$. The vertical stride is $s_h = 1$, the vertical padding is $p_h = 0$, and each kernel has 256 output channels. The computed output shapes of the multi-scale convolution are $o_1 \in \mathbb{R}^{n \times 1}$, $o_2 \in \mathbb{R}^{(n-2) \times 1}$, $o_3 \in \mathbb{R}^{(n-4) \times 1}$, $o_4 \in \mathbb{R}^{(n-6) \times 1}$ and $o_5 \in \mathbb{R}^{(n-8) \times 1}$. For the example text "this blogger is really young" above ($n = 10$), the output shapes are $o_1 \in \mathbb{R}^{10 \times 1}$, $o_2 \in \mathbb{R}^{8 \times 1}$, $o_3 \in \mathbb{R}^{6 \times 1}$, $o_4 \in \mathbb{R}^{4 \times 1}$ and $o_5 \in \mathbb{R}^{2 \times 1}$.
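The output lengths quoted above can be checked with a short calculation. The sketch below assumes the common output-length rule floor((n_h - k_h + p_h + s_h) / s_h), which agrees with the values stated for stride 1 and padding 0; the function name `conv_output_length` is hypothetical.

```python
def conv_output_length(n_h: int, k_h: int, p_h: int = 0, s_h: int = 1) -> int:
    """Vertical output length of a convolution: floor((n_h - k_h + p_h + s_h) / s_h).

    The exact form of the rule is an assumption; for p_h = 0 and s_h = 1
    it reduces to n_h - k_h + 1.
    """
    return (n_h - k_h + p_h + s_h) // s_h


n = 10                             # text length of the example in FIG. 1
kernel_heights = [1, 3, 5, 7, 9]   # the five multi-scale kernels
lengths = [conv_output_length(n, k) for k in kernel_heights]
print(lengths)  # [10, 8, 6, 4, 2]
```

Note that with padding 0 and the largest kernel height 9, the text must contain at least 9 tokens for every kernel to produce an output.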
3) Because the multi-scale features have different output shapes in each output channel, they must be brought to the same dimension by the fully connected layer, giving the multi-scale output features $g_i$. The specific method is as follows:
The fully connected layers are $W_1 \in \mathbb{R}^{e \times n}$, $W_2 \in \mathbb{R}^{e \times (n-2)}$, $W_3 \in \mathbb{R}^{e \times (n-4)}$, $W_4 \in \mathbb{R}^{e \times (n-6)}$ and $W_5 \in \mathbb{R}^{e \times (n-8)}$ (for the example text "this blogger is really young" above: $W_1 \in \mathbb{R}^{e \times 10}$, $W_2 \in \mathbb{R}^{e \times 8}$, $W_3 \in \mathbb{R}^{e \times 6}$, $W_4 \in \mathbb{R}^{e \times 4}$ and $W_5 \in \mathbb{R}^{e \times 2}$). The dimensions are unified as $W_i o_i = g_i \in \mathbb{R}^{e \times 1}$, where $g_i$ are the multi-scale output features of the same dimension and $e$ is the unified dimension of the multi-scale features. Every matrix-vector product $W_i o_i = g_i \in \mathbb{R}^{e \times 1}$ yields an e-dimensional vector; for example, $W_1$ is an $e \times n$ matrix and $o_1$ is an $n \times 1$ vector, so $W_1 o_1$ is an $e \times 1$ vector.
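The dimension unification can be illustrated as follows. This is a minimal sketch on a single output channel with randomly initialized stand-ins for the learned matrices $W_1, \ldots, W_5$ and for the convolution outputs $o_1, \ldots, o_5$; the value e = 4 and the helper names `matvec` and `make_projection` are assumptions made for the example, since the patent does not fix e.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_projection(e, in_dim, rng):
    """Random e x in_dim matrix standing in for a learned W_i."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(e)]


rng = random.Random(0)
n, e = 10, 4                                   # e = 4 is an illustrative choice
in_dims = [n, n - 2, n - 4, n - 6, n - 8]      # lengths of o_1 .. o_5
outputs = [[rng.random() for _ in range(m)] for m in in_dims]
projections = [make_projection(e, m, rng) for m in in_dims]

# g_i = W_i o_i: every feature now has the same dimension e.
g = [matvec(W, o) for W, o in zip(projections, outputs)]
```

Because each $W_i$ has a fixed number of columns, this scheme presupposes a fixed text length n, for instance obtained by padding or truncating the input.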
4) After the dimensions have been unified, the multi-scale output features $g_i$ on the same output channel are fused by an attention-weighted sum to obtain the fused feature $s_i$.
Referring to FIG. 2, on each channel the multi-scale output features are m vectors $g_i \in \mathbb{R}^e$, and we let $g_i = k_i = v_i \in \mathbb{R}^e$. Suppose there is a query vector $q \in \mathbb{R}^e$ used to query the importance of the semantic features and m key-value pairs $(k_1, v_1), \ldots, (k_m, v_m)$; the fusion of the multi-scale features based on attention weights can then be expressed as:
$$s = f\big(q, (k_1, v_1), \ldots, (k_m, v_m)\big) = \sum_{i=1}^{m} \alpha(q, k_i)\, v_i, \qquad s \in \mathbb{R}^e$$
where q denotes the query, k the key and v the value, and $q, k_1 \ldots k_m, v_1 \ldots v_m$ are the inputs of the function $f$. The attention weight $\alpha(q, k_i)$ of $q$ and $k_i$ is obtained by mapping the vectors $q$ and $k_i$ to a scalar with the attention scoring function $a$ and then applying softmax, which yields a real-valued weight between 0 and 1:
$$\alpha(q, k_i) = \mathrm{softmax}\big(a(q, k_i)\big) = \frac{\exp\big(a(q, k_i)\big)}{\sum_{j=1}^{m} \exp\big(a(q, k_j)\big)}, \qquad \alpha(q, k_i) \in \mathbb{R}$$
The attention scoring function $a$ is computed with additive attention. Given a vector $q \in \mathbb{R}^e$, a vector $k_i \in \mathbb{R}^e$, learnable parameter matrices $W_q \in \mathbb{R}^{e \times e}$ and $W_k \in \mathbb{R}^{e \times e}$, and a learnable parameter vector $w_v \in \mathbb{R}^{1 \times e}$, the product of $W_q$ and $q$ is added to the product of $W_k$ and $k_i$, the sum is passed through the tanh function as a nonlinear transformation, and the row vector $w_v$ is multiplied with the result to obtain the attention score, which is a real number:
$$a(q, k_i) = w_v \tanh(W_q q + W_k k_i) \in \mathbb{R}$$
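The attention-weighted fusion can be sketched in plain Python. The sketch follows the verbal description above: an additive score $w_v \tanh(W_q q + W_k k_i)$, a softmax over the m scores, and a weighted sum of the values, with $g_i$ used as both key and value. The function names and the toy parameters (e = 2, identity matrices, a zero query) are illustrative assumptions; in the model, $q$, $W_q$, $W_k$ and $w_v$ would be learned.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def additive_score(q, k, W_q, W_k, w_v):
    """a(q, k) = w_v . tanh(W_q q + W_k k), a single real number."""
    hidden = [math.tanh(a + b) for a, b in zip(matvec(W_q, q), matvec(W_k, k))]
    return sum(w * h for w, h in zip(w_v, hidden))


def softmax(scores):
    """Turn real scores into weights in (0, 1) that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def attention_fuse(q, features, W_q, W_k, w_v):
    """Fuse features g_1..g_m (used as keys and values) into one vector s."""
    weights = softmax([additive_score(q, g, W_q, W_k, w_v) for g in features])
    dim = len(features[0])
    fused = [sum(a * g[j] for a, g in zip(weights, features)) for j in range(dim)]
    return fused, weights


# Toy values (assumptions): e = 2, identity projections, m = 3 features.
identity = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
s, alpha = attention_fuse([0.0, 0.0], features, identity, identity, [1.0, 1.0])
```

With these toy parameters the third feature receives the largest weight because its score, 2·tanh(1), exceeds the score tanh(1) of the other two.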
5) In the squash pooling layer, the fused feature $s_i$ is compressed by a squash (extrusion) operation into a vector $c$ whose norm (modulus) does not exceed 1, which is then fed into the subsequent capsule layer, as shown below.
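The formula of the squash operation is not reproduced in this text. A common choice in the capsule network literature is $c = \frac{\lVert s \rVert^2}{1 + \lVert s \rVert^2} \cdot \frac{s}{\lVert s \rVert}$, which preserves the direction of $s$ and maps its norm into the interval [0, 1). The sketch below implements that standard form on the assumption that the patent uses it or a close variant; the function name `squash` and the small constant `eps` are illustrative.

```python
import math


def squash(s, eps=1e-9):
    """Standard capsule squash (assumed form): keeps direction, norm in [0, 1)."""
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # eps avoids division by zero for the all-zero vector.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def vector_norm(v):
    return math.sqrt(sum(x * x for x in v))


c = squash([3.0, 4.0])        # input norm 5, output norm close to 25/26
c_norm = vector_norm(c)
zero_norm = vector_norm(squash([0.0, 0.0]))
```

Long vectors are mapped to a norm just below 1 and short vectors to a norm near 0, so the norm can be read as a confidence-like quantity.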
6) The primary capsule layer, the intermediate capsule layer, the high-grade capsule layer and the classification capsule layer are all connected, and the routing mode among the capsules adopts an attention static routing mechanism.
The conventional dynamic routing mechanism assigns weights to the output vectors of the lower-level capsules iteratively, whereas the attention static routing mechanism assigns a weight to each vector by means of learnable parameter matrices and an attention mechanism. As shown in FIG. 3, for lower-level capsules 1, 2 and 3, which output the vectors $v_1$, $v_2$ and $v_3$ respectively, each output vector is scored with the additive attention scoring function $a$, the attention scores are passed through a softmax operation to obtain the corresponding weights, $v_1$, $v_2$ and $v_3$ are summed with these weights to obtain a vector $y$, and $y$ is then squashed to obtain a vector $v_i$ whose norm (modulus) lies between 0 and 1, which is fed into the capsule of the next layer.
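The routing step of FIG. 3 can be sketched as a single forward pass with no routing iterations. The sketch makes several assumptions that the patent does not state explicitly: each higher-level capsule owns a learnable query vector `q`, the score is the additive attention $w_v \tanh(W_q q + W_k v)$, and the squash takes the standard capsule form. All names and toy values are hypothetical.

```python
import math


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def squash(s, eps=1e-9):
    """Standard capsule squash (assumed form): norm mapped into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    scale = sq_norm / (1.0 + sq_norm) / (math.sqrt(sq_norm) + eps)
    return [scale * x for x in s]


def static_route(lower_outputs, q, W_q, W_k, w_v):
    """One higher-level capsule: score, softmax, weighted sum, squash."""
    q_proj = matvec(W_q, q)
    scores = []
    for v in lower_outputs:
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, matvec(W_k, v))]
        scores.append(sum(w * h for w, h in zip(w_v, hidden)))
    weights = softmax(scores)
    dim = len(lower_outputs[0])
    y = [sum(a * v[j] for a, v in zip(weights, lower_outputs)) for j in range(dim)]
    return squash(y), weights


# Toy values (assumptions): three lower-level capsules with 2-d outputs.
identity = [[1.0, 0.0], [0.0, 1.0]]
lower = [[0.6, 0.0], [0.0, 0.8], [0.3, 0.3]]
out, route_weights = static_route(lower, [0.5, 0.5], identity, identity, [1.0, 1.0])
out_norm = math.sqrt(sum(x * x for x in out))
```

Unlike dynamic routing, the weights here come from one pass through learned parameters, which is where the reduction in routing time is expected to come from.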
Step 5, performing supervised training of the classification model of step 4 on the labeled text data of the target language obtained in step 3. For example, the emotion category of the text "this blogger is really young" has been manually labeled as "positive". During supervised training, the loss is computed between the prediction result and the actual category "positive", and the model parameters are updated using the back-propagation algorithm.
Step 6, evaluating the classification model trained in step 5 using accuracy, precision, recall and F1 score. After model training is finished, the model is tested on a held-out test set that was not used for training. The model is evaluated on the test results using accuracy, precision, recall and F1 score to finally obtain a text emotion classification model that meets the requirements, which is then used to classify the emotion of input text.
Accuracy is the proportion of all samples that are predicted correctly, calculated as:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Precision indicates how many of the samples predicted to be positive are truly positive, calculated as:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Recall indicates how many of the positive samples are predicted correctly, calculated as:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
The F1 score is a statistical index used to measure the accuracy of a binary classification model. It takes both the precision and the recall of the classification model into account. The F1 score can be viewed as the harmonic mean of precision and recall, reflecting the robustness of the model, with a maximum of 1 and a minimum of 0. It is calculated as:
$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
To obtain the accuracy, precision, recall and F1 score, a confusion matrix is built to count TP, TN, FP and FN. In a classification task there are four possible combinations of predicted and actual results, which form the confusion matrix shown in the following table:
Actual \ Predicted | Positive | Negative |
Positive | TP | FN |
Negative | FP | TN |
In the confusion matrix, TP (True Positive) is the number of samples predicted as positive that are actually positive; FP (False Positive) is the number of samples predicted as positive that are actually not positive; FN (False Negative) is the number of samples predicted as negative that are actually not negative; TN (True Negative) is the number of samples predicted as negative that are actually negative.
Positive and negative examples are relative. For example, in the emotion classification task the emotion of a sentence can fall into one of three classes: positive, neutral and negative. If positive is chosen as the positive class, then neutral and negative together form the negative class.
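The four metrics can be computed from the confusion-matrix counts as sketched below, treating one emotion class as the positive class and all other classes as negative, as described above. The function names, the convention of returning 0.0 when a denominator is zero, and the toy labels are assumptions made for the example.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN with `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1 for one chosen positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (illustrative only): TP = 2, FP = 1, FN = 1, TN = 2.
y_true = ["positive", "neutral", "negative", "positive", "negative", "positive"]
y_pred = ["positive", "positive", "negative", "negative", "negative", "positive"]
metrics = evaluate(y_true, y_pred, positive="positive")
```

For a three-class report, the same function can be called once per class and the per-class scores averaged, although the patent does not state which averaging, if any, was used.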
The CapsNet-ASR model of the invention was evaluated on the Chinese emotion text datasets ASAP, ChnSentiCorp, NLPCC14-SC and SE-ABSA16 and compared with traditional models using accuracy as the metric. The comparison results are shown in the table below; the method brings a clear improvement in Chinese text sentiment classification.
Model (accuracy, %) | ASAP | ChnSentiCorp | NLPCC14-SC | SE-ABSA16 |
RNN | 75.9 | 84.4 | 83.8 | 83.1 |
LSTM | 80.3 | 85.7 | 84.5 | 89.5 |
Generic capsule network | 81.2 | 88.9 | 87.5 | 90.8 |
CapsNet-ASR | 84.5 | 92.2 | 91.9 | 91.5 |
In addition, because CapsNet-ASR uses a static routing mechanism whereas the generic capsule network uses a dynamic routing mechanism, the training time of the invention should in theory be shorter. To verify this, both models were trained for 60 epochs on each of the datasets ASAP, ChnSentiCorp, NLPCC14-SC and SE-ABSA16. The results show that the training time of the CapsNet-ASR model is clearly shorter than that of the generic capsule network. The results are given in the table below, where each figure is the training time of the model on the dataset, in hours.
Model | ASAP | ChnSentiCorp | NLPCC14-SC | SE-ABSA16 |
---|---|---|---|---|
Generic capsule network | 8 | 14 | 16 | 9 |
CapsNet-ASR | 3 | 8 | 10 | 6 |
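The relative saving in training time can be read off the table with a short calculation. The sketch below only re-uses the hour figures reported above; the helper name `time_reduction` is illustrative.

```python
# Training times in hours for 60 epochs, copied from the table above.
hours = {
    "Generic capsule network": {"ASAP": 8, "ChnSentiCorp": 14,
                                "NLPCC14-SC": 16, "SE-ABSA16": 9},
    "CapsNet-ASR": {"ASAP": 3, "ChnSentiCorp": 8,
                    "NLPCC14-SC": 10, "SE-ABSA16": 6},
}


def time_reduction(baseline, candidate):
    """Fraction of the baseline training time saved on each data set."""
    return {name: (baseline[name] - candidate[name]) / baseline[name]
            for name in baseline}


reduction = time_reduction(hours["Generic capsule network"],
                           hours["CapsNet-ASR"])
for name, frac in reduction.items():
    print(f"{name}: {frac:.1%} less training time")
```

On these figures the saving ranges from about one third (SE-ABSA16) to 62.5% (ASAP).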
Claims (9)
1. A text emotion classification method based on an attention static routing capsule network is characterized by comprising the following steps:
step 1, collecting unlabeled text data in a target language, the target language being the language in which the text emotion classification task is ultimately to be performed;
step 2, training with the word2vec method on the unlabeled text data from step 1 to obtain a word vector representation of the target language;
step 3, collecting labeled text data in the target language;
step 4, constructing a classification model based on the attention static routing capsule network;
step 5, performing supervised training of the classification model from step 4 using the labeled text data of the target language obtained in step 3;
step 6, evaluating the classification model trained in step 5 using accuracy, precision, recall and F1 score to obtain a text emotion classification model that meets the requirements, and classifying input text with that model.
2. The text emotion classification method based on attention static routing capsule network as claimed in claim 1, wherein in step 1, text data is collected and cleaned to remove useless text and non-text content, so as to obtain the unlabeled text data, the total word count of the unlabeled text being not less than 1,000,000 words.
3. The text emotion classification method based on attention static routing capsule network according to claim 1, wherein in step 2, word embedding pre-training is performed using the continuous bag-of-words (CBOW) model of word2vec to obtain real-valued vectors for all words in the target language, namely the word vector representation.
4. The text emotion classification method based on attention static routing capsule network as claimed in claim 1, wherein in step 3, text data is collected and cleaned to remove useless text and non-text content, and the emotional tendency of each text is then labeled manually.
5. The text emotion classification method based on attention static routing capsule network according to claim 1, wherein the components of the classification model comprise: a word2vec word embedding layer, a two-dimensional convolutional layer, a fully connected layer, a squash pooling layer (extrusion pooling layer), a primary capsule layer, an intermediate capsule layer, a high-level capsule layer and a classification capsule layer;
the word2vec word embedding layer is used for mapping text into a sequence of word vectors; the word vector sequence forms a real-valued matrix that is input into the two-dimensional convolutional layer as a single-input-channel image, and the two-dimensional convolutional layer extracts multi-scale features of the text using multi-scale convolution and converts them into vector capsules;
the fully connected layer is used for unifying the dimensions of the multi-scale features extracted by the two-dimensional convolutional layer and then fusing the dimension-unified multi-scale features based on attention weights;
the squash pooling layer is used for compressing the fused features into vectors with a modulus length between 0 and 1, which then serve as the input of the primary capsule layer;
the primary capsule layer, the intermediate capsule layer, the high-level capsule layer and the classification capsule layer are used for passing the low-level semantic information extracted by the convolutional layer, step by step via attention static routing, to the model output, thereby obtaining the text emotion category.
6. The text emotion classification method based on attention static routing capsule network as claimed in claim 4, wherein in the step 5, the training process is as follows:
1) The text data to be classified, $T = \{w_1, w_2, \dots, w_n\}$, is input into the word2vec word embedding layer, and each word $w_i$ is mapped to a real vector $v_i \in \mathbb{R}^d$, so that the entire text becomes a matrix $D = \{v_1, v_2, \dots, v_n\} \in \mathbb{R}^{d \times n}$, where $d$ is the dimension of the word vector and $n$ is the length of the text;
2) The matrix $D$ is input into the two-dimensional convolutional layer as a single-input-channel image, and features are extracted from $D$ using multi-scale convolution kernels to obtain multi-scale features; the formula of the output shape is as follows:
where $\lfloor \cdot \rfloor$ denotes rounding down to an integer, $n_h$ is the longitudinal length of the matrix $D$, $k_h$ is the longitudinal length of the convolution kernel, $p_h$ is the longitudinal padding, and $s_h$ is the longitudinal stride;
3) The dimensions of the multi-scale features are made the same by the fully connected layer, giving the multi-scale output features $g_i$;
4) The multi-scale output features $g_i$ on the same output channel are fused by a weighted sum based on attention weights to obtain the fused feature $s_i$;
5) In squash pooling (extrusion pooling), the fused feature $s_i$ is compressed by a squash operation into a vector $c$ with a modulus length between 0 and 1, which is input into the subsequent capsule layer; the formula of the squash operation is as follows:
6) The primary capsule layer, the intermediate capsule layer, the high-level capsule layer and the classification capsule layer are fully connected, and routing between the capsules uses the attention static routing mechanism.
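The output-shape formula of step 2) and the squash operation of step 5) can be sketched as follows. This is only an illustration under stated assumptions: the output length uses the standard convolution arithmetic with $p_h$ taken as padding applied to each side, and the squash uses the form common in capsule networks, $c = \frac{\lVert s \rVert^2}{1 + \lVert s \rVert^2} \cdot \frac{s}{\lVert s \rVert}$. Neither is confirmed as the exact formula of the patent, and the function names are hypothetical.

```python
import math


def conv_output_length(n_h, k_h, p_h=0, s_h=1):
    """Output length along the text axis for a kernel of height k_h.
    Assumption: p_h is the padding added on each side (standard formula)."""
    return (n_h + 2 * p_h - k_h) // s_h + 1


def squash(s, eps=1e-9):
    """Assumed capsule-network squash: keeps the direction of s and maps
    its modulus length into the interval [0, 1)."""
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = (sq_norm / (1.0 + sq_norm)) / (norm + eps)
    return [scale * x for x in s]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


# With p_h = 0 and s_h = 1 the length is n_h - k_h + 1.
length = conv_output_length(n_h=10, k_h=3)
c = squash([3.0, 4.0])  # input modulus 5, output modulus 25/26
```

With $p_h = 0$ and $s_h = 1$ the output length reduces to $n_h - k_h + 1$, which agrees with the shapes $n$, $n-2$, $n-4$, $n-6$, $n-8$ stated in claim 7 for kernel heights 1, 3, 5, 7 and 9.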
7. The text emotion classification method based on attention static routing capsule network of claim 6, wherein there are 5 two-dimensional convolutional layers and 5 multi-scale convolution kernels, whose sizes are respectively $1 \times d$, $3 \times d$, $5 \times d$, $7 \times d$ and $9 \times d$, with longitudinal stride $s_h = 1$, longitudinal padding $p_h = 0$ and 256 output channels each; the computed output shapes of the multi-scale convolutions are $o_1 \in \mathbb{R}^{n \times 1}$, $o_2 \in \mathbb{R}^{(n-2) \times 1}$, $o_3 \in \mathbb{R}^{(n-4) \times 1}$, $o_4 \in \mathbb{R}^{(n-6) \times 1}$, $o_5 \in \mathbb{R}^{(n-8) \times 1}$; the fully connected layers are respectively $W_1 \in \mathbb{R}^{e \times n}$, $W_2 \in \mathbb{R}^{e \times (n-2)}$, $W_3 \in \mathbb{R}^{e \times (n-4)}$, $W_4 \in \mathbb{R}^{e \times (n-6)}$, $W_5 \in \mathbb{R}^{e \times (n-8)}$; the dimensions are unified as $W_i o_i = g_i \in \mathbb{R}^{e \times 1}$, where $g_i$ are the multi-scale output features of the same dimension and $e$ is the dimension of the unified multi-scale features.
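The dimension bookkeeping of this claim can be checked with a small sketch for a single output channel. The values of `n` and `e`, the random stand-ins for the learned matrices $W_i$ and the convolution outputs $o_i$, and the helper names are all assumptions chosen for illustration.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def unify(outputs, e, rng):
    """Map each o_i (of length n, n-2, ...) to g_i = W_i o_i of length e.
    The W_i are random stand-ins for the learned fully connected weights."""
    unified = []
    for o in outputs:
        W = [[rng.uniform(-0.1, 0.1) for _ in o] for _ in range(e)]
        unified.append(matvec(W, o))
    return unified


n, e = 12, 4                      # hypothetical text length and unified dimension
kernel_heights = [1, 3, 5, 7, 9]  # kernel sizes 1xd ... 9xd from the claim
# Stride 1 and padding 0 give output length n - k + 1 for kernel height k.
lengths = [n - k + 1 for k in kernel_heights]

rng = random.Random(0)
outputs = [[rng.random() for _ in range(length)] for length in lengths]
g = unify(outputs, e, rng)
print(lengths, [len(g_i) for g_i in g])
```

Each $W_i$ has as many columns as its $o_i$ has entries, so the five features of different lengths all end up with dimension $e$ and can be fused on the same channel.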
8. The text emotion classification method based on attention static routing capsule network according to claim 6, wherein the weighted fusion method is as follows:
the multi-scale output features on each channel are $m$ vectors $g_i \in \mathbb{R}^e$, and $g_i = k_i = v_i \in \mathbb{R}^e$ is set, where $q$ denotes the query, $k$ the key and $v$ the value; given a query vector $q \in \mathbb{R}^e$ for querying the importance of semantic features and $m$ key-value pairs $(k_1, v_1), \dots, (k_m, v_m)$, the fusion of the multi-scale features based on attention weights is a function of the inputs $q, k_1, \dots, k_m, v_1, \dots, v_m$ and is expressed as the following formula:
$$s = \sum_{i=1}^{m} \alpha(q, k_i)\, v_i, \quad s \in \mathbb{R}^e$$
where the attention weight $\alpha(q, k_i)$ of $q$ and $k_i$ is obtained by mapping the vectors $q$ and $k_i$ to a scalar with the attention scoring function $a$ and then applying softmax to obtain a real-valued weight between 0 and 1; $\alpha(q, k_i)$ is calculated as follows:
$$\alpha(q, k_i) = \mathrm{softmax}\big(a(q, k_i)\big) = \frac{\exp\big(a(q, k_i)\big)}{\sum_{j=1}^{m} \exp\big(a(q, k_j)\big)}, \quad \alpha(q, k_i) \in \mathbb{R}$$
the attention scoring function $a$ is computed with additive attention: given a vector $q \in \mathbb{R}^e$, a vector $k_i \in \mathbb{R}^e$, a learnable parameter matrix $W_q \in \mathbb{R}^{e \times e}$, a learnable parameter matrix $W_k \in \mathbb{R}^{e \times e}$ and a learnable parameter vector $w_v \in \mathbb{R}^{1 \times e}$, the product of the matrix $W_q$ and the vector $q$ is added to the product of the matrix $W_k$ and the vector $k_i$, the sum is input into the tanh function for a nonlinear transformation, and the transpose of the vector $w_v$ is multiplied by the result of the nonlinear transformation to obtain the attention score, which is a real number; the calculation formula is as follows:
$$a(q, k_i) = w_v^{\top} \tanh\big(W_q q + W_k k_i\big) \in \mathbb{R}$$
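The weighted fusion of this claim can be sketched in plain Python as follows. This is a minimal illustration of additive attention, not the patent's implementation: the parameters `W_q`, `W_k`, `w_v` and the query `q` would be learned in practice and are set to fixed illustrative values here, and `w_v` is treated as a plain length-e vector.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def additive_score(q, k, W_q, W_k, w_v):
    """a(q, k) = w_v . tanh(W_q q + W_k k), a single real number."""
    hidden = [math.tanh(a + b)
              for a, b in zip(matvec(W_q, q), matvec(W_k, k))]
    return sum(w * h for w, h in zip(w_v, hidden))


def softmax(scores):
    """Numerically stable softmax over a list of real scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def attention_fuse(q, features, W_q, W_k, w_v):
    """Fuse m features g_i (used as both keys and values) into one vector s."""
    weights = softmax([additive_score(q, g, W_q, W_k, w_v) for g in features])
    dim = len(features[0])
    fused = [sum(w * g[j] for w, g in zip(weights, features))
             for j in range(dim)]
    return fused, weights


# Illustrative values only: e = 2, identity matrices as stand-in parameters.
identity = [[1.0, 0.0], [0.0, 1.0]]
q = [1.0, 0.0]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
s, alpha = attention_fuse(q, features, identity, identity, [1.0, 1.0])
```

The weights `alpha` are positive and sum to 1, so `s` is a convex combination of the $g_i$ and stays in $\mathbb{R}^e$.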
9. The method according to claim 6, wherein the attention static routing capsule network assigns a weight to each vector by means of a learnable parameter matrix and an attention mechanism: the low-level capsules 1, 2 and 3 output the vectors $v_1$, $v_2$ and $v_3$ respectively; each output vector is scored using the additive attention scoring function; the attention scores are input into a softmax operation to obtain the corresponding weights; the weights are used to compute a weighted sum of $v_1$, $v_2$ and $v_3$, giving a vector $y$; and the squash (extrusion) operation is then applied to $y$ to obtain a vector $v_i$ with a modulus length between 0 and 1, which is fed into the next capsule layer.
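One routing step of this claim can be sketched as a single forward pass. The sketch assumes the attention scores have already been produced by the additive scoring function and takes them as plain numbers; the squash form is the one common in capsule networks and is an assumption, as are the function names and sample values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of real scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def squash(y, eps=1e-9):
    """Assumed squash: scale y so its modulus length lies in [0, 1)."""
    sq_norm = sum(x * x for x in y)
    norm = math.sqrt(sq_norm)
    scale = (sq_norm / (1.0 + sq_norm)) / (norm + eps)
    return [scale * x for x in y]


def static_route(capsule_outputs, attention_scores):
    """Scores -> softmax weights -> weighted sum y -> squash.
    No iterative refinement of the weights takes place."""
    weights = softmax(attention_scores)
    dim = len(capsule_outputs[0])
    y = [sum(w * v[j] for w, v in zip(weights, capsule_outputs))
         for j in range(dim)]
    return squash(y), weights


# Hypothetical outputs v_1, v_2, v_3 of three low-level capsules, equal scores.
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
routed, weights = static_route(v, [0.0, 0.0, 0.0])
routed_norm = math.sqrt(sum(x * x for x in routed))
```

Because the weights come from one softmax over attention scores, the step needs a single pass per layer, whereas dynamic routing updates its coupling coefficients over several iterations.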
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211152911.7A CN115544252A (en) | 2022-09-21 | 2022-09-21 | Text emotion classification method based on attention static routing capsule network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115544252A (en) | 2022-12-30 |
Family
ID=84726699
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115544252A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116304842A (en) * | 2023-05-18 | 2023-06-23 | 南京信息工程大学 | Capsule network text classification method based on CFC structure improvement |
CN116304585A (en) * | 2023-05-18 | 2023-06-23 | 中国第一汽车股份有限公司 | Emotion recognition and model training method and device, electronic equipment and storage medium |
CN116304585B (en) * | 2023-05-18 | 2023-08-15 | 中国第一汽车股份有限公司 | Emotion recognition and model training method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||