CN110472245B - Multi-label emotion intensity prediction method based on hierarchical convolutional neural network - Google Patents

Multi-label emotion intensity prediction method based on hierarchical convolutional neural network

Info

Publication number
CN110472245B
CN110472245B (application CN201910751989.2A)
Authority
CN
China
Prior art keywords
emotion
label
data
social media
short text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910751989.2A
Other languages
Chinese (zh)
Other versions
CN110472245A (en)
Inventor
冯时
谢宏亮
王大玲
张一飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN201910751989.2A priority Critical patent/CN110472245B/en
Publication of CN110472245A publication Critical patent/CN110472245A/en
Application granted granted Critical
Publication of CN110472245B publication Critical patent/CN110472245B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a multi-label emotion intensity prediction method based on a hierarchical convolutional neural network, comprising the following steps: dividing original multi-label social media short texts into a training set and a test set; preprocessing each piece of original multi-label short text data in the training set to obtain basic-emotion single-label data; constructing a single-label emotion classification model based on a hierarchical convolutional neural network; constructing an emotion intensity value model based on an attention convolutional neural network; and, for the multi-label short text test data, predicting with the single-label emotion classification model of the hierarchical convolutional neural network to obtain an optimized multi-label emotion intensity vector. The method further improves the accuracy of emotion intensity prediction for social media text and is particularly suited to scenarios in which several basic emotions coexist in a text.

Description

Multi-label emotion intensity prediction method based on hierarchical convolutional neural network
Technical Field
The invention belongs to the field of text mining and public opinion analysis, and in particular relates to a multi-label emotion intensity prediction method based on a hierarchical convolutional neural network.
Background
With the development of mobile internet technology in recent years, social media has become an important channel through which people conveniently share their views and opinions. At the same time, the huge user base generates a large volume of short text data on social media every day, and this data has become an important source for online public opinion analysis systems. Emotion analysis is a core component of such systems, so its study is of practical significance. Since short texts account for a high proportion of social media data, emotion analysis for short texts is a research direction with real application value.
Current research on textual emotion analysis focuses mainly on the classification problem, i.e. assigning a text to an appropriate category such as "happy", "angry" or "disgusted". However, a text conveys more than emotion categories: the intensity with which the same emotion is expressed can differ greatly between texts. Moreover, human emotional expression is complex; a single social media short text can express several emotions with different intensities. A single-label emotion analysis method can only capture the strongest emotion in the text. Existing multi-label emotion classification algorithms can correctly identify the other emotions a text contains, but they cannot predict the intensity value of each emotion, so they cannot tell which emotion dominates a sentence. A practical multi-label emotion intensity prediction algorithm matches the complexity of human emotional expression and has great application value in fields such as social media public opinion early warning and emergency public opinion tracking.
Disclosure of Invention
Aiming at this problem, the invention provides a multi-label emotion intensity prediction method based on a hierarchical convolutional neural network (HCNN). The goal is to learn, through deep learning, a mapping function from a training data set so that, given a social media short text, the method predicts its emotion label: a vector of n real numbers, each value in [0,1] representing the intensity of the corresponding basic emotion.
A multi-label emotion intensity prediction method based on a hierarchical convolutional neural network comprises the following specific processes:
step 1: dividing an original multi-label social media short text into a training set and a testing set;
Step 2: preprocess a piece of original multi-label social media short text data from the training set to obtain preprocessed single-label data; each piece of data in the original multi-label training set is labeled with an n-dimensional real vector [e_1, e_2, …, e_n] of basic emotion intensity values;
Step 2.1: remove punctuation irrelevant to emotion analysis from the short text data, retaining question marks and exclamation marks and removing all other punctuation, to obtain the punctuation-cleaned short text data;
Step 2.2: if the punctuation-cleaned short text data contains numerals, replace them with a specified placeholder value to obtain the numeral-replaced short text data;
Step 2.3: let the largest emotion category contain num texts. If the emotion category distribution of the numeral-replaced short text data is unbalanced, i.e. some other emotion category has fewer than (3·num)/4 texts, resample the short text data of that category so that all emotion categories end up at a similar scale, defined as: the data volume of the smallest category is no less than 0.75 times that of the largest category. If every other emotion category already has at least (3·num)/4 texts, skip resampling and go to step 2.4;
Step 2.4: for the resampled short text data, put each text containing basic emotion e_i into the corresponding basic-emotion single-label data set D_i; a resampled short text carrying n basic emotion intensity values thus generates n pieces of basic-emotion single-label data, where the emotion is considered present if e_i > 0 and absent if e_i = 0;
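The preprocessing of steps 2.1 to 2.4 above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the sample texts, the two-emotion setup, and the "<num>" placeholder token are assumptions (the patent only says numerals are replaced with "a specified value"), and the resampling of step 2.3 is omitted for brevity.

```python
import re

def clean_text(text):
    # Step 2.1: keep question/exclamation marks, drop other punctuation.
    text = re.sub(r"[^\w\s?!？！]", "", text)
    # Step 2.2: replace every numeral run with one placeholder token
    # ("<num>" is an assumed placeholder, not specified by the patent).
    return re.sub(r"\d+", "<num>", text)

def split_single_label(samples, n_emotions):
    # Step 2.4: a text labeled [e_1..e_n] spawns one single-label example
    # per basic emotion; e_i > 0 marks the emotion as present.
    datasets = [[] for _ in range(n_emotions)]
    for text, intensities in samples:
        cleaned = clean_text(text)
        for i, e in enumerate(intensities):
            datasets[i].append((cleaned, int(e > 0), e))
    return datasets

samples = [("So happy!!! 100 likes.", [0.8, 0.0])]
d = split_single_label(samples, 2)
# d[0] holds the single-label data D_1 (emotion present, intensity 0.8),
# d[1] holds D_2 (emotion absent, intensity 0.0).
```

Each D_i produced this way feeds both the classifier of step 4 and, after filtering the e_i = 0 texts, the intensity model of step 5.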
and step 3: obtaining a plurality of sections of original multi-label social media short text data to form a training set, and processing each section of original multi-label social media short text data by adopting the method in the step 2 to obtain basic emotion single label data D of the training set i
Step 4: construct a single-label emotion classification model based on a hierarchical convolutional neural network (HCNN);
Step 4.1: convert the basic-emotion single-label data D_i of the training set into a word vector matrix X and use it to initialize the embedding layer of the neural network model;
Step 4.2: apply the convolution windows and max-pooling operation of a convolutional neural network (CNN) to the word vector matrix X to extract local features v_w:
v_w = CNN(X)
Step 4.3: encode the word vector matrix X with a bidirectional long short-term memory network (BiLSTM) to obtain, for each word, an enhanced vector representation that takes context into account; representing the sentence S with these enhanced vectors yields the matrix X_c, from which a convolutional neural network extracts logic-level features to obtain the vector v_c:
X_c = BiLSTM(X)
v_c = CNN(X_c)
Step 4.4: fuse the local features and the logic-level features into a new text vector v_f:
v_f = v_w ⊕ v_c
where the symbol ⊕ denotes either vector concatenation or element-wise vector addition.
Step 4.5: input the new text vector v_f into a fully connected layer to obtain the single-label emotion classification model of the hierarchical convolutional neural network;
Step 4.6: input the new text vector v_f into the fully connected layer and obtain the output of the single-label emotion classification model with a softmax function:
ŷ = softmax(W·v_f + b)
The single-label emotion classification model of the hierarchical convolutional neural network uses the cross-entropy loss function:
L = -(1/N) Σ_{i=1}^{N} [ y_i·log ŷ_i + (1-y_i)·log(1-ŷ_i) ]
where N is the number of training examples, y_i is a binary variable indicating whether the i-th sample belongs to the given class, and ŷ_i is the model's predicted probability that the i-th sample belongs to that class.
The single-label emotion classification model is optimized under the cross-entropy loss by iterative gradient descent; the optimization ends after L full passes over the training data set, yielding the final single-label emotion classification model of the hierarchical convolutional neural network;
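The fusion, softmax output, and cross-entropy loss of steps 4.4 to 4.6 can be illustrated with a small NumPy sketch. The feature vectors, weight matrix, and dimensions below are toy assumptions; in the actual model v_w and v_c come from the CNN and BiLSTM+CNN branches and W, b are learned.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())       # subtract max for numerical stability
    return e / e.sum()

def cross_entropy(y, y_hat, eps=1e-12):
    # L = -(1/N) * sum_i [ y_i*log(y_hat_i) + (1-y_i)*log(1-y_hat_i) ]
    y = np.asarray(y, dtype=float)
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

rng = np.random.default_rng(0)
v_w = rng.normal(size=3)                  # toy local CNN features
v_c = rng.normal(size=3)                  # toy BiLSTM+CNN logic features

v_f = np.concatenate([v_w, v_c])          # splicing variant of the fusion
v_f_add = v_w + v_c                       # addition variant (dim preserved)

W, b = rng.normal(size=(2, 6)), np.zeros(2)
y_hat = softmax(W @ v_f + b)              # [P(absent), P(present)]

loss = cross_entropy([1.0], [y_hat[1]])   # sample labeled as "present"
```

Note the design choice the ⊕ symbol leaves open: concatenation doubles the fused dimension (and the fully connected layer's input size), while addition preserves it.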
and 5: constructing an emotion intensity value model based on an Attention Convolution Neural Network (ACNN);
step 5.1: for single marker data set D i Filtering out e i Text of =0, next for e i >0, training a mood intensity value prediction model by utilizing the following steps;
and step 5.2: single marker data set D i Converting each text S into a word vector matrix X' for initializing an embedded layer of the neural network model;
step 5.3: word vector matrix X using long and short time memory model P Coding is carried out to obtain a task related expression vector v of the text S s Wherein:
v s =LSTM(X P )
step 5.4: representing a vector v by a sentence s And the original word vector matrix X P And calculating the related weight of the word vector through an attention mechanism, wherein the attention vector calculation method comprises the following steps:
v a =X P Wv s
wherein v is a Is the attention vector, W is the weight;
weighting the word vector by the attention vector, namely scaling the word vector in the subsequent window, wherein the formula is as follows:
α i =l*softmax(v a [i:i+l]),i∈{0,1,…,n-l}
Figure GDA0003842507290000041
where l is the window size and x_i, …, x_{i+l-1} are the word vectors of the words inside the current window. In other words, the similarity scores within the current window are converted into a probability distribution by the softmax function and multiplied by l to obtain the scaling weights, which are then multiplied with the original word vectors X_P to scale them. For each window, a new weighted representation Z of the text is generated.
Step 5.5: extract features from the weighted representation Z with a convolutional neural network. For the weighted representation Z_l generated with window size l, the most significant features are extracted with a CNN and max pooling:
v_l = CNN(Z_l)
The feature vectors v_l obtained for the different window sizes l are concatenated to form the final representation vector v_g of the input text;
Step 5.6: input v_g into the fully connected layer and obtain the final model output, the emotion intensity value of the text, with a softmax function:
p̂ = softmax(W·v_g + b)
Step 5.7: optimize the model with the training data and the loss function to obtain the optimal parameters and the optimized emotion intensity model. The mean squared error between the actual and the model-predicted emotion intensity values serves as the loss function:
L′ = (1/N′) Σ_{i=1}^{N′} (p_i - p̂_i)²
where N′ is the number of instances in the single-label emotion training data set of the given emotion, p_i is the labeled emotion intensity value of the i-th sample, and p̂_i is the intensity value predicted by the model h′ for the i-th sample.
Iterative optimization uses a stochastic gradient descent algorithm; the optimization ends after L′ full passes over the training data set, yielding the single-label emotion intensity prediction model.
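The loss and optimization of step 5.7 can be sketched with a toy substitute for the ACNN: a linear intensity model p̂ = x·w trained by gradient steps on the mean squared error. The data, learning rate, and pass count standing in for L′ are all illustrative assumptions; only the MSE formula and the descent update mirror the text.

```python
import numpy as np

def mse(p, p_hat):
    # L' = (1/N') * sum_i (p_i - p_hat_i)^2
    return float(np.mean((np.asarray(p) - np.asarray(p_hat)) ** 2))

rng = np.random.default_rng(2)
x = rng.normal(size=(8, 4))        # toy text feature vectors
p = rng.uniform(0.0, 1.0, size=8)  # labeled intensity values in [0, 1]
w = np.zeros(4)                    # parameters of the stand-in model

lr, passes = 0.05, 300             # "passes" plays the role of L'
initial = mse(p, x @ w)
for _ in range(passes):            # full-batch steps, for brevity
    grad = -2.0 * x.T @ (p - x @ w) / len(p)
    w -= lr * grad
final = mse(p, x @ w)              # loss after optimization
```

In the patent the descent is stochastic (per-example updates) over the ACNN's parameters; the stopping rule is the same, a fixed number L′ of passes over the training set.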
Step 6: for the multi-label social media short text test data, predict with the single-label emotion classification model of the hierarchical convolutional neural network to obtain the optimized multi-label emotion intensity vector;
Step 6.1: preprocess a piece of original multi-label short text data from the test set to obtain preprocessed single-label data; each piece of data in the multi-label test set is labeled with an n′-dimensional real vector [e′_1, e′_2, …, e′_{n′}] of basic emotion intensity values;
Step 6.1.1: remove punctuation irrelevant to emotion analysis from the short text data, retaining question marks and exclamation marks and removing all other punctuation, to obtain the punctuation-cleaned short text data;
Step 6.1.2: if the punctuation-cleaned short text data contains numerals, replace them with a specified placeholder value to obtain the numeral-replaced short text data;
Step 6.1.3: for the numeral-replaced short text data, put each text containing basic emotion e′_i into the corresponding basic-emotion single-label data set D′_i; a numeral-replaced short text carrying n′ basic emotion intensity values thus generates n′ pieces of basic-emotion single-label data, where the emotion is considered present if e′_i > 0 and absent if e′_i = 0;
Step 6.2: convert the preprocessed single-label test data D′_i into a word vector matrix X′ and use it to initialize the embedding layer of the neural network model;
Step 6.3: apply the convolution windows of a convolutional neural network (CNN) to the word vector matrix X′ to extract local features v′_w;
Step 6.4: encode the word vector matrix X′ with a bidirectional long short-term memory network to obtain an enhanced, context-aware vector representation for each word; representing the sentence S with these enhanced vectors yields the matrix X′_c, on which convolution windows and max pooling extract logic-level features to obtain the vector v′_c;
Step 6.5: fuse the local features and the logic-level features into the new text vector v′_f, the output vector of the network's convolution and pooling layers;
Step 6.6: input the new text vector v′_f into the fully connected layer and obtain the output of the single-label emotion classification model with a softmax function:
ŷ′ = softmax(W·v′_f + b)
Step 6.7: for each emotion output by the single-label emotion classification model, compute its emotion intensity value with the ACNN emotion intensity model;
Step 6.8: combine the outputs p̂′_i of the per-emotion ACNN intensity models into the optimized multi-label emotion intensity value vector [p̂′_1, p̂′_2, …, p̂′_{n′}].
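The combination performed in steps 6.7 and 6.8 reduces to: ask each binary classifier C_i whether emotion i is present and, only if so, ask the intensity model h_i for its value. A plain-Python sketch, with stub classifiers and intensity models that are purely illustrative (the real ones are the trained HCNN and ACNN):

```python
def predict_multilabel(text, classifiers, intensity_models):
    # Emotion i gets intensity h_i(text) if C_i says "present", else 0.
    return [h(text) if c(text) else 0.0
            for c, h in zip(classifiers, intensity_models)]

# Toy stand-ins for two basic emotions (say "joy" and "anger"):
classifiers = [lambda t: "happy" in t, lambda t: "hate" in t]
intensity_models = [lambda t: 0.9, lambda t: 0.7]

vec = predict_multilabel("so happy today", classifiers, intensity_models)
# vec == [0.9, 0.0]: "joy" detected with intensity 0.9, "anger" absent
```

The resulting vector is exactly the multi-label emotion intensity vector of step 6.8, with zeros for the emotions the classifiers judge absent.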
The beneficial technical effects are as follows:
Adopting the multi-label emotion intensity prediction method based on the hierarchical convolutional neural network further improves the accuracy of emotion intensity prediction for social media text; the method is particularly suitable for scenarios in which several basic emotions coexist in a text.
Drawings
FIG. 1 is the overall framework of the multi-label emotion intensity prediction method based on a hierarchical convolutional neural network according to an embodiment of the invention;
FIG. 2 is the HCNN model architecture according to an embodiment of the invention;
FIG. 3 is experimental comparison result 1 against a conventional CNN model;
FIG. 4 is experimental comparison result 2 against a conventional CNN model.
Detailed Description
The invention is further described below with reference to the accompanying drawings and a specific embodiment. The multi-label emotion intensity prediction method based on a hierarchical convolutional neural network proceeds as follows:
step 1: dividing an original multi-label social media short text into a training set and a testing set;
Step 2: preprocess a piece of original multi-label social media short text data from the training set to obtain preprocessed single-label data; each piece of data in the original multi-label training set is labeled with an n-dimensional real vector [e_1, e_2, …, e_n] of basic emotion intensity values;
Step 2.1: remove punctuation irrelevant to emotion analysis from the short text data, retaining question marks and exclamation marks and removing all other punctuation, to obtain the punctuation-cleaned short text data;
Step 2.2: if the punctuation-cleaned short text data contains numerals, replace them with a specified placeholder value to obtain the numeral-replaced short text data;
Step 2.3: let the largest emotion category contain num texts. If the emotion category distribution of the numeral-replaced short text data is unbalanced, i.e. some other emotion category has fewer than (3·num)/4 texts, resample the short text data of that category so that all emotion categories end up at a similar scale, defined as: the data volume of the smallest category is no less than 0.75 times that of the largest category. If every other emotion category already has at least (3·num)/4 texts, skip resampling and go to step 2.4;
Step 2.4: for the resampled short text data, put each text containing basic emotion e_i into the corresponding basic-emotion single-label data set D_i; a resampled short text carrying n basic emotion intensity values thus generates n pieces of basic-emotion single-label data, where the emotion is considered present if e_i > 0 and absent if e_i = 0;
Step 3: collect several pieces of original multi-label social media short text data to form the training set, and process each piece with the method of step 2 to obtain the basic-emotion single-label data sets D_i of the training set;
The overall framework of the algorithm is shown in FIG. 1 and mainly comprises two parts, model training and prediction; the main procedures are given as Algorithm 1 and Algorithm 2.
[Algorithm 1 and Algorithm 2: pseudocode figures, not reproduced]
Step 4: construct a single-label emotion classification model based on a hierarchical convolutional neural network (HCNN), as shown in FIG. 2;
Step 4.1: convert the basic-emotion single-label data D_i of the training set into a word vector matrix X and use it to initialize the embedding layer of the neural network model. In this embodiment, Chinese word vectors trained on Chinese Wikipedia initialize the embedding layer; training uses the Skip-gram model of the word2vec tool with a context window of 5 and the negative-sampling optimization method.
Step 4.2: apply the convolution windows and max-pooling operation of a convolutional neural network (CNN) to the word vector matrix X to extract local features v_w:
v_w = CNN(X)
Step 4.3: encode the word vector matrix X with a bidirectional long short-term memory network (BiLSTM) to obtain, for each word, an enhanced vector representation that takes context into account; representing the sentence S with these enhanced vectors yields the matrix X_c, from which a convolutional neural network extracts logic-level features to obtain the vector v_c:
X_c = BiLSTM(X)
v_c = CNN(X_c)
Step 4.4: fuse the local features and the logic-level features into a new text vector v_f:
v_f = v_w ⊕ v_c
where the symbol ⊕ denotes either vector concatenation or element-wise vector addition.
Step 4.5: input the new text vector v_f into a fully connected layer to obtain the single-label emotion classification model of the hierarchical convolutional neural network;
Step 4.6: input the new text vector v_f into the fully connected layer and obtain the output of the single-label emotion classification model with a softmax function:
ŷ = softmax(W·v_f + b)
The single-label emotion classification model of the hierarchical convolutional neural network uses the cross-entropy loss function:
L = -(1/N) Σ_{i=1}^{N} [ y_i·log ŷ_i + (1-y_i)·log(1-ŷ_i) ]
where N is the number of training examples, y_i is a binary variable indicating whether the i-th sample belongs to the given class, and ŷ_i is the model's predicted probability that the i-th sample belongs to that class.
The single-label emotion classification model is optimized under the cross-entropy loss by iterative gradient descent; the optimization ends after L full passes over the training data set, yielding the final single-label emotion classification model of the hierarchical convolutional neural network;
Algorithm 1 trains, for each basic emotion, a binary classifier model {C_i} and an emotion intensity prediction model {h_i}. On this basis, the method can predict the intensity value of each basic emotion in a given short text; when several basic emotions are expressed in the text, multi-label emotion intensity prediction is completed as detailed in Algorithm 2.
[Algorithm 2: pseudocode figure, not reproduced]
Using the trained classification and prediction models, Algorithm 2 effectively predicts the intensity values of multiple emotions present in a text at the same time; experimental results show that the proposed method further improves the text emotion intensity prediction effect, see FIG. 3 and FIG. 4.
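The training loop that Algorithm 1 describes can be sketched in plain Python: per basic emotion, build the single-label data set D_i, train a binary classifier C_i on it, and train an intensity model h_i on the e_i > 0 subset only (step 5.1). The trainer closures below are illustrative stand-ins for the HCNN classifier and the ACNN regressor, not the patent's models.

```python
def train_all(samples, n_emotions, train_clf, train_reg):
    classifiers, intensity_models = [], []
    for i in range(n_emotions):
        # D_i: (text, present?, intensity) triples for basic emotion i
        d_i = [(text, label[i] > 0, label[i]) for text, label in samples]
        classifiers.append(train_clf(d_i))
        # Step 5.1: the intensity model sees only texts with e_i > 0
        present = [(text, e) for text, p, e in d_i if p]
        intensity_models.append(train_reg(present))
    return classifiers, intensity_models

# Stand-in trainers: the "classifier" memorises positive texts, and the
# "regressor" always predicts the mean intensity of its training subset.
def train_clf(d):
    positives = {text for text, present, _ in d if present}
    return lambda text: text in positives

def train_reg(d):
    mean = sum(e for _, e in d) / max(len(d), 1)
    return lambda text: mean

samples = [("great day", [0.8, 0.0]), ("awful day", [0.0, 0.6])]
C, H = train_all(samples, 2, train_clf, train_reg)
```

Swapping the stand-in trainers for the HCNN of step 4 and the ACNN of step 5 yields the model sets {C_i}, {h_i} that Algorithm 2 consumes at prediction time.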
Step 5: construct an emotion intensity value model based on an attention convolutional neural network (ACNN);
Step 5.1: for each single-label data set D_i, filter out the texts with e_i = 0; the texts with e_i > 0 are then used to train an emotion intensity value prediction model through the following steps;
Step 5.2: convert each text S in the single-label data set D_i into a word vector matrix X_P, used to initialize the embedding layer of the neural network model;
Step 5.3: encode the word vector matrix X_P with a long short-term memory (LSTM) model to obtain a task-related representation vector v_s of the text S:
v_s = LSTM(X_P)
Step 5.4: from the sentence representation vector v_s and the original word vector matrix X_P, compute attention weights for the word vectors; the attention vector is computed as:
v_a = X_P·W·v_s
The attention vector is then used to weight, i.e. scale, the word vectors within each sliding window:
α_i = l·softmax(v_a[i:i+l]),  i ∈ {0, 1, …, n-l}
z_i = α_i ⊙ X_P[i:i+l]
where l is the window size and x_i, …, x_{i+l-1} are the word vectors of the words inside the current window. The similarity scores within the current window are converted into a probability distribution by the softmax function and multiplied by l to obtain the scaling weights, which are then multiplied with the original word vectors X_P to scale them. For each window, a new weighted representation Z of the text is generated.
Step 5.5: extract features from the weighted representation Z with a convolutional neural network. For the weighted representation Z_l generated with window size l, the most significant features are extracted with a CNN and max pooling:
v_l = CNN(Z_l)
The feature vectors v_l obtained for the different window sizes l are concatenated to form the final representation vector v_g of the input text;
Step 5.6: input v_g into the fully connected layer and obtain the final model output, the emotion intensity value of the text, with a softmax function:
p̂ = softmax(W·v_g + b)
Step 5.7: optimize the model with the training data and the loss function to obtain the optimal parameters and the optimized emotion intensity model. The mean squared error between the actual and the model-predicted emotion intensity values serves as the loss function:
L′ = (1/N′) Σ_{i=1}^{N′} (p_i - p̂_i)²
where N′ is the number of instances in the single-label emotion training data set of the given emotion, p_i is the labeled emotion intensity value of the i-th sample, and p̂_i is the intensity value predicted by the model h′ for the i-th sample.
Iterative optimization uses a stochastic gradient descent algorithm; the optimization ends after L′ full passes over the training data set, yielding the single-label emotion intensity prediction model.
Step 6: aiming at multi-label social media short text test data, predicting by using a single label emotion classification model of a hierarchical convolution neural network to obtain an optimized multi-label emotion intensity vector;
step 6.1: preprocessing a section of original multi-label social media short text test centralized data in the test set to obtain preprocessed single label data; wherein the data in the multi-label social media short text test set are n ' real number vectors [ e ' representing basic emotion intensity values ' 1 ,e’ 2 …e’ i ,e’ n’ ];
Step 6.1.1: remove the punctuation irrelevant to emotion analysis from the social media short text data, retaining question marks and exclamation marks and deleting all other punctuation, to obtain the punctuation-stripped social media short text data;
Step 6.1.2: if the punctuation-stripped social media short text data contains numerals, replace them with a specified token to obtain the numeral-replaced social media short text data;
Step 6.1.3: for the numeral-replaced social media short text data, place each text into the single-label dataset D′_i of every basic emotion e′_i it contains; a text carrying n′ basic emotion intensity values thus generates n′ pieces of basic-emotion single-label data, where e′_i > 0 means the emotion is present and e′_i = 0 means it is absent;
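Steps 6.1.1 to 6.1.3 reduce, after punctuation and numeral normalization, to routing each multi-label text into per-emotion single-label sets. A minimal sketch, assuming the texts and their intensity vectors arrive as parallel lists (the data layout is an assumption, not part of the patent):

```python
def split_multilabel(texts, intensities):
    """Step 6.1.3: route each text into the single-label set D'_i of every
    basic emotion it expresses (e'_i > 0 means the emotion is present).

    texts       : list of preprocessed short texts
    intensities : list of length-n' vectors of basic emotion intensities
    returns     : dict mapping emotion index -> list of (text, intensity)
    """
    single = {}
    for text, vec in zip(texts, intensities):
        for i, e in enumerate(vec):
            if e > 0:                      # emotion i is present in this text
                single.setdefault(i, []).append((text, e))
    return single
```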
Step 6.2: convert the preprocessed single-label test data D′_i into a word vector matrix X′ and initialize the embedding layer of the neural network model;
Step 6.3: for the word vector matrix X′, extract local features v′_w with the convolution windows of the convolutional neural network (CNN);
Step 6.4: encode the word vector matrix X′ with a bidirectional long short-term memory (BiLSTM) network to obtain an enhanced, context-aware vector representation for each word; represent the sentence S with these enhanced vectors to obtain the matrix X′_c, and on X′_c apply convolution windows and max pooling to extract logical-layer features, giving the vector v′_c;
Step 6.5: fusing local features and logical layer features to form a new vector v 'of text' f Obtaining the convolution of the network and the output vector of the pooling layer;
Step 6.6: feed the new text vector v′_f into the fully connected layer and apply the softmax function to obtain the output of the single-label emotion classification model of the hierarchical convolutional neural network:
ŷ = softmax(W·v′_f + b)
Step 6.7: compute the emotion intensity value of the single-label emotion classification model's output with the emotion intensity model ACNN;
Step 6.8: combine the outputs p̂′_i of the ACNN emotion intensity model for each emotion to obtain the optimized multi-label emotion intensity value vector [p̂′_1, p̂′_2, …, p̂′_n′].
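Steps 6.7 and 6.8 can be sketched as below. Gating each ACNN intensity prediction by the binary output of the classification model is an assumption about how the two models' outputs are combined; the description states only that the per-emotion outputs are assembled into the multi-label vector.

```python
def combine(outputs_cls, outputs_intensity):
    """Assemble the optimized multi-label vector [p̂'_1, ..., p̂'_n'].

    outputs_cls       : per-emotion binary decisions of the HCNN classifier
    outputs_intensity : per-emotion predictions of the ACNN intensity model
    An emotion judged absent by the classifier is given intensity 0
    (an illustrative gating rule, not stated in the patent).
    """
    return [inten if cls == 1 else 0.0
            for cls, inten in zip(outputs_cls, outputs_intensity)]
```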
The core innovation of the invention is a hierarchical convolutional neural network (HCNN) model that can be used for emotion classification and emotion intensity prediction on social media texts. A specific embodiment of the HCNN model is given below.
(1) Training and test data. A Chinese blog dataset of 19,751 sentences is used, with eight basic emotions: anger, anxiety, expectation, aversion, joy, love, sadness, and surprise. Each sentence in the dataset is annotated with the emotions it expresses, and the intensity value of each emotion lies in [0,1], where intensity 0 indicates that the sentence does not express that basic emotion.
(2) Word vector pre-training. Chinese word vectors are trained on the Chinese Wikipedia, with the raw corpus downloaded directly from Wiki Dump. word2vec is used as the training tool, specifically a Skip-gram model with vector dimension 200 and a context window of 5; training uses the Negative Sampling optimization method with a sampling value of 1e-4 and 15 model iterations.
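For illustration, the (center word, context word) training pairs consumed by the Skip-gram model with a symmetric context window can be generated as in this sketch; the actual pre-training is done with the word2vec tool as described above, and the tokenization is assumed.

```python
def skipgram_pairs(tokens, window=5):
    """Generate (center, context) pairs as used by Skip-gram training.

    For each position i, every other token within `window` positions on
    either side becomes a context word (the embodiment uses window = 5).
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs
```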
(3) HCNN network training. The HCNN is trained with the Adam optimization method. The number of convolution kernels in the network is set to 200. The final fully connected block of the model contains two hidden layers, with 200 and 100 hidden units respectively, whose dropout rates are set to 0.2 and 0.1. Different convolution window sizes and numbers of bidirectional long short-term memory (BiLSTM) hidden units can be chosen for each basic emotion; this tuning is done on a validation set.
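The final fully connected block described above (two hidden layers of 200 and 100 units with dropout rates 0.2 and 0.1, followed by a softmax output) can be sketched as a numpy forward pass. The ReLU activations and inverted-dropout scaling are implementation assumptions not stated in the text.

```python
import numpy as np

def dense_head(v, W1, b1, W2, b2, W3, b3, rng=None, train=False):
    """Two hidden layers with dropout 0.2 / 0.1, then a softmax output.

    Weight shapes are the caller's choice (the embodiment uses 200 and
    100 hidden units); dropout is applied only during training.
    """
    def dropout(x, rate):
        if not train:
            return x
        mask = rng.random(x.shape) >= rate
        return x * mask / (1.0 - rate)        # inverted dropout scaling

    h1 = dropout(np.maximum(0, v @ W1 + b1), 0.2)
    h2 = dropout(np.maximum(0, h1 @ W2 + b2), 0.1)
    z = h2 @ W3 + b3
    e = np.exp(z - z.max())
    return e / e.sum()                        # softmax probabilities
```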
Fig. 3 shows experimental comparison 1 against a conventional CNN model, where DCNN and ACNN are HCNN variants that fuse multi-layer features by vector concatenation and vector addition respectively; RL is ranking loss, HL is Hamming loss, MSE is mean squared error, and SA is subset accuracy; an upward arrow marks metrics where larger is better, and a downward arrow marks metrics where smaller is better. Fig. 4 shows experimental comparison 2 against the conventional CNN model, with DCNN and ACNN as above; OE is one-error, MaF is the macro-averaged F value, MiF is the micro-averaged F value, and AP is average precision; the arrows are read the same way.

Claims (3)

1. A multi-label emotion intensity prediction method based on a hierarchical convolutional neural network is characterized in that the multi-label emotion intensity prediction method based on the hierarchical convolutional neural network comprises the following specific processes:
step 1: dividing an original multi-label social media short text into a training set and a testing set;
Step 2: preprocess a piece of original multi-label social media short text data in the training set to obtain preprocessed single-label data, where each item in the original multi-label social media short text training set carries an n-dimensional real vector of basic emotion intensity values [e_1, e_2, …, e_i, …, e_n];
Step 3: obtain multiple pieces of original multi-label social media short text data to form the training set, and process each piece with the method of step 2 to obtain the basic-emotion single-label data D_i of the training set;
And 4, step 4: constructing a single-label emotion classification model based on a hierarchical convolutional neural network;
Step 4.1: convert the basic-emotion single-label training data D_i into a word vector matrix X and initialize the embedding layer of the neural network model;
Step 4.2: for the word vector matrix X, extract local features v_w with the convolution windows and max pooling operation of the convolutional neural network:
v_w = CNN(X)
Step 4.3: encode the word vector matrix X with a bidirectional long short-term memory (BiLSTM) network to obtain an enhanced, context-aware vector representation for each word; represent the sentence S with these enhanced vectors to obtain the matrix X_c, and on the basis of X_c use a convolutional neural network to extract logical-layer features, giving the vector v_c:
X_c = BiLSTM(X)
v_c = CNN(X_c)
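The convolution-and-max-pooling extractor used in steps 4.2 and 4.3 can be sketched in numpy as follows; this shows a single filter bank with ReLU, and the BiLSTM encoder that produces X_c is omitted for brevity.

```python
import numpy as np

def conv_maxpool(X, filters, b):
    """v = CNN(X): slide each filter over every window of l consecutive
    word vectors, apply ReLU, then max-pool over window positions.

    X       : (n, d) word-vector matrix of a sentence
    filters : (k, l, d) k convolution kernels with window size l
    b       : (k,) biases
    returns : (k,) feature vector (v_w, or v_c when X is X_c)
    """
    n, d = X.shape
    k, l, _ = filters.shape
    feats = np.empty((k, n - l + 1))
    for i in range(n - l + 1):
        window = X[i:i + l]                                   # (l, d)
        feats[:, i] = np.tensordot(filters, window,
                                   axes=([1, 2], [0, 1])) + b
    return np.maximum(0, feats).max(axis=1)                   # ReLU + max pool
```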
Step 4.4: fuse the local features and the logical-layer features into a new text vector v_f:
v_f = v_w ⊕ v_c
where the symbol ⊕ denotes a vector concatenation operation or a vector addition operation;
Step 4.5: feed the new text vector v_f into the fully connected layer to obtain the single-label emotion classification model of the hierarchical convolutional neural network;
Step 4.6: feed the new text vector v_f into the fully connected layer and apply the softmax function to obtain the output of the single-label emotion classification model of the hierarchical convolutional neural network:
ŷ = softmax(W·v_f + b)
The single-label emotion classification model of the hierarchical convolutional neural network uses the following cross-entropy loss function:
L = −(1/N) Σ_{i=1}^{N} [y_i log ŷ_i + (1 − y_i) log(1 − ŷ_i)]
where N is the number of training examples, y_i is a binary variable indicating whether the i-th sample belongs to the given class, and ŷ_i is the probability with which the model predicts that the i-th sample belongs to that class;
The single-label emotion classification model of the hierarchical convolutional neural network is optimized against the cross-entropy loss function by iterative gradient descent; the optimization process ends after L full passes over the training set, and the resulting loss-optimized model is the final single-label emotion classification model of the hierarchical convolutional neural network;
and 5: constructing an emotion intensity value model based on the attention convolution neural network;
Step 5.1: for a single-label dataset D_i, filter out the texts with e_i = 0; then, on the e_i > 0 data, train an emotion intensity value prediction model with the following steps;
Step 5.2: convert each text S in the single-label dataset D_i into a word vector matrix X_P used to initialize the embedding layer of the neural network model;
Step 5.3: encode the word vector matrix X_P with a long short-term memory (LSTM) model to obtain a task-related representation vector v_s of the text S, where:
v_s = LSTM(X_P)
Step 5.4: from the sentence representation vector v_s and the original word vector matrix X_P, compute the relevance weights of the word vectors through an attention mechanism; the attention vector is computed as:
v_a = X_P·W·v_s
where v_a is the attention vector and W is the weight matrix;
The word vectors are then weighted by the attention vector, i.e., the word vectors in each subsequent window are scaled, according to:
α_i = l·softmax(v_a[i:i+l]), i ∈ {0, 1, …, n − l}
Z_i = α_i ⊙ X_P[i:i+l]
where l is the window size and X_P[i:i+l] denotes the word vectors of the l consecutive words starting at the i-th position of the sentence; the similarity scores in the current window are thus converted into a probability distribution with the softmax function and multiplied by l to obtain the scaling weights, which are then multiplied with the original word vectors X_P to scale them; a new weighted representation Z of the text is generated for each window;
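The window-wise attention scaling of step 5.4 can be sketched as follows: each window's attention scores are passed through softmax, multiplied by l, and applied to the corresponding word vectors. Applying the weights position-wise within each window is an assumption consistent with the scaling description.

```python
import numpy as np

def scale_by_attention(X_P, v_a, l):
    """For each window of size l, compute α_i = l·softmax(v_a[i:i+l])
    and scale the window's word vectors, producing one weighted
    representation Z_i per window position i.

    X_P : (n, d) original word-vector matrix
    v_a : (n,)  attention scores of the n words
    """
    n = len(v_a)
    Z = []
    for i in range(n - l + 1):
        s = v_a[i:i + l]
        e = np.exp(s - s.max())                  # numerically stable softmax
        alpha = l * e / e.sum()                  # α_i = l·softmax(v_a[i:i+l])
        Z.append(alpha[:, None] * X_P[i:i + l])  # scale each word vector
    return Z
```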
Step 5.5: extract features from the weighted representation with a convolutional neural network; for the weighted representation Z_l generated with window size l, the most significant features are extracted with the CNN network and the max pooling method:
v_l = CNN(Z_l)
The features v_l obtained with different window sizes l are concatenated to form the final characterization vector v_g of the input text;
Step 5.6: feed v_g into the fully connected layer and apply the softmax function to obtain the final output of the model, namely the emotion intensity value of the text:
p̂ = softmax(W·v_g + b)
Step 5.7: optimize the model with the training data and the loss function to obtain the optimal parameters and the optimized emotion intensity model:
The mean squared error between the annotated emotion intensity values and the model-predicted emotion intensity values is used as the loss function of the emotion intensity model:
MSE = (1/N′) Σ_{i=1}^{N′} (p_i − p̂_i)²
where N′ is the number of instances in the single-label emotion training set for the given emotion, p_i is the annotated emotion intensity value of the i-th sample, and p̂_i is the emotion intensity value predicted for the i-th sample by the model h′;
Iterative optimization is carried out with the stochastic gradient descent algorithm; the optimization process ends after L′ full passes over the training set, yielding the single-label emotion intensity prediction model;
Step 6: for the multi-label social media short text test data, predict with the single-label emotion classification model of the hierarchical convolutional neural network to obtain the optimized multi-label emotion intensity vector.
2. The method for predicting multi-label emotional intensity based on the hierarchical convolutional neural network as claimed in claim 1, wherein the step 2 specifically comprises:
Step 2.1: remove the punctuation irrelevant to emotion analysis from the social media short text data, retaining question marks and exclamation marks and deleting all other punctuation, to obtain the punctuation-stripped social media short text data;
Step 2.2: if the punctuation-stripped social media short text data contains numerals, replace them with a specified token to obtain the numeral-replaced social media short text data;
Step 2.3: let the largest emotion class contain num texts; when the emotion class distribution of the numeral-replaced social media short text data is unbalanced, i.e., when some emotion class contains fewer than (3 × num)/4 texts, resample the social media short text data of that class until all emotion classes hold similar amounts of short text data, 'similar' being defined as: the smallest class contains no less than 0.75 times the number of texts of the largest class; if every other emotion class already contains at least (3 × num)/4 texts, skip resampling and proceed to step 2.4;
Step 2.4: for the resampled social media short text data, place each text into the single-label dataset D_i of every basic emotion e_i it contains; a piece of resampled social media short text data carrying n basic emotion intensity values thus generates n pieces of basic-emotion single-label data, where e_i > 0 means the emotion is present and e_i = 0 means it is absent.
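The resampling rule of step 2.3 can be sketched as below; oversampling with replacement is an assumed resampling strategy, since the claim does not specify how the resampling is performed.

```python
import random

def rebalance(datasets, seed=0):
    """Step 2.3 rule: if the largest emotion class holds num texts,
    oversample any class with fewer than (3*num)/4 texts until every
    class holds at least 0.75x the size of the largest one.

    datasets : dict mapping emotion label -> list of texts
    """
    rng = random.Random(seed)
    num = max(len(v) for v in datasets.values())
    target = (3 * num) // 4
    out = {}
    for emo, texts in datasets.items():
        texts = list(texts)
        while len(texts) < target:       # oversample with replacement
            texts.append(rng.choice(texts))
        out[emo] = texts
    return out
```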
3. The method for predicting multi-label emotional intensity based on the hierarchical convolutional neural network of claim 1, wherein the step 6 specifically comprises:
Step 6.1: preprocess a piece of original data from the multi-label social media short text test set to obtain preprocessed single-label data, where each item in the test set carries an n′-dimensional real vector of basic emotion intensity values [e′_1, e′_2, …, e′_i, …, e′_n′];
Step 6.1.1: remove the punctuation irrelevant to emotion analysis from the social media short text data, retaining question marks and exclamation marks and deleting all other punctuation, to obtain the punctuation-stripped social media short text data;
Step 6.1.2: if the punctuation-stripped social media short text data contains numerals, replace them with a specified token to obtain the numeral-replaced social media short text data;
Step 6.1.3: for the numeral-replaced social media short text data, place each text into the single-label dataset D′_i of every basic emotion e′_i it contains; a text carrying n′ basic emotion intensity values thus generates n′ pieces of basic-emotion single-label data, where e′_i > 0 means the emotion is present and e′_i = 0 means it is absent;
Step 6.2: convert the preprocessed single-label test data D′_i into a word vector matrix X′ and initialize the embedding layer of the neural network model;
Step 6.3: for the word vector matrix X′, extract local features v′_w with the convolution windows of the convolutional neural network;
Step 6.4: encode the word vector matrix X′ with a bidirectional long short-term memory (BiLSTM) network to obtain an enhanced, context-aware vector representation for each word; represent the sentence S with these enhanced vectors to obtain the matrix X′_c, and on X′_c apply convolution windows and max pooling to extract logical-layer features, giving the vector v′_c;
Step 6.5: fusing local features and logical layer features to form a new vector v 'of text' f Obtaining the convolution of the network and the output vector of the pooling layer;
Step 6.6: feed the new text vector v′_f into the fully connected layer and apply the softmax function to obtain the output of the single-label emotion classification model of the hierarchical convolutional neural network:
ŷ = softmax(W·v′_f + b)
Step 6.7: compute the emotion intensity value of the single-label emotion classification model's output with the emotion intensity model ACNN;
Step 6.8: combine the outputs p̂′_i of the ACNN emotion intensity model for each emotion to obtain the optimized multi-label emotion intensity value vector [p̂′_1, p̂′_2, …, p̂′_n′].
CN201910751989.2A 2019-08-15 2019-08-15 Multi-label emotion intensity prediction method based on hierarchical convolutional neural network Active CN110472245B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910751989.2A CN110472245B (en) 2019-08-15 2019-08-15 Multi-label emotion intensity prediction method based on hierarchical convolutional neural network


Publications (2)

Publication Number Publication Date
CN110472245A CN110472245A (en) 2019-11-19
CN110472245B true CN110472245B (en) 2022-11-29

Family

ID=68511433


Country Status (1)

Country Link
CN (1) CN110472245B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110957039A (en) * 2019-12-16 2020-04-03 河南科技学院 Campus psychological coaching method and device based on deep learning
CN111985532B (en) * 2020-07-10 2021-11-09 西安理工大学 Scene-level context-aware emotion recognition deep network method
CN111862068B (en) * 2020-07-28 2022-09-13 福州大学 Three-model comprehensive decision emotion prediction method fusing data missing data and images
CN116306686B (en) * 2023-05-22 2023-08-29 中国科学技术大学 Method for generating multi-emotion-guided co-emotion dialogue

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170308790A1 (en) * 2016-04-21 2017-10-26 International Business Machines Corporation Text classification by ranking with convolutional neural networks
CN109299253A (en) * 2018-09-03 2019-02-01 华南理工大学 A kind of social text Emotion identification model construction method of Chinese based on depth integration neural network
CN110097894B (en) * 2019-05-21 2021-06-11 焦点科技股份有限公司 End-to-end speech emotion recognition method and system



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant