CN108984724B - Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation - Google Patents
- Publication number: CN108984724B (application CN201810754022.5A)
- Authority: CN (China)
- Prior art keywords: word, clause, attribute, representation, layer
- Prior art date: 2018-07-10
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation. First, a clause segmentation algorithm is proposed to segment a review text into several clauses; second, the words in each clause are encoded with multiple bidirectional long short-term memory (Bi-LSTM) networks to obtain a representation of each clause; finally, the clause representations obtained in the previous step are encoded with another Bi-LSTM network to obtain the final representation of the whole sentence. In this way, information more relevant to the specific attribute is captured at three different granularities (words, clauses, and sentences), which ultimately improves the accuracy of attribute-specific sentiment classification.
Description
Technical Field
The invention relates to a sentiment analysis method for opinions expressed in review texts, and in particular to a method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation.
Background
To obtain the sentiment polarity of each attribute in a review text, sentiment analysis (SA) techniques identify the attribute words, sentiment words, and sentiment modifiers in the text and analyze them to judge the polarity the review expresses towards a specific attribute; such analysis is applied in fields such as event analysis, online public opinion analysis, and spam processing.
When judging the sentiment polarity of a review text, traditional coarse-grained sentiment analysis methods process the review as a whole and cannot judge fine-grained polarity with respect to the specific attributes mentioned in it. Recent sentiment analysis research has therefore shifted towards fine granularity, which has become a topical research focus both domestically and internationally.
At present, deep neural network (DNN) techniques are used for attribute-specific sentiment analysis of text. For the problem of classifying the sentiment of a sentence with respect to a specific attribute, Tang et al. proposed the target-dependent long short-term memory network (TD-LSTM) and the target-connection long short-term memory network (TC-LSTM) in "Target-Dependent Sentiment Classification with Long Short Term Memory". TD-LSTM takes the target information into account when generating the sentence representation, and TC-LSTM builds on it by connecting the target information with its context. These methods take the average of the word vectors in the target phrase as the target vector; however, a simple average cannot fully express the semantics of the target phrase, so optimal results cannot be obtained. Dong et al. proposed an adaptive recursive neural network (AdaRNN) in "Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification", which adaptively propagates the sentiment words to the specific attribute according to the context and the syntactic relations between them. The method converts the dependency tree of a sentence into a recursive structure for the specific attribute and obtains a higher-level representation based on that structure. Experiments show that classifiers built on AdaRNN outperform traditional machine-learning methods and basic recursive neural network methods, but classification performance still leaves room for improvement.
Disclosure of Invention
The purpose of the invention is as follows: in view of the defects of the prior art, the invention provides a method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation.
The technical scheme is as follows:
a method for improving emotion classification accuracy of specific attributes by using high-dimensional representation comprises a training stage and a testing stage:
the method comprises the following specific steps:
a training stage:
S1) A sentence is segmented into several clauses with the clause segmentation algorithm, and each word in a clause is expressed as a word vector; the concatenation of each word vector with the attribute word vector serves as the input of the deep neural network model; all unknown words are initialized by random sampling from the uniform distribution U(-0.01, 0.01), the dimensionalities of the word vectors and of the bidirectional long short-term memory (Bi-LSTM) networks are set to 300, and the remaining hyperparameters are tuned accordingly on the development data set, yielding a trained deep neural network model;
S2) The deep neural network model comprises a 3-layer architecture consisting of a word coding layer, a clause coding layer, and a softmax layer; the word coding layer captures the relevance of each word in a clause to the specific attribute, the clause coding layer maps the specific attribute into the clauses, and the softmax layer feeds the final representation s of the review text into a softmax classifier to finally obtain the class probability distribution of the review text for the given attribute;
S3) The input word sequence of the deep neural network model consists of (d + d')-dimensional word vectors, where d denotes the dimensionality of the word vectors and d' the dimensionality of the attribute word vectors; the value of d can be adjusted according to the experimental setting;
S4) For the training loss of the model, the cross-entropy loss function is adopted to train the high-dimensional-representation-based specific-attribute emotion classification model in an end-to-end manner;
S5) Training data $(x_t, a_t, y_t)$ are given, where $x_t$ denotes the t-th sample to be predicted, $a_t$ the attribute present in the sample, and $y_t$ the true category label of sample $x_t$ for the specific attribute $a_t$;
S6) The high-dimensional-representation-based specific-attribute emotion classification model is regarded as a black-box function whose output is a vector of the probabilities that the input text belongs to each class label; the goal of training is to minimize the loss function
$\text{loss} = -\sum_{m=1}^{M}\sum_{k=1}^{K} y_k^{(m)} \log \hat{y}_k^{(m)} + \lambda \lVert \theta \rVert_2^2$
where M denotes the number of training samples, K the number of class labels, $\hat{y}^{(m)}$ the predicted class distribution of the m-th sample, and $\lambda$ the $L_2$-regularization weight of the parameters;
S7) The Adagrad optimizer is adopted, and the parameters of all matrices and vectors are initialized by random sampling from the uniform distribution $U\!\left(-\sqrt{6/(r+c')},\, \sqrt{6/(r+c')}\right)$, where r and c' are the number of rows and columns of the matrix; during training, a Dropout strategy is applied inside the Bi-LSTM to avoid overfitting;
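As a concrete illustration of the initialization in S1 and S7, the following minimal NumPy sketch samples unknown-word vectors from U(-0.01, 0.01) and weight matrices from a row/column-dependent uniform range; the exact $\sqrt{6/(r+c')}$ bound is an assumption (the common Glorot-style choice), since the text only states that the range depends on r and c'.

```python
import numpy as np

def init_matrix(r, c_prime, rng):
    """Sample a weight matrix from U(-b, b) with b = sqrt(6 / (r + c')).

    The bound is an assumption (Glorot-style); the patent only states
    that the range depends on the rows r and columns c' of the matrix."""
    b = np.sqrt(6.0 / (r + c_prime))
    return rng.uniform(-b, b, size=(r, c_prime))

def init_unknown_word(rng, d=300):
    """Unknown words are sampled from U(-0.01, 0.01), as stated in S1."""
    return rng.uniform(-0.01, 0.01, size=d)

rng = np.random.default_rng(0)
W = init_matrix(300, 600, rng)      # e.g. a projection inside the Bi-LSTM
unk = init_unknown_word(rng)        # vector for an out-of-vocabulary word
```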
A testing stage:
S8) The review text to be processed is input into the trained deep neural network model to obtain its sentiment polarity with respect to the specific attribute.
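As an illustration of the testing stage, a minimal sketch follows; `segment_and_embed` is a hypothetical helper standing in for the preprocessing of S1-S3, and `model` stands for the trained network of S1-S7.

```python
import numpy as np

LABELS = ["positive", "neutral", "negative"]

# `segment_and_embed` is a hypothetical helper standing in for the clause
# segmentation of S1 and the (d + d')-dimensional attribute-extended
# embedding of S3; `model` is the trained network of S1-S7.
x = segment_and_embed("The food was great but the service was slow.",
                      aspect="food")        # shape (1, C, N, d + d')
probs = model.predict(x)[0]                 # class probability distribution
print(LABELS[int(np.argmax(probs))])        # polarity for the given aspect
```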
Further, the clause segmentation algorithm segments sentences at punctuation marks and connective words (collectively referred to as delimiters): a minnum parameter limits the minimum number of words a clause must contain, and a candidate segment is split off as a clause if and only if it contains at least minnum words;
in addition, a maxnum parameter ensures that every sentence is cut into the same number of clauses, because the subsequent neural network requires a fixed number of clauses as input;
the delimiters comprise the punctuation marks "," and ";" and the connective words "and", "but", "so", "especially", "however", "though", and "except".
Further, the remaining hyperparameters are tuned on the development data set; specifically, the initial learning rate is set to 0.1, the regularization weight of the parameters is set to $10^{-5}$, and the Dropout rate is set to 0.25.
Further, in the clause segmentation algorithm the parameter minnum is set to 3 and the parameter maxnum is set to 4, so that all plausible clauses can be mined from a sentence and the model achieves its best performance on the development data set.
Further, the high-dimensional-representation-based bidirectional long short-term memory network model, composed of a word coding layer, a clause coding layer, and a softmax layer, proceeds as follows:
A first word coding layer: assume the review text contains C clauses in total, where $c_i$ denotes the i-th clause, each clause contains $N_i$ words, and $I_{ij}$ denotes the word appearing at the j-th position of the i-th clause, with $j \in [1, N_i]$;
each word appearing in clause $c_i$ is represented by a word vector $w_{ij} = E_w \cdot I_{ij}$, $j \in [1, N_i]$; the word vectors are stored in a word-embedding matrix $E_w \in \mathbb{R}^{d \times |V|}$, where d denotes the dimensionality of the word vectors and V denotes the vocabulary;
the aspect category that appears consists of two parts, an entity and a feature (attribute):
specifically, assume an entity string $e_1$ of length $L_1$, expressed as $e_1 = \{v^e_1, v^e_2, \ldots, v^e_{L_1}\}$, where $v^e_n$ denotes the d'-dimensional vector representation of the n-th word in the entity string;
a word-vector representation usually has a linear structure, which gives it additive and subtractive properties at the semantic level, so the meanings of words can be combined by element-wise addition of their vectors;
the entity word vector and the feature word vector are therefore added to obtain the final representation of the attribute word vector: $v_a = v_e + v_f$;
then, the attribute word vector is appended to each word-vector representation to obtain the attribute-extended representation of each word: $x_{ij} = w_{ij} \oplus v_a$;
in the above formula the dimensionality of $x_{ij}$ is (d + d'), $i \in [1, C]$, $j \in [1, N_i]$, $\oplus$ denotes the vector-concatenation operator, C denotes the number of clauses, and $N_i$ denotes the number of words contained in clause $c_i$;
the obtained word vectors $x_{ij}$ are taken as input, and a bidirectional long short-term memory network (Bi-LSTM) integrates the information of each word in the forward and backward directions, converting the input word-vector matrix into a new representation;
a Bi-LSTM presents each training sequence forwards and backwards to two separate long short-term memory networks (LSTMs), both of which are connected to the same output layer;
this structure provides complete past and future context information for every point of the input sequence;
the forward LSTM contained in the Bi-LSTM, denoted $\overrightarrow{\mathrm{LSTM}}$, reads clause $c_i$ from front to back, i.e. from $I_{i,1}$ to $I_{i,N_i}$; the corresponding backward LSTM, denoted $\overleftarrow{\mathrm{LSTM}}$, reads clause $c_i$ from back to front, i.e. from $I_{i,N_i}$ to $I_{i,1}$: $\overrightarrow{h}_{ij} = \overrightarrow{\mathrm{LSTM}}(x_{ij})$, $\overleftarrow{h}_{ij} = \overleftarrow{\mathrm{LSTM}}(x_{ij})$;
the forward hidden state $\overrightarrow{h}_{ij}$ and the backward hidden state $\overleftarrow{h}_{ij}$ are concatenated to obtain the final hidden-state representation $h_{ij} = [\overrightarrow{h}_{ij}; \overleftarrow{h}_{ij}]$ of each word $I_{ij}$ in the clause, which fuses the information of all the words surrounding $I_{ij}$ in the clause;
finally, the hidden states $h_{ij}$ of all words $I_{ij}$ in the clause are averaged by a mean-pooling layer to obtain the final representation of the clause: $c_i = \frac{1}{N_i} \sum_{j=1}^{N_i} h_{ij}$;
A second clause coding layer: the clause vectors $c_i$ obtained in the previous step are again encoded with a Bi-LSTM to fuse context information: $\overrightarrow{h}_i = \overrightarrow{\mathrm{LSTM}}(c_i)$, $\overleftarrow{h}_i = \overleftarrow{\mathrm{LSTM}}(c_i)$;
similar to the word coding layer, the forward hidden state $\overrightarrow{h}_i$ and the backward hidden state $\overleftarrow{h}_i$ are concatenated to obtain the final hidden-state representation $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$ of each clause $c_i$ in the review text, which fuses the information of all the surrounding clauses;
the hidden states $h_i$ of all clauses $c_i$ in the review text are averaged by the mean-pooling layer to obtain the final representation of the review text: $s = \frac{1}{C} \sum_{i=1}^{C} h_i$;
A third softmax layer: the final representation s of the review text is fed into a softmax classifier to finally obtain the class probability distribution of the review text for the given attribute:
$o = W_l \cdot s + b_l$
the probability that a given sentence belongs to each category $k \in [1, K]$ is computed as
$P(y = k \mid s; \theta) = \frac{\exp(o_k)}{\sum_{k'=1}^{K} \exp(o_{k'})}$
where $\theta$ denotes all the parameters; the class label with the highest probability according to this formula is taken as the final class label of the review text.
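To make the softmax step concrete, here is a tiny NumPy sketch of the classification head, with the dimensionality of s assumed to be 2H = 600 for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
K, dim_s = 3, 600                       # classes; dim of s (2H, illustrative)

s   = rng.standard_normal(dim_s)        # final representation of the review
W_l = rng.standard_normal((K, dim_s))   # softmax weight matrix
b_l = rng.standard_normal(K)            # softmax bias

o = W_l @ s + b_l                       # o = W_l . s + b_l
p = np.exp(o - o.max())                 # subtract max for numerical stability
p /= p.sum()                            # P(y = k | s; theta), k in [1, K]
print(p, int(p.argmax()))               # distribution and predicted label
```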
Compared with the prior art, the method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation provided by the invention constructs a multi-level, high-dimensional deep neural network model from three different granularities (words, clauses, and sentences), using the review text and its attribute information, and thereby achieves better classification performance.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of a specific attribute emotion classification model architecture constructed by the present invention;
FIG. 3 is a restaurant domain review text example.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings.
This embodiment presents a method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation:
a high-dimensional-representation-based bidirectional long short-term memory network model is constructed for specific attributes, comprising a training stage and a testing stage:
in the training stage, the word coding layer captures the correlation between each word in a clause and the specific attribute, and the clause coding layer maps the specific attribute into the clauses, which serve as the input of the deep neural network model; all unknown words are initialized by random sampling from the uniform distribution U(-0.01, 0.01), the dimensionalities of the word vectors and of the Bi-LSTM networks are set to 300, and the remaining hyperparameters are tuned accordingly on the development data set, yielding a trained deep neural network model;
in the testing stage, the review text to be processed is input into the trained deep neural network model to obtain its sentiment polarity with respect to the specific attribute;
wherein two observations motivate the design when judging the sentiment polarity a review text expresses towards a specific attribute:
on the one hand, not all components of the review text are relevant to the specific attribute;
on the other hand, a review text may mention several attributes, and different attributes may require information from different parts of the sentence for sentiment classification;
therefore, referring to fig. 1, the clause segmentation algorithm proposed in this embodiment segments a sentence into different clauses so as to map specific attributes into the clauses.
The basic idea of the clause segmentation algorithm proposed in this embodiment is to segment sentences by punctuation marks and conjunctions (collectively referred to as delimiters):
as shown in fig. 3, "great and tasty" should not be divided into two clauses at the conjunction "and"; hence not every delimiter can serve as a clause boundary;
this scheme therefore defines a minnum parameter to limit the minimum number of words a clause must contain, and a candidate segment is split off as a clause only when it contains at least minnum words;
in addition, a maxnum parameter ensures that every sentence is cut into the same number of clauses, because the subsequent neural network requires a fixed number of clauses as input;
the clause segmentation method is described in detail in Table 1, where the delimiters comprise the punctuation marks "," and ";" and the connective words "and", "but", "so", "especially", "however", "though", and "except".
TABLE 1 Clause segmentation algorithm (illustrated by the code sketch below)
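Since the algorithm table itself is not reproduced here, the following Python sketch illustrates one plausible reading of the algorithm under the stated constraints (the delimiter set, minnum = 3, maxnum = 4); the backward merge of short fragments and the way the clause count is forced to maxnum are assumptions.

```python
import re

# Delimiters named in the text: the punctuation marks "," and ";" plus
# the connective words.
DELIM = re.compile(
    r"\s*(,|;|\band\b|\bbut\b|\bso\b|\bespecially\b|\bhowever\b|\bthough\b|\bexcept\b)\s*")

def segment_clauses(sentence, minnum=3, maxnum=4):
    """Split `sentence` into exactly `maxnum` clauses.

    A fragment becomes its own clause only if it has at least `minnum`
    words; shorter fragments stay attached to the preceding clause (so
    "great and tasty" is never split).  The backward merge and the
    pad/merge used to force exactly `maxnum` clauses are assumptions;
    the text fixes only the delimiter set and the two parameters."""
    parts = DELIM.split(sentence)            # text, delim, text, delim, ...
    clauses = [parts[0].strip()] if parts[0].strip() else []
    for delim, frag in zip(parts[1::2], parts[2::2]):
        frag = frag.strip()
        if len(frag.split()) >= minnum:
            clauses.append(frag)               # long enough to stand alone
        elif frag and clauses:
            clauses[-1] += f" {delim} {frag}"  # too short: merge backwards
        elif frag:
            clauses.append(frag)
    while len(clauses) > maxnum:               # force a fixed clause count:
        clauses[-2] += " " + clauses.pop()     # merge surplus tail clauses
    clauses += [""] * (maxnum - len(clauses))  # pad with empty clauses
    return clauses

print(segment_clauses("The food was great and tasty but the service and the "
                      "atmosphere were awfully bad, so I will not return."))
```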
Referring to fig. 2, the high-dimensional-representation-based bidirectional long short-term memory network model constructed for specific attributes comprises a 3-layer architecture of a word coding layer, a clause coding layer, and a softmax layer:
the word coding layer captures the relevance of each word in a clause to the specific attribute;
the clause coding layer maps the specific attribute into the clauses;
the softmax layer feeds the final representation s of the review text into a softmax classifier to finally obtain the class probability distribution of the review text for the given attribute;
For the training loss of the model, the cross-entropy loss function is selected, and Adagrad is used as the optimizer:
the two-way long-short term memory neural network model based on high-dimensional representation and composed of a word coding layer, a clause coding layer and a softmax layer comprises the following specific processes:
a first word coding layer, assuming that the comment text contains C clauses in total, wherein C is usediTo represent the ith clause and each clause contains N in commoniA word, IijThen the word that appears at the jth position in the ith clause is represented, where j e [1, Ni];
Clause ciFor words appearing inIs represented by where j ∈ [1, N ∈ ]]The words wij=Ew·IijAre all stored in a word vector (word embedding) matrix, whereinWhere d represents the dimension of the word vector and V represents the vocabulary;
the attribute category (aspect category) of appearance is composed of two parts, entity (entity) and feature (attribute):
specifically, assume an entity string e1Has a length of L1It is expressed asWhereinRepresenting a d' dimensional vector representation of the nth word in the entity string;
Usually, a word vector representation has a linear structure, which makes it have an overlapping or subtractive property at a semantic level, so that the purpose of combining words can be achieved by adding elements of the word vector;
adding the entity word vectors and the feature word vectors to obtain a final representation of the attribute word vectors:
then, adding an attribute word vector on the basis of the word vector representation to obtain an attribute extended representation of each word:
in the above formulaNamely, it isIs (d + d'), i is belonged to [1, C],j∈[1,Ni],Representing a vector splicing operator, C representing the number of clauses, NiExpress clause ciThe number of words contained therein;
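A minimal NumPy sketch of this composition follows, with d = 300 as stated above and an illustrative (assumed) attribute dimensionality d' = 100:

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_prime = 300, 100                  # d' = 100 is an illustrative choice

# Stand-ins for rows of the embedding matrices E_w and the attribute vectors.
w_ij = rng.standard_normal(d)          # word vector of word I_ij
v_e  = rng.standard_normal(d_prime)    # entity part of the aspect, e.g. "food"
v_f  = rng.standard_normal(d_prime)    # feature part, e.g. "quality"

v_a  = v_e + v_f                       # attribute vector: element-wise sum
x_ij = np.concatenate([w_ij, v_a])     # attribute-extended word representation
assert x_ij.shape == (d + d_prime,)    # (d + d')-dimensional, as in the text
```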
the obtained word vectors $x_{ij}$ are taken as input, and a bidirectional long short-term memory network (Bi-LSTM) integrates the information of each word in the forward and backward directions, converting the input word-vector matrix into a new representation;
a Bi-LSTM presents each training sequence forwards and backwards to two separate long short-term memory networks (LSTMs), both of which are connected to the same output layer;
this structure provides complete past and future context information for every point of the input sequence;
the forward LSTM contained in the Bi-LSTM, denoted $\overrightarrow{\mathrm{LSTM}}$, reads clause $c_i$ from front to back, i.e. from $I_{i,1}$ to $I_{i,N_i}$; the corresponding backward LSTM, denoted $\overleftarrow{\mathrm{LSTM}}$, reads clause $c_i$ from back to front, i.e. from $I_{i,N_i}$ to $I_{i,1}$: $\overrightarrow{h}_{ij} = \overrightarrow{\mathrm{LSTM}}(x_{ij})$, $\overleftarrow{h}_{ij} = \overleftarrow{\mathrm{LSTM}}(x_{ij})$;
the forward hidden state $\overrightarrow{h}_{ij}$ and the backward hidden state $\overleftarrow{h}_{ij}$ are concatenated to obtain the final hidden-state representation $h_{ij} = [\overrightarrow{h}_{ij}; \overleftarrow{h}_{ij}]$ of each word $I_{ij}$ in the clause, which fuses the information of all the words surrounding $I_{ij}$ in the clause;
finally, the hidden states $h_{ij}$ of all words $I_{ij}$ in the clause are averaged by a mean-pooling layer to obtain the final representation of the clause: $c_i = \frac{1}{N_i} \sum_{j=1}^{N_i} h_{ij}$;
A second clause coding layer: the clause vectors $c_i$ obtained in the previous step are again encoded with a Bi-LSTM to fuse context information: $\overrightarrow{h}_i = \overrightarrow{\mathrm{LSTM}}(c_i)$, $\overleftarrow{h}_i = \overleftarrow{\mathrm{LSTM}}(c_i)$;
similar to the word coding layer, the forward hidden state $\overrightarrow{h}_i$ and the backward hidden state $\overleftarrow{h}_i$ are concatenated to obtain the final hidden-state representation $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$ of each clause $c_i$ in the review text, which fuses the information of all the surrounding clauses;
the hidden states $h_i$ of all clauses $c_i$ in the review text are averaged by the mean-pooling layer to obtain the final representation of the review text: $s = \frac{1}{C} \sum_{i=1}^{C} h_i$;
A third softmax layer: the final representation s of the review text is fed into a softmax classifier to finally obtain the class probability distribution of the review text for the given attribute:
$o = W_l \cdot s + b_l$
the probability that a given sentence belongs to each category $k \in [1, K]$ is computed as
$P(y = k \mid s; \theta) = \frac{\exp(o_k)}{\sum_{k'=1}^{K} \exp(o_{k'})}$
where $\theta$ denotes all the parameters; the class label with the highest probability according to this formula is taken as the final class label of the review text.
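Putting the three layers together, the model can be sketched end to end; this is a minimal illustration in the modern Keras API (the experiments below used Keras 1.2.2 with a Theano backend, whose API differs), with the clause count C = 4 from the segmentation algorithm and assumed example values for the words per clause, the input dimensionality D = d + d', and the number of classes:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

C, N = 4, 30      # maxnum clauses; assumed max words per clause
D, H = 400, 300   # D = d + d' (illustrative), H = hidden size per direction
K = 3             # positive / neutral / negative

# Word coding layer: Bi-LSTM over the words of one clause, then mean pooling
# (c_i = mean over j of h_ij).
word_in = layers.Input(shape=(N, D))
h_words = layers.Bidirectional(
    layers.LSTM(H, return_sequences=True, dropout=0.25))(word_in)
clause_vec = layers.GlobalAveragePooling1D()(h_words)
word_encoder = models.Model(word_in, clause_vec)

# Clause coding layer: the word encoder is applied to every clause, and a
# second Bi-LSTM encodes the clause vectors; mean pooling gives s.
text_in = layers.Input(shape=(C, N, D))
clauses = layers.TimeDistributed(word_encoder)(text_in)      # (C, 2H)
h_clauses = layers.Bidirectional(
    layers.LSTM(H, return_sequences=True, dropout=0.25))(clauses)
s = layers.GlobalAveragePooling1D()(h_clauses)

# Softmax layer: o = W_l . s + b_l followed by softmax normalization.
out = layers.Dense(K, activation="softmax",
                   kernel_regularizer=regularizers.l2(1e-5))(s)

model = models.Model(text_in, out)
model.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.1),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```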
Verification:
To verify the advantages of the proposed deep neural network model over other sentiment classification algorithms, a series of comparison experiments was carried out:
The experimental environment comprises hardware and software:
the hardware configuration used to train the model is an Intel Xeon 2.5 GHz CPU with 4 cores and 8 GB of memory;
the software configuration comprises the Windows 10 operating system and the machine-learning front-end library Keras 1.2.2 with a Theano 0.8.2 backend, based on Python 2.7 and several scientific computing libraries;
the experimental procedure mainly includes three aspects:
1) data preparation
The invention is evaluated on the two data sets of the SemEval Task 12 benchmark (the Laptop and Restaurant domains) to verify the effectiveness of the proposed method; each data set consists of a number of user reviews, and each review contains a list of attributes together with the sentiment polarity corresponding to each attribute, where the polarities are positive, neutral, and negative; the data distribution of the two domains is shown in Table 2;
in addition, 10% of the data is randomly selected from the training set as a development data set for tuning the algorithm parameters, and GloVe is selected as the pre-trained word vectors.
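A hedged sketch of building the embedding matrix from pre-trained GloVe vectors, with unknown words initialized from U(-0.01, 0.01) as stated above; the file name is an assumption, since the text does not say which GloVe release was used.

```python
import numpy as np

def load_embeddings(vocab, path="glove.840B.300d.txt", d=300, seed=0):
    """Build the embedding matrix E_w: GloVe vectors for known words,
    U(-0.01, 0.01) samples for unknown words.  `vocab` maps word -> row
    index; the file path is an assumption, not taken from the patent."""
    rng = np.random.default_rng(seed)
    E = rng.uniform(-0.01, 0.01, size=(len(vocab), d))   # unknown-word init
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            if word in vocab:
                E[vocab[word]] = np.asarray(values, dtype=np.float32)
    return E
```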
TABLE 2 Data set distribution for the Restaurant and Laptop domains
2) Model training
The method adopts the cross-entropy loss function to train the high-dimensional-representation-based specific-attribute emotion classification model in an end-to-end manner. Training data $(x_t, a_t, y_t)$ are given, where $x_t$ denotes the t-th sample to be predicted, $a_t$ the attribute present in the sample, and $y_t$ the true category label of sample $x_t$ for the specific attribute $a_t$;
the high-dimensional-representation-based specific-attribute emotion classification model is regarded as a black-box function whose output is a vector of the probabilities that the input text belongs to each class label; the goal of training is to minimize the loss function
$\text{loss} = -\sum_{m=1}^{M}\sum_{k=1}^{K} y_k^{(m)} \log \hat{y}_k^{(m)} + \lambda \lVert \theta \rVert_2^2$
where M denotes the number of training samples, K the number of class labels, $\hat{y}^{(m)}$ the predicted class distribution of the m-th sample, and $\lambda$ the $L_2$-regularization weight of the parameters;
adagrad is adopted as an optimization function, and parameters of all matrixes and vectors are uniformly distributed Wherein r and c' are the number of rows and columns in the matrix;
and in order to avoid overfitting during training, the Dropout strategy is adopted in Bi-LSTM.
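A sketch of the training loop under the stated setup, reusing `model` from the Keras sketch above; the toy tensors only stand in for the real preprocessed inputs, and the batch size and epoch count are assumptions (the text does not give them).

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
# Toy stand-in tensors with the shapes the preprocessing would produce
# (C = 4 clauses, N = 30 words, D = d + d' = 400, K = 3 classes); real
# inputs come from clause segmentation + attribute-extended embedding.
X_train = rng.standard_normal((64, 4, 30, 400)).astype("float32")
y_train = tf.keras.utils.to_categorical(rng.integers(0, 3, size=64), 3)
X_dev, y_dev = X_train[:8], y_train[:8]   # stand-in for the 10% dev split

# `model` is the hierarchical Bi-LSTM from the Keras sketch above.
model.fit(X_train, y_train,
          validation_data=(X_dev, y_dev),
          batch_size=32,                  # assumed; not stated in the text
          epochs=2)                       # assumed; epoch count not stated
```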
3) Results of the experiment
The deep neural network model is compared with several baseline methods to comprehensively evaluate its performance:
both the baseline methods and the method of this scheme use GloVe word vectors during training;
the baseline methods are as follows:
1) Majority algorithm (Majority): this basic baseline assigns, for a given attribute, the majority sentiment polarity observed in the training set to each test sample;
2) Long short-term memory network (LSTM): this method uses a single LSTM to model the context and obtain the hidden representation of each word; the average of all hidden representations is taken as the final representation of the input and fed into the softmax layer to obtain the predicted probability of each label;
3) Target-connection long short-term memory network (TC-LSTM): this method extends the basic LSTM by using two LSTMs, a forward one and a backward one, for the attribute information; the model blends the attribute information into the sentence representation, and the two resulting attribute-aware representations are finally concatenated for attribute-specific sentiment polarity prediction;
4) Attention-based long short-term memory network with aspect embedding (ATAE-LSTM): this method models the context words with an LSTM and appends the attribute vector to each word vector;
5) Interactive attention network (IAN): this interactive learning method first models the context and the attribute with LSTMs and then interactively learns attention representations over both;
The method proposed in this scheme is a hierarchical bidirectional long short-term memory network (Hierarchical Bi-LSTM): a multi-layer Bi-LSTM that constructs a multi-level, high-dimensional deep neural network model from the three granularities of words, clauses, and sentences, using the review text and its attribute information. First, a sentence is divided into several clauses with the clause segmentation algorithm; then, all clauses are encoded with multiple Bi-LSTM networks; finally, the clause representations are encoded with another Bi-LSTM, and the probability that the review text belongs to each category with respect to the specific attribute is obtained through the softmax layer.
TABLE 3 Performance comparison of different attribute-level sentiment classification methods on plain text
Referring to Table 3, which shows the performance comparison between this scheme and the other baseline methods:
as can be observed from Table 3, the Majority algorithm performs worst; the classifiers it builds achieve classification accuracies of only 53.7% in the Restaurant domain and 57.0% in the Laptop domain;
all the other methods are implemented on the basis of the LSTM neural network model and outperform the Majority algorithm; the experimental results show that the LSTM model not only has the potential to generate representations automatically but also brings performance gains for attribute-level sentiment classification;
in addition, Table 3 shows that the classification accuracies of TC-LSTM, ATAE-LSTM, and IAN are all better than that of the plain LSTM; this result demonstrates that taking attribute information into account when classifying sentiment for a specific attribute helps improve classification performance;
finally, the Hierarchical Bi-LSTM method proposed by the invention outperforms all of the aforementioned methods, which highlights the benefit of using clause information.
In summary, the method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation presented in this embodiment constructs a multi-level, high-dimensional deep neural network model from the three granularities of words, clauses, and sentences, using the review text and its attribute information, and thereby achieves better classification performance.
Claims (5)
1. A method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation, characterized in that the method comprises a training stage and a testing stage, with the following specific steps:
a training stage:
S1) A sentence is segmented into several clauses with the clause segmentation algorithm, and each word in a clause is expressed as a word vector; the concatenation of each word vector with the attribute word vector serves as the input of the deep neural network model; all unknown words are initialized by random sampling from the uniform distribution U(-0.01, 0.01), the dimensionalities of the word vectors and of the bidirectional long short-term memory (Bi-LSTM) networks are set to 300, and the remaining hyperparameters are tuned accordingly on the development data set, yielding a trained deep neural network model;
S2) The deep neural network model comprises a 3-layer architecture consisting of a word coding layer, a clause coding layer, and a softmax layer; the word coding layer captures the relevance of each word in a clause to the specific attribute, the clause coding layer maps the specific attribute into the clauses, and the softmax layer feeds the final representation s of the review text into a softmax classifier to finally obtain the class probability distribution of the review text for the given attribute;
S3) The input word sequence of the deep neural network model consists of (d + d')-dimensional word vectors, where d denotes the dimensionality of the word vectors and d' the dimensionality of the attribute word vectors; the value of d can be adjusted according to the experimental setting;
S4) For the training loss of the model, the cross-entropy loss function is adopted to train the high-dimensional-representation-based specific-attribute emotion classification model in an end-to-end manner;
S5) Training data $(x_t, a_t, y_t)$ are given, where $x_t$ denotes the t-th sample to be predicted, $a_t$ the attribute present in the sample, and $y_t$ the true category label of sample $x_t$ for the specific attribute $a_t$;
S6) The high-dimensional-representation-based specific-attribute emotion classification model is regarded as a black-box function whose output is a vector of the probabilities that the input text belongs to each class label; the goal of training is to minimize the loss function
$\text{loss} = -\sum_{m=1}^{M}\sum_{k=1}^{K} y_k^{(m)} \log \hat{y}_k^{(m)} + \lambda \lVert \theta \rVert_2^2$
where M denotes the number of training samples, K the number of class labels, $\hat{y}^{(m)}$ the predicted class distribution of the m-th sample, and $\lambda$ the $L_2$-regularization weight of the parameters;
S7) The Adagrad optimizer is adopted, and the parameters of all matrices and vectors are initialized by random sampling from the uniform distribution $U\!\left(-\sqrt{6/(r+c')},\, \sqrt{6/(r+c')}\right)$, where r and c' are the number of rows and columns of the matrix; during training, a Dropout strategy is applied inside the Bi-LSTM to avoid overfitting;
A testing stage:
S8) The review text to be processed is input into the trained deep neural network model to obtain its sentiment polarity with respect to the specific attribute.
2. The method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation according to claim 1, characterized in that: the clause segmentation algorithm segments sentences at punctuation marks and connective words: a minnum parameter limits the minimum number of words a clause must contain, and a candidate segment is split off as a clause if and only if it contains at least minnum words;
in addition, a maxnum parameter ensures that every sentence is cut into the same number of clauses, because the subsequent neural network requires a fixed number of clauses as input;
the delimiters comprise the punctuation marks "," and ";" and the connective words "and", "but", "so", "especially", "however", "though", and "except".
3. The method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation according to claim 2, characterized in that: the remaining hyperparameters are tuned on the development data set; specifically, the initial learning rate is set to 0.1, the regularization weight of the parameters is set to $10^{-5}$, and the Dropout rate is set to 0.25.
4. The method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation according to claim 3, characterized in that: in the clause segmentation algorithm the parameter minnum is set to 3 and the parameter maxnum is set to 4, so that all plausible clauses can be mined from the sentences and the model achieves its best performance on the development data set.
5. The method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation according to claim 4, characterized in that: the high-dimensional-representation-based bidirectional long short-term memory network model composed of the word coding layer, the clause coding layer, and the softmax layer proceeds as follows:
A first word coding layer: assume the review text contains C clauses in total, where $c_i$ denotes the i-th clause, each clause contains $N_i$ words, and $I_{ij}$ denotes the word appearing at the j-th position of the i-th clause, with $j \in [1, N_i]$;
each word appearing in clause $c_i$ is represented by a word vector $w_{ij} = E_w \cdot I_{ij}$, $j \in [1, N_i]$; the word vectors are stored in a word-embedding matrix $E_w \in \mathbb{R}^{d \times |V|}$, where d denotes the dimensionality of the word vectors and V denotes the vocabulary;
the aspect category that appears consists of two parts, an entity and a feature (attribute):
specifically, assume an entity string $e_1$ of length $L_1$, expressed as $e_1 = \{v^e_1, v^e_2, \ldots, v^e_{L_1}\}$, where $v^e_n$ denotes the d'-dimensional vector representation of the n-th word in the entity string;
a word-vector representation usually has a linear structure, which gives it additive and subtractive properties at the semantic level, so the meanings of words can be combined by element-wise addition of their vectors;
the entity word vector and the feature word vector are therefore added to obtain the final representation of the attribute word vector: $v_a = v_e + v_f$;
then, the attribute word vector is appended to each word-vector representation to obtain the attribute-extended representation of each word: $x_{ij} = w_{ij} \oplus v_a$;
in the above formula the dimensionality of $x_{ij}$ is (d + d'), $i \in [1, C]$, $j \in [1, N_i]$, $\oplus$ denotes the vector-concatenation operator, C denotes the number of clauses, and $N_i$ denotes the number of words contained in clause $c_i$;
the obtained word vectors $x_{ij}$ are taken as input, and a bidirectional long short-term memory network (Bi-LSTM) integrates the information of each word in the forward and backward directions, converting the input word-vector matrix into a new representation;
a Bi-LSTM presents each training sequence forwards and backwards to two separate long short-term memory networks (LSTMs), both of which are connected to the same output layer;
this structure provides complete past and future context information for every point of the input sequence;
the forward LSTM contained in the Bi-LSTM, denoted $\overrightarrow{\mathrm{LSTM}}$, reads clause $c_i$ from front to back, i.e. from $I_{i,1}$ to $I_{i,N_i}$; the corresponding backward LSTM, denoted $\overleftarrow{\mathrm{LSTM}}$, reads clause $c_i$ from back to front, i.e. from $I_{i,N_i}$ to $I_{i,1}$: $\overrightarrow{h}_{ij} = \overrightarrow{\mathrm{LSTM}}(x_{ij})$, $\overleftarrow{h}_{ij} = \overleftarrow{\mathrm{LSTM}}(x_{ij})$;
the forward hidden state $\overrightarrow{h}_{ij}$ and the backward hidden state $\overleftarrow{h}_{ij}$ are concatenated to obtain the final hidden-state representation $h_{ij} = [\overrightarrow{h}_{ij}; \overleftarrow{h}_{ij}]$ of each word $I_{ij}$ in the clause, which fuses the information of all the words surrounding $I_{ij}$ in the clause;
finally, the hidden states $h_{ij}$ of all words $I_{ij}$ in the clause are averaged by a mean-pooling layer to obtain the final representation of the clause: $c_i = \frac{1}{N_i} \sum_{j=1}^{N_i} h_{ij}$;
A second clause coding layer: the clause vectors $c_i$ obtained in the previous step are again encoded with a Bi-LSTM to fuse context information: $\overrightarrow{h}_i = \overrightarrow{\mathrm{LSTM}}(c_i)$, $\overleftarrow{h}_i = \overleftarrow{\mathrm{LSTM}}(c_i)$;
similar to the word coding layer, the forward hidden state $\overrightarrow{h}_i$ and the backward hidden state $\overleftarrow{h}_i$ are concatenated to obtain the final hidden-state representation $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$ of each clause $c_i$ in the review text, which fuses the information of all the surrounding clauses;
the hidden states $h_i$ of all clauses $c_i$ in the review text are averaged by the mean-pooling layer to obtain the final representation of the review text: $s = \frac{1}{C} \sum_{i=1}^{C} h_i$;
A third softmax layer: the final representation s of the review text is fed into a softmax classifier to finally obtain the class probability distribution of the review text for the given attribute:
$o = W_l \cdot s + b_l$
the probability that a given sentence belongs to each category $k \in [1, K]$ is computed as
$P(y = k \mid s; \theta) = \frac{\exp(o_k)}{\sum_{k'=1}^{K} \exp(o_{k'})}$
where $\theta$ denotes all the parameters; the class label with the highest probability according to this formula is taken as the final class label of the review text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201810754022.5A | 2018-07-10 | 2018-07-10 | Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation (granted as CN108984724B)
Publications (2)
Publication Number | Publication Date
---|---
CN108984724A | 2018-12-11
CN108984724B | 2021-09-28
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant