CN108984724B - Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation - Google Patents

Info

Publication number
CN108984724B
CN108984724B (application CN201810754022.5A)
Authority
CN
China
Prior art keywords: word, clause, attribute, representation, layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810754022.5A
Other languages
Chinese (zh)
Other versions
CN108984724A (en)
Inventor
谢珏
吴含前
李露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kaier Bote Information Technology Kunshan Co ltd
Original Assignee
Kaier Bote Information Technology Kunshan Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kaier Bote Information Technology Kunshan Co ltd filed Critical Kaier Bote Information Technology Kunshan Co ltd
Priority to CN201810754022.5A priority Critical patent/CN108984724B/en
Publication of CN108984724A publication Critical patent/CN108984724A/en
Application granted granted Critical
Publication of CN108984724B publication Critical patent/CN108984724B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The invention discloses a method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation. First, the invention provides a clause segmentation algorithm that segments a comment text into several clauses; second, the words in each clause are encoded with multiple bidirectional long short-term memory networks to obtain a representation of each clause; finally, another bidirectional long short-term memory network encodes the clause representations obtained in the previous step to obtain the final representation of the whole sentence. In this way, information more relevant to the specific attribute is captured from three different dimensions (words, clauses and sentences), which ultimately improves the accuracy of emotion classification for the specific attribute.

Description

Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation
Technical Field
The invention relates to an emotion analysis method for comment texts, and in particular to a method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation.
Background
To obtain the emotion polarity of each attribute in a comment text, sentiment analysis (SA) techniques identify the attribute words, emotion words and emotion modifiers in the text and analyze them further to judge the emotion polarity the comment text expresses toward a specific attribute. Such analysis is applied in fields such as event analysis, online public opinion analysis and spam processing.
When judging the emotion polarity of a comment text, traditional coarse-grained emotion analysis methods process the whole text as a single unit and cannot make fine-grained polarity judgments for the specific attributes mentioned in it. Recent emotion analysis research has therefore turned toward fine granularity, which has become a hot research topic at home and abroad.
Currently, deep neural network (DNN) techniques are used to perform emotion analysis on specific attributes in text. For the problem of classifying the sentiment of a sentence with respect to a specific attribute, Tang et al. propose a Target-Dependent Long Short-Term Memory network (TD-LSTM) and a Target-Connection Long Short-Term Memory network (TC-LSTM) in "Target-Dependent Sentiment Classification with Long Short Term Memory". TD-LSTM takes the target information into account when generating the sentence representation, and TC-LSTM further associates the target information with its context; this method takes the average of the word vectors in the target phrase as the target vector. However, simply averaging the word vectors in the target phrase cannot fully express the semantics of the target phrase, so optimal results cannot be obtained. Dong et al. propose an Adaptive Recursive Neural Network (AdaRNN) in "Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification", which adaptively propagates the emotion words to the specific attribute according to the context and the syntactic relations between the emotion words and that attribute. The method converts the dependency tree of a sentence into a recursive structure for the specific attribute and obtains a higher-level representation based on this structure. Experiments show that classifiers built on AdaRNN outperform traditional machine learning methods and basic recurrent neural network methods, but their classification performance still leaves room for improvement.
Disclosure of Invention
The purpose of the invention is as follows: to address the defects of the prior art, the invention provides a method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation.
The technical scheme is as follows:
A method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation comprises a training stage and a testing stage; the specific steps are as follows:
a training stage:
S1) a sentence is segmented into several clauses by the clause segmentation algorithm, and each word in the clauses is expressed as a word vector; the concatenation of each word vector with the attribute word vector is used as the input of the deep neural network model; all unknown words are initialized by random sampling from the uniform distribution U(-0.01, 0.01); the dimensionality of the word vectors and of the bidirectional long short-term memory network is set to 300, and the other hyperparameters are tuned on a development data set, yielding a trained deep neural network model;
S2) the deep neural network model comprises a 3-layer architecture consisting of a word coding layer, a clause coding layer and a softmax layer: the word coding layer captures the relevance of each word in a clause to the specific attribute, the clause coding layer maps the specific attribute into the clauses, and the softmax layer inputs the final representation s of the comment text into a softmax classifier, finally obtaining the class probability distribution of the comment text for the given attribute;
S3) the input word sequence of the deep neural network model consists of (d + d')-dimensional word vectors, where d denotes the dimension of the word vector and d' denotes the dimension of the attribute word vector; the value of d can be adjusted according to the experimental setting;
S4) for the training loss function of the model, a cross-entropy loss function is adopted to train the high-dimensional-representation-based attribute-specific emotion classification model in an end-to-end manner;
S5) given training data (x_t, a_t, y_t), x_t denotes the t-th sample to be predicted, a_t denotes the attribute present in that sample, and y_t denotes the true category label of sample x_t for the specific attribute a_t;
S6) the high-dimensional-representation-based attribute-specific emotion classification model is regarded as a black-box function f(x_t, a_t), the output of which is a vector representing the probability that the input text belongs to each class label; the goal of training is to minimize the loss function:

loss = -∑_{t=1}^{M} ∑_{k=1}^{K} y^k_t · log f_k(x_t, a_t) + λ‖θ‖²

where M denotes the number of training samples, K denotes the number of class labels, and λ denotes the weight of the L2 regularization term over the model parameters θ;
S7) the Adagrad optimization function is adopted, and the parameters of all matrices and vectors are initialized by sampling from a uniform distribution whose bounds depend on r and c', the number of rows and columns of the matrix; in the training process, a Dropout strategy is adopted in the Bi-LSTM to avoid overfitting;
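The text specifies only that the bounds of the uniform distribution depend on r and c'; the sketch below assumes the common Glorot-style bound √(6/(r + c')), which should be read as an assumption rather than the patented constant:

import numpy as np

def init_matrix(r, c):
    """Initialize an r x c parameter matrix from a shape-dependent uniform
    distribution (Glorot-style bound assumed here)."""
    bound = np.sqrt(6.0 / (r + c))
    return np.random.uniform(-bound, bound, size=(r, c))

def init_unknown_word(dim=300):
    """Unknown words are sampled from U(-0.01, 0.01), as specified in step S1)."""
    return np.random.uniform(-0.01, 0.01, size=dim)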
Testing stage:
S8) the comment text to be processed is input into the trained deep neural network model to obtain the emotion polarity of the comment text for the specific attribute.
Further, the clause segmentation algorithm segments sentences at punctuation marks and connection words (collectively referred to as separators): a minnum parameter is defined to limit the minimum number of words a clause must contain, and a segment is split off as a clause if and only if its length reaches minnum;
in addition, a maxnum parameter is defined to ensure that every sentence is cut into the same number of clauses, because the subsequent neural network requires a fixed number of clauses as input;
the separators include the punctuation marks and connection words ",", ";", "and", "but", "so", "especially", "however", "then", "though", "except".
Further, the other hyperparameters are tuned on the development data set: specifically, the initial value of the learning rate is set to 0.1, the regularization weight of the parameters is set to 10^-5, and the Dropout rate is set to 0.25.
Further, in the clause segmentation algorithm, the parameter minnum is set to 3 and the parameter maxnum is set to 4, so that all possible clauses can be mined from the sentence and the model achieves the best performance on the development data set.
Further, the specific process of the high-dimensional-representation-based bidirectional long short-term memory network model composed of a word coding layer, a clause coding layer and a softmax layer is as follows:
First, the word coding layer: assume that the comment text contains C clauses in total, where c_i denotes the i-th clause, each clause contains N_i words, and I_ij denotes the word appearing at the j-th position in the i-th clause, with j ∈ [1, N_i];
each word appearing in clause c_i is represented by a word vector w_ij ∈ R^d, where j ∈ [1, N_i]; the word vectors w_ij = E_w · I_ij are all stored in a word-embedding matrix E_w ∈ R^(d×|V|), where d denotes the dimension of the word vector and V denotes the vocabulary;
The attribute category (aspect category) that appears consists of two parts, an entity and a feature (attribute):
specifically, assume an entity string e_1 of length L_1, expressed as the vector sequence w^e_1, …, w^e_{L_1}, where w^e_n ∈ R^(d') denotes the d'-dimensional vector representation of the n-th word in the entity string;
correspondingly, the feature string is represented in the same way as a sequence of d'-dimensional word vectors w^f_n;
Usually, a word vector representation has a linear structure, which makes it have an overlapping or subtractive property at a semantic level, so that the purpose of combining words can be achieved by adding elements of the word vector;
the entity word vectors and the feature word vectors are therefore added to obtain the final representation of the attribute word vector:

v_a = ∑_n w^e_n + ∑_n w^f_n
then the attribute word vector is appended to each word-vector representation to obtain the attribute-extended representation of each word:

w'_ij = w_ij ⊕ v_a

where the dimension of w'_ij is (d + d'), i ∈ [1, C], j ∈ [1, N_i], ⊕ denotes the vector concatenation operator, C denotes the number of clauses, and N_i denotes the number of words contained in clause c_i;
Taking the obtained word vectors w'_ij as input, a bidirectional long short-term memory network (Bi-LSTM) is adopted to integrate the information of each word in the forward and backward directions, so that the input word-vector matrix is converted into a new representation:
Bi-LSTM means that each training sequence is presented forwards and backwards to two separate long short-term memory networks (LSTMs), both of which are connected to the same output layer;
this structure provides complete past and future context information for each point in the input sequence;
the forward LSTM contained in the Bi-LSTM, denoted →LSTM, reads clause c_i from I_{i,1} to I_{i,N_i}, i.e. from front to back; the corresponding backward LSTM, denoted ←LSTM, reads clause c_i from I_{i,N_i} to I_{i,1}, i.e. from back to front:

→h_ij = →LSTM(w'_ij)
←h_ij = ←LSTM(w'_ij)
The forward hidden state →h_ij and the backward hidden state ←h_ij are concatenated to obtain the final hidden-state representation of each word I_ij in the clause, which fuses the contextual information of the clause related to word I_ij:

h_ij = [→h_ij ; ←h_ij]

Finally, the hidden states h_ij of all words I_ij in the clause are average-pooled by the Mean-Pooling layer to obtain the final representation of the clause:

c_i = (1/N_i) ∑_{j=1}^{N_i} h_ij
Second, the clause coding layer: for the clause vectors c_i obtained in the previous step, a Bi-LSTM is again used to encode the given clause vectors and fuse their context information:

→h_i = →LSTM(c_i)
←h_i = ←LSTM(c_i)

Similar to the word coding layer, the forward hidden state →h_i and the backward hidden state ←h_i are concatenated to obtain the final hidden-state representation of each clause c_i in the comment text, which fuses the related information of the other clauses in the comment text:

h_i = [→h_i ; ←h_i]

The hidden states h_i of all clauses c_i in the comment text are then average-pooled by the Mean-Pooling layer to obtain the final representation of the comment text:

s = (1/C) ∑_{i=1}^{C} h_i
Third, the softmax layer: the final representation s of the comment text is input into a softmax classifier, finally obtaining the class probability distribution of the comment text for the given attribute:

o = W_l · s + b_l

where o ∈ R^K denotes the output, W_l denotes a weight matrix, and b_l denotes a bias;
the probability that a given sentence belongs to each category k ∈ [1, K] is computed as:

P(y = k | s; θ) = exp(o_k) / ∑_{k'=1}^{K} exp(o_{k'})

where θ denotes all model parameters; the class label with the highest probability computed by this formula is taken as the final class label of the comment text.
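A minimal NumPy sketch of this classification step (names illustrative):

import numpy as np

def softmax_classify(s, W_l, b_l):
    """Map the comment representation s to class probabilities.

    s:   final representation of the comment text
    W_l: (K, dim(s)) weight matrix;  b_l: (K,) bias
    Returns the index of the most probable class and the distribution.
    """
    o = W_l @ s + b_l
    e = np.exp(o - np.max(o))          # subtract max for numerical stability
    probs = e / e.sum()
    return int(np.argmax(probs)), probs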
Compared with the prior art, the method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation provided by the invention constructs a multi-level, high-dimensional deep neural network model from the comment text and its specific attribute information across three different dimensions (words, clauses and sentences), thereby achieving better classification performance.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of a specific attribute emotion classification model architecture constructed by the present invention;
FIG. 3 is a restaurant domain review text example.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings.
This embodiment presents a method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation:
a bidirectional long short-term memory network model based on high-dimensional representation is constructed for specific attributes, and its use comprises a training stage and a testing stage:
in the training stage, the word coding layer captures the correlation between each word in the clauses and the specific attribute, and the clause coding layer maps the specific attribute into the clauses as input to the deep neural network model; all unknown words are initialized by random sampling from the uniform distribution U(-0.01, 0.01), the dimensionality of the word vectors and of the bidirectional long short-term memory network is set to 300, and the other hyperparameters are tuned on the development data set, yielding the trained deep neural network model;
in the testing stage, the comment text to be processed is input into the trained deep neural network model to obtain the emotion polarity of the comment text for the specific attribute;
when judging the emotion polarity expressed by the comment text toward a specific attribute, two observations matter:
on the one hand, not all components of the comment text are relevant to the specific attribute;
on the other hand, a comment text may contain several attributes, and different attributes may need information from different parts of the sentence for emotion classification;
therefore, referring to FIG. 1, the clause segmentation algorithm proposed in this embodiment segments a sentence into different clauses so that specific attributes can be mapped into the clauses.
The basic idea of the clause segmentation algorithm proposed in this embodiment is to segment sentences at punctuation marks and conjunctions (collectively referred to as separators):
as shown in FIG. 3, "great and tasty" should not be divided into two clauses at the conjunction "and", so not all separators can serve as clause boundaries;
the scheme therefore defines a minnum parameter to limit the minimum number of words a clause must contain, and a segment is split off as a clause only when its length reaches minnum;
in addition, a maxnum parameter ensures that every sentence is cut into the same number of clauses, because the subsequent neural network requires a fixed number of clauses as input;
the clause segmentation method is detailed in Table 1, where the separators include the punctuation marks and connection words ",", ";", "and", "but", "so", "especially", "however", "then", "though", "except".
TABLE 1 Clause segmentation algorithm (pseudo-code reproduced as an image in the original publication)
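Since the pseudo-code of Table 1 survives only as an image, the following Python sketch gives one plausible reading of the algorithm described above; the handling of too-short trailing segments and the merging of surplus clauses are assumptions:

SEPARATORS = {",", ";", "and", "but", "so", "especially", "however",
              "then", "though", "except"}

def segment_clauses(tokens, minnum=3, maxnum=4):
    """Split a tokenized sentence into clauses at separator tokens.

    A separator only closes a clause when the segment accumulated so far
    contains at least `minnum` words; otherwise the words keep accumulating
    across the separator. The result is then merged/padded so that every
    sentence yields exactly `maxnum` clauses, because the downstream
    network requires a fixed number of clauses as input.
    """
    clauses, current = [], []
    for tok in tokens:
        if tok.lower() in SEPARATORS and len(current) >= minnum:
            clauses.append(current)
            current = []
        elif tok.lower() not in SEPARATORS:
            current.append(tok)
    if current:
        if clauses and len(current) < minnum:
            clauses[-1].extend(current)      # merge a short tail into the last clause
        else:
            clauses.append(current)
    while len(clauses) > maxnum:             # merge surplus clauses
        clauses[-2].extend(clauses.pop())
    while len(clauses) < maxnum:             # pad so every sentence has maxnum clauses
        clauses.append([])
    return clauses

# "great" and "tasty" stay in the same clause because the segment before
# "and" is shorter than minnum (punctuation must be tokenized separately):
print(segment_clauses("great and tasty but too expensive , i will not return".split()))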
Referring to FIG. 2, the bidirectional long short-term memory network model based on high-dimensional representation constructed for specific attributes comprises a 3-layer architecture of a word coding layer, a clause coding layer and a softmax layer:
the word coding layer captures the relevance of each word in the clauses to the specific attribute;
the clause coding layer maps the specific attribute into the clauses;
the softmax layer inputs the final representation s of the comment text into a softmax classifier, finally obtaining the class probability distribution of the comment text for the given attribute;
the cross-entropy loss function is selected as the training loss function of the model, and Adagrad is used as the optimization function:
The specific process of the high-dimensional-representation-based bidirectional long short-term memory network model composed of a word coding layer, a clause coding layer and a softmax layer is as follows:
First, the word coding layer: assume that the comment text contains C clauses in total, where c_i denotes the i-th clause, each clause contains N_i words, and I_ij denotes the word appearing at the j-th position in the i-th clause, with j ∈ [1, N_i];
each word appearing in clause c_i is represented by a word vector w_ij ∈ R^d, where j ∈ [1, N_i]; the word vectors w_ij = E_w · I_ij are all stored in a word-embedding matrix E_w ∈ R^(d×|V|), where d denotes the dimension of the word vector and V denotes the vocabulary;
The attribute category (aspect category) that appears consists of two parts, an entity and a feature (attribute):
specifically, assume an entity string e_1 of length L_1, expressed as the vector sequence w^e_1, …, w^e_{L_1}, where w^e_n ∈ R^(d') denotes the d'-dimensional vector representation of the n-th word in the entity string;
correspondingly, the feature string is represented in the same way as a sequence of d'-dimensional word vectors w^f_n;
Usually, a word vector representation has a linear structure, which makes it have an overlapping or subtractive property at a semantic level, so that the purpose of combining words can be achieved by adding elements of the word vector;
the entity word vectors and the feature word vectors are therefore added to obtain the final representation of the attribute word vector:

v_a = ∑_n w^e_n + ∑_n w^f_n
then the attribute word vector is appended to each word-vector representation to obtain the attribute-extended representation of each word:

w'_ij = w_ij ⊕ v_a

where the dimension of w'_ij is (d + d'), i ∈ [1, C], j ∈ [1, N_i], ⊕ denotes the vector concatenation operator, C denotes the number of clauses, and N_i denotes the number of words contained in clause c_i;
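A NumPy sketch of this attribute-extension step; the dimensions and the FOOD#QUALITY example are illustrative assumptions:

import numpy as np

d, d_prime = 300, 300      # word-vector and attribute-vector dimensions

def attribute_vector(entity_vecs, feature_vecs):
    """Sum the d'-dimensional entity and feature word vectors into a single
    attribute word vector v_a, exploiting additive compositionality."""
    return entity_vecs.sum(axis=0) + feature_vecs.sum(axis=0)

def extend_with_attribute(word_vecs, v_a):
    """Concatenate v_a onto every word vector of a clause, producing the
    (d + d')-dimensional attribute-extended representations w'_ij."""
    tiled = np.tile(v_a, (word_vecs.shape[0], 1))
    return np.concatenate([word_vecs, tiled], axis=1)

# e.g. attribute category FOOD#QUALITY over a 5-word clause:
entity, feature = np.random.randn(1, d_prime), np.random.randn(1, d_prime)
clause = np.random.randn(5, d)
w_prime = extend_with_attribute(clause, attribute_vector(entity, feature))
assert w_prime.shape == (5, d + d_prime)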
Taking the obtained word vectors w'_ij as input, a bidirectional long short-term memory network (Bi-LSTM) is adopted to integrate the information of each word in the forward and backward directions, so that the input word-vector matrix is converted into a new representation:
Bi-LSTM means that each training sequence is presented forwards and backwards to two separate long short-term memory networks (LSTMs), both of which are connected to the same output layer;
this structure provides complete past and future context information for each point in the input sequence;
the forward LSTM contained in the Bi-LSTM, denoted →LSTM, reads clause c_i from I_{i,1} to I_{i,N_i}, i.e. from front to back; the corresponding backward LSTM, denoted ←LSTM, reads clause c_i from I_{i,N_i} to I_{i,1}, i.e. from back to front:

→h_ij = →LSTM(w'_ij)
←h_ij = ←LSTM(w'_ij)
The forward hidden state →h_ij and the backward hidden state ←h_ij are concatenated to obtain the final hidden-state representation of each word I_ij in the clause, which fuses the contextual information of the clause related to word I_ij:

h_ij = [→h_ij ; ←h_ij]

Finally, the hidden states h_ij of all words I_ij in the clause are average-pooled by the Mean-Pooling layer to obtain the final representation of the clause:

c_i = (1/N_i) ∑_{j=1}^{N_i} h_ij
Second, the clause coding layer: for the clause vectors c_i obtained in the previous step, a Bi-LSTM is again used to encode the given clause vectors and fuse their context information:

→h_i = →LSTM(c_i)
←h_i = ←LSTM(c_i)

Similar to the word coding layer, the forward hidden state →h_i and the backward hidden state ←h_i are concatenated to obtain the final hidden-state representation of each clause c_i in the comment text, which fuses the related information of the other clauses in the comment text:

h_i = [→h_i ; ←h_i]

The hidden states h_i of all clauses c_i in the comment text are then average-pooled by the Mean-Pooling layer to obtain the final representation of the comment text:

s = (1/C) ∑_{i=1}^{C} h_i
Third, the softmax layer: the final representation s of the comment text is input into a softmax classifier, finally obtaining the class probability distribution of the comment text for the given attribute:

o = W_l · s + b_l

where o ∈ R^K denotes the output, W_l denotes a weight matrix, and b_l denotes a bias;
the probability that a given sentence belongs to each category k ∈ [1, K] is computed as:

P(y = k | s; θ) = exp(o_k) / ∑_{k'=1}^{K} exp(o_{k'})

where θ denotes all model parameters; the class label with the highest probability computed by this formula is taken as the final class label of the comment text.
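The three-layer architecture just described can be sketched as follows; this uses the modern Keras API rather than the keras-1.2.2/Theano stack of the verification experiments, and the per-direction hidden size, padding lengths and three-class output are assumptions for illustration:

from tensorflow.keras import layers, models

C, N = 4, 20          # maxnum clauses per sentence, padded words per clause
DIM = 600             # d + d': attribute-extended word-vector dimension
HID, K = 300, 3       # Bi-LSTM hidden size per direction; number of class labels

inputs = layers.Input(shape=(C, N, DIM))

# word coding layer: one Bi-LSTM applied to every clause, then mean pooling
word_encoder = layers.Bidirectional(
    layers.LSTM(HID, return_sequences=True, dropout=0.25))
h_words = layers.TimeDistributed(word_encoder)(inputs)            # (C, N, 2*HID)
clause_vecs = layers.TimeDistributed(
    layers.GlobalAveragePooling1D())(h_words)                     # (C, 2*HID)

# clause coding layer: a second Bi-LSTM over clause vectors, then mean pooling
h_clauses = layers.Bidirectional(
    layers.LSTM(HID, return_sequences=True, dropout=0.25))(clause_vecs)
sentence_vec = layers.GlobalAveragePooling1D()(h_clauses)         # (2*HID,)

# softmax layer: o = W_l . s + b_l followed by softmax
outputs = layers.Dense(K, activation="softmax")(sentence_vec)

model = models.Model(inputs, outputs)
model.summary()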
Verification:
To verify the advantages of the deep neural network model provided by the invention over other emotion classification algorithms, a series of comparison experiments were carried out:
the experimental environment configuration comprises hardware and software:
the hardware used to train the model is an Intel Xeon 2.5 GHz CPU with 4 cores and 8 GB of memory;
the software comprises the Windows 10 operating system and the machine learning front-end library keras-1.2.2 with a theano-0.8.2 back end, based on Python 2.7 and several scientific computing libraries;
the experimental procedure mainly covers three aspects:
1) Data preparation
Experiments are carried out on two datasets (the Laptop domain and the Restaurant domain) of the semantic evaluation Task 12 to verify the effectiveness of the proposed method; each dataset consists of a number of user comments, and each comment includes a list of attributes and the emotion polarity corresponding to each attribute, where the polarities are positive, neutral and negative; the data distribution of the two domains is shown in Table 2;
in addition, 10% of the data is randomly selected from the training set as a development data set for tuning the algorithm parameters, and GloVe is selected as the pre-trained word vectors.
TABLE 2 Restaurant and Laptop domain dataset distribution (table reproduced as an image in the original publication)
2) Model training
A cross-entropy loss function is adopted to train the high-dimensional-representation-based attribute-specific emotion classification model in an end-to-end manner. Given training data (x_t, a_t, y_t), x_t denotes the t-th sample to be predicted, a_t denotes the attribute present in that sample, and y_t denotes the true category label of sample x_t for the specific attribute a_t;
The high-dimensional-representation-based attribute-specific emotion classification model is regarded as a black-box function f(x_t, a_t), the output of which is a vector representing the probability that the input text belongs to each class label, and the goal of training is to minimize the loss function:

loss = -∑_{t=1}^{M} ∑_{k=1}^{K} y^k_t · log f_k(x_t, a_t) + λ‖θ‖²

where M denotes the number of training samples, K denotes the number of class labels, and λ denotes the weight of the L2 regularization term over the model parameters θ;
adagrad is adopted as an optimization function, and parameters of all matrixes and vectors are uniformly distributed
Figure BDA0001725546690000122
Figure BDA0001725546690000123
Wherein r and c' are the number of rows and columns in the matrix;
and in order to avoid overfitting during training, the Dropout strategy is adopted in Bi-LSTM.
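A hedged sketch of this training configuration for the Keras model sketched earlier; the data arrays, batch size and epoch count are placeholders:

from tensorflow.keras.optimizers import Adagrad

# cross-entropy loss and Adagrad with the initial learning rate 0.1 from the
# text; the 1e-5 L2 weight would be attached per layer via kernel_regularizer.
model.compile(optimizer=Adagrad(learning_rate=0.1),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# x_train: (M, C, N, DIM) attribute-extended clauses; y_train: (M, K) one-hot
model.fit(x_train, y_train, validation_data=(x_dev, y_dev),
          batch_size=32, epochs=25)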
3) Experimental results
The deep neural network model is compared with baseline methods to comprehensively evaluate its performance:
both the baseline methods and the method proposed in this scheme use GloVe word vectors during training;
The baseline methods are as follows:
1) Majority algorithm (Majority): a basic baseline that assigns the majority emotion polarity observed in the training set to every test sample for the specific attribute;
2) long short-term memory network (LSTM): uses a single LSTM to model the context and obtain the hidden representation of each word; the average of all hidden representations is then taken as the final representation of the input and fed to the softmax layer to obtain the predicted probability of each label;
3) target-connection long short-term memory network (TC-LSTM): extends the basic LSTM with two LSTMs, one forward and one backward, for the attribute information; the model also blends the attribute information into the sentence representation and finally concatenates the two attribute-aware representations for emotion polarity prediction on the specific attribute;
4) attention-based long short-term memory network (ATAE-LSTM): models the context words with an LSTM and embeds the attribute vector into each word vector;
5) Interactive Attention Network (IAN): an interactive learning method that first models the context and the attribute with LSTMs and then interactively learns attention representations for both;
The method proposed in this scheme is a hierarchical bidirectional LSTM (Hierarchical Bi-LSTM): a multi-layer Bi-LSTM that constructs a multi-level, high-dimensional deep neural network model from the comment text and its specific attribute information across three dimensions of the high-dimensional representation (words, clauses and sentences): first, a sentence is segmented into several clauses by the clause segmentation algorithm; then all clauses are encoded with multiple bidirectional long short-term memory networks; finally, the clause representations are encoded with another bidirectional long short-term memory network, and the probability that the comment text belongs to each category for the specific attribute is obtained through the softmax layer.
TABLE 3 Performance comparison of different attribute-level emotion classification methods on plain text (table reproduced as an image in the original publication)
Table 3 shows the performance comparison between this scheme and the other baseline methods:
as observed from Table 3, the Majority algorithm performs worst; the classifiers it produces achieve classification accuracies of 53.7% and 57.0% in the Restaurant and Laptop domains respectively;
all the other methods are implemented on the basis of the LSTM neural network model and outperform the Majority algorithm; the experimental results show that the LSTM model not only has the potential to generate representations automatically but can also bring performance improvements to attribute-level emotion classification;
in addition, Table 3 shows that the classification accuracies of TC-LSTM, ATAE-LSTM and IAN are all better than that of LSTM, which demonstrates that taking attribute information into account when performing emotion classification for a specific attribute helps improve classification performance;
finally, the Hierarchical Bi-LSTM method proposed by the invention outperforms all the aforementioned methods, which highlights the superiority of using clause information.
In summary, the method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation provided in this embodiment constructs a multi-level, high-dimensional deep neural network model from the comment text and its specific attribute information across three different dimensions (words, clauses and sentences), thereby achieving better classification performance.

Claims (5)

1. A method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation, characterized in that the method comprises a training stage and a testing stage, with the following specific steps:
a training stage:
S1) a sentence is segmented into several clauses by the clause segmentation algorithm, and each word in the clauses is expressed as a word vector; the concatenation of each word vector with the attribute word vector is used as the input of the deep neural network model; all unknown words are initialized by random sampling from the uniform distribution U(-0.01, 0.01); the dimensionality of the word vectors and of the bidirectional long short-term memory network is set to 300, and the other hyperparameters are tuned on a development data set, yielding a trained deep neural network model;
S2) the deep neural network model comprises a 3-layer architecture consisting of a word coding layer, a clause coding layer and a softmax layer: the word coding layer captures the relevance of each word in a clause to the specific attribute, the clause coding layer maps the specific attribute into the clauses, and the softmax layer inputs the final representation s of the comment text into a softmax classifier, finally obtaining the class probability distribution of the comment text for the given attribute;
S3) the input word sequence of the deep neural network model consists of (d + d')-dimensional word vectors, where d denotes the dimension of the word vector and d' denotes the dimension of the attribute word vector; the value of d can be adjusted according to the experimental setting;
S4) a cross-entropy loss function is adopted to train the high-dimensional-representation-based attribute-specific emotion classification model in an end-to-end manner;
S5) given training data (x_t, a_t, y_t), x_t denotes the t-th sample to be predicted, a_t denotes the attribute present in that sample, and y_t denotes the true category label of sample x_t for the specific attribute a_t;
S6) the high-dimensional-representation-based attribute-specific emotion classification model is regarded as a black-box function f(x_t, a_t), the output of which is a vector representing the probability that the input text belongs to each class label; the goal of training is to minimize the loss function:

loss = -∑_{t=1}^{M} ∑_{k=1}^{K} y^k_t · log f_k(x_t, a_t) + λ‖θ‖²

where M denotes the number of training samples, K denotes the number of class labels, and λ denotes the weight of the L2 regularization term over the model parameters θ;
S7) the Adagrad optimization function is adopted, and the parameters of all matrices and vectors are initialized by sampling from a uniform distribution whose bounds depend on r and c', the number of rows and columns of the matrix; in the training process, a Dropout strategy is adopted in the Bi-LSTM to avoid overfitting;
Testing stage:
S8) the comment text to be processed is input into the trained deep neural network model to obtain the emotion polarity of the comment text for the specific attribute.
2. The method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation according to claim 1, characterized in that: the clause segmentation algorithm segments sentences at punctuation marks and connection words: a minnum parameter is defined to limit the minimum number of words a clause must contain, and a segment is split off as a clause if and only if its length reaches minnum;
in addition, a maxnum parameter is defined to ensure that every sentence is cut into the same number of clauses, because the subsequent neural network requires a fixed number of clauses as input;
the separators include the punctuation marks and connection words ",", ";", "and", "but", "so", "especially", "however", "then", "though", "except".
3. The method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation according to claim 2, characterized in that: the other hyperparameters are tuned on the development data set; specifically, the initial value of the learning rate is set to 0.1, the regularization weight of the parameters is set to 10^-5, and the Dropout rate is set to 0.25.
4. The method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation according to claim 3, characterized in that: in the clause segmentation algorithm, the parameter minnum is set to 3 and the parameter maxnum is set to 4, so that all possible clauses can be mined from the sentences and the model achieves the best performance on the development data set.
5. The method for improving the emotion classification accuracy of specific attributes by using high-dimensional representation according to claim 4, characterized in that: the specific process of the high-dimensional-representation-based bidirectional long short-term memory network model composed of a word coding layer, a clause coding layer and a softmax layer is as follows:
First, the word coding layer: assume that the comment text contains C clauses in total, where c_i denotes the i-th clause, each clause contains N_i words, and I_ij denotes the word appearing at the j-th position in the i-th clause, with j ∈ [1, N_i];
each word appearing in clause c_i is represented by a word vector w_ij ∈ R^d, where j ∈ [1, N_i]; the word vectors w_ij = E_w · I_ij are all stored in a word-embedding matrix E_w ∈ R^(d×|V|), where d denotes the dimension of the word vector and V denotes the vocabulary;
The attribute category (aspect category) that appears consists of two parts, an entity and a feature (attribute):
specifically, assume an entity string e_1 of length L_1, expressed as the vector sequence w^e_1, …, w^e_{L_1}, where w^e_n ∈ R^(d') denotes the d'-dimensional vector representation of the n-th word in the entity string;
correspondingly, the feature string is represented in the same way as a sequence of d'-dimensional word vectors w^f_n;
Usually, a word vector representation has a linear structure, which makes it have an overlap or subtraction characteristic at the semantic level, so that the purpose of combining words can be achieved by adding elements of the word vector;
the entity word vectors and the feature word vectors are therefore added to obtain the final representation of the attribute word vector:

v_a = ∑_n w^e_n + ∑_n w^f_n
then the attribute word vector is appended to each word-vector representation to obtain the attribute-extended representation of each word:

w'_ij = w_ij ⊕ v_a

where the dimension of w'_ij is (d + d'), i ∈ [1, C], j ∈ [1, N_i], ⊕ denotes the vector concatenation operator, C denotes the number of clauses, and N_i denotes the number of words contained in clause c_i;
Taking the obtained word vectors w'_ij as input, a bidirectional long short-term memory network (Bi-LSTM) is adopted to integrate the information of each word in the forward and backward directions, so that the input word-vector matrix is converted into a new representation:
Bi-LSTM means that each training sequence is presented forwards and backwards to two separate long short-term memory networks (LSTMs), both of which are connected to the same output layer;
this structure provides complete past and future context information for each point in the input sequence;
the forward LSTM contained in the Bi-LSTM, denoted →LSTM, reads clause c_i from I_{i,1} to I_{i,N_i}, i.e. from front to back; the corresponding backward LSTM, denoted ←LSTM, reads clause c_i from I_{i,N_i} to I_{i,1}, i.e. from back to front:

→h_ij = →LSTM(w'_ij)
←h_ij = ←LSTM(w'_ij)
The forward hidden state →h_ij and the backward hidden state ←h_ij are concatenated to obtain the final hidden-state representation of each word I_ij in the clause, which fuses the contextual information of the clause related to word I_ij:

h_ij = [→h_ij ; ←h_ij]

Finally, the hidden states h_ij of all words I_ij in the clause are average-pooled by the Mean-Pooling layer to obtain the final representation of the clause:

c_i = (1/N_i) ∑_{j=1}^{N_i} h_ij
Second, the clause coding layer: for the clause vectors c_i obtained in the previous step, a Bi-LSTM is again used to encode the given clause vectors and fuse their context information:

→h_i = →LSTM(c_i)
←h_i = ←LSTM(c_i)

Similar to the word coding layer, the forward hidden state →h_i and the backward hidden state ←h_i are concatenated to obtain the final hidden-state representation of each clause c_i in the comment text, which fuses the related information of the other clauses in the comment text:

h_i = [→h_i ; ←h_i]

The hidden states h_i of all clauses c_i in the comment text are then average-pooled by the Mean-Pooling layer to obtain the final representation of the comment text:

s = (1/C) ∑_{i=1}^{C} h_i
Third, the softmax layer: the final representation s of the comment text is input into a softmax classifier, finally obtaining the class probability distribution of the comment text for the given attribute:

o = W_l · s + b_l

where o ∈ R^K denotes the output, W_l denotes a weight matrix, and b_l denotes a bias;
the probability that a given sentence belongs to each category k ∈ [1, K] is computed as:

P(y = k | s; θ) = exp(o_k) / ∑_{k'=1}^{K} exp(o_{k'})

where θ denotes all model parameters; the class label with the highest probability computed by this formula is taken as the final class label of the comment text.
CN201810754022.5A 2018-07-10 2018-07-10 Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation Active CN108984724B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810754022.5A CN108984724B (en) 2018-07-10 2018-07-10 Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810754022.5A CN108984724B (en) 2018-07-10 2018-07-10 Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation

Publications (2)

Publication Number Publication Date
CN108984724A CN108984724A (en) 2018-12-11
CN108984724B (en) 2021-09-28

Family

ID=64537672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810754022.5A Active CN108984724B (en) 2018-07-10 2018-07-10 Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation

Country Status (1)

Country Link
CN (1) CN108984724B (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109447021B (en) * 2018-11-08 2020-11-27 北京灵汐科技有限公司 Attribute detection method and attribute detection device
CN109766557B (en) * 2019-01-18 2023-07-18 河北工业大学 Emotion analysis method and device, storage medium and terminal equipment
CN109710769A (en) * 2019-01-23 2019-05-03 福州大学 A kind of waterborne troops's comment detection system and method based on capsule network
CN109902174B (en) * 2019-02-18 2023-06-20 山东科技大学 Emotion polarity detection method based on aspect-dependent memory network
CN109993216B (en) * 2019-03-11 2021-05-11 深兰科技(上海)有限公司 Text classification method and device based on K nearest neighbor KNN
CN110083829A (en) * 2019-04-03 2019-08-02 平安科技(深圳)有限公司 Feeling polarities analysis method and relevant apparatus
CN110083785A (en) * 2019-04-29 2019-08-02 清华大学 The Sex, Age method of discrimination and device of record are searched for based on user
CN110175237B (en) * 2019-05-14 2023-02-03 华东师范大学 Multi-category-oriented secondary emotion classification method
CN111353040A (en) * 2019-05-29 2020-06-30 北京工业大学 GRU-based attribute level emotion analysis method
CN110309769B (en) * 2019-06-28 2021-06-15 北京邮电大学 Method for segmenting character strings in picture
CN110502633A (en) * 2019-07-19 2019-11-26 中山大学 Network comment management method based on machine learning
CN110569338B (en) * 2019-07-22 2022-05-03 中国科学院信息工程研究所 Method for training decoder of generative dialogue system and decoding method
CN110765769B (en) * 2019-08-27 2023-05-02 电子科技大学 Clause feature-based entity attribute dependency emotion analysis method
CN110609899B (en) * 2019-08-29 2022-04-19 成都信息工程大学 Specific target emotion classification method based on improved BERT model
CN110717325B (en) * 2019-09-04 2020-11-13 北京三快在线科技有限公司 Text emotion analysis method and device, electronic equipment and storage medium
CN110781273B (en) * 2019-09-17 2022-05-31 华东交通大学 Text data processing method and device, electronic equipment and storage medium
CN111144130A (en) * 2019-12-26 2020-05-12 辽宁工程技术大学 Context-aware-based fine-grained emotion classification method for hybrid neural network
CN111241290B (en) * 2020-01-19 2023-05-30 车智互联(北京)科技有限公司 Comment tag generation method and device and computing equipment
CN111242083B (en) * 2020-01-21 2024-01-26 腾讯云计算(北京)有限责任公司 Text processing method, device, equipment and medium based on artificial intelligence
CN111325027B (en) * 2020-02-19 2023-04-28 东南大学 Sparse data-oriented personalized emotion analysis method and device
CN111553147A (en) * 2020-03-27 2020-08-18 南京工业大学 BERT model based on N-gram and semantic segmentation method
CN111897954B (en) * 2020-07-10 2024-04-02 西北大学 User comment aspect mining system, method and storage medium
CN112199956B (en) * 2020-11-02 2023-03-24 天津大学 Entity emotion analysis method based on deep representation learning
CN112285664B (en) * 2020-12-18 2021-04-06 南京信息工程大学 Method for evaluating countermeasure simulation confidence of radar-aircraft system
CN112597302B (en) * 2020-12-18 2022-04-29 东北林业大学 False comment detection method based on multi-dimensional comment representation
CN113268592B (en) * 2021-05-06 2022-08-05 天津科技大学 Short text object emotion classification method based on multi-level interactive attention mechanism
CN113379032A (en) * 2021-06-08 2021-09-10 全球能源互联网研究院有限公司 Layered bidirectional LSTM sequence model training method and system


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335352A (en) * 2015-11-30 2016-02-17 武汉大学 Entity identification method based on Weibo emotion
CN107168945A (en) * 2017-04-13 2017-09-15 广东工业大学 A kind of bidirectional circulating neutral net fine granularity opinion mining method for merging multiple features

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
High dimensional data classification and feature selection using support vector machines; Bissan Ghaddar et al.; European Journal of Operational Research; 2018-03-16; vol. 265, no. 3; pp. 993-1004 *
Research on multi-strategy fine-grained emotion analysis of Chinese microblogs; Ouyang Chunping et al.; Journal of Peking University (Natural Science Edition); 2014-01-31; vol. 50, no. 1; pp. 67-72 *

Also Published As

Publication number Publication date
CN108984724A (en) 2018-12-11

Similar Documents

Publication Publication Date Title
CN108984724B (en) Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation
CN107992597B (en) Text structuring method for power grid fault case
Xiang et al. A convolutional neural network-based linguistic steganalysis for synonym substitution steganography
CN110569508A (en) Method and system for classifying emotional tendencies by fusing part-of-speech and self-attention mechanism
CN115794999B (en) Patent document query method based on diffusion model and computer equipment
Zhang et al. Multi-modal multi-label emotion recognition with heterogeneous hierarchical message passing
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN108536735B (en) Multi-mode vocabulary representation method and system based on multi-channel self-encoder
Bae et al. Flower classification with modified multimodal convolutional neural networks
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN113035311A (en) Medical image report automatic generation method based on multi-mode attention mechanism
CN111222318A (en) Trigger word recognition method based on two-channel bidirectional LSTM-CRF network
CN114417851A (en) Emotion analysis method based on keyword weighted information
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
CN114048314A (en) Natural language steganalysis method
CN112560440A (en) Deep learning-based syntax dependence method for aspect-level emotion analysis
CN115631504B (en) Emotion identification method based on bimodal graph network information bottleneck
CN111382333A (en) Case element extraction method in news text sentence based on case correlation joint learning and graph convolution
CN115730232A (en) Topic-correlation-based heterogeneous graph neural network cross-language text classification method
CN115758159A (en) Zero sample text position detection method based on mixed contrast learning and generation type data enhancement
CN111723301B (en) Attention relation identification and labeling method based on hierarchical theme preference semantic matrix
CN114943216A (en) Case microblog attribute-level viewpoint mining method based on graph attention network
Lai et al. Bi-directional attention comparison for semantic sentence matching
CN114492458A (en) Multi-head attention and word co-occurrence based aspect-level emotion analysis method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant