CN113032570A - Text aspect emotion classification method and system based on ATAE-BiGRU - Google Patents

Text aspect emotion classification method and system based on ATAE-BiGRU

Info

Publication number
CN113032570A
CN113032570A
Authority
CN
China
Prior art keywords
bigru, text, emotion classification, atae, emotion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110388795.8A
Other languages
Chinese (zh)
Inventor
曹倩倩
陈向阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Institute of Technology
Original Assignee
Wuhan Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Institute of Technology
Priority to CN202110388795.8A
Publication of CN113032570A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/35: Clustering; Classification
    • G06F 16/353: Clustering; Classification into predefined classes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/047: Probabilistic or stochastic networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a text aspect emotion classification method and system based on ATAE-BiGRU. The method comprises the following steps: acquiring a data set; labelling the text data in the data set with emotion labels to obtain a training set; vectorizing the text data with a BERT pre-training model, fusing aspect information and an attention mechanism into a BiGRU network, and constructing a BERT-ATAE-BiGRU emotion classification model; training the emotion classification model with the training set; and performing text aspect emotion classification with the trained emotion classification model. The invention uses a BERT pre-training model to obtain word-vector representations of a text, and combines a BiGRU network with aspect information and an attention mechanism to build the BERT-ATAE-BiGRU emotion classification model, which focuses strongly on aspect-specific information, fully extracts the key content of a sentence, and improves the accuracy of aspect emotion classification.

Description

Text aspect emotion classification method and system based on ATAE-BiGRU
Technical Field
The invention belongs to the technical field of deep learning classification, and particularly relates to a text aspect emotion classification method and system based on ATAE-BiGRU.
Background
With the rapid development of information technology, internet applications have penetrated every aspect of people's lives. Ordinary users interact with these applications ever more frequently, and internet users have gradually evolved from viewers of online content into its creators. In the process, users post opinions and comments carrying emotional attitudes on media platforms; detecting and classifying these not only generates considerable commercial value but also helps keep the internet environment safe. Sentiment analysis of subjective text in online review data is therefore of great significance: analysing the thoughts and feelings implied in social-network text and mining its emotional tendency makes decision-making more efficient and better informed.
In recent years, natural language processing has rapidly come into focus, and classifying the sentiment of review text carries substantial commercial value. Emotion analysis is an important task in natural language processing; its main purpose is to identify people's emotions, views and attitudes towards products, services, individuals, organizations and other entities. Aspect-level sentiment classification is a fine-grained task within sentiment analysis that aims to identify the sentiment polarity of a target in its context. For example: "The food in the restaurant is delicious, but the waiter's attitude is not very good." In this sentence, "food" receives a positive sentiment, while "waiter's attitude" receives a negative one. Different aspects can thus carry different emotional polarities within the same sentence.
In real life, the meaning of a word generally branches in several directions, so a word needs to be mapped into a multi-dimensional space. On the one hand, this addresses the problem of semantics diverging in multiple directions; on the other hand, a multi-dimensional vector lets relatively small numbers represent the word. In the emotion classification task, whether the text undergoes simple binary classification, more complex multi-class classification, or a fine-grained classification task, word-vector conversion is required. Many tools are currently available for obtaining word vectors, such as word2vec and GloVe.
Word-vector representation means vectorizing words according to the meaning they express, and the pre-trained language model BERT can perform this vectorization. BERT is a deep bidirectional language model based on the Transformer architecture, implemented as a deep bidirectional Transformer encoder, and it has improved results on many popular NLP tasks. It was proposed by Google researchers in 2018 as an NLP pre-trained language model. Unlike static embeddings, BERT can be integrated into downstream tasks and dynamically adjusted as a task-specific architecture; it outperforms many systems with task-specific architectures and is now used in many popular NLP tasks. BERT extracts features better than other word-vector tools, and its attention mechanism strengthens the model's ability to capture emotional semantics.
At present, the conventional emotion classification task basically uses a neural network to extract the semantic information of a sentence and then judges its polarity. Most such work performs only simple sentiment classification of whole texts; texts containing aspect information, i.e. the sentiment towards a specific aspect, have not been studied with sufficient precision.
Disclosure of Invention
The invention aims to provide a text aspect emotion classification method and system based on ATAE-BiGRU, and the accuracy of text aspect emotion analysis is improved.
The technical scheme provided by the invention is as follows:
a text aspect emotion classification method based on ATAE-BiGRU comprises the following steps:
s1, acquiring a data set;
s2, labeling emotion labels on the text data in the data set to obtain a training set;
s3, vectorizing the text data by using a BERT pre-training model, fusing aspect information and an attention mechanism in a BiGRU network, and constructing a BERT-ATAE-BiGRU emotion classification model;
s4, training the emotion classification model by using a training set;
and S5, carrying out text aspect emotion classification by using the trained emotion classification model.
Preferably, step S1 includes: and collecting text data and performing data cleaning to construct a data set.
Preferably, step S1 further includes: and ID numbering is carried out on the text data in the data set by using an SQL statement, and the primary key constraint is added to the data set by using a MySQL database.
Preferably, after the data set is acquired, a database is constructed.
Preferably, the sentiment tag is: -1 denotes negative emotions, 1 denotes positive emotions, and 0 denotes neutral emotions.
Preferably, the BERT-ATAE-BiGRU emotion classification model comprises:
an input layer: obtaining word vectors of the text data by using a BERT pre-training model;
a BiGRU layer: taking the word vectors as the serialized input of the BiGRU network, so that the model extracts semantic features of the text data from 2 directions, obtaining the hidden-layer state at each time step and the overall representation of the sentence;
the aspect attention module: fusing the hidden-layer state output by the BiGRU layer with the aspect information and the attention mechanism to obtain the text representation related to a specific aspect;
a fully-connected layer and a classification layer: performing emotion polarity classification of a specific aspect with a softmax function.
Preferably, the BERT-ATAE-BiGRU emotion classification model processes data in the following steps:
S31, splicing the aspect word vector in the text data with the word vectors of all the words to serve as the input vector of the BiGRU layer, obtaining the hidden-layer vectors H;
S32, splicing the hidden-layer vectors with the aspect word vector and feeding the result into a layer containing an attention mechanism, obtaining an attention vector γ and, from it, a vector representation h* of the text data;
S33, inputting h* into the Softmax layer to obtain the emotion polarity y of the text content towards the specific aspect:
γ = H·α^T
h* = tanh(W_p·γ + W_x·h_N)
y = softmax(W_s·h* + b_s)
where α denotes the attention weights; W_p, W_x, W_s and b_s are parameters to be learned; {h_1, h_2, ..., h_N} denote the hidden-layer vectors H; tanh denotes the hyperbolic tangent activation function; and softmax denotes the classification function.
Preferably, the parameters of the model are tuned over multiple rounds of adjustment, and a Dropout strategy is used during parameter tuning.
Preferably, the emotion classification model is verified using precision, recall, and F1 values.
A text aspect emotion classification system based on ATAE-BiGRU for realizing the text aspect emotion classification method based on ATAE-BiGRU comprises:
the data acquisition module is used for acquiring a data set;
the emotion marking module is used for marking the text data in the data set with emotion labels to obtain a training set;
the model building module is used for vectorizing the text data by using a BERT pre-training model, fusing aspect information and an attention mechanism in a BiGRU network and building a BERT-ATAE-BiGRU emotion classification model;
the model training module is used for training the emotion classification model by utilizing a training set;
and the emotion classification module is used for performing emotion classification on the text by using the trained emotion classification model.
The invention has the beneficial effects that:
According to the text aspect emotion classification method and system based on ATAE-BiGRU, word-vector representations of the text are obtained with a BERT pre-training model, and a BiGRU network fused with aspect information and an attention mechanism yields the proposed BERT-ATAE-BiGRU emotion classification model, which focuses strongly on aspect-specific information, fully extracts the key content of sentences, and improves aspect emotion classification accuracy.
Drawings
FIG. 1 is a flow chart of the text aspect emotion classification method based on ATAE-BiGRU of the invention.
FIG. 2 is a schematic diagram of the ATAE-BiGRU network structure in the embodiment of the present invention.
FIG. 3 is a graph comparing accuracy of different models in an embodiment of the present invention.
FIG. 4 is a graph comparing recall rates of different models in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described with reference to the accompanying drawings and specific embodiments, but the scope of the present invention is not limited to the following embodiments.
The invention addresses aspect-level emotion analysis, a fine-grained emotion analysis task that aims to identify the emotional expression towards a specific target in a sentence. At present, aspect emotion classification extracts the semantic information of a sentence with a neural network and then judges its polarity. On this basis, the method uses BERT to obtain word vectors, then integrates aspect information and an attention mechanism into a Bi-GRU network, constructing the BERT-ATAE-BiGRU emotion classification model. During training, the model can focus strongly on specific aspects, which effectively improves the aspect emotion classification effect.
In addition, the invention performs experimental verification on a SemEval data set and finds that BERT's feature-extraction capability is superior to that of other word-vector tools; the BiGRU also keeps the network structure simpler and uses fewer parameters, improving efficiency to a certain extent. Compared with conventional emotion classification methods, the method effectively improves the precision, recall and F1 value of aspect emotion classification.
In order to meet the various requirements of calculating, querying, counting and analysing the text data, the following two preparatory steps can be carried out:
(1) Collect the data and check whether it fits the direction required by the invention, i.e. judge whether the collected data can serve as emotion classification data. Then clean the collected data, number it, store it in a database, and add a primary-key constraint to the data with SQL statements, which facilitates calculation, query, statistics and analysis of the data (a database sketch follows this list).
(2) Sort the stored data and grade the placement of specific aspects within the aspect attributes, which improves the efficiency of calculation, query, statistics and analysis.
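A minimal database sketch of step (1), assuming PyMySQL as the driver and hypothetical table, column and credential names (the text above names only MySQL, SQL statements and a primary-key constraint):

    # Database sketch: ID-number the texts and add a primary-key constraint.
    # PyMySQL, the table name and the credentials are illustrative assumptions.
    import pymysql

    conn = pymysql.connect(host="localhost", user="root",
                           password="secret", database="reviews")
    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE IF NOT EXISTS review_text (
                id INT NOT NULL AUTO_INCREMENT,  -- ID numbering of the texts
                content TEXT,                    -- the raw review sentence
                PRIMARY KEY (id)                 -- primary-key constraint
            )
        """)
    conn.commit()
    conn.close()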
As shown in fig. 1, the text aspect emotion classification method based on ATAE-BiGRU according to the embodiment of the present invention includes the following steps:
and S1, collecting data and finding a data set suitable for the direction of the method.
The data set can be obtained by crawling official-website review texts from the APPs of the large restaurant platforms, or a public emotion classification data set can be used. For crawling, after the Python runtime environment is configured, the Requests library is installed with pip. The specific operation is: on a Windows platform, run a cmd command window, type pip3 install requests and press the Enter key to install the Requests library, then crawl the data to acquire it.
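A minimal crawling sketch, assuming a hypothetical review-page URL (the Requests calls are standard; nothing platform-specific is taken from the patent):

    # Crawling sketch: pull one review page with the Requests library.
    # The URL below is a placeholder, not an actual platform endpoint.
    import requests

    def fetch_reviews(url: str) -> str:
        headers = {"User-Agent": "Mozilla/5.0"}     # browser-like User-Agent
        resp = requests.get(url, headers=headers, timeout=10)
        resp.raise_for_status()                     # fail loudly on HTTP errors
        resp.encoding = resp.apparent_encoding      # review pages vary in encoding
        return resp.text                            # raw HTML, cleaned later

    html = fetch_reviews("https://example.com/restaurant/reviews")
    print(html[:200])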
And S2, cleaning the acquired data of the public data set or the crawled data, and labeling the emotion label.
In this embodiment, -1 represents a negative emotion, 1 represents a positive emotion, and 0 represents a neutral emotion. After the emotion labels are applied, the processed data are stored as text documents in txt format, comprising a training data set and a testing data set.
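An illustrative storage sketch, assuming a tab-separated line layout (the embodiment specifies only the txt format and the -1 / 0 / 1 label scheme, not the exact layout):

    # Storage sketch: one labelled sample per line: sentence <TAB> aspect <TAB> label.
    samples = [
        ("The food in the restaurant is delicious", "food", 1),
        ("The waiter's attitude is not very good", "waiter", -1),
    ]
    with open("train.txt", "w", encoding="utf-8") as f:
        for sentence, aspect, label in samples:
            f.write(f"{sentence}\t{aspect}\t{label}\n")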
When the data is cleaned, the main operation is to remove links containing URLs, because a URL carries little useful information and generally exists for advertising or user-tracking purposes.
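A cleaning sketch for this URL-removal step; the regular expression is one common choice, not mandated by the embodiment:

    # Cleaning sketch: strip URL links from review text.
    import re

    URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

    def remove_urls(text: str) -> str:
        return URL_PATTERN.sub("", text).strip()

    print(remove_urls("Great food! http://ad.example.com/click?id=1"))
    # -> "Great food!"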
S3, constructing a BERT-ATAE-BiGRU emotion classification model: the model first vectorizes the text data with a BERT pre-training model to obtain word-vector representations of the text's words; a bidirectional GRU network is then used, with aspect information and an attention mechanism fused into the network, so that the model can focus strongly on the aspect information.
The bidirectional GRU network structure handles the long-term dependency problem of recurrent neural networks well, and the network is easier to compute and implement. With only two gates, its internal structure is simpler; in addition, a GRU neural network has about 1/3 fewer parameters than an LSTM, so it is less prone to overfitting.
The GRU network is updated as follows:
r_t = σ(W_r·[h_{t-1}, x_t])
z_t = σ(W_z·[h_{t-1}, x_t])
h̃_t = tanh(W·[r_t * h_{t-1}, x_t])
h_t = (1 − z_t) * h_{t-1} + z_t * h̃_t
where r_t is the reset gate at time t; z_t is the update gate at time t; h̃_t is the candidate activation state at time t; h_t is the activation state at time t; h_{t-1} is the hidden-layer state at time t−1; x_t represents the current input word of the sentence; W_r, W_z and W are the corresponding weight matrices; σ is the sigmoid activation function; and tanh is the hyperbolic tangent activation function. The update gate determines how much historical information the current state forgets and how much newly received information it keeps; the reset gate determines how much historical information flows into the candidate state.
A BiGRU can be viewed as two unidirectional GRUs. Here x_t denotes the current input, h_t^f the forward hidden state at time t, h_t^b the backward hidden state at time t, and h_t the hidden state obtained by splicing the forward and backward hidden-layer states. The formulas are as follows:
h_t^f = GRU(x_t, h_{t-1}^f)
h_t^b = GRU(x_t, h_{t-1}^b)
h_t = w_t·h_t^f + v_t·h_t^b + b_t
where the GRU() function denotes the non-linear transformation that encodes an input word vector into the corresponding GRU hidden-layer state; w_t and v_t are the weights assigned at time t to the forward hidden-layer state h_t^f and the backward hidden-layer state h_t^b of the bidirectional GRU, respectively; and b_t is the bias of the hidden-layer state at time t.
Before word-vector conversion is carried out on the data set, the data set needs to be counted, which makes it convenient to compute the model's accuracy later. In the embodiment of the invention, statistics show that the training set has 6762 samples and the test set has 2369 samples.
And S4, configuring the Python environment and executing python train.py.
The experiments use a Python 3.6 and PyTorch 1.0 environment together with various packages and libraries, among them numpy, sklearn and transformers. The concrete operations are:
The first step: download the Python 3.6 interpreter and install the PyTorch 1.0 framework.
The second step: open the Anaconda prompt, run it as administrator, and enter the install command: pip install numpy. Install the sklearn and transformers libraries with the same kind of command.
The third step: execute python train.py. The log information can be used to check the loss and result values of each training round.
And S5, loading the training set into the program, preprocessing it, and converting it into the data BERT requires as input.
The data set required for training is loaded into the executable program, preprocessed, and converted into the input data BERT needs. BERT is an NLP pre-trained language model implemented as a deep bidirectional Transformer encoder. It can be integrated into downstream tasks and dynamically adjusted as a task-specific architecture, outperforming many systems with task-specific architectures. BERT extracts features better than other word-vector tools, and its attention mechanism strengthens the model's ability to capture emotional semantics.
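A preprocessing sketch with the Hugging Face transformers library (the checkpoint name bert-base-uncased is an assumption; the embodiment does not name specific BERT weights):

    # Convert raw sentences into BERT inputs and take the word vectors.
    from transformers import BertTokenizer, BertModel
    import torch

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    batch = tokenizer(["The food is delicious but the service is slow."],
                      padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        word_vectors = model(**batch).last_hidden_state  # (1, seq_len, 768)
    print(word_vectors.shape)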
And S6, dividing the preprocessed data into a training data set and a testing data set, and counting the aspect information of each data set.
The resulting data set was divided into a training set and a test set, where the training set had 6762 samples and the test set had 2369 samples.
And S7, training the network model by using the data set for training.
The model uses BERT for word-vector acquisition; BERT can be integrated into downstream tasks and dynamically adjusted as a task-specific architecture. Its attention mechanism strengthens the model's capture of emotional semantics, and applying it within the ATAE-BiGRU network model effectively improves classification accuracy and efficiency. The network model is shown in FIG. 2. The BERT-ATAE-BiGRU emotion classification model of the method combines BERT pre-training with the ATAE-BiGRU neural network model and comprises the following steps:
step 1: and the input layer is used for training an input sample by using a BERT model to obtain a word vector.
Step 2: the BiGRU layer takes the word vectors output by the input layer as the serialized input of the BiGRU network, so that the model extracts the semantic features of the text from 2 directions, obtaining the hidden-layer state at each time step and the overall representation of the sentence.
Step 3: and the aspect attention module fuses the hidden layer state output by the BiGRU layer with aspect information and an attention mechanism to acquire text representation related to a specific aspect.
Step 4: the fully-connected layer and the classification layer use a softmax function to classify the emotion polarity towards a specific aspect, where -1 represents a negative emotion, 1 represents a positive emotion, and 0 represents a neutral emotion.
The Softmax function is a common classification function. In FIG. 2, {w_1, w_2, ..., w_N} is the sequence of word vectors of a piece of text content of length N, v_a represents the aspect vector, α represents the attention weights, and {h_1, h_2, ..., h_N} are the hidden-layer vectors. The aspect word vector of the text content is spliced with the word vectors of all the words to form the input vectors, which are fed into the BiGRU layer to obtain the hidden-layer representation H. The hidden-layer representation is then spliced with the aspect representation (i.e. the aforementioned aspect word vector) and input into the layer containing the attention mechanism, yielding the attention vector γ and, from it, the vector representation h* of the whole text. Finally, h* is input into the Softmax layer to obtain the emotion polarity y of the text content towards the specific aspect:
γ = H·α^T
h* = tanh(W_p·γ + W_x·h_N)
y = softmax(W_s·h* + b_s)
In the formulas, W_p, W_x, W_s and b_s are all model parameters to be learned, tanh denotes the hyperbolic tangent activation function, and softmax denotes the classification function.
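A simplified PyTorch reading of the three formulas above, offered as a sketch rather than the patent's exact code (the layer shapes and the attention scoring layer are assumptions):

    # Aspect-attention head and classifier: gamma = H*alpha^T,
    # h* = tanh(Wp*gamma + Wx*h_N), y = softmax(Ws*h* + bs).
    import torch
    import torch.nn as nn

    class AspectAttention(nn.Module):
        def __init__(self, hidden_dim, aspect_dim, num_classes=3):
            super().__init__()
            self.attn = nn.Linear(hidden_dim + aspect_dim, 1)  # one score alpha per step
            self.W_p = nn.Linear(hidden_dim, hidden_dim, bias=False)
            self.W_x = nn.Linear(hidden_dim, hidden_dim, bias=False)
            self.W_s = nn.Linear(hidden_dim, num_classes)      # carries the bias b_s

        def forward(self, H, v_a):
            # H: (batch, N, hidden_dim) BiGRU states; v_a: (batch, aspect_dim)
            v = v_a.unsqueeze(1).expand(-1, H.size(1), -1)     # repeat aspect per step
            alpha = torch.softmax(
                self.attn(torch.cat([H, v], dim=-1)).squeeze(-1), dim=-1)
            gamma = torch.bmm(alpha.unsqueeze(1), H).squeeze(1)  # gamma = H*alpha^T
            h_star = torch.tanh(self.W_p(gamma) + self.W_x(H[:, -1]))  # h_N = last state
            return torch.softmax(self.W_s(h_star), dim=-1)     # y over -1 / 0 / 1

    head = AspectAttention(hidden_dim=256, aspect_dim=768)
    y = head(torch.randn(2, 20, 256), torch.randn(2, 768))
    print(y.shape)  # (2, 3)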
To prevent the trained model from overfitting, a Dropout strategy is introduced during parameter tuning. During forward propagation, Dropout deactivates the activations of some neurons with a certain probability, randomly discarding a proportion of neurons; this prevents neurons from co-adapting, so the model does not rely excessively on local features, and a model with stronger generalization ability can be trained.
For the parameters of the model, tuning was carried out over multiple rounds of adjustment; the final values are: 25 iteration rounds, a 768-dimensional word vector, dropout of 0.1 and a learning rate of 1e-5.
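The tuned values collected as a training configuration (the dictionary form is ours; the values are those stated above):

    config = {
        "epochs": 25,            # iteration rounds
        "word_vector_dim": 768,  # BERT hidden size
        "dropout": 0.1,
        "learning_rate": 1e-5,
    }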
And S8, finally, testing with the best model obtained from the training above to produce the final test result, which is verified using precision, recall and the F1 value.
Precision is defined as follows:
P = TP / (TP + FP)
where TP denotes the number of texts correctly assigned to class C_i, and FP denotes the number of texts wrongly assigned to class C_i.
Recall is defined as follows:
R = TP / (TP + FN)
where FN denotes the number of texts that actually belong to class C_i but were not assigned to that class.
Precision checks the correctness of the classification results, while recall checks their completeness, and the two are in tension. Therefore the F1 value is generally considered as well, to reconcile precision and recall. The specific formula is:
F1 = 2·P·R / (P + R)
In effect, F1 is the harmonic mean of the two.
The BERT-ATAE-BiGRU emotion classification model proposed by the method was compared with other emotion classification models, and the proposed model was found to classify the emotion of specific aspects more accurately. The comparison models were:
1. ATAE-LSTM, an LSTM emotion classification model fused with aspect content; it attends intensely to specific aspects during training and can effectively identify the emotion polarity of an aspect.
2. Bi-LSTM, a bidirectional LSTM model that captures the forward and backward sequential relations between words simultaneously, thereby obtaining bidirectional dependency relations.
3. BiGRU-AAM, which uses a bidirectional GRU model, giving a simpler model structure with roughly half the parameters while still extracting the deep information of the text content effectively; an attention mechanism combined with aspect information effectively extracts aspect-specific content.
The experimental results are shown in tables 1, 2 and 3 and fig. 3 and 4, and it can be seen that the method has significant advantages compared with the conventional method.
Table 1 Experimental parameter settings
Iteration rounds: 25; word-vector dimension: 768; dropout: 0.1; learning rate: 1e-5
Table 2 Experimental data set
Training set: 6762 samples; test set: 2369 samples
Table 3 Results of the different models
[Table not reproduced in the source: precision, recall and F1 of ATAE-LSTM, Bi-LSTM, BiGRU-AAM and the proposed BERT-ATAE-BiGRU]
The feasibility of the model is verified through experiments, and compared with the existing aspect emotion classification technology, the accuracy rate, the recall rate and the F1 value of the method are obviously improved, and a better classification effect is obtained.
The invention also provides a text aspect emotion classification system based on the ATAE-BiGRU, which is used for realizing the text aspect emotion classification method based on the ATAE-BiGRU, and comprises the following steps:
the data acquisition module is used for acquiring a data set;
the emotion marking module is used for marking the text data in the data set with emotion labels to obtain a training set;
the model building module is used for vectorizing the text data by using a BERT pre-training model, fusing aspect information and an attention mechanism in a BiGRU network and building a BERT-ATAE-BiGRU emotion classification model;
the model training module is used for training the emotion classification model by utilizing a training set;
and the emotion classification module is used for performing emotion classification on the text by using the trained emotion classification model.
It will be understood by those skilled in the art that the foregoing is merely a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included within the scope of the present invention.

Claims (10)

1. A text aspect emotion classification method based on ATAE-BiGRU is characterized by comprising the following steps:
s1, acquiring a data set;
s2, labeling emotion labels on the text data in the data set to obtain a training set;
s3, vectorizing the text data by using a BERT pre-training model, fusing aspect information and an attention mechanism in a BiGRU network, and constructing a BERT-ATAE-BiGRU emotion classification model;
s4, training the emotion classification model by using a training set;
and S5, performing text aspect emotion classification by using the trained emotion classification model.
2. The ATAE-BiGRU-based text aspect emotion classification method according to claim 1, wherein the step S1 includes: and collecting text data and performing data cleaning to construct a data set.
3. The ATAE-BiGRU-based text aspect emotion classification method according to claim 1 or 2, wherein the step S1 further comprises: and ID numbering is carried out on the text data in the data set by using an SQL statement, and the primary key constraint is added to the data set by using a MySQL database.
4. The ATAE-BiGRU-based text aspect emotion classification method of claim 1, wherein after the data set is obtained, a database is constructed.
5. The text aspect emotion classification method based on ATAE-BiGRU according to claim 1, wherein the emotion labels are: -1 denotes negative emotions, 1 denotes positive emotions, and 0 denotes neutral emotions.
6. The ATAE-BiGRU-based text aspect emotion classification method of claim 1, wherein the BERT-ATAE-BiGRU emotion classification model comprises:
an input layer: obtaining word vectors of the text data by using a BERT pre-training model;
a BiGRU layer: taking the word vectors as the serialized input of the BiGRU network, so that the model extracts semantic features of the text data from 2 directions, obtaining the hidden-layer state at each time step and the overall representation of the sentence;
the aspect attention module: fusing the hidden-layer state output by the BiGRU layer with the aspect information and the attention mechanism to obtain the text representation related to a specific aspect;
a fully-connected layer and a classification layer: performing emotion polarity classification of a specific aspect with a softmax function.
7. The ATAE-BiGRU-based text aspect emotion classification method of claim 6, wherein the BERT-ATAE-BiGRU emotion classification model processes data in the following steps:
S31, splicing the aspect word vector in the text data with the word vectors of all the words to serve as the input vector of the BiGRU layer, obtaining the hidden-layer vectors H;
S32, splicing the hidden-layer vectors with the aspect word vector and feeding the result into a layer containing an attention mechanism, obtaining an attention vector γ and, from it, a vector representation h* of the text data;
S33, inputting h* into the Softmax layer to obtain the emotion polarity y of the text content towards the specific aspect:
γ = H·α^T
h* = tanh(W_p·γ + W_x·h_N)
y = softmax(W_s·h* + b_s)
where α denotes the attention weights; W_p, W_x, W_s and b_s are parameters to be learned; {h_1, h_2, ..., h_N} denote the hidden-layer vectors H; tanh denotes the hyperbolic tangent activation function; and softmax denotes the classification function.
8. The ATAE-BiGRU-based text aspect emotion classification method as claimed in claim 1, wherein the parameters of the model are tuned over multiple rounds of adjustment, and a Dropout strategy is used during parameter tuning.
9. The ATAE-BiGRU-based text-aspect emotion classification method of claim 1, wherein the emotion classification model is validated using precision, recall, and F1 values.
10. An ATAE-BiGRU-based text aspect emotion classification system for implementing the ATAE-BiGRU-based text aspect emotion classification method of claim 1, comprising:
the data acquisition module is used for acquiring a data set;
the emotion marking module is used for marking the text data in the data set with emotion labels to obtain a training set;
the model building module is used for vectorizing the text data by using a BERT pre-training model, fusing aspect information and an attention mechanism in a BiGRU network and building a BERT-ATAE-BiGRU emotion classification model;
the model training module is used for training the emotion classification model by utilizing a training set;
and the emotion classification module is used for performing emotion classification on the text by using the trained emotion classification model.
CN202110388795.8A 2021-04-12 2021-04-12 Text aspect emotion classification method and system based on ATAE-BiGRU Pending CN113032570A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110388795.8A CN113032570A (en) 2021-04-12 2021-04-12 Text aspect emotion classification method and system based on ATAE-BiGRU


Publications (1)

Publication Number Publication Date
CN113032570A true CN113032570A (en) 2021-06-25

Family ID: 76456334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110388795.8A Pending CN113032570A (en) 2021-04-12 2021-04-12 Text aspect emotion classification method and system based on ATAE-BiGRU

Country Status (1)

Country Link
CN (1) CN113032570A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083833A (en) * 2019-04-18 2019-08-02 东华大学 Term vector joint insertion sentiment analysis method in terms of Chinese words vector sum
CN110717334A (en) * 2019-09-10 2020-01-21 上海理工大学 Text emotion analysis method based on BERT model and double-channel attention
US20210089936A1 (en) * 2019-09-24 2021-03-25 International Business Machines Corporation Opinion snippet detection for aspect-based sentiment analysis
CN111881260A (en) * 2020-07-31 2020-11-03 安徽农业大学 Neural network emotion analysis method and device based on aspect attention and convolutional memory
CN112131886A (en) * 2020-08-05 2020-12-25 浙江工业大学 Method for analyzing aspect level emotion of text
CN112231478A (en) * 2020-10-22 2021-01-15 电子科技大学 Aspect-level emotion classification method based on BERT and multi-layer attention mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wang, Y. et al.: "Attention-based LSTM for Aspect-level Sentiment Classification", Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing *
Song Huanmin et al.: "Sentiment classification method based on BiGRU and an aspect attention module", Intelligent Computer and Applications *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113987171A (en) * 2021-10-20 2022-01-28 绍兴达道生涯教育信息咨询有限公司 News text classification method and system based on pre-training model variation
CN115033702A (en) * 2022-03-04 2022-09-09 贵州电网有限责任公司 Transformer substation site selection knowledge extraction method based on ensemble learning
CN115033702B (en) * 2022-03-04 2024-06-04 贵州电网有限责任公司 Substation site selection knowledge extraction method based on ensemble learning
CN115878804A (en) * 2022-12-28 2023-03-31 郑州轻工业大学 E-commerce comment multi-classification emotion analysis method based on AB-CNN model
CN115878804B (en) * 2022-12-28 2023-06-20 郑州轻工业大学 E-commerce evaluation multi-classification emotion analysis method based on AB-CNN model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210625