CN112347258B - Short text aspect level emotion classification method


Info

Publication number
CN112347258B
Authority
CN
China
Prior art keywords
vector, word, speech, words, vector set
Prior art date
Legal status
Active
Application number
CN202011277713.4A
Other languages
Chinese (zh)
Other versions
CN112347258A
Inventor
倪丽萍
高九洲
朱旭辉
陈星月
Current Assignee
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN202011277713.4A
Publication of CN112347258A
Application granted
Publication of CN112347258B
Legal status: Active


Classifications

    • G06F16/35: Information retrieval of unstructured textual data; Clustering; Classification
    • G06F18/214: Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/289: Natural language analysis; Phrasal analysis, e.g. finite state techniques or chunking

Abstract

The invention discloses a short text aspect-level emotion classification method comprising the following steps: 1. after segmenting the short text into words, generate a word vector and a part-of-speech vector for each word in the short text; 2. judge the aspects present in the short text with an XLNet model, and concatenate each word's word vector, part-of-speech vector, and aspect vectors in order; 3. input each word's concatenated vector into the emotion-classification BiLSTM model, input the resulting hidden vector of each word into the attention mechanism, and obtain the weight of each word's hidden vector; 4. take the weighted average of the hidden vectors with their corresponding weights and feed the result into a softmax neural network to obtain the probabilities of the candidate emotions, taking the emotion with the larger probability as the classification result. The method can recognize the different emotions a short text expresses toward different aspects, thereby completing fine-grained emotion classification.

Description

Short text aspect level emotion classification method
Technical Field
The invention belongs to the field of natural language processing in artificial intelligence, and particularly relates to a BiLSTM-based emotion classification method that fuses word vectors, aspect vectors, and part-of-speech vectors and adds an attention mechanism.
Background
With the development of e-commerce platforms, short text reviews have become an important way for users to express their emotional opinions. A short text often involves more than one aspect, and the writer may even hold opposite emotional attitudes toward different aspects. At present, emotion classification for short texts is mostly coarse-grained: only one overall emotion label is assigned to the short text, and the emotions toward different aspects cannot be recognized at a fine granularity.
Disclosure of Invention
The invention aims to overcome this defect in the prior art by providing a short text aspect-level emotion classification method that first recognizes the aspects contained in a short text and then recognizes the emotion expressed toward each of those aspects, so that the different emotions of a short text toward different aspects can be identified and fine-grained emotion classification of the short text is completed.
To this end, the invention adopts the following technical scheme:
the invention discloses a short text aspect level emotion classification method which is characterized by comprising the following steps:
Step 1, acquire all short texts in the review data as a corpus, and apply classification, cleaning, and word-segmentation preprocessing to any short text in the corpus to obtain the word set of that short text, denoted t = (t_1, t_2, …, t_i, …, t_k), where t_i denotes the i-th word, i ∈ [1, k], and k denotes the total number of words in the short text;
Step 2, perform part-of-speech tagging on the word set t = (t_1, t_2, …, t_i, …, t_k) to obtain the part-of-speech set p''' = (p'''_1, p'''_2, …, p'''_i, …, p'''_k), where p'''_i denotes the part of speech of the i-th word t_i;
Step 3, preprocess all short texts in the corpus as in Step 1 and delete repeated words to obtain a dictionary; number each word in the dictionary and take the number as the index key of that word;
Step 4, use the index key of each word in the dictionary to index the word set t = (t_1, t_2, …, t_i, …, t_k), obtaining the index set s''' = (s'''_1, s'''_2, …, s'''_i, …, s'''_k), where s'''_i denotes the index position of the i-th word t_i;
Step 5, record as max the total number of words in the short text of the corpus that contains the most words; pad the part-of-speech set p''' and the index set s''' with '0' according to max, so that both sets contain max entries; record the padded part-of-speech set as p'' and the padded index set as s'';
Step 6, use a pre-trained XLNet model to convert the j-th part of speech in the padded part-of-speech set p'' into an n-dimensional vector, obtaining the j-th part-of-speech vector, denoted p_j = (p_{j,1}, p_{j,2}, …, p_{j,d}, …, p_{j,n}), where p_{j,d} denotes the d-th dimension of the j-th part-of-speech vector and d ∈ [1, n];
Step 7, use the pre-trained XLNet model to convert the j-th word in the padded index set s'' into an n-dimensional word vector, obtaining the j-th word vector, denoted s_j = (s_{j,1}, s_{j,2}, …, s_{j,d}, …, s_{j,n}), where s_{j,d} denotes the d-th dimension of the j-th word vector;
Step 8, select F aspect words from the corpus and input the f-th aspect word into the pre-trained XLNet model to obtain the word vector of the f-th aspect word, a_f = (a_{f,1}, a_{f,2}, …, a_{f,d}, …, a_{f,n}), where a_{f,d} denotes the d-th dimension of the vector of the f-th aspect word;
Step 9, reverse the order of the elements in the part-of-speech set p''' and the index set s''' and pad them with '0' according to max, obtaining the padded reverse-order part-of-speech set p' and the padded reverse-order index set s';
Step 10, use the pre-trained XLNet model to convert the j-th part of speech in the padded reverse-order part-of-speech set p' into an n-dimensional vector, obtaining the j-th reverse-order part-of-speech vector, denoted p'_j = (p'_{j,1}, p'_{j,2}, …, p'_{j,d}, …, p'_{j,n}), where p'_{j,d} denotes the d-th dimension of the j-th reverse-order part-of-speech vector and d ∈ [1, n];
Step 11, use the pre-trained XLNet model to convert the j-th word in the padded reverse-order index set s' into an n-dimensional word vector, obtaining the j-th reverse-order word vector, denoted s'_j = (s'_{j,1}, s'_{j,2}, …, s'_{j,d}, …, s'_{j,n}), where s'_{j,d} denotes the d-th dimension of the j-th reverse-order word vector;
Step 12, form the forward input features of the BiLSTM model:
concatenate the j-th word vector s_j, the j-th part-of-speech vector p_j, and the word vectors of the F aspect words in order, obtaining the feature vector fw_cell_j, which serves as the input of the forward BiLSTM; the dimension of fw_cell_j is 2×n + n×F;
Step 13, form the backward input features of the BiLSTM model:
concatenate the j-th reverse-order word vector s'_j, the j-th reverse-order part-of-speech vector p'_j, and the word vectors of the F aspect words in order, obtaining the reverse-order feature vector bw_cell_j, which serves as the other input, for the backward BiLSTM;
Step 14, build F correction matrices and identify aspect information:
let the f-th correction matrix be a matrix of (2+F) rows by 3 columns in which the element in row 1, column 1 is '1', the element in row 2, column 2 is '1', the element in row f+2, column 3 is '1', and all remaining elements are '0';
input the word set t = (t_1, t_2, …, t_i, …, t_k) into the pre-trained XLNet model to obtain the aspect information contained in the word set t;
obtain the corresponding aspect words from the aspect information contained in the word set t, and thereby obtain the corresponding correction matrices;
multiply the feature vector fw_cell_j by the correction matrix corresponding to the word set t to obtain the input vector I_j;
multiply the reverse-order feature vector bw_cell_j by the correction matrix corresponding to the word set t to obtain the reverse-order input vector I'_j;
Step 15, respectively training F Bilstm models by utilizing F side words to obtain F trained Bilstm models;
inputting the vector I j Respectively inputting the input information into the trained Bilstm model of the corresponding aspect words, thereby obtaining output vectors hf corresponding to the aspect information contained in the word vector set t j
Inputting the reverse order into vector I' j Respectively inputting the training result into the Bilstm model of the corresponding aspect words so as to obtain the reverse order output vector hb corresponding to the aspect information contained in the word vector set t j
Will output vector hf j And the reverse order output vector hb j Splicing is carried out to form an implicit vector h j
Step 16, implicit vector h j The jth part-of-speech vector p j And sequentially splicing word vectors corresponding to the F aspect words to obtain an implicit vector h 'of an Attention mechanism' j
Step 17, training an Attention mechanism network by using word vectors corresponding to F number of aspect words to obtain a trained Attention mechanism network;
implicit vector h 'of Attentition mechanism' j Inputting the trained Attention mechanism network to obtain the jth word vector s j A corresponding weight;
step 18, predicting classification results:
respectively combining each implicit vector h j Multiplying the weight by the corresponding weight and then summing to obtain h * Inputting the scores into the full-connection layer to obtain scores corresponding to positive emotions and negative emotions; inputting the scores corresponding to the positive emotion and the negative emotion into the softmax layer to obtain the corresponding probabilities of the positive emotion and the negative emotionAnd the emotion with higher probability is used as a prediction classification result.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention uses the XLNet model to identify, at a fine granularity, the aspects contained in a short text; converts the short text into word vectors, part-of-speech vectors, and aspect vectors derived from the aspects identified by XLNet; fuses them; and inputs the result into attention-equipped BiLSTM models trained separately for the different aspects, obtaining the emotion the short text expresses toward each aspect. By first labeling aspects and then classifying emotion, it completes fine-grained emotion classification and makes the emotion classification of short texts more accurate. At the same time, fusing the word vectors, part-of-speech vectors, and aspect vectors preserves the information in the short text more completely during vectorization, improving the accuracy of both aspect identification and emotion classification.
2. The core of natural language processing is converting language into a representation a machine can recognize and judge. Through the three-way concatenation of word vector, part-of-speech vector, and aspect vector, the method takes into account the content of the text, the help that parts of speech give to recognizing the text, and the aspect information, so as to classify the emotion of the short text at a fine granularity.
3. For aspect-label identification, the invention proposes a correction matrix. The correction matrix generates the input vectors from the ordered concatenation of word vector, part-of-speech vector, and the aspect vectors of all aspects involved in the short text: it automatically keeps the aspect vector relevant to the aspect at hand spliced to the word vector and part-of-speech vector, and automatically discards the aspect vectors irrelevant to that aspect. This simplifies the input vector while retaining all content relevant to the aspect and reduces computational complexity.
Drawings
FIG. 1 is an overall flow chart of the method of the present invention.
Detailed Description
In this embodiment, an emotion classification method based on a BiLSTM model, which fuses word vectors, aspect vectors, and part-of-speech vectors and adds an attention mechanism, is carried out according to the following steps:
Step 1, acquire all short texts in the review data as a corpus, and apply classification, cleaning, and word-segmentation preprocessing to any short text in the corpus to obtain the word set of that short text, denoted t = (t_1, t_2, …, t_i, …, t_k), where t_i denotes the i-th word, i ∈ [1, k], and k denotes the total number of words in the short text;
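A minimal sketch of this preprocessing step, assuming jieba is used for Chinese word segmentation and a simple regex pass for cleaning (the patent names neither a segmenter nor a cleaning rule):

```python
import re
import jieba  # assumed segmenter; the patent does not name one

def preprocess(short_text: str) -> list[str]:
    """Clean a short review and segment it into the word set t = (t_1, ..., t_k)."""
    # strip punctuation and other non-word characters before segmentation
    cleaned = re.sub(r"[^\u4e00-\u9fa5A-Za-z0-9]", "", short_text)
    return jieba.lcut(cleaned)  # t_i is the i-th segmented word

t = preprocess("这家餐厅的服务很好，但是价格太贵了。")
k = len(t)  # total number of words in the short text
```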
Step 2, perform part-of-speech tagging on the word set t = (t_1, t_2, …, t_i, …, t_k) to obtain the part-of-speech set p''' = (p'''_1, p'''_2, …, p'''_i, …, p'''_k), where p'''_i denotes the part of speech of the i-th word t_i;
Step 3, preprocess all short texts in the corpus as in Step 1 and delete repeated words to obtain a dictionary; number each word in the dictionary and take the number as the index key of that word;
Step 4, use the index key of each word in the dictionary to index the word set t = (t_1, t_2, …, t_i, …, t_k), obtaining the index set s''' = (s'''_1, s'''_2, …, s'''_i, …, s'''_k), where s'''_i denotes the index position of the i-th word t_i;
Step 5, record as max the total number of words in the short text of the corpus that contains the most words; pad the part-of-speech set p''' and the index set s''' with '0' according to max, so that both sets contain max entries; record the padded part-of-speech set as p'' and the padded index set as s'';
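The dictionary building, indexing, and padding of Steps 3 to 5 can be sketched as follows; the helper names are illustrative, and `corpus` is assumed to be a list of already-segmented short texts:

```python
def build_dictionary(corpus: list[list[str]]) -> dict[str, int]:
    # Step 3: number each distinct word; the number is its index key (0 is reserved for padding)
    vocab: dict[str, int] = {}
    for words in corpus:
        for w in words:
            if w not in vocab:
                vocab[w] = len(vocab) + 1
    return vocab

def to_index_set(t: list[str], vocab: dict[str, int]) -> list[int]:
    # Step 4: s'''_i is the index position of the i-th word t_i
    return [vocab[w] for w in t]

def pad(seq: list, max_len: int) -> list:
    # Step 5: pad with '0' up to max, the length of the longest short text
    return seq + [0] * (max_len - len(seq))
```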
Step 6, as shown in FIG. 1, use a pre-trained XLNet model to convert the j-th part of speech in the padded part-of-speech set p'' into an n-dimensional vector, obtaining the j-th part-of-speech vector, denoted p_j = (p_{j,1}, p_{j,2}, …, p_{j,d}, …, p_{j,n}), where p_{j,d} denotes the d-th dimension of the j-th part-of-speech vector and d ∈ [1, n];
Step 7, as shown in FIG. 1, use the pre-trained XLNet model to convert the j-th word in the padded index set s'' into an n-dimensional word vector, obtaining the j-th word vector, denoted s_j = (s_{j,1}, s_{j,2}, …, s_{j,d}, …, s_{j,n}), where s_{j,d} denotes the d-th dimension of the j-th word vector;
Step 8, select F aspect words from the corpus and input the f-th aspect word into the pre-trained XLNet model, obtaining, as shown in FIG. 1, the word vector of the f-th aspect word, a_f = (a_{f,1}, a_{f,2}, …, a_{f,d}, …, a_{f,n}), where a_{f,d} denotes the d-th dimension of the vector of the f-th aspect word;
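Steps 6 to 8 all map a token (a part-of-speech tag, a word, or an aspect word) to one n-dimensional vector with a pre-trained XLNet. A sketch using the Hugging Face transformers library; the checkpoint name hfl/chinese-xlnet-base and the mean-pooling over sub-word pieces are assumptions, since the patent only says "pre-trained XLNet model":

```python
import torch
from transformers import AutoTokenizer, AutoModel

# assumed Chinese XLNet checkpoint; any pre-trained XLNet fits the patent's description
tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-xlnet-base")
model = AutoModel.from_pretrained("hfl/chinese-xlnet-base")

def embed(token: str) -> torch.Tensor:
    """Return one n-dimensional vector for a word, POS tag, or aspect word."""
    inputs = tokenizer(token, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # mean-pool over sub-word pieces so each token yields a single vector
    return out.last_hidden_state.mean(dim=1).squeeze(0)  # shape (n,), n = 768 here
```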
Step 9, reverse the order of the elements in the part-of-speech set p''' and the index set s''' and pad them with '0' according to max, obtaining the padded reverse-order part-of-speech set p' and the padded reverse-order index set s';
Step 10, use the pre-trained XLNet model to convert the j-th part of speech in the padded reverse-order part-of-speech set p' into an n-dimensional vector, obtaining the j-th reverse-order part-of-speech vector, denoted p'_j = (p'_{j,1}, p'_{j,2}, …, p'_{j,d}, …, p'_{j,n}), where p'_{j,d} denotes the d-th dimension of the j-th reverse-order part-of-speech vector and d ∈ [1, n];
Step 11, use the pre-trained XLNet model to convert the j-th word in the padded reverse-order index set s' into an n-dimensional word vector, obtaining the j-th reverse-order word vector, denoted s'_j = (s'_{j,1}, s'_{j,2}, …, s'_{j,d}, …, s'_{j,n}), where s'_{j,d} denotes the d-th dimension of the j-th reverse-order word vector;
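The reversal in Step 9 is a plain sequence reversal applied before padding; a small sketch:

```python
def reverse_then_pad(seq: list, max_len: int) -> list:
    # Step 9: reverse the element order, then pad with '0' up to max
    rev = list(reversed(seq))
    return rev + [0] * (max_len - len(rev))
```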
Step 12, form the forward input features of the BiLSTM model:
concatenate the j-th word vector s_j, the j-th part-of-speech vector p_j, and the word vectors of the F aspect words in order, obtaining the feature vector fw_cell_j, which serves as the forward input of the BiLSTM; the dimension of fw_cell_j is 2×n + n×F. Compared with using only word vectors, or concatenating only two of the three kinds of vectors, concatenating the word vector, the part-of-speech vector, and the aspect vectors lets the original short text retain more semantic information.
Step 13, form the backward input features of the BiLSTM model:
concatenate the j-th reverse-order word vector s'_j, the j-th reverse-order part-of-speech vector p'_j, and the word vectors of the F aspect words in order, obtaining the reverse-order feature vector bw_cell_j, which serves as the backward input of the BiLSTM (through sequential input, the BiLSTM converts it into backward input automatically); the backward input helps the model recognize contextual information, so that context is taken into account while the short text is fed in;
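Steps 12 and 13 are plain concatenations; a sketch with NumPy, where s_j, p_j, and the aspect vectors a_1..a_F are assumed to be n-dimensional arrays from the XLNet step:

```python
import numpy as np

def forward_feature(s_j: np.ndarray, p_j: np.ndarray, aspects: list[np.ndarray]) -> np.ndarray:
    # fw_cell_j = [s_j ; p_j ; a_1 ; ... ; a_F], dimension 2*n + n*F
    return np.concatenate([s_j, p_j] + aspects)

n, F = 768, 3  # illustrative sizes
fw_cell_j = forward_feature(np.zeros(n), np.zeros(n), [np.zeros(n)] * F)
assert fw_cell_j.shape == (2 * n + n * F,)
# bw_cell_j is built the same way from s'_j and p'_j
```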
Step 14, build F correction matrices and identify aspect information:
let the f-th correction matrix be a matrix of (2+F) rows by 3 columns in which the element in row 1, column 1 is '1', the element in row 2, column 2 is '1', the element in row f+2, column 3 is '1', and all remaining elements are '0';
input the word set t = (t_1, t_2, …, t_i, …, t_k) into the pre-trained XLNet model to obtain the aspect information contained in the word set t;
obtain the corresponding aspect words from the aspect information contained in the word set t, and thereby obtain the corresponding correction matrices;
multiply the feature vector fw_cell_j by the correction matrix corresponding to the word set t to obtain the input vector I_j;
multiply the reverse-order feature vector bw_cell_j by the correction matrix corresponding to the word set t to obtain the reverse-order input vector I'_j;
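The correction matrix is a (2+F)×3 block selector: it keeps the word block, the part-of-speech block, and the f-th aspect block while dropping the other aspect blocks. The patent states the multiplication without fixing shapes, so the block-wise reading below (fw_cell_j viewed as 2+F blocks of n dimensions each) is one consistent interpretation; helper names are illustrative:

```python
import numpy as np

def correction_matrix(f: int, F: int) -> np.ndarray:
    # (2+F) x 3 selection matrix for the f-th aspect (f is 1-based, as in the text)
    M = np.zeros((2 + F, 3))
    M[0, 0] = 1.0      # row 1, column 1: keeps the word-vector block
    M[1, 1] = 1.0      # row 2, column 2: keeps the part-of-speech block
    M[f + 1, 2] = 1.0  # row f+2, column 3: keeps the f-th aspect block
    return M

def apply_correction(fw_cell: np.ndarray, M: np.ndarray, n: int) -> np.ndarray:
    # view the (2+F)*n concatenation as (2+F) blocks of n dims each,
    # select three of them, and flatten back into a 3*n input vector I_j
    blocks = fw_cell.reshape(-1, n).T   # shape (n, 2+F)
    return (blocks @ M).T.reshape(-1)   # shape (3*n,)
```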
Step 15, respectively training F Bilstm models by utilizing F side words to obtain F trained Bilstm models;
inputting vector I j Respectively inputting the input information into the trained Bilstm model of the corresponding aspect words, thereby obtaining output vectors hf corresponding to the aspect information contained in the word vector set t j
Inputting the reverse order into vector I' j Respectively inputting the input into the trained Bilstm model of the corresponding aspect words, thereby obtaining the reverse order output vector hb corresponding to the aspect information contained in the word vector set t j
Output vector hf j And the reverse order output vector hb j Splicing is carried out to form an implicit vector h j
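A sketch of the per-aspect BiLSTMs with PyTorch; note that nn.LSTM(bidirectional=True) consumes the sequence in both directions internally, so it plays the role of the forward and backward passes that the patent feeds with I_j and I'_j. The sizes are assumptions:

```python
import torch
import torch.nn as nn

n, F, hidden, max_len = 768, 3, 128, 40   # illustrative sizes
# one BiLSTM per aspect word, each taking the 3*n-dimensional corrected inputs
bilstms = nn.ModuleList(
    nn.LSTM(input_size=3 * n, hidden_size=hidden,
            batch_first=True, bidirectional=True)
    for _ in range(F)
)

I = torch.randn(1, max_len, 3 * n)  # corrected input vectors I_1..I_max for one text
h, _ = bilstms[0](I)                # h[0, j] concatenates the two directions,
                                    # i.e. the hidden vector h_j = [hf_j ; hb_j]
```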
Step 16, concatenate the hidden vector h_j, the j-th part-of-speech vector p_j, and the word vectors of the F aspect words in order, obtaining the attention-mechanism hidden vector h'_j;
Step 17, train an attention network with the word vectors of the F aspect words, obtaining a trained attention network (with the attention mechanism, the machine can decide which parts of the input matter and allocate its limited information-processing resources to the important parts);
input the attention-mechanism hidden vector h'_j into the trained attention network, obtaining the weight corresponding to the j-th word vector s_j;
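A minimal sketch of the attention scoring in Steps 16 and 17: each augmented hidden vector h'_j is scored and the scores are normalized into weights. The single linear scoring layer is an assumption; the patent does not fix the attention network's form:

```python
import torch
import torch.nn as nn

class AspectAttention(nn.Module):
    """Scores each h'_j and returns weights that sum to 1 over the sequence."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # assumed scoring layer

    def forward(self, h_prime: torch.Tensor) -> torch.Tensor:
        # h_prime: (seq_len, dim) -> weights: (seq_len,)
        return torch.softmax(self.score(h_prime).squeeze(-1), dim=0)
```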
Step 18, predict the classification result:
multiply each hidden vector h_j by its corresponding weight and sum the products to obtain h*; input h* into the fully connected layer to obtain the scores of the positive and negative emotions; input the scores of the positive and negative emotions into the softmax layer to obtain the probabilities of the positive and negative emotions, and take the emotion with the higher probability as the predicted classification result.
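Step 18 then reduces the weighted hidden vectors and classifies; a self-contained sketch with assumed sizes and random stand-ins for the upstream outputs:

```python
import torch
import torch.nn as nn

seq_len, hid = 40, 256                                 # assumed sizes; hid = dim of h_j
h = torch.randn(seq_len, hid)                          # hidden vectors h_1..h_seq_len
weights = torch.softmax(torch.randn(seq_len), dim=0)   # attention weights from Step 17

h_star = (weights.unsqueeze(-1) * h).sum(dim=0)        # h* = sum_j weight_j * h_j
fc = nn.Linear(hid, 2)                                 # fully connected layer -> 2 scores
probs = torch.softmax(fc(h_star), dim=-1)              # positive / negative probabilities
label = "positive" if probs[0] > probs[1] else "negative"
```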

Claims (1)

1. A short text aspect level emotion classification method is characterized by comprising the following steps:
step 1, acquire all short texts in the review data as a corpus, and apply classification, cleaning, and word-segmentation preprocessing to any short text in the corpus to obtain the word set of that short text, denoted t = (t_1, t_2, …, t_i, …, t_k), where t_i denotes the i-th word, i ∈ [1, k], and k denotes the total number of words in the short text;
step 2, perform part-of-speech tagging on the word set t = (t_1, t_2, …, t_i, …, t_k) to obtain the part-of-speech set p''' = (p'''_1, p'''_2, …, p'''_i, …, p'''_k), where p'''_i denotes the part of speech of the i-th word t_i;
step 3, preprocess all short texts in the corpus as in step 1 and delete repeated words to obtain a dictionary; number each word in the dictionary and take the number as the index key of that word;
step 4, use the index key of each word in the dictionary to index the word set t = (t_1, t_2, …, t_i, …, t_k), obtaining the index set s''' = (s'''_1, s'''_2, …, s'''_i, …, s'''_k), where s'''_i denotes the index position of the i-th word t_i;
step 5, record as max the total number of words in the short text of the corpus that contains the most words; pad the part-of-speech set p''' and the index set s''' with '0' according to max, so that both sets contain max entries; record the padded part-of-speech set as p'' and the padded index set as s'';
step 6, use a pre-trained XLNet model to convert the j-th part of speech in the padded part-of-speech set p'' into an n-dimensional vector, obtaining the j-th part-of-speech vector, denoted p_j = (p_{j,1}, p_{j,2}, …, p_{j,d}, …, p_{j,n}), where p_{j,d} denotes the d-th dimension of the j-th part-of-speech vector and d ∈ [1, n];
step 7, use the pre-trained XLNet model to convert the j-th word in the padded index set s'' into an n-dimensional word vector, obtaining the j-th word vector, denoted s_j = (s_{j,1}, s_{j,2}, …, s_{j,d}, …, s_{j,n}), where s_{j,d} denotes the d-th dimension of the j-th word vector;
step 8, select F aspect words from the corpus and input the f-th aspect word into the pre-trained XLNet model to obtain the word vector of the f-th aspect word, a_f = (a_{f,1}, a_{f,2}, …, a_{f,d}, …, a_{f,n}), where a_{f,d} denotes the d-th dimension of the vector of the f-th aspect word;
step 9, reverse the order of the elements in the part-of-speech set p''' and the index set s''' and pad them with '0' according to max, obtaining the padded reverse-order part-of-speech set p' and the padded reverse-order index set s';
step 10, use the pre-trained XLNet model to convert the j-th part of speech in the padded reverse-order part-of-speech set p' into an n-dimensional vector, obtaining the j-th reverse-order part-of-speech vector, denoted p'_j = (p'_{j,1}, p'_{j,2}, …, p'_{j,d}, …, p'_{j,n}), where p'_{j,d} denotes the d-th dimension of the j-th reverse-order part-of-speech vector and d ∈ [1, n];
step 11, use the pre-trained XLNet model to convert the j-th word in the padded reverse-order index set s' into an n-dimensional word vector, obtaining the j-th reverse-order word vector, denoted s'_j = (s'_{j,1}, s'_{j,2}, …, s'_{j,d}, …, s'_{j,n}), where s'_{j,d} denotes the d-th dimension of the j-th reverse-order word vector;
step 12, form the forward input features of the BiLSTM model:
concatenate the j-th word vector s_j, the j-th part-of-speech vector p_j, and the word vectors of the F aspect words in order, obtaining the feature vector fw_cell_j, which serves as the input of the forward BiLSTM; the dimension of fw_cell_j is 2×n + n×F;
step 13, form the backward input features of the BiLSTM model:
concatenate the j-th reverse-order word vector s'_j, the j-th reverse-order part-of-speech vector p'_j, and the word vectors of the F aspect words in order, obtaining the reverse-order feature vector bw_cell_j, which serves as the other input, for the backward BiLSTM;
step 14, build F correction matrices and identify aspect information:
let the f-th correction matrix be a matrix of (2+F) rows by 3 columns in which the element in row 1, column 1 is '1', the element in row 2, column 2 is '1', the element in row f+2, column 3 is '1', and all remaining elements are '0';
input the word set t = (t_1, t_2, …, t_i, …, t_k) into the pre-trained XLNet model to obtain the aspect information contained in the word set t;
obtain the corresponding aspect words from the aspect information contained in the word set t, and thereby obtain the corresponding correction matrices;
multiply the feature vector fw_cell_j by the correction matrix corresponding to the word set t to obtain the input vector I_j;
multiply the reverse-order feature vector bw_cell_j by the correction matrix corresponding to the word set t to obtain the reverse-order input vector I'_j;
Step 15, respectively training F Bilstm models by utilizing F side words to obtain F trained Bilstm models;
inputting the vector I j Respectively input corresponding toObtaining an output vector hf corresponding to the aspect information contained in the word vector set t in the trained Bilstm model of the aspect word j
Inputting the reverse order into vector I' j Respectively inputting the training result into the Bilstm model of the corresponding aspect words so as to obtain the reverse order output vector hb corresponding to the aspect information contained in the word vector set t j
Output vector hf j And the reverse order output vector hb j Splicing is carried out to form an implicit vector h j
Step 16, converting the implicit vector h j The jth part-of-speech vector p j And sequentially splicing word vectors corresponding to the F aspect words to obtain an implicit vector h 'of an Attention mechanism' j
Step 17, training an Attention mechanism network by using word vectors corresponding to F number of aspect words to obtain a trained Attention mechanism network;
implicit vector h 'of Attentition mechanism' j Inputting the trained Attention mechanism network to obtain the jth word vector s j A corresponding weight;
step 18, predicting classification results:
respectively dividing each implicit vector h j Multiplying the obtained product by corresponding weight and summing the product to obtain h * Inputting the scores into the full-connection layer to obtain scores corresponding to positive emotions and negative emotions; and inputting the scores corresponding to the positive emotion and the negative emotion into the softmax layer to obtain the probabilities corresponding to the positive emotion and the negative emotion, and taking the emotion with higher probability as a prediction classification result.
Application CN202011277713.4A, priority date 2020-11-16, filing date 2020-11-16: Short text aspect level emotion classification method. Granted as CN112347258B (Active).

Priority Applications (1)

Application Number: CN202011277713.4A; Priority Date: 2020-11-16; Filing Date: 2020-11-16; Title: Short text aspect level emotion classification method


Publications (2)

CN112347258A, published 2021-02-09
CN112347258B, published 2022-09-13

Family

ID=74362772

Family Applications (1)

CN202011277713.4A (Active, granted as CN112347258B), priority date 2020-11-16, filing date 2020-11-16: Short text aspect level emotion classification method

Country Status (1): CN (CN112347258B)

Families Citing this family (1)

CN113761910A (filed 2021-03-17, published 2021-12-07), 中科天玑数据科技股份有限公司: Comment text fine-grained emotion analysis method integrating emotional characteristics (cited by examiner)


Family Cites Families (1)

CN110083705B (priority 2019-05-06, published 2021-11-02), University of Electronic Science and Technology of China: Multi-hop attention depth model, method, storage medium and terminal for target emotion classification (cited by examiner)

Patent Citations (3)

WO2019102884A1 (priority 2017-11-21, published 2019-05-31), Nippon Telegraph and Telephone Corporation: Label generation device, model learning device, emotion recognition device, and method, program, and storage medium for said devices (cited by examiner)
CN111666761A (priority 2020-05-13, published 2020-09-15), Peking University: Fine-grained emotion analysis model training method and device (cited by examiner)
CN111914085A (priority 2020-06-18, published 2020-11-10), South China University of Technology: Text fine-grained emotion classification method, system, device and storage medium (cited by examiner)

Non-Patent Citations (2)

Han Zhang et al., "Sentiment Classification for Chinese Text Based on Interactive Multitask Learning", IEEE Access, vol. 8, 2020-07-08, pp. 1-10 (cited by examiner)
Shi Zhenjie et al., "Sentiment Analysis of E-commerce Reviews Based on BiLSTM-Attention" (基于BiLSTM-Attention的电商评论情感分析), Journal of the Hebei Academy of Sciences (河北省科学院学报), 2020-06-30, pp. 12-19 (cited by examiner)

Also Published As

CN112347258A, published 2021-02-09

Similar Documents

Publication number and title
CN109241255B (en) Intention identification method based on deep learning
CN110059188B (en) Chinese emotion analysis method based on bidirectional time convolution network
CN110609891A (en) Visual dialog generation method based on context awareness graph neural network
CN110990543A (en) Intelligent conversation generation method and device, computer equipment and computer storage medium
CN111209401A (en) System and method for classifying and processing sentiment polarity of online public opinion text information
CN108597541A A speech emotion recognition method and system with enhanced recognition of anger and happiness
CN110647612A (en) Visual conversation generation method based on double-visual attention network
CN111368086A CNN-BiLSTM + attention model-based sentiment classification method for case-involved news viewpoint sentences
CN111709242B (en) Chinese punctuation mark adding method based on named entity recognition
WO2023024412A1 (en) Visual question answering method and apparatus based on deep learning model, and medium and device
CN114238577B (en) Multi-task learning emotion classification method integrating multi-head attention mechanism
CN113065344A (en) Cross-corpus emotion recognition method based on transfer learning and attention mechanism
CN112287106A (en) Online comment emotion classification method based on dual-channel hybrid neural network
CN113361617A (en) Aspect level emotion analysis modeling method based on multivariate attention correction
Chou et al. Exploiting annotators’ typed description of emotion perception to maximize utilization of ratings for speech emotion recognition
CN112800184A (en) Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction
CN115630156A (en) Mongolian emotion analysis method and system fusing Prompt and SRU
Zhang et al. Multi-modal emotion recognition based on deep learning in speech, video and text
CN112347258B (en) Short text aspect level emotion classification method
CN114742047A (en) Text emotion recognition method based on maximum probability filling and multi-head attention mechanism
CN114722835A (en) Text emotion recognition method based on LDA and BERT fusion improved model
CN114360584A (en) Phoneme-level-based speech emotion layered recognition method and system
CN113239690A Chinese text intention identification method based on integration of BERT and a fully-connected neural network
Yuan A Classroom Emotion Recognition Model Based on a Convolutional Neural Network Speech Emotion Algorithm
CN111737467A (en) Object-level emotion classification method based on segmented convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant