CN115309894A - Text emotion classification method and device based on adversarial training and TF-IDF - Google Patents

Text emotion classification method and device based on adversarial training and TF-IDF

Info

Publication number
CN115309894A
Authority
CN
China
Prior art keywords
text
word
training
idf
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210818922.8A
Other languages
Chinese (zh)
Inventor
沈志东
袁芙蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University (WHU)
Priority to CN202210818922.8A
Publication of CN115309894A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G06F 16/353 - Clustering; Classification into predefined classes
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/205 - Parsing
    • G06F 40/216 - Parsing using statistical methods
    • G06F 40/30 - Semantic analysis
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a text emotion classification method and device based on adversarial training and TF-IDF. For the text emotion classification task, it addresses two aspects, generating interpretable adversarial samples and extracting text features, so that the interpretability of the adversarial samples is improved while the classification accuracy of the deep neural network model is improved. For the problem of generating interpretable adversarial samples during model training, the invention applies adversarial training at the BERT embedding layer and normalizes the perturbation direction of each text word to the direction of an existing neighboring word in the embedding space, improving the interpretability of the adversarial samples. For extracting more text features, the invention obtains additional text features with an attention mechanism and an improved TF-IDF algorithm, so that the model obtains a more comprehensive and accurate picture of the text's emotional tendency.

Description

Text emotion classification method and device based on adversarial training and TF-IDF
Technical Field
The invention relates to the technical field of the internet, in particular to a text emotion classification method and device based on adversarial training and TF-IDF.
Background
Sentiment analysis is an important research direction in the field of natural language processing; it focuses not on the topical meaning of a sentence or paragraph but on the opinions and attitudes expressed in it. Sentiment analysis has wide application in a variety of fields, from tracking users' opinions about products or current events on social media platforms to predicting public behavior and improving decisions. With the explosive development of artificial intelligence technology, online review websites, personal blogs and social media platforms rich in personal opinions are increasingly popular, presenting new opportunities and challenges for sentiment analysis, as people can now seek out and understand others' opinions using information technology. In the big data era of information explosion, massive amounts of data are generated on social network platforms every minute, and this social media information contains rich emotional knowledge. Using artificial intelligence to extract people's opinions and the emotions hidden in the text information published by users plays an important role in a wide range of applications, such as public opinion monitoring, public behavior prediction and recommendation systems.
Computer software and hardware have been upgraded generation after generation, new chips emerge endlessly, and processing power is incomparably greater than before; deep learning has thus gradually entered the public view and come into wide use. Annotating text no longer consumes a large amount of manual labor, as deep learning can adaptively mine emotional features and contextual emotional information from the text itself. However, a deep learning model can be made to output an erroneous recognition result merely by slightly perturbing its original input.
To deal with the security threats posed by adversarial attacks, the conventional defense method adds adversarial perturbations during the training of the deep learning model and trains the generated textual adversarial samples together with the original samples, so that the model learns from the adversarial samples and thereby improves its generalization ability and robustness. Two methods are commonly used in natural language processing to generate adversarial samples: text-based and gradient-based.
In the course of implementing the present invention, the inventors found that the prior-art methods have at least the following technical problems:
compared with the gradient-based approach, the text-based approach generates adversarial samples by replacing words or characters in the original sample; it has higher interpretability but offers less attack diversity and relies more on human knowledge, which limits the diversity of the attack patterns. In contrast, the gradient-based approach optimizes the model parameters during training by injecting small gradient-computed perturbations into the word embedding space and letting them participate in training. Although adding perturbations in the input word embedding space improves performance on natural language processing tasks, the generated adversarial sample is likely to correspond to a word that does not exist in the corpus; that is, the interpretability of the generated adversarial sample is reduced to a certain extent.
Therefore, the prior-art methods cannot improve the interpretability of the generated adversarial samples while ensuring classification accuracy.
Disclosure of Invention
The invention aims to improve the interpretability of the adversarial samples generated in a text emotion classification task while improving emotion classification accuracy; the interpretability of the generated adversarial samples is improved by normalizing the adversarial perturbations to existing words in the embedding space. For the problem of improving the interpretability of generated adversarial samples while ensuring model classification accuracy, the invention adds the adversarial perturbation to the embedding space during BERT fine-tuning and restricts the direction of the adversarial perturbation to the directions of the neighboring words of the original text, where the neighboring words are obtained with BERT's masked language model; for extracting additional text features, the method takes the classification information already present in the training set into account in the TF-IDF algorithm so as to extract deeper text features, thereby improving the performance of the final model.
The technical scheme adopted by the invention is as follows:
the first aspect provides a text emotion classification method based on countermeasure training and TF-IDF, which comprises the following steps:
s1: acquiring an original text, and dividing a training set from the original text;
s2: preprocessing a training set;
s3: constructing a text emotion classification model based on adversarial training and TF-IDF, wherein the model comprises an embedding module, a first feature extraction module, a second feature extraction module and a prediction module; the embedding module uses adversarial training to generate adversarial text and combines the adversarial text with the input text as the input word vector representation; the first feature extraction module uses a bidirectional long short-term memory (Bi-LSTM) network model, takes the output of the embedding module as input to extract text features, and then weights the hidden states of the Bi-LSTM model with an attention mechanism, assigning a different weight to each hidden state; the second feature extraction module uses an improved TF-IDF algorithm to obtain, for the input words, the frequency weight of each word in each category, and a Bi-LSTM network further extracts text features; the prediction module obtains the classification result from the outputs of the first and second feature extraction modules;
s4: training the text emotion classification model based on the preprocessed training set to obtain a trained text emotion classification model;
s5: and performing text emotion classification by using the trained text emotion classification model.
In one embodiment, step S2 comprises:
s2.1: obtaining an emotion category array according to the divided training set;
s2.2: obtaining a word library and the occurrence frequency of each word in each emotion category from the training set;
s2.3: and traversing the training set, converting the words into indices in the word library, and parsing each sentence to obtain the word vector, the segment vector and the position encoding vector of the sentence.
In one embodiment, in the model constructed in step S3, the processing procedure of the embedding module includes:
taking a vector obtained after preprocessing an original text as the input of a BERT pre-training model, wherein the vector obtained after preprocessing the original text comprises a word vector, a segment vector and a position coding vector of a sentence;
predicting, with the BERT pre-trained model, the probability of each vocabulary word at the corresponding position in a sentence, specifically: finding the top K neighboring word vectors of each word in the sentence, denoted K_t;

obtaining the direction vector d_(t,k) between each word and each of its K neighboring word vectors, and obtaining the adversarial perturbation r_t for word vector x_t:

$$r_t = \sum_{k=1}^{|K|} \alpha_{(t,k)}\, d_{(t,k)}$$

where t indexes the vector of the t-th word in the sentence, k indexes the k-th neighboring word vector, |K| is the number of neighboring word vectors, and α_(t,k) is the weight of the perturbation of word t toward neighboring word k;
adding the perturbation vector to the word vectors of the sentence to obtain adversarial word vectors as the adversarial sample, calculated as:

$$X_{+r} = (x_1 + r_1,\ x_2 + r_2,\ \ldots,\ x_T + r_T)$$

where x_t is the vector of the t-th word in the sentence, T is the number of word vectors in the sentence, and X_{+r} is the adversarial sample; the optimal perturbation direction is found through the gradient:

$$g_t = \nabla_{\alpha_t}\, \ell(X_{+r}, Y; \Theta)$$

$$\hat{\alpha} = \epsilon \cdot \frac{g}{\lVert g \rVert_2}$$

where X_{+r} is the input, Y is the label, Θ denotes the model parameters, ℓ(X_{+r}, Y; Θ) is the loss function for a single example, ε is the perturbation threshold, g_t is the gradient with respect to α_t, and g is the concatenation of all g_t;
and inputting the word vectors obtained by preprocessing the original text, together with the adversarial sample, into the BERT encoder to obtain word vector representations containing contextual semantics.
In one embodiment, in the model constructed in step S3, the processing procedure of the first feature extraction module includes:
feeding the output of the embedding module into the Bi-LSTM model and extracting text feature information from both directions of each word feature vector;

and weighting the hidden state of each step of the Bi-LSTM model with an attention mechanism, assigning a different weight to the hidden state at each time step, and outputting the weighted sum of the hidden states at all time steps as the final feature vector.
In one embodiment, in the model constructed in step S3, the processing procedure of the second feature extraction module includes:
the occurrence-frequency TF value of word w_t is obtained from the following equation:

$$TF(w_t, C_j) = \frac{N_{w_t}^{C_j}}{N^{C_j}}$$

where N_{w_t}^{C_j} is the number of occurrences of word w_t under emotion label C_j, N^{C_j} is the total number of words under emotion label C_j, and TF(w_t, C_j) is the occurrence-frequency TF value of word w_t;
the existing emotion classification information is taken into account in the IDF calculation to realize a supervised TF-IDF algorithm, specifically: counting the number n_{w_t} of emotion categories that contain word w_t and calculating the IDF value of the word:

$$IDF(w_t) = \log\frac{|k|}{n_{w_t} + 1}$$

where |k| is the number of all emotion categories;
the TF-IDF value is then calculated from the TF value and the IDF value of word w_t:

TF-IDF = TF × IDF
the TF-IDF value is used to represent the frequency weight of the occurrence of each word in each category;
and feeding the feature vectors expressed by the TF-IDF values into the Bi-LSTM network for further feature extraction to obtain the text vector representation.
In one embodiment, the prediction module uses a softmax function to compute the probability that the text falls into each category, thereby obtaining the final classification result.
In one embodiment, during the training of S4, errors are propagated to each layer of the model using gradient descent and back-propagation so that the parameter values of the model are adjusted according to the errors, and training is iterated until an optimal solution is finally obtained; the final classification loss $\mathcal{L}$ is calculated with a cross-entropy loss function:

$$\mathcal{L} = -\sum_{i=1}^{D}\sum_{j=1}^{C} q_{(i,j)} \log p_{(i,j)} + \mathcal{L}_{adv}$$

where D is the training-set size, C is the number of classes, q_(i,j) indicates whether the sample belongs to the current category, log p_(i,j) is the log-probability that feature word i is predicted as category j, and $\mathcal{L}_{adv}$ is the text adversarial loss.
Based on the same inventive concept, the second aspect of the present invention provides a text emotion classification apparatus based on adversarial training and TF-IDF, comprising:
the data set acquisition module is used for acquiring an original text and dividing a training set from the original text;
the preprocessing module is used for preprocessing the training set;
the model building module is used for building a text emotion classification model based on adversarial training and TF-IDF, comprising an embedding module, a first feature extraction module, a second feature extraction module and a prediction module; the embedding module uses adversarial training to generate adversarial text and combines the adversarial text with the input text as the input word vector representation; the first feature extraction module uses a Bi-LSTM model, takes the output of the embedding module as input to extract text features, and then weights the hidden states of the Bi-LSTM model with an attention mechanism, assigning a different weight to each hidden state; the second feature extraction module uses an improved TF-IDF algorithm to obtain, for the input words, the frequency weight of each word in each category, and a Bi-LSTM network further extracts features; the prediction module obtains the classification result from the outputs of the first and second feature extraction modules;
the training module is used for training the text emotion classification model based on the preprocessed training set to obtain a trained text emotion classification model;
and the classification module is used for performing text emotion classification by using the trained text emotion classification model.
Based on the same inventive concept, a third aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method of the first aspect.
Based on the same inventive concept, a fourth aspect of the present invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of the first aspect when executing the program.
Compared with the prior art, the invention has the following advantages and beneficial technical effects:
the invention provides a text emotion classification method based on confrontation training and TF-IDF, which constructs a text emotion classification model based on the confrontation training and TF-IDF, wherein the model comprises an embedding module, a first feature extraction module, a second feature extraction module and a prediction module. In order to improve the generalization capability of the model, the embedded module adds the confrontation training in the embedded layer of the BERT model. Secondly, in order to improve the interpretability of the generated countermeasure samples, the invention provides that the countermeasure disturbance is limited to the direction of the adjacent words of the original words in the corpus, wherein the adjacent words are obtained by BERT prediction with a mask language model head. In the second feature extraction module, the invention adds known classification information into the calculation of IDF by improving TF-IDF algorithm to obtain the probability of the feature words appearing in each classification category, and improves the unsupervised TF-IDF algorithm into the supervised TF-IDF algorithm; the first feature extraction module solves the problem of long dependence in the text by using an attention mechanism and a Bi-LSTM model, and designs a dual-channel feature fusion model so as to further optimize the text emotion classification model, so that the method provided by the invention can improve the interpretability of the text countermeasures while ensuring the classification accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart of a text emotion classification method based on adversarial training and TF-IDF according to an embodiment of the present invention;

FIG. 2 is an architecture diagram of the overall model (the text emotion classification model based on adversarial training and TF-IDF) fusing adversarial training and the improved TF-IDF algorithm in an embodiment of the present invention;

FIG. 3 is an architecture diagram of the model that introduces adversarial training at the BERT embedding layer in an embodiment of the present invention.
Detailed Description
For the text emotion classification task, the present method studies two angles, generating interpretable adversarial samples and extracting text features, so that the interpretability of the adversarial samples is improved while the classification accuracy of the deep neural network model is improved. For the problem of generating interpretable adversarial samples during model training, the invention applies adversarial training at the BERT embedding layer and normalizes the perturbation direction of each text word to the direction of an existing neighboring word in the embedding space, thereby improving the interpretability of the adversarial samples; for extracting more text features, the invention obtains additional text features with an attention mechanism and an improved TF-IDF algorithm, so that the model obtains a more comprehensive and accurate picture of the text's emotional tendency.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Example one
The embodiment of the invention provides a text emotion classification method based on adversarial training and TF-IDF, comprising the following steps:
s1: acquiring an original text, and dividing a training set from the original text;
s2: preprocessing a training set;
s3: constructing a text emotion classification model based on adversarial training and TF-IDF, wherein the model comprises an embedding module, a first feature extraction module, a second feature extraction module and a prediction module; the embedding module uses adversarial training to generate adversarial text and combines the adversarial text with the input text as the input word vector representation; the first feature extraction module uses a bidirectional long short-term memory (Bi-LSTM) network model, takes the output of the embedding module as input to extract text features, and then weights the hidden states of the Bi-LSTM model with an attention mechanism, assigning a different weight to each hidden state; the second feature extraction module uses an improved TF-IDF algorithm to obtain, for the input words, the frequency weight of each word in each category, and a Bi-LSTM network further extracts text features; the prediction module obtains the classification result from the outputs of the first and second feature extraction modules;
s4: training the text emotion classification model based on the preprocessed training set to obtain a trained text emotion classification model;
s5: and performing text emotion classification by using the trained text emotion classification model.
Please refer to FIG. 1, which is a flowchart of the text emotion classification method based on adversarial training and TF-IDF in an embodiment of the present invention.
Specifically, step S1 is data acquisition and step S2 is data preprocessing, which includes performing data cleansing on the input text; for convenience of processing, the original unstructured text must be converted into a word vector representation for the deep neural network.
In step S3 the model is constructed; the embedding module uses the BERT model for pre-training and converts the preprocessed word vectors plus the adversarial vectors (text) into word vector representations containing contextual semantics. Specifically, adversarial training is applied at the BERT embedding layer to generate adversarial text, which is combined with the preprocessed text as the input word vector representation, improving the robustness and generalization ability of the deep neural network model. The specific model structure is shown in FIG. 3. During the processing of this embedding module, an original sequence containing N words (w_1, w_2, …, w_N) is converted into X = (x_1, x_2, …, x_N), where $x_i \in \mathbb{R}^D$ is the D-dimensional vector of the i-th feature word, X is the word vector matrix used as subsequent input, and w_1, w_N are the 1st and N-th words of the original sequence.
The first feature extraction module feeds the output of the embedding module into the Bi-LSTM model and extracts text information from both directions of each word feature vector, so that the preceding and following order information of each feature word is preserved, further alleviating the problem that long-distance features cannot be retained in natural language processing. The hidden states of the Bi-LSTM model are then weighted with an attention mechanism, each hidden state being given a different weight.
The second feature extraction module introduces known classification information into the calculation of the traditional TF-IDF algorithm, turning the unsupervised TF-IDF algorithm into a supervised one. For the input words (w_1, w_2, …, w_N), the improved TF-IDF algorithm is used to weight the frequency of each word's occurrence in each category, and a Bi-LSTM network further extracts features.
The prediction module is a fully connected layer; it concatenates the outputs of the first and second feature extraction modules as its input and then uses a softmax function to compute the final classification result, the softmax function giving the probability that the text is classified into each class.
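By way of illustration, the following is a minimal sketch of this fusion-and-softmax step; the class and parameter names (FusionClassifier, feat_dim) are assumptions introduced here, not names taken from the patent.

```python
# Minimal sketch of the dual-channel prediction module; names are illustrative
# assumptions, not the patent's reference implementation.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        # Fully connected layer over the concatenated two-channel features.
        self.fc = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        # f1, f2: (batch, feat_dim) outputs of the two feature extraction modules.
        logits = self.fc(torch.cat([f1, f2], dim=-1))
        return torch.softmax(logits, dim=-1)  # probability of each emotion class
```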
The model architecture is described in detail with reference to FIG. 2.
Step S4 is the training of the model. The model is trained by minimizing the model loss function; during training, errors are propagated to each layer of the model using gradient descent and back-propagation, so that the parameter values of the model are adjusted according to the errors, iterating until an optimal solution is finally obtained.
Step S5 is a specific application of the model.
In one embodiment, step S2 comprises:
s2.1: obtaining an emotion category array according to the divided training set;
s2.2: obtaining a word library and the occurrence frequency of each word in each emotion category from the training set;
s2.3: and traversing the training set, converting the words into indices in the word library, and parsing each sentence to obtain the word vector (Token Embeddings), segment vector (Segment Embeddings) and position encoding vector (Position Embeddings) of the sentence.
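The following is a minimal sketch of steps S2.1 to S2.3, assuming the training set is already tokenized; the function and variable names are illustrative, and a real implementation would use the BERT tokenizer rather than a raw token-level vocabulary.

```python
# Hedged sketch of preprocessing (S2.1-S2.3); names and the single-sentence
# segment layout are assumptions, not the patent's reference code.
from collections import Counter, defaultdict

def preprocess(train_set):
    """train_set: list of (token_list, emotion_label) pairs."""
    # S2.1: emotion category array from the divided training set.
    categories = sorted({label for _, label in train_set})
    # S2.2: word library and the occurrence count of each word per emotion category.
    counts = defaultdict(Counter)
    for tokens, label in train_set:
        counts[label].update(tokens)
    vocab = {w: i for i, w in enumerate(sorted({t for ts, _ in train_set for t in ts}))}
    # S2.3: convert each sentence into token, segment, and position index vectors.
    encoded = []
    for tokens, label in train_set:
        token_ids = [vocab[t] for t in tokens]    # Token Embedding indices
        segment_ids = [0] * len(tokens)           # Segment Embeddings (one sentence)
        position_ids = list(range(len(tokens)))   # Position Embedding indices
        encoded.append((token_ids, segment_ids, position_ids, label))
    return categories, vocab, counts, encoded
```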
In one embodiment, in the model constructed in step S3, the processing procedure of the embedding module includes:
taking a vector obtained after preprocessing an original text as the input of a BERT pre-training model, wherein the vector obtained after preprocessing the original text comprises a word vector, a segment vector and a position coding vector of a sentence;
predicting, with the BERT pre-trained model, the probability of each vocabulary word at the corresponding position in a sentence, specifically: finding the top K neighboring word vectors of each word in the sentence, denoted K_t;

obtaining the direction vector d_(t,k) between each word and each of its K neighboring word vectors, and obtaining the adversarial perturbation r_t for word vector x_t:

$$r_t = \sum_{k=1}^{|K|} \alpha_{(t,k)}\, d_{(t,k)}$$

where t indexes the vector of the t-th word in the sentence, k indexes the k-th neighboring word vector, |K| is the number of neighboring word vectors, and α_(t,k) is the weight of the perturbation of word t toward neighboring word k;
adding the perturbation vector to the word vectors of the sentence to obtain adversarial word vectors as the adversarial sample, calculated as:

$$X_{+r} = (x_1 + r_1,\ x_2 + r_2,\ \ldots,\ x_T + r_T)$$

where x_t is the vector of the t-th word in the sentence, r_t is the adversarial perturbation, T is the number of word vectors in the sentence, and X_{+r} is the adversarial sample; the optimal perturbation direction is found through the gradient:

$$g_t = \nabla_{\alpha_t}\, \ell(X_{+r}, Y; \Theta)$$

$$\hat{\alpha} = \epsilon \cdot \frac{g}{\lVert g \rVert_2}$$

where X_{+r} is the input, Y is the label, Θ denotes the model parameters, ℓ(X_{+r}, Y; Θ) is the loss function for a single example, ε is the perturbation threshold, g_t is the gradient with respect to α_t, and g is the concatenation of all g_t;
and inputting the word vectors obtained by preprocessing the original text, together with the adversarial sample, into the BERT encoder to obtain word vector representations containing contextual semantics.
Specifically, inputting the vectors obtained after preprocessing the original text into the BERT pre-training model means summing the word vectors, the segment vectors and the position encoding vectors and feeding the sum into the BERT pre-training model. The embedding module inputs the original word vectors (the word vectors obtained by preprocessing the original text) and the adversarial sample into the BERT encoder to obtain word vector representations containing contextual semantics, thereby improving the robustness and generalization ability of the deep neural network model.
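A sketch of the neighbor-constrained perturbation is given below, under two stated assumptions: cosine similarity in the embedding space stands in for the masked-language-model neighbor search that the patent uses, and names such as embed_matrix and loss_fn are hypothetical.

```python
# Sketch of the neighbor-constrained perturbation r_t = sum_k alpha_(t,k) d_(t,k).
# Assumptions: cosine similarity replaces the BERT masked-LM neighbor search, and
# embed_matrix / loss_fn are hypothetical names, not the patent's own code.
import torch
import torch.nn.functional as F

def neighbor_constrained_perturbation(x, embed_matrix, loss_fn, y, K=10, eps=0.02):
    """x: (T, D) word vectors of one sentence; embed_matrix: (V, D) vocabulary table."""
    # Top-K neighboring word vectors of each word (denoted K_t in the text).
    sims = F.normalize(x, dim=-1) @ F.normalize(embed_matrix, dim=-1).T    # (T, V)
    neighbors = embed_matrix[sims.topk(K, dim=-1).indices]                 # (T, K, D)
    # Unit direction d_(t,k) from each word toward each of its neighbors.
    d = F.normalize(neighbors - x.unsqueeze(1), dim=-1)                    # (T, K, D)
    # Differentiable weights alpha_(t,k); perturbation r_t = sum_k alpha_(t,k) d_(t,k).
    alpha = torch.zeros(x.size(0), K, requires_grad=True)
    r = (alpha.unsqueeze(-1) * d).sum(dim=1)                               # (T, D)
    # g_t is the gradient of the per-example loss with respect to alpha_t.
    g = torch.autograd.grad(loss_fn(x + r, y), alpha)[0]
    # Normalize all g_t jointly and scale by the threshold epsilon.
    alpha_hat = eps * g / (g.norm() + 1e-12)
    return (alpha_hat.unsqueeze(-1) * d).sum(dim=1).detach()               # adversarial r
```

Because each r_t is a weighted sum of directions toward real vocabulary words, the perturbed vector stays interpretable as a movement toward existing neighbors, which is the point of the constraint.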
In one embodiment, in the model constructed in step S3, the processing procedure of the first feature extraction module includes:
feeding the output of the embedding module into the Bi-LSTM model and extracting text feature information from both directions of each word feature vector;

and weighting the hidden state of each step of the Bi-LSTM model with an attention mechanism, assigning a different weight to the hidden state at each time step, and outputting the weighted sum of the hidden states at all time steps as the final feature vector.
Through the bidirectional long short-term memory (Bi-LSTM) model, text information can be extracted from both directions of each word feature vector, and the preceding and following order information of each feature word can be preserved, further alleviating the problem that long-distance features cannot be retained in natural language processing.
Bi-LSTM can solve the long-term dependency problem to some extent, but because it acquires sequence information step by step, it is difficult to preserve all useful information for long texts; therefore an attention mechanism is used behind the Bi-LSTM model to weight the hidden states of each step, giving each hidden state a different weight. Instead of using only the output of the last time step as the feature vector, the weight of each time step is calculated with the attention mechanism, and the weighted sum of the hidden states at all time steps is output as the final feature vector.
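A minimal sketch of this channel follows; the layer names (AttentiveBiLSTM, score) are illustrative assumptions.

```python
# Minimal sketch of the first feature extraction channel: Bi-LSTM hidden states
# weighted by an attention layer; names are assumptions, not the patent's code.
import torch
import torch.nn as nn

class AttentiveBiLSTM(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.bilstm = nn.LSTM(input_dim, hidden_dim,
                              bidirectional=True, batch_first=True)
        self.score = nn.Linear(2 * hidden_dim, 1)  # one attention score per step

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.bilstm(x)                                    # (B, T, 2*hidden_dim)
        weights = torch.softmax(self.score(h).squeeze(-1), -1)   # weight per time step
        return (weights.unsqueeze(-1) * h).sum(dim=1)            # weighted sum of states
```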
In one embodiment, in the model constructed in step S3, the processing procedure of the second feature extraction module includes:
the occurrence-frequency TF value of word w_t is obtained from the following equation:

$$TF(w_t, C_j) = \frac{N_{w_t}^{C_j}}{N^{C_j}}$$

where N_{w_t}^{C_j} is the number of occurrences of word w_t under emotion label C_j, N^{C_j} is the total number of words under emotion label C_j, and TF(w_t, C_j) is the occurrence-frequency TF value of word w_t;
the existing emotion classification information is taken into account in the IDF calculation to realize a supervised TF-IDF algorithm, specifically: counting the number n_{w_t} of emotion categories that contain word w_t and calculating the IDF value of the word:

$$IDF(w_t) = \log\frac{|k|}{n_{w_t} + 1}$$

where |k| is the number of all emotion categories;
the TF-IDF value is then calculated from the TF value and the IDF value of word w_t:

TF-IDF = TF × IDF
the TF-IDF value is used to represent the frequency weight of the occurrence of each word in each category;
and feeding the feature vectors expressed by the TF-IDF values into the Bi-LSTM network for further feature extraction to obtain the text vector representation.
Specifically, the second feature extraction module introduces known classification information into the calculation of the traditional TF-IDF algorithm, turning the unsupervised TF-IDF algorithm into a supervised one. For the input words (w_1, w_2, …, w_N), the improved TF-IDF algorithm is used to obtain the frequency weight of each word's occurrence in each category, i.e. the TF-IDF value. The feature vectors expressed by the TF-IDF values are fed into the Bi-LSTM network for model training to obtain the text vector representation.
After the TF and IDF values have been obtained, the frequency of a word's occurrence within an emotion category and its inverse document frequency are considered simultaneously: the TF-IDF value is positively correlated with the word's occurrence-frequency TF value within the emotion category and negatively correlated with the number n of emotion categories containing the word, from which the formula for the TF-IDF value is obtained.
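The following sketch computes these supervised TF-IDF weights from per-category counts; the +1 smoothing in the IDF denominator and the function name are assumptions for illustration.

```python
# Sketch of the supervised TF-IDF: TF uses per-category counts, IDF uses the
# number of emotion categories containing the word (+1 smoothing is an assumption).
import math
from collections import Counter, defaultdict

def supervised_tf_idf(train_set):
    """train_set: list of (token_list, emotion_label) pairs."""
    counts = defaultdict(Counter)                  # word counts per emotion label
    for tokens, label in train_set:
        counts[label].update(tokens)
    labels = list(counts)
    weights = {}
    for c in labels:
        total = sum(counts[c].values())            # N^{C_j}: total words under label c
        for w, n in counts[c].items():
            tf = n / total                         # TF(w_t, C_j)
            n_cat = sum(1 for d in labels if w in counts[d])
            idf = math.log(len(labels) / (1 + n_cat))
            weights[(w, c)] = tf * idf             # frequency weight of w in category c
    return weights
```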
In one embodiment, the prediction module uses a softmax function to compute the probability that the text falls into each category, thereby obtaining the final classification result.
In one embodiment, during the training of S4, errors are propagated to each layer of the model using gradient descent and back-propagation so that the parameter values of the model are adjusted according to the errors, and training is iterated until an optimal solution is finally obtained; the final classification loss $\mathcal{L}$ is calculated with a cross-entropy loss function:

$$\mathcal{L} = -\sum_{i=1}^{D}\sum_{j=1}^{C} q_{(i,j)} \log p_{(i,j)} + \mathcal{L}_{adv}$$

where D is the training-set size, C is the number of classes, q_(i,j) indicates whether the sample belongs to the current category, log p_(i,j) is the log-probability that feature word i is predicted as category j, and $\mathcal{L}_{adv}$ is the text adversarial loss.
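As a hedged sketch of one training step with this combined loss (the model interface, which accepts either token ids or perturbed embeddings, is an assumption, not the patent's API):

```python
# Hedged sketch of one training step with loss = CE(clean) + L_adv(adversarial).
import torch
import torch.nn.functional as F

def train_step(model, optimizer, token_ids, labels, adv_embeds):
    optimizer.zero_grad()
    clean_loss = F.cross_entropy(model(token_ids), labels)   # classification loss
    adv_loss = F.cross_entropy(model(adv_embeds), labels)    # text adversarial loss
    loss = clean_loss + adv_loss
    loss.backward()      # back-propagate the error to every layer of the model
    optimizer.step()     # adjust parameter values according to the error
    return loss.item()
```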
The invention has the following beneficial effects:
(1) Aiming at the problem of generating interpretable adversarial samples in text adversarial training, the invention introduces adversarial training at the embedding layer of the BERT model (AT-BERT), uses BERT with a masked language model head to find the neighboring words of the feature words, and normalizes the adversarial perturbation direction to the neighboring-word directions of the feature words in the corpus, so as to ensure the model's emotion classification accuracy while generating interpretable adversarial samples.
(2) An attention mechanism and a bidirectional long short-term memory (Bi-LSTM) network are introduced into the AT-BERT model to address the inability to capture long-distance information in a text sequence; different degrees of attention are allocated to the feature words in sentences, hidden feature information in the text is acquired, and the classification model is further optimized.
(3) Known classification information in the training set is introduced into the traditional TF-IDF algorithm to realize a supervised TF-IDF algorithm, additional text features are obtained from the known classification information, and, combined with the model designed in point (2), a text emotion classification model realizing dual-channel feature fusion is designed.
Example two
Based on the same inventive concept, this embodiment provides a text emotion classification apparatus based on adversarial training and TF-IDF, comprising:
the data set acquisition module is used for acquiring an original text and dividing a training set from the original text;
the preprocessing module is used for preprocessing the training set;
the model building module is used for building a text emotion classification model based on adversarial training and TF-IDF, comprising an embedding module, a first feature extraction module, a second feature extraction module and a prediction module; the embedding module uses adversarial training to generate adversarial text and combines the adversarial text with the input text as the input word vector representation; the first feature extraction module uses a Bi-LSTM model, takes the output of the embedding module as input to extract text features, and then weights the hidden states of the Bi-LSTM model with an attention mechanism, assigning a different weight to each hidden state; the second feature extraction module uses an improved TF-IDF algorithm to obtain, for the input words, the frequency weight of each word in each category, and a Bi-LSTM network further extracts features; the prediction module obtains the classification result from the outputs of the first and second feature extraction modules;
the training module is used for training the text emotion classification model based on the preprocessed training set to obtain a trained text emotion classification model;
and the classification module is used for performing text emotion classification by using the trained text emotion classification model.
Since the apparatus described in the second embodiment of the present invention is an apparatus for implementing the text emotion classification method based on adversarial training and TF-IDF of the first embodiment, those skilled in the art can understand the specific structure and variations of the apparatus based on the method described in the first embodiment, and a detailed description is therefore omitted. All apparatuses used in the method of the first embodiment of the invention fall within the protection scope of the invention.
EXAMPLE III
Based on the same inventive concept, the present invention also provides a computer-readable storage medium, on which a computer program is stored, which when executed, implements the method as described in the first embodiment.
Since the computer-readable storage medium introduced in the third embodiment of the present invention is used for implementing the text emotion classification method based on adversarial training and TF-IDF of the first embodiment, those skilled in the art can understand its specific structure and variations based on the method introduced in the first embodiment, and a detailed description is therefore omitted. Any computer-readable storage medium used in the method of the first embodiment of the present invention falls within the intended scope of the present invention.
Example four
Based on the same inventive concept, the present application further provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the method of the first embodiment is implemented.
Since the computer device introduced in the fourth embodiment of the present invention is used for implementing the text emotion classification method based on adversarial training and TF-IDF of the first embodiment, those skilled in the art can understand its specific structure and variations based on the method introduced in the first embodiment, and a detailed description is therefore omitted. All computer devices used in the method of the first embodiment of the present invention fall within the scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (10)

1. A text emotion classification method based on adversarial training and TF-IDF, characterized by comprising the following steps:
s1: acquiring an original text, and dividing a training set from the original text;
s2: preprocessing a training set;
s3: constructing a text emotion classification model based on adversarial training and TF-IDF, wherein the model comprises an embedding module, a first feature extraction module, a second feature extraction module and a prediction module; the embedding module uses adversarial training to generate adversarial text and combines the adversarial text with the input text as the input word vector representation; the first feature extraction module uses a bidirectional long short-term memory (Bi-LSTM) network model, takes the output of the embedding module as input to extract text features, and then weights the hidden states of the Bi-LSTM model with an attention mechanism, assigning a different weight to each hidden state; the second feature extraction module uses an improved TF-IDF algorithm to obtain, for the input words, the frequency weight of each word in each category, and a Bi-LSTM network further extracts text features; the prediction module obtains the classification result from the outputs of the first and second feature extraction modules;
s4: training the text emotion classification model based on the preprocessed training set to obtain a trained text emotion classification model;
s5: and performing text emotion classification by using the trained text emotion classification model.
2. The text emotion classification method based on adversarial training and TF-IDF according to claim 1, wherein step S2 comprises:
s2.1: obtaining an emotion category array according to the divided training set;
s2.2: obtaining a word library and the occurrence frequency of each word in each emotion category from the training set;
s2.3: and traversing the training set, converting the words into indices in the word library, and parsing each sentence to obtain the word vector, the segment vector and the position encoding vector of the sentence.
3. The text emotion classification method based on adversarial training and TF-IDF according to claim 1, wherein, in the model constructed in step S3, the processing procedure of the embedding module comprises:
taking a vector obtained after preprocessing an original text as the input of a BERT pre-training model, wherein the vector obtained after preprocessing the original text comprises a word vector, a segment vector and a position coding vector of a sentence;
predicting, with the BERT pre-trained model, the probability of each vocabulary word at the corresponding position in a sentence, specifically: finding the top K neighboring word vectors of each word in the sentence, denoted K_t;

obtaining the direction vector d_(t,k) between each word and each of its K neighboring word vectors, and obtaining the adversarial perturbation r_t for word vector x_t:

$$r_t = \sum_{k=1}^{|K|} \alpha_{(t,k)}\, d_{(t,k)}$$

where t indexes the vector of the t-th word in the sentence, k indexes the k-th neighboring word vector, |K| is the number of neighboring word vectors, and α_(t,k) is the weight of the perturbation of word t toward neighboring word k;
adding the perturbation vector to the word vectors of the sentence to obtain adversarial word vectors as the adversarial sample, calculated as:

$$X_{+r} = (x_1 + r_1,\ x_2 + r_2,\ \ldots,\ x_T + r_T)$$

where x_t is the vector of the t-th word in the sentence, T is the number of word vectors in the sentence, and X_{+r} is the adversarial sample; the optimal perturbation direction is found by solving the gradient:

$$g_t = \nabla_{\alpha_t}\, \ell(X_{+r}, Y; \Theta)$$

$$\hat{\alpha} = \epsilon \cdot \frac{g}{\lVert g \rVert_2}$$

where X_{+r} is the input, Y is the label, Θ denotes the model parameters, ℓ(X_{+r}, Y; Θ) is the loss function for a single example, ε is the perturbation threshold, g_t is the gradient with respect to α_t, and g is the concatenation of all g_t;
and inputting the word vectors obtained by preprocessing the original text, together with the adversarial sample, into the BERT encoder to obtain word vector representations containing contextual semantics.
4. The text emotion classification method based on adversarial training and TF-IDF according to claim 1, wherein in the model constructed in step S3, the processing procedure of the first feature extraction module comprises:
feeding the output of the embedding module into the Bi-LSTM model and extracting text feature information from both directions of each word feature vector;

and weighting the hidden state of each step of the Bi-LSTM model with an attention mechanism, assigning a different weight to the hidden state at each time step, and outputting the weighted sum of the hidden states at all time steps as the final feature vector.
5. The text emotion classification method based on adversarial training and TF-IDF according to claim 1, wherein in the model constructed in step S3, the processing procedure of the second feature extraction module comprises:
the occurrence-frequency TF value of word w_t is obtained from the following equation:

$$TF(w_t, C_j) = \frac{N_{w_t}^{C_j}}{N^{C_j}}$$

where N_{w_t}^{C_j} is the number of occurrences of word w_t under emotion label C_j, N^{C_j} is the total number of words under emotion label C_j, and TF(w_t, C_j) is the occurrence-frequency TF value of word w_t;
the existing emotion classification information is taken into account in the IDF calculation to realize a supervised TF-IDF algorithm, specifically: counting the number n_{w_t} of emotion categories that contain word w_t and calculating the IDF value of the word:

$$IDF(w_t) = \log\frac{|k|}{n_{w_t} + 1}$$

where |k| is the number of all emotion categories;
the TF-IDF value is then calculated from the TF value and the IDF value of word w_t:

TF-IDF = TF × IDF
the TF-IDF value is used to represent the frequency weight of the occurrence of each word in each category;
and feeding the feature vectors expressed by the TF-IDF values into the Bi-LSTM network for further feature extraction to obtain the text vector representation.
6. The text emotion classification method based on adversarial training and TF-IDF according to claim 1, wherein the prediction module uses a softmax function to compute the probability that the text falls into each category, thereby obtaining the final classification result.
7. The text emotion classification method based on adversarial training and TF-IDF according to claim 1, wherein during the training of S4 errors are propagated to each layer of the model using gradient descent and back-propagation so that the parameter values of the model are adjusted according to the errors, training is iterated continuously until an optimal solution is finally obtained, and the final classification loss $\mathcal{L}$ is calculated with a cross-entropy loss function:

$$\mathcal{L} = -\sum_{i=1}^{D}\sum_{j=1}^{C} q_{(i,j)} \log p_{(i,j)} + \mathcal{L}_{adv}$$

where D is the training-set size, C is the number of classes, q_(i,j) indicates whether the sample belongs to the current category, log p_(i,j) is the log-probability that feature word i is predicted as category j, and $\mathcal{L}_{adv}$ is the text adversarial loss.
8. A text emotion classification apparatus based on adversarial training and TF-IDF, characterized by comprising:
the data set acquisition module is used for acquiring an original text and dividing a training set from the original text;
the preprocessing module is used for preprocessing the training set;
the model building module is used for building a text emotion classification model based on adversarial training and TF-IDF, comprising an embedding module, a first feature extraction module, a second feature extraction module and a prediction module; the embedding module uses adversarial training to generate adversarial text and combines the adversarial text with the input text as the input word vector representation; the first feature extraction module uses a Bi-LSTM model, takes the output of the embedding module as input to extract text features, and then weights the hidden states of the Bi-LSTM model with an attention mechanism, assigning a different weight to each hidden state; the second feature extraction module uses an improved TF-IDF algorithm to obtain, for the input words, the frequency weight of each word in each category, and a Bi-LSTM network further extracts features; the prediction module obtains the classification result from the outputs of the first and second feature extraction modules;
the training module is used for training the text emotion classification model based on the preprocessed training set to obtain a trained text emotion classification model;
and the classification module is used for performing text emotion classification by using the trained text emotion classification model.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 7 when executing the program.
CN202210818922.8A 2022-07-12 2022-07-12 Text emotion classification method and device based on adversarial training and TF-IDF Pending CN115309894A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210818922.8A CN115309894A (en) 2022-07-12 2022-07-12 Text emotion classification method and device based on adversarial training and TF-IDF

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210818922.8A CN115309894A (en) 2022-07-12 2022-07-12 Text emotion classification method and device based on adversarial training and TF-IDF

Publications (1)

Publication Number Publication Date
CN115309894A true CN115309894A (en) 2022-11-08

Family

ID=83856860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210818922.8A Pending CN115309894A (en) Text emotion classification method and device based on adversarial training and TF-IDF

Country Status (1)

Country Link
CN (1) CN115309894A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117521639A (en) * 2024-01-05 2024-02-06 湖南工商大学 Text detection method combined with academic text structure
CN117521639B (en) * 2024-01-05 2024-04-02 湖南工商大学 Text detection method combined with academic text structure

Similar Documents

Publication Publication Date Title
CN110134771B (en) Implementation method of multi-attention-machine-based fusion network question-answering system
CN110083705B (en) Multi-hop attention depth model, method, storage medium and terminal for target emotion classification
Meng et al. Aspect based sentiment analysis with feature enhanced attention CNN-BiLSTM
Gan et al. Scalable multi-channel dilated CNN–BiLSTM model with attention mechanism for Chinese textual sentiment analysis
Qian et al. Hierarchical CVAE for fine-grained hate speech classification
CN109284506A (en) A kind of user comment sentiment analysis system and method based on attention convolutional neural networks
Cai et al. A stacked BiLSTM neural network based on coattention mechanism for question answering
CN111079409B (en) Emotion classification method utilizing context and aspect memory information
CN111522908A (en) Multi-label text classification method based on BiGRU and attention mechanism
CN110909529B (en) User emotion analysis and prejudgment system of company image promotion system
CN112016002A (en) Mixed recommendation method integrating comment text level attention and time factors
CN116204674B (en) Image description method based on visual concept word association structural modeling
CN113435208A (en) Student model training method and device and electronic equipment
Sun et al. Transformer based multi-grained attention network for aspect-based sentiment analysis
Gao et al. Generating natural adversarial examples with universal perturbations for text classification
CN112784041A (en) Chinese short text emotion orientation analysis method
CN115658890A (en) Chinese comment classification method based on topic-enhanced emotion-shared attention BERT model
Luo et al. EmotionX-DLC: self-attentive BiLSTM for detecting sequential emotions in dialogue
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
CN115309897A (en) Chinese multi-modal confrontation sample defense method based on confrontation training and contrast learning
CN116663539A (en) Chinese entity and relationship joint extraction method and system based on Roberta and pointer network
Han et al. Text adversarial attacks and defenses: Issues, taxonomy, and perspectives
CN117313709B (en) Method for detecting generated text based on statistical information and pre-training language model
Wang et al. Information-enhanced hierarchical self-attention network for multiturn dialog generation
CN115309894A (en) Text emotion classification method and device based on adversarial training and TF-IDF

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination