CN114757183A - Cross-domain emotion classification method based on contrast alignment network - Google Patents


Info

Publication number
CN114757183A
Authority
CN
China
Prior art keywords
domain
cross
target
original
comment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210373901.XA
Other languages
Chinese (zh)
Other versions
CN114757183B (en)
Inventor
宋大为
马放
张辰
杨艺
Current Assignee
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202210373901.XA priority Critical patent/CN114757183B/en
Publication of CN114757183A publication Critical patent/CN114757183A/en
Application granted granted Critical
Publication of CN114757183B publication Critical patent/CN114757183B/en
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/30: Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a cross-domain emotion analysis method based on a contrast alignment network, and belongs to the technical field of fine-grained emotion analysis in natural language processing. The invention studies an insufficiently explored scenario of cross-domain emotion classification, namely the scenario in which the target domain has few samples. For this scenario, the invention proposes a neural network model called the Contrast Alignment Network (CAN). The model first randomly samples instance pairs, one instance from the original domain and one from the target domain, and then trains on the pairs with two joint objectives. The first objective is to minimize the classification error on the original domain. The second is a pairwise contrastive objective: the distance measure between the target-domain instance and the original-domain instance in a pair is minimized if they express the same emotion, and otherwise maximized up to a constant upper bound. The method solves the problem of limited target-domain data resources in the cross-domain emotion classification task and improves the user experience.

Description

Cross-domain emotion classification method based on contrast alignment network
Technical Field
The invention relates to a cross-domain emotion classification method, in particular to a cross-domain emotion analysis method based on a contrast alignment network, and belongs to the technical field of fine-grained emotion analysis in natural language processing.
Background
Cross-Domain Sentiment Classification (CDSC) is an important task that aims at transferring knowledge learned in an original (source) domain to a target domain. CDSC enables an emotion classification model trained on an original domain with a large amount of labeled data to work well on target-domain data with limited training samples. This situation, in which the target domain has insufficient data while the original domain has sufficient data, is common and challenging in industry, and the main challenge is the domain transfer (or distribution shift) between the original domain and the target domain. The domain transfer problem is essentially the difference in distribution between any two domains; for example, words used in the medical domain are very different from words used in the restaurant domain.
Domain transfer is an important problem in cross-domain emotion classification and can be alleviated to a large extent by domain adaptation methods. Researchers have proposed various domain adaptation models. These models require a large amount of unlabeled data from the target domain so that they can learn a good representation of each target instance as correct input to a classifier trained on the original domain.
Currently, Unsupervised Domain Adaptation (UDA) techniques have been used to address the domain transfer problem. Essentially, UDA exploits additional unlabeled data in the target domain to minimize the domain shift by aligning statistics across domains. However, in practical applications, the large amount of unlabeled target-domain data required by UDA may not be available, which limits its applicability.
Meanwhile, cross-domain emotion classification has a scenario that has not been fully explored: the few-sample target domain, which arises in many practical industrial applications. Unlike the unsupervised scenario, the few-sample scenario requires no additional unlabeled target-domain data and relies only on the scarce labeled data available in the target domain.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and solve the technical problem of limited target-domain data resources, and creatively provides a cross-domain emotion classification method based on a contrast alignment network. The invention studies an insufficiently explored scenario of cross-domain emotion classification, namely the scenario in which the target domain has few samples. Unlike the unsupervised scenario, the few-sample scenario requires no additional unlabeled target-domain data and relies only on the scarce labeled data available in the target domain. For this scenario, the invention proposes a neural network model called the Contrast Alignment Network (CAN). The model first randomly samples instance pairs, one instance from the original domain and one from the target domain, and then trains on the pairs with two joint objectives. The first objective is to minimize the classification error on the original domain. The second is a pairwise contrastive objective: the distance measure between the target-domain instance and the original-domain instance in a pair is minimized if they express the same emotion, and otherwise maximized up to a constant upper bound.
Notably, the motivation for the contrast alignment network is not merely to address the resource limitation issue. In fact, the invention derives a tighter learning bound for the proposed model, providing a theoretical basis for its fundamental efficacy.
Cross-domain emotion analysis (CDSC) requires training a model on an original domain D_s = {(X_i^s, y_i^s)}_{i=1}^{n_s} with n_s examples, and performing emotion classification on a target domain D_t. X_i^s is the i-th sample sentence of the original domain, and y_i^s is the label corresponding to the i-th sample of the original domain. In other words, the samples of the original domain D_s follow the joint distribution P_s(X, Y), while those of the target domain follow P_t(X, Y); X^t denotes a sample sentence of the target domain t, and y^t denotes the label corresponding to that sample sentence. The domain shift between the joint probabilities of the original and target domains is P_s(X, Y) ≠ P_t(X, Y), and the concept of domain adaptation is imposed to alleviate this domain transfer problem. When there is not enough labeled data in the target domain, unsupervised domain adaptation methods utilize a large amount of additional unlabeled target data D_t^u = {X_j^t}_{j=1}^{n_u} to align the two domains. However, the large amount of unlabeled data required may not always be available in the target domain. Therefore, the method focuses on a more practical few-sample case of cross-domain emotion analysis and instead uses a small amount of auxiliary labeled target-domain data D_t = {(X_j^t, y_j^t)}_{j=1}^{n_t}, with n_t << n_s, where n_t is the number of auxiliary labeled target-domain data samples.
Inspired by existing unsupervised domain adaptation work, the invention decomposes the learning objective into two parts: computing the discriminative original-domain risk, and regularizing the domain transfer. To further minimize the original-domain risk, the invention contrastively aligns the original domain and the target domain conditioned on instance-level classification information.
The technical method adopted by the invention is as follows.
An emotion classification method based on a contrast alignment network comprises the following steps:
first, text data preprocessing is performed.
And loading the comment corpus set and the pre-training language model, and performing text preprocessing and text data formatting on the comment text data in the comment corpus set. The pre-training language model may adopt a BERT model, a RoBERTa model, or the like.
And then, constructing a cross-domain emotion classification model based on the contrast alignment network.
The cross-domain emotion classification model f based on the contrast alignment network comprises an encoder g_θ and a classifier h_φ. On this architecture, an original-domain classification target loss function L_src and a contrastive original/target alignment loss function L_con are introduced.

The encoder g_θ uses the pre-training language model as a base for encoding the context information of the comment sentence. Preferably, the encoder uses the [CLS] (sentence vector representation) full-sentence representation of the pre-trained language model as the context hidden-state representation vector of the entire comment sentence, with token-level hidden states H = {h_1, h_2, ..., h_n}, where h_n is the hidden-state representation vector of the n-th token.

The classifier h_φ consists of a multilayer perceptron (MLP) and a softmax layer (soft maximization normalization layer). The multilayer perceptron comprises four layers, in order: a fully connected layer, a ReLU (linear rectification function) activation function layer, a dropout (random discard) layer, and a fully connected layer. The output representation produced by the MLP is passed to the softmax layer, from which the corresponding loss is calculated.
Then, the discriminative original-domain risk is calculated.

In the method, for the discriminative original-domain risk, the empirical classification loss term of the original domain is adopted, and the classification objective is modeled as the cross-entropy loss L_src:

L_src = -(1/n_s) Σ_{i=1}^{n_s} y_i · log(ŷ_i)

where n_s is the number of original-domain data samples, y_i is the label of the i-th original-domain data sample, and ŷ_i is the model's predicted label for the i-th original-domain data sample.
Then, the original domain and the target domain are contrastively aligned using instance-level classification information.

The goal is to minimize the domain shift under limited target-domain data. Although distribution-level alignment is difficult to achieve with limited target data, instance-level alignment can be enforced and achieved with a certain probability.

In the method of the invention, a contrastive loss is introduced to minimize the distance measure between a sampled pair of original-domain and target-domain instances if they share the same emotion, and to maximize it otherwise. Emotion labels can therefore be assigned accurately regardless of domain. Although the distance measure between the original domain and the target domain could be applied directly to the input x, the method follows most domain adaptation approaches and aligns the hidden representations z, which are more abstract and capture more semantic information.
Specifically, given an arbitrary pair (X_i^s, X_j^t), the contrastive loss L_con is calculated as:

L_con(X_i^s, X_j^t) = 1[y_i^s = y_j^t] · d(z_i^s, z_j^t) + 1[y_i^s ≠ y_j^t] · max(0, m - d(z_i^s, z_j^t))

where X_i^s represents the i-th sample comment sentence of the original domain, X_j^t represents the j-th sample comment sentence of the target domain t, y_i^s is the label corresponding to the i-th original-domain sample comment sentence, and y_j^t is the label corresponding to the j-th target-domain sample comment sentence; d(z_i^s, z_j^t) represents a distance measure between the original-domain instance and the target-domain instance; 1[·] is the indicator function; and m is a predefined constant.
The contrastive loss objective pulls instances of the same emotion polarity closer and pushes instances of different emotion polarities apart. Furthermore, the objective does not push the different clusters infinitely far apart, but limits the repulsion range to a constant, as a relaxation for the learning algorithm.
Then, regularized domain transfer is performed.

The overall objective of the method comprises the cross-entropy loss function L_src of the original-domain data and the contrastive original/target alignment loss function L_con, whose minimization is regularized by an L2 penalty on the parameters. The overall objective function is:

L(Θ) = L_src + α · L_con + λ‖Θ‖²

where α is a trade-off between the classification and contrastive objectives, and λ is a regularization coefficient for all model parameters Θ = {θ, φ}.
Then, model training is performed. The overall objective function L(Θ) is trained using a standard mini-batch stochastic gradient descent algorithm.
Specifically, batch iterative training is carried out on all training samples in the training set, and a trained cross-domain emotion classification model based on a contrast alignment network is obtained.
And finally, performing cross-domain emotion classification by using the trained cross-domain emotion classification model based on the contrast alignment network.
Advantageous effects
Compared with the prior art, the method has the following advantages:
1. the method solves the problem that the target field data resources are limited in the cross-field emotion classification task. The invention researches an insufficiently explored scene of cross-domain emotion classification, namely a scene with few samples in a target domain. Unlike unsupervised scenarios, the sample-less scenario does not require additional unlabeled target domain data, but relies only on scarce labeled data available in the target domain.
2. The neural network model of the contrast alignment network provided by the method comprises two targets, wherein the first target is to minimize the classification error in the original field. The second is a pairwise comparison target, where the distance measure between the target domain instance and the original domain instance in a pair is minimized if they express the same emotion, otherwise the measure is maximized with a constant upper bound.
3. The performance of the contrast alignment network model on the cross-domain emotion classification task is significantly superior to that of the corresponding baseline models. Other baseline models suffer a dramatic performance drop on cross-domain data sets, whereas the contrast alignment network proposed by the invention is more robust.
4. The contrast alignment neural network model provided by the invention not only solves the problem of resource limitation in the target field, but also has a more complete learning boundary, and provides a theoretical basis for the basic efficacy of the provided model.
5. The method alleviates the problems in cross-domain emotion classification for existing fine-grained emotion analysis and can substantially improve the user experience.
Drawings
FIG. 1 is an overall flow diagram of the process of the present invention.
FIG. 2 is an exemplary diagram of a cross-domain emotion classification task.
FIG. 3 is a visual representation of the effect of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and examples.
Examples
As shown in FIG. 1, a cross-domain emotion classification method based on a contrast alignment network includes the following steps:
step 1: text preprocessing.
First, a corpus of comments and a pre-trained language model are loaded. The pre-training language model may be a BERT model, or may be another model (e.g., RoBERTa model).
And then, performing text preprocessing and text data formatting on the comment corpus.
Specifically, the method comprises the following steps:
step 1.1: and extracting attribute words, viewpoint words and position information thereof from each comment sentence.
Step 1.2: pre-segmenting words of the comment sentence by using an nltk word segmenter, and separating segmented token words by using spaces.
Step 1.3: adding two special tokens to the tokenized comment sentence: [CLS] and [SEP], thereby constructing the general input form S = {[CLS], w_1, w_2, ..., w_n, [SEP]}, where n denotes the total number of tokens of the comment sentence and w_n represents the n-th token of the comment sentence.
Step 1.4: and formatting the text data.
Each comment-sentence token sequence is padded so that its length is 128. The tokenize operation is performed on each token of the comment sentence using the tokenizer of the pre-trained language model. The processed data set is divided into a training set, a validation set, and a test set, which are constructed into batch data form.
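As an illustrative sketch of steps 1.3 and 1.4 (the helper name and padding token are assumptions for illustration; a real system would use the pre-trained model's own tokenizer), the formatting can be outlined as:

```python
# Illustrative sketch: wrap a tokenized comment sentence with [CLS]/[SEP]
# and pad the sequence to the fixed length of 128 described in step 1.4.
# `format_review` and "[PAD]" are hypothetical names, not from the patent.

MAX_LEN = 128
PAD_TOKEN = "[PAD]"

def format_review(tokens):
    """Build S = {[CLS], w_1, ..., w_n, [SEP]} and pad to MAX_LEN."""
    seq = ["[CLS]"] + tokens[:MAX_LEN - 2] + ["[SEP]"]  # truncate long reviews
    seq += [PAD_TOKEN] * (MAX_LEN - len(seq))            # pad short reviews
    return seq

example = format_review(["the", "food", "was", "great"])
print(len(example))   # every formatted sequence has length 128
print(example[:6])    # ['[CLS]', 'the', 'food', 'was', 'great', '[SEP]']
```

In practice the pre-trained tokenizer maps these tokens to integer ids; the fixed length simply allows batching.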
Step 2: and constructing a cross-domain emotion classification model based on a contrast alignment network.
The cross-domain emotion classification model f based on the contrast alignment network is composed of an encoder g_θ and a classifier h_φ. On top of the encoder g_θ and classifier h_φ architecture, an original-domain classification target loss function L_src and a contrastive original/target alignment loss function L_con are introduced.

The encoder g_θ encodes the context information of the comment sentence. For the emotion classification task, the emotion information of a sentence is contained in its context, so context modeling of the whole sentence is particularly important. Thus, the [CLS] full-sentence representation of the pre-trained language model is used as the context hidden-state representation vector of the entire comment sentence, with token-level hidden states H = {h_1, h_2, ..., h_n}, where h_n is the hidden-state representation vector of the n-th token. Each token in the comment-sentence sequence is mapped to an encoding vector.

The classifier h_φ consists of a multilayer perceptron (MLP) and a softmax layer (soft maximization normalization layer). The multilayer perceptron comprises four layers: a fully connected layer, a ReLU (linear rectification function) activation function layer, a dropout (random drop) layer, and a fully connected layer. The output representation of the final fully connected layer is sent to the softmax layer, which predicts the corresponding label, from which the corresponding target loss is calculated.
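A minimal NumPy sketch of the classifier head h_φ described above (dimensions, initialization, and dropout rate are illustrative assumptions; the actual model applies this head to the pre-trained encoder's [CLS] vector):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=-1, keepdims=True)

class MLPClassifier:
    """Four-layer head: linear -> ReLU -> dropout -> linear, then softmax."""
    def __init__(self, d_in=768, d_hid=256, n_classes=2, p_drop=0.1):
        self.W1 = rng.normal(0, 0.02, (d_in, d_hid))
        self.b1 = np.zeros(d_hid)
        self.W2 = rng.normal(0, 0.02, (d_hid, n_classes))
        self.b2 = np.zeros(n_classes)
        self.p_drop = p_drop

    def forward(self, h_cls, train=False):
        z = np.maximum(0, h_cls @ self.W1 + self.b1)   # ReLU activation
        if train:                                      # inverted dropout, train only
            z *= rng.binomial(1, 1 - self.p_drop, z.shape) / (1 - self.p_drop)
        return softmax(z @ self.W2 + self.b2)

clf = MLPClassifier()
probs = clf.forward(rng.normal(size=(4, 768)))  # 4 fake [CLS] vectors
print(probs.shape)         # (4, 2)
print(probs.sum(axis=1))   # each row sums to 1
```

The output of the first fully connected stack is what the patent's softmax layer normalizes into class probabilities.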
Step 3: calculating the discriminative original-domain risk.

For the discriminative original-domain risk, the invention adopts the empirical classification loss term of the original domain and models the classification objective as the cross-entropy loss L_src:

L_src = -(1/n_s) Σ_{i=1}^{n_s} y_i · log(ŷ_i)

where n_s is the number of original-domain data samples, y_i is the label of the i-th original-domain data sample, and ŷ_i is the model's predicted label for the i-th original-domain data sample.
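The cross-entropy objective above can be sketched in NumPy as follows (one-hot labels and two classes are assumed for illustration):

```python
import numpy as np

def source_ce_loss(y_true, y_prob):
    """Empirical original-domain cross-entropy L_src averaged over n_s samples.

    y_true: one-hot labels, shape (n_s, n_classes)
    y_prob: softmax outputs, shape (n_s, n_classes)
    """
    eps = 1e-12  # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1))

y_true = np.array([[1, 0], [0, 1]])
y_prob = np.array([[0.9, 0.1], [0.2, 0.8]])
print(round(source_ce_loss(y_true, y_prob), 4))  # -(ln 0.9 + ln 0.8)/2 ≈ 0.1643
```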
Step 4: contrastively aligning the original domain and the target domain using instance-level classification information.

The domain shift is minimized under limited target-domain data. Although distribution-level alignment is difficult to achieve with limited target data, instance-level alignment can be enforced and achieved with a certain probability.

Thus, the invention introduces a contrastive loss that minimizes the distance measure between a sampled pair of original-domain and target-domain instances if they share the same emotion, and maximizes it otherwise. Emotion labels can thus be assigned accurately regardless of domain.

Although the distance measure between the original domain and the target domain could be applied directly to the input x, the invention follows most domain adaptation methods and aligns the hidden representations z, which are more abstract and capture more semantic information.
Specifically, given an arbitrary pair (X_i^s, X_j^t), the contrastive loss is calculated as:

L_con(X_i^s, X_j^t) = 1[y_i^s = y_j^t] · d(z_i^s, z_j^t) + 1[y_i^s ≠ y_j^t] · max(0, m - d(z_i^s, z_j^t))

where d(z_i^s, z_j^t) represents a distance measure between the hidden representations of the original-domain instance and the target-domain instance, 1[·] is the indicator function, and m is a predefined constant.
The contrastive loss objective pulls instances of the same emotion polarity closer and pushes instances of different emotion polarities apart. Furthermore, the objective does not push the different clusters infinitely far apart, but limits the repulsion range to a constant, as a relaxation for the learning algorithm.
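A sketch of this pairwise contrastive term in NumPy (Euclidean distance is an illustrative choice here; the patent leaves the distance measure d abstract):

```python
import numpy as np

def contrastive_loss(z_s, z_t, y_s, y_t, m=1.0):
    """Margin-based contrastive alignment for one (original, target) pair.

    Same label -> pull hidden representations together (minimize d).
    Different label -> push apart, but only up to the margin m.
    """
    d = np.linalg.norm(z_s - z_t)  # distance between hidden representations
    if y_s == y_t:
        return d
    return max(0.0, m - d)

z_pos = np.array([0.1, 0.2])
z_neg = np.array([2.0, 2.0])
print(contrastive_loss(z_pos, z_pos, 1, 1))  # same label, zero distance -> 0.0
print(contrastive_loss(z_pos, z_neg, 1, 0))  # different label, d > m -> 0.0
```

The margin m caps the repulsion exactly as described: once differently-labeled instances are at least m apart, the pair contributes no loss.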
Step 5: regularized domain transfer.

The overall objective of the method comprises the cross-entropy loss L_src on original-domain data and the contrastive original/target alignment loss L_con, whose minimization is regularized by an L2 penalty on the parameters. The overall objective function is:

L(Θ) = L_src + α · L_con + λ‖Θ‖²

where α is a trade-off between the classification and contrastive objectives, λ is a regularization coefficient for all model parameters Θ = {θ, φ}, θ denotes the parameters of the encoder, and φ denotes the parameters of the classifier.
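Combining the pieces, the overall objective can be sketched as follows (the α and λ values and the toy parameter tensors are illustrative placeholders):

```python
import numpy as np

def overall_objective(l_src, l_con, params, alpha=0.1, lam=1e-4):
    """L(Theta) = L_src + alpha * L_con + lam * ||Theta||^2."""
    l2 = sum(float(np.sum(p ** 2)) for p in params)  # squared L2 norm of all params
    return l_src + alpha * l_con + lam * l2

params = [np.ones((2, 2)), np.ones(3)]  # toy parameter tensors, ||Theta||^2 = 7
print(overall_objective(0.5, 0.2, params, alpha=0.1, lam=0.01))  # ≈ 0.59
```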
Step 6: and training a cross-domain emotion classification model based on a contrast alignment network.
The overall objective function L(Θ) is optimized using a standard mini-batch stochastic gradient descent algorithm.
Specifically, batch iterative training is carried out on all training samples in the training set, and a trained cross-domain emotion classification model based on a contrast alignment network is obtained.
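The pair-sampling side of this training loop (random original/target instance pairs per batch, as described in the abstract) might be organized as in this hedged sketch; the batch size, data shapes, and helper name are assumptions:

```python
import random

def sample_pairs(source_data, target_data, batch_size=4, seed=0):
    """Randomly pair labeled original-domain and target-domain instances.

    source_data / target_data: lists of (sentence, label) tuples.
    Each returned pair ((x_s, y_s), (x_t, y_t)) feeds both the original-domain
    cross-entropy term and the pairwise contrastive alignment term.
    """
    rng = random.Random(seed)
    return [(rng.choice(source_data), rng.choice(target_data))
            for _ in range(batch_size)]

src = [("great laptop", 1), ("laggy and slow", 0), ("screen is superb", 1)]
tgt = [("tasty dishes", 1), ("mediocre service", 0)]
batch = sample_pairs(src, tgt, batch_size=2)
print(len(batch))  # 2 pairs per batch
```

Because the target domain has very few labeled samples, each target instance is reused across many batches while the abundant original-domain samples rotate.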
Step 7: performing cross-domain emotion classification using the trained cross-domain emotion classification model based on the contrast alignment network.
Further, the method can be evaluated. After training on the training set, a validation test is performed on the validation set. The evaluation indexes used include:

for cross-domain emotion classification, accuracy and the F1 score are used as evaluation indexes;

the optimal model is updated and saved after each round of validation.
Test verification
The method was tested. First, the previously saved optimal model parameters and the test data are loaded; the test data is then converted into the required format and input into the optimal model for testing. The evaluation indexes are the same as those used in validation.
As shown in fig. 2, the original domain is the laptop review domain, with the comment sentence "This brand of laptop is too laggy; the user experience is very poor", and the target domain is the restaurant review domain, with the comment sentence "The dishes here are mediocre, and so is the service". There is a fine-grained domain shift between the original-domain and target-domain data, so most cross-domain emotion classification methods and unsupervised adaptation methods cannot transfer well and cannot effectively determine the emotion polarity of target-domain comment sentences.
As shown in fig. 3, a cross-domain (laptop review domain and restaurant review domain) effect visualization of the emotion classification method based on the contrast alignment network: the left graph is the cross-domain effect visualization without the contrast alignment network, and the right graph is the visualization with the contrast alignment network. The right graph exhibits a cleaner manifold than the left, which indicates that the contrast alignment network retains generalization capability even when the target data is small.
The above description is a preferred embodiment of the present invention, and the present invention should not be limited to the disclosure of the embodiment and the drawings. Equivalents and modifications may be made without departing from the spirit of the disclosure, which is to be considered as within the scope of the invention.

Claims (5)

1. A cross-domain emotion classification method based on a contrast alignment network is characterized by comprising the following steps:
step 1: loading a comment corpus set and a pre-training language model, and performing text preprocessing and text data formatting on comment text data in the comment corpus set;
step 2: constructing a cross-domain emotion classification model based on a contrast alignment network;
wherein the cross-domain emotion classification model f based on the contrast alignment network comprises an encoder g_θ and a classifier h_φ; on this architecture, an original-domain classification target loss function L_src and a contrastive original/target alignment loss function L_con are introduced; the encoder g_θ uses a pre-training language model as a base for encoding the context information of the comment sentence; the classifier h_φ consists of a multilayer perceptron MLP and a softmax layer; the output representation produced by the multilayer perceptron is sent to the softmax layer, from which the corresponding loss is calculated;
step 3: calculating the discriminative original-domain risk;

for this risk, the empirical classification loss term of the original domain is adopted, and the classification objective is modeled as the cross-entropy loss L_src:

L_src = -(1/n_s) Σ_{i=1}^{n_s} y_i · log(ŷ_i)

wherein n_s is the number of original-domain data samples, y_i is the label of the i-th original-domain sample, and ŷ_i is the model's predicted label for the i-th original-domain sample;
step 4: contrastively aligning the original domain and the target domain using instance-level classification information;

given an arbitrary pair (X_i^s, X_j^t), the contrastive loss L_con is calculated as:

L_con(X_i^s, X_j^t) = 1[y_i^s = y_j^t] · d(z_i^s, z_j^t) + 1[y_i^s ≠ y_j^t] · max(0, m - d(z_i^s, z_j^t))

wherein X_i^s represents the i-th sample comment sentence of the original domain and X_j^t represents the j-th sample comment sentence of the target domain t; y_i^s is the label corresponding to the i-th original-domain sample comment sentence and y_j^t is the label corresponding to the j-th target-domain sample comment sentence; d(z_i^s, z_j^t) represents a distance measure between the original-domain instance and the target-domain instance; 1[·] is the indicator function; and m is a predefined constant;
step 5: carrying out regularized domain transfer; the overall objective comprises the cross-entropy loss function L_src of the original-domain data and the contrastive original/target alignment loss function L_con, whose minimization is regularized by an L2 penalty on the parameters; the overall objective function is:

L(Θ) = L_src + α · L_con + λ‖Θ‖²

wherein α is a trade-off between the classification and contrastive objectives, and λ is a regularization coefficient for all model parameters Θ = {θ, φ};
step 6: training the overall objective function L(Θ) using a standard batch stochastic gradient descent algorithm, to obtain a trained cross-domain emotion classification model based on the contrast alignment network;
step 7: performing cross-domain emotion classification using the trained cross-domain emotion classification model based on the contrast alignment network.
2. The cross-domain emotion classification method based on a contrast alignment network according to claim 1, wherein the encoder uses the [CLS] full-sentence representation of the pre-trained language model as the context hidden-state representation vector of the entire comment sentence, with token-level hidden states H = {h_1, h_2, ..., h_n}, where h_n is the hidden-state representation vector of the n-th token.
3. The cross-domain emotion classification method based on a contrast alignment network according to claim 1, characterized in that in step 2 the multilayer perceptron comprises four layers, in sequence: a fully-connected layer, a ReLU activation function layer, a dropout layer, and a fully-connected layer.
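A hedged sketch of the four-layer head named in claim 3. Weight shapes and the inverted-dropout convention are assumptions; in practice this would be a framework module rather than hand-written loops.

```python
import random

def mlp_head(h_cls, W1, b1, W2, b2, dropout_p=0.1, train=False, rng=None):
    """Fully-connected -> ReLU -> dropout -> fully-connected, applied to the
    [CLS] representation h_cls; weights are plain nested lists."""
    # fully-connected layer 1
    z = [sum(w * x for w, x in zip(row, h_cls)) + bias
         for row, bias in zip(W1, b1)]
    # ReLU activation layer
    z = [max(0.0, v) for v in z]
    # dropout layer (inverted dropout; identity at inference time)
    if train:
        rng = rng or random.Random(0)
        z = [0.0 if rng.random() < dropout_p else v / (1.0 - dropout_p)
             for v in z]
    # fully-connected layer 2 -> sentiment logits
    return [sum(w * x for w, x in zip(row, z)) + bias
            for row, bias in zip(W2, b2)]
```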
4. The cross-domain emotion classification method based on a contrast alignment network according to claim 1, characterized in that step 1 comprises the following steps:
step 1.1: extracting attribute words, opinion words, and their position information from each comment sentence;
step 1.2: pre-segmenting each comment sentence with the nltk tokenizer and separating the resulting tokens with spaces;
step 1.3: adding two special tokens, [CLS] and [SEP], to the tokenized comment sentence, thereby constructing the general input form S = {[CLS], w_1, w_2, …, w_n, [SEP]}, wherein n denotes the total number of tokens of the comment sentence and w_n denotes the n-th token of the comment sentence;
step 1.4: formatting the text data;
padding each comment-sentence token sequence to a length of 128; applying the tokenizer of the pre-trained language model to each token of the comment sentence; and dividing the processed data set into a training set, a validation set, and a test set, each organized into batches.
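The [CLS]/[SEP] construction and padding-to-128 of claim 4 can be sketched as follows. The real pipeline uses the nltk tokenizer and the pre-trained model's own tokenizer; `build_input` and the `[PAD]` symbol here are stand-ins.

```python
def build_input(tokens, max_len=128, pad="[PAD]"):
    """Builds S = {[CLS], w_1, ..., w_n, [SEP]} and pads to max_len,
    returning the padded sequence and an attention mask over real tokens."""
    seq = ["[CLS]"] + list(tokens) + ["[SEP]"]
    seq = seq[:max_len]                                  # truncate overly long reviews
    mask = [1] * len(seq) + [0] * (max_len - len(seq))   # 1 = real token, 0 = padding
    seq = seq + [pad] * (max_len - len(seq))
    return seq, mask
```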
5. The cross-domain emotion classification method based on a contrast alignment network according to claim 1, characterized in that after training on the training set is completed, verification testing is performed on the validation set, with the following evaluation indexes:
for cross-domain emotion classification, accuracy and the F1 value are used as evaluation indexes;
the optimal model is updated and saved after each round of validation.
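The accuracy and F1 indexes of claim 5, computed from scratch. The macro-averaging choice is an assumption, since the claim does not state which F1 variant is used.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Accuracy plus macro-averaged F1 (per-class F1, averaged over labels)."""
    labels = sorted(set(y_true) | set(y_pred))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return acc, sum(f1s) / len(f1s)
```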
CN202210373901.XA 2022-04-11 2022-04-11 Cross-domain emotion classification method based on comparison alignment network Active CN114757183B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210373901.XA CN114757183B (en) 2022-04-11 2022-04-11 Cross-domain emotion classification method based on comparison alignment network


Publications (2)

Publication Number Publication Date
CN114757183A true CN114757183A (en) 2022-07-15
CN114757183B CN114757183B (en) 2024-05-10

Family

ID=82328537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210373901.XA Active CN114757183B (en) 2022-04-11 2022-04-11 Cross-domain emotion classification method based on comparison alignment network

Country Status (1)

Country Link
CN (1) CN114757183B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109308318A (en) * 2018-08-14 2019-02-05 深圳大学 Training method, device, equipment and the medium of cross-domain texts sentiment classification model
CN109753566A (en) * 2019-01-09 2019-05-14 大连民族大学 The model training method of cross-cutting sentiment analysis based on convolutional neural networks
CN110032646A (en) * 2019-05-08 2019-07-19 山西财经大学 The cross-domain texts sensibility classification method of combination learning is adapted to based on multi-source field
CN110503092A (en) * 2019-07-22 2019-11-26 天津科技大学 The improvement SSD monitor video object detection method adapted to based on field
AU2020100710A4 (en) * 2020-05-05 2020-06-11 Chen, Dadu Mr A method for sentiment analysis of film reviews based on deep learning and natural language processing
CN112163091A (en) * 2020-09-25 2021-01-01 大连民族大学 CNN-based aspect-level cross-domain emotion analysis method
CN113326378A (en) * 2021-06-16 2021-08-31 山西财经大学 Cross-domain text emotion classification method based on parameter migration and attention sharing mechanism


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
赵传君; 王素格; 李德玉: "Multi-source cross-domain sentiment classification based on ensemble deep transfer learning" (基于集成深度迁移学习的多源跨领域情感分类), Journal of Shanxi University (Natural Science Edition), no. 04, 4 April 2018 (2018-04-04) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115239937A (en) * 2022-09-23 2022-10-25 西南交通大学 Cross-modal emotion prediction method
CN115239937B (en) * 2022-09-23 2022-12-20 西南交通大学 Cross-modal emotion prediction method

Also Published As

Publication number Publication date
CN114757183B (en) 2024-05-10

Similar Documents

Publication Publication Date Title
CN112214995B (en) Hierarchical multitasking term embedded learning for synonym prediction
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN111241837B (en) Theft case legal document named entity identification method based on anti-migration learning
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
CN106599032B (en) Text event extraction method combining sparse coding and structure sensing machine
CN111046670B (en) Entity and relationship combined extraction method based on drug case legal documents
Kulkarni et al. Deep learning for NLP
CN112732916A (en) BERT-based multi-feature fusion fuzzy text classification model
CN111597340A (en) Text classification method and device and readable storage medium
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN113626589A (en) Multi-label text classification method based on mixed attention mechanism
CN111582506A (en) Multi-label learning method based on global and local label relation
CN114281931A (en) Text matching method, device, equipment, medium and computer program product
CN113934835B (en) Retrieval type reply dialogue method and system combining keywords and semantic understanding representation
CN115563314A (en) Knowledge graph representation learning method for multi-source information fusion enhancement
CN117574904A (en) Named entity recognition method based on contrast learning and multi-modal semantic interaction
CN114757183B (en) Cross-domain emotion classification method based on comparison alignment network
CN111723572B (en) Chinese short text correlation measurement method based on CNN convolutional layer and BilSTM
CN114022687A (en) Image description countermeasure generation method based on reinforcement learning
Hua et al. A character-level method for text classification
CN116956228A (en) Text mining method for technical transaction platform
CN116662924A (en) Aspect-level multi-mode emotion analysis method based on dual-channel and attention mechanism
CN116049349A (en) Small sample intention recognition method based on multi-level attention and hierarchical category characteristics
CN115455144A (en) Data enhancement method of completion type space filling type for small sample intention recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant