CN112163091A - CNN-based aspect-level cross-domain emotion analysis method - Google Patents


Info

Publication number
CN112163091A
Authority
CN
China
Prior art keywords
convolution
word
sentence
words
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011026500.4A
Other languages
Chinese (zh)
Other versions
CN112163091B (en)
Inventor
孟佳娜
于玉海
吴诗涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Minzu University
Original Assignee
Dalian Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Minzu University filed Critical Dalian Minzu University
Priority to CN202011026500.4A priority Critical patent/CN112163091B/en
Publication of CN112163091A publication Critical patent/CN112163091A/en
Application granted granted Critical
Publication of CN112163091B publication Critical patent/CN112163091B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

A CNN-based aspect-level cross-domain emotion analysis method belongs to the field of text emotion analysis and aims to solve the problem of obtaining good emotion analysis classification results. The method comprises S1, constructing an aspect-level emotion analysis model, and S2, performing aspect-level cross-domain emotion analysis. Context features and sentence features are fused to construct an aspect-level emotion classification model based on a convolutional neural network; the model trained in the source domain is migrated to the target domain, and aspect-level emotion analysis is performed on the target-domain data.

Description

CNN-based aspect-level cross-domain emotion analysis method
Technical Field
The invention belongs to the field of text emotion analysis, and relates to a CNN-based aspect-level cross-domain emotion analysis method.
Background
Emotion analysis has wide application value; it is a challenging task in natural language processing and one of the most active research directions. Existing research divides emotion analysis into three levels: document level, sentence level and aspect level. Document-level and sentence-level emotion analysis are coarse-grained, while aspect-level emotion analysis is fine-grained and provides more detailed results than general emotion analysis. Many advanced deep learning methods now exist for the aspect-level emotion analysis problem, but common deep learning models generally depend heavily on large amounts of labeled training data, and manual labeling costs a great deal of time and money.
Early aspect-level emotion analysis mainly relied on feature engineering to represent sentences; in recent years, deep learning models have achieved better results on aspect-level emotion analysis tasks. The long short-term memory network (LSTM) represents sequence information well: Tang et al. used two LSTMs to jointly model a target word and its context, integrating the correlation information between them. Tai et al. proposed a tree-structured LSTM that incorporates grammatical features such as dependency relations and phrase composition, making the semantic representation more accurate. Attention mechanisms can effectively improve emotion classification. Ma et al. proposed a hierarchical-attention LSTM that introduces common-sense knowledge of emotion-related concepts into the end-to-end training of deep neural networks. Ma et al. also proposed an interactive attention network that interactively detects the important words of the target and the important words in the context. Memory network models offer long-term, large-capacity memory that is easy to read and write: Tang et al. used context information to build a memory network that captures, through an attention mechanism, the information important to the emotional tendency of each aspect. The RAM model proposed by Peng et al. captures long-distance emotional features and combines the results of multiple attentions non-linearly with an RNN to extract more complex features. The CNN model is relatively good at extracting features from n-grams: the TNet model of Li et al. proposes a feature transformation component that introduces entity information into the semantic representation of words, together with a "context preservation" mechanism that combines context-bearing features with the transformed features. Wei et al. combined CNN with gating mechanisms so that the model selectively outputs emotional features for different given aspects.
The core idea of transfer learning is to find the similarity between a source domain and a target domain, transfer the model or labeled data used in the source domain to the target domain from the perspective of that similarity, and finally retrain on the basis of the existing similarity. Because features differ greatly between domains, many cross-domain methods start from the perspective of features. Blitzer et al. proposed structural correspondence learning, which tries to find a set of pivot features that behave the same in the source and target domains and uses them for alignment. Pan et al. proposed spectral feature alignment, which aligns domain-specific words from different domains into unified clusters. Many cross-domain methods have also been developed on top of deep neural networks. Glorot et al. used stacked denoising autoencoders to reconstruct the features of the source and target domains. Chen et al. proposed the mSDA (marginalized stacked denoising autoencoder) algorithm, which retains the strong learning ability of the model without requiring an iterative optimization algorithm. Yosinski et al. found experimentally that the first few layers of a deep network are better kept fixed for transfer learning tasks, and that fine-tuning can well overcome the differences between the data of different domains. Long et al. proposed the deep adaptation network (DAN), which uses a deep network as the carrier for adaptive transfer.
Transfer learning has now achieved great success in many fields, such as text mining, speech recognition, computer vision, spam filtering, WiFi positioning and emotion classification, and has broad application prospects. Aspect-level emotion analysis provides finer-grained information than general emotion analysis and has high research and commercial value. However, training an excellent aspect-level emotion analysis model requires a large amount of labeled data, and when the training data are insufficient, differently distributed or class-imbalanced, the model's performance drops sharply. Constructing a model and method that generalize across domains is therefore a problem worth studying.
Disclosure of Invention
In order to solve the problem of obtaining good emotion analysis classification results, the invention provides the following technical scheme: a CNN-based aspect-level cross-domain emotion analysis method, comprising
S1, constructing an aspect-level emotion analysis model, and
S2, performing aspect-level cross-domain emotion analysis.
The step of S1 is as follows:
the input of the aspect-level emotion analysis model is divided into two parts, the aspect word and the context, and the corresponding convolution process likewise comprises two parts; the context X contains l words, each word is converted into a d-dimensional word vector, and the sentence X is represented as a d × l matrix; a convolution kernel W_c of dimension d × k (k < L) performs a one-way translation scan over the context matrix, where k is the number of words covered by each scan, and each scan produces a convolution result c_i, as shown in formula (2-1):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is a bias, f is an activation function, and * denotes the convolution operation, so after the sentence has been scanned a vector c is obtained, as shown in formula (2-2):
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k is the length of the vector c. With n_k convolution kernels of size k, scanning the whole sentence yields an n_k × l_k matrix; after maximum pooling, i.e., taking the maximum of each row, the sentence can be represented by an n_k-dimensional vector;
since the aspect word T may consist of one or more words, a small CNN is added to convert the aspect word T into a word-embedding matrix, as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations, as shown in formula (2-4):
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d × k and b_v is a bias;
two groups of convolution kernels of the same size are set to scan the sentence simultaneously, their results are fed into two gate units respectively, and the aspect and emotion information are encoded respectively, giving two vectors s_i and a_i; when computing s_i, tanh is used as the activation function, as shown in formula (2-5):
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d × k and b_s is a bias;
when computing a_i, the embedding vector v_a of the aspect word is added to the input; v_a is obtained by maximum pooling over v_i, and relu is used as the activation function, as shown in formula (2-6), so a_i is the aspect feature:
a_i = f_relu(X_{i:i+k} * W_a + V_a·v_a + b_a)   (2-6)
after training, through the relu function the model gives a higher weight a_i to emotion words closer to the aspect word; conversely, if the two are far apart, the weight may be very small or 0; finally the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7):
o_i = s_i * a_i   (2-7)
o_i is input into a pooling layer and maximum pooling is performed; the resulting vector is then input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is decided according to the largest probability;
the step of S2 is as follows:
firstly, a neural network model is trained using the labeled data of the source domain; each word in a sentence X is converted into a d-dimensional word embedding, the maximum sentence length is fixed to L, shorter sentences are padded with 0 and longer ones are truncated, so a sentence X with l words is represented as a d × l matrix, as shown in formula (2-8):
X_s ∈ R^{d×l}   (2-8)
the aspect words are likewise represented as a d × l matrix, as shown in formula (2-9):
T_s ∈ R^{d×l}   (2-9)
the sentence and the aspect words are input into convolution layers separately and the convolution layers extract the features of the sentence; the convolution kernel W is set to dimension d × k (k < L), and the kernel performs a one-way translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan; after scanning, the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function of the convolution kernels, and * denotes the convolution operation;
secondly, v_a, obtained from v_i by the maximum pooling operation, is fed together with c_i into the gating unit, where the aspect information and the emotion information are matched and fused, giving a group of emotion vectors O_s, as shown in formula (2-12):
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
thirdly, to counter the overfitting that arises during model training, Dropout is used to improve the structure of the neural network; the maximum pooling operation is selected, and the maximum of the feature values is taken as the main feature, as shown in formula (2-13):
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
fourthly, the extracted features are input into a fully connected layer; the fully connected layer uses a softmax classifier to obtain the probability of each class, and the class is decided according to the largest probability, as shown in formulas (2-14) and (2-15):
p_j = exp(z_j) / Σ_k exp(z_k)   (2-14)
ŷ = argmax_j p_j   (2-15)
where z is the output of the fully connected layer;
fifthly, after the classification result of the source domain is obtained, the model is fine-tuned with a small amount of labeled target-domain data: the convolution layers use the kernel weights trained in the source domain, a forward propagation pass is applied to obtain the feature maps, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain, giving the final classification result, as shown in formulas (2-16) and (2-17):
p^t_j = exp(z^t_j) / Σ_k exp(z^t_k)   (2-16)
ŷ^t = argmax_j p^t_j   (2-17)
where z^t is the fully connected output for a target-domain sample.
has the advantages that: the context features and the sentence features are fused, an aspect-level emotion classification model based on a convolutional neural network is constructed, the trained model in the source field is migrated to the target field, and aspect-level emotion analysis is carried out on data in the target field.
Drawings
FIG. 1 is a diagram of the aspect-level emotion analysis model.
FIG. 2 is a diagram of the model framework.
FIG. 3 shows the accuracy experiment results on the Chinese corpus.
FIG. 4 shows the F1 value experiment results on the Chinese corpus.
FIG. 5 shows the accuracy experiment results on the English corpus.
FIG. 6 shows the F1 value experiment results on the English corpus.
Detailed Description
1 Summary of the invention
In recent years, aspect-level emotion analysis has attracted more and more attention from scholars, but aspect-level cross-domain emotion analysis suffers from the absence of labeled data and the difficulty of obtaining good classification results. Context features and sentence features are fused to construct an aspect-level emotion classification model based on a convolutional neural network; the model trained in the source domain is migrated to the target domain, and aspect-level emotion analysis is performed on the target-domain data. Chinese and English corpora suitable for aspect-level cross-domain emotion analysis were annotated manually, and experiments on these corpora show that, in the cross-domain setting, the best F1 value reaches 92.19% on the Chinese dataset and 81.57% on the English dataset, so the CNN-based aspect-level cross-domain emotion analysis method can effectively improve the emotion classification accuracy of the target domain. To reduce the model's dependence on large amounts of labeled data, the invention studies aspect-level emotion analysis across domains; the main contributions are as follows:
(1) Chinese and English aspect-level cross-domain emotion analysis corpora are annotated. At present there is little cross-domain research on aspect-level emotion analysis, and the publicly available aspect-level emotion analysis datasets cannot meet the requirements of this experiment; therefore two sentence-level emotion transfer-learning corpora were selected, the different aspects in each sentence were extracted and, combined with semantic information, the corresponding emotion labels were annotated manually, yielding corpora suitable for cross-domain aspect-level emotion analysis tasks.
(2) A cross-domain model based on aspect-level emotion analysis is provided. A CNN-based aspect-level emotion analysis method is explored, a transfer-learning model is established on that basis, the classification performance of the model in different domains is tested experimentally, and the method is verified to have good generalization ability.
2 Introduction to the method
2.1 CNN-based aspect-level emotion analysis
Convolutional neural networks (CNNs) have made tremendous progress in the field of natural language processing (NLP). A CNN mainly consists of an input layer, convolution layers, pooling layers and a fully connected layer. When a sentence containing multiple emotions and multiple aspects is processed, a plain CNN cannot distinguish which entity the emotion word in the current scanning window describes, so a gating activation unit is added on top of the CNN: after the aspect information and the emotion information pass through the activation unit, the model gives a higher weight to emotion words closer to the aspect information; conversely, if the two are far apart, the weight may be very small or 0. The model structure is shown in FIG. 1.
The specific design steps are as follows:
the input of the model is divided into two parts, namely an aspect word and a context, and the corresponding convolution process also comprises two parts.
The context X contains l words, each word is converted into a word vector in d dimensions, and the sentence X can be represented as a matrix in d × l dimensions.
Using d.k (k)<L) dimensional convolution kernel WcOne-way translation scans are performed on the context matrix, with k representing the number of words contained by the convolution kernel per scan. Each scan can obtain a convolution result ciAs shown in formula (2-1).
ci=f(Xi:i+k-1*Wc+bc) (2-1)
Wherein b iscIs the bias, f is the activation function, which represents the convolution operation, so after the sentence is scanned, the vector c is obtained, as shown in equation (2-2).
c=[c1,c2,…,clk] (2-2)
Where lk represents the length of vector c. Setting n in the experimentkA convolution kernel with a size of k can obtain n when all sentences are scannedkThe matrix of lk dimension is processed by maximum pooling, that is, the maximum value of each row is taken, and the sentence can use one nkA vector of dimensions.
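To make the scanning and pooling of formulas (2-1)-(2-2) concrete, the following NumPy sketch reproduces the computation for one sentence; the activation f, the dimensions d, l, k and the kernel count n_k are illustrative placeholders, not values fixed by the invention.

```python
import numpy as np

def conv_scan(X, W, b, f=np.tanh):
    """One-way translation scan of a single d x k kernel over a d x l
    sentence matrix X, as in Eq. (2-1): c_i = f(X[:, i:i+k] * W + b)."""
    d, l = X.shape
    _, k = W.shape
    # One convolution result per window position; l_k = l - k + 1
    return np.array([f(np.sum(X[:, i:i + k] * W) + b)
                     for i in range(l - k + 1)])

# Hypothetical sizes: d = 300-dim embeddings, sentence length l = 40,
# window k = 3, n_k = 100 kernels (assumed, not taken from the patent).
d, l, k, n_k = 300, 40, 3, 100
X = np.random.randn(d, l)                        # embedded sentence
kernels = [np.random.randn(d, k) for _ in range(n_k)]
biases = np.zeros(n_k)

C = np.stack([conv_scan(X, W, b) for W, b in zip(kernels, biases)])  # n_k x l_k
sentence_vec = C.max(axis=1)     # Eq. (2-2) + max pooling -> n_k-dim vector
```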
Because the aspect word T may consist of one or more words, a small CNN is added in the experiment: T is converted into a word-embedding matrix, as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations, as shown in formula (2-4).
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d × k and b_v is a bias.
In the experiment, two groups of convolution kernels of the same size scan the sentence simultaneously; the results are fed into two gate units respectively, which encode the aspect and emotion information and give two vectors s_i and a_i.
When computing s_i, tanh is used as the activation function, as shown in formula (2-5):
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d × k and b_s is a bias.
When computing a_i, the embedding vector v_a of the aspect word is added to the input; v_a is obtained by maximum pooling over v_i, and relu is used as the activation function, as shown in formula (2-6), so a_i may be regarded as an aspect feature.
a_i = f_relu(X_{i:i+k} * W_a + V_a·v_a + b_a)   (2-6)
After training, through the relu function the model gives a higher weight a_i to emotion words closer to the aspect word; conversely, if the two are far apart, the weight may be very small or 0. Finally, the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7).
o_i = s_i * a_i   (2-7)
o_i is input into the pooling layer and maximum pooling is performed; the resulting vector is then input into the fully connected layer, a Softmax classifier gives the probability of each class, and the class is decided according to the largest probability.
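A minimal sketch of the gating computation of formulas (2-5)-(2-7) for one pair of kernels follows, assuming v_a has already been max-pooled from the aspect-word convolution of formula (2-4); all shapes and the random parameters are placeholder assumptions, not the patent's trained values.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gated_features(X, W_s, b_s, W_a, V_a, v_a, b_a):
    """Gated convolution of Eqs. (2-5)-(2-7) for one kernel pair:
    s_i encodes emotion, a_i gates by aspect relevance, o_i = s_i * a_i."""
    d, l = X.shape
    _, k = W_s.shape
    o = []
    for i in range(l - k + 1):
        win = X[:, i:i + k]
        s_i = np.tanh(np.sum(win * W_s) + b_s)            # Eq. (2-5)
        a_i = relu(np.sum(win * W_a) + V_a @ v_a + b_a)   # Eq. (2-6)
        o.append(s_i * a_i)                               # Eq. (2-7)
    return np.array(o)

# Hypothetical shapes (placeholders, not values fixed by the patent):
d, l, k = 300, 40, 3
X = np.random.randn(d, l)            # context matrix
v_a = np.random.randn(d)             # max-pooled aspect embedding v_a
W_s, W_a = np.random.randn(d, k), np.random.randn(d, k)
V_a = np.random.randn(d)
o = gated_features(X, W_s, 0.0, W_a, V_a, v_a, 0.0)
feature = o.max()                    # max pooling before the softmax layer
```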
2.2 Aspect-level cross-domain emotion analysis
Transfer learning is a branch of machine learning. It does not require the training data to lie in the same feature space or to have the same marginal probability distribution, thereby relaxing the assumptions required by machine learning. We pre-train the network model on a relatively large labeled dataset and then use it as an initialization model for tasks in other domains. In the model, after the aspect information and the context information have been turned into features by convolution, the features are sent to a gating activation unit for selection: emotion features with low similarity to the aspect features are blocked at the gate, while the others are correspondingly amplified; the aspect features and the emotion features are fused in the gating unit, and finally the emotional tendency is predicted by the fully connected layer.
The specific steps are designed as follows:
In the first step, a neural network model is trained using the labeled source-domain data. Each word in a sentence X is converted into a d-dimensional word embedding, and the maximum sentence length is fixed to L (shorter sentences are padded with 0, longer ones are truncated), so a sentence X with l words can be represented as a d × l matrix, as shown in formula (2-8).
X_s ∈ R^{d×l}   (2-8)
Similarly, the aspect word is also represented as a d × l matrix, as shown in formula (2-9).
T_s ∈ R^{d×l}   (2-9)
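The fixed-length representation of formulas (2-8)-(2-9) can be sketched as follows; the embedding lookup emb and the sizes d and L are hypothetical stand-ins for the Word2vec vectors and the hyper-parameters of Table 3.9.

```python
import numpy as np

def embed_sentence(words, emb, d, L):
    """Fix the sentence to length L: truncate long sentences, pad short
    ones with zero vectors, then stack into a d x L matrix (Eq. 2-8)."""
    vecs = [emb.get(w, np.zeros(d)) for w in words[:L]]   # truncate
    vecs += [np.zeros(d)] * (L - len(vecs))               # zero-pad
    return np.stack(vecs, axis=1)                          # shape (d, L)

# Toy lookup table standing in for the trained Word2vec embeddings.
d, L = 300, 40
emb = {"service": np.random.randn(d), "bad": np.random.randn(d)}
X_s = embed_sentence(["the", "service", "is", "bad"], emb, d, L)
assert X_s.shape == (d, L)
```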
The sentence and the aspect words are input into convolution layers separately, and the convolution layers extract the features in the sentence. The convolution kernel W is set to dimension d × k (k < L), and the kernel performs a one-way translation scan over the sentence matrix and the aspect-word matrix respectively, with k the number of words covered by each scan. After scanning, the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11).
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function of the convolution kernels, and * denotes the convolution operation.
In the second step, v_a, obtained from v_i by the maximum pooling operation, is fed together with c_i into the gating unit, where the aspect information and the emotion information are matched and fused as described in section 2.1, finally giving a group of emotion vectors O_s, as shown in formula (2-12).
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
In the third step, to counter the overfitting that may occur during model training, Dropout is used to improve the structure of the neural network. The maximum pooling operation is selected, and the maximum of the feature values is taken as the main feature, as shown in formula (2-13).
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
In the fourth step, the extracted features are input into a fully connected layer; the fully connected layer uses a softmax classifier to obtain the probability of each class, and the class is decided according to the largest probability, as shown in formulas (2-14) and (2-15).
p_j = exp(z_j) / Σ_k exp(z_k)   (2-14)
ŷ = argmax_j p_j   (2-15)
where z is the output of the fully connected layer.
In the fifth step, after the classification result of the source domain is obtained, the model is fine-tuned with a small amount of labeled target-domain data. The convolution layers use the kernel weights trained in the source domain, a forward propagation pass is applied to obtain the feature maps, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain, giving the final classification result, as shown in formulas (2-16) and (2-17).
p^t_j = exp(z^t_j) / Σ_k exp(z^t_k)   (2-16)
ŷ^t = argmax_j p^t_j   (2-17)
where z^t is the fully connected output for a target-domain sample.
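The fifth step can be sketched in PyTorch as follows, assuming a gated CNN has already been trained on the source domain; the module, its hyper-parameters, the checkpoint name and the simplified handling of v_a are illustrative assumptions, not the patent's implementation.

```python
import torch
import torch.nn as nn

class GatedCNN(nn.Module):
    """Minimal stand-in for the gated CNN of section 2.1 (names assumed)."""
    def __init__(self, d=300, k=3, n_k=100, n_classes=2):
        super().__init__()
        self.conv_s = nn.Conv1d(d, n_k, k)   # emotion-gate kernels W_s
        self.conv_a = nn.Conv1d(d, n_k, k)   # aspect-gate kernels W_a
        self.fc = nn.Linear(n_k, n_classes)  # softmax layer

    def forward(self, x, v_a):
        # x: (batch, d, L); v_a: (batch, n_k) pooled aspect feature (assumed)
        s = torch.tanh(self.conv_s(x))                      # cf. Eq. (2-5)
        a = torch.relu(self.conv_a(x) + v_a.unsqueeze(-1))  # cf. Eq. (2-6)
        o = (s * a).max(dim=2).values                       # gate + max pool
        return self.fc(o)                                   # logits z

model = GatedCNN()
# model.load_state_dict(torch.load("source_domain.pt"))  # source weights

# Freeze the source-trained convolution kernels; fine-tune only the
# fully connected layer with stochastic gradient descent, as in step 5.
for p in model.conv_s.parameters():
    p.requires_grad = False
for p in model.conv_a.parameters():
    p.requires_grad = False
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
```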
3 Results and analysis of the experiments
3.1 Corpus annotation
Because the existing emotion analysis corpora cannot fully meet the requirements of this research, classical Chinese and English transfer-learning corpora were selected for manual annotation, creating datasets suitable for cross-domain aspect-level emotion analysis tasks. Specifically, on the basis of public sentence-level emotion analysis datasets, the aspect information and the emotion information in each sentence are analyzed, the aspect words are extracted, and the emotion expressed toward each aspect in the sentence is labeled.
3.1.1 Chinese corpus annotation
The Chinese corpus uses the Chinese review-text datasets compiled by Tan Songbo and colleagues, drawn respectively from reviews of computer products on Jingdong, book reviews on Dangdang, and hotel reviews on Ctrip; each domain has 2000 texts with positive emotion and 2000 with negative emotion, 12000 texts in total. Selected review sentences are shown in Table 3.1.
TABLE 3.1 Examples from the Tan Songbo corpus
(table shown as an image in the original publication)
Analysis of the corpus shows that each review sentence involves one or more aspects, and the emotional tendencies of different aspects are not necessarily the same. We label the emotion of each aspect in every review sentence separately. For example, the sentence "The service of the hotel is too bad. The geographical location is good." can be labeled as two pieces of aspect-level emotion data: for the "service" aspect the corresponding emotional tendency is negative, while for the "geographical location" aspect it is positive. Examples of the annotated data are shown in Table 3.2.
TABLE 3.2 Data examples after Chinese corpus annotation
(table shown as an image in the original publication)
Each manually annotated review sentence is divided into three parts: sentence, aspect and emotional tendency. A sentence is copied once for each aspect in the original review, and each copy is labeled with one aspect and its corresponding emotional tendency. Some of the extracted aspect words are shown in Table 3.3. After collation, the labeled data total 19500 items, as shown in Table 3.4.
TABLE 3.3 Partial aspect words extracted after Chinese corpus annotation
(table shown as an image in the original publication)
TABLE 3.4 Statistics of the Chinese corpus after annotation
(table shown as an image in the original publication)
3.1.2 English corpus annotation
The English corpus uses the publicly available Amazon review corpus, which is divided into four categories: Book, DVD, Electronics and Kitchen. The data of each of the four domains comprise 1000 positive reviews and 1000 negative reviews, 8000 items in total; selected review sentences are shown in Table 3.5.
TABLE 3.5 Examples from the Amazon corpus
(table shown as an image in the original publication)
Similarly, each manually annotated review sentence is divided into three parts: sentence, aspect and emotional tendency. Examples of the annotated data are shown in Table 3.6.
Some of the aspect words extracted after annotation are shown in Table 3.7; after collation, the final labeled data total 9090 items, as shown in Table 3.8.
TABLE 3.6 Data examples after English corpus annotation
(table shown as an image in the original publication)
TABLE 3.7 Partial aspect words extracted after English corpus annotation
(table shown as an image in the original publication)
TABLE 3.8 Statistics of the English corpus after annotation
(table shown as an image in the original publication)
3.2 Experimental parameter settings
In the experiments, word vectors are constructed with words as the basic unit: the jieba tool is used to segment the texts into words, and the corresponding Word2vec word vectors are constructed. The specific hyper-parameter settings of the convolutional neural network are shown in Table 3.9, where the hyper-parameter m is the proportion of labeled target-domain data used for fine-tuning the model.
TABLE 3.9 Experimental parameter settings
(table shown as an image in the original publication)
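As an illustration of the word-segmentation and word-vector pipeline just described, the following sketch uses jieba (for the Chinese texts) and gensim; the two toy sentences and all hyper-parameters are placeholders rather than the settings of Table 3.9.

```python
import jieba
from gensim.models import Word2Vec

reviews = ["这家酒店的服务太差了。", "地理位置很好。"]  # toy review sentences
tokenized = [jieba.lcut(r) for r in reviews]          # jieba word segmentation

# Train Word2vec on the segmented corpus (gensim >= 4); vector_size,
# window and min_count here are illustrative, not the patent's values.
w2v = Word2Vec(tokenized, vector_size=300, window=5, min_count=1)
vec = w2v.wv["酒店"]   # a 300-dimensional word vector
```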
3.3 Experimental results and analysis
The accuracy (Acc) and the F1 value are used as evaluation indicators in the experiments.
The accuracy (Acc) is calculated as shown in formula (3-1):
Acc = (1/N) Σ_{i=1}^{N} I(ŷ_i = y_i)   (3-1)
where ŷ_i is the predicted label of a data sample, y_i is the actual label, and N is the size of the test set.
The F1 value balances the recall and precision indicators; its calculation is based on the confusion matrix shown in Table 3.10.
TABLE 3.10 Confusion matrix

                     Predicted positive      Predicted negative
Actual positive      TP (true positive)      FN (false negative)
Actual negative      FP (false positive)     TN (true negative)
Precision characterizes the proportion of true positives among all samples predicted to be positive, as shown in formula (3-2).
Precision = TP / (TP + FP)   (3-2)
Recall characterizes the proportion of positive samples that the classifier finds, as shown in formula (3-3).
Recall = TP / (TP + FN)   (3-3)
The F1 value considers both the precision and the recall of the classification model and can be regarded as a harmonic mean of the two indicators. It lies between 0 and 1, and a larger value indicates better model performance. The calculation formula is shown in formula (3-4).
F1 = 2 × Precision × Recall / (Precision + Recall)   (3-4)
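A small self-contained sketch of the evaluation indicators of formulas (3-1)-(3-4), computed from the confusion-matrix counts; the toy labels are illustrative only.

```python
def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 for a binary task,
    following Eqs. (3-1)-(3-4)."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return acc, precision, recall, f1

acc, p, r, f1 = evaluate([1, 0, 1, 1], [1, 0, 0, 1])  # toy labels
```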
To show the influence of target-domain samples on the transfer effect, part of the labeled target-domain data is extracted for model training. In the experiments on the Chinese dataset, m = 0 means the model trained in the source domain is migrated directly to the target domain; m = 0.05 means data amounting to 5% of the target-domain total are randomly extracted and the model is retrained to adjust the network parameters, and so on. Ten-fold cross validation is used, with accuracy and the F1 value selected as test indicators.
3.3.1 Chinese corpus experimental results and analysis
The accuracy experiment results on the Chinese corpus are shown in FIG. 3, where C represents the dataset of the Computer domain, B the dataset of the Book domain, and H the dataset of the Hotel domain. In FIG. 3, C → B indicates that the source domain is Computer and the target domain is Book, and the rest follow by analogy.
It can be seen from FIG. 3 that when the convolutional neural network model with the gating unit is used for transfer, the migration from the book dataset to the computer dataset works best, with accuracy reaching 93.4%. As the target-domain training data increase, the accuracy improves for most datasets, and the largest improvement generally occurs when m grows from 0 to 0.05.
The F1 value experiment results on the Chinese corpus are shown in FIG. 4; again the migration from the book dataset to the computer dataset works best, with the F1 value reaching 92.19%. As the target-domain training data increase, the F1 value rises for most datasets. As expected, model performance improves as the target-domain dataset grows, but FIG. 4 shows that the largest improvement occurs when m grows from 0 to 0.05; thereafter the performance fluctuates slightly as the target-domain data increase, and the model performs best when the target-domain data are largest. Therefore, fine-tuning the model with a small proportion of target-domain data clearly improves the experimental results while greatly reducing the time and cost of manual annotation.
3.3.2 English corpus experimental results and analysis
The accuracy experiment results on the English corpus are shown in FIG. 5, where B represents the Book dataset, D the DVD dataset, E the Electronics dataset, and K the Kitchen dataset. In FIG. 5, B → D indicates that the source domain is Book and the target domain is DVD, and the rest follow by analogy.
It can be seen from FIG. 5 that the accuracy improves for most dataset pairs as the target-domain training data increase; the best result is obtained when the Book dataset is the source domain and the Electronics dataset is the target domain, with accuracy reaching 82.45%.
FIG. 6 shows the F1 value experiment results on the English corpus. The F1 value increases with the target-domain training data, and the best result is again obtained when the Book dataset is the source domain and the Electronics dataset is the target domain, where the transfer effect is best and the F1 value reaches 81.57%.
In general, both the accuracy and the F1 value improve as the target-domain data increase; the model's performance fluctuates slightly, but it is best when the target-domain data are largest.
The invention annotates aspect-level emotion transfer-learning corpora, providing an experimental dataset that meets the requirements of this work and corpus support for later related research. For cross-domain aspect-level emotion analysis, the invention studies a CNN-based aspect-level emotion analysis model and applies the idea of transfer learning, migrating the model trained in the source domain to the target domain; this addresses the difficulty of obtaining good classification results in the target domain, where labeled data are scarce, and experiments verify that the model classifies well on the proposed datasets. In future work, more transfer strategies can be adopted to improve the model, and its generalization can be further examined on broader cross-domain datasets.
The above description is only a preferred embodiment of the present invention; the protection scope of the present invention is not limited thereto, and any person skilled in the art may substitute or change the technical solution and the inventive concept within the technical scope disclosed by the present invention.

Claims (1)

1. A CNN-based aspect-level cross-domain emotion analysis method, characterized by comprising the following steps:
S1, constructing an aspect-level emotion analysis model;
S2, performing aspect-level cross-domain emotion analysis.
The step of S1 is as follows:
the input of the aspect-level emotion analysis model is divided into two parts, the aspect word and the context, and the corresponding convolution process likewise comprises two parts; the context X contains l words, each word is converted into a d-dimensional word vector, and the sentence X is represented as a d × L matrix; a convolution kernel W_c of dimension d × k (k < L) performs a one-way translation scan over the context matrix, where k is the number of words covered by each scan, and each scan produces a convolution result c_i, as shown in formula (2-1):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is a bias, f is an activation function, and * denotes the convolution operation, so after the sentence has been scanned a vector c is obtained, as shown in formula (2-2):
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k is the length of the vector c; with n_k convolution kernels of size k, scanning the whole sentence yields an n_k × l_k matrix, which is processed by maximum pooling, i.e., the maximum of each row is taken, so the sentence can be represented by an n_k-dimensional vector;
since the aspect word T may consist of one or more words, a small CNN is added to convert the aspect word T into a word-embedding matrix, as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations, as shown in formula (2-4):
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d × k and b_v is a bias;
two groups of convolution kernels of the same size are set to scan the sentence simultaneously, their results are fed into two gate units respectively, and the aspect and emotion information are encoded respectively, giving two vectors s_i and a_i; when computing s_i, tanh is used as the activation function, as shown in formula (2-5):
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d × k and b_s is a bias;
when computing a_i, the embedding vector v_a of the aspect word is added to the input, v_a is obtained by maximum pooling over v_i, and relu is used as the activation function, as shown in formula (2-6), so a_i is the aspect feature:
a_i = f_relu(X_{i:i+k} * W_a + V_a·v_a + b_a)   (2-6)
after training, through the relu function the model gives a higher weight a_i to emotion words closer to the aspect word; conversely, if the two are far apart, the weight may be very small or 0; finally, the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7):
o_i = s_i * a_i   (2-7)
o_i is input into a pooling layer and maximum pooling is performed; the resulting vector is then input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is decided according to the largest probability;
the step of S2 is as follows:
firstly, a neural network model is trained using the labeled data of the source domain; each word in a sentence X is converted into a d-dimensional word embedding, the maximum sentence length is fixed to L, shorter sentences are padded with 0 and longer ones are truncated, so a sentence X with l words is represented as a d × l matrix, as shown in formula (2-8):
X_s ∈ R^{d×l}   (2-8)
the aspect words are represented as a d × l matrix, as shown in formula (2-9):
T_s ∈ R^{d×l}   (2-9)
the sentence and the aspect words are input into convolution layers separately and the convolution layers extract the features of the sentence; the convolution kernel W is set to dimension d × k, with k < L, and the kernel performs a one-way translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan; after scanning, the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function of the convolution kernels, and * denotes the convolution operation;
secondly, v_a, obtained from v_i by the maximum pooling operation, is fed together with c_i into the gating unit, where the aspect information and the emotion information are matched and fused, giving a group of emotion vectors O_s, as shown in formula (2-12):
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
thirdly, to counter the overfitting that arises during model training, Dropout is used to improve the structure of the neural network; the maximum pooling operation is selected, and the maximum of the feature values is taken as the main feature, as shown in formula (2-13):
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
fourthly, the extracted features are input into a fully connected layer; the fully connected layer uses a softmax classifier to obtain the probability of each class, and the class is decided according to the largest probability, as shown in formulas (2-14) and (2-15):
p_j = exp(z_j) / Σ_k exp(z_k)   (2-14)
ŷ = argmax_j p_j   (2-15)
where z is the output of the fully connected layer;
fifthly, after the classification result of the source domain is obtained, the model is fine-tuned with a small amount of labeled target-domain data: the convolution layers use the kernel weights trained in the source domain, a forward propagation pass is applied to obtain the feature maps, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain, giving the final classification result, as shown in formulas (2-16) and (2-17):
p^t_j = exp(z^t_j) / Σ_k exp(z^t_k)   (2-16)
ŷ^t = argmax_j p^t_j   (2-17)
where z^t is the fully connected output for a target-domain sample.
CN202011026500.4A 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method Active CN112163091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011026500.4A CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011026500.4A CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Publications (2)

Publication Number Publication Date
CN112163091A true CN112163091A (en) 2021-01-01
CN112163091B CN112163091B (en) 2023-08-22

Family

ID=73864233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011026500.4A Active CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Country Status (1)

Country Link
CN (1) CN112163091B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113128229A (en) * 2021-04-14 2021-07-16 河海大学 Chinese entity relation joint extraction method
CN113204645A (en) * 2021-04-01 2021-08-03 武汉大学 Knowledge-guided aspect-level emotion analysis model training method
CN113468292A (en) * 2021-06-29 2021-10-01 中国银联股份有限公司 Method and device for analyzing aspect level emotion and computer readable storage medium
CN113627550A (en) * 2021-08-17 2021-11-09 北京计算机技术及应用研究所 Image-text emotion analysis method based on multi-mode fusion
CN114757183A (en) * 2022-04-11 2022-07-15 北京理工大学 Cross-domain emotion classification method based on contrast alignment network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614875A (en) * 2018-04-26 2018-10-02 北京邮电大学 Chinese emotion tendency sorting technique based on global average pond convolutional neural networks
CN108763326A (en) * 2018-05-04 2018-11-06 南京邮电大学 A kind of sentiment analysis model building method of the diversified convolutional neural networks of feature based
CN109753566A (en) * 2019-01-09 2019-05-14 大连民族大学 The model training method of cross-cutting sentiment analysis based on convolutional neural networks
KR20190136337A (en) * 2018-05-30 2019-12-10 가천대학교 산학협력단 Social Media Contents Based Emotion Analysis Method, System and Computer-readable Medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614875A (en) * 2018-04-26 2018-10-02 北京邮电大学 Chinese emotion tendency sorting technique based on global average pond convolutional neural networks
CN108763326A (en) * 2018-05-04 2018-11-06 南京邮电大学 A kind of sentiment analysis model building method of the diversified convolutional neural networks of feature based
KR20190136337A (en) * 2018-05-30 2019-12-10 가천대학교 산학협력단 Social Media Contents Based Emotion Analysis Method, System and Computer-readable Medium
CN109753566A (en) * 2019-01-09 2019-05-14 大连民族大学 The model training method of cross-cutting sentiment analysis based on convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhao Chuanjun; Wang Suge; Li Deyu: "Multi-source cross-domain emotion classification based on ensemble deep transfer learning", Journal of Shanxi University (Natural Science Edition), no. 04 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204645A (en) * 2021-04-01 2021-08-03 武汉大学 Knowledge-guided aspect-level emotion analysis model training method
CN113128229A (en) * 2021-04-14 2021-07-16 河海大学 Chinese entity relation joint extraction method
CN113128229B (en) * 2021-04-14 2023-07-18 河海大学 Chinese entity relation joint extraction method
CN113468292A (en) * 2021-06-29 2021-10-01 中国银联股份有限公司 Method and device for analyzing aspect level emotion and computer readable storage medium
CN113627550A (en) * 2021-08-17 2021-11-09 北京计算机技术及应用研究所 Image-text emotion analysis method based on multi-mode fusion
CN114757183A (en) * 2022-04-11 2022-07-15 北京理工大学 Cross-domain emotion classification method based on contrast alignment network
CN114757183B (en) * 2022-04-11 2024-05-10 北京理工大学 Cross-domain emotion classification method based on comparison alignment network

Also Published As

Publication number Publication date
CN112163091B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN109753566B (en) Model training method for cross-domain emotion analysis based on convolutional neural network
Du et al. Explicit interaction model towards text classification
WO2022022163A1 (en) Text classification model training method, device, apparatus, and storage medium
CN106776581B (en) Subjective text emotion analysis method based on deep learning
Roy et al. Unit dependency graph and its application to arithmetic word problem solving
CN112163091B (en) CNN-based aspect level cross-domain emotion analysis method
CN108446271B (en) Text emotion analysis method of convolutional neural network based on Chinese character component characteristics
CN110502753A (en) A kind of deep learning sentiment analysis model and its analysis method based on semantically enhancement
CN110347836B (en) Method for classifying sentiments of Chinese-Yue-bilingual news by blending into viewpoint sentence characteristics
CN107729309A (en) A kind of method and device of the Chinese semantic analysis based on deep learning
CN112084327A (en) Classification of sparsely labeled text documents while preserving semantics
CN111160037A (en) Fine-grained emotion analysis method supporting cross-language migration
CN108038492A (en) A kind of perceptual term vector and sensibility classification method based on deep learning
CN111368086A (en) CNN-BilSTM + attribute model-based sentiment classification method for case-involved news viewpoint sentences
CN111858935A (en) Fine-grained emotion classification system for flight comment
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN109271636B (en) Training method and device for word embedding model
Xiao et al. Chinese text sentiment analysis based on improved Convolutional Neural Networks
CN113934835B (en) Retrieval type reply dialogue method and system combining keywords and semantic understanding representation
CN115129807A (en) Fine-grained classification method and system for social media topic comments based on self-attention
Kalbhor et al. Survey on ABSA based on machine learning, deep learning and transfer learning approach
CN112905750A (en) Generation method and device of optimization model
Liu College oral English teaching reform driven by big data and deep neural network technology
Nsaif et al. Political Post Classification based on Firefly and XGBoost
CN112507723A (en) News emotion analysis method based on multi-model fusion

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant