CN112163091A - CNN-based aspect-level cross-domain emotion analysis method - Google Patents


Info

Publication number
CN112163091A
Authority
CN
China
Prior art keywords
convolution
word
sentence
words
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011026500.4A
Other languages
Chinese (zh)
Other versions
CN112163091B (en)
Inventor
孟佳娜
于玉海
吴诗涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Minzu University
Original Assignee
Dalian Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Minzu University filed Critical Dalian Minzu University
Priority to CN202011026500.4A priority Critical patent/CN112163091B/en
Publication of CN112163091A publication Critical patent/CN112163091A/en
Application granted granted Critical
Publication of CN112163091B publication Critical patent/CN112163091B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

A CNN-based aspect-level cross-domain emotion analysis method belongs to the field of text emotion analysis and aims to solve the problem of obtaining good emotion analysis classification results. The method comprises S1, constructing an aspect-level emotion analysis model, and S2, performing aspect-level cross-domain emotion analysis. Context features and sentence features are fused to construct an aspect-level emotion classification model based on a convolutional neural network; the model trained in the source domain is migrated to the target domain, and aspect-level emotion analysis is performed on the target-domain data.

Description

CNN-based aspect-level cross-domain emotion analysis method
Technical Field
The invention belongs to the field of text emotion analysis, and relates to a CNN-based aspect-level cross-domain emotion analysis method.
Background
Emotion analysis has wide application value; it is a challenging task in natural language processing and one of the most active research directions. Existing research divides emotion analysis into three levels: document level, sentence level and aspect level. Document-level and sentence-level emotion analysis are coarse-grained, while aspect-level emotion analysis is fine-grained and provides more detailed results than general emotion analysis. Many advanced deep learning methods now exist for the aspect-level emotion analysis problem, but common deep learning models generally depend heavily on large amounts of labeled training data, and manual labeling costs a great deal of time and money.
Early aspect-level emotion analysis mainly relied on feature engineering to represent sentences; in recent years, deep learning models have achieved better results on aspect-level emotion analysis tasks. The long short-term memory network (LSTM) represents sequence information well: Tang et al. used two LSTMs to jointly model a target word and its context, integrating the correlation information between them. Tai et al. proposed a tree-structured LSTM that incorporates grammatical features such as dependency relations and phrase composition, making the semantic representation more accurate. Attention mechanisms can effectively improve emotion classification. Ma et al. proposed a hierarchical-attention LSTM that introduces common-sense knowledge of emotion-related concepts into the end-to-end training of deep neural networks. Ma et al. also proposed an interactive attention network that interactively detects the important words of the target and the important words in the context. Memory network models offer long-term, large-capacity memory that is easy to read and write: Tang et al. used context information to build a memory network that captures, through an attention mechanism, the information important to the emotional tendency of each aspect. The RAM model proposed by Peng et al. captures long-distance emotional features and combines the results of multiple attentions non-linearly with an RNN to extract more complex features. The CNN model is relatively good at extracting features from n-grams: the TNet model of Li et al. proposes a feature transformation component that introduces entity information into the semantic representation of words, together with a "context preservation" mechanism that combines context-bearing features with the transformed features. Wei et al. combined CNN with gating mechanisms so that the model selectively outputs emotional features for different given aspects.
The core idea of transfer learning is to find the similarity between a source domain and a target domain, transfer the model or labeled data used in the source domain to the target domain from the perspective of that similarity, and finally retrain on the basis of the existing similarity. Because features differ greatly between domains, many cross-domain methods start from the perspective of features. Blitzer et al. proposed structural correspondence learning, which tries to find a set of pivot features that behave the same in the source and target domains and uses them for alignment. Pan et al. proposed spectral feature alignment, which aligns domain-specific words from different domains into unified clusters. Many cross-domain methods have also been developed on top of deep neural networks. Glorot et al. used stacked denoising autoencoders to reconstruct the features of the source and target domains. Chen et al. proposed the mSDA (marginalized stacked denoising autoencoder) algorithm, which retains the strong learning ability of the model without requiring an iterative optimization algorithm. Yosinski et al. found experimentally that the first few layers of a deep network are better kept fixed for transfer learning tasks, and that fine-tuning can well overcome the differences between the data of different domains. Long et al. proposed the deep adaptation network (DAN), which uses a deep network as the carrier for adaptive transfer.
Transfer learning has now achieved great success in many fields, such as text mining, speech recognition, computer vision, spam filtering, WiFi positioning and emotion classification, and has broad application prospects. Aspect-level emotion analysis provides finer-grained information than general emotion analysis and has high research and commercial value. However, training an excellent aspect-level emotion analysis model requires a large amount of labeled data, and when the training data are insufficient, differently distributed or class-imbalanced, the model's performance drops sharply. Constructing a model and method that generalize across domains is therefore a problem worth studying.
Disclosure of Invention
In order to solve the problem of obtaining good emotion analysis classification results, the invention provides the following technical scheme: a CNN-based aspect-level cross-domain emotion analysis method, comprising
S1, constructing an aspect-level emotion analysis model, and
S2, performing aspect-level cross-domain emotion analysis.
The step of S1 is as follows:
the input of the aspect-level emotion analysis model is divided into two parts, the aspect word and the context, and the corresponding convolution process likewise comprises two parts; the context X contains l words, each word is converted into a d-dimensional word vector, and the sentence X is represented as a d × l matrix; a convolution kernel W_c of dimension d × k (k < L) performs a one-way translation scan over the context matrix, where k is the number of words covered by each scan, and each scan produces a convolution result c_i, as shown in formula (2-1):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is a bias, f is an activation function, and * denotes the convolution operation, so after the sentence has been scanned a vector c is obtained, as shown in formula (2-2):
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k is the length of the vector c. With n_k convolution kernels of size k, scanning the whole sentence yields an n_k × l_k matrix; after maximum pooling, i.e., taking the maximum of each row, the sentence can be represented by an n_k-dimensional vector;
since the aspect word T may consist of one or more words, a small CNN is added to convert the aspect word T into a word-embedding matrix, as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations, as shown in formula (2-4):
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d × k and b_v is a bias;
two groups of convolution kernels of the same size are set to scan the sentence simultaneously, their results are fed into two gate units respectively, and the aspect and emotion information are encoded respectively, giving two vectors s_i and a_i; when computing s_i, tanh is used as the activation function, as shown in formula (2-5):
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d × k and b_s is a bias;
when computing a_i, the embedding vector v_a of the aspect word is added to the input; v_a is obtained by maximum pooling over v_i, and relu is used as the activation function, as shown in formula (2-6), so a_i is the aspect feature:
a_i = f_relu(X_{i:i+k} * W_a + V_a·v_a + b_a)   (2-6)
after training, through the relu function the model gives a higher weight a_i to emotion words closer to the aspect word; conversely, if the two are far apart, the weight may be very small or 0; finally the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7):
o_i = s_i * a_i   (2-7)
o_i is input into a pooling layer and maximum pooling is performed; the resulting vector is then input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is decided according to the largest probability;
the step of S2 is as follows:
firstly, a neural network model is trained using the labeled data of the source domain; each word in a sentence X is converted into a d-dimensional word embedding, the maximum sentence length is fixed to L, shorter sentences are padded with 0 and longer ones are truncated, so a sentence X with l words is represented as a d × l matrix, as shown in formula (2-8):
X_s ∈ R^{d×l}   (2-8)
the aspect words are likewise represented as a d × l matrix, as shown in formula (2-9):
T_s ∈ R^{d×l}   (2-9)
the sentence and the aspect words are input into convolution layers separately and the convolution layers extract the features of the sentence; the convolution kernel W is set to dimension d × k (k < L), and the kernel performs a one-way translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan; after scanning, the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function of the convolution kernels, and * denotes the convolution operation;
secondly, v_a, obtained from v_i by the maximum pooling operation, is fed together with c_i into the gating unit, where the aspect information and the emotion information are matched and fused, giving a group of emotion vectors O_s, as shown in formula (2-12):
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
thirdly, to counter the overfitting that arises during model training, Dropout is used to improve the structure of the neural network; the maximum pooling operation is selected, and the maximum of the feature values is taken as the main feature, as shown in formula (2-13):
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
fourthly, the extracted features are input into a fully connected layer; the fully connected layer uses a softmax classifier to obtain the probability of each class, and the class is decided according to the largest probability, as shown in formulas (2-14) and (2-15):
p_j = exp(z_j) / Σ_k exp(z_k)   (2-14)
ŷ = argmax_j p_j   (2-15)
where z is the output of the fully connected layer;
fifthly, after the classification result of the source domain is obtained, the model is fine-tuned with a small amount of labeled target-domain data: the convolution layers use the kernel weights trained in the source domain, a forward propagation pass is applied to obtain the feature maps, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain, giving the final classification result, as shown in formulas (2-16) and (2-17):
p^t_j = exp(z^t_j) / Σ_k exp(z^t_k)   (2-16)
ŷ^t = argmax_j p^t_j   (2-17)
where z^t is the fully connected output for a target-domain sample.
has the advantages that: the context features and the sentence features are fused, an aspect-level emotion classification model based on a convolutional neural network is constructed, the trained model in the source field is migrated to the target field, and aspect-level emotion analysis is carried out on data in the target field.
Drawings
FIG. 1 is a diagram of the aspect-level emotion analysis model.
FIG. 2 is a diagram of the model framework.
FIG. 3 shows the accuracy experiment results on the Chinese corpus.
FIG. 4 shows the F1 value experiment results on the Chinese corpus.
FIG. 5 shows the accuracy experiment results on the English corpus.
FIG. 6 shows the F1 value experiment results on the English corpus.
Detailed Description
1 Summary of the invention
In recent years, aspect-level emotion analysis has attracted more and more attention from scholars, but aspect-level cross-domain emotion analysis suffers from the absence of labeled data and the difficulty of obtaining good classification results. Context features and sentence features are fused to construct an aspect-level emotion classification model based on a convolutional neural network; the model trained in the source domain is migrated to the target domain, and aspect-level emotion analysis is performed on the target-domain data. Chinese and English corpora suitable for aspect-level cross-domain emotion analysis were annotated manually, and experiments on these corpora show that, in the cross-domain setting, the best F1 value reaches 92.19% on the Chinese dataset and 81.57% on the English dataset, so the CNN-based aspect-level cross-domain emotion analysis method can effectively improve the emotion classification accuracy of the target domain. To reduce the model's dependence on large amounts of labeled data, the invention studies aspect-level emotion analysis across domains; the main contributions are as follows:
(1) Chinese and English aspect-level cross-domain emotion analysis corpora are annotated. At present there is little cross-domain research on aspect-level emotion analysis, and the publicly available aspect-level emotion analysis datasets cannot meet the requirements of this experiment; therefore two sentence-level emotion transfer-learning corpora were selected, the different aspects in each sentence were extracted and, combined with semantic information, the corresponding emotion labels were annotated manually, yielding corpora suitable for cross-domain aspect-level emotion analysis tasks.
(2) A cross-domain model based on aspect-level emotion analysis is provided. A CNN-based aspect-level emotion analysis method is explored, a transfer-learning model is established on that basis, the classification performance of the model in different domains is tested experimentally, and the method is verified to have good generalization ability.
2 Introduction to the method
2.1 CNN-based aspect-level emotion analysis
Convolutional neural networks (CNNs) have made tremendous progress in the field of natural language processing (NLP). A CNN mainly consists of an input layer, convolution layers, pooling layers and a fully connected layer. When a sentence containing multiple emotions and multiple aspects is processed, a plain CNN cannot distinguish which entity the emotion word in the current scanning window describes, so a gating activation unit is added on top of the CNN: after the aspect information and the emotion information pass through the activation unit, the model gives a higher weight to emotion words closer to the aspect information; conversely, if the two are far apart, the weight may be very small or 0. The model structure is shown in FIG. 1.
The specific design steps are as follows:
the input of the model is divided into two parts, namely an aspect word and a context, and the corresponding convolution process also comprises two parts.
The context X contains l words, each word is converted into a word vector in d dimensions, and the sentence X can be represented as a matrix in d × l dimensions.
Using d.k (k)<L) dimensional convolution kernel WcOne-way translation scans are performed on the context matrix, with k representing the number of words contained by the convolution kernel per scan. Each scan can obtain a convolution result ciAs shown in formula (2-1).
ci=f(Xi:i+k-1*Wc+bc) (2-1)
Wherein b iscIs the bias, f is the activation function, which represents the convolution operation, so after the sentence is scanned, the vector c is obtained, as shown in equation (2-2).
c=[c1,c2,…,clk] (2-2)
Where lk represents the length of vector c. Setting n in the experimentkA convolution kernel with a size of k can obtain n when all sentences are scannedkThe matrix of lk dimension is processed by maximum pooling, that is, the maximum value of each row is taken, and the sentence can use one nkA vector of dimensions.
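To make the scanning and pooling of formulas (2-1)-(2-2) concrete, the following NumPy sketch reproduces the computation for one sentence; the activation f, the dimensions d, l, k and the kernel count n_k are illustrative placeholders, not values fixed by the invention.

```python
import numpy as np

def conv_scan(X, W, b, f=np.tanh):
    """One-way translation scan of a single d x k kernel over a d x l
    sentence matrix X, as in Eq. (2-1): c_i = f(X[:, i:i+k] * W + b)."""
    d, l = X.shape
    _, k = W.shape
    # One convolution result per window position; l_k = l - k + 1
    return np.array([f(np.sum(X[:, i:i + k] * W) + b)
                     for i in range(l - k + 1)])

# Hypothetical sizes: d = 300-dim embeddings, sentence length l = 40,
# window k = 3, n_k = 100 kernels (assumed, not taken from the patent).
d, l, k, n_k = 300, 40, 3, 100
X = np.random.randn(d, l)                        # embedded sentence
kernels = [np.random.randn(d, k) for _ in range(n_k)]
biases = np.zeros(n_k)

C = np.stack([conv_scan(X, W, b) for W, b in zip(kernels, biases)])  # n_k x l_k
sentence_vec = C.max(axis=1)     # Eq. (2-2) + max pooling -> n_k-dim vector
```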
Because the aspect word T may consist of one or more words, a small CNN is added in the experiment: T is converted into a word-embedding matrix, as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations, as shown in formula (2-4).
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d × k and b_v is a bias.
In the experiment, two groups of convolution kernels of the same size scan the sentence simultaneously; the results are fed into two gate units respectively, which encode the aspect and emotion information and give two vectors s_i and a_i.
When computing s_i, tanh is used as the activation function, as shown in formula (2-5):
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d × k and b_s is a bias.
When computing a_i, the embedding vector v_a of the aspect word is added to the input; v_a is obtained by maximum pooling over v_i, and relu is used as the activation function, as shown in formula (2-6), so a_i may be regarded as an aspect feature.
a_i = f_relu(X_{i:i+k} * W_a + V_a·v_a + b_a)   (2-6)
After training, through the relu function the model gives a higher weight a_i to emotion words closer to the aspect word; conversely, if the two are far apart, the weight may be very small or 0. Finally, the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7).
o_i = s_i * a_i   (2-7)
o_i is input into the pooling layer and maximum pooling is performed; the resulting vector is then input into the fully connected layer, a Softmax classifier gives the probability of each class, and the class is decided according to the largest probability.
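A minimal sketch of the gating computation of formulas (2-5)-(2-7) for one pair of kernels follows, assuming v_a has already been max-pooled from the aspect-word convolution of formula (2-4); all shapes and the random parameters are placeholder assumptions, not the patent's trained values.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gated_features(X, W_s, b_s, W_a, V_a, v_a, b_a):
    """Gated convolution of Eqs. (2-5)-(2-7) for one kernel pair:
    s_i encodes emotion, a_i gates by aspect relevance, o_i = s_i * a_i."""
    d, l = X.shape
    _, k = W_s.shape
    o = []
    for i in range(l - k + 1):
        win = X[:, i:i + k]
        s_i = np.tanh(np.sum(win * W_s) + b_s)            # Eq. (2-5)
        a_i = relu(np.sum(win * W_a) + V_a @ v_a + b_a)   # Eq. (2-6)
        o.append(s_i * a_i)                               # Eq. (2-7)
    return np.array(o)

# Hypothetical shapes (placeholders, not values fixed by the patent):
d, l, k = 300, 40, 3
X = np.random.randn(d, l)            # context matrix
v_a = np.random.randn(d)             # max-pooled aspect embedding v_a
W_s, W_a = np.random.randn(d, k), np.random.randn(d, k)
V_a = np.random.randn(d)
o = gated_features(X, W_s, 0.0, W_a, V_a, v_a, 0.0)
feature = o.max()                    # max pooling before the softmax layer
```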
2.2 Aspect-level cross-domain emotion analysis
Transfer learning is a branch of machine learning. It does not require the training data to lie in the same feature space or to have the same marginal probability distribution, thereby relaxing the assumptions required by machine learning. We pre-train the network model on a relatively large labeled dataset and then use it as an initialization model for tasks in other domains. In the model, after the aspect information and the context information have been turned into features by convolution, the features are sent to a gating activation unit for selection: emotion features with low similarity to the aspect features are blocked at the gate, while the others are correspondingly amplified; the aspect features and the emotion features are fused in the gating unit, and finally the emotional tendency is predicted by the fully connected layer.
The specific steps are designed as follows:
In the first step, a neural network model is trained using the labeled source-domain data. Each word in a sentence X is converted into a d-dimensional word embedding, and the maximum sentence length is fixed to L (shorter sentences are padded with 0, longer ones are truncated), so a sentence X with l words can be represented as a d × l matrix, as shown in formula (2-8).
X_s ∈ R^{d×l}   (2-8)
Similarly, the aspect word is also represented as a d × l matrix, as shown in formula (2-9).
T_s ∈ R^{d×l}   (2-9)
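The fixed-length representation of formulas (2-8)-(2-9) can be sketched as follows; the embedding lookup emb and the sizes d and L are hypothetical stand-ins for the Word2vec vectors and the hyper-parameters of Table 3.9.

```python
import numpy as np

def embed_sentence(words, emb, d, L):
    """Fix the sentence to length L: truncate long sentences, pad short
    ones with zero vectors, then stack into a d x L matrix (Eq. 2-8)."""
    vecs = [emb.get(w, np.zeros(d)) for w in words[:L]]   # truncate
    vecs += [np.zeros(d)] * (L - len(vecs))               # zero-pad
    return np.stack(vecs, axis=1)                          # shape (d, L)

# Toy lookup table standing in for the trained Word2vec embeddings.
d, L = 300, 40
emb = {"service": np.random.randn(d), "bad": np.random.randn(d)}
X_s = embed_sentence(["the", "service", "is", "bad"], emb, d, L)
assert X_s.shape == (d, L)
```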
The sentence and the aspect words are input into convolution layers separately, and the convolution layers extract the features in the sentence. The convolution kernel W is set to dimension d × k (k < L), and the kernel performs a one-way translation scan over the sentence matrix and the aspect-word matrix respectively, with k the number of words covered by each scan. After scanning, the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11).
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function of the convolution kernels, and * denotes the convolution operation.
In the second step, v_a, obtained from v_i by the maximum pooling operation, is fed together with c_i into the gating unit, where the aspect information and the emotion information are matched and fused as described in section 2.1, finally giving a group of emotion vectors O_s, as shown in formula (2-12).
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
In the third step, to counter the overfitting that may occur during model training, Dropout is used to improve the structure of the neural network. The maximum pooling operation is selected, and the maximum of the feature values is taken as the main feature, as shown in formula (2-13).
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
In the fourth step, the extracted features are input into a fully connected layer; the fully connected layer uses a softmax classifier to obtain the probability of each class, and the class is decided according to the largest probability, as shown in formulas (2-14) and (2-15).
p_j = exp(z_j) / Σ_k exp(z_k)   (2-14)
ŷ = argmax_j p_j   (2-15)
where z is the output of the fully connected layer.
In the fifth step, after the classification result of the source domain is obtained, the model is fine-tuned with a small amount of labeled target-domain data. The convolution layers use the kernel weights trained in the source domain, a forward propagation pass is applied to obtain the feature maps, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain, giving the final classification result, as shown in formulas (2-16) and (2-17).
p^t_j = exp(z^t_j) / Σ_k exp(z^t_k)   (2-16)
ŷ^t = argmax_j p^t_j   (2-17)
where z^t is the fully connected output for a target-domain sample.
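The fifth step can be sketched in PyTorch as follows, assuming a gated CNN has already been trained on the source domain; the module, its hyper-parameters, the checkpoint name and the simplified handling of v_a are illustrative assumptions, not the patent's implementation.

```python
import torch
import torch.nn as nn

class GatedCNN(nn.Module):
    """Minimal stand-in for the gated CNN of section 2.1 (names assumed)."""
    def __init__(self, d=300, k=3, n_k=100, n_classes=2):
        super().__init__()
        self.conv_s = nn.Conv1d(d, n_k, k)   # emotion-gate kernels W_s
        self.conv_a = nn.Conv1d(d, n_k, k)   # aspect-gate kernels W_a
        self.fc = nn.Linear(n_k, n_classes)  # softmax layer

    def forward(self, x, v_a):
        # x: (batch, d, L); v_a: (batch, n_k) pooled aspect feature (assumed)
        s = torch.tanh(self.conv_s(x))                      # cf. Eq. (2-5)
        a = torch.relu(self.conv_a(x) + v_a.unsqueeze(-1))  # cf. Eq. (2-6)
        o = (s * a).max(dim=2).values                       # gate + max pool
        return self.fc(o)                                   # logits z

model = GatedCNN()
# model.load_state_dict(torch.load("source_domain.pt"))  # source weights

# Freeze the source-trained convolution kernels; fine-tune only the
# fully connected layer with stochastic gradient descent, as in step 5.
for p in model.conv_s.parameters():
    p.requires_grad = False
for p in model.conv_a.parameters():
    p.requires_grad = False
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
```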
3 Results and analysis of the experiments
3.1 Corpus annotation
Because the existing emotion analysis corpora cannot fully meet the requirements of this research, classical Chinese and English transfer-learning corpora were selected for manual annotation, creating datasets suitable for cross-domain aspect-level emotion analysis tasks. Specifically, on the basis of public sentence-level emotion analysis datasets, the aspect information and the emotion information in each sentence are analyzed, the aspect words are extracted, and the emotion expressed toward each aspect in the sentence is labeled.
3.1.1 Chinese corpus annotation
The Chinese corpus uses the Chinese review-text datasets compiled by Tan Songbo and colleagues, drawn respectively from reviews of computer products on Jingdong, book reviews on Dangdang, and hotel reviews on Ctrip; each domain has 2000 texts with positive emotion and 2000 with negative emotion, 12000 texts in total. Selected review sentences are shown in Table 3.1.
TABLE 3.1 Examples from the Tan Songbo corpus
(table shown as an image in the original publication)
Analysis of the corpus shows that each review sentence involves one or more aspects, and the emotional tendencies of different aspects are not necessarily the same. We label the emotion of each aspect in every review sentence separately. For example, the sentence "The service of the hotel is too bad. The geographical location is good." can be labeled as two pieces of aspect-level emotion data: for the "service" aspect the corresponding emotional tendency is negative, while for the "geographical location" aspect it is positive. Examples of the annotated data are shown in Table 3.2.
TABLE 3.2 Data examples after Chinese corpus annotation
(table shown as an image in the original publication)
Each manually annotated review sentence is divided into three parts: sentence, aspect and emotional tendency. A sentence is copied once for each aspect in the original review, and each copy is labeled with one aspect and its corresponding emotional tendency. Some of the extracted aspect words are shown in Table 3.3. After collation, the labeled data total 19500 items, as shown in Table 3.4.
TABLE 3.3 Partial aspect words extracted after Chinese corpus annotation
(table shown as an image in the original publication)
TABLE 3.4 Statistics of the Chinese corpus after annotation
(table shown as an image in the original publication)
3.1.2 English corpus annotation
The English corpus uses the publicly available Amazon review corpus, which is divided into four categories: Book, DVD, Electronics and Kitchen. The data of each of the four domains comprise 1000 positive reviews and 1000 negative reviews, 8000 items in total; selected review sentences are shown in Table 3.5.
TABLE 3.5 Examples from the Amazon corpus
(table shown as an image in the original publication)
Similarly, each manually annotated review sentence is divided into three parts: sentence, aspect and emotional tendency. Examples of the annotated data are shown in Table 3.6.
Some of the aspect words extracted after annotation are shown in Table 3.7; after collation, the final labeled data total 9090 items, as shown in Table 3.8.
TABLE 3.6 Data examples after English corpus annotation
(table shown as an image in the original publication)
TABLE 3.7 Partial aspect words extracted after English corpus annotation
(table shown as an image in the original publication)
TABLE 3.8 Statistics of the English corpus after annotation
(table shown as an image in the original publication)
3.2 Experimental parameter settings
In the experiments, word vectors are constructed with words as the basic unit: the jieba tool is used to segment the texts into words, and the corresponding Word2vec word vectors are constructed. The specific hyper-parameter settings of the convolutional neural network are shown in Table 3.9, where the hyper-parameter m is the proportion of labeled target-domain data used for fine-tuning the model.
TABLE 3.9 Experimental parameter settings
(table shown as an image in the original publication)
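As an illustration of the word-segmentation and word-vector pipeline just described, the following sketch uses jieba (for the Chinese texts) and gensim; the two toy sentences and all hyper-parameters are placeholders rather than the settings of Table 3.9.

```python
import jieba
from gensim.models import Word2Vec

reviews = ["这家酒店的服务太差了。", "地理位置很好。"]  # toy review sentences
tokenized = [jieba.lcut(r) for r in reviews]          # jieba word segmentation

# Train Word2vec on the segmented corpus (gensim >= 4); vector_size,
# window and min_count here are illustrative, not the patent's values.
w2v = Word2Vec(tokenized, vector_size=300, window=5, min_count=1)
vec = w2v.wv["酒店"]   # a 300-dimensional word vector
```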
3.3 Experimental results and analysis
The accuracy (Acc) and the F1 value are used as evaluation indicators in the experiments.
The accuracy (Acc) is calculated as shown in formula (3-1):
Acc = (1/N) Σ_{i=1}^{N} I(ŷ_i = y_i)   (3-1)
where ŷ_i is the predicted label of a data sample, y_i is the actual label, and N is the size of the test set.
The F1 value balances the recall and precision indicators; its calculation is based on the confusion matrix shown in Table 3.10.
TABLE 3.10 Confusion matrix

                     Predicted positive      Predicted negative
Actual positive      TP (true positive)      FN (false negative)
Actual negative      FP (false positive)     TN (true negative)
Precision characterizes the proportion of true positives among all samples predicted to be positive, as shown in formula (3-2).
Precision = TP / (TP + FP)   (3-2)
Recall characterizes the proportion of positive samples that the classifier finds, as shown in formula (3-3).
Recall = TP / (TP + FN)   (3-3)
The F1 value considers both the precision and the recall of the classification model and can be regarded as a harmonic mean of the two indicators. It lies between 0 and 1, and a larger value indicates better model performance. The calculation formula is shown in formula (3-4).
F1 = 2 × Precision × Recall / (Precision + Recall)   (3-4)
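A small self-contained sketch of the evaluation indicators of formulas (3-1)-(3-4), computed from the confusion-matrix counts; the toy labels are illustrative only.

```python
def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 for a binary task,
    following Eqs. (3-1)-(3-4)."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return acc, precision, recall, f1

acc, p, r, f1 = evaluate([1, 0, 1, 1], [1, 0, 0, 1])  # toy labels
```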
To show the influence of target-domain samples on the transfer effect, part of the labeled target-domain data is extracted for model training. In the experiments on the Chinese dataset, m = 0 means the model trained in the source domain is migrated directly to the target domain; m = 0.05 means data amounting to 5% of the target-domain total are randomly extracted and the model is retrained to adjust the network parameters, and so on. Ten-fold cross validation is used, with accuracy and the F1 value selected as test indicators.
3.3.1 Chinese corpus experimental results and analysis
The accuracy experiment results on the Chinese corpus are shown in FIG. 3, where C represents the dataset of the Computer domain, B the dataset of the Book domain, and H the dataset of the Hotel domain. In FIG. 3, C → B indicates that the source domain is Computer and the target domain is Book, and the rest follow by analogy.
It can be seen from FIG. 3 that when the convolutional neural network model with the gating unit is used for transfer, the migration from the book dataset to the computer dataset works best, with accuracy reaching 93.4%. As the target-domain training data increase, the accuracy improves for most datasets, and the largest improvement generally occurs when m grows from 0 to 0.05.
The F1 value experiment results on the Chinese corpus are shown in FIG. 4; again the migration from the book dataset to the computer dataset works best, with the F1 value reaching 92.19%. As the target-domain training data increase, the F1 value rises for most datasets. As expected, model performance improves as the target-domain dataset grows, but FIG. 4 shows that the largest improvement occurs when m grows from 0 to 0.05; thereafter the performance fluctuates slightly as the target-domain data increase, and the model performs best when the target-domain data are largest. Therefore, fine-tuning the model with a small proportion of target-domain data clearly improves the experimental results while greatly reducing the time and cost of manual annotation.
3.3.2 English corpus experimental results and analysis
The accuracy experiment results on the English corpus are shown in FIG. 5, where B represents the Book dataset, D the DVD dataset, E the Electronics dataset, and K the Kitchen dataset. In FIG. 5, B → D indicates that the source domain is Book and the target domain is DVD, and the rest follow by analogy.
It can be seen from FIG. 5 that the accuracy improves for most dataset pairs as the target-domain training data increase; the best result is obtained when the Book dataset is the source domain and the Electronics dataset is the target domain, with accuracy reaching 82.45%.
FIG. 6 shows the F1 value experiment results on the English corpus. The F1 value increases with the target-domain training data, and the best result is again obtained when the Book dataset is the source domain and the Electronics dataset is the target domain, where the transfer effect is best and the F1 value reaches 81.57%.
In general, both the accuracy and the F1 value improve as the target-domain data increase; the model's performance fluctuates slightly, but it is best when the target-domain data are largest.
The invention annotates aspect-level emotion transfer-learning corpora, providing an experimental dataset that meets the requirements of this work and corpus support for later related research. For cross-domain aspect-level emotion analysis, the invention studies a CNN-based aspect-level emotion analysis model and applies the idea of transfer learning, migrating the model trained in the source domain to the target domain; this addresses the difficulty of obtaining good classification results in the target domain, where labeled data are scarce, and experiments verify that the model classifies well on the proposed datasets. In future work, more transfer strategies can be adopted to improve the model, and its generalization can be further examined on broader cross-domain datasets.
The above description is only a preferred embodiment of the present invention; the protection scope of the present invention is not limited thereto, and any person skilled in the art may substitute or change the technical solution and the inventive concept within the technical scope disclosed by the present invention.

Claims (1)

1. A CNN-based aspect-level cross-domain emotion analysis method, characterized by comprising the following steps:
S1, constructing an aspect-level emotion analysis model;
S2, performing aspect-level cross-domain emotion analysis.
The step of S1 is as follows:
the input of the aspect-level emotion analysis model is divided into two parts, the aspect word and the context, and the corresponding convolution process likewise comprises two parts; the context X contains l words, each word is converted into a d-dimensional word vector, and the sentence X is represented as a d × L matrix; a convolution kernel W_c of dimension d × k (k < L) performs a one-way translation scan over the context matrix, where k is the number of words covered by each scan, and each scan produces a convolution result c_i, as shown in formula (2-1):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-1)
where b_c is a bias, f is an activation function, and * denotes the convolution operation, so after the sentence has been scanned a vector c is obtained, as shown in formula (2-2):
c = [c_1, c_2, ..., c_{l_k}]   (2-2)
where l_k is the length of the vector c; with n_k convolution kernels of size k, scanning the whole sentence yields an n_k × l_k matrix, which is processed by maximum pooling, i.e., the maximum of each row is taken, so the sentence can be represented by an n_k-dimensional vector;
since the aspect word T may consist of one or more words, a small CNN is added to convert the aspect word T into a word-embedding matrix, as shown in formula (2-3), and the features of the aspect word are extracted through convolution and pooling operations, as shown in formula (2-4):
T = [T_i, T_{i+1}, ..., T_{i+k}]   (2-3)
v_i = f_relu(T_{i:i+k} * W_v + b_v)   (2-4)
where W_v is a convolution kernel of dimension d × k and b_v is a bias;
two groups of convolution kernels of the same size are set to scan the sentence simultaneously, their results are fed into two gate units respectively, and the aspect and emotion information are encoded respectively, giving two vectors s_i and a_i; when computing s_i, tanh is used as the activation function, as shown in formula (2-5):
s_i = f_tanh(X_{i:i+k} * W_s + b_s)   (2-5)
where W_s is a convolution kernel of dimension d × k and b_s is a bias;
when computing a_i, the embedding vector v_a of the aspect word is added to the input, v_a is obtained by maximum pooling over v_i, and relu is used as the activation function, as shown in formula (2-6), so a_i is the aspect feature:
a_i = f_relu(X_{i:i+k} * W_a + V_a·v_a + b_a)   (2-6)
after training, through the relu function the model gives a higher weight a_i to emotion words closer to the aspect word; conversely, if the two are far apart, the weight may be very small or 0; finally, the two vectors s_i and a_i are multiplied element-wise, and the result is the final feature vector o_i, as shown in formula (2-7):
o_i = s_i * a_i   (2-7)
o_i is input into a pooling layer and maximum pooling is performed; the resulting vector is then input into a fully connected layer, a Softmax classifier gives the probability of each class, and the class is decided according to the largest probability;
the step of S2 is as follows:
firstly, a neural network model is trained using the labeled data of the source domain; each word in a sentence X is converted into a d-dimensional word embedding, the maximum sentence length is fixed to L, shorter sentences are padded with 0 and longer ones are truncated, so a sentence X with l words is represented as a d × l matrix, as shown in formula (2-8):
X_s ∈ R^{d×l}   (2-8)
the aspect words are represented as a d × l matrix, as shown in formula (2-9):
T_s ∈ R^{d×l}   (2-9)
the sentence and the aspect words are input into convolution layers separately and the convolution layers extract the features of the sentence; the convolution kernel W is set to dimension d × k, with k < L, and the kernel performs a one-way translation scan over the sentence matrix and the aspect-word matrix respectively, where k is the number of words covered by each scan; after scanning, the convolution results c_i and v_i are obtained, as shown in formulas (2-10) and (2-11):
c_i = f(X_{i:i+k-1} * W_c + b_c)   (2-10)
v_i = f(T_{i:i+k} * W_v + b_v)   (2-11)
where b_c and b_v are biases, f is the activation function of the convolution kernels, and * denotes the convolution operation;
secondly, v_a, obtained from v_i by the maximum pooling operation, is fed together with c_i into the gating unit, where the aspect information and the emotion information are matched and fused, giving a group of emotion vectors O_s, as shown in formula (2-12):
O_s = [o_1, o_2, ..., o_{l_k}]   (2-12)
thirdly, to counter the overfitting that arises during model training, Dropout is used to improve the structure of the neural network; the maximum pooling operation is selected, and the maximum of the feature values is taken as the main feature, as shown in formula (2-13):
max(O_s) = (max o_1, max o_2, ..., max o_{l_k})   (2-13)
fourthly, the extracted features are input into a fully connected layer; the fully connected layer uses a softmax classifier to obtain the probability of each class, and the class is decided according to the largest probability, as shown in formulas (2-14) and (2-15):
p_j = exp(z_j) / Σ_k exp(z_k)   (2-14)
ŷ = argmax_j p_j   (2-15)
where z is the output of the fully connected layer;
fifthly, after the classification result of the source domain is obtained, the model is fine-tuned with a small amount of labeled target-domain data: the convolution layers use the kernel weights trained in the source domain, a forward propagation pass is applied to obtain the feature maps, the weights of the fully connected layer are fine-tuned by stochastic gradient descent, and emotion classification is then performed on the target domain, giving the final classification result, as shown in formulas (2-16) and (2-17):
p^t_j = exp(z^t_j) / Σ_k exp(z^t_k)   (2-16)
ŷ^t = argmax_j p^t_j   (2-17)
where z^t is the fully connected output for a target-domain sample.
CN202011026500.4A 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method Active CN112163091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011026500.4A CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011026500.4A CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Publications (2)

Publication Number Publication Date
CN112163091A true CN112163091A (en) 2021-01-01
CN112163091B CN112163091B (en) 2023-08-22

Family

ID=73864233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011026500.4A Active CN112163091B (en) 2020-09-25 2020-09-25 CNN-based aspect level cross-domain emotion analysis method

Country Status (1)

Country Link
CN (1) CN112163091B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113128229A (en) * 2021-04-14 2021-07-16 河海大学 Chinese entity relation joint extraction method
CN113204645A (en) * 2021-04-01 2021-08-03 武汉大学 Knowledge-guided aspect-level emotion analysis model training method
CN113468292A (en) * 2021-06-29 2021-10-01 中国银联股份有限公司 Method and device for analyzing aspect level emotion and computer readable storage medium
CN113627550A (en) * 2021-08-17 2021-11-09 北京计算机技术及应用研究所 Image-text emotion analysis method based on multi-mode fusion
CN114757183A (en) * 2022-04-11 2022-07-15 北京理工大学 Cross-domain emotion classification method based on contrast alignment network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614875A (en) * 2018-04-26 2018-10-02 北京邮电大学 Chinese emotion tendency sorting technique based on global average pond convolutional neural networks
CN108763326A (en) * 2018-05-04 2018-11-06 南京邮电大学 A kind of sentiment analysis model building method of the diversified convolutional neural networks of feature based
CN109753566A (en) * 2019-01-09 2019-05-14 大连民族大学 The model training method of cross-cutting sentiment analysis based on convolutional neural networks
KR20190136337A (en) * 2018-05-30 2019-12-10 가천대학교 산학협력단 Social Media Contents Based Emotion Analysis Method, System and Computer-readable Medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614875A (en) * 2018-04-26 2018-10-02 北京邮电大学 Chinese emotion tendency sorting technique based on global average pond convolutional neural networks
CN108763326A (en) * 2018-05-04 2018-11-06 南京邮电大学 A kind of sentiment analysis model building method of the diversified convolutional neural networks of feature based
KR20190136337A (en) * 2018-05-30 2019-12-10 가천대학교 산학협력단 Social Media Contents Based Emotion Analysis Method, System and Computer-readable Medium
CN109753566A (en) * 2019-01-09 2019-05-14 大连民族大学 The model training method of cross-cutting sentiment analysis based on convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhao Chuanjun; Wang Suge; Li Deyu: "Multi-source cross-domain emotion classification based on ensemble deep transfer learning", Journal of Shanxi University (Natural Science Edition), no. 04 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204645A (en) * 2021-04-01 2021-08-03 武汉大学 Knowledge-guided aspect-level emotion analysis model training method
CN113128229A (en) * 2021-04-14 2021-07-16 河海大学 Chinese entity relation joint extraction method
CN113128229B (en) * 2021-04-14 2023-07-18 河海大学 Chinese entity relation joint extraction method
CN113468292A (en) * 2021-06-29 2021-10-01 中国银联股份有限公司 Method and device for analyzing aspect level emotion and computer readable storage medium
CN113627550A (en) * 2021-08-17 2021-11-09 北京计算机技术及应用研究所 Image-text emotion analysis method based on multi-mode fusion
CN114757183A (en) * 2022-04-11 2022-07-15 北京理工大学 Cross-domain emotion classification method based on contrast alignment network
CN114757183B (en) * 2022-04-11 2024-05-10 北京理工大学 Cross-domain emotion classification method based on comparison alignment network

Also Published As

Publication number Publication date
CN112163091B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN109753566B (en) Model training method for cross-domain emotion analysis based on convolutional neural network
Du et al. Explicit interaction model towards text classification
WO2022022163A1 (en) Text classification model training method, device, apparatus, and storage medium
CN106776581B (en) Subjective text emotion analysis method based on deep learning
Roy et al. Unit dependency graph and its application to arithmetic word problem solving
CN112163091B (en) CNN-based aspect level cross-domain emotion analysis method
CN108446271B (en) Text emotion analysis method of convolutional neural network based on Chinese character component characteristics
CN110502753A (en) A kind of deep learning sentiment analysis model and its analysis method based on semantically enhancement
CN110347836B (en) Method for classifying sentiments of Chinese-Yue-bilingual news by blending into viewpoint sentence characteristics
CN107729309A (en) A kind of method and device of the Chinese semantic analysis based on deep learning
CN112084327A (en) Classification of sparsely labeled text documents while preserving semantics
CN111160037A (en) Fine-grained emotion analysis method supporting cross-language migration
CN108038492A (en) A kind of perceptual term vector and sensibility classification method based on deep learning
CN111368086A (en) CNN-BilSTM + attribute model-based sentiment classification method for case-involved news viewpoint sentences
CN111858935A (en) Fine-grained emotion classification system for flight comment
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN109271636B (en) Training method and device for word embedding model
Xiao et al. Chinese text sentiment analysis based on improved Convolutional Neural Networks
CN113934835B (en) Retrieval type reply dialogue method and system combining keywords and semantic understanding representation
CN115129807A (en) Fine-grained classification method and system for social media topic comments based on self-attention
Kalbhor et al. Survey on ABSA based on machine learning, deep learning and transfer learning approach
CN112905750A (en) Generation method and device of optimization model
Liu College oral English teaching reform driven by big data and deep neural network technology
Nsaif et al. Political Post Classification based on Firefly and XGBoost
CN112507723A (en) News emotion analysis method based on multi-model fusion

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant