CN110874411A - Cross-domain emotion classification system based on attention mechanism fusion - Google Patents

Cross-domain emotion classification system based on attention mechanism fusion

Info

Publication number
CN110874411A
Authority
CN
China
Prior art keywords
text
module
attention
emotion classification
attention mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911138355.6A
Other languages
Chinese (zh)
Inventor
廖祥文
陈癸旭
陈志豪
邓立明
陈开志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN201911138355.6A
Publication of CN110874411A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The invention relates to a cross-domain emotion classification system based on attention mechanism fusion. The system comprises: a comment text preprocessing module for obtaining vector representations of the source-domain and target-domain texts; a text semantic learning module for learning the semantic dependencies between words; an attention mechanism fusion module for fusing different attention forms to obtain the overall weight of each word for text classification; a hierarchical attention module for computing attention weights at the word level and the sentence level and judging how much each word contributes to its sentence representation and each sentence to the document representation; and an emotion classification output module for obtaining the final emotion classification result with a classification function. The system can automatically extract latent features shared by the target domain and the source domain, abstract and combine these features, and finally identify the emotion category of target-domain texts.

Description

Cross-domain emotion classification system based on attention mechanism fusion
Technical Field
The invention relates to the field of sentiment analysis and opinion mining, and in particular to a cross-domain emotion classification system based on attention mechanism fusion, which analyses cross-domain emotion categories more effectively through cross-domain text representation learning and the learning of domain-adaptive feature representations.
Background
Sentiment classification is an important and challenging task. It has achieved significant success in domains where sufficient labeled training data are available. However, labeling enough data is very time-consuming and laborious, which is a major obstacle to adapting sentiment classification systems to new domains. Moreover, users tend to use different words when expressing emotions in different domains, so if a classifier trained in one domain is applied directly to another, its performance drops sharply because of the gap between the domains. Cross-domain sentiment text classification therefore aims at a general sentiment classification solution: a classifier is trained on labeled data in a source domain and then applied to a target domain, i.e. an unlabeled domain, to classify the emotion of its texts; such a system is called a cross-domain emotion classification system.
Most current cross-domain sentiment classification research relies on feature-based transformation and requires manual selection of pivot and non-pivot features. Structural Correspondence Learning (SCL) is a typical method that tries to learn a mapping matrix from the non-pivot feature space to the pivot feature space; the SFA method aims to bridge the source domain and the target domain by aligning the pivot and non-pivot features of the different domains. All of these methods require a large amount of unlabeled target-domain data to build the transfer process; in addition, they neither fully mine the semantics of words nor fully exploit the data and domain labels. In recent years the rise of deep learning has brought better results in cross-domain sentiment classification, mainly by learning common features and shared parameters for emotion classification: the stacked denoising autoencoder (SDA) generates feature representations in a unified format for documents from the source and target domains, and the mSDA method retains the strong learning capacity of the SDA while alleviating its heavy computation and poor scalability. However, such deep learning methods lack interpretability.
In text sentiment classification, a text depends strongly on its contextual semantics, which a standard neural network model cannot handle well. Moreover, each word contributes differently to its sentence and each sentence contributes differently to the document, so an attention mechanism needs to be introduced to improve classification performance. The attention mechanism mimics how human attention works: more attention is devoted to important content and less to the rest. A soft attention mechanism assigns a probability to every word of the input sentence when computing the attention distribution and passes these probabilities to the next layer. A hard attention mechanism instead locates one specific word in the input sentence, aligns the target word with it, and sets the probability of all other input words to zero. A local attention mechanism combines the soft and hard mechanisms, but each alignment must consider more of the previously encoded hidden states, which makes it computationally expensive. A more efficient cross-domain sentiment classification method is therefore desirable, one that improves the accuracy of cross-domain classification while reducing the consumption of manual time and effort.
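For illustration only (this snippet is not part of the patented system), the following minimal Python sketch contrasts the soft and hard attention weightings described above; the relevance scores are arbitrary placeholder values.

```python
# Minimal illustration of soft vs. hard attention weighting over an input sentence.
import numpy as np

scores = np.array([0.2, 1.5, 0.3, 2.1])          # placeholder relevance score per word

# Soft attention: every word is assigned a probability via softmax.
soft_weights = np.exp(scores) / np.exp(scores).sum()

# Hard attention: one specific word is selected; all other probabilities are 0.
hard_weights = np.zeros_like(scores)
hard_weights[scores.argmax()] = 1.0

print(soft_weights.round(3))   # probabilities over all words, summing to 1
print(hard_weights)            # a one-hot selection of the most relevant word
```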
Disclosure of Invention
The invention aims to provide a cross-domain emotion classification system based on attention mechanism fusion, which can automatically extract latent features shared by the target domain and the source domain, abstract and combine these features, and finally identify the emotion category of target-domain texts.
To achieve the above purpose, the technical scheme of the invention is as follows: a cross-domain emotion classification system based on attention mechanism fusion, comprising:
the text preprocessing module is used for obtaining the vector representations corresponding to the source-domain and target-domain texts;
the text semantic learning module is used for learning the semantic dependencies between the words of the text vectors obtained by the text preprocessing module;
the attention mechanism fusion module is used for fusing different attention forms to obtain the overall weight of each word of the text vector for the text;
the hierarchical attention module is used for computing the attention weights of the text at the word level and the sentence level, judging the weight of each word in its sentence representation and of each sentence in the document representation, and producing a text representation vector;
and the emotion classification output module is used for applying a classification function to the text representation vector output by the hierarchical attention module to obtain the final emotion classification result.
In an embodiment of the present invention, the text preprocessing module uses Word2vec to extract the vector representations corresponding to the source-domain and target-domain texts.
In an embodiment of the invention, the text semantic learning module utilizes BiGRU to capture semantic dependencies between words of a text vector.
In an embodiment of the invention, the attention mechanism fusion module combines a Bilinear attention mechanism and a Dot attention mechanism, so that the contribution of words to sentences and of sentences to documents is computed more accurately, which improves cross-domain text classification.
In an embodiment of the present invention, the emotion classification output module processes the text expression vector by using a softmax function, and predicts the emotion classification of each text.
In one embodiment of the present invention, during the training phase of the model, forward propagation of information and backward propagation of errors continually adjust the parameters to gradually optimize the objective function.
Compared with the prior art, the invention has the following beneficial effects: the system can automatically extract latent features shared by the target domain and the source domain, abstract and combine these features, and finally identify the emotion category of target-domain texts.
Drawings
FIG. 1 is a schematic configuration diagram of a cross-domain emotion classification system based on attention mechanism fusion.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The invention provides a cross-domain emotion classification system based on attention mechanism fusion, which comprises the following components:
the text preprocessing module is used for obtaining the vector representations corresponding to the source-domain and target-domain texts;
the text semantic learning module is used for learning the semantic dependencies between the words of the text vectors obtained by the text preprocessing module;
the attention mechanism fusion module is used for fusing different attention forms to obtain the overall weight of each word of the text vector for the text;
the hierarchical attention module is used for computing the attention weights of the text at the word level and the sentence level, judging the weight of each word in its sentence representation and of each sentence in the document representation, and producing a text representation vector;
and the emotion classification output module is used for applying a classification function to the text representation vector output by the hierarchical attention module to obtain the final emotion classification result.
The following is a specific implementation of the present invention.
FIG. 1 shows a schematic configuration diagram of a cross-domain emotion classification system based on attention mechanism fusion according to an embodiment of the present invention. As shown in the figure, the cross-domain emotion classification system based on attention mechanism fusion implemented according to the invention comprises:
the text preprocessing module 1, which obtains the vector representations of the source-domain and target-domain texts; the text semantic learning module 2, which learns the semantic dependencies between words; the attention mechanism fusion module 3, which fuses different attention forms to obtain the overall contribution weights within the text; the hierarchical attention module 4, which computes attention weights at the word level and the sentence level and judges how much each word or sentence contributes to the text classification; and the emotion classification output module 5, which applies a classification function to obtain the final emotion classification result. Each module is described in detail below.
1) Text preprocessing module 1
First, how the text preprocessing module 1 obtains an initial text vector is described.
To facilitate data processing, analysis and application, the data sets of the source domain and the target domain are first preprocessed: punctuation marks are removed, stop words are filtered out, and the texts are segmented into words. A word2vec model is then trained on the resulting words to obtain the vector representations of the source-domain and target-domain texts, which serve as the input of the neural network.
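A minimal sketch of this preprocessing step is given below for illustration, assuming the gensim (version 4 or later) Word2Vec API and a toy stop-word list; the example texts and identifiers are placeholders, not taken from the patent.

```python
# Minimal preprocessing sketch: strip punctuation, filter stop words, segment into
# words, then train word2vec jointly on source- and target-domain texts.
import re
from gensim.models import Word2Vec

STOPWORDS = {"the", "a", "an", "is", "was", "and"}   # placeholder stop-word list

def preprocess(text):
    # Lowercase, drop punctuation, split into tokens, and remove stop words.
    tokens = re.sub(r"[^\w\s]", " ", text.lower()).split()
    return [t for t in tokens if t not in STOPWORDS]

source_docs = ["The plot was gripping and well written."]    # labelled source-domain reviews
target_docs = ["The battery drains far too quickly."]        # unlabelled target-domain reviews
corpus = [preprocess(d) for d in source_docs + target_docs]

# Training on both domains puts source and target words into one embedding space.
w2v = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1, sg=1)
doc_vectors = [[w2v.wv[t] for t in doc] for doc in corpus]   # per-word vectors fed to the network
```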
2) Text semantic learning module 2
The following describes how the text semantic learning module 2 learns from the initial text vectors produced by module 1 and obtains semantic information.
The BiGRU consists of a forward GRU and a backward GRU. The GRU network is an effective variant of the long short-term memory network (LSTM) with a strong ability to capture long-distance semantics, but a unidirectional GRU only acquires the forward semantic information of a text and ignores future context. The invention therefore uses a BiGRU network to additionally learn the reverse semantic information of the text, which captures bidirectional semantic dependencies better and benefits finer-grained classification.
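A minimal sketch of such a word-level BiGRU encoder is given below, assuming PyTorch; the dimensions are illustrative and not specified by the patent.

```python
import torch
import torch.nn as nn

class BiGRUEncoder(nn.Module):
    """Encodes a sequence of word vectors with a forward and a backward GRU."""
    def __init__(self, emb_dim=100, hidden_dim=128):
        super().__init__()
        # bidirectional=True adds the backward GRU that captures future context.
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, word_vectors):
        # word_vectors: (batch, seq_len, emb_dim), e.g. the word2vec lookups
        hidden_states, _ = self.gru(word_vectors)
        return hidden_states                     # (batch, seq_len, 2 * hidden_dim)

encoder = BiGRUEncoder()
h = encoder(torch.randn(2, 20, 100))             # two sentences of 20 words each
```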
3) Attention mechanism fusion module 3
The following describes how the attention mechanism fusion module measures, at the word level, the influence of each word on text classification.
Current research shows that combining neural networks with attention mechanisms helps improve text classification. Inspired by this, the invention fuses several attention mechanisms to compute the contribution of each word to text classification. The relations between the words are computed mainly with a Bilinear attention mechanism, which performs text matching, and a Dot attention mechanism. The specific implementation is as follows:
Bilinear Attention:
e_i^b = h_i^T · W_b · q
α_i^b = exp(e_i^b) / Σ_j exp(e_j^b)
v_b = Σ_i α_i^b · h_i
Dot Attention:
e_i^d = h_i^T · q
α_i^d = exp(e_i^d) / Σ_j exp(e_j^d)
v_d = Σ_i α_i^d · h_i
where h_i denotes the BiGRU hidden state of the i-th word, W_b is the bilinear weight matrix, q is the attention query vector, and v_b and v_d are the text representations produced by the Bilinear and Dot attention mechanisms respectively.
The text representations obtained from the two attention mechanisms are fused through a residual connection, which enriches the semantic information of the text and avoids vanishing gradients during training. The specific implementation is as follows:
x_c = [v_b, v_d]
x_sc = σ(W_2 · σ(W_1 · x_c) + x_c)
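The following PyTorch sketch illustrates this fusion step under the assumptions made above: a pooled form of the Bilinear and Dot attention with an assumed learnable query vector q, and σ taken as the sigmoid function. It is an illustration, not the patent's exact implementation.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuses Bilinear- and Dot-attention text vectors through a residual connection."""
    def __init__(self, dim=256):
        super().__init__()
        self.W_b = nn.Parameter(torch.randn(dim, dim) * 0.01)   # bilinear weight matrix
        self.q = nn.Parameter(torch.randn(dim))                 # assumed learnable query vector
        self.W1 = nn.Linear(2 * dim, 2 * dim)
        self.W2 = nn.Linear(2 * dim, 2 * dim)

    def forward(self, h):
        # h: (batch, seq_len, dim) BiGRU hidden states
        e_b = torch.einsum("bld,de,e->bl", h, self.W_b, self.q)   # bilinear scores
        e_d = torch.einsum("bld,d->bl", h, self.q)                # dot-product scores
        v_b = (torch.softmax(e_b, dim=1).unsqueeze(-1) * h).sum(1)
        v_d = (torch.softmax(e_d, dim=1).unsqueeze(-1) * h).sum(1)
        x_c = torch.cat([v_b, v_d], dim=-1)                       # x_c = [v_b, v_d]
        # x_sc = sigma(W2 * sigma(W1 * x_c) + x_c): residual fusion of the two views
        return torch.sigmoid(self.W2(torch.sigmoid(self.W1(x_c))) + x_c)

fusion = AttentionFusion(dim=256)
x_sc = fusion(torch.randn(2, 20, 256))   # (batch, 2 * dim) fused text representation
```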
4) Hierarchical attention module 4
Each word contributes differently to the sentence representation, and each sentence contributes differently to the semantics of the document. Hierarchical attention therefore weights, at the word level, the words that form each sentence and, at the sentence level, the sentences that form the document.
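A minimal sketch of this hierarchical weighting is given below, assuming PyTorch and an additive (tanh) scoring form, since the patent only states that words are weighted within sentences and sentences within the document; the same pooling layer is reused at both levels.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Scores each element of a sequence and returns the attention-weighted sum."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.context = nn.Parameter(torch.randn(dim))   # level-specific context vector

    def forward(self, x):
        # x: (batch, seq_len, dim)
        scores = torch.tanh(self.proj(x)) @ self.context          # (batch, seq_len)
        alpha = torch.softmax(scores, dim=1).unsqueeze(-1)         # attention weights
        return (alpha * x).sum(dim=1)                              # (batch, dim)

word_pool = AttentionPool(512)       # weights the words within each sentence
sentence_pool = AttentionPool(512)   # weights the sentences within the document

words = torch.randn(4, 30, 512)               # a document of 4 sentences x 30 words
sentences = word_pool(words).unsqueeze(0)     # (1, 4, 512) sentence representations
document = sentence_pool(sentences)           # (1, 512) text representation vector
```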
5) Emotion classification output module 5
Finally, how the emotion classification output module 5 processes the source domain and target domain data is described.
Module 2 learns the bidirectional semantic dependency information of the target domain and the source domain, and the text representation vectors are obtained through the attention fusion of module 3 and the hierarchical attention of module 4. The emotion classification output module 5 then feeds each resulting vector through a softmax classification function and, according to the set threshold, obtains the predicted emotion category of the text. In the training stage, the emotion category is predicted from the source-domain text representations, the error against the actual emotion labels is computed, and the parameters of the whole system are updated iteratively through back-propagation with stochastic gradient descent; otherwise, the emotion category of the target-domain text representations is predicted and the predicted value is output.
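The following PyTorch sketch illustrates this output stage under stated assumptions: a linear layer with softmax over the text representation vector, trained with cross-entropy loss and stochastic gradient descent on labelled source-domain representations, while target-domain representations are only passed through for prediction. The class count, dimensions and learning rate are illustrative.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 2                                   # e.g. positive / negative (assumed)
classifier = nn.Linear(512, NUM_CLASSES)          # maps the text representation to class scores
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()                   # combines log-softmax and negative log-likelihood

def train_step(source_repr, labels):
    """One update on labelled source-domain representations."""
    optimizer.zero_grad()
    loss = loss_fn(classifier(source_repr), labels)   # error against the actual emotion labels
    loss.backward()                                   # back-propagate the error
    optimizer.step()                                  # stochastic gradient descent update
    return loss.item()

def predict(target_repr):
    """Predict emotion categories for target-domain representations."""
    with torch.no_grad():
        probs = torch.softmax(classifier(target_repr), dim=-1)
        return probs.argmax(dim=-1)

loss = train_step(torch.randn(8, 512), torch.randint(0, NUM_CLASSES, (8,)))
preds = predict(torch.randn(8, 512))
```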
The above are preferred embodiments of the present invention; any changes made according to the technical scheme of the invention that produce equivalent functional effects without exceeding the scope of the technical scheme fall within the protection scope of the present invention.

Claims (6)

1. A cross-domain emotion classification system based on attention mechanism fusion is characterized by comprising:
the text preprocessing module is used for obtaining the vector representations corresponding to the source-domain and target-domain texts;
the text semantic learning module is used for learning the semantic dependencies between the words of the text vectors obtained by the text preprocessing module;
the attention mechanism fusion module is used for fusing different attention forms to obtain the overall weight of each word of the text vector for the text;
the hierarchical attention module is used for computing the attention weights of the text at the word level and the sentence level, judging the weight of each word in its sentence representation and of each sentence in the document representation, and producing a text representation vector;
and the emotion classification output module is used for applying a classification function to the text representation vector output by the hierarchical attention module to obtain the final emotion classification result.
2. The cross-domain emotion classification system based on attention mechanism fusion as claimed in claim 1, wherein the text preprocessing module uses Word2vec to extract the vector representations corresponding to the source-domain and target-domain texts.
3. The cross-domain emotion classification system based on attention mechanism fusion of claim 1, wherein the text semantic learning module utilizes BiGRU to capture semantic dependencies between words of a text vector.
4. The cross-domain emotion classification system based on attention mechanism fusion as claimed in claim 1, wherein the attention mechanism fusion module combines a Bilinear attention mechanism and a Dot attention mechanism, so that the contribution of words to sentences and of sentences to documents is computed more accurately, which further improves cross-domain text classification.
5. The cross-domain emotion classification system based on attention mechanism fusion as claimed in claim 1, wherein the emotion classification output module processes the text representation vector using a softmax function to predict the emotion classification of each text.
6. The cross-domain emotion classification system based on attention mechanism fusion as claimed in claim 1, wherein, in the training stage of the model, forward propagation of information and backward propagation of errors continually adjust the parameters to optimize the objective function step by step.
CN201911138355.6A 2019-11-20 2019-11-20 Cross-domain emotion classification system based on attention mechanism fusion Pending CN110874411A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911138355.6A CN110874411A (en) 2019-11-20 2019-11-20 Cross-domain emotion classification system based on attention mechanism fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911138355.6A CN110874411A (en) 2019-11-20 2019-11-20 Cross-domain emotion classification system based on attention mechanism fusion

Publications (1)

Publication Number Publication Date
CN110874411A true CN110874411A (en) 2020-03-10

Family

ID=69718049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911138355.6A Pending CN110874411A (en) 2019-11-20 2019-11-20 Cross-domain emotion classification system based on attention mechanism fusion

Country Status (1)

Country Link
CN (1) CN110874411A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180082171A1 (en) * 2016-09-22 2018-03-22 Salesforce.Com, Inc. Pointer sentinel mixture architecture
CN109558487A (en) * 2018-11-06 2019-04-02 华南师范大学 Document Classification Method based on the more attention networks of hierarchy
CN110134771A (en) * 2019-04-09 2019-08-16 广东工业大学 A kind of implementation method based on more attention mechanism converged network question answering systems

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
廖祥文 et al.: "Cross-domain sentiment classification combining representation learning and transfer learning" (结合表示学习和迁移学习的跨领域情感分类), 北京大学学报(自然科学版) (Journal of Peking University, Natural Science Edition) *
曾碧卿 et al.: "Sentiment analysis based on a dual-attention convolutional neural network model" (基于双注意力卷积神经网络模型的情感分析研究), 广东工业大学学报 (Journal of Guangdong University of Technology) *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984791A (en) * 2020-09-02 2020-11-24 南京信息工程大学 Long text classification method based on attention mechanism
CN111984791B (en) * 2020-09-02 2023-04-25 南京信息工程大学 Attention mechanism-based long text classification method
WO2022188773A1 (en) * 2021-03-12 2022-09-15 腾讯科技(深圳)有限公司 Text classification method and apparatus, device, computer-readable storage medium, and computer program product
CN113064968A (en) * 2021-04-06 2021-07-02 齐鲁工业大学 Social media emotion analysis method and system based on tensor fusion network
CN113064968B (en) * 2021-04-06 2022-04-19 齐鲁工业大学 Social media emotion analysis method and system based on tensor fusion network
CN113435496A (en) * 2021-06-24 2021-09-24 湖南大学 Self-adaptive fusion multi-mode emotion classification method based on attention mechanism
CN113505226A (en) * 2021-07-09 2021-10-15 福州大学 Text emotion classification system fused with graph convolution neural network
CN113505226B (en) * 2021-07-09 2023-08-04 福州大学 Text emotion classification system fusing graph convolution neural network
CN113722439A (en) * 2021-08-31 2021-11-30 福州大学 Cross-domain emotion classification method and system based on antagonism type alignment network
CN113779249A (en) * 2021-08-31 2021-12-10 华南师范大学 Cross-domain text emotion classification method and device, storage medium and electronic equipment
CN113779249B (en) * 2021-08-31 2022-08-16 华南师范大学 Cross-domain text emotion classification method and device, storage medium and electronic equipment
CN113722439B (en) * 2021-08-31 2024-01-09 福州大学 Cross-domain emotion classification method and system based on antagonism class alignment network

Similar Documents

Publication Publication Date Title
CN111783462B (en) Chinese named entity recognition model and method based on double neural network fusion
CN110874411A (en) Cross-domain emotion classification system based on attention mechanism fusion
CN109635109B (en) Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism
CN109753566B (en) Model training method for cross-domain emotion analysis based on convolutional neural network
US11631007B2 (en) Method and device for text-enhanced knowledge graph joint representation learning
CN108984724B (en) Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation
CN107291795B (en) Text classification method combining dynamic word embedding and part-of-speech tagging
CN108460013B (en) Sequence labeling model and method based on fine-grained word representation model
CN111177366B (en) Automatic generation method, device and system for extraction type document abstract based on query mechanism
CN110569508A (en) Method and system for classifying emotional tendencies by fusing part-of-speech and self-attention mechanism
CN108664589B (en) Text information extraction method, device, system and medium based on domain self-adaptation
CN113158665B (en) Method for improving dialog text generation based on text abstract generation and bidirectional corpus generation
CN110263325B (en) Chinese word segmentation system
CN108228569B (en) Chinese microblog emotion analysis method based on collaborative learning under loose condition
CN113626589B (en) Multi-label text classification method based on mixed attention mechanism
CN112069831A (en) Unreal information detection method based on BERT model and enhanced hybrid neural network
CN112966525B (en) Law field event extraction method based on pre-training model and convolutional neural network algorithm
CN110781290A (en) Extraction method of structured text abstract of long chapter
CN112749274A (en) Chinese text classification method based on attention mechanism and interference word deletion
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN114239574A (en) Miner violation knowledge extraction method based on entity and relationship joint learning
CN114417851A (en) Emotion analysis method based on keyword weighted information
CN113657115A (en) Multi-modal Mongolian emotion analysis method based on ironic recognition and fine-grained feature fusion
CN114564563A (en) End-to-end entity relationship joint extraction method and system based on relationship decomposition
CN114048314A (en) Natural language steganalysis method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200310