CN116049393A - Aspect-level text emotion classification method based on GCN - Google Patents

Aspect-level text emotion classification method based on GCN

Info

Publication number
CN116049393A
Authority
CN
China
Prior art keywords
gcn
module
grammar
matrix
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211650414.XA
Other languages
Chinese (zh)
Inventor
龙昭华
王高远
张�林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202211650414.XA priority Critical patent/CN116049393A/en
Publication of CN116049393A publication Critical patent/CN116049393A/en
Pending legal-status Critical Current


Classifications

    • G06F16/353: Information retrieval of unstructured textual data; clustering; classification into predefined classes
    • G06F40/253: Handling natural language data; natural language analysis; grammatical analysis; style critique
    • G06F40/30: Handling natural language data; semantic analysis
    • G06N3/08: Computing arrangements based on biological models; neural networks; learning methods

Abstract

The invention discloses an aspect-level text emotion classification method based on a graph convolutional neural network (GCN), which comprises the following steps: (1) preprocessing: given the text of a sentence-aspect pair, use BERT as the sentence encoder to extract hidden context representations and generate hidden state vectors; (2) feed the hidden state vectors of the sentence into a grammar GCN module and a semantic GCN module respectively for feature learning; (3) use a BiAffine module to realize effective information flow, i.e., to exchange grammar and semantic features; (4) apply average pooling and concatenation operations on the aspect nodes of the grammar GCN and semantic GCN modules to obtain the final feature representation and realize aspect-oriented emotion classification.

Description

Aspect-level text emotion classification method based on GCN
Technical Field
The invention belongs to the field of natural language processing and relates to an aspect-level text emotion classification method based on a graph convolutional neural network (GCN).
Background
Social media has developed rapidly around the world, and emotion analysis has become a basic task in the field of natural language processing. Analyzing text that carries users' emotional viewpoints with natural language processing technology and mining the emotional tendencies contained in it has become an important means of social public-opinion supervision and of after-sales feedback for manufacturers. Research on text emotion analysis methods therefore has important social significance and commercial value.
According to the granularity of the text studied, emotion analysis can be divided into document-level, sentence-level, and aspect-level emotion classification. Early emotion analysis mainly performed coarse-grained analysis of document-level and sentence-level text. Document-level emotion classification tags the emotional tendency/polarity of an entire opinionated document, i.e., determines whether the document as a whole conveys a positive or negative opinion; because this is too coarse, it cannot describe the emotional tendencies of the text accurately. Sentence-level emotion classification judges the emotion polarity of subjective sentences. However, coarse-grained emotion analysis assumes that a text contains only a single emotion, such as positive or negative, and cannot identify emotions in text covering multiple aspects. Aspect-level text emotion analysis, with its fine granularity, can accurately judge the emotion polarities of different aspects within a single sentence, and has therefore become an important research direction in the field of emotion analysis.
In current research on aspect-level emotion classification, the task is mainly addressed by modeling the semantic associations between the context and the aspect terms with attention-based neural networks. Wang et al. used an attention mechanism to focus on different parts of the sentence and generate attention vectors for aspect emotion classification; Chen et al. proposed a multi-layer attention network to infer the emotion polarity of an aspect; Ma et al. introduced an interactive attention mechanism that generates representations of the aspect and the context separately; Wang et al. designed an aspect-oriented hierarchical attention model for aspect emotion classification. Another trend is to use dependency trees, whose syntactic information can link aspects to the corresponding opinion words; GCNs built on dependency trees have achieved good results in ABSA. Zhang et al. (2019) and Sun et al. (2019) stacked GCN layers to extract rich representations over the dependency tree; Liang et al. (2020) constructed aspect-oriented graphs to learn aspect-specific emotional characteristics; Liang et al. built an emotion-enhanced graph by integrating the emotion knowledge of SenticNet, considering the emotion information between opinion words and aspect words; Tian et al. (2021) used dependency types to distinguish different relations in the dependency tree. However, these methods typically ignore the effective fusion of syntactic structures and semantic associations that would yield richer information.
The defects of the existing methods are as follows:
(1) Sentences have different sensitivities to grammatical information and to semantic information. In particular, sentences whose grammatical structure is not obvious have low sensitivity to grammatical information, meaning that in some cases grammatical information may not help the model judge the emotion polarity of the sentence.
(2) The syntactic structure is not fully utilized, and only the information of neighboring nodes is considered. In addition, some sentences express the emotion of the aspect word in a vague manner, and the aspect word has no direct syntactic relation to the opinion word. Most methods stack multiple GCN layers to reach the opinion-word representation, which introduces potential noise.
CN114791950A discloses an aspect-level emotion classification method and device based on part-of-speech position and graph convolutional network. The method includes: obtaining the word-vector representation of the sentence text where a word is located according to the part-of-speech position information of the word; generating, for each target sentence, an enhanced syntactic dependency tree that integrates part-of-speech position information and graph convolutional network information; and realizing emotion classification by learning the interaction information between aspect words and context.
The present invention differs from CN114791950A in that CN114791950A enhances the syntactic dependency-tree feature information by fusing part-of-speech positions, whereas the present invention not only extracts syntactic feature information but also extracts semantic feature information through aspect attention and self-attention, and further improves the aspect-level emotion classification effect by fusing syntactic and semantic features, achieving better text emotion analysis accuracy.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art by providing an aspect-level text emotion classification method based on GCN. The technical scheme of the invention is as follows:
An aspect-level text emotion classification method based on GCN, comprising the following steps:
Step a: acquire the sentence-aspect pairs of the aspect-level emotion classification task, and use a BERT encoder (pre-trained language model) as the sentence encoder to extract hidden context representations and generate hidden state vectors;
Step b: feed the hidden state vectors of the sentence obtained in step a into a grammar GCN (graph convolutional neural network) module and a semantic GCN module respectively; the grammar GCN module builds an adjacency matrix from the dependency tree and then extracts sentence grammar features with the graph convolutional network, while the semantic GCN module extracts better semantic features by integrating an aspect-aware attention matrix and a self-attention matrix;
Step c: use a BiAffine module to realize effective information flow; the BiAffine module effectively exchanges relevant features between the SynGCN and SemGCN modules through mutual BiAffine transformation;
Step d: aggregate all aspect node representations from the grammar GCN and semantic GCN modules by pooling and concatenation to form the final aspect representation, realizing aspect-word-oriented emotion classification.
Further, the step a specifically includes:
Given a sentence-aspect pair (s, a), where s = {w_1, w_2, ..., w_n} is the sentence and a = {a_1, a_2, ..., a_m} is an aspect word and also a subsequence of sentence s, a belonging to a predefined set of aspects; w_n represents a word in the given sentence s, and a_m is a preset aspect word in a. The BERT encoder takes "[CLS] sentence [SEP] aspect [SEP]" as input text, where [CLS] is the special classification token in BERT and [SEP] is the separator token in BERT. The output of the BERT encoder is as shown in formula (1):

H = [h_0, h_1, h_2, ..., h_m, h_{m+1}]  #(1)

BERT is an architecture that can be used for many downstream tasks, such as question answering, classification, and NER. The pre-trained BERT can be regarded as a black box that provides a vector of H = 768 dimensions for each input token (word) in the sequence. The sequence may be a single sentence or a pair of sentences separated by the delimiter [SEP] and starting with the token [CLS]. h_m denotes the context representation of the m-th token obtained after BERT encoding.
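For illustration, this encoding step can be sketched with the HuggingFace transformers library as follows; the model name bert-base-uncased and the example sentence are illustrative assumptions, not fixed by the invention:

    # Minimal sketch of step a: encoding a sentence-aspect pair with BERT.
    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    bert = BertModel.from_pretrained("bert-base-uncased")

    sentence = "The food was great but the service was slow"
    aspect = "service"

    # Passing a text pair makes the tokenizer build
    # "[CLS] sentence [SEP] aspect [SEP]", matching formula (1).
    inputs = tokenizer(sentence, aspect, return_tensors="pt")
    with torch.no_grad():
        H = bert(**inputs).last_hidden_state  # shape: (1, seq_len, 768)
    print(H.shape)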
Further, in said step b, the grammar GCN module converts the dependency tree into a graph structure G^syn = (A^syn, H), where A^syn is the adjacency matrix; grammar information is then extracted with a graph convolutional network, with the formulas as follows:

Ã^syn = D^{-1}(A^syn + I)  #(2)

H^syn(0) = H_c  #(3)

H^syn(l+1) = ReLU(Ã^syn H^syn(l) W^(l+1) + b^(l+1))  #(4)

H^syn = H^syn(L)  #(5)

where W^(l+1) is the weight of layer l+1, and ReLU is a piecewise linear function that changes all negative values to 0 while leaving positive values unchanged; H^syn(l+1) is the vector representation of layer l+1 of the grammar GCN, and H_c is the feature matrix output by the Bi-LSTM or BERT encoder, which serves as the input of the first GCN layer; W^(l) ∈ R^{d_lstm × d_gcn} is the learnable matrix of the layer-l GCN, d_lstm is the dimension of the hidden representation learned by Bi-LSTM, and d_gcn is the dimension of the GCN layer output. Through the l-step convolution operation, each node iteratively aggregates information from its one-hop neighbors and updates its representation, so that the grammar graph convolution module integrates the syntax information into the final representation H^syn.
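As a minimal PyTorch sketch of this graph convolution (the (degree + 1) normalization, the module name, and the toy adjacency matrix are illustrative assumptions):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SynGCNLayer(nn.Module):
        """One graph-convolution layer over the dependency adjacency matrix,
        a sketch of formula (4)."""
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.W = nn.Linear(in_dim, out_dim)

        def forward(self, H, A):
            # A: (batch, n, n) adjacency from the dependency tree (with self-loops)
            # H: (batch, n, in_dim) node representations from BERT/Bi-LSTM
            deg = A.sum(dim=-1, keepdim=True).clamp(min=1)  # node degrees
            agg = torch.bmm(A, self.W(H)) / deg             # aggregate one-hop neighbors
            return F.relu(agg)                              # ReLU zeroes negatives

    # Usage: two stacked layers, matching the experiments (number of GCN layers = 2).
    H = torch.randn(1, 5, 768)       # 5 tokens, BERT hidden size 768
    A = torch.eye(5).unsqueeze(0)    # toy adjacency: self-loops only
    H_syn = SynGCNLayer(300, 300)(SynGCNLayer(768, 300)(H, A), A)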
In the semantic GCN module, the attention matrix integrates the aspect-aware attention matrix and the self-attention matrix to obtain better semantic features, wherein:

H_a = repeat(f(h_{a_1}, ..., h_{a_m}), n)  #(6)

A^asp_i = softmax((H_a W_a)(K W_k)^T / √d + b)  #(7)

where H_a is the aspect-aware feature matrix and b is the bias; K is equal to the H generated by the coding layer, and W_a ∈ R^{d×d}, W_k ∈ R^{d×d} are learnable weight matrices (R^{d×d} denotes a matrix of dimension d×d). H_a ∈ R^{n×d}, obtained by average-pooling the aspect representation and replicating it n times, serves as the aspect word representation. A p-head aspect-aware attention is used to obtain the attention score matrix of a sentence, with A^asp_i denoting the matrix obtained through the i-th attention head;

A^self = softmax((Q W^Q)(K W^K)^T / √d)  #(8)

A^self is constructed with self-attention, which captures the interaction between two arbitrary words in a single sentence, where Q and K are both equal to the H generated by the coding layer, and W^Q ∈ R^{d×d}, W^K ∈ R^{d×d} are learnable parameters. Then the aspect-aware attention matrix and the self-attention matrix are integrated:

A_i = (A^asp_i + A^self) / 2  #(9)

A_i ∈ R^{n×n} denotes the integrated attention score matrix and serves as the adjacency input of the semantic graph convolution:

H^sem(l+1) = ReLU(A_i H^sem(l) W^(l+1) + b^(l+1))  #(10)

yielding the semantic feature representation H^sem.
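A minimal single-head sketch of the attention integration in formulas (6)-(9) follows; the patent uses p heads, and the function and parameter names here are illustrative assumptions:

    import math
    import torch
    import torch.nn.functional as F

    def semantic_attention(H, aspect_mask, W_a, W_k, W_Q, W_K):
        """Combine aspect-aware attention with self-attention (single head)."""
        n, d = H.shape
        # Aspect-aware branch: mean-pool the aspect tokens, copy n times (H_a).
        H_a = H[aspect_mask].mean(dim=0, keepdim=True).expand(n, d)
        A_asp = F.softmax((H_a @ W_a) @ (H @ W_k).T / math.sqrt(d), dim=-1)
        # Self-attention branch: Q and K are both the encoder output H.
        A_self = F.softmax((H @ W_Q) @ (H @ W_K).T / math.sqrt(d), dim=-1)
        return (A_asp + A_self) / 2   # integrated attention score matrix A_i

    n, d = 5, 768
    H = torch.randn(n, d)
    mask = torch.tensor([False, False, True, True, False])  # aspect token positions
    params = [torch.randn(d, d) * 0.02 for _ in range(4)]
    A_i = semantic_attention(H, mask, *params)   # (n, n), rows sum to 1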
Further, in the step c, in order to effectively exchange relevant features between the grammar GCN module and the semantic GCN module, mutual BiAffine transformation is performed as follows:

H^syn' = softmax(H^syn W_1 (H^sem)^T) H^sem  #(11)

H^sem' = softmax(H^sem W_2 (H^syn)^T) H^syn  #(12)

where W_1 and W_2 are trainable parameters; H^syn' and H^sem' are the feature vectors obtained after the BiAffine transformation, H^sem is the feature vector obtained by the semantic GCN module, and H^syn is the feature vector obtained by the grammar GCN module.
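The mutual BiAffine transformation of formulas (11) and (12) can be sketched directly, assuming plain tensors for the trainable parameters W_1 and W_2:

    import torch
    import torch.nn.functional as F

    def biaffine_exchange(H_syn, H_sem, W1, W2):
        """Mutual BiAffine transformation: each branch attends over the
        other branch's features, as in formulas (11) and (12)."""
        H_syn_p = F.softmax(H_syn @ W1 @ H_sem.T, dim=-1) @ H_sem  # formula (11)
        H_sem_p = F.softmax(H_sem @ W2 @ H_syn.T, dim=-1) @ H_syn  # formula (12)
        return H_syn_p, H_sem_p

    n, d = 5, 300
    H_syn, H_sem = torch.randn(n, d), torch.randn(n, d)
    W1, W2 = torch.randn(d, d) * 0.02, torch.randn(d, d) * 0.02
    H_syn_p, H_sem_p = biaffine_exchange(H_syn, H_sem, W1, W2)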
Further, in the step d, an average pooling and concatenation operation is applied to the aspect nodes of the grammar GCN and semantic GCN modules to obtain the final feature representation, which specifically includes:

h_a^syn = f(h^syn_{a_1}, ..., h^syn_{a_m})  #(13)

h_a^sem = f(h^sem_{a_1}, ..., h^sem_{a_m})  #(14)

r = [h_a^syn ; h_a^sem]  #(15)

where h_a^syn and h_a^sem denote the grammar feature representation and the semantic feature representation obtained by the average pooling function, r is the matrix obtained by concatenating h_a^syn and h_a^sem, and f(·) is the average pooling function applied to the aspect node representations. The resulting representation r is then fed into a linear layer followed by a softmax function to obtain the emotion polarity probability distribution p, namely:

p(a) = softmax(W_p r + b_p)  #(16)

where W_p and b_p are a learnable weight and bias.
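A minimal sketch of formulas (13)-(16), assuming a boolean mask that marks the aspect token positions:

    import torch
    import torch.nn as nn

    def classify_aspect(H_syn, H_sem, aspect_mask, linear):
        """Average-pool the aspect nodes of each branch, concatenate,
        then apply a linear layer and softmax."""
        h_syn = H_syn[aspect_mask].mean(dim=0)   # f(.) over syntactic aspect nodes
        h_sem = H_sem[aspect_mask].mean(dim=0)   # f(.) over semantic aspect nodes
        r = torch.cat([h_syn, h_sem], dim=-1)    # final aspect representation
        return torch.softmax(linear(r), dim=-1)  # polarity distribution p(a)

    d = 300
    linear = nn.Linear(2 * d, 3)                 # 3 polarities: pos/neu/neg
    mask = torch.tensor([False, False, True, True, False])
    p = classify_aspect(torch.randn(5, d), torch.randn(5, d), mask, linear)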
The invention has the following advantages and beneficial effects:
The invention discloses a GCN-based aspect-level text emotion classification method that combines self-attention and an aspect-aware attention mechanism in the semantic GCN module to obtain the attention score matrix of a sentence, so that both aspect-related semantics and global semantics can be learned. In the grammar GCN module, graph convolution over the dependency tree structure is used to learn grammar information; grammar and semantic GCN features are shared for aspect-level emotion classification, which reduces errors introduced by dependency parsing and improves the sensitivity of sentences to grammar and semantic information.
The invention mainly fuses grammar and semantic features, taking into account the complementarity of the grammar structure and the correlation of the semantics; the semantic GCN module skillfully combines aspect attention with self-attention and can thus better learn both aspect-related semantics and global semantics.
Drawings
FIG. 1 is a diagram of the overall architecture of an aspect level text emotion classification method based on GCN in accordance with a preferred embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and specifically described below with reference to the drawings in the embodiments of the present invention. The described embodiments are only a few embodiments of the present invention.
The technical scheme adopted by the invention to solve the above technical problems is as follows:
An aspect-level text emotion classification method based on a graph convolutional neural network (GCN), comprising the following steps:
Step a: acquire the sentence-aspect pairs of the aspect-level emotion classification task, and use a BERT encoder (pre-trained language model) as the sentence encoder to extract hidden context representations and generate hidden state vectors;
Step b: feed the hidden state vectors of the sentence obtained in step a into a grammar GCN (graph convolutional neural network) module and a semantic GCN module respectively; the grammar GCN module builds an adjacency matrix from the dependency tree and then extracts sentence grammar features with the graph convolutional network, while the semantic GCN module extracts better semantic features by integrating an aspect-aware attention matrix and a self-attention matrix;
Step c: use a BiAffine module to realize effective information flow; the BiAffine module effectively exchanges relevant features between the SynGCN and SemGCN modules through mutual BiAffine transformation;
Step d: aggregate all aspect node representations from the grammar GCN and semantic GCN modules by pooling and concatenation to form the final aspect representation, realizing aspect-word-oriented emotion classification.
Further, the step a specifically includes:
Given a sentence-aspect pair (s, a), where s = {w_1, w_2, ..., w_n} is the sentence and a = {a_1, a_2, ..., a_m} is an aspect word and also a subsequence of sentence s, a belonging to a predefined set of aspects; w_n represents a word in the given sentence s, and a_m is a preset aspect word in a. The BERT encoder takes "[CLS] sentence [SEP] aspect [SEP]" as input text, where [CLS] is the special classification token in BERT and [SEP] is the separator token in BERT. The output of the BERT encoder is as shown in formula (1):

H = [h_0, h_1, h_2, ..., h_m, h_{m+1}]  #(1)

BERT is an architecture that can be used for many downstream tasks, such as question answering, classification, and NER. The pre-trained BERT can be regarded as a black box that provides a vector of H = 768 dimensions for each input token (word) in the sequence. The sequence may be a single sentence or a pair of sentences separated by the delimiter [SEP] and starting with the token [CLS]. h_m denotes the context representation of the m-th token obtained after BERT encoding.
Further, in said step b, the grammar GCN module converts the dependency tree into a graph structure G^syn = (A^syn, H), where A^syn is the adjacency matrix; grammar information is then extracted with a graph convolutional network, with the formulas as follows:

Ã^syn = D^{-1}(A^syn + I)  #(2)

H^syn(0) = H_c  #(3)

H^syn(l+1) = ReLU(Ã^syn H^syn(l) W^(l+1) + b^(l+1))  #(4)

H^syn = H^syn(L)  #(5)

where W^(l+1) is the weight of layer l+1, and ReLU is a piecewise linear function that changes all negative values to 0 while leaving positive values unchanged; H^syn(l+1) is the vector representation of layer l+1 of the grammar GCN, and H_c is the feature matrix output by the Bi-LSTM or BERT encoder, which serves as the input of the first GCN layer; W^(l) ∈ R^{d_lstm × d_gcn} is the learnable matrix of the layer-l GCN, d_lstm is the dimension of the hidden representation learned by Bi-LSTM, and d_gcn is the dimension of the GCN layer output. Through the l-step convolution operation, each node iteratively aggregates information from its one-hop neighbors and updates its representation, so that the grammar graph convolution module integrates the syntax information into the final representation H^syn.
In the semantic GCN module, the attention matrix integrates the aspect-aware attention matrix and the self-attention matrix to obtain better semantic features, wherein:

H_a = repeat(f(h_{a_1}, ..., h_{a_m}), n)  #(6)

A^asp_i = softmax((H_a W_a)(K W_k)^T / √d + b)  #(7)

where H_a is the aspect-aware feature matrix and b is the bias; K is equal to the H generated by the coding layer, and W_a ∈ R^{d×d}, W_k ∈ R^{d×d} are learnable weight matrices (R^{d×d} denotes a matrix of dimension d×d). H_a ∈ R^{n×d}, obtained by average-pooling the aspect representation and replicating it n times, serves as the aspect word representation. A p-head aspect-aware attention is used to obtain the attention score matrix of a sentence, with A^asp_i denoting the matrix obtained through the i-th attention head;

A^self = softmax((Q W^Q)(K W^K)^T / √d)  #(8)

A^self is constructed with self-attention, which captures the interaction between two arbitrary words in a single sentence, where Q and K are both equal to the H generated by the coding layer, and W^Q ∈ R^{d×d}, W^K ∈ R^{d×d} are learnable parameters. Then the aspect-aware attention matrix and the self-attention matrix are integrated:

A_i = (A^asp_i + A^self) / 2  #(9)

A_i ∈ R^{n×n} denotes the integrated attention score matrix and serves as the adjacency input of the semantic graph convolution:

H^sem(l+1) = ReLU(A_i H^sem(l) W^(l+1) + b^(l+1))  #(10)

yielding the semantic feature representation H^sem.
Further, in the step c, in order to effectively exchange relevant features between the grammar GCN module and the semantic GCN module, mutual BiAffine transformation is performed as follows:

H^syn' = softmax(H^syn W_1 (H^sem)^T) H^sem  #(11)

H^sem' = softmax(H^sem W_2 (H^syn)^T) H^syn  #(12)

where W_1 and W_2 are trainable parameters; H^syn' and H^sem' are the feature vectors obtained after the BiAffine transformation, H^sem is the feature vector obtained by the semantic GCN module, and H^syn is the feature vector obtained by the grammar GCN module.
Further, in the step d, an average pooling and concatenation operation is applied to the aspect nodes of the grammar GCN and semantic GCN modules to obtain the final feature representation, which specifically includes:

h_a^syn = f(h^syn_{a_1}, ..., h^syn_{a_m})  #(13)

h_a^sem = f(h^sem_{a_1}, ..., h^sem_{a_m})  #(14)

r = [h_a^syn ; h_a^sem]  #(15)

where h_a^syn and h_a^sem denote the grammar feature representation and the semantic feature representation obtained by the average pooling function, r is the matrix obtained by concatenating h_a^syn and h_a^sem, and f(·) is the average pooling function applied to the aspect node representations. The resulting representation r is then fed into a linear layer followed by a softmax function to obtain the emotion polarity probability distribution p, namely:

p(a) = softmax(W_p r + b_p)  #(16)

where W_p and b_p are a learnable weight and bias.
Finally, the standard cross-entropy loss is used as the loss function:

L(θ) = - Σ_{(s,a)∈D} Σ_{c∈C} y_c log p_c(a)  #(17)

where D contains all sentence-aspect pairs, a denotes the aspect appearing in sentence s, θ denotes all trainable parameters, C is the set of emotion polarities, and y_c is the one-hot ground-truth label of polarity c.
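In PyTorch this loss is typically computed with F.cross_entropy, which fuses the softmax of formula (16) with the negative log-likelihood; the batch below is random illustrative data:

    import torch
    import torch.nn.functional as F

    # Sketch of formula (17): cross-entropy over the polarity logits.
    # F.cross_entropy applies log-softmax internally, so it takes the
    # pre-softmax logits W_p r + b_p rather than the probabilities p(a).
    logits = torch.randn(16, 3, requires_grad=True)  # batch of 16 sentence-aspect pairs
    labels = torch.randint(0, 3, (16,))              # gold polarities in C = {pos, neu, neg}
    loss = F.cross_entropy(logits, labels)
    loss.backward()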
The datasets used by the invention are the restaurant and laptop reviews in SemEval-2014 Task 4 (Pontiki et al., 2014) and the Twitter posts of Dong et al. (2014). Each aspect is labeled with one of three emotion polarities: positive, neutral, and negative. The statistics of the three datasets are shown in Table 1 below:
table 1 aspect Emotion Classification common dataset
Figure BDA00040102886600000911
/>
Figure BDA0004010288660000101
The invention evaluates the results with precision, recall, and F1 score, calculated by formulas (19)-(21) below:

P = TP / (TP + FP)  #(19)

R = TP / (TP + FN)  #(20)

F1 = 2 × P × R / (P + R)  #(21)

where TP denotes the number of positive-class samples predicted as the positive class, FN denotes the number of positive-class samples predicted as the negative class, and FP denotes the number of negative-class samples predicted as the positive class.
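These three formulas can be sketched as a small helper (the counts below are illustrative):

    # Sketch of formulas (19)-(21) for one class treated as "positive".
    def precision_recall_f1(tp: int, fp: int, fn: int):
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f1

    print(precision_recall_f1(tp=80, fp=10, fn=20))  # approx. (0.889, 0.800, 0.842)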
The experimental environment of the invention is based on the PyTorch framework, and the model is trained on an NVIDIA Tesla P100 GPU. The English BERT-base pre-trained model is used as the text encoder, and the model is trained with the Adam optimizer. Word embeddings are initialized with the 300-dimensional GloVe vectors provided by Pennington et al. (2014). In addition, 30-dimensional part-of-speech (POS) embeddings and 30-dimensional position embeddings (the relative position of each word with respect to the aspect in the sentence) are used. The word embedding, POS embedding, and position embedding are then concatenated as the input word representation. All sentences are parsed with the Stanford parser. The batch size of all models is set to 16 and the number of GCN layers is 2. Furthermore, a dropout function is applied to the input word representation of the BiLSTM, with the dropout rate set to 0.3 and the learning rate set to 0.002 to optimize the parameters.
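For reference, the hyperparameters stated above can be collected into a configuration sketch (the key names are illustrative; Table 2 itself is an image in the original document):

    # Hyperparameters as stated in the text; key names are assumptions.
    config = {
        "encoder": "BERT-base (English)",
        "word_embedding": "GloVe, 300-dim",
        "pos_embedding_dim": 30,
        "position_embedding_dim": 30,
        "batch_size": 16,
        "num_gcn_layers": 2,
        "dropout": 0.3,
        "learning_rate": 0.002,
        "optimizer": "Adam",
    }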
Table 2. Hyperparameter settings (rendered as an image in the original document).
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The above examples should be understood as illustrative only and not limiting the scope of the invention. Various changes and modifications to the present invention may be made by one skilled in the art after reading the teachings herein, and such equivalent changes and modifications are intended to fall within the scope of the invention as defined in the appended claims.

Claims (6)

1. An aspect-level text emotion classification method based on GCN, characterized by comprising the following steps:
Step a: acquire the sentence-aspect pairs of the aspect-level emotion classification task, and use a BERT encoder (pre-trained language model) as the sentence encoder to extract hidden context representations and generate hidden state vectors;
Step b: feed the hidden state vectors of the sentence obtained in step a into a grammar GCN (graph convolutional neural network) module and a semantic GCN module respectively; the grammar GCN module builds an adjacency matrix from the dependency tree and then extracts sentence grammar features with the graph convolutional network, while the semantic GCN module extracts better semantic features by integrating an aspect-aware attention matrix and a self-attention matrix;
Step c: use a BiAffine module to realize effective information flow; the BiAffine module effectively exchanges relevant features between the SynGCN and SemGCN modules through mutual BiAffine transformation;
Step d: aggregate all aspect node representations from the grammar GCN and semantic GCN modules by pooling and concatenation to form the final aspect representation, realizing aspect-word-oriented emotion classification.
2. The GCN-based aspect-level text emotion classification method according to claim 1, wherein said step a specifically includes:
Given a sentence-aspect pair (s, a), where s = {w_1, w_2, ..., w_n} is the sentence and a = {a_1, a_2, ..., a_m} is an aspect word and also a subsequence of sentence s, a belonging to a predefined set of aspects; w_n represents a word in the given sentence s, and a_m is a preset aspect word in a. The BERT encoder takes "[CLS] sentence [SEP] aspect [SEP]" as input text, where [CLS] is the special classification token in BERT and [SEP] is the separator token in BERT; the output of the BERT encoder is as shown in formula (1):

H = [h_0, h_1, h_2, ..., h_m, h_{m+1}]  #(1)

BERT is an architecture that can be used for many downstream tasks, including question answering, classification, and NER; the pre-trained BERT can be regarded as a black box that provides a vector of H = 768 dimensions for each input token (word) in a sequence, which can be a single sentence or a pair of sentences separated by the delimiter [SEP] and starting with the token [CLS]; h_m denotes the context representation of the m-th token obtained after BERT encoding.
3. The GCN-based aspect-level text emotion classification method according to claim 1, wherein in said step b the grammar GCN module converts the dependency tree into a graph structure G^syn = (A^syn, H), where A^syn is the adjacency matrix, and then extracts grammar information with a graph convolutional network, with the formulas as follows:

Ã^syn = D^{-1}(A^syn + I)  #(2)

H^syn(0) = H_c  #(3)

H^syn(l+1) = ReLU(Ã^syn H^syn(l) W^(l+1) + b^(l+1))  #(4)

H^syn = H^syn(L)  #(5)

where W^(l+1) is the weight of layer l+1, and ReLU is a piecewise linear function that changes all negative values to 0 while leaving positive values unchanged; H^syn(l+1) is the vector representation of layer l+1 of the grammar GCN, and H_c is the feature matrix output by the Bi-LSTM or BERT encoder, which serves as the input of the first GCN layer; W^(l) ∈ R^{d_lstm × d_gcn} is the learnable matrix of the layer-l GCN, d_lstm is the dimension of the hidden representation learned by Bi-LSTM, and d_gcn is the dimension of the GCN layer output; through the l-step convolution operation, each node iteratively aggregates information from its one-hop neighbors and updates its representation, so that the grammar graph convolution module integrates the syntax information into the final representation H^syn.
4. The GCN-based aspect-level text emotion classification method according to claim 3, wherein in the semantic GCN module the attention matrix integrates the aspect-aware attention matrix and the self-attention matrix to obtain better semantic features, wherein:

H_a = repeat(f(h_{a_1}, ..., h_{a_m}), n)  #(6)

A^asp_i = softmax((H_a W_a)(K W_k)^T / √d + b)  #(7)

where H_a is the aspect-aware feature matrix and b is the bias; K is equal to the H generated by the coding layer, and W_a ∈ R^{d×d}, W_k ∈ R^{d×d} are learnable weight matrices (R^{d×d} denotes a matrix of dimension d×d); H_a ∈ R^{n×d}, obtained by average-pooling the aspect representation and replicating it n times, serves as the aspect word representation; a p-head aspect-aware attention is used to obtain the attention score matrix of a sentence, with A^asp_i denoting the matrix obtained through the i-th attention head;

A^self = softmax((Q W^Q)(K W^K)^T / √d)  #(8)

A^self is constructed with self-attention, which captures the interaction between two arbitrary words in a single sentence, where Q and K are both equal to the H generated by the coding layer, and W^Q ∈ R^{d×d}, W^K ∈ R^{d×d} are learnable parameters; then the aspect-aware attention matrix and the self-attention matrix are integrated:

A_i = (A^asp_i + A^self) / 2  #(9)

A_i ∈ R^{n×n} denotes the integrated attention score matrix and serves as the adjacency input of the semantic graph convolution:

H^sem(l+1) = ReLU(A_i H^sem(l) W^(l+1) + b^(l+1))  #(10)

yielding the semantic feature representation H^sem.
5. The GCN-based aspect-level text emotion classification method according to claim 4, wherein in said step c, in order to effectively exchange relevant features between the grammar GCN module and the semantic GCN module, mutual BiAffine transformation is performed as follows:

H^syn' = softmax(H^syn W_1 (H^sem)^T) H^sem  #(11)

H^sem' = softmax(H^sem W_2 (H^syn)^T) H^syn  #(12)

where W_1 and W_2 are trainable parameters; H^syn' and H^sem' are the feature vectors obtained after the BiAffine transformation, H^sem denotes the feature vector obtained by the semantic GCN module, and H^syn denotes the feature vector obtained by the grammar GCN module.
6. The GCN-based aspect-level text emotion classification method according to claim 5, wherein in said step d an average pooling and concatenation operation is applied to the aspect nodes of the grammar GCN and semantic GCN modules to obtain the final feature representation, which specifically includes:

h_a^syn = f(h^syn_{a_1}, ..., h^syn_{a_m})  #(13)

h_a^sem = f(h^sem_{a_1}, ..., h^sem_{a_m})  #(14)

r = [h_a^syn ; h_a^sem]  #(15)

where h_a^syn and h_a^sem denote the grammar feature representation and the semantic feature representation obtained by the average pooling function, r is the matrix obtained by concatenating h_a^syn and h_a^sem, and f(·) is the average pooling function applied to the aspect node representations; the resulting representation r is then fed into a linear layer followed by a softmax function to obtain the emotion polarity probability distribution p, namely:

p(a) = softmax(W_p r + b_p)  #(16)

where W_p and b_p are a learnable weight and bias.
CN202211650414.XA 2022-12-21 2022-12-21 Aspect-level text emotion classification method based on GCN Pending CN116049393A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211650414.XA CN116049393A (en) 2022-12-21 2022-12-21 Aspect-level text emotion classification method based on GCN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211650414.XA CN116049393A (en) 2022-12-21 2022-12-21 Aspect-level text emotion classification method based on GCN

Publications (1)

Publication Number Publication Date
CN116049393A true CN116049393A (en) 2023-05-02

Family

ID=86117321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211650414.XA Pending CN116049393A (en) 2022-12-21 2022-12-21 Aspect-level text emotion classification method based on GCN

Country Status (1)

Country Link
CN (1) CN116049393A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117473083A (en) * 2023-09-30 2024-01-30 齐齐哈尔大学 Aspect-level emotion classification model based on prompt knowledge and hybrid neural network


Similar Documents

Publication Publication Date Title
WO2021233112A1 (en) Multimodal machine learning-based translation method, device, equipment, and storage medium
CN113255755B (en) Multi-modal emotion classification method based on heterogeneous fusion network
CN111401077B (en) Language model processing method and device and computer equipment
Arshad et al. Aiding intra-text representations with visual context for multimodal named entity recognition
CN107066464A (en) Semantic Natural Language Vector Space
CN111159409B (en) Text classification method, device, equipment and medium based on artificial intelligence
CN114339450B (en) Video comment generation method, system, device and storage medium
CN111598183A (en) Multi-feature fusion image description method
CN111368082A (en) Emotion analysis method for domain adaptive word embedding based on hierarchical network
Huang et al. C-Rnn: a fine-grained language model for image captioning
CN114443899A (en) Video classification method, device, equipment and medium
Gandhi et al. Multimodal sentiment analysis: review, application domains and future directions
Luo et al. A thorough review of models, evaluation metrics, and datasets on image captioning
CN116049393A (en) Aspect-level text emotion classification method based on GCN
Le-Hong Diacritics generation and application in hate speech detection on Vietnamese social networks
CN117132923A (en) Video classification method, device, electronic equipment and storage medium
CN117033626A (en) Text auditing method, device, equipment and storage medium
CN116414988A (en) Graph convolution aspect emotion classification method and system based on dependency relation enhancement
Qi et al. Video captioning via a symmetric bidirectional decoder
CN115129807A (en) Fine-grained classification method and system for social media topic comments based on self-attention
Huang et al. Target-Oriented Sentiment Classification with Sequential Cross-Modal Semantic Graph
CN117521674B (en) Method, device, computer equipment and storage medium for generating countermeasure information
CN113704460B (en) Text classification method and device, electronic equipment and storage medium
CN116958997B (en) Graphic summary method and system based on heterogeneous graphic neural network
Yang et al. Weibo Sentiment Analysis Based on Advanced Capsule Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination