CN115269847A - Knowledge-enhanced syntactic heteromorphic graph-based aspect-level emotion classification method - Google Patents


Info

Publication number
CN115269847A
Authority
CN
China
Prior art keywords
sentence
word
bert
emotion
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210922723.1A
Other languages
Chinese (zh)
Inventor
吴丽娟
陆广泉
李杰成
张魁
张桂衔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Normal University
Original Assignee
Guangxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Normal University
Priority to CN202210922723.1A
Publication of CN115269847A
Legal status: Withdrawn

Classifications

    • G06F 16/35 Information retrieval of unstructured textual data; clustering; classification
    • G06F 16/367 Creation of semantic tools, e.g. ontology or thesauri; ontology
    • G06F 40/211 Natural language analysis; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an aspect-level emotion classification method based on a knowledge-enhanced syntactic heterogeneous graph, which comprises the following steps: 1) a data acquisition stage; 2) constructing an enhanced syntactic heterogeneous graph; 3) obtaining enhanced syntactic features under the local context with domain knowledge; 4) constructing global semantic graph features; 5) a feature adaptive fusion stage; 6) a feature vector output stage; 7) a model training stage. The method enhances the generalization ability of the model and improves emotion classification performance on aspect-level text.

Description

Knowledge-enhanced syntactic heteromorphic graph-based aspect-level emotion classification method
Technical Field
The invention relates to the technical field of text aspect-level emotion analysis in natural language processing, and in particular to an aspect-level emotion classification method based on a knowledge-enhanced syntactic heterogeneous graph.
Background
Emotion analysis is a fundamental and meaningful task in Natural Language Processing (NLP). It aims to mine emotional information from user comment text, providing valuable information to enterprises or consumers and supporting reasonable decisions. Aspect-Based Sentiment Analysis (ABSA) is a fine-grained sentiment analysis task, generally divided into Aspect Extraction (AE) and Aspect-Level Sentiment Classification (ALSC). We focus only on the ALSC task, which aims to identify the emotional polarity (i.e., positive, negative, or neutral) of aspects that appear explicitly in a sentence. For example, in the sentence "The atmosphere is good, but the food is bad!", the two aspect terms "atmosphere" and "food" have positive and negative emotional polarities, respectively.
In earlier work on the aspect-based sentiment classification task, various attention mechanisms were proposed to model the interaction between context words and aspects in order to predict the sentiment polarity of a specific aspect. Among them, Wang et al. proposed an attention-based long short-term memory network (ATAE-LSTM) that feeds the concatenation of the aspect and the sentence into an LSTM, generates an aspect-aware sentence representation, and assigns weights to the context via an attention mechanism. Although attention-based models have achieved good performance, due to the complexity of text, attention alone may not accurately capture the important relationships between aspects and context words, and it neglects the long-range dependency of aspect words on emotion-related words, which limits model performance. With the development of pre-trained models (PTMs), models such as BERT (2019) are widely used in emotion analysis tasks. The BERT-ADA model fine-tunes BERT on a large in-domain corpus, improving performance on the ALSC task.
In recent years, graph Neural Networks (GNNs) in dependency trees have attracted considerable attention from ALSC research. Zhang et al propose to construct GCN model based on dependency tree, use syntax dependence to model aspect and context, have shortened the dependence distance between aspect and the opinion word, have alleviated the problem of long-term dependence. However, the graph neural network-based models provide a good improvement over earlier neural network models, but they also have drawbacks. First, we observe that the use of syntactic dependencies alone does not fully exploit semantic information, and may yield opposite results for syntactically ambiguous sentence modeling. Secondly, most of the dependency graphs only consider the syntactic dependency of sentences, and the edges of the dependency graphs are binary (i.e. 0 or 1) in weight, and cannot assign weights to words with different emotions. Finally, when capturing syntactic dependencies, the local contextual information of the facet words is ignored as important for determining the emotional polarity of the specific facets to be predicted accurately.
Disclosure of Invention
The invention aims to provide an aspect-level emotion classification method based on a knowledge-enhanced syntactic heterogeneous graph, addressing the above defects in the prior art. The method enhances the generalization ability of the model and improves emotion classification performance on aspect-level text.
The technical scheme for realizing the purpose of the invention is as follows:
An aspect-level emotion classification method based on a knowledge-enhanced syntactic heterogeneous graph comprises the following steps:
1) A data acquisition stage: obtaining a comment text data set; acquiring external emotion knowledge, processing it, and generating a key-value pair file of words and scores;
2) An enhanced syntactic heterogeneous graph construction stage: for a given sentence, the sentence is parsed by loading "en_core_web_sm" through the spaCy tool, the part-of-speech information of each word is obtained through token.pos_ and stored in a pos list, the sentence length is computed as n = len(pos), and noun, adverb and adjective information is spliced into the matrix when constructing the heterogeneous graph. Specifically, an A matrix initialized with 1 is constructed, of size (n+3)×(n+3) and type float32, where the three extra rows/columns hold the part-of-speech indicators. Then each word of the sentence is checked against the emotion dictionary: if the word appears, its emotion Score is taken out and converted to float type; otherwise its emotion value is assigned 0. Each word in the sentence is regarded as a node, and the dependency relationship between words in the dependency tree is represented as an edge. To enhance the emotional information expression of the sentence, the Scores of emotion words in the SenticNet5 sentiment knowledge are used to enrich the adjacency matrix representation: if a dependency edge exists between two words, the value of the edge is 1 + Score, and the initialized A matrix is updated. In the composition, the relationship between a parent node and a child node with a dependency relationship is considered mutual, so the derived enhanced dependency graph is an undirected graph, A_{i,j} = A_{j,i}. In the enhanced dependency graph, the aspect words in comment sentences are mostly nouns, and aspects are important in the emotion classification task, so words of this part of speech receive more attention; the description of an aspect in a sentence is usually an adjective, so adjectives are also important; positive or negative adverbs also appear in comment sentences, and when a negation word such as "not" or "no" appears, the emotional polarity of the aspect is reversed, which is why adverbs in comments receive attention. Specifically, the tags "NOUN", "ADJ" and "ADP" are stored in a list m, and the sentence is traversed: if pos[i] = "NOUN", A_{i,-3} is set to 1; if pos[i] = "ADJ", A_{i,-2} is set to 1; if pos[i] = "ADP", A_{i,-1} is set to 1. Finally, the enhanced syntactic heterogeneous graph matrix A^h of the sentence is derived;
3) A stage of obtaining enhanced syntactic features under the local context with domain knowledge: Tokenizer4Bert converts input of the form [CLS] + text + [SEP] into a vector, which is padded to the same length by pad_and_truncate, giving E = {w_1, ..., w_i, ..., w_a1, ..., w_ai, ..., w_k}, where k is the set maximum length, w_i denotes the (i+1)-th word, and w_ai is the i-th aspect term; the sentence heterogeneous graph is likewise padded with np to size k×k, giving A^h ∈ R^{k×k}. E is input into the domain pre-trained BERT-ADA to obtain the sentence vector representation H^{bert}.
BERT-ADA is a BERT model obtained by fine-tuning on the Amazon laptop review dataset and the Yelp Dataset Challenge review corpus. The relative distance between each token and the aspect term is obtained from the position of each token and the position of the aspect term: first a weighting matrix V of all 1s is initialized, the aspect-term length asp_len and its starting position asp_begin are obtained, and the average center position of the aspect term is computed as Avg_a = (asp_begin + asp_len)/2; the relative distance between each context word and the aspect term in the sentence is then
P_i = |i − Avg_a| − asp_len/2,
The relative distance is used to further weight the BERT-encoded sentence vector: if P_i is less than the set threshold 3, the word's own semantic information is kept; if it is greater than the threshold 3, a weighting value V_i = 1 − (P_i − 3)/k is constructed for the semantically less relevant context word
to down-weight its features. The weighting matrix of the input sequence V = [V_0, V_1, ..., V_k] is updated according to the semantic relative distance of the words, and the preliminary BERT_ADA representation H^{bert} is passed through a mul() operation with the weight matrix V, i.e. H_l = H^{bert} · V, where H_l is the output of the local dynamic weight layer. A graph convolutional network (GCN) is then used, taking the feature representation H_l of the local context with domain knowledge and the enhanced syntactic heterogeneous graph matrix A_h as input; the enhanced syntactic dependency information under the local context with domain knowledge is obtained via the activation function ReLU:
H_s_loc = ReLU(GCN(A_h, H_l, W)),
where the GCN formula is H^(l) = σ(A_h H^(l−1) W^(l−1) + b^(l−1)); W^(l−1) and b^(l−1) are the linear transformation weight and bias parameters of layer l−1 of the model, σ is a nonlinear function usually set to ReLU, and the initial input H^(0) is the sentence representation H_l;
4) A global semantic graph feature construction stage: the comment text and aspect term are converted with "[CLS] + text + [SEP] + aspect + [SEP]" using Tokenizer4Bert to obtain the vector representation text_bert_indices; an index representation is then generated to distinguish the comment text from the aspect term, with the positions of the first half [CLS] + text + [SEP] represented by 0 and the positions of aspect + [SEP] represented by 1, giving the bert_segments_indices vector. text_bert_indices and bert_segments_indices are input into the domain pre-trained BERT-ADA to obtain the vector representation H_g of the global sentence. H_g is then input into multi-head attention, where each attention head obtains a feature M^se_i = softmax((H_g W_i^Q)(H_g W_i^K)^T / √d_k).
The attention matrices of the h heads are combined and divided by h to obtain the semantic matrix M^se ∈ R^{k×k},
which fully captures the semantic information of each word in the global sentence. To prevent overfitting, M^se = Dropout(M^se) is obtained through a Dropout layer. When constructing the semantic graph, the values on the diagonal of M^se are set to 0 using torch.diag and the diagonal elements are then set to 1 using torch.eye, since the semantic relevance of each word to itself is one hundred percent. This yields the sentence global semantic graph with neighborhood knowledge, i.e. the input of the semantic GCN, H_glo = ReLU(GCN(M^se, H_g, W)), and the global semantic information feature H_glo is extracted and updated by the graph convolutional network;
5) A feature adaptive fusion stage: the enhanced syntactic dependency information H_s_loc under the local context with domain knowledge is spliced with the global semantic information H_glo, i.e. X = torch.cat((H_s_loc, H_glo)), to obtain the enhanced syntactic information and global semantic information of the sentence under the local context considering domain knowledge; this is passed through a residual multi-layer perceptron and then input into a self-attention layer for adaptive fusion, obtaining a feature representation suitable for the task;
6) A feature vector output stage: the fused feature vector is subjected to a BERT pooling operation and the final vector representation is output; the positive, negative and neutral emotion polarity probabilities are obtained through a softmax classifier;
7) A model training stage: the network is optimized by the Adam algorithm with the cross-entropy loss function as the loss function, i.e., the goal of training the classifier is to minimize the cross-entropy loss between the predicted emotion distribution and the true emotion distribution:
L = −Σ_{i=1}^{S} Σ_{j=1}^{C} ŷ_i^j log(y_i^j) + λ‖Θ‖²,
where S is the number of training samples, C is the number of polarity classes, ŷ is the true emotion distribution of the sample, y is the predicted emotion distribution, λ is the weight of the L2 regularization term, and Θ represents all trainable parameters.
The technical scheme has the advantages or beneficial effects that:
(1) The technical scheme integrates external emotion knowledge into the syntactic dependency graph and attends to noun, adverb and adjective information in the sentence, obtaining an enhanced syntactic heterogeneous graph representation of the sentence. This provides the model with finer-grained emotion supervision signals and promotes the model's extraction of emotion-related dependency relationships between the aspect and the context.
(2) The scheme takes the local semantic information with domain knowledge and the enhanced syntactic heterogeneous graph as input to the GCN, obtaining enhanced syntactic dependency features under the local context of domain knowledge and strengthening the local features of the aspect words. A sentence global semantic graph with neighborhood knowledge is built using multi-head self-attention, and global semantic features are obtained with the semantic GCN, which can to a certain extent avoid parsing errors made by the syntactic parser.
(3) A residual multi-layer perceptron and a self-attention mechanism are used for adaptive fusion of the complementary knowledge-enhanced syntactic information under the local context and the global semantic information, obtaining richer semantic expressions and further improving the effect of the aspect-level emotion classification model.
The method enhances the generalization ability of the model and improves emotion classification performance on aspect-level text.
Drawings
FIG. 1 is a model structure diagram of an embodiment;
FIG. 2 is a structure diagram of the residual multi-layer perceptron of the embodiment.
Detailed Description
The invention will be described in further detail with reference to the following drawings and specific examples, but the invention is not limited thereto.
Example:
Referring to FIG. 1 and FIG. 2, the aspect-level emotion classification method based on the knowledge-enhanced syntactic heterogeneous graph includes the following steps:
1) A data acquisition stage: a comment text data set is acquired; external emotion knowledge is obtained and key-value pair files of words and scores are generated. The publicly available emotion analysis datasets REST14, LAP14, REST15 and REST16 are used, where the laptop (LAP14) and restaurant (REST14) datasets are from SemEval-2014 Task 4 Subtask 2, the restaurant (REST15) dataset is from SemEval-2015 Task 12, and the restaurant (REST16) dataset is from SemEval-2016 Task 5. All four datasets involve three emotion categories, namely positive, neutral and negative, and each sample includes a comment sentence, its aspects and their corresponding emotion polarities; the example counts of the different splits of the four datasets are shown in Table 1. SenticNet5 is used as the external emotion knowledge, where the emotion value of a concept in the SenticNet5 dictionary ranges between −1 and +1: −1 represents extreme negativity, +1 represents extreme positivity, and a concept with no emotion value is assigned 0, i.e., neutral. For example, the words good and happier score 0.894 and 0.880 in the emotion dictionary, close to +1, indicating that the emotion of these words is positive; the words bad and terribly score −0.800 and −0.82 respectively, close to −1, indicating that the emotion of these words is negative;
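As a concrete illustration of step 1), the following minimal sketch generates the word-to-score key-value file; the in-memory lexicon, file name and JSON format are illustrative assumptions, since the embodiment only specifies that a key-value pair file of words and scores is produced from SenticNet5.

import json

# Hypothetical stand-in for the loaded SenticNet5 lexicon; the polarity values
# quoted here are the ones given in the text (range -1 to +1).
senticnet5 = {
    "good": 0.894,
    "happier": 0.880,
    "bad": -0.800,
    "terribly": -0.82,
}

def build_score_file(lexicon, path):
    # Dump a {word: polarity score} key-value file, as described in step 1).
    with open(path, "w", encoding="utf-8") as f:
        json.dump(lexicon, f, ensure_ascii=False, indent=2)

build_score_file(senticnet5, "senticnet5_scores.json")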
TABLE 1 statistics of data sets
[Table 1 is shown as an image in the original: example counts of the different splits of the four data sets]
2) An enhanced syntactic heterogeneous graph construction stage: as shown in FIG. 2, for a given sentence, the sentence is parsed by loading "en_core_web_sm" through the spaCy tool, the part-of-speech information of each word in the sentence is obtained through token.pos_ and stored in a pos list, the sentence length is computed as n = len(pos), and an A matrix initialized with 1 is constructed, of size (n+3)×(n+3) and type float32. Then each word of the sentence is checked against the emotion dictionary: if it appears, its emotion Score is taken out and converted to float32 type; otherwise its emotion value is assigned 0. Each word in the sentence is regarded as a node, and the dependency relationship between words in the dependency tree is represented as an edge. To enhance the emotional information expression of the sentence, the Scores of emotion words in the SenticNet5 sentiment knowledge are used to enrich the adjacency matrix representation: if a dependency edge exists between two words, the value of the edge is 1 + Score, and the initialized A matrix is updated. In the composition, the relationship between a parent node and a child node with a dependency relationship is considered mutual, so the derived enhanced dependency graph is an undirected graph, A_{i,j} = A_{j,i}. After obtaining the enhanced dependency graph, it is observed that the part of speech of aspect words in comment sentences is mostly NOUN, and aspect words are important in the emotion classification task, so words of this part of speech receive more attention; the word describing the emotion of an aspect in a sentence is usually an adjective, so adjectives are also important; positive or negative adverbs appear in comment sentences, and when a negation word such as "not" or "no" appears, the emotional polarity of the aspect is reversed. Specifically, the tags "NOUN", "ADJ" and "ADP" are stored in a list m, and the sentence is traversed: if pos[i] = "NOUN", A_{i,-3} is set to 1; if pos[i] = "ADJ", A_{i,-2} is set to 1; if pos[i] = "ADP", A_{i,-1} is set to 1. Finally, the enhanced syntactic heterogeneous graph matrix A^h of the sentence is derived, as sketched below;
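The following minimal sketch shows how the enhanced syntactic heterogeneous graph of step 2) could be assembled with spaCy and numpy; the {word: score} dictionary senti is assumed to come from the step-1) key-value file, and helper names are illustrative.

import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")

def build_hetero_graph(sentence, senti):
    # Parse the sentence and collect a part-of-speech tag per word.
    doc = nlp(sentence)
    pos = [tok.pos_ for tok in doc]
    n = len(pos)
    # The text reads "initialized with 1"; note that with an all-ones matrix the
    # part-of-speech assignments below are no-ops, so a zero init may be intended.
    A = np.ones((n + 3, n + 3), dtype="float32")
    # Dependency edges enriched with SenticNet scores: edge value = 1 + Score,
    # mirrored so the graph is undirected (A[i, j] = A[j, i]).
    for tok in doc:
        i, j = tok.i, tok.head.i
        score = float(senti.get(tok.text.lower(), 0.0))
        A[i, j] = A[j, i] = 1.0 + score
    # Splice in the noun / adjective / adverb indicator columns.
    for i, p in enumerate(pos):
        if p == "NOUN":
            A[i, -3] = 1.0
        elif p == "ADJ":
            A[i, -2] = 1.0
        elif p == "ADP":  # tag named in the text; spaCy's adverb tag is actually "ADV"
            A[i, -1] = 1.0
    return A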
3) A stage of obtaining enhanced syntactic features under the local context with domain knowledge: Tokenizer4Bert converts input of the form [CLS] + text + [SEP] into a vector, which is padded to the same length by pad_and_truncate, giving E = {w_1, ..., w_i, ..., w_a1, ..., w_ai, ..., w_k}, where k is the set maximum length, w_i denotes the (i+1)-th word, and w_ai is the i-th aspect term; the sentence heterogeneous graph is likewise padded with np to size k×k, giving A^h ∈ R^{k×k}. E is input into the domain pre-trained BERT-ADA to obtain the sentence vector representation H^{bert},
where BERT-ADA is a BERT model obtained by fine-tuning on the Amazon laptop review dataset and the Yelp Dataset Challenge review corpus. The relative distance between each token and the aspect term is obtained from the position of each token and the position of the aspect term: first a weighting matrix V of all 1s is initialized, the aspect-term length asp_len and its starting position asp_begin are obtained, and the average center position of the aspect term is computed as Avg_a = (asp_begin + asp_len)/2; the relative distance between each context word and the aspect term in the sentence is then
P_i = |i − Avg_a| − asp_len/2,
The BERT-encoded sentence vector is further weighted by the relative distance: if P_i is less than the set threshold 3, the word's own semantic information is kept; if it is greater than the threshold 3, a weighting value V_i = 1 − (P_i − 3)/k is constructed for the semantically less relevant context word
to down-weight its features. The weighting matrix of the input sequence V = [V_0, V_1, ..., V_k] is updated according to the semantic relative distance of the words, and the preliminary BERT_ADA representation H^{bert} is passed through a mul() operation with the weight matrix V, i.e. H_l = H^{bert} · V, where H_l is the output of the local dynamic weight layer. A graph convolutional network (GCN) is used, taking the feature representation H_l of the local context with domain knowledge and the enhanced syntactic heterogeneous graph matrix A_h as input; the enhanced syntactic dependency information under the local context with domain knowledge is then obtained via the activation function ReLU:
H_s_loc = ReLU(GCN(A_h, H_l, W)),
where the GCN formula is H^(l) = σ(A_h H^(l−1) W^(l−1) + b^(l−1)); W^(l−1) and b^(l−1) are the linear transformation weight and bias parameters of layer l−1 of the model, σ is a nonlinear function usually set to ReLU, and the initial input H^(0) is the sentence representation H_l;
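A minimal sketch of the local dynamic weight layer and one GCN layer of step 3); the relative-distance and decay formulas follow the reconstructions above (the exact forms are given as images in the original), and H_bert stands in for the (k, d) BERT-ADA output.

import torch
import torch.nn as nn
import torch.nn.functional as F

def local_weight(H_bert, asp_begin, asp_len, threshold=3.0):
    k, _ = H_bert.shape
    avg_a = (asp_begin + asp_len) / 2.0          # average center position Avg_a
    idx = torch.arange(k, dtype=torch.float32)
    P = (idx - avg_a).abs() - asp_len / 2.0      # assumed relative-distance form P_i
    V = torch.where(P < threshold,
                    torch.ones(k),
                    1.0 - (P - threshold) / k)   # assumed decay for distant words
    return H_bert * V.unsqueeze(-1)              # H_l = H_bert · V

class GCNLayer(nn.Module):
    # One layer of H^(l) = ReLU(A_h H^(l-1) W + b).
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, A, H):
        return F.relu(self.lin(A @ H))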
4) A global semantic graph feature construction stage: the comment text and aspect term are converted with "[CLS] + text + [SEP] + aspect + [SEP]" using Tokenizer4Bert to obtain the vector representation text_bert_indices; a position index representation is then generated to distinguish the text from the aspect term, with the positions of the first half [CLS] + text + [SEP] represented by 0 and the positions of aspect + [SEP] represented by 1, giving the bert_segments_indices vector. text_bert_indices and bert_segments_indices are input into the domain pre-trained BERT-ADA to obtain the vector representation H_g of the global sentence. H_g is then input into multi-head attention, where each attention head obtains a feature M^se_i = softmax((H_g W_i^Q)(H_g W_i^K)^T / √d_k).
The attention matrices of the h heads are combined and divided by h to obtain the semantic matrix M^se ∈ R^{k×k},
which fully captures the semantic information of each word in the global sentence. To prevent overfitting, M^se = Dropout(M^se) is obtained through a Dropout layer. When constructing the semantic graph, the values on the diagonal of M^se are set to 0 using torch.diag and the diagonal elements are then set to 1 using torch.eye, since the semantic relevance of each word to itself is one hundred percent. This yields the sentence global semantic graph with neighborhood knowledge, i.e. the input of the semantic GCN, H_glo = ReLU(GCN(M^se, H_g, W)), and the global semantic information feature H_glo is extracted and updated by the graph convolutional network, as sketched below;
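A minimal sketch of the semantic-graph construction of step 4), assuming standard scaled dot-product attention per head (the per-head formula is an image in the original); H_g stands in for the (k, d) global sentence representation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def semantic_graph(H_g, n_heads=16, p_drop=0.1):
    k, d = H_g.shape
    d_k = d // n_heads
    Wq, Wk = nn.Linear(d, d, bias=False), nn.Linear(d, d, bias=False)
    Q = Wq(H_g).view(k, n_heads, d_k).transpose(0, 1)        # (heads, k, d_k)
    K = Wk(H_g).view(k, n_heads, d_k).transpose(0, 1)
    att = torch.softmax(Q @ K.transpose(-2, -1) / d_k ** 0.5, dim=-1)
    M_se = att.mean(dim=0)                                    # combine the h heads
    M_se = F.dropout(M_se, p=p_drop)                          # M_se = Dropout(M_se)
    M_se = M_se - torch.diag(torch.diag(M_se))                # zero the diagonal
    M_se = M_se + torch.eye(k)                                # self-relevance set to 1
    return M_se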
5) A feature adaptive fusion stage: the enhanced syntactic dependency information H_s_loc under the local context with domain knowledge is spliced with the global semantic information H_glo, i.e. X = torch.cat((H_s_loc, H_glo)), to obtain the enhanced syntactic information and global semantic information of the sentence under the local context considering domain knowledge; this is passed through a residual multi-layer perceptron and then input into a self-attention layer for adaptive fusion, obtaining a feature representation suitable for the task, as sketched below;
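A minimal sketch of the adaptive fusion of step 5); the residual MLP width and the feature-axis concatenation are illustrative assumptions (FIG. 2 gives the actual residual multi-layer perceptron structure).

import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * d, 2 * d), nn.ReLU(),
                                 nn.Linear(2 * d, 2 * d))
        self.attn = nn.MultiheadAttention(2 * d, num_heads=1, batch_first=True)

    def forward(self, H_s_loc, H_glo):
        # H_s_loc, H_glo: (batch, k, d) feature streams
        X = torch.cat((H_s_loc, H_glo), dim=-1)  # X = torch.cat((H_s_loc, H_glo))
        X = X + self.mlp(X)                      # residual multi-layer perceptron
        out, _ = self.attn(X, X, X)              # self-attention adaptive fusion
        return out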
6) A feature vector output stage: the fused feature vector is subjected to a BERT pooling operation and the final vector representation is output; the positive, negative and neutral emotion polarity probabilities are obtained through a softmax classifier;
7) A model training stage: the network is optimized by the Adam algorithm with the cross-entropy loss function as the loss function, i.e., the goal of training the classifier is to minimize the cross-entropy loss between the predicted emotion distribution and the true emotion distribution:
L = −Σ_{i=1}^{S} Σ_{j=1}^{C} ŷ_i^j log(y_i^j) + λ‖Θ‖²,
where S is the number of training samples, C is the number of polarity classes, ŷ is the true emotion distribution of the sample, y is the predicted emotion distribution, λ is the weight of the L2 regularization term, and Θ represents all trainable parameters.
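A minimal sketch of the training stage of step 7); model and loader stand in for the full network and data pipeline, and the weight_decay value, which implements the λ‖Θ‖² term, is an assumed setting.

import torch
import torch.nn as nn

def train_epoch(model, loader, lr=1e-5, weight_decay=1e-5):
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    ce = nn.CrossEntropyLoss()
    for inputs, labels in loader:       # labels: positive / negative / neutral ids
        opt.zero_grad()
        logits = model(inputs)          # (batch, C) with C = 3 polarity classes
        loss = ce(logits, labels)       # cross-entropy between prediction and truth
        loss.backward()
        opt.step()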
To better illustrate the advantages of the method of this example, the following experimental verification was performed:
in the experimental setting, when a semantic graph is constructed, the number of multi-head attention heads is set to be 16, the number of GCN layers is set to be 2, the number of Dropout layers is set to be 0.1, the learning rate of an optimizer Adma is 10-5, and the number of random seeds is set to be 568. In the experiment, classification accuracy (acc.) and a harmonic mean (F1 value) of accuracy and recall are used as performance evaluation indexes, wherein the higher the two evaluation indexes are, the better the model classification capability is. In order to verify the effectiveness of the model provided by the embodiment, some mainstream and newly developed models are selected in aspect level emotion classification and are compared, the relevant baseline of the models can be divided into two types, namely a semantic-based model and a grammar-based model in principle, table 2 shows the result of comparing the embodiment with a comparison model, and the following is a brief introduction to the relevant model:
based on the semantic model:
ATAE-LSTM: the sentence vector and the specific aspect vector are used as splicing input, and the attention-based LSTM is used for exploring the relation between the aspect and the sentence.
IAN: two LSTM were designed to model aspects and context separately, using an interactive attention mechanism to learn the aspects and the feature representation of the sentence.
BERT: the prediction method is a general BERT model, takes "[ CLS ] + sentence + [ SEP ]" as input, and uses the expression of [ CLS ] to perform prediction.
BERT-ADA: the method is a field adaptive pre-training model for challenging a comment corpus based on an amazon notebook computer comment data set and a Yelp data set.
Grammar-based models:
ASGCN: constructs a graph convolutional network using syntactic dependencies over the sentence dependency tree.
DGEDT-BERT: a dual-transformer-based network that achieves information fusion by jointly learning a flat representation learned by a Transformer and a dependency-graph representation.
KumaGCN: samples sentences using the HardKuma distribution to generate a specific latent graph structure, and combines the latent graph with the dependency tree via a gating mechanism.
BiGCN: adopts a two-layer interactive graph convolutional network to make full use of a global lexical graph and a concept hierarchy graph.
As can be seen from Table 2, first, the experimental results prove that the model LSGCN proposed in this example outperforms the semantic neural network-based and grammar-based comparison models on the ALSC task. This verifies the validity of the knowledge-enhanced syntactic heterogeneous graph model concept proposed in this example. Second, it can be observed that the BERT-based and syntax-based models (ASGCN, DGEDT-BERT, KumaGCN, BiGCN) achieve better results on each dataset than the models using attention alone (ATAE-LSTM, IAN), and BERT-ADA is greatly improved over the common BERT model, which indicates that in-domain data is essential for emotion analysis tasks. However, these models ignore the importance of emotion knowledge and word parts of speech for the ALSC task, lack finer-grained emotion signals when constructing the dependency graph, and do not combine semantic and syntactic information well, resulting in classification performance lower than that of the present invention.
TABLE 2 comparison of experimental results of this and related models
[Table 2 is shown as an image in the original: accuracy and F1 of this example and the comparison models on the four data sets]

Claims (1)

1. An aspect-level emotion classification method based on a knowledge-enhanced syntactic heterogeneous graph, characterized by comprising the following steps:
1) A data acquisition stage: acquiring a comment text data set; acquiring external emotion knowledge, processing it, and generating a key-value pair file of words and scores;
2) An enhanced syntactic heterogeneous graph construction stage: for a given sentence, the sentence is parsed by loading "en_core_web_sm" through the spaCy tool, the part-of-speech information of each word in the sentence is obtained through token.pos_ and stored in a pos list, the sentence length is computed as n = len(pos), and noun, adverb and adjective information is spliced into the matrix when constructing the heterogeneous graph; specifically, an A matrix initialized with 1 is constructed, of size (n+3)×(n+3) and type float32; then each word of the sentence is checked against the emotion dictionary: if it appears, its emotion Score is taken out and converted to float type, otherwise its emotion value is assigned 0; each word in the sentence is regarded as a node, and the dependency relationship between words in the dependency tree is represented as an edge; to enhance the emotional information expression of the sentence, the Scores of emotion words in the SenticNet5 sentiment knowledge are used to enrich the adjacency matrix representation: if a dependency edge exists between two words, the value of the edge is 1 + Score, and the initialized A matrix is updated; in the composition, the relationship between a parent node and a child node with a dependency relationship is considered mutual, so the derived enhanced dependency graph is an undirected graph, A_{i,j} = A_{j,i}; in the enhanced dependency graph it is observed that the part of speech of aspect words in comment sentences is mostly NOUN, and aspect words are important in the emotion classification task, so words of this part of speech receive more attention; the description of an aspect in a sentence is usually an adjective, so adjectives are also important; positive or negative adverbs appear in comment sentences, and when a negation word such as "not" or "no" appears, the emotional polarity of the aspect is reversed; specifically, the tags "NOUN", "ADJ" and "ADP" are stored in a list m, and the sentence is traversed: if pos[i] = "NOUN", A_{i,-3} is set to 1; if pos[i] = "ADJ", A_{i,-2} is set to 1; if pos[i] = "ADP", A_{i,-1} is set to 1; finally, the enhanced syntactic heterogeneous graph matrix A^h of the sentence is derived;
3) A stage of obtaining enhanced syntactic features under the local context with domain knowledge: Tokenizer4Bert converts input of the form [CLS] + text + [SEP] into a vector, which is padded to the same length by pad_and_truncate, giving E = {w_1, ..., w_i, ..., w_a1, ..., w_ai, ..., w_k}, where k is the set maximum length, w_i denotes the (i+1)-th word, and w_ai is the i-th aspect term; the sentence heterogeneous graph is likewise padded with np to size k×k, giving A^h ∈ R^{k×k}; E is input into the domain pre-trained BERT-ADA to obtain the sentence vector representation H^{bert},
where BERT-ADA is a BERT model obtained by fine-tuning on the Amazon laptop review dataset and the Yelp Dataset Challenge review corpus; the relative distance between each token and the aspect term is obtained from the position of each token and the position of the aspect term: first a weighting matrix V of all 1s is initialized, the aspect-term length asp_len and its starting position asp_begin are obtained, and the average center position of the aspect term is computed as Avg_a = (asp_begin + asp_len)/2; the relative distance between each context word and the aspect term in the sentence is then
P_i = |i − Avg_a| − asp_len/2,
the BERT-encoded sentence vector is further weighted using the relative distance: if P_i is less than the set threshold 3, the word's own semantic information is kept; if it is greater than the threshold 3, a weighting value V_i = 1 − (P_i − 3)/k is constructed for the semantically less relevant context word
to down-weight its features; the weighting matrix of the input sequence V = [V_0, V_1, ..., V_k] is updated according to the semantic relative distance of the words, and the preliminary BERT_ADA representation H^{bert} is passed through a mul() operation with the weight matrix V, i.e. H_l = H^{bert} · V, where H_l is the output of the local dynamic weight layer; a graph convolutional network (GCN) is used, taking the feature representation H_l of the local context with domain knowledge and the enhanced syntactic heterogeneous graph matrix A_h as input, and the enhanced syntactic dependency information under the local context with domain knowledge is then obtained via the activation function ReLU:
H_s_loc = ReLU(GCN(A_h, H_l, W)),
where the GCN formula is H^(l) = σ(A_h H^(l−1) W^(l−1) + b^(l−1)); W^(l−1) and b^(l−1) are the linear transformation weight and bias parameters of layer l−1 of the model, σ is a nonlinear function usually set to ReLU, and the initial input H^(0) is the sentence representation H_l;
4) A global semantic graph feature construction stage: the comment text and aspect term are converted with "[CLS] + text + [SEP] + aspect + [SEP]" using Tokenizer4Bert to obtain the vector representation text_bert_indices; an index representation is generated to distinguish the comment text from the aspect term, with the positions of the first half [CLS] + text + [SEP] represented by 0 and the positions of aspect + [SEP] represented by 1, giving the bert_segments_indices vector; text_bert_indices and bert_segments_indices are input into the domain pre-trained BERT-ADA to obtain the vector representation H_g of the global sentence; H_g is then input into multi-head attention, where each attention head obtains a feature M^se_i = softmax((H_g W_i^Q)(H_g W_i^K)^T / √d_k);
the attention matrices of the h heads are combined and divided by h to obtain the semantic matrix M^se ∈ R^{k×k},
which fully captures the semantic information of each word in the global sentence; to prevent overfitting, M^se = Dropout(M^se) is obtained through a Dropout layer; when constructing the semantic graph, the values on the diagonal of M^se are set to 0 using torch.diag and the diagonal elements are then set to 1 using torch.eye, since the semantic relevance of each word to itself is one hundred percent; this yields the sentence global semantic graph with neighborhood knowledge, i.e. the input of the semantic GCN, H_glo = ReLU(GCN(M^se, H_g, W)), and the global semantic information feature H_glo is extracted and updated by the graph convolutional network;
5) A feature adaptive fusion stage: the enhanced syntactic dependency information H_s_loc under the local context with domain knowledge is spliced with the global semantic information H_glo, i.e. X = torch.cat((H_s_loc, H_glo)), to obtain the enhanced syntactic information and global semantic information of the sentence under the local context considering domain knowledge; this is passed through a residual multi-layer perceptron and then input into a self-attention layer for adaptive fusion, obtaining a feature representation suitable for the task;
6) A feature vector output stage: the fused feature vector is subjected to a BERT pooling operation and the final vector representation is output; the positive, negative and neutral emotion polarity probabilities are obtained through a softmax classifier;
7) A model training stage: the network is optimized by the Adam algorithm with the cross-entropy loss function as the loss function, i.e., the goal of training the classifier is to minimize the cross-entropy loss between the predicted emotion distribution and the true emotion distribution:
L = −Σ_{i=1}^{S} Σ_{j=1}^{C} ŷ_i^j log(y_i^j) + λ‖Θ‖²,
where S is the number of training samples, C is the number of polarity classes, ŷ is the true emotion distribution of the sample, y is the predicted emotion distribution, λ is the weight of the L2 regularization term, and Θ represents all trainable parameters.
CN202210922723.1A 2022-08-02 2022-08-02 Knowledge-enhanced syntactic heteromorphic graph-based aspect-level emotion classification method Withdrawn CN115269847A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210922723.1A CN115269847A (en) 2022-08-02 2022-08-02 Knowledge-enhanced syntactic heteromorphic graph-based aspect-level emotion classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210922723.1A CN115269847A (en) 2022-08-02 2022-08-02 Knowledge-enhanced syntactic heteromorphic graph-based aspect-level emotion classification method

Publications (1)

Publication Number Publication Date
CN115269847A true CN115269847A (en) 2022-11-01

Family

ID=83747408

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210922723.1A Withdrawn CN115269847A (en) 2022-08-02 2022-08-02 Knowledge-enhanced syntactic heteromorphic graph-based aspect-level emotion classification method

Country Status (1)

Country Link
CN (1) CN115269847A (en)


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116090450A (en) * 2022-11-28 2023-05-09 荣耀终端有限公司 Text processing method and computing device
CN115659951A (en) * 2022-12-26 2023-01-31 华南师范大学 Statement emotion analysis method, device and equipment based on label embedding
CN116561592A (en) * 2023-07-11 2023-08-08 航天宏康智能科技(北京)有限公司 Training method of text emotion recognition model, text emotion recognition method and device
CN116561592B (en) * 2023-07-11 2023-09-29 航天宏康智能科技(北京)有限公司 Training method of text emotion recognition model, text emotion recognition method and device
CN116662554A (en) * 2023-07-26 2023-08-29 之江实验室 Infectious disease aspect emotion classification method based on heterogeneous graph convolution neural network
CN116662554B (en) * 2023-07-26 2023-11-14 之江实验室 Infectious disease aspect emotion classification method based on heterogeneous graph convolution neural network
CN117171610A (en) * 2023-08-03 2023-12-05 江南大学 Knowledge enhancement-based aspect emotion triplet extraction method and system
CN117171610B (en) * 2023-08-03 2024-05-03 江南大学 Knowledge enhancement-based aspect emotion triplet extraction method and system
CN117371456A (en) * 2023-10-10 2024-01-09 国网江苏省电力有限公司南通供电分公司 Multi-mode irony detection method and system based on feature fusion
CN117933372A (en) * 2024-03-22 2024-04-26 山东大学 Data enhancement-oriented vocabulary combined knowledge modeling method and device
CN117933372B (en) * 2024-03-22 2024-06-07 山东大学 Data enhancement-oriented vocabulary combined knowledge modeling method and device

Similar Documents

Publication Publication Date Title
CN115269847A (en) Knowledge-enhanced syntactic heteromorphic graph-based aspect-level emotion classification method
CN110807154B (en) Recommendation method and system based on hybrid deep learning model
CN111368996B (en) Retraining projection network capable of transmitting natural language representation
US11132512B2 (en) Multi-perspective, multi-task neural network model for matching text to program code
Neubig Neural machine translation and sequence-to-sequence models: A tutorial
CN110674850A (en) Image description generation method based on attention mechanism
US11410031B2 (en) Dynamic updating of a word embedding model
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN112784532B (en) Multi-head attention memory system for short text sentiment classification
CN108549703B (en) Mongolian language model training method based on recurrent neural network
CN115048447B (en) Database natural language interface system based on intelligent semantic completion
Bokka et al. Deep Learning for Natural Language Processing: Solve your natural language processing problems with smart deep neural networks
Lebret Word embeddings for natural language processing
Grzegorczyk Vector representations of text data in deep learning
Chen et al. Deep neural networks for multi-class sentiment classification
CN111507093A (en) Text attack method and device based on similar dictionary and storage medium
CN111581365B (en) Predicate extraction method
CN112100342A (en) Knowledge graph question-answering method based on knowledge representation learning technology
CN114997155A (en) Fact verification method and device based on table retrieval and entity graph reasoning
Gupta A review of generative AI from historical perspectives
CN113157892A (en) User intention processing method and device, computer equipment and storage medium
Dasgupta et al. A Review of Generative AI from Historical Perspectives
Kreyssig Deep learning for user simulation in a dialogue system
Narayanaperumal Deep Neural Networks for Sentiment Analysis in Tweets with Emoticons
Lin Deep neural networks for natural language processing and its acceleration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20221101