CN113157919B - Sentence text aspect-level emotion classification method and sentence text aspect-level emotion classification system - Google Patents
Abstract
The invention provides a sentence text aspect-level emotion classification method and system, belonging to the technical field of text emotion classification, comprising the following steps: each word is given a serialized representation, the contextual sequence information of the sequence is obtained, and a structured aspect representation and a structured context representation are generated through a structured self-attention mechanism; according to the structured aspect representation and the structured context representation, the final embedding is extracted using the syntactic dependency information of the dependency tree, combined with average pooling to aggregate the aspect vector information; the probability distribution over different emotion polarities is calculated in combination with a back-propagation algorithm, and the final emotion polarity of the sentence is predicted. The invention solves the problem of long-distance word dependencies among multiple words and takes contextual dependency relationships into account; using a structured self-attention mechanism, sentences are encoded into a multi-dimensional matrix in which each vector can be regarded as a context associated with an aspect word, generating a contextual representation of the aspect and revealing the relationship of multiple semantic segments to the aspect word.
Description
Technical Field
The invention relates to the technical field of text emotion classification, and in particular to a sentence text aspect-level emotion classification method and system based on attention-network deep learning.
Background
Aspect-level emotion classification is a popular task for determining emotion polarity, which aims to identify the emotion polarity of a given aspect word in a sentence. For text comments about an object, the emotion polarity of a text sentence is mainly judged along two polarities, positive and negative. Noun phrases appearing in the input sentence serve as the aspect words whose emotion polarity is to be determined. This classification matters because a single sentence can mention several aspects at the same time, and these aspects can carry different emotion polarities.
The traditional approach is to construct feature engineering for the model and select a series of good features, typically using conventional methods such as emotion dictionaries and machine learning. Deep learning methods have a notable advantage in automatically learning text features: reliance on manually designed features can be avoided, and features can be mapped into continuous low-dimensional vectors. Deep learning is therefore widely used in aspect-level emotion classification.
Classification models based on neural networks, such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs), have been widely used for aspect-level emotion classification. Attention-based RNN models enhance semantic connections between context words and aspect words, and are widely used in recent approaches to search for potentially relevant words related to a given aspect word. CNN-based attention methods have also been proposed to enhance phrase-level representations and have achieved good results. Although attention-based models perform well on multiple tasks, their limitations remain apparent: because syntactic information is lost, the attention module may highlight irrelevant words, which can lead to incorrect prediction of emotion polarity.
Disclosure of Invention
The invention aims to provide a sentence text aspect-level emotion classification method and system based on attention-network deep learning that classify text emotion polarity accurately, so as to solve at least one technical problem in the background art.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
in a first aspect, the present invention provides a sentence text aspect emotion classification method, including:
each word is given a serialized representation, the contextual sequence information of the sequence is obtained, and a structured aspect representation and a structured context representation are generated through a structured self-attention mechanism;
according to the structured aspect representation and the structured context representation, the final embedding for the classification task is extracted using the syntactic dependency information of the dependency tree, combined with average pooling to aggregate the aspect vector information;
and, according to the final embedding, the probability distribution over different emotion polarities is calculated in combination with a back-propagation algorithm, and the final emotion polarity of the sentence text is predicted.
Preferably, a preprocessing operation is performed with GloVe word embeddings, and each word is given a serialized representation to obtain the word embedding representation of the text;
and the bidirectional long short-term memory network Bi-LSTM is used to extract sequence features in the forward and backward directions, capturing the contextual sequence information of the sequence.
Preferably, a graph attention neural network based on the dependency tree is constructed, and an extraction model for the dependency relationships is built using the syntactic dependency information of the dependency tree;
the final embedding for the classification task is extracted with the constructed extraction model, combined with average pooling to aggregate the aspect vector information;
and the extracted final embedding is passed through a fully connected layer and input into the final softmax classifier, so that the final emotion polarity is predicted.
Preferably, given the context memory and the aspect memory of the contextual information, semantic segments related to the aspect words are extracted, and the aspect memory is converted into a structured aspect representation by a self-attention operation to obtain an aspect matrix;
a penalty term is added to encourage diversity of the weighted-sum vectors in the aspect representation;
and the relationship between the aspect matrix and the context memory is captured to construct a context matrix, which is transformed with a feed-forward network and combined with the original context matrix through a residual connection to obtain the final structured context representation.
Preferably, words in the text sentence are represented as nodes of the dependency tree, the syntactic dependency paths between words are represented as edges of the dependency tree, and the node features of the dependency tree are given by the real-valued vectors modeled by Bi-LSTM;
attention is distributed over the neighbor node set of the central node, the attention coefficients are normalized, and the weight coefficients are recalculated;
the strength of the influence of neighbor nodes on the central node in different respects is captured through a multi-head attention mechanism, and the extracted node feature representations are concatenated to obtain the final node representation;
and, combining the recalculated weight coefficients, the final embedding is obtained by averaging instead of concatenation.
Preferably, the word embedding preprocessing operation is performed with pre-trained GloVe embeddings. Given a context sentence $S = \{w_1, w_2, \ldots, w_n\}$ of length $n$ containing the aspect as part of the context input sequence, the aspect $A = \{w_i, w_{i+1}, \ldots, w_{i+m-1}\}$ consists of $m$ words;
each word $w_i$ is mapped to a low-dimensional word embedding vector $x_i \in \mathbb{R}^{d_w}$, where $d_w$ is the dimension of the word vector and $E \in \mathbb{R}^{d_w \times |V|}$ is the pre-trained GloVe embedding matrix, with $|V|$ the vocabulary size.
Preferably, the Bi-LSTM network is used to extract sequence features in the forward and backward directions; the hidden state output by the forward LSTM at time $t$ is $\overrightarrow{h_t} \in \mathbb{R}^{d_h}$, the hidden state output by the backward LSTM is $\overleftarrow{h_t} \in \mathbb{R}^{d_h}$, and the hidden state output by the Bi-LSTM is $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}] \in \mathbb{R}^{2d_h}$;
the sequence $h$ is divided into a context memory $M_c$ and an aspect memory $M_a$; $M_c$ contains the representations of all context words, and $M_a$ contains the representations of all aspect words.
In a second aspect, the present invention provides a sentence text aspect emotion classification system, including:
a sequence representation module, configured to produce a serialized representation of each word, obtain the contextual sequence information of the sequence, and generate a structured aspect representation and a structured context representation through a structured self-attention mechanism;
an extraction module, configured to extract the final embedding from the structured aspect representation and the structured context representation, using the syntactic dependency information of the dependency tree combined with average pooling to aggregate the aspect vector information;
and a prediction module, configured to calculate the probability distribution over different emotion polarities from the final embedding in combination with a back-propagation algorithm, and to predict the final emotion polarity of the sentence text.
In a third aspect, the present invention provides a non-transitory computer readable storage medium comprising instructions for performing the sentence text aspect emotion classification method as described above.
In a fourth aspect, the present invention provides an electronic device comprising a non-transitory computer readable storage medium as described above; and one or more processors capable of executing the instructions of the non-transitory computer-readable storage medium.
The invention has the following beneficial effects: the syntactic dependency structure within the sentence is used to solve the problem of long-distance word dependencies among multiple words, addressing dependency relationships ignored in previous research; a structured self-attention mechanism is designed to encode sentences into a multi-dimensional matrix, where each vector can be regarded as a context associated with an aspect word, generating a contextual representation of the aspect and revealing the relationship of multiple semantic segments to the aspect word.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a sentence text classifying method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a sentence text-level emotion classification result according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements throughout or elements having like or similar functionality. The embodiments described below by way of the drawings are exemplary only and should not be construed as limiting the invention.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
In order that the invention may be readily understood, a further description of the invention will be rendered by reference to specific embodiments that are illustrated in the appended drawings and are not to be construed as limiting embodiments of the invention.
It will be appreciated by those skilled in the art that the drawings are merely schematic representations of examples and that the elements of the drawings are not necessarily required to practice the invention.
Example 1
The embodiment 1 of the invention provides a sentence text aspect emotion classification system, which comprises:
a sequence representation module, configured to produce a serialized representation of each word, obtain the contextual sequence information of the sequence, and generate a structured aspect representation and a structured context representation through a structured self-attention mechanism;
an extraction module, configured to extract the final embedding from the structured aspect representation and the structured context representation, using the syntactic dependency information of the dependency tree combined with average pooling to aggregate the aspect vector information;
and a prediction module, configured to calculate the probability distribution over different emotion polarities from the final embedding in combination with a back-propagation algorithm, and to predict the final emotion polarity of the sentence text.
In this embodiment 1, a sentence text aspect emotion classification method is implemented by using the system described above, where the method includes:
each word is given a serialized representation, the contextual sequence information of the sequence is obtained, and a structured aspect representation and a structured context representation are generated through a structured self-attention mechanism;
according to the structured aspect representation and the structured context representation, the final embedding for the classification task is extracted using the syntactic dependency information of the dependency tree, combined with average pooling to aggregate the aspect vector information;
and, according to the final embedding, the probability distribution over different emotion polarities is calculated in combination with a back-propagation algorithm, and the final emotion polarity of the sentence text is predicted.
A preprocessing operation is performed with GloVe word embeddings, and each word is given a serialized representation to obtain the word embedding representation of the text; the bidirectional long short-term memory network Bi-LSTM is used to extract sequence features in the forward and backward directions, capturing the contextual sequence information of the sequence.
A graph attention neural network based on the dependency tree is constructed, and an extraction model for the dependency relationships is built using the syntactic dependency information of the dependency tree; the final embedding for the classification task is extracted with the constructed extraction model, combined with average pooling to aggregate the aspect vector information; the extracted final embedding is passed through a fully connected layer and input into the final softmax classifier, so that the final emotion polarity is predicted.
Given the context memory and the aspect memory of the contextual information, semantic segments related to the aspect words are extracted, and the aspect memory is converted into a structured aspect representation by a self-attention operation to obtain an aspect matrix; a penalty term is added to encourage diversity of the weighted-sum vectors in the aspect representation; the relationship between the aspect matrix and the context memory is captured to construct a context matrix, which is transformed with a feed-forward network and combined with the original context matrix through a residual connection to obtain the final structured context representation.
Words in the text sentence are represented as nodes of the dependency tree, and the syntactic dependency paths between words are represented as edges of the dependency tree, the node features being given by the real-valued vectors modeled by Bi-LSTM; attention is distributed over the neighbor node set of the central node, the attention coefficients are normalized, and the weight coefficients are recalculated; the strength of the influence of neighbor nodes on the central node in different respects is captured through a multi-head attention mechanism, and the extracted node feature representations are concatenated to obtain the final node representation; combining the recalculated weight coefficients, the final embedding is obtained by averaging instead of concatenation.
The word embedding preprocessing operation is performed with pre-trained GloVe embeddings. Given a context sentence $S = \{w_1, w_2, \ldots, w_n\}$ of length $n$ containing the aspect as part of the context input sequence, the aspect $A = \{w_i, w_{i+1}, \ldots, w_{i+m-1}\}$ consists of $m$ words;
each word $w_i$ is mapped to a low-dimensional word embedding vector $x_i \in \mathbb{R}^{d_w}$, where $d_w$ is the dimension of the word vector and $E \in \mathbb{R}^{d_w \times |V|}$ is the pre-trained GloVe embedding matrix, with $|V|$ the vocabulary size.
Sequence features are extracted in the forward and backward directions with the Bi-LSTM network: the hidden state output by the forward LSTM at time $t$ is $\overrightarrow{h_t} \in \mathbb{R}^{d_h}$, the hidden state output by the backward LSTM is $\overleftarrow{h_t} \in \mathbb{R}^{d_h}$, and the hidden state output by the Bi-LSTM is $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}] \in \mathbb{R}^{2d_h}$.
The sequence $h$ is divided into a context memory $M_c$ and an aspect memory $M_a$; $M_c$ contains the representations of all context words, and $M_a$ contains the representations of all aspect words.
Example 2
In embodiment 2 of the invention, a sentence text aspect-level emotion prediction method is provided using a structured self-attention mechanism and a graph attention network. First, a bidirectional long short-term memory network (Bi-LSTM) is used to capture the contextual information within sentences and learn sentence representations; the structured self-attention mechanism is then used to capture context segments related to the emotion of the aspect word, and the embedding is further enhanced by a graph attention network acting directly on the dependency tree, thereby obtaining syntactic information and word dependencies.
The syntactic dependency structure within the sentence is used to solve the problem of long-distance word dependencies among multiple words, addressing dependency relationships ignored in prior research. A structured self-attention mechanism is designed to encode sentences into a multi-dimensional matrix, where each vector can be regarded as a context associated with an aspect word, generating a contextual representation of the aspect and revealing the relationship of multiple semantic segments to the aspect word.
As shown in fig. 1, the sentence text aspect emotion prediction method described in this embodiment 2 includes the steps of:
step S1: preprocessing operation is carried out by utilizing GloVE word embedding, each word is represented in a serialization mode, and word embedding representation of a text is obtained;
step S2: the Bi-LSTM network can be utilized to extract the sequence characteristics from the front direction and the back direction, and can well capture the context sequence information of the sequence;
step S3: generating a structured aspect representation and a structured context representation by a structured self-attention mechanism;
step S4: constructing a graph annotation meaning neural network based on a dependency tree, and modeling the dependency by utilizing syntactic dependency information of the dependency tree;
step S5: when the final embedding of the classification task is extracted, the information of the vector in the aspect of average pooling aggregation is utilized;
step S6: the pooled results were passed through the full connection layer and input into the final softmax classifier, predicting the final emotional polarity.
In step S1: given a context sentence $S = \{w_1, w_2, \ldots, w_n\}$ of length $n$ containing the aspect as part of the context input sequence, with aspect $A = \{w_i, w_{i+1}, \ldots, w_{i+m-1}\}$, the task is to infer the emotion polarity of aspect $A$ in sentence $S$.
In this embodiment 2, each input word $w_i$ is first mapped to a low-dimensional word embedding vector $x_i \in \mathbb{R}^{d_w}$, where $d_w$ is the dimension of the word vector and $E \in \mathbb{R}^{d_w \times |V|}$ is the pre-trained GloVe embedding matrix, with $|V|$ the size of the vocabulary.
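The embedding lookup of step S1 can be sketched as follows, with a hypothetical toy vocabulary and a random matrix standing in for the pre-trained GloVe vectors (the names `vocab`, `embed`, and the example sentence are illustrative assumptions, not part of the patent):

```python
import numpy as np

# Hypothetical toy vocabulary; in the patent, E would hold pre-trained
# GloVe vectors of dimension d_w indexed by a real vocabulary V.
vocab = {"the": 0, "noodles": 1, "were": 2, "great": 3}
d_w = 5
rng = np.random.default_rng(0)
E = rng.standard_normal((len(vocab), d_w))  # embedding matrix, |V| x d_w

def embed(sentence):
    """Map each word w_i of the context sentence S to its vector x_i."""
    return np.stack([E[vocab[w]] for w in sentence])

S = ["the", "noodles", "were", "great"]  # aspect A = {"noodles"}
X = embed(S)                             # shape (n, d_w)
```

The resulting matrix `X` is what the Bi-LSTM of step S2 consumes, one row per word of the sentence.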
In step S2: Bi-LSTM combines a forward hidden layer with a backward hidden layer, enabling selective use of both preceding and following information; it is often used to process contextual information in natural language processing tasks.
The Bi-LSTM network can extract sequence features in the forward and backward directions and captures the contextual sequence information of the sequence well. The hidden state output by the forward LSTM at time $t$ is $\overrightarrow{h_t} \in \mathbb{R}^{d_h}$, the hidden state output by the backward LSTM is $\overleftarrow{h_t} \in \mathbb{R}^{d_h}$, and the hidden state output by the Bi-LSTM is $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}] \in \mathbb{R}^{2d_h}$.
The sequence $h$ is divided into a context memory $M_c$ and an aspect memory $M_a$; $M_c$ contains the representations of all context words, and $M_a$ contains the representations of all aspect words.
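A minimal numpy sketch of step S2 follows; the randomly initialized weights stand in for trained parameters, and the gate layout `[i, f, o, g]` is one common LSTM convention, assumed here rather than specified by the patent:

```python
import numpy as np

def lstm_pass(X, Wx, Wh, b):
    """Run a single-direction LSTM over X (n, d); gates stacked [i, f, o, g]."""
    n, d = X.shape
    dh = Wh.shape[0]
    h, c, out = np.zeros(dh), np.zeros(dh), []
    for t in range(n):
        z = X[t] @ Wx + h @ Wh + b
        i, f, o = (1 / (1 + np.exp(-z[k * dh:(k + 1) * dh])) for k in range(3))
        g = np.tanh(z[3 * dh:])
        c = f * c + i * g            # cell state update
        h = o * np.tanh(c)           # hidden state
        out.append(h)
    return np.stack(out)

def bi_lstm(X, params_f, params_b):
    """h_t = [fwd_h_t ; bwd_h_t], as in the patent's Bi-LSTM encoder."""
    hf = lstm_pass(X, *params_f)
    hb = lstm_pass(X[::-1], *params_b)[::-1]
    return np.concatenate([hf, hb], axis=1)   # (n, 2*dh)

rng = np.random.default_rng(1)
d, dh, n = 4, 3, 5
mk = lambda: (rng.standard_normal((d, 4 * dh)) * 0.1,
              rng.standard_normal((dh, 4 * dh)) * 0.1,
              np.zeros(4 * dh))
H = bi_lstm(rng.standard_normal((n, d)), mk(), mk())

# Split into context memory M_c and aspect memory M_a
# (aspect assumed to occupy positions 2..3 for illustration).
M_a = H[2:4]
M_c = np.concatenate([H[:2], H[4:]])
```

The split at the end mirrors the division of the sequence $h$ into $M_c$ and $M_a$ described above.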
In step S3: given the context memory $M_c$ and the aspect memory $M_a$, semantic segments related to the aspect words are extracted, and the aspect memory $M_a$ is converted into a structured aspect representation $R_a$ by a self-attention operation, as follows:

$$A_a = \mathrm{softmax}\big(W_{s2} \tanh(W_{s1} M_a^T)\big)$$

where $A_a$ is a weight matrix, and $W_{s1}$ and $W_{s2}$ are the two parameters of the self-attention layer. $M_a^T$ denotes the transpose of $M_a$.
Multiplying the weight matrix with the aspect words to calculate a weighted sum to obtain an aspect representation:
$$R_a = A_a M_a$$
if the attention mechanism always provides a similar weighted sum, then the embedding matrix has redundancy problems. Thus, in this embodiment 2, a penalty term is required to encourage diversity of the weighted sum vector.
Using penalty term P in a penalty function to encourage motion in R a The diversity of the rows captured in (a).
wherein ,is A a I represents the transpose of the identity matrix, | x I F Representing the Frobenius norm of the matrix.
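The structured self-attention of step S3 and the penalty term $P$ can be sketched as follows (`W1` and `W2` play the role of the two self-attention parameters and are randomly initialized here; the number of attention rows `r` is an assumed hyperparameter):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def structured_self_attention(M_a, W1, W2):
    """A_a = softmax(W2 tanh(W1 M_a^T)); R_a = A_a M_a."""
    A = softmax(W2 @ np.tanh(W1 @ M_a.T), axis=1)  # (r, m) attention weights
    return A, A @ M_a                              # R_a: (r, hidden)

def penalty(A):
    """P = ||A A^T - I||_F^2, pushing the attention rows apart."""
    r = A.shape[0]
    return np.linalg.norm(A @ A.T - np.eye(r), "fro") ** 2

rng = np.random.default_rng(2)
m, dh, da, r = 3, 6, 5, 2          # m aspect words, r attention rows
M_a = rng.standard_normal((m, dh))
W1 = rng.standard_normal((da, dh))
W2 = rng.standard_normal((r, da))
A_a, R_a = structured_self_attention(M_a, W1, W2)
P = penalty(A_a)                   # added to the loss during training
```

Each row of `A_a` sums to 1 over the aspect words, and `P` is smallest when the rows of `A_a` are close to orthonormal, i.e. when the weighted sums attend to different parts of the aspect.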
Given the aspect matrix $R_a$, semantic segments related to the aspect are found from the context memory $M_c$.
First, a weight matrix $A_c$ is built to capture the relationship between the aspect matrix and the context memory, using a bilinear attention mechanism. Second, it is used to construct a context matrix, each row of which can be regarded as a semantic segment related to the aspect:

$$A_c = \mathrm{softmax}\big(R_a W_c M_c^T\big), \qquad C = A_c M_c$$

where $W_c$ is the parameter of the bilinear attention operation and $M_c^T$ is the transpose of $M_c$.
A transformed representation $T_c$ is further generated by applying a feed-forward network to $C$.
Residual connections are used to combine the two matrices to obtain the final structured context representation, and layer normalization helps prevent gradient vanishing and explosion:

$R_c = \operatorname{LayerNorm}(T_c + C)$
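The bilinear attention, feed-forward transform, residual connection, and layer normalization of step S3 can be sketched together as follows. All dimensions and the ReLU feed-forward form are illustrative assumptions; the patent does not specify the feed-forward network's internals.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

rng = np.random.default_rng(2)
r, nc, d2, dff = 2, 4, 8, 16
R_a = rng.normal(size=(r, d2))     # structured aspect representation
M_c = rng.normal(size=(nc, d2))    # context memory
W_c = rng.normal(size=(d2, d2))    # bilinear attention parameter
W_1, W_2 = rng.normal(size=(d2, dff)), rng.normal(size=(dff, d2))

# Bilinear attention between aspect rows and the context memory
A_c = softmax(R_a @ W_c @ M_c.T, axis=-1)      # (r, nc)
C = A_c @ M_c                                  # aspect-related context segments

# Position-wise feed-forward transform, then residual + layer normalization
T_c = np.maximum(C @ W_1, 0.0) @ W_2           # (r, d2)
R_c = layer_norm(T_c + C)                      # final structured context rep.
```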
In the step S4: a dependency tree can be understood as a graph with N nodes, where the nodes represent the words in a sentence and the edges represent the syntactic dependency paths between words; the nodes of the dependency tree are initialized with the real-valued vectors modeled by the Bi-LSTM.
The set of input feature vectors of a single attention layer is $h = \{h_1, h_2, \ldots, h_N\}$ and the set of output node feature vectors is $h' = \{h'_1, h'_2, \ldots, h'_N\}$. The attention coefficient between the center node and a neighbor node is:

$e_{ij} = a\big(W h_i, W h_j\big)$

wherein $h_i$ denotes the input embedding of the i-th node, $h'_i$ denotes the output embedding of the i-th node, N denotes the number of nodes in the node set, and W is a weight matrix, i.e. a parameterized linear transformation mapping the input feature dimension to the output dimension.
Self-attention distributes attention over all nodes in the graph, which loses structural information. In this embodiment 2, to solve this problem, masked self-attention is used so that attention is allocated only to the set of neighbor nodes of node i. The attention coefficients are then normalized by softmax, and the recalculated weight coefficients are:

$\alpha_{ij} = \operatorname{softmax}_j(e_{ij}) = \dfrac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})}$
the strength of the influence of the neighbor nodes on the central node in different aspects is captured through a multi-head attention mechanism. Splicing the node characteristic representations extracted by the K heads respectively to obtain a final node representation:
wherein, the I represents a stitching operation,representing the normalized attention coefficient, W, calculated by the kth attention mechanism k Is a weight matrix of the corresponding input linear transformation.
Finally, in the last layer, averaging is used in place of concatenation to obtain the final embedding:

$h'_i = \sigma\Big(\dfrac{1}{K}\sum_{k=1}^{K}\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\Big)$
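The masked multi-head graph attention of step S4 can be sketched as below. The LeakyReLU scoring function follows the standard graph-attention formulation; it, together with all dimensions, the toy adjacency, and the tanh nonlinearity standing in for σ, is an assumption rather than something the patent specifies.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(H, adj, Ws, avs, average=False):
    """Masked multi-head graph attention over a dependency graph.
    H: (N, F) node features (from the Bi-LSTM); adj: (N, N) boolean
    adjacency with self-loops; Ws: list of K (F, Fp) weight matrices;
    avs: list of K (2*Fp,) attention vectors. Heads are concatenated,
    or averaged when average=True (as in the final layer)."""
    heads = []
    for W, a in zip(Ws, avs):
        Z = H @ W                                      # linear transform (N, Fp)
        Fp = Z.shape[1]
        # e_ij = LeakyReLU(a^T [Z_i ; Z_j]), via the usual split of a
        e = leaky_relu((Z @ a[:Fp])[:, None] + (Z @ a[Fp:])[None, :])
        e = np.where(adj, e, -1e9)                     # mask non-neighbours of i
        alpha = softmax(e, axis=-1)                    # normalized coefficients
        heads.append(alpha @ Z)                        # weighted neighbour sum
    if average:                                        # final layer: average
        return np.tanh(np.mean(heads, axis=0))
    return np.concatenate([np.tanh(h) for h in heads], axis=-1)

rng = np.random.default_rng(3)
N, F, Fp, K = 5, 6, 4, 2                               # toy dimensions
H = rng.normal(size=(N, F))
adj = np.eye(N, dtype=bool)                            # self-loops
adj[0, 1] = adj[1, 0] = adj[1, 2] = adj[2, 1] = True   # a few dependency edges
Ws = [rng.normal(size=(F, Fp)) for _ in range(K)]
avs = [rng.normal(size=(2 * Fp,)) for _ in range(K)]

H_cat = gat_layer(H, adj, Ws, avs)                     # hidden layer: (N, K*Fp)
H_avg = gat_layer(H, adj, Ws, avs, average=True)       # final layer: (N, Fp)
```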
in the step S5: when extracting the final inlay of the classification task, information on the average pooling aggregate aspect vector is utilized:
where f (·) is the average function of the enhancement aspect vector.
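The aggregation f(·) is a plain mean over the aspect rows; a minimal sketch, with random stand-ins for the graph-attention outputs:

```python
import numpy as np

rng = np.random.default_rng(4)
m, d = 3, 8
H_aspect = rng.normal(size=(m, d))   # enhanced aspect-word vectors (stand-ins
                                     # for the graph-attention outputs)

# f(.): average pooling aggregates the aspect vectors into the final embedding
h_final = H_aspect.mean(axis=0)      # shape (d,)
```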
In the step S6: the hidden state is fed through a fully connected softmax layer to output the probability distribution p(a) over the different emotion polarities:

$p(a) = \operatorname{softmax}\big(W_p h + b_p\big)$

wherein $W_p$ is a weight coefficient matrix and $b_p$ is a bias matrix.
In this embodiment 2, the network model is trained with a back-propagation algorithm, and the model is optimized by minimizing the cross entropy; the objective function (loss) is defined as:

$loss = -\sum_{(s,a)\in D}\sum_{i} \hat{p}_i \log p_i + \lambda_1 \sum_i P_i + \lambda_2 \Vert\theta\Vert_2^2$

wherein D is the training dataset, $\lambda_1$ and $\lambda_2$ control the penalty term and the $L_2$ regularization term respectively, $\theta$ denotes all parameters, $p_i$ is the i-th element of p, and $P_i$ is the penalty term of the i-th training sample.
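Step S6 and the training objective can be sketched for a single sample as follows. The polarity encoding, the λ values, and the penalty value `P` are illustrative stand-ins, not values from the patent.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(5)
d, n_pol = 8, 3                          # embedding size, polarity classes
h = rng.normal(size=(d,))                # final embedding from average pooling
W_p = rng.normal(size=(n_pol, d))        # weight matrix of the softmax layer
b_p = rng.normal(size=(n_pol,))          # bias

p = softmax(W_p @ h + b_p)               # p(a): distribution over polarities

# Cross entropy for one gold polarity y, plus penalty and L2 terms
y = 1                                    # e.g. 0=negative, 1=neutral, 2=positive
P = 0.37                                 # penalty-term value (illustrative)
lam1, lam2 = 1e-3, 1e-4
theta = [W_p, b_p]                       # in practice: all model parameters
loss = -np.log(p[y]) + lam1 * P + lam2 * sum((t ** 2).sum() for t in theta)
```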
As shown in fig. 2, the comments are restaurant reviews, and emotion polarity is judged for the two aspects of the sentence "The noodles were delicious, but the service was terrible", which expresses that the noodles are good but the service is poor. The emotion polarity for the aspect word "noodles" is "positive", and the emotion polarity for the aspect word "service" is "negative".
Example 3
each word is subjected to serialization representation, contextual sequence information of the sequence is obtained, and structural aspect representation and structural contextual representation are generated through a structural self-attention mechanism;
extracting the final embedding of the classification task by utilizing the syntactic dependency information of the dependency tree and combining the information of the average pooling aggregation aspect vector according to the structural aspect representation and the structural context representation;
and according to the final embedding, combining a back propagation algorithm, calculating probability distribution of different emotion polarities, and predicting the final emotion polarity of the sentence text.
Example 4
Embodiment 4 of the present invention provides an electronic device including a non-transitory computer-readable storage medium; and one or more processors capable of executing the instructions of the non-transitory computer-readable storage medium. The non-transitory computer-readable storage medium includes instructions for performing a sentence text-aspect emotion classification method, the method comprising:
each word is subjected to serialization representation, contextual sequence information of the sequence is obtained, and structural aspect representation and structural contextual representation are generated through a structural self-attention mechanism;
extracting the final embedding of the classification task by utilizing the syntactic dependency information of the dependency tree and combining the information of the average pooling aggregation aspect vector according to the structural aspect representation and the structural context representation;
and according to the final embedding, combining a back propagation algorithm, calculating probability distribution of different emotion polarities, and predicting the final emotion polarity of the sentence text.
Example 5
An embodiment 5 of the present invention provides an electronic device, where the device includes instructions for performing a sentence text aspect emotion classification method, where the method includes:
each word is subjected to serialization representation, contextual sequence information of the sequence is obtained, and structural aspect representation and structural contextual representation are generated through a structural self-attention mechanism;
extracting the final embedding of the classification task by utilizing the syntactic dependency information of the dependency tree and combining the information of the average pooling aggregation aspect vector according to the structural aspect representation and the structural context representation;
and according to the final embedding, combining a back propagation algorithm, calculating probability distribution of different emotion polarities, and predicting the final emotion polarity of the sentence text.
In summary, the sentence text aspect-level emotion classification method and system provided by the embodiments of the invention utilize a structured self-attention mechanism and a graph attention network. The model first captures context information within sentences using a bidirectional long short-term memory network (Bi-LSTM) to learn sentence representations; the structured self-attention mechanism then captures context segments related to the emotion of the aspect word, and the embedding is further enhanced by a graph attention network acting directly on the dependency tree, thereby obtaining syntactic information and word dependencies. The syntactic dependency structure of the sentence is used to address long-distance dependencies among words, resolving dependency relationships that prior studies ignored. The structured self-attention mechanism encodes a sentence into a multi-dimensional matrix, where each vector can be regarded as a context associated with an aspect word, so as to generate a contextual representation of the aspect and reveal the relationship of multiple semantic segments to the aspect word.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing description of the preferred embodiments of the present disclosure is provided for illustration only and is not intended to limit the disclosure; various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
While the foregoing embodiments of the present disclosure have been described in conjunction with the accompanying drawings, it is not intended to limit the scope of the disclosure, and it should be understood that, based on the technical solutions disclosed in the present disclosure, various modifications or variations may be made by those skilled in the art without requiring any inventive effort, and are intended to be included in the scope of the present disclosure.
Claims (8)
1. A sentence text aspect-level emotion classification method, comprising:
each word is subjected to serialization representation, contextual sequence information of the sequence is obtained, and structural aspect representation and structural contextual representation are generated through a structural self-attention mechanism;
comprising the following steps: preprocessing operation is carried out by utilizing GloVE word embedding, each word is represented in a serialization mode, and word embedding representation of a text is obtained;
extracting sequence characteristics from front and back directions by using Bi-directional long-short-term memory network Bi-LSTM to obtain context sequence information of a capturing sequence;
extracting the final embedding of the classification task by utilizing the syntactic dependency information of the dependency tree and combining the information of the average pooling aggregation aspect vector according to the structural aspect representation and the structural context representation;
comprising the following steps: constructing a graph annotation meaning neural network based on a dependency relationship tree, and constructing an extraction model for the dependency relationship by utilizing syntactic dependency information of the dependency relationship tree;
extracting the final embedding of the classification task by utilizing the constructed extraction model and combining the information of the vector in the aspect of average pooling aggregation;
comprising the following steps: extracting semantic segments related to aspect words aiming at the context memory and the aspect memory in the context sequence information, and converting the aspect memory into a structured aspect representation by utilizing self-attention operation to obtain an aspect matrix;
adding a penalty term to encourage diversity of the weighted-sum vectors in the aspect representation;
acquiring the relation between the context memory and the aspect matrix, constructing a context matrix, transforming the context matrix by using a feedforward network, and combining the transformed representation with the context matrix via a residual connection to obtain the final structured context representation;
and according to the final embedding, combining a back propagation algorithm, calculating probability distribution of different emotion polarities, and predicting the final emotion polarity of the sentence text.
2. The sentence text aspect emotion classification method of claim 1, characterized by:
the extracted final embedding is input into a final softmax classifier after passing through a fully connected layer, so that the final emotion polarity is predicted.
3. The sentence text aspect emotion classification method of claim 2, characterized by:
representing the words in the text sentence as nodes in a dependency tree, and representing the syntactic dependency paths among the words as node edges in the dependency tree, wherein the nodes of the dependency tree are given by real value vectors modeled by Bi-LSTM;
distributing the attention to a neighbor node set of the central node, normalizing the attention coefficient, and recalculating the weight coefficient;
capturing the influence intensity of the neighbor nodes on the central node in different aspects through a multi-head attention mechanism, and splicing the extracted node characteristic representations to obtain a final node representation;
and combining the recalculated weight coefficients, and obtaining the final embedding by using averaging in place of concatenation.
4. A sentence text aspect emotion classification method according to claim 3, characterized in that:
performing a word embedding preprocessing operation by utilizing GloVE word embedding, and giving a context sentence $S = \{w_1, w_2, \ldots, w_n\}$ of length n comprising the aspect, wherein the aspect $A = \{w_i, w_{i+1}, \ldots, w_{i+m-1}\}$ contains m words;
5. The sentence text aspect emotion classification method of claim 4, characterized by:
extracting sequence features in both the forward and backward directions by using the Bi-LSTM network, wherein the hidden state output by the forward LSTM at time t is $\overrightarrow{h_t}$, the hidden state output by the backward LSTM is $\overleftarrow{h_t}$, and the hidden state output by the Bi-LSTM is $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$;
wherein $x_1, x_2, \ldots, x_n$ denotes the context input sequence;
the sequence h is divided into a context memory $M_c$ and an aspect memory $M_a$; $M_c$ contains the representations of all context words, and $M_a$ contains the representations of all aspect words.
6. A sentence text aspect-level emotion classification system, comprising:
a sequence representation module, configured to perform a serialized representation of each word, obtain the context sequence information of the sequence, and generate the structural aspect representation and the structural context representation through a structured self-attention mechanism;
comprising the following steps: preprocessing operation is carried out by utilizing GloVE word embedding, each word is represented in a serialization mode, and word embedding representation of a text is obtained;
extracting sequence characteristics from front and back directions by using Bi-directional long-short-term memory network Bi-LSTM to obtain context sequence information of a capturing sequence;
the extraction module is used for extracting final embedding by utilizing syntactic dependency information of the dependency relationship tree and combining information of the average pooling aggregation aspect vector according to the structural aspect representation and the structural context representation;
comprising the following steps: constructing a graph annotation meaning neural network based on a dependency relationship tree, and constructing an extraction model for the dependency relationship by utilizing syntactic dependency information of the dependency relationship tree;
extracting the final embedding of the classification task by utilizing the constructed extraction model and combining the information of the vector in the aspect of average pooling aggregation;
comprising the following steps: extracting semantic segments related to aspect words aiming at the context memory and the aspect memory in the context sequence information, and converting the aspect memory into a structured aspect representation by utilizing self-attention operation to obtain an aspect matrix;
adding a penalty term to encourage diversity of the weighted-sum vectors in the aspect representation;
acquiring the relation between the context memory and the aspect matrix, constructing a context matrix, transforming the context matrix by using a feedforward network, and combining the transformed representation with the context matrix via a residual connection to obtain the final structured context representation;
and the prediction module is used for calculating probability distribution of different emotion polarities according to final embedding and combining a back propagation algorithm to predict the final emotion polarity of the sentence text.
7. A non-transitory computer-readable storage medium comprising instructions for performing the sentence text aspect emotion classification method of any of claims 1-5.
8. An electronic device, characterized by comprising: the non-transitory computer-readable storage medium of claim 7; and one or more processors capable of executing the instructions of the sentence text aspect emotion classification method from the non-transitory computer-readable storage medium.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110372212.2A CN113157919B (en) | 2021-04-07 | 2021-04-07 | Sentence text aspect-level emotion classification method and sentence text aspect-level emotion classification system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110372212.2A CN113157919B (en) | 2021-04-07 | 2021-04-07 | Sentence text aspect-level emotion classification method and sentence text aspect-level emotion classification system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113157919A CN113157919A (en) | 2021-07-23 |
CN113157919B true CN113157919B (en) | 2023-04-25 |
Family
ID=76888564
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110372212.2A Active CN113157919B (en) | 2021-04-07 | 2021-04-07 | Sentence text aspect-level emotion classification method and sentence text aspect-level emotion classification system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113157919B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113609867B (en) * | 2021-08-23 | 2024-02-02 | 南开大学 | Method and system for learning context information based on single-layer network structure |
CN113869034B (en) * | 2021-09-29 | 2022-05-20 | 重庆理工大学 | Aspect emotion classification method based on reinforced dependency graph |
CN114707518B (en) * | 2022-06-08 | 2022-08-16 | 四川大学 | Semantic fragment-oriented target emotion analysis method, device, equipment and medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111488734A (en) * | 2020-04-14 | 2020-08-04 | 西安交通大学 | Emotional feature representation learning system and method based on global interaction and syntactic dependency |
CN111783474A (en) * | 2020-07-16 | 2020-10-16 | 厦门市美亚柏科信息股份有限公司 | Comment text viewpoint information processing method and device and storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108829662A (en) * | 2018-05-10 | 2018-11-16 | 浙江大学 | A kind of conversation activity recognition methods and system based on condition random field structuring attention network |
CN109543039B (en) * | 2018-11-23 | 2022-04-08 | 中山大学 | Natural language emotion analysis method based on deep network |
CN111078833B (en) * | 2019-12-03 | 2022-05-20 | 哈尔滨工程大学 | Text classification method based on neural network |
CN111461004B (en) * | 2020-03-31 | 2023-08-22 | 北京邮电大学 | Event detection method and device based on graph attention neural network and electronic equipment |
CN112347248A (en) * | 2020-10-30 | 2021-02-09 | 山东师范大学 | Aspect-level text emotion classification method and system |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111488734A (en) * | 2020-04-14 | 2020-08-04 | 西安交通大学 | Emotional feature representation learning system and method based on global interaction and syntactic dependency |
CN111783474A (en) * | 2020-07-16 | 2020-10-16 | 厦门市美亚柏科信息股份有限公司 | Comment text viewpoint information processing method and device and storage medium |
Non-Patent Citations (2)
Title |
---|
Hierarchical dual-attention network based on aspect sentiment; Song Ting et al.; Information Technology and Network Security (No. 06); full text *
Aspect-level sentiment classification model with a context-oriented attention joint learning network; Yang Yuting et al.; Pattern Recognition and Artificial Intelligence (No. 08); full text *
Also Published As
Publication number | Publication date |
---|---|
CN113157919A (en) | 2021-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490946B (en) | Text image generation method based on cross-modal similarity and antagonism network generation | |
CN113157919B (en) | Sentence text aspect-level emotion classification method and sentence text aspect-level emotion classification system | |
CN108875807B (en) | Image description method based on multiple attention and multiple scales | |
CN110502753A (en) | A kind of deep learning sentiment analysis model and its analysis method based on semantically enhancement | |
CN110969020A (en) | CNN and attention mechanism-based Chinese named entity identification method, system and medium | |
Guan et al. | Autoattend: Automated attention representation search | |
CN111414749B (en) | Social text dependency syntactic analysis system based on deep neural network | |
CN111400494B (en) | Emotion analysis method based on GCN-Attention | |
CN113535953B (en) | Meta learning-based few-sample classification method | |
CN108154156B (en) | Image set classification method and device based on neural topic model | |
CN111723914A (en) | Neural network architecture searching method based on convolution kernel prediction | |
CN111353313A (en) | Emotion analysis model construction method based on evolutionary neural network architecture search | |
CN112560456A (en) | Generation type abstract generation method and system based on improved neural network | |
CN114841151B (en) | Medical text entity relation joint extraction method based on decomposition-recombination strategy | |
CN111858984A (en) | Image matching method based on attention mechanism Hash retrieval | |
CN114254645A (en) | Artificial intelligence auxiliary writing system | |
CN113806543B (en) | Text classification method of gate control circulation unit based on residual jump connection | |
Liu et al. | Hybrid neural network text classification combining TCN and GRU | |
Yang et al. | Text classification based on convolutional neural network and attention model | |
CN112559741B (en) | Nuclear power equipment defect record text classification method, system, medium and electronic equipment | |
CN113722439A (en) | Cross-domain emotion classification method and system based on antagonism type alignment network | |
CN113779966A (en) | Mongolian emotion analysis method of bidirectional CNN-RNN depth model based on attention | |
CN111783688B (en) | Remote sensing image scene classification method based on convolutional neural network | |
CN116595222A (en) | Short video multi-label classification method and device based on multi-modal knowledge distillation | |
CN116501864A (en) | Cross embedded attention BiLSTM multi-label text classification model, method and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |