CN113220887B - Emotion classification method using target knowledge enhancement model - Google Patents
Abstract
The invention relates to an emotion classification method using a target knowledge enhancement model. The method comprises the following steps: constructing a target knowledge enhancement model comprising an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer; acquiring a knowledge sentence corresponding to each target word in the context to be predicted; and inputting the context and each knowledge sentence into the target knowledge enhancement model to obtain the emotion classification result of the context. By introducing external knowledge for the target words in Chinese comment text and proposing a knowledge attention mechanism that dynamically assigns weights to the introduced external knowledge, the method overcomes the insufficient information content of target words and achieves high classification accuracy.
Description
Technical Field
The invention relates to the technical field of emotion classification, in particular to an emotion classification method for enhancing a model by utilizing target knowledge.
Background
Traditional aspect-level emotion classification methods typically employ machine learning methods, such as support vector machines (SVMs), that require hand-crafted features, including syntactic-analysis features and dictionary features.
Compared with traditional machine learning methods, neural networks can automatically capture important semantic and emotional features from text, avoiding a large amount of manual work; they are therefore widely applied to aspect-level emotion classification.
Tang et al. use two long short-term memory networks (LSTMs) to capture the emotional features before and after an aspect. Wang et al. propose an LSTM with an attention mechanism for capturing emotional features in context. Ma et al. propose two attention-based LSTM networks that interactively generate sentence and target representations and concatenate these representations for prediction. Chen et al. use a gated recurrent unit (GRU) network to integrate the hidden states of the LSTMs. Zeng et al. utilize a local context attention mechanism to capture local contextual features in comment text.
Wagner et al. trained a support vector machine with multiple external emotion dictionaries for emotion classification tasks. Kiritchenko et al. created a domain-specific emotion dictionary by manual writing and processing, providing additional domain emotion knowledge for support vector machines. Teng et al. calculate the polarity of each emotion word using an emotion dictionary and the weight of each emotion word using an LSTM, and finally predict the emotion polarity of the sentence with the weighted sum of the emotion words. Yang et al. propose a human-like hierarchical strategy for aspect-based sentiment classification. Chen et al. use a self-built domain emotion knowledge graph as side information to measure the emotion polarity between an emotion word and a target word.
These existing methods make good use of auxiliary information related to emotion words and improve model performance. However, they focus only on emotion information and ignore the important semantic information contained in the target word. In addition, the conventional emotion dictionaries and emotion knowledge graphs used as auxiliary information must be manually constructed for a specific dataset.
Disclosure of Invention
Based on the above, the invention aims to provide an emotion classification method using a target knowledge enhancement model. By constructing such a model, external knowledge is introduced for the target words in Chinese comment text, a knowledge attention mechanism is proposed that dynamically assigns weights to the introduced external knowledge, the insufficient information content of target words is overcome, and high classification accuracy is achieved.
In a first aspect, the invention provides an emotion classification method using a target knowledge enhancement model, comprising the following steps:
constructing a target knowledge enhancement model, wherein the target knowledge enhancement model comprises an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer;
inputting the output sequence of the input layer into the embedding layer, and mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer;
inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
The emotion classification method using the target knowledge enhancement model disclosed by the invention constructs the target knowledge enhancement model, introduces external knowledge for target words in Chinese comment text, and proposes a knowledge attention mechanism that dynamically assigns weights to the introduced external knowledge. It effectively and automatically identifies fine-grained emotion in short Chinese comment text, overcomes the insufficient information content of target words, and at the same time strengthens the model's semantic feature extraction capability.
Further, the hidden layer is a hidden attention layer, and the extracting a hidden state vector of an output sequence of the knowledge attention layer includes:
a weighted sum of the hidden state vectors of each word in the output sequence of the knowledge attention layer is calculated based on a hidden attention mechanism.
Further, obtaining the weight of each word in the context includes:
Q_i = O_BERT · W_i^q
K_i = O_BERT · W_i^k
V_i = O_BERT · W_i^v
wherein O_BERT is the output of the embedding layer, W_i^q, W_i^k and W_i^v are the projection matrices of the i-th self-attention head, W_i^C is the word-to-word weight matrix in each self-attention head, and Q, K and V are three different vectorized representations of each word in the context sentence.
Further, obtaining the weighted multi-head output sequence of the knowledge sentence comprises:
W_character = Sum(W^C)
wherein W^C represents the word-to-word weights, W_character represents the weight of each word relative to the context sentence, and L_0 represents the position of the first word of the target word; when the target word is a compound word, L_i represents the position of the last word of the i-th sub-target word; W_i^Single represents the weight of the i-th sub-target word relative to the context sentence, i.e. the knowledge attention weight; the weighted multi-head output sequence of the knowledge sentence is obtained by scaling with W_i^Single, and O_KA represents the output sequence of the knowledge attention layer.
Further, the acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted includes:
acquiring an entity knowledge sentence corresponding to each target word from a database;
when the target word is a compound word, dividing the compound word into a plurality of sub-target words, and acquiring an entity knowledge sentence of each sub-target word;
and preprocessing the data of the entity knowledge sentence, deleting noise data, and obtaining the knowledge sentence of the context to be detected.
Further, the data preprocessing is carried out on the entity knowledge sentence, and noise data are deleted, and the method comprises the following steps:
cutting the entity knowledge of which the sentence length exceeds a first threshold value, and deleting the content of which the sentence length exceeds the first threshold value;
and deleting English letters and nonsense characters appearing in the entity knowledge sentence.
Furthermore, the embedding layer is a multi-layer bidirectional transformer encoder comprising a plurality of transformer blocks and a plurality of self-attention heads; the embedding layer is used to map each word or token to a vector space.
Further, the inputting of the output sequence of the hidden layer into the output layer to obtain the classification result of the context includes:
and processing the output sequence of the hidden layer by using a softmax function to obtain an emotion polarity prediction result of the context.
In a second aspect, the present invention provides an emotion classification apparatus using a target knowledge enhancement model, including:
the model building module is used for building a target knowledge enhancement model, and the target knowledge enhancement model comprises an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
the knowledge sentence acquisition module is used for acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
the input module is used for splicing the context and the knowledge sentences to obtain an output sequence of an input layer;
the embedding module is used for mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
the multi-head self-attention module is used for calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
the knowledge attention module is used for obtaining a weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting an output sequence of a knowledge attention layer;
a hidden state vector acquisition module, configured to extract a hidden state vector of the output sequence of the knowledge attention layer to obtain an output sequence of a hidden layer;
and the output module is used for obtaining the classification result of the context.
For a better understanding and practice, the invention is described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a diagram illustrating the steps of an emotion classification method using a target knowledge enhancement model according to the present invention;
FIG. 2 is a schematic diagram of a structure of a target knowledge enhancement model constructed by the present invention;
FIG. 3 is a schematic diagram of an input sequence of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 4 is an example of the concatenation of knowledge sentences of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 5 is a simulation diagram of the knowledge attention mechanism of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 6 is a schematic diagram of an emotion classification apparatus using a target knowledge enhancement model according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that the embodiments described are only some embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the embodiments in the present application.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the present application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims. In the description of the present application, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not necessarily used to describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
As shown in fig. 1, fig. 1 is a step diagram of an emotion classification method using a target knowledge enhancement model provided by the present invention, including the following steps:
S1: the target knowledge enhancement model is constructed. As shown in fig. 2, a schematic structural diagram of the target knowledge enhancement model constructed by the present invention, the model includes an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer.
S2: and acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted.
The knowledge sentence is the target-enhancement knowledge corresponding to the target word, extracted from the database by the target knowledge enhancement model. Preferably, the database is a Chinese knowledge graph.
In one embodiment, the knowledge sentence is obtained as follows:
S21: obtain the knowledge sentences relevant to the target word from a Chinese knowledge graph (https://www.ownthink.com/docs/kg/): the target word is requested via HTTP GET from the URL (https://api.ownthink.com/kg/ambiguous=ment_name).
Request examples: https://api.ownthink.com/kg/ambiguous=operation, https://api.ownthink.com/kg/ambiguous=design
S22: and carrying out data preprocessing on the acquired entity knowledge, and deleting noise data.
In one embodiment, the specific manner of data preprocessing is as follows:
S221: truncate knowledge sentences whose length exceeds 512 characters, deleting the content beyond 512 characters.
S222: delete English letters and meaningless characters appearing in the entity knowledge sentence, such as abs, %, and the like.
S3: and inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer.
The purpose of the input layer is to splice the context and the knowledge sentences as input, providing more semantic information for the target knowledge enhancement model. Meanwhile, the knowledge sentences serve as auxiliary sentences, converting the single-sentence classification task into a sentence-pair classification task between the context and the auxiliary sentences (i.e., determining the classification relationship between two sentences).
Inspired by the BERT4TC and BERT-SPC models, the input sequence of the target knowledge enhancement model is similar to theirs; the difference is that multiple auxiliary sentences are separated by ".".
As shown in fig. 3, fig. 3 is an input sequence diagram of a target knowledge enhancement model constructed by the present invention, and the concatenation manner of context and knowledge sentences is as follows:
[CLS] context [SEP] knowledge sentence 1. knowledge sentence 2. [SEP]
where [CLS] represents the beginning of a sentence and [SEP] represents sentence segmentation and ending; different knowledge sentences are separated by ".".
For the target knowledge enhancement model, the input sequence consists of a context sequence and a knowledge sentence sequence, which enables the model to better learn the correlation between the context and the knowledge sentences. Let C be an input context sequence consisting of k characters, denoted C = {c_1, c_2, ..., c_i, ..., c_k}, where c_i represents the i-th character. Let T be the input sequence of a knowledge sentence consisting of n characters, denoted T = {t_1, t_2, ..., t_j, ..., t_n}.
In particular, when the target word is a compound word, it is divided into a plurality of sub-target words, and a knowledge sentence for each sub-target word is extracted from the Chinese knowledge graph. We use "." to separate the multiple knowledge sentences; T_i represents the knowledge sentence sequence of the i-th sub-target word.
As shown in fig. 4, fig. 4 is a multiple knowledge sentence splicing example of a target knowledge enhancement model constructed by the present invention, and therefore, the formula of the input sequence S of the input layer can be expressed as:
S={[CLS],C,[SEP],T,[SEP]} (1)
T={T1,[.],T2,[.],…,Ti,[.]} (2)
where [ CLS ] represents the beginning of a sentence and [ SEP ] represents the segmentation and end of the sentence. Used between different knowledge sentences. "separate.
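The splicing rule of formulas (1) and (2) can be sketched as follows; the function name and token spacing are hypothetical, and a Chinese period is assumed as the knowledge-sentence separator:

```python
CLS, SEP = "[CLS]", "[SEP]"

def build_input_sequence(context: str, knowledge_sentences: list[str]) -> str:
    """Formulas (1)-(2): S = {[CLS], C, [SEP], T, [SEP]}, where the
    knowledge sentences in T are each terminated by a period separator."""
    t = "".join(k + "。" for k in knowledge_sentences)
    return f"{CLS} {context} {SEP} {t} {SEP}"
```

For example, `build_input_sequence("服务很好", ["知识一", "知识二"])` yields one sentence-pair input with two knowledge sentences between the [SEP] markers.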
S4: and inputting the output sequence of the input layer into the embedded layer, and mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence.
The target knowledge enhancement model employs a BERT model as the embedding layer. The BERT architecture is a multi-layer bidirectional transformer encoder comprising 12 transformer blocks and 12 self-attention heads.
The BERT layer (embedding layer) maps each word or token to a vector space. Here, words refer to common Chinese characters that carry their own semantics; tokens refer to [CLS], [SEP] and the like, which contain no semantics and only serve to segment sentences.
For an input consisting of a context sequence C and a knowledge sentence sequence T, the embedding layer processes it as:
O_BERT^C = BERT(C) (3)
O_BERT^Ti = BERT(T_i) (4)
where O_BERT^C is the output representation of the context sequence, O_BERT^Ti is the output of the i-th knowledge sentence, and BERT denotes the embedding layer. T_i refers to the i-th knowledge sentence; preferably 1 ≤ i ≤ 4, i.e. the model introduces at most 4 knowledge sentences from outside. Beyond 4, too much external data is introduced, which slows the model and makes the data noisy.
S5: inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context.
Based on the self-attention mechanism, MHSA (Multi-Head Self-Attention) employs multiple self-attention heads to compute the attention score of each word in the context. Compared with recurrent neural networks (RNNs) and LSTMs, the self-attention mechanism supports parallel computation and is therefore faster and more efficient.
Suppose O_BERT^C and O_BERT^Ti are the outputs of the embedding layer, and let i denote the i-th head of MHSA. The formula of SDA (scaled dot-product attention) is as follows:
Q_i = O_BERT · W_i^q (5)
K_i = O_BERT · W_i^k (6)
V_i = O_BERT · W_i^v (7)
where Q, K and V represent three different vectorized representations of each word in the context sentence. Each attention head in the multi-head attention mechanism is computed by SDA:
SDA_i(O_BERT) = Softmax(Q_i · K_i^T / √d_k) · V_i (8)
Suppose H_i represents the output features of the i-th self-attention head; then
H_i = SDA_i(O_BERT) (1 ≤ i ≤ h) (9)
MHSA(O_BERT) = tanh({H_1; …; H_h} · W^MH) (10)
where ";" denotes vector concatenation. This step adopts tanh as the nonlinear activation function of the MHSA encoder, enhancing its feature extraction capability. O_MHSA denotes the output sequence of the multi-head self-attention layer.
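A minimal NumPy sketch of formulas (5)-(10) is given below. The helper names and head shapes are assumptions, and the scaled dot-product form follows the standard definition, since the text above names only "SDA (Scaled Dot Product)":

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Row-wise softmax."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sda(o_bert, w_q, w_k, w_v):
    """One scaled dot-product attention head, formulas (5)-(8)."""
    q, k, v = o_bert @ w_q, o_bert @ w_k, o_bert @ w_v
    w_c = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # word-to-word weights W^C
    return w_c @ v, w_c

def mhsa(o_bert, heads, w_mh):
    """Multi-head self-attention, formulas (9)-(10): concatenate the
    head outputs H_1..H_h and apply a tanh-activated projection."""
    h = [sda(o_bert, *w)[0] for w in heads]
    return np.tanh(np.concatenate(h, axis=-1) @ w_mh)
```

Because of the tanh activation, every entry of the MHSA output lies in (-1, 1).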
S6: and inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer.
In the MHSA calculation, the word-to-word weight in each self-attention head is denoted W_i^C. In the MHSA layer, W_i^C is processed with Softmax; at the knowledge attention layer, however, we average W_i^C along its first dimension. Suppose W_character represents the weight of each word relative to the context sentence; then
W_character = Sum(W^C) (15)
where L_0 represents the position of the first word of the target word; when the target word is a compound word, L_i represents the position of the last word of the i-th sub-target word; W_i^Single represents the weight of the i-th sub-target word relative to the context sentence, i.e. the knowledge attention weight.
The output sequence O_KA of the knowledge attention layer is then obtained by weighting the multi-head output sequence of each knowledge sentence with its knowledge attention weight and splicing the weighted knowledge outputs with the multi-head output sequence of the context.
In a specific example, as shown in fig. 5, fig. 5 is a simulation diagram of the knowledge attention mechanism of a target knowledge enhancement model constructed by the present invention, wherein the darker the color, the higher the knowledge weight.
S7: inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
The hidden attention layer is located after the knowledge attention layer; that is, the output sequence O_KA of the knowledge attention layer serves as the input sequence of the hidden attention layer.
Most approaches take, as the output of the knowledge attention layer, the hidden state vector extracted at the position of the first token, i.e. [CLS]. In this patent, however, we use a hidden attention mechanism instead of a pooling layer; the hidden attention mechanism calculates a weighted sum of the hidden state vectors of each word.
Suppose O_HA represents the output of the hidden attention layer; O_HA is calculated as:
O_HA = W_HA · O_KA + b_HA (19)
where W_HA represents the weight matrix and b_HA represents the bias parameter.
Experimental results show that the hidden attention mechanism has better performance than pooling.
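Under one plausible reading of formula (19), the hidden attention mechanism scores each hidden state of O_KA with an affine map and returns the attention-weighted sum of the states (rather than [CLS] pooling). The sketch below is an assumption-laden illustration, not the patented implementation:

```python
import numpy as np

def hidden_attention(o_ka, w_ha, b_ha):
    """One reading of formula (19): score each hidden state of O_KA with an
    affine map W_HA·O_KA + b_HA, then return the softmax-weighted sum of
    the hidden state vectors."""
    scores = o_ka @ w_ha + b_ha        # one score per hidden state vector
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()        # attention weights over positions
    return alpha @ o_ka                # weighted sum of hidden state vectors
```

The output is a convex combination of the rows of O_KA, so each component stays within the column-wise range of the inputs.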
S8: and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
At the end of the model, we use the Softmax layer to predict the emotion polarity, with Y denoting the final emotion polarity:
Y = Softmax(W · O_HA + b) (20)
where P represents the number of classification classes, Y represents the model's prediction result, and O_HA represents the output of the hidden attention layer, i.e. the input of the output layer.
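A softmax output layer of this kind can be sketched as follows; the parameter names are hypothetical, and P is the number of polarity classes:

```python
import numpy as np

def predict_polarity(o_ha, w_out, b_out):
    """Output layer: logits over the P polarity classes, then Softmax;
    Y is the index of the predicted emotion polarity."""
    logits = o_ha @ w_out + b_out
    probs = np.exp(logits - logits.max())
    probs = probs / probs.sum()        # softmax: probabilities sum to 1
    return int(np.argmax(probs)), probs
```

The returned probabilities sum to 1, and the predicted class is the argmax.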
Corresponding to the emotion classification method using the target knowledge enhancement model, the embodiment of the application also provides an emotion classification device using the target knowledge enhancement model.
As shown in fig. 6, the emotion classification apparatus using the target knowledge enhancement model includes:
the model building module is used for building a target knowledge enhancement model, and the target knowledge enhancement model comprises an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
the knowledge sentence acquisition module is used for acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
the input module is used for splicing the context and the knowledge sentences to obtain an output sequence of an input layer;
the embedding module is used for mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
the multi-head self-attention module is used for calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
the knowledge attention module is used for obtaining a weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting an output sequence of a knowledge attention layer;
a hidden state vector acquisition module, configured to extract a hidden state vector of the output sequence of the knowledge attention layer to obtain an output sequence of a hidden layer;
and the output module is used for obtaining the classification result of the context.
Preferably, the hidden state vector acquisition module comprises a hidden attention unit for calculating a weighted sum of the hidden state vectors for each word of the output sequence of the knowledge attention layer based on a hidden attention mechanism.
Preferably, the multi-head self-attention module comprises a weight unit for calculating a weight between words in each self-attention head.
Preferably, the knowledge attention module comprises a knowledge sentence weighting unit for calculating the weighted multi-head output sequence of the knowledge sentence.
Preferably, the knowledge sentence acquisition module includes:
an entity knowledge sentence acquisition unit, configured to acquire an entity knowledge sentence corresponding to each target word from a database;
the decomposition unit is used for dividing the target word, when it is a compound word, into a plurality of sub-target words;
and the data preprocessing unit is used for preprocessing the data of the entity knowledge sentence and deleting the noise data, to obtain the knowledge sentence of the context to be predicted.
Preferably, the data preprocessing unit includes:
a cutting subunit, configured to cut the entity knowledge sentence whose length exceeds a first threshold;
a first deletion subunit, configured to delete content exceeding a first threshold length;
and the second deleting subunit is used for deleting English letters and nonsense characters appearing in the entity knowledge sentence.
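The truncation and deletion subunits above admit a simple sketch. The threshold value, the function name and the exact definition of "meaningless characters" are assumptions, since the description does not fix them; here anything outside the CJK range and common Chinese punctuation is treated as noise.

```python
import re

MAX_LEN = 100  # assumed first threshold; the patent does not fix a value

def preprocess_knowledge(sentence: str, max_len: int = MAX_LEN) -> str:
    """Clean one entity-knowledge sentence:
    1) cut anything beyond the length threshold;
    2) delete English letters and meaningless characters
       (here assumed to be anything that is not a CJK character
       or basic Chinese punctuation).
    """
    sentence = sentence[:max_len]                    # cutting subunit
    sentence = re.sub(r"[A-Za-z]", "", sentence)     # delete English letters
    sentence = re.sub(r"[^\u4e00-\u9fff，。、；：！？]", "", sentence)  # keep CJK + punctuation
    return sentence
```

For example, under these assumptions `preprocess_knowledge("苹果Apple是一种水果123")` returns `"苹果是一种水果"`.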
Preferably, the output module includes a softmax unit, configured to process the output sequence of the hidden layer using a softmax function to obtain the emotion polarity prediction result of the context.
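A minimal sketch of the softmax unit above. Pooling the hidden-layer output sequence to a single vector and the linear projection to class logits are assumptions; the patent only specifies that a softmax function produces the emotion polarity prediction.

```python
import numpy as np

def classify(hidden, W, b):
    """Map a pooled hidden-layer vector to emotion-polarity probabilities.

    hidden: (d,) pooled hidden vector; W: (d, n_classes); b: (n_classes,).
    Returns the probability distribution and the predicted class index.
    """
    logits = hidden @ W + b
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    probs = exp / exp.sum()
    return probs, int(np.argmax(probs))
```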
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above-mentioned embodiments express only several embodiments of the present invention, and their description is specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention.
Claims (6)
1. An emotion classification method using a target knowledge enhancement model is characterized by comprising the following steps:
constructing a target knowledge enhancement model, wherein the target knowledge enhancement model comprises an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer;
inputting the output sequence of the input layer into the embedding layer, and mapping each word or token to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context; wherein the weight between words in each self-attention head is calculated according to the following formula:
W_C = softmax(QK^T / √d_k)

wherein W_C is the word-to-word weight in each self-attention head; Q, K and V are three different vectorized representations of each word in the context sentence; and d_k is the dimension of the vector K;
inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer; the weighted multi-head output sequence of the knowledge sentence is calculated according to the following formula:
W_character = Sum(W_C)

wherein W_C represents the word-to-word weight, and W_character represents the weight of each word relative to the context sentence; L_0 represents the position of the first word of the target word, and, when the target word is a compound word, L_i represents the position of the last word of the i-th sub-target word; the weight of the i-th sub-target word relative to the context sentence, taken over these positions, is the knowledge attention weight; applying these weights yields the weighted multi-head output sequence of the knowledge sentence, and O_KA represents the output sequence of the knowledge attention layer;
inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
2. The emotion classification method using target knowledge enhancement model according to claim 1, wherein the hidden layer is a hidden attention layer, and the extracting hidden state vectors of the output sequence of the knowledge attention layer comprises:
a weighted sum of hidden state vectors for each word of the output sequence of knowledge attention layers is calculated based on a hidden attention mechanism.
3. The emotion classification method using a target knowledge enhancement model as claimed in claim 1, wherein the obtaining of the knowledge sentence corresponding to each target word according to the target word in the context to be predicted comprises:
acquiring an entity knowledge sentence corresponding to each target word from a database;
when the target word is a compound word, dividing the compound word into a plurality of sub-target words, and acquiring an entity knowledge sentence of each sub-target word;
and preprocessing the data of the entity knowledge sentence and deleting noise data, to obtain the knowledge sentence of the context to be predicted.
4. The emotion classification method using a target knowledge enhancement model, as claimed in claim 3, wherein the step of performing data preprocessing on the entity knowledge sentence to remove noise data comprises:
cutting the entity knowledge sentence whose length exceeds a first threshold, and deleting the content beyond the first threshold length;
and deleting English letters and nonsense characters appearing in the entity knowledge sentence.
5. The emotion classification method using a target knowledge enhancement model as claimed in claim 1, wherein:
the embedding layer is a multi-layer bidirectional transformer encoder comprising a plurality of transformer blocks and a plurality of self-attention heads; the embedding layer is used to map each word or token to a vector space.
6. The emotion classification method using target knowledge enhancement model as claimed in claim 1, wherein the step of inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context comprises:
and processing the output sequence of the hidden layer by using a softmax function to obtain an emotion polarity prediction result of the context.
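For illustration, one possible reading of the weighting formulas in claim 1 is sketched below. The summation axis and the per-span aggregation (a mean over each sub-target word's positions) are assumptions, since the claim specifies only W_character = Sum(W_C); the function name and inputs are likewise hypothetical.

```python
import numpy as np

def target_word_weights(W_C, boundaries):
    """Weight of each (sub-)target word relative to the context sentence.

    W_C: (n, n) word-to-word attention weight matrix.
    boundaries: [L0, L1, ..., Lk] word positions delimiting the target
    word and its sub-target words, as in claim 1.
    """
    W_character = W_C.sum(axis=0)    # W_character = Sum(W_C): per-word weight
    # aggregate the per-word weights inside each sub-target word span
    return [float(W_character[boundaries[i]:boundaries[i + 1] + 1].mean())
            for i in range(len(boundaries) - 1)]
```

Under this reading, each returned value plays the role of a knowledge attention weight used to scale the corresponding knowledge sentence's multi-head output sequence.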
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110605317.8A CN113220887B (en) | 2021-05-31 | 2021-05-31 | Emotion classification method using target knowledge enhancement model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113220887A CN113220887A (en) | 2021-08-06 |
CN113220887B true CN113220887B (en) | 2022-03-15 |
Family
ID=77081927
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110605317.8A Active CN113220887B (en) | 2021-05-31 | 2021-05-31 | Emotion classification method using target knowledge enhancement model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113220887B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115422362B (en) * | 2022-10-09 | 2023-10-31 | 郑州数智技术研究院有限公司 | Text matching method based on artificial intelligence |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110489554A (en) * | 2019-08-15 | 2019-11-22 | 昆明理工大学 | Property level sensibility classification method based on the mutual attention network model of location aware |
CN111581966A (en) * | 2020-04-30 | 2020-08-25 | 华南师范大学 | Context feature fusion aspect level emotion classification method and device |
CN111666758A (en) * | 2020-04-15 | 2020-09-15 | 中国科学院深圳先进技术研究院 | Chinese word segmentation method, training device and computer readable storage medium |
CN112182227A (en) * | 2020-10-22 | 2021-01-05 | 福州大学 | Text emotion classification system and method based on transD knowledge graph embedding |
CN112417098A (en) * | 2020-11-20 | 2021-02-26 | 南京邮电大学 | Short text emotion classification method based on CNN-BiMGU model |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108717406B (en) * | 2018-05-10 | 2021-08-24 | 平安科技(深圳)有限公司 | Text emotion analysis method and device and storage medium |
CN110069778B (en) * | 2019-04-18 | 2023-06-02 | 东华大学 | Commodity emotion analysis method for Chinese merged embedded word position perception |
CN110427490B (en) * | 2019-07-03 | 2021-11-09 | 华中科技大学 | Emotional dialogue generation method and device based on self-attention mechanism |
CN111259142B (en) * | 2020-01-14 | 2020-12-25 | 华南师范大学 | Specific target emotion classification method based on attention coding and graph convolution network |
CN112131383B (en) * | 2020-08-26 | 2021-05-18 | 华南师范大学 | Specific target emotion polarity classification method |
CN112100388B (en) * | 2020-11-18 | 2021-02-23 | 南京华苏科技有限公司 | Method for analyzing emotional polarity of long text news public sentiment |
CN112597299A (en) * | 2020-12-07 | 2021-04-02 | 深圳价值在线信息科技股份有限公司 | Text entity classification method and device, terminal equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110609891B (en) | Visual dialog generation method based on context awareness graph neural network | |
CN108363753B (en) | Comment text emotion classification model training and emotion classification method, device and equipment | |
Su et al. | Learning chinese word representations from glyphs of characters | |
CN110647612A (en) | Visual conversation generation method based on double-visual attention network | |
CN110287323B (en) | Target-oriented emotion classification method | |
WO2019080863A1 (en) | Text sentiment classification method, storage medium and computer | |
CN108984530A (en) | A kind of detection method and detection system of network sensitive content | |
CN110704621A (en) | Text processing method and device, storage medium and electronic equipment | |
CN110414009B (en) | Burma bilingual parallel sentence pair extraction method and device based on BilSTM-CNN | |
CN108319666A (en) | A kind of electric service appraisal procedure based on multi-modal the analysis of public opinion | |
CN108647191B (en) | Sentiment dictionary construction method based on supervised sentiment text and word vector | |
CN111966812B (en) | Automatic question answering method based on dynamic word vector and storage medium | |
Wshah et al. | Statistical script independent word spotting in offline handwritten documents | |
CN105551485B (en) | Voice file retrieval method and system | |
CN113505200B (en) | Sentence-level Chinese event detection method combined with document key information | |
CN112784696A (en) | Lip language identification method, device, equipment and storage medium based on image identification | |
CN111950283B (en) | Chinese word segmentation and named entity recognition system for large-scale medical text mining | |
Alsharid et al. | Captioning ultrasound images automatically | |
CN113128203A (en) | Attention mechanism-based relationship extraction method, system, equipment and storage medium | |
CN111311364B (en) | Commodity recommendation method and system based on multi-mode commodity comment analysis | |
CN112069312A (en) | Text classification method based on entity recognition and electronic device | |
CN111159405B (en) | Irony detection method based on background knowledge | |
CN113704396A (en) | Short text classification method, device, equipment and storage medium | |
CN113220887B (en) | Emotion classification method using target knowledge enhancement model | |
CN111274786A (en) | Automatic sentencing method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||