CN113220887B - Emotion classification method using target knowledge enhancement model - Google Patents

Emotion classification method using target knowledge enhancement model Download PDF

Info

Publication number
CN113220887B
CN113220887B CN202110605317.8A CN202110605317A CN113220887B CN 113220887 B CN113220887 B CN 113220887B CN 202110605317 A CN202110605317 A CN 202110605317A CN 113220887 B CN113220887 B CN 113220887B
Authority
CN
China
Prior art keywords
knowledge
word
layer
sentence
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110605317.8A
Other languages
Chinese (zh)
Other versions
CN113220887A (en
Inventor
曾碧卿
杨健豪
陈嘉涛
邓会敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GUANGDONG AIB POLYTECHNIC COLLEGE
South China Normal University
Original Assignee
GUANGDONG AIB POLYTECHNIC COLLEGE
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GUANGDONG AIB POLYTECHNIC COLLEGE, South China Normal University filed Critical GUANGDONG AIB POLYTECHNIC COLLEGE
Priority to CN202110605317.8A priority Critical patent/CN113220887B/en
Publication of CN113220887A publication Critical patent/CN113220887A/en
Application granted granted Critical
Publication of CN113220887B publication Critical patent/CN113220887B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to an emotion classification method for enhancing a model by utilizing target knowledge. The emotion classification method utilizing the target knowledge enhancement model comprises the following steps: constructing a target knowledge enhancement model, wherein the target knowledge enhancement model comprises an input layer, an embedded layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer; acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted; and inputting the context and each knowledge sentence into the target knowledge enhancement model to obtain the emotion classification result of the context. The emotion classification method utilizing the target knowledge enhancement model, disclosed by the invention, has the advantages that the external knowledge is introduced into the target words in the Chinese comment text, the knowledge attention mechanism is provided, the weight is dynamically distributed to the introduced external knowledge, the defect of insufficient information content of the target words is overcome, and the classification accuracy is high.

Description

Emotion classification method using target knowledge enhancement model
Technical Field
The invention relates to the technical field of emotion classification, in particular to an emotion classification method for enhancing a model by utilizing target knowledge.
Background
Traditional aspect level emotion classification methods typically employ machine learning methods, such as Support Vector Machines (SVMs), that require artificial features, including syntactic analysis features and dictionary features.
Compared with the traditional machine learning method, the neural network can automatically acquire important semantic information from the text, so that a large amount of manual operation is avoided. Therefore, it is widely applied to aspect-level emotion classification. Neural networks are capable of automatically capturing important emotional features from text and are therefore widely used in aspect-level emotion classification.
Tang et al use two long and short memory networks (LSTMs) to capture emotional features before and after an aspect. Wang et al propose an LSTM with attention mechanism for capturing emotional features in context. Ma et al propose two attention-based LSTM networks for interactively generating sentences and body representations and concatenating these representations for prediction. Chen et al use a network of Gated Recursive Units (GRUs) to integrate the hidden states of the LSTMs. Zeng et al utilizes a local contextual attention mechanism to capture local contextual features in comment text.
Wagner et al trained a support vector machine from multiple external emotion dictionaries for emotion classification tasks. kirithenko et al create a domain-specific emotion dictionary by manual writing and processing, providing additional domain emotion knowledge for support vector machines. Teng et al calculate the polarity of each emotion word using the emotion dictionary and calculate the weight of the emotion words using LSTM. And finally, predicting the emotion polarity of the sentence by using the weighted sum of the emotion words. Yang et al propose a humanoid hierarchical strategy based on aspect-based sentiment classification. Chen et al uses a self-built domain emotion knowledge map as side information to measure the emotion polarity between an emotion word and a target word.
The existing method can well utilize auxiliary information related to emotional words and improve the performance of the model. However, the existing method only focuses on emotion information, and ignores important semantic information contained in the target word. In addition, the conventional emotion dictionary and emotion recognition map are used as auxiliary information and need to be manually constructed for a specific data set.
Disclosure of Invention
Based on the above, the invention aims to provide an emotion classification method using a target knowledge enhancement model, which introduces external knowledge into a target word in a Chinese comment text by constructing the target knowledge enhancement model, provides a knowledge attention mechanism, dynamically allocates weight to the introduced external knowledge, overcomes the defect of insufficient information content of the target word, and has the advantage of high classification accuracy.
In a first aspect, the invention provides an emotion classification method using a target knowledge enhancement model, comprising the following steps:
constructing a target knowledge enhancement model, wherein the target knowledge enhancement model comprises an input layer, an embedded layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer;
inputting the output sequence of the input layer into the embedding layer, and mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer;
inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
The emotion classification method utilizing the target knowledge enhancement model disclosed by the invention is used for constructing the target knowledge enhancement model, introducing external knowledge into a target word in a Chinese comment text, proposing a knowledge attention mechanism, dynamically distributing weight to the introduced external knowledge, effectively and automatically identifying fine-grained emotion in the Chinese short comment text, overcoming the defect of insufficient information content of the target word and simultaneously increasing the semantic feature extraction capability of the model.
Further, the hidden layer is a hidden attention layer, and the extracting a hidden state vector of an output sequence of the knowledge attention layer includes:
a weighted sum of hidden state vectors for each word of the output sequence of knowledge attention layers is calculated based on a hidden attention mechanism. Further, obtaining a weight for each word in the context includes:
Qi=OBERT·Wi q
Ki=OBERT·Wi k
Vi=OBERT·Wi v
Figure GDA0003466621830000021
wherein, Wi CFor each word-to-word weight in the self-attention header, Q, K, V are three different vectorized representations of each word in the context sentence.
Further, obtaining a multi-headed output sequence of weighted knowledge sentences, comprising:
Figure GDA0003466621830000031
Wcharacter=Sum(WC)
Figure GDA0003466621830000032
Figure GDA0003466621830000033
Figure GDA0003466621830000034
wherein, WCRepresenting the weight from word to word, WcharacterRepresenting the weight of a word relative to a context sentence, L0A position of a first word representing a target word; when the target word is a compound word, LiThe position of the last word of the ith sub-target word is represented; wi SingleRepresenting the weight of the ith word target word relative to the context sentence, i.e. Wi SingleRepresenting knowledge attention weights;
Figure GDA0003466621830000035
a multi-headed output sequence, O, representing the weighted knowledge sentenceKAAn output sequence representing the knowledge attention layer.
Further, the acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted includes:
acquiring an entity knowledge sentence corresponding to each target word from a database;
when the target word is a compound word, dividing the compound word into a plurality of sub-target words, and acquiring an entity knowledge sentence of each sub-target word;
and preprocessing the data of the entity knowledge sentence, deleting noise data, and obtaining the knowledge sentence of the context to be detected.
Further, the data preprocessing is carried out on the entity knowledge sentence, and noise data are deleted, and the method comprises the following steps:
cutting the entity knowledge of which the sentence length exceeds a first threshold value, and deleting the content of which the sentence length exceeds the first threshold value;
and deleting English letters and nonsense characters appearing in the entity knowledge sentence.
Furthermore, the embedded layer is a multilayer bidirectional transformer encoder and comprises a plurality of transformer banks and a plurality of self-attention heads; the embedding layer is used to map each word or token to a vector space.
Further, the inputting of the output sequence of the hidden layer into the output layer to obtain the classification result of the context includes:
and processing the output sequence of the hidden layer by using a softmax function to obtain an emotion polarity prediction result of the context.
In a second aspect, the present invention provides an emotion classification apparatus using a target knowledge enhancement model, including:
the model building module is used for building a target knowledge enhancement model, and the target knowledge enhancement model comprises an input layer, an embedded layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
the knowledge sentence acquisition module is used for acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
the input module is used for splicing the context and the knowledge sentences to obtain an output sequence of an input layer;
the embedding module is used for mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
the multi-head self-attention module is used for calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
the knowledge attention module is used for obtaining a weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting an output sequence of a knowledge attention layer;
a hidden state vector acquisition module, configured to extract a hidden state vector of the output sequence of the knowledge attention layer to obtain an output sequence of a hidden layer;
and the output module is used for obtaining the classification result of the context.
For a better understanding and practice, the invention is described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a diagram illustrating the steps of a sentiment classification method using a target knowledge enhancement model according to the present invention;
FIG. 2 is a schematic diagram of a structure of a target knowledge enhancement model constructed by the present invention;
FIG. 3 is a schematic diagram of an input sequence of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 4 is an example of the concatenation of knowledge sentences of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 5 is a simulation diagram of the knowledge attention mechanism of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 6 is a schematic diagram of an emotion classification apparatus using a target knowledge enhancement model according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that the embodiments described are only some embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the embodiments in the present application.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the present application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims. In the description of the present application, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not necessarily used to describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
As shown in fig. 1, fig. 1 is a step diagram of an emotion classification method using a target knowledge enhancement model provided by the present invention, including the following steps:
s1: the target knowledge enhancement model is constructed, as shown in fig. 2, fig. 2 is a schematic structural diagram of a target knowledge enhancement model constructed by the present invention, and the target knowledge enhancement model includes an input layer, an embedded layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer.
S2: and acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted.
The knowledge sentence is target enhanced knowledge corresponding to the target word extracted from the database by the target knowledge enhancement model. Preferably, the database is a Chinese knowledge map.
In one embodiment, the knowledge sentence is obtained as follows:
s21: obtaining relevant knowledge sentences of the target words from a Chinese knowledge map (https:// www.ownthink.com/docs/kg /): the target word requests URL (HTTPs:// api. ownthink. com/kg/ambiguous ═ ment _ name) by HTTP Get.
Request example: https:// api. ownthink. com/kg/ambiguous ═ operation, https:// api. ownthink. com/kg/ambiguous ═ design
S22: and carrying out data preprocessing on the acquired entity knowledge, and deleting noise data.
In one embodiment, the specific manner of data preprocessing is as follows:
s221: and cutting the knowledge sentence with the sentence length of more than 512 words, and deleting the content with the sentence length of more than 512 words.
S222: and deleting English letters and nonsense characters appearing in the entity knowledge sentence, such as: abs,%, and the like.
S3: and inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer.
The purpose of the input layer is to splice the context and the knowledge sentences as input, and provide more semantic information for the target knowledge enhancement model. Meanwhile, knowledge sentences are used as auxiliary sentences, and a single sentence classification task is converted into: sentence pair classification tasks (i.e., determining a classification relationship between two sentences) between a context and an auxiliary sentence.
Inspired by the BERT4TC model and the BERT-SPC model, the input sequence of the target knowledge enhancement model is similar to the BERT4TC model and the BERT-SPC model. The difference is that a plurality of auxiliary sentences are used ". "separate.
As shown in fig. 3, fig. 3 is an input sequence diagram of a target knowledge enhancement model constructed by the present invention, and the concatenation manner of context and knowledge sentences is as follows:
[ CLS ] context [ SEP ] knowledge sentence 1. Knowledge sentence 2. [ SEP ]
Where [ CLS ] represents the beginning of a sentence and [ SEP ] represents the segmentation and end of the sentence. Used between different knowledge sentences. "separate.
For the target knowledge enhancement model, the input sequence of the model consists of a sequence of contexts and a sequence of knowledge sentences, which enables the model to better learn the correlation between the contexts and the knowledge sentences. Let C be an input context sequence consisting of k characters, denoted C ═ C1,c2,...,ci,...,ckIn which c isiRepresenting the ith word. Let T be the input sequence of a knowledge sentence consisting of n characters, denoted T ═ T1,t2,...,tj,...,tk}。
Especially when the target word is a compound word, the target word is divided into a plurality of sub-target words, and a knowledge sentence of each sub-target word is extracted from the Chinese knowledge graph. We use ". "to separate a plurality of knowledge sentences. t is tjA knowledge sentence sequence representing the ith sub-target word.
As shown in fig. 4, fig. 4 is a multiple knowledge sentence splicing example of a target knowledge enhancement model constructed by the present invention, and therefore, the formula of the input sequence S of the input layer can be expressed as:
S={[CLS],C,[SEP],T,[SEP]} (1)
T={T1,[.],T2,[.],…,Ti,[.]} (2)
where [ CLS ] represents the beginning of a sentence and [ SEP ] represents the segmentation and end of the sentence. Used between different knowledge sentences. "separate.
S4: and inputting the output sequence of the input layer into the embedded layer, and mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence.
The target knowledge enhancement model employs a BERT model as an embedding layer. The architecture of BERT is a multi-level bidirectional transformer encoder, comprising 12 transformer banks and 12 self-attention heads.
The Bert layer (embedding layer) may map each word or token to a vector space. Wherein, the words refer to common Chinese words and have self semantics; the marks refer to [ CLS ], [ SEP ], and the like, do not contain semantics, and only play a role in segmenting sentences.
For an input consisting of a sequence of context C and a sequence of knowledge sentences T, the embedding layer processes it as:
Figure GDA0003466621830000071
Figure GDA0003466621830000072
wherein
Figure GDA0003466621830000073
Is the output representation of the context sequence,
Figure GDA0003466621830000074
is ithThe output of the knowledge sentence, BERT, represents the embedding layer. Ti refers to the ith knowledge sentence; preferably, 1 ≦ i ≦ 4, i.e., the maximum value of i is 4, indicating that the maximum number of knowledge sentences the model introduces from the outside is 4. If the value exceeds 4, the introduced external data is too much, so that the model efficiency is slow and the data is noisy.
S5: inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context.
Based on the Self-Attention mechanism, MHSA (Multi-head Self-Attention) employs multiple Self-Attention mechanisms to compute the Attention score for each word in the context. Compared with Recurrent Neural Networks (RNNs) and LSTMs, the self-attention mechanism has the capability of parallel computation, so the latter is faster and more efficient.
Suppose that
Figure GDA0003466621830000075
And
Figure GDA0003466621830000076
is the output of the embedding layer. i denotes i of MHSAthA head. The formula of SDA (Scaled Dot Product) is as follows:
Qi=OBERT·Wi q (5)
Ki=OBERT·Wi k (6)
Vi=OBERT·Wi v (7)
Figure GDA0003466621830000077
where Q, K, V represent three different vectorized representations of each word in the context sentence, each attention head in the multi-head attention mechanism is computed by SDA. Suppose HiA representation of the characteristics of each of the outputs from the attention head, then
Hi=SDAi(OBERT)(1≤i≤h) (9)
MHSA(OBERT)=tanh({H1;…;Hh}·WMH) (10)
Figure GDA0003466621830000081
Figure GDA0003466621830000082
Wherein, "; "denotes the concatenation of vectors. The step adopts the tanh function as the nonlinear activation function of the MHSA encoder, and the feature extraction capability of the MHSA encoder is enhanced.
Figure GDA0003466621830000083
An output sequence of a multi-head attention layer is shown.
S6: and inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer.
In the MHSA calculation process, the weight from word to word in each self-attention head is represented as Wi C. In the MHSA layer, we are dealing with Wi CIs subjected to Softmax processing. However, at the knowledge attention level, we average Wi CThe first dimension of (a). Suppose WcharacterRepresenting the weight of a word relative to a context sentence, then we have
Figure GDA0003466621830000084
Figure GDA0003466621830000085
Wcharacter=Sum(WC) (15)
Wherein L is0A position of a first word representing a target word; when the target word is a compound word, LiThe position of the last word of the ith sub-target word is represented; wi SingleRepresenting the weight of the ith word target word relative to the context sentence, i.e. Wi SingleRepresenting knowledge attention weights; .
Output sequence O of knowledge attention layerKAThe calculation formula is as follows:
Figure GDA0003466621830000086
Figure GDA0003466621830000087
Figure GDA0003466621830000088
wherein the content of the first and second substances,
Figure GDA0003466621830000089
a multi-headed output sequence, O, representing the weighted knowledge sentenceKAAn output sequence representing the knowledge attention layer.
In a specific example, as shown in fig. 5, fig. 5 is a simulation diagram of the knowledge attention mechanism of a target knowledge enhancement model constructed by the present invention, wherein the darker the color, the higher the knowledge weight.
S7: inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
the hidden attention layer is positioned behind the knowledge attention layer, namely the output sequence O of the knowledge attention layerKAAs an input sequence to the hidden attention layer.
Most approaches tend to take the hidden state vector as the output of the knowledge attention layer by extracting it on the matching location of the first marker, i.e. [ CLS ]. However, in this patent, we use a hidden attention mechanism instead of a pooling layer. The hidden attention mechanism calculates a weighted sum of the hidden state vectors for each word.
Suppose OHARepresents the output of the hidden attention layer, OHAThe specific calculation formula of (2) is as follows:
OHA=WHA·OKA+bHA (19)
where W represents the weight and b represents the bias parameter.
Experimental results show that the hidden attention mechanism has better performance than pooling.
S8: and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
At the end of the model, we used the Softmax layer to predict the polarity of the emotion. Y represents the final emotional polarity.
Figure GDA0003466621830000091
Where P represents the number of classification classes, Y represents the result of model prediction, OHAThe output of the hidden attention layer, i.e. the input of the output layer, is represented.
Corresponding to the emotion classification method using the target knowledge enhancement model, the embodiment of the application also provides an emotion classification device using the target knowledge enhancement model.
As shown in fig. 6, the emotion classification apparatus using the target knowledge enhancement model includes:
the model building module is used for building a target knowledge enhancement model, and the target knowledge enhancement model comprises an input layer, an embedded layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
the knowledge sentence acquisition module is used for acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
the input module is used for splicing the context and the knowledge sentences to obtain an output sequence of an input layer;
the embedding module is used for mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
the multi-head self-attention module is used for calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
the knowledge attention module is used for obtaining a weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting an output sequence of a knowledge attention layer;
a hidden state vector acquisition module, configured to extract a hidden state vector of the output sequence of the knowledge attention layer to obtain an output sequence of a hidden layer;
and the output module is used for obtaining the classification result of the context.
Preferably, the hidden state vector acquisition module comprises a hidden attention unit for calculating a weighted sum of the hidden state vectors for each word of the output sequence of the knowledge attention layer based on a hidden attention mechanism.
Preferably, the multi-head self-attention module comprises a weight unit for calculating a weight between words in each self-attention head.
Preferably, the knowledge attention module comprises a knowledge sentence weighting unit for calculating a multi-headed output sequence of weighted knowledge sentences.
Preferably, the knowledge sentence acquisition module includes:
an entity knowledge sentence acquisition unit, configured to acquire an entity knowledge sentence corresponding to each target word from a database;
the decomposition unit is used for dividing the compound word into a plurality of sub-target words;
and the data preprocessing unit is used for preprocessing the data of the entity knowledge sentence, deleting the noise data and obtaining the knowledge sentence of the context to be detected.
Preferably, the data preprocessing unit includes:
a cutting subunit, configured to cut the entity knowledge whose sentence length exceeds a first threshold;
a first deletion subunit, configured to delete content exceeding a first threshold length;
and the second deleting subunit is used for deleting English letters and nonsense characters appearing in the entity knowledge sentence.
Preferably, the output module includes a softmax unit, configured to process the output sequence of the hidden layer using a softmax function to obtain the emotion polarity prediction result of the context.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention.

Claims (6)

1. An emotion classification method using a target knowledge enhancement model is characterized by comprising the following steps:
constructing a target knowledge enhancement model, wherein the target knowledge enhancement model comprises an input layer, an embedded layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer;
inputting the output sequence of the input layer into the embedding layer, and mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context; wherein the weight between words in each self-attention head is calculated according to the following formula:
Figure FDA0003466621820000011
Figure FDA0003466621820000012
Figure FDA0003466621820000013
Figure FDA0003466621820000014
wherein the content of the first and second substances,
Figure FDA0003466621820000015
for each word-to-word weight in the self-attention header, Q, K, V are three different vectorized representations of each word in the context sentence;
inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer; calculating a multi-head output sequence of the weighted knowledge sentence according to the following formula;
Figure FDA0003466621820000016
Wcharacter=Sum(WC)
Figure FDA0003466621820000017
Figure FDA0003466621820000018
Figure FDA0003466621820000021
wherein, WCRepresenting the weight from word to word, WcharacterRepresenting the weight of a word relative to a context sentence, L0A position of a first word representing a target word; when the target word is a compound word, LiThe position of the last word of the ith sub-target word is represented;
Figure FDA0003466621820000022
representing the weight of the ith word target word relative to the context sentence, i.e.
Figure FDA0003466621820000023
Representing knowledge attention weights;
Figure FDA0003466621820000024
a multi-headed output sequence, O, representing the weighted knowledge sentenceKAAn output sequence representing a knowledge attention layer;
inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
2. The emotion classification method using target knowledge enhancement model according to claim 1, wherein the hidden layer is a hidden attention layer, and the extracting hidden state vectors of the output sequence of the knowledge attention layer comprises:
a weighted sum of hidden state vectors for each word of the output sequence of knowledge attention layers is calculated based on a hidden attention mechanism.
3. The emotion classification method using a target knowledge enhancement model as claimed in claim 1, wherein the obtaining of the knowledge sentence corresponding to each target word according to the target word in the context to be predicted comprises:
acquiring an entity knowledge sentence corresponding to each target word from a database;
when the target word is a compound word, dividing the compound word into a plurality of sub-target words, and acquiring an entity knowledge sentence of each sub-target word;
and preprocessing the data of the entity knowledge sentence, deleting noise data, and obtaining the knowledge sentence of the context to be detected.
4. The emotion classification method using a target knowledge enhancement model, as claimed in claim 3, wherein the step of performing data preprocessing on the entity knowledge sentence to remove noise data comprises:
cutting the entity knowledge of which the sentence length exceeds a first threshold value, and deleting the content of which the sentence length exceeds the first threshold value;
and deleting English letters and nonsense characters appearing in the entity knowledge sentence.
5. The emotion classification method using a target knowledge enhancement model as claimed in claim 1, wherein:
the embedded layer is a multilayer bidirectional transformer encoder and comprises a plurality of transformer banks and a plurality of self-attention heads; the embedding layer is used to map each word or token to a vector space.
6. The emotion classification method using target knowledge enhancement model as claimed in claim 1, wherein the step of inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context comprises:
and processing the output sequence of the hidden layer by using a softmax function to obtain an emotion polarity prediction result of the context.
CN202110605317.8A 2021-05-31 2021-05-31 Emotion classification method using target knowledge enhancement model Active CN113220887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110605317.8A CN113220887B (en) 2021-05-31 2021-05-31 Emotion classification method using target knowledge enhancement model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110605317.8A CN113220887B (en) 2021-05-31 2021-05-31 Emotion classification method using target knowledge enhancement model

Publications (2)

Publication Number Publication Date
CN113220887A CN113220887A (en) 2021-08-06
CN113220887B true CN113220887B (en) 2022-03-15

Family

ID=77081927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110605317.8A Active CN113220887B (en) 2021-05-31 2021-05-31 Emotion classification method using target knowledge enhancement model

Country Status (1)

Country Link
CN (1) CN113220887B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115422362B (en) * 2022-10-09 2023-10-31 郑州数智技术研究院有限公司 Text matching method based on artificial intelligence

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110489554A (en) * 2019-08-15 2019-11-22 昆明理工大学 Property level sensibility classification method based on the mutual attention network model of location aware
CN111581966A (en) * 2020-04-30 2020-08-25 华南师范大学 Context feature fusion aspect level emotion classification method and device
CN111666758A (en) * 2020-04-15 2020-09-15 中国科学院深圳先进技术研究院 Chinese word segmentation method, training device and computer readable storage medium
CN112182227A (en) * 2020-10-22 2021-01-05 福州大学 Text emotion classification system and method based on transD knowledge graph embedding
CN112417098A (en) * 2020-11-20 2021-02-26 南京邮电大学 Short text emotion classification method based on CNN-BiMGU model

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108717406B (en) * 2018-05-10 2021-08-24 平安科技(深圳)有限公司 Text emotion analysis method and device and storage medium
CN110069778B (en) * 2019-04-18 2023-06-02 东华大学 Commodity emotion analysis method for Chinese merged embedded word position perception
CN110427490B (en) * 2019-07-03 2021-11-09 华中科技大学 Emotional dialogue generation method and device based on self-attention mechanism
CN111259142B (en) * 2020-01-14 2020-12-25 华南师范大学 Specific target emotion classification method based on attention coding and graph convolution network
CN112131383B (en) * 2020-08-26 2021-05-18 华南师范大学 Specific target emotion polarity classification method
CN112100388B (en) * 2020-11-18 2021-02-23 南京华苏科技有限公司 Method for analyzing emotional polarity of long text news public sentiment
CN112597299A (en) * 2020-12-07 2021-04-02 深圳价值在线信息科技股份有限公司 Text entity classification method and device, terminal equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110489554A (en) * 2019-08-15 2019-11-22 昆明理工大学 Property level sensibility classification method based on the mutual attention network model of location aware
CN111666758A (en) * 2020-04-15 2020-09-15 中国科学院深圳先进技术研究院 Chinese word segmentation method, training device and computer readable storage medium
CN111581966A (en) * 2020-04-30 2020-08-25 华南师范大学 Context feature fusion aspect level emotion classification method and device
CN112182227A (en) * 2020-10-22 2021-01-05 福州大学 Text emotion classification system and method based on transD knowledge graph embedding
CN112417098A (en) * 2020-11-20 2021-02-26 南京邮电大学 Short text emotion classification method based on CNN-BiMGU model

Also Published As

Publication number Publication date
CN113220887A (en) 2021-08-06

Similar Documents

Publication Publication Date Title
CN110609891B (en) Visual dialog generation method based on context awareness graph neural network
CN108363753B (en) Comment text emotion classification model training and emotion classification method, device and equipment
Su et al. Learning chinese word representations from glyphs of characters
CN110647612A (en) Visual conversation generation method based on double-visual attention network
CN110287323B (en) Target-oriented emotion classification method
WO2019080863A1 (en) Text sentiment classification method, storage medium and computer
CN108984530A (en) A kind of detection method and detection system of network sensitive content
CN110704621A (en) Text processing method and device, storage medium and electronic equipment
CN110414009B (en) Burma bilingual parallel sentence pair extraction method and device based on BilSTM-CNN
CN108319666A (en) A kind of electric service appraisal procedure based on multi-modal the analysis of public opinion
CN108647191B (en) Sentiment dictionary construction method based on supervised sentiment text and word vector
CN111966812B (en) Automatic question answering method based on dynamic word vector and storage medium
Wshah et al. Statistical script independent word spotting in offline handwritten documents
CN105551485B (en) Voice file retrieval method and system
CN113505200B (en) Sentence-level Chinese event detection method combined with document key information
CN112784696A (en) Lip language identification method, device, equipment and storage medium based on image identification
CN111950283B (en) Chinese word segmentation and named entity recognition system for large-scale medical text mining
Alsharid et al. Captioning ultrasound images automatically
CN113128203A (en) Attention mechanism-based relationship extraction method, system, equipment and storage medium
CN111311364B (en) Commodity recommendation method and system based on multi-mode commodity comment analysis
CN112069312A (en) Text classification method based on entity recognition and electronic device
CN111159405B (en) Irony detection method based on background knowledge
CN113704396A (en) Short text classification method, device, equipment and storage medium
CN113220887B (en) Emotion classification method using target knowledge enhancement model
CN111274786A (en) Automatic sentencing method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant