CN113220887B - Emotion classification method using target knowledge enhancement model - Google Patents

Emotion classification method using target knowledge enhancement model

Info

Publication number: CN113220887B
Application number: CN202110605317.8A
Authority: CN (China)
Prior art keywords: knowledge, word, layer, sentence, target
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN113220887A
Inventors: 曾碧卿, 杨健豪, 陈嘉涛, 邓会敏
Current assignee: GUANGDONG AIB POLYTECHNIC COLLEGE; South China Normal University
Original assignee: GUANGDONG AIB POLYTECHNIC COLLEGE; South China Normal University
Application filed by GUANGDONG AIB POLYTECHNIC COLLEGE and South China Normal University
Priority date / filing date: 2021-05-31
Publication of CN113220887A: 2021-08-06
Application granted; publication of CN113220887B: 2022-03-15

Classifications

    • G06F16/35: Information retrieval of unstructured textual data; clustering; classification
    • G06F18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F40/211: Handling natural language data; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/30: Handling natural language data; semantic analysis
    • G06N3/045: Computing arrangements based on biological models; neural networks; combinations of networks
    • G06N3/08: Computing arrangements based on biological models; neural networks; learning methods


Abstract

The invention relates to an emotion classification method using a target knowledge enhancement model. The method comprises the following steps: constructing a target knowledge enhancement model comprising an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer; acquiring a knowledge sentence corresponding to each target word in the context to be predicted; and inputting the context and each knowledge sentence into the target knowledge enhancement model to obtain the emotion classification result of the context. The method introduces external knowledge for the target words in Chinese comment text and proposes a knowledge attention mechanism that dynamically assigns weights to the introduced external knowledge, overcoming the insufficient information content of target words and achieving high classification accuracy.

Description

Emotion classification method using target knowledge enhancement model
Technical Field
The invention relates to the technical field of emotion classification, and in particular to an emotion classification method using a target knowledge enhancement model.
Background
Traditional aspect-level emotion classification methods typically employ machine learning models such as support vector machines (SVMs), which rely on hand-crafted features, including syntactic-analysis features and dictionary features.
Compared with traditional machine learning methods, neural networks can automatically capture important semantic and emotional features from text, avoiding a large amount of manual feature engineering; they are therefore widely applied to aspect-level emotion classification.
Tang et al. use two long short-term memory networks (LSTMs) to capture emotional features before and after an aspect. Wang et al. propose an LSTM with an attention mechanism for capturing emotional features in context. Ma et al. propose two attention-based LSTM networks that interactively generate sentence and target representations and concatenate these representations for prediction. Chen et al. use a gated recurrent unit (GRU) network to integrate the hidden states of the LSTMs. Zeng et al. utilize a local context attention mechanism to capture local contextual features in comment text.
Wagner et al. trained a support vector machine with multiple external emotion dictionaries for emotion classification tasks. Kiritchenko et al. created a domain-specific emotion dictionary by manual compilation and processing, providing additional domain emotion knowledge for support vector machines. Teng et al. calculate the polarity of each emotion word using an emotion dictionary and the weight of each emotion word using an LSTM, and finally predict the emotion polarity of the sentence from the weighted sum of the emotion words. Yang et al. propose a human-like hierarchical strategy for aspect-based sentiment classification. Chen et al. use a self-built domain emotion knowledge graph as side information to measure the emotion polarity between an emotion word and a target word.
These existing methods make good use of auxiliary information related to emotion words and improve model performance. However, they focus only on emotion information and ignore the important semantic information contained in the target word. In addition, the emotion dictionaries and emotion knowledge graphs used as auxiliary information must be manually constructed for each specific dataset.
Disclosure of Invention
Based on the above, the invention aims to provide an emotion classification method using a target knowledge enhancement model. By constructing the target knowledge enhancement model, the method introduces external knowledge for target words in Chinese comment text and proposes a knowledge attention mechanism that dynamically assigns weights to the introduced external knowledge, overcoming the insufficient information content of target words and achieving high classification accuracy.
In a first aspect, the invention provides an emotion classification method using a target knowledge enhancement model, comprising the following steps:
constructing a target knowledge enhancement model, wherein the target knowledge enhancement model comprises an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer;
inputting the output sequence of the input layer into the embedding layer, and mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer;
inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
The emotion classification method using the target knowledge enhancement model constructs the target knowledge enhancement model, introduces external knowledge for target words in Chinese comment text, and proposes a knowledge attention mechanism that dynamically assigns weights to the introduced external knowledge. It effectively and automatically identifies fine-grained emotion in short Chinese comment texts, overcomes the insufficient information content of target words, and increases the semantic feature extraction capability of the model.
Further, the hidden layer is a hidden attention layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer comprises:
calculating a weighted sum of the hidden state vectors of each word of the output sequence of the knowledge attention layer based on a hidden attention mechanism.
Further, obtaining the weight of each word in the context comprises:
Q_i = O_BERT · W_i^q
K_i = O_BERT · W_i^k
V_i = O_BERT · W_i^v
W_i^C = softmax(Q_i · K_i^T / √d_k)
wherein W_i^C is the word-to-word weight in each self-attention head, and Q, K, V are three different vectorized representations of each word in the context sentence.
Further, obtaining the weighted multi-head output sequence of the knowledge sentence comprises:
W^C = (1/h) · Σ_{i=1..h} W_i^C
W_character = Sum(W^C)
W_i^Single = Σ_{j=L_0..L_i} W_character(j)
O^KA_{T_i} = W_i^Single · O^MHSA_{T_i}
O_KA = {O^MHSA_C; O^KA_{T_1}; …; O^KA_{T_i}}
wherein W^C represents the word-to-word weight averaged over the self-attention heads; W_character represents the weight of each word relative to the context sentence; L_0 is the position of the first word of the target word; when the target word is a compound word, L_i is the position of the last word of the i-th sub-target word; W_i^Single represents the weight of the i-th sub-target word relative to the context sentence, i.e. the knowledge attention weight; O^KA_{T_i} represents the weighted multi-head output sequence of the knowledge sentence; and O_KA represents the output sequence of the knowledge attention layer.
Further, the acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted includes:
acquiring an entity knowledge sentence corresponding to each target word from a database;
when the target word is a compound word, dividing the compound word into a plurality of sub-target words, and acquiring an entity knowledge sentence of each sub-target word;
and performing data preprocessing on the entity knowledge sentences and deleting noise data to obtain the knowledge sentences of the context to be predicted.
Further, performing data preprocessing on the entity knowledge sentence and deleting noise data comprises the following steps:
cutting the entity knowledge of which the sentence length exceeds a first threshold value, and deleting the content of which the sentence length exceeds the first threshold value;
and deleting English letters and nonsense characters appearing in the entity knowledge sentence.
Furthermore, the embedding layer is a multi-layer bidirectional transformer encoder comprising a plurality of transformer blocks and a plurality of self-attention heads; the embedding layer is used to map each word or mark to a vector space.
Further, inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context comprises:
and processing the output sequence of the hidden layer by using a softmax function to obtain an emotion polarity prediction result of the context.
In a second aspect, the present invention provides an emotion classification apparatus using a target knowledge enhancement model, including:
the model building module is used for building a target knowledge enhancement model, and the target knowledge enhancement model comprises an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
the knowledge sentence acquisition module is used for acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
the input module is used for splicing the context and the knowledge sentences to obtain an output sequence of an input layer;
the embedding module is used for mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
the multi-head self-attention module is used for calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
the knowledge attention module is used for obtaining a weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting an output sequence of a knowledge attention layer;
a hidden state vector acquisition module, configured to extract a hidden state vector of the output sequence of the knowledge attention layer to obtain an output sequence of a hidden layer;
and the output module is used for obtaining the classification result of the context.
For a better understanding and practice, the invention is described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a diagram illustrating the steps of a sentiment classification method using a target knowledge enhancement model according to the present invention;
FIG. 2 is a schematic diagram of a structure of a target knowledge enhancement model constructed by the present invention;
FIG. 3 is a schematic diagram of an input sequence of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 4 is an example of the concatenation of knowledge sentences of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 5 is a simulation diagram of the knowledge attention mechanism of a target knowledge enhancement model constructed in accordance with the present invention;
FIG. 6 is a schematic diagram of an emotion classification apparatus using a target knowledge enhancement model according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that the embodiments described are only some embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the embodiments in the present application.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the present application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims. In the description of the present application, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not necessarily used to describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
As shown in fig. 1, fig. 1 is a step diagram of an emotion classification method using a target knowledge enhancement model provided by the present invention, including the following steps:
s1: the target knowledge enhancement model is constructed, as shown in fig. 2, fig. 2 is a schematic structural diagram of a target knowledge enhancement model constructed by the present invention, and the target knowledge enhancement model includes an input layer, an embedded layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer.
S2: and acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted.
The knowledge sentence is the target-enhanced knowledge corresponding to the target word, extracted from a database by the target knowledge enhancement model. Preferably, the database is a Chinese knowledge graph.
In one embodiment, the knowledge sentence is obtained as follows:
S21: Obtain the relevant knowledge sentences of the target words from a Chinese knowledge graph (https://www.ownthink.com/docs/kg/): the target word is requested by HTTP Get from the URL https://api.ownthink.com/kg/ambiguous?mention=<entity_name>.
Request examples: https://api.ownthink.com/kg/ambiguous?mention=operation, https://api.ownthink.com/kg/ambiguous?mention=design
S22: and carrying out data preprocessing on the acquired entity knowledge, and deleting noise data.
In one embodiment, the specific manner of data preprocessing is as follows:
S221: Cut knowledge sentences whose length exceeds 512 characters, deleting the content beyond 512 characters.
S222: Delete English letters and nonsense characters appearing in the entity knowledge sentence, such as "abs", "%", and the like.
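For illustration only, S21 through S222 can be sketched in Python as follows. The endpoint and "mention" parameter follow the request example given in S21, while the response field, function names and the exact noise-character set are assumptions, not part of the claimed method.

    import re
    import requests

    # Endpoint inferred from the request example in S21; the response layout
    # ("data" field) is an assumption for illustration.
    OWNTHINK_URL = "https://api.ownthink.com/kg/ambiguous"

    def fetch_knowledge_sentence(target_word: str) -> str:
        """HTTP Get the entity knowledge sentence of one (sub-)target word."""
        resp = requests.get(OWNTHINK_URL, params={"mention": target_word}, timeout=5)
        resp.raise_for_status()
        return str(resp.json().get("data", ""))

    def preprocess(sentence: str, max_len: int = 512) -> str:
        """S221: cut entity knowledge longer than 512 characters.
        S222: delete English letters and nonsense characters such as 'abs' or '%'."""
        sentence = sentence[:max_len]
        return re.sub(r"[A-Za-z%&#@^*_|\\]+", "", sentence)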
S3: and inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer.
The purpose of the input layer is to splice the context and the knowledge sentences as input, providing more semantic information for the target knowledge enhancement model. Meanwhile, with the knowledge sentences serving as auxiliary sentences, the single-sentence classification task is converted into a sentence-pair classification task between the context and the auxiliary sentences (i.e., determining the classification relationship between two sentences).
Inspired by the BERT4TC and BERT-SPC models, the input sequence of the target knowledge enhancement model is similar to theirs; the difference is that the multiple auxiliary sentences are separated by "。".
As shown in fig. 3, fig. 3 is an input sequence diagram of a target knowledge enhancement model constructed by the present invention, and the concatenation manner of context and knowledge sentences is as follows:
[CLS] context [SEP] knowledge sentence 1 。 knowledge sentence 2 。 [SEP]
where [CLS] represents the beginning of a sentence and [SEP] represents the segmentation and end of a sentence; different knowledge sentences are separated by "。".
For the target knowledge enhancement model, the input sequence consists of a context sequence and a knowledge sentence sequence, which enables the model to better learn the correlation between the context and the knowledge sentences. Let C be an input context sequence consisting of k characters, denoted C = {c_1, c_2, …, c_i, …, c_k}, where c_i represents the i-th character. Let T be the input sequence of a knowledge sentence consisting of n characters, denoted T = {t_1, t_2, …, t_j, …, t_n}.
In particular, when the target word is a compound word, it is divided into several sub-target words, and a knowledge sentence of each sub-target word is extracted from the Chinese knowledge graph. "。" is used to separate the multiple knowledge sentences. T_i denotes the knowledge sentence sequence of the i-th sub-target word.
As shown in fig. 4, fig. 4 is a multiple knowledge sentence splicing example of a target knowledge enhancement model constructed by the present invention, and therefore, the formula of the input sequence S of the input layer can be expressed as:
S = {[CLS], C, [SEP], T, [SEP]} (1)
T = {T_1, [。], T_2, [。], …, T_i, [。]} (2)
where [CLS] represents the beginning of a sentence and [SEP] represents the segmentation and end of a sentence; different knowledge sentences are separated by "。".
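For illustration, the splicing of formulas (1) and (2) can be sketched as follows; the function name and the stripping of trailing "。" are assumptions. The real input layer would emit token ids rather than a string; the string form below only mirrors fig. 3.

    def build_input_sequence(context: str, knowledge_sentences: list[str]) -> str:
        """Splice context and knowledge sentences per formulas (1)-(2):
        [CLS] context [SEP] knowledge 1 。 knowledge 2 。 ... [SEP]"""
        auxiliary = "".join(s.rstrip("。") + "。" for s in knowledge_sentences)
        return f"[CLS]{context}[SEP]{auxiliary}[SEP]"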
S4: and inputting the output sequence of the input layer into the embedded layer, and mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence.
The target knowledge enhancement model employs a BERT model as the embedding layer. The architecture of BERT is a multi-layer bidirectional transformer encoder comprising 12 transformer blocks and 12 self-attention heads.
The BERT layer (embedding layer) maps each word or mark to a vector space. Words are ordinary Chinese characters carrying their own semantics; marks such as [CLS] and [SEP] contain no semantics and only serve to segment sentences.
For an input consisting of a context sequence C and a knowledge sentence sequence T, the embedding layer processes it as:
O^C_BERT = BERT(C) (3)
O^{T_i}_BERT = BERT(T_i) (4)
where O^C_BERT is the output representation of the context sequence, O^{T_i}_BERT is the output of the i-th knowledge sentence T_i, and BERT represents the embedding layer. Preferably, 1 ≤ i ≤ 4, i.e., the model introduces at most four knowledge sentences from outside; beyond four, the excess external data slows the model and introduces noise.
S5: inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context.
Based on the self-attention mechanism, MHSA (multi-head self-attention) employs multiple self-attention heads to compute the attention score of each word in the context. Compared with recurrent neural networks (RNNs) and LSTMs, the self-attention mechanism supports parallel computation and is therefore faster and more efficient.
Suppose O^C_BERT and O^T_BERT are the outputs of the embedding layer, and let i denote the i-th head of the MHSA. The formulas of SDA (scaled dot-product attention) are as follows:
Q_i = O_BERT · W_i^q (5)
K_i = O_BERT · W_i^k (6)
V_i = O_BERT · W_i^v (7)
W_i^C = softmax(Q_i · K_i^T / √d_k) (8)
where Q, K, V represent three different vectorized representations of each word in the context sentence and W_i^C is the word-to-word weight of the i-th self-attention head; each attention head in the multi-head attention mechanism is computed by SDA. Suppose H_i represents the feature output of each self-attention head; then
H_i = SDA_i(O_BERT) = W_i^C · V_i (1 ≤ i ≤ h) (9)
MHSA(O_BERT) = tanh({H_1; …; H_h} · W^MH) (10)
O^C_MHSA = MHSA(O^C_BERT) (11)
O^T_MHSA = MHSA(O^T_BERT) (12)
where ";" denotes the concatenation of vectors. This step adopts the tanh function as the nonlinear activation of the MHSA encoder, which strengthens its feature extraction capability. O^C_MHSA and O^T_MHSA denote the multi-head output sequences of the context and of the knowledge sentences, i.e., the output of the multi-head self-attention layer.
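A minimal PyTorch sketch of formulas (5)-(12); the 768-dimensional, 12-head setting and the return of the per-head weights W_i^C (consumed by the knowledge attention layer in S6) are assumptions consistent with the description above.

    import math
    import torch
    import torch.nn as nn

    class MHSA(nn.Module):
        """Multi-head self-attention with a tanh output activation (formulas (5)-(12))."""
        def __init__(self, d_model: int = 768, heads: int = 12):
            super().__init__()
            self.h, self.d_k = heads, d_model // heads
            self.w_q = nn.Linear(d_model, d_model)   # W^q
            self.w_k = nn.Linear(d_model, d_model)   # W^k
            self.w_v = nn.Linear(d_model, d_model)   # W^v
            self.w_mh = nn.Linear(d_model, d_model)  # W^MH

        def forward(self, o_bert: torch.Tensor):
            b, n, _ = o_bert.shape
            split = lambda x: x.view(b, n, self.h, self.d_k).transpose(1, 2)
            q, k, v = split(self.w_q(o_bert)), split(self.w_k(o_bert)), split(self.w_v(o_bert))
            w_c = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(self.d_k), dim=-1)  # (8)
            heads = (w_c @ v).transpose(1, 2).reshape(b, n, -1)  # {H_1; ...; H_h}, (9)
            return torch.tanh(self.w_mh(heads)), w_c             # (10); W_i^C kept for S6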
S6: and inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer.
In the MHSA calculation, the word-to-word weight in each self-attention head is denoted W_i^C. In the MHSA layer, W_i^C is obtained by a Softmax operation; at the knowledge attention layer, however, the W_i^C are averaged and then summed over their first dimension. Suppose W_character represents the weight of each word relative to the context sentence; then
W^C = (1/h) · Σ_{i=1..h} W_i^C (13)
W_i^Single = Σ_{j=L_0..L_i} W_character(j) (14)
W_character = Sum(W^C) (15)
wherein L_0 is the position of the first word of the target word; when the target word is a compound word, L_i is the position of the last word of the i-th sub-target word; W_i^Single represents the weight of the i-th sub-target word relative to the context sentence, i.e. the knowledge attention weight.
The output sequence O_KA of the knowledge attention layer is calculated as:
O^KA_{T_i} = W_i^Single · O^MHSA_{T_i} (16)
O^KA_T = {O^KA_{T_1}; O^KA_{T_2}; …; O^KA_{T_i}} (17)
O_KA = {O^MHSA_C; O^KA_T} (18)
where O^KA_{T_i} represents the weighted multi-head output sequence of the i-th knowledge sentence and O_KA represents the output sequence of the knowledge attention layer.
In a specific example, as shown in fig. 5, fig. 5 is a simulation diagram of the knowledge attention mechanism of the target knowledge enhancement model constructed by the present invention, where a darker color indicates a higher knowledge weight.
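Under the reconstruction of formulas (13)-(18) above, the knowledge attention step can be sketched as follows; the bookkeeping of the positions L_0 and L_i and the axis conventions are assumptions.

    import torch

    def knowledge_attention(w_c, o_ctx, o_kno, l0: int, li: int):
        """w_c: per-head word-to-word weights from the MHSA, shape (1, h, n, n);
        o_ctx / o_kno: multi-head output of the context / of one knowledge sentence.
        Returns the spliced output O_KA of the knowledge attention layer."""
        w_mean = w_c.mean(dim=1)                                    # (13) average over heads
        w_character = w_mean.sum(dim=1)                             # (15) weight of each word
        w_single = w_character[:, l0:li + 1].sum(-1, keepdim=True)  # (14) knowledge weight
        o_kno_weighted = w_single.unsqueeze(-1) * o_kno             # (16) weight the sentence
        return torch.cat([o_ctx, o_kno_weighted], dim=1)            # (17)-(18) splice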
S7: inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
The hidden attention layer is located after the knowledge attention layer; that is, the output sequence O_KA of the knowledge attention layer serves as the input sequence of the hidden attention layer.
Most approaches take as output the hidden state vector extracted at the position of the first mark, i.e. [CLS]. In this patent, however, a hidden attention mechanism is used instead of a pooling layer; it calculates a weighted sum of the hidden state vectors of each word.
Suppose O_HA represents the output of the hidden attention layer; O_HA is calculated as:
O_HA = W_HA · O_KA + b_HA (19)
where W_HA represents the weight and b_HA the bias parameter.
Experimental results show that the hidden attention mechanism has better performance than pooling.
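A minimal sketch of the hidden attention layer under formula (19); reducing over the word dimension with a mean is an assumption about how the weighted sum is realized.

    import torch.nn as nn

    class HiddenAttention(nn.Module):
        """O_HA = W_HA · O_KA + b_HA (formula (19)): a learned weighted sum over
        the hidden state vectors of every word, replacing [CLS] pooling."""
        def __init__(self, d_model: int = 768):
            super().__init__()
            self.proj = nn.Linear(d_model, d_model)  # W_HA and b_HA

        def forward(self, o_ka):
            return self.proj(o_ka).mean(dim=1)       # aggregate over words (assumption)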
S8: and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
At the end of the model, a Softmax layer is used to predict the emotion polarity. Let Y denote the final emotion polarity:
Y = softmax(W_P · O_HA + b_P) (20)
where P represents the number of classification classes, Y represents the model prediction result, and O_HA represents the output of the hidden attention layer, i.e., the input of the output layer.
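A sketch of the output layer of formula (20); the three-class polarity (P = 3: positive, neutral, negative) is an assumption, since the patent only fixes P as the number of classes.

    import torch
    import torch.nn as nn

    class OutputLayer(nn.Module):
        """Y = softmax(W_P · O_HA + b_P): predict the emotion polarity (formula (20))."""
        def __init__(self, d_model: int = 768, num_classes: int = 3):
            super().__init__()
            self.classifier = nn.Linear(d_model, num_classes)  # W_P and b_P

        def forward(self, o_ha: torch.Tensor) -> torch.Tensor:
            return torch.softmax(self.classifier(o_ha), dim=-1)  # Y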
Corresponding to the emotion classification method using the target knowledge enhancement model, the embodiment of the application also provides an emotion classification device using the target knowledge enhancement model.
As shown in fig. 6, the emotion classification apparatus using the target knowledge enhancement model includes:
the model building module is used for building a target knowledge enhancement model, and the target knowledge enhancement model comprises an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
the knowledge sentence acquisition module is used for acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
the input module is used for splicing the context and the knowledge sentences to obtain an output sequence of an input layer;
the embedding module is used for mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
the multi-head self-attention module is used for calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context;
the knowledge attention module is used for obtaining a weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting an output sequence of a knowledge attention layer;
a hidden state vector acquisition module, configured to extract a hidden state vector of the output sequence of the knowledge attention layer to obtain an output sequence of a hidden layer;
and the output module is used for obtaining the classification result of the context.
Preferably, the hidden state vector acquisition module comprises a hidden attention unit for calculating a weighted sum of the hidden state vectors for each word of the output sequence of the knowledge attention layer based on a hidden attention mechanism.
Preferably, the multi-head self-attention module comprises a weight unit for calculating a weight between words in each self-attention head.
Preferably, the knowledge attention module comprises a knowledge sentence weighting unit for calculating a multi-headed output sequence of weighted knowledge sentences.
Preferably, the knowledge sentence acquisition module includes:
an entity knowledge sentence acquisition unit, configured to acquire an entity knowledge sentence corresponding to each target word from a database;
the decomposition unit is used for dividing the compound word into a plurality of sub-target words;
and the data preprocessing unit is used for preprocessing the data of the entity knowledge sentence and deleting noise data to obtain the knowledge sentences of the context to be predicted.
Preferably, the data preprocessing unit includes:
a cutting subunit, configured to cut the entity knowledge whose sentence length exceeds a first threshold;
a first deletion subunit, configured to delete content exceeding a first threshold length;
and the second deleting subunit is used for deleting English letters and nonsense characters appearing in the entity knowledge sentence.
Preferably, the output module includes a softmax unit, configured to process the output sequence of the hidden layer using a softmax function to obtain the emotion polarity prediction result of the context.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention.

Claims (6)

1. An emotion classification method using a target knowledge enhancement model is characterized by comprising the following steps:
constructing a target knowledge enhancement model, wherein the target knowledge enhancement model comprises an input layer, an embedding layer, a multi-head self-attention layer, a knowledge attention layer, a hidden layer and an output layer;
acquiring a knowledge sentence corresponding to each target word according to the target word in the context to be predicted;
inputting the context and each knowledge sentence into the input layer, and splicing the context and the knowledge sentences to obtain an output sequence of the input layer;
inputting the output sequence of the input layer into the embedding layer, and mapping each word or mark to a vector space to obtain the output vector representation of the context and the output vector representation of the knowledge sentence;
inputting the output vector representation of the context and the output vector representation of the knowledge sentence into the multi-head self-attention layer, calculating the attention scores of each word in the context and the knowledge sentence, obtaining a multi-head output sequence of the context and a multi-head output sequence of the knowledge sentence, and obtaining the weight of each word in the context; wherein the weight between words in each self-attention head is calculated according to the following formula:
Q_i = O_BERT · W_i^q
K_i = O_BERT · W_i^k
V_i = O_BERT · W_i^v
W_i^C = softmax(Q_i · K_i^T / √d_k)
wherein W_i^C is the word-to-word weight in each self-attention head, and Q, K, V are three different vectorized representations of each word in the context sentence;
inputting the weight of each word in the context and the multi-head output sequence of the knowledge sentence into the knowledge attention layer to obtain the weighted multi-head output sequence of the knowledge sentence, splicing the multi-head output sequence of the context and the weighted multi-head output sequence of the knowledge sentence, and outputting the output sequence of the knowledge attention layer; calculating a multi-head output sequence of the weighted knowledge sentence according to the following formula;
W^C = (1/h) · Σ_{i=1..h} W_i^C
W_character = Sum(W^C)
W_i^Single = Σ_{j=L_0..L_i} W_character(j)
O^KA_{T_i} = W_i^Single · O^MHSA_{T_i}
O_KA = {O^MHSA_C; O^KA_{T_1}; …; O^KA_{T_i}}
wherein W^C represents the word-to-word weight averaged over the self-attention heads; W_character represents the weight of each word relative to the context sentence; L_0 is the position of the first word of the target word; when the target word is a compound word, L_i is the position of the last word of the i-th sub-target word; W_i^Single represents the weight of the i-th sub-target word relative to the context sentence, i.e. the knowledge attention weight; O^KA_{T_i} represents the weighted multi-head output sequence of the knowledge sentence; and O_KA represents the output sequence of the knowledge attention layer;
inputting the output sequence of the knowledge attention layer into the hidden layer, and extracting the hidden state vector of the output sequence of the knowledge attention layer to obtain the output sequence of the hidden layer;
and inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context.
2. The emotion classification method using target knowledge enhancement model according to claim 1, wherein the hidden layer is a hidden attention layer, and the extracting hidden state vectors of the output sequence of the knowledge attention layer comprises:
a weighted sum of hidden state vectors for each word of the output sequence of knowledge attention layers is calculated based on a hidden attention mechanism.
3. The emotion classification method using a target knowledge enhancement model as claimed in claim 1, wherein the obtaining of the knowledge sentence corresponding to each target word according to the target word in the context to be predicted comprises:
acquiring an entity knowledge sentence corresponding to each target word from a database;
when the target word is a compound word, dividing the compound word into a plurality of sub-target words, and acquiring an entity knowledge sentence of each sub-target word;
and preprocessing the data of the entity knowledge sentence, deleting noise data, and obtaining the knowledge sentence of the context to be detected.
4. The emotion classification method using a target knowledge enhancement model, as claimed in claim 3, wherein the step of performing data preprocessing on the entity knowledge sentence to remove noise data comprises:
cutting the entity knowledge of which the sentence length exceeds a first threshold value, and deleting the content of which the sentence length exceeds the first threshold value;
and deleting English letters and nonsense characters appearing in the entity knowledge sentence.
5. The emotion classification method using a target knowledge enhancement model as claimed in claim 1, wherein:
the embedding layer is a multi-layer bidirectional transformer encoder comprising a plurality of transformer blocks and a plurality of self-attention heads; the embedding layer is used to map each word or mark to a vector space.
6. The emotion classification method using target knowledge enhancement model as claimed in claim 1, wherein the step of inputting the output sequence of the hidden layer into the output layer to obtain the classification result of the context comprises:
and processing the output sequence of the hidden layer by using a softmax function to obtain an emotion polarity prediction result of the context.
CN202110605317.8A 2021-05-31 2021-05-31 Emotion classification method using target knowledge enhancement model Active CN113220887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110605317.8A CN113220887B (en) 2021-05-31 2021-05-31 Emotion classification method using target knowledge enhancement model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110605317.8A CN113220887B (en) 2021-05-31 2021-05-31 Emotion classification method using target knowledge enhancement model

Publications (2)

Publication Number Publication Date
CN113220887A CN113220887A (en) 2021-08-06
CN113220887B true CN113220887B (en) 2022-03-15

Family

ID=77081927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110605317.8A Active CN113220887B (en) 2021-05-31 2021-05-31 Emotion classification method using target knowledge enhancement model

Country Status (1)

Country Link
CN (1) CN113220887B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115422362B (en) * 2022-10-09 2023-10-31 郑州数智技术研究院有限公司 Text matching method based on artificial intelligence

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110489554A (en) * 2019-08-15 2019-11-22 昆明理工大学 Property level sensibility classification method based on the mutual attention network model of location aware
CN111581966A (en) * 2020-04-30 2020-08-25 华南师范大学 Context feature fusion aspect level emotion classification method and device
CN111666758A (en) * 2020-04-15 2020-09-15 中国科学院深圳先进技术研究院 Chinese word segmentation method, training device and computer readable storage medium
CN112182227A (en) * 2020-10-22 2021-01-05 福州大学 Text emotion classification system and method based on transD knowledge graph embedding
CN112417098A (en) * 2020-11-20 2021-02-26 南京邮电大学 Short text emotion classification method based on CNN-BiMGU model

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108717406B (en) * 2018-05-10 2021-08-24 平安科技(深圳)有限公司 Text emotion analysis method and device and storage medium
CN110069778B (en) * 2019-04-18 2023-06-02 东华大学 Commodity emotion analysis method for Chinese merged embedded word position perception
CN110427490B (en) * 2019-07-03 2021-11-09 华中科技大学 Emotional dialogue generation method and device based on self-attention mechanism
CN111259142B (en) * 2020-01-14 2020-12-25 华南师范大学 Specific target emotion classification method based on attention coding and graph convolution network
CN112131383B (en) * 2020-08-26 2021-05-18 华南师范大学 Specific target emotion polarity classification method
CN112100388B (en) * 2020-11-18 2021-02-23 南京华苏科技有限公司 Method for analyzing emotional polarity of long text news public sentiment
CN112597299A (en) * 2020-12-07 2021-04-02 深圳价值在线信息科技股份有限公司 Text entity classification method and device, terminal equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110489554A (en) * 2019-08-15 2019-11-22 昆明理工大学 Property level sensibility classification method based on the mutual attention network model of location aware
CN111666758A (en) * 2020-04-15 2020-09-15 中国科学院深圳先进技术研究院 Chinese word segmentation method, training device and computer readable storage medium
CN111581966A (en) * 2020-04-30 2020-08-25 华南师范大学 Context feature fusion aspect level emotion classification method and device
CN112182227A (en) * 2020-10-22 2021-01-05 福州大学 Text emotion classification system and method based on transD knowledge graph embedding
CN112417098A (en) * 2020-11-20 2021-02-26 南京邮电大学 Short text emotion classification method based on CNN-BiMGU model

Also Published As

Publication number Publication date
CN113220887A (en) 2021-08-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant