CN110222178B - Text emotion classification method and device, electronic equipment and readable storage medium - Google Patents
- Publication number
- CN110222178B (application CN201910443387.0A)
- Authority
- CN
- China
- Prior art keywords
- target
- lstm
- vector
- layer
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The embodiments of the present application provide a text emotion classification method and apparatus, an electronic device, and a readable storage medium. The method comprises: acquiring target word vectors corresponding to target keywords in a target text to be classified; extracting syntactic features of the target word vectors through a first Bi-LSTM layer and outputting a target syntactic feature vector representing those features; extracting semantic features of the target syntactic feature vector through a second Bi-LSTM layer and outputting a target semantic feature vector representing those features; and determining, through the classification layer, the emotion category corresponding to the target text based on the target word vectors, the target syntactic feature vector and the target semantic feature vector. Because both the syntactic and the semantic features of the target keywords are extracted, the obtained information is more comprehensive and the inherent meaning of the target text is better captured, which effectively improves the accuracy of the classification result.
Description
Technical Field
The application relates to the technical field of language understanding, in particular to a text emotion classification method and device, electronic equipment and a readable storage medium.
Background
With the rapid development of internet technology, more and more users like to publish their views, attitudes and opinions on social platforms, generating a large amount of valuable user-authored text about trending events, products and the like. This text carries rich emotional color and emotional tendency; the purpose of emotion analysis is to automatically extract and classify the users' subjective emotional information from the text, so as to understand public opinion on a given event or product.
Currently, machine learning methods are generally adopted to identify the emotion category of a text: features are extracted from the text and fed into a classifier, which assigns the text to an emotion category based on those features. However, the information contained in the extracted features is often not comprehensive, so the accuracy of the classification result is low.
Disclosure of Invention
In view of the above, embodiments of the present application provide a text emotion classification method and apparatus, an electronic device and a readable storage medium, so as to solve the prior-art problem of low text emotion classification accuracy.
In a first aspect, an embodiment of the present application provides a text emotion classification method for performing emotion classification on a text through a bidirectional long short-term memory (Bi-LSTM) model, where the Bi-LSTM model includes a first Bi-LSTM layer, a second Bi-LSTM layer and a classification layer, and the method comprises: acquiring target word vectors corresponding to target keywords in a target text to be classified; inputting the target word vectors into the first Bi-LSTM layer, extracting syntactic features of the target word vectors through the first Bi-LSTM layer, and outputting a target syntactic feature vector representing the syntactic features, where the syntactic features represent the context information of the target keywords in the target text; extracting semantic features of the target syntactic feature vector through the second Bi-LSTM layer and outputting a target semantic feature vector representing the semantic features, where the semantic features represent the semantic information of the target keywords in the target text; and determining, through the classification layer, the emotion category corresponding to the target text based on the target word vectors, the target syntactic feature vector and the target semantic feature vector.
In this implementation, syntactic features are extracted by the first Bi-LSTM layer of the Bi-LSTM model and semantic features by the second Bi-LSTM layer; the outputs of the two layers, together with the input word vectors, are then fed into the classification layer, which determines the emotion category of the target text based on the syntactic feature vector, the semantic feature vector and the input word vectors. Since the extracted information covers both syntax and semantics, it is more comprehensive, which effectively improves the accuracy of the classification result.
Optionally, the determining, by the classification layer, an emotion category corresponding to the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector includes: weighting the target word vector, the target syntactic characteristic vector and the target semantic characteristic vector through the classification layer to obtain a weighted vector; predicting probability values of all emotion classes corresponding to the target text based on the weighted vectors through the classification layer; and determining the emotion types corresponding to the target text according to the probability values of the emotion types corresponding to the target text through the classification layer.
In this implementation, the target syntactic feature vector, the target semantic feature vector and the target word vector are weighted to obtain a weighted vector. During weighting, each vector can be multiplied by a different weight so as to distinguish the influence of the different vectors on emotion category prediction, so the probability value of each emotion category corresponding to the target text can be predicted more accurately when the emotion category is determined based on the weighted vector.
Optionally, the determining, by the classification layer, an emotion category corresponding to the target text according to the probability value of each emotion category corresponding to the target text includes: and determining the emotion category with the maximum probability value in the probability values of all emotion categories corresponding to the target text as the emotion category corresponding to the target text through the classification layer.
Optionally, the inputting the target word vector into the first Bi-LSTM layer, extracting syntactic features of the target word vector through the first Bi-LSTM layer, and outputting a target syntactic feature vector characterizing the syntactic features, includes:
calculating an output value of the forget gate through a sigmoid function based on the target word vector input at the current time and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time;
calculating an output value of the input gate through a sigmoid function based on the target word vector input at the current time and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time;
calculating a value of the temporary Bi-LSTM unit cell state through a tanh function based on the target word vector input at the current time and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time;
calculating the value of the Bi-LSTM unit cell state at the current time based on the output value of the forget gate, the output value of the input gate, the value of the temporary Bi-LSTM unit cell state and the value of the Bi-LSTM unit cell state at the previous time;
obtaining the output vector of the hidden state at the current time according to the hidden state value at the previous time, the target word vector input at the current time and the Bi-LSTM unit cell state value at the current time;
obtaining an output vector of a forward LSTM network in the first Bi-LSTM layer according to the output vector of the hidden state at each moment;
obtaining an output vector of a backward LSTM network in the first Bi-LSTM layer according to the output vector of the hidden state at each moment;
and splicing the output vector of the forward LSTM network in the first Bi-LSTM layer with the output vector of the backward LSTM network to obtain the target syntactic characteristic vector output by the first Bi-LSTM layer.
In this implementation, the above algorithm allows the first Bi-LSTM layer to effectively extract the syntactic features of the target keywords, yielding the target syntactic feature vector.
Optionally, before obtaining the word vector corresponding to the target keyword in the target text to be classified, the method further includes: acquiring a plurality of training texts, wherein each training text comprises training keywords, and each training text is marked with a corresponding emotion type; and taking the word vector corresponding to the training keyword as the input of the Bi-LSTM model, taking the emotion category marked by each training text as the output of the Bi-LSTM model, and training the Bi-LSTM model to obtain the trained Bi-LSTM model.
In the implementation process, the Bi-LSTM model is trained through a large number of training texts in advance, so that the trained Bi-LSTM model can accurately classify the emotion types of the texts.
Optionally, before obtaining the plurality of training texts, the method further includes: acquiring a plurality of original texts; performing data cleaning on the plurality of original texts to remove useless texts in the plurality of original texts and obtain a plurality of training texts; performing word segmentation processing on each training text to obtain training keywords corresponding to each training text; and carrying out emotion category labeling on the training keywords corresponding to each training text based on the emotion dictionary to obtain the emotion category corresponding to each training text.
In this implementation, data cleaning removes useless text from the original texts so that it does not interfere with the classification result during training; word segmentation is then performed on the training texts, and the training keywords are labeled with emotion categories based on the emotion dictionary, yielding the emotion category corresponding to each training text, which improves the training of the model.
Optionally, the training of the Bi-LSTM model includes: training the Bi-LSTM model based on a cross-entropy loss function, the training being complete when the value of the cross-entropy loss function is smaller than a preset value.
In this implementation, training the Bi-LSTM model with the cross-entropy loss function continuously optimizes the network parameters of the model, improving its classification performance.
In a second aspect, an embodiment of the present application provides a text emotion classification apparatus configured to perform emotion classification on a text through a bidirectional long short-term memory (Bi-LSTM) model, where the Bi-LSTM model includes: a first Bi-LSTM layer, a second Bi-LSTM layer, and a classification layer, the apparatus comprising:
the word vector acquisition module is used for acquiring a target word vector corresponding to a target keyword in a target text to be classified;
a syntactic feature extraction module, configured to input the target word vector into the first Bi-LSTM layer, extract syntactic features of the target word vector through the first Bi-LSTM layer, and output a target syntactic feature vector representing the syntactic features, where the syntactic features are used to represent context information of the target keyword in the target text;
a semantic feature extraction module, configured to extract a semantic feature of the target syntactic feature vector through the second Bi-LSTM layer, and output a target semantic feature vector representing the semantic feature, where the semantic feature is used to represent semantic information of the target keyword in the target text;
and the emotion category determining module is used for determining the emotion category corresponding to the target text based on the target word vector, the target syntactic characteristic vector and the target semantic characteristic vector through the classification layer.
Optionally, the emotion category determining module is specifically configured to:
weighting the target word vector, the target syntactic characteristic vector and the target semantic characteristic vector through the classification layer to obtain a weighted vector;
predicting probability values of all emotion classes corresponding to the target text based on the weighted vectors through the classification layer;
and determining the emotion types corresponding to the target text according to the probability values of the emotion types corresponding to the target text through the classification layer.
Optionally, the emotion category determining module is further specifically configured to determine, by the classification layer, an emotion category with a maximum probability value in the probability values of the emotion categories corresponding to the target text as the emotion category corresponding to the target text.
Optionally, the syntactic feature extraction module is specifically configured to:
calculating an output value of the forget gate through a sigmoid function based on the target word vector input at the current time and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time;
calculating an output value of the input gate through a sigmoid function based on the target word vector input at the current time and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time;
calculating a value of the temporary Bi-LSTM unit cell state through a tanh function based on the target word vector input at the current time and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time;
calculating the value of the Bi-LSTM unit cell state at the current time based on the output value of the forget gate, the output value of the input gate, the value of the temporary Bi-LSTM unit cell state and the value of the Bi-LSTM unit cell state at the previous time;
obtaining the output vector of the hidden state at the current time according to the hidden state value at the previous time, the target word vector input at the current time and the Bi-LSTM unit cell state value at the current time;
obtaining an output vector of a forward LSTM network in the first Bi-LSTM layer according to the output vector of the hidden state at each moment;
obtaining an output vector of a backward LSTM network in the first Bi-LSTM layer according to the output vector of the hidden state at each moment;
and splicing the output vector of the forward LSTM network in the first Bi-LSTM layer with the output vector of the backward LSTM network to obtain the target syntactic characteristic vector output by the first Bi-LSTM layer.
Optionally, the apparatus further comprises:
the training module is used for acquiring a plurality of training texts, each training text comprises training keywords, and each training text is marked with a corresponding emotion type; and taking the word vector corresponding to the training keyword as the input of the Bi-LSTM model, taking the emotion category marked by each training text as the output of the Bi-LSTM model, and training the Bi-LSTM model to obtain the trained Bi-LSTM model.
Optionally, the training module is further configured to:
acquiring a plurality of original texts;
performing data cleaning on the plurality of original texts to remove useless texts in the plurality of original texts and obtain a plurality of training texts;
performing word segmentation processing on each training text to obtain training keywords corresponding to each training text;
and carrying out emotion category labeling on the training keywords corresponding to each training text based on the emotion dictionary to obtain the emotion category corresponding to each training text.
Optionally, the training module is further configured to train the Bi-LSTM model based on a cross-entropy loss function, the training of the Bi-LSTM model being complete when the value of the cross-entropy loss function is smaller than a preset value.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the steps in the method as provided in the first aspect are executed.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, performs the steps in the method as provided in the first aspect.
Additional features and advantages of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a schematic structural diagram of a conventional Bi-LSTM model provided in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an improved Bi-LSTM model provided in an embodiment of the present application;
FIG. 3 is a flowchart of a text sentiment classification method according to an embodiment of the present application;
FIG. 4 is a schematic flowchart illustrating a process of performing emotion labeling on a training keyword by using a method based on an emotion dictionary according to an embodiment of the present application;
fig. 5 is a block diagram illustrating a structure of a text emotion classification apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Before describing the specific embodiments of the present application, a brief description will be given of an application scenario of the present application.
In the comparative embodiment, text emotion analysis methods generally use machine learning or deep learning algorithms to analyze the emotional tendency of a text, e.g. positive emotion or negative emotion. Typical deep learning algorithms include long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM) networks. When these algorithms are used for text emotion analysis, the prediction is made from the feature vector output by the last layer of neural units in the model. Although that feature vector contains the context information of the text, it does not contain much of its semantic information. Some words, for example, carry very different sentiment in different fields: in "the vehicle suspension is too hard" the word "hard" is negative, while in "the diamond is hard" it is positive. Emotion classification therefore requires a deep understanding of the intrinsic meaning of the whole sentence, and since the conventional Bi-LSTM model of the comparative example cannot capture that intrinsic meaning well, its emotion classification predictions are less accurate.
The conventional Bi-LSTM model differs from the LSTM model, whose hidden layer propagates in one direction only, as shown in fig. 1, which is a structural diagram of the Bi-LSTM model of the comparative example. The Bi-LSTM model comprises two independent hidden layers, a forward LSTM network and a backward LSTM network, whose propagation directions are opposite. For the same input data, two hidden-layer outputs, i.e. two feature vectors related to the input data, are therefore obtained; the Bi-LSTM model then splices (concatenates) or averages the two feature vectors into one vector and passes it to a fully-connected layer, which predicts the emotion category of the text based on that vector.
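As an illustration only, this conventional structure can be sketched with PyTorch's bidirectional LSTM, which produces the spliced forward and backward hidden outputs described above; all sizes here are illustrative assumptions rather than values fixed by the comparative example.

```python
# Minimal sketch of the conventional Bi-LSTM of Fig. 1 (illustrative sizes):
# one bidirectional LSTM whose forward and backward outputs are spliced
# before being handed to a fully-connected layer.
import torch
import torch.nn as nn

bilstm = nn.LSTM(input_size=300, hidden_size=150, bidirectional=True, batch_first=True)
out, _ = bilstm(torch.randn(1, 5, 300))  # 5 word vectors of dimension 300
print(out.shape)  # torch.Size([1, 5, 300]) -> 150 forward + 150 backward dims spliced
```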
The above-mentioned defects in the comparative example are findings the applicant obtained after practice and careful study; therefore, the discovery of the above problems and the solutions proposed for them by the following embodiments should be regarded as contributions of the applicant made in the course of this application.
In order to solve the above problem, the embodiments of the present application provide a text emotion classification method, which is used for performing emotion classification on a text through an improved Bi-LSTM model, and can effectively improve the accuracy of text emotion category prediction.
In the application, in order to extract more features from the input text and obtain a more accurate classification result when performing emotion classification on the input text, the Bi-LSTM model adopted in the application is obtained by improving the conventional Bi-LSTM model.
Referring to fig. 2, fig. 2 is a schematic structural diagram of an improved Bi-LSTM model provided in an embodiment of the present application. The Bi-LSTM model includes a first Bi-LSTM layer 10, a second Bi-LSTM layer 20 and a classification layer 30. The first Bi-LSTM layer 10 obtains an input data vector and performs syntactic feature extraction on it, then outputs a first vector to the second Bi-LSTM layer 20; the second Bi-LSTM layer 20 performs semantic feature extraction on the first vector to obtain a second vector; the input data vector, the first vector and the second vector are then input to the classification layer 30, which performs emotion classification of the text based on these three vectors.
It can be understood that the Bi-LSTM model provided in the embodiment of the present application can be regarded as a concatenation of two conventional Bi-LSTM models: the first Bi-LSTM layer 10 is a single conventional Bi-LSTM model, and the second Bi-LSTM layer 20 is another. Feature extraction is thus performed on the input text twice, so the semantic information contained in the text is extracted more comprehensively, which effectively improves the accuracy of emotion classification of the text.
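For concreteness, the stacked structure described above could be sketched as follows. The 300-dimensional word vectors and 150 hidden units per direction follow the examples given later in this description; the two-class output, the mean pooling over the sequence and the learnable fusion weights are assumptions of this sketch, not details fixed by the embodiment.

```python
# Sketch of the improved model of Fig. 2: two stacked Bi-LSTM layers plus a
# classification layer that fuses word, syntactic and semantic vectors.
import torch
import torch.nn as nn

class StackedBiLSTMClassifier(nn.Module):
    def __init__(self, embed_dim=300, hidden=150, num_classes=2):
        super().__init__()
        # First Bi-LSTM layer 10: extracts syntactic features.
        self.bilstm1 = nn.LSTM(embed_dim, hidden, bidirectional=True, batch_first=True)
        # Second Bi-LSTM layer 20: extracts semantic features from the first layer's output.
        self.bilstm2 = nn.LSTM(2 * hidden, hidden, bidirectional=True, batch_first=True)
        # Classification layer 30: fusion weights a, b, c plus a fully-connected layer.
        self.fuse = nn.Parameter(torch.ones(3))
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, x):            # x: (batch, seq_len, 300) target word vectors
        h1, _ = self.bilstm1(x)      # target syntactic feature vectors, (batch, seq_len, 300)
        h2, _ = self.bilstm2(h1)     # target semantic feature vectors,  (batch, seq_len, 300)
        a, b, c = self.fuse
        k = a * x + b * h1 + c * h2  # weighted fusion of the three kinds of vectors
        return self.fc(k.mean(dim=1))  # pooled over the sequence -> class scores

scores = StackedBiLSTMClassifier()(torch.randn(4, 5, 300))
print(scores.shape)  # torch.Size([4, 2])
```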
The specific principle of the Bi-LSTM model in this embodiment can be referred to the following description of the method embodiment of the present application, which is not described herein in detail.
Referring to fig. 3, fig. 3 is a flowchart of a text emotion classification method according to an embodiment of the present application. The method can be applied to the electronic device described below, and includes the following steps:
step S110: and acquiring a target word vector corresponding to a target keyword in the target text to be classified.
The target text to be classified is a text whose emotion category needs to be determined. It may consist of a single sentence, of multiple sentences, of a single word or several words, or of a combination of sentences and words.
To facilitate processing by the Bi-LSTM model, the target text is obtained by preprocessing the original text before it is input into the Bi-LSTM model. The preprocessing includes word segmentation and data cleaning of the original text. For example, a jieba word segmentation tool or another word segmentation method can first be used to segment the original text: for the original text "I love China's great rivers and mountains" (我爱中国的大好河山), the segmented words are "I / love / China / 的 / great rivers and mountains". Data cleaning can be understood as stop-word filtering of the segmented words: words without practical meaning, such as prepositions, articles, modal particles, adverbs, conjunctions and punctuation, are automatically filtered out according to a stop-word list. After data cleaning, the words "I / love / China / great rivers and mountains" remain and can be used as the target keywords of the target text.
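A minimal sketch of this segmentation-and-cleaning step, assuming the jieba tokenizer named above; the stop-word set is a tiny illustrative stand-in for a full stop-word list.

```python
# Word segmentation with jieba followed by stop-word filtering.
import jieba

STOP_WORDS = {"的", "了", "呢", "。", "，", "！", "？"}  # particles, punctuation, ...

def extract_keywords(text):
    tokens = jieba.lcut(text)  # word segmentation of the original text
    return [t for t in tokens if t.strip() and t not in STOP_WORDS]

print(extract_keywords("我爱中国的大好河山"))  # e.g. ['我', '爱', '中国', '大好河山']
```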
Before these target keywords are input into the Bi-LSTM model, they are converted into target word vectors according to a word vector dictionary. For example, the word "I" can be converted into a 300-dimensional vector beginning with (-0.063704, 0.403445, -0.454980, -0.144473, 0.067589, 0.125771, -0.032271, 0.092480, 0.106311, -0.084045, -0.208599, 0.232819, 0.020058, -0.194340, -0.323468, 0.017724, 0.314494, -0.006405, -0.039691, 0.055776, -0.201758, 0.002135, …). In this way, every target keyword can be converted into a word vector according to the word vector dictionary.
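The dictionary lookup itself can be sketched as follows; the entries and the 3-dimensional truncated vectors are dummies standing in for a real 300-dimensional word-vector dictionary, and the zero-vector fallback for out-of-vocabulary words is an assumption of this sketch.

```python
# Converting target keywords to word vectors via a word-vector dictionary.
import numpy as np

word_vectors = {  # hypothetical pre-trained dictionary (truncated to 3 dims)
    "我":   np.array([-0.063704, 0.403445, -0.454980]),
    "中国": np.array([0.125771, -0.032271, 0.092480]),
}
UNK = np.zeros(3)  # fallback vector for words missing from the dictionary

def to_word_vectors(keywords):
    return np.stack([word_vectors.get(w, UNK) for w in keywords])

print(to_word_vectors(["我", "中国"]).shape)  # (2, 3)
```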
In this way the target keywords can be converted into their corresponding word vectors, so when a target text is emotion-classified, its target word vectors can be obtained directly by this method.
Step S120: and inputting the target word vector into the first Bi-LSTM layer, extracting the syntactic characteristics of the target word vector through the first Bi-LSTM layer, and outputting a target syntactic characteristic vector representing the syntactic characteristics.
The first Bi-LSTM layer performs syntactic feature extraction on the input target word vectors. The syntactic features represent the context information of the target keywords in the target text, i.e. the sense each target keyword takes in the target text; some words are polysemous and have different meanings in different contexts. Syntactic feature extraction by the first Bi-LSTM layer therefore combines the context information to determine the sense of each target keyword in the target text: the first Bi-LSTM layer produces a target syntactic feature vector that represents these syntactic features and contains the senses of the target keywords, and this vector is then output to the second Bi-LSTM layer.
Step S130: and extracting the semantic features of the target syntactic feature vector through the second Bi-LSTM layer, and outputting a target semantic feature vector representing the semantic features.
The second Bi-LSTM layer is used for extracting semantic features of the target syntactic feature vector, and the semantic features are used for representing semantic information of the target keywords in the target text, such as part of speech information of the target keywords in the target text, and the like.
Step S140: determining, by the classification layer, an emotion category corresponding to the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector.
After the target syntactic feature vector is obtained through the first Bi-LSTM layer of the Bi-LSTM model and the target semantic feature vector is obtained through the second Bi-LSTM layer, the emotion category of the target text can be determined through the classification layer of the Bi-LSTM model based on the target word vector, the target syntactic feature vector and the target semantic feature vector.
A plurality of emotion categories may be predefined, such as positive, negative and neutral, or more specific categories such as happy, sad or neutral. It can be understood that the specific emotion categories may be preset according to actual requirements, which is not limited in this embodiment of the application.
It is to be understood that the classification layer may be a classifier that predicts the emotion category of the target text from the target word vector, the target syntactic feature vector and the target semantic feature vector; the classification layer therefore makes its prediction based on the three obtained vectors and outputs the corresponding prediction result.
In this implementation, syntactic features are extracted by the first Bi-LSTM layer of the Bi-LSTM model and semantic features by the second Bi-LSTM layer; the outputs of the two layers, together with the input word vectors, are then fed into the classification layer, which determines the emotion category of the target text based on the syntactic feature vector, the semantic feature vector and the input word vectors.
As an example, the emotion category corresponding to the target text may be determined through the classification layer as follows: the target word vector, the target syntactic feature vector and the target semantic feature vector are weighted through the classification layer to obtain a weighted vector; the probability value of each emotion category corresponding to the target text is predicted based on the weighted vector through the classification layer; and the emotion category corresponding to the target text is determined according to those probability values.
The classification layer can comprise a weighting layer and a full connection layer, the weighting layer is used for weighting the target word vector, the target syntactic characteristic vector and the target semantic characteristic vector to obtain a weighting vector, and the full connection layer is used for determining the emotion category corresponding to the target text according to the weighting vector.
For example, the target word vectors are: I: (x_1, x_2, x_3), love: (y_1, y_2, y_3), China: (z_1, z_2, z_3); the target syntactic feature vectors are: I: (x_11, x_21, x_31), love: (y_11, y_21, y_31), China: (z_11, z_21, z_31); and the target semantic feature vectors are: I: (x_12, x_22, x_32), love: (y_12, y_22, y_32), China: (z_12, z_22, z_32). Weighting these vectors through the weighting layer means that the vectors corresponding to each target keyword are weighted first, and all the resulting vectors are then weighted together. For the word "I" the weighting formula is x = a·(x_1, x_2, x_3) + b·(x_11, x_21, x_31) + c·(x_12, x_22, x_32), giving the weighted vector for the target keyword "I"; similarly, the weighted vector for "love" is y = a·(y_1, y_2, y_3) + b·(y_11, y_21, y_31) + c·(y_12, y_22, y_32), and that for "China" is z = a·(z_1, z_2, z_3) + b·(z_11, z_21, z_31) + c·(z_12, z_22, z_32). Finally, the weighted vectors of the individual target keywords are weighted to obtain the final weighted vector w = w_1·x + w_2·y + w_3·z, where a, b, c, w_1, w_2 and w_3 can be self-defined weighting coefficients in the Bi-LSTM model.
The obtained weighted vector w may then be input to the fully-connected layer, which outputs the emotion category of the target text. Each output of the fully-connected layer is obtained by multiplying each node of the previous layer by a weight coefficient and finally adding a bias; the specific principle is described in detail below.
In the implementation process, the obtained target syntactic characteristic vector, the target semantic characteristic vector and the target word vector are weighted to obtain a weighted vector, and when the weighting is carried out, each vector can be multiplied by different weights to distinguish the influence of different vectors on emotion type prediction, so that the probability value of each emotion type corresponding to the target text can be more accurately predicted when the emotion type corresponding to the target text is determined based on the weighted vector.
After the probability values of the emotion categories corresponding to the target text are obtained through the classification layer, the emotion category with the largest probability value can be determined as the emotion category of the target text. For example, if the fully-connected layer outputs a probability of 0.7 for positive emotion, 0.2 for negative emotion and 0.1 for neutral emotion, the emotion category of the target text is determined to be positive.
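This maximum-probability rule amounts to a one-line argmax over the predicted probabilities, sketched here with the values from the example above.

```python
# Picking the emotion category with the largest predicted probability.
probs = {"positive": 0.7, "negative": 0.2, "neutral": 0.1}
print(max(probs, key=probs.get))  # 'positive'
```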
As another example, the classification layer may determine the emotion category of the target text by directly adding the target word vector, the target syntactic feature vector and the target semantic feature vector to obtain a sum vector, and then determining the emotion category based on that vector through the fully-connected layer. For example, the sum vector is computed as w = x + y + z, and the data processing of the fully-connected layer is the same as described above, so it is not detailed here.
The following describes the data processing procedure of each layer in the Bi-LSTM model in detail, and it can be seen from the above-described principle of the Bi-LSTM model that the Bi-LSTM model is actually a variant of the LSTM model, so its internal data processing procedure is similar to that of the LSTM model.
The process of extracting the syntactic characteristics of the target word vector through the first Bi-LSTM layer and outputting the target syntactic characteristic vector representing the syntactic characteristics is as follows:
and calculating to obtain an output value of the forgetting gate through a sigmoid function based on the target word vector input at the current moment and the target syntactic characteristic vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the last moment.
Wherein the step can adopt a formula ft=σ(Wf·[ht-1,xt]+bf) Is represented by ftI.e. the output value of the forgetting gate, xtTarget word vector, h, which is the input of the Bi-LSTM cell at the current timet-1Output vector, W, of the hidden layer of the Bi-LSTM cell at the previous instantfWeight matrix for forgetting the state of the gate cell, bfIs a bias vector that forgets the state of the gate cell.
And calculating to obtain an output value of the input gate through a sigmoid function based on the target word vector input at the current moment and the target syntactic characteristic vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the last moment.
Wherein the step can adopt formula it=σ(Wi·[ht-1,xt]+bi) Is represented by itIs the output value of the input gate, sigma is sigmoid activation function, WiWeight matrix being the state of the input gate cell, biIs the offset vector of the input gate cell state.
And calculating the value of the cell state of the temporary Bi-LSTM unit by using a tanh function based on the target word vector input at the current moment and the target syntactic characteristic vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the last moment.
Wherein the step can adopt a formulaIt is shown that,i.e. the value of the temporary Bi-LSTM unit cell state, and tanh is a hyperbolic tangent function, WcWeight matrix being the state of the Bi-LSTM cell, bcIs the bias vector for the Bi-LSTM cell state.
And calculating to obtain the value of the cell state of the Bi-LSTM unit at the current moment based on the output value of the forgetting gate, the output value of the input gate, the value of the cell state of the temporary Bi-LSTM unit and the value of the cell state of the Bi-LSTM unit at the last moment.
Wherein the step can adopt a formulaIs represented by CtI.e. the value of the cell state of the Bi-LSTM unit at the current moment, Ct-1Is the value of the cell state of the Bi-LSTM unit at the previous time.
And obtaining an output vector of the hidden state at the current moment according to the hidden state value at the previous moment, the target word vector input at the current moment and the cell state value of the Bi-LSTM unit at the current moment.
Wherein, the step can be expressed by the following formula:
ot=σ(Wo·[ht-1,xt]+bo);
ht=ot*tanh(Ct);
Woweight matrix being the state of the output gate cell, boIs an offset vector of output gate cell states, otFor outputting the output of the gate unit, htThe output vector of the hidden state at the current moment.
The output vector of the forward LSTM network in the first Bi-LSTM layer is obtained from the output vectors of the hidden state at each time: following the above process for the forward LSTM network yields the hidden state sequence (h_L1, h_L2, …, h_Lt), which serves as the output vector of the forward LSTM network.
The output vector of the backward LSTM network in the first Bi-LSTM layer is likewise obtained from the output vectors of the hidden state at each time, yielding the hidden state sequence (h_R1, h_R2, …, h_Rt) as the output vector of the backward LSTM network.
The output vector of the forward LSTM network in the first Bi-LSTM layer is then spliced with the output vector of the backward LSTM network to obtain the target syntactic feature vector output by the first Bi-LSTM layer.
The target syntactic feature vector obtained in this way contains the syntactic features of each keyword. For example, let the input target word vectors be denoted x = (x_1, x_2, x_3, x_4, x_5), each of dimension 300. Passing them through the forward LSTM network of the first Bi-LSTM layer gives five vectors (h_L1, …, h_L5), each of dimension 150, and passing them through the backward LSTM network gives five vectors (h_R1, …, h_R5), each of dimension 150. Splicing the outputs of the forward and backward LSTM networks then gives the output of the first Bi-LSTM layer, i.e. the target syntactic feature vectors, denoted (h_11, h_12, h_13, h_14, h_15), each of dimension 300. The target syntactic feature vectors output by the first Bi-LSTM layer can thus be obtained based on this algorithm.
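Putting the gate formulas and the forward/backward splicing together, one pass of the first Bi-LSTM layer might be sketched in numpy as follows; the random weights are untrained placeholders, and for brevity the two directions share parameters here, whereas a real model would keep separate parameters per direction.

```python
# One forward and one backward LSTM pass over five word vectors, spliced
# into 300-dimensional target syntactic feature vectors (150 + 150 dims).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    z = np.concatenate([h_prev, x_t])   # [h_(t-1), x_t]
    f_t = sigmoid(W_f @ z + b_f)        # forget gate
    i_t = sigmoid(W_i @ z + b_i)        # input gate
    C_tilde = np.tanh(W_c @ z + b_c)    # temporary cell state
    C_t = f_t * C_prev + i_t * C_tilde  # cell state at the current time
    o_t = sigmoid(W_o @ z + b_o)        # output gate
    h_t = o_t * np.tanh(C_t)            # hidden-state output
    return h_t, C_t

d_in, d_h = 300, 150
rng = np.random.default_rng(0)
params = [rng.standard_normal(s) * 0.01 for s in [(d_h, d_h + d_in), d_h] * 4]
xs = rng.standard_normal((5, d_in))     # five target word vectors

forward, backward = [], []
h, C = np.zeros(d_h), np.zeros(d_h)
for x_t in xs:                          # forward LSTM pass
    h, C = lstm_step(x_t, h, C, *params)
    forward.append(h)
h, C = np.zeros(d_h), np.zeros(d_h)
for x_t in xs[::-1]:                    # backward LSTM pass
    h, C = lstm_step(x_t, h, C, *params)
    backward.append(h)
backward.reverse()

h1 = [np.concatenate([f, b]) for f, b in zip(forward, backward)]  # splice
print(h1[0].shape)  # (300,)
```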
Similarly, the process by which the second Bi-LSTM layer extracts semantic features from the target syntactic feature vectors and outputs the target semantic feature vectors mirrors that of the first Bi-LSTM layer. The input of the second Bi-LSTM layer is the output of the first Bi-LSTM layer: the target syntactic feature vectors pass through the forward LSTM network of the second Bi-LSTM layer to give the vectors (h_2L1, h_2L2, h_2L3, h_2L4, h_2L5) and through the backward LSTM network of the second Bi-LSTM layer to give the vectors (h_2R1, h_2R2, h_2R3, h_2R4, h_2R5); the output vectors of the forward and backward LSTM networks are then spliced to obtain the output of the second Bi-LSTM layer, i.e. the target semantic feature vectors, denoted (h_21, h_22, h_23, h_24, h_25), each of dimension 300. The target semantic feature vectors output by the second Bi-LSTM layer are thus obtained.
In addition, in order to accelerate the convergence of the Bi-LSTM model, the input target word vectors can be added to the output of the second Bi-LSTM layer through a residual connection, so that the output of the second Bi-LSTM layer becomes (h_21 + x_1, h_22 + x_2, h_23 + x_3, h_24 + x_4, h_25 + x_5).
In that case, the weighted vector obtained by weighting the target word vector, the target syntactic feature vector and the target semantic feature vector through the classification layer can be represented as w = w_1·x + w_2·h_1 + w_3·h_2, where w_1, w_2 and w_3 are self-defined weighting coefficients, x denotes the target word vectors, h_1 the target syntactic feature vectors, and h_2 the (residual) target semantic feature vectors.
Through the above algorithm, the first Bi-LSTM layer can effectively extract the syntactic features of the target keywords to obtain the syntactic feature vectors, and the second Bi-LSTM layer can effectively extract their semantic features to obtain the semantic feature vectors.
As an example, after the target word vector x, the target syntactic feature vector h_1 and the target semantic feature vector h_2 are obtained, these vectors are weighted by the classification layer according to the formula:
k = a·x + b·h_1 + c·h_2;
where k is the weighted vector and a, b and c are preset coefficients, i.e. self-defined weighting coefficients.
The emotion category corresponding to the target text is then determined through the following formulas:
y_1 = relu(k·W_n1 + b);
y_2 = relu(k·W_n2 + b);
where relu is the activation function relu(j) = max(0, j) applied to its input j, W_n1 and W_n2 are weights of the Bi-LSTM model, b is a bias, y_1 is the predicted probability value that the target text belongs to the first emotion category, and y_2 is the predicted probability value that it belongs to the second emotion category.
If the weighted vector is the vector w described above, the emotion category corresponding to the target text can be determined through the formulas:
y_1 = relu(w·W_n1 + b);
y_2 = relu(w·W_n2 + b);
in the implementation process, the probability values of all emotion categories corresponding to the target text can be quickly and accurately output by the classification layer through the algorithm.
As an example, before the text is actually emotion classified by the Bi-LSTM model, the Bi-LSTM model may be trained as follows:
acquiring a plurality of training texts, where each training text comprises training keywords and each training text is labeled with a corresponding emotion category; and taking the word vectors corresponding to the training keywords as the input of the Bi-LSTM model, taking the emotion category labeled for each training text as the output of the Bi-LSTM model, and training the Bi-LSTM model to obtain the trained Bi-LSTM model.
The training text is a text after preprocessing, and the preprocessing process is as follows:
the method comprises the steps of obtaining a plurality of original texts, carrying out data cleaning on the original texts to remove useless texts in a plurality of ideographic texts, obtaining a plurality of training texts, then carrying out word segmentation processing on each training text to obtain training keywords corresponding to each training text, carrying out emotion category standard on the training keywords corresponding to each training text based on an emotion dictionary, and obtaining emotion categories corresponding to each training text.
Specifically, a plurality of original texts can be obtained from various websites by a crawler, for example by using "Hainan University" as a search keyword.
in the process of analyzing the original text, it is found that there are elements which are useless for emotion classification and even affect the classification result in the original text, such as: "please ask for the current admission to the hospital of Hainan university? The link "http:// t. cn/RgoecB 1" in http:// t. cn/RgoecB1 "has no effect on emotion classification, but affects the accuracy of classification, so that the data can be purged.
"@ today you do not worry about do not run and do not fall! This child is tied to me. The @ user name in [ hip-hop ] ", may affect the classification result, and thus needs to be cleared.
Therefore, data preprocessing first removes information that contributes little to emotion classification, such as numbers, http links, @usernames, [replies] and the like; the cleaned original texts can then be input into the Bi-LSTM model as training texts for training.
To facilitate data processing by the model, word segmentation is performed on each training text before training; the jieba word segmentation tool can be used for this, producing the training keywords corresponding to each training text.
Each training text is then labeled according to the emotion dictionary; if the emotion categories are positive and negative, a positive text is labeled 1 and a negative one 0. The emotion dictionary contains a number of words together with the weight of the emotion category corresponding to each word, so each training keyword can be matched against the words in the emotion dictionary to obtain its emotion weight. Some words are preceded by negation words or degree words; degree words such as "extremely" and "especially" strengthen the emotion of the text. Therefore, when matching each training keyword against the dictionary, it is also checked whether the keyword is preceded by a degree word or a negation word, and if so the corresponding weight is multiplied accordingly. Finally, the weights of all training keywords in a training text are added to obtain the weight of the training text: if that weight is greater than or equal to 0 the training text is labeled as positive emotion, and if it is less than 0 it is labeled as negative emotion.
As shown in fig. 4, fig. 4 is a schematic flowchart of emotion-labeling training keywords with the emotion-dictionary-based method provided in this embodiment of the present application. For example, for the training keywords "believe, me, Haida, sad, sad", the labeling method of fig. 4 assigns "believe" a weight of 1, "me" a weight of 1, "Haida" a weight of 1, the first "sad" a weight of -1 and the second "sad" a weight of -1; adding the weights of the training keywords gives 1, which indicates that the emotion category of this training text is positive. The emotion category corresponding to each training text can be obtained in this manner, and when the Bi-LSTM model is trained, the emotion category corresponding to each training text is used as the expected output of the Bi-LSTM model, so that the trained model can accurately predict the emotion category of a text at prediction time.
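A minimal sketch of this dictionary-based labeling rule; the three tiny dictionaries are illustrative stand-ins for a full emotion dictionary, degree-word list and negation-word list.

```python
# Dictionary-based emotion labeling with degree-word and negation handling.
EMOTION_WEIGHTS = {"相信": 1, "我": 1, "海大": 1, "悲伤": -1}  # word -> weight
DEGREE_WORDS = {"极其": 2.0, "特别": 1.5}                      # intensity multipliers
NEGATION_WORDS = {"不", "没"}                                  # polarity flips

def label_text(keywords):
    total, i = 0.0, 0
    while i < len(keywords):
        mult = 1.0
        while i < len(keywords) and keywords[i] in DEGREE_WORDS:   # preceding degree word
            mult *= DEGREE_WORDS[keywords[i]]; i += 1
        while i < len(keywords) and keywords[i] in NEGATION_WORDS:  # preceding negation
            mult *= -1.0; i += 1
        if i < len(keywords):
            total += mult * EMOTION_WEIGHTS.get(keywords[i], 0)
            i += 1
    return 1 if total >= 0 else 0  # 1 = positive emotion, 0 = negative emotion

print(label_text(["相信", "我", "海大", "悲伤", "悲伤"]))  # 1+1+1-1-1 = 1 -> 1 (positive)
```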
The structure of the Bi-LSTM model during training is consistent with the model shown in fig. 2. During training, a Dropout layer in the classification layer can randomly discard some neuron nodes of the classification layer, which prevents model overfitting from degrading the accuracy of the prediction results. The prediction of the emotion category in the classification layer is mainly completed by a softmax layer, which outputs the probability value of each emotion category for the text; the emotion category with the largest probability value is then selected as the final emotion category of the text and output.
In the implementation process, the Bi-LSTM model is trained through a large number of training texts in advance, so that the trained Bi-LSTM model can accurately classify the emotion types of the texts.
In addition, in the process of training the Bi-LSTM model, the model can be trained based on a cross-entropy loss function, and the training is complete when the value of the cross-entropy loss function is smaller than a preset value.
The cross-entropy loss function is:
loss = -Σ_(i=1..n) y_i·log(ŷ_i);
where n is the number of emotion categories, y_i is the expected output, ŷ_i is the output of the Bi-LSTM model during training, and loss is the value of the cross-entropy loss function.
Based on the loss obtained during training, a gradient descent algorithm is used to train the parameters of each LSTM unit and of the classification layer, such as the weights and biases mentioned above.
The preset value can be set according to actual requirements; for example, if it is 0.001, training stops when the loss value falls below 0.001 or when the number of iterations of the Bi-LSTM model reaches a preset maximum, at which point the training of the Bi-LSTM model is finished.
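Assuming the StackedBiLSTMClassifier sketched earlier, the training loop with the cross-entropy loss and the two stop conditions could look like this; the data here are random placeholders.

```python
# Training with cross-entropy loss, gradient descent, and the stop conditions
# described above (loss below the preset value, or a maximum iteration count).
import torch
import torch.nn as nn

model = StackedBiLSTMClassifier()              # the sketch model defined earlier
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 5, 300)                     # dummy training word vectors
y = torch.randint(0, 2, (8,))                  # dummy emotion-category labels

for step in range(10000):                      # preset maximum number of iterations
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    if loss.item() < 0.001:                    # preset loss threshold
        break
```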
In addition, some of the labels produced by the emotion-dictionary-based method are wrong. To reduce the influence of such data on emotion classification and further improve the accuracy of the Bi-LSTM model, the model can be fine-tuned with manually labeled samples after being trained as described above. During fine-tuning, only the network parameters in the classification layer of the model, such as the weights and biases of the fully-connected layer, need to be trained; the fine-tuned network parameters are then used as the network parameters of the final Bi-LSTM model.
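Under the same sketch model, this fine-tuning step amounts to freezing everything except the classification-layer parameters before continuing training on the manually labeled samples; the parameter-name test below matches the sketch's fc head and is an assumption of this sketch.

```python
# Fine-tuning: train only the classification-layer (fully-connected) parameters.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc")  # keep only fc.weight / fc.bias trainable

finetune_opt = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.001)
```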
Referring to fig. 5, fig. 5 is a block diagram of a text emotion classification apparatus 200 according to an embodiment of the present application, where the apparatus may be a module, a program segment, or code on an electronic device. It should be understood that the text emotion classification apparatus 200 corresponds to the above-mentioned embodiment of the method in fig. 3, and can perform the steps related to the embodiment of the method in fig. 3, and the specific functions of the text emotion classification apparatus 200 can be referred to the above description, and the detailed description is appropriately omitted here to avoid repetition.
Optionally, the apparatus is configured to perform emotion classification on the text through a bidirectional long short-term memory (Bi-LSTM) model, where the Bi-LSTM model includes: a first Bi-LSTM layer, a second Bi-LSTM layer, and a classification layer, the apparatus comprising:
a word vector obtaining module 210, configured to obtain a target word vector corresponding to a target keyword in a target text to be classified;
a syntactic feature extracting module 220, configured to input the target word vector into the first Bi-LSTM layer, extract syntactic features of the target word vector through the first Bi-LSTM layer, and output a target syntactic feature vector representing the syntactic features, where the syntactic features are used to represent context information of the target keyword in the target text;
a semantic feature extraction module 230, configured to extract a semantic feature of the target syntactic feature vector through the second Bi-LSTM layer, and output a target semantic feature vector representing the semantic feature, where the semantic feature is used to represent semantic information of the target keyword in the target text;
an emotion category determination module 240, configured to determine, by the classification layer, the emotion category corresponding to the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector.
Optionally, the emotion category determination module 240 is specifically configured to:
weighting the target word vector, the target syntactic characteristic vector and the target semantic characteristic vector through the classification layer to obtain a weighted vector;
predicting probability values of all emotion classes corresponding to the target text based on the weighted vectors through the classification layer;
and determining the emotion types corresponding to the target text according to the probability values of the emotion types corresponding to the target text through the classification layer.
Optionally, the emotion category determination module 240 is further specifically configured to determine, by the classification layer, the emotion category with the highest probability value among the probability values of the emotion categories corresponding to the target text as the emotion category corresponding to the target text. A small sketch of the weighting and selection steps follows.
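One plausible reading of the "weighting" step above is a learned scalar weight per feature vector before concatenation; the exact weighting scheme is not spelled out in this extract, so the weights and dimensions below are illustrative assumptions:

```python
import torch

word_vec = torch.randn(128)   # stand-ins for the three vectors named above
syn_vec = torch.randn(128)
sem_vec = torch.randn(128)

w = torch.tensor([0.2, 0.4, 0.4])  # illustrative per-vector weights (assumed form)
weighted = torch.cat([w[0] * word_vec, w[1] * syn_vec, w[2] * sem_vec], dim=-1)
# `weighted` then passes through the fully connected + softmax stage sketched
# earlier, and the highest-probability emotion category is selected.
```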
Optionally, the syntactic feature extracting module 220 is specifically configured to:
computing the output value of the forget gate through a sigmoid function, based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
computing the output value of the input gate through a sigmoid function, based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
computing the value of the temporary Bi-LSTM unit cell state through a tanh function, based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
computing the value of the Bi-LSTM unit cell state at the current moment, based on the output value of the forget gate, the output value of the input gate, the value of the temporary Bi-LSTM unit cell state, and the value of the Bi-LSTM unit cell state at the previous moment;
obtaining the output vector of the hidden state at the current moment from the hidden state value at the previous moment, the target word vector input at the current moment, and the cell state value of the Bi-LSTM unit at the current moment;
obtaining the output vector of the forward LSTM network in the first Bi-LSTM layer from the output vectors of the hidden states at each moment;
obtaining the output vector of the backward LSTM network in the first Bi-LSTM layer from the output vectors of the hidden states at each moment;
and splicing the output vector of the forward LSTM network in the first Bi-LSTM layer with the output vector of the backward LSTM network to obtain the target syntactic feature vector output by the first Bi-LSTM layer (a worked sketch of these gate computations follows).
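A minimal numpy sketch of the gate computations listed above; the weight layout and parameter names (per-gate W, U, b) are assumptions, since this extract gives no explicit formulas:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM unit step. W, U, b map gate names to weight matrices and bias
    vectors: 'f' forget, 'i' input, 'c' temporary cell state, 'o' output."""
    f_t = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])       # forget gate (sigmoid)
    i_t = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])       # input gate (sigmoid)
    c_tilde = np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])   # temporary cell state (tanh)
    c_t = f_t * c_prev + i_t * c_tilde                           # cell state at the current moment
    o_t = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])       # output gate
    h_t = o_t * np.tanh(c_t)                                     # hidden-state output vector
    return h_t, c_t

# Running this left-to-right and right-to-left over the word vectors gives the
# forward and backward hidden sequences; splicing them yields the Bi-LSTM output:
# h_bi_t = np.concatenate([h_forward_t, h_backward_t])
```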
Optionally, the apparatus further comprises:
the training module is used to acquire a plurality of training texts, each of which contains training keywords and is labeled with a corresponding emotion category; and to train the Bi-LSTM model by taking the word vectors corresponding to the training keywords as the input of the Bi-LSTM model and the labeled emotion category of each training text as its output, so as to obtain the trained Bi-LSTM model.
Optionally, the training module is further configured to:
acquiring a plurality of original texts;
performing data cleaning on the plurality of original texts to remove useless texts in the plurality of original texts and obtain a plurality of training texts;
performing word segmentation processing on each training text to obtain training keywords corresponding to each training text;
and carrying out emotion category labeling on the training keywords corresponding to each training text based on the emotion dictionary, to obtain the emotion category corresponding to each training text (a short pipeline sketch follows this list).
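A hedged sketch of this preparation pipeline; `jieba` is a common Chinese word-segmentation library used here for illustration, and `SENTIMENT_DICT` is a stand-in for the emotion dictionary (the patent does not name either):

```python
import jieba

SENTIMENT_DICT = {"开心": "positive", "难过": "negative"}  # toy emotion dictionary

def prepare(raw_texts):
    samples = []
    for text in raw_texts:
        text = text.strip()
        if not text:                       # data cleaning: drop useless texts
            continue
        keywords = list(jieba.cut(text))   # word segmentation -> training keywords
        votes = [SENTIMENT_DICT[w] for w in keywords if w in SENTIMENT_DICT]
        # dictionary-based labelling: majority emotion among matched keywords
        label = max(set(votes), key=votes.count) if votes else "neutral"
        samples.append((keywords, label))
    return samples
```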
Optionally, the training module is further configured to train the Bi-LSTM model based on a cross-entropy loss function, the training of the Bi-LSTM model being complete when the value of the cross-entropy loss function falls below a preset value.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device may include: at least one processor 310, such as a CPU, at least one communication interface 320, at least one memory 330, and at least one communication bus 340. The communication bus 340 is used to realize direct connection communication between these components. The communication interface 320 of the device in this embodiment is used for signaling or data communication with other node devices. The memory 330 may be a high-speed RAM memory or a non-volatile memory (e.g., at least one disk memory), and may optionally be at least one storage device located remotely from the aforementioned processor. The memory 330 stores computer-readable instructions which, when executed by the processor 310, cause the electronic device to perform the method process described above with reference to fig. 3.
An embodiment of the present application provides a readable storage medium storing a computer program which, when executed by a processor, performs the method process performed by the electronic device in the method embodiment shown in fig. 3.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method and will not be repeated here.
In summary, the embodiments of the present application provide a text emotion classification method, apparatus, electronic device and readable storage medium. Syntactic features are extracted through the first Bi-LSTM layer of a Bi-LSTM model and semantic features through the second Bi-LSTM layer; the outputs of both layers, together with the input word vectors, are then fed into the classification layer, which determines the emotion category of the target text based on the syntactic feature vector, the semantic feature vector and the input word vector.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, or the portions thereof that substantially contribute to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Claims (10)
1. A text emotion classification method for performing emotion classification on a text through a bidirectional recurrent neural network Bi-LSTM model, the Bi-LSTM model comprising: a first Bi-LSTM layer, a second Bi-LSTM layer, and a classification layer, the method comprising:
acquiring target word vectors corresponding to target keywords in target texts to be classified;
inputting the target word vector into the first Bi-LSTM layer, extracting syntactic characteristics of the target word vector through the first Bi-LSTM layer, and outputting a target syntactic characteristic vector representing the syntactic characteristics, wherein the syntactic characteristics are used for representing context information of the target keyword in the target text;
extracting semantic features of the target syntactic feature vector through the second Bi-LSTM layer, and outputting a target semantic feature vector representing the semantic features, wherein the semantic features are used for representing semantic information of the target keywords in the target text;
determining, by the classification layer, an emotion category corresponding to the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector.
2. The method of claim 1, wherein the determining, by the classification layer, an emotion classification corresponding to the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector comprises:
weighting the target word vector, the target syntactic characteristic vector and the target semantic characteristic vector through the classification layer to obtain a weighted vector;
predicting probability values of all emotion classes corresponding to the target text based on the weighted vectors through the classification layer;
and determining the emotion types corresponding to the target text according to the probability values of the emotion types corresponding to the target text through the classification layer.
3. The method of claim 2, wherein the determining, by the classification layer, the emotion classification corresponding to the target text according to the probability value of each emotion classification corresponding to the target text comprises:
and determining the emotion category with the maximum probability value in the probability values of all emotion categories corresponding to the target text as the emotion category corresponding to the target text through the classification layer.
4. The method of claim 1, wherein the inputting the target word vector into the first Bi-LSTM layer, extracting syntactic features of the target word vector through the first Bi-LSTM layer, and outputting a target syntactic feature vector characterizing the syntactic features, comprises:
computing the output value of the forget gate through a sigmoid function, based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
computing the output value of the input gate through a sigmoid function, based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
computing the value of the temporary Bi-LSTM unit cell state through a tanh function, based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
computing the value of the Bi-LSTM unit cell state at the current moment, based on the output value of the forget gate, the output value of the input gate, the value of the temporary Bi-LSTM unit cell state, and the value of the Bi-LSTM unit cell state at the previous moment;
obtaining the output vector of the hidden state at the current moment from the hidden state value at the previous moment, the target word vector input at the current moment, and the cell state value of the Bi-LSTM unit at the current moment;
obtaining the output vector of the forward LSTM network in the first Bi-LSTM layer from the output vectors of the hidden states at each moment;
obtaining the output vector of the backward LSTM network in the first Bi-LSTM layer from the output vectors of the hidden states at each moment;
and splicing the output vector of the forward LSTM network in the first Bi-LSTM layer with the output vector of the backward LSTM network to obtain the target syntactic feature vector output by the first Bi-LSTM layer.
5. The method according to claim 1, wherein before obtaining the word vector corresponding to the target keyword in the target text to be classified, the method further comprises:
acquiring a plurality of training texts, wherein each training text comprises training keywords, and each training text is marked with a corresponding emotion type;
and taking the word vector corresponding to the training keyword as the input of the Bi-LSTM model, taking the emotion category marked by each training text as the output of the Bi-LSTM model, and training the Bi-LSTM model to obtain the trained Bi-LSTM model.
6. The method of claim 5, wherein prior to obtaining the plurality of training texts, the method further comprises:
acquiring a plurality of original texts;
performing data cleaning on the plurality of original texts to remove useless texts in the plurality of original texts and obtain a plurality of training texts;
performing word segmentation processing on each training text to obtain training keywords corresponding to each training text;
and carrying out emotion category labeling on the training keywords corresponding to each training text based on the emotion dictionary to obtain the emotion category corresponding to each training text.
7. The method of claim 5 or 6, wherein the training of the Bi-LSTM model comprises:
and training the Bi-LSTM model based on a cross-entropy loss function, wherein the training of the Bi-LSTM model is complete when the value of the cross-entropy loss function is smaller than a preset value.
8. A text emotion classification apparatus for performing emotion classification on a text through a bidirectional recurrent neural network Bi-LSTM model, the Bi-LSTM model comprising: a first Bi-LSTM layer, a second Bi-LSTM layer, and a classification layer, the apparatus comprising:
the word vector acquisition module is used for acquiring a target word vector corresponding to a target keyword in a target text to be classified;
a syntactic feature extraction module, configured to input the target word vector into the first Bi-LSTM layer, extract syntactic features of the target word vector through the first Bi-LSTM layer, and output a target syntactic feature vector representing the syntactic features, where the syntactic features are used to represent context information of the target keyword in the target text;
a semantic feature extraction module, configured to extract a semantic feature of the target syntactic feature vector through the second Bi-LSTM layer, and output a target semantic feature vector representing the semantic feature, where the semantic feature is used to represent semantic information of the target keyword in the target text;
and the emotion category determining module is used for determining the emotion category corresponding to the target text based on the target word vector, the target syntactic characteristic vector and the target semantic characteristic vector through the classification layer.
9. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the method according to any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910443387.0A CN110222178B (en) | 2019-05-24 | 2019-05-24 | Text emotion classification method and device, electronic equipment and readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910443387.0A CN110222178B (en) | 2019-05-24 | 2019-05-24 | Text emotion classification method and device, electronic equipment and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110222178A CN110222178A (en) | 2019-09-10 |
CN110222178B true CN110222178B (en) | 2021-11-09 |
Family
ID=67818373
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910443387.0A Active CN110222178B (en) | 2019-05-24 | 2019-05-24 | Text emotion classification method and device, electronic equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110222178B (en) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110825849A (en) * | 2019-11-05 | 2020-02-21 | 泰康保险集团股份有限公司 | Text information emotion analysis method, device, medium and electronic equipment |
CN112949288B (en) * | 2019-12-11 | 2022-11-11 | 上海大学 | Text error detection method based on character sequence |
CN111291187B (en) * | 2020-01-22 | 2023-08-08 | 北京芯盾时代科技有限公司 | Emotion analysis method and device, electronic equipment and storage medium |
CN111210844B (en) * | 2020-02-03 | 2023-03-24 | 北京达佳互联信息技术有限公司 | Method, device and equipment for determining speech emotion recognition model and storage medium |
CN111274807B (en) * | 2020-02-03 | 2022-05-10 | 华为技术有限公司 | Text information processing method and device, computer equipment and readable storage medium |
CN111460812B (en) * | 2020-03-02 | 2024-05-31 | 平安科技(深圳)有限公司 | Sentence emotion classification method and related equipment |
CN113449087B (en) * | 2020-03-25 | 2024-03-08 | 阿里巴巴集团控股有限公司 | Information processing method, apparatus, device and computer readable storage medium |
CN113536803B (en) * | 2020-04-13 | 2024-08-13 | 京东方科技集团股份有限公司 | Text information processing device and method, computer device, and readable storage medium |
CN111930938A (en) * | 2020-07-06 | 2020-11-13 | 武汉卓尔数字传媒科技有限公司 | Text classification method and device, electronic equipment and storage medium |
CN111950258A (en) * | 2020-08-10 | 2020-11-17 | 深圳市慧择时代科技有限公司 | Emotion classification method and device |
CN112232079B (en) * | 2020-10-15 | 2022-12-02 | 燕山大学 | Microblog comment data classification method and system |
CN112784878B (en) * | 2020-12-31 | 2024-10-15 | 北京华图宏阳网络科技有限公司 | Intelligent correction method and system for Chinese treatises |
CN112860901A (en) * | 2021-03-31 | 2021-05-28 | 中国工商银行股份有限公司 | Emotion analysis method and device integrating emotion dictionaries |
CN113158684B (en) * | 2021-04-21 | 2022-09-27 | 清华大学深圳国际研究生院 | Emotion analysis method, emotion reminding method and emotion reminding control device |
CN113343711B (en) * | 2021-06-29 | 2024-05-10 | 南方电网数字电网研究院有限公司 | Work order generation method, device, equipment and storage medium |
CN114218381B (en) * | 2021-12-08 | 2022-08-30 | 北京中科闻歌科技股份有限公司 | Method, device, equipment and medium for identifying position |
CN114385890B (en) * | 2022-03-22 | 2022-05-20 | 深圳市世纪联想广告有限公司 | Internet public opinion monitoring system |
CN115146633A (en) * | 2022-06-23 | 2022-10-04 | 中国电信股份有限公司 | Keyword identification method and device, electronic equipment and storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10769522B2 (en) * | 2017-02-17 | 2020-09-08 | Wipro Limited | Method and system for determining classification of text |
CN107291795B (en) * | 2017-05-03 | 2020-06-19 | 华南理工大学 | Text classification method combining dynamic word embedding and part-of-speech tagging |
CN108170681A (en) * | 2018-01-15 | 2018-06-15 | 中南大学 | Text emotion analysis method, system and computer readable storage medium |
CN108363753B (en) * | 2018-01-30 | 2020-05-19 | 南京邮电大学 | Comment text emotion classification model training and emotion classification method, device and equipment |
CN108829818B (en) * | 2018-06-12 | 2021-05-25 | 中国科学院计算技术研究所 | Text classification method |
- 2019-05-24 CN CN201910443387.0A patent/CN110222178B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110222178A (en) | 2019-09-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110222178B (en) | Text emotion classification method and device, electronic equipment and readable storage medium | |
CN109753566B (en) | Model training method for cross-domain emotion analysis based on convolutional neural network | |
CN107291693B (en) | Semantic calculation method for improved word vector model | |
CN108763362B (en) | Local model weighted fusion Top-N movie recommendation method based on random anchor point pair selection | |
CN112364638B (en) | Personality identification method based on social text | |
CN109710744B (en) | Data matching method, device, equipment and storage medium | |
CN107480143A (en) | Dialogue topic dividing method and system based on context dependence | |
CN110427616B (en) | Text emotion analysis method based on deep learning | |
Drovo et al. | Named entity recognition in Bengali text using merged hidden Markov model and rule base approach | |
CN113704416B (en) | Word sense disambiguation method and device, electronic equipment and computer-readable storage medium | |
CN107180084A (en) | Word library updating method and device | |
CN103473380B (en) | A kind of computer version sensibility classification method | |
Lee et al. | FEEL: featured event embedding learning | |
CN109614611B (en) | Emotion analysis method for fusion generation of non-antagonistic network and convolutional neural network | |
CN105975497A (en) | Automatic microblog topic recommendation method and device | |
CN112434164A (en) | Network public opinion analysis method and system considering topic discovery and emotion analysis | |
CN114443846A (en) | Classification method and device based on multi-level text abnormal composition and electronic equipment | |
Purba et al. | Document level emotion detection from bangla text using machine learning techniques | |
CN113486143A (en) | User portrait generation method based on multi-level text representation and model fusion | |
Baboo et al. | Sentiment analysis and automatic emotion detection analysis of twitter using machine learning classifiers | |
CN117291190A (en) | User demand calculation method based on emotion dictionary and LDA topic model | |
Jayashree et al. | Sentimental analysis on voice based reviews using fuzzy logic | |
CN115906824A (en) | Text fine-grained emotion analysis method, system, medium and computing equipment | |
CN116227486A (en) | Emotion analysis method based on retrieval and contrast learning | |
CN107729509A (en) | The chapter similarity decision method represented based on recessive higher-dimension distributed nature |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||