CN110222178A - Text sentiment classification method, device, electronic device and readable storage medium - Google Patents


Info

Publication number
CN110222178A
Authority
CN
China
Prior art keywords
target
vector
lstm
text
output
Prior art date
Legal status
Granted
Application number
CN201910443387.0A
Other languages
Chinese (zh)
Other versions
CN110222178B (en)
Inventor
刘玉茹
Current Assignee
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd
Priority to CN201910443387.0A
Publication of CN110222178A
Application granted
Publication of CN110222178B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06F40/30 Semantic analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

Embodiments of the present application provide a text sentiment classification method, a device, an electronic device, and a readable storage medium. The method comprises: obtaining the target word vector corresponding to a target keyword in a target text to be classified; extracting the syntactic features of the target word vector through a first Bi-LSTM layer, which outputs a target syntactic feature vector characterizing those features; extracting the semantic features of the target syntactic feature vector through a second Bi-LSTM layer, which outputs a target semantic feature vector characterizing those features; and determining, through a classification layer, the sentiment category of the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector. Because both the syntactic and semantic features of the target keyword are extracted, the information obtained is more comprehensive and the inner meaning of the target text can be understood more deeply, so the accuracy of the classification result is effectively improved.

Description

Text sentiment classification method, device, electronic device and readable storage medium
Technical field
This application relates to the field of language understanding technology, and in particular to a text sentiment classification method, a device, an electronic device, and a readable storage medium.
Background art
With the rapid development of Internet technology, more and more users like to publish their viewpoints, attitudes, and opinions on social platforms, generating a large amount of valuable user text about hot events, products, and the like. This text carries rich emotional color and sentiment tendencies, and the purpose of sentiment analysis is to automatically extract and classify the subjective sentiment information of users from text, so as to understand public opinion on a certain event or product.
At present, sentiment categories are generally identified with machine learning methods: relevant features extracted from the text are fed into a classifier, which classifies the text's sentiment based on those features. However, the features usually extracted may not contain comprehensive information, so the accuracy of the classification results is low.
Summary of the invention
In view of this, the embodiments of the present application aim to provide a text sentiment classification method, device, electronic device, and readable storage medium, so as to improve on the low accuracy of sentiment classification of text in the prior art.
In a first aspect, an embodiment of the present application provides a text sentiment classification method for classifying the sentiment of text through a bidirectional recurrent neural network (Bi-LSTM) model. The Bi-LSTM model includes a first Bi-LSTM layer, a second Bi-LSTM layer, and a classification layer. The method comprises: obtaining the target word vector corresponding to a target keyword in a target text to be classified; inputting the target word vector into the first Bi-LSTM layer, which extracts the syntactic features of the target word vector and outputs a target syntactic feature vector characterizing those features, where the syntactic features characterize the contextual information of the target keyword in the target text; extracting, through the second Bi-LSTM layer, the semantic features of the target syntactic feature vector and outputting a target semantic feature vector characterizing those features, where the semantic features characterize the semantic information of the target keyword in the target text; and determining, through the classification layer, the sentiment category of the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector.
In the above implementation, syntactic features are extracted through the first Bi-LSTM layer of the Bi-LSTM model, semantic features are extracted through the second Bi-LSTM layer, and the outputs of the two layers together with the input word vector are fed into the classification layer, which determines the sentiment category of the target text based on the syntactic feature vector, the semantic feature vector, and the input word vector. Because both the syntactic and semantic features of the target keyword are extracted, the information obtained is more comprehensive, the inner meaning of the target text can be understood more deeply, and the accuracy of the classification result is effectively improved.
Optionally, determining the sentiment category of the target text through the classification layer based on the target word vector, the target syntactic feature vector, and the target semantic feature vector comprises: weighting, through the classification layer, the target word vector, the target syntactic feature vector, and the target semantic feature vector to obtain a weighted vector; predicting, through the classification layer, the probability value of each sentiment category for the target text based on the weighted vector; and determining, through the classification layer, the sentiment category of the target text according to the probability values of the sentiment categories.
In the above implementation, a weighted vector is obtained by weighting the target syntactic feature vector, the target semantic feature vector, and the target word vector. Each vector can be multiplied by a different weight, so that the different degrees of influence of the vectors on sentiment prediction are distinguished. When the sentiment category of the target text is then determined from the weighted vector, the probability value of each sentiment category can be predicted more accurately.
Optionally, determining the sentiment category of the target text through the classification layer according to the probability values of the sentiment categories comprises: determining, through the classification layer, the sentiment category with the largest probability value among the probability values of the sentiment categories as the sentiment category of the target text.
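The weighting, probability prediction, and largest-probability selection described above can be sketched as follows. This is a minimal illustration only: the scalar weights `alphas`, the projection matrix `W`, and the bias `b` are hypothetical names, since the patent does not fix a concrete weighting scheme or classifier parameterization.

```python
import math

def softmax(scores):
    # numerically stable softmax over per-category scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(word_vec, syn_vec, sem_vec, alphas, W, b):
    # weight the three vectors (hypothetical scalar weights) and combine element-wise
    combined = [alphas[0] * wv + alphas[1] * sv + alphas[2] * mv
                for wv, sv, mv in zip(word_vec, syn_vec, sem_vec)]
    # linear projection to per-category scores, then softmax probabilities
    scores = [sum(wi * x for wi, x in zip(row, combined)) + bi
              for row, bi in zip(W, b)]
    probs = softmax(scores)
    # the category with the largest probability is the prediction
    return probs.index(max(probs)), probs
```

With an identity projection and two categories, the category whose combined score is larger wins, matching the largest-probability rule above.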
Optionally, inputting the target word vector into the first Bi-LSTM layer, extracting the syntactic features of the target word vector through the first Bi-LSTM layer, and outputting a target syntactic feature vector characterizing the syntactic features comprises:
calculating, through a sigmoid function, the output value of the forget gate based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step;
calculating, through a sigmoid function, the output value of the input gate based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step;
calculating, through a tanh function, the candidate cell state value of the Bi-LSTM unit based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step;
calculating the cell state value of the Bi-LSTM unit at the current time step based on the output value of the forget gate, the output value of the input gate, the candidate cell state value, and the cell state value of the Bi-LSTM unit at the previous time step;
obtaining the output vector of the hidden state at the current time step according to the hidden state value at the previous time step, the target word vector input at the current time step, and the cell state value at the current time step;
obtaining the output vector of the forward LSTM network in the first Bi-LSTM layer from the hidden-state output vectors at each time step;
obtaining the output vector of the backward LSTM network in the first Bi-LSTM layer from the hidden-state output vectors at each time step;
concatenating the output vector of the forward LSTM network and the output vector of the backward LSTM network in the first Bi-LSTM layer to obtain the target syntactic feature vector output by the first Bi-LSTM layer.
In the above implementation, the first Bi-LSTM layer can effectively extract the syntactic features of the target keyword through the above algorithm, obtaining the target syntactic feature vector.
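The gate computations above follow the standard LSTM recurrence. A scalar, pure-Python sketch is given below for illustration: the parameter names in `p` are hypothetical, and an output gate is included as in a standard LSTM cell even though the steps above do not name it explicitly.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    # scalar LSTM cell; p holds hypothetical gate weights and biases
    f = sigmoid(p['wf'] * x + p['uf'] * h_prev + p['bf'])        # forget gate
    i = sigmoid(p['wi'] * x + p['ui'] * h_prev + p['bi'])        # input gate
    c_tilde = math.tanh(p['wc'] * x + p['uc'] * h_prev + p['bc'])  # candidate cell state
    c = f * c_prev + i * c_tilde                                  # current cell state
    o = sigmoid(p['wo'] * x + p['uo'] * h_prev + p['bo'])        # output gate (assumed)
    h = o * math.tanh(c)                                          # hidden state output
    return h, c

def bilstm(seq, p):
    # forward pass over the sequence
    h, c, fwd = 0.0, 0.0, []
    for x in seq:
        h, c = lstm_step(x, h, c, p)
        fwd.append(h)
    # backward pass over the reversed sequence
    h, c = 0.0, 0.0
    bwd = [0.0] * len(seq)
    for t in range(len(seq) - 1, -1, -1):
        h, c = lstm_step(seq[t], h, c, p)
        bwd[t] = h
    # concatenate forward and backward hidden outputs per time step
    return [(f_, b_) for f_, b_ in zip(fwd, bwd)]
```

In a real implementation the inputs and states are vectors and the weights matrices; the scalar form here only mirrors the flow of the steps above.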
Optionally, before obtaining the word vector corresponding to the target keyword in the target text to be classified, the method further comprises: obtaining multiple training texts, each training text containing training keywords and being labeled with a corresponding sentiment category; and training the Bi-LSTM model with the word vectors of the training keywords as the input of the Bi-LSTM model and the labeled sentiment category of each training text as the output of the Bi-LSTM model, obtaining a trained Bi-LSTM model.
In the above implementation, the Bi-LSTM model is first trained on a large number of training texts, so that the trained Bi-LSTM model can accurately classify the sentiment category of text.
Optionally, before obtaining the multiple training texts, the method further comprises: obtaining multiple original texts; performing data cleaning on the multiple original texts to remove useless text and obtain the multiple training texts; performing word segmentation on each training text to obtain the training keywords of each training text; and labeling the training keywords of each training text with sentiment categories based on a sentiment dictionary, obtaining the sentiment category of each training text.
In the above implementation, data cleaning removes useless text from the original texts so that it does not interfere with the classification results during training. The training texts are then segmented into words, and the training keywords are labeled with sentiment categories based on a sentiment dictionary, so the sentiment category of each training text is obtained and the training effect of the model is improved.
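The cleaning, segmentation, and dictionary-based labeling pipeline can be sketched as below. This is a toy English stand-in: the patent segments Chinese text with jieba and uses a real stop-word list and sentiment dictionary, whereas the sets and whitespace tokenization here are hypothetical placeholders.

```python
# Tiny stand-ins for a real stop-word list and sentiment dictionary (assumed).
STOPWORDS = {"the", "a", "an", "of", "is", "so", "very"}
POSITIVE_WORDS = {"love", "beautiful", "great", "good"}
NEGATIVE_WORDS = {"hate", "bad", "terrible", "ugly"}

def preprocess_and_label(raw_text):
    # data cleaning + segmentation: lowercase, split, drop stop words
    keywords = [w for w in raw_text.lower().split() if w not in STOPWORDS]
    # dictionary-based labeling: count matches of each polarity
    pos = sum(1 for w in keywords if w in POSITIVE_WORDS)
    neg = sum(1 for w in keywords if w in NEGATIVE_WORDS)
    label = ("positive" if pos > neg
             else "negative" if neg > pos
             else "neutral")
    return keywords, label
```

A real pipeline would also handle punctuation, negation words, and intensity modifiers, which a bare dictionary count ignores.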
Optionally, training the Bi-LSTM model comprises: training the Bi-LSTM model based on a cross-entropy loss function, and completing the training of the Bi-LSTM model when the value of the cross-entropy loss function is less than a preset value.
In the above implementation, the Bi-LSTM model is trained with a cross-entropy loss function, so that the network parameters of the Bi-LSTM model are continuously optimized and the classification performance of the model improves with training.
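The cross-entropy loss and the preset-value stopping rule can be illustrated for a single sample. The helper `training_done` is a hypothetical name for the stopping check; the preset threshold `0.05` is an arbitrary example, not a value from the patent.

```python
import math

def cross_entropy(probs, target_idx):
    # negative log-likelihood of the true sentiment category
    return -math.log(probs[target_idx])

def training_done(loss_value, preset=0.05):
    # training stops once the loss falls below the preset value
    return loss_value < preset
```

A more confident correct prediction yields a lower loss, which is what drives the parameters toward better classification during training.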
In a second aspect, an embodiment of the present application provides a text sentiment classification device for classifying the sentiment of text through a bidirectional recurrent neural network (Bi-LSTM) model. The Bi-LSTM model includes a first Bi-LSTM layer, a second Bi-LSTM layer, and a classification layer, and the device comprises:
a word vector obtaining module, configured to obtain the target word vector corresponding to the target keyword in the target text to be classified;
a syntactic feature extraction module, configured to input the target word vector into the first Bi-LSTM layer, which extracts the syntactic features of the target word vector and outputs a target syntactic feature vector characterizing those features, where the syntactic features characterize the contextual information of the target keyword in the target text;
a semantic feature extraction module, configured to extract the semantic features of the target syntactic feature vector through the second Bi-LSTM layer and output a target semantic feature vector characterizing those features, where the semantic features characterize the semantic information of the target keyword in the target text;
a sentiment category determining module, configured to determine, through the classification layer, the sentiment category of the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector.
Optionally, the sentiment category determining module is specifically configured to:
weight, through the classification layer, the target word vector, the target syntactic feature vector, and the target semantic feature vector to obtain a weighted vector;
predict, through the classification layer, the probability value of each sentiment category for the target text based on the weighted vector;
determine, through the classification layer, the sentiment category of the target text according to the probability values of the sentiment categories.
Optionally, the sentiment category determining module is further configured to determine, through the classification layer, the sentiment category with the largest probability value among the probability values of the sentiment categories as the sentiment category of the target text.
Optionally, the syntactic feature extraction module is specifically configured to:
calculate, through a sigmoid function, the output value of the forget gate based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step;
calculate, through a sigmoid function, the output value of the input gate based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step;
calculate, through a tanh function, the candidate cell state value of the Bi-LSTM unit based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step;
calculate the cell state value of the Bi-LSTM unit at the current time step based on the output value of the forget gate, the output value of the input gate, the candidate cell state value, and the cell state value of the Bi-LSTM unit at the previous time step;
obtain the output vector of the hidden state at the current time step according to the hidden state value at the previous time step, the target word vector input at the current time step, and the cell state value at the current time step;
obtain the output vector of the forward LSTM network in the first Bi-LSTM layer from the hidden-state output vectors at each time step;
obtain the output vector of the backward LSTM network in the first Bi-LSTM layer from the hidden-state output vectors at each time step;
concatenate the output vector of the forward LSTM network and the output vector of the backward LSTM network in the first Bi-LSTM layer to obtain the target syntactic feature vector output by the first Bi-LSTM layer.
Optionally, the device further comprises:
a training module, configured to obtain multiple training texts, each training text containing training keywords and being labeled with a corresponding sentiment category, and to train the Bi-LSTM model with the word vectors of the training keywords as the input of the Bi-LSTM model and the labeled sentiment categories of the training texts as the output of the Bi-LSTM model, obtaining a trained Bi-LSTM model.
Optionally, the training module is further configured to:
obtain multiple original texts;
perform data cleaning on the multiple original texts to remove useless text and obtain the multiple training texts;
perform word segmentation on each training text to obtain the training keywords of each training text;
label the training keywords of each training text with sentiment categories based on a sentiment dictionary, obtaining the sentiment category of each training text.
Optionally, the training module is further configured to train the Bi-LSTM model based on a cross-entropy loss function, completing the training of the Bi-LSTM model when the value of the cross-entropy loss function is less than a preset value.
In a third aspect, an embodiment of the present application provides an electronic device comprising a processor and a memory, the memory storing computer-readable instructions which, when executed by the processor, perform the steps of the method provided in the first aspect above.
In a fourth aspect, an embodiment of the present application provides a readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the method provided in the first aspect above.
Other features and advantages of the application will be described in the following specification, and in part will become apparent from the specification or be understood by implementing the embodiments of the present application. The purpose of the application and other advantages can be realized and obtained by the structures particularly pointed out in the written specification and the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings show only some embodiments of the application and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other relevant drawings from these drawings without creative effort.
Fig. 1 is a structural schematic diagram of a conventional Bi-LSTM model provided by an embodiment of the present application;
Fig. 2 is a structural schematic diagram of an improved Bi-LSTM model provided by an embodiment of the present application;
Fig. 3 is a flowchart of a text sentiment classification method provided by an embodiment of the present application;
Fig. 4 is a flow diagram of sentiment labeling of training keywords based on a sentiment dictionary, provided by an embodiment of the present application;
Fig. 5 is a structural block diagram of a text sentiment classification device provided by an embodiment of the present application;
Fig. 6 is a structural schematic diagram of an electronic device provided by an embodiment of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. The components of the embodiments of the present application, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present application.
It should also be noted that similar reference numerals and letters indicate similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings. Meanwhile, in the description of the present application, the terms "first", "second", and the like are only used to distinguish the description and should not be understood as indicating or implying relative importance.
Before introducing the specific embodiments of the present application, the application scenario of the present application is briefly introduced.
In a comparative example, some text sentiment analysis methods use deep learning or related algorithms to analyze the sentiment tendency of text, for example positive or negative sentiment. The deep learning algorithms used include long short-term memory networks (Long Short-Term Memory, LSTM) and bidirectional recurrent neural networks (Bidirectional Long Short-Term Memory, Bi-LSTM). When these algorithms are used for text sentiment analysis, prediction is based on the feature vector output by the last layer of neural units in the model. Although this feature vector contains the contextual information of the text, it does not contain much of the text's semantic information. Some words are extremely complex and carry different judgments in different fields: for example, "the suspension of this car is too hard" is derogatory, while "this diamond is very hard" is commendatory. Classifying sentiment therefore requires a deep understanding of the inner meaning of the whole sentence, and the conventional Bi-LSTM model of the comparative example cannot understand that inner meaning well, so the accuracy of its sentiment category prediction is not high.
Compared with an LSTM model, a conventional Bi-LSTM model differs in that the LSTM model's hidden layer propagates in only one direction. As shown in Fig. 1, a structural schematic diagram of the Bi-LSTM model of the comparative example, the Bi-LSTM model includes two mutually independent hidden layers, a forward LSTM network and a backward LSTM network, whose propagation directions are opposite. For the same input data, two hidden-layer outputs are finally obtained, i.e. two feature vectors of the input data. The Bi-LSTM model then obtains a single vector by concatenating (concat) or averaging the two feature vectors and outputs it to a fully connected layer, which predicts the sentiment category of the text based on this vector. Because only a single Bi-LSTM model extracts the relevant features of the text, the semantic information of the text contained in the vector obtained by the fully connected layer may not be comprehensive, so the prediction of the text's sentiment category may be inaccurate.
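The two merging strategies named above, concatenation and averaging, can be sketched as one-liners on plain Python lists (a minimal illustration; real models operate on tensors):

```python
def merge_concat(fwd, bwd):
    # concatenation keeps both directions separately; the dimension doubles
    return fwd + bwd

def merge_average(fwd, bwd):
    # averaging keeps the original dimension but mixes the two directions
    return [(f + b) / 2.0 for f, b in zip(fwd, bwd)]
```

Concatenation preserves more information at the cost of a larger vector, which is also the strategy the improved model below uses when joining its forward and backward outputs.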
The above drawbacks in the comparative example are results obtained by the applicant after practice and careful study. Therefore, both the discovery of the above problems and the solutions proposed below in the embodiments of the present application should be regarded as contributions made by the applicant to the present application.
To address the above problems, an embodiment of the present application provides a text sentiment classification method. The method classifies the sentiment of text through an improved Bi-LSTM model and can effectively improve the accuracy of text sentiment category prediction.
In the present application, in order to extract more features from the input text so that more accurate classification results can be obtained in the final sentiment classification, the Bi-LSTM model used is obtained by improving the conventional Bi-LSTM model described above.
Referring to Fig. 2, a structural schematic diagram of the improved Bi-LSTM model provided by an embodiment of the present application, the Bi-LSTM model includes a first Bi-LSTM layer 10, a second Bi-LSTM layer 20, and a classification layer 30. The first Bi-LSTM layer 10 obtains the input data vector and performs syntactic feature extraction on it, then outputs a first vector to the second Bi-LSTM layer 20. The second Bi-LSTM layer 20 then performs semantic feature extraction on the first vector to obtain a second vector. The input data vector, the first vector, and the second vector are then input to the classification layer 30, which classifies the sentiment category of the text based on the data vector, the first vector, and the second vector.
It should be understood that the Bi-LSTM model provided by the embodiments of the present application can actually be regarded as a splicing of two conventional Bi-LSTM models: the first Bi-LSTM layer 10 is an individual conventional Bi-LSTM model, and the second Bi-LSTM layer 20 is also an individual conventional Bi-LSTM model. Feature extraction is thus performed on the input text by two Bi-LSTM models, so the semantic information contained in the text can be extracted more fully, effectively improving the accuracy of the text's sentiment classification.
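The data flow through the stacked model can be sketched structurally. The layer arguments here are placeholder callables standing in for the first Bi-LSTM layer, the second Bi-LSTM layer, and the classification layer; the function name `model_forward` is a hypothetical label, not from the patent.

```python
def model_forward(word_vecs, first_bilstm, second_bilstm, classify_fn):
    # first Bi-LSTM layer: syntactic feature extraction on the input vectors
    syn_vecs = first_bilstm(word_vecs)
    # second Bi-LSTM layer: semantic feature extraction on the first layer's output
    sem_vecs = second_bilstm(syn_vecs)
    # classification layer receives all three: input, syntactic, and semantic vectors
    return classify_fn(word_vecs, syn_vecs, sem_vecs)
```

The key design point this mirrors is that the classification layer sees the raw word vectors as well as both feature vectors, rather than only the last layer's output as in the comparative example.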
For the concrete principle of the Bi-LSTM model in this embodiment, refer to the related description in the method embodiments of the present application below; no further introduction is given here.
Referring to Fig. 3, a flowchart of a text sentiment classification method provided by an embodiment of the present application, the method can be applied to the electronic device described below and comprises the following steps:
Step S110: obtain the target word vector corresponding to the target keyword in the target text to be classified.
The target text to be classified is the text whose sentiment category needs to be classified. It can be a text composed of one sentence or a text containing multiple sentences; of course, it can also be an individual word, multiple words, or a combination of words and sentences.
To make the data easier for the Bi-LSTM model to process, the target text is obtained by preprocessing the original text, and then the target text is input into the Bi-LSTM model. Preprocessing includes word segmentation and data cleaning of the original text. For example, the original text can first be segmented with the jieba segmentation tool or another segmentation method: for the original text "I love the beautiful rivers and mountains of China", segmentation yields the words "I / love / China / 's / beautiful rivers and mountains". Data cleaning can be understood as stop-word processing of the obtained words: according to a stop-word list, words with no practical meaning, such as prepositions, articles, modal particles, adverbs, conjunctions, and punctuation, are automatically filtered out. After data cleaning the above words become "I / love / China / beautiful rivers and mountains", and these words serve as the target keywords of the target text.
Before these target keywords are input into the Bi-LSTM model, each target keyword also needs to be converted into a target word vector, which can be done according to a word vector dictionary. For example, the word "I" can be converted into a 300-dimensional vector, i.e. "I" becomes (-0.063704, 0.403445, -0.454980, -0.144473, 0.067589, 0.125771, -0.032271, 0.092480, 0.106311, -0.084045, -0.208599, 0.232819, 0.020058, -0.194340, -0.323468, 0.017724, 0.314494, -0.006405, -0.039691, 0.055776, -0.201758, 0.002135, ...). In this way, each target keyword can be converted into a word vector according to the word vector dictionary.
In this manner, the target keywords can be converted into corresponding word vectors, so when sentiment classification is performed on the target text, the target word vectors corresponding to the target text can be obtained directly in the above way.
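The dictionary lookup described above can be sketched as follows. The `WORD_VECTORS` table here is a hypothetical two-dimensional toy (the patent's example uses 300 dimensions), and mapping unknown words to a zero vector is an illustrative assumption rather than a detail from the patent.

```python
# A hypothetical, tiny word-vector dictionary; real dictionaries use ~300 dims.
WORD_VECTORS = {
    "i":     [-0.06, 0.40],
    "love":  [0.12, -0.03],
    "china": [0.31, 0.05],
}

def to_word_vectors(keywords, table=WORD_VECTORS, dim=2):
    # unknown words map to a zero vector of the same dimension (assumed policy)
    zero = [0.0] * dim
    return [table.get(w.lower(), zero) for w in keywords]
```

The resulting list of vectors, one per target keyword, is exactly the input sequence the first Bi-LSTM layer consumes in the next step.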
Step S120: the target word vectors are input into the first Bi-LSTM layer, the syntactic features of the target word vectors are extracted by the first Bi-LSTM layer, and a target syntactic feature vector characterizing the syntactic features is output.
The first Bi-LSTM layer is used to extract syntactic features from the input target word vectors. The syntactic features characterize the contextual information of the target keywords in the target text, i.e., the senses of the target keywords in the target text. For example, some words are polysemous and have different meanings in different contexts; by performing syntactic feature extraction through the first Bi-LSTM layer, the sense of each target keyword in the target text is obtained in combination with the contextual information. That is, the first Bi-LSTM layer yields a target syntactic feature vector that characterizes the syntactic features and contains the senses of the target keywords, and this target syntactic feature vector is then output to the second Bi-LSTM layer.
Step S130: the semantic features of the target syntactic feature vector are extracted by the second Bi-LSTM layer, and a target semantic feature vector characterizing the semantic features is output.
The second Bi-LSTM layer is used to extract semantic features from the target syntactic feature vector. The semantic features characterize the semantic information of the target keywords in the target text, such as subject-predicate-object part-of-speech information. The second Bi-LSTM layer thus yields a target semantic feature vector containing the semantic information of the target keywords in the target text.
Step S140: the classification layer determines the emotion category corresponding to the target text based on the target word vectors, the target syntactic feature vector and the target semantic feature vector.
After the target syntactic feature vector is obtained through the first Bi-LSTM layer of the Bi-LSTM model and the target semantic feature vector through the second Bi-LSTM layer as described above, the classification layer of the Bi-LSTM model can determine the emotion category of the target text based on the target word vectors, the target syntactic feature vector and the target semantic feature vector.
The emotion categories can be predefined in various ways, such as positive, negative and neutral, or more specific categories such as happy, sad or neutral. It can be understood that the division of emotion categories can be preset according to the actual emotions of interest; the embodiments of the present application do not limit this.
It can be understood that the classification layer can be a classifier that predicts the emotion category of the target text from the target word vectors, the target syntactic feature vector and the target semantic feature vector; the classification layer therefore predicts the emotion category of the target text based on the three obtained vectors and outputs the corresponding prediction result.
In the above implementation, syntactic features are extracted through the first Bi-LSTM layer of the Bi-LSTM model and semantic features through the second Bi-LSTM layer; the outputs of the two layers, together with the input word vectors, are then fed to the classification layer, which determines the emotion category of the target text based on the syntactic feature vector, the semantic feature vector and the input word vectors. Since both the syntactic and the semantic features of the target keywords are extracted in the present application, the obtained information is more comprehensive, the inherent meaning of the target text can be understood more deeply, and the accuracy of the classification results is thus effectively improved.
As an example, the classification layer may determine the emotion category of the target text as follows: the classification layer weights the target word vectors, the target syntactic feature vector and the target semantic feature vector to obtain a weighted vector; the classification layer then predicts, based on the weighted vector, the probability value of each emotion category for the target text, and determines the emotion category of the target text from these probability values.
The classification layer may include a weighting layer and a fully connected layer: the weighting layer weights the target word vectors, the target syntactic feature vector and the target semantic feature vector to obtain the weighted vector, and the fully connected layer determines the emotion category of the target text from the weighted vector.
For example, the target word vectors include: I (x1, x2, x3), love (y1, y2, y3), China (z1, z2, z3); the target syntactic feature vector includes: I (x11, x21, x31), love (y11, y21, y31), China (z11, z21, z31); and the target semantic feature vector includes: I (x12, x22, x32), love (y12, y22, y32), China (z12, z22, z32). Weighting these vectors in the weighting layer means first weighting the vectors of each target keyword, and then weighting across all keywords. For example, the vectors of the word "I" are weighted first, with the formula x = a*(x1, x2, x3) + b*(x11, x21, x31) + c*(x12, x22, x32), giving the weighted vector of the target keyword "I". Similarly, the weighted vector of "love" is y = a*(y1, y2, y3) + b*(y11, y21, y31) + c*(y12, y22, y32), and the weighted vector of "China" is z = a*(z1, z2, z3) + b*(z11, z21, z31) + c*(z12, z22, z32). Finally, the per-keyword weighted vectors are weighted again to obtain the final weighted vector w = w1*x + w2*y + w3*z, where a, b, c, w1, w2, w3 are custom weighting coefficients of the Bi-LSTM model.
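The two-stage weighting can be sketched numerically. The coefficients and 3-dimensional vectors below are illustrative placeholders (real vectors would be 300-dimensional and the coefficients learned or chosen for the model):

```python
import numpy as np

# Sketch of the weighting layer: per-keyword combination of word vector,
# syntactic vector and semantic vector, then a weighted sum over keywords.
a, b, c = 0.5, 0.3, 0.2          # per-vector coefficients a, b, c (illustrative)
w = np.array([0.4, 0.3, 0.3])    # per-keyword coefficients w1, w2, w3 (illustrative)

word_vecs = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])  # rows: I / love / China
syn_vecs  = np.array([[2., 0., 0.], [0., 2., 0.], [0., 0., 2.]])
sem_vecs  = np.array([[3., 0., 0.], [0., 3., 0.], [0., 0., 3.]])

per_keyword = a * word_vecs + b * syn_vecs + c * sem_vecs  # the x, y, z vectors above
weighted = (w[:, None] * per_keyword).sum(axis=0)          # w = w1*x + w2*y + w3*z
print(weighted)
```

Each keyword's three vectors collapse into one vector first, and the keyword vectors are then blended into a single weighted vector for the fully connected layer.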
The obtained weighted vector w can then be input into the fully connected layer, which outputs the emotion category of the target text. Each output of the fully connected layer can be seen as each node of the preceding layer multiplied by a weight coefficient, plus a bias; the specific principle is described in detail below.
In the above implementation, the weighted vector is obtained by weighting the obtained target syntactic feature vector, target semantic feature vector and target word vectors. During weighting, each vector can be multiplied by a different weight to reflect how strongly it influences the emotion-category prediction, so that when the emotion category of the target text is determined based on the weighted vector, the probability value of each emotion category can be predicted more accurately.
After the classification layer obtains the probability value of each emotion category for the target text, the emotion category with the largest probability value can be determined as the emotion category of the target text. For example, if the fully connected layer outputs a probability of 0.7 for positive emotion, 0.2 for negative emotion and 0.1 for neutral emotion, the emotion category of the target text is determined to be positive.
As another example, the classification layer may determine the emotion category of the target text by directly adding the target word vectors, the target syntactic feature vector and the target semantic feature vector to obtain a sum vector, and then letting the fully connected layer determine the emotion category based on the sum vector. For example, the sum vector is computed as w = x + y + z; the data processing of the fully connected layer is the same as described above and is not repeated here.
The data processing of each layer in the Bi-LSTM model is described in detail below. From the principle of the Bi-LSTM model described above, the Bi-LSTM model is in fact a variant of the LSTM model, so its internal data processing is similar to that of the LSTM model.
The process by which the first Bi-LSTM layer extracts the syntactic features of the target word vectors and outputs the target syntactic feature vector characterizing these syntactic features is as follows:
Based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step, the output value of the forget gate is calculated through the sigmoid function.
This step can be expressed by the formula ft = σ(Wf·[ht-1, xt] + bf), where ft is the output value of the forget gate, xt is the target word vector input to the Bi-LSTM unit at the current time step, ht-1 is the output vector of the hidden layer of the Bi-LSTM unit at the previous time step, Wf is the weight matrix of the forget gate, and bf is the bias vector of the forget gate.
Based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step, the output value of the input gate is calculated through the sigmoid function.
This step can be expressed by the formula it = σ(Wi·[ht-1, xt] + bi), where it is the output value of the input gate, σ is the sigmoid activation function, Wi is the weight matrix of the input gate, and bi is the bias vector of the input gate.
Based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step, the value of the temporary (candidate) Bi-LSTM unit cell state is calculated through the tanh function.
This step can be expressed by the formula C̃t = tanh(Wc·[ht-1, xt] + bc), where C̃t is the value of the temporary Bi-LSTM unit cell state, tanh is the hyperbolic tangent function, Wc is the weight matrix of the cell state, and bc is the bias vector of the cell state.
Based on the output value of the forget gate, the output value of the input gate, the value of the temporary Bi-LSTM unit cell state and the value of the Bi-LSTM unit cell state at the previous time step, the value of the Bi-LSTM unit cell state at the current time step is calculated.
This step can be expressed by the formula Ct = ft*Ct-1 + it*C̃t, where Ct is the value of the Bi-LSTM unit cell state at the current time step and Ct-1 is the value of the Bi-LSTM unit cell state at the previous time step.
The output vector of the hidden state at the current time step is obtained from the hidden-state value at the previous time step, the target word vector input at the current time step, and the value of the Bi-LSTM unit cell state at the current time step.
This step can be expressed by the following formulas:
ot = σ(Wo·[ht-1, xt] + bo);
ht = ot * tanh(Ct);
where Wo is the weight matrix of the output gate, bo is the bias vector of the output gate, ot is the output of the output gate, and ht is the output vector of the hidden state at the current time step.
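The gate equations above can be collected into a single LSTM time step. This is a minimal NumPy sketch, not the patent's implementation; the demo at the end uses all-zero parameters purely so the result is deterministic:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, Wf, Wi, Wc, Wo, bf, bi, bc, bo):
    """One LSTM unit step following the gate formulas above.
    Each weight matrix acts on the concatenation [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(Wf @ z + bf)           # forget gate: ft = sigma(Wf.[ht-1, xt] + bf)
    i_t = sigmoid(Wi @ z + bi)           # input gate:  it = sigma(Wi.[ht-1, xt] + bi)
    c_tilde = np.tanh(Wc @ z + bc)       # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde   # Ct = ft*Ct-1 + it*C~t
    o_t = sigmoid(Wo @ z + bo)           # output gate
    h_t = o_t * np.tanh(c_t)             # hidden-state output ht = ot*tanh(Ct)
    return h_t, c_t

# Tiny demo: hidden size 2, input size 3, all parameters zero for determinism.
H, D = 2, 3
W0 = np.zeros((H, H + D)); b0 = np.zeros(H)
h, c = lstm_step(np.ones(D), np.zeros(H), np.ones(H),
                 W0, W0, W0, W0, b0, b0, b0, b0)
print(h, c)  # with zero parameters: c = 0.5, h = 0.5*tanh(0.5)
```

Running this step once per position, in both directions, yields the forward and backward hidden-state sequences used below.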
The output vector of the forward LSTM network in the first Bi-LSTM layer is obtained from the output vectors of the hidden state at each time step.
For the first forward direction LSTM network in Bi-LSTM layers, can obtain each moment as procedure described above implies shape Thus the output vector of state can get hidden state sequence as before to the output vector of LSTM network
The output vector of the backward LSTM network in the first Bi-LSTM layer is obtained from the output vectors of the hidden state at each time step.
For the backward LSTM network in the first Bi-LSTM layer, the output vector of the hidden state at each time step can be obtained as described above, thereby yielding the backward hidden-state sequence, i.e., the output vectors of the backward LSTM network.
The output vectors of the forward LSTM network and the backward LSTM network in the first Bi-LSTM layer are concatenated to obtain the target syntactic feature vector output by the first Bi-LSTM layer.
The target syntactic feature vector at each position can thus be expressed as the concatenation of the forward and backward hidden-state vectors at that position.
The target syntactic feature vector can thus be obtained in the above manner, and it contains the syntactic features of each keyword. For example, denote the input target word vectors as x = (x1, x2, x3, x4, x5), each of dimension 300. After each target word vector passes through the forward LSTM network in the first Bi-LSTM layer, five vectors (hL1, hL2, hL3, hL4, hL5) of 150 dimensions each are obtained; after the backward LSTM network, five vectors (hR1, hR2, hR3, hR4, hR5) of 150 dimensions each are obtained. The outputs of the forward and backward LSTM networks are then concatenated to form the output of the first Bi-LSTM layer, i.e., the target syntactic feature vectors, denoted (h11, h12, h13, h14, h15), each of dimension 300. The target syntactic feature vector output by the first Bi-LSTM layer can thus be obtained with the above algorithm.
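The splicing step can be sketched as follows; random placeholders stand in for the real forward and backward LSTM outputs, and the dimensions match the 150 + 150 = 300 example above:

```python
import numpy as np

# Sketch of the concatenation: at each of the 5 positions, the 150-d
# forward-LSTM output and the 150-d backward-LSTM output are joined into
# one 300-d target syntactic feature vector.
T, H = 5, 150
hL = [np.random.randn(H) for _ in range(T)]  # forward hidden states hL1..hL5
hR = [np.random.randn(H) for _ in range(T)]  # backward hidden states hR1..hR5
h = [np.concatenate([hL[t], hR[t]]) for t in range(T)]
print(len(h), h[0].shape)  # 5 vectors, each 300-dimensional
```

The same concatenation is repeated in the second Bi-LSTM layer to produce the target semantic feature vectors.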
Similarly, the process by which the second Bi-LSTM layer extracts the semantic features of the target syntactic feature vector and outputs the target semantic feature vector is analogous to the processing of the first Bi-LSTM layer described above. For example, the input of the second Bi-LSTM layer is the output (h11, h12, h13, h14, h15) of the first Bi-LSTM layer. After the forward LSTM network of the second Bi-LSTM layer, the vectors (h2L1, h2L2, h2L3, h2L4, h2L5) are obtained; after the backward LSTM network, the vectors (h2R1, h2R2, h2R3, h2R4, h2R5). The output vectors of the forward and backward LSTM networks are then concatenated to obtain the output of the second Bi-LSTM layer, i.e., the target semantic feature vectors, denoted (h21, h22, h23, h24, h25), each of dimension 300. The target semantic feature vector output by the second Bi-LSTM layer can thus be obtained.
In addition, to accelerate the convergence of the Bi-LSTM model, the input target word vectors can be added to the output of the second Bi-LSTM layer as a residual connection, so that the output of the second Bi-LSTM layer becomes (h21+x1, h22+x2, h23+x3, h24+x4, h25+x5).
The weighted vector obtained when the classification layer weights the target word vectors, the target syntactic feature vector and this residual-augmented target semantic feature vector can then be expressed accordingly, where w1, w2, w3 are custom weighting coefficients.
In the above implementation, the above algorithm effectively extracts the syntactic features of the target keywords through the first Bi-LSTM layer to obtain the syntactic feature vector, and effectively extracts the semantic features of the target keywords through the second Bi-LSTM layer to obtain the semantic feature vector.
As an example, after the above target word vector x, the target syntactic feature vector and the target semantic feature vector are obtained, the classification layer weights these vectors to obtain the weighted vector k, where a, b, c are predetermined, i.e., custom, weighting coefficients.
The emotion category corresponding to the target text is then determined by the following formulas:
y1 = relu(k*Wn1 + b);
y2 = relu(k*Wn2 + b);
where relu is the activation function, relu(j) = max(0, j) with j the input, Wn1 and Wn2 are weights of the Bi-LSTM model, b is the bias, y1 is the predicted probability value that the target text belongs to the first emotion category, and y2 is the predicted probability value that the target text belongs to the second emotion category.
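A sketch of this two-score classification head, with a softmax added (consistent with the softmax layer mentioned in the training section) to turn the relu scores into probabilities. The vector k, the weights Wn1/Wn2 and the bias b below are illustrative values, not trained parameters:

```python
import numpy as np

def relu(v):
    return np.maximum(0.0, v)

k = np.array([0.2, -0.1, 0.4])                # weighted vector from the classification layer
Wn1 = np.array([1.0, 0.5, -0.2])              # illustrative weights, not trained values
Wn2 = np.array([-0.3, 0.8, 0.1])
b = 0.1
y1 = relu(k @ Wn1 + b)                        # y1 = relu(k*Wn1 + b)
y2 = relu(k @ Wn2 + b)                        # y2 = relu(k*Wn2 + b)
scores = np.array([y1, y2])
probs = np.exp(scores) / np.exp(scores).sum() # softmax over the two emotion categories
print(probs, probs.argmax())                  # the category with the largest probability wins
```

The argmax over the probabilities corresponds to selecting the emotion category with the largest probability value, as described above.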
If the weighted vector is the sum vector w described above, the emotion category corresponding to the target text can be determined by the following formulas:
y1 = relu(w*Wn1 + b);
y2 = relu(w*Wn2 + b);
In the above implementation, the above algorithm enables the classification layer to output the probability value of each emotion category for the target text quickly and accurately.
As an example, before the Bi-LSTM model actually performs sentiment classification on text, the Bi-LSTM model can first be trained. The training process is as follows:
A plurality of training texts are obtained, each training text containing training keywords and being labeled with a corresponding emotion category. The training keywords are used as the input of the Bi-LSTM model and the emotion category labeled for each training text as the output of the Bi-LSTM model; the Bi-LSTM model is trained accordingly to obtain the trained Bi-LSTM model.
The training texts above are pre-processed texts, and the pre-processing is as follows: a plurality of original texts are obtained, and data cleaning is performed on them to remove useless text, yielding a plurality of training texts. Word segmentation is then performed on each training text to obtain the corresponding training keywords, and emotion-category labeling is performed on the training keywords of each training text based on a sentiment dictionary, giving the emotion category corresponding to each training text.
Specifically, a plurality of original texts can be obtained from websites by crawling. The following takes original texts retrieved with the search keyword "Hainan University" as an example:
When analyzing the original texts, it was found that they contain elements that are useless for sentiment classification and may even affect the classification result. For example, in "May I ask, is the hospital of Hainan University open now? http://t.cn/RgoecB1", the link "http://t.cn/RgoecB1" plays no role in sentiment classification but affects its accuracy, so such data can be removed.
Similarly, in "@today-unhappy you can't run away~! This lifetime you are tied together with me. [giggle]", the @username may affect the classification result and also needs to be removed.
Data pre-processing therefore first removes information that contributes little to sentiment classification, such as numbers, http links, @usernames and [reply] markers; the cleaned original texts can then be input into the Bi-LSTM model as training texts.
To ease the model's data processing, word segmentation also needs to be performed on each training text before training; a segmentation tool such as jieba can be used. The following is an example of the training keywords obtained after segmenting each training text:
Emotion-category labeling is then performed on the training keywords based on the sentiment dictionary: if the emotion category is positive, the label is 1; if negative, the label is 0. The sentiment dictionary contains, in advance, a plurality of words and the weight of the emotion category corresponding to each word, so the training keywords can be matched against the words in the sentiment dictionary to obtain the weight of the emotion category for each training keyword. Moreover, if a negation word or a degree word (such as "extremely" or "especially") appears before a word, it strengthens the emotion of the text; therefore, when matching each training keyword against the sentiment dictionary, it is also checked whether a degree word or negation word precedes the word, and if so the weight is multiplied accordingly. Finally, the weights of all training keywords in each training text are added to obtain the weight of the training text: if the weight is greater than or equal to 0, the text is labeled as positive emotion; if less than 0, as negative emotion.
As shown in Fig. 4, Fig. 4 is a schematic flowchart of sentiment labeling of training keywords based on the sentiment-dictionary method provided by an embodiment of the present application. For example, for the training keywords "miss, I, Hainan University, sad, sad", according to the labeling method shown in Fig. 4, the weight of "miss" is 1, the weight of "I" is 1, the weight of "Hainan University" is 1, the weight of the first "sad" is -1 and the weight of the second "sad" is -1; adding these weights gives 1, indicating that the emotion category of this training text is positive. In this manner, the emotion category of each training text can be obtained, and when training the Bi-LSTM model, the emotion category of a training text can serve as the output of the Bi-LSTM model, so that in actual prediction the Bi-LSTM model can accurately predict the emotion category of a text.
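The Fig. 4 labeling flow can be sketched as follows. The tiny dictionary below is hypothetical and only mirrors the example weights in the text (it is not a real sentiment dictionary), and the degree-word handling is a simplified illustration:

```python
# Sketch of the dictionary-based labeling: sum per-keyword sentiment weights,
# boosting a word's weight when a degree adverb precedes it; a total >= 0 is
# labeled positive (1), otherwise negative (0).
SENTIMENT = {"怀念": 1, "我": 1, "海大": 1, "悲伤": -1}  # word -> polarity weight (hypothetical)
DEGREE = {"非常": 2.0, "特别": 2.0}                      # degree adverbs (illustrative)

def label(keywords):
    score, boost = 0.0, 1.0
    for w in keywords:
        if w in DEGREE:                       # remember the boost for the next word
            boost = DEGREE[w]
            continue
        score += boost * SENTIMENT.get(w, 0)  # unknown words contribute weight 0
        boost = 1.0
    return 1 if score >= 0 else 0             # 1 = positive, 0 = negative

print(label(["怀念", "我", "海大", "悲伤", "悲伤"]))  # 1+1+1-1-1 = 1 -> positive (1)
```

These dictionary-derived labels then serve as the training outputs of the Bi-LSTM model.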
The structure of the Bi-LSTM model during training is the same as that of Fig. 2 above. During training, a Dropout layer in the classification layer can randomly drop some neuron nodes of the classification layer, thereby avoiding overfitting of the model, which would affect the accuracy of the prediction results. Prediction of the emotion category of a text is mainly completed by a softmax layer within the classification layer, which outputs the probability value of each emotion category for the text; the emotion category with the largest probability value is then selected as the final emotion category of the text.
In the above implementation, the Bi-LSTM model is first trained with a large number of training texts, so that the trained Bi-LSTM model can classify the emotion category of a text accurately.
In addition, during training the Bi-LSTM model can be trained based on a cross-entropy loss function; when the value of the cross-entropy loss function is less than a preset value, the training of the Bi-LSTM model is complete.
The cross-entropy loss function is:
loss = -Σ(i=1..n) yi * log(ŷi);
where n is the number of emotion categories, yi is the desired output, ŷi is the output of the Bi-LSTM model during training, and loss is the cross-entropy loss.
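The loss can be sketched directly from the formula; the epsilon guard against log(0) is an implementation detail added here, not part of the patent's formula:

```python
import math

# Sketch of the cross-entropy loss: loss = -sum_i yi*log(yhat_i) over the
# n emotion categories, with a small epsilon to guard against log(0).
def cross_entropy(y, y_hat, eps=1e-12):
    return -sum(yi * math.log(max(yh, eps)) for yi, yh in zip(y, y_hat))

loss = cross_entropy([1, 0], [0.7, 0.3])  # true class predicted with probability 0.7
print(round(loss, 4))  # -log(0.7), about 0.3567
```

The loss shrinks as the predicted probability of the labeled category approaches 1, which is what gradient descent on the model parameters drives toward.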
According to the loss obtained during training, the parameters of each LSTM unit and of the classification layer, such as the above weights and biases, are trained using a gradient descent algorithm.
The preset value can be set according to actual needs. For example, if the preset value is 0.001, training stops when the loss value is less than 0.001 or when the number of iterations of the Bi-LSTM model reaches a preset maximum, indicating that training of the Bi-LSTM model is complete.
In addition, since the method of labeling data based on a sentiment dictionary produces some mislabeled data, in order to reduce the influence of such data on sentiment classification and further improve the accuracy of the Bi-LSTM model, after the Bi-LSTM model has been trained in the above manner, it can be fine-tuned with manually labeled samples. During fine-tuning, only the network parameters of the classification layer of the model, such as the weights and biases of the above fully connected layer, are trained; the fine-tuned network parameters then serve as the network parameters of the final Bi-LSTM model.
In the above implementation, fine-tuning the trained model with manually labeled samples reduces the influence of mislabeled data, so that the Bi-LSTM model can classify the emotion category of a text more accurately.
Referring to Fig. 5, Fig. 5 is a structural block diagram of a text sentiment classification apparatus 200 provided by an embodiment of the present application; the apparatus can be a module, program segment or code on an electronic device. It should be understood that the text sentiment classification apparatus 200 corresponds to the method embodiment of Fig. 3 above and can perform each step involved in that method embodiment; for the specific functions of the apparatus 200, reference may be made to the description above, and a detailed description is appropriately omitted here to avoid repetition.
Optionally, the apparatus is used to perform sentiment classification on text through a bidirectional recurrent neural network (Bi-LSTM) model, the Bi-LSTM model including a first Bi-LSTM layer, a second Bi-LSTM layer and a classification layer. The apparatus includes:
a word-vector obtaining module 210, configured to obtain the target word vectors corresponding to the target keywords in the target text to be classified;
a syntactic feature extraction module 220, configured to input the target word vectors into the first Bi-LSTM layer, extract the syntactic features of the target word vectors through the first Bi-LSTM layer, and output the target syntactic feature vector characterizing the syntactic features, where the syntactic features characterize the contextual information of the target keywords in the target text;
a semantic feature extraction module 230, configured to extract the semantic features of the target syntactic feature vector through the second Bi-LSTM layer, and output the target semantic feature vector characterizing the semantic features, where the semantic features characterize the semantic information of the target keywords in the target text;
an emotion category determining module 240, configured to determine, through the classification layer, the emotion category corresponding to the target text based on the target word vectors, the target syntactic feature vector and the target semantic feature vector.
Optionally, the emotion category determining module 240 is specifically configured to:
weight, through the classification layer, the target word vectors, the target syntactic feature vector and the target semantic feature vector to obtain a weighted vector;
predict, through the classification layer, the probability value of each emotion category for the target text based on the weighted vector; and
determine, through the classification layer, the emotion category corresponding to the target text according to the probability values of the emotion categories.
Optionally, the emotion category determining module 240 is further specifically configured to determine, through the classification layer, the emotion category with the largest probability value among the probability values of the emotion categories as the emotion category corresponding to the target text.
Optionally, the syntactic feature extraction module 220 is specifically configured to:
calculate, through the sigmoid function, the output value of the forget gate based on the target word vector input at the current time step and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous time step;
calculate, through the sigmoid function, the output value of the input gate based on the same inputs;
calculate, through the tanh function, the value of the temporary Bi-LSTM unit cell state based on the same inputs;
calculate the value of the Bi-LSTM unit cell state at the current time step based on the output value of the forget gate, the output value of the input gate, the value of the temporary Bi-LSTM unit cell state and the value of the Bi-LSTM unit cell state at the previous time step;
obtain the output vector of the hidden state at the current time step from the hidden-state value at the previous time step, the target word vector input at the current time step and the value of the Bi-LSTM unit cell state at the current time step;
obtain the output vector of the forward LSTM network in the first Bi-LSTM layer from the output vectors of the hidden state at each time step;
obtain the output vector of the backward LSTM network in the first Bi-LSTM layer from the output vectors of the hidden state at each time step; and
concatenate the output vectors of the forward LSTM network and the backward LSTM network in the first Bi-LSTM layer to obtain the target syntactic feature vector output by the first Bi-LSTM layer.
Optionally, the apparatus further includes:
a training module, configured to obtain a plurality of training texts, each training text containing training keywords and being labeled with a corresponding emotion category; and to train the Bi-LSTM model using the word vectors corresponding to the training keywords as the input of the Bi-LSTM model and the emotion category labeled for each training text as the output of the Bi-LSTM model, obtaining the trained Bi-LSTM model.
Optionally, the training module is further configured to:
obtain a plurality of original texts;
perform data cleaning on the plurality of original texts to remove useless text and obtain a plurality of training texts;
perform word segmentation on each training text to obtain the corresponding training keywords; and
perform emotion-category labeling on the training keywords of each training text based on the sentiment dictionary to obtain the emotion category corresponding to each training text.
Optionally, the training module is further configured to train the Bi-LSTM model based on the cross-entropy loss function, the training of the Bi-LSTM model being complete when the value of the cross-entropy loss function is less than the preset value.
Referring to Fig. 6, Fig. 6 is a structural schematic diagram of an electronic device provided by an embodiment of the present application. The electronic device may include: at least one processor 310, such as a CPU, at least one communication interface 320, at least one memory 330 and at least one communication bus 340. The communication bus 340 is used to realize direct connection and communication of these components. The communication interface 320 of the device in the embodiments of the present application is used to communicate signaling or data with other node devices. The memory 330 may be a high-speed RAM memory or a non-volatile memory, for example at least one magnetic disk memory; optionally, the memory 330 may also be at least one storage device located remotely from the aforementioned processor. Computer-readable instructions are stored in the memory 330, and when the computer-readable instructions are executed by the processor 310, the electronic device performs the method process shown in Fig. 3 above.
An embodiment of the present application provides a readable storage medium; when the computer program thereon is executed by a processor, the method process performed by the electronic device in the method embodiment shown in Fig. 3 is executed.
It is apparent to those skilled in the art that, for convenience and simplicity of description, for the specific working process of the apparatus described above, reference may be made to the corresponding process in the foregoing method; it is not repeated here.
In conclusion the embodiment of the present application provides a kind of text sentiment classification method, device, electronic equipment and readable storage Medium, in this method, by Bi-LSTM model the first Bi-LSTM layers extract syntactic feature, pass through the 2nd Bi-LSTM Layer extracts semantic feature, then the term vector of two layers of output and input input to classification layer, then pass through layer base of classifying The emotional category that target text is determined in the term vector of syntactic feature vector, semantic feature vector and input, due to this Shen Please in can extract the syntactic feature and semantic feature of target keyword so that the information obtained can more comprehensively, So as to more deeply understand the inherent meaning of target text, and then effectively improve the accuracy of classification results.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of the devices, methods, and computer program products according to multiple embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may actually be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), and a magnetic disk or an optical disk.
The above are merely preferred embodiments of the present application and are not intended to limit the present application; for those skilled in the art, various modifications and changes are possible in the present application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included within the protection scope of the present application. It should also be noted that similar labels and letters indicate similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person familiar with the technical field can easily think of changes or replacements within the technical scope disclosed in the present application, and such changes or replacements should all be covered within the protection scope of the present application.
It should be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or further includes elements inherent to such a process, method, article, or device. Without further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device including the element.

Claims (10)

1. A text sentiment classification method, characterized in that it is used for performing sentiment classification on text through a bidirectional recurrent neural network (Bi-LSTM) model, the Bi-LSTM model comprising: a first Bi-LSTM layer, a second Bi-LSTM layer, and a classification layer, the method comprising:
obtaining a target word vector corresponding to a target keyword in a target text to be classified;
inputting the target word vector to the first Bi-LSTM layer, extracting the syntactic features of the target word vector through the first Bi-LSTM layer, and outputting a target syntactic feature vector characterizing the syntactic features, wherein the syntactic features are used to characterize contextual information of the target keyword in the target text;
extracting the semantic features of the target syntactic feature vector through the second Bi-LSTM layer, and outputting a target semantic feature vector characterizing the semantic features, wherein the semantic features are used to characterize semantic information of the target keyword in the target text;
determining, through the classification layer, the emotional category corresponding to the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector.
2. The method according to claim 1, characterized in that the determining, through the classification layer, the emotional category corresponding to the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector comprises:
weighting, through the classification layer, the target word vector, the target syntactic feature vector, and the target semantic feature vector to obtain a weighted vector;
predicting, through the classification layer, the probability value of each emotional category corresponding to the target text based on the weighted vector;
determining, through the classification layer, the emotional category corresponding to the target text according to the probability value of each emotional category corresponding to the target text.
3. The method according to claim 2, characterized in that the determining, through the classification layer, the emotional category corresponding to the target text according to the probability value of each emotional category corresponding to the target text comprises:
determining, through the classification layer, the emotional category with the largest probability value among the probability values of the emotional categories corresponding to the target text as the emotional category corresponding to the target text.
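Claims 2 and 3 together describe a three-step decision: weight the three vectors into one, predict a probability value per emotional category, and pick the category with the largest probability. A minimal sketch of that procedure, with softmax producing the probability values (the patent does not name the probability function, and the fusion weights, class weights, and labels below are all illustrative):

```python
import math

def softmax(scores):
    # numerically stable softmax: probabilities that sum to 1
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_emotion(word_vec, syn_vec, sem_vec, class_weights, labels):
    # Claim 2: weight the three vectors into one weighted vector
    # (0.2/0.3/0.5 are illustrative fusion weights, not from the patent)
    fused = [0.2 * w + 0.3 * a + 0.5 * b
             for w, a, b in zip(word_vec, syn_vec, sem_vec)]
    # Claim 2: predict a probability value for each emotional category
    scores = [sum(f * c for f, c in zip(fused, row)) for row in class_weights]
    probs = softmax(scores)
    # Claim 3: the category with the largest probability value wins
    return labels[probs.index(max(probs))], probs

labels = ["negative", "positive", "neutral"]
class_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # toy classifier weights
label, probs = predict_emotion([1.0, 0.0], [0.5, 0.5], [0.0, 1.0],
                               class_weights, labels)
```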
4. The method according to claim 1, characterized in that the inputting the target word vector to the first Bi-LSTM layer, extracting the syntactic features of the target word vector through the first Bi-LSTM layer, and outputting the target syntactic feature vector characterizing the syntactic features comprises:
calculating, through a sigmoid function, the output value of a forget gate based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
calculating, through a sigmoid function, the output value of an input gate based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
calculating, through a tanh function, the value of a temporary Bi-LSTM unit cell state based on the target word vector input at the current moment and the target syntactic feature vector output by the hidden layer of the Bi-LSTM unit of the first Bi-LSTM layer at the previous moment;
calculating the value of the Bi-LSTM unit cell state at the current moment based on the output value of the forget gate, the output value of the input gate, the value of the temporary Bi-LSTM unit cell state, and the value of the Bi-LSTM unit cell state at the previous moment;
obtaining the output vector of the hidden state at the current moment according to the hidden state value at the previous moment, the target word vector input at the current moment, and the value of the Bi-LSTM unit cell state at the current moment;
obtaining the output vector of the forward LSTM network in the first Bi-LSTM layer according to the output vectors of the hidden states at each moment;
obtaining the output vector of the backward LSTM network in the first Bi-LSTM layer according to the output vectors of the hidden states at each moment;
splicing the output vector of the forward LSTM network in the first Bi-LSTM layer and the output vector of the backward LSTM network to obtain the target syntactic feature vector output by the first Bi-LSTM layer.
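The per-moment computations recited above are the standard LSTM cell equations: forget gate f_t and input gate i_t from a sigmoid, a temporary cell state from tanh, the current cell state from the two gates, and the hidden state from an output gate; the forward and backward passes are then spliced. A scalar-weight sketch under these assumptions (all parameter values below are illustrative; a real layer uses learned weight matrices):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One cell update in the order of claim 4 (scalar weights for readability)."""
    f_t = sigmoid(p["wf"] * x_t + p["uf"] * h_prev + p["bf"])      # forget gate (sigmoid)
    i_t = sigmoid(p["wi"] * x_t + p["ui"] * h_prev + p["bi"])      # input gate (sigmoid)
    c_tmp = math.tanh(p["wc"] * x_t + p["uc"] * h_prev + p["bc"])  # temporary cell state (tanh)
    c_t = f_t * c_prev + i_t * c_tmp                               # current cell state
    o_t = sigmoid(p["wo"] * x_t + p["uo"] * h_prev + p["bo"])      # output gate
    h_t = o_t * math.tanh(c_t)                                     # current hidden state
    return h_t, c_t

def run(seq, p):
    # unroll the cell over a sequence, collecting hidden states per moment
    h, c, outs = 0.0, 0.0, []
    for x in seq:
        h, c = lstm_step(x, h, c, p)
        outs.append(h)
    return outs

# illustrative parameters: all scalar weights and biases set to 0.5
params = {k: 0.5 for k in ("wf", "uf", "bf", "wi", "ui", "bi",
                           "wc", "uc", "bc", "wo", "uo", "bo")}
seq = [0.1, 0.5, -0.3]  # toy 1-D word-vector sequence

forward = run(seq, params)                                  # forward LSTM network
backward = list(reversed(run(list(reversed(seq)), params))) # backward LSTM network
spliced = [(f, b) for f, b in zip(forward, backward)]       # spliced per position
```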
5. The method according to claim 1, characterized in that before the obtaining the word vector corresponding to the target keyword in the target text to be classified, the method further comprises:
obtaining multiple training texts, each training text including training keywords and each training text being labeled with a corresponding emotional category;
taking the word vectors corresponding to the training keywords as the input of the Bi-LSTM model and the emotional category labeled for each training text as the output of the Bi-LSTM model, and training the Bi-LSTM model to obtain the trained Bi-LSTM model.
6. The method according to claim 5, characterized in that before the obtaining multiple training texts, the method further comprises:
obtaining multiple original texts;
performing data cleaning on the multiple original texts to remove useless text in the multiple original texts and obtain the multiple training texts;
performing word segmentation on each training text to obtain the training keywords corresponding to each training text;
performing emotional category labeling on the training keywords corresponding to each training text based on a sentiment dictionary to obtain the emotional category corresponding to each training text.
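The preprocessing pipeline of claim 6 (data cleaning, word segmentation, dictionary-based labeling) might look like the following sketch. The cleaning rule (dropping blank texts), the whitespace segmentation, and the tiny sentiment dictionary are all placeholders; Chinese text would need a real segmenter (e.g. jieba) and a real sentiment lexicon.

```python
def label_with_dictionary(texts, sentiment_dict):
    """Clean raw texts, segment them into keywords, and label each
    text by summing dictionary scores over its keywords."""
    cleaned = [t.strip() for t in texts if t.strip()]  # data cleaning: drop useless texts
    samples = []
    for text in cleaned:
        keywords = text.split()                        # word segmentation (toy)
        score = sum(sentiment_dict.get(w, 0) for w in keywords)
        label = ("positive" if score > 0
                 else "negative" if score < 0 else "neutral")
        samples.append((keywords, label))              # labeled training sample
    return samples

sentiment_dict = {"good": 1, "great": 1, "bad": -1, "awful": -1}  # toy lexicon
samples = label_with_dictionary(["good service", "   ", "bad awful food"],
                                sentiment_dict)
```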
7. The method according to claim 5 or 6, characterized in that the training the Bi-LSTM model comprises:
training the Bi-LSTM model based on a cross-entropy loss function, wherein when the value of the cross-entropy loss function is less than a preset value, the training of the Bi-LSTM model is completed.
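A toy version of the claim-7 stopping rule: take gradient steps on a cross-entropy loss until the loss value falls below a preset value. The one-parameter logistic model, the data, the learning rate, and the threshold are all invented for illustration; the claimed method applies the same rule to the full Bi-LSTM model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cross_entropy(p, y):
    # single-sample cross-entropy for predicted probability p of class y in {0, 1};
    # epsilon guards against log(0)
    eps = 1e-12
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

w, lr, threshold = 0.0, 0.5, 0.05           # preset value = 0.05 (illustrative)
data = [(1.0, 1), (-1.0, 0), (2.0, 1), (-2.0, 0)]

loss, steps = float("inf"), 0
while loss >= threshold and steps < 10000:  # train until loss < preset value
    loss, grad = 0.0, 0.0
    for x, y in data:
        p = sigmoid(w * x)
        loss += cross_entropy(p, y) / len(data)
        grad += (p - y) * x / len(data)     # d(loss)/dw for logistic + cross-entropy
    w -= lr * grad                          # gradient step
    steps += 1
```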
8. A text sentiment classification device, characterized in that it is used for performing sentiment classification on text through a bidirectional recurrent neural network (Bi-LSTM) model, the Bi-LSTM model comprising: a first Bi-LSTM layer, a second Bi-LSTM layer, and a classification layer, the device comprising:
a word vector obtaining module, configured to obtain the target word vector corresponding to the target keyword in the target text to be classified;
a syntactic feature extraction module, configured to input the target word vector to the first Bi-LSTM layer, extract the syntactic features of the target word vector through the first Bi-LSTM layer, and output a target syntactic feature vector characterizing the syntactic features, wherein the syntactic features are used to characterize contextual information of the target keyword in the target text;
a semantic feature extraction module, configured to extract the semantic features of the target syntactic feature vector through the second Bi-LSTM layer, and output a target semantic feature vector characterizing the semantic features, wherein the semantic features are used to characterize semantic information of the target keyword in the target text;
an emotional category determining module, configured to determine, through the classification layer, the emotional category corresponding to the target text based on the target word vector, the target syntactic feature vector, and the target semantic feature vector.
9. An electronic equipment, characterized by comprising: a processor, a storage medium, and a bus, the storage medium storing machine-readable instructions executable by the processor; when the electronic equipment runs, the processor communicates with the storage medium through the bus, and the processor executes the machine-readable instructions to perform the steps of the method according to any one of claims 1-7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the method according to any one of claims 1-7 are executed.
CN201910443387.0A 2019-05-24 2019-05-24 Text emotion classification method and device, electronic equipment and readable storage medium Active CN110222178B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910443387.0A CN110222178B (en) 2019-05-24 2019-05-24 Text emotion classification method and device, electronic equipment and readable storage medium


Publications (2)

Publication Number Publication Date
CN110222178A true CN110222178A (en) 2019-09-10
CN110222178B CN110222178B (en) 2021-11-09

Family

ID=67818373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910443387.0A Active CN110222178B (en) 2019-05-24 2019-05-24 Text emotion classification method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN110222178B (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825849A (en) * 2019-11-05 2020-02-21 泰康保险集团股份有限公司 Text information emotion analysis method, device, medium and electronic equipment
CN111210844A (en) * 2020-02-03 2020-05-29 北京达佳互联信息技术有限公司 Method, device and equipment for determining speech emotion recognition model and storage medium
CN111274807A (en) * 2020-02-03 2020-06-12 华为技术有限公司 Text information processing method and device, computer equipment and readable storage medium
CN111291187A (en) * 2020-01-22 2020-06-16 北京芯盾时代科技有限公司 Emotion analysis method and device, electronic equipment and storage medium
CN111930938A (en) * 2020-07-06 2020-11-13 武汉卓尔数字传媒科技有限公司 Text classification method and device, electronic equipment and storage medium
CN111950258A (en) * 2020-08-10 2020-11-17 深圳市慧择时代科技有限公司 Emotion classification method and device
CN112232079A (en) * 2020-10-15 2021-01-15 燕山大学 Microblog comment data classification method and system
CN112860901A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Emotion analysis method and device integrating emotion dictionaries
CN112949288A (en) * 2019-12-11 2021-06-11 上海大学 Text error detection method based on character sequence
CN113158684A (en) * 2021-04-21 2021-07-23 清华大学深圳国际研究生院 Emotion analysis method, emotion reminding method and emotion reminding control device
CN113343711A (en) * 2021-06-29 2021-09-03 南方电网数字电网研究院有限公司 Work order generation method, device, equipment and storage medium
WO2021174922A1 (en) * 2020-03-02 2021-09-10 平安科技(深圳)有限公司 Statement sentiment classification method and related device
CN113449087A (en) * 2020-03-25 2021-09-28 阿里巴巴集团控股有限公司 Information processing method, device, equipment and computer readable storage medium
CN114218381A (en) * 2021-12-08 2022-03-22 北京中科闻歌科技股份有限公司 Method, device, equipment and medium for identifying position
CN114385890A (en) * 2022-03-22 2022-04-22 深圳市世纪联想广告有限公司 Internet public opinion monitoring system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291795A (en) * 2017-05-03 2017-10-24 华南理工大学 A kind of dynamic word insertion of combination and the file classification method of part-of-speech tagging
CN108170681A (en) * 2018-01-15 2018-06-15 中南大学 Text emotion analysis method, system and computer readable storage medium
CN108363753A (en) * 2018-01-30 2018-08-03 南京邮电大学 Comment text sentiment classification model is trained and sensibility classification method, device and equipment
US20180240012A1 (en) * 2017-02-17 2018-08-23 Wipro Limited Method and system for determining classification of text
CN108829818A (en) * 2018-06-12 2018-11-16 中国科学院计算技术研究所 A kind of file classification method



Also Published As

Publication number Publication date
CN110222178B (en) 2021-11-09

Similar Documents

Publication Publication Date Title
CN110222178A (en) Text sentiment classification method, device, electronic equipment and readable storage medium storing program for executing
CN110245229B (en) Deep learning theme emotion classification method based on data enhancement
Munoz et al. A learning approach to shallow parsing
CN108446271B (en) Text emotion analysis method of convolutional neural network based on Chinese character component characteristics
CN110569508A (en) Method and system for classifying emotional tendencies by fusing part-of-speech and self-attention mechanism
CN107025284A (en) The recognition methods of network comment text emotion tendency and convolutional neural networks model
CN108363790A (en) For the method, apparatus, equipment and storage medium to being assessed
CN110309514A (en) A kind of method for recognizing semantics and device
CN108052625B (en) Entity fine classification method
CN109214006B (en) Natural language reasoning method for image enhanced hierarchical semantic representation
CN110427616B (en) Text emotion analysis method based on deep learning
CN111143569A (en) Data processing method and device and computer readable storage medium
CN107590127A (en) A kind of exam pool knowledge point automatic marking method and system
CN110502742B (en) Complex entity extraction method, device, medium and system
CN107180084A (en) Word library updating method and device
CN113505200B (en) Sentence-level Chinese event detection method combined with document key information
CN109101490B (en) Factual implicit emotion recognition method and system based on fusion feature representation
CN110489554B (en) Attribute-level emotion classification method based on location-aware mutual attention network model
CN112434164B (en) Network public opinion analysis method and system taking topic discovery and emotion analysis into consideration
CN115952292B (en) Multi-label classification method, apparatus and computer readable medium
CN112328797A (en) Emotion classification method and system based on neural network and attention mechanism
CN110516070A (en) A kind of Chinese Question Classification method based on text error correction and neural network
CN109271636B (en) Training method and device for word embedding model
CN114417851A (en) Emotion analysis method based on keyword weighted information
CN110852071B (en) Knowledge point detection method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant