CN109446331A - Text emotion classification model establishment method and text emotion classification method - Google Patents
- Publication number: CN109446331A
- Application number: CN201811492975.5A
- Authority: CN (China)
- Prior art keywords: document, sentence, vector, emotion, word
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a text emotion classification model establishment method and a text emotion classification method, comprising: performing sentence segmentation on a document to be classified, and obtaining the word vectors of all words in each sentence; and establishing, in turn, a word conversion layer, a document vector synthesis layer and an output layer according to the sentence segmentation result, thereby completing the establishment of the text emotion classification model. The word conversion layer comprises M tree-structured neural networks in one-to-one correspondence with the M segmented sentences, each used to convert the word vectors of the words in a sentence into hidden vectors; the document vector synthesis layer is used to obtain the document vector of the document to be classified from the hidden vectors produced by the word conversion layer; and the output layer is used to map the document vector obtained by the document vector synthesis layer to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category. By fusing syntactic information, topic information and semantic information, the present invention effectively improves the accuracy of text emotion classification.
Description
Technical field
The invention belongs to the field of machine learning, and more particularly relates to a text emotion classification model establishment method and a text emotion classification method.
Background technique
With the rapid development and popularization of Internet technology, people can read a large number of news articles or blog posts online every day, so analyzing the emotional tendency of these articles or posts has become particularly important: it can help public-opinion managers quickly monitor and guide public opinion. In addition, emotion classification can also play a role in areas such as article retrieval. For emotion classification, multiple techniques have been developed, which can be roughly divided into methods based on topic models, methods based on feature selection, and methods based on neural networks.
The idea of topic-model-based methods is to extract the topic information of a document through a topic model (e.g., the LDA topic model) and then associate the document's topic information with the document's emotion, achieving the goal of emotion classification. The idea of feature-selection-based methods is to extract useful features from documents by means of feature engineering, such as part-of-speech features, punctuation features, and relationship features between emotions, and then classify with a classifier commonly used in machine learning. In recent years, following the successful application of neural networks in other fields, researchers have also begun to use neural networks to solve the emotion classification problem. The idea of neural-network-based methods is to discover the relationships between words in a document through a neural network, form a suitable document feature vector, and then classify with a classifier. Commonly used neural network models include CNN (convolutional neural network) and LSTM (long short-term memory network). The characteristic of CNN is that it can effectively extract local features of a document, while LSTM, as a sequence model, treats a document as an ordered sequence and is good at learning the contextual information of the document.
Methods based on topic models and on feature selection have the advantages of simple models, strong interpretability, and efficient use of traditional text features, but their drawback is that feature engineering is labor-intensive and the detection rate is not high enough. Neural-network-based methods can deeply mine textual semantic information and learn feature vectors on their own, obtaining a higher detection rate, but their interpretability is poor. In addition, most existing work studies traditional text features (e.g., topic) separately from the deep semantic information learned by neural networks, so classification accuracy is not high. The use of syntactic information is also rather lacking in existing research.
Summary of the invention
In view of the drawbacks of the prior art and the need for improvement, the present invention provides a text emotion classification model establishment method and a text emotion classification method, the purpose of which is to improve the accuracy of text emotion classification.
To achieve the above object, according to a first aspect of the invention, a text emotion classification model establishment method is provided; the text emotion classification model is used to predict the probability of a document to be classified in each emotion category. The method comprises:
(1) performing sentence segmentation on the document to be classified, and obtaining the word vectors of all words in each sentence;
(2) establishing, in turn, a word conversion layer, a document vector synthesis layer and an output layer according to the sentence segmentation result, thereby completing the establishment of the text emotion classification model;
wherein the word conversion layer comprises M tree-structured neural networks in one-to-one correspondence with the M segmented sentences, each used to convert the word vectors of the words in a sentence into hidden vectors; the document vector synthesis layer is used to obtain the document vector of the document to be classified from the hidden vectors produced by the word conversion layer; and the output layer is used to map the document vector obtained by the document vector synthesis layer to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category.
In the above method, the word vector of each word is converted into a corresponding hidden vector by a tree-structured neural network, so that the process of text emotion classification fuses the deep semantic information learned by the neural network with syntactic information, and the accuracy of text emotion classification can therefore be effectively improved.
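The three-layer structure described above can be sketched as a pipeline. The function names and the simple Python stubs below are illustrative assumptions for exposition, not part of the patent:

```python
import numpy as np

def classify_emotion(document, segment, word_vectors, tree_convert,
                     synthesize, W):
    """Sketch of the model pipeline: sentence segmentation -> word
    conversion layer -> document vector synthesis layer -> output layer."""
    sentences = segment(document)                    # M sentences
    hidden = [tree_convert(word_vectors(s)) for s in sentences]
    doc_vec = synthesize(hidden)                     # document vector
    logits = W @ doc_vec                             # map to emotion scores
    e = np.exp(logits - logits.max())                # softmax normalization
    return e / e.sum()                               # p(emotion | document)
```

With any concrete choice of the three layers, the result is a probability distribution over the E emotion categories.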
Further, if the number of sentences obtained by segmentation is M = 1, step (2) comprises:
performing syntactic analysis on the segmented sentence S to obtain the dependency syntax tree T of sentence S; establishing a tree-structured neural network TN based on the dependency syntax tree T, so as to obtain a word conversion layer composed of the tree-structured neural network TN, which is used to convert the word vector of each word in sentence S into a corresponding hidden vector;
obtaining, with a trained topic model TM, the topic probability distribution p(k|d) of the document to be classified and the topic probability distribution p(k|w_mn) of each word in the document, and calculating the attention weight of each word in the document to be classified from p(k|d) and p(k|w_mn); establishing the document vector synthesis layer according to the attention weights of the words, which is used to compute a weighted sum of the hidden vectors of the words to obtain the document vector of the document to be classified;
constructing the output layer according to the dimension of the document vector and the number of emotion categories, which is used to map the document vector to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category;
wherein d is the document to be classified, k is the topic index, m is the sentence index, n is the word index, and w_mn denotes the n-th word in the m-th sentence.
For a single-sentence document, the attention weight of each word in the sentence is obtained with a topic model, and the hidden vectors of the words are summed, weighted by their attention weights, to obtain the document vector. The process of text emotion classification thus further fuses topic information on top of the deep semantic information and syntactic information learned by the neural network, so the accuracy of text emotion classification can be effectively improved.
Further, if the number of sentences obtained by segmentation is M > 1, the document vector synthesis layer comprises, in turn, a sentence synthesis layer, a sentence conversion layer and a document synthesis layer, and step (2) comprises:
performing syntactic analysis on the M segmented sentences S_1~S_M respectively to obtain their dependency syntax trees T_1~T_M; establishing tree-structured neural networks TN_1~TN_M based on T_1~T_M respectively, so as to obtain a word conversion layer composed of TN_1~TN_M, which is used to convert the word vector of each word in S_1~S_M into a corresponding hidden vector;
obtaining, with the topic model TM, the topic probability distribution p(k|d) of the document to be classified and the topic probability distribution p(k|w_mn) of each word in the document, and calculating the attention weight of each word in the document to be classified from p(k|d) and p(k|w_mn); establishing the sentence synthesis layer according to the attention weights of the words, which is used to compute weighted sums of the hidden vectors of the words to obtain the sentence vectors xs_1~xs_M of S_1~S_M;
establishing a chain-structured neural network CN from the sentence vectors xs_1~xs_M, so as to obtain a sentence conversion layer composed of the chain-structured neural network CN, which is used to convert xs_1~xs_M into corresponding hidden vectors hs_1~hs_M respectively;
establishing the document synthesis layer, which is used to take the hidden vector hs_M of the end node of the chain-structured neural network CN as the document vector of the document to be classified;
constructing the output layer according to the dimension of the document vector and the number of emotion categories, which is used to map the document vector to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category.
Further, if the number of sentences obtained by segmentation is M > 1, the document vector synthesis layer comprises, in turn, a sentence synthesis layer, a sentence conversion layer and a document synthesis layer, and step (2) comprises:
performing syntactic analysis on the M segmented sentences S_1~S_M respectively to obtain their dependency syntax trees T_1~T_M, and establishing tree-structured neural networks TN_1~TN_M based on T_1~T_M respectively, so as to obtain a word conversion layer composed of TN_1~TN_M, which is used to convert the word vector of each word in S_1~S_M into a corresponding hidden vector;
establishing the sentence synthesis layer, which is used to take the hidden vectors of the root nodes of the dependency syntax trees T_1~T_M as the sentence vectors xs_1~xs_M of sentences S_1~S_M respectively;
establishing a chain-structured neural network CN from the sentence vectors xs_1~xs_M, so as to obtain a sentence conversion layer composed of the chain-structured neural network CN, which is used to convert xs_1~xs_M into corresponding hidden vectors hs_1~hs_M respectively;
obtaining, with the topic model TM, the topic probability distribution p(k|d) of the document to be classified and the probability p(w_mn|k) of each word under each topic, and calculating the attention weight of each sentence in the document to be classified from p(k|d) and p(w_mn|k); establishing the document synthesis layer according to the attention weights of the sentences, which is used to compute a weighted sum of the hidden vectors of the sentences to obtain the document vector of the document to be classified;
constructing the output layer according to the dimension of the document vector and the number of emotion categories, which is used to map the document vector to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category.
The model structure established differs for different documents: for a single-sentence document, only a tree-structured neural network is used, without a chain-structured neural network; for a multi-sentence document, both a tree-structured neural network and a chain-structured neural network are used. In this way, excessively slow model training can be avoided while the accuracy of text emotion classification is effectively improved.
For a multi-sentence document, the document vector is synthesized using only the attention weights of words or only the attention weights of sentences during text emotion classification, which avoids excessively slow model training while still effectively improving the accuracy of text emotion classification.
Further, the normalization of the emotion probability distribution in the output layer is a softmax normalization.
Using softmax normalization balances the probability distribution while avoiding zero probabilities, so that the model does not need to perform additional smoothing.
According to a second aspect of the invention, a text emotion classification method is provided, comprising: for a document to be classified, establishing a text emotion classification model with trained node parameters according to the text emotion classification model establishment method provided by the first aspect of the invention, and using the established text emotion classification model to predict the probability of the document to be classified in each emotion category, thereby completing the text emotion classification of the document to be classified.
Further, the method for obtaining the node parameters comprises:
(S1) dividing a corpus into a training set and a test set, and proceeding to (S4); the emotion category of each document in the corpus is known;
(S2) for a document D_i, establishing a text emotion classification model with the text emotion classification model establishment method provided by the first aspect of the invention, and predicting the probability of the document in each emotion category with the text emotion classification model;
(S3) adjusting the node parameters of the text emotion classification model according to the emotion category of document D_i and the prediction result, so as to minimize a pre-constructed loss function;
(S4) executing steps (S2)-(S3) on the documents in the training set in turn, thereby completing the training of the text emotion classification model and obtaining the trained node parameters;
wherein i is the document index.
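Steps (S2)-(S4) amount to standard supervised training: predict, compare against the known emotion category, and adjust parameters to reduce a loss. A minimal sketch restricted to the output-layer weights, using cross-entropy loss and gradient descent (the concrete loss function and update rule are assumptions; the patent only states that a pre-constructed loss function is minimized):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_output_layer(doc_vectors, labels, n_classes, lr=0.1, epochs=50):
    """Fit W so that softmax(W @ yd) matches the known emotion labels.
    The cross-entropy gradient w.r.t. W is (p - onehot(y)) * yd^T."""
    dim = doc_vectors[0].shape[0]
    W = np.zeros((n_classes, dim))
    for _ in range(epochs):
        for yd, y in zip(doc_vectors, labels):   # step (S4): iterate docs
            p = softmax(W @ yd)                  # step (S2): predict
            grad = np.outer(p - np.eye(n_classes)[y], yd)
            W -= lr * grad                       # step (S3): adjust params
    return W
```

In the full model, the same gradient signal would be backpropagated further into the attention, Chain-LSTM and Tree-LSTM parameters.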
Further, the method for obtaining the node parameters also comprises: executing step (S2) on the documents in the test set in turn with the trained node parameters, and testing the text emotion classification model according to the emotion categories of the documents and the prediction results, thereby completing the verification of the node parameters.
In general, through the above technical solutions conceived by the present invention, the following beneficial effects can be obtained:
(1) The text emotion classification method provided by the present invention converts the word vectors of words into corresponding hidden vectors with tree-structured neural networks, so that the process of text emotion classification fuses the deep semantic information learned by the neural network with syntactic information, and the accuracy of text emotion classification can therefore be effectively improved.
(2) The text emotion classification method provided by the present invention obtains, for a single-sentence document, the attention weight of each word in the document with a topic model, and, for a multi-sentence document, the attention weight of each word or of each sentence in the document with a topic model, and synthesizes the document vector with the attention weights of the words or of the sentences, so that the process of text emotion classification further fuses topic information on top of the deep semantic information and syntactic information learned by the neural network, and the accuracy of text emotion classification can therefore be effectively improved.
(3) The text emotion classification method provided by the present invention establishes different text emotion classification models for single-sentence documents and multi-sentence documents, and, for a multi-sentence document, uses only the attention weights of words or only the attention weights of sentences when establishing the model, which avoids excessively slow model training while effectively improving the accuracy of text emotion classification.
(4) In the text emotion classification method provided by the present invention, because a topic model is used, the attention weight of each word or sentence can be visualized, so that the words or sentences attended to most during text emotion classification can be clearly displayed, which improves the interpretability of the model to a certain extent.
Brief description of the drawings
Fig. 1 is a flowchart of embodiment one of the text emotion classification model establishment method of the present invention;
Fig. 2 is a schematic diagram of a dependency syntax tree provided by an embodiment of the present invention;
Fig. 3 is the internal structure of an existing Tree-LSTM network;
Fig. 4 is a flowchart of embodiment two of the text emotion classification model establishment method of the present invention;
Fig. 5 is the internal structure of an existing Chain-LSTM network;
Fig. 6 is a flowchart of embodiment three of the text emotion classification model establishment method of the present invention;
Fig. 7 is a flowchart of an embodiment of the text emotion classification method of the present invention;
Fig. 8 is a schematic diagram of the word probability distributions of topics learned by the LDA topic model provided by an embodiment of the present invention; (a)-(c) are the word probability distributions of different topics;
Fig. 9 is a schematic diagram of the topic probability distribution of a document in a corpus provided by an embodiment of the present invention;
Fig. 10 is a schematic diagram of the higher-weight words in a document provided by an embodiment of the present invention.
Specific embodiments
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described here are merely illustrative of the present invention and are not intended to limit it. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict.
According to the first aspect of the invention, a text emotion classification model establishment method is provided; the text emotion classification model established by this method is used to predict the probability of a document to be classified in each emotion category.
As shown in Fig. 1, in method embodiment one, when the document to be classified is a single-sentence document, the text emotion classification model establishment method provided by the present invention comprises:
(1) performing sentence segmentation on the document to be classified to obtain a sentence S, and obtaining the word vector of each word in sentence S;
in an optional embodiment, the word vector of each word in the sentence is obtained with a trained word vector model; the word vector model used may be a Word2vec model, and other word vector models may also be used; the word vector of a word takes the following form:
xw_mn = [x_1, x_2, ..., x_{d_w}]
wherein m is the sentence index, n is the word index, xw_mn denotes the word vector of the n-th word of the m-th sentence, x denotes the value of the word vector, and d_w denotes the dimension of the word vector;
(2) performing syntactic analysis on sentence S to obtain the dependency syntax tree T of sentence S; establishing a tree-structured neural network TN based on the dependency syntax tree T, so as to obtain a word conversion layer composed of the tree-structured neural network TN, which is used to convert the word vector of each word in sentence S into a corresponding hidden vector;
a dependency syntax tree describes the syntactic dependency relations between words, and Fig. 2 shows an example of a dependency syntax tree; establishing a tree-structured neural network based on the dependency syntax tree makes it possible to form phrase feature vectors from syntactically related word combinations, which plays an important role in mining deep semantic information;
in an embodiment of the present invention, the tree-structured neural network used is Tree-LSTM; the internal structure of Tree-LSTM is shown in Fig. 3, and its conversion functions are as follows (for a parent node 3 with children 1 and 2):
f_1 = σ(W^(f) x_3 + U^(f) h_1 + b^(f))
f_2 = σ(W^(f) x_3 + U^(f) h_2 + b^(f))
i_3 = σ(W^(i) x_3 + U^(i) (h_1 + h_2) + b^(i))
o_3 = σ(W^(o) x_3 + U^(o) (h_1 + h_2) + b^(o))
u_3 = tanh(W^(u) x_3 + U^(u) (h_1 + h_2) + b^(u))
c_3 = i_3 ⊙ u_3 + (f_1 ⊙ c_1 + f_2 ⊙ c_2)
h_3 = o_3 ⊙ tanh(c_3)
wherein x_3 denotes the word vector input at the current parent node, h_1 and h_2 are the hidden vectors of child 1 and child 2, c_1 and c_2 are the cell states of child 1 and child 2, f, i and o are the forget gate, input gate and output gate respectively, and ⊙ denotes element-wise multiplication.
Using the above conversion functions, the word vector of each word can be converted into a corresponding hidden vector; the hidden vector of a word is represented as follows:
hw_mn = [h_1, h_2, ..., h_{d_m}]
wherein m is the sentence index, n is the word index, hw_mn denotes the hidden vector of the n-th word of the m-th sentence, h denotes the value of the hidden vector, and d_m denotes the dimension of the hidden vector.
It should be understood that the word conversion layer of the present invention can also be implemented with other tree-structured neural networks;
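The Tree-LSTM conversion functions above can be written directly in NumPy. The sketch below implements exactly these equations for one parent node with an arbitrary number of children (the two-child case of Fig. 3 generalizes by summing over children, as in the Child-Sum Tree-LSTM); randomly initialized parameters stand in for trained ones:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tree_lstm_node(x, child_h, child_c, P):
    """One Tree-LSTM update: combine the parent's word vector x with the
    hidden/cell states of its children in the dependency tree.  P holds
    the parameter matrices W, U and biases b for each gate f, i, o, u."""
    dh = P['bf'].shape[0]
    h_sum = sum(child_h) if child_h else np.zeros(dh)   # h_1 + h_2 + ...
    # one forget gate per child, as in the equations for f_1 and f_2
    f = [sigmoid(P['Wf'] @ x + P['Uf'] @ h + P['bf']) for h in child_h]
    i = sigmoid(P['Wi'] @ x + P['Ui'] @ h_sum + P['bi'])
    o = sigmoid(P['Wo'] @ x + P['Uo'] @ h_sum + P['bo'])
    u = np.tanh(P['Wu'] @ x + P['Uu'] @ h_sum + P['bu'])
    c = i * u + sum(fk * ck for fk, ck in zip(f, child_c))
    h = o * np.tanh(c)                                  # hidden vector
    return h, c

def init_params(dw, dh, rng):
    """Random parameters (illustrative; trained values would be used)."""
    P = {}
    for g in 'fiou':
        P['W' + g] = rng.standard_normal((dh, dw)) * 0.1
        P['U' + g] = rng.standard_normal((dh, dh)) * 0.1
        P['b' + g] = np.zeros(dh)
    return P
```

Applying `tree_lstm_node` bottom-up over the dependency syntax tree (leaves first, root last) converts every word vector into its hidden vector.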
(3) obtaining, with a trained topic model TM, the topic probability distribution p(k|d) of the document to be classified and the topic probability distribution p(k|w_mn) of each word in the document; in an embodiment of the present invention, the topic model TM used may be an LDA topic model;
the cosine similarity between p(k|d) and p(k|w_mn) is calculated as:
sim_mn = Σ_k p(k|d) p(k|w_mn) / (‖p(k|d)‖ ‖p(k|w_mn)‖)
the cosine similarities are softmax-normalized within each sentence, and the normalized values are taken as the attention weights of the words:
α_mn = exp(sim_mn) / Σ_{n'=1..N} exp(sim_mn')
the document vector synthesis layer is established according to the attention weights of the words, and computes a weighted sum of the hidden vectors of the words to obtain the document vector of the document to be classified:
yd = Σ_{n=1..N} α_mn hw_mn
wherein d is the document to be classified, k is the topic index, m is the sentence index, n and n' are word indices, w_mn denotes the n-th word in the m-th sentence, and N is the total number of words in the sentence.
It should be understood that other topic models can also be used in the present invention to obtain the corresponding topic probability distributions;
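The word-level attention computation can be sketched as follows: cosine similarity between the document's topic distribution and each word's topic distribution, softmax-normalized within the sentence, then a weighted sum of the words' hidden vectors. The toy distributions in the usage below are illustrative:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def word_attention_doc_vector(p_k_d, p_k_w, hidden):
    """p_k_d: topic distribution p(k|d) of the document, shape (K,).
    p_k_w: per-word topic distributions p(k|w_mn), shape (N, K).
    hidden: hidden vectors of the N words, shape (N, dm).
    Returns the attention weights and the document vector (weighted sum)."""
    sims = np.array([cosine(p_k_d, pw) for pw in p_k_w])
    e = np.exp(sims - sims.max())          # softmax within the sentence
    att = e / e.sum()
    return att, att @ hidden               # weighted sum of hidden vectors
```

Words whose topic distribution resembles the document's receive larger attention weights, so topically central words dominate the document vector.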
(4) constructing the output layer according to the dimension of the document vector and the number of emotion categories, which maps the document vector to an emotion probability distribution and normalizes it, so as to obtain the probability of the document to be classified in each emotion category; formulated as follows:
l = W · yd
p_i = exp(l_i) / Σ_{j=1..E} exp(l_j)
wherein l is the emotion probability distribution obtained by mapping the document vector, W is the weight parameter matrix, E is the number of emotion categories, and p_i is the probability of the document in the i-th emotion category.
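The output-layer mapping is a linear projection followed by a softmax. A minimal sketch (the dimensions in the usage are illustrative):

```python
import numpy as np

def output_layer(yd, W):
    """Map the document vector yd to an emotion probability distribution:
    l = W @ yd, then softmax so the E entries sum to one and none is zero."""
    l = W @ yd                       # shape (E,): one score per emotion
    e = np.exp(l - l.max())          # numerically stabilized softmax
    return e / e.sum()
```

Because softmax is strictly positive, no emotion category ever receives probability exactly zero, which is why no extra smoothing step is needed.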
For a single-sentence document, the attention weight of each word in the sentence is obtained with a topic model, and the hidden vectors of the words are summed, weighted by their attention weights, to obtain the document vector. The process of text emotion classification thus further fuses topic information on top of the deep semantic information and syntactic information learned by the neural network, so the accuracy of text emotion classification can be effectively improved.
As shown in Fig. 4, in method embodiment two, when the document to be classified is a multi-sentence document, the text emotion classification model establishment method provided by the present invention comprises:
(1) performing sentence segmentation on the document to be classified to obtain M sentences S_1~S_M, and obtaining the word vectors of all words in each sentence;
for the specific implementation of obtaining the word vectors, refer to the description in method embodiment one above;
(2) performing syntactic analysis on the M segmented sentences S_1~S_M respectively to obtain their dependency syntax trees T_1~T_M;
establishing tree-structured neural networks TN_1~TN_M based on T_1~T_M respectively, so as to obtain a word conversion layer composed of TN_1~TN_M, which is used to convert the word vector of each word in S_1~S_M into a corresponding hidden vector; in an embodiment of the present invention, the tree-structured neural network used is Tree-LSTM; it should be understood that other tree-structured neural networks can also be used;
(3) obtaining, with the topic model TM, the topic probability distribution p(k|d) of the document to be classified and the topic probability distribution p(k|w_mn) of each word in the document; in an embodiment of the present invention, the topic model used is an LDA topic model; it should be understood that other topic models can also be used;
the cosine similarity between p(k|d) and p(k|w_mn) is calculated and softmax-normalized within each sentence, the normalized values being taken as the attention weights α_mn of the words, in the same way as in method embodiment one;
the sentence synthesis layer is established according to the attention weights of the words, and computes weighted sums of the hidden vectors of the words to respectively obtain the sentence vectors xs_1~xs_M of S_1~S_M; the sentence vector of the m-th sentence is:
xs_m = Σ_{n=1..N_m} α_mn hw_mn
wherein N_m is the number of words in sentence S_m;
(4) establishing a chain-structured neural network CN from the sentence vectors xs_1~xs_M, so as to obtain a sentence conversion layer composed of the chain-structured neural network CN, which converts xs_1~xs_M into the corresponding hidden vectors hs_1~hs_M respectively;
in an embodiment of the present invention, the chain-structured neural network used is Chain-LSTM; the internal structure of Chain-LSTM is shown in Fig. 5, and its conversion functions are as follows:
f = σ(W^(f) s_t + U^(f) h_{t-1} + b^(f))
i = σ(W^(i) s_t + U^(i) h_{t-1} + b^(i))
o = σ(W^(o) s_t + U^(o) h_{t-1} + b^(o))
u_t = tanh(W^(u) s_t + U^(u) h_{t-1} + b^(u))
c_t = i ⊙ u_t + f ⊙ c_{t-1}
h_t = o ⊙ tanh(c_t)
wherein s_t denotes the sentence vector at time step t, h_t and h_{t-1} denote the hidden vectors at time steps t and t-1 respectively, and c_t and c_{t-1} denote the cell states at time steps t and t-1 respectively.
Through the above conversion functions, the sentence vectors can be converted into the corresponding hidden vectors.
It should be understood that the sentence conversion layer of the present invention can also be implemented with other chain-structured neural networks;
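The Chain-LSTM recurrence above is the standard LSTM step applied to the sequence of sentence vectors. A NumPy sketch that runs the recurrence over xs_1~xs_M and returns the hidden vectors hs_1~hs_M (random parameters stand in for trained ones):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def chain_lstm(sentence_vectors, P):
    """Run the LSTM recurrence over sentence vectors xs_1..xs_M and return
    the hidden vectors hs_1..hs_M; hs_M serves as the document vector."""
    dh = P['bf'].shape[0]
    h, c, hs = np.zeros(dh), np.zeros(dh), []
    for s in sentence_vectors:               # one time step per sentence
        f = sigmoid(P['Wf'] @ s + P['Uf'] @ h + P['bf'])
        i = sigmoid(P['Wi'] @ s + P['Ui'] @ h + P['bi'])
        o = sigmoid(P['Wo'] @ s + P['Uo'] @ h + P['bo'])
        u = np.tanh(P['Wu'] @ s + P['Uu'] @ h + P['bu'])
        c = i * u + f * c                    # c_t = i . u_t + f . c_{t-1}
        h = o * np.tanh(c)                   # h_t = o . tanh(c_t)
        hs.append(h)
    return hs

def init_params(ds, dh, rng):
    """Random parameters (illustrative; trained values would be used)."""
    P = {}
    for g in 'fiou':
        P['W' + g] = rng.standard_normal((dh, ds)) * 0.1
        P['U' + g] = rng.standard_normal((dh, dh)) * 0.1
        P['b' + g] = np.zeros(dh)
    return P
```

In method embodiment two, the last element of the returned list (hs_M) is taken by the document synthesis layer as the document vector.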
(5) establishing the document synthesis layer, which takes the hidden vector hs_M of the end node of the chain-structured neural network CN as the document vector of the document to be classified;
the above sentence synthesis layer, sentence conversion layer and document synthesis layer together constitute the document vector synthesis layer of the text emotion classification model;
(6) constructing the output layer according to the dimension of the document vector and the number of emotion categories, which maps the document vector to an emotion probability distribution and normalizes it, so as to obtain the probability of the document to be classified in each emotion category;
for the specific implementation of establishing the output layer, refer to the description in method embodiment one above.
As shown in Fig. 6, in method embodiment three, when the document to be classified is a multi-sentence document, the text emotion classification model establishment method provided by the present invention comprises:
(1) performing sentence segmentation on the document to be classified to obtain M sentences S_1~S_M, and obtaining the word vectors of all words in each sentence;
for the specific implementation of obtaining the word vectors, refer to the description in method embodiment one above;
(2) performing syntactic analysis on the M segmented sentences S_1~S_M respectively to obtain their dependency syntax trees T_1~T_M;
establishing tree-structured neural networks TN_1~TN_M based on T_1~T_M respectively, so as to obtain a word conversion layer composed of TN_1~TN_M, which is used to convert the word vector of each word in S_1~S_M into a corresponding hidden vector; in this embodiment, the tree-structured neural network used is Tree-LSTM; it should be understood that other tree-structured neural networks can also be used;
(3) establishing the sentence synthesis layer, which takes the hidden vectors of the root nodes of the dependency syntax trees T_1~T_M as the sentence vectors xs_1~xs_M of sentences S_1~S_M respectively;
(4) establishing a chain-structured neural network CN from the sentence vectors xs_1~xs_M, so as to obtain a sentence conversion layer composed of the chain-structured neural network CN, which converts xs_1~xs_M into the corresponding hidden vectors hs_1~hs_M respectively; in an embodiment of the present invention, the chain-structured neural network used is Chain-LSTM; it should be understood that other chain-structured neural networks can also be used;
(5) each list under the theme probability distribution p (k | d) and each theme of document to be sorted is obtained using topic model TM
Probability p (the w of wordmn|k);
According to theme probability distribution p (k | d) and Probability p (wmn| k) calculate separately the theme probability distribution of each sentence;Wherein,
M-th of sentence smTheme probability distribution p (k | sm) are as follows:
Wherein, NmIndicate sentence smIn word number;
Calculating theme probability distribution p (k | sm) and theme probability distribution p (k | d) between cosine similarity sim (sm,d)
Are as follows:
Wherein, IR indicates that information radius, KL indicate KL distance;
By softmax-normalizing sim(sm, d) over the document, the attention weight of the sentence sm is obtained;
a document synthesis layer is established according to the attention weights of the sentences, which is used to weight and sum the hidden vectors of the sentences to obtain the document vector of the document to be classified;
the sentence synthesis layer, the sentence conversion layer and the document synthesis layer described above together constitute the document vector synthesis layer of the text emotion classification model;
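A minimal sketch of the sentence-attention computation of step (5). The similarity form sim(sm, d) = 10^(-IR), with IR the information radius, is an assumption consistent with the IR/KL quantities named above; the topic distributions and hidden vectors are toy values:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def kl(p, q):
    # KL divergence; terms with p_i = 0 contribute nothing
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def info_radius(p, q):
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

# toy topic distributions of two sentences and of the whole document
p_s = [[0.7, 0.2, 0.1], [0.2, 0.3, 0.5]]
p_d = [0.3, 0.3, 0.4]

# assumed similarity: larger when the sentence's topics agree with the document's
sims = [10 ** (-info_radius(ps, p_d)) for ps in p_s]
alphas = softmax(sims)  # sentence attention weights, normalized over the document

# weighted sum of the sentence hidden vectors hs1~hsM gives the document vector
hs = [[0.1, -0.2], [0.4, 0.3]]
doc_vec = [sum(a * h[j] for a, h in zip(alphas, hs)) for j in range(len(hs[0]))]
print(doc_vec)
```

The second toy sentence is topically closer to the document, so it receives the larger attention weight and dominates the document vector.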
(6) An output layer is constructed according to the dimension of the document vector and the number of emotion categories, which is used to map the document vector to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category;
for the specific manner of establishing the output layer, reference may be made to the description in Method Embodiment 1 above.
For different documents, the specific model structure established differs: for a single-sentence document, only the tree-structured neural network is used, without the chain-structured neural network; for a multi-sentence document, both the tree-structured neural network and the chain-structured neural network are used. Thus, an excessively slow model training speed can be avoided while effectively improving the accuracy of text emotion classification.
For a multi-sentence document, during text emotion classification, the document vector is synthesized using only the attention weights of the words or only the attention weights of the sentences, which avoids an excessively slow model training speed while effectively improving the accuracy of text emotion classification.
In the above method embodiments, the normalization applied by the output layer to the emotion probability distribution is softmax normalization. Softmax normalization balances the probability distribution and avoids probabilities of exactly zero, so that the model does not require additional smoothing.
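A minimal sketch of such an output layer: a linear map from the document vector to one score per emotion category (six in the application example below), followed by softmax normalization. The weight values are random placeholders, not trained parameters:

```python
import math
import random

random.seed(1)

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    es = [math.exp(v - m) for v in z]
    s = sum(es)
    return [e / s for e in es]

K = 6  # number of emotion categories
doc_vec = [0.2, -0.1, 0.4, 0.0]
W = [[random.uniform(-0.5, 0.5) for _ in doc_vec] for _ in range(K)]
b = [0.0] * K

# output layer: map the document vector to K scores, then softmax-normalize
scores = [sum(wj * xj for wj, xj in zip(row, doc_vec)) + bi
          for row, bi in zip(W, b)]
probs = softmax(scores)
print(probs)
```

Because every softmax output is strictly positive and the outputs sum to one, no additional smoothing of the distribution is needed, as noted above.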
Based on the above method embodiments, the present invention further provides a text emotion classification method for classifying the emotion of a document to be classified. As shown in Fig. 7, the method comprises: for a document to be classified, establishing a text emotion classification model using trained node parameters according to the text emotion classification model establishment method provided by the first aspect of the present invention, and predicting the probability of the document to be classified in each emotion category using the established text emotion classification model, thereby completing the text emotion classification of the document to be classified.
In an optional embodiment of the above text emotion classification method, the trained node parameters are obtained through model training, and the method of obtaining the node parameters specifically comprises:
(S1) dividing a corpus into a training set and a test set, and proceeding to step (S4); the emotion category of each document in the corpus is known;
(S2) for a document Di, establishing a text emotion classification model using the text emotion classification model establishment method provided by the first aspect of the present invention, and predicting the probability of the document in each emotion category using the text emotion classification model;
(S3) adjusting the node parameters of the text emotion classification model using the BP algorithm according to the emotion category and the prediction result of the document Di, so as to minimize a pre-constructed loss function;
In the embodiment of the present invention, the pre-constructed loss function is a KL-divergence loss function constructed from the predicted emotion probability distribution ŷ and the true emotion probability distribution y, i.e., loss(y, ŷ) = Σi yi·log(yi/ŷi);
where ŷ is the prediction result, ŷi is the predicted probability of the document in the i-th emotion category, and yi is the true probability of the document in the i-th emotion category;
(S4) performing steps (S2)-(S3) on the documents in the training set in turn, so as to complete the training of the text emotion classification model and obtain the trained node parameters;
wherein i is the document index.
The method of obtaining the node parameters further comprises: performing step (S2) on the documents in the test set in turn according to the trained node parameters, and testing the text emotion classification model according to the emotion categories and prediction results of the documents, so as to complete the verification of the node parameters.
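The KL-divergence loss minimized in step (S3) can be sketched as follows; the vote-derived distributions are toy values (not data from Table 1), and the eps guard against log(0) is an implementation assumption:

```python
import math

def kl_loss(y_true, y_pred, eps=1e-12):
    """KL(y_true || y_pred) = sum_i y_i * log(y_i / yhat_i); it is zero
    exactly when the predicted emotion distribution matches the true one."""
    return sum(y * math.log(y / max(p, eps))
               for y, p in zip(y_true, y_pred) if y > 0)

# toy true distribution over six emotion labels, e.g. from reader votes
y     = [0.5, 0.1, 0.2, 0.1, 0.05, 0.05]
# toy model prediction for the same document
y_hat = [0.4, 0.15, 0.2, 0.1, 0.1, 0.05]

loss = kl_loss(y, y_hat)
print(loss)  # training would adjust the node parameters to drive this down
```

In the patent's training loop this scalar would be backpropagated through the output, synthesis and conversion layers to update the node parameters.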
Application example
Using the Sina News dataset as the training corpus, the model is trained to obtain the trained node parameters. The tree-structured neural network uses a Tree-LSTM network, and the chain-structured neural network uses a Chain-LSTM network; the word vector model uses a Word2vec model trained on the Chinese Wikipedia corpus; the topic model uses an LDA topic model trained on the entire Sina News dataset. The Sina News dataset was collected from the Society channel of Sina News (https://news.sina.com.cn/) and contains a total of 5,258 hot news items published from January to December 2016. Each news item carries six emotion labels voted on by readers, namely: touched, angry, sympathetic, sad, surprised and novel. Table 1 shows the detailed voting statistics, where the training set consists of the first 3,109 items of the dataset and the test set consists of the last 2,149 items.
Table 1
Table 2 shows the settings of the relevant parameters in the document emotion classification process.
Table 2
Fig. 8 shows, in the form of word clouds, the word probability distributions of some of the topics learned by the LDA topic model, where a larger word indicates a higher probability of occurrence. It can be seen from the figure that topic 3 is mainly related to "express delivery", "hospital", "telephone", etc.; topic 9 is mainly related to "volunteer", "hospital", "Japan", etc.; and topic 24 is mainly related to "school", "teacher", "student", etc.
Fig. 9 shows the topic probability distribution of a certain document in the Sina News dataset. The original text of the document reads: "Parents receive a notice that they must pay a thousand yuan to express sympathy to teachers; the school side claims this has nothing to do with the school... This summer, the school organized make-up classes for senior-year high school students for a period of time, and after the new term began there were also tutoring sessions on weekends and after the 10 p.m. evening self-study... The reporter showed the picture, taken on the morning of the 12th, of parents openly collecting money in classroom 402 of City No. 2 Middle School; the grade-group leader surnamed Xiao said that the parents' money-collecting activity that day was eventually dispersed. Zhu Niansheng and Wu Huijun explained that some parents may not have heeded the 'greeting'. Zhu Niansheng stated frankly: 'If any parent has paid the money, they should ask whoever they gave it to for it back. No teacher of this school will accept the money.' On the afternoon of the 14th, the parent surnamed Shao promised to communicate with the other members of the parents' committee after returning, and to return all the money already collected to the parents."
As can be seen from Fig. 9, the topics of this document are mainly distributed over topic 9 and topic 24. Combined with Fig. 8, it is easy to see that the content of this document is mainly related to "student", "school", "teacher" and "parent", which is consistent with the direct impression obtained by reading the document. This shows that the LDA topic model can effectively extract the topic information of a document, which also plays a key role in fusing topic information into the text emotion classification process.
Fig. 10 shows the words with higher attention weights in a certain document. As can be seen from Fig. 10, by fusing topic information, the system effectively captures important topic words in the document, such as "baffling" and "downhearted". This also, to some extent, explains the superiority of the method proposed by the present invention.
To verify that the present invention can effectively improve the accuracy of text emotion classification, experiments on text emotion classification accuracy were conducted using the following six models (a)-(f). Models (a)-(f) are either the text emotion classification models provided by the embodiments of the present invention, or models formed by slight modifications of the text emotion classification models provided by the embodiments of the present invention. Models (a)-(f) are:
(a) the model established by Method Embodiment 2 shown in Fig. 4;
(b) the model established by Method Embodiment 3 shown in Fig. 6;
(c) on the basis of model (a), the sentence synthesis layer is modified so that it directly takes the hidden vector of the root node of each Tree-LSTM in the word conversion layer as the sentence vector of the corresponding sentence, with the remaining structure unchanged;
(d) on the basis of model (a), the word conversion layer is modified by replacing the Tree-LSTM in the word conversion layer with Chain-LSTM, and the sentence vector is obtained by weighting and summing the hidden vectors of the nodes of the Chain-LSTM with the attention weights of the corresponding words, with the remaining structure unchanged;
(e) on the basis of model (b), the word conversion layer is modified by replacing the Tree-LSTM in the word conversion layer with Chain-LSTM, and the sentence vector is the hidden vector of the last node of the corresponding Chain-LSTM network, with the remaining structure unchanged;
(f) on the basis of model (c), the word conversion layer is modified by replacing the Tree-LSTM in the word conversion layer with Chain-LSTM, and the sentence vector is the hidden vector of the last node of the corresponding Chain-LSTM network, with the remaining structure unchanged.
In addition, the two models that have performed best on this Sina News dataset to date are the Social Opinion Mining model and the Emotion-topic model, denoted model (g) and model (h) respectively.
Table 3 shows the emotion classification accuracies of models (a)-(h). As can be seen from the table, model (b) has higher emotion classification performance than the other models, and has a particularly clear advantage over the existing best models (g and h). Comparing models (a) and (b) with model (c), or models (d) and (e) with model (f), the results prove that fusing topic information can improve emotion classification accuracy. Although scheme (a) is slightly inferior to scheme (d), the gap is small, and on the whole the schemes using tree-structured syntactic neural networks outperform those using chain-structured neural networks (e.g., scheme b is better than e, and c is better than f); it can therefore be said that tree-structured syntactic neural networks are more effective than chain-structured neural networks for improving the accuracy of text emotion classification.
Model | Accuracy |
---|---|
a | 63.64% |
b | 65.50% |
c | 62.57% |
d | 63.73% |
e | 63.31% |
f | 61.69% |
g | 58.59% |
h | 54.19% |
Table 3
It will be readily understood by those skilled in the art that the foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the present invention; any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (8)
1. A method for establishing a text emotion classification model, the text emotion classification model being used to predict the probability of a document to be classified in each emotion category, characterized by comprising:
(1) performing sentence segmentation on the document to be classified, and obtaining the word vectors of all words in each sentence respectively;
(2) establishing a word conversion layer, a document vector synthesis layer and an output layer in turn according to the sentence segmentation result, so as to complete the establishment of the text emotion classification model;
wherein the word conversion layer comprises M tree-structured neural networks, in one-to-one correspondence with the M sentences obtained by segmentation, which are respectively used to convert the word vectors of the words in the sentences into hidden vectors; the document vector synthesis layer is used to obtain the document vector of the document to be classified according to the hidden vectors of the words converted by the word conversion layer; and the output layer is used to map the document vector obtained by the document vector synthesis layer to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category.
2. The method for establishing a text emotion classification model according to claim 1, characterized in that, if the number of sentences obtained by segmentation is M=1, the step (2) comprises:
performing syntactic analysis on the sentence S obtained by segmentation, to obtain the dependency syntax tree T of the sentence S; establishing a tree-structured neural network TN based on the dependency syntax tree T, to obtain the word conversion layer composed of the tree-structured neural network TN, which is used to convert the word vector of each word in the sentence S into a corresponding hidden vector;
obtaining the topic probability distribution p(k|d) of the document to be classified and the topic probability distribution p(k|wmn) of each word in the document to be classified using a trained topic model TM, and calculating the attention weight of each word in the document to be classified according to the topic probability distribution p(k|d) and the topic probability distribution p(k|wmn); establishing the document vector synthesis layer according to the attention weights of the words, which is used to weight and sum the hidden vectors of the words to obtain the document vector of the document to be classified;
constructing the output layer according to the dimension of the document vector and the number of emotion categories, which is used to map the document vector to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category;
wherein d is the document to be classified, k is the topic index, m is the sentence index, n is the word index, and wmn denotes the n-th word in the m-th sentence.
3. The method for establishing a text emotion classification model according to claim 2, characterized in that, if the number of sentences obtained by segmentation is M>1, the document vector synthesis layer comprises, in order, a sentence synthesis layer, a sentence conversion layer and a document synthesis layer, and the step (2) comprises:
performing syntactic analysis on each of the M sentences S1~SM obtained by segmentation, to obtain the dependency syntax trees T1~TM of the sentences S1~SM; establishing tree-structured neural networks TN1~TNM based on the dependency syntax trees T1~TM respectively, to obtain the word conversion layer composed of the tree-structured neural networks TN1~TNM, which is used to convert the word vector of each word in the sentences S1~SM into a corresponding hidden vector;
obtaining the topic probability distribution p(k|d) of the document to be classified and the topic probability distribution p(k|wmn) of each word in the document to be classified using the topic model TM, and calculating the attention weight of each word in the document to be classified according to the topic probability distribution p(k|d) and the topic probability distribution p(k|wmn); establishing the sentence synthesis layer according to the attention weights of the words, which is used to weight and sum the hidden vectors of the words to obtain the sentence vectors xs1~xsM of the sentences S1~SM respectively;
establishing a chain-structured neural network CN according to the sentence vectors xs1~xsM, to obtain the sentence conversion layer composed of the chain-structured neural network CN, which is used to convert the sentence vectors xs1~xsM into corresponding hidden vectors hs1~hsM respectively;
establishing the document synthesis layer, which is used to take the hidden vector hsM of the last node of the chain-structured neural network CN as the document vector of the document to be classified;
constructing the output layer according to the dimension of the document vector and the number of emotion categories, which is used to map the document vector to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category.
4. The method for establishing a text emotion classification model according to claim 2, characterized in that, if the number of sentences obtained by segmentation is M>1, the document vector synthesis layer comprises, in order, a sentence synthesis layer, a sentence conversion layer and a document synthesis layer, and the step (2) comprises:
performing syntactic analysis on each of the M sentences S1~SM obtained by segmentation, to obtain the dependency syntax trees T1~TM of the sentences S1~SM, and establishing tree-structured neural networks TN1~TNM based on the dependency syntax trees T1~TM respectively, to obtain the word conversion layer composed of the tree-structured neural networks TN1~TNM, which is used to convert the word vector of each word in the sentences S1~SM into a corresponding hidden vector;
establishing the sentence synthesis layer, which is used to take the hidden vectors of the root nodes of the dependency syntax trees T1~TM as the sentence vectors xs1~xsM of the sentences S1~SM respectively;
establishing a chain-structured neural network CN according to the sentence vectors xs1~xsM, to obtain the sentence conversion layer composed of the chain-structured neural network CN, which is used to convert the sentence vectors xs1~xsM into corresponding hidden vectors hs1~hsM respectively;
obtaining the topic probability distribution p(k|d) of the document to be classified and the probability p(wmn|k) of each word under each topic using the topic model TM, and calculating the attention weight of each sentence in the document to be classified according to the topic probability distribution p(k|d) and the probabilities p(wmn|k); establishing the document synthesis layer according to the attention weights of the sentences, which is used to weight and sum the hidden vectors of the sentences to obtain the document vector of the document to be classified;
constructing the output layer according to the dimension of the document vector and the number of emotion categories, which is used to map the document vector to an emotion probability distribution and normalize it, so as to obtain the probability of the document to be classified in each emotion category.
5. The method for establishing a text emotion classification model according to claim 1, characterized in that the normalization applied by the output layer to the emotion probability distribution is softmax normalization.
6. A text emotion classification method, characterized by comprising: for a document to be classified, establishing a text emotion classification model using trained node parameters according to the text emotion classification model establishment method according to any one of claims 1-5, and predicting the probability of the document to be classified in each emotion category using the established text emotion classification model, so as to complete the text emotion classification of the document to be classified.
7. The text emotion classification method according to claim 6, characterized in that the method of obtaining the node parameters comprises:
(S1) dividing a corpus into a training set and a test set, and proceeding to step (S4); the emotion category of each document in the corpus is known;
(S2) for a document Di, establishing a text emotion classification model using the text emotion classification model establishment method according to any one of claims 1-5, and predicting the probability of the document Di in each emotion category using the established text emotion classification model;
(S3) adjusting the node parameters of the text emotion classification model according to the emotion category and the prediction result of the document Di, so as to minimize a pre-constructed loss function;
(S4) performing steps (S2)-(S3) on the documents in the training set in turn, so as to complete the training of the text emotion classification model and obtain the trained node parameters;
wherein i is the document index.
8. The text emotion classification method according to claim 7, characterized in that the method of obtaining the node parameters further comprises: performing step (S2) on the documents in the test set in turn according to the trained node parameters, and testing the text emotion classification model according to the emotion categories and prediction results of the documents, so as to complete the verification of the node parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811492975.5A CN109446331B (en) | 2018-12-07 | 2018-12-07 | Text emotion classification model establishing method and text emotion classification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109446331A true CN109446331A (en) | 2019-03-08 |
CN109446331B CN109446331B (en) | 2021-03-26 |
Family
ID=65557726
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811492975.5A Active CN109446331B (en) | 2018-12-07 | 2018-12-07 | Text emotion classification model establishing method and text emotion classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109446331B (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110162787A (en) * | 2019-05-05 | 2019-08-23 | 西安交通大学 | A kind of class prediction method and device based on subject information |
CN110309306A (en) * | 2019-06-19 | 2019-10-08 | 淮阴工学院 | A kind of Document Modeling classification method based on WSD level memory network |
CN110543563A (en) * | 2019-08-20 | 2019-12-06 | 暨南大学 | Hierarchical text classification method and system |
CN110704626A (en) * | 2019-09-30 | 2020-01-17 | 北京邮电大学 | Short text classification method and device |
CN110704715A (en) * | 2019-10-18 | 2020-01-17 | 南京航空航天大学 | Network overlord ice detection method and system |
CN111259142A (en) * | 2020-01-14 | 2020-06-09 | 华南师范大学 | Specific target emotion classification method based on attention coding and graph convolution network |
CN111312403A (en) * | 2020-01-21 | 2020-06-19 | 山东师范大学 | Disease prediction system, device and medium based on instance and feature sharing cascade |
CN111339440A (en) * | 2020-02-19 | 2020-06-26 | 东南大学 | Social emotion ordering method for news text based on hierarchical state neural network |
CN111949790A (en) * | 2020-07-20 | 2020-11-17 | 重庆邮电大学 | Emotion classification method based on LDA topic model and hierarchical neural network |
CN112417889A (en) * | 2020-11-25 | 2021-02-26 | 重庆文理学院 | Client product comment emotion analysis method based on machine learning |
CN112836017A (en) * | 2021-02-09 | 2021-05-25 | 天津大学 | Event detection method based on hierarchical theme-driven self-attention mechanism |
WO2021169364A1 (en) * | 2020-09-23 | 2021-09-02 | 平安科技(深圳)有限公司 | Semantic emotion analysis method and apparatus, device, and storage medium |
CN113869034A (en) * | 2021-09-29 | 2021-12-31 | 重庆理工大学 | Aspect emotion classification method based on reinforced dependency graph |
CN117635785A (en) * | 2024-01-24 | 2024-03-01 | 卓世科技(海南)有限公司 | Method and system for generating worker protection digital person |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160180215A1 (en) * | 2014-12-19 | 2016-06-23 | Google Inc. | Generating parse trees of text segments using neural networks |
CN105930503A (en) * | 2016-05-09 | 2016-09-07 | 清华大学 | Combination feature vector and deep learning based sentiment classification method and device |
US20170192956A1 (en) * | 2015-12-31 | 2017-07-06 | Google Inc. | Generating parse trees of text segments using neural networks |
CN107066446A (en) * | 2017-04-13 | 2017-08-18 | 广东工业大学 | A kind of Recognition with Recurrent Neural Network text emotion analysis method of embedded logic rules |
CN107578092A (en) * | 2017-09-01 | 2018-01-12 | 广州智慧城市发展研究院 | A kind of emotion compounding analysis method and system based on mood and opinion mining |
CN107944014A (en) * | 2017-12-11 | 2018-04-20 | 河海大学 | A kind of Chinese text sentiment analysis method based on deep learning |
US20180174036A1 (en) * | 2016-12-15 | 2018-06-21 | DeePhi Technology Co., Ltd. | Hardware Accelerator for Compressed LSTM |
CN108399158A (en) * | 2018-02-05 | 2018-08-14 | 华南理工大学 | Attribute sensibility classification method based on dependency tree and attention mechanism |
CN108664632A (en) * | 2018-05-15 | 2018-10-16 | 华南理工大学 | A kind of text emotion sorting algorithm based on convolutional neural networks and attention mechanism |
CN108763284A (en) * | 2018-04-13 | 2018-11-06 | 华南理工大学 | A kind of question answering system implementation method based on deep learning and topic model |
CN108804417A (en) * | 2018-05-21 | 2018-11-13 | 山东科技大学 | A kind of documentation level sentiment analysis method based on specific area emotion word |
Non-Patent Citations (2)
Title |
---|
LIU CHEN: "Tree-LSTM Guided Attention Pooling of DCNN for Semantic Sentence Modeling", 《SPRINGER LINK》 * |
WANG Wenkai et al.: "Microblog sentiment analysis based on convolutional neural network and Tree-LSTM", Application Research of Computers * |
Also Published As
Publication number | Publication date |
---|---|
CN109446331B (en) | 2021-03-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109446331A (en) | A kind of text mood disaggregated model method for building up and text mood classification method | |
Abdullah et al. | SEDAT: sentiment and emotion detection in Arabic text using CNN-LSTM deep learning | |
CN109684440B (en) | Address similarity measurement method based on hierarchical annotation | |
CN103207855B (en) | For the fine granularity sentiment analysis system and method for product review information | |
CN108984724A (en) | It indicates to improve particular community emotional semantic classification accuracy rate method using higher-dimension | |
CN103605729B (en) | A kind of method based on local random lexical density model POI Chinese Text Categorizations | |
CN105895087A (en) | Voice recognition method and apparatus | |
Ognyanova et al. | A multitheoretical, multilevel, multidimensional network model of the media system: Production, content, and audiences | |
CN103823890B (en) | A kind of microblog hot topic detection method for special group and device | |
CN111401040B (en) | Keyword extraction method suitable for word text | |
CN107609185A (en) | Method, apparatus, equipment and computer-readable recording medium for POI Similarity Measure | |
CN106897262A (en) | A kind of file classification method and device and treating method and apparatus | |
Black et al. | Academisation of schools in England and placements of pupils with special educational needs: an analysis of trends, 2011–2017 | |
CN109448703A (en) | In conjunction with the audio scene recognition method and system of deep neural network and topic model | |
CN106682089A (en) | RNNs-based method for automatic safety checking of short message | |
CN108052625A (en) | A kind of entity sophisticated category method | |
CN110263822A (en) | A kind of Image emotional semantic analysis method based on multi-task learning mode | |
CN110825850B (en) | Natural language theme classification method and device | |
CN101556582A (en) | System for analyzing and predicting netizen interest in forum | |
CN105574633A (en) | College teacher and student knowledge sharing platform based on KNN | |
Liu et al. | A BERT-based ensemble model for Chinese news topic prediction | |
CN105869058A (en) | Method for user portrait extraction based on multilayer latent variable model | |
Perez-Encinas et al. | Geographies and cultures of international student experiences in higher education: Shared perspectives between students from different countries | |
Stemle et al. | Using language learner data for metaphor detection | |
Wang | Application of C4. 5 decision tree algorithm for evaluating the college music education |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||