CN107153642A - Analysis method for recognizing the sentiment orientation of text comments based on a neural network - Google Patents
Analysis method for recognizing the sentiment orientation of text comments based on a neural network
- Publication number
- CN107153642A CN201710342178.8A
- Authority
- CN
- China
- Prior art keywords
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses an analysis method for recognizing the sentiment orientation of text comments based on a neural network, belonging to the technical field of computer language and text processing. In processing text-comment data, each sentence is first accurately segmented into individual words or characters using CBOW processing, and each sentence has a corresponding class label. The sentiment tendency of each comment is then discriminated with a long short-term memory (LSTM) model, yielding a label for each sentence; the labels are compared with the true labels to obtain the accuracy, and the neural network model is trained to its best accuracy, thus achieving the purpose of analyzing the sentiment orientation of text comments by neural network recognition. Accelerating the training of the network with a GPU not only improves the accuracy of sentiment classification but also speeds up training on large-scale corpora. The method can effectively recognize the sentiment orientation of comments and has particularly good application prospects in fields such as e-commerce and film.
Description
Technical field
The invention belongs to the technical field of computer language and text processing, and more particularly relates to an analysis method for recognizing the sentiment orientation of text comments based on a neural network.
Background technology
With the continuing development and growing popularity of the computer Internet, people urgently need to manage the increasingly abundant text-comment resources on the network and to achieve effective, accurate sentiment classification of massive text-comment resources. Traditional text sentiment classification is treated as a text-categorization task and is far from able to judge the sentiment tendency of text comments accurately.
Sentiment analysis accurately classifies the emotion expressed by text comments and helps people conveniently and efficiently recognize the sentiment orientation of a comment. For current methods of sentiment classification of text comments, see: [1] patent application No. 201410602800.0, "A text sentiment analysis method and device based on support vector machines"; [2] patent application No. 201510452024.5, "A Chinese text sentiment analysis method based on computerized information processing technology"; [3] patent application No. 201410224628.X, "A network text sentiment analysis method based on sentiment values". These documents mostly construct text features and their weights and then perform text sentiment recognition with some classification algorithm.
Handling text-comment data with the above methods has shortcomings. For example, text comments are typically short, and the above methods use only word features, ignoring the important role of word order in sentiment classification. These methods therefore cannot recognize the sentiment tendency of text comments well.
The content of the invention
The purpose of the present invention is to propose an analysis method for recognizing the sentiment orientation of text comments based on a neural network, characterized in that, in processing text-comment data, the continuous bag-of-words CBOW model (Continuous Bag-of-Words Model) is used as the method for training word vectors, and the long short-term memory LSTM model (Long Short-Term Memory, LSTM) is then used to discriminate the sentiment tendency of comments. The specific steps are as follows:
Step 1: corpus preprocessing. Each sentence is accurately segmented into individual words or characters. Each sentence has a corresponding class label, i.e. 0, 1, 2, representing negative, neutral, and positive respectively. Each class label needs to be converted into a three-dimensional vector here: 0 is converted to [1 0 0], 1 to [0 1 0], and 2 to [0 0 1]. The purpose of this conversion is to contrast against the label obtained for each sentence after training.
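To make this preprocessing concrete, the following is a minimal sketch in Python; the jieba segmenter and the toy corpus are illustrative assumptions (the patent does not name a specific segmentation tool), while the one-hot mapping is exactly the conversion described above.

```python
import jieba  # Chinese word segmentation (an assumed tool choice)

# Hypothetical labelled corpus: (comment text, class label) with 0=negative, 1=neutral, 2=positive
corpus = [("这部电影太好看了", 2), ("物流很慢", 0), ("还可以吧", 1)]

# One-hot conversion exactly as described: 0 -> [1 0 0], 1 -> [0 1 0], 2 -> [0 0 1]
ONE_HOT = {0: [1, 0, 0], 1: [0, 1, 0], 2: [0, 0, 1]}

sentences, labels = [], []
for text, label in corpus:
    sentences.append(jieba.lcut(text))  # accurate segmentation into individual words
    labels.append(ONE_HOT[label])       # three-dimensional label vector for the later contrast
```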
Step 2: word-vector training. The words obtained after segmenting the text-comment corpus are trained with CBOW, yielding the vector corresponding to each word; the dimension of the vectors can be configured as needed.
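A minimal training sketch, assuming the gensim library's Word2Vec implementation: `sg=0` selects CBOW and `hs=1` enables the hierarchical softmax derived later in this description; the corpus and vector dimension are illustrative.

```python
from gensim.models import Word2Vec

# Toy segmented corpus; in practice this is the output of the segmentation step above
sentences = [["这部", "电影", "太", "好看", "了"], ["物流", "很", "慢"], ["还", "可以", "吧"]]

# sg=0 selects CBOW; hs=1 with negative=0 selects hierarchical softmax; dimension set via vector_size
w2v = Word2Vec(sentences, sg=0, hs=1, negative=0, vector_size=100, window=5, min_count=1)

vec = w2v.wv["电影"]  # the 100-dimensional vector corresponding to one word
```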
Step 3: LSTM training. The present invention uses an LSTM as the classification model, with the sentiment-labelled corpus as the training set; after the processing of step 1 and step 2, the problem is converted into the problem of training a neural-network classification model.
Assume a sentence $l$ contains $m$ words, with corresponding word vectors $V = \{v_1, v_2, \ldots, v_m\}$; the sentence $l$ is then represented by $V$, and the vectors corresponding to the words of each sentence are fed directly into a recurrent neural network for processing. The recurrent network uses the conventional model, the long short-term memory LSTM model: each word of a sentence corresponds to one LSTM cell in the recurrent neural network, i.e. to one of the words in the training of the actual sentence, and all the LSTM cells are connected in sequence according to the position relationship of the words to form a chain structure, so that training can be carried out. The word vectors $V = \{v_1, v_2, \ldots, v_m\}$ of the sentence serve in turn as the input value $x_t$ of each LSTM cell, and the output $h_t$ of the last LSTM cell of each sentence is taken as the three-dimensional vector output of that sentence. The output value of each sentence is then used as the input of the Softmax function, defined as
$$P(y = i \mid x) = \frac{e^{x_i}}{\sum_{j=1}^{k} e^{x_j}}$$
This function is a probability-distribution function and yields three probability values whose sum is 1. In the formula, $e^{x_i}$ computes the value of each class and $\sum_{j=1}^{k} e^{x_j}$ computes the sum of the values of the $k$ classes. With three class labels, the probability of the first class is $e^{x_1} / \sum_{j=1}^{3} e^{x_j}$, the probability of the second class is $e^{x_2} / \sum_{j=1}^{3} e^{x_j}$, and the probability of the third class is $e^{x_3} / \sum_{j=1}^{3} e^{x_j}$. The probability values of the classes are compared, the class of maximum probability is taken as the sentiment category of the sentence, and the maximum class-probability value determines the label of each sentence. The labels are then compared with the true labels to obtain the accuracy; by training the neural network model to its best accuracy, while also optimizing its parameters, the purpose of recognizing the sentiment orientation of text comments with a neural network is achieved.
In the above formulas: $V$ denotes the $m$ word vectors corresponding to the $m$ words of a sentence, $v_1$ denotes the word vector of the first word, $v_2$ the word vector of the second word, and so on. $x_t$ is the input value of the LSTM cell at time $t$, i.e. the word vector of the $t$-th word serves as the input of the LSTM cell; the word vector $v_1$ of the first word serves as the first input value $x_1$. $h_t$ is the output of the LSTM cell and is composed of two parts: first a sigmoid layer yields an initial output, then the cell state $C_t$ is scaled to between -1 and 1 with tanh, and the result is multiplied by the sigmoid output to give the output of the model.
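The classification step can be sketched with the Keras API as follows; this is an assumed implementation with illustrative sizes, in which an embedding layer stands in for the CBOW vectors of step 2, a single LSTM chain returns the last output $h_t$, and a 3-way softmax layer produces the label probabilities whose accuracy against the true labels is tracked.

```python
import numpy as np
import tensorflow as tf

vocab_size, dim, max_len, num_classes = 5000, 100, 50, 3  # illustrative sizes

model = tf.keras.Sequential([
    # In practice the embedding weights would be initialized from the CBOW vectors of step 2
    tf.keras.layers.Embedding(vocab_size, dim),
    tf.keras.layers.LSTM(128),                                 # chain of LSTM cells; returns the last h_t
    tf.keras.layers.Dense(num_classes, activation="softmax"),  # three probability values summing to 1
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# X: integer word indices padded to max_len; Y: the one-hot labels from step 1
X = np.random.randint(0, vocab_size, size=(32, max_len))                                   # placeholder batch
Y = tf.keras.utils.to_categorical(np.random.randint(0, num_classes, size=32), num_classes)
model.fit(X, Y, epochs=1, verbose=0)
```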
The basic idea and steps of the word-vector training of step 2 are as follows. A language model formally describes a character string $S$ of $T$ words as the probability of natural language $P(w_1, w_2, w_3, \ldots, w_T)$, where $w_1$ to $w_T$ denote in turn each word of the sentence, i.e. the following reasoning:
$$P(s) = P(w_1, w_2, \ldots, w_T) = P(w_1) P(w_2 \mid w_1) P(w_3 \mid w_1, w_2) \cdots P(w_T \mid w_1, w_2, w_3, \ldots, w_{T-1})$$
That is, once the first word is determined, each later word's probability of occurrence is conditioned on the words appearing before it. (For example, the sentence "everybody likes eating apples" yields four words after segmentation, "everybody", "likes", "eating", "apples", and the natural-language probability of the sentence is P(everybody, likes, eating, apples) = P(everybody) P(likes | everybody) P(eating | everybody, likes) P(apples | everybody, likes, eating).) Each probability can be obtained separately, and the above formula simplifies to:
$$P(s) = \prod_{i=1}^{T} P(w_i \mid Context_i)$$
When $Context_i$ is empty, $P(w_i \mid Context_i)$ is simply $P(w_i)$ itself.
The core of the CBOW model lies in the gradient computation, and its key technique is Hierarchical Softmax, which requires some knowledge of Huffman trees. Each word of the dictionary serves as a leaf node of the Huffman tree. For some leaf node of the Huffman tree, assume it corresponds to the word $w$ in the dictionary; to ease the following computations, some notation is introduced:
(1) $p^w$: the path from the root node to the leaf node corresponding to $w$;
(2) $l^w$: the number of nodes contained in the path $p^w$;
(3) $p_1^w, p_2^w, \ldots, p_{l^w}^w$: the $l^w$ nodes of the path $p^w$, where $p_{l^w}^w$ denotes the node corresponding to the word $w$;
(4) $d_2^w, d_3^w, \ldots, d_{l^w}^w \in \{0, 1\}$: the Huffman code of the word $w$, where $d_j^w$ denotes the code corresponding to the $j$-th node in the path $p^w$;
(5) $\theta_1^w, \theta_2^w, \ldots, \theta_{l^w-1}^w$: the vectors corresponding to the non-leaf nodes in the path $p^w$, where $\theta_j^w$ denotes the vector corresponding to the $j$-th non-leaf node;
For any word $w$ in the dictionary, there exists a unique path $p^w$ in the Huffman tree from the root node to the node corresponding to $w$. The path $p^w$ contains $l^w - 1$ branches; regarding each branch as a binary classification, every classification produces a probability, and multiplying these probabilities together is exactly the required $P(w \mid Context(w))$.
The general formula of the conditional probability $P(w \mid Context(w))$ is written as:
$$P(w \mid Context(w)) = \prod_{j=2}^{l^w} P(d_j^w \mid X_w, \theta_{j-1}^w)$$
where:
$$P(d_j^w \mid X_w, \theta_{j-1}^w) = \begin{cases} \sigma(X_w^T \theta_{j-1}^w), & d_j^w = 0 \\ 1 - \sigma(X_w^T \theta_{j-1}^w), & d_j^w = 1 \end{cases}$$
Combining and rearranging according to the above formula gives:
$$P(d_j^w \mid X_w, \theta_{j-1}^w) = [\sigma(X_w^T \theta_{j-1}^w)]^{1-d_j^w} \cdot [1 - \sigma(X_w^T \theta_{j-1}^w)]^{d_j^w}$$
In the formula: $P(d_j^w \mid X_w, \theta_{j-1}^w)$ denotes the probability of each classification result on the way from the root node of the Huffman tree to the leaf node. According to logistic regression, the probability that a node is classified into the positive class is $\sigma(X_w^T \theta_{j-1}^w)$, and the probability that it is classified into the negative class is $1 - \sigma(X_w^T \theta_{j-1}^w)$; combining the two formulas gives exactly the above formula. $\theta$: the vector corresponding to a non-leaf node. $\sigma$: the sigmoid function, with formula $\sigma(x) = \frac{1}{1 + e^{-x}}$. $X_w$: the accumulated sum of the $2c$ vectors of the input layer, i.e. $X_w = \sum_{i=1}^{2c} v(Context(w)_i)$, where $2c$ means that the current word $w$ has $c$ words before it and $c$ words after it.
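The per-node binary classifications along the Huffman path multiply into $P(w \mid Context(w))$, as the following toy sketch illustrates (the path codes and node vectors are assumed to be given):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def huffman_path_prob(x_w, thetas, codes):
    """P(w | Context(w)) as the product of the binary decisions along the Huffman path.

    x_w    -- X_w, the summed context word vectors
    thetas -- theta_1^w ... theta_{l^w-1}^w, vectors of the non-leaf nodes on the path
    codes  -- d_2^w ... d_{l^w}^w, the Huffman code of w (0 = positive class, 1 = negative class)
    """
    prob = 1.0
    for theta, d in zip(thetas, codes):
        p_pos = sigmoid(x_w @ theta)              # sigma(X_w^T theta_{j-1}^w)
        prob *= p_pos if d == 0 else 1.0 - p_pos  # pick the branch probability for code d_j^w
    return prob

# Tiny usage example with random vectors (purely illustrative)
rng = np.random.default_rng(0)
print(huffman_path_prob(rng.normal(size=8), [rng.normal(size=8) for _ in range(3)], [0, 1, 0]))
```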
The objective function of a neural-network-based language model is usually taken to be the following log-likelihood function:
$$\Gamma = \sum_{w \in C} \log P(w \mid Context(w))$$
Substituting $P(w \mid Context(w))$ into the log-likelihood function $\Gamma$ gives:
$$\begin{aligned} \Gamma &= \sum_{w \in C} \log \prod_{j=2}^{l^w} \{[\sigma(X_w^T \theta_{j-1}^w)]^{1-d_j^w} \times [1 - \sigma(X_w^T \theta_{j-1}^w)]^{d_j^w}\} \\ &= \sum_{w \in C} \sum_{j=2}^{l^w} \{(1 - d_j^w) \cdot \log[\sigma(X_w^T \theta_{j-1}^w)] + d_j^w \cdot \log[1 - \sigma(X_w^T \theta_{j-1}^w)]\} \end{aligned}$$
For convenience of gradient derivation, the content inside the braces of the double summation above is denoted $\Gamma(w, j)$, i.e.:
$$\Gamma(w, j) = (1 - d_j^w) \cdot \log[\sigma(X_w^T \theta_{j-1}^w)] + d_j^w \cdot \log[1 - \sigma(X_w^T \theta_{j-1}^w)]$$
The above $\Gamma$ is then the objective function of the CBOW model; the next step is to optimize the objective function, using stochastic gradient ascent, i.e. to maximize the objective function.
The idea of stochastic gradient ascent is: each time a sample (Context(w), w) is taken, all the parameters in the objective function are refreshed once. The gradients of $\Gamma(w, j)$ with respect to these vectors are given first. The gradient of $\Gamma(w, j)$ with respect to $\theta_{j-1}^w$ is computed by taking the derivative with respect to $\theta_{j-1}^w$:
$$\frac{\partial \Gamma(w, j)}{\partial \theta_{j-1}^w} = [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] X_w$$
Then the update formula for $\theta_{j-1}^w$ can be written as:
$$\theta_{j-1}^w := \theta_{j-1}^w + \eta [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] X_w$$
where $\eta$ denotes the learning rate.
Next the gradient of $\Gamma(w, j)$ with respect to $X_w$ is computed; inspecting $\Gamma(w, j)$ shows that $\theta_{j-1}^w$ and $X_w$ play symmetric roles in it, so the derivation is the same as above:
$$\frac{\partial \Gamma(w, j)}{\partial X_w} = [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] \theta_{j-1}^w$$
The final goal is the word vector of each word in the dictionary, while $X_w$ here denotes the accumulation of the word vectors in Context(w); $\frac{\partial \Gamma(w, j)}{\partial X_w}$ is therefore used to update each $v(\tilde{w})$, $\tilde{w} \in Context(w)$:
$$v(\tilde{w}) := v(\tilde{w}) + \eta \sum_{j=2}^{l^w} \frac{\partial \Gamma(w, j)}{\partial X_w}, \quad \tilde{w} \in Context(w)$$
That is, $\frac{\partial \Gamma(w, j)}{\partial X_w}$ is contributed to each word vector in Context(w); equal contribution is used here, and thus the word vector of each word can be obtained.
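Combining the two gradients, one stochastic-gradient-ascent step of the CBOW model on a sample (Context(w), w) can be sketched as follows (a minimal NumPy version under the notation above; the learning rate and shapes are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbow_hs_step(context_vecs, thetas, codes, eta=0.025):
    """One stochastic-gradient-ascent refresh on a sample (Context(w), w).

    context_vecs -- word vectors v(w~) of the context words, updated in place
    thetas       -- node vectors theta_{j-1}^w along the Huffman path, updated in place
    codes        -- Huffman code d_j^w of w, j = 2 .. l^w
    """
    x_w = np.sum(context_vecs, axis=0)   # X_w: accumulation of the context word vectors
    e = np.zeros_like(x_w)               # accumulates eta * dGamma(w, j)/dX_w over the path
    for j in range(len(thetas)):
        g = eta * (1 - codes[j] - sigmoid(x_w @ thetas[j]))
        e += g * thetas[j]               # gradient w.r.t. X_w uses theta (by symmetry)
        thetas[j] = thetas[j] + g * x_w  # theta += eta*[1 - d - sigma(X^T theta)]*X_w
    for v in context_vecs:
        v += e                           # equal contribution to every context word vector
```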
The structure of the LSTM model is as follows:
(1) Forget gate layer: decides what information to discard from the cell state. The gate reads $h_{t-1}$ and $x_t$ and outputs a number between 0 and 1 for each entry in the cell state $C_{t-1}$, where 1 means "keep entirely" and 0 means "discard entirely". It is expressed as
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f);$$
(2) Candidate layer: decides what new information is stored in the cell state, and consists of two parts. First, a sigmoid layer called the "input gate layer" decides which values will be updated; second, a tanh layer creates a vector of new candidate values $\tilde{C}_t$ to be added to the state. The update to the state is produced from these two pieces of information, expressed as
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C);$$
(3) Cell-state update: the old cell state $C_{t-1}$ is updated to the new cell state $C_t$. The old state $C_{t-1}$ is multiplied by $f_t$, discarding the information determined to be discarded, and then $i_t * \tilde{C}_t$, the new candidate values, are added, scaled by how much each state value was decided to be updated. It is expressed as
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t;$$
(4) Output gate layer: finally, it must be determined what value to output. This output value is based on the cell state: a sigmoid layer is run to decide which parts of the cell state will be output; the cell state is then processed through the tanh function to obtain a value between -1 and 1 and multiplied by the output of the sigmoid. The final output is expressed as:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(C_t);$$
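The four gates above amount to the following single-time-step computation, shown here as a minimal NumPy sketch (the weight matrices act on the concatenation $[h_{t-1}, x_t]$; shapes are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, C_prev, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
    """One LSTM time step computing f_t, i_t, C~_t, C_t, o_t and h_t as defined above."""
    z = np.concatenate([h_prev, x_t])   # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)        # forget gate: what to discard from the cell state
    i_t = sigmoid(W_i @ z + b_i)        # input gate: which values to update
    C_tilde = np.tanh(W_C @ z + b_C)    # candidate values
    C_t = f_t * C_prev + i_t * C_tilde  # cell-state update
    o_t = sigmoid(W_o @ z + b_o)        # output gate: which parts of the state to output
    h_t = o_t * np.tanh(C_t)            # output, scaled to between -1 and 1 by tanh
    return h_t, C_t
```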
The above is the content of the LSTM model: each word of a sentence corresponds to one LSTM cell in the recurrent neural network, i.e. to one of the words in the training of the actual sentence, and all the LSTM cells are connected in sequence according to the position relationship of the words to form a chain structure, so that training can be carried out. The word vectors $V = \{v_1, v_2, \ldots, v_m\}$ of the sentence serve in turn as the input value $x_t$ of each LSTM cell, and the output $h_t$ of the last LSTM cell of each sentence is taken as the three-dimensional vector output of the sentence.
(5) The output value of each sentence is used as the input of the Softmax function, defined as $P(y = i \mid x) = e^{x_i} / \sum_{j=1}^{k} e^{x_j}$. This function is a probability-distribution function, and the three probability values sum to 1. The probability values of the classes are compared, the class of maximum probability is taken as the sentiment category of the sentence, and the label of each sentence is obtained. The labels are then compared with the true labels to obtain the accuracy; by training the neural network model to its best accuracy, while also optimizing its parameters, the purpose of recognizing the sentiment orientation of text comments with a neural network is achieved.
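The final Softmax, label assignment, and accuracy comparison described above can be sketched as follows (the logits standing in for the last-cell outputs $h_t$ and the true labels are hypothetical):

```python
import numpy as np

def softmax(x):
    ex = np.exp(x - np.max(x))  # shift for numerical stability; the ratios are unchanged
    return ex / ex.sum()

logits = np.array([[2.1, 0.3, -1.0], [0.1, 0.2, 1.5]])  # hypothetical last-cell outputs, one row per sentence
probs = np.array([softmax(row) for row in logits])       # three probabilities per sentence, summing to 1
pred = probs.argmax(axis=1)                              # class of maximum probability = predicted label
true = np.array([0, 2])                                  # hypothetical true labels
accuracy = float((pred == true).mean())                  # compared with the true labels to obtain accuracy
print(pred, accuracy)
```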
The beneficial effects of the invention are that it solves the following problems present in traditional text sentiment classification:
(1) The word vectors trained with the CBOW method are dense, real-valued vectors; the method can effectively exploit large amounts of unlabelled data to obtain a more accurate semantic characterization of words in the semantic space, while avoiding the sparsity and curse-of-dimensionality drawbacks of traditional one-hot representations.
(2) Compared with conventional classification techniques, the LSTM can not only use the word information of a comment text but also model word order, yielding a document representation specific to comment language.
(3) The training process of the neural network can be accelerated with a GPU, which not only improves the accuracy of sentiment classification but also speeds up training on large-scale corpora. The sentiment orientation of comments can be recognized effectively, with particularly good application prospects in fields such as e-commerce and film.
Brief description of the drawings
Fig. 1 is the flow chart of the analysis for recognizing the sentiment orientation of text comments.
Fig. 2 is a schematic diagram of the structure of the LSTM.
Embodiment
The present invention proposes an analysis method for recognizing the sentiment orientation of text comments based on a neural network, which is explained below with reference to the accompanying drawings.
In processing text-comment data, the present invention uses the continuous bag-of-words CBOW model (Continuous Bag-of-Words Model) as the method for training word vectors, and then uses the long short-term memory LSTM model (Long Short-Term Memory, LSTM) to discriminate the sentiment tendency of comments. The specific steps, as shown in the flow chart of Fig. 1, are as follows:
Step 1: corpus preprocessing. Each sentence is accurately segmented into individual words or characters. Each sentence has a corresponding class label, i.e. 0, 1, 2, representing negative, neutral, and positive respectively. Each class label needs to be converted into a three-dimensional vector here: 0 is converted to [1 0 0], 1 to [0 1 0], and 2 to [0 0 1]. The purpose of this conversion is to contrast against the label obtained for each sentence after training.
Step 2: word-vector training. The words obtained after segmenting the text-comment corpus are trained with CBOW, yielding the vector corresponding to each word; the dimension of the vectors can be configured as needed.
The basic idea and steps of the word-vector training are as follows. A language model formally describes a character string $S$ of $T$ words as the probability of natural language $P(w_1, w_2, w_3, \ldots, w_T)$, where $w_1$ to $w_T$ denote in turn each word of the sentence, i.e. the following reasoning:
$$P(s) = P(w_1, w_2, \ldots, w_T) = P(w_1) P(w_2 \mid w_1) P(w_3 \mid w_1, w_2) \cdots P(w_T \mid w_1, w_2, w_3, \ldots, w_{T-1})$$
That is, once the first word is determined, each later word's probability of occurrence is conditioned on the words appearing before it. (For example, the sentence "everybody likes eating apples" yields four words after segmentation, "everybody", "likes", "eating", "apples", and the natural-language probability of the sentence is P(everybody, likes, eating, apples) = P(everybody) P(likes | everybody) P(eating | everybody, likes) P(apples | everybody, likes, eating).) Each probability can be obtained separately, and the above formula simplifies to:
$$P(s) = \prod_{i=1}^{T} P(w_i \mid Context_i)$$
When $Context_i$ is empty, $P(w_i \mid Context_i)$ is simply $P(w_i)$ itself.
The core of the CBOW model lies in the gradient computation, and its key technique is Hierarchical Softmax, which requires some knowledge of Huffman trees. Each word of the dictionary serves as a leaf node of the Huffman tree. For some leaf node of the Huffman tree, assume it corresponds to the word $w$ in the dictionary; to ease the following computations, some notation is introduced:
(1) $p^w$: the path from the root node to the leaf node corresponding to $w$;
(2) $l^w$: the number of nodes contained in the path $p^w$;
(3) $p_1^w, p_2^w, \ldots, p_{l^w}^w$: the $l^w$ nodes of the path $p^w$, where $p_{l^w}^w$ denotes the node corresponding to the word $w$;
(4) $d_2^w, d_3^w, \ldots, d_{l^w}^w \in \{0, 1\}$: the Huffman code of the word $w$, where $d_j^w$ denotes the code corresponding to the $j$-th node in the path $p^w$;
(5) $\theta_1^w, \theta_2^w, \ldots, \theta_{l^w-1}^w$: the vectors corresponding to the non-leaf nodes in the path $p^w$, where $\theta_j^w$ denotes the vector corresponding to the $j$-th non-leaf node;
For any word $w$ in the dictionary, there exists a unique path $p^w$ in the Huffman tree from the root node to the node corresponding to $w$. The path $p^w$ contains $l^w - 1$ branches; regarding each branch as a binary classification, every classification produces a probability, and multiplying these probabilities together is exactly the required $P(w \mid Context(w))$.
The general formula of the conditional probability $P(w \mid Context(w))$ is written as:
$$P(w \mid Context(w)) = \prod_{j=2}^{l^w} P(d_j^w \mid X_w, \theta_{j-1}^w)$$
where:
$$P(d_j^w \mid X_w, \theta_{j-1}^w) = \begin{cases} \sigma(X_w^T \theta_{j-1}^w), & d_j^w = 0 \\ 1 - \sigma(X_w^T \theta_{j-1}^w), & d_j^w = 1 \end{cases}$$
Combining and rearranging according to the above formula gives:
$$P(d_j^w \mid X_w, \theta_{j-1}^w) = [\sigma(X_w^T \theta_{j-1}^w)]^{1-d_j^w} \cdot [1 - \sigma(X_w^T \theta_{j-1}^w)]^{d_j^w}$$
In the formula: $P(d_j^w \mid X_w, \theta_{j-1}^w)$ denotes the probability of each classification result on the way from the root node of the Huffman tree to the leaf node. According to logistic regression, the probability that a node is classified into the positive class is $\sigma(X_w^T \theta_{j-1}^w)$, and the probability that it is classified into the negative class is $1 - \sigma(X_w^T \theta_{j-1}^w)$; combining the two formulas gives exactly the above formula. $\theta$: the vector corresponding to a non-leaf node. $\sigma$: the sigmoid function, with formula $\sigma(x) = \frac{1}{1 + e^{-x}}$. $X_w$: the accumulated sum of the $2c$ vectors of the input layer, i.e. $X_w = \sum_{i=1}^{2c} v(Context(w)_i)$, where $2c$ means that the current word $w$ has $c$ words before it and $c$ words after it.
The objective function of a neural-network-based language model is usually taken to be the following log-likelihood function:
$$\Gamma = \sum_{w \in C} \log P(w \mid Context(w))$$
Substituting $P(w \mid Context(w))$ into the log-likelihood function $\Gamma$ gives:
$$\begin{aligned} \Gamma &= \sum_{w \in C} \log \prod_{j=2}^{l^w} \{[\sigma(X_w^T \theta_{j-1}^w)]^{1-d_j^w} \times [1 - \sigma(X_w^T \theta_{j-1}^w)]^{d_j^w}\} \\ &= \sum_{w \in C} \sum_{j=2}^{l^w} \{(1 - d_j^w) \cdot \log[\sigma(X_w^T \theta_{j-1}^w)] + d_j^w \cdot \log[1 - \sigma(X_w^T \theta_{j-1}^w)]\} \end{aligned}$$
For convenience of gradient derivation, the content inside the braces of the double summation above is denoted $\Gamma(w, j)$, i.e.:
$$\Gamma(w, j) = (1 - d_j^w) \cdot \log[\sigma(X_w^T \theta_{j-1}^w)] + d_j^w \cdot \log[1 - \sigma(X_w^T \theta_{j-1}^w)]$$
The above $\Gamma$ is then the objective function of the CBOW model; the next step is to optimize the objective function, using stochastic gradient ascent, i.e. to maximize the objective function.
The idea of stochastic gradient ascent is: each time a sample (Context(w), w) is taken, all the parameters in the objective function are refreshed once. The gradients of $\Gamma(w, j)$ with respect to these vectors are given first. The gradient of $\Gamma(w, j)$ with respect to $\theta_{j-1}^w$ is computed by taking the derivative with respect to $\theta_{j-1}^w$:
$$\frac{\partial \Gamma(w, j)}{\partial \theta_{j-1}^w} = [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] X_w$$
Then the update formula for $\theta_{j-1}^w$ can be written as:
$$\theta_{j-1}^w := \theta_{j-1}^w + \eta [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] X_w$$
where $\eta$ denotes the learning rate.
Next the gradient of $\Gamma(w, j)$ with respect to $X_w$ is computed; inspecting $\Gamma(w, j)$ shows that $\theta_{j-1}^w$ and $X_w$ play symmetric roles in it, so the derivation is the same as above:
$$\frac{\partial \Gamma(w, j)}{\partial X_w} = [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] \theta_{j-1}^w$$
The final goal is the word vector of each word in the dictionary, while $X_w$ here denotes the accumulation of the word vectors in Context(w); $\frac{\partial \Gamma(w, j)}{\partial X_w}$ is therefore used to update each $v(\tilde{w})$, $\tilde{w} \in Context(w)$:
$$v(\tilde{w}) := v(\tilde{w}) + \eta \sum_{j=2}^{l^w} \frac{\partial \Gamma(w, j)}{\partial X_w}, \quad \tilde{w} \in Context(w)$$
That is, $\frac{\partial \Gamma(w, j)}{\partial X_w}$ is contributed to each word vector in Context(w); equal contribution is used here, and thus the word vector of each word can be obtained.
Step 3: LSTM training. The present invention uses an LSTM as the classification model, with the sentiment-labelled corpus as the training set; after the processing of step 1 and step 2, the problem is converted into the problem of training a neural-network classification model.
Assume a sentence $l$ contains $m$ words, with corresponding word vectors $V = \{v_1, v_2, \ldots, v_m\}$; the sentence $l$ is then represented by $V$, where $V$ denotes the $m$ word vectors corresponding to the $m$ words of a sentence, $v_1$ denotes the word vector of the first word, $v_2$ the word vector of the second word, and so on. The vectors corresponding to the words of each sentence are fed directly into a recurrent neural network for processing. The recurrent network uses the conventional model, the long short-term memory LSTM model: each word of a sentence corresponds to one LSTM cell in the recurrent neural network, i.e. to one of the words in the training of the actual sentence, and all the LSTM cells are connected in sequence according to the position relationship of the words to form a chain structure, so that training can be carried out. The word vectors $V = \{v_1, v_2, \ldots, v_m\}$ of the sentence serve in turn as the input value $x_t$ of each LSTM cell; $x_t$, the input value at time $t$, is the word vector of the $t$-th word serving as the input of the LSTM cell. For example, the word vector $v_1$ of the first word serves as the first input value $x_1$. The output $h_t$ of the last LSTM cell of each sentence is taken; $h_t$, the output of the LSTM cell, is composed of two parts: first a sigmoid layer yields an initial output, then the cell state $C_t$ is scaled to between -1 and 1 with tanh, and the result is multiplied by the sigmoid output to give the output of the model. This $h_t$ is taken as the three-dimensional vector output of the sentence. The output value of each sentence is then used as the input of the Softmax function, defined as
$$P(y = i \mid x) = \frac{e^{x_i}}{\sum_{j=1}^{k} e^{x_j}}$$
where $e^{x_i}$ computes the value of each class and $\sum_{j=1}^{k} e^{x_j}$ computes the sum of the values of the $k$ classes. With three class labels, the probability of the first class is $e^{x_1} / \sum_{j=1}^{3} e^{x_j}$, the probability of the second class is $e^{x_2} / \sum_{j=1}^{3} e^{x_j}$, and the probability of the third class is $e^{x_3} / \sum_{j=1}^{3} e^{x_j}$. This function is a probability-distribution function and yields three probability values whose sum is 1. The probability values of the classes are compared, the class of maximum probability is taken as the sentiment category of the sentence, and the label of each sentence is obtained. The labels are then compared with the true labels to obtain the accuracy; by training the neural network model to its best accuracy, while also optimizing its parameters, the purpose of recognizing the sentiment orientation of text comments with a neural network is achieved. Specifically, the structure of the LSTM model, as shown in Fig. 2, is as follows:
(1) Forget gate layer: decides what information to discard from the cell state. The gate reads $h_{t-1}$ and $x_t$ and outputs a number between 0 and 1 for each entry in the cell state $C_{t-1}$, where 1 means "keep entirely" and 0 means "discard entirely". It is expressed as
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f);$$
(2) Candidate layer: decides what new information is stored in the cell state, and consists of two parts. First, a sigmoid layer called the "input gate layer" decides which values will be updated; second, a tanh layer creates a vector of new candidate values $\tilde{C}_t$ to be added to the state. The update to the state is produced from these two pieces of information, expressed as
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C);$$
(3) Cell-state update: the old cell state $C_{t-1}$ is updated to the new cell state $C_t$. The old state $C_{t-1}$ is multiplied by $f_t$, discarding the information determined to be discarded, and then $i_t * \tilde{C}_t$, the new candidate values, are added, scaled by how much each state value was decided to be updated. It is expressed as
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t;$$
(4) Output gate layer: finally, it must be determined what value to output. This output value is based on the cell state: a sigmoid layer is run to decide which parts of the cell state will be output; the cell state is then processed through the tanh function to obtain a value between -1 and 1 and multiplied by the output of the sigmoid. The final output is expressed as:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(C_t);$$
The above is the content of the LSTM model: each word of a sentence corresponds to one LSTM cell in the recurrent neural network, i.e. to one of the words in the training of the actual sentence, and all the LSTM cells are connected in sequence according to the position relationship of the words to form a chain structure, so that training can be carried out. The word vectors $V = \{v_1, v_2, \ldots, v_m\}$ of the sentence serve in turn as the input value $x_t$ of each LSTM cell, and the output $h_t$ of the last LSTM cell of each sentence is taken as the three-dimensional vector output of the sentence.
(5) The output value of each sentence is used as the input of the Softmax function, defined as $P(y = i \mid x) = e^{x_i} / \sum_{j=1}^{k} e^{x_j}$. This function is a probability-distribution function, and the three probability values sum to 1. The probability values of the classes are compared, the class of maximum probability is taken as the sentiment category of the sentence, and the label of each sentence is obtained. The labels are then compared with the true labels to obtain the accuracy; by training the neural network model to its best accuracy, while also optimizing its parameters, the purpose of recognizing the sentiment orientation of text comments with a neural network is achieved.
Claims (3)
1. An analysis method for recognizing the sentiment orientation of text comments based on a neural network, characterized in that, in processing text-comment data, the continuous bag-of-words CBOW model is used as the method for training word vectors, and the long short-term memory LSTM model is then used to discriminate the sentiment tendency of comments; the specific steps are as follows:
Step 1: corpus preprocessing; each sentence is accurately segmented into individual words or characters; each sentence has a corresponding class label, i.e. 0, 1, 2, representing negative, neutral, and positive respectively; each class label needs to be converted into a three-dimensional vector here, i.e. 0 is converted to [1 0 0], 1 to [0 1 0], and 2 to [0 0 1]; the purpose of this conversion is to contrast against the label obtained for each sentence after training;
Step 2: word-vector training; the words obtained after segmenting the text-comment corpus are trained with CBOW, yielding the vector corresponding to each word; the dimension of the vectors can be configured as needed;
Step 3: LSTM training; the present invention uses an LSTM as the classification model, with the sentiment-labelled corpus as the training set; after the processing of step 1 and step 2, the problem is converted into the problem of training a neural-network classification model;
Assume a sentence $l$ contains $m$ words, with corresponding word vectors $V = \{v_1, v_2, \ldots, v_m\}$; the sentence $l$ is then represented by $V$, and the vectors corresponding to the words of each sentence are fed directly into a recurrent neural network for processing; the recurrent network uses the conventional model, the long short-term memory LSTM model: each word of a sentence corresponds to one LSTM cell in the recurrent neural network, i.e. to one of the words in the training of the actual sentence, and all the LSTM cells are connected in sequence according to the position relationship of the words to form a chain structure, so that training can be carried out; the word vectors $V = \{v_1, v_2, \ldots, v_m\}$ of the sentence serve in turn as the input value $x_t$ of each LSTM cell, and the output $h_t$ of the last LSTM cell of each sentence is taken as the three-dimensional vector output of that sentence; the output value of each sentence is then used as the input of the Softmax function, defined as
$$P(y = i \mid x) = \frac{e^{x_i}}{\sum_{j=1}^{k} e^{x_j}}$$
This function is a probability-distribution function and yields three probability values whose sum is 1; in the formula, $e^{x_i}$ computes the value of each class and $\sum_{j=1}^{k} e^{x_j}$ computes the sum of the values of the $k$ classes; with three class labels, the probability of the first class is $e^{x_1} / \sum_{j=1}^{3} e^{x_j}$, the probability of the second class is $e^{x_2} / \sum_{j=1}^{3} e^{x_j}$, and the probability of the third class is $e^{x_3} / \sum_{j=1}^{3} e^{x_j}$; the probability values of the classes are compared, the class of maximum probability is taken as the sentiment category of the sentence, and the maximum class-probability value determines the label of each sentence; the labels are then compared with the true labels to obtain the accuracy; by training the neural network model to its best accuracy, while also optimizing its parameters, the purpose of recognizing the sentiment orientation of text comments with a neural network is achieved;
In the above formulas: $V$ denotes the $m$ word vectors corresponding to the $m$ words of a sentence, $v_1$ denotes the word vector of the first word, $v_2$ the word vector of the second word, and so on; $x_t$ is the input value of the LSTM cell at time $t$, i.e. the word vector of the $t$-th word serves as the input of the LSTM cell; the word vector $v_1$ of the first word serves as the first input value $x_1$; $h_t$ is the output of the LSTM cell and is composed of two parts: first a sigmoid layer yields an initial output, then the cell state $C_t$ is scaled to between -1 and 1 with tanh, and the result is multiplied by the sigmoid output to give the output of the model.
2. The analysis method for recognizing the sentiment orientation of text comments based on a neural network according to claim 1, characterized in that the basic idea and steps of the word-vector training of step 2 are as follows: a language model formally describes a character string $S$ of $T$ words as the probability of natural language $P(w_1, w_2, w_3, \ldots, w_T)$, where $w_1$ to $w_T$ denote in turn each word of the sentence, i.e. the following reasoning: $P(s) = P(w_1, w_2, \ldots, w_T) = P(w_1) P(w_2 \mid w_1) P(w_3 \mid w_1, w_2) \cdots P(w_T \mid w_1, w_2, w_3, \ldots, w_{T-1})$
That is, once the first word is determined, each later word's probability of occurrence is conditioned on the words appearing before it; each probability can be obtained separately, and the above formula simplifies to:
$$P(s) = P(w_1, w_2, w_3, \ldots, w_T) = \prod_{i=1}^{T} P(w_i \mid Context_i),$$
When $Context_i$ is empty, $P(w_i \mid Context_i)$ is simply $P(w_i)$ itself;
The core of the CBOW model lies in the gradient computation, and its key technique is Hierarchical Softmax, which requires some knowledge of Huffman trees; each word of the dictionary serves as a leaf node of the Huffman tree; for some leaf node of the Huffman tree, assume it corresponds to the word $w$ in the dictionary; to ease the following computations, some notation is introduced:
(1) $p^w$: the path from the root node to the leaf node corresponding to $w$;
(2) $l^w$: the number of nodes contained in the path $p^w$;
(3) $p_1^w, p_2^w, \ldots, p_{l^w}^w$: the $l^w$ nodes of the path $p^w$, where $p_{l^w}^w$ denotes the node corresponding to the word $w$;
(4) $d_2^w, d_3^w, \ldots, d_{l^w}^w \in \{0, 1\}$: the Huffman code of the word $w$, where $d_j^w$ denotes the code corresponding to the $j$-th node in the path $p^w$;
(5) $\theta_1^w, \theta_2^w, \ldots, \theta_{l^w-1}^w$: the vectors corresponding to the non-leaf nodes in the path $p^w$, where $\theta_j^w$ denotes the vector corresponding to the $j$-th non-leaf node;
For any word $w$ in the dictionary, there exists a unique path $p^w$ in the Huffman tree from the root node to the node corresponding to $w$; the path $p^w$ contains $l^w - 1$ branches; regarding each branch as a binary classification, every classification produces a probability, and multiplying these probabilities together is exactly the required $P(w \mid Context(w))$;
The general formula of the conditional probability $P(w \mid Context(w))$ is written as:
$$P(w \mid Context(w)) = \prod_{j=2}^{l^w} P(d_j^w \mid X_w, \theta_{j-1}^w)$$
Wherein:
$$P(d_j^w \mid X_w, \theta_{j-1}^w) = \begin{cases} \sigma(X_w^T \theta_{j-1}^w), & d_j^w = 0 \\ 1 - \sigma(X_w^T \theta_{j-1}^w), & d_j^w = 1 \end{cases}$$
Combining and rearranging according to the above formula gives:
$$P(d_j^w \mid X_w, \theta_{j-1}^w) = [\sigma(X_w^T \theta_{j-1}^w)]^{1-d_j^w} \cdot [1 - \sigma(X_w^T \theta_{j-1}^w)]^{d_j^w}$$
In the formula: $P(d_j^w \mid X_w, \theta_{j-1}^w)$ denotes the probability of each classification result on the way from the root node of the Huffman tree to the leaf node; according to logistic regression, the probability that a node is classified into the positive class is $\sigma(X_w^T \theta_{j-1}^w)$, and the probability that it is classified into the negative class is $1 - \sigma(X_w^T \theta_{j-1}^w)$; combining the two formulas gives exactly the above formula; $\theta$: the vector corresponding to a non-leaf node; $\sigma$: the sigmoid function, with formula $\sigma(x) = \frac{1}{1 + e^{-x}}$; $X_w$: the accumulated sum of the $2c$ vectors of the input layer, i.e. $X_w = \sum_{i=1}^{2c} v(Context(w)_i)$, where $2c$ means that the current word $w$ has $c$ words before it and $c$ words after it;
The objective function of a neural-network-based language model is usually taken to be the following log-likelihood function:
$$\Gamma = \sum_{w \in C} \log P(w \mid Context(w))$$
Substituting $P(w \mid Context(w))$ into the log-likelihood function $\Gamma$ gives:
$$\begin{aligned} \Gamma &= \sum_{w \in C} \log \prod_{j=2}^{l^w} \{[\sigma(X_w^T \theta_{j-1}^w)]^{1-d_j^w} \times [1 - \sigma(X_w^T \theta_{j-1}^w)]^{d_j^w}\} \\ &= \sum_{w \in C} \sum_{j=2}^{l^w} \{(1 - d_j^w) \cdot \log[\sigma(X_w^T \theta_{j-1}^w)] + d_j^w \cdot \log[1 - \sigma(X_w^T \theta_{j-1}^w)]\} \end{aligned}$$
For convenience of gradient derivation, the content inside the braces of the double summation above is denoted $\Gamma(w, j)$, i.e.:
$$\Gamma(w, j) = (1 - d_j^w) \cdot \log[\sigma(X_w^T \theta_{j-1}^w)] + d_j^w \cdot \log[1 - \sigma(X_w^T \theta_{j-1}^w)]$$
The above $\Gamma$ is then the objective function of the CBOW model; the next step is to optimize the objective function, using stochastic gradient ascent, i.e. to maximize the objective function;
The idea of stochastic gradient ascent is: each time a sample (Context(w), w) is taken, all the parameters in the objective function are refreshed once; the gradients of $\Gamma(w, j)$ with respect to these vectors are given first; the gradient of $\Gamma(w, j)$ with respect to $\theta_{j-1}^w$ is computed first, i.e. taking the derivative with respect to $\theta_{j-1}^w$:
$$\frac{\partial \Gamma(w, j)}{\partial \theta_{j-1}^w} = [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] X_w$$
Then the update formula for $\theta_{j-1}^w$ can be written as:
$$\theta_{j-1}^w := \theta_{j-1}^w + \eta [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] X_w$$
where $\eta$ denotes the learning rate;
Next the gradient of $\Gamma(w, j)$ with respect to $X_w$ is computed; inspecting $\Gamma(w, j)$ shows that $\theta_{j-1}^w$ and $X_w$ play symmetric roles in it, so the derivation is the same as above:
$$\frac{\partial \Gamma(w, j)}{\partial X_w} = [1 - d_j^w - \sigma(X_w^T \theta_{j-1}^w)] \theta_{j-1}^w$$
The final goal is the word vector of each word in the dictionary, while $X_w$ here denotes the accumulation of the word vectors in Context(w); $\frac{\partial \Gamma(w, j)}{\partial X_w}$ is therefore used to update each $v(\tilde{w})$, $\tilde{w} \in Context(w)$:
$$v(\tilde{w}) := v(\tilde{w}) + \eta \sum_{j=2}^{l^w} \frac{\partial \Gamma(w, j)}{\partial X_w}, \quad \tilde{w} \in Context(w)$$
That is, $\frac{\partial \Gamma(w, j)}{\partial X_w}$ is contributed to each word vector in Context(w); equal contribution is used here, and thus the word vector of each word can be obtained.
3. The analysis method for recognizing the sentiment orientation of text comments based on a neural network according to claim 1 or claim 2, characterized in that the structure of the LSTM model is as follows:
(1) Forget gate layer: decides what information to discard from the cell state; the gate reads $h_{t-1}$ and $x_t$ and outputs a number between 0 and 1 for each entry in the cell state $C_{t-1}$, where 1 means "keep entirely" and 0 means "discard entirely"; it is expressed as
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f);$$
(2) Candidate layer: decides what new information is stored in the cell state, and consists of two parts; first, a sigmoid layer called the "input gate layer" decides which values will be updated; second, a tanh layer creates a vector of new candidate values $\tilde{C}_t$ to be added to the state; the update to the state is produced from these two pieces of information, expressed as
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C);$$
(3) Cell-state update: the old cell state $C_{t-1}$ is updated to the new cell state $C_t$; the old state $C_{t-1}$ is multiplied by $f_t$, discarding the information determined to be discarded, and then $i_t * \tilde{C}_t$, the new candidate values, are added, scaled by how much each state value was decided to be updated; it is expressed as
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t;$$
(4) Output gate layer: finally, it must be determined what value to output; this output value is based on the cell state: a sigmoid layer is run to decide which parts of the cell state will be output; the cell state is then processed through the tanh function to obtain a value between -1 and 1 and multiplied by the output of the sigmoid; the final output is expressed as:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(C_t);$$
The above is the content of the LSTM model: each word of a sentence corresponds to one LSTM cell in the recurrent neural network, i.e. to one of the words in the training of the actual sentence, and all the LSTM cells are connected in sequence according to the position relationship of the words to form a chain structure, so that training can be carried out; the word vectors $V = \{v_1, v_2, \ldots, v_m\}$ of the sentence serve in turn as the input value $x_t$ of each LSTM cell, and the output $h_t$ of the last LSTM cell of each sentence is taken as the three-dimensional vector output of the sentence;
(5) The output value of each sentence is used as the input of the Softmax function, defined as $P(y = i \mid x) = e^{x_i} / \sum_{j=1}^{k} e^{x_j}$; this function is a probability-distribution function, and the three probability values sum to 1; the probability values of the classes are compared, the class of maximum probability is taken as the sentiment category of the sentence, and the label of each sentence is obtained; the labels are then compared with the true labels to obtain the accuracy; by training the neural network model to its best accuracy, while also optimizing its parameters, the purpose of recognizing the sentiment orientation of text comments with a neural network is achieved.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710342178.8A CN107153642A (en) | 2017-05-16 | 2017-05-16 | Analysis method for recognizing the sentiment orientation of text comments based on a neural network
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710342178.8A CN107153642A (en) | 2017-05-16 | 2017-05-16 | Analysis method for recognizing the sentiment orientation of text comments based on a neural network
Publications (1)
Publication Number | Publication Date |
---|---|
CN107153642A true CN107153642A (en) | 2017-09-12 |
Family
ID=59793270
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710342178.8A Pending CN107153642A (en) | 2017-05-16 | 2017-05-16 | Analysis method for recognizing the sentiment orientation of text comments based on a neural network
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107153642A (en) |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107491490A (en) * | 2017-07-19 | 2017-12-19 | 华东师范大学 | Text sentiment classification method based on Emotion center |
CN108038492A (en) * | 2017-11-23 | 2018-05-15 | 西安理工大学 | A kind of perceptual term vector and sensibility classification method based on deep learning |
CN108133038A (en) * | 2018-01-10 | 2018-06-08 | 重庆邮电大学 | A kind of entity level emotional semantic classification system and method based on dynamic memory network |
CN108415953A (en) * | 2018-02-05 | 2018-08-17 | 华融融通(北京)科技有限公司 | A kind of non-performing asset based on natural language processing technique manages knowledge management method |
CN108519976A (en) * | 2018-04-04 | 2018-09-11 | 郑州大学 | The method for generating extensive sentiment dictionary based on neural network |
CN108595592A (en) * | 2018-04-19 | 2018-09-28 | 成都睿码科技有限责任公司 | A kind of text emotion analysis method based on five-stroke form code character level language model |
CN108628834A (en) * | 2018-05-14 | 2018-10-09 | 国家计算机网络与信息安全管理中心 | A kind of word lists dendrography learning method based on syntax dependence |
CN108764268A (en) * | 2018-04-02 | 2018-11-06 | 华南理工大学 | A kind of multi-modal emotion identification method of picture and text based on deep learning |
CN108829672A (en) * | 2018-06-05 | 2018-11-16 | 平安科技(深圳)有限公司 | Sentiment analysis method, apparatus, computer equipment and the storage medium of text |
CN108959268A (en) * | 2018-07-20 | 2018-12-07 | 科大讯飞股份有限公司 | A kind of text emotion analysis method and device |
CN109036570A (en) * | 2018-05-31 | 2018-12-18 | 北京云知声信息技术有限公司 | The filter method and system of the non-case history content of Ultrasonography |
CN109086393A (en) * | 2018-07-27 | 2018-12-25 | 贵州中科恒运软件科技有限公司 | A kind of the analysis of public opinion system and method |
CN109255027A (en) * | 2018-08-27 | 2019-01-22 | 上海宝尊电子商务有限公司 | A kind of method and apparatus of electric business comment sentiment analysis noise reduction |
CN109446414A (en) * | 2018-09-28 | 2019-03-08 | 武汉大学 | A kind of software information website fast tag recommended method based on neural network classification |
CN109461037A (en) * | 2018-12-17 | 2019-03-12 | 北京百度网讯科技有限公司 | Comment on viewpoint clustering method, device and terminal |
CN109460508A (en) * | 2018-10-10 | 2019-03-12 | 浙江大学 | A kind of efficient comment spam groups of users detection method |
CN109543036A (en) * | 2018-11-20 | 2019-03-29 | 四川长虹电器股份有限公司 | Text Clustering Method based on semantic similarity |
CN109597997A (en) * | 2018-12-07 | 2019-04-09 | 上海宏原信息科技有限公司 | Based on comment entity, aspect grade sensibility classification method and device and its model training |
CN109739978A (en) * | 2018-12-11 | 2019-05-10 | 中科恒运股份有限公司 | A kind of Text Clustering Method, text cluster device and terminal device |
CN109800438A (en) * | 2019-02-01 | 2019-05-24 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
CN110020147A (en) * | 2017-11-29 | 2019-07-16 | 北京京东尚科信息技术有限公司 | Model generates, method for distinguishing, system, equipment and storage medium are known in comment |
WO2019149076A1 (en) * | 2018-02-05 | 2019-08-08 | 阿里巴巴集团控股有限公司 | Word vector generation method, apparatus and device |
CN110110137A (en) * | 2019-03-19 | 2019-08-09 | 咪咕音乐有限公司 | A kind of method, apparatus, electronic equipment and the storage medium of determining musical features |
CN110134966A (en) * | 2019-05-21 | 2019-08-16 | 中电健康云科技有限公司 | A kind of sensitive information determines method and device |
CN110264311A (en) * | 2019-05-30 | 2019-09-20 | 佛山科学技术学院 | A kind of business promotion accurate information recommended method and system based on deep learning |
CN110427616A (en) * | 2019-07-19 | 2019-11-08 | 山东科技大学 | A kind of text emotion analysis method based on deep learning |
CN111523319A (en) * | 2020-04-10 | 2020-08-11 | 广东海洋大学 | Microblog emotion analysis method based on scene LSTM structure network |
CN111771208A (en) * | 2018-02-19 | 2020-10-13 | 博朗有限公司 | Apparatus and method for implementing positioning of a movable processing device |
CN111881249A (en) * | 2020-06-08 | 2020-11-03 | 江苏大学 | Method for judging text emotion tendentiousness based on recurrent neural network |
CN112069311A (en) * | 2020-08-04 | 2020-12-11 | 北京声智科技有限公司 | Text extraction method, device, equipment and medium |
WO2021135457A1 (en) * | 2020-08-06 | 2021-07-08 | 平安科技(深圳)有限公司 | Recurrent neural network-based emotion recognition method, apparatus, and storage medium |
CN113112310A (en) * | 2021-05-12 | 2021-07-13 | 北京大学 | Commodity service culture added value assessment method, device and system |
CN114168730A (en) * | 2021-11-26 | 2022-03-11 | 一拓通信集团股份有限公司 | Consumption tendency analysis method based on BilSTM and SVM |
CN115759088A (en) * | 2023-01-10 | 2023-03-07 | 中国测绘科学研究院 | Text analysis method and storage medium for comment information |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105809186A (en) * | 2016-02-25 | 2016-07-27 | 中国科学院声学研究所 | Emotion classification method and system |
CN105868317A (en) * | 2016-03-25 | 2016-08-17 | 华中师范大学 | Digital education resource recommendation method and system |
CN105955959A (en) * | 2016-05-06 | 2016-09-21 | 深圳大学 | Sentiment classification method and system |
CN106294684A (en) * | 2016-08-06 | 2017-01-04 | 上海高欣计算机系统有限公司 | The file classification method of term vector and terminal unit |
CN106407178A (en) * | 2016-08-25 | 2017-02-15 | 中国科学院计算技术研究所 | Session abstract generation method and device |
US20170053646A1 (en) * | 2015-08-17 | 2017-02-23 | Mitsubishi Electric Research Laboratories, Inc. | Method for using a Multi-Scale Recurrent Neural Network with Pretraining for Spoken Language Understanding Tasks |
CN106599933A (en) * | 2016-12-26 | 2017-04-26 | 哈尔滨工业大学 | Text emotion classification method based on the joint deep learning model |
CN106601226A (en) * | 2016-11-18 | 2017-04-26 | 中国科学院自动化研究所 | Phoneme duration prediction modeling method and phoneme duration prediction method |
-
2017
- 2017-05-16 CN CN201710342178.8A patent/CN107153642A/en active Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170053646A1 (en) * | 2015-08-17 | 2017-02-23 | Mitsubishi Electric Research Laboratories, Inc. | Method for using a Multi-Scale Recurrent Neural Network with Pretraining for Spoken Language Understanding Tasks |
CN105809186A (en) * | 2016-02-25 | 2016-07-27 | 中国科学院声学研究所 | Emotion classification method and system |
CN105868317A (en) * | 2016-03-25 | 2016-08-17 | 华中师范大学 | Digital education resource recommendation method and system |
CN105955959A (en) * | 2016-05-06 | 2016-09-21 | 深圳大学 | Sentiment classification method and system |
CN106294684A (en) * | 2016-08-06 | 2017-01-04 | 上海高欣计算机系统有限公司 | Word-vector-based text classification method and terminal device |
CN106407178A (en) * | 2016-08-25 | 2017-02-15 | 中国科学院计算技术研究所 | Conversation summary generation method and device |
CN106601226A (en) * | 2016-11-18 | 2017-04-26 | 中国科学院自动化研究所 | Phoneme duration prediction modeling method and phoneme duration prediction method |
CN106599933A (en) * | 2016-12-26 | 2017-04-26 | 哈尔滨工业大学 | Text emotion classification method based on a joint deep learning model |
Non-Patent Citations (1)
Title |
---|
LI JIA et al.: "Tweet modeling with LSTM recurrent neural networks for hashtag recommendation", 2016 International Joint Conference on Neural Networks * |
Cited By (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107491490B (en) * | 2017-07-19 | 2020-10-13 | 华东师范大学 | Text emotion classification method based on emotion center |
CN107491490A (en) * | 2017-07-19 | 2017-12-19 | 华东师范大学 | Text emotion classification method based on emotion center |
CN108038492A (en) * | 2017-11-23 | 2018-05-15 | 西安理工大学 | Sentiment word vector and sentiment classification method based on deep learning |
CN110020147A (en) * | 2017-11-29 | 2019-07-16 | 北京京东尚科信息技术有限公司 | Model generation and comment recognition method, system, device, and storage medium |
CN108133038A (en) * | 2018-01-10 | 2018-06-08 | 重庆邮电大学 | Entity level emotion classification system and method based on dynamic memory network |
CN108133038B (en) * | 2018-01-10 | 2022-03-22 | 重庆邮电大学 | Entity level emotion classification system and method based on dynamic memory network |
CN108415953A (en) * | 2018-02-05 | 2018-08-17 | 华融融通(北京)科技有限公司 | Method for managing bad asset management knowledge based on natural language processing technology |
US10824819B2 (en) | 2018-02-05 | 2020-11-03 | Alibaba Group Holding Limited | Generating word vectors by recurrent neural networks based on n-ary characters |
WO2019149076A1 (en) * | 2018-02-05 | 2019-08-08 | 阿里巴巴集团控股有限公司 | Word vector generation method, apparatus and device |
CN108415953B (en) * | 2018-02-05 | 2021-08-13 | 华融融通(北京)科技有限公司 | Method for managing bad asset management knowledge based on natural language processing technology |
CN111771208A (en) * | 2018-02-19 | 2020-10-13 | 博朗有限公司 | Apparatus and method for implementing positioning of a movable processing device |
CN108764268A (en) * | 2018-04-02 | 2018-11-06 | 华南理工大学 | Image-text multimodal emotion recognition method based on deep learning |
CN108519976A (en) * | 2018-04-04 | 2018-09-11 | 郑州大学 | Method for generating a large-scale sentiment dictionary based on neural networks |
CN108595592A (en) * | 2018-04-19 | 2018-09-28 | 成都睿码科技有限责任公司 | Text sentiment analysis method based on a five-stroke (Wubi) code character-level language model |
CN108628834A (en) * | 2018-05-14 | 2018-10-09 | 国家计算机网络与信息安全管理中心 | Word expression learning method based on syntactic dependency relationship |
CN108628834B (en) * | 2018-05-14 | 2022-04-15 | 国家计算机网络与信息安全管理中心 | Word expression learning method based on syntactic dependency relationship |
CN109036570B (en) * | 2018-05-31 | 2021-08-31 | 云知声智能科技股份有限公司 | Method and system for filtering non-medical record content of ultrasound department |
CN109036570A (en) * | 2018-05-31 | 2018-12-18 | 北京云知声信息技术有限公司 | Method and system for filtering non-medical record content of ultrasound department |
CN108829672A (en) * | 2018-06-05 | 2018-11-16 | 平安科技(深圳)有限公司 | Text sentiment analysis method, apparatus, computer device, and storage medium |
CN108959268B (en) * | 2018-07-20 | 2023-01-17 | 科大讯飞股份有限公司 | Text emotion analysis method and device |
CN108959268A (en) * | 2018-07-20 | 2018-12-07 | 科大讯飞股份有限公司 | Text emotion analysis method and device |
CN109086393A (en) * | 2018-07-27 | 2018-12-25 | 贵州中科恒运软件科技有限公司 | Public opinion analysis system and method |
CN109255027B (en) * | 2018-08-27 | 2022-06-24 | 上海宝尊电子商务有限公司 | E-commerce comment sentiment analysis noise reduction method and device |
CN109255027A (en) * | 2018-08-27 | 2019-01-22 | 上海宝尊电子商务有限公司 | E-commerce comment sentiment analysis noise reduction method and device |
CN109446414B (en) * | 2018-09-28 | 2021-08-17 | 武汉大学 | Software information site rapid label recommendation method based on neural network classification |
CN109446414A (en) * | 2018-09-28 | 2019-03-08 | 武汉大学 | Software information site rapid label recommendation method based on neural network classification |
CN109460508A (en) * | 2018-10-10 | 2019-03-12 | 浙江大学 | Efficient spam comment user group detection method |
CN109460508B (en) * | 2018-10-10 | 2021-10-15 | 浙江大学 | Efficient spam comment user group detection method |
CN109543036A (en) * | 2018-11-20 | 2019-03-29 | 四川长虹电器股份有限公司 | Text clustering method based on semantic similarity |
CN109597997B (en) * | 2018-12-07 | 2023-05-02 | 上海宏原信息科技有限公司 | Comment entity and aspect-level emotion classification method and device and model training thereof |
CN109597997A (en) * | 2018-12-07 | 2019-04-09 | 上海宏原信息科技有限公司 | Comment entity and aspect-level emotion classification method and device and model training thereof |
CN109739978A (en) * | 2018-12-11 | 2019-05-10 | 中科恒运股份有限公司 | Text clustering method, text clustering device, and terminal device |
CN109461037A (en) * | 2018-12-17 | 2019-03-12 | 北京百度网讯科技有限公司 | Comment viewpoint clustering method, device, and terminal |
CN109800438A (en) * | 2019-02-01 | 2019-05-24 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
CN109800438B (en) * | 2019-02-01 | 2020-03-31 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
CN110110137A (en) * | 2019-03-19 | 2019-08-09 | 咪咕音乐有限公司 | Method, apparatus, electronic device, and storage medium for determining musical features |
CN110134966A (en) * | 2019-05-21 | 2019-08-16 | 中电健康云科技有限公司 | Sensitive information determination method and device |
CN110264311A (en) * | 2019-05-30 | 2019-09-20 | 佛山科学技术学院 | Precise business promotion information recommendation method and system based on deep learning |
CN110427616A (en) * | 2019-07-19 | 2019-11-08 | 山东科技大学 | Text emotion analysis method based on deep learning |
CN110427616B (en) * | 2019-07-19 | 2023-06-09 | 山东科技大学 | Text emotion analysis method based on deep learning |
CN111523319A (en) * | 2020-04-10 | 2020-08-11 | 广东海洋大学 | Microblog emotion analysis method based on scene LSTM structure network |
CN111523319B (en) * | 2020-04-10 | 2023-06-30 | 广东海洋大学 | Microblog emotion analysis method based on scene LSTM structure network |
CN111881249A (en) * | 2020-06-08 | 2020-11-03 | 江苏大学 | Method for judging text emotion tendency based on a recurrent neural network |
CN112069311A (en) * | 2020-08-04 | 2020-12-11 | 北京声智科技有限公司 | Text extraction method, device, equipment and medium |
WO2021135457A1 (en) * | 2020-08-06 | 2021-07-08 | 平安科技(深圳)有限公司 | Recurrent neural network-based emotion recognition method, apparatus, and storage medium |
CN113112310A (en) * | 2021-05-12 | 2021-07-13 | 北京大学 | Commodity service culture added value assessment method, device and system |
CN114168730A (en) * | 2021-11-26 | 2022-03-11 | 一拓通信集团股份有限公司 | Consumption tendency analysis method based on BiLSTM and SVM |
CN115759088A (en) * | 2023-01-10 | 2023-03-07 | 中国测绘科学研究院 | Text analysis method and storage medium for comment information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107153642A (en) | | Analysis method for recognizing the sentiment orientation of text comments based on a neural network |
CN108492200B (en) | | User attribute inference method and device based on convolutional neural network |
CN112487143B (en) | | Multi-label text classification method based on public opinion big data analysis |
CN111931506B (en) | | Entity relationship extraction method based on graph information enhancement |
CN103955451B (en) | | Method for judging the emotional tendency of short texts |
CN108664632A (en) | | Text sentiment classification algorithm based on convolutional neural networks and an attention mechanism |
CN109558487A (en) | | Document classification method based on hierarchical multi-attention networks |
CN107273355A (en) | | Chinese word vector generation method based on joint character-word training |
CN107038480A (en) | | Text sentiment classification method based on convolutional neural networks |
CN109697232A (en) | | Chinese text sentiment analysis method based on deep learning |
CN107544957A (en) | | Sentiment orientation analysis method for commercial product target words |
CN107203511A (en) | | Named entity recognition method for network text based on neural network probabilistic disambiguation |
CN107562784A (en) | | Short text classification method based on the ResLCNN model |
CN106886516A (en) | | Method and device for automatically identifying sentence relationships and entities |
CN109800411A (en) | | Extraction method for clinical medical entities and their attributes |
CN110472042B (en) | | Fine-grained emotion classification method |
CN106897371B (en) | | Chinese text classification system and method |
CN107908614A (en) | | Named entity recognition method based on Bi-LSTM |
CN108038205B (en) | | Viewpoint analysis prototype system for Chinese microblogs |
CN107025284A (en) | | Method for recognizing the sentiment tendency of online comment text, and convolutional neural network model |
CN106354710A (en) | | Neural network relation extraction method |
CN107943784A (en) | | Relation extraction method based on generative adversarial networks |
CN109934261A (en) | | Knowledge-driven parameter transformation model and its few-shot learning method |
CN107122349A (en) | | Text feature word extraction method based on word2vec-LDA models |
CN113407660B (en) | | Unstructured text event extraction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170912 |