CN105005553A - Emotional thesaurus based short text emotional tendency analysis method - Google Patents


Info

Publication number
CN105005553A
Authority
CN
China
Prior art keywords
word
emotion
wsi
sentiment
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510342473.4A
Other languages
Chinese (zh)
Other versions
CN105005553B (en
Inventor
张海仙
章毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN201510342473.4A priority Critical patent/CN105005553B/en
Publication of CN105005553A publication Critical patent/CN105005553A/en
Application granted granted Critical
Publication of CN105005553B publication Critical patent/CN105005553B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

A short-text sentiment orientation analysis method based on a sentiment dictionary (emotional thesaurus) is disclosed. First, a basic sentiment dictionary is built by word-frequency statistics, and the basic dictionary is expanded by judging the sentiment orientation of each candidate word from its statistical correlation with the words already in the basic dictionary. Then, on the basis of the sentiment dictionary, each evaluation statement S is treated as a unit and each sentiment word WS in the statement as a separator. A sentiment weight is computed for the phrase (WSi-1, WSi) between every two separators, where the phrase contains the word WSi but not the word WSi-1. The weights of all phrases are combined by weighted summation into the overall sentiment orientation value weight(S) of S, which determines the polarity of S: if weight(S) is greater than 0, the comment is a positive comment; otherwise S is considered a negative comment. Polarity classification of evaluation statements is thereby realized.

Description

Short-text sentiment orientation analysis method based on a sentiment dictionary
Technical field
The present invention relates to the field of sentiment orientation classification of short texts, and provides a short-text sentiment orientation analysis method based on a sentiment dictionary.
Background art
In the more than ten years since the concept of the Internet community was proposed, researchers around the world have paid close attention to techniques for detecting and studying Internet communities, and have made substantial progress.
Researchers first analyzed the topological structure of the Internet in depth. Contrary to expectation, the links within the Internet and many other networks are not completely random, so the structure of an Internet community cannot be fully described by a random graph. In particular, as more and more Internet data were analyzed, the random-graph model came under serious challenge. The actual structure of the Internet is far more complex than we imagine, and the relationships among links, websites, pages, users and administrators are diverse. The Internet contains many regions that are tightly connected internally but only weakly connected to the outside. These regions are Internet communities, and their structural features cannot be described clearly by a random graph.
As the concept of the Internet community was proposed and related research deepened, developers designed many different types of Internet community detection algorithms to detect community structure, and have continually improved and optimized these algorithms according to experimental results.
Compared with traditional methods, most current algorithms take the concurrency, real-time behavior and scalability of network operation into account in order to overcome physical limitations. For example, Sadi et al. proposed a community detection method that uses parallel clique-finding ants. Without affecting the quality of the result, it compresses the graph of the Internet structure down to a stable size to reduce the running cost of the algorithm, and so can process large-scale networks. Leung et al. proposed an improved label propagation algorithm that adds heuristics to perform real-time community detection on large-scale networks. Leskovec et al. compared a number of existing Internet community detection methods and found that community detection in large-scale networks cannot be solved by one simple algorithm; it is a very complex problem in which network structure, data distribution, crawl quality and other factors must all be considered. As detection technology matures, the usability of Internet community detection keeps improving, and compared with traditional brute-force algorithms, community detection has increasingly become something of an art. As an emerging direction, community detection will have a great impact on mining the structure of the Internet.
Summary of the invention
The object of the present invention is to provide a short-text sentiment orientation analysis method based on a sentiment dictionary.
The short-text sentiment orientation analysis method based on a sentiment dictionary is characterized by comprising the following steps:
Step 1: construct the sentiment dictionary. A basic sentiment dictionary is formed by word-frequency statistics. The SO-PMI method then computes the statistical correlation between each candidate word and the words in the basic sentiment dictionary to determine the candidate word's sentiment orientation, thereby expanding the basic dictionary.
Step 2: construct the sentiment analysis model. On the basis of the sentiment dictionary, each evaluation statement S is treated as a unit and each sentiment word WS in the statement as a separator. A sentiment weight is computed for the phrase (WSi-1, WSi) between two separators, and the weights of all phrases are combined by weighted summation into the overall sentiment orientation value weight(S) of S. The polarity of S is judged as follows: if weight(S) is greater than 0, the comment is a positive comment; otherwise S is considered a negative comment. Polarity classification of evaluation statements is thereby realized. The phrase (WSi-1, WSi) contains the word WSi but not the word WSi-1.
In the above technical scheme, the SO-PMI method comprises the following steps:
Step 2-1: after word segmentation with the ICTCLAS system, obtain the part-of-speech tag of each word;
Step 2-2: calculate the SO-PMI value of the two kinds of candidate word defined by word.propertyal ∈ {a, ad, an, ag, al} and word.propertyal ∈ {vn, vd, vi, vg, vl}; candidate words with any other part of speech are treated directly as neutral words;
The SO-PMI value of the two kinds of candidate word is calculated as follows:
Calculate the PMI value between the candidate word and the positive basic sentiment words, calculate the PMI value between the candidate word and the negative basic sentiment words, and subtract the latter from the former to obtain the SO-PMI value of the candidate word. The formula for SO-PMI is:
$$\mathrm{SO\text{-}PMI}(word) = \sum_{posWord \in posWords} \mathrm{PMI}(word, posWord) - \sum_{negWord \in negWords} \mathrm{PMI}(word, negWord)$$ (formula 1)
Here posWords is the positive basic sentiment dictionary, negWords is the negative basic sentiment dictionary, and word is the candidate word;
The relation between the SO-PMI value and the sentiment orientation is given by the following formula:
(formula 2)
Step 2-4: add to posWords the synonyms of the positive basic sentiment words, together with the words that satisfy word.propertyal ∈ {a, ad, an, ag, al} or word.propertyal ∈ {vn, vd, vi, vg, vl} and are judged by formula 2 to have a positive orientation;
Step 2-5: add to negWords the synonyms of the negative basic sentiment words, together with the words that satisfy word.propertyal ∈ {a, ad, an, ag, al} or word.propertyal ∈ {vn, vd, vi, vg, vl} and are judged by formula 2 to have a negative orientation. This yields a more comprehensive set of sentiment words for comparison.
Because it adopts the above technical scheme, the present invention has the following beneficial effects:
Our experimental results show that, on a data set of 100,000 product comments, the accuracy of sentiment orientation analysis based purely on machine learning and based purely on a sentiment dictionary is 67.9% and 83.27% respectively, while the combined method proposed here reaches 85.9%. This is far better than the machine-learning method and also better than the purely dictionary-based method.
Embodiment
The invention provides a short-text sentiment orientation analysis method based on a sentiment dictionary.
Construction of the sentiment dictionary
A sentiment dictionary is a set of words that can express positive or negative human emotion. To make it easier to compute a quantified sentiment orientation value for a product comment sentence, the sentiment dictionary described below also stores a sentiment orientation value for each word, where +1 represents the strongest positive sentiment and -1 the strongest negative sentiment.
The sentiment dictionary construction method we designed has two parts: forming a basic sentiment dictionary by word-frequency statistics, and expanding the basic dictionary with an improved SO-PMI method that determines the sentiment orientation of each candidate word from its statistical correlation with the words in the basic sentiment dictionary.
Construction of the basic sentiment dictionary
The basic sentiment dictionary is the foundation of, and the key to, short-text sentiment analysis based on natural language processing. This work computes the sentiment orientation value of a product comment sentence according to whether the words in the corpus appear in the sentiment dictionary and, for those that do, their sentiment orientation values. Which words the dictionary contains, whether those words are representative of the product evaluation domain, and whether their sentiment orientation values are accurate therefore all affect the accuracy of the classification result. The first step in addressing these questions is to build an accurate basic sentiment dictionary.
A common way to form a basic sentiment dictionary is to choose a series of sentiment words from HowNet, submit them one by one to the Google search engine, sort the sentiment words by the hit count (hits value) that Google returns, and take the words with the highest hit counts as the basic sentiment words. Because the corpus of this work comes only from product comments on e-commerce websites, the vocabulary of HowNet is too broad for this work. Moreover, the hit count returned by a search engine does not reflect whether a word is representative of a product evaluation corpus. This method is therefore unsuitable for this work.
This work instead uses a method based on word-frequency statistics to select the basic sentiment vocabulary semi-automatically. Most words carrying sentiment in short product evaluations are adjectives and verbs, with a small number of nouns. After preprocessing, it is therefore enough to take a sufficiently large set of product comment sentences, count word frequencies automatically for adjectives, verbs and nouns, and then, from the high-frequency words, manually choose the 20 positive sentiment words and the 20 negative sentiment words with the highest frequency. These words make up the basic sentiment dictionary of this work.
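The automatic counting step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `top_candidates`, the `(word, tag)` input layout and the rule that a tag starting with `a`, `v` or `n` marks an adjective, verb or noun are all assumptions, and the English words stand in for segmented Chinese tokens. The final manual selection of 20 positive and 20 negative words is not automated here.

```python
from collections import Counter

# Assumed tag prefixes in an ICTCLAS-style tag set:
# a* = adjective, v* = verb, n* = noun.
CANDIDATE_PREFIXES = ("a", "v", "n")


def top_candidates(tagged_comments, k=5):
    """Count adjectives, verbs and nouns and return the k most frequent.

    tagged_comments is a list of comments, each a list of (word, tag) pairs.
    """
    counts = Counter(
        word
        for comment in tagged_comments
        for word, tag in comment
        if tag.startswith(CANDIDATE_PREFIXES)
    )
    return counts.most_common(k)


# Toy corpus with English glosses in place of segmented Chinese words.
comments = [
    [("quality", "n"), ("very", "d"), ("good", "a")],
    [("good", "a"), ("recommend", "v")],
    [("bad", "a"), ("quality", "n")],
]
ranked = top_candidates(comments, k=3)
```

The ranked list is only a shortlist; a person would still read it and pick the positive and negative words by hand, as the text describes.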
Using this method, the positive and negative sentiment words finally included in the basic dictionary are listed in Table 1.
Table 1: basic sentiment dictionary
Because the basic sentiment words express a very strong sentiment orientation, we assign a sentiment orientation value of +1 to each positive basic sentiment word and -1 to each negative basic sentiment word.
Expansion of the sentiment dictionary
The vocabulary of the basic sentiment dictionary is very small and cannot cover all the words with a sentiment orientation that occur in a product evaluation corpus. The basic sentiment dictionary therefore needs to be expanded into a relatively complete sentiment dictionary. We expand it in two ways: adding synonyms, and adding candidate words that carry a sentiment orientation.
Adding synonyms
In short product evaluations, many of the words used to praise or criticize are synonyms of one another, so adding synonyms helps us recognize a wider range of sentiment words. For this purpose we use the Harbin Institute of Technology Chinese thesaurus [33] to expand the basic sentiment dictionary with synonyms. However, many synonyms in that thesaurus are highly literary words that are never used in a product evaluation corpus, for example the synonym "of inferior quality" of "bad". To improve the performance of the sentiment orientation calculation, we still screen the synonyms manually and keep only the commonly used ones. After synonym expansion the sentiment dictionary grows to 256 words. Because they are synonyms, we set the sentiment orientation value of every synonym of a positive word in the basic sentiment dictionary to +1, and that of every synonym of a negative word to -1.
Adding related sentiment words
Although building a completely exhaustive sentiment dictionary is very difficult, a dictionary with wider coverage can be built effectively by analyzing the correlation between each word in the corpus and the sentiment words in the dictionary, and adding the highly correlated words to the dictionary. This work uses a statistics-based method, pointwise mutual information (PMI), to calculate the correlation between a candidate word and the sentiment words in the dictionary, and thereby judge whether the candidate should be treated as a sentiment word. If so, it is added to the sentiment dictionary.
Pointwise mutual information calculates the correlation between two words on the basis of mutual information theory. The basic idea is to measure the probability that two words word_i and word_j co-occur in a product evaluation statement. The larger the co-occurrence probability, the higher the correlation between the two words, as shown below:
$$\mathrm{PMI}(word_i, word_j) = \log \frac{P(word_i \wedge word_j)}{P(word_i)\,P(word_j)}$$ (formula 5-1)
Here P(word_i ∧ word_j) is the probability that word_i and word_j co-occur in the corpus. It is computed as in formula (6-1), where n is the total number of product comments in the corpus and numSentence(word_i, word_j) is the number of evaluations that contain both word_i and word_j. P(word_i) and P(word_j) are the proportions of evaluations in the corpus that contain word_i and word_j respectively. They are computed as in formulas 6-2 and 6-3, where numSentence(word_i) is the number of evaluations in the corpus that contain word_i. PMI(word_i, word_j) in formula (5-1) represents the amount of information we gain about one of word_i and word_j when the other occurs, and so fully reflects the statistical correlation between them: when PMI is greater than 0, the two words are correlated, and the larger the PMI value, the stronger the correlation; when PMI equals 0, the two words are statistically independent; when PMI is less than 0, the two words are mutually exclusive.
$$P(word_i \wedge word_j) = \frac{numSentence(word_i, word_j)}{n}$$ (formula 6-1)
$$P(word_i) = \frac{numSentence(word_i)}{n}$$ (formula 6-2)
$$P(word_j) = \frac{numSentence(word_j)}{n}$$ (formula 6-3)
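The comment-level counts behind formulas 6-1 to 6-3 can be sketched in a few lines. This is an illustration under stated assumptions rather than the patent's code: each comment is represented as a set of words, the logarithm is taken to base 2 (the text does not state the base), and a pair that never co-occurs is given a PMI of 0, which is a choice made here to avoid taking the logarithm of zero.

```python
import math


def pmi(word_i, word_j, comments):
    """Pointwise mutual information of two words over a list of comments.

    comments is a list of sets of words; n is the number of comments.
    """
    n = len(comments)
    n_i = sum(1 for c in comments if word_i in c)    # numSentence(word_i)
    n_j = sum(1 for c in comments if word_j in c)    # numSentence(word_j)
    n_ij = sum(1 for c in comments if word_i in c and word_j in c)
    if n_ij == 0:
        return 0.0  # assumption: no co-occurrence is scored as 0
    p_ij = n_ij / n
    return math.log2(p_ij / ((n_i / n) * (n_j / n)))  # base 2 is an assumption


# Toy corpus with English glosses in place of segmented Chinese words.
corpus = [
    {"good", "fast"},
    {"good", "fast"},
    {"bad", "slow"},
    {"bad"},
]
score = pmi("good", "fast", corpus)
```

In this toy corpus "good" and "fast" always appear together, so their PMI is positive, while "good" and "slow" never co-occur and fall into the zero branch.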
Applying the principle of PMI to sentiment polarity analysis leads to the SO-PMI algorithm. SO-PMI uses PMI to calculate the statistical correlation between a candidate word and each group of basic sentiment words, and judges the sentiment orientation of the word from its correlation with each group. The calculation steps are: first, calculate the PMI value between the candidate word and the positive basic sentiment words; then, calculate the PMI value between the candidate word and the negative basic sentiment words; finally, subtract the latter from the former to obtain the SO-PMI value of the candidate word. Let the positive basic sentiment dictionary be posWords and the negative basic sentiment dictionary be negWords. For a candidate word word, SO-PMI is then calculated as in formula 6-4:
$$\mathrm{SO\text{-}PMI}(word) = \sum_{posWord \in posWords} \mathrm{PMI}(word, posWord) - \sum_{negWord \in negWords} \mathrm{PMI}(word, negWord)$$ (formula 6-4)
The relation between the SO-PMI value and the sentiment orientation is given by formula 6-5:
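$$\text{orientation}(word) = \begin{cases} \text{positive}, & \mathrm{SO\text{-}PMI}(word) > 0 \\ \text{neutral}, & \mathrm{SO\text{-}PMI}(word) = 0 \\ \text{negative}, & \mathrm{SO\text{-}PMI}(word) < 0 \end{cases}$$ (formula 6-5)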
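The basic SO-PMI calculation, with zero as the dividing threshold, can be sketched as below. The function names `so_pmi` and `orientation`, the `pmi_fn` parameter and the toy PMI table are hypothetical and exist only to keep the example self-contained; any PMI implementation could be passed in.

```python
def so_pmi(word, pos_words, neg_words, pmi_fn):
    """Sum of PMI with positive seeds minus sum of PMI with negative seeds."""
    pos_total = sum(pmi_fn(word, p) for p in pos_words)
    neg_total = sum(pmi_fn(word, q) for q in neg_words)
    return pos_total - neg_total


def orientation(score):
    """Basic rule: the sign of SO-PMI decides the orientation."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical PMI values, used here instead of a real corpus.
toy_pmi = {("nice", "good"): 1.5, ("nice", "bad"): -0.5}


def lookup(a, b):
    return toy_pmi.get((a, b), 0.0)


score = so_pmi("nice", ["good"], ["bad"], lookup)
label = orientation(score)
```

Passing the PMI function as a parameter keeps the subtraction in formula 6-4 separate from how the co-occurrence statistics are gathered.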
When we applied the SO-PMI method to the product evaluation corpus of this experiment, we found the following problems:
1) Many single-character verbs and proper nouns are neutral in themselves, but in the corpus they may co-occur very frequently with a particular sentiment word in the dictionary, which pushes their SO-PMI far from the neutral value. Take the verb "hits": its PMI with the positive words in the dictionary is 18.97 and its PMI with the negative words is 0, so its SO-PMI is far greater than 0. The noun "thinkpad" behaves similarly. Such cases bring many words without any sentiment orientation into the sentiment dictionary, which adds pointless computational cost to the classification method and harms classification accuracy.
2) The SO-PMI of many neutral words is often not exactly 0: it may be close to 0, or it may deviate from 0 considerably. Setting the threshold that separates positive from negative sentiment words at 0 is therefore unsuitable for this work.
3) Omission: the corpus used in this work consists of short product evaluations, the comments tend to contain few words, and the number of basic sentiment words is also small. The probability that a candidate word co-occurs with a basic sentiment word is therefore low, and the SO-PMI value tends toward 0, even when, from the viewpoint of sentiment analysis, the candidate word is strongly related to the basic sentiment words. Many words that should be included in the sentiment dictionary are missed as a result, which produces a feature sparsity problem.
The SO-PMI algorithm therefore needs to be adapted to the characteristics of the corpus of this work in order to solve these three problems. We propose the following three improvements.
1) For problem 1, the part-of-speech tag of each word is obtained after ICTCLAS word segmentation, and SO-PMI is calculated only for the two kinds of candidate word defined by formula 6-7 and formula 6-8. Candidate words with any other part of speech are treated directly as neutral words.
word.propertyal ∈ {a, ad, an, ag, al} (formula 6-7)
word.propertyal ∈ {vn, vd, vi, vg, vl} (formula 6-8)
Because adjectives generally carry a sentiment orientation, SO-PMI is calculated for all adjectives.
We discard nouns and words of the remaining parts of speech, because such words seldom carry a sentiment orientation. Owing to the nature of a product evaluation corpus, most nouns are product type names or brand names, such as "clothes" or "Mei Di". To prevent these neutral words from being added to the sentiment dictionary by mistake, and to make candidate expansion more efficient, we do not calculate SO-PMI for nouns either. We do, however, manually add to the sentiment dictionary a number of nouns with a strong sentiment orientation, together with their synonyms. Some of the manually added nominal sentiment words are shown in Table 2.
Table 2: manually selected nominal sentiment words
2) For problem 2, after observing a large amount of data, we readjusted the relation between the SO-PMI value and the sentiment orientation to formula (6-9).
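$$\text{orientation}(word) = \begin{cases} \text{positive}, & 1.36 < \mathrm{SO\text{-}PMI}(word) < 23 \\ \text{negative}, & -16 < \mathrm{SO\text{-}PMI}(word) < -1 \\ \text{neutral}, & \text{otherwise} \end{cases}$$ (formula 6-9)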
3) For problem 3, we further expand the posWords and negWords dictionaries used in formula 6-4. Specifically: (a) the synonyms of the positive basic sentiment words, and the words that satisfy formula 6-7 or formula 6-8 and are judged by formula 6-9 to have a positive orientation, are added to posWords; (b) the synonyms of the negative basic sentiment words, and the words that satisfy formula 6-7 or formula 6-8 and are judged by formula 6-9 to have a negative orientation, are added to negWords. This gives candidate words a more comprehensive set of sentiment words to be compared against, and avoids missing candidate words that carry a sentiment orientation.
Strictly, posWords is defined as follows:
1) if w is a positive word in the basic sentiment dictionary, then w ∈ posWords;
2) if w is a synonym of a positive word in the basic sentiment dictionary, then w ∈ posWords;
3) if w satisfies formula 6-7 or formula 6-8, and 1.36 < SO-PMI(w) < 23, then w ∈ posWords.
Similarly, negWords is defined as follows:
1) if w is a negative word in the basic sentiment dictionary, then w ∈ negWords;
2) if w is a synonym of a negative word in the basic sentiment dictionary, then w ∈ negWords;
3) if w satisfies formula 6-7 or formula 6-8, and -16 < SO-PMI(w) < -1, then w ∈ negWords.
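The third clause of each definition combines the part-of-speech filter with the adjusted SO-PMI ranges. A minimal sketch of that decision follows; the function name `classify_candidate` and the string labels are hypothetical, while the tag sets and the numeric ranges are the ones stated in the text.

```python
ADJ_TAGS = {"a", "ad", "an", "ag", "al"}    # formula 6-7
VERB_TAGS = {"vn", "vd", "vi", "vg", "vl"}  # formula 6-8


def classify_candidate(pos_tag, so_pmi_value):
    """Decide whether a candidate word joins posWords, negWords or neither."""
    if pos_tag not in ADJ_TAGS | VERB_TAGS:
        return "neutral"   # other parts of speech are treated as neutral
    if 1.36 < so_pmi_value < 23:
        return "positive"  # candidate for posWords
    if -16 < so_pmi_value < -1:
        return "negative"  # candidate for negWords
    return "neutral"


# A plain verb tag such as "v" is outside both tag sets, so even a large
# SO-PMI value does not make the word a sentiment word.
filtered = classify_candidate("v", 18.97)
accepted = classify_candidate("a", 5.0)
```

Checking the tag before the score reflects the order of the improvements: the part-of-speech filter addresses problem 1 and the ranges address problem 2.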
Following the improved SO-PMI algorithm, we took 100,000 comments after word segmentation as input and calculated the sentiment orientation value of every candidate word that satisfies formula 6-7 or formula 6-8 and does not duplicate a basic sentiment word. The candidate words that satisfy formula 6-9 were selected and added to the sentiment dictionary together with their sentiment orientation values. SO-PMI inevitably misclassifies the polarity of some sentiment words, so manual denoising is still needed. After the expansion, the dictionary contains 2393 sentiment words, of which 1302 are positive and 1091 are negative. This completes the construction of the sentiment dictionary of this work.
Design of the sentiment model
This section describes in detail the model we use for sentiment analysis of product comments, that is, the sentiment model. Its main idea is: on the basis of the sentiment dictionary, each evaluation statement S is treated as a unit and each sentiment word WS in the statement as a separator; a sentiment weight is computed for the phrase (WSi-1, WSi) between two separators; and the weights of all phrases are combined by weighted summation into the overall sentiment orientation value of S, which realizes polarity classification of evaluation statements. By convention, the phrase (WSi-1, WSi) contains the word WSi but not the word WSi-1.
The model consists of 6 modules: analysis of sentiment words, analysis of negation words, analysis of adverbs, analysis of fixed collocations, analysis of adversatives, and analysis of rhetorical questions and exclamatory sentences.
In designing these 6 modules, we modified the traditional sentiment model to varying degrees to make it suitable for sentiment orientation analysis of short product comments on e-commerce websites. For example, in the analysis of sentiment words, we take into account how the different parts of speech of a sentiment word affect its sentiment orientation, and introduce an Ad dictionary to give special treatment to words that can be both adjective and adverb. As another example, in the analysis of fixed collocations, we take into account how a fixed collocation affects the sentiment of the sentence or of the sentiment word, divide the collocations into 4 kinds, and give each kind its own special treatment, which makes the result of sentiment orientation classification more accurate.
Analysis of sentiment words
Sentiment words are analyzed as follows. For each word word in the comments to be analyzed, the sentiment dictionary is scanned to determine whether word is in it. If it is, word is treated as a sentiment word and its sentiment orientation value is read from the dictionary and returned. If it is not, word is treated as a neutral word and 0 is returned. This loop continues until every word in the whole comment set has been judged. The process is implemented by the algorithm analyzeSentimentWord.
However, some words can be both adjective and adverb. In some contexts they express a sentiment, while in others they only function as an adverb. In that case, deciding the sentiment orientation value purely by whether the word is in the sentiment dictionary is inaccurate.
This work distinguishes the following two cases.
1) When a word can be both adjective and adverb, the part of speech reported by the ICTCLAS tool differs according to the function of the word in the statement, as in example 6.1. By combining the sentiment dictionary with the part of speech of the word, we can therefore judge whether the word is acting as a sentiment word.
Example 6.1
Sentence 1: taste/n very/d general/a ./wj
Here "general" is a negative sentiment word, tagged a.
Sentence 2: general/ad not/d will/v go-offline/n ./wj
Here "general" is an adverb modifying "go offline"; it expresses the strength of the sentiment and is tagged ad.
Our solution to this problem is to build an Ad dictionary that holds the words that can be both adjective and adverb, and to stipulate: if a word belongs to the Ad dictionary and its part-of-speech tag contains the character "d", the word is not treated as a sentiment word.
The Ad dictionary we arrived at is shown in Table 6.3.
Table 6.3: Ad dictionary
good, many, really, especially, easily, strongly, completely, directly, substantially
2) Some sentiment words that can be both adjective and adverb have only recently come into use as adverbs, such as "little" in sentence 2 of example 6.2. In this case ICTCLAS cannot detect the adverb use of the word, and whether it is acting as an adverb can only be judged from the surrounding collocation.
Example 6.2:
Sentence 1: very/d little/a /ude thing/n ./wj
Here "little" is a negative sentiment word, tagged a.
Sentence 2: little/a expensive/a ./wj
Here "little" is an adverb modifying "expensive"; it expresses the strength of the sentiment and its part of speech is d, but ICTCLAS still tags it as a.
As example 6.2 shows, when "little" is followed by an adjective it acts as an adverb. The word "greatly" follows the same rule. By looking at the part of speech of the next word, we can therefore judge whether such a word is acting as a sentiment word. The rule is: if the word following a word that can be both adjective and adverb is an adjective (a), the word is not treated as a sentiment word.
Let the part of speech of the word word be property and the part of speech of the next word be nextProperty. We write weight(word) for the sentiment orientation value (also called the weight) of word, and isSentiment(word) for whether word is a sentiment word. The pseudocode of the sentiment word analysis algorithm is as follows:
Algorithm: analyzeSentimentWord (sentiment word analysis)
Input: word, property, nextProperty
Output: weight(word), isSentiment(word)
if (isInSentimentLexicon(word)) then
    if (isInAdLexicon(word) && property.contains("d")) then
        weight(word) := 0;
    else if ((word == "greatly" || word == "little") && nextProperty.contains("a")) then
        weight(word) := 0;
    else
        weight(word) := getWeightFromSentimentLexicon(word);
    end if
else
    weight(word) := 0;
end if
if (weight(word) == 0) then
    isSentiment(word) := false;
else
    isSentiment(word) := true;
end if
In the sentiment word analysis algorithm, the functions isInSentimentLexicon and isInAdLexicon determine whether a word is in the sentiment dictionary and in the Ad dictionary respectively, and the function getWeightFromSentimentLexicon reads the sentiment orientation value of a word from the sentiment dictionary.
By calculating the sentiment orientation value of each word, we obtain the true sentiment words (the words whose weight is not 0) and filter out the dictionary words that do not act as sentiment words in a particular statement (the dictionary words whose weight is 0).
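For readers who prefer runnable code, the analyzeSentimentWord pseudocode can be rendered in Python roughly as follows. The dictionary contents are hypothetical stand-ins written as English glosses (the real dictionaries hold Chinese words), and the substring checks on the tags deliberately mirror the `contains` calls of the pseudocode.

```python
# Hypothetical dictionary contents, written as English glosses.
SENTIMENT_LEXICON = {"good": 1.0, "general": -1.0, "little": -1.0}
AD_LEXICON = {"good", "general"}    # words that are both adjective and adverb
SIZE_WORDS = {"little", "greatly"}  # adverb use not detected by the tagger


def analyze_sentiment_word(word, prop, next_prop):
    """Return (weight, is_sentiment) for one word.

    prop is the word's part-of-speech tag, next_prop the tag of the next word.
    """
    if word not in SENTIMENT_LEXICON:
        weight = 0.0  # not in the dictionary: neutral
    elif word in AD_LEXICON and "d" in prop:
        weight = 0.0  # tagged as an adverb in this statement
    elif word in SIZE_WORDS and "a" in next_prop:
        weight = 0.0  # followed by an adjective, so it acts as an adverb
    else:
        weight = SENTIMENT_LEXICON[word]
    return weight, weight != 0


as_adjective = analyze_sentiment_word("general", "a", "wj")
as_adverb = analyze_sentiment_word("general", "ad", "d")
```

The two calls correspond to sentences 1 and 2 of example 6.1: the same word counts as a sentiment word when tagged a and is filtered out when tagged ad.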
Analysis of negation words
A negation word is a word that expresses negation, and its presence can change the sentiment orientation of the original sentence. In example 6.3, "like" is a positive evaluation, but once the negation word "not" is placed before it, the positive evaluation becomes a negative one. The negation word dictionary used in this work is shown in Table 6.4 and contains 40 negation words in total.
Table 6.4: negation word dictionary
Besides a single negation word, double negation, in which an even number of negation words appear in one sentence, is also common in Chinese. In example 6.4, "cannot" and "not" are both negation words that modify the sentiment word "like", and together they restore the positive sentiment orientation of "like".
Example 6.3: I/rr not/d like/vi it/rr ./wj
Example 6.4: I/rr cannot/d not/d like/vi it/rr ./wj
Negation words are analyzed as follows: when a sentiment word WSi appears in a statement, count the number negNum(WSi-1, WSi) of negation words between WSi and the previous separator WSi-1, that is, within one phrase. If negNum is odd, the sentiment value of the phrase is the negated sentiment orientation value of the sentiment word; otherwise the original sentiment orientation value is kept.
The number of negation words is counted as follows: scan the words of sentence s one by one; after WSi-1 has been scanned, take WSi-1 as the starting point and read the following words one by one, calling the function isNegWord to determine whether each word is in the negation word dictionary. If it is, negNum is increased by one. This continues until the next sentiment word WSi is reached. During this process each word is stored in turn in the array variable phrase(WSi-1, WSi), which completes the extraction of one phrase.
Let phrases denote the set of all phrases in a comment sentence, phrases[i] the i-th phrase, weight(phrases) the set of sentiment orientation values of all phrases, weight(phrases[i]) the sentiment orientation value of the i-th phrase, and negNum the number of negation words in a phrase. The negation word analysis algorithm analyzeNegWord is described as follows:
Algorithm: analyzeNegWord (negative word analysis)
Input: comment short sentence s to be analyzed
Output: phrases, weight (phrases)
i:=0;negNum:=0;
foreach word w in s
phrases[i].appendWord(w); // append word w to the end of the word sequence of phrases[i]
if(isInNegLexicon(w))then
negNum++;
else
weight:=analyzeSentimentWord(w);
if(weight!=0)then
if(negNum%2==0)then
weight(phrases[i]):=weight;
else
weight(phrases[i]):=-1*weight;
end if
i++;negNum:=0;
end if
end if
end for
In the negative word analysis algorithm, the function isInNegLexicon determines whether a word is in the negative word dictionary, and the function analyzeSentimentWord returns the sentiment orientation value of the word as computed by the emotion word analysis algorithm.
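The negation step above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the two lexicons are invented English stand-ins for Table 6.4 and the sentiment dictionary, the input is assumed to be already tokenized, and words after the last emotion word are simply dropped here because they never receive a weight.

```python
from typing import Dict, List, Set, Tuple

# Toy lexicons, invented for illustration only (not the patent's dictionaries).
NEG_LEXICON: Set[str] = {"not", "cannot"}
SENTIMENT_LEXICON: Dict[str, float] = {"like": 1.0, "hate": -1.0}


def analyze_neg_word(words: List[str]) -> Tuple[List[List[str]], List[float]]:
    """Split a tokenized comment into segments that each end at an emotion
    word, and flip a segment's weight when it holds an odd number of
    negative words."""
    phrases: List[List[str]] = []
    weights: List[float] = []
    current: List[str] = []
    neg_num = 0
    for w in words:
        current.append(w)
        if w in NEG_LEXICON:
            neg_num += 1
            continue
        weight = SENTIMENT_LEXICON.get(w, 0.0)
        if weight != 0:
            # An emotion word closes the current segment.
            weights.append(weight if neg_num % 2 == 0 else -weight)
            phrases.append(current)
            current, neg_num = [], 0
    return phrases, weights
```

With these toy lexicons, the tokens of Example 6.3 yield one negated segment, while the double negation of Example 6.4 leaves the positive weight unchanged.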
The analysis of adverbial word
An adverb is a word that expresses the intensity of an emotion. For example, "very" in "I like it very much" expresses strong positive emotion, whereas "comparatively" in "I comparatively like it" expresses only relatively weak positive emotion. According to how strongly an adverb modifies an emotion word, we divide adverbs into 4 categories and assign each category a numerical value representing emotion intensity. After compilation, the adverb dictionary adopted in this work is shown in Table 6.5.
Table 6.5 adverbial word dictionary
Similar to the analysis of negative words, we obtain the intensity of each adverb in a segment from the adverb dictionary, and take the product of these intensity values and the previously obtained sentiment orientation value of the segment as the new sentiment orientation value of the segment.
Let phrase denote a segment produced by the negative word analysis algorithm, weight(phrase) the weight of that segment as computed by that algorithm, and degree the intensity of an adverb. The adverb analysis algorithm analyzeAdvWord is described below:
Algorithm: analyzeAdvWord (adverbial word analysis)
Input: phrase, weight (phrase)
Output: weight (phrase)
degree:=1.0;
for each word w in phrase
if(isInAdvLexicon(w))then
degree:=degree*getDegreeFromAdvLexicon(w);
end if
end for
weight(phrase):=degree*weight(phrase);
In the adverb analysis algorithm, the function isInAdvLexicon determines whether a word is in the adverb dictionary, and the function getDegreeFromAdvLexicon obtains the adverb's emotion intensity from the adverb dictionary.
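A minimal Python sketch of the adverb step is given below. The degree values are hypothetical, since the contents of Table 6.5 are not reproduced in this text; only the multiplication of intensities into the segment weight follows the description above.

```python
from typing import Dict, List

# Hypothetical degree values; the real entries of Table 6.5 are not shown here.
ADV_LEXICON: Dict[str, float] = {"very": 1.5, "comparatively": 0.5}


def analyze_adv_word(phrase: List[str], weight: float) -> float:
    """Scale a segment's weight by the product of the intensities of all
    adverbs that occur in the segment."""
    degree = 1.0
    for w in phrase:
        # Words that are not in the adverb lexicon leave the degree unchanged.
        degree *= ADV_LEXICON.get(w, 1.0)
    return degree * weight
```

Because the intensities are multiplied, several adverbs in one segment compound rather than add.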
The analysis of regular collocation phrase
Through experiments we found that certain regular (fixed) collocations of phrases appear in some comments. Some of these collocations contain emotion words, yet the collocation changes the sentiment that the emotion word contributes to the whole statement; others contain no emotion word but still give the whole statement a sentiment orientation. In both cases, computing emotion weights from emotion words, negative words, and adverbs alone is not enough; the regular collocations must also be analyzed.
Regular collocations are divided into the following four kinds.
1) Regular collocations composed of adverbs (d) or conjunctions (c), as in Example 6.5. We stipulate that they are analyzed before any of the other analyses described in this chapter begins.
Example 6.5 if/c again/d beautiful/a a bit/m just/d good/a /y ./wj
Although this sentence contains positive emotion words such as "beautiful" and "good", the regular collocation "if ... just ..." gives the sentence a negative emotion.
We design the algorithm matchAdvConjPatterns to handle the regular collocations of adverbs and conjunctions. The procedure is: based on the rule set acPatterns of regular collocations of adverbs and conjunctions, use regular expressions to test whether the evaluation statement S matches acPatterns; if it does, the algorithms described above are no longer used to compute the sentiment orientation values of the segments in S, and S is assigned an emotion weight directly.
Suppose posWord represents a positive basic emotion word; there are the following 4 groups of regular collocation rules:
(1) if if if/again/more///a bit/many/can ... just ... + posWord+ ... }
(2). ... again/more/have ...
(3) just. ... too ... "
(4) need// obtain/after ./occupy ./sky/all/also/heavy. ... ...
These can be expressed with regular expressions as:
(1) ["if" | "again" | "more" | "if" | "if" | "a bit" | "many" | "can"] + [\u4E00-\u9FA5]* + ["just"] + [\u4E00-\u9FA5]* + posWord + [\u4E00-\u9FA5]* + " "
(2) "best" + [\u4E00-\u9FA5]* + ["again" | "more" | "having"] + [\u4E00-\u9FA5]*
(3) "be exactly" + [\u4E00-\u9FA5]* + "too" + [\u4E00-\u9FA5]*
(4) ["need" | "taking" | "obtaining" | "afterwards" | "unexpectedly" | "my god" | "all" | "going back" | "again"] + [\u4E00-\u9FA5]* + [" "] + [\u4E00-\u9FA5]*
The pseudocode of the matchAdvConjPatterns algorithm is as follows:
Algorithm: matchAdvConjPatterns (match the regular collocations of adverbs and conjunctions)
Input: S, acPatterns (the set of regular expressions for the regular collocation rules of adverbs and conjunctions)
Output: weight (S)
if(acPatterns.match(S))then
weight(S)=-0.5;
else
compute weight(S) with the other methods described in this chapter;
end if
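The matching step can be sketched in Python with the standard re module. The single rule below is an assumed back-translation of rule (3), reading "be exactly" as 就是 and "too" as 太; the original Chinese literals are not preserved in this text, so the pattern, the sample sentences, and the fallback callable are all hypothetical.

```python
import re
from typing import Callable, List, Pattern

HAN = "[\u4e00-\u9fa5]*"  # any run of common Chinese characters

# Assumed reading of rule (3): "be exactly" ... "too" ...
AC_PATTERNS: List[Pattern[str]] = [re.compile("就是" + HAN + "太" + HAN)]


def match_adv_conj_patterns(s: str, fallback: Callable[[str], float]) -> float:
    """Return -0.5 when S matches an adverb/conjunction collocation rule;
    otherwise defer to the other analyses, passed in as `fallback`."""
    if any(p.search(s) for p in AC_PATTERNS):
        return -0.5
    return fallback(s)
```

Passing the remaining analyses in as a callable keeps this rule check first in the pipeline, as the text stipulates.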
2) Regular collocations of ambiguous emotion words. For some emotion words, the sentiment orientation differs depending on the words they are collocated with: it may keep its original value, be negated, or become neutral. Such emotion words are called ambiguous emotion words here; see Example 6.6 and Example 6.7.
Example 6.6
Sentence 1: cost performance/n very/d high/a ./wj
Sentence 2: price/n high/a /y ./wj
"High" is a positive emotion word in the sentiment dictionary. In sentence 1, "high" is collocated with "cost performance", and the sentiment orientation value of the whole sentence takes the emotion word's original weight; in sentence 2, where it is collocated with "price", it conveys negative emotion.
Example 6.7
Sentence 1: quite/d large/a /ude ./wj
Sentence 2: large/a /y point/qt ./wj
"Large" is a positive emotion word in the sentiment dictionary. We found that when adjectives such as "large" are collocated with "point", their original sentiment orientation is reversed.
As with the regular collocations of adverbs and conjunctions described above, two groups of rules are defined here for the regular collocations of ambiguous emotion words: the first group negates the sentiment orientation value, and the second group resets it to zero. The sentiment orientation value of a segment is recalculated according to whichever rule applies. The first group is denoted ambigNegPatterns and the second ambigZeroPatterns; with negWord representing a negative emotion word, the rules are defined as follows.
ambigNegPatterns comprises the following 5 rules:
((negWord + "rate") | ("valency" + ["lattice" | "position"])) + [\u4E00-\u9FA5]* + ["height" | "low" | "greatly" | "little"] + [\u4E00-\u9FA5]*
[\u4E00-\u9FA5]* + ["just" | "afterwards" | "again"] + [\u4E00-\u9FA5]* + "price reduction" + [\u4E00-\u9FA5]*
[\u4E00-\u9FA5]* + "price reduction" + [\u4E00-\u9FA5]* + "too fast" + [\u4E00-\u9FA5]*
[\u4E00-\u9FA5]* + "point" + [\u4E00-\u9FA5]*
"use" + [\u4E00-\u9FA5]* + "for a long time"
ambigZeroPatterns comprises the following 3 rules:
[\u4E00-\u9FA5]* + "temporarily" + [\u4E00-\u9FA5]*
[\u4E00-\u9FA5]* + "also useless"
[\u4E00-\u9FA5]* + "not knowing" + [\u4E00-\u9FA5]* + "how"
The pseudocode of the ambiguous emotion word analysis algorithm is as follows:
Algorithm: analyzeAmbigEmotionWord (analysis of ambiguous emotion words)
Input: phrase, weight (phrase)
Output: weight (phrase)
if(ambigNegPatterns.match(phrase))then
weight(phrase)=-1*weight(phrase);
else if(ambigZeroPatterns.match(phrase))then
weight(phrase)=0;
end if
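A Python sketch of the two rule groups follows. Each group holds one rule only, and the Chinese literals are assumed back-translations ("valency" + "lattice" or "position" read as 价格 or 价位, "height | low | greatly | little" as 高 | 低 | 大 | 小, "temporarily" as 暂时); they are hypothetical readings rather than text recovered from the patent.

```python
import re
from typing import List, Pattern

HAN = "[\u4e00-\u9fa5]*"  # any run of common Chinese characters

# Assumed reading of part of the first ambigNegPatterns rule.
AMBIG_NEG_PATTERNS: List[Pattern[str]] = [
    re.compile("价[格位]" + HAN + "[高低大小]"),
]
# Assumed reading of the first ambigZeroPatterns rule.
AMBIG_ZERO_PATTERNS: List[Pattern[str]] = [re.compile("暂时")]


def analyze_ambig_emotion_word(phrase: str, weight: float) -> float:
    """Negate the segment weight for a negating collocation, zero it for a
    neutralizing collocation, and otherwise leave it unchanged."""
    if any(p.search(phrase) for p in AMBIG_NEG_PATTERNS):
        return -weight
    if any(p.search(phrase) for p in AMBIG_ZERO_PATTERNS):
        return 0.0
    return weight
```

The negating group is tested first, mirroring the if / else if order of the pseudocode.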
3) Regular collocations of reverse emotion words. For some emotion words, when they are preceded by an adverb of strong emotional coloring, that is, an adverb with a larger weight, the sentiment orientation is inverted. We call such emotion words reverse emotion words.
Example 6.8 too/d large/a /y ./wj
When the emotion word "large" is joined with an adverb of larger weight such as "too", its positive sentiment orientation is inverted.
To analyze this kind of regular collocation, we build the reverse sentiment dictionary shown in Table 6.6 to store reverse emotion words, and we assume that an adverb whose weight is greater than 0.5 can reverse the emotion value of a reverse emotion word.
Table 6.6 Reverse sentiment dictionary
Bright, greatly, easily, small and exquisite, light, in vain, simply, tightly, thin, gently, heavy, long, high
The processing of reverse emotion word regular collocations is described below.
While scanning from the previous emotion word to the emotion word Wsi, the variable advWeight records the weight of each adverb encountered, so that on reaching Wsi, advWeight holds the weight of the adverb nearest to Wsi. First check whether advWeight is greater than 0.5, then check whether a negative word occurs before Wsi. If no negative word occurs, use the function isOppositeWord(Wsi) to determine whether Wsi belongs to the reverse sentiment dictionary; if it does, the emotion value of phrase(Wsi-1, Wsi) is reversed. If a negative word does occur before Wsi, the emotion value of phrase(Wsi-1, Wsi) is left unchanged, because in the combination "negative word + adverb + emotion word" the emotion word does not behave as a negative emotion word. For example, in the short sentence "not too easy", the positive emotion word "easy" is not turned into a negative emotion word by the preceding adverb "too".
Let negNum(Wsi-1, Wsi) be the number of negative words contained in phrase(Wsi-1, Wsi). The processing algorithm for reverse emotion word regular collocations can be described in pseudocode as follows:
Algorithm: matchOppositeEmotionPatterns (match the regular collocations of reverse emotion words)
Input: weight (phrase (Wsi-1, Wsi)), advWeight, negNum (Wsi-1, Wsi)
Output: weight (phrase (Wsi-1, Wsi))
if((advWeight>0.5)&&(negNum(Wsi-1,Wsi)==0)&&(isOppositeWord(Wsi)))then
weight(phrase(Wsi-1,Wsi))=-1*weight(phrase(Wsi-1,Wsi));
end if
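The three-part condition can be sketched in Python as below. The lexicon is a toy English stand-in for Table 6.6, and the adverb weights passed in by the caller are illustrative values, not entries from the adverb dictionary.

```python
from typing import Set

# Toy English stand-ins for the reverse sentiment dictionary (Table 6.6).
OPPOSITE_LEXICON: Set[str] = {"large", "light", "thin", "easy"}


def match_opposite_emotion_patterns(
    weight: float, adv_weight: float, neg_num: int, emotion_word: str
) -> float:
    """Reverse the segment weight only when a strong adverb (weight > 0.5)
    precedes a reverse emotion word and the segment holds no negative word."""
    if adv_weight > 0.5 and neg_num == 0 and emotion_word in OPPOSITE_LEXICON:
        return -weight
    return weight
```

Under these assumptions, "too large" is reversed, while "not too easy" keeps its weight because the segment contains a negative word.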
4) Regular collocations of negative words. We stipulate that this analysis is carried out only when the whole comment S contains no emotion word.
In the field of product reviews, we noticed that some evaluation statements contain no emotion word at all, only negative words, as in Example 6.9. Under the method described in section 6.2.2, such a statement would be identified as neutral. Yet the negative emotion of the statement is clearly perceptible, and it is conveyed by the negative words.
Example 6.9 electric fan/n all/d not have/d !/wt
This sentence contains two negative words ("no" and "not having"), yet its sentiment orientation is negative.
The solution to this problem is to summarize several regular collocations of negative words into the rule set negPatterns.
When the scan reaches the end of S, first check whether S matches a negPatterns rule; if it does, the weight of S is set to -0.5. If no rule matches, the parity of the number of negative words is used to assign S a sentiment orientation value: -0.5 for an odd count and 0.5 for an even count.
At present, negPatterns contains only one rule; new rules can be added later.
["no" | "not having"] + [\u4E00-\u9FA5]* + "just" + ["not having" | "None"] + [\u4E00-\u9FA5]*
The processing algorithm for regular collocations of negative words can be described in pseudocode as follows:
Algorithm: matchNegPatterns (match the regular collocations of negative words)
Input: S, negNum (S)
Output: weight (S)
if((negPatterns.match(S))||(negNum(S)%2!=0))then
weight(S)=-0.5;
else
weight(S)=0.5;
end if
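A short Python sketch of this fallback follows. The rule set is passed in by the caller because the Chinese literals of the single negPatterns rule are not preserved in this text; the function name and parameters are illustrative.

```python
import re
from typing import List, Pattern


def match_neg_patterns(s: str, neg_num: int, neg_patterns: List[Pattern[str]]) -> float:
    """Weight a comment that contains no emotion word: negative when a
    negation collocation matches or the negative-word count is odd,
    positive otherwise."""
    if any(p.search(s) for p in neg_patterns) or neg_num % 2 != 0:
        return -0.5
    return 0.5
```

The pattern check takes precedence over parity, so a sentence with an even number of negative words can still be scored as negative when it matches a rule.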
The analysis of adversative
An adversative is a word that reverses the meaning of a sentence.
Example 6.11
than/p market/n /ude1 cheap/a ,/wd but/c sell/v after/f also/d not have/v market/n good/a ./wj
The first short sentence expresses positive emotion; however, once the adversative "but" appears, the sentiment of the sentence leans negative.
The adversative dictionary designed in this work is shown in Table 6.7.
Table 6.7 adversative dictionary
A statement containing an adversative can be abstracted into the following structure:
phrase(Wsi-1, Wsi) + punctuation mark + adversative + phrase(Wsi, Wsi+1)
Given the effect of the adversative, the sentiment orientations of weight(phrase(Wsi-1, Wsi)) and weight(phrase(Wsi, Wsi+1)) should be opposite.
The adversative analysis proceeds as follows: after the analysis of emotion words, negative words, and adverbs, scan toward the end of the sentence from the current emotion word Wsi to find the next emotion word Wsi+1. If an adversative is encountered during this scan, negate weight(phrase(Wsi-1, Wsi)) so that the sentiment orientation of phrase(Wsi-1, Wsi) leans toward that of the segment phrase(Wsi, Wsi+1) following the adversative. Let phrases denote the set of all segments in a comment short sentence, weight(phrases) the sentiment orientation values of these segments, numPhrases the total number of segments, phrases[i] the i-th segment, and isTransitionWord(word) the function that determines whether word belongs to the adversative dictionary. The adversative analysis algorithm can be described in pseudocode as follows:
Algorithm: analyzeTransitionWord (analysis adversative)
Input: phrases, weight (phrases)
Output: weight (phrases)
for(i=0;i<numPhrases-1;i+=2)
for word in phrases[i+1]
if(isTransitionWord(word))then
weight(phrases[i])=-1*weight(phrases[i]);
break;
end if
end for
end for
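The adversative step can be sketched in Python as follows. The lexicon is a toy stand-in for Table 6.7. Note one deliberate difference, which is an assumption of this sketch: it visits every adjacent pair of segments, whereas the pseudocode above advances i by 2.

```python
from typing import List, Set

# Toy stand-in for the adversative dictionary (Table 6.7).
TRANSITION_LEXICON: Set[str] = {"but", "however"}


def analyze_transition_word(phrases: List[List[str]], weights: List[float]) -> List[float]:
    """Negate the weight of segment i when segment i+1 contains an
    adversative, so that segment i leans toward the sentiment after it."""
    result = list(weights)
    for i in range(len(phrases) - 1):
        if any(w in TRANSITION_LEXICON for w in phrases[i + 1]):
            result[i] = -result[i]
    return result
```

For a two-segment input shaped like Example 6.11, the positive first segment is negated, so both segments contribute negative weight to the total.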
The analysis of exclamative sentence and confirmative question
Exclamatory sentences and rhetorical (confirmative) questions are both sentence patterns that intensify the sentiment orientation of a statement. An exclamatory sentence only strengthens it, whereas a rhetorical question can also reverse it.
For the analysis of exclamatory sentences, we take the exclamation mark "!" as the marker of an exclamatory sentence and denote it exc. Its emotion weight is computed as follows: when the scan reaches an exclamation mark, we search from back to front for the emotion word Wsi-1 nearest to the exclamation mark and use the sentiment orientation value of Wsi-1 as the weight of exc.
Unlike traditional emotion models, our emotion model applies no special processing to rhetorical questions. This is because, in the set of short product-review sentences, most sentences containing a rhetorical question word do not actually reverse the meaning but rather express doubt about the product. For example, in Example 6.12, although " " is a typical rhetorical question word, it does not really reverse the sentiment orientation of this sentence.
Example 6.12
inside/f unexpectedly/d have/vyou 60/m more/m M/x /ude1 file/n ,/wd /d is/vshi second hand/n ?/ww
Sentiment orientation value weighted calculation
First compute the sentiment orientation values of all segments contained in a comment S. Adding these values gives the sentiment orientation value weight(S) of S. The polarity of S is then judged as follows: if weight(S) is greater than 0, the comment is a positive comment; otherwise, S is considered a negative comment.
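The final aggregation is a plain sum followed by a sign test, sketched here in Python. The function name and the string labels are illustrative; note that a total of exactly 0 falls into the negative class, as the rule above states.

```python
from typing import List, Tuple


def classify_comment(segment_weights: List[float]) -> Tuple[str, float]:
    """Sum the segment weights of a comment and classify it as positive
    when the total is greater than 0, otherwise as negative."""
    total = sum(segment_weights)
    polarity = "positive" if total > 0 else "negative"
    return polarity, total
```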

Claims (2)

1. A short text sentiment orientation analysis method based on a sentiment dictionary, characterized by comprising the following steps:
Step 1, constructing the sentiment dictionary: a basic sentiment dictionary is formed by a word frequency statistics method; the SO-PMI method is then used to compute the statistical correlation between a candidate word and the words in the basic sentiment dictionary so as to determine the candidate word's sentiment orientation, thereby expanding the basic dictionary;
Step 2, constructing the sentiment analysis model: on the basis of the sentiment dictionary, each evaluation statement S is taken as a unit and each emotion word WS in the statement is taken as a separator; the emotion weight of the segment phrase(WSi-1, WSi) between two separators is computed, and the weights of all segments are then weighted and summed to obtain the overall sentiment orientation value weight(S) of each evaluation statement S; the polarity of each evaluation statement S is judged as follows: if weight(S) is greater than 0, the comment is a positive comment; otherwise, the evaluation statement S is considered a negative comment, thereby realizing polarity classification of evaluation statements; the segment phrase(WSi-1, WSi) includes the word WSi but does not include the word WSi-1.
2. The short text sentiment orientation analysis method based on a sentiment dictionary according to claim 1, characterized in that the SO-PMI method comprises the following steps:
Step 2-1, obtaining the part of speech of each word after word segmentation with the ICTCLAS system;
Step 2-2, computing the SO-PMI value of the two kinds of candidate words word defined by word.propertyal ∈ {a, ad, an, ag, al} and word.propertyal ∈ {vn, vd, vi, vg, vl}; candidate words of all other parts of speech are treated directly as neutral words;
Computing the SO-PMI value of the two kinds of candidate words word is specifically:
compute the PMI values between the candidate word and the positive basic emotion words, compute the PMI values between the candidate word and the negative basic emotion words, and finally subtract the latter from the former to obtain the SO-PMI value of the candidate word; the formula for SO-PMI is as follows:
$$\mathrm{SO\text{-}PMI}(word) = \sum_{posWord \in posWords} \mathrm{PMI}(word, posWord) - \sum_{negWord \in negWords} \mathrm{PMI}(word, negWord) \qquad \text{(formula 1)}$$
posWords is the positive basic sentiment dictionary, negWords is the negative basic sentiment dictionary, and word is the candidate word;
The relation between the SO-PMI value and the sentiment orientation is given by the following formula:
(formula 2)
Step 2-4, adding to posWords the synonyms of the positive basic emotion words, together with the emotion words that satisfy word.propertyal ∈ {a, ad, an, ag, al} or word.propertyal ∈ {vn, vd, vi, vg, vl} and are judged by formula 2 to have a positive orientation;
Step 2-5, adding to negWords the synonyms of the negative basic emotion words, together with the emotion words that satisfy word.propertyal ∈ {a, ad, an, ag, al} or word.propertyal ∈ {vn, vd, vi, vg, vl} and are judged by formula 2 to have a negative orientation, thereby obtaining a relatively comprehensive emotion word sample.
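Formula 1 can be sketched in Python as follows. The claim does not state how PMI itself is estimated, so this sketch assumes the common definition PMI(a, b) = log2(p(a, b) / (p(a) p(b))) with probabilities estimated from document-level co-occurrence counts, and it assumes that a pair which never co-occurs contributes 0. The corpus and function names are illustrative.

```python
import math
from typing import List, Sequence, Set


def pmi(a: str, b: str, docs: List[Set[str]]) -> float:
    """Pointwise mutual information of two words, estimated from how often
    they occur, alone and together, across the documents."""
    n = len(docs)
    count_a = sum(1 for d in docs if a in d)
    count_b = sum(1 for d in docs if b in d)
    count_ab = sum(1 for d in docs if a in d and b in d)
    if count_ab == 0:
        return 0.0  # assumption: no co-occurrence means no association
    return math.log2(count_ab * n / (count_a * count_b))


def so_pmi(word: str, pos_words: Sequence[str], neg_words: Sequence[str],
           docs: List[Set[str]]) -> float:
    """Formula 1: summed PMI with the positive seed words minus summed PMI
    with the negative seed words."""
    positive = sum(pmi(word, p, docs) for p in pos_words)
    negative = sum(pmi(word, q, docs) for q in neg_words)
    return positive - negative
```

A candidate word that co-occurs mainly with positive seed words receives a positive SO-PMI value, and one that co-occurs mainly with negative seed words receives a negative value.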
CN201510342473.4A 2015-06-19 2015-06-19 Short text Sentiment orientation analysis method based on sentiment dictionary Active CN105005553B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510342473.4A CN105005553B (en) 2015-06-19 2015-06-19 Short text Sentiment orientation analysis method based on sentiment dictionary

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510342473.4A CN105005553B (en) 2015-06-19 2015-06-19 Short text Sentiment orientation analysis method based on sentiment dictionary

Publications (2)

Publication Number Publication Date
CN105005553A true CN105005553A (en) 2015-10-28
CN105005553B CN105005553B (en) 2017-11-21

Family

ID=54378229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510342473.4A Active CN105005553B (en) 2015-06-19 2015-06-19 Short text Sentiment orientation analysis method based on sentiment dictionary

Country Status (1)

Country Link
CN (1) CN105005553B (en)

Also Published As

Publication number Publication date
CN105005553B (en) 2017-11-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant