CN105005553B - Short text Sentiment orientation analysis method based on sentiment dictionary - Google Patents

Info

Publication number
CN105005553B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510342473.4A
Other languages
Chinese (zh)
Other versions
CN105005553A (en)
Inventor
张海仙
章毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN201510342473.4A
Publication of CN105005553A
Application granted
Publication of CN105005553B
Active
Anticipated expiration

Abstract

A short-text sentiment orientation analysis method based on a sentiment dictionary. A basic sentiment dictionary is first built by a method based on word-frequency statistics; the statistical correlation between each candidate word and the words in the basic sentiment dictionary is then calculated to determine the candidate's sentiment orientation, thereby expanding the basic dictionary. Then, on the basis of the sentiment dictionary, each evaluation sentence S is taken as a unit and each emotion word WS in the sentence is used as a separator. An emotion weight is calculated for the segment (WSi-1, WSi) between two separators, and the weights of all segments are combined by weighted summation to give the overall sentiment orientation value weight(S) of the evaluation sentence S. If weight(S) is greater than 0, the evaluation is positive; otherwise, the evaluation sentence S is considered negative. This realizes polarity classification of evaluation sentences. The segment (WSi-1, WSi) includes the word WSi but not the word WSi-1.

Description

Short text Sentiment orientation analysis method based on sentiment dictionary
Technical field
The present invention relates to the technical field of sentiment orientation classification of short texts, and provides a short-text sentiment orientation analysis method based on a sentiment dictionary.
Background
In the more than ten years since the concept of the Internet community was proposed, researchers in many countries have paid close attention to the techniques and research related to Internet community detection and have made substantial progress.
Researchers first carried out in-depth analysis of the topological structure of the Internet. Contrary to what people had imagined, the interconnections of the Internet and of many other networks are not entirely random, and the structure of Internet communities cannot be fully described by random graphs. In particular, as more and more Internet data were analyzed, the random-graph model was seriously challenged. The actual structure of the Internet is far more complex than we imagine, and the relations among links, sites, pages, users, and administrators are diverse. The Internet contains many regions whose internal connections are dense while their connections to the outside are weak. These regions are Internet communities, and their structural features cannot be clearly described by random graphs.
As the concept of the Internet community was developed and the related research deepened, developers designed various types of Internet community detection algorithms to detect community structure, and continuously improved and optimized these algorithms according to experimental results.
Compared with earlier methods, most current algorithms take into account the concurrency, real-time behavior, and scalability of network operation in order to overcome physical limitations. One example is the community detection method proposed by Sadi et al., in which parallel ants search for cliques. In this way, the graph of the Internet structure can be compressed to a stable size without affecting the quality of the result, which reduces the running cost of the algorithm and makes it possible to process large-scale networks. Leung et al. proposed an improvement to the label propagation algorithm that adds heuristics in order to perform real-time community detection on large-scale networks. Leskovec et al. compared several existing Internet community detection methods and found that community detection in large-scale networks cannot be solved by one simple algorithm; it is an extremely complex problem that must take into account the network structure, the data distribution, the effect of network crawling, and many other issues. As detection techniques mature, the usability of Internet community detection keeps improving; compared with traditional brute-force algorithms, community detection has increasingly become an art. As an emerging direction, community detection will have a great influence on mining the structure of the Internet.
Summary of the invention
The object of the invention is to provide a short-text sentiment orientation analysis method based on a sentiment dictionary.
The short-text sentiment orientation analysis method based on a sentiment dictionary is characterized by comprising the following steps:
Step 1, building the sentiment dictionary: a basic sentiment dictionary is built by a method based on word-frequency statistics; then, using the SO-PMI method, the statistical correlation between each candidate word and the words in the basic sentiment dictionary is calculated to determine the candidate's sentiment orientation, thereby expanding the basic dictionary.
Step 2, building the sentiment analysis model: on the basis of the sentiment dictionary, each evaluation sentence S is taken as a unit and each emotion word WS in the sentence is used as a separator. An emotion weight is calculated for the segment (WSi-1, WSi) between two separators, and the weights of all segments are combined by weighted summation to give the overall sentiment orientation value weight(S) of the evaluation sentence S. The polarity of each evaluation sentence S is judged as follows: if weight(S) is greater than 0, the evaluation is positive; otherwise, the evaluation sentence S is considered negative. This realizes polarity classification of evaluation sentences. The segment (WSi-1, WSi) includes the word WSi but not the word WSi-1.
The SO-PMI method comprises the following steps:
Step 2-1, after word segmentation with the ICTCLAS system, obtaining the part of speech of each word;
Step 2-2, calculating SO-PMI values only for the two kinds of candidate word restricted by word.propertyal ∈ { a, ad, an, ag, al } and word.propertyal ∈ { vn, vd, vi, vg, vl }; candidate words of any other part of speech are directly regarded as neutral words;
The SO-PMI value of these two kinds of candidate word is calculated as follows:
The PMI value between the candidate word and the positive basic emotion words is calculated, the PMI value between the candidate word and the negative basic emotion words is calculated, and the latter is subtracted from the former to obtain the SO-PMI value of the candidate word. The SO-PMI calculation formula is as follows:
where posWords is the positive basic sentiment dictionary, negWords is the negative basic sentiment dictionary, and word is the candidate word;
The relation between the SO-PMI value and the sentiment orientation is given by the following formula:
Step 2-3, adding to posWords the synonyms of the positive basic emotion words, together with the emotion words that satisfy word.propertyal ∈ { a, ad, an, ag, al } or word.propertyal ∈ { vn, vd, vi, vg, vl } and are judged to have a positive orientation by formula 2;
Step 2-4, adding to negWords the synonyms of the negative basic emotion words, together with the emotion words that satisfy word.propertyal ∈ { a, ad, an, ag, al } or word.propertyal ∈ { vn, vd, vi, vg, vl } and are judged to have a negative orientation by formula 2, thereby obtaining a comprehensive comparison sample of emotion words.
Because it adopts the above technical scheme, the present invention has the following beneficial effects:
Our experimental results show that, on a data set of 100,000 commodity evaluations, the accuracy of a sentiment orientation analysis method based purely on machine learning is 67.9% and that of a method based purely on a sentiment dictionary is 83.27%, while the accuracy of the combined method proposed here reaches 85.9%. This is far better than the machine-learning-based method and also better than the purely dictionary-based method.
Embodiment
The invention provides a short-text sentiment orientation analysis method based on a sentiment dictionary.
Construction of the sentiment dictionary
A sentiment dictionary is a set of words that can express positive or negative human emotion. To facilitate the later quantitative calculation of the sentiment orientation value of commodity evaluation sentences, the sentiment dictionary here also stores a sentiment orientation value for each word, where +1 represents the strongest positive emotion and -1 represents the strongest negative emotion.
The sentiment dictionary construction method we designed has two parts: building a basic sentiment dictionary by a method based on word-frequency statistics; and, based on an improved SO-PMI method, calculating the statistical correlation between candidate words and the words in the basic sentiment dictionary to determine their sentiment orientation, thereby expanding the basic dictionary.
Construction of the basic sentiment dictionary
The basic sentiment dictionary is the foundation of, and the key to, short-text sentiment analysis based on natural language processing. This work calculates the sentiment orientation value of a commodity evaluation sentence from whether the words in the corpus appear in the sentiment dictionary and from the sentiment orientation values of the words that do appear. Therefore, which words are included in the sentiment dictionary, whether those words are representative of the commodity evaluation domain, and whether their sentiment orientation values are accurate all affect the accuracy of the classification results. The first step in addressing these issues is to establish an accurate basic sentiment dictionary.
A common way to build a basic sentiment dictionary is to select a series of emotion words from HowNet, submit them to the Google search engine, sort them by the number of hits Google returns, and take the few words with the most hits as the basic emotion words. Because the corpus of this work comes only from commodity reviews on e-commerce websites, the HowNet vocabulary is too broad for it. Moreover, the hit count returned by a search engine cannot reflect whether a word is representative in a commodity evaluation corpus. This approach is therefore unsuitable for this work.
This work uses a method based on word-frequency statistics to select the basic emotion words semi-automatically. Because the emotion-bearing words in commodity evaluation short texts are mostly adjectives, verbs, and a small number of nouns, after preprocessing it suffices to take a sufficiently large set of commodity evaluation sentences, automatically count word frequencies for adjectives, verbs, and nouns, and then manually choose, from the high-frequency words, the 20 positive emotion words and the 20 negative emotion words with the highest frequency. These words make up the basic sentiment dictionary of this work.
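The automatic counting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the input format (lists of `(word, tag)` pairs), the tag-prefix convention, and the function name `count_candidate_words` are assumptions made for the example, and the manual selection of the 20 positive and 20 negative words is left to a human reviewer.

```python
from collections import Counter

# Assumption: ICTCLAS-style tags, where adjective tags start with "a",
# verb tags with "v", and noun tags with "n".
CANDIDATE_PREFIXES = ("a", "v", "n")


def count_candidate_words(tagged_sentences):
    """Count how often each adjective, verb, or noun occurs in the corpus."""
    counts = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            if tag.startswith(CANDIDATE_PREFIXES):
                counts[word] += 1
    return counts


# Toy corpus (hypothetical data): each sentence is a list of (word, tag) pairs.
corpus = [
    [("quality", "n"), ("very", "d"), ("good", "a")],
    [("good", "a"), ("like", "vi")],
    [("bad", "a"), ("quality", "n")],
]
counts = count_candidate_words(corpus)
# The most frequent candidates would then be screened by hand.
shortlist = counts.most_common(10)
```

Only the counting is automated here; deciding which high-frequency words are positive or negative remains a manual step, as in the text.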
Using the above method, the positive and negative emotion words finally included in the basic dictionary are shown in Table 1.
Table 1: Basic sentiment dictionary
Because the basic emotion words express very strong emotion, we assign a sentiment orientation value of +1 to the positive basic emotion words and -1 to the negative basic emotion words.
Expansion of the sentiment dictionary
The basic sentiment dictionary is very small and cannot contain all the emotion-bearing words that occur in the commodity evaluation corpus. It is therefore necessary to expand the basic sentiment dictionary into a relatively complete one. We expand it in two ways: adding synonyms, and adding candidate words that carry sentiment orientation.
Adding synonyms
In commodity evaluation short texts, many commendatory or derogatory words are synonyms of one another, so adding synonyms helps us recognize emotion words more broadly. We therefore use Harbin Institute of Technology's Chinese thesaurus [33] to expand the basic sentiment dictionary with synonyms. However, many synonyms in this thesaurus are highly literary words that are never used in the commodity evaluation corpus, such as the synonym "of inferior quality" of "bad". To improve the performance of the sentiment orientation calculation, we still need to screen the commonly used synonyms manually. After synonym expansion, the sentiment dictionary grows to 256 words. Because they are synonyms, we set the sentiment orientation value of every synonym of a positive emotion word in the basic sentiment dictionary to +1 and that of every synonym of a negative emotion word to -1.
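As a rough sketch of the data structure this produces, the dictionary can be held as a mapping from word to sentiment orientation value, with each screened synonym inheriting the value of its seed word. The function name `build_lexicon`, the example words, and the shape of the `synonyms` argument are hypothetical; the thesaurus lookup and the manual screening are assumed to have happened beforehand.

```python
def build_lexicon(positive_seeds, negative_seeds, synonyms):
    """Return {word: orientation}; seeds and their screened synonyms get +1 or -1."""
    lexicon = {}
    for seeds, value in ((positive_seeds, 1.0), (negative_seeds, -1.0)):
        for seed in seeds:
            lexicon[seed] = value
            for synonym in synonyms.get(seed, ()):
                # Do not overwrite a word that already has an orientation.
                lexicon.setdefault(synonym, value)
    return lexicon


# Hypothetical seeds and manually screened synonyms.
lexicon = build_lexicon(
    positive_seeds=["good"],
    negative_seeds=["bad"],
    synonyms={"good": ["great"], "bad": ["poor"]},
)
```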
Adding related emotion words
Although building a fully exhaustive sentiment dictionary is extremely difficult, a sentiment dictionary with wider coverage can be built effectively by analyzing the correlation between each word in the corpus and the emotion words in the dictionary and adding highly correlated words to the dictionary. This work uses a statistics-based method, pointwise mutual information (PMI), to calculate the correlation between a candidate word and the emotion words in the dictionary and thereby judge whether the word should be treated as an emotion word. If so, it is added to the sentiment dictionary.
PMI calculates the correlation between words based on mutual information theory. The basic idea is to count the probability that two words word_i and word_j co-occur in commodity evaluation sentences. The higher the co-occurrence probability, the stronger the correlation between the two words, as shown in the following formula:
where p(word_i ∧ word_j) is the probability that word_i and word_j co-occur in the corpus, calculated as shown in formula (6-1), in which n is the total number of commodity evaluations in the corpus and numSentence(word_i, word_j) is the number of evaluations that contain both word_i and word_j. P(word_i) and P(word_j) are the proportions of evaluations in the corpus that contain word_i and word_j respectively, calculated as shown in formulas 6-2 and 6-3, where numSentence(word_i) is the number of evaluations in the corpus that contain word_i. PMI(word_i, word_j) represents the amount of information we gain about one of the two variables when the other occurs, which reflects the statistical correlation between word_i and word_j. When PMI is greater than 0, the two words are correlated, and the larger the PMI value, the stronger the correlation; when PMI equals 0, the two words are statistically independent; when PMI is less than 0, the two words are mutually exclusive.
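Written out from the definitions above, the quantities take the following form. This is a reconstruction from the prose rather than the original typeset formulas; the base-2 logarithm is an assumption (any base gives the same sign behaviour described above).

```latex
\mathrm{PMI}(word_i, word_j) = \log_2 \frac{p(word_i \wedge word_j)}{P(word_i)\,P(word_j)}

p(word_i \wedge word_j) = \frac{\mathrm{numSentence}(word_i, word_j)}{n}

P(word_i) = \frac{\mathrm{numSentence}(word_i)}{n}, \qquad
P(word_j) = \frac{\mathrm{numSentence}(word_j)}{n}
```

With these definitions, PMI equals 0 exactly when p(word_i ∧ word_j) = P(word_i) P(word_j), which matches the statement that a value of 0 indicates statistical independence.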
When the principle of PMI is applied to sentiment polarity analysis, it develops into the SO-PMI algorithm. SO-PMI uses the idea of PMI to calculate the statistical correlation between a candidate word and each group of basic emotion words, and judges the word's sentiment orientation from these correlations taken together. The calculation steps are: first, calculate the PMI value between the candidate word and the positive basic emotion words; then, calculate the PMI value between the candidate word and the negative basic emotion words; finally, subtract the latter from the former to obtain the SO-PMI value of the candidate word. Assuming that the positive basic sentiment dictionary is posWords and the negative basic sentiment dictionary is negWords, the SO-PMI of a candidate word word is calculated as shown in formula 6-4:
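Following the three steps just described, the SO-PMI value can be written as below. The summation over every word of each dictionary is an assumption taken from the usual form of SO-PMI; the index names pw and nw are introduced here only for illustration.

```latex
\mathrm{SO\text{-}PMI}(word) =
  \sum_{pw \in posWords} \mathrm{PMI}(word, pw)
  \;-\;
  \sum_{nw \in negWords} \mathrm{PMI}(word, nw)
```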
The relation between the SO-PMI value and the sentiment orientation is shown in formula 6-5:
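The calculation can be illustrated with a small sketch. It rests on several assumptions that are not stated in the text: sentences are represented as sets of words, PMI uses a base-2 logarithm, a pair that never co-occurs contributes a PMI of 0, SO-PMI sums PMI over each dictionary, and the unadjusted rule reads the orientation from the sign of SO-PMI (a threshold of 0). All function names are hypothetical.

```python
import math


def pmi(word_i, word_j, sentences):
    """PMI from sentence-level counts; each sentence is a set of words."""
    n = len(sentences)
    n_i = sum(1 for s in sentences if word_i in s)
    n_j = sum(1 for s in sentences if word_j in s)
    n_ij = sum(1 for s in sentences if word_i in s and word_j in s)
    if n_ij == 0:
        return 0.0  # assumption: no co-occurrence contributes nothing
    return math.log2((n_ij / n) / ((n_i / n) * (n_j / n)))


def so_pmi(word, pos_words, neg_words, sentences):
    """PMI with the positive seeds minus PMI with the negative seeds."""
    positive = sum(pmi(word, p, sentences) for p in pos_words)
    negative = sum(pmi(word, q, sentences) for q in neg_words)
    return positive - negative


def orientation(score):
    """Unadjusted rule (assumed): the sign of SO-PMI decides the orientation."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy corpus (hypothetical data).
sentences = [{"cheap", "good"}, {"cheap", "good"}, {"slow", "bad"}, {"good"}]
score_cheap = so_pmi("cheap", ["good"], ["bad"], sentences)
score_slow = so_pmi("slow", ["good"], ["bad"], sentences)
```

In this toy corpus "cheap" only ever appears alongside "good", so its score is positive, while "slow" only appears alongside "bad", so its score is negative.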
When applying the SO-PMI method to the commodity evaluation corpus of this experiment, we found the following problems:
1) Many single-character verbs and proper nouns are neutral in themselves, but in the corpus they may co-occur very frequently with a particular emotion word in the dictionary, which causes their SO-PMI to deviate greatly from the neutral value. Take the verb "hit": its PMI value with the positive words in the dictionary is 18.97 and its PMI value with the negative words is 0, so its SO-PMI value is far greater than 0. The same happens with the noun "thinkpad". Such cases bring many words with no sentiment orientation into the sentiment dictionary, which causes meaningless performance overhead for the sentiment classification method and harms classification accuracy.
2) The SO-PMI of many neutral words is often not exactly 0: it may be close to 0, or it may deviate considerably from 0. Setting the threshold that separates positive from negative emotion words to 0 is therefore unsuitable for this work.
3) Omission: because the corpus used in this work consists of commodity evaluations in short-text form, an evaluation often contains few words and the number of basic emotion words is also small, so the probability that a candidate word co-occurs with a basic emotion word is relatively low, i.e. the SO-PMI value tends toward 0. From the perspective of sentiment analysis, however, the correlation between such a candidate word and the basic emotion words may in fact be strong. As a result, many words that should be included in the sentiment dictionary are missed, which causes feature sparsity.
It is therefore necessary to adapt the SO-PMI algorithm to the characteristics of the corpus of this work in order to solve the above three problems. We propose the following three improvements.
1) For problem 1, the part of speech of each word is obtained after ICTCLAS word segmentation, and SO-PMI values are calculated only for the two kinds of candidate word restricted by formula 6-7 and formula 6-8; candidate words of any other part of speech are directly regarded as neutral words.
word.propertyal ∈ { a, ad, an, ag, al } (formula 6-7)
word.propertyal ∈ { vn, vd, vi, vg, vl } (formula 6-8)
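The part-of-speech restriction in formulas 6-7 and 6-8 amounts to a simple membership test on the tag returned by the segmenter. A minimal sketch, assuming tagged words arrive as `(word, tag)` pairs; the function name `is_candidate` and the sample tags are illustrative only.

```python
ADJECTIVE_TAGS = {"a", "ad", "an", "ag", "al"}  # formula 6-7
VERB_TAGS = {"vn", "vd", "vi", "vg", "vl"}      # formula 6-8


def is_candidate(tag):
    """True if SO-PMI should be calculated for a word with this tag."""
    return tag in ADJECTIVE_TAGS or tag in VERB_TAGS


# Hypothetical tagged words; everything else is treated as neutral.
tagged = [("good", "a"), ("like", "vi"), ("thinkpad", "n"), ("hit", "v")]
candidates = [word for word, tag in tagged if is_candidate(tag)]
```

Note that the plain tag "v" does not appear in formula 6-8 as quoted, so in this sketch a word tagged "v" is filtered out together with the nouns.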
Because adjectives generally carry emotion, SO-PMI values are calculated for all adjectives.
Meanwhile, we discard nouns and words of the remaining parts of speech, because such words seldom carry emotion. Owing to the particular nature of the commodity evaluation corpus, most nouns are product type names or brand names, such as "clothes" and "Mei Di". To prevent these neutral words from being mistakenly included in the sentiment dictionary, and to improve the efficiency of the candidate expansion algorithm, we do not calculate SO-PMI values for nouns. However, we manually add a series of nouns with strong emotion, together with their synonyms, to the sentiment dictionary. Some of the manually added nominal emotion words are shown in Table 2.
Table 2: Manually selected nominal emotion words
2) For problem 2, after observing a large amount of data, we readjust the relation between the SO-PMI value and the sentiment orientation as in formula (6-9).
3) For problem 3, we further expand the posWords and negWords dictionaries used in formula 6-4. Specifically: (a) the synonyms of the positive basic emotion words, together with the emotion words that satisfy formula 6-7 or formula 6-8 and are judged to have a positive orientation by formula 6-9, are added to posWords; (b) the synonyms of the negative basic emotion words, together with the emotion words that satisfy formula 6-7 or formula 6-8 and are judged to have a negative orientation by formula 6-9, are added to negWords. In this way, candidate words are compared against a more comprehensive sample of emotion words, which avoids missing candidate words that carry sentiment orientation.
Strictly, posWords is defined as follows:
1) If w is a positive word in the basic sentiment dictionary, then w ∈ posWords;
2) If w is a synonym of some positive word in the basic sentiment dictionary, then w ∈ posWords;
3) If w satisfies formula 6-7 or formula 6-8, and 1.36 < SO-PMI(w) < 23, then w ∈ posWords.
Similarly, negWords is defined as follows:
1) If w is a negative word in the basic sentiment dictionary, then w ∈ negWords;
2) If w is a synonym of some negative word in the basic sentiment dictionary, then w ∈ negWords;
3) If w satisfies formula 6-7 or formula 6-8, and -16 < SO-PMI(w) < -1, then w ∈ negWords.
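The numeric bounds in the two definitions can be expressed as a small classification function. This is a sketch under one assumption: a score that falls outside both intervals is treated as neutral, which the text implies (such words join neither dictionary) but does not state. The function name is hypothetical.

```python
def classify_so_pmi(score):
    """Orientation under the adjusted bounds quoted in the definitions above."""
    if 1.36 < score < 23:
        return "positive"
    if -16 < score < -1:
        return "negative"
    return "neutral"  # assumption: anything outside both intervals is neutral


examples = {score: classify_so_pmi(score) for score in (18.97, 0.5, -2.0, 25.0)}
```

The part-of-speech condition (formula 6-7 or 6-8) is a separate check that would be applied before this function is called.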
Following the improved SO-PMI algorithm, we take 100,000 evaluations after word segmentation as input, calculate the sentiment orientation value of every candidate word that satisfies formula 6-7 or formula 6-8 and does not duplicate a basic emotion word, select the candidate words that satisfy formula 6-9, and add each such word to the sentiment dictionary together with its sentiment orientation value. SO-PMI inevitably misclassifies the polarity of some emotion words, so manual denoising is required. After expansion, the number of emotion words in the dictionary increases to 2393, comprising 1302 positive emotion words and 1091 negative emotion words. This completes the construction of the sentiment dictionary for this work.
Design of the emotion model
This section describes in detail the model we use for sentiment analysis of commodity reviews, i.e. the emotion model. Its main idea is: on the basis of the sentiment dictionary, each evaluation sentence S is taken as a unit and each emotion word WS in the sentence is used as a separator; an emotion weight is calculated for the segment (WSi-1, WSi) between two separators, and the weights of all segments are combined by weighted summation to give the overall sentiment orientation value of S, which realizes polarity classification of evaluation sentences. By convention, the segment (WSi-1, WSi) includes the word WSi but not the word WSi-1.
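The segmentation and the final decision can be sketched as follows. The sketch makes two assumptions that the text leaves open: every segment weight enters the sum with an equal coefficient of 1 (the actual weighting scheme is not given here), and words after the last emotion word are dropped because they belong to no complete segment. The function names and sample words are hypothetical.

```python
def split_segments(words, lexicon):
    """Cut a word list into segments, each ending with an emotion word."""
    segments, current = [], []
    for word in words:
        current.append(word)
        if lexicon.get(word, 0.0) != 0.0:
            segments.append(current)  # segment includes WSi, excludes WSi-1
            current = []
    return segments  # assumption: trailing words after the last emotion word are dropped


def sentence_polarity(segment_weights):
    """Positive if the summed weight is greater than 0, otherwise negative."""
    total = sum(segment_weights)  # assumption: equal coefficients
    return "positive" if total > 0 else "negative"


lexicon = {"good": 1.0, "slow": -1.0}  # toy values
words = ["quality", "good", "but", "delivery", "slow", "really"]
segments = split_segments(words, lexicon)
```

In the full model each segment weight would first be adjusted by the negative-word, adverb, and other modules before the summation.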
The model consists of 6 modules: the analysis of emotion words, the analysis of negative words, the analysis of adverbs, the analysis of fixed collocations, the analysis of adversatives, and the analysis of rhetorical questions and exclamatory sentences.
In designing these 6 modules, we modified the traditional emotion model to varying degrees to make it suitable for sentiment orientation analysis of commodity evaluation short texts on e-commerce websites. For example, in the analysis of emotion words, we consider the influence of an emotion word's part of speech on its sentiment orientation and introduce an Ad dictionary to give special treatment to words that can be both adjective and adverb. As another example, in the analysis of fixed collocations, we consider the influence of fixed collocations on the emotion of the sentence or of the emotion word, divide them into 4 kinds, and give each kind of collocation corresponding special treatment, so that the result of sentiment orientation classification is more accurate.
Analysis of emotion words
Emotion words are analyzed as follows: for each word word in the evaluation under analysis, the sentiment dictionary is scanned to check whether word is in it. If so, word is regarded as an emotion word, and its sentiment orientation value is read from the sentiment dictionary and returned; if not, word is regarded as a neutral word and 0 is returned. This repeats until every word in the whole evaluation set has been judged. The process is implemented by the algorithm analyzeSentimentWord.
However, some words can be both adjective and adverb: in some contexts they carry emotion, while in others they function only as an adverb. In such cases, calculating the sentiment orientation value solely from whether the word is in the sentiment dictionary is inaccurate.
This work distinguishes the following two cases.
1) When a word can be both adjective and adverb, the part of speech assigned by the ICTCLAS tool differs according to the word's function in the sentence, as in example 6.1. We can therefore combine the part of speech of the word with the sentiment dictionary to judge whether the word is an emotion word.
Example 6.1
Sentence 1: Taste/n very/d general/a ./wj
Here "general" serves as a negative emotion word; its part of speech is a.
Sentence 2: Generally/ad not/d will/v go-offline/n ./wj
Here "general" serves as an adverb modifying "go offline" and expresses the intensity of the emotion; its part of speech is ad.
To address this, we establish an Ad dictionary containing the words that can be both adjective and adverb, and stipulate: if a word belongs to the Ad dictionary and its part-of-speech tag contains the character "d", the word is not regarded as an emotion word.
After summarizing, we determined the Ad dictionary shown in Table 6.3.
Table 6.3: Ad dictionary
good, more, really, especially, easily, strongly, completely, directly, substantially
2) For some emotion words that can be both adjective and adverb, the adverbial usage has only recently come into use, such as "small" in the second sentence of example 6.2. In this case ICTCLAS cannot recognize the adverbial part of speech of the word, and it can only be judged from the surrounding collocation.
Example 6.2:
Sentence 1: Very/d small/a (particle)/ude thing/n ./wj
Here "small" serves as a negative emotion word; its part of speech is a.
Sentence 2: Small/a expensive/a ./wj
Here "small" serves as an adverb modifying "expensive" and expresses the intensity of the emotion; its part of speech is d, but ICTCLAS still tags it as a.
Example 6.2 shows that when "small" is collocated with an adjective, it functions as an adverb. The word "big" follows the same rule. We can therefore judge whether such a word acts as an emotion word from the part of speech of the word that follows it. The rule is: if the word following a word that can be both adjective and adverb is an adjective (a), the word is not regarded as an emotion word.
Suppose the part of speech of the word word is property and the part of speech of the next word is nextProperty. We write weight(word) for the sentiment orientation value (also called the weight) of word, and isSentiment(word) for whether word is an emotion word. The emotion word analysis algorithm is described in pseudocode as follows:
Algorithm: analyzeSentimentWord (emotion word analysis)
Input: word, property, nextProperty
Output: weight(word), isSentiment(word)
if (isInSentimentLexicon(word)) then
    if (isInAdLexicon(word) && property.contains("d")) then
        weight(word) := 0;
    else if ((word == "big" || word == "small") && nextProperty.contains("a")) then
        weight(word) := 0;
    else
        weight(word) := getWeightFromSentimentLexicon(word);
    end if
else
    weight(word) := 0;
end if
if (weight(word) == 0) then
    isSentiment(word) := false;
else
    isSentiment(word) := true;
end if
In the emotion word analysis algorithm, the functions isInSentimentLexicon and isInAdLexicon judge whether the word is in the sentiment dictionary and in the Ad dictionary respectively, and the function getWeightFromSentimentLexicon obtains the sentiment orientation value of the word from the sentiment dictionary.
By calculating the sentiment orientation value of each word, we obtain the actual emotion words (those whose weight is not 0) and filter out the emotion words that play no emotional role in the particular sentence (those whose weight is 0).
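The pseudocode above translates fairly directly into Python. The sketch below is illustrative only: the lexicon contents and weights are toy values, placing "general" in the Ad lexicon is a hypothetical choice made to reproduce example 6.1, and the snake_case names stand in for the functions named in the pseudocode.

```python
# Toy lexicons (hypothetical words and weights, not the patent's tables).
SENTIMENT_LEXICON = {"good": 1.0, "general": -0.5, "small": -0.5}
AD_LEXICON = {"good", "general"}


def analyze_sentiment_word(word, prop, next_prop):
    """Return (weight, is_sentiment) following the pseudocode above."""
    if word not in SENTIMENT_LEXICON:
        weight = 0.0
    elif word in AD_LEXICON and "d" in prop:
        weight = 0.0  # tagged as an adverb in this sentence
    elif word in ("big", "small") and "a" in next_prop:
        weight = 0.0  # followed by an adjective, so used adverbially
    else:
        weight = SENTIMENT_LEXICON[word]
    return weight, weight != 0.0


# Example 6.1: "general" tagged a (emotion word) versus ad (adverb).
as_adjective = analyze_sentiment_word("general", "a", "wj")
as_adverb = analyze_sentiment_word("general", "ad", "d")
```

As in the pseudocode, the check on the next tag is a substring test, so any tag containing "a" (for example "ad" or "an") also triggers the second rule.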
Analysis of negative words
A negative word is a word that expresses negation; its presence can change the sentiment orientation of the original sentence. For instance, in example 6.3 "like" is a positive evaluation, but once the negative word "not" is placed before it, the positive evaluation becomes a negative one. The negative word dictionary determined in this work is shown in Table 6.4 and contains 40 negative words in total.
Table 6.4: Negative word dictionary
Besides single negation, double negation also occurs often in Chinese, i.e. several negative words appear in succession in one sentence. In example 6.4, "cannot" and "not" are both negative words; they modify the emotion word "like" together and in the end restore its positive sentiment orientation.
Example 6.3: I/rr not/d like/vi it/rr ./wj
Example 6.4: I/rr cannot/d not/d like/vi it/rr ./wj
The analysis method for negative words is: when an emotion word WSi occurs in the sentence, count the number negNum(WSi-1, WSi) of negative words between WSi and the previous separator WSi-1 (i.e. within one segment). If negNum is odd, the emotion value of the segment is the negation of the sentiment orientation value of the emotion word; otherwise the original sentiment orientation value is kept.
The number of negative words is counted as follows: the words in sentence s are scanned one by one; on reaching WSi-1, it is taken as the starting point and each following word word is fetched in turn, calling the function isInNegLexicon to judge whether word is in the negative word dictionary. If so, negNum is incremented by one. This continues until the next emotion word WSi is reached. During this process, the words are stored in order in the array variable phrase(WSi-1, WSi), which completes the extraction of one segment.
Let phrases denote the set of all segments in an evaluation sentence, phrases[i] the i-th segment, weight(phrases) the set of sentiment orientation values of all segments, weight(phrases[i]) the sentiment orientation value of the i-th segment, and negNum the number of negative words contained in a segment. The negative word analysis algorithm analyzeNegWord is described as follows:
Algorithm: AnalyzeNegWord (negation-word analysis)
Input: short review sentence s to be analyzed
Output: phrases, weight(phrases)
i := 0; negNum := 0;
foreach word w in s
    phrases[i].appendWord(w); // append word w to the end of the word sequence phrases[i]
    if (isInNegLexicon(w)) then
        negNum++;
    else
        weight := analyzeSentimentWord(w);
        if (weight != 0) then
            if (negNum % 2 == 0) then
                weight(phrases[i]) := weight;
            else
                weight(phrases[i]) := -1 * weight;
            end if
            i++; negNum := 0;
        end if
    end if
end for
In the negation-word analysis algorithm, the function isInNegLexicon determines whether a word is in the negation-word dictionary, and the function analyzeSentimentWord obtains the sentiment orientation value of the word as returned by the sentiment-word analysis algorithm.
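The segmentation and parity rule above can be sketched in Python as follows. This is only an illustrative sketch: the function name analyze_neg_words, the toy lexicon and its weights are assumptions, the tokens are English glosses of Examples 6.3 and 6.4, and words after the last sentiment word are simply dropped here.

```python
from typing import Callable, List, Set, Tuple


def analyze_neg_words(
    words: List[str],
    neg_lexicon: Set[str],
    sentiment_weight: Callable[[str], float],
) -> Tuple[List[List[str]], List[float]]:
    """Cut a tokenized review into segments that each end at a sentiment word,
    flipping the sign of a segment whose negation count is odd."""
    phrases: List[List[str]] = []
    weights: List[float] = []
    current: List[str] = []
    neg_num = 0
    for w in words:
        current.append(w)
        if w in neg_lexicon:
            neg_num += 1
            continue
        weight = sentiment_weight(w)  # 0 means "not a sentiment word here"
        if weight != 0:
            weights.append(weight if neg_num % 2 == 0 else -weight)
            phrases.append(current)
            current, neg_num = [], 0
    # Words after the last sentiment word carry no weight and are dropped.
    return phrases, weights


def toy_weight(word: str) -> float:
    """Hypothetical sentiment lexicon lookup (values are made up)."""
    return {"like": 1.0}.get(word, 0.0)


negations = {"not", "cannot"}  # hypothetical glosses of two negation words
single = analyze_neg_words(["I", "not", "like", "it"], negations, toy_weight)
double = analyze_neg_words(["I", "cannot", "not", "like", "it"], negations, toy_weight)
```

With one negation word the segment weight is negated; with two it keeps the original sign, which matches the double-negation behavior described for Example 6.4.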
Analysis of adverbs
Adverbs express the intensity of a sentiment. For example, "very" in "I like it very much" expresses a strong positive sentiment, whereas "fairly" in "I fairly like it" expresses only a relatively weak positive sentiment. According to how strongly an adverb modifies a sentiment word, we divide adverbs into 4 categories and assign each category a numerical value representing its sentiment intensity. After compilation, the adverb dictionary used in this project is shown in Table 6.5.
Table 6.5 Adverb dictionary
As in the analysis of negation words, we obtain the intensity of each adverb in a segment from the adverb dictionary, and take the product of these intensity values and the previously obtained sentiment orientation value of the segment as the new sentiment orientation value of the segment.
Let phrase denote a segment produced by the negation-word analysis algorithm, weight(phrase) the weight of the segment phrase obtained by that algorithm, and degree the intensity of an adverb. The adverb analysis algorithm AnalyzeAdvWord is described as follows:
Algorithm: AnalyzeAdvWord (adverb analysis)
Input: phrase, weight(phrase)
Output: weight(phrase)
degree := 1.0;
foreach word w in phrase
    if (isInAdvLexicon(w)) then
        degree := degree * getDegreeFromAdvLexicon(w);
    end if
end for
weight(phrase) := degree * weight(phrase);
In the adverb analysis algorithm, the function isInAdvLexicon determines whether a word is in the adverb dictionary, and the function getDegreeFromAdvLexicon obtains the sentiment intensity of the adverb from the adverb dictionary.
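A minimal Python sketch of the adverb step is given below. The intensity values in toy_adverbs are hypothetical, since the contents of Table 6.5 are not reproduced here, and the function name is our own.

```python
from typing import Dict, List


def analyze_adv_words(
    phrase: List[str], weight: float, adv_lexicon: Dict[str, float]
) -> float:
    """Scale a segment's weight by the product of its adverb intensities."""
    degree = 1.0
    for w in phrase:
        if w in adv_lexicon:
            degree *= adv_lexicon[w]
    return degree * weight


# Hypothetical intensities for two adverb classes (not the values of Table 6.5).
toy_adverbs = {"very": 2.0, "fairly": 0.5}
strong = analyze_adv_words(["I", "very", "like"], 1.0, toy_adverbs)
weak = analyze_adv_words(["I", "fairly", "like"], 1.0, toy_adverbs)
```

Because the scaling is a plain product, a segment without adverbs keeps its weight unchanged, and the sign set by the negation step is preserved.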
Analysis of fixed collocation phrases
Our experiments showed that specific fixed collocations occur in some review sentences. Some of these collocations contain a sentiment word, yet the collocation changes how that sentiment word steers the sentiment of the whole sentence; others contain no sentiment word at all, yet still give the whole sentence a sentiment orientation. In both cases, computing the sentiment weight from sentiment words, negation words and adverbs alone is not enough; the fixed collocation phrases must also be analyzed.
Fixed collocation phrases are divided into the following four types.
1) Fixed collocations composed of adverbs (d) or conjunctions (c), as in Example 6.5. We stipulate that they are analyzed before any of the other analyses described in this chapter begins.
Example 6.5 if/c again/d beautiful/a a bit/m just/d good/a /y ./wj
Although the sentence contains positive sentiment words such as "beautiful" and "good", the fixed collocation "would be fine" (as in "if it were a bit more beautiful, it would be fine") gives the sentence a negative sentiment.
We designed the algorithm matchAdvConjPatterns to handle the fixed collocations of adverbs and conjunctions. The procedure is as follows: based on the rule set acPatterns of adverb and conjunction fixed collocations, regular expressions are used to check whether the review sentence S matches acPatterns. If it does, the sentiment orientation values of the segments in S are no longer computed with the preceding algorithms; a sentiment weight is assigned to S directly.
Let posWord denote a positive basic sentiment word. The fixed collocation rules fall into the following 4 groups:
(1) {if/again/more/if/if/a bit/more/can ... just ... + posWord + ...}
(2) {best ... again/more/having ...}
(3) {it is exactly ... too ...}
(4) {need/taking/obtaining/later/unexpectedly/my god/all/going back/again ... ...}
Expressed as regular expressions:
(1) ["if" | "again" | "more" | "if" | "if" | "a bit" | "more" | "can"] + [\u4E00-\u9FA5]* + ["just"] + [\u4E00-\u9FA5]* + posWord + [\u4E00-\u9FA5]* + " "
(2) "best" + [\u4E00-\u9FA5]* + ["again" | "more" | "having"] + [\u4E00-\u9FA5]*
(3) "it is exactly" + [\u4E00-\u9FA5]* + "too" + [\u4E00-\u9FA5]*
(4) ["need" | "taking" | "obtaining" | "later" | "unexpectedly" | "my god" | "all" | "going back" | "again"] + [\u4E00-\u9FA5]* + [" "] + [\u4E00-\u9FA5]*
The pseudocode of the matchAdvConjPatterns algorithm is as follows:
Algorithm: MatchAdvConjPatterns (matching fixed collocations of adverbs and conjunctions)
Input: S, acPatterns (the set of regular expressions for the adverb and conjunction fixed collocation rules)
Output: weight(S)
if (acPatterns.match(S)) then
    weight(S) := -0.5;
else
    compute weight(S) with the other methods described in this chapter;
end if
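The matching step can be sketched with Python's re module as shown below. The Chinese literals in AC_PATTERNS are our own guesses at the keywords behind rule groups (2) and (3) and should be treated as assumptions, as should the fallback callback that stands in for the other analyses of this chapter.

```python
import re
from typing import Callable

# Zero or more CJK characters, the same range used in the rules.
CJK = r"[\u4E00-\u9FA5]*"

# Hypothetical literals standing in for the rule keywords.
AC_PATTERNS = [
    re.compile("最好" + CJK + "(?:再|更|有)" + CJK),  # assumed form of group (2)
    re.compile("就是" + CJK + "太" + CJK),            # assumed form of group (3)
]


def match_adv_conj_patterns(sentence: str, fallback: Callable[[str], float]) -> float:
    """Assign -0.5 directly when a fixed collocation matches; otherwise defer
    to the ordinary segment-based computation supplied as `fallback`."""
    if any(p.search(sentence) for p in AC_PATTERNS):
        return -0.5
    return fallback(sentence)


def pretend_other_analysis(sentence: str) -> float:
    """Placeholder for the remaining analyses; always returns 1.0 here."""
    return 1.0


matched = match_adv_conj_patterns("就是太贵", pretend_other_analysis)
unmatched = match_adv_conj_patterns("很好", pretend_other_analysis)
```

Using search rather than match lets a collocation appear anywhere in the sentence; whether the original system anchors its patterns at the start of S is not stated, so this is a design choice of the sketch.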
2) Fixed collocations of ambiguous sentiment words. When some sentiment words are combined with different words, their sentiment orientation differs: it may keep its original value, be negated, or become neutral. We call such sentiment words ambiguous sentiment words; see Example 6.6 and Example 6.7.
Example 6.6
Sentence 1: cost-performance/n very/d high/a ./wj
Sentence 2: price/n high/a /y ./wj
"High" is a positive sentiment word in the sentiment dictionary. In sentence 1, "high" is combined with "cost-performance", and the sentiment orientation value of the whole sentence takes the original weight of the sentiment word; in sentence 2, where it is combined with "price", it takes on a negative sentiment.
Example 6.7
Sentence 1: quite/d big/a /ude ./wj
Sentence 2: big/a /y point/qt ./wj
"Big" is a positive sentiment word in the sentiment dictionary. We found that when an adjective such as "big" is combined with "point" (meaning "a bit"), its original sentiment orientation is reversed.
Similarly to the handling of adverb and conjunction fixed collocations above, we define two groups of fixed collocation rules for ambiguous sentiment words: the first group negates the sentiment orientation value, and the second group resets it to zero. The sentiment orientation value of a segment is recalculated according to whichever rule applies. The first group of rules is denoted ambigNegPatterns and the second group ambigZeroPatterns; negWord denotes a negative sentiment word. The fixed collocation rules for ambiguous sentiment words are defined as follows.
ambigNegPatterns contains the following 5 rules:
((negWord + "rate") | ("valency" + ["lattice" | "position"])) + [\u4E00-\u9FA5]* + ["height" | "low" | "big" | "small"] + [\u4E00-\u9FA5]*
[\u4E00-\u9FA5]* + ["just" | "rear" | "again"] + [\u4E00-\u9FA5]* + "price reduction" + [\u4E00-\u9FA5]*
[\u4E00-\u9FA5]* + "price reduction" + [\u4E00-\u9FA5]* + "too fast" + [\u4E00-\u9FA5]*
[\u4E00-\u9FA5]* + "point" + [\u4E00-\u9FA5]*
"use" + [\u4E00-\u9FA5]* + "long"
ambigZeroPatterns contains the following 3 rules:
[\u4E00-\u9FA5]* + "temporary" + [\u4E00-\u9FA5]*
[\u4E00-\u9FA5]* + "also useless"
[\u4E00-\u9FA5]* + "not knowing" + [\u4E00-\u9FA5]* + "how"
The pseudocode of the ambiguous sentiment word analysis algorithm is as follows:
Algorithm: AnalyzeAmbigEmotionWord (analysis of ambiguous sentiment words)
Input: phrase, weight(phrase)
Output: weight(phrase)
if (ambigNegPatterns.match(phrase)) then
    weight(phrase) := -1 * weight(phrase);
else if (ambigZeroPatterns.match(phrase)) then
    weight(phrase) := 0;
end if
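The precedence between the two rule groups (negate first, otherwise reset to zero, otherwise leave unchanged) can be illustrated with the sketch below. Only one pattern per group is shown, and the Chinese literals are assumed stand-ins for the keywords, not the patent's confirmed rule text.

```python
import re

CJK = r"[\u4E00-\u9FA5]*"

# Assumed literals: "price" followed later by a size/height adjective.
AMBIG_NEG_PATTERNS = [re.compile("价(?:格|位)" + CJK + "(?:高|低|大|小)")]
# Assumed literal for the "temporary" reset rule.
AMBIG_ZERO_PATTERNS = [re.compile("暂时")]


def analyze_ambiguous(phrase: str, weight: float) -> float:
    """Negate the weight on a negating collocation, zero it on a resetting
    collocation, and otherwise return it unchanged."""
    if any(p.search(phrase) for p in AMBIG_NEG_PATTERNS):
        return -weight
    if any(p.search(phrase) for p in AMBIG_ZERO_PATTERNS):
        return 0.0
    return weight


price_high = analyze_ambiguous("价格高了", 1.0)
value_high = analyze_ambiguous("性价比很高", 1.0)
temporary = analyze_ambiguous("暂时不错", 1.0)
```

The second input contains the character for "price" inside the word for "cost-performance" but not the assumed "price" word itself, so it keeps its positive weight, mirroring sentence 1 of Example 6.6.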
3) Fixed collocations of reverse sentiment words. When some sentiment words are preceded by an adverb of strong intensity, that is, an adverb with a large weight, their sentiment orientation is reversed. We call such sentiment words reverse sentiment words.
Example 6.8 too/d big/a /y ./wj
When the sentiment word "big" follows an adverb with a large weight such as "too", its positive sentiment orientation is reversed.
To analyze such fixed collocations, we built the reverse sentiment dictionary shown in Table 6.6 to store the reverse sentiment words, and we assume that an adverb with a weight greater than 0.5 can reverse the sentiment value of a reverse sentiment word.
Table 6.6 Reverse sentiment dictionary
bright, big, easy, compact, light, white, simple, tight, thin, gentle, heavy, long, high
The processing of reverse sentiment word fixed collocations can be described as follows.
While scanning from the previous sentiment word to the sentiment word Wsi, the variable advWeight records the weight of each adverb, so that when the scan reaches Wsi, advWeight holds the weight of the adverb nearest to Wsi. First, check whether advWeight is greater than 0.5; then check whether a negation word occurs before Wsi. If no negation word occurs, the function isOppositeWord(Wsi) checks whether Wsi belongs to the reverse sentiment dictionary, and if it does, the sentiment value of phrase(Wsi-1, Wsi) is reversed. If a negation word does occur before Wsi, the sentiment value of phrase(Wsi-1, Wsi) is left unchanged, because in the combination "negation word + adverb + sentiment word" the sentiment word does not behave as a negative sentiment word. For example, in the short sentence "not too easy", the positive sentiment word "easy" is not turned into a negative sentiment word by the preceding adverb "too".
Let negNum(Wsi-1, Wsi) be the number of negation words contained in phrase(Wsi-1, Wsi). The processing algorithm for reverse sentiment word fixed collocations can be described in pseudocode as follows:
Algorithm: MatchOppositeEmotionPatterns (matching fixed collocations of reverse sentiment words)
Input: weight(phrase(Wsi-1, Wsi)), advWeight, negNum(Wsi-1, Wsi)
Output: weight(phrase(Wsi-1, Wsi))
if ((advWeight > 0.5) && (negNum(Wsi-1, Wsi) == 0) && (isOppositeWord(Wsi))) then
    weight(phrase(Wsi-1, Wsi)) := -1 * weight(phrase(Wsi-1, Wsi));
end if
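The three-way condition can be sketched in Python as below. The lexicon is a small English-gloss subset chosen for illustration, and the adverb weights passed in the usage lines are hypothetical; only the 0.5 threshold comes from the text.

```python
from typing import Set

# Illustrative English glosses of a few reverse sentiment words.
TOY_REVERSE_LEXICON: Set[str] = {"big", "easy", "long", "high"}


def match_opposite_emotion(
    weight: float,
    adv_weight: float,
    neg_num: int,
    sentiment_word: str,
    reverse_lexicon: Set[str] = TOY_REVERSE_LEXICON,
    threshold: float = 0.5,
) -> float:
    """Reverse the segment weight only when a strong adverb precedes a
    reverse sentiment word and no negation word is present."""
    if adv_weight > threshold and neg_num == 0 and sentiment_word in reverse_lexicon:
        return -weight
    return weight


too_big = match_opposite_emotion(1.0, 0.8, 0, "big")        # "too big"
not_too_easy = match_opposite_emotion(1.0, 0.8, 1, "easy")  # "not too easy"
quite_big = match_opposite_emotion(1.0, 0.3, 0, "big")      # weak adverb
```

Note that the weight passed in is the value before negation handling is considered by this rule; how the original system orders this step relative to the negation parity step is not spelled out, so the sketch treats it as an isolated function.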
4) Fixed collocations of negation words. We stipulate that this analysis is carried out only when the whole review S contains no sentiment word.
In the field of product reviews, we noticed that some review sentences contain no sentiment word at all, only negation words, as in Example 6.9. With the method described in Section 6.2.2, such a sentence would be identified as neutral. Yet the negative sentiment of the sentence is plain to the reader, and it is conveyed precisely by the negation words.
Example 6.9 electric fan/n all/d not have/d !/wt
This sentence contains two negation words ("not" and "not have"), yet its sentiment orientation is negative.
The solution to this problem is to summarize several fixed collocations of negation words into the rule set negPatterns.
When the scan reaches the end of S, first check whether S matches the negPatterns rules; if it does, the weight of S is set to -0.5. If it does not, the parity of the number of negation words is used to assign a sentiment orientation value to S.
At present, negPatterns contains only one rule; new rules can be added later.
["not" | "not having"] + [\u4E00-\u9FA5]* + " " + ["not having" | "None"] + [\u4E00-\u9FA5]*
The processing algorithm for negation-word fixed collocations can be described in pseudocode as follows:
Algorithm: MatchNegPatterns (matching fixed collocations of negation words)
Input: S, negNum(S)
Output: weight(S)
if (negPatterns.match(S) || (negNum(S) % 2 != 0)) then
    weight(S) := -0.5;
else
    weight(S) := 0.5;
end if
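A compact Python sketch of this decision follows. The rule set is passed in as a parameter because the literal keywords of negPatterns are not recoverable here; the single pattern used in the usage lines is a hypothetical stand-in.

```python
import re
from typing import List, Pattern


def match_neg_patterns(
    sentence: str, neg_num: int, neg_patterns: List[Pattern[str]]
) -> float:
    """For a review with no sentiment words: return -0.5 when a negation
    collocation matches or the negation count is odd, otherwise 0.5."""
    if any(p.search(sentence) for p in neg_patterns) or neg_num % 2 != 0:
        return -0.5
    return 0.5


# Hypothetical pattern for illustration only.
toy_patterns = [re.compile("都没有")]
rule_hit = match_neg_patterns("电风扇都没有", 2, toy_patterns)
odd_count = match_neg_patterns("不行", 1, toy_patterns)
even_count = match_neg_patterns("不是不行", 2, toy_patterns)
```

The first call shows why the rule set is needed: with an even negation count, parity alone would give 0.5, but the collocation match forces -0.5.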
Analysis of adversative words
An adversative word is a word that reverses the meaning of a sentence.
Example 6.11
than/p market/n /ude1 cheap/a ,/wd but/c sell/v after/f still/d not have/v market/n good/a ./wj
The first clause expresses a positive sentiment, but after the adversative word "but" appears, the sentiment of the sentence turns negative.
The adversative-word dictionary designed for this project is shown in Table 6.7.
Table 6.7 Adversative-word dictionary
A sentence containing an adversative word can be abstracted into the following structure:
phrase(Wsi-1, Wsi) + punctuation mark + adversative word + phrase(Wsi, Wsi+1)
Given the effect of an adversative word, the sentiment orientations of weight(phrase(Wsi-1, Wsi)) and weight(phrase(Wsi, Wsi+1)) should be opposite.
Adversative words are analyzed as follows. After the sentiment words, negation words and adverbs have been analyzed, scan forward from the current sentiment word Wsi toward the next sentiment word Wsi+1. If an adversative word is encountered during this scan, weight(phrase(Wsi-1, Wsi)) is negated, so that the sentiment orientation of phrase(Wsi-1, Wsi) leans toward that of phrase(Wsi, Wsi+1), the segment following the adversative word.
Let phrases denote the set of all segments in a short review sentence, weight(phrases) the sentiment orientation values of these segments, numPhrases the total number of segments, phrases[i] the i-th segment, and isTransitionWord(word) the function that determines whether the word word belongs to the adversative-word dictionary. The adversative-word analysis algorithm can be described in pseudocode as follows:
Algorithm: AnalyzeTransitionWord (adversative-word analysis)
Input: phrases, weight(phrases)
Output: weight(phrases)
for (i = 0; i < numPhrases - 1; i += 2)
    foreach word in phrases[i+1]
        if (isTransitionWord(word)) then
            weight(phrases[i]) := -1 * weight(phrases[i]);
            break;
        end if
    end for
end for
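The loop above translates to Python almost line for line. In this sketch the step of 2 is kept exactly as in the pseudocode, the segments are English glosses of Example 6.11, and the lexicon with the single entry "but" is an assumption.

```python
from typing import List, Set


def analyze_transition_words(
    phrases: List[List[str]], weights: List[float], transition_lexicon: Set[str]
) -> List[float]:
    """Negate the weight of segment i when segment i+1 contains an
    adversative word, visiting segment pairs with the pseudocode's step of 2."""
    result = list(weights)  # leave the caller's list untouched
    for i in range(0, len(phrases) - 1, 2):
        if any(word in transition_lexicon for word in phrases[i + 1]):
            result[i] = -result[i]
    return result


toy_phrases = [
    ["than", "market", "cheap"],
    [",", "but", "sell", "after", "still", "not have", "market", "good"],
]
flipped = analyze_transition_words(toy_phrases, [1.0, -1.0], {"but"})
untouched = analyze_transition_words(toy_phrases, [1.0, -1.0], {"however"})
```

After the flip both segments are negative, so the positive first clause no longer cancels the negative clause that follows the adversative word when the weights are summed.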
Analysis of exclamatory sentences and rhetorical questions
Exclamatory sentences and rhetorical questions are both sentence patterns that intensify the sentiment orientation of a sentence. An exclamatory sentence only intensifies it, whereas a rhetorical question can also reverse it.
For exclamatory sentences, we take the exclamation mark "!" as the marker of an exclamatory sentence and denote it exc. Its sentiment weight is computed as follows: when an exclamation mark is scanned, we search backward for the sentiment word Wsi-1 nearest to the exclamation mark and use the sentiment orientation value of Wsi-1 as the weight of exc.
Unlike traditional sentiment models, our model gives rhetorical questions no special treatment, because in the set of short product reviews most sentences that contain rhetorical question words do not carry a reversed meaning but instead express doubt about the product. In Example 6.12, for instance, the sentence contains a typical rhetorical-question marker, but the marker does not actually reverse the sentiment orientation.
Example 6.12
inside/f unexpectedly/d has/vyou 60/m more/m M/x /ude1 files/n ,/wd /d is/vshi second-hand goods/n /ww
Weighted calculation of the sentiment orientation value
At this point the sentiment orientation values of all segments contained in a review S have been calculated. Adding these values gives the sentiment orientation value weight(S) of S. The sentiment polarity of S is judged as follows: if weight(S) is greater than 0, the review is a positive evaluation; otherwise, S is considered a negative evaluation.
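The final aggregation can be sketched as follows, assuming the per-segment weights have already been produced by the preceding analyses. The function name and the label strings are our own.

```python
from typing import List, Tuple


def classify_review(segment_weights: List[float]) -> Tuple[float, str]:
    """Sum the segment weights and label the review: strictly positive sums
    are positive, everything else (including 0) is negative."""
    total = sum(segment_weights)
    return total, ("positive" if total > 0 else "negative")


mixed = classify_review([1.0, -0.5])
balanced = classify_review([1.0, -1.0])
```

A sum of exactly 0 falls into the negative class under the rule as stated, so the scheme is a two-way classifier with no neutral output.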

Claims (1)

1. A short-text sentiment orientation analysis method based on a sentiment dictionary, characterized by comprising the following steps:
Step 1, building the sentiment dictionary: a basic sentiment dictionary is built by a method based on word-frequency statistics; using the SO-PMI method, the statistical correlation between candidate words and the words in the basic sentiment dictionary is calculated to determine the sentiment orientation of the candidate words, thereby expanding the basic dictionary;
Step 2, building the sentiment analysis model: on the basis of the sentiment dictionary, taking each review sentence S as a unit and each sentiment word WS in the sentence as a separator, the sentiment weight of the segment phrase(WSi-1, WSi) between two separators is calculated, and the weights of the segments are then summed with weighting to obtain the overall sentiment orientation value weight(S) of each review sentence S; the sentiment polarity of each review sentence S is judged as follows: if weight(S) is greater than 0, the review is a positive evaluation; otherwise, the review sentence S is considered a negative evaluation, whereby the polarity of review sentences is classified; the segment phrase(WSi-1, WSi) includes the word WSi but does not include the word WSi-1;
The SO-PMI method comprises the following steps:
Step 2-1, obtaining the part of speech of each word after segmentation with the ICTCLAS system;
Step 2-2, calculating the SO-PMI values of the two kinds of candidate words word restricted by word.propertyal ∈ {a, ad, an, ag, al} and word.propertyal ∈ {vn, vd, vi, vg, vl}; candidate words of all other parts of speech are directly regarded as neutral words;
The SO-PMI values of the two kinds of candidate words are calculated specifically as follows:
The PMI value between the candidate word and the positive basic sentiment words is calculated, the PMI value between the candidate word and the negative basic sentiment words is calculated, and the two are subtracted to obtain the SO-PMI value of the candidate word; the SO-PMI formula is as follows:
posWords is the positive basic sentiment dictionary, negWords is the negative basic sentiment dictionary, and word is the candidate word;
The relation between the SO-PMI value and the sentiment orientation is given by the following formula:
Step 2-3, adding to posWords the synonyms of the positive basic sentiment words, together with the sentiment words that satisfy word.propertyal ∈ {a, ad, an, ag, al} or word.propertyal ∈ {vn, vd, vi, vg, vl} and are determined by formula 2 to have a positive orientation;
Step 2-4, adding to negWords the synonyms of the negative basic sentiment words, together with the sentiment words that satisfy word.propertyal ∈ {a, ad, an, ag, al} or word.propertyal ∈ {vn, vd, vi, vg, vl} and are determined by formula 2 to have a negative orientation, thereby obtaining a comprehensive sentiment-word comparison table.
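The SO-PMI computation described in step 2-2 can be illustrated with the sketch below. It is not the claimed formula itself: it assumes the common formulation in which PMI is estimated from document-level co-occurrence counts, the PMI values are summed over each seed list, and a pair that never co-occurs contributes 0. The toy corpus and seed words are invented.

```python
import math
from typing import Iterable, List, Set


def pmi(w1: str, w2: str, docs: List[Set[str]]) -> float:
    """Pointwise mutual information from document co-occurrence counts;
    returns 0.0 when the two words never co-occur (an assumption)."""
    n = len(docs)
    c1 = sum(1 for d in docs if w1 in d)
    c2 = sum(1 for d in docs if w2 in d)
    c12 = sum(1 for d in docs if w1 in d and w2 in d)
    if c12 == 0:
        return 0.0
    return math.log2(n * c12 / (c1 * c2))


def so_pmi(
    word: str, pos_words: Iterable[str], neg_words: Iterable[str], docs: List[Set[str]]
) -> float:
    """PMI with the positive seeds minus PMI with the negative seeds."""
    positive = sum(pmi(word, p, docs) for p in pos_words)
    negative = sum(pmi(word, q, docs) for q in neg_words)
    return positive - negative


toy_docs = [
    {"sturdy", "good"},
    {"sturdy", "good"},
    {"flimsy", "bad"},
    {"good"},
]
sturdy_score = so_pmi("sturdy", ["good"], ["bad"], toy_docs)
flimsy_score = so_pmi("flimsy", ["good"], ["bad"], toy_docs)
```

Under the usual reading, a positive score marks the candidate as leaning positive and a negative score as leaning negative; the thresholds actually used in formula 2 are not restated here.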
CN201510342473.4A 2015-06-19 2015-06-19 Short text Sentiment orientation analysis method based on sentiment dictionary Active CN105005553B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510342473.4A CN105005553B (en) 2015-06-19 2015-06-19 Short text Sentiment orientation analysis method based on sentiment dictionary

Publications (2)

Publication Number Publication Date
CN105005553A CN105005553A (en) 2015-10-28
CN105005553B true CN105005553B (en) 2017-11-21

Family

ID=54378229

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2083733A1 (en) * 1991-12-30 1993-07-01 Kenneth Ward Church Word disambiguation methods and apparatus
CN102866989A (en) * 2012-08-30 2013-01-09 北京航空航天大学 Viewpoint extracting method based on word dependence relationship
CN103544246A (en) * 2013-10-10 2014-01-29 清华大学 Method and system for constructing multi-emotion dictionary for internet
CN103955451A (en) * 2014-05-15 2014-07-30 北京优捷信达信息科技有限公司 Method for judging emotional tendentiousness of short text

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Measuring Praise and Criticism: Inference of Semantic Orientation from Association;Turney, P.等;《ACM Transactions on Information Systems》;20031031;第21卷(第4期);全文 *
中文句子情感倾向分析;郭叶;《中国优秀硕士学位论文全文数据库 信息科技辑》;20110315;第2011年卷(第03期);I138-1601 *
在线评论的情感极性分类研究综述;王洪伟等;《情报科学》;20120831;第30卷(第8期);全文 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant