CN105843796A - Microblog emotional tendency analysis method and device - Google Patents

Microblog emotional tendency analysis method and device

Info

Publication number
CN105843796A
Authority
CN
China
Prior art keywords
sentence
emotion value
emotion
word
subordinate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610181735.8A
Other languages
Chinese (zh)
Inventor
姚海鹏
方超
赵天奇
张俊东
张培颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201610181735.8A
Publication of CN105843796A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a microblog emotional tendency analysis method and device, applicable to electronic equipment. In the embodiment of the invention, the emotional value of a microblog content can be determined according to emotional values of expressions and emotional values of texts included in the microblog content simultaneously; and furthermore, when the emotional values of the texts are determined, sentence patterns of various compound sentences in the texts and inter-sentence relationships of various sub-sentences included in various compound sentences are considered simultaneously. Compared with the prior art, the microblog emotional tendency is determined by using more factors influencing the microblog emotional tendency; and thus, the microblog emotional tendency analysis accuracy can be improved.

Description

A microblog emotional tendency analysis method and device
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a microblog emotional tendency analysis method and device.
Background
With the development of the Internet, people have become increasingly accustomed to expressing their views online, for example through microblogs.
A microblog is a broadcast-style social network platform on which brief, real-time information is shared through a follow mechanism. After enabling the microblog service, a user can post, forward, and comment on messages to record daily life, share news, express opinions, and so on. Thanks to its openness, equality, and ease of use, the microblog attracted wide public attention as soon as it appeared. Microblog posts are numerous and updated quickly, and many of them express users' views on and attitudes toward particular events, so analyzing the emotional tendency of microblog content has important practical significance. For example, netizens' views on hot events are valuable to a government that wants to understand current public opinion, judge the public opinion situation, and make decisions; users' comments on commodities help merchants adjust their marketing strategies and help buyers choose commodities.
In the prior art, microblog emotional tendency analysis is mainly performed on the basis of semantic rules. A semantic-rule approach counts the emotion values of the emotion words in the microblog text, together with the degree adverbs and negative adverbs that accompany them, and obtains the emotion value of a sentence or text by averaging or by some other calculation. In practice, however, many factors influence the emotional tendency of microblog content, so emotion words and their accompanying degree adverbs and negative adverbs alone cannot accurately determine the emotional tendency of microblog content.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a microblog emotional tendency analysis method and device that improve the accuracy of microblog emotional tendency analysis. The specific technical solution is as follows:
In a first aspect, an embodiment of the present invention provides a microblog emotional tendency analysis method, applied to an electronic device, the method comprising:
For microblog content to be analyzed, extracting the expression set included in the microblog content, and determining the text corresponding to the microblog content;
For each expression in the expression set, obtaining the emotion value corresponding to the expression from a pre-built expression database, and calculating the emotion value of the expression set from the emotion values corresponding to the expressions;
Dividing the text into at least one compound sentence according to preset punctuation marks, and determining the sentence pattern coefficient of each compound sentence according to its sentence pattern;
For each compound sentence, extracting the sub-sentences it includes, and determining the inter-sentence relation coefficient of each sub-sentence according to the inter-sentence relation between that sub-sentence and the other sub-sentences;
For each sub-sentence, performing word segmentation on the sub-sentence to obtain the segmented words it includes, and identifying the emotion words among them; determining the emotion value of each emotion word according to a pre-built dictionary;
Calculating the emotion value of the text from the emotion values of the emotion words, the inter-sentence relation coefficients of the sub-sentences, and the sentence pattern coefficients of the compound sentences;
Calculating the emotion value of the microblog content from the emotion value of the expression set and the emotion value of the text.
Further, calculating the emotion value of the expression set from the emotion values corresponding to the expressions comprises:
Calculating the average of the emotion values corresponding to all the expressions, and taking the average as the emotion value of the expression set.
Further, after performing word segmentation on each sub-sentence to obtain the segmented words it includes, the method also comprises:
Performing part-of-speech tagging on each segmented word;
After identifying the emotion words among the segmented words, the method also comprises:
Identifying the degree adverbs and negative adverbs that precede each emotion word;
Determining the emotion value of each emotion word according to the pre-built dictionary comprises:
Determining the corrected emotion value of each emotion word according to the pre-built dictionary and the degree adverbs and negative adverbs that precede the emotion word.
Further, calculating the emotion value of the text from the emotion values of the emotion words, the inter-sentence relation coefficients of the sub-sentences, and the sentence pattern coefficients of the compound sentences comprises:
For each sub-sentence, summing the determined emotion values of the emotion words it includes, and taking the sum as the word emotion value of the sub-sentence;
Multiplying the word emotion value of each sub-sentence by the inter-sentence relation coefficient of that sub-sentence, and taking the product as the emotion value of the sub-sentence;
For each compound sentence, summing the emotion values of the sub-sentences it includes, multiplying the sum by the sentence pattern coefficient of that compound sentence, and taking the product as the emotion value of the compound sentence;
Summing the emotion values of the compound sentences, and taking the sum as the emotion value of the text.
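The four calculation steps above amount to a nested sum of products. The following Python sketch illustrates them; the class and function names are hypothetical and not taken from the patent, and the word emotion values and both kinds of coefficient are assumed to have been determined already by the earlier steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubSentence:
    word_values: List[float]  # (corrected) emotion values of its emotion words
    relation_coef: float      # inter-sentence relation coefficient


@dataclass
class CompoundSentence:
    sub_sentences: List[SubSentence]
    pattern_coef: float       # sentence pattern coefficient


def sub_sentence_value(sub: SubSentence) -> float:
    # Word emotion value (sum over emotion words) times the relation coefficient.
    return sum(sub.word_values) * sub.relation_coef


def compound_sentence_value(sentence: CompoundSentence) -> float:
    # Sum over sub-sentences, then scale by the sentence pattern coefficient.
    total = sum(sub_sentence_value(s) for s in sentence.sub_sentences)
    return total * sentence.pattern_coef


def text_value(sentences: List[CompoundSentence]) -> float:
    # The text emotion value is the sum over all compound sentences.
    return sum(compound_sentence_value(s) for s in sentences)
```

For instance, a compound sentence with pattern coefficient 2 whose two sub-sentences have word emotion values 0.5 and 0.75 and relation coefficients 0 and 1 contributes (0.5 × 0 + 0.75 × 1) × 2 = 1.5 to the text emotion value.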
Further, calculating the emotion value of the microblog content from the emotion value of the expression set and the emotion value of the text comprises:
Multiplying the emotion value of the expression set by a first preset weight, multiplying the emotion value of the text by a second preset weight, and adding the two products to obtain the emotion value of the microblog content.
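The weighted combination can be written as a single expression. In this sketch the function name and parameter names are hypothetical; the patent text here does not give concrete values for the two preset weights, so they are left as required arguments rather than defaults.

```python
def microblog_emotion_value(expression_value: float,
                            text_value: float,
                            expression_weight: float,
                            text_weight: float) -> float:
    # First preset weight applies to the expression set,
    # second preset weight applies to the text.
    return expression_weight * expression_value + text_weight * text_value
```

With illustrative weights of 0.5 each (an assumption, not a value from the patent), an expression-set value of 1.0 and a text value of -0.5 combine to 0.25.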
In a second aspect, an embodiment of the present invention provides a microblog emotional tendency analysis device, applied to an electronic device, the device comprising:
An extraction module, configured to, for microblog content to be analyzed, extract the expression set included in the microblog content and determine the text corresponding to the microblog content;
A first calculation module, configured to, for each expression in the expression set, obtain the emotion value corresponding to the expression from a pre-built expression database, and calculate the emotion value of the expression set from the emotion values corresponding to the expressions;
A first determination module, configured to divide the text into at least one compound sentence according to preset punctuation marks, and determine the sentence pattern coefficient of each compound sentence according to its sentence pattern;
A second determination module, configured to, for each compound sentence, extract the sub-sentences it includes, and determine the inter-sentence relation coefficient of each sub-sentence according to the inter-sentence relation between that sub-sentence and the other sub-sentences;
A third determination module, configured to, for each sub-sentence, perform word segmentation on the sub-sentence to obtain the segmented words it includes, identify the emotion words among them, and determine the emotion value of each emotion word according to a pre-built dictionary;
A second calculation module, configured to calculate the emotion value of the text from the emotion values of the emotion words, the inter-sentence relation coefficients of the sub-sentences, and the sentence pattern coefficients of the compound sentences;
A third calculation module, configured to calculate the emotion value of the microblog content from the emotion value of the expression set and the emotion value of the text.
Further, the first calculation module is specifically configured to:
Calculate the average of the emotion values corresponding to all the expressions, and take the average as the emotion value of the expression set.
Further, the device also comprises:
A processing module, configured to perform part-of-speech tagging on each segmented word;
An identification module, configured to identify the degree adverbs and negative adverbs that precede each emotion word;
The third determination module is further configured to determine the corrected emotion value of each emotion word according to the pre-built dictionary and the degree adverbs and negative adverbs that precede the emotion word.
Further, the second calculation module comprises:
A first calculation submodule, configured to, for each sub-sentence, sum the determined emotion values of the emotion words it includes, and take the sum as the word emotion value of the sub-sentence;
A second calculation submodule, configured to multiply the word emotion value of each sub-sentence by the inter-sentence relation coefficient of that sub-sentence, and take the product as the emotion value of the sub-sentence;
A third calculation submodule, configured to, for each compound sentence, sum the emotion values of the sub-sentences it includes, multiply the sum by the sentence pattern coefficient of that compound sentence, and take the product as the emotion value of the compound sentence;
A fourth calculation submodule, configured to sum the emotion values of the compound sentences, and take the sum as the emotion value of the text.
Further, the third calculation module is specifically configured to:
Multiply the emotion value of the expression set by a first preset weight, multiply the emotion value of the text by a second preset weight, and add the two products to obtain the emotion value of the microblog content.
The embodiments of the present invention provide a microblog emotional tendency analysis method and device. The method comprises: for microblog content to be analyzed, extracting the expression set included in the microblog content, and determining the text corresponding to the microblog content; for each expression in the expression set, obtaining the emotion value corresponding to the expression from a pre-built expression database, and calculating the emotion value of the expression set from the emotion values corresponding to the expressions; dividing the text into at least one compound sentence according to preset punctuation marks, and determining the sentence pattern coefficient of each compound sentence according to its sentence pattern; for each compound sentence, extracting the sub-sentences it includes, and determining the inter-sentence relation coefficient of each sub-sentence according to the inter-sentence relation between that sub-sentence and the other sub-sentences; for each sub-sentence, performing word segmentation to obtain the segmented words it includes, identifying the emotion words among them, and determining the emotion value of each emotion word according to a pre-built dictionary; calculating the emotion value of the text from the emotion values of the emotion words, the inter-sentence relation coefficients of the sub-sentences, and the sentence pattern coefficients of the compound sentences; and calculating the emotion value of the microblog content from the emotion value of the expression set and the emotion value of the text. The embodiments of the present invention determine the emotion value of microblog content from both the emotion values of the expressions it includes and the emotion value of its text. Moreover, when determining the emotion value of the text, they take into account both the sentence pattern of each compound sentence in the text and the inter-sentence relations of the sub-sentences that each compound sentence includes. Compared with the prior art, more of the factors that influence microblog emotional tendency are used to determine it, so the accuracy of microblog emotional tendency analysis can be improved.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for that description are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a microblog emotional tendency analysis method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of commonly used microblog expressions in picture form;
Fig. 3 is a schematic structural diagram of a microblog emotional tendency analysis device provided by an embodiment of the present invention.
Detailed description of the embodiments
To improve the accuracy of microblog emotional tendency analysis, the embodiments of the present invention provide a microblog emotional tendency analysis method and device.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that, provided they do not conflict, the embodiments of the present invention and the features in them may be combined with one another. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.
To improve the accuracy of microblog emotional tendency analysis, an embodiment of the present invention provides a microblog emotional tendency analysis process which, as shown in Fig. 1, comprises the following steps:
S101, for microblog content to be analyzed, extract the expression set included in the microblog content, and determine the text corresponding to the microblog content.
The method provided by the embodiments of the present invention can be applied to an electronic device, for example a notebook computer, an intelligent terminal, a desktop computer, or a portable computer.
In the embodiments of the present invention, the electronic device first obtains the microblog content to be analyzed. For example, the electronic device can use a crawler to fetch a whole piece of microblog content from the network, such as any post on Sina Weibo, and take it as the microblog content to be analyzed. Alternatively, to make the analysis more efficient, the electronic device can crawl at least one piece of microblog content in advance, save it in a database, and read the microblog content to be analyzed directly from the database when the analysis is performed. Crawling microblog content can use the prior art and is not described further here.
It will be understood that, in some cases, the microblog content a user posts includes not only text but also expressions (emoticons), which may be in character form or in picture form. Expressions usually convey the user's emotion well and thus indicate the emotional tendency of the microblog content.
Fig. 2 shows commonly used microblog expressions in picture form, where expression 210 is "smile" and expression 220 is "extremely".
Therefore, to improve the accuracy of the analysis, in the embodiments of the present invention the electronic device can analyze the emotional tendency of microblog content from both its expressions and its text.
Specifically, after obtaining the microblog content to be analyzed, the electronic device first extracts the expressions it includes to obtain an expression set containing each of them, and determines the text corresponding to the microblog content. For example, the electronic device can extract every expression included in the microblog content, take the set of extracted expressions as the expression set, and take the content other than the expression set as the text corresponding to the microblog content.
For example, when the electronic device crawls a whole piece of microblog content directly from the network, the expressions in it may be in picture format. In that case the electronic device can identify the pictures included in the microblog content and take the identified pictures as its expression set. When the electronic device reads the microblog content from the database, the expressions in the saved content are usually shown as expression words in a predefined format: expression 210 in Fig. 2 can be stored as [smile] and expression 220 as [extremely]. In that case the electronic device can extract the expression words enclosed in the symbols "[ ]" and take them as the expression set of the microblog content.
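For the database case, separating bracketed expression words from the remaining text can be sketched with a regular expression. This is only an illustration: the function name is hypothetical, and it assumes every bracketed token is an expression word, which the patent does not state.

```python
import re
from typing import List, Tuple

# Assumption: expression words are stored as [word], as in the [smile] example.
EXPRESSION_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_expressions_and_text(content: str) -> Tuple[List[str], str]:
    # Collect the words inside each pair of square brackets.
    expressions = EXPRESSION_PATTERN.findall(content)
    # Whatever remains after removing the bracketed tokens is the text.
    text = EXPRESSION_PATTERN.sub("", content)
    return expressions, text
```

A real system would probably also check each extracted word against the expression database, since users can type square brackets for other purposes.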
S102, for each expression in the expression set, obtain the emotion value corresponding to the expression from a pre-built expression database, and calculate the emotion value of the expression set from the emotion values corresponding to the expressions.
After extracting the expression set included in the microblog content to be analyzed, the electronic device can, for each expression in the set, obtain the corresponding emotion value from the pre-built expression database and then calculate the emotion value of the expression set from those values.
For example, when the expressions the electronic device has obtained are in picture format, it can first convert each expression into its corresponding expression word before obtaining emotion values, and then look up the emotion value of each expression word in the expression database.
In the embodiments of the present invention, the expression database can be built in advance; it contains expression words and their corresponding emotion values.
For example, the expression database built in an embodiment of the present invention can be as shown in the following table:
As the table shows, the emotion value of the expression word "smile" can be 1.0, that of "cursing in rage" can be -0.9, and that of "sad" can be -1. The sign of an expression word's emotion value indicates its emotional tendency: a positive value means the tendency is positive, a negative value means it is negative, and 0 means it is neutral. The larger the magnitude of an emotion value, the stronger the emotional tendency of the expression word. In the embodiments of the present invention, the emotion value of each expression can be set between -1 and 1.
When calculating the emotion value of the expression set included in the microblog content to be analyzed, the electronic device can look up in the expression database the emotion value corresponding to each expression in the set.
It should be noted that, in the embodiments of the present invention, when the expressions the electronic device has obtained are in picture format, it can first convert each expression into its corresponding expression word before obtaining emotion values, and then look up the emotion value of each expression word in the expression database described above.
After obtaining the emotion value corresponding to each expression in the expression set, the electronic device can then calculate the emotion value of the expression set from those values.
For example, in one implementation, the electronic device can calculate the average of the emotion values corresponding to all the expressions and take the calculated average as the emotion value of the expression set.
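The lookup and averaging can be sketched as follows. The three dictionary entries are the example values quoted in the text; the dictionary and function names are hypothetical, and the handling of unknown expression words and of an empty set (skipped, and treated as neutral 0.0) is an assumption, since the patent does not specify it here.

```python
from typing import Iterable, Mapping

# Example entries quoted in the description; a real database would hold many more.
EXPRESSION_DB = {"smile": 1.0, "cursing in rage": -0.9, "sad": -1.0}


def expression_set_value(expressions: Iterable[str],
                         db: Mapping[str, float] = EXPRESSION_DB) -> float:
    # Assumption: expression words missing from the database are skipped.
    values = [db[word] for word in expressions if word in db]
    if not values:
        # Assumption: no usable expressions means a neutral contribution.
        return 0.0
    return sum(values) / len(values)
```

Averaging keeps the result within the same -1 to 1 range as the individual expression values, however many expressions a post contains.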
S103, divide the text into at least one compound sentence according to preset punctuation marks, and determine the sentence pattern coefficient of each compound sentence according to its sentence pattern.
In the embodiments of the present invention, the electronic device also uses the emotion value of the text to determine the emotion value of the microblog content.
Specifically, the electronic device first divides the text into at least one compound sentence according to preset punctuation marks such as the full stop, the question mark, and the exclamation mark, and then determines the sentence pattern coefficient of each compound sentence according to its sentence pattern. Dividing text into compound sentences can use the prior art and is not described further here.
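One simple way to perform this split is a regular expression that keeps each terminal punctuation mark attached to its sentence, so the mark is still available when the sentence pattern is determined. The sketch below is an illustration only; the function name and the exact punctuation set (full-width and ASCII full stop, question mark, exclamation mark) are assumptions.

```python
import re
from typing import List

# A run of non-terminator characters followed by any terminators that close it.
_SENTENCE_PATTERN = re.compile(r"[^。？！.?!]+[。？！.?!]*")


def split_compound_sentences(text: str) -> List[str]:
    sentences = []
    for match in _SENTENCE_PATTERN.finditer(text):
        sentence = match.group().strip()
        if sentence:  # drop fragments that are only whitespace
            sentences.append(sentence)
    return sentences
```

Text with no terminal punctuation at all is returned as a single compound sentence.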
After obtaining the compound sentences, the electronic device determines the sentence pattern coefficient of each one according to its sentence pattern. The sentence patterns in the embodiments of the present invention can include declarative sentences, exclamatory sentences, interrogative sentences, and rhetorical questions. Specifically, the electronic device can determine the sentence pattern of each compound sentence from the punctuation marks it includes, predetermined keywords, and so on.
For example, when the punctuation mark in a compound sentence is a full stop, the electronic device can determine that the sentence is a declarative sentence; when the punctuation mark is an exclamation mark, that it is an exclamatory sentence; when the punctuation mark is a question mark and the sentence contains no rhetorical words such as "no", that it is an interrogative sentence; and when the punctuation mark is a question mark and the sentence does contain such rhetorical words, that it is a rhetorical question.
Further, in the embodiments of the present invention, the sentence pattern coefficient can be 1 for a declarative sentence, 2 for an exclamatory sentence, 0 for an interrogative sentence, and -1.5 for a rhetorical question.
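The classification rules and coefficients above can be sketched as a small lookup. The coefficient values are the ones given in the text; the function names are hypothetical, the rhetorical keyword list is a placeholder assumption (the patent's original keywords did not survive translation), and treating a sentence without terminal punctuation as declarative is also an assumption.

```python
from typing import Sequence

PATTERN_COEFFICIENTS = {
    "declarative": 1.0,
    "exclamatory": 2.0,
    "interrogative": 0.0,
    "rhetorical": -1.5,
}

# Hypothetical keyword list, for illustration only.
RHETORICAL_WORDS = ("难道", "怎么能")


def sentence_pattern(sentence: str,
                     rhetorical_words: Sequence[str] = RHETORICAL_WORDS) -> str:
    s = sentence.rstrip()
    if s.endswith(("?", "？")):
        # A question containing a rhetorical keyword is a rhetorical question.
        if any(word in s for word in rhetorical_words):
            return "rhetorical"
        return "interrogative"
    if s.endswith(("!", "！")):
        return "exclamatory"
    # Full stop, or (by assumption) no terminal punctuation at all.
    return "declarative"


def sentence_pattern_coefficient(sentence: str) -> float:
    return PATTERN_COEFFICIENTS[sentence_pattern(sentence)]
```

Note the effect of the coefficients: an ordinary question contributes nothing to the text emotion value, while a rhetorical question flips the sign of its content and amplifies it.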
S104, for each compound sentence, extract the sub-sentences it includes, and determine the inter-sentence relation coefficient of each sub-sentence according to the inter-sentence relation between that sub-sentence and the other sub-sentences.
After determining the sentence pattern and the corresponding sentence pattern coefficient of each compound sentence, the electronic device can, for each compound sentence, extract the sub-sentences it includes and determine the inter-sentence relation coefficient of each sub-sentence according to the inter-sentence relation between that sub-sentence and the other sub-sentences. Extracting the sub-sentences can use the prior art; for example, the electronic device can split each compound sentence at the commas it includes. This process is not described further here.
After obtaining the sub-sentences of each compound sentence, the electronic device can, for each compound sentence, determine the inter-sentence relation coefficient of each sub-sentence according to the inter-sentence relation between that sub-sentence and the other sub-sentences in the same compound sentence.
It will be understood that the sub-sentences of a compound sentence may be linked by connective relations such as adversative, progressive, and hypothetical relations, and that sub-sentences linked by different relations may express different emotional tendencies.
Therefore, in the embodiments of the present invention, the electronic device can, for each compound sentence, determine the inter-sentence relation coefficient of each sub-sentence according to the inter-sentence relation between that sub-sentence and the other sub-sentences in the compound sentence.
Specifically, for each compound sentence, the electronic device can check whether each of its sub-sentences contains predetermined keywords and thereby determine the inter-sentence relation between that sub-sentence and the other sub-sentences.
Such as, when subordinate sentence comprises as " but ", " but ", " ", " but " etc. represent the relation of turnover During word, it may be determined that between the sentence of this subordinate sentence and the subordinate sentence before it, relation is turning relation.
Being appreciated that in the subordinate sentence that there is turning relation, the most above subordinate sentence proposes certain true or situation, Below subordinate sentence then state the meaning contrary or relative with above subordinate sentence, subordinate sentence is only speaker and is wanted i.e. below The real intention expressed.Therefore, in embodiments of the present invention, between the sentence when between several subordinate sentences, relation is turnover Time, between the sentence of each subordinate sentence, coefficient of relationship can be: the subordinate sentence before adversative is 0, adversative below point Sentence is 1.
When subordinate sentence comprises as " more ", " What is more " etc. represent go forward one by one relational word time, it may be determined that Between the sentence of this subordinate sentence and the subordinate sentence before it, relation is progressive relationship.Between the sentence when between several subordinate sentences, relation is for passing When entering, between the sentence of each subordinate sentence, coefficient of relationship can be: the subordinate sentence before the word that goes forward one by one is 1, goes forward one by one word below Subordinate sentence is 1.5.
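The keyword rules above can be sketched as a small function that assigns one coefficient per subordinate sentence. Several points are assumptions of this sketch and are not stated in the description: the English keyword lists stand in for the original Chinese keywords, a subordinate sentence with no detected relation defaults to 1, and an adversative word zeroes every earlier subordinate sentence in the same complex sentence.

```python
# Hypothetical English stand-ins for the predetermined keywords.
ADVERSATIVE_WORDS = ("but",)
PROGRESSIVE_WORDS = ("more", "what is more")


def _contains_any(sentence: str, words: tuple) -> bool:
    """Whole-word (or whole-phrase) match on a space-separated sentence."""
    padded = f" {sentence.lower()} "
    return any(f" {word} " in padded for word in words)


def relation_coefficients(subordinate_sentences: list) -> list:
    """Return one inter-sentence relation coefficient per subordinate sentence."""
    # Assumption: no detected relation means a neutral coefficient of 1.
    coefficients = [1.0] * len(subordinate_sentences)
    for i, sentence in enumerate(subordinate_sentences):
        if _contains_any(sentence, ADVERSATIVE_WORDS):
            # Before the adversative word: 0; the sentence containing it: 1.
            for k in range(i):
                coefficients[k] = 0.0
            coefficients[i] = 1.0
        elif _contains_any(sentence, PROGRESSIVE_WORDS):
            # Before the progressive word: unchanged; the sentence containing it: 1.5.
            coefficients[i] = 1.5
    return coefficients
```

For instance, `relation_coefficients(["the screen is nice", "but the battery is bad"])` gives 0 for the first subordinate sentence and 1 for the second, so only the part after "but" contributes to the emotion value.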
S105: for each subordinate sentence, perform word segmentation on the subordinate sentence to obtain the segmented words it includes, and identify the emotion words among the segmented words; determine the emotion value of each emotion word according to a pre-built dictionary.
After obtaining the inter-sentence relation coefficient of each subordinate sentence, the electronic device may further, for each subordinate sentence, perform word segmentation on the subordinate sentence to obtain the segmented words it includes, identify the emotion words among the segmented words, and then determine the emotion value of each emotion word according to the pre-built dictionary.
In embodiments of the present invention, the processes of segmenting each subordinate sentence, identifying the emotion words among the segmented words, and determining the emotion value of each emotion word according to the pre-built dictionary may use the prior art, and the embodiments of the present invention do not describe them further.
Optionally, in embodiments of the present invention, to improve the accuracy of the emotion value determined for each emotion word, the electronic device may, after performing word segmentation on each subordinate sentence and obtaining its segmented words, also perform part-of-speech tagging on each segmented word. Further, after identifying the emotion words among the segmented words, the electronic device may also identify the degree adverbs and negative adverbs preceding each emotion word. Further, when determining the emotion value of each emotion word, the electronic device may determine a corrected emotion value of each emotion word according to the pre-built dictionary and the degree adverbs and negative adverbs preceding that emotion word.
In embodiments of the present invention, the processes of performing part-of-speech tagging on each subordinate sentence, identifying the degree adverbs and negative adverbs preceding each emotion word, and determining the corrected emotion value of each emotion word according to the pre-built dictionary and those adverbs may use the prior art, and the embodiments of the present invention do not describe them further.
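The description leaves the correction rule to the prior art, so the following sketch shows only one common lexicon-based scheme and should not be read as the rule used by the invention. Everything in it is an assumption: the tiny English dictionary, the adverb weights, the two-word look-back window, and the choice to multiply by a degree weight and flip the sign on negation. Part-of-speech tagging is replaced here by plain dictionary lookup.

```python
# Hypothetical resources; a real system would load pre-built dictionaries.
EMOTION_DICTIONARY = {"good": 1.0, "bad": -1.0}
DEGREE_ADVERBS = {"very": 1.5, "slightly": 0.5}
NEGATIVE_ADVERBS = {"not"}


def corrected_emotion_values(words: list, window: int = 2) -> list:
    """Return the corrected emotion value of each emotion word, in order.

    For every emotion word, the `window` words before it are inspected:
    a degree adverb scales the value and a negative adverb flips its sign.
    """
    values = []
    for i, word in enumerate(words):
        if word not in EMOTION_DICTIONARY:
            continue
        value = EMOTION_DICTIONARY[word]
        for previous in words[max(0, i - window):i]:
            if previous in DEGREE_ADVERBS:
                value *= DEGREE_ADVERBS[previous]
            elif previous in NEGATIVE_ADVERBS:
                value = -value
        values.append(value)
    return values
```

Under these assumed weights, the segmented words `["not", "very", "good"]` yield a single corrected value of -1.5, while words without any emotion word yield an empty list.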
S106: calculate the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence.
In embodiments of the present invention, the electronic device may calculate the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence.
Specifically, the electronic device may first, according to the determined emotion value of each emotion word in each subordinate sentence, calculate the sum of the emotion values of the emotion words that each subordinate sentence includes, as the word emotion value of that subordinate sentence. Then, it may calculate the product of the word emotion value of each subordinate sentence and the inter-sentence relation coefficient of that subordinate sentence, as the emotion value of that subordinate sentence. Next, it may calculate the product of the sum of the emotion values of the subordinate sentences that each complex sentence includes and the sentence pattern coefficient of that complex sentence, as the emotion value of that complex sentence. Finally, it may calculate the sum of the emotion values of the complex sentences, as the emotion value of the text.
For example, the electronic device may first calculate the emotion value $E(s_i)$ of subordinate sentence $i$ according to the following formula:
$$E(s_i) = \sum E(W_i) \times r_i$$
where $E(W_i)$ is the emotion value of each emotion word that subordinate sentence $i$ includes; $\sum E(W_i)$ is the word emotion value of subordinate sentence $i$; and $r_i$ is the inter-sentence relation coefficient of subordinate sentence $i$.
Further, the electronic device may calculate the emotion value $E(S_j)$ of complex sentence $j$ according to the following formula:
$$E(S_j) = \sum_{j=1}^{n} E(s_j) \times T_j$$
where $E(s_j)$ is the emotion value of each subordinate sentence that complex sentence $j$ includes, and $T_j$ is the sentence pattern coefficient of complex sentence $j$.
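The calculation in step S106 can be sketched as three nested sums. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, and the emotion values and coefficients are assumed to have been determined already by the preceding steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubordinateSentence:
    word_values: List[float]    # emotion value of each emotion word
    relation_coefficient: float  # inter-sentence relation coefficient r


@dataclass
class ComplexSentence:
    subordinate_sentences: List[SubordinateSentence]
    pattern_coefficient: float   # sentence pattern coefficient T


def subordinate_emotion(sentence: SubordinateSentence) -> float:
    """Word emotion value (sum over emotion words) times r."""
    return sum(sentence.word_values) * sentence.relation_coefficient


def complex_emotion(sentence: ComplexSentence) -> float:
    """Sum of subordinate sentence emotion values times T."""
    total = sum(subordinate_emotion(s) for s in sentence.subordinate_sentences)
    return total * sentence.pattern_coefficient


def text_emotion(sentences: List[ComplexSentence]) -> float:
    """Sum of the emotion values of all complex sentences in the text."""
    return sum(complex_emotion(s) for s in sentences)
```

As a usage illustration with made-up values: a declarative complex sentence (coefficient 1) whose first subordinate sentence has word values [1.0] and relation coefficient 0, and whose second has word values [-1.0, -0.5] and relation coefficient 1, evaluates to -1.5; an exclamatory complex sentence (coefficient 2) with one subordinate sentence of word values [2.0] and relation coefficient 1 evaluates to 4.0; the text emotion value is their sum.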
S107: calculate the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text.
After obtaining the emotion value of the expression set and the emotion value of the text that the microblog content to be analyzed includes, the electronic device may calculate the emotion value of the microblog content to be analyzed according to these two values.
Specifically, the electronic device may multiply the emotion value of the expression set by a predetermined first weight, multiply the emotion value of the text by a predetermined second weight, and add the two results to obtain the emotion value of the microblog content.
For example, the first weight may be 0.4 and the second weight may be 0.6; or the first weight may be 0.35 and the second weight may be 0.65; and so on.
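The weighted combination in step S107 is a single expression. In the sketch below the function and parameter names are hypothetical; the default weights 0.4 and 0.6 are the first pair of example values given in the description.

```python
def microblog_emotion(
    expression_set_value: float,
    text_value: float,
    first_weight: float = 0.4,   # weight of the expression set emotion value
    second_weight: float = 0.6,  # weight of the text emotion value
) -> float:
    """Weighted sum of the expression set and text emotion values."""
    return first_weight * expression_set_value + second_weight * text_value
```

Passing `first_weight=0.35, second_weight=0.65` reproduces the second example pair. In both examples the weights sum to 1, although the description does not state this as a requirement.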
The microblog emotional tendency analysis method provided by the embodiments of the present invention determines the emotion value of the microblog content from both the emotion value of the expressions included in the microblog content and the emotion value of the text. Moreover, when determining the emotion value of the text, it takes into account both the sentence pattern of each complex sentence in the text and the inter-sentence relations among the subordinate sentences that each complex sentence includes. Compared with the prior art, it uses more of the factors that affect microblog emotional tendency to determine the emotional tendency of a microblog, and can therefore improve the accuracy of microblog emotional tendency analysis.
Corresponding to the above method embodiment, the embodiments of the present invention further provide a corresponding device embodiment.
Fig. 3 shows a microblog emotional tendency analysis device provided by an embodiment of the present invention. The device is applied to an electronic device and includes:
an extraction module 310, configured to, for microblog content to be analyzed, extract the expression set that the microblog content includes, and determine the text corresponding to the microblog content;
a first computing module 320, configured to, for each expression in the expression set, obtain the emotion value corresponding to the expression according to a pre-built expression database, and calculate the emotion value of the expression set according to the emotion values corresponding to the expressions;
a first determining module 330, configured to divide the text into at least one complex sentence according to preset punctuation marks, and determine the sentence pattern coefficient of each complex sentence according to the sentence pattern of that complex sentence;
a second determining module 340, configured to, for each complex sentence, extract each subordinate sentence that the complex sentence includes, and determine the inter-sentence relation coefficient of each subordinate sentence according to the inter-sentence relation between that subordinate sentence and the other subordinate sentences;
a third determining module 350, configured to, for each subordinate sentence, perform word segmentation on the subordinate sentence to obtain the segmented words it includes, identify the emotion words among the segmented words, and determine the emotion value of each emotion word according to a pre-built dictionary;
a second computing module 360, configured to calculate the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence;
a third computing module 370, configured to calculate the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text.
The microblog emotional tendency analysis device provided by the embodiments of the present invention determines the emotion value of the microblog content from both the emotion value of the expressions included in the microblog content and the emotion value of the text. Moreover, when determining the emotion value of the text, it takes into account both the sentence pattern of each complex sentence in the text and the inter-sentence relations among the subordinate sentences that each complex sentence includes. Compared with the prior art, it uses more of the factors that affect microblog emotional tendency to determine the emotional tendency of a microblog, and can therefore improve the accuracy of microblog emotional tendency analysis.
Further, the first computing module 320 is specifically configured to:
calculate the average of the emotion values corresponding to all of the expressions, and use the average as the emotion value of the expression set.
Further, the device also includes:
a processing module (not shown), configured to perform part-of-speech tagging on each segmented word;
an identification module (not shown), configured to identify the degree adverbs and negative adverbs preceding each emotion word;
the third determining module 350 is further configured to determine a corrected emotion value of each emotion word according to the pre-built dictionary and the degree adverbs and negative adverbs preceding that emotion word.
Further, the second computing module 360 includes:
a first calculating sub-module (not shown), configured to calculate, according to the determined emotion value of each emotion word in each subordinate sentence, the sum of the emotion values of the emotion words that each subordinate sentence includes, as the word emotion value of that subordinate sentence;
a second calculating sub-module (not shown), configured to calculate the product of the word emotion value of each subordinate sentence and the inter-sentence relation coefficient of that subordinate sentence, as the emotion value of that subordinate sentence;
a third calculating sub-module (not shown), configured to calculate the product of the sum of the emotion values of the subordinate sentences that each complex sentence includes and the sentence pattern coefficient of that complex sentence, as the emotion value of that complex sentence;
a fourth calculating sub-module (not shown), configured to calculate the sum of the emotion values of the complex sentences, as the emotion value of the text.
Further, the third computing module 370 is specifically configured to:
multiply the emotion value of the expression set by a predetermined first weight, multiply the emotion value of the text by a predetermined second weight, and add the two results to obtain the emotion value of the microblog content.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant parts, refer to the description of the method embodiment.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.

Claims (10)

1. A microblog emotional tendency analysis method, applied to an electronic device, characterized in that the method includes:
for microblog content to be analyzed, extracting the expression set that the microblog content includes, and determining the text corresponding to the microblog content;
for each expression in the expression set, obtaining the emotion value corresponding to the expression according to a pre-built expression database, and calculating the emotion value of the expression set according to the emotion values corresponding to the expressions;
dividing the text into at least one complex sentence according to preset punctuation marks, and determining the sentence pattern coefficient of each complex sentence according to the sentence pattern of that complex sentence;
for each complex sentence, extracting each subordinate sentence that the complex sentence includes, and determining the inter-sentence relation coefficient of each subordinate sentence according to the inter-sentence relation between that subordinate sentence and the other subordinate sentences;
for each subordinate sentence, performing word segmentation on the subordinate sentence to obtain the segmented words it includes, and identifying the emotion words among the segmented words; determining the emotion value of each emotion word according to a pre-built dictionary;
calculating the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence;
calculating the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text.
2. The method according to claim 1, characterized in that calculating the emotion value of the expression set according to the emotion values corresponding to the expressions includes:
calculating the average of the emotion values corresponding to all of the expressions, and using the average as the emotion value of the expression set.
3. The method according to claim 1, characterized in that, after performing word segmentation on each subordinate sentence and obtaining the segmented words it includes, the method further includes:
performing part-of-speech tagging on each segmented word;
after identifying the emotion words among the segmented words, the method further includes:
identifying the degree adverbs and negative adverbs preceding each emotion word;
and determining the emotion value of each emotion word according to the pre-built dictionary includes:
determining a corrected emotion value of each emotion word according to the pre-built dictionary and the degree adverbs and negative adverbs preceding that emotion word.
4. The method according to claim 3, characterized in that calculating the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence includes:
calculating, according to the determined emotion value of each emotion word in each subordinate sentence, the sum of the emotion values of the emotion words that each subordinate sentence includes, as the word emotion value of that subordinate sentence;
calculating the product of the word emotion value of each subordinate sentence and the inter-sentence relation coefficient of that subordinate sentence, as the emotion value of that subordinate sentence;
calculating the product of the sum of the emotion values of the subordinate sentences that each complex sentence includes and the sentence pattern coefficient of that complex sentence, as the emotion value of that complex sentence;
calculating the sum of the emotion values of the complex sentences, as the emotion value of the text.
5. The method according to any one of claims 1-4, characterized in that calculating the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text includes:
multiplying the emotion value of the expression set by a predetermined first weight, multiplying the emotion value of the text by a predetermined second weight, and adding the two results to obtain the emotion value of the microblog content.
6. A microblog emotional tendency analysis device, applied to an electronic device, characterized in that the device includes:
an extraction module, configured to, for microblog content to be analyzed, extract the expression set that the microblog content includes, and determine the text corresponding to the microblog content;
a first computing module, configured to, for each expression in the expression set, obtain the emotion value corresponding to the expression according to a pre-built expression database, and calculate the emotion value of the expression set according to the emotion values corresponding to the expressions;
a first determining module, configured to divide the text into at least one complex sentence according to preset punctuation marks, and determine the sentence pattern coefficient of each complex sentence according to the sentence pattern of that complex sentence;
a second determining module, configured to, for each complex sentence, extract each subordinate sentence that the complex sentence includes, and determine the inter-sentence relation coefficient of each subordinate sentence according to the inter-sentence relation between that subordinate sentence and the other subordinate sentences;
a third determining module, configured to, for each subordinate sentence, perform word segmentation on the subordinate sentence to obtain the segmented words it includes, identify the emotion words among the segmented words, and determine the emotion value of each emotion word according to a pre-built dictionary;
a second computing module, configured to calculate the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence;
a third computing module, configured to calculate the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text.
7. The device according to claim 6, characterized in that the first computing module is specifically configured to:
calculate the average of the emotion values corresponding to all of the expressions, and use the average as the emotion value of the expression set.
8. The device according to claim 6, characterized in that the device further includes:
a processing module, configured to perform part-of-speech tagging on each segmented word;
an identification module, configured to identify the degree adverbs and negative adverbs preceding each emotion word;
the third determining module is further configured to determine a corrected emotion value of each emotion word according to the pre-built dictionary and the degree adverbs and negative adverbs preceding that emotion word.
9. The device according to claim 8, characterized in that the second computing module includes:
a first calculating sub-module, configured to calculate, according to the determined emotion value of each emotion word in each subordinate sentence, the sum of the emotion values of the emotion words that each subordinate sentence includes, as the word emotion value of that subordinate sentence;
a second calculating sub-module, configured to calculate the product of the word emotion value of each subordinate sentence and the inter-sentence relation coefficient of that subordinate sentence, as the emotion value of that subordinate sentence;
a third calculating sub-module, configured to calculate the product of the sum of the emotion values of the subordinate sentences that each complex sentence includes and the sentence pattern coefficient of that complex sentence, as the emotion value of that complex sentence;
a fourth calculating sub-module, configured to calculate the sum of the emotion values of the complex sentences, as the emotion value of the text.
10. The device according to any one of claims 6-9, characterized in that the third computing module is specifically configured to:
multiply the emotion value of the expression set by a predetermined first weight, multiply the emotion value of the text by a predetermined second weight, and add the two results to obtain the emotion value of the microblog content.
CN201610181735.8A 2016-03-28 2016-03-28 Microblog emotional tendency analysis method and device Pending CN105843796A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610181735.8A CN105843796A (en) 2016-03-28 2016-03-28 Microblog emotional tendency analysis method and device

Publications (1)

Publication Number Publication Date
CN105843796A true CN105843796A (en) 2016-08-10

Family

ID=56584525

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610181735.8A Pending CN105843796A (en) 2016-03-28 2016-03-28 Microblog emotional tendency analysis method and device

Country Status (1)

Country Link
CN (1) CN105843796A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202584A (en) * 2016-09-20 2016-12-07 北京工业大学 A kind of microblog emotional based on standard dictionary and semantic rule analyzes method
CN106326481A (en) * 2016-08-31 2017-01-11 中译语通科技(北京)有限公司 Detection method of Weibo hot topics based on suddenness
CN106503220A (en) * 2016-10-28 2017-03-15 上海大学 A kind of microblogging emoticon affection computation method based on a mutual information
CN107229612A (en) * 2017-05-24 2017-10-03 重庆誉存大数据科技有限公司 A kind of network information semantic tendency analysis method and system
CN108153831A (en) * 2017-12-13 2018-06-12 北京小米移动软件有限公司 Music adding method and device
CN108197104A (en) * 2017-12-27 2018-06-22 浙江力石科技股份有限公司 Text analyzing method, apparatus and cloud platform
CN108228573A (en) * 2018-03-23 2018-06-29 北京航空航天大学 Text emotion analysis method, device and electronic equipment
CN108647257A (en) * 2018-04-24 2018-10-12 北京科技大学 A kind of microblog emotional determines method
CN109145306A (en) * 2018-09-11 2019-01-04 刘瑞军 The three-dimensional expression generation method of text-driven
CN109471928A (en) * 2018-10-31 2019-03-15 北京国信云服科技有限公司 A kind of associated entity Judgment by emotion method based on diffusive transport model
CN109598402A (en) * 2018-10-23 2019-04-09 平安科技(深圳)有限公司 Data report generation method, device, computer equipment based on data mining
CN113378578A (en) * 2021-05-08 2021-09-10 重庆航天信息有限公司 Food and medicine public opinion analysis method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663046A (en) * 2012-03-29 2012-09-12 中国科学院自动化研究所 Sentiment analysis method oriented to micro-blog short text
CN103699626A (en) * 2013-12-20 2014-04-02 华南理工大学 Method and system for analysing individual emotion tendency of microblog user
CN105224640A (en) * 2015-09-25 2016-01-06 杭州朗和科技有限公司 A kind of method and apparatus extracting viewpoint

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326481A (en) * 2016-08-31 2017-01-11 中译语通科技(北京)有限公司 Detection method of Weibo hot topics based on suddenness
CN106202584A (en) * 2016-09-20 2016-12-07 北京工业大学 A kind of microblog emotional based on standard dictionary and semantic rule analyzes method
CN106503220A (en) * 2016-10-28 2017-03-15 上海大学 A kind of microblogging emoticon affection computation method based on a mutual information
CN107229612B (en) * 2017-05-24 2021-01-08 重庆电信系统集成有限公司 Network information semantic tendency analysis method and system
CN107229612A (en) * 2017-05-24 2017-10-03 重庆誉存大数据科技有限公司 A kind of network information semantic tendency analysis method and system
CN108153831A (en) * 2017-12-13 2018-06-12 北京小米移动软件有限公司 Music adding method and device
CN108197104A (en) * 2017-12-27 2018-06-22 浙江力石科技股份有限公司 Text analyzing method, apparatus and cloud platform
CN108228573A (en) * 2018-03-23 2018-06-29 北京航空航天大学 Text emotion analysis method, device and electronic equipment
CN108647257A (en) * 2018-04-24 2018-10-12 北京科技大学 A kind of microblog emotional determines method
CN109145306A (en) * 2018-09-11 2019-01-04 刘瑞军 The three-dimensional expression generation method of text-driven
CN109598402A (en) * 2018-10-23 2019-04-09 平安科技(深圳)有限公司 Data report generation method, device, computer equipment based on data mining
CN109471928A (en) * 2018-10-31 2019-03-15 北京国信云服科技有限公司 A kind of associated entity Judgment by emotion method based on diffusive transport model
CN109471928B (en) * 2018-10-31 2021-09-28 北京国信云服科技有限公司 Associated entity emotion judgment method based on diffusion propagation model
CN113378578A (en) * 2021-05-08 2021-09-10 重庆航天信息有限公司 Food and medicine public opinion analysis method

Similar Documents

Publication Publication Date Title
CN105843796A (en) Microblog emotional tendency analysis method and device
CN108573411B (en) Mixed recommendation method based on deep emotion analysis and multi-source recommendation view fusion of user comments
CN103678564B (en) Internet product research system based on data mining
CN103544255B (en) Text semantic relativity based network public opinion information analysis method
CN102866989B (en) Viewpoint abstracting method based on word dependence relationship
CN103049435B (en) Text fine granularity sentiment analysis method and device
CN105005564B (en) A kind of data processing method and device based on answer platform
CN105630768B (en) A kind of product name recognition method and device based on stacking condition random field
CN107784092A (en) A kind of method, server and computer-readable medium for recommending hot word
CN106951438A (en) A kind of event extraction system and method towards open field
CN102200975B (en) Vertical search engine system using semantic analysis
CN104881458B (en) A kind of mask method and device of Web page subject
CN106250513A (en) A kind of event personalization sorting technique based on event modeling and system
CN106970912A (en) Chinese sentence similarity calculating method, computing device and computer-readable storage medium
CN104820686A (en) Network search method and network search system
CN109960756A (en) Media event information inductive method
CN103473380B (en) A kind of computer version sensibility classification method
CN106126502A (en) A kind of emotional semantic classification system and method based on support vector machine
CN110362678A (en) A kind of method and apparatus automatically extracting Chinese text keyword
CN110134845A (en) Project public sentiment monitoring method, device, computer equipment and storage medium
CN109472022A (en) New word identification method and terminal device based on machine learning
Nandi et al. Bangla news recommendation using doc2vec
CN107798622A (en) A kind of method and apparatus for identifying user view
CN106250365A (en) The extracting method of item property Feature Words in consumer reviews based on text analyzing
CN114722174A (en) Word extraction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160810