CN105843796A - Microblog emotional tendency analysis method and device - Google Patents
Microblog emotional tendency analysis method and device
- Publication number
- CN105843796A CN201610181735.8A
- Authority
- CN
- China
- Prior art keywords
- sentence
- emotion value
- emotion
- word
- subordinate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
Embodiments of the invention provide a microblog emotional tendency analysis method and device, applicable to electronic equipment. The emotion value of a piece of microblog content is determined from both the emotion values of the emoticons (expressions) and the emotion value of the text included in the microblog content. When the emotion value of the text is determined, the sentence pattern of each compound sentence in the text and the inter-clause relations among the clauses of each compound sentence are both taken into account. Compared with the prior art, more of the factors that influence microblog emotional tendency are used to determine it, so the accuracy of microblog emotional tendency analysis can be improved.
Description
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a microblog emotional tendency analysis method and device.
Background
With the development of the Internet, people have become increasingly accustomed to expressing their views online, for example through microblogs.
A microblog is a broadcast-style social network platform on which users share brief, real-time information through a follow mechanism. After opening a microblog account, a user can post, forward and comment on messages in order to record daily life, share news and express opinions. Thanks to its openness, equality and ease of use, microblogging attracted public attention as soon as it appeared. Microblog posts are numerous and updated quickly, and many of them express users' views on and attitudes toward particular events, so analyzing the emotional tendency of microblog content is of considerable practical value. For example, netizens' opinions on a hot event help the government understand current public sentiment, assess the state of public opinion and make decisions; users' comments on a product help merchants adjust their marketing strategy and help buyers choose products.
In the prior art, microblog emotional tendency analysis is mainly performed on the basis of semantic rules. That is, the emotion values of the emotion words in the microblog text are collected together with the degree adverbs and negative adverbs that modify them, and the emotion value of each sentence and of the text is then obtained by averaging or by some other calculation.
However, in practice many factors influence the emotional tendency of microblog content, so emotion words and the degree adverbs and negative adverbs that accompany them are not sufficient on their own to analyze the emotional tendency of microblog content accurately.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a microblog emotional tendency analysis method and device so as to improve the accuracy of microblog emotional tendency analysis. The specific technical solution is as follows.
In a first aspect, an embodiment of the present invention provides a microblog emotional tendency analysis method applied to an electronic device, the method comprising:
for microblog content to be analyzed, extracting the emoticon set included in the microblog content, and determining the text corresponding to the microblog content;
for each emoticon in the emoticon set, obtaining the emotion value corresponding to the emoticon from a pre-built emoticon database, and calculating the emotion value of the emoticon set from the emotion values of the emoticons;
dividing the text into at least one compound sentence according to preset punctuation marks, and determining the sentence-pattern coefficient of each compound sentence according to its sentence pattern;
for each compound sentence, extracting the clauses it includes, and determining the inter-clause relation coefficient of each clause according to the relation between that clause and the other clauses;
for each clause, performing word segmentation on the clause to obtain the words it includes, identifying the emotion words among those words, and determining the emotion value of each emotion word according to a pre-built dictionary;
calculating the emotion value of the text from the emotion values of the emotion words, the inter-clause relation coefficients of the clauses and the sentence-pattern coefficients of the compound sentences; and
calculating the emotion value of the microblog content from the emotion value of the emoticon set and the emotion value of the text.
Further, calculating the emotion value of the emoticon set from the emotion values of the emoticons comprises:
calculating the average of the emotion values corresponding to all the emoticons, and taking the average as the emotion value of the emoticon set.
Further, after performing word segmentation on each clause to obtain the words it includes, the method also comprises:
performing part-of-speech tagging on each word;
after identifying the emotion words among the words, the method also comprises:
identifying the degree adverbs and negative adverbs preceding each emotion word;
and determining the emotion value of each emotion word according to the pre-built dictionary comprises:
determining the modified emotion value of each emotion word according to the pre-built dictionary and the degree adverbs and negative adverbs preceding the emotion word.
Further, calculating the emotion value of the text from the emotion values of the emotion words, the inter-clause relation coefficients of the clauses and the sentence-pattern coefficients of the compound sentences comprises:
for each clause, calculating the sum of the emotion values of the emotion words in the clause as the word emotion value of the clause;
for each clause, calculating the product of its word emotion value and its inter-clause relation coefficient as the emotion value of the clause;
for each compound sentence, calculating the product of the sum of the emotion values of its clauses and its sentence-pattern coefficient as the emotion value of the compound sentence; and
calculating the sum of the emotion values of the compound sentences as the emotion value of the text.
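The four calculation steps above can be sketched in Python as follows. This is only an illustrative sketch: the class names `Clause` and `CompoundSentence` and the function names are hypothetical and do not come from the patent, and the coefficients are assumed to have been determined already.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clause:
    # Emotion values of the emotion words found in this clause.
    word_values: List[float]
    # Inter-clause relation coefficient of this clause (assumed precomputed).
    relation_coeff: float = 1.0


@dataclass
class CompoundSentence:
    clauses: List[Clause]
    # Sentence-pattern coefficient of this compound sentence (assumed precomputed).
    pattern_coeff: float = 1.0


def clause_value(clause: Clause) -> float:
    # Word emotion value (sum over emotion words) times the relation coefficient.
    return sum(clause.word_values) * clause.relation_coeff


def sentence_value(sentence: CompoundSentence) -> float:
    # Sum of the clause emotion values times the sentence-pattern coefficient.
    return sum(clause_value(c) for c in sentence.clauses) * sentence.pattern_coeff


def text_value(sentences: List[CompoundSentence]) -> float:
    # Emotion value of the text: sum over all compound sentences.
    return sum(sentence_value(s) for s in sentences)


example = [
    CompoundSentence([Clause([0.5], 0.0), Clause([-0.5, -0.25], 1.0)], 2.0),
    CompoundSentence([Clause([0.5], 1.0)], 1.0),
]
print(text_value(example))
```

In the example, the first compound sentence contributes (0.5 × 0 + (-0.75) × 1) × 2 and the second contributes 0.5 × 1 × 1; the input numbers are made up purely for illustration.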
Further, calculating the emotion value of the microblog content from the emotion value of the emoticon set and the emotion value of the text comprises:
multiplying the emotion value of the emoticon set by a first preset weight, multiplying the emotion value of the text by a second preset weight, and adding the two results to obtain the emotion value of the microblog content.
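As a minimal sketch of this weighted combination, the function below adds the two weighted values. The function name and the default weights of 0.5 are placeholders chosen for illustration; the text here does not state what the preset weights are.

```python
def microblog_value(emoticon_value: float, text_emotion: float,
                    w_emoticon: float = 0.5, w_text: float = 0.5) -> float:
    # Weighted sum of the emoticon-set emotion value and the text emotion value.
    # The default weights are placeholders, not values given in the source.
    return w_emoticon * emoticon_value + w_text * text_emotion


print(microblog_value(1.0, -1.0, w_emoticon=0.25, w_text=0.75))
```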
In a second aspect, an embodiment of the present invention provides a microblog emotional tendency analysis device applied to an electronic device, the device comprising:
an extraction module, configured to, for microblog content to be analyzed, extract the emoticon set included in the microblog content and determine the text corresponding to the microblog content;
a first calculation module, configured to, for each emoticon in the emoticon set, obtain the emotion value corresponding to the emoticon from a pre-built emoticon database, and calculate the emotion value of the emoticon set from the emotion values of the emoticons;
a first determination module, configured to divide the text into at least one compound sentence according to preset punctuation marks, and determine the sentence-pattern coefficient of each compound sentence according to its sentence pattern;
a second determination module, configured to, for each compound sentence, extract the clauses it includes, and determine the inter-clause relation coefficient of each clause according to the relation between that clause and the other clauses;
a third determination module, configured to, for each clause, perform word segmentation on the clause to obtain the words it includes, identify the emotion words among those words, and determine the emotion value of each emotion word according to a pre-built dictionary;
a second calculation module, configured to calculate the emotion value of the text from the emotion values of the emotion words, the inter-clause relation coefficients of the clauses and the sentence-pattern coefficients of the compound sentences; and
a third calculation module, configured to calculate the emotion value of the microblog content from the emotion value of the emoticon set and the emotion value of the text.
Further, the first calculation module is specifically configured to:
calculate the average of the emotion values corresponding to all the emoticons, and take the average as the emotion value of the emoticon set.
Further, the device also comprises:
a processing module, configured to perform part-of-speech tagging on each word;
an identification module, configured to identify the degree adverbs and negative adverbs preceding each emotion word;
and the third determination module is further configured to determine the modified emotion value of each emotion word according to the pre-built dictionary and the degree adverbs and negative adverbs preceding the emotion word.
Further, the second calculation module comprises:
a first calculation submodule, configured to calculate, for each clause, the sum of the emotion values of the emotion words in the clause as the word emotion value of the clause;
a second calculation submodule, configured to calculate, for each clause, the product of its word emotion value and its inter-clause relation coefficient as the emotion value of the clause;
a third calculation submodule, configured to calculate, for each compound sentence, the product of the sum of the emotion values of its clauses and its sentence-pattern coefficient as the emotion value of the compound sentence; and
a fourth calculation submodule, configured to calculate the sum of the emotion values of the compound sentences as the emotion value of the text.
Further, the third calculation module is specifically configured to:
multiply the emotion value of the emoticon set by a first preset weight, multiply the emotion value of the text by a second preset weight, and add the two results to obtain the emotion value of the microblog content.
Embodiments of the present invention provide a microblog emotional tendency analysis method and device. The method comprises: for microblog content to be analyzed, extracting the emoticon set included in the microblog content and determining the text corresponding to the microblog content; for each emoticon in the emoticon set, obtaining the emotion value corresponding to the emoticon from a pre-built emoticon database, and calculating the emotion value of the emoticon set from the emotion values of the emoticons; dividing the text into at least one compound sentence according to preset punctuation marks, and determining the sentence-pattern coefficient of each compound sentence according to its sentence pattern; for each compound sentence, extracting the clauses it includes, and determining the inter-clause relation coefficient of each clause according to the relation between that clause and the other clauses; for each clause, performing word segmentation to obtain the words it includes, identifying the emotion words among them, and determining the emotion value of each emotion word according to a pre-built dictionary; calculating the emotion value of the text from the emotion values of the emotion words, the inter-clause relation coefficients of the clauses and the sentence-pattern coefficients of the compound sentences; and calculating the emotion value of the microblog content from the emotion value of the emoticon set and the emotion value of the text. The embodiments determine the emotion value of microblog content from both the emotion values of the emoticons and the emotion value of the text it includes. Moreover, when determining the emotion value of the text, they take into account both the sentence pattern of each compound sentence and the inter-clause relations among the clauses of each compound sentence. Compared with the prior art, more of the factors that influence microblog emotional tendency are used to determine it, so the accuracy of microblog emotional tendency analysis can be improved.
Brief description of the drawings
In order to describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a microblog emotional tendency analysis method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of common microblog emoticons in picture form;
Fig. 3 is a schematic structural diagram of a microblog emotional tendency analysis device provided by an embodiment of the present invention.
Detailed description
In order to improve the accuracy of microblog emotional tendency analysis, embodiments of the present invention provide a microblog emotional tendency analysis method and device.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that, provided they do not conflict, the embodiments of the present invention and the features in the embodiments may be combined with one another. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.
In order to improve the accuracy of microblog emotional tendency analysis, an embodiment of the present invention provides a microblog emotional tendency analysis process. As shown in Fig. 1, the process comprises the following steps:
S101: for microblog content to be analyzed, extract the emoticon set included in the microblog content, and determine the text corresponding to the microblog content.
The method provided by this embodiment can be applied to an electronic device, for example a notebook computer, an intelligent terminal, a desktop computer or a portable computer.
In this embodiment, the electronic device first obtains the microblog content to be analyzed. For example, it may use a web crawler to fetch a complete piece of microblog content directly from the network, such as any post on Sina Weibo, and take it as the microblog content to be analyzed. Alternatively, to make the analysis more efficient, the electronic device may crawl at least one piece of microblog content in advance and save it in a database, and then read the content to be analyzed directly from the database when performing the analysis. Crawling microblog content may use existing techniques and is not described further here.
It will be appreciated that, when posting microblog content, a user may include emoticons in addition to text. An emoticon may be in character form or in picture form. Emoticons in microblog content usually convey the user's emotion well and so indicate the emotional tendency of the content.
Fig. 2 shows common microblog emoticons in picture form, where emoticon 210 is "smile" and emoticon 220 is "extremely".
Therefore, in this embodiment, in order to improve the accuracy of microblog emotional tendency analysis, the electronic device may analyze the emotional tendency of microblog content according to both its emoticons and its text.
Specifically, after obtaining the microblog content to be analyzed, the electronic device may first extract the emoticons included in it to obtain an emoticon set containing each emoticon, and determine the text corresponding to the microblog content. For example, the electronic device may extract every emoticon included in the microblog content, take the set of extracted emoticons as the emoticon set, and take the content other than the emoticon set as the text corresponding to the microblog content.
For example, when the electronic device crawls the complete microblog content directly from the network, the emoticons in it may be in picture format. In this case, the electronic device may recognize the pictures included in the content and take the recognized pictures as the emoticon set of the content. When the electronic device reads the content from the database, the emoticons in the stored microblog content are usually represented as emoticon words in a predefined format. For instance, emoticon 210 in Fig. 2 may be stored in the database as [smile] and emoticon 220 as [extremely]. In this case, the electronic device may extract the emoticon words enclosed in the symbols "[ ]" and take the extracted emoticon words as the emoticon set of the content.
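The bracket-based extraction described above can be sketched with a regular expression. The function name and the sample post are hypothetical; the sketch assumes that every "[...]" span in stored content is an emoticon word and that square brackets are not nested.

```python
import re

# Matches an emoticon word enclosed in square brackets, e.g. "[smile]".
EMOTICON_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_emoticons_and_text(content: str):
    # Emoticon set: every bracketed emoticon word, in order of appearance.
    emoticons = EMOTICON_PATTERN.findall(content)
    # Text: whatever remains after the bracketed emoticon words are removed.
    text = EMOTICON_PATTERN.sub("", content)
    return emoticons, text


emoticons, text = split_emoticons_and_text(
    "Nice day[smile] but traffic was bad[extremely]"
)
print(emoticons)
print(text)
```

A real system would probably also check each bracketed word against the emoticon database, so that ordinary bracketed text is not mistaken for an emoticon.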
S102: for each emoticon in the emoticon set, obtain the emotion value corresponding to the emoticon from a pre-built emoticon database, and calculate the emotion value of the emoticon set from the emotion values of the emoticons.
After extracting the emoticon set from the microblog content to be analyzed, the electronic device may look up each emoticon of the set in the pre-built emoticon database to obtain its emotion value, and then calculate the emotion value of the emoticon set from these values.
For example, when the emoticons obtained by the electronic device are in picture format, each emoticon may first be converted into its corresponding emoticon word before its emotion value is obtained, and the emotion value of each emoticon word can then be looked up in the emoticon database.
In this embodiment, the emoticon database may be built in advance; it contains each emoticon word and its corresponding emotion value.
For example, the emoticon database built in this embodiment may be as shown in the following table:
As shown in the table above, the emotion value of the emoticon word "smile" may be 1.0, that of "cursing in rage" may be -0.9, and that of "sad" may be -1.
The sign of an emoticon word's emotion value identifies its emotional tendency: a positive value indicates a positive tendency, a negative value indicates a negative tendency, and a value of 0 indicates a neutral tendency. The larger the magnitude of the emotion value, the stronger the emotional tendency of the emoticon word. In this embodiment, the emotion value of each emoticon may be set between -1 and 1.
When calculating the emotion value of the emoticon set of the microblog content to be analyzed, the electronic device may look up in the emoticon database the emotion value of each emoticon in the set.
It should be noted that, when the emoticons obtained by the electronic device are in picture format, each emoticon may first be converted into its corresponding emoticon word before its emotion value is obtained, and the emotion value of each emoticon word can then be looked up in the emoticon database described above.
After obtaining the emotion value of each emoticon in the set, the electronic device may further calculate the emotion value of the emoticon set from these values.
For example, in one implementation, the electronic device may calculate the average of the emotion values of all the emoticons and take the average as the emotion value of the emoticon set.
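A minimal sketch of the lookup-and-average step follows. The dictionary holds only the three example entries quoted above. Skipping emoticon words that are missing from the database, and returning 0 (neutral) for an empty set, are assumptions of this sketch rather than behaviour described in the text.

```python
# Example entries quoted in the description; a real database would hold many more.
EMOTICON_DB = {"smile": 1.0, "cursing in rage": -0.9, "sad": -1.0}


def emoticon_set_value(emoticons, db=EMOTICON_DB) -> float:
    # Look up each emoticon word; unknown words are skipped (assumption).
    values = [db[e] for e in emoticons if e in db]
    if not values:
        # No usable emoticon: treated as neutral (assumption).
        return 0.0
    # Emotion value of the set: average of the individual emotion values.
    return sum(values) / len(values)


print(emoticon_set_value(["smile", "sad"]))
```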
S103: divide the text into at least one compound sentence according to preset punctuation marks, and determine the sentence-pattern coefficient of each compound sentence according to its sentence pattern.
In this embodiment, the electronic device may also use the emotion value of the text to determine the emotion value of the microblog content.
Specifically, the electronic device may first divide the text into at least one compound sentence according to preset punctuation marks such as the period, the question mark and the exclamation mark, and may then determine the sentence-pattern coefficient of each compound sentence according to its sentence pattern. Dividing the text into compound sentences may use existing techniques and is not described further here.
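One simple way to perform this division, shown only as a sketch, is a regular expression that cuts the text after each run of sentence-ending punctuation. The punctuation set below (full-width and ASCII period, question mark and exclamation mark) and the function name are assumptions; the closing punctuation is kept on each sentence because the sentence pattern is later judged from it.

```python
import re

# A sentence is a run of non-terminator characters followed by any terminators.
SENTENCE_PATTERN = re.compile(r"[^。？！.?!]+[。？！.?!]*")


def split_compound_sentences(text: str):
    # Return the compound sentences in order, each with its closing punctuation.
    return [s.strip() for s in SENTENCE_PATTERN.findall(text) if s.strip()]


print(split_compound_sentences("I like it. Do you? Great!"))
```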
After obtaining the compound sentences, the electronic device may determine the sentence-pattern coefficient of each according to its sentence pattern. The sentence patterns in this embodiment may include declarative sentences, exclamatory sentences, interrogative sentences and rhetorical questions.
Specifically, the electronic device may determine the sentence pattern of each compound sentence from the punctuation marks it includes together with preset keywords.
For example, when the punctuation mark in a compound sentence is a period, the electronic device may determine that the sentence is declarative. When the punctuation mark is an exclamation mark, it may determine that the sentence is exclamatory. When the punctuation mark is a question mark and the sentence contains no rhetorical-question word such as "no", it may determine that the sentence is interrogative. When the punctuation mark is a question mark and the sentence does contain such a rhetorical-question word, it may determine that the sentence is a rhetorical question.
Further, in this embodiment, the sentence-pattern coefficient may be 1 for a declarative sentence, 2 for an exclamatory sentence, 0 for an interrogative sentence and -1.5 for a rhetorical question.
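The rules and coefficients above can be sketched as follows. The coefficient values are the ones quoted in the text. The rhetorical-question keyword list is a hypothetical placeholder, because the original Chinese keywords are not recoverable from this translation, and treating any sentence without a question mark or exclamation mark as declarative is also an assumption of the sketch.

```python
# Sentence-pattern coefficients quoted in the description.
PATTERN_COEFF = {
    "declarative": 1.0,
    "exclamatory": 2.0,
    "interrogative": 0.0,
    "rhetorical": -1.5,
}

# Hypothetical rhetorical-question markers; replace with the real keyword list.
RHETORICAL_WORDS = ("难道", "岂")


def sentence_pattern(sentence: str, rhetorical_words=RHETORICAL_WORDS) -> str:
    s = sentence.rstrip()
    if s.endswith(("?", "？")):
        # A question mark plus a rhetorical marker means a rhetorical question.
        if any(w in s for w in rhetorical_words):
            return "rhetorical"
        return "interrogative"
    if s.endswith(("!", "！")):
        return "exclamatory"
    # Period or anything else: treated as declarative (assumption).
    return "declarative"


def pattern_coefficient(sentence: str) -> float:
    return PATTERN_COEFF[sentence_pattern(sentence)]


print(pattern_coefficient("I am happy!"))
```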
S104, for each complex sentence, extracts each subordinate sentence that this complex sentence includes, and according to each subordinate sentence and other
Relation between the sentence of subordinate sentence, determines coefficient of relationship between the sentence of each subordinate sentence.
After electronic equipment determines the sentence pattern of each complex sentence and the sentence pattern coefficient of correspondence, for each complex sentence, it is also possible to
Extract each subordinate sentence that this complex sentence includes, it is possible to according to relation between the sentence of each subordinate sentence and other subordinate sentences, determine
Coefficient of relationship between the sentence of each subordinate sentence.Electronic equipment extracts the process of each subordinate sentence that each complex sentence includes, can adopt
By prior art, e.g., the comma that electronic equipment can include according to each complex sentence, extract each complex sentence and include
Each subordinate sentence, this process is not repeated by the embodiment of the present invention.
After obtaining each subordinate sentence that each complex sentence includes, electronic equipment can also be for each complex sentence, multiple according to this
Relation between the sentence of each subordinate sentence of including of sentence and other subordinate sentences, determines coefficient of relationship between the sentence of each subordinate sentence.
It is appreciated that between each subordinate sentence that a complex sentence includes, some annexations can be there are, as
Transfer, go forward one by one, assume.And there is each subordinate sentence of different annexation, its Sentiment orientation expressed also may be used
Can be different.
Therefore, in embodiments of the present invention, electronic equipment can wrap according in this complex sentence for each complex sentence
Relation between each subordinate sentence included and the sentence of other subordinate sentences, determines coefficient of relationship between the sentence of each subordinate sentence.
Specifically, whether electronic equipment can identify in each subordinate sentence comprised in this complex sentence for each complex sentence
Comprise predetermined key word, determine relation between the sentence of each subordinate sentence and other subordinate sentences.
For example, when a subordinate sentence contains a word that expresses an adversative relation, such as "but", the inter-sentence relation between this subordinate sentence and the subordinate sentence before it can be determined to be an adversative relation.
It can be understood that, among subordinate sentences in an adversative relation, the preceding subordinate sentence usually states a fact or situation, while the following subordinate sentence states a meaning contrary or opposed to it; that is, the following subordinate sentence carries what the speaker really intends to express. Therefore, in the embodiment of the present invention, when the inter-sentence relation between several subordinate sentences is adversative, the inter-sentence relation coefficients can be: 0 for the subordinate sentence before the adversative word, and 1 for the subordinate sentence after the adversative word.
When a subordinate sentence contains a word that expresses a progressive relation, such as "more" or "what is more", the inter-sentence relation between this subordinate sentence and the subordinate sentence before it can be determined to be a progressive relation. When the inter-sentence relation between several subordinate sentences is progressive, the inter-sentence relation coefficients can be: 1 for the subordinate sentence before the progressive word, and 1.5 for the subordinate sentence after the progressive word.
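The keyword-based determination of inter-sentence relation coefficients described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the keyword lists use the English stand-ins quoted in the text rather than the original Chinese connectives, the function names are hypothetical, and a default coefficient of 1 for subordinate sentences without any recognized keyword is an assumption that the text does not state.

```python
import re

# Hypothetical keyword lists: English stand-ins for the connectives
# that the description gives as examples.
ADVERSATIVE_WORDS = ("but",)
PROGRESSIVE_WORDS = ("what is more", "more")


def split_clauses(complex_sentence):
    """Split a complex sentence into subordinate sentences at commas."""
    parts = re.split(r"[,，]", complex_sentence)
    return [p.strip() for p in parts if p.strip()]


def contains_keyword(clause, keywords):
    """Return True if the clause contains any keyword as a whole word."""
    lowered = clause.lower()
    return any(re.search(r"\b" + re.escape(k) + r"\b", lowered) for k in keywords)


def relation_coefficients(clauses):
    """Return one inter-sentence relation coefficient per subordinate sentence."""
    # Assumption: a clause with no recognized relation keeps coefficient 1.
    coeffs = [1.0] * len(clauses)
    for i, clause in enumerate(clauses):
        if i == 0:
            continue  # a relation needs a preceding subordinate sentence
        if contains_keyword(clause, ADVERSATIVE_WORDS):
            coeffs[i - 1] = 0.0  # subordinate sentence before the adversative word
            coeffs[i] = 1.0      # subordinate sentence after the adversative word
        elif contains_keyword(clause, PROGRESSIVE_WORDS):
            # The preceding clause keeps its coefficient (1 by default).
            coeffs[i] = 1.5      # subordinate sentence after the progressive word
    return coeffs
```

For instance, splitting "The phone looks nice, but the battery is terrible" yields two subordinate sentences, and the sketch assigns them the coefficients 0 and 1, so that only the second one contributes to the emotion value.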
S105, for each subordinate sentence, perform word segmentation on the subordinate sentence to obtain the segmented words that it includes, and identify the emotion words among the segmented words; determine the emotion value of each emotion word according to a pre-built lexicon.
After obtaining the inter-sentence relation coefficient of each subordinate sentence, the electronic device can further, for each subordinate sentence, perform word segmentation on the subordinate sentence, obtain the segmented words that it includes, and identify the emotion words among them; it can then determine the emotion value of each emotion word according to the pre-built lexicon.
In the embodiment of the present invention, the process in which the electronic device segments each subordinate sentence, identifies the emotion words among the segmented words, and determines the emotion value of each emotion word according to the pre-built lexicon can use the prior art, and the embodiment of the present invention does not describe this process in detail.
Optionally, in the embodiment of the present invention, in order to determine the emotion value of each emotion word more accurately, the electronic device can, after segmenting each subordinate sentence into words, also perform part-of-speech tagging on each segmented word. Further, after identifying the emotion words among the segmented words, the electronic device can also identify the degree adverbs and negative adverbs before each emotion word. Further, when determining the emotion value of each emotion word, the electronic device can determine the corrected emotion value of each emotion word according to the pre-built lexicon and the degree adverbs and negative adverbs before that emotion word.
In the embodiment of the present invention, the process in which the electronic device performs part-of-speech tagging, identifies the degree adverbs and negative adverbs before each emotion word, and determines the corrected emotion value of each emotion word according to the pre-built lexicon and those adverbs can use the prior art, and the embodiment of the present invention does not describe this process in detail.
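Because the text leaves the correction step to the prior art, the following is only one possible sketch of it, not the method of the embodiment. It assumes a common lexicon-based scheme in which each degree adverb directly before an emotion word multiplies its value and each negative adverb flips its sign. The lexicon entries, the weights, and the function name are all hypothetical, the input is assumed to be already segmented, and plain lookup tables stand in for part-of-speech tagging.

```python
# Hypothetical lexicons; the values are illustrative only.
EMOTION_LEXICON = {"good": 1.0, "happy": 2.0, "bad": -1.0, "terrible": -2.0}
DEGREE_ADVERBS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}
NEGATION_ADVERBS = {"not", "never"}


def corrected_emotion_values(tokens):
    """Return (emotion_word, corrected_value) pairs for one subordinate sentence.

    `tokens` is the list of segmented words of the subordinate sentence.
    """
    results = []
    for i, token in enumerate(tokens):
        if token not in EMOTION_LEXICON:
            continue
        value = EMOTION_LEXICON[token]
        j = i - 1
        # Walk left over the contiguous adverbs directly before the emotion word.
        while j >= 0 and (tokens[j] in DEGREE_ADVERBS or tokens[j] in NEGATION_ADVERBS):
            if tokens[j] in DEGREE_ADVERBS:
                value *= DEGREE_ADVERBS[tokens[j]]  # assumed: degree scales the value
            else:
                value = -value                      # assumed: negation flips the sign
            j -= 1
        results.append((token, value))
    return results
```

With these hypothetical tables, the segmented words "the service is not very good" give the emotion word "good" a corrected value of -1.5, since "very" scales the base value 1.0 to 1.5 and "not" flips its sign.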
S106, calculate the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence.
In the embodiment of the present invention, the electronic device can calculate the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence.
Specifically, the electronic device can first, according to the determined emotion value of each emotion word in each subordinate sentence, calculate the sum of the emotion values of the emotion words that each subordinate sentence includes, and take this sum as the word emotion value of that subordinate sentence. Then, it can calculate the product of the word emotion value of each subordinate sentence and the inter-sentence relation coefficient of that subordinate sentence, and take this product as the emotion value of that subordinate sentence. Next, it can calculate the product of the sum of the emotion values of the subordinate sentences that each complex sentence includes and the sentence pattern coefficient of that complex sentence, and take this product as the emotion value of that complex sentence. Finally, it can calculate the sum of the emotion values of the complex sentences, and take this sum as the emotion value of the text.
For example, the electronic device can first calculate the emotion value $E(s_i)$ of subordinate sentence $i$ according to the following formula:
$$E(s_i) = \left(\sum E(W_i)\right) \times r_i$$
where $E(W_i)$ is the emotion value of each emotion word that subordinate sentence $i$ includes, $\sum E(W_i)$ is the word emotion value of subordinate sentence $i$, and $r_i$ is the inter-sentence relation coefficient of subordinate sentence $i$.
Further, the electronic device can calculate the emotion value $E(S_j)$ of complex sentence $j$ according to the following formula:
$$E(S_j) = \left(\sum E(s_j)\right) \times T_j$$
where $E(s_j)$ is the emotion value of each subordinate sentence that complex sentence $j$ includes, and $T_j$ is the sentence pattern coefficient of complex sentence $j$.
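The calculation of step S106 can be sketched as follows: the emotion value of a subordinate sentence is the sum of its emotion word values multiplied by its inter-sentence relation coefficient, the emotion value of a complex sentence is the sum of its subordinate sentence values multiplied by its sentence pattern coefficient, and the emotion value of the text is the sum over all complex sentences. The function names and the nested-list input layout are assumptions made for this sketch, and the numeric values in the usage note are hypothetical.

```python
def clause_emotion_value(word_values, relation_coeff):
    """E(s_i): word emotion value of a subordinate sentence times r_i."""
    return sum(word_values) * relation_coeff


def complex_sentence_emotion_value(clauses, pattern_coeff):
    """E(S_j): sum of subordinate sentence values times T_j.

    `clauses` is a list of (word_values, relation_coeff) pairs.
    """
    return sum(clause_emotion_value(w, r) for w, r in clauses) * pattern_coeff


def text_emotion_value(complex_sentences):
    """Emotion value of the text: sum of the complex sentence values.

    `complex_sentences` is a list of (clauses, pattern_coeff) pairs.
    """
    return sum(complex_sentence_emotion_value(c, t) for c, t in complex_sentences)


# Hypothetical input: two complex sentences.
example_text = [
    # Adversative sentence: coefficients 0 and 1, sentence pattern coefficient 1.0.
    ([([1.0], 0.0), ([-2.0, -1.0], 1.0)], 1.0),
    # Progressive sentence: coefficients 1 and 1.5, sentence pattern coefficient 2.0.
    ([([2.0], 1.0), ([1.0], 1.5)], 2.0),
]
```

In this hypothetical input, the first complex sentence evaluates to (0 + (-3)) × 1.0 = -3 and the second to (2 + 1.5) × 2.0 = 7, so the emotion value of the text is 4.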
S107, calculate the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text.
After obtaining the emotion value of the expression set and the emotion value of the text that the microblog content to be analyzed includes, the electronic device can calculate the emotion value of the microblog content to be analyzed according to these two values.
Specifically, the electronic device can multiply the emotion value of the expression set by a predetermined first weight, multiply the emotion value of the text by a predetermined second weight, and add the two results to obtain the emotion value of the microblog content.
For example, the first weight can be 0.4 and the second weight 0.6; or the first weight can be 0.35 and the second weight 0.65; and so on.
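The weighted combination of step S107 can be sketched as follows. The function and parameter names are hypothetical; the default weights 0.4 and 0.6 are the first pair of example values given in the text.

```python
def microblog_emotion_value(expression_value, text_value,
                            expression_weight=0.4, text_weight=0.6):
    """Combine the expression set value and the text value with two weights."""
    return expression_value * expression_weight + text_value * text_weight
```

For an expression set emotion value of 1.0 and a text emotion value of 4.0, the default weights give 0.4 × 1.0 + 0.6 × 4.0 = 2.8, while the weights 0.35 and 0.65 give 2.95.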
The microblog emotional tendency analysis method provided by the embodiment of the present invention can determine the emotion value of microblog content according to both the emotion value of the expressions and the emotion value of the text that the microblog content includes. Moreover, when determining the emotion value of the text, it considers both the sentence pattern of each complex sentence in the text and the inter-sentence relations of the subordinate sentences that each complex sentence includes. Compared with the prior art, it uses more of the factors that influence microblog sentiment orientation to determine the sentiment orientation of a microblog, and can therefore improve the accuracy of microblog emotional tendency analysis.
Corresponding to the above method embodiment, the embodiment of the present invention also provides a corresponding device embodiment.
Fig. 3 shows a microblog emotional tendency analysis device provided by the embodiment of the present invention. The device is applied to an electronic device and includes:
an extraction module 310, configured to, for microblog content to be analyzed, extract the expression set that the microblog content includes and determine the text corresponding to the microblog content;
a first computing module 320, configured to, for each expression in the expression set, obtain the emotion value corresponding to each expression according to a pre-built expression database, and calculate the emotion value of the expression set according to the emotion values corresponding to the expressions;
a first determining module 330, configured to divide the text into at least one complex sentence according to preset punctuation marks, and determine the sentence pattern coefficient of each complex sentence according to the sentence pattern of each complex sentence;
a second determining module 340, configured to, for each complex sentence, extract the subordinate sentences that the complex sentence includes, and determine the inter-sentence relation coefficient of each subordinate sentence according to the inter-sentence relation between each subordinate sentence and the other subordinate sentences;
a third determining module 350, configured to, for each subordinate sentence, perform word segmentation on the subordinate sentence to obtain the segmented words that it includes, identify the emotion words among the segmented words, and determine the emotion value of each emotion word according to a pre-built lexicon;
a second computing module 360, configured to calculate the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence;
a third computing module 370, configured to calculate the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text.
The microblog emotional tendency analysis device provided by the embodiment of the present invention can determine the emotion value of microblog content according to both the emotion value of the expressions and the emotion value of the text that the microblog content includes. Moreover, when determining the emotion value of the text, it considers both the sentence pattern of each complex sentence in the text and the inter-sentence relations of the subordinate sentences that each complex sentence includes. Compared with the prior art, it uses more of the factors that influence microblog sentiment orientation to determine the sentiment orientation of a microblog, and can therefore improve the accuracy of microblog emotional tendency analysis.
Further, the first computing module 320 is specifically configured to:
calculate the average of the emotion values corresponding to all of the expressions, and take this average as the emotion value of the expression set.
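The averaging performed by the first computing module can be sketched as follows. The expression database contents, the bracketed expression codes, and the function name are hypothetical; skipping expressions that are missing from the database and returning 0 for an empty set are assumptions that the text does not specify.

```python
# Hypothetical pre-built expression database: expression code -> emotion value.
EXPRESSION_DB = {"[smile]": 1.0, "[laugh]": 2.0, "[angry]": -2.0}


def expression_set_emotion_value(expressions, database=EXPRESSION_DB):
    """Average the emotion values of all expressions found in the database."""
    values = [database[e] for e in expressions if e in database]
    if not values:
        return 0.0  # assumption: no known expression means a neutral value
    return sum(values) / len(values)
```

With this hypothetical database, the expression set "[smile]", "[laugh]" has the emotion value (1.0 + 2.0) / 2 = 1.5.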
Further, the device also includes:
a processing module (not shown), configured to perform part-of-speech tagging on each segmented word;
an identification module (not shown), configured to identify the degree adverbs and negative adverbs before the emotion words;
the third determining module 350 is further configured to determine the corrected emotion value of each emotion word according to the pre-built lexicon and the degree adverbs and negative adverbs before each emotion word.
Further, the second computing module 360 includes:
a first calculation submodule (not shown), configured to calculate, according to the determined emotion value of each emotion word in each subordinate sentence, the sum of the emotion values of the emotion words that each subordinate sentence includes, as the word emotion value of that subordinate sentence;
a second calculation submodule (not shown), configured to calculate the product of the word emotion value of each subordinate sentence and the inter-sentence relation coefficient of that subordinate sentence, as the emotion value of that subordinate sentence;
a third calculation submodule (not shown), configured to calculate the product of the sum of the emotion values of the subordinate sentences that each complex sentence includes and the sentence pattern coefficient of that complex sentence, as the emotion value of that complex sentence;
a fourth calculation submodule (not shown), configured to calculate the sum of the emotion values of the complex sentences, as the emotion value of the text.
Further, the third computing module 370 is specifically configured to:
multiply the emotion value of the expression set by a predetermined first weight, multiply the emotion value of the text by a predetermined second weight, and add the two results to obtain the emotion value of the microblog content.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relation or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant parts, refer to the description of the method embodiment.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.
Claims (10)
1. A microblog emotional tendency analysis method, applied to an electronic device, characterized in that the method includes:
for microblog content to be analyzed, extracting the expression set that the microblog content includes, and determining the text corresponding to the microblog content;
for each expression in the expression set, obtaining the emotion value corresponding to each expression according to a pre-built expression database, and calculating the emotion value of the expression set according to the emotion values corresponding to the expressions;
dividing the text into at least one complex sentence according to preset punctuation marks, and determining the sentence pattern coefficient of each complex sentence according to the sentence pattern of each complex sentence;
for each complex sentence, extracting the subordinate sentences that the complex sentence includes, and determining the inter-sentence relation coefficient of each subordinate sentence according to the inter-sentence relation between each subordinate sentence and the other subordinate sentences;
for each subordinate sentence, performing word segmentation on the subordinate sentence to obtain the segmented words that it includes, and identifying the emotion words among the segmented words; determining the emotion value of each emotion word according to a pre-built lexicon;
calculating the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence;
calculating the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text.
2. The method according to claim 1, characterized in that calculating the emotion value of the expression set according to the emotion values corresponding to the expressions includes:
calculating the average of the emotion values corresponding to all of the expressions, and taking this average as the emotion value of the expression set.
3. The method according to claim 1, characterized in that, after performing, for each subordinate sentence, word segmentation on the subordinate sentence to obtain the segmented words that it includes, the method further includes:
performing part-of-speech tagging on each segmented word;
after identifying the emotion words among the segmented words, the method further includes:
identifying the degree adverbs and negative adverbs before the emotion words;
and determining the emotion value of each emotion word according to the pre-built lexicon includes:
determining the corrected emotion value of each emotion word according to the pre-built lexicon and the degree adverbs and negative adverbs before each emotion word.
4. The method according to claim 3, characterized in that calculating the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence includes:
calculating, according to the determined emotion value of each emotion word in each subordinate sentence, the sum of the emotion values of the emotion words that each subordinate sentence includes, as the word emotion value of that subordinate sentence;
calculating the product of the word emotion value of each subordinate sentence and the inter-sentence relation coefficient of that subordinate sentence, as the emotion value of that subordinate sentence;
calculating the product of the sum of the emotion values of the subordinate sentences that each complex sentence includes and the sentence pattern coefficient of that complex sentence, as the emotion value of that complex sentence;
calculating the sum of the emotion values of the complex sentences, as the emotion value of the text.
5. The method according to any one of claims 1-4, characterized in that calculating the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text includes:
multiplying the emotion value of the expression set by a predetermined first weight, multiplying the emotion value of the text by a predetermined second weight, and adding the two results to obtain the emotion value of the microblog content.
6. A microblog emotional tendency analysis device, applied to an electronic device, characterized in that the device includes:
an extraction module, configured to, for microblog content to be analyzed, extract the expression set that the microblog content includes and determine the text corresponding to the microblog content;
a first computing module, configured to, for each expression in the expression set, obtain the emotion value corresponding to each expression according to a pre-built expression database, and calculate the emotion value of the expression set according to the emotion values corresponding to the expressions;
a first determining module, configured to divide the text into at least one complex sentence according to preset punctuation marks, and determine the sentence pattern coefficient of each complex sentence according to the sentence pattern of each complex sentence;
a second determining module, configured to, for each complex sentence, extract the subordinate sentences that the complex sentence includes, and determine the inter-sentence relation coefficient of each subordinate sentence according to the inter-sentence relation between each subordinate sentence and the other subordinate sentences;
a third determining module, configured to, for each subordinate sentence, perform word segmentation on the subordinate sentence to obtain the segmented words that it includes, identify the emotion words among the segmented words, and determine the emotion value of each emotion word according to a pre-built lexicon;
a second computing module, configured to calculate the emotion value of the text according to the emotion value of each emotion word, the inter-sentence relation coefficient of each subordinate sentence, and the sentence pattern coefficient of each complex sentence;
a third computing module, configured to calculate the emotion value of the microblog content according to the emotion value of the expression set and the emotion value of the text.
7. The device according to claim 6, characterized in that the first computing module is specifically configured to:
calculate the average of the emotion values corresponding to all of the expressions, and take this average as the emotion value of the expression set.
8. The device according to claim 6, characterized in that the device further includes:
a processing module, configured to perform part-of-speech tagging on each segmented word;
an identification module, configured to identify the degree adverbs and negative adverbs before the emotion words;
the third determining module is further configured to determine the corrected emotion value of each emotion word according to the pre-built lexicon and the degree adverbs and negative adverbs before each emotion word.
9. The device according to claim 8, characterized in that the second computing module includes:
a first calculation submodule, configured to calculate, according to the determined emotion value of each emotion word in each subordinate sentence, the sum of the emotion values of the emotion words that each subordinate sentence includes, as the word emotion value of that subordinate sentence;
a second calculation submodule, configured to calculate the product of the word emotion value of each subordinate sentence and the inter-sentence relation coefficient of that subordinate sentence, as the emotion value of that subordinate sentence;
a third calculation submodule, configured to calculate the product of the sum of the emotion values of the subordinate sentences that each complex sentence includes and the sentence pattern coefficient of that complex sentence, as the emotion value of that complex sentence;
a fourth calculation submodule, configured to calculate the sum of the emotion values of the complex sentences, as the emotion value of the text.
10. The device according to any one of claims 6-9, characterized in that the third computing module is specifically configured to:
multiply the emotion value of the expression set by a predetermined first weight, multiply the emotion value of the text by a predetermined second weight, and add the two results to obtain the emotion value of the microblog content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610181735.8A CN105843796A (en) | 2016-03-28 | 2016-03-28 | Microblog emotional tendency analysis method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105843796A (en) | 2016-08-10 |
Family
ID=56584525
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201610181735.8A | Pending | CN105843796A (en) | 2016-03-28 | 2016-03-28 | Microblog emotional tendency analysis method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105843796A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102663046A (en) * | 2012-03-29 | 2012-09-12 | 中国科学院自动化研究所 | Sentiment analysis method oriented to micro-blog short text |
CN103699626A (en) * | 2013-12-20 | 2014-04-02 | 华南理工大学 | Method and system for analysing individual emotion tendency of microblog user |
CN105224640A (en) * | 2015-09-25 | 2016-01-06 | 杭州朗和科技有限公司 | A kind of method and apparatus extracting viewpoint |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106326481A (en) * | 2016-08-31 | 2017-01-11 | 中译语通科技(北京)有限公司 | Detection method of Weibo hot topics based on suddenness |
CN106202584A (en) * | 2016-09-20 | 2016-12-07 | 北京工业大学 | A kind of microblog emotional based on standard dictionary and semantic rule analyzes method |
CN106503220A (en) * | 2016-10-28 | 2017-03-15 | 上海大学 | A kind of microblogging emoticon affection computation method based on a mutual information |
CN107229612B (en) * | 2017-05-24 | 2021-01-08 | 重庆电信系统集成有限公司 | Network information semantic tendency analysis method and system |
CN107229612A (en) * | 2017-05-24 | 2017-10-03 | 重庆誉存大数据科技有限公司 | A kind of network information semantic tendency analysis method and system |
CN108153831A (en) * | 2017-12-13 | 2018-06-12 | 北京小米移动软件有限公司 | Music adding method and device |
CN108197104A (en) * | 2017-12-27 | 2018-06-22 | 浙江力石科技股份有限公司 | Text analyzing method, apparatus and cloud platform |
CN108228573A (en) * | 2018-03-23 | 2018-06-29 | 北京航空航天大学 | Text emotion analysis method, device and electronic equipment |
CN108647257A (en) * | 2018-04-24 | 2018-10-12 | 北京科技大学 | A kind of microblog emotional determines method |
CN109145306A (en) * | 2018-09-11 | 2019-01-04 | 刘瑞军 | The three-dimensional expression generation method of text-driven |
CN109598402A (en) * | 2018-10-23 | 2019-04-09 | 平安科技(深圳)有限公司 | Data report generation method, device, computer equipment based on data mining |
CN109471928A (en) * | 2018-10-31 | 2019-03-15 | 北京国信云服科技有限公司 | A kind of associated entity Judgment by emotion method based on diffusive transport model |
CN109471928B (en) * | 2018-10-31 | 2021-09-28 | 北京国信云服科技有限公司 | Associated entity emotion judgment method based on diffusion propagation model |
CN113378578A (en) * | 2021-05-08 | 2021-09-10 | 重庆航天信息有限公司 | Food and medicine public opinion analysis method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160810 |