CN103699626A - Method and system for analysing individual emotion tendency of microblog user - Google Patents

Info

Publication number
CN103699626A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310711626.9A
Other languages
Chinese (zh)
Other versions
CN103699626B (en
Inventor
王伟凝
刘剑聪
韦岗
王励
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Application filed by South China University of Technology SCUT
Priority to CN201310711626.9A
Publication of CN103699626A
Application granted
Publication of CN103699626B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Abstract

The invention discloses a method for analysing individual emotion tendency of microblog user. The method comprises the following steps of: acquiring data, separating words, loading a word bank and an emoticon bank, establishing the interested topic bank of the user, dividing short sentences, extracting emotion elements, establishing the individual locution list of the user, calculating a locution emotion value, calculating the topic emotion tendency of the user, and calculating the overall emotion tendency of the user. The invention further discloses a system for analysing individual emotion tendency of microblog user. According to the method and the system disclosed by the invention, an emotion analysis on the single microblog user is realized, and the emotion analysis on the user is combined with specific topics to avoid an indistinct and stiff analysis mode, thus the emotion analysis on the user is more meticulous and directional, and the accuracy of the emotion tendency analysis is improved.

Description

Method and system for analysing the individual emotion tendency of a microblog user
Technical field
The present invention relates to the field of microblog data processing, and in particular to a method and system for analysing the individual emotion tendency of a microblog user.
Background
A microblog is a free and casual platform whose messages are short and quickly published. Users often use it to post their subjective feelings about events and objects of comment, and to share their values, opinions and emotions with others. Microblog messages contain many emotion words and therefore carry rich emotion information. Because the environment is free and relaxed, the emotional expressions in a user's microblog data can reflect the user's emotion tendency more deeply and accurately.
Current sentiment analysis research on Chinese microblogs mainly targets a particular event or theme: all related microblog texts are analysed, emotion elements are extracted, statistics are computed, and the emotion information of the microblogs is classified, labelled and predicted. This work has achieved some results. However, it mainly concerns the sentiment of microblog messages or the mood of a user group; analysis of the emotion tendency of a single microblog user has not yet been studied in depth. Nor has the analysis of emotion tendency been refined to each specific aspect of social life. As a result, sentiment analysis is poorly targeted, and the accuracy and comprehensiveness of analysis and prediction need further improvement.
Emotional expression on microblogs is personalized, so a user's individual characteristics must be introduced to obtain more accurate analysis results. Hot topics on microblogs change quickly, whereas the set of active users is relatively stable. The course of an event is influenced by all the microblog users taking part, and a user's emotion model is relatively stable. By analysing users' individual emotions, the emotion of microblog messages can be labelled more precisely and finely, and the development and change of events can be predicted. The user emotion analysis information, once established, can be used over the long term and becomes more accurate as data accumulates.
Individual emotion analysis of microblog users makes it possible to judge how much they like or dislike hot issues, particular statements, particular objects or products, and to mine the commercial and social value therein. It has broad application prospects, such as: 1) public opinion monitoring, trend analysis and prediction of hot topics, and sentiment analysis of social groups; 2) trend analysis and prediction for stock markets, epidemics, elections and the like; 3) user behaviour analysis based on big data, such as consumption tendencies and user preferences. Research on methods for analysing the individual emotion tendency of microblog users therefore has significant academic value and social significance.
Summary of the invention
To overcome the above shortcomings and deficiencies of the prior art, an object of the present invention is to provide a method for analysing the individual emotion tendency of a microblog user, which realises sentiment analysis of a single microblog user and makes the analysis finer and more targeted.
Another object of the present invention is to provide a system for analysing the individual emotion tendency of a microblog user.
The objects of the present invention are achieved through the following technical solutions:
A method for analysing the individual emotion tendency of a microblog user, comprising the following steps:
(1) collecting all data from each user's microblog homepage and storing it in a database;
(2) performing word segmentation on the text data in the microblog data collected in step (1), to obtain a set of segmented words with part-of-speech tags;
(3) loading the required dictionaries and the emoticon library; the dictionaries comprise the HowNet emotion word dictionary, a degree word dictionary, a negation word dictionary, a personal pronoun dictionary, a function word and conjunction dictionary, a network slang dictionary and a classified lexicon;
(4) building the user's topic-of-interest library with a word-frequency-based inter-layer upward aggregation algorithm:
(4-1) building a topic tree: filtering out the words without topic meaning from the user's text data to obtain the words that carry obvious topic information, then using the classified lexicon and word-frequency statistics to build the topic tree; the topic tree is hierarchical: the first layer holds first-level topic categories, the second layer holds second-level sub-topic categories, and the third layer holds third-level sub-topic categories; the words without topic meaning comprise degree words, negation words, personal pronouns, function words, conjunctions and adjectives;
(4-2) extracting high-frequency topics layer by layer from the topic tree with the word-frequency-based inter-layer upward aggregation algorithm;
(4-3) establishing a separate main branch to hold topic words that cannot be assigned to any parent-layer topic, as well as distinctive topic words or phrases popular on the network, to obtain a common topic library; matching the words in the microblog data against the words in the common topic library, and extracting the topic words whose number of occurrences in the user's microblog data exceeds a threshold, also as high-frequency topics;
(4-4) taking the high-frequency topics obtained in steps (4-2) and (4-3) as the user's topic-of-interest words and building the user's topic-of-interest library;
(5) dividing the microblog data collected in step (1) into short sentences, ensuring that each short sentence contains at most one topic-of-interest word;
(6) extracting the emotion elements in each short sentence and calculating the initial emotion value of each short sentence:
(6-1) matching the word set of the short sentence against each dictionary and the emoticon library, and labelling each kind of emotion element; the emotion elements comprise emotion words, degree words, negation words, punctuation marks and emoticons, wherein degree words and punctuation marks adjust the degree of an emotion word, and negation words adjust its polarity;
(6-2) calculating the emotion value of the text in the short sentence:
The weights of the emotion elements are set as follows: a positive emotion word has weight "+1" and a negative emotion word has weight "-1"; a negation word has weight "-1"; degree words and punctuation marks are assigned weights between 0 and 3 according to their strength; degree words and punctuation marks follow a proximity principle, each one affecting the emotion degree of the emotion word nearest to it;
The emotion value $I_{words}$ of the text in the short sentence is calculated as:
$$I_{words} = b \cdot \sum_{i=1}^{m} \left( \sum_{j=1}^{n} c_{ij} \cdot f_{ij} \right) \cdot q_i$$
In the formula, $q_i$ is the $i$-th emotion word, $c_{ij}$ is the weight of the $j$-th degree word modifying $q_i$, and $f_{ij}$ is the weight of the $j$-th negation word modifying $q_i$; if $q_i$ has no degree word, $c_{ij}$ takes the default value 1; if $q_i$ has no negation word, $f_{ij}$ takes the default value 1; $n$ is the larger of the number of degree words modifying $q_i$ and the number of negation words modifying $q_i$; $m$ is the number of emotion words; $b$ is the weight of the punctuation mark; $i$ and $j$ are positive integers;
(6-3) calculating the emotion value of the emoticons in the short sentence:
The emoticons provided by the microblog operator are divided by their contribution to emotion tendency into positive, negative and neutral: a positive emoticon has weight "+1", a negative emoticon has weight "-1", and a neutral emoticon has weight "0";
The emoticon emotion value $I_{marks}$ of the short sentence is calculated as:
$$I_{marks} = \sum_{i=1}^{l} m_i$$
In the formula, $m_i$ is the weight of the $i$-th emoticon (positive, negative or neutral), $i$ is a positive integer, and $l$ is the number of emoticons;
(6-4) calculating the initial emotion value $I_0$ of the short sentence:
$$I_0 = I_{words} + I_{marks}$$
(7) for the text data processed in step (2), extracting high-frequency word combinations with a word sliding-window method to obtain the user's personalized idiom list;
(8) performing statistical analysis on the initial emotion values of all short sentences containing each idiom to obtain the emotion value of the idiom:
For each idiom, all short sentences containing it are found and their initial emotion values are averaged:
$$I_g = \frac{1}{p} \sum_{i=1}^{p} I_{0i}$$
In the formula, $I_{0i}$ is the initial emotion value of the $i$-th short sentence containing the idiom, $p$ is the number of short sentences containing the idiom, and $I_g$ is the initial emotion value of the idiom;
The value of $I_g$ is mapped into $[-3, 3]$ to obtain the idiom's emotion value $I'_g$, which is recorded in the user's personalized idiom emotion table;
(9) calculating the personalized emotion value of each short sentence:
$$I = I_0 - \sum_{i=1}^{m'} \left( \sum_{j=1}^{n'} c_{gij} \cdot f_{gij} \right) \cdot q_{gi} + \sum_{k=1}^{r} I'_{gk}$$
In the formula, $I_0$ is the initial emotion value of the short sentence, $q_{gi}$ is the $i$-th word, $c_{gij}$ is the weight of the $j$-th degree word modifying $q_{gi}$, and $f_{gij}$ is the weight of the $j$-th negation word modifying $q_{gi}$; if $q_{gi}$ has no degree word, $c_{gij}$ takes the default value 1; if $q_{gi}$ has no negation word, $f_{gij}$ takes the default value 1; $n'$ is the larger of the number of degree words modifying $q_{gi}$ and the number of negation words modifying $q_{gi}$; $m'$ is the number of words; $i$ and $j$ are positive integers; $I'_{gk}$ is the emotion value of the $k$-th idiom, and $r$ is the number of idioms in the short sentence;
(10) calculating the emotion tendency of the user's topics of interest:
For any topic-of-interest word in the user's topic-of-interest library, its emotion value is calculated as:
$$I_{topic_i} = \frac{1}{w} \sum_{j=1}^{w} I_j$$
In the formula, $I_j$ is the personalized emotion value of the $j$-th short sentence containing this topic-of-interest word, $w$ is the total number of short sentences containing it, and $I_{topic_i}$ is the emotion value of this topic-of-interest word. The value of $I_{topic_i}$ is mapped into $[-3, 3]$ to obtain the user's final topic emotion tendency value; these values are used to build the user's personalized microblog topic emotion value list.
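Steps (8) to (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how values are "mapped into" $[-3, 3]$, so simple clipping is assumed here, and the subtracted term in step (9) is read as the generic, dictionary-based contribution of the words that make up the idioms (passed in as a precomputed number). All function names are hypothetical.

```python
def clip(value, low=-3.0, high=3.0):
    """Assumed mapping into [-3, 3]: plain clipping (not specified by the source)."""
    return max(low, min(high, value))

def idiom_emotion_value(initial_values):
    """Step (8): I'_g is the mean I_0 of the short sentences containing the idiom, mapped into [-3, 3]."""
    return clip(sum(initial_values) / len(initial_values))

def personalized_value(i0, idiom_word_contribution, idiom_values):
    """Step (9): I = I_0 - (generic contribution of the idiom words) + sum of I'_gk."""
    return i0 - idiom_word_contribution + sum(idiom_values)

def topic_emotion_value(personalized_values):
    """Step (10): mean personalized value of the short sentences containing the topic word."""
    return clip(sum(personalized_values) / len(personalized_values))
```

For example, an idiom whose containing short sentences have initial values 1, -2 and 4 gets the emotion value 1; a short sentence with $I_0 = 2$, a generic idiom-word contribution of 1 and one idiom valued -2 gets the personalized value -1.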
After step (10), the following steps may also be performed:
(a) repeating steps (1) to (7); for an idiom newly added to the user's personalized idiom list, its emotion value $I'_g$ is calculated by step (8); for an idiom already recorded in the user's personalized idiom list before this cycle, $I'_g$ is updated as follows: step (8) is performed first to obtain this cycle's calculated value $I_{g\_new}$ of $I_g$; letting $I_{g\_prev}$ be the calculated value of $I_g$ obtained in the previous cycle, $I_g$ is updated to:
$$I_g = \omega_1 I_{g\_prev} + \omega_2 I_{g\_new}$$
In the formula, $\omega_1$ is the weight of $I_{g\_prev}$ and $\omega_2$ is the weight of $I_{g\_new}$;
The value of $I_g$ is mapped into $[-3, 3]$ to obtain the idiom's emotion value $I'_g$;
(b) performing steps (9) to (10) with the $I'_g$ obtained in step (a).
After step (10), the following steps may also be performed:
Steps (1) to (9) are repeated; for a topic-of-interest word newly added to the user's topic-of-interest library in this cycle, the user's topic emotion tendency is calculated by the method of step (10); for a topic-of-interest word already recorded in the library before this cycle, the user's topic emotion tendency $I_{topic}$ is updated as follows: step (10) is performed first to obtain this cycle's calculated value $I_{topic\_new}$ of $I_{topic}$; letting $I_{topic\_prev}$ be the calculated value of $I_{topic}$ obtained in the previous cycle, $I_{topic}$ is updated to:
$$I_{topic} = \omega'_1 I_{topic\_prev} + \omega'_2 I_{topic\_new}$$
where $\omega'_1$ is the weight of $I_{topic\_prev}$ and $\omega'_2$ is the weight of $I_{topic\_new}$.
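Both update rules above have the same shape, a weighted blend of the previous cycle's value and the current cycle's value. A minimal sketch follows; the function name is hypothetical, and the equal default weights of 0.5 are an assumption, since the text does not give values for the weights.

```python
def weighted_update(prev_value, new_value, w_prev=0.5, w_new=0.5):
    """Blend the previous cycle's value with the current cycle's value.

    w_prev and w_new play the role of the two weights in the update formulas;
    the 0.5 defaults are illustrative only.
    """
    return w_prev * prev_value + w_new * new_value
```

With these defaults, an idiom valued 2 in the previous cycle and -1 in the current one is updated to 0.5. Choosing a larger weight for the previous value makes the stored emotion value change more slowly as new microblogs arrive.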
After step (10), the following step is also performed:
The user's overall emotion tendency value is calculated by the following formula:
$$I_{user} = \frac{1}{s} \sum_{i=1}^{s} I_i$$
In the formula, $I_i$ is the emotion value of the $i$-th short sentence and $s$ is the total number of short sentences.
The collecting of all data from each user's microblog homepage in step (1) is specifically:
Taking the user as the unit, all data on the user's homepage is collected; the data comprises the microblogs that the user has posted and forwarded under his or her microblog name, the data under "comments sent", the data under "mentions of me", the microblog name and the followed users, the self-introduction, and the titles of the web pages and videos linked by the URLs in the user's original or forwarded microblogs.
The layer-by-layer extraction of high-frequency topics from the topic tree in step (4-2), using the word-frequency-based inter-layer upward aggregation algorithm, is specifically:
For each topic word, if its number of occurrences is higher than the set threshold, the topic is a high-frequency topic; otherwise, its number of occurrences is passed up to its parent-layer topic word. High-frequency topics are calculated and extracted layer by layer in this way. The number of occurrences of a topic word includes the occurrences of the topic itself and the occurrences of its sub-topics.
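The upward aggregation can be illustrated with a short recursive sketch. It reflects one reading of the rule above: a topic whose aggregated count exceeds the threshold is extracted and passes nothing further up, while a topic below the threshold hands its count to its parent. The tree representation, the virtual `root` node and all names are assumptions made for the example.

```python
def extract_high_frequency_topics(tree, counts, root, threshold):
    """tree: dict mapping a topic to its list of sub-topics;
    counts: dict mapping a topic to its own number of occurrences;
    root: virtual root node that is never extracted itself."""
    high = []

    def visit(node):
        # Own occurrences plus whatever the sub-topics pass upward.
        total = counts.get(node, 0) + sum(visit(child) for child in tree.get(node, []))
        if node != root and total > threshold:
            high.append(node)
            return 0      # extracted: nothing is passed to the parent
        return total      # below threshold: pass the count to the parent

    visit(root)
    return high

tree = {"root": ["sports", "finance"],
        "sports": ["football", "basketball"],
        "finance": ["stocks"]}
counts = {"football": 12, "basketball": 3, "stocks": 4, "sports": 2, "finance": 1}
print(extract_high_frequency_topics(tree, counts, "root", threshold=4))
# ['football', 'sports', 'finance']
```

Here "football" is frequent enough on its own, whereas "basketball" and "stocks" only contribute to their parents, which then exceed the threshold through aggregation.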
The dividing of the data collected in step (1) into short sentences in step (5) is specifically:
Each microblog is analysed; if it contains no topic word or relates to only one topic, the whole microblog is taken as one short sentence;
If a microblog contains several topic words, it is analysed in combination with punctuation priority: if there is a punctuation mark between the two topic words with the greatest distance, the microblog is split into two short sentences at that punctuation mark; if there is no punctuation mark between them, the pair of topic words with the next greatest distance is checked; if no pair has a punctuation mark between it, the microblog is not split and is taken as one short sentence. The distance is the number of tree branches that must be traversed in the topic tree to get from one topic word to the other. If there are several punctuation marks between two topic words, the split is made at the one with the highest priority, the priority order being: full stop > semicolon > comma.
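The splitting rule can be sketched as follows, assuming a microblog is already a list of tokens and each topic word is known by its path from the root of the topic tree. The data layout, the function names and the numeric priorities are assumptions for illustration; the sketch performs a single split, as described above, and could be applied again to each part if more than one topic word remains.

```python
from itertools import combinations

# Higher number = higher priority: full stop > semicolon > comma.
PUNCT_PRIORITY = {"。": 3, ".": 3, "；": 2, ";": 2, "，": 1, ",": 1}

def tree_distance(path_a, path_b):
    """Number of branches between two topic words, given their root-to-node paths."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return len(path_a) + len(path_b) - 2 * common

def split_microblog(tokens, topic_paths):
    """tokens: list of words and punctuation; topic_paths: dict topic word -> path tuple."""
    positions = [i for i, tok in enumerate(tokens) if tok in topic_paths]
    if len(positions) < 2:
        return [tokens]                      # zero or one topic: keep whole
    pairs = sorted(
        combinations(positions, 2),
        key=lambda p: tree_distance(topic_paths[tokens[p[0]]], topic_paths[tokens[p[1]]]),
        reverse=True,                        # most distant pair first
    )
    for left, right in pairs:
        marks = [i for i in range(left + 1, right) if tokens[i] in PUNCT_PRIORITY]
        if marks:
            cut = max(marks, key=lambda i: PUNCT_PRIORITY[tokens[i]])
            return [tokens[:cut + 1], tokens[cut + 1:]]
    return [tokens]                          # no punctuation between any pair
```

Representing each topic word by its path makes the tree distance a simple count of the non-shared path segments, so no explicit tree traversal is needed.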
The extraction of high-frequency word combinations from the text data processed in step (2), as described in step (7), is specifically:
Let the window length of the sliding window be W, where W is the number of words the window covers and takes the values 1, 2, 3 and 4 in turn. The word sliding window is used to count, over all short sentences, the total number of occurrences of each word or phrase, and the words or phrases whose count exceeds the threshold are entered into the user idiom list. Everyday expressions are collected to build an everyday-expression phrase library; these everyday expressions are then removed from the user idiom list to obtain the final user idiom list.
A system for analysing the individual emotion tendency of a microblog user, which implements the above analysis method, comprises:
a data acquisition module, for collecting all data from each user's microblog homepage and storing it in a database;
a word segmentation module, for performing word segmentation on the text data in the collected data, to obtain a set of segmented words with part-of-speech tags;
a dictionary loading module, for loading the required dictionaries and the emoticon library; the dictionaries comprise the HowNet emotion word dictionary, a degree word dictionary, a negation word dictionary, a personal pronoun dictionary, a function word and conjunction dictionary, a network slang dictionary and a classified lexicon;
a topic-of-interest library building module, for building the user's topic-of-interest library with the word-frequency-based inter-layer upward aggregation algorithm;
a short sentence dividing module, for dividing the data collected by the data acquisition module into short sentences, ensuring that each short sentence contains at most one topic-of-interest word;
an emotion element extraction module, for extracting the emotion elements in each short sentence and calculating the initial emotion value of each short sentence;
a personalized idiom list building module, for extracting high-frequency word combinations from the data collected by the data acquisition module with the word sliding-window method, to obtain the user's personalized idiom list;
an idiom emotion value calculation module, for performing statistical analysis on the emotion values of all short sentences containing each idiom, to obtain the emotion value of the idiom;
a short sentence emotion value calculation module, for calculating the personalized emotion value of each short sentence;
a topic emotion tendency calculation module, for calculating the emotion tendency of each of the user's topics of interest.
The system further comprises an overall emotion tendency calculation module, for calculating the user's overall emotion tendency.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
(1) The present invention realises sentiment analysis of a single microblog user and combines it with specific topics, avoiding a vague and rigid mode of analysis and making the analysis of the user's emotion finer and more targeted. It takes the user's habitual way of expression into account by using the user's personalized idioms as one of the elements of sentiment analysis, which helps improve the accuracy of emotion tendency analysis.
(2) The present invention performs sentiment analysis on a user's own microblog data, which can help users understand their own and other users' preferences more fully. When a new hot topic appears on the network, the results obtained by this method can be used to derive the user's degree of interest and emotion tendency, and so to predict the user's statements and reactions quickly. Product developers, operators and advertisers can use this method to find users interested in the goods or services being promoted: it helps product developers provide goods that better meet demand, helps operators provide more considerate and user-friendly services, and helps advertisers target advertisements at users. It can also provide market demand references for many industries and contributes to public opinion monitoring.
Brief description of the drawings
Fig. 1 is a flow chart of the method for analysing the individual emotion tendency of a microblog user according to an embodiment of the present invention.
Fig. 2 is a detailed flow chart of the data acquisition and preprocessing process of the method according to an embodiment of the present invention.
Fig. 3 is a detailed flow chart of building the user's topic-of-interest library according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a topic tree built from one user's microblog data in an embodiment of the present invention.
Fig. 5 is a detailed flow chart of extracting high-frequency topics layer by layer from the topic tree in an embodiment of the present invention.
Fig. 6 is a schematic diagram of example emoticon weight settings according to an embodiment of the present invention.
Fig. 7 is an example of extracting word combinations with the word sliding window according to an embodiment of the present invention (W = 1).
Fig. 8 is an example of extracting word combinations with the word sliding window according to an embodiment of the present invention (W = 3).
Fig. 9 is a flow chart of the idiom emotion value calculation according to an embodiment of the present invention.
Fig. 10 is a detailed flow chart of steps (7) to (8) of the method according to an embodiment of the present invention.
Fig. 11 is a detailed flow chart of steps (5) to (11) of the method according to an embodiment of the present invention.
Fig. 12 is a structural schematic diagram of the system for analysing the individual emotion tendency of a microblog user according to an embodiment of the present invention.
Detailed description
The present invention is described in further detail below with reference to an embodiment, but the implementations of the present invention are not limited to it.
Embodiment
As shown in Fig. 1, the method for analysing the individual emotion tendency of a microblog user according to this embodiment comprises the following steps:
(1) collecting all microblog data from each user's microblog homepage and storing it in a database:
Taking the user as the unit, all data on the user's homepage is collected; the data comprises the microblogs that the user has posted and forwarded under his or her microblog name, the data under "comments sent", the data under "mentions of me", the microblog name and the followed users, the self-introduction, and the titles of the web pages and videos linked by the URLs in the user's original or forwarded microblogs.
(2) The text data in the microblog data collected in step (1) is read one microblog at a time and segmented with the segmentation method of the Chinese lexical analysis system ICTCLAS, to obtain a set of segmented words and the corresponding part-of-speech tags.
(3) loading the required dictionaries and the emoticon library. The dictionaries comprise the HowNet emotion word dictionary, a degree word dictionary, a negation word dictionary, a personal pronoun dictionary, a function word and conjunction dictionary, a network slang dictionary and a classified lexicon. Each word in the HowNet emotion word dictionary, the degree word dictionary and the negation word dictionary carries a weight. Only the Chinese words of the HowNet emotion word dictionary are used: its positive emotion words and positive evaluation words are classed as positive emotion words, and its negative emotion words and negative evaluation words are classed as negative emotion words. For degree words and negation words, self-built dictionaries are used: the degree word dictionary contains 219 degree words and the negation word dictionary contains 48 negation words. The classified lexicon is an improved version of the classified lexicon of the QQ input method and has a hierarchical tree structure. Its first layer holds the broadest categories, including games, academic disciplines, hobbies, sports and entertainment, culture and arts, regions and so on. The second layer subdivides the first; for example, "academic disciplines" is divided into science, engineering, agriculture and medicine; social sciences and economics; education and military affairs; and so on. The third layer subdivides the second; for example, social sciences and economics is divided into stocks and funds, law, commodities, finance and so on. The whole classified lexicon contains 6223 lowest-level dictionaries that are not subdivided further. The emoticon library is obtained by collecting the emoticons of the microblog platform and assigning weights to them.
Steps (1) to (3) constitute the data acquisition and preprocessing process; the detailed flow chart is shown in Fig. 2.
(4) building the user's topic-of-interest library with the word-frequency-based inter-layer upward aggregation algorithm, as shown in Fig. 3, with the following steps:
(4-1) building a topic tree: filtering out the words without topic meaning from the user's text data to obtain the words that carry obvious topic information, then using the classified lexicon and word-frequency statistics to build the topic tree; the topic tree is hierarchical: the first layer holds first-level topic categories, the second layer holds second-level sub-topic categories, and the third layer holds third-level sub-topic categories; the words without topic meaning comprise degree words, negation words, personal pronouns, function words, conjunctions and adjectives;
Fig. 4 shows the topic tree built from one user's microblog data. The number in each box is the number of occurrences of that topic word together with its sub-topic words.
(4-2) extracting high-frequency topics layer by layer from the topic tree with the word-frequency-based inter-layer upward aggregation algorithm: for each topic word, if its number of occurrences is higher than the set threshold, the topic is a high-frequency topic; otherwise, its number of occurrences is passed up to its parent-layer topic word; high-frequency topics are calculated and extracted layer by layer in this way; the number of occurrences of a topic word includes the occurrences of the topic itself and the occurrences of its sub-topics;
An example run is shown in Fig. 5, in which the extracted high-frequency topics are marked with dashed boxes; four high-frequency topics are extracted in total.
(4-3) establishing a separate main branch to hold topic words that cannot be assigned to any parent-layer topic, as well as distinctive topic words or phrases popular on the network, to obtain a common topic library; this branch has no layered structure; matching the words in the microblog data against the words in the common topic library, and extracting the topic words whose number of occurrences in the user's microblog data exceeds a threshold, also as high-frequency topics;
(4-4) taking the high-frequency topics obtained in steps (4-2) and (4-3) as the user's topic-of-interest words and building the user's topic-of-interest library.
(5) dividing the data collected in step (1) into short sentences, ensuring that each short sentence contains at most one topic-of-interest word;
The dividing is specifically as follows. Each microblog is analysed; if it contains no topic word or relates to only one topic, the whole microblog is taken as one short sentence. If a microblog contains several topic words, it is analysed in combination with punctuation priority: if there is a punctuation mark between the two topic words with the greatest distance, the microblog is split into two short sentences at that punctuation mark; if there is no punctuation mark between them, the pair of topic words with the next greatest distance is checked; if no pair has a punctuation mark between it, the microblog is not split and is taken as one short sentence. The distance is the number of tree branches that must be traversed in the topic tree to get from one topic word to the other. If there are several punctuation marks between two topic words, the split is made at the one with the highest priority, the priority order being: full stop > semicolon > comma.
(6) extracting the emotion elements in each short sentence and calculating the initial emotion value of each short sentence:
(6-1) matching the word set of the short sentence against each dictionary and the emoticon library, and labelling each kind of emotion element; the emotion elements comprise emotion words, degree words, negation words, punctuation marks and emoticons, wherein degree words and punctuation marks adjust the degree of an emotion word, and negation words adjust its polarity;
(6-2) calculating the emotion value of the text in the short sentence:
The weights of the emotion elements are set as follows: a positive emotion word has weight "+1" and a negative emotion word has weight "-1"; a negation word has weight "-1"; degree words and punctuation marks are assigned weights from 0 to 3 according to their strength; degree words and punctuation marks follow a proximity principle, each one affecting the emotion degree of the emotion word nearest to it;
Example weight settings are given in Tables 1 to 4:

Table 1: HowNet emotion words
| Positive word | Negative word |
| Love is doted on | Regret |
| Love and esteem | Annoyed |
| Good and sound | Dim |

Table 2: Weights of common degree words
| Degree word | Degree value |
| Too | 3 |
| Very | 2.5 |
| Very | 2 |

Table 3: Weights of common negation words
| Negation word | Weight |
| No | -1 |
| No | -1 |
| Non- | -1 |

Table 4: Punctuation mark weights
| Punctuation mark | Degree coefficient |
| . (full stop) | 1 |
| , (comma) | 1 |
| !!! | 2 |
|  | 1.5 |
| ... | ... |
The emotion value $I_{words}$ of the text in the short sentence is calculated as:
$$I_{words} = b \cdot \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} c_{ij} \cdot f_{ij} \Big) \cdot q_i$$
In the formula, $q_i$ is the i-th emotion word, $c_{ij}$ is the weight of the j-th degree word modifying $q_i$, and $f_{ij}$ is the weight of the j-th negation word modifying $q_i$. If $q_i$ has no attached degree word, $c_{ij}$ takes the default value 1; if $q_i$ has no attached negation word, $f_{ij}$ takes the default value 1. $n$ is the larger of the number of degree words modifying $q_i$ and the number of negation words modifying $q_i$, $m$ is the number of emotion words, $b$ is the weight corresponding to the punctuation mark, and $i$, $j$ are positive integers;
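A minimal sketch of the $I_{words}$ formula, under stated assumptions: the input format (one tuple per emotion word) and the function name `text_emotion_value` are hypothetical, and the sketch reads the "default value 1" rule as padding the shorter weight list with 1 and using n = 1 when an emotion word has no modifier at all.

```python
def text_emotion_value(emotion_words, punct_weight=1.0):
    """Compute I_words = b * sum_i (sum_j c_ij * f_ij) * q_i.

    emotion_words: list of (q, degree_weights, negation_weights), where
                   q is +1 or -1 and the two lists hold the weights of the
                   degree words and negation words modifying that emotion word
    punct_weight:  b, the weight of the punctuation mark
    """
    total = 0.0
    for q, degrees, negations in emotion_words:
        # n is the larger of the two counts; assume at least 1 so that an
        # unmodified emotion word still contributes its own weight.
        n = max(len(degrees), len(negations), 1)
        c = list(degrees) + [1.0] * (n - len(degrees))      # default c_ij = 1
        f = list(negations) + [1.0] * (n - len(negations))  # default f_ij = 1
        total += sum(cj * fj for cj, fj in zip(c, f)) * q
    return punct_weight * total
```

For instance, a positive emotion word (q = +1) preceded by a degree word of weight 2.5 and one negation word (-1), in a short sentence ending in "!!!" (b = 2), gives 2 × (2.5 × -1) × 1 = -5.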
(6-3) Calculate the emotion value of the emoticons in the short sentence:
The emoticons provided by the microblog operator are divided, according to their contribution to emotion tendency, into three cases, positive, negative and neutral: the weight of a positive emoticon is set to "+1", the weight of a negative emoticon is set to "-1", and the weight of a neutral emoticon is set to "0";
An example of the concrete emoticon weight settings is shown in Figure 6.
The emoticon emotion value $I_{marks}$ of the short sentence is calculated as:
$$I_{marks} = \sum_{i=1}^{l} m_i$$
In the formula, $m_i$ is the weight of the i-th emoticon (positive, negative or neutral), $i$ is a positive integer, and $l$ is the number of emoticons;
(6-4) Calculate the initial emotion value $I_0$ of the short sentence:
$$I_0 = I_{words} + I_{marks}$$
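Steps (6-3) and (6-4) can be sketched as below. The emoticon names and the `EMOTICON_WEIGHTS` table are invented for illustration (the actual settings are those of Figure 6), and treating an unknown emoticon as neutral is an assumption of this sketch.

```python
# Hypothetical emoticon weights: +1 positive, -1 negative, 0 neutral.
EMOTICON_WEIGHTS = {"[smile]": 1, "[angry]": -1, "[shrug]": 0}

def emoticon_emotion_value(emoticons, weights=EMOTICON_WEIGHTS):
    """I_marks: sum of the weights of the emoticons in one short sentence."""
    # Assumption: emoticons missing from the table count as neutral (0).
    return sum(weights.get(e, 0) for e in emoticons)

def initial_emotion_value(i_words, emoticons, weights=EMOTICON_WEIGHTS):
    """I_0 = I_words + I_marks."""
    return i_words + emoticon_emotion_value(emoticons, weights)
```

With a text value of -5 and the emoticons [smile], [smile], [angry], the emoticon value is +1 and the initial emotion value is -4.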
(7) On the text data processed in step (2), use the word sliding-window method to extract high-frequency word combinations and obtain the user's personalized idiom list;
Let the window length of the sliding window be W, where W is the number of words the window contains, and let W take the values 1, 2, 3 and 4 in turn. The word sliding window is used to count, over all short sentences, the total number of times each word or phrase occurs, and every word or phrase whose occurrence count exceeds a threshold is placed in the user's idiom list (different window lengths have different thresholds; the shorter the window, the higher the required threshold). When W = 1, only adjectives and modal particles are counted. Among phrases, the phrase with the largest number of words is the one placed in the idiom list. In addition, a self-built library of everyday expression phrases is used to reject everyday phrases that arise from Chinese grammar and carry no obvious emotional orientation. Specifically, common everyday Chinese collocations such as "I", "present" and "get up" are collected manually, and the idiom list is compared against them to remove everyday phrases with no obvious emotional orientation. This yields the user's personalized idiom list.
Fig. 7 is an example of extracting word combinations with the word sliding window when W = 1.
Fig. 8 is an example of extracting word combinations with the word sliding window when W = 3.
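The sliding-window count of step (7) might look like the sketch below. The names `extract_idioms`, `thresholds`, `everyday` and `keep_single` are hypothetical, the thresholds are made up, and the "keep the phrase with the most words" rule is interpreted here as dropping any frequent word or phrase that is contained in a longer frequent phrase.

```python
from collections import Counter

def contains(long_gram, short_gram):
    """True if short_gram occurs as a contiguous run inside long_gram."""
    n = len(short_gram)
    return any(long_gram[k:k + n] == short_gram
               for k in range(len(long_gram) - n + 1))

def extract_idioms(short_sentences, thresholds, everyday=frozenset(),
                   keep_single=lambda word: True):
    """short_sentences: list of word lists (already segmented).
    thresholds:  dict window length W -> minimum count (exclusive)
    everyday:    set of word tuples to reject (everyday expression library)
    keep_single: part-of-speech filter for W = 1 (adjectives, modal
                 particles); the default keeps every word
    """
    counts = Counter()
    for words in short_sentences:
        for w in (1, 2, 3, 4):                   # window lengths W
            for i in range(len(words) - w + 1):  # slide the window
                gram = tuple(words[i:i + w])
                if w == 1 and not keep_single(gram[0]):
                    continue
                counts[gram] += 1
    frequent = {g for g, c in counts.items()
                if c > thresholds[len(g)] and g not in everyday}
    # Keep only the longest phrase when a shorter one is contained in it.
    return sorted(g for g in frequent
                  if not any(len(h) > len(g) and contains(h, g)
                             for h in frequent))
```

Shorter windows get higher thresholds in the text, so a caller would pass something like `{1: 3, 2: 2, 3: 2, 4: 2}`; those particular numbers are placeholders.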
(8) Perform a statistical analysis of the initial emotion values of all short sentences that contain each idiom, and calculate the emotion value of each idiom, as shown in Figure 9. The detailed procedure is as follows:
For each idiom, find all short sentences that contain it and average their initial emotion values:
$$I_g = \frac{1}{p} \sum_{i=1}^{p} I_{0i}$$
In the formula, $I_{0i}$ is the initial emotion value of the i-th short sentence that contains this idiom, $p$ is the number of short sentences that contain this idiom, and $I_g$ is the initial emotion value of this idiom;
In theory the value of $I_g$ is unbounded, but most of its values lie in the neighborhood of 10. Therefore the value of $I_g$ (the x in the formula) is mapped into [-3, 3] according to formula (1), giving the emotion value $I'_g$ of the idiom (the y in the formula), which is recorded in this user's personalized idiom emotion label table;
$$y = \begin{cases} 0.42\sqrt{x}, & 0 \le x < 50 \\ -0.42\sqrt{-x}, & -50 < x < 0 \\ 3, & x \ge 50 \\ -3, & x \le -50 \end{cases} \qquad (1)$$
(With the square-root form, 0.42·√50 ≈ 2.97, so the curve meets the saturation values ±3 at |x| = 50.)
The detailed flowchart of steps (7) to (8) is shown in Figure 10.
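Step (8) can be sketched as an average followed by the range mapping of formula (1). One caveat: the extracted formula text reads "0.42 x", which cannot land in the stated range [-3, 3]; this sketch assumes the square-root form 0.42·√|x| (0.42·√50 ≈ 2.97), with saturation at ±3 for |x| ≥ 50. The function names are hypothetical.

```python
import math

def map_to_range(x):
    """Map an unbounded emotion value into [-3, 3] (assumed form of formula (1))."""
    if x >= 50:
        return 3.0
    if x <= -50:
        return -3.0
    # Assumed square-root compression, sign-preserving.
    return math.copysign(0.42 * math.sqrt(abs(x)), x)

def idiom_emotion_value(initial_values):
    """I'_g: mean of the initial emotion values I_0i of the short sentences
    containing the idiom, mapped into [-3, 3]."""
    i_g = sum(initial_values) / len(initial_values)
    return map_to_range(i_g)
```

An idiom that appears in three short sentences with initial values 4, 0 and 8 has $I_g$ = 4 and, under the assumed mapping, $I'_g$ = 0.42 × 2 = 0.84.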
(9) Calculate the personalized emotion value of each short sentence. The method is: from the initial emotion value, subtract the emotion-element values calculated for every word that is an idiom or is contained in an idiom, then add the emotion values of the idioms calculated from the user's personalized idiom list, which gives the personalized emotion value of the short sentence, that is:
$$I = I_0 - \sum_{i=1}^{m'} \Big( \sum_{j=1}^{n'} c_{gij} \cdot f_{gij} \Big) \cdot q_{gi} + \sum_{k=1}^{r} I'_{gk}$$
In the formula, $I_0$ is the previously calculated initial emotion value of the short sentence; $c_{gij}$, $f_{gij}$ and $q_{gi}$ are the corresponding values for the words in the idioms: $q_{gi}$ is the i-th word, $c_{gij}$ is the weight of the j-th degree word modifying $q_{gi}$, and $f_{gij}$ is the weight of the j-th negation word modifying $q_{gi}$. If $q_{gi}$ has no attached degree word, $c_{gij}$ takes the default value 1; if $q_{gi}$ has no attached negation word, $f_{gij}$ takes the default value 1. $n'$ is the larger of the number of degree words modifying $q_{gi}$ and the number of negation words modifying $q_{gi}$, $m'$ is the number of words, and $i$, $j$ are positive integers. $I'_{gk}$ is the emotion value of the k-th idiom, and $r$ is the number of idioms in this short sentence.
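The correction in step (9) amounts to swapping the lexicon-based contribution of idiom words for the user's own idiom values. A small sketch, with hypothetical names and made-up numbers:

```python
def personalised_value(i0, idiom_word_terms, idiom_values):
    """I = I_0 - sum_i (sum_j c_gij * f_gij) * q_gi + sum_k I'_gk.

    i0:               initial emotion value of the short sentence
    idiom_word_terms: contributions (sum_j c_gij * f_gij) * q_gi already
                      counted in I_0 for words that belong to an idiom
    idiom_values:     I'_gk for each idiom found in the short sentence
    """
    return i0 - sum(idiom_word_terms) + sum(idiom_values)
```

Suppose a short sentence scored $I_0$ = -2.5 solely because of a negative emotion word inside an idiom that this user habitually uses in a mildly positive way, with $I'_g$ = 1.2. Removing the lexicon term (-2.5) and adding the idiom value gives a personalized value of 1.2.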
(10) Calculate the emotion tendency of the user's topics of interest:
For any topic-of-interest word in the user topic-of-interest library, its emotion value is calculated as follows:
$$I_{topic_i} = \frac{1}{w} \sum_{j=1}^{w} I_j$$
In the formula, $I_j$ is the personalized emotion value of the j-th short sentence that contains this topic-of-interest word, $w$ is the total number of short sentences that contain this topic-of-interest word, and $I_{topic_i}$ is the emotion value of this topic-of-interest word;
According to formula (1), the value of $I_{topic_i}$ (the x in the formula) is mapped into [-3, 3], giving the final emotion tendency value of the user's topic of interest (the y in the formula). These values are used to build the user's personalized microblog topic emotion value list, as shown in Table 5.
Table 5. User's personalized microblog topic emotion value list
Topic | Emotion value
NBA basketball player | 3
Digital communication brand | 2.5
Register | -3
... | ...
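Building a topic emotion list like Table 5 can be sketched as a group-and-average over short sentences. The input format and function names are hypothetical, and `map_to_range` again assumes the square-root form of formula (1), since the extracted text of that formula is ambiguous.

```python
import math
from collections import defaultdict

def map_to_range(x):
    """Assumed form of formula (1): 0.42*sqrt(|x|), saturating at +/-3."""
    if x >= 50:
        return 3.0
    if x <= -50:
        return -3.0
    return math.copysign(0.42 * math.sqrt(abs(x)), x)

def topic_emotion_table(short_sentences):
    """short_sentences: iterable of (topic_word_or_None, personalised_value).

    Each short sentence carries at most one topic-of-interest word, so a
    single optional topic per sentence is enough.
    """
    by_topic = defaultdict(list)
    for topic, value in short_sentences:
        if topic is not None:          # sentences without a topic are skipped
            by_topic[topic].append(value)
    # I_topic = mean of the personalised values, then mapped into [-3, 3].
    return {t: map_to_range(sum(v) / len(v)) for t, v in by_topic.items()}
```

A dictionary keyed by topic word keeps the result directly comparable with the two-column layout of the topic emotion value list.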
After step (10), the user's overall emotion tendency may also be calculated by performing the following step:
(11) Calculate the user's overall emotion tendency:
The user's overall emotion tendency value is calculated with the following formula:
$$I_{user} = \frac{1}{s} \sum_{i=1}^{s} I_i$$
In the formula, $I_i$ is the emotion value of the i-th short sentence and $s$ is the total number of short sentences.
The detailed flowchart of steps (5) to (11) is shown in Figure 11.
In general, the larger the amount of microblog information and the richer the topic words in the user topic-of-interest library, the more accurate the analysis of the user's emotion tendency. The emotion tendency analysis of a microblog user should therefore be repeated periodically, overwriting the old results. This periodic update mechanism makes the results more comprehensive, more accurate and more up to date.
Because new idioms are added to the user's idiom list during updating, the microblog user personalized emotion tendency analysis method of this embodiment updates the analysis results of steps (1) to (11) as follows:
After step (11), the following steps are also performed:
(12) Repeat steps (1) to (7). For an idiom newly added to the user's personalized idiom list, calculate its emotion value $I'_g$ by step (8). For an idiom that was already recorded in the user's personalized idiom list before this cycle, update $I'_g$ as follows: for each such idiom, first perform step (8) to obtain the value $I_{g\_new}$ of $I_g$ calculated in this cycle; let $I_{g\_prev}$ be the value of $I_g$ obtained in the previous cycle; then $I_g$ is updated to:
$$I_g = \omega_1 I_{g\_prev} + \omega_2 I_{g\_new}$$
In the formula, $\omega_1$ is the weight of $I_{g\_prev}$ and $\omega_2$ is the weight of $I_{g\_new}$;
The value of $I_g$ is mapped into [-3, 3] to obtain the emotion value $I'_g$ of the idiom;
(13) Using the $I'_g$ obtained in step (12), perform the subsequent steps (9) to (11).
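The periodic update of step (12) is a weighted blend of the previous and the newly computed raw idiom values. In the sketch below, the function name, the dictionary representation and the default weights of 0.5 are assumptions (the text does not give values for the weights); the sketch also returns only idioms seen in the current cycle, which is a choice the text does not settle.

```python
def update_idiom_values(previous, current, w_prev=0.5, w_new=0.5):
    """Blend raw idiom values I_g across two analysis cycles.

    previous: dict idiom -> I_g from the previous cycle (I_g_prev)
    current:  dict idiom -> I_g computed in this cycle (I_g_new)
    Returns a dict idiom -> updated I_g, still to be mapped into [-3, 3].
    """
    updated = {}
    for idiom, new_value in current.items():
        if idiom in previous:
            # Existing idiom: I_g = w1 * I_g_prev + w2 * I_g_new
            updated[idiom] = w_prev * previous[idiom] + w_new * new_value
        else:
            # Newly added idiom: use this cycle's value directly (step (8)).
            updated[idiom] = new_value
    return updated
```

Choosing a larger weight for the new value makes the list track recent posts more closely, while a larger weight for the previous value smooths out short-lived swings.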
Because new topic-of-interest words are added to the user topic-of-interest library during updating, the microblog user personalized emotion tendency analysis method of this embodiment updates the analysis results of steps (1) to (11) as follows:
After step (11), the following step is also performed:
(14) Repeat steps (1) to (9). For a topic-of-interest word newly added to the user topic-of-interest library in this cycle, calculate the user's topic emotion tendency by the method of step (10). For a topic-of-interest word that was already recorded in the user topic-of-interest library before this cycle, update the user's topic emotion tendency $I_{topic}$ as follows: first perform step (10) to obtain the value $I_{topic\_new}$ of $I_{topic}$ calculated in this cycle; let $I_{topic\_prev}$ be the value of $I_{topic}$ obtained in the previous cycle; then $I_{topic}$ is updated to:
$$I_{topic} = \omega'_1 I_{topic\_prev} + \omega'_2 I_{topic\_new}$$
where $\omega'_1$ is the weight of $I_{topic\_prev}$ and $\omega'_2$ is the weight of $I_{topic\_new}$.
Update method for the common topic library: periodically check the trending-topics column on the microblog home page and add the topics found there to the common topic library; in addition, periodically check for updates to the QQ input method classified lexicon and add the new entries to the classified lexicon.
As shown in Figure 12, the microblog user personalized emotion tendency analysis system of this embodiment comprises:
A data acquisition module, for collecting all data on each user's microblog home page and storing them in a database;
A word segmentation module, for segmenting the text data in the collected data to obtain the segmented word set and part-of-speech tags;
A dictionary loading module, for loading the required dictionaries and the emoticon library; the dictionaries comprise the HowNet emotion word dictionary, a degree dictionary, a negation dictionary, a personal pronoun dictionary, a function-word and connective dictionary, a network slang dictionary and a classified lexicon;
A user topic-of-interest library building module, for building the user topic-of-interest library with the word-frequency-based inter-layer upward aggregation algorithm;
A short sentence division module, for dividing the data collected by the data acquisition module into short sentences such that each short sentence contains at most one topic-of-interest word;
An emotion element extraction module, for extracting the emotion elements in each short sentence and calculating the initial emotion value of each short sentence;
A user personalized idiom list building module, for extracting high-frequency word combinations from the data collected by the data acquisition module with the word sliding-window method to obtain the user's personalized idiom list;
An idiom emotion value computing module, for performing a statistical analysis of the emotion values of all short sentences that contain each idiom to obtain the emotion value of the idiom;
A short sentence emotion value computing module, for calculating the personalized emotion value of each short sentence;
A user topic emotion tendency computing module, for calculating the emotion tendency of each of the user's topics of interest;
A user overall emotion tendency computing module, for calculating the user's overall emotion tendency.
The above embodiment is a preferred embodiment of the present invention, but embodiments of the present invention are not limited to it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be regarded as an equivalent replacement and falls within the protection scope of the present invention.

Claims (10)

1. A microblog user personalized emotion tendency analysis method, characterized in that it comprises the following steps:
(1) collect all data on each user's microblog home page and store them in a database;
(2) segment the text data in the microblog data collected in step (1) to obtain the segmented word set and part-of-speech tags;
(3) load the required dictionaries and the emoticon library; the dictionaries comprise the HowNet emotion word dictionary, a degree dictionary, a negation dictionary, a personal pronoun dictionary, a function-word and connective dictionary, a network slang dictionary and a classified lexicon;
(4) build the user topic-of-interest library with the word-frequency-based inter-layer upward aggregation algorithm:
(4-1) build the topic tree: filter out the words in the user's text data that have no topic meaning to obtain the words that carry obvious topic information, and, using the classified lexicon, count word frequencies and build the topic tree; the topic tree is a hierarchical structure in which the first layer holds the first-level topic categories, the second layer holds the second-level sub-topic categories, and the third layer holds the third-level sub-topic categories; the words that have no topic meaning comprise degree words, negation words, personal pronouns, function words, connectives and adjectives;
(4-2) according to the topic tree, extract high-frequency topics layer by layer with the word-frequency-based inter-layer upward aggregation algorithm;
(4-3) establish a main branch for holding topic words that cannot be assigned to a parent-layer topic and topic words or phrases that are popular on and peculiar to the network, thereby obtaining the common topic library; match the words in the microblog data against the words in the common topic library, and extract the topic words whose occurrence count in the user's microblog data exceeds a threshold, also as high-frequency topics;
(4-4) take the high-frequency topics obtained in steps (4-2) and (4-3) as the user's topic-of-interest words and build the user topic-of-interest library;
(5) divide the microblog data collected in step (1) into short sentences such that each short sentence contains at most one topic-of-interest word;
(6) extract the emotion elements in each short sentence and calculate the initial emotion value of each short sentence:
(6-1) match the word set of the short sentence against each dictionary and the emoticon library, and label every kind of emotion element; the emotion elements comprise emotion words, degree words, negation words, punctuation marks and emoticons, wherein degree words and punctuation marks adjust the degree of an emotion word and negation words adjust its polarity;
(6-2) calculate the emotion value of the text in the short sentence:
set the weights of the emotion elements: the weight of a positive emotion word is "+1" and the weight of a negative emotion word is "-1"; the weight of a negation word is "-1"; degree words and punctuation marks are given weights according to how strong they are, in the range 0 to 3; the emotion word affected by a degree word or punctuation mark follows the proximity principle, each degree word or punctuation mark affecting the emotion degree of the emotion word nearest to it;
the emotion value $I_{words}$ of the text in the short sentence is calculated as:
$$I_{words} = b \cdot \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} c_{ij} \cdot f_{ij} \Big) \cdot q_i$$
in the formula, $q_i$ is the i-th emotion word, $c_{ij}$ is the weight of the j-th degree word modifying $q_i$, and $f_{ij}$ is the weight of the j-th negation word modifying $q_i$; if $q_i$ has no attached degree word, $c_{ij}$ takes the default value 1; if $q_i$ has no attached negation word, $f_{ij}$ takes the default value 1; $n$ is the larger of the number of degree words modifying $q_i$ and the number of negation words modifying $q_i$, $m$ is the number of emotion words, $b$ is the weight corresponding to the punctuation mark, and $i$, $j$ are positive integers;
(6-3) calculate the emotion value of the emoticons in the short sentence:
the emoticons provided by the microblog operator are divided, according to their contribution to emotion tendency, into three cases, positive, negative and neutral: the weight of a positive emoticon is set to "+1", the weight of a negative emoticon is set to "-1", and the weight of a neutral emoticon is set to "0";
the emoticon emotion value $I_{marks}$ of the short sentence is calculated as:
$$I_{marks} = \sum_{i=1}^{l} m_i$$
in the formula, $m_i$ is the weight of the i-th emoticon (positive, negative or neutral), $i$ is a positive integer, and $l$ is the number of emoticons;
(6-4) calculate the initial emotion value $I_0$ of the short sentence:
$$I_0 = I_{words} + I_{marks}$$
(7) on the text data processed in step (2), use the word sliding-window method to extract high-frequency word combinations and obtain the user's personalized idiom list;
(8) perform a statistical analysis of the initial emotion values of all short sentences that contain each idiom to obtain the emotion value of the idiom;
for each idiom, find all short sentences that contain it and average their initial emotion values:
$$I_g = \frac{1}{p} \sum_{i=1}^{p} I_{0i}$$
in the formula, $I_{0i}$ is the initial emotion value of the i-th short sentence that contains this idiom, $p$ is the number of short sentences that contain this idiom, and $I_g$ is the initial emotion value of this idiom;
the value of $I_g$ is mapped into [-3, 3] to obtain the emotion value $I'_g$ of the idiom, which is recorded in this user's personalized idiom emotion label table;
(9) calculate the personalized emotion value of each short sentence as:
$$I = I_0 - \sum_{i=1}^{m'} \Big( \sum_{j=1}^{n'} c_{gij} \cdot f_{gij} \Big) \cdot q_{gi} + \sum_{k=1}^{r} I'_{gk}$$
in the formula, $I_0$ is the initial emotion value of the short sentence, $q_{gi}$ is the i-th word, $c_{gij}$ is the weight of the j-th degree word modifying $q_{gi}$, and $f_{gij}$ is the weight of the j-th negation word modifying $q_{gi}$; if $q_{gi}$ has no attached degree word, $c_{gij}$ takes the default value 1; if $q_{gi}$ has no attached negation word, $f_{gij}$ takes the default value 1; $n'$ is the larger of the number of degree words modifying $q_{gi}$ and the number of negation words modifying $q_{gi}$, $m'$ is the number of words, and $i$, $j$ are positive integers; $I'_{gk}$ is the emotion value of the k-th idiom, and $r$ is the number of idioms in this short sentence;
(10) calculate the emotion tendency of the user's topics of interest:
for any topic-of-interest word in the user topic-of-interest library, its emotion value is calculated as follows:
$$I_{topic_i} = \frac{1}{w} \sum_{j=1}^{w} I_j$$
where $I_j$ is the personalized emotion value of the j-th short sentence that contains this topic-of-interest word, $w$ is the total number of short sentences that contain this topic-of-interest word, and $I_{topic_i}$ is the emotion value of this topic-of-interest word; the value of $I_{topic_i}$ is mapped into [-3, 3] to obtain the final user topic emotion tendency value, and these values are used to build the user's personalized microblog topic emotion value list.
2. The microblog user personalized emotion tendency analysis method according to claim 1, characterized in that, after step (10), the following steps are also performed:
(a) repeat steps (1) to (7); for an idiom newly added to the user's personalized idiom list, calculate its emotion value $I'_g$ by step (8); for an idiom that was already recorded in the user's personalized idiom list before this cycle, update $I'_g$ as follows: for each such idiom, first perform step (8) to obtain the value $I_{g\_new}$ of $I_g$ calculated in this cycle; let $I_{g\_prev}$ be the value of $I_g$ obtained in the previous cycle; then $I_g$ is updated to:
$$I_g = \omega_1 I_{g\_prev} + \omega_2 I_{g\_new}$$
in the formula, $\omega_1$ is the weight of $I_{g\_prev}$ and $\omega_2$ is the weight of $I_{g\_new}$;
the value of $I_g$ is mapped into [-3, 3] to obtain the emotion value $I'_g$ of the idiom;
(b) using the $I'_g$ obtained in step (a), perform steps (9) to (10).
3. The microblog user personalized emotion tendency analysis method according to claim 1, characterized in that, after step (10), the following step is also performed:
repeat steps (1) to (9); for a topic-of-interest word newly added to the user topic-of-interest library in this cycle, calculate the user's topic emotion tendency by the method of step (10); for a topic-of-interest word that was already recorded in the user topic-of-interest library before this cycle, update the user's topic emotion tendency $I_{topic}$ as follows: first perform step (10) to obtain the value $I_{topic\_new}$ of $I_{topic}$ calculated in this cycle; let $I_{topic\_prev}$ be the value of $I_{topic}$ obtained in the previous cycle; then $I_{topic}$ is updated to:
$$I_{topic} = \omega'_1 I_{topic\_prev} + \omega'_2 I_{topic\_new}$$
where $\omega'_1$ is the weight of $I_{topic\_prev}$ and $\omega'_2$ is the weight of $I_{topic\_new}$.
4. The microblog user personalized emotion tendency analysis method according to any one of claims 1 to 3, characterized in that, after step (10), the following step is also performed:
calculate the user's overall emotion tendency value with the following formula:
$$I_{user} = \frac{1}{s} \sum_{i=1}^{s} I_i$$
in the formula, $I_i$ is the emotion value of the i-th short sentence and $s$ is the total number of short sentences.
5. The microblog user personalized emotion tendency analysis method according to claim 1, characterized in that collecting all data on each user's microblog home page in step (1) is specifically:
taking the user as the unit, collect all data on the user's home page; the data comprise the microblog data posted by the user and forwarded under the user's microblog name on the home page, the data under "comments sent", the data under "mentions of me", the microblog name and followed users, the self-introduction, and the web pages behind the URLs and the titles of the videos behind the links contained in original or forwarded microblogs.
6. The microblog user personalized emotion tendency analysis method according to claim 1, characterized in that extracting high-frequency topics layer by layer according to the topic tree with the word-frequency-based inter-layer upward aggregation algorithm in step (4-2) is specifically:
for each topic word, if its occurrence count is higher than the set threshold, the topic is a high-frequency topic; otherwise, the occurrence count of the topic word is passed to its parent-layer topic word; high-frequency topics are calculated and extracted layer by layer in this way; the occurrence count of a topic word comprises the occurrence count of the topic itself and the occurrence counts of its sub-topics.
7. The microblog user personalized emotion tendency analysis method according to claim 1, characterized in that dividing the data collected in step (1) into short sentences in step (5) is specifically:
analyze each microblog post; if the post contains no topic word or involves only one topic, the whole post is treated as one short sentence;
if a post contains several topic words, the split is decided together with punctuation priority: if there is punctuation between the two topic words with the largest distance, the post is split into two short sentences at that punctuation mark; if there is no punctuation between the two topic words with the largest distance, the pair of topic words with the next-largest distance is checked; if no pair has punctuation between its words, the post is not split and is treated as one short sentence; the distance is the number of topic-tree branches that must be traversed to get from one topic word to the other; if several punctuation marks lie between two topic words, the split is made at the mark with the highest priority, where the priority order is period > semicolon > comma.
8. The microblog user personalized emotion tendency analysis method according to claim 1, characterized in that extracting high-frequency word combinations from the text data processed in step (2) in step (7) is specifically:
let the window length of the sliding window be W, where W is the number of words the window contains, and let W take the values 1, 2, 3 and 4 in turn; use the word sliding window to count, over all short sentences, the total number of times each word or phrase occurs, and place every word or phrase whose occurrence count exceeds a threshold in the user's idiom list; collect everyday expression words and build an everyday expression phrase library; reject the everyday expression words from the user's idiom list to obtain the user's personalized idiom list.
9. A microblog user personalized emotion tendency analysis system implementing the analysis method according to any one of claims 1 to 8, characterized in that it comprises:
a data acquisition module, for collecting all data on each user's microblog home page and storing them in a database;
a word segmentation module, for segmenting the text data in the collected data to obtain the segmented word set and part-of-speech tags;
a dictionary loading module, for loading the required dictionaries and the emoticon library; the dictionaries comprise the HowNet emotion word dictionary, a degree dictionary, a negation dictionary, a personal pronoun dictionary, a function-word and connective dictionary, a network slang dictionary and a classified lexicon;
a user topic-of-interest library building module, for building the user topic-of-interest library with the word-frequency-based inter-layer upward aggregation algorithm;
a short sentence division module, for dividing the data collected by the data acquisition module into short sentences such that each short sentence contains at most one topic-of-interest word;
an emotion element extraction module, for extracting the emotion elements in each short sentence and calculating the initial emotion value of each short sentence;
a user personalized idiom list building module, for extracting high-frequency word combinations from the data collected by the data acquisition module with the word sliding-window method to obtain the user's personalized idiom list;
an idiom emotion value computing module, for performing a statistical analysis of the emotion values of all short sentences that contain each idiom to obtain the emotion value of the idiom;
a short sentence emotion value computing module, for calculating the personalized emotion value of each short sentence;
a user topic emotion tendency computing module, for calculating the emotion tendency of each of the user's topics of interest.
10. The microblog user personalized emotion tendency analysis system according to claim 9, characterized in that it further comprises a user overall emotion tendency computing module, for calculating the user's overall emotion tendency.
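As an illustration of the layer-by-layer upward aggregation described in claim 6, the following sketch walks a topic tree bottom-up. The tree representation (a dict from parent to children, with the top-level topics stored under the key `None`), the function name and the sample counts are all assumptions. The sketch reads the claim as: a topic whose accumulated count exceeds the threshold is extracted and passes nothing further up, while a topic below the threshold hands its accumulated count to its parent.

```python
def extract_high_frequency(tree, counts, threshold):
    """tree:      dict parent -> list of child topic words (roots under None)
    counts:    dict topic word -> its own occurrence count
    threshold: a topic is high-frequency if its accumulated count exceeds it
    Returns the high-frequency topics, deepest layers first.
    """
    high = []

    def visit(node):
        # Own count plus whatever the sub-topics passed upward.
        total = counts.get(node, 0)
        for child in tree.get(node, []):
            total += visit(child)
        if total > threshold:
            high.append(node)  # extracted as a high-frequency topic
            return 0           # assumption: nothing more is passed upward
        return total           # below threshold: pass the count to the parent

    for root in tree.get(None, []):
        visit(root)
    return high
```

With the tree sports → {basketball → {NBA, CBA}, football}, own counts NBA 12, CBA 3, basketball 2, football 4 and a threshold of 5, "NBA" is extracted directly; "basketball" accumulates 2 + 3 = 5, which does not exceed the threshold, so it passes 5 upward; "sports" then accumulates 5 + 4 = 9 and is extracted as well.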
CN201310711626.9A 2013-12-20 2013-12-20 Method and system for analysing individual emotion tendency of microblog user Active CN103699626B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310711626.9A CN103699626B (en) 2013-12-20 2013-12-20 Method and system for analysing individual emotion tendency of microblog user

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310711626.9A CN103699626B (en) 2013-12-20 2013-12-20 Method and system for analysing individual emotion tendency of microblog user

Publications (2)

Publication Number Publication Date
CN103699626A true CN103699626A (en) 2014-04-02
CN103699626B CN103699626B (en) 2017-02-01

Family

ID=50361154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310711626.9A Active CN103699626B (en) 2013-12-20 2013-12-20 Method and system for analysing individual emotion tendency of microblog user

Country Status (1)

Country Link
CN (1) CN103699626B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012221480A (en) * 2011-04-06 2012-11-12 L Is B Corp Message processing system
CN102279890A (en) * 2011-09-02 2011-12-14 苏州大学 Sentiment word extracting and collecting method based on micro blog
US20130246463A1 (en) * 2012-03-16 2013-09-19 Microsoft Corporation Prediction and isolation of patterns across datasets
CN102663046A (en) * 2012-03-29 2012-09-12 中国科学院自动化研究所 Sentiment analysis method oriented to micro-blog short text

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴维等: "基于多特征与复合分类法的中文微博情感分析", 《中国期刊全文数据库 北京信息科技大学学报(自然科学版)》 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740224A (en) * 2014-12-11 2016-07-06 仲恺农业工程学院 Text analysis based user psychology early warning method and apparatus
CN104572616A (en) * 2014-12-23 2015-04-29 北京锐安科技有限公司 Method and device for identifying text orientation
CN104572616B (en) * 2014-12-23 2018-04-24 北京锐安科技有限公司 Method and apparatus for determining text orientation
CN104899298A (en) * 2015-06-09 2015-09-09 华东师范大学 Microblog sentiment analysis method based on large-scale corpus characteristic learning
CN104899298B (en) * 2015-06-09 2018-01-16 华东师范大学 A kind of microblog emotional analysis method based on large-scale corpus feature learning
WO2016197577A1 (en) * 2015-06-12 2016-12-15 百度在线网络技术(北京)有限公司 Method and apparatus for labelling comment information and computer device
CN105022725B (en) * 2015-07-10 2018-04-20 河海大学 A kind of text emotion trend analysis method applied to finance Web fields
CN105022725A (en) * 2015-07-10 2015-11-04 河海大学 Text emotional tendency analysis method applied to field of financial Web
CN106709829B (en) * 2015-08-03 2020-06-02 科大讯飞股份有限公司 Learning situation diagnosis method and system based on online question bank
CN106709829A (en) * 2015-08-03 2017-05-24 科大讯飞股份有限公司 On-line-question-database-based learning condition diagnosis method and system
CN105095190B (en) * 2015-08-25 2018-01-12 众联数据技术(南京)有限公司 A kind of sentiment analysis method combined based on Chinese semantic structure and subdivision dictionary
CN105095190A (en) * 2015-08-25 2015-11-25 众联数据技术(南京)有限公司 Chinese semantic structure and finely segmented word bank combination based emotional analysis method
CN106933789A (en) * 2015-12-30 2017-07-07 阿里巴巴集团控股有限公司 Tourism strategy generation method and generation system
CN106933789B (en) * 2015-12-30 2023-06-20 阿里巴巴集团控股有限公司 Travel strategy generation method and generation system
CN105843796A (en) * 2016-03-28 2016-08-10 北京邮电大学 Microblog emotional tendency analysis method and device
CN106503220A (en) * 2016-10-28 2017-03-15 上海大学 A kind of microblogging emoticon affection computation method based on a mutual information
CN108062300A (en) * 2016-11-08 2018-05-22 中移(苏州)软件技术有限公司 Method and device for sentiment orientation analysis based on Chinese text
CN107391712A (en) * 2017-07-28 2017-11-24 王亚迪 A kind of network public opinion trend prediction analysis method
CN107943789A (en) * 2017-11-17 2018-04-20 新华网股份有限公司 Mood analysis method, device and the server of topic information
CN109524106A (en) * 2018-10-31 2019-03-26 北京指掌易科技有限公司 A kind of mental model for analyzing introgression by chat content
CN109446378A (en) * 2018-11-08 2019-03-08 北京奇艺世纪科技有限公司 Information recommendation method, sentiment orientation determination method and device, and electronic equipment
CN110189742A (en) * 2019-05-30 2019-08-30 芋头科技(杭州)有限公司 Determine emotion audio, affect display, the method for text-to-speech and relevant apparatus
CN111611455A (en) * 2020-05-22 2020-09-01 安徽理工大学 User group division method based on user emotional behavior characteristics under microblog hot topics
CN112116391A (en) * 2020-09-18 2020-12-22 北京达佳互联信息技术有限公司 Multimedia resource delivery method and device, computer equipment and storage medium
CN112684110A (en) * 2020-12-14 2021-04-20 江西省蚕桑茶叶研究所(江西省经济作物研究所) Tea sensory evaluation method based on favorite expressions

Also Published As

Publication number Publication date
CN103699626B (en) 2017-02-01

Similar Documents

Publication Publication Date Title
CN103699626A (en) Method and system for analysing individual emotion tendency of microblog user
CN104657425B (en) Topic management type network public opinion evaluation management system and method
Holmberg et al. Gender differences in the climate change communication on Twitter
CN106980692A (en) Influence computation method based on specific microblog events
Mahmoudi et al. Deep neural networks understand investors better
Khanam et al. The homophily principle in social network analysis: A survey
CN103793503A (en) Opinion mining and classification method based on web texts
Barclay et al. India 2014: Facebook ‘like’ as a predictor of election outcomes
Liu et al. Public perceptions of environmental, social, and governance (ESG) based on social media data: Evidence from China
Quercia et al. Talk of the city: Our tweets, our community happiness
CN103917968A (en) System and method for managing opinion networks with interactive opinion flows
CN103559207A (en) Financial behavior analyzing system based on social media calculation
CN103870001A (en) Input method candidate item generating method and electronic device
Chen et al. Analyzing the sentiment correlation between regular tweets and retweets
CN103108049B (en) Method for providing a personal page for mobile phone users
CN102663001A (en) Automatic blog writer interest and character identifying method based on support vector machine
CN105389389A (en) Network public opinion transmission situation media linked analysis method
Märkle-Huß et al. Improving sentiment analysis with document-level semantic relationships from rhetoric discourse structures
CN104899335A (en) Method for performing sentiment classification on network public sentiment of information
CN103198098A (en) Network information transfer method and device
Halavais Do Dugg Diggers Digg Diligently? Feedback as motivation in collaborative moderation systems
CN104572877A (en) Detection method and detection system of game public opinion
CN102663027A (en) Method for predicting attributes of webpage crowd
Nguyen et al. Evaluating marijuana-related tweets on Twitter
Wei et al. Using network flows to identify users sharing extremist content on social media

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant