CN106202047A - A kind of character personality depicting method based on microblogging text - Google Patents

A kind of character personality depicting method based on microblogging text

Info

Publication number
CN106202047A
Authority
CN
China
Prior art keywords
user
class
emotion
microblogging
microblogging text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610559542.1A
Other languages
Chinese (zh)
Inventor
刘春阳
吴俊杰
王卿
苗琳
袁石
王萌
张旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center
Priority to CN201610559542.1A
Publication of CN106202047A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Abstract

The invention discloses a personality profiling method based on microblog text, belonging to the field of data mining. The method proceeds as follows. First, for a given user, every microblog text the user published during a period is labeled with an emotion; the numbers of days on which impulsive-class and depressive-class emotions dominate are counted, and the user is labeled from the perspective of emotional characteristics. Next, all of the user's microblog texts are classified by topic of interest, and the user's topics of interest are selected. If these topics include the politics class and the people's livelihood class, a criticism dictionary is used to characterize the user's language; otherwise, no processing is performed. Finally, the user's emotional characteristics and language characteristics are fused to profile the user's personality and obtain a personality label. The method is suited to profiling and analyzing personality traits on microblogs and has significant practical value in fields such as public opinion monitoring, person attribute profiling, and information propagation and diffusion.

Description

A personality profiling method based on microblog text
Technical field
The invention belongs to the field of data mining and relates to user profiling technology, specifically a personality profiling method based on microblog text.
Background
As the number of internet users keeps growing, social media has developed rapidly. Social media, represented by forums, microblogs, and WeChat, has gradually penetrated every aspect of people's lives and work and has profoundly influenced their behavioral and psychological patterns. Social media produces a large volume of short texts every day, which to some extent reflect the characteristics of the people who write them. Profiling these characteristics serves two purposes. On the one hand, it reveals the individual preferences of people on social media, so that enterprises can recommend products to target groups through social media according to those preferences and increase their revenue. On the other hand, it helps identify the opinion leaders and agitators for a given event, as well as potentially highly influential users, which plays a very important role in public opinion monitoring by government departments.
Personality profiling is one aspect of person profiling. It plays an important role in areas such as social media public opinion monitoring and social media marketing, and has become a focus of current research.
The prior art relies on traditional empirical research methods, such as questionnaires and interviews, to profile personality. Such research requires substantial human, material, and financial resources and has certain limitations, mainly in three respects: 1) traditional empirical analysis requires lengthy surveys or interviews to collect data; 2) the usability of data collected through surveys or interviews is low, with a large amount of invalid data; 3) traditional surveys cannot guarantee the authenticity of the collected data.
Summary of the invention
To meet the demand for personality profiling in public opinion monitoring and social marketing, to overcome the high cost and low data usability of traditional survey research, and to avoid respondents supplying untrue information, the present invention draws on today's widely used social media and proposes a personality profiling method based on microblog text.
The specific steps are as follows:
Step one: for a given user, use an emotion dictionary to assign an emotion label to each microblog text the user published within a certain period.
The emotion dictionary covers five emotions: happiness, anger, sadness, disgust, and anxiety.
First, calculate the weight w_sentiment with which each microblog text belongs to a given emotion.
The weight is calculated as follows:
$$w\_sentiment = \sum_{word} w\_s \times count(word)$$
Here w_s is the weight assigned in the emotion dictionary to a word (denoted word) appearing in the microblog text, where word is one of the words that the emotion dictionary lists as expressing the given emotion; count(word) is the number of times word occurs in the microblog text.
Then, compare the weights of each microblog text under the five emotions and take the emotion with the highest weight as the emotion label of that microblog text.
Step two: according to the emotion labels, count the user's daily numbers of impulsive-class and depressive-class emotions.
The impulsive class comprises anger and disgust; the depressive class comprises sadness and anxiety.
Step three: according to the user's numbers of impulsive-class and depressive-class emotions, calculate the number of days on which impulsive-class emotions dominate and the number of days on which depressive-class emotions dominate.
Step 301: calculate the ratio of the user's impulsive-class and depressive-class microblogs, taken together, to all microblogs the user published that day.
Step 302: determine whether the ratio from step 301 is greater than or equal to a threshold R; if so, go to step 303; otherwise, do nothing.
The threshold R is set from expert experience, or as an empirical value derived from statistics of the microblog text data.
Step 303: subtract the user's proportion of depressive-class emotions from the user's proportion of impulsive-class emotions.
Step 304: determine whether the absolute value of the difference is greater than or equal to a threshold M; if so, go to step 305; otherwise, do nothing.
The threshold M is set from expert experience, or as an empirical value derived from statistics of the microblog text data.
Step 305: determine whether the proportion of impulsive-class emotions is greater than the proportion of depressive-class emotions; if so, add 1 day to the user's impulsive-class day count; otherwise, add 1 day to the user's depressive-class day count.
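The daily test of steps 301 to 305 can be written compactly as follows. The symbols are introduced here for illustration only and do not appear in the original text: n_imp, n_dep, and n_all are the day's counts of impulsive-class microblogs, depressive-class microblogs, and all microblogs, and D_imp and D_dep are the two running day counts. It is assumed that both proportions in step 303 are taken relative to n_all, which the text does not state explicitly.

```latex
\[
p_{imp} = \frac{n_{imp}}{n_{all}}, \qquad
p_{dep} = \frac{n_{dep}}{n_{all}}
\]
\[
\text{A day is counted only if}\quad
p_{imp} + p_{dep} \ge R
\quad\text{and}\quad
\lvert p_{imp} - p_{dep} \rvert \ge M .
\]
\[
\text{For a counted day:}\quad
\begin{cases}
D_{imp} \leftarrow D_{imp} + 1, & \text{if } p_{imp} > p_{dep}, \\
D_{dep} \leftarrow D_{dep} + 1, & \text{otherwise.}
\end{cases}
\]
```

Under this reading, a day with few impulsive-class or depressive-class microblogs, or with the two classes nearly balanced, contributes to neither count.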
Step four: according to the user's numbers of impulsive-dominant days and depressive-dominant days, label the user from the perspective of emotional characteristics.
Specifically: when the number of impulsive-dominant days exceeds the number of depressive-dominant days, determine whether the number of impulsive-dominant days is greater than or equal to a threshold D; if so, label the user "irritable"; otherwise, label the user "emotionally stable".
The threshold D is set from expert experience, or as an empirical value derived from statistics of the microblog text data.
When the number of depressive-dominant days exceeds the number of impulsive-dominant days, determine whether the number of depressive-dominant days is greater than or equal to the threshold D; if so, label the user "prone to depression"; otherwise, label the user "emotionally stable".
When the number of impulsive-dominant days equals the number of depressive-dominant days, label the user "emotionally stable".
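The three cases of step four amount to a single piecewise rule. In the sketch below, D_imp and D_dep are illustrative symbols (not from the original text) for the numbers of impulsive-dominant and depressive-dominant days, and the three labels are rendered as "irritable", "prone to depression", and "emotionally stable".

```latex
\[
\text{label} =
\begin{cases}
\text{irritable}, & D_{imp} > D_{dep} \ \text{and}\ D_{imp} \ge D, \\
\text{prone to depression}, & D_{dep} > D_{imp} \ \text{and}\ D_{dep} \ge D, \\
\text{emotionally stable}, & \text{otherwise.}
\end{cases}
\]
```

The last case covers both a tie between the two day counts and a larger count that still falls below the threshold D.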
Step five: use a topic dictionary to classify all of the user's microblog texts by topic of interest, and select the user's topics of interest.
The topic dictionary covers five topic classes: politics, people's livelihood, military, entertainment, and sports.
First, calculate the weight w_topic of each topic class covered by the user's microblog texts, as follows:
$$w\_topic = \sum_{word} w\_t \times count(word)$$
Here w_t is the weight assigned in the topic dictionary to a word (denoted word) appearing in the microblog texts the user published within the period, and count(word) is the number of times word occurs in those texts.
For a given user, calculate the weight of each of the five topic classes over all microblogs the user published within the period; then sort the five weights and take the N highest-weighted topics as the topics the user's microblog texts focus on, where N is at least 1 and at most 3.
Step six: determine whether the user's selected topics of interest include the politics class and the people's livelihood class; if so, use a criticism dictionary to characterize the user's language; otherwise, do nothing.
The criticism dictionary contains words that express a satirical or critical tone.
Specifically: over all microblog texts the user published within the period, find the criticism-dictionary words that appear, and determine whether the number of distinct words found is greater than or equal to a threshold K; if so, label the user "critical type"; otherwise, label the user "other".
The threshold K is set from expert experience, or as an empirical value derived from statistics of the microblog text data.
Step seven: fuse the user's emotional characteristics and language characteristics to profile the user's personality and obtain the user's personality label.
The specific fusion method is as follows:
The final personality labels are "choleric type", "pessimistic type", "critical type", "impulsive type", "depressive type", and "stable type".
The advantages of the invention are:
1) The personality profiling method based on microblog text is suited to profiling and analyzing personality traits on microblogs, and has significant practical value in fields such as public opinion monitoring, person attribute profiling, and information propagation and diffusion.
2) The method is efficient and easy to use, and can profile the personalities of people at a scale of thousands.
3) The method reduces the human, material, and financial cost of traditional survey research, and better avoids false survey information.
Brief description of the drawings
Fig. 1 is a flow chart of the personality profiling method based on microblog text according to the invention;
Fig. 2 is a flow chart of calculating the numbers of days on which the user's impulsive-class and depressive-class emotions dominate.
Detailed description
The invention is described in further detail below with reference to the drawings.
The invention studies user profiling technology for personality profiling based on microblog text. Taking into account people's wording and word-use habits on microblogs, it builds emotion and topic dictionaries, profiles a person's personality traits from two angles, emotion and language, and then fuses the features from both angles to obtain the person's personality traits. First, microblog texts are classified by emotion, and the quantity and fluctuation of emotions are counted per day; personality is then profiled from the emotional angle according to these features. In parallel, the person's microblog texts are classified by topic using the topic dictionary, and users who follow politics-class and people's-livelihood-class topics are selected; a dictionary is used to analyze the language features of these users, and their personality is profiled accordingly. Finally, the personalities obtained from the emotional and language angles are combined to profile the person's overall personality traits.
As shown in Fig. 1, the specific implementation steps are as follows:
Step one: for a given user, use an emotion dictionary to assign an emotion label to each microblog text the user published within a certain period.
The emotion dictionary mainly covers five emotions: happiness, anger, sadness, disgust, and anxiety. Microblog texts are classified by emotion according to the dictionary, mainly by calculating the weight with which each microblog text belongs to each of the five emotions. This weight is the sum of the weights of that emotion's words that occur in the text. The weight w_sentiment with which a microblog text belongs to a given emotion is calculated as follows:
$$w\_sentiment = \sum_{word} w\_s \times count(word)$$
where w_s is the weight assigned in the emotion dictionary to a word (denoted word) in the microblog text, and count(word) is the number of times word occurs in the microblog text.
For example, the words expressing happiness include "heartily", "laugh a great ho-ho", "giggle", "happy", and so on.
For a given microblog, the weight w_s of "heartily" in the emotion dictionary is multiplied by the number of times "heartily" occurs in the microblog text; the same is done for "laugh a great ho-ho", "giggle", "happy", and the other words; the products of weight and frequency are then summed, giving the weight with which the microblog text belongs to happiness.
Finally, compare the weights of each microblog text under the five emotions and take the emotion with the highest weight as the emotion label of that microblog text.
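The pseudocode for this step is as follows (all_weibo_texts, the user's microblog texts, and sentiment_dict, a mapping from each emotion to its {word: weight} entries, are assumed inputs):
labels = []  # one emotion label per microblog text
for weibo_text in all_weibo_texts:
    total_weight = {}
    for sentiment_type in ["happiness", "anger", "sadness", "disgust", "anxiety"]:
        # w_sentiment = Σ w_s * count(word) over this emotion's dictionary words
        total_weight[sentiment_type] = sum(w_s * weibo_text.count(word) for word, w_s in sentiment_dict[sentiment_type].items())
    labels.append(max(total_weight, key=total_weight.get))  # highest-weighted emotion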
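A self-contained version of step one might look like the sketch below. The dictionary contents, the weights, and the function names are hypothetical and chosen only for illustration. Occurrences are counted by plain substring matching, which is an assumption; the text does not say how words are matched, and a real system would likely segment the text first. Ties between emotions are resolved in favor of the emotion listed first, which the text also leaves unspecified.

```python
from typing import Dict, List

EMOTIONS: List[str] = ["happiness", "anger", "sadness", "disgust", "anxiety"]

# Hypothetical toy dictionary (emotion -> {word: weight}); words and weights are invented.
SENTIMENT_DICT: Dict[str, Dict[str, float]] = {
    "happiness": {"happy": 1.0, "giggle": 0.5},
    "anger": {"furious": 1.0},
    "sadness": {"sad": 1.0},
    "disgust": {"gross": 1.0},
    "anxiety": {"worried": 1.0},
}


def emotion_weights(
    text: str, sentiment_dict: Dict[str, Dict[str, float]] = SENTIMENT_DICT
) -> Dict[str, float]:
    """Return w_sentiment for every emotion: the sum of w_s * count(word)."""
    return {
        emotion: sum(
            w_s * text.count(word) for word, w_s in sentiment_dict[emotion].items()
        )
        for emotion in EMOTIONS
    }


def label_emotion(
    text: str, sentiment_dict: Dict[str, Dict[str, float]] = SENTIMENT_DICT
) -> str:
    """Pick the highest-weighted emotion; ties go to the earliest entry in EMOTIONS."""
    weights = emotion_weights(text, sentiment_dict)
    return max(weights, key=weights.get)


print(label_emotion("so worried and sad, really sad"))  # sadness
```

In the example, "sad" occurs twice and "worried" once, so sadness (weight 2.0) outranks anxiety (weight 1.0).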
Step two: according to the emotion labels, count the user's daily numbers of impulsive-class and depressive-class emotions.
Based on the emotion classification results of the previous step, and organized by person and by day, count each person's daily numbers of impulsive-class and depressive-class emotions. The impulsive class comprises anger and disgust; the depressive class comprises sadness and anxiety.
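The per-day tally of step two can be sketched as below for a single user. The input format, a list of (day, emotion label) pairs, and all names are assumptions made for this example; the daily total is kept as well because step three needs it.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

IMPULSIVE = {"anger", "disgust"}    # impulsive class
DEPRESSIVE = {"sadness", "anxiety"}  # depressive class


def daily_class_counts(
    labeled_posts: Iterable[Tuple[date, str]]
) -> Dict[date, Counter]:
    """Count impulsive-class, depressive-class, and all posts for each day."""
    counts: Dict[date, Counter] = defaultdict(Counter)
    for day, label in labeled_posts:
        counts[day]["total"] += 1
        if label in IMPULSIVE:
            counts[day]["impulsive"] += 1
        elif label in DEPRESSIVE:
            counts[day]["depressive"] += 1
    return dict(counts)


posts = [
    (date(2016, 7, 1), "anger"),
    (date(2016, 7, 1), "disgust"),
    (date(2016, 7, 1), "happiness"),
    (date(2016, 7, 2), "sadness"),
]
for day, c in sorted(daily_class_counts(posts).items()):
    print(day, c["impulsive"], c["depressive"], c["total"])
```

A Counter returns 0 for a class that never occurred on a given day, so days without impulsive-class or depressive-class posts need no special handling.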
Step three: according to the user's numbers of impulsive-class and depressive-class emotions, calculate the number of days on which impulsive-class emotions dominate and the number of days on which depressive-class emotions dominate.
Based on the statistics from the previous step, the numbers of impulsive-class and depressive-class emotions are compared, so that the quantity and fluctuation of emotions are compared per person and per day.
First, the combined number of impulsive-class and depressive-class emotions is compared with the number of other emotions, measured as the ratio of the user's impulsive-class and depressive-class microblogs to all microblogs the user published that day. This step sets a threshold R. When the ratio is greater than or equal to R, the difference between the proportions of impulsive-class and depressive-class emotions is calculated. If the absolute value of this difference is greater than or equal to a threshold M, then the user's impulsive-class day count is increased by 1 day if impulsive-class emotions outnumber depressive-class emotions, and the user's depressive-class day count is increased by 1 day if depressive-class emotions outnumber impulsive-class emotions. If the ratio is less than R, or the difference between the impulsive-class and depressive-class proportions is less than M, nothing is done.
As shown in Fig. 2, the specific steps are as follows:
Step 301: calculate the ratio of the user's impulsive-class and depressive-class microblogs, taken together, to all microblogs the user published that day.
Step 302: determine whether the ratio from step 301 is greater than or equal to the threshold R; if so, go to step 303; otherwise, do nothing.
The threshold R is set from expert experience, or as an empirical value derived from statistics of the microblog text data.
Step 303: subtract the user's proportion of depressive-class emotions from the user's proportion of impulsive-class emotions.
Step 304: determine whether the absolute value of the difference is greater than or equal to the threshold M; if so, go to step 305; otherwise, do nothing.
The threshold M is set from expert experience, or as an empirical value derived from statistics of the microblog text data.
Step 305: determine whether the proportion of impulsive-class emotions is greater than the proportion of depressive-class emotions; if so, add 1 day to the user's impulsive-class day count; otherwise, add 1 day to the user's depressive-class day count.
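Steps 301 to 305 can be sketched as a single loop over a user's days. The function name, the input format (one tuple of counts per day), and the threshold values in the example are hypothetical. The sketch assumes that both proportions are taken relative to the day's total number of microblogs, and it skips days without any microblogs, a case the text does not address.

```python
from typing import Iterable, Tuple


def count_dominant_days(
    daily_counts: Iterable[Tuple[int, int, int]], r: float, m: float
) -> Tuple[int, int]:
    """Return (impulsive-dominant days, depressive-dominant days).

    daily_counts holds one (n_impulsive, n_depressive, n_total) tuple per day.
    """
    impulsive_days = 0
    depressive_days = 0
    for n_imp, n_dep, n_total in daily_counts:
        if n_total == 0:
            continue  # no microblogs that day (assumed: skip)
        p_imp = n_imp / n_total
        p_dep = n_dep / n_total
        if p_imp + p_dep < r:
            continue  # step 302: too few impulsive/depressive microblogs
        if abs(p_imp - p_dep) < m:
            continue  # step 304: the two classes are too close
        if p_imp > p_dep:
            impulsive_days += 1  # step 305
        else:
            depressive_days += 1
    return impulsive_days, depressive_days


days = [(3, 1, 5), (1, 3, 4), (1, 0, 10), (2, 2, 4)]
print(count_dominant_days(days, r=0.5, m=0.2))  # (1, 1)
```

In the example, the third day fails the R test (ratio 0.1) and the fourth fails the M test (difference 0), so only the first two days are counted.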
Step four: according to the user's numbers of impulsive-dominant days and depressive-dominant days, label the user from the perspective of emotional characteristics.
From the results of the previous step, each user has two feature values: the number of impulsive-dominant days and the number of depressive-dominant days. Emotion-based personality profiling is performed according to these emotional features.
This step sets a day-count threshold D. When the number of impulsive-dominant days exceeds the number of depressive-dominant days: if the number of impulsive-dominant days is greater than or equal to D, the user is labeled "irritable"; otherwise, that is, if it is less than D, the user is labeled "emotionally stable".
When the number of depressive-dominant days exceeds the number of impulsive-dominant days: if the number of depressive-dominant days is greater than or equal to D, the user is labeled "prone to depression"; otherwise, that is, if it is less than D, the user is labeled "emotionally stable".
When the number of impulsive-dominant days equals the number of depressive-dominant days, the user is labeled "emotionally stable".
From the perspective of emotional characteristics, this step assigns each user one of three personality traits: "irritable", "prone to depression", or "emotionally stable".
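The labeling rule of step four reduces to a few comparisons, sketched below. The function name and the threshold used in the example are hypothetical, and the three labels are rendered here as "irritable", "prone to depression", and "emotionally stable".

```python
def emotional_trait(impulsive_days: int, depressive_days: int, d: int) -> str:
    """Label a user from the two dominant-day counts and the threshold D."""
    if impulsive_days > depressive_days and impulsive_days >= d:
        return "irritable"
    if depressive_days > impulsive_days and depressive_days >= d:
        return "prone to depression"
    # Covers ties and a larger count that stays below D.
    return "emotionally stable"


print(emotional_trait(10, 3, d=7))  # irritable
print(emotional_trait(5, 3, d=7))   # emotionally stable
```

Falling through to the final return keeps the tie case and both below-threshold cases in one place, matching the three situations in which the text assigns the stable label.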
Step five: use a topic dictionary to classify all of the user's microblog texts by topic of interest, and select the user's topics of interest.
According to the topic dictionary, topics of interest are divided into five classes: politics, people's livelihood, military, entertainment, and sports.
Taking each user as the unit, count the words of each topic class that occur in the user's microblog texts.
The weight w_topic with which each user follows each topic class is calculated as follows:
$$w\_topic = \sum_{word} w\_t \times count(word)$$
Here w_t is the weight assigned in the topic dictionary to a word (denoted word) appearing in the microblog texts the user published within the period, and count(word) is the number of times word occurs in those texts.
For example, the words representing the people's livelihood class include: clothing, food, housing, transport, employment, entertainment, family, associations, companies, tourism, and so on.
For all microblogs a user published within the period, the weight w_t of "clothing" in the topic dictionary is multiplied by the number of times the word occurs in all of those microblog texts; the products of weight and frequency for the other words are calculated in the same way; finally, all the products are summed to give the weight of the people's livelihood topic.
For a given user, the five calculated topic weights are sorted, and the N highest-weighted topic classes are taken as the user's topics of interest, where N is at least 1 and at most 3.
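Step five can be sketched as a weighted count followed by a top-N selection. The topic dictionary below is a hypothetical toy with invented words and weights, the function names are illustrative, and occurrences are again counted by substring matching as a simplifying assumption. When several topics share a weight, the sketch keeps the order in which the topics are listed, a tie rule the text does not specify.

```python
from typing import Dict, List

# Hypothetical toy dictionary (topic -> {word: weight}); words and weights are invented.
TOPIC_DICT: Dict[str, Dict[str, float]] = {
    "politics": {"election": 1.0, "policy": 1.0},
    "livelihood": {"rent": 1.0, "jobs": 1.0},
    "military": {"army": 1.0},
    "entertainment": {"movie": 1.0},
    "sports": {"football": 1.0},
}


def topic_weights(
    texts: List[str], topic_dict: Dict[str, Dict[str, float]] = TOPIC_DICT
) -> Dict[str, float]:
    """Return w_topic per topic: sum of w_t * count(word) over all of a user's texts."""
    return {
        topic: sum(
            w_t * text.count(word) for text in texts for word, w_t in words.items()
        )
        for topic, words in topic_dict.items()
    }


def top_topics(
    texts: List[str], n: int, topic_dict: Dict[str, Dict[str, float]] = TOPIC_DICT
) -> List[str]:
    """Return the N highest-weighted topics, with 1 <= N <= 3 as in the text."""
    if not 1 <= n <= 3:
        raise ValueError("N must be between 1 and 3")
    weights = topic_weights(texts, topic_dict)
    # sorted() is stable, so equally weighted topics keep their dictionary order.
    return sorted(weights, key=weights.get, reverse=True)[:n]


user_texts = ["election policy debate", "rent and jobs", "election again"]
print(top_topics(user_texts, n=2))  # ['politics', 'livelihood']
```

Here politics scores 3.0 (two occurrences of "election" and one of "policy") and people's livelihood scores 2.0, so they are the two topics of interest.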
Step six: determine whether the user's selected topics of interest include the politics class and the people's livelihood class; if so, use a criticism dictionary to characterize the user's language; otherwise, do nothing.
The criticism dictionary contains words that express a satirical or critical tone, such as "I am dizzy", "muddled", and "shameless".
According to the result of the previous step, microblog users who follow the politics class and the people's livelihood class are selected, and the criticism dictionary is used to analyze their language features. Specifically: over all microblog texts the user published within the period, count the number of distinct criticism-dictionary words that appear, and determine whether this number is greater than or equal to a threshold K; if so, label the user "critical type"; otherwise, that is, if the number of distinct words is less than K, label the user "other".
For example, if a microblog text contains "I am dizzy" 2 times, "muddled" 3 times, and "shameless" 1 time, the number of distinct words in that text is 3.
The threshold K is set from expert experience, or as an empirical value derived from statistics of the microblog text data.
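Step six can be sketched as below, using the three example words of this step, spelled here as "I am dizzy", "muddled", and "shameless". The function names, the topic identifiers, and the value of K are hypothetical. The sketch requires both the politics class and the people's livelihood class among the user's topics, which is one literal reading of the text; whether either one alone would suffice is not stated. Returning None stands for "no processing".

```python
from typing import List, Optional, Set

# Example words taken from the text; a real criticism dictionary would be much larger.
CRITICAL_WORDS: Set[str] = {"I am dizzy", "muddled", "shameless"}
# Assumed reading: both topic classes must be among the user's topics of interest.
REQUIRED_TOPICS: Set[str] = {"politics", "livelihood"}


def distinct_critical_words(
    texts: List[str], critical_words: Set[str] = CRITICAL_WORDS
) -> Set[str]:
    """Return the dictionary words that occur at least once in any text."""
    return {word for word in critical_words if any(word in text for text in texts)}


def language_trait(
    texts: List[str],
    topics: List[str],
    k: int,
    critical_words: Set[str] = CRITICAL_WORDS,
) -> Optional[str]:
    """Label a user 'critical type' or 'other'; None if the topic test fails."""
    if not REQUIRED_TOPICS <= set(topics):
        return None  # user does not follow the required topics: no processing
    found = distinct_critical_words(texts, critical_words)
    return "critical type" if len(found) >= k else "other"


posts = ["I am dizzy, I am dizzy", "muddled, muddled, muddled and shameless"]
print(len(distinct_critical_words(posts)))  # 3
```

Repeated occurrences of a word do not raise the count, which matches the worked example: two, three, and one occurrences still yield 3 distinct words.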
Step seven: fuse the user's emotional characteristics and language characteristics to profile the user's personality and obtain the user's personality label.
This step fuses the personality results profiled from the emotional characteristics and from the language characteristics. The combination method used is shown in the table.
The final personality labels are "choleric type", "pessimistic type", "critical type", "impulsive type", "depressive type", and "stable type".
In view of the colloquial and real-time nature of microblog text, the invention uses dictionaries built by studying microblog text to classify the emotion of the microblog texts a person publishes, and profiles personality according to the quantity and fluctuation of emotions. At the same time, the topics a person follows are classified, and personality is profiled according to those topics. Finally, the results from the two dimensions, emotional characteristics and language characteristics, are fused to profile personality. By considering a person's wording habits in microblog text and, from the dictionary perspective, the person's emotions and followed topics, the method profiles the personalities of people at a scale of thousands and is efficient, robust, and easy to use.

Claims (7)

1. A personality profiling method based on microblog text, characterized in that it takes into account the colloquial and real-time nature of microblog text and profiles personality from two dimensions of a microblog user, emotional characteristics and language characteristics;
the specific steps are as follows:
step one: for a given user, use an emotion dictionary to assign an emotion label to each microblog text the user published within a certain period;
the emotion dictionary covers five emotions: happiness, anger, sadness, disgust, and anxiety;
step two: according to the emotion labels, count the user's daily numbers of impulsive-class and depressive-class emotions;
the impulsive class comprises anger and disgust, and the depressive class comprises sadness and anxiety;
step three: according to the user's numbers of impulsive-class and depressive-class emotions, calculate the number of days on which impulsive-class emotions dominate and the number of days on which depressive-class emotions dominate;
step four: according to the user's numbers of impulsive-dominant days and depressive-dominant days, label the user from the perspective of emotional characteristics;
step five: use a topic dictionary to classify all of the user's microblog texts by topic of interest, and select the user's topics of interest;
the topic dictionary covers the politics, people's livelihood, military, entertainment, and sports classes;
step six: determine whether the user's selected topics of interest include the politics class and the people's livelihood class; if so, use a criticism dictionary to characterize the user's language; otherwise, do nothing;
the criticism dictionary contains words that express a satirical or critical tone;
step seven: fuse the user's emotional characteristics and language characteristics to profile the user's personality and obtain the user's personality label.
2. The personality profiling method based on microblog text, characterized in that, in said step one, a microblog emotion dictionary is used to label microblog texts with emotions, specifically:
first, calculate the weight w_sentiment with which each microblog text belongs to a given emotion, using the following formula:
$$w\_sentiment = \sum_{word} w\_s \times count(word)$$
where w_s is the weight assigned in the emotion dictionary to a word (denoted word) appearing in the microblog text; word is one of the words that the emotion dictionary lists as expressing the given emotion; and count(word) is the number of times word occurs in the microblog text;
then, compare the weights of each microblog text under the five emotions and take the emotion with the highest weight as the emotion label of that microblog text.
3. The personality profiling method based on microblog text, characterized in that said step three is specifically:
step 301: calculate the ratio of the user's impulsive-class and depressive-class microblogs, taken together, to all microblogs the user published that day;
step 302: determine whether the ratio from step 301 is greater than or equal to a threshold R; if so, go to step 303; otherwise, do nothing;
the threshold R is set from expert experience, or as an empirical value derived from statistics of the microblog text data;
step 303: subtract the user's proportion of depressive-class emotions from the user's proportion of impulsive-class emotions;
step 304: determine whether the absolute value of the difference is greater than or equal to a threshold M; if so, go to step 305; otherwise, do nothing;
the threshold M is set from expert experience, or as an empirical value derived from statistics of the microblog text data;
step 305: determine whether the proportion of impulsive-class emotions is greater than the proportion of depressive-class emotions; if so, add 1 day to the user's impulsive-class day count; otherwise, add 1 day to the user's depressive-class day count.
A character personality depicting method based on microblog text, characterized in that said step four is specifically as follows:
When the number of days dominated by impulsive-class emotion is greater than the number of days dominated by depressed-class emotion, judge whether the number of impulsive-dominated days is greater than or equal to threshold D; if so, label the user "inflammable"; otherwise, label the user "emotionally stable";
Threshold D is set according to expert experience, or is an empirical value obtained from statistics on the microblog text data;
When the number of days dominated by depressed-class emotion is greater than the number of days dominated by impulsive-class emotion, judge whether the number of depressed-dominated days is greater than or equal to threshold D; if so, label the user "easily depressed"; otherwise, label the user "emotionally stable";
When the number of days dominated by impulsive-class emotion equals the number of days dominated by depressed-class emotion, label the user "emotionally stable".
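The three cases above reduce to a small decision function. The sketch below is a hedged illustration only; the function name `emotional_label` and the English label strings are assumptions chosen for readability, and the day counts are taken as already computed.

```python
def emotional_label(impulsive_days: int, depressed_days: int, D: int) -> str:
    """Map dominant-day counts to an emotional-feature label using threshold D."""
    if impulsive_days > depressed_days:
        # Impulsive emotion dominates more days than depressed emotion.
        return "inflammable" if impulsive_days >= D else "emotionally stable"
    if depressed_days > impulsive_days:
        # Depressed emotion dominates more days than impulsive emotion.
        return "easily depressed" if depressed_days >= D else "emotionally stable"
    # Equal counts: the claim labels the user as emotionally stable.
    return "emotionally stable"
```

With a hypothetical D = 5, a user with 7 impulsive-dominated days and 2 depressed-dominated days would be labeled "inflammable", whereas 4 and 2 would yield "emotionally stable".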
A character personality depicting method based on microblog text, characterized in that in said step four, the topics the user pays attention to are classified using a microblog topic dictionary, specifically as follows:
First, the weight w_topic of each type of topic involved in the user's microblog texts is calculated by the following formula:
$$ w_{topic} = \sum_{word} w_t \cdot \mathrm{count}(word) $$
where w_t denotes the weight, in the topic dictionary, of a word `word` appearing in all microblog texts that the user posted within a given period;
Then, for a given user, the weights of the five kinds of topics involved in all microblogs the user posted within the given period are calculated respectively and sorted, and the top N topics by weight are taken as the topics the user's microblog texts pay attention to; N is greater than or equal to 1 and less than or equal to 3.
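The topic weighting and top-N selection can be sketched as below. This is an assumed reading, not the patented implementation: the layout of the topic dictionary (topic name mapped to a word-to-weight table), the pre-segmented token list, and the interpretation of count(word) as the number of occurrences of the word in the user's texts are all assumptions, and the topic names in the usage note are placeholders.

```python
from collections import Counter
from typing import Dict, List, Sequence


def topic_weights(tokens: Sequence[str],
                  topic_dict: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Compute w_topic = sum over words of w_t * count(word) for each topic."""
    counts = Counter(tokens)  # count(word) over all of the user's texts
    return {
        topic: sum(w_t * counts[word] for word, w_t in words.items())
        for topic, words in topic_dict.items()
    }


def top_topics(weights: Dict[str, float], n: int = 1) -> List[str]:
    """Return the n highest-weighted topics; the claim restricts n to 1..3."""
    if not 1 <= n <= 3:
        raise ValueError("N must be between 1 and 3")
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [topic for topic, _ in ranked[:n]]
```

For instance, with a toy dictionary `{"sports": {"match": 2.0, "team": 1.0}, "finance": {"stock": 3.0}}` and tokens `["match", "team", "team", "stock"]`, the sports topic would receive weight 4.0 and finance 3.0, so `top_topics(..., 1)` would select sports.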
A character personality depicting method based on microblog text, characterized in that said step six is specifically as follows:
All microblog texts that the user posted within a given period are collected, and the words appearing in them that are included in the critical dictionary are counted; it is judged whether the number of distinct such words is greater than or equal to threshold K; if so, the user is labeled "criticism type"; otherwise, the user is labeled "other";
Threshold K is set according to expert experience, or is an empirical value obtained from statistics on the microblog text data.
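A minimal sketch of this language-feature check follows, assuming the user's texts have already been segmented into tokens and that the critical dictionary is available as a set of words. The function name, the label strings, and the sample words are hypothetical.

```python
from typing import Iterable, Set


def language_label(tokens: Iterable[str], critical_dict: Set[str], K: int) -> str:
    """Label the user by the number of DISTINCT critical-dictionary words used."""
    distinct_hits = set(tokens) & critical_dict  # repeated words count once
    return "criticism type" if len(distinct_hits) >= K else "other"
```

Because the comparison is on distinct words, a user who repeats a single critical word many times would still fall under "other" when K is 2 or more.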
A character personality depicting method based on microblog text, characterized in that in said step seven, the user's emotional features and language features are fused according to the combination method shown in the following table to depict the personality of the user, specifically:
The character personality labels finally obtained are "choleric type", "pessimistic type", "criticism type", "impulsive type", "depressive type" and "stable type".
CN201610559542.1A 2016-07-15 2016-07-15 A kind of character personality depicting method based on microblogging text Pending CN106202047A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610559542.1A CN106202047A (en) 2016-07-15 2016-07-15 A kind of character personality depicting method based on microblogging text

Publications (1)

Publication Number Publication Date
CN106202047A true CN106202047A (en) 2016-12-07

Family

ID=57474640

Country Status (1)

Country Link
CN (1) CN106202047A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663046A (en) * 2012-03-29 2012-09-12 中国科学院自动化研究所 Sentiment analysis method oriented to micro-blog short text
CN103530283A (en) * 2013-10-25 2014-01-22 苏州大学 Method for extracting emotional triggers
CN104636425A (en) * 2014-12-18 2015-05-20 北京理工大学 Method for predicting and visualizing emotion cognitive ability of network individual or group
CN104536953A (en) * 2015-01-22 2015-04-22 苏州大学 Method and device for recognizing textual emotion polarity

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JENNIFER GOLBECK et al.: "Predicting Personality from Twitter", 《2011 IEEE INTERNATIONAL CONFERENCE ON PRIVACY, SECURITY, RISK, AND TRUST, AND IEEE INTERNATIONAL CONFERENCE ON SOCIAL COMPUTING》 *
万丹琳: "基于中文微博的用户倾向挖掘与分析", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563929A (en) * 2017-07-27 2018-01-09 杭州中奥科技有限公司 A kind of various dimensions siren based on personage's specificity analysis
CN107577782A (en) * 2017-09-14 2018-01-12 国家计算机网络与信息安全管理中心 A kind of people-similarity depicting method based on heterogeneous data
CN107577782B (en) * 2017-09-14 2021-04-30 国家计算机网络与信息安全管理中心 Figure similarity depicting method based on heterogeneous data
CN110096575A (en) * 2019-03-25 2019-08-06 国家计算机网络与信息安全管理中心 Psychological profiling method towards microblog users
CN110096575B (en) * 2019-03-25 2022-02-01 国家计算机网络与信息安全管理中心 Psychological portrait method facing microblog user
CN111489263A (en) * 2019-11-20 2020-08-04 北京中人网信息咨询股份有限公司 Humanized behavior model analysis self-drawing system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161207