CN106295702B - A social platform user classification method based on individual emotional behavior analysis - Google Patents

Publication number
CN106295702B
Authority
CN
China
Prior art keywords
user
emotional
forwarding
individual
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610668449.4A
Other languages
Chinese (zh)
Other versions
CN106295702A (en)
Inventor
於志文
马超
王柱
郭斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest University
Priority to CN201610668449.4A
Publication of CN106295702A
Application granted
Publication of CN106295702B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06Q10/40

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a social platform user classification method based on emotional behavior analysis, comprising the following steps: S1, constructing a forwarding tree; S2, constructing user history records by extracting the nodes that share a user ID to form that user's individual forwarding history; S3, constructing features that describe the user's emotional behavior; S4, using the features from S3 to build a decision-tree-based user role classification model, completing the classification of social platform users based on emotional behavior analysis. The invention constructs a relatively comprehensive model of user emotional behavior that takes fuller account of the user's personal history; the method makes full use of the user's personal information, propagation structure information, emotional information, and dynamic time-domain information in microblogs. By adopting these measures, the invention can obtain better classification accuracy.

Description

A social platform user classification method based on individual emotional behavior analysis

Technical field

The invention belongs to the technical field of social networks, and in particular relates to a social platform user classification method based on individual emotional behavior analysis.

Background

With the development of Internet technology, online social networks, of which Weibo is representative, have come into large-scale use. Users can publish information themselves and can interact with other content by forwarding, commenting, liking, and so on. As in real-world social networks, user behavior on online social networks conveys more than literal information: it also carries the user's emotional attitude. This attitude varies with the user's personal background and habits and runs through all of the user's interactions. We call this emotional characteristic the user's emotional role.

Current research on online social network users mainly covers the following areas. 1. Mining user influence: this research analyzes users' personal attributes and information dissemination characteristics to build models or algorithms that describe a user's social influence, compute that influence, and discover social leaders. 2. Predicting users' online behavior: this research models users by considering factors such as user history, context, and social relationships in order to predict specific behaviors or preferences, for example whether a user will forward a message or is interested in it. 3. User sentiment analysis: this research starts from the question of what emotion a user will have at a given moment, and analyzes and predicts user emotion using multiple data sources (text, pictures, video, music, etc.), combined online and offline data, and social relationships. These studies reveal, to some extent, the regularities of users' online behavior and the internal workings of social networks, but they lack a comprehensive consideration of user emotion.

Summary of the invention

In view of the above problems, the present invention analyzes users from the perspective of personal emotion and provides a social platform user classification method based on individual emotional behavior analysis. The specific technical solution is as follows:

A social platform user classification method based on individual emotional behavior analysis, comprising the following steps:

S1. Construct forwarding trees: extract the forwarding information of social platform users and build social platform forwarding trees with a tree topology;

S2. Construct user history records: perform emotion computation on the forwarding information of each node in the forwarding trees and classify each result as positive, negative, or neutral; extract the nodes that share a user ID to build that user's individual forwarding history;

S3. Construct features describing user emotional behavior: the user tendency features, namely the individual-group emotional relationship ER_u and the user's personal historical emotional preference HP_u; and the user emotional influence feature EI_u;

S4. Use the features from S3 to build a decision-tree-based user role classification model. First construct the input vector U_u = <ER_u, HP_u, EI_u>; then compute the information entropy of each feature, where U_j denotes the j-th feature; select the feature with the largest information gain to build the current decision node; and recurse layer by layer over the remaining features to obtain the final decision tree model, thereby completing emotion-based user classification.

Further, the forwarding information in S1 includes the original text, the forwarded text, and the individual information of the participating users.

Further, S1 performs text sentiment parsing level by level from the bottom up, adding forwarding nodes layer by layer to build the forwarding tree.

Further, the emotion computation in S2 uses a multi-rule-set model: an emotion dictionary and grammar rules are built bottom-up using pointwise mutual information over the text, where bottom-up means analyzing in the order of words, phrases, short sentences, and whole sentences.

Further, the individual-group emotional relationship in S3 is based on the individual's emotional choice and the distribution of group emotion. It is described by the emotional relationship factor ER_u(w) between the individual and the current text message, which takes values from -1 to 1: the larger the value, the more positive the current relationship; the smaller the value, the more negative. It is defined in terms of the following quantities.

N(w), P(w), and O(w) denote the negative, positive, and neutral emotion distributions within the current forwarding tree, and S(w) denotes the size of the forwarding tree.

Further, the individual historical emotional preference HP_u(e) in S3 is based on the emotion distribution in the user's history records and the user's comment participation C_u(w) in historical forwards.

In its formula, exp{-θ_1(t_0 - t_w)} controls the time decay of the user's preference, and log(C_u(w) + 2) describes the user's degree of participation through comment length.

Further, the emotional influence EI_u in S3 is based on the structural characteristic SF_u(w) of the forwarding tree, the time-domain influence TF_u(w) of the forwarding tree, and the user's emotional change EF_u(w).

HR_u denotes the number of the user's forwards that are internal nodes, and NR_u denotes the number of the user's forwards that are leaf nodes.

Further, the structural characteristic SF_u(w) of the forwarding tree is based on the absolute size S(w) of the forwarding tree, the relative size S_u(w), and the subtree depth DP_u(w).

Further, the time-domain influence TF_u(w) of the forwarding tree is the forwarding tree's contribution to information dissemination from the perspective of time. The contribution shows in two aspects: the survival time of the subtree relative to the whole forwarding tree, and the time delay of the subtree relative to the original text.

Here LP_u(w) is the subtree life cycle and LP(w) is the forwarding tree life cycle; the ratio of the two gives the survival time of the subtree relative to the whole forwarding tree, and exp{-ε(t_u - t_w)} is the time-domain delay with which the subtree appears.

Further, the user's emotional change EF_u(w) takes the current user's forwarding action as the time boundary: the difference between the emotion distributions before and after the user's forward is computed, and the result is normalized with an exponential function.

B_u(w, e) and A_u(w, e) are the emotion distributions before and after the user's forward, respectively.

The present invention has the following beneficial effects:

To describe users' online emotional behavior systematically, the invention defines six emotional roles for microblog users: positive leader, positive follower, negative leader, negative follower, neutral leader, and neutral follower. It proposes a social platform user classification method based on individual emotional behavior analysis, which builds a model of user emotional behavior along two dimensions: emotional tendency and emotional influence.

Because the technical solution uses both the user emotional tendency features and the user influence features, it builds a relatively comprehensive model of user emotional behavior that takes fuller account of the user's personal history. The method makes full use of the user's personal information, propagation structure information, emotional information, and dynamic time-domain information in microblogs. By adopting these measures, the invention can obtain better classification accuracy.

Description of the drawings

Fig. 1 is a flow chart of the social platform user classification method based on individual emotional behavior analysis of the present invention;

Fig. 2 is an example of a user history record in the method of the present invention;

Fig. 3 shows the distribution of the structural characteristic in the method of the present invention;

Fig. 4 shows the distribution of the time-domain characteristic in the method of the present invention;

Fig. 5 shows the parameter learning results of the method of the present invention;

Fig. 6 shows the distribution of the emotional change characteristic in the method of the present invention;

Fig. 7 shows the distribution of the individual-macro emotional relationship in the method of the present invention;

Fig. 8 shows the distribution of the historical emotional preference results in the method of the present invention;

Fig. 9 shows the emotional influence results of the method of the present invention.

Detailed description

To make the objects and advantages of the present invention clearer, the invention is described in further detail below with reference to an embodiment. It should be understood that the specific embodiment described here only explains the invention and does not limit it.

Embodiment

S1. Construct forwarding trees: extract the forwarding information of social platform users and build social platform forwarding trees with a tree topology.

Taking Weibo as an example, forwarding data is crawled from Weibo, retaining the user information, forwarding information, and original microblog information. Using the Weibo forwarding identifier "//@" and the nickname of the upstream user, the text is parsed level by level from the bottom up and forwarding nodes are added layer by layer to build the microblog forwarding trees. In total, information on 19,389 users was collected and 7,096 forwarding trees were built.
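The parsing step above can be sketched as follows. This is a minimal illustration and not the patented implementation: the `Node` class, the function names, the assumption that each "//@" segment has the form `nickname:text` with an ASCII colon, and the choice to match existing nodes by nickname are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One forward in a forwarding tree (hypothetical structure)."""
    user: str
    text: str
    children: list = field(default_factory=list)


def parse_chain(user, repost_text):
    """Split 'c0//@B:c1//@A:c2' into [(A, c2), (B, c1), (user, c0)].

    The segment furthest to the right is closest to the original post,
    so the list is returned root-side first.
    """
    parts = repost_text.split("//@")
    chain = [(user, parts[0])]
    for segment in parts[1:]:
        name, _, text = segment.partition(":")
        chain.append((name, text))
    chain.reverse()
    return chain


def add_repost(root, user, repost_text):
    """Walk the chain from the root, creating missing nodes layer by layer."""
    node = root
    for name, text in parse_chain(user, repost_text):
        # Assumption: a nickname identifies at most one child per parent.
        child = next((c for c in node.children if c.user == name), None)
        if child is None:
            child = Node(name, text)
            node.children.append(child)
        node = child
    return node
```

Calling `add_repost` once per crawled forward of the same original microblog grows one tree; forwards that share an upstream chain reuse the nodes already created.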

S2. Construct user history records: perform emotion computation on the forwarding information of each node in the forwarding trees and classify each result as positive, negative, or neutral; extract the nodes that share a user ID to build that user's individual forwarding history.

Using the multi-rule-set model, emotion computation is performed on the text of every node in the forwarding trees, yielding one of three results: positive, negative, or neutral. Then, using the user information contained in each forwarding node, the nodes that share a user ID are extracted to build the user's personal forwarding history, which is stored as an XML file. Fig. 2 shows an example history record for one user: <uid_1796678344> represents a user, <retweet> is one forward by that user, and <org_id>, <org_text>, <org_time>, <org_emotion>, <p_name>, <p_id>, <w_id>, <w_test>, <w_time>, and <w_emotion> are the attributes of that forward.
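The grouping and storage step can be sketched with Python's standard `xml.etree.ElementTree` module. The element names `uid_…`, `retweet`, `w_id`, `w_time`, and `w_emotion` come from the description of Fig. 2; the input dictionary layout, the function name, the subset of attributes written, and the sort by time are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_histories(nodes):
    """Group forwarding nodes by user ID and serialize one XML record per user.

    nodes: iterable of dicts with keys 'uid', 'w_id', 'w_time', 'w_emotion'
    (a hypothetical input layout).
    Returns a dict mapping user ID to an XML string.
    """
    by_user = defaultdict(list)
    for node in nodes:
        by_user[node["uid"]].append(node)

    records = {}
    for uid, items in by_user.items():
        user_el = ET.Element(f"uid_{uid}")
        # Assumption: forwards are stored in chronological order.
        for node in sorted(items, key=lambda n: n["w_time"]):
            retweet = ET.SubElement(user_el, "retweet")
            for key in ("w_id", "w_time", "w_emotion"):
                ET.SubElement(retweet, key).text = str(node[key])
        records[uid] = ET.tostring(user_el, encoding="unicode")
    return records
```

Each returned string can be written to its own file; the remaining attributes listed for Fig. 2 would be added to the inner loop in the same way.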

S3. Construct features describing user emotional behavior: the user tendency features, namely the individual-group emotional relationship ER_u and the user's personal historical emotional preference HP_u; and the user emotional influence feature EI_u.

User emotional tendency is built from two perspectives: the individual-macro emotional relationship and the user's personal historical emotional preference. For the former, ER_u(w) denotes the emotional relationship factor between the user and the current microblog. It takes values between -1 and 1; the larger the value, the more positive the current relationship, and the smaller the value, the more negative. So that neutral emotion lies near 0, the origins for positive and negative are set to 0.5 and -0.5, respectively.

N(w), P(w), and O(w) denote the three emotion distributions (negative, positive, neutral) within the current forwarding tree, and S(w) denotes the size of the forwarding tree.

The user's personal historical emotional preference HP_u(e) is based on the emotion distribution in the user's history records and the user's comment participation C_u(w) in historical forwards. An exponential term controls the time decay of the user's preference, using the most recent microblog posting time t_0 as the reference point, and a logarithmic term describes the user's degree of participation through comment length.
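The two terms named for HP_u(e) can be illustrated with a short sketch. Only the decay term exp{-θ_1(t_0 - t_w)} and the participation term log(C_u(w) + 2) are taken from the text. How the patent combines them is not recoverable here, so multiplying them, summing per emotion, and normalizing to a distribution are assumptions, as are the function name, the value of `theta`, and the time unit.

```python
import math
from collections import defaultdict


def historical_preference(history, t0, theta=0.1):
    """Weight each past forward by recency and comment length, per emotion.

    history: list of (emotion, t_w, comment_length) tuples (hypothetical layout).
    t0: reference time, the most recent posting time.
    theta: decay rate (assumed value, not taken from the patent).
    """
    weights = defaultdict(float)
    for emotion, t_w, comment_length in history:
        decay = math.exp(-theta * (t0 - t_w))        # time decay term
        participation = math.log(comment_length + 2)  # comment-length term
        weights[emotion] += decay * participation     # assumed combination
    total = sum(weights.values())
    # Assumed normalization so that the preferences sum to 1.
    return {e: w / total for e, w in weights.items()} if total else {}
```

With this weighting, a recent forward with a long comment counts for more than an old forward with no comment; the "+ 2" keeps the logarithm positive even when the comment is empty.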

User emotional influence is described in terms of the structural characteristics, time-domain characteristics, and emotional change of forwarding. The structural characteristic SF_u(w) of a forward weighs the absolute size S(w) of the forwarding tree, the relative size S_u(w), and the subtree depth DP_u(w).

Fig. 3 shows the distribution of the computed SF_u(w). For the same forwarding size, a deeper subtree is a sparser subtree and a shallower one is bushier, and a bushier subtree tends to have a wider range of influence.

Unlike the structural characteristic, the time-domain influence TF_u(w) describes the forwarding tree's contribution to information dissemination from the perspective of time. This contribution shows mainly in two aspects: first, the survival time of the subtree relative to the whole forwarding tree; second, the time delay of the subtree relative to the original microblog. TF_u(w) jointly considers the subtree life cycle LP_u(w), the forwarding tree life cycle LP(w), and the time-domain delay exp{-ε(t_u - t_w)} with which the subtree appears, where ε controls the decay rate.
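The two time-domain components can be computed separately as in the sketch below. The delay term exp{-ε(t_u - t_w)} is quoted from the text and ε = 0.2 is the value the description reports; expressing relative survival as the ratio LP_u(w)/LP(w) is an assumption, and the way the patent combines the two components into TF_u(w) is not reproduced. The function and parameter names are hypothetical.

```python
import math


def time_domain_components(lp_subtree, lp_tree, t_u, t_w, epsilon=0.2):
    """Return (relative survival, delay factor) for one subtree.

    lp_subtree: life cycle of the user's subtree, LP_u(w).
    lp_tree:    life cycle of the whole forwarding tree, LP(w).
    t_u, t_w:   time of the user's forward and of the original post.
    """
    # Assumed form: survival of the subtree as a share of the tree's life.
    survival = lp_subtree / lp_tree if lp_tree > 0 else 0.0
    # Delay term from the description: later subtrees are discounted.
    delay = math.exp(-epsilon * (t_u - t_w))
    return survival, delay
```

A subtree that appears immediately (t_u = t_w) gets a delay factor of 1, and the factor shrinks toward 0 the later the subtree starts.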

In this method ε is set to 0.2 on the basis of experimental accuracy. Fig. 4 shows the distribution of the computed TF_u(w). α_1 and β_1 are learned parameters: each feature is validated by classifying with it alone, stepping the parameter in increments of 0.1 and taking the value with the highest accuracy as the actual parameter value. The results, obtained here with a decision tree classifier, are shown in Fig. 5; the parameter values are accordingly set to 0.6 and 0.7, respectively.
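The parameter selection described here amounts to a grid search with a step of 0.1. The sketch below shows that search in isolation. The search range [0, 1], the joint search over both parameters, and the `score` callback are assumptions; in practice `score` would return the accuracy of a classifier trained on the feature computed with the candidate parameters.

```python
def grid_search(score, step=0.1):
    """Return the (alpha, beta) pair on the grid that maximizes score.

    score: callable taking (alpha, beta) and returning a number to maximize
           (hypothetical; stands in for single-feature classification accuracy).
    step:  grid spacing, 0.1 as in the description.
    """
    n = round(1 / step)
    # Round to suppress floating-point drift such as 0.30000000000000004.
    grid = [round(i * step, 10) for i in range(n + 1)]
    candidates = ((a, b) for a in grid for b in grid)
    return max(candidates, key=lambda pair: score(*pair))
```

For example, with a synthetic score that peaks at the values reported in the description, `grid_search(lambda a, b: -((a - 0.6) ** 2 + (b - 0.7) ** 2))` selects that grid point. The synthetic score only demonstrates the search and says nothing about the real accuracy surface.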

The emotional change EF_u(w) takes the current user's forwarding action as the time boundary. The emotion distributions before and after the user's forward are denoted B_u(w, e) and A_u(w, e); the difference between them is computed as |B_u(w, e) - A_u(w, e)| and normalized with an exponential function.
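The before/after comparison can be sketched as follows. The absolute difference |B_u(w, e) - A_u(w, e)| is taken from the text; representing each distribution as the share of each emotion label among the forwards on one side of the pivot is an assumption, and the exponential normalization is left out because its exact form is not given here. All names are hypothetical.

```python
from collections import Counter

EMOTIONS = ("positive", "negative", "neutral")


def distribution(labels):
    """Share of each emotion among the given labels (zeros if empty)."""
    counts = Counter(labels)
    total = len(labels)
    return {e: (counts[e] / total if total else 0.0) for e in EMOTIONS}


def emotion_shift(labels_in_time_order, pivot_index):
    """Per-emotion |B - A| around the user's own forward.

    labels_in_time_order: emotion labels of a tree's forwards, oldest first.
    pivot_index: position of the current user's forward in that list.
    """
    before = distribution(labels_in_time_order[:pivot_index])
    after = distribution(labels_in_time_order[pivot_index + 1:])
    return {e: abs(before[e] - after[e]) for e in EMOTIONS}
```

A large shift for some emotion indicates that the mix of emotions in the tree changed after the user's forward, which is the signal this feature is meant to capture.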

Fig. 6 shows the distribution of the computed EF_u(w).

S4. Use the features from S3 to build a decision-tree-based user role classification model. First construct the input vector U_u = <ER_u, HP_u, EI_u>; then compute the information entropy of each feature, where U_j denotes the j-th feature; select the feature with the largest information gain to build the current decision node; and recurse layer by layer over the remaining features to obtain the final decision tree model, thereby completing emotion-based user classification.
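The selection of a single decision node by information gain can be sketched in plain Python. Entropy and information gain are the standard definitions; splitting each continuous feature at a threshold (in the style of C4.5) is an assumption, since the text does not say how continuous values are partitioned. The function names and the row layout are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Gain from splitting on value <= threshold versus value > threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - remainder


def best_feature(rows, labels):
    """Pick the (feature, threshold, gain) with the largest information gain.

    rows: list of dicts mapping feature name to a numeric value.
    """
    best = (None, None, 0.0)
    for feature in rows[0]:
        values = [row[feature] for row in rows]
        for threshold in sorted(set(values))[:-1]:
            gain = information_gain(values, labels, threshold)
            if gain > best[2]:
                best = (feature, threshold, gain)
    return best
```

Applying `best_feature` to the rows on each side of the chosen split, and repeating until a node is pure or no split gains anything, gives the layer-by-layer recursion the step describes.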

The results of S3 are fused into features that jointly describe the user's emotional tendency (ER_u, HP_u) and emotional influence (EI_u), which serve as the model input.

EI_u fuses the three influence features. Because leaf nodes exert no influence, a denoising factor is introduced, in which HR_u denotes the number of the user's forwards that are internal nodes and NR_u the number that are leaf nodes. Fig. 7 shows the distribution of the computed ER_u for the current data set, Fig. 8 that of HP_u, and Fig. 9 that of EI_u. Finally, the decision-tree-based classification method yields the six emotional role classes; the classification results are shown in Table 1.
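Counting HR_u and NR_u is a matter of checking, for each of a user's forwards, whether anyone forwarded it further. The sketch below assumes the trees are available as parent-child edge lists and that a separate mapping gives the user behind each forwarding node; both layouts and the function name are hypothetical. How the counts enter the denoising factor is not reproduced.

```python
from collections import defaultdict


def count_internal_and_leaf(edges, users):
    """Return {user: (HR, NR)} over the given forwarding nodes.

    edges: list of (parent_id, child_id) pairs from one or more trees.
    users: dict mapping forwarding node id to user id; the original post
           is a parent in edges but is not listed here.
    HR counts the user's forwards that were forwarded again (internal nodes);
    NR counts the user's forwards that nobody forwarded (leaf nodes).
    """
    has_child = {parent for parent, _ in edges}
    counts = defaultdict(lambda: [0, 0])
    for node_id, user in users.items():
        index = 0 if node_id in has_child else 1
        counts[user][index] += 1
    return {user: tuple(pair) for user, pair in counts.items()}
```

A user whose forwards are all leaves has HR_u = 0, which is the case the denoising factor is meant to discount.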

Table 1. Emotional role classification results for the embodiment

Emotional role            Accuracy
Positive leader (PL)      0.87
Positive follower (PF)    0.90
Neutral leader (OL)       0.83
Neutral follower (OF)     0.86
Negative leader (NL)      0.91
Negative follower (NF)    0.92

Claims (7)

1. A social platform user classification method based on individual emotional behavior analysis, comprising the following steps:
S1. Construct forwarding trees: extract the forwarding information of social platform users and build social platform forwarding trees with a tree topology;
S2. Construct user history records: perform emotion computation on the forwarding information of each node in the forwarding trees and classify each result as positive, negative, or neutral; extract the nodes that share a user ID to build that user's individual forwarding history;
S3. Construct features describing user emotional behavior: the user tendency features, namely the individual-group emotional relationship ER_u and the user's personal historical emotional preference HP_u; and the user emotional influence feature EI_u;
the individual-group emotional relationship is based on the individual's emotional choice and the distribution of group emotion, and is described by the emotional relationship factor ER_u(w) between the individual and the current text message, which takes values from -1 to 1, a larger value indicating a more positive current relationship and a smaller value a more negative one; it is defined in terms of N(w), P(w), and O(w), which denote the negative, positive, and neutral emotion distributions within the current forwarding tree, and S(w), which denotes the size of the forwarding tree;
the individual historical emotional preference HP_u(e) is based on the emotion distribution in the user's history records and the user's comment participation C_u(w) in historical forwards, and is expressed by a formula in which exp{-θ_1(t_0 - t_w)} controls the time decay of the user's preference and log(C_u(w) + 2) describes the user's degree of participation through comment length;
the emotional influence EI_u is based on the structural characteristic SF_u(w) of the forwarding tree, the time-domain influence TF_u(w) of the forwarding tree, and the user's emotional change EF_u(w), where HR_u denotes the number of the user's forwards that are internal nodes and NR_u denotes the number of the user's forwards that are leaf nodes;
S4. Use the features from S3 to build a decision-tree-based user role classification model: first construct the input vector U_u = <ER_u, HP_u, EI_u>; then compute the information entropy of each feature, where U_j denotes the j-th feature; select the feature with the largest information gain to build the current decision node; and recurse layer by layer over the remaining features to obtain the final decision tree model, thereby completing emotion-based user classification.

2. The social platform user classification method based on individual emotional behavior analysis according to claim 1, characterized in that the forwarding information in S1 includes the original text, the forwarded text, and the individual information of the participating users.

3. The social platform user classification method based on individual emotional behavior analysis according to claim 1, characterized in that S1 performs text sentiment parsing level by level from the bottom up, adding forwarding nodes layer by layer to build the forwarding tree.

4. The social platform user classification method based on individual emotional behavior analysis according to claim 1, characterized in that the emotion computation in S2 uses a multi-rule-set model in which an emotion dictionary and grammar rules are built bottom-up using pointwise mutual information over the text, where bottom-up means analyzing in the order of words, phrases, short sentences, and whole sentences.

5. The social platform user classification method based on individual emotional behavior analysis according to claim 1, characterized in that the structural characteristic SF_u(w) of the forwarding tree is based on the absolute size S(w) of the forwarding tree, the relative size S_u(w), and the subtree depth DP_u(w).

6. The social platform user classification method based on individual emotional behavior analysis according to claim 1, characterized in that the time-domain influence TF_u(w) of the forwarding tree is the forwarding tree's contribution to information dissemination from the perspective of time, the contribution showing in two aspects: the survival time of the subtree relative to the whole forwarding tree, and the time delay of the subtree relative to the original text; where LP_u(w) is the subtree life cycle, LP(w) is the forwarding tree life cycle, the ratio of the two is the survival time of the subtree relative to the whole forwarding tree, and exp{-ε(t_u - t_w)} is the time-domain delay with which the subtree appears.

7. The social platform user classification method based on individual emotional behavior analysis according to claim 1, characterized in that the user's emotional change EF_u(w) takes the current user's forwarding action as the time boundary, the difference between the emotion distributions before and after the user's forward being computed and normalized with an exponential function, where B_u(w, e) and A_u(w, e) are the emotion distributions before and after the user's forward, respectively.
CN201610668449.4A 2016-08-15 2016-08-15 A social platform user classification method based on individual emotional behavior analysis Expired - Fee Related CN106295702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610668449.4A CN106295702B (en) 2016-08-15 2016-08-15 A social platform user classification method based on individual emotional behavior analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610668449.4A CN106295702B (en) 2016-08-15 2016-08-15 A social platform user classification method based on individual emotional behavior analysis

Publications (2)

Publication Number Publication Date
CN106295702A CN106295702A (en) 2017-01-04
CN106295702B CN106295702B (en) 2019-10-25

Family

ID=57670975

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610668449.4A Expired - Fee Related CN106295702B (en) 2016-08-15 2016-08-15 A social platform user classification method based on individual emotional behavior analysis

Country Status (1)

Country Link
CN (1) CN106295702B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107369099B (en) * 2017-06-28 2021-01-22 江苏云机汇软件科技有限公司 User behavior analysis system facing social network
CN107563429B (en) * 2017-07-27 2020-11-10 国家计算机网络与信息安全管理中心 Method and device for classifying network user groups
CN107608792B (en) * 2017-09-12 2020-09-01 中国联合网络通信集团有限公司 Resource scheduling method and device
CN107590742B (en) * 2017-10-16 2021-06-22 东北大学 A behavior-based inversion method for user attribute value in social network
CN108268624B (en) * 2018-01-10 2020-04-24 华控清交信息科技(北京)有限公司 User data visualization method and system
CN109271634B (en) * 2018-09-17 2022-07-01 重庆理工大学 A sentiment polarity analysis method for microblog text based on user sentiment tendency perception
CN111565322B (en) * 2020-05-14 2022-03-04 北京奇艺世纪科技有限公司 User emotional tendency information obtaining method and device and electronic equipment
CN113158082B (en) * 2021-05-13 2023-01-17 和鸿广科技(上海)有限公司 An artificial intelligence-based method for analyzing the authenticity of media content
JP7582148B2 (en) * 2021-10-01 2024-11-13 トヨタ自動車株式会社 State prediction system, member determination system, and state prediction method

Citations (4)

Publication number Priority date Publication date Assignee Title
CN103678613A (en) * 2013-12-17 2014-03-26 北京启明星辰信息安全技术有限公司 Method and device for calculating influence data
CN105320960A (en) * 2015-10-14 2016-02-10 北京航空航天大学 Voting based classification method for cross-language subjective and objective sentiments
CN105631748A (en) * 2015-12-21 2016-06-01 西北工业大学 Parallel label propagation-based heterogeneous network community discovery method
CN105654115A (en) * 2015-12-28 2016-06-08 西北工业大学 Density adaptive clustering method orienting behavior identification

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US8856056B2 (en) * 2011-03-22 2014-10-07 Isentium, Llc Sentiment calculus for a method and system using social media for event-driven trading

Non-Patent Citations (5)

Title
Cross-Domain and Cross-Category Emotion Tagging for Comments of Online News; Ying Zhang et al.; "SIGIR '14: Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval"; 20140711; pp. 627-636 *
Discovering Information Propagation Patterns in Microblogging Services; ZHIWEN YU et al.; "ACM Transactions on Knowledge Discovery from Data"; 20150731; vol. 10, no. 1; Abstract, Sections 3, 6.1, and 7.2, Fig. 1 *
Featuring, Detecting, and Visualizing Human Sentiment in Chinese Micro-Blog; ZHIWEN YU et al.; "ACM Transactions on Knowledge Discovery from Data"; 20160530; vol. 10, no. 4; pp. 48:1-48:23 *
Lexicon-based Sentiment Analysis on Topical Chinese Microblog Messages; CUI Anqi et al.; "semantic web and science"; 20130502; Sections 1 and 2, Fig. 1 *
Sentiment Detection and Visualization of Chinese Micro-blog; Zhitao Wang et al.; "2014 International Conference on Data Science and Advanced Analytics (DSAA)"; 20141101; pp. 1-7 *

Also Published As

Publication number Publication date
CN106295702A (en) 2017-01-04

Similar Documents

Publication Publication Date Title
CN106295702B (en) A social platform user classification method based on individual emotional behavior analysis
Khanam et al. The homophily principle in social network analysis: A survey
Farnadi et al. Computational personality recognition in social media
CN105868317B (en) Digital education resource recommendation method and system
Kherwa et al. An approach towards comprehensive sentimental data analysis and opinion mining
CN113641807B (en) Training method, device, equipment and storage medium for dialogue recommendation model
CN115470991A (en) Network rumor propagation prediction method based on user short-time emotion and evolutionary game
CN104572982B (en) Personalized recommendation method and system based on problem guiding
CN108427715A A social network friend recommendation method incorporating trust degree
JP5698105B2 (en) Dialog model construction apparatus, method, and program
CN108549632B (en) A method for constructing social network influence propagation model based on sentiment analysis
CN107239489A Method for predicting and simulating online public opinion in emergencies based on the SOAR model
CN106202252A Travel recommendation method and system based on user emotion analysis
CN107103093A Short text recommendation method and device based on user behavior and sentiment analysis
CN103631862B (en) Event characteristic evolution excavation method and system based on microblogs
CN104850647A (en) Microblog group discovering method and microblog group discovering device
CN107305545A Method for identifying online opinion leaders based on text tendency analysis
Wohlgenannt et al. Extracting social networks from literary text with word embedding tools
CN107895027A Method and device for building an individual emotion knowledge graph
CN107451689A Topic trend forecasting method and device based on microblogs
Kanev et al. Sentiment analysis of multilingual texts using machine learning methods
Chen et al. Graph meets LLM: A novel approach to collaborative filtering for robust conversational understanding
Wang et al. A genealogy of information spreading on microblogs: A Galton-Watson-based explicative model
Farid et al. Detection of cyberbullying in tweets in Egyptian dialects
Wei et al. Analysis of information dissemination based on emotional and the evolution life cycle of public opinion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20191025
