CN106682770B - Dynamic microblog forwarding behavior prediction system and method based on friend circle - Google Patents


Info

Publication number
CN106682770B
CN106682770B CN201611151738.3A CN201611151738A CN106682770B CN 106682770 B CN106682770 B CN 106682770B CN 201611151738 A CN201611151738 A CN 201611151738A CN 106682770 B CN106682770 B CN 106682770B
Authority
CN
China
Prior art keywords
user
users
interest
microblog
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611151738.3A
Other languages
Chinese (zh)
Other versions
CN106682770A (en
Inventor
柳靓云
肖云鹏
杜江
刘宴兵
张克毅
李茜曦
李晓娟
宋晨光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN201611151738.3A priority Critical patent/CN106682770B/en
Publication of CN106682770A publication Critical patent/CN106682770A/en
Application granted granted Critical
Publication of CN106682770B publication Critical patent/CN106682770B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04 Real-time or near real-time messaging, e.g. instant messaging [IM]

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a dynamic microblog forwarding behavior prediction system and method based on a friend circle, and belongs to the field of social network information analysis.

Description

Dynamic microblog forwarding behavior prediction system and method based on friend circle
Technical Field
The invention relates to the field of social network information analysis, in particular to a dynamic microblog forwarding behavior prediction model constructed according to social network user behavior analysis.
Background
With the popularization of the Web 2.0 concept and the increasing maturity of related technologies, social networking sites such as Twitter, Facebook and Sina Weibo have come to exert great influence on people's lives. People update their status or post broadcasts on these sites to show their daily lives, express their thoughts, or share information with friends. Social networking sites provide a very convenient platform for users to communicate with each other and to express opinions and views. Modeling and predicting user behavior on social networking sites has important social significance and application value in fields such as security and business, and has attracted growing attention from researchers in recent years.
Existing research on forwarding behavior prediction includes text-based analysis, user-influence-based analysis, network-structure-based analysis and the like. Text-based analysis mainly uses probabilistic topic models to analyze texts and predicts the forwarding behavior of users according to the similarity between text topics and user interests. User-influence-based analysis builds prediction models on the social influence of users, and network-structure-based analysis builds them on the structure of the social network, for example the communities in which users participate.
The information forwarding behavior of a user is the result of the combined action of multiple factors. The prior art, however, does not consider this complexity and predicts forwarding behavior from a single aspect only, so the prediction results are inaccurate and the importance of each feature that influences user behavior cannot be evaluated.
Disclosure of Invention
The invention addresses the above problems in the prior art. Aiming at the problems of network dynamics, user behavior characterization, user feature importance evaluation and the like in information propagation, it provides a dynamic microblog forwarding behavior prediction system and method based on the friend circle, which can effectively estimate whether a message will be forwarded and can discover, as early as possible, microblogs that may cause a large-scale outbreak. The technical scheme of the invention is as follows:
A dynamic microblog forwarding behavior prediction system based on a friend circle comprises a user behavior data source acquisition module, an attribute extraction module, a model building module and a prediction analysis module. The user behavior data source acquisition module is used for acquiring user relationship and user behavior data in a social network and taking the fans of the posting user as candidate users. The microblog forwarding behavior prediction model building module is used for building a microblog forwarding behavior prediction model for the candidate users, in which the forwarding behavior is mainly determined by the interest difference τ between the candidate user and the candidate user's friends, the activity state s of the candidate user during the period in which the post is published, and the network influence r of the candidate user's friends, and for fitting the model parameters. The prediction analysis module is used for predicting, from the fitted parameters and the posting activity of users at any time t, whether a candidate user will forward the microblog.
Further, the attribute extraction module extracts a user interest vector for the interest difference among users, including: obtaining the follow list of each user by using the follow behavior attributes of the users, and defining the interest vector of user v as
$$I(v) = [e_{v,1}, e_{v,2}, \ldots, e_{v,|E_v|}]$$
where $e_{v,u}$ ($u = 1, 2, \ldots, |E_v|$) represents the u-th user in the follow list of user v, that is, a user that v is interested in, and $|E_v|$ represents the total number of users in the follow list of user v.
Further, the attribute extraction module extracts a user state vector for the activity of the candidate user, including: acquiring the microblog publishing activity and the microblog forwarding activity of each user within a period of time by using the interaction behavior attributes and the time attributes of the users, and defining the activity state vector of user v on time slice t as
$$L(v,t) = [x_{v,t,1}, x_{v,t,2}]$$
Wherein,
Figure BDA0001179868950000033
representing the microblog release activity of the user v on the time slice t,
Figure BDA0001179868950000034
representing the forwarded microblog activity of the user v on the time slice t,
Figure BDA0001179868950000035
Figure BDA0001179868950000036
and
Figure BDA0001179868950000037
respectively representing the number of microblogs issued by the user v in the time slice t, the number of forwarded microblogs and the average number of microblogs issued by the user v per day.
Further, the attribute extraction module extracts a user feature vector for the influence of the posting user, including: obtaining the out-degree, in-degree and local clustering coefficient of each user node by using the network topology attributes, and defining the influence feature vector of user v as
Figure BDA0001179868950000038
where $d_{v,1}$ represents the number of fans of user v, $d_{v,2}$ represents the number of friends of user v,
Figure BDA0001179868950000041
represents the local clustering coefficient of user v, $Ng_v$ is the set of neighboring nodes of node v, and $edg_{ij}$ is a connection between its neighboring nodes.
Further, the microblog forwarding behavior prediction model works as follows. For the interest difference among users, it extracts user interest vectors from user behavior and user relationship information and trains on all users with an LDA model to obtain the interest topic distribution of the users. For the activity of the candidate user, it extracts the user state vector on each time slice from user behavior and time information; because the elements of the user state vector are continuous values, LDA is first improved with Gaussian distributions, and the improved LDA model is trained on all users to obtain the activity state distribution of the users on each time slice. For the influence of the posting user, it extracts the feature vectors of the users from network structure information and, as with the state vectors, trains a Gaussian-improved LDA model on all users to obtain the network role distribution of the users. Finally, the whole prediction model is trained on the historical forwarding data of the users, according to whether the interest communities of the users are consistent, the activity state of the candidate user on each time slice, and the network role of the posting user, to obtain the forwarding behavior distribution of the users.
Further, acquiring the interest topic distribution of the users by the microblog forwarding behavior prediction model further includes: on the basis of the user relationship network, using the interaction behaviors among users to weight the interest vectors I(v) of the users, obtaining the weighted user interest vector
$$[w_{v,1}, w_{v,2}, \ldots, w_{v,N_v}]$$
where $w_{v,n}$ ($n = 1, 2, \ldots, N_v$) is the interaction object of the n-th interaction of user v and $N_v$ is the total number of interactions of user v. Training on all users with an LDA model then yields the interest topic distribution of the users.
Further, acquiring the activity state distribution of the users on each time slice further includes: because the publishing activity $x_{v,t,1}$ and the forwarding activity $x_{v,t,2}$ of a user are continuous variables, the LDA model is improved with Gaussian distributions so that the values of the publishing activity and the forwarding activity each obey a different Gaussian distribution:
$$p(x_{v,t,m} \mid s) = \frac{1}{\sqrt{2\pi}\,\sigma_{s,m}} \exp\left(-\frac{(x_{v,t,m} - \mu_{s,m})^2}{2\sigma_{s,m}^2}\right)$$
where $x_{v,t,m}$ represents the m-th attribute value of user v on time slice t, and $\mu_{s,m}$ and $\sigma_{s,m}$ are respectively the mean and standard deviation of the m-th attribute when the activity state of the user is s.
Further, by a time slicing method, each day is cut into 4 time periods starting from 0:00, i.e. t = 1, 2, 3, 4, and the activity state of a user is divided into three levels, namely very active, generally active and inactive. Training on all users with the improved LDA model then yields the activity state distribution of the users on each time slice.
Further, the user nodes are divided into three role types based on the network topology, namely opinion leaders, information propagators and ordinary users. After the LDA model is improved with Gaussian distributions, training on all users with the improved model yields the network role distribution of the users.
A dynamic microblog forwarding behavior prediction method based on a friend circle, using the above system, comprises the following steps:
Acquiring user relationship and user behavior data in a social network, and taking the fans of the posting user as candidate users; acquiring three user vectors, from the three aspects of interest difference among users, activity of the candidate users and influence of the posting user, as the input of a prediction model;
Constructing a microblog forwarding behavior prediction model and fitting the model parameters;
Inputting the fitted parameters and the posting activity of users at any time t into the prediction model to predict whether a candidate user will forward the microblog.
The invention has the following advantages and beneficial effects:
The invention provides a dynamic microblog forwarding behavior prediction method based on a friend circle. First, aiming at the diversity of the interest, activity and influence of a single user, it borrows the basic idea and method by which the LDA topic model handles polysemy (one word with several meanings) and synonymy (several words with one meaning), and performs modeling analysis on user behavior to obtain the topic distributions related to that behavior. Second, considering that the elements of the user state vector and the user feature vector are continuous values, it improves LDA with Gaussian distributions to discover the activity and the influence of users. Finally, aiming at the change of user activity over time, it uses time discretization and time slicing to provide an improved-LDA dynamic microblog forwarding behavior prediction model, which dynamically monitors the activity of users and improves the accuracy of the prediction model.
The invention provides a dynamic microblog forwarding behavior prediction method based on a friend circle, aiming at the problems of network dynamic characteristics, user behavior characterization, user characteristic importance evaluation and the like in information transmission, and the user forwarding behavior can be accurately predicted. According to the prediction result, whether the message can be forwarded or not and the forwarding scale of the message can be effectively estimated, the microblog which possibly causes large-scale outbreak can be found as soon as possible, and the method has great significance for microblog burstiness detection and microblog influence evaluation.
Drawings
FIG. 1 is a flowchart of a method for predicting a dynamic microblog forwarding behavior based on a friend circle according to an embodiment of the present invention;
FIG. 2 is a block diagram of a predictive model of the present invention;
FIG. 3 is a flow chart of the predictive model of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical solution of the present invention for solving the above technical problems is as follows.
Because information propagation in a social network is mainly driven by interest difference, user historical behavior and network structure, the invention starts from the three aspects of user interest, activity and influence. It uses the basic idea and method of the LDA topic model to perform modeling analysis on user behavior and obtain the topic distributions related to that behavior. It then improves LDA with Gaussian distributions, to handle the continuous variables present in the user attributes, and discovers the activity and the influence of users. Finally, aiming at the change of user activity over time, it uses time discretization and time slicing to provide an improved-LDA dynamic microblog forwarding behavior prediction model, so that the activity of users can be dynamically monitored, the forwarding behavior of users can be accurately predicted, and the key factors influencing user forwarding can be found.
Specifically, a social relationship network G = (V, E, Y) is given, where V represents all users in the network and |V| = N is the number of users; E represents the relationships among all users and is an N × N matrix; Y represents the series of past behaviors of the users and |Y| = I is the total number of user behavior records. A probabilistic generative model is designed that analyzes each user using the user relationship and user behavior information in the social network together with the influence of the time factor. Through 4 probabilistic generative processes, the interest distribution, activity level distribution, network role distribution and forwarding behavior distribution of each user are obtained, and the forwarding behavior of a user toward the microblogs of followed users within a period of time is predicted from these 4 distributions.
Fig. 1 shows the general flow chart of the present invention, which mainly comprises a data acquisition module, an attribute extraction module, a model construction module and a prediction analysis module.
The detailed implementation of the present invention is described in detail below.
S1: Acquire the data source. The acquired data comprises the user follow-relationship network and the behavior information of all users in that network; the user behavior includes the microblogs published and forwarded by the users and the times at which they were published and forwarded. Specifically, the following method can be used (the data can also be obtained by conventional methods in the prior art):
S11: Acquire raw data. The user follow-relationship network and the past behavior data of all users in that network are acquired. The raw data can be obtained through the public API of the social network or by directly downloading an existing data source, and can be supplemented by methods such as a web crawler.
S12: Perform simple data cleaning. Most of the data can be made usable for analysis by simple cleaning, such as deleting duplicate data and removing invalid nodes.
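As a minimal sketch of the cleaning in S12, assume (hypothetically) that each behavior record is a tuple `(user_id, action, post_id, timestamp)` and that an "invalid node" is a user who does not appear in the follow-relationship network. Neither the record layout nor the function name comes from the source.

```python
def clean_records(records, valid_users):
    """Drop exact duplicate records and records from users outside the network.

    records: iterable of hashable tuples whose first field is the user id
             (hypothetical layout, not specified by the source).
    valid_users: set of user ids present in the follow-relationship network.
    """
    seen = set()
    cleaned = []
    for rec in records:
        user_id = rec[0]
        if user_id not in valid_users:  # invalid node
            continue
        if rec in seen:                 # duplicate record
            continue
        seen.add(rec)
        cleaned.append(rec)
    return cleaned


raw = [
    ("a", "post", "m1", "2016-12-01T08:00:00"),
    ("a", "post", "m1", "2016-12-01T08:00:00"),     # exact duplicate
    ("x", "forward", "m1", "2016-12-01T09:00:00"),  # user not in the network
    ("b", "forward", "m1", "2016-12-01T09:30:00"),
]
cleaned = clean_records(raw, valid_users={"a", "b"})
```

Order is preserved, so later steps that depend on timestamps can still process the records sequentially.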
S13: Time-slice the data and determine the attributes of each user on each time slice. The user attributes here specifically refer to the publishing activity and the forwarding activity of the user. Because the forwarding behavior of a user is closely related to the user's daily routine, each day is time-sliced into periods of a preset length (such as 6 hours) according to that routine. Within a given time period t, the activity state of the user in that period is determined from the user attributes, in order to predict whether the user will forward the microblog of a friend.
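The slicing in S13 can be sketched as follows, assuming the 6-hour period given as an example and slices counted from 0:00 so that t runs over 1 to 4. The function names and the `(action, timestamp)` record layout are illustrative, not taken from the source.

```python
from collections import Counter
from datetime import datetime

SLICE_HOURS = 6  # example period from S13; 24 / 6 gives 4 slices per day


def time_slice(ts: datetime) -> int:
    """Return the 1-based slice index t of a timestamp, counting from 0:00."""
    return ts.hour // SLICE_HOURS + 1


def counts_per_slice(events):
    """Count events per (slice, action) for one user.

    events: iterable of (action, timestamp) pairs, where action is e.g.
            "post" or "forward" (hypothetical labels).
    """
    counts = Counter()
    for action, ts in events:
        counts[(time_slice(ts), action)] += 1
    return counts


events = [
    ("post", datetime(2016, 12, 1, 1, 30)),
    ("forward", datetime(2016, 12, 1, 5, 59)),
    ("post", datetime(2016, 12, 1, 6, 0)),
    ("forward", datetime(2016, 12, 1, 23, 59)),
]
per_slice = counts_per_slice(events)
```

These per-slice counts are the raw inputs from which the publishing and forwarding activity of a user on each slice are derived.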
S2: Extract the relevant attributes. Considering that forwarding behavior in a social network is mainly driven by three aspects, namely interest difference, user historical behavior and network structure, the invention extracts relevant attributes such as follow behavior attributes, interaction behavior attributes, time attributes and network structure attributes from the three aspects of user interest, activity and influence. These attributes may be adjusted as appropriate to the characteristics of the data.
And after extracting all the attributes of the three aspects, acquiring a corresponding user vector. The specific manner thereof is as follows.
S21: Extract the user interest vector. Considering that users follow the users they are interested in, the follow list of each user is obtained by using the follow behavior attributes of the users, and the interest vector of user v is defined as:
$$I(v) = [e_{v,1}, e_{v,2}, \ldots, e_{v,|E_v|}]$$
where $e_{v,u}$ ($u = 1, 2, \ldots, |E_v|$) represents the u-th user in the follow list of user v, that is, a user that v is interested in, and $|E_v|$ represents the total number of users in the follow list of user v. For example, if the users in the follow list of user a are b, c, d and e, then I(a) = [b, c, d, e].
S22: Extract the user state vector. According to the daily routine of users, each day is time-sliced into periods of a preset length (such as 6 hours). The microblog publishing activity and the microblog forwarding activity of each user in each time slice are obtained by using the interaction behavior attributes and the time attributes of the users, and the state vector of user v is defined as:
$$L(v,t) = [x_{v,t,1}, x_{v,t,2}]$$
wherein,
Figure BDA0001179868950000083
representing the microblog release activity of the user v on the time slice t,
Figure BDA0001179868950000084
and representing the activity of the forwarded microblog of the user v on the time slice t.
Figure BDA0001179868950000085
And
Figure BDA0001179868950000086
respectively represent the number of microblogs published by user v in time slice t, the number of microblogs forwarded by user v in time slice t, and the average number of microblogs published by user v per day. For example, user a publishes 3 microblogs on the 1st time slice, of which 2 are forwarded microblogs, and publishes 5 microblogs per day on average, so the state vector of user a is
Figure BDA0001179868950000087
S23: Extract the user feature vector. Because the position of a user node in the network has a great influence on information propagation, the out-degree, in-degree and local clustering coefficient of each user node are obtained by using the network topology attributes, and the feature vector of user v is defined as:
Figure BDA0001179868950000088
where $d_{v,1}$ represents the number of fans of user v, $d_{v,2}$ represents the number of friends of user v,
Figure BDA0001179868950000089
represents the local clustering coefficient of user v, $Ng_v$ is the set of neighboring nodes of node v, and $edg_{ij}$ is a connection between its neighboring nodes. For example, user a has 30 fans and 20 friends, the total number of neighbor nodes is 40, and 200 connecting edges exist between the neighbor nodes, so the feature vector of user a is
Figure BDA0001179868950000091
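The three features of S23 can be computed from a directed follow graph roughly as below. This is a sketch under stated assumptions: edges are pairs `(u, w)` meaning "u follows w", and the local clustering coefficient uses the directed convention, links among neighbors divided by k(k-1). The exact formula is in a figure not reproduced in this text, so the normalization is an assumption; an undirected convention would differ by a factor of 2.

```python
def feature_vector(v, edges):
    """Return [number of fans, number of friends, local clustering coefficient].

    edges: set of directed pairs (u, w) meaning user u follows user w.
    The clustering normalization k * (k - 1) is an assumed convention.
    """
    fans = {u for (u, w) in edges if w == v}      # users who follow v
    friends = {w for (u, w) in edges if u == v}   # users v follows
    neighbors = (fans | friends) - {v}
    k = len(neighbors)
    # Directed links whose two endpoints are both neighbors of v.
    links = sum(
        1 for (u, w) in edges
        if u in neighbors and w in neighbors and u != w
    )
    clustering = links / (k * (k - 1)) if k > 1 else 0.0
    return [len(fans), len(friends), clustering]


edges = {("b", "a"), ("c", "a"), ("a", "b"), ("b", "c")}
features_a = feature_vector("a", edges)
```

Under this assumed convention, 40 neighbors with 200 links among them would give 200 / (40 × 39) ≈ 0.128 for the third component.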
S3: Establish the prediction model. Fig. 2 shows a block diagram of the prediction model of the present invention. Whether a candidate user will forward the microblog of a friend is mainly determined by the interest difference τ between the candidate user and the friend, the activity state s of the candidate user during the period in which the post is published, and the network influence r of the friend.
Predicting with the model whether a candidate user will forward a given microblog of a friend specifically comprises the following. For the interest difference aspect, the interest vectors I(v) of the users are extracted from user behavior and user relationship information, all users are trained with an LDA model, and the interest community distribution of the users is obtained:
Figure BDA0001179868950000092
Wherein,
Figure BDA0001179868950000093
For the user activity aspect, the state vectors L(v, t) of the users on each time slice are extracted from user behavior and time information. Because the elements of the user state vector are continuous values, LDA is first improved with Gaussian distributions; all users are then trained with the improved LDA model to obtain the activity state distribution of the users on each time slice:
Figure BDA0001179868950000094
Wherein,
Figure BDA0001179868950000095
For the network influence aspect, the feature vectors F(v) of the users are extracted from network structure information. As with the user state vector, LDA is improved with Gaussian distributions, and all users are trained with the improved LDA model to obtain the network role distribution of the users:
Figure BDA0001179868950000096
Wherein,
Figure BDA0001179868950000097
represents the network role probability distribution of user v. Finally, according to the interest community distribution of the users
Figure BDA0001179868950000098
the activity state distribution of the users on each time slice
Figure BDA0001179868950000099
the network role distribution of the users
Figure BDA00011798689500000910
and the historical forwarding data Y of the users, the whole prediction model is trained to obtain the forwarding behavior distribution of the users
Figure BDA00011798689500000911
Wherein,
Figure BDA00011798689500000912
represents the probability that the candidate user forwards the microblog when the interest difference between the users is τ, the candidate user is in activity state s and the posting user plays network role r, and
Figure BDA00011798689500000913
indicating the probability of not forwarding. The solution of the model and how to predict the forwarding behavior of the candidate users over the respective time slices will be described in detail in the following section.
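One way the four distributions might be combined at prediction time is sketched below. It assumes that τ, s and r are independent given the two users and that the forwarding probability is obtained by summing over all their combinations. The source defers its actual solution to a later section, so the function, the argument names (including `psi` for the role distribution) and the independence assumption are illustrative only.

```python
def forward_probability(p_tau, theta, psi, rho):
    """Marginal probability that the candidate user forwards the microblog.

    p_tau: dict tau -> probability of interest difference tau between the users
    theta: dict s   -> probability the candidate user is in activity state s
                       on the current time slice
    psi:   dict r   -> probability the posting user plays network role r
                       (hypothetical symbol name)
    rho:   dict (tau, s, r) -> probability of forwarding given (tau, s, r)
    """
    return sum(
        p_tau[tau] * theta[s] * psi[r] * rho[(tau, s, r)]
        for tau in p_tau
        for s in theta
        for r in psi
    )


p_tau = {"same": 0.6, "different": 0.4}
theta = {"very_active": 0.2, "generally_active": 0.5, "inactive": 0.3}
psi = {"opinion_leader": 0.1, "propagator": 0.3, "ordinary": 0.6}

# A forwarding table that is 1 only for one combination and 0 elsewhere.
rho = {(tau, s, r): 0.0 for tau in p_tau for s in theta for r in psi}
rho[("same", "very_active", "opinion_leader")] = 1.0

p_forward = forward_probability(p_tau, theta, psi, rho)
```

A decision rule such as "predict forwarding when the probability exceeds 0.5" could be layered on top, but the source does not state a threshold.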
FIG. 3 is a flow chart of the predictive model of the present invention.
S31: Acquire the interest community distribution of the users.
A friend relationship only represents the possibility of interaction between users; it tends to be static and cannot truly reflect the strength of information interaction between them. To find active interest communities, the interaction behaviors among users are used on top of the user relationship network, and the interest vectors I(v) of the users are weighted by interaction. The interaction behavior here specifically refers to forwarding behavior, and the resulting weighted interest vector of user v is:
$$[w_{v,1}, w_{v,2}, \ldots, w_{v,N_v}]$$
where $w_{v,n}$ ($n = 1, 2, \ldots, N_v$) represents the interaction object of the n-th interaction of user v, and $N_v$ is the total number of interactions of user v. For example, user a interacts with user b 2 times and with user c 4 times.
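Since each entry of the weighted interest vector is the target of one interaction, the vector can be built by repeating every friend once per interaction. The sketch below assumes the interactions have already been counted per friend; the function name and the input layout are hypothetical, and the ordering of entries is arbitrary because LDA treats the vector as a bag of items.

```python
def weighted_interest_vector(interaction_counts):
    """Expand {friend: number of forwarding interactions} into a list with
    one entry per interaction, i.e. [w_1, ..., w_N] with N the total count."""
    vector = []
    for friend, count in interaction_counts.items():
        vector.extend([friend] * count)
    return vector


# User a interacts with b 2 times and with c 4 times.
weighted_a = weighted_interest_vector({"b": 2, "c": 4})
# weighted_a == ['b', 'b', 'c', 'c', 'c', 'c']
```

Friends that were never interacted with drop out of the vector, which matches the stated aim of finding active rather than static interest communities.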
Given C as the number of interest communities, all users are trained with an LDA model. The specific generative process is as follows:
For each user v:
1. Sample an edge distribution $\xi \sim \mathrm{Dir}(\lambda)$, where λ is the parameter of the Dirichlet distribution;
2. Sample a user interest community distribution $\Phi_v \sim \mathrm{Dir}(\alpha)$, where α is the parameter of the Dirichlet distribution;
3. For each edge $e_{v,i}$ of the user:
1) Sample an interest community $c \sim \mathrm{Multi}(\Phi_v)$
2) Sample an edge $e_{v,i} \sim \mathrm{Multi}(\xi_c)$
where $\Phi_v$ represents the interest community distribution of user v and $\xi_c$ represents the edge distribution of interest community c.
In this probabilistic generative model, modeling user behavior amounts to computing the interest community distribution of the users, $\Phi$, and the edge distribution of the interest communities, $\xi$. Φ and ξ are solved by Gibbs sampling; at each iteration they are estimated as:
$$\phi_{v,c} = \frac{n_{v,c} + \alpha}{|N_v| + C\alpha}$$
$$\xi_{c,e} = \frac{n_{c,e} + \lambda}{n_c + |E|\lambda}$$
where $\phi_{v,c}$ represents the probability that user v is in interest community c, with $\sum_{c=1}^{C} \phi_{v,c} = 1$; C is the total number of interest communities; $n_{v,c}$ represents the number of times user v interacts with users of interest in community c; and $|N_v|$ is the total number of interactions between user v and the friends of user v. Likewise, $\xi_{c,e}$ represents the probability that user e appears in interest community c, with $\sum_{e} \xi_{c,e} = 1$; $|E|$ is the total number of edges in the network; $n_{c,e}$ represents the number of interactions of user e in interest community c; and $n_c$ is the total number of interactions in interest community c.
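A compact collapsed Gibbs sampler in the standard LDA form is sketched below as one plausible reading of this step, not as the patent's exact procedure. It keeps the counts $n_{v,c}$, $n_{c,e}$ and $n_c$ described above and uses symmetric Dirichlet parameters α and λ. The function name, the integer encoding of interaction targets, the default hyperparameters, and the use of `num_targets` in the role of $|E|$ are all assumptions.

```python
import random


def gibbs_lda(docs, num_communities, num_targets, alpha=0.1, lam=0.01,
              iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA over users' interaction lists.

    docs: docs[v] is the weighted interest vector of user v, given as a
          list of integer target ids in range(num_targets).
    Returns (phi, xi) with phi[v][c] and xi[c][e] estimated from the counts
    after the last sweep.
    """
    rng = random.Random(seed)
    C, E = num_communities, num_targets
    n_vc = [[0] * C for _ in docs]      # interactions of user v assigned to c
    n_ce = [[0] * E for _ in range(C)]  # times target e is assigned to c
    n_c = [0] * C                       # total interactions assigned to c
    z = []                              # community of every interaction

    # Random initial assignment.
    for v, doc in enumerate(docs):
        z_v = []
        for e in doc:
            c = rng.randrange(C)
            z_v.append(c)
            n_vc[v][c] += 1
            n_ce[c][e] += 1
            n_c[c] += 1
        z.append(z_v)

    for _ in range(iterations):
        for v, doc in enumerate(docs):
            for i, e in enumerate(doc):
                # Remove the current assignment from the counts.
                c = z[v][i]
                n_vc[v][c] -= 1
                n_ce[c][e] -= 1
                n_c[c] -= 1
                # Full conditional, up to a constant factor.
                weights = [
                    (n_vc[v][k] + alpha) * (n_ce[k][e] + lam) / (n_c[k] + E * lam)
                    for k in range(C)
                ]
                c = rng.choices(range(C), weights=weights)[0]
                z[v][i] = c
                n_vc[v][c] += 1
                n_ce[c][e] += 1
                n_c[c] += 1

    phi = [[(n_vc[v][c] + alpha) / (len(doc) + C * alpha) for c in range(C)]
           for v, doc in enumerate(docs)]
    xi = [[(n_ce[c][e] + lam) / (n_c[c] + E * lam) for e in range(E)]
          for c in range(C)]
    return phi, xi


docs = [[0, 0, 1, 0], [1, 0, 0], [2, 3, 3, 2], [3, 2, 3]]
phi, xi = gibbs_lda(docs, num_communities=2, num_targets=4)
```

In practice the estimates would usually be averaged over several sweeps after a burn-in period rather than read from the final sweep alone; the single-sweep read-out here keeps the sketch short.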
S32: Acquire the activity state distribution of the users on each time slice.
The forwarding behavior of a user is closely related to the user's daily routine. Each user has relatively fixed hours online; during those hours the user is active and likely to post and reply, and at other times the user rarely takes part in spreading topics. Therefore, first, by the time slicing method, each day is cut into 4 periods (t = 1, 2, 3, 4) starting from 0:00, which discretizes the vector data in time. Second, because the publishing activity $x_{v,t,1}$ and the forwarding activity $x_{v,t,2}$ of a user are continuous variables, the LDA model is improved with Gaussian distributions so that the values of the publishing activity and the forwarding activity each obey a different Gaussian distribution:
$$p(x_{v,t,m} \mid s) = \frac{1}{\sqrt{2\pi}\,\sigma_{s,m}} \exp\left(-\frac{(x_{v,t,m} - \mu_{s,m})^2}{2\sigma_{s,m}^2}\right)$$
where $x_{v,t,m}$ represents the m-th attribute value of user v on time slice t, and $\mu_{s,m}$ and $\sigma_{s,m}$ are respectively the mean and standard deviation of the m-th attribute when the activity state of the user is s.
The invention sets the activity state of a user to three levels (S = 3), namely very active, generally active and inactive, and trains all users with the improved LDA model. The specific generative process is as follows:
For each user v:
1. Sample the activity state distribution of the user on time slice t, $\theta_v^{(t)} \sim \mathrm{Dir}(\beta)$, where β is the parameter of the Dirichlet distribution;
2. Sample an activity level $s \sim \mathrm{Multi}(\theta_v^{(t)})$;
3. For each attribute m of user v:
1) Sample an attribute value $x_{v,t,m} \sim N(\mu_{s,m}, \sigma_{s,m})$
where $\theta_v^{(t)}$ represents the activity state distribution of user v on time slice t.
In this probabilistic generative model, modeling the user state attributes amounts to computing the distribution of the user's active state on each time slice
Figure BDA0001179868950000121
and the Gaussian distribution N(μ, σ) that each attribute value of the user obeys. θ^{(t)}, μ and σ are solved with the EM algorithm; each EM iteration that estimates θ^{(t)}, μ and σ is divided into two steps:
E-step: update
Figure BDA0001179868950000122
Figure BDA0001179868950000123
M-step: update μ_{s,m} and σ_{s,m}
Figure BDA0001179868950000124
Figure BDA0001179868950000125
where
Figure BDA0001179868950000126
represents the probability that user v is in active state s on time slice t, S is the number of state levels,
Figure BDA0001179868950000127
M is the number of user state attributes, x_{v,t,m} represents the m-th attribute value of user v on time slice t, and μ_{s,m} and σ_{s,m} are respectively the mean and standard deviation of the m-th attribute when the user's active state is s.
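A rough sketch of one such EM iteration follows. It is not the patent's exact update, whose formulas are in the images above. It assumes that each (user, time slice) pair contributes several observations (for example one per day), that θ is re-estimated from Dirichlet-smoothed expected state counts, and that μ and σ are re-estimated as responsibility-weighted moments; the name `em_step` and all data are hypothetical.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_step(obs, theta, mu, sigma, beta=1.0, min_sigma=1e-3):
    """Run one EM iteration and return updated (theta, mu, sigma).

    obs   : list of (group, x); group names a (user, time slice) pair and
            x is the list of M attribute values of one observation
    theta : dict mapping group -> list of S state probabilities
    mu, sigma : S x M nested lists of Gaussian means / standard deviations
    beta  : symmetric Dirichlet smoothing (an assumption of this sketch)
    """
    S, M = len(mu), len(mu[0])

    # E-step: posterior state probabilities for every observation.
    resp = []
    for group, x in obs:
        weights = []
        for s in range(S):
            w = theta[group][s]
            for m in range(M):
                w *= gaussian_pdf(x[m], mu[s][m], sigma[s][m])
            weights.append(w)
        total = sum(weights)
        if total > 0.0:
            resp.append([w / total for w in weights])
        else:  # numerical underflow: fall back to a uniform posterior
            resp.append([1.0 / S] * S)

    # Update theta for each group from its smoothed expected state counts.
    new_theta = {}
    for group in theta:
        rows = [r for (g, _), r in zip(obs, resp) if g == group]
        new_theta[group] = [
            (sum(r[s] for r in rows) + beta) / (len(rows) + S * beta)
            for s in range(S)
        ]

    # M-step: responsibility-weighted mean and standard deviation.
    new_mu = [row[:] for row in mu]
    new_sigma = [row[:] for row in sigma]
    for s in range(S):
        weight = sum(r[s] for r in resp)
        if weight == 0.0:
            continue  # no mass on this state; keep the old parameters
        for m in range(M):
            pairs = list(zip(obs, resp))
            mean = sum(r[s] * x[m] for (_, x), r in pairs) / weight
            var = sum(r[s] * (x[m] - mean) ** 2 for (_, x), r in pairs) / weight
            new_mu[s][m] = mean
            new_sigma[s][m] = max(math.sqrt(var), min_sigma)
    return new_theta, new_mu, new_sigma


# Tiny illustrative run with S = 2 states and M = 1 attribute (made-up data).
obs = [("a", [0.1]), ("a", [-0.1]), ("a", [0.0]),
       ("b", [5.1]), ("b", [4.9]), ("b", [5.0])]
theta = {"a": [0.5, 0.5], "b": [0.5, 0.5]}
mu, sigma = [[1.0], [4.0]], [[1.0], [1.0]]
for _ in range(5):
    theta, mu, sigma = em_step(obs, theta, mu, sigma)
```

The floor `min_sigma` is a practical guard against a state collapsing onto a single value; it is a design choice of this sketch rather than part of the described method.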
S33: Acquire the network role distribution of the user.
A node's position in the network, and the influence it derives from that position, significantly affect information dissemination. Based on the network topology, the invention divides user nodes into three role types (R = 3), namely opinion leaders, information propagators and common users. Opinion leaders have a higher in-degree, and information propagators have a higher out-degree.
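As an illustration of the topology attributes behind these roles, the sketch below computes in-degree, out-degree and a local clustering coefficient from a list of follow relations. It assumes edges are (follower, followee) pairs, so that in-degree counts fans, and it uses the common directed form of the clustering coefficient (links among neighbors divided by k(k-1)); the patent's own formula is in an image and may differ.

```python
from collections import defaultdict


def topology_features(follow_edges):
    """Return {node: (in_degree, out_degree, local_clustering)}.

    follow_edges: iterable of (follower, followee) pairs (assumed direction).
    """
    followers = defaultdict(set)   # node -> users who follow it (in-edges)
    followees = defaultdict(set)   # node -> users it follows (out-edges)
    for a, b in follow_edges:
        if a == b:
            continue               # ignore self-loops
        followees[a].add(b)
        followers[b].add(a)

    nodes = set(followers) | set(followees)
    features = {}
    for v in nodes:
        neigh = (followers[v] | followees[v]) - {v}
        k = len(neigh)
        # Directed links that connect two neighbors of v.
        links = sum(1 for i in neigh for j in followees[i] if j in neigh)
        clustering = links / (k * (k - 1)) if k > 1 else 0.0
        features[v] = (len(followers[v]), len(followees[v]), clustering)
    return features


edges = [("a", "b"), ("c", "b"), ("a", "c"), ("b", "d")]
features = topology_features(edges)
```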
Similarly, because the role attributes contain continuous variables, the LDA model is improved with Gaussian distributions and all users are trained with the improved LDA model. The specific generation process is as follows:
For each user v:
1. Sample a user network role distribution
Figure BDA0001179868950000128
Is a parameter of Dirichlet distribution;
2. Sample a network role
Figure BDA0001179868950000131
3. For each role attribute of user v:
1) Sample a role attribute value
Figure BDA0001179868950000132
where
Figure BDA0001179868950000133
represents the network role distribution of user v.
In this probabilistic generative model, modeling the user role attributes amounts to computing the network role distribution of the user
Figure BDA0001179868950000134
and the Gaussian distribution N(μ', σ') that each role attribute value of the user obeys. η, μ' and σ' are solved with the EM algorithm; each EM iteration that estimates η, μ' and σ' is divided into two steps:
E-step: update
Figure BDA0001179868950000135
Figure BDA0001179868950000136
M-step: update μ'_{r,h} and σ'_{r,h}
Figure BDA0001179868950000137
Figure BDA0001179868950000138
where
Figure BDA0001179868950000139
represents the probability that user v plays network role r, R is the number of network roles,
Figure BDA00011798689500001310
H is the number of user role attributes, d_{v,h} represents the h-th attribute value of user v, and μ'_{r,h} and σ'_{r,h} are respectively the mean and standard deviation of the h-th attribute when the user plays network role r.
S34: Acquire the forwarding behavior distribution of the user.
Using the interest community distribution of the users
Figure BDA00011798689500001311
the distribution of the users' active state on each time slice
Figure BDA00011798689500001312
the network role distribution of the users
Figure BDA00011798689500001313
and the users' historical forwarding data Y, the whole prediction model is trained to obtain the forwarding behavior distribution of the users
Figure BDA00011798689500001314
The specific generation process is as follows:
For each user forwarding behavior y_i:
1. Sample a user forwarding behavior distribution ρ ~ Dir(γ), where γ is the parameter of the Dirichlet distribution;
2. Sample an interest community for the alternative user v
Figure BDA0001179868950000141
3. Sample an interest community for the text user u
Figure BDA0001179868950000142
4. Sample an active state for the alternative user v
Figure BDA0001179868950000143
5. Sample a network role for the text user u
Figure BDA0001179868950000144
6. Sample a user forwarding behavior
Figure BDA0001179868950000145
where
Figure BDA0001179868950000146
represents the forwarding behavior distribution of the users,
Figure BDA0001179868950000147
represents the probability that the alternative user forwards the microblog when the interest difference between the users is τ, the alternative user is in active state s, and the text user plays network role r, and
Figure BDA0001179868950000148
represents the probability of not forwarding. τ is an indicator function defined as follows:
Figure BDA0001179868950000149
where z_u and z_v respectively represent the interest communities of users u and v; τ = 1 indicates consistent interests, and τ = 0 indicates inconsistent interests.
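Steps 2 to 6 of this generation process, together with the indicator τ, can be sketched as follows. The sketch assumes that the forwarding probabilities are already available as a nested list `rho[tau][s][r]` (step 1, the Dirichlet draw, is omitted), that communities, states and roles are indexed from 0, and that all names are hypothetical.

```python
import random


def interest_indicator(z_u, z_v):
    """tau = 1 if the two users fall in the same interest community, else 0."""
    return 1 if z_u == z_v else 0


def sample_forwarding(phi_u, phi_v, theta_v_t, eta_u, rho, rng):
    """Sample one forwarding decision of alternative user v for text user u.

    phi_u, phi_v : interest community distributions of u and v
    theta_v_t    : active state distribution of v on the current time slice
    eta_u        : network role distribution of u
    rho          : rho[tau][s][r], assumed forwarding probability
    """
    z_v = rng.choices(range(len(phi_v)), weights=phi_v)[0]        # step 2
    z_u = rng.choices(range(len(phi_u)), weights=phi_u)[0]        # step 3
    s = rng.choices(range(len(theta_v_t)), weights=theta_v_t)[0]  # step 4
    r = rng.choices(range(len(eta_u)), weights=eta_u)[0]          # step 5
    tau = interest_indicator(z_u, z_v)
    y = 1 if rng.random() < rho[tau][s][r] else 0                 # step 6
    return y, tau, s, r
```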
In this probabilistic generative model, modeling the users' forwarding behavior amounts to computing the forwarding behavior distribution of the users
Figure BDA00011798689500001410
To solve for
Figure BDA00011798689500001411
Gibbs sampling is adopted; in each Gibbs sampling iteration,
Figure BDA00011798689500001412
is estimated by the following formula:
Figure BDA00011798689500001413
where n_{i,τ,s,r} represents the number of user behaviors with y_i = 1 (forward) or y_i = 0 (no forwarding) when the interest difference is τ, the active state of the alternative user is s, and the text user plays network role r; I is the total number of user behaviors, including non-forwarding behaviors; M is the number of user state attributes, and H is the number of user role attributes.
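The count-based nature of this estimate can be illustrated with a small sketch. It is not the patent's formula, which is in the image above. It shows only the standard Dirichlet-smoothed ratio of forward counts to total counts for each (τ, s, r) combination, given the latent assignments of the current sweep; the smoothing value `gamma` and the function name are assumptions.

```python
from collections import Counter


def estimate_rho(assignments, gamma=1.0):
    """Estimate forwarding probabilities from one sweep of assignments.

    assignments: iterable of (y, tau, s, r) tuples, where y is 1 (forward)
                 or 0 (no forwarding)
    Returns {(tau, s, r): smoothed probability of forwarding}.
    """
    counts = Counter(assignments)
    cells = {(tau, s, r) for _, tau, s, r in counts}
    rho = {}
    for cell in cells:
        n1 = counts[(1, *cell)]   # forwards observed in this cell
        n0 = counts[(0, *cell)]   # non-forwards observed in this cell
        rho[cell] = (n1 + gamma) / (n1 + n0 + 2.0 * gamma)
    return rho


rho = estimate_rho([(1, 1, 0, 0), (1, 1, 0, 0), (0, 1, 0, 0), (0, 0, 2, 1)])
```

In a full Gibbs sampler the latent communities, states and roles would be resampled between sweeps; only the parameter update is shown here.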
S4: Using the fitted φ, θ^{(t)}, η and
Figure BDA00011798689500001414
the fitted prediction model calculates the probability that the user forwards any microblog posted by the user's friends,
Figure BDA00011798689500001415
which yields the prediction result. The prediction result is then used to analyze which friends' microblogs the user will forward and the key factors that influence the user's forwarding.
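One plausible way to combine the fitted quantities into a forwarding probability is to marginalize over the latent interest match, active state and network role. The sketch below assumes that the two users' communities are drawn independently (so that P(τ = 1) is the dot product of their community distributions) and that ρ is stored as `rho[tau][s][r]`; the patent's actual expression is in the image above and may differ.

```python
def forwarding_probability(phi_u, phi_v, theta_v_t, eta_u, rho):
    """Marginal probability that alternative user v forwards a post by u.

    phi_u, phi_v : interest community distributions of u and v
    theta_v_t    : active state distribution of v on the posting time slice
    eta_u        : network role distribution of u
    rho          : rho[tau][s][r], fitted forwarding probability
    """
    # Probability that both users fall in the same community (assumed independent).
    p_same = sum(a * b for a, b in zip(phi_u, phi_v))
    p_tau = {1: p_same, 0: 1.0 - p_same}

    total = 0.0
    for tau, p_t in p_tau.items():
        for s, p_s in enumerate(theta_v_t):
            for r, p_r in enumerate(eta_u):
                total += p_t * p_s * p_r * rho[tau][s][r]
    return total
```

A simple decision rule would predict a forward when this probability exceeds a chosen threshold such as 0.5; the threshold is likewise an assumption of this sketch.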
The method proceeds in three stages. First, to handle the diversity of a single user's interests, activity and influence, it borrows the basic idea of the LDA topic model, which resolves "one word, multiple meanings", and models user behavior to obtain the topic distributions related to that behavior. Second, because the elements of the user state vector and the user feature vector are continuous values, LDA is improved with Gaussian distributions to discover the user's activity and influence. Finally, because user activity changes over time, time discretization and time slicing are used to build an improved-LDA dynamic microblog forwarding behavior prediction model that monitors user activity dynamically, so that the user's forwarding behavior can be predicted accurately and the key factors influencing forwarding can be analyzed.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (2)

1. A dynamic microblog forwarding behavior prediction system based on a friend circle, comprising a user behavior data source obtaining module, an attribute extraction module, a model building module and a prediction analysis module, wherein the user behavior data source obtaining module is used for obtaining user relationship and user behavior data in a social network and taking fans of text users as alternative users; the model building module is used for building a microblog forwarding behavior prediction model for the alternative user, in which the forwarding behavior is mainly determined by three parameters, namely the interest difference τ between the alternative user and the alternative user's friends, the activity s of the alternative user in the period when the microblog is published, and the network influence r of the alternative user's friends, and for fitting the model parameters; the prediction analysis module is used for predicting, according to the fitted parameters and the user's microblog publishing situation at any time t, whether the alternative user will forward the microblog;
the attribute extraction module extracts user interest vectors according to the interest difference among users, and comprises the following steps: obtaining an attention list of each user by using attention behavior attributes of the users, and defining an interest vector of the user v as
Figure FDA0002458199840000011
where e_{v,u} represents the u-th user in the attention list of user v, u = 1, 2, ..., |E_v|, and |E_v| represents the total number of users in the attention list of user v;
the attribute extraction module extracts a user state vector aiming at the activity of the alternative user, and comprises the following steps: acquiring the user microblog issuing activity and the microblog forwarding activity of each user within a period of time by using the interactive behavior attribute and the time attribute of the user, and defining the activity state vector of the user v as
Figure FDA0002458199840000012
Wherein,
Figure FDA0002458199840000013
representing the microblog release activity of the user v on the time slice t,
Figure FDA0002458199840000014
representing the forwarded microblog activity of the user v on the time slice t,
Figure FDA0002458199840000015
Figure FDA0002458199840000016
and
Figure FDA0002458199840000017
respectively representing the number of microblogs issued by the user v in the time slice t, the number of forwarded microblogs and the average number of microblogs issued by the user v per day;
the attribute extraction module extracts a user feature vector for the influence of the text user, comprising: obtaining the out-degree, in-degree and local clustering coefficient of each user node by using the network topology attributes, and defining the influence feature vector of the user v as
Figure FDA0002458199840000021
where d_{v,1} represents the number of fans of user v, d_{v,2} represents the number of friends of user v,
Figure FDA0002458199840000022
represents the local clustering coefficient of user v, Ng_v is the set of neighboring nodes of node v, and edg_{ij} is a connection between its neighboring nodes;
the microblog forwarding behavior prediction model: in terms of the interest difference among users, extracts user interest vectors from user behavior and user relationship information, and trains all users with an LDA model to obtain the interest topic distribution of the users; in terms of the activity of the alternative users, extracts user state vectors on each time slice from user behavior and time information, improves LDA with Gaussian distributions to handle the continuous elements of the user state vectors, and trains all users with the improved LDA model to obtain the active state distribution of the users on each time slice; in terms of the influence of the text user, extracts user feature vectors from network structure information, likewise improves LDA with Gaussian distributions, and trains all users with the improved LDA model to obtain the network role distribution of the users; and finally obtains the forwarding behavior distribution of the users according to whether the interests among users are consistent, the active state of the alternative users on each time slice, the network role of the text user, and the forwarding data of all users;
the acquiring of the interest topic distribution of the users by the microblog forwarding behavior prediction model further comprises: on the basis of the user relationship network, further using the interactive behaviors among users to weight the interest vector I(v) of each user, obtaining the weighted user interest vector
Figure FDA0002458199840000023
where the index n of w_{v,n} takes the values 1, 2, ..., N_v, and N_v is the total number of interactions of user v; all users are trained with the LDA model to obtain the interest topic distribution of the users;
the acquiring of the active state distribution of the users on each time slice further comprises: because the user's publishing activity x_{v,t,1} and forwarding activity x_{v,t,2} are continuous variables, the LDA model is improved with Gaussian distributions, so that the values of publishing activity and forwarding activity each obey a different Gaussian distribution:
Figure FDA0002458199840000024
where x_{v,t,m} represents the m-th attribute value of user v on time slice t, and μ_{s,m} and σ_{s,m} are respectively the mean and standard deviation of the m-th attribute when the user's active state is s;
using time slicing, each day is cut into 4 periods starting from 0:00, namely t = 1, 2, 3, 4; the user's active state is divided into three levels, namely very active, generally active and inactive; and all users are trained with the improved LDA model to obtain the active state distribution of the users on each time slice;
based on the network topology, user nodes are divided into three role types, namely opinion leaders, information propagators and common users; likewise, after the LDA model is improved with Gaussian distributions, all users are trained with the improved model to obtain the network role distribution of the users.
2. A dynamic microblog forwarding behavior prediction method based on a friend circle, using the system of claim 1,
characterized by comprising the following steps:
acquiring user relationship and user behavior data in a social network, and taking fans of text users as alternative users; acquiring three user vectors from three aspects of interest difference among users, activity of alternative users and influence of a text user as input of a prediction model;
constructing a microblog forwarding behavior prediction model, and fitting model parameters;
and inputting the parameters obtained after fitting and the microblog issuing condition of the user at any time t into a prediction model to predict whether the alternative user can forward the microblog.
CN201611151738.3A 2016-12-14 2016-12-14 Dynamic microblog forwarding behavior prediction system and method based on friend circle Active CN106682770B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611151738.3A CN106682770B (en) 2016-12-14 2016-12-14 Dynamic microblog forwarding behavior prediction system and method based on friend circle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611151738.3A CN106682770B (en) 2016-12-14 2016-12-14 Dynamic microblog forwarding behavior prediction system and method based on friend circle

Publications (2)

Publication Number Publication Date
CN106682770A CN106682770A (en) 2017-05-17
CN106682770B true CN106682770B (en) 2020-08-04

Family

ID=58867918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611151738.3A Active CN106682770B (en) 2016-12-14 2016-12-14 Dynamic microblog forwarding behavior prediction system and method based on friend circle

Country Status (1)

Country Link
CN (1) CN106682770B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102394798B (en) * 2011-11-16 2014-12-31 北京交通大学 Multi-feature based prediction method of propagation behavior of microblog information and system thereof
US9489638B2 (en) * 2013-08-02 2016-11-08 Telefonaktiebolaget Lm Ericsson (Publ) Method and apparatus for propagating user preference information in a communications network
US20160042277A1 (en) * 2014-08-05 2016-02-11 Hewlett-Packard Development Company, Lp Social action and social tie prediction
CN104572807B (en) * 2014-10-29 2018-02-06 中国科学院计算技术研究所 A kind of news authentication method and system based on micro-blog information source
CN104933475A (en) * 2015-05-27 2015-09-23 国家计算机网络与信息安全管理中心 Network forwarding behavior prediction method and apparatus
CN105809554B (en) * 2016-02-07 2020-03-17 重庆邮电大学 Prediction method for user participating in hot topics in social network

Also Published As

Publication number Publication date
CN106682770A (en) 2017-05-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant