CN106682770A - Friend circle-based dynamic microblog forwarding behavior prediction system and method - Google Patents

Friend circle-based dynamic microblog forwarding behavior prediction system and method

Info

Publication number
CN106682770A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611151738.3A
Other languages
Chinese (zh)
Other versions
CN106682770B (en)
Inventor
柳靓云
肖云鹏
杜江
刘宴兵
张克毅
李茜曦
李晓娟
宋晨光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201611151738.3A
Publication of CN106682770A
Application granted
Publication of CN106682770B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04 Real-time or near real-time messaging, e.g. instant messaging [IM]

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention relates to a friend circle-based dynamic microblog forwarding behavior prediction system and method and belongs to the field of social network information analysis. Starting from the user relationships and user behavior data in a social network, it considers how interest differences, users' historical behavior, network structure, and similar factors promote the dissemination of network information. Based on users' interests, activity levels, and influence, the basic ideas and methods of the LDA topic model are used to model and analyze user behavior and obtain a topic distribution over user behavior. LDA is improved with a Gaussian distribution so that users' activity levels and influence can be discovered. Because users' activity levels change over time, time discretization and time slicing are used to propose an improved-LDA dynamic microblog forwarding behavior prediction model. The model is fitted, and data are input into the prediction model so that users' activity levels can be monitored dynamically. As a result, users' forwarding behavior and the key factors that affect it can be predicted more accurately.

Description

A friend circle-based dynamic microblog forwarding behavior prediction system and method
Technical field
The present invention relates to the field of social network information analysis, and in particular to building a dynamic microblog forwarding behavior prediction model from an analysis of social network user behavior.
Background
With the spread of Web 2.0 concepts and the maturing of related technologies, social networking sites such as Twitter, Facebook, and Sina Weibo have had a tremendous influence on people's lives. People update their status or post broadcasts on these sites to show what they are doing, express their thoughts, or share information with friends. Social networking sites give users a convenient platform for communicating with each other and for expressing opinions and views. Modeling and predicting user behavior on social networking sites has significant social meaning and practical value in fields such as security and business, and has gradually attracted researchers' attention in recent years.
Sina Weibo is an information sharing and communication platform that provides entertainment, leisure, and daily-life services to the public; it began internal testing on August 14, 2009. As of the end of June 2014, China had 275 million microblog users, who form a complex follow network and send nearly 100 million microblogs per day on average. Information propagates along the follow relationships between users and forms a diffusion network. Forwarding is the most effective information dissemination mechanism in microblogs. Current research on forwarding prediction concentrates on how interest features, user influence, user attributes, and similar factors affect prediction results. The methods used include text-based analysis, analysis based on user influence, and analysis based on network structure.
Text-based analysis mainly uses probabilistic topic models to analyze text and predicts a user's forwarding behavior from the similarity between text topics and user interests. For example, Xuning Tang et al., in "Who will be Participating Next? Predicting the Participation of Dark Web Community", built a user interest and topic detection model (UTD). Given that some existing users have replied to a post, the UTD model predicts which users will become interested in a new post by capturing the topic content and its development trend.
Analysis based on user influence mainly studies a user's influence on other users in the social network and combines it with the behavioral factors that affect forwarding and commenting in order to predict the probability that a user forwards. For example, Weng J et al., in "TwitterRank: Finding Topic-sensitive Influential Twitterers", evaluate user influence to help users quickly find information that interests them, thereby addressing the "information overload caused by having too many friends" in microblog networks.
Analysis based on network structure mainly uses small-world theory, user in-degree and out-degree, and related theory to build factor graph models that predict forwarding behavior. For example, Jing Zhang et al., in "Who Influenced You? Predicting Re-tweet via Social Influence Locality", studied a forwarding prediction method based on the user's friend circle that combines a factor graph model with social influence analysis.
A user's information forwarding behavior results from several factors acting together, but the prior art above does not consider the complexity of user behavior and predicts forwarding from a single aspect only. Its predictions are therefore inaccurate, and it cannot assess the importance of each feature that affects user behavior. In addition, current research concentrates on the effect of static network features on information dissemination and ignores the important role of network dynamics, so that dynamic networks are treated as static. For example, user activity is dynamic: a user's activity level changes over time, and the speed and scope of information diffusion change with it. Therefore, on top of static network features, the effect of dynamic factors on information dissemination should be fully considered. Because social networks are flooded with massive amounts of information, mining user interests is one of the main ways to improve forwarding prediction; the LDA topic model has great advantages in large-scale text processing and feature dimension reduction and can help users quickly find information that interests them. Focusing on network dynamics, user behavior characterization, and the ranking of user feature importance, this document introduces and optimizes the LDA topic model to model and analyze user behavior, and uses time discretization and time slicing to strengthen the LDA model's ability to handle dynamic user features, monitor user activity dynamically, and improve the accuracy of forwarding prediction.
Summary of the invention
To address problems in the prior art, namely network dynamics, user behavior characterization, and the ranking of user feature importance in information dissemination, the present invention proposes a friend circle-based dynamic microblog forwarding behavior prediction system and method that effectively estimates whether a message will be forwarded and at what scale, and detects early those microblogs that may cause large-scale outbreaks. The technical scheme of the invention is as follows:
A friend circle-based dynamic microblog forwarding behavior prediction system includes a user behavior data source acquisition module for obtaining the user relationships and user behavior data in the social network, with the fans (followers) of the posting user taken as candidate users. The system further includes an attribute extraction module, a model construction module, and a prediction analysis module. The attribute extraction module extracts relevant attribute vectors from three aspects, namely interest differences between users, the activity level of the candidate user, and the influence of the posting user, as inputs to the prediction model. The microblog forwarding behavior prediction model construction module builds a forwarding behavior prediction model for candidate users, in which forwarding behavior is determined mainly by three parameters: the interest difference τ between the candidate user and its friend, the candidate user's activity level s in the period when the post is published, and the network influence r of its friend; the module also fits these model parameters. The prediction analysis module uses the fitted parameters and the posting activity at any time t to predict whether a candidate user will forward the microblog.
Further, for interest differences between users, the attribute extraction module extracts a user interest vector as follows: using the user's follow-behavior attributes, it obtains each user's follow list and defines the interest vector of user v as $I(v) = [e_{v,1}, e_{v,2}, \ldots, e_{v,|E_v|}]$, where $e_{v,u}$ is a user in v's follow list, $u = 1, 2, \ldots, |E_v|$, and $|E_v|$ is the total number of users in v's follow list.
Further, for the activity level of the candidate user, the attribute extraction module extracts a user state vector as follows: using the user's interaction attributes and time attributes, it obtains each user's publishing activity and forwarding activity over a period of time and defines the activity state vector of user v as $L(v, t) = [x_{v,t,1}, x_{v,t,2}]$, where $x_{v,t,1}$ is the publishing activity of user v in time slice t and $x_{v,t,2}$ is the forwarding activity of user v in time slice t. Both are computed from three counts: the number of microblogs user v publishes in time slice t, the number it forwards in time slice t, and its average daily number of published microblogs.
Further, for the influence of the posting user, the attribute extraction module extracts a user feature vector as follows: using network topology attributes, it obtains the out-degree, in-degree, and local clustering coefficient of each user node and defines the influence feature vector $F(v)$ of user v from three components: $d_{v,1}$, the number of fans of user v; $d_{v,2}$, the number of friends of user v; and the local clustering coefficient of user v, where $Ng_v$ is the set of neighbor nodes of node v and $edg_{ij}$ is a connection between two of those neighbors.
Further, the microblog forwarding behavior prediction model works from three aspects: interest differences between users, the activity level of the candidate user, and the influence of the posting user. For interest differences, the user's interest vector is extracted from user behavior and user relationship information, and an LDA model is trained on all users to obtain each user's interest topic distribution. For candidate-user activity, the user's state vector in each time slice is extracted from user behavior and time information; because the elements of the state vector are continuous values, LDA is improved with a Gaussian distribution, and the improved LDA model is trained on all users to obtain each user's active state distribution in each time slice. For posting-user influence, the user's feature vector is extracted from network structure information; as with the state vector, LDA is improved with a Gaussian distribution, and the improved model is trained on all users to obtain each user's network role distribution. Finally, the whole prediction model is trained on whether the users' interests agree, the candidate user's active state in each time slice, the posting user's network role, and the users' historical forwarding data, yielding a multinomial distribution over the user's forwarding behavior.
Further, obtaining the user's interest topic distribution also includes: on top of the user relationship network, using the interactions between users to weight the user's interest vector $I(v)$, which gives the weighted user interest vector $I'(v) = [w_{v,1}, w_{v,2}, \ldots, w_{v,N_v}]$, where $w_{v,n}$ is the user with whom v has its n-th interaction, $n = 1, 2, \ldots, N_v$, and $N_v$ is v's total number of interactions. An LDA model is then trained on all users to obtain each user's interest topic distribution.
Further, obtaining the user's active state distribution in each time slice also includes: because the user's publishing activity $x_{v,t,1}$ and forwarding activity $x_{v,t,2}$ are continuous variables, the LDA model is improved with Gaussian distributions so that publishing activity and forwarding activity each follow their own Gaussian distribution: $x_{v,t,m} \sim N(\mu_{s,m}, \sigma_{s,m})$, where $x_{v,t,m}$ is the m-th attribute value of user v in time slice t, and $\mu_{s,m}$ and $\sigma_{s,m}$ are the mean and standard deviation of the m-th attribute when the user's active state is s.
Further, by time slicing, each day is cut into 4 periods starting from midnight, i.e. t = 1, 2, 3, 4, and the user's active state is divided into three grades: very active, moderately active, and inactive. Training the improved LDA model on all users yields each user's active state distribution in each time slice.
Further, based on network topology, user nodes are divided into three role types: opinion leaders, information disseminators, and ordinary users. Likewise, after LDA is improved with a Gaussian distribution, training this model on all users yields each user's network role distribution.
A friend circle-based dynamic microblog forwarding behavior prediction method based on the above system comprises the following steps:
Obtain the user relationships and user behavior data in the social network, taking the fans of the posting user as candidate users; obtain three user vectors, from the aspects of interest differences between users, the activity level of the candidate user, and the influence of the posting user, as inputs to the prediction model;
Build the microblog forwarding behavior prediction model and fit the model parameters;
Input the fitted parameters and the posting activity at any time t into the prediction model to predict whether a candidate user will forward the microblog.
Advantages and beneficial effects of the invention:
The invention proposes a friend circle-based dynamic microblog forwarding behavior prediction method. First, to handle the diversity of an individual user's interests, activity levels, and influence, it applies the basic ideas and methods by which the LDA topic model handles "one word with several meanings, several words with one meaning" to model and analyze user behavior and obtain a topic distribution over user behavior. Second, because the elements of the user state vector and the user feature vector are continuous values, LDA is improved with a Gaussian distribution to discover users' activity levels and influence. Finally, because users' activity levels change over time, time discretization and time slicing are used to propose an improved-LDA dynamic microblog forwarding behavior prediction model that monitors user activity dynamically and improves the accuracy of the prediction model.
The invention addresses network dynamics, user behavior characterization, and the ranking of user feature importance in information dissemination by proposing a friend circle-based dynamic microblog forwarding behavior prediction method that predicts user forwarding behavior accurately. From the predictions, it can effectively estimate whether a message will be forwarded and at what scale, and detect early those microblogs that may cause large-scale outbreaks, which is significant for detecting bursts on microblogs and for assessing microblog influence.
Description of the drawings
Fig. 1 is an overall flowchart of the friend circle-based dynamic microblog forwarding behavior prediction method in a preferred embodiment of the invention;
Fig. 2 is a block diagram of the prediction model of the invention;
Fig. 3 is a flowchart of the prediction model of the invention.
Detailed description
The technical scheme in the embodiments of the invention is described clearly and in detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the invention.
The technical scheme by which the invention solves the above technical problem is as follows:
Because information dissemination in social networks is driven mainly by interest differences, users' historical behavior, and network structure, the invention starts from three aspects, namely user interest, activity level, and influence, and uses the basic ideas and methods of the LDA topic model to model and analyze user behavior and obtain a topic distribution over user behavior. Second, because some user attributes are continuous variables, LDA is improved with a Gaussian distribution to discover users' activity levels and influence. Finally, because users' activity levels change over time, time discretization and time slicing are used to propose an improved-LDA dynamic microblog forwarding behavior prediction model that can monitor user activity dynamically, predict users' forwarding behavior accurately, and identify the key factors that affect forwarding.
Formally: given a social network G = (V, E, Y), V is the set of all users in the network and |V| = N is the number of users; E is the relationships between all users, an N × N matrix; and Y is the users' past behaviors, with |Y| = I the total number of user behavior records. A generative probabilistic model is designed that uses the user relationships and user behavior information in the social network and adds the effect of a time factor. Each user is analyzed, and four probabilistic generative processes yield each user's interest distribution, activity grade distribution, network role distribution, and forwarding behavior distribution. From these four distributions, the model predicts whether a user will forward microblogs from the friends it follows within a given period.
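As a rough illustration of the triple G = (V, E, Y), the following Python sketch holds the three parts in one container. The class and field names are hypothetical, and storing E as per-user follow sets instead of an N × N matrix is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Behavior:
    """One record in Y (hypothetical layout)."""
    user: str                          # user who acted
    kind: str                          # "post" or "forward"
    timestamp: float                   # POSIX seconds
    source_user: Optional[str] = None  # for a forward: whose microblog was forwarded


@dataclass
class SocialNetwork:
    users: List[str]                                              # V
    follows: Dict[str, Set[str]] = field(default_factory=dict)    # E: follower -> followed users
    behaviors: List[Behavior] = field(default_factory=list)       # Y

    def candidate_users(self, poster: str) -> List[str]:
        """Fans of the posting user, i.e. the candidate users."""
        return sorted(u for u, followed in self.follows.items() if poster in followed)
```

With this layout, the candidate users for a post are simply the users whose follow set contains the poster.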
Fig. 1 shows the overall flowchart of the invention, which has four modules: data acquisition, attribute extraction, model construction, and prediction analysis.
The implementation of the invention is described in detail below.
S1: Obtain the data source. The data obtained comprise the user follow-relationship network and the behavior information of all users in it; user behavior covers the microblogs a user has published and forwarded and the times at which they were published and forwarded. The following method can be used (conventional prior-art methods can also be used):
S11: Obtain the raw data. Obtain the user follow-relationship network and the past behavior data of all users in that network. Raw data can be obtained through the social network's public API or by directly downloading an existing data source, and can be supplemented with methods such as web crawlers.
S12: Simple data cleaning. Simple cleaning makes most of the data suitable for analysis, for example deleting duplicate data and removing invalid nodes.
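A minimal sketch of this cleaning step, assuming each record is a hashable tuple whose first element is the acting user and that an "invalid node" means a user absent from the follow network. Both assumptions are illustrative and are not taken from the source.

```python
def clean_records(records, known_users):
    """Drop exact duplicate records and records from users not in the network.

    records: iterable of tuples such as (user, microblog_id, timestamp) (assumed layout).
    known_users: set of user ids present in the follow network.
    """
    seen = set()
    cleaned = []
    for rec in records:
        if rec[0] not in known_users:   # invalid node
            continue
        if rec in seen:                 # duplicate record
            continue
        seen.add(rec)
        cleaned.append(rec)
    return cleaned
```

The function keeps the first occurrence of each record and preserves input order.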
S13: Slice the data by time and determine each user's attributes in each time slice. User attributes here are the user's publishing activity and forwarding activity. Because a user's forwarding behavior is closely tied to its daily schedule, each day is sliced into periods of a predetermined length (for example 6 hours) according to users' daily routines. In a given period t, the user's active state in that period is determined from the user attributes, in order to predict whether the user will forward its friends' microblogs.
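The slicing in S13 can be sketched in Python as below, using the 6-hour example length so that each day has slices t = 1 to 4 counted from midnight. The function names are hypothetical.

```python
from datetime import datetime

SLICE_HOURS = 6  # example slice length from the text; gives 4 slices per day


def time_slice(ts, slice_hours=SLICE_HOURS):
    """Return the 1-based time slice t of a timestamp, counting from midnight."""
    return ts.hour // slice_hours + 1


def count_by_slice(timestamps, slice_hours=SLICE_HOURS):
    """Count how many events fall into each time slice of the day."""
    counts = {t: 0 for t in range(1, 24 // slice_hours + 1)}
    for ts in timestamps:
        counts[time_slice(ts, slice_hours)] += 1
    return counts
```

Applying `count_by_slice` separately to a user's publish times and forward times gives the per-slice counts from which the activity attributes are derived.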
S2: Extract the relevant attributes. Because forwarding behavior in social networks depends mainly on three aspects, namely interest differences, users' historical behavior, and network structure, the invention extracts relevant attributes from the aspects of user interest, activity level, and influence, such as follow-behavior attributes, interaction attributes, time attributes, and network structure attributes. These attributes can be adjusted as appropriate to the characteristics of the data.
After the attributes for the three aspects above are extracted, the corresponding user vectors are obtained as follows.
S21: Extract the user interest vector. Because users follow other users who interest them, the user's follow-behavior attributes are used to obtain each user's follow list, and the interest vector of user v is defined as:
$I(v) = [e_{v,1}, e_{v,2}, \ldots, e_{v,|E_v|}]$
where $e_{v,u}$ ($u = 1, 2, \ldots, |E_v|$) is a user in v's follow list and $|E_v|$ is the total number of users in v's follow list. For example, if the users in user a's follow list are b, c, d, e, ..., then the interest vector of user a is $I(a) = [b, c, d, e, \ldots]$.
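As a small sketch of how the interest vector could be assembled in Python, the code below builds follow lists from (follower, followed) pairs and reads off I(v). The input format and function names are assumptions.

```python
from collections import defaultdict


def build_follow_lists(edges):
    """edges: iterable of (follower, followed) pairs; duplicate pairs are ignored."""
    follows = defaultdict(list)
    for follower, followed in edges:
        if followed not in follows[follower]:
            follows[follower].append(followed)
    return dict(follows)


def interest_vector(follows, v):
    """I(v): the users in v's follow list, in the order they were first seen."""
    return list(follows.get(v, []))
```

A user with an empty follow list gets an empty interest vector.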
S22: Extract the user state vector. According to users' daily routines, each day is sliced into periods of a predetermined length (for example 6 hours). Using the user's interaction attributes and time attributes, each user's publishing activity and forwarding activity in each time slice are obtained, and the state vector of user v is defined as:
$L(v, t) = [x_{v,t,1}, x_{v,t,2}]$
where $x_{v,t,1}$ is the publishing activity of user v in time slice t and $x_{v,t,2}$ is the forwarding activity of user v in time slice t. Both are computed from three counts: the number of microblogs user v publishes in time slice t, the number it forwards in time slice t, and its average daily number of published microblogs. For example, user a published 3 microblogs in the first time slice, 2 of which were forwards, and publishes 5 microblogs per day on average; these three counts determine the state vector of user a.
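The expressions that turn the three counts into activity values did not survive in this text. The Python sketch below assumes one plausible normalization, namely each count in the slice divided by the user's average daily number of published microblogs. That choice is an assumption for illustration and is not confirmed by the source.

```python
def state_vector(n_posted, n_forwarded, avg_daily_posts):
    """Return [publishing activity, forwarding activity] for one time slice.

    Assumed normalization: count in the slice / average daily posts.
    """
    if avg_daily_posts <= 0:
        return [0.0, 0.0]  # no posting history: treat the user as inactive
    return [n_posted / avg_daily_posts, n_forwarded / avg_daily_posts]
```

Under this assumption, dividing by the user's own daily average makes activity values comparable between heavy and light posters.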
S23: Extract the user feature vector. Because a user node's position in the network has a significant effect on information dissemination, network topology attributes are used to obtain the out-degree, in-degree, and local clustering coefficient of each user node, and the feature vector $F(v)$ of user v is defined from three components: $d_{v,1}$, the number of fans of user v; $d_{v,2}$, the number of friends of user v; and the local clustering coefficient of user v, where $Ng_v$ is the set of neighbor nodes of node v and $edg_{ij}$ is a connection between two of those neighbors. For example, user a has 30 fans, 20 friends, and 40 neighbor nodes with 200 connecting edges among them; these values determine the feature vector of user a.
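The exact clustering-coefficient expression is not legible here. The Python sketch below uses the standard local clustering coefficient for an undirected graph, 2·(links among neighbors) / (k·(k−1)) with k the number of neighbors. Treating the network as undirected is an assumption, and the function names are hypothetical.

```python
def local_clustering(adj, v):
    """Standard undirected local clustering coefficient of node v.

    adj: dict mapping node -> set of neighbor nodes (symmetric adjacency assumed).
    """
    neighbors = adj.get(v, set())
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Each link between two neighbors is seen twice (once from each endpoint).
    twice_links = sum(1 for i in neighbors for j in adj.get(i, ()) if j in neighbors and j != i)
    return twice_links / (k * (k - 1))


def feature_vector(n_fans, n_friends, clustering):
    """F(v) as [fans, friends, local clustering coefficient]."""
    return [n_fans, n_friends, clustering]
```

A directed variant would drop the factor of two, so the coefficient for the example user depends on which convention the original formula used.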
S3: Build the prediction model. Fig. 2 shows the block diagram of the prediction model of the invention. Whether a candidate user forwards a friend's microblog is determined mainly by the interest difference τ between the candidate user and the friend, the candidate user's activity level s in the period when the post is published, and the network influence r of the friend.
To predict whether a candidate user will forward a given microblog from a friend, the prediction model proceeds as follows. For interest differences, the user's interest vector $I(v)$ is extracted from user behavior and user relationship information, and an LDA model trained on all users gives the interest community distribution of each of the N users. For user activity, the user's state vector $L(v, t)$ in each time slice is extracted from user behavior and time information; because its elements are continuous values, LDA is first improved with a Gaussian distribution, and the improved LDA model trained on all users gives, for each user v, the probability distribution over active states in time slice t. For network influence, the user's feature vector $F(v)$ is extracted from network structure information; as with the state vector, LDA is improved with a Gaussian distribution, and the improved model trained on all users gives each user's network role probability distribution. Finally, the whole prediction model is trained on the users' interest community distributions, their active state distributions in each time slice, their network role distributions, and their historical forwarding data Y, which gives the users' forwarding behavior distribution: when the interest difference between the users is τ, the candidate user is in active state s, and the posting user plays network role r, it gives the probability that the candidate user forwards the microblog and the probability that it does not. Solving the model and predicting the candidate user's forwarding behavior in each time slice are described in detail in the following sections.
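To make the role of the triple (τ, s, r) concrete, the Python sketch below estimates a forwarding probability for each combination from historical records by smoothed counting. It treats τ, s, and r as already-assigned hard labels, which is a simplification of the jointly trained generative model described above; the function names and the Laplace smoothing are assumptions.

```python
from collections import defaultdict


def fit_forward_table(history, smoothing=1.0):
    """history: iterable of (tau, s, r, forwarded) with forwarded a bool.

    Returns a dict mapping (tau, s, r) -> estimated probability of forwarding.
    """
    counts = defaultdict(lambda: [0, 0])  # [times not forwarded, times forwarded]
    for tau, s, r, forwarded in history:
        counts[(tau, s, r)][int(forwarded)] += 1
    return {
        key: (c[1] + smoothing) / (c[0] + c[1] + 2 * smoothing)
        for key, c in counts.items()
    }


def predict_forward(table, tau, s, r, default=0.5):
    """Forwarding probability for one (tau, s, r); unseen combinations get the default."""
    return table.get((tau, s, r), default)
```

Comparing the fitted probabilities across values of one factor while holding the others fixed gives a rough view of which factor matters most.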
Fig. 3 shows the flowchart of the prediction model of the invention.
S31: Obtain the interest community distribution of the user.
Because a friend relationship only indicates that two users may interact, it cannot reflect the actual strength of information exchange between them and tends to be static. To find active interest communities, the interactions between users are used, on top of the user relationship network, to weight the user's interest vector $I(v)$; interaction here means forwarding. The weighted user interest vector is:
$I'(v) = [w_{v,1}, w_{v,2}, \ldots, w_{v,N_v}]$
where $w_{v,n}$ ($n = 1, 2, \ldots, N_v$) is the user with whom v has its n-th interaction and $N_v$ is v's total number of interactions. For example, if user a interacts 2 times with user b, 4 times with user c, and so on, then the weighted interest vector of user a is $I'(a) = [b, b, c, c, c, c, \ldots]$.
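The weighting amounts to repeating each followed user once per interaction. A minimal Python sketch, assuming the interaction counts are already aggregated per followed user (the function name and input format are hypothetical):

```python
def weighted_interest_vector(interactions):
    """interactions: mapping followed user -> number of forwards by v.

    Returns I'(v), with each user repeated once per interaction, in mapping order.
    """
    vec = []
    for user, n in interactions.items():
        vec.extend([user] * n)
    return vec
```

Users that v follows but never forwards contribute nothing to the weighted vector, which is what shifts the focus from static follow links to active interest.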
Given C as the number of interest communities, an LDA model is trained on all users. The generative process is as follows:
For each user v:
1. Sample an edge distribution ξ ~ Dir(λ), where λ is the parameter of the Dirichlet distribution;
2. Sample a user interest community distribution from Dir(α), where α is the parameter of the Dirichlet distribution;
3. For each edge $e_{v,i}$ of the user:
1) sample an interest community c from the interest community distribution of user v;
2) sample an edge from the edge distribution of interest community c.
In this generative model, modeling user behavior amounts to computing the users' interest community distributions Φ and the interest communities' edge distributions ξ. Φ and ξ are solved by Gibbs sampling, and each Gibbs iteration re-estimates them from the following quantities: the probability of user v in interest community c; C, the total number of interest communities; $n_{v,c}$, the number of interactions between user v and followed users in interest community c; $|N_v|$, the total number of interactions between user v and its friends; the probability that user e occurs in interest community c; $|E|$, the total number of edges in the network; $n_{c,e}$, the number of interactions of user e in interest community c; and $n_c$, the total number of interactions in interest community c.
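The update formulas themselves are not legible in this text. The Python sketch below is a standard collapsed Gibbs sampler for LDA applied to weighted interest vectors, with the usual Dirichlet-smoothed count estimates for Φ and ξ. It assumes interaction partners are encoded as integer ids 0..n_partners−1 and uses the number of distinct partners as the vocabulary size in the smoothing term; both choices, and the function name, are assumptions.

```python
import random


def gibbs_lda(docs, n_partners, C, alpha=0.1, lam=0.01, n_iter=200, seed=0):
    """docs[v]: weighted interest vector of user v as a list of partner ids.

    Returns (phi, xi): phi[v][c] is user v's community distribution,
    xi[c][e] is the partner distribution of community c.
    """
    rng = random.Random(seed)
    n_vc = [[0] * C for _ in docs]              # interactions of v assigned to c
    n_ce = [[0] * n_partners for _ in range(C)]  # times partner e is assigned to c
    n_c = [0] * C                                # total assignments to c
    z = []                                       # community assignment per interaction
    for v, doc in enumerate(docs):
        zv = [rng.randrange(C) for _ in doc]
        z.append(zv)
        for e, c in zip(doc, zv):
            n_vc[v][c] += 1
            n_ce[c][e] += 1
            n_c[c] += 1
    for _ in range(n_iter):
        for v, doc in enumerate(docs):
            for i, e in enumerate(doc):
                c = z[v][i]                      # remove the current assignment
                n_vc[v][c] -= 1
                n_ce[c][e] -= 1
                n_c[c] -= 1
                weights = [
                    (n_vc[v][k] + alpha) * (n_ce[k][e] + lam) / (n_c[k] + n_partners * lam)
                    for k in range(C)
                ]
                c = rng.choices(range(C), weights=weights)[0]
                z[v][i] = c                      # record the resampled assignment
                n_vc[v][c] += 1
                n_ce[c][e] += 1
                n_c[c] += 1
    phi = [[(n_vc[v][c] + alpha) / (len(doc) + C * alpha) for c in range(C)]
           for v, doc in enumerate(docs)]
    xi = [[(n_ce[c][e] + lam) / (n_c[c] + n_partners * lam) for e in range(n_partners)]
          for c in range(C)]
    return phi, xi
```

Each row of `phi` and `xi` sums to one by construction, and a user with no interactions falls back to a uniform community distribution.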
S32: Obtain the active state distribution of users in each time slice.
The forwarding behavior of a user is closely tied to the user's daily routine. Each user has relatively fixed online hours; during those hours the user is more active and more likely to post and forward, while at other times the user rarely takes part in the propagation of a topic. Therefore, using time slicing, each day is cut into 4 periods (t = 1, 2, 3, 4) starting from midnight, which first discretizes the vector data in time. Second, because the publishing activity $x_{v,t,1}$ and the forwarding activity $x_{v,t,2}$ of a user are continuous variables, the LDA model is improved with Gaussian distributions, so that the values of publishing activity and forwarding activity each follow their own Gaussian distribution:
$$x_{v,t,m} \sim N(\mu_{s,m}, \sigma_{s,m})$$
where $x_{v,t,m}$ denotes the m-th attribute value of user v in time slice t, and $\mu_{s,m}$ and $\sigma_{s,m}$ are the mean and standard deviation of the m-th attribute when the active state of the user is s.
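The time slicing step can be illustrated as follows. The text only states that a day is cut into 4 periods starting at midnight; equal 6-hour periods and the function name `time_slice` are assumptions made for this sketch.

```python
from datetime import datetime


def time_slice(timestamp, slices_per_day=4):
    """Map a timestamp to a 1-based time slice index t.

    Assumes equal-length slices starting at midnight (6 hours each for 4 slices).
    """
    hours_per_slice = 24 // slices_per_day
    return timestamp.hour // hours_per_slice + 1


t_night = time_slice(datetime(2016, 12, 14, 0, 30))
t_morning = time_slice(datetime(2016, 12, 14, 6, 0))
t_evening = time_slice(datetime(2016, 12, 14, 23, 59))
```

Counting a user's posts and forwards per slice index then gives the raw quantities from which the activity attributes $x_{v,t,1}$ and $x_{v,t,2}$ are formed.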
The invention sets the active state of a user to three levels, S = 3: very active, moderately active, and inactive. All users are trained with the improved LDA model. The generative process is as follows:
For each user v:
1. Sample the active state distribution of the user in time slice t, $\Theta_v^{(t)} \sim \mathrm{Dir}(\beta)$, where $\beta$ is the parameter of the Dirichlet distribution;
2. Sample an activity level $s \sim \mathrm{Mult}(\Theta_v^{(t)})$;
3. For each attribute of user v:
1) Sample an attribute value $x_{v,t,m} \sim N(\mu_{s,m}, \sigma_{s,m})$.
Here $\Theta_v^{(t)}$ denotes the active state distribution of user v in time slice t.
In this generative probabilistic model, modeling the user state attributes amounts to computing the active state distribution of the users in each time slice, $\Theta^{(t)}$, and the Gaussian distributions $N(\mu, \sigma)$ that each attribute value follows. $\Theta^{(t)}$, $\mu$ and $\sigma$ are solved with the EM algorithm, and each EM iteration that estimates them consists of two steps:
E-step: update $\Theta^{(t)}_{v,s}$;
M-step: update $\mu_{s,m}$ and $\sigma_{s,m}$.
Here $\Theta^{(t)}_{v,s}$ denotes the probability that user v is in active state s in time slice t, S is the number of state levels, M is the number of user state attributes, $x_{v,t,m}$ denotes the m-th attribute value of user v in time slice t, and $\mu_{s,m}$ and $\sigma_{s,m}$ are the mean and standard deviation of the m-th attribute when the active state of the user is s.
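The two steps can be sketched with the textbook EM updates for a mixture of independent Gaussians. This is a simplification and an assumption: it uses one shared set of mixing weights in place of the per-user, per-slice Dirichlet prior, and the update rules are the standard ones rather than the patent's own formulas. The names `gauss_pdf` and `em_step` and the toy data are hypothetical.

```python
import math


def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x, with sigma the standard deviation."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def em_step(X, weights, mu, sigma, min_sigma=1e-3):
    """One EM iteration; X holds one attribute vector per (user, time slice)."""
    S, M = len(mu), len(mu[0])
    # E-step: posterior probability of each state s for every observation.
    resp = []
    for x in X:
        scores = [weights[s] * math.prod(gauss_pdf(x[m], mu[s][m], sigma[s][m])
                                         for m in range(M)) for s in range(S)]
        total = sum(scores) or 1e-300
        resp.append([score / total for score in scores])
    # M-step: re-estimate weights, means and standard deviations per state.
    new_w, new_mu, new_sigma = [], [], []
    for s in range(S):
        r_s = sum(r[s] for r in resp)
        denom = r_s or 1e-300
        means = [sum(r[s] * x[m] for r, x in zip(resp, X)) / denom for m in range(M)]
        varis = [sum(r[s] * (x[m] - means[m]) ** 2 for r, x in zip(resp, X)) / denom
                 for m in range(M)]
        new_w.append(r_s / len(X))
        new_mu.append(means)
        new_sigma.append([max(math.sqrt(v), min_sigma) for v in varis])
    return resp, new_w, new_mu, new_sigma


# Toy data: (publishing activity, forwarding activity) for six user-slices.
X = [[0.1, 0.0], [0.2, 0.1], [1.0, 0.9], [1.1, 1.0], [3.0, 2.8], [3.2, 3.1]]
weights = [1 / 3] * 3
mu = [[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]]
sigma = [[0.5, 0.5] for _ in range(3)]
for _ in range(20):
    resp, weights, mu, sigma = em_step(X, weights, mu, sigma)
```

In this sketch each row of `resp` plays the role of $\Theta^{(t)}_{v,s}$ for one user and time slice, and the floor `min_sigma` guards against a state collapsing onto a single point.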
S33: Obtain the network role distribution of users.
The position of a node in the network and the influence it produces have an important effect on information propagation. Based on the network topology, the invention divides user nodes into three role types, R = 3: opinion leaders, message senders, and ordinary users. Opinion leaders have a higher in-degree, and message senders have a higher out-degree.
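The degree attributes that feed the role model can be computed directly from the follow graph. A minimal sketch, assuming a hypothetical edge list of `(follower, followee)` pairs, so that a user's in-degree is the number of followers and the out-degree is the number of users followed:

```python
from collections import defaultdict


def degree_features(follow_edges):
    """Return {user: (in_degree, out_degree)} from (follower, followee) pairs."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for follower, followee in follow_edges:
        out_deg[follower] += 1   # follower follows one more user
        in_deg[followee] += 1    # followee gains one more follower
    users = set(in_deg) | set(out_deg)
    return {u: (in_deg[u], out_deg[u]) for u in users}


follow_edges = [("b", "a"), ("c", "a"), ("d", "a"), ("a", "b")]
features = degree_features(follow_edges)
```

These raw degrees are inputs only; the assignment of roles is left to the Gaussian-improved LDA model rather than to fixed thresholds.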
Likewise, because the role attributes include continuous variables, the LDA model is improved with Gaussian distributions, and all users are trained with the improved LDA model. The generative process is as follows:
For each user v:
1. Sample the network role distribution of the user, $\eta_v \sim \mathrm{Dir}(\varepsilon)$, where $\varepsilon$ is the parameter of the Dirichlet distribution;
2. Sample a network role $r \sim \mathrm{Mult}(\eta_v)$;
3. For each role attribute of user v:
1) Sample a role attribute value $d_{v,h} \sim N(\mu'_{r,h}, \sigma'_{r,h})$.
Here $\eta_v$ denotes the network role distribution of user v.
In this generative probabilistic model, modeling the user role attributes amounts to computing the network role distribution of the users, $\eta$, and the Gaussian distributions $N(\mu', \sigma')$ that each role attribute value follows. $\eta$, $\mu'$ and $\sigma'$ are solved with the EM algorithm, and each EM iteration that estimates them consists of two steps:
E-step: update $\eta_{v,r}$;
M-step: update $\mu'_{r,h}$ and $\sigma'_{r,h}$.
Here $\eta_{v,r}$ denotes the probability that user v plays network role r, R is the number of network roles, H is the number of user role attributes, $d_{v,h}$ denotes the h-th attribute value of user v, and $\mu'_{r,h}$ and $\sigma'_{r,h}$ are the mean and standard deviation of the h-th attribute when the user plays network role r.
S34: Obtain the forwarding behavior distribution of users.
The whole prediction model is trained on the interest community distribution of the users, $\Phi$, the active state distribution of the users in each time slice, $\Theta^{(t)}$, the network role distribution of the users, $\eta$, and the historical forwarding data Y of the users, which yields the forwarding behavior distribution of the users, $\rho$. The generative process is as follows:
For each user forwarding behavior $y_i$:
1. Sample the user forwarding behavior distribution $\rho \sim \mathrm{Dir}(\gamma)$, where $\gamma$ is the parameter of the Dirichlet distribution;
2. Sample an interest community for the candidate user v, $z_v \sim \mathrm{Mult}(\Phi_v)$;
3. Sample an interest community for the posting user u, $z_u \sim \mathrm{Mult}(\Phi_u)$;
4. Sample an active state for the candidate user v, $s \sim \mathrm{Mult}(\Theta_v^{(t)})$;
5. Sample a network role for the posting user u, $r \sim \mathrm{Mult}(\eta_u)$;
6. Sample the forwarding behavior of the user, $y_i \sim \mathrm{Bernoulli}(\rho_{\tau,s,r})$.
Here $\rho$ denotes the forwarding behavior distribution of the users; $\rho_{\tau,s,r}$ denotes the probability that the candidate user forwards the microblog when the interest difference between the users is $\tau$, the candidate user is in active state s, and the posting user plays network role r; and $1 - \rho_{\tau,s,r}$ denotes the probability of not forwarding. $\tau$ is an indicator function, defined as follows:
$$\tau = \begin{cases} 1, & z_u = z_v \\ 0, & z_u \neq z_v \end{cases}$$
where $z_u$ and $z_v$ denote the interest communities of users u and v respectively. $\tau = 1$ means the interests are consistent, and $\tau = 0$ means the interests are inconsistent.
In this generative probabilistic model, modeling the forwarding behavior of users amounts to computing the forwarding behavior distribution of the users, $\rho$. $\rho$ is solved by Gibbs sampling, and each iteration of Gibbs sampling re-estimates $\rho$ from the current counts.
Here $n_{i,\tau,s,r}$ denotes the number of times the user behavior is $y_i = 1$ (forward) or $y_i = 0$ (not forward) when the interest difference is $\tau$, the active state of the candidate user is s, and the posting user plays network role r; I is the total number of user behaviors, including non-forwarding behaviors; M is the number of user state attributes, and H is the number of user role attributes.
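One way to picture the estimate of $\rho$ is as a smoothed forwarding rate per $(\tau, s, r)$ cell. The sketch below assumes the form $\rho_{\tau,s,r} = (n_{1,\tau,s,r} + \gamma) / (n_{1,\tau,s,r} + n_{0,\tau,s,r} + 2\gamma)$, which is a common choice for a two-outcome distribution with a symmetric Dirichlet prior; it is an assumption, not the patent's formula. The sample layout `(y, tau, s, r)` and the name `estimate_rho` are hypothetical.

```python
from collections import Counter


def estimate_rho(samples, gamma=1.0):
    """Smoothed forwarding probability for each observed (tau, s, r) cell.

    samples is a list of (y, tau, s, r) tuples, with y = 1 for a forward
    and y = 0 for a non-forward.
    """
    counts = Counter(samples)
    cells = {(tau, s, r) for _, tau, s, r in samples}
    rho = {}
    for cell in cells:
        forwards = counts[(1, *cell)]
        non_forwards = counts[(0, *cell)]
        rho[cell] = (forwards + gamma) / (forwards + non_forwards + 2 * gamma)
    return rho


samples = [(1, 1, 0, 0)] * 3 + [(0, 1, 0, 0)] + [(0, 0, 2, 2)] * 4
rho_hat = estimate_rho(samples, gamma=1.0)
```

With this toy data, the cell for consistent interests, a very active candidate user and an opinion-leader poster gets a high forwarding rate, while the cell for inconsistent interests and an inactive candidate user gets a low one.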
S4: Using the fitted $\Phi$, $\Theta^{(t)}$, $\eta$ and $\rho$ together with any microblog posted by a friend of the user, the forwarding probability is computed according to the fitted prediction model, which gives the prediction result. The predicted results can then be used to analyze which friends' microblogs the user will forward and which key factors influence the user's forwarding of microblogs.
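A prediction of this kind can be sketched by marginalizing over the latent interest difference, active state and network role. The combination rule below is an assumption made for illustration: the probability of consistent interests is taken as $\sum_c \Phi_{u,c}\Phi_{v,c}$, and the three latent factors are treated as independent. The function name `forward_probability` and the example values are hypothetical.

```python
def forward_probability(phi_u, phi_v, theta_v_t, eta_u, rho):
    """Probability that candidate user v forwards a microblog of posting user u.

    phi_u, phi_v: interest community distributions of u and v.
    theta_v_t:    active state distribution of v in the current time slice.
    eta_u:        network role distribution of u.
    rho:          dict mapping (tau, s, r) to a forwarding probability.
    """
    p_same = sum(a * b for a, b in zip(phi_u, phi_v))  # P(tau = 1)
    p_tau = {1: p_same, 0: 1.0 - p_same}
    total = 0.0
    for tau, p_t in p_tau.items():
        for s, p_s in enumerate(theta_v_t):
            for r, p_r in enumerate(eta_u):
                total += p_t * p_s * p_r * rho[(tau, s, r)]
    return total


# Every cell gets a low forwarding rate except one favourable combination.
rho_example = {(tau, s, r): 0.1 for tau in (0, 1) for s in range(3) for r in range(3)}
rho_example[(1, 0, 0)] = 0.8
p_certain = forward_probability([1.0, 0.0], [1.0, 0.0],
                                [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], rho_example)
p_mixed = forward_probability([0.5, 0.5], [0.5, 0.5],
                              [0.2, 0.5, 0.3], [0.1, 0.3, 0.6], rho_example)
```

Comparing the terms of the sum for a given pair of users shows which factor (interest, activity or role) contributes most to the predicted probability.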
The invention uses the user relationships and user behavior data in a social network, takes the followers of a posting user as candidate users, and predicts whether a candidate user will forward a friend's microblog within a period of time. First, to capture the diversity of an individual user's interests, activity and influence, the basic idea and method by which the LDA topic model handles "one word with several meanings, several words with one meaning" are used to model and analyze user behavior and to obtain topic distributions over user behavior. Second, because the elements of the user state vector and the user feature vector are continuous values, LDA is improved with Gaussian distributions in order to discover the activity and the influence of users. Finally, because user activity changes over time, time discretization and time slicing are used to propose an improved-LDA dynamic microblog forwarding behavior prediction model that dynamically monitors user activity, so that the forwarding behavior of users can be predicted accurately and the key factors that influence forwarding can be analyzed.
The above embodiments should be understood as merely illustrating the invention and not as limiting its scope of protection. After reading the contents recorded in the invention, a skilled person may make various changes or modifications to the invention, and such equivalent changes and modifications likewise fall within the scope of the claims of the invention.

Claims (10)

1. A friend-circle-based dynamic microblog forwarding behavior prediction system, comprising a user behavior data source acquisition module for acquiring the user relationships and user behavior data in a social network and taking the followers of a posting user as candidate users, characterized in that it further comprises an attribute extraction module, a model construction module and a prediction analysis module, wherein: the attribute extraction module extracts relevant attribute vectors from three aspects, namely the interest difference between users, the activity of the candidate user, and the influence of the posting user, as the input of the prediction model; the microblog forwarding behavior prediction model construction module builds a microblog forwarding behavior prediction model for the candidate user, in which the forwarding behavior is mainly determined by the interest difference τ between the candidate user and its friend, the activity s of the candidate user in the period when the post is published, and the network influence r of the friend, and fits these model parameters; and the prediction analysis module uses the fitted parameters and the posting situation of users at any time t to predict whether the candidate user will forward the microblog.
2. The friend-circle-based dynamic microblog forwarding behavior prediction system according to claim 1, characterized in that, for the interest difference between users, the attribute extraction module extracts the user interest vector as follows: using the following-behavior attribute of users, the following list of each user is obtained, and the interest vector of user v is defined as $I(v) = [e_{v,1}, e_{v,2}, \ldots, e_{v,|E_v|}]$, where $e_{v,u}$ denotes a user in the following list of user v, $u = 1, 2, \ldots, |E_v|$, and $|E_v|$ denotes the total number of users in the following list of user v.
3. The friend-circle-based dynamic microblog forwarding behavior prediction system according to claim 1 or 2, characterized in that, for the activity of the candidate user, the attribute extraction module extracts the user state vector as follows: using the interaction-behavior attribute and the time attribute of users, the microblog publishing activity and the microblog forwarding activity of each user within a period of time are obtained, and the activity state vector of user v is defined as $[x_{v,t,1}, x_{v,t,2}]$, where $x_{v,t,1}$ denotes the microblog publishing activity of user v in time slice t and $x_{v,t,2}$ denotes the microblog forwarding activity of user v in time slice t; the quantities used in these definitions are, respectively, the number of microblogs published by user v in time slice t, the number of microblogs forwarded by user v in time slice t, and the average number of microblogs published by user v per day.
4. The friend-circle-based dynamic microblog forwarding behavior prediction system according to claim 3, characterized in that, for the influence of the posting user, the attribute extraction module extracts the user feature vector as follows: using the network topology attribute, the out-degree, in-degree and local clustering coefficient of each user node are obtained, and the influence feature vector of user v is defined as $[d_{v,1}, d_{v,2}, d_{v,3}]$, where $d_{v,1}$ denotes the number of followers of user v, $d_{v,2}$ denotes the number of friends of user v, and $d_{v,3}$ denotes the local clustering coefficient of user v, in which $Ng_v$ is the set of neighbor nodes of node v and $edg_{ij}$ is a connection between its neighbor nodes.
5. The friend-circle-based dynamic microblog forwarding behavior prediction system according to claim 1 or 2 or 4, characterized in that the microblog forwarding behavior prediction model proceeds from three aspects, namely the interest difference between users, the activity of the candidate user, and the influence of the posting user: for the interest difference between users, the user interest vector is extracted from user behavior and user relationship information, and all users are trained with the LDA model to obtain the interest topic distribution of the users; for the activity of the candidate user, the state vector of the user in each time slice is extracted from user behavior and time information, and because the elements of the user state vector are continuous values, LDA is improved with Gaussian distributions and all users are trained with the improved LDA model to obtain the active state distribution of the users in each time slice; for the influence of the posting user, the feature vector of the user is extracted from network structure information and, as with the user state vector, LDA is improved with Gaussian distributions and all users are trained with the improved LDA model to obtain the network role distribution of the users; finally, the whole prediction model is trained according to whether the interests of the users are consistent, the active state of the candidate user in each time slice, the network role of the posting user, and the historical forwarding data of the users, to obtain the multinomial distribution of user forwarding behavior.
6. The friend-circle-based dynamic microblog forwarding behavior prediction system according to claim 5, characterized in that obtaining the interest topic distribution of the users in the microblog forwarding behavior prediction model further comprises: on the basis of the user relationship network, using the interaction behavior between users to weight the interest vector I(v) of the user, which gives the weighted user interest vector $I'(v) = [w_{v,1}, w_{v,2}, \ldots, w_{v,N_v}]$, where $w_{v,n}$ denotes the interaction object of the n-th interaction of user v, $n = 1, 2, \ldots, N_v$, and $N_v$ is the total number of interactions of user v; all users are then trained with the LDA model to obtain the interest topic distribution of the users.
7. The friend-circle-based dynamic microblog forwarding behavior prediction system according to claim 3, characterized in that obtaining the active state distribution of the users in each time slice further comprises: because the publishing activity $x_{v,t,1}$ and the forwarding activity $x_{v,t,2}$ of a user are continuous variables, the LDA model is improved with Gaussian distributions so that the values of publishing activity and forwarding activity each follow their own Gaussian distribution, $x_{v,t,m} \sim N(\mu_{s,m}, \sigma_{s,m})$, where $x_{v,t,m}$ denotes the m-th attribute value of user v in time slice t, and $\mu_{s,m}$ and $\sigma_{s,m}$ are the mean and standard deviation of the m-th attribute when the active state of the user is s.
8. The friend-circle-based dynamic microblog forwarding behavior prediction system according to claim 7, characterized in that, using time slicing, each day is cut into 4 periods starting from midnight, namely t = 1, 2, 3, 4; the active state of a user is divided into three levels, namely very active, moderately active, and inactive; and all users are trained with the improved LDA model to obtain the active state distribution of the users in each time slice.
9. The friend-circle-based dynamic microblog forwarding behavior prediction system according to claim 7, characterized in that user nodes are divided into three role types based on the network topology, namely opinion leaders, message senders, and ordinary users; likewise, after the LDA model is improved with Gaussian distributions, all users are trained with this model to obtain the network role distribution of the users.
10. A friend-circle-based dynamic microblog forwarding behavior prediction method based on the system of claim 1,
characterized in that it comprises the following steps:
acquiring the user relationships and user behavior data in a social network, and taking the followers of a posting user as candidate users; obtaining three user vectors from three aspects, namely the interest difference between users, the activity of the candidate user, and the influence of the posting user, as the input of the prediction model;
building the microblog forwarding behavior prediction model and fitting the model parameters;
inputting the fitted parameters and the posting situation of users at any time t into the prediction model to predict whether the candidate user will forward the microblog.
CN201611151738.3A 2016-12-14 2016-12-14 Dynamic microblog forwarding behavior prediction system and method based on friend circle Active CN106682770B (en)

Publications (2)

Publication Number Publication Date
CN106682770A true CN106682770A (en) 2017-05-17
CN106682770B CN106682770B (en) 2020-08-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant