CN106547901A

CN106547901A - Method for predicting microblog users' forwarding behavior based on energy optimization

Info

Publication number: CN106547901A
Application number: CN201610978548.2A
Authority: CN
Inventors: 朱海; 王伟; 张效尉; 陈立勇; 任国恒; 秦东霞; 刘琳琳
Original assignee: Zhoukou Normal University
Current assignee: Zhoukou Normal University
Priority date: 2016-11-08
Filing date: 2016-11-08
Publication date: 2017-03-29

Abstract

The invention provides an energy-optimization-based algorithm for predicting the forwarding behavior of microblog users and relates to the field of network technology. Its energy function fuses user forwarding behavior features, such as user attributes and microblog content, with user forwarding behavior constraints and a group forwarding prior, so that user forwarding behavior can be predicted globally. Experimental results show that the prediction method of the invention effectively solves the problems of traditional algorithms and achieves higher overall performance and prediction accuracy.

Description

Method for predicting microblog users' forwarding behavior based on energy optimization

Technical field

The present invention relates to the field of Internet technology, and in particular to an energy-optimization-based method for predicting the forwarding behavior of microblog users.

Background art

With the development of Internet technology and the spread of intelligent terminals, social networks such as microblogs and forums have an ever greater influence on people's daily lives. Microblog social networks in particular, owing to the speed of their information diffusion, the convenience of user operations, and the diversity of the media they carry (such as text, images, and video), have become a main channel through which people share news and social events around them. The massive data that users produce in microblog social networks contains their latent behavior patterns (such as commenting on and forwarding topics of interest) and emotional factors (such as showing anger or hatred toward social phenomena). Effectively analyzing, from historical microblog data, the features that influence user forwarding behavior and predicting users' future forwarding behavior therefore not only helps to mine users' interests and emotional tendencies so as to provide more accurate recommendation services (such as topic and product recommendation), but also helps to understand how messages diffuse in microblog social networks and to build reliable message diffusion models, which has wide application in fields such as public opinion monitoring and enterprise decision support.

When predicting the forwarding behavior of microblog users, the social network structure that reflects the relationships between users often has a large influence on prediction accuracy, in addition to features such as user attributes and microblog content. Traditional forwarding behavior prediction models generally have the following problems: (1) they predict using only features such as user attributes and microblog content and do not consider the influence of the social network structure, so their accuracy is generally low; (2) they use features that reflect the social network structure (such as the number of followers and the number of followed users) as feature components when predicting user forwarding behavior, which makes it difficult to capture the actual effect of the social network structure on the prediction; (3) they convert the social relationships between users into constraints for predicting user forwarding behavior but consider neither the type of social relationship between users (such as one-way following or mutual following) nor the group forwarding prior formed by the forwarding behavior of multiple users, so their prediction accuracy is hard to improve further.

Summary of the invention

Embodiments of the present invention provide an energy-optimization-based method for predicting the forwarding behavior of microblog users, in order to solve the problems in the prior art.

An energy-optimization-based method for predicting the forwarding behavior of microblog users, the method including:

establishing, from the information of the users in the social network, an energy function under the MRF energy optimization framework:

where E(Y) is the energy function; Y is the set of forwarding behavior labels; N is the number of users in the microblog social network; D_T(y_i,u_i) is the user forwarding behavior feature metric of user u_i for microblog T; y_i∈{1,0} is the forwarding behavior label that may be assigned to user u_i, with y_i∈Y; λ₁ and λ₂ are weights; N(i) is the set of indices of the users that have a direct following relationship with user u_i; ψ_i,j·δ(y_i≠y_j) is the user forwarding behavior constraint term; ψ_i,j is the penalty; δ(·) is the indicator function, which takes the value 1 when its argument is true and 0 otherwise; the τ-ego network of user u_i is a set of users whose scale is controlled by the parameter τ; and the group forwarding prior energy term is defined on that network;

solving the energy function E(Y) with the Graph Cuts algorithm to obtain an approximately optimal solution for the forwarding behavior label set Y, which completes the prediction of user forwarding behavior.

Preferably, the user forwarding behavior feature metric D_T(y_i,u_i) is expressed as:

D_T(y_i,u_i)=|y_i-P(u_i,T)|

where P(u_i,T) is the local forwarding probability of user u_i for microblog T, which is calculated as follows:

obtaining the user attribute features, and the content of microblog T together with the features related to the microblog content;

normalizing each feature in the user forwarding behavior feature vector, which comprises the user attribute features and the features related to the microblog content, and using the normalized user forwarding behavior feature vector to calculate the local forwarding probability P(u_i,T);

given that the normalized user forwarding behavior feature vector corresponding to user u_i is x_i, the local forwarding probability P(u_i,T) is calculated according to the following formula:

P (u_{i}, T) = \frac{1}{1 + \exp (- w^{T} x_{i})}

where w is the feature weight vector, obtained in this embodiment by minimizing a specific risk function with the gradient descent algorithm, i.e.:

where w^* is the optimal value of w, l(·) is the cross-entropy loss function, n is the number of samples, ‖·‖₂ is the L2-norm regularization term, and λ₃ is the parameter controlling the regularization strength.
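The logistic form of P(u_i,T) and the gradient descent training described above can be illustrated with a minimal sketch. The exact risk function is not reproduced in this text, so the sketch assumes mean cross-entropy plus λ₃ times the squared L2 norm of w; the function names, the learning rate, the epoch count, and the toy data are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def forwarding_probability(w, x):
    """P(u_i, T) = 1 / (1 + exp(-w^T x_i))."""
    return sigmoid(sum(wk * xk for wk, xk in zip(w, x)))

def fit_weights(samples, labels, lam3=0.01, lr=0.5, epochs=2000):
    """Gradient descent on mean cross-entropy + lam3 * ||w||^2 (assumed form)."""
    n, dim = len(samples), len(samples[0])
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [2.0 * lam3 * wk for wk in w]  # gradient of the regularizer
        for x, y in zip(samples, labels):
            err = forwarding_probability(w, x) - y  # d(cross-entropy)/d(w^T x)
            for k in range(dim):
                grad[k] += err * x[k] / n
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w

# Toy data: two normalized features plus a constant bias feature (hypothetical).
X = [[0.9, 0.8, 1.0], [0.8, 0.6, 1.0], [0.2, 0.1, 1.0], [0.1, 0.3, 1.0]]
y = [1, 1, 0, 0]
w_star = fit_weights(X, y)
probs = [forwarding_probability(w_star, x) for x in X]
```

In this sketch the first two toy users end up with a forwarding probability above 0.5 and the last two below it; a real feature vector would hold the 11 normalized features described in the text.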

Preferably, six user attribute features are selected: the number of followed users, the number of followers, whether the account is verified, the number of microblogs posted, the number of microblogs that have been forwarded, and the forwarding activity level; these six features are obtained directly from the microblog data. Five features related to the microblog content are selected: topic similarity, number of times forwarded, microblog content length, whether the microblog contains a URL, and whether it contains an @ mention.

Preferably, the topic similarity is calculated as follows: the historical original and forwarded microblogs of user u_i are accumulated into a document d_i; the LDA model is then used to calculate the probability distributions of the document d_i and of the content of microblog T over the predetermined topics; finally, the corresponding topic similarity is determined with the cosine distance, i.e.:

L (d_{i}, T) = \frac{LDA (d_{i}) \cdot LDA (T)}{\| LDA (d_{i}) \| \, \| LDA (T) \|}

where L(d_i,T) is the topic similarity between document d_i and microblog T, LDA(d_i) is the probability distribution of document d_i over the predetermined topics, and LDA(T) is the probability distribution of microblog T over the predetermined topics.
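The cosine comparison of two topic distributions can be sketched as follows. The topic vectors here are hypothetical three-topic examples standing in for LDA output; the LDA step itself is not shown, and returning 0 for an all-zero vector is an assumption of this sketch.

```python
import math

def cosine_similarity(p, q):
    """Cosine of the angle between two topic-distribution vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # assumption: an empty distribution is treated as dissimilar
    return dot / (norm_p * norm_q)

# Hypothetical topic distributions for a user's document d_i and a microblog T.
topics_user = [0.7, 0.2, 0.1]
topics_post = [0.6, 0.3, 0.1]
similarity = cosine_similarity(topics_user, topics_post)
```

The same helper applies unchanged to the topic preference similarity between two users, where both vectors come from user documents.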

Preferably, denoting each feature in the user forwarding behavior feature vector by f, the feature is normalized according to the following formula:

f' = \frac{f - f_{min}}{f_{max} - f_{min}}

where f' is the normalized feature, f_min is the minimum of the current feature over all users, and f_max is the maximum of the current feature over all users; the normalized user forwarding behavior feature vector thus consists of the 11 normalized features.
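A minimal sketch of this min-max normalization, applied to one feature across all users. The handling of a constant feature (f_max equal to f_min) is not addressed in the text; mapping it to 0 is an assumption of this sketch, and the follower counts are made-up values.

```python
def min_max_normalize(values):
    """Scale one feature across all users into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumption: constant features map to 0
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical follower counts of four users.
follower_counts = [10, 250, 1000, 40]
normalized = min_max_normalize(follower_counts)
```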

Preferably, the penalty ψ_i,j is defined as:

ψ_i,j=exp(P(u_i,u_j)/σ)

where the parameter σ controls the penalty strength, and P(u_i,u_j) is the forwarding behavior similarity probability of users u_i and u_j, determined from the user forwarding behavior similarity features and calculated as follows:

calculating, for each pair of users, the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature;

normalizing each feature in the user forwarding behavior similarity feature vector, which comprises the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature, and using the normalized user forwarding behavior similarity feature vector to calculate the user forwarding behavior similarity probability P(u_i,u_j);

given that the normalized user forwarding behavior similarity feature vector corresponding to user u_i is z_i, the user forwarding behavior similarity probability P(u_i,u_j) is calculated according to the following formula:

where ω is the feature weight vector, obtained by minimizing a specific risk function with the gradient descent algorithm, i.e.:

where ω^* is the optimal value of ω, l(·) is the cross-entropy loss function, m is the number of samples, ‖·‖₂ is the L2-norm regularization term, and λ₄ is the parameter controlling the regularization strength.

Preferably, the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature are calculated as follows:

the historical microblogs of users u_i and u_j are accumulated into documents d_i and d_j respectively, and the cosine distance between the LDA topic distribution vectors of the two documents is taken as the topic preference similarity between users u_i and u_j:

L (d_{i}, d_{j}) = \frac{LDA (d_{i}) \cdot LDA (d_{j})}{\| LDA (d_{i}) \| \, \| LDA (d_{j}) \|}

where L(d_i,d_j) is the topic preference similarity between users u_i and u_j, LDA(d_i) is the probability distribution of document d_i over the predetermined topics, and LDA(d_j) is the probability distribution of document d_j over the predetermined topics;

when users u_i and u_j follow each other, the mutual-following feature is set to 1; if only a one-way following relationship exists, the mutual-following feature is set to 0;

the common-following feature metric is calculated according to the following formula:

where S_ij is the common-following feature metric between users u_i and u_j, U_i denotes all users followed by user u_i, and U_j denotes all users followed by user u_j;

the mutual-forwarding feature metric is calculated according to the following formula:

R_ij=max(T_ij/T_i,T_ji/T_i)

where R_ij is the mutual-forwarding feature metric of users u_i and u_j, T_ij is the number of microblogs of user u_j forwarded by user u_i, T_i is the total number of microblogs forwarded by user u_i, and T_ji is the number of microblogs of user u_i forwarded by user u_j;

the common-forwarding feature metric is calculated according to the following formula:

where M_ij is the common-forwarding feature metric between users u_i and u_j, and T_j is the total number of microblogs forwarded by user u_j.
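Two of the pairwise features above are fully specified in the text and can be sketched directly: the mutual-following flag and the mutual-forwarding metric R_ij. The sketch implements R_ij exactly as the formula is stated. The data layout (a dictionary from user to the set of users they follow) and the zero-division handling are assumptions; the common-following and common-forwarding metrics are omitted because their formulas are not reproduced in this text.

```python
def mutual_follow_feature(following, i, j):
    """1 if users i and j follow each other, 0 otherwise (including one-way)."""
    follows_ij = j in following.get(i, set())
    follows_ji = i in following.get(j, set())
    return 1 if follows_ij and follows_ji else 0

def mutual_forwarding_metric(t_ij, t_ji, t_i):
    """R_ij = max(T_ij / T_i, T_ji / T_i), as the formula is stated."""
    if t_i == 0:
        return 0.0  # assumption: no forwards by u_i means no measurable signal
    return max(t_ij / t_i, t_ji / t_i)

# Hypothetical following lists: a and b follow each other, c only follows a.
following = {"a": {"b"}, "b": {"a", "c"}, "c": {"a"}}
flag_ab = mutual_follow_feature(following, "a", "b")
flag_ac = mutual_follow_feature(following, "a", "c")
r_ab = mutual_forwarding_metric(t_ij=3, t_ji=5, t_i=20)
```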

Preferably, denoting each feature in the user forwarding behavior similarity feature vector by g, the feature is normalized according to the following formula:

g' = \frac{g - g_{min}}{g_{max} - g_{min}}

where g' is the normalized feature, g_min is the minimum of the current feature over all users, and g_max is the maximum of the current feature over all users.

Preferably, the group forwarding prior energy term is calculated with the Pⁿ-Potts model:

where the constant λ_max is the group forwarding prior penalty, ρ is the mean of the pairwise forwarding behavior similarity probabilities P(u_i,u_j) of the users in the network, and Q is the proportion of users in the network whose local forwarding probability for microblog T is below the specified threshold ε, i.e.:

where the denominator is the number of users in the network.
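The two statistics ρ and Q that feed the group forwarding prior can be sketched from their verbal definitions. The Pⁿ-Potts expression that combines them with λ_max is not reproduced in this text, so the sketch stops at the statistics. The dictionary layouts, the function names, and the treatment of an empty network are assumptions.

```python
from itertools import combinations

def low_probability_share(local_probs, epsilon):
    """Q: share of network members whose local forwarding probability is below epsilon."""
    if not local_probs:
        return 0.0
    below = sum(1 for p in local_probs.values() if p < epsilon)
    return below / len(local_probs)

def mean_pairwise_similarity(members, similarity):
    """rho: mean of P(u_i, u_j) over all unordered pairs of members."""
    pairs = list(combinations(sorted(members), 2))
    if not pairs:
        return 0.0
    return sum(similarity[pair] for pair in pairs) / len(pairs)

# Hypothetical tau-ego network with three members.
local_probs = {"a": 0.9, "b": 0.2, "c": 0.1}
similarity = {("a", "b"): 0.8, ("a", "c"): 0.3, ("b", "c"): 0.4}
Q = low_probability_share(local_probs, epsilon=0.5)
rho = mean_pairwise_similarity(local_probs.keys(), similarity)
```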

The present invention proposes a user forwarding behavior prediction algorithm based on MRF energy optimization. Its energy function fuses features such as user attributes and microblog content with user forwarding behavior constraints and the group forwarding prior, so user forwarding behavior can be predicted globally. Experimental results show that the prediction method of the invention effectively solves the problems of traditional algorithms, achieves higher overall performance, and attains the following effects: (1) the many factors that influence user forwarding behavior (such as user attributes and microblog content) are analyzed systematically, and the features that influence the common forwarding behavior of users are examined in depth; (2) a user forwarding behavior prediction model based on MRF energy optimization is proposed, which jointly uses features such as user attributes and microblog content, user forwarding behavior constraints, and the group forwarding prior to predict user forwarding behavior globally, effectively improving the overall prediction accuracy.

Description of the drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow chart of the steps of an energy-optimization-based method for predicting microblog users' forwarding behavior provided by an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the invention.

With reference to Fig. 1, an embodiment of the present invention provides an energy-optimization-based method for predicting the forwarding behavior of microblog users. The method includes:

Step 100: establish, from the information of the users in the social network, the energy function under the MRF (Markov Random Field) energy optimization framework:

where E(Y) is the energy function; Y is the set of forwarding behavior labels; N is the number of users in the microblog social network; D_T(y_i,u_i) is the user forwarding behavior feature metric of user u_i for microblog T; y_i∈{1,0} is the forwarding behavior label that may be assigned to user u_i, with y_i∈Y; λ₁ and λ₂ are weights; N(i) is the set of indices of the users that have a direct following relationship with user u_i; ψ_i,j·δ(y_i≠y_j) is the user forwarding behavior constraint term; ψ_i,j is the penalty; δ(·) is the indicator function, which takes the value 1 when its argument is true and 0 otherwise; the τ-ego network of user u_i is a set of users whose scale is controlled by the parameter τ; and the group forwarding prior energy term is defined on that network.

The parts of the energy function E(Y) are calculated as follows:

D_T(y_i,u_i)=|y_i-P(u_i,T)|

where P(u_i,T) is the local forwarding probability of user u_i for microblog T, which is calculated as follows:

First sub-step: obtain the user attribute features. In this embodiment, six user attribute features are selected: the number of followed users, the number of followers, whether the account is verified, the number of microblogs posted, the number of microblogs that have been forwarded, and the forwarding activity level; these six features can generally be obtained directly from the microblog data;

Second sub-step: obtain the content of microblog T and the features related to the microblog content. In this embodiment, five features related to the microblog content are selected: topic similarity, number of times forwarded, microblog content length, whether the microblog contains a URL, and whether it contains an @ mention.

The topic similarity is calculated as follows: the historical original and forwarded microblogs of user u_i are accumulated into a document d_i; the LDA (Latent Dirichlet Allocation) model is then used to calculate the probability distributions of the document d_i and of the content of microblog T over 50 predetermined topics (such as education and military affairs); finally, the corresponding topic similarity is determined with the cosine distance, i.e.:

L (d_{i}, T) = \frac{LDA (d_{i}) \cdot LDA (T)}{\| LDA (d_{i}) \| \, \| LDA (T) \|}

Third sub-step: normalize each feature in the user forwarding behavior feature vector, which comprises the user attribute features and the features related to the microblog content, and use the normalized user forwarding behavior feature vector to calculate the local forwarding probability P(u_i,T).

Specifically, denoting each feature in the user forwarding behavior feature vector by f, the feature is normalized according to the following formula:

f' = \frac{f - f_{min}}{f_{max} - f_{min}}

where f' is the normalized feature, f_min is the minimum of the current feature over all users, and f_max is the maximum of the current feature over all users. In this embodiment, the 11 normalized features make up the normalized user forwarding behavior feature vector.

Fourth sub-step: given that the normalized user forwarding behavior feature vector corresponding to user u_i is x_i, the local forwarding probability P(u_i,T) is calculated according to the following formula:

P (u_{i}, T) = \frac{1}{1 + \exp (- w^{T} x_{i})}

The penalty ψ_i,j is defined as:

ψ_i,j=exp(P(u_i,u_j)/σ)

where the parameter σ controls the penalty strength, and P(u_i,u_j) is the forwarding behavior similarity probability of users u_i and u_j, determined from the user forwarding behavior similarity features (such as interest preference and common following). The similarity probability is calculated as follows:

Fifth sub-step: calculate, for each pair of users, the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature;

Specifically, the historical microblogs (both original and forwarded) of users u_i and u_j are accumulated into documents d_i and d_j respectively, and the cosine distance between the LDA topic distribution vectors of the two documents is taken as the topic preference similarity between users u_i and u_j:

L (d_{i}, d_{j}) = \frac{LDA (d_{i}) \cdot LDA (d_{j})}{\| LDA (d_{i}) \| \, \| LDA (d_{j}) \|}

where L(d_i,d_j) is the topic preference similarity between users u_i and u_j, LDA(d_i) is the probability distribution of document d_i over the predetermined topics, and LDA(d_j) is the probability distribution of document d_j over the predetermined topics.

When users u_i and u_j follow each other, the mutual-following feature is set to 1; if only a one-way following relationship exists, the mutual-following feature is set to 0.

For the common-following feature, S_ij is the common-following feature metric between users u_i and u_j, U_i denotes all users followed by user u_i, and U_j denotes all users followed by user u_j.

R_ij=max(T_ij/T_i,T_ji/T_i)

where R_ij is the mutual-forwarding feature metric of users u_i and u_j, T_ij is the number of microblogs of user u_j forwarded by user u_i, T_i is the total number of microblogs forwarded by user u_i, and T_ji is the number of microblogs of user u_i forwarded by user u_j.

Sixth sub-step: normalize each feature in the user forwarding behavior similarity feature vector, which comprises the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature, and use the normalized user forwarding behavior similarity feature vector to calculate the user forwarding behavior similarity probability P(u_i,u_j).

Specifically, denoting each feature in the user forwarding behavior similarity feature vector by g, the feature is normalized according to the following formula:

g' = \frac{g - g_{min}}{g_{max} - g_{min}}

where g' is the normalized feature, g_min is the minimum of the current feature over all users, and g_max is the maximum of the current feature over all users.

Seventh sub-step: given that the normalized user forwarding behavior similarity feature vector corresponding to user u_i is z_i, the user forwarding behavior similarity probability P(u_i,u_j) is calculated according to the following formula:

where ω is the feature weight vector, obtained in this embodiment by minimizing a specific risk function with the gradient descent algorithm, i.e.:

The group forwarding prior energy term is calculated with the Pⁿ-Potts model:

where the denominator is the number of users in the network.

Step 200: because solving the energy function E(Y) is an NP-hard problem, this embodiment uses the Graph Cuts algorithm to obtain an approximately optimal solution for the forwarding behavior label set Y, which completes the prediction of user forwarding behavior.
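The graph-cut idea can be sketched for a simplified energy that keeps only the feature metric and the pairwise constraint term, E(Y) = Σ |y_i - P(u_i,T)| + λ₁ Σ ψ_i,j·δ(y_i≠y_j), summed over listed user pairs. This is an assumption made for illustration: the group forwarding prior is left out, and the exact way the source combines the terms is not reproduced in this text. For this simplified binary energy a single s-t minimum cut gives an exact minimizer. The sketch uses a plain Edmonds-Karp max-flow rather than any particular Graph Cuts library, and all names and numbers are hypothetical.

```python
import math
from collections import defaultdict, deque

def energy(labels, local_prob, pair_penalty, lam1):
    """Simplified energy: feature metric plus weighted pairwise constraint term."""
    unary = sum(abs(labels[u] - p) for u, p in local_prob.items())
    pairwise = sum(psi for (u, v), psi in pair_penalty.items() if labels[u] != labels[v])
    return unary + lam1 * pairwise

def min_cut_labels(local_prob, pair_penalty, lam1):
    """Minimize the simplified energy with an s-t minimum cut (Edmonds-Karp)."""
    cap = defaultdict(lambda: defaultdict(float))
    src, sink = "_source", "_sink"
    for u, p in local_prob.items():
        cap[src][u] += p         # cut when u lands on the sink side (y_u = 0)
        cap[u][sink] += 1.0 - p  # cut when u lands on the source side (y_u = 1)
    for (u, v), psi in pair_penalty.items():
        cap[u][v] += lam1 * psi  # cut when u and v receive different labels
        cap[v][u] += lam1 * psi

    def reachable_from_source():
        parents = {src: None}
        queue = deque([src])
        while queue:
            a = queue.popleft()
            for b, c in cap[a].items():
                if c > 1e-12 and b not in parents:
                    parents[b] = a
                    queue.append(b)
        return parents

    while True:
        parents = reachable_from_source()
        if sink not in parents:
            break
        path, node = [], sink
        while parents[node] is not None:
            path.append((parents[node], node))
            node = parents[node]
        flow = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= flow  # consume capacity along the augmenting path
            cap[b][a] += flow  # add residual capacity in the reverse direction
    # Users still reachable from the source in the residual graph get label 1.
    return {u: 1 if u in parents else 0 for u in local_prob}

# Hypothetical three-user chain a-b-c with sigma = 1.0 and lam1 = 0.3.
sigma, lam1 = 1.0, 0.3
local_prob = {"a": 0.9, "b": 0.45, "c": 0.05}
pair_penalty = {("a", "b"): math.exp(0.8 / sigma), ("b", "c"): math.exp(0.2 / sigma)}
labels = min_cut_labels(local_prob, pair_penalty, lam1)
```

In this toy network user b, whose local forwarding probability of 0.45 alone would suggest "not forwarding", is labeled as forwarding because of the strong similarity to user a. Adding the higher-order group prior would require an extended graph construction that this sketch does not attempt.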

It should be noted that if the number of users in the microblog social network is large, the complexity of solving the energy function E(Y) may be very high. To address this, the present invention uses a fast community detection algorithm to divide a large-scale social network into multiple smaller sub-networks, solves the corresponding energy function for each sub-network, and merges the results to form the solution for the original social network.
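The divide, solve, and merge pattern can be sketched as below. The text does not name the fast community detection algorithm, so the sketch substitutes connected components purely as a stand-in partition; the per-sub-network solver is passed in as a function, and the toy solver shown is hypothetical.

```python
from collections import defaultdict

def connected_components(users, edges):
    """Stand-in partition: groups of users linked by following relationships."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, parts = set(), []
    for start in users:
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(neighbors[node] - part)
        seen |= part
        parts.append(part)
    return parts

def solve_by_parts(users, edges, solve_subnetwork):
    """Solve each sub-network independently and merge the label assignments."""
    merged = {}
    for part in connected_components(users, edges):
        sub_edges = [(u, v) for u, v in edges if u in part and v in part]
        merged.update(solve_subnetwork(part, sub_edges))
    return merged

# Hypothetical solver: users in a sub-network of two or more members forward.
def toy_solver(part, sub_edges):
    return {u: 1 if len(part) > 1 else 0 for u in part}

labels = solve_by_parts(["a", "b", "c", "d"], [("a", "b"), ("b", "c")], toy_solver)
```

A real community detection step would also cut some edges between communities, so the merged result is an approximation of the solution on the full network.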

Experimental data and analysis

In this embodiment, the feasibility of the above prediction method was verified with the data set (D1 for short) disclosed in the article "Who Influenced You? Predicting Retweet via Social Influence Locality" by J. Zhang et al., published in ACM Transactions on Knowledge Discovery from Data in 2015. The data set contains the basic information (such as name, gender, and number of followers) of 1,787,443 Sina Weibo users, the 1000 most recently posted microblogs of the users, and the social relationship structure between users. In addition, to further verify the effectiveness of the prediction method, a data set characterized by microblog forwarding depth (D2 for short) was obtained through the open API of Sina Weibo. In this process, the crawler first randomly selected 10,000 popular microblogs as seeds and then, in a depth-first manner, continuously captured all forwarding users of each seed microblog together with the followers and followed users of each forwarding user, finally obtaining the basic information and social relationship structure of 1,132,145 users in total.

Table 1: User forwarding behavior prediction results (τ=1, λ₁=0.6, λ₂=0.3)

In Table 1, recall is the proportion of users correctly predicted as "forwarding" among all users who actually forwarded; precision is the proportion of users correctly predicted as "forwarding" among all users predicted as "forwarding"; and the F1 measure is a composite indicator, namely precision × recall × 2 / (precision + recall). SVM_1 and LERBP_1 are the prediction results of SVM (Support Vector Machine) and LERBP (Local Energy-based Retweet Behavior Predicting, i.e. the local forwarding probability) using user forwarding behavior features that include user attributes and microblog content; SVM_2 and LERBP_2 are the prediction results of SVM and LERBP using user forwarding behavior features that exclude user attributes; Algorithm 1 is the algorithm disclosed in the article "Who Influenced You? Predicting Retweet via Social Influence Locality" by J. Zhang et al., published in ACM Transactions on Knowledge Discovery from Data in 2015; Algorithm 2 is the algorithm disclosed in the article "Predicting individual retweet behavior by user similarity: A Multi-Task Learning Approach" by X. Tang et al., published in Knowledge-Based Systems in 2015; PERBP is the prediction result of the Pairwise ERBP (Energy-based Retweet Behavior Predicting) model; and ERBP is the prediction result of the prediction method of the present invention.
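The three evaluation measures can be computed as in the following minimal sketch, where label 1 is taken to mean "forwarding" (an assumption of the sketch) and the two label lists are made-up values, not results from Table 1.

```python
def precision_recall_f1(predicted, actual):
    """Precision, recall, and F1 for the 'forwarding' class (label 1)."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    predicted_pos = sum(predicted)
    actual_pos = sum(actual)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical predictions and ground truth for five users.
predicted = [1, 1, 0, 1, 0]
actual = [1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(predicted, actual)
```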

The results show that, because SVM and LERBP do not consider the social relationships between users, their overall prediction accuracy is generally low, with SVM showing relatively better performance. On the other hand, using the number of followers and the number of followed users, which reflect social relationships between users, as feature components for forwarding behavior prediction cannot accurately describe the social relationship features between users, so the prediction accuracy does not improve substantially.

By contrast, Algorithm 1 considers the influence of the forwarding behavior constraints between users and of the social network structure within a local range, so it predicts user forwarding behavior better. In fact, in microblog social networks the social relationships between users usually cause their forwarding behaviors to influence one another; this can even change users' own preferences and interests and make their forwarding behavior tend toward local consistency. However, because Algorithm 1 does not consider the global character of user forwarding behavior constraints, it is difficult for it to achieve better prediction accuracy. Likewise, although Algorithm 2 uses a multi-task learning method with user forwarding behavior similarity features to highlight the individual differences in the forwarding behavior of different users, it does not consider the influence among the forwarding behaviors of multiple users and therefore also fails to achieve higher prediction accuracy.

In comparison, PERBP fuses user forwarding behavior features with user forwarding behavior constraints under the MRF energy optimization framework, which helps both to highlight the individual differences in the forwarding behavior of different users and to characterize the commonalities in the forwarding behavior of multiple users in the social network, and thus yields globally optimized prediction results. ERBP builds on PERBP by fusing the group forwarding prior, which further describes the influence of the user's social circle on the forwarding behavior of multiple users; it therefore reflects the essential characteristics of user forwarding behavior more accurately, and the corresponding prediction accuracy is further improved.

Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.

The present invention is described with reference to flow charts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the invention. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.

Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to cover them as well.

Claims

1. A microblog user forwarding behavior prediction method based on energy optimization, characterized in that the method comprises:

Wherein E(Y) is the energy function; Y is the set of forwarding behavior labels; N is the number of users in the microblog social network; D_T(y_i, u_i) is the user forwarding behavior feature metric of user u_i for microblog T; y_i ∈ {1, 0} denotes the forwarding behavior label that may be assigned to user u_i, with y_i ∈ Y; λ₁ and λ₂ are weights; N(i) is the set of indices of the users that have a direct follow relationship with user u_i; ψ_i,j·δ(y_i≠y_j) is the user forwarding behavior constraint term; ψ_i,j is the penalty; δ(·) is the indicator function, which takes the value 1 when its argument is true and 0 otherwise; G_i^τ is the τ-ego network corresponding to user u_i, where the parameter τ controls the scale of the τ-ego network; and the term defined on G_i^τ is the group forwarding prior energy term;

solving the energy function E(Y) with the Graph Cuts algorithm to obtain an approximately optimal solution for the forwarding behavior label set Y, thereby completing the prediction of user forwarding behavior.
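As an illustration only, the following Python sketch evaluates the unary and pairwise parts of E(Y) on a toy network and finds the minimizing labeling by exhaustive search. The exhaustive search is a stand-in for Graph Cuts and is feasible only for a handful of users. The group forwarding prior term is omitted, and the placement of the weight `lam1` on the pairwise term, the function names, and all numeric values are assumptions that are not taken from the claim.

```python
from itertools import product

def unary(y, p):
    """D_T(y_i, u_i) = |y_i - P(u_i, T)| (see claim 2)."""
    return abs(y - p)

def energy(labels, p_local, neighbors, psi, lam1=1.0):
    """Unary plus pairwise energy; the group prior term is left out (assumption)."""
    e = sum(unary(y, p) for y, p in zip(labels, p_local))
    for i, nbrs in neighbors.items():
        for j in nbrs:
            # delta(y_i != y_j): pay the penalty only when the labels disagree
            if labels[i] != labels[j]:
                e += lam1 * psi[(i, j)]
    return e

def brute_force_minimize(p_local, neighbors, psi, lam1=1.0):
    """Exhaustive stand-in for Graph Cuts; only viable for very small N."""
    n = len(p_local)
    return min(product((0, 1), repeat=n),
               key=lambda ys: energy(ys, p_local, neighbors, psi, lam1))

# Hypothetical toy data: three users in a chain 0 - 1 - 2
p_local = [0.9, 0.6, 0.2]                      # local forwarding probabilities
neighbors = {0: [1], 1: [0, 2], 2: [1]}        # N(i)
psi = {(0, 1): 0.5, (1, 0): 0.5, (1, 2): 0.1, (2, 1): 0.1}
best = brute_force_minimize(p_local, neighbors, psi)
```

In this toy case the strong penalty between users 0 and 1 keeps their labels equal, while the weak penalty between users 1 and 2 lets user 2 keep the label suggested by its own low local probability.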

2. The method of claim 1, characterized in that the user forwarding behavior feature metric D_T(y_i, u_i) is expressed as:

D_T(y_i,u_i)=| y_i-P(u_i,T)|

The features in the user forwarding behavior feature vector, which comprises the user attribute features and the microblog-content-related features, are each normalized, and the normalized user forwarding behavior feature vector is used to calculate the local forwarding probability P(u_i, T);

Given that the normalized user forwarding behavior feature vector corresponding to user u_i is x_i, the local forwarding probability P(u_i, T) is calculated according to the following formula:

P (u_{i}, T) = \frac{1}{1 + \exp (- w^{T} x_{i})}

Wherein w is the feature weight vector, which in this embodiment is obtained by minimizing a specific risk function with the gradient descent algorithm, i.e.:

w^{*} = \arg \min_{w} \sum_{i = 1}^{n} \left( l (y_{i}, P (u_{i}, T)) + λ_{3} \cdot \| w \|_{2}^{2} \right)

Wherein w^* is the optimal value of w, l(·) is the cross-entropy loss function, n is the number of samples, ||·||₂ is the L2-norm regularization term, and λ₃ is the parameter that controls the regularization strength.
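The calculation in this claim can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the learning rate, epoch count, toy feature vectors, and function names are hypothetical, and a constant 1.0 component is appended to each feature vector to act as a bias term, which the claim does not mention.

```python
import math

def local_forward_prob(w, x):
    """P(u_i, T) = 1 / (1 + exp(-w^T x_i))."""
    return 1.0 / (1.0 + math.exp(-sum(wk * xk for wk, xk in zip(w, x))))

def fit_weights(xs, ys, lam3=0.01, lr=0.5, epochs=2000):
    """Batch gradient descent on sum_i [cross-entropy + lam3 * ||w||_2^2]."""
    dim, n = len(xs[0]), len(xs)
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for x, y in zip(xs, ys):
            err = local_forward_prob(w, x) - y  # gradient of cross-entropy w.r.t. w^T x
            for k in range(dim):
                grad[k] += err * x[k] + 2.0 * lam3 * w[k]
        # step along the mean gradient (the 1/n scaling is an assumption)
        w = [wk - lr * gk / n for wk, gk in zip(w, grad)]
    return w

def unary_cost(y, p):
    """D_T(y_i, u_i) = |y_i - P(u_i, T)|."""
    return abs(y - p)

# Hypothetical normalized features; the last component is the assumed bias input
xs = [[0.9, 1.0], [0.8, 1.0], [0.2, 1.0], [0.1, 1.0]]
ys = [1, 1, 0, 0]
w = fit_weights(xs, ys)
p_hi = local_forward_prob(w, [0.9, 1.0])
p_lo = local_forward_prob(w, [0.1, 1.0])
```

With these definitions, assigning the label 1 to a user with a high local forwarding probability yields a small D_T, and assigning the label 0 yields a large one.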

3. The method of claim 2, characterized in that the user attribute features are selected as the following 6 features: number of users followed, number of followers, whether the account is verified, number of microblogs posted, number of microblogs forwarded, and forwarding activity level, all of which are obtained directly from the microblog data; and the microblog-content-related features are selected as the following 5 features: topic similarity, number of times forwarded, microblog content length, whether a URL is included, and whether an @ mention is included.

4. The method of claim 3, characterized in that the topic similarity is calculated as follows: the historical original and forwarded microblogs of user u_i are aggregated into a document d_i; the LDA model is then used to calculate the probability distributions of the document d_i and of the content of the microblog T over the predetermined topics; finally, the corresponding topic similarity is determined using the cosine distance, i.e.:

L (d_{i}, T) = \frac{\mathrm{LDA} (d_{i}) \cdot \mathrm{LDA} (T)}{\| \mathrm{LDA} (d_{i}) \| \, \| \mathrm{LDA} (T) \|}

Wherein L(d_i, T) is the topic similarity between document d_i and microblog T, LDA(d_i) is the probability distribution of document d_i over the predetermined topics, and LDA(T) is the probability distribution of microblog T over the predetermined topics.
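The formula for L(d_i, T) is the normalized dot product of two topic distributions, which is commonly called cosine similarity. A minimal Python sketch follows; it assumes the LDA topic distributions have already been computed elsewhere, and the vectors and the zero-norm fallback are hypothetical.

```python
import math

def cosine_similarity(p, q):
    """Dot product of p and q divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # assumed convention for an all-zero vector
    return dot / (norm_p * norm_q)

# Hypothetical topic distributions over 3 predetermined topics
lda_d_i = [0.7, 0.2, 0.1]   # stands in for LDA(d_i)
lda_T = [0.6, 0.3, 0.1]     # stands in for LDA(T)
topic_similarity = cosine_similarity(lda_d_i, lda_T)
```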

5. The method of claim 3, characterized in that each feature in the user forwarding behavior feature vector is denoted by f, and the feature is normalized according to the following formula:

f' = \frac{f - f_{\min}}{f_{\max} - f_{\min}}

Wherein f' is the normalized feature, f_min is the minimum value of the current feature over all users, and f_max is the maximum value of the current feature over all users; the normalized user forwarding behavior feature vector then consists of the 11 normalized features.
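The min-max normalization above can be sketched in Python as follows. The handling of a constant feature (f_max equal to f_min), where the formula would divide by zero, is an assumption, as are the function name and the sample values.

```python
def min_max_normalize(values):
    """Map each value f to (f - f_min) / (f_max - f_min) over all users."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant feature carries no information
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical follower counts for three users
followers = [10, 250, 1000]
normalized = min_max_normalize(followers)
```

Each of the 11 features would be normalized independently in this way before being assembled into the vector x_i.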

6. The method of claim 1, characterized in that the penalty ψ_i,j is defined as:

ψ_i,j=exp (P (u_i,u_j)/σ)

Wherein the parameter σ controls the penalty strength, and P(u_i, u_j) is the user forwarding behavior similarity probability of users u_i and u_j, determined from the user forwarding behavior similarity features; the user forwarding behavior similarity probability is calculated as follows:

For each pair of users, the topic preference similarity, the mutual follow feature, the common follow feature, the mutual forwarding feature, and the common forwarding feature are calculated;

The features in the user forwarding behavior similarity feature vector, which comprises the topic preference similarity, the mutual follow feature, the common follow feature, the mutual forwarding feature, and the common forwarding feature, are each normalized, and the normalized user forwarding behavior similarity feature vector is used to calculate the user forwarding behavior similarity probability P(u_i, u_j);

Given that the normalized user forwarding behavior similarity feature vector corresponding to user u_i is z_i, the user forwarding behavior similarity probability P(u_i, u_j) is calculated according to the following formulas:

P (u_{i}, u_{j}) = \frac{1}{1 + \exp (- ω^{T} z_{i})}

ω^{*} = \arg \min_{ω} \sum_{i = 1}^{m} \left( l (y_{i}, P (u_{i}, u_{j})) + λ_{4} \cdot \| ω \|_{2}^{2} \right)

Wherein ω^* is the optimal value of ω, l(·) is the cross-entropy loss function, m is the number of samples, ||·||₂ is the L2-norm regularization term, and λ₄ is the parameter that controls the regularization strength.
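A minimal Python sketch of the penalty ψ_i,j = exp(P(u_i, u_j)/σ) defined in this claim follows. The similarity probability is taken as a given input, and the function name, the rejection of non-positive σ, and the sample values are assumptions.

```python
import math

def penalty(p_sim, sigma):
    """psi_ij = exp(P(u_i, u_j) / sigma); larger similarity gives a larger penalty."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")  # assumed constraint
    return math.exp(p_sim / sigma)

# Hypothetical similarity probabilities for two user pairs
psi_similar = penalty(0.8, sigma=0.5)
psi_dissimilar = penalty(0.2, sigma=0.5)
```

Because P(u_i, u_j) lies between 0 and 1, the penalty ranges from 1 to exp(1/σ), so a smaller σ widens the gap between similar and dissimilar pairs.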

7. The method of claim 6, characterized in that the topic preference similarity, the mutual follow feature, the common follow feature, the mutual forwarding feature, and the common forwarding feature are calculated as follows:

The historical microblogs of users u_i and u_j are aggregated into documents d_i and d_j, respectively, and the cosine distance between the LDA topic distribution vectors of the two documents is taken as the topic preference similarity between users u_i and u_j:

L (d_{i}, d_{j}) = \frac{\mathrm{LDA} (d_{i}) \cdot \mathrm{LDA} (d_{j})}{\| \mathrm{LDA} (d_{i}) \| \, \| \mathrm{LDA} (d_{j}) \|}

Wherein L(d_i, d_j) is the topic preference similarity between users u_i and u_j, LDA(d_i) is the probability distribution of document d_i over the predetermined topics, and LDA(d_j) is the probability distribution of document d_j over the predetermined topics;

When users u_i and u_j follow each other, the mutual follow feature takes the value 1; if only a one-way follow exists, the mutual follow feature takes the value 0;

S_{ij} = \frac{U_{i} \cap U_{j}}{U_{i} \cup U_{j}}

R_ij=max (T_ij/T_i,T_ji/T_i)

Wherein R_ij denotes the mutual forwarding feature metric of users u_i and u_j, T_ij denotes the number of microblogs of user u_j forwarded by user u_i, T_i denotes the total number of microblogs forwarded by user u_i, and T_ji denotes the number of microblogs of user u_i forwarded by user u_j;

M_{ij} = \frac{T_{i} \cap T_{j}}{T_{i} \cup T_{j}}

Wherein M_ij denotes the common forwarding feature metric between users u_i and u_j, and T_j denotes the total number of microblogs forwarded by user u_j.
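The pairwise features of this claim can be sketched in Python as follows. Several points are assumptions rather than statements of the claim: U_i is read as the set of users followed by u_i; the ratios for S_ij and M_ij are read as ratios of set sizes (Jaccard coefficients), with T_i and T_j treated as sets of forwarded microblogs in M_ij; and R_ij follows the formula as written, with T_i in both denominators. All names and data are hypothetical.

```python
def mutual_follow(follows_i, follows_j, i, j):
    """1 if u_i and u_j follow each other, 0 otherwise (including one-way follows)."""
    return 1 if (j in follows_i and i in follows_j) else 0

def jaccard(a, b):
    """|a & b| / |a | b|, used here for both S_ij and M_ij (assumed reading)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mutual_forward(t_ij, t_ji, t_i):
    """R_ij = max(T_ij / T_i, T_ji / T_i), as written in the claim."""
    if t_i == 0:
        return 0.0  # assumed convention when u_i has forwarded nothing
    return max(t_ij / t_i, t_ji / t_i)

# Hypothetical data for users "a" and "b"
follows = {"a": {"b", "c", "d"}, "b": {"a", "c"}}
forwarded = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}}   # ids of forwarded microblogs

f_mutual = mutual_follow(follows["a"], follows["b"], "a", "b")
s_ab = jaccard(follows["a"], follows["b"])        # common follow feature
m_ab = jaccard(forwarded["a"], forwarded["b"])    # common forwarding feature
r_ab = mutual_forward(t_ij=1, t_ji=2, t_i=4)      # mutual forwarding feature
```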

8. The method of claim 6, characterized in that each feature in the user forwarding behavior similarity feature vector is denoted by g, and the feature is normalized according to the following formula:

g' = \frac{g - g_{\min}}{g_{\max} - g_{\min}}

Wherein g' is the normalized feature, g_min is the minimum value of the current feature over all users, and g_max is the maximum value of the current feature over all users.

9. The method of claim 1, characterized in that the group forwarding prior energy term is calculated using the Pⁿ-Potts model:

Wherein the constant λ_max is the group forwarding prior penalty, ρ is the mean of the pairwise user forwarding behavior similarity probabilities P(u_i, u_j) in the network G_i^τ, and Q is the proportion of users in the network G_i^τ whose local forwarding probability for microblog T is less than the specified threshold ε, i.e.:

Q = \frac{\sum_{υ \in G_{i}^{τ}} δ (P (υ, T) < ε)}{| G_{i}^{τ} |}

Wherein |G_i^τ| denotes the number of users in the network G_i^τ.
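The quantities ρ and Q used by the group forwarding prior can be sketched in Python as follows. The sketch only computes these two statistics for one τ-ego network and does not implement the Pⁿ-Potts term itself; the function names, the rejection of empty inputs, and the sample values are assumptions.

```python
def low_probability_ratio(local_probs, eps):
    """Q: share of users in the network with P(v, T) < eps."""
    if not local_probs:
        raise ValueError("the network must contain at least one user")
    return sum(1 for p in local_probs if p < eps) / len(local_probs)

def mean_similarity(pair_probs):
    """rho: mean of the pairwise similarity probabilities in the network."""
    if not pair_probs:
        raise ValueError("at least one user pair is required")
    return sum(pair_probs) / len(pair_probs)

# Hypothetical values for one tau-ego network with four users
local_probs = [0.05, 0.40, 0.08, 0.90]      # P(v, T) for each user v
pair_probs = [0.5, 0.75, 0.25]              # P(u_i, u_j) for some user pairs
q = low_probability_ratio(local_probs, eps=0.1)
rho = mean_similarity(pair_probs)
```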