CN106547901A - Energy-optimization-based method for predicting microblog users' forwarding behavior - Google Patents
Energy-optimization-based method for predicting microblog users' forwarding behavior
- Publication number
- CN106547901A (application number CN201610978548.2A)
- Authority
- CN
- China
- Prior art keywords
- user
- forwarding
- feature
- microblogging
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Primary Health Care (AREA)
- General Business, Economics & Management (AREA)
- Marketing (AREA)
- General Health & Medical Sciences (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides an energy-optimization-based algorithm for predicting microblog users' forwarding (retweeting) behavior and relates to the field of Internet technology. Its energy function fuses user forwarding behavior features, such as user attributes and microblog content, with user forwarding behavior constraints and a group forwarding prior, so that user forwarding behavior can be predicted globally. Experimental results show that the prediction method of the invention effectively solves the problems of traditional algorithms and, overall, achieves higher performance and prediction accuracy.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to an energy-optimization-based method for predicting microblog users' forwarding behavior.
Background art
With the development of Internet technology and the spread of intelligent terminals, social networks such as microblogs and forums have an ever greater influence on people's daily lives. Microblog social networks in particular, owing to the speed with which they spread information, the convenience of user operations, and the variety of media they carry (text, images, video, and so on), have gradually become a main channel through which people share news and the social events around them. The mass of data that users generate in microblog social networks contains their latent behavior patterns (such as commenting on and forwarding topics of interest) and emotional factors (such as showing anger or hatred toward a social phenomenon). Using the historical data of a microblog social network to analyze the features that influence users' forwarding behavior, and to predict their future forwarding behavior, therefore not only helps to mine users' interests and emotional leanings so as to provide more accurate recommendation services (such as topic or product recommendation), but also helps to understand how messages diffuse through the network so that a reliable message diffusion model can be built. This also has broad applications in fields such as public opinion monitoring and enterprise decision support.
When microblog users' forwarding behavior is predicted, the social network structure that reflects the relationships between users often has a large effect on prediction accuracy, in addition to features such as user attributes and microblog content. Traditional forwarding behavior prediction models generally suffer from the following problems: (1) they predict using only features such as user attributes and microblog content and ignore the influence of the social network structure, so their accuracy is generally low; (2) they use features that reflect the social network structure (such as follower count and number of users followed) merely as feature components when predicting forwarding behavior, which fails to capture the actual effect of the social network structure on the prediction; (3) they convert the social relationships between users into constraints when predicting forwarding behavior, but consider neither the type of social relationship between users (such as one-way following or mutual following) nor the group forwarding prior formed by the forwarding behavior of many users, so their prediction accuracy is hard to improve further.
Summary of the invention
Embodiments of the invention provide an energy-optimization-based method for predicting microblog users' forwarding behavior, in order to solve the problems in the prior art.
An energy-optimization-based method for predicting microblog users' forwarding behavior, the method comprising:
establishing, from the information of the users in the social network, an energy function under the MRF energy optimization framework:
where $E(Y)$ is the energy function; $Y$ is the set of forwarding behavior labels; $N$ is the number of users in the microblog social network; $D_T(y_i, u_i)$ is the forwarding behavior feature measure of user $u_i$ for microblog $T$; $y_i \in \{1, 0\}$ is the forwarding behavior label that may be assigned to user $u_i$, with $y_i \in Y$; $\lambda_1$ and $\lambda_2$ are weights; $N(i)$ is the set of indices of the users that have a direct following relationship with user $u_i$; $\psi_{i,j} \cdot \delta(y_i \neq y_j)$ is the user forwarding behavior constraint term; $\psi_{i,j}$ is the penalty; $\delta(\cdot)$ is the indicator function, which takes the value 1 when its argument is true and 0 otherwise; each user $u_i$ has a corresponding $\tau$-ego network, whose scale is controlled by the parameter $\tau$, and the group forwarding prior energy term is defined over the set of users in that network;
solving the energy function $E(Y)$ with the Graph Cuts algorithm to obtain an approximately optimal solution for the forwarding behavior label set $Y$, which completes the prediction of user forwarding behavior.
Preferably, the user forwarding behavior feature measure $D_T(y_i, u_i)$ is expressed as:
$D_T(y_i, u_i) = |y_i - P(u_i, T)|$
where $P(u_i, T)$ is the local forwarding probability of user $u_i$ for microblog $T$, computed as follows:
obtaining the user attribute features, and the content of microblog $T$ together with the features related to the microblog content;
normalizing each feature in the user forwarding behavior feature vector, which comprises the user attribute features and the features related to the microblog content, and using the normalized vector to compute the local forwarding probability $P(u_i, T)$;
given that the normalized user forwarding behavior feature vector of user $u_i$ is $x_i$, computing the local forwarding probability $P(u_i, T)$ according to the following formula:
where $w$ is the feature weight vector, obtained in this embodiment by minimizing a specific risk function with the gradient descent algorithm, that is:
where $w^*$ is the optimal value of $w$, $l(\cdot)$ is the cross-entropy loss function, $n$ is the number of samples, $\|\cdot\|_2$ is the L2-norm regularization term, and $\lambda_3$ is the parameter that controls the regularization strength.
Preferably, the user attribute features are chosen as the following 6 features: number of users followed, number of followers, whether the account is verified, number of microblogs posted, number of microblogs that have been forwarded, and forwarding activity; these 6 features are obtained directly from the microblog data. The features related to the microblog content are chosen as the following 5 features: topic similarity, number of times forwarded, microblog content length, whether the microblog contains a URL, and whether it contains @.
Preferably, the topic similarity is computed as follows: the microblogs that user $u_i$ has posted and forwarded in the past are aggregated into a document $d_i$; an LDA model is then used to compute the probability distributions of the document and of microblog $T$ over the predetermined topics; finally, the topic similarity is determined with the cosine distance, that is:
$L(d_i, T) = \dfrac{\mathrm{LDA}(d_i) \cdot \mathrm{LDA}(T)}{\|\mathrm{LDA}(d_i)\| \, \|\mathrm{LDA}(T)\|}$
where $L(d_i, T)$ is the topic similarity between document $d_i$ and microblog $T$, $\mathrm{LDA}(d_i)$ is the probability distribution of document $d_i$ over the predetermined topics, and $\mathrm{LDA}(T)$ is the probability distribution of microblog $T$ over the predetermined topics.
Preferably, each feature in the user forwarding behavior feature vector is denoted $f$ and normalized according to the following formula:
$f' = \dfrac{f - f_{\min}}{f_{\max} - f_{\min}}$
where $f'$ is the normalized feature, $f_{\min}$ is the minimum of the current feature over all users, and $f_{\max}$ is the maximum of the current feature over all users; the normalized user forwarding behavior feature vector then consists of the 11 normalized features.
Preferably, the penalty $\psi_{i,j}$ is defined as:
$\psi_{i,j} = \exp(P(u_i, u_j)/\sigma)$
where the parameter $\sigma$ controls the penalty strength and $P(u_i, u_j)$ is the forwarding behavior similarity probability of users $u_i$ and $u_j$, determined from the user forwarding behavior similarity features and computed as follows:
computing, for each pair of users, the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature;
normalizing each feature in the user forwarding behavior similarity feature vector, which comprises the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature, and using the normalized vector to compute the user forwarding behavior similarity probability $P(u_i, u_j)$;
given that the normalized user forwarding behavior similarity feature vector corresponding to user $u_i$ is $z_i$, computing the user forwarding behavior similarity probability $P(u_i, u_j)$ according to the following formula:
where $\omega$ is the feature weight vector, obtained by minimizing a specific risk function with the gradient descent algorithm, that is:
where $\omega^*$ is the optimal value of $\omega$, $l(\cdot)$ is the cross-entropy loss function, $m$ is the number of samples, $\|\cdot\|_2$ is the L2-norm regularization term, and $\lambda_4$ is the parameter that controls the regularization strength.
Preferably, the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature are computed as follows:
the historical microblogs of users $u_i$ and $u_j$ are aggregated into documents $d_i$ and $d_j$ respectively, and the cosine distance between the LDA topic distribution vectors of the two documents is taken as the topic preference similarity between users $u_i$ and $u_j$:
$L(d_i, d_j) = \dfrac{\mathrm{LDA}(d_i) \cdot \mathrm{LDA}(d_j)}{\|\mathrm{LDA}(d_i)\| \, \|\mathrm{LDA}(d_j)\|}$
where $L(d_i, d_j)$ is the topic preference similarity between users $u_i$ and $u_j$, $\mathrm{LDA}(d_i)$ is the probability distribution of document $d_i$ over the predetermined topics, and $\mathrm{LDA}(d_j)$ is the probability distribution of document $d_j$ over the predetermined topics;
when users $u_i$ and $u_j$ follow each other, the mutual-following feature is set to 1; if there is only one-way following, the mutual-following feature is set to 0;
the common-following feature measure is computed according to the following formula:
where $S_{ij}$ is the common-following feature measure between users $u_i$ and $u_j$, $U_i$ denotes all users followed by user $u_i$, and $U_j$ denotes all users followed by user $u_j$;
the mutual-forwarding feature measure is computed according to the following formula:
$R_{ij} = \max(T_{ij}/T_i,\ T_{ji}/T_i)$
where $R_{ij}$ is the mutual-forwarding feature measure of users $u_i$ and $u_j$, $T_{ij}$ is the number of microblogs of user $u_j$ forwarded by user $u_i$, $T_i$ is the total number of microblogs forwarded by user $u_i$, and $T_{ji}$ is the number of microblogs of user $u_i$ forwarded by user $u_j$;
the common-forwarding feature measure is computed according to the following formula:
where $M_{ij}$ is the common-forwarding feature measure between users $u_i$ and $u_j$, and $T_j$ is the total number of microblogs forwarded by user $u_j$.
Preferably, each feature in the user forwarding behavior similarity feature vector is denoted $g$ and normalized according to the following formula:
$g' = \dfrac{g - g_{\min}}{g_{\max} - g_{\min}}$
where $g'$ is the normalized feature, $g_{\min}$ is the minimum of the current feature over all users, and $g_{\max}$ is the maximum of the current feature over all users.
Preferably, the group forwarding prior energy term is computed with the $P^n$-Potts model:
where the constant $\lambda_{\max}$ is the group forwarding prior penalty, $\rho$ is the mean of the pairwise forwarding behavior similarity probabilities $P(u_i, u_j)$ of the users in the $\tau$-ego network, and $Q$ is the proportion of users in the network whose local forwarding probability for microblog $T$ is below a specified threshold $\varepsilon$, that is, the number of such users divided by the number of users in the network.
The invention proposes a user forwarding behavior prediction algorithm based on MRF energy optimization. Its energy function fuses features such as user attributes and microblog content with user forwarding behavior constraints and the group forwarding prior, so user forwarding behavior can be predicted globally. Experimental results show that the prediction method of the invention effectively solves the problems of traditional algorithms, achieves higher overall performance, and has the following effects: (1) the many factors that influence user forwarding behavior (such as user attributes and microblog content) are analyzed systematically, and the features that influence users' common forwarding behavior in particular are examined in depth; (2) a user forwarding behavior prediction model based on MRF energy optimization is proposed, which combines features such as user attributes and microblog content with user forwarding behavior constraints and the group forwarding prior to predict user forwarding behavior globally, effectively improving overall prediction accuracy.
Brief description of the drawings
To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the steps of an energy-optimization-based method for predicting microblog users' forwarding behavior provided by an embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
With reference to Fig. 1, an embodiment of the invention provides an energy-optimization-based method for predicting microblog users' forwarding behavior. The method comprises:
Step 100: establish, from the information of the users in the social network, an energy function under the MRF (Markov Random Field) energy optimization framework:
where $E(Y)$ is the energy function; $Y$ is the set of forwarding behavior labels; $N$ is the number of users in the microblog social network; $D_T(y_i, u_i)$ is the forwarding behavior feature measure of user $u_i$ for microblog $T$; $y_i \in \{1, 0\}$ is the forwarding behavior label that may be assigned to user $u_i$, with $y_i \in Y$; $\lambda_1$ and $\lambda_2$ are weights; $N(i)$ is the set of indices of the users that have a direct following relationship with user $u_i$; $\psi_{i,j} \cdot \delta(y_i \neq y_j)$ is the user forwarding behavior constraint term; $\psi_{i,j}$ is the penalty; $\delta(\cdot)$ is the indicator function, which takes the value 1 when its argument is true and 0 otherwise; each user $u_i$ has a corresponding $\tau$-ego network, whose scale is controlled by the parameter $\tau$, and the group forwarding prior energy term is defined over the set of users in that network.
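As a rough illustration, the terms defined above can be combined as in the following Python sketch. The energy formula itself does not appear in this text, so the additive form used here (a data term, plus $\lambda_1$ times the pairwise constraint term, plus $\lambda_2$ times a group prior term) is an assumption inferred from the definitions, and `energy`, `group_prior`, and all argument names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def energy(
    labels: Sequence[int],                    # y_i in {0, 1} for each user u_i
    local_prob: Sequence[float],              # P(u_i, T) for each user
    neighbors: Dict[int, List[int]],          # N(i): users directly linked to u_i
    penalty: Callable[[int, int], float],     # psi_{i,j}
    group_prior: Callable[[int, Sequence[int]], float],  # group term for u_i's ego network
    lam1: float,
    lam2: float,
) -> float:
    """Evaluate an assumed additive form of E(Y) for one labeling."""
    n = len(labels)
    # Data term: D_T(y_i, u_i) = |y_i - P(u_i, T)|
    data = sum(abs(labels[i] - local_prob[i]) for i in range(n))
    # Pairwise term: psi_{i,j} * delta(y_i != y_j) over direct neighbors
    pair = sum(
        penalty(i, j)
        for i in range(n)
        for j in neighbors.get(i, [])
        if labels[i] != labels[j]
    )
    # Group term: one prior energy per user's ego network (form assumed)
    group = sum(group_prior(i, labels) for i in range(n))
    return data + lam1 * pair + lam2 * group
```

Passing the penalty and group prior as callables keeps the sketch independent of how those terms are actually defined.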
The parts of the energy function $E(Y)$ are computed as follows:
$D_T(y_i, u_i) = |y_i - P(u_i, T)|$
where $P(u_i, T)$ is the local forwarding probability of user $u_i$ for microblog $T$, computed as follows:
First sub-step: obtain the user attribute features. In this embodiment the user attribute features are chosen as the following 6 features: number of users followed, number of followers, whether the account is verified, number of microblogs posted, number of microblogs that have been forwarded, and forwarding activity. These 6 features can generally be obtained directly from the microblog data.
Second sub-step: obtain the content of microblog $T$ and the features related to the microblog content. In this embodiment the features related to the microblog content are chosen as the following 5 features: topic similarity, number of times forwarded, microblog content length, whether the microblog contains a URL, and whether it contains @.
The topic similarity is computed as follows: the microblogs that user $u_i$ has posted and forwarded in the past are aggregated into a document $d_i$; an LDA (Latent Dirichlet Allocation) model is then used to compute the probability distributions of the document and of microblog $T$ over 50 predetermined topics (such as education and military affairs); finally, the topic similarity is determined with the cosine distance, that is:
$L(d_i, T) = \dfrac{\mathrm{LDA}(d_i) \cdot \mathrm{LDA}(T)}{\|\mathrm{LDA}(d_i)\| \, \|\mathrm{LDA}(T)\|}$
where $L(d_i, T)$ is the topic similarity between document $d_i$ and microblog $T$, $\mathrm{LDA}(d_i)$ is the probability distribution of document $d_i$ over the predetermined topics, and $\mathrm{LDA}(T)$ is the probability distribution of microblog $T$ over the predetermined topics.
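The cosine comparison of two topic distributions can be sketched as follows. The sketch assumes the topic distributions have already been produced by some LDA implementation; the function name `topic_similarity` is hypothetical, and returning 0 for an all-zero vector is an assumption made only to avoid division by zero.

```python
import math
from typing import Sequence


def topic_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine similarity between two topic distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    # Assumption: an all-zero vector carries no topic evidence, so return 0
    return dot / norm if norm > 0 else 0.0
```

In the embodiment each vector would have 50 entries, one per predetermined topic.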
Third sub-step: normalize each feature in the user forwarding behavior feature vector, which comprises the user attribute features and the features related to the microblog content, and use the normalized vector to compute the local forwarding probability $P(u_i, T)$.
Specifically, each feature in the user forwarding behavior feature vector is denoted $f$ and normalized according to the following formula:
$f' = \dfrac{f - f_{\min}}{f_{\max} - f_{\min}}$
where $f'$ is the normalized feature, $f_{\min}$ is the minimum of the current feature over all users, and $f_{\max}$ is the maximum of the current feature over all users. In this embodiment, the 11 normalized features make up the normalized user forwarding behavior feature vector.
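A minimal sketch of this per-feature min-max normalization is given below. The function name `min_max_normalize` is hypothetical, and mapping a constant feature (where the maximum equals the minimum) to 0 is an assumption, since the text does not cover that case.

```python
from typing import List, Sequence


def min_max_normalize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale each feature column to [0, 1] across all users (one row per user)."""
    if not rows:
        return []
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column: assumed 0
            for v, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]
```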
Fourth sub-step: given that the normalized user forwarding behavior feature vector of user $u_i$ is $x_i$, the local forwarding probability $P(u_i, T)$ is computed according to the following formula:
where $w$ is the feature weight vector, obtained in this embodiment by minimizing a specific risk function with the gradient descent algorithm, that is:
where $w^*$ is the optimal value of $w$, $l(\cdot)$ is the cross-entropy loss function, $n$ is the number of samples, $\|\cdot\|_2$ is the L2-norm regularization term, and $\lambda_3$ is the parameter that controls the regularization strength.
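The two formulas referred to above are not reproduced in this text. A weight vector trained with cross-entropy loss and L2 regularization is consistent with regularized logistic regression, so the sketch below assumes a sigmoid of the weighted feature sum for $P(u_i, T)$ and a risk of mean cross-entropy plus `lam` times the squared L2 norm. The function names, learning rate, and epoch count are hypothetical choices for illustration.

```python
import math
from typing import List, Sequence


def sigmoid(t: float) -> float:
    # Numerically stable logistic function
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Assumed form of P(u_i, T): sigmoid of the weighted feature sum."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def fit(xs: Sequence[Sequence[float]], ys: Sequence[int],
        lam: float = 0.01, lr: float = 0.5, epochs: int = 2000) -> List[float]:
    """Gradient descent on mean cross-entropy plus lam * ||w||^2 (assumed risk)."""
    n, d = len(xs), len(xs[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [2.0 * lam * wk for wk in w]       # gradient of the L2 penalty
        for x, y in zip(xs, ys):
            err = predict(w, x) - y               # derivative of the loss w.r.t. w.x
            for k in range(d):
                grad[k] += err * x[k] / n
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w
```

The same routine could be reused for the similarity probability $P(u_i, u_j)$ by passing the similarity feature vectors instead.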
The penalty $\psi_{i,j}$ is defined as:
$\psi_{i,j} = \exp(P(u_i, u_j)/\sigma)$
where the parameter $\sigma$ controls the penalty strength and $P(u_i, u_j)$ is the forwarding behavior similarity probability of users $u_i$ and $u_j$, determined from the user forwarding behavior similarity features (such as interest preference and common following). The similarity probability is computed as follows:
Fifth sub-step: compute, for each pair of users, the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature.
Specifically, the historical microblogs of users $u_i$ and $u_j$ (both original and forwarded) are aggregated into documents $d_i$ and $d_j$ respectively, and the cosine distance between the LDA topic distribution vectors of the two documents is taken as the topic preference similarity between users $u_i$ and $u_j$:
$L(d_i, d_j) = \dfrac{\mathrm{LDA}(d_i) \cdot \mathrm{LDA}(d_j)}{\|\mathrm{LDA}(d_i)\| \, \|\mathrm{LDA}(d_j)\|}$
where $L(d_i, d_j)$ is the topic preference similarity between users $u_i$ and $u_j$, $\mathrm{LDA}(d_i)$ is the probability distribution of document $d_i$ over the predetermined topics, and $\mathrm{LDA}(d_j)$ is the probability distribution of document $d_j$ over the predetermined topics.
When users $u_i$ and $u_j$ follow each other, the mutual-following feature is set to 1; if there is only one-way following, the mutual-following feature is set to 0.
The common-following feature measure is computed according to the following formula:
where $S_{ij}$ is the common-following feature measure between users $u_i$ and $u_j$, $U_i$ denotes all users followed by user $u_i$, and $U_j$ denotes all users followed by user $u_j$.
The mutual-forwarding feature measure is computed according to the following formula:
$R_{ij} = \max(T_{ij}/T_i,\ T_{ji}/T_i)$
where $R_{ij}$ is the mutual-forwarding feature measure of users $u_i$ and $u_j$, $T_{ij}$ is the number of microblogs of user $u_j$ forwarded by user $u_i$, $T_i$ is the total number of microblogs forwarded by user $u_i$, and $T_{ji}$ is the number of microblogs of user $u_i$ forwarded by user $u_j$.
The common-forwarding feature measure is computed according to the following formula:
where $M_{ij}$ is the common-forwarding feature measure between users $u_i$ and $u_j$, and $T_j$ is the total number of microblogs forwarded by user $u_j$.
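Of the pairwise features above, the mutual-following feature and the mutual-forwarding measure are fully specified in the text and can be sketched directly. The function names and the `follows` mapping are hypothetical. Two cases are not covered by the text and are handled by assumption: a pair with no following in either direction is given 0, and a user with no forwarded microblogs is given a measure of 0.

```python
from typing import Dict, Set


def mutual_follow(follows: Dict[str, Set[str]], ui: str, uj: str) -> int:
    """1 when ui and uj follow each other, 0 otherwise."""
    return int(uj in follows.get(ui, set()) and ui in follows.get(uj, set()))


def mutual_forwarding(t_ij: int, t_ji: int, t_i: int) -> float:
    """R_ij = max(T_ij / T_i, T_ji / T_i), following the formula as printed."""
    if t_i == 0:
        return 0.0  # assumption: no forwarding history means no evidence
    return max(t_ij / t_i, t_ji / t_i)
```

The common-following and common-forwarding measures are left out because their formulas are not reproduced in this text.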
Sixth sub-step: normalize each feature in the user forwarding behavior similarity feature vector, which comprises the topic preference similarity, the mutual-following feature, the common-following feature, the mutual-forwarding feature, and the common-forwarding feature, and use the normalized vector to compute the user forwarding behavior similarity probability $P(u_i, u_j)$.
Specifically, each feature in the user forwarding behavior similarity feature vector is denoted $g$ and normalized according to the following formula:
$g' = \dfrac{g - g_{\min}}{g_{\max} - g_{\min}}$
where $g'$ is the normalized feature, $g_{\min}$ is the minimum of the current feature over all users, and $g_{\max}$ is the maximum of the current feature over all users.
Seventh sub-step: given that the normalized user forwarding behavior similarity feature vector corresponding to user $u_i$ is $z_i$, the user forwarding behavior similarity probability $P(u_i, u_j)$ is computed according to the following formula:
where $\omega$ is the feature weight vector, obtained in this embodiment by minimizing a specific risk function with the gradient descent algorithm, that is:
where $\omega^*$ is the optimal value of $\omega$, $l(\cdot)$ is the cross-entropy loss function, $m$ is the number of samples, $\|\cdot\|_2$ is the L2-norm regularization term, and $\lambda_4$ is the parameter that controls the regularization strength.
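Once a similarity probability is available, the penalty $\psi_{i,j} = \exp(P(u_i, u_j)/\sigma)$ follows directly. The sketch below is only an illustration; the function name is hypothetical, and requiring a positive $\sigma$ is an assumption, since the text says only that $\sigma$ controls the penalty strength.

```python
import math


def penalty(similarity_prob: float, sigma: float) -> float:
    """psi_{i,j} = exp(P(u_i, u_j) / sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")  # assumption, not from the text
    return math.exp(similarity_prob / sigma)
```

With a positive $\sigma$, pairs of users with more similar forwarding behavior pay a larger penalty for receiving different labels.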
The group forwarding prior energy term is computed with the $P^n$-Potts model:
where the constant $\lambda_{\max}$ is the group forwarding prior penalty, $\rho$ is the mean of the pairwise forwarding behavior similarity probabilities $P(u_i, u_j)$ of the users in the $\tau$-ego network, and $Q$ is the proportion of users in the network whose local forwarding probability for microblog $T$ is below a specified threshold $\varepsilon$, that is, the number of such users divided by the number of users in the network.
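The two statistics $\rho$ and $Q$ are described in words above and can be sketched as follows. The $P^n$-Potts energy itself is not reproduced in this text, so the sketch stops at the statistics. The function and argument names are hypothetical, and returning 0 for an empty or single-user network is an assumption.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple


def group_statistics(
    members: Sequence[int],                      # users in one ego network
    local_prob: Callable[[int], float],          # P(u_i, T)
    pair_prob: Callable[[int, int], float],      # P(u_i, u_j)
    eps: float,                                  # threshold epsilon
) -> Tuple[float, float]:
    """Return (rho, Q) for one ego network."""
    pairs = list(combinations(members, 2))
    # rho: mean pairwise similarity probability over all user pairs
    rho = sum(pair_prob(i, j) for i, j in pairs) / len(pairs) if pairs else 0.0
    # Q: share of users whose local forwarding probability is below eps
    low = sum(1 for i in members if local_prob(i) < eps)
    q = low / len(members) if members else 0.0
    return rho, q
```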
Step 200: because solving the energy function $E(Y)$ is an NP-hard problem, this embodiment uses the Graph Cuts algorithm to obtain an approximately optimal solution for the forwarding behavior label set $Y$, which completes the prediction of user forwarding behavior.
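The text names Graph Cuts but does not give the graph construction. The sketch below is a simplified illustration that covers only the data term and the pairwise constraint term, for which a single s-t minimum cut gives the exact minimizer; the group forwarding prior term is left out. The function name is hypothetical, the weight $\lambda_1$ is assumed to be folded into the edge weights, each pair is counted once, and Edmonds-Karp max-flow is used purely for brevity.

```python
from collections import deque
from typing import Dict, List, Sequence, Tuple


def min_cut_labels(
    local_prob: Sequence[float],              # P(u_i, T) per user
    edges: Dict[Tuple[int, int], float],      # undirected pair (i, j) -> weighted penalty
) -> List[int]:
    """Minimize sum |y_i - P_i| + sum w_ij * [y_i != y_j] with one s-t min cut."""
    n = len(local_prob)
    s, t = n, n + 1
    cap: List[Dict[int, float]] = [dict() for _ in range(n + 2)]

    def add(u: int, v: int, c: float) -> None:
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)             # residual arc

    for i, p in enumerate(local_prob):
        add(s, i, p)                          # cut when y_i = 0, cost |0 - P_i|
        add(i, t, 1.0 - p)                    # cut when y_i = 1, cost |1 - P_i|
    for (i, j), w in edges.items():
        add(i, j, w)
        add(j, i, w)

    while True:
        # Breadth-first search for a shortest augmenting path
        parent = {s: s}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        # Find the bottleneck capacity, then push that much flow
        flow, v = float("inf"), t
        while v != s:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u
    # Users still reachable from the source take label 1 (forward)
    return [1 if i in parent else 0 for i in range(n)]
```

For example, with local probabilities 0.9, 0.4, and 0.1 and a strong link between the first two users, the second user is pulled to label 1 even though its own probability is below 0.5.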
It should be noted that if the microblog social network contains many users, solving the energy function $E(Y)$ may be very costly. To address this, the invention uses a fast community detection algorithm to divide a large social network into several smaller sub-networks, solves the corresponding energy function for each sub-network, and merges the results as the solution for the original social network.
Experimental data and analysis
This embodiment uses the dataset (D1 for short) released with the article "Who influenced you? Predicting retweet via social influence locality", published by J. Zhang et al. in 2015 in ACM Transactions on Knowledge Discovery from Data, to verify the feasibility of the above prediction method. The dataset contains the basic information (such as name, gender, and follower count) of 1,787,443 Sina Weibo users, the users' 1,000 most recently posted microblogs, and the social relationship structure between users. In addition, to further verify the effectiveness of the prediction method of the invention, a dataset characterized by microblog forwarding depth (D2 for short) was obtained through the open API of Sina Weibo. In this process, the crawler first randomly selected 10,000 popular microblogs as seeds, then captured in depth-first fashion all forwarding users of each seed microblog, together with the followers and followed users of each forwarding user, finally obtaining the basic information and social relationship structure of 1,132,145 users in total.
Table 1. User forwarding behavior prediction results ($\tau = 1$, $\lambda_1 = 0.6$, $\lambda_2 = 0.3$)
In Table 1, recall is the proportion of users correctly predicted as "forwarding" among all users who actually forwarded; precision is the proportion of users correctly predicted as "forwarding" among all users predicted as "forwarding"; and the F1 measure is a combined indicator, i.e., precision × recall × 2 / (precision + recall). SVM_1 and LERBP_1 are the prediction results of SVM (Support Vector Machine) and LERBP (Local Energy-based Retweet Behavior Predicting, i.e., the local forwarding probability), respectively, using user forwarding behavior features that include both user attributes and microblog content; SVM_2 and LERBP_2 are the prediction results of SVM and LERBP, respectively, using user forwarding behavior features that exclude user attributes; Algorithm 1 is the algorithm disclosed by J. Zhang et al. in the article "Who Influenced You? Predicting Retweet via Social Influence Locality", published in ACM Transactions on Knowledge Discovery from Data in 2015; Algorithm 2 is the algorithm disclosed by X. Tang et al. in the article "Predicting individual retweet behavior by user similarity: A multi-task learning approach", published in Knowledge-Based Systems in 2015; PERBP is the prediction result of the Pairwise ERBP (Energy-based Retweet Behavior Predicting) model, and ERBP is the prediction result of the prediction method of the present invention.
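The three indicators defined for Table 1 can be illustrated with a minimal Python sketch. The function name `forwarding_metrics` and the toy labels are hypothetical and are not taken from the experiments; 1 stands for "forwarding" and 0 for "not forwarding", and precision here means the share of users predicted as "forwarding" that are predicted correctly.

```python
def forwarding_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for the positive class 1 ("forwarding")."""
    # Users that actually forwarded and were also predicted as forwarding.
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    predicted_pos = sum(1 for p in y_pred if p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)

    # Guard against empty denominators (a choice made for this sketch).
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    denom = precision + recall
    # F1 = precision x recall x 2 / (precision + recall)
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = forwarding_metrics(y_true, y_pred)
```

Returning zero when a denominator is empty is one possible convention; an evaluation script could equally report such cases as undefined.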
The results show that, because SVM and LERBP do not consider the social relationships between users, their overall prediction accuracy is generally low, with SVM performing relatively better. On the other hand, using the follower count and the followed-user count, which reflect social relationships between users, as feature components for forwarding behavior prediction cannot accurately describe the social relationship features between users, and therefore brings no substantial improvement in prediction accuracy.
In contrast, Algorithm 1 considers the influence of forwarding behavior constraints between users and of the social network structure within a local range, and can therefore predict users' forwarding behavior better. In fact, in a microblog social network, social relationships between users usually cause their forwarding behaviors to influence one another; this can even change the users' own preferences and interests and make their forwarding behavior tend toward local consistency. However, because Algorithm 1 does not consider the global characteristics of user forwarding behavior constraints, it is difficult for it to achieve better prediction accuracy. Likewise, although Algorithm 2 uses a multi-task learning method and user forwarding behavior similarity features to highlight the personalized differences in the forwarding behavior of different users, it does not consider the influence among the forwarding behaviors of multiple users, and therefore also fails to achieve higher prediction accuracy.
By comparison, PERBP better fuses the user forwarding behavior features and the user forwarding behavior constraints under the MRF energy optimization framework. This helps not only to highlight the personalized differences in the forwarding behavior of different users, but also to capture the commonality of multi-user forwarding behavior in the social network, and thus yields globally optimized prediction results. ERBP builds on PERBP by incorporating the group forwarding prior, which further describes the influence of multi-user forwarding behavior within a user's social circle; it therefore reflects the essential characteristics of user forwarding behavior more accurately, and the corresponding prediction accuracy is further improved.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include them.
Claims (9)
1. A microblog user forwarding behavior prediction method based on energy optimization, characterized in that the method comprises:
establishing, according to the information of users in a social network, an energy function under the MRF energy optimization framework:
wherein E(Y) is the energy function; Y is the set of forwarding behavior labels; N is the number of users in the microblog social network; D_T(y_i, u_i) is the user forwarding behavior feature metric of user u_i for microblog T; y_i ∈ {1, 0} denotes the forwarding behavior label that may be assigned to user u_i, with y_i ∈ Y; λ_1 and λ_2 are weights; N(i) is the set of indices of the users that have a direct following relationship with user u_i; ψ_{i,j}·δ(y_i ≠ y_j) is the user forwarding behavior constraint term; ψ_{i,j} is the penalty; δ(·) is an indicator function that takes the value 1 when its argument is true and 0 otherwise; a τ-ego network set corresponds to each user u_i, the parameter τ controlling the scale of the τ-ego network; and the energy function includes a group forwarding prior energy term;
solving the energy function E(Y) using the Graph Cuts algorithm to obtain an approximately optimal solution for the forwarding behavior label set Y, thereby completing the prediction of user forwarding behavior.
2. The method according to claim 1, characterized in that the user forwarding behavior feature metric D_T(y_i, u_i) is expressed as:
D_T(y_i, u_i) = |y_i - P(u_i, T)|
wherein P(u_i, T) is the local forwarding probability of user u_i for microblog T, and the local forwarding probability is calculated as follows:
obtaining user attribute features, and the content of microblog T and the features related to the microblog content;
normalizing each feature in the user forwarding behavior feature vector, which includes the user attribute features and the features related to the microblog content, and using the normalized user forwarding behavior feature vector to calculate the local forwarding probability P(u_i, T);
given that the normalized user forwarding behavior feature vector corresponding to user u_i is x_i, calculating the local forwarding probability P(u_i, T) according to the following formula:
wherein w is the feature weight vector, obtained in this embodiment by minimizing a specific risk function with the gradient descent algorithm, i.e.:
wherein w* is the optimal value of w, l(·) is the cross-entropy loss function, n is the number of samples, ||·||_2 is the L2-norm regularization term, and λ_3 is the parameter controlling the regularization strength.
3. The method according to claim 2, characterized in that the user attribute features are selected as the following 6 features: number of followed users, number of followers, whether the user is verified, number of posted microblogs, number of forwarded microblogs, and forwarding activity, these 6 features being obtained directly from the microblog data; and the features related to the microblog content are selected as the following 5 features: topic similarity, number of times forwarded, microblog content length, whether a URL is included, and whether @ is included.
4. The method according to claim 3, characterized in that the topic similarity is calculated as follows: taking the content of microblog T, and aggregating the historical original and forwarded microblogs of user u_i into a document d_i; then using the LDA model to calculate the probability distributions of the document and of the microblog over the predetermined topics; and finally determining the corresponding topic similarity using the cosine distance, i.e.:
L(d_i, T) = (LDA(d_i) · LDA(T)) / (||LDA(d_i)|| ||LDA(T)||)
wherein L(d_i, T) is the topic similarity between document d_i and microblog T, LDA(d_i) is the probability distribution of document d_i over the predetermined topics, and LDA(T) is the probability distribution of microblog T over the predetermined topics.
5. The method according to claim 3, characterized in that each feature in the user forwarding behavior feature vector is denoted by f, and the feature is normalized according to the following formula:
f' = (f - f_min) / (f_max - f_min)
wherein f' is the normalized feature, f_min is the minimum value of the current feature over all users, and f_max is the maximum value of the current feature over all users; the normalized user forwarding behavior feature vector then consists of the 11 normalized features.
6. The method according to claim 1, characterized in that the penalty ψ_{i,j} is defined as:
ψ_{i,j} = exp(P(u_i, u_j) / σ)
wherein the parameter σ controls the penalty strength, and P(u_i, u_j) is the user forwarding behavior similarity probability of users u_i and u_j, determined from the user forwarding behavior similarity features and calculated as follows:
calculating, for each pair of users, the topic preference similarity, the mutual following feature, the common following feature, the mutual forwarding feature, and the common forwarding feature;
normalizing each feature in the user forwarding behavior similarity feature vector, which includes the topic preference similarity, the mutual following feature, the common following feature, the mutual forwarding feature, and the common forwarding feature, and using the normalized user forwarding behavior similarity feature vector to calculate the user forwarding behavior similarity probability P(u_i, u_j);
given that the normalized user forwarding behavior similarity feature vector corresponding to user u_i is z_i, calculating the user forwarding behavior similarity probability P(u_i, u_j) according to the following formula:
wherein ω is the feature weight vector, obtained by minimizing a specific risk function with the gradient descent algorithm, i.e.:
wherein ω* is the optimal value of ω, l(·) is the cross-entropy loss function, m is the number of samples, ||·||_2 is the L2-norm regularization term, and λ_4 is the parameter controlling the regularization strength.
7. The method according to claim 6, characterized in that the topic preference similarity, the mutual following feature, the common following feature, the mutual forwarding feature, and the common forwarding feature are calculated as follows:
aggregating the historical microblogs of users u_i and u_j into documents d_i and d_j, respectively, and then taking the cosine distance between the LDA topic distribution vectors of the two documents as the topic preference similarity between users u_i and u_j:
L(d_i, d_j) = (LDA(d_i) · LDA(d_j)) / (||LDA(d_i)|| ||LDA(d_j)||)
wherein L(d_i, d_j) is the topic preference similarity between users u_i and u_j, LDA(d_i) is the probability distribution of document d_i over the predetermined topics, and LDA(d_j) is the probability distribution of document d_j over the predetermined topics;
when users u_i and u_j follow each other, the mutual following feature takes the value 1; if only one-way following exists, the mutual following feature takes the value 0;
the common following feature metric is calculated according to the following formula:
wherein S_ij is the common following feature metric between users u_i and u_j, U_i denotes all users followed by user u_i, and U_j denotes all users followed by user u_j;
the mutual forwarding feature metric is calculated according to the following formula:
R_ij = max(T_ij / T_i, T_ji / T_i)
wherein R_ij denotes the mutual forwarding feature metric of users u_i and u_j, T_ij denotes the number of microblogs of user u_j forwarded by user u_i, T_i denotes the total number of microblogs forwarded by user u_i, and T_ji denotes the number of microblogs of user u_i forwarded by user u_j;
the common forwarding feature metric is calculated according to the following formula:
wherein M_ij denotes the common forwarding feature metric between users u_i and u_j, and T_j denotes the total number of microblogs forwarded by user u_j.
8. The method according to claim 6, characterized in that each feature in the user forwarding behavior similarity feature vector is denoted by g, and the feature is normalized according to the following formula:
g' = (g - g_min) / (g_max - g_min)
wherein g' is the normalized feature, g_min is the minimum value of the current feature over all users, and g_max is the maximum value of the current feature over all users.
9. The method according to claim 1, characterized in that the group forwarding prior energy term is calculated using the P^n-Potts model:
wherein the constant λ_max is the group forwarding prior penalty; ρ is the mean of the forwarding behavior similarity probabilities P(u_i, u_j) over all pairs of users in the network; and Q is the proportion of users in the network whose local forwarding probability for microblog T is less than a specified threshold ε, i.e.:
wherein the remaining symbol denotes the number of users in the network.
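Several of the quantities recited in the claims above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: all function names are hypothetical; the logistic form of the probability is an assumption suggested by the cross-entropy loss, since the formula itself is not preserved in this text; and `pairwise_energy` assumes a plain sum of feature metrics plus λ_1-weighted constraint penalties, leaving out the group forwarding prior term and the Graph Cuts solver.

```python
import math


def min_max_normalize(values):
    """Min-max normalization of one feature over all users (claims 5 and 8)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # assumption: a constant feature is mapped to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def cosine_similarity(p, q):
    """Cosine measure between two topic distributions (claims 4 and 7)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def logistic_probability(weights, features):
    """ASSUMPTION: sigmoid of the weighted feature sum (formula not in the text)."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def feature_metric(y, p_local):
    """D_T(y_i, u_i) = |y_i - P(u_i, T)| as given in claim 2."""
    return abs(y - p_local)


def penalty(p_sim, sigma):
    """psi_{i,j} = exp(P(u_i, u_j) / sigma) as given in claim 6."""
    return math.exp(p_sim / sigma)


def pairwise_energy(labels, p_local, edges, p_sim, lambda1, sigma):
    """ASSUMPTION: feature metrics plus lambda1 times the penalties of
    followed pairs whose labels differ; the group prior term is omitted."""
    data = sum(feature_metric(y, p) for y, p in zip(labels, p_local))
    constraint = sum(penalty(p_sim[(i, j)], sigma)
                     for (i, j) in edges if labels[i] != labels[j])
    return data + lambda1 * constraint


# Toy network of three users; every number here is made up for illustration.
labels = [1, 1, 0]
p_local = [0.9, 0.6, 0.2]
edges = [(0, 1), (1, 2)]
p_sim = {(0, 1): 0.8, (1, 2): 0.3}
energy = pairwise_energy(labels, p_local, edges, p_sim, lambda1=0.6, sigma=1.0)
```

In this toy example only the pair (1, 2) carries different labels, so a single penalty enters the energy. A solver would search over label assignments for the one with the lowest energy rather than evaluate a single assignment as done here.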
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610978548.2A CN106547901A (en) | 2016-11-08 | 2016-11-08 | It is a kind of to forward behavior prediction method based on energy-optimised microblog users |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106547901A true CN106547901A (en) | 2017-03-29 |
Family
ID=58394338
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610978548.2A Pending CN106547901A (en) | 2016-11-08 | 2016-11-08 | It is a kind of to forward behavior prediction method based on energy-optimised microblog users |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106547901A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170329 |