CN105913159A - Social network event based user's influence prediction method - Google Patents
- Publication number
- CN105913159A (application CN201610279983.6A)
- Authority
- CN
- China
- Prior art keywords
- user
- event
- matrix
- correlation
- influence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a method for predicting user influence based on social network events. The method comprises: building a user influence matrix S from the influence of M users on N events in a social network; building a user correlation matrix U from the feature information of the users; building an event correlation matrix E from the feature information of the events; and determining user influence from the user influence matrix S, the user correlation matrix U and the event correlation matrix E. The invention integrates event correlation and user correlation into a matrix factorization prediction model, yielding a new prediction model, MF-EUN, for predicting user influence based on social network events, which improves the accuracy of the prediction results. The method can also predict user influence in social network events more comprehensively.
Description
Technical field
The present invention relates to data mining technology, and in particular to a method for predicting user influence based on social network events. It belongs to the field of information science and technology.
Background
With the rapid development of Internet technology, a large number of social networks have emerged at home and abroad, such as Facebook, Twitter, WeChat and Weibo. More and more users choose to post journals, upload photos and take part in all kinds of online activities through such social networks. By interacting on social networks, users can not only keep in touch with their friends but also get to know more people and expand their social circles. Today, purely online interaction can no longer meet users' needs, and event-based social networks have emerged in response, such as Meetup, Plancast, Google+ Events and Douban Events (Douban Tongcheng). In addition to supporting online interaction, these applications and services give users an online platform for publishing, organizing, managing and participating in social events.
Social influence is the phenomenon in which a user's behaviour and thinking change under the influence of other people. Social influence analysis is widely applied in many fields, and there is already a large body of results on user influence in social networks. However, event-based social networks have unique characteristics: for example, events have location information and organizers. As a result, influence analysis or prediction methods designed for traditional social networks may not be well suited to event-based social networks, and their predictions may be unsatisfactory and inaccurate. A method for predicting user influence in event-based social networks is therefore needed, one that makes full use of the characteristics of social network events to improve the accuracy of user influence prediction.
Summary of the invention
Embodiments of the present invention provide a method for predicting user influence based on social network events, which can improve the accuracy of user influence prediction in event-based social networks.
The method provided by embodiments of the present invention includes:
establishing a user influence matrix S according to the influence of M users in N events, where the element $s_{ue}$ of S represents the proportion of user u's friends that user u influences in event e, u and e are integers with 1 ≤ u ≤ M and 1 ≤ e ≤ N, and M and N are integers greater than 1;
establishing a user correlation matrix U according to the feature information of the M users, where the element $u_{uu'}$ of U represents the degree of correlation between user u and user u', and u' is an integer with 1 ≤ u' ≤ M;
establishing an event correlation matrix E according to the feature information of the N events, where the element $e_{ee'}$ of E represents the degree of correlation between event e and event e', and e' is an integer with 1 ≤ e' ≤ N;
determining, according to the user influence matrix S, the user correlation matrix U and the event correlation matrix E, a user feature vector matrix P, an event feature vector matrix Q, a user correlation influence factor matrix W and an event correlation influence factor matrix Z, where P and Q are respectively the user feature vector matrix and the event feature vector matrix obtained by matrix factorization of the user influence matrix S, and W and Z are respectively the influence values of user correlation and event correlation on user influence in social network events; and
determining the user influence in social network events according to the user feature vector matrix P, the event feature vector matrix Q, the user correlation influence factor matrix W and the event correlation influence factor matrix Z.
In summary, in the method provided by embodiments of the present invention, the user influence matrix S is built from the influence of M users in N events in the social network, the user correlation matrix U is built from user feature information, and the event correlation matrix E is built from event feature information. Based on S, U and E, event correlation and user correlation are fused into the matrix factorization prediction model, which yields more accurate user and event feature vector matrices P and Q together with the user correlation influence factor matrix W and the event correlation influence factor matrix Z. An accurate user influence prediction can then be derived from P, Q, W and Z, and the method can predict user influence in social network events more comprehensively.
Brief description of the drawings
To illustrate the technical solutions of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the user influence prediction method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the framework of the MF-EUN prediction model provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the influence distribution of a randomly selected user over different regions;
Fig. 4 is a schematic diagram of the probability distribution of distances between all events attended by a randomly selected user;
Fig. 5 is a schematic comparison of the performance of the AI-UN method provided by an embodiment of the present invention with other neighbour discovery methods;
Fig. 6 is a schematic comparison of the performance of the AI-EN method provided by an embodiment of the present invention with other neighbour discovery methods.
Detailed description
To make the objectives, technical solutions and advantages of the embodiments clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
User influence can be applied in scenarios such as information dissemination in social networks, information recommendation, promotion of goods or services, and advertising. By selecting highly influential users as the preferred promotion audience and relying on their celebrity appeal, information, goods or services can be promoted to more people. Identifying and making use of highly influential users is therefore of great significance for promoting network security and the network economy.
Likewise, the prediction of user influence based on social network events provided by embodiments of the present invention plays an important role in promoting and publicizing social network activities or events. The embodiments may be executed by an application or network server that provides users with an online platform service for publishing, organizing, managing and participating in social events.
On the server, the set of users is $\{u_1, u_2, \ldots, u_M\}$ and the set of events is $\{e_1, e_2, \ldots, e_N\}$. M is the total number of users and N is the total number of events; both are integers greater than 1. First, the event-level influence of users can be obtained from the records on the server, and the user influence matrix S is built from it.
Specifically, the influence $s_{ue}$ of user u in event e can be obtained as the proportion of user u's friends that u influences in event e, that is, $s_{ue} = \frac{w_{ue}}{|F(u)|}$, where $w_{ue}$ is the number of friends that user u influences in event e, $F(u)$ is the set of user u's friends, and $|F(u)|$ is the number of user u's friends.
Since there are M users and N events, the user influence matrix S built from each user's influence in each event is an M × N matrix. The element $s_{ue}$ of S represents the proportion of user u's friends that user u influences in event e, where u and e are integers with 1 ≤ u ≤ M and 1 ≤ e ≤ N.
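The construction of S can be illustrated with a minimal sketch. The function name `build_influence_matrix` and the input layout (a dictionary of influenced-friend counts and a dictionary of friend sets) are assumptions made for illustration only; the patent does not prescribe any data format. Unknown entries are left as NaN so that they can be told apart from a true influence of zero.

```python
import numpy as np

def build_influence_matrix(influenced, friends, num_users, num_events):
    """Build the M x N user influence matrix S.

    influenced[(u, e)] is w_ue, the number of u's friends influenced in event e.
    friends[u] is F(u), the set of u's friends.
    Entries with no record stay NaN (unknown), not 0.
    """
    S = np.full((num_users, num_events), np.nan)
    for (u, e), w_ue in influenced.items():
        n_friends = len(friends.get(u, ()))
        if n_friends > 0:  # s_ue is undefined for a user with no friends
            S[u, e] = w_ue / n_friends
    return S

# Hypothetical toy data: user 0 has four friends and influenced two of them in event 1.
friends = {0: {1, 2, 3, 4}, 1: {0}}
influenced = {(0, 1): 2, (1, 0): 1}
S = build_influence_matrix(influenced, friends, num_users=2, num_events=2)
```

Keeping missing values as NaN matches the later description of S as a sparse matrix whose unknown elements are to be predicted.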
Note that the server holds a large number of users and a large number of events, and a user cannot influence friends in every event, so for many events a user's influence does not exist. Accordingly, many elements of the user influence matrix S are unknown; that is, S is a sparse matrix in which the vast majority of element values are missing. The following embodiments describe how to use the known influence data in S to predict the unknown influence data.
Event-based social networks have unique characteristics: events have location information, organizers and so on, and users have topic influence, regional influence and so on. Accordingly, each user u and each event e can be associated with feature vectors $P_u$ and $Q_e$ respectively. The elements of $P_u$ reflect how strongly the user is related to the corresponding features, and the elements of $Q_e$ reflect how strongly the event is related to the corresponding features. The influence of user u in event e can then be predicted by the inner product of their feature vectors. The feature vectors $P_u$ of all users and $Q_e$ of all events form the user feature vector matrix P and the event feature vector matrix Q respectively. P and Q characterize users and events, and their dimension can be specified: the higher the dimension, the more user and event features are captured, and the more precise the computation generally becomes. The predictions obtained from the inner product of P and Q can fill in the missing elements of S, giving a completed user influence matrix S'. Clearly $S' = P^T Q$, and the elements of S' are $s'_{ue} = P_u^T Q_e$. A comprehensive prediction of user influence can then be made from the completed matrix S'.
The matrix factorization (MF) algorithm is widely used in prediction models because of its high accuracy, good scalability and low computational complexity. Its basic idea is to approximate the known user influence matrix S by the product of two lower-dimensional matrices P and Q.
When training the MF prediction model, the elements of P and Q are first initialized randomly and then updated iteratively in the direction opposite to the gradient until P and Q converge.
The feature vector matrices P and Q of users and events are obtained by training the MF prediction model, and the completed user influence matrix S' is obtained from $S' = P^T Q$. However, because the prediction error of the MF model is relatively large, it may still be unable to predict the influence of all users in all events, and the accuracy of its predictions is not high.
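The plain MF training loop described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the squared-error loss with L2 regularization, the hyperparameters (`d`, `lr`, `lam`, `epochs`) and the fixed epoch count used in place of a convergence test are all assumptions. P and Q are stored with one column per user or event so that $S' = P^T Q$ holds literally.

```python
import numpy as np

def train_mf(S, d=2, lr=0.05, lam=0.01, epochs=500, seed=0):
    """Factorize the known entries of S (NaN = unknown) as S ~ P^T Q by SGD."""
    rng = np.random.default_rng(seed)
    M, N = S.shape
    P = 0.1 * rng.standard_normal((d, M))  # column u is the user vector P_u
    Q = 0.1 * rng.standard_normal((d, N))  # column e is the event vector Q_e
    known = np.argwhere(~np.isnan(S))      # (u, e) index pairs of known entries
    for _ in range(epochs):                # fixed epochs stand in for a convergence check
        for u, e in known[rng.permutation(len(known))]:
            err = S[u, e] - P[:, u] @ Q[:, e]
            grad_p = -err * Q[:, e] + lam * P[:, u]
            grad_q = -err * P[:, u] + lam * Q[:, e]
            P[:, u] -= lr * grad_p         # step against the gradient
            Q[:, e] -= lr * grad_q
    return P, Q

# Hypothetical toy matrix with three unknown entries.
S = np.array([[0.2, 0.4, np.nan],
              [0.1, np.nan, 0.3],
              [np.nan, 0.6, 0.9]])
P, Q = train_mf(S)
S_completed = P.T @ Q  # every entry, known or unknown, now has a prediction
```

Only known entries contribute to the updates, which is what allows the product to fill in the unknown ones.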
In embodiments of the present invention, correlations between events can be discovered from event feature information (such as event content, event location and event organizer), and correlations between users can be discovered from user feature information (such as social information, the user's influence on topics, the user's influence on regions and the user's influence on organizers). Event correlation and user correlation are then fused into the MF prediction model. The resulting model, Matrix Factorization with Event and User Neighborhood (MF-EUN), is used to predict user influence in social network events and improves the accuracy of the predictions.
Fig. 1 is a flowchart of a user influence prediction method provided by an embodiment of the present invention. As shown in Fig. 1, the method for predicting user influence based on social network events provided by this embodiment includes:
S11: establish the user influence matrix S according to the influence of M users in N events.
For example, the event-level influence of users can be obtained from the records on the network server and used to build the user influence matrix S. Specifically, the influence $s_{ue}$ of user u in event e can be obtained as the proportion of user u's friends that u influences in event e.
S12: establish the user correlation matrix U according to the feature information of the M users.
For example, similar interests and hobbies among users can be discovered from the social network relationships between them, and these relationships can be used as user feature information to build the user correlation matrix U.
On the one hand, optionally, U can be built from social information that exists between users (such as friend relationships or alumni relationships).
On the other hand, optionally, the social network relationships between users can be constructed from the user influence matrix S. For example, following the related art, the correlation between users can be computed with measures such as cosine similarity or the Pearson correlation coefficient.
S13: establish the event correlation matrix E according to the feature information of the N events.
For example, correlations between events can be discovered from feature information such as event content, venue and organizer. It will be understood that the correlation between events can also be computed from the user influence matrix S, for example, following the related art, with measures such as cosine similarity or the Pearson correlation coefficient.
Note that when social information exists between users, S11 and S12 need not be performed in any particular order. When no social information exists between users, S11 must be performed first to build the user influence matrix S, and S12 is performed afterwards: the social network relationships between users are constructed from S and used to build the user correlation matrix U, for example by computing the Pearson correlation coefficient between users from the influence data already present in S. Likewise, when related feature information exists between events (related content, related location or related organizer), S11 and S13 need not be performed in any particular order. When no such feature information exists, S11 must be performed first to build S, and S13 is performed afterwards: the correlations between events are constructed from S and used to build the event correlation matrix E, for example by computing the Pearson correlation coefficient between events from the influence data already present in S. It will also be understood that S12 and S13 need not be performed in any particular order.
S14: determine, according to the user influence matrix S, the user correlation matrix U and the event correlation matrix E, the user feature vector matrix P, the event feature vector matrix Q, the user correlation influence factor matrix W and the event correlation influence factor matrix Z.
Here, P and Q are respectively the user feature vector matrix and the event feature vector matrix obtained by matrix factorization of the user influence matrix S, and W and Z are respectively the influence values of user correlation and event correlation on user influence in social network events.
S15: determine the user influence in social network events according to the user feature vector matrix P, the event feature vector matrix Q, the user correlation influence factor matrix W and the event correlation influence factor matrix Z.
As described above, embodiments of the present invention fuse event correlation and user correlation into the MF prediction model and propose a new prediction model, MF-EUN, for predicting user influence based on social network events.
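Steps S12 and S13 mention that, when no feature information is available, correlations can be computed from S itself with the Pearson correlation coefficient. A minimal sketch of that fallback is shown below. The function name `pearson_rows`, the `min_overlap` threshold and the choice to score pairs with too little overlap as 0 are assumptions for illustration; the patent only names the measure.

```python
import numpy as np

def pearson_rows(S, min_overlap=2):
    """Pearson correlation between rows of S, using only co-observed entries.

    Pairs with fewer than min_overlap shared known entries, or with zero
    variance on the shared entries, are scored 0 (an assumption).
    """
    M = S.shape[0]
    C = np.zeros((M, M))
    for a in range(M):
        for b in range(a + 1, M):
            both = ~np.isnan(S[a]) & ~np.isnan(S[b])
            if both.sum() < min_overlap:
                continue
            x = S[a, both] - S[a, both].mean()
            y = S[b, both] - S[b, both].mean()
            denom = np.sqrt((x @ x) * (y @ y))
            if denom > 0:
                C[a, b] = C[b, a] = (x @ y) / denom
    return C

# Hypothetical toy influence matrix (rows: users, columns: events).
S = np.array([[0.1, 0.2, 0.3],
              [0.2, 0.4, 0.6],
              [0.3, 0.2, np.nan]])
U = pearson_rows(S)    # user-user correlation (rows of S)
E = pearson_rows(S.T)  # event-event correlation (columns of S)
```

Because only co-observed entries are used, a sparse S leaves many pairs with too little overlap to score, which is the weakness the later feature-based neighbour discovery is meant to address.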
Specifically, the MF-EUN prediction model fuses the user correlation influence factor matrix W and the event correlation influence factor matrix Z into the MF prediction model. The prediction is made by the following formula:
[equation missing in source]
where $P_u^T Q_e$ is the prediction based on matrix factorization, $W_u$ is the fused user correlation influence factor and $Z_e$ is the fused event correlation influence factor (the defining expressions for $W_u$ and $Z_e$ are missing in the source).
$N_t(e,u)$ denotes the t users whose correlation with user u in the user correlation matrix U exceeds a preset value. It is referred to below as the neighbour set of user u and can be determined from the user correlation matrix U.
$N_k(u,e)$ denotes the k events whose correlation with event e in the event correlation matrix E exceeds a preset value. It is referred to below as the neighbour set of event e and can be determined from the event correlation matrix E.
The model also uses an influence weight of user $u_i$ on user u and an influence weight of event $e_j$ on event e (the weight symbols are missing in the source).
That is, the formula of the MF-EUN prediction model is:
[equation missing in source]
The parameters of the MF-EUN prediction model therefore include $P_u$, $Q_e$, the user influence weights and the event influence weights.
First, the objective function of the MF-EUN prediction model is defined as:
[equation missing in source]
where the objective includes a regularization term that prevents overfitting.
Taking partial derivatives of the objective function gives:
[equation missing in source]
Next, stochastic gradient descent (SGD) is used to learn the optimal parameters $P_u$, $Q_e$ and the two sets of influence weights.
During training of the MF-EUN model, the elements of $P_u$, $Q_e$ and the influence weights are first initialized randomly and then updated iteratively in the direction opposite to the gradient until they converge, where η is the learning rate. Finally, from the feature vectors and influence weights of the M users and N events, the optimized user feature vector matrix P, event feature vector matrix Q, user correlation influence factor matrix W and event correlation influence factor matrix Z are obtained, and the value of user influence is computed with the MF-EUN prediction formula.
In the method provided by this embodiment, the user influence matrix S is built from the influence of M users in N events in the social network, the user correlation matrix U is built from user feature information, and the event correlation matrix E is built from event feature information. Based on S, U and E, event correlation and user correlation are fused into the MF prediction model, and the new prediction model MF-EUN is proposed for predicting user influence based on social network events, which improves the accuracy of the predictions. In addition, the method can predict user influence in social network events more comprehensively.
In the above embodiment, the user influence matrix S is sparse, and any two rows or any two columns of S share few co-observed elements, so classical similarity measures (for example, cosine similarity and the Pearson correlation coefficient) have difficulty finding reliable neighbours. Another embodiment of the present invention therefore proposes a neighbour discovery method based on feature information for determining the user correlation matrix U and the event correlation matrix E.
Fig. 2 is a schematic diagram of the framework of the MF-EUN prediction model provided by an embodiment of the present invention. As shown in Fig. 2, the model includes three parts:
Part 1, construction of the social influence matrix, which can be built with the method for building the user influence matrix S in the embodiment shown in Fig. 1;
Part 2, neighbour discovery based on additional information, which uses the characteristics of event-based social networks to propose a user neighbour discovery method and an event neighbour discovery method;
Part 3, the prediction model MF-EUN, which incorporates user neighbours and event neighbours into the MF prediction model. The principle and prediction process of the MF-EUN model are the same as in the embodiment shown in Fig. 1 and are not repeated here.
This embodiment describes the discovery of user neighbours and event neighbours in detail.
The neighbour discovery method based on additional information considers feature information unique to event-based social networks: the user features (a user's influence on topics, on regions and on organizers) and the event features (event content, event location and event organizer).
First aspect: the user neighbour discovery method.
Let $U(u,u')$ denote the correlation between user u and user u', and let $U_t(u,u')$, $U_r(u,u')$ and $U_o(u,u')$ denote the two users' influence similarity on topics, on regions and on organizers, respectively. On this basis, a similarity calculation based on linear fusion is proposed: $U(u,u') = \beta_1 U_t(u,u') + \beta_2 U_r(u,u') + \beta_3 U_o(u,u')$, where $\beta_1$, $\beta_2$ and $\beta_3$ are the weights of the influence similarity on topics, on regions and on organizers, respectively. Finally, the user correlation matrix U is built by computing the similarity between every pair of users.
For any user u, the t users whose correlation with user u in the user correlation matrix U exceeds the preset value can be taken as the neighbour set $N_t(e,u)$ of user u.
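The linear fusion and neighbour selection above can be sketched in a few lines. The function names, the example weights and the reading of the selection rule as "the t most correlated users among those above the threshold" are assumptions; the text does not say how ties or the event argument of $N_t(e,u)$ are handled, and this sketch ignores the event argument.

```python
import numpy as np

def fuse_similarities(U_t, U_r, U_o, betas=(1 / 3, 1 / 3, 1 / 3)):
    """U = beta1 * U_t + beta2 * U_r + beta3 * U_o, applied element-wise."""
    b1, b2, b3 = betas
    return b1 * U_t + b2 * U_r + b3 * U_o

def neighbours(U, u, t, threshold=0.0):
    """Up to t users most correlated with u whose correlation exceeds threshold."""
    scores = U[u].astype(float)  # astype returns a copy, so U is not modified
    scores[u] = -np.inf          # a user is not its own neighbour
    order = np.argsort(-scores)  # indices sorted by decreasing correlation
    return [int(v) for v in order[:t] if scores[v] > threshold]

# Hypothetical similarity matrices for three users.
U_t = np.array([[1.0, 0.8, 0.2], [0.8, 1.0, 0.4], [0.2, 0.4, 1.0]])
U_r = np.array([[1.0, 0.6, 0.1], [0.6, 1.0, 0.5], [0.1, 0.5, 1.0]])
U_o = np.array([[1.0, 0.4, 0.3], [0.4, 1.0, 0.2], [0.3, 0.2, 1.0]])
U = fuse_similarities(U_t, U_r, U_o, betas=(0.5, 0.3, 0.2))
nb_of_user_0 = neighbours(U, u=0, t=2, threshold=0.5)
```

The same two helpers would apply unchanged to events, with the three event similarity matrices in place of the user ones.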
The methods for determining users' influence similarity on topics, on regions and on organizers are described in turn below.
1) Users' influence similarity on topics
According to the related art, a user's influence differs from topic to topic, so the similarity between users is measured at the topic level. For example, the topic distributions of all events can be obtained with the Latent Dirichlet Allocation (LDA) document topic model.
Let $st_u$ denote the influence of user u on topics. $st_u$ is computed by the following formula:
[equation missing in source]
where the formula uses the topic distribution of each event $e_i$, $HE_u$ is the set of all events in which user u has participated, and $|HE_u|$ is the number of events in which user u has participated.
The KL-JS divergence is then used to compute the influence similarity of any two users on topics: $U_t(u,u') = 1 - D_{JS}(st_u, st_{u'})$, where $D_{JS}(st_u, st_{u'})$ is the KL-JS divergence between $st_u$ and $st_{u'}$.
Note that in probability theory and statistics, the Jensen-Shannon (JS) divergence measures the distance (degree of similarity) between probability distributions, and the Kullback-Leibler (KL) divergence describes the difference between two probability distributions P and Q. In their standard form, $D_{JS}(P \| Q) = \frac{1}{2} D_{KL}(P \| M) + \frac{1}{2} D_{KL}(Q \| M)$ with $M = \frac{1}{2}(P + Q)$, and $D_{KL}(P \| Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)}$.
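The topic similarity $U_t(u,u') = 1 - D_{JS}(st_u, st_{u'})$ can be sketched as follows, assuming that $st_u$ and $st_{u'}$ are already normalized probability vectors over topics. The use of base-2 logarithms is an assumption made here so that the JS divergence, and hence the similarity, lies in [0, 1]; the source does not state the logarithm base.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p || q) in bits; terms with p_i = 0 contribute 0 by convention."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def js_divergence(p, q):
    """D_JS(p || q) = 0.5 * D_KL(p || m) + 0.5 * D_KL(q || m), m = (p + q) / 2."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)  # m_i > 0 wherever p_i > 0 or q_i > 0, so no division by zero
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def topic_similarity(st_u, st_v):
    """U_t(u, u') = 1 - D_JS(st_u, st_u')."""
    return 1.0 - js_divergence(st_u, st_v)

# Hypothetical topic-influence distributions over three topics.
same = topic_similarity([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
disjoint = topic_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Identical distributions give the maximum similarity and distributions with no shared topic give the minimum, which is the behaviour a neighbour ranking needs.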
2) User influence similarity on regions
Analysis of the Douban Tongcheng data set shows that each user's influence differs across regions; Fig. 3 is a schematic diagram of the influence distribution of a randomly selected user over different regions. On this basis, we can measure user influence at the region level.
First, according to the locations of the events that a user has participated in, we define the influence of user $u$ on a region as the mean influence over the events that $u$ participated in within that region:
$$sr_u^{R_m} = \frac{1}{n_u(R_m)} \sum_{e_i \in HE_u \cap E_{R_m}} s_{u e_i}$$
where $sr_u^{R_m}$ is the influence of user $u$ on region $R_m$, $s_{u e_i}$ is the influence of user $u$ in event $e_i$, $HE_u$ is the set of all events that user $u$ has participated in, $E_{R_m}$ is the set of events held in region $R_m$, and $n_u(R_m)$ is the number of events that user $u$ participated in within region $R_m$.
Then, let $sr_u$ denote the vector of the influence of user $u$ over all regions:
$$sr_u = \left(sr_u^{R_1}, sr_u^{R_2}, \dots, sr_u^{R_f}\right)$$
where $f$ is the total number of regions.
Finally, from the influence of any two users over all regions, their influence similarity on regions is computed with the cosine similarity:
$$U_r(u,u') = \frac{sr_u \cdot sr_{u'}}{\|sr_u\|\,\|sr_{u'}\|}$$
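The region-level steps (per-region mean influence, influence vector, cosine similarity) can be sketched as below. The input layout (a list of `(region_index, influence)` pairs per user), the function names, and the choice of 0 for regions where the user attended no events are all assumptions made for illustration.

```python
import numpy as np

def region_influence_vector(history, f):
    """history: (region_index, influence) pairs for the events a user attended.

    Returns a length-f vector of mean influence per region; regions without
    attended events are set to 0 (an assumption, not stated in the text).
    """
    sums = np.zeros(f)
    counts = np.zeros(f)
    for region, influence in history:
        sums[region] += influence
        counts[region] += 1
    return np.divide(sums, counts, out=np.zeros(f), where=counts > 0)

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0 or nb == 0:
        return 0.0
    return float(np.dot(a, b) / (na * nb))

sr_u = region_influence_vector([(0, 0.2), (0, 0.4), (2, 0.5)], f=3)
sr_v = region_influence_vector([(0, 0.6), (2, 1.0)], f=3)
print(sr_u, sr_v, cosine_similarity(sr_u, sr_v))
```

The same two functions would serve for the organizer-level similarity by indexing organizers instead of regions.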
3) User influence similarity on organizers
Similarly, a user's influence also differs across organizers, so we can also measure user influence at the organizer level.
First, according to the organizers of the events that a user has participated in, we define the influence of a user on an organizer as the mean influence over the events organized by that organizer in which the user participated:
$$so_u^{O_j} = \frac{1}{n_u(O_j)} \sum_{e_i \in HE_u \cap E_{O_j}} s_{u e_i}$$
where $so_u^{O_j}$ is the influence of user $u$ on organizer $O_j$, $s_{u e_i}$ is the influence of user $u$ in event $e_i$, $HE_u$ is the set of all events that user $u$ has participated in, $E_{O_j}$ is the set of events organized by organizer $O_j$, and $n_u(O_j)$ is the total number of events organized by $O_j$ in which user $u$ participated.
Then, let $so_u$ denote the vector of the influence of user $u$ over all organizers:
$$so_u = \left(so_u^{O_1}, so_u^{O_2}, \dots, so_u^{O_l}\right)$$
where $l$ is the total number of organizers.
Finally, from the influence of any two users over all organizers, their influence similarity on organizers is computed with the cosine similarity:
$$U_o(u,u') = \frac{so_u \cdot so_{u'}}{\|so_u\|\,\|so_{u'}\|}$$
In a second aspect, the event neighbor discovery method.
Let $E(e,e')$ denote the correlation between event $e$ and event $e'$, and let $E_c(e,e')$, $E_l(e,e')$ and $E_o(e,e')$ denote the content similarity, location similarity and organizer similarity of the two events, respectively. As for users, the correlation between event $e$ and event $e'$ is computed by a similarity calculation based on linear fusion: $E(e,e') = \alpha_1 E_c(e,e') + \alpha_2 E_l(e,e') + \alpha_3 E_o(e,e')$, where $\alpha_1$, $\alpha_2$ and $\alpha_3$ are the weights of the event content similarity, the event location similarity and the event organizer similarity, respectively. Finally, the event correlation matrix E is built by computing the similarity between every pair of events.
For any event $e$, the $K$ events whose correlation with $e$ in the event correlation matrix E exceeds a preset value are taken as the neighbor set $N^k(u,e)$ of event $e$.
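The linear fusion of the three event similarity matrices can be sketched as follows. The weights (0.5, 0.3, 0.2) and the toy matrices are hypothetical; the text does not give the weight values actually used, nor does it state that they sum to 1.

```python
import numpy as np

def fuse_event_similarities(E_c, E_l, E_o, alphas=(0.5, 0.3, 0.2)):
    """Linear fusion E = a1*E_c + a2*E_l + a3*E_o (weights are illustrative)."""
    a1, a2, a3 = alphas
    return a1 * np.asarray(E_c) + a2 * np.asarray(E_l) + a3 * np.asarray(E_o)

# Toy similarity matrices for three events (hypothetical values)
E_c = np.array([[1.0, 0.9, 0.2], [0.9, 1.0, 0.4], [0.2, 0.4, 1.0]])  # content
E_l = np.array([[1.0, 0.5, 0.1], [0.5, 1.0, 0.8], [0.1, 0.8, 1.0]])  # location
E_o = np.array([[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])  # organizer

E = fuse_event_similarities(E_c, E_l, E_o)
print(E)
```

Because each input matrix is symmetric, the fused matrix is symmetric as well, so the correlation of $e$ with $e'$ equals that of $e'$ with $e$.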
The methods for determining the event content similarity, the event location similarity and the event organizer similarity are described below by way of example.
1) Event content similarity
First, the topic distribution of every event is obtained with the classical topic model LDA; this topic distribution characterizes the category of the event. The content similarity between any two events is then computed with the KL-JS divergence.
Let $\theta_e$ and $\theta_{e'}$ be the topic distributions of event $e$ and event $e'$, respectively. The content similarity of any two events can be computed with the KL-JS divergence: $E_c(e,e') = 1 - D_{JS}(\theta_e, \theta_{e'})$, where $D_{JS}(\theta_e, \theta_{e'})$ is the KL-JS divergence between $\theta_e$ and $\theta_{e'}$.
2) Event location similarity
Data analysis of the Douban Tongcheng data set, in which the distances between all events that a user participated in were computed, shows that the probability density of these distances follows a power-law distribution; Fig. 4 is a schematic diagram of the probability distribution of the distances between all events attended by a randomly selected user. In other words, the distances between the social network events that a user participates in are relatively small. We therefore consider that the closer the locations of two events, the higher the similarity between them.
Accordingly, we define the location similarity of two events with a Gaussian law (formula not reproduced in this text), where $l_e$ and $l_{e'}$ are the locations where event $e$ and event $e'$ are held, and $dis(l_e, l_{e'})$ is the distance between $l_e$ and $l_{e'}$.
3) Event organizer similarity
In an event-based social network, each event has one organizer, and whether a user participates in an event is also affected by the event organizer. Meanwhile, one organizer may organize multiple events. We therefore define the organizer similarity of two events by a formula (not reproduced in this text) in terms of $O(e)$ and $O(e')$, the organizers of event $e$ and event $e'$, respectively.
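Since the exact location and organizer formulas are not available in this text, the following is only one plausible sketch of the two component similarities. It assumes a Gaussian kernel $\exp(-d^2 / (2\sigma^2))$ with a hypothetical bandwidth `sigma_km`, and an indicator function for organizers (1 if both events share an organizer, 0 otherwise). Neither choice is confirmed by the patent.

```python
import math

def location_similarity(distance_km, sigma_km=5.0):
    """Gaussian kernel on distance: 1.0 at distance 0, decaying as events get farther apart.

    sigma_km is a hypothetical bandwidth, not a value from the patent.
    """
    return math.exp(-(distance_km ** 2) / (2.0 * sigma_km ** 2))

def organizer_similarity(organizer_e, organizer_e2):
    """Assumed indicator: 1.0 when both events have the same organizer, else 0.0."""
    return 1.0 if organizer_e == organizer_e2 else 0.0

print(location_similarity(0.0), location_similarity(2.0), location_similarity(20.0))
print(organizer_similarity("org_a", "org_a"), organizer_similarity("org_a", "org_b"))
```

Whatever the exact functional form, the property that matters for the argument above is monotonicity: closer events receive a higher location similarity.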
To better illustrate the advantages of the user influence prediction method based on social network events provided by the present invention, in another embodiment of the present invention we use two widely used metrics, root mean square error (RMSE) and mean absolute error (MAE). The two metrics are computed as follows:
$$RMSE = \sqrt{\frac{1}{|D_{test}|}\sum_{(u,e)\in D_{test}} \left(s_{ue} - \hat{s}_{ue}\right)^2}, \qquad MAE = \frac{1}{|D_{test}|}\sum_{(u,e)\in D_{test}} \left|s_{ue} - \hat{s}_{ue}\right|$$
where $D_{test}$ is the set of user-event pairs in the test data, $s_{ue}$ is the observed influence of user $u$ in event $e$, and $\hat{s}_{ue}$ is the predicted influence.
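For concreteness, the two metrics can be computed as in the short sketch below. The function names and the sample values are illustrative only and are not results from the patent's experiments.

```python
import math

def rmse(actual, predicted):
    """Root mean square error over paired observed and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error over paired observed and predicted values."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

# Hypothetical observed and predicted influence values for three user-event pairs
observed = [0.2, 0.5, 0.9]
predicted = [0.1, 0.5, 0.7]
print(rmse(observed, predicted), mae(observed, predicted))
```

RMSE penalizes large errors more heavily than MAE, which is why the two are usually reported together.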
Specifically, we use a real data set crawled from Douban Tongcheng for experimental verification. The event participation records in the data set span 2013/02/01 to 2014/10/31. We removed users who participated in fewer than 5 events (about 5% of all users) and events with fewer than 8 participants (about 3% of all events). The resulting data set contains 11123 users, 29342 events and 356052 user-event pairs. The sparsity of the influence matrix S built from the whole data set is 99.9%.
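The stated sparsity can be checked from the reported counts, assuming sparsity means the fraction of user-event cells of S that have no observed value:

```python
users, events, observed_pairs = 11123, 29342, 356052

density = observed_pairs / (users * events)  # fraction of filled cells in S
sparsity = 1.0 - density                     # fraction of empty cells in S

print(f"density = {density:.4%}, sparsity = {sparsity:.1%}")
```

With these counts, only about 0.11% of the cells are observed, which is consistent with the 99.9% figure.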
It is worth noting that the participants of an event on Douban Tongcheng are listed in chronological order. Therefore, if user $u_f$ clicked "I want to participate" later than user $u$, we consider that the participation of user $u_f$ in the event was influenced by user $u$.
In the experiments, the 11123 users are randomly divided into data sets of different sizes: a data set of 1000 users, a data set of 5000 users, and a data set of 11123 users. Furthermore, we randomly select 50%, 70% and 90% of the known data as the training data set, and the remaining elements are used as the test data set.
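A random split of the known entries into training and test sets can be sketched as follows. The function name, the fixed seed and the toy list of user-event pairs are assumptions for illustration; the patent does not describe how the random selection was implemented.

```python
import random

def split_observed(pairs, train_fraction, seed=0):
    """Shuffle the observed (user, event) pairs and cut them into train and test lists."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Ten hypothetical observed user-event pairs
known = [(u, e) for u in range(5) for e in range(2)]
for fraction in (0.5, 0.7, 0.9):
    train, test = split_observed(known, fraction)
    print(fraction, len(train), len(test))
```

Every known entry lands in exactly one of the two sets, so the test entries are never seen during training.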
We tune the parameters of the model to their optimal values according to the experimental results. The performance of the MF-EUN prediction model is described below by analyzing the experimental data.
First, we compare the prediction performance of the user neighbor discovery method based on additional information proposed by the present invention (Additional Information User Neighborhood, AI-UN) and the event neighbor discovery method based on additional information (Additional Information Event Neighborhood, AI-EN) with that of other neighbor discovery methods.
Fig. 5 is a schematic diagram comparing the performance of the AI-UN method provided by the embodiment of the present invention with other neighbor discovery methods. As shown in Fig. 5, the other neighbor discovery methods include: the topic-based user neighbor discovery method (Topic-User Neighborhood, T-UN), the region-based user neighbor discovery method (Region-User Neighborhood, R-UN), the organizer-based user neighbor discovery method (Organizer-User Neighborhood, O-UN), the user neighbor discovery method based on topic and region (Topic-Region User Neighborhood, TR-UN), the user neighbor discovery method based on topic and organizer (Topic-Organizer User Neighborhood, TO-UN), the user neighbor discovery method based on region and organizer (Organizer-Region Neighborhood, RO-UN), and the user neighbor discovery method based on Pearson similarity (Pearson-User Neighborhood, P-UN).
Fig. 6 is a schematic diagram comparing the performance of the AI-EN method provided by the embodiment of the present invention with other neighbor discovery methods. As shown in Fig. 6, the other neighbor discovery methods include: the event neighbor method based on event content (Content-Event Neighborhood, C-EN), the event neighbor method based on event location (Location-Event Neighborhood, L-EN), the event neighbor method based on event organizer (Organizer-Event Neighborhood, O-EN), the event neighbor method based on event content and location (Content-Location-Event Neighborhood, CL-EN), the event neighbor method based on event content and organizer (Content-Organizer-Event Neighborhood, CO-EN), the event neighbor method based on event location and organizer (Location-Organizer-Event Neighborhood, LO-EN), and the event neighbor method based on Pearson similarity (Pearson-Event Neighborhood, P-EN).
Figs. 5 and 6 show that the AI-UN and AI-EN methods proposed by the embodiment of the present invention clearly outperform the other neighbor discovery methods. The other neighbor discovery methods (T-UN, R-UN, O-UN, TR-UN, TO-UN, RO-UN and C-EN, L-EN, O-EN, CL-EN, CO-EN, LO-EN) consider only one or two of the features proposed by the embodiment of the present invention, so their prediction accuracy is lower than that of a neighbor discovery method that fuses all three features. In addition, because the neighbor discovery methods based on additional information proposed by the embodiment of the present invention (AI-UN and AI-EN) take into account the unique characteristics of event-based social networks, they achieve higher prediction accuracy than the traditional neighbor methods (P-EN and P-UN).
Next, we compare the performance of the methods of the above embodiments on data sets of different sizes. Table 1 shows the results of experimental verification on the real data set crawled from Douban Tongcheng.
The experimental results in Table 1 show that MF-EUN performs best for all data sets and all training set proportions. Because MF-EUN integrates neighbor discovery into matrix factorization, it exploits the advantages of both, and its prediction performance is better than that of neighbor discovery alone or matrix factorization alone. Moreover, MF-EUN integrates both event neighbors and user neighbors into matrix factorization, so its prediction accuracy is higher than that of MF-EN and MF-UN, which integrate only event neighbors or only user neighbors into matrix factorization.
Furthermore, it should be noted that a related-art document (P. Cui, F. Wang, S. Liu, M. Ou, S. Yang, and L. Sun, "Who should share what?: item-level social influence prediction for users and posts ranking," in SIGIR, 2011, pp. 185-194) studies influence at the item level, that is, it considers that the same user has different influence on different items. That document proposes an HF-NMF (Hybrid Factor Non-Negative Matrix Factorization) method to predict a user's influence on his or her friends, and solves it with a projected matrix factorization method. HF-NMF integrates the features of users and of microblog entries into non-negative matrix factorization, and its prediction performance is better than that of plain matrix factorization (MF). However, the MF-EUN method in the embodiment of the present invention exploits the advantages of both matrix factorization and neighbor discovery: matrix factorization captures the global information of the influence matrix S, while the neighborhood model (neighbor sets) captures the neighbor information of users and events. Our MF-EUN therefore achieves higher prediction accuracy than the HF-NMF method.
Table 1: Results of experimental verification on the real data set crawled from Douban Tongcheng
In addition, it is worth noting that comparing the experimental results across test sets of different sizes shows that the larger the test set, the higher the prediction accuracy. That is, in our prediction model, the lower the sparsity of the matrix, the better the algorithm performs.
A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware controlled by program instructions. The program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks or optical discs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A user influence prediction method based on social network events, characterized by comprising:
establishing a user influence matrix S according to the influence of M users in N events, wherein an element $s_{ue}$ of the user influence matrix S represents the proportion of friends that user $u$ influences in event $e$, where $1 \le u \le M$ and $u$ is an integer, $1 \le e \le N$ and $e$ is an integer, M is an integer greater than 1, and N is an integer greater than 1;
establishing a user correlation matrix U according to feature information of the M users, wherein an element $u_{uu'}$ of the user correlation matrix U represents the correlation between user $u$ and user $u'$, where $1 \le u' \le M$ and $u'$ is an integer;
establishing an event correlation matrix E according to feature information of the N events, wherein an element $e_{ee'}$ of the event correlation matrix E represents the correlation between event $e$ and event $e'$, where $1 \le e' \le N$ and $e'$ is an integer;
determining, according to the user influence matrix S, the user correlation matrix U and the event correlation matrix E, a user feature vector matrix P, an event feature vector matrix Q, a user correlation influence factor matrix W and an event correlation influence factor matrix Z, wherein P and Q are respectively the feature vector matrix of users and the feature vector matrix of events obtained by matrix factorization of the user influence matrix S, and W and Z are respectively the influence factor matrices of the user correlation and of the event correlation on the user influence in social network events;
determining the user influence in social network events according to the user feature vector matrix P, the event feature vector matrix Q, the user correlation influence factor matrix W and the event correlation influence factor matrix Z.
2. The method according to claim 1, characterized in that the feature information of the users includes the influence of users on topics, the influence of users on regions, and the influence of users on organizers;
establishing the user correlation matrix U according to the feature information of the M users comprises:
establishing a user influence similarity matrix on topics $U_t$ according to the influence of the M users on topics;
establishing a user influence similarity matrix on regions $U_r$ according to the influence of the M users on regions;
establishing a user influence similarity matrix on organizers $U_o$ according to the influence of the M users on organizers;
establishing the user correlation matrix U according to $U(u,u') = \beta_1 U_t(u,u') + \beta_2 U_r(u,u') + \beta_3 U_o(u,u')$, where $\beta_1$, $\beta_2$ and $\beta_3$ are respectively the weights of the user influence similarity on topics, the user influence similarity on regions, and the user influence similarity on organizers.
3. The method according to claim 2, characterized in that establishing the user influence similarity matrix on topics $U_t$ according to the influence of the M users on topics comprises:
determining the influence $st_u$ of the user on topics according to a formula (not reproduced in this text) in which $\theta_{e_i}$ is the topic distribution of event $e_i$ and $HE_u$ is the set of all events that user $u$ has participated in;
determining the influence similarity of any two users on topics according to $U_t(u,u') = 1 - D_{JS}(st_u, st_{u'})$, where $D_{JS}(st_u, st_{u'})$ is the KL-JS divergence between $st_u$ and $st_{u'}$.
4. The method according to claim 2, characterized in that establishing the user influence similarity matrix on regions $U_r$ according to the influence of the M users on regions comprises:
determining the influence of the user on a region according to $sr_u^{R_m} = \frac{1}{n_u(R_m)} \sum_{e_i \in HE_u \cap E_{R_m}} s_{u e_i}$, where $sr_u^{R_m}$ is the influence of user $u$ on region $R_m$, $HE_u$ is the set of all events that user $u$ has participated in, $E_{R_m}$ is the set of events held in region $R_m$, and $n_u(R_m)$ is the number of events that user $u$ participated in within region $R_m$;
determining the vector of the user's influence over all regions according to $sr_u = \left(sr_u^{R_1}, sr_u^{R_2}, \dots, sr_u^{R_f}\right)$, where $f$ is the total number of regions;
determining the influence similarity of any two users on regions according to $U_r(u,u') = \frac{sr_u \cdot sr_{u'}}{\|sr_u\|\,\|sr_{u'}\|}$.
5. The method according to claim 2, characterized in that establishing the user influence similarity matrix on organizers $U_o$ according to the influence of the M users on organizers comprises:
determining the influence of the user on an organizer according to $so_u^{O_j} = \frac{1}{n_u(O_j)} \sum_{e_i \in HE_u \cap E_{O_j}} s_{u e_i}$, where $so_u^{O_j}$ is the influence of user $u$ on organizer $O_j$, $HE_u$ is the set of all events that user $u$ has participated in, $E_{O_j}$ is the set of events organized by organizer $O_j$, and $n_u(O_j)$ is the total number of events organized by $O_j$ in which user $u$ participated;
determining the vector of the user's influence over all organizers according to $so_u = \left(so_u^{O_1}, so_u^{O_2}, \dots, so_u^{O_l}\right)$, where $l$ is the total number of organizers;
determining the influence similarity of any two users on organizers according to $U_o(u,u') = \frac{so_u \cdot so_{u'}}{\|so_u\|\,\|so_{u'}\|}$.
6. The method according to any one of claims 1 to 5, characterized in that the feature information of the events includes event content, event location and event organizer;
establishing the event correlation matrix E according to the feature information of the N events comprises:
establishing an event content similarity matrix $E_c$ according to the content of the N events;
establishing an event location similarity matrix $E_l$ according to the locations of the N events;
establishing an event organizer similarity matrix $E_o$ according to the organizers of the N events;
establishing the event correlation matrix E according to $E(e,e') = \alpha_1 E_c(e,e') + \alpha_2 E_l(e,e') + \alpha_3 E_o(e,e')$, where $\alpha_1$, $\alpha_2$ and $\alpha_3$ are respectively the weights of the event content similarity, the event location similarity and the event organizer similarity.
7. The method according to claim 6, characterized in that establishing the event content similarity matrix $E_c$ according to the content of the N events comprises:
determining the content similarity of any two events according to $E_c(e,e') = 1 - D_{JS}(\theta_e, \theta_{e'})$, where $\theta_e$ and $\theta_{e'}$ are respectively the topic distributions of event $e$ and event $e'$, and $D_{JS}(\theta_e, \theta_{e'})$ is the KL-JS divergence between $\theta_e$ and $\theta_{e'}$.
8. The method according to claim 6, characterized in that establishing the event location similarity matrix $E_l$ according to the locations of the N events comprises:
determining the location similarity of any two events according to a formula (not reproduced in this text) in which $l_e$ and $l_{e'}$ are respectively the locations where event $e$ and event $e'$ are held, and $dis(l_e, l_{e'})$ is the distance between $l_e$ and $l_{e'}$.
9. The method according to claim 6, characterized in that establishing the event organizer similarity matrix $E_o$ according to the organizers of the N events comprises:
determining the organizer similarity of any two events according to a formula (not reproduced in this text) in which $O(e)$ and $O(e')$ are respectively the organizers of event $e$ and event $e'$.
10. The method according to claim 1, characterized in that determining, according to the user influence matrix S, the user correlation matrix U and the event correlation matrix E, the user feature vector matrix P, the event feature vector matrix Q, the user correlation influence factor matrix W and the event correlation influence factor matrix Z comprises:
initializing the user feature vector matrix P, the event feature vector matrix Q, the user correlation influence factor matrix W and the event correlation influence factor matrix Z;
updating, according to the user influence matrix S, the user correlation matrix U and the event correlation matrix E, the user feature vector matrix P, the event feature vector matrix Q, the user correlation influence factor matrix W and the event correlation influence factor matrix Z, until the user feature vector matrix P, the event feature vector matrix Q, the user correlation influence factor matrix W and the event correlation influence factor matrix Z converge.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610279983.6A CN105913159A (en) | 2016-04-29 | 2016-04-29 | Social network event based user's influence prediction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105913159A true CN105913159A (en) | 2016-08-31 |
Family
ID=56752309
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610279983.6A Pending CN105913159A (en) | 2016-04-29 | 2016-04-29 | Social network event based user's influence prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105913159A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107808333A (en) * | 2016-09-08 | 2018-03-16 | 阿里巴巴集团控股有限公司 | A kind of commodity launch decision system, method and device |
US11468521B2 (en) | 2016-10-31 | 2022-10-11 | Tencent Technology (Shenzhen) Company Limited | Social media account filtering method and apparatus |
WO2018077301A1 (en) * | 2016-10-31 | 2018-05-03 | 中国科学技术大学先进技术研究院 | Account screening method and apparatus |
CN107230158A (en) * | 2017-06-12 | 2017-10-03 | 合肥工业大学 | Social network user relative influence measure |
WO2019095570A1 (en) * | 2017-11-17 | 2019-05-23 | 平安科技(深圳)有限公司 | Method for predicting popularity of event, server, and computer readable storage medium |
CN112313687A (en) * | 2018-06-11 | 2021-02-02 | 奥姆尼欧斯株式会社 | Method and device for measuring influence by using social network |
CN112313687B (en) * | 2018-06-11 | 2024-05-07 | 奥姆尼欧斯株式会社 | Method and device for measuring influence by utilizing social network |
CN109063927A (en) * | 2018-08-28 | 2018-12-21 | 成都信息工程大学 | A kind of microblogging transfer amount prediction technique based on TS-LSTM and DNN |
CN109063927B (en) * | 2018-08-28 | 2021-12-07 | 成都信息工程大学 | Microblog forwarding capacity prediction method based on TS-LSTM and DNN |
CN110209962A (en) * | 2019-06-12 | 2019-09-06 | 合肥工业大学 | The acquisition methods and system of theme level high-impact user |
CN110209962B (en) * | 2019-06-12 | 2021-02-26 | 合肥工业大学 | Method and system for acquiring theme-level high-influence user |
CN110223125A (en) * | 2019-06-18 | 2019-09-10 | 东北大学 | User location acquisition methods under the income algorithm of node location core side |
CN110223125B (en) * | 2019-06-18 | 2021-06-11 | 东北大学 | User position obtaining method under node position kernel-edge profit algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160831 |