CN103902690A - Method for improving the accuracy of the information influence of social network user-generated content (UGC) - Google Patents

Info

Publication number
CN103902690A
Authority
CN
China
Legal status
Granted
Application number
CN201410119194.7A
Other languages
Chinese (zh)
Other versions
CN103902690B (en)
Inventor
李蕾
林鑫
王博远
Current Assignee
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201410119194.7A
Publication of CN103902690A
Application granted
Publication of CN103902690B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/3332: Query translation
    • G06F16/3334: Selection or weighting of terms from queries, including natural language queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for improving the accuracy of the calculated information influence of user-generated content (UGC) on a social network. The UGC contains M keywords and involves N users. The method comprises: establishing a member participation mechanism for social network UGC; building an unweighted directed fan network graph from the fan relationships among the users of the UGC and dividing it into communities; building a weighted undirected interest network graph from the reply relationships among the users of the UGC and dividing it into communities; calculating the social influence $U_X$ of a user X from the degrees of correlation between the influence factors of the member participation mechanism; calculating the social influence of a keyword K posted by user X as $S_{KX} = \sum_{i=1}^{m} U_X$, where m is the number of propagations of keyword K at user X and $S_{KX} = 0$ if m = 0; calculating the comprehensive social influence of keyword K in the UGC as $Sum_K = \sum_{i=1}^{N} S_{Ki}$; and summing the comprehensive social influence of the M keywords in the UGC to obtain the information influence INF of the UGC.

Description

A method for improving the accuracy of the information influence of social network user-generated content
Technical field
The present invention relates to information monitoring technology, and in particular to a method for improving the accuracy of the calculated information influence of user-generated content on social networks.
Background
The Internet has entered the Web 2.0 era, in which every user can freely express opinions. Much important content and news is first produced as user-generated content (UGC), then spreads widely through social networks, and finally exerts great influence within a particular social circle or even across society as a whole. Research on UGC influence therefore plays an important role in information acquisition, monitoring and prediction. However, because the volume of UGC is huge and grows very quickly, it is difficult to process all of it; high-quality, high-influence UGC must be filtered out for study and use. As a result, research on UGC quality and on the evaluation of information influence is receiving increasing attention.
At present, research on information influence mainly relies on the influence diffusion model (IDM) and its refinements, such as the influence diffusion probability model (IDPM). The text-session-based IDM uses the reply-chain structure of a session: it computes word-frequency-based similarity between texts to obtain the influence diffusion of the source, and the influence diffusion of a text is the sum of the influence diffused by each reply. Since it was proposed, this model has become a cornerstone of information influence research, and most later work improves on it. IDPM defines the probability with which a single keyword spreads influence over the whole interest space, which addresses the break in the influence transfer structure in IDM and the false influence propagation caused by flooding (low-content filler replies). By considering only the effective keywords in a sentence, it also addresses the break in influence transfer content in IDM.
However, these models have some obvious defects: for example, every comment or reply carries the same weight, and the relationships between users are not considered. Take a BBS post as the UGC, as shown in Fig. 1:
User 1 is the information publisher, and users 2 to 5 are repliers to user 1. A, B, C, D, E and F are the keywords contained in the post. A heavy solid line represents the propagation of the post's influence between users, and its direction is the direction of propagation. A dot-dash line represents a fan relationship between users, a dotted line indicates that the users belong to the same community in the interest network, and a thin solid line indicates that the users belong to the same community in the fan network.
In Fig. 1, users 2 to 5 have all replied to user 1's post. However, user 2 is a fan of user 1; user 3 belongs to the same interest network community as user 1; user 4 belongs to the same fan network community as user 1 (but is not a fan of user 1); and user 5 is a new user who may previously have had almost no relationship with user 1.
This shows that IDPM, by not weighting the keywords of the UGC individually, introduces a deviation into the calculated information influence of the UGC.
Summary of the invention
In view of this, the present invention proposes a method for improving the accuracy of the information influence of social network user-generated content. It effectively overcomes the defect of the prior art that the keywords of the UGC are not weighted individually, which biases the calculated information influence of the UGC. The technical solution proposed by the present invention is:
A method for improving the accuracy of the information influence of social network user-generated content, the method comprising:
A. Establish a member participation mechanism for social network UGC, and determine the path coefficients between the influence factors of the member participation mechanism, where a path coefficient is the degree of correlation between two influence factors of the mechanism;
B. Build an unweighted directed fan network graph from the fan relationships among the users of the UGC, and divide the fan network graph into communities; build a weighted undirected interest network graph from the reply relationships among the users of the UGC, and divide the interest network graph into communities;
C. Calculate the social influence $U_X$ of user X from the degrees of correlation between the influence factors of the member participation mechanism;
D. According to
$$S_{KX} = \sum_{i=1}^{m} U_X$$
calculate the social influence of keyword K posted by user X, where m is the number of propagations of keyword K at user X; if m = 0, $S_{KX} = 0$;
E. According to the formula
$$Sum_K = \sum_{i=1}^{N} S_{Ki}$$
calculate the comprehensive social influence of keyword K in the UGC;
F. Calculate the sum of the comprehensive social influence of the M keywords in the UGC to obtain the information influence INF of the UGC.
In the above scheme, the member participation mechanism comprises four influence factors: information quality, sense of group identity, perceived value, and participation. The path coefficient between information quality and sense of group identity is $a_1$, between information quality and perceived value is $a_2$, between perceived value and sense of group identity is $a_3$, and between participation and sense of group identity is $a_4$.
In the above scheme, step C further comprises:
Calculating the social influence of user X in the UGC according to the formula
$$U_X = \{1 + \ln[1 + \sum_{i=1}^{b} (C_1 \times C_2)]\} \times f$$
where b is the number of direct replies user X receives in the UGC; if user X has no direct replier, $U_X = 0$;
if user X and a direct replier belong to the same interest network community, $C_1 = a_1$; otherwise $C_1 = 1$;
if user X and a direct replier belong to the same fan network community, $C_2 = a_2 \times a_3$; otherwise $C_2 = 1$;
if user X is a fan of the information publisher of the UGC, $f = a_2$; otherwise $f = 1$.
In the above scheme, step F further comprises:
Calculating the information influence INF of the UGC according to the formula
$$INF = \sum_{i=1}^{M} \log\left[1 + \frac{Sum_i}{\sum_{j=1}^{M} Sum_j}\right]$$
In the above scheme, the path coefficient between information quality and sense of group identity is $a_1 = 0.333$, between information quality and perceived value is $a_2 = 0.824$, between perceived value and sense of group identity is $a_3 = 0.624$, and between participation and sense of group identity is $a_4 = 0.437$.
In summary, the present invention proposes a method for improving the accuracy of the information influence of social network user-generated content. It applies the sociability-based influence diffusion probability model (S-IDPM) to calculate UGC information influence, mainly using the users' social networks (the fan network and the interest network) together with the reply-chain structure to weight the replies of different users. The keywords of the UGC are thereby weighted individually, which improves the accuracy of the calculated information influence of social network user-generated content.
Brief description of the drawings
Fig. 1 is a diagram of the relationships among users and a post.
Fig. 2 is a fan network graph.
Fig. 3 is an interest network graph.
Fig. 4 shows the user member participation mechanism.
Fig. 5 is the UGC structure diagram of method embodiment 1.
Fig. 6 is the flowchart of method embodiment 1.
Fig. 7 compares the cumulative coverage rate of manually annotated featured posts in method embodiment 2.
Fig. 8 compares classes and feature values in method embodiment 2.
Fig. 9 compares the cumulative coverage rate of machine-annotated featured posts in method embodiment 2.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and specific embodiments.
The technical solution of one embodiment of the invention is:
A. Establish a member participation mechanism for social network UGC, and determine the path coefficients between the influence factors of the member participation mechanism, where a path coefficient is the degree of correlation between two influence factors of the mechanism;
B. Build an unweighted directed fan network graph from the fan relationships among the users of the UGC, and divide the fan network graph into communities; build a weighted undirected interest network graph from the reply relationships among the users of the UGC, and divide the interest network graph into communities;
C. Calculate the social influence $U_X$ of user X from the degrees of correlation between the influence factors of the member participation mechanism;
D. According to
$$S_{KX} = \sum_{i=1}^{m} U_X$$
calculate the social influence of keyword K posted by user X, where m is the number of propagations of keyword K at user X; if m = 0, $S_{KX} = 0$;
E. According to the formula
$$Sum_K = \sum_{i=1}^{N} S_{Ki}$$
calculate the comprehensive social influence of keyword K in the UGC;
F. Calculate the sum of the comprehensive social influence of the M keywords in the UGC to obtain the information influence INF of the UGC.
The technical solution of the present invention brings user factors into the calculation of UGC information influence. All users of social networks such as BBS forums, microblogs and Renren are divided into information publishers and information repliers. An unweighted directed fan network graph is built from the fan relationships among the users participating in the UGC. As shown in Fig. 2, user 1 is a fan of user 2, so there is an edge pointing from user 1 to user 2. A weighted undirected interest network graph is built from the reply relationships among the users participating in the UGC. As shown in Fig. 3, user 1 and user 2 have jointly taken part in 7 discussions of information posted by information publishers, so there is an undirected edge of weight 7 between user 1 and user 2.
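The two graphs described above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function names `build_fan_graph` and `build_interest_graph`, the input layout, and the sample data are assumptions, chosen to mirror the Fig. 2 and Fig. 3 examples (user 1 is a fan of user 2; users 1 and 2 share 7 discussions).

```python
from collections import defaultdict
from itertools import combinations


def build_fan_graph(fan_pairs):
    """Unweighted directed graph: one edge from each fan to the user they follow."""
    graph = defaultdict(set)
    for fan, followed in fan_pairs:
        graph[fan].add(followed)
    return graph


def build_interest_graph(discussions):
    """Weighted undirected graph: the weight of edge (u, v) is the number of
    discussions in which both u and v took part."""
    weights = defaultdict(int)
    for participants in discussions:
        # sorted() gives every undirected edge a single canonical key (u < v)
        for u, v in combinations(sorted(set(participants)), 2):
            weights[(u, v)] += 1
    return weights


# Hypothetical sample data: user 1 is a fan of user 2, and users 1 and 2
# appear together in 7 discussions (3 of them also include user 3).
fan_graph = build_fan_graph([(1, 2)])
discussions = [[1, 2, 3]] * 3 + [[1, 2]] * 4
interest_graph = build_interest_graph(discussions)

print(2 in fan_graph[1])       # True: edge from user 1 to user 2
print(interest_graph[(1, 2)])  # 7
```

Storing each undirected edge under a sorted pair keeps the weight lookup unambiguous. Community division itself is left to an existing algorithm, as the description states.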
The fan network and the interest network are divided into communities using existing techniques, and the resulting communities are numbered. Users in the same community share the same community number, and users with the same community number (the same fan network community number or the same interest network community number) have, to some extent, similar values. Community division is prior art and is not described in detail here. Table 1 shows the fan network communities obtained by dividing Fig. 2, and Table 2 shows an example of interest network communities after division. As Table 1 shows, user 1 and user 3 are in the same fan network community; as Table 2 shows, user 1 and user 2 are in the same interest network community.
Table 1
User    Fan network community number f
1       1
2       2
3       1
4       3
5       4
6       5
Table 2
User    Interest network community number r
1       3
2       3
3       1
4       2
5       4
6       5
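Once every user carries a community number, checking whether two users share a community is a plain lookup. The sketch below copies the numbers from Table 1 and Table 2; the dictionary names and the helper `same_community` are illustrative assumptions, not names from the patent.

```python
# Community numbers copied from Table 1 (fan network) and Table 2 (interest network).
FAN_COMMUNITY = {1: 1, 2: 2, 3: 1, 4: 3, 5: 4, 6: 5}
INTEREST_COMMUNITY = {1: 3, 2: 3, 3: 1, 4: 2, 5: 4, 6: 5}


def same_community(membership, u, v):
    """True if both users are known and carry the same community number."""
    return u in membership and v in membership and membership[u] == membership[v]


print(same_community(FAN_COMMUNITY, 1, 3))       # True
print(same_community(INTEREST_COMMUNITY, 1, 2))  # True
print(same_community(FAN_COMMUNITY, 1, 2))       # False
```

Treating an unknown user as belonging to no community is an assumption of this sketch; the description does not say how users missing from a network are handled.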
The technical solution of the present invention constructs a user member participation mechanism, as shown in Fig. 4.
Information quality is the information influence of the UGC in the social network. It reflects the correctness, timeliness, novelty and stability of the UGC's information, as well as service quality.
Perceived value is expressed through the fan relationships between users in the social network. For an information publisher U1, if a fan U2 of U1 replies to this publisher, the factors driving U2 to reply are considered to include not only approval of U1 but also a wish to maintain the interpersonal relationship. In this case it is not only the influence of the information posted by U1 that drives U2's participation; interpersonal factors are mixed in as well. When calculating the information influence of social network UGC, the content of a fan's reply is therefore given a correspondingly lower weight.
Sense of group identity represents the effect of community division and reply evaluation on UGC information influence. Community division is the division of the fan network and the interest network described above: if a replier belongs to the same community (the same fan network community and/or the same interest network community) as the user replied to, the corresponding weight is reduced; otherwise it is increased. If a UGC is of high quality or has the potential to become influential, most users are willing to take part in it, so reply evaluation becomes a key factor in calculating UGC influence. In reply evaluation, the more times a keyword propagates in the UGC, the greater the influence of that UGC.
The path coefficients in the user member participation mechanism of Fig. 4 measure the degree of correlation between two variables. They are denoted $a_1$, $a_2$, $a_3$ and $a_4$, with $0 < a_1 < 1$, $0 < a_2 < 1$, $0 < a_3 < 1$ and $0 < a_4 < 1$.
The social influence $U_X$ of a user X participating in the social network UGC is calculated from the degrees of correlation between the influence factors of the member participation mechanism:
$$U_X = \{1 + \ln[1 + \sum_{i=1}^{b} (C_1 \times C_2)]\} \times f \qquad (1)$$
where b is the number of direct replies user X receives in a UGC, and $C_1$ and $C_2$ are evaluated for each direct replier. If user X receives no direct reply, $U_X = 0$.
$C_1$ indicates whether a replier to user X belongs to the same interest network community as user X. If so, because information quality and sense of group identity have a degree of correlation $a_1$, $C_1 = a_1$; otherwise $C_1 = 1$. $C_1 = 1$ reflects that influence is stronger when the user attracts replies from users in different interest network communities.
$C_2$ indicates whether a replier to user X belongs to the same fan network community as user X. If so, because information quality and perceived value have a degree of correlation $a_2$, and perceived value and sense of group identity have a degree of correlation $a_3$, $C_2 = a_2 \times a_3$; otherwise $C_2 = 1$. $C_2 = 1$ reflects that influence is stronger when the user attracts replies from users in different fan network communities.
$f$ indicates whether user X is a fan of the information publisher of this UGC. If so, because information quality and perceived value have a degree of correlation $a_2$, $f = a_2$; otherwise $f = 1$. If user X is a fan of the information publisher, user X's reply is not only an affirmation of the content but is also partly motivated by maintaining the social relationship, so the weight is reduced.
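Formula (1) can be sketched in Python as below. The function name `social_influence`, its parameters, and the use of predicates to test community membership are assumptions made for illustration; the coefficient values are the ones the description assigns in its embodiment, and the replier identifier in the usage lines is hypothetical.

```python
import math

# Path coefficients; numeric values as assigned in the described embodiment.
A1, A2, A3 = 0.333, 0.824, 0.624


def social_influence(repliers, same_interest, same_fan, is_fan_of_publisher):
    """U_X per formula (1).

    repliers: the direct repliers of user X (b = len(repliers)).
    same_interest(r) / same_fan(r): whether replier r shares X's interest /
        fan network community.
    is_fan_of_publisher: whether X is a fan of the information publisher.
    """
    if not repliers:
        return 0.0  # no direct reply: U_X = 0
    total = 0.0
    for r in repliers:
        c1 = A1 if same_interest(r) else 1.0
        c2 = A2 * A3 if same_fan(r) else 1.0
        total += c1 * c2
    f = A2 if is_fan_of_publisher else 1.0
    return (1.0 + math.log(1.0 + total)) * f


# Hypothetical usage: one replier from outside both communities versus one
# replier from inside both; the outside reply yields the larger U_X.
u_outsider = social_influence(["r1"], lambda r: False, lambda r: False, False)
u_insider = social_influence(["r1"], lambda r: True, lambda r: True, False)
print(u_outsider > u_insider)  # True
```

The comparison at the end shows the weighting described above: a reply from a user outside both communities counts in full, while a reply from inside both is scaled by $a_1 \times a_2 \times a_3$.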
Once the social influence $U_X$ of user X in a UGC has been determined, the social influence of a keyword K posted by user X can be determined:
$$S_{KX} = \sum_{i=1}^{m} U_X \qquad (2)$$
where m is the number of propagations of keyword K at user X, that is, the number of direct repliers to user X whose replies also contain keyword K. If m = 0, $S_{KX} = 0$.
The comprehensive social influence of keyword K in the whole UGC is the sum of the social influence of keyword K over all users of the UGC who post it (including the information publisher and the repliers), that is:
$$Sum_K = \sum_{i=1}^{N} S_{Ki} \qquad (3)$$
where N is the number of users participating in the UGC (the information publisher plus the repliers).
The information influence of the UGC combines the comprehensive social influence of all keywords in the UGC, that is:
$$INF = \sum_{i=1}^{M} \log\left[1 + \frac{Sum_i}{\sum_{j=1}^{M} Sum_j}\right] \qquad (4)$$
where M is the number of keywords the UGC contains.
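Formulas (2) to (4) can be chained in a short Python sketch. The function names, the dictionary layout and the toy thread are assumptions for illustration, and the $U_X$ values are made up rather than computed from formula (1). Two points are interpretive: the base of the logarithm is not stated, and base 10 is used here because it reproduces the value 0.371 reported in method embodiment 1; returning 0 when no keyword propagates is a choice of this sketch, since the description does not cover that case.

```python
import math


def keyword_sums(influence, keywords, direct_repliers):
    """Sum_K for every keyword, per formulas (2) and (3).

    influence: user -> U_X; keywords: user -> set of keywords the user posted;
    direct_repliers: user -> list of users who replied directly to that user.
    """
    sums = {k: 0.0 for posted in keywords.values() for k in posted}
    for user, posted in keywords.items():
        for k in posted:
            # m: direct repliers of `user` whose reply also contains keyword k
            m = sum(1 for r in direct_repliers.get(user, [])
                    if k in keywords.get(r, set()))
            # S_KX = sum_{i=1}^{m} U_X, which equals m * U_X
            sums[k] += m * influence.get(user, 0.0)
    return sums


def information_influence(sums):
    """INF per formula (4), using base-10 logarithms (an inferred choice)."""
    total = sum(sums.values())
    if total == 0:
        return 0.0  # sketch-level choice for the undefined 0/0 case
    return sum(math.log10(1.0 + s / total) for s in sums.values())


# Hypothetical thread: q and r reply to p, s replies to q.
influence = {"p": 2.0, "q": 1.0, "r": 0.0, "s": 0.0}
keywords = {"p": {"x", "y"}, "q": {"x", "y"}, "r": {"x", "z"}, "s": {"x", "y"}}
direct_repliers = {"p": ["q", "r"], "q": ["s"]}

sums = keyword_sums(influence, keywords, direct_repliers)
print(sums["x"], sums["y"], sums["z"])  # 5.0 3.0 0.0
print(round(information_influence(sums), 3))
```

A keyword contributes only through users who posted it themselves and whose direct repliers repeated it, which is why `z` ends with a sum of 0 in this toy thread.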
The technical solution of the present invention is further described below with embodiments.
Method embodiment 1
Fig. 5 is the structure diagram of the UGC in this embodiment. As shown in Fig. 5, the UGC involves 4 users: user 1, user 2, user 3 and user 4. User 1 is the information publisher and posts keywords A, B and C. Users 2 and 3 each reply directly to user 1; user 2 posts keywords A, C and D, and user 3 posts keywords B and F. User 4 replies directly to user 2 and posts keywords C and F. Interest network community numbers are denoted r: $r_1 = 1$, $r_2 = 1$, $r_3 = 2$, $r_4 = 3$. Fan network community numbers are denoted f: $f_1 = 1$, $f_2 = 2$, $f_3 = 1$, $f_4 = 3$. Users 2 and 4 are fans of the information publisher, user 1. In this embodiment the path coefficients between the factors of the member participation mechanism are assigned as $a_1 = 0.333$, $a_2 = 0.824$, $a_3 = 0.624$, $a_4 = 0.437$. Fig. 6 is the flowchart of this embodiment; as shown in Fig. 6, it comprises the following steps:
Step 601: Calculate the social influence of each user in the UGC.
According to the formula
$$U_X = \{1 + \ln[1 + \sum_{i=1}^{b} (C_1 \times C_2)]\} \times f$$
calculate the social influence of each user. The calculation is illustrated for user 1; users 2 to 4 are calculated in the same way and are not described again.
User 1 has 2 direct repliers, so b = 2. User 2 is in the same interest network community as user 1 ($r_2 = r_1 = 1$) but in a different fan network community ($f_2 = 2$, $f_1 = 1$), so $C_1 = a_1 = 0.333$ and $C_2 = 1$. User 3 is in a different interest network community from user 1 ($r_3 = 2$, $r_1 = 1$) but in the same fan network community ($f_3 = f_1 = 1$), so $C_1 = 1$ and $C_2 = a_2 \times a_3 = 0.514$. User 1 is not a fan of itself, so $f = 1$. Therefore
$$U_1 = \{1 + \ln[1 + a_1 \times 1 + 1 \times a_2 \times a_3]\} \times 1 = 1.614$$
Similarly, $U_2 = 1.395$, $U_3 = 0$, $U_4 = 0$.
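The value of $U_2$ is stated without its intermediate steps. Under formula (1) and the community numbers given for this embodiment, it can be reproduced as follows (a worked check, rounded to three decimals):

```latex
% User 2 has one direct replier (user 4), so b = 1.
% r_2 = 1 and r_4 = 3 differ, so C_1 = 1; f_2 = 2 and f_4 = 3 differ, so C_2 = 1.
% User 2 is a fan of the publisher (user 1), so f = a_2 = 0.824.
\begin{aligned}
U_2 &= \{1 + \ln[1 + 1 \times 1]\} \times a_2 \\
    &= (1 + \ln 2) \times 0.824 \\
    &\approx 1.693 \times 0.824 \approx 1.395
\end{aligned}
% Users 3 and 4 receive no direct replies, so U_3 = U_4 = 0.
```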
Step 602: Calculate the comprehensive social influence of each keyword in the UGC.
The calculation is illustrated for keyword C; keywords A, B, D and F are calculated in the same way and are not described again.
The users who post keyword C are user 1, user 2 and user 4. For user 1, keyword C propagates only once (user 2, a direct replier to user 1, posts keyword C), so $S_{C1} = U_1$. For user 2, keyword C propagates only once (user 4, a direct replier to user 2, posts keyword C), so $S_{C2} = U_2$. For user 3 and user 4, keyword C does not propagate (neither has a direct replier who posts keyword C), so $S_{C3} = 0$ and $S_{C4} = 0$. The comprehensive social influence of keyword C is therefore:
$$Sum_C = \sum_{i=1}^{4} S_{Ci} = S_{C1} + S_{C2} = U_1 + U_2 = 3.009$$
Similarly,
$$Sum_A = \sum_{i=1}^{4} S_{Ai} = S_{A1} = U_1 = 1.614; \quad Sum_B = \sum_{i=1}^{4} S_{Bi} = S_{B1} = U_1 = 1.614;$$
$$Sum_D = \sum_{i=1}^{4} S_{Di} = 0; \quad Sum_F = \sum_{i=1}^{4} S_{Fi} = 0.$$
Step 603: Calculate the information influence of the UGC.
According to the formula
$$INF = \sum_{i=1}^{M} \log\left[1 + \frac{Sum_i}{\sum_{j=1}^{M} Sum_j}\right]$$
calculate the information influence of the UGC.
$$INF = 2 \log\left(1 + \frac{U_1}{3U_1 + U_2}\right) + \log\left(1 + \frac{U_1 + U_2}{3U_1 + U_2}\right) = 0.371$$
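The arithmetic behind 0.371 can be checked step by step. The base of the logarithm in formula (4) is not stated; the check below assumes base 10, which is the reading that reproduces the printed value (natural logarithms would give roughly 0.854 instead).

```latex
% Keywords D and F have Sum = 0 and contribute log(1 + 0) = 0.
\begin{aligned}
\sum_{j} Sum_j &= Sum_A + Sum_B + Sum_C = 3U_1 + U_2 = 3(1.614) + 1.395 = 6.237 \\
\frac{Sum_A}{6.237} = \frac{Sum_B}{6.237} &= \frac{1.614}{6.237} \approx 0.2588, \qquad
\frac{Sum_C}{6.237} = \frac{3.009}{6.237} \approx 0.4824 \\
INF &\approx 2\log_{10}(1.2588) + \log_{10}(1.4824) \\
    &\approx 2(0.100) + 0.171 \approx 0.371
\end{aligned}
```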
Method embodiment 2
Method embodiment 1 used a UGC with few participating users to illustrate how the technical solution calculates the information influence of social network UGC. This embodiment further describes the technical solution using the 2012 user and post data of the gossip board of the Tianya forum as an example.
The user data covers 181841 user IDs in total, each with the IDs of the user's fans, the IDs of the posts the user published on this board, and the IDs of the posts the user replied to on this board. The post data covers 43609 posts in total, each with its post ID and, for every floor (reply) in the post, the sequence number, publisher ID and content. By checking whether a post carries the forum administrators' featured mark, 827 posts were selected from the post data as the manually annotated featured post set; the remaining posts form the non-featured post set. Because the data volume is large, 9173 posts drawn at random from the non-featured set were mixed with the 827 manually annotated featured posts to form a sample of 10000 posts, and this sample was used to compare, evaluate and analyse S-IDPM, IDM and IDPM. In addition, a clustering method based on statistical information was used to annotate posts by machine, giving a machine-annotated featured post set, on which S-IDPM, IDM and IDPM were likewise compared and analysed.
Table 3 gives the comparison results of the three methods. Because the number of posts is very large, only the top 5 posts by rank are listed.
Table 3
Figure BDA0000482980380000102
Table 3 shows that IDM and S-IDPM differ mainly on posts 2894103 and 2366245. Inspecting the corresponding corpus shows that post 2894103 is an advertisement-solicitation post: the poster published an advertisement template, and all users had to reply in the fixed format. Because IDM mainly uses co-occurring words to calculate influence, this post's influence under IDM is very high. However, the title shows that the post attracts a group of users who like cars, and the computed user interest network also shows that many of these users had jointly replied to certain posts before. They therefore share the same interest network community number, which indicates that the post only spread within one small circle. Its rank under S-IDPM is consequently not very high, and it does not enter the top 5. Post 2366245, by contrast, attracted wide attention and many replies, with 1476757 floors of replies. The users in this post have no obvious large-scale fan or interest network, and the user circles are relatively dispersed, which shows that the post drew wide attention from many different user groups in the community. Its influence rank under S-IDPM is therefore higher.
IDPM and S-IDPM differ mainly on posts 2510082 and 2713599. Inspecting the corresponding corpus shows that the author of post 2713599 has the user name "I am Japanese car owner", which closely resembles the post's title. The user's page shows no fans and no followed users, no replies to any post, and only this one published post. All of this indicates that the account is a sock puppet with no social or interest network relationship with anyone. This post drew 14944 floors of replies in total. Although post 2510082 drew 28211 floors of replies, more than post 2713599, the fan and interest networks of the users replying to post 2713599 are more dispersed, so its influence rank under S-IDPM is higher.
The qualitative analysis above shows that S-IDPM can, to some extent, solve problems that IDM and IDPM do not consider.
Next, the performance of the three methods in calculating post influence is analysed quantitatively.
First, taking the manually annotated featured posts (marked by forum administrators) as the standard, this embodiment compares the cumulative coverage rate of featured posts for IDM, IDPM and S-IDPM, as shown in Fig. 7. Fig. 7 shows that S-IDPM reaches a cumulative coverage rate of about 70% fastest, within the top 3000 posts. That is, in the S-IDPM influence ranking the top 30% of posts cover 70% of the featured posts, and at the top 10% and 20% its coverage is also higher than that of IDM and IDPM. S-IDPM therefore gives better post influence results, which agree more closely with the manual annotation.
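The cumulative coverage rate used in this comparison can be read as the fraction of all featured posts that appear within the top n posts of a model's influence ranking. A minimal sketch under that reading follows; the function name `cumulative_coverage` and the sample ranking are hypothetical.

```python
def cumulative_coverage(ranked_post_ids, featured_ids, cutoffs):
    """For each cutoff n, the fraction of featured posts found in the top n
    positions of the ranking (highest influence first)."""
    featured = set(featured_ids)
    if not featured:
        raise ValueError("featured_ids must not be empty")
    wanted = set(cutoffs)
    coverage = {}
    hits = 0
    for position, post_id in enumerate(ranked_post_ids, start=1):
        if post_id in featured:
            hits += 1
        if position in wanted:
            coverage[position] = hits / len(featured)
    return coverage


# Hypothetical ranking of six posts, three of which are featured.
ranking = ["p4", "p1", "p7", "p2", "p9", "p3"]
featured = {"p1", "p2", "p3"}
coverage = cumulative_coverage(ranking, featured, [2, 4, 6])
print(coverage[6])  # 1.0: every featured post is covered by the full ranking
```

A model whose curve rises faster places more featured posts near the top of its ranking, which is the property Fig. 7 compares across the three models.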
Next, a clustering-based opinion-leader discovery algorithm from the prior art is applied to the post influence analysis of the present invention, yielding an algorithm that finds featured posts by clustering statistical information.
The following are selected as feature values of a post: the number of floors F, the duration T, the number of replies P, the number of floors per hour F/T, the average number of words per floor W/F, and the difference D between the number of replies from users other than the original poster and the number of replies from the original poster. N denotes the number of members of a class. Using a method for choosing the number of classes and a clustering algorithm (both prior art), 8 classes are obtained, as shown in Fig. 8.
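The feature values listed above could be assembled per post as in the sketch below. The class `PostStats`, its field names, the choice of hours as the unit of T, the reading of W as the total word count, and the zero-division guards are all assumptions for illustration; the clustering step itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class PostStats:
    floors: int            # F: number of floors in the post
    duration_hours: float  # T: how long the post stayed active (assumed hours)
    replies: int           # P: number of replies
    words: int             # W: total word count (assumed meaning of W)
    op_replies: int        # replies written by the original poster
    non_op_replies: int    # replies written by other users

    def features(self):
        """Return (F, T, P, F/T, W/F, D) as a tuple for a clustering algorithm."""
        floors_per_hour = self.floors / self.duration_hours if self.duration_hours else 0.0
        words_per_floor = self.words / self.floors if self.floors else 0.0
        d = self.non_op_replies - self.op_replies
        return (self.floors, self.duration_hours, self.replies,
                floors_per_hour, words_per_floor, d)


# Hypothetical post statistics.
stats = PostStats(floors=200, duration_hours=50.0, replies=180,
                  words=6000, op_replies=30, non_op_replies=150)
print(stats.features())  # (200, 50.0, 180, 4.0, 30.0, 120)
```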
The selection condition in the clustering-based opinion-leader discovery algorithm is adjusted as follows: the members of classes that have few members and a large mean feature value are taken as machine-annotated featured posts. The members of classes 5 and 7 are therefore used as the featured posts in the following experiment. Classes 5 and 7 contain 1001 members in total, of which only 291 posts coincide with the 827 featured posts marked by forum administrators, so this experiment differs from the one shown in Fig. 7. These 1001 featured posts are then used to compare the cumulative coverage rate of featured posts for IDM, IDPM and S-IDPM, as shown in Fig. 9:
Fig. 9 shows that the cumulative coverage curve of S-IDPM remains at a mid-range level throughout and exceeds 85% within the top 2000 posts. This indicates that S-IDPM still calculates post influence well when the featured posts are annotated by machine from statistical information.
Finally, the featured-post accuracy of the three algorithms under manual annotation and under machine annotation is compared, as shown in Table 4. In both cases the featured-post accuracy of S-IDPM is higher than that of the other two models.
Table 4
Model     Pt 0     Pt 1
IDM       28.1%    68.1%
IDPM      30.2%    67.3%
S-IDPM    32.4%    68.4%
Taken together, the cumulative coverage experiments and the featured-post accuracy experiments under manual and machine annotation show that S-IDPM calculates post influence more accurately than the IDM and IDPM methods.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (5)

1. A method for improving the accuracy of the information influence of social-network user-generated content, applied to social-network user-generated content (UGC), the UGC comprising M keywords and a total of N users participating in the UGC, characterized in that the method comprises:
A. establishing a member participation mechanism for the social-network UGC and determining the path coefficients between the influence factors of the member participation mechanism, each path coefficient being the degree of correlation between influence factors of the member participation mechanism;
B. constructing an unweighted directed graph of the follower network from the follower relationships of the users of the UGC, and performing community division on the unweighted directed graph of the follower network; constructing a weighted undirected graph of the interest network from the reply relationships of the users of the UGC, and performing community division on the weighted undirected graph of the interest network;
C. calculating the social influence U_X of user X from the degrees of correlation between the influence factors of the member participation mechanism;
D. according to
Figure FDA0000482980370000011
calculating the social influence of user X's publication of keyword K, where m is the number of times keyword K is propagated on user X, and S_KX = 0 if m = 0;
E. calculating, according to the formula, the comprehensive social influence of keyword K in the UGC;
F. calculating the sum of the comprehensive social influences of the M keywords in the UGC to obtain the information influence INF of the UGC.
2. The method according to claim 1, characterized in that the member participation mechanism comprises four influence factors: information quality, sense of group identity, perceived value, and participation; the path coefficient between information quality and sense of group identity is a_1, the path coefficient between information quality and perceived value is a_2, the path coefficient between perceived value and sense of group identity is a_3, and the path coefficient between participation and sense of group identity is a_4.
3. The method according to claim 1 or 2, characterized in that step C further comprises:
according to the formula
Figure FDA0000482980370000021
calculating the social influence of user X in the UGC,
where b is the number of times user X is directly replied to in the UGC; if user X has no direct replier, U_X = 0;
if user X and its direct replier belong to the same interest-network community, C_1 = a_1; otherwise C_1 = 1;
if user X and its direct replier belong to the same follower-network community, C_2 = a_2 × a_3; otherwise C_2 = 1;
if user X is a follower of the publisher of the UGC information, f = a_2; otherwise f = 1.
4. The method according to claim 1, characterized in that step F further comprises:
according to the formula
Figure FDA0000482980370000022
obtaining the information influence INF of the UGC.
5. The method according to claim 2, characterized in that the path coefficient between information quality and sense of group identity is a_1 = 0.333, the path coefficient between information quality and perceived value is a_2 = 0.824, the path coefficient between perceived value and sense of group identity is a_3 = 0.624, and the path coefficient between participation and sense of group identity is a_4 = 0.437.
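As an illustration of the coefficient rules in claim 3 and the values in claim 5, the selection of C_1, C_2 and f can be sketched as below, together with the summation described in step F of claim 1. This is a sketch under stated assumptions: the function names and boolean arguments are hypothetical, the formulas that combine these coefficients into U_X and S_KX are not reproduced in the text and are therefore not implemented, and the formula referenced in claim 4 may refine the plain sum used here.

```python
# Path coefficients as given in claim 5 (a_4 is not used by the rules below).
A1 = 0.333  # information quality <-> sense of group identity
A2 = 0.824  # information quality <-> perceived value
A3 = 0.624  # perceived value <-> sense of group identity


def reply_coefficients(
    same_interest_community: bool,
    same_follower_community: bool,
    is_follower_of_publisher: bool,
) -> tuple[float, float, float]:
    """Return (C_1, C_2, f) for user X and one direct replier."""
    # C_1 = a_1 when both users are in the same interest-network community.
    c1 = A1 if same_interest_community else 1.0
    # C_2 = a_2 * a_3 when both users are in the same follower-network community.
    c2 = A2 * A3 if same_follower_community else 1.0
    # f = a_2 when user X follows the publisher of the UGC.
    f = A2 if is_follower_of_publisher else 1.0
    return c1, c2, f


def information_influence(keyword_influences: list[float]) -> float:
    """Step F as stated in claim 1: sum the per-keyword influences to get INF."""
    return sum(keyword_influences)
```

The community-membership flags would come from the community division of the two graphs in step B; how that division is computed is outside this sketch.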
CN201410119194.7A 2014-03-27 2014-03-27 Method for improving accuracy of influence of user generate content (UGC) information of social network Active CN103902690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410119194.7A CN103902690B (en) 2014-03-27 2014-03-27 Method for improving accuracy of influence of user generate content (UGC) information of social network

Publications (2)

Publication Number Publication Date
CN103902690A true CN103902690A (en) 2014-07-02
CN103902690B CN103902690B (en) 2017-03-22

Family

ID=50994012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410119194.7A Active CN103902690B (en) 2014-03-27 2014-03-27 Method for improving accuracy of influence of user generate content (UGC) information of social network

Country Status (1)

Country Link
CN (1) CN103902690B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101770487A (en) * 2008-12-26 2010-07-07 聚友空间网络技术有限公司 Method and system for calculating user influence in social network
CN102893275A (en) * 2010-05-14 2013-01-23 微软公司 Automated social networking graph mining and visualization
CN103617279A (en) * 2013-12-09 2014-03-05 南京邮电大学 Method for achieving microblog information spreading influence assessment model on basis of Pagerank method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BOYUAN WANG: "Evaluating quality of Web2.0 UGC based on user authority and topic distribution", 《WIRELESS PERSONAL MULTIMEDIA COMMUNICATIONS(WPMC),2013 16TH INTERNATIONAL SYMPOSIUM ON》 *
王连喜等: "微博用户关系挖掘研究综述", 《情报杂志》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106779793A (en) * 2015-11-23 2017-05-31 财团法人资讯工业策进会 Adaptive community fusion and marketing optimization system and method
CN107993156A (en) * 2017-11-28 2018-05-04 中山大学 A kind of community discovery method based on social networks digraph
CN107993156B (en) * 2017-11-28 2021-06-22 中山大学 Social network directed graph-based community discovery method
CN108052568A (en) * 2017-12-07 2018-05-18 百度在线网络技术(北京)有限公司 A kind of Feature Selection method, apparatus, terminal and medium
CN109657105A (en) * 2018-12-25 2019-04-19 杭州铭智云教育科技有限公司 A method of obtaining target user

Also Published As

Publication number Publication date
CN103902690B (en) 2017-03-22

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant