CN103136331A - Micro blog network opinion leader identification method - Google Patents


Info

Publication number
CN103136331A
Authority
CN
China
Prior art keywords
node
weights
calculating
network
follower
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013100278084A
Other languages
Chinese (zh)
Inventor
蔡霖
蔡皖东
彭冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN2013100278084A
Publication of CN103136331A
Legal status: Pending

Abstract

The invention discloses a microblog network opinion leader identification method that addresses the poor recall rate of existing opinion leader identification methods. The technical scheme comprises: using a web crawler tool to store network topology information collected from the internet in a database; constructing a directed network graph G=(V, E); calculating an effective follower set Ef(u); calculating a node weight IRL(u_i) produced by the link relationship; calculating a node weight IRTR(u_i) produced by the node interaction relationship; calculating a comprehensive node weight IR(u_i); and calculating the comprehensive weights of all nodes in the network graph, sorting them in descending order, and selecting the n nodes with the largest comprehensive weights as opinion leader candidates. Because the number of followers a node has, the node link relationship, and the interaction relationship are all considered when node weights are calculated, both the recall rate and the accuracy are improved. In testing, the recall rate improved from 81.7-88.5% for the background art to 89.3% or more, and the accuracy improved from 84.7-90.4% for the background art to 91.7% or more.

Description

Microblog network opinion leader identification method
Technical field
The present invention relates to an identification method, and in particular to a method for identifying opinion leaders in a microblog network.
Background art
With the development of Web 2.0 technology, new kinds of network applications such as social networks and microblog networks have appeared on the internet, and they show growing value and influence in information dissemination and in carrying interpersonal relationships.
Social networks extend real-world social networks onto the internet and are intended to help people build and maintain their networks of personal relationships more effectively. Unlike websites characterized by aggregating information, social networks are characterized by aggregating people. Through a social network, people can build and maintain their own circle of friends, which makes it a new mode of personal social interaction and a platform for information exchange. Its word-of-mouth propagation among friends accelerates the spread of information.
A microblog network (Micro-Blogging Network) is also a kind of social network. Users can publish messages of up to 140 characters through multiple channels such as browsers, mobile phones, and instant messaging software. This immediate, fragmented, and aggregative style of information dissemination is popular with users, and the number of registered users of Sina Weibo in China has exceeded 300 million.
Opinion leaders play an important role in the spread of information over a network. Under their guidance and influence, local opinions evolve into network public opinion. Opinion leaders (also rendered as leaders of opinion) are the "activists" in an interpersonal communication network who frequently provide information to others and exert influence on them. They act as important intermediaries or filters in the formation of mass communication effects: they pass information on to the audience, forming a two-step flow of communication. As the influence of network public opinion keeps growing, research on network opinion leaders continues to deepen.
Statistics show that most users in a network do not take part in producing and spreading information, and that the decisions they make often follow opinion leaders. Identifying network opinion leaders effectively, and having them publish guiding information that influences the users of their network instead of persuading those users directly, can effectively mobilize the influence of an entire network or society. This is of practical significance for promoting information dissemination and improving demonstration effects.
A microblog network is a kind of complex network. In China and abroad, complex network theory is usually used to model and analyze such networks in order to reveal their intrinsic characteristics and laws. According to complex network theory, a microblog network can be abstracted as a directed network graph: each user is a node, and the relationships between users form the edges between nodes. Because users differ in the number of friends and followers they have, nodes have different weights. The larger a node's weight, the greater its influence and the more likely it is to be an opinion leader. The opinion leader identification problem therefore reduces to the problem of computing node weights: build a directed network graph model, analyze the structural relationships between nodes, and compute the weight of each node.
Document 1, "Discovering Important Bloggers based on Analyzing Blog Threads [WWW, Chiba, Japan, May 10-14, 2005]", proposes ThreadRank, a method for analyzing important blog users based on analysis of post content. The method judges a user's importance by analyzing a large amount of blog content. It spends a great deal of time on content cleaning and analysis, so its efficiency is low.
Document 2, "Identifying Opinion Leaders in the Blogosphere [CIKM, pp. 971-974, 2007]", proposes InfluenceRank, an opinion leader identification method. The method judges a user's importance by comparison with other blogs, computes user weights from the contribution these users make to the whole network, and uses a cosine measure to compute the similarity of different blog entries. Its complexity is high and its overhead is large.
Document 3, "TwitterRank: Finding topic-sensitive Influential Twitterers [WSDM, 2010]", proposes TwitterRank, a node weight computation method for the Twitter network. The method computes weights according to the relationships between users in Twitter, the distribution of followers and followees, and the roles that different user groups play in information dissemination. The algorithm is mainly topic-based, and its recall rate is not high.
Summary of the invention
To overcome the poor recall rate of existing opinion leader identification methods, the invention provides a microblog network opinion leader identification method. The method uses node weights to identify the influence and importance of nodes: the larger a node's weight, the more likely the node is an opinion leader. When node weights are computed, the number of followers a node has, the node link relationship, and the interaction relationship are all considered, which improves both the recall rate and the accuracy.
The technical solution adopted by the invention to solve its technical problem is a microblog network opinion leader identification method characterized by comprising the following steps:
Step 1: Use a web crawler tool to collect actual microblog network data from the internet, extract the network topology information from it, and store it in a database for later processing.
Step 2: Construct the microblog directed network graph
G=(E,V)
where E denotes the set of node relationships and V denotes the set of nodes.
Step 3: Calculate the effective follower set Ef(u)
Ef(u)={v|v∈Follower(u)∧Response(u)>δ}
where δ is a non-negative constant threshold on the degree of feedback that a follower node v of node u gives to node u. Only a follower of node u whose feedback exceeds this threshold counts as an effective follower.
Step 4: Calculate the node weight IRL(u_i) produced by the link relationship
$$IRL(u_i) = \frac{\sigma}{N} + (1-\sigma) \sum_{u_j \in Follower(u_i)} \frac{IRL(u_j)}{L(u_j)}$$
where Follower(u_i) is the set of all followers of node u_i, L(u_j) is the number of followers of node u_j, σ is a damping coefficient between 0 and 1, and N is the total number of nodes in the network graph.
Step 5: Calculate the node weight IRTR(u_i) produced by the node interaction relationship
$$IRTR(u_i) = \sum_{t_j \in Tweet(u_i)} \sum_{u_j \in Response(t_j)} \frac{|N_s(u_j) - N_\mu(u_j)|}{|A|}$$
where Tweet(u_i) is the set of posts of node u_i, A denotes the set of all posts that have interactions, |A| is the size of A, N_s(u_j) is the number of responses of node u_j to post t_j, and N_μ(u_j) is the mean number of responses. Response covers a user's reposts, replies, comments, and favorites.
Step 6: Calculate the comprehensive node weight IR(u_i)
IR(u_i)=(1-β)×IRL(u_i)+β×IRTR(u_i)
where the parameter β (β ∈ [0,1]) determines the relative standing of the link relationship and the node interaction relationship in the node weight calculation. When β is small, the node weight is determined mainly by the link relationship. In particular, when β=0 the weight is computed entirely from the link relationship.
Step 7: Calculate the comprehensive weights of all nodes in the network graph, sort the nodes in descending order of comprehensive weight, and select the n nodes with the largest comprehensive weights as opinion leader candidates.
The beneficial effects of the invention are as follows. Node weights are used to identify the influence and importance of nodes: the larger a node's weight, the more likely the node is an opinion leader. Because the number of followers a node has, the node link relationship, and the interaction relationship are all considered when node weights are computed, both the recall rate and the accuracy are improved. In testing, the recall rate rose from 81.7~88.5% for the background art to 89.3% or more, and the accuracy rose from 84.7~90.4% for the background art to 91.7% or more.
The invention is described in detail below with reference to the drawing and an embodiment.
Description of the drawings
Fig. 1 is a flowchart of the microblog network opinion leader identification method of the invention.
Embodiment
With reference to Fig. 1, the specific steps of the microblog network opinion leader identification method of the invention are as follows:
1. Obtain the microblog network data: use a web crawler tool to collect actual microblog network data from the internet, extract network topology information such as nodes and connections from it, and store it in a database for later processing.
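The storage part of this step can be sketched as follows. This is a minimal illustration only: the source names neither the crawler nor the database, so SQLite, the table name `follows`, and the function `store_topology` are all assumptions, and the crawler itself is left out.

```python
import sqlite3

def store_topology(conn, follow_edges):
    """Store (follower, followee) pairs; the schema is hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS follows ("
        "follower TEXT NOT NULL, followee TEXT NOT NULL, "
        "PRIMARY KEY (follower, followee))"
    )
    # INSERT OR IGNORE drops edges the crawler has already seen.
    conn.executemany(
        "INSERT OR IGNORE INTO follows (follower, followee) VALUES (?, ?)",
        follow_edges,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database for the example
store_topology(conn, [("b", "a"), ("c", "a"), ("b", "a")])
edge_count = conn.execute("SELECT COUNT(*) FROM follows").fetchone()[0]
```

The composite primary key keeps the edge list free of duplicates, which matters when a crawler revisits the same user page.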
2. Build the microblog directed network graph:
G=(E,V)
where E denotes the set of node relationships and V denotes the set of nodes.
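One possible in-memory form of G is an adjacency map from each user to the set of users who follow them. The sketch below assumes the stored topology is a list of (follower, followee) pairs; the name `build_graph` and this representation are illustrative choices, not taken from the source.

```python
from collections import defaultdict

def build_graph(follow_edges):
    """Return (V, followers) where followers[u] is the set of users following u."""
    nodes = set()
    followers = defaultdict(set)
    for follower, followee in follow_edges:
        nodes.update((follower, followee))
        followers[followee].add(follower)
    return nodes, dict(followers)

nodes, followers = build_graph([("b", "a"), ("c", "a"), ("a", "b")])
```

Indexing by followee makes the later lookups of Follower(u) a single dictionary access.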
3. Calculate the effective follower set Ef(u):
Ef(u)={v|v∈Follower(u)∧Response(u)>δ} (1)
where δ is a non-negative constant threshold on the degree of feedback that a follower node v of node u gives to node u. Only a follower of node u whose feedback exceeds this threshold counts as an effective follower.
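Formula (1) can be read as a simple filter over the follower set. In the sketch below, the feedback degree is assumed to be a count of responses from v to u, held in a hypothetical dictionary `response_counts` keyed by (v, u); the source does not say how the feedback degree is measured.

```python
def effective_followers(u, followers, response_counts, delta=0):
    """Ef(u): followers of u whose feedback to u exceeds the threshold delta.

    response_counts[(v, u)] is an assumed measure of how often v responded to u.
    """
    return {
        v
        for v in followers.get(u, set())
        if response_counts.get((v, u), 0) > delta
    }

followers = {"a": {"b", "c", "d"}}
response_counts = {("b", "a"): 5, ("c", "a"): 1}
ef_a = effective_followers("a", followers, response_counts, delta=1)
```

With δ = 1, follower c (one response) and follower d (no responses) are dropped, so only b remains.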
4. Calculate the node weight IRL(u_i) produced by the link relationship:
$$IRL(u_i) = \frac{\sigma}{N} + (1-\sigma) \sum_{u_j \in Follower(u_i)} \frac{IRL(u_j)}{L(u_j)} \tag{2}$$
where Follower(u_i) is the set of all followers of node u_i, L(u_j) is the number of followers of node u_j, σ is a damping coefficient between 0 and 1, and N is the total number of nodes in the network graph.
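Formula (2) defines IRL recursively, so some solution procedure is needed. The source gives only the equation; the sketch below assumes a fixed-point iteration in the style of PageRank, starting from a uniform weight of 1/N, and assumes that followers with L(u_j) = 0 contribute nothing (the source does not address that case). The function name and defaults are illustrative.

```python
def link_weights(nodes, followers, sigma=0.15, iterations=50):
    """Iterate formula (2) a fixed number of times (assumed solution method)."""
    n = len(nodes)
    # L(u): number of followers of u, as defined in the text.
    follower_count = {u: len(followers.get(u, ())) for u in nodes}
    irl = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        updated = {}
        for u in nodes:
            total = sum(
                irl[v] / follower_count[v]
                for v in followers.get(u, ())
                if follower_count[v] > 0  # assumption: skip zero denominators
            )
            updated[u] = sigma / n + (1 - sigma) * total
        irl = updated
    return irl

# Two users who follow each other end up with equal weight.
irl = link_weights({"a", "b"}, {"a": {"b"}, "b": {"a"}})
```

For a node with no followers the sum is empty and the weight is exactly σ/N, which is the floor value the formula assigns to every node.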
5. Calculate the node weight IRTR(u_i) produced by the node interaction relationship:
$$IRTR(u_i) = \sum_{t_j \in Tweet(u_i)} \sum_{u_j \in Response(t_j)} \frac{|N_s(u_j) - N_\mu(u_j)|}{|A|} \tag{3}$$
where Tweet(u_i) is the set of posts of node u_i, A denotes the set of all posts that have interactions, |A| is the size of A, N_s(u_j) is the number of responses of node u_j to post t_j, and N_μ(u_j) is the mean number of responses. Response covers a user's reposts, replies, comments, and favorites.
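A rough sketch of formula (3) follows. The source does not say over which posts the mean N_μ(u_j) is taken; here it is assumed to be the responder's average response count over the posts in A that they responded to. The data layout (`posts_by_author`, `responses[post][user] = count`) and the function name are hypothetical.

```python
def interaction_weights(posts_by_author, responses):
    """IRTR per author; responses[post][user] is that user's response count."""
    active = {t for t, by_user in responses.items() if by_user}  # the set A
    if not active:
        return {author: 0.0 for author in posts_by_author}
    totals, counts = {}, {}
    for t in active:
        for user, n in responses[t].items():
            totals[user] = totals.get(user, 0) + n
            counts[user] = counts.get(user, 0) + 1
    # Assumed definition of N_mu: mean over the posts the user responded to.
    mean = {user: totals[user] / counts[user] for user in totals}
    irtr = {}
    for author, posts in posts_by_author.items():
        deviation = sum(
            abs(n - mean[user])
            for t in posts
            for user, n in responses.get(t, {}).items()
        )
        irtr[author] = deviation / len(active)
    return irtr

responses = {"t1": {"b": 3, "c": 1}, "t2": {"b": 1}}
irtr = interaction_weights({"a": ["t1", "t2", "t3"]}, responses)
```

In this example A = {t1, t2}, the means are 2 for b and 1 for c, and the deviations |3-2| + |1-1| + |1-2| sum to 2, giving IRTR(a) = 2/2 = 1.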
6. Calculate the comprehensive node weight IR(u_i):
IR(u_i)=(1-β)×IRL(u_i)+β×IRTR(u_i) (4)
where the parameter β (β ∈ [0,1]) determines the relative standing of the link relationship and the node interaction relationship in the node weight calculation. When β is small, the node weight is determined mainly by the link relationship. In particular, when β=0 the weight is computed entirely from the link relationship.
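Formula (4) is a convex combination of the two weights. The sketch below applies it per node; treating a node without an interaction weight as having IRTR = 0 is an assumption, as is the function name.

```python
def combined_weights(irl, irtr, beta=0.5):
    """IR(u) = (1 - beta) * IRL(u) + beta * IRTR(u) for every node in irl."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    # Assumption: nodes absent from irtr have no interactions, so IRTR = 0.
    return {u: (1 - beta) * irl[u] + beta * irtr.get(u, 0.0) for u in irl}

ir = combined_weights({"a": 0.4, "b": 0.1}, {"a": 0.8}, beta=0.25)
```

For node a this gives 0.75 × 0.4 + 0.25 × 0.8 = 0.5. The source does not mention rescaling, so if IRL and IRTR differ greatly in magnitude, β would have to be chosen with that in mind.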
7. Calculate the comprehensive weights of all nodes in the network graph, sort the nodes in descending order of comprehensive weight, and select the n nodes with the largest comprehensive weights as opinion leader candidates.
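The selection in this step can be sketched with a partial sort. The function name `top_candidates` is illustrative; `heapq.nlargest` is used here as one way to avoid sorting the full node list when n is small.

```python
import heapq

def top_candidates(ir, n):
    """Return the n (node, weight) pairs with the largest comprehensive weights."""
    return heapq.nlargest(n, ir.items(), key=lambda item: item[1])

candidates = top_candidates({"a": 0.5, "b": 0.2, "c": 0.9}, 2)
```

The result is ordered from the largest weight downward, so the first entries are the strongest opinion leader candidates.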
The invention addresses the shortcomings of existing methods in two respects: computational efficiency and accuracy. First, by defining the effective follower set, nodes with no followers or only a few followers are excluded. Such nodes are very unlikely to be opinion leaders, because an opinion leader or high-weight node must have a large number of followers. Excluding them significantly reduces the size of the network graph, which improves computational efficiency. Second, the node weight calculation considers not only the link relationship produced by followers but also the node interaction relationship produced by publishing, reposting, replying to, and favoriting posts, which improves the accuracy of the calculation.
The results of a comparative experiment between the invention and existing methods are shown in Table 1.
Table 1. Recall rate, accuracy, and average per-node processing time of the node weight calculation methods
Computing method | Out-degree | In-degree/out-degree combination | Document 1 | Document 2 | Document 3 | The present invention
Recall rate | 57.3% | 65.4% | 82.2% | 81.7% | 88.5% | 89.3%
Accuracy | 62.2% | 67.3% | 86.1% | 84.7% | 90.4% | 91.7%
Time/node | 0.14 min | 0.23 min | 3.37 min | 2.81 min | 2.76 min | 0.31 min
The benchmark for this experiment is the processing of 100,000 microblog network nodes. As Table 1 shows, the methods based on out-degree and on the in-degree/out-degree combination are computationally efficient, but their accuracy and recall rate are very low. The methods proposed in Documents 1, 2, and 3 have higher accuracy and recall rate, but their computational efficiency is low. The invention achieves both high computational efficiency and high accuracy and recall rate.
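The source does not state how the recall rate and accuracy in Table 1 were measured. A minimal sketch, assuming the usual set-based definitions against a hand-labeled reference set of opinion leaders, and assuming the reported accuracy corresponds to precision over the selected candidates:

```python
def recall_and_precision(candidates, reference):
    """Compare selected candidates with a labeled reference set (assumed setup)."""
    candidates, reference = set(candidates), set(reference)
    hits = len(candidates & reference)
    recall = hits / len(reference) if reference else 0.0
    precision = hits / len(candidates) if candidates else 0.0
    return recall, precision

recall, precision = recall_and_precision({"a", "b", "c", "d"}, {"a", "b", "e"})
```

Here two of the three reference leaders are found (recall 2/3) and two of the four candidates are correct (precision 1/2).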

Claims (1)

1. A microblog network opinion leader identification method, characterized by comprising the following steps:
Step 1: use a web crawler tool to collect actual microblog network data from the internet, extract the network topology information from it, and store it in a database for later processing;
Step 2: construct the microblog directed network graph
G=(E,V)
where E denotes the set of node relationships and V denotes the set of nodes;
Step 3: calculate the effective follower set Ef(u)
Ef(u)={v|v∈Follower(u)∧Response(u)>δ}
where δ is a non-negative constant threshold on the degree of feedback that a follower node v of node u gives to node u, and only a follower of node u whose feedback exceeds this threshold counts as an effective follower;
Step 4: calculate the node weight IRL(u_i) produced by the link relationship
$$IRL(u_i) = \frac{\sigma}{N} + (1-\sigma) \sum_{u_j \in Follower(u_i)} \frac{IRL(u_j)}{L(u_j)}$$
where Follower(u_i) is the set of all followers of node u_i, L(u_j) is the number of followers of node u_j, σ is a damping coefficient between 0 and 1, and N is the total number of nodes in the network graph;
Step 5: calculate the node weight IRTR(u_i) produced by the node interaction relationship
$$IRTR(u_i) = \sum_{t_j \in Tweet(u_i)} \sum_{u_j \in Response(t_j)} \frac{|N_s(u_j) - N_\mu(u_j)|}{|A|}$$
where Tweet(u_i) is the set of posts of node u_i, A denotes the set of all posts that have interactions, |A| is the size of A, N_s(u_j) is the number of responses of node u_j to post t_j, N_μ(u_j) is the mean number of responses, and Response covers a user's reposts, replies, comments, and favorites;
Step 6: calculate the comprehensive node weight IR(u_i)
IR(u_i)=(1-β)×IRL(u_i)+β×IRTR(u_i)
where the parameter β (β ∈ [0,1]) determines the relative standing of the link relationship and the node interaction relationship in the node weight calculation; when β is small, the node weight is determined mainly by the link relationship, and in particular, when β=0 the weight is computed entirely from the link relationship;
Step 7: calculate the comprehensive weights of all nodes in the network graph, sort the nodes in descending order of comprehensive weight, and select the n nodes with the largest comprehensive weights as opinion leader candidates.
CN2013100278084A 2013-01-18 2013-01-18 Micro blog network opinion leader identification method Pending CN103136331A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013100278084A CN103136331A (en) 2013-01-18 2013-01-18 Micro blog network opinion leader identification method

Publications (1)

Publication Number Publication Date
CN103136331A true CN103136331A (en) 2013-06-05

Family

ID=48496157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013100278084A Pending CN103136331A (en) 2013-01-18 2013-01-18 Micro blog network opinion leader identification method

Country Status (1)

Country Link
CN (1) CN103136331A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105723402A (en) * 2013-10-25 2016-06-29 西斯摩斯公司 Systems and methods for determining influencers in a social data network
CN105959368A (en) * 2016-04-29 2016-09-21 成都信息工程大学 Social cloud hot spot resource prediction and disposition method
CN106055627A (en) * 2016-05-27 2016-10-26 西安电子科技大学 Recognition method of key nodes of social network in topic field
CN107633260A (en) * 2017-08-23 2018-01-26 上海师范大学 A kind of social network opinion leader method for digging based on cluster
CN107729455A (en) * 2017-09-25 2018-02-23 山东科技大学 A kind of social network opinion leader sort algorithm based on multidimensional characteristic analysis
CN108280121A (en) * 2017-12-06 2018-07-13 上海师范大学 A method of social network opinion leader is obtained based on K- nuclear decomposition
CN108335008A (en) * 2017-12-13 2018-07-27 腾讯科技(深圳)有限公司 Web information processing method and device, storage medium and electronic device
CN110134877A (en) * 2019-05-15 2019-08-16 天津大学 Move down the line the method and apparatus that seed user is excavated in social networks
CN110287442A (en) * 2019-06-28 2019-09-27 秒针信息技术有限公司 A kind of determination method, apparatus, electronic equipment and the storage medium of influence power ranking
CN110717085A (en) * 2019-10-12 2020-01-21 浙江工商大学 Opinion leader identification method based on virtual brand community
CN112667876A (en) * 2020-12-24 2021-04-16 湖北第二师范学院 Opinion leader group identification method based on PSOTVCF-Kmeans algorithm

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102214212A (en) * 2011-05-20 2011-10-12 西北工业大学 Method for ordering microblog network node weights based on multi-link
CN102662956A (en) * 2012-03-05 2012-09-12 西北工业大学 Method for identifying opinion leaders in social network based on topic link behaviors of users

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105723402A (en) * 2013-10-25 2016-06-29 西斯摩斯公司 Systems and methods for determining influencers in a social data network
CN105959368B (en) * 2016-04-29 2019-04-02 成都信息工程大学 A kind of method of social activity cloud hot point resource prediction and deployment
CN105959368A (en) * 2016-04-29 2016-09-21 成都信息工程大学 Social cloud hot spot resource prediction and disposition method
CN106055627A (en) * 2016-05-27 2016-10-26 西安电子科技大学 Recognition method of key nodes of social network in topic field
CN106055627B (en) * 2016-05-27 2019-06-18 西安电子科技大学 The recognition methods of social networks key node in topic field
CN107633260B (en) * 2017-08-23 2020-10-16 上海师范大学 Social network opinion leader mining method based on clustering
CN107633260A (en) * 2017-08-23 2018-01-26 上海师范大学 A kind of social network opinion leader method for digging based on cluster
CN107729455A (en) * 2017-09-25 2018-02-23 山东科技大学 A kind of social network opinion leader sort algorithm based on multidimensional characteristic analysis
CN108280121A (en) * 2017-12-06 2018-07-13 上海师范大学 A method of social network opinion leader is obtained based on K- nuclear decomposition
CN108280121B (en) * 2017-12-06 2021-10-22 上海师范大学 Method for obtaining social network opinion leader based on K-kernel decomposition
CN108335008A (en) * 2017-12-13 2018-07-27 腾讯科技(深圳)有限公司 Web information processing method and device, storage medium and electronic device
CN110134877A (en) * 2019-05-15 2019-08-16 天津大学 Move down the line the method and apparatus that seed user is excavated in social networks
CN110287442A (en) * 2019-06-28 2019-09-27 秒针信息技术有限公司 A kind of determination method, apparatus, electronic equipment and the storage medium of influence power ranking
CN110717085A (en) * 2019-10-12 2020-01-21 浙江工商大学 Opinion leader identification method based on virtual brand community
CN110717085B (en) * 2019-10-12 2021-08-06 浙江工商大学 Opinion leader identification method based on virtual brand community
CN112667876A (en) * 2020-12-24 2021-04-16 湖北第二师范学院 Opinion leader group identification method based on PSOTVCF-Kmeans algorithm
CN112667876B (en) * 2020-12-24 2024-04-09 湖北第二师范学院 Opinion leader group identification method based on PSOTVCF-Kmeans algorithm

Similar Documents

Publication Publication Date Title
CN103136331A (en) Micro blog network opinion leader identification method
Lee et al. Measurements, analyses, and insights on the entire ethereum blockchain network
Chen et al. Social network collaborative filtering framework and online trust factors: A case study on Facebook
CN106682770A (en) Friend circle-based dynamic microblog forwarding behavior prediction system and method
CN104933622A (en) Microblog popularity degree prediction method based on user and microblog theme and microblog popularity degree prediction system based on user and microblog theme
CN102662956A (en) Method for identifying opinion leaders in social network based on topic link behaviors of users
CN102262681B (en) A kind of blog information identifies the method for crucial blog collection in propagating
CN102214212A (en) Method for ordering microblog network node weights based on multi-link
CN102394798A (en) Multi-feature based prediction method of propagation behavior of microblog information and system thereof
Chen et al. Influencerank: An efficient social influence measurement for millions of users in microblog
CN104008203A (en) User interest discovering method with ontology situation blended in
CN109726319B (en) User influence analysis method based on interactive relation
CN102571485A (en) Method for identifying robot user on micro-blog platform
CN103838819A (en) Information publish method and system
Ding et al. Measuring the spreadability of users in microblogs
CN105095419A (en) Method for maximizing influence of information to specific type of weibo users
CN104133897A (en) Micro blog topic source tracing method based on topic influence
CN103179198A (en) Topic influence individual digging method based on relational network
CN109492076A (en) A kind of network-based community's question and answer website answer credible evaluation method
CN102664744B (en) Group-sending recommendation method in network message communication
CN105678590A (en) topN recommendation method for social network based on cloud model
CN103294833A (en) Junk user discovering method based on user following relationships
Li et al. Social network user influence dynamics prediction
Sun et al. Matrix based community evolution events detection in online social networks
Tian et al. Boosting social network connectivity with link revival

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130605