CN103136331A - Micro blog network opinion leader identification method - Google Patents
- Publication number
- CN103136331A (application number CN2013100278084A)
- Authority
- CN
- China
- Prior art keywords
- node
- weights
- calculating
- network
- follower (fan)
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a method for identifying opinion leaders in a microblog network, which addresses the poor recall of existing opinion leader identification methods. The technical scheme is: store network topology information collected from the internet in a database using a web crawler tool; construct a directed network graph G=(V, E); calculate the effective follower set Ef(u); calculate the node weight IRL(u_i) produced by the link relationship; calculate the node weight IRTR(u_i) produced by the node interaction relationship; calculate the combined node weight IR(u_i); and calculate the combined weights of all nodes in the network graph, sort them in descending order, and select the n nodes with the largest combined weights as opinion leader candidates. Because the number of followers a node has, the node link relationship, and the interaction relationship are all considered when calculating node weights, both recall and accuracy improve. In testing, recall rose from the 81.7% to 88.5% of the background art to above 89.3%, and accuracy rose from the 84.7% to 90.4% of the background art to above 91.7%.
Description
Technical field
The present invention relates to an identification method, and in particular to a method for identifying opinion leaders in a microblog network.
Background art
With the development of Web 2.0 technology, new kinds of network applications such as social networks and microblog networks have appeared on the internet, and they show growing value and influence in spreading information and carrying interpersonal relationships.
A social network is an extension of real-world social relationships onto the internet, intended to help people build and maintain their networks of relationships more effectively. Unlike websites characterized by aggregating information, a social network is characterized by aggregating people: users can build and maintain their own circle of friends through it, which makes it a new mode of personal socializing and a platform for exchanging information. Relying on word-of-mouth transmission among friends, it accelerates the spread of information.
A microblog network (Micro-Blogging Network) is also a kind of social network. Users can publish messages of up to 140 characters through multiple channels such as browsers, mobile phones, and instant messaging software. This immediate, fragmented, and aggregated style of information propagation is popular with users, and the number of registered users of Sina Weibo in China has exceeded 300 million.
Opinion leaders play an important role in the spread of information over a network. Under their guidance and influence, local opinions develop into network public opinion. An opinion leader is an "activist" in an interpersonal communication network who frequently provides information to others and exerts influence on them. Opinion leaders act as important intermediaries or filters in the formation of mass media effects: they pass information on to the audience, forming a two-step flow of communication. As the influence of network public opinion keeps growing, research on network opinion leaders is also deepening.
Statistics show that most users in a network seldom take part in producing and spreading information, and the decisions they make often follow opinion leaders. Identifying network opinion leaders effectively, and influencing network users through guiding information published by opinion leaders rather than persuading users directly, can effectively mobilize influence across a whole network or society. This has practical significance for promoting information propagation and improving demonstration effects.
A microblog network is a complex network. Complex network theory is commonly used, in China and abroad, to model and analyze such networks in order to reveal their intrinsic characteristics and regularities. Under this theory a microblog network can be abstracted as a directed network graph: each user is a node, and the relationships between users form the edges between nodes. Because users differ in the number of friends and followers they have, nodes have different weights; the larger a node's weight, the greater its influence and the more likely it is to be an opinion leader. The opinion leader identification problem can therefore be reduced to the problem of calculating node weights: build the directed network graph model, analyze the structural relationships between nodes, and calculate the weight of each node.
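The directed graph abstraction described above can be illustrated with a minimal Python sketch. The function name `build_follow_graph` and the `(follower, followee)` pair format are assumptions made for illustration; the patent does not specify a storage layout.

```python
from collections import defaultdict


def build_follow_graph(follow_pairs):
    """Build a directed graph from (follower, followee) pairs.

    Returns (nodes, followers), where nodes is the node set V and
    followers[u] is the set of users that follow u (the edges into u).
    The input format is an assumption, not part of the patent text.
    """
    nodes = set()
    followers = defaultdict(set)
    for follower, followee in follow_pairs:
        nodes.update((follower, followee))
        followers[followee].add(follower)
    return nodes, dict(followers)
```

Keeping the edges indexed by followee makes the later per-node lookups of "all followers of u" a single dictionary access.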
Document 1, "Discovering Important Bloggers based on Analyzing Blog Threads [WWW, Chiba, Japan, May 10-14, 2005]", proposes ThreadRank, a method for analyzing important blog users based on post content analysis. The method judges user importance by analyzing a large volume of blog content; it spends a great deal of time on content cleaning and analysis, so its efficiency is low.
Document 2, "Identifying Opinion Leaders in the Blogosphere [CIKM, pp. 971-974, 2007]", proposes InfluenceRank, an opinion leader identification method. The method judges user importance by comparison with other blogs, calculates user weights from the contribution those users make to the whole network, and uses cosine similarity to calculate the similarity between different blog entities. Its complexity is high and its overhead is large.
Document 3, "TwitterRank: Finding topic-sensitive Influential Twitterers [WSDM, 2010]", proposes TwitterRank, a node calculation method for the Twitter network. The method calculates weights from the user relationships in Twitter, the distribution of followers and followed users, and the role that different user groups play in information propagation. The algorithm is mainly topic-based, and its recall is not high.
Summary of the invention
To overcome the poor recall of existing opinion leader identification methods, the present invention provides a microblog network opinion leader identification method. The method uses node weights to indicate a node's influence and importance: the larger the node weight, the more likely the node is an opinion leader. The node weight calculation considers multiple factors, including the number of followers a node has, the node link relationship, and the interaction relationship, which improves recall and also improves accuracy.
The technical solution adopted by the present invention to solve this technical problem is a microblog network opinion leader identification method, characterized by comprising the following steps:
Step 1, using a web crawler tool, collect actual microblog network data from the internet, extract the network topology information it contains, and store it in a database for processing.
Step 2, constructing the microblog directed network graph
G=(V,E)
In the formula, V is the set of nodes and E is the set of relationships (edges) between nodes.
Step 3, calculating the effective follower set Ef(u)
Ef(u)={v|v∈Follower(u)∧Response(u)>δ}
In the formula, δ is a non-negative constant threshold on the degree of feedback that a follower node v of node u gives to node u; only a follower of u whose feedback exceeds this threshold counts as an effective follower.
Step 4, calculating the node weight IRL(u_i) produced by the link relationship
In the formula for IRL(u_i), Follower(u_i) is the set of all followers of node u_i, L(u_j) is the number of followers of node u_j, σ is a damping factor between 0 and 1, and N is the total number of nodes in the network graph.
Step 5, calculating the node weight IRTR(u_i) produced by the node interaction relationship
In the formula for IRTR(u_i), Tweet(u_i) is the set of posts of node u_i, A is the set of all posts that have received interaction, |A| is the number of posts in A, N_s(u_j) is the number of responses of node u_j to post t_j, and N_μ(u_j) is the mean number of responses. A response is a repost, reply, comment, or favorite by a user.
Step 6, calculating the combined node weight IR(u_i)
IR(u_i)=(1-β)×IRL(u_i)+β×IRTR(u_i)
In the formula, the parameter β (β ∈ [0,1]) determines the relative standing of the link relationship and the node interaction relationship in the node weight calculation. When β is small, the node weight is determined mainly by the link relationship; in particular, when β=0 the weight is calculated entirely from the link relationship.
Step 7, calculate the combined weights of all nodes in the network graph, sort the nodes in descending order of combined weight, and select the n nodes with the largest combined weights as opinion leader candidates.
The beneficial effects of the invention are as follows. Node weights are used to indicate a node's influence and importance: the larger the node weight, the more likely the node is an opinion leader. The node weight calculation considers multiple factors, including the number of followers a node has, the node link relationship, and the interaction relationship, which improves recall and also improves accuracy. In testing, recall rose from the 81.7% to 88.5% of the background art to above 89.3%, and accuracy rose from the 84.7% to 90.4% of the background art to above 91.7%.
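Steps 3, 6, and 7 above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary inputs, and the reading of Response as a per-follower response count toward u are assumptions. The link and interaction scores are taken as already computed.

```python
def effective_followers(followers_of_u, response_count, delta):
    """Ef(u): followers of u whose response count toward u exceeds delta.

    response_count maps a follower v to its number of responses to u
    (an assumed reading of Response in the formula for Ef(u)).
    """
    return {v for v in followers_of_u if response_count.get(v, 0) > delta}


def combined_weight(irl, irtr, beta):
    """IR = (1 - beta) * IRL + beta * IRTR, with beta in [0, 1]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return (1.0 - beta) * irl + beta * irtr


def top_candidates(irl_scores, irtr_scores, beta, n):
    """Rank nodes by combined weight (descending) and keep the first n.

    Nodes missing from irtr_scores are treated as having no interaction
    weight (an assumption). Ties are broken by node name for determinism.
    """
    combined = {
        u: combined_weight(irl_scores[u], irtr_scores.get(u, 0.0), beta)
        for u in irl_scores
    }
    ranked = sorted(combined, key=lambda u: (-combined[u], u))
    return ranked[:n]
```

With β=0 the ranking depends only on the link scores, and with β=1 only on the interaction scores, which matches the role the text gives to β.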
The present invention is described in detail below with reference to the drawing and an embodiment.
Description of drawings
Fig. 1 is a flowchart of the microblog network opinion leader identification method of the present invention.
Embodiment
With reference to Fig. 1, the specific steps of the microblog network opinion leader identification method of the present invention are as follows:
1. Obtain the microblog network data: using a web crawler tool, collect actual microblog network data from the internet, extract network topology information such as nodes and connections, and store it in a database for processing.
2. Build the microblog directed network graph:
G=(V,E)
In the formula, V is the set of nodes and E is the set of relationships (edges) between nodes.
3. Calculate the effective follower set Ef(u):
Ef(u)={v|v∈Follower(u)∧Response(u)>δ} (1)
In the formula, δ is a non-negative constant threshold on the degree of feedback that a follower node v of node u gives to node u; only a follower of u whose feedback exceeds this threshold counts as an effective follower.
4. Calculate the node weight IRL(u_i) produced by the link relationship:
In the formula for IRL(u_i), Follower(u_i) is the set of all followers of node u_i, L(u_j) is the number of followers of node u_j, σ is a damping factor between 0 and 1, and N is the total number of nodes in the network graph.
5. Calculate the node weight IRTR(u_i) produced by the node interaction relationship:
In the formula for IRTR(u_i), Tweet(u_i) is the set of posts of node u_i, A is the set of all posts that have received interaction, |A| is the number of posts in A, N_s(u_j) is the number of responses of node u_j to post t_j, and N_μ(u_j) is the mean number of responses. A response is a repost, reply, comment, or favorite by a user.
6. Calculate the combined node weight IR(u_i):
IR(u_i)=(1-β)×IRL(u_i)+β×IRTR(u_i) (4)
In the formula, the parameter β (β ∈ [0,1]) determines the relative standing of the link relationship and the node interaction relationship in the node weight calculation. When β is small, the node weight is determined mainly by the link relationship; in particular, when β=0 the weight is calculated entirely from the link relationship.
7. Calculate the combined weights of all nodes in the network graph, sort the nodes in descending order of combined weight, and select the n nodes with the largest combined weights as opinion leader candidates.
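The formula for IRL(u_i) in step 4 is not reproduced in this text. The terms it defines (a follower set, a per-follower normalizing count L(u_j), a damping factor σ, and the node total N) resemble a PageRank-style recurrence, so the sketch below swaps in the standard PageRank form IRL(u_i) = (1-σ)/N + σ·Σ IRL(u_j)/L(u_j), summed over u_j in Follower(u_i). That form, the function name `link_weights`, and the fixed iteration count are assumptions, not the patent's stated formula. L is supplied by the caller so that whichever count the original formula intends can be used.

```python
def link_weights(nodes, followers, L, sigma=0.85, iterations=50):
    """Iterate an assumed PageRank-style link weight.

    nodes      iterable of node ids
    followers  dict: node -> set of its followers
    L          dict: node -> normalizing count L(u_j); entries <= 0 are skipped
    sigma      damping factor in (0, 1)
    """
    nodes = list(nodes)
    n = len(nodes)
    if n == 0:
        return {}
    irl = {u: 1.0 / n for u in nodes}  # uniform starting weights
    for _ in range(iterations):
        updated = {}
        for u in nodes:
            # Each follower v passes on its current weight divided by L(v).
            inflow = sum(
                irl[v] / L[v] for v in followers.get(u, ()) if L.get(v, 0) > 0
            )
            updated[u] = (1.0 - sigma) / n + sigma * inflow
        irl = updated
    return irl
```

A fixed iteration count keeps the sketch short; a convergence check on the change between successive iterations would be the usual refinement.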
The present invention remedies the shortcomings of existing methods in both computational efficiency and accuracy. First, by defining the effective follower set, nodes with no followers or only a few followers are excluded. Such nodes are very unlikely to be opinion leaders, because an opinion leader, or any high-weight node, must have a large number of followers. This substantially reduces the scale of the network graph and improves computational efficiency. Second, the node weight calculation considers not only the link relationship produced by followers but also the node interaction relationship produced by publishing, reposting, replying to, and favoriting posts, which improves the accuracy of the calculation.
Table 1 compares the experimental results of the present invention with those of existing methods.
Table 1. Recall, accuracy, and average per-node processing time of the node weight calculation methods
Calculation method | Out-degree | In-degree/out-degree combination | Document 1 | Document 2 | Document 3 | The present invention |
---|---|---|---|---|---|---|
Recall | 57.3% | 65.4% | 82.2% | 81.7% | 88.5% | 89.3% |
Accuracy | 62.2% | 67.3% | 86.1% | 84.7% | 90.4% | 91.7% |
Time per node | 0.14 min | 0.23 min | 3.37 min | 2.81 min | 2.76 min | 0.31 min |
The experiment used the processing of 100,000 microblog network nodes as its benchmark. Table 1 shows that the methods based on out-degree and on the in-degree/out-degree combination are computationally efficient but have very low accuracy and recall; the methods proposed in Documents 1, 2, and 3 have higher accuracy and recall but lower computational efficiency; the present invention has both high computational efficiency and high accuracy and recall.
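The text does not state how recall and accuracy were computed. A minimal sketch under the standard information retrieval definitions, assuming "accuracy" here means precision and that a reference set of known opinion leaders is available, could look like this (the function name and inputs are hypothetical):

```python
def precision_recall(candidates, true_leaders):
    """Return (precision, recall) of a candidate list against a reference set.

    precision = |candidates ∩ true_leaders| / |candidates|
    recall    = |candidates ∩ true_leaders| / |true_leaders|
    Both are reported as 0.0 when their denominator is empty.
    """
    candidates = set(candidates)
    true_leaders = set(true_leaders)
    hits = len(candidates & true_leaders)
    precision = hits / len(candidates) if candidates else 0.0
    recall = hits / len(true_leaders) if true_leaders else 0.0
    return precision, recall
```

Applied to the top-n candidates from step 7, increasing n tends to raise recall and lower precision, so the two figures are best reported for the same n.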
Claims (1)
1. A microblog network opinion leader identification method, characterized by comprising the following steps:
Step 1, using a web crawler tool, collect actual microblog network data from the internet, extract the network topology information it contains, and store it in a database for processing;
Step 2, constructing the microblog directed network graph
G=(V,E)
In the formula, V is the set of nodes and E is the set of relationships (edges) between nodes;
Step 3, calculating the effective follower set Ef(u)
Ef(u)={v|v∈Follower(u)∧Response(u)>δ}
In the formula, δ is a non-negative constant threshold on the degree of feedback that a follower node v of node u gives to node u; only a follower of u whose feedback exceeds this threshold counts as an effective follower;
Step 4, calculating the node weight IRL(u_i) produced by the link relationship
In the formula for IRL(u_i), Follower(u_i) is the set of all followers of node u_i, L(u_j) is the number of followers of node u_j, σ is a damping factor between 0 and 1, and N is the total number of nodes in the network graph;
Step 5, calculating the node weight IRTR(u_i) produced by the node interaction relationship
Step 6, calculating the combined node weight IR(u_i)
IR(u_i)=(1-β)×IRL(u_i)+β×IRTR(u_i)
In the formula, the parameter β (β ∈ [0,1]) determines the relative standing of the link relationship and the node interaction relationship in the node weight calculation; when β is small, the node weight is determined mainly by the link relationship, and in particular, when β=0 the weight is calculated entirely from the link relationship;
Step 7, calculate the combined weights of all nodes in the network graph, sort the nodes in descending order of combined weight, and select the n nodes with the largest combined weights as opinion leader candidates.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013100278084A CN103136331A (en) | 2013-01-18 | 2013-01-18 | Micro blog network opinion leader identification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013100278084A CN103136331A (en) | 2013-01-18 | 2013-01-18 | Micro blog network opinion leader identification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103136331A (en) | 2013-06-05 |
Family
ID=48496157
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2013100278084A Pending CN103136331A (en) | 2013-01-18 | 2013-01-18 | Micro blog network opinion leader identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103136331A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105723402A (en) * | 2013-10-25 | 2016-06-29 | 西斯摩斯公司 | Systems and methods for determining influencers in a social data network |
CN105959368A (en) * | 2016-04-29 | 2016-09-21 | 成都信息工程大学 | Social cloud hot spot resource prediction and disposition method |
CN106055627A (en) * | 2016-05-27 | 2016-10-26 | 西安电子科技大学 | Recognition method of key nodes of social network in topic field |
CN107633260A (en) * | 2017-08-23 | 2018-01-26 | 上海师范大学 | A kind of social network opinion leader method for digging based on cluster |
CN107729455A (en) * | 2017-09-25 | 2018-02-23 | 山东科技大学 | A kind of social network opinion leader sort algorithm based on multidimensional characteristic analysis |
CN108280121A (en) * | 2017-12-06 | 2018-07-13 | 上海师范大学 | A method of social network opinion leader is obtained based on K- nuclear decomposition |
CN108335008A (en) * | 2017-12-13 | 2018-07-27 | 腾讯科技(深圳)有限公司 | Web information processing method and device, storage medium and electronic device |
CN110134877A (en) * | 2019-05-15 | 2019-08-16 | 天津大学 | Move down the line the method and apparatus that seed user is excavated in social networks |
CN110287442A (en) * | 2019-06-28 | 2019-09-27 | 秒针信息技术有限公司 | A kind of determination method, apparatus, electronic equipment and the storage medium of influence power ranking |
CN110717085A (en) * | 2019-10-12 | 2020-01-21 | 浙江工商大学 | Opinion leader identification method based on virtual brand community |
CN112667876A (en) * | 2020-12-24 | 2021-04-16 | 湖北第二师范学院 | Opinion leader group identification method based on PSOTVCF-Kmeans algorithm |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102214212A (en) * | 2011-05-20 | 2011-10-12 | 西北工业大学 | Method for ordering microblog network node weights based on multi-link |
CN102662956A (en) * | 2012-03-05 | 2012-09-12 | 西北工业大学 | Method for identifying opinion leaders in social network based on topic link behaviors of users |
- 2013-01-18 CN CN2013100278084A patent/CN103136331A/en active Pending
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105723402A (en) * | 2013-10-25 | 2016-06-29 | 西斯摩斯公司 | Systems and methods for determining influencers in a social data network |
CN105959368B (en) * | 2016-04-29 | 2019-04-02 | 成都信息工程大学 | A kind of method of social activity cloud hot point resource prediction and deployment |
CN105959368A (en) * | 2016-04-29 | 2016-09-21 | 成都信息工程大学 | Social cloud hot spot resource prediction and disposition method |
CN106055627A (en) * | 2016-05-27 | 2016-10-26 | 西安电子科技大学 | Recognition method of key nodes of social network in topic field |
CN106055627B (en) * | 2016-05-27 | 2019-06-18 | 西安电子科技大学 | The recognition methods of social networks key node in topic field |
CN107633260B (en) * | 2017-08-23 | 2020-10-16 | 上海师范大学 | Social network opinion leader mining method based on clustering |
CN107633260A (en) * | 2017-08-23 | 2018-01-26 | 上海师范大学 | A kind of social network opinion leader method for digging based on cluster |
CN107729455A (en) * | 2017-09-25 | 2018-02-23 | 山东科技大学 | A kind of social network opinion leader sort algorithm based on multidimensional characteristic analysis |
CN108280121A (en) * | 2017-12-06 | 2018-07-13 | 上海师范大学 | A method of social network opinion leader is obtained based on K- nuclear decomposition |
CN108280121B (en) * | 2017-12-06 | 2021-10-22 | 上海师范大学 | Method for obtaining social network opinion leader based on K-kernel decomposition |
CN108335008A (en) * | 2017-12-13 | 2018-07-27 | 腾讯科技(深圳)有限公司 | Web information processing method and device, storage medium and electronic device |
CN110134877A (en) * | 2019-05-15 | 2019-08-16 | 天津大学 | Move down the line the method and apparatus that seed user is excavated in social networks |
CN110287442A (en) * | 2019-06-28 | 2019-09-27 | 秒针信息技术有限公司 | A kind of determination method, apparatus, electronic equipment and the storage medium of influence power ranking |
CN110717085A (en) * | 2019-10-12 | 2020-01-21 | 浙江工商大学 | Opinion leader identification method based on virtual brand community |
CN110717085B (en) * | 2019-10-12 | 2021-08-06 | 浙江工商大学 | Opinion leader identification method based on virtual brand community |
CN112667876A (en) * | 2020-12-24 | 2021-04-16 | 湖北第二师范学院 | Opinion leader group identification method based on PSOTVCF-Kmeans algorithm |
CN112667876B (en) * | 2020-12-24 | 2024-04-09 | 湖北第二师范学院 | Opinion leader group identification method based on PSOTVCF-Kmeans algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20130605 |