CN103530402A - Method for identifying microblog key users based on improved Page Rank - Google Patents

Info

Publication number
CN103530402A
Authority
CN
China
Prior art keywords
microblogging
network
pagerank
key
microblog
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310502924.7A
Other languages
Chinese (zh)
Inventor
程工
刘春阳
张旭
庞琳
吴俊杰
韩洋
刘洪甫
韩小汀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201310502924.7A
Publication of CN103530402A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for identifying microblog key users based on an improved PageRank. The method comprises the following steps: microblog information data comprising n microblogs are input; word segmentation is performed on the texts of the n microblogs; an inverted index is built from the segmentation result so that microblogs can be retrieved by specified keywords; for the retrieved relevant microblogs, the forwarding hierarchy information is extracted and a weighted directed graph, the forwarding network G, is built; the forwarding network G is divided into several maximal connected subgraphs G_i; the PageRank algorithm is applied to each sub-network G_i using parallel computing; the results of the sub-networks are merged to produce a ranking for the whole network G; and the top m users in the ranking are selected and output as the key users. By adopting parallel computing, the method ranks the dynamic forwarding network of a microblog platform in a big-data environment, thereby identifying the key users in the information dissemination process, and it can be applied to fields such as online public opinion analysis.

Description

Method for identifying microblog key users based on an improved PageRank
Technical field
The present invention relates to methods for identifying microblog key users, and in particular to a method for identifying microblog key users based on an improved PageRank. It belongs to the fields of complex networks and data mining and is aimed specifically at the analysis of massive microblog data.
Background
Key users of a microblog platform are users who play an important role in the propagation and diffusion of information. They act as important intermediaries or filters in the formation of mass-media effects: they spread information to audiences and so form information cascades. Identifying key users therefore plays an important role in information discovery and propagation analysis, and is of great guiding significance for work such as online public opinion analysis. However, content-based key user identification is often not accurate enough because microblog texts are short. Structure-based key user identification assumes that key users are the important nodes of the network formed by users and considers only the static follower (fan) relationship network, yet for reasons such as relevance and user activity many important nodes do not have the characteristics of key users. In view of this, the present invention combines content and network structure features, uses the forwarding information in the microblog platform to build a topic-related dynamic network, and converts key user identification into a PageRank-based ranking problem on a directed graph. In addition, a microblog platform holds massive data, the dynamically generated network is often huge, and results are needed immediately, so the time and space cost of the traditional power iteration algorithm makes it hard to meet these demands. Parallel computing decomposes a large-scale computation across a number of scalable computing nodes, which reduces the time and space complexity of the algorithm and the computing power required of any single node, and is particularly suitable for deployment on cloud computing platforms. The present invention therefore uses MapReduce to compute PageRank in parallel and mines short microblog texts in a big-data environment, thereby identifying the key users in the information dissemination process.
Summary of the invention
The present invention proposes a method for identifying key users of a microblog platform based on an improved PageRank. With PageRank at its core, the method extracts forwarding information from the microblog text structure, builds the network formed by forwarding relations, and then uses PageRank to obtain robust, high-quality key user identification results. Using forwarding relations yields a higher-quality dynamic forwarding network relevant to the query, which to some extent overcomes the suboptimal results caused by short texts, relevance and static network structure; computing PageRank separately on several dynamic forwarding networks and then combining the results improves computational efficiency and the robustness of the algorithm. The method for identifying microblog key users based on an improved PageRank comprises the following steps:
Step 1: use web crawlers and the microblog API to collect microblog data from the Internet, mainly the author information, text, original-author information and time information of each microblog.
Step 2: perform word segmentation on the text data in the microblog data.
Step 3: remove stop words and extract word stems.
Step 4: build an inverted index so that microblogs can be retrieved quickly by keyword.
Step 5: build the dynamic forwarding network. For each forwarded microblog, extract its forwarding hierarchy information, as shown in Fig. 2, which covers the original author, the posting author and the users along the propagation path. A directed network is built from this information: nodes are users and edges are forwarding relations. Note that the direction of an edge is the reverse of the direction of information flow, that is, an edge points from an author to the author whose microblog he or she forwarded, and the weight of an edge is the number of times that forwarding occurred. The resulting data structure is a weighted directed graph.
Step 6: decompose the forwarding network. The forwarding network G is divided into several maximal connected subgraphs, and each sub-network $G_i$ is given a weight $W_i = V(G_i)/V(G)$, where $V(G)$ denotes the number of nodes in graph G.
Step 7: iterate in parallel to solve for the PageRank (pr) value of each node in each sub-network.
(1) Initial stage: each node v of each sub-network $G_i$ is given the initial PageRank value $pr_i^{(0)}(v) = 1/V(G_i)$.
(2) Iteration stage: the MapReduce parallel technique is used to compute the PageRank value of each node in the sub-network. The formula is $pr_i^{(k)} = \left((1-d)\,E/n + d\,A_i^{T}\right) pr_i^{(k-1)}$, where k is the current iteration; d is the damping factor, generally set to 0.85; n is the number of nodes in the network; E is the all-ones vector; and $A_i$ is the adjacency matrix of sub-network $G_i$. The iteration terminates when k reaches a specified threshold or when the difference between the results of two successive iterations is smaller than a specified threshold, $|pr_i^{(k)} - pr_i^{(k-1)}| < \varepsilon$.
Step 8: fuse the ranking results. The ranking result $pr_i$ of each sub-network $G_i$ is multiplied by its weight $W_i$, and the results of all sub-networks are then merged to obtain the final ranking result pr of G.
Step 9: sort and output the results. pr is sorted by value and the user importance ranking is output; the top m users are the key users.
The present invention uses dynamic forwarding relations to build the user network and fuses the ranking results of several dynamic forwarding networks, which greatly improves the efficiency and quality of key user identification on microblog platforms in a big-data environment and is also highly robust. The method overcomes the shortcomings of short texts and static network structure and can effectively improve the efficiency of online public opinion analysis.
Brief description of the drawings
Fig. 1 is the overall framework diagram of the present invention.
Fig. 2 is an example of forwarding information.
Fig. 3 is a diagram of the dynamic network.
Fig. 4 is a schematic diagram of MapReduce.
Detailed description of the embodiments
The present invention is further described below with reference to the drawings and specific embodiments.
The present invention proposes a method for identifying key users of a microblog platform based on an improved PageRank. With PageRank at its core, the method uses MapReduce parallel computing to overcome the low computational efficiency caused by microblog big data, extracts forwarding information from the microblog text structure, builds the network formed by forwarding relations, and then uses PageRank to obtain robust, high-quality key user identification results. Using forwarding relations yields a higher-quality dynamic forwarding network relevant to the query, which to some extent overcomes the suboptimal results caused by short texts, relevance and static network structure; computing PageRank separately on several dynamic forwarding networks and then combining the results improves the robustness of the result.
The method is divided into three modules: a data collection and preprocessing module, a dynamic forwarding network construction module, and a PageRank computation and fusion module. The data collection and preprocessing module covers data collection, processing, stop-word removal, stem extraction and inverted index generation. The dynamic forwarding network construction module extracts forwarding relations from the microblog information and builds the forwarding network. The PageRank computation and fusion module covers division of the forwarding network, PageRank computation, fusion of the results, and the final sorting and output. The implementation is as follows:
1. Use web crawlers and the microblog API to collect microblog data from the Internet, mainly the author information, text, original-author information and time information of each microblog. Taking Sina Weibo as an example, the fields that the present invention may need when obtaining data through the API include:
A partial data structure (JSON format) of a forwarded microblog is as follows:
Figure BDA0000400155970000042
In addition, note that the retweeted_status item here is not necessarily the microblog that the user forwarded directly; it is the original microblog traced back to the source, that is, the starting point of the information propagation, user E in the forwarding information of Fig. 2.
2. Perform word segmentation on the text data in the microblog data.
3. Remove stop words and extract word stems.
4. Build an inverted index so that microblogs can be retrieved quickly by keyword.
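Steps 2 to 4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the whitespace tokenizer stands in for a real dictionary-based word segmenter, stemming is omitted, and the names `STOP_WORDS`, `tokenize`, `build_inverted_index` and `search`, as well as the sample posts, are hypothetical.

```python
from collections import defaultdict

# Hypothetical stop-word list; a real system would load a full list.
STOP_WORDS = {"the", "a", "of", "is", "to"}

def tokenize(text):
    """Whitespace tokenizer standing in for a real word segmenter."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def build_inverted_index(posts):
    """Map each term to the set of ids of the posts that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for term in tokenize(text):
            index[term].add(post_id)
    return index

def search(index, keyword):
    """Return the sorted ids of the posts that contain the keyword."""
    return sorted(index.get(keyword.lower(), set()))

# Made-up sample data for illustration only.
posts = {
    1: "The earthquake relief is under way",
    2: "Donations to earthquake victims",
    3: "A new phone is released",
}
index = build_inverted_index(posts)
print(search(index, "earthquake"))  # [1, 2]
```

Looking a keyword up in the index is a single dictionary access, which is what makes retrieving the topic-relevant microblogs fast before the forwarding network is built.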
5. Build the dynamic forwarding network. For each forwarded microblog, extract its forwarding hierarchy information, which covers the original author, the posting author and the users along the propagation path. A directed network is built from this information: nodes are users and edges are forwarding relations. Note that the direction of an edge is the reverse of the direction of information flow, that is, an edge points from an author to the author whose microblog he or she forwarded, and the weight of an edge is the number of times that forwarding occurred. The resulting data structure is a weighted directed graph. The construction of the forwarding network is illustrated below using Fig. 3 as an example.
In this figure there are 2 forwarding records involving 5 users in total, A, B, C, D and E. Each edge represents a forwarding relation; for example, A -> B indicates that A forwarded B's post, and the weight of this edge is the number of times A forwarded B's posts, which is 2 in this figure.
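The construction can be sketched in Python as follows. The two forwarding chains are hypothetical: they are chosen only so that five users appear and the edge A -> B has weight 2, and they are not read from Fig. 3. The function name `build_forwarding_graph` and the chain representation are assumptions as well.

```python
from collections import Counter

def build_forwarding_graph(chains):
    """Build a weighted directed graph from forwarding chains.

    Each chain lists users from the latest forwarder back to the original
    author, so each consecutive pair (u, v) means "u forwarded v's post".
    The result maps each edge (u, v) to the number of such forwards.
    """
    edges = Counter()
    for chain in chains:
        for forwarder, source in zip(chain, chain[1:]):
            edges[(forwarder, source)] += 1
    return edges

# Two hypothetical forwarding records, both tracing back to user E.
chains = [["A", "B", "C", "E"], ["A", "B", "D", "E"]]
graph = build_forwarding_graph(chains)
print(graph[("A", "B")])  # 2
```

Because the edges point against the flow of information, a user whose posts are forwarded often collects many incoming edges, which is what PageRank rewards.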
6. Decompose the forwarding network. The forwarding network G is divided into several maximal connected subgraphs, and each sub-network $G_i$ is given a weight $W_i = V(G_i)/V(G)$, where $V(G)$ denotes the number of vertices in graph G.
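A minimal Python sketch of this decomposition is given below. It assumes that "maximal connected subgraph" means a weakly connected component, that is, edge direction is ignored when grouping nodes; the text does not state this explicitly. The function name and the sample edges are hypothetical.

```python
from collections import defaultdict

def weakly_connected_components(edges):
    """Split a directed graph into components, ignoring edge direction."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components

# Hypothetical forwarding edges forming two separate sub-networks.
edges = [("A", "B"), ("B", "C"), ("X", "Y")]
components = weakly_connected_components(edges)
total = sum(len(c) for c in components)
weights = [len(c) / total for c in components]  # W_i = V(G_i) / V(G)
print(weights)  # [0.6, 0.4]
```

Since no edge crosses from one component to another, PageRank can be computed on each component independently, which is what allows the sub-networks to be processed in parallel.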
7. Compute the sub-networks in parallel.
(1) Initial stage: each node v of each sub-network $G_i$ is given the initial value $pr_i^{(0)}(v) = 1/V(G_i)$.
(2) Iteration stage: the basic PageRank formula is $pr_i^{(k)} = \left((1-d)\,E/n + d\,A_i^{T}\right) pr_i^{(k-1)}$, where k is the current iteration; d is the damping factor, generally set to 0.85; n is the number of nodes in the network; E is the all-ones vector; and $A_i$ is the adjacency matrix of sub-network $G_i$. The iteration terminates when k reaches a specified threshold or when the difference between the results of two successive iterations is smaller than a specified threshold, $|pr_i^{(k)} - pr_i^{(k-1)}| < \varepsilon$.
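The iteration can be sketched on a single machine in Python. The sketch uses the per-node form pr(v) = (1 - d)/n + d * sum of pr(u)/outdeg(u) over all u linking to v, splits rank evenly over out-links rather than by edge weight, and does not redistribute the rank of nodes without out-links, so the scores need not sum to 1. These choices, the function name `pagerank` and the sample graph are assumptions made for illustration.

```python
def pagerank(out_links, d=0.85, eps=1e-8, max_iter=100):
    """Power iteration for PageRank on one sub-network."""
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(nodes)
    pr = {v: 1.0 / n for v in nodes}            # initial value 1/V(G_i)
    for _ in range(max_iter):                    # stop when k hits the limit
        new = {v: (1.0 - d) / n for v in nodes}
        for u, targets in out_links.items():
            if targets:
                share = d * pr[u] / len(targets)
                for v in targets:
                    new[v] += share
        diff = sum(abs(new[v] - pr[v]) for v in nodes)
        pr = new
        if diff < eps:                           # or when the change is small
            break
    return pr

# Hypothetical forwarding network: an edge u -> v means u forwarded v's post.
out_links = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"]}
scores = pagerank(out_links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0])  # E
```

In this example E, the original author whose post every chain traces back to, receives the highest score, while A, who only forwards, keeps the base value (1 - d)/n.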
An ordinary PageRank algorithm converges slowly, and when the data set is very large this greatly reduces the efficiency of the algorithm. Using the MapReduce computation model, the present invention converts the problem solved by the algorithm into several smaller problems that are computed in a distributed, parallel fashion on different computing nodes, after which the results are merged. The principle of computing PageRank in parallel with MapReduce is introduced below with reference to Fig. 3.
The core data structure of MapReduce is the key-value pair, written here as $\langle Key, Value \rangle$, where Key denotes the key and Value denotes the value.
In the present invention, each node of the network is treated as a key,
Figure BDA0000400155970000062
and the pr value of each node together with its out-link nodes is treated as the value, i.e. $Value_i = (pr_i, \mathrm{outlinks}(v_i, G))$, where $\mathrm{outlinks}(v, G)$ denotes the out-link nodes of node v in graph G, as shown in the input stage of Fig. 4. Finally, the whole dynamic forwarding network structure is converted into a set of rows, each row being one key-value pair, that is:
$$\mathrm{RowData} = \begin{pmatrix} \langle Key_1, Value_1 \rangle \\ \vdots \\ \langle Key_i, Value_i \rangle \\ \vdots \\ \langle Key_n, Value_n \rangle \end{pmatrix}$$
The core computation of MapReduce can be understood as the following two steps:
(a) Map: one key-value pair is mapped, through the forwarding relation, to one or more other key-value pairs, as follows:
$$\langle Key_0, Value_0 \rangle \overset{M}{\Longrightarrow} \langle Key_1, Value_1 \rangle,\ \langle Key_2, Value_2 \rangle,\ \langle Key_3, Value_3 \rangle,\ \ldots$$
where M is the mapping function.
(b) Reduce: the values of several key-value pairs that share the same key are merged, as follows:
$$\langle Key_1, Value_1 \rangle,\ \langle Key_1, Value_2 \rangle,\ \langle Key_1, Value_3 \rangle \Longrightarrow \langle Key_1, R(Value_1, Value_2, Value_3) \rangle$$
where R is the reduce function.
Note that the result of the reduce step is again a set of key-value pairs and can be used as the input of a further map step, which makes it possible to implement iterative algorithms.
In this application, the map step takes each out-link node of a node in the network as a new key and the node's PageRank value divided by its total number of out-link nodes as the new value, producing new key-value pairs:
$$\langle Key_i, Value_i \rangle \Longrightarrow \langle Key_j,\ pr_i / \mathrm{count}(\mathrm{outlinks}(v_i, G)) \rangle, \quad \forall v_j \in \mathrm{outlinks}(v_i, G).$$
In the reduce step, the value for each key is computed from the sum of the PageRank contributions of all links pointing to the node corresponding to that key:
$$\langle Key_i, Value_1 \rangle,\ \langle Key_i, Value_2 \rangle,\ \langle Key_i, Value_3 \rangle \Longrightarrow \langle Key_i,\ (1-d)/N + d \cdot \sum_j Value_j \rangle$$
where d is the damping factor of the PageRank algorithm, generally set to 0.85, and N is the total number of nodes.
The result of the last reduce step can in turn serve as the input of the next map step, so the iteration continues until the termination condition is met. Note that in the process described above the out-link information of the nodes is not passed through the map and reduce steps; in practice, the map step must not only generate the new pr values of the nodes but also preserve the original structural relations.
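The map and reduce steps can be simulated in plain Python to make the data flow concrete. This is a single-process sketch, not a Hadoop job: the functions `map_phase`, `reduce_phase` and `mapreduce_step`, the tagged tuples used to carry the out-link list alongside the rank shares, and the sample graph are all assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(node, value):
    """Emit one rank share per out-link, plus the node's own link list."""
    pr, out_links = value
    yield node, ("links", out_links)             # preserve graph structure
    for target in out_links:
        yield target, ("share", pr / len(out_links))

def reduce_phase(node, values, n, d=0.85):
    """Sum the incoming shares and reattach the out-link list."""
    out_links, total = [], 0.0
    for kind, payload in values:
        if kind == "links":
            out_links = payload
        else:
            total += payload
    return node, ((1.0 - d) / n + d * total, out_links)

def mapreduce_step(rows, d=0.85):
    """One PageRank iteration: map, group by key (shuffle), reduce."""
    grouped = defaultdict(list)
    for node, value in rows.items():
        for key, emitted in map_phase(node, value):
            grouped[key].append(emitted)
    n = len(rows)
    return dict(reduce_phase(k, vs, n, d) for k, vs in grouped.items())

# Hypothetical forwarding network; every node appears as a key.
links = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": []}
rows = {v: (1.0 / len(links), out) for v, out in links.items()}
for _ in range(10):
    rows = mapreduce_step(rows)
print(max(rows, key=lambda v: rows[v][0]))  # E
```

Emitting the link list from the mapper is one way to satisfy the requirement that the structure survive each round, since the output of one reduce step then has the same shape as the input of the next map step.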
(3) Fusion stage: multiply the ranking result $pr_i$ of each sub-network $G_i$ by its weight $W_i$, then merge the results of all sub-networks to obtain the final ranking result pr of G.
8. Sort and output the results: sort pr by value to obtain the ranked list of nodes, that is, of key users; the larger the pr value, the more important the user.
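Fusion and output can be sketched in Python as follows. The per-sub-network scores are made-up numbers, not results of the method, and the function names `fuse_rankings` and `top_m` are hypothetical.

```python
def fuse_rankings(sub_results):
    """Scale each sub-network's scores by its weight W_i and merge them.

    sub_results is a list of (scores, node_count) pairs, one per sub-network.
    """
    total_nodes = sum(count for _, count in sub_results)
    fused = {}
    for scores, count in sub_results:
        weight = count / total_nodes             # W_i = V(G_i) / V(G)
        for user, pr in scores.items():
            fused[user] = pr * weight
    return fused

def top_m(fused, m):
    """Return the m users with the highest fused score."""
    return sorted(fused, key=fused.get, reverse=True)[:m]

sub_results = [
    ({"A": 0.1, "B": 0.3, "C": 0.6}, 3),   # hypothetical scores for G_1
    ({"X": 0.4, "Y": 0.6}, 2),             # hypothetical scores for G_2
]
fused = fuse_rankings(sub_results)
print(top_m(fused, 2))  # ['C', 'Y']
```

Weighting by sub-network size makes scores from different sub-networks comparable: without it, a node in a tiny sub-network would receive an inflated score simply because the rank there is shared among fewer nodes.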
The present invention proposes a method for identifying key users of a microblog platform based on an improved PageRank. The method combines content with the structural information of the dynamic network to rank users, and uses the MapReduce parallel technique to speed up execution. It overcomes, to some extent, the suboptimal results caused by short texts, relevance and static network structure; computing PageRank separately on several dynamic forwarding networks and then combining the results improves the robustness of the result.
Although the embodiments of the present invention are disclosed above, the invention is not limited to the uses listed in the description and embodiments; it can be applied to any field for which it is suitable. Those skilled in the art can readily make further modifications; therefore, improvements and modifications made without departing from the general concept defined by the claims and their equivalents shall also be regarded as falling within the scope of protection of the present invention.

Claims (9)

1. A method for identifying microblog key users based on an improved PageRank, characterized in that it comprises the following steps:
Step 1, input microblog information data, including the user, text, time information and so on of n microblogs;
Step 2, perform word segmentation on the texts of the n microblogs according to a preset dictionary containing m words;
Step 3, build an inverted index from the segmentation result so that microblogs can be retrieved by specified keywords;
Step 4, extract the forwarding hierarchy information of the relevant microblogs retrieved and build the weighted directed forwarding network G;
Step 5, divide the forwarding network G into several maximal connected subgraphs $G_i$;
Step 6, apply the PageRank algorithm to each sub-network using parallel computing, obtain the ranking result of each sub-network, and then merge the results;
Step 7, output the ranking result.
2. The method for identifying microblog key users based on an improved PageRank as claimed in claim 1, characterized in that the microblog information in step 1 is microblog information data within a preset time period captured from the Internet and the microblog platform API.
3. The method for identifying microblog key users based on an improved PageRank as claimed in claim 1, characterized in that the word segmentation of the text of each microblog comprises the following steps:
1) remove the stop words in the text;
2) split the text into words according to the vocabulary in the preset dictionary, in the order in which the words appear in the text.
4. The method for identifying microblog key users based on an improved PageRank as claimed in claim 1, characterized in that, in step 3 of claim 1, an inverted index from the word segmentation results to the microblogs is built so that specified microblogs can be retrieved by keyword.
5. The method for identifying microblog key users based on an improved PageRank as claimed in claim 1, characterized in that, in step 4 of claim 1, the forwarding hierarchy information is extracted from the microblog text information and a weighted directed graph is built.
6. The method for identifying microblog key users based on an improved PageRank as claimed in claim 1, characterized in that, in step 5 of claim 1, the forwarding network is decomposed into several maximal connected subgraphs and each sub-network is given a weight.
7. The method for identifying microblog key users based on an improved PageRank as claimed in claim 1, characterized in that, in step 6 of claim 1, the PageRank algorithm is applied to each sub-network.
8. The method for identifying microblog key users based on an improved PageRank as claimed in claim 1, characterized in that, in step 6 of claim 1, the PageRank algorithm is implemented on each sub-network using parallel computing based on the MapReduce computation model.
9. The method for identifying microblog key users based on an improved PageRank as claimed in claim 1, characterized in that, in step 6 of claim 1, the ranking result of the whole forwarding network is formed by merging the results computed for each sub-network.
CN201310502924.7A 2013-10-23 2013-10-23 Method for identifying microblog key users based on improved Page Rank Pending CN103530402A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310502924.7A CN103530402A (en) 2013-10-23 2013-10-23 Method for identifying microblog key users based on improved Page Rank

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310502924.7A CN103530402A (en) 2013-10-23 2013-10-23 Method for identifying microblog key users based on improved Page Rank

Publications (1)

Publication Number Publication Date
CN103530402A true CN103530402A (en) 2014-01-22

Family

ID=49932411

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310502924.7A Pending CN103530402A (en) 2013-10-23 2013-10-23 Method for identifying microblog key users based on improved Page Rank

Country Status (1)

Country Link
CN (1) CN103530402A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101388025A (en) * 2008-10-09 2009-03-18 浙江大学 Semantic web object ordering method based on Pagerank
CN101976245A (en) * 2010-10-09 2011-02-16 吕琳媛 Sequencing method of node importance in network
US20120330864A1 (en) * 2011-06-21 2012-12-27 Microsoft Corporation Fast personalized page rank on map reduce
CN102708285A (en) * 2012-04-24 2012-10-03 河海大学 Coremedicine excavation method based on complex network model parallelizing PageRank algorithm
CN102799671A (en) * 2012-07-17 2012-11-28 西安电子科技大学 Network individual recommendation method based on PageRank algorithm
CN102819576A (en) * 2012-07-23 2012-12-12 无锡雅座在线科技发展有限公司 Data mining method and system based on microblog
CN102982157A (en) * 2012-12-03 2013-03-20 北京奇虎科技有限公司 Device and method used for mining microblog hot topics

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
YONG ZHANG 等: "User Influence Analysis On Micro Blog", 《2012 IEEE 2ND INTERNATIONAL CONFERENCE》, vol. 3, 1 November 2012 (2012-11-01) *
廖松博 等: "GCPR:一种在MapReduce平台上基于图划分的PageRank加速方法", 《小型微型计算机系统》, vol. 33, no. 6, 30 June 2012 (2012-06-30) *
方超 等: "一种基于MapReduce架构的微博用户影响力评价算法的设计与实现", 《通信市场》, 30 June 2013 (2013-06-30) *
舒琰 等: "基于PageRank的微博排名MapReduce算法研究", 《计算机技术与发展》, vol. 23, no. 2, 28 February 2013 (2013-02-28) *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793525A (en) * 2014-02-21 2014-05-14 江苏唯实科技有限公司 MapReduce model graph node authority value calculation method based on local iteration
CN104504251A (en) * 2014-12-10 2015-04-08 沈阳航空航天大学 Community dividing method based on PageRank algorithm
CN104504251B (en) * 2014-12-10 2017-12-15 沈阳航空航天大学 A kind of community division method based on PageRank algorithms
CN105260358A (en) * 2015-10-14 2016-01-20 上海大学 Short text-oriented unexpected incident development process representation method
CN105245380A (en) * 2015-10-21 2016-01-13 腾讯科技(深圳)有限公司 Message transmission mode identifying method and device
CN106202487B (en) * 2016-07-19 2019-06-21 西北工业大学 Based on user post behavior pattern multi thread social event summarize method
CN106202487A (en) * 2016-07-19 2016-12-07 西北工业大学 Based on user post behavioral pattern multi thread social events sum up method
CN106972952A (en) * 2017-02-28 2017-07-21 浙江工业大学 A kind of Information Communication leader's Node extraction method based on internet pricing correlation
CN106972952B (en) * 2017-02-28 2020-02-21 浙江工业大学 Information propagation leader node extraction method based on network degree correlation
CN107092651A (en) * 2017-03-14 2017-08-25 中国科学院计算技术研究所 A kind of key person's method for digging analyzed based on communication network data and system
CN107092651B (en) * 2017-03-14 2020-07-24 中国科学院计算技术研究所 Key character mining method and system based on communication network data analysis
CN108334953A (en) * 2018-01-09 2018-07-27 浙江工业大学 One kind being based on the newer social networks degree of belief learning method of weight
CN108334953B (en) * 2018-01-09 2021-08-03 浙江工业大学 Social network trust learning method based on weight updating
CN108509551A (en) * 2018-03-19 2018-09-07 西北大学 A kind of micro blog network key user digging system under the environment based on Spark and method
CN108509551B (en) * 2018-03-19 2022-03-01 西北大学 Microblog network key user mining system and method based on Spark environment
CN110334270A (en) * 2018-03-21 2019-10-15 腾讯科技(深圳)有限公司 A kind of news sort method, server and computer readable storage medium
CN110334270B (en) * 2018-03-21 2023-07-11 深圳市雅阅科技有限公司 News ordering method, server and computer readable storage medium
CN108763206A (en) * 2018-05-22 2018-11-06 南京邮电大学 A method of quicksort is carried out to single text keyword
CN108763206B (en) * 2018-05-22 2022-04-05 南京邮电大学 Method for quickly sequencing keywords of single text
CN111681120A (en) * 2020-05-20 2020-09-18 卓尔智联(武汉)研究院有限公司 Core user determination method, device and storage medium
CN112052404A (en) * 2020-09-23 2020-12-08 西安交通大学 Group discovery method, system, device and medium for multi-source heterogeneous relation network
CN112052404B (en) * 2020-09-23 2023-08-15 西安交通大学 Group discovery method, system, equipment and medium of multi-source heterogeneous relation network
CN112632361A (en) * 2020-12-29 2021-04-09 中科院计算技术研究所大数据研究院 Iterative data acquisition method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140122