CN103530402A - Method for identifying microblog key users based on improved Page Rank - Google Patents
Method for identifying microblog key users based on improved Page Rank
- Publication number
- CN103530402A (application CN201310502924.7A)
- Authority
- CN
- China
- Prior art keywords
- microblogging
- network
- pagerank
- key
- microblog
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for identifying microblog key users based on an improved PageRank. The method comprises the following steps: microblog information data comprising n microblogs are input; word segmentation is performed on the texts of the n microblogs; an inverted index is built from the segmentation result so that microblogs can be retrieved by specified keywords; the forwarding hierarchy information of the retrieved relevant microblogs is extracted and a weighted directed graph, the forwarding network $G$, is built; the forwarding network $G$ is divided into several maximal connected subgraphs $G_i$; the PageRank algorithm is applied to each sub-network $G_i$ using parallel computing; the results computed for the sub-networks are merged to produce a ranking of the whole network $G$; and the top m users in the ranking are selected and output as the key users. By using parallel computing, the method ranks the dynamic forwarding network of a microblog platform in a big-data environment, thereby identifying the key users in the information propagation process, and it can be applied in fields such as network public opinion analysis.
Description
Technical field
The present invention relates to methods for identifying microblog key users, and in particular to a method for identifying microblog key users based on an improved PageRank. It belongs to the fields of complex networks and data mining and is aimed specifically at the analysis of massive microblog data.
Background technology
Key users of a microblog platform are users who play an important role in the propagation and diffusion of information. In the formation of mass-media effects, key users act as important intermediaries or filters: they spread information to an audience and produce cascades of information propagation. Identifying key users therefore plays an important role in information discovery and propagation analysis, and is of great guiding significance for work such as Internet public opinion analysis. However, content-based key user identification is often not accurate enough because microblog texts are short. Structure-based key user identification assumes that key users are the important nodes in the network formed by users and considers only the static follower relationship network; yet, for reasons such as relevance and user activity, many important nodes do not have the characteristics of key users. In view of this, the present invention combines content features and network-structure features: it uses the forwarding information in the microblog platform to build a topic-related dynamic network and converts key user identification into a PageRank-based ranking problem on a directed graph. In addition, because microblog platforms hold massive amounts of data, the dynamically generated network is often huge and results are needed promptly, so the traditional power iteration algorithm consumes too much computing time and space to meet these demands. Parallel computing decomposes a large-scale computation across a number of scalable computing nodes, which reduces the time and space complexity of the algorithm and the computing power required of a single node, and is particularly well suited to deployment and application on cloud computing platforms. The present invention therefore uses MapReduce to compute PageRank in parallel and mines short microblog texts in a big-data environment, thereby identifying the key users in the information propagation process.
Summary of the invention
The present invention proposes a method for identifying key users of a microblog platform based on an improved PageRank. With PageRank at its core, the method extracts forwarding information from the microblog text structure, builds a network formed by forwarding relations, and then uses PageRank to obtain key user identification results of high robustness and high quality. Using forwarding relations yields query-relevant dynamic forwarding networks of higher quality and, to some extent, overcomes the suboptimal solutions caused by short texts, relevance, and static network structure; ranking multiple dynamic forwarding networks separately with PageRank and then combining the results improves computational efficiency and the robustness of the algorithm. The method for identifying microblog key users based on an improved PageRank comprises the following steps:
Step 1: Use web crawlers and the microblog API to collect microblog data from the Internet, mainly the author information, text, original-author information, and time information of each microblog.
Step 2: Perform word segmentation on the text data in the microblog data.
Step 3: Remove stop words and extract word stems.
Step 4: Build an inverted index so that microblogs can be retrieved quickly by keyword.
Step 5: Build the dynamic forwarding network. For each forwarded microblog, extract its forwarding hierarchy information, as shown in Fig. 2, which covers the original author, the users along the propagation path, and the posting author. From this, build a directed network in which the nodes are users and the edges are forwarding relations. The direction of an edge is the reverse of the direction of information flow, i.e. an edge points from an author to the author whose microblog that author forwarded, and the weight of an edge is the number of times the forwarding occurred. The resulting data structure is a weighted directed graph.
Step 6: Decompose the forwarding network. Divide the forwarding network $G$ into several maximal connected subgraphs and assign each sub-network $G_i$ a weight $W_i = V(G_i)/V(G)$, where $V(G)$ denotes the number of nodes in graph $G$.
Step 7: Parallel iteration. Solve for the PageRank (pr) value of each node in each sub-network.
(1) Initial stage: for each sub-network $G_i$, assign each node $v$ the initial PageRank value $pr_i^{0}(v) = 1/V(G_i)$.
(2) Iteration stage: use MapReduce parallel computing to calculate the PageRank value of each node in the sub-network with the formula $pr_i^{k} = \left((1-d)E/n + dA_i^{T}\right) pr_i^{k-1}$, where $k$ is the current iteration; $d$ is the damping factor, generally set to 0.85; $n$ is the number of nodes in the network; $E$ is the all-ones vector; and $A_i$ is the adjacency matrix of sub-network $G_i$. Iteration terminates when $k$ reaches a specified threshold or when the difference between the results of two successive iterations falls below a specified threshold, i.e. $|pr_i^{k} - pr_i^{k-1}| < \varepsilon$.
Step 8: Fuse the ranking results. Multiply the ranking result $pr_i$ of each sub-network $G_i$ by its weight $W_i$, then merge the results of all sub-networks to obtain the final ranking result $pr$ of $G$.
Step 9: Sort and output the results. Sort $pr$ by value and output the user importance ranking list; the top m users are the key users.
The present invention builds the user network from dynamic forwarding relations and fuses the ranking results of multiple dynamic forwarding networks, which improves the efficiency and quality of key user identification on microblog platforms in a big-data environment and makes the results robust. The method overcomes the shortcomings of short texts and static network structure and can improve the efficiency of Internet public opinion analysis.
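Steps 2 to 4 above (segmentation, stop word removal, inverted index) can be illustrated with a minimal sketch. It is only an assumption-laden illustration, not the patented implementation: the stop-word list, the sample posts, and the helper names `tokenize` and `build_inverted_index` are hypothetical, whitespace splitting stands in for a real Chinese word segmenter, and stemming is omitted.

```python
from collections import defaultdict

# Hypothetical stop-word list; a real system would use a proper
# word segmenter and stop-word dictionary instead of str.split().
STOP_WORDS = {"the", "a", "is", "of"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_inverted_index(posts):
    """Map each term to the set of post ids whose text contains it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for term in tokenize(text):
            index[term].add(post_id)
    return index


# Hypothetical sample posts keyed by post id.
posts = {
    1: "The earthquake relief is underway",
    2: "Earthquake news of the day",
    3: "A quiet day",
}
index = build_inverted_index(posts)
relevant = index["earthquake"]  # posts that match the query keyword
```

Looking up a keyword in `index` returns the ids of the matching posts, which are the posts whose forwarding chains would then be used to build the topic-related network.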
Brief description of the drawings
Fig. 1 is the overall framework diagram of the present invention.
Fig. 2 is an example of forwarding information.
Fig. 3 is a diagram of the dynamic network.
Fig. 4 is a schematic diagram of MapReduce.
Embodiment
The present invention is further described below with reference to the accompanying drawings and specific embodiments.
The present invention proposes a method for identifying key users of a microblog platform based on an improved PageRank. With PageRank at its core, the method uses MapReduce parallel computing to overcome the low computational efficiency of processing large volumes of microblog data, extracts forwarding information from the microblog text structure, builds a network formed by forwarding relations, and then uses PageRank to obtain key user identification results of high robustness and high quality. Using forwarding relations yields query-relevant dynamic forwarding networks of higher quality and, to some extent, overcomes the suboptimal solutions caused by short texts, relevance, and static network structure; ranking multiple dynamic forwarding networks separately with PageRank and then combining the results improves the robustness of the result.
The method is divided into three modules: a data collection and preprocessing module, a dynamic forwarding network construction module, and a PageRank computation and fusion module. The data collection and preprocessing module covers data collection, word segmentation, stop word removal, stem extraction, and inverted index generation. The dynamic forwarding network construction module uses microblog information to extract forwarding relations and build the forwarding network. The PageRank computation and fusion module covers forwarding network partitioning, PageRank computation, fusion of the computed results, and the final sorting and output. The specific implementation process is as follows:
1. Use web crawlers and the microblog API to collect microblog data from the Internet, mainly the author information, text, original-author information, and time information of each microblog. Taking Sina Weibo as an example, the fields that may be needed when obtaining data through the API include:
Part of the data structure (JSON format) of a forwarded microblog is as follows:
Note that the retweeted_status item here is not necessarily the microblog that the user directly forwarded; it is always the original source microblog, the starting point of the information propagation, i.e. user E in the forwarding information of Fig. 2.
2. Perform word segmentation on the text data in the microblog data.
3. Remove stop words and extract word stems.
4. Build an inverted index so that microblogs can be retrieved quickly by keyword.
5. Build the dynamic forwarding network. For each forwarded microblog, extract its forwarding hierarchy information, which covers the original author, the users along the propagation path, and the posting author. From this, build a directed network in which the nodes are users and the edges are forwarding relations. Note that the direction of an edge is the reverse of the direction of information flow, i.e. an edge points from an author to the author whose microblog that author forwarded, and the weight of an edge is the number of times the forwarding occurred. The resulting data structure is a weighted directed graph. Fig. 3 is used below as an example to illustrate how the forwarding network is built.
The figure contains 2 forwarded messages involving 5 users in total: A, B, C, D, and E. Each edge in the figure represents a forwarding relation; for example, A -> B indicates that A forwarded B's information, and the weight of this edge is the number of times A forwarded B's information, which is 2 in this figure.
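The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: each forwarding record is represented as a chain of users in information-flow order (original author first), the two chains are hypothetical and are not the actual content of Fig. 3, and `build_forwarding_graph` is an illustrative name. The chains are chosen only so that the edge A -> B ends up with weight 2, as in the example.

```python
from collections import Counter


def build_forwarding_graph(chains):
    """Return {(forwarder, source): count} for every forwarding hop.

    Each chain lists users in information-flow order, original author
    first. Edges point against the flow: from the user who forwarded
    to the user whose post was forwarded.
    """
    edges = Counter()
    for chain in chains:
        for source, forwarder in zip(chain, chain[1:]):
            edges[(forwarder, source)] += 1
    return edges


# Two hypothetical forwarding chains over users A, B, C, D, and E.
chains = [
    ["E", "B", "A"],            # E posted, B forwarded E, A forwarded B
    ["E", "C", "D", "B", "A"],  # a longer propagation path
]
graph = build_forwarding_graph(chains)
```

Storing edges in a `Counter` keyed by `(forwarder, source)` makes repeated forwards accumulate into the edge weight without a separate pass.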
6. Decompose the forwarding network. Divide the forwarding network $G$ into several maximal connected subgraphs and assign each sub-network $G_i$ a weight $W_i = V(G_i)/V(G)$, where $V(G)$ denotes the number of vertices in graph $G$.
7. Parallel computation on the sub-networks.
(1) Initial stage: for each sub-network $G_i$, assign each node $v$ the initial value $pr_i^{0}(v) = 1/V(G_i)$.
(2) Iteration stage: the basic PageRank formula is $pr_i^{k} = \left((1-d)E/n + dA_i^{T}\right) pr_i^{k-1}$, where $k$ is the current iteration; $d$ is the damping factor, generally set to 0.85; $n$ is the number of nodes in the network; $E$ is the all-ones vector; and $A_i$ is the adjacency matrix of sub-network $G_i$. Iteration terminates when $k$ reaches a specified threshold or when the difference between the results of two successive iterations falls below a specified threshold, i.e. $|pr_i^{k} - pr_i^{k-1}| < \varepsilon$.
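The initial and iteration stages can be sketched on a single machine as a plain power iteration. The following is an illustrative sketch, not the patent's MapReduce implementation. It makes two assumptions that the text does not state: each edge weight is normalized by the total out-weight of its source node, and the rank of nodes without out-edges is spread uniformly over all nodes. The function name `pagerank` and the sample graph are hypothetical.

```python
def pagerank(nodes, edges, d=0.85, eps=1e-8, max_iter=100):
    """Power iteration on a weighted directed graph.

    nodes: list of node ids; edges: {(src, dst): weight}.
    """
    n = len(nodes)
    out_weight = {v: 0.0 for v in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w
    pr = {v: 1.0 / n for v in nodes}  # initial value 1/V(G_i)
    for _ in range(max_iter):  # stop when k reaches the iteration cap
        # Assumption: rank held by dangling nodes is spread uniformly.
        dangling = sum(pr[v] for v in nodes if out_weight[v] == 0)
        new = {v: (1 - d) / n + d * dangling / n for v in nodes}
        for (src, dst), w in edges.items():
            # Assumption: weights are normalized by the source out-weight.
            new[dst] += d * pr[src] * w / out_weight[src]
        diff = sum(abs(new[v] - pr[v]) for v in nodes)
        pr = new
        if diff < eps:  # or stop when successive results differ by < eps
            break
    return pr


# Hypothetical sub-network: A forwarded B twice, B forwarded E once.
nodes = ["A", "B", "E"]
edges = {("A", "B"): 2, ("B", "E"): 1}
ranks = pagerank(nodes, edges)
```

Because edges point against the information flow, the original author E collects the most rank in this example, followed by B and then A.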
The ordinary PageRank algorithm converges slowly, and when the data scale is very large this greatly reduces the algorithm's execution efficiency. Using the MapReduce computation model, the present invention converts the problem into several smaller problems that are computed in a distributed, parallel manner on different computing nodes and are finally merged and aggregated. The principle of computing PageRank in parallel with MapReduce is introduced below with reference to Fig. 3.
The core data structure of MapReduce is the key-value pair, written here as <Key, Value>, where Key denotes the key and Value denotes the value.
In the present invention, each node in the network is treated as a key, and the pr value of the node together with its out-link nodes is treated as the value, i.e. $Value_i = (pr_i, outlinks(v_i, G))$, where $outlinks(v, G)$ denotes the out-link nodes of node $v$ in graph $G$, as shown in the input stage in Fig. 4. The whole dynamic forwarding network structure is finally converted into a set of rows, each row being one key-value pair.
The core computation of MapReduce can be understood as the following two steps:
(a) Map: through the forwarding relation, one key-value pair is mapped to one or more other key-value pairs, as follows:
where M is the mapping function.
(b) Reduce: several key-value pairs that have the same key are combined by merging their values, as follows:
$\langle Key_1, Value_1 \rangle$
$\langle Key_1, Value_3 \rangle$,
where R is the reduce function.
Note that the result of the reduce step is itself a set of key-value pairs and can be fed back in as the input of a map step, which makes it possible to implement iterative algorithms.
In this application, the map step takes each out-link node of a node in the network as a new key and the node's PageRank value divided by its total number of out-link nodes as the new value, producing new key-value pairs,
In the reduce step, the value for each key is the sum of the PageRank values of all links that point to the node corresponding to that key,
where d is the damping factor of the PageRank algorithm, generally set to 0.85, and N is the total number of nodes in the network.
The result of the last reduce step can in turn serve as the input of the next map step, so the iteration can continue until the termination condition is reached. Note that the process described above does not carry the nodes' out-link information through the map-reduce cycle; in practice the map step must not only produce the nodes' new pr values but also preserve the original structural relations.
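The map and reduce steps described above can be simulated in a single process as a rough sketch. It is not Hadoop code and not the patent's own implementation. The reduce formula used here, $(1-d)/N + d \cdot \text{sum of incoming shares}$, is the commonly used PageRank update and is an assumption, since the formula itself is not reproduced in the text. The function names, the tagged `("links", ...)` and `("rank", ...)` values, and the sample records are hypothetical; rank held by nodes without out-links is simply not redistributed in this sketch.

```python
from collections import defaultdict


def map_phase(records):
    """Emit rank shares for out-links and re-emit each node's structure."""
    for node, (pr, outlinks) in records.items():
        yield node, ("links", outlinks)  # preserve the graph structure
        for target in outlinks:
            yield target, ("rank", pr / len(outlinks))


def reduce_phase(pairs, n, d=0.85):
    """Group by key, sum incoming shares, and apply the damping factor."""
    shares = defaultdict(float)
    links = {}
    for key, (kind, value) in pairs:
        if kind == "links":
            links[key] = value
        else:
            shares[key] += value
    # Assumed update rule: (1 - d) / N + d * (sum of incoming shares).
    return {v: ((1 - d) / n + d * shares[v], links[v]) for v in links}


# Hypothetical input rows: node -> (pr, out-link nodes).
records = {"A": (1 / 3, ["B"]), "B": (1 / 3, ["E"]), "E": (1 / 3, [])}
for _ in range(20):  # output of reduce feeds the next map
    records = reduce_phase(map_phase(records), n=len(records))
```

Re-emitting each node's out-link list from the map step is what lets the output of one round serve directly as the input of the next, which is the structural point made in the preceding paragraph.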
(3) Fusion stage: multiply the ranking result $pr_i$ of each sub-network $G_i$ by its weight $W_i$, then merge the results of all sub-networks to obtain the final ranking result $pr$ of $G$.
8. Sort and output the results. Sort $pr$ by value to obtain a list of nodes ranked as key users; the larger the pr value, the more important the user.
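The decomposition, fusion, and output steps can be sketched together as follows. This is a minimal sketch under assumptions: "maximal connected subgraph" is interpreted as a weakly connected component (edge direction ignored), the per-component PageRank values are hypothetical placeholders rather than computed results, and the function names are illustrative.

```python
def weakly_connected_components(nodes, edges):
    """Group nodes into maximal connected subgraphs, ignoring direction."""
    neighbors = {v: set() for v in nodes}
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            v = stack.pop()
            comp.append(v)
            for u in neighbors[v] - seen:
                seen.add(u)
                stack.append(u)
        components.append(comp)
    return components


def fuse(component_ranks, total_nodes):
    """Scale each sub-network's ranks by W_i = V(G_i)/V(G) and merge."""
    fused = {}
    for ranks in component_ranks:
        weight = len(ranks) / total_nodes
        for user, pr in ranks.items():
            fused[user] = weight * pr
    return fused


def top_m(fused, m):
    """Return the m users with the largest fused pr values."""
    return sorted(fused, key=fused.get, reverse=True)[:m]


nodes = ["A", "B", "E", "X", "Y"]
edges = [("A", "B"), ("B", "E"), ("X", "Y")]
components = weakly_connected_components(nodes, edges)
# Hypothetical per-component PageRank values, one dict per sub-network.
component_ranks = [{"A": 0.2, "B": 0.3, "E": 0.5}, {"X": 0.4, "Y": 0.6}]
fused = fuse(component_ranks, total_nodes=len(nodes))
key_users = top_m(fused, 2)
```

Weighting by component size keeps a user who dominates a tiny sub-network from outranking users who score well in a much larger one.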
The present invention proposes a method for identifying key users of a microblog platform based on an improved PageRank. The method combines content with the structural information of the dynamic network to rank users, and uses MapReduce parallel computing to speed up execution. To some extent, the method overcomes the suboptimal solutions caused by short texts, relevance, and static network structure; ranking multiple dynamic forwarding networks separately with PageRank and then combining the results improves the robustness of the result.
Although embodiments of the present invention are disclosed above, the invention is not limited to the applications listed in the description and the embodiments; it can be applied in any field to which it is suited. Those skilled in the art can readily implement further modifications. Therefore, without departing from the general concept defined by the claims and their equivalents, improvements and modifications may also be made, and these shall likewise be regarded as falling within the scope of protection of the present invention.
Claims (9)
1. A method for identifying microblog key users based on an improved PageRank, characterized in that it comprises the following steps:
Step 1: input microblog information data, including the users, texts, and time information of n microblogs;
Step 2: perform word segmentation on the texts of the n microblogs according to a preset dictionary containing m words;
Step 3: build an inverted index from the segmentation result so that microblogs can be retrieved by specified keywords;
Step 4: extract the forwarding hierarchy information of the retrieved relevant microblogs and build a weighted directed forwarding network $G$;
Step 5: divide the forwarding network $G$ into several maximal connected subgraphs $G_i$;
Step 6: apply the PageRank algorithm in each sub-network using parallel computing to obtain the ranking result of each sub-network, and then merge the results;
Step 7: output the ranking result.
2. The method for identifying microblog key users based on an improved PageRank according to claim 1, characterized in that the microblog information data in step 1 are data captured from the Internet and from the microblog platform API within a preset time period.
3. The method for identifying microblog key users based on an improved PageRank according to claim 1, characterized in that the word segmentation of the text of each microblog comprises the following steps:
1) remove the stop words in the text;
2) split the words in the text according to the vocabulary in the preset dictionary, following the order of the text.
4. The method for identifying microblog key users based on an improved PageRank according to claim 1, characterized in that, in step 3, an inverted index from microblog word segmentation results to microblogs is built, so that specified microblogs can be conveniently retrieved by keyword.
5. The method for identifying microblog key users based on an improved PageRank according to claim 1, characterized in that, in step 4, the forwarding hierarchy information is extracted from the microblog text information and a weighted directed graph is built.
6. The method for identifying microblog key users based on an improved PageRank according to claim 1, characterized in that, in step 5, the forwarding network is decomposed into several maximal connected subgraphs and each sub-network is assigned a weight.
7. The method for identifying microblog key users based on an improved PageRank according to claim 1, characterized in that, in step 6, the PageRank algorithm is applied in each sub-network.
8. The method for identifying microblog key users based on an improved PageRank according to claim 1, characterized in that, in step 6, the PageRank algorithm is implemented in each sub-network using parallel computing based on the MapReduce computation model.
9. The method for identifying microblog key users based on an improved PageRank according to claim 1, characterized in that, in step 6, the computed results of the sub-networks are merged to form the ranking result of the whole forwarding network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310502924.7A CN103530402A (en) | 2013-10-23 | 2013-10-23 | Method for identifying microblog key users based on improved Page Rank |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310502924.7A CN103530402A (en) | 2013-10-23 | 2013-10-23 | Method for identifying microblog key users based on improved Page Rank |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103530402A (en) | 2014-01-22 |
Family
ID=49932411
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201310502924.7A | Pending | CN103530402A (en) | 2013-10-23 | 2013-10-23 | Method for identifying microblog key users based on improved Page Rank |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103530402A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101388025A (en) * | 2008-10-09 | 2009-03-18 | 浙江大学 | Semantic web object ordering method based on Pagerank |
CN101976245A (en) * | 2010-10-09 | 2011-02-16 | 吕琳媛 | Sequencing method of node importance in network |
US20120330864A1 (en) * | 2011-06-21 | 2012-12-27 | Microsoft Corporation | Fast personalized page rank on map reduce |
CN102708285A (en) * | 2012-04-24 | 2012-10-03 | 河海大学 | Coremedicine excavation method based on complex network model parallelizing PageRank algorithm |
CN102799671A (en) * | 2012-07-17 | 2012-11-28 | 西安电子科技大学 | Network individual recommendation method based on PageRank algorithm |
CN102819576A (en) * | 2012-07-23 | 2012-12-12 | 无锡雅座在线科技发展有限公司 | Data mining method and system based on microblog |
CN102982157A (en) * | 2012-12-03 | 2013-03-20 | 北京奇虎科技有限公司 | Device and method used for mining microblog hot topics |
Non-Patent Citations (4)
Title |
---|
YONG ZHANG 等: "User Influence Analysis On Micro Blog", 《2012 IEEE 2ND INTERNATIONAL CONFERENCE》, vol. 3, 1 November 2012 (2012-11-01) * |
廖松博 等: "GCPR:一种在MapReduce平台上基于图划分的PageRank加速方法", 《小型微型计算机系统》, vol. 33, no. 6, 30 June 2012 (2012-06-30) * |
方超 等: "一种基于MapReduce架构的微博用户影响力评价算法的设计与实现", 《通信市场》, 30 June 2013 (2013-06-30) * |
舒琰 等: "基于PageRank的微博排名MapReduce算法研究", 《计算机技术与发展》, vol. 23, no. 2, 28 February 2013 (2013-02-28) * |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103793525A (en) * | 2014-02-21 | 2014-05-14 | 江苏唯实科技有限公司 | MapReduce model graph node authority value calculation method based on local iteration |
CN104504251A (en) * | 2014-12-10 | 2015-04-08 | 沈阳航空航天大学 | Community dividing method based on PageRank algorithm |
CN104504251B (en) * | 2014-12-10 | 2017-12-15 | 沈阳航空航天大学 | A kind of community division method based on PageRank algorithms |
CN105260358A (en) * | 2015-10-14 | 2016-01-20 | 上海大学 | Short text-oriented unexpected incident development process representation method |
CN105245380A (en) * | 2015-10-21 | 2016-01-13 | 腾讯科技(深圳)有限公司 | Message transmission mode identifying method and device |
CN106202487B (en) * | 2016-07-19 | 2019-06-21 | 西北工业大学 | Based on user post behavior pattern multi thread social event summarize method |
CN106202487A (en) * | 2016-07-19 | 2016-12-07 | 西北工业大学 | Based on user post behavioral pattern multi thread social events sum up method |
CN106972952A (en) * | 2017-02-28 | 2017-07-21 | 浙江工业大学 | A kind of Information Communication leader's Node extraction method based on internet pricing correlation |
CN106972952B (en) * | 2017-02-28 | 2020-02-21 | 浙江工业大学 | Information propagation leader node extraction method based on network degree correlation |
CN107092651A (en) * | 2017-03-14 | 2017-08-25 | 中国科学院计算技术研究所 | A kind of key person's method for digging analyzed based on communication network data and system |
CN107092651B (en) * | 2017-03-14 | 2020-07-24 | 中国科学院计算技术研究所 | Key character mining method and system based on communication network data analysis |
CN108334953A (en) * | 2018-01-09 | 2018-07-27 | 浙江工业大学 | One kind being based on the newer social networks degree of belief learning method of weight |
CN108334953B (en) * | 2018-01-09 | 2021-08-03 | 浙江工业大学 | Social network trust learning method based on weight updating |
CN108509551A (en) * | 2018-03-19 | 2018-09-07 | 西北大学 | A kind of micro blog network key user digging system under the environment based on Spark and method |
CN108509551B (en) * | 2018-03-19 | 2022-03-01 | 西北大学 | Microblog network key user mining system and method based on Spark environment |
CN110334270A (en) * | 2018-03-21 | 2019-10-15 | 腾讯科技(深圳)有限公司 | A kind of news sort method, server and computer readable storage medium |
CN110334270B (en) * | 2018-03-21 | 2023-07-11 | 深圳市雅阅科技有限公司 | News ordering method, server and computer readable storage medium |
CN108763206A (en) * | 2018-05-22 | 2018-11-06 | 南京邮电大学 | A method of quicksort is carried out to single text keyword |
CN108763206B (en) * | 2018-05-22 | 2022-04-05 | 南京邮电大学 | Method for quickly sequencing keywords of single text |
CN111681120A (en) * | 2020-05-20 | 2020-09-18 | 卓尔智联(武汉)研究院有限公司 | Core user determination method, device and storage medium |
CN112052404A (en) * | 2020-09-23 | 2020-12-08 | 西安交通大学 | Group discovery method, system, device and medium for multi-source heterogeneous relation network |
CN112052404B (en) * | 2020-09-23 | 2023-08-15 | 西安交通大学 | Group discovery method, system, equipment and medium of multi-source heterogeneous relation network |
CN112632361A (en) * | 2020-12-29 | 2021-04-09 | 中科院计算技术研究所大数据研究院 | Iterative data acquisition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20140122 |