CN105159922B - Parallel community detection method for consignment data based on a label propagation algorithm - Google Patents

Parallel community detection method for consignment data based on a label propagation algorithm

Info

Publication number
CN105159922B
Authority
CN
China
Prior art keywords
consignment
sender
label
addressee
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510469289.6A
Other languages
Chinese (zh)
Other versions
CN105159922A (en)
Inventor
马云龙
刘敏
桂峰
章锋
袁菡
孙源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201510469289.6A
Publication of CN105159922A
Application granted
Publication of CN105159922B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

The present invention relates to a parallel community detection method for consignment data based on a label propagation algorithm, comprising: Step S1: preprocessing the consignment data and structuring it into text data according to a set format; Step S2: combining the consignment contact information between the nodes in the text data, normalizing the weights of the directed edges between nodes, and finally building a weighted directed consignment relationship network model in adjacency-list form; Step S3: mining the community structure of the consignment network in parallel on the MapReduce framework using an improved label propagation algorithm; Step S4: parsing the community structure obtained in step S3 to find the communities in the consignment network. Compared with the prior art, the invention improves the scalability and running efficiency of the conventional label propagation algorithm and thereby mines communities in the consignment network accurately and efficiently.

Description

Parallel community detection method for consignment data based on a label propagation algorithm
Technical field
The present invention relates to a method for building a consignment network from consignment data, and more particularly to a parallel community detection method for consignment data based on a label propagation algorithm.
Background art
Research on social network analysis originated in the early 1920s and focuses on the relationships between social entities, for example communication among group members, trade between countries, or economic transactions between companies. With the rapid development of information technology, social networks have become increasingly complex, and both network administrators and network researchers want a clear understanding of social network structure. Community mining is important for understanding that structure: the discovery of community structure has significant theoretical and practical value for network topology analysis, network function analysis, and network behavior prediction, and it is widely applied in fields such as social networks, biological networks, and the identification of terrorist organizations.
First, clustering-based community discovery algorithms usually consider only the attribute information of nodes and therefore ignore other useful information (such as edge weights); they also require an input parameter to be given in advance (the number of communities in the network), so the accuracy of the resulting community division is low. Second, the label propagation algorithm requires no input parameters, has linear time complexity, converges quickly, and mines communities with relatively high accuracy, which makes it suitable for community mining in large-scale networks. Finally, with the rapid development of computer and Internet technology, the ability to acquire data keeps growing, and the networks to be analyzed have grown from tens or hundreds of nodes to millions of nodes or more, so non-distributed algorithms are no longer suitable for community discovery in such large-scale networks. The MapReduce computing framework on the Hadoop platform is well suited to processing large-scale data. Introducing the MapReduce computing framework into the community mining algorithm, and using distributed computing to solve community discovery in large-scale consignment networks, is therefore a practical solution.
Summary of the invention
The object of the present invention is to overcome the above drawbacks of the prior art and to provide a parallel community detection method for consignment data based on a label propagation algorithm. On the basis of a constructed consignment relationship network model, the method uses the MapReduce distributed computing framework to improve the scalability and running efficiency of the conventional label propagation algorithm, and thereby mines communities in the consignment network accurately and efficiently.
The object of the present invention can be achieved through the following technical solution:
A parallel community detection method for consignment data based on a label propagation algorithm, comprising:
Step S1: preprocessing the consignment data and structuring it into text data according to a set format;
Step S2: combining the consignment contact information between the nodes in the text data, normalizing the weights of the directed edges between nodes, and finally building a weighted directed consignment relationship network model in adjacency-list form;
Step S3: mining the community structure of the consignment network in parallel on the MapReduce framework using an improved label propagation algorithm;
Step S4: parsing the community structure obtained in step S3 to find the communities in the consignment network.
The text data is uploaded to the HDFS (Hadoop Distributed File System) of the Hadoop platform for storage and processing.
Step S1 is specifically: for each consignment record, the sender's name, the sender's telephone number, the addressee's name, and the addressee's telephone number are extracted; these four fields correspond to the four columns of each line of the text data.
Step S2 is specifically:
201: for each sender, obtain the adjacency list of logistics contact frequencies between that sender and its addressees, and normalize the adjacency list;
202: for any sender and addressee between whom logistics exchanges exist, count the number A of identical addressees they have when each acts as a sender; A is recorded as the shared send-neighbor count;
203: for any sender and addressee between whom logistics exchanges exist, count the number B of identical senders they have when each acts as an addressee; B is recorded as the shared receive-neighbor count;
204: for any sender and addressee between whom logistics exchanges exist, obtain the sum of their shared send-neighbor count and shared receive-neighbor count; this sum is the shared-neighbor count between the sender and the addressee, and the shared-neighbor count is normalized;
205: add the adjacency-list weights obtained in step 201 and the shared-neighbor counts obtained in step 204 in the ratio α : (1 − α) to obtain directed edge weights that take into account the sending frequency, the shared send-neighbor count, and the shared receive-neighbor count, and update the adjacency list, where 0 < α < 1.
The improved label propagation algorithm uses multiple iterations, and one iteration is specifically:
301: append the unique identifier ID of the corresponding sender node to the end of each adjacency list obtained in step S2 as the sender node label Label, completing label initialization;
302: from the adjacency list with node labels, output multiple key-value pairs in <key,value> form, divided into sender key-value pairs and addressee key-value pairs;
303: collect the key-value pairs with the same key and traverse each value; first, take the value of the sender key-value pair, which represents the adjacency list of that key, and store it in the variable adjacent; next, for the values of the addressee key-value pairs, sum the weights under each distinct Label and update the node label NewLabel of that key according to the share of each Label;
304: append NewLabel to the end of adjacent, output a new key-value pair in <key,value> form, and update the label of the adjacency list; the community structure in the consignment network corresponds to the adjacency lists with labels.
The iteration termination conditions of the improved label propagation algorithm include: more than a set percentage of node labels do not change between two successive iterations, or a set number of iterations is reached.
The set percentage is 90%.
The set number of iterations is 20 to 30.
Step S4 is specifically: according to the adjacency lists obtained in step S3, nodes with the same label are regarded as belonging to the same community, so that the communities in the consignment network are found.
Compared with the prior art, the present invention has the following advantages:
1) The prior art mainly mines communities with single-machine algorithms, which are not suitable for community mining in large-scale networks. The present invention builds a consignment network from consignment data and applies a parallel label propagation algorithm to that network, so that communities in the consignment network are mined accurately and efficiently. The method is particularly suitable for large-scale networks, where its advantage over conventional single-machine algorithms is pronounced.
2) Three indicators are considered when calculating the edge weights of the consignment network: 1. the logistics contact frequency of the two consignment parties; 2. the number of identical addressees the two parties have when each acts as a sender; 3. the number of identical senders the two parties have when each acts as an addressee. The invention combines these three indicators to calculate the weights of all edges in the network, which improves mining precision and accuracy.
3) The method of the present invention requires no input parameters, has linear time complexity, converges quickly, and is suitable for community mining in large-scale networks.
4) Combined with the MapReduce distributed computing framework, the text data reflecting the consignment data is uploaded to the HDFS of a Hadoop cluster for storage and processing, which improves the scalability and time efficiency of the algorithm.
Description of the drawings
Fig. 1 is the overall flow chart of the method of the present invention;
Fig. 2 is the flow chart of building the consignment relationship network model from consignment data;
Fig. 3 is the flow chart of mining communities in parallel with the improved label propagation algorithm.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment. The embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation and specific operating procedure are given, but the protection scope of the present invention is not limited to the following embodiment.
As shown in Fig. 1, a parallel community detection method for consignment data based on a label propagation algorithm is divided into a stage of building the consignment relationship network model and a mining stage, as follows:
Step S1: preprocess the consignment data and structure it into text data according to a set format; the text data is uploaded to the HDFS of a Hadoop cluster for storage and processing. Specifically:
For each consignment record, the sender's name, the sender's telephone number, the addressee's name, and the addressee's telephone number are extracted; these four fields correspond to the four columns of each line of the text data.
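The preprocessing of step S1 can be sketched as follows. This is a minimal single-machine illustration, not the patent's implementation: the field names in `FIELDS`, the tab separator between columns, and the choice to skip records with a missing field are all assumptions made for the example.

```python
# Hypothetical field names; the patent only says that four fields are extracted.
FIELDS = ["sender_name", "sender_phone", "addressee_name", "addressee_phone"]

def to_text_rows(records):
    """Return one tab-separated line per record, keeping only the four fields."""
    rows = []
    for record in records:
        values = [str(record.get(field, "")).strip() for field in FIELDS]
        if all(values):  # assumption: records with a missing field are skipped
            rows.append("\t".join(values))
    return rows

records = [
    {"sender_name": "Ann", "sender_phone": "111",
     "addressee_name": "Bob", "addressee_phone": "222", "weight_kg": 1.5},
    {"sender_name": "Ann", "sender_phone": "111",
     "addressee_name": "", "addressee_phone": "333"},
]
rows = to_text_rows(records)
```

In the method itself, the resulting lines would then be uploaded to HDFS; that step is omitted here.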
Step S2: combine the consignment contact information between the nodes in the text data, normalize the weights of the directed edges between nodes, and finally build a weighted directed consignment relationship network model in adjacency-list form and upload it to HDFS. As shown in Fig. 2, specifically:
201: For each sender, obtain the adjacency list of logistics contact frequencies between that sender and its addressees, and normalize the adjacency list. In detail:
1) First, based on the MapReduce computing framework, the Map stage reads, line by line, the structured text data stored in HDFS in step S1. For each sender and addressee, the combination of name and telephone number is used as the unique identifier ID, and key-value pairs in <key,value> form are output, where the key is the sender ID and the value is the addressee ID.
2) In the Reduce stage, for values with the same key, i.e. for the same sender, the frequency of logistics exchanges between that sender and each addressee is counted. For each sender this yields an adjacency list that considers only the frequency of logistics exchanges between the sender and its addressees.
3) Next, according to the adjacency list of each sender, if the frequency with which a sender sends parcels exceeds a set frequency (500 times in this embodiment, chosen empirically), the sender can be judged to be a logistics terminal, a Taobao seller, or the like. The adjacency list of that sender is therefore deleted, and the sender's node is also deleted from the adjacency lists of the other senders.
4) Finally, from the newly generated adjacency lists of all senders, find the sender and addressee with the highest logistics contact frequency; let this maximum number of contacts be Max, and normalize the adjacency lists of all senders by Max. Suppose the adjacency list of a sender is [S\tR1:C1\tR2:C2...\tRk:Ck], where \t is the separator, S (short for Sender) denotes the sender, R (short for Receiver) denotes an addressee, C (short for Count) denotes the number of contacts, and the subscript k is the index of the addressee and its corresponding count. The contact counts (C1, C2, ..., Ck) of the addressees (R1, R2, ..., Rk) that have logistics exchanges with the sender are divided by Max, giving the normalized adjacency list of the sender, [S\tR1:W1\tR2:W2...\tRk:Wk], where Wk = Ck/Max.
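Sub-steps 1) to 4) of step 201 can be sketched on a single machine as follows. The function name `build_adjacency` and the dictionary representation are illustrative assumptions, and "sending frequency" is interpreted here as a sender's total number of shipments, which the text does not state explicitly. A small threshold is used in the example instead of the embodiment's 500.

```python
from collections import Counter, defaultdict

def build_adjacency(pairs, hub_threshold=500):
    """pairs: iterable of (sender_id, addressee_id), one per consignment record."""
    # Sub-steps 1) and 2): group by sender and count contacts per addressee.
    counts = defaultdict(Counter)
    for sender, addressee in pairs:
        counts[sender][addressee] += 1
    # Sub-step 3): drop high-volume senders, both as rows and as neighbours.
    hubs = {s for s, row in counts.items() if sum(row.values()) > hub_threshold}
    kept = {}
    for sender, row in counts.items():
        if sender in hubs:
            continue
        filtered = {r: n for r, n in row.items() if r not in hubs}
        if filtered:
            kept[sender] = filtered
    # Sub-step 4): divide every count by the global maximum count Max.
    max_count = max((n for row in kept.values() for n in row.values()), default=1)
    return {s: {r: n / max_count for r, n in row.items()} for s, row in kept.items()}

pairs = [("A", "B"), ("A", "B"), ("A", "C"),
         ("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"),
         ("B", "H")]
adjacency = build_adjacency(pairs, hub_threshold=3)
```

Here H sends four parcels, exceeds the example threshold of 3, and is removed; B's only addressee was H, so B's row disappears as well.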
202: Obtain the shared send-neighbor count: for any sender and addressee between whom logistics exchanges exist, count the number A of identical addressees they have when each acts as a sender; A is recorded as the shared send-neighbor count. In detail:
1) First, under the MapReduce computing framework, the Map stage reads the adjacency list [S\tR1:W1\tR2:W2...\tRk:Wk] of each sender and outputs multiple key-value pairs in <key,value> form: <S,+R1\tR2...\tRk> (the "+" distinguishes this pair from the subsequent <key,value> pairs) and <R1,S\tR2...\tRk>, <R2,S\tR1...\tRk>, ..., <Rk,S\tR1...\tRk-1>.
2) In the Reduce stage, the <key,value> pairs with the same key are collected and each value is traversed. First, the value carrying "+" is taken and, after the "+", split on "\t" into an array whose elements are the addressees of the current key user when acting as a sender; these neighbor users are stored in a HashSet named set_key. Next, each remaining value without "+" is split on "\t" into an array and parsed, and the result is stored in a HashMap named map (the key of an entry is the first element of the split array, and its value is a HashSet holding the other elements of the array). Finally, map is traversed and the intersection of each entry's value with set_key is computed; the size of the intersection is the shared send-neighbor count of the entry's key and the current Reduce key when each acts as a sender.
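The quantity computed in step 202 can be illustrated with a plain set intersection. The sketch below is a single-machine equivalent rather than the MapReduce job described above; `shared_send_neighbours` is a hypothetical name, and the adjacency lists are assumed to be dictionaries mapping each sender to its addressees and weights.

```python
def shared_send_neighbours(adjacency):
    """For every edge sender -> addressee, count the addressees both parties send to."""
    out_sets = {sender: set(row) for sender, row in adjacency.items()}
    result = {}
    for sender, addressees in out_sets.items():
        for addressee in addressees:
            # An addressee that never sends has an empty set of send-neighbours.
            result[(sender, addressee)] = len(addressees & out_sets.get(addressee, set()))
    return result

adjacency = {
    "A": {"B": 1.0, "C": 0.5},
    "B": {"C": 1.0, "D": 0.2},
}
shared_send = shared_send_neighbours(adjacency)
```

In this example A and B both send to C, so the edge from A to B has a shared send-neighbor count of 1.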
203: Obtain the shared receive-neighbor count: for any sender and addressee between whom logistics exchanges exist, count the number B of identical senders they have when each acts as an addressee; B is recorded as the shared receive-neighbor count. In detail:
First, from the adjacency list [S\tR1:W1\tR2:W2...\tRk:Wk] of each sender in step 201, an inverted index from each addressee to its senders is built, [R1\tSl\tSp...\tSn], where the subscripts l, p, n are the indices of the senders after inversion. Next, by analogy with the procedure of step 202, for any sender and addressee between whom logistics exchanges exist, the shared receive-neighbor count when each acts as an addressee is computed.
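Step 203 can be sketched in the same single-machine style: build the inverted index from addressees to senders, then intersect the sender sets. The names `invert` and `shared_receive_neighbours` and the dictionary representation are assumptions for illustration.

```python
from collections import defaultdict

def invert(adjacency):
    """Inverted index: addressee -> set of senders that send to it."""
    incoming = defaultdict(set)
    for sender, row in adjacency.items():
        for addressee in row:
            incoming[addressee].add(sender)
    return dict(incoming)

def shared_receive_neighbours(adjacency):
    """For every edge sender -> addressee, count the senders both parties receive from."""
    incoming = invert(adjacency)
    return {
        (sender, addressee): len(incoming.get(sender, set()) & incoming[addressee])
        for sender, row in adjacency.items()
        for addressee in row
    }

adjacency = {
    "A": {"B": 1.0, "C": 1.0},
    "D": {"A": 1.0, "B": 1.0},
}
shared_recv = shared_receive_neighbours(adjacency)
```

Here D sends to both A and B, so the edge from A to B has a shared receive-neighbor count of 1.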
204: For any sender and addressee between whom logistics exchanges exist, obtain the sum of their shared send-neighbor count and shared receive-neighbor count; this sum is the shared-neighbor count between the sender and the addressee. The maximum shared-neighbor count in the whole network is then found and used to normalize the shared-neighbor count of every sender node and addressee node between which logistics contact exists.
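A minimal sketch of step 204, assuming the two counts are held in dictionaries keyed by (sender, addressee) edges. The function name is hypothetical, and returning zeros when no edge shares any neighbor is an assumption made to avoid division by zero; the text does not cover that case.

```python
def normalised_shared_neighbours(shared_send, shared_recv):
    """Sum the two counts per edge and divide by the network-wide maximum."""
    edges = shared_send.keys() | shared_recv.keys()
    total = {e: shared_send.get(e, 0) + shared_recv.get(e, 0) for e in edges}
    max_total = max(total.values(), default=0)
    if max_total == 0:  # assumption: no shared neighbours anywhere -> all zeros
        return {e: 0.0 for e in total}
    return {e: n / max_total for e, n in total.items()}

shared_send = {("A", "B"): 1, ("A", "C"): 0}
shared_recv = {("A", "B"): 1, ("A", "C"): 1}
shared = normalised_shared_neighbours(shared_send, shared_recv)
```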
205: Add the adjacency-list weights obtained in step 201 and the shared-neighbor counts obtained in step 204 in the ratio α : (1 − α) to obtain directed edge weights that take into account the sending frequency, the shared send-neighbor count, and the shared receive-neighbor count. That is, the edge weight in the adjacency list has weight ratio α, and the shared send-neighbor and shared receive-neighbor counts have weight ratio 1 − α, where 0 < α < 1. The adjacency list is updated with the new directed edge weights, and the newly generated adjacency list is uploaded to HDFS.
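The weighted combination of step 205 amounts to new_weight = α · frequency_weight + (1 − α) · shared_neighbor_weight for each edge. The sketch below assumes dictionary inputs; the function name and the example value α = 0.5 are illustrative, since the text only requires 0 < α < 1.

```python
def combine_weights(adjacency, shared, alpha=0.5):
    """Blend normalised contact frequency and normalised shared-neighbour count."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    return {
        sender: {
            addressee: alpha * weight + (1 - alpha) * shared.get((sender, addressee), 0.0)
            for addressee, weight in row.items()
        }
        for sender, row in adjacency.items()
    }

adjacency = {"A": {"B": 1.0, "C": 0.5}}
shared = {("A", "B"): 1.0, ("A", "C"): 0.0}
weighted = combine_weights(adjacency, shared, alpha=0.5)
```

A larger α gives more influence to how often two parties exchange parcels; a smaller α gives more influence to how many neighbors they share.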
This completes the data processing of the stage of building the consignment relationship network model, as shown in Fig. 2. The data processing of the mining stage, shown in Fig. 3, follows.
Step S3: mine the community structure of the consignment network in parallel on the MapReduce framework using an improved label propagation algorithm.
The improved label propagation algorithm uses multiple iterations, and one iteration is specifically:
301: Append the unique identifier ID of the corresponding sender node to the end of each adjacency list obtained in step S2 as the sender node label Label, completing label initialization. The corresponding adjacency list with node label is expressed as [S\tR1:W1\tR2:W2...\tRk:Wk\tLabel].
302: In the Map stage, from the adjacency list with node labels, output multiple key-value pairs in <key,value> form, divided into the sender key-value pair <S,+R1:W1\tR2:W2...\tRk:Wk> (the "+" distinguishes it from the <key,value> pairs generated next) and the addressee key-value pairs <R1,Label\tW1>, <R2,Label\tW2>, ..., <Rk,Label\tWk>.
303: In the Reduce stage, collect the <key,value> pairs with the same key and traverse each value. First, take the value of the sender key-value pair (i.e. the value carrying "+"), which represents the adjacency list of that key, and store it in the variable adjacent. Next, for the values of the addressee key-value pairs (the values without "+"), sum the weights under each distinct Label and update the node label NewLabel of that key according to the share of each Label: the larger the share of a Label, the more likely the label of the current key node is updated to that Label.
304: Append the newly generated label NewLabel of the key node to the end of adjacent and output a new key-value pair in <key,value> form, i.e. <S,R1:W1\tR2:W2...\tRk:Wk\tNewLabel>, and update the label of the adjacency list. The community structure in the consignment network corresponds to the adjacency lists with labels.
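One iteration of steps 302 to 304 can be sketched in memory as a map phase followed by a reduce phase. Several points are assumptions rather than statements of the patent: the new label is chosen as the label with the largest weight sum (the text only says a larger share makes a label more likely), ties are broken alphabetically, a node that receives no votes keeps its previous label, and the tuple tags "adj" and "vote" stand in for the "+" marker.

```python
from collections import defaultdict

def lpa_iteration(graph):
    """graph: node -> (adjacency dict, label). Returns the graph after one iteration."""
    # Map phase: each node emits its own adjacency list and one vote per addressee.
    emitted = defaultdict(list)
    for node, (row, label) in graph.items():
        emitted[node].append(("adj", row, label))
        for addressee, weight in row.items():
            emitted[addressee].append(("vote", label, weight))
    # Reduce phase: per key, recover the adjacency list and sum weights per label.
    updated = {}
    for node, values in emitted.items():
        row, old_label = {}, node  # nodes that never send start with their own ID
        votes = defaultdict(float)
        for value in values:
            if value[0] == "adj":
                _, row, old_label = value
            else:
                _, label, weight = value
                votes[label] += weight
        # Assumption: pick the heaviest label; sorted() makes ties deterministic.
        new_label = max(sorted(votes), key=votes.get) if votes else old_label
        updated[node] = (row, new_label)
    return updated

graph = {
    "A": ({"B": 0.9, "C": 0.2}, "A"),  # step 301: initial label is the node's own ID
    "B": ({"C": 0.8}, "B"),
}
after = lpa_iteration(graph)
labels = {node: label for node, (_, label) in after.items()}
```

In this example C receives weight 0.2 for label A and 0.8 for label B, so it adopts label B, while B adopts label A from its only incoming edge.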
The iteration termination conditions of the improved label propagation algorithm are the following two: 1. the node labels are basically stable, i.e. more than a set percentage of node labels do not change between two successive iterations, the set percentage being 90% in this embodiment; 2. a set number of iterations is reached, generally 20 to 30, and 25 in this embodiment.
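The two termination conditions can be checked with a small helper. The function names are hypothetical; the defaults of 0.9 and 25 are the values given for this embodiment, and "more than" is read as a strict inequality.

```python
def stable_fraction(old_labels, new_labels):
    """Fraction of nodes whose label is unchanged between two successive iterations."""
    if not new_labels:
        return 1.0
    unchanged = sum(1 for node, label in new_labels.items() if old_labels.get(node) == label)
    return unchanged / len(new_labels)

def should_stop(old_labels, new_labels, iteration, min_stable=0.9, max_iterations=25):
    """Stop when labels are stable enough or the iteration budget is used up."""
    return stable_fraction(old_labels, new_labels) > min_stable or iteration >= max_iterations

old = {"A": "A", "B": "B"}
new = {"A": "A", "B": "A"}
```

With these example labels only half of the nodes are stable, so iteration continues until either the labels settle or iteration 25 is reached.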
Step S4: parse the community structure obtained in step S3 to find the communities in the consignment network, and store the result in HDFS. Specifically:
According to the adjacency lists obtained in step S3, nodes with the same label are regarded as belonging to the same community, so that the communities in the consignment network are found.
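Step S4 reduces to grouping nodes by their final label. A minimal sketch, assuming the final labels are available as a dictionary from node to label (the function name is hypothetical):

```python
from collections import defaultdict

def communities(labels):
    """Group nodes by final label; each group is one community."""
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return dict(groups)

final_labels = {"A": "A", "B": "A", "C": "B", "D": "B", "E": "A"}
found = communities(final_labels)
```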
In summary, the stage of building the consignment relationship network model is the data preprocessing process, and the mining stage is an iterative process that implements a distributed form of the single-machine label propagation algorithm. In addition, because of the particular nature of logistics consignment data, this patent considers three indicators when calculating the edge weights of the consignment network: 1. the logistics contact frequency of the two consignment parties; 2. the number of identical addressees the two parties have when each acts as a sender; 3. the number of identical senders the two parties have when each acts as an addressee. The invention combines these three indicators to calculate the weights of all edges in the network, so that communities in the consignment network are mined accurately and efficiently.

Claims (8)

1. A parallel community detection method for consignment data based on a label propagation algorithm, characterized by comprising:
Step S1: preprocessing the consignment data and structuring it into text data according to a set format;
Step S2: combining the consignment contact information between the nodes in the text data, normalizing the weights of the directed edges between nodes, and finally building a weighted directed consignment relationship network model in adjacency-list form;
Step S3: mining the community structure of the consignment network in parallel on the MapReduce framework using an improved label propagation algorithm;
Step S4: parsing the community structure obtained in step S3 to find the communities in the consignment network;
wherein the improved label propagation algorithm uses multiple iterations, and one iteration is specifically:
301: appending the unique identifier ID of the corresponding sender node to the end of each adjacency list obtained in step S2 as the sender node label Label, completing label initialization;
302: outputting, from the adjacency list with node labels, multiple key-value pairs in <key,value> form, divided into sender key-value pairs and addressee key-value pairs;
303: collecting the key-value pairs with the same key and traversing each value; first, taking the value of the sender key-value pair, which represents the adjacency list of that key, and storing it in the variable adjacent; next, for the values of the addressee key-value pairs, summing the weights under each distinct Label and updating the node label NewLabel of that key according to the share of each Label;
304: appending NewLabel to the end of adjacent, outputting a new key-value pair in <key,value> form, and updating the label of the adjacency list, the community structure in the consignment network corresponding to the adjacency lists with labels.
2. The parallel community detection method for consignment data based on a label propagation algorithm according to claim 1, characterized in that the text data is uploaded to the HDFS of a Hadoop cluster for storage and processing.
3. The parallel community detection method for consignment data based on a label propagation algorithm according to claim 1, characterized in that step S1 is specifically: for each consignment record, the sender's name, the sender's telephone number, the addressee's name, and the addressee's telephone number are extracted, these four fields corresponding to the four columns of each line of the text data.
4. The parallel community detection method for consignment data based on a label propagation algorithm according to claim 1, characterized in that step S2 is specifically:
201: for each sender, obtaining the adjacency list of logistics contact frequencies between that sender and its addressees, and normalizing the adjacency list;
202: for any sender and addressee between whom logistics exchanges exist, counting the number A of identical addressees they have when each acts as a sender, A being recorded as the shared send-neighbor count;
203: for any sender and addressee between whom logistics exchanges exist, counting the number B of identical senders they have when each acts as an addressee, B being recorded as the shared receive-neighbor count;
204: for any sender and addressee between whom logistics exchanges exist, obtaining the sum of their shared send-neighbor count and shared receive-neighbor count, this sum being the shared-neighbor count between the sender and the addressee, and normalizing the shared-neighbor count;
205: adding the adjacency-list weights obtained in step 201 and the shared-neighbor counts obtained in step 204 in the ratio α : (1 − α) to obtain directed edge weights that take into account the sending frequency, the shared send-neighbor count, and the shared receive-neighbor count, and updating the adjacency list, where 0 < α < 1.
5. The parallel community detection method for consignment data based on a label propagation algorithm according to claim 1, characterized in that the iteration termination conditions of the improved label propagation algorithm include: more than a set percentage of node labels do not change between two successive iterations, or a set number of iterations is reached.
6. The parallel community detection method for consignment data based on a label propagation algorithm according to claim 5, characterized in that the set percentage is 90%.
7. The parallel community detection method for consignment data based on a label propagation algorithm according to claim 5, characterized in that the set number of iterations is 20 to 30.
8. The parallel community detection method for consignment data based on a label propagation algorithm according to claim 1, characterized in that step S4 is specifically: according to the adjacency lists obtained in step S3, nodes with the same label are regarded as belonging to the same community.
CN201510469289.6A 2015-08-03 2015-08-03 Parallel community detection method for consignment data based on a label propagation algorithm Expired - Fee Related CN105159922B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510469289.6A CN105159922B (en) 2015-08-03 2015-08-03 Parallel community detection method for consignment data based on a label propagation algorithm

Publications (2)

Publication Number Publication Date
CN105159922A CN105159922A (en) 2015-12-16
CN105159922B true CN105159922B (en) 2018-08-24

Family

ID=54800779

Country Status (1)

Country Link
CN (1) CN105159922B (en)

Also Published As

Publication number Publication date
CN105159922A (en) 2015-12-16

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180824

Termination date: 20210803
