CN105069290B - A parallelized key node discovery method for consignment data - Google Patents

A parallelized key node discovery method for consignment data

Info

Publication number
CN105069290B
Authority
CN
China
Prior art keywords
node
key
represent
weights
consignment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510469302.8A
Other languages
Chinese (zh)
Other versions
CN105069290A (en)
Inventor
马云龙
刘敏
桂峰
章锋
袁菡
孙源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201510469302.8A
Publication of CN105069290A
Application granted
Publication of CN105069290B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention relates to a parallelized key node discovery method for consignment data, comprising: Step S1: obtaining the node activity from the total number of sends and receipts of each node within a set time period in the consignment data, and using the node activity as the weight of the node itself; Step S2: obtaining the edge weight of each node pair from the interaction frequency and the shared-neighbor index of each node pair within the set time period, and defining the network formed by the consignment data as a directed double-weighted network graph; Step S3: adding the node weights and the edge weights to the PageRank algorithm and mining the key nodes of the directed double-weighted network graph in parallel. Compared with the prior art, the invention makes full use of the information in the logistics consignment network, reduces the loss of useful information, and improves the accuracy of key node discovery; because it runs in parallel, it also greatly improves the efficiency and stability of key node mining.

Description

A parallelized key node discovery method for consignment data
Technical field
The present invention relates to the technical field of social network analysis, and more particularly to a parallelized key node discovery method for consignment data.
Background
Since the concept of the "social network" was proposed by a British scholar in the 1920s, research on social networks has never stopped. With the rapid development of bioinformatics, network technology, communication technology, and social platforms, the individuals in society now form huge and complex social networks. Complex networks are closely tied to daily life. Those we often encounter include the Internet and the World Wide Web in computing, communication networks, e-mail networks, microblog networks, consignment relationship networks in logistics, and protein-protein interaction networks in biomedicine. Key nodes are an important class of node found throughout social network structures, and research on key nodes in social networks has been a hot topic in recent years. Finding key nodes in social and physical networks and assessing their importance has great practical significance, for example finding the most active users of a community in a social network, locating the key nodes for network attack and defense, or identifying key persons in a logistics network. Discovering key nodes helps to mine the information in a social network more deeply and to identify the key nodes in community structures, and it has profound theoretical and practical significance for understanding the structure and function of social networks.
First, most existing research on key nodes in complex networks uses Google's PageRank algorithm or improves on it. However, most key node discovery algorithms consider only edge weights; few take the weight of the node itself into account, so much useful information is ignored when mining key persons in a network, which reduces the accuracy of key node discovery. Second, besides defining node activity as the weight of the node itself, we compute the edge weight from two factors: the number of shared neighbors of the two nodes joined by the edge, and the interaction frequency of the node pair. This makes fuller use of the information in the network. Finally, with the rapid development of computer and Internet technology, the ability to acquire data keeps growing, and the networks under study have grown from tens or hundreds of nodes to the scale of millions. Because the MapReduce programming framework is suited to processing large-scale data, the present invention proposes a parallelized key node discovery method for large-scale consignment data based on MapReduce.
Summary of the invention
The purpose of the present invention is to overcome the above defects of the prior art and to provide a parallelized key node discovery method for consignment data. Based on a real logistics network, it takes the node activity, the interaction frequency of nodes, and the shared neighbor count of node pairs into account in the weight calculation, making full use of the information in the logistics consignment network, reducing the loss of useful information, and improving the accuracy of key node discovery. Based on the MapReduce programming framework, it improves on Google's mature PageRank algorithm and parallelizes it, greatly improving the efficiency and stability of key node mining.
The purpose of the present invention can be achieved through the following technical solution:
A parallelized key node discovery method for consignment data, comprising:
Step S1: Obtain the node activity from the total number of sends and receipts of each node within a set time period in the consignment data, and use the node activity as the weight of the node itself;
Step S2: Obtain the edge weight of each node pair from the interaction frequency and the shared-neighbor index of each node pair within the set time period in the consignment data, and define the network formed by the consignment data as a directed double-weighted network graph;
Step S3: Add the node weights and the edge weights to the PageRank algorithm, and mine the key nodes of the directed double-weighted network graph in parallel.
The node activity satisfies the following formula:
$$a_i = \frac{M_i}{\mathrm{Max\_num}} \tag{1}$$
In the formula, $a_i$ is the node activity of node $i$, $M_i$ is the total number of sends and receipts of node $i$ within the set time period, and Max_num is the maximum of all $M_i$.
The edge weight satisfies the following formula:
$$w_{ji} = a \times \mathrm{freq}_{ij} + (1-a)\,\mathrm{Neighbor}(i,j) \tag{2}$$
In the formula, $w_{ji}$ is the weight of the edge between node $i$ and node $j$, $\mathrm{freq}_{ij}$ is the interaction frequency between node $i$ and node $j$, $\mathrm{Neighbor}(i,j)$ is the shared-neighbor index between node $i$ and node $j$, and $a$ is the adjustment factor.
The interaction frequency satisfies the following formula:
$$\mathrm{freq}_{ij} = \frac{n_{ij}}{\mathrm{Max\_num}} \tag{3}$$
In the formula, $\mathrm{freq}_{ij}$ is the interaction frequency between node $i$ and node $j$, $n_{ij}$ is the number of occurrences of the edge formed by node $i$ and node $j$, and Max_num is the maximum of all $n_{ij}$.
The shared-neighbor index satisfies the following formula:
$$\mathrm{Neighbor}(i,j) = \frac{\mathrm{Neighbor\_shared\_num}(i,j)}{\mathrm{Max\_SharedNum}} \tag{4}$$
In the formula, $\mathrm{Neighbor}(i,j)$ is the shared-neighbor index between node $i$ and node $j$, Neighbor_shared_num(i, j) is the number of neighbors shared by node $i$ and node $j$, and Max_SharedNum is the maximum of all Neighbor_shared_num(i, j).
Step S3 is specifically:
301: Obtain the PageRank value of each node, which satisfies the following formula:
$$PR(p_i) = \frac{a_i}{N} + (1-a_i) \times \sum_{p_j \in M(p_i)} \frac{PR(p_j) \times w_{ji}}{L(p_j)} \tag{5}$$
In the formula, $PR(p_i)$ is the PageRank value of node $i$; $p_j \in M(p_i)$, where $M(p_i)$ is the set of nodes that point to node $i$; $L(p_j)$ is the out-degree of a node $p_j$ that points to node $i$; $N$ is the total number of nodes in the consignment data; $a_i$ is the node activity of node $i$; and $w_{ji}$ is the weight of the edge between node $i$ and node $j$;
302: For each node, compare the PageRank values obtained in the previous and current rounds and judge whether the absolute value of their difference is greater than the given threshold ε. If so, return to step 301 and obtain the PageRank value of each node for the next round; if not, perform step 303;
303: Sort the PageRank values of the nodes finally obtained in step 302; the top k nodes are the key nodes mined, where k is the number of key nodes.
The data of each step of the parallelized key node discovery method is processed in parallel on the MapReduce programming framework.
Compared with the prior art, the present invention has the following advantages:
1) Among existing algorithms for discovering key nodes in complex networks, few consider both the weight of the node itself and an edge weight that reflects the interaction frequency and the number of shared neighbors of the nodes. The method of the invention takes the activity of the node itself into account and uses it as the node weight; for the edge weight it introduces two determining factors, the interaction frequency between nodes and the number of shared neighbors of the nodes. It therefore makes fuller use of the information in the network, improves the accuracy of the algorithm, and is suitable for key node discovery in large-scale social networks.
2) A consignment network is built from consignment data, and the PageRank algorithm is applied to mine key nodes in the network formed by logistics consignment data, which suits accurate and fast mining of key nodes from massive consignment data.
3) The improved PageRank is computed in parallel on the MapReduce programming framework, which greatly improves the scalability, mining efficiency, and stability of the algorithm.
Brief description of the drawings
Fig. 1 is the overall flow chart of the parallelization scheme of the present invention;
Fig. 2 is a diagram of how MapReduce processes data;
Fig. 3 is a schematic diagram of the definition of the shared neighbor count.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment. This embodiment is implemented on the basis of the technical solution of the invention and gives a detailed implementation and a specific operating procedure, but the scope of protection of the invention is not limited to the following embodiment.
As shown in Fig. 2, MapReduce splits a large volume of data into groups, distributes the processing to the worker nodes under a master node, and finally merges the results of the worker nodes into the final result. MapReduce abstracts the whole data processing procedure into two parts, expressed as the functions map and reduce. The map function decomposes a task into multiple subtasks, and the reduce function collects the results of those subtasks. Under the MapReduce framework a data set can be decomposed into many small data sets that are processed in parallel.
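The map, shuffle, and reduce phases described above can be illustrated with a single-process Python sketch. It is not Hadoop code: `run_mapreduce`, `count_mapper`, and `sum_reducer` are hypothetical names introduced here, and a real cluster would run the map and reduce calls on different machines.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    """Single-process stand-in for one MapReduce job (hypothetical helper)."""
    groups = defaultdict(list)
    for record in records:                  # map phase: one call per input record
        for key, value in mapper(record):
            groups[key].append(value)       # shuffle: group emitted values by key
    # reduce phase: one call per distinct key
    return {key: reducer(key, values) for key, values in groups.items()}

def count_mapper(record):
    """Emit <sender, 1> for one consignment record (sender, recipient)."""
    sender, _recipient = record
    yield sender, 1

def sum_reducer(_key, values):
    """Add up the counts collected for one key."""
    return sum(values)

shipments = [("A", "B"), ("A", "C"), ("B", "C")]
sent_counts = run_mapreduce(shipments, count_mapper, sum_reducer)
```

Here `sent_counts` maps each sender to the number of parcels it sent; the jobs in the embodiment follow the same pattern with different keys and values.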
As shown in Fig. 1, a parallelized key node discovery method for consignment data based on the MapReduce framework and the PageRank algorithm comprises:
Step S1: Obtain the node activity from the total number of sends and receipts of each node within a set time period in the consignment data, and use the node activity as the weight of the node itself. Specifically:
Based on the MapReduce framework, the raw data set to be mined is randomly divided into multiple data blocks (splits). The computing nodes of the MapReduce cluster start multiple Mappers, and each Mapper processes its own data block: it reads the node data and, through the map() routine, converts it to <key,value> output in which the key is the current node and the value is an adjacent node that interacts with the current node. For example, in a consignment A → B, A is the sender and B is the recipient, so the record is directed; although the input is directed, the map output is both A-B and B-A, which is undirected. Finally, the output of every map function is passed to the reduce() routine of the Reducer stage, which aggregates the results and counts the total number of parcels each node sent and received, that is, its total send/receive count. The result is written to a file in the format node:count.
The node activity is computed from the total send/receive count and satisfies the following formula:
$$a_i = \frac{M_i}{\mathrm{Max\_num}} \tag{1}$$
In the formula, $a_i$ is the node activity of node $i$, $M_i$ is the total number of sends and receipts of node $i$ within the set time period, and Max_num is the maximum of all $M_i$.
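A minimal single-machine sketch of step S1 and formula (1), assuming each consignment record is a (sender, recipient) tuple and the data set is not empty. The function name `node_activity` and the sample records are illustrative only.

```python
from collections import Counter

def node_activity(shipments):
    """Return {node: a_i} with a_i = M_i / Max_num, as in formula (1)."""
    totals = Counter()
    for sender, recipient in shipments:
        totals[sender] += 1      # one send counted for the sender
        totals[recipient] += 1   # one receipt counted for the recipient
    max_num = max(totals.values())  # largest M_i over all nodes
    return {node: count / max_num for node, count in totals.items()}

shipments = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
activity = node_activity(shipments)
```

In this sample A and B each take part in three consignments and C in two, so A and B get activity 1.0 and C gets 2/3.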
Step S2: Obtain the edge weight of each node pair from the interaction frequency and the shared-neighbor index of each node pair within the set time period in the consignment data, and define the network formed by the consignment data as a directed double-weighted network graph. Specifically:
201: Compute the interaction frequency of each node pair within the set time period:
Data preprocessing is as above: the raw data set is randomly divided into several blocks, the computing nodes of the MapReduce cluster start multiple Mappers, and each Mapper processes its own data block. It reads the node and edge information and converts it to <key,value> output in which the key is a node pair and the value is 1. For example, for one consignment A → B, the map output is <A-B,1> and <B-A,1>. The output of each map is then sent to the Reducer for aggregation, which counts how many times the edge formed by each node pair occurs. For every consignment relationship, both <node1-node2:count> and <node2-node1:count> are finally written to a file.
The interaction frequency then satisfies the following formula:
$$\mathrm{freq}_{ij} = \frac{n_{ij}}{\mathrm{Max\_num}} \tag{2}$$
In the formula, $\mathrm{freq}_{ij}$ is the interaction frequency between node $i$ and node $j$, $n_{ij}$ is the number of occurrences of the edge formed by node $i$ and node $j$, and Max_num is the maximum of all $n_{ij}$.
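The counting in step 201 and formula (2) can be sketched on one machine as follows. As in the text, each consignment A → B is counted for both A-B and B-A. The function name `interaction_frequency` and the sample data are assumptions made for illustration.

```python
from collections import Counter

def interaction_frequency(shipments):
    """Return {(i, j): freq_ij} with freq_ij = n_ij / Max_num, as in formula (2)."""
    counts = Counter()
    for sender, recipient in shipments:
        counts[(sender, recipient)] += 1   # corresponds to map output <A-B,1>
        counts[(recipient, sender)] += 1   # corresponds to map output <B-A,1>
    max_num = max(counts.values())         # largest n_ij over all node pairs
    return {pair: n / max_num for pair, n in counts.items()}

shipments = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
freq = interaction_frequency(shipments)
```

The pair A-B occurs twice and every other pair once, so A-B has frequency 1.0 and the others 0.5.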
202: Compute the shared-neighbor index of each node pair within the set time period:
Data preprocessing is as above. The raw data set is processed by the Mappers to produce <key,value> output in which the key is a node pair and the value is a node adjacent to both nodes of the pair. Finally, the Reducer aggregates the results and counts the number of shared neighbors of each node pair, that is, of each edge. Two values are stored for every edge; for A → B, for example, the final output is <A-B:count> and <B-A:count>.
As shown in Fig. 3, the shared neighbor count of two interacting nodes A and B = shared send neighbor count + shared receive neighbor count. The more neighbors two nodes share, the more likely they belong to the same social circle and the closer their relationship. The shared-neighbor index satisfies the following formula:
$$\mathrm{Neighbor}(i,j) = \frac{\mathrm{Neighbor\_shared\_num}(i,j)}{\mathrm{Max\_SharedNum}} \tag{3}$$
In the formula, $\mathrm{Neighbor}(i,j)$ is the shared-neighbor index between node $i$ and node $j$, Neighbor_shared_num(i, j) is the number of neighbors shared by node $i$ and node $j$, and Max_SharedNum is the maximum of all Neighbor_shared_num(i, j).
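A single-machine sketch of the shared-neighbor index in formula (3). Fig. 3 is not reproduced here, so the reading used below is an assumption: a "shared send neighbor" is a node that both A and B send to, and a "shared receive neighbor" is a node that both A and B receive from. The function name and the sample data are likewise illustrative.

```python
from collections import defaultdict

def shared_neighbor_index(shipments):
    """Return {(i, j): Neighbor(i, j)} for every interacting pair, i < j."""
    sends_to = defaultdict(set)        # node -> nodes it sends to
    receives_from = defaultdict(set)   # node -> nodes it receives from
    for sender, recipient in shipments:
        sends_to[sender].add(recipient)
        receives_from[recipient].add(sender)

    shared = {}
    for a, b in {tuple(sorted(pair)) for pair in shipments}:
        shared_send = len(sends_to[a] & sends_to[b])               # assumed reading
        shared_receive = len(receives_from[a] & receives_from[b])  # assumed reading
        shared[(a, b)] = shared_send + shared_receive

    max_shared = max(shared.values(), default=0)   # Max_SharedNum
    if max_shared == 0:                            # no pair shares any neighbor
        return {pair: 0.0 for pair in shared}
    return {pair: n / max_shared for pair, n in shared.items()}

shipments = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A"), ("D", "B")]
index = shared_neighbor_index(shipments)
```

In this sample A and B both send to C and both receive from D, so they share two neighbors and get index 1.0.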
203: Compute the edge weight of each node pair. The weight of the edge between two nodes is computed as follows:
$$w_{ji} = a \times \mathrm{freq}_{ij} + (1-a)\,\mathrm{Neighbor}(i,j) \tag{4}$$
In the formula, $w_{ji}$ is the weight of the edge between node $i$ and node $j$, $\mathrm{freq}_{ij}$ is the interaction frequency between node $i$ and node $j$, $\mathrm{Neighbor}(i,j)$ is the shared-neighbor index between node $i$ and node $j$, and $a$ is the adjustment factor.
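Formula (4) is a weighted mix of the two normalized factors. The short sketch below names the adjustment factor `alpha` to keep it apart from the node activity $a_i$; restricting it to the range [0, 1] is an assumption, since the text does not state its range.

```python
def edge_weight(freq_ij, neighbor_ij, alpha):
    """Formula (4): alpha * freq_ij + (1 - alpha) * Neighbor(i, j)."""
    if not 0.0 <= alpha <= 1.0:
        # Assumption: the adjustment factor is a mixing weight between 0 and 1.
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return alpha * freq_ij + (1.0 - alpha) * neighbor_ij

w = edge_weight(freq_ij=0.5, neighbor_ij=1.0, alpha=0.6)
```

With `alpha` equal to 1 the weight depends only on the interaction frequency, and with `alpha` equal to 0 only on the shared-neighbor index.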
Step S3: Add the node weights and the edge weights to the PageRank algorithm, and mine the key nodes of the directed double-weighted network graph in parallel. Specifically:
301: Using the node activity and the edge weights computed in the preceding steps, obtain the PageRank value of each node with the improved version of Google's web page ranking algorithm, PageRank. The PageRank formula is as follows:
$$PR(p_i) = \frac{a_i}{N} + (1-a_i) \times \sum_{p_j \in M(p_i)} \frac{PR(p_j) \times w_{ji}}{L(p_j)} \tag{5}$$
In the formula, $PR(p_i)$ is the PageRank value of node $i$; $p_j \in M(p_i)$, where $M(p_i)$ is the set of nodes that point to node $i$; $L(p_j)$ is the out-degree of a node $p_j$ that points to node $i$; $N$ is the total number of nodes in the consignment data; $a_i$ is the node activity of node $i$; and $w_{ji}$ is the weight of the edge between node $i$ and node $j$;
302: After the PageRank values of all nodes have been computed, compare the values from the previous round with the current values. If the absolute value of the difference for each node is greater than the given threshold ε, repeat step 301 to compute the PageRank value of each node for the next round. If the absolute value of the difference between the two rounds is less than the given threshold ε, perform step 303;
303: Sort the PageRank values of the nodes finally obtained in step 302; the top k nodes are the k most important key nodes mined, where k is the number of key nodes.
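Steps 301 to 303 can be sketched on a single machine as below. This is an illustration of formula (5) under stated assumptions, not the patented MapReduce implementation: the initial value 1/N, the iteration cap `max_iter`, the function names, and the sample graph are all introduced here.

```python
def weighted_pagerank(edges, activity, epsilon=1e-6, max_iter=100):
    """edges: {(src, dst): weight}; activity: {node: a_i}. Returns {node: PR}."""
    nodes = list(activity)
    n = len(nodes)
    out_degree = {v: 0 for v in nodes}
    incoming = {v: [] for v in nodes}
    for (src, dst), weight in edges.items():
        out_degree[src] += 1
        incoming[dst].append((src, weight))

    pr = {v: 1.0 / n for v in nodes}               # assumed starting value
    for _ in range(max_iter):                      # step 301, repeated
        new_pr = {}
        for v in nodes:
            total = sum(pr[u] * w / out_degree[u] for u, w in incoming[v])
            new_pr[v] = activity[v] / n + (1 - activity[v]) * total
        # step 302: stop once no node changed by more than epsilon
        converged = all(abs(new_pr[v] - pr[v]) <= epsilon for v in nodes)
        pr = new_pr
        if converged:
            break
    return pr

def top_k(pr, k):
    """Step 303: the k nodes with the largest PageRank values."""
    return sorted(pr, key=pr.get, reverse=True)[:k]

edges = {("A", "C"): 1.0, ("B", "C"): 0.5, ("C", "A"): 0.8}
activity = {"A": 1.0, "B": 0.5, "C": 0.8}
scores = weighted_pagerank(edges, activity)
key_nodes = top_k(scores, 2)
```

In this sample C ends with the highest score, because it receives contributions from both A and B, so `key_nodes` is C followed by A.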
The following takes an actual program on the MapReduce framework as an example:
1) The consignment data to be mined is divided into multiple data blocks that are processed separately. One MapReduce job outputs data in <key,value> form, where the key is a node i in the network and the value is the number of nodes that have a consignment relationship with node i, counting both senders and recipients. The steps are as follows:
11) Divide the consignment data to be mined into data blocks and pass them to the Mappers block by block.
12) Each computing node in the cluster processes its own data block and runs one MapReduce job.
Mapper stage:
Input: the raw consignment data to be mined and analyzed;
Output: <node_i,node_j>, where node_i and node_j are the recipient and the sender taking part in one consignment. Either node may be the recipient or the sender, so for a node pair <node_i,node_j> the Mapper stage outputs not only <node_i,node_j> but also <node_j,node_i>.
Reducer stage:
Input: <node_i,node_j>;
Output: <node_i,count>, where the key is node node_i and the value is the count of nodes that have a send/receive relationship with node_i. The result is written to a file A1 on HDFS (Hadoop Distributed File System).
2) The consignment data to be mined is divided into multiple data blocks that are processed separately. One MapReduce job outputs data in <key,value> form, where the key is a node pair between which a consignment took place in the logistics network and the value is an integer giving the number of times the pair occurs. The steps are as follows:
21) Divide the consignment data to be mined into data blocks and pass them to the Mappers block by block.
22) Each computing node in the cluster processes its own data block and runs one MapReduce job.
Mapper stage:
Input: the raw consignment data to be mined and analyzed;
Output: <(node_i,node_j),1>, where node_i and node_j are the recipient and the sender taking part in one consignment. This output form means that node_i made one consignment to node_j.
Reducer stage:
Input: <(node_i,node_j),1>;
Output: <(node_i,node_j),count>, where the key is the node pair (node_i,node_j) and the value is the number of times this node pair occurs. The result is written to a file A2 on HDFS.
3) As in the previous step, the input is still the raw consignment data set. The shared neighbor count of each node pair is computed: the key is a node pair, representing a consignment between a sender and a recipient, and the shared neighbor count of the pair is computed according to the definition of shared neighbors. The steps are as follows:
31) Divide the consignment data to be mined into data blocks and pass them to the Mappers block by block.
32) Each computing node in the cluster processes its own data block and runs two MapReduce jobs.
Mapper1 stage:
Input: the raw consignment data to be mined and analyzed;
Output: <node_i,node_j>, where node_i and node_j are the recipient and the sender taking part in one consignment. This output form means that node_i made one consignment to node_j.
Reducer1 stage:
Input: <node_i,node_j>;
Output: <(node_i,node_j),adjacent nodes of node_i>, where the key is the node pair (node_i,node_j) and the value is the set of nodes connected to this node; the result is the adjacency list form of the graph.
Mapper2 stage:
Input: <(node_i,node_j),adjacent nodes of node_i>; that is, the input of Mapper2 is the Reduce output of the first job;
Output: <(node_a,node_b),node_i>, where the key (node_a,node_b) is a pair formed by any two of the neighbors of node_i, ordered so that the node with the smaller sequence number comes first. For example, for the inputs <A,(B,C,D)> and <B,(C,D)>, the output of Mapper2 is <(B,C),A>, <(B,D),A>, <(C,D),A>, <(C,D),B>.
Reducer2 stage:
Input: <(node_a,node_b),node_i>;
Output: <(node_a,node_b),common adjacent nodes of node_a,node_b>. Continuing the example from the previous step, the output of the Reducer2 stage is <(B,C),A>, <(B,D),A>, <(C,D),A,B>. The result is written to file A3.
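The second job (Mapper2 and Reducer2) can be sketched in plain Python using the example from the text. The names `mapper2` and `reducer2` and the in-memory grouping are stand-ins for the Hadoop stages; sorting the neighbors alphabetically is assumed to correspond to ordering by sequence number.

```python
from collections import defaultdict
from itertools import combinations

def mapper2(node, neighbors):
    """For one adjacency-list row, emit <(a, b), node> for every neighbor pair."""
    for a, b in combinations(sorted(neighbors), 2):   # smaller node comes first
        yield (a, b), node

def reducer2(emitted):
    """Group the emitted nodes by pair: each group is the pair's common neighbors."""
    common = defaultdict(list)
    for pair, node in emitted:
        common[pair].append(node)
    return dict(common)

adjacency = {"A": ["B", "C", "D"], "B": ["C", "D"]}   # rows <A,(B,C,D)> and <B,(C,D)>
emitted = [kv for node, nbrs in adjacency.items() for kv in mapper2(node, nbrs)]
common = reducer2(emitted)
```

The length of each list in `common` is the shared neighbor count of that pair, which feeds the shared-neighbor index.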
4) Compute the PageRank value of each node. Before the Mapper and Reduce stages run, a separate stand-alone program reads the data obtained above. In this embodiment, the data in files A1, A2, and A3 is read in the setup function. The activity of each node is then obtained from the definition of node activity and formula (1); with the node in string form as the key and the computed node activity as the value, it is stored in a predefined HashMap, hashmap1. Next, the edge weights are computed from the definition of the interaction frequency and formula (2), the definition of the shared-neighbor index and formula (3), and the edge weight formula (4). Here a is the adjustment factor for the two factors that affect the edge weight; it controls how strongly each of the two factors influences the edge weight. With the edge in string form as the key and its weight $w_{ji}$ as the value, the result is stored in a predefined HashMap, hashmap2. The PageRank value of each node is then computed according to the definition of the double-weighted key node discovery algorithm improved from PageRank and formula (5). One MapReduce job is run to compute the PageRank value of each node, and its result is used as the input of the next MapReduce job. The iteration continues until, for every node, the absolute value of the difference between the PageRank values of two consecutive jobs is less than the given threshold ε, at which point iteration stops and the result is obtained. The steps are as follows:
41) Compute the node activity and the edge weights in the setup function
The data of files A1, A2, and A3 is read in the setup function. The activity of each node is computed from the definition and formula of node activity, and the result is stored in hashmap1. The weight of the edge between every pair of nodes is then computed from the definitions and formulas of the interaction frequency and the shared neighbor count, and the result is stored in hashmap2.
42) Each computing node in the cluster processes its own data block and runs one MapReduce job. This job mainly preprocesses the data. The main steps are as follows:
Mapper stage:
Input: the consignment data to be mined in <node_i,node_j> form. The raw form means that node_i sent one parcel to node_j; it is directed, node_i → node_j, and the final result is in adjacency list form.
Output: <node_j,node_i>; the direction of the raw data is reversed, so the form becomes node_j → node_i.
Reducer stage:
Input: <node_j,list of nodes point to node_j>. After the Mapper output passes through the shuffle and combine functions, the key of the Reduce input is the recipient node_j and the value is the set of senders that sent parcels to node_j.
Output: <node_j,list of nodes point to node_j>. Reduce outputs the result directly. For example, the result <A,(B,C,D)> means that nodes B, C, and D each sent a parcel to node A: A is the recipient, and B, C, D form the set of senders.
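The preprocessing job above builds, for every recipient, the list of nodes that point to it. A minimal in-memory sketch, assuming (sender, recipient) tuples as input; `incoming_lists` is a hypothetical name.

```python
from collections import defaultdict

def incoming_lists(shipments):
    """Return {recipient: [senders]}, the reversed adjacency list."""
    incoming = defaultdict(list)
    for sender, recipient in shipments:
        # Mapper: emit <recipient, sender>; the shuffle groups by recipient.
        incoming[recipient].append(sender)
    return dict(incoming)

result = incoming_lists([("B", "A"), ("C", "A"), ("D", "A")])
```

This reproduces the example <A,(B,C,D)> from the text: B, C, and D each sent a parcel to A.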
43) At this point the activity of each node, the weight of each edge, and the adjacency list built from the raw data are all known. Each computing node in the cluster processes its own data block and runs another MapReduce job, which computes the PageRank value of each node according to the improved PageRank-based key node formula (5). The pseudocode below gives the details of computing a node's PageRank value.
Algorithm 1: Map(key, value)
Input:
  p_i: a node of the logistics network
  PR(p_i): the PageRank value of node p_i
  w_ij: the weight of edge (i, j)
  links[p_1, p_2, p_3, ..., p_m]: all nodes p_j linked to by node p_i
Output:
  list of <key: value>
Emit(p_i, links[p_1, p_2, p_3, ..., p_m])
For each p_j in links[p_1, p_2, p_3, ..., p_m]
  partial(j) = PR(p_i) × w_ij / L(p_i)   // L(p_i): out-degree of the source node p_i, as in formula (5)
  Emit(p_j, partial(j))
End For
Algorithm 2: Reduce(key, value)
Input:
  p_j: a node of the logistics network
  list of <p_j, partial(j)>
Output:
  PR(p_j): the PageRank value of node p_j
// Initialize the new PageRank value of node p_j
PR(p_j) = 0
For each partial(j) in the list
  PR(p_j) += partial(j)
End For
PR(p_j) = a_j / N + (1 - a_j) × PR(p_j)   // formula (5); a_j is the activity of p_j, N is the total number of nodes
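One round of the Map and Reduce steps can be written as runnable Python. This is a sketch of one iteration of formula (5) with an in-memory shuffle, not Hadoop code: `pr_map`, `pr_reduce`, `one_iteration`, the tagging of values as "links" or "partial", and the sample graph are assumptions made for illustration.

```python
from collections import defaultdict

def pr_map(node, pr, links, weights):
    """Map: pass the link list through and emit one partial score per out-link."""
    yield node, ("links", links)
    for target in links:
        # PR(p_i) * w_ij / L(p_i), where L(p_i) = len(links) is the out-degree
        yield target, ("partial", pr * weights[(node, target)] / len(links))

def pr_reduce(node, values, activity, n):
    """Reduce: sum the partial scores and apply formula (5)."""
    total = sum(payload for kind, payload in values if kind == "partial")
    return activity[node] / n + (1 - activity[node]) * total

def one_iteration(graph, pr, weights, activity):
    grouped = defaultdict(list)                     # stand-in for the shuffle
    for node, links in graph.items():
        for key, value in pr_map(node, pr[node], links, weights):
            grouped[key].append(value)
    n = len(graph)
    return {node: pr_reduce(node, vals, activity, n) for node, vals in grouped.items()}

graph = {"A": ["C"], "B": ["C"], "C": ["A"]}
weights = {("A", "C"): 1.0, ("B", "C"): 0.5, ("C", "A"): 0.8}
activity = {"A": 1.0, "B": 0.5, "C": 0.8}
start = {node: 1.0 / len(graph) for node in graph}
new_pr = one_iteration(graph, start, weights, activity)
```

Emitting the link list for every node keeps nodes without incoming edges, such as B here, in the reduce output.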
44) Once the PageRank value of each node has been obtained from the first computation, it is used as the initial PageRank value of that node in the second MapReduce job, and a second iteration is run. In the same way, the result of each iteration serves as the initial PageRank values of the next, and the iteration continues until the PageRank value of every node differs from the value computed in the next iteration by no more than the given threshold ε. Iteration then ends, and the values obtained are the final PageRank values of the nodes. The nodes are then sorted by PageRank value, and the top k are the k most important key nodes.
Existing research on key nodes has paid little attention to key node discovery in logistics consignment networks. Based on a real logistics network, the present invention takes the node activity, the interaction frequency of nodes, and the shared neighbor count of node pairs into account in the weight calculation, making full use of the information in the logistics consignment network, reducing the loss of useful information, and improving the accuracy of key node discovery. Based on the MapReduce programming framework, it improves on Google's mature PageRank algorithm and parallelizes it, greatly improving the efficiency and stability of key node mining.
For finding key nodes in the small networks formed from small data sets, a traditional single-machine algorithm meets the requirements well, with adequate efficiency. For the large-scale networks formed from massive data, however, a traditional single-machine algorithm falls short, and the advantage of the method proposed by the invention is clear.

Claims (6)

1. A parallelized key node discovery method for consignment data, characterized by comprising:
Step S1: obtaining the node activity from the total number of sends and receipts of each node within a set time period in the consignment data, and using the node activity as the weight of the node itself;
Step S2: obtaining the edge weight of each node pair from the interaction frequency and the shared-neighbor index of each node pair within the set time period in the consignment data, and defining the network formed by the consignment data as a directed double-weighted network graph;
the edge weight satisfying the following formula:
$$w_{ji} = a \times \mathrm{freq}_{ij} + (1-a)\,\mathrm{Neighbor}(i,j) \tag{2}$$
where $w_{ji}$ is the weight of the edge between node $i$ and node $j$, $\mathrm{freq}_{ij}$ is the interaction frequency between node $i$ and node $j$, $\mathrm{Neighbor}(i,j)$ is the shared-neighbor index between node $i$ and node $j$, and $a$ is the adjustment factor;
Step S3: adding the node weights and the edge weights to the PageRank algorithm, and mining the key nodes of the directed double-weighted network graph in parallel.
2. The parallelized key node discovery method for consignment data according to claim 1, characterized in that the node activity satisfies the following formula:
$$a_i = \frac{M_i}{\mathrm{Max\_num}} \tag{1}$$
where $a_i$ is the node activity of node $i$, $M_i$ is the total number of sends and receipts of node $i$ within the set time period, and Max_num is the maximum of all $M_i$.
3. The parallelized key node discovery method for consignment data according to claim 1, characterized in that the interaction frequency satisfies the following formula:
freq_ij = n_ij / Max_num    (3)
In the formula, freq_ij represents the interaction frequency between node i and node j, n_ij represents the number of occurrences of the edge formed by node i and node j, and Max_num represents the maximum among all n_ij.
4. The parallelized key node discovery method for consignment data according to claim 1, characterized in that the shared-neighbor count index satisfies the following formula:
Neighbor(i, j) = Neighbor_shared_num(i, j) / Max_SharedNum    (4)
In the formula, Neighbor(i, j) represents the shared-neighbor count index between node i and node j, Neighbor_shared_num(i, j) represents the number of shared neighbors between node i and node j, and Max_SharedNum represents the maximum among all Neighbor_shared_num(i, j).
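Formulas (2), (3), and (4) can be combined in a short Python sketch. Several points here are assumptions made only for illustration: the input is a list of (sender, receiver) pairs, neighbors are determined while ignoring edge direction, the index is taken as 0 when no pair shares any neighbor, and the result is keyed by the directed pair as it appears in the data. The function name `edge_weights` is hypothetical.

```python
from collections import Counter, defaultdict

def edge_weights(records, a=0.5):
    """Return {(i, j): a * freq_ij + (1 - a) * Neighbor(i, j)}."""
    counts = Counter(records)            # n_ij: occurrences of each directed pair
    if not counts:
        return {}
    max_num = max(counts.values())       # Max_num in formula (3)

    adj = defaultdict(set)               # neighbor sets, direction ignored (assumption)
    for i, j in counts:
        adj[i].add(j)
        adj[j].add(i)

    # Neighbor_shared_num(i, j): neighbors common to both endpoints
    shared = {(i, j): len((adj[i] & adj[j]) - {i, j}) for i, j in counts}
    max_shared = max(shared.values())    # Max_SharedNum in formula (4)

    weights = {}
    for pair, n in counts.items():
        freq = n / max_num
        neigh = shared[pair] / max_shared if max_shared else 0.0
        weights[pair] = a * freq + (1 - a) * neigh   # formula (2)
    return weights

# Hypothetical sample data.
records = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
weights = edge_weights(records, a=0.5)
```

The adjustment factor a trades off how often a pair interacts against how embedded the pair is in a common neighborhood; with a = 1 the weight reduces to the interaction frequency alone.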
5. The parallelized key node discovery method for consignment data according to claim 1, characterized in that step S3 specifically comprises:
301: obtaining the PageRank value of each node, which satisfies the following formula:
PR(p_i) = a_i / N + (1 - a_i) × Σ PR(p_j) × w_ji / L(p_j)    (5)
In the formula, PR(p_i) represents the PageRank value of node i; the sum runs over p_j ∈ M(p_i), where M(p_i) represents the set of nodes pointing to node i; L(p_j) represents the out-degree of the node p_j that points to node i; N represents the total number of nodes in the consignment data; a_i represents the node liveness of node i; and w_ji represents the weight of the edge between node i and node j;
302: for each node, comparing the PageRank values obtained in two consecutive rounds and judging whether the absolute value of their difference is greater than a given threshold ε; if so, returning to step 301 and obtaining the PageRank value of each node for the next round; if not, performing step 303;
303: sorting the PageRank values of the nodes finally obtained in step 302, the top k ranked nodes being the mined key nodes, where k is the number of key nodes.
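Steps 301 to 303 can be sketched serially in Python as follows. This is an illustrative sketch rather than the claimed parallel implementation. It assumes edge weights keyed by (source, target), so that the entry for (j, i) plays the role of w_ji, an initial PageRank value of 1/N, and an iteration cap `max_iter`; none of these are specified in the claim. The function names are hypothetical.

```python
def weighted_pagerank(nodes, weights, liveness, eps=1e-6, max_iter=100):
    """Iterate formula (5) until no node changes by more than eps."""
    n = len(nodes)
    out_degree = {v: 0 for v in nodes}
    incoming = {v: [] for v in nodes}          # M(p_i) with the edge weights
    for (src, dst), w in weights.items():
        out_degree[src] += 1                   # L(p_j)
        incoming[dst].append((src, w))

    pr = {v: 1.0 / n for v in nodes}           # assumed initial value
    for _ in range(max_iter):
        new_pr = {}
        for i in nodes:                        # step 301
            s = sum(pr[j] * w / out_degree[j] for j, w in incoming[i])
            new_pr[i] = liveness[i] / n + (1 - liveness[i]) * s
        # step 302: stop once every difference is within the threshold
        converged = all(abs(new_pr[v] - pr[v]) <= eps for v in nodes)
        pr = new_pr
        if converged:
            break
    return pr

def top_k(pr, k):
    """Step 303: the k nodes with the highest PageRank values."""
    return sorted(pr, key=pr.get, reverse=True)[:k]

# Hypothetical three-node cycle A -> B -> C -> A.
nodes = ["A", "B", "C"]
weights = {("A", "B"): 1.0, ("B", "C"): 0.5, ("C", "A"): 0.5}
liveness = {"A": 1.0, "B": 0.5, "C": 0.5}
pr = weighted_pagerank(nodes, weights, liveness)
key_nodes = top_k(pr, 2)
```

In formula (5) the node liveness a_i takes the place that a fixed damping term occupies in standard PageRank, so a highly active node relies more on its own weight and less on the values passed along its incoming edges.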
6. The parallelized key node discovery method for consignment data according to claim 1, characterized in that the data of each step in the parallelized key node discovery method is processed in parallel based on the MapReduce programming framework.
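How one round of formula (5) might be split into map and reduce phases can be sketched in plain Python. The sketch only simulates the map, shuffle, and reduce stages in a single process and uses no Hadoop or other framework API. The function names, the adjacency layout `out_edges`, and the sample graph are all hypothetical, and the actual job decomposition used by the invention is not specified in the claim.

```python
from collections import defaultdict

def map_phase(rank, out_edges):
    """Map: for one node, emit (target, PR(p_j) * w_ji / L(p_j)) per outgoing edge."""
    degree = len(out_edges)                  # L(p_j)
    for target, w in out_edges:
        yield target, rank * w / degree

def reduce_phase(contributions, a_i, n):
    """Reduce: combine all contributions received by one node using formula (5)."""
    return a_i / n + (1 - a_i) * sum(contributions)

def mapreduce_iteration(pr, out_edges, liveness):
    """Simulate one map/shuffle/reduce round over all nodes."""
    n = len(pr)
    grouped = defaultdict(list)              # shuffle: group emitted values by target
    for node, rank in pr.items():
        for target, value in map_phase(rank, out_edges.get(node, [])):
            grouped[target].append(value)
    return {node: reduce_phase(grouped[node], liveness[node], n) for node in pr}

# Hypothetical three-node cycle A -> B -> C -> A.
out_edges = {"A": [("B", 1.0)], "B": [("C", 0.5)], "C": [("A", 0.5)]}
liveness = {"A": 1.0, "B": 0.5, "C": 0.5}
pr0 = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
pr1 = mapreduce_iteration(pr0, out_edges, liveness)
```

Each map call depends only on one node's own value and outgoing edges, and each reduce call depends only on the values grouped under one target node, which is what allows the rounds to be distributed across machines.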
CN201510469302.8A 2015-08-03 2015-08-03 A kind of parallelization key node towards consignment data finds method Expired - Fee Related CN105069290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510469302.8A CN105069290B (en) 2015-08-03 2015-08-03 A kind of parallelization key node towards consignment data finds method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510469302.8A CN105069290B (en) 2015-08-03 2015-08-03 A kind of parallelization key node towards consignment data finds method

Publications (2)

Publication Number Publication Date
CN105069290A CN105069290A (en) 2015-11-18
CN105069290B true CN105069290B (en) 2017-12-26

Family

ID=54498655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510469302.8A Expired - Fee Related CN105069290B (en) 2015-08-03 2015-08-03 A kind of parallelization key node towards consignment data finds method

Country Status (1)

Country Link
CN (1) CN105069290B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106506192A (en) * 2016-10-09 2017-03-15 中国电子科技集团公司第三十六研究所 A kind of method and apparatus of identification network key node
CN106685690B (en) * 2016-10-27 2019-07-09 中南大学 Computer network key node based on simulation building process finds method
CN107729478B (en) * 2017-10-16 2021-03-02 天津微迪加科技有限公司 Data analysis method and device
CN109379220B (en) * 2018-10-10 2021-06-15 太原理工大学 Complex network key node cluster mining method based on combination optimization
CN112990633B (en) * 2019-12-18 2024-04-05 菜鸟智能物流控股有限公司 Index data generation method, logistics cost simulation method, equipment and storage medium
CN112507996B (en) * 2021-02-05 2021-04-20 成都东方天呈智能科技有限公司 Face detection method of main sample attention mechanism

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103259263A (en) * 2013-05-31 2013-08-21 重庆大学 Electrical power system key node identification method based on active power load flow betweenness
CN103906271A (en) * 2014-04-21 2014-07-02 西安电子科技大学 Method for measuring key nodes in Ad Hoc network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9113412B2 (en) * 2011-12-12 2015-08-18 Qualcomm Incorporated Low power node dormant state

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
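"Algorithm for Discovering Important Nodes in Weighted Social Networks"; Han Zhongming; Journal of Computer Applications; 20130601; main text, page 1554, column 2, paragraph 4 to page 1556, column 2, paragraph 6, and Figure 1 *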

Also Published As

Publication number Publication date
CN105069290A (en) 2015-11-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20171226

Termination date: 20200803
