CN109242713A

CN109242713A - Three-way decision community division method and device based on random walk boundary region processing

Info

Publication number: CN109242713A
Application number: CN201811045237.6A
Authority: CN
Inventors: 陈洁; 李洋; 赵姝; 段震; 张燕平
Original assignee: Anhui University
Current assignee: Anhui University
Priority date: 2018-09-07
Filing date: 2018-09-07
Publication date: 2019-01-18

Abstract

The invention discloses a three-way decision community division method based on random walk boundary region processing. The method includes: 1) obtaining an abstract network; 2) performing initial granulation and then cluster granulation on the abstract network, dividing the abstract network into multiple divided communities, and taking the structural relationship of the divided communities as a first division result; 3) obtaining the first division result of the abstract network that corresponds to the maximum overlapping community modularity; 4) dividing all nodes in the boundary region using a random walk algorithm; 5) processing each node in the updated boundary region using the three-way decision method to obtain a second division result, and taking the second division result as the target division result. The invention also discloses a three-way decision community division device based on random walk boundary region processing. Applying the invention can improve the precision of community division, which helps in analyzing and understanding the network structure and makes the network easier to optimize and manage.

Description

Three-way decision community division method and device based on random walk boundary region processing

Technical field

The present invention relates to a community division method and device, and more particularly to a three-way decision community division method and device based on random walk boundary region processing.

Background

Networks are everywhere in real life: interpersonal relationships in social systems, protein interaction networks in biological systems, and the World Wide Web and the Internet in computer networks. Each individual in these networks is connected to other individuals through information exchange channels. As the study of networks has deepened, it has been found that many real networks have a community structure, that is, the whole network is made up of several communities. For example, each device in a real network such as a computer network can be abstracted into a node, and the network channel between devices can be abstracted into an edge between nodes. It can then be observed that connections between nodes inside each community are relatively dense, while connections between communities are relatively sparse. To analyze a network, small, closely connected community structures need to be found according to the association relationships between nodes; this process is called community division or community discovery. In recent years, further research has shown that overlapping parts often appear during community division, that is, a node may belong to multiple communities. In practice, dividing each overlapping node into a single community makes it easier to discover the regularities within communities and to predict the behavior and function of the network.

Commonly used non-overlapping community division algorithms currently fall into four main types: hierarchical clustering algorithms, objective function optimization algorithms, network dynamics algorithms, and community division algorithms based on granular computing. The more widely applied non-overlapping community division algorithms include the community discovery algorithm based on hierarchical granulation, the GN algorithm (Girvan-Newman algorithm), the NFA algorithm (Newman fast algorithm), and the LPA algorithm (Label Propagation Algorithm). These existing algorithms study the division of non-overlapping communities from different angles and applications and have produced rich research results.

However, when processing the overlapping part, these algorithms all apply only the traditional two-way decision method, that is, they make only an accept or reject decision based on the available information. Nodes often end up in the overlapping part precisely because there is not enough information to determine which community they belong to; forcing a decision on them may degrade the final non-overlapping community division result. The prior art therefore has the technical problem that the precision of non-overlapping community division results is not high.

Summary of the invention

The technical problem to be solved by the present invention is to provide a three-way decision community division method and device based on random walk boundary region processing, so as to improve the precision of non-overlapping community division results.

The present invention solves the above technical problem through the following technical solutions:

An embodiment of the present invention provides a three-way decision community division method based on random walk boundary region processing, the method including:

1) abstracting the acquired network on which community division is to be performed into an abstract network connected by nodes;

2) for each node in the abstract network, according to the connection attribute of the node, performing initial granulation and then cluster granulation on the abstract network, and taking the structural relationship of the abstract network after cluster granulation as a first division result, where the connection attribute includes the connection relationship between the node and other nodes;

3) calculating the overlapping community modularity corresponding to each granular layer, and obtaining the first division result of the abstract network that corresponds to the maximum overlapping community modularity, where overlapping communities are communities that share nodes, and a granular layer is the first division result corresponding to one of the different preset granulation coefficients;

4) taking the overlapping part of the divided communities in the first division result as the boundary region, dividing all nodes in the boundary region using a random walk algorithm, and updating the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number;

5) processing each node in the updated boundary region using the three-way decision method to obtain a second division result, and taking the second division result as the target division result.

Optionally, step 1) includes:

abstracting each interconnected individual in the acquired network on which community division is to be performed into a node, and then abstracting the connecting link between nodes into a connecting line between the nodes.

Optionally, step 2) includes:

A: for each node in the abstract network, taking the node as the central node and obtaining the set of neighbor nodes of the central node; judging whether the network formed by the central node and its neighbor nodes contains communities whose nodes are connected to one another only through the central node; if so, obtaining each community contained in that network that has no connection to the other communities except through the central node, and taking each such community as a divided community; if not, taking the network formed by the central node and its neighbor nodes as a divided community.

B: for each preset granulation coefficient among a preset number of preset granulation coefficients, and for every pair of connected divided communities in the set of divided communities, obtaining the granulation coefficient corresponding to that pair, and then obtaining the maximum of these granulation coefficients; if the maximum is not less than the preset granulation coefficient, merging the corresponding pair of connected divided communities into one community; repeating until the granulation coefficient of every pair of connected divided communities in the set is less than the preset granulation coefficient; and taking the structural relationship of the resulting communities as the first division result.
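The merge loop of step B can be sketched as follows. This is only an illustration: the text does not give the granulation coefficient formula, so `overlap_coefficient` below is a hypothetical placeholder (shared nodes divided by the size of the smaller community), and pairs whose coefficient is 0 are treated as unconnected.

```python
def overlap_coefficient(a, b):
    """Placeholder granulation coefficient (an assumption, not the patent's formula)."""
    return len(a & b) / min(len(a), len(b))


def cluster_granulation(communities, lam, coefficient=overlap_coefficient):
    """Merge the pair with the largest coefficient while that maximum is >= lam."""
    groups = [set(c) for c in communities]
    while len(groups) > 1:
        best, pair = -1.0, None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                value = coefficient(groups[i], groups[j])
                if value > best:
                    best, pair = value, (i, j)
        if best < lam:
            break  # every remaining pair is below the preset coefficient
        i, j = pair
        groups[i] |= groups[j]  # merge the two communities into one
        del groups[j]
    return groups


# Hypothetical divided communities from step A.
initial = [{1, 2, 3, 4}, {1, 2, 3}, {1, 5, 6, 9, 10}, {7, 8}]
coarse = cluster_granulation(initial, lam=0.2)
fine = cluster_granulation(initial, lam=0.5)
```

A smaller preset coefficient allows more merges and therefore gives a coarser division; swapping in the real granulation coefficient only requires passing a different `coefficient` function.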

Optionally, before step 3), the method further includes:

performing deduplication on the first division result.

Optionally, step 3) includes:

using the formula

$$EQ = \frac{1}{2m}\sum_{r}\sum_{i \in G_r,\, j \in G_r}\frac{1}{o_i o_j}\left[A_{ij} - \frac{d_i d_j}{2m}\right]$$

to obtain the overlapping community modularity corresponding to each preset granulation coefficient, where

EQ is the overlapping community modularity corresponding to each preset granulation coefficient; m is the number of edges in the abstract network; G_r is each divided community in the community division result corresponding to each preset granulation coefficient; i is the index of the i-th node; j is the index of the j-th node; o_i is the number of divided communities to which the i-th node belongs; o_j is the number of divided communities to which the j-th node belongs; A_ij is the corresponding element of the adjacency matrix of the abstract network; d_i is the degree of the i-th node; d_j is the degree of the j-th node.
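To make the symbol definitions concrete, here is a minimal sketch of an overlapping modularity computation. It assumes the commonly used extended modularity form, EQ = 1/(2m) · Σ_r Σ_{i,j ∈ G_r} (1/(o_i·o_j)) · [A_ij − d_i·d_j/(2m)], which is consistent with the definitions of m, G_r, o_i, A_ij and d_i above; the function name and the example graph are hypothetical.

```python
def overlapping_modularity(edges, communities):
    """Extended modularity EQ for possibly overlapping communities (assumed form)."""
    m = len(edges)  # number of edges; each undirected edge is listed once
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    degree = {node: len(nbrs) for node, nbrs in neighbours.items()}

    # o_i: how many communities each node belongs to
    membership = {}
    for community in communities:
        for node in community:
            membership[node] = membership.get(node, 0) + 1

    total = 0.0
    for community in communities:
        for i in community:
            for j in community:
                a_ij = 1.0 if j in neighbours.get(i, ()) else 0.0
                expected = degree.get(i, 0) * degree.get(j, 0) / (2.0 * m)
                total += (a_ij - expected) / (membership[i] * membership[j])
    return total / (2.0 * m)


# Hypothetical graph: two triangles joined by the single edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
eq_split = overlapping_modularity(edges, [{1, 2, 3}, {4, 5, 6}])
eq_whole = overlapping_modularity(edges, [{1, 2, 3, 4, 5, 6}])
```

With no overlapping nodes every o_i equals 1 and the value reduces to ordinary modularity, so splitting the two triangles scores higher than keeping the whole graph as a single community.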

Optionally, dividing all nodes in the boundary region using the random walk algorithm includes:

A: obtaining the adjacency matrix corresponding to the abstract network;

B: according to the adjacency matrix, using a similarity formula [formula not reproduced in the source text] to calculate the similarity value between any two connected nodes in the boundary region, where

S(v_i, v_j) is the similarity value between any two connected nodes in the boundary region; v_i is the i-th node in the boundary region; v_j is the j-th node in the boundary region; Min() is the minimum function; Y_iγ is the inner product between the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with Y_iγ = ∑_t A_it·A_γt; A_it is the element in row i, column t of the adjacency matrix; A_γt is the element in row γ, column t of the adjacency matrix; ∑ denotes summation; t is the column index of an element in the adjacency matrix; Y_jγ is the inner product between the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with Y_jγ = ∑_t A_jt·A_γt; A_jt is the element in row j, column t of the adjacency matrix; Max() is the maximum function; γ is the row index in the adjacency matrix;

C: obtaining the similarity value between any two nodes in the boundary region, where: when the two nodes have the same index, the similarity value between them is set to 1; when there is a connecting edge between the two nodes, the similarity value between them is calculated using the formula in step B; when there is no edge between the two nodes, the similarity value between them is set to 0;

D: normalizing the similarity values between any two nodes in the boundary region to obtain a normalized matrix, and using the formula u = (1 − α)·P^T·u₀ + α·d to calculate the probability vector of each node in the boundary region walking to every other node, where

u is the probability vector of each node in the boundary region walking to every other node; α is the preset jump probability; P is the normalized matrix; u₀ is the initial probability vector; d is the transition probability vector; P^T is the transpose of the normalized matrix;

E: taking the probability vector of each node in the boundary region walking to every other node as the initial probability vector, and returning to step D, until the probability vector of each node in the boundary region walking to every other node converges, thereby obtaining the target probability vector of each node in the boundary region walking to every other node;

F: using a formula [formula not reproduced in the source text] to calculate the probability of each node in the boundary region walking to each divided community in the abstract network, where

P(n→C_j) is the probability of the n-th node in the boundary region walking to the j-th divided community; Avg{} is the averaging function; n is the index of the node in the boundary region; v_i is the i-th node in the divided community C_j; j is the index of the divided community; ∀ denotes "for any";

G: for each node in the boundary region, assigning the node to the divided community that corresponds to the maximum among the probabilities of the node walking to each divided community, and updating the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
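Steps D and E amount to a fixed-point iteration, which can be sketched as below. The sketch makes several assumptions that the text does not state: normalization is done per row, the vector d is taken to be a one-hot vector at the node where the walk starts, and the similarity matrix is a made-up example; all function names are hypothetical.

```python
def row_normalise(sim):
    """Divide each row of the similarity matrix by its sum (all-zero rows stay zero)."""
    result = []
    for row in sim:
        total = sum(row)
        result.append([x / total if total else 0.0 for x in row])
    return result


def walk_probabilities(P, start, alpha=0.15, tol=1e-10, max_iter=1000):
    """Iterate u = (1 - alpha) * P^T * u0 + alpha * d until u stops changing."""
    n = len(P)
    d = [1.0 if k == start else 0.0 for k in range(n)]  # assumed restart vector
    u0 = d[:]
    for _ in range(max_iter):
        # (P^T u0)[c] is the sum over rows r of P[r][c] * u0[r]
        u = [(1 - alpha) * sum(P[r][c] * u0[r] for r in range(n)) + alpha * d[c]
             for c in range(n)]
        if max(abs(a - b) for a, b in zip(u, u0)) < tol:
            return u
        u0 = u  # step E: feed the result back in as the new initial vector
    return u0


# Hypothetical similarity matrix: node 2 is strongly tied to nodes 0 and 1,
# weakly tied to node 3, and not tied to node 4.
sim = [
    [1.0, 0.8, 0.6, 0.0, 0.0],
    [0.8, 1.0, 0.6, 0.0, 0.0],
    [0.6, 0.6, 1.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 1.0, 0.8],
    [0.0, 0.0, 0.0, 0.8, 1.0],
]
u = walk_probabilities(row_normalise(sim), start=2)
```

Because every row of the normalized matrix sums to 1, the entries of `u` keep summing to 1 at each iteration, and the walk started at node 2 puts more probability on nodes 0 and 1 than on nodes 3 and 4.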

An embodiment of the present invention also provides a three-way decision community division device based on random walk boundary region processing, the device including:

a first obtaining module, configured to abstract the acquired network on which community division is to be performed into an abstract network connected by nodes;

a setup module, configured to, for each node in the abstract network and according to the connection attribute of the node, perform initial granulation and then cluster granulation on the abstract network, divide the abstract network into multiple divided communities, and take the structural relationship of the divided communities as the first division result, where the connection attribute includes the connection relationship between the node and other nodes;

a computing module, configured to calculate the overlapping community modularity corresponding to each granular layer, and obtain the first division result of the abstract network that corresponds to the maximum overlapping community modularity, where overlapping communities are communities that share nodes, and a granular layer is the first division result corresponding to one of the different preset granulation coefficients;

a division module, configured to take the overlapping part of the divided communities in the first division result as the boundary region, divide all nodes in the boundary region using a random walk algorithm, and update the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number;

a second obtaining module, configured to process each node in the updated boundary region using the three-way decision method to obtain a second division result, and take the second division result as the target division result.

Optionally, the first obtaining module is further configured to:

Optionally, the setup module is further configured to:

Optionally, the device further includes a deduplication module, configured to perform deduplication on the first division result.

Optionally, the computing module is further configured to:

Optionally, the division module is further configured to:

A: obtain the adjacency matrix corresponding to the abstract network;

Compared with the prior art, the present invention has the following advantages:

With the embodiments of the present invention, the three-way decision method defers the decision for nodes in the boundary region for which no decision can yet be made, and decides on them only after sufficient information has been obtained, which reduces decision errors and makes the decisions more reasonable. In addition, the embodiments use a random walk algorithm to divide the nodes in the boundary region, which makes the result less prone to local optima. Compared with the prior art, the embodiments can therefore produce more reasonable non-overlapping communities and effectively improve the precision of community division.

Brief description of the drawings

Fig. 1 is a schematic flowchart of a three-way decision community division method based on random walk boundary region processing provided by an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of the abstract network obtained in an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a three-way decision community division device based on random walk boundary region processing provided by an embodiment of the present invention.

Detailed description of the embodiments

The embodiments of the present invention are described in detail below. The embodiments are implemented on the premise of the technical solution of the present invention, and detailed implementations and specific operating procedures are given, but the protection scope of the present invention is not limited to the following embodiments.

The embodiments of the present invention provide a three-way decision community division method and device based on random walk boundary region processing. The method is introduced first.

Fig. 1 is a schematic flowchart of a three-way decision community division method based on random walk boundary region processing provided by an embodiment of the present invention; Fig. 2 is a schematic structural diagram of the abstract network obtained in an embodiment of the present invention. As shown in Fig. 1 and Fig. 2, the method includes:

S101: abstract the acquired network on which community division is to be performed into an abstract network connected by nodes.

Specifically, each interconnected individual in the acquired network can be abstracted into a node, and the connecting link between nodes can then be abstracted into a connecting line between the nodes.

For example, a network can be defined as G = (X, f, T), where X is the node set of the abstract network G, f is the edge set of the abstract network G, and T is the topological structure of the abstract network G.

Suppose an undirected, unweighted network G = (X, f, T) is given as shown in Fig. 2. The network contains 10 nodes and 17 edges, where X = (1, 2, ..., i, ..., j, ..., 10).

It should be emphasized that an edge here refers to the connecting line obtained by abstracting the link between two nodes that are directly connected, not between nodes that are connected only through a third party.
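As an illustration of this abstraction step, the sketch below turns raw link records into a node set X and an undirected, unweighted edge set f. The function name and the sample records are hypothetical; only direct links produce edges, matching the remark above.

```python
def abstract_network(links):
    """Turn raw (u, v) link records into a node set X and an undirected edge set f."""
    nodes = set()
    edges = set()
    for u, v in links:
        nodes.update((u, v))
        if u != v:  # an edge joins two distinct individuals
            # frozenset makes (u, v) and (v, u) the same undirected edge
            edges.add(frozenset((u, v)))
    return nodes, edges


# Hypothetical link records; the duplicate (2, 1) collapses into the edge {1, 2}.
links = [(1, 2), (2, 1), (2, 3), (3, 4)]
X, f = abstract_network(links)
```

Nodes 1 and 3 are connected only through node 2, so no edge {1, 3} appears in `f`.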

S102: for each node in the abstract network, according to the connection attribute of the node, perform initial granulation and then cluster granulation on the abstract network, and take the structural relationship of the abstract network after cluster granulation as the first division result, where the connection attribute includes the connection relationship between the node and other nodes.

Specifically, step S102 may include:

As shown in Fig. 2, suppose node 1 is the central node. The set formed by the neighbor nodes of node 1 is then N(1) = {2, 3, 4, 5, 6, 9, 10}, so the communities built with node 1 as the central node can be SubC(1) = {SubC₁ = {1, 2, 3, 4}, SubC₂ = {1, 5, 6, 9, 10}}. When node 3 is the central node, the community built is SubC(3) = {SubC₁ = {1, 2, 3, 4}}.
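The initial granulation around one central node can be sketched as follows. The edge list is hypothetical: the actual edges of Fig. 2 are not given in the text, so the list was chosen only to have 10 nodes and 17 edges and to be consistent with the sets quoted in the example. The approach assumed here is to find the connected components among the neighbors once the central node is ignored.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Neighbor sets of an undirected graph given as (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def initial_granulation(adj, center):
    """Split the neighborhood of `center` into groups linked only via `center`."""
    unvisited = set(adj[center])
    communities = []
    while unvisited:
        # Grow one connected component among the neighbors, ignoring the center.
        start = unvisited.pop()
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in adj[node] & unvisited:
                unvisited.discard(nxt)
                component.add(nxt)
                stack.append(nxt)
        communities.append(component | {center})
    return communities


# Hypothetical 10-node, 17-edge graph (not the actual Fig. 2 edge list).
edges = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 9), (1, 10),
         (2, 3), (2, 4), (3, 4), (5, 6), (5, 10), (9, 10),
         (5, 7), (6, 7), (7, 8), (8, 9)]
adj = build_adjacency(edges)
sub_1 = initial_granulation(adj, 1)
sub_3 = initial_granulation(adj, 3)
```

For this graph, node 1 yields the two communities {1, 2, 3, 4} and {1, 5, 6, 9, 10}, while node 3 yields the single community {1, 2, 3, 4}, mirroring SubC(1) and SubC(3) above.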

Specifically, for any preset granulation coefficient λ, with λ ∈ (0, 1), the granulation coefficient of every two adjacent divided communities from step A is calculated [the community symbols and the coefficient expression are not reproduced in the source text], and the maximum granulation coefficient is then found. If the maximum is not less than λ, the two divided communities are merged into one divided community and the division result of the abstract network is updated. This operation is repeated until the granulation coefficient of any two divided communities is less than the preset granulation coefficient λ.

At that point the coarsest-grained quotient space is obtained and cluster granulation is complete. Each divided community it contains can also be called a granule. Here m is the index of the preset granulation coefficient, i is the index of the i-th community, and j is the index of the j-th community.

It should be noted that calculating the granulation coefficient of communities is prior art and is not described further here.

Finally, the abstract network processed in step B can be taken as the first division result for that preset granulation coefficient. It follows that each preset granulation coefficient corresponds to one first division result.

Further, in practical applications, the preset granulation coefficients can be obtained as follows: the step size of the preset granulation coefficient is set to 0.01, and several preset granulation coefficients are then taken at that step size between the upper limit and the lower limit of the granulation coefficient. The preset granulation coefficients are sorted from largest to smallest, so the difference between two adjacent preset granulation coefficients is 0.01.
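A short sketch of generating the preset granulation coefficients is given below. The bounds 0.99 and 0.01 are assumptions, chosen only because λ lies in the open interval (0, 1); the text does not state the actual upper and lower limits.

```python
def preset_coefficients(upper=0.99, lower=0.01, step=0.01):
    """List the preset granulation coefficients from upper down to lower."""
    count = int(round((upper - lower) / step)) + 1
    # Rounding to two decimals avoids floating-point drift such as 0.30000000000000004.
    return [round(upper - k * step, 2) for k in range(count)]


coefficients = preset_coefficients()
```

Each value in `coefficients` would then drive one run of cluster granulation, producing one granular layer per coefficient.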

Merging the divided communities according to this step yields the community division result of the abstract network under each preset granulation coefficient. Applying the above embodiment can improve the precision of the community division result of the abstract network, which helps in analyzing and understanding the structure of the network and makes the network easier to optimize and manage.

S103: calculate the overlapping community modularity corresponding to each granular layer, and obtain the first division result of the abstract network that corresponds to the maximum overlapping community modularity, where overlapping communities are communities that share nodes, and a granular layer is the first division result corresponding to one of the different preset granulation coefficients.

Specifically, the formula

$$EQ = \frac{1}{2m}\sum_{r}\sum_{i \in G_r,\, j \in G_r}\frac{1}{o_i o_j}\left[A_{ij} - \frac{d_i d_j}{2m}\right]$$

can be used to obtain the overlapping community modularity corresponding to each preset granulation coefficient.

It can be understood that the granular layers are the first division results obtained, for multiple preset granulation coefficients, after the cluster granulation operation is carried out according to each preset granulation coefficient.

S104: take the overlapping part of the divided communities in the first division result as the boundary region, divide all nodes in the boundary region using a random walk algorithm, and update the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number.

Specifically, dividing all nodes in the boundary region using the random walk algorithm includes:

A: obtaining the adjacency matrix corresponding to the abstract network; for example, the adjacency matrix obtained is as follows: [matrix not reproduced in the source text]

If the element in row i, column j of the adjacency matrix is 1, there is an edge between the i-th node and the j-th node in the abstract network; if the element in row i, column j of the adjacency matrix is 0, there is no edge between the i-th node and the j-th node in the abstract network.

S(v_i, v_j) is the similarity value between any two connected nodes in the boundary region; v_i is the i-th node in the boundary region; v_j is the j-th node in the boundary region; Min() is the minimum function; Y_iγ is the inner product between the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with Y_iγ = ∑_t A_it·A_γt; A_it is the element in row i, column t of the adjacency matrix; A_γt is the element in row γ, column t; ∑ denotes summation; t is the column index of an element in the adjacency matrix; Y_jγ is the inner product between the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with Y_jγ = ∑_t A_jt·A_γt; A_jt is the element in row j, column t of the adjacency matrix; Max() is the maximum function; γ is the row index in the adjacency matrix.

For example, the similarity matrix corresponding to the similarities between any two connected nodes obtained in step B can be: [matrix not reproduced in the source text]

E: taking the probability vector of each node in the boundary region walking to every other node as the initial probability vector, and returning to step D, until the probability vector of each node in the boundary region walking to every other node converges, thereby obtaining the target probability vector u of each node in the boundary region walking to every other node;

F: using a formula [formula not reproduced in the source text] to calculate the probability of each node in the boundary region walking to each divided community in the abstract network, where

P(n→C_j) is the probability of node n in the boundary region walking to the j-th divided community; Avg{} is the averaging function; n is the index of the node in the boundary region; v_i is the i-th node in the divided community C_j; j is the index of the divided community; ∀ denotes "for any";

G: for each node in the boundary region, assigning the node to the divided community that corresponds to the maximum among the probabilities of the node walking to each divided community, and updating the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
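Steps F and G can be sketched as below. Since the formula for P(n→C_j) is not reproduced in the text, the sketch assumes it is the average, over the members v_i of C_j, of the converged probability of node n walking to v_i, which is what the definitions of Avg{} and v_i suggest. The function names and the probabilities are hypothetical.

```python
def community_probabilities(u, communities):
    """Average the walk probabilities u[v] over the members v of each community."""
    return [sum(u[v] for v in members) / len(members) for members in communities]


def assign_boundary_node(u, communities):
    """Return the index of the community with the largest average probability."""
    scores = community_probabilities(u, communities)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical converged walk probabilities for one boundary node,
# keyed by the node it walks to.
u = {2: 0.30, 3: 0.25, 4: 0.20, 5: 0.10, 6: 0.05, 9: 0.05, 10: 0.05}
communities = [{2, 3, 4}, {5, 6, 9, 10}]
best, scores = assign_boundary_node(u, communities)
```

Here the boundary node is assigned to the first community, whose members it reaches with an average probability of 0.25 versus 0.0625 for the second.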

In this step, iterating the community division process several times improves the precision of community division; it also reduces the number of nodes in the boundary region, which lowers the amount of computation and improves efficiency.

S105: process each node in the updated boundary region using the three-way decision method to obtain a second division result, and take the second division result as the target division result.

In practical applications, the three-way decision method is useful for handling uncertain information. For example, when information is insufficient, three-way decision theory first divides an uncertain problem into three regions: the positive region (accept), the negative region (reject), and the boundary region (non-commitment). Nodes for which no decision can currently be made are placed in the boundary region, that is, the decision is deferred. Once sufficient information has been obtained, a decision is made on the nodes in the boundary region, which reduces decision errors and makes the decisions more reasonable.
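The three regions can be illustrated with a pair of thresholds, as in the sketch below. The threshold values 0.7 and 0.3 and the function name are assumptions for illustration only; the text does not specify how the regions are delimited.

```python
def three_way_decide(probability, accept=0.7, reject=0.3):
    """Classify a membership probability into one of the three regions."""
    if probability >= accept:
        return "positive"  # accept: the node belongs to the community
    if probability <= reject:
        return "negative"  # reject: the node does not belong
    return "boundary"      # defer until more information is available


decisions = [three_way_decide(p) for p in (0.9, 0.5, 0.1)]
```

A node that lands in the boundary region is not forced into either outcome; it is revisited once additional information changes its probability.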

In addition, the random walk algorithm does not depend on the choice of start node, is not prone to local optima, and gives relatively high community discovery precision. Processing the overlapping part of the divided communities with the embodiments of the present invention can therefore produce more reasonable non-overlapping communities, which effectively improves the precision of community division, helps in analyzing and understanding the structure of the network, and makes the network easier to optimize and manage.

With the embodiment shown in Fig. 1, the three-way decision method defers the decision for nodes in the boundary region for which no decision can yet be made, and decides on them only after sufficient information has been obtained, which reduces decision errors and makes the decisions more reasonable. Compared with the prior art, the embodiments can therefore produce more reasonable non-overlapping communities, which effectively improves the precision of community division, helps in analyzing and understanding the structure of the network, and makes the network easier to optimize and manage.

In addition, existing algorithms easily fall into local optima during community division, which affects the precision of community division. The embodiments of the present invention introduce random walk into the three-way decision idea to process the overlapping part of communities, and are therefore less prone to local optima.

Moreover, the embodiments first obtain the optimal structure of the overlapping communities, which takes fuller account of the original structure between nodes in the boundary region, so a more realistic community structure can be obtained.

Furthermore, the embodiment shown in Fig. 1 helps in analyzing and understanding the structure of a network and makes the network easier to optimize and manage. For example, in social networks such as the social software WeChat and QQ, users can be taken as nodes and the friend relationships between users as edges. By performing community division on such a social network, users who have common friends, and who are often in the same community, can be recommended to each other as friends, which increases interaction between users.
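The friend recommendation use case can be sketched as follows: within each community, pairs of users who are not yet friends but share at least one friend are suggested to each other. The data and function name are hypothetical, and the rule itself is just one plausible reading of the paragraph above.

```python
def recommend_friends(friends, communities):
    """Suggest non-friend pairs in the same community that share a friend."""
    suggestions = set()
    for community in communities:
        members = sorted(community)
        for idx, a in enumerate(members):
            for b in members[idx + 1:]:
                already_friends = b in friends.get(a, set())
                common = friends.get(a, set()) & friends.get(b, set())
                if not already_friends and common:
                    suggestions.add((a, b))
    return suggestions


# Hypothetical friend lists (user -> set of friends) and a community division.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
    "eve": set(),
}
communities = [{"ann", "bob", "cat", "dan"}, {"eve"}]
suggestions = recommend_friends(friends, communities)
```

In this example ann and dan, and bob and cat, are suggested to each other, while eve, who sits alone in her community, receives no suggestion.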

In a specific implementation of the embodiments of the present invention, on the basis of the embodiment shown in Fig. 1, the following step is added before step S103: performing deduplication on the first division result, that is, keeping only one copy of identical divided communities.

Applying this embodiment reduces the amount of computation and increases the calculation speed.
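Deduplication of identical divided communities can be sketched with hashable sets, as below. The function name and the sample communities are hypothetical.

```python
def deduplicate(communities):
    """Keep the first copy of each distinct community, preserving order."""
    seen = set()
    unique = []
    for community in communities:
        key = frozenset(community)  # order-independent, hashable identity
        if key not in seen:
            seen.add(key)
            unique.append(set(community))
    return unique


# The same community can arise from different central nodes during initial granulation.
first_division = [{1, 2, 3, 4}, {1, 5, 6, 9, 10}, {4, 3, 2, 1}]
unique_division = deduplicate(first_division)
```

Only two communities remain, so the subsequent modularity computation iterates over fewer communities.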

Corresponding with embodiment illustrated in fig. 1 of the present invention, the embodiment of the invention also provides one kind to be based on random walk boundary Three decision community division devices of domain processing.

Fig. 3 is a kind of three decision community divisions dress based on the processing of random walk Boundary Region provided in an embodiment of the present invention The structural schematic diagram set, as shown in figure 3, described device includes:

First obtains module 301, and the network abstraction for the pending community division that will acquire by node at being connected into Abstract network；

Setup module 302, for being directed to each of abstract network node, according to the connection category of the node Property, cluster granulation is carried out after being initially granulated to the abstract network, using the structural relation of abstract network after cluster granulation as the One divides as a result, wherein, the connection attribute, comprising: the connection relationship of the node and other nodes；

Computing module 303 for calculating the corresponding overlapping corporations modularity of each granulosa, and obtains overlapping corporations' mould First division result of the corresponding abstract network of lumpiness maximum value, wherein the overlapping corporations are the society with same node point Group, granulosa are the first division result of corresponding different default granulation coefficient；

Division module 304, for using the lap of corporations after each division in first division result as boundary Domain；And all nodes in the Boundary Region are divided using Random Walk Algorithm, update first division result with And the Boundary Region, until first division result and the update times of the Boundary Region reach preset times；

Second obtains module 305, for using three decisions for each of updated Boundary Region node Method is handled, and obtains the second division result, and using second division result as target division result.

With the embodiment shown in Fig. 3, the three-way decision method defers the decision for nodes in the boundary region on which no decision can yet be made. A decision on those nodes is made only after sufficient information has been obtained, which reduces decision errors and makes the decisions more reasonable. Therefore, compared with the prior art, the embodiment of the present invention can produce more reasonable non-overlapping communities, thereby effectively improving the precision of community division, facilitating analysis and understanding of the network structure, and making the network easier to optimize and manage.

In a specific implementation of the embodiment of the present invention, the first obtaining module 301 is further configured to:

In a specific implementation of the embodiment of the present invention, the setup module 302 is further configured to:

In a specific implementation of the embodiment of the present invention, a deduplication module is added on the basis of the embodiment shown in Fig. 3, configured to deduplicate the first division result.

In a specific implementation of the embodiment of the present invention, the computing module 303 is further configured to:

In a specific implementation of the embodiment of the present invention, the division module 304 is further configured to:

A: obtain the adjacency matrix corresponding to the abstract network;

The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

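1. A three-way decision community division method based on random walk boundary region processing, characterized in that the method comprises:

1) abstracting the acquired network on which community division is to be performed into an abstract network formed of connected nodes;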

2) for each node in the abstract network, performing initial granulation and then clustering granulation on the abstract network according to the connection attribute of the node, and taking the structural relationship of the abstract network after clustering granulation as a first division result, wherein the connection attribute includes the connection relationship between the node and other nodes;

3) calculating the overlapping community modularity corresponding to each granular layer, and obtaining the first division result of the abstract network corresponding to the maximum overlapping community modularity, wherein overlapping communities are communities that share a node, and a granular layer is the first division result corresponding to one of the different preset granulation coefficients;

4) taking the overlapping part of the divided communities in the first division result as a boundary region; dividing all nodes in the boundary region using a random walk algorithm, and updating the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number;

5) for each node in the updated boundary region, processing the node using the three-way decision method to obtain a second division result, and taking the second division result as the target division result.

2. The three-way decision community division method based on random walk boundary region processing according to claim 1, characterized in that step 1) comprises:

abstracting each of the interconnected individuals in the acquired network on which community division is to be performed into a node, and then abstracting the connecting links between the nodes into edges between the nodes.

3. The three-way decision community division method based on random walk boundary region processing according to claim 1, characterized in that step 2) comprises:

A: for each node in the abstract network, taking the node as a central node and obtaining the set of neighbor nodes of the central node; judging whether the network formed by the central node and the neighbor nodes contains communities of nodes that are connected to one another only through the central node; if so, obtaining the communities contained in the network formed by the central node and the neighbor nodes that have no connection to one another apart from the central node, and taking each of these communities as a divided community; if not, taking the network formed by the central node and the neighbor nodes as a divided community.

B: for each preset granulation coefficient among a preset number of preset granulation coefficients, and for each pair of connected divided communities in the set formed by the divided communities, obtaining the granulation coefficients respectively corresponding to the connected divided communities and then the maximum of these granulation coefficients; if the maximum of the granulation coefficients is not less than the preset granulation coefficient, merging the connected divided communities into one community, until the granulation coefficients of every pair of connected divided communities in the set are all less than the preset granulation coefficient; and taking the structural relationship of the communities obtained as the first division result.
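As an illustration of step A above, the initial granulation can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `build_adjacency` and `initial_granulation` are hypothetical, and the sketch assumes that the central node is added to each group of neighbors it produces, which the claim wording does not state explicitly. Step B is not sketched because the granulation coefficient is not defined in this claim.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def initial_granulation(adj):
    """Sketch of step A: produce one or more initial communities per central node."""
    communities = []
    for center, neighbors in adj.items():
        # Find groups of neighbors that are connected to each other
        # without passing through the central node.
        unvisited = set(neighbors)
        components = []
        while unvisited:
            start = unvisited.pop()
            component = {start}
            stack = [start]
            while stack:
                node = stack.pop()
                # Only follow edges that stay inside the neighbor set.
                for nxt in adj[node] & unvisited:
                    unvisited.discard(nxt)
                    component.add(nxt)
                    stack.append(nxt)
            components.append(component)
        if len(components) > 1:
            # Groups linked only through the center: each becomes a community.
            # ASSUMPTION: the center is included in every such community.
            communities.extend(comp | {center} for comp in components)
        else:
            # Otherwise the center and all its neighbors form one community.
            communities.append(set(neighbors) | {center})
    return communities
```

Because every node acts as a central node once, the same community can be produced several times, which is why a deduplication step is useful before modularity is computed.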

4. The three-way decision community division method based on random walk boundary region processing according to claim 1, characterized in that, before step 3), the method further comprises:

deduplicating the first division result.
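The deduplication step can be illustrated with a short Python sketch. The function name `deduplicate` is hypothetical, and the sketch assumes that two communities count as duplicates exactly when they contain the same set of nodes; the claim does not define the criterion further.

```python
def deduplicate(communities):
    """Drop communities whose node set has already been seen, keeping first-seen order."""
    seen = set()
    unique = []
    for community in communities:
        key = frozenset(community)  # hashable, order-independent identity
        if key not in seen:
            seen.add(key)
            unique.append(set(community))
    return unique
```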

5. The three-way decision community division method based on random walk boundary region processing according to claim 1, characterized in that step 3) comprises:

using the formula, obtaining the overlapping community modularity corresponding to each preset granulation coefficient, wherein

EQ is the overlapping community modularity corresponding to each preset granulation coefficient; m is the number of edges in the abstract network; G_r is each divided community in the community division result corresponding to each preset granulation coefficient; i is the index of the i-th node; j is the index of the j-th node; o_i is the number of divided communities to which the i-th node belongs; o_j is the number of divided communities to which the j-th node belongs; A_ij is the element of the adjacency matrix corresponding to the abstract network; d_i is the degree of the i-th node; d_j is the degree of the j-th node.
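The formula itself is not reproduced in this text. The symbols listed above match the commonly used extended modularity for overlapping communities, EQ = (1/2m) Σ_r Σ_{i,j ∈ G_r} (1/(o_i o_j)) [A_ij − d_i d_j/(2m)], and the following Python sketch assumes that form. It is an illustration only: the function name `overlapping_modularity` and the adjacency-map input are hypothetical choices, and the assumed formula should be checked against the original publication.

```python
def overlapping_modularity(adj, communities):
    """Compute EQ for an undirected graph given as {node: set(neighbors)}.

    ASSUMPTION: EQ = (1/2m) * sum over communities G_r and node pairs (i, j) in G_r
    of (A_ij - d_i * d_j / (2m)) / (o_i * o_j).
    """
    m = sum(len(nbrs) for nbrs in adj.values()) / 2       # number of edges
    degree = {v: len(nbrs) for v, nbrs in adj.items()}    # d_i
    membership = {v: 0 for v in adj}                      # o_i
    for community in communities:
        for v in community:
            membership[v] += 1
    total = 0.0
    for community in communities:
        for i in community:
            for j in community:
                a_ij = 1.0 if j in adj[i] else 0.0
                expected = degree[i] * degree[j] / (2 * m)
                total += (a_ij - expected) / (membership[i] * membership[j])
    return total / (2 * m)
```

With one candidate division per preset granulation coefficient collected in a list `layers` (a hypothetical name), the division with maximum modularity could then be selected with `max(layers, key=lambda coms: overlapping_modularity(adj, coms))`.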

6. The three-way decision community division method based on random walk boundary region processing according to claim 1, characterized in that dividing all nodes in the boundary region using a random walk algorithm comprises:

A: obtaining the adjacency matrix corresponding to the abstract network;

B: according to the adjacency matrix, calculating, using the formula, the similarity value between any two connected nodes in the boundary region, wherein

S(v_i, v_j) is the similarity value between any two connected nodes in the boundary region; v_i is the i-th node in the boundary region; v_j is the j-th node in the boundary region; min() is the minimum function; Y_iγ is the inner product of the vector formed by all elements of row i of the adjacency matrix and the vector formed by all elements of row γ, with Y_iγ = ∑_t A_it A_γt; A_it is the element in row i, column t of the adjacency matrix; A_γt is the element in row γ, column t of the adjacency matrix; ∑ is the summation; t is the column index of an element in the adjacency matrix; Y_jγ is the inner product of the vector formed by all elements of row j of the adjacency matrix and the vector formed by all elements of row γ, with Y_jγ = ∑_t A_jt A_γt; A_jt is the element in row j, column t of the adjacency matrix; max() is the maximum function; γ is a row index of the adjacency matrix;

C: obtaining the similarity value between any two nodes in the boundary region, wherein, when the indices of the two nodes are identical, the similarity value between them is set to 1; when there is an edge between the two nodes, the similarity value between them is calculated using the formula in step B; and when there is no edge between the two nodes, the similarity value between them is set to 0;

D: normalizing the similarity values between any two nodes in the boundary region to obtain a normalization matrix, and using the formula u = (1 - α) P^T · u₀ + α d to calculate the probability vector of each node in the boundary region walking to each of the other nodes, wherein

u is the probability vector of each node in the boundary region walking to each of the other nodes; α is the preset jump probability; P is the normalization matrix; u₀ is the probability vector; d is the transition probability vector; P^T is the transpose of the normalization matrix;

E: taking the probability vector of each node in the boundary region walking to each of the other nodes as the probability vector u₀ and returning to step D, until the probability vector of each node in the boundary region walking to each of the other nodes converges, to obtain the target probability vector of each node in the boundary region walking to each of the other nodes;

F: using the formula, calculating the probability that each node in the boundary region walks to each divided community in the abstract network, wherein

P(n→C_j) is the probability that the n-th node in the boundary region walks to the j-th divided community; Avg{} is the averaging function; n is the index of a node in the boundary region; v_i is the i-th node in the divided community C_j; j is the index of the divided community; [symbol missing in source] is an arbitrary function;

G: for each node in the boundary region, assigning the node to the divided community corresponding to the maximum among the probabilities that the node walks to each divided community, and updating the boundary region, the divided communities, and the first division result containing the structural relationship of the divided communities.
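Steps A to G can be sketched in Python as follows. The formulas of steps B and F are not reproduced in this text, so this is a sketch under loudly stated assumptions rather than the claimed computation: the similarity is assumed to be the ratio Σ_γ min(Y_iγ, Y_jγ) / Σ_γ max(Y_iγ, Y_jγ), the vector d is assumed to be a one-hot restart vector on the starting node (also used as the initial u₀), and the probability of walking to a community is assumed to be the average converged probability over the community's other members. The function names and the choice to walk over all nodes of a small graph are hypothetical.

```python
def similarity_matrix(A):
    """Steps A-C: similarity values from a 0/1 adjacency matrix (list of lists)."""
    n = len(A)
    # Y[i][g] = sum_t A[i][t] * A[g][t], as defined in step B.
    Y = [[sum(A[i][t] * A[g][t] for t in range(n)) for g in range(n)]
         for i in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                S[i][j] = 1.0                      # identical nodes: 1
            elif A[i][j]:
                # ASSUMPTION: min/max ratio; the source formula is missing.
                num = sum(min(Y[i][g], Y[j][g]) for g in range(n))
                den = sum(max(Y[i][g], Y[j][g]) for g in range(n))
                S[i][j] = num / den if den else 0.0
            # no edge: similarity stays 0
    return S


def walk_vector(S, start, alpha=0.15, tol=1e-10, max_iter=1000):
    """Steps D-E: iterate u = (1 - alpha) * P^T u0 + alpha * d until convergence."""
    n = len(S)
    # Row-normalize S; every row sum is at least 1 because S[i][i] = 1.
    P = [[s / sum(row) for s in row] for row in S]
    # ASSUMPTION: d is one-hot on the starting node and also the initial u0.
    d = [1.0 if k == start else 0.0 for k in range(n)]
    u = d[:]
    for _ in range(max_iter):
        new = [(1 - alpha) * sum(P[k][t] * u[k] for k in range(n)) + alpha * d[t]
               for t in range(n)]
        if max(abs(a - b) for a, b in zip(new, u)) < tol:
            return new
        u = new
    return u


def assign_boundary_node(S, node, communities, alpha=0.15):
    """Steps F-G: return the index of the community the node is assigned to."""
    u = walk_vector(S, node, alpha)
    scores = []
    for community in communities:
        # ASSUMPTION: average over the community's members other than the node.
        members = [v for v in community if v != node]
        scores.append(sum(u[v] for v in members) / len(members) if members else 0.0)
    return max(range(len(communities)), key=scores.__getitem__)
```

For example, in a five-node graph where node 2 closes a triangle with nodes 0 and 1 and has a single edge to node 3, `assign_boundary_node` places node 2 with the community containing nodes 0 and 1.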

7. A three-way decision community division device based on random walk boundary region processing, characterized in that the device comprises:

a first obtaining module, configured to abstract the acquired network on which community division is to be performed into an abstract network formed of connected nodes;

a setup module, configured to, for each node in the abstract network, perform initial granulation and then clustering granulation on the abstract network according to the connection attribute of the node, and to take the structural relationship of the abstract network after clustering granulation as a first division result, wherein the connection attribute includes the connection relationship between the node and other nodes;

a computing module, configured to calculate the overlapping community modularity corresponding to each granular layer, and to obtain the first division result of the abstract network corresponding to the maximum overlapping community modularity, wherein overlapping communities are communities that share a node, and a granular layer is the first division result corresponding to one of the different preset granulation coefficients;

a division module, configured to take the overlapping part of the divided communities in the first division result as a boundary region, and to divide all nodes in the boundary region using a random walk algorithm, updating the first division result and the boundary region until the number of updates of the first division result and the boundary region reaches a preset number;

a second obtaining module, configured to process each node in the updated boundary region using the three-way decision method to obtain a second division result, and to take the second division result as the target division result.

8. The three-way decision community division device based on random walk boundary region processing according to claim 7, characterized in that the first obtaining module is further configured to:

9. The three-way decision community division device based on random walk boundary region processing according to claim 7, characterized in that the setup module is further configured to:

10. The three-way decision community division device based on random walk boundary region processing according to claim 7, characterized in that the device further comprises: a deduplication module, configured to deduplicate the first division result.

11. The three-way decision community division device based on random walk boundary region processing according to claim 7, characterized in that the computing module is further configured to:

12. The three-way decision community division device based on random walk boundary region processing according to claim 7, characterized in that the division module is further configured to:

A: obtain the adjacency matrix corresponding to the abstract network;