CN101901251A

CN101901251A - Method for analyzing and recognizing complex network cluster structure based on Markov process metastability

Info

Publication number: CN101901251A
Application number: CN 201010210628
Authority: CN
Inventors: 杨博; 刘大有
Original assignee: Jilin University
Current assignee: Jilin University
Priority date: 2010-06-28
Filing date: 2010-06-28
Publication date: 2010-12-01
Anticipated expiration: 2030-06-28
Also published as: CN101901251B

Abstract

The invention relates to a method for analyzing and recognizing the cluster structure of a complex network based on the metastability of a Markov process. The main steps are: constructing a Markov process on a given complex network; calculating the transition probability matrix of the Markov process; calculating the eigenvalues of the matrix; determining the number of network clusters by analyzing the eigenvalues; calculating the first metastable state of the Markov process; and recognizing all network clusters of the network and their hierarchical structure from the first metastable state. The invention provides a new and efficient method for analyzing and recognizing complex network clusters. Compared with existing methods of the same kind, it is unbiased (it does not depend on a subjectively defined optimization objective or heuristic rule), fast (its computational time complexity is approximately linear), accurate (it correctly recognizes the network clusters of real-world complex networks and their hierarchical structure) and unsupervised (it needs no prior knowledge).

Description

Method for analyzing and recognizing complex network cluster structure based on Markov process metastability

Technical field

The invention belongs to the field of pattern recognition and data mining, and relates in particular to the analysis of complex networks such as social networks, the WWW and biological networks.

Background technology

Many systems in the real world exist in the form of networks, for example interpersonal relationship networks, scientific collaboration networks and epidemic transmission networks in social systems; neural networks, gene regulatory networks and protein interaction networks in ecosystems; and power grids, the Internet and the WWW in technological systems. Because these networks are highly complex, they are called "complex networks". Alongside the small-world and scale-free properties, complex network cluster structure (CNCS) is one of the most general and most important topological properties of complex networks: nodes within the same cluster are densely interconnected, whereas nodes in different clusters are sparsely interconnected. CNCS recognition methods aim to reveal the cluster structure that actually exists in a complex network.

CNCS recognition is important for analyzing the topology of complex networks, understanding their functions, discovering their hidden patterns and predicting their behavior, and it has a wide range of applications. It has been applied to social network analysis (such as identifying terrorist organizations and managing organizational structures), to biological network analysis (such as metabolic network analysis, protein interaction network analysis and prediction of unknown protein functions, and gene regulatory network analysis and master regulatory gene identification), and to WWW community mining, topic-based clustering of web documents, search engines, spatial data clustering, image segmentation and relational data analysis.

Many CNCS recognition algorithms already exist. According to the basic solution strategy they adopt, most of them fall into two broad classes: optimization-based methods and heuristic methods. The former convert CNCS recognition into an optimization problem and compute the cluster structure of a complex network by optimizing a predefined objective function; the latter design heuristic algorithms based on predefined heuristic rules.

Spectral methods and local search methods are the two main classes of optimization-based CNCS recognition methods. Spectral methods convert the network clustering problem into a quadratic optimization problem and optimize a predefined "cut" function by computing the eigenvectors of a particular matrix. Spectral methods rest on rigorous mathematical theory and have developed into an important clustering technique (known as spectral clustering) that is widely used in fields such as graph partitioning and spatial point clustering. For CNCS recognition, however, their main shortcoming is that the termination condition of the recursion must be defined from prior knowledge; that is, spectral methods cannot automatically determine the total number of network clusters. The Kernighan-Lin algorithm (1970), the fast Newman algorithm (2004) and the Guimera-Amaral algorithm (2005) are three typical CNCS recognition algorithms based on local search optimization. Algorithms of this class all contain three basic parts: an objective function, a search strategy for candidate solutions and a search strategy for the optimal solution, but they differ in how each part is implemented. Representative optimization-based CNCS recognition algorithms proposed later include a maximum-likelihood-based algorithm proposed in 2008, an improved spectral method proposed in 2009 and a Monte Carlo-based algorithm proposed in 2010.

The cluster structure identified by an optimization method depends entirely on the optimization objective, so a "biased" objective function leads to a "biased" solution (that is, the cluster structure obtained does not match the one that actually exists). Notably, many optimization-based CNCS recognition methods, including the fast Newman algorithm and the Guimera-Amaral algorithm mentioned above, take maximization of the Q function proposed by Newman in 2002 as their objective. Research has shown, however, that the Q function is biased and cannot characterize the real cluster structure with complete accuracy. For example, for the Karate social network benchmark, the Q value of the real cluster structure is a local maximum rather than the global maximum. In 2004, Guimera et al. found that for some random networks, owing to the influence of perturbations, clearly poor cluster structures nevertheless have relatively high Q values. In 2007, Fortunato and Barthelemy systematically studied the influence of the Q function on recognition accuracy and pointed out that CNCS recognition algorithms based on optimizing the Q function tend to find coarse rather than fine cluster structures. This means that in most cases such algorithms cannot identify all of the network clusters that actually exist in a network.

The MFC (Maximum Flow Community) algorithm (2002), the Girvan-Newman (GN) algorithm (2002), the Wu-Huberman (WH) algorithm (2004) and the CPM (Clique Percolation Method) algorithm (2005) are typical heuristic CNCS recognition algorithms. Representative heuristic algorithms proposed later include a hierarchical clustering algorithm proposed in 2008, an information-theory-based CNCS recognition algorithm proposed in 2009 and a multi-granularity CNCS recognition algorithm based on Laplacian dynamics proposed in 2010. What these algorithms have in common is that the heuristic information they use is designed from intuitive assumptions. For most networks they find an approximately optimal solution quickly, but there is no rigorous theoretical guarantee that they find a satisfactory solution for every input network.

In summary, although several methods exist, each has its own limitations, and the CNCS recognition problem is still far from being solved well. This shows mainly in the following two respects:

First, we do not yet have an objective theoretical understanding of what network cluster structure essentially is. At present we cannot answer basic questions such as the following. How does network cluster structure form? How is it related to other properties of the network? To which intrinsic properties of the network itself is it related? For now, therefore, we have to understand the concept of a network cluster by observing the "external" phenomena shown by networks that have clusters, and then characterize and identify CNCS through "subjectively" defined objective functions or heuristic rules. As analyzed above, algorithms based on such objective functions or heuristic rules often produce "biased" results, and different objective functions or heuristic rules often yield different CNCS. A basic question is therefore: starting from the "intrinsic" properties of a network, can we provide an "objective" theoretical model to explain, characterize and identify CNCS?

Second, existing CNCS recognition algorithms each have their own limitations and cannot simultaneously meet the basic requirements of being unbiased, fast, accurate and unsupervised (that is, independent of prior knowledge and insensitive to parameters). Qualitative and quantitative analysis and comparison of the main existing algorithms shows that algorithms with high recognition accuracy often have high time complexity (above O(n^2)), whereas fast algorithms often sacrifice accuracy and need more parameters and prior knowledge. In particular, how to identify the true number of network clusters without any prior information is still an unsolved problem. How to design fast, accurate and unsupervised CNCS recognition methods is therefore one of the problems that currently needs to be solved.

Summary of the invention

The object of the invention is to reveal the nature of the clustering phenomenon in complex networks and to provide a method for quantitatively analyzing and rapidly recognizing complex network cluster structure.

To achieve the above object, the invention provides a method for analyzing and recognizing complex network cluster structure based on Markov process metastability, characterized in that it comprises the following steps:

Construct a Markov process on the given complex network;

Calculate the one-step transition probability matrix of this process, and calculate the eigenvalues of this matrix;

Determine the number of network clusters by analyzing the eigenvalues;

Calculate the first metastable state of the Markov process;

Identify all network clusters of the network and their hierarchical structure from the first metastable state.

Description of drawings

The flowchart in Figure 1 shows NAP, the theoretical framework for analyzing and recognizing complex network cluster structure based on Markov process metastability;

The flowchart in Figure 2 shows fast_NAP, a fast implementation of the above theoretical framework. This method identifies all network clusters that actually exist in a network, together with their hierarchical structure, in approximately linear time. The method is parameter-free and needs no prior knowledge of the network;

Figures 3 to 7 show the results of analyzing different networks with the NAP and fast_NAP methods.

Embodiment

The invention is described in detail below.

With reference to Fig. 1, the flow of the NAP method starts at step 101.

Step 102 gives the method for constructing a Markov process on the network, as follows:

Suppose an agent exists in the network that can move at random from one network node to another along the network connections. Let P{X_t = i, 1 ≤ i ≤ n} denote the probability that the agent arrives at network node i after t steps, and let X = {X_t, t ≥ 0} denote the random sequence formed by the agent's positions at different times. The agent's position at time t is determined solely by its position at time t-1 and is independent of its positions before time t-1, that is:

P(X_{t} = i_{t} | X_{0} = i_{0}, X_{1} = i_{1}, \ldots, X_{t-1} = i_{t-1}) = P(X_{t} = i_{t} | X_{t-1} = i_{t-1})

The random sequence X therefore has the Markov property and is a discrete, time-homogeneous Markov process.

Step 103 gives the method for constructing the transition probability matrix P of the Markov process on the network, as follows:

P = D^{-1} A

where the matrix A = (a_{ij})_{n \times n} is the adjacency matrix of the network, D = diag(d_1, ..., d_n) is the degree matrix of the network, and d_i = \sum_{j} a_{ij}.
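The construction in step 103 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `transition_matrix`, the list-of-rows matrix layout and the toy path network are all hypothetical choices made for this example.

```python
def transition_matrix(adjacency):
    """Return P = D^{-1} A for a (weighted) adjacency matrix given as a list of rows."""
    matrix = []
    for row in adjacency:
        degree = sum(row)  # d_i = sum_j a_ij
        if degree == 0:
            # An isolated node has no outgoing step, so its row of P is undefined.
            raise ValueError("isolated node: degree is zero")
        matrix.append([a_ij / degree for a_ij in row])
    return matrix


# Hypothetical toy network: a path 0 - 1 - 2.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
P = transition_matrix(A)
for row in P:
    print(row)
```

Each row of P sums to 1, so row i can be read as the distribution of the agent's next position when it currently sits on node i.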

Step 104 gives the method for calculating the eigenvalues of the matrix I-P: the power method is used to calculate all eigenvalues of I-P iteratively.

Step 105 gives the method for calculating the number of network clusters K from the above eigenvalues, as follows:

K = \arg \min_{k} {CQ}_{k}

where CQ_K = λ_K / λ_{K+1} and Λ = (λ_1, ..., λ_n) denotes the eigenvalues of the matrix I-P.
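The selection rule of step 105 can be illustrated as follows. This is a sketch only: it assumes the eigenvalues of I-P have already been computed and are real, it sorts them in ascending order (the text speaks of the (K+1)-th smallest eigenvalue), and it assumes the search starts at k = 2, because the smallest eigenvalue of I-P is 0 and CQ_1 would then be trivially 0. That last restriction, the function name `cluster_count` and the sample eigenvalues are assumptions not stated in the source.

```python
def cluster_count(eigenvalues):
    """Return (K, cq), where cq[k] = lambda_k / lambda_{k+1} with 1-based k."""
    lam = sorted(eigenvalues)  # lambda_1 <= lambda_2 <= ... <= lambda_n
    n = len(lam)
    cq = {}
    for k in range(2, n):      # assumption: k = 1 is skipped because lambda_1 = 0
        if lam[k] > 0:         # lam[k] holds lambda_{k+1} (0-based list index)
            cq[k] = lam[k - 1] / lam[k]
    if not cq:
        raise ValueError("need at least three eigenvalues with a positive lambda_{k+1}")
    best_k = min(cq, key=cq.get)
    return best_k, cq


# Hypothetical spectrum of I-P with a clear gap after the third eigenvalue.
K, CQ = cluster_count([0.0, 0.02, 0.03, 0.9, 1.0, 1.1])
print(K, CQ)
```

A small ratio CQ_K means a large gap between λ_K and λ_{K+1}, which is what the rule looks for.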

Step 106 gives the method for calculating the first metastable state S_1 of the Markov process, as follows:

S_{1} = P^{1 / \lambda_{K + 1}}

where P is the one-step transition probability matrix of the Markov process calculated in step 103, and λ_{K+1} is the (K+1)-th smallest eigenvalue of the matrix I-P.
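As an illustration of step 106, the sketch below raises P to the power 1/λ_{K+1} by repeated squaring. The source does not say how a non-integer exponent is handled; rounding it to the nearest integer is an assumption of this example, as are the helper names `mat_mul`, `mat_pow` and `first_metastable_state` and the sample values.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def mat_pow(matrix, exponent):
    """Integer matrix power by repeated squaring."""
    n = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in matrix]
    while exponent > 0:
        if exponent & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        exponent >>= 1
    return result


def first_metastable_state(P, lambda_k_plus_1):
    """S_1 = P^(1 / lambda_{K+1}), with the exponent rounded (assumption)."""
    steps = max(1, round(1.0 / lambda_k_plus_1))
    return mat_pow(P, steps)


# Hypothetical two-state chain and eigenvalue, for illustration only.
P = [[0.9, 0.1], [0.2, 0.8]]
S1 = first_metastable_state(P, 0.3)  # exponent 1/0.3 is rounded to 3
for row in S1:
    print(row)
```

Because S_1 is a power of a row-stochastic matrix, its rows remain probability distributions, which is what lets step 107 treat each row as a feature vector for a node.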

Step 107 gives the general method for identifying the K network clusters and their hierarchical structure from S_1. One concrete method is as follows:

Regard each row of the matrix S_1 as an n-dimensional vector, giving n n-dimensional vectors in total, each corresponding to one node of the network;

Use a similarity-based vector clustering method (such as K-means) to group these n n-dimensional vectors into K classes, which correspond to the K clusters of the network.

With reference to Fig. 2, the flow of the fast_NAP method starts at step 201.

Step 202 gives the method for selecting the most stable state c in the network, as follows:

c = \arg \max_{i} {d_{i}}, 1 \leq i \leq n

where d_i denotes the degree of node i.

Step 203 gives the method for calculating the order-invariant distribution (OTD) of c, as follows:

Iteratively calculate the t-step transition probability from every state to state c:

P_{i, c}^{(t)} = \frac{1}{d_{i}} \sum_{\langle i, j \rangle \in E} A_{ij} \cdot P_{j, c}^{(t - 1)};

Sort all states by the probability values P_{i, c}^{(t)} to obtain the state index sequence S_t at time t;

Repeat the above two steps until the state index sequence no longer changes, that is, S_t = S_{t-1};

The state transition probability distribution at the time T at which the state index sequence stops changing is the order-invariant distribution OTD of c.
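Steps 202 and 203 can be sketched together as below. This is a hedged illustration rather than the implementation of the invention. It assumes the iteration starts from the 0-step probabilities (1 for c itself, 0 elsewhere), sorts in ascending order, breaks ties by node index and gives up after `max_steps` iterations; none of these details is stated in the source. The function names and the two-triangle test network are hypothetical.

```python
def most_stable_state(adjacency):
    """Step 202: pick the node with the largest degree (first one on ties)."""
    degrees = [sum(row) for row in adjacency]
    return max(range(len(degrees)), key=degrees.__getitem__)


def order_invariant_distribution(adjacency, max_steps=1000):
    """Step 203: iterate P_{i,c}^{(t)} until the sorted node order stops changing."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    c = most_stable_state(adjacency)
    # Assumption: start from the 0-step probabilities of reaching c.
    prob = [1.0 if i == c else 0.0 for i in range(n)]
    previous_order = None
    for _ in range(max_steps):
        prob = [sum(adjacency[i][j] * prob[j] for j in range(n)) / degrees[i]
                for i in range(n)]
        # Ascending order, ties broken by node index (assumption).
        order = sorted(range(n), key=lambda i: (prob[i], i))
        if order == previous_order:  # S_t == S_{t-1}
            return c, prob, order
        previous_order = order
    raise RuntimeError("node order did not stabilise within max_steps")


# Hypothetical network: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2 - 3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
c, otd, order = order_invariant_distribution(A)
print(c, order)
```

On this toy network the nodes of the triangle that does not contain c end up at the low end of the ordering, so a single cut point in the node order is enough to separate the two triangles.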

Step 204 gives the method for calculating an optimal bipartition of the network from the order-invariant distribution of c, as follows:

Sort the network nodes in ascending order of the component values of the calculated order-invariant distribution OTD;

Convert the adjacency matrix A of network N into a new adjacency matrix B according to this node order;

Let I_{\{f\}} denote the indicator function, which takes the value 1 if f is true and 0 otherwise. Let the vector X_x(i) = 1 \cdot I_{\{i \leq x\}} + (-1) \cdot I_{\{i > x\}}, 1 ≤ i ≤ n, represent the network bipartition (N_1, N_2), where x (1 ≤ x ≤ n) is a cut point of the node order: x divides the node order into two parts, and thereby divides the whole network N into two sub-networks N_1 and N_2. The optimal network bipartition x^* satisfies:

x^{*} = \arg \min_{1 \leq x \leq n} \frac{Y_{x}^{T} (I - D^{-1} B) Y_{x}}{Y_{x}^{T} Y_{x}} = \arg \min_{1 \leq x \leq n} \frac{Y_{x}^{T} Q_{B} Y_{x}}{Y_{x}^{T} Y_{x}}

where the vector Y_x = (1-x)(1+X_x) - x(1-X_x), D is the degree matrix of the network calculated in step 103, and Q_B = I - D^{-1} B.

Using the formula above, calculate in turn the n values of the quotient for x = 1, x = 2, ..., x = n. The value of x that gives the minimum is the optimal bipartition x^* sought, and network N is then divided into the two sub-networks N_1 and N_2.

Step 205 gives the method for judging the stop condition, as follows:

EP(N_1, N_2) ≥ 0.5 ∧ EP(N_2, N_1) ≥ 0.5

where EP(N_1, N_2) denotes the escape probability from N_1 to N_2, (N_1, N_2) denotes a bipartition of network N satisfying N_1 ∪ N_2 = N and N_1 ∩ N_2 = Φ, where Φ denotes the empty set, and P is the transition probability matrix of the Markov process calculated in step 103.
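The stop test of step 205 can be illustrated as follows. The defining formula for the escape probability is not reproduced in this text, so the sketch has to assume one: it takes EP(N_1, N_2) to be the one-step probability of moving from N_1 into N_2, averaged over the nodes of N_1. That form only uses the symbols |N_1| and P that the claims mention and is an assumption, not the formula of the invention. The names `escape_probability` and `should_stop` and both toy networks are likewise hypothetical.

```python
def transition_matrix(adjacency):
    """P = D^{-1} A, as in step 103."""
    return [[a / sum(row) for a in row] for row in adjacency]


def escape_probability(P, source, target):
    """Assumed form: average one-step probability of leaving `source` for `target`."""
    total = sum(P[i][j] for i in source for j in target)
    return total / len(source)


def should_stop(P, part1, part2, threshold=0.5):
    """Stop condition of step 205: both escape probabilities reach the threshold."""
    return (escape_probability(P, part1, part2) >= threshold
            and escape_probability(P, part2, part1) >= threshold)


# Two triangles joined by one edge: the halves are hard to escape, so keep splitting.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
split_further = not should_stop(transition_matrix(two_triangles), [0, 1, 2], [3, 4, 5])

# A complete graph on 4 nodes: any half is left easily, so the recursion stops.
complete4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
stop_here = should_stop(transition_matrix(complete4), [0, 1], [2, 3])
print(split_further, stop_here)
```

Under this assumed definition, a proposed bipartition is rejected when the agent leaves each part with probability at least 0.5, which indicates that neither part behaves as a cluster on its own.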

Step 207 gives the method for dividing the network into two sub-networks, as follows:

Assign the vector X_x according to the optimal cut point x^* calculated in step 204:

X_{x}(i) = 1 \cdot I_{\{i \leq x^{*}\}} + (-1) \cdot I_{\{i > x^{*}\}}, 1 \leq i \leq n

The nodes whose component value in X_x is 1, together with the connections among them, form the first sub-network; the nodes whose component value in X_x is -1, together with the connections among them, form the second sub-network.

Step 208 gives the method for recursively processing the first sub-network: the first sub-network obtained in step 207 is taken as input and execution starts again from step 202.

Step 209 gives the method for recursively processing the second sub-network: the second sub-network obtained in step 207 is taken as input and execution starts again from step 202.
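The control flow of steps 202 to 209 amounts to a recursive bisection that builds a binary tree. The skeleton below shows only that control flow. The `bisect` and `should_stop` callbacks stand in for steps 202 to 204 and step 205 respectively, and the stand-ins used in the example (halving a list, stopping at two nodes) are placeholders for illustration, not the criteria of the invention.

```python
def recursive_bisection(nodes, bisect, should_stop):
    """Return a tree of dicts; each leaf holds the nodes of one cluster."""
    if len(nodes) < 2:
        return {"nodes": nodes, "children": []}
    part1, part2 = bisect(nodes)
    if not part1 or not part2 or should_stop(part1, part2):
        # The proposed split is rejected: this sub-network is a cluster.
        return {"nodes": nodes, "children": []}
    return {
        "nodes": nodes,
        "children": [
            recursive_bisection(part1, bisect, should_stop),  # step 208
            recursive_bisection(part2, bisect, should_stop),  # step 209
        ],
    }


def leaves(tree):
    """Collect the clusters stored at the leaves, left to right."""
    if not tree["children"]:
        return [tree["nodes"]]
    return [cluster for child in tree["children"] for cluster in leaves(child)]


# Placeholder callbacks, for illustration only.
def halve(nodes):
    return nodes[: len(nodes) // 2], nodes[len(nodes) // 2:]


def stop_at_pairs(part1, part2):
    return len(part1) + len(part2) <= 2


tree = recursive_bisection(list(range(8)), halve, stop_at_pairs)
clusters = leaves(tree)
print(clusters)
```

The root of the returned tree represents the whole network and the internal nodes record every accepted split, so the same structure carries both the clusters and their hierarchy.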

The NAP and fast_NAP methods are tested and evaluated below on different networks to further illustrate the principle and effect of the method of the invention, as follows:

Example 1: analyzing the number of network clusters that actually exist in a network with the NAP method

Fig. 3(a) shows an image containing 4 characters. Fig. 3(b) shows the network corresponding to this image; the network is modeled as fully connected, and the connection weights are calculated with the Gaussian similarity formula. In this network each character forms a natural network cluster, so the true number of clusters is 4. Fig. 3(c) shows the CQ_K values calculated by NAP; CQ_4 is the smallest, so the number of network clusters calculated by NAP is 4, which agrees with the true number.

Example 2: Fig. 4 shows the interface of software that implements the fast_NAP method. Using fast_NAP, the software identifies all network clusters in a network and their hierarchical structure, and visualizes them by combining an adjacency matrix with a hierarchy tree. Rearranging the rows and columns of the original adjacency matrix so that nodes of the same cluster are placed together yields a transformed adjacency matrix that shows the cluster structure clearly. If the network has a distinct cluster structure, the transformed matrix should be approximately block diagonal, with each diagonal block corresponding to exactly one network cluster. The nonzero elements in the diagonal blocks (intra-cluster edges) far outnumber those scattered outside them (inter-cluster edges). The hierarchical relationships between clusters are represented by a tree.
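The rearrangement of rows and columns described in Example 2 can be sketched as a simple permutation of the adjacency matrix. The function name `reorder_adjacency` and the four-node network are hypothetical and serve only to show the effect.

```python
def reorder_adjacency(adjacency, order):
    """Permute rows and columns so that node order[k] becomes row and column k."""
    return [[adjacency[i][j] for j in order] for i in order]


# Hypothetical network with clusters {0, 2} and {1, 3}, interleaved in the original order.
A = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
B = reorder_adjacency(A, [0, 2, 1, 3])  # place nodes of the same cluster side by side
for row in B:
    print(row)
```

After the permutation the nonzero entries sit in two 2 x 2 blocks on the diagonal, one per cluster.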

Example 3: analyzing the American college football league network with the fast_NAP method

Fig. 5(a) shows the American college football league network for the 2000 season. The network contains 115 nodes and 613 edges. Each node represents a college football team, and each edge represents a game played between two teams. All teams are organized into 12 conferences according to geographic location. Under the rules of the competition, games within a conference far outnumber games between conferences. In terms of games played, therefore, the 12 conferences correspond to 12 network clusters.

Fig. 5(b) shows the result calculated by the fast_NAP method: a block-diagonalized adjacency matrix and a hierarchy tree of the network clusters. Comparison shows that the 12 network clusters obtained by the fast_NAP algorithm largely agree with the 12 actual conferences; only 6 teams, belonging to 3 relatively independent conferences, are misassigned. They are misassigned because they played too many games against teams outside their own conferences.

Example 4: analyzing the dolphin network with the fast_NAP method

Fig. 6(a) shows the dolphin social network. The network contains 62 nodes and 160 edges. Each node represents a dolphin, and each edge represents a social relationship between two dolphins. This group of dolphins was tracked and observed for 7 years. For some reason the group eventually split into 2 subgroups, as shown in Fig. 6(a).

Fig. 6(b) shows the result calculated by the fast_NAP method: a block-diagonalized adjacency matrix and a hierarchy tree of the network clusters. Comparison shows that the 2 largest network clusters obtained by the fast_NAP method accurately predict the actual split of the dolphin society. In addition, the fast_NAP method also identifies the further splits that the 2 dolphin groups might undergo after the split.

Example 5: image pattern recognition with the fast_NAP method

This section tests the effectiveness of the fast_NAP method on an image pattern recognition problem. Once an image is modeled as a network, recognizing the main elements of the image becomes recognizing the corresponding network clusters. The first column of Fig. 7 shows 3 different images, whose corresponding networks are generated as follows:

(1) regard each pixel as a network node;

(2) for each pair of pixels i(x_i, y_i) and j(x_j, y_j), calculate the weight of the connection (i, j) with the Gaussian similarity formula;

(3) remove the connections whose weights are below the average.
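The three construction steps above can be sketched as follows. The source does not spell out the Gaussian similarity formula, so this example assumes the common form exp(-dist^2 / (2 sigma^2)) on pixel coordinates with a hypothetical bandwidth `sigma`, and it takes "the average" to mean the mean weight over all pixel pairs. The function name `image_to_network` and the sample pixels are likewise illustrative assumptions.

```python
import math


def image_to_network(pixels, sigma=1.0):
    """Build a weighted adjacency matrix from pixel coordinates (x, y)."""
    n = len(pixels)
    if n < 2:
        raise ValueError("need at least two pixels")
    weights = [[0.0] * n for _ in range(n)]
    pair_weights = []
    for i in range(n):
        for j in range(i + 1, n):
            dx = pixels[i][0] - pixels[j][0]
            dy = pixels[i][1] - pixels[j][1]
            # Assumed Gaussian similarity on the squared Euclidean distance.
            w = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            weights[i][j] = weights[j][i] = w
            pair_weights.append(w)
    mean = sum(pair_weights) / len(pair_weights)
    # Step (3): drop every connection whose weight is below the average.
    for i in range(n):
        for j in range(n):
            if weights[i][j] < mean:
                weights[i][j] = 0.0
    return weights


# Hypothetical pixels forming two well separated groups.
W = image_to_network([(0, 0), (0, 1), (5, 5), (5, 6)])
for row in W:
    print(row)
```

With these sample pixels only the two short-range connections survive the thresholding, so each group of neighbouring pixels becomes its own densely connected part of the network.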

The second column of Fig. 7 shows the networks corresponding to the different images, obtained by the above method.

The third column of Fig. 7 shows the recognition results of the fast_NAP method, in which different image elements are distinguished by different colors. As the figure shows, all image elements are recognized correctly. In particular, in the third image the characters "A" and "B" are correctly recognized despite the noise.

Claims

1. A method for analyzing and recognizing complex network cluster structure, characterized in that it comprises the following steps:

Construct a Markov process X on the given complex network N;

Calculate the one-step transition probability matrix P of X, and calculate the eigenvalues of the matrix I-P;

Calculate the number of network clusters K of N from the eigenvalues;

Calculate the first metastable state S_1 of X;

Identify the K network clusters of N and their hierarchical structure from S_1.

2. The complex network cluster structure recognition method according to claim 1, characterized in that the method identifies the cluster structure of a complex network on the following basic principle:

The ubiquitous clustering phenomenon in complex networks is the external manifestation of the intrinsic metastability of complex networks. A Markov process on a complex network with cluster structure passes through several metastable states as it evolves toward its fully stable state in the limit. The transition probability matrix of the Markov process at the first metastable state contains most of the information about the network clusters, so cluster structure information such as the form, number and hierarchical structure of the network clusters can be identified by calculating and analyzing the first metastable state of the Markov process.

3. The complex network cluster structure recognition method according to claim 1, characterized in that a Markov process on the complex network is constructed as follows:

Suppose an agent exists in the network that can move at random from one network node to another along the network connections. Let X = {X_t, t ≥ 0} denote the random sequence formed by the agent's positions at different times, and let P{X_t = i, 1 ≤ i ≤ n} denote the probability that the agent arrives at network node i after t steps. The agent's position at time t is determined solely by its position at time t-1 and is independent of its positions before time t-1, that is:

P(X_{t} = i_{t} | X_{0} = i_{0}, X_{1} = i_{1}, \ldots, X_{t-1} = i_{t-1}) = P(X_{t} = i_{t} | X_{t-1} = i_{t-1})

4. The complex network cluster structure recognition method according to claim 1, characterized in that the one-step transition probability matrix P of the Markov process on the complex network is calculated as follows:

P = D^{-1} A

where the matrix A = (a_{ij})_{n \times n} is the adjacency matrix of the network, D = diag(d_1, ..., d_n) is the degree matrix of the network, and d_i = \sum_{j} a_{ij} is the degree of node i. The eigenvalues of the matrix I-P are denoted Λ = (λ_1, ..., λ_n).

5. The complex network cluster structure recognition method according to claim 1, characterized in that important information about the cluster structure of a complex network (for example: does the network have a cluster structure? what is the form of each network cluster? how are the clusters connected to one another? how many network clusters are there?) can be obtained as follows:

Based on large deviation theory, the entry and exit times of each metastable state can be calculated from the eigenvalues of the one-step transition probability matrix of the Markov process (the matrix P calculated in claim 4). Based on the physical meaning of these times, the physical quantities that quantitatively describe the form of the network cluster structure are defined as follows: the compactness of the clusters, the separation between clusters, and the quality of the network cluster structure.

For a complex network containing K network clusters:

The compactness of the clusters is defined as: compact degree = 1 / λ_{K+1}

The separation between clusters is defined as: separated degree = 1 / λ_K

The quality of the network cluster structure is defined as:

CQ_{K} = \lambda_{K} / \lambda_{K + 1}

where Λ = (λ_1, ..., λ_n) are the eigenvalues of the matrix I-P, and P is the one-step transition probability matrix of the Markov process calculated by the method of claim 4. The smaller the compact degree, the more compact each cluster; the larger the separated degree, the better separated the clusters. 0 ≤ CQ_K ≤ 1; the closer CQ_K is to 0, the better the quality of the network cluster structure, and the closer it is to 1, the worse the quality.

According to the above definitions, the optimal number of network clusters corresponds to the best quality of the network cluster structure, that is:

K = \arg \min_{k} {CQ}_{k}

where CQ_K is the network cluster structure quality defined above.

6. The complex network cluster structure recognition method according to claim 1, characterized in that the first metastable state of the Markov process on a complex network containing K network clusters is calculated as follows:

S_{1} = P^{1 / \lambda_{K + 1}}

where P is the one-step transition probability matrix of the Markov process calculated by the method of claim 4, K is the number of network clusters calculated by the method of claim 5, and λ_{K+1} is the (K+1)-th smallest eigenvalue of the matrix I-P.

7. The complex network cluster structure recognition method according to claim 1, characterized in that the first metastable state of the Markov process can be calculated rapidly, and all network clusters and their hierarchical structure identified from it, as follows:

Select the most stable state c in the network;

Calculate the order-invariant distribution OTD (Ordering Time Distribution) of c;

Calculate the optimal bipartition of the network from the order-invariant distribution of c;

If the stop condition is satisfied, terminate the algorithm;

Otherwise divide the network into two sub-networks according to the optimal bipartition;

Recursively process these two sub-networks.

8. The method according to claim 7 for rapidly calculating the first metastable state of the Markov process and identifying from it all network clusters and their hierarchical structure, characterized in that the most stable state c in the network can be selected, and the order-invariant distribution OTD of the stable state c calculated, as follows:

(1) Select the most stable state c in the network:

c = \arg \max_{i} {\pi_{i}} = \arg \max_{i} {d_{i}}, 1 \leq i \leq n

where π_i denotes the limiting probability that the Markov process arrives at state i, and d_i denotes the degree of node i.

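(2) Calculate the order-invariant distribution OTD of the stable state c:

Iteratively calculate the t-step transition probability from every state to state c:

P_{i, c}^{(t)} = \frac{1}{d_{i}} \sum_{\langle i, j \rangle \in E} A_{ij} \cdot P_{j, c}^{(t - 1)};

Sort all states by the probability values P_{i, c}^{(t)} to obtain the state index sequence S_t at time t;

Repeat the above two steps until the state index sequence no longer changes, that is, S_t = S_{t-1};

The state transition probability distribution at the time T at which the state index sequence stops changing is the order-invariant distribution OTD of c.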

9. The method according to claim 7 for rapidly calculating the first metastable state of the Markov process and identifying from it all network clusters and their hierarchical structure, characterized in that the optimal bipartition of the network can be calculated as follows:

Let (N_1, N_2) denote a bipartition of network N, where N_1 and N_2 are two sub-networks satisfying N_1 ∪ N_2 = N and N_1 ∩ N_2 = Φ, and Φ denotes the empty set.

Let

denote the escape probability from sub-network N_1 to sub-network N_2, where |N_1| denotes the number of nodes in sub-network N_1 and P denotes the one-step transition probability matrix of the Markov process calculated by the method of claim 4.

Let

denote the randomized network cluster structure quality;

Let the vector

represent the network bipartition (N_1, N_2); then

where P denotes the one-step transition probability matrix of the Markov process calculated by the method of claim 4, Y = (1-k)(1+X) - k(1-X), and

I_{\{f\}} denotes the indicator function, which takes the value 1 if f is true and 0 otherwise.

The optimal bipartition X^* of the network satisfies:

The optimal bipartition X^* of the network can be calculated as follows:

Sort the network nodes in ascending order of the component values of the order-invariant distribution OTD calculated according to claim 8;

Let the vector X_x(i) = 1 \cdot I_{\{i \leq x\}} + (-1) \cdot I_{\{i > x\}}, 1 ≤ i ≤ n, represent the network bipartition (N_1, N_2), where x (1 ≤ x ≤ n) is a cut point of the node order: x divides the node order into two parts, and thereby divides the whole network N into two sub-networks N_1 and N_2. The optimal network bipartition x^* satisfies:

x^{*} = \arg \min_{1 \leq x \leq n} \frac{Y_{x}^{T} (I - D^{-1} B) Y_{x}}{Y_{x}^{T} Y_{x}} = \arg \min_{1 \leq x \leq n} \frac{Y_{x}^{T} Q_{B} Y_{x}}{Y_{x}^{T} Y_{x}}

where the vector Y_x = (1-x)(1+X_x) - x(1-X_x), D is the degree matrix of the network calculated according to claim 4, and Q_B = I - D^{-1} B.

Using the formula above, calculate in turn the n values of the quotient for x = 1, x = 2, ..., x = n. The value of x that gives the minimum is the optimal bipartition x^* sought. According to x^*, the node sequence is divided into two sets, which correspond to the two sub-networks N_1 and N_2 produced by the split, and the escape probabilities from N_1 to N_2 and from N_2 to N_1 are then calculated according to the escape probability defined in this claim.

10. The method according to claim 7 for rapidly calculating the first metastable state of the Markov process and identifying from it all network clusters and their hierarchical structure, characterized in that the stop condition of the recursive bipartition can be judged, and the hierarchical structure of the network clusters calculated, as follows:

(1) The stop condition of the recursive bipartition is:

EP(N_1, N_2) ≥ 0.5 ∧ EP(N_2, N_1) ≥ 0.5

where EP(N_1, N_2) denotes the escape probability from N_1 to N_2 calculated according to claim 9.

(2) The hierarchical structure of the network clusters can be established as follows:

The recursive calculation produces a binary tree in which each node represents a sub-network, the root node represents the whole network, and each leaf node represents a network cluster. This binary tree represents the hierarchical structure formed by all the network clusters.