CN105631751A

CN105631751A - Directional local group discovery method

Info

Publication number: CN105631751A
Application number: CN201510996221.3A
Authority: CN
Inventors: 潘理; 吴鹏
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2015-12-25
Filing date: 2015-12-25
Publication date: 2016-06-01

Abstract

The invention provides a directional local group discovery method comprising the following steps: building the adjacency matrix and the attribute matrix of a network; inferring attribute-importance weight vectors from a model node that reflects the directional target; re-weighting the network edges based on each weight vector; extracting the edges with significantly large weights on the re-weighted network to form group seeds; extracting directional local groups by locally expanding the group seeds to optimize weighted conductance; and removing insignificant and duplicate groups from the extracted directional local groups. Because social networks contain diverse group structures and group-based applications have specific targets, the invention provides an adaptive way to infer the directional target: the network is re-weighted and the seeds are built in the inferred target subspace, and the directional local groups are extracted by local expansion, which makes the method suitable for specific social application targets.

Description

Directional local group discovery method

Technical field

The present invention relates to the technical field of social networks, and in particular to a method for discovering directional local groups in a social network. The method can be used for social network function analysis, structure visualization, and as input to various social applications.

Background

Group discovery in social networks plays an important role in understanding network function, visualizing network structure, and developing other social applications. Structurally, a group is densely connected internally and sparsely connected to other groups; in terms of attributes, the members of a group are relatively homogeneous on a particular attribute subspace.

A search of the prior art shows that most group discovery methods consider only network topology and extract only one fixed group structure. In fact, because social networks are complex and very large, they usually contain multiple group structures. Different social applications have different targets and need group structures with different orientations, so suitable directional local groups must be extracted according to the specific application target and the user's interest.

However, it is difficult to obtain a directional group structure from network structure information alone. Attribute information is now available for many social networks, and attributes can reflect and describe the application target and the user's interest. The present invention therefore provides a method that uses attributes to guide directional local group discovery.

Group discovery methods that combine structure information and attribute information mainly fall into full-attribute-space methods and attribute-subspace methods. Full-space methods cluster the node set using all given attributes. Xu et al., in the article "A model-based approach to attributed graph clustering" published at the international conference SIGMOD in 2012, use a Bayesian model to handle structure and all attribute information simultaneously. The model assigns a probability to every possible group structure based on the connections and all attribute distributions, converts group discovery into a probabilistic inference problem, and solves it with a variational method. However, not all available attributes are relevant to a particular target, so full-space methods usually lack discriminative power and find poor groups. Attribute-subspace methods, on the other hand, cluster the node set on some attribute subspace. Huang et al., in the article "Dense community detection in multi-valued attributed networks" published in the international journal Information Sciences in 2015, use a cell-based subspace clustering method that finds densely connected cells in subspaces and requires groups to satisfy a subspace interest threshold, a coverage threshold, and a connectivity threshold. However, existing methods usually select subspaces with unsupervised feature selection and cannot select a subspace for a specific target.

Summary of the invention

In view of the defects of the prior art, the object of the present invention is to provide a directional local group discovery method comprising the following steps:

Step 1: build the adjacency matrix A and the attribute matrix B of the network to be analyzed;

Step 2: the user provides a model node v_p, that is, an exemplar node reflecting the directional target, and the method infers from this node the set of attribute-importance weight vectors in its structural neighborhood;

Step 3: determine whether the weight vector set is empty; if it is not empty, go to step 4; if it is empty, go to step 13;

Step 4: take one weight vector out of the weight vector set;

Step 5: re-weight the network based on said weight vector to obtain the combined weight of every edge;

Step 6: extract the edges with significantly large weights on the re-weighted network, and build the group seed set from said edges;

Step 7: determine whether the group seed set is empty; if it is not empty, go to step 8; if it is empty, go to step 12;

Step 8: take one group seed out of the group seed set;

Step 9: determine whether said group seed belongs to the visited node set; if it does not, go to step 10; if it does, return to step 7;

Step 10: locally expand said group seed until the weighted conductance of the group represented by the seed is minimal, which yields one directional local group; the weighted conductance of a group is defined as the ratio of the group's weighted cut to its weighted volume;

Step 11: add the directional local group to the directional local group set under said weight vector, and update the visited node set;

Step 12: remove insignificant and duplicate groups from the directional local group set under said weight vector; an insignificant group is one for which the ratio of the sum of its internal edge weights to the sum of all edge weights in the network is less than a significance threshold, and a duplicate group is one whose intersection with a group already in the set is larger than a duplication threshold;

Step 13: output all directional local group sets.
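The control flow of steps 3 to 13 can be sketched as a pair of nested loops. The following Python skeleton is only an illustration: the function name `discover_groups` and the four callables passed into it are hypothetical placeholders for steps 5, 6, 10 and 12, and step 9 is read here as "every node of the seed has already been visited", which is an assumption about the test the text describes.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Set


def discover_groups(
    weight_vectors: Sequence[Sequence[float]],
    reweight: Callable[[Sequence[float]], dict],
    extract_seeds: Callable[[dict], Iterable[Set[int]]],
    expand: Callable[[Set[int], dict], Set[int]],
    postprocess: Callable[[List[Set[int]]], List[Set[int]]],
) -> Dict[int, List[Set[int]]]:
    """Hypothetical driver mirroring the loop structure of steps 3 to 13."""
    results: Dict[int, List[Set[int]]] = {}
    for k, weights in enumerate(weight_vectors):   # steps 3 and 4
        W = reweight(weights)                      # step 5
        visited: Set[int] = set()
        found: List[Set[int]] = []
        for seed in extract_seeds(W):              # steps 6, 7 and 8
            if set(seed) <= visited:               # step 9 (assumed subset test)
                continue
            group = expand(set(seed), W)           # step 10
            found.append(group)                    # step 11
            visited |= group
        results[k] = postprocess(found)            # step 12
    return results                                 # step 13
```

Passing the individual steps in as callables keeps the skeleton independent of how each step is implemented.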

Preferably, the adjacency matrix A in said step 1 encodes the network structure information: any element A_ij represents the topological weight of edge (v_i,v_j), and a value of 0 means that no edge exists between the corresponding node pair. The attribute matrix B encodes the network attribute information: any element B_ip represents the value of the p-th attribute of the i-th node.
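As a small illustration of step 1, the two matrices can be built from an edge list and a per-node attribute table. This is a minimal sketch in plain Python: the function name `build_matrices` is hypothetical, the network is assumed to be undirected, and nodes are numbered from 0 here for convenience.

```python
from typing import Dict, List, Sequence, Tuple


def build_matrices(
    num_nodes: int,
    weighted_edges: Sequence[Tuple[int, int, float]],
    attributes: Dict[int, Sequence[float]],
) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (A, B) for an undirected attributed network with 0-based node ids."""
    A = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j, w in weighted_edges:
        A[i][j] = w  # A[i][j] == 0 means "no edge between v_i and v_j"
        A[j][i] = w  # symmetric because the network is assumed undirected
    # Row i of B is the attribute vector of node i; B[i][p] is its p-th attribute.
    B = [list(attributes[i]) for i in range(num_nodes)]
    return A, B


A, B = build_matrices(
    3,
    [(0, 1, 1.0), (1, 2, 2.0)],
    {0: [0.1, 5.0], 1: [0.2, 4.0], 2: [0.9, 1.0]},
)
```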

Preferably, said step 2 comprises: inferring the set of attribute-importance weight vectors based on a model node v_p that reflects the directional target, wherein:

In the formula, k indexes the weight vectors in the set, each weight vector holds one weight per attribute, q is the attribute index, t is the number of attributes, and SD_q is the similarity of the model node set on the q-th attribute;

Specifically, step 2 comprises:

Step 2.1: randomly sample |P_R| node pairs in the network to form the random node pair set, where |P_R| is the number of node pairs in that set;

Step 2.2: for each attribute q, compute the sum RSum_q of the squared differences of the attribute values over all random node pairs, and compute the normalization factor RSum_q/|P_R|;

Step 2.3: extract the neighborhood network of the model node, that is, the network formed by all nodes connected to the model node, and partition the neighborhood network into the neighborhood group set NCS(v_p);

Step 2.4: determine whether the neighborhood group set is empty; if it is not empty, go to step 2.5; otherwise end step 2;

Step 2.5: take one neighborhood group NC^k out of the neighborhood group set and determine whether the number of nodes in it is greater than CS_l; if so, go to step 2.6; otherwise return to step 2.4;

Step 2.6: randomly choose CS_l nodes in said neighborhood group to form the model node set; all pairs of nodes in the model node set form the similar node pair set;

Step 2.7: for each attribute q, compute the sum SSum_q of the squared differences of the attribute values over all similar node pairs;

Step 2.8: compute the similarity SD_q of the model node set on the q-th attribute;

Step 2.9: compute the importance weight of the q-th attribute.
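Steps 2.1, 2.2 and 2.5 to 2.7 can be sketched for a single neighborhood group as follows. The sketch stops before steps 2.8 and 2.9 because the formulas for the similarity SD_q and the importance weight are not reproduced in this text; it only returns the two ingredients they are computed from. The function names and the `rng` parameter are illustrative assumptions.

```python
import random
from itertools import combinations
from typing import List, Optional, Sequence, Set, Tuple

Pair = Tuple[int, int]


def sum_sq_diff(B: Sequence[Sequence[float]], pairs: Sequence[Pair], q: int) -> float:
    """Sum of squared differences of attribute q over the given node pairs."""
    return sum((B[i][q] - B[j][q]) ** 2 for i, j in pairs)


def attribute_statistics(
    B: Sequence[Sequence[float]],
    neighborhood_group: Set[int],
    num_random_pairs: int,
    cs_l: int,
    rng: random.Random,
) -> Optional[Tuple[List[float], List[float]]]:
    """Return (normalization factors, SSum values), or None if the group is too small."""
    n, t = len(B), len(B[0])
    # Step 2.1: sample |P_R| random node pairs.
    random_pairs = [tuple(rng.sample(range(n), 2)) for _ in range(num_random_pairs)]
    # Step 2.2: normalization factor RSum_q / |P_R| for every attribute q.
    norm = [sum_sq_diff(B, random_pairs, q) / num_random_pairs for q in range(t)]
    # Step 2.5: the neighborhood group must contain more than CS_l nodes.
    if len(neighborhood_group) <= cs_l:
        return None
    # Step 2.6: draw CS_l nodes and form all similar node pairs among them.
    chosen = rng.sample(sorted(neighborhood_group), cs_l)
    similar_pairs = list(combinations(chosen, 2))
    # Step 2.7: SSum_q for every attribute q.
    ssum = [sum_sq_diff(B, similar_pairs, q) for q in range(t)]
    return norm, ssum
```

An attribute on which the group members agree gives a small SSum_q relative to its normalization factor, which is the signal the later steps turn into a weight.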

Preferably, said step 5 comprises:

Step 5.1: compute the attribute distance SED_ij of every edge in the network;

In the formula: B_i is the attribute vector of the i-th node, B_j is the attribute vector of the j-th node, the diagonal matrix has the weight vector on its diagonal, and (B_i-B_j)^T is the transpose of the difference between the attribute vectors of the i-th and j-th nodes;

Step 5.2: compute the weight of every edge, that is, re-weight the network edge weights W={W_ij} based on said weight vector:

W_{ij} = \frac{1}{\frac{SED_{ij}}{AvgD} + \frac{\gamma}{AvgA \times A_{ij}}}, \quad \forall (v_i, v_j) \in E,

In the formula: v_i is node i, v_j is node j, (v_i,v_j) is the edge between node i and node j, E is the edge set of the network, W_ij is the weight of edge (v_i,v_j), SED_ij is the attribute distance of edge (v_i,v_j), AvgD is the average attribute distance over all edges, A_ij is the structural weight of edge (v_i,v_j), where a value of 0 means that the edge does not exist, AvgA is the average structural distance over all edges, and γ is the balance parameter that controls the relative importance of attributes and structure.
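The re-weighting of step 5 can be sketched as follows. Two points are assumptions rather than statements of the patent: the attribute distance is taken to be the weighted squared difference (B_i-B_j) diag(weights) (B_i-B_j)^T, which matches the description of the diagonal matrix but whose formula is not reproduced here, and AvgA is taken to be the mean of A_ij over all edges. The sketch also assumes at least one edge and γ > 0.

```python
from typing import Dict, Sequence, Tuple

Edge = Tuple[int, int]


def attribute_distance(b_i: Sequence[float], b_j: Sequence[float],
                       weights: Sequence[float]) -> float:
    # Assumed form: (B_i - B_j) diag(weights) (B_i - B_j)^T
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, b_i, b_j))


def reweight_edges(A: Sequence[Sequence[float]], B: Sequence[Sequence[float]],
                   weights: Sequence[float], gamma: float) -> Dict[Edge, float]:
    """Return W[(i, j)] for every undirected edge with i < j."""
    n = len(A)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if A[i][j] != 0]
    sed = {(i, j): attribute_distance(B[i], B[j], weights) for i, j in edges}
    avg_d = sum(sed.values()) / len(edges)               # AvgD
    avg_a = sum(A[i][j] for i, j in edges) / len(edges)  # AvgA (assumed: mean of A_ij)
    W: Dict[Edge, float] = {}
    for i, j in edges:
        # If all attribute distances are zero, the attribute term is dropped.
        attr_term = sed[(i, j)] / avg_d if avg_d > 0 else 0.0
        struct_term = gamma / (avg_a * A[i][j])
        W[(i, j)] = 1.0 / (attr_term + struct_term)
    return W
```

Edges whose endpoints are close in the weighted attribute subspace and strongly connected receive large weights, which is what step 6 then looks for.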

Preferably, said step 6 comprises: extracting the edges with significantly large weights on the re-weighted network to build the group seed set SeedSet^k;

Specifically, step 6 comprises:

Step 6.1: sort all edges by weight in descending order to form the sorted edge list;

Step 6.2: take the first Size_BS edges of the sorted edge list to form the bootstrap set BS;

Step 6.3: fit a normal distribution using the mean and variance of the edge weights in the bootstrap set as parameters;

Step 6.4: take the edge at the front of the sorted edge list;

Step 6.5: determine whether the weight of the edge taken out fits said normal distribution; if it does, add the edge to the bootstrap set, update the mean of the normal distribution with the mean of the edge weights in the new bootstrap set, and return to step 6.4; if it does not, go to step 6.6;

Step 6.6: induce a subgraph from all edges in the bootstrap set;

Step 6.7: each connected component of the subgraph is one group seed, and all connected components together form the group seed set.
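Steps 6.1 to 6.7 can be sketched as follows. The text does not say how an edge weight is tested against the normal distribution, so the sketch assumes a simple rule: an edge is accepted while its weight is no more than `z_threshold` standard deviations below the current mean. That rule, the parameter `z_threshold`, and the function name are illustrative assumptions. Connected components are found with a small union-find.

```python
from statistics import mean, pstdev
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def extract_seeds(W: Dict[Edge, float], size_bs: int,
                  z_threshold: float = 2.0) -> List[Set[int]]:
    """Return group seeds as node sets (connected components of the selected edges)."""
    ranked = sorted(W.items(), key=lambda kv: kv[1], reverse=True)  # step 6.1
    selected = ranked[:size_bs]                                     # step 6.2
    weights = [w for _, w in selected]
    mu, sigma = mean(weights), pstdev(weights)                      # step 6.3
    for edge, w in ranked[size_bs:]:                                # step 6.4
        if w < mu - z_threshold * sigma:                            # step 6.5 (assumed test)
            break
        selected.append((edge, w))
        weights.append(w)
        mu = mean(weights)  # only the mean is refreshed, as in step 6.5

    parent: Dict[int, int] = {}

    def find(x: int) -> int:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), _ in selected:                                      # step 6.6
        parent[find(i)] = find(j)
    components: Dict[int, Set[int]] = {}
    for node in list(parent):                                       # step 6.7
        components.setdefault(find(node), set()).add(node)
    return list(components.values())
```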

Preferably, said step 10 comprises:

Step 10.1: take the group seed as the current group and compute the current weighted conductance ψ_curr of the current group, where the weighted conductance is defined as:

\psi = \frac{WCut}{\min(WVol, TVol - WVol)};

In the formula: WCut is the weighted cut of the group, that is, the sum of the weights of the edges between nodes inside the group and nodes outside it; WVol is the weighted volume of the group, that is, the sum of the weights of the edges incident to all nodes in the group; TVol is the total weighted volume of the network, that is, the sum of the weights of the edges incident to all nodes in the network;

Step 10.2: assign the current conductance to the initial conductance, ψ_init=ψ_curr;

Step 10.3: locally expand the current group until further expansion cannot reduce the conductance, and compute the new current conductance ψ_curr;

Step 10.4: locally contract the current group until further contraction cannot reduce the conductance, and compute the new current conductance ψ_curr;

Step 10.5: determine whether the initial conductance equals the current conductance; if they are equal, end step 10; if they are not equal, return to step 10.2.
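The conductance definition and the expand-then-contract loop of steps 10.1 to 10.5 can be sketched as follows. The text does not specify which node is added or removed at each move; the sketch assumes a greedy choice (the single node whose addition or removal gives the lowest conductance), and the names `extract_group` and `_best_move` are hypothetical. The graph is given as a symmetric dictionary `adj[u][v]` holding the re-weighted edge weights.

```python
from typing import Callable, Dict, Iterable, Optional, Set, Tuple

Adj = Dict[int, Dict[int, float]]  # adj[u][v] = weight of edge (u, v), symmetric


def weighted_conductance(group: Set[int], adj: Adj, t_vol: float) -> float:
    """psi = WCut / min(WVol, TVol - WVol) for one group."""
    w_cut = 0.0
    w_vol = 0.0
    for u in group:
        for v, w in adj[u].items():
            w_vol += w
            if v not in group:
                w_cut += w
    denom = min(w_vol, t_vol - w_vol)
    return w_cut / denom if denom > 0 else float("inf")


def _best_move(current: Set[int], candidates: Set[int],
               apply: Callable[[Set[int], int], Set[int]],
               adj: Adj, t_vol: float) -> Tuple[float, Optional[int]]:
    """Return (psi, node) for the single move with the lowest conductance."""
    scored = [(weighted_conductance(apply(current, v), adj, t_vol), v) for v in candidates]
    return min(scored) if scored else (float("inf"), None)


def extract_group(seed: Iterable[int], adj: Adj) -> Tuple[Set[int], float]:
    t_vol = sum(w for nbrs in adj.values() for w in nbrs.values())  # TVol
    current = set(seed)
    psi_curr = weighted_conductance(current, adj, t_vol)            # step 10.1
    while True:
        psi_init = psi_curr                                         # step 10.2
        while True:                                                 # step 10.3
            boundary = {v for u in current for v in adj[u]} - current
            psi_new, node = _best_move(current, boundary, lambda g, v: g | {v}, adj, t_vol)
            if node is None or psi_new >= psi_curr:
                break
            current.add(node)
            psi_curr = psi_new
        while len(current) > 1:                                     # step 10.4
            psi_new, node = _best_move(current, set(current), lambda g, v: g - {v}, adj, t_vol)
            if psi_new >= psi_curr:
                break
            current.remove(node)
            psi_curr = psi_new
        if psi_curr == psi_init:                                    # step 10.5
            return current, psi_curr
```

Every accepted move strictly lowers the conductance, so the loop cannot revisit a group and terminates on a finite network.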

Preferably, said step 12 comprises:

Step 12.1: sort all groups in the directional local group set by weighted conductance in ascending order;

Step 12.2: determine whether the group set is empty; if it is not empty, go to step 12.3; if it is empty, go to step 12.6;

Step 12.3: take out the group at the front of the directional local group set;

Step 12.4: determine whether the significance of the group taken out is less than the significance threshold; if it is, add the group to the removal set; if it is not, determine whether the group belongs to the removal set: if it does not, add the group to the retention set and go to step 12.5; if it does, return to step 12.2;

Step 12.5: examine every group ranked after the group taken out in the directional local group set, that is, determine whether the overlap between the group taken out and each later group is greater than the overlap threshold; if it is, add the later group to the removal set; if it is less than or equal to the overlap threshold, keep the later group in the directional local group set;

Step 12.6: form the new directional local group set from all groups in the retention set.
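The post-processing of steps 12.1 to 12.6 can be sketched as a single pass over the groups sorted by conductance. The sketch assumes that each group arrives with its weighted conductance and its significance already computed, and it measures overlap as the size of the intersection of two groups, following the definition of a duplicate group in step 12; the function name and the tuple layout are illustrative.

```python
from typing import List, Sequence, Set, Tuple

# (nodes, weighted conductance, significance); the significance is assumed to be
# the ratio of the group's internal edge weight sum to the network's edge weight sum.
Group = Tuple[Set[int], float, float]


def filter_groups(groups: Sequence[Group], sig_threshold: float,
                  overlap_threshold: int) -> List[Set[int]]:
    """Drop insignificant groups and groups that duplicate a better-ranked one."""
    ordered = sorted(groups, key=lambda g: g[1])                 # step 12.1
    removed: Set[int] = set()  # positions in `ordered` marked for removal
    retained: List[Set[int]] = []
    for idx, (nodes, _psi, significance) in enumerate(ordered):  # steps 12.2 and 12.3
        if significance < sig_threshold:                         # step 12.4
            removed.add(idx)
            continue
        if idx in removed:
            continue
        retained.append(nodes)
        for later in range(idx + 1, len(ordered)):               # step 12.5
            if len(nodes & ordered[later][0]) > overlap_threshold:
                removed.add(later)
    return retained                                              # step 12.6
```

Because the groups are visited in ascending order of conductance, the group that survives among several overlapping ones is the one with the lowest conductance.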

Compared with the prior art, the present invention has the following beneficial effects:

1. With the directional local group discovery method provided by the invention, the user only needs to supply one model node related to the target group; the target is then determined automatically by inferring the set of attribute-importance weight vectors around this node.

2. The method re-weights the network based on the inferred attribute-importance weight vectors, which emphasizes the edge weights inside the target group; the edges inside the target group are extracted to form group seeds, which are locally expanded into the target group.

3. The method post-processes the directional local group set, which effectively removes the insignificant and duplicate groups in it.

Brief description of the drawings

Other features, objects, and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:

Fig. 1 is the flowchart of the directional local group discovery method provided by the invention;

Fig. 2 compares the effectiveness of directional group discovery between the present invention and several existing methods, wherein Fig. 2(a) shows performance versus the mixing parameter, Fig. 2(b) shows performance versus the number of attributes, Fig. 2(c) shows performance versus the number of groups in a subspace, and Fig. 2(d) shows performance versus the number of network nodes;

Fig. 3 compares the running time of the present invention and several existing methods, wherein Fig. 3(a) shows running time versus the number of network edges, and Fig. 3(b) shows running time versus the number of attributes.

Detailed description of the embodiments

The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any form. It should be noted that those of ordinary skill in the art can make variations and improvements without departing from the inventive concept, and all of these fall within the protection scope of the present invention.

To illustrate the technical solution of the present invention more clearly, a specific embodiment is described below:

The directional local group discovery method provided by the invention comprises the following steps:

Step S1: build the adjacency matrix A and the attribute matrix B of the network to be analyzed: number all nodes of the network consecutively, starting from 1; element A_ij of the adjacency matrix A represents the structural weight of edge (v_i,v_j), and a value of 0 means that no edge exists between the corresponding node pair; build the attribute matrix B, whose element B_ip represents the value of the p-th attribute of the i-th node;

Step S2: infer the set of attribute-importance weight vectors based on a model node v_p that reflects the directional target;

Here, t is the number of attributes and SD_q is the similarity of the model node set on attribute f_q;

Said step S2 is specifically:

Step S21: randomly sample |P_R| node pairs to form the random node pair set;

Step S22: for each attribute f_q, compute the sum RSum_q of the squared value differences over all random node pairs, and compute the normalization factor RSum_q/|P_R|;

Step S23: extract the neighborhood network of the model node and partition it into the neighborhood group set NCS(v_p);

Step S24: determine whether the neighborhood group set is empty; if it is not, go to step S25; otherwise end step S2;

Step S25: take one neighborhood group NC^k out of the group set and determine whether its size is greater than CS_l; if so, go to step S26; otherwise return to step S24;

Step S26: choose CS_l nodes in the neighborhood group to form the model node set; all pairs of nodes in the model node set form the similar node pair set;

Step S27: for each attribute f_q, compute the sum SSum_q of the squared value differences over all similar node pairs;

Step S28: compute the similarity SD_q of the model node set on attribute f_q;

Step S29: compute the importance weight of each attribute f_q;

Step S3: determine whether the weight vector set is empty; if it is not, go to step S4; otherwise go to step S13;

Step S4: take one weight vector out of the vector set;

Step S5: re-weight the network edge weights W={W_ij} based on this weight vector:

W_{ij} = \frac{1}{\frac{SED_{ij}}{AvgD} + \frac{\gamma}{AvgA \times A_{ij}}}, \quad \forall (v_i, v_j) \in E,

where W_ij is the weight of edge (v_i,v_j), SED_ij is the attribute distance of edge (v_i,v_j), AvgD is the average attribute distance over all edges, A_ij is the structural weight of edge (v_i,v_j), where a value of 0 means that the edge does not exist, AvgA is the average structural distance over all edges, and γ is the balance parameter that controls the relative importance of attributes and structure;

Said step S5 is specifically:

Step S51: compute the attribute distance of every edge;

Step S52: compute the combined weight of every edge:

W_{ij} = \frac{1}{\frac{SED_{ij}}{AvgD} + \frac{\gamma}{AvgA \times A_{ij}}}, \quad \forall (v_i, v_j) \in E;

Step S6: extract the edges with significantly large weights on the re-weighted network to build the group seed set SeedSet^k;

Said step S6 is specifically:

Step S61: sort all edges by weight in descending order to form the sorted edge list;

Step S62: take the first Size_BS edges of the sorted edge list to form the bootstrap set BS;

Step S63: fit a normal distribution using the mean and variance of the edge weights in the bootstrap set as parameters;

Step S64: take the edge at the front of the sorted edge list;

Step S65: test whether said edge fits this normal distribution; if it does, add the edge to the bootstrap set, update the mean of the normal distribution with the mean of the edge weights in the new bootstrap set, and return to step S64; if it does not, go to step S66;

Step S66: induce a subgraph from all edges in the bootstrap set;

Step S67: each connected component of the subgraph is one group seed, and all connected components together form the group seed set;

Step S7: determine whether the seed set is empty; if it is not, go to step S8; otherwise go to step S12;

Step S8: take one group seed out of the seed set;

Step S9: determine whether this group seed belongs to the visited node set visitedNodes; if it does not, go to step S10; otherwise return to step S7;

Step S10: locally expand this group seed to optimize the weighted conductance and extract one directional local group;

Said step S10 is specifically:

Step S101: take the group seed as the current group and compute the current conductance ψ_curr;

Step S102: assign the current conductance to the initial conductance, ψ_init=ψ_curr;

Step S103: locally expand the current group until further expansion cannot reduce the conductance, and compute the new current conductance ψ_curr;

Step S104: locally contract the current group until further contraction cannot reduce the conductance, and compute the new current conductance ψ_curr;

Step S105: determine whether the initial conductance equals the current conductance; if so, end step S10; otherwise return to step S102;

Step S11: add the group to the directional local group set under the weight vector, and update the visited node set;

Step S12: remove the insignificant and duplicate groups from the directional local group set under this weight vector;

Said step S12 is specifically:

Step S121: sort all groups in the group set by conductance in ascending order;

Step S122: determine whether the group set is empty; if it is not, go to step S123; otherwise go to step S126;

Step S123: take out the next group in this order;

Step S124: determine whether the significance of the group taken out is less than the significance threshold; if it is, add the group to the removal set; if it is not, determine whether the group belongs to the removal set: if it does not, add the group to the retention set and go to step S125; if it does, return to step S122;

Step S125: examine every group ranked after the group taken out in the directional local group set, that is, determine whether the overlap between the group taken out and each later group is greater than the overlap threshold; if it is, add the later group to the removal set; if it is less than or equal to the overlap threshold, keep the later group in the directional local group set;

Step S126: form the new group set from the groups in the retention set;

Step S13: output all group sets.

To make the technical problem to be solved, the technical solution, and the advantages of this embodiment clearer, the embodiment is described in detail below with reference to the drawings.

As shown in Fig. 1, the directional local group discovery method provided by this embodiment comprises the following steps:

Step S1: build the adjacency matrix A and the attribute matrix B of the network to be analyzed: number all nodes of the network consecutively, starting from 1; build the square adjacency matrix A, whose element A_ij represents the structural weight of edge (v_i,v_j), where a value of 0 means that no edge exists between the corresponding node pair; build the attribute matrix B, whose element B_ip represents the value of the p-th attribute of the i-th node.

Here, t is the number of attributes and SD_q is the similarity of the model node set on attribute f_q.

Step S6: extract the edges with significantly large weights on the re-weighted network to build the group seed set SeedSet^k. Sort all edges by weight in descending order to form the sorted edge list, and take the first Size_BS edges of the list to form the bootstrap set BS. Fit a normal distribution using the mean and variance of the edge weights in the bootstrap set as parameters, then test the edges of the sorted edge list one by one against this distribution: if an edge fits, add it to the bootstrap set and update the mean of the normal distribution with the mean of the edge weights in the new bootstrap set; if it does not fit, stop testing. Induce a subgraph from all edges in the bootstrap set; each connected component of the subgraph is one group seed, and all connected components together form the group seed set.

Step S7: determine whether the seed set is empty; if it is not, go to step S8; otherwise go to step S12.

Step S8: take one group seed out of the seed set.

Step S9: determine whether this group seed belongs to the visited node set visitedNodes; if it does not, go to step S10; otherwise return to step S7.

Step S10: locally expand this group seed to optimize the weighted conductance and extract one directional local group. Take the group seed as the current group and compute the current conductance ψ_curr; assign the current conductance to the initial conductance, ψ_init=ψ_curr; locally expand the current group until further expansion cannot reduce the conductance, and compute the new current conductance ψ_curr; locally contract the current group until further contraction cannot reduce the conductance, and compute the new current conductance ψ_curr; determine whether the initial conductance equals the current conductance; if so, end step S10; otherwise return and continue expanding the group.

Step S11: add the group to the directional local group set under the weight vector, and update the visited node set.

Step S12: remove the insignificant and duplicate groups from the directional local group set under this weight vector. Sort all groups in the group set by conductance in ascending order and determine whether the group set is empty. If it is not empty, take out the groups one at a time in ascending order of conductance and determine whether the significance of each is less than the significance threshold. If it is, add the group to the removal set; otherwise determine whether the group belongs to the removal set. If it does not, add the group to the retention set and determine whether the overlap between this group and each group ranked after it is greater than the overlap threshold; if it is, add the later group to the removal set. Repeat this process until all groups in the original group set have been processed; the groups in the retention set form the new group set.

Step S13: output all group sets.

The effectiveness of this embodiment is further illustrated by the following simulation experiments. It should be noted that the parameters used in the experiments do not affect the generality of the invention.

1) Simulation conditions:

CPU: Intel i7-3770S 3.10 GHz; RAM: 16.00 GB; operating system: Windows 10; software: Matlab 2013.

2) Simulation content:

Synthetic attributed networks are used for the experiments. The LFR benchmark proposed by Lancichinetti and Fortunato is used to generate data sets with different mixing parameters μ and different sizes n. The mixing parameter μ controls the degree of mixing of the network: the larger its value, the more mixed the network and the harder it is to discover groups accurately. On each LFR benchmark network, an attribute vector of length AN is attached to every node, and a numerical-attribute (num) network, a binary-attribute (bin) network, and a categorical-attribute (cate) network are generated. Two attribute subspaces are assumed to exist; each subspace has 5 important attributes and NCS directional groups.

This embodiment is denoted TLCD in the simulation experiments.

This embodiment is compared in simulation with 4 other group discovery methods, as follows. The Louvain method, proposed by Blondel et al. in "Fast unfolding of communities in large networks", published in the Journal of Statistical Mechanics in 2008, uses only network topology information. The PICS method, proposed by Akoglu et al. in "PICS: Parameter-free identification of cohesive subgroups in large attributed graphs", published at SDM in 2012, is a full-space method that uses both structure information and attribute information. The BAGC method, proposed by Xu et al. in "A model-based approach to attributed graph clustering", published at the international conference SIGMOD in 2012, is a full-space method that uses both topology information and attribute information. The FocusCO method, proposed by Perozzi et al. in "Focused clustering and outlier detection in large attributed graphs", published at SIGKDD in 2014, is an attribute-subspace method that uses both topology information and attribute information.

The effectiveness results of the simulation experiments are shown in Fig. 2(a) to Fig. 2(d). PICS performs worst in all cases, and BAGC performs second worst in most cases, which shows the weakness of full-space methods. TLCD performs best in all cases, while FocusCO is relatively unstable and performs worse. The running-time results are shown in Fig. 3(a) and Fig. 3(b). PICS needs the longest running time in all cases. The running time of BAGC is short when the network is small but grows as the network grows. The running times of TLCD and FocusCO depend mainly on expanding the group seeds, so their running-time curves are approximately flat.

The directional local group discovery method provided by this embodiment can be used to visualize networks with node attribute information and to discover group structures for a specific target, and it is applicable to a variety of social applications. The embodiment infers attribute-importance weight vectors from a model node that reflects the directional target; re-weights the network edges based on each weight vector; extracts the edges with significantly large weights on the re-weighted network to form group seeds; locally expands the group seeds to optimize the weighted conductance and extract directional local groups; and removes the insignificant and duplicate groups from the extracted directional local groups.

Specific embodiments of the invention have been described above. It should be understood that the invention is not limited to the particular embodiments described; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the invention.

Claims

1. A directional local group discovery method, characterized by comprising the following steps:

Step 4: take one weight vector out of the weight vector set;

Step 8: take one group seed out of the group seed set;

Step 13: output all directional local group sets.

2. The directional local group discovery method according to claim 1, characterized in that the adjacency matrix A in said step 1 encodes the network structure information, any element A_ij of the matrix represents the topological weight of edge (v_i,v_j), and a value of 0 means that no edge exists between the corresponding node pair; the attribute matrix B encodes the network attribute information, and any element B_ip of the attribute matrix B represents the value of the p-th attribute of the i-th node.

3. The directional local group discovery method according to claim 1, characterized in that said step 2 comprises: inferring the set of attribute-importance weight vectors based on a model node v_p that reflects the directional target, wherein:

Specifically, step 2 comprises:

Step 2.9: compute the importance weight of the q-th attribute.

4. The directional local group discovery method according to claim 1, characterized in that said step 5 comprises:

Step 5.1: compute the attribute distance of every edge in the network;

W_{ij} = \frac{1}{\frac{SED_{ij}}{AvgD} + \frac{\gamma}{AvgA \times A_{ij}}}, \quad \forall (v_i, v_j) \in E,

5. The directional local group discovery method according to claim 1, characterized in that said step 6 comprises: extracting the edges with significantly large weights on the re-weighted network to build the group seed set SeedSet^k;

Specifically, step 6 comprises:

Step 6.4: take the edge at the front of the sorted edge list;

Step 6.6: induce a subgraph from all edges in the bootstrap set;

6. The directional local group discovery method according to claim 1, characterized in that said step 10 comprises:

\psi = \frac{WCut}{\min(WVol, TVol - WVol)};

7. The directional local group discovery method according to claim 1, characterized in that said step 12 comprises: