CN108009575A - A kind of community discovery method for complex network - Google Patents

Info

Publication number
CN108009575A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711215141.5A
Other languages
Chinese (zh)
Inventor
胡文斌
许平华
邱振宇
高旷
唐传慧
刘中舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201711215141.5A priority Critical patent/CN108009575A/en
Publication of CN108009575A publication Critical patent/CN108009575A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention proposes a community discovery method for complex networks. The algorithm designs node transition probabilities by analyzing the network topology, assesses the importance of each node to its community using a random walk method, constructs network communities around the nodes of higher importance, and finally adjusts the community structure with a community edge-pruning method. Unlike existing random-walk-based methods, CDATP assigns asymmetric transition probabilities to the nodes in the network and assesses the importance of a node to a community using only local node transfers.

Description

A kind of community discovery method for complex network
Technical field
The present invention relates to the fields of computer science and social networks, and proposes a network community discovery algorithm based on the asymmetric transfer probability of nodes (a community detection algorithm based on asymmetric transfer probability of nodes, CDATP). To address the drawbacks of existing community discovery algorithms, it proposes a new measure of node transition probability and a community division method based on the rules of event propagation.
Background art
In recent years, the community structure that is prevalent in networks has received extensive attention from scholars at home and abroad. Research on community discovery has been applied in many fields with good results.
Using random walks for community discovery is one of the mainstream research approaches. Random walk is a method based on the Markov model. Its main idea is to release a large number of random walkers with some initial distribution and, after a diffusion process, obtain the distribution function of the walkers. Within a community the nodes have more connections, whereas connections between different communities are relatively few. Therefore, a walker that chooses its direction at random will remain trapped inside a community for a longer time.
Through a series of studies, scholars have proposed several community discovery algorithms based on random walks. The Walktrap algorithm proposed by Pons et al. first estimates the probability that nodes belong to the same community and then finds communities by hierarchical clustering; it is one of the earliest random-walk-based methods. Darong Lai et al. find communities in directed networks by converting edge direction information into edge weights in an undirected network. The algorithm proposed by Qingju Jiao et al. considers both the global and the local topological structure and performs well on synthetic data sets. The SCMAG algorithm proposed by Xin Huang et al. builds attribute-based communities by computing the similarity of node attributes. Chang Su et al. combine random walks with a label propagation algorithm to find communities. Di Jin et al. use a random-walk-based ant colony optimization algorithm to decide which community a node belongs to. However, in the node transfer step these algorithms mostly use undifferentiated transition probabilities, which cannot reflect the differences among node relationships in real networks. Their community division accuracy is strongly affected by the number of transfer iterations, and they need considerable prior knowledge to aid decisions, so their feasibility is low.
In the early days of community discovery research, most scholars took undirected networks as the object of study. However, as community discovery techniques have been applied more widely in real production scenarios, more and more scholars have begun to focus on how to find communities in directed networks. One early approach simply ignored edge direction and treated the network as undirected, but some scholars have pointed out that edge direction should be taken into account, otherwise key features of the network are lost. One important reason is that once edge direction is ignored, the relationships between nodes become incomplete. For example, when a Twitter user unilaterally follows another user, the "positions" of the two are unequal, but an undirected edge cannot describe this relationship. Node transfer in a random walk is inherently directional, so the related community discovery algorithms can make the transition from undirected to directed networks more easily. Methods based on index optimization need to redesign their quality index according to the characteristics of directed networks, for example the directed version of modularity proposed by Newman et al. In addition, the Infomap algorithm proposed by Rosvall et al. and the OSLOM algorithm proposed by Lancichinetti et al. are both classic community discovery algorithms applicable to directed networks, and both perform well on artificially constructed networks. However, methods based on index optimization cannot fully account for the complexity of real networks and adapt poorly. In directed networks, random-walk-based methods become more sensitive to the number of transfer iterations and need more prior knowledge to aid decisions, so their feasibility is low.
Summary of the invention
The present invention mainly solves the technical problems present in the prior art. It provides a network community discovery algorithm based on the asymmetric transfer probability of nodes (a community detection algorithm based on asymmetric transfer probability of nodes, CDATP). To address the drawbacks of existing community discovery algorithms, it proposes a new measure of node transition probability and a community division method based on the rules of event propagation. On the one hand, this makes full use of network topology information and reflects the inequality between nodes; on the other hand, it reduces the need for preparatory experiments and expert knowledge. Specifically, to address the poor performance of existing community discovery algorithms on real networks, the method combines the rules of event propagation with a random walk method to evaluate the importance of nodes to communities, and divides communities on that basis. The community discovery algorithm CDATP comprises two steps: (1) construct the attribute subspace and generate the attribute-enhanced network; (2) assess the importance of nodes to communities, let the nodes spontaneously gather toward their communities, and then trim the community edges. The invention performs well on both synthetic and real data sets.
The above technical problems are mainly solved by the following technical solution:
A community discovery method for complex networks, characterized by comprising the following steps:
Step 1: Find a sub-attribute space with good properties within the attribute space of the network, map it to virtual nodes in the network, and build virtual links between the original nodes and the virtual nodes, thereby converting attribute information into topological structure information. This comprises the following sub-steps:
Step 1.1: Based on the statistical information of the attributes, construct a sub-attribute space with good properties;
Step 1.2: Add a number of virtual nodes to the original network according to the size of the domain of the sub-attribute space, each virtual node representing one attribute value, and establish virtual links according to the correspondence between node attribute values and virtual nodes;
Step 2: Based on the topological structure of the network, quantify the asymmetric transition probabilities between nodes, and obtain the core coefficient of each node through restricted random transfers, using it to assess the importance of the node to its community. This comprises the following sub-steps:
Step 2.1: Quantify the asymmetric transition probabilities of the nodes based on network topology information;
Step 2.2: Simulate the reverse propagation of events with a restricted random walk method, and use it to assess the core coefficients of the nodes; the assessment method is as follows:
$$\mathrm{Core} = \big((1 - \mathit{back})\,P\big)^{2}\,(1, 1, \ldots, 1)_{N}^{T}$$
where back denotes the backtracking probability in the random walk process, P denotes the node transition probability matrix, and N is the number of nodes in the original network;
Step 3: Determine the clustering direction of each node based on the transition probabilities and core coefficients so that the nodes cluster spontaneously, then adjust the cluster shapes according to the core coefficients of the nodes at the cluster edges to form the community sequence. This comprises the following sub-steps:
Step 3.1: Based on the transition probabilities and core coefficients of the nodes, determine the clustering direction of each node; the rules are as follows:
If either of the following conditions holds, the clustering direction of node i is i → j;
Condition 1: W_ij = 1, and the core coefficient of node j is greater than the core coefficients of all other adjacent nodes of node i;
Condition 2: W_ij = 1, the core coefficient of node j is greater than or equal to the core coefficients of all other adjacent nodes of node i, and f_ij is greater than all other elements of row i of matrix f;
Step 3.2: Let the nodes spontaneously cluster into denser clusters, i.e. the initial communities;
Step 3.3: Trim the shapes of the clusters according to the relevant properties of the nodes at the cluster edges, and output the community sequence.
In the above community discovery method for complex networks, step 1.1 comprises the following sub-steps:
Step 1.1.1: Compute the information entropy of each attribute in the attribute space, remove the attributes whose information entropy exceeds the threshold t_h, and sort the remaining attributes in ascending order of information entropy;
Step 1.1.2: Add the remaining attributes to the sub-attribute space one by one; if adding an attribute causes the structure influence degree of the sub-attribute space to exceed the threshold t_a, remove that attribute.
In the above community discovery method for complex networks, step 2.1 comprises the following sub-steps:
Step 2.1.1: Assess the influence between nodes, which consists of three components:
Sub-influence 1, the influence produced by out-link relationships
where W is the adjacency matrix of the network and α_out is the out-link factor;
Sub-influence 2, the influence produced by in-link relationships
where α_in is the in-link factor;
Sub-influence 3, the influence produced by attribute relationships
where I is the adjacency matrix after the virtual nodes are added and α_attr is the attribute factor;
The influence between nodes is the sum of the three sub-influences:
Step 2.1.2: Normalize each row of the influence matrix; the element f_ij of the resulting matrix is the probability of a random transfer from node i to node j. The normalization method is as follows:
In the above community discovery method for complex networks, step 3.2 comprises the following sub-steps:
Step 3.2.1: Initialize a community and move the node with the largest core coefficient in the node list into it;
Step 3.2.2: If new nodes have joined the community, move into the community those nodes in the node list whose transfer direction points to a newly added node;
Step 3.2.3: If new nodes joined the community, go to step 3.2.2; otherwise output the community and go to step 3.2.1. If the node list is empty, terminate.
In the above community discovery method for complex networks, step 3.3 comprises the following sub-steps:
Step 3.3.1: For each node, compute the sum of the core coefficients of its adjacent nodes in each community, using the community each adjacent node belongs to as the label, and mark the community with the largest sum as the new community of the node;
Step 3.3.2: If no node's community label has changed, output the community sequence. Otherwise, move the nodes into their new communities; if this step has already been executed twice, output the community sequence, otherwise go to step 3.3.1.
Therefore, the invention has the following advantages: it designs node transition probabilities by analyzing the network topology, assesses the importance of each node to its community using a random walk method, constructs network communities around the nodes of higher importance, and finally adjusts the community structure with a community edge-pruning method. Unlike existing random-walk-based methods, CDATP assigns asymmetric transition probabilities to the nodes in the network and assesses the importance of a node to a community using only local node transfers.
Brief description of the drawings
Fig. 1 shows the overall framework of CDATP.
Fig. 2 shows the relationship between out-links, in-links, attributes and influence.
Fig. 3 shows the test results of CDATP on public data sets.
Detailed description of the embodiments
The technical solution of the present invention is further described below through embodiments and with reference to the accompanying drawings.
Embodiment:
1. The overall framework of CDATP is introduced first.
Fig. 1 depicts the overall framework with which CDATP performs community detection. The input data set includes complex networks such as social networks, and the output is a community sequence. The framework consists of the following two parts:
(1) In the subspace construction stage, find the best-performing sub-attribute space and convert the corresponding attributes into virtual nodes in the network, constructing the attribute-enhanced network;
(2) In the community division stage, compute node transition probabilities on the attribute-enhanced network, assess the node core coefficients with a random walk method, determine the clustering direction of each node on that basis, create the initial communities, then perform edge pruning, and finally output the community sequence.
2. Subspace construction and the attribute-enhanced graph are described next.
People usually develop their social circles according to their hobbies and activities, and objects of all kinds are likewise classified by their characteristics and functions; attributes are an important "bridge" that establishes connections between objects. Unlike the study of highly abstract networks, when studying the more complex scenarios of the real world, ignoring the attributes of the nodes themselves is likely to miss important information. Nodes belonging to the same community often have similar or even identical attribute values. Taking these attributes into account allows the strength of the connections between nodes to be measured more rigorously, makes community boundaries clearer, and also helps explain, from the perspective of node attributes, why a community structure forms.
To measure the influence of attributes on the connections between nodes, attributes are converted into virtual nodes in the network to construct the attribute-enhanced network. For an attribute A with Dom(A) = {a1, a2, a3, ..., an} after discretization, n virtual nodes are added to the graph, in one-to-one correspondence with the values of A, and a bidirectional edge is established between each original network node and the virtual node of its attribute value. Nodes that share an attribute value thus gain a new connection through the virtual node as an intermediary (a virtual node can be regarded as part of an edge rather than as an independent node).
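By way of non-limiting illustration, the construction of the attribute-enhanced network described above may be sketched in Python as follows. The function name `build_attribute_augmented_graph`, the edge-list and attribute-dictionary input formats, and the `("attr", name, value)` naming of virtual nodes are assumptions made for this sketch and are not part of the original disclosure.

```python
from collections import defaultdict


def build_attribute_augmented_graph(edges, node_attrs):
    """Return adjacency sets of the original directed graph plus virtual nodes.

    edges      -- iterable of (u, v) directed edges of the original network
    node_attrs -- dict mapping node -> {attribute name: discretised value}
    Virtual nodes are keyed as ("attr", name, value); this naming is hypothetical.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v]  # touch v so it appears as a key even without out-edges
    for node, attrs in node_attrs.items():
        for name, value in attrs.items():
            virtual = ("attr", name, value)
            adj[node].add(virtual)  # original node -> virtual node
            adj[virtual].add(node)  # virtual node -> original node
    return dict(adj)


graph = build_attribute_augmented_graph(
    edges=[("a", "b")],
    node_attrs={"a": {"city": "Wuhan"}, "c": {"city": "Wuhan"}},
)
```

In this example, nodes `a` and `c` have no edge in the original network but become connected through the shared virtual node for the value `Wuhan`.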
However, not all attributes are valuable. Adding every node attribute to the computation not only greatly reduces computational efficiency but may even lower community detection accuracy, because too many connections are created between different communities. Consider first the use of a single attribute. Suppose a network has two communities C1 and C2 with very few connections between them. If the binary attribute sex describing gender is considered, connections are created between nodes with the same sex value (assuming such node pairs exist between C1 and C2), but there are so many of these connections that they destroy the relative independence of C1 and C2 and harm the community division. If the attribute ID describing an identity card number is considered, every node has a different ID value, so no new connections are created between nodes and the attribute does not help the division either.
Therefore, suitable attributes must be selected. Such an attribute should satisfy the following two conditions:
(1) nodes that are closely connected in structure have a high probability of sharing the same value of the attribute;
(2) after the attribute is taken into account, as few connections as possible are created between different communities.
Moreover, sharing a single attribute value often does not indicate a strong connection between nodes. The more attributes are considered, the less likely it is that attribute values coincide by chance, so considering a sub-attribute space containing several attributes is more reliable than considering a single attribute. If all the attributes in the sub-attribute space are treated as one composite attribute, it should likewise satisfy the two conditions above.
Information entropy can be understood as a measure of the probability of occurrence of specific information. For an attribute A with Dom(A) = {a1, a2, a3, ..., ak}, where the number of nodes with attribute value ai is ni, the information entropy H(A) is calculated by formula (1). Because it has no repeated values, the ID attribute mentioned above has a very large information entropy, and such an attribute is meaningless; attributes whose information entropy exceeds the threshold ht should therefore not be added to the sub-attribute space.
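Formula (1) is not reproduced in this text. As a hedged illustration only, the following sketch assumes that H(A) is the standard Shannon entropy of the attribute-value frequencies ni / N with base-2 logarithms; the function name `attribute_entropy` and the choice of logarithm base are assumptions.

```python
import math
from collections import Counter


def attribute_entropy(values):
    """Shannon entropy (base 2, an assumption) of a list of attribute values."""
    counts = Counter(values)          # n_i for each distinct value a_i
    total = sum(counts.values())      # N, the number of nodes
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


sex_entropy = attribute_entropy(["m", "f", "m", "f"])  # few distinct values
id_entropy = attribute_entropy([101, 102, 103, 104])   # all values distinct
```

Under this assumption the ID-like attribute, whose values never repeat, receives the larger entropy, which agrees with the reasoning above for filtering it out with the threshold ht.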
The influence of an attribute on the topology can be measured by the change in network connectivity. If there is a directed edge from node vi to node vj, then vi is considered connected to vj. Suppose a cursor can move along the direction of directed edges; if vj can be reached from vi in fewer than step transfers, then vi is also considered connected to vj. On this basis the connectivity matrix Conn can be constructed: Conn_ij = 1 means that vi is connected to vj, and Conn_ij = 0 means that it is not. The sum of all elements of Conn is denoted Sum(Conn).
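As a non-limiting sketch, the connectivity matrix Conn of a network without virtual nodes may be computed with a depth-bounded breadth-first search. The function names, the 0/1 adjacency-matrix input, and the treatment of the diagonal (a node is not counted as connected to itself) are assumptions of this sketch; the bound `max(1, step - 1)` is one reading of "a direct edge, or fewer than step transfers".

```python
from collections import deque


def connectivity_matrix(adj, step):
    """Conn[i][j] = 1 if j is reachable from i by a direct edge or in fewer
    than `step` transfers along directed edges (diagonal left at 0)."""
    n = len(adj)
    limit = max(1, step - 1)  # direct edges always count as connected
    conn = [[0] * n for _ in range(n)]
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if dist[u] == limit:
                continue  # do not expand beyond the hop limit
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for v in dist:
            if v != src:
                conn[src][v] = 1
    return conn


def matrix_sum(m):
    """Sum of all elements of a matrix given as a list of rows."""
    return sum(sum(row) for row in m)


# Directed chain 0 -> 1 -> 2 -> 3
chain = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
conn = connectivity_matrix(chain, step=3)
```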
When computing Conn for the attribute-enhanced graph, if the cursor moves to a virtual node it continues with one extra move, which is not counted in the total number of moves. Let Conn_1 be the connectivity matrix of the original network and Conn_2 the connectivity matrix of the network enhanced with attribute A. The structure influence degree of the attribute, Affect(A), is calculated by formula (2).
where sum(m) is the sum of the elements of matrix m and α_A is a matrix scaling factor intended to distinguish the structure influence degrees of attributes more clearly; it is set to 1.2 here.
The sub-attribute space should have both a small information entropy and a small structure influence degree. It is constructed as follows:
(1) compute the information entropy of each attribute and screen out the attributes whose information entropy is greater than the threshold ht;
(2) compute the structure influence degree of the remaining attributes, and add to the sub-attribute space the attribute whose structure influence degree is below the threshold at and whose information entropy is smallest;
(3) sort the remaining attributes by information entropy in ascending order; if adding an attribute keeps both the information entropy and the structure influence degree of the sub-attribute space below their thresholds, add it to the sub-attribute space.
Let Adj be the adjacency matrix of the original network. From the attribute-enhanced network constructed under the sub-attribute space, a relation increment matrix IncreAdj can be obtained that represents the change in node relationships under the sub-attribute space: for A_ij ∈ IncreAdj, A_ij = 1 means that node i and node j have the same attribute value.
3. Finally, clustering direction and community division are introduced.
Because prior knowledge is lacking, clustering methods that require the number of clusters in advance rarely work well for community discovery. To obtain a better community structure, a clustering method is proposed that determines the cluster cores automatically and lets the nodes outside a core gather toward it along their own clustering directions. The resulting communities not only have high internal cohesion but also have a clear hierarchical structure, which facilitates further study of event propagation within the community.
The state of a node has a certain probability of changing because of the behavior of its adjacent nodes. If a cursor can move from a node to any of its adjacent nodes and tends to move to the node that is more likely to affect the state of the current node, then after a certain number of transfers the cursor is likely to land "upstream" of the starting node in the event propagation flow.
The concept of node influence, force, is introduced to describe the effect that the adjacent nodes of a node can have on it. Let force_ij be the influence of node vj on node vi, with force_ij = fout_ij + fin_ij + attr_ij, where fout_ij is the influence produced by the out-links of node vi, fin_ij is the influence produced by the in-links of node vi, and attr_ij is the influence produced by attribute relationships.
Take Weibo as an example. As shown in Fig. 2, the width of an edge represents the magnitude of the influence, and follow and follower relationships can be represented by directed edges in the network. If user A follows user B, the content produced by B has some influence on A. The amount of information A receives is related to the total number of users A follows: the more users A follows, the more information A receives, the smaller the share of B's content in it, and the weaker B's attraction for A. Conversely, when A follows few users, B's influence on A is stronger. At the same time, because A has become a follower of B, B will also be attracted to some extent by the content A produces. Likewise, the strength of this influence depends on the number of followers B has, but it is much smaller than the attraction produced by the follow relationship. The new relationships produced by attributes also contribute to influence, with a magnitude between the two.
Exponential functions with the natural constant e as base are used to describe how fout_ij and fin_ij vary; their formulas are (3) and (4), respectively. Here sum(Adj_{i,}) is the sum of the elements of row i of matrix Adj and sum(Adj_{,j}) is the sum of the elements of column j of matrix Adj. α_out and α_in are the out-link and in-link coefficients, respectively, which control the convergence rate of the functions.
The formula for attr_ij is (5); its magnitude is related to the number of attributes in the subspace and to the number of nodes sharing the attributes. Here n is the number of attributes in the sub-attribute space, sum(IncreAdj_{i,}) is the sum of the elements of row i of matrix IncreAdj, and α_attr is the attribute coefficient, which controls the convergence rate of the function.
After the influence matrix is obtained, each row of the matrix is normalized by formula (6), yielding the transition probability matrix Trans, where Trans_ij is the probability that the cursor moves from node vi to node vj.
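Formula (6) is not reproduced in this text. The following sketch assumes the usual row normalization, in which every element is divided by the sum of its row; the function name `row_normalise` and the handling of all-zero rows are assumptions.

```python
def row_normalise(force):
    """Divide each row of the influence matrix by its row sum so that every
    non-empty row becomes a probability distribution over target nodes."""
    trans = []
    for row in force:
        total = sum(row)
        if total > 0:
            trans.append([x / total for x in row])
        else:
            trans.append([0.0] * len(row))  # isolated node: no outgoing transfer
    return trans


trans = row_normalise([[0, 1, 3], [2, 2, 0], [0, 0, 0]])
```

Because each row is normalized separately, Trans_ij and Trans_ji generally differ even when force_ij equals force_ji, which is one way the asymmetry of the transition probabilities can arise.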
In the real world, whether in an ant colony or in a Reddit interest group, the members of a community are not entirely "equal"; there is a pyramid-like membership structure. Inspired by this "inequality" within communities, the concept of the node core coefficient is introduced to evaluate whether a node sits at the "top of the pyramid" in its community: the larger the core coefficient of a node, the more likely it is to become the core of a cluster. The core coefficient is computed as follows. A cursor starts from each node in the network in turn and randomly selects the target of its next transfer according to the transition probabilities in Trans. After each departure it takes step steps in total; after each step it retreats to the previous node with probability back; the node where it finally stops is the terminal node. The core coefficient Core of each node is the expected number of times that the node is the terminal node.
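As a non-limiting illustration, the walk described above may be approximated by Monte Carlo simulation. The function name `core_coefficients`, the `trials` and `seed` parameters, and the rule that a cursor stays in place at a node with no outgoing probability are assumptions of this sketch; it estimates the expectation by averaging and is not the closed-form evaluation.

```python
import random


def core_coefficients(trans, step, back, trials=1000, seed=0):
    """Estimate, for every node, the expected number of walks that end on it
    when one walk of `step` moves is started from each node of the network."""
    rng = random.Random(seed)
    n = len(trans)
    nodes = list(range(n))
    hits = [0] * n
    for start in nodes:
        for _trial in range(trials):
            prev, cur = start, start
            for _move in range(step):
                if sum(trans[cur]) <= 0:
                    break  # no outgoing transfer: the cursor stays here
                nxt = rng.choices(nodes, weights=trans[cur])[0]
                prev, cur = cur, nxt
                if rng.random() < back:
                    prev, cur = cur, prev  # retreat to the previous node
            hits[cur] += 1
    return [h / trials for h in hits]


# Nodes 1 and 2 both transfer to node 0; node 0 transfers to either of them.
trans = [[0.0, 0.5, 0.5], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
core = core_coefficients(trans, step=1, back=0.0)
```

Each walk ends on exactly one node, so the estimated coefficients always sum to the number of nodes.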
On the basis of the core coefficients, the present invention provides a clustering method that does not require the number of clusters to be specified in advance; in this method each node in the network determines its clustering direction and gathers together with the corresponding nodes.
The clustering direction is determined jointly by the transition probabilities and the core coefficients. If Trans_ij is the unique maximum of row i of the matrix Trans, the clustering direction of node vi is j; if several elements equal the maximum, the one with the largest core coefficient is taken as the clustering direction. In this way the clustering direction list Dir = {d1, d2, d3, ..., dN} is established, where di is the clustering direction of node vi.
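The rule above may be sketched as follows. The function name `cluster_directions`, the exclusion of self-transfers, and the use of `None` for a node with no outgoing transition probability are assumptions of this sketch.

```python
def cluster_directions(trans, core):
    """For each node i, pick the target j with the largest Trans[i][j];
    ties are broken by the largest core coefficient."""
    directions = []
    for i, row in enumerate(trans):
        candidates = [j for j in range(len(row)) if j != i and row[j] > 0]
        if not candidates:
            directions.append(None)  # node has nowhere to transfer to
            continue
        best = max(row[j] for j in candidates)
        tied = [j for j in candidates if row[j] == best]
        directions.append(max(tied, key=lambda j: core[j]))
    return directions


trans = [[0.0, 0.5, 0.5], [1.0, 0.0, 0.0], [0.3, 0.7, 0.0]]
core = [1.0, 0.5, 1.5]
dirs = cluster_directions(trans, core)
```

In this example node 0 has two equally likely targets, and the tie is resolved in favour of node 2 because its core coefficient is larger.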
A new network is built by removing all edge information from the original network, and edges are then added according to the following steps:
(1) select the node vi with the largest core coefficient; if it is not connected to any edge, take it as the centre of a new cluster;
(2) set the clustering direction of the centre node to empty and scan Dir; if there is a node vj that is not connected to any edge and whose clustering direction dj = i, establish a directed edge from node vj to node vi;
(3) repeat the above steps until no node has degree 0.
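By way of illustration only, the growth of the initial communities may be sketched as below. The sketch follows the reading of steps 3.2.1 to 3.2.3, in which nodes whose clustering direction points at a newly added member are pulled into the community in successive rounds. The function name `initial_communities`, the list-of-lists output, and the deterministic tie-break by node index are assumptions.

```python
def initial_communities(directions, core):
    """Grow communities around the unassigned node with the largest core
    coefficient by repeatedly absorbing nodes that point at new members."""
    unassigned = set(range(len(directions)))
    communities = []
    while unassigned:
        # centre of the next cluster; ties resolved by the smaller index
        centre = max(unassigned, key=lambda i: (core[i], -i))
        unassigned.remove(centre)
        community = [centre]
        frontier = [centre]
        while frontier:
            newly_added = [j for j in sorted(unassigned) if directions[j] in frontier]
            unassigned.difference_update(newly_added)
            community.extend(newly_added)
            frontier = newly_added
        communities.append(community)
    return communities


core = [3.0, 1.0, 0.5, 2.0, 0.2]
directions = [1, 0, 0, 4, 3]
groups = initial_communities(directions, core)
```

Here node 0 seeds the first community and absorbs nodes 1 and 2, and node 3 then seeds a second community that absorbs node 4. The clustering direction of a centre is simply ignored, which corresponds to setting it to empty.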
This yields the initial communities; edge pruning is then carried out as follows:
(1) for each node in the network, compute the sum of the core coefficients of its adjacent nodes within its own community, then compute the sum of the core coefficients of its adjacent nodes in each other community, and mark the community with the largest sum as the new community of the node, without yet moving the node into the new community;
(2) after the new communities of all nodes have been obtained, move all nodes into their new communities; if the community of any node has changed, repeat (1), otherwise stop.
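The edge pruning procedure may be sketched as a synchronous relabelling loop. The function name `prune_edges`, the neighbour-list input, the `max_rounds` cap (taken from the limit of two executions stated in step 3.3.2), and the rule that a node keeps its current community on ties are assumptions of this sketch.

```python
def prune_edges(neighbours, labels, core, max_rounds=2):
    """Relabel every node to the community whose neighbouring members have
    the largest total core coefficient; all nodes are updated together."""
    labels = list(labels)
    for _round in range(max_rounds):
        new_labels = []
        for node, nbrs in enumerate(neighbours):
            score = {}
            for nb in nbrs:
                score[labels[nb]] = score.get(labels[nb], 0.0) + core[nb]
            if not score:
                new_labels.append(labels[node])  # isolated node keeps its label
                continue
            best = max(score.values())
            if score.get(labels[node], float("-inf")) == best:
                new_labels.append(labels[node])  # tie: stay where it is
            else:
                new_labels.append(min(c for c, s in score.items() if s == best))
        if new_labels == labels:
            break  # no community label changed
        labels = new_labels
    return labels


# Node 2 starts in community 0 but all of its neighbours are in community 1.
neighbours = [[1], [0], [3, 4], [2, 4], [2, 3]]
pruned = prune_edges(neighbours, labels=[0, 0, 0, 1, 1], core=[1.0] * 5)
```

In this example node 2 is moved to community 1, after which no label changes and the loop stops.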
To verify that community edge pruning improves the accuracy of the algorithm, tests were carried out on 4 real public data sets; the results are shown in Fig. 3. The edge pruning step improves the quality of the initial communities.
The specific embodiments described herein merely illustrate the spirit of the invention. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiments, or substitute them in similar ways, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims (5)

1. A community discovery method for complex networks, characterized by comprising the following steps:
Step 1: Find a sub-attribute space with good properties within the attribute space of the network, map it to virtual nodes in the network, and build virtual links between the original nodes and the virtual nodes, thereby converting attribute information into topological structure information. This comprises the following sub-steps:
Step 1.1: Based on the statistical information of the attributes, construct a sub-attribute space with good properties;
Step 1.2: Add a number of virtual nodes to the original network according to the size of the domain of the sub-attribute space, each virtual node representing one attribute value, and establish virtual links according to the correspondence between node attribute values and virtual nodes;
Step 2: Based on the topological structure of the network, quantify the asymmetric transition probabilities between nodes, and obtain the core coefficient of each node through restricted random transfers, using it to assess the importance of the node to its community. This comprises the following sub-steps:
Step 2.1: Quantify the asymmetric transition probabilities of the nodes based on network topology information;
Step 2.2: Simulate the reverse propagation of events with a restricted random walk method, and use it to assess the core coefficients of the nodes; the assessment method is as follows:
$$Core = \big((1 - back)\,P\big)^{2}\,(1, 1, \ldots, 1)_{N}^{T}$$
Wherein back denotes the backtracking probability in the random walk process, P denotes the node transition probability matrix, and N is the number of nodes in the original network;
Step 3: Determine the clustering direction of each node based on the transition probabilities and core coefficients so that the nodes cluster spontaneously, then adjust the shape of each cluster according to the core coefficients of the nodes at the cluster edge to form the community sequence. The specific implementation includes the following sub-steps:
Step 3.1: Determine the clustering direction of each node based on its transition probabilities and core coefficient. The rule is as follows:
If either of the following conditions holds, the clustering direction of node i is i → j;
Condition one: $W_{ij} = 1$, and the core coefficient of node j is greater than the core coefficients of all other adjacent nodes of node i;
Condition two: $W_{ij} = 1$, the core coefficient of node j is greater than or equal to the core coefficients of all other adjacent nodes of node i, and $f_{ij}$ is greater than all other elements of the i-th row of matrix f;
Step 3.2: Let the nodes cluster spontaneously into denser clusters, i.e. the initial communities;
Step 3.3: According to the relevant characteristics of the nodes at the cluster edge, trim the shape of the clusters and output the community sequence.
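The core coefficient formula of step 2.2 and the direction rule of step 3.1 can be illustrated with a small pure-Python sketch. The function names `core_coefficients` and `cluster_direction`, the list-of-lists matrix layout, and returning `None` when neither condition holds are assumptions made for illustration; the claim itself does not prescribe them. Note that if every row of P sums to 1, the formula as written yields the same value $(1 - back)^2$ for every node, so the toy matrix below is deliberately not row-normalized.

```python
def core_coefficients(P, back):
    """Evaluate Core = ((1 - back) P)^2 (1, ..., 1)^T by two
    matrix-vector products instead of forming the squared matrix."""
    n = len(P)
    M = [[(1.0 - back) * P[i][j] for j in range(n)] for i in range(n)]
    ones = [1.0] * n
    step = [sum(M[i][j] * ones[j] for j in range(n)) for i in range(n)]
    return [sum(M[i][j] * step[j] for j in range(n)) for i in range(n)]


def cluster_direction(i, W, core, f):
    """Return j such that the clustering direction of node i is i -> j,
    or None when neither condition of step 3.1 is met (assumed convention)."""
    n = len(W)
    neighbours = [j for j in range(n) if j != i and W[i][j] == 1]
    if not neighbours:
        return None
    top = max(core[j] for j in neighbours)
    best = [j for j in neighbours if core[j] == top]
    if len(best) == 1:
        return best[0]  # condition one: strictly largest core coefficient
    for j in best:
        # condition two: tied on core, and f[i][j] dominates row i of f
        if all(f[i][j] > f[i][k] for k in range(n) if k != j):
            return j
    return None


# Path graph 0-1-2-3, with the raw adjacency matrix standing in for P.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
core = core_coefficients(path, back=0.5)
directions = [cluster_direction(i, path, core, path) for i in range(4)]
```

For this path graph the two inner nodes receive the larger coefficient, so the end nodes point inward and the two inner nodes point at each other.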
2. The community discovery method for complex networks according to claim 1, characterized in that the specific implementation of step 1.1 includes the following sub-steps:
Step 1.1.1: Compute the information entropy of each attribute in the attribute space, remove the attributes whose information entropy exceeds the threshold $t_h$, and sort the remaining attributes in ascending order of information entropy;
Step 1.1.2: Add the remaining attributes to the sub-attribute space one by one; if adding an attribute causes the structure disturbance degree of the sub-attribute space to exceed the threshold $t_a$, remove that attribute.
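Step 1.1.1 can be sketched as follows. This is a hedged illustration only: the names `attribute_entropy` and `filter_attributes`, the dictionary input layout, and the use of base-2 Shannon entropy are assumptions, since the claim does not state the logarithm base. Step 1.1.2 is not sketched because the structure disturbance degree is not defined in this section.

```python
import math
from collections import Counter


def attribute_entropy(values):
    """Shannon entropy (base 2, an assumption) of one attribute column."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def filter_attributes(table, threshold):
    """Drop attributes whose entropy exceeds `threshold`, then return the
    remaining attribute names in ascending order of entropy.

    table: dict attribute name -> list of per-node values (hypothetical layout)
    """
    scored = [(attribute_entropy(vals), name) for name, vals in table.items()]
    kept = [(h, name) for h, name in scored if h <= threshold]
    return [name for h, name in sorted(kept)]


table = {
    "city": ["a", "a", "b", "b"],  # entropy 1.0
    "id": ["1", "2", "3", "4"],    # entropy 2.0, removed at threshold 1.5
    "flag": ["x", "x", "x", "x"],  # entropy 0.0
}
kept = filter_attributes(table, threshold=1.5)
```

An attribute that is unique per node, such as `id` here, has maximal entropy and carries no grouping information, which is why a high-entropy cutoff removes it.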
3. The community discovery method for complex networks according to claim 1, characterized in that the specific implementation of step 2.1 includes the following sub-steps:
Step 2.1.1: Assess the influence between nodes, which comprises three sub-influences:
Sub-influence one, the influence produced by out-link relations:
$$f_{ij}^{out} = \begin{cases} 0, & W_{ij} = 0 \\ 1 + \exp\left(1 - \alpha_{out} \sum_{k=1}^{N} w_{ik}\right), & W_{ij} = 1 \end{cases}$$
Wherein W is the adjacency matrix of the network and $\alpha_{out}$ is the out-link factor;
Sub-influence two, the influence produced by in-link relations:
$$f_{ij}^{in} = \begin{cases} 0, & W_{ji} = 0 \\ \left(\dfrac{e}{2}\right)^{1 - \alpha_{in} \sum_{k=1}^{N} w_{ki}}, & W_{ji} = 1 \end{cases}$$
Wherein $\alpha_{in}$ is the in-link factor;
Sub-influence three, the influence produced by attribute relations:
$$attr_{ij} = \begin{cases} 0, & I_{ij} = 0 \\ \dfrac{\left(1 - 1.1^{\,1 - \alpha_{attr} \sum_{k=1}^{N} i_{ik}}\right)}{2 / (1 + \log n)}, & I_{ij} = 1 \end{cases}$$
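Wherein I is the adjacency matrix after the virtual nodes are added and $\alpha_{attr}$ is the attribute factor;
The influence between nodes is the sum of the three sub-influences, where $f_{ij}^{attr}$ is the attribute sub-influence written as $attr_{ij}$ above: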
$$f_{ij} = f_{ij}^{out} + f_{ij}^{in} + f_{ij}^{attr}.$$
Step 2.1.2: Normalize each row of the influence matrix; the element $f_{ij}$ of the matrix is then the probability of a random transition from node i to node j. The normalization method is as follows:
$$f_{ij} = \frac{\sum_{k=1}^{N} f_{ik}}{N}.$$
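The out-link and in-link sub-influences and the row normalization can be sketched as below. Several points are assumptions rather than statements of the claimed method: the function names, the attribute sub-influence being passed in as a precomputed matrix `attr`, and above all the normalization, which here divides each entry by its row sum so that every row becomes a probability distribution. The printed formula divides the row sum by N instead, so this reading is an interpretation.

```python
import math


def influence(W, attr, alpha_out, alpha_in):
    """Combine out-link, in-link and (precomputed) attribute sub-influences.

    W:    0/1 adjacency matrix as a list of lists
    attr: matrix of attribute sub-influences (hypothetical precomputed input)
    """
    n = len(W)
    out_sum = [sum(W[i][k] for k in range(n)) for i in range(n)]  # sum_k w_ik
    in_sum = [sum(W[k][i] for k in range(n)) for i in range(n)]   # sum_k w_ki
    f = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            f_out = 0.0
            if W[i][j] == 1:
                f_out = 1.0 + math.exp(1.0 - alpha_out * out_sum[i])
            f_in = 0.0
            if W[j][i] == 1:
                f_in = (math.e / 2.0) ** (1.0 - alpha_in * in_sum[i])
            f[i][j] = f_out + f_in + attr[i][j]
    return f


def row_normalize(f):
    """Assumed row-stochastic normalization: f_ij / sum_k f_ik.
    Rows that sum to zero are left unchanged."""
    result = []
    for row in f:
        total = sum(row)
        result.append([x / total for x in row] if total > 0 else list(row))
    return result


W = [[0, 1, 1], [1, 0, 0], [0, 0, 0]]
zero_attr = [[0.0] * 3 for _ in range(3)]
f = influence(W, zero_attr, alpha_out=1.0, alpha_in=1.0)
P = row_normalize(f)
```

Because the out-link and in-link terms depend on the direction of each edge, `f[i][j]` and `f[j][i]` generally differ, which is what makes the resulting transition probabilities asymmetric.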
4. The community discovery method for complex networks according to claim 1, characterized in that the specific implementation of step 3.2 includes the following sub-steps:
Step 3.2.1: Initialize a community and move the node with the largest core coefficient in the node list into that community;
Step 3.2.2: If a new node has joined the community, move into the community those nodes in the node list whose clustering direction points to the newly added node;
Step 3.2.3: If a new node has joined the community, go to step 3.2.2; otherwise output the community sequence and go to step 3.2.1. If the node list is empty, terminate.
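The loop of steps 3.2.1 to 3.2.3 can be sketched as a seeded expansion over the clustering directions. The name `initial_communities`, the dictionary inputs, and the tie-break that picks the smallest node id among equal core coefficients are assumptions for illustration only.

```python
from collections import deque


def initial_communities(core, direction):
    """Grow initial communities from the highest-core seed nodes.

    core:      dict node -> core coefficient
    direction: dict node -> node it points to, or None (empty direction)
    """
    remaining = set(core)
    # Reverse index: target node -> nodes whose direction points at it.
    followers = {}
    for v, target in direction.items():
        if target is not None:
            followers.setdefault(target, []).append(v)

    communities = []
    while remaining:
        # Step 3.2.1: seed with the largest core coefficient still listed
        # (ties resolved by smallest node id, an assumed convention).
        seed = max(sorted(remaining), key=core.get)
        remaining.discard(seed)
        community = [seed]
        queue = deque([seed])
        # Steps 3.2.2 and 3.2.3: pull in nodes pointing at each new member.
        while queue:
            newest = queue.popleft()
            for v in followers.get(newest, []):
                if v in remaining:
                    remaining.discard(v)
                    community.append(v)
                    queue.append(v)
        communities.append(community)
    return communities


core = {0: 0.5, 1: 0.9, 2: 0.8, 3: 0.4, 4: 0.7}
direction = {0: 1, 1: None, 2: 1, 3: 4, 4: None}
groups = initial_communities(core, direction)
```

Here node 1 seeds the first community and absorbs nodes 0 and 2, which point at it; node 4 then seeds the second community and absorbs node 3.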
5. The community discovery method for complex networks according to claim 1, characterized in that the specific implementation of step 3.3 includes the following sub-steps:
Step 3.3.1: For each node, compute, grouped by community label, the sum of the core coefficients of its adjacent nodes in each community, and mark the community with the largest sum as the node's new community;
Step 3.3.2: If no node's community label has changed, output the community sequence; otherwise move the nodes into their new communities. If this step has already been executed twice, output the community sequence; otherwise go to step 3.3.1.
CN201711215141.5A 2017-11-28 2017-11-28 A kind of community discovery method for complex network Pending CN108009575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711215141.5A CN108009575A (en) 2017-11-28 2017-11-28 A kind of community discovery method for complex network

Publications (1)

Publication Number Publication Date
CN108009575A true CN108009575A (en) 2018-05-08

Family

ID=62054338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711215141.5A Pending CN108009575A (en) 2017-11-28 2017-11-28 A kind of community discovery method for complex network

Country Status (1)

Country Link
CN (1) CN108009575A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180508
