CN103325061B - A community discovery method and system - Google Patents

A community discovery method and system

Info

Publication number
CN103325061B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310201298.8A
Other languages
Chinese (zh)
Other versions
CN103325061A (en
Inventor
徐冰莹
贾焰
杨树强
周斌
韩伟红
李爱平
韩毅
李莎莎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date


Abstract

The present invention provides a community discovery method. The method includes: dividing multiple nodes in a network into communities based on modularity maximization; and adjusting the community boundary nodes obtained in the previous step based on community attribute entropy minimization. The method further includes: if the community division obtained after the adjustment satisfies a termination condition, taking that community division as the final community division; otherwise, treating each community obtained after the adjustment as a node, performing community division on these nodes again, and readjusting the community boundary nodes. The method considers both the network structure and the attribute features of the nodes, which improves the accuracy of community discovery. In addition, the time complexity of the method is close to linear, which makes it suitable for large-scale online social network data.

Description

A community discovery method and system
Technical field
The present invention relates to complex network analysis and data mining, and more particularly to a community discovery method and system.
Background
With in-depth research into the properties and mathematical features of social networks, researchers have found that many networks share a common feature: community structure. That is, a network is made up of several "groups" or "clusters"; connections between the nodes inside each group are dense, while connections between groups are relatively sparse. Discovering network communities helps people understand the structural features of a network more effectively and thus provide more effective and more personalized services, for example information recommendation, user classification, and behavior analysis of Internet groups. On online social networks, the self-media nature of individuals often produces a large amount of text content, which reflects the topics the author cares about, the author's opinion tendencies, and so on. An individual's own tag content and personal profile data, such as age, occupation, and interests, reflect that the individual has certain social characteristics, and the homophily of social networks makes people with the same social characteristics more likely to gather together. Therefore, using this information can improve the quality of communities found on the basis of network structure alone.
Currently, many studies find communities by analyzing the structure of the network. Among them, Blondel et al., observing that the community structure of real large-scale networks is hierarchical, proposed an iterative two-phase fast modularity maximization algorithm (the BGL algorithm) for finding communities. The algorithm has two steps. In the first step, nodes are exchanged locally between communities so that the modularity of the community division is maximized. In the second step, each community produced by the previous step is treated as one node in a new network, and the weight of an edge between two such nodes is the sum of the weights of the edges between the two communities they represent. The two steps are iterated until the modularity can no longer increase. The modularity used by the BGL algorithm is defined by the following formula, which applies to weighted networks:
$$Q = \frac{1}{2m} \sum_{i=1}^{i_{\max}} \sum_{j=1}^{j_{\max}} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j) \tag{1}$$
where $A_{ij}$ is the weight of the edge between node $i$ and node $j$; $i_{\max}$ is the maximum value of $i$ and $j_{\max}$ is the maximum value of $j$; $k_i = \sum_j A_{ij}$ is the sum of the weights of all edges connected to node $i$; $c_i$ is the community to which node $i$ belongs; the function $\delta(u, v)$ is 1 when $u$ equals $v$ and 0 otherwise; and $m = \frac{1}{2} \sum_{i,j} A_{ij}$ is the sum of the weights of all edges in the network.
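To make the modularity definition above concrete, the following Python sketch computes Q for a given partition. It is only an illustration under stated assumptions: the function name `modularity`, the edge-list input format, and the restriction to graphs without self-loops are choices made for this example, not part of the original disclosure. The per-community sum used in the code is algebraically equivalent to the double sum over node pairs.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity Q of a partition of an undirected weighted graph.

    edges: iterable of (u, v, weight) with u != v, each edge listed once
           (assumed input format for this sketch).
    community: dict mapping every node to its community label.
    """
    degree_sum = defaultdict(float)   # per community: sum of k_i
    internal = defaultdict(float)     # per community: sum of A_ij over ordered pairs
    two_m = 0.0                       # 2m = sum of A_ij over all ordered pairs
    for u, v, w in edges:
        two_m += 2.0 * w              # counts both A_uv and A_vu
        degree_sum[community[u]] += w
        degree_sum[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += 2.0 * w
    if two_m == 0.0:
        return 0.0
    return sum(internal[c] / two_m - (degree_sum[c] / two_m) ** 2
               for c in degree_sum)
```

For two triangles joined by a single edge, with each triangle taken as one community, the function returns 5/14; placing all six nodes in one community gives 0.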
However, the BGL algorithm does not use the attribute information of network nodes. Studies show that in real online social networks the attribute information of nodes can serve as one of the criteria for judging communities: provided the structure is tightly connected, the more similar the attributes of the nodes in the same community, the better. In addition, although many existing graph clustering methods consider the network structure together with the attribute features of nodes (also called node attributes or node attribute information), for example by constructing a new network through weighting attributes and structure and performing community division on the new network, the resulting clusters often contain communities that are structurally disconnected or only loosely connected, which makes the community discovery result inaccurate. Moreover, these methods have high time complexity and are unsuitable for processing large-scale data.
Therefore, a method is needed that improves the accuracy of community discovery and is also applicable to large-scale online social network data.
Summary of the invention
According to one embodiment of the present invention, a community discovery method is provided, including:
Step 1): dividing multiple nodes in a network into communities based on modularity maximization;
Step 2): adjusting the community boundary nodes obtained from step 1) based on community attribute entropy minimization;
Step 3): if the community division obtained from step 2) satisfies a termination condition, taking that community division as the final community division; otherwise, treating each community obtained from step 2) as a node, re-executing step 1) to perform community division on these nodes, and re-executing step 2) to adjust the community boundary nodes.
In one embodiment, the termination condition is: the modularity of the community division obtained after the processing of step 1) and step 2) has not increased compared with the modularity before the processing, and the whole community attribute entropy of the community division obtained after the processing of step 1) and step 2) has not decreased compared with the whole community attribute entropy before the processing.
In another embodiment, the termination condition is: the number of repetitions of step 1) and step 2) has reached a preset threshold.
In one embodiment, step 1) includes: for each node in the network, moving the node to the community of the neighbor node that yields the largest positive modularity increment, until moving any node can no longer bring a positive modularity increment.
In one embodiment, step 2) includes:
Step 21): randomly selecting a community boundary node;
Step 22): calculating the community attribute entropy increment of each neighbor community that results from moving the community boundary node from its current community to that neighbor community;
Step 23): selecting the neighbor community with the smallest community attribute entropy increment and judging whether that neighbor community is the community in which the community boundary node currently resides; if not, moving the community boundary node from its current community to that neighbor community;
Step 24): if the whole community attribute entropy changes before and after the community boundary node is moved, returning to step 21).
In a further embodiment, step 21) includes:
Step 211): randomly selecting a community from the communities obtained in step 1);
Step 212): randomly selecting a node from the selected community, where the endpoints of the edges connected to that node are not all nodes in the community to which it belongs.
In one embodiment, before step 21) the method further includes:
Step 20): restoring the community division of step 1) to a community division of the original nodes, where the original nodes are the nodes in the network before community division based on modularity maximization was performed for the first time.
According to one embodiment of the present invention, a community discovery system is provided, including a community division module and a community adjustment module. The community division module is used to divide multiple nodes in a network into communities based on modularity maximization. The community adjustment module is used to adjust, based on community attribute entropy minimization, the community boundary nodes obtained from the community division module. If the community division obtained after the adjustment satisfies a predetermined condition, that community division is taken as the final community division; otherwise, each community obtained after the adjustment is treated as a node, the community division module performs community division on these nodes again, and the community adjustment module readjusts the community boundary nodes.
The beneficial effects of the present invention are as follows:
By using the attribute features of nodes, the boundary nodes of the communities produced by the modularity maximization method are optimized so as to find communities with more distinct features, which improves the accuracy of community discovery. In addition, the time complexity of the present invention is close to linear, which makes it suitable for large-scale online social network data.
Description of the drawings
Fig. 1 is a flow chart of a community discovery method according to one embodiment of the present invention;
Fig. 2 is a tree diagram of one embodiment of a network that includes multiple nodes;
Fig. 3 is a tree diagram of the community division obtained after the nodes shown in Fig. 2 undergo the first community division based on modularity maximization and the adjustment of community boundary nodes based on community attribute entropy minimization;
Fig. 4 is a tree diagram of the community division obtained after the nodes shown in Fig. 2 undergo the second community division based on modularity maximization and the adjustment of community boundary nodes based on community attribute entropy minimization;
Fig. 5a is a schematic diagram of one embodiment of a network structure;
Fig. 5b is a schematic diagram of the attribute matrix of the nodes in the network structure shown in Fig. 5a;
Fig. 5c is a schematic diagram of the result of community division that considers only the network structure;
Fig. 5d is a schematic diagram of the result of community division using the community discovery method provided by the present invention; and
Fig. 6 is a schematic diagram comparing the results of community division using the community discovery method provided by the present invention and existing methods.
Detailed description
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
For convenience of the description below, the following concepts are introduced first:
1. Network
Given a network $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of nodes and $E = \{e_1, e_2, \ldots, e_m\}$ is the set of edges, with $|V| = n$ and $|E| = m$. In general, $\langle v_i, v_j \rangle$ denotes the edge connecting nodes $v_i$ and $v_j$, where $1 \le i \le n$, $1 \le j \le n$, $i \ne j$. An edge can therefore also be written as $e_k = \langle v_i, v_j \rangle$, $1 \le k \le m$. In a social network, an edge can indicate that the nodes (users) it connects are friends, followers, and so on, depending on the analysis target. The networks considered in this document are undirected graphs, and the node pair that represents an edge in $G$ is unordered; that is, $\langle v_i, v_j \rangle$ and $\langle v_j, v_i \rangle$ denote the same edge.
2. Network with node attribute information
As described above, taking an online social network as an example, an attribute (that is, the attribute feature or attribute information mentioned above) can be a user's tag, age, nationality, hobbies, and so on. Dividing different networks may require different attributes, and attributes can be discrete or continuous. A discrete attribute is, for example, a user's hobby: sports, literature, politics, and so on; a continuous attribute is, for example, a user's age. A continuous age can also be discretized, for example by mapping ages 10 to 15 to the discrete value "teenager". Attributes can be represented by feature vectors, for example a triple consisting of address (down to province), age (down to age bracket), and interest (content or tags assigned to a specific category by manual or computer-assisted analysis): (Hunan, teenager, literature), (Zhejiang, middle-aged, literature).
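As a small illustration of the attribute representation described above, the Python sketch below discretizes a continuous age and builds the (address, age bracket, interest) triple. The function names `age_bracket` and `attribute_vector` are hypothetical, and every cut-off other than the 10 to 15 range mentioned in the text is an arbitrary assumption chosen only to make the example complete.

```python
def age_bracket(age):
    """Map a continuous age to a discrete bracket.

    Only the 10-15 range comes from the text; the other cut-offs
    are illustrative assumptions.
    """
    if age < 10:
        return "child"
    if age <= 15:
        return "teenager"
    if age < 40:
        return "young adult"
    if age < 60:
        return "middle-aged"
    return "senior"

def attribute_vector(province, age, interest):
    """Build the (address, age bracket, interest) triple for one node."""
    return (province, age_bracket(age), interest)

# Example nodes corresponding to the triples mentioned in the text.
node_a = attribute_vector("Hunan", 13, "literature")
node_b = attribute_vector("Zhejiang", 45, "literature")
```

A categorical triple like this would still need to be turned into a numeric vector (for example a one-hot encoding) before a vector-based similarity measure can be applied.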
A network whose nodes have attributes can be defined as a graph with node attribute information (also called a multi-attribute graph) $G = \{V, E, A\}$, where $A = \{a_1, a_2, \ldots, a_l\}$ is the set of attributes that the nodes in the network have, and the number of attributes is $|A| = l$. Any node $v_i \in V$ in the network corresponds to an attribute vector $[a_{i1}, \ldots, a_{il}]$, where $a_{il}$ is the value of node $v_i$ on attribute $a_l$.
The goal of community discovery on a multi-attribute graph is to divide the nodes of the graph into $k$ communities (that is, groups or clusters), denoted $G_i = (V_i, E_i, A_i)$, where $V_i \cap V_j = \emptyset$, such that the nodes in the same community are not only densely connected to one another but also highly similar.
3. Neighbor node set of a node
The neighbor node set of node $v_i$ is $N(v_i) = \{v_j \mid \langle v_i, v_j \rangle \in E, v_j \in V\}$, that is, the endpoints of the edges directly connected to node $v_i$ in the network.
4. Neighbor community set of a node
The neighbor community set of node $v_i$ denotes the communities that are directly connected to node $v_i$.
5. Community boundary node
The boundary node set of community $V_m$ is defined as follows: a node $v_i$ in the community is a boundary node of the community if the endpoints of the edges directly connected to $v_i$ are not all nodes inside community $V_m$. In a social network, the boundary nodes of a community are often important nodes that act as connecting bridges and information propagation channels, and they may also be nodes that belong to multiple communities.
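Concepts 3 to 5 can be sketched in Python as follows. The function names and the edge-list and dictionary inputs are assumptions made for this illustration. The sketch also assumes that the neighbor community set of a node may include the node's own community, which is consistent with the later step that checks whether the selected neighbor community is the one the node already belongs to.

```python
from collections import defaultdict

def build_neighbors(edges):
    """N(v): the nodes that share an edge with v, for an undirected edge list."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs

def neighbor_communities(node, nbrs, community):
    """Communities of the node's neighbors (may include the node's own community)."""
    return {community[u] for u in nbrs[node]}

def boundary_nodes(label, nbrs, community):
    """Nodes of community `label` with at least one neighbor outside it."""
    return {v for v, c in community.items()
            if c == label and any(community[u] != label for u in nbrs[v])}
```

For two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3), node 2 is the only boundary node of the first community and node 3 the only boundary node of the second.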
6. Community attribute entropy
Entropy is a key concept in Shannon's information theory. In data mining it can be used to characterize the similarity within a data set: the higher the similarity among the nodes in a data set, the lower its overall entropy. Suppose a community $V_m$ containing $|V_m| = M$ nodes is given; the attribute entropy $H(V_m)$ of this community can then be defined by the following formula:
$$H(V_m) = \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \left( s_{i,j}^2 \ln s_{i,j}^2 + (1 - s_{i,j}^2) \ln (1 - s_{i,j}^2) \right) \tag{2}$$
In formula (2), $s_{i,j}$ denotes the similarity between the two nodes $i$ and $j$ on their attributes (reflecting how close the two nodes are in terms of attributes).
On this basis, the attribute entropy of the whole community division (also called the whole community attribute entropy) can be expressed as:
$$H = \sum_{m=1}^{K} H(V_m) \tag{3}$$
where $K$ is the number of communities.
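The two entropy definitions above can be sketched in Python as follows. The names `pair_term`, `community_entropy`, and `partition_entropy` and the callable `sim` are hypothetical. Two choices in the sketch are assumptions rather than statements from the source: a pair whose squared similarity is exactly 0 or 1 contributes 0 (the usual limit of x ln x), and the sum is negated so that the result is non-negative, in line with the positive entropy values reported in the experiments; the formula as given in claim 1 carries no explicit minus sign.

```python
import math

def pair_term(s):
    """Entropy term for one node pair with attribute similarity s in [0, 1].

    Assumption: negated so the value is >= 0; pairs with s*s equal to
    0 or 1 contribute 0.
    """
    p = s * s
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def community_entropy(members, sim):
    """H(V_m): sum of pair terms over all unordered pairs i < j in one community."""
    members = list(members)
    total = 0.0
    for a in range(len(members) - 1):
        for b in range(a + 1, len(members)):
            total += pair_term(sim(members[a], members[b]))
    return total

def partition_entropy(communities, sim):
    """H: sum of H(V_m) over all K communities of the division."""
    return sum(community_entropy(c, sim) for c in communities)
```

With a similarity of 1 inside an attribute group and 0.7 across groups, a division that keeps each group together has entropy 0, while a division that mixes the groups has a strictly larger value.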
According to one embodiment of the present invention, various similarity calculation methods can be used to compute the similarity between nodes on their attributes, for example cosine similarity or the generalized Jaccard coefficient. Attributes can be continuous, discrete, or text content. In an embodiment in which the attributes are categorical, the attribute similarity can be computed with a feature-vector-based method or with the Descartes similarity calculation method. If an attribute is continuous, it can first be converted to a discrete one (as in the age conversion described above) and then processed.
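As an illustration of the two similarity measures named above, the following Python sketch computes them on numeric attribute vectors such as the 0/1 vectors of Fig. 5b. The function names are hypothetical. The generalized Jaccard coefficient is taken here in its Tanimoto form, which is an assumption about the intended variant, and returning 0 for all-zero vectors is likewise a convention chosen for this sketch.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def generalized_jaccard(x, y):
    """Tanimoto form: x.y / (|x|^2 + |y|^2 - x.y)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return dot / denom if denom else 0.0
```

Both functions return a value in [0, 1] for non-negative vectors, so either can supply the similarity used in the attribute entropy.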
According to one embodiment of the present invention, a community discovery method is provided. As shown in Fig. 1, the method includes the following three steps:
Step one: perform community division using the modularity maximization method.
In an embodiment in which the weighted network has N nodes, each node in the weighted network initially represents a distinct community; that is, the network has as many communities as it has nodes. Next, node i is hypothetically removed from its own community and moved into the community of its neighbor node j, and the modularity increment is computed for moving it to each different community. The node j' in the neighbor node set of node i that yields the largest positive modularity increment is found, and node i is then actually moved to the community of node j'. If no neighbor node j' satisfying this condition is found, node i stays in its original community. This process continues until moving any node in the network no longer brings a positive modularity increment. When the process ends, a local modularity maximum of the network has been reached.
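The local moving procedure of step one can be sketched in Python as follows. This is a deliberately simple illustration, not the disclosed implementation: it recomputes the full modularity for every candidate move, whereas practical implementations use an incremental gain formula. The function names, the adjacency-dictionary format, the tolerance `1e-12`, and the absence of self-loops are assumptions of this sketch.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Symmetric adjacency dict {u: {v: weight}} from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    return dict(adj)

def modularity(adj, community):
    """Q of a partition; every undirected edge appears twice in adj."""
    two_m = sum(w for nbrs in adj.values() for w in nbrs.values())
    if two_m == 0:
        return 0.0
    internal, total = defaultdict(float), defaultdict(float)
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            total[community[u]] += w
            if community[u] == community[v]:
                internal[community[u]] += w
    return sum(internal[c] / two_m - (total[c] / two_m) ** 2 for c in total)

def local_moving(adj):
    """Move nodes to the neighbor community with the largest positive gain."""
    community = {u: u for u in adj}          # initially one community per node
    improved = True
    while improved:
        improved = False
        for u in adj:
            original = community[u]
            base = modularity(adj, community)
            best_gain, best_comm = 0.0, original
            # Labels are node ids here, so they are assumed to be sortable.
            for c in sorted({community[v] for v in adj[u]} - {original}):
                community[u] = c             # tentative move
                gain = modularity(adj, community) - base
                if gain > best_gain + 1e-12:
                    best_gain, best_comm = gain, c
            community[u] = best_comm         # commit the best move, or stay
            if best_comm != original:
                improved = True
    return community
```

On two triangles joined by one edge, the procedure ends with each triangle forming its own community.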
Step two: perform community attribute entropy minimization.
As described above, the higher the similarity among the nodes in a community, the lower its attribute entropy. Starting from the community division obtained in step one, the whole community attribute entropy is minimized by adjusting the community boundary nodes.
In one embodiment, if the community division in step one was not performed on the original nodes, then before this step the community division obtained in step one is first restored to a community division of the original nodes. The reason is that before each subsequent iteration (that is, before each use of the modularity maximization method), the communities obtained from the existing division (including modularity maximization and community attribute entropy minimization) are treated as nodes, and the next iteration performs community division on those nodes (as described below). Therefore, in the community division obtained in step one, each node in a community may itself be a community obtained from a previous division, whereas the community attribute entropy must be computed from the attributes of the original nodes in the community.
In a further embodiment, the above restoration can be realized by creating and maintaining the community divisions produced by two adjacent applications of the modularity maximization method together with their correspondence, whose structure can be represented by trees as shown in Figs. 2-4. Figs. 2-4 respectively show the community structure of a network with 10 original nodes at the initial stage, after the first iteration (each iteration includes modularity maximization and community attribute entropy minimization), and after the second iteration. Although the network structure diagram is not shown, it should be understood that edges may connect the 10 nodes and that each node has attributes. In Fig. 2, 0_0 to 0_9 are the numbers of the 10 original nodes, where the digit before "_" indicates which iteration produced the tree (or which network it is, the network in the initial state being the 0th network). The "-" node above the 10 nodes is the root of the whole tree, and each of its subtrees represents one community (that is, initially each node is a community). After the network is divided into communities by the modularity maximization method for the first time and the division is adjusted based on community attribute entropy minimization, the network is divided into 4 communities (also called supernodes): original nodes 0_0, 0_1, 0_2 and 0_3 form one community, original nodes 0_4, 0_5 and 0_6 form one community, original node 0_7 forms a community by itself, and original nodes 0_8 and 0_9 form one community. Fig. 3 shows the correspondence between the communities obtained after the first iteration and the original nodes; the 4 communities in Fig. 3 are numbered 1_0, 1_1, 1_2 and 1_3. The original nodes corresponding to community 1_0 are 0_0, 0_1, 0_2 and 0_3; those corresponding to community 1_1 are 0_4, 0_5 and 0_6; the one corresponding to community 1_2 is 0_7; and those corresponding to community 1_3 are 0_8 and 0_9. If the predetermined termination condition is not met, the new network composed of these 4 supernodes is divided further in the next step (also called an iteration, including modularity maximization and community attribute entropy minimization). Fig. 4 shows the correspondence between communities and nodes after the second division, where community 2_0 contains supernodes 1_0 and 1_1, and community 2_1 contains supernodes 1_2 and 1_3. The current community division on the original nodes is therefore {0_0, 0_1, 0_2, 0_3, 0_4, 0_5, 0_6} and {0_7, 0_8, 0_9}.
The use of trees to represent the correspondence between original nodes and community divisions has been described above by way of example. It should be understood that other techniques well known in the art can also be used to create and maintain the community divisions produced by two adjacent applications of the modularity maximization method and their correspondence.
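One possible way to maintain this correspondence is sketched below in Python, using the example of Figs. 2-4. Instead of an explicit tree, the sketch stores one dictionary per iteration that maps each node of that level to its community at the next level; this representation and the function name `restore_partition` are assumptions made for illustration.

```python
def restore_partition(levels):
    """Map every original node to its community at the highest level.

    levels[k] maps each node of level k to its community (supernode)
    at level k + 1; levels[0] is keyed by the original nodes.
    """
    current = {node: node for node in levels[0]}
    for mapping in levels:
        # Follow each original node one level up the hierarchy.
        current = {orig: mapping[label] for orig, label in current.items()}
    groups = {}
    for orig, label in current.items():
        groups.setdefault(label, []).append(orig)
    return groups

# The two iterations of the example with 10 original nodes.
level0 = {"0_0": "1_0", "0_1": "1_0", "0_2": "1_0", "0_3": "1_0",
          "0_4": "1_1", "0_5": "1_1", "0_6": "1_1",
          "0_7": "1_2", "0_8": "1_3", "0_9": "1_3"}
level1 = {"1_0": "2_0", "1_1": "2_0", "1_2": "2_1", "1_3": "2_1"}
final = restore_partition([level0, level1])
```

Here `final` groups 0_0 through 0_6 under 2_0 and 0_7 through 0_9 under 2_1, which matches the division stated for Fig. 4.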
In one embodiment, given a multi-attribute graph and an initial community division of it, the community attribute entropy minimization method includes:
a) Randomly select a community $V_m$ from the current community division and obtain its boundary node set according to the concepts introduced above.
b) Randomly select a community boundary node i from the set, suppose it is moved into each of its neighbor communities, and compute the increment of its contribution to the attribute entropy of that neighbor community (that is, the community attribute entropy increment).
The increment of the contribution of node i to the attribute entropy of community $V_m$ (the neighbor community it is moved to) is computed according to formula (4):
where node i is the selected community boundary node; $i_{\max}$ is the maximum value of i; $H(V_m)$ is the attribute entropy of the (neighbor) community $V_m$ in which node i is placed; and $s_{i,j}$ is the similarity of the two nodes i and j on their attributes.
c) Select the neighbor community with the smallest community attribute entropy increment (the neighbor community corresponding to the smallest ΔH computed with formula (4)) and judge whether that community is the one in which node i currently resides; if not, move the node from its current community to the new community.
d) Judge whether the end condition is met: check whether the whole community attribute entropy no longer changes; if it still changes, repeat the above steps.
In one embodiment, community attributes entropy minimization process can be described using following algorithm:
Algorithm: entropyMin
Input: community division V, similarity matrix
Output: community division V'
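The body of the entropyMin listing is not reproduced in this text. The following Python sketch shows one way steps a) to d) could be realized; it is an interpretation under stated assumptions, not the patented listing. In particular, the entropy increment of a community when node i joins it is taken to be the sum of the pair terms between i and the community's members, the pair term is negated so that it is non-negative, nodes are visited in seeded random sweeps, and the loop stops when a whole sweep moves no node. All names are hypothetical.

```python
import math
import random

def pair_term(s):
    """Non-negative entropy term for one pair (sign is an assumption)."""
    p = s * s
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def contribution(node, members, sim):
    """Entropy the node adds to a community: its pair terms with the other members."""
    return sum(pair_term(sim(node, other)) for other in members if other != node)

def entropy_min(neighbors, community, sim, seed=0, max_sweeps=100):
    """Move boundary nodes to the neighbor community where they add least entropy."""
    community = dict(community)              # work on a copy
    rng = random.Random(seed)
    nodes = list(neighbors)
    for _ in range(max_sweeps):
        moved = False
        rng.shuffle(nodes)                   # random visiting order
        for node in nodes:
            own = community[node]
            candidates = {community[v] for v in neighbors[node]}
            if candidates <= {own}:
                continue                     # not a boundary node
            members = {}
            for v, c in community.items():
                members.setdefault(c, []).append(v)
            best, best_cost = own, contribution(node, members[own], sim)
            for c in sorted(candidates - {own}, key=repr):
                cost = contribution(node, members[c], sim)
                if cost < best_cost - 1e-12: # move only on strict improvement
                    best, best_cost = c, cost
            if best != own:
                community[node] = best
                moved = True
        if not moved:
            break                            # whole entropy no longer changes
    return community
```

Because a node moves only when its contribution strictly decreases, the whole community attribute entropy decreases with every move, so the sweeps terminate.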
After community attribute entropy minimization, each community obtained by the minimization process is replaced by one node (supernode) in a new network. The weight of an edge between two such nodes is the sum of the weights of the edges between the two corresponding communities, and the weight of a node's self-loop is the sum of the weights of the edges between nodes inside the community it represents.
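The construction of the new network can be sketched in Python as follows. The function name `aggregate`, the edge-list input, and the representation of the result as a dictionary keyed by ordered label pairs are assumptions for this illustration; community labels are assumed to be mutually comparable so that each pair can be stored in a canonical order.

```python
from collections import defaultdict

def aggregate(edges, community):
    """Collapse each community into a supernode.

    edges: (u, v, weight) triples of an undirected graph, each edge once.
    Returns {(cu, cv): weight} with cu <= cv; a key (c, c) is the self-loop
    of supernode c and holds the total weight of the edges inside c.
    """
    agg = defaultdict(float)
    for u, v, w in edges:
        cu, cv = community[u], community[v]
        key = (cu, cv) if cu <= cv else (cv, cu)
        agg[key] += w
    return dict(agg)
```

For two triangles joined by one edge, with each triangle as a community, the result has two self-loops of weight 3 and one edge of weight 1 between the two supernodes.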
Step three: judge whether the termination condition is satisfied; if not, repeat the above process.
In one embodiment, since the final goal is for the resulting communities to strike a balance between modularity maximization and minimization of the community attribute entropy, which characterizes community homogeneity, the method terminates when the modularity can no longer increase and the whole community attribute entropy can no longer decrease. In another embodiment, the granularity and scale of the community division can be controlled by controlling the number of iterations, so as to reflect community structure at different levels and thereby find a hierarchical community structure.
In one embodiment, the process of community division based on modularity maximization and community attribute entropy minimization can be described by the following algorithm:
Algorithm:ACD
Input:imax
Output:V
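The body of the ACD listing is not reproduced in this text. As a rough sketch of the outer loop only, the Python code below combines the two termination conditions described above. The callables `step`, `modularity`, and `entropy` are hypothetical placeholders supplied by the caller: `step` is assumed to perform one round of modularity maximization followed by community attribute entropy minimization and to return the new division, and `i_max` is assumed to be the iteration limit given as the algorithm's input.

```python
def acd(initial_partition, step, modularity, entropy, i_max=None):
    """Repeat division and adjustment until a termination condition holds.

    Stops when modularity has not increased and the whole community
    attribute entropy has not decreased, or after i_max iterations.
    """
    partition = initial_partition
    q, h = modularity(partition), entropy(partition)
    iterations = 0
    while True:
        new_partition = step(partition)      # steps one and two
        iterations += 1
        new_q, new_h = modularity(new_partition), entropy(new_partition)
        converged = new_q <= q and new_h >= h
        partition, q, h = new_partition, new_q, new_h
        if converged or (i_max is not None and iterations >= i_max):
            return partition                 # the latest division is final
```

Passing a small `i_max` stops the loop early and yields a coarser or finer level of the hierarchy, which corresponds to the embodiment that controls granularity through the number of iterations.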
According to one embodiment of the present invention, a community discovery system is also provided, including a community division module and a community adjustment module. The community division module is used to divide multiple nodes in a network into communities based on modularity maximization. The community adjustment module is used to adjust, based on community attribute entropy minimization, the community boundary nodes obtained from the community division module. If the community division obtained after the adjustment satisfies a predetermined condition, that community division is taken as the final community division; otherwise, each community obtained after the adjustment is treated as a node, the community division module performs community division on these nodes again, and the community adjustment module readjusts the community boundary nodes.
Figs. 5a-5d compare, on an example network, the accuracy of the communities found by the community discovery method provided by the present invention with that of an existing method based only on network structure. Fig. 5a shows the network structure of the example, and Fig. 5b shows the attribute matrix of the nodes. For simplicity, A1, A2 and A3 denote the node attribute information needed for community division, and 0 or 1 indicates whether a node has the attribute. As can be seen from Fig. 5b, the attributes of nodes 1, 2 and 3 shown in Fig. 5a are identical, the attributes of nodes 4, 5 and 6 are identical, and the attributes of nodes 7, 8, 9, 10, 11 and 12 are identical. Fig. 5c shows the community division result obtained by an existing method that considers only the network structure, and Fig. 5d shows the final community division result obtained by the community discovery method provided by the present invention, which considers both the structure and the node attributes. As can be seen from Figs. 5c and 5d, in Fig. 5c the attributes of the nodes inside each community are not all the same, whereas in the community division result of Fig. 5d the node attributes inside each community are identical: nodes 1, 2 and 3 form one community, nodes 4, 5 and 6 form one community, and the other nodes form one community. It can be seen that the result obtained by the community discovery method provided by the present invention is more accurate.
To further verify the accuracy of the community discovery method provided by the present invention, the inventors collected data on political blogs in the United States, comprising 1490 blogs and the 19090 hyperlinks among them, where each blog has its own attribute ("democratic" or "republican"). Fig. 6 shows the results of community discovery on the political blogs using different methods. As shown in Fig. 6, after community division with the BGL method, the modularity and the whole community attribute entropy are 0.472 and 0.407 respectively, while community division with the community discovery method provided by the present invention (denoted ACD in the figure) yields a modularity of 0.411 and a whole community attribute entropy of 0.03, which are 4.6% and 92.1% lower than with the BGL method respectively (as described above, a lower entropy indicates a higher node similarity within communities). The whole community attribute entropy obtained with the SA-Cluster method is close to that obtained with the community discovery method provided by the present invention, but its modularity is lower. The experimental analysis on real social network data shown in Fig. 6 demonstrates that the community discovery method provided by the present invention can obtain communities with more distinct features: the nodes inside the same community are not only tightly connected structurally but also more similar to one another.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the embodiments, those of ordinary skill in the art will understand that modifications or equivalent substitutions of the technical solution of the present invention that do not depart from the spirit and scope of the technical solution shall all fall within the scope of the claims of the present invention.

Claims (9)

1. A community discovery method, including:
Step 1): dividing multiple nodes in a network into communities based on modularity maximization, wherein the modularity is calculated using the following formula:
$$Q = \frac{1}{2m} \sum_{i=1}^{i_{\max}} \sum_{j=1}^{j_{\max}} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j),$$
wherein $A_{ij}$ represents the weight of the edge between node $i$ and node $j$, $i_{\max}$ is the maximum value of $i$, $j_{\max}$ is the maximum value of $j$, $k_i = \sum_j A_{ij}$ represents the sum of the weights of all edges connected to node $i$, $c_i$ represents the community to which node $i$ belongs, $\delta(c_i, c_j)$ is 1 when $c_i$ equals $c_j$ and 0 otherwise, and $m = \frac{1}{2} \sum_{i,j} A_{ij}$ represents the sum of the weights of all edges in the network;
Step 2), adjusted from step 1 based on community attributes entropy minimization) the community boundary node that obtains;Including:
Step 21), random selection one community boundary node;
Step 22), calculate the community boundary node and move on to the neighbours community produced by its neighbours community from the community being located Community attributes entropy production;
Step 23), the neighbours community that selects the community attributes entropy production minimum, judge whether the neighbours community is the community The community that boundary node is located, if it is not, then the community boundary node is moved to the neighbours society from its community being located Area;
Step 24) if, the whole community attributes entropy before and after the community boundary node motion change, return to step 21);Wherein, whole community attributes entropy is calculated according to following formula:
H = Σ m = 1 K H ( V m ) ,
Wherein, K represents the quantity of community, H (Vm) represent community VmAttribute entropy, and
H ( V m ) = Σ i = 1 M - 1 Σ j = i + 1 M ( s i , j 2 ln s i , j 2 + ( 1 - s i , j 2 ) ln ( 1 - s i , j 2 ) ) ,
Wherein M represents community VmComprising number of nodes, si,jRepresent the similarity of two node is and j on attribute;
Step 3) if the community division obtained from step 2) satisfies a termination condition, taking that community division as the final community division; otherwise, taking the communities obtained from step 2) as nodes, re-executing step 1) to perform community division on those nodes, and re-executing step 2) to adjust the community boundary nodes.
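As an illustration of the modularity formula in step 1), the following is a minimal Python sketch and not the patented implementation. It assumes an undirected weighted network without self-loops, stored as a symmetric dict of dicts; the names `modularity`, `adj` and `community` are hypothetical and do not come from the patent.

```python
from typing import Dict, Hashable

Graph = Dict[Hashable, Dict[Hashable, float]]


def modularity(adj: Graph, community: Dict[Hashable, Hashable]) -> float:
    """Q = 1/(2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j).

    adj[i][j] is the weight A_ij of the edge between i and j (assumed symmetric,
    no self-loops); community[i] is the label c_i of node i.
    """
    # k_i: sum of the weights of all edges connected to node i.
    k = {i: sum(nbrs.values()) for i, nbrs in adj.items()}
    # m: sum of the weights of all edges (each edge is stored twice in adj).
    m = sum(k.values()) / 2.0
    if m == 0:
        return 0.0
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] == community[j]:  # delta(c_i, c_j) = 1
                q += adj[i].get(j, 0.0) - k[i] * k[j] / (2.0 * m)
    return q / (2.0 * m)


# Two triangles joined by a single edge (2-3), all weights 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj: Graph = {n: {} for n in range(6)}
for u, v in edges:
    adj[u][v] = 1.0
    adj[v][u] = 1.0

two_communities = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_community = {n: "a" for n in range(6)}
q_two = modularity(adj, two_communities)
q_one = modularity(adj, one_community)
```

In this toy network, splitting the two triangles gives a higher modularity than placing all nodes in one community, which is the behavior step 1) exploits.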
2. The method according to claim 1, wherein the termination condition is:
the modularity of the community division obtained after the processing of step 1) and step 2) has not increased compared with the modularity before the processing, and the whole community attribute entropy of the community division obtained after the processing of step 1) and step 2) has not decreased compared with the whole community attribute entropy before the processing.
3. The method according to claim 1, wherein the termination condition is:
the number of times step 1) and step 2) have been repeated has reached a predetermined threshold.
4. The method according to any one of claims 1-3, wherein step 1) comprises:
for each node in the network, moving the node to the community of the neighbor node that yields the largest positive modularity increment, until no movement of any node can bring a positive modularity increment.
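The node-moving procedure of claim 4 can be sketched as follows. This is a deliberately naive Python illustration under stated assumptions, not the patented implementation: it recomputes the full modularity for every trial move instead of using an incremental gain formula, it assumes an undirected weighted network without self-loops, and the names `local_move`, `adj` and `community` are hypothetical.

```python
from typing import Dict

Graph = Dict[int, Dict[int, float]]


def modularity(adj: Graph, community: Dict[int, int]) -> float:
    """Modularity Q of a partition (symmetric adj, no self-loops)."""
    k = {i: sum(nbrs.values()) for i, nbrs in adj.items()}
    m = sum(k.values()) / 2.0
    if m == 0:
        return 0.0
    q = sum(
        adj[i].get(j, 0.0) - k[i] * k[j] / (2.0 * m)
        for i in adj
        for j in adj
        if community[i] == community[j]
    )
    return q / (2.0 * m)


def local_move(adj: Graph, community: Dict[int, int]) -> Dict[int, int]:
    """Move nodes to neighbor communities until no move increases Q."""
    community = dict(community)  # work on a copy
    improved = True
    while improved:
        improved = False
        for node in adj:
            current = community[node]
            base_q = modularity(adj, community)
            best_gain, best_comm = 0.0, current
            # Candidate targets: communities of the node's neighbors.
            targets = sorted({community[n] for n in adj[node]} - {current})
            for comm in targets:
                community[node] = comm  # trial move
                gain = modularity(adj, community) - base_q
                if gain > best_gain + 1e-12:  # keep only strictly better moves
                    best_gain, best_comm = gain, comm
            community[node] = best_comm  # commit the best move, or stay put
            if best_comm != current:
                improved = True
    return community


# Two triangles joined by edge 2-3; every node starts in its own community.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj: Graph = {n: {} for n in range(6)}
for u, v in edges:
    adj[u][v] = 1.0
    adj[v][u] = 1.0

result = local_move(adj, {n: n for n in range(6)})
```

Because every accepted move strictly increases the modularity and the number of partitions is finite, the loop terminates. A practical implementation would compute the gain of a move incrementally rather than re-evaluating the whole sum.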
5. The method according to any one of claims 1-3, wherein the network is a weighted network.
6. The method according to claim 1, wherein step 21) comprises:
Step 211) randomly selecting one community from the communities obtained from step 1);
Step 212) randomly selecting a node from the selected community, wherein the endpoints of the edges connected to that node are not all nodes in the community in which it is located.
7. The method according to claim 1, further comprising, before step 21):
Step 20) restoring the community division of step 1) to a community division of original nodes, wherein the original nodes are the nodes in the network before the first community division based on modularity maximization.
8. The method according to claim 1, wherein the community attribute entropy increment is calculated according to the following formula:
$$\Delta H = H(V_m) - H(V_m - i) = \sum_{i=1}^{i_{\max}} \sum_{j=V_m-1}^{V_m-i_{\max}} \left( s_{i,j}^2 \ln s_{i,j}^2 + (1 - s_{i,j}^2) \ln (1 - s_{i,j}^2) \right),$$
where node i denotes the selected community boundary node, $i_{\max}$ is the maximum value of i, $H(V_m)$ denotes the attribute entropy of community $V_m$, including node i, after node i has been moved to neighbor community $V_m$, $H(V_m - i)$ denotes the attribute entropy of that neighbor community before node i is moved to neighbor community $V_m$, and $s_{i,j}$ denotes the similarity of the two nodes i and j in terms of attributes.
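The entropy increment of claim 8 and the selection in step 23) can be illustrated with the short Python sketch below. It relies on the pairwise definition of $H(V_m)$ in claim 1: adding node i to a community adds exactly one term for each existing member j, so the increment is computed here as the sum of those terms. The sign follows the formula as printed, the convention 0 ln 0 = 0 and similarities in [0, 1] are assumptions, and the names `pair_term`, `entropy_increment` and `best_community` are hypothetical.

```python
import math
from typing import Callable, Dict, Hashable, Iterable


def pair_term(s: float) -> float:
    """s^2 ln s^2 + (1 - s^2) ln(1 - s^2), with 0 ln 0 taken as 0 (assumption)."""
    s2 = s * s
    total = 0.0
    for p in (s2, 1.0 - s2):
        if p > 0.0:
            total += p * math.log(p)
    return total


def entropy_increment(
    node: Hashable,
    members: Iterable[Hashable],
    sim: Callable[[Hashable, Hashable], float],
) -> float:
    """Change in H(V_m) when `node` joins a community whose current members are given."""
    return sum(pair_term(sim(node, j)) for j in members if j != node)


def best_community(
    node: Hashable,
    candidates: Dict[Hashable, Iterable[Hashable]],
    sim: Callable[[Hashable, Hashable], float],
) -> Hashable:
    """Label of the candidate community with the smallest entropy increment."""
    return min(
        candidates,
        key=lambda label: entropy_increment(node, candidates[label], sim),
    )


# Toy similarities: node "x" vs. members of two candidate communities.
table = {("x", "a"): 1.0, ("x", "b"): 1.0, ("x", "c"): 1.0 / math.sqrt(2.0)}


def sim(u: Hashable, v: Hashable) -> float:
    return table.get((u, v), table.get((v, u), 0.0))


inc_ab = entropy_increment("x", ["a", "b"], sim)
inc_c = entropy_increment("x", ["c"], sim)
chosen = best_community("x", {"A": ["a", "b"], "C": ["c"]}, sim)
```

With these toy values, the increment for the community {a, b} is zero because both pair terms vanish at similarity 1, while the increment for {c} is ln 0.5.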
9. A community discovery system, comprising:
a community division module, configured to perform community division on a plurality of nodes in a network based on modularity maximization, wherein the modularity is calculated using the following formula:
$$Q = \frac{1}{2m} \sum_{i=1}^{i_{\max}} \sum_{j=1}^{j_{\max}} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j),$$
where $A_{ij}$ denotes the weight of the edge between node i and node j, $i_{\max}$ is the maximum value of i, $j_{\max}$ is the maximum value of j, $k_i$ denotes the sum of the weights of all edges connected to node i, $c_i$ denotes the community in which node i is located, $\delta(c_i, c_j)$ is 1 when $c_i$ and $c_j$ are equal and 0 otherwise, and $m$ denotes the sum of the weights of all edges in the network;
a community adjustment module, configured to adjust, based on community attribute entropy minimization, the community boundary nodes obtained from the community division module; if the community division obtained after the adjustment satisfies a predetermined condition, that community division is taken as the final community division; otherwise, the communities obtained after the adjustment are taken as nodes, the community division module performs community division on those nodes again, and the community adjustment module readjusts the community boundary nodes;
wherein adjusting, based on community attribute entropy minimization, the community boundary nodes obtained from the community division module comprises:
randomly selecting a community boundary node; calculating, for each neighbor community, the community attribute entropy increment of that neighbor community produced by moving the community boundary node from the community in which it is located to that neighbor community; selecting the neighbor community with the smallest community attribute entropy increment and judging whether that neighbor community is the community in which the community boundary node is located; if not, moving the community boundary node from the community in which it is located to that neighbor community; and, if the whole community attribute entropy changes between before and after the movement of the community boundary node, re-executing the above adjustment process;
wherein the whole community attribute entropy is calculated according to the following formula:
$$H = \sum_{m=1}^{K} H(V_m),$$
where K denotes the number of communities, $H(V_m)$ denotes the attribute entropy of community $V_m$, and
$$H(V_m) = \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \left( s_{i,j}^2 \ln s_{i,j}^2 + (1 - s_{i,j}^2) \ln (1 - s_{i,j}^2) \right),$$
where M denotes the number of nodes contained in community $V_m$, and $s_{i,j}$ denotes the similarity of the two nodes i and j in terms of attributes.
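The whole community attribute entropy defined above can be evaluated with a few lines of Python. The sketch below is illustrative only. The claims do not specify how the attribute similarity $s_{i,j}$ is computed, so a Jaccard similarity over attribute sets is used here purely as an assumption, as is the convention 0 ln 0 = 0. The names `jaccard`, `community_entropy` and `whole_entropy` are hypothetical, and the sign follows the formula as printed.

```python
import math
from itertools import combinations
from typing import Callable, Dict, Hashable, Iterable, Set


def pair_term(s: float) -> float:
    """s^2 ln s^2 + (1 - s^2) ln(1 - s^2), with 0 ln 0 taken as 0 (assumption)."""
    s2 = s * s
    return sum(p * math.log(p) for p in (s2, 1.0 - s2) if p > 0.0)


def community_entropy(
    members: Iterable[Hashable], sim: Callable[[Hashable, Hashable], float]
) -> float:
    """H(V_m): sum of the pair term over all unordered node pairs in the community."""
    return sum(pair_term(sim(i, j)) for i, j in combinations(list(members), 2))


def whole_entropy(
    partition: Dict[Hashable, Iterable[Hashable]],
    sim: Callable[[Hashable, Hashable], float],
) -> float:
    """H: sum of H(V_m) over all K communities."""
    return sum(community_entropy(members, sim) for members in partition.values())


# Hypothetical node attributes; Jaccard similarity is an assumption, not from the claims.
attrs: Dict[str, Set[str]] = {"a": {"x", "y"}, "b": {"x"}, "c": {"z"}}


def jaccard(u: str, v: str) -> float:
    union = attrs[u] | attrs[v]
    return len(attrs[u] & attrs[v]) / len(union) if union else 0.0


h_total = whole_entropy({"V1": ["a", "b"], "V2": ["c"]}, jaccard)
```

Here the singleton community contributes nothing because it contains no node pairs, so the total equals the single pair term for nodes a and b, whose similarity is 0.5.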
CN201310201298.8A 2012-11-02 2013-05-27 A kind of community discovery method and system Expired - Fee Related CN103325061B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310201298.8A CN103325061B (en) 2012-11-02 2013-05-27 A kind of community discovery method and system

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201210433720 2012-11-02
CN2012104337208 2012-11-02
CN201210433720.8 2012-11-02
CN201310201298.8A CN103325061B (en) 2012-11-02 2013-05-27 A kind of community discovery method and system

Publications (2)

Publication Number Publication Date
CN103325061A CN103325061A (en) 2013-09-25
CN103325061B true CN103325061B (en) 2017-04-05

Family

ID=49193785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310201298.8A Expired - Fee Related CN103325061B (en) 2012-11-02 2013-05-27 A kind of community discovery method and system

Country Status (1)

Country Link
CN (1) CN103325061B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170405

Termination date: 20190527