CN102722530B - Community detection method in complex network - Google Patents

Community detection method in complex network

Info

Publication number
CN102722530B
CN102722530B CN201210154812.2A CN201210154812A CN102722530B CN 102722530 B CN102722530 B CN 102722530B CN 201210154812 A CN201210154812 A CN 201210154812A CN 102722530 B CN102722530 B CN 102722530B
Authority
CN
China
Prior art keywords
node
network
corporations
value
variation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210154812.2A
Other languages
Chinese (zh)
Other versions
CN102722530A (en)
Inventor
李侃
庞垠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-05-17
Filing date
2012-05-17
Publication date
2014-04-16
Application filed by Beijing Institute of Technology BIT
Priority to CN201210154812.2A
Publication of CN102722530A
Application granted
Publication of CN102722530B
Legal status: Expired - Fee Related

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a community detection method for complex networks. The method comprises the following steps. One: the number of communities m in the complex network is given in advance; the network is assumed to have n nodes, and initially each node is its own community, so the network starts with n communities. Two: for every pair of communities in the network, the merge is evaluated, and the change in the Q value is obtained by subtracting the Q value before merging from the Q value after merging. Three: the merge that yields the greatest change in the Q value is carried out, which reduces the number of communities in the network by one. Four: steps two and three are repeated until m communities remain, which gives the final result of community detection in the complex network. With the method provided by the invention, communities can be found effectively in a complex network without knowing whether the network is unipartite or bipartite.

Description

A community detection method in a complex network
Technical field
The present invention relates to a community detection method in a complex network and belongs to the field of network technology.
Background art
Complex networks such as social networks and biological networks are dynamic objects that grow and change quickly over time. Such networks can be divided into several communities. A "community", also called a "group", is a set of nodes that share the same attributes or play similar roles in the graph. Community detection in a complex network means finding which community each node belongs to, that is, clustering the nodes. It is key to understanding the topology and function of complex networks, discovering hidden patterns, predicting links, and detecting evolution.
A popular method for finding community structure is the modularity matrix method proposed by Newman et al., which is based on spectral clustering. It has been shown that, when the structure type of the network is known, this modularity model can find the community structure of unipartite and bipartite networks through the largest eigenvalue and the smallest eigenvalue respectively. Michael J. Barber and other scholars have also developed methods for detecting communities in bipartite networks. The BRIM algorithm proposed by Barber and his colleagues can detect the number of communities in a bipartite network. In addition, Barber and Clark used the label propagation algorithm (LPA) to identify community structure. However, these methods do not work when the network type is unknown.
In some cases, researchers do not know the structure type of the network. For example, we may know the interactions between nodes in a protein network without knowing the type of the network. As another example, consider a network formed by the relationships among people in a school cafeteria: its type is uncertain. If links exist only among students, the network is unipartite; if links exist between students and teachers, the network is bipartite. Therefore, a community detection method is needed that works effectively for both unipartite and bipartite networks.
Summary of the invention
The object of the invention is to provide a community detection method in a complex network that does not need to know in advance whether the network is unipartite or bipartite.
The object of the invention is achieved through the following technical solution:
A community detection method in a complex network, comprising the following steps:
One: the number of communities m in the complex network is given in advance. The network is assumed to have n nodes; initially there are n communities, each node in the network being its own community.
Two: for any two communities in the network, evaluate their merge by calculating
$$Q = \frac{1}{2R} \sum_{i,j \in \text{same community}} \left[ N_{ij} - \frac{2 c_i c_j}{R} \right]$$
and subtract the Q value before merging from the Q value after merging to obtain the change in the Q value.
Here node i and node j are any two nodes in the network that belong to the same community. $N_{ij} = |\Gamma_i \cap \Gamma_j|$ and $N_{ii} = 0$, where $\Gamma_x$ denotes the neighbors of node x, that is, the set of nodes directly connected to node x by an edge, and $|\Gamma_i \cap \Gamma_j|$ is the number of common neighbors of node i and node j, that is, the number of nodes connected by an edge to both node i and node j. R is the number of common neighbors in the network,
$$R = \frac{1}{2} \sum_{a,b} N_{ab}$$
where node a and node b are any two nodes in the network.
The adjacency matrix A is defined as a symmetric matrix: if an edge connects node i and node j, then $A_{ij} = 1$; otherwise $A_{ij} = 0$. Let $a_i$ denote the i-th column vector of A, so that A can also be written $A = [a_1, a_2, \ldots, a_n]$. Node k is a common neighbor of node i and node j if and only if $A_{ik} A_{kj} = 1$. Let $a_i \cdot a_i = k_i$, the degree of node i; the common-neighbor index is $c_x = k_x (k_x - 1)/2$.
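The quantities defined in step two can be computed directly from neighbor sets. The following is a minimal Python sketch and not part of the patent text: the function name `common_neighbor_stats`, the dictionary-of-sets graph representation, and the small star-shaped graph are assumptions chosen for illustration.

```python
from itertools import combinations

def common_neighbor_stats(adj):
    """adj maps each node to the set of its neighbours (undirected graph)."""
    nodes = sorted(adj)
    # N[(i, j)] = |Gamma_i ∩ Gamma_j| for each unordered pair; N_ii is 0 by definition.
    N = {(i, j): len(adj[i] & adj[j]) for i, j in combinations(nodes, 2)}
    k = {x: len(adj[x]) for x in nodes}               # degree k_x
    c = {x: k[x] * (k[x] - 1) // 2 for x in nodes}    # common-neighbour index c_x
    R = sum(N.values())                               # half the sum over ordered pairs
    return N, k, c, R

# Hypothetical star graph: hub "h" linked to the leaves "x", "y" and "z".
adj = {"h": {"x", "y", "z"}, "x": {"h"}, "y": {"h"}, "z": {"h"}}
N, k, c, R = common_neighbor_stats(adj)
print(N[("x", "y")], c["h"], R)  # 1 3 3
```

In this example the hub is the only common neighbor, so each of the three leaf pairs contributes one to R.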
Three: select and carry out the merge that gives the largest change in the Q value; the number of communities in the network then decreases by one. If the merge that maximizes the change in the Q value is not unique, carry out each such merge separately and keep the result of each scheme.
Four: repeat steps two and three until m communities remain in the network. If several schemes remain, compare their Q values and select the scheme with the largest Q value; if several schemes still tie, all of them are taken as the final scheme. This gives the final result of community detection in the complex network.
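Steps one to four can be sketched as a greedy agglomerative loop. The Python sketch below is an illustration under stated assumptions, not the patented implementation: the name `detect_communities`, the dictionary-of-sets graph, and the test graph (two triangles joined by one edge) are hypothetical, and ties are broken by taking the first maximal merge instead of exploring every tied scheme as step three describes. Exact fractions are used so that ties are not disturbed by rounding. The sketch assumes the network contains at least one pair of nodes with a common neighbor, since otherwise R is zero.

```python
from fractions import Fraction
from itertools import combinations

def detect_communities(adj, m):
    """Merge communities greedily until m remain; adj maps node -> neighbour set."""
    c = {x: len(adj[x]) * (len(adj[x]) - 1) // 2 for x in adj}
    R = sum(len(adj[i] & adj[j]) for i, j in combinations(adj, 2))

    def delta_q(X, Y):
        # Merging X and Y adds only the cross terms (i in X, j in Y), counted
        # twice as ordered pairs, so the 1/(2R) prefactor becomes 1/R.
        cross = sum(len(adj[i] & adj[j]) - Fraction(2 * c[i] * c[j], R)
                    for i in X for j in Y)
        return cross / R

    communities = [frozenset([x]) for x in sorted(adj)]    # step one
    while len(communities) > m:                            # step four
        X, Y = max(combinations(communities, 2),           # steps two and three
                   key=lambda pair: delta_q(*pair))
        communities = [C for C in communities if C not in (X, Y)] + [X | Y]
    return communities

# Hypothetical unipartite graph: two triangles joined by the edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
result = detect_communities(adj, 2)
print(sorted(sorted(C) for C in result))  # [[1, 2, 3], [4, 5, 6]]
```

Evaluating every pair in every round is quadratic in the number of communities; the sketch favors readability over speed.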
The principle behind community detection in the invention is as follows: edges connect in different ways in different types of network structure, but nodes within the same community are similar. The invention therefore shifts attention from edges to nodes and performs community detection from the perspective of nodes.
1. Unipartite network
For a unipartite network, the basic community detection rule is "edges inside a community are dense, and edges between communities are sparse". Let the number of edges connecting nodes in different communities be $A_{out}$:
$$A_{out} = \sum_{i,j \notin \text{same community}} A_{ij}$$
The main task of community detection is to minimize $A_{out}$, denoted $\min(A_{out})$.
Suppose node i and node j are two nodes in different communities. If node i is connected to node j by an edge, then node j can form node pairs with $k_i - 1$ nodes, with node i as the common neighbor ($k_i$ is the degree of node i). A node pair is defined as follows: for two nodes x and y, if there is a node z that is a neighbor of both x and y, then x and y form a node pair. In a unipartite network, almost all neighbors of a node are in the same community as that node, so most neighbors of node i should be in the same community as node i, as shown in Fig. 1. If node i is not connected to node j by an edge, node j cannot form a node pair with any node with node i as the common neighbor.
Therefore, the number of node pairs that can be formed with nodes in different communities is
$$N_{out} = \sum_{i,j \notin \text{same community}} A_{ij} (k_i - 1)$$
The above formula can be rewritten as:
$$N_{out} = \sum_{i,j \notin \text{same community},\ i<j} A_{ij} (k_i + k_j - 2)$$
For two nodes i and j in different communities, $N_{out}$ depends on $A_{ij}$, $k_i$ and $k_j$. If there is an edge between node i and node j, $A_{out}$ increases by 1 and $N_{out}$ increases by $k_i + k_j - 2$. If $k_i + k_j - 2 \geq 0$ (which holds for every edge, since both endpoints have degree at least 1), $A_{out}$ and $N_{out}$ are considered to have the same growth trend, which means that obtaining $\min(N_{out})$ is equivalent to obtaining $\min(A_{out})$.
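To make the relation between the two counts concrete, the sketch below evaluates $A_{out}$ and $N_{out}$ for two partitions of a small graph. It is a hedged illustration: the function name `boundary_counts`, the label dictionaries, and the graph (two triangles joined by one edge) are assumptions and do not come from the patent.

```python
def boundary_counts(adj, community):
    """Return (A_out, N_out); community maps each node to a community label."""
    a_out = n_out = 0
    for i in adj:
        for j in adj[i]:
            # Visit each edge once (i < j) and keep only edges between communities.
            if i < j and community[i] != community[j]:
                a_out += 1
                n_out += len(adj[i]) + len(adj[j]) - 2   # k_i + k_j - 2
    return a_out, n_out

# Hypothetical unipartite graph: two triangles joined by the edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
good = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
poor = {1: "A", 2: "A", 4: "A", 3: "B", 5: "B", 6: "B"}
print(boundary_counts(adj, good))  # (1, 4)
print(boundary_counts(adj, poor))  # (5, 16)
```

The partition that cuts fewer edges also yields the smaller $N_{out}$, which matches the claim that the two counts rise and fall together.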
2. Bipartite network
For a bipartite network, the basic community detection rule is "edges inside a community are sparse, and edges between communities are dense". The main task of community detection is therefore to maximize $A_{out}$, denoted $\max(A_{out})$.
In a bipartite network, almost all neighbors of a node are in a different community from that node, so for a group of nodes in the same community, the common neighbors should be in a different community, as shown in Fig. 2.
For node a and node b in the same community, the nodes that can form node pairs with node b are essentially all in the same community as node b. Therefore, the number $N_{in}$ of node pairs that can be formed with nodes in the same community satisfies
$$2 N_{in} = \sum_{i,j \notin \text{same community}} A_{ij} (k_i + k_j - 2)$$
The above formula can be rewritten as
$$N_{in} = \sum_{i,j \notin \text{same community},\ i<j} A_{ij} (k_i + k_j - 2)$$
$A_{out}$ and $N_{in}$ have the same growth trend, which means that obtaining $\max(N_{in})$ is equivalent to obtaining $\max(A_{out})$. Moreover, $\max(N_{in})$ and $\min(N_{out})$ are equivalent. Therefore, by computing $N_{in}$ or $N_{out}$ from the perspective of node pairs that share a common neighbor, the same method can be used for community detection without distinguishing the network type.
For the random block model, Newman et al. proved that in a reasonable community division of a network, "the number of edges within communities is larger than expected". Similarly, in a good division of a network, the number of common neighbors within communities should also be larger than expected.
Therefore, the invention defines the model for complex network community detection as
Q = (number of common neighbors within communities - expected number of common neighbors).
This formula scores a division of the network: the larger its value, the stronger the community structure. We construct a random network whose nodes have the same common-neighbor degree as the nodes of the complex network, and we take the expected number of common neighbors in the random network as the expected number of common neighbors in the complex network. Both $N_{out}$ in a unipartite network and $N_{in}$ in a bipartite network are affected by the edges and by the node degrees at the same time. The probability that a random node becomes a common neighbor of a particular node i depends only on the expected common-neighbor degree $c_i$. For two nodes, these probabilities are independent of each other. This means that the expected number $P_{ij}$ of common neighbors of node i and node j is $f(c_i) f(c_j)$, where $f(\cdot)$ is a function of the common-neighbor degree. Because $P_{ij}$ is symmetric, $f(c_i) = C c_i$ for some constant C, and
$$\sum_{i,j \in n} P_{ij} = \sum_i f(c_i) \sum_j f(c_j) = C^2 R^2$$
The nodes of the random network have the same common-neighbor degree as the nodes of the complex network, so
$$\sum_{i,j \in n} P_{ij} = \sum_{i,j \in n} N_{ij} = 2R$$
Hence
$$C = \sqrt{\frac{2}{R}}$$
$$f(c_i) = \sqrt{\frac{2}{R}}\, c_i \qquad (8)$$
Therefore, the expected number of common neighbors of the node pair (i, j) is
$$P_{ij} = f(c_i) f(c_j) = \frac{2 c_i c_j}{R}$$
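The derivation relies on the sum of the indices $c_i$ being equal to R: a node of degree $k_x$ is the common neighbor of exactly $k_x(k_x - 1)/2$ node pairs, so adding up all $c_x$ counts every pair once per common neighbor. The sketch below checks this identity and the normalization of $P_{ij}$ numerically. The graph is a hypothetical example (two triangles joined by one edge), and the sum is taken over all ordered pairs including i = j, which is an assumed reading of the sum over $i,j \in n$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical graph: two triangles joined by the edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}

c = {x: len(nb) * (len(nb) - 1) // 2 for x, nb in adj.items()}   # c_x
R = sum(len(adj[i] & adj[j]) for i, j in combinations(adj, 2))   # common-neighbour total

# Expected common-neighbour count for every ordered pair (i, j), i == j included.
P = {(i, j): Fraction(2 * c[i] * c[j], R) for i in adj for j in adj}

print(sum(c.values()), R)  # 10 10
print(sum(P.values()))     # 20
```

The second printed value equals 2R, as the normalization condition requires.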
Therefore, the model used by the invention for complex network community detection is
$$Q = \frac{1}{2R} \sum_{i,j \in \text{same community}} \left[ N_{ij} - \frac{2 c_i c_j}{R} \right]$$
where $N_{ij} = |\Gamma_i \cap \Gamma_j|$ and $N_{ii} = 0$; $\Gamma_x$ denotes the neighbors of node x, that is, the set of nodes directly connected to node x by an edge; $|\Gamma_i \cap \Gamma_j|$ is the number of common neighbors of node i and node j, that is, the number of nodes connected by an edge to both node i and node j; R is the number of common neighbors in the network,
$$R = \frac{1}{2} \sum_{a,b} N_{ab}$$
and $c_i = k_i (k_i - 1)/2$, where $k_i$ is the degree of node i.
This model is applicable to both unipartite and bipartite networks.
Beneficial effect
The method of the invention can effectively find the communities in a complex network without knowing in advance whether the network is unipartite or bipartite.
Brief description of the drawings
Fig. 1 is a schematic diagram of two nodes in different communities in a unipartite network.
Fig. 2 is a schematic diagram of two nodes in the same community in a bipartite network.
Fig. 3 is a bipartite network with two communities, in which circular nodes represent one community and square nodes represent the other.
Fig. 4 is a unipartite network with two communities, in which circular nodes represent one community and square nodes represent the other.
Detailed description of the embodiments
Preferred embodiments of the invention are described below with reference to Fig. 3 and Fig. 4.
Embodiment 1:
Fig. 3 shows a bipartite network with two communities, in which circular nodes represent one community and square nodes represent the other.
The final number of communities is set to 2 in advance. Initially each node is its own community, giving 6 communities. Table 1 records the degree of each node in the network, and Table 2 gives the common-neighbor index of each node.
Table 3 gives the number of common neighbors between each pair of nodes in the network.
Table 1. Degree of each node in the network
Node  Degree
a     3
b     1
c     1
d     3
e     2
f     2
Table 2. Common-neighbor index of each node in the network
Node  c_i
a     3
b     0
c     0
d     3
e     1
f     1
Table 3. Number of common neighbors between each pair of nodes in the network
    a  b  c  d  e  f
a   0  1  1  0  1  1
b   1  0  1  1  0  0
c   1  1  0  1  0  0
d   0  1  1  0  1  1
e   1  0  0  1  0  1
f   1  0  0  1  1  0
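Table 2 follows from Table 1 through $c_i = k_i (k_i - 1)/2$. The short check below copies the degrees from Table 1; the variable names are chosen here for illustration.

```python
# Degrees copied from Table 1 of Embodiment 1.
degree = {"a": 3, "b": 1, "c": 1, "d": 3, "e": 2, "f": 2}

# Common-neighbour index c_i = k_i (k_i - 1) / 2 for every node.
index = {node: k * (k - 1) // 2 for node, k in degree.items()}
print(index)  # {'a': 3, 'b': 0, 'c': 0, 'd': 3, 'e': 1, 'f': 1}
```

The printed values agree with Table 2.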
The communities in the network are merged pairwise and the change in the Q value is calculated. First, the degree of each node, the common-neighbor index of each node, and the number of common neighbors between nodes are calculated.
According to the formula
$$Q = \frac{1}{2R} \sum_{i,j \in \text{same community}} \left[ N_{ij} - \frac{2 c_i c_j}{R} \right]$$
the changes in the Q value after each pairwise merge are as follows:
Table 4. Change in the Q value after merging community i and community j
          a        b        c        d        e        f
a            -0.0469  -0.0469  -0.1406  -0.0313  -0.0313
b                      0.0469   0.0156  -0.0313  -0.0313
c                              -0.0156  -0.0313  -0.0313
d                                       -0.0313  -0.0313
e                                                 0.0000
f
Analysis: from Table 4, merging communities b and c gives the largest change in the Q value, 0.0469.
After merging, b and c are treated as one community bc; pairwise merges between communities are then evaluated again and the changes in the Q value are calculated.
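The selection in step three amounts to finding the largest entry of Table 4 and checking whether it is unique. The sketch below copies the values from Table 4; the dictionary layout and variable names are assumptions made for illustration.

```python
# Change in Q for each candidate merge, copied from Table 4.
table4 = {
    ("a", "b"): -0.0469, ("a", "c"): -0.0469, ("a", "d"): -0.1406,
    ("a", "e"): -0.0313, ("a", "f"): -0.0313,
    ("b", "c"): 0.0469, ("b", "d"): 0.0156, ("b", "e"): -0.0313, ("b", "f"): -0.0313,
    ("c", "d"): -0.0156, ("c", "e"): -0.0313, ("c", "f"): -0.0313,
    ("d", "e"): -0.0313, ("d", "f"): -0.0313,
    ("e", "f"): 0.0,
}

best = max(table4.values())
# Every merge that attains the maximum; more than one entry would mean a tie.
winners = [pair for pair, change in table4.items() if change == best]
print(best, winners)  # 0.0469 [('b', 'c')]
```

A single winner means no branching into alternative schemes is needed at this step.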
Table 5. Change in the Q value after merging community i and community j
            a       bc        d        e        f
a              -0.0938  -0.1406  -0.0313  -0.0313
bc                       0.0313  -0.0625  -0.0625
d                                -0.0313  -0.0313
e                                          0.0000
f
Analysis: Table 5 shows that merging bc and d gives the largest change in the Q value, so bc and d are merged.
bc and d are merged into one community bcd; pairwise merges between communities are then evaluated again, and the changes in the Q value are given in the following table:
Table 6. Change in the Q value after merging community i and community j
            a      bcd        e        f
a              -0.2344  -0.0313  -0.0313
bcd                     -0.0938  -0.0938
e                                 0.0000
f
Analysis: Table 6 shows that merging e and f gives the largest change in the Q value, so e and f are merged into ef.
Pairwise merges are then evaluated again and the changes in the Q value are calculated:
Table 7. Change in the Q value after merging community i and community j
            a      bcd       ef
a              -0.2344  -0.0625
bcd                     -0.1875
ef
Analysis: Table 7 has a single maximum, namely merging a and ef into aef. Only 2 communities then remain in the network, which matches the preset number of communities, so the procedure ends.
The final community division is: bcd is one community and aef is the other.
Embodiment 2:
Fig. 4 shows a unipartite network with two communities, in which circular nodes represent one community and square nodes represent the other.
The final number of communities is set to 2. Initially each node is its own community, giving 7 communities. Table 8 records the degree of each node in the network, and Table 9 gives the common-neighbor index of each node.
Table 10 gives the number of common neighbors between each pair of nodes in the network.
Table 8. Degree of each node in the network
Node  Degree
a     1
b     1
c     1
d     4
e     3
f     1
g     1
Table 9. Common-neighbor index of each node in the network
Node  c_i
a     0
b     0
c     0
d     6
e     3
f     0
g     0
Table 10. Number of common neighbors between each pair of nodes
    a  b  c  d  e  f  g
a   0  1  1  0  1  0  0
b   1  0  1  0  1  0  0
c   1  1  0  1  1  0  0
d   0  0  1  0  0  1  1
e   1  1  1  0  0  0  0
f   0  0  0  1  0  0  1
g   0  0  0  1  0  1  0
The communities in the network are merged pairwise and the change in the Q value is calculated. First, the degree of each node and the number of common neighbors between nodes are calculated.
Table 11. Change in the Q value after merging community i and community j
         a       b       c       d       e       f       g
a             0.15    0.05       0    0.05       0       0
b                     0.05       0    0.05       0       0
c                             0.05    0.05       0       0
d                                   -0.018    0.05    0.05
e                                                0       0
f                                                     0.05
g
Analysis: from Table 11, merging communities a and b gives the largest change in the Q value, 0.15.
After merging, a and b are treated as one community ab; pairwise merges between communities are then evaluated again and the changes in the Q value are calculated.
Table 12. Change in the Q value after merging community i and community j
           ab       c       d       e       f       g
ab               0.10       0    0.05       0       0
c                        0.05    0.05       0       0
d                              -0.018    0.05       0
e                                           0    0.05
f                                                0.05
g
Analysis: from Table 12, merging communities ab and c gives the largest change in the Q value, 0.10.
After merging, ab and c are treated as one community abc; pairwise merges between communities are then evaluated again and the changes in the Q value are calculated.
Table 13. Change in the Q value after merging community i and community j
          abc       d       e       f       g
abc              0.15    0.10       0       0
d                      -0.018    0.05       0
e                                   0    0.05
f                                        0.05
g
Analysis: from Table 13, merging communities abc and d gives the largest change in the Q value, 0.15.
After merging, abc and d are treated as one community abcd; pairwise merges between communities are then evaluated again and the changes in the Q value are calculated.
Table 14. Change in the Q value after merging community i and community j
         abcd       e       f       g
abcd                0   -0.03    0.02
e                           0       0
f                                0.05
g
Analysis: from Table 14, merging communities f and g gives the largest change in the Q value, 0.05.
After merging, f and g are treated as one community fg; pairwise merges between communities are then evaluated again and the changes in the Q value are calculated.
Table 15. Change in the Q value after merging community i and community j
         abcd       e      fg
abcd                0   -0.03
e                        0.10
fg
Analysis: Table 15 has a single maximum, namely merging e and fg into efg. Only 2 communities then remain in the network, which matches the preset number of communities, so the procedure ends.
The final community division is: abcd is one community and efg is the other.
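The merge order reported in the analyses of Tables 11 to 15 can be replayed to confirm the final division. The helper name `replay_merges` and the string labels are assumptions made for this sketch; the merge list itself is taken from the analyses above.

```python
def replay_merges(nodes, merges):
    """Apply the merges in order; each community is labelled by its concatenated node names."""
    communities = list(nodes)
    for left, right in merges:
        communities.remove(left)
        communities.remove(right)
        communities.append(left + right)
    return communities

# Merge order taken from the analyses of Tables 11 to 15.
merges = [("a", "b"), ("ab", "c"), ("abc", "d"), ("f", "g"), ("e", "fg")]
final = replay_merges("abcdefg", merges)
print(final)  # ['abcd', 'efg']
```

Seven starting communities and five merges leave the two communities stated in the result.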
The invention is not limited to the above embodiments; any design that uses the design concept of the invention with simple changes shall fall within the scope of protection of the invention.

Claims (1)

1. A community detection method in a complex network, comprising the following steps:
One: the number of communities m in the complex network is given in advance. The network is assumed to have n nodes; initially there are n communities, each node in the network being its own community.
Two: for any two communities in the network, evaluate their merge by calculating
$$Q = \frac{1}{2R} \sum_{i,j \in \text{same community}} \left[ N_{ij} - \frac{2 c_i c_j}{R} \right]$$
and subtract the Q value before merging from the Q value after merging to obtain the change in the Q value.
Here node i and node j are any two nodes in the network that belong to the same community. $N_{ij} = |\Gamma_i \cap \Gamma_j|$ and $N_{ii} = 0$, where $\Gamma_x$ denotes the neighbors of node x, that is, the set of nodes directly connected to node x by an edge, and $|\Gamma_i \cap \Gamma_j|$ is the number of common neighbors of node i and node j, that is, the number of nodes connected by an edge to both node i and node j. R is the number of common neighbors in the network, $R = \frac{1}{2} \sum_{a,b} N_{ab}$, where node a and node b are any two nodes in the network. The adjacency matrix A is defined as a symmetric matrix: if an edge connects node i and node j, then $A_{ij} = 1$; otherwise $A_{ij} = 0$. Let $a_i$ denote the i-th column vector of A, so that A can also be written $A = [a_1, a_2, \ldots, a_n]$. Node k is a common neighbor of node i and node j if and only if $A_{ik} A_{kj} = 1$. Let $a_i \cdot a_i = k_i$, the degree of node i; the common-neighbor index is $c_x = k_x (k_x - 1)/2$.
Three: select and carry out the merge that gives the largest change in the Q value; the number of communities in the network then decreases by one. If the merge that maximizes the change in the Q value is not unique, carry out each such merge separately and keep the result of each scheme.
Four: repeat steps two and three until m communities remain in the network. If several schemes remain, compare their Q values and select the scheme with the largest Q value; if several schemes still tie, all of them are taken as the final scheme. This gives the final result of community detection in the complex network.
CN201210154812.2A 2012-05-17 2012-05-17 Community detection method in complex network Expired - Fee Related CN102722530B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210154812.2A CN102722530B (en) 2012-05-17 2012-05-17 Community detection method in complex network

Publications (2)

Publication Number Publication Date
CN102722530A (en) 2012-10-10
CN102722530B (en) 2014-04-16

Family

ID=46948291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210154812.2A Expired - Fee Related CN102722530B (en) 2012-05-17 2012-05-17 Community detection method in complex network

Country Status (1)

Country Link
CN (1) CN102722530B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140416

Termination date: 20190517