CN116308860A - Dynamic community detection method based on allocation and splitting - Google Patents


Publication number
CN116308860A
Authority
CN
China
Prior art keywords
node
community
network
communities
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310277594.XA
Other languages
Chinese (zh)
Other versions
CN116308860B (en)
Inventor
姜万昌
张晓茜
刘丹妮
王圣达
郭健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeast Electric Power University
Original Assignee
Northeast Dianli University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeast Dianli University
Priority to CN202310277594.XA
Publication of CN116308860A
Application granted
Publication of CN116308860B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01: Social networking
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The method detects the community structure of the first snapshot network by considering the closeness among nodes. In each subsequent snapshot network, it allocates communities to the active nodes that reflect network changes and defines edge-increasing nodes to account for the influence of new edges on detection accuracy. It then splits the allocated community structure into local communities built from the edge-increasing nodes and single-node communities built from the remaining nodes, and merges the split communities by modularity gain to obtain the final community structure. The method detects community structures of higher quality; compared with other methods, it reduces the error accumulation of incremental methods and obtains the highest Q value on the last snapshot of each real dynamic network.

Description

Dynamic community detection method based on allocation and splitting
Technical Field
The invention relates to the technical field of complex networks, and in particular to a dynamic community detection method based on allocation and splitting.
Background
Community structure detection is a fundamental problem in the study of complex networks such as social, biological and traffic networks. Community detection in social networks can mine the trending topics of users, which supports the control of online public opinion. Because nodes in a dynamic social network interact frequently, different evolution behaviors influence the community structure. Researchers have therefore proposed many dynamic community detection methods; the widely used ones fall into static detection methods, optimization-based detection methods, and increment-based detection methods.
Static detection methods detect the communities of each snapshot network of the dynamic network independently, so they ignore the valuable information contained in the community structure of the previous snapshot network. To take the historical community structure into account, optimization-based detection methods convert community detection into a single-objective or multi-objective optimization problem. Although such methods consider historical community information, iteratively optimizing the objective is time-consuming. Increment-based methods only need to consider the nodes and edges that changed in the network, which greatly improves the efficiency of community detection in dynamic networks and keeps community evolution smooth. However, as the network evolves, erroneous historical information accumulates, so the accuracy of incremental community detection gradually decreases.
Disclosure of Invention
The invention provides a dynamic community detection method based on allocation and splitting, which aims to solve the problem that existing increment-based dynamic community detection methods are prone to error accumulation caused by the initial network communities and by the incremental detection process.
A dynamic community detection method based on allocation and splitting is realized by the following steps:
Step one: a dynamic network is represented by a network sequence $G = \{G_0, G_1, \ldots, G_t, \ldots, G_T\}$, where $G_t = (V_t, E_t)$ is the snapshot network at time $t$, $t = 0, 1, \ldots, T$;
$V_t = \{u_t \mid u = 1, 2, \ldots, n\}$ is the set of the $n$ nodes of snapshot network $G_t$;
$E_t$ is the set of the $m$ edges of snapshot network $G_t$, each edge running from a node $u_t$ to a node $v_t$ with $u_t, v_t \in V_t$;
$CS = \{CS_0, CS_1, \ldots, CS_t, \ldots, CS_T\}$ is the community structure of the network sequence $G$, where $CS_t$ is the set of the $k$ communities of snapshot network $G_t$;
Step two: a static community division method based on common-neighbor clustering-entropy node similarity is used to detect the communities of snapshot network $G_0$ automatically, giving the community structure $CS_0$; for each later snapshot, the nodes that disappear from $G_{t-1}$ to $G_t$ are removed from $CS_{t-1}$, and the result is taken as the initial community structure of $G_t$;
Step three: determine the active node set; for each node $u_t$, if the number of edges connecting $u_t$ to nodes outside its community minus the number of its edges inside that community is greater than zero, $u_t$ is called an active node, and all such nodes form the active node set;
Step four: identify the active node set in $G_t$; each active node in the set is removed from the initial community structure and assigned a community, giving the allocated community structure;
Step five: merge and optimize the community structure obtained in step four to obtain the first-stage community structure;
Step six: determine the edge-increasing node set;
for each node $u_{t-1} \in V_{t-1}$, if the node changes from $G_{t-1}$ to $G_t$, node $u_t$ is called a change node, and the change node set is $V_C = \{u_t \mid d(u_{t-1}) \neq d(u_t)\}$, where $d(u_{t-1})$ and $d(u_t)$ are the degree values of nodes $u_{t-1}$ and $u_t$ respectively;
if the edge between $u_t$ and $v_t$ is a new edge, that is, an edge added from $G_{t-1}$ to $G_t$, and nodes $u_t$ and $v_t$ both belong to the change node set $V_C$ and to the same community, then $u_t$ is called an edge-increasing node; all such nodes form the edge-increasing node set;
Step seven: determine local communities;
an edge-increasing node $u_t$ is merged with those of its neighbor nodes that belong to the same community and are themselves in the edge-increasing node set; the result is the local community of $u_t$ at time $t$;
Step eight: determine the single-node community $C_t\_Sig(u_t)$;
for every node $u_t$ of a community that is not an edge-increasing node, a community containing only that node is created, that is, $C_t\_Sig(u_t) = \{u_t\}$;
Step nine: using the local communities obtained in step seven and the single-node communities $C_t\_Sig(u_t)$ obtained in step eight, the communities in the first-stage community structure are split, recombined and optimized to obtain the final community structure $CS_t$.
The invention has the beneficial effects that:
The method detects the community structure of the first snapshot network by considering the closeness among nodes. In each subsequent snapshot network it allocates communities to the active nodes that reflect network changes, defines edge-increasing nodes to account for the influence of new edges on detection accuracy, splits the allocated community structure into local communities built from the edge-increasing nodes and single-node communities built from the other nodes, and merges the split communities by modularity gain to obtain the final community structure.
Comparing the evaluation indexes NMI and Q obtained by the dynamic community detection method based on allocation and splitting with those of other community detection methods shows that, as the number of community merging and splitting events in the synthetic dynamic networks increases, the method detects community structures of higher quality. Compared with the other methods, it reduces the error accumulation of incremental methods and obtains the highest Q value on the last snapshot of each real dynamic network.
Drawings
Fig. 1 is a schematic diagram of the steps of the dynamic community detection method based on allocation and splitting according to the present invention.
Fig. 2 is a schematic diagram of an active node according to the present invention.
Fig. 3 is a schematic diagram of the community structure in adjacent snapshot networks according to the present invention.
Fig. 4 compares the NMI and Q values of the algorithms on the LFR500 network under different merge and split events: (a) NMI with merge=3 and split=3; (b) NMI with merge=5 and split=5; (c) NMI with merge=10 and split=10; (d) Q value with merge=3 and split=3; (e) Q value with merge=5 and split=5; (f) Q value with merge=10 and split=10.
Fig. 5 compares the NMI and Q values of the algorithms on the LFR1000 network under different merge and split events: (a) NMI with merge=5 and split=5; (b) NMI with merge=10 and split=10; (c) NMI with merge=20 and split=20; (d) Q value with merge=5 and split=5; (e) Q value with merge=10 and split=10; (f) Q value with merge=20 and split=20.
Fig. 6 compares the Q values of the community structures detected by the ASCDA method and by the existing methods on real dynamic networks: (a) the Office contact network; (b) the Enron email network; (c) the High school network; (d) the Primary school network.
Fig. 7 shows the average modularity of the different algorithms on the real dynamic networks.
Detailed Description
Embodiment one: this embodiment is described with reference to Figs. 1 to 3. The dynamic community detection method based on allocation and splitting comprises the following specific steps:
Step one: a dynamic network is represented by the network sequence $G = \{G_0, G_1, \ldots, G_t, \ldots, G_T\}$, where $G_t = (V_t, E_t)$ is the snapshot network at time $t$ ($t = 0, 1, \ldots, T$); $V_t = \{u_t \mid u = 1, 2, \ldots, n\}$ and $E_t$ are the set of the $n$ nodes and the set of the $m$ edges of snapshot network $G_t$, each edge running from a node $u_t$ to a node $v_t$ with $u_t, v_t \in V_t$; $CS = \{CS_0, CS_1, \ldots, CS_t, \ldots, CS_T\}$ is the community structure of the network $G$, where $CS_t$ is the set of the $k$ communities of snapshot network $G_t$;
The incremental change from snapshot network $G_{t-1}$ to snapshot network $G_t$ (that is, $G_t = G_{t-1} \cup \Delta G_t$) is denoted $\Delta G_t = (\Delta V_t, \Delta E_t)$, where $\Delta V_t$ and $\Delta E_t$ are the sets of nodes and edges that change within the time interval $(t-1, t]$. The process of dividing communities in $G_t$ can therefore be expressed as $(CS_{t-1}, G_t, \Delta G_t) \rightarrow CS_t$. Dividing communities with an incremental method requires knowledge of the first snapshot network $G_0$ of the dynamic network $G$ and of the incremental changes between adjacent snapshot networks.
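The incremental change between two adjacent snapshots can be sketched as plain set differences. This is a minimal illustration, not the patent's implementation: the function names `as_edge_set` and `snapshot_delta` are hypothetical, and edges are assumed to be undirected and stored as frozensets.

```python
def as_edge_set(pairs):
    """Store undirected edges as frozensets so that (u, v) equals (v, u)."""
    return {frozenset(pair) for pair in pairs}


def snapshot_delta(nodes_prev, edges_prev, nodes_cur, edges_cur):
    """Return the nodes and edges added or removed between two snapshots."""
    return {
        "added_nodes": nodes_cur - nodes_prev,
        "removed_nodes": nodes_prev - nodes_cur,
        "added_edges": edges_cur - edges_prev,
        "removed_edges": edges_prev - edges_cur,
    }


# Toy snapshots (illustrative data only).
nodes_prev = {1, 2, 3}
edges_prev = as_edge_set([(1, 2), (2, 3)])
nodes_cur = {1, 2, 4}
edges_cur = as_edge_set([(2, 1), (1, 4)])
delta = snapshot_delta(nodes_prev, edges_prev, nodes_cur, edges_cur)
```

Here node 3 and edge (2, 3) disappear, while node 4 and edge (1, 4) are added; the unchanged edge (1, 2) does not appear in the delta.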
Step two: a static community division method based on common-neighbor clustering-entropy node similarity (CSCDA) is used to detect the communities of snapshot network $G_0$ automatically, giving the community structure $CS_0$; for each later snapshot, the nodes that disappear from $G_{t-1}$ to $G_t$ are removed from $CS_{t-1}$, and the result is taken as the initial community structure of $G_t$;
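Deriving the initial community structure of $G_t$ from $CS_{t-1}$ can be sketched as follows. The function name `initial_structure` is hypothetical, and dropping communities that become empty is an assumption; the text does not say how empty communities are handled.

```python
def initial_structure(prev_communities, nodes_cur):
    """Remove vanished nodes from every community of the previous snapshot."""
    result = []
    for community in prev_communities:
        kept = set(community) & set(nodes_cur)
        if kept:  # assumption: a community left with no nodes is dropped
            result.append(kept)
    return result


# Node 3 and node 6 vanish; node 7 is new and is not yet assigned to a community.
init = initial_structure([{1, 2, 3}, {4, 5}, {6}], {1, 2, 4, 5, 7})
```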
Step three: define the active node set;
The incremental changes of a snapshot network consist of deletions and additions of nodes and edges, and this series of changes shows up as increases and decreases of node degree values;
for each node $u_t \in V_t$, count the edges that connect $u_t$ to nodes outside the community to which it belongs in the initial community structure, and the edges of $u_t$ that lie inside that community. If the number of outside edges minus the number of inside edges is greater than zero, $u_t$ is called an active node, and the active node set consists of all such nodes. As shown in Fig. 2, the degree value of node $u_{t-1}$ changes from time $t-1$ to time $t$; in the initial community structure at time $t$, the difference between the outside and inside edges of node $u_t$ is 1, so $u_t$ becomes an active node at time $t$. Because the node has added connections outside its community, the community to which it belongs needs to be reconsidered;
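The active-node test can be sketched as below. This is an illustration under stated assumptions: the graph is an adjacency dictionary of neighbor sets, `membership` maps each node to a community label, and a node without a label (for example a newly added node) has all its edges counted as outside edges. The name `active_nodes` is hypothetical.

```python
def active_nodes(adj, membership):
    """Return nodes with more edges leaving their community than staying inside."""
    active = set()
    for u, neighbors in adj.items():
        label = membership.get(u)
        # Edges to neighbors that share u's community label.
        inner = sum(
            1 for v in neighbors
            if label is not None and membership.get(v) == label
        )
        outer = len(neighbors) - inner
        if outer - inner > 0:
            active.add(u)
    return active


# Node 1 sits in community "A" but has three edges into community "B".
adj = {
    1: {2, 3, 4, 5},
    2: {1},
    3: {1, 4, 5},
    4: {1, 3, 5},
    5: {1, 3, 4},
}
membership = {1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
found = active_nodes(adj, membership)
```

In this toy graph only node 1 is active: it has one inside edge and three outside edges, while every other node has at least as many inside edges as outside edges.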
Step four: identify the active node set in $G_t$; each active node in it is removed from the initial community structure and assigned a community;
The cosine similarity (Salton index) is defined on the structural relation of the local neighborhoods of nodes and is suitable for local community division in complex networks. The active nodes in the active node set are sorted in descending order of degree value, and the Salton index is used to find, for each active node, its similar neighbor node.
When the similar neighbor node already belongs to a community, the active node is assigned to that community; otherwise, the active node is merged with its similar neighbor node to create a new community. If the similar neighbor node is itself in the active node set, it is then no longer treated as an active node awaiting community allocation;
After all active nodes have been allocated, the resulting community structure contains many sparse two-node communities. Such small communities lower community quality and carry no meaning for the community structure of the network;
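The allocation step can be sketched as follows. The Salton index used here is the standard form, the number of common neighbors divided by the square root of the product of the two degrees. Everything else is an assumption made for illustration: the function names, the choice of the single most similar neighbor, the tie-breaking by node id, and the `("new", u)` labels for newly created communities.

```python
import math


def salton(adj, u, v):
    """Salton (cosine) index: common neighbors over sqrt(deg(u) * deg(v))."""
    if not adj[u] or not adj[v]:
        return 0.0
    return len(adj[u] & adj[v]) / math.sqrt(len(adj[u]) * len(adj[v]))


def allocate_active_nodes(adj, membership, active):
    """Remove active nodes from their communities and assign them again."""
    membership = {u: c for u, c in membership.items() if u not in active}
    pending = set(active)
    # Process active nodes in descending order of degree (ties: smaller id first).
    for u in sorted(active, key=lambda n: (-len(adj[n]), n)):
        if u not in pending:
            continue  # already placed together with an earlier active node
        pending.discard(u)
        if not adj[u]:
            membership[u] = ("new", u)  # isolated node: assumption
            continue
        best = max(sorted(adj[u]), key=lambda v: salton(adj, u, v))
        if best in membership:
            membership[u] = membership[best]
        else:
            # The similar neighbor has no community yet: form a new one together.
            membership[u] = membership[best] = ("new", u)
            pending.discard(best)
    return membership


adj = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2}, 4: {1, 2}}
result = allocate_active_nodes(adj, {1: "A", 2: "A", 3: "A", 4: "B"}, {4})
pair = allocate_active_nodes({5: {6}, 6: {5}}, {}, {5, 6})
```

In the first call, active node 4 joins community "A" because its most similar neighbor already belongs to it. In the second call, two active nodes that are each other's only neighbor form one new two-node community, which is the kind of small community that the merging in step five is meant to absorb.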
Step five: the community structure obtained in step four is merged and optimized to obtain the first-stage community structure, as follows:
Step five-one: the modularity gain $\Delta Q_{ij}$ of merging communities $C_i$ and $C_j$ is defined by a formula [formula image not reproduced] in which $e_{ij}$ is the ratio of the number of edges between $C_i$ and $C_j$ ($i \neq j$) to the total number of edges; $a_i$ is the ratio of the number of edges that connect a node in $C_i$ to a node outside $C_i$ to the total number of edges of the network; $a_j$ is the corresponding ratio for $C_j$; $k_i$ is the sum of the degrees of the nodes in $C_i$; and $m$ is the total number of edges in the network;
Step five-two: calculate the modularity gain between each pair of adjacent communities in the community structure, select the two communities with the maximum modularity gain, and merge them into a new community.
Step five-three: repeat step five-two until the modularity gain is negative, which gives the first-stage community structure;
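The greedy merging loop of step five can be sketched as below. Because the patent's gain formula is not legible in this text, the sketch substitutes the standard Newman modularity gain for merging two communities, $\Delta Q_{ij} = L_{ij}/m - d_i d_j / (2m^2)$, where $L_{ij}$ is the number of edges between the two communities and $d_i$ is the sum of degrees in community $i$; this substitution is an assumption. The sketch also stops when no merge has a positive gain, and it assumes every edge endpoint belongs to some community. The function name is hypothetical.

```python
def merge_by_modularity_gain(edges, communities):
    """Repeatedly merge the adjacent pair of communities with the largest gain."""
    communities = [set(c) for c in communities]
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    while True:
        owner = {u: i for i, c in enumerate(communities) for u in c}
        links = {}  # number of edges between each pair of distinct communities
        for u, v in edges:
            i, j = owner[u], owner[v]
            if i != j:
                key = (min(i, j), max(i, j))
                links[key] = links.get(key, 0) + 1
        volume = [sum(degree.get(u, 0) for u in c) for c in communities]
        best_gain, best_pair = 0.0, None
        for (i, j), l_ij in links.items():
            # Standard Newman gain (assumed stand-in for the patent's formula).
            gain = l_ij / m - volume[i] * volume[j] / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            return communities  # no merge improves modularity
        i, j = best_pair  # i < j, so deleting j keeps index i valid
        communities[i] |= communities[j]
        del communities[j]


# Two triangles joined by one bridge edge, starting from one community per node.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
merged = merge_by_modularity_gain(edges, [{n} for n in range(1, 7)])
```

On this toy graph the loop rebuilds the two triangles and then stops, because merging them across the bridge edge would lower modularity.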
For incremental changes inside a single community, the community membership of the nodes cannot fully reflect the change: when an edge is added between nodes $u_{t-1}$ and $v_{t-1}$ that belong to the same community, the allocation above cannot place nodes $u_t$ and $v_t$ in different communities. Nodes $u_t$ and $v_t$ are therefore separated from their community to further improve the quality of the community structure;
Step six: define the edge-increasing node set;
for each node $u_{t-1} \in V_{t-1}$, if the node changes from $G_{t-1}$ to $G_t$, node $u_t$ is called a change node, and the change node set is $V_C = \{u_t \mid d(u_{t-1}) \neq d(u_t)\}$, where $d(u_{t-1})$ and $d(u_t)$ are the degree values of nodes $u_{t-1}$ and $u_t$ respectively;
if the edge between $u_t$ and $v_t$ is a new edge, that is, an edge added from $G_{t-1}$ to $G_t$, and nodes $u_t$ and $v_t$ both belong to the change node set $V_C$ and to the same community, then $u_t$ is called an edge-increasing node; the edge-increasing node set consists of all such nodes;
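The definition of the edge-increasing node set can be sketched as follows. The function names are hypothetical, edges are assumed to be undirected frozensets, and `membership` is assumed to hold the first-stage community label of each node.

```python
def degree_map(edges):
    """Count the degree of every node that appears in an edge set."""
    degree = {}
    for edge in edges:
        for u in edge:
            degree[u] = degree.get(u, 0) + 1
    return degree


def edge_increasing_nodes(nodes_prev, edges_prev, edges_cur, membership):
    """Endpoints of new edges whose two ends changed degree and share a community."""
    d_prev, d_cur = degree_map(edges_prev), degree_map(edges_cur)
    # Change node set V_C: nodes of the previous snapshot whose degree changed.
    changed = {u for u in nodes_prev if d_prev.get(u, 0) != d_cur.get(u, 0)}
    result = set()
    for edge in edges_cur - edges_prev:  # new edges only
        if len(edge) != 2:
            continue  # ignore self-loops
        u, v = tuple(edge)
        same = membership.get(u) is not None and membership.get(u) == membership.get(v)
        if u in changed and v in changed and same:
            result |= {u, v}
    return result


edges_prev = {frozenset((4, 5)), frozenset((4, 6))}
edges_cur = edges_prev | {frozenset((5, 6)), frozenset((6, 7))}
membership = {4: 0, 5: 0, 6: 0, 7: 1}
nodes = edge_increasing_nodes({4, 5, 6, 7}, edges_prev, edges_cur, membership)
```

The new edge (5, 6) lies inside community 0, so nodes 5 and 6 qualify; the new edge (6, 7) crosses two communities and contributes nothing.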
Step seven: define local communities;
an edge-increasing node $u_t$ is merged with those of its neighbor nodes that belong to the same community and are themselves in the edge-increasing node set; the result is defined as the local community of node $u_t$ at time $t$;
Step eight: define the single-node community $C_t\_Sig(u_t)$;
for every node $u_t$ of a community that is not an edge-increasing node, a community containing only that node is created, that is, $C_t\_Sig(u_t) = \{u_t\}$;
Step nine: using the local communities obtained in step seven and the single-node communities $C_t\_Sig(u_t)$ obtained in step eight, the communities in the first-stage community structure are split, merged and optimized to obtain the final community structure $CS_t$, specifically as follows:
Step nine-one: for every community in the first-stage community structure and every node $u_t$ in that community: if $u_t$ is an edge-increasing node, visit the neighbor node set of $u_t$ and merge $u_t$ with those neighbor nodes that belong to the edge-increasing node set to construct a local community; otherwise, create a single-node community $C_t\_Sig(u_t)$ for node $u_t$. If a node of the community already exists in some local community, no new local community is created for it. This gives the split community structure;
As shown in Fig. 3, an edge is newly added between node 5 and node 6 inside a community at time $t$, so node 5 and node 6 are edge-increasing nodes; they are merged to construct the local community $\{5, 6\}$, and the remaining nodes form single-node communities: $C_t\_Sig(1_t) = \{1\}$, $C_t\_Sig(2_t) = \{2\}$, $C_t\_Sig(3_t) = \{3\}$, $C_t\_Sig(4_t) = \{4\}$;
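The splitting of one community into local and single-node communities can be sketched as below, using the node numbers of the Fig. 3 example. The adjacency data is invented for illustration (only the new edge between node 5 and node 6 comes from the text), and the function name `split_community` and the visiting order by node id are assumptions.

```python
def split_community(community, adj, edge_increasing):
    """Split one community into local communities and single-node communities."""
    parts, placed = [], set()
    for u in sorted(community):
        if u in placed:
            continue  # already part of an earlier local community
        if u in edge_increasing:
            # Local community: u plus its edge-increasing neighbors in the same community.
            part = {u} | {
                v for v in adj[u]
                if v in community and v in edge_increasing and v not in placed
            }
        else:
            part = {u}  # single-node community C_t_Sig(u)
        parts.append(part)
        placed |= part
    return parts


# Hypothetical adjacency; the edge 5-6 is the newly added edge.
adj = {1: {2, 4}, 2: {1, 4}, 3: {5, 6}, 4: {1, 2, 5}, 5: {3, 4, 6}, 6: {3, 5}}
parts = split_community({1, 2, 3, 4, 5, 6}, adj, {5, 6})
```

The result contains four single-node communities for nodes 1 to 4 and one local community holding nodes 5 and 6, matching the example in the text.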
Step nine-two: using the modularity gain of step five, calculate the modularity gain between each pair of adjacent communities in the split community structure, select the two communities with the maximum modularity gain, and merge them into a new community.
For every community in the community structure, step nine is executed and the modularity of the resulting community structure is calculated; the community structure is updated only when the modularity increases, which gives the final community structure $CS_t$. As shown in Fig. 3, the final community structure is $CS_t = \{\{1, 2, 4\}, \{3, 5, 6\}, \{7, 8, 9, 10\}\}$: the two communities at time $t-1$ become three communities, and the modularity rises from 0.35 at time $t-1$ to 0.38 at time $t$;
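The rule of keeping an update only when modularity increases can be sketched with the standard Newman modularity, $Q = \sum_c \left[ L_c / m - (d_c / 2m)^2 \right]$, where $L_c$ is the number of edges inside community $c$ and $d_c$ is its total degree. Using this definition of Q is an assumption (the text does not spell it out), and the function names are hypothetical. The graph below is a toy example, not the network of Fig. 3.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    if m == 0:
        return 0.0
    owner = {u: i for i, c in enumerate(communities) for u in c}
    inner = [0] * len(communities)   # edges inside each community
    volume = [0] * len(communities)  # sum of degrees in each community
    for u, v in edges:
        volume[owner[u]] += 1
        volume[owner[v]] += 1
        if owner[u] == owner[v]:
            inner[owner[u]] += 1
    return sum(inner[i] / m - (volume[i] / (2 * m)) ** 2
               for i in range(len(communities)))


def keep_if_better(edges, current, candidate):
    """Accept the candidate structure only if it raises the modularity."""
    if modularity(edges, candidate) > modularity(edges, current):
        return candidate
    return current


edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
whole = [{1, 2, 3, 4, 5, 6}]
split = [{1, 2, 3}, {4, 5, 6}]
chosen = keep_if_better(edges, whole, split)
```

For two triangles joined by one edge, the single-community partition has Q = 0 and the two-triangle partition has Q = 5/14, so the split structure is kept.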
Embodiment two: this embodiment is described with reference to Figs. 4 to 7 and is an application example of the dynamic community detection method based on allocation and splitting described in embodiment one.
Synthetic dynamic networks generated by an extended LFR model and four real dynamic social networks, namely an office contact network (Office contact), the Enron email network (Enron email), a high school student interaction network (High school) and a primary school student interaction network (Primary school), are selected. Communities are detected on them with the SCAN, QCA, InBatch and IncNSA methods, the corresponding evaluation indexes NMI and Q are calculated, and they are compared with the NMI and Q values of the dynamic community detection method based on allocation and splitting (Dynamic Community Detection Algorithm based on Allocating and Splitting, ASCDA for short).
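The NMI index used in the comparison can be sketched as follows. The text does not give its NMI formula, so this sketch assumes the normalization commonly used in community detection, $NMI = 2 I(A; B) / (H(A) + H(B))$; the function name `nmi` and the label dictionaries are illustrative.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """Normalized mutual information of two partitions given as node -> label."""
    nodes = list(labels_a)
    n = len(nodes)
    count_a = Counter(labels_a[u] for u in nodes)
    count_b = Counter(labels_b[u] for u in nodes)
    joint = Counter((labels_a[u], labels_b[u]) for u in nodes)
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    mutual = sum(
        c / n * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        return 1.0  # both partitions put every node in one community
    return 2 * mutual / (h_a + h_b)


truth = {1: 0, 2: 0, 3: 1, 4: 1}
same_up_to_names = {1: "x", 2: "x", 3: "y", 4: "y"}
unrelated = {1: 0, 2: 1, 3: 0, 4: 1}
```

A detected partition that matches the ground truth up to label names scores 1, and a partition that is statistically independent of it scores 0.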
The synthetic dynamic network is controlled by several parameters: the number of nodes, the average degree, the maximum degree, the mixing parameter, the number of snapshots, and the merge and split parameters, which give the number of community merging and splitting evolution events. In this embodiment, synthetic dynamic networks of two scales were generated; the specific parameter settings of the LFR synthetic dynamic networks are shown in Table 1.
TABLE 1
Network    Nodes   Average degree   Maximum degree   merge   split   Mixing parameter   Snapshots
LFR500     500     10               30               3       3       0.2                10
LFR500     500     10               30               5       5       0.2                10
LFR500     500     10               30               10      10      0.2                10
LFR1000    1000    20               40               5       5       0.2                10
LFR1000    1000    20               40               10      10      0.2                10
LFR1000    1000    20               40               20      20      0.2                10
Fig. 4 shows the NMI and Q values of the different methods on the LFR500 network under different numbers of merge and split events: Fig. 4 (a), (b), (c) show NMI and Fig. 4 (d), (e), (f) show Q, with the number of merging and splitting events being 3, 5 and 10 respectively.
On the first snapshot network of each group of networks, the NMI value of the ASCDA method is not the best. The reason is that ASCDA obtains its final result by modularity optimization. For a network with a ground-truth community structure, the community structure with the highest modularity is not necessarily the one closest to the ground truth, so the NMI value may drop. For the subsequent snapshot networks, ASCDA achieves the best results on the groups of LFR500 networks, except for the 2nd and 5th snapshots in Fig. 4 (a) and (d). Compared with IncNSA, which has the second-best NMI values, the NMI values of ASCDA on the three groups of networks in Fig. 4 (a), (b) and (c) improve by 0.50%, 1.38% and 3.07% respectively.
As can be seen in Fig. 4 (d), the Q value of ASCDA on the LFR500 network is comparable to that of IncNSA. On the LFR500 network in (e), the Q value of ASCDA rises significantly over the last 3 snapshots. The improvement over the IncNSA method is largest on the LFR500 network in (f), where the Q value increases by 1.72%. QCA and InBatch are both incremental detection methods; their performance gradually decreases over time, and their final NMI and Q values are both close to 0.2. As can be seen from Fig. 4 (f), SCAN shows an overall decreasing trend with large fluctuations.
As shown in Fig. 5, ASCDA obtains the best NMI and Q values on the LFR1000 network in (a) and (d). The results of the SCAN, QCA and InBatch methods all show a decreasing trend. As the number of community merging and splitting events increases, the NMI and Q values of ASCDA in (b) and (e) are 0.36% and 0.89% higher than those of the IncNSA method. ASCDA has a significant advantage when the number of community merge and split events reaches 20: compared with the IncNSA method, its NMI and Q values improve by 7.38% and 8.27% respectively. Because of the complex evolution behavior of community merging and splitting with merge=20 and split=20, the NMI and Q values of the ASCDA, SCAN and IncNSA methods in (c) and (f) fluctuate widely. In addition, the SCAN method performs poorly in Fig. 5 (f), where the difference between its maximum and minimum Q values is close to 0.8. This is because the conventional method detects each snapshot independently and ignores the evolution behavior of the communities.
Taking the NMI and Q metrics together, the advantage of the ASCDA method becomes more pronounced as the number of community merging and splitting events increases.
Table 2 gives the basic information of the 4 real dynamic networks.
TABLE 2
Network          Snapshots   Maximum nodes   Minimum nodes   Maximum edges   Minimum edges
Office contact   10          72              59              188             94
High school      9           159             16              747             11
Primary school   18          236             117             2137            628
Enron email      44          18396           19              48471           19
As the real dynamic network does not have a real community structure, the modularity Q is used to compare the comparison results of the method of the invention with other dynamic community detection methods on Office contact, high school, primary school and Enron email networks as shown in FIG. 6.
The Q value of the community structure detected by the ASCDA method in the Office contact network shown in fig. 6 (a) is similar to that of the IncNSA method, but is still 0.73% higher than that of the IncNSA method. The SCAN method has large fluctuation of Q value, while QCA and InBatch methods are relatively stable, and the Q value is concentrated at about 0.4.
The Enron Email network collected data from month 11 1998 to month 6 2002, and no obvious community structure was formed due to less communication between Email contacts from month 11 1998 to month 2 1999. The ASCDA method still yields a higher modularity Q value as shown in fig. 6 (b), where Q value reaches 0.54 on the 4 th snapshot network. In other snapshot networks, the Q values obtained by the ASCDA method are ranked first.
The 6 th snapshot network of the High school network counts the communication situation between Saturday students. Since this day is not a workday, there is less and relatively decentralised contact between students. Therefore, for this case, the ASCDA method shown in fig. 6 (c) has a Q value up to 32.26% higher than the IncNSA method.
As shown in fig. 6 (d), the Q values of the community structures reach their lowest point in the 4th to 5th snapshot networks of the Primary school network. At this time, during the meal period, students may interact more with students from other classes, so the community structure within each class becomes scattered and the Q value decreases. After the noon break, the community structure changes again as classroom learning activities resume. After these community splitting and merging events, the community structure detected by the ASCDA method reaches a Q value of 0.8, which is 4.34% higher than that of the IncNSA method.
On the last snapshot of each real dynamic network, the ASCDA method achieves the highest Q value of all the compared methods, indicating that it reduces the error accumulation of the incremental approach. As the network evolves, ASCDA continues to partition a correspondingly high-quality community structure. On the Office contact network and the High school network, the Q value of the last snapshot is even the highest Q value of the entire network evolution process.
As shown in fig. 7, the average modularity values of the ASCDA and IncNSA methods always lead those of the other methods. In contrast, the SCAN method is not stable enough: it ranks 3rd on the Office contact and Primary school networks and last on the High school and Enron Email networks. As the network scale increases, the gap between the QCA and InBatch methods widens. Because the interactions among students produce a large number of merge and split events, the ASCDA method shows a significant improvement in average modularity on the two student contact networks. Furthermore, the ASCDA method still works well on the Office contact and Enron email networks, which have no obvious merge and split events.
The technical features of the above-described embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of the technical features contains no contradiction, it should be considered to fall within the scope of this description.
The above examples illustrate only a few embodiments of the invention, which are described in detail and are not to be construed as limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

Claims (6)

1. A dynamic community detection method based on allocation and splitting, realized by the following steps:
step one, a dynamic network is represented by a network sequence G = {G_0, G_1, ..., G_t, ..., G_T}, wherein G_t = (V_t, E_t) is the snapshot network at time t, t = 0, 1, ..., T;
V_t = {u_t^i | i = 1, 2, ..., n} is the set of n nodes of snapshot network G_t;
E_t [formula image FDA0004136865470000011] is the set of m edges of snapshot network G_t, each edge [formula image FDA0004136865470000012] being an edge from node u_t to node v_t, wherein the nodes satisfy the condition in [formula image FDA0004136865470000013];
CS = {CS_0, CS_1, ..., CS_t, ..., CS_T} is the community structure of the network sequence G, wherein CS_t [formula images FDA0004136865470000014 and FDA0004136865470000015] is the set of k communities of snapshot network G_t;
step two, a static community division method based on common-neighbor clustering-entropy node similarity is applied to snapshot network G_0 to detect communities automatically and obtain the community structure CS_0; the nodes that disappear from G_{t-1} to G_t are removed from CS_{t-1}, and the result is taken as the initial community structure of G_t;
step three, determining the active node set: for each node u_t, if the difference between its numbers of edges inside and outside the community to which it is connected is greater than zero, the node is called an active node, and the active nodes are collected into the active node set;
step four, the active node set in G_t is identified, and each active node in the active node set is removed from the initial community structure and assigned a community, obtaining the allocated community structure;
step five, merging optimization is performed on the community structure obtained in step four to obtain the first-stage community structure;
step six, determining the edge-adding node set: for each node u_{t-1} ∈ V_{t-1}, if the degree of the node changes from G_{t-1} to G_t, node u_t is called a change node, V_C = {u_t | d(u_{t-1}) ≠ d(u_t)}, wherein d(u_{t-1}) and d(u_t) are the degree values of nodes u_{t-1} and u_t respectively;
if an edge between node u_t and node v_t is a newly added edge, and node u_t and node v_t belong to the change node set V_C and satisfy the community condition in [formula image FDA00041368654700000120], node u_t is called an edge-adding node; the edge-adding node set is expressed by the following formula:
[formula image FDA00041368654700000123]
wherein the edge term [formula image FDA00041368654700000124] is an edge newly added from G_{t-1} to G_t;
step seven, determining the local communities: when a node is an edge-adding node satisfying the condition in [formula image FDA0004136865470000022], the edge-adding node is combined with those of its neighbor nodes that belong to the edge-adding node set to form a local community; the local community of the edge-adding node at time t is expressed by the following formula:
[formula image FDA0004136865470000027]
wherein [formula image FDA0004136865470000028] is the neighbor node set of the edge-adding node;
step eight, determining the single-instance community C_t_Sig(u_t): for a node of a community satisfying the condition in [formula images FDA00041368654700000210 and FDA00041368654700000211], a single-instance community is created for node u_t, the single-instance community C_t_Sig(u_t) being:
[formula image FDA00041368654700000212]
step nine, according to the local communities obtained in step seven and the single-instance communities C_t_Sig(u_t) obtained in step eight, the communities in the first-stage community structure are split, recombined and optimized to obtain the final community structure CS_t.
2. The allocation and splitting-based dynamic community detection method of claim 1, wherein: in step three, the active node set is expressed by the following formula:
[formula image FDA00041368654700000216]
wherein the first term [formula image FDA00041368654700000217] is the number of edges by which node u_t connects to nodes outside the community, and the second term [formula image FDA00041368654700000219] is the number of edges of node u_t located inside the community.
3. The allocation and splitting-based dynamic community detection method of claim 1, wherein: the specific process of step four is as follows:
the active nodes in the active node set are arranged in descending order of degree value, and the Salton index is used to find, for each active node, its most similar neighbor node; when that neighbor node already belongs to a community, the active node is assigned to that community; otherwise, the active node is merged with its similar neighbor node to create a new community; if the similar neighbor node is itself a node in the active node set, that node is no longer assigned a community as an active node; finally, the allocated community structure is obtained.
4. The allocation and splitting-based dynamic community detection method of claim 1, wherein: the specific process of step five is as follows:
step five-one, according to the modularity gain ΔQ_ij, the modularity gain between each pair of adjacent communities in the allocated community structure is calculated;
step five-two, the two communities with the maximum modularity gain are selected and merged to form a new community;
step five-three, step five-two is repeated until the modularity gain is negative, obtaining the first-stage community structure.
5. The dynamic community detection method based on allocation and splitting according to claim 4, wherein: in step five-one, the modularity gain ΔQ_ij is expressed by the following formulas:
ΔQ_ij = 2(e_ij - a_i a_j),
[formula image FDA0004136865470000035]
wherein e_ij is the proportion of the total number of edges accounted for by the edges connecting community i and community j; a_i is the ratio, to the total number of edges of the network, of the number of edges that are connected to nodes of community i but whose other endpoint does not belong to community i; a_j is the ratio, to the total number of edges of the network, of the number of edges that are connected to nodes of community j but whose other endpoint does not belong to community j; k_i is the sum of the degrees of the nodes in community i; m is the total number of edges in the network.
6. The allocation and splitting-based dynamic community detection method of claim 1, wherein: the specific process of step nine is as follows:
step nine-one, for every community in the first-stage community structure, if a node u_t of the community is an edge-adding node, the neighbor node set of node u_t is accessed, and node u_t is combined with its neighbor nodes satisfying the condition in [formula image FDA00041368654700000317] to construct a local community; otherwise, a single-instance community C_t_Sig(u_t) is created for node u_t; if a node of the community already exists in some local community, no new local community is created for that node; the split community structure is thereby obtained;
step nine-two, according to the modularity gain ΔQ_ij, the modularity gain between each pair of adjacent communities in the split community structure is calculated, and the two communities with the maximum modularity gain are selected and merged to form a new community;
every community of the first-stage community structure is accessed in turn and the community structure is updated according to the modularity-gain rule, obtaining the final community structure CS_t.
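Two rules named in the claims above, the Salton index of claim 3 and the modularity-gain merging of claims 4 and 5, can be illustrated with a small Python sketch. It is a minimal sketch under stated assumptions, not the patented procedure: the Salton index is taken in its usual form |N(u) ∩ N(v)| / sqrt(d(u)·d(v)); the gain ΔQ_ij = 2(e_ij − a_i a_j) is evaluated with the Newman convention e_ij = (edges between i and j)/(2m) and a_i = k_i/(2m), which is an assumption because the defining formula is only available as an image; and all function names and input formats are hypothetical.

```python
import math
from collections import defaultdict


def build_adjacency(edges):
    """Neighbor sets of an undirected graph given as a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def salton(adj, u, v):
    """Salton index: common neighbors divided by sqrt(d(u) * d(v))."""
    if not adj[u] or not adj[v]:
        return 0.0
    return len(adj[u] & adj[v]) / math.sqrt(len(adj[u]) * len(adj[v]))


def most_similar_neighbor(adj, u):
    """Neighbor of u with the largest Salton index (smallest id wins ties)."""
    return max(sorted(adj[u]), key=lambda v: salton(adj, u, v), default=None)


def modularity_gain(edges, labels, ci, cj):
    """Delta Q_ij = 2 * (e_ij - a_i * a_j) for two distinct communities.

    Assumed convention: e_ij = edges between ci and cj / (2m),
    a_i = sum of degrees in ci / (2m).
    """
    m = len(edges)
    between = 0
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if {cu, cv} == {ci, cj}:
            between += 1
    e_ij = between / (2 * m)
    a_i = degree_sum[ci] / (2 * m)
    a_j = degree_sum[cj] / (2 * m)
    return 2 * (e_ij - a_i * a_j)


def greedy_merge(edges, community_of):
    """Merge the adjacent pair with maximum gain until that gain is negative."""
    labels = dict(community_of)
    while True:
        pairs = sorted({
            tuple(sorted((labels[u], labels[v])))
            for u, v in edges
            if labels[u] != labels[v]
        })
        if not pairs:
            return labels
        gains = {p: modularity_gain(edges, labels, *p) for p in pairs}
        best = max(pairs, key=gains.get)
        if gains[best] < 0:
            return labels
        keep, drop = best
        for node in labels:  # relabel in place; the key set is unchanged
            if labels[node] == drop:
                labels[node] = keep
```

Starting from one community per node on two triangles joined by a bridge edge, `greedy_merge` ends with the two triangles as separate communities, because merging them across the bridge would have a negative gain.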

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310277594.XA CN116308860B (en) 2023-03-21 2023-03-21 Dynamic community detection method based on allocation and splitting

Publications (2)

Publication Number Publication Date
CN116308860A true CN116308860A (en) 2023-06-23
CN116308860B CN116308860B (en) 2024-01-12

Family

ID=86792208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310277594.XA Active CN116308860B (en) 2023-03-21 2023-03-21 Dynamic community detection method based on allocation and splitting

Country Status (1)

Country Link
CN (1) CN116308860B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180341696A1 (en) * 2017-05-27 2018-11-29 Hefei University Of Technology Method and system for detecting overlapping communities based on similarity between nodes in social network
CN109166047A (en) * 2018-08-04 2019-01-08 福州大学 Increment dynamics community based on Density Clustering finds method
CN111667373A (en) * 2020-06-08 2020-09-15 上海大学 Evolution community discovery method based on neighbor subgraph social network dynamic increment
CN115169501A (en) * 2022-08-05 2022-10-11 东北电力大学 Community detection method based on close similarity of common neighbor node clustering entropy

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
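姜万昌; 宋人杰: "Relay protection state evaluation method based on clustering partition and association rules", 黑龙江科技信息 (Heilongjiang Science and Technology Information), no. 30 *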

Also Published As

Publication number Publication date
CN116308860B (en) 2024-01-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant