CN112015954A - Matthew effect-based community detection method - Google Patents

Matthew effect-based community detection method

Info

Publication number
CN112015954A
Authority
CN
China
Prior art keywords
node
community
network
nodes
attraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010884765.1A
Other languages
Chinese (zh)
Other versions
CN112015954B (en
Inventor
孙泽军
常新峰
王启明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pingdingshan University
Original Assignee
Pingdingshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pingdingshan University filed Critical Pingdingshan University
Priority to CN202010884765.1A priority Critical patent/CN112015954B/en
Publication of CN112015954A publication Critical patent/CN112015954A/en
Application granted granted Critical
Publication of CN112015954B publication Critical patent/CN112015954B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a community detection method based on the Matthew effect, which relates to the technical field of information science and comprises the following steps: inputting a network G consisting of nodes and edges; initializing the network G by assigning each node to its own community; calculating the core groups of the network G; iteratively simulating the Matthew effect process defined by a Matthew effect model; judging whether the network structure has reached the optimal partition; if not, performing a further iteration of the Matthew effect simulation; and if the optimal community partition has been reached, outputting the community partition result.

Description

Matthew effect-based community detection method
Technical Field
The invention relates to the technical field of information science, in particular to a community detection method based on the Matthew effect.
Background
Community structure reflects the structural characteristics of a network at the mesoscopic scale and is widespread in real networks. Communities are also commonly called clusters or groups. Because complex networks are diverse and community structures are complicated, no uniform and precise definition of community structure has yet been agreed upon. A community is generally considered to be a set of nodes in which connections between nodes inside the group are dense and connections between groups are sparse.
Community detection provides an important way to analyze the structural characteristics of complex networks, study their organizational functions, and mine their latent relationships. It has been widely used in many disciplines and fields, such as computer science, bioinformatics, sociology, economics, and epidemiology. For example, in social networks it can be used to discover organizational groups and provide personalized services. In e-commerce networks, it can be used for intelligent recommendation and precision marketing. In criminal and counter-terrorism networks, it can be used to find criminal groups. It can also be used to optimize routing tables in the Internet, to find closely related proteins, and to analyze collaboration among authors in citation networks. Research into efficient and accurate community detection methods is therefore of great significance.
To date, various community detection methods have been proposed from different perspectives, including graph partitioning-based methods, modularity-based methods, and dynamics-based methods. Among these, identifying communities from the inherent topology and dynamics of the network is an emerging approach that is simple, efficient, accurate, and data-driven. However, most existing methods have limitations, such as high time complexity, complicated parameter settings, and poor stability. For example, WalkTrap is a dynamics-based approach that uses random walks to obtain high-quality communities, but its time complexity is O(mn²). The Markov clustering algorithm (MCL) is a well-known dynamics-based clustering method that is widely used in graph clustering, but it is sensitive to its inflation parameter. The label propagation algorithm (LPA) has approximately linear time complexity, but its community detection results are not always stable. The Fluid Communities (FluidC) algorithm is a diffusion-based approach; like LPA, FluidC often returns different results on each run. In summary, the conventional community detection methods all have limitations.
To address these shortcomings, the present invention provides a community detection method based on the Matthew effect, which reveals the community structure in a network and solves the problems of high time complexity, complicated parameter settings, and poor stability in existing methods.
Disclosure of Invention
The invention aims to provide a community detection method based on the Matthew effect, which reveals the community structure in a network and solves the problems of high time complexity, complicated parameter settings, and poor stability in conventional methods.
The invention provides a Matthew effect-based community detection method, which comprises the following steps:
S1: inputting a network G consisting of nodes and edges;
S2: initializing the network G by assigning each node to its own community;
S3: calculating the core groups of the network G;
S4: iteratively simulating the Matthew effect process defined by the Matthew effect model;
S5: judging whether the network structure has reached the optimal partition;
S6: if the optimal partition has not been reached, performing a further iteration of the Matthew effect simulation; if it has been reached, outputting the community partition result.
Further, the initialization in step S2 comprises:
S21: taking the node number of each node as its label;
S22: assigning each node to its own community.
Further, step S3 calculates the core groups of the network G using the node attraction formula, as follows:
S31: calculating the Jaccard similarity coefficient between nodes:
given an undirected network G = (V, E), the Jaccard similarity coefficient of nodes u and v is defined as:
$$J_{uv} = \frac{|\Gamma_u \cap \Gamma_v|}{|\Gamma_u \cup \Gamma_v|} \tag{1}$$
where $\Gamma_u = N(u) \cup \{u\}$ is the neighbor set of node u, comprising node u itself and the nodes directly connected to it;
S32: calculating the node attraction:
given an undirected network G = (V, E), the attraction of node u to node v is defined as:
$$NA_{v \to u} = J_{uv} \cdot D_u \tag{2}$$
where $J_{uv}$ is the Jaccard similarity coefficient between nodes u and v, and $D_u$ is the degree of node u;
S33: calculating the core groups of the network using the node attraction formula:
because nodes exert attraction, each node attracts its neighbors to join its community, and each node joins the community of the node that attracts it most strongly according to equation (3), formally defined as:
$$C_v = \begin{cases} C_v, & D_v > D_u \\ C_u \ \text{with}\ u = \arg\max_{u \in N(v)} NA_{v \to u}, & D_u > D_v \end{cases} \tag{3}$$
where $C_v$ denotes the community to which node v belongs and $D_v$ is the degree of node v. If the degree of node v is greater than the degrees of its neighbors, node v remains in its original community $C_v$; if a neighbor has a greater degree than node v, node v joins the community of the node with the greatest attraction $\max(NA_{v \to u})$. Through iteration, nodes with more resources attract neighboring nodes into their communities, thereby forming several core groups.
Further, simulating the Matthew effect process in step S4 comprises the following steps:
S41: calculating the community attraction:
given an undirected network G = (V, E), the attraction of a community c_i to node v is defined as:
[Equation (4): Figure BDA0002655231000000041 in the original]
where the proximity term (Figure BDA0002655231000000042 in the original) indicates the proximity of node v to community c_i. Depending on whether node v has the same number of edges to different communities, the proximity term (Figure BDA0002655231000000043 in the original) is formally defined as:
[Equation (5): Figure BDA0002655231000000044 in the original]
where the internal degree term (Figure BDA0002655231000000045 in the original) denotes the internal degree of node v in community c_i.
S42: simulating the Matthew effect process:
after the core groups are obtained in step S3, more and more nodes are attracted by the different core groups, and the Matthew effect is simulated according to equation (6), formally defined as:
[Equation (6): Figure BDA0002655231000000046 in the original]
where c_i is a neighbor community connected to node v, and the community attraction term (Figure BDA0002655231000000047 in the original) denotes the attraction of community c_i to node v.
Compared with the prior art, the invention has the following remarkable advantages:
The invention provides a Matthew effect-based community detection method (CDME), which approaches the community detection problem from the viewpoint of the Matthew effect and reveals the community structure of a network in a natural way. Compared with existing community detection methods, the CDME algorithm is more efficient and achieves better community detection quality on both synthetic and real networks. Unlike algorithms that rely on prior knowledge and parameter settings, CDME requires no parameters. Because it uses local interactions, CDME only needs to compute the attraction of neighboring nodes, which reduces the time complexity to O(n·k²), where k is the average node degree, which is usually small. The CDME algorithm can therefore be applied to large-scale networks. It reveals the community structure in the network and solves the problems of high time complexity, complicated parameter settings, and poor stability in existing methods.
Drawings
FIG. 1 is a flow chart of the Matthew effect-based community detection method according to an embodiment of the present invention;
FIG. 2 is a community attraction diagram of the Matthew effect-based community detection method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the community detection process of the Matthew effect-based community detection method according to an embodiment of the present invention;
FIG. 4 is a graph comparing the performance of the comparison algorithms on synthetic benchmark networks as the parameter μ varies from 0.1 to 0.8, according to an embodiment of the present invention;
FIG. 5 is a graph comparing the performance of the comparison algorithms on synthetic benchmark networks as the average degree varies, according to an embodiment of the present invention;
FIG. 6 is a diagram of the community detection result of CDME on the Zachary karate club network according to an embodiment of the present invention;
FIG. 7 is a diagram of the community detection result of CDME on the American college football network according to an embodiment of the present invention;
FIG. 8 is a diagram of the community detection result of CDME on the U.S. political books network according to an embodiment of the present invention;
FIG. 9 is a diagram of the community detection result of CDME on the dolphin network according to an embodiment of the present invention;
FIG. 10 is a graph comparing the running times of the comparison algorithms for node counts from 2K to 50K, according to an embodiment of the present invention;
FIG. 11 is a graph comparing the running times of the comparison algorithms for node counts from 1K to 10³K, according to an embodiment of the present invention.
Detailed Description
The technical solutions of the embodiments of the present invention are clearly and completely described below with reference to the drawings in the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
First, referring to Table 1, the symbols used in the present application are defined. G = (V, E) is a given undirected, unweighted network, where V is the set of nodes and E is the set of edges.
TABLE 1 description of symbols
Figure RE-GDA0002694893460000061
Referring to figs. 1-11, the present invention provides a Matthew effect-based community detection method, comprising the following steps:
S1: inputting a network G consisting of nodes and edges;
S2: initializing the network G by assigning each node to its own community;
S3: calculating the core groups of the network G;
S4: iteratively simulating the Matthew effect process defined by the Matthew effect model;
S5: judging whether the network structure has reached the optimal partition;
S6: if the optimal partition has not been reached, performing a further iteration of the Matthew effect simulation; if it has been reached, outputting the community partition result.
Example 1
The initialization in step S2 comprises:
S21: taking the node number of each node as its label;
S22: assigning each node to its own community.
Example 2
Before the core groups are computed, the attraction between nodes must be calculated. This attraction depends on the resources owned by the nodes and on the similarity between them. Based on the topology of the network, the resources owned by a node are represented by its degree, and the similarity between nodes is represented by the Jaccard similarity coefficient. Because nodes attract one another, a node can attract adjacent nodes into its community, and core groups are formed.
Step S3 calculates the core groups of the network G using the node attraction formula, as follows:
S31: calculating the Jaccard similarity coefficient between nodes:
given an undirected network G = (V, E), the Jaccard similarity coefficient of nodes u and v is defined as:
$$J_{uv} = \frac{|\Gamma_u \cap \Gamma_v|}{|\Gamma_u \cup \Gamma_v|} \tag{1}$$
where $\Gamma_u = N(u) \cup \{u\}$ is the neighbor set of node u, comprising node u itself and the nodes directly connected to it;
S32: calculating the node attraction:
given an undirected network G = (V, E), the attraction of node u to node v is defined as:
$$NA_{v \to u} = J_{uv} \cdot D_u \tag{2}$$
where $J_{uv}$ is the Jaccard similarity coefficient between nodes u and v, and $D_u$ is the degree of node u;
S33: calculating the core groups of the network using the node attraction formula:
because nodes exert attraction, each node attracts its neighbors to join its community, and each node joins the community of the node that attracts it most strongly according to equation (3), formally defined as:
$$C_v = \begin{cases} C_v, & D_v > D_u \\ C_u \ \text{with}\ u = \arg\max_{u \in N(v)} NA_{v \to u}, & D_u > D_v \end{cases} \tag{3}$$
where $C_v$ denotes the community to which node v belongs and $D_v$ is the degree of node v. If the degree of node v is greater than the degrees of its neighbors, node v remains in its original community $C_v$; if a neighbor has a greater degree than node v, node v joins the community of the node with the greatest attraction $\max(NA_{v \to u})$. Through iteration, nodes with more resources attract neighboring nodes into their communities, thereby forming several core groups.
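The Jaccard coefficient, node attraction, and core-group rule described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. The function names (`build_graph`, `jaccard`, `node_attraction`, `core_groups`) are hypothetical. Three choices are assumptions made here because the text does not settle them: a node whose degree is tied with its strongest neighbor keeps its own label, only strictly higher-degree neighbors are candidates to follow, and chains of followers are resolved to the node at the end of the chain.

```python
def build_graph(edges):
    """Build adjacency sets for an undirected, unweighted graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def jaccard(graph, u, v):
    """Jaccard coefficient over closed neighborhoods N(x) | {x}."""
    gu = graph[u] | {u}
    gv = graph[v] | {v}
    return len(gu & gv) / len(gu | gv)


def node_attraction(graph, v, u):
    """NA_{v->u} = J_uv * D_u: the pull that neighbor u exerts on node v."""
    return jaccard(graph, u, v) * len(graph[u])


def core_groups(graph):
    """Assign each node to a core group, starting from singleton communities."""
    leader = {}
    for v in graph:
        # Assumption: only strictly higher-degree neighbors can attract v.
        stronger = [u for u in sorted(graph[v])
                    if len(graph[u]) > len(graph[v])]
        if stronger:
            # Ties in attraction go to the smallest node id (assumption).
            leader[v] = max(stronger,
                            key=lambda u: node_attraction(graph, v, u))
        else:
            leader[v] = v  # v keeps its own label
    labels = {}
    for v in graph:
        root = v
        # Degrees strictly increase along the chain, so this terminates.
        while leader[root] != root:
            root = leader[root]
        labels[v] = root
    return labels
```

On a graph made of two small hubs (nodes 0 and 4, each with three neighbors) joined by one edge between two of their neighbors, `core_groups` places every node in the group of its own hub.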
Example 3
Simulating the Matthew effect process in step S4 comprises the following steps:
S41: calculating the community attraction:
given an undirected network G = (V, E), the attraction of a community c_i to node v is defined as:
[Equation (4): Figure BDA0002655231000000081 in the original]
where the proximity term (Figure BDA0002655231000000082 in the original) indicates the proximity of node v to community c_i. Depending on whether node v has the same number of edges to different communities, the proximity term (Figure BDA0002655231000000083 in the original) is formally defined as:
[Equation (5): Figure BDA0002655231000000084 in the original]
where the internal degree term (Figure BDA0002655231000000085 in the original) denotes the internal degree of node v in community c_i.
S42: simulating the Matthew effect process:
after the core groups are obtained in step S3, more and more nodes are attracted by the different core groups, and the Matthew effect is simulated according to equation (6), formally defined as:
[Equation (6): Figure BDA0002655231000000086 in the original]
where c_i is a neighbor community connected to node v, and the community attraction term (Figure BDA0002655231000000087 in the original) denotes the attraction of community c_i to node v.
The stronger a team is, the more attention it draws and the more likely it is to attract new members. Likewise, a node naturally chooses to join the community that attracts it most. When a node joins a community, the structure of that community may change, and so may its attraction. Community detection based on the Matthew effect is therefore an iterative process. In each iteration, each node updates its community label according to equation (6). Eventually, driven by the network topology, the community to which each node belongs no longer changes and the network structure reaches an equilibrium state. At that point the community structure of the network emerges naturally.
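The iterative label update described here can be sketched as the loop below. It is only an outline under stated assumptions: the community attraction of equations (4) and (5) is not recoverable from this text, so the default `neighbor_pull` function is a stand-in that simply counts a node's neighbors inside a community. The names `matthew_iterations`, `neighbor_pull`, and `max_rounds` are hypothetical, and a node moves only when another community is strictly more attractive than its current one.

```python
def neighbor_pull(graph, labels, v, community):
    """Stand-in attraction (NOT equations (4)-(5)): v's neighbors in `community`."""
    return sum(1 for u in graph[v] if labels[u] == community)


def matthew_iterations(graph, labels, attraction=neighbor_pull, max_rounds=100):
    """Repeat label updates until no node changes community."""
    labels = dict(labels)  # work on a copy; the input is left untouched
    for _ in range(max_rounds):
        changed = False
        for v in sorted(graph):
            # Candidate communities are those of v's direct neighbors.
            candidates = sorted({labels[u] for u in graph[v]})
            if not candidates:
                continue
            best = max(candidates,
                       key=lambda c: attraction(graph, labels, v, c))
            current = attraction(graph, labels, v, labels[v])
            # Move only on a strict improvement, so ties keep the old label.
            if attraction(graph, labels, v, best) > current:
                labels[v] = best
                changed = True
        if not changed:
            break  # equilibrium: no label changed in a full pass
    return labels
```

Passing a different `attraction` callable is the intended extension point for a faithful implementation of the community attraction formula.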
Example 4
In step S5, based on the proposed Matthew effect model, the core groups continuously attract surrounding nodes to join, forming larger community structures, and an iterative method is used to simulate the Matthew effect process. Further, step S5 evaluates the quality of each community partition using the Normalized Mutual Information (NMI) index. Over time, driven by the topology, the network structure reaches a steady state, and the best community partition is then obtained.
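As an illustration of the NMI index mentioned above, the following sketch computes it from two label assignments. The text does not say which normalization is used; this version assumes the common form NMI = 2·I(T;F) / (H(T) + H(F)), and the function name `nmi` and its dictionary-based inputs are hypothetical.

```python
from collections import Counter
from math import log


def nmi(truth, found):
    """NMI of two partitions given as {node: label} dicts over the same nodes."""
    nodes = list(truth)
    n = len(nodes)
    count_t = Counter(truth[v] for v in nodes)
    count_f = Counter(found[v] for v in nodes)
    joint = Counter((truth[v], found[v]) for v in nodes)
    # Entropies of the two partitions.
    h_t = -sum(c / n * log(c / n) for c in count_t.values())
    h_f = -sum(c / n * log(c / n) for c in count_f.values())
    # Mutual information from the joint label counts.
    mi = sum(c / n * log(c * n / (count_t[t] * count_f[f]))
             for (t, f), c in joint.items())
    if h_t + h_f == 0:
        return 1.0  # both partitions are a single community
    return 2 * mi / (h_t + h_f)
```

The score depends only on how nodes are grouped, not on the label names, so a partition that matches the reference up to renaming scores 1.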
Example 5
Referring to fig. 2, the network consists of 13 nodes divided into two communities, c_i and c_j. The attraction of each community is calculated using equations (4) and (5). For node 6, the internal degree in community c_i is
Figure BDA0002655231000000091
and the internal degree in community c_j is
Figure BDA0002655231000000092
The internal degrees of node 6 in the two communities are clearly unequal; therefore, according to the first part of equation (5),
Figure BDA0002655231000000093
and
Figure BDA0002655231000000094
so the attraction of community c_i to node 6 is calculated as
Figure BDA0002655231000000095
and the attraction of community c_j to node 6 is
Figure BDA0002655231000000096
Because community c_i attracts node 6 more strongly than community c_j does, node 6 is more likely to join community c_i.
Similarly, for node 13,
Figure BDA0002655231000000097
that is, node 13 has the same internal degree in both communities. In this case the influence of indirect neighbors on node 13 must also be considered. According to the second part of equation (5), the attraction of the two communities to node 13 is calculated as
Figure BDA0002655231000000101
and
Figure BDA0002655231000000102
Thus community c_i attracts node 13 more strongly, and node 13 is more likely to join community c_i.
Example 6
Referring to fig. 3, the community detection process based on the Matthew effect is divided into four stages:
Stage one: as shown in fig. 3(a), each individual initially has its own resources and is regarded as an independent group or community; edges represent the relationships between individuals, and each individual has a different label;
Stage two: as shown in fig. 3(b), individuals with more resources attract neighboring individuals to join them and form initial core groups; for example, because individual U7 has more resources, individuals U1, U8, U10, and U12 are attracted to join its group;
Stage three: as shown in fig. 3(c), owing to the Matthew effect, more and more individuals are attracted to join the core groups, forming larger communities; for example, individual U9 is attracted to the group labeled L7;
Stage four: as shown in fig. 3(d), each individual has been attracted to a community, and the community structure emerges naturally.
Example 7
To verify the effectiveness of the method, the Matthew effect-based community detection method (CDME) is compared with nine representative existing community detection algorithms. Experiments are carried out on both synthetic and real networks, and three widely used evaluation measures (NMI, ARI, and Purity) are used to assess the accuracy of community detection.
Each algorithm was run 20 times independently for each network and the results were then averaged. All experiments were run independently on the same desktop computer with a CPU of 3.3GHz, an Intel Core i5 processor, and 16.0GB memory.
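Of the three measures named above, Purity is the simplest to state, and the sketch below shows one conventional way to compute it: each detected community is credited with its most frequent ground-truth label, and the credited nodes are divided by the total number of nodes. The function name `purity` and the dictionary inputs are hypothetical, and the exact variant used in these experiments is not specified in the text.

```python
from collections import Counter, defaultdict


def purity(truth, found):
    """Purity of a detected partition against ground-truth labels."""
    members = defaultdict(list)
    for node, community in found.items():
        members[community].append(truth[node])
    # Credit each detected community with its majority ground-truth label.
    credited = sum(Counter(labels).most_common(1)[0][1]
                   for labels in members.values())
    return credited / len(found)
```

A partition that never mixes ground-truth communities scores 1, even when it splits them into smaller pieces, which is why Purity is usually reported alongside NMI and ARI.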
(I) Experiments on synthetic networks
Synthetic benchmark networks are created with the well-known LFR model, which makes it easy to control several properties of real networks, such as the average degree, the community size, and the clustering coefficient. The most important parameter of the LFR model is the mixing parameter μ, which controls how difficult the community structure is to detect. Table 2 describes the parameters of the LFR model:
TABLE 2 Description of LFR model parameters
Figure BDA0002655231000000111
Referring to fig. 4, to evaluate the performance of the comparison algorithms, the parameter μ is varied from 0.1 to 0.8 to make the synthetic networks progressively harder. The other parameters are fixed at n = 1000, k = 15, Cmin = 10, Cmax = 50, τ1 = 2 and τ2 = 1. Fig. 4 shows the performance of each algorithm on the three evaluation measures. On the NMI index, the CDME, Infomap, Ncut, LPA, MCL, WT, Louvain and FluidC methods yield excellent community partitions when μ varies from 0.1 to 0.5. However, the performance of LPA, Infomap and FluidC declines significantly as μ increases further. In particular, when μ reaches 0.6, the NMI of LPA is essentially zero. In contrast, the MCL and CDME algorithms are more robust than the other compared methods, and still achieve good community partitions when μ is increased to 0.8.
To reflect the performance of each comparison algorithm more accurately, the standard deviation of the community detection results is also reported. Specifically, the parameter μ is increased from 0.1 to 0.8, and 10 networks are generated for each value of the parameter, for a total of 80 networks. Table 3 shows the average accuracy and standard deviation of each algorithm on these synthetic networks. Taking the NMI results as an example, the standard deviation is very low for most methods. For example, when μ varies between 0.1 and 0.4, CDME, Infomap and WT reach the maximum NMI of 1 with the minimum standard deviation of 0. Even when μ is increased to 0.8, MCL and CDME still achieve good community partitions, whereas the average accuracy of LPA is already close to 0 when μ reaches 0.6.
TABLE 3 average accuracy and Standard deviation of algorithms on NMI index
Figure BDA0002655231000000121
Referring to fig. 5, to evaluate the effectiveness of each comparison algorithm on networks of different community density, the parameter μ is fixed at 0.1 and the average degree k is varied from 5 to 25 to generate benchmark networks. Fig. 5 shows the performance of each comparison algorithm on the different measures as k varies from 5 to 25. The CDME, Ncut, Infomap, MCL, LPA, WT, Louvain and EDCD methods all work well on these synthetic networks when the average degree k > 10. However, when k ≤ 5, the community detection quality of the FG, FluidC, and EDCD methods degrades significantly.
(II) Experiments on real networks
To further evaluate the performance of each comparison algorithm on real networks, several real networks with different characteristics and from different domains are selected for the experiments. Because the true community partitions of these networks are known, the NMI, ARI and Purity indices are again used to evaluate the community detection quality of each algorithm. The basic information of these real networks is given in Table 4.
TABLE 4 statistical characteristics of each real network
Figure BDA0002655231000000131
Here |V| is the number of nodes, |E| the number of edges, k the average degree, CC the clustering coefficient, and #C the number of communities.
TABLE 5 Performance of different comparison algorithms on real networks
Figure BDA0002655231000000132
Tables 5 and 6 show the community detection results of each comparison algorithm on the real networks. #C denotes the number of communities found by each algorithm. The CDME algorithm performs very well on these real networks and achieves good community detection quality.
Referring to fig. 6, for the Zachary karate club network, the CDME method successfully detects two communities, achieves the highest index scores (NMI = 1, ARI = 1, Purity = 1), and outperforms the other comparison algorithms. FIG. 6 visualizes the CDME community detection result, which identifies a community structure fully consistent with the true partition. The existing algorithms cannot assign the tenth node to the correct community because it is connected about equally strongly to both communities. The present method still assigns this node correctly, because CDME considers not only the attraction of nodes but also the attraction of communities. The performance of each comparison algorithm is shown in Table 6:
TABLE 6 Performance of different comparison algorithms on large networks
Figure BDA0002655231000000141
Referring to fig. 7, for the American college football network, the existing algorithms achieve good clustering because of its high average degree (k = 10.66). Fig. 7 shows the communities detected by CDME; the algorithm automatically detects 12 high-quality communities (NMI = 0.93).
Referring to fig. 8, CDME achieves good clustering results on the political books network, with the highest NMI and ARI scores (NMI = 0.58, ARI = 0.67). FIG. 8 visualizes the communities detected by CDME; three communities are found.
Referring to fig. 9, CDME achieves good results on the Dolphin network and outperforms the other comparison algorithms.
To evaluate the performance of the comparison algorithm on a large scale network, experiments were performed using three large datasets, Amazon, DBLP, and LiveJournal.
On the Amazon network, the CDME, WT and Infomap algorithms perform best, achieving very high NMI and Purity values (NMI = 0.9, Purity = 0.99).
On the DBLP collaboration network, the CDME, LPA, and FluidC methods perform well, with an NMI of 0.75. CDME also achieves the highest Purity value (Purity = 0.92).
On the LiveJournal social network, CDME obtains the best community partition quality, with the highest index values (NMI = 0.93, ARI = 0.49, Purity = 0.98). The Ncut, WT, FG, Infomap, MCL and EDCD algorithms still produced no community partition after running for more than 10 days. The LPA, Louvain and FluidC algorithms also achieve good community detection results.
In summary, Infomap and CDME both perform well on the synthetic benchmark networks and obtain most of the highest index values. On the real networks, CDME also achieves the best community partitions. CDME can therefore handle networks of different sizes while achieving high community detection quality.
Example 8
With reference to figs. 10-11, the running times of the present method and of the existing algorithms are analyzed. To evaluate the scalability of the CDME algorithm with respect to network size, this embodiment generates benchmark networks of different sizes using the LFR model. The average degree is fixed at k = 15, the number of nodes varies from 1,000 to 1,000,000, and the running time of each algorithm is measured on networks of each size. Figs. 10 and 11 show the running times of the comparison algorithms at the different network scales. CDME is faster than Ncut, EDCD, WT, FG and MCL because its time complexity is O(k²·n), where k is usually small. The CDME algorithm can therefore be used to handle large-scale networks. CDME is slower than LPA, Louvain, and FluidC, but the communities detected by those algorithms are of poorer quality than those detected by CDME. In addition, the LPA and FluidC methods suffer from stability problems.
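One way to see where a bound of this form can come from is the rough accounting below. It is an informal estimate rather than a derivation taken from the patent, and it assumes that each node evaluates the attraction of each of its neighbors once per round and that one Jaccard computation costs time proportional to the two degrees involved.

```latex
% Cost of one round: every node v inspects each neighbor u,
% and each J_{uv} needs a set intersection of size O(D_u + D_v).
T_{\text{round}} \;\approx\; \sum_{v \in V} \sum_{u \in N(v)} O(D_u + D_v)
\;\approx\; n \cdot k \cdot O(k) \;=\; O(n k^{2})

% Illustration with the largest benchmark size used here (n = 10^6, k = 15):
n k^{2} = 10^{6} \times 15^{2} = 2.25 \times 10^{8},
\qquad
m n^{2} = \frac{n k}{2} \cdot n^{2} = 7.5 \times 10^{6} \times 10^{12} = 7.5 \times 10^{18}
```

The second line contrasts the operation count with the O(mn²) bound quoted for WalkTrap in the Background, using m = nk/2 edges; constant factors and the number of rounds are ignored in both figures.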
The above discloses only a few specific embodiments of the present invention. The present invention is not limited to these embodiments, and any variation that those skilled in the art can conceive of shall fall within the scope of protection of the present invention.

Claims (4)

1. A community detection method based on the Matthew effect, characterized by comprising the following steps:
S1: inputting a network G consisting of nodes and edges;
S2: initializing the network G by assigning each node to its own independent community;
S3: calculating the core groups of the network G;
S4: simulating the Matthew effect process of the Matthew effect model by an iterative method;
S5: judging whether the network structure has reached the optimal division;
S6: if the optimal division has not been reached, performing the iterative simulation of the Matthew effect again; if the optimal community division has been reached, carrying out the community division to obtain the community division result.
2. The Matthew effect-based community detection method of claim 1, wherein the community division in step S2 comprises:
S21: taking the node number of each node as its label;
S22: assigning each node to an independent community.
3. The Matthew effect-based community detection method of claim 1, wherein step S3 calculates the core groups of the network G using a node attraction formula, comprising the following steps:
S31: calculating the Jaccard similarity coefficient between nodes:
given an undirected network G = (V, E), the Jaccard similarity coefficient of nodes u and v is defined as:
$$J_{uv} = \frac{|\Gamma_u \cap \Gamma_v|}{|\Gamma_u \cup \Gamma_v|} \tag{1}$$
where $\Gamma_u = N(u) \cup \{u\}$ is the neighbor set of node u, comprising node u itself and the nodes directly connected to it;
S32: calculating the node attraction:
given an undirected network G = (V, E), the attraction of node u to node v is defined as:
$$NA_{v \to u} = J_{uv} \cdot D_u \tag{2}$$
where $J_{uv}$ denotes the Jaccard similarity coefficient between nodes u and v, and $D_u$ denotes the degree of node u;
S33: calculating the core groups of the network using the node attraction formula:
because nodes exert attraction, each node attracts its neighbors to join its community, and a node joins the community of the node that attracts it most strongly according to formula (3), formally defined as follows:
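$$C_v = \begin{cases} C_v, & \text{if } D_v > D_u \text{ for the neighbors } u \text{ of } v \\ C_{u^*},\; u^* = \arg\max_{u \in N(v)} NA_{v \to u}, & \text{if } D_u > D_v \text{ for a neighbor } u \text{ of } v \end{cases} \tag{3}$$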
where $C_v$ denotes the community to which node v belongs and $D_v$ is the degree of node v. Node v selects, according to formula (3), the community of the node with the strongest attraction: if the degree of node v is greater than the degrees of its neighboring nodes, node v remains in its original community $C_v$; if a neighboring node has a degree greater than that of node v, node v joins the community of the node with the greatest attraction $\max(NA_{v \to u})$. Through iteration, nodes with more resources attract neighboring nodes into their communities, and a plurality of core groups is thereby formed.
4. The Matthew effect-based community detection method of claim 1, wherein simulating the Matthew effect process in step S4 comprises:
S41: calculating the community attraction:
given an undirected network G = (V, E), the attraction of community $c_i$ to node v is defined as:
Figure FDA0002655230990000022
where [Figure FDA0002655230990000023] represents the proximity of node v to community $c_i$; the proximity is determined according to whether the numbers of edges from a node to different communities are the same. [Figure FDA0002655230990000024] is formally defined as follows:
Figure FDA0002655230990000025
where [Figure FDA0002655230990000026] represents the internal degree of node v in community $c_i$.
S42: simulating the Matthew effect process:
after the core groups are obtained in step S3, more and more nodes are attracted by the different core groups, and the Matthew effect is simulated according to formula (6), formally defined as follows:
Figure FDA0002655230990000031
where $c_i$ is a neighboring community connected to node v, and [Figure FDA0002655230990000032] represents the attraction of community $c_i$ to node v.
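Steps S1 to S3 of the claims (input, initialization, and core-group formation via formulas (1) to (3)) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `max_rounds` cap, the fixed processing order, and the handling of equal degrees are all assumptions. Step S4 is omitted because formulas (4) to (6) are not reproduced in this text.

```python
def build_adjacency(edges):
    """S1: undirected adjacency sets built from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, u, v):
    """Formula (1), computed on the closed neighborhoods N(x) | {x}."""
    gu, gv = adj[u] | {u}, adj[v] | {v}
    return len(gu & gv) / len(gu | gv)


def node_attraction(adj, v, u):
    """Formula (2): NA_{v->u} = J_uv * D_u, how strongly u attracts v."""
    return jaccard(adj, u, v) * len(adj[u])


def core_groups(adj, max_rounds=100):
    # S2: each node starts in its own community, labeled by its node number.
    community = {v: v for v in adj}
    for _ in range(max_rounds):  # assumed cap; the claims give no round limit
        changed = False
        for v in sorted(adj):  # assumed fixed order, for reproducibility
            degree_v = len(adj[v])
            # Assumption: v stays put unless some neighbor has a strictly
            # larger degree (the claims do not say what happens on ties).
            if not any(len(adj[u]) > degree_v for u in adj[v]):
                continue
            # S33: join the community of the most attractive neighbor.
            best = max(sorted(adj[v]), key=lambda u: node_attraction(adj, v, u))
            if community[v] != community[best]:
                community[v] = community[best]
                changed = True
        if not changed:
            break
    return community


# Two hubs (0 and 5), each with four neighbors, joined by the bridge 4-9.
example_edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4),
                 (5, 6), (5, 7), (5, 8), (5, 9), (6, 7), (8, 9), (4, 9)]
example_groups = core_groups(build_adjacency(example_edges))
```

On this small example the two hubs keep their own labels, and every other node, including the bridge nodes 4 and 9, joins the community of its own hub, giving two core groups.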
CN202010884765.1A 2020-08-28 2020-08-28 Martha effect-based community detection method Active CN112015954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010884765.1A CN112015954B (en) 2020-08-28 2020-08-28 Martha effect-based community detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010884765.1A CN112015954B (en) 2020-08-28 2020-08-28 Martha effect-based community detection method

Publications (2)

Publication Number Publication Date
CN112015954A (en) 2020-12-01
CN112015954B (en) 2021-08-27

Family

ID=73503796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010884765.1A Active CN112015954B (en) 2020-08-28 2020-08-28 Martha effect-based community detection method

Country Status (1)

Country Link
CN (1) CN112015954B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102722750A (en) * 2012-06-06 2012-10-10 清华大学 Updating method and device of community structure in dynamic network
CN103020267A (en) * 2012-12-26 2013-04-03 上海交通大学 Complex network community structure mining method based on triangular cluster multi-label transmission
CN104933111A (en) * 2015-06-03 2015-09-23 中南大学 Expert academic distance assessment method based on academic relational network
CN105469315A (en) * 2015-08-04 2016-04-06 电子科技大学 Dynamic social network community structure evolution method based on incremental clustering
CN107993156A (en) * 2017-11-28 2018-05-04 中山大学 A kind of community discovery method based on social networks digraph
CN111723298A (en) * 2020-05-11 2020-09-29 珠海高凌信息科技股份有限公司 Social network community discovery method, device and medium based on improved label propagation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
聂祥林: "基于依赖度和相似度的社团结构发现算法研究", 《中国优秀硕士学位论文全文数据库 基础科学辑》 *

Also Published As

Publication number Publication date
CN112015954B (en) 2021-08-27

Similar Documents

Publication Publication Date Title
CN110532436B (en) Cross-social network user identity recognition method based on community structure
Kumar et al. Identifying influential nodes in Social Networks: Neighborhood Coreness based voting approach
Zhang et al. WOCDA: A whale optimization based community detection algorithm
Sun et al. Community detection based on the Matthew effect
He et al. Heuristics-based influence maximization for opinion formation in social networks
CN106708953A (en) Discrete particle swarm optimization based local community detection collaborative filtering recommendation method
Cai et al. An improved random walk based clustering algorithm for community detection in complex networks
Cheng et al. A local-neighborhood information based overlapping community detection algorithm for large-scale complex networks
CN106355091B (en) Propagating source localization method based on biological intelligence
Hajibagheri et al. Social networks community detection using the shapley value
Cai et al. Evaluation repeated random walks in community detection of social networks
Cao et al. A novel community detection method based on discrete particle swarm optimization algorithms in complex networks
Bhat et al. OCMiner: a density-based overlapping community detection method for social networks
Sun et al. Dynamic community detection based on the Matthew effect
CN112015954B (en) Martha effect-based community detection method
Pan et al. Overlapping community detection via leader-based local expansion in social networks
Gao et al. Accelerating graph mining algorithms via uniform random edge sampling
CN116932923A (en) Project recommendation method combining behavior characteristics and triangular collaboration metrics
Sheng et al. Overlapping community detection via preferential learning model
Chebotarev et al. How to choose the most appropriate centrality measure? A decision tree approach
Sun et al. Heterogeneous network representation learning based on role feature extraction
Khatri et al. Influence Maximization in social networks using discretized Harris’ Hawks Optimization algorithm
CN108595684A (en) A kind of overlapping community discovery method and system based on preferential learning mechanism
Yang et al. A few-shot inductive link prediction model in knowledge graphs
Zhng et al. A multi-objective hybrid genetic algorithm for detecting communities in complex networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant