CN111738515A - Social network community discovery method based on local distance and node rank optimization function - Google Patents

Info

Publication number
CN111738515A
Authority
CN
China
Prior art keywords
node
network
social
representing
community
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010581334.8A
Other languages
Chinese (zh)
Other versions
CN111738515B (en)
Inventor
刘小洋
丁楠
刘加苗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Technology
Original Assignee
Chongqing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Technology
Priority to CN202010581334.8A
Publication of CN111738515A
Application granted
Publication of CN111738515B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a social network community discovery method based on local distance and a node rank optimization function, comprising the following steps: S1, acquiring a network social node data set and performing Laplace normalization on it to obtain a Laplace node matrix; S2, calculating a network social node value from the internal distance and the external distance of the social network: if the network social node value is greater than or equal to a preset network social node value, a network social community is discovered; if it is smaller than the preset value, the network social community is rediscovered. First, the invention takes the node self-transfer problem into account. Second, it comprehensively considers the edge weight problem and can effectively reveal the characteristic structure of the whole social network. Finally, compared with other methods, the method has better performance.

Description

Social network community discovery method based on local distance and node rank optimization function
Technical Field
The invention relates to the technical field of social networks, in particular to a social network community discovery method based on local distance and a node rank optimization function.
Background
Over the last two decades, the internet has developed at an accelerating pace worldwide, data networks have become increasingly important in human society, and researchers have become increasingly interested in complex networks. In nature, complex networks take diverse forms and are composed of relatively independent communities that influence one another; examples include social networks, biological networks, economic networks, and information networks. Community structure is an important topological attribute of a complex network, so community discovery is significant for complex network analysis, data mining, and related research. Through this attribute, community discovery helps analyze complex networks, extract useful information, and apply it in fields such as text analysis, personalized recommendation systems, user identification, epidemic propagation, and behavior prediction.
Although there are many articles on social network community discovery, they rest on the same premise: in a network, the nodes contained in each cluster must be somehow related to each other, rather than to nodes outside the cluster, to form a community. Most researchers hold that a community is characterized by dense connections among its own nodes and sparse connections to nodes outside it. Since the pioneering work of Girvan and Newman, many algorithms for community detection in complex networks have been proposed; the most typical include modularity optimization, label propagation, greedy, random walk, spectral partitioning, and fuzzy algorithms.
Disclosure of Invention
The invention aims to at least solve the technical problems in the prior art, and particularly provides a social network community discovery method based on local distance and a node rank optimization function.
In order to achieve the above object, the present invention provides a social network community discovery method based on local distance and node rank optimization function, including the following steps:
S1, acquiring a network social node data set, and performing Laplace normalization on the acquired data set to obtain a Laplace node matrix;
S2, calculating a network social node value from the internal distance and the external distance of the social network:
if the network social node value is greater than or equal to the preset network social node value, a network social community is discovered;
if the network social node value is smaller than the preset network social node value, the network social community is rediscovered.
In a preferred embodiment of the present invention, in step S1, the Laplace normalization of the acquired network social node data set is calculated as:
Figure BDA0002552421540000021
where D represents the node degree matrix;
Figure BDA0002552421540000022
represents the unnormalized Laplacian matrix;
A represents the adjacency matrix.
In a preferred embodiment of the present invention, in step S1, the element values of the Laplace node matrix are calculated as:
Figure BDA0002552421540000023
where deg(v_i) represents the degree of node i;
deg(v_j) represents the degree of node j;
v_i represents node i;
v_j represents node j;
Figure BDA0002552421540000024
represents the element value in the i-th row and j-th column of the Laplace node matrix.
In a preferred embodiment of the present invention, in step S2, the internal distance of the social network is calculated as:
Figure BDA0002552421540000031
where L_sym represents the Laplace node matrix;
Figure BDA0002552421540000032
represents the adjacency matrix of the node set V_k;
G represents the social network;
V_k represents a node set, k = 1, 2, 3, ..., K;
d_internal(G, V_k) represents the internal distance of the social network.
In a preferred embodiment of the present invention, in step S2, the external distance of the social network is calculated as:
Figure BDA0002552421540000033
where L_sym represents the Laplace node matrix;
Figure BDA0002552421540000034
represents the adjacency matrix of V - V_k;
Figure BDA0002552421540000035
represents the adjacency matrix of the node set V_k;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
G represents the social network;
V_k represents a node set, k = 1, 2, 3, ..., K;
d_external(G, V_k) represents the external distance of the social network.
In a preferred embodiment of the present invention, in step S2, the network social node value is calculated as:
Figure BDA0002552421540000036
where V_k represents a node set, k = 1, 2, 3, ..., K;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
d_internal(G, V_k) represents the internal distance of the social network;
d_external(G, V_k) represents the external distance of the social network;
S_LDL(G, V) represents the network social node value.
In a preferred embodiment of the invention, the adjacency matrix of the node set V_k,
Figure BDA0002552421540000041
is calculated as:
Figure BDA0002552421540000042
where V_k represents a node set, k = 1, 2, 3, ..., K;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
v_x represents node x, x = 1, 2, 3, ..., N;
v_y represents node y, y = 1, 2, 3, ..., N.
In a preferred embodiment of the present invention, in step S2, the adjacency matrix of the node set V - V_k,
Figure BDA0002552421540000043
is calculated as:
Figure BDA0002552421540000044
where V_k represents a node set, k = 1, 2, 3, ..., K;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
v_x represents node x, x = 1, 2, 3, ..., N;
v_y represents node y, y = 1, 2, 3, ..., N.
In summary, by adopting the above technical scheme, the invention first takes the node self-transfer problem into account. Second, it comprehensively considers the edge weight problem and can effectively reveal the characteristic structure of the whole social network. Finally, compared with other methods, the method has better performance.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flow diagram of the present invention.
FIG. 2 is a schematic diagram of local distance community partitioning according to the present invention.
Fig. 3 is a schematic diagram comparing different algorithms of the present invention on an artificial network.
FIG. 4 is a schematic diagram illustrating an overview of the community discovery process of the present invention.
Fig. 5 is a schematic view of the visualization of the present invention on different networks.
FIG. 6 is a schematic diagram of community membership in a real data network according to the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
To date, several classical and effective local community discovery algorithms and matrix factorization (MF) algorithms have been proposed. Liu et al. proposed a local community discovery framework based on node pair similarity; a new local community discovery algorithm can be obtained by embedding a better node similarity measure into it.
Clauset et al. proposed a measure R of local community structure, calculated as follows:
Figure BDA0002552421540000051
where B is a local community, R represents the measure of local community structure, B_in represents the number of edges whose endpoints are both in local community B, and B_out is the number of edges that have one endpoint in local community B. The algorithm requires a predefined community size. It keeps adding the neighbor node that increases R the most to the current community until the current community reaches the predefined size.
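The greedy expansion described above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the formula image is not reproduced in this text, so the ratio form R = B_in / (B_in + B_out) is an assumption, and the names `local_r`, `grow_community`, and the example graph `adj` are hypothetical.

```python
def local_r(adj, community):
    """Assumed measure R = B_in / (B_in + B_out) for a candidate community."""
    b_in = b_out = 0
    for u in community:
        for v in adj[u]:
            if v in community:
                b_in += 1   # each internal edge is seen from both endpoints
            else:
                b_out += 1  # edge with exactly one endpoint in the community
    b_in //= 2
    total = b_in + b_out
    return b_in / total if total else 0.0


def grow_community(adj, seed, size):
    """Add the neighbor that increases R the most until `size` nodes are reached."""
    community = {seed}
    while len(community) < size:
        frontier = {v for u in community for v in adj[u]} - community
        if not frontier:
            break  # no neighbor left to add
        # sorted() makes tie-breaking deterministic (lowest node id wins)
        best = max(sorted(frontier), key=lambda v: local_r(adj, community | {v}))
        community.add(best)
    return community


# Two triangles joined by the single edge 2-3 (undirected adjacency sets).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
found = grow_community(adj, seed=0, size=3)
```

Starting from node 0 with a target size of 3, the sketch recovers the first triangle, which illustrates why the predefined size matters: a larger target would force nodes from the second triangle into the community.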
Luo et al. proposed another local community measure M, calculated as follows:
Figure BDA0002552421540000052
where M represents the local community measure, E_in represents the number of internal edges of the community, and E_out represents the number of edges between the community boundary and external nodes. The algorithm provides three heuristic node search methods that partially solve the problem of community discovery in a complex network. However, it must set different thresholds for networks of different sizes.
The two algorithms have several desirable advantages: they can detect clusters of any shape, do not need the number of clusters to be preset, and can display how centers are selected through a decision graph. However, DPC still has drawbacks. First, the truncation distance has a large impact on the clustering results. Second, manual intervention is required to select suitable cluster centers.
Lancichinetti et al. proposed a fitness function F_c to measure the density of nodes within a community. The fitness function is defined as follows:
Figure BDA0002552421540000061
where
Figure BDA0002552421540000062
represents the sum of the internal degrees of community c,
Figure BDA0002552421540000063
denotes the sum of the external degrees of community c, α denotes a resolution parameter controlling the size of the detected community, and F_c represents the density of nodes within the community. This quality function effectively measures the closeness of nodes in the community but cannot make full use of local information between nodes.
Xu et al. studied how to apply genetic algorithms from computational intelligence to community discovery in directed and undirected networks and developed iterative optimization algorithms. Wang et al. proposed a method for discovering overlapping communities using a Bayesian MF model. The advantage of this approach is that the number of communities is determined automatically and there is no resolution limit. However, its internal estimate of the number of communities may mislead the decomposition and return a wrong solution.
Guo et al. use local central nodes and Jaccard coefficients to detect the core members of a community as seeds in the network, thereby ensuring that the selected seeds are the central nodes of the community. Each time, the node with the greatest degree in the seed is pre-expanded through the fitness function. The first k nodes that perform best during pre-expansion are then expanded according to the fitness function, using the internal force among the nodes, to obtain high-quality communities in the network.
Chen et al. proposed a community discovery method that separates overlapping communities from the network using a non-negative matrix factorization (NMF) model and solves the problem of an unknown number of communities through feature matrix preprocessing and sorting optimization, so that the algorithm can partition networks whose number of communities is unknown. Hu et al. proposed an improved Lagrangian alternating direction algorithm for symmetric non-negative matrix factorization.
Recently, Li et al. proposed a community partitioning method based on semi-supervised matrix factorization and random walk. The transition probabilities among nodes are calculated from the network topology, the final walk probabilities are obtained with a random walk model, and a feature matrix is constructed.
Wu et al. proposed a framework called mixed hypergraph regularized non-negative matrix factorization (MHGNMF) that takes higher-order information between nodes into account to improve clustering performance. The hypergraph regularization term forces the nodes in the same hyperedge to be projected into the same latent subspace, yielding a more discriminative representation. In the proposed framework, topological connectivity information and structural similarity information are exploited by mixing the two kinds of neighbors of each centroid to generate a set of hyperedges.
The above local community discovery algorithms all use the topological properties of the network and all assume by default that the edges of the network have the same weight. In a real data network, however, the connection strength between entities differs, and node bias is not considered, so the weights are easily estimated incorrectly. To the best of the inventors' knowledge, no other community discovery work combines local distance with a method based on Laplacian matrix decomposition.
To better describe the proposed model, the invention uses the following mathematical definitions:
Definition 1: The network G = (V, E) is composed of a node partition set V and an edge set E. The nodes contained in the node partition set V are labeled v_1, v_2, v_3, ..., v_N, where v_p denotes node p, p = 1, 2, 3, ..., N, and N is the total number of nodes in the node partition set. The edges contained in the edge set E indicate which nodes are connected: if the edge pair ∪_{x≠y}(v_x, v_y) is in the edge set E, with x = 1, 2, 3, ..., N and y = 1, 2, 3, ..., N, then v_x is connected to v_y. The invention processes only undirected graphs, so the edge pair ∪_{x≠y}(v_x, v_y) is the same as the edge pair ∪_{y≠x}(v_y, v_x); the latter indicates that v_y is connected to v_x, i.e., v_y and v_x are connected to each other. Here ∪_ζ denotes the condition ζ, i.e., ∪_{x≠y} denotes the condition x ≠ y and ∪_{y≠x} denotes the condition y ≠ x.
Definition 2: Each network G has an adjacency matrix A. If a network G has N nodes, the adjacency matrix A is an N × N matrix whose entries are 0 or 1: A_pq = 1 if and only if the edge pair ∪_{p≠q}(v_p, v_q) ∈ E, with p = 1, 2, 3, ..., N and q = 1, 2, 3, ..., N, i.e., v_p and v_q are connected. By Definition 1, ∪_{p≠q}(v_p, v_q) = ∪_{q≠p}(v_q, v_p), so the adjacency matrix A is a symmetric matrix.
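Definitions 1 and 2 can be illustrated with a short sketch that builds the symmetric 0/1 adjacency matrix from an undirected edge list. The function name `adjacency_matrix`, the 0-based node indexing, and the example edge list are illustrative assumptions, not part of the patent.

```python
import numpy as np


def adjacency_matrix(num_nodes, edges):
    """Build the symmetric 0/1 adjacency matrix A of an undirected graph."""
    A = np.zeros((num_nodes, num_nodes), dtype=int)
    for p, q in edges:
        if p == q:
            continue  # Definition 1 only admits edge pairs with p != q
        A[p, q] = 1
        A[q, p] = 1   # undirected: (v_p, v_q) and (v_q, v_p) are the same edge
    return A


# Hypothetical 4-node network: a triangle 0-1-2 plus the edge 2-3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
A = adjacency_matrix(4, edges)
```

Because every edge is written into both A[p, q] and A[q, p], the sum of all entries equals twice the number of edges.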
Definition 3: Community discovery on the network G = (V, E) divides the node partition set V into node sets V_1, V_2, V_3, ..., V_K such that V_1 ∪ V_2 ∪ V_3 ∪ ... ∪ V_K = V and none of V_1, V_2, V_3, ..., V_K is empty. The node sets V_1, V_2, V_3, ..., V_K are the community structure. The invention defines a partition as V = {V_1, V_2, V_3, ..., V_K}. The number of partitions is K = <V>, where <V> denotes the number of node sets in the node partition set V.
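The two conditions of Definition 3 (no empty node set, and the union of the node sets equals V) can be checked with a small sketch. The function name `is_valid_partition` and the example data are hypothetical; the check deliberately tests only what Definition 3 states.

```python
def is_valid_partition(nodes, communities):
    """Check Definition 3: non-empty node sets whose union is the full node set."""
    if not communities or any(len(c) == 0 for c in communities):
        return False
    covered = set().union(*communities)
    return covered == set(nodes)


nodes = range(5)
partition = [{0, 1}, {2, 3, 4}]
K = len(partition)  # number of node sets, written <V> in Definition 3
```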
Definition 4: given a network G ═ (V, E) and a set of node partitions V ═ V1,V2,V3,...,VKThe edges of the network G can be divided into an edge set EmnI.e. Emn∈E,
Figure BDA0002552421540000081
And is
Figure BDA0002552421540000082
1,2,3, K, n 1,2,3, K; if and only if
Figure BDA0002552421540000083
And is
Figure BDA0002552421540000084
There is an edge pair
Figure BDA0002552421540000085
Figure BDA0002552421540000086
Definition 5: In particular, define
Figure BDA0002552421540000087
and
Figure BDA0002552421540000088
k = 1, 2, 3, ..., K, l = 1, 2, 3, ..., K. In other words, the internal edge set
Figure BDA0002552421540000089
contains the edges inside the node set V_k; both nodes of any edge pair in the internal edge set
Figure BDA00025524215400000810
belong to the same community. The external edge set
Figure BDA00025524215400000811
contains the edges that leave V_k; for any edge pair in the external edge set
Figure BDA00025524215400000812
one node belongs to the node set V_k, and the other node does not belong to V_k but to the node set V - V_k.
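The internal and external edge sets of Definition 5 can be sketched by classifying each edge by how many of its endpoints lie in V_k. The function name `split_edges` and the example network are illustrative assumptions.

```python
def split_edges(edges, community):
    """Return (internal, external) edge lists for one node set V_k."""
    community = set(community)
    internal, external = [], []
    for u, v in edges:
        endpoints_inside = (u in community) + (v in community)
        if endpoints_inside == 2:
            internal.append((u, v))   # both endpoints in V_k
        elif endpoints_inside == 1:
            external.append((u, v))   # one endpoint in V_k, the other in V - V_k
        # endpoints_inside == 0: the edge does not touch V_k at all
    return internal, external


edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]
internal, external = split_edges(edges, {0, 1, 2})
```

Edges with neither endpoint in V_k, such as (3, 4) here, belong to neither set for this particular community.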
The invention discloses a social network community discovery method based on local distance and a node rank optimization function, which comprises the following steps of:
S1, acquiring a network social node data set, and performing Laplace normalization on the acquired data set to obtain a Laplace node matrix;
S2, calculating a network social node value from the internal distance and the external distance of the social network:
if the network social node value is greater than or equal to the preset network social node value, a network social community is discovered;
if the network social node value is smaller than the preset network social node value, the network social community is rediscovered.
In a preferred embodiment of the present invention, in step S1, the Laplace normalization of the acquired network social node data set is calculated as:
Figure BDA00025524215400000813
where D represents the node degree matrix;
Figure BDA0002552421540000091
represents the unnormalized Laplacian matrix;
A represents the adjacency matrix.
In a preferred embodiment of the present invention, in step S1, the element values of the Laplace node matrix are calculated as:
Figure BDA0002552421540000092
where deg(v_i) represents the degree of node i;
deg(v_j) represents the degree of node j;
v_i represents node i;
v_j represents node j;
Figure BDA0002552421540000093
represents the element value in the i-th row and j-th column of the Laplace node matrix.
In a preferred embodiment of the present invention, in step S2, the internal distance of the social network is calculated as:
Figure BDA0002552421540000094
where L_sym represents the Laplace node matrix;
Figure BDA0002552421540000095
represents the adjacency matrix of the node set V_k;
G represents the social network;
V_k represents a node set, k = 1, 2, 3, ..., K;
d_internal(G, V_k) represents the internal distance of the social network.
In a preferred embodiment of the present invention, in step S2, the external distance of the social network is calculated as:
Figure BDA0002552421540000101
where L_sym represents the Laplace node matrix;
Figure BDA0002552421540000102
represents the adjacency matrix of V - V_k;
Figure BDA0002552421540000103
represents the adjacency matrix of the node set V_k;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
G represents the social network;
V_k represents a node set, k = 1, 2, 3, ..., K;
d_external(G, V_k) represents the external distance of the social network.
In a preferred embodiment of the present invention, in step S2, the network social node value is calculated as:
Figure BDA0002552421540000104
where V_k represents a node set, k = 1, 2, 3, ..., K;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
d_internal(G, V_k) represents the internal distance of the social network;
d_external(G, V_k) represents the external distance of the social network;
S_LDL(G, V) represents the network social node value.
In a preferred embodiment of the present invention, in step S2, the adjacency matrix of the node set V_k,
Figure BDA0002552421540000105
is calculated as:
Figure BDA0002552421540000106
where V_k represents a node set, k = 1, 2, 3, ..., K;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
v_x represents node x, x = 1, 2, 3, ..., N;
v_y represents node y, y = 1, 2, 3, ..., N.
In a preferred embodiment of the present invention, in step S2, the adjacency matrix of the node set V - V_k,
Figure BDA0002552421540000107
is calculated as:
Figure BDA0002552421540000111
where V_k represents a node set, k = 1, 2, 3, ..., K;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
v_x represents node x, x = 1, 2, 3, ..., N;
v_y represents node y, y = 1, 2, 3, ..., N.
In a preferred embodiment of the present invention, the method further comprises the steps of:
S3, optimizing the social network community found in step S2;
S4, displaying the social network community obtained in step S3.
In step S3, the found social network community is optimized as follows:
Figure BDA0002552421540000112
where V_k represents a node set, k = 1, 2, 3, ..., K;
V represents the node partition set, V = {V_1, V_2, V_3, ..., V_K};
v_i represents node i;
the conditional notation indicates that, in the case of … …, there is … …;
V[v_i] indicates that node i belongs to the node set V[v_i];
v_j represents node j;
A_ij represents the element value in the i-th row and j-th column of the adjacency matrix A;
if the condition holds, the node set V[v_i] is kept;
if not, the node set V[v_i] is discarded.
In a preferred embodiment of the present invention, the method further comprises:
Figure BDA0002552421540000113
where m represents the total number of edges connecting nodes; A_ij represents the element values of the adjacency matrix A; F_ij represents the proportion of edges connecting the two nodes i and j;
Figure BDA0002552421540000121
where deg(v_i) represents the degree of node i; deg(v_j) represents the degree of node j; v_i represents node i; v_j represents node j;
Figure BDA0002552421540000122
and/or the method further comprises a method for calculating the Jaccard coefficient:
Figure BDA0002552421540000123
where V_M is the optimal community structure;
V_0 is a reference vector;
J(V_M, V_0) represents the Jaccard coefficient;
when V_M and V_0 are both empty, J(V_M, V_0) = 1;
and/or the method further comprises a method for calculating the Error index:
Figure BDA0002552421540000124
where V_M' is the structural feature of V_M;
V_0' is the structural feature of V_0;
E(V_M', V_0') represents the Error index;
when V_M has the same community structure as V_0, E(V_M', V_0') is equal to 0.
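The Jaccard coefficient mentioned above can be sketched as follows. The formula image is not reproduced in this text, so the sketch assumes the standard definition |V_M ∩ V_0| / |V_M ∪ V_0| applied to sets of node labels, together with the stated convention that two empty inputs give 1. The function name `jaccard` and the sample sets are hypothetical.

```python
def jaccard(found, reference):
    """Standard Jaccard coefficient; returns 1.0 when both inputs are empty."""
    found, reference = set(found), set(reference)
    if not found and not reference:
        return 1.0  # convention stated in the text: J = 1 when both are empty
    return len(found & reference) / len(found | reference)


score = jaccard({1, 2, 3}, {2, 3, 4})  # 2 shared members out of 4 distinct ones
```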
As shown in FIG. 2, the entire network G is divided into 5 partitions, i.e., V = {V_1, V_2, V_3, V_4, V_5}; the figure briefly indicates, for partition V_1, the internal edge set
Figure BDA0002552421540000125
and the external edge set
Figure BDA0002552421540000126
Community discovery finds a node partition set V = {V_1, V_2, V_3, ..., V_K} of a network G = (V, E); the nodes contained in each cluster must be somehow related to each other, rather than to nodes outside the cluster, to form a community. First, to address the self-transmission of node information, the invention takes into account each node's influence on itself, introduces a self-degree matrix, and constructs the following model using the Laplacian matrix decomposition principle:
Figure BDA0002552421540000127
where D represents the node degree matrix;
Figure BDA0002552421540000131
represents the unnormalized Laplacian matrix; A represents the adjacency matrix; I_n represents the identity matrix of order n; L_sym is the Laplace node matrix.
To extract the network features completely, the edge weight problem is fully considered. The matrix is obtained by normalizing the adjacency matrix: both sides of the adjacency matrix are multiplied by the inverse square root of the node degrees. For a single node, normalization divides by the node's degree, so that the information transfer value of each adjacent edge is normalized; a node with 10 edges then does not have more influence than a node with 1 edge, because after normalization each of its edges carries a weight of only 0.1. Going from a single node to a two-dimensional matrix, the division becomes multiplication by the inverse of the degree matrix, which is in essence a matrix division that completes the normalization. Here, however, the left and right sides are multiplied by the inverse square roots of the degrees of nodes i and j respectively, i.e., the degrees of the two endpoints of an edge. For each node pair v_i, v_j, the elements of the matrix are given by the following equation:
Figure BDA0002552421540000132
where deg(v_i) represents the degree of node i and deg(v_j) represents the degree of node j, i.e., the values of the degree matrix at nodes i and j; v_i represents node i; v_j represents node j;
Figure BDA0002552421540000133
represents the element value in the i-th row and j-th column of the Laplace node matrix.
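The normalization described above can be sketched numerically. The patent's own formula is an image that is not reproduced here, so this sketch assumes the standard symmetric normalization L_sym = I_n - D^(-1/2) A D^(-1/2), which matches the description of multiplying the adjacency matrix on both sides by the inverse square roots of the node degrees. Whether self-loops are added to A beforehand is left open, and the name `normalized_laplacian` is hypothetical.

```python
import numpy as np


def normalized_laplacian(A):
    """Assumed form: L_sym = I - D^(-1/2) A D^(-1/2) for a symmetric matrix A."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    has_edges = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[has_edges] = 1.0 / np.sqrt(deg[has_edges])
    # Entry (i, j) of the scaled matrix is A_ij / sqrt(deg(v_i) * deg(v_j)).
    scaled = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    # Isolated nodes get a zero diagonal entry instead of 1.
    return np.diag(has_edges.astype(float)) - scaled


# Path graph 0-1-2: degrees are 1, 2, 1.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
L_sym = normalized_laplacian(A)
```

For the edge between nodes 0 and 1 the off-diagonal entry is -1/sqrt(1 * 2), so an edge attached to a high-degree node contributes less than an edge between two low-degree nodes.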
The inner and outer distances are given by:
Figure BDA0002552421540000134
Figure BDA0002552421540000135
where L_sym represents the Laplace node matrix;
Figure BDA0002552421540000136
represents the adjacency matrix of V_k;
Figure BDA0002552421540000137
represents the adjacency matrix of V - V_k; d_internal(G, V_k) represents the internal distance of the social network; d_external(G, V_k) represents the external distance of the social network; d_external(G, V_k) can be written as d_e(G, V_k) or d_e; d_internal(G, V_k) can be written as d_i(G, V_k) or d_i.
Figure BDA0002552421540000138
Figure BDA0002552421540000139
In equation (8), when node x and node y both belong to the node set V_k (a node set is also called a community), the element in row x and column y of the adjacency matrix of V_k,
Figure BDA0002552421540000141
is 1; when node x belongs to the node set V_k and node y belongs to the node set V - V_k, the element in row x and column y of the adjacency matrix of V_k,
Figure BDA0002552421540000142
is 0. Similarly, in equation (9), when node x and node y both belong to the node set V_k, the element in row x and column y of the adjacency matrix of V - V_k,
Figure BDA0002552421540000143
is 0; when node x belongs to the node set V_k and node y belongs to the node set V - V_k, the element in row x and column y of the adjacency matrix of V - V_k,
Figure BDA0002552421540000144
is 1.
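The two indicator matrices described for equations (8) and (9) can be sketched with outer products of a membership vector. The text only specifies the entries for rows x that belong to V_k; setting all other rows to 0, the 0-based indexing, and the name `community_masks` are assumptions made for illustration.

```python
import numpy as np


def community_masks(num_nodes, community):
    """Return (inside, outside) 0/1 matrices for one node set V_k."""
    member = np.zeros(num_nodes, dtype=int)
    member[sorted(community)] = 1
    inside = np.outer(member, member)       # 1 where x and y are both in V_k
    outside = np.outer(member, 1 - member)  # 1 where x is in V_k and y is in V - V_k
    return inside, outside


inside, outside = community_masks(4, {0, 1})
```

With 2 of 4 nodes in V_k, each matrix has exactly 2 × 2 = 4 entries equal to 1, and no position is 1 in both matrices.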
For all v_x ∈ V, A_xx = 1 (i.e., each node has a self-loop). All edges except the self-loops are counted twice. d_internal(G, V_k) takes values in [0, 1]. When the network G is a union of mutually disconnected communities, the graph has a perfect community structure. d_external(G, V_k) also takes values in [0, 1] (for a perfect community structure graph, its value is 0).
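The counting convention above (one self-loop per node, every other edge counted twice) can be checked with a short sketch. The helper name `add_self_loops` and the triangle example are illustrative assumptions.

```python
import numpy as np


def add_self_loops(A):
    """Return a copy of A with A_xx = 1 for every node x."""
    A = np.array(A, dtype=int)  # np.array copies, so the input is not modified
    np.fill_diagonal(A, 1)
    return A


# Triangle on 3 nodes: 3 undirected edges.
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
A_loop = add_self_loops(triangle)
num_nodes, num_edges = 3, 3
```

Summing all entries gives N + 2|E|: each self-loop appears once on the diagonal, and each ordinary edge appears in two symmetric positions.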
The local distance Laplace network social node value function is as follows:
Figure BDA0002552421540000145
where V_k represents a node set; V represents the node partition set; d_internal(G, V_k) represents the internal distance of the social network; d_external(G, V_k) represents the external distance of the social network; S_LDL(G, V) represents the network social node value.
One point to emphasize about the LDL model is that the weight of each local partition (local internal distance plus local external distance) is |V_k|/2|V|. This prevents smaller communities from having a disproportionate impact on the overall community score.
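The size-proportional weighting can be illustrated with a small sketch. It reads the weight as |V_k| / (2|V|) and assumes the communities are disjoint so that their sizes sum to |V|; both readings are assumptions, as is the function name `partition_weights`. The score function itself is not reproduced, since its formula is not available in this text.

```python
def partition_weights(communities):
    """Weight |V_k| / (2|V|) for each node set, assuming disjoint communities."""
    total_nodes = sum(len(c) for c in communities)
    return [len(c) / (2 * total_nodes) for c in communities]


weights = partition_weights([{0, 1, 2}, {3}])
```

Under these assumptions the weights always sum to 1/2, and a one-node community contributes only a third of what a three-node community does.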
3.3 Node Rank Optimization Function
The community discovery algorithm proposed by the present invention can generate more than one possible community discovery result, so the results must be optimized. The optimal community selection method provided by the invention is based on the idea of community validity, namely that a community should have more edges inside it than leaving it. The weak criterion (WRC) and the strong criterion (SRC) were first proposed by Radicchi et al., but the WRC is too weak and does not discriminate between individual nodes: even if a node u is completely disconnected from V_k, u can still be added to V_k and the result still satisfies the WRC. This can cause many discovered communities to be invalid. Therefore, the present invention provides a node rank optimization (NRO) function, which is as follows:
Figure BDA0002552421540000151
wherein,
Figure BDA0002552421540000152
that is,
Figure BDA0002552421540000153
Φ represents the node set to which node j belongs; A_ij represents the element in row i, column j of the adjacency matrix A; V[i] denotes the node set to which node i belongs; and the conditional symbol indicates that, in the case of …, there is ….
Figure BDA0002552421540000154
where v_i represents node i and v_y represents node y.
Thus, the NRO function is expressed as follows:
Figure BDA0002552421540000155
where V_k represents a node set and V represents the node partition set; v_i represents node i; the conditional symbol indicates that, in the case of …, there is …; V[v_i] denotes the node set to which node i belongs; v_j represents node j; and A_ij represents the element in row i, column j of the adjacency matrix A.
If the condition holds, the node set V[v_i] is kept;
if not, the node set V[v_i] is discarded.
Because the two coordinating terms V[i] and V − V[i] are both taken into account, the NRO criterion is stronger than the WRC and weaker than the SRC, and thus achieves a better optimization effect.
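To make the two reference points concrete, the sketch below checks the strong and weak criteria in the form usually attributed to Radicchi et al.: the SRC requires every member to have more neighbors inside the community than outside, and the WRC requires only that the total internal degree exceed the total external degree. It does not implement the NRO function itself, whose formula appears only in the equation figures. The function names and the example graph are hypothetical, and the adjacency matrix is assumed to be symmetric with a zero diagonal.

```python
def split_degree(adj, community, node):
    """Return (internal, external) neighbor counts of `node` relative to `community`."""
    members = set(community)
    k_in = sum(adj[node][j] for j in members if j != node)
    k_out = sum(adj[node][j] for j in range(len(adj)) if j not in members)
    return k_in, k_out


def satisfies_src(adj, community):
    """Strong criterion: every member has more internal than external neighbors."""
    return all(k_in > k_out
               for k_in, k_out in (split_degree(adj, community, v) for v in community))


def satisfies_wrc(adj, community):
    """Weak criterion: total internal degree exceeds total external degree."""
    pairs = [split_degree(adj, community, v) for v in community]
    return sum(k_in for k_in, _ in pairs) > sum(k_out for _, k_out in pairs)


# Hypothetical example: triangle 0-1-2, node 3 attached to node 2, node 4 isolated.
EXAMPLE_ADJ = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
```

In this example the triangle {0, 1, 2} satisfies both criteria. Adding the isolated node 4 leaves both degree sums unchanged, so the WRC still holds, while the SRC fails because node 4 has no internal neighbors. This is the weakness of the WRC that the description points out.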
The main flow of the algorithm provided by the invention is as follows:
Figure BDA0002552421540000156
Figure BDA0002552421540000161
4 Experimental Results and Analysis
To evaluate the algorithm proposed by the present invention, eleven real data networks and artificial network datasets are used. The data sources are http://www-personal.umich.edu/~mejn/netdata/ and http://snap.stanford.edu/data/. The hardware and software environment of the experiments was as follows: Intel(R) Core(TM) i5-4160M CPU at 3.60 GHz, 4 GB of memory, Windows 10, and MATLAB R2019a.
4.1 Evaluation Indices
In the present invention, the modularity Q is used as a performance metric in the experiments in order to evaluate performance on networks that have no ground truth.
The performance metric Q is:
Figure BDA0002552421540000171
where m represents the total number of edges; A_ij represents the elements of the adjacency matrix A; and F_ij represents the proportion of edges connecting the two nodes i and j,
which is given by
Figure BDA0002552421540000172
where deg(v_i) represents the degree of node i; deg(v_j) represents the degree of node j; v_i represents node i; and v_j represents node j.
δ_ij(c_i, c_j) is defined as follows:
Figure BDA0002552421540000173
where c_i is the community to which vertex i is assigned and c_j is the community to which vertex j is assigned.
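The symbols defined above match the widely used Newman form of modularity, Q = (1/2m) Σ_ij (A_ij − deg(v_i)·deg(v_j)/2m) δ(c_i, c_j). The sketch below assumes that form; the equation figures are not reproduced in this text, so it should be read as an illustration rather than the patent's exact formula. The function name `modularity` is hypothetical, and the graph is assumed to be undirected, to have at least one edge, and to have no self-loops.

```python
def modularity(adj, labels):
    """Newman modularity of a partition of an undirected graph.

    adj    -- symmetric 0/1 adjacency matrix (list of lists), zero diagonal
    labels -- labels[i] is the community assigned to node i
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # every edge is counted once from each endpoint
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:  # delta(c_i, c_j) == 1
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m
```

For two disconnected edges 0-1 and 2-3, the labels `[0, 0, 1, 1]` give Q = 0.5, whereas placing all four nodes in one community gives Q = 0.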
The Jaccard similarity coefficient (JSC) is used to compare the similarity and difference between finite sample sets.
Given two sets V_M and V_0, the Jaccard coefficient is defined as the ratio of the size of the intersection of V_M and V_0 to the size of their union; a larger value indicates a higher degree of similarity.
Figure BDA0002552421540000174
where V_M is the optimal community structure and V_0 is the reference vector; when V_M and V_0 are both empty, J(V_M, V_0) = 1.
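The definition above, including the convention for two empty sets, can be written directly as follows. The function name `jaccard` is hypothetical, and how the patent maps a community structure onto the two sets is not spelled out in the text, so the example simply compares two plain sets.

```python
def jaccard(v_m, v_0):
    """Jaccard coefficient |V_M ∩ V_0| / |V_M ∪ V_0| of two finite collections."""
    a, b = set(v_m), set(v_0)
    if not a and not b:
        return 1.0  # convention stated in the text: both empty gives 1
    return len(a & b) / len(a | b)
```

For example, `jaccard([1, 2, 3], [2, 3, 4])` shares two of four distinct elements and evaluates to 0.5.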
A larger Rand index (RI) means that the community discovery result is more consistent with the real situation: it indicates a more accurate clustering and a higher purity within each class.
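The text does not reproduce the Rand index formula, so the sketch below assumes the standard pair-counting definition: the fraction of node pairs on which the two partitions agree, i.e., pairs placed together in both or apart in both. The function name `rand_index` is hypothetical, and returning 1.0 when there are fewer than two nodes is a convention chosen for this sketch.

```python
from itertools import combinations


def rand_index(predicted, reference):
    """Fraction of node pairs on which two labelings agree (same/same or apart/apart)."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(predicted)), 2):
        same_predicted = predicted[i] == predicted[j]
        same_reference = reference[i] == reference[j]
        if same_predicted == same_reference:
            agree += 1
        total += 1
    return agree / total if total else 1.0  # assumption: no pairs counts as full agreement
```

Identical partitions score 1.0. For `[0, 0, 1, 1]` against `[0, 1, 0, 1]`, only two of the six pairs agree, giving 1/3.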
Error index: when V_M has the same community structure as V_0, E(V_M′, V_0′) equals 0. It is defined as follows:
Figure BDA0002552421540000175
where V_M′ is the structural feature of V_M, and V_0′ is the structural feature of V_0.
4.2 Performance Comparison on Artificial Networks
The algorithm is run on an artificial data network (the GN benchmark network). For each node, the internal edge set E_internal connects it to other nodes in the same community, and the external edge set E_external connects it to other communities. As E_external grows, the community structure becomes less clear and the community discovery task becomes more challenging.
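The effect of a growing external edge set can be explored with a simple planted-partition generator. This is a stand-in for illustration only: the actual benchmark parameters are listed in Table 1, which is not reproduced in this text, and the function name `planted_partition` and the parameters `p_in`, `p_out` and `seed` are hypothetical.

```python
import random


def planted_partition(sizes, p_in, p_out, seed=0):
    """Generate an undirected graph with planted communities.

    sizes -- list of community sizes
    p_in  -- probability of an edge between two nodes of the same community
    p_out -- probability of an edge between nodes of different communities
    Returns (adj, labels): a symmetric 0/1 adjacency matrix and the true labels.
    """
    rng = random.Random(seed)
    labels = [k for k, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:  # random() is in [0, 1), so p=1 always adds the edge
                adj[i][j] = adj[j][i] = 1
    return adj, labels
```

Raising `p_out` while holding `p_in` fixed adds external edges and blurs the planted communities, which is the regime in which the evaluation indices are expected to drop.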
TABLE 1 Artificial network parameters
Figure BDA0002552421540000181
Fig. 3 shows the performance comparison of 8 algorithms on the artificial data network. The proposed LDL algorithm was analyzed experimentally on various datasets of the artificial network and the real networks and compared with the conventional algorithms LinkLPA, MFM, LFK, NMF, LRLFP, speClust1 and speClust2.
Fig. 3(a) shows the performance of the 8 algorithms on the Jaccard coefficient. The larger the external-edge probability, the lower the Jaccard coefficient. When the external-edge probability is less than 0.4, the proposed LDL algorithm has a clear advantage; above 0.4, its Jaccard coefficient is slightly lower than that of the other algorithms, but always higher than that of the LinkLPA algorithm.
Fig. 3(b) depicts the performance of the algorithms on the Rand index. The overall trend of each algorithm is similar to that in Fig. 3(a): the Rand index gradually decreases as the external-edge probability increases. Notably, the LDL algorithm and LinkLPA have significant advantages over the other algorithms, and when the external-edge probability is less than 0.4, the LDL algorithm is better than the LinkLPA algorithm.
Fig. 3(c) shows that the algorithms do not differ much on the Modularity index, but the LDL algorithm remains ahead throughout.
In Fig. 3(d), the Error values of the algorithms differ significantly. The Error value of the LDL algorithm is the lowest when the external-edge probability is less than 0.8, and it is only 3% worse than that of the MFM algorithm when the probability is greater than 0.8. In conclusion, the LDL algorithm proposed by the present invention is better and more stable than the other 7 algorithms.
4.3 Performance Comparison on Real Networks
To further evaluate the LDL algorithm proposed by the present invention, eleven representative social networks of different sizes were selected. In Table 2, Networks gives the name of the real data network, Node the number of nodes, Edge the number of edges, A-co the average clustering coefficient of the nodes, A-Lenth the average path length, and Description the practical meaning of the network:
TABLE 2 Real networks
Networks        Node  Edge   A-co   A-Lenth  Description
Karate          34    78     0.588  2.408    Zachary’s karate club
Dolphin         62    159    0.303  3.357    Dolphin social network
Lemis           77    254    0.736  2.641    Victor Hugo novel Les Miserables
Public book     103   441    0.488  3.079    Books about US politics
Football        115   616    0.289  3.421    A map of the popular board game Risk
Celegansnertal  297   2359   0.308  2.455    Celegansnertal dissertation
Email           1005  25571  0.439  2.968    Students in ANLP course email message
Public blogs    1490  19025  0.361  2.738    Blogs about politics
Netscience      1589  2742   0.701  2.842    Co-authorship in network science
Power grid      4941  6594   0.405  2.391    The topology of Power Grid
Hep_th          8361  15751  0.636  3.129    Collaboration of High Energy Physics Theory
To better illustrate the overall social network community discovery process, Figs. 4(a) to 4(i) give a brief overview of it, taking the power grid network as an example. There are 9 subgraphs in total, i.e., 9 communities are finally formed.
Fig. 4(a) shows the first community structure to be divided (green cut set); the second community structure (purple cut set) is then divided, as shown in Fig. 4(b), followed by the third, as shown in Fig. 4(c). The process continues by analogy until the ninth community is divided, at which point the convergence criterion is reached, i.e., every node is contained in some community, as shown in Fig. 4(i).
The divided social networks have clear community structures. Figs. 5(a) to 5(d) show the visualization results of community discovery by the proposed LDL algorithm on 4 social networks: Dolphin, Lemis, Celegansnertal and Netscience. The LDL algorithm achieves high recognition quality on large-scale data networks (see Table 2); the higher the degree and average clustering coefficient of a node, the more prominently it is displayed and the more easily it becomes a community center around which a community structure forms.
Table 4 shows the results of the proposed LDL algorithm and the conventional algorithms on the Jaccard index for the real datasets. Bold values in the table indicate the best-performing algorithm, and gray-shaded values indicate the second-best algorithm.
TABLE 4 Results of the LDL algorithm and the conventional algorithms on the Jaccard index (real datasets)
Networks        LinkLPA  MFM      LDL      NMF      LRLFP    LFK      speClust1  speClust2
Karate          0.5      0.7375   0.6507   0.5882   0.325    0.6052   0.5593     0.2852
Dolphins        0.1035   0.1918   0.2131   0.1877   0.0541   0.2118   0.2161     0.2136
Lemis           0.4112   0.4793   0.6524   0.2844   0.2410   0.6276   0.4159     0.1972
Public book     0.3403   0.6671   0.6440   0.6512   0.0551   0.3951   0.6749     0.6951
Football        0.7147   0.6357   0.4052   0.8413   0.6920   0.0798   0.0798     0.07798
Celegansnertal  0.3445   0.2151   0.4804   0.343    0.0681   0.3551   0.2150     0.2151
Email           0.2599   0.0460   0.2085   0.1912   0.1251   0.0462   0.0467     0.0467
Public blogs    0.3112   0.5167   0.5690   0.5426   0.0162   0.4027   0.4120     0.4998
Netscience      0.2186   0.1332   0.2213   0.1780   0.0841   0.1464   0.0239     0.0100
Power           0.1603   0.0168   0.2240   0.2092   0.0048   0.0023   0.1371     0.0285
Hep_th          0.1524   0.1912   0.2203   0.1036   0.2015   0.1242   0.2003     0.0972
As shown in Table 4, the proposed LDL algorithm performs best on the Lemis, Celegansnertal, Public blogs, Netscience, Power and Hep_th data networks, outperforming the other 7 algorithms. It is second-best on the Karate, Dolphins and Email data networks, behind the MFM, speClust1 and LinkLPA algorithms respectively, but ahead of the other 6 algorithms. On the Public book and Football data networks it performs better than some of the other algorithms.
TABLE 5 Results of the LDL algorithm and the conventional algorithms on the Rand index (real datasets)
Figure BDA0002552421540000201
Figure BDA0002552421540000211
As shown in Table 5, the proposed LDL algorithm performs best on the Lemis, Celegansnertal, Public blogs, Netscience, Power and Hep_th data networks, outperforming the other 7 algorithms. It is second-best on the Dolphins and Email data networks, behind the LRLFP algorithm, but ahead of the other 6 algorithms. On the Karate, Public book and Football data networks it performs slightly better than some of the other algorithms; for example, on the Karate data network it outperforms 5 algorithms: LFK, speClust1, LinkLPA, LRLFP and speClust2.
TABLE 6 Results of the LDL algorithm and the conventional algorithms on the Modularity index (real datasets)
Networks        LinkLPA  MFM      LDL      NMF      LRLFP    LFK      speClust1  speClust2
Karate          0.4427   0.4477   0.4347   0.4459   0.3663   0.4343   0.4116     0.1545
Dolphins        0.46     0.0108   0.4709   0.4486   0.4022   0.01080  0.0054     0.1299
Lemis           0.5882   0.5768   0.5772   0.4849   0.5298   0.5632   0.1088     0.2034
Public book     0.5531   0.5091   0.5196   0.5182   0.4117   0.4065   0.4595     0.4209
Football        0.6189   0.5423   0.6092   0.6236   0.6171   0.1075   0.5933     0.5753
Celegansnertal  0.433    0.4378   0.4521   0.3761   0.1722   0.2035   0.0092     0.3874
Email           0.6178   0.0381   0.5008   0.6547   0.6547   0.3292   0.1002     0.3048
Public blogs    0.3007   0.3431   0.3967   0.367    0.1864   0.1155   0.0087     0.2133
Netscience      0.8085   0.8118   0.872    0.8238   0.8238   0.8011   0.2062     0.73
Power           0.5826   0.531    0.6438   0.6241   0.5471   0.5289   0.6207     0.5227
Hep_th          0.6021   0.6754   0.7181   0.501    0.6503   0.685    0.6821     0.5431
As shown in Table 6, the proposed LDL algorithm performs best on the Dolphins, Celegansnertal, Public blogs, Netscience, Power and Hep_th data networks, outperforming the other 7 algorithms. It is second-best on the Lemis and Public book data networks, behind the LinkLPA algorithm, but ahead of the other 6 algorithms. On the Karate, Football and Email data networks it performs slightly better than some of the other algorithms; taking the Football data network as an example, it outperforms 4 algorithms: MFM, LFK, speClust1 and speClust2.
TABLE 7 Results of the LDL algorithm and the conventional algorithms on the Error index (real datasets)
Networks        LinkLPA  MFM      LDL      NMF      LRLFP    LFK      speClust1  speClust2
Karate          0.75     0.5      0.25     0.5      2.5      0.25     0.25       0.75
Dolphins        0.8      0.4      0.4      0.5      3.2      0.4      0.6        0.8
Lemis           0.8333   0.6667   0.1      0.5      0.4      0.8333   0.6667     0.8333
Public book     1.3333   0.667    0.3333   0.5543   0.62     0.3333   0.6667     0.6667
Football        0.833    0.3333   0.5      0.75     0.25     0.9167   0.9167     0.9167
Celegansnertal  0.5      0.8333   0.3333   0.6667   0.7      0.8333   0.8333     0.8333
Email           0.3333   0.5      0.4286   0.7      0.75     0.5234   0.9762     0.9762
Public blogs    0.17     0.138    0.1345   0.1375   0.1375   0.417    0.5        0.5
Netscience      0.0024   0.1935   0.0123   0.032    0.4975   0.0024   0.9877     0.9975
Power           0.8611   0.7574   0.5832   0.6239   0.8621   0.8265   0.980      0.7281
Hep_th          0.4278   0.5439   0.3738   0.7421   0.6384   0.4839   0.7846     0.8971
As shown in Table 7, the proposed LDL algorithm has a more pronounced advantage on the Error index: it is optimal on ten data networks (Karate, Dolphins, Lemis, Public book, Celegansnertal, Email, Public blogs, Netscience, Power and Hep_th) and only slightly worse than the LRLFP algorithm on Football. The experimental data show that it has stronger stability.
In summary, although the proposed LDL algorithm is not optimal on every data network, the share of networks on which it is optimal or second-best is much higher than for the other algorithms. The LDL algorithm performs better on social networks with a higher average clustering coefficient and on more complex data networks, and is therefore better suited to the large scale and complexity of modern social networks.
Figs. 6(a) to 6(e) compare the community structures produced by the LDL algorithm on 5 real data networks: Karate, Lemis, Celegansnertal, Public blogs and Power grid. The abscissa is the node index and the ordinate is the community membership, i.e., the community to which each node belongs; blue is the reference community structure and red is the community structure found by the LDL algorithm. The more similar the community structure after the algorithm is executed is to the reference community structure, the higher the score.
Table 8 summarizes the performance of the LDL algorithm and the conventional algorithms from Tables 4 to 7 on each index. The loss function constructed by the present invention is Y = log10((X_1 + X_2 + X_3) / (X_4 + 1)), where X_1 is the coefficient of variation on the Jaccard index, X_2 the coefficient of variation on the Rand index, X_3 the coefficient of variation on the Modularity index, and X_4 the coefficient of variation on the Error index.
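The constructed score can be computed in a few lines. In the sketch below, `loss_y` follows the formula Y = log10((X1 + X2 + X3) / (X4 + 1)) as stated. The helper `mean_over_std` reflects the discussion of Table 8, where the coefficient of variation score is described as mean/standard deviation with higher being better; using the population standard deviation is an assumption, since the text does not say which variant was used. Both function names are hypothetical.

```python
import math
import statistics


def mean_over_std(scores):
    """Mean divided by standard deviation (population form assumed) of one algorithm's scores."""
    return statistics.mean(scores) / statistics.pstdev(scores)


def loss_y(x1, x2, x3, x4):
    """Y = log10((X1 + X2 + X3) / (X4 + 1)) for the four per-index scores."""
    return math.log10((x1 + x2 + x3) / (x4 + 1))
```

The +1 in the denominator keeps Y finite when the Error-index term is 0, and a larger Error-index term lowers Y while larger Jaccard, Rand and Modularity terms raise it.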
TABLE 8 Results of the LDL algorithm and the conventional algorithms on each index (real datasets)
Figure BDA0002552421540000231
As shown in Table 8, the performance of the LDL algorithm on each index is analyzed through the mean, the standard deviation, the coefficient of variation and the constructed loss function. First, the proposed LDL algorithm has the highest score (bold values) on both the Jaccard and Rand indices. On the Jaccard index, the mean of the LDL algorithm over the data networks is the highest, but its standard deviation is higher than that of LinkLPA, which indicates that the performance of the LDL algorithm varies more across data networks than that of LinkLPA; taken together, however, its coefficient of variation score (mean/standard deviation, higher is better) is still the highest. On the Rand index, the LDL algorithm is better than the other algorithms in both mean and standard deviation, and its coefficient of variation score is clearly higher than that of the next-best algorithm, NMF, at close to 77 percent. Second, the score of the LDL algorithm on the Modularity index is second only to that of LinkLPA, because the performance of the LDL algorithm varies more across data networks than that of LinkLPA. On the Error index, the LDL algorithm performs best of all the algorithms, with an error rate of only 0.0548, far better than the others; the experimental data show that the proposed algorithm has stronger robustness. Finally, its loss function score is also the highest, approximately 7 percentage points above the best conventional method.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (8)

1. A social network community discovery method based on local distance and node rank optimization functions, characterized by comprising the following steps:
S1, acquiring a social network node data set, and performing Laplace normalization on the acquired social network node data set to obtain a Laplace node matrix;
S2, calculating a social network node value according to the internal distance and the external distance of the social network:
if the social network node value is greater than or equal to a preset social network node value, the social network community is discovered;
if the social network node value is smaller than the preset social network node value, the social network community is rediscovered.
2. The method for discovering social network communities based on local distances and node rank optimization functions as claimed in claim 1, wherein in step S1, the calculation method of performing laplacian normalization on the obtained social networking nodes is:
Figure FDA0002552421530000011
wherein D represents a node degree matrix;
Figure FDA0002552421530000012
represents the unnormalized Laplacian matrix;
A denotes the adjacency matrix.
3. The social network community discovery method based on local distance and node rank optimization function of claim 1, wherein in step S1, the calculation method of the element values in the laplacian node matrix is:
Figure FDA0002552421530000013
wherein deg(v_i) represents the degree of node i;
deg(v_j) represents the degree of node j;
v_i represents node i;
v_j represents node j;
Figure FDA0002552421530000021
represents the element in row i, column j of the Laplace node matrix.
4. The social network community discovery method based on local distance and node rank optimization function according to claim 1, wherein in step S2, the internal distance of the social network is calculated by:
Figure FDA0002552421530000022
wherein L_sym represents the Laplace node matrix;
Figure FDA0002552421530000023
represents the adjacency matrix of the node set V_k;
G represents the social network;
V_k represents a node set; k = 1, 2, 3, ..., K;
d_internal(G, V_k) represents the internal distance of the social network.
5. The method for discovering social network community based on local distance and node rank optimization function according to claim 1, wherein in step S2, the external distance of the social network is calculated by:
Figure FDA0002552421530000024
wherein L_sym represents the Laplace node matrix;
Figure FDA0002552421530000025
represents the adjacency matrix of V − V_k;
Figure FDA0002552421530000026
represents the adjacency matrix of the node set V_k;
V represents the node partition set; V = {V_1, V_2, V_3, ..., V_K};
G represents the social network;
V_k represents a node set; k = 1, 2, 3, ..., K;
d_external(G, V_k) represents the external distance of the social network.
6. The method for discovering social network community based on local distance and node rank optimization function according to claim 1, wherein in step S2, the method for calculating the value of the social network node is:
Figure FDA0002552421530000027
wherein V_k represents a node set; k = 1, 2, 3, ..., K;
V represents the node partition set; V = {V_1, V_2, V_3, ..., V_K};
d_internal(G, V_k) represents the internal distance of the social network;
d_external(G, V_k) represents the external distance of the social network;
S_LDL(G, V) represents the social network node value.
7. The social network community discovery method based on local distance and node rank optimization function of claim 4, wherein in step S2, the adjacency matrix of the node set V_k,
Figure FDA0002552421530000031
is calculated as follows:
Figure FDA0002552421530000032
wherein V_k represents a node set; k = 1, 2, 3, ..., K;
V represents the node partition set; V = {V_1, V_2, V_3, ..., V_K};
v_x represents node x; x = 1, 2, 3, ..., N;
v_y represents node y; y = 1, 2, 3, ....
8. The social network community discovery method based on local distance and node rank optimization function of claim 5, wherein in step S2, the adjacency matrix of the node set V − V_k,
Figure FDA0002552421530000033
is calculated as follows:
Figure FDA0002552421530000034
wherein V_k represents a node set; k = 1, 2, 3, ..., K;
V represents the node partition set; V = {V_1, V_2, V_3, ..., V_K};
v_x represents node x; x = 1, 2, 3, ..., N;
v_y represents node y; y = 1, 2, 3, ....
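For illustration only, the Laplace normalization recited in claims 2 and 3 can be sketched as follows. The symbols named there (degree matrix D, unnormalized Laplacian, adjacency matrix A, and L_sym with entries depending on deg(v_i) and deg(v_j)) match the standard symmetric normalized Laplacian L_sym = D^(-1/2) (D − A) D^(-1/2), and the sketch assumes that form because the claim equations themselves are figures. The function name `normalized_laplacian` is hypothetical, and the graph is assumed to be undirected with no self-loops.

```python
import math


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian of an undirected graph without self-loops.

    Entry (i, j) is 1 if i == j and deg(v_i) > 0, -1/sqrt(deg(v_i)*deg(v_j))
    if i and j are adjacent, and 0 otherwise.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j and degree[i] > 0:
                lap[i][j] = 1.0
            elif i != j and adj[i][j]:
                lap[i][j] = -1.0 / math.sqrt(degree[i] * degree[j])
    return lap
```

For the three-node path 0-1-2, the off-diagonal entry between nodes 0 and 1 is −1/√2, since their degrees are 1 and 2.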
CN202010581334.8A 2020-06-23 2020-06-23 Social network community discovery method based on local distance and node rank optimization function Active CN111738515B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010581334.8A CN111738515B (en) 2020-06-23 2020-06-23 Social network community discovery method based on local distance and node rank optimization function

Publications (2)

Publication Number Publication Date
CN111738515A true CN111738515A (en) 2020-10-02
CN111738515B CN111738515B (en) 2021-08-10

Family

ID=72650625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010581334.8A Active CN111738515B (en) 2020-06-23 2020-06-23 Social network community discovery method based on local distance and node rank optimization function

Country Status (1)

Country Link
CN (1) CN111738515B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180293505A1 (en) * 2017-04-06 2018-10-11 Universite Paris Descartes Method for clustering nodes of a textual network taking into account textual content, computer-readable storage device and system implementing said method
CN108920678A (en) * 2018-07-10 2018-11-30 福州大学 A kind of overlapping community discovery method based on spectral clustering with fuzzy set
CN111738514A (en) * 2020-06-23 2020-10-02 重庆理工大学 Social network community discovery method using local distance and node rank optimization function
CN111738516A (en) * 2020-06-23 2020-10-02 重庆理工大学 Social network community discovery system through local distance and node rank optimization function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant