CN102684912B - Community structure mining method based on network potential energy - Google Patents


Publication number
CN102684912B
Authority
CN
China
Prior art keywords
network
node
community
potential energy
community structure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210104303.9A
Other languages
Chinese (zh)
Other versions
CN102684912A (en)
Inventor
李生红
陈秀真
赵郁忻
楼昊
蔡贵贤
陶彤彤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-04-11
Filing date
2012-04-11
Publication date
2014-10-15
Application filed by Shanghai Jiaotong University
Priority to CN201210104303.9A
Publication of CN102684912A
Application granted
Publication of CN102684912B
Legal status: Expired - Fee Related

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a community structure mining method based on network potential energy. The method comprises the following steps: introducing the physical concept of potential energy into a complex network, defining the node potential energy and the network potential energy of the complex network, partitioning the complex network by optimizing the potential energy function, and finally mining the community structure of the complex network. Because the network potential energy makes better use of the topological information of the network and reflects how tightly the topology of the whole network is connected, the method greatly improves the detection precision of the community structure.

Description

Community structure mining method based on network potential energy
Technical field
The present invention relates to a data mining method in the field of complex networks, and specifically to a method for mining the community structure of a complex network.
Background art
A complex network is an abstraction of a complex system in the real world. Nodes in the complex network represent individuals in the complex system, and connections between nodes represent relationships between those individuals that either form naturally according to some rule or are constructed artificially. At present, complex networks are widely used to characterize complex systems such as neural networks, power grids, the Internet, and social networks.
Complex networks have an important topological property: community structure. That is, the whole complex network is composed of several communities; nodes inside each community are densely connected, while connections between communities are relatively sparse. Communities correspond to functional units or organizations that share common characteristics in reality. For example, on the Internet a community is a set of websites discussing a common topic, and in a social network a community is a group of people with common interests. Mining the community structure of complex networks therefore has important practical significance.
A literature search shows that M. E. J. Newman and M. Girvan, in the article "Community structure in social and biological networks" (Proc. Natl. Acad. Sci. USA 99, 7821-7826 (2002)), proposed a community mining method based on shortest paths. Specifically: the shortest paths between every pair of nodes in the network are computed to obtain a shortest-path table for the whole network; this table is used to count how many shortest paths pass through each edge, and the count is taken as the weight of that edge; the edge with the largest weight is removed; the weights of all edges in the network are then recomputed; and these steps are repeated until the whole network has been divided into a reasonable community structure. However, the detection accuracy of this method is not very high.
A further search shows that Tyler, in the article "Email as spectroscopy: Automated discovery of community structure within organizations" (The Information Society 21(2), 143-153 (2005)), proposed a community mining method based on local shortest paths. When computing the shortest paths between pairs of nodes, this method computes and records only those shortest paths whose length is less than a set threshold T, which yields a local shortest-path table. The table is then used to count how many shortest paths pass through each edge, and the count is taken as the weight of that edge; the edge with the largest weight is removed; the weights of all edges in the network are recomputed; and these steps are repeated until the whole network has been divided into a reasonable community structure. This method runs faster and needs less storage, but it reduces detection accuracy.
A further search shows that Filippo Radicchi, Claudio Castellano et al., in the article "Defining and identifying communities in networks" (Proc. Natl. Acad. Sci. USA 101, 2658-2663 (2004)), proposed a community mining method based on local network topology. By analyzing the local topological features of community structure in complex networks, the method takes the triangle as the most basic building block of a network community. The number of triangles that each edge belongs to is counted and taken as the weight of that edge; the edge with the largest weight is removed; the weights of all edges in the network are recomputed; and these steps are repeated until the whole network has been divided into a reasonable community structure. This method takes the local features of the network topology into account, but its community detection accuracy is still not high.
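The edge weight described in this paragraph, the number of triangles an edge belongs to, can be sketched as follows. This is a minimal illustration and not the cited authors' implementation; the function name `triangle_counts` and the edge-list input format are assumptions made for the example.

```python
def triangle_counts(edges):
    """Map each undirected edge (u, v) to the number of triangles containing it."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    # An edge (u, v) lies in exactly one triangle per common neighbor of u and v.
    return {(u, v): len(neighbors[u] & neighbors[v]) for u, v in edges}


# A triangle 0-1-2 with a pendant edge 2-3 (hypothetical toy graph).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
counts = triangle_counts(edges)
```

In this toy graph every triangle edge gets weight 1 and the pendant edge gets weight 0.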
Summary of the invention
To address the shortcomings of the above prior art, the present invention provides a community mining method based on network potential energy. It introduces the physical concept of potential energy into complex network analysis, proposes a definition of network potential energy, and mines the community structure of a complex network by optimizing the potential energy function of the network.
The present invention is achieved through the following technical solution, which comprises the following steps:
Step 1: for a complex network G = (V, E), where V is the set of nodes and E is the set of edges, preprocess the input adjacency matrix A of network G to obtain the preprocessed network adjacency matrix A'.
The data preprocessing removes the nodes of degree 1 from the network, that is, it removes the rows and columns corresponding to those nodes from the input adjacency matrix.
The degree of a node is the number of other nodes connected to that node.
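A minimal sketch of this preprocessing, assuming the adjacency matrix is a symmetric 0/1 list of lists with a zero diagonal. The function name `remove_degree_one_nodes` and the returned list of kept indices are illustrative choices, not part of the original text.

```python
def remove_degree_one_nodes(adj):
    """Drop the rows and columns of all degree-1 nodes from a 0/1 adjacency matrix.

    Returns the reduced matrix and the original indices of the nodes that remain.
    Only a single pass is made: nodes whose degree drops to 1 as a result of
    the removal are kept (an assumption of this sketch).
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    kept = [i for i in range(n) if degree[i] != 1]
    reduced = [[adj[i][j] for j in kept] for i in kept]
    return reduced, kept


# Triangle 0-1-2 with a degree-1 node 3 attached to node 2 (hypothetical example).
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
reduced, kept = remove_degree_one_nodes(adj)
```

Keeping the list of surviving indices makes it straightforward to map results on A' back to the node numbering of A later on.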
Step 2: for the preprocessed network adjacency matrix A', analyze the topology of the network. Starting from each node, search all other nodes in the network with a breadth-first search to obtain the distance between every pair of nodes, and thereby build the distance matrix D.
Breadth-first search means that the search begins at a start node and visits all nodes connected to it, which gives the first layer of nodes; it then visits all nodes connected to the first-layer nodes, which gives the second layer of nodes; and so on, until all nodes in the network have been traversed.
The distance between any two nodes is the number of edges on the shortest path between those two nodes in the network.
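The construction of the distance matrix by breadth-first search can be sketched as below. The sketch assumes an unweighted, undirected 0/1 adjacency matrix; representing unreachable pairs by `math.inf` is a choice made for this example, not something the text specifies.

```python
from collections import deque
import math


def distance_matrix(adj):
    """All-pairs hop distances obtained by one breadth-first search per node."""
    n = len(adj)
    dist = [[math.inf] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                # Visit each neighbor once; the first visit gives the shortest distance.
                if adj[u][v] and dist[source][v] == math.inf:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist


# Path graph 0-1-2-3 (hypothetical example).
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
D = distance_matrix(adj)
```

For the path graph, the first row of `D` is `[0, 1, 2, 3]`, matching the layer-by-layer description of the search.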
Step 3: based on the distance matrix of the network nodes, compute the network potential energy of the whole network. Each node in the network is regarded as the source of a gravitational field, which gives the potential energy between any two nodes in the network. The formula is as follows:
where R is a constant that can be set to any positive number, and D is the distance matrix obtained in Step 2.
The network potential energy of the whole network G is the sum of the potential energies between all pairs of nodes in the network. The formula is as follows:
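As an illustration only, the sketch below computes a network potential energy from a distance matrix under an assumed gravitational-style pair potential, E_ij = -R / D_ij. This functional form, the summation over unordered pairs, and the treatment of unreachable pairs as contributing zero are all assumptions of the sketch; the expression used by the method itself is not reproduced in this text.

```python
import math


def pair_potential(d, R=1.0):
    # ASSUMPTION: gravitational-style potential -R / d between two nodes at
    # hop distance d. A node has no potential with itself, and unreachable
    # pairs (infinite distance) contribute 0.
    if d == 0 or math.isinf(d):
        return 0.0
    return -R / d


def network_potential(dist, R=1.0):
    """Sum of the assumed pair potentials over all unordered node pairs."""
    n = len(dist)
    return sum(pair_potential(dist[i][j], R)
               for i in range(n) for j in range(i + 1, n))


# Distance matrix of the path graph 0-1-2 (hypothetical example).
D = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]
E = network_potential(D, R=1.0)
```

Under this assumed form, a more tightly connected network has shorter distances and therefore a lower (more negative) potential energy.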
Step 4: for each edge e_k in the network, compute the difference between the network potential energy of the sub-network G_k = G - {e_k}, obtained by deleting that edge, and the network potential energy of the original network G. The formula is as follows:
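The per-edge potential energy difference can be sketched as follows. As before, the pair potential -R / d is an assumption made for illustration, and the helper names (`bfs_distances`, `network_potential`, `potential_differences`, `adjacency_from_edges`) are hypothetical. The sketch recomputes all distances for every deleted edge, which is simple but not efficient.

```python
from collections import deque
import math


def bfs_distances(adj, source):
    """Hop distances from `source` to every node (math.inf if unreachable)."""
    dist = [math.inf] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, connected in enumerate(adj[u]):
            if connected and dist[v] == math.inf:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def network_potential(adj, R=1.0):
    # ASSUMPTION: pair potential -R / d summed over unordered pairs;
    # unreachable pairs contribute 0.
    total = 0.0
    for i in range(len(adj)):
        dist = bfs_distances(adj, i)
        for j in range(i + 1, len(adj)):
            if not math.isinf(dist[j]):
                total += -R / dist[j]
    return total


def potential_differences(adj, R=1.0):
    """For each edge e_k, potential of G - {e_k} minus potential of G."""
    base = network_potential(adj, R)
    diffs = {}
    n = len(adj)
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j]:
                adj[i][j] = adj[j][i] = 0  # temporarily delete the edge
                diffs[(i, j)] = network_potential(adj, R) - base
                adj[i][j] = adj[j][i] = 1  # restore the edge
    return diffs


def adjacency_from_edges(n, edges):
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


# Two triangles joined by the bridge 2-3 (hypothetical example).
adj = adjacency_from_edges(6, [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)])
diffs = potential_differences(adj)
best_edge = max(diffs, key=diffs.get)
```

In this toy graph the bridge between the two triangles has the largest difference, because deleting it disconnects nine node pairs at once.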
Step 5: remove the edge with the largest potential energy difference and check whether an independent sub-network has been generated. If not, return to Step 4. If an independent sub-network has been produced, check whether the sub-networks generated by the split satisfy the predefined strong or weak community structure. If they do, return to Step 1 and the method continues; if they do not, the iteration ends and the method proceeds to Step 6.
An independent sub-network is a network whose node set and edge set are both contained in the original network and whose nodes have no connection to the other nodes of the original network.
A strong community structure is a topology that satisfies the following condition: every node inside the community has more connections to nodes inside the community than to nodes outside it.
A weak community structure is a topology that satisfies the following condition: the number of connections inside the community is greater than the number of connections from the community to the outside.
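The two community tests can be sketched as below, assuming a symmetric 0/1 adjacency matrix with a zero diagonal. For the weak test, the sketch compares sums of node degrees (each internal connection is counted at both of its endpoints); this counting convention is an assumption, since the text does not say how internal connections are counted. The function names are hypothetical.

```python
def is_strong_community(adj, members):
    """Every member has more connections inside the community than outside it."""
    inside = set(members)
    for i in inside:
        k_in = sum(adj[i][j] for j in inside)
        k_out = sum(adj[i]) - k_in
        if k_in <= k_out:
            return False
    return True


def is_weak_community(adj, members):
    """Total inward degree of the members exceeds their total outward degree."""
    inside = set(members)
    # ASSUMPTION: degree-sum convention, so each internal edge counts twice.
    k_in_total = sum(adj[i][j] for i in inside for j in inside)
    k_out_total = sum(sum(adj[i]) for i in inside) - k_in_total
    return k_in_total > k_out_total


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3 (hypothetical example).
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
strong = is_strong_community(adj, [0, 1, 2])
weak = is_weak_community(adj, [0, 1, 2])
```

Every strong community is also weak under this convention, so the weak test is the looser stopping condition of the two.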
Step 6: from the result of Step 5, reconstruct the original network graph. The nodes removed by the preprocessing in Step 1 are added back to the original network graph, and each is assigned to the community of the node directly connected to it.
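A minimal sketch of this reconstruction step, assuming communities are stored as a dictionary from node index to community label and that indices refer to the original adjacency matrix. The function name `reattach_removed_nodes` and the data layout are illustrative assumptions.

```python
def reattach_removed_nodes(adj, labels, removed):
    """Give each removed node the community label of its directly connected node.

    `adj` is the ORIGINAL adjacency matrix, `labels` maps kept nodes to a
    community label, and `removed` lists the nodes dropped in preprocessing.
    """
    result = dict(labels)
    for node in removed:
        for neighbor, connected in enumerate(adj[node]):
            # A degree-1 node has a single neighbor; skip neighbors that
            # were themselves removed and therefore carry no label.
            if connected and neighbor in result:
                result[node] = result[neighbor]
                break
    return result


# Triangle 0-1-2 in community "A"; node 3 (degree 1) was removed earlier.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
labels = reattach_removed_nodes(adj, {0: "A", 1: "A", 2: "A"}, removed=[3])
```

After the call, node 3 carries the same label as node 2, its only neighbor.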
The present invention introduces the physical concept of potential energy into complex networks, defines the node potential energy and the network potential energy of a complex network, partitions the complex network by optimizing the potential energy function, and finally mines the community structure of the complex network. Because the network potential energy makes better use of the topological information of the network and reflects how tightly the topology of the whole network is connected, the precision of the method is greatly improved.
Brief description of the drawings
Fig. 1 is a flow chart of the community mining method based on network potential energy;
Fig. 2 is a schematic diagram of the adjacency matrix of a complex network;
Fig. 3 is the community structure diagram of the karate club.
To make the above content of the present invention easier to understand, a preferred embodiment is described in detail below with reference to the accompanying drawings.
Embodiment
An embodiment of the invention is described in detail below. The embodiment is implemented on the basis of the technical solution of the invention, and a detailed implementation and a concrete operating procedure are given, but the scope of protection of the invention is not limited to the following embodiment.
The embodiment uses Zachary's karate club data set, a classical data set in social network analysis. The network is processed according to the flow shown in Fig. 1, and the concrete steps are as follows:
Step 1 (S11): preprocess the adjacency matrix A of the karate club network G. Node 12, whose degree is 1, is removed from the network; that is, row 12 and column 12 are removed from the input adjacency matrix.
Step 2 (S12): for the preprocessed network A', analyze the topology of the network. Starting from each node, search all other nodes in the network with a breadth-first search to obtain the distance between every pair of nodes, and thereby build the distance matrix D.
Step 3 (S13): based on the distance matrix of the network, compute the network potential energy of the whole network. Each node in the network is regarded as the source of a gravitational field, which gives the potential energy between any two nodes in the network. The formula is as follows:
where R is set to the positive integer 1, and D is the distance matrix obtained in Step 2.
The network potential energy of the whole network is the sum of the potential energies between all pairs of nodes in the network. The formula is as follows:
From this, the network potential energy of the whole network G before the iteration can be calculated.
Step 4 (S14): for each edge e_k in the network, compute the difference between the network potential energy of the sub-network G_k = G - {e_k}, obtained by deleting that edge, and the network potential energy of the original network G. The formula is as follows:
Step 5 (S15): remove the edge with the largest potential energy difference, update the distance matrix D, and check whether an independent sub-network has been generated. If not, return to Step 4. If an independent sub-network has been produced, check whether the sub-networks generated by the split satisfy the predefined strong or weak community structure. If they do, return to Step 1 and the method continues; if they do not, the loop ends and the method proceeds to Step 6.
An independent sub-network is a network whose node set and edge set are both contained in the original network and whose nodes have no connection to the other nodes of the original network.
A strong community structure is a topology that satisfies the following condition: every node inside the community has more connections to nodes inside the community than to nodes outside it.
A weak community structure is a topology that satisfies the following condition: the number of connections inside the community is greater than the number of connections from the community to the outside.
Step 6 (S16): from the result of Step 5, reconstruct the original network graph. The nodes removed by the preprocessing in Step 1 are added back to the original network graph, and each is assigned to the community of the node directly connected to it.
The above experiment uses a typical network data set to illustrate the flow of the method and the calculation of the network potential energy. The experimental result is basically consistent with the background knowledge of the data: as shown in Fig. 3, the proportion of nodes assigned to the correct community reaches 97.05%, which verifies the effectiveness and accuracy of the method.
Although the present invention has been disclosed above by way of a preferred embodiment, the embodiment is not intended to limit the invention. Anyone skilled in the art may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the claims.

Claims (3)

1. A community structure mining method based on network potential energy, characterized in that the method comprises the following steps:
Step 1: preprocess the network adjacency matrix by deleting the rows and columns corresponding to the nodes whose degree is not greater than a given threshold, obtaining the preprocessed network adjacency matrix;
the degree of a node being the number of other nodes connected to that node;
the threshold being a parameter set for the data preprocessing;
Step 2: for the preprocessed network data, analyze the topology of the network and build a distance matrix;
Step 3: based on the distance matrix of the network nodes, compute the network potential energy of the whole network; each node in the network is regarded as the source of a gravitational field, which gives the potential energy between any two nodes in the network, the formula being as follows:
where R is a constant that can be set to any positive number, and D is the distance matrix obtained in Step 2;
the network potential energy of the whole network G is the sum of the potential energies between all pairs of nodes in the network, the formula being as follows:
Step 4: for each edge e_k in the network, compute the difference between the network potential energy of the sub-network G_k = G - {e_k}, obtained by deleting that edge, and the network potential energy of the original network G, the formula being as follows:
Step 5: remove the edge with the largest potential energy difference and check whether an independent sub-network has been generated; if not, return to Step 4; if an independent sub-network has been produced, check whether the sub-networks generated by the split satisfy the predefined strong or weak community structure; if they do, return to Step 1 and the method continues; if they do not, the iteration ends and the method proceeds to Step 6;
wherein an independent sub-network is a network whose node set and edge set are both contained in the original network and whose nodes have no connection to the other nodes of the original network;
a strong community structure is a topology that satisfies the following condition: every node inside the community has more connections to nodes inside the community than to nodes outside it;
a weak community structure is a topology that satisfies the following condition: the number of connections inside the community is greater than the number of connections from the community to the outside;
Step 6: from the result of Step 5, reconstruct the original network graph; the nodes deleted by the preprocessing in Step 1 are added back to the original network graph, and each is assigned to the community of the node directly connected to it.
2. The community structure mining method based on network potential energy according to claim 1, characterized in that the data preprocessing in Step 1 removes the nodes of degree 1 from the network, that is, it removes the rows and columns corresponding to those nodes from the input adjacency matrix.
3. The community structure mining method based on network potential energy according to claim 1, characterized in that Step 2 specifically comprises: analyzing the topology of the network; starting from each node, searching all other nodes in the network with a breadth-first search to obtain the distance between every pair of nodes; and thereby building the distance matrix.
CN201210104303.9A 2012-04-11 2012-04-11 Community structure mining method based on network potential energy Expired - Fee Related CN102684912B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210104303.9A CN102684912B (en) 2012-04-11 2012-04-11 Community structure mining method based on network potential energy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210104303.9A CN102684912B (en) 2012-04-11 2012-04-11 Community structure mining method based on network potential energy

Publications (2)

Publication Number Publication Date
CN102684912A (en) 2012-09-19
CN102684912B (en) 2014-10-15

Family

ID=46816307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210104303.9A Expired - Fee Related CN102684912B (en) 2012-04-11 2012-04-11 Community structure mining method based on network potential energy

Country Status (1)

Country Link
CN (1) CN102684912B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647590A (en) * 2019-09-23 2020-01-03 税友软件集团股份有限公司 Target community data identification method and related device
CN111666312B (en) * 2020-05-13 2023-05-23 中国科学院软件研究所 Social network community cluster mining method and system based on universal gravitation law

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101272328A (en) * 2008-02-29 2008-09-24 吉林大学 Dispersion type community network clustering method based on intelligent proxy system
CN101383748A (en) * 2008-10-24 2009-03-11 北京航空航天大学 Community division method in complex network

Also Published As

Publication number Publication date
CN102684912A (en) 2012-09-19

Similar Documents

Publication Publication Date Title
Zhang et al. Exact solution for mean first-passage time on a pseudofractal scale-free web
CN104102745B (en) Complex network community method for digging based on Local Minimum side
Zou et al. Finding top-k maximal cliques in an uncertain graph
CN102571954B (en) Complex network clustering method based on key influence of nodes
CN102810113B (en) A kind of mixed type clustering method for complex network
CN107784598A (en) A kind of network community discovery method
CN104657418B (en) A kind of complex network propagated based on degree of membership obscures corporations' method for digging
CN103020267B (en) Based on the complex network community structure method for digging of triangular cluster multi-label
CN106650486A (en) Trajectory privacy protection method in road network environment
CN103020163A (en) Node-similarity-based network community division method in network
CN103488683B (en) Microblog data management system and implementation method thereof
CN102819611B (en) Local community digging method of complicated network
CN103400299B (en) Method for detecting network overlapped communities based on overlapped point identification
CN107203619A (en) A kind of core subgraph extraction algorithm under complex network
CN102611588B (en) Method for detecting overlapped community network based on automatic phase conversion clustering
CN109978710A (en) Overlapping community division method based on K- core iteration factor and community's degree of membership
CN106981194B (en) A kind of recognition methods of highway network key road segment
CN102684912B (en) Community structure mining method based on network potential energy
CN102722530B (en) Community detection method in complex network
CN104700311B (en) A kind of neighborhood in community network follows community discovery method
CN104967114A (en) Power grid load real-time digital modeling method and system
CN104899283A (en) Frequent sub-graph mining and optimizing method for single uncertain graph
Zhou et al. Identifying technology evolution pathways by integrating citation network and text mining
CN116307325A (en) Line planning method and device for power distribution network, electronic equipment and storage medium
CN103902547A (en) Increment type dynamic cell fast finding method and system based on MDL

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20141015

Termination date: 20210411