CN108537452A - Dense overlapping community division method for large-scale complex networks

Dense overlapping community division method for large-scale complex networks

Publication number
CN108537452A
Authority
CN
China
Prior art keywords
community
circle
node
intensive
accessed
Prior art date
Legal status
Pending
Application number
CN201810331707.9A
Other languages
Chinese (zh)
Inventor
吴迪
叶国桥
吴展鹏
陈润源
Current Assignee
Sun Yat Sen University
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by National Sun Yat Sen University
Priority to CN201810331707.9A
Publication of CN108537452A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The present invention relates to a dense overlapping community division method for large-scale complex networks, comprising the following steps:
S1. Abstract the large network into an undirected graph and initialize the system parameters, including setting the initial community circle set to the empty set and setting the iteration stop condition;
S2. Traverse the nodes that have not yet been visited, set such a node as the seed of a community circle, and expand the circle starting from that node; maintain the set of expandable nodes of that node with a priority queue, which holds the neighbor nodes and is continually updated;
S3. Keep expanding from the node; when the conductance of the community circle can no longer be reduced, or the iteration stop condition has been reached, the community expansion is finished and the community circle is added to the community circle set;
S4. Determine whether all nodes have been visited; if not, jump to step S2, otherwise go to S5;
S5. Output the community circle set; the algorithm ends.

Description

Dense overlapping community division method for large-scale complex networks
Technical field
The present invention relates to the fields of parallel computing and social networks, and more particularly to a dense overlapping community division method for large-scale complex networks.
Background
In recent years, with the rise of large-scale social networks (such as Facebook, Twitter and Weibo), community mining (community detection) has gradually attracted wide attention from academia and industry. Community mining aims to extract, from an abstracted network, social circles with high cohesion and low coupling; these circles may be dense and may overlap. Facebook currently has more than 2 billion monthly active users, and so many users form an enormous social network. Community mining can bring many benefits. For example, Facebook can recommend friends to the members of circles that share the same interests; Amazon can use the network formed by its goods and users to mine circles with similar purchasing interests and recommend products to the users in those circles; and at Bank Danamon, community mining can be used to carry out risk assessment and credit evaluation of the users in a circle.
However, current community mining research still faces many challenges and shortcomings. First, regarding the computational complexity of large networks, existing research holds that mining community circles with high cohesion and low coupling from a huge social network requires algorithms of low complexity, so good heuristic algorithms need to be designed. Second, research on mining dense overlapping communities often assumes that the number of communities to be mined is known, yet this number is usually hard to estimate, is highly arbitrary, and can only be guessed from experience. Third, existing research often cuts the large graph abstracted from the network into several small graphs, which loses information contained in the original graph. Although some studies explore how to minimize the information lost when cutting the graph into several separate graphs, useful information of the original graph is always lost.
Summary of the invention
The present invention provides a dense overlapping community division method for large-scale complex networks. The method can effectively mine dense, overlapping community circles, has low algorithmic complexity, does not require the number of community circles to be estimated in advance, and does not need to cut the large graph into several small graphs, so it does not cause loss of information from the original graph.
To achieve the above object, the following technical solution is adopted:
A dense overlapping community division method for large-scale complex networks, comprising the following steps:
S1. Abstract the large network into an undirected graph and initialize the system parameters, including setting the initial community circle set to the empty set and setting the iteration stop condition;
S2. Traverse the nodes that have not yet been visited, set such a node as the seed of a community circle, and expand the circle starting from that node; maintain the set of expandable nodes of that node with a priority queue, which holds the neighbor nodes and is continually updated;
S3. Keep expanding from the node; when the conductance of the community circle can no longer be reduced, or the iteration stop condition has been reached, the community expansion is finished and the community circle is added to the community circle set;
S4. Determine whether all nodes have been visited; if not, jump to step S2, otherwise go to S5;
S5. Output the community circle set; the algorithm ends.
Compared with the prior art, the beneficial effects of the present invention are as follows:
The method provided by the present invention can effectively mine dense, overlapping community circles, has low algorithmic complexity, does not require the number of community circles to be estimated in advance, and does not need to cut the large graph into several small graphs, so it does not cause loss of information from the original graph.
Brief description of the drawings
Fig. 1 is a flow diagram of the method.
Detailed description
The drawings are for illustrative purposes only and shall not be construed as limiting this patent.
The present invention is further described below with reference to the drawings and embodiments.
Embodiment 1
First, a network is represented abstractly as G = (V, E) and consists of undirected edges, where V denotes the set of nodes in the network and E denotes the set of edges in the network. For two nodes u, v ∈ V, if an edge exists between them, this is written (u, v) ∈ E. The weight between nodes is denoted w_uv; the present invention mainly considers undirected graphs, so w_uv = 1.
The present invention uses S to denote a set of nodes and defines m_S as the weight of the edges inside S, so that:
The weight of the cut edges of S is defined as c_S, where c_S is given by:
Let N(S) be the set of neighbor nodes of S. For a neighbor node u ∈ N(S), the outward weight of u can be defined as:
and the inward weight of u as:
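The formulas for these four quantities are not present in this text. Under the usual edge-sum reading of the definitions above, they could be written as follows. This is a reconstruction, not the patent's own notation: the symbols $w_u^{\mathrm{out}}$ and $w_u^{\mathrm{in}}$ are assumed names, and each undirected edge is counted once.

```latex
\begin{align}
m_S &= \sum_{\substack{(u,v)\in E \\ u\in S,\ v\in S}} w_{uv}, &
c_S &= \sum_{\substack{(u,v)\in E \\ u\in S,\ v\notin S}} w_{uv}, \\
w_u^{\mathrm{out}} &= \sum_{\substack{(u,v)\in E \\ v\notin S}} w_{uv}, &
w_u^{\mathrm{in}} &= \sum_{\substack{(u,v)\in E \\ v\in S}} w_{uv}.
\end{align}
```

With $w_{uv} = 1$, $m_S$ and $c_S$ are simply the numbers of internal and cut edges of S, and $w_u^{\mathrm{in}} + w_u^{\mathrm{out}}$ is the degree of u.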
Existing community mining work uses many metrics, of which conductance and modularity are the most widely accepted. The method provided by the present invention uses conductance as its evaluation function. The conductance of a set S is defined as φ(S):
By the definition of conductance, for a circle to have higher tightness or cohesion, its conductance must be continually reduced.
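As an illustration, conductance can be computed directly from edge counts. The sketch below assumes the common form φ(S) = c_S / (2·m_S + c_S), which is consistent with the later proof steps but is not spelled out in this text. The function name `conductance`, the adjacency-dict representation, and the value returned for an edgeless set are all assumptions made for the example.

```python
def conductance(adj, S):
    """Return phi(S) = c_S / (2*m_S + c_S) for an unweighted, undirected graph.

    adj maps each node to the set of its neighbours; S is a set of nodes.
    The formula is an assumed, commonly used definition of conductance.
    """
    S = set(S)
    m_s = 0  # edges with both endpoints in S (counted twice below)
    c_s = 0  # edges with exactly one endpoint in S
    for u in S:
        for v in adj[u]:
            if v in S:
                m_s += 1
            else:
                c_s += 1
    m_s //= 2  # each internal edge was seen from both endpoints
    denom = 2 * m_s + c_s
    # Convention for a set with no incident edges (an assumption).
    return c_s / denom if denom else 1.0


# Two triangles joined by the single edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(conductance(adj, {0, 1, 2}))  # 3 internal edges, 1 cut edge, so 1/7
```

A single node with at least one edge has conductance 1 under this definition, so any expansion that pulls in internal edges lowers the value.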
Therefore, for a set S, the present invention takes the condition for S to be a community to be that no further neighbor node can be added that makes its conductance smaller. That is, for the set S and any set S′ that contains S, the following condition holds:
φ(S) ≤ φ(S′) and |S′| ≤ |S| + 1 iff S = S′,
The key steps of the algorithm are explained below with reference to Fig. 1:
(Step 1) Abstract the large network into an undirected graph and initialize the system parameters, including setting the initial community circle set to the empty set and setting the iteration stop condition.
(Step 2) Traverse the nodes that have not yet been visited, set such a node as the seed of a community circle, and expand the circle starting from that node. Maintain the set of expandable nodes of that node with a priority queue, which holds the neighbor nodes and is continually updated.
(Step 3) Keep expanding from the node; when the conductance of the community circle can no longer be reduced, or the iteration stop condition has been reached, the community expansion is finished and the community circle is added to the community circle set.
(Step 4) Determine whether all nodes have been visited; if not, jump to Step 2, otherwise go to Step 5.
(Step 5) Output the community circle set; the algorithm ends.
The following part presents the derivation and theoretical basis of the algorithm.
The present invention first proves two theorems, which form the theoretical basis of the method.
Theorem 1: Let S be a subset of V. If u is a node that belongs neither to S nor to N(S), then
φ(S) < φ(S ∪ {u}).
Proof: From (2-6), we obtain:
Let S′ = S ∪ {u}. Equation (2-2) shows that c_S < c_S′, while (2-1) shows that m_S = m_S′, which gives the following derivation:
Theorem 1 shows that, to make the conductance smaller, a node that is not a neighbor of the set should not be added to the community.
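The derivation itself is not present in this text. A short version consistent with the proof sketch, assuming the common form $\varphi(S) = c_S / (2m_S + c_S)$ for conductance and that u has at least one edge, could read:

```latex
\begin{align}
m_{S'} &= m_S, \qquad c_{S'} = c_S + \sum_{v} w_{uv} > c_S, \\
\varphi(S') &= \frac{c_{S'}}{2m_{S'} + c_{S'}}
             = 1 - \frac{2m_S}{2m_S + c_{S'}}
             \ge 1 - \frac{2m_S}{2m_S + c_S}
             = \varphi(S).
\end{align}
```

Under these assumptions the inequality is strict whenever $m_S > 0$; when S has no internal edges, both sides equal 1.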
Theorem 2: Let S be a subset of V and let u be a node that is not in S. Define:
Then we obtain:
Proof: First, let u ∈ V \ S and S′ = S ∪ {u}; the following derivation can then be obtained:
Assume
φ(S) ≥ φ(S′), (2-12)
Then we have:
It follows that:
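The intermediate formulas of Theorem 2 are not present in this text. Assuming $\varphi(S) = c_S / (2m_S + c_S)$ and writing $a$ and $b$ (shorthand introduced here, not taken from the patent) for the inward and outward weights of u with respect to S, one derivation that matches the surrounding steps is:

```latex
\begin{align}
m_{S'} &= m_S + a, \qquad c_{S'} = c_S - a + b, \\
\varphi(S) \ge \varphi(S')
  &\iff \frac{c_S}{2m_S + c_S} \ge \frac{c_S - a + b}{2m_S + c_S + a + b} \\
  &\iff c_S\,(a + b) \ge (b - a)\,(2m_S + c_S) \\
  &\iff a\,(m_S + c_S) \ge b\,m_S .
\end{align}
```

For $b > 0$ the last line reads $a / b \ge m_S / (m_S + c_S)$, so adding u does not increase the conductance exactly when its ratio of inward to outward weight is large enough. Whether the patent states the theorem in precisely this form cannot be confirmed from this text.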
Judging directly from (2-6) whether a set S is a community is very difficult. Therefore, based on Theorem 1 and Theorem 2, the following condition is used to judge whether a set S is a community: if the stated premise holds, then S is a community if and only if:
The central idea is to look among the neighbor nodes of the set S for a node that makes the conductance smaller; if the set S cannot be expanded any further, then S is a community.
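The neighbor test described here can be sketched without recomputing conductance from scratch. The code below assumes φ(S) = c_S / (2·m_S + c_S) and uses the cross-multiplied inequality w_in·(m_S + c_S) > w_out·m_S as the test for a strict reduction. The helper names `edge_counts` and `reduces_conductance` are hypothetical, and the test is a reconstruction rather than the patent's stated formula.

```python
def edge_counts(adj, S):
    """Return (m_S, c_S): the numbers of internal and cut edges of S."""
    inner = sum(1 for u in S for v in adj[u] if v in S) // 2
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    return inner, cut


def reduces_conductance(adj, S, u):
    """True if adding node u (not in S) strictly lowers phi(S).

    Uses the cross-multiplied test w_in * (m_S + c_S) > w_out * m_S,
    which avoids a division by zero when w_out is 0.
    Assumes S has at least one incident edge and the graph has no self-loops.
    """
    m_s, c_s = edge_counts(adj, S)
    w_in = sum(1 for v in adj[u] if v in S)   # edges from u into S
    w_out = len(adj[u]) - w_in                # edges from u to nodes outside S
    return w_in * (m_s + c_s) > w_out * m_s


# Two triangles joined by the single edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(reduces_conductance(adj, {0, 1}, 2))     # closing the triangle helps
print(reduces_conductance(adj, {0, 1, 2}, 3))  # crossing the bridge does not
```

In this example the set {0, 1, 2} has no neighbor that lowers its conductance, so under the condition above it would count as a community.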
Combining the above derivation, the present invention designs a dense overlapping community division method for large-scale complex networks; the pseudocode of the algorithm is as follows:
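The pseudocode is not present in this text. The following Python sketch shows one way Steps 1 to 5 could be realised. It assumes φ(S) = c_S / (2·m_S + c_S), an unweighted graph stored as an adjacency dict with mutually comparable node labels, and a priority key equal to the fraction of a candidate's edges that point into S. Under the assumed formula, the candidate with the largest such fraction lowers the conductance if any candidate does. The names `expand_seed`, `detect_communities` and `max_steps` are hypothetical, and treating community members as visited is an assumption.

```python
import heapq
from fractions import Fraction


def expand_seed(adj, seed, max_steps=None):
    """Grow one community circle from `seed` (Steps 2 and 3)."""
    S = {seed}
    m_s, c_s = 0, len(adj[seed])  # internal and cut edge counts of S
    w_in = {}                     # w_in[u]: edges from candidate u into S
    heap = []                     # (-w_in/deg, node, w_in); stale entries are skipped

    def touch(u):
        # One more edge of u now points into S: push an updated entry.
        w_in[u] = w_in.get(u, 0) + 1
        heapq.heappush(heap, (-Fraction(w_in[u], len(adj[u])), u, w_in[u]))

    for v in adj[seed]:
        touch(v)

    steps = 0
    while heap and (max_steps is None or steps < max_steps):
        _, u, a = heapq.heappop(heap)
        if u in S or w_in[u] != a:
            continue              # outdated queue entry
        b = len(adj[u]) - a       # edges of u that stay outside S
        if a * (m_s + c_s) <= b * m_s:
            break                 # best candidate cannot lower phi, so none can
        S.add(u)
        m_s += a
        c_s += b - a
        for v in adj[u]:
            if v not in S:
                touch(v)
        steps += 1
    return S


def detect_communities(adj, max_steps=None):
    """Steps 1, 4 and 5: seed every unvisited node and collect the circles."""
    visited, communities = set(), []       # Step 1: empty community circle set
    for seed in adj:
        if seed in visited:
            continue
        community = expand_seed(adj, seed, max_steps)
        communities.append(community)
        visited |= community               # assumption: members count as visited
    return communities


# Two triangles joined by the single edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print([sorted(c) for c in detect_communities(adj)])  # [[0, 1, 2], [3, 4, 5]]
```

Communities can overlap in this sketch because `expand_seed` is free to absorb nodes that already belong to an earlier circle. The graph is never cut into subgraphs, and the number of communities is not fixed in advance.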
Obviously, the above embodiment is merely an example given to illustrate the present invention clearly and is not a limitation on its embodiments. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (1)

1. A dense overlapping community division method for large-scale complex networks, characterized by comprising the following steps:
S1. Abstract the large network into an undirected graph and initialize the system parameters, including setting the initial community circle set to the empty set and setting the iteration stop condition;
S2. Traverse the nodes that have not yet been visited, set such a node as the seed of a community circle, and expand the circle starting from that node; maintain the set of expandable nodes of that node with a priority queue, which holds the neighbor nodes and is continually updated;
S3. Keep expanding from the node; when the conductance of the community circle can no longer be reduced, or the iteration stop condition has been reached, the community expansion is finished and the community circle is added to the community circle set;
S4. Determine whether all nodes have been visited; if not, jump to step S2, otherwise go to S5;
S5. Output the community circle set; the algorithm ends.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810331707.9A CN108537452A (en) 2018-04-13 2018-04-13 Dense overlapping community division method for large-scale complex networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810331707.9A CN108537452A (en) 2018-04-13 2018-04-13 Dense overlapping community division method for large-scale complex networks

Publications (1)

Publication Number Publication Date
CN108537452A (en) 2018-09-14

Family

ID=63480404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810331707.9A Dense overlapping community division method for large-scale complex networks 2018-04-13 2018-04-13 (status: Pending; publication: CN108537452A (en))

Country Status (1)

Country Link
CN (1) CN108537452A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080052263A1 (en) * 2006-08-24 2008-02-28 Yahoo! Inc. System and method for identifying web communities from seed sets of web pages
CN103455612A (en) * 2013-09-07 2013-12-18 西安电子科技大学 Method for detecting non-overlapping network communities and overlapping network communities based on two-stage strategy
CN103729475A (en) * 2014-01-24 2014-04-16 福州大学 Multi-label propagation discovery method of overlapping communities in social network
CN104166731A (en) * 2014-08-29 2014-11-26 河海大学常州校区 Discovering system for social network overlapped community and method thereof
CN106204290A (en) * 2015-05-05 2016-12-07 杨宁 A kind of overlapping community detection method in quotation coupling network
CN107133877A (en) * 2017-06-06 2017-09-05 安徽师范大学 The method for digging of overlapping corporations in network
CN107480213A (en) * 2017-07-27 2017-12-15 上海交通大学 Community's detection and customer relationship Forecasting Methodology based on sequential text network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JOYCE JIYOUNG WHANG等: "Overlapping Community Detection Using Neighborhood-Inflated Seed Expansion", 《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 *
YANG GAO等: "A Fast and High Quality Approach for Overlapping Community Detection through Minimizing Conductance", 《2016 IEEE FIRST INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC)》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109952742A (en) * 2018-12-04 2019-06-28 区链通网络有限公司 Graph structure processing method, system, the network equipment and storage medium
CN109952742B (en) * 2018-12-04 2022-02-22 区链通网络有限公司 Graph structure processing method, system, network device and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180914
