CN109886313A - A dynamic graph clustering method based on density peaks - Google Patents

Publication number
CN109886313A
Authority
CN
China
Prior art keywords
vertex
density
index
noise
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910080266.4A
Other languages
Chinese (zh)
Inventor
谷峪
吴长发
于戈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a dynamic graph clustering method based on density peaks. The method clusters a dynamic graph, returns clustering results in real time, and discovers cluster evolution events; the clustering results comprise the clusters, abnormal vertices, and bridge vertices of the graph. The method consists of a static graph clustering method and a dynamic graph clustering method, and runs in two stages: initialization and dynamic detection. In the initialization stage, the local density, dependent vertex, and dependence similarity of each vertex are computed; a DP-Index index is built to improve the efficiency of the algorithm; a decision graph is generated, from which the density peak vertices and noise vertices are obtained; the cluster result set, abnormal vertex set, and bridge vertex set are obtained following the density peak idea; and a dependency graph is created from the clustering result as the foundation for dynamic graph clustering. In the dynamic detection stage, the DP-Index index and the dependency graph are updated as vertices and edges are inserted or deleted, and the clustering results and cluster evolution events are obtained from the dependency graph and its changes.

Description

A dynamic graph clustering method based on density peaks
Technical Field
The invention belongs to the field of large-scale graph data processing, and in particular relates to a dynamic graph clustering method based on density peaks.
Background
Because graph models are highly expressive, many real-world applications model data and the relationships between data as a graph, in which vertices represent entities and edges represent the relationships between them. With the proliferation of network applications such as social networks, information networks, citation networks, collaboration networks, e-commerce networks, communication networks, and biological networks, how to manage and analyze graph data effectively has attracted growing attention. Graph clustering, as a fundamental problem, has been studied extensively.
Graph clustering is clustering applied to graph data. Its purpose is to partition the vertices of a network, according to the similarity between vertices, into several densely connected subgraphs, also called clusters, so that vertices within the same cluster are relatively densely connected while connections between different clusters are relatively sparse. A cluster in a real network represents a set of particular objects: a cluster in a social network represents a real community formed around shared interests or background, and a cluster in a citation network represents related papers on the same topic. Discovering the clusters in these networks helps us understand and exploit the networks more effectively, for example by capturing strongly connected online documents in a citation network, finding social groups that share a social environment in a social network, or analyzing which products people often buy together when shopping online.
Many types of graph clustering methods have been proposed, but challenges remain in the following two respects:
First, identifying bridges and abnormal vertices.
Although detecting clusters is important, detecting the bridges and abnormal vertices in a graph is also important. Each vertex in a graph plays a different role. Some people are friendly with many groups but belong to none of them (politicians, for example); such vertices are called bridges. Others interact with few people (recluses, for example); such vertices are called abnormal vertices. Identifying bridges and abnormal vertices is essential for mining different complex networks. In addition, many current graph clustering algorithms require input parameters, require parameter tuning, or have excessively high time complexity.
Second, processing dynamic graphs in real time and tracking cluster evolution events.
A large number of algorithms propose automatic clustering methods based on their own definitions of a cluster. Most current methods, however, start from a very strict assumption: that the real world can be modeled as a static network. Many real-world networks are in fact continually updated. Over time, vertices and edges join and leave the network, which changes the graph topology, and the clustering result changes with it. Graph clustering algorithms should therefore be adapted to dynamic graphs, but most current methods target static networks and fall short when handling dynamic ones. Furthermore, traditional graph clustering algorithms aim to identify hidden cluster structure, whereas a dynamic graph clustering method should track the local topology of the dynamic graph and its abrupt changes, which few methods can currently do. These local topology changes and abrupt changes are called cluster evolution events. Tracking cluster evolution events has important practical significance. For example, during a general election, a social network such as Facebook could run dynamic graph clustering in the background so that candidates learn in a timely manner the number and size of the clusters supporting each candidate and the detailed movement of members between them, and can then take timely measures, based on the real-time changes in the clusters, to secure their own political support.
In general, current graph clustering algorithms do not solve these two problems well, so new methods and techniques are urgently needed to meet the challenges of dynamic graph clustering.
Summary of the Invention
To overcome the deficiencies of the prior art, the present invention provides a dynamic graph clustering algorithm based on density peaks.
The method of the present invention comprises the following steps:
Step 1: Compute the structural similarity of each pair of adjacent vertices in the graph, using the structural similarity formula:
$$\sigma(u, v) = \frac{|N[u] \cap N[v]|}{\sqrt{deg[u] \cdot deg[v]}}$$
where N[u] denotes the structure neighborhood of vertex u; for a graph G(V, E), N[u] = {v ∈ V | (u, v) ∈ E} ∪ {u}. N[v] denotes the structure neighborhood of vertex v. deg[u] denotes the degree of vertex u, defined as the number of structure neighbors of u, and deg[v] denotes the degree of vertex v.
The structural similarities of all adjacent vertex pairs are then sorted in descending order, and the value at the 20% position of this descending order is taken as the similarity threshold σt. Let m denote the number of edges in the graph and let k denote the rank of the similarity threshold in the descending order; k is then the rank that corresponds to the 20% position among the m similarity values.
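Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation. It assumes the similarity takes the cosine form |N[u] ∩ N[v]| / sqrt(deg[u] · deg[v]) suggested by the definitions of N[·] and deg[·], and it assumes the rank k is obtained by rounding 0.2 · m up, a rounding rule the text does not state. The names `structural_similarity`, `similarity_threshold`, and the toy graph are hypothetical.

```python
import math

def structural_similarity(adj, u, v):
    """Similarity of adjacent vertices u and v; adj maps vertex -> set of neighbours."""
    nu = adj[u] | {u}          # structure neighborhood N[u] includes u itself
    nv = adj[v] | {v}
    # Assumed cosine form; deg[.] is the size of the structure neighborhood.
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))

def similarity_threshold(adj, fraction=0.2):
    """Value at the `fraction` position of the descending similarity list."""
    sims = sorted(
        (structural_similarity(adj, u, v) for u in adj for v in adj[u] if u < v),
        reverse=True,
    )
    k = max(1, math.ceil(fraction * len(sims)))   # assumed rounding rule
    return sims[k - 1]

# Hypothetical toy graph: triangle a-b-c with a pendant vertex d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(structural_similarity(adj, "a", "c"))   # 3 / sqrt(3 * 4)
print(similarity_threshold(adj))
```

Each undirected edge is visited once through the `u < v` filter, so the sorted list has exactly m entries.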
Step 2: Compute, in turn, three measures for each vertex in the graph: local density, dependent vertex, and dependence similarity.
Step 2-1: The local density formula is the foundation of the whole algorithm and the key factor in whether clustering can be realized on a graph. As a structure-based graph clustering algorithm, this patent first considers the local structure of a vertex and defines the local density of any vertex in terms of the structural similarities between that vertex and all of its structure neighbors. A continuous function μuv is designed, and the standard normal distribution is used as the weight of μuv in the local density formula. The value range of μuv is set to 0 < μuv ≤ 2, so that structural similarities that fall outside this range are excluded. The local density of each vertex is then calculated from the structural similarities with the local density formula.
Step 2-2: Among the neighbors of vertex u, the vertex whose local density is greater than that of u and whose structural similarity with u is highest is called the dependent vertex of u. The structural similarity between u and its dependent vertex is called the dependence similarity, denoted δu; that is, δu is the largest structural similarity between u and any neighbor v ∈ N(u) whose local density exceeds that of u.
Here N(u) denotes the open neighborhood of vertex u; for a graph G(V, E), N(u) = {v ∈ V | (u, v) ∈ E}. If no neighbor of u has a local density greater than that of u, then δu = 0 and the dependent vertex of u is set to its default value. If a vertex u has two or more candidate dependent vertices, the algorithm selects one of them at random as the dependent vertex of u.
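The selection rule of step 2-2 can be sketched as follows, taking the local densities and structural similarities as already computed inputs. The function name `dependent_vertex`, the use of `None` for a vertex without a dependent vertex, and the toy values are assumptions made for illustration. Ties are broken here by `max` (first candidate found), a deterministic stand-in for the random choice described in the text.

```python
def dependent_vertex(u, adj, rho, sim):
    """Return (dependent vertex, dependence similarity) of u.

    adj: vertex -> set of neighbours (open neighborhood N(u))
    rho: vertex -> local density
    sim: frozenset({u, v}) -> structural similarity of the edge (u, v)
    """
    candidates = [v for v in adj[u] if rho[v] > rho[u]]
    if not candidates:
        return None, 0.0       # no denser neighbour: delta_u = 0 (None is a placeholder)
    best = max(candidates, key=lambda v: sim[frozenset((u, v))])
    return best, sim[frozenset((u, best))]

# Hypothetical toy data: a is adjacent to b and c, both denser than a.
adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
rho = {"a": 1.0, "b": 2.0, "c": 3.0}
sim = {frozenset(("a", "b")): 0.9, frozenset(("a", "c")): 0.7}

print(dependent_vertex("a", adj, rho, sim))   # b wins: higher similarity than c
print(dependent_vertex("c", adj, rho, sim))   # no denser neighbour
```

Note that the densest neighbour does not win automatically: among the denser neighbours, the one most similar to u is chosen.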
Step 3: Build the DP-Index index for the whole graph from the three measures of each vertex. The DP-Index index contains every vertex in the graph together with its local density, dependent vertex, and dependence similarity; the vertices in the index are sorted in descending order of local density. With the DP-Index index, the time complexity of the static graph clustering algorithm in this patent is O(n), where n is the number of vertices.
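One possible in-memory layout for such an index is a list of per-vertex records kept in descending order of local density. The sketch below is an assumption about the layout, not the structure defined in the patent's tables; `IndexEntry`, `build_dp_index`, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Hashable, Optional

@dataclass
class IndexEntry:
    vertex: Hashable
    rho: float                   # local density
    dep: Optional[Hashable]      # dependent vertex (None is this sketch's placeholder)
    delta: float                 # dependence similarity

def build_dp_index(rho, dep, delta):
    """Collect the three measures per vertex and sort by descending local density."""
    entries = [IndexEntry(u, rho[u], dep.get(u), delta.get(u, 0.0)) for u in rho]
    entries.sort(key=lambda e: e.rho, reverse=True)
    return entries

index = build_dp_index(
    rho={"a": 1.2, "b": 2.4, "c": 1.9},
    dep={"a": "c", "c": "b"},
    delta={"a": 0.8, "c": 0.7},
)
print([e.vertex for e in index])
```

Because the entries are already sorted, the later cluster assignment pass can read each vertex once in index order.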
Step 4: Using the local density ρ and the dependence similarity δ defined in steps 2-1 and 2-2, generate the decision graph designed for graphs, with ρ as the abscissa and δ as the ordinate. Then, according to the decision graph, select the vertices whose local density is greater than or equal to ξ and whose dependence similarity is less than γ into the density peak vertex set, and select the vertices whose local density is less than ξ into the noise vertex set.
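The two selection rules of step 4 translate directly into set comprehensions. In this sketch the function name `classify` and the sample densities are hypothetical; the thresholds ξ = 1.5 and γ = 0.4 are the values used later in the embodiment.

```python
def classify(rho, delta, xi, gamma):
    """Split vertices into density peak vertices and noise vertices."""
    peaks = {u for u in rho if rho[u] >= xi and delta[u] < gamma}
    noise = {u for u in rho if rho[u] < xi}
    return peaks, noise

# Hypothetical measures for three vertices.
rho = {"p": 2.0, "q": 1.8, "r": 1.0}
delta = {"p": 0.0, "q": 0.9, "r": 0.5}
peaks, noise = classify(rho, delta, xi=1.5, gamma=0.4)
print(peaks, noise)
```

Vertex q is dense enough but depends strongly on a denser neighbour, so it is neither a peak nor noise and will be assigned to a cluster in the next step.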
Step 5: First assign a cluster to each vertex in the density peak vertex set. Then traverse the vertices that belong to neither the density peak vertex set nor the noise vertex set in descending order of local density, and assign each vertex to the cluster of the neighbor that has a greater local density than it and the highest structural similarity with it. This yields the cluster result set.
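A minimal sketch of this assignment pass is given below. It assumes the dependent vertex of every remaining vertex is already known, which is the neighbor described in the step; `assign_clusters` and the toy data are hypothetical names and values.

```python
def assign_clusters(rho, dep, peaks, noise):
    """Label every non-noise vertex with the peak of the cluster it joins."""
    label = {p: p for p in peaks}                 # each peak starts its own cluster
    for u in sorted(rho, key=rho.get, reverse=True):
        if u in peaks or u in noise:
            continue
        # dep[u] is denser than u, so it was labelled earlier in this pass.
        label[u] = label[dep[u]]
    clusters = {}
    for u, p in label.items():
        clusters.setdefault(p, set()).add(u)
    return clusters

rho = {"a": 3.0, "b": 2.5, "c": 2.0, "d": 3.2, "e": 2.1, "n": 0.5}
dep = {"b": "a", "c": "b", "e": "d"}
clusters = assign_clusters(rho, dep, peaks={"a", "d"}, noise={"n"})
print(clusters)
```

The descending traversal is what makes a single pass sufficient: a vertex is never visited before the denser vertex it depends on.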
Step 6: Further divide the vertices in the noise vertex set: if the neighbors of a vertex u in the noise vertex set belong to different clusters, u is placed in the bridge vertex set; otherwise u is placed in the abnormal vertex set.
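The division of the noise vertex set can be sketched by counting how many distinct clusters the neighbors of each noise vertex touch. The names `split_noise`, `adj`, and `label`, and the toy graph, are hypothetical.

```python
def split_noise(noise, adj, label):
    """Divide noise vertices into bridge vertices and abnormal vertices.

    label maps each clustered vertex to its cluster id; noise vertices
    are absent from label.
    """
    bridges, abnormal = set(), set()
    for u in noise:
        touched = {label[v] for v in adj[u] if v in label}
        if len(touched) >= 2:      # neighbours lie in different clusters
            bridges.add(u)
        else:
            abnormal.add(u)
    return bridges, abnormal

# Hypothetical graph: x touches clusters C1 and C2, y touches only C1.
adj = {"x": {"a", "d"}, "y": {"a"}, "a": {"x", "y"}, "d": {"x"}}
label = {"a": "C1", "d": "C2"}
bridges, abnormal = split_noise({"x", "y"}, adj, label)
print(bridges, abnormal)
```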
Step 7: Obtain the dependency graph from the DP-Index index, the density peak vertex set, and the noise vertex set. First initialize a dependency graph G'(V', E') whose vertex set V' and edge set E' are empty. Then, for each vertex u in the original graph G(V, E): if u belongs to the density peak vertex set or the noise vertex set, add only the vertex u to G'; otherwise add u together with the edge between u and its dependent vertex to G'. Each connected component of the dependency graph then corresponds to a cluster, and each isolated vertex is a noise vertex.
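The construction in step 7, and the reading of clusters as connected components, can be sketched with plain adjacency sets and a depth-first search. The function names and toy data are hypothetical, and the dependent vertices are taken as given.

```python
def build_dependency_graph(vertices, dep, peaks, noise):
    """Peaks and noise vertices get no edge; others link to their dependent vertex."""
    g = {u: set() for u in vertices}
    for u in vertices:
        if u in peaks or u in noise:
            continue
        g[u].add(dep[u])
        g[dep[u]].add(u)
    return g

def connected_components(g):
    """Iterative depth-first search over an undirected adjacency-set graph."""
    seen, comps = set(), []
    for start in g:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in g[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps

vertices = ["a", "b", "c", "d", "e", "n"]
dep = {"b": "a", "c": "b", "e": "d"}
g = build_dependency_graph(vertices, dep, peaks={"a", "d"}, noise={"n"})
comps = connected_components(g)
print(comps)
```

Each non-peak, non-noise vertex contributes exactly one edge, so the dependency graph is a forest rooted at the density peak vertices.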
Step 8: In the dynamic detection stage, four kinds of change to the dynamic graph are considered: adding or deleting an edge, and adding or deleting a vertex. The DP-Index index is updated in real time for each of these four kinds of change.
Adding an edge: when edge (u, v) is added, the degrees of vertex u and vertex v each increase by 1, and u and v are then updated further. For vertex u, recompute the structural similarity between u and its neighbors and update the local density of u and its neighbors; then update the dependent vertex and dependence similarity of u, of its neighbors, and of its neighbors' neighbors; finally update the DP-Index index with the changed measures. Vertex v is updated in the same way as vertex u.
Deleting an edge: when edge (u, v) is deleted, the operation is similar to adding an edge. The degrees of vertex u and vertex v each decrease by 1, and u and v are then updated further. For vertex u, recompute the structural similarity between u and its neighbors and update the local density of u and its neighbors; then update the dependent vertex and dependence similarity of u, of its neighbors, and of its neighbors' neighbors; finally update the DP-Index index with the changed measures. Vertex v is updated in the same way as vertex u.
Adding a vertex: when vertex u is added, initialize the local density, dependent vertex, and dependence similarity of u and add them to the DP-Index index.
Deleting a vertex: when vertex u is deleted, perform the edge deletion operation on the edge (u, v) between u and each of its neighbors v, and then delete u from the DP-Index index.
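The scope of an edge update can be made concrete by computing which vertices need their measures refreshed. The sketch below only collects the affected vertex sets described above; it does not recompute any measure. The function name `affected_by_edge` and the path graph are hypothetical.

```python
def affected_by_edge(adj, u, v):
    """Vertices to refresh after edge (u, v) changes.

    adj is the adjacency after an insertion (or before a deletion), so that
    u and v still see each other as neighbours.
    Returns (density_set, dependence_set).
    """
    density = set()
    for x in (u, v):
        density |= {x} | adj[x]          # the endpoint and its neighbours
    dependence = set(density)
    for x in density:
        dependence |= adj[x]             # plus the neighbours' neighbours
    return density, dependence

# Hypothetical path graph a-b-c-d-e-f, viewed just after edge (b, c) was added.
adj = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
    "d": {"c", "e"}, "e": {"d", "f"}, "f": {"e"},
}
density, dependence = affected_by_edge(adj, "b", "c")
print(sorted(density), sorted(dependence))
```

The update stays local: vertex f, three hops from the changed edge, is untouched, which is what makes the incremental maintenance cheaper than rebuilding the index.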
Step 9: Update the dependency graph according to the changes in the DP-Index index. The update of the dependency graph falls into the following 5 cases:
A non-noise vertex becomes a noise vertex: when a non-noise vertex u becomes a noise vertex, if the dependency graph contains the edge between u and its dependent vertex, delete that edge.
A noise vertex becomes a non-noise vertex: when a noise vertex u becomes a non-noise vertex, if u is not a density peak vertex after the change, add the edge between u and its dependent vertex to the dependency graph.
A density peak vertex becomes a non-density-peak vertex: when a density peak vertex u becomes a non-density-peak vertex, if u is not a noise vertex after the change, add the edge between u and its dependent vertex to the dependency graph.
A non-density-peak vertex becomes a density peak vertex: when a non-density-peak vertex u that is not a noise vertex becomes a density peak vertex, delete the edge between u and its former dependent vertex from the dependency graph.
The dependent vertex of a vertex changes: if vertex u is neither a noise vertex nor a density peak vertex and its dependent vertex changes, delete the edge between u and its old dependent vertex from the dependency graph and add the edge between u and its new dependent vertex.
Finally, the real-time clustering result is obtained from the connected components of the dependency graph, and cluster evolution events are obtained by monitoring changes in those connected components.
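The five cases share one invariant: a vertex has an edge to its dependent vertex exactly when it is neither a noise vertex nor a density peak vertex. Under that reading, a single update routine covers all of them. This is a sketch of that reading rather than the patent's own procedure; the role names "peak", "noise", and "member" and the function `update_dependency_edge` are hypothetical.

```python
def update_dependency_edge(g, u, old_role, old_dep, new_role, new_dep):
    """Re-link vertex u in dependency graph g after its role or dependent vertex changes.

    Roles: "peak", "noise", or "member" (neither peak nor noise).
    Only a "member" vertex owns an edge to its dependent vertex.
    """
    if old_role == "member" and old_dep is not None:
        g[u].discard(old_dep)            # drop the outdated edge, if present
        g[old_dep].discard(u)
    if new_role == "member" and new_dep is not None:
        g[u].add(new_dep)                # attach u to its current dependent vertex
        g[new_dep].add(u)

# Hypothetical dependency graph: b depends on a, c is isolated.
g = {"a": {"b"}, "b": {"a"}, "c": set()}

# Case 1: non-noise vertex b becomes noise, so its edge is removed.
update_dependency_edge(g, "b", "member", "a", "noise", None)
after_case1 = {k: set(v) for k, v in g.items()}

# Case 2: b becomes non-noise again, now depending on c.
update_dependency_edge(g, "b", "noise", None, "member", "c")
print(after_case1, g)
```

Recomputing the connected components of `g` after each call, and comparing them with the previous components, is one way to surface events such as a cluster splitting or two clusters merging.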
The beneficial effects of the invention are as follows:
1) A structure-based graph clustering algorithm is designed on the density peak idea. The algorithm requires no parameter tuning, achieves high accuracy, identifies clusters, bridge vertices, and abnormal vertices, and is supported by both theoretical and experimental guarantees.
2) Three new measures are defined for each vertex: local density, dependent vertex, and dependence similarity. From these measures a decision graph designed specifically for graphs is obtained, through which either a user-specified number of clusters can be obtained or the number of clusters can be identified automatically.
3) To speed up the algorithm, the invention designs the DP-Index index structure. With this index structure, the static structural graph clustering algorithm can efficiently compute reasonable clustering results online, and its time complexity depends only on the number of vertices.
4) To make the algorithm applicable to dynamic graphs, the invention designs a data structure called the dependency graph and, based on it, designs algorithms for the dynamic changes of the graph that obtain clustering results and cluster evolution events in real time.
Brief Description of the Drawings
Fig. 1 is an example undirected graph of a specific embodiment of the invention;
Fig. 2 is an example decision graph of a specific embodiment of the invention;
Fig. 3 is an example dependency graph of a specific embodiment of the invention;
Fig. 4 is the example undirected graph of a specific embodiment of the invention after edge (v5, v6) is added;
Fig. 5 is the example decision graph of a specific embodiment of the invention after edge (v5, v6) is added;
Fig. 6 is the example dependency graph of a specific embodiment of the invention after edge (v5, v6) is added;
Fig. 7 shows the cluster evolution events of a specific embodiment of the invention;
Fig. 8 is the algorithm flowchart of a specific embodiment of the invention;
In the figures, v1-v11 are the vertices of the graph.
Detailed Description of the Embodiments
The invention is further described below with reference to the drawings.
The algorithm mainly comprises two parts: a static structural graph clustering algorithm based on density peaks and a dynamic graph clustering algorithm based on density peaks. The main idea of the static structural graph clustering algorithm is as follows. Based on the structural similarity and structural dependence between each vertex and its neighbors, first define the local density, dependent vertex, and dependence similarity of each vertex and generate the DP-Index index. Then generate the decision graph and use it to find the density peak vertices and noise vertices in the graph. Obtain the cluster result set of the graph from the density peak vertices, and divide the noise vertex set into the abnormal vertex set and the bridge vertex set. The dynamic graph clustering algorithm based on density peaks first builds, in the initialization stage, the dependency graph and the DP-Index index from the result of the static density peak graph clustering. It then incrementally updates the DP-Index index and the dependency graph as the graph changes, and obtains the clustering result and cluster evolution events of the dynamic graph in real time from the dependency graph and its changes.
The technical solution adopted by the invention is as follows. In the initialization stage, first compute the structural similarity between every pair of adjacent vertices, and then compute the three measures of each vertex: local density, dependent vertex, and dependence similarity. Build the DP-Index index over these three measures and sort the vertices in the index in descending order of local density; this index effectively improves the efficiency of the algorithm. From the DP-Index index, generate the decision graph designed specifically for graphs; vertices with large local density and relatively small dependence similarity are selected as density peak vertices, and vertices with small local density are noise vertices. Traverse the remaining vertices, that is, those that are neither density peak vertices nor noise vertices, in the descending order of the DP-Index index, and place each vertex in the same cluster as its dependent vertex, which yields the cluster result set of the graph. Divide the noise vertex set into the abnormal vertex set and the bridge vertex set. Then create the dependency graph from the clustering result of the graph; it stores the clustering result and lays the foundation for dynamic updates. In the dynamic detection stage, update the DP-Index index and the dependency graph as vertices and edges are inserted or deleted, and obtain the clustering result and cluster evolution events of the graph from the dependency graph and its changes.
The invention is divided into two stages: an initialization stage and a dynamic detection stage. The initialization stage has three tasks: clustering the graph with the density peak static structural graph clustering algorithm, constructing the DP-Index index structure, and creating the dependency graph. The dynamic detection stage has two tasks: updating the DP-Index index and the dependency graph, and obtaining the clustering result and cluster evolution events. The core algorithms are the static structural graph clustering algorithm based on density peaks and the dynamic graph clustering algorithm, which updates the clustering result incrementally.
An example of the invention is illustrated with the undirected graph in Fig. 1, a small graph dataset designed to test the algorithm.
The specific implementation steps of the method are as follows:
Fig. 8 shows the flowchart of the algorithm of the invention.
Step 1: Compute the structural similarity of each pair of adjacent vertices in the graph, using the structural similarity formula:
$$\sigma(u, v) = \frac{|N[u] \cap N[v]|}{\sqrt{deg[u] \cdot deg[v]}}$$
where N[u] denotes the structure neighborhood of vertex u; for a graph G(V, E), N[u] = {v ∈ V | (u, v) ∈ E} ∪ {u}. N[v] denotes the structure neighborhood of vertex v. deg[u] denotes the degree of vertex u, which in this method is the number of structure neighbors of u, and deg[v] denotes the degree of vertex v. For the graph in Fig. 1, the structural similarity of vertices v1 and v2 is computed with this formula, and the structural similarity of every other pair of adjacent vertices is computed in the same way.
The structural similarities are then sorted in descending order, and the value at the 20% position of this descending order is taken as the similarity threshold σt, which is a default parameter of the algorithm. Let m denote the number of edges in the graph and let k denote the rank of the similarity threshold in the descending order; k is then the rank that corresponds to the 20% position among the m similarity values. For the undirected graph in Fig. 1, σt = 0.866.
Step 2: Compute, in turn, three measures for each vertex in the graph: local density, dependent vertex, and dependence similarity.
Step 2-1: The local density formula is the foundation of the whole algorithm and the key factor in whether clustering can be realized on a graph. The local density of a vertex designed in this patent comprises the structural similarities between the vertex and all of its structure neighbors. To distinguish the local densities of the vertices more reasonably, the patent designs a continuous function μuv and uses the standard normal distribution as the weight of μuv in the local density formula. In addition, the value range of μuv is set to 0 < μuv ≤ 2 to exclude the influence of excessively small similarities on the local density. The local density of each vertex is then calculated from the structural similarities with the local density formula.
Step 2-2: Among the neighbors of vertex u, the vertex whose local density is greater than that of u and whose structural similarity with u is highest is called the dependent vertex of u. The structural similarity between u and its dependent vertex is called the dependence similarity, denoted δu; that is, δu is the largest structural similarity between u and any neighbor v ∈ N(u) whose local density exceeds that of u.
Here N(u) denotes the open neighborhood of vertex u; for a graph G(V, E), N(u) = {v ∈ V | (u, v) ∈ E}. If no neighbor of u has a local density greater than that of u, then δu = 0 and the dependent vertex of u is set to its default value. If a vertex u has two or more candidate dependent vertices, the algorithm selects one of them at random as the dependent vertex of u.
Step 3: Build the DP-Index index for the whole graph from the three measures of each vertex. The DP-Index index contains every vertex in the graph together with its local density, dependent vertex, and dependence similarity; the vertices in the index are sorted in descending order of local density. The structure of the DP-Index index is shown in Table 1. With the DP-Index index, the time complexity of the static graph clustering algorithm in this patent is O(n), where n is the number of vertices. That is, the time complexity depends only on the number of vertices, which greatly improves the efficiency of the algorithm.
Table 1. DP-Index index structure
Step 4: Using the local density ρ and the dependence similarity δ defined in steps 2-1 and 2-2, generate the decision graph designed in this patent specifically for graphs, with ρ as the abscissa and δ as the ordinate. The decision graph corresponding to Fig. 1 is shown in Fig. 2. Let ξ be 1.5 and γ be 0.4. Then, according to the decision graph, select the vertices whose local density is greater than or equal to ξ and whose dependence similarity is less than γ into the density peak vertex set, and select the vertices whose local density is less than ξ into the noise vertex set. The density peak vertex set is then {v6, v10}, and the noise vertex set is {v5, v7, v11}.
Step 5: First assign a cluster to each vertex in the density peak vertex set. The cluster result set of Fig. 1 therefore contains 2 clusters, and the current cluster result set is C = {{v6}, {v10}}. Then traverse the vertices that belong to neither the density peak vertex set nor the noise vertex set in descending order of local density, and assign each vertex to the cluster of the neighbor that has a greater local density than it and the highest structural similarity with it (that is, the cluster of its dependent vertex). The final cluster result set is C = {{v1, v2, v3, v4, v6}, {v8, v9, v10}}.
Step 6: Further divide the vertices in the noise vertex set: if the neighbors of a vertex u in the noise vertex set belong to two or more different clusters, u is placed in the bridge vertex set; otherwise u is placed in the abnormal vertex set. Because the neighbors of v7 belong to two different clusters, v7 is a bridge vertex. Because v5 and v11 each have only one neighbor, v5 and v11 are abnormal vertices.
Step 7: Obtain the dependency graph from the DP-Index index, the density peak vertex set, and the noise vertex set. First initialize a dependency graph G'(V', E') whose vertex set V' and edge set E' are empty. Then, for each vertex u in the original graph G(V, E): if u belongs to the density peak vertex set or the noise vertex set, add only the vertex u to G'; otherwise add u together with the edge between u and its dependent vertex to G'. The dependency graph corresponding to Fig. 1 is shown in Fig. 3. Each connected component of the dependency graph in Fig. 3 corresponds to a cluster: the connected component containing v1, v2, v3, v4, and v6 is one cluster of the clustering result, and the connected component containing v8, v9, and v10 is the other. Each isolated vertex, such as v5, v7, and v11, is a noise vertex. The dependency graph designed in this patent serves as the underlying structure for dynamic graph clustering; it is the bridge that allows the static structural graph clustering algorithm based on density peaks to obtain clustering results in real time on a dynamic graph and to track cluster evolution events.
Step 8: In the dynamic detection stage, this patent considers four kinds of change to the dynamic graph: adding or deleting an edge, and adding or deleting a vertex. The DP-Index index is updated in real time for each of these four kinds of change.
Adding an edge: when edge (u, v) is added, the degrees of vertex u and vertex v each increase by 1, and u and v are then updated further. For vertex u, recompute the structural similarity between u and its neighbors and update the local density of u and its neighbors; then update the dependent vertex and dependence similarity of u, of its neighbors, and of its neighbors' neighbors; finally update the DP-Index index with the changed measures. Vertex v is updated in the same way as vertex u. For example, after edge (v5, v6) is added, the graph of Fig. 1 becomes the graph shown in Fig. 4. For vertex v5, the change first affects the structural similarity between v5 and its neighbors v2 and v6, and therefore the local density of v5, v2, and v6, so these local densities are updated. Because a change in local density may affect the dependent vertex and dependence similarity, the dependent vertex and dependence similarity of v5, v2, and v6 must be updated. The change in the local density of v2 and v6 may in turn affect the dependent vertex and dependence similarity of their neighbors v1, v3, v4, and v7, so these are updated as well. Finally the DP-Index index is updated and re-sorted in descending order of local density. Vertex v6 is processed in the same way as v5. Vertices already updated need not be updated again; only the local density of v1, v3, v4, and v7 and the dependence similarity and dependent vertex of v8 need to be updated. The final updated DP-Index index is shown in Table 2.
Deleting an edge: when edge (u, v) is deleted, the operation is similar to adding an edge. The degrees of vertex u and vertex v each decrease by 1, and u and v are then updated further. For vertex u, recompute the structural similarity between u and its neighbors and update the local density of u and its neighbors; then update the dependent vertex and dependence similarity of u, of its neighbors, and of its neighbors' neighbors; finally update the DP-Index index with the changed measures. Vertex v is updated in the same way as vertex u. For example, after edge (v5, v6) is deleted from Fig. 4, the graph becomes the graph shown in Fig. 1. For vertex v5, the change first affects the structural similarity between v5 and its neighbors v2 and v6, and therefore the local density of v5, v2, and v6, so these local densities are updated. Because a change in local density may affect the dependent vertex and dependence similarity, the dependent vertex and dependence similarity of v5, v2, and v6 must be updated. The change in the local density of v2 and v6 may in turn affect the dependent vertex and dependence similarity of their neighbors v1, v3, v4, and v7, so these are updated as well. Finally the DP-Index index is updated and re-sorted in descending order of local density. Vertex v6 is updated in the same way as v5. Vertices already updated need not be updated again; only the local density of v1, v3, v4, and v7 and the dependence similarity and dependent vertex of v8 need to be updated. The final updated DP-Index index is shown in Table 1.
Table 2. DP-Index index structure
Increase vertex: when increasing vertex u, initializing the local density of vertex u, rely on vertex and relies on similarity, and It is added in DP-Index index.Such as increase vertex v in Fig. 112, because of v12Structure neighborhood there was only own.It is then initial Change vertex u local density beI.e. 0.6872.Vertex will be relied on to be set asIt relies on similarity and is set as 0, then more New DP-Index index.DP-Index index after final updated is as shown in table 3.
Table 3. DP-Index index structure
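One way to picture the DP-Index is as one record per vertex, listed in descending order of local density. The sketch below is an assumption about a possible in-memory layout, not the patent's actual data structure; the class names `Entry` and `DPIndex`, the method names, and the dependent similarity given for v5 are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    vertex: str
    rho: float                 # local density
    dep: Optional[str] = None  # dependent vertex (None when there is none)
    delta: float = 0.0         # dependent similarity


class DPIndex:
    """Hypothetical in-memory DP-Index: one entry per vertex."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def upsert(self, entry: Entry) -> None:
        """Insert a new vertex or overwrite its three measures."""
        self._entries[entry.vertex] = entry

    def remove(self, vertex: str) -> None:
        """Drop a vertex from the index (no error if it is absent)."""
        self._entries.pop(vertex, None)

    def rows(self) -> List[Entry]:
        """Return the entries in descending order of local density."""
        return sorted(self._entries.values(), key=lambda e: e.rho, reverse=True)


index = DPIndex()
index.upsert(Entry("v8", rho=2.225))                        # no dependent vertex
index.upsert(Entry("v5", rho=1.639, dep="v2", delta=0.5))   # delta is illustrative
# A newly added isolated vertex: empty dependent vertex, dependent similarity 0.
index.upsert(Entry("v12", rho=0.6872))
order = [e.vertex for e in index.rows()]
```

Sorting on every read keeps the sketch short; an implementation that cares about update cost would more likely keep the entries in an ordered structure.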
Deleting a vertex: when vertex u is deleted, the edge-deletion operation is executed for each edge (u, v) between u and a neighbor vertex v, and u is then removed from the DP-Index index. For example, to delete vertex v5 from Fig. 1, edge (v5, v2) is deleted first, following the edge-deletion steps described above. Vertex v5 is then removed from the index. The DP-Index index after the final update is shown in Table 4.
Table 4. DP-Index index structure
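Vertex deletion reduces to a sequence of edge deletions followed by one index removal. The sketch below assumes an adjacency-set graph and a plain dictionary standing in for the DP-Index; `delete_vertex` and the `on_edge_deleted` hook are hypothetical names, and the hook marks the place where the edge-deletion update of the measures would run.

```python
from typing import Callable, Dict, List, Set, Tuple


def delete_vertex(
    adj: Dict[str, Set[str]],
    index: Dict[str, object],
    u: str,
    on_edge_deleted: Callable[[str, str], None],
) -> None:
    """Delete vertex u by deleting each incident edge, then drop u from the index."""
    for v in sorted(adj[u]):      # sorted copy: safe to mutate adj[u] in the loop
        adj[u].discard(v)
        adj[v].discard(u)
        on_edge_deleted(u, v)     # hook for the edge-deletion update
    del adj[u]
    index.pop(u, None)


# Illustrative data only: v5 has the single neighbor v2.
adj = {"v2": {"v1", "v5"}, "v1": {"v2"}, "v5": {"v2"}}
index = {"v1": None, "v2": None, "v5": None}  # placeholder entries
deleted_edges: List[Tuple[str, str]] = []
delete_vertex(adj, index, "v5", lambda a, b: deleted_edges.append((a, b)))
```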
Step 9: the dependency graph is updated according to the changes in the DP-Index index. For each vertex whose local density, dependent similarity, or dependent vertex has changed, the algorithm first determines from the decision graph how the type of the vertex has changed, and then updates the dependency graph according to the applicable case.
Updates to the dependency graph fall into the following 5 cases:
A non-noise vertex becomes a noise vertex: when non-noise vertex u becomes a noise vertex, if the dependency graph contains the edge between u and its dependent vertex, that edge is deleted. For example, deleting edge (v4, v6) from Fig. 1 changes the local density of v4 from 1.695 to 1.078. The screening rule of the decision graph is the same as in Fig. 2, with ξ = 1.5 and γ = 0.4. According to the decision graph, the local density of v4 is now less than ξ, so v4 changes from a non-noise vertex to a noise vertex, and edge (v4, v3) is deleted from the dependency graph, where v3 is the former dependent vertex of v4.
A noise vertex becomes a non-noise vertex: when noise vertex u becomes a non-noise vertex, if u is not a density peak vertex after the change, the edge between u and its dependent vertex is added to the dependency graph. For example, adding edge (v5, v6) to Fig. 1 changes the local density of v5 from 1.079 to 1.639. The decision graph after the change is shown in Fig. 5, with ξ = 1.5 and γ = 0.4. According to the decision graph, the local density of v5 is greater than ξ and its dependent similarity is greater than γ, so v5 changes from a noise vertex to a non-noise vertex and is not a density peak vertex. Edge (v5, v2) is therefore added to the dependency graph, where v2 is the dependent vertex of v5.
A density peak vertex becomes a non-density-peak vertex: when density peak vertex u becomes a non-density-peak vertex, if u is not a noise vertex after the change, the edge between u and its dependent vertex is added to the dependency graph. For example, adding edge (v10, v6) to Fig. 1 gives v6 a local density of 2.794, a dependent similarity of 0.845, and the dependent vertex v3. The screening rule of the decision graph is the same as in Fig. 2, with ξ = 1.5 and γ = 0.4. According to the decision graph, the local density of v6 is greater than ξ and its dependent similarity is greater than γ, so v6 changes from a density peak vertex to a non-density-peak vertex and is not a noise vertex. Edge (v6, v3) is therefore added to the dependency graph.
A non-density-peak vertex becomes a density peak vertex: if non-density-peak vertex u is not a noise vertex, then when u becomes a density peak vertex, the edge between u and its former dependent vertex is deleted from the dependency graph. For example, deleting edge (v10, v11) from Fig. 1 changes the local density of the non-noise vertex v8 to 2.225 and its dependent similarity to 0, and leaves v8 with no dependent vertex. The screening rule of the decision graph is the same as in Fig. 2, with ξ = 1.5 and γ = 0.4. According to the decision graph, the local density of v8 is greater than ξ and its dependent similarity is less than γ, so v8 changes from a non-density-peak vertex to a density peak vertex. Edge (v8, v10) is therefore deleted from the dependency graph, where v10 is the former dependent vertex of v8.
The dependent vertex of a vertex changes: if vertex u is neither a noise vertex nor a density peak vertex, and the dependent vertex of u changes from one vertex to another, the edge between u and its former dependent vertex is deleted from the dependency graph and the edge between u and its new dependent vertex is added. For example, after edge (v5, v6) is added to Fig. 1, the updated DP-Index index is as shown in Table 2, in which the dependent vertex of v2 changes from v3 to v6. The screening rule of the decision graph is the same as in Fig. 2, with ξ = 1.5 and γ = 0.4. According to the decision graph, vertex v2 is neither a noise vertex nor a density peak vertex, so edge (v2, v3) is deleted from the dependency graph and edge (v2, v6) is added.
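The five cases can be read as one rule: a vertex owns an edge to its dependent vertex exactly when it is neither a noise vertex nor a density peak vertex. The sketch below restates the cases under that reading, using the thresholds ξ = 1.5 and γ = 0.4 from the examples. The names `Measures`, `vertex_type`, and `update_dependency_graph` are hypothetical, and any measure not given in the text (for instance the earlier dependent similarities) is an illustrative value.

```python
from typing import NamedTuple, Optional, Set, Tuple

Edge = Tuple[str, str]  # (vertex, its dependent vertex)


class Measures(NamedTuple):
    rho: float           # local density
    delta: float         # dependent similarity
    dep: Optional[str]   # dependent vertex, None when there is none


def vertex_type(m: Measures, xi: float, gamma: float) -> str:
    """Classify a vertex with the decision-graph screening rule."""
    if m.rho < xi:
        return "noise"
    if m.delta < gamma:
        return "peak"
    return "ordinary"


def dependency_edge(u: str, m: Measures, xi: float, gamma: float) -> Optional[Edge]:
    """Edge owned by u in the dependency graph, or None for noise and peak vertices."""
    if vertex_type(m, xi, gamma) == "ordinary" and m.dep is not None:
        return (u, m.dep)
    return None


def update_dependency_graph(edges: Set[Edge], u: str, old: Measures,
                            new: Measures, xi: float = 1.5, gamma: float = 0.4) -> None:
    """Apply whichever of the five cases matches the change of vertex u."""
    old_edge = dependency_edge(u, old, xi, gamma)
    new_edge = dependency_edge(u, new, xi, gamma)
    if old_edge == new_edge:
        return
    if old_edge is not None:
        edges.discard(old_edge)
    if new_edge is not None:
        edges.add(new_edge)


# The examples in the text are separate scenarios; they are replayed on one
# edge set here purely for illustration.
edges: Set[Edge] = {("v4", "v3"), ("v8", "v10"), ("v2", "v3")}
update_dependency_graph(edges, "v4", Measures(1.695, 0.6, "v3"), Measures(1.078, 0.6, "v3"))
update_dependency_graph(edges, "v5", Measures(1.079, 0.5, "v2"), Measures(1.639, 0.5, "v2"))
update_dependency_graph(edges, "v8", Measures(2.0, 0.7, "v10"), Measures(2.225, 0.0, None))
update_dependency_graph(edges, "v2", Measures(2.0, 0.8, "v3"), Measures(2.0, 0.8, "v6"))
```

Comparing the old and new owned edge means the caller does not have to work out which of the five cases applies; a vertex whose type and dependent vertex are unchanged leaves the dependency graph as it was.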
The real-time clustering result is obtained from the connected components of the dependency graph, and cluster evolution events are obtained by monitoring how those connected components change. The advantage is that cluster evolution events are available in real time, as the change happens, without first computing a clustering result and then comparing it with the previous one. Cluster evolution events are shown in Fig. 7 and include the birth and death of a cluster, the expansion and contraction of a cluster, and the merging and splitting of clusters.
For example, when edge (v5, v6) is added to Fig. 1, the dependency graph changes from Fig. 3 to Fig. 6. During this change, edge (v5, v2) is added, so the connected component on the left gains vertex v5. Because the connected component containing v5 in Fig. 3 consists of a single vertex, the cluster evolution event that occurs is a cluster expansion. The clustering result consists of the vertices contained in each connected component: the clustering result set is C = {{v1, v2, v3, v4, v5, v6}, {v8, v9, v10}}. Of the remaining isolated vertices, v7 is a bridge vertex because its neighbor vertices belong to two different clusters, and v11 is an abnormal vertex because it has only one neighbor. While the dependency graph changes from Fig. 3 to Fig. 6, edge (v2, v3) is also deleted and edge (v2, v6) is added, but the vertices in the resulting connected component do not change, so the cluster does not change and no cluster evolution event occurs.
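Reading clusters and evolution events off the dependency graph can be sketched as follows. The sketch assumes that a connected component with a single vertex does not count as a cluster, which matches the v5 example above; the function names and the dependency edges are hypothetical, since Fig. 3 and Fig. 6 are not reproduced here, and only edge insertion is classified.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def components(vertices: Iterable[str], edges: Iterable[Edge]) -> List[Set[str]]:
    """Connected components of an undirected graph, found by depth-first search."""
    adj: Dict[str, Set[str]] = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen: Set[str] = set()
    result: List[Set[str]] = []
    for start in adj:
        if start in seen:
            continue
        comp: Set[str] = set()
        stack = [start]
        while stack:
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            stack.extend(adj[x] - comp)
        seen |= comp
        result.append(comp)
    return result


def clusters(vertices: Iterable[str], edges: Iterable[Edge]) -> List[Set[str]]:
    """Components with more than one vertex (assumption: singletons are not clusters)."""
    return [c for c in components(vertices, edges) if len(c) > 1]


def event_on_added_edge(vertices: List[str], edges: Set[Edge], new_edge: Edge) -> str:
    """Name the evolution event caused by adding new_edge to the dependency graph."""
    comp_of: Dict[str, Set[str]] = {}
    for comp in components(vertices, edges):
        for x in comp:
            comp_of[x] = comp
    ca, cb = comp_of[new_edge[0]], comp_of[new_edge[1]]
    if ca is cb:
        return "none"  # same component already: no cluster changes
    clustered_sides = (len(ca) > 1) + (len(cb) > 1)
    return {0: "birth", 1: "expansion", 2: "merge"}[clustered_sides]


vertices = [f"v{i}" for i in range(1, 12)]
# Hypothetical dependency edges chosen to reproduce the clusters named in the text.
before: Set[Edge] = {("v1", "v2"), ("v2", "v3"), ("v4", "v3"), ("v3", "v6"),
                     ("v8", "v10"), ("v9", "v10")}
event = event_on_added_edge(vertices, before, ("v5", "v2"))
after = before | {("v5", "v2")}
result = clusters(vertices, after)
```

Edge removal can be classified the same way by comparing the component of each endpoint after the removal: a component that falls apart into two clusters is a split, one that loses a single vertex is a contraction, and one that leaves only singletons is a death.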
With the above steps, the task of clustering a dynamic graph based on the density peak idea is fully realized. The static structural graph clustering algorithm provided by the invention achieves higher clustering accuracy than existing structure-based graph clustering algorithms and is several times more efficient. The algorithm also clusters dynamic graphs, returning graph clustering results in real time and obtaining cluster evolution events.

Claims (1)

1. A dynamic graph clustering method based on density peaks, characterized by comprising the following steps:
Step 1: structural similarity is calculated for each pair of adjacent vertices in the graph, using the following structural similarity formula:
where N[u] denotes the structure neighborhood of vertex u, i.e., for a graph G(V, E), N[u] = {v ∈ V | (u, v) ∈ E} ∪ {u}; N[v] denotes the structure neighborhood of vertex v; deg[u] denotes the degree of vertex u, defined as the number of vertices in the structure neighborhood of u; and deg[v] denotes the degree of vertex v;
The structural similarities of all adjacent vertex pairs are then sorted in descending order, and the value at the 20% position of the descending order is taken as the similarity threshold σt. Let m denote the number of edges in the graph, and let the index k denote the rank of the similarity threshold in the descending order of structural similarities; then k should satisfy:
Step 2: three measures are calculated in turn for each vertex in the graph: local density, dependent vertex, and dependent similarity;
Step 2-1: the local density formula is the foundation of the entire algorithm and the key to performing clustering on a graph. As a structure-based graph clustering algorithm, this patent first considers the local structure of a vertex, and defines the local density of any vertex in terms of the structural similarity between that vertex and every vertex in its structure neighborhood. A continuous function is designed, and the standard normal distribution is used as the weight of μuv in the local density formula; the value range of μuv is set to 0 < μuv ≤ 2 so as to exclude structural similarities that fall outside this range. Based on structural similarity, the local density is calculated as:
Step 2-2: among the neighbor vertices of vertex u, the vertex whose local density is greater than that of u and whose structural similarity with u is highest is called the dependent vertex of u; the structural similarity between u and its dependent vertex is called the dependent similarity, denoted δu, and is calculated as:
where N(u) denotes the open neighborhood of vertex u, i.e., for a graph G(V, E), N(u) = {v ∈ V | (u, v) ∈ E}. If no neighbor vertex of u has a local density greater than that of u, then δu = 0 and the dependent vertex of u is empty. If a vertex u has two or more candidate dependent vertices, the algorithm selects one of them at random as the dependent vertex of u;
Step 3: a DP-Index index is built for the entire graph from the three measures of each vertex. The DP-Index index contains every vertex in the graph together with its local density, dependent vertex, and dependent similarity, and the vertices in the index are sorted in descending order of local density. Based on the DP-Index index, the time complexity of the static graph clustering algorithm in this patent is O(n), where n is the number of vertices;
Step 4: according to the local density ρ and the dependent similarity δ defined in Steps 2-1 and 2-2, a decision graph for graph G is generated with ρ as the abscissa and δ as the ordinate. According to the decision graph, vertices whose local density is greater than or equal to ξ and whose dependent similarity is less than γ are placed in the density peak vertex set, and vertices whose local density is less than ξ are placed in the noise vertex set;
Step 5: a cluster is first assigned to each vertex in the density peak vertex set. The vertices that belong to neither the density peak vertex set nor the noise vertex set are then traversed in descending order of local density, and each such vertex is assigned to the cluster of the neighbor vertex that has a greater local density than it and the highest structural similarity with it, which finally yields the clustering result set;
Step 6: the vertices in the noise vertex set are further divided: if the neighbors of a vertex u in the noise vertex set belong to different clusters, u is placed in the bridge vertex set; otherwise it is placed in the abnormal vertex set;
Step 7: the dependency graph is obtained from the DP-Index index, the density peak vertex set, and the noise vertex set. An empty dependency graph G'(V', E') is first initialized, with the vertex set V' and the edge set E' both empty. Then, for each vertex u in the original graph G(V, E), if u belongs to the density peak vertex set or the noise vertex set, only vertex u is added to G'; otherwise, vertex u and the edge between u and its dependent vertex are added to G'. At this point, each connected component of the dependency graph corresponds to a cluster, and each isolated vertex is a noise vertex;
Step 8: in the dynamic detection stage, four kinds of change to the dynamic graph are considered: adding or deleting an edge, and adding or deleting a vertex. The DP-Index index is updated in real time according to each of these four kinds of change;
Adding an edge: when edge (u, v) is added, the degrees of vertex u and vertex v are each increased by 1, and both vertices are then updated further. For vertex u, the structural similarity between u and each of its neighbor vertices is recalculated and the local densities of u and its neighbors are updated; the dependent vertex and dependent similarity are then updated for u, for its neighbors, and for the neighbors of those neighbors; finally, DP-Index is updated according to the changed measures. Vertex v is updated in the same way as vertex u;
Deleting an edge: when edge (u, v) is deleted, the procedure mirrors edge insertion. The degrees of vertex u and vertex v are each decreased by 1, and both vertices are then updated further. For vertex u, the structural similarity between u and each of its neighbor vertices is recalculated and the local densities of u and its neighbors are updated; the dependent vertex and dependent similarity are then updated for u, for its neighbors, and for the neighbors of those neighbors; finally, DP-Index is updated according to the changed measures. Vertex v is updated in the same way as vertex u;
Adding a vertex: when vertex u is added, its local density, dependent vertex, and dependent similarity are initialized, and u is added to the DP-Index index;
Deleting a vertex: when vertex u is deleted, the edge-deletion operation is executed for each edge (u, v) between u and a neighbor vertex v, and u is then removed from the DP-Index index;
Step 9: the dependency graph is updated according to the changes in the DP-Index index. Updates to the dependency graph fall into the following 5 cases:
A non-noise vertex becomes a noise vertex: when non-noise vertex u becomes a noise vertex, if the dependency graph contains the edge between u and its dependent vertex, that edge is deleted;
A noise vertex becomes a non-noise vertex: when noise vertex u becomes a non-noise vertex, if u is not a density peak vertex after the change, the edge between u and its dependent vertex is added to the dependency graph;
A density peak vertex becomes a non-density-peak vertex: when density peak vertex u becomes a non-density-peak vertex, if u is not a noise vertex after the change, the edge between u and its dependent vertex is added to the dependency graph;
A non-density-peak vertex becomes a density peak vertex: if non-density-peak vertex u is not a noise vertex, then when u becomes a density peak vertex, the edge between u and its former dependent vertex is deleted from the dependency graph;
The dependent vertex of a vertex changes: if vertex u is neither a noise vertex nor a density peak vertex, and the dependent vertex of u changes from one vertex to another, the edge between u and its former dependent vertex is deleted from the dependency graph and the edge between u and its new dependent vertex is added;
Finally, the real-time clustering result is obtained from the connected components of the dependency graph, and cluster evolution events are obtained by monitoring how the connected components of the dependency graph change.
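For illustration only, Steps 4 to 6 can be sketched as follows once a DP-Index is available. This is a minimal sketch under stated assumptions, not the claimed implementation: the function `static_clustering`, the row layout `(rho, delta, dependent vertex)`, and all numeric values are hypothetical. It relies on the fact that the neighbor with greater local density and highest structural similarity named in Step 5 is the dependent vertex of Step 2-2.

```python
from typing import Dict, Optional, Set, Tuple

Row = Tuple[float, float, Optional[str]]  # (rho, delta, dependent vertex)


def static_clustering(index: Dict[str, Row], adj: Dict[str, Set[str]],
                      xi: float, gamma: float):
    """Return (label, bridges, abnormal) from a precomputed DP-Index."""
    # Step 4: screen density peak vertices and noise vertices.
    peaks = {u for u, (rho, delta, _) in index.items() if rho >= xi and delta < gamma}
    noise = {u for u, (rho, _, _) in index.items() if rho < xi}
    # Step 5: each peak starts a cluster, named after the peak itself.
    label: Dict[str, str] = {p: p for p in peaks}
    rest = sorted((u for u in index if u not in peaks and u not in noise),
                  key=lambda u: index[u][0], reverse=True)
    for u in rest:
        # The dependent vertex has a strictly greater local density, so it
        # was labelled earlier in this descending traversal.
        label[u] = label[index[u][2]]
    # Step 6: a noise vertex touching more than one cluster is a bridge.
    bridges: Set[str] = set()
    abnormal: Set[str] = set()
    for u in noise:
        touched = {label[v] for v in adj[u] if v in label}
        (bridges if len(touched) > 1 else abnormal).add(u)
    return label, bridges, abnormal


# Illustrative graph and measures only.
adj = {"v1": {"v2", "v7"}, "v2": {"v1", "v3"}, "v3": {"v2"}, "v7": {"v1", "v8"},
       "v8": {"v7", "v10"}, "v10": {"v8", "v11"}, "v11": {"v10"}}
index: Dict[str, Row] = {
    "v3": (2.8, 0.0, None), "v2": (2.2, 0.8, "v3"), "v1": (1.8, 0.7, "v2"),
    "v10": (2.5, 0.0, None), "v8": (2.0, 0.6, "v10"),
    "v7": (1.0, 0.5, "v8"), "v11": (0.9, 0.3, "v10"),
}
label, bridges, abnormal = static_clustering(index, adj, xi=1.5, gamma=0.4)
```

Because the non-peak, non-noise vertices are visited in descending order of local density, each vertex is labelled with a single dictionary lookup; the sort could be skipped if the index is already kept in that order, as Step 3 describes.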
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910080266.4A CN109886313A (en) 2019-01-28 2019-01-28 A kind of Dynamic Graph clustering method based on density peak

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910080266.4A CN109886313A (en) 2019-01-28 2019-01-28 A kind of Dynamic Graph clustering method based on density peak

Publications (1)

Publication Number Publication Date
CN109886313A true CN109886313A (en) 2019-06-14

Family

ID=66927098

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910080266.4A Pending CN109886313A (en) 2019-01-28 2019-01-28 A kind of Dynamic Graph clustering method based on density peak

Country Status (1)

Country Link
CN (1) CN109886313A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552843A (en) * 2020-04-23 2020-08-18 中国电子科技集团公司第五十四研究所 Fault prediction method based on weighted causal dependency graph
CN111552843B (en) * 2020-04-23 2023-03-31 中国电子科技集团公司第五十四研究所 Fault prediction method based on weighted causal dependency graph


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190614
