CN103440263B - Method for evolutionary analysis on anonymous graph data - Google Patents

Publication number
CN103440263B
Authority
CN
China
Prior art keywords
node
snapshot
core
collection
weighted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310331668.XA
Other languages
Chinese (zh)
Other versions
CN103440263A (en
Inventor
丁旋
刘云浩
孙家广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201310331668.XA priority Critical patent/CN103440263B/en
Publication of CN103440263A publication Critical patent/CN103440263A/en
Application granted granted Critical
Publication of CN103440263B publication Critical patent/CN103440263B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for evolutionary analysis on anonymous graph data, and belongs to the field of graph data mining and analysis. The method comprises the following steps of mining the anonymous graph data to obtain the first k core nodes of each snapshot; for each snapshot, obtaining an induced subgraph taking the core nodes of the snapshot as vertexes; converting each induced subgraph into a corresponding weighted complete graph; establishing a mapping from each weighted complete graph to the next weighted complete graph; associating all the mappings to obtain a core penetration; iteratively extending a penetration towards the periphery in each snapshot of the anonymous graph data on the basis of a current penetration set; stopping extension under a condition of convergence to obtain a full penetration. The method provided by the invention can be used for the evolutionary analysis on a plurality of snapshots of the same graph data source which is published by adopting an anonymization technology at different time points, and the problem that the anonymous graph data cannot be evolutionarily analyzed in the prior art is solved.

Description

A method for performing evolutionary analysis on anonymous graph data
Technical field
The present invention relates to the technical field of graph data mining and analysis, and more particularly to a method for performing evolutionary analysis on anonymous graph data.
Background art
Graphs are among the most commonly used abstract data structures in computer science. They are structurally and semantically more complex than linear lists and trees and have more general expressive power. Many real-world application scenarios need to be represented with graph structures, and graph-related processing and applications are almost ubiquitous. Take the social networks we all live in as an example: they are typical data whose basic structure is a graph. Social networks embody the social ties between friends, and the complexity of these ties keeps increasing with the technological advances of human history, including transportation technology that eases long-distance travel, global communication technology, and digital communication and interaction technology.
In recent years, research on large-scale graph data has grown explosively, mainly because people can obtain more and more large-scale graph data, especially social network data. The rapid development of Internet technology and mass storage technology, together with the gradual widening of data sharing, has made the automatic collection and publication of data increasingly convenient. Privacy leakage during data publication has also become increasingly prominent, so privacy protection is particularly important. The object of privacy protection in data publishing is mainly the correspondence between a user's sensitive data and the individual's identity. Publishing data after merely deleting identifiers generally cannot prevent privacy leakage, because an attacker can still obtain individuals' private data through linking attacks. Anonymization technology can effectively solve the privacy leakage caused by linking attacks. Since Samarati et al. first proposed the concept of anonymization in 1998, experts and scholars at home and abroad have carried out extensive and in-depth research on anonymization technology to find effective ways of preventing or reducing privacy leakage, and have achieved a series of related results. Because anonymization technology can prevent users' sensitive data from being leaked in a data publishing environment while preserving the authenticity of the published data, it has received wide attention in practical applications.
However, current research on the anonymization of graph data focuses only on static data that is published once; research on the anonymization of dynamically updated, continuously published graph data is very insufficient. A major resulting problem is that data analysts can only perform static analysis on anonymized graph data and cannot perform evolutionary analysis on multiple snapshots, taken at different time points, of the same graph data source published using anonymization technology. Evolutionary analysis of graph data involves graph theory, probability theory, and biomathematics. It mainly studies how the topology of graph data affects the evolution of a population, and it is an important topic in the field of graph data mining and analysis. Because existing anonymization techniques destroy the identification information in the graph data, they sever the links between snapshots, so that evolutionary analysis cannot be carried out.
Summary of the invention
(1) Technical problem to be solved
In view of the deficiencies of the prior art, the technical problem to be solved by the present invention is how to provide a method for performing evolutionary analysis on anonymous graph data, so that evolutionary analysis can be performed on multiple snapshots, taken at different time points, of the same graph data source published using anonymization technology.
(2) Technical solution
To solve the above technical problem, the present invention provides a method for performing evolutionary analysis on anonymous graph data, comprising the following steps:
A. Establish mappings between the core nodes of the snapshots of the anonymous graph data and associate all the mappings to obtain the core penetration set of the anonymous graph data;
B. Using a node matching algorithm, match the other nodes of each snapshot that lie outside the core penetration set, associate the nodes that can be matched, and extend the core penetration set to a full penetration set;
Here, a penetration is the chain formed by stringing together, in the time order of the snapshots, all nodes in the snapshots of the anonymous graph data that refer to the same object; a core penetration is a penetration composed of core nodes of the snapshots of the anonymous graph data.
Specifically, step A comprises the following steps:
A1. Mine the anonymous graph data to obtain k core nodes of each snapshot, where k is an integer greater than 1;
A2. For each snapshot, build the induced subgraph whose vertices are the core nodes of that snapshot;
A3. Convert each induced subgraph into a corresponding weighted complete graph;
A4. In the time order of the snapshots, establish in turn a one-to-one mapping from the nodes of each weighted complete graph to the nodes of the next weighted complete graph, where the nodes of the last weighted complete graph are mapped to the nodes of the first weighted complete graph;
A5. Associate all the mappings to obtain the core penetration set.
Further, step A1 is specifically:
For each snapshot, sort the nodes by core level in descending order; the first k nodes of the sorted result are the core nodes of that snapshot.
Preferably, the core level is measured by the degree of a node: the larger the degree of a node, the higher its core level.
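Step A1 with the degree-based core level can be sketched as follows. This is a minimal illustration, assuming each snapshot is stored as a dictionary from node label to neighbor set; the function name `core_nodes` and the tie-breaking by label are assumptions, since the text does not say how nodes of equal degree are ordered.

```python
def core_nodes(adj, k):
    """Return the k highest-degree nodes of one snapshot (step A1).

    adj maps each node to the set of its neighbors (undirected graph).
    Ties are broken by node label so the result is deterministic; this
    tie-breaking rule is an assumption, the text does not specify one.
    """
    ranked = sorted(adj, key=lambda v: (-len(adj[v]), str(v)))
    return ranked[:k]


# Hypothetical four-node snapshot used only for illustration.
snapshot = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(core_nodes(snapshot, 2))
```

Any other centrality measure could be swapped in by changing the sort key, which matches the remark that degree is only the preferred measure.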
Further, step A3 is specifically:
Construct a weighted complete graph whose vertices are the vertices of the induced subgraph, so that every two vertices are joined by an edge; preferably, the weight of an edge is:
$$w(u,v)=\frac{|N(u)\cap N(v)|}{|N(u)|\,|N(v)|}$$
where u and v denote the two endpoints of the edge, N(u) and N(v) denote the sets of neighbors of vertex u and vertex v in the induced subgraph, |N(u)| and |N(v)| denote the numbers of nodes in N(u) and N(v), and |N(u) ∩ N(v)| denotes the number of common neighbors of u and v in the induced subgraph.
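The conversion in step A3 can be sketched as below. The weight follows the formula as it appears, flattened, in claim 1, read here as the number of common neighbors divided by the product of the two neighbor counts; that reading is an assumption. Returning 0.0 when a vertex has no neighbors in the induced subgraph is also an assumption, since the text does not cover that case, and the function names are hypothetical.

```python
from itertools import combinations


def edge_weight(sub, u, v):
    """Weight of edge (u, v) from neighbor sets in the induced subgraph."""
    nu, nv = sub[u], sub[v]
    if not nu or not nv:
        return 0.0  # assumption: isolated vertices get weight 0
    return len(nu & nv) / (len(nu) * len(nv))


def weighted_complete_graph(sub):
    """Map every unordered vertex pair to its weight (step A3)."""
    return {
        frozenset((u, v)): edge_weight(sub, u, v)
        for u, v in combinations(sorted(sub), 2)
    }


# Hypothetical induced subgraph: a triangle a-b-c plus an isolated vertex d.
sub = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": set()}
graph = weighted_complete_graph(sub)
print(graph[frozenset(("a", "b"))])  # one common neighbor, degrees 2 and 2
```

Keying the weights by `frozenset` keeps the graph undirected without storing each edge twice.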
Further, the mapping in step A4 is established as follows:
Solve for the optimal mapping, i.e., the one that minimizes, after all nodes of the two weighted complete graphs are mapped, the sum of the differences in weight between corresponding edges.
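For small k the optimal mapping can be found by trying every one-to-one assignment. The sketch below uses exhaustive search, which is a stand-in chosen for clarity and not the solver named in the embodiment, and it scores a mapping by the sum of squared weight differences as in the formula of claim 1. The function names and the example weights are hypothetical.

```python
from itertools import combinations, permutations


def mapping_cost(w_src, w_dst, mapping):
    """Sum of squared weight differences over corresponding edges."""
    cost = 0.0
    for u, v in combinations(sorted(mapping), 2):
        a = w_src[frozenset((u, v))]
        b = w_dst[frozenset((mapping[u], mapping[v]))]
        cost += (a - b) ** 2
    return cost


def best_mapping(w_src, w_dst, src_nodes, dst_nodes):
    """Try all k! bijections and keep the cheapest (feasible only for small k)."""
    best, best_cost = None, float("inf")
    for perm in permutations(dst_nodes):
        candidate = dict(zip(src_nodes, perm))
        cost = mapping_cost(w_src, w_dst, candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


# Two hypothetical weighted complete graphs on three vertices each.
w_src = {frozenset("ab"): 0.5, frozenset("ac"): 0.1, frozenset("bc"): 0.2}
w_dst = {frozenset("xy"): 0.2, frozenset("xz"): 0.5, frozenset("yz"): 0.1}
mapping, cost = best_mapping(w_src, w_dst, "abc", "xyz")
print(mapping, cost)
```

The search space grows as k!, so this only illustrates the objective; a heuristic is needed once k is more than a handful of nodes.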
Further, associating all the mappings in step A5 specifically comprises:
A51. Construct an empty core penetration set;
A52. Take any node v in the first weighted complete graph and map it step by step, using the mappings established in the above steps, to a node in each subsequent weighted complete graph until its image in the last weighted complete graph is obtained; then, using the mapping from the last weighted complete graph to the first, obtain its image v' in the first weighted complete graph. If v' = v, add the chain formed by v and its images in the weighted complete graphs, as one core penetration, to the core penetration set constructed in step A51;
A53. Update the node v and repeat step A52 until all k nodes of the weighted complete graph have been processed, which yields the core penetration set of the anonymous graph data.
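Steps A51 to A53 amount to keeping only the chains that return to their starting node after one full cycle of mappings. A minimal sketch, assuming each mapping is a dictionary; the function name is hypothetical. The example uses m_2 and m_3 as listed in Embodiment 2, with m_1 taken to be consistent with the penetrations listed there.

```python
def core_penetrations(mappings, first_nodes):
    """Follow each node of the first graph through m_1 ... m_n (steps A51-A53).

    mappings[i] maps nodes of graph i to nodes of graph i + 1, and the
    last mapping leads back to the first graph.
    """
    result = []                                # A51: start with an empty set
    for v in first_nodes:                      # A53: process every node
        chain = [v]
        for m in mappings[:-1]:                # A52: v_2 = m_1(v_1), ...
            chain.append(m[chain[-1]])
        if mappings[-1][chain[-1]] == v:       # keep only closed chains
            result.append(tuple(chain))
    return result


m1 = {"a": "2", "b": "5", "d": "3"}
m2 = {"2": "F", "3": "B", "5": "A"}
m3 = {"F": "a", "B": "d", "A": "b"}
print(core_penetrations([m1, m2, m3], ["a", "b", "d"]))
```

A chain that does not close is simply dropped, so the resulting set may contain fewer than k penetrations.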
Further, step B specifically comprises:
B1. Take the core penetration set obtained in step A as the current penetration set;
B2. If the convergence condition is reached, stop extending and obtain the full penetration set; otherwise perform step B3;
B3. On the basis of the current penetration set, extend one penetration toward the periphery in the snapshots of the anonymous graph data, add this penetration to the current penetration set, and perform step B2.
Further, step B3 specifically comprises:
Take any node u among the nodes of the first snapshot that do not belong to the current penetration set, and obtain step by step, in the time order of the snapshots, its matching node in each next snapshot until its matching node in the last snapshot is obtained, where none of these matching nodes belongs to the current penetration set. Then match u's matching node in the last snapshot against the nodes of the first snapshot to obtain a node u'. If u' = u, add the chain formed by u and its matching nodes in the snapshots, as one penetration, to the current penetration set.
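The loop formed by steps B1 to B3 can be sketched as follows. The matcher is passed in as a function, so the sketch shows only the control flow; `match(i, u, current)` is a hypothetical callback that returns the match of node u of snapshot i in snapshot (i + 1) mod n, or None, and is expected to return only nodes outside the current penetration set. Snapshots are indexed from 0 here. The lookup table in the example reproduces the extra penetrations listed in Embodiment 2 and adds a hypothetical node "g" that has no match.

```python
def extend_penetrations(core, first_nodes, n, match):
    """Grow the current penetration set until no node of the first snapshot helps."""
    current = list(core)                        # B1
    progress = True
    while progress:                             # B2: stop when a pass adds nothing
        progress = False
        covered = {chain[0] for chain in current}
        for u in first_nodes:
            if u in covered:
                continue
            chain = [u]
            for i in range(n):                  # snapshot i -> snapshot (i + 1) mod n
                nxt = match(i, chain[-1], current)
                if nxt is None:
                    break
                chain.append(nxt)
            if len(chain) == n + 1 and chain[-1] == u:
                current.append(tuple(chain[:-1]))   # B3: add one penetration
                progress = True
                break                           # back to B2 with the larger set
    return current


core = [("a", "2", "F"), ("b", "5", "A"), ("d", "3", "B")]
table = [
    {"c": "1", "e": "6", "f": "4"},
    {"1": "D", "6": "C", "4": "E"},
    {"D": "c", "C": "e", "E": "f"},
]
full = extend_penetrations(
    core, ["a", "b", "c", "d", "e", "f", "g"], 3,
    lambda i, u, current: table[i].get(u),
)
print(full)
```

Restarting the scan after each added penetration lets a node that could not be matched earlier be retried once the current set has grown.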
Further, the matching node is obtained by performing the following steps:
B31. Let u_i denote the node to be matched in the i-th snapshot, where 1 ≤ i ≤ n. In the i-th snapshot, find the set of all neighbors of u_i that belong to the current penetration set, and record the set that this neighbor set is mapped to in the next snapshot under the current penetration set;
Here f_i denotes the mapping, given by the current penetration set, from nodes of the i-th snapshot to nodes of the [(i+1) mod n]-th snapshot;
B32. Take any node of the [(i+1) mod n]-th snapshot that does not belong to the current penetration set as a candidate, and compute its similarity to u_i,
where the similarity is computed from the number of all neighbors of the candidate node, the number of all neighbors of u_i, and the number of nodes in the intersection of the mapped neighbor set from step B31 with the neighbor set of the candidate node;
B33. Update the candidate node and repeat step B32 until the similarity to u_i has been obtained for every node of the [(i+1) mod n]-th snapshot that does not belong to the current penetration set; all the similarities form the set scores;
B34. Compute the discrimination σ between the most similar node and the second most similar node,
where max(scores) denotes the maximum value in scores, max2(scores) denotes the second largest value in scores, and δ(scores) denotes the standard deviation of scores;
B35. If σ is greater than a set threshold, the node corresponding to max(scores) is taken as the best matching node of u_i; otherwise, the node u_i of the i-th snapshot is considered to have no matching node in the [(i+1) mod n]-th snapshot.
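Steps B34 and B35 can be illustrated with the sketch below. The expression for σ is not reproduced in this text, so the code assumes σ = (max(scores) - max2(scores)) / δ(scores), which uses exactly the three quantities defined above; the use of the population standard deviation, the handling of fewer than two candidates or zero spread, and the function name are likewise assumptions.

```python
import statistics


def best_match(scores, threshold):
    """Pick the top-scoring candidate only if it clearly stands out.

    scores maps each candidate node to its similarity with u_i.
    Returns the matched node, or None when there is no confident match.
    """
    if len(scores) < 2:
        return None                      # assumption: cannot judge discrimination
    ranked = sorted(scores.values(), reverse=True)
    spread = statistics.pstdev(list(scores.values()))
    if spread == 0:
        return None                      # assumption: all candidates look alike
    sigma = (ranked[0] - ranked[1]) / spread   # assumed form of the discrimination
    if sigma > threshold:
        return max(scores, key=scores.get)
    return None


# Hypothetical similarity scores for three candidates.
print(best_match({"x": 0.9, "y": 0.2, "z": 0.1}, 1.0))  # clear winner
print(best_match({"x": 0.5, "y": 0.5, "z": 0.1}, 1.0))  # tie at the top
```

Requiring a gap between the best and second-best candidate trades recall for precision: ambiguous nodes are left unmatched instead of being matched wrongly.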
Further, the convergence condition in step B2 is:
If none of the nodes of the first snapshot outside the current penetration set can bring about an extension of the current penetration set, the extension has converged.
(3) Beneficial effects
The above technical solution has the following advantages:
By mining the topology of each snapshot of the anonymous graph data, the association between the core nodes of the snapshots is determined first and the core penetration set is established, which determines the evolution of the main body of the anonymous graph. Then, on the basis of the core penetration set, the current penetration set is extended further until the full penetration set of the anonymous graph data is established, which fully determines how the anonymous graph evolves. The method of the present invention changes the status quo in which evolutionary analysis cannot be performed on anonymous graphs, and it achieves high node-matching accuracy when mining the associations between snapshots.
Other features and advantages of the present invention will become clearer after reading the detailed description of the embodiments of the present invention in conjunction with the accompanying drawings.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the steps of one embodiment of the present invention;
Fig. 2 is a schematic flow chart of the detailed steps of step A in Fig. 1;
Fig. 3 is a schematic flow chart of the detailed steps of step B in Fig. 1;
Fig. 4 is a schematic diagram of constructing the core penetrations in Embodiment 2 of the present invention;
Fig. 5 is a schematic diagram of constructing the full penetrations in Embodiment 2 of the present invention.
Detailed description of the embodiments
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and embodiments. The following embodiments are intended only to illustrate the present invention and do not limit its scope.
Embodiment 1
The technical solution of the present invention is further explained below with reference to the accompanying drawings and by way of specific embodiments.
As shown in Fig. 1, this embodiment of the present invention describes a method for performing evolutionary analysis on anonymous graph data, which mainly comprises the following steps:
A. Establish mappings between the core nodes of the snapshots of the anonymous graph data and associate all the mappings to obtain the core penetration set of the anonymous graph data;
B. Using a node matching algorithm, match the other nodes of each snapshot that lie outside the core penetration set, associate the nodes that can be matched, and extend the core penetration set to a full penetration set;
Here, a penetration is the chain formed by stringing together, in the time order of the snapshots, all nodes in the snapshots of the anonymous graph data that refer to the same object; a core penetration is a penetration composed of core nodes of the snapshots of the anonymous graph data.
As shown in Fig. 2, step A is mainly implemented by performing the following steps:
A1. Mine the anonymous graph data to obtain k core nodes of each snapshot, where k is an integer greater than 1.
Let n be the known number of snapshots of the anonymous graph data. The snapshots are indexed in time order by i, where 1 ≤ i ≤ n.
The degree of a node is chosen as the measure of the node's core level within a snapshot: the larger the degree, the higher the core level. For each snapshot, all of its nodes are sorted by degree in descending order, and the first k nodes of the sorted result are taken as the core nodes of that snapshot; together they form the core node set of the i-th snapshot.
The above k is an integer greater than 1 and can be chosen according to the scale of the graph data and empirical values.
The degree of a node is chosen as the measure of core level in this embodiment mainly because it is readily available and easy to compute. In other embodiments of the present invention, other measures of core level can be used: any measure that indicates how many connections a node has with other nodes in the graph data can be used to gauge its core level. Other measures of node core level should therefore also be understood to fall within the scope of protection of the present invention.
A2. For each snapshot, build the induced subgraph whose vertices are the core nodes of that snapshot.
Construct a subgraph whose vertices are the core nodes. For any two vertices, if the corresponding core nodes are joined by an edge in the snapshot, the corresponding edge also exists in the subgraph; otherwise the subgraph has no such edge. This yields the induced subgraph on the core node set of the i-th snapshot.
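The induced subgraph of step A2 keeps exactly the edges whose two endpoints are both core nodes. A minimal sketch, assuming a snapshot stored as a dictionary from node label to neighbor set; the function name and the example data are hypothetical.

```python
def induced_subgraph(adj, vertices):
    """Restrict a snapshot to the given vertices (step A2).

    adj maps each node to the set of its neighbors; an edge survives
    only if both of its endpoints are in `vertices`.
    """
    keep = set(vertices)
    return {v: adj[v] & keep for v in keep}


# Hypothetical snapshot; "c" is not a core node and drops out.
snapshot = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(induced_subgraph(snapshot, ["a", "b", "d"]))
```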
A3. Convert each induced subgraph into a corresponding weighted complete graph.
Construct a weighted complete graph whose vertices are the vertices of the induced subgraph, so that every two vertices are joined by an edge; the weight of an edge is:
$$w(u,v)=\frac{|N(u)\cap N(v)|}{|N(u)|\,|N(v)|}$$
where u and v denote the two endpoints of the edge, N(u) and N(v) denote the sets of neighbors of vertex u and vertex v in the induced subgraph, |N(u)| and |N(v)| denote the numbers of nodes in N(u) and N(v), and |N(u) ∩ N(v)| denotes the number of common neighbors of u and v in the induced subgraph. The weighted complete graph obtained from the induced subgraph of the i-th snapshot is recorded for each i, 1 ≤ i ≤ n.
A4. In the time order of the snapshots, establish in turn a mapping from the nodes of each weighted complete graph to the nodes of the next weighted complete graph, where the nodes of the last weighted complete graph are mapped to the nodes of the first weighted complete graph.
Let m_i denote the mapping from the i-th weighted complete graph to the (i+1)-th weighted complete graph, where 1 ≤ i ≤ n-1, and let m_n denote the mapping from the n-th weighted complete graph to the first weighted complete graph.
Let $\tilde{E}_i$ denote the edge set of the i-th weighted complete graph. The mapping m_i is then solved as:
$$m_i=\arg\min_{m}\sum_{(u_i,v_i)\in\tilde{E}_i}\bigl(w(u_i,v_i)-w(m(u_i),m(v_i))\bigr)^2$$
The optimal mapping is solved with a heuristic algorithm such as simulated annealing, so that, after all nodes of the two weighted complete graphs are mapped, the sum of the weight differences over corresponding edges is minimal.
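A simulated-annealing search for the mapping might look like the sketch below. The text names simulated annealing but gives no schedule, so the linear cooling, the swap move, the step count, and all function and parameter names are assumptions chosen for illustration; the cost is the sum of squared weight differences from the formula in claim 1.

```python
import math
import random
from itertools import combinations


def cost(w_src, w_dst, mapping):
    """Sum of squared weight differences over corresponding edges."""
    return sum(
        (w_src[frozenset((u, v))] - w_dst[frozenset((mapping[u], mapping[v]))]) ** 2
        for u, v in combinations(sorted(mapping), 2)
    )


def anneal_mapping(w_src, w_dst, src_nodes, dst_nodes, steps=2000, t0=1.0, seed=0):
    """Search for a low-cost bijection by swapping the images of two nodes."""
    rng = random.Random(seed)
    src = list(src_nodes)
    current = dict(zip(src, dst_nodes))
    cur_cost = cost(w_src, w_dst, current)
    best, best_cost = dict(current), cur_cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9      # assumed linear cooling
        u, v = rng.sample(src, 2)
        current[u], current[v] = current[v], current[u]
        new_cost = cost(w_src, w_dst, current)
        accept = new_cost <= cur_cost or rng.random() < math.exp(
            (cur_cost - new_cost) / temp
        )
        if accept:
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = dict(current), cur_cost
        else:
            current[u], current[v] = current[v], current[u]  # undo the swap
    return best, best_cost


# Two hypothetical weighted complete graphs on three vertices each.
w_src = {frozenset("ab"): 0.5, frozenset("ac"): 0.1, frozenset("bc"): 0.2}
w_dst = {frozenset("xy"): 0.2, frozenset("xz"): 0.5, frozenset("yz"): 0.1}
mapping, final_cost = anneal_mapping(w_src, w_dst, "abc", "xyz")
print(mapping, final_cost)
```

Because the best mapping seen so far is tracked separately, the returned cost is never worse than that of the initial assignment, whatever the random walk does.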
A5. Associate all the mappings to obtain the core penetration set.
This step specifically comprises the following steps:
A51. Construct a core penetration set, denoted T_sd, and initialize it to the empty set.
A52. Take any node v_1 in the first weighted complete graph and map it step by step, using the mappings established in step A4, to a node in each subsequent weighted complete graph, i.e., compute in turn v_2 = m_1(v_1), v_3 = m_2(v_2), …, v_n = m_{n-1}(v_{n-1}), and v_1' = m_n(v_n). If v_1' = v_1, add <v_1, v_2, ..., v_n> as one penetration to the core penetration set T_sd; otherwise discard it.
A53. Update the node v_1 and repeat step A52 until all k nodes of the first weighted complete graph have been processed, which yields the core penetration set of the anonymous graph data.
Step A52 starts from an arbitrary node of the first weighted complete graph and maps it in turn until its image back in the first weighted complete graph is obtained. Because the mappings of step A4 form a cycle, the mapping may equally start from any weighted complete graph and return to that graph after one full cycle, with the same test used to decide whether a chain is a penetration; this should also be understood to fall within the scope of protection of the present invention.
As shown in Fig. 3, step B specifically comprises:
B1. Take the core penetration set obtained in step A as the current penetration set.
Denote the current penetration set by T_ex, and initialize T_ex = T_sd.
B2. If the convergence condition is reached, stop extending and obtain the full penetration set; otherwise perform step B3.
The convergence condition is: if none of the nodes of the first snapshot that do not belong to T_ex can bring about an extension of the current penetration set, the extension has converged.
B3. On the basis of the current penetration set, extend one penetration toward the periphery in the snapshots of the anonymous graph data, add this penetration to the current penetration set, and perform step B2.
Step B3 specifically comprises:
Take any node u among the nodes of the first snapshot that do not belong to the current penetration set, and obtain step by step, in the time order of the snapshots, its matching node in each next snapshot until its matching node in the last snapshot is obtained, where none of these matching nodes belongs to the current penetration set. Then match u's matching node in the last snapshot against the nodes of the first snapshot to obtain a node u'. If u' = u, add the chain formed by u and its matching nodes in the snapshots, as one penetration, to the current penetration set.
Denote the mappings corresponding to the current penetration set by f_1, f_2, ..., f_i, ..., f_n, where f_1 is the mapping, in the current penetration set, from nodes of the first snapshot to nodes of the second snapshot; f_i is the mapping from nodes of the i-th snapshot to nodes of the (i+1)-th snapshot; and f_n is the mapping from nodes of the n-th snapshot to nodes of the first snapshot.
The matching node is obtained by performing the following steps:
B31. Let u_i denote the node to be matched in the i-th snapshot, where 1 ≤ i ≤ n. In the i-th snapshot, find the set of all neighbors of u_i that belong to the current penetration set, and record the set that this neighbor set is mapped to in the next snapshot under the current penetration set.
B32. Take any node of the [(i+1) mod n]-th snapshot that does not belong to the current penetration set as a candidate, and compute its similarity to u_i,
where the similarity is computed from the number of all neighbors of the candidate node, the number of all neighbors of u_i, and the number of nodes in the intersection of the mapped neighbor set from step B31 with the neighbor set of the candidate node.
B33. Update the candidate node and repeat step B32 until the similarity to u_i has been obtained for every node of the [(i+1) mod n]-th snapshot that does not belong to the current penetration set; all the similarities form the set scores;
B34. Compute the discrimination σ between the most similar node and the second most similar node,
where max(scores) denotes the maximum value in scores, max2(scores) denotes the second largest value in scores, and δ(scores) denotes the standard deviation of scores;
B35. If σ is greater than a set threshold, the node corresponding to max(scores) is taken as the best matching node of u_i; otherwise, the node u_i of the i-th snapshot is considered to have no matching node in the [(i+1) mod n]-th snapshot.
The threshold can be chosen according to the scale of the anonymous graph data and empirical values.
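Steps B31 to B33 can be sketched as below. The similarity expression is not reproduced in this text, so the code assumes a score of the same shape as the edge weight in step A3: the number of mapped neighbors shared with the candidate, divided by the product of the two neighbor counts. That form, the skipping of nodes without neighbors, and all names are assumptions.

```python
def candidate_scores(u, adj_i, adj_next, f_i):
    """Score every unmatched node of the next snapshot against u.

    adj_i and adj_next map nodes to neighbor sets in snapshot i and in
    the next snapshot; f_i maps already-matched nodes of snapshot i to
    their partners in the next snapshot.
    """
    # B31: neighbors of u already in the current penetration set, pushed through f_i
    mapped = {f_i[x] for x in adj_i[u] if x in f_i}
    matched_next = set(f_i.values())
    scores = {}
    for v, nbrs in adj_next.items():           # B32/B33: loop over candidates
        if v in matched_next or not nbrs or not adj_i[u]:
            continue
        # assumed similarity: shared mapped neighbors over product of degrees
        scores[v] = len(mapped & nbrs) / (len(nbrs) * len(adj_i[u]))
    return scores


# Hypothetical data: u has neighbors p and q, already matched to P and Q.
adj_i = {"u": {"p", "q"}, "p": {"u"}, "q": {"u"}}
f_i = {"p": "P", "q": "Q"}
adj_next = {
    "P": {"X", "Y"},
    "Q": {"X"},
    "X": {"P", "Q"},
    "Y": {"P", "R"},
    "R": {"Y"},
}
print(candidate_scores("u", adj_i, adj_next, f_i))
```

The candidate X, which touches both mapped neighbors, scores highest, which is the behavior the neighbor-overlap idea is meant to capture.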
Embodiment 2
Based on Embodiment 1 and with reference to Fig. 4 and Fig. 5, this embodiment gives a concrete implementation of the method for performing evolutionary analysis on anonymous graph data.
As shown in Fig. 4, this embodiment obtains three snapshots of a certain anonymous graph data set in total, which are labeled in the order of the snapshots.
Using the degree of a node as the measure of its core level, 3 nodes are chosen in each snapshot as its core nodes, giving one core node set per snapshot. An induced subgraph is built on each core node set and expanded into a weighted complete graph according to step A3 of Embodiment 1. Step A4 of Embodiment 1 then yields the mapping m_1 = {a → 2, b → 5, d → 3} from the first weighted complete graph to the second, the mapping m_2 = {2 → F, 3 → B, 5 → A} from the second to the third, and the mapping m_3 = {F → a, B → d, A → b} from the third back to the first.
From the mappings m_1, m_2 and m_3, 2 = m_1(a), F = m_2(2) and a = m_3(F), so the nodes a, 2 and F form the core penetration {a → 2 → F}. Similarly, the two other core penetrations {b → 5 → A} and {d → 3 → B} are obtained, and the core penetration set is T_sd = {a → 2 → F, b → 5 → A, d → 3 → B}.
As shown in Fig. 5, extending the core penetration set T_sd according to step B of Embodiment 1 gives T_ex = T_sd ∪ {c → 1 → D} ∪ {e → 6 → C} ∪ {f → 4 → E}.
The nodes of the snapshots of the anonymous graph data can thus be put in correspondence, so that the next step of evolutionary analysis can be carried out.
A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
The above are only preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art can make improvements and modifications without departing from the technical principle of the present invention, and these improvements and modifications should also be regarded as falling within the scope of protection of the present invention.

Claims (9)

1. A method for performing evolutionary analysis on anonymous graph data, characterized by comprising the following steps:
A. Establish mappings between the core nodes of the snapshots of the anonymous graph data and associate all the mappings to obtain the core penetration set of the anonymous graph data;
B. Using a node matching algorithm, match the other nodes of each snapshot that lie outside the core penetration set, associate the nodes that can be matched, and extend the core penetration set to a full penetration set;
wherein a penetration is the chain formed by stringing together, in the time order of the snapshots, all nodes in the snapshots of the anonymous graph data that refer to the same object, and a core penetration is a penetration composed of core nodes of the snapshots of the anonymous graph data;
Step A specifically comprises the following steps:
A1. Mine the anonymous graph data to obtain k core nodes of each snapshot, where k is an integer greater than 1;
A2. For each snapshot, build the induced subgraph whose vertices are the core nodes of that snapshot;
A3. Convert each induced subgraph into a corresponding weighted complete graph;
A4. In the time order of the snapshots, establish in turn a one-to-one mapping from the nodes of each weighted complete graph to the nodes of the next weighted complete graph, where the nodes of the last weighted complete graph are mapped to the nodes of the first weighted complete graph;
A5. Associate all the mappings to obtain the core penetration set;
Step A3 is specifically:
Construct a weighted complete graph whose vertices are the vertices of the induced subgraph, so that every two vertices are joined by an edge, the weight of an edge being:
$$w(u,v)=\frac{|N(u)\cap N(v)|}{|N(u)|\,|N(v)|}$$
where u and v denote the two endpoints of the edge, N(u) and N(v) denote the sets of neighbors of vertex u and vertex v in the induced subgraph, |N(u)| and |N(v)| denote the numbers of nodes in N(u) and N(v), and |N(u) ∩ N(v)| denotes the number of common neighbors of u and v in the induced subgraph;
Step A4 is specifically:
Let m_i denote the mapping from the i-th weighted complete graph to the (i+1)-th weighted complete graph, where 1 ≤ i ≤ n-1, let m_n denote the mapping from the n-th weighted complete graph to the first weighted complete graph, and let $\tilde{E}_i$ denote the edge set of the i-th weighted complete graph, where 1 ≤ i ≤ n; the mapping m_i is then solved as:
$$m_i=\arg\min_{m}\sum_{(u_i,v_i)\in\tilde{E}_i}\bigl(w(u_i,v_i)-w(m(u_i),m(v_i))\bigr)^2$$
where n is the known number of snapshots of the anonymous graph data, and u_i and v_i are nodes to be matched in the i-th snapshot.
2. The method according to claim 1, characterized in that step A1 is specifically:
For each snapshot, sort the nodes by core level in descending order; the first k nodes of the sorted result are the core nodes of that snapshot.
3. The method according to claim 2, characterized in that the core level is measured by the degree of a node: the larger the degree of a node, the higher its core level.
4. The method according to claim 1, characterized in that the mapping in step A4 is established as follows:
Solve for the optimal mapping, i.e., the one that minimizes, after all nodes of the two weighted complete graphs are mapped, the sum of the differences in weight between corresponding edges.
5. The method according to claim 1, characterized in that associating all the mappings in step A5 specifically comprises:
A51. Construct an empty core penetration set;
A52. Take any node v in the first weighted complete graph and map it step by step, using the mappings established in the above steps, to a node in each subsequent weighted complete graph until its image in the last weighted complete graph is obtained; then, using the mapping from the last weighted complete graph to the first, obtain its image v' in the first weighted complete graph; if v' = v, add the chain formed by v and its images in the weighted complete graphs, as one core penetration, to the core penetration set constructed in step A51;
A53. Update the node v and repeat step A52 until all k nodes of the weighted complete graph have been processed, which yields the core penetration set of the anonymous graph data.
6. The method according to any one of claims 1-5, characterized in that step B specifically includes:
B1. Take the core penetration set obtained in step A as the current penetration set;
B2. If the convergence condition is met, stop extending and obtain the full penetration; otherwise perform step B3;
B3. On the basis of the current penetration set, extend one penetration toward the periphery in each snapshot of the anonymous graph data, add this penetration to the current penetration set, and perform step B2.
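The control flow of steps B1 to B3 can be sketched as a simple loop. The sketch assumes a caller-supplied function `try_extend` (a hypothetical name) that either returns one new penetration or returns `None` when no extension is possible, which is used here as the convergence signal; how an extension is actually found is outside this sketch.

```python
from typing import Callable, List, Optional, Tuple

Chain = Tuple[str, ...]


def expand(
    core: List[Chain],
    try_extend: Callable[[List[Chain]], Optional[Chain]],
) -> List[Chain]:
    """Grow the core penetration set until no further penetration can be added."""
    current = list(core)                  # B1: start from the core penetration set
    while True:
        new_chain = try_extend(current)   # B3: look for one more penetration
        if new_chain is None:             # B2: converged, return the full penetration
            return current
        current.append(new_chain)


# Hypothetical stand-in for the matching procedure: offer fixed candidates in order.
candidates = [("x1", "x2"), ("y1", "y2")]


def toy_extend(current: List[Chain]) -> Optional[Chain]:
    for chain in candidates:
        if chain not in current:
            return chain
    return None


full = expand([("c1", "c2")], toy_extend)
```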
7. The method according to claim 6, characterized in that step B3 specifically includes:
Take any node u in the first snapshot that does not belong to the current penetration set and, following the time order of the snapshots, obtain its matched node in the next snapshot step by step until its matched node in the last snapshot is obtained, where none of these matched nodes belongs to the current penetration set; then match the matched node of u in the last snapshot against the nodes of the first snapshot to obtain a node u'. If u' = u, the node chain formed by u and its matched nodes in each snapshot is added, as one penetration, to the current penetration set.
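One extension attempt for a single node u, as described in claim 7, might look like the sketch below. It assumes a function `match(i, node)` (hypothetical) that returns the matched node in snapshot (i+1) mod n or `None` when there is no match, and a list `used` whose i-th entry holds the nodes of snapshot i that already belong to the current penetration set; snapshots are indexed from 0 here.

```python
from typing import Callable, Hashable, List, Optional, Set, Tuple


def try_chain(
    u: Hashable,
    n: int,
    match: Callable[[int, Hashable], Optional[Hashable]],
    used: List[Set[Hashable]],
) -> Optional[Tuple[Hashable, ...]]:
    """Follow u through all n snapshots and accept the chain only if it closes on u."""
    chain = [u]
    for i in range(n - 1):
        nxt = match(i, chain[-1])
        # Reject if there is no match or the match is already in the current set.
        if nxt is None or nxt in used[i + 1]:
            return None
        chain.append(nxt)
    back = match(n - 1, chain[-1])        # match the last snapshot against the first
    return tuple(chain) if back == u else None


# Hypothetical lookup tables standing in for the similarity-based matching.
table = [{"x": "p"}, {"p": "s"}, {"s": "x"}]


def toy_match(i: int, node: Hashable) -> Optional[Hashable]:
    return table[i].get(node)


accepted = try_chain("x", 3, toy_match, [set(), set(), set()])
rejected = try_chain("x", 3, toy_match, [set(), {"p"}, set()])
```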
8. The method according to claim 7, characterized in that the matched node is obtained by performing the following steps:
B31. Let u_i denote the node to be matched in the i-th snapshot. In the i-th snapshot, find the set of all neighbor nodes of u_i that belong to the current penetration; according to the current penetration, record the set that this neighbor set maps to in the next snapshot, where 1 ≤ i ≤ n;
f_i denotes the mapping, in the current penetration, from the nodes belonging to the i-th snapshot to the nodes belonging to the [(i+1) mod n]-th snapshot;
B32. Take any node in the [(i+1) mod n]-th snapshot that does not belong to the current penetration as a candidate node, and compute the similarity between the candidate node and u_i:
where the size of the candidate node's neighborhood is the number of all neighbor nodes of the candidate node, |N(u_i)| is the number of all neighbor nodes of u_i, and the intersection term is the number of nodes common to the mapped neighbor set of step B31 and the neighbor set of the candidate node;
B33. Update the candidate node and repeat step B32 until the similarity between u_i and every node in the [(i+1) mod n]-th snapshot that does not belong to the current penetration has been obtained; all these similarities constitute the set scores;
B34. Compute the discrimination σ between the most similar node and the second most similar node:
$$\sigma = \frac{\max(scores) - \max_2(scores)}{\delta(scores)},$$
where max(scores) denotes the maximum value in scores, max_2(scores) denotes the second-largest value in scores, and δ(scores) denotes the standard deviation of scores;
B35. If σ is greater than a set threshold, the node corresponding to max(scores) is taken as the best matching node of u_i; otherwise, the node u_i of the i-th snapshot is considered to have no matched node in the [(i+1) mod n]-th snapshot.
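Steps B32 to B35 can be sketched as follows. The discrimination σ follows the formula of step B34. The similarity function, however, is only an assumption: it uses a cosine-style normalization built from the three quantities named in step B32, and the exact expression in the claim may differ. Using the population standard deviation, and returning no match when fewer than two candidates exist or when all scores are equal, are likewise assumptions.

```python
import math
from statistics import pstdev
from typing import Dict, Hashable, Optional, Set


def similarity(mapped_nbrs: Set[Hashable], cand_nbrs: Set[Hashable], u_degree: int) -> float:
    """ASSUMED cosine-style score; the exact formula of step B32 may differ."""
    if not cand_nbrs or u_degree == 0:
        return 0.0
    return len(mapped_nbrs & cand_nbrs) / math.sqrt(u_degree * len(cand_nbrs))


def best_match(scores: Dict[Hashable, float], threshold: float) -> Optional[Hashable]:
    """Return the top-scoring candidate only if it clearly stands out (B34, B35)."""
    if len(scores) < 2:
        return None                        # assumption: sigma is undefined here
    values = list(scores.values())
    ranked = sorted(values, reverse=True)
    spread = pstdev(values)                # assumption: population standard deviation
    if spread == 0:
        return None                        # all candidates score the same
    sigma = (ranked[0] - ranked[1]) / spread
    return max(scores, key=scores.get) if sigma > threshold else None


# Hypothetical scores for three candidate nodes.
scores = {"a": 0.9, "b": 0.2, "c": 0.1}
clear_winner = best_match(scores, threshold=1.0)
no_winner = best_match(scores, threshold=2.5)
```

In the example σ is roughly 1.97, so candidate "a" is accepted at a threshold of 1.0 and rejected at 2.5.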
9. The method according to claim 6, characterized in that the convergence condition in step B2 is:
If none of the nodes of the first snapshot outside the current penetration set can bring about an extension of the current penetration, the extension has converged.
CN201310331668.XA 2013-08-01 2013-08-01 Method for evolutionary analysis on anonymous graph data Active CN103440263B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310331668.XA CN103440263B (en) 2013-08-01 2013-08-01 Method for evolutionary analysis on anonymous graph data

Publications (2)

Publication Number Publication Date
CN103440263A CN103440263A (en) 2013-12-11
CN103440263B true CN103440263B (en) 2017-04-19

Family

ID=49693955

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310331668.XA Active CN103440263B (en) 2013-08-01 2013-08-01 Method for evolutionary analysis on anonymous graph data

Country Status (1)

Country Link
CN (1) CN103440263B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291400B (en) * 2017-06-30 2020-07-28 苏州浪潮智能科技有限公司 Snapshot volume relation simulation method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103106616A (en) * 2013-02-27 2013-05-15 中国科学院自动化研究所 Community detection and evolution method based on features of resources integration and information spreading

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103106616A (en) * 2013-02-27 2013-05-15 中国科学院自动化研究所 Community detection and evolution method based on features of resources integration and information spreading

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
De-anonymizing Dynamic Social Networks; Xuan Ding et al.; IEEE; 20111209; full text *
De-anonymizing Social Networks; Arvind Narayanan et al.; IEEE; 20090520; full text *
Enabling Dynamic Analysis of Anonymized Social Network Data; Xuan Ding et al.; IEEE; 20121012; page 21, column 1, lines 10-11; page 22, column 1, lines 19-20 and 24-25; page 22, column 2, lines 7-17 and 40-42; page 23, column 2, lines 2-3, 11-12, 17-27 and the last 10 lines; page 24, column 1, lines 6-18, 24-28, 31-37 and 41-45, column 2, lines 6-12; Figure 1 *

Also Published As

Publication number Publication date
CN103440263A (en) 2013-12-11

Similar Documents

Publication Publication Date Title
Li et al. Quantitative function for community detection
Hao et al. Diversified top-k maximal clique detection in social internet of things
CN113015169B (en) Optimal control method, device and medium for malicious program propagation of charging wireless sensor network
CN102708327A (en) Network community discovery method based on spectrum optimization
Yu et al. Rum: Network representation learning using motifs
Botta et al. Finding network communities using modularity density
CN106203494A (en) A kind of parallelization clustering method calculated based on internal memory
CN104836711A (en) Construction method of command control network generative model
CN110399286A (en) A kind of automatic generation of test data based on independent pathway
CN104750762A (en) Information retrieval method and device
Mazepa et al. An ontological approach to detecting fake news in online media
Yan et al. Data-driven pollution source location algorithm in water quality monitoring sensor networks
CN106097090A (en) A kind of taxpayer interests theoretical based on figure associate group's recognition methods
CN105302823A (en) Overlapped community parallel discovery method and system
CN102819611A (en) Local community digging method of complicated network
CN103440263B (en) Method for evolutionary analysis on anonymous graph data
Tunali Large-scale network community detection using similarity-guided merge and refinement
CN106127595A (en) A kind of community structure detection method based on positive and negative side information
CN107578136A (en) The overlapping community discovery method extended based on random walk with seed
Hao et al. Iceberg clique queries in large graphs
CN114254131A (en) Network security emergency response knowledge graph entity alignment method
CN104850646A (en) Method of mining frequent subgraphs for single uncertain graphs
CN103593800A (en) Community discovery method based on faction random walk
Xia et al. An improved local community detection algorithm using selection probability
Zhao et al. Overlapping Community Detection Algorithm Based on High‐Quality Subgraph Extension in Local Core Regions of Network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant