CN109063089B - Subgraph matching method and device based on community structure - Google Patents

Subgraph matching method and device based on community structure Download PDF

Info

Publication number
CN109063089B
CN109063089B CN201810836811.3A CN201810836811A CN109063089B CN 109063089 B CN109063089 B CN 109063089B CN 201810836811 A CN201810836811 A CN 201810836811A CN 109063089 B CN109063089 B CN 109063089B
Authority
CN
China
Prior art keywords
community
matching
subgraph
nodes
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810836811.3A
Other languages
Chinese (zh)
Other versions
CN109063089A (en
Inventor
王朝坤
楼昀恺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201810836811.3A priority Critical patent/CN109063089B/en
Publication of CN109063089A publication Critical patent/CN109063089A/en
Application granted granted Critical
Publication of CN109063089B publication Critical patent/CN109063089B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a subgraph matching method and a subgraph matching device based on a community structure, wherein the method comprises the following steps: importing a file containing a target mode, analyzing a target mode structure, and finding out mutually matched equivalent subgraphs in the target mode; generating a hypergraph with communities as nodes according to the network graph data, and calculating the number of edges between each node in each community and an adjacent community of the community; finding out subgraphs, matched with the target mode structure, of each community by using a preset subgraph matching algorithm in each community of the network graph to obtain a first matching result; in the network graph, finding out sub-graphs which cross communities and are matched with the target pattern based on the number of edges of each node in each community of the network graph and each adjacent community of the community and the found sub-graphs which are matched with each other and are equivalent in the target pattern, and obtaining a second matching result; and summarizing the first matching result and the second matching result to obtain a final sub-graph matching result. The sub-graph matching speed can be improved, and the time overhead is reduced.

Description

Subgraph matching method and device based on community structure
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a subgraph matching method and device based on a community structure.
Background
With the wide application of graph data in various fields, the demand for a correlation algorithm for graph data processing is increasing. The subgraph matching algorithm is taken as a classic algorithm on the graph, and how to carry out efficient subgraph matching becomes an important problem. The subgraph matching refers to a process of specifying a target mode by a user and finding all subgraphs isomorphic with the target mode in the graph through a subgraph matching algorithm. The application of subgraph matching includes finding a single node satisfying a specific condition, finding several nodes with a specific relationship, and the like, such as finding a structure composed of all three people known to each other in a social network.
The Ullmann algorithm and the vf2 algorithm are classical subgraph matching methods. Neo4j is a commonly used graph database system that has no specific implementation for sub-graph matching, but achieves the effect of sub-graph matching by performing path matching.
The speed of the sub-graph matching method used in the Ullmann algorithm, the vf2 algorithm and the Neo4j is low in actual calculation, and when the data size is large, the time overhead of the algorithm is large, so that the requirements of actual application cannot be met. Moreover, the algorithms do not fully utilize the community structure of the graph, and have a large improvement space.
Therefore, how to perform sub-graph matching to increase the computation speed of sub-graph matching becomes a technical problem to be solved at present.
Disclosure of Invention
Because the existing method has the problems, the embodiment of the invention provides a subgraph matching method and device based on a community structure, which are applied to a social network.
In a first aspect, an embodiment of the present invention provides a subgraph matching method based on a community structure, which is applied to a social network, and includes:
importing a file containing a target mode representing three mutually recognized people, analyzing the structure of the target mode based on the imported file, and finding out mutually matched equivalent subgraphs in the target mode;
generating a hypergraph with communities as nodes according to a social network, and calculating the number of edges between each node in each community and the node in the adjacent community of the community;
in each community in the social network, respectively finding out subgraphs formed by all three people known to each other in each community by using a preset subgraph matching algorithm to obtain a subgraph matching result in the community;
performing node repeatable subgraph matching on a target pattern on the hypergraph among all communities in the social network by utilizing a preset subgraph matching algorithm, storing an obtained distribution scheme in a preset table, processing each distribution scheme s in the preset table based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraph which is equivalent to each other in the target pattern when the number of elements in the preset table is larger than a preset threshold value, judging whether the number of nodes of the target pattern distributed to each community in the social network in s is larger than the number of nodes in the community, judging whether s can be converted into a distribution scheme with a smaller index value through matching equivalence relation, and calculating the common one-hop number of the nodes in the target pattern in other communities except the community in pairs for the nodes distributed to the same community in s, finding out a subgraph formed by all three mutually known people across communities to obtain a subgraph matching result among the communities;
and summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the social network, namely obtaining all three people known mutually in the social network.
Optionally, the importing a file including a target pattern, analyzing a structure of the target pattern based on the imported file, and finding out mutually matched equivalent subgraphs in the target pattern includes:
importing a file containing a target mode, and analyzing nodes, edges and related attributes in the target mode;
and analyzing the structure of the target pattern based on the analyzed nodes and edges and related attributes in the target pattern, and finding out mutually matched equivalent subgraphs in the target pattern.
Optionally, inside each community in the social network, respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm, and obtaining subgraph matching results inside the community, including:
and in parallel, in each community in the social network, respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm, and obtaining a subgraph matching result in the community.
Optionally, the method includes performing node-repeatable subgraph matching on a target pattern on the hypergraph by using a preset subgraph matching algorithm among all communities in the social network, storing the obtained distribution schemes in a preset table, processing each distribution scheme s in the preset table based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraph equivalent to each other in the target pattern when the number of elements in the preset table is greater than a preset threshold, judging whether the number of nodes of the target pattern distributed to each community in the social network in s is greater than the number of nodes in the community, judging whether s can be converted into a distribution scheme with a smaller index value through matching equivalence, and calculating the number of common one-hop neighbors of the nodes in the target pattern except the community in pairs among the nodes distributed to the same community in s in the target pattern except the community Pruning is carried out, a subgraph formed by three mutually known people across communities is found, and subgraph matching results among communities are obtained, and the method comprises the following steps:
performing node repeatable subgraph matching on the target pattern on the hypergraph by using a preset subgraph matching algorithm, and storing the obtained distribution scheme in a preset table;
when the number of elements in the preset table is greater than a preset threshold, processing each allocation scheme s in the preset table, including:
judging whether the number of the nodes of the target mode distributed to each community in the social network in the step s is larger than that of the nodes in the community or not;
if the number of the nodes of the target mode distributed to each community in the social network in s is less than or equal to the number of the nodes in the community, taking the nodes of the target mode distributed to each community in s as the nodes to be matched according to the structure of the target mode, calculating the number of edges between the nodes to be matched and the nodes distributed to other communities except the community in the target mode, pruning the nodes in the community and the edges of other communities except the community in the social network to reduce the number of candidate matching nodes corresponding to each node in the target mode, and if the number of the candidate matching nodes corresponding to a certain node in the target mode after pruning is reduced to 0, finishing the processing of s;
if the number of candidate matching nodes corresponding to all nodes in the pruned target pattern is not 0, judging whether the current s can be converted into a distribution scheme with a smaller index value through matching equivalence relation according to the found target pattern matched with equivalent subgraphs, and if not, recording all distribution schemes which can be obtained by converting the current s through matching equivalence relation in a preset file;
for the nodes in the target mode which is distributed to the same community in the s, calculating the number of the common one-hop neighbors of the nodes in the target mode in other communities except the community two by two;
taking a derived subgraph of a target pattern formed by nodes in the target pattern distributed to the same community in the step s as a target pattern subgraph, carrying out subgraph matching on the target pattern subgraph and a corresponding community in the social network by using a preset subgraph matching algorithm, carrying out matching attempt on each node in the target pattern subgraph by using the corresponding candidate matching node in the matching process, and pruning by using the calculated public one-hop neighbor number to obtain a matching result of carrying out subgraph matching on each target pattern subgraph in the step s and the corresponding community in the social network;
carrying out further inspection on matching results obtained by carrying out subgraph matching on each target pattern subgraph in s and the corresponding community in the social network, and for the matching results of any two target pattern subgraphs in s in two communities, if two nodes of the two target pattern subgraphs are connected by edges, the nodes matched with the two nodes are also connected by edges; if a matching result set containing a matching result of each target pattern subgraph in s passes the check in pairs, combining the matching results in the matching result set to generate a matching result of the target pattern and the social network;
and calculating a final matching result for all distribution schemes recorded in the preset file by matching equivalent corresponding relations according to the generated matching results of all target patterns and the social network.
Optionally, after determining whether the number of nodes of the target pattern assigned to each community in the social network is greater than the number of nodes in the community in s, the method further includes:
and if the number of the nodes in the target mode distributed to a certain community in the social network in the s is judged and known to be larger than the number of the nodes in the community, finishing the processing of the s.
Optionally, after determining whether the current s can be converted into an allocation scheme with a smaller index value through a matching equivalence relation according to mutually matching equivalent subgraphs in the found target pattern, the method further includes:
and judging to acquire that the current s can be converted into a distribution scheme with a smaller index value through matching the equivalence relation, and finishing the processing of the s.
Optionally, when the number of elements in the preset table is greater than a preset threshold, each matching s in the preset table is processed in parallel.
In a second aspect, an embodiment of the present invention further provides a subgraph matching device applied to a social network and based on a community structure, including:
the import module is used for importing a file containing a target mode representing three mutually recognized people and analyzing nodes, edges and related attributes in the target mode;
the generating module is used for generating a hypergraph with communities as nodes according to the social network and calculating the number of edges between each node in each community and each adjacent community interval of the community;
the analysis module is used for analyzing the structure of the target pattern based on the analyzed nodes, edges and related attributes in the target pattern and finding out mutually matched equivalent subgraphs in the target pattern;
the first acquisition module is used for respectively finding out subgraphs formed by all three people known mutually in the subgraphs, matched with the target mode, of each community by utilizing a preset subgraph matching algorithm in each community in the social network to obtain a subgraph matching result in the community;
a second obtaining module, configured to perform node-repeatable subgraph matching on the hypergraph on the target pattern by using a preset subgraph matching algorithm among all communities in the social network, store the obtained distribution schemes in a preset table, process, for each distribution scheme s in the preset table, based on the calculated number of edges between each node in each community and each neighboring community of the community and the found subgraph equivalent to each other in the target pattern, when the number of elements in the preset table is greater than a preset threshold, determine whether the number of nodes of the target pattern allocated to each community in the social network is greater than the number of nodes in the community, determine whether s can be converted into a distribution scheme with a smaller index value through matching equivalence relations, and for nodes in the target pattern allocated to the same community in s, in the target mode, the number of public one-hop neighbors of the target mode in other communities except the community is calculated pairwise to prune, a subgraph formed by all three people mutually known across communities is found out, and a subgraph matching result between communities is obtained;
and the summarizing module is used for summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and the subgraph matching result of the social network, namely obtaining all three people known mutually in the social network.
In a third aspect, an embodiment of the present invention further provides an electronic device, including: a processor, a memory, a bus, and a computer program stored on the memory and executable on the processor;
the processor and the memory complete mutual communication through the bus;
the processor, when executing the computer program, implements the method described above.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the above method.
According to the technical scheme, the subgraph matching method and device based on the community structure, provided by the embodiment of the invention, find out the subgraphs, which are matched with each other and equivalent, in the target pattern by a method of analyzing the structure of the target pattern based on the imported file; generating a hypergraph with communities as nodes according to a network graph, calculating the number of edges between each node in each community and each adjacent community of the community, respectively finding out subgraphs, matched with a target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining a subgraph matching result in each community; and between all communities in the network graph, finding out subgraphs which cross the communities and are matched with the target pattern based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs which are matched with each other and are equivalent in the target pattern, obtaining subgraph matching results between the communities, summarizing the subgraph matching results in the communities and the subgraph matching results between the communities, and obtaining the subgraph matching results between the target pattern and the network graph, so that the calculation speed of subgraph matching can be effectively improved, and the time overhead can be reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained based on these drawings without creative efforts.
Fig. 1 is a schematic flow chart of a subgraph matching method based on a community structure according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a subgraph matching apparatus based on a community structure according to an embodiment of the present invention;
fig. 3 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention;
FIG. 4a is a diagram illustrating an example of a directed pattern graph according to an embodiment of the present invention;
FIG. 4b is an exemplary diagram of another directed pattern graph according to an embodiment of the present invention;
fig. 4c is an exemplary diagram of a undirected network diagram according to an embodiment of the invention.
Detailed Description
The following further describes embodiments of the present invention with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Fig. 1 shows a flowchart of a subgraph matching method based on a community structure according to an embodiment of the present invention, and as shown in fig. 1, the subgraph matching method based on a community structure according to the embodiment includes:
and S1, importing a file containing the target pattern, analyzing the structure of the target pattern based on the imported file, and finding out the sub-images which are matched and equivalent in the target pattern.
And S2, generating a hypergraph with communities as nodes according to the data of the network graph, and calculating the number of edges between each node in each community and each adjacent community of the community.
For convenience of description and understanding, in the present embodiment, the total number of edges between one node and all nodes in the corresponding community in the network graph may be referred to as the number of edges between the node and the community.
It can be understood that, in this embodiment, each community may be regarded as a super node according to the data of the network graph, and if there is an edge connection between nodes in two communities, the super nodes corresponding to the two communities are also connected, and the two communities are referred to as being connected; and if two nodes in a community are connected by edges, adding a self-loop for the super-junction point corresponding to the community. The graph formed by these super junction points is called a hypergraph.
It is understood that in this step, which communities are connected to each community may be calculated and stored.
It is understood that the method of the present embodiment is based on a community-based graph database system, and it is assumed that the following information has been obtained: each node belongs to which community, each community contains which nodes, and which nodes are connected to an edge for a given node.
And S3, respectively finding out subgraphs, matched with the target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining subgraph matching results in the communities.
In a specific application, the preset subgraph matching algorithm of the embodiment may include: the basic subgraph matching algorithms such as the vf2 algorithm and the like are not limited in the embodiment, and other basic subgraph matching algorithms can be selected as the preset subgraph matching algorithm according to actual conditions.
S4, in all communities in the network graph, based on the calculated number of the edges of each node in each community and each adjacent community of the community and the found subgraph which is equivalent to each other in the target pattern, finding out the subgraph which crosses the communities and is matched with the target pattern, and obtaining the subgraph matching result of the community.
S5, summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the network graph.
The subgraph matching method based on the community structure comprises the steps of dividing subgraph matching into two stages, respectively calculating through two processes of subgraph matching in communities and subgraph matching between communities, finding out subgraphs which are matched with each other and are equivalent in a target mode through a method of analyzing the structure of the target mode based on an imported file, generating a hypergraph with communities as nodes according to a network graph, and calculating the number of edges between each node in each community and each adjacent community of the community; respectively finding out subgraphs, matched with the target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining subgraph matching results in the communities; and between all communities in the network graph, finding out subgraphs which cross the communities and are matched with the target pattern based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs which are matched with each other and are equivalent in the target pattern, obtaining subgraph matching results between the communities, summarizing the subgraph matching results in the communities and the subgraph matching results between the communities, and obtaining the subgraph matching results between the target pattern and the network graph, so that the calculation speed of subgraph matching can be effectively improved, and the time overhead can be reduced.
It should be noted that, in this embodiment, the matching equivalence definitions and related concepts are defined as follows:
definition 1 (symmetry point): given pattern diagram P ═ Vp,Ep),vi,vj∈VpWherein V ispSet of vertices in the pattern P, EpIs the set of edges in the pattern P, if viAnd vjThe topological structure of (1) is symmetrical, namely viAnd vjAre symmetrical points with each other. Node viAnd vjThe definition of topological symmetry is: if there is a non-trivial autoregressive on the graph, the corresponding bijective g of the autoregressive satisfies g (v)i)=vj,g(vj)=viThen called node viAnd vjThe topological structure is symmetrical.
For example, referring to FIG. 4a, the schema diagram in FIG. 4a exists for a self-isomorphic mapping g, satisfying g (v)1)=v1,g(v2)=v3,g(v3)=v2,g(v4)=v5,g(v5)=v4,g(v6)=v6. Node v is thus2And v3Symmetrical topological structure, node v4And v5The topology is symmetrical, so v2And v3,v4And v5Are mutually symmetrical points respectively.
Definition 2 (symmetry diagram): given pattern diagram P ═ Vp,Ep) Let Psub1=(Vp1,Ep1) And Psub2=(Vp2,Ep2) For two subgraphs existing in P, if bijective f: V representing symmetric relation exists for the two subgraphsp1→Vp2(abbreviated as symmetric bijective), wherein for each mapping f (u) v, the node u and the node v are symmetric points, and then P is calledsub1And Psub2Are symmetrical subgraphs of each other. Let the bijection f be Psub1In respect of Psub2Symmetric bijections of (2).
Obviously, if there is Psub1In respect of Psub2The symmetric bijective f of (a) defines a bijective g satisfying for each mapping f (u) v for f, respectively g (v) u, then g is Psub2In respect of Psub1Symmetric bijections of (2).
Define 3 (extended symmetry mapping): given pattern diagram P ═ Vp,Ep) And there are two subgraphs P in Psub1=(Vp1,Ep1) And Psub2=(Vp2,Ep2) They are symmetric subgraphs of each other. Let Psub1In respect of Psub2The symmetric bijection of (b) is f, the mapping h is called Vp→VpIs Psub1In respect of Psub2If h satisfies the extended symmetric mapping of (1)
Figure GDA0002971082330000101
For example, Psub1={v2And Psub2={v3Are two subgraphs of the pattern diagram shown in FIG. 4a, and v2And v3Points of mutual symmetry, present Psub1In respect of Psub2Of (c) symmetrical bijective f: { v2}→{v3Satisfy f (v)2)=v3Thus P issub1And Psub2Are symmetrical subgraphs of each other. Psub1In respect of Psub2Satisfies h (v) by the extended symmetric mapping1)=v1,h(v2)=v3,h(v3)=v2,h(v4)=v4,h(v5)=v5,h(v6)=v6
Definition 4 (congruent): given undirected pattern diagram P ═ Vp,Ep),v1,v2∈VpNeighbours (v) represents a set of neighbor nodes for node v. Balance v1And v2Are congruent to each other, if: (1) | Neighbours (v)1)|=|Neighbours(v2)|;(2)Neighbours(v1)-{v1,v2}=Neighbours(v2)-{v1,v2}. For directed pattern graphs, OutNeighbours (v) represents the set of out-neighbor nodes for node v, InNeighbours (v) represents the set of in-neighbor nodes for node v, denoted v1And v2Are congruent to each other, if: (1) | OutNeighbours (v)1)|=|OutNeighbours(v2) I and OutNeighbours (v)1)-{v1,v2}=OutNeighbours(v2)-{v1,v2};(2)|InNeighbours(v1)|=|InNeighbours(v2) I and InNeighbours (v)1)-{v1,v2}=InNeighbours(v2)-{v1,v2};(3)v1∈OutNeighbours(v2) And v is2∈OutNeighbours(v1) Or v or1And v2There is no edge in between.
For example, referring to FIG. 4b, node v in FIG. 4b1,v2,v3Two by two are congruent points.
Define 5 (neighborhood range): given undirected pattern diagram P ═ Vp,Ep) For node V ∈ VpAnd its neighborhood range n (v) { t | t ═ vort ∈ neighbours (v) }, where neighbours (v) represents a neighbor node of node v. For the directed pattern graph, the neighborhood range No is respectively definedutAnd an in-neighborhood region Nin:Nout(v)={t|t=vort∈OutNeighbours(v)},Nin(v)={t|t=vort∈InNeighbours(v)}。
For example, there is N in FIG. 4aout(v1)={v1,v2,v3},Nin(v1)={v1},Nout(v2)={v2,v4},Nin(v2)={v1,v2},Nout(v3)={v3,v5},Nin(v3)={v1,v3}。
Define 6 (neighborhood range rule): given undirected pattern diagram P ═ Vp,Ep),Psub1And Psub2Are two subgraphs of P, and they are symmetric subgraphs to each other, let P besub1In respect of Psub2Symmetric bijection of (d) is f, Psub1In respect of Psub2The extended symmetric bijection of (c) is h. For Psub1And Psub2The neighborhood range rule refers to the following two rules: rule 1, Psub1In the neighborhood of each node P, N (P) and P are in Psub2A neighborhood n (q) of the symmetry point q ═ f (p) satisfies h (n (p)) n (q); rule 2, Psub1With each node P in Psub2The symmetric points q ═ f (p) in (a) are congruent points with each other. If Psub1And Psub2If at least one of the two rules is satisfied, P is calledsub1And Psub2Satisfying the neighborhood range rule. For a directed pattern graph P ═ Vp,Ep) About P being symmetric subgraphs of each othersub1And Psub2The neighborhood range rule of (1) is: rule 1, Psub1Of each node p in the treeout(p) neighborhood of entries Nin(P) and P are in Psub2The symmetric point q in (f) (p) is the out-neighborhood region Nout(q) neighborhood of entries Nin(q) satisfy the conditions h (No) respectivelyut(p))=Nout(q) and h (N)in(p))=Nin(q); rule 2, Psub1With each node P in Psub2The symmetric points q ═ f (p) in (a) are congruent points with each other. Wherein f is Psub1In respect of Psub2Is symmetric bijective, h is Psub1In respect of Psub2Extended symmetric bijective of. If Psub1And Psub2If at least one of the two rules is satisfied, P is calledsub1And Psub2Satisfying the neighborhood range rule.
Definition 7 (matching equivalence): given an undirected or directed pattern graph P ═ Vp,Ep),Psub1And Psub2Is two subgraphs of P, and they are mutually symmetrical subgraphs, note Psub1In respect of Psub2Symmetric bijection of (d) is f, Psub1In respect of Psub2Is symmetrically mapped as h. If Psub1And Psub2Satisfies the neighborhood range rule, and Psub1And Psub2Isomorphism, then called Psub1And Psub2And matching is equivalent.
Further, on the basis of the above embodiment, the step S1 may include:
importing a file containing a target mode, and analyzing nodes, edges and related attributes in the target mode;
and analyzing the structure of the target pattern based on the analyzed nodes and edges and related attributes in the target pattern, and finding out mutually matched equivalent subgraphs in the target pattern.
Therefore, the structure of the target pattern can be analyzed, and the equivalent subgraphs matched with each other in the target pattern can be found.
Further, on the basis of the above embodiment, the step S3 may include:
and parallelly finding out subgraphs, matched with the target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining subgraph matching results in the communities.
It can be understood that the computing speed can be accelerated by realizing the parallelization of the subgraph matching process in the community.
Further, on the basis of the above embodiment, the step S4 may include:
performing repeatable subgraph matching on the target pattern on the hypergraph by using a preset subgraph matching algorithm (namely two nodes in the target pattern can be matched with the same node in the hypergraph), and storing an obtained distribution scheme in a preset table (list) (the distribution scheme refers to a distribution mode for distributing the nodes in the target pattern to various communities, wherein for repeatable subgraph matching of the nodes, if a certain node in the hypergraph is matched with a certain node in the target pattern, the target pattern node is distributed to a community corresponding to the node in the hypergraph);
when the number of elements in the preset table is greater than a preset threshold, the processing may specifically include steps P1-P7, not shown in the figure, for each allocation scheme s in the preset table:
and P1, judging whether the number of the nodes of the target mode distributed to each community in the network graph in s is larger than that of the nodes in the community.
In step P1, if it is determined that the number of nodes in s assigned to the target pattern of a community in the hypergraph is greater than the number of nodes in the community, the process for s is terminated.
And P2, if the number of the nodes of the target pattern allocated to each community in the network graph in s is less than or equal to the number of the nodes in the community, taking the nodes of the target pattern allocated to each community in s as the nodes to be matched according to the structure of the target pattern, calculating the number of edges between the nodes to be matched in the target pattern and the nodes allocated to other communities except the community, pruning the nodes in the network graph and the edges of other communities except the community in the network graph in combination to reduce the number of the candidate matching nodes corresponding to each node in the target pattern, and if the number of the candidate matching nodes corresponding to a certain node in the target pattern after pruning is reduced to 0, ending the processing on s.
Wherein, the candidate matching node of one node in the target pattern refers to the network graph node which is matched with the candidate matching node.
P3, if the number of candidate matching nodes corresponding to all the nodes in the pruned target pattern is not 0, judging whether the current s can be converted into a distribution scheme with a smaller index value through matching equivalence relation according to the found mutually matched equivalent subgraphs in the target pattern, and if not, recording all the distribution schemes which can be obtained by converting the current s through matching equivalence relation in a preset file (newallmaps).
In step P3, if it is determined that s can be converted into an allocation scheme with a smaller index value through matching equivalence relation, the process for s is ended.
P4, for the nodes in the target pattern which are distributed to the same community in s, calculating the number of common one-hop neighbors of the nodes in the target pattern in other communities outside the community in pairs.
And P5, taking the derived subgraph of the target pattern corresponding to the nodes in the target pattern distributed to the same community in s as a target pattern subgraph, performing subgraph matching on the target pattern subgraph and the corresponding community in the network graph by using a preset subgraph matching algorithm, performing matching attempt on each node in the target pattern subgraph by using the corresponding candidate matching node in the matching process, and performing pruning by using the calculated public one-hop neighbor number to obtain a matching result of performing subgraph matching on each target pattern subgraph in s and the corresponding community in the network graph.
P6, carrying out further inspection on matching results obtained by carrying out subgraph matching on each target mode subgraph in s and the corresponding community in the network graph, and for the matching results of any two target mode subgraphs in s in two communities, if two nodes of the two target mode subgraphs are connected by edges, the nodes matched with the two nodes are also connected by edges; and if the matching result set comprises a matching result of each target pattern subgraph in the s and the matching results in the matching result set pass the inspection pairwise, combining the matching results in the matching result set to generate a matching result of the target pattern and the network graph.
And P7, calculating a final matching result for all distribution schemes recorded in the preset files (newallmaps) by matching equivalent corresponding relations according to the generated matching results of all target patterns and the network graph.
It can be understood that inter-community subgraph matching in the embodiment fully utilizes the structure of the target pattern and the structure information of the community, and calculates the matching equivalent subgraph in the pattern graph, thereby reducing the number of matching calculation; the number of the nodes in the target mode in the matching process of the nodes is reduced by calculating the number of the edges of the nodes and each community in the network graph and the number of the edges of the nodes and the nodes distributed to other communities different from the nodes in the target mode through pairwise calculation of the number of the common one-hop neighbors of the nodes in the target mode in each other community except the target mode in the target mode, so that the matching calculation speed can be further increased.
Further, when the number of elements in the preset table is greater than the preset threshold, the present embodiment may process each allocation scheme s in the preset table in parallel.
Therefore, the computation speed of the subgraph matching among communities can be increased, and the computation speed of the whole subgraph matching method is further increased.
The subgraph matching method based on the community structure fully utilizes the structure of the target mode and the information of the community structure, has a good pruning effect, can effectively improve the computation speed of subgraph matching, and reduces time overhead.
Fig. 2 shows a schematic structural diagram of a subgraph matching device based on a community structure according to an embodiment of the present invention, and as shown in fig. 2, the subgraph matching device based on a community structure according to the embodiment includes: the system comprises an importing module 21, a generating module 22, a first matching module 23, a second matching module 24 and a summarizing module 25; wherein:
the import module 21 is configured to import a file including a target pattern, analyze a structure of the target pattern based on the imported file, and find out mutually matched equivalent subgraphs in the target pattern;
the generating module 22 is configured to generate a hypergraph using communities as nodes according to the network graph, and calculate the number of edges between each node in each community and each adjacent community of the community;
the first matching module 23 is configured to find sub-graphs, which are matched with the target pattern, of each community in the network graph by using a preset sub-graph matching algorithm, and obtain sub-graph matching results in the communities;
the second matching module 24 is configured to find out subgraphs, which cross communities and are matched with the target pattern, among all communities in the network graph based on the calculated number of edges between each node in each community and each adjacent community of the community and the subgraphs, which are matched with each other and are equivalent in the found target pattern, and obtain a subgraph matching result between communities;
and the summarizing module 25 is configured to summarize the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target pattern and a subgraph matching result of the network graph.
Specifically, the importing module 21 imports a file containing a target pattern, analyzes the structure of the target pattern based on the imported file, and finds out mutually matched equivalent subgraphs in the target pattern; the generation module 22 generates a hypergraph with communities as nodes according to the network graph, and calculates the number of edges between each node in each community and each adjacent community of the community; the first matching module 23 finds sub-graphs, which are matched with the target pattern, of each community in the network graph by using a preset sub-graph matching algorithm, and obtains sub-graph matching results in the communities; the second matching module 24 finds out subgraphs, which cross communities and are matched with the target pattern, among all communities in the network graph based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs, which are matched with each other and are equivalent in the target pattern, and obtains a subgraph matching result between communities; the summarizing module 25 summarizes the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the network graph.
For convenience of description and understanding, in the present embodiment, the total number of edges between one node and all nodes in the corresponding community in the network graph may be referred to as the number of edges between the node and the community.
It can be understood that, in this embodiment, each community may be regarded as a supernode according to the data of the network graph, and if there is an edge between nodes of two communities connected, the supernodes corresponding to the two communities are also connected; and if two nodes in a community are connected by edges, adding a self-loop for the super junction point corresponding to the community. The graph formed by these community nodes is called hypergraph.
It will be appreciated that in the generation module 22, which communities are connected to each community may be calculated and stored.
It is understood that the apparatus of the present embodiment is based on a community-based graph database system, and it is assumed that the following information is obtained: each node belongs to which community, each community contains which nodes, and which nodes are connected to an edge for a given node.
In a specific application, the preset subgraph matching algorithm of the embodiment may include: the basic subgraph matching algorithms such as the vf2 algorithm and the like are not limited in the embodiment, and other basic subgraph matching algorithms can be selected as the preset subgraph matching algorithm according to actual conditions.
It should be noted that, in the present embodiment, reference may be made to the detailed description of the foregoing method embodiments for matching equivalent definitions and related concept definitions.
According to the subgraph matching device based on the community structure, subgraph matching is divided into two stages, and calculation is performed through two processes of subgraph matching in communities and subgraph matching between communities respectively, so that the calculation speed of subgraph matching can be effectively improved, and time overhead is reduced.
Further, on the basis of the above embodiment, the importing module 21 may parse nodes and edges and related attributes in the target schema by importing a file including the target schema; and analyzing the structure of the target pattern based on the analyzed nodes and edges and related attributes in the target pattern, and finding out mutually matched equivalent subgraphs in the target pattern.
Therefore, the structure of the analysis target pattern can be found, and the subgraphs which are matched with each other and equivalent in the target pattern can be found.
Further, on the basis of the above embodiment, the first matching module 23 may be specifically used for
And parallelly finding out subgraphs, matched with the target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining subgraph matching results in the communities.
It can be understood that the computing speed can be accelerated by realizing the parallelization of the subgraph matching process in the community.
Further, on the basis of the above embodiment, the second matching module 24 can be specifically used for
Performing repeatable subgraph matching on the target pattern on the hypergraph by using a preset subgraph matching algorithm (namely two nodes in the target pattern can be matched with the same node in the hypergraph), and storing the obtained distribution scheme in a preset table (list) (the distribution scheme refers to a distribution mode for distributing the nodes in the target pattern to various communities);
when the number of elements in the preset table is greater than a preset threshold, processing each allocation scheme s in the preset table may specifically include:
judging whether the number of the nodes of the target mode distributed to each community in the network graph in the step s is larger than that of the nodes in the community or not;
if the number of the nodes of the target pattern distributed to each community in the network graph in s is less than or equal to the number of the nodes in the community, taking the nodes of the target pattern distributed to each community in s as the nodes to be matched according to the structure of the target pattern, calculating the number of edges between the nodes to be matched and the nodes distributed to other communities except the community in the target pattern, pruning the nodes in the community and the edges of other communities except the community in the network graph to reduce the number of candidate matching nodes corresponding to the nodes in the target pattern, and if the number of the candidate matching nodes corresponding to a certain node in the target pattern after pruning is 0, finishing the processing of s;
if the number of candidate matching nodes corresponding to all nodes in the pruned target pattern is not 0, judging whether the current s can be converted into a distribution scheme with a smaller index value through matching equivalence relation according to the found target pattern matched with equivalent subgraphs, and if not, recording all distribution schemes which can be obtained by converting the current s through matching equivalence relation in a preset file (newallmaps);
for the nodes in the target mode which is distributed to the same community in the s, calculating the number of the common one-hop neighbors of the nodes in the target mode in each other community except the community pairwise;
taking a derived subgraph of a target pattern corresponding to a node in the target pattern distributed to the same community in s as a target pattern subgraph, carrying out subgraph matching on the target pattern subgraph and a corresponding community in a network graph by using a preset subgraph matching algorithm, carrying out matching attempt on each node in the target pattern subgraph by using a corresponding candidate matching node in the matching process, and pruning by using the calculated public one-hop neighbor number to obtain a matching result of carrying out subgraph matching on each target pattern subgraph in s and the corresponding community in the network graph;
further checking the matching result of each target mode subgraph in s and the corresponding community in the network graph, and if the matching result of any two target mode subgraphs in s in two communities has an edge between two nodes of the two target mode subgraphs, the edge between the nodes matched with the two nodes is also connected; if a matching result set containing a matching result of each target pattern subgraph in s passes the check in pairs, combining the matching results in the matching result set to generate a matching result of the target pattern and the network graph;
and according to the generated matching results of all target patterns and the network diagram, calculating the final matching result for all distribution schemes recorded in the preset files (newallmaps) through the matching equivalence relation.
It can be understood that, after determining whether the number of nodes of the target pattern allocated to each community in the network graph in s is greater than the number of nodes in the community, if it is determined that the number of nodes of the target pattern allocated to a certain community in s is greater than the number of nodes in the community, the second matching module 24 ends the processing on s.
It can be understood that, after the second matching module 24 determines whether the current s can be converted into the allocation scheme with the smaller index value through the matching equivalence relation according to the mutually matching equivalent subgraphs in the found target pattern, if it is determined that the current s can be converted into the allocation scheme with the smaller index value through the matching equivalence relation, the processing on s is ended.
It can be understood that inter-community subgraph matching in the embodiment fully utilizes the structure of the target pattern and the structure information of the community, and calculates the matching equivalent subgraph in the pattern graph, thereby reducing the number of matching calculation; the number of the nodes in the target pattern which are distributed to the same community in the s is calculated pairwise to carry out pruning by calculating the number of the common one-hop neighbors of the nodes in the target pattern in each other community except the community, and the number of the edges between the nodes in the network graph and each community and the number of the edges between the nodes in the target pattern and the nodes distributed to other communities different from the target pattern are calculated to reduce the number of candidate matching nodes matched with the nodes in the target pattern in the network graph, so that the matching calculation speed can be further increased.
Further, the second matching module 24 may perform processing on each allocation scheme s in the preset table in parallel when the number of elements in the preset table is greater than a preset threshold.
Therefore, the computation speed of the subgraph matching among communities can be increased, and the computation speed of the whole subgraph matching method is further increased.
The subgraph matching device based on the community structure fully utilizes the structure of the target mode and the information of the community structure, has a better pruning effect, can effectively improve the computation speed of subgraph matching, and reduces the time overhead.
The subgraph matching device based on the community structure in this embodiment may be used to implement the technical solutions of the foregoing method embodiments, and the implementation principles and technical effects thereof are similar and will not be described herein again.
Fig. 3 is a schematic entity structure diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 3, the electronic device may include: a processor 31, a memory 32, a bus 33, and computer programs stored on the memory 32 and executable on the processor 31;
the processor 31 and the memory 32 complete mutual communication through the bus 33;
when the processor 31 executes the computer program, the method provided by the foregoing method embodiments is implemented, for example, including: importing a file containing a target mode, analyzing the structure of the target mode based on the imported file, and finding out mutually matched equivalent subgraphs in the target mode; generating a hypergraph with communities as nodes according to the data of the network graph, and calculating the number of edges between each node in each community and each adjacent community of the community; respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm in each community in the network graph to obtain subgraph matching results in the communities; finding out subgraphs, which cross communities and are matched with the target pattern, among all communities in the network graph based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs, which are matched with each other and are equivalent in the target pattern, and obtaining a subgraph matching result between communities; and summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the network graph.
An embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method provided by the foregoing method embodiments, and for example, the method includes: importing a file containing a target mode, analyzing the structure of the target mode based on the imported file, and finding out mutually matched equivalent subgraphs in the target mode; generating a hypergraph with communities as nodes according to the data of the network graph, and calculating the number of edges between each node in each community and each adjacent community of the community; respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm in each community in the network graph to obtain subgraph matching results in the communities; finding out subgraphs, which cross communities and are matched with the target pattern, among all communities in the network graph based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs, which are matched with each other and are equivalent in the target pattern, and obtaining a subgraph matching result between communities; and summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the network graph.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means/systems for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element. The terms "upper", "lower", and the like, indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience in describing the present invention and simplifying the description, but do not indicate or imply that the referred devices or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention. Unless expressly stated or limited otherwise, the terms "mounted," "connected," and "connected" are intended to be inclusive and mean, for example, that they may be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the description of the present invention, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description. Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention is not limited to any single aspect, nor is it limited to any single embodiment, nor is it limited to any combination and/or permutation of these aspects and/or embodiments. Moreover, each aspect and/or embodiment of the present invention may be utilized alone or in combination with one or more other aspects and/or embodiments thereof.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present invention, and they should be construed as being included in the following claims and description.

Claims (10)

1. A subgraph matching method based on community structure applied to a social network is characterized by comprising the following steps:
importing a file containing a target mode representing three mutually recognized people, analyzing the structure of the target mode based on the imported file, and finding out mutually matched equivalent subgraphs in the target mode;
generating a hypergraph with communities as nodes according to data of the social network, and calculating the number of edges between each node in each community and each adjacent community of the community;
in each community in the social network, respectively finding out subgraphs formed by all three people known to each other in each community by using a preset subgraph matching algorithm to obtain a subgraph matching result in the community;
performing node repeatable subgraph matching on a target pattern on the hypergraph among all communities in the social network by utilizing a preset subgraph matching algorithm, storing an obtained distribution scheme in a preset table, processing each distribution scheme s in the preset table based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraph which is equivalent to each other in the target pattern when the number of elements in the preset table is larger than a preset threshold value, judging whether the number of nodes of the target pattern distributed to each community in the social network in s is larger than the number of nodes in the community, judging whether s can be converted into a distribution scheme with a smaller index value through matching equivalence relation, and calculating the common one-hop number of the nodes in the target pattern in other communities except the community in pairs for the nodes distributed to the same community in s, finding out a subgraph formed by all three mutually known people across communities to obtain a subgraph matching result among the communities;
and summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target pattern and a subgraph matching result of the network graph, namely obtaining all three people known mutually in the social network.
2. The method of claim 1, wherein importing a file containing a target pattern, analyzing a structure of the target pattern based on the imported file, and finding mutually matching equivalent subgraphs in the target pattern comprises:
importing a file containing a target mode, and analyzing nodes, edges and related attributes in the target mode;
and analyzing the structure of the target pattern based on the analyzed nodes and edges and related attributes in the target pattern, and finding out mutually matched equivalent subgraphs in the target pattern.
3. The method of claim 1, wherein the step of finding out subgraphs, which are matched with the target pattern, of each community in the social network by using a preset subgraph matching algorithm to obtain subgraph matching results in the communities comprises the following steps:
and in parallel, in each community in the social network, respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm, and obtaining a subgraph matching result in the community.
4. The method according to claim 1, wherein a preset subgraph matching algorithm is used among all communities in the social network to perform node repeatable subgraph matching on the hypergraph for the target patterns, the obtained distribution schemes are stored in a preset table, when the number of elements in the preset table is larger than a preset threshold value, each distribution scheme s in the preset table is processed based on the calculated number of edges between each node in each community and each adjacent community of the community and subgraphs which are matched with each other and equivalent in the found target patterns, whether the number of nodes of the target patterns distributed to each community in the social network in s is larger than the number of nodes in the community is judged, whether s can be converted into distribution schemes with smaller index values through matching equivalence relations, and for the nodes of the target patterns distributed to the same community in s, in the two-by-two calculation target modes, the number of common one-hop neighbors of the target modes in other communities except the community is calculated to prune, a subgraph formed by all three people mutually known across communities is found, and a subgraph matching result between communities is obtained, wherein the subgraph matching result comprises the following steps:
performing node repeatable subgraph matching on the target pattern on the hypergraph by using a preset subgraph matching algorithm, and storing the obtained distribution scheme in a preset table;
when the number of elements in the preset table is greater than a preset threshold, processing each allocation scheme s in the preset table, including:
judging whether the number of the nodes of the target mode distributed to each community in the social network in the step s is larger than that of the nodes in the community or not;
if the number of the nodes of the target mode distributed to each community in the social network in s is less than or equal to the number of the nodes in the community, taking the nodes of the target mode distributed to each community in s as the nodes to be matched according to the structure of the target mode, calculating the number of edges between the nodes to be matched and the nodes distributed to other communities except the community in the target mode, pruning the nodes in the community and the edges of other communities except the community in the social network to reduce the number of candidate matching nodes corresponding to each node in the target mode, and if the number of the candidate matching nodes corresponding to a certain node in the target mode after pruning is reduced to 0, finishing the processing of s;
if the number of candidate matching nodes corresponding to all nodes in the pruned target pattern is not 0, judging whether the current s can be converted into a distribution scheme with a smaller index value through matching equivalence relation according to the found target pattern matched with equivalent subgraphs, and if not, recording all distribution schemes which can be obtained by converting the current s through matching equivalence relation in a preset file;
for the nodes in the target mode which is distributed to the same community in the s, calculating the number of the common one-hop neighbors of the nodes in the target mode in other communities except the community two by two;
taking a derived subgraph of a target pattern formed by nodes in the target pattern distributed to the same community in the step s as a target pattern subgraph, carrying out subgraph matching on the target pattern subgraph and a corresponding community in the social network by using a preset subgraph matching algorithm, carrying out matching attempt on each node in the target pattern subgraph by using the corresponding candidate matching node in the matching process, and pruning by using the calculated public one-hop neighbor number to obtain a matching result of carrying out subgraph matching on each target pattern subgraph in the step s and the corresponding community in the social network;
carrying out further inspection on matching results obtained by carrying out subgraph matching on each target pattern subgraph in s and the corresponding community in the social network, and for the matching results of any two target pattern subgraphs in s in two communities, if two nodes of the two target pattern subgraphs are connected by edges, the nodes matched with the two nodes are also connected by edges; if a matching result set containing a matching result of each target pattern subgraph in s passes the check in pairs, combining the matching results in the matching result set to generate a matching result of the target pattern and the social network;
and calculating a final matching result for all distribution schemes recorded in the preset file by matching equivalent corresponding relations according to the generated matching results of all target patterns and the social network.
5. The method of claim 4, wherein after determining whether the number of nodes in s assigned to the target pattern of each community in the social network is greater than the number of nodes in that community, the method further comprises:
and if the number of the nodes in the target mode distributed to a certain community in the social network in the s is judged and known to be larger than the number of the nodes in the community, finishing the processing of the s.
6. The method of claim 4, wherein after determining whether the current s can be converted into an allocation scheme with a smaller index value through matching equivalence relation according to the subgraphs matching equivalence in the target pattern, the method further comprises:
and judging to acquire that the current s can be converted into a distribution scheme with a smaller index value through matching the equivalence relation, and finishing the processing of the s.
7. Method according to claim 4, characterized in that each allocation scheme s in the preset table is processed in parallel when the number of elements in the preset table is greater than a preset threshold.
8. A subgraph matching device applied to a social network and based on a community structure is characterized by comprising:
the system comprises an importing module, a searching module and a searching module, wherein the importing module is used for importing a file containing a target mode representing three mutually known people, analyzing the structure of the target mode based on the imported file and finding out mutually matched equivalent subgraphs in the target mode;
the generating module is used for generating a hypergraph with communities as nodes according to the social network and calculating the number of edges between each node in each community and each adjacent community interval of the community;
the first matching module is used for respectively finding out subgraphs formed by all three people known mutually in the subgraphs, matched with the target mode, of each community by utilizing a preset subgraph matching algorithm in each community in the social network to obtain a subgraph matching result in the community;
a second matching module, configured to perform node-repeatable subgraph matching on the hypergraph on the target pattern by using a preset subgraph matching algorithm among all communities in the social network, store the obtained distribution schemes in a preset table, process, for each distribution scheme s in the preset table, based on the calculated number of edges between each node in each community and each neighboring community of the community and the found subgraph with matching equivalence among the target patterns, determine, for each distribution scheme s in the preset table, whether the number of nodes of the target pattern allocated to each community in the social network is greater than the number of nodes in the community, determine whether s can be converted into a distribution scheme with a smaller index value through matching equivalence, and for nodes in the target pattern allocated to the same community in s, in the target mode, the number of public one-hop neighbors of the target mode in other communities except the community is calculated pairwise to prune, a subgraph formed by all three people mutually known across communities is found out, and a subgraph matching result between communities is obtained;
and the summarizing module is used for summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and the subgraph matching result of the social network, namely obtaining all three people known mutually in the social network.
9. An electronic device, comprising: a processor, a memory, a bus, and a computer program stored on the memory and executable on the processor;
the processor and the memory complete mutual communication through the bus;
the processor, when executing the computer program, implements the method of any of claims 1-7.
10. A non-transitory computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the method of any one of claims 1-7.
CN201810836811.3A 2018-07-26 2018-07-26 Subgraph matching method and device based on community structure Active CN109063089B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810836811.3A CN109063089B (en) 2018-07-26 2018-07-26 Subgraph matching method and device based on community structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810836811.3A CN109063089B (en) 2018-07-26 2018-07-26 Subgraph matching method and device based on community structure

Publications (2)

Publication Number Publication Date
CN109063089A CN109063089A (en) 2018-12-21
CN109063089B true CN109063089B (en) 2021-04-23

Family

ID=64836646

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810836811.3A Active CN109063089B (en) 2018-07-26 2018-07-26 Subgraph matching method and device based on community structure

Country Status (1)

Country Link
CN (1) CN109063089B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382315B (en) * 2018-12-29 2024-04-05 阿里巴巴集团控股有限公司 Merging method of sub-graph isomorphic matching results, electronic equipment and storage medium
US20230315786A1 (en) * 2020-08-27 2023-10-05 Tsinghua University Sub-graph matching policy determination method, sub-graph matching method, sub-graph counting method and calculation device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8676565B2 (en) * 2010-03-26 2014-03-18 Virtuoz Sa Semantic clustering and conversational agents
CN102662974B (en) * 2012-03-12 2014-02-26 浙江大学 A network graph index method based on adjacent node trees
CN105956114A (en) * 2016-05-05 2016-09-21 南京邮电大学 Method for searching pattern matching subgraphs based on tag graph
CN106384050B (en) * 2016-09-13 2019-01-15 哈尔滨工程大学 A kind of dynamic stain analysis method excavated based on Maximum Frequent subgraph

Also Published As

Publication number Publication date
CN109063089A (en) 2018-12-21

Similar Documents

Publication Publication Date Title
Ni et al. Inside the atoms: ranking on a network of networks
Kim et al. Taming subgraph isomorphism for RDF query processing
Kurlin A one‐dimensional homologically persistent skeleton of an unstructured point cloud in any metric space
US20190303266A1 (en) String transformation based trace classification and analysis
Weller et al. Exact approaches for scaffolding
Deng et al. An efficient online direction-preserving compression approach for trajectory streaming data
CN104391879B (en) The method and device of hierarchical clustering
CN109063089B (en) Subgraph matching method and device based on community structure
Aparício et al. Parallel subgraph counting for multicore architectures
Rao et al. New approximation techniques for some linear ordering problems
WO2019233089A1 (en) Method and device for large-ratio scale reduction of internet testbed topology
CN111460234A (en) Graph query method and device, electronic equipment and computer readable storage medium
CN111767547A (en) Software vulnerability detection method based on complex network community
Bi et al. MM-GNN: Mix-moment graph neural network towards modeling neighborhood feature distribution
Liu et al. Simga: A simple and effective heterophilous graph neural network with efficient global aggregation
Chen et al. PBSM: an efficient top-K subgraph matching algorithm
Zhang et al. Multi-Criteria Optimal Location Query with Overlapping Voronoi Diagrams.
Lin et al. An O*(1.1939 n) time algorithm for minimum weighted dominating induced matching
CN111125541B (en) Method for acquiring sustainable multi-cloud service combination for multiple users
Ursulenko et al. A global optimization algorithm for solving the minimum multiple ratio spanning tree problem
Lerouge et al. Gem++: A tool for solving substitution-tolerant subgraph isomorphism
Sarkar et al. A Hybrid Clone Detection Technique for Estimation of Resource Requirements of a Job
Zhang et al. Social network sensitive area perturbance method based on firefly algorithm
Tripoul et al. There are trillions of little forks in the road. choose wisely!-estimating the cost and likelihood of success of constrained walks to optimize a graph pruning pipeline
Hao et al. An Association‐Oriented Partitioning Approach for Streaming Graph Query

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20181221

Assignee: Galaxy Aerospace (Xi'an) Technology Co.,Ltd.

Assignor: TSINGHUA University

Contract record no.: X2023110000007

Denomination of invention: A subgraph matching method and device based on community structure

Granted publication date: 20210423

License type: Common License

Record date: 20230112