CN108052743B

CN108052743B - Method and system for determining step approach centrality

Info

Publication number: CN108052743B
Application number: CN201711349361.7A
Authority: CN
Inventors: 金海; 钱辰; 于东晓; 谢夏; 王娜
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2017-12-15
Filing date: 2017-12-15
Publication date: 2021-01-05
Anticipated expiration: 2037-12-15
Also published as: CN108052743A

Abstract

The invention discloses a method and a system for determining step approach centrality. The method comprises: constructing a data graph of a target database; taking all nodes in the data graph as the remaining node set; calculating the distances between all node pairs in the remaining node set, and calculating the closeness centrality of every node from these distances and a weighting function; selecting the node with the maximum closeness centrality from the remaining node set, the step approach centrality of the selected node being its closeness centrality calculated in the current subgraph; deleting the selected node from the remaining node set and deleting the edges incident to it from the data graph to generate a new subgraph; and judging whether the remaining node set is empty after the deletion and, if not, repeating the above steps until it is empty, at which point every node has obtained its own step approach centrality. The step approach centrality index provided by the invention has better locality and anti-interference capability.

Description

Method and system for determining step approach centrality

Technical Field

The invention belongs to the field of graph data analysis in computer science, and particularly relates to a method and a system for determining step approach centrality for large-scale data graph analysis.

Background

At present, graph data and graph structures of many kinds are abundant in daily life, such as social networks, traffic networks, biological networks, financial networks and scientific data networks. As society develops, these data graphs grow rapidly in scale, which makes it very difficult to analyze the structure of a whole graph directly. Researchers have recently proposed many methods for detecting and evaluating communities and centrality measures in large-scale graphs. By detecting the key elements of a graph and ranking its nodes by importance, these methods make it possible to track important candidate nodes, and they give researchers in various fields powerful tools for understanding the composition, function and dynamic evolution of real systems. However, the diversity of large-scale data graph structures makes centrality measurement itself a very difficult problem. How to evaluate and detect the centrality of nodes in a graph, how to provide more accurate measurement indexes and calculation algorithms, and how to explain the function of the detected modules and central nodes therefore remain very challenging tasks.

Various classical node centrality indexes and calculation methods have been designed, and their application to large-scale graph data analysis has been demonstrated. However, most data graphs have small-world characteristics, and a centrality index obtained by global calculation is often interfered with by a small number of nodes with large centrality. How to describe the characteristics of nodes within a local scope has therefore become a problem to be solved urgently.

Disclosure of Invention

In view of the above defects or improvement requirements of the prior art, the invention provides a method and a system for determining step approach centrality, thereby solving the technical problem that the centrality index obtained by global calculation with traditional node centrality indexes and calculation methods is often interfered with by a small number of nodes with large centrality.

To achieve the above object, according to one aspect of the present invention, there is provided a step approach centrality determining method including:

(1) establishing a data graph $G(V, E)$ or $G(V, E, W)$ by taking all data units in a target database as nodes, the association relations among the data units as edges, and the strength of the association among the data units as weights, wherein the edge set $E$ is the set of all edges in the data graph, the node set $V$ is the set of all nodes in the data graph, and the weight set $W$ gives the weight of each edge in $E$;

(2) initializing the current subgraph number $l = 0$, the current remaining node set $V'_l = V$ and the current remaining edge set $E'_l = E$, and constructing from $V'_l$ and $E'_l$ a generated subgraph $G_l(V'_l, E'_l)$ or $G_l(V'_l, E'_l, W'_l)$, wherein $W'_l$ is the weight set of the edges in $E'_l$;

(3) in the generated subgraph $G_l$, calculating the distance $d_l(u, v)$ between each pair of nodes in the current remaining node set $V'_l$;

(4) according to a weighting function and the calculated distances $d_l(u, v)$, calculating in the generated subgraph $G_l$ the closeness centrality of each node in the current remaining node set $V'_l$;

(5) finding the set $V_l^*$ of nodes with the maximum closeness centrality in the current remaining node set $V'_l$;

(6) deleting the nodes contained in $V_l^*$ from the current remaining node set $V'_l$ to obtain the next-level remaining node set $V'_{l+1}$;

(7) judging whether the next-level remaining node set $V'_{l+1}$ is empty; if it is not empty, executing step (8), otherwise executing step (9);

(8) deleting the edges contained in the edge set $E_l^*$ from the current remaining edge set $E'_l$ to obtain the next-level remaining edge set $E'_{l+1}$, constructing a new generated subgraph $G_{l+1}$ from $V'_{l+1}$ and $E'_{l+1}$, calculating in $G_{l+1}$ the distance $d_{l+1}(u, v)$ between each pair of nodes in $V'_{l+1}$, setting the subgraph number $l = l + 1$, and returning to step (4), wherein $E_l^*$ denotes the set of edges incident to $V_l^*$ in the generated subgraph $G_l$;

(9) obtaining the step approach centrality of all the nodes in the data graph $G$.
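Steps (2) to (9) can be sketched in Python for the unweighted case. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the adjacency-dictionary input, the default weighting function 1/d, the rule that unreachable pairs contribute 0, and the use of exact floating-point equality to detect ties are all choices made here. For simplicity it recomputes every BFS at each level instead of updating the trees incrementally.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every node reachable in adj."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def step_approach_centrality(adj, alpha=lambda d: 1.0 / d):
    """Return {node: C_H(node)} for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours. Unreachable pairs
    contribute 0 (an assumption; the source text does not cover this case).
    """
    # Step (2): the level-0 subgraph is the whole graph.
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    result = {}
    while remaining:  # step (7): stop once no nodes are left
        # Steps (3)-(4): distances and closeness inside the current subgraph.
        closeness = {}
        for v in remaining:
            dist = bfs_distances(remaining, v)
            closeness[v] = sum(alpha(d) for u, d in dist.items() if u != v)
        # Step (5): every node tied for the maximum belongs to V_l^*.
        best = max(closeness.values())
        top = {v for v, c in closeness.items() if c == best}
        for v in top:
            result[v] = closeness[v]  # C_H(v) = C_c(v) at this level
        # Steps (6) and (8): delete V_l^* together with its incident edges.
        for v in top:
            for w in remaining.pop(v):
                if w in remaining:
                    remaining[w].discard(v)
    return result
```

On a five-node path a-b-c-d-e the middle node c is removed first with value 3.0, after which the two leftover edges give each of the other four nodes the value 1.0.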

Preferably, in step (3), if the generated subgraph $G_l$ is a weighted graph $G_l(V'_l, E'_l, W'_l)$, the distance $d_l(u, v)$ between each pair of nodes in the current remaining node set $V'_l$ is calculated with Dijkstra's algorithm; if the generated subgraph $G_l$ is an unweighted graph $G_l(V'_l, E'_l)$, a breadth-first search (BFS) spanning tree rooted at each node of $G_l$ is constructed, and these trees are used to calculate and maintain the distance $d_l(u, v)$ between each pair of nodes.
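For the weighted case, the all-pairs distances can be obtained by running Dijkstra's algorithm from every node. The sketch below is illustrative only. It assumes non-negative edge weights that are interpreted directly as edge lengths (the text describes weights as association strength, so a real system may need to convert them first), and it assumes node labels are mutually comparable so that heap ties can be broken. The names `dijkstra` and `all_pairs` are hypothetical.

```python
import heapq

def dijkstra(adj, source):
    """Shortest weighted distances from source; adj[u] = {v: weight}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def all_pairs(adj):
    """d_l(u, v) for every ordered pair of mutually reachable nodes."""
    return {u: dijkstra(adj, u) for u in adj}
```

Unreachable nodes are simply absent from the inner dictionaries, which lets the caller decide how such pairs should be weighted.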

Preferably, in step (4), the closeness centrality of each node $v$ in the current remaining node set $V'_l$, calculated in the generated subgraph $G_l$ from the weighting function and the calculated distances $d_l(u, v)$, is:

$$C_c(v) = \sum_{u \in V'_l \setminus \{v\}} \alpha(d_l(u, v))$$

wherein $\alpha(d_l(u, v))$ is the weighting function, $C_c(v)$ is the closeness centrality of node $v$, and $u$ ranges over the nodes remaining in the current remaining node set $V'_l$ after node $v$ is removed.
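As a small worked instance of this sum, assume a three-node path a-b-c with unit-length edges and the weighting function $\alpha(d) = 1/d$ (both chosen here purely for illustration):

```latex
\begin{align*}
C_c(b) &= \alpha(d(a,b)) + \alpha(d(c,b)) = \tfrac{1}{1} + \tfrac{1}{1} = 2,\\
C_c(a) &= \alpha(d(b,a)) + \alpha(d(c,a)) = \tfrac{1}{1} + \tfrac{1}{2} = \tfrac{3}{2}.
\end{align*}
```

The middle node b therefore has the largest closeness centrality in this subgraph and would be the one selected in step (5).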

Preferably, step (8) specifically comprises:

(8.1) deleting from the current remaining edge set $E'_l$ the edges contained in the edge set $E_l^*$, that is, the edges incident to $V_l^*$ in the generated subgraph $G_l$, to obtain the next-level remaining edge set $E'_{l+1}$; and then constructing a new generated subgraph $G_{l+1}$ from the next-level remaining node set $V'_{l+1}$ and the next-level remaining edge set $E'_{l+1}$;

(8.2) if the new generated subgraph $G_{l+1}$ is a weighted graph, recalculating the distance between each pair of nodes in the next-level remaining node set $V'_{l+1}$ with Dijkstra's algorithm;

(8.3) if the new generated subgraph $G_{l+1}$ is an unweighted graph, maintaining the distance between each pair of nodes in the next-level remaining node set $V'_{l+1}$ by incrementally updating the BFS spanning tree structures calculated in step (3).

Preferably, step (8.3) comprises in particular:

(8.3.1) finding the target BFS spanning tree whose root is any node $v_0$ in the next-level remaining node set $V'_{l+1}$, denoting the root node of the target BFS spanning tree as node r, adding node r to a queue to be modified, and creating a pointer a that points to the head node of the queue to be modified;

(8.3.2) inserting a child node x of the node pointed to by pointer a at the tail of the queue to be modified; if the inserted child node x belongs to $V_l^*$, adding all sibling nodes of child node x in the target BFS spanning tree structure to an anchor point queue, creating a pointer b that points to the head node s of the anchor point queue, adding all child nodes of child node x to a collapse linked list, and deleting the connection between the node pointed to by pointer a and child node x;

(8.3.3) checking whether, in the generated subgraph $G_l$, an edge connects the node s pointed to by pointer b with any node t in the collapse linked list; if so, executing step (8.3.4), and if not, executing step (8.3.6);

(8.3.4) creating a pointer to node t at node s, and then deleting node t from the collapse linked list;

(8.3.5) judging whether the collapse linked list after the deletion operation is empty, if not, executing the step (8.3.6), and if so, executing the step (8.3.9);

(8.3.6) judging whether a child node exists in the node pointed by the pointer b, if so, inserting the child node of the node pointed by the pointer b into the tail of the anchor point queue;

(8.3.7) judging whether the node pointed by the pointer b has a subsequent node, if so, executing the step (8.3.8), and if not, executing the step (8.3.9);

(8.3.8) pointing the pointer b to the node subsequent to the node pointed to currently, and returning to execute the step (8.3.3);

(8.3.9) judging whether the node pointed by the pointer a has a child node, if so, inserting the child node of the node pointed by the pointer a into the tail of the queue to be modified;

(8.3.10) judging whether the node pointed by the pointer a has a subsequent node, if so, executing the step (8.3.11), and if not, executing the step (8.3.12);

(8.3.11) pointing the pointer a to the node succeeding the node pointed to currently, and returning to execute the step (8.3.2);

(8.3.12) completing the incremental update of the target BFS spanning tree, and likewise completing the incremental update of the BFS spanning trees rooted at the nodes of the next-level remaining node set $V'_{l+1}$ other than $v_0$, thereby updating the distance between each pair of nodes in the new generated subgraph $G_{l+1}$.
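The idea behind the incremental update can be illustrated with a simplified decremental repair of a single BFS tree. This sketch is not the queue, anchor point queue and collapse linked list procedure of steps (8.3.1) to (8.3.12). It is a stand-in that relies on the same observation: only the descendants of deleted nodes in the tree can change distance, so all other distances are reused. The function names, the parent-dictionary tree representation, and the assumption that node labels are comparable (for heap tie-breaking) are choices made here.

```python
from collections import deque
import heapq

def bfs_tree(adj, root):
    """Return (dist, parent) describing a BFS tree rooted at root."""
    dist, parent = {root: 0}, {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w], parent[w] = dist[u] + 1, u
                queue.append(w)
    return dist, parent

def update_after_deletion(adj_new, dist, parent, deleted):
    """Repair one BFS tree after the nodes in `deleted` left the graph.

    adj_new is the adjacency of the new subgraph (deleted nodes removed).
    dist and parent describe the old tree and are left unmodified.
    The root itself must not be among the deleted nodes.
    """
    # A node is affected if it was deleted or hangs below an affected node.
    affected = set()
    for v in sorted(dist, key=dist.get):  # parents are visited before children
        if v in deleted or parent[v] in affected:
            affected.add(v)
    # Unaffected nodes keep their old tree path, so their distance is unchanged.
    new_dist = {v: d for v, d in dist.items() if v not in affected}
    new_parent = {v: parent[v] for v in new_dist}
    # Seed each surviving affected node from its unaffected neighbours.
    heap = []
    for v in affected - deleted:
        for w in adj_new[v]:
            if w in new_dist:
                heapq.heappush(heap, (new_dist[w] + 1, v, w))
    # Unit-weight Dijkstra restricted to the affected region.
    while heap:
        d, node, par = heapq.heappop(heap)
        if node in new_dist:
            continue
        new_dist[node], new_parent[node] = d, par
        for w in adj_new[node]:
            if w not in new_dist:
                heapq.heappush(heap, (d + 1, w, node))
    return new_dist, new_parent
```

Nodes that become unreachable from the root are absent from the returned dictionaries. Repeating the repair for every remaining root gives the updated all-pairs distances without a full recomputation.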

According to another aspect of the present invention, there is provided a step approach centrality determining system including:

the data graph building module is used for building a data graph G (V, E) or G (V, E, W) by taking all data units in a target database as nodes, taking the association relations among the data units as edges and taking the strength of the association among the data units as a weight, wherein an edge set E represents a set of all edges in the data graph, a point set V represents a set of all nodes in the data graph, and a weight set W represents the weight of each edge in the edge set E;

a generated subgraph construction module, configured to initialize the current subgraph number $l = 0$, the current remaining node set $V'_l = V$ and the current remaining edge set $E'_l = E$, and to construct from $V'_l$ and $E'_l$ a generated subgraph $G_l(V'_l, E'_l)$ or $G_l(V'_l, E'_l, W'_l)$, wherein $W'_l$ is the weight set of the edges in $E'_l$;

a first node pair distance calculation module, configured to calculate, in the generated subgraph $G_l$, the distance $d_l(u, v)$ between each pair of nodes in the current remaining node set $V'_l$;

a closeness centrality calculation module, configured to calculate, in the generated subgraph $G_l$ and according to the weighting function and the calculated distances $d_l(u, v)$, the closeness centrality of each node in the current remaining node set $V'_l$;

a searching module, configured to find the set $V_l^*$ of nodes with the maximum closeness centrality in the current remaining node set $V'_l$;

a node set determination module, configured to delete the nodes contained in $V_l^*$ from the current remaining node set $V'_l$ to obtain the next-level remaining node set $V'_{l+1}$;

a judging module, configured to judge whether the next-level remaining node set $V'_{l+1}$ is empty and, if it is empty, to obtain the step approach centrality of all the nodes in the data graph $G$;

a new generated subgraph construction module, configured to, when the next-level remaining node set $V'_{l+1}$ is not empty, delete the edges contained in the edge set $E_l^*$ from the current remaining edge set $E'_l$ to obtain the next-level remaining edge set $E'_{l+1}$, construct a new generated subgraph $G_{l+1}$ from $V'_{l+1}$ and $E'_{l+1}$, calculate in $G_{l+1}$ the distance $d_{l+1}(u, v)$ between each pair of nodes in $V'_{l+1}$, set the subgraph number $l = l + 1$, and return to the closeness centrality calculation module, wherein $E_l^*$ denotes the set of edges incident to $V_l^*$ in the generated subgraph $G_l$.

Preferably, the first node pair distance calculation module is specifically configured to: when the generated subgraph $G_l$ is a weighted graph $G_l(V'_l, E'_l, W'_l)$, calculate the distance $d_l(u, v)$ between each pair of nodes in the current remaining node set $V'_l$ with Dijkstra's algorithm; and when the generated subgraph $G_l$ is an unweighted graph $G_l(V'_l, E'_l)$, construct with breadth-first search (BFS) a BFS spanning tree rooted at each node of $G_l$, and use these trees to calculate and maintain the distance $d_l(u, v)$ between each pair of nodes.

Preferably, the closeness centrality calculation module is specifically configured to calculate, in the generated subgraph $G_l$ and according to the weighting function and the calculated distances $d_l(u, v)$, the closeness centrality of each node $v$ in the current remaining node set $V'_l$ as:

$$C_c(v) = \sum_{u \in V'_l \setminus \{v\}} \alpha(d_l(u, v))$$

Preferably, the new generated subgraph construction module comprises:

a new generated subgraph construction submodule, configured to delete from the current remaining edge set $E'_l$ the edges contained in the edge set $E_l^*$, that is, the edges incident to $V_l^*$ in the generated subgraph $G_l$, to obtain the next-level remaining edge set $E'_{l+1}$, and then to construct a new generated subgraph $G_{l+1}$ from the next-level remaining node set $V'_{l+1}$ and the next-level remaining edge set $E'_{l+1}$;

a second node pair distance calculation module, configured to, when the new generated subgraph $G_{l+1}$ is a weighted graph, recalculate the distance between each pair of nodes in the next-level remaining node set $V'_{l+1}$ with Dijkstra's algorithm;

a third node pair distance calculation module, configured to, when the new generated subgraph $G_{l+1}$ is an unweighted graph, maintain the distance between each pair of nodes in the next-level remaining node set $V'_{l+1}$ by incrementally updating the BFS spanning tree structures calculated by the first node pair distance calculation module.

Preferably, the third node-to-distance calculation module includes:

a first sub-module, configured to find the target BFS spanning tree whose root is any node $v_0$ in the next-level remaining node set $V'_{l+1}$, denote the root node of the target BFS spanning tree as node r, add node r to a queue to be modified, and create a pointer a that points to the head node of the queue to be modified;

a second sub-module, configured to insert a child node x of the node pointed to by pointer a at the tail of the queue to be modified and, if the inserted child node x belongs to $V_l^*$, to add all sibling nodes of child node x in the target BFS spanning tree structure to an anchor point queue, create a pointer b that points to the head node s of the anchor point queue, add all child nodes of child node x to a collapse linked list, and delete the connection between the node pointed to by pointer a and child node x;

a third sub-module, configured to check whether, in the generated subgraph $G_l$, an edge connects the node s pointed to by pointer b with any node t in the collapse linked list;

a fourth sub-module, configured to, when such an edge exists between node s and a node t in the collapse linked list, create at node s a pointer to node t and then delete node t from the collapse linked list;

a fifth sub-module, configured to judge whether the collapse linked list is empty after the deletion;

a sixth sub-module, configured to, when no edge connects node s with any node t in the collapse linked list, or when the collapse linked list is not empty after the deletion, judge whether the node pointed to by pointer b has child nodes and, if so, insert those child nodes at the tail of the anchor point queue;

a seventh sub-module, configured to judge whether the node pointed to by pointer b has a successor node;

an eighth sub-module, configured to, when the node pointed to by pointer b has a successor node, move pointer b to that successor node and return to the third sub-module;

a ninth sub-module, configured to, when the collapse linked list is empty after the deletion or when the node pointed to by pointer b has no successor node, judge whether the node pointed to by pointer a has child nodes and, if so, insert those child nodes at the tail of the queue to be modified;

the tenth submodule is used for judging whether the node pointed by the pointer a has a successor node or not;

the eleventh submodule is used for pointing the pointer a to a successor node of the current pointed node when the successor node exists in the node pointed by the pointer a, and returning to execute the second submodule;

a twelfth sub-module, configured to, when the node pointed to by pointer a has no successor node, complete the incremental update of the target BFS spanning tree, and likewise complete the incremental update of the BFS spanning trees rooted at the nodes of the next-level remaining node set $V'_{l+1}$ other than $v_0$, thereby updating the distance between each pair of nodes in the new generated subgraph $G_{l+1}$.

In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:

(1) the method for determining the step approach centrality is mainly used for describing and analyzing local centrality characteristics of large-scale data graphs. When the step-type approach centrality is calculated, the interference caused by the nodes with larger approach centrality is reduced by deleting the nodes with the largest approach centrality in the current sub-graph every time, and the step-type approach centrality with better locality is obtained, so that better hierarchical analysis is facilitated for the local condition of the data graph.

(2) For unweighted graphs, existing calculation results are reused through incremental updating, so that the shortest paths between nodes are updated dynamically and the high cost of complete recalculation is avoided. The step approach centrality provided by the invention can therefore be calculated more efficiently.

Drawings

Fig. 1 is a schematic flowchart of a method for determining a step approach centrality according to an embodiment of the present invention;

fig. 2 is a flowchart illustrating a method for incrementally updating a distance between node pairs according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

The invention provides a method for determining step approach centrality, which mainly comprises the following steps: first, establishing a data graph by taking the data units of a database as nodes, the associations among the data units as edges, and the strength of those associations as weights; second, initializing the remaining node set to all nodes in the current data graph and selecting a suitable weighting function; third, calculating the distances between all node pairs in the remaining node set and calculating the closeness centrality of every node from these distances; fourth, selecting the set of nodes with the maximum closeness centrality from the remaining node set, defining the step approach centrality of the selected nodes as their closeness centrality calculated in the current subgraph, deleting these nodes from the remaining node set, and deleting the edges incident to them from the data graph to generate a new subgraph; fifth, judging whether the remaining node set is empty and, if not, repeating the third to fifth steps until it is empty, at which point every node has obtained its step approach centrality. In other words, the nodes with the maximum closeness centrality are deleted level by level, and the closeness centrality that such a node has on the subgraph in which it attains the maximum is defined as its step approach centrality. Repeating this process yields the step approach centrality of all nodes, which localizes the centrality of every node, improves the resistance of each node's centrality to interference, and better represents the characteristics of nodes within a local range.

The invention provides a step approach centrality (Hierarchical Closeness Centrality) index for large-scale graph data analysis, defined as follows.

In a graph $G(V, E)$, the conventional closeness centrality $C_c(v)$ of a node $v$ is defined as:

$$C_c(v) = \sum_{u \in V \setminus \{v\}} \alpha(d(u, v))$$

where $\alpha(d(u, v))$ is the selected weighting function, and $u$ ranges over the nodes of the node set $V$ other than node $v$.

In a subgraph $G_l$, the set of nodes with the largest conventional closeness centrality $C_c(v)$ is denoted $V_l^*$, and the set of edges of $G_l$ incident to $V_l^*$ is denoted $E_l^*$.

A series of generated subgraphs $G_l$ of the graph $G$ is defined as follows:

$$G_l = (V'_l, E'_l), \qquad l = 0, 1, 2, \ldots$$

with the remaining node set $V'_l$:

$$V'_0 = V, \qquad V'_{l+1} = V'_l \setminus V_l^*$$

and the remaining edge set $E'_l$:

$$E'_0 = E, \qquad E'_{l+1} = E'_l \setminus E_l^*$$

where $A \setminus B$ denotes removing from $A$ the part contained in $B$.

When node $u$ and node $v$ both exist in the generated subgraph $G_l$, the subgraph distance $d_l(u, v)$ is defined as the distance between node $u$ and node $v$ in the subgraph $G_l$. The step approach centrality $C_H(v)$ proposed by the invention is defined, for a node $v \in V_l^*$, as:

$$C_H(v) = \sum_{u \in V'_l \setminus \{v\}} \alpha(d_l(u, v))$$

Compared with the conventional closeness centrality $C_c(v)$, the step approach centrality index provided by the invention has better locality and anti-interference capability.
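To make the level-by-level definition concrete, consider a hypothetical five-node path a-b-c-d-e with unit-length edges and weighting function $\alpha(d) = 1/d$, and assume that unreachable node pairs contribute 0 to the sum (the text does not state how disconnected pairs are treated, so this convention is an assumption of the example):

```latex
\begin{align*}
\text{Level } 0:&\quad C_c(c) = 1 + 1 + \tfrac12 + \tfrac12 = 3,\\
&\quad C_c(b) = C_c(d) = 1 + 1 + \tfrac12 + \tfrac13 = \tfrac{17}{6},\qquad
C_c(a) = C_c(e) = 1 + \tfrac12 + \tfrac13 + \tfrac14 = \tfrac{25}{12},\\
&\quad V_0^* = \{c\},\qquad C_H(c) = 3;\\
\text{Level } 1:&\quad G_1 \text{ keeps only the edges } a\text{-}b \text{ and } d\text{-}e,\qquad
C_c(a) = C_c(b) = C_c(d) = C_c(e) = 1,\\
&\quad V_1^* = \{a, b, d, e\},\qquad C_H(a) = C_H(b) = C_H(d) = C_H(e) = 1.
\end{align*}
```

In the conventional measure, b and d score higher than a and e only because they sit next to the dominant node c. Once c is removed, the four nodes are equivalent, which is the locality effect the index is meant to capture.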

Fig. 1 is a schematic flowchart of a method for determining step approach centrality according to an embodiment of the present invention. The method shown in Fig. 1 includes:

(1) establishing a data graph $G(V, E)$ or $G(V, E, W)$ by taking all data units in a target database as nodes, the association relations among the data units as edges, and the strength of the association among the data units as weights, wherein the edge set $E$ is the set of all edges in the data graph, the node set $V$ is the set of all nodes in the data graph, and the weight set $W$ gives the weight of each edge in $E$;

(2) initializing the current subgraph number $l = 0$, the current remaining node set $V'_l = V$ and the current remaining edge set $E'_l = E$, and constructing from $V'_l$ and $E'_l$ a generated subgraph $G_l(V'_l, E'_l)$ or $G_l(V'_l, E'_l, W'_l)$, wherein $W'_l$ is the weight set of the edges in $E'_l$;

(3) in the generated subgraph $G_l$, calculating the distance $d_l(u, v)$ between each pair of nodes in the current remaining node set $V'_l$;

In an alternative embodiment, in step (3), if the generated subgraph $G_l$ is a weighted graph $G_l(V'_l, E'_l, W'_l)$, the distance $d_l(u, v)$ between each pair of nodes in $V'_l$ is calculated with Dijkstra's algorithm; if the generated subgraph $G_l$ is an unweighted graph $G_l(V'_l, E'_l)$, a BFS spanning tree rooted at each node of $G_l$ is constructed with the breadth-first search (BFS) algorithm, and these trees are used to calculate and maintain the distance $d_l(u, v)$ between each pair of nodes.

(4) according to the weighting function and the calculated distances $d_l(u, v)$, calculating in the generated subgraph $G_l$ the closeness centrality of each node in the current remaining node set $V'_l$;

In an alternative embodiment, the weighting function may be selected as follows: for example, $\alpha(d(u, v)) = 1/d(u, v)$ for harmonic centrality, $\alpha(d(u, v)) = 2^{-d(u, v)}$ for exponentially decaying centrality, and so on. A weighting function is chosen according to how sensitive the index should be to changes in distance, so as to obtain the desired closeness centrality.
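The two example weighting functions can be written as small Python functions and plugged into the closeness sum. The function names and the dictionary-of-distances input are hypothetical choices for this sketch.

```python
def harmonic(d):
    """alpha(d) = 1 / d, the weighting used by harmonic centrality."""
    return 1.0 / d

def exponential_decay(d):
    """alpha(d) = 2 ** (-d), which falls off much faster with distance."""
    return 2.0 ** (-d)

def closeness(distances, alpha):
    """Sum alpha(d(u, v)) over the other nodes u, given {u: d(u, v)}.

    The node v itself (distance 0) must not appear in `distances`.
    """
    return sum(alpha(d) for d in distances.values())
```

With the same distances, the exponential variant puts far less weight on distant nodes, so it reacts more strongly to changes in a node's immediate neighbourhood.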

In an alternative embodiment, in step (4), the closeness centrality of each node $v$ in the current remaining node set $V'_l$, calculated in the generated subgraph $G_l$ from the weighting function and the calculated distances $d_l(u, v)$, is:

$$C_c(v) = \sum_{u \in V'_l \setminus \{v\}} \alpha(d_l(u, v))$$

(5) finding the set $V_l^*$ of nodes with the maximum closeness centrality in the current remaining node set $V'_l$;

wherein the step approach centrality of each node $u$ contained in $V_l^*$ is $C_H(u) = C_c(u)$.

(6) deleting the nodes contained in $V_l^*$ from the current remaining node set $V'_l$ to obtain the next-level remaining node set $V'_{l+1}$;

(7) judging whether the next-level remaining node set $V'_{l+1}$ is empty; if it is not empty, executing step (8), otherwise executing step (9);

(8) deleting the edges contained in the edge set $E_l^*$ from the current remaining edge set $E'_l$ to obtain the next-level remaining edge set $E'_{l+1}$, constructing a new generated subgraph $G_{l+1}$ from $V'_{l+1}$ and $E'_{l+1}$, calculating in $G_{l+1}$ the distance $d_{l+1}(u, v)$ between each pair of nodes in $V'_{l+1}$, setting the subgraph number $l = l + 1$, and returning to step (4), wherein $E_l^*$ denotes the set of edges incident to $V_l^*$ in the generated subgraph $G_l$;

as shown in fig. 2, in an alternative embodiment, step (8) specifically includes:

(8.1) deleting from the current remaining edge set $E'_l$ the edges contained in the edge set $E_l^*$, that is, the edges incident to $V_l^*$ in the generated subgraph $G_l$, to obtain the next-level remaining edge set $E'_{l+1}$; and then constructing a new generated subgraph $G_{l+1}$ from the next-level remaining node set $V'_{l+1}$ and the next-level remaining edge set $E'_{l+1}$;

(8.2) if the new generated subgraph $G_{l+1}$ is a weighted graph, recalculating the distance between each pair of nodes in the next-level remaining node set $V'_{l+1}$ with Dijkstra's algorithm;

(8.3) if the new generated subgraph $G_{l+1}$ is an unweighted graph, maintaining the distance between each pair of nodes in the next-level remaining node set $V'_{l+1}$ by incrementally updating the BFS spanning tree structures calculated in step (3).

In an alternative embodiment, step (8.3) specifically includes:

(8.3.1) finding the target BFS spanning tree whose root is any node $v_0$ in the next-level remaining node set $V'_{l+1}$, denoting the root node of the target BFS spanning tree as node r, adding node r to a queue to be modified, and creating a pointer a that points to the head node of the queue to be modified;

(8.3.2) inserting a child node x of the node pointed to by pointer a at the tail of the queue to be modified; if the inserted child node x belongs to $V_l^*$, adding all sibling nodes of child node x in the target BFS spanning tree structure to an anchor point queue, creating a pointer b that points to the head node s of the anchor point queue, adding all child nodes of child node x to a collapse linked list, and deleting the connection between the node pointed to by pointer a and child node x;

(8.3.3) checking whether, in the generated subgraph $G_l$, an edge connects the node s pointed to by pointer b with any node t in the collapse linked list; if so, executing step (8.3.4), and if not, executing step (8.3.6);

(8.3.4) creating at node s a pointer to node t, and then deleting node t from the collapse linked list;

(8.3.5) judging whether the collapse linked list is empty after the deletion; if not, executing step (8.3.6), and if so, executing step (8.3.9);

(8.3.6) judging whether the node pointed to by pointer b has child nodes and, if so, inserting those child nodes at the tail of the anchor point queue;

(8.3.7) judging whether the node pointed to by pointer b has a successor node; if so, executing step (8.3.8), and if not, executing step (8.3.9);

(8.3.8) moving pointer b to the successor of the node it currently points to, and returning to step (8.3.3);

(8.3.9) judging whether the node pointed to by pointer a has child nodes and, if so, inserting those child nodes at the tail of the queue to be modified;

(8.3.10) judging whether the node pointed to by pointer a has a successor node; if so, executing step (8.3.11), and if not, executing step (8.3.12);

(8.3.11) moving pointer a to the successor of the node it currently points to, and returning to step (8.3.2);

(8.3.12) completing the incremental update of the target BFS spanning tree, and likewise completing the incremental update of the BFS spanning trees rooted at the nodes of the next-level remaining node set $V'_{l+1}$ other than $v_0$, thereby updating the distance between each pair of nodes in the new generated subgraph $G_{l+1}$.

The invention also provides a system for determining the step approach centrality, which comprises:

the data graph building module is used for building a data graph G (V, E) or G (V, E, W) by taking all data units in the target database as nodes, taking the association relationship among the data units as edges and taking the strength of the association among the data units as a weight, wherein an edge set E represents a set of all edges in the data graph, a point set V represents a set of all nodes in the data graph, and a weight set W represents the weight of each edge in the edge set E;

a first node pair distance calculation module for generating subgraph G_lThe current residual node set V is obtained by middle calculation_l' distance d between each node pair in_l(u,v)；

A proximity center calculation module for calculating the distance d between each pair of nodes according to the weighting function_l(u, v) in generating subgraph G_lThe current residual node set V is obtained by middle calculation_l' the approximate centrality of each node in the set;

a searching module for finding out the current residual node set V_l' node set V having the greatest degree of nearness in center_l ^*；

A node set determination module for determining a node set V from a current node set V remaining_l' will be included in the set of nodes V_l ^*Deleting the nodes in the node group to obtain a next-level residual node set V'_l+1；

a judging module for judging whether the next-level remaining node set V'_{l+1} is empty, wherein if the next-level remaining node set V'_{l+1} is empty, the step approach centrality of every node in the data graph G has been obtained;

a newly generated subgraph construction module for, when the next-level remaining node set V'_{l+1} is not empty, deleting from the current remaining edge set E'_l the edges adjacent to V_l^* in the generated subgraph G_l to obtain the next-level remaining edge set E'_{l+1}, then constructing a newly generated subgraph G_{l+1} from the next-level remaining node set V'_{l+1} and the next-level remaining edge set E'_{l+1}, calculating in the newly generated subgraph G_{l+1} the distance d_{l+1}(u, v) between each pair of nodes in V'_{l+1}, setting the subgraph number l = l + 1, and returning to the proximity centrality calculation module.
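The cooperation of the modules above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the claimed implementation: the graph is unweighted and stored as a dict mapping each node to a set of neighbours, distances are recomputed from scratch at every level rather than updated incrementally, and the weighting function 1/d is a hypothetical stand-in because the formula of the original is not reproduced in this text. The names bfs_distances and step_centrality are likewise illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def step_centrality(adj, weight=lambda d: 1.0 / d):
    """Peel off the most central nodes level by level.

    weight is an assumed weighting function applied to each distance.
    Returns a dict mapping each node to the centrality it had in the
    subgraph from which it was removed.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}  # working copy
    result = {}
    while remaining:
        # Distances and centrality in the current generated subgraph.
        scores = {}
        for v in remaining:
            dist = bfs_distances(remaining, v)
            scores[v] = sum(weight(d) for u, d in dist.items() if u != v)
        # All nodes tied for the maximum form the set removed at this level.
        best = max(scores.values())
        top = {v for v, s in scores.items() if s == best}
        for v in top:
            result[v] = scores[v]
        # Delete the selected nodes together with their adjacent edges.
        for v in top:
            for w in remaining.pop(v):
                if w in remaining:
                    remaining[w].discard(v)
    return result
```

For the path a, b, c the sketch removes b first with a score of 2.0, after which a and c are isolated and each receive 0. Ties are detected by exact comparison, which is adequate for a sketch but may need a tolerance with other weighting functions.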

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A method for determining a step approach centrality, comprising:

(2) initializing the current subgraph number l = 0, the current remaining node set V'_l = V, and the current remaining edge set E'_l = E, and constructing from V'_l and E'_l the generated subgraph G_l(V'_l, E'_l) or G_l(V'_l, E'_l, W'_l), wherein W'_l represents the weight set of the edges in E'_l;

(3) calculating, in the generated subgraph G_l, the distance d_l(u, v) between each pair of nodes in the current remaining node set V'_l, wherein v represents each node in the current remaining node set V'_l, and u represents the nodes remaining in V'_l after node v is removed;

(4) calculating, in the generated subgraph G_l, the proximity centrality of each node in the current remaining node set V'_l according to the weighting function and the calculated distance d_l(u, v) between each pair of nodes;

(5) finding, in the current remaining node set V'_l, the node set V_l^* having the greatest proximity centrality;

(6) deleting the nodes contained in the node set V_l^* from the current remaining node set V'_l to obtain the next-level remaining node set V'_{l+1};

(8) deleting from the current remaining edge set E'_l the edges adjacent to V_l^* in the generated subgraph G_l to obtain the next-level remaining edge set E'_{l+1}, then constructing a newly generated subgraph G_{l+1} from the next-level remaining node set V'_{l+1} and the next-level remaining edge set E'_{l+1}, calculating in the newly generated subgraph G_{l+1} the distance d_{l+1}(u, v) between each pair of nodes in V'_{l+1}, setting the subgraph number l = l + 1, and returning to step (4);

2. The method of claim 1, wherein in step (3), if the generated subgraph G_l is a weighted graph G_l(V'_l, E'_l, W'_l), the Dijkstra algorithm is used to calculate the distance d_l(u, v) between each pair of nodes in the current remaining node set V'_l; and if the generated subgraph G_l is an unweighted graph G_l(V'_l, E'_l), the breadth-first search (BFS) algorithm is used to construct a BFS spanning tree rooted at each node of the generated subgraph G_l, the BFS spanning trees being used to calculate and maintain the distance d_l(u, v) between each pair of nodes.

3. The method of claim 1, wherein in step (4), the proximity centrality of each node v in the current remaining node set V'_l, calculated in the generated subgraph G_l according to the weighting function and the calculated distance d_l(u, v) between each pair of nodes, is:

4. The method according to any one of claims 1 to 3, wherein step (8) specifically comprises:

(8.1) deleting from the current remaining edge set E'_l the edges included in the edge set

5. The method according to claim 4, wherein step (8.3) specifically comprises:

(8.3.2) inserting the child node x of the node pointed to by the pointer a at the tail of the queue to be modified, and if the inserted child node x belongs to V_l^*, adding all sibling nodes of the child node x in the target BFS spanning tree to an anchor point queue, creating a pointer b pointing to the head node s of the anchor point queue, adding all child nodes of the child node x to a collapse linked list, and deleting the connection between the node pointed to by the pointer a and the child node x;

6. A step proximity centrality determination system, comprising:

a first node pair distance calculation module for calculating, in the generated subgraph G_l, the distance d_l(u, v) between each pair of nodes in the current remaining node set V'_l, wherein v represents each node in the current remaining node set V'_l, and u represents the nodes remaining in V'_l after node v is removed;

a searching module for finding, in the current remaining node set V'_l, the node set V_l^* having the greatest proximity centrality;

deleting the edges adjacent to V_l^* in the generated subgraph G_l to obtain the next-level remaining edge set E'_{l+1}, then constructing a newly generated subgraph G_{l+1} from the next-level remaining node set V'_{l+1} and the next-level remaining edge set E'_{l+1}, calculating in the newly generated subgraph G_{l+1} the distance d_{l+1}(u, v) between each pair of nodes in V'_{l+1}, setting the subgraph number l = l + 1, and returning to the proximity centrality calculation module.

7. The system according to claim 6, wherein the node pair distance calculation module is specifically configured to: when the generated subgraph G_l is a weighted graph G_l(V'_l, E'_l, W'_l), calculate the distance d_l(u, v) between each pair of nodes in the current remaining node set V'_l using the Dijkstra algorithm; and when the generated subgraph G_l is an unweighted graph G_l(V'_l, E'_l), construct, using the breadth-first search (BFS) algorithm, a BFS spanning tree rooted at each node of the generated subgraph G_l, the BFS spanning trees being used to calculate and maintain the distance d_l(u, v) between each pair of nodes.

8. The system according to claim 6, wherein the proximity centrality calculation module is specifically configured such that the proximity centrality of each node v in the current remaining node set V'_l, calculated in the generated subgraph G_l according to the weighting function and the calculated distance d_l(u, v) between each pair of nodes, is:

9. The system of any one of claims 6 to 8, wherein the newly generated subgraph construction module comprises:

10. The system of claim 9, wherein the third node-to-distance calculation module comprises:

a second submodule for inserting the child node x of the node pointed to by the pointer a at the tail of the queue to be modified, and if the inserted child node x belongs to V_l^*, adding all sibling nodes of the child node x in the target BFS spanning tree to an anchor point queue, creating a pointer b pointing to the head node s of the anchor point queue, adding all child nodes of the child node x to a collapse linked list, and deleting the connection between the node pointed to by the pointer a and the child node x;

a twelfth submodule for, when the node pointed to by the pointer a has no successor node, completing the incremental update of the target BFS spanning tree, and likewise completing the incremental update of the BFS spanning trees rooted at each of the other nodes in the next-level remaining node set V'_{l+1} besides v₀, thereby completing the update of the distance between each node pair in the newly generated subgraph G_{l+1}.
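By way of illustration only, the choice between the Dijkstra algorithm and breadth-first search recited in claims 2 and 7 can be sketched as follows. The sketch makes several assumptions: the graph is undirected and stored as a dict mapping each node to a set of neighbours, edge weights are non-negative and stored in a dict keyed by frozenset({u, w}), and node labels are mutually comparable. The function names are hypothetical, and the sketch returns plain distance tables rather than maintaining BFS spanning trees.

```python
import heapq
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def dijkstra_distances(adj, weights, source):
    """Weighted distances from source; weights must be non-negative."""
    dist = {source: 0}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for w in adj[u]:
            nd = d + weights[frozenset((u, w))]
            if w not in dist or nd < dist[w]:
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


def all_pairs_distances(adj, weights=None):
    """Use BFS when no weights are given, Dijkstra otherwise."""
    if weights is None:
        return {v: bfs_distances(adj, v) for v in adj}
    return {v: dijkstra_distances(adj, weights, v) for v in adj}
```

Node pairs with no connecting path are simply absent from the inner dictionaries, so a caller can treat a missing entry as an infinite distance.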