CN115878861B - Selection method for integrated key node group aiming at graph data completion - Google Patents
- Publication number
- CN115878861B (application CN202310074880.6A)
- Authority
- CN
- China
- Prior art keywords
- node
- nodes
- key
- network
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a method for selecting an integrated key node group for graph data completion, comprising four modules: a data preparation module, which extracts the topology information of real network data and obtains its adjacency matrix; a multi-angle key node identification module, which identifies groups of key nodes using several methods; a graph convolutional network test module, which tests the groups obtained in the previous stage with a graph convolutional network to obtain the graph data completion effect of each; and a judging and output module, which compares the completion effects and outputs the best group of key nodes. The beneficial effects of the invention are as follows: for the graph data completion problem, a group of key nodes can be identified, and because neighborhood-based, path-based and iteration-based methods are integrated, the method performs well on networks of different categories.
Description
Technical Field
The invention relates to a method for selecting an integrated key node group for graph data completion, and belongs to the technical field of network key node identification and deep learning.
Background
Graph data completion refers to recovering all of the information of a graph from partial information about it, combined with historical data and the graph's topological structure. With this technique, the properties of a network can be analyzed and studied without knowing all of its information, which can significantly reduce the cost of analyzing large complex networks. Today the technique plays an important role in many fields, including power systems, transportation, biology, chemistry, economics, and sociology. The choice of the nodes used for completion directly affects the completion effect.
Take a transportation network as an example. A road network is usually treated as a network and analyzed on the basis of the real-time traffic flow at intersections and the topology of the road network. However, a city often has thousands to tens of thousands of important intersections, and the state of the same intersection can differ completely from one time to another; measuring the traffic flow of every intersection in real time would incur a large and continuing cost. One effective solution is to monitor only a subset of key nodes and to complete the data of the remaining nodes by combining complex network and deep learning methods. How to select the key nodes for graph data completion then becomes the problem to be solved.
Existing methods for identifying key nodes in complex networks have the following disadvantages with respect to this problem:
1. existing methods mainly take the perspective of network propagation, network control and the like, for example, finding which nodes of a power grid cause the most serious loss when damaged, deciding which users of a social network to target so as to maximize an advertiser's revenue, or finding which nodes to influence so that the network reaches a given state fastest; a key node selection method aimed at the graph data completion problem is lacking;
2. most existing methods target static networks and are based on the structural characteristics of the network, without attending to the dynamic properties and evolution rules of the network;
3. most existing methods focus on ranking the importance of nodes, that is, on the selection of individual nodes, and a method for selecting several nodes simultaneously is lacking;
4. existing key node identification methods lack effective integration. Current methods start from different problems and perspectives, and their effects on different types of graphs often differ. In addition, because of the structural complexity and the diversity of types of complex networks, it is almost impossible to find a single evaluation index applicable to all kinds of graphs.
The invention therefore provides a method for selecting an integrated key node group for graph data completion. It integrates existing key node identification methods for the graph data completion problem and provides a method for selecting a group of key nodes that captures the dynamic characteristics of the network and performs well on different types of graphs.
Disclosure of Invention
The invention provides a method for selecting an integrated key node group for graph data completion. It integrates several key node identification methods that take different perspectives and can therefore perform well on different types of graphs. It also takes into account the dynamic characteristics of the network and the influence of already-selected nodes on subsequent selections, and so performs better than methods that consider only the structural characteristics of a static network or that simply pick the top-ranked nodes from a node ranking.
To achieve the above object, the technical scheme of the invention is a method for selecting an integrated key node group for graph data completion, comprising the following steps:
step 1: acquiring topology information and the data on each node in the network,
step 2: inputting the topology information obtained in step 1 into a data preparation module to obtain its adjacency matrix,
step 3: inputting the adjacency matrix obtained in step 2 into a multi-angle key node identification module,
step 4: inputting the obtained groups of key nodes into a graph convolutional network test module, which outputs the test result of each group of key nodes,
step 5: after the completion effects of the different groups of key nodes have been calculated, inputting the obtained groups and their mean square errors into a judging and output module, which selects the group of key nodes with the best completion effect as the final result and outputs it.
As an improvement of the invention, step 1 is specifically as follows: to obtain the topology information, the actual traffic network is first abstracted into a network in which each traffic intersection is regarded as a node and each road section connecting two intersections is regarded as an edge between the corresponding nodes; the attribute value of each node represents the traffic flow at the corresponding intersection over time.
As an improvement of the invention, step 2 is specifically as follows: the topology information obtained in step 1 is input into the data preparation module to obtain its adjacency matrix $A$, whose element $a_{ij}$ is defined as follows:
$$a_{ij} = \begin{cases} 1, & \text{if there is an edge between node } i \text{ and node } j \\ 0, & \text{otherwise} \end{cases}$$
where $a_{ij}$ denotes an element of the adjacency matrix $A$, and $i$ and $j$ denote nodes in the network.
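As a minimal sketch of step 2, the adjacency matrix can be built from a list of road sections. The sketch assumes an undirected, unweighted network (the text does not state this explicitly), and the function name `build_adjacency` and the four-intersection example are hypothetical.

```python
import numpy as np

def build_adjacency(num_nodes, edges):
    """Return the symmetric 0/1 adjacency matrix of an undirected network."""
    A = np.zeros((num_nodes, num_nodes), dtype=int)
    for i, j in edges:
        A[i, j] = 1  # road section between intersection i and intersection j
        A[j, i] = 1  # undirected: mirror the entry (assumption)
    return A

# Hypothetical road network with 4 intersections and 4 road sections.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
A = build_adjacency(4, edges)
print(A)
```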
As an improvement of the invention, step 3 is specifically as follows: the adjacency matrix obtained in step 2 is input into the multi-angle key node identification module, which is divided into three sub-modules: a neighborhood judging module, a path judging module and an iteration judging module. Each sub-module is based on 1-2 different evaluation indexes. The judging flow for each evaluation index is shown in FIG. 2 and is as follows:
step 3.1, inputting an adjacency matrix of a traffic network and the number of key nodes to be identified;
step 3.2, selecting the most important node in the network based on a corresponding key node identification method (such as degree centrality) according to the input adjacency matrix;
step 3.3. Deleting selected nodes and connected edges thereof from the original network;
step 3.4, inputting the new adjacency matrix, and reducing the number of key nodes still to be selected by 1;
step 3.5, repeating steps 3.2 to 3.4 until the number of key nodes still to be selected is 0;
step 3.6, outputting the numbers of all selected key nodes.
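Steps 3.1 to 3.6 can be sketched as a greedy loop. This is an illustration under assumptions, not the patented implementation: degree centrality stands in for the identification method, a deleted node is represented by zeroing its row and column so that the original node numbers are preserved, and the names `degree_score` and `select_key_nodes` are hypothetical.

```python
import numpy as np

def degree_score(A):
    """Degree of every node: the row sums of the adjacency matrix."""
    return A.sum(axis=1)

def select_key_nodes(A, m, score_fn=degree_score):
    """Pick m key nodes one at a time, re-scoring after each deletion."""
    A = np.array(A, dtype=float)          # work on a copy (step 3.1)
    n = A.shape[0]
    chosen = np.zeros(n, dtype=bool)
    selected = []
    remaining = m
    while remaining > 0:                  # step 3.5
        scores = np.asarray(score_fn(A), dtype=float).copy()
        scores[chosen] = -np.inf          # never pick the same node twice
        v = int(np.argmax(scores))        # step 3.2: most important node
        selected.append(v)
        chosen[v] = True
        A[v, :] = 0                       # step 3.3: delete node v and its edges
        A[:, v] = 0
        remaining -= 1                    # step 3.4
    return selected                       # step 3.6

# Hypothetical network: a hub (node 0) with a short chain 3-4-5 attached.
edges = [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1
print(select_key_nodes(A, 2))
```

In this example a static ranking by degree would take nodes 0 and 3, while the loop takes node 4 second, because node 3 loses an edge once node 0 has been deleted.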
As an improvement of the invention, the graph convolutional network test module in step 4 uses a graph convolutional network to run a graph data completion test on each group of key nodes identified in the previous stage and outputs the test result. The graph data completion method specifically comprises the following steps:
step 4.1: inputting the real graph data and the numbers of the key nodes identified in the previous stage;
step 4.2: according to the input, deleting the data of all nodes other than the key nodes;
step 4.3: first passing the data through a linear layer to pre-complete the graph data;
the formula for graph data pre-completion is as follows:
$$\tilde{X} = W X + b$$
where $X$ is the graph data after the non-key nodes have been deleted, an $m$-dimensional column vector; $n$ is the number of nodes; $m$ is the number of key nodes; $W$ is an $n \times m$ weight matrix; $b$ is the bias term, an $n$-dimensional column vector; and $\tilde{X}$ is the pre-completed data;
step 4.4: inputting the pre-completed result into a graph convolution layer and performing the graph convolution operation;
the formula of the graph convolution operation is as follows:
where $H^{(l)}$ denotes the graph data of layer $l$, an $n$-dimensional column vector; $W^{(l)}$ is the weight matrix; $A$ is the adjacency matrix; and $b^{(l)}$ is the bias term;
step 4.5: inputting the result of the graph convolution layer into a fully connected layer and outputting the graph data completion result;
the formula of the fully connected layer is as follows:
$$\hat{Y} = W H^{(l)} + b$$
where $H^{(l)}$ denotes the graph data of layer $l$, an $n$-dimensional column vector; $W$ is the weight matrix; $b$ is the bias term; and $\hat{Y}$ is the completion result;
step 4.6: comparing the completed result with the original real data to obtain the graph data completion effect:
$$\mathrm{MSE} = \frac{1}{n-m} \sum_{i=1}^{n-m} \left( \hat{y}_i - y_i \right)^2$$
where $\hat{y}_i$ is the completed value of the graph data, $y_i$ is the true value of the graph data, $n$ is the number of nodes, and $m$ is the number of key nodes; the smaller the mean square error, the better the graph data completion effect.
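The flow of steps 4.1 to 4.6 can be illustrated with an untrained forward pass in NumPy. Everything specific here is an assumption made for illustration: the weights are random rather than learned, the graph convolution is taken to be neighbor aggregation followed by a weight matrix and a ReLU, the error is averaged over the non-key nodes, and the names `forward` and `completion_mse` are hypothetical.

```python
import numpy as np

def forward(x_key, A, params):
    """Linear pre-completion, one graph convolution, one fully connected layer."""
    W0, b0, W1, b1, W2, b2 = params
    h0 = W0 @ x_key + b0                      # step 4.3: m known values -> n values
    h1 = np.maximum(0.0, W1 @ (A @ h0) + b1)  # step 4.4: assumed form, ReLU assumed
    return W2 @ h1 + b2                       # step 4.5: completion result

def completion_mse(y_hat, y_true, key_nodes):
    """Mean square error over the nodes whose data had been deleted."""
    mask = np.ones(len(y_true), dtype=bool)
    mask[key_nodes] = False
    return float(np.mean((y_hat[mask] - y_true[mask]) ** 2))

rng = np.random.default_rng(0)
n, key_nodes = 6, [0, 4]
m = len(key_nodes)

# Hypothetical 6-node network and made-up node data.
A = np.zeros((n, n))
for i, j in [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5)]:
    A[i, j] = A[j, i] = 1
y_true = rng.random(n)                        # step 4.1: real graph data
x_key = y_true[key_nodes]                     # step 4.2: keep only key-node data

params = (rng.normal(size=(n, m)), np.zeros(n),
          rng.normal(size=(n, n)), np.zeros(n),
          rng.normal(size=(n, n)), np.zeros(n))
y_hat = forward(x_key, A, params)
err = completion_mse(y_hat, y_true, key_nodes)  # step 4.6
print(err)
```

In practice the weights would be trained on historical data before the error of a candidate group is measured; the sketch only shows how the data moves through the layers.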
As an improvement of the invention, step 5 is specifically as follows: after the completion effects of the five groups of key nodes have been calculated, the five groups and their mean square errors are input into the judging and output module, which compares the completion effect of each group, selects the group with the best completion effect as the final result, and outputs it.
Let the 5 groups of key nodes obtained by the multi-angle key node identification module be $S_1, S_2, \dots, S_5$, and let the 5 results obtained for them by the graph convolutional network test module be $\mathrm{MSE}_1, \mathrm{MSE}_2, \dots, \mathrm{MSE}_5$. The final group of key nodes $S^{*}$ is then:
$$S^{*} = S_{k^{*}}, \quad k^{*} = \arg\min_{k \in \{1, \dots, 5\}} \mathrm{MSE}_k$$
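The judging and output step amounts to taking the group with the smallest mean square error. A minimal sketch follows; the group contents and error values are made up purely for illustration, and `pick_best_group` is a hypothetical name.

```python
def pick_best_group(groups, mse_values):
    """Return (name, nodes, mse) of the group with the smallest mean square error."""
    best = min(mse_values, key=mse_values.get)
    return best, groups[best], mse_values[best]

# Made-up candidate groups and errors, one per centrality measure.
groups = {
    "degree": [0, 4],
    "k_shell": [0, 1],
    "closeness": [0, 3],
    "betweenness": [3, 4],
    "eigenvector": [0, 2],
}
mse_values = {
    "degree": 0.021,
    "k_shell": 0.034,
    "closeness": 0.018,
    "betweenness": 0.025,
    "eigenvector": 0.029,
}
best_name, best_nodes, best_mse = pick_best_group(groups, mse_values)
print(best_name, best_nodes, best_mse)
```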
As an improvement of the invention, the key node identification methods used in step 3.2 comprise 5 different centrality measures:
(1) Degree centrality: degree centrality describes the number of neighbors of a node and is calculated as follows:
$$DC_i = \sum_{j=1}^{n} a_{ij}$$
where $DC_i$ denotes the degree centrality of node $i$, $n$ is the total number of nodes, and $a_{ij}$ is an element of the adjacency matrix; the greater the degree centrality, the more important the node is considered;
(2) k-shell decomposition: k-shell decomposition can be used to describe the position of a node in the network and is calculated as follows:
first, all nodes of degree 1 and their edges are deleted from the network; then all nodes of degree 1 and their edges are deleted again from the resulting network, and this is repeated until no node of degree 1 remains in the network;
the set of deleted nodes is called the 1-shell, and the remaining nodes are called the 1-core;
the procedure continues in the same way for increasing degree until all nodes in the network have been deleted, yielding the k-shells and k-cores; each node belongs to exactly one k-shell, and the larger k is, the more important the node is considered;
(3) Closeness centrality: closeness centrality is used to describe the average distance of a node from all other nodes in the network and is calculated as follows:
$$CC_i = \frac{n-1}{\sum_{j \neq i} d_{ij}}$$
where $CC_i$ is the closeness centrality of node $i$, $n$ is the total number of nodes in the network, and $d_{ij}$ denotes the distance between node $i$ and node $j$, defined as the number of edges on the shortest path from node $i$ to node $j$; if node $i$ and node $j$ are not connected, $d_{ij} = \infty$ is taken, and in this case:
$$CC_i = \frac{1}{n-1} \sum_{j \neq i} \frac{1}{d_{ij}}$$
the greater the closeness centrality, the more important the node is considered;
(4) Betweenness centrality: betweenness centrality describes how many shortest paths between other pairs of nodes pass through a node and is calculated as follows:
$$BC_i = \sum_{s \neq i \neq t} \frac{g_{st}^{i}}{g_{st}}$$
where $BC_i$ denotes the betweenness centrality of node $i$, $g_{st}$ denotes the number of shortest paths between node $s$ and node $t$, and $g_{st}^{i}$ denotes the number of shortest paths between node $s$ and node $t$ that pass through node $i$; the greater the betweenness centrality of a node, the more important that node is considered;
(5) Eigenvector centrality: eigenvector centrality assumes that the influence of a node is determined not only by the number of its neighbors but also by the influence of each neighbor, so that the centrality of a node is proportional to the sum of the centralities of the nodes connected to it:
$$x_i = c \sum_{j=1}^{n} a_{ij} x_j$$
where $x = [x_1, x_2, \dots, x_n]^{T}$ denotes the eigenvector centralities of the nodes, $A$ is the adjacency matrix, and $c$ is a constant; if the above formula holds, $x$ is the eigenvector of the matrix $A$ corresponding to the eigenvalue $c^{-1}$. Eigenvector centrality is computed by giving an initial value $x(0)$ and then applying the following iteration:
$$x(t) = c A x(t-1), \quad t = 1, 2, \dots$$
the greater the eigenvector centrality of a node, the more important that node is considered.
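Two of the measures above, k-shell decomposition and eigenvector centrality, are procedures rather than closed formulas, so a short sketch may help. It rests on assumptions: isolated nodes are placed in the 1-shell, the constant $c$ is replaced by renormalizing the vector at each iteration, and the names `k_shell` and `eigenvector_centrality` are hypothetical.

```python
import numpy as np

def k_shell(A):
    """Return the k-shell index of every node by repeated peeling."""
    A = np.array(A, dtype=int)
    n = A.shape[0]
    alive = np.ones(n, dtype=bool)
    shell = np.zeros(n, dtype=int)
    k = 1
    while alive.any():
        while True:
            deg = A.sum(axis=1)
            peel = alive & (deg <= k)   # nodes whose current degree is at most k
            if not peel.any():
                break
            shell[peel] = k
            alive[peel] = False
            A[peel, :] = 0              # delete the peeled nodes and their edges
            A[:, peel] = 0
        k += 1
    return shell

def eigenvector_centrality(A, iters=200, tol=1e-10):
    """Power iteration x(t) = A x(t-1), renormalized at each step."""
    A = np.asarray(A, dtype=float)
    x = np.ones(A.shape[0]) / A.shape[0]
    for _ in range(iters):
        x_new = A @ x
        norm = np.linalg.norm(x_new)
        if norm == 0:
            return x_new                # graph without edges
        x_new = x_new / norm
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Hypothetical network: triangle 0-1-2 with a tail 0-3-4.
A = np.zeros((5, 5), dtype=int)
for i, j in [(0, 1), (1, 2), (0, 2), (0, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1
shells = k_shell(A)
x = eigenvector_centrality(A)
print(shells, x)
```

Plain power iteration may fail to settle on bipartite networks; iterating with $A + I$ instead is a common workaround that leaves the eigenvectors unchanged.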
As an improvement of the invention, four modules are included: a data preparation module, which extracts the topology information of real network data and obtains its adjacency matrix; a multi-angle key node identification module, which identifies groups of key nodes using several methods; a graph convolutional network test module, which tests the groups obtained in the previous stage with a graph convolutional network to obtain the graph data completion effect of each; and a judging and output module, which compares the completion effects and outputs the best group of key nodes.
Step 3.3 is specifically as follows: after a key node has been selected from the original network, the new adjacency matrix $A'$ is obtained by:
$$a'_{ij} = \begin{cases} 0, & i = v \text{ or } j = v \\ a_{ij}, & \text{otherwise} \end{cases}$$
where $a_{ij}$ is the element in row $i$ and column $j$ of the matrix $A$, and $v$ denotes the number of the selected key node in the original network. The data preparation module uses not only the static topology information of the network but also the real network data, and thus attends to the dynamic characteristics and evolution rules of the network.
The distance $d_{ij}$ is determined as follows: first, find the set of first-order neighbors of node $i$; if it includes node $j$, then $d_{ij} = 1$; otherwise, find the set of first-order neighbors of all first-order neighbors of node $i$ (excluding previously selected nodes), called the second-order neighbors of node $i$; if it includes node $j$, then $d_{ij} = 2$; otherwise, find the first-order neighbors of the second-order neighbors of node $i$, and so on until node $j$ is found, at which point $d_{ij}$ can be determined.
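The neighbor-by-neighbor search described above is a breadth-first search. The sketch below computes the distances from one node and, from them, a closeness value; it assumes the harmonic form of closeness (with $1/\infty = 0$ for unreachable nodes) so that disconnected networks are handled, and the names `bfs_distances` and `closeness` are hypothetical.

```python
import math
from collections import deque

def bfs_distances(adj, source):
    """Number of edges on the shortest path from source to every node."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] == math.inf:     # first time w is reached
                dist[w] = dist[u] + 1   # one order of neighbors further out
                queue.append(w)
    return dist

def closeness(adj, i):
    """Harmonic closeness: unreachable nodes contribute 0 to the sum."""
    d = bfs_distances(adj, i)
    n = len(adj)
    total = sum(1.0 / d[j] for j in adj if j != i and d[j] != math.inf)
    return total / (n - 1)

# Hypothetical network: path 0-1-2 plus an isolated node 3.
adj = {0: [1], 1: [0, 2], 2: [1], 3: []}
print(bfs_distances(adj, 0))
print(closeness(adj, 0), closeness(adj, 1))
```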
Compared with the prior art, the invention has the following advantages: 1. existing key node selection methods mainly take the perspective of network propagation, network control and the like, and a key node selection method aimed at graph data completion is lacking; the invention approaches the problem from the perspective of graph data completion and uses a graph convolutional network to compare candidate groups and obtain a group of key nodes suited to graph data completion;
2. most existing key node identification methods target static networks and are based on the structural characteristics of the network, without attending to the dynamic properties and evolution rules of the network;
3. most existing methods focus on ranking the importance of nodes, that is, on the selection of individual nodes. In reality, however, simply selecting the top $m$ nodes of a node ranking often does not give the best choice of $m$ nodes; the invention uses an efficient scheme for selecting a group of important nodes in the network;
4. existing key node identification methods each have their own starting point and emphasis, and a method that works well on one class of graphs does not necessarily work well on others. Take degree centrality as an example: it is an easy-to-compute and effective index of node importance in scale-free networks, but it may not be a good index for other classes of networks, such as random networks. Existing methods thus lack effective integration. The invention integrates several representative existing key node identification methods and provides a method for selecting a group of key nodes.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram of a set of key node selection processes according to the present invention;
FIG. 3 is a schematic diagram of k-shell decomposition;
FIG. 4 is a flowchart of graph data completion with the graph convolutional network and of testing the completion effect according to the present invention.
In the figure: 1 is a 1-shell, 2 is a 2-shell, and 3 is a 3-shell.
Detailed Description
In order to enhance the understanding of the present invention, the present embodiment will be described in detail with reference to the accompanying drawings.
Example 1: referring to FIGS. 1-4, a method for selecting an integrated key node group for graph data completion comprises the following steps:
step 1: acquiring topology information and the data on each node in the network,
step 2: inputting the topology information obtained in step 1 into a data preparation module to obtain its adjacency matrix,
step 3: inputting the adjacency matrix obtained in step 2 into a multi-angle key node identification module,
step 4: inputting the obtained groups of key nodes into a graph convolutional network test module, which outputs the test result of each group of key nodes,
step 5: after the completion effects of the different groups of key nodes have been calculated, inputting the obtained groups and their mean square errors into a judging and output module, which selects the group of key nodes with the best completion effect as the final result and outputs it.
Step 1: to obtain the topology information, the actual traffic network is first abstracted into a network in which each traffic intersection is regarded as a node and each road section connecting two intersections is regarded as an edge between the corresponding nodes; the attribute value of each node represents the traffic flow at the corresponding intersection over time.
The adjacency matrix $A$ obtained in step 2 has elements $a_{ij}$ defined as follows:
$$a_{ij} = \begin{cases} 1, & \text{if there is an edge between node } i \text{ and node } j \\ 0, & \text{otherwise} \end{cases}$$
where $a_{ij}$ denotes an element of the adjacency matrix $A$, and $i$ and $j$ denote nodes in the network.
step 3.1, inputting an adjacency matrix of a traffic network and the number of key nodes to be identified;
step 3.2, selecting the most important node in the network based on a corresponding key node identification method (such as degree centrality) according to the input adjacency matrix;
the key node identification methods used in step 3.2 comprise 5 different centrality measures:
(1) Degree centrality: degree centrality describes the number of neighbors of a node and is calculated as follows:
$$DC_i = \sum_{j=1}^{n} a_{ij}$$
where $DC_i$ denotes the degree centrality of node $i$, $n$ is the total number of nodes, and $a_{ij}$ is an element of the adjacency matrix; the greater the degree centrality, the more important the node is considered;
(2) k-shell decomposition: k-shell decomposition can be used to describe the position of a node in the network and is calculated as follows:
first, all nodes of degree 1 and their edges are deleted from the network; then all nodes of degree 1 and their edges are deleted again from the resulting network, and this is repeated until no node of degree 1 remains in the network;
the set of deleted nodes is called the 1-shell, and the remaining nodes are called the 1-core;
the procedure continues in the same way for increasing degree until all nodes in the network have been deleted, yielding the k-shells and k-cores; each node belongs to exactly one k-shell, and the larger k is, the more important the node is considered;
(3) Closeness centrality: closeness centrality is used to describe the average distance of a node from all other nodes in the network and is calculated as follows:
$$CC_i = \frac{n-1}{\sum_{j \neq i} d_{ij}}$$
where $CC_i$ is the closeness centrality of node $i$, $n$ is the total number of nodes in the network, and $d_{ij}$ denotes the distance between node $i$ and node $j$, defined as the number of edges on the shortest path from node $i$ to node $j$; if node $i$ and node $j$ are not connected, $d_{ij} = \infty$ is taken, and in this case:
$$CC_i = \frac{1}{n-1} \sum_{j \neq i} \frac{1}{d_{ij}}$$
the greater the closeness centrality, the more important the node is considered;
(4) Betweenness centrality: betweenness centrality describes how many shortest paths between other pairs of nodes pass through a node and is calculated as follows:
$$BC_i = \sum_{s \neq i \neq t} \frac{g_{st}^{i}}{g_{st}}$$
where $BC_i$ denotes the betweenness centrality of node $i$, $g_{st}$ denotes the number of shortest paths between node $s$ and node $t$, and $g_{st}^{i}$ denotes the number of shortest paths between node $s$ and node $t$ that pass through node $i$; the greater the betweenness centrality of a node, the more important that node is considered;
(5) Eigenvector centrality: eigenvector centrality assumes that the influence of a node is determined not only by the number of its neighbors but also by the influence of each neighbor, so that the centrality of a node is proportional to the sum of the centralities of the nodes connected to it:
$$x_i = c \sum_{j=1}^{n} a_{ij} x_j$$
where $x = [x_1, x_2, \dots, x_n]^{T}$ denotes the eigenvector centralities of the nodes, $A$ is the adjacency matrix, and $c$ is a constant; if the above formula holds, $x$ is the eigenvector of the matrix $A$ corresponding to the eigenvalue $c^{-1}$. Eigenvector centrality is computed by giving an initial value $x(0)$ and then applying the following iteration:
$$x(t) = c A x(t-1), \quad t = 1, 2, \dots$$
the greater the eigenvector centrality of a node, the more important that node is considered.
Step 3.3. Deleting selected nodes and connected edges thereof from the original network;
step 3.4, inputting the new adjacency matrix, and reducing the number of key nodes still to be selected by 1;
step 3.5, repeating steps 3.2 to 3.4 until the number of key nodes still to be selected is 0;
step 3.6, outputting the numbers of all selected key nodes.
Step 4: as shown in FIG. 4, the five groups of key nodes obtained are input into the graph convolutional network test module, which outputs the test result of each group; the specific steps are as follows:
step 4.1: inputting the real graph data and the numbers of the key nodes identified in the previous stage;
step 4.2: according to the input, deleting the data of all nodes other than the key nodes;
step 4.3: first passing the data through a linear layer to pre-complete the graph data;
the formula for graph data pre-completion is as follows:
$$\tilde{X} = W X + b$$
where $X$ is the graph data after the non-key nodes have been deleted, an $m$-dimensional column vector; $n$ is the number of nodes; $m$ is the number of key nodes; $W$ is an $n \times m$ weight matrix; $b$ is the bias term, an $n$-dimensional column vector; and $\tilde{X}$ is the pre-completed data;
step 4.4: inputting the pre-completed result into a graph convolution layer and performing the graph convolution operation;
the formula of the graph convolution operation is as follows:
where $H^{(l)}$ denotes the graph data of layer $l$, an $n$-dimensional column vector; $W^{(l)}$ is the weight matrix; $A$ is the adjacency matrix; and $b^{(l)}$ is the bias term;
step 4.5: inputting the result of the graph convolution layer into a fully connected layer and outputting the graph data completion result;
the formula of the fully connected layer is as follows:
$$\hat{Y} = W H^{(l)} + b$$
where $H^{(l)}$ denotes the graph data of layer $l$, an $n$-dimensional column vector; $W$ is the weight matrix; $b$ is the bias term; and $\hat{Y}$ is the completion result;
step 4.6: comparing the completed result with the original real data to obtain the graph data completion effect:
$$\mathrm{MSE} = \frac{1}{n-m} \sum_{i=1}^{n-m} \left( \hat{y}_i - y_i \right)^2$$
where $\hat{y}_i$ is the completed value of the graph data, $y_i$ is the true value of the graph data, $n$ is the number of nodes, and $m$ is the number of key nodes; the smaller the mean square error, the better the graph data completion effect.
Step 5 is specifically as follows: after the completion effects of the five groups of key nodes have been calculated, the five groups and their mean square errors are input into the judging and output module, which compares the completion effect of each group, selects the group with the best completion effect as the final result, and outputs it.
Let the 5 groups of key nodes obtained by the multi-angle key node identification module be $S_1, S_2, \dots, S_5$, and let the 5 results obtained for them by the graph convolutional network test module be $\mathrm{MSE}_1, \mathrm{MSE}_2, \dots, \mathrm{MSE}_5$. The final group of key nodes $S^{*}$ is then:
$$S^{*} = S_{k^{*}}, \quad k^{*} = \arg\min_{k \in \{1, \dots, 5\}} \mathrm{MSE}_k$$
it should be noted that the above-mentioned embodiments are not intended to limit the scope of the present invention, and equivalent changes or substitutions made on the basis of the above-mentioned technical solutions fall within the scope of the present invention as defined in the claims.
Claims (5)
1. A method for selecting an integrated key node group for graph data completion, the method comprising the steps of:
step 1: acquiring topology information, acquiring data on each node in the network,
step 2: inputting the topology information obtained in step 1 into a data preparation module to obtain its adjacency matrix $A$, whose element $a_{ij}$ is defined as follows:
$$a_{ij} = \begin{cases} 1, & \text{if there is an edge between node } i \text{ and node } j \\ 0, & \text{otherwise} \end{cases}$$
where $a_{ij}$ denotes an element of the adjacency matrix $A$, and $i$ and $j$ denote nodes in the network;
step 3: inputting the adjacency matrix obtained in step 2 into a multi-angle key node identification module, which is divided into three sub-modules: a neighborhood judging module, a path judging module and an iteration judging module, wherein each sub-module is based on 1-2 different evaluation indexes, and the judging flow for each evaluation index is as follows:
step 3.1, inputting an adjacency matrix of a traffic network and the number of key nodes to be identified;
step 3.2, selecting the most important node in the network based on a corresponding key node identification method according to the input adjacency matrix;
step 3.3. Deleting selected nodes and connected edges thereof from the original network;
step 3.4, inputting the new adjacency matrix, and reducing the number of key nodes still to be selected by 1;
step 3.5, repeating steps 3.2 to 3.4 until the number of key nodes still to be selected is 0;
step 3.6, outputting the numbers of all selected key nodes,
step 3.2 is based on a corresponding key node identification method, and specifically comprises the following steps: including 5 different center metrics:
(1) Centering: the centrality describes the number of neighbors of a node, and the calculation formula is as follows:
wherein ,representing node->Center of degree (F),>for the total number of nodes, +.>The greater the centrality is for the elements in the adjacency matrix, the more important the node is considered;
(2)and (3) decomposition: />The decomposition may be used to describe the location of a node in the network, and the specific calculation method is as follows:
firstly, deleting all nodes with the degree of 1 and corresponding edges thereof in a network, and then deleting all nodes with the degree of 1 and corresponding edges thereof in a new network again, and repeating the steps until no edges with the degree of 1 exist in the network;
the set of deleted nodes is referred to as a 1-shell, and the remaining nodes are referred to as 1-cores;
and so on until all nodes in the network are deleted, obtaining a k-shell and a k-core, wherein each node and a node uniquely belong to a certain k-shell, and the larger k is, the more important the node is considered;
(3) Closeness centrality: closeness centrality is used to describe the average distance of a node from all other nodes in the network, and is calculated as follows:
$$CC_i = \frac{1}{N-1}\sum_{j \neq i}\frac{1}{d_{ij}}$$
wherein $CC_i$ is the closeness centrality of node $i$, $N$ is the total number of nodes in the network, and $d_{ij}$ represents the distance between node $i$ and node $j$, defined as the length of the shortest path from node $i$ to node $j$, wherein if node $i$ and node $j$ are not connected, it is considered that $d_{ij} = \infty$, and in this case $1/d_{ij} = 0$;
the greater the closeness centrality, the more important the node is considered;
(4) Betweenness centrality: betweenness centrality is used to describe how many shortest paths between pairs of nodes pass through a node, and the calculation formula is as follows:
$$BC_i = \sum_{s \neq i \neq t}\frac{g_{st}^{i}}{g_{st}}$$
wherein $BC_i$ represents the betweenness centrality of node $i$, $g_{st}$ represents the number of shortest paths between node $s$ and node $t$, and $g_{st}^{i}$ represents the number of shortest paths between node $s$ and node $t$ that pass through node $i$; the greater the betweenness centrality, the more important the node is considered;
(5) Eigenvector centrality: eigenvector centrality assumes that the influence of a node is determined not only by the number of its neighbors but also by the influence of each neighbor, so that the centrality of a node is proportional to the sum of the centralities of the nodes to which it is connected, which gives:
$$x_i = c\sum_{j=1}^{N} a_{ij}x_j$$
wherein $x = [x_1, x_2, \ldots, x_N]^{T}$ represents the eigenvector centrality of each node, $A$ is the adjacency matrix, and $c$ is a constant; if $x = cAx$ holds, $x$ is the eigenvector of matrix $A$ corresponding to the eigenvalue $c^{-1}$; the eigenvector centrality is calculated by giving an initial value $x(0)$ and then using the following iterative algorithm:
$$x(t) = cAx(t-1)$$
wherein $t = 1, 2, \ldots$; the greater the eigenvector centrality of a node, the more important the node is considered;
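Three of the five metrics can be computed directly from the adjacency matrix, as the sketch below illustrates. It is an illustration under assumptions rather than the patented implementation: the function names are hypothetical, degree centrality is normalised by N - 1, the k-shell routine prunes nodes of degree at most k at stage k (so isolated nodes land in the 1-shell), and the eigenvector iteration normalises the vector at every step in place of an explicit constant. The two path-based metrics (closeness and betweenness) need shortest paths and are not covered here.

```python
import numpy as np

def degree_centrality(adj):
    """Degree of each node divided by N - 1 (normalisation is an assumption)."""
    adj = np.asarray(adj, dtype=float)
    return adj.sum(axis=1) / (adj.shape[0] - 1)

def k_shell(adj):
    """Shell index of each node, found by repeatedly pruning low-degree nodes."""
    adj = np.array(adj, dtype=float)
    n = adj.shape[0]
    shell = np.zeros(n, dtype=int)
    alive = np.ones(n, dtype=bool)
    k = 1
    while alive.any():
        while True:
            prune = alive & (adj.sum(axis=1) <= k)   # nodes to delete at this stage
            if not prune.any():
                break
            shell[prune] = k
            alive[prune] = False
            adj[prune, :] = 0                        # delete their edges as well
            adj[:, prune] = 0
        k += 1
    return shell

def eigenvector_centrality(adj, max_iter=1000, tol=1e-10):
    """Power iteration x(t) ~ A x(t-1), rescaled to unit length at every step."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    x = np.ones(n) / np.sqrt(n)                      # initial value x(0)
    for _ in range(max_iter):
        x_next = adj @ x
        norm = np.linalg.norm(x_next)
        if norm == 0:                                # network without edges
            return x_next
        x_next /= norm
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x
```

Plain power iteration can oscillate on bipartite networks; a shifted matrix such as A + I is a common workaround, though it is not part of the text above.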
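step 3.3 is specifically as follows: after selecting a key node from the original network, a new adjacency matrix $A'$ is obtained by:
$$a'_{ij} = \begin{cases} 0, & i = v \text{ or } j = v \\ a_{ij}, & \text{otherwise} \end{cases}$$
wherein $a_{ij}$ is the element in the $i$-th row and $j$-th column of matrix $A$, and $v$ represents the number of the key node selected in the original network;
first, find the nodes directly connected to node $i$, called the first-order neighbors of node $i$; if they include node $j$, then $d_{ij} = 1$; otherwise, find the first-order neighbors of the first-order neighbors of node $i$, called the second-order neighbors of node $i$; if they include node $j$, then $d_{ij} = 2$; otherwise, find the first-order neighbors of the second-order neighbors, and so on until node $j$ is found, at which point $d_{ij}$ can be determined;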
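The node deletion and the neighbor-by-neighbor distance search described here amount to zeroing a row and column and running a breadth-first search. The sketch below is one way to write this; the function names are hypothetical, deletion by zeroing (rather than shrinking the matrix) is an assumption, and the closeness score uses the reciprocal-distance form so that unreachable nodes, whose distance is infinite, contribute 0.

```python
from collections import deque
import numpy as np

def remove_node(adj, v):
    """Return a copy of adj with all edges of node v deleted."""
    new_adj = np.array(adj, dtype=float)
    new_adj[v, :] = 0
    new_adj[:, v] = 0
    return new_adj

def bfs_distances(adj, source):
    """Distance from source to every node; unreachable nodes get infinity."""
    adj = np.asarray(adj)
    dist = np.full(adj.shape[0], np.inf)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in np.flatnonzero(adj[u]):     # first-order neighbors of u
            if np.isinf(dist[w]):            # not reached yet
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness_centrality(adj):
    """Average reciprocal distance to all other nodes (1/inf counts as 0)."""
    n = len(adj)
    scores = np.zeros(n)
    for i in range(n):
        d = bfs_distances(adj, i)
        others = np.arange(n) != i
        scores[i] = np.sum(1.0 / d[others]) / (n - 1)
    return scores
```

Because nodes are deleted one after another, the network can fall apart into disconnected pieces, which is why the unreachable case has to be handled.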
Step 4: inputting the obtained key nodes into a graph convolutional network test module, and outputting the test result of each group of key nodes,
step 5: after the completion effects of different key nodes are calculated, the obtained key nodes and the mean square error thereof are input into a judging and outputting module, and a group of key nodes with the best completion effects are selected as a final result and output.
2. The method for selecting an integrated key node group for graph data completion according to claim 1, wherein in step 1, obtaining the topology information comprises first abstracting the actual traffic network into a network, wherein each traffic intersection is regarded as a node in the network, a road section connecting two traffic intersections is regarded as an edge between the nodes, and the attribute value of each node represents the traffic flow information of the corresponding traffic intersection over time.
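As an illustration of this abstraction, the sketch below builds an adjacency matrix from a list of road sections. The function name `build_adjacency`, the integer numbering of intersections, and the treatment of every road section as an undirected, unweighted edge are assumptions made for the example.

```python
import numpy as np

def build_adjacency(num_intersections, road_sections):
    """Adjacency matrix with one node per intersection and one edge per road section."""
    adj = np.zeros((num_intersections, num_intersections), dtype=int)
    for u, v in road_sections:
        adj[u, v] = 1
        adj[v, u] = 1        # assumption: road sections are two-way
    return adj

# Hypothetical network: four intersections and four road sections.
road_sections = [(0, 1), (1, 2), (2, 3), (1, 3)]
adjacency = build_adjacency(4, road_sections)

# Node attributes: one row of flow readings per intersection (placeholder values).
flows = np.zeros((4, 24))
```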
3. The method for selecting an integrated key node group for graph data completion according to claim 2, wherein in step 4, the graph convolutional network test module uses a graph convolutional network to perform a graph data completion test on the key nodes identified in the previous stage and outputs the test result, and the graph data completion method specifically comprises the following steps:
step 4.1: inputting real graph data information and key node numbers judged in the previous stage;
step 4.2: according to the input information, deleting the information of all nodes other than the key nodes;
step 4.3: firstly, passing the data through a linear layer to perform graph data pre-completion;
the formula for graph data pre-completion is as follows:
$$H^{(0)} = W^{(0)}X_m + b^{(0)}$$
wherein $X_m$ is the graph data information after deleting part of the nodes, which is an $m$-dimensional column vector, $N$ is the number of nodes, $m$ is the number of key nodes, $W^{(0)}$ is an $N \times m$ weight matrix, $b^{(0)}$ is a bias term, which is an $N$-dimensional vector, and $H^{(0)}$ is the data after pre-completion;
step 4.4: inputting the result obtained by the pre-completion into a graph convolution layer, and performing the graph convolution operation;
the formula of the graph convolution operation is as follows:
$$H^{(l+1)} = W^{(l)}AH^{(l)} + b^{(l)}$$
wherein $H^{(l)}$ indicates the graph data information of the $l$-th layer, which is an $N$-dimensional vector, $W^{(l)}$ is an $N \times N$ weight matrix, $A$ is the adjacency matrix, and $b^{(l)}$ is a bias term;
step 4.5: inputting the result obtained by the graph convolution layer into a fully connected layer, and outputting the graph data completion result;
the formula of the fully connected layer is as follows:
$$H^{(l+1)} = W^{(l)}H^{(l)} + b^{(l)}$$
wherein $H^{(l)}$ indicates the graph data information of the $l$-th layer, which is an $N$-dimensional vector, $W^{(l)}$ is an $N \times N$ weight matrix, and $b^{(l)}$ is a bias term;
step 4.6: comparing the result obtained by the completion with the original real information to obtain the graph data completion effect,
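A forward pass through steps 4.1 to 4.6 might look like the sketch below. It is only a sketch under several assumptions: the names `init_params` and `complete_and_score` and the parameter keys are hypothetical, each node carries a single value so every layer works on an N-dimensional vector, the graph convolution is taken as a weight matrix applied to the neighbor-aggregated signal with no activation function, the completion effect is measured as the mean square error over all nodes, and the weights are random placeholders that would presumably be fitted by training, which is not shown.

```python
import numpy as np

def init_params(n, m, seed=0):
    """Random placeholder weights for the three layers (untrained)."""
    rng = np.random.default_rng(seed)
    return {
        "W_pre": 0.1 * rng.standard_normal((n, m)), "b_pre": np.zeros(n),
        "W_gc": 0.1 * rng.standard_normal((n, n)), "b_gc": np.zeros(n),
        "W_fc": 0.1 * rng.standard_normal((n, n)), "b_fc": np.zeros(n),
    }

def complete_and_score(x_true, adj, key_nodes, params):
    """Complete the graph data from the key nodes and score it against x_true."""
    x_true = np.asarray(x_true, dtype=float)           # step 4.1: real graph data
    adj = np.asarray(adj, dtype=float)
    x_key = x_true[key_nodes]                          # step 4.2: keep key nodes only
    h0 = params["W_pre"] @ x_key + params["b_pre"]     # step 4.3: linear pre-completion
    h1 = params["W_gc"] @ (adj @ h0) + params["b_gc"]  # step 4.4: graph convolution
    y = params["W_fc"] @ h1 + params["b_fc"]           # step 4.5: fully connected layer
    mse = float(np.mean((y - x_true) ** 2))            # step 4.6: compare with real data
    return y, mse
```

Calling `complete_and_score(x, adj, [0, 3], init_params(len(x), 2))` returns the completed vector together with its mean square error, which is the quantity compared across groups of key nodes.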
4. The method for selecting an integrated key node group for graph data completion of claim 3, wherein step 5 specifically comprises the following steps: after the completion effects of the five different groups of key nodes are calculated, the five groups of key nodes and their mean square errors are input into a judging and outputting module, the completion effects of the groups are compared, and the group of key nodes with the best completion effect is selected as the final result and output,
let the 5 groups of key nodes obtained by the multi-angle key node identification module be $S_1, S_2, S_3, S_4, S_5$ respectively, and let the mean square errors obtained by the graph convolutional network test module be $MSE_1, MSE_2, MSE_3, MSE_4, MSE_5$; the final set of key nodes $S^{*}$ is then:
$$S^{*} = S_k, \quad k = \arg\min_{i \in \{1,\ldots,5\}} MSE_i$$
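The final choice is a plain minimum over the mean square errors of the candidate groups. A small sketch follows, with the hypothetical function name `select_best_group` and the assumption that the first group wins when several share the lowest error:

```python
def select_best_group(groups, mse_values):
    """Return the group of key nodes with the lowest mean square error, and that error."""
    if len(groups) != len(mse_values):
        raise ValueError("one mean square error is needed per group of key nodes")
    if not groups:
        raise ValueError("at least one group of key nodes is required")
    best = min(range(len(groups)), key=lambda i: mse_values[i])
    return groups[best], mse_values[best]
```

For example, `select_best_group([[0, 3], [1, 2], [0, 1]], [0.4, 0.1, 0.3])` returns `([1, 2], 0.1)`.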
5. The method for selecting an integrated key node group for graph data completion according to claim 4, wherein the method is implemented by the following four modules, specifically:
the data preparation module extracts the topology information of the real network data and obtains an adjacency matrix; the multi-angle key node identification module identifies groups of key nodes using several methods; the graph convolutional network test module tests the result obtained in the previous stage with a graph convolutional network to obtain the graph data completion effect; and the judging and outputting module compares the completion effects and outputs the optimal group of key nodes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310074880.6A CN115878861B (en) | 2023-02-07 | 2023-02-07 | Selection method for integrated key node group aiming at graph data completion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310074880.6A CN115878861B (en) | 2023-02-07 | 2023-02-07 | Selection method for integrated key node group aiming at graph data completion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115878861A CN115878861A (en) | 2023-03-31 |
CN115878861B CN115878861B (en) | 2023-05-26 |
Family
ID=85760788
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310074880.6A Active CN115878861B (en) | 2023-02-07 | 2023-02-07 | Selection method for integrated key node group aiming at graph data completion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115878861B (en) |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109285346B (en) * | 2018-09-07 | 2020-05-05 | 北京航空航天大学 | Urban road network traffic state prediction method based on key road sections |
CN110135092A (en) * | 2019-05-21 | 2019-08-16 | 江苏开放大学(江苏城市职业学院) | Complicated weighting network of communication lines key node recognition methods based on half local center |
CN113190654A (en) * | 2021-05-08 | 2021-07-30 | 北京工业大学 | Knowledge graph complementing method based on entity joint embedding and probability model |
CN113205466B (en) * | 2021-05-10 | 2024-04-02 | 南京航空航天大学 | Incomplete point cloud completion method based on hidden space topological structure constraint |
CN114066772A (en) * | 2021-11-26 | 2022-02-18 | 南京理工大学 | Tooth body point cloud completion method and system based on transform encoder |
CN114897084A (en) * | 2022-05-24 | 2022-08-12 | 河南工学院 | Tower crane structure safety monitoring method based on graph convolution neural network |
CN115391553B (en) * | 2022-08-23 | 2023-10-13 | 西北工业大学 | Method for automatically searching time sequence knowledge graph completion model |
Also Published As
Publication number | Publication date |
---|---|
CN115878861A (en) | 2023-03-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||