CN115529290A

CN115529290A - IP street level positioning method and device based on graph neural network

Info

Publication number: CN115529290A
Application number: CN202211144672.0A
Authority: CN
Inventors: 丁世昌; 罗向阳; 程若思; 李腾耀; 汤征; 张帆; 祖朔迪; 王玲; 巩道福; 杨春芳; 张弛
Original assignee: Information Engineering University of PLA Strategic Support Force
Current assignee: Information Engineering University of PLA Strategic Support Force
Priority date: 2022-08-30
Filing date: 2022-09-20
Publication date: 2022-12-27

Abstract

The invention belongs to the technical field of target IP positioning and discloses an IP street-level positioning method and device based on a graph neural network (GNN). The method comprises the following steps: first, the raw traceroute measurement data of a computer network are represented as an attributed graph; then an encoder converts the attributed graph into initial node embeddings; next, the initial node embeddings are refined by modeling the connection information; finally, a decoder maps the refined embeddings to node locations. The invention alleviates the convergence problem of the GNN by taking prior knowledge into account and improves the accuracy of geographic location prediction. Experiments on different real-world data sets show that the invention outperforms the state-of-the-art rule-based and learning-based baselines by 16% to 28% in median error distance across all data sets.

Description

IP street level positioning method and device based on graph neural network

Technical Field

The invention relates to the technical field of target IP positioning, in particular to an IP street level positioning method and device based on a graph neural network.

Background

Target IP street-level positioning builds on city-level IP positioning, or on a known city for the IP, and further determines a more precise location of the IP address within that city. IP street-level positioning is needed in many fields, such as tracing the source of network attacks and providing localized services. However, most existing methods are suitable only for ideal network environments. In real networks their positioning accuracy is low, because the relationship between the measurement data and geographic location does not conform to the assumptions those methods make. Developing an IP street-level positioning technique that achieves higher accuracy in real networks is therefore of significant value.

Depending on how the mapping from measurement data to geographic location is modeled, IP street-level positioning techniques can be divided into rule-based techniques and deep-learning-based techniques.

Rule-based IP street-level positioning techniques generally assume a strong correlation between measurement data, such as delay or routing path, and geographic distance, and define rules for converting measurement data into geographic distance according to their respective assumptions. These methods are easy to understand, but their accuracy is generally low because their assumptions often do not hold in most real networks.

Recently, some researchers have tried to model the mapping between measurement data and geographic location with deep learning. Deep learning is good at capturing, from large amounts of data, the nonlinear relationships between various measurement data and geographic distance, and is therefore better suited to the complexity of real networks. The applicant previously proposed feeding the delay and routing information of a target network into a multi-layer perceptron (MLP), which then predicts the geographic location of the target IP. Such methods achieve higher positioning accuracy than rule-based techniques. However, an MLP cannot model the topology of the computer network, so the positioning accuracy still leaves room for improvement.

Disclosure of Invention

To address the problem that the accuracy of existing IP street-level positioning techniques still needs to be improved, the invention provides an IP street-level positioning method and device based on a graph neural network (GNN), which alleviates the GNN convergence problem by taking prior knowledge into account and thereby improves the accuracy of geographic location prediction.

In order to achieve the purpose, the invention adopts the following technical scheme:

the invention provides an IP street level positioning method based on a graph neural network, which comprises the following steps:

step 1, representing the raw traceroute measurement data of a computer network as an attributed graph;

step 2, generating, by an encoder, an initial graph node embedding for each IP address based on the attributed graph; the IP addresses include landmark IP addresses, whose corresponding geographic locations are known;

step 3, injecting geographic location information into the initial graph node embeddings by modeling the link information between the nodes of the computer network;

step 4, mapping, by a decoder, the graph node embeddings processed in step 3 to the geographic locations of the corresponding graph nodes, i.e., IP addresses;

step 5, comparing the geographic locations obtained by the mapping with the actual geographic locations, and optimizing the model parameters through back propagation and gradient descent;

step 6, predicting, based on the optimized model, the geographic locations of the other IP addresses in the computer network whose geographic locations are unknown.

Further, the step 1 comprises:

step 1.1, constructing a network topology graph:

representing the topology of a computer network using IP addresses and the links between IP addresses: all IP addresses in the raw traceroute measurement data of the computer network are converted into graph nodes; each node $v_i$ is distinguished by a node ID that takes a value from 1 to $N_V$, where $N_V$ is the number of all IP addresses in the raw traceroute measurement data;

converting a direct physical link between two IP addresses into a graph edge;

step 1.2, attribute extraction:

each graph node is associated with two node attributes: node delay and node IP address; for each node, the probing host measures the delay multiple times and the minimum delay is selected as the node delay; finally, the node ID and the node attributes are combined as the initial features of the node;

each edge is associated with an edge delay, a head node IP address, and a tail node IP address; the node closer to the probing host is taken as the head node of the edge and the other node as the tail node, and the edge delay is calculated by subtracting the node delay of the head node from that of the tail node.

Further, in step 1.1, when the probing host cannot obtain the IP address and delay of a router during traceroute, there are two methods for establishing edges around that router: (i) ignoring the router and establishing an edge between the two routers immediately before and after it on the routing path; (ii) mapping the router to an IP address found in other, re-measured routing paths.
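Method (i) can be illustrated with a minimal Python sketch. The `Hop` representation (an IP of `None` for a router that did not answer) and the function name `edges_skipping_anonymous` are assumptions made for this illustration and do not come from the patent.

```python
from typing import List, Optional, Tuple

# Hypothetical hop record: (ip, delay_ms); ip is None for an anonymous router.
Hop = Tuple[Optional[str], Optional[float]]


def edges_skipping_anonymous(path: List[Hop]) -> List[Tuple[str, str]]:
    """Method (i): drop anonymous hops and link the routers on either side."""
    known = [ip for ip, _ in path if ip is not None]
    # Consecutive known hops become (head, tail) edges, head nearer the probe.
    return list(zip(known, known[1:]))


path = [("10.0.0.1", 1.2), (None, None), ("10.0.2.1", 3.5), ("10.0.3.9", 4.1)]
print(edges_skipping_anonymous(path))
# [('10.0.0.1', '10.0.2.1'), ('10.0.2.1', '10.0.3.9')]
```

Method (ii) would additionally need the re-measured paths in order to look up a replacement IP address, so it is not sketched here.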

Further, the step 3 comprises:

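for a node $v_i$ and its neighboring node $v_j$, the message from $v_j$ to $v_i$ is defined as

$$m_{i \leftarrow j} = f_{message}\left(h_{v_j}, h_{e_{ij}}\right)$$

wherein $m_{i \leftarrow j}$ is the message sent from $v_j$ to $v_i$, $f_{message}$ is the message function, $h_{v_j}$ denotes the embedding of the neighboring node, and $h_{e_{ij}}$ denotes the embedding of the edge $e_{ij}$ between node $v_i$ and node $v_j$; $f_{message}$ is implemented as:

$$f_{message}\left(h_{v_j}, h_{e_{ij}}\right) = f_{edge}\left(h_{e_{ij}}\right) h_{v_j} = W_{e_{ij}} h_{v_j}$$

wherein the edge network $f_{edge}$ is a two-layer MLP that converts the edge embedding $h_{e_{ij}}$ into a matrix $W_{e_{ij}}$, which is used to weight the neighboring node embedding $h_{v_j}$; the architecture of $f_{edge}$ is as follows:

$$W_{e_{ij}} = f_{edge}\left(h_{e_{ij}}\right) = \mathrm{reshape}\left(W_2\,\mathrm{ReLU}\left(W_1 h_{e_{ij}}\right)\right)$$

wherein $W_{e_{ij}}$ is the weight that controls the impact of the neighboring node embedding $h_{v_j}$ on node $v_i$, and $W_1$ and $W_2$ are the weight matrices of the two MLP layers;

by the above formula, the initial edge embedding $h_{e_{ij}}$ is first converted into a temporary embedding, the temporary embedding is converted into a one-dimensional weight embedding, and the one-dimensional weight embedding is reshaped into a two-dimensional weight matrix $W_{e_{ij}}$ that controls the impact of the neighboring node $v_j$ on node $v_i$; the ReLU activation function is used to capture nonlinear relationships;

$U_{v_i}$ denotes the set of neighboring nodes of $v_i$; the aggregation function $f_{aggregate}$ collects all messages propagated from $U_{v_i}$ to $v_i$, and the update function $f_{update}$ uses the collected messages $m_{v_i}$ to update the embedding of $v_i$; the aggregation and update functions of $v_i$ are defined as:

$$m_{v_i} = f_{aggregate}\left(\left\{ m_{i \leftarrow j} : v_j \in U_{v_i} \right\}\right), \qquad h'_{v_i} = f_{update}\left(h_{v_i}, m_{v_i}\right)$$

wherein $h'_{v_i}$ denotes the updated embedding of node $v_i$ and $h_{v_i}$ is its initial node embedding from the encoder;

the update function is implemented as follows:

$$h'_{v_i} = \mathrm{ReLU}\left(h_{v_i} + m_{v_i}\right)$$

that is, node $v_i$ is updated by summing its embedding before the update, $h_{v_i}$, and the message aggregated from its neighbors, $m_{v_i}$, followed by a nonlinear feature transformation with ReLU.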

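Further, in step 5, the following loss function is adopted:

$$L = \frac{1}{\left|N(v)_{train}\right|} \sum_{v_i \in N(v)_{train}} \left\| \mathrm{loc}_{train,i} - \widehat{\mathrm{loc}}_{train,i} \right\|_2^2 + \lambda \left\| \Theta \right\|_2^2$$

wherein $N(v)_{train}$ denotes all the nodes in the training data set, $\mathrm{loc}_{train}$ and $\widehat{\mathrm{loc}}_{train}$ denote, respectively, the real and the predicted geographic locations of the training-set IPs after transformation, $\Theta$ denotes the trainable model parameters, and $\lambda$ controls the strength of the $L_2$ regularization.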

Another aspect of the present invention provides an IP street-level positioning apparatus based on a graph neural network, including:

a preprocessing module, configured to represent the raw traceroute measurement data of a computer network as an attributed graph;

an encoding module, configured to generate, by an encoder, an initial graph node embedding for each IP address based on the attributed graph; the IP addresses include landmark IP addresses, whose corresponding geographic locations are known;

a message passing module, configured to inject geographic location information into the initial graph node embeddings by modeling the link information between the nodes of the computer network;

a decoding module, configured to map, by a decoder, the graph node embeddings processed by the message passing module to the geographic locations of the corresponding graph nodes, i.e., IP addresses;

a model training optimization module, configured to compare the geographic locations obtained by the mapping with the actual geographic locations and to optimize the model parameters through back propagation and gradient descent;

and a geographic location prediction module, configured to predict, based on the optimized model, the geographic locations of the other IP addresses in the computer network whose geographic locations are unknown.

Further, the preprocessing module is specifically configured to:

step 1.1, constructing a network topology graph:

converting a direct physical link between two IP addresses into a graph edge;

step 1.2, attribute extraction:

each edge is associated with an edge delay, a head node IP address, and a tail node IP address; the node closer to the probing host is taken as the head node of the edge and the other node as the tail node, and the edge delay is calculated by subtracting the node delay of the head node from that of the tail node.

Further, the message passing module is specifically configured to:

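for a node $v_i$ and its neighboring node $v_j$, the message from $v_j$ to $v_i$ is defined as

$$m_{i \leftarrow j} = f_{message}\left(h_{v_j}, h_{e_{ij}}\right)$$

wherein $m_{i \leftarrow j}$ is the message sent from $v_j$ to $v_i$, $f_{message}$ is the message function, $h_{v_j}$ denotes the embedding of the neighboring node, and $h_{e_{ij}}$ denotes the embedding of the edge $e_{ij}$ between node $v_i$ and node $v_j$; $f_{message}$ is implemented as:

$$f_{message}\left(h_{v_j}, h_{e_{ij}}\right) = f_{edge}\left(h_{e_{ij}}\right) h_{v_j} = W_{e_{ij}} h_{v_j}$$

wherein the edge network $f_{edge}$ is a two-layer MLP that converts the edge embedding $h_{e_{ij}}$ into a matrix $W_{e_{ij}}$, which is used to weight the neighboring node embedding $h_{v_j}$; the architecture of $f_{edge}$ is as follows:

$$W_{e_{ij}} = f_{edge}\left(h_{e_{ij}}\right) = \mathrm{reshape}\left(W_2\,\mathrm{ReLU}\left(W_1 h_{e_{ij}}\right)\right)$$

wherein $W_{e_{ij}}$ is the weight that controls the impact of the neighboring node embedding $h_{v_j}$ on node $v_i$, and $W_1$ and $W_2$ are the weight matrices of the two MLP layers;

by the above formula, the initial edge embedding $h_{e_{ij}}$ is converted into the weight matrix $W_{e_{ij}}$ that controls the impact of the neighboring node $v_j$ on node $v_i$; the ReLU activation function is used to capture nonlinear relationships;

the messages propagated to $v_i$ from its set of neighboring nodes $U_{v_i}$ are collected by the aggregation function $f_{aggregate}$ into $m_{v_i}$, and the update function $f_{update}$ uses $m_{v_i}$ to update the embedding of $v_i$:

$$m_{v_i} = f_{aggregate}\left(\left\{ m_{i \leftarrow j} : v_j \in U_{v_i} \right\}\right), \qquad h'_{v_i} = f_{update}\left(h_{v_i}, m_{v_i}\right)$$

wherein $h'_{v_i}$ denotes the updated embedding of node $v_i$ and $h_{v_i}$ is its initial node embedding from the encoder;

the update function is implemented as follows:

$$h'_{v_i} = \mathrm{ReLU}\left(h_{v_i} + m_{v_i}\right)$$

that is, by summing the embedding before the node update, $h_{v_i}$, and the message aggregated from the neighbors, $m_{v_i}$.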

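Further, in the model training optimization module, the following loss function is adopted:

$$L = \frac{1}{\left|N(v)_{train}\right|} \sum_{v_i \in N(v)_{train}} \left\| \mathrm{loc}_{train,i} - \widehat{\mathrm{loc}}_{train,i} \right\|_2^2 + \lambda \left\| \Theta \right\|_2^2$$

wherein $N(v)_{train}$ denotes all the nodes in the training data set, $\mathrm{loc}_{train}$ and $\widehat{\mathrm{loc}}_{train}$ denote, respectively, the real and the predicted geographic locations of the training-set IPs after transformation, $\Theta$ denotes the trainable model parameters, and $\lambda$ controls the strength of the $L_2$ regularization.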

Compared with the prior art, the invention has the following beneficial effects:

The invention uses a graph neural network to improve the generalization capability of IP positioning. First, the raw traceroute measurement data of a computer network are represented as an attributed graph; then an encoder converts the attributed graph into initial node embeddings; next, the initial node embeddings are refined by modeling the connection information; finally, a decoder maps the refined embeddings to node locations. The invention alleviates the convergence problem of the graph neural network by taking prior knowledge into account, thereby improving the accuracy of geographic location prediction. Experiments on different real-world data sets show that the invention outperforms the state-of-the-art rule-based and learning-based baselines by 16% to 28% in median error distance across all data sets.

Drawings

Fig. 1 is a flowchart of an IP street-level positioning method based on a graph neural network according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of mapping a computer network to an attributed graph according to an embodiment of the present invention;

Fig. 3 is a framework diagram of the IP street-level positioning method based on a graph neural network according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an IP street-level positioning apparatus based on a graph neural network according to an embodiment of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings and specific embodiments:

As shown in fig. 1, an IP street-level positioning method based on a graph neural network (GNN-Geo) includes:

step S101, representing the raw traceroute measurement data of a computer network as an attributed graph;

step S102, generating, by an encoder, an initial graph node embedding for each IP address based on the attributed graph; the IP addresses include landmark IP addresses, whose corresponding geographic locations are known;

step S103, injecting geographic location information into the initial graph node embeddings by modeling the link information between the nodes of the computer network;

step S104, mapping, by a decoder, the graph node embeddings processed in step S103 to the geographic locations of the corresponding graph nodes, i.e., IP addresses;

step S105, comparing the geographic locations obtained by the mapping with the actual geographic locations, and optimizing the model parameters through back propagation and gradient descent;

step S106, predicting, based on the optimized model, the geographic locations of the other IP addresses in the computer network whose geographic locations are unknown.

Specifically, the method targets IP street-level positioning in the complex environment of the real Internet. From a deep learning perspective, IP positioning is converted into a graph node attribute prediction problem, which is solved with a graph neural network architecture in order to locate the target IP.

1. Reformulating the IP positioning problem

The method reformulates measurement-based IP street-level positioning, within the field of graph deep learning, as a node attribute prediction problem on an attributed graph.

As shown in fig. 1, the present invention uses IP addresses and the links between them to represent the topology of a computer network. A link means a direct physical link between two IP addresses. The location of each IP address is represented as a (latitude, longitude) pair. In addition, each IP address is associated with measurable attributes, such as the delay from the probing host to that IP address, and each link is likewise associated with measurable attributes, such as the delay of that link. All IP addresses can be divided into four groups: probing host IP addresses, landmark IP addresses, target IP addresses, and router IP addresses. The probing host is generally a host or server controlled by the researchers with network measurement software installed; the geographic location and IP address of a landmark are both known; a target is a network device whose IP address is known but whose geographic location is unknown and must be estimated by the positioning algorithm; a router is an intermediate router discovered when the probing host collects measurement data for the landmarks and targets. Thus, a computer network can be mapped to an attributed graph $G = \{V, E, X\}$, where a node $v$ represents an IP address, an edge $e$ represents a link, and $x$ represents the measurable attributes of nodes and edges (e.g., delay). Suppose the geographic location of a node $v$ is expressed by its latitude and longitude as $l_v$, and the node sets of the landmarks and of the targets are $V_l$ and $V_t$, respectively. Measurement-based IP positioning can then be formalized as finding a prediction function $f$ that predicts the target locations:

$$\left\{ \hat{l}_v, v \in V_t \right\} = f\left( \left\{ l_v, v \in V_l \right\}, G \right)$$

where $\hat{l}_v$ is the estimated location of a target IP, and the inputs are the geographic locations of the landmarks together with the nodes, edges, and measurement attributes of the graph $G$. Finding the prediction function $f$ is a typical graph node attribute prediction problem.

To solve this graph node attribute prediction problem, the attributed graph must be modeled accurately; that is, the relationship between the measurement data $G$ and the real locations of the target IPs $\{l_v, v \in V_t\}$ must be captured accurately, so as to obtain node feature representations that carry signals related to geographic location, from which the correct geographic location is finally estimated.

2. GNN-Geo method framework

The GNN-Geo method framework is shown in fig. 3 and consists of four components: a preprocessing layer, an encoder, a message passing (MP) layer, and a decoder. The preprocessing layer maps the raw measurement data of the computer network to a graph representation $G = (V, E, X_V, X_E)$. The encoder generates the initial feature embeddings of $G$; the MP layer refines the initial feature embeddings using graph signals and outputs the refined embeddings of all nodes $V$; finally, the decoder maps the refined node embeddings to locations. Each component is described in turn below, followed by the optimization details. It is worth mentioning that different GNN-based IP geolocation methods can be derived from the GNN-Geo framework by changing some details of each component.

1. Preprocessing layer: the measurement data used in this patent are mainly traceroute data from the probing host to the landmarks and targets. The task of the preprocessing layer is to convert the raw traceroute data into the encoder input $G = (V, E, X_V, X_E)$. It comprises two subtasks: (i) constructing the graph $G$ from graph nodes $V$ and links $E$; (ii) extracting the node attributes $X_V$ and the link attributes $X_E$.

(1) Constructing a network topology map

Nodes: the topology of the computer network is represented using IP addresses and the links between IP addresses. All IP addresses in the raw traceroute data are converted into graph nodes. Each node $v_i$ is distinguished by a node ID that takes a value from 1 to $N_V$, where $N_V$ is the number of all IP addresses in the traceroute data.

Edges: if the probing host finds a direct physical link between two IP addresses, that link is converted into a graph edge $e_j$. During traceroute, the probing host may fail to obtain the IP addresses and delays of some routers, which are referred to as "anonymous routers". There are two ways to establish edges around an anonymous router: (i) ignore the anonymous router and establish an edge between the two routers immediately before and after it on the routing path; (ii) map the anonymous router to an IP address found in other, re-measured routing paths.

(2) Attribute extraction

Node attributes: in this patent, each graph node is associated with two node attributes: node delay and node IP address. The node delay is the direct delay from the probing host to the node. For each node, the probing host measures the delay repeatedly, and the smallest value is chosen as the node delay because it is least affected by congestion and is therefore closest to the true propagation delay. Finally, the node ID and the node attributes are combined as the initial features of the node. $\mathbf{v}_i$ denotes the initial features of node $v_i$, and $\mathbf{V}$ denotes the initial features of all nodes.

Edge attributes: each edge can be associated with several features, such as the edge delay, the head node IP address, and the tail node IP address. Here, the node closer to the probing host is called the head node of the edge and the other node is called the tail node. The edge delay is calculated by subtracting the node delay of the head node from that of the tail node. The edge delay is kept even when it is negative.
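The preprocessing steps described above (node IDs, minimum node delay, signed edge delay) can be sketched in plain Python as follows. The input layout, a list of paths where each hop carries its repeated delay samples, and the function name `build_attributed_graph` are assumptions made for this illustration; anonymous routers are assumed to have been handled beforehand.

```python
from typing import Dict, List, Tuple

# Hypothetical input: each path is a list of (ip, [delay samples in ms]),
# ordered from the probing host outward.
Path = List[Tuple[str, List[float]]]


def build_attributed_graph(paths: List[Path]):
    node_id: Dict[str, int] = {}       # IP address -> node ID in 1..N_V
    node_delay: Dict[str, float] = {}  # IP address -> minimum observed delay
    for path in paths:
        for ip, samples in path:
            node_id.setdefault(ip, len(node_id) + 1)
            best = min(samples)
            node_delay[ip] = min(best, node_delay.get(ip, best))

    edges: Dict[Tuple[str, str], float] = {}  # (head IP, tail IP) -> edge delay
    for path in paths:
        for (head, _), (tail, _) in zip(path, path[1:]):
            # Edge delay = tail node delay - head node delay; a negative
            # value is kept rather than clipped.
            edges[(head, tail)] = node_delay[tail] - node_delay[head]
    return node_id, node_delay, edges


paths = [[("1.1.1.1", [0.5, 0.4]), ("2.2.2.2", [3.0, 2.5]), ("3.3.3.3", [2.4, 2.2])]]
node_id, node_delay, edges = build_attributed_graph(paths)
print(node_id)     # {'1.1.1.1': 1, '2.2.2.2': 2, '3.3.3.3': 3}
print(node_delay)  # {'1.1.1.1': 0.4, '2.2.2.2': 2.5, '3.3.3.3': 2.2}
```

In this toy input the second edge has a negative delay, which the sketch retains, matching the statement that negative edge delays are kept.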

2. Encoder

The purpose of the encoder is to form low-dimensional initial embeddings of the graph nodes and edges for the MP layer. The embeddings are generated from the initial node features $\mathbf{V}$ and the initial edge features $\mathbf{E}$ produced by the preprocessing layer.

Each non-zero feature in $\mathbf{V}$ and $\mathbf{E}$ is associated with an embedding vector, which yields a set of low-dimensional embeddings for the graph nodes and for the graph edges, respectively. The embeddings belonging to a node (or edge) are combined into a single vector that describes that node (or edge). Specifically, the low-dimensional embeddings of node $v_i$ and edge $e_j$ are:

$$h_{v_i} = Q_V^{\top} \mathbf{v}_i, \qquad h_{e_j} = Q_E^{\top} \mathbf{e}_j$$

where $Q_V \in R^{N_V \times G}$ is the embedding matrix of all nodes $V$, $Q_E \in R^{N_E \times K}$ is the embedding matrix of all edges $E$, $N_V$ is the number of node features and $G$ the node embedding size, and $N_E$ is the number of edge features and $K$ the edge embedding size. The encoder then passes $h_V$ and $h_E$ to the next component, the MP layer, for embedding refinement.
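As a rough illustration of the encoder, the sketch below assumes the embedding of a node (or edge) is the product of its feature vector with an embedding matrix, so that each non-zero feature contributes its own embedding row. The function name `embed` and the toy matrix are hypothetical, and the exact lookup used in the patent may differ.

```python
from typing import List


def embed(features: List[float], Q: List[List[float]]) -> List[float]:
    """Return Q^T x: the embedding rows of Q weighted by the feature values."""
    size = len(Q[0])
    out = [0.0] * size
    for value, row in zip(features, Q):
        if value != 0.0:  # only non-zero features contribute an embedding
            for k in range(size):
                out[k] += value * row[k]
    return out


# Toy embedding matrix: 3 features, embedding size 2 (values are made up).
Q_V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(embed([1.0, 0.0, 2.0], Q_V))  # [11.0, 14.0]
```

In a trained model the rows of the embedding matrix would be learned parameters rather than fixed values.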

3. Message passing (MP) layer

This is the core component of the GNN-Geo framework. Its inputs are the initial low-dimensional embeddings of the graph nodes, $h_V$, and of the edges, $h_E$. Its purpose is to refine the initial graph node embeddings $h_V$ by explicitly modeling the edges between nodes and the node/edge attributes. The final graph node embeddings are sent to the decoder for position estimation. The MP layer consists of message, aggregation, and update functions.

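Message function: for a node $v_i$ and its neighboring node $v_j$, the message from $v_j$ to $v_i$ is defined as

$$m_{i \leftarrow j} = f_{message}\left(h_{v_j}, h_{e_{ij}}\right)$$

where $m_{i \leftarrow j}$ is the message sent from $v_j$ to $v_i$ and $f_{message}$ is the message function. The inputs of $f_{message}$ are: (i) the embedding $h_{v_j}$ of the neighboring node; (ii) the embedding $h_{e_{ij}}$ of the edge $e_{ij}$ between the two nodes $v_i$ and $v_j$. In this embodiment, $f_{message}$ is implemented as:

$$f_{message}\left(h_{v_j}, h_{e_{ij}}\right) = f_{edge}\left(h_{e_{ij}}\right) h_{v_j} = W_{e_{ij}} h_{v_j}$$

where the edge network $f_{edge}$ is a two-layer MLP that converts the edge embedding $h_{e_{ij}}$ into a matrix $W_{e_{ij}}$, which is used to compute the message of the neighboring node. The architecture of $f_{edge}$ is as follows:

$$W_{e_{ij}} = f_{edge}\left(h_{e_{ij}}\right) = \mathrm{reshape}\left(W_2\,\mathrm{ReLU}\left(W_1 h_{e_{ij}}\right)\right)$$

where $W_{e_{ij}}$ is the weight that controls the impact of the neighbor embedding $h_{v_j}$ on node $v_i$, and $W_1$ and $W_2$ are the weight matrices of the two MLP layers. They first map the initial edge embedding $h_{e_{ij}}$ (size $K$) to a temporary embedding (size $2K$) and then to a one-dimensional weight embedding. The one-dimensional weight embedding is reshaped into a two-dimensional weight matrix $W_{e_{ij}}$ (size $G \times G$) that controls the impact of the neighboring node $v_j$ on node $v_i$. The ReLU activation function is used to capture nonlinear relationships.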
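The message function can be sketched in plain Python as below. This is a minimal illustration under stated assumptions: a single ReLU sits between the two MLP layers, biases are omitted, and the names `matvec`, `edge_network`, `message`, `W1`, and `W2` are chosen for this example rather than taken from the patent.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def edge_network(h_e: Vector, W1: Matrix, W2: Matrix, G: int) -> Matrix:
    """Two-layer MLP (K -> 2K -> G*G); the flat output is reshaped to G x G."""
    flat = matvec(W2, relu(matvec(W1, h_e)))
    return [flat[r * G:(r + 1) * G] for r in range(G)]


def message(h_vj: Vector, h_e: Vector, W1: Matrix, W2: Matrix) -> Vector:
    """m_{i<-j}: the edge-conditioned matrix applied to the neighbor embedding."""
    return matvec(edge_network(h_e, W1, W2, len(h_vj)), h_vj)


# Toy sizes: edge embedding K = 1, node embedding G = 2 (weights are made up).
W1 = [[1.0], [-1.0]]                                   # 2K x K
W2 = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]  # G*G x 2K
print(message([1.0, 3.0], [2.0], W1, W2))  # [2.0, 6.0]
```

Because the weight matrix is generated per edge, two neighbors with the same embedding can still send different messages when their edge attributes (for example, edge delay) differ.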

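Aggregation and update functions: $U_{v_i}$ denotes the set of neighboring nodes of $v_i$. The aggregation function $f_{aggregate}$ collects all messages propagated from $U_{v_i}$ to $v_i$, and the update function $f_{update}$ uses the collected messages $m_{v_i}$ to update the embedding of $v_i$:

$$m_{v_i} = f_{aggregate}\left(\left\{ m_{i \leftarrow j} : v_j \in U_{v_i} \right\}\right), \qquad h'_{v_i} = f_{update}\left(h_{v_i}, m_{v_i}\right)$$

where $h'_{v_i}$ denotes the updated embedding of node $v_i$ and $h_{v_i}$ is its initial node embedding from the encoder. The aggregation function $f_{aggregate}$ can be a simple symmetric function such as Mean, Max, or Sum. The update function is implemented as follows:

$$h'_{v_i} = \mathrm{ReLU}\left(h_{v_i} + m_{v_i}\right)$$

That is, node $v_i$ is updated by summing its embedding before the update, $h_{v_i}$, and the message it aggregates from its neighbors, $m_{v_i}$, followed by a nonlinear feature transformation with ReLU.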
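A minimal sketch of the aggregation and update steps follows. It assumes the update is simply ReLU applied to the sum of the previous embedding and the aggregated message, with no additional weight matrix; the function names `aggregate` and `update` are chosen for this example.

```python
from typing import List

Vector = List[float]


def aggregate(messages: List[Vector], how: str = "sum") -> Vector:
    """Symmetric aggregation of neighbor messages: sum, mean, or max."""
    columns = list(zip(*messages))
    if how == "sum":
        return [sum(c) for c in columns]
    if how == "mean":
        return [sum(c) / len(c) for c in columns]
    if how == "max":
        return [max(c) for c in columns]
    raise ValueError(f"unknown aggregation: {how}")


def update(h_vi: Vector, m_vi: Vector) -> Vector:
    """Assumed form: ReLU(h_vi + m_vi), applied element-wise."""
    return [max(0.0, h + m) for h, m in zip(h_vi, m_vi)]


msgs = [[1.0, -4.0], [3.0, 2.0]]   # messages from two neighbors
m = aggregate(msgs, "sum")         # [4.0, -2.0]
print(update([0.5, 1.0], m))       # [4.5, 0.0]
```

Any of the three aggregations is order-independent, so the result does not depend on the order in which neighbors are visited.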

4. Decoder

The goal of the decoder is to estimate positions from the refined node embeddings $h'_V$. Two values (latitude and longitude) must be predicted for every node. The decoder is implemented as follows:

$$\widehat{\mathrm{loc}} = \mathrm{Sigmoid}\left(\mathrm{BatchNorm}\left(W_{loc}\, h'_V + b_{loc}\right)\right)$$

where $\widehat{\mathrm{loc}}$ is the matrix of transformed estimated positions of all nodes $V$, and Sigmoid is an activation function whose output lies between 0 and 1. If the input value is too large, Sigmoid may saturate, which makes learning more difficult. BatchNorm refers to batch normalization. $W_{loc}$ is the weight parameter matrix of the fully connected layer in the decoder, and $b_{loc}$ is the bias parameter matrix of that layer. To alleviate the saturation problem of the Sigmoid function and to prevent overfitting, this embodiment employs batch normalization. GNN-Geo is then trained by comparing the real geographic locations of the training-set IPs (after transformation), $\mathrm{loc}_{train}$, with the predicted geographic locations of the training-set IPs (after transformation), $\widehat{\mathrm{loc}}_{train}$. After training is finished, the geographic location of a target IP, $\hat{l}_v$, can be obtained from its transformed geographic location $\widehat{\mathrm{loc}}$ by applying the inverse of the two "0-1 scalers".
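The decoder and the inverse "0-1 scaler" step can be sketched as follows. This is an illustration under assumptions: batch normalization is computed from the statistics of the current batch without learnable scale and shift, and the scalers are taken to be min-max scalers over a known latitude range and longitude range. The function names and the coordinate ranges in the usage example are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def batch_norm(rows: Matrix, eps: float = 1e-5) -> Matrix:
    """Normalize each column to zero mean and unit variance over the batch."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    variances = [sum((v - m) ** 2 for v in c) / len(c) for c, m in zip(cols, means)]
    return [[(v - m) / math.sqrt(s + eps) for v, m, s in zip(row, means, variances)]
            for row in rows]


def decode(H: Matrix, W_loc: Matrix, b_loc: List[float]) -> Matrix:
    """Sigmoid(BatchNorm(W_loc h + b_loc)) for every node embedding h in H."""
    linear = [[sum(w * x for w, x in zip(w_row, h)) + b
               for w_row, b in zip(W_loc, b_loc)] for h in H]
    return [[1.0 / (1.0 + math.exp(-z)) for z in row] for row in batch_norm(linear)]


def inverse_scale(scaled, lat_range: Tuple[float, float],
                  lon_range: Tuple[float, float]):
    """Map scaled (lat, lon) pairs in [0, 1] back to degrees."""
    (lat_min, lat_max), (lon_min, lon_max) = lat_range, lon_range
    return [(lat_min + a * (lat_max - lat_min), lon_min + b * (lon_max - lon_min))
            for a, b in scaled]


H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three refined node embeddings
scaled = decode(H, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(inverse_scale(scaled, (30.0, 40.0), (110.0, 120.0)))
```

Normalizing before the Sigmoid keeps its inputs near zero, which is the region where the function is far from saturation.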

5. Model training

To learn the model parameters, the mean squared error (MSE) loss between $\mathrm{loc}_{train}$ and $\widehat{\mathrm{loc}}_{train}$ is used. In this patent, the $L_2$-regularized MSE loss is optimized as follows:

$$L = \frac{1}{\left|N(v)_{train}\right|} \sum_{v_i \in N(v)_{train}} \left\| \mathrm{loc}_{train,i} - \widehat{\mathrm{loc}}_{train,i} \right\|_2^2 + \lambda \left\| \Theta \right\|_2^2$$

where $N(v)_{train}$ denotes all nodes in the training data set and $\Theta$ contains all trainable model parameters of GNN-Geo. Deep learning methods often suffer from overfitting. In addition to batch normalization in the decoder, $L_2$ regularization is used to prevent overfitting. Note that batch normalization is only used in training and must be disabled during testing. $\lambda$ controls the strength of the $L_2$ regularization. The prediction model is optimized with the Adam optimizer, and the model parameters are updated using the gradient of the loss function. Optimization stops once the loss has been minimized, and the resulting model can then be applied to locate the other IPs in the network whose geographic locations are unknown.
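For concreteness, the regularized loss can be computed as in the sketch below. The averaging over training nodes and the flat list of parameters are assumptions made for this illustration, and the function name `l2_regularized_mse` is hypothetical; the exact normalization in the patent's formula is not recoverable from the text.

```python
from typing import Sequence, Tuple

Location = Tuple[float, float]  # scaled (latitude, longitude), each in [0, 1]


def l2_regularized_mse(loc_true: Sequence[Location], loc_pred: Sequence[Location],
                       params: Sequence[float], lam: float) -> float:
    """Squared error averaged over training nodes, plus lam * ||params||_2^2."""
    squared_error = sum((t - p) ** 2
                        for true, pred in zip(loc_true, loc_pred)
                        for t, p in zip(true, pred))
    penalty = sum(w * w for w in params)
    return squared_error / len(loc_true) + lam * penalty


loss = l2_regularized_mse(
    loc_true=[(0.2, 0.4), (0.6, 0.8)],
    loc_pred=[(0.2, 0.5), (0.5, 0.8)],
    params=[1.0, -2.0],   # stand-in for all trainable parameters
    lam=0.01,
)
print(round(loss, 6))  # 0.06
```

A larger `lam` penalizes large weights more strongly, trading training fit for smoother predictions.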

To verify the effect of the present invention, the following experiment was performed:

TABLE 1 comparison of GNN-Geo method to baseline method (unit: km)

Note: bold indicates the performance of GNN-Geo, the best metric among the baselines is underlined, and italics mark the best baseline method in terms of mean error distance.

As shown in table 1, the mean, median, and maximum error distances of the GNN-Geo method are better than those of all baseline methods on all three regional data sets; that is, GNN-Geo achieves higher positioning accuracy than the baselines on all three metrics. It was also found that the best baseline method differs across the three regions. Compared with the existing methods, GNN-Geo therefore shows better generalization capability under different network environments.
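The three metrics reported in table 1 can be computed from predicted and true coordinates as sketched below. The patent does not state how the error distance in kilometres is measured, so the great-circle (haversine) distance on a sphere of radius 6371 km is an assumption of this example, as are the function names.

```python
import math
from statistics import mean, median
from typing import Dict, Sequence, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float,
                 radius_km: float = 6371.0) -> float:
    """Great-circle distance between two points on a sphere, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def error_summary(true_locs: Sequence[LatLon],
                  pred_locs: Sequence[LatLon]) -> Dict[str, float]:
    """Mean, median, and maximum error distance over all targets."""
    d = [haversine_km(*t, *p) for t, p in zip(true_locs, pred_locs)]
    return {"mean": mean(d), "median": median(d), "max": max(d)}


summary = error_summary([(0.0, 0.0)] * 3, [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)])
print(summary)
```

The median is less sensitive than the mean to a few badly located targets, which is one reason both are usually reported together with the maximum.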

On the basis of the above embodiments, as shown in fig. 4, another aspect of the present invention provides an IP street-level positioning apparatus based on a graph neural network, including:

a preprocessing module, configured to represent the raw traceroute measurement data of a computer network as an attributed graph;

Further, the preprocessing module is specifically configured to:

step 1.1, constructing a network topology graph:

representing the topology of a computer network using IP addresses and the links between IP addresses: all IP addresses in the raw traceroute measurement data of the computer network are converted into graph nodes; each node $v_i$ is distinguished by a node ID that takes a value from 1 to $N_V$, where $N_V$ is the number of all IP addresses in the raw traceroute measurement data;

converting a direct physical link between two IP addresses into a graph edge;

step 1.2, attribute extraction:

Further, in step 1.1, when the probing host cannot obtain the IP address and delay of a router during traceroute, there are two methods for establishing edges around that router: (i) ignoring the router and establishing an edge between the two routers immediately before and after it on the routing path; (ii) mapping the router to an IP address found in other, re-measured routing paths.

Further, the message passing module is specifically configured to:

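for a node $v_i$ and its neighboring node $v_j$, the message from $v_j$ to $v_i$ is defined as

$$m_{i \leftarrow j} = f_{message}\left(h_{v_j}, h_{e_{ij}}\right) = f_{edge}\left(h_{e_{ij}}\right) h_{v_j}$$

wherein the edge network $f_{edge}$ is a two-layer MLP that converts the edge embedding $h_{e_{ij}}$ into a matrix $W_{e_{ij}}$, which is used to weight the neighboring node embedding $h_{v_j}$; the architecture of $f_{edge}$ is as follows:

$$W_{e_{ij}} = f_{edge}\left(h_{e_{ij}}\right) = \mathrm{reshape}\left(W_2\,\mathrm{ReLU}\left(W_1 h_{e_{ij}}\right)\right)$$

wherein $W_{e_{ij}}$ is the weight that controls the impact of the neighboring node embedding $h_{v_j}$ on node $v_i$, and $W_1$ and $W_2$ are the weight matrices of the two MLP layers;

by the above formula, the initial edge embedding $h_{e_{ij}}$ is converted into the weight matrix $W_{e_{ij}}$ that controls the impact of the neighboring node $v_j$ on node $v_i$; the ReLU activation function is used to capture nonlinear relationships;

the messages propagated to $v_i$ from its set of neighboring nodes $U_{v_i}$ are collected by the aggregation function $f_{aggregate}$ into $m_{v_i}$, and the update function $f_{update}$ uses the collected messages to update the embedding of $v_i$:

$$m_{v_i} = f_{aggregate}\left(\left\{ m_{i \leftarrow j} : v_j \in U_{v_i} \right\}\right), \qquad h'_{v_i} = f_{update}\left(h_{v_i}, m_{v_i}\right)$$

wherein $h'_{v_i}$ denotes the updated embedding of node $v_i$ and $h_{v_i}$ is its initial node embedding from the encoder;

the update function is implemented as follows:

$$h'_{v_i} = \mathrm{ReLU}\left(h_{v_i} + m_{v_i}\right)$$

that is, by summing the embedding before the node update, $h_{v_i}$, and the message aggregated from the neighbors, $m_{v_i}$.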

wherein $N(v)_{train}$ denotes all the nodes in the training data set.

In summary, the present invention uses a graph neural network to improve the generalization capability of IP positioning. First, the raw traceroute measurement data of a computer network are represented as an attributed graph; then an encoder converts the attributed graph into initial node embeddings; next, the initial node embeddings are refined by modeling the connection information; finally, a decoder maps the refined embeddings to node locations. The method alleviates the convergence problem of the graph neural network by taking prior knowledge into account, thereby improving the accuracy of geographic location prediction. Experiments on different real-world data sets show that the invention outperforms the state-of-the-art rule-based and learning-based baselines by 16% to 28% in median error distance across all data sets.

The above describes only preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements shall also fall within the protection scope of the present invention.

Claims

1. An IP street level positioning method based on a graph neural network is characterized by comprising the following steps:

step 3, injecting geographical position information into the initial graph node embeddings by modeling the link information between computer network nodes;

2. The IP street-level positioning method based on graph neural network according to claim 1, wherein the step 1 comprises:

step 1.1, constructing a network topology graph:

representing the topology of the computer network using IP addresses and the links between IP addresses: all IP addresses in the raw traceroute measurement data of the computer network are converted into graph nodes, each node v_i being identified by a node ID whose value ranges from 1 to N_V, where N_V is the number of all IP addresses in the raw traceroute measurement data;

converting a direct physical link between two IP addresses into a graph edge;

step 1.2, attribute extraction:

3. The IP street level positioning method based on graph neural network as claimed in claim 2, wherein in step 1.1, when the probing host cannot obtain the IP address and delay of a router during the traceroute process, an edge involving that router is established by one of two methods: (i) ignoring the router and establishing an edge between the two routers immediately before and after it on the routing path; or (ii) mapping the router to the IP addresses found in other, re-measured routing paths.

4. The IP street level positioning method based on graph neural network as claimed in claim 1, wherein the step 3 comprises:

for a node v_i and its neighboring node v_j, the message from v_j to v_i is defined as

wherein m_i←j is the message sent from v_j to v_i, and f_message is the message function,

wherein the edge network f_edge is a two-layer MLP that converts the edge embedding into a matrix used to compute the neighbor node embedding.

The architecture of f_edge is as follows:

wherein the resulting matrix acts as the weight controlling the impact of the adjacent node's embedding on node i, and the remaining two parameters are the weight matrices of the two MLP layers;

the above formula is first applied to the initial edge embedding;

the message is then propagated to v_i, and the update function f_update uses the collected messages to update the embedding of v_i; the aggregate update function of v_i is defined as:

wherein the two embedding terms denote, respectively, the embedding of node v_i after the update and the initial node embedding produced by the encoder;

the update function is implemented as follows:

by combining the embedding of the node before the update with the messages aggregated from its neighbors

5. The IP street-level positioning method based on graph neural network according to claim 1, characterized in that in the step 5, the following loss function is adopted:

wherein N(v)_train denotes all the nodes in the training data set,

6. An IP street level positioning device based on a graph neural network, comprising:

the decoding module is configured to map, through a decoder, the graph node embeddings processed by the message passing module to the geographic locations of the corresponding graph nodes, i.e., IP addresses;

7. The IP street level positioning apparatus based on graph neural network as claimed in claim 6, wherein the preprocessing module is specifically configured to:

step 1.1, constructing a network topology graph:

converting a direct physical link between two IP addresses into a graph edge;

step 1.2, attribute extraction:

8. The IP street level positioning apparatus based on graph neural network as claimed in claim 7, wherein in step 1.1, when the probing host cannot obtain the IP address and delay of a router during the traceroute process, an edge involving that router is established by one of two methods: (i) ignoring the router and establishing an edge between the two routers immediately before and after it on the routing path; or (ii) mapping the router to the IP addresses found in other, re-measured routing paths.

9. The IP street level positioning apparatus based on graph neural network as claimed in claim 6, wherein the message passing module is specifically configured to:

for a node v_i and its neighboring node v_j, the message from v_j to v_i is defined as

wherein the edge network f_edge is a two-layer MLP that converts the edge embedding into a matrix used to compute the neighbor node embedding.

The architecture of f_edge is as follows:

wherein the resulting matrix acts as the weight controlling the impact of the adjacent node's embedding on node i, and the remaining two parameters are the weight matrices of the two MLP layers;

the above formula is first applied to the initial edge embedding;

the message is then propagated to v_i, and the update function f_update uses the collected messages to update the embedding of v_i,

wherein

the update function is implemented as follows:

by combining the embedding of the node before the update with the messages aggregated from its neighbors

10. The IP street-level positioning apparatus based on graph neural network as claimed in claim 6, wherein the model training optimization module employs the following loss function:

wherein N(v)_train denotes all the nodes in the training data set,