CN115035349B - Point representation learning method, representation method and device of graph data and storage medium - Google Patents

Point representation learning method, representation method and device of graph data and storage medium

Info

Publication number
CN115035349B
CN115035349B (application CN202210736863.XA)
Authority
CN
China
Prior art keywords
node
self
subgraph
graph data
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210736863.XA
Other languages
Chinese (zh)
Other versions
CN115035349A (en)
Inventor
朱文武
王鑫
李昊阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202210736863.XA priority Critical patent/CN115035349B/en
Publication of CN115035349A publication Critical patent/CN115035349A/en
Application granted granted Critical
Publication of CN115035349B publication Critical patent/CN115035349B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761: Proximity, similarity or dissimilarity measures
    • G06V10/762: using clustering, e.g. of similar faces in social networks
    • G06V10/764: using classification, e.g. of video objects
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82: using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a point representation learning method, a representation method, a device, and a storage medium for graph data, and belongs to the technical field of data processing. The learning method comprises the following steps: inputting graph data serving as training samples into a preset node characterization model; processing the graph data through the preset node characterization model, and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data, wherein the stable self-subgraph is used for characterizing the stable characteristics of the node, and the unstable self-subgraph is used for characterizing the environmental information of the node; and the preset node characterization model learns the stable self-subgraphs of all nodes in the graph data to obtain a node characterization model after learning. The application aims to adaptively ensure the prediction effect of a model when a distribution difference exists between the test environment and the training environment.

Description

Point representation learning method, representation method and device of graph data and storage medium
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a point representation learning method, a representation method, a device and a storage medium of graph data.
Background
At present, graph data is already applied in various scenarios, such as social networks, traffic networks, Internet-of-Things networks and the like. However, graph data generally cannot be fed directly into a deep learning algorithm; only after the graph data has been vectorized into representations can the related tasks be solved on it.
Graph data representation learning is a very important research problem, and its core question is how to compute the representations of the nodes in graph data. Currently, common graph data representation learning mainly comprises three lines of work: graph neural network representation learning, node-number-generalized graph neural networks, and analysis of the expressive power of graph neural networks.
The vectorized node representations obtained by these methods fit the training environment, but graph data in the test environment often has a more complex data distribution. When the training data is insufficient to reflect the true distribution of the data, the test data and the training data differ in distribution; although a network for node characterization of graph data obtained by these methods achieves a good prediction effect on the training data set, its performance tends to drop significantly due to distribution migration when the network is actually applied in the test environment.
Therefore, how to adaptively ensure the prediction effect of the model when the test environment and the training environment have distribution differences is a problem to be solved.
Disclosure of Invention
The embodiment of the application provides a point representation learning method, a representation method, a device and a storage medium for graph data, which aim to adaptively ensure the prediction effect of a model when a distribution difference exists between the test environment and the training environment.
In a first aspect, an embodiment of the present application provides a method for learning a point representation of graph data under distribution migration, where the learning method includes:
inputting graph data serving as training samples into a preset node representation model;
Processing the graph data through the preset node characterization model, and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data, wherein the stable self-subgraph is used for characterizing the stable characteristics of the nodes, and the unstable self-subgraph is used for characterizing the environmental information of the nodes;
And the preset node characterization model learns the stable self-subgraphs of all nodes in the graph data to obtain a node characterization model after learning.
Optionally, the processing the graph data through the preset node characterization model, determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data, including:
the preset node characterization model determines a self-graph of each node in the graph data;
in the self-graph of any node, updating the node representation of the node according to the information of the neighbor node of the node;
According to the node characteristics updated by the node, respectively calculating the similarity between the node and a first-order neighbor node in the self-graph;
and determining a stable self-subgraph and an unstable self-subgraph corresponding to the node according to the similarity between the node and the first-order neighbor node in the self-graph.
Optionally, updating the node representation of the node according to the information of the neighboring node of the node, including:
carrying out neighbor aggregation on all node information in the self graph of the node;
And updating the information of the node by using the information of the neighbor nodes of the node to obtain updated node characterization.
Optionally, determining the stable self-subgraph and the unstable self-subgraph corresponding to the node according to the similarity between the node and the first-order neighbor nodes in its self-graph includes:
if the similarity between the node and any one of the first-order neighbor nodes is larger than a preset value, the edge between the two nodes is the edge in the stable self-subgraph;
and if the similarity between the node and any one of the first-order neighbor nodes is smaller than or equal to a preset value, the edge between the two nodes is the edge in the unstable self-subgraph.
Optionally, after determining the stable self-subgraph and the unstable self-subgraph of each node in the graph data, the learning method further includes:
Characterizing the stable self subgraph and the unstable self subgraph of each node to obtain the characteristics corresponding to the stable self subgraph and the unstable self subgraph, wherein the characteristics are used for describing the clustering characteristics of each subgraph;
clustering the unstable self-subgraphs of all nodes in the graph data, wherein the clustered unstable self-subgraphs represent the environment information of the corresponding nodes.
In a second aspect, an embodiment of the present application provides a method for characterizing points of graph data under distribution migration, where the characterizing method includes:
inputting graph data to be characterized into the node characterization model according to the first aspect of the embodiment;
Processing the graph data to be characterized through the node characterization model, and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data to be characterized, wherein the stable self-subgraph of one node is used for characterizing the stable characteristics of the node, and the unstable self-subgraph of one node is used for characterizing the environmental information of the node;
and the node characterization model predicts according to the stable self-subgraphs of all nodes in the graph data to be characterized, and outputs a node characterization result of the graph data to be characterized.
Optionally, the training samples of the node characterization model are graph data whose data distribution is the same as or different from that of the graph data to be characterized.
In a third aspect, an embodiment of the present application provides a point representation learning apparatus for graph data under distribution migration, the learning apparatus including:
the training input module is used for inputting graph data serving as a training sample into a preset node representation model;
The processing module is used for processing the graph data through the preset node characterization model and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data, wherein the stable self-subgraph is used for characterizing the stable characteristics of the nodes, and the unstable self-subgraph is used for characterizing the environmental information of the nodes;
And the learning module is used for enabling the preset node characterization model to learn the stable self-subgraphs of all nodes in the graph data, and obtaining the node characterization model after learning.
Optionally, the processing module includes:
the self-graph determining unit is used for determining a self-graph of each node in the graph data through the preset node characterization model;
The updating unit is used for updating the node representation of any node according to the information of the neighbor node of the node in the self-graph of the node;
the similarity calculation unit is used for calculating the similarity between the node and the first-order neighbor node in the self-graph according to the node representation updated by the node;
and the self-subgraph determining unit is used for determining a stable self-subgraph and an unstable self-subgraph corresponding to the node according to the similarity between the node and the first-order neighbor node in the self-graph.
Optionally, the updating unit includes:
An aggregation subunit, configured to perform neighbor aggregation on all node information in the self-graph of the node;
And the updating subunit is used for updating the information of the node by utilizing the information of the neighbor node of the node to obtain the updated node representation.
Optionally, the self subgraph determining unit includes:
A stable self-subgraph determination subunit, configured to, when the similarity between the node and any one of the first-order neighbor nodes is greater than a preset value, take an edge between two nodes as an edge in the stable self-subgraph;
and the unstable self-subgraph determination subunit is used for determining that the edge between the two nodes is the edge in the unstable self-subgraph when the similarity between the node and any one of the first-order neighbor nodes is smaller than or equal to a preset value.
Optionally, the learning device further includes:
the characterization module is used for performing characterization processing on the stable self-subgraph and the unstable self-subgraph of each node to obtain characteristics corresponding to the stable self-subgraph and the unstable self-subgraph, wherein the characteristics are used for describing clustering characteristics of each subgraph;
And the clustering module is used for clustering the unstable self-subgraphs of all the nodes in the graph data, and the clustered unstable self-subgraphs represent the environment information of the corresponding nodes.
In a fourth aspect, an embodiment of the present application provides a point characterization apparatus for graph data under distribution migration, the characterization apparatus including:
The input module is used for inputting graph data to be characterized into the node characterization model according to the first aspect of the embodiment;
The prediction module is used for processing the graph data to be characterized through the node characterization model and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data to be characterized, wherein the stable self-subgraph of one node is used for characterizing the stable characteristics of the node, and the unstable self-subgraph of one node is used for characterizing the environmental information of the node; and the node characterization model predicts according to the stable self-subgraphs of all nodes in the graph data to be characterized, and outputs a node characterization result of the graph data to be characterized.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the point representation learning method of graph data under distribution migration according to the first aspect of the embodiments, and/or implements the point representation method of graph data under distribution migration according to the second aspect of the embodiments.
The beneficial effects are that:
In the learning method, the graph data is processed at the granularity of individual nodes, so that the preset node characterization model can identify a stable self-subgraph and an unstable self-subgraph for each node: the stable self-subgraph represents the stable characteristics of the node, and the unstable self-subgraph represents the environmental information of the node. The preset node characterization model learns and trains only on the stable self-subgraphs that represent the stable characteristics of the nodes; that is, the preset node characterization model predicts according to the stable self-subgraphs of the nodes and outputs the characterization result of the graph data.
In other words, the learning method identifies, distinguishes and disentangles stable characteristic information and unstable environmental information in the process of characterizing the points of the graph data, and removes spurious correlations such as the unstable environmental information of the nodes, so that the model is guaranteed to predict according to the stable characteristics of the nodes in the graph data. When the test environment differs from the training environment, the node characterization model obtained through training can still remove the influence of unstable environmental factors in the data distribution on the node characterization and obtain a more accurate characterization and prediction result, so the prediction effect of the model can be adaptively ensured when the test environment and the training environment differ in distribution.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments of the present application will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart illustrating a point representation learning method of graph data under distribution migration according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a point characterization method of graph data under distribution migration according to an embodiment of the present application;
FIG. 3 is a functional block diagram of a point representation learning device of graph data under distribution migration according to an embodiment of the present application;
FIG. 4 is a functional block diagram of a point characterization device of graph data under distribution migration according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
As graph data is applied more and more widely, graph data representation learning has become a very important research problem, and its core question is how to compute the representations of the nodes in graph data. Currently, common graph data representation learning mainly comprises three lines of work: graph neural network representation learning, node-number-generalized graph neural networks, and analysis of the expressive power of graph neural networks.
The graph neural network representation learning has strong graph representation capability, achieves ideal performance on a plurality of tasks, mainly adopts a neighbor aggregation mechanism, and carries out iterative updating on node representation according to neighbor information.
Node-number-generalized graph neural networks give prediction results that are independent of the number of nodes in the graph-structured data, so a model trained on a small graph can be transferred to a large graph and achieve good results on it. However, this line of work is only suitable for the case where the number of nodes changes between the training environment and the test environment; it cannot adapt to other types of distribution change, especially distribution changes at the node level, and therefore cannot be used for the distribution-generalized node representation learning problem on graph data in real scenarios.
Work on the expressive power of graph neural networks theoretically analyzes their ability to solve tasks and gives theoretical guarantees when the distributions of the training set and the test set are close, but it cannot be applied when the training environment is inconsistent with the test environment; the resulting representations therefore cannot accurately describe graph-structured data in an unknown test environment, which affects the performance of downstream tasks to a certain extent.
The graph neural networks adopted by existing methods are based on the identical-distribution assumption, i.e., all node data are assumed to come from the same data distribution, and these methods focus on fitting the training data distribution. However, the data distribution of the graph data used at test time often deviates from the data distribution at training time, so how to adaptively ensure the prediction effect of the model when the test environment and the training environment have distribution differences is a problem to be solved.
FIG. 1 is a flowchart showing the steps of a point representation learning method of graph data under distribution migration in an embodiment of the present application; the method may specifically include the following steps:
S101: And inputting the graph data serving as a training sample into a preset node representation model.
The graph data comprises nodes and edges between the nodes, and each node carries information; for example, if the node represents a person, the information carried by the node may be personal information such as gender, hobbies and the like of the person, and if the node represents a medicine, the information carried by the node may be information such as composition, category and the like; of course, in different application environments, even though the nodes represent the same person, the information carried by them may be different.
In the learning and training process of the preset node characterization model, the number of graph data serving as a training sample can be determined according to the actual training requirement, and the method is not limited in the embodiment; the graph data as training samples carries labels, which are related to the task being performed, and the present embodiment is not limited thereto.
S102: and processing the graph data through the preset node characterization model, and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data.
Specifically, the process of determining the stable self-subgraph and the unstable self-subgraph for each node is as follows:
A1: and the preset node characterization model determines the self-graph of each node in the graph data.
The self-graph of a node is centered on the node itself and comprises its first-order and second-order neighbor nodes.
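As an illustration of step A1, the following is a minimal Python sketch, not the patented implementation: it assumes the graph data is held in an undirected networkx graph with nodes numbered 0..n-1, and the helper name ego_graphs is hypothetical.

```python
# Minimal sketch of step A1 (assumption: graph data stored as an undirected
# networkx graph): the "self-graph" of a node is its 2-hop ego-graph,
# i.e. the node plus its first- and second-order neighbors.
import networkx as nx

def ego_graphs(graph: nx.Graph, radius: int = 2) -> dict:
    """Return {node: ego-graph of that node within `radius` hops}."""
    return {v: nx.ego_graph(graph, v, radius=radius) for v in graph.nodes}

# Usage on a small example graph.
g = nx.karate_club_graph()
self_graphs = ego_graphs(g, radius=2)
print(self_graphs[0].number_of_nodes(), self_graphs[0].number_of_edges())
```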
A2: in the self-graph of any node, the node characterization of the node is updated according to the information of the neighbor nodes of the node.
Specifically, to improve the accuracy of the node characterization, neighbor aggregation is performed on the information of all nodes in the self-graph of the node: the information of the node's neighbor nodes is integrated into the node by direct summation or weighted summation, and the node's own information is updated with the information of its neighbor nodes to obtain the updated node characterization. Each iteration aggregates only first-order neighbors, so aggregation over higher-order neighbors can be achieved through multiple iterations.
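The sketch below illustrates this aggregation step under the assumption of a simple mean aggregation with an equal-weight mixing of a node's own features and its neighbors' mean; an actual model may use any graph neural network layer (weighted summation, attention, and the like).

```python
# Sketch of step A2 (assumptions: nodes numbered 0..n-1, mean aggregation).
# Each iteration aggregates only first-order neighbors, so running several
# iterations propagates information from higher-order neighbors.
import numpy as np
import networkx as nx

def aggregate(graph: nx.Graph, feats: np.ndarray, hops: int = 2) -> np.ndarray:
    h = feats.copy()
    for _ in range(hops):
        new_h = np.empty_like(h)
        for v in graph.nodes:
            nbrs = list(graph.neighbors(v))
            nbr_mean = h[nbrs].mean(axis=0) if nbrs else np.zeros_like(h[v])
            new_h[v] = 0.5 * h[v] + 0.5 * nbr_mean  # update the node with neighbor info
        h = new_h
    return h
```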
A3: and respectively calculating the similarity between the node and the first-order neighbor node in the self-graph according to the node representation updated by the node.
A4: and determining a stable self-subgraph and an unstable self-subgraph corresponding to the node according to the similarity between the node and the first-order neighbor node in the self-graph.
Specifically, if the similarity between the node and any one of the first-order neighbor nodes is greater than a preset value, the edge between the two nodes is the edge in the stable self-subgraph; and if the similarity between the node and any one of the first-order neighbor nodes is smaller than or equal to a preset value, the edge between the two nodes is the edge in the unstable self-subgraph.
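A hedged sketch of steps A3-A4 is given below; the cosine similarity measure and the threshold value 0.5 are illustrative assumptions, since the description only requires some similarity compared against a preset value.

```python
# Sketch of steps A3-A4 (assumptions: cosine similarity, threshold 0.5,
# node features `h` as produced by the aggregation step above).
import numpy as np
import networkx as nx

def split_edges(graph: nx.Graph, h: np.ndarray, threshold: float = 0.5):
    """Return per-node lists of stable and unstable edges to 1-hop neighbors."""
    stable, unstable = {}, {}
    for v in graph.nodes:
        s_edges, u_edges = [], []
        for u in graph.neighbors(v):
            denom = np.linalg.norm(h[v]) * np.linalg.norm(h[u]) + 1e-12
            sim = float(h[v] @ h[u] / denom)
            (s_edges if sim > threshold else u_edges).append((v, u))
        stable[v], unstable[v] = s_edges, u_edges
    return stable, unstable
```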
In the actual implementation process, a first graph neural network can be set in the preset node characterization model, the steps A1-A4 are executed through the first graph neural network, and the output of the first graph neural network is the stable self-subgraph and the unstable self-subgraph of each node in the graph data.
In a possible implementation manner, after the stable self-subgraph and the unstable self-subgraph of each node are determined, the stable self-subgraph and the unstable self-subgraph of each node are subjected to characterization processing through a second graph neural network arranged in a preset node characterization model, so that the characteristics corresponding to the stable self-subgraph and the unstable self-subgraph are obtained.
The stable self-subgraph captures or characterizes the stable characteristics of a node, while the unstable self-subgraph focuses more on the changing information of the environment around the node; the environmental information of the nodes can be inferred by aggregating the unstable self-subgraphs of all nodes in the graph data.
The features corresponding to the stable and unstable self-subgraphs can be used to describe the clustering characteristic of each subgraph. The stable self-subgraphs generally do not exhibit a clustering characteristic, whereas the edges in the unstable self-subgraphs often do cluster: several classes are constructed during clustering, each class contains similar unstable self-subgraphs, and different classes represent dissimilar unstable self-subgraphs.
When the unstable self-subgraphs of all nodes are clustered, the clustering is encouraged to place as many edges as possible within the same cluster and as few edges as possible across clusters, i.e., the edges of the unstable self-subgraphs are generalized within the same cluster as much as possible; the clustered unstable self-subgraphs can then be used to determine the environmental information of a node.
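As an illustration of this clustering step, the sketch below summarizes each node's unstable self-subgraph by the mean representation of its unstable-edge endpoints and groups the summaries with k-means; the summary function, the use of scikit-learn, and the number of environment clusters are illustrative assumptions rather than the patented procedure.

```python
# Sketch of clustering the unstable self-subgraphs (assumptions: each
# unstable subgraph is summarized by the mean of its endpoint features,
# and k-means with a fixed number of environment clusters is used).
import numpy as np
from sklearn.cluster import KMeans

def cluster_unstable(unstable: dict, h: np.ndarray, n_env: int = 3) -> dict:
    """Return {node: inferred environment id} from its unstable edges."""
    nodes, embs = [], []
    for v, edges in unstable.items():
        if not edges:
            continue  # no unstable edges, no environment evidence for this node
        endpoints = [u for _, u in edges]
        nodes.append(v)
        embs.append(h[endpoints].mean(axis=0))
    labels = KMeans(n_clusters=n_env, n_init=10).fit_predict(np.stack(embs))
    return dict(zip(nodes, labels.tolist()))
```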
The prediction error of a node characterization model is mainly caused by the unstable self-subgraphs, while a general node characterization model predicts accurately on the stable self-subgraphs. Therefore, by identifying, distinguishing and disentangling the stable and unstable self-subgraphs of the nodes, the method removes spurious correlations such as the unstable environmental information, so that the node characterization model learns and predicts according to the stable characteristics of the nodes in the graph data.
S103: and the preset node characterization model learns the stable self-subgraphs of all nodes in the graph data to obtain a node characterization model after learning.
In the actual implementation process, a regularizer is adopted to encourage the preset node characterization model to learn to make stable predictions across multiple distribution environments according to the stable self-subgraphs, so that the node characterization model focuses more on the truly predictive information carried by the stable self-subgraphs, ignores the noise information contained in the unstable self-subgraphs, and can thus predict node-level tasks under distribution migration.
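One way to realize such a regularizer, sketched below under the assumption of a cross-entropy node-classification task in PyTorch, is to penalize the variance of the loss across the inferred environments; this variance penalty is a common invariance regularizer and is an illustrative stand-in, not necessarily the exact regularizer of this application.

```python
# Sketch of a training objective with an invariance regularizer
# (assumptions: PyTorch, cross-entropy node classification, environment
# ids obtained from the clustering of unstable self-subgraphs).
import torch

def regularized_loss(logits: torch.Tensor, labels: torch.Tensor,
                     env_ids: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    per_node = torch.nn.functional.cross_entropy(logits, labels, reduction="none")
    env_risks = torch.stack([per_node[env_ids == e].mean() for e in env_ids.unique()])
    penalty = env_risks.var() if env_risks.numel() > 1 else env_risks.new_zeros(())
    return env_risks.mean() + lam * penalty  # task loss + cross-environment stability
```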
Specifically, parameters related to training in the training process, such as a loss function, a training batch, etc., may be adaptively set according to the actual task performed, which is not limited in this embodiment.
The node characterization model in this embodiment can be applied to different scenarios. Taking drug analysis as an example, training can be performed on a small number of labeled molecular graphs, and the trained model can then be applied to classify a large number of unlabeled drugs at a larger scale and with a data distribution different from that of training. Taking social network analysis as an example, node representation learning under distribution migration can also analyze dynamic, evolving data and give sufficiently stable and generalized results. The node characterization model can also provide important help for human-computer interaction, computer-aided systems, trustworthy artificial intelligence, and the like.
Referring to FIG. 2, which shows a flowchart of the steps of a point characterization method of graph data under distribution migration in an embodiment of the present application, the method may specifically include the following steps:
s201: and inputting the graph data to be characterized into the node characterization model in the embodiment.
After the node characterization model is obtained through training, it can be used directly to predict the characterization result of the graph data to be characterized. The training samples of the node characterization model are graph data whose data distribution is the same as or different from that of the graph data to be characterized; that is, the data distribution of the graph data used in training may deviate or differ from that of the graph data to be characterized.
S202: and processing the graph data to be characterized through the node characterization model, and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data to be characterized.
The node characterization model identifies, distinguishes and disentangles the stable self-subgraph and the unstable self-subgraph of each node in the graph data to be characterized; the stable self-subgraph of a node is used to characterize the node's stable characteristics, and the unstable self-subgraphs of the nodes in the graph data to be characterized are clustered so as to infer the environmental information of each node.
S203: and the node characterization model predicts according to the stable self-subgraphs of all nodes in the graph data to be characterized, and outputs a node characterization result of the graph data to be characterized.
Even when the test environment differs from the training environment, the node characterization model can still remove the influence of unstable environmental factors in the data distribution on the node characterization; by predicting only according to the stable self-subgraphs in the graph data to be characterized, the obtained node characterization result of the graph data to be characterized is more accurate.
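A minimal sketch of this inference step is given below; it assumes the stable edges produced by the earlier edge-splitting step, a simple mean aggregation restricted to the stable subgraph, and a linear prediction head, all of which are illustrative stand-ins for the trained model's components.

```python
# Sketch of S202-S203 at inference time (assumptions: stable edges per node
# as returned by the edge-splitting step, nodes numbered 0..n-1, a linear
# prediction head `weight` standing in for the trained predictor).
import numpy as np
import networkx as nx

def predict_on_stable(graph: nx.Graph, feats: np.ndarray,
                      stable: dict, weight: np.ndarray, hops: int = 2) -> np.ndarray:
    stable_g = nx.Graph()
    stable_g.add_nodes_from(graph.nodes)
    for edges in stable.values():
        stable_g.add_edges_from(edges)  # keep only stable-subgraph edges
    h = feats.copy()
    for _ in range(hops):  # mean aggregation restricted to stable edges
        new_h = np.empty_like(h)
        for v in stable_g.nodes:
            nbrs = list(stable_g.neighbors(v))
            nbr_mean = h[nbrs].mean(axis=0) if nbrs else np.zeros_like(h[v])
            new_h[v] = 0.5 * h[v] + 0.5 * nbr_mean
        h = new_h
    return h @ weight  # node characterization / prediction scores
```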
Unlike the prior art, the node characterization model provided in this embodiment identifies, distinguishes and disentangles stable characteristic information and unstable environmental information in the process of characterizing the points of the graph data, and removes the spurious correlation of the unstable environmental information, so that the model predicts according to the stable characteristics of the nodes in the graph data. Even when the graph data input during training and testing differ in data distribution, the model can adaptively ensure the prediction effect; it can adapt and generalize to a test environment different from the training environment, and gives sufficiently stable and generalized output results on out-of-distribution test data.
Referring to FIG. 3, there is shown a functional block diagram of a point representation learning apparatus of graph data under distribution migration in an embodiment of the present application, the learning apparatus comprising:
the training input module 101 is configured to input graph data serving as a training sample into a preset node characterization model;
The processing module 102 is configured to process the graph data through the preset node characterization model, and determine a stable self-subgraph and an unstable self-subgraph of each node in the graph data, where the stable self-subgraph is used to characterize a stable characteristic of the node, and the unstable self-subgraph is used to characterize environmental information of the node;
And the learning module 103 is configured to enable the preset node characterization model to learn the stable self-subgraphs of all nodes in the graph data, so as to obtain a node characterization model after learning.
Optionally, the processing module includes:
the self-graph determining unit is used for determining a self-graph of each node in the graph data through the preset node characterization model;
The updating unit is used for updating the node representation of any node according to the information of the neighbor node of the node in the self-graph of the node;
the similarity calculation unit is used for calculating the similarity between the node and the first-order neighbor node in the self-graph according to the node representation updated by the node;
and the self-subgraph determining unit is used for determining a stable self-subgraph and an unstable self-subgraph corresponding to the node according to the similarity between the node and the first-order neighbor node in the self-graph.
Optionally, the updating unit includes:
An aggregation subunit, configured to perform neighbor aggregation on all node information in the self-graph of the node;
And the updating subunit is used for updating the information of the node by utilizing the information of the neighbor node of the node to obtain the updated node representation.
Optionally, the self subgraph determining unit includes:
A stable self-subgraph determination subunit, configured to, when the similarity between the node and any one of the first-order neighbor nodes is greater than a preset value, take an edge between two nodes as an edge in the stable self-subgraph;
and the unstable self-subgraph determination subunit is used for determining that the edge between the two nodes is the edge in the unstable self-subgraph when the similarity between the node and any one of the first-order neighbor nodes is smaller than or equal to a preset value.
Optionally, the learning device further includes:
the characterization module is used for performing characterization processing on the stable self-subgraph and the unstable self-subgraph of each node to obtain characteristics corresponding to the stable self-subgraph and the unstable self-subgraph, wherein the characteristics are used for describing clustering characteristics of each subgraph;
And the clustering module is used for clustering the unstable self-subgraphs of all the nodes in the graph data, and the clustered unstable self-subgraphs represent the environment information of the corresponding nodes.
Referring to FIG. 4, there is shown a functional block diagram of a point characterization apparatus of graph data under distribution migration in an embodiment of the present application, the characterization apparatus comprising:
An input module 201, configured to input graph data to be characterized into the node characterization model described in the embodiment;
The prediction module 202 is configured to process the graph data to be characterized through the node characterization model, and determine a stable self-subgraph and an unstable self-subgraph of each node in the graph data to be characterized, where the stable self-subgraph of one node is used for characterizing the stable characteristics of the node, and the unstable self-subgraph of one node is used for characterizing the environmental information of the node; and the node characterization model predicts according to the stable self-subgraphs of all nodes in the graph data to be characterized, and outputs a node characterization result of the graph data to be characterized.
The embodiment of the application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the point representation learning method of graph data under distribution migration described in the embodiments and/or the point representation method of graph data under distribution migration described in the embodiments.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or terminal device that comprises the element.
The principles and embodiments of the present application have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the methods of the present application and the core ideas thereof; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.

Claims (9)

1. A method for learning point representation of graph data under distribution migration, the method comprising:
inputting graph data serving as training samples into a preset node representation model;
Processing the graph data through the preset node characterization model, and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data, wherein the method comprises the following steps: the preset node characterization model determines a self-graph of each node in the graph data; in the self-graph of any node, updating the node representation of the node according to the information of the neighbor node of the node; according to the node characteristics updated by the node, respectively calculating the similarity between the node and a first-order neighbor node in the self-graph; determining a stable self-subgraph and an unstable self-subgraph corresponding to the node according to the similarity between the node and a first-order neighbor node in the self-graph; the stable self subgraph is used for representing the stable characteristics of the nodes, and the unstable self subgraph is used for representing the environmental information of the nodes;
And the preset node characterization model learns the stable self-subgraphs of all nodes in the graph data to obtain a node characterization model after learning.
2. The learning method of claim 1 wherein updating the node representation of the node based on information of neighboring nodes of the node comprises:
carrying out neighbor aggregation on all node information in the self graph of the node;
And updating the information of the node by using the information of the neighbor nodes of the node to obtain updated node characterization.
3. The learning method according to claim 1, wherein determining a stable self-subgraph and an unstable self-subgraph corresponding to the node according to the similarity between the node and a first-order neighbor node in the self-graph thereof includes:
if the similarity between the node and any one of the first-order neighbor nodes is larger than a preset value, the edge between the two nodes is the edge in the stable self-subgraph;
and if the similarity between the node and any one of the first-order neighbor nodes is smaller than or equal to a preset value, the edge between the two nodes is the edge in the unstable self-subgraph.
4. A learning method according to any one of claims 1-3, wherein after determining a stable self-subgraph and an unstable self-subgraph for each node in the graph data, the learning method further comprises:
Characterizing the stable self subgraph and the unstable self subgraph of each node to obtain the characteristics corresponding to the stable self subgraph and the unstable self subgraph, wherein the characteristics are used for describing the clustering characteristics of each subgraph;
clustering the unstable self-subgraphs of all nodes in the graph data, wherein the clustered unstable self-subgraphs represent the environment information of the corresponding nodes.
5. A method for point characterization of graph data under distribution migration, the characterization method comprising:
Inputting graph data to be characterized into a node characterization model according to any one of claims 1-4;
Processing the graph data to be characterized through the node characterization model, and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data to be characterized, wherein the stable self-subgraph of one node is used for characterizing the stable characteristics of the node, and the unstable self-subgraph of one node is used for characterizing the environmental information of the node;
and the node characterization model predicts according to the stable self-subgraphs of all nodes in the graph data to be characterized, and outputs a node characterization result of the graph data to be characterized.
6. The method of claim 5, wherein the training samples of the node characterization model are: and the graph data has the same or different data distribution with the graph data to be characterized.
7. A point representation learning device of graph data under distribution migration, characterized in that the learning device comprises:
the training input module is used for inputting graph data serving as a training sample into a preset node representation model;
The processing module is configured to process the graph data through the preset node characterization model, determine a stable self-subgraph and an unstable self-subgraph of each node in the graph data, and include: the preset node characterization model determines a self-graph of each node in the graph data; in the self-graph of any node, updating the node representation of the node according to the information of the neighbor node of the node; according to the node characteristics updated by the node, respectively calculating the similarity between the node and a first-order neighbor node in the self-graph; determining a stable self-subgraph and an unstable self-subgraph corresponding to the node according to the similarity between the node and a first-order neighbor node in the self-graph; the stable self subgraph is used for representing the stable characteristics of the nodes, and the unstable self subgraph is used for representing the environmental information of the nodes;
And the learning module is used for enabling the preset node characterization model to learn the stable self-subgraphs of all nodes in the graph data, and obtaining the node characterization model after learning.
8. A point characterization device of graph data under distribution migration, the characterization device comprising:
an input module for inputting graph data to be characterized into the node characterization model according to any one of claims 1-4;
The prediction module is used for processing the graph data to be characterized through the node characterization model and determining a stable self-subgraph and an unstable self-subgraph of each node in the graph data to be characterized, wherein the stable self-subgraph of one node is used for characterizing the stable characteristics of the node, and the unstable self-subgraph of one node is used for characterizing the environmental information of the node; and the node characterization model predicts according to the stable self-subgraphs of all nodes in the graph data to be characterized, and outputs a node characterization result of the graph data to be characterized.
9. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the point representation learning method of graph data under distribution migration according to any one of claims 1 to 4 and/or implements the point representation method of graph data under distribution migration according to any one of claims 5 to 6.
CN202210736863.XA 2022-06-27 2022-06-27 Point representation learning method, representation method and device of graph data and storage medium Active CN115035349B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210736863.XA CN115035349B (en) 2022-06-27 2022-06-27 Point representation learning method, representation method and device of graph data and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210736863.XA CN115035349B (en) 2022-06-27 2022-06-27 Point representation learning method, representation method and device of graph data and storage medium

Publications (2)

Publication Number Publication Date
CN115035349A CN115035349A (en) 2022-09-09
CN115035349B (en) 2024-06-18

Family

ID=83125989

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210736863.XA Active CN115035349B (en) 2022-06-27 2022-06-27 Point representation learning method, representation method and device of graph data and storage medium

Country Status (1)

Country Link
CN (1) CN115035349B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673708A (en) * 2020-05-13 2021-11-19 希捷科技有限公司 Distributed decentralized machine learning model training

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021162481A1 (en) * 2020-02-11 2021-08-19 Samsung Electronics Co., Ltd. Electronic device and control method thereof
CN112380344B (en) * 2020-11-19 2023-08-22 平安科技(深圳)有限公司 Text classification method, topic generation method, device, equipment and medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673708A (en) * 2020-05-13 2021-11-19 希捷科技有限公司 Distributed decentralized machine learning model training

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Graph Convolutional Neural Network with Inter-layer Cascade Based on Attention Mechanism; WEI, Lu et al.; Proceedings of CCIS2021; 2021-12-31; full text *

Also Published As

Publication number Publication date
CN115035349A (en) 2022-09-09

Similar Documents

Publication Publication Date Title
CN112364880B (en) Omics data processing method, device, equipment and medium based on graph neural network
CN109816032A (en) Zero sample classification method and apparatus of unbiased mapping based on production confrontation network
CN110598869B (en) Classification method and device based on sequence model and electronic equipment
CN111080304A (en) Credible relationship identification method, device and equipment
CN111832312A (en) Text processing method, device, equipment and storage medium
Abdelbari et al. A computational intelligence‐based method to ‘learn’causal loop diagram‐like structures from observed data
Mehrotra et al. Multiclass classification of mobile applications as per energy consumption
JP2018151578A (en) Determination device, determination method, and determination program
Zhao et al. High-dimensional linear regression via implicit regularization
CN111209105A (en) Capacity expansion processing method, capacity expansion processing device, capacity expansion processing equipment and readable storage medium
CN117215728B (en) Agent model-based simulation method and device and electronic equipment
CN117194771B (en) Dynamic knowledge graph service recommendation method for graph model characterization learning
CN105357583A (en) Method and device for discovering interest and preferences of intelligent television user
CN115035349B (en) Point representation learning method, representation method and device of graph data and storage medium
CN112100509A (en) Information recommendation method, device, server and storage medium
CN111813941A (en) Text classification method, device, equipment and medium combining RPA and AI
Jia et al. Prediction of Web Services Reliability Based on Decision Tree Classification Method.
CN114092162B (en) Recommendation quality determination method, and training method and device of recommendation quality determination model
CN113239272B (en) Intention prediction method and intention prediction device of network management and control system
US20230152787A1 (en) Performance optimization of complex industrial systems and processes
CN115510318A (en) Training method of user characterization model, user characterization method and device
CN114861004A (en) Social event detection method, device and system
CN112801156B (en) Business big data acquisition method and server for artificial intelligence machine learning
CN109299321B (en) Method and device for recommending songs
CN114218487A (en) Video recommendation method, system, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant