CN114756713A - Graph representation learning method based on multi-source interaction fusion - Google Patents

Publication number
CN114756713A
Authority
CN
China
Prior art keywords
node
information
nodes
path
order
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210267016.3A
Other languages
Chinese (zh)
Inventor
朱东杰
孙云栋
张星东
丁卓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Longyuan Information Technology Co ltd
Harbin Institute of Technology Weihai
Original Assignee
Nanjing Longyuan Information Technology Co ltd
Harbin Institute of Technology Weihai
Application filed by Nanjing Longyuan Information Technology Co ltd and Harbin Institute of Technology Weihai
Priority to CN202210267016.3A
Publication of CN114756713A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9024: Graphs; Linked lists
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00: Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03: Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a graph representation learning method based on multi-source interaction fusion, comprising the following steps: extracting the node attributes, node categories and adjacency relations among nodes of a network in graph-structure form; obtaining a BFS high-order neighborhood node set and a DFS high-order neighborhood node set with BFS-based and DFS-based meta-path high-order neighborhood node sampling algorithms, respectively; acquiring the first-order neighborhood information of each node through a first-order neighborhood information aggregation algorithm; acquiring the high-order neighborhood information of each node through a heterogeneous high-order neighborhood information aggregation algorithm; fusing each node's own information with its high-order and first-order neighborhood information, using a multi-source information fusion model based on a gated neural network, to obtain the node's multi-source interaction fusion information as its final vector representation; and optimizing the parameters of the algorithm model under a multi-task optimization function. The method improves the extraction of information from the nodes inside a meta-path and greatly enhances the capture of neighborhood information at different levels.

Description

Graph representation learning method based on multi-source interaction fusion
Technical Field
The invention relates to the technical field of graph data representation learning and graph data mining, in particular to a graph representation learning method based on multi-source interaction fusion.
Background
Many complex systems, such as social networks, biological networks and information networks, take the form of graph structures. Network data is often complex and therefore difficult to process, mainly because of high computational complexity, low parallelism, and the difficulty of applying existing machine learning and deep learning methods. To process network data more efficiently, the primary challenge is to find an efficient network data representation that lets downstream data analysis tasks, such as data mining, analysis and prediction, be completed efficiently within limited space and time. Graph representation learning is a promising representation method for graph-structured data that supports a range of graph processing and analysis tasks, such as node classification, node clustering, graph visualization and link prediction. Compared with traditional graph data representations, it has three advantages. First, it represents graph nodes and their relations as vectors in a lower-dimensional space, which reduces dimensionality, lowers storage cost and improves computational efficiency. Second, it can remove noise and redundant information while preserving the structure and topology of the graph, which improves the extraction and mining of latent data features. Most importantly, the distance between node vectors can be used to measure the relation between nodes, and the vectors can be computed in parallel and fed to state-of-the-art machine learning and deep learning algorithms, which greatly widens the range of application scenarios.
Therefore, how to represent graph-structured data efficiently has become a hot research topic in recent years and has given rise to the field of graph representation learning. In addition, in real-world graphs the attribute information of nodes and the association relations between nodes are complex and diverse, so mining and learning this complex information efficiently and comprehensively has important research value for downstream tasks.
Existing graph representation learning methods have achieved great success in node classification, link prediction, recommendation systems, group discovery and other fields, and the emergence of the graph neural network (GNN) has further improved network representation learning. GNN-based network representation learning methods aggregate the information of nodes directly adjacent to a central node well, but existing GNN-based methods cannot directly aggregate high-order neighbor information within a single layer. They can aggregate distant neighbor information by iterating over multiple layers, but this approach has high complexity, loses information that is passed on indirectly, and introduces a large amount of noise during iteration. In addition, existing GNN-based methods do not fully consider or hierarchically distinguish the relations between nodes, although in real scenarios relations of different levels and different semantics have an important influence on the representation result. Finally, most existing methods for heterogeneous graph data are based on meta-path strategies, which extract different relations by defining node interaction patterns, but they do not include the nodes inside a meta-path in information aggregation. For example, the APA meta-path (A stands for author, P stands for paper) in an academic citation network is used to find two authors who jointly published the same paper; the meta-path considers only the two authors and ignores the information of the jointly published paper, so a large amount of information is lost and accurate mining of the relation between the two authors suffers.
Disclosure of Invention
The invention provides a graph representation learning method based on multi-source interaction fusion, which aims to solve the problems that the existing graph representation learning technology cannot directly aggregate high-order neighbor information in a single layer, does not fully consider and distinguish the relationship between nodes in a hierarchical mode, does not bring nodes in a meta path into information aggregation when processing meta path information of heterogeneous graph data and the like.
To achieve this purpose, the technical solution of the invention is as follows:
A graph representation learning method based on multi-source interaction fusion comprises the following steps:
extracting node attributes, node categories and adjacency relations among nodes in a network in graph-structure form;
dividing the neighbor nodes of each node into directly adjacent first-order neighborhood nodes and indirectly adjacent high-order neighborhood nodes based on the adjacency relations among the nodes;
obtaining a BFS high-order neighborhood node set with a BFS-based meta-path high-order neighborhood node sampling algorithm, and obtaining a DFS high-order neighborhood node set with a DFS-based meta-path high-order neighborhood node sampling algorithm;
acquiring first-order neighborhood information of each node through a first-order neighborhood information aggregation algorithm;
acquiring high-order neighborhood information of each node through a heterogeneous high-order neighborhood information aggregation algorithm;
fusing each node's own information with its high-order and first-order neighborhood information, using a multi-source information fusion model based on a gated neural network, to obtain the node's multi-source interaction fusion information as its final vector representation;
and adjusting the parameters of the algorithm model under the multi-task optimization function until the iteration limit or the precision requirement is met.
Preferably, in the BFS-based meta-path high-order neighborhood node sampling algorithm, each layer of sampling follows the meta-path pattern, and every node passed in an intermediate step is retained.
Preferably, in the DFS-based meta-path high-order neighborhood node sampling algorithm, each step of the walk follows the meta-path pattern, and every node passed in an intermediate step is retained; the generation strategy formula is as follows:
Figure BDA0003552230020000021
where the random function represents the walk-with-memory strategy; $v_i$ is the currently visited node and $v_{i+1}$ is a candidate next node; $E$ is the set of all edges in the graph; $R_i \in R$ ($0 \le i < L_R$) is the $i$-th node type in the meta-path pattern; and $L_R$ is the length of the meta-path.
Preferably, the first-order neighborhood information aggregation algorithm retains the structural information aggregation capability of GNNs and the information transfer characteristics between network nodes, adds the relations between nodes, and defines a new node update strategy, as follows:
Figure BDA0003552230020000031
where
Figure BDA0003552230020000032
represents the first-order out-neighbors of node $i$, i.e., the nodes $t$ for which there is an edge pointing from node $i$ to node $t$;
Figure BDA0003552230020000033
represents the first-order in-neighbors of node $i$, i.e., the nodes $t'$ for which there is an edge pointing from node $t'$ to node $i$;
Figure BDA0003552230020000034
is the vector representation of a node in layer $l$ of the neural network; $d(l)$ is the vector dimension of nodes in layer $l$; $W^{l}$ is the learnable weight matrix of layer $l$; and different attention parameters are set for the different edge directions:
Figure BDA0003552230020000035
and
Figure BDA0003552230020000036
Preferably, acquiring the high-order neighborhood information of each node through the heterogeneous high-order neighborhood information aggregation algorithm specifically comprises the following steps:
first, when the node information of each path is aggregated, attention is paid to the node information at the two endpoints of the path, and the nodes passed along the path are also included in the path information aggregation; all node information in each meta-path is aggregated with the proposed Inner-Attention GNN network;
then, the proposed Inter-Attention GNN network fuses the information of the meta-path sequences with different attention weights. The Inner-Attention GNN aggregation function is as follows:
Figure BDA0003552230020000037
where $N_{bl}(i)$ is the set of neighbors on the $dl$-th path that node $i$ acquires through the DFS strategy;
Figure BDA0003552230020000038
is the information aggregated within the $dl$-th path neighborhood of node $i$ in layer $l$ of the neural network;
Figure BDA0003552230020000039
is a learnable network parameter matrix; and $\alpha_{ij}$ is the learnable attention weight between node $i$ and node $j$, calculated as follows:
Figure BDA00035522300200000310
where
Figure BDA00035522300200000311
is the learnable attention network parameter of the $dl$-th path, and $[\cdot]$ is the vector concatenation operation;
the attention weights are then normalized with the SoftMax function:
Figure BDA00035522300200000312
finally, the Inter-Attention GNN network aggregation function is:
Figure BDA00035522300200000313
where $DL$ is a manually set hyper-parameter representing the maximum number of paths under the DFS strategy;
Figure BDA00035522300200000314
is a learnable neural network parameter; and
Figure BDA00035522300200000315
is the attention weight of the $dl$-th path neighborhood of node $i$, obtained by training the Inter-Attention GNN.
Preferably, the fusion function of the gated neural network-based multi-source information fusion model is:
Figure 5
where
Figure BDA0003552230020000042
$m$ and $b$ are learnable parameters, and
Figure BDA0003552230020000043
and
Figure BDA0003552230020000044
are, respectively, the high-order neighborhood information under the BFS strategy and the high-order neighborhood information under the DFS strategy in layer $l$ of the model.
Preferably, the multi-task optimization function combines an adjacency optimization task and a node label prediction task:
$L = \omega_1 L_1 + (1 - \omega_1) L_2$
where $\omega_1$ is a hyper-parameter representing the weight of the main task. The adjacency optimization task serves as the main task, and its optimization function is:
Figure 100002_1
The node label prediction task serves as the auxiliary task, and its optimization function is:
Figure BDA0003552230020000046
where $Y$ is the set of node labels in the training set; $t_i$ indicates that the true label of the node is $i$; and $y_i$ indicates whether the predicted label of the node is $i$: $y_i$ is 1 if so, and 0 otherwise.
The parameters of the model are optimized by continuously minimizing the multi-task optimization function with gradient back-propagation.
Based on the above technical solution, the beneficial effects of the invention are as follows. The neighbor nodes of each node in the graph are divided into directly adjacent first-order neighborhood nodes and indirectly adjacent high-order neighborhood nodes, and for the high-order neighborhood nodes a BFS high-order neighborhood node set and a DFS high-order neighborhood node set are obtained with the proposed BFS-based and DFS-based meta-path high-order neighborhood node sampling algorithms, respectively. The high-order and first-order neighborhood information of each node is then obtained with the proposed heterogeneous high-order neighborhood information aggregation algorithm, which includes the information of the nodes inside each path, and with the first-order neighborhood information aggregation algorithm. Finally, a gated neural network fuses each node's own information with its high-order and first-order neighborhood information to obtain the node's multi-source interaction fusion information as its final vector representation, and the whole process is optimized under multiple tasks. The method addresses the insufficient ability of existing graph neural networks to capture distant neighborhood nodes, improves the extraction of information from the nodes inside a meta-path, and greatly enhances the capture of neighborhood information at different levels.
Drawings
FIG. 1 is a flow diagram of the graph representation learning method based on multi-source interaction fusion in one embodiment;
FIG. 2 is a schematic diagram of a first-order neighborhood information aggregation method in one embodiment;
FIG. 3 is a diagram of a hierarchical attention neighborhood information aggregation algorithm in one embodiment;
FIG. 4 is a diagram illustrating a heterogeneous path information aggregation method based on meta-paths in an embodiment.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
This embodiment is described with reference to FIG. 1. The graph representation learning method based on multi-source interaction fusion provided by this embodiment specifically includes the following steps:
Step S1: read the graph data, i.e., a network in graph-structure form (a social network, information network, technical network or biological network), to obtain the node attribute matrix $H_{n \times d}$, the node adjacency matrices under the different interaction modes
Figure BDA0003552230020000051
and the node class label matrix $L_{n \times c}$, where $n$ is the number of nodes in the graph, $d$ is the dimension of the initial node attributes, $r$ is an interaction mode, i.e., a meta-path, and $c$ is the number of node classes. The neighbor nodes of each node in the graph are then divided into directly adjacent first-order neighborhood nodes and indirectly adjacent high-order neighborhood nodes.
Step S2: for the high-order neighborhood nodes, the proposed BFS-based meta-path high-order neighborhood node sampling algorithm is used to obtain the BFS high-order neighborhood node set $V_B$. Specifically, by setting different path lengths, the neighborhood nodes of different layers, and their information, are acquired under all relation modes according to the BFS strategy.
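As an illustration only, the layered BFS sampling of step S2 can be sketched in Python as follows. The function name `bfs_metapath_layers`, the dictionary-based graph, the `node_type` map and the rule that an already visited node is not added to a later layer are assumptions made for this sketch; the embodiment does not specify them.

```python
def bfs_metapath_layers(adj, node_type, start, metapath):
    """Expand `start` layer by layer along a meta-path pattern.

    adj:       dict mapping a node to its directly adjacent nodes
    node_type: dict mapping a node to its type, e.g. "A" or "P"
    metapath:  sequence of node types, e.g. "APA"
    Returns one set of nodes per layer; intermediate layers are kept.
    """
    if node_type[start] != metapath[0]:
        return []
    layers = [{start}]
    seen = {start}  # assumption: a node is assigned to one layer only
    for wanted in metapath[1:]:
        frontier = {
            v
            for u in layers[-1]
            for v in adj.get(u, ())
            if node_type[v] == wanted and v not in seen
        }
        if not frontier:
            break
        seen |= frontier
        layers.append(frontier)
    return layers


# Toy author/paper graph (hypothetical data).
adj = {"a1": ["p1"], "p1": ["a1", "a2"], "a2": ["p1", "p2"], "p2": ["a2"]}
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P"}
layers = bfs_metapath_layers(adj, node_type, "a1", "APA")
print(layers)  # [{'a1'}, {'p1'}, {'a2'}]
```

Keeping the intermediate layer (the paper `p1` here) is the point of the sketch: the paper on an author-paper-author path stays available for later aggregation.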
The proposed DFS-based meta-path high-order neighborhood node sampling algorithm is used to obtain the DFS high-order neighborhood node set $V_D$. Specifically, by setting different path lengths, the heterogeneous meta-path nodes under all relation modes are acquired according to the DFS strategy and the information of the intermediate nodes is recorded. The generation strategy formula is as follows:
Figure BDA0003552230020000052
where the random function represents the walk-with-memory strategy; $v_i$ is the currently visited node and $v_{i+1}$ is a candidate next node; $E$ is the set of all edges in the graph; $R_i \in R$ ($0 \le i < L_R$) is the $i$-th node type in the meta-path pattern; and $L_R$ is the length of the meta-path.
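A minimal Python sketch of a meta-path-guided walk is given below for illustration. It assumes that "walk with memory" means the walk does not revisit a node already on the current path, and that the next node is drawn uniformly from the admissible candidates; the function name `dfs_metapath_walk` and the toy graph are hypothetical, and the actual generation strategy is the one given by the formula above.

```python
import random


def dfs_metapath_walk(adj, node_type, start, metapath, rng):
    """Sample one walk that follows `metapath` and keeps every node on it.

    Returns the list of visited nodes (endpoints and intermediate nodes),
    or None when the walk cannot be completed.
    """
    if node_type[start] != metapath[0]:
        return None
    walk = [start]
    for wanted in metapath[1:]:
        candidates = [
            v
            for v in adj.get(walk[-1], ())
            # the edge (walk[-1], v) exists, v has the required type,
            # and v is not yet on the walk (assumed memory rule)
            if node_type[v] == wanted and v not in walk
        ]
        if not candidates:
            return None
        walk.append(rng.choice(candidates))
    return walk


# Toy author/paper graph (hypothetical data).
adj = {"a1": ["p1"], "p1": ["a1", "a2"], "a2": ["p1", "p2"], "p2": ["a2"]}
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P"}
walk = dfs_metapath_walk(adj, node_type, "a1", "APA", random.Random(0))
print(walk)  # ['a1', 'p1', 'a2']
```

Repeating the call with different meta-paths and path lengths would yield the set of path instances that the later path-wise aggregation consumes.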
Step S3: acquire the first-order neighborhood information $h_{i,1}$ of each node with the GNN-based first-order neighborhood information aggregation algorithm. The principle is illustrated in FIG. 2, where $h_i$ is the vector representation of node $i$, $r_{ij}$ is the vector representation of the relation between nodes $i$ and $j$, and $\alpha_{ij}$ is the attention weight calculated for nodes $i$ and $j$; different colors of solid edges represent different relations, and different colors of dotted lines represent the weights of different attention heads. The algorithm retains the structural information aggregation capability of GNNs and the information transfer characteristics between network nodes, adds the relations between nodes, and defines a new node update strategy:
Figure BDA0003552230020000053
where
Figure BDA0003552230020000054
represents the first-order out-neighbors of node $i$, i.e., the nodes $t$ for which there is an edge pointing from node $i$ to node $t$;
Figure BDA0003552230020000055
represents the first-order in-neighbors of node $i$, i.e., the nodes $t'$ for which there is an edge pointing from node $t'$ to node $i$;
Figure BDA0003552230020000056
is the vector representation of a node in layer $l$ of the neural network; $d(l)$ is the vector dimension of nodes in layer $l$; $W^{l}$ is a learnable weight matrix; and different attention parameters are set for the different edge directions:
Figure BDA0003552230020000061
and
Figure BDA0003552230020000062
Taking
Figure BDA0003552230020000063
as an example, it is calculated as follows:
Figure 100002_2
The SoftMax function then normalizes the attention weights:
Figure BDA0003552230020000065
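The direction-aware aggregation of step S3 can be illustrated with the following Python sketch. Because the update formulas are only available as images, the concrete form used here is an assumption: a GAT-style score on the concatenation of $h_i$, $h_j$ and $r_{ij}$, a LeakyReLU with slope 0.2, SoftMax normalization per edge direction, the weight matrix taken as the identity, and a tanh activation. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_direction(h_i, neighbors, attn):
    """Attention-weighted sum of neighbor vectors for one edge direction.

    neighbors: list of (h_j, r_ij) pairs, r_ij being the relation vector
    attn:      attention vector applied to the concatenation [h_i, h_j, r_ij]
    """
    if not neighbors:
        return [0.0] * len(h_i)
    scores = []
    for h_j, r_ij in neighbors:
        e = dot(attn, h_i + h_j + r_ij)  # `+` concatenates the lists
        scores.append(e if e > 0 else 0.2 * e)  # LeakyReLU, slope assumed
    alphas = softmax(scores)
    return [
        sum(a * h_j[k] for a, (h_j, _) in zip(alphas, neighbors))
        for k in range(len(h_i))
    ]


def first_order_update(h_i, out_nbrs, in_nbrs, attn_out, attn_in):
    """Combine self, out-neighbor and in-neighbor information (W = identity)."""
    out_msg = aggregate_direction(h_i, out_nbrs, attn_out)
    in_msg = aggregate_direction(h_i, in_nbrs, attn_in)
    return [math.tanh(s + o + n) for s, o, n in zip(h_i, out_msg, in_msg)]


h_i = [1.0, 0.0]
out_nbrs = [([0.0, 1.0], [1.0])]  # one out-neighbor with its relation vector
in_nbrs = [([2.0, 0.0], [0.0]), ([0.0, 2.0], [0.0])]
attn_out = [0.1, 0.1, 0.1, 0.1, 0.1]
attn_in = [0.0, 0.0, 0.0, 0.0, 0.0]  # zero vector gives uniform weights
h_new = first_order_update(h_i, out_nbrs, in_nbrs, attn_out, attn_in)
```

Using separate attention vectors for outgoing and incoming edges mirrors the statement that different attention parameters are set for different edge directions.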
Step S4: acquire the high-order neighborhood information of each node through the heterogeneous high-order neighborhood information aggregation algorithm. The neighbors sampled by the BFS strategy are processed layer by layer, and a hierarchical attention mechanism is proposed to selectively aggregate the information of different neighbors in different layers; a schematic of the model is shown in FIG. 3.
First, an Inner-Attention GNN network is proposed to aggregate the neighborhood information within each layer of the neighborhood, with the new aggregation function:
Figure BDA0003552230020000066
where $N_{bl}(i)$ is the set of $bl$-order neighbors ($bl \ge 2$) that node $i$ acquires through the BFS strategy;
Figure BDA0003552230020000067
is the information aggregated within the $bl$-order neighbors of node $i$ in layer $l$ of the neural network;
Figure BDA0003552230020000068
is a learnable network parameter matrix; and $\alpha_{ij}$ is the learnable attention weight between node $i$ and node $j$. The vectors of the two nodes are first input into an attention network to calculate the attention weight between them:
Figure BDA0003552230020000069
where
Figure BDA00035522300200000610
is the learnable attention network parameter of order $bl$, and $[\cdot]$ is the vector concatenation operation.
The attention weights are then normalized with the SoftMax function:
Figure BDA00035522300200000611
After the aggregated information of the nodes in each layer is obtained, the information of the different layers needs to be fused. An Inter-Attention GNN network is therefore proposed to fuse the aggregated information of the different layers, with the aggregation function:
Figure BDA00035522300200000612
where $BL$ is a manually set hyper-parameter representing the maximum order under the BFS strategy;
Figure BDA00035522300200000613
is a learnable neural network parameter; and
Figure BDA00035522300200000614
is the attention weight of the $bl$-th layer neighborhood of node $i$, obtained by training the Inter-Attention GNN.
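The two-level scheme (aggregation within each layer, then fusion across layers) can be sketched as follows. This is a simplified illustration under stated assumptions: the attention scores are plain dot products between a parameter vector and the concatenated node vectors, no weight matrix or non-linearity is applied, and the names `inner_attention` and `inter_attention` are hypothetical stand-ins for the Inner-Attention and Inter-Attention GNN networks.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


def inner_attention(h_i, layer_nodes, a_layer):
    """Aggregate the neighbor vectors inside one BFS layer."""
    scores = [dot(a_layer, h_i + h_j) for h_j in layer_nodes]  # [h_i, h_j]
    return weighted_sum(softmax(scores), layer_nodes)


def inter_attention(h_i, layer_summaries, q):
    """Fuse the per-layer summaries into one high-order representation."""
    scores = [dot(q, h_i + s) for s in layer_summaries]
    return weighted_sum(softmax(scores), layer_summaries)


h_i = [1.0, 0.0]
second_order = [[0.0, 2.0], [2.0, 0.0]]  # hypothetical 2nd-order neighbors
third_order = [[3.0, 3.0]]               # hypothetical 3rd-order neighbors
zeros = [0.0, 0.0, 0.0, 0.0]             # zero parameters give uniform weights

summaries = [
    inner_attention(h_i, second_order, zeros),
    inner_attention(h_i, third_order, zeros),
]
fused = inter_attention(h_i, summaries, zeros)
print(fused)  # [2.0, 2.0]
```

The same two functions could be applied to DFS path instances by passing the nodes of one path as `layer_nodes`, which is how the description treats the path-wise case.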
Similarly, the neighbors sampled by DFS are processed path by path to extract the information of different meta-paths. A meta-path specifies a relation pattern with a certain practical meaning. For example, the APA meta-path in a citation graph finds authors who published the same paper, even though the two authors are not directly connected in the original heterogeneous network; similarly, the APCPA meta-path finds authors who published papers at the same conference or in the same journal, whose research directions may be similar even though the two authors have no direct contact.
The proposed DFS path-wise high-order neighborhood information aggregation algorithm is illustrated in FIG. 4. First, different meta-path instances are obtained with the meta-path information sampling strategy; second, all node information in each meta-path is fused with the proposed Inner-Attention GNN network; finally, the proposed Inter-Attention GNN network aggregates the information of the meta-paths with different attention weights.
Step S5: fuse each node's own information with its high-order and first-order neighborhood information using a gated neural network to obtain the node's multi-source interaction fusion information. The fusion strategy is as follows:
Figure 3
where
Figure BDA0003552230020000072
$m$ and $b$ are learnable parameters.
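For illustration, one common form of gated fusion is sketched below in Python. The exact fusion function of the embodiment is only available as an image, so the form used here is an assumption: the three neighborhood sources are summed, a sigmoid gate is computed from the concatenation of the node's own vector and the summed neighborhood vector, and the gate interpolates element-wise between the two. The names `gated_fusion`, `M` and `b` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_self, h_first, h_bfs, h_dfs, M, b):
    """Fuse self information with first-order, BFS and DFS neighborhood information.

    M: gate weight matrix with one row per output dimension,
       each row as long as the concatenation [h_self, h_nbr]
    b: gate bias vector
    """
    # Assumption: the three neighborhood sources are simply summed.
    h_nbr = [f + g + d for f, g, d in zip(h_first, h_bfs, h_dfs)]
    concat = h_self + h_nbr
    gate = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + b_k)
        for row, b_k in zip(M, b)
    ]
    # The gate decides, per dimension, how much self information is kept.
    return [z * s + (1.0 - z) * n for z, s, n in zip(gate, h_self, h_nbr)]


h_self = [1.0, 1.0]
h_first = [1.0, 0.0]
h_bfs = [1.0, 0.0]
h_dfs = [1.0, 2.0]
M = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]  # zero gate: z = 0.5
b = [0.0, 0.0]
fused = gated_fusion(h_self, h_first, h_bfs, h_dfs, M, b)
print(fused)  # [2.0, 1.5]
```

With trained gate parameters the interpolation becomes input-dependent, which lets the model weight a node's own attributes against its neighborhood on a per-node basis.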
Step S6: continuously optimize the parameters of the algorithm model under the multi-task optimization function until the iteration limit or the precision requirement is met. The adjacency optimization task is taken as the main task: minimizing the loss function $L_1$ makes the low-dimensional vector representations of adjacent nodes more similar and those of non-adjacent nodes more distant. $L_1$ is calculated as follows:
Figure 4
The node label prediction task is taken as the auxiliary task: minimizing the cross-entropy loss function $L_2$ makes the learned low-dimensional node vectors capture the label information of the nodes. $L_2$ is calculated as follows:
Figure BDA0003552230020000074
where $Y$ is the set of node labels in the training set; $t_i$ indicates that the true label of the node is $i$; and $y_i$ indicates whether the predicted label of the node is $i$: $y_i$ is 1 if so, and 0 otherwise.
The two tasks are combined to obtain the final loss function $L$, where the hyper-parameter $\omega_1$ represents the weight of the main task. $L$ is calculated as follows:
$L = \omega_1 L_1 + (1 - \omega_1) L_2$
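The weighted combination of the two tasks can be sketched in Python as follows. Only the final combination follows the text directly; the adjacency loss is an assumed stand-in (a logistic loss on embedding dot products for adjacent and non-adjacent pairs), since the original expression for the main-task loss is only available as an image, and the label loss is a standard cross-entropy over predicted probabilities. All function names and the toy data are hypothetical.

```python
import math


def log_sigmoid(x):
    return -math.log1p(math.exp(-x))


def adjacency_loss(emb, adjacent_pairs, non_adjacent_pairs):
    """Main-task stand-in: logistic loss on embedding dot products (assumed form)."""
    def dot(u, v):
        return sum(a * b for a, b in zip(emb[u], emb[v]))

    loss = 0.0
    for u, v in adjacent_pairs:
        loss -= log_sigmoid(dot(u, v))   # pull adjacent nodes together
    for u, v in non_adjacent_pairs:
        loss -= log_sigmoid(-dot(u, v))  # push non-adjacent nodes apart
    return loss / (len(adjacent_pairs) + len(non_adjacent_pairs))


def label_loss(true_one_hot, predicted_probs, eps=1e-12):
    """Auxiliary task: cross-entropy between one-hot labels and predictions."""
    return -sum(t * math.log(p + eps) for t, p in zip(true_one_hot, predicted_probs))


def multitask_loss(l1, l2, omega1):
    """Weighted sum of main-task loss l1 and auxiliary-task loss l2."""
    return omega1 * l1 + (1.0 - omega1) * l2


emb = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [-1.0, 0.0]}
l1 = adjacency_loss(emb, [("a", "b")], [("a", "c")])
l2 = label_loss([0, 1, 0], [0.1, 0.8, 0.1])
total = multitask_loss(l1, l2, omega1=0.7)
```

A value of `omega1` above 0.5 keeps the adjacency task dominant, consistent with its role as the main task; the value 0.7 is chosen arbitrarily for the example.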
The above description is only a preferred embodiment of the graph representation learning method based on multi-source interaction fusion disclosed by the present invention, and is not intended to limit the scope of protection of the embodiments of the present specification. Any modification, equivalent replacement or improvement made within the spirit and principle of the embodiments of the present disclosure shall fall within the protection scope of the embodiments of the present disclosure.

Claims (7)

1. A graph representation learning method based on multi-source interaction fusion is characterized by comprising the following steps:
extracting node attributes, node categories and adjacency relations among nodes in a network in a graph structure form;
dividing the neighbor nodes of the nodes into directly adjacent first-order neighborhood nodes and indirectly adjacent high-order neighborhood nodes based on the adjacency relation among the nodes;
obtaining a BFS high-order neighborhood node set with a BFS-based meta-path high-order neighborhood node sampling algorithm, and obtaining a DFS high-order neighborhood node set with a DFS-based meta-path high-order neighborhood node sampling algorithm;
acquiring first-order neighborhood information of each node through a first-order neighborhood information aggregation algorithm;
acquiring high-order neighborhood information of each node through a heterogeneous high-order neighborhood information aggregation algorithm;
fusing each node's own information with its high-order and first-order neighborhood information, using a multi-source information fusion model based on a gated neural network, to obtain the node's multi-source interaction fusion information as its final vector representation;
and adjusting the parameters of the algorithm model under the multi-task optimization function until the iteration limit or the precision requirement is met.
2. The graph representation learning method based on the multi-source interaction fusion of claim 1, wherein each layer of sampling in the generation process of the BFS-based meta-path high-order neighborhood node sampling algorithm follows a meta-path mode, and each node passed by an intermediate step is reserved.
3. The graph representation learning method based on multi-source interaction fusion of claim 1, wherein in the DFS-based meta-path high-order neighborhood node sampling algorithm, each step of the walk follows the meta-path pattern, and every node passed in an intermediate step is retained; the generation strategy formula is as follows:
Figure FDA0003552230010000011
where the random function represents the walk-with-memory strategy; $v_i$ is the currently visited node and $v_{i+1}$ is a candidate next node; $E$ is the set of all edges in the graph; $R_i \in R$ ($0 \le i < L_R$) is the $i$-th node type in the meta-path pattern; and $L_R$ is the length of the meta-path.
4. The graph representation learning method based on multi-source interactive fusion of claim 1, wherein the first-order neighborhood information aggregation algorithm adds the relationship between nodes while preserving the structure information aggregation capability of GNNs and the information transfer characteristics between network nodes, and defines a new node update strategy as follows:
Figure FDA0003552230010000012
wherein
Figure FDA0003552230010000013
represents an out first-order neighbor of node i, i.e., there is an edge pointing from node i to node t;
Figure FDA0003552230010000014
represents an in first-order neighbor of node i, i.e., there is an edge pointing from node t' to node i;
Figure FDA0003552230010000015
is the vector representation of a node in the l-th layer of the neural network; d(l) is the vector dimension of nodes in the l-th layer of the neural network; $W^{l}$ is a learnable weight matrix; and different attention parameters are set for the different edge directions:
Figure FDA0003552230010000021
and
Figure FDA0003552230010000022
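The idea of weighting out-neighbors and in-neighbors with separate attention parameters can be sketched as below. The claim's actual update formula is only available as an image, so the concrete form used here (self message plus direction-weighted neighbor sums, a shared weight matrix, and a ReLU) is an assumption, as are the names `directed_first_order_update`, `a_out`, and `a_in`.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def directed_first_order_update(h, out_nbrs, in_nbrs, W, a_out, a_in, i):
    """Update node i from its first-order neighbors, by edge direction.

    h:        dict node -> feature vector at layer l
    out_nbrs: dict node -> nodes t with an edge i -> t
    in_nbrs:  dict node -> nodes t' with an edge t' -> i
    a_out, a_in: scalar attention parameters, one per edge direction
                 (assumed scalars; the claim does not give their shape)
    """
    agg = [float(x) for x in h[i]]  # self message (an assumption)
    for t in out_nbrs.get(i, []):
        agg = [a + a_out * x for a, x in zip(agg, h[t])]
    for t in in_nbrs.get(i, []):
        agg = [a + a_in * x for a, x in zip(agg, h[t])]
    # Shared learnable transform followed by ReLU (assumed nonlinearity).
    return [max(0.0, z) for z in matvec(W, agg)]


h = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [4.0, 4.0]}
out_nbrs = {0: [1]}   # edge 0 -> 1
in_nbrs = {0: [2]}    # edge 2 -> 0
W = [[1.0, 0.0], [0.0, 1.0]]
updated = directed_first_order_update(h, out_nbrs, in_nbrs, W, 1.0, 0.5, 0)
print(updated)  # [3.0, 4.0]
```

With the identity matrix as `W`, the output is the self vector plus the out-neighbor vector at weight 1.0 and the in-neighbor vector at weight 0.5, which makes the effect of the two direction-specific parameters easy to see.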
5. The graph representation learning method based on multi-source interaction fusion of claim 1, wherein obtaining the high-order neighborhood information of the nodes by the heterogeneous high-order neighborhood information aggregation algorithm specifically comprises the following steps:
first, when the node information of each path is aggregated, attention is paid to the node information at the two endpoints of the path while the intermediate nodes along the path are also incorporated into the path information aggregation, and all node information within each meta-path is aggregated by the proposed Inner-Attention GNN network;
then, the proposed Inter-Attention GNN neural network fuses the information of the meta-path sequences according to their different attention weights; the Inner-Attention GNN aggregation function is as follows:
Figure FDA0003552230010000023
wherein $N_{dl}(i)$ is the set of neighbors on the $dl$-th path that node $i$ obtains through the DFS strategy,
Figure FDA0003552230010000024
represents the information aggregated within the $dl$-th path neighborhood of node $i$ in the $l$-th layer of the neural network,
Figure FDA0003552230010000025
represents a learnable network parameter matrix, and $\alpha_{ij}$ is the learnable attention weight between node $i$ and node $j$, calculated as follows:
Figure FDA0003552230010000026
wherein
Figure FDA0003552230010000027
is the learnable attention network parameter for the $dl$-th path, and $[\cdot \,\|\, \cdot]$ is the vector concatenation operation;
the attention weight is then normalized using the SoftMax function:
Figure FDA0003552230010000028
finally, the Inter-Attention GNN network aggregation function is:
Figure FDA0003552230010000029
wherein DL is a manually set hyper-parameter representing the maximum number of paths under the DFS strategy,
Figure FDA00035522300100000210
is a learnable neural network parameter, and
Figure FDA00035522300100000211
is the attention weight of the $dl$-th path neighborhood of node $i$, obtained by training the Inter-Attention GNN.
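The two-level attention in this claim (within a path, then across paths) can be sketched in Python as follows. The aggregation formulas appear only as images here, so this follows the general pattern the text describes: a score from concatenated vectors, SoftMax normalization, and a weighted sum. The function names are hypothetical, the learnable parameter matrices are omitted (treated as identity), and no nonlinearity is applied to the raw scores; all of these are simplifying assumptions.

```python
import math


def softmax(scores):
    """Numerically stable SoftMax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def inner_attention(h_i, path_nodes, a):
    """Aggregate the node vectors on ONE path into a single message.

    a: attention parameter vector of length 2 * dim, applied to the
       concatenation [h_i || h_j] (assumed form of the score function).
    """
    scores = [sum(w * x for w, x in zip(a, h_i + h_j)) for h_j in path_nodes]
    alpha = softmax(scores)  # normalized attention weights alpha_ij
    dim = len(h_i)
    return [sum(al * h_j[k] for al, h_j in zip(alpha, path_nodes))
            for k in range(dim)]


def inter_attention(path_msgs, path_scores):
    """Fuse the per-path messages of up to DL paths with SoftMax weights."""
    beta = softmax(path_scores)
    dim = len(path_msgs[0])
    return [sum(b * m[k] for b, m in zip(beta, path_msgs)) for k in range(dim)]


h_i = [1.0, 0.0]
path_a = [[2.0, 0.0], [0.0, 2.0]]   # node vectors along one sampled path
path_b = [[3.0, 3.0]]               # node vectors along another path
a = [0.0, 0.0, 0.0, 0.0]            # zero parameters give uniform attention
msg_a = inner_attention(h_i, path_a, a)
msg_b = inner_attention(h_i, path_b, a)
fused = inter_attention([msg_a, msg_b], [0.0, 0.0])
print(msg_a, msg_b, fused)  # [1.0, 1.0] [3.0, 3.0] [2.0, 2.0]
```

With all attention parameters at zero, both levels reduce to plain averaging, which gives a convenient sanity check before the parameters are trained.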
6. The graph representation learning method based on multi-source interaction fusion of claim 1, wherein the multi-source information fusion model based on the gated neural network has the following fusion function:
Figure 1
wherein
Figure FDA00035522300100000213
m and b are learnable parameters, and
Figure FDA00035522300100000214
and
Figure FDA00035522300100000215
are, respectively, the high-order neighborhood information under the DFS strategy and the high-order neighborhood information under the BFS strategy at the l-th layer of the model.
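A gated fusion of the two high-order neighborhood vectors might look like the sketch below. The claim's fusion function is only available as an image, so the form used here (a sigmoid gate computed from the concatenated inputs, then an element-wise convex combination) is a common gating pattern assumed for illustration; the names `gated_fusion`, `M`, and `b` stand in for the claim's learnable parameters and are not confirmed by the source.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_dfs, h_bfs, M, b):
    """Fuse DFS-based and BFS-based high-order information with a gate.

    M: gate weight matrix with len(h_dfs) rows and 2 * len(h_dfs) columns
    b: gate bias vector with len(h_dfs) entries
    (assumed shapes; the claim only states that the parameters are learnable)
    """
    concat = h_dfs + h_bfs
    gate = [sigmoid(sum(w * x for w, x in zip(row, concat)) + bias)
            for row, bias in zip(M, b)]
    # Element-wise convex combination controlled by the gate.
    return [g * d + (1.0 - g) * s for g, d, s in zip(gate, h_dfs, h_bfs)]


h_dfs = [2.0, 0.0]
h_bfs = [0.0, 4.0]
M = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
b = [0.0, 0.0]
fused_vec = gated_fusion(h_dfs, h_bfs, M, b)
print(fused_vec)  # [1.0, 2.0]
```

A zero gate input gives a gate value of 0.5 and therefore a plain average; a large positive bias pushes the output toward the DFS-based vector, and a large negative bias toward the BFS-based vector.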
7. The graph representation learning method based on multi-source interaction fusion of claim 1, wherein the multi-task optimization function is a combination of an adjacency optimization task and a node label prediction task:
$L = \omega_1 L_1 + (1 - \omega_1) L_2$
wherein $\omega_1$ is the weight that balances the two tasks; the adjacency optimization task serves as the main task, and its optimization function is as follows:
Figure 2
the node label prediction task is used as an auxiliary task, and the optimization function is as follows:
Figure FDA0003552230010000032
where $Y$ represents the set of all node labels in the training set, $t_i$ indicates that the true label of the node is $i$, and $y_i$ indicates whether the predicted label of the node is $i$: $y_i$ is 1 if so, and 0 otherwise;
and the parameters of the model are optimized by continually minimizing the multi-task optimization function using the gradient backpropagation algorithm.
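The weighted combination $L = \omega_1 L_1 + (1 - \omega_1) L_2$ can be checked numerically with a short sketch. Only the combination itself is taken from the claim. The two task losses are available here only as images, so the main-task loss is passed in as a plain number, and the auxiliary label loss is assumed to be a standard cross-entropy over predicted probabilities; the function names are hypothetical.

```python
import math


def cross_entropy(t, y, eps=1e-12):
    """Cross-entropy between a one-hot true label t and predicted
    probabilities y (an assumed form of the auxiliary task loss)."""
    return -sum(ti * math.log(max(yi, eps)) for ti, yi in zip(t, y))


def multitask_loss(l1, l2, omega1):
    """Combine the main-task loss l1 and auxiliary loss l2 as
    omega1 * l1 + (1 - omega1) * l2."""
    if not 0.0 <= omega1 <= 1.0:
        raise ValueError("omega1 must lie in [0, 1]")
    return omega1 * l1 + (1.0 - omega1) * l2


l2 = cross_entropy([0, 1, 0], [0.2, 0.5, 0.3])   # equals ln 2
total = multitask_loss(2.0, 4.0, 0.75)
print(l2, total)
```

Setting `omega1` closer to 1 emphasizes the adjacency optimization task, and values closer to 0 emphasize label prediction.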
CN202210267016.3A 2022-03-17 2022-03-17 Graph representation learning method based on multi-source interaction fusion Pending CN114756713A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210267016.3A CN114756713A (en) 2022-03-17 2022-03-17 Graph representation learning method based on multi-source interaction fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210267016.3A CN114756713A (en) 2022-03-17 2022-03-17 Graph representation learning method based on multi-source interaction fusion

Publications (1)

Publication Number Publication Date
CN114756713A true CN114756713A (en) 2022-07-15

Family

ID=82326795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210267016.3A Pending CN114756713A (en) 2022-03-17 2022-03-17 Graph representation learning method based on multi-source interaction fusion

Country Status (1)

Country Link
CN (1) CN114756713A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116578751A (en) * 2023-07-12 2023-08-11 中国医学科学院医学信息研究所 Main path analysis method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210049414A1 (en) * 2019-08-12 2021-02-18 Nec Laboratories America, Inc. Deep graph de-noise by differentiable ranking
CN112989842A (en) * 2021-02-25 2021-06-18 电子科技大学 Construction method of universal embedded framework of multi-semantic heterogeneous graph

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
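孙云栋: "Research on cross-network entity alignment methods based on multi-source interaction fusion", China Master's Theses Full-text Database, Information Science and Technology *
惠国保: "A multi-source heterogeneous data fusion method based on deep learning", Modern Navigation *
蒋宗礼 et al.: "Heterogeneous network representation learning based on fused meta-path graph convolution", Computer Science *
赵晓娟 et al.: "A survey of multi-source knowledge fusion techniques", Journal of Yunnan University (Natural Sciences Edition) *
鲁军豪 et al.: "A survey of information network representation learning methods", Journal of Hebei University of Science and Technology *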

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116578751A (en) * 2023-07-12 2023-08-11 中国医学科学院医学信息研究所 Main path analysis method and device
CN116578751B (en) * 2023-07-12 2023-09-22 中国医学科学院医学信息研究所 Main path analysis method and device

Similar Documents

Publication Publication Date Title
WO2022088972A1 (en) Malicious behavior identification method and system for weighted heterogeneous graph, and storage medium
CN107016464B (en) threat estimation method based on dynamic Bayesian network
CN113407864B (en) Group recommendation method based on mixed attention network
Pan et al. Clustering of designers based on building information modeling event logs
Wang et al. Power system network topology identification based on knowledge graph and graph neural network
Liu et al. A hybrid genetic-ant colony optimization algorithm for the optimal path selection
CN113255895A (en) Graph neural network representation learning-based structure graph alignment method and multi-graph joint data mining method
Zhao et al. Rbc: Rectifying the biased context in continual semantic segmentation
CN110533253A (en) A kind of scientific research cooperative Relationship Prediction method based on Heterogeneous Information network
Kumari et al. Quantifying influential communities in granular social networks using fuzzy theory
CN116129286A (en) Method for classifying graphic neural network remote sensing images based on knowledge graph
Zhou et al. Betweenness centrality-based community adaptive network representation for link prediction
CN106649380A (en) Hot spot recommendation method and system based on tag
CN114756713A (en) Graph representation learning method based on multi-source interaction fusion
Zhang et al. HG-Meta: Graph meta-learning over heterogeneous graphs
CN112597399B (en) Graph data processing method and device, computer equipment and storage medium
Hu et al. Recurrent neural architecture search based on randomness-enhanced tabu algorithm
Wang et al. Community discovery algorithm of complex network attention model
Caiqian et al. Multimedia system and database simulation based on internet of things and cloud service platform
Liu et al. The network representation learning algorithm based on semi-supervised random walk
Le et al. Enhancing Anchor Link Prediction in Information Networks through Integrated Embedding Techniques
Fernandes et al. Data classification via centrality measures of complex networks
Ghaemmaghami et al. SOMSN: an effective self organizing map for clustering of social networks
Wijayanto et al. Predicting future potential flight routes via inductive graph representation learning
CN111309980A (en) Representation learning method based on aggregation graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20220715