CN114118416A - Variational graph automatic encoder method based on multi-task learning - Google Patents
- Publication number
- CN114118416A (application number CN202111502928.6A)
- Authority
- CN
- China
- Prior art keywords
- node
- graph
- matrix
- shallow
- inputting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a variational graph autoencoder method based on multi-task learning, comprising the following steps: S1: preprocess the source data; S2: partition the graph data set; S3: input the training set obtained in S22 into a shallow graph convolution layer to obtain a shallow shared embedded representation H; S4: input the shallow shared embedded representation H obtained in S3 into two different downstream network frameworks to obtain their respective embedded representations; S5: use the two different embedded representations obtained in S4 for a link prediction task and a semi-supervised node classification task, respectively. The method makes the embedded representation closer to the true distribution of the sample space, and is highly competitive and robust on the link prediction task.
Description
Technical Field
The invention relates to the field of computer data analysis, in particular to a variational graph autoencoder method based on multi-task learning.
Background
With the continuous development of deep learning, more and more complex application scenarios cannot be represented by simple Euclidean data, for example molecular structures, recommendation systems, citation networks, and social networks. Such non-Euclidean data can be represented as graphs. Graph data consists of nodes and edges; each node has its own attribute features, and different nodes have different numbers of neighbour nodes. Conventional convolutional neural networks and recurrent neural networks cannot represent graph data directly. In recent years, graph neural networks have attracted great attention from researchers: compared with convolutional and recurrent neural networks, a graph neural network can embed node features into a low-dimensional space while preserving both topological structure information and node feature information, and achieves strong performance. In particular, the graph autoencoder and the variational graph autoencoder are effective frameworks for unsupervised learning on graphs (link prediction, node clustering, graph generation).
However, multi-task learning on graph data has not yet drawn much attention from researchers. In fact, learning multiple related tasks together can improve the overall generalization ability across tasks. Existing multi-task graph neural network frameworks feed the learned shared representation directly into every downstream task, so that different tasks learn from one common embedded representation, and no embedded information specific to a single task is emphasized. Using a common embedded representation for different tasks can actually harm each task's learning, because the shared representation may also absorb noise from the other tasks.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a variational graph autoencoder method based on multi-task learning. The method makes the embedded representation closer to the true distribution of the sample space, and is highly competitive and robust on the link prediction task.
The technical solution realizing the purpose of the invention is as follows:
A variational graph autoencoder method based on multi-task learning comprises the following steps:
S1: preprocess the source data, as follows:
S11: process the source data of a citation network into graph data G = (V, E), where V is the node set and E is the edge set; each paper in the citation network is regarded as a node in the graph, the authors and research direction of the paper are regarded as node features, a connecting undirected edge is established between a paper and each paper it cites, and the category of the paper is regarded as its label, so that one citation network forms one graph data set;
S12: obtain the degree matrix, adjacency matrix, and feature matrix of the graph from the graph data set obtained in S11;
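As an illustrative sketch of step S12 (not part of the claimed method), the degree, adjacency, and feature matrices can be built from an edge list; the toy edge list and feature dimension below are hypothetical, and in the real pipeline X would come from the paper metadata (authors, research direction):

```python
# Sketch of S12: build adjacency matrix A, degree matrix D, and a placeholder
# feature matrix X for a tiny hypothetical citation graph.
import numpy as np

edges = [(0, 1), (1, 2), (0, 2)]           # undirected citation edges (toy data)
num_nodes, feat_dim = 3, 4

A = np.zeros((num_nodes, num_nodes))       # adjacency matrix
for i, j in edges:
    A[i, j] = A[j, i] = 1.0                # undirected edge: symmetric entries

D = np.diag(A.sum(axis=1))                 # degree matrix (row sums of A)
X = np.random.rand(num_nodes, feat_dim)    # placeholder node feature matrix
```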
S2: partition the graph data set, as follows:
S21: apply a mask to part of the data in the graph data set to enable semi-supervised learning;
S22: divide the data in the matrices obtained in S12 into a training set, a validation set, and a test set;
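A minimal sketch of S21-S22: a boolean label mask for semi-supervised training plus a node-level train/validation/test split. The split ratios below are assumptions for illustration, not values stated in the patent:

```python
# Sketch of S21-S22: label mask and node split. MASK is True only for nodes
# whose labels are visible during semi-supervised training.
import numpy as np

rng = np.random.default_rng(0)
num_nodes = 10
idx = rng.permutation(num_nodes)           # shuffled node indices

train_idx, val_idx, test_idx = idx[:6], idx[6:8], idx[8:]

mask = np.zeros(num_nodes, dtype=bool)     # MASK_i = 1 only for labelled nodes
mask[train_idx] = True                     # labels visible during training only
```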
S3: input the training set obtained in S22 into a shallow graph convolution layer to obtain the shallow shared embedded representation H. That is, the adjacency matrix and the feature matrix are input into the shallow graph convolution layer and the message-propagation rule H = σ(AXW) is applied, where A is the adjacency matrix, X is the node feature matrix, W is a learnable parameter matrix, and σ is an activation function; the feature information and topology information of each node's neighbours is aggregated to update that node's feature information, yielding the shallow shared embedded representation H;
S4: input the shallow shared embedded representation H obtained in S3 into two different downstream network frameworks to obtain their respective embedded representations, as follows:
S41: input the shallow shared embedded representation H obtained in S3 into the graph convolution network for link prediction to obtain the embedded representations Z_mean and Z_log;
S42: combine Z_mean and Z_log with Gaussian noise to obtain an embedded representation Z that follows a Gaussian distribution;
S43: feed Z to the discriminator as a false sample, so that a generative adversarial mechanism pushes the embedded representation Z closer to the original sample distribution;
S44: input the shallow representation H obtained in S3 into the graph convolution network for node classification to obtain the embedded representation Z_nc;
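Steps S41-S43 can be sketched with the Gaussian reparameterization trick standard in variational graph autoencoders; treating Z_log as a per-node log standard deviation is an assumption consistent with that practice, not stated explicitly in the text:

```python
# Sketch of S41-S43: combine Z_mean and Z_log into a Gaussian-distributed
# embedding Z via reparameterization; Z then serves as the discriminator's
# "false" sample in the adversarial mechanism.
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 2
Z_mean = rng.normal(size=(n, d))           # per-node Gaussian means (toy values)
Z_log = rng.normal(size=(n, d))            # per-node log standard deviations

eps = rng.normal(size=(n, d))              # eps ~ N(0, I)
Z = Z_mean + eps * np.exp(Z_log)           # embedding following a Gaussian
```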
S5: use the two different embedded representations obtained in S4 for a link prediction task and a semi-supervised node classification task, respectively, as follows:
S51: input the embedded representation Z obtained in S4 into an inner-product layer to reconstruct the adjacency matrix for the link prediction task;
S52: input the embedded representation Z obtained in S4 into a graph convolution layer to reconstruct the feature matrix, as an auxiliary task for link prediction;
S53: input the embedded representation Z_nc obtained in S4 into the graph convolution network for node classification;
S54: compute the loss function and update the parameters iteratively with a gradient descent algorithm until the loss converges after a number of iterations; the final loss function is:
L = L_re + L_nc + L_fe,
where C is the set of node classes; y_ic = 1 if node i belongs to class c, and 0 otherwise; ŷ_ic is the softmax probability that node i belongs to class c; MASK_i = 1 when node i is labelled, and MASK_i = 0 otherwise; L_re = E_q(Z|X,A)[log p(A|Z)] − KL[q(Z|X,A) || p(Z)] is the reconstruction loss of the adjacency matrix, with KL[q(·) || p(·)] the relative entropy between the generated and original sample distributions; L_nc = −Σ_i MASK_i Σ_{c∈C} y_ic log ŷ_ic is the cross-entropy loss of the semi-supervised node classification; and L_fe is the reconstruction loss of the feature matrix.
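The masked cross-entropy term L_nc can be sketched as follows; the softmax outputs and mask are toy values, and only labelled nodes (MASK_i = 1) contribute to the loss:

```python
# Hedged sketch of the masked cross-entropy L_nc used in S54: unlabelled
# nodes are excluded from the node-classification loss by the mask.
import numpy as np

probs = np.array([[0.7, 0.2, 0.1],         # softmax outputs, one row per node
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4]])
labels = np.array([0, 1, 2])               # class index per node
mask = np.array([1.0, 1.0, 0.0])           # node 2 is unlabelled (MASK_2 = 0)

per_node = -np.log(probs[np.arange(3), labels])  # -log p of the true class
L_nc = (per_node * mask).sum() / mask.sum()      # average over labelled nodes
```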
The beneficial effects of this technical solution are:
The scheme jointly learns an unsupervised link prediction task and a semi-supervised node classification task. Unlike other multi-task graph neural network frameworks, which feed the shared representation directly into the different prediction or classification tasks, this scheme obtains the shared representation only at a shallow layer and then feeds it into a dedicated network framework designed for each downstream task. In addition, to make the embedded representation of the link prediction task more robust, the scheme adds a generative adversarial network framework: the generator-discriminator game pushes the embedded representation closer to the true distribution of the sample space. Experimental results on three real citation network data sets show that the proposed framework is highly competitive on the link prediction task and even surpasses the most advanced methods on one data set.
The method makes the embedded representation closer to the true distribution of the sample space, and is highly competitive and robust on the link prediction task.
Drawings
FIG. 1 is a schematic flow chart of an embodiment.
Detailed Description
The invention is described in further detail below with reference to the figures and a specific example, but the invention is not limited thereto.
Example:
This example applies to data in non-Euclidean spaces, for example social networks, citation networks, and molecular structures.
Referring to FIG. 1, a variational graph autoencoder method based on multi-task learning comprises the following steps:
S1: preprocess the source data, as follows:
S11: in this example, graph data sets are collected from citation networks; the number of categories differs across data sets, and each paper has its own label. A citation network is processed into graph data G = (V, E), where V is the node set and E is the edge set: each paper is regarded as a node in the graph, the authors and research direction of the paper are regarded as node features, a connecting undirected edge is established between a paper and each paper it cites, and the category of the paper is regarded as its label, so that one citation network forms one graph data set. Details of the three citation network data sets are given in Table 1:
TABLE 1. Data sets

| Data set | Number of nodes | Number of edges | Feature dimension | Number of classes |
|---|---|---|---|---|
| Cora | 2708 | 5429 | 1433 | 7 |
| Citeseer | 3327 | 4732 | 3703 | 6 |
| Pubmed | 19717 | 44338 | 500 | 3 |
S12: obtain the degree matrix, adjacency matrix, and feature matrix of the graph from the graph data set obtained in S11;
S2: partition the graph data set, as follows:
S21: apply a mask to part of the data in the graph data set to enable semi-supervised learning;
S22: divide the data in the matrices obtained in S12 into a training set, a validation set, and a test set;
S3: input the training set obtained in S22 into the shallow graph convolution layer to obtain the shallow shared embedded representation H. That is, the graph (A, X) (here, Cora) is input into the shallow graph convolution layer and the message-propagation rule H = σ(AXW) is applied, where A is the adjacency matrix, X is the node feature matrix, W is a learnable parameter matrix, and σ is an activation function; aggregating the feature information and topology information of each node's neighbours updates that node's feature information and yields the shallow shared embedded representation H. The graph convolution network layer is:
H = ReLU(D^(-1/2) A D^(-1/2) X W),
where the activation function is σ(·) = ReLU(·), W is a weight matrix, and D is the degree matrix of the adjacency matrix;
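The symmetrically normalized propagation can be sketched as follows; adding self-loops (A + I) before normalization is an assumption borrowed from standard graph convolution practice, not stated explicitly in the text:

```python
# Sketch of a normalized graph-convolution layer:
# H = ReLU(D^{-1/2} (A + I) D^{-1/2} X W), with D the degree matrix of A + I.
import numpy as np

A = np.array([[0., 1.], [1., 0.]])         # adjacency matrix (toy 2-node graph)
X = np.array([[1., 0.], [0., 1.]])         # node feature matrix
W = np.eye(2)                              # weight matrix (toy: identity)

A_hat = A + np.eye(2)                      # self-loops (assumed, standard practice)
D_inv_sqrt = np.diag(A_hat.sum(1) ** -0.5) # D^{-1/2} from the degree matrix
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalization

H = np.maximum(A_norm @ X @ W, 0.0)        # one graph-convolution layer with ReLU
```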
S4: input the shallow shared embedded representation H obtained in S3 into two different downstream network frameworks to obtain their respective embedded representations, as follows:
S41: input the shallow shared embedded representation H obtained in S3 into the graph convolution network for link prediction to obtain the embedded representations Z_mean and Z_log;
S42: combine Z_mean and Z_log with Gaussian noise to obtain an embedded representation Z that follows a Gaussian distribution;
S43: feed Z to the discriminator as a false sample, so that a generative adversarial mechanism pushes the embedded representation Z closer to the original sample distribution;
S44: input the shallow representation H obtained in S3 into the graph convolution network for node classification to obtain the embedded representation Z_nc;
S5: use the two different embedded representations obtained in S4 for a link prediction task and a semi-supervised node classification task, respectively, as follows:
S51: input the embedded representation Z obtained in S4 into an inner-product layer to reconstruct the adjacency matrix for the link prediction task, obtaining the reconstructed adjacency matrix Â = sigmoid(Z Zᵀ);
The loss function for reconstructing the adjacency matrix is:
L_re = E_q(Z|X,A)[log p(A|Z)] − KL[q(Z|X,A) || p(Z)];
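The inner-product decoder of S51 can be sketched directly: each entry of sigmoid(Z Zᵀ) is a link probability for a node pair. The embedding values below are toy data:

```python
# Sketch of S51: reconstruct the adjacency matrix as A_rec = sigmoid(Z Z^T).
import numpy as np

def sigmoid(m):
    return 1.0 / (1.0 + np.exp(-m))

Z = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])                 # toy embedded representation

A_rec = sigmoid(Z @ Z.T)                   # reconstructed adjacency matrix
# Nodes 0 and 1 share an embedding direction, so their link score exceeds
# the score between nodes 0 and 2.
```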
S52: input the embedded representation Z obtained in S4 into a graph convolution layer to reconstruct the feature matrix, as an auxiliary task for link prediction, obtaining the reconstructed feature matrix;
The loss function for reconstructing the feature matrix is the feature reconstruction loss L_fe;
s53: embedding the obtained embedded representation Z _ nc of S4 into the graph convolution network for node classification, wherein the loss function of the task of point classification is as follows:
S54: compute the loss function and update the parameters iteratively with a gradient descent algorithm until the loss converges after a number of iterations; the final loss function is:
L = L_re + L_nc + L_fe,
where C is the set of node classes; y_ic = 1 if node i belongs to class c, and 0 otherwise; ŷ_ic is the softmax probability that node i belongs to class c; MASK_i = 1 when node i is labelled, and MASK_i = 0 otherwise; L_re = E_q(Z|X,A)[log p(A|Z)] − KL[q(Z|X,A) || p(Z)] is the reconstruction loss of the adjacency matrix, with KL[q(·) || p(·)] the relative entropy between the generated and original sample distributions; L_nc is the cross-entropy loss of the semi-supervised node classification; and L_fe is the reconstruction loss of the feature matrix.
After 50 iterations the loss function has converged, and training is stopped.
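The gradient-descent-until-convergence loop of S54 can be sketched with a stand-in scalar loss; the real model would instead backpropagate the combined loss L through the network parameters W:

```python
# Illustrative training-loop skeleton: gradient descent on a stand-in
# quadratic loss, run for the 50 iterations mentioned in the example.
w = 5.0                                    # stand-in scalar parameter
lr = 0.1                                   # learning rate (assumed value)
losses = []
for step in range(50):
    loss = (w - 2.0) ** 2                  # stand-in loss, minimum at w = 2
    grad = 2.0 * (w - 2.0)                 # analytic gradient of the loss
    w -= lr * grad                         # gradient-descent update
    losses.append(loss)
# By 50 steps the loss has effectively converged, mirroring the example.
```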
The results on the three graph data sets are shown in Tables 2 and 3:
Table 2. Link prediction: AUC and AP scores
Table 3. Node classification: accuracy
| Method | Cora | Pubmed | Citeseer |
|---|---|---|---|
| GCN | 0.815 | 0.790 | 0.703 |
| Planetoid | 0.757 | 0.772 | 0.947 |
| DeepWalk | 0.972 | 0.653 | 0.432 |
| MTGAE | 0.790 | 0.804 | 0.718 |
| This example | 0.809 | 0.861 | 0.666 |
Claims (1)
1. A variational graph autoencoder method based on multi-task learning, characterized by comprising the following steps:
S1: preprocess the source data, as follows:
S11: process the source data of a citation network into graph data G = (V, E), where V is the node set and E is the edge set; each paper in the citation network is regarded as a node in the graph, the authors and research direction of the paper are regarded as node features, a connecting undirected edge is established between a paper and each paper it cites, and the category of the paper is regarded as its label, so that one citation network forms one graph data set;
S12: obtain the degree matrix, adjacency matrix, and feature matrix of the graph from the graph data set obtained in S11;
S2: partition the graph data set, as follows:
S21: apply a mask to part of the data in the graph data set to enable semi-supervised learning;
S22: divide the data in the matrices obtained in S12 into a training set, a validation set, and a test set;
S3: input the training set obtained in S22 into a shallow graph convolution layer to obtain the shallow shared embedded representation H. That is, the adjacency matrix and the feature matrix are input into the shallow graph convolution layer and the message-propagation rule H = σ(AXW) is applied, where A is the adjacency matrix, X is the node feature matrix, W is a learnable parameter matrix, and σ is an activation function; the feature information and topology information of each node's neighbours is aggregated to update that node's feature information, yielding the shallow shared embedded representation H;
S4: input the shallow shared embedded representation H obtained in S3 into two different downstream network frameworks to obtain their respective embedded representations, as follows:
S41: input the shallow shared embedded representation H obtained in S3 into the graph convolution network for link prediction to obtain the embedded representations Z_mean and Z_log;
S42: combine Z_mean and Z_log with Gaussian noise to obtain an embedded representation Z that follows a Gaussian distribution;
S43: feed Z to the discriminator as a false sample, so that a generative adversarial mechanism pushes the embedded representation Z closer to the original sample distribution;
S44: input the shallow representation H obtained in S3 into the graph convolution network for node classification to obtain the embedded representation Z_nc;
S5: use the two different embedded representations obtained in S4 for a link prediction task and a semi-supervised node classification task, respectively, as follows:
S51: input the embedded representation Z obtained in S4 into an inner-product layer to reconstruct the adjacency matrix for the link prediction task;
S52: input the embedded representation Z obtained in S4 into a graph convolution layer to reconstruct the feature matrix, as an auxiliary task for link prediction;
S53: input the embedded representation Z_nc obtained in S4 into the graph convolution network for node classification;
S54: compute the loss function and update the parameters iteratively with a gradient descent algorithm until the loss converges after a number of iterations; the final loss function is:
L = L_re + L_nc + L_fe,
where C is the set of node classes; y_ic = 1 if node i belongs to class c, and 0 otherwise; ŷ_ic is the softmax probability that node i belongs to class c; MASK_i = 1 when node i is labelled, and MASK_i = 0 otherwise; L_re = E_q(Z|X,A)[log p(A|Z)] − KL[q(Z|X,A) || p(Z)] is the reconstruction loss of the adjacency matrix, with KL[q(·) || p(·)] the relative entropy between the generated and original sample distributions; L_nc = −Σ_i MASK_i Σ_{c∈C} y_ic log ŷ_ic is the cross-entropy loss of the semi-supervised node classification; and L_fe is the reconstruction loss of the feature matrix.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111502928.6A CN114118416A (en) | 2021-12-09 | 2021-12-09 | Variational graph automatic encoder method based on multi-task learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114118416A true CN114118416A (en) | 2022-03-01 |
Family
ID=80363953
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111502928.6A Pending CN114118416A (en) | 2021-12-09 | 2021-12-09 | Variational graph automatic encoder method based on multi-task learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114118416A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117112866A (en) * | 2023-10-23 | 2023-11-24 | 人工智能与数字经济广东省实验室(广州) | Social network node migration visualization method and system based on graph representation learning |
CN117633699A (en) * | 2023-11-24 | 2024-03-01 | 成都理工大学 | Network node classification algorithm based on ternary mutual information graph comparison learning |
CN117633699B (en) * | 2023-11-24 | 2024-06-07 | 成都理工大学 | Network node classification algorithm based on ternary mutual information graph comparison learning |
CN117972497A (en) * | 2024-04-01 | 2024-05-03 | 中国传媒大学 | False information detection method and system based on multi-view feature decomposition |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |