CN113159160A - Semi-supervised node classification method based on node attention - Google Patents

Semi-supervised node classification method based on node attention

Info

Publication number
CN113159160A
CN113159160A
Authority
CN
China
Prior art keywords
node
nodes
attention
characteristic
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110412835.8A
Other languages
Chinese (zh)
Inventor
俞俊
甘银兰
丁佳骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202110412835.8A priority Critical patent/CN113159160A/en
Publication of CN113159160A publication Critical patent/CN113159160A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a semi-supervised node classification method based on node attention, comprising the following steps: (1) data preprocessing; (2) feature extraction, in which node features are extracted through a 1-2 layer graph convolutional network to prepare data for subsequent operations; (3) node adaptive adjustment: first, the features of each node's first-order neighbors are averaged, and the node's own feature information is concatenated with the averaged features to obtain the required local features; the extracted local features are then fed into a single-layer fully-connected network, and the output of the fully-connected network, together with the node features obtained in step (2), is input into a gating unit for feature fusion; (4) classification prediction and accuracy measurement. The method adjusts each node adaptively, has a clear advantage in space complexity over graph attention networks, and achieves comparable performance.

Description

Semi-supervised node classification method based on node attention
Technical Field
The invention provides a semi-supervised node classification method based on node attention. It targets graph data that are large in scale and dense in edges, and uses the idea of the attention mechanism to adaptively adjust each node, so as to obtain more discriminative node representations and improve the training efficiency and performance of the model.
Background
In recent years, network analysis has received increasing attention. By studying the relationships between the nodes in a network, nodes can be labeled with information such as interests and social influence. In practice, however, a network graph contains a large number of unlabeled nodes, so effectively classifying them using the existing labeled nodes and the network structure is particularly important. Unlike a traditional data set, in which each sample has an independent feature vector, nodes in a network influence each other through edge relations, such as friendships in a social network or mutual citations in a paper network. By analyzing the network structure, including the relations between nodes and edges, the small fraction of labeled nodes can be used to accurately classify the unlabeled nodes in the network, saving the effort of manual labeling and the high cost of extra computation.
At present, work on the semi-supervised node classification problem over network structures falls into three directions: relational learning, feature-representation learning, and deep learning. Typical relational-learning algorithms, such as the RN classifier, are only suitable for relatively small networks, have relatively high computational complexity, and require the network graph itself to have certain special properties. Feature-representation learning learns node feature representations from the network structure; the most widely studied approaches in recent years are based on random walks, such as DeepWalk and node2vec, which have low computational cost, good classification performance, and wide applicability. With the rapid development of deep learning, graph convolutional neural network models grounded in graph theory build parameterized filters in the spectral domain by analogy with the Fourier transform, and many excellent algorithms have been proposed, such as GCN, GraphSAGE, and FastGCN. Compared with traditional node classification algorithms, deep-learning-based methods are more effective, but fare worse in running time and memory. Therefore, reducing the time and space complexity of neural network models is a current research hotspot and challenge.
The semi-supervised node classification task presents two technical difficulties. The first is model learning on large-scale data: given the huge space and time overhead of existing deep learning algorithms, a lightweight learning strategy must be designed to reduce model complexity. The second is that node neighborhoods contain a great deal of noise, and directly absorbing neighborhood information inevitably introduces errors; an effective learning method must therefore adaptively adjust each node's feature representation to capture useful information while avoiding the introduction of noise.
Disclosure of Invention
The invention aims to provide a semi-supervised node classification method based on node attention that addresses the shortcomings of the prior art.
The technical scheme adopted by the invention for solving the technical problem comprises the following steps:
step (1) data preprocessing
For a semi-supervised node classification data set, following the GCN data-processing scheme, 20 nodes per class are selected as the training set, 500 nodes are randomly selected as the validation set, and 1000 nodes as the test set;
step (2) feature extraction
Node feature expressions are extracted for all nodes (training, validation, and test sets) through a single-layer graph convolutional layer (GCL), and the resulting node feature expressions are prepared as data for subsequent operations;
step (3) node self-adaptive adjustment
First, the features of each node's first-order neighbors are averaged; the node's own feature information is then concatenated with the averaged features to obtain the required local features, and cross-node interaction yields richer local features. The extracted local features are then fed into a single-layer fully-connected network (FC), and the output of the FC, together with the node features obtained in step (2), is input into a gating unit for feature fusion, thereby readjusting each node's feature information;
step (4) classified prediction
Finally, the classification probabilities are output through an output layer, and the accuracy is calculated.
Further, the data preprocessing of step (1):
1-1 Citation data sets (Cora, Citeseer, Pubmed). The Cora data set contains 2708 sample points, each a scientific paper; all sample points are divided into 7 categories, each paper is represented by a 1433-dimensional word vector, and there are 5429 citation relations. The Citeseer data set contains 3327 sample points and 4732 citation relations; all samples are divided into 6 classes, and each node has 3703-dimensional features. The Pubmed data set has 19717 sample points and 44338 citation relations. Following the standard data set partitioning method, the same operations are performed on all data sets: 20 nodes per class are selected as the training set, 500 nodes are randomly selected from the remaining data as the validation set, and 1000 nodes are used as the test set;
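The split described above can be sketched in a few lines; `split_dataset` and the toy labels below are illustrative names rather than anything specified in the patent:

```python
import random
from collections import defaultdict

def split_dataset(labels, num_classes, per_class=20, num_val=500, num_test=1000, seed=0):
    """Standard (GCN-style) split: 20 labelled nodes per class for training,
    then 500 validation and 1000 test nodes drawn from the remainder."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for node, y in enumerate(labels):
        by_class[y].append(node)
    # 20 nodes per class for the training set
    train = [n for y in range(num_classes) for n in by_class[y][:per_class]]
    train_set = set(train)
    rest = [n for n in range(len(labels)) if n not in train_set]
    rng.shuffle(rest)  # random selection of validation and test nodes
    return train, rest[:num_val], rest[num_val:num_val + num_test]

# Toy run with 3 classes and 3000 synthetic labels
labels = [i % 3 for i in range(3000)]
train, val, test = split_dataset(labels, num_classes=3)
```

The three index lists are disjoint by construction, matching the Cora/Citeseer/Pubmed protocol described above.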
further, the feature extraction in the step (2):
2-1 Node information is extracted for each node through a single-layer graph convolutional network. The single-layer graph convolutional network mainly comprises 2 parts:
2-2 Feature transformation: a new node feature expression is obtained through a learnable parameter.
2-3 Feature aggregation: Laplacian smoothing is applied to the node feature expression obtained in step 2-2; that is, the feature expressions of each node's neighbors and of the node itself are weighted and summed as the node's new feature, to which an activation function is applied to obtain the new node feature expression.
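A minimal pure-Python sketch of such a layer, assuming mean aggregation over each node together with its neighbors and a ReLU activation (the patent only specifies a weighted sum followed by an activation, so these weights are an assumption):

```python
def gcn_layer(X, adj, W):
    """One graph-convolution layer: feature transformation (X·W), then
    neighbourhood aggregation with a self-loop and mean weighting
    (Laplacian smoothing), followed by a ReLU activation."""
    n, d_in, d_out = len(X), len(X[0]), len(W[0])
    # 2-2 feature transformation: H = X W
    H = [[sum(X[i][k] * W[k][j] for k in range(d_in)) for j in range(d_out)]
         for i in range(n)]
    # 2-3 feature aggregation: mean over {node} ∪ neighbours, then ReLU
    out = []
    for i in range(n):
        nbrs = [i] + adj[i]                      # add self-loop
        agg = [sum(H[v][j] for v in nbrs) / len(nbrs) for j in range(d_out)]
        out.append([max(0.0, a) for a in agg])   # ReLU activation
    return out

# Toy graph: 3 nodes in a path 0-1-2, 2-d features, identity weights
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[1], [0, 2], [1]]
W = [[1.0, 0.0], [0.0, 1.0]]
H = gcn_layer(X, adj, W)
```

With identity weights, the layer simply smooths each node's features toward its neighborhood average, which is the effect step 2-3 describes.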
Further, the node adaptive adjustment of step (3):
3-1 Node attention is first defined, comprising three parts: an aggregation neighborhood, cross-node interaction, and a gating mechanism.
3-2 Aggregation neighborhood: a local representation containing topology information is obtained by averaging the feature expressions of each node's first-order neighbor nodes.
3-3 Cross-node interaction: the local features obtained from the aggregation neighborhood are concatenated with the node's own feature information to obtain a new node feature, which is fed into a single-layer fully-connected network for self-learning; through an activation function, an attention coefficient matrix of the same size as the node features obtained in step 2-3 is output.
3-4 Gating mechanism: the result obtained in step 3-3 is normalized so that its values lie in [0,1], and the normalized attention coefficient matrix is then multiplied element-wise with the node representation generated in step 2-3 to obtain the adaptively adjusted node representation.
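The three parts of node attention can be sketched as follows. The sigmoid as the normalization function and an FC layer mapping the 2d-dimensional concatenation back to d dimensions are assumptions: the patent fixes only that the output matches the node-feature size and that values are squashed into [0,1]:

```python
import math

def node_attention(H, adj, W_fc, b_fc):
    """Node-attention sketch: per node, (a) mean-aggregate first-order
    neighbour features, (b) concatenate with the node's own features and
    pass through a single fully-connected layer, (c) squash to (0,1) with
    a sigmoid gate and rescale the node's features element-wise."""
    d = len(H[0])
    out = []
    for i, h in enumerate(H):
        nbrs = adj[i]
        # (a) aggregation neighbourhood: average of first-order neighbours
        mean = [sum(H[v][j] for v in nbrs) / len(nbrs) for j in range(d)]
        # (b) cross-node interaction: concat (length 2d) -> single FC layer
        local = h + mean
        z = [sum(local[k] * W_fc[k][j] for k in range(2 * d)) + b_fc[j]
             for j in range(d)]
        # (c) gating mechanism: sigmoid gate, element-wise rescaling
        gate = [1.0 / (1.0 + math.exp(-v)) for v in z]
        out.append([h[j] * gate[j] for j in range(d)])
    return out

# Toy usage: two mutually adjacent nodes; zero FC weights give gate = 0.5
H = [[2.0, 4.0], [1.0, 1.0]]
adj = [[1], [0]]
W_fc = [[0.0, 0.0] for _ in range(4)]
b_fc = [0.0, 0.0]
H_adj = node_attention(H, adj, W_fc, b_fc)
```

Because the gate is computed per node from a fixed-size concatenation, the attention cost grows with the number of nodes rather than the number of edges, which is the O(N) versus O(E) advantage claimed later.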
Further, the classification prediction of step (4):
4-1 The adaptively adjusted node feature representations obtained in step (3) are passed through a graph convolution layer to obtain the classification probabilities of the nodes, and the accuracy is calculated.
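The prediction step reduces to a softmax over the output layer followed by an argmax comparison against the labels; a sketch with hypothetical logits (the graph convolution producing them is omitted here):

```python
import math

def softmax(row):
    # Numerically stable softmax over one node's class scores
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def accuracy(logits, labels, idx):
    """Argmax over class probabilities, scored on the evaluation indices
    only (semi-supervised setting: most nodes carry no label)."""
    correct = 0
    for i in idx:
        probs = softmax(logits[i])
        pred = probs.index(max(probs))
        correct += int(pred == labels[i])
    return correct / len(idx)

# Hypothetical 2-class logits for 3 nodes; node 2 is misclassified
logits = [[2.0, 0.1], [0.3, 1.5], [0.2, 0.1]]
acc = accuracy(logits, [0, 1, 1], [0, 1, 2])
```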
The invention has the following beneficial effects:
For the large-scale, densely connected graph data found in practical applications, the node classification task (binary or multi-class) is studied based on the graph convolutional network framework and the idea of the attention mechanism, combining the topology information of the graph with node feature expressions to produce more expressive node representations. The feature representations of a central node's direct neighbors are fused with the central node's own representation; after cross-node interaction, an attention coefficient matrix is obtained through a simple gating mechanism, and finally an adaptively adjusted node feature expression is obtained. Important features are thus strengthened and unimportant information filtered out, improving classification accuracy. In addition, the space complexity of the algorithm is linear in the number of nodes, reducing the complexity of the model.
The invention inserts a node attention module into a conventional two-layer graph convolutional network. Compared with GCN, its performance improves by 1.5%, 2.4%, and 0.8% on the Cora, Citeseer, and Pubmed data sets, respectively; compared with GAT, there are gains of 0.5%, 0.2%, and 0.5%, respectively, and the invention reduces the space complexity of the attention computation from O(E) to O(N).
Drawings
FIG. 1 is a general framework schematic of the present invention;
FIG. 2 is a detailed block diagram of a node attention layer;
Detailed Description
The invention is further described below with reference to the accompanying drawings.
As shown in fig. 1, a semi-supervised node classification method based on node attention specifically comprises the following steps:
Step (1) data preprocessing
For a graph node classification data set, according to a standard data set processing method, 20 nodes are selected from each class to serve as a training set, 500 nodes are randomly selected from the rest data to serve as a verification set, and 1000 nodes are selected to serve as a test set;
step (2) feature extraction
Node features are first extracted for all nodes through a shallow graph convolutional network, as data preparation for subsequent operations;
step (3) node self-adaptive adjustment
As shown in fig. 2, the features of each node's first-order neighbors are first averaged; each node's own feature information is then concatenated with the aggregated features, and cross-node interaction yields richer local information. The extracted local information is then fed into a single-layer fully-connected network, and its output, together with the feature map obtained in step (2), is input into a gating unit for feature fusion, thereby readjusting each node's feature information;
step (4) classified prediction
Finally, the classification probabilities are output through a graph convolution output layer, and the accuracy is calculated.
Further, the data preprocessing of step (1):
1-1 Citation data sets (Cora, Citeseer, Pubmed). The Cora data set contains 2708 sample points, each a scientific paper; all sample points are divided into 7 categories, each paper is represented by a 1433-dimensional word vector, and there are 5429 citation relations. The Citeseer data set contains 3327 sample points and 4732 citation relations; all samples are divided into 6 classes, and each node has 3703-dimensional features. The Pubmed data set has 19717 sample points and 44338 citation relations. Following the standard data set partitioning method, the same operations are performed on all data sets: 20 nodes per class are selected as the training set, 500 nodes are randomly selected from the remaining data as the validation set, and 1000 nodes are used as the test set;
further, the feature extraction in the step (2):
2-1 Node information is extracted for each node through a single-layer graph convolutional network. The single-layer graph convolutional network mainly comprises 2 parts:
2-2 Feature transformation. The feature expression is self-learned through an optimizable parameter.
2-3 Feature aggregation. Using the topological structure of the graph and the adjacency matrix, the node representations after feature transformation are propagated and absorbed from the neighborhood.
Further, the node adaptive adjustment of step (3):
3-1 Node attention is first defined; it consists of three parts: an aggregation neighborhood, cross-node interaction, and a gating mechanism.
3-2 Aggregation neighborhood: the information of each node's directly adjacent nodes is averaged to obtain a compressed local representation.
3-3 Cross-node interaction: the local representation obtained from the aggregation neighborhood is concatenated with the node's own feature information to obtain a new node expression, which is fed into a single-layer fully-connected network for self-learning, outputting a feature map of the same size as the node expression obtained in step 2-3.
3-4 Gating mechanism: the result obtained in step 3-3 is normalized so that its values lie in [0,1], and the matrix is then multiplied element-wise with the node representation generated in step 2-3 to realize the adaptive adjustment of the node.
Further, the classification prediction of step (4):
4-1 From the node representations obtained in the previous step, a classification probability matrix is obtained through a graph convolution layer, and the relevant metrics are calculated.
Example: taking node 4 in fig. 2, the node feature expressions obtained in step (2) are written H = [h1, h2, …, h7], the representation of node 4 is h4, and the first-order neighbors of node 4 are h1, h2, h3, h5. The fully-connected network FC is written f, the activation function σ, the normalization function δ, and the adaptively adjusted feature expression of node 4 is h'4. Then:
3-2 Aggregation neighborhood: h̄ = (h1 + h2 + h3 + h5) / 4
3-3 Cross-node interaction: h'' = σ(f([h4 ∥ h̄]))
3-4 Gating mechanism: h'4 = h4 ⊙ δ(h'').
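The three formulas for node 4 can be checked numerically on toy 2-d features. The concrete FC weights, σ taken as ReLU, and δ as the logistic sigmoid are hypothetical choices for illustration; the patent leaves these unspecified:

```python
import math

# Toy 2-d features for nodes 1..7 (0-based indices 0..6); node 4 is index 3
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0],
     [0.0, 2.0], [1.0, 2.0], [2.0, 0.0]]
h4 = H[3]
neighbours = [0, 1, 2, 4]          # h1, h2, h3, h5

# 3-2 aggregation neighbourhood: mean over the first-order neighbours
mean = [sum(H[v][j] for v in neighbours) / len(neighbours) for j in range(2)]

# 3-3 cross-node interaction: h'' = σ(f([h4 ∥ mean])); f is a hypothetical
# FC layer taken here as the identity on the first two coordinates, σ = ReLU
h_prime = h4 + mean                 # concatenation, length 4
W = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
h_pp = [max(0.0, sum(h_prime[k] * W[k][j] for k in range(4))) for j in range(2)]

# 3-4 gating mechanism: h'4 = h4 ⊙ δ(h''), with δ the logistic sigmoid
h4_adj = [h4[j] / (1.0 + math.exp(-h_pp[j])) for j in range(2)]
```

The gate values lie in (0,1), so h'4 is a damped copy of h4 whose strength is decided jointly by the node and its neighborhood.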

Claims (6)

1. A semi-supervised node classification method based on node attention is characterized by comprising the following steps:
step (1) data preprocessing
For a semi-supervised node classification data set, following the GCN data-processing scheme, 20 nodes per class are selected as the training set, 500 nodes are randomly selected as the validation set, and 1000 nodes as the test set;
step (2) feature extraction
Node feature expressions are extracted for all nodes through a single-layer graph convolutional network, and the resulting node feature expressions are prepared as data for subsequent operations;
step (3) node self-adaptive adjustment
First, the features of each node's first-order neighbors are averaged; the node's own feature information is then concatenated with the averaged features to obtain the required local features, and cross-node interaction yields richer local features; the extracted local features are then fed into a single-layer fully-connected network, and the output of the fully-connected network, together with the node features obtained in step (2), is input into a gating unit for feature fusion, thereby readjusting each node's feature information;
step (4) classified prediction
Finally, the classification probabilities are output through an output layer, and the accuracy is calculated.
2. The method for node attention-based semi-supervised node classification according to claim 1, wherein the data preprocessing of the step (1):
According to the standard data set division method, the following operations are carried out on all data sets: 20 nodes are selected from each class as the training set, 500 nodes are randomly selected from the remaining data as the validation set, and 1000 nodes are selected as the test set.
3. The node attention-based semi-supervised node classification method according to claim 2, wherein the citation data sets comprise: the Cora data set contains 2708 sample points, each a scientific paper; all sample points are divided into 7 categories, each paper is represented by a 1433-dimensional word vector, and there are 5429 citation relations; the Citeseer data set contains 3327 sample points and 4732 citation relations, all samples are divided into 6 classes, and each node has 3703-dimensional features; the Pubmed data set has 19717 sample points and 44338 citation relations.
4. The node attention-based semi-supervised node classification method according to claim 2 or 3, wherein the feature extraction of step (2):
2-1, node information is extracted for each node through a single-layer graph convolutional network;
the single-layer graph convolutional network mainly comprises 2 parts:
① feature transformation: a new node feature expression is obtained through a learnable parameter;
② feature aggregation: Laplacian smoothing is applied to the obtained node feature expression; that is, the feature expressions of each node's neighbors and of the node itself are weighted and summed as the node's new feature, to which an activation function is applied to obtain the new node feature expression.
5. The node attention-based semi-supervised node classification method according to claim 4, wherein the node adaptive adjustment of step (3):
3-1 node attention is first defined as comprising: an aggregation neighborhood, cross-node interaction, and a gating mechanism;
① aggregation neighborhood: a local representation containing topology information is obtained by averaging the feature expressions of each node's first-order neighbor nodes;
② cross-node interaction: the local features obtained from the aggregation neighborhood are concatenated with the node's own feature information to obtain a new node feature, which is fed into a single-layer fully-connected network for self-learning; through an activation function, an attention coefficient matrix of the same size as the node features obtained in step 2-3 is output;
③ gating mechanism: the attention coefficient matrix is first normalized so that its values lie in [0,1], and the normalized matrix is then multiplied element-wise with the node representation generated by feature aggregation to obtain the adaptively adjusted node representation.
6. The method of claim 5, wherein the classification prediction of step (4):
the adaptively adjusted node feature representations obtained in step (3) are passed through a graph convolution layer to obtain the classification probabilities of the nodes, and the accuracy is calculated.
CN202110412835.8A 2021-04-16 2021-04-16 Semi-supervised node classification method based on node attention Pending CN113159160A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110412835.8A CN113159160A (en) 2021-04-16 2021-04-16 Semi-supervised node classification method based on node attention


Publications (1)

Publication Number Publication Date
CN113159160A 2021-07-23

Family

ID=76868506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110412835.8A Pending CN113159160A (en) 2021-04-16 2021-04-16 Semi-supervised node classification method based on node attention

Country Status (1)

Country Link
CN (1) CN113159160A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114492568A (en) * 2021-12-20 2022-05-13 西安理工大学 Node classification method based on Bert model
CN114708479A (en) * 2022-03-31 2022-07-05 杭州电子科技大学 Self-adaptive defense method based on graph structure and characteristics
CN114708479B (en) * 2022-03-31 2023-08-29 杭州电子科技大学 Self-adaptive defense method based on graph structure and characteristics


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination