CN113159160B - Semi-supervised node classification method based on node attention - Google Patents

Semi-supervised node classification method based on node attention

Info

Publication number
CN113159160B
CN113159160B CN202110412835.8A
Authority
CN
China
Prior art keywords
node
nodes
network
characteristic
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110412835.8A
Other languages
Chinese (zh)
Other versions
CN113159160A (en
Inventor
俞俊 (Yu Jun)
甘银兰 (Gan Yinlan)
丁佳骏 (Ding Jiajun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202110412835.8A priority Critical patent/CN113159160B/en
Publication of CN113159160A publication Critical patent/CN113159160A/en
Application granted granted Critical
Publication of CN113159160B publication Critical patent/CN113159160B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a semi-supervised node classification method based on node attention. The method comprises the following steps: step (1), data preprocessing; step (2), extracting node features through a graph convolutional network of 1-2 layers, which serve as the data for subsequent operations; step (3), node adaptive adjustment: first, the features of each node's first-order neighbors are average-aggregated, and the node's own feature information is concatenated with the aggregated features to obtain the required local representation; the extracted local representation is then fed into a single-layer fully connected network, and the output of the fully connected network together with the node features obtained in step (2) is input into a gating unit for feature fusion; and step (4), classification prediction and accuracy measurement. The invention adjusts each node adaptively, has a clear advantage in space complexity over the graph attention network (GAT), and delivers performance comparable to it.

Description

Semi-supervised node classification method based on node attention
Technical Field
The invention provides a semi-supervised node classification method based on node attention. It targets graph data that is large in scale and dense in edges, and uses the idea of the attention mechanism to adjust nodes adaptively, so as to obtain more discriminative node representations and improve the training efficiency and performance of the model.
Background
In recent years, network analysis has received increasing attention. By studying the relations between nodes in a network, nodes can be assigned labels that carry information such as interests, hobbies, and social influence. In practice, however, a network graph contains a large number of unlabeled nodes, and it is particularly important to classify them effectively using the existing labeled nodes and the network structure. Unlike traditional datasets, where each data point carries only its own feature vector, nodes in a network interact through edge relations, such as friendship relations in social networks or citation relations in paper networks. By analyzing the network structure, including the relations between nodes and edges, and applying semi-supervised learning to a small set of labeled nodes, unlabeled nodes in the network can be classified more accurately, saving extensive manual labeling effort and additional computation.
At present, semi-supervised node classification based on network structure follows three directions: relation learning, feature representation learning, and deep learning. Typical relation-learning algorithms, such as the relational neighbor (RN) classifier, are only suitable for relatively small networks, have high computational complexity, and require the network graph to have certain special properties. Feature representation learning learns a feature representation of each node from the network structure; the most widely studied approaches in recent years are based on random walks, such as DeepWalk and node2vec, which require little computation, classify well, and are widely applied. With the rapid development of deep learning, graph convolutional neural network models grounded in graph theory build parameterized filters in the spectral domain by analogy with the Fourier transform, yielding many excellent algorithms such as GCN, GraphSAGE, and FastGCN. Compared with traditional node classification algorithms, deep-learning-based ones are more effective, but costly in running time and memory. How to reduce the time and space complexity of graph neural network models is therefore a current research hotspot and difficulty.
The semi-supervised node classification task poses two technical difficulties. The first is model learning on large-scale data: given the huge space and time cost of existing deep learning algorithms, a lightweight learning strategy must be designed to reduce model complexity. The second is that a node's neighborhood contains a large amount of noise, and absorbing neighborhood information directly inevitably introduces erroneous information; an effective learning method must therefore adjust node feature representations adaptively, extracting useful information while avoiding noise.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a semi-supervised node classification method based on node attention.
The technical scheme adopted by the invention for solving the technical problems comprises the following steps:
Step (1) data preprocessing
For the semi-supervised node classification datasets, following the GCN data-processing protocol, 20 nodes per class are selected as the training set, 500 nodes are randomly selected as the validation set, and 1000 nodes as the test set;
Step (2) feature extraction
All nodes (training, validation, and test sets alike) first pass through a single graph convolutional layer (GCL) to extract node feature expressions, which serve as the data for subsequent operations;
step (3) node self-adaptive adjustment
First, the features of each node's first-order neighbors are average-aggregated; the node's own feature information is then concatenated with the aggregated features to obtain the required local representation, with cross-node interaction yielding a richer local representation. The extracted local representation is fed into a single-layer fully connected network (FC), and the FC output together with the node features obtained in step (2) is input into a gating unit for feature fusion, readjusting the feature information of each node;
Step (4) Classification prediction
Finally, the classification probabilities are output through an output layer, and the accuracy is computed.
Further, the data preprocessing in the step (1):
1-1 Datasets (Cora, CiteSeer, PubMed): the Cora dataset contains 2708 sample points, each a scientific paper; all sample points are divided into 7 categories, each paper is represented by a 1433-dimensional word vector, and there are 5429 citation links. The CiteSeer dataset has 3327 sample points and 4732 citation links; all samples are divided into 6 classes, and each node has a 3703-dimensional feature. The PubMed dataset has 19717 sample points and 44338 citation links. Following the standard dataset split, the following operations are applied to all datasets: 20 nodes per class are selected as the training set, 500 nodes are randomly selected from the remaining data as the validation set, and 1000 nodes as the test set;
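The split described above can be sketched as follows. This is a minimal illustration, not the patent's actual loader: the `labels` array is a randomly generated stand-in for real Planetoid labels, and the seed and helper name are assumptions.

```python
import numpy as np

def planetoid_style_split(labels, num_classes, train_per_class=20,
                          num_val=500, num_test=1000, seed=0):
    """Split node indices as described above: 20 labeled nodes per class
    for training, then 500 random validation nodes and 1000 test nodes
    drawn from the remaining data."""
    rng = np.random.default_rng(seed)
    train_idx = []
    for c in range(num_classes):
        class_nodes = np.flatnonzero(labels == c)
        train_idx.extend(class_nodes[:train_per_class])
    train_idx = np.array(train_idx)
    rest = np.setdiff1d(np.arange(len(labels)), train_idx)
    rest = rng.permutation(rest)
    val_idx = rest[:num_val]
    test_idx = rest[num_val:num_val + num_test]
    return train_idx, val_idx, test_idx

# Toy illustration with Cora-like sizes (7 classes, 2708 nodes)
labels = np.random.default_rng(1).integers(0, 7, size=2708)
tr, va, te = planetoid_style_split(labels, num_classes=7)
```

With Cora's 7 classes this yields the 140/500/1000 train/validation/test split used throughout the experiments.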
further, the feature extraction in the step (2):
2-1 Node information is extracted for each node through a single-layer graph convolutional network, which mainly comprises 2 parts:
2-2 Feature transformation: a new node feature expression is obtained through a learnable parameter.
2-3 Feature aggregation: the node feature expression obtained in step 2-2 is Laplacian-smoothed, i.e., the feature expressions of each node's neighbors and of the node itself are weighted and summed to form the node's new feature, which is then passed through an activation function to obtain the new node feature expression.
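A single graph convolution layer combining the feature transformation and feature aggregation steps above might be sketched in NumPy as follows. The symmetric renormalization with self-loops follows the standard GCN formulation; the ReLU activation and the toy graph are illustrative assumptions.

```python
import numpy as np

def gcn_layer(X, A, W):
    """One graph convolution layer: feature transformation (X @ W)
    followed by feature aggregation with the renormalized adjacency
    D^{-1/2} (A + I) D^{-1/2} (Laplacian smoothing), then ReLU."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # degrees incl. self-loop
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # weighted neighbor sum
    return np.maximum(0.0, A_norm @ X @ W)    # activation function

# Tiny example: 4 nodes on a path graph, 3-dim features, 2 hidden units
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 3))
W = np.random.default_rng(1).normal(size=(3, 2))
H = gcn_layer(X, A, W)   # new node feature expression, shape (4, 2)
```

Each row of `H` is the smoothed, transformed feature of one node, ready for the node attention module that follows.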
Further, the node adaptation of (3) described above:
3-1 Node attention is first defined, comprising an aggregation neighborhood, cross-node interaction, and a gating mechanism.
3-2 Aggregation neighborhood: a local representation containing topology information is obtained by average-aggregating the feature expressions of each node's first-order neighbor nodes.
3-3 Cross-node interaction: the local representation obtained from the aggregation neighborhood is concatenated with the node's own feature information to obtain a new node representation, which is fed into a single-layer fully connected network for self-learning; an activation function then outputs an attention coefficient matrix of the same size as the node representation obtained in step 2-3.
3-4 Gating mechanism: the result obtained in step 3-3 is normalized so that its values lie in [0,1]; the normalized attention coefficient matrix is then multiplied elementwise with the node representation generated in step 2-3 to obtain the adaptively adjusted node representation.
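The three operations above, neighborhood aggregation, cross-node interaction, and gating, can be sketched as a single module. The tanh activation for the FC layer and the sigmoid used for normalization are illustrative assumptions; the patent only specifies that the normalized coefficients lie in [0, 1].

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def node_attention(H, A, W_fc):
    """Node attention: (1) average-aggregate each node's first-order
    neighbors; (2) concatenate with the node's own features and pass
    through a single-layer FC network (cross-node interaction);
    (3) normalize into [0, 1] and gate the original features."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    neigh_mean = (A @ H) / deg                  # aggregation neighborhood
    local = np.concatenate([H, neigh_mean], 1)  # cross-node interaction
    att = np.tanh(local @ W_fc)                 # FC + activation (assumed tanh)
    gate = sigmoid(att)                         # normalization into [0, 1]
    return gate * H                             # adaptively adjusted features

# Example: 4 nodes with 2-dim features from the previous layer
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.random.default_rng(2).normal(size=(4, 2))
W_fc = np.random.default_rng(3).normal(size=(4, 2))  # maps concat (2+2) -> 2
H_adj = node_attention(H, A, W_fc)  # same shape as H
```

Because the gate never exceeds 1, each output feature is a scaled-down copy of the input feature, which is what lets the module enhance important features and suppress noisy ones per node.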
Further, the classification prediction in the step (4) is as follows:
4-1 The adaptively adjusted node feature representation obtained in step (3) is passed through a graph convolution layer to obtain the classification probabilities of the nodes, and the accuracy is computed.
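The final prediction step, an output layer producing class probabilities followed by an accuracy measurement, can be sketched as below. The softmax output and argmax accuracy are the standard choices; the logits and labels here are illustrative values, not results from the patent.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax: turns output-layer logits into class probabilities."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def accuracy(probs, labels, idx):
    """Fraction of nodes in `idx` whose argmax class matches the label."""
    pred = probs[idx].argmax(axis=1)
    return float((pred == labels[idx]).mean())

# Illustrative logits for 5 nodes over 3 classes
logits = np.array([[2.0, 0.1, 0.1],
                   [0.1, 2.0, 0.1],
                   [0.1, 0.1, 2.0],
                   [2.0, 0.1, 0.1],
                   [0.1, 2.0, 0.1]])
probs = softmax(logits)
labels = np.array([0, 1, 2, 0, 0])
acc = accuracy(probs, labels, np.arange(5))  # the last node is misclassified
```

In the experiments the accuracy would be evaluated on the 1000-node test set index rather than on all nodes.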
The invention has the following beneficial effects:
For the large-scale, dense graph data found in practical applications, node classification tasks (binary or multi-class) are studied within a graph convolutional neural network framework combined with an attention mechanism, fusing the graph's topological information with node feature expressions to produce more expressive node representations. The feature representations of a central node's direct neighbors are fused with the central node's own features; after cross-node interaction, a simple gating mechanism yields the attention coefficient matrix, and finally the adaptively adjusted node feature expression. Important features are enhanced and unimportant information is filtered out, improving classification accuracy. In addition, the algorithm's space complexity is linear in the number of nodes, reducing model complexity.
The invention inserts a node attention module into a conventional two-layer graph convolutional network. Compared with GCN, performance improves by 1.5%, 2.4%, and 0.8% on the Cora, CiteSeer, and PubMed datasets respectively; compared with GAT, performance improves by 0.5%, 0.2%, and 0.5% respectively, while the space complexity of the attention computation is reduced from O(E) to O(N).
Drawings
FIG. 1 is a schematic view of the overall framework of the present invention;
FIG. 2 is a detailed block diagram of a node attention layer;
Detailed Description
The invention is further described below with reference to the accompanying drawings.
As shown in FIG. 1, a semi-supervised node classification method based on node attention specifically comprises the following steps:
Step (1) data preprocessing
For the graph node classification dataset, following the standard dataset processing method, 20 nodes per class are selected as the training set, 500 nodes are randomly selected from the remaining data as the validation set, and 1000 nodes as the test set;
Step (2) feature extraction
All nodes first pass through a shallow graph convolutional network to extract node features, which serve as data preparation for subsequent operations;
step (3) node self-adaptive adjustment
As shown in FIG. 2, the features of each node's first-order neighbors are first average-aggregated; each node's feature information is then concatenated with the aggregated features, and cross-node interaction yields richer local information. The extracted local information is fed into a single-layer fully connected network, and the result together with the node features obtained in step (2) is input into a gating unit for feature fusion, readjusting each node's feature information;
Step (4) Classification prediction
Finally, the classification probabilities are output through a graph convolution output layer, and the accuracy is computed.
Further, the data preprocessing in the step (1):
1-1 dataset (Cora, citeseer, pubmed) Cora dataset total 2708 sample points, each sample point being a scientific paper, all sample points being divided into 7 categories, each paper being represented by a 1433-dimensional word vector, there being 5429 citations; the Citeseer data set has 3327 sample points, has 4732 reference relations, all samples are divided into 6 major classes, and each node has 3703 dimension characteristics; the Pubmed dataset has 19717 sample points and 44338 reference relations. According to a standard data set dividing method, all data sets are subjected to the following operations: selecting 20 nodes from each class as a training set, randomly selecting 500 nodes from the rest data as a verification set, and selecting 1000 nodes as a test set;
further, the feature extraction in the step (2):
2-1 Node information is extracted for each node through a single-layer graph convolutional network, which mainly comprises 2 parts:
2-2 Feature transformation: the feature expression is self-learned through an optimizable parameter.
2-3 Feature aggregation: using the graph's topology, i.e. the adjacency matrix, node representations are propagated and absorbed across the neighborhood after feature transformation.
Further, the node adaptation of (3) described above:
3-1 We first define node attention, which consists of three parts: the aggregation neighborhood, cross-node interaction, and the gating mechanism.
3-2 Aggregation neighborhood: a compressed local representation is obtained by averaging the information of each node's directly adjacent nodes.
3-3 Cross-node interaction: the local representation obtained by aggregating the neighborhood is concatenated with the node's own feature information to obtain a new node expression, which is fed into a single-layer fully connected network for self-learning and outputs a feature map of the same size as the node expression obtained in 2-3.
3-4 Gating mechanism: the result obtained in 3-3 is normalized so that its values lie in [0,1]; this matrix is then multiplied elementwise with the node representation generated in 2-3, realizing the adaptive adjustment of the node.
Further, the classification prediction in the step (4) is as follows:
4-1 From the node representation obtained in the previous step, a classification probability matrix is obtained through a graph convolution layer, and the relevant evaluation metrics are computed.
Example: taking node 4 in FIG. 2 as an example, the node feature expressions obtained through step (2) are denoted H = [h1, h2, …, h7]; the feature of node 4 is denoted h4; the first-order neighbors of node 4 are h1, h2, h3, h5; the fully connected network FC is denoted f, the activation function σ, and the normalization function δ; the adaptively adjusted feature expression of node 4 is denoted h'4;
Aggregation neighborhood: h' = [h4 ∥ (h1 + h2 + h3 + h5)/4]
Cross-node interaction: h'' = σ(f(h'))
Gating mechanism: h'4 = h4 ⊙ δ(h'').
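The worked example for node 4 can be traced numerically as follows. The concrete feature values and FC weights are random stand-ins, and taking σ as tanh and δ as the sigmoid is an assumption; only the computation pattern follows the example above.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.normal(size=(7, 4))           # h1..h7, 4-dim features (illustrative)
h4 = H[3]                             # node 4 (1-indexed in the text)
neighbors = H[[0, 1, 2, 4]]           # h1, h2, h3, h5

# Aggregation neighborhood: h' = [h4 || mean(h1, h2, h3, h5)]
h_prime = np.concatenate([h4, neighbors.mean(axis=0)])

# Cross-node interaction: h'' = sigma(f(h')), f a single FC layer
W = rng.normal(size=(8, 4))           # illustrative FC weights (8 -> 4)
h_pp = np.tanh(h_prime @ W)           # sigma assumed to be tanh

# Gating mechanism: h4' = h4 (elementwise *) delta(h'')
delta = 1.0 / (1.0 + np.exp(-h_pp))   # delta assumed to be the sigmoid
h4_adj = h4 * delta                   # adaptively adjusted feature of node 4
```

Since δ maps every coordinate into (0, 1), the adjusted feature h'4 is a per-dimension damped copy of h4, with the damping learned from node 4's local neighborhood.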

Claims (1)

1. A semi-supervised node classification method based on node attention is characterized by comprising the following steps:
Step (1) data preprocessing
For the semi-supervised node classification datasets, following the GCN data-processing protocol, 20 nodes per class are selected as the training set, 500 nodes are randomly selected as the validation set, and 1000 nodes as the test set;
Step (2) feature extraction
The feature expressions of all nodes are extracted through a single-layer graph convolutional network, and the obtained node feature expressions serve as the data for subsequent operations;
step (3) node self-adaptive adjustment
First, the features of each node's first-order neighbors are average-aggregated; the node's own feature information is then concatenated with the aggregated features to obtain the required local representation, with cross-node interaction yielding a richer local representation. The extracted local representation is fed into a single-layer fully connected network, and the network's output together with the node features obtained in step (2) is input into a gating unit for feature fusion, realizing the readjustment of each node's feature information;
Step (4) Classification prediction
Finally, the classification probabilities are output through an output layer, and the accuracy is computed;
data preprocessing in the step (1):
According to the standard data set dividing method, the following operations are carried out on all data sets: selecting 20 nodes from each class as a training set, randomly selecting 500 nodes from the rest data as a verification set, and selecting 1000 nodes as a test set;
The datasets comprise: the Cora dataset with 2708 sample points, each a scientific paper; all sample points are divided into 7 categories, each paper is represented by a 1433-dimensional word vector, and there are 5429 citation links; the CiteSeer dataset with 3327 sample points and 4732 citation links, all samples divided into 6 classes, each node having a 3703-dimensional feature; and the PubMed dataset with 19717 sample points and 44338 citation links;
the feature extraction in the step (2):
2-1 Node information is extracted for each node through a single-layer graph convolutional network;
The single-layer graph convolutional network mainly comprises 2 parts:
① Feature transformation: a new node feature expression is obtained through a learnable parameter;
② Feature aggregation: the obtained node feature expression is Laplacian-smoothed, i.e., the feature expressions of each node's neighbors and of the node itself are weighted and summed to form the node's new feature, which is passed through an activation function to obtain the new node feature expression;
the node self-adaptive adjustment in the step (3):
3-1 Node attention is first defined, comprising: an aggregation neighborhood, cross-node interaction, and a gating mechanism;
① Aggregation neighborhood: a local representation containing topology information is obtained by average-aggregating the feature expressions of each node's first-order neighbor nodes;
② Cross-node interaction: the local representation obtained by aggregating the neighborhood is concatenated with the node's own feature information to obtain a new node representation, which is fed into a single-layer fully connected network for self-learning; an activation function then outputs an attention coefficient matrix of the same size as the node representation obtained in step 2-3;
③ Gating mechanism: the obtained attention coefficient matrix is normalized so that its values lie in [0,1]; the normalized attention coefficient matrix is then multiplied elementwise with the node representation generated by feature aggregation to obtain the adaptively adjusted node representation;
classification prediction as described in step (4):
The adaptively adjusted node feature representation obtained in step (3) is passed through a graph convolution layer to obtain the classification probabilities of the nodes, and the accuracy is computed.
CN202110412835.8A 2021-04-16 2021-04-16 Semi-supervised node classification method based on node attention Active CN113159160B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110412835.8A CN113159160B (en) 2021-04-16 2021-04-16 Semi-supervised node classification method based on node attention

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110412835.8A CN113159160B (en) 2021-04-16 2021-04-16 Semi-supervised node classification method based on node attention

Publications (2)

Publication Number Publication Date
CN113159160A CN113159160A (en) 2021-07-23
CN113159160B true CN113159160B (en) 2024-06-25

Family

ID=76868506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110412835.8A Active CN113159160B (en) 2021-04-16 2021-04-16 Semi-supervised node classification method based on node attention

Country Status (1)

Country Link
CN (1) CN113159160B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114492568A (en) * 2021-12-20 2022-05-13 西安理工大学 Node classification method based on Bert model
CN114708479B (en) * 2022-03-31 2023-08-29 杭州电子科技大学 Self-adaptive defense method based on graph structure and characteristics

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492691A (en) * 2018-11-07 2019-03-19 南京信息工程大学 A kind of hypergraph convolutional network model and its semisupervised classification method
CN111582443A (en) * 2020-04-22 2020-08-25 成都信息工程大学 Recommendation method based on Mask mechanism and level attention mechanism

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112085124B (en) * 2020-09-27 2022-08-09 西安交通大学 Complex network node classification method based on graph attention network
CN112085127A (en) * 2020-10-26 2020-12-15 安徽大学 Semi-supervised classification method for mixed high-low order neighbor information

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492691A (en) * 2018-11-07 2019-03-19 南京信息工程大学 A kind of hypergraph convolutional network model and its semisupervised classification method
CN111582443A (en) * 2020-04-22 2020-08-25 成都信息工程大学 Recommendation method based on Mask mechanism and level attention mechanism

Also Published As

Publication number Publication date
CN113159160A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN113159160B (en) Semi-supervised node classification method based on node attention
Yoshida Toward finding hidden communities based on user profile
CN108805213B (en) Power load curve double-layer spectral clustering method considering wavelet entropy dimensionality reduction
WO2023155508A1 (en) Graph convolutional neural network and knowledge base-based paper correlation analysis method
Shen et al. The analysis of intelligent real-time image recognition technology based on mobile edge computing and deep learning
CN110377605A (en) A kind of Sensitive Attributes identification of structural data and classification stage division
CN114298834A (en) Personal credit evaluation method and system based on self-organizing mapping network
CN115761275A (en) Unsupervised community discovery method and system based on graph neural network
CN116842194A (en) Electric power semantic knowledge graph system and method
CN115456093A (en) High-performance graph clustering method based on attention-graph neural network
CN113641821A (en) Value orientation identification method and system for opinion leaders in social network
Shukla et al. Role of hybrid optimization in improving performance of sentiment classification system
CN116416478B (en) Bioinformatics classification model based on graph structure data characteristics
CN113743079A (en) Text similarity calculation method and device based on co-occurrence entity interaction graph
CN112215490A (en) Power load cluster analysis method based on correlation coefficient improved K-means
CN114818681B (en) Entity identification method and system, computer readable storage medium and terminal
CN111539465A (en) Internet of things unstructured big data analysis algorithm based on machine learning
CN112463974A (en) Method and device for establishing knowledge graph
CN115240271A (en) Video behavior identification method and system based on space-time modeling
Li et al. A Novel Semi-supervised Adaboost Technique Based on Improved Tri-training
Yin et al. Gcn-based text classification research
Shi et al. Three-way spectral clustering
Liu et al. Classification of Medical Text Data Using Convolutional Neural Network-Support Vector Machine Method
Xie et al. An intrusion detection method based on hierarchical feature learning and its application
Gong et al. Research on mobile traffic data augmentation methods based on SA-ACGAN-GN

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant