CN116992334A

CN116992334A - Academic-oriented network node classification method and device

Info

Publication number: CN116992334A
Application number: CN202311064423.5A
Authority: CN
Inventors: 唐杰; 耿阳李敖; 东昱晓
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2023-08-22
Filing date: 2023-08-22
Publication date: 2023-11-03

Abstract

The invention discloses an academic network node classification method and device. The method comprises: constructing an undirected graph network based on an academic network; establishing an optimization objective function by using a preset feature transformation matrix, a low-rank matrix, an adjacency matrix and a feature matrix; respectively solving gradient information of the feature transformation matrix and the low-rank matrix by using the optimization objective function, and updating matrix parameters of the feature transformation matrix and the low-rank matrix based on the gradient information to obtain updated matrix parameters; and obtaining, according to the updated matrix parameters, a classification prediction probability matrix of the feature transformation matrix and the low-rank matrix for the academic entity nodes, and obtaining a classification prediction result of the academic network nodes based on the classification prediction probability matrix. The invention can reasonably classify academic network nodes under noisy conditions.

Description

Academic-oriented network node classification method and device

Technical Field

The invention relates to the technical field of node classification, in particular to an academic-oriented network node classification method and device.

Background

An academic network is a network structure that reflects academic exchanges and cooperative relationships between scholars. In this network, nodes typically represent various academic entities, such as researchers, papers, academic journals, and academic institutions. Edges represent various academic relationships between these entities, such as collaboration, citation, and publication. The academic network is an important data foundation for research such as academic information retrieval, academic recommendation, and academic evaluation. The node classification problem on academic networks is mainly to classify these academic entities. For example, researchers may be categorized according to their research area, and papers according to the subject matter of their content. Academic network node classification is a very challenging problem: the categories of academic entities are diverse, and they may intersect and overlap in various ways. In addition, the structural characteristics of academic networks, such as community structure and core-periphery structure, also make node classification harder.

The spectral graph neural network (Spectral Graph Neural Network, SGNN) is a mainstream tool for academic network node classification. An SGNN is composed of a plurality of graph convolution layers, each of which includes a feature transformation step and a spectral graph convolution step. The former is typically implemented by a fully connected network layer and aims to extract a feature representation of each node in the graph; the latter then extracts cross-node information based on the graph structure to handle downstream tasks. Spectral convolution first defines a spectral-domain transform (i.e., a graph Fourier transform) using the eigenvector matrix of the adjacency matrix of the graph (i.e., the academic network). An N-dimensional diagonal matrix is then introduced in the spectral domain to form a spectral filter, where N is the number of nodes in the academic network. However, computing the eigenvectors of the adjacency matrix incurs a huge computational overhead. To break through this computational bottleneck, many subsequent works propose to construct the graph filter directly by approximation with various polynomial bases (e.g., Chebyshev polynomials and Bernstein polynomials).
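The spectral filtering step described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than any particular SGNN implementation: the function name `spectral_filter`, the 3-node path graph and the filter values are assumptions made for the example.

```python
import numpy as np

def spectral_filter(A, x, g):
    """Filter a graph signal x in the spectral domain of adjacency matrix A.

    g holds the N diagonal entries of the spectral filter (hypothetical values).
    """
    # Eigenvectors of the symmetric adjacency matrix form the graph Fourier basis.
    _, V = np.linalg.eigh(A)
    x_hat = V.T @ x          # graph Fourier transform
    y_hat = g * x_hat        # apply the diagonal spectral filter
    return V @ y_hat         # inverse transform back to the node domain

# Toy 3-node path graph (illustrative only).
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
x = np.array([1.0, 2.0, 3.0])

all_pass = spectral_filter(A, x, np.ones(3))      # filter of all ones
eigvals, _ = np.linalg.eigh(A)
adjacency_pass = spectral_filter(A, x, eigvals)   # filter equal to the eigenvalues
```

With a filter of all ones the signal is returned unchanged, and with the eigenvalues themselves as the filter the result equals `A @ x`. The eigendecomposition is the expensive part, which is why polynomial approximations avoid it.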

Based on the above idea, GCN [Kipf, 2017] constructs the graph filter using Chebyshev polynomials of the adjacency matrix as a basis, and improves computational efficiency by truncating the higher-order terms. Wu et al. [Wu, 2019] propose to simplify GCN by eliminating the nonlinear activations in the intermediate network layers, further reducing computational overhead without sacrificing basic performance. Li et al. [Li, 2018] point out that the convolution in GCN is actually a special form of Laplacian smoothing, which can filter noise signals out of the original features if the given graph carries enough information. Conversely, when the given graph contains too many noisy links, repeated graph convolutions cause erroneous feature aggregation, resulting in the over-smoothing problem. GCNII [Chen, 2020] extends GCN by introducing initial residual connections and identity mapping, which has proven effective in alleviating over-smoothing. GraphHeat [Xu, 2019] uses the heat kernel to assign more importance to the low-frequency filter, thereby reducing the impact of high-frequency interference in the signal on the graph convolution. BernNet [He, 2021] proposes replacing the Chebyshev polynomials with Bernstein polynomials, which have been shown to have better approximation properties for an ideal filter. Under ideal conditions, the above polynomial-based constructions can approach the theoretically optimal graph filter. However, in real-world situations, the adjacency matrix of the academic network may contain a lot of noise, so the ideal conditions no longer hold and the final node classification performance is poor.

Disclosure of Invention

The present invention aims to solve at least one of the technical problems in the related art to some extent.

Therefore, the invention provides an academic-oriented network node classification method, which departs from the spectral transformation construction based on polynomials of the adjacency matrix, directly learns a low-rank representation of the spectral projection matrix, and ensures the learning effect by adding an orthogonality constraint and graph smoothing regularization. The method can generate a better spectral transformation matrix than traditional methods and can improve the classification accuracy of academic network nodes.

Another object of the present invention is to provide an academic network node classification device.

In order to achieve the above object, an aspect of the present invention provides an academic network node classification method, including:

constructing an undirected graph network based on an academic network; the undirected graph network comprises a set of academic entity nodes, relations among different academic entities, an adjacency matrix of the academic network and a feature matrix of the academic network nodes;

establishing an optimization objective function by using a preset feature transformation matrix, a low-rank matrix, the adjacency matrix and the feature matrix;

respectively solving gradient information of the feature transformation matrix and the low-rank matrix by utilizing the optimization objective function, and updating matrix parameters of the feature transformation matrix and the low-rank matrix based on the gradient information to obtain updated matrix parameters;

and obtaining a classification prediction probability matrix of the feature transformation matrix and the low-rank matrix for the academic entity node according to the updated matrix parameters, and obtaining a classification prediction result of the academic network node based on the classification prediction probability matrix.

The academic-oriented network node classification method of the embodiment of the invention can also have the following additional technical characteristics:

in one embodiment of the invention, before establishing the optimization objective function, the method further comprises:

acquiring the degree of academic entity nodes;

calculating the degree of the academic entity node to obtain a node degree calculation result;

and preprocessing the adjacency matrix and the feature matrix based on the node degree calculation result to obtain a preprocessed adjacency matrix and a preprocessed feature matrix.

In one embodiment of the present invention, the degree of an academic entity node is the number of adjacent nodes of that academic entity node; the academic entities include papers and/or authors.

In one embodiment of the present invention, the preprocessing of the adjacency matrix and the feature matrix based on the node degree calculation result to obtain a preprocessed adjacency matrix and feature matrix includes:

calculating the degree of each academic entity node, and constructing a diagonal matrix by taking the degree of each academic entity node as a diagonal element;

taking the inverse square root of the diagonal matrix to obtain an inverse-square-root diagonal matrix, and normalizing the adjacency matrix with the inverse-square-root diagonal matrix to obtain a normalized adjacency matrix; and

standardizing each column of the feature matrix by using the z-score method to obtain a standardized feature matrix.

In one embodiment of the invention, the matrix parameters of the feature transformation matrix are updated by gradient descent; the matrix parameters of the low-rank matrix are updated by a Riemannian gradient projection method for manifold optimization; and the classification prediction result of the academic network nodes is obtained from the classification prediction probability matrix by using a greedy selection algorithm.

To achieve the above object, another aspect of the present invention provides an academic network node classification apparatus, including:

the undirected graph network construction module is used for constructing an undirected graph network based on an academic network; the undirected graph network comprises a set of academic entity nodes, relations among different academic entities, an adjacency matrix of the academic network and a feature matrix of the academic network nodes;

the objective function construction module is used for establishing an optimization objective function by utilizing a preset feature transformation matrix, a low-rank matrix, the adjacency matrix and the feature matrix;

the matrix parameter updating module is used for respectively solving gradient information of the feature transformation matrix and the low-rank matrix by utilizing the optimization objective function, and updating matrix parameters of the feature transformation matrix and the low-rank matrix based on the gradient information to obtain updated matrix parameters;

and the node classification prediction module is used for obtaining a classification prediction probability matrix of the feature transformation matrix and the low-rank matrix for the academic entity node according to the updated matrix parameters, and obtaining a classification prediction result of the academic network node based on the classification prediction probability matrix.

The academic network node classification method and device of the embodiments of the invention depart from the spectral transformation construction based on polynomials of the adjacency matrix, directly learn a low-rank representation of the spectral projection matrix, and ensure the learning effect by adding an orthogonality constraint and graph smoothing regularization, so that academic network nodes are reasonably classified under noisy conditions.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a flow chart of a method of academic-oriented network node classification according to an embodiment of the invention;

FIG. 2 is a flow chart of another academic network node classification method according to an embodiment of the invention;

FIG. 3 is a schematic diagram of an exemplary academic network node classification according to an embodiment of the invention;

FIG. 4 is a schematic structural diagram of an academic network node classification apparatus according to an embodiment of the present invention.

Detailed Description

It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other. The invention will be described in detail below with reference to the drawings in connection with embodiments.

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

The following describes an academic network node classification method and device according to an embodiment of the present invention with reference to the accompanying drawings.

Fig. 1 is a flow chart of an academic network node classification method according to an embodiment of the present invention.

As shown in fig. 1, the method includes, but is not limited to, the steps of:

S1, constructing an undirected graph network based on an academic network; the undirected graph network comprises a set of academic entity nodes, relations among different academic entities, an adjacency matrix of the academic network and a feature matrix of the academic network nodes;

S2, establishing an optimization objective function by using a preset feature transformation matrix, a low-rank matrix, the adjacency matrix and the feature matrix;

S3, respectively solving gradient information of the feature transformation matrix and the low-rank matrix by using the optimization objective function, and updating matrix parameters of the feature transformation matrix and the low-rank matrix based on the gradient information to obtain updated matrix parameters;

S4, obtaining a classification prediction probability matrix of the feature transformation matrix and the low-rank matrix for the academic entity nodes according to the updated matrix parameters, and obtaining a classification prediction result of the academic network nodes based on the classification prediction probability matrix.

In one embodiment of the invention, a given academic network may be represented by an undirected graph network $G = (V, E, A, X)$, where $V$ represents a set of $|V|$ academic entity nodes (e.g., authors, papers); $E$ represents a set of $|E|$ edges, which represent relationships between different academic entities (e.g., whether there is a collaboration between different authors, or whether there is a citation relationship between different papers); $A \in \{0,1\}^{|V| \times |V|}$ is the matrix representation of $E$ (i.e., the adjacency matrix of the academic network); and $X \in \mathbb{R}^{|V| \times d}$ represents the feature matrix of the $|V|$ academic network nodes, with $d$ the feature dimension. The output is a one-hot matrix $Y \in \{0,1\}^{|V| \times c}$ representing which of the $c$ categories each of the $|V|$ nodes belongs to (e.g., whether the author or paper belongs to the discipline of medicine, physics, or computer science).
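As a concrete illustration of this representation, the following sketch builds $A$ and a partially filled one-hot $Y$ from an edge list. The helper name `build_graph`, the 0-based indexing, and the 7-node edge list are assumptions for the example; the edge list is hypothetical and is not the graph of fig. 3.

```python
import numpy as np

def build_graph(num_nodes, edges, labels, num_classes):
    """Build the adjacency matrix A and a one-hot label matrix Y.

    edges  : iterable of (i, j) node-index pairs (0-based, undirected)
    labels : dict {node index: class index} for the labelled nodes only
    """
    A = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0            # undirected graph, so A is symmetric
    Y = np.zeros((num_nodes, num_classes))
    for node, cls in labels.items():
        Y[node, cls] = 1.0       # rows of unlabelled nodes stay all-zero
    return A, Y

# Hypothetical 7-author collaboration graph.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
labels = {1: 0, 2: 0, 4: 1, 5: 1}    # four labelled nodes, two classes
A, Y = build_graph(7, edges, labels, num_classes=2)
```

Leaving the rows of unlabelled nodes at zero is one possible convention; the source does not state how unlabelled nodes enter the objective.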

For the above problem, a conventional spectral graph neural network first normalizes $A$ with its degree matrix $D$ to obtain $\tilde{A} = D^{-1/2} A D^{-1/2}$, which has better spectral transformation properties. A polynomial $g(\tilde{A})$ of $\tilde{A}$ is then used to construct a graph filter that filters the academic network node features $X$, a linear transformation $W$ is applied to the filtered features, and the final output is obtained via a nonlinear activation function $\sigma$:

$$\hat{Y} = \sigma\big(g(\tilde{A})\, X W\big)$$

where $\hat{Y} \in \mathbb{R}^{|V| \times c}$ is the academic network node classification matrix output by the model, each row of which is the probability distribution vector over the $c$ categories for the corresponding node. For clarity of description, only the single-layer case is considered here; the multi-layer case follows by analogy. The effectiveness of the graph filter $g(\tilde{A})$ is mainly determined by the quality of $\tilde{A}$. When $\tilde{A}$ contains no noise, $g(\tilde{A})$ can effectively approximate the ideal filter $g^{*}$. However, when $\tilde{A}$ contains considerable noise, there is always a non-trivial error $\epsilon$ between $g(\tilde{A})$ and $g^{*}$, which leads to errors in the classification of the academic network nodes.
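The single-layer polynomial-filter forward pass can be sketched as follows. This is an illustrative sketch only: it uses a plain monomial basis $g(\tilde{A}) = \sum_k \theta_k \tilde{A}^k$ with made-up coefficients `theta`, whereas the prior-art methods discussed in the background use Chebyshev or Bernstein bases; all names and data are assumptions.

```python
import numpy as np

def softmax_rows(M):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    M = M - M.max(axis=1, keepdims=True)
    E = np.exp(M)
    return E / E.sum(axis=1, keepdims=True)

def polynomial_filter_forward(A_norm, X, W, theta):
    """Compute softmax(g(A_norm) X W) with g(A) = sum_k theta[k] * A^k."""
    H = X
    filtered = theta[0] * H
    for coeff in theta[1:]:
        H = A_norm @ H                 # one more propagation step: A^k X
        filtered = filtered + coeff * H
    return softmax_rows(filtered @ W)

# Toy 3-node path graph with random features (illustrative only).
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
deg = A.sum(axis=1)
A_norm = A / np.sqrt(np.outer(deg, deg))   # D^{-1/2} A D^{-1/2}
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
Y_hat = polynomial_filter_forward(A_norm, X, W, theta=[0.5, 0.3, 0.2])
```

Because every term is a power of `A_norm`, any noisy edge in the adjacency matrix propagates directly into the filter, which is the limitation the text describes.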

In order to improve the classification accuracy of academic network nodes and break through the error bound of the polynomial graph filter, the invention proposes to use a low-rank matrix $UU^{T}$ in place of the polynomial graph filter $g(\tilde{A})$ to approximate the ideal filter $g^{*}$, where $U \in \mathbb{R}^{|V| \times c}$ is a "tall and thin" matrix, i.e., $|V| > c$. Note that the width of $U$ equals the number of node categories $c$; it can be shown theoretically that this setting guarantees that $UU^{T}$ can perfectly approximate $g^{*}$. In addition, in order to improve the approximation in practice, an orthogonality constraint and graph smoothing regularization are applied to $U$. A feature linear transformation $W$ and a nonlinear activation function $\sigma$ are further introduced, giving the following optimization model:

$$\min_{U, W}\ \frac{1}{2}\big\| \sigma(UU^{T} X W) - Y \big\|_F^{2} - \alpha\, \mathrm{tr}\big(U^{T} \tilde{A} U\big) \quad \text{s.t.}\ U^{T} U = I_c$$

The first term, $\frac{1}{2}\| \sigma(UU^{T} X W) - Y \|_F^{2}$, encourages the model's estimated node classification $\sigma(UU^{T} X W)$ to be close to the true annotated classes $Y$. The second term, $-\alpha\, \mathrm{tr}(U^{T} \tilde{A} U)$ with regularization weight $\alpha$, encourages $UU^{T}$ to capture the main spectral structure of the academic network adjacency matrix. The constraint $U^{T} U = I_c$ prevents $U$ from falling into a trivial solution. Note that the above optimization problem involves two kinds of variables, the unconstrained $W$ and the constrained $U$, which are updated using gradient descent and the Riemannian gradient projection method, respectively, until the update iterations converge. After the optimization is completed, a classification prediction for each academic network node is obtained from the model output $\sigma(UU^{T} X W)$ using the greedy selection principle, as shown in fig. 2.

The academic network node classification method according to the embodiment of the present invention is described in detail below with reference to the accompanying drawings.

FIG. 3 is a schematic diagram of an exemplary academic network node classification. As shown in fig. 3, for 7 academic author nodes, their academic relationships may be depicted by a 7×7 adjacency matrix $A = [a_{ij}]$, where $a_{ij} = 1$ indicates that nodes $v_i$ and $v_j$ are linked by an edge (an academic collaboration exists), and $a_{ij} = 0$ otherwise. Furthermore, each node has a corresponding author information feature (represented by a 4-dimensional vector in fig. 3), so the features may be represented by a 7×4 matrix $X$. The category information of some of the author nodes is given: $v_2$ and $v_3$ belong to category 1 (e.g., physics), and $v_5$ and $v_6$ belong to category 2 (e.g., computer science). The task of the graph neural network method with learnable spectral projection is to infer the category information of the remaining author nodes.

In one embodiment of the invention, the adjacency matrix and node features are preprocessed. First, the degree of each author node, namely the number of adjacent nodes of each node, is calculated. Taking fig. 3 as an example, the node degrees of $v_1$ to $v_7$ are, respectively: 2. 3, 2, 3. A 7×7 diagonal matrix $D$ can be constructed with these 7 values as diagonal elements. Taking the inverse square root of $D$ gives $D^{-0.5}$, based on which $A$ is normalized: $\tilde{A} = D^{-0.5} A D^{-0.5}$. This ensures that the absolute values of the eigenvalues of $\tilde{A}$ do not exceed 1, which benefits the stability of the algorithm. In addition, each column of the node feature matrix $X$ is standardized using the z-score method (i.e., the column mean is subtracted and the result is divided by the column standard deviation), which benefits the convergence of the algorithm.
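The two preprocessing operations can be sketched as below. The function names, the handling of isolated (degree-0) nodes, the small `eps` guard, and the 7-node graph are assumptions for illustration; the graph is hypothetical and does not reproduce fig. 3.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D^{-1/2} A D^{-1/2} for a symmetric 0/1 adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)                       # degree = number of neighbours
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    # Assumption: isolated nodes get a zero scaling factor instead of 1/0.
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]

def zscore_columns(X, eps=1e-12):
    """Subtract each column's mean and divide by its standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)

# Hypothetical 7-node undirected graph and random 4-dimensional features.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
A = np.zeros((7, 7))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

A_norm = normalize_adjacency(A)
X_std = zscore_columns(np.random.default_rng(0).normal(size=(7, 4)))
max_abs_eig = np.max(np.abs(np.linalg.eigvalsh(A_norm)))
```

For this example `max_abs_eig` does not exceed 1, in line with the stability argument in the text.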

In one embodiment of the invention, the objective function of the model is defined. Based on the preprocessed normalized adjacency matrix $\tilde{A}$ and feature matrix $X$, a feature transformation matrix $W$ is further introduced to perform feature transformation learning. For the example illustrated in fig. 3, the dimension of the node features is 4 and the number of categories is 2, so the size of $W$ is 4×2. In addition, a matrix $U$ is introduced to approximate the ideal propagation matrix. In the example shown in fig. 3, the number of nodes is 7 and the number of categories is 2, so the size of $U$ is 7×2. Based on these matrices, the following optimization objective function is established:

$$\min_{U, W}\ \frac{1}{2}\big\| \sigma(UU^{T} X W) - Y \big\|_F^{2} - \alpha\, \mathrm{tr}\big(U^{T} \tilde{A} U\big) \quad \text{s.t.}\ U^{T} U = I_2$$

where the nonlinear activation $\sigma$ is chosen as the softmax function, $\mathrm{tr}(\cdot)$ denotes the matrix trace (the sum of the diagonal elements), $I_2$ denotes the 2×2 identity matrix, and $\alpha$ is the regularization term weight, typically set to 0.25. $U$ and $W$ are each randomly initialized and then updated by alternating optimization.
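A sketch of evaluating such an objective is given below. It assumes the form $\frac{1}{2}\|\sigma(UU^{T}XW) - Y\|_F^2 - \alpha\,\mathrm{tr}(U^{T}\tilde{A}U)$, which is one reading consistent with the term descriptions in the text (a fit term, a trace-based smoothing term weighted by $\alpha$); the exact formula, the QR-based initialization, the ring graph and the random labels are all assumptions.

```python
import numpy as np

def softmax_rows(M):
    M = M - M.max(axis=1, keepdims=True)
    E = np.exp(M)
    return E / E.sum(axis=1, keepdims=True)

def objective(U, W, A_norm, X, Y, alpha=0.25):
    """Assumed objective: squared-error fit minus alpha * tr(U^T A_norm U)."""
    Y_hat = softmax_rows(U @ (U.T @ (X @ W)))
    fit = 0.5 * np.sum((Y_hat - Y) ** 2)
    smooth = -alpha * np.trace(U.T @ A_norm @ U)
    return fit + smooth

def random_orthonormal(n, c, rng):
    """Random n x c matrix with orthonormal columns (satisfies U^T U = I)."""
    Q, _ = np.linalg.qr(rng.normal(size=(n, c)))
    return Q

rng = np.random.default_rng(0)
n, d, c = 7, 4, 2

# Hypothetical ring graph: every node has degree 2, so A_norm = A / 2.
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
A_norm = A / 2.0

X = rng.normal(size=(n, d))
Y = np.eye(c)[rng.integers(0, c, size=n)]   # random one-hot labels
U = random_orthonormal(n, c, rng)
W = rng.normal(size=(d, c))
J = objective(U, W, A_norm, X, Y)
```

Since the eigenvalues of the normalized adjacency matrix lie in $[-1, 1]$ and $U$ has orthonormal columns, the trace term is at most $c$, so the assumed objective is bounded below by $-\alpha c$.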

In one embodiment of the invention, the feature transformation matrix $W$ is updated. First, $U$ is fixed and $W$ is updated. Note that this is an unconstrained optimization problem in $W$, so $W$ can be updated by gradient descent. The gradient of the objective function with respect to $W$, denoted $\nabla_W$, is computed first, and $W$ is then updated using the gradient descent method:

$$W \leftarrow W - \gamma_1 \nabla_W$$

where $\gamma_1$ is the update step size, usually set to 0.2.
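The $W$ update can be sketched as follows, under the assumption that the fit term is $\frac{1}{2}\|\sigma(UU^{T}XW) - Y\|_F^2$ with a row-wise softmax; the smoothing term does not depend on $W$. Under that assumption $\nabla_W = X^{T} U U^{T} Z$, where $Z$ is the gradient with respect to the pre-activation. The code uses the exact softmax Jacobian for $Z$, which is one interpretation of the shorthand $\sigma'(\cdot) \odot (\sigma(\cdot) - Y)$ used in the text. Function names and data are hypothetical.

```python
import numpy as np

def softmax_rows(M):
    M = M - M.max(axis=1, keepdims=True)
    E = np.exp(M)
    return E / E.sum(axis=1, keepdims=True)

def fit_loss(U, W, X, Y):
    """Assumed fit term: 0.5 * ||softmax(U U^T X W) - Y||_F^2."""
    return 0.5 * np.sum((softmax_rows(U @ (U.T @ (X @ W))) - Y) ** 2)

def grad_W(U, W, X, Y):
    S = softmax_rows(U @ (U.T @ (X @ W)))
    R = S - Y
    # Gradient w.r.t. the pre-activation via the softmax Jacobian, row by row.
    Z = S * (R - np.sum(S * R, axis=1, keepdims=True))
    return X.T @ (U @ (U.T @ Z))

def update_W(U, W, X, Y, gamma1=0.2):
    """One gradient descent step on W with U held fixed."""
    return W - gamma1 * grad_W(U, W, X, Y)

rng = np.random.default_rng(1)
n, d, c = 7, 4, 2
X = rng.normal(size=(n, d))
Y = np.eye(c)[rng.integers(0, c, size=n)]
U, _ = np.linalg.qr(rng.normal(size=(n, c)))   # orthonormal columns
W = rng.normal(size=(d, c))

G = grad_W(U, W, X, Y)
W_new = update_W(U, W, X, Y)
```

A finite-difference comparison against `fit_loss` is a cheap way to confirm that an analytic gradient of this kind matches the loss it is meant to differentiate.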

In one embodiment of the invention, the low-rank matrix $U$ is updated. At this stage, $W$ is fixed and $U$ is updated. Note that $U$ is subject to a constraint and therefore cannot be updated directly by gradient descent. In fact, the constraint restricts $U$ to a Grassmann manifold, so $U$ can be updated using the Riemannian gradient projection method for manifold optimization. Specifically, the ordinary (Euclidean) gradient of the objective function with respect to $U$, denoted $\nabla_U$, is computed first; it involves the matrix $Z = \sigma'(UU^{T} X W) \odot (\sigma(UU^{T} X W) - Y)$. Based on $\nabla_U$, the Riemannian gradient on the Grassmann manifold can be calculated:

$$\mathrm{grad}[U] = (I - UU^{T})\, \nabla_U$$

$U$ is then updated based on the Riemannian gradient:

$$U \leftarrow U - \gamma_2\, \mathrm{grad}[U]$$

where the update step size $\gamma_2$ is typically set to 0.2. The above update may violate the Grassmann manifold condition, so the updated $U$ needs to be projected back onto the Grassmann manifold so that it satisfies the constraint:

$$U \leftarrow S_L S_R^{T}$$

where $S_L$ and $S_R$ are the left and right singular vector matrices of $U$, respectively.
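One $U$ update can be sketched as below. The sketch assumes the objective $\frac{1}{2}\|\sigma(UU^{T}XW) - Y\|_F^2 - \alpha\,\mathrm{tr}(U^{T}\tilde{A}U)$, for which the Euclidean gradient is $\nabla_U = Z W^{T} X^{T} U + X W Z^{T} U - 2\alpha \tilde{A} U$; it uses the standard tangent-space projection $(I - UU^{T})\nabla_U$ and the SVD-based projection $S_L S_R^{T}$. The objective form, the function names and the data are assumptions.

```python
import numpy as np

def softmax_rows(M):
    M = M - M.max(axis=1, keepdims=True)
    E = np.exp(M)
    return E / E.sum(axis=1, keepdims=True)

def objective(U, W, A_norm, X, Y, alpha):
    S = softmax_rows(U @ (U.T @ (X @ W)))
    return 0.5 * np.sum((S - Y) ** 2) - alpha * np.trace(U.T @ A_norm @ U)

def euclidean_grad_U(U, W, A_norm, X, Y, alpha):
    P = X @ W
    S = softmax_rows(U @ (U.T @ P))
    R = S - Y
    Z = S * (R - np.sum(S * R, axis=1, keepdims=True))  # softmax Jacobian applied
    return Z @ (P.T @ U) + P @ (Z.T @ U) - 2.0 * alpha * (A_norm @ U)

def riemannian_step(U, W, A_norm, X, Y, alpha=0.25, gamma2=0.2):
    G = euclidean_grad_U(U, W, A_norm, X, Y, alpha)
    rgrad = G - U @ (U.T @ G)                 # (I - U U^T) G
    U_moved = U - gamma2 * rgrad              # step; may leave the manifold
    S_L, _, S_R_t = np.linalg.svd(U_moved, full_matrices=False)
    return S_L @ S_R_t, rgrad                 # projection restores U^T U = I

rng = np.random.default_rng(2)
n, d, c, alpha = 7, 4, 2, 0.25
A = np.zeros((n, n))
for i in range(n):                            # hypothetical ring graph
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
A_norm = A / 2.0
X = rng.normal(size=(n, d))
Y = np.eye(c)[rng.integers(0, c, size=n)]
W = rng.normal(size=(d, c))
U, _ = np.linalg.qr(rng.normal(size=(n, c)))

G = euclidean_grad_U(U, W, A_norm, X, Y, alpha)
U_new, rgrad = riemannian_step(U, W, A_norm, X, Y, alpha)
```

The projection through the singular vectors returns the closest matrix with orthonormal columns to the moved point, so the constraint holds again after every step.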

In one embodiment of the invention, the node classifications are obtained by greedy selection. Based on the optimized model parameters $W$ and $U$, the classification prediction probability matrix of the model for the author nodes is obtained:

$$\hat{Y} = \sigma(UU^{T} X W)$$

For the example in fig. 3, $\hat{Y}$ has 7 rows (the number of nodes) and 2 columns (the number of categories). To obtain the categories of $v_1$, $v_4$ and $v_7$, the category with the highest probability is selected from rows 1, 4 and 7 of $\hat{Y}$, respectively, as the prediction result of the model.
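Greedy selection amounts to a row-wise argmax over the probability matrix. The sketch below uses made-up probabilities and 0-based indices, so rows 0, 3 and 6 stand for $v_1$, $v_4$ and $v_7$; the function name `greedy_select` is hypothetical.

```python
import numpy as np

def greedy_select(Y_hat, rows):
    """Pick, for each requested row, the class with the highest probability."""
    return {r: int(np.argmax(Y_hat[r])) for r in rows}

# Made-up 7 x 2 prediction probability matrix (each row sums to 1).
Y_hat = np.array([[0.8, 0.2],
                  [0.9, 0.1],
                  [0.7, 0.3],
                  [0.4, 0.6],
                  [0.2, 0.8],
                  [0.1, 0.9],
                  [0.3, 0.7]])

predictions = greedy_select(Y_hat, rows=[0, 3, 6])
```

Here class index 0 corresponds to category 1 and class index 1 to category 2 of the running example.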

The experimental results of the invention are as follows. Experimental comparisons were made against ten advanced graph neural network methods (GraphSAGE, GAT, SGC, APPNP, DGC, ChebNet, GCN, GraphHeat, FAGCN, and S²GC) on four public academic network node classification datasets: Cora, Citeseer, Pubmed, and Coauthor-Physics. The results show that the method of the present invention achieves the highest classification accuracy on all four datasets compared with the ten methods, with a 2% improvement in accuracy over the runner-up on the Citeseer dataset (the difference between the runner-up and the third-place method being only 0.3%). The proposed method also has the smallest standard deviation on all datasets, which demonstrates its stability.

Furthermore, noise robustness experiments were performed on a simulated graph network dataset: noise was injected into a standard dataset step by step, from 0% to 70% in 10% increments, and the performance of the different methods on the noisy datasets was tested. The results show that the method outperforms the ten compared advanced graph neural networks, with a maximum improvement of about 10%, which demonstrates the robustness of the method to noise.

Further, the graph propagation matrices learned by the traditional spectral graph neural network methods and by the present invention are visually compared on the four datasets Cora, Citeseer, Pubmed, and Coauthor-Physics. The results show that the graph propagation matrix learned by the method can break through the limitations of traditional spectral graph neural networks and gives the best approximation of the ideal propagation matrix.

The academic network node classification method provided by the embodiment of the invention departs from the spectral transformation construction based on polynomials of the adjacency matrix, directly learns a low-rank representation of the spectral projection matrix, and ensures the learning effect by adding an orthogonality constraint and graph smoothing regularization, so that academic network nodes are reasonably classified under noisy conditions.

In order to implement the above embodiment, as shown in fig. 4, an academic network node classification apparatus 10 is further provided in this embodiment, where the apparatus 10 includes an undirected graph network construction module 100, an objective function construction module 200, a matrix parameter update module 300, and a node classification prediction module 400.

An undirected graph network construction module 100 for constructing an undirected graph network based on an academic network; the undirected graph network comprises a set of academic entity nodes, relations among different academic entities, an adjacency matrix of the academic network and a feature matrix of the academic network nodes;

the objective function construction module 200 is configured to establish an optimization objective function by using a preset feature transformation matrix, a low-rank matrix, the adjacency matrix and the feature matrix;

the matrix parameter updating module 300 is configured to solve gradient information of the feature transformation matrix and the low-rank matrix by using an optimization objective function, and update matrix parameters of the feature transformation matrix and the low-rank matrix based on the gradient information to obtain updated matrix parameters;

the node classification prediction module 400 is configured to obtain a classification prediction probability matrix of the feature transformation matrix and the low-rank matrix for the academic entity node according to the updated matrix parameters, and obtain a classification prediction result of the academic network node based on the classification prediction probability matrix.

Further, before the objective function building module 200, the apparatus further includes a data preprocessing module, configured to:

acquiring the degree of academic entity nodes;

calculating the degrees of academic entity nodes to obtain node degree calculation results;

Further, the degree of the academic entity node is the number of adjacent nodes of each academic entity node; the academic entities, including papers and/or authors.

Further, the data preprocessing module is further configured to:

Further, the matrix parameters of the feature transformation matrix are updated by gradient descent; the matrix parameters of the low-rank matrix are updated by a Riemannian gradient projection method for manifold optimization; and the classification prediction result of the academic network nodes is obtained from the classification prediction probability matrix by using a greedy selection algorithm.

The academic network node classification device provided by the embodiment of the invention departs from the spectral transformation construction based on polynomials of the adjacency matrix, directly learns a low-rank representation of the spectral projection matrix, and ensures the learning effect by adding an orthogonality constraint and graph smoothing regularization, so that academic network nodes are reasonably classified under noisy conditions.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.

Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present invention, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.

Claims

1. An academic-oriented network node classification method is characterized by comprising the following steps:

2. The method of claim 1, wherein prior to establishing the optimization objective function, the method further comprises:

acquiring the degree of academic entity nodes;

3. The method of claim 2, wherein the degree of an academic entity node is the number of adjacent nodes of that academic entity node; the academic entities include papers and/or authors.

4. A method according to claim 3, wherein the preprocessing the adjacency matrix and the feature matrix based on the node degree calculation result to obtain a preprocessed adjacency matrix and feature matrix comprises:

5. The method according to claim 1, wherein the matrix parameters of the feature transformation matrix are updated by gradient descent; the matrix parameters of the low-rank matrix are updated by a Riemannian gradient projection method for manifold optimization; and the classification prediction result of the academic network nodes is obtained from the classification prediction probability matrix by using a greedy selection algorithm.

6. An academic-oriented network node classification apparatus, comprising:

7. The apparatus of claim 6, wherein prior to the objective function building module, the apparatus further comprises a data preprocessing module configured to:

acquiring the degree of academic entity nodes;

8. The apparatus of claim 7, wherein the degree of an academic entity node is the number of adjacent nodes of that academic entity node; the academic entities include papers and/or authors.

9. The apparatus of claim 8, wherein the data preprocessing module is further configured to:

10. The apparatus of claim 6, wherein the matrix parameters of the feature transformation matrix are updated by gradient descent; the matrix parameters of the low-rank matrix are updated by a Riemannian gradient projection method for manifold optimization; and the classification prediction result of the academic network nodes is obtained from the classification prediction probability matrix by using a greedy selection algorithm.