CN115905903A - Multi-view clustering method and system based on graph attention automatic encoder - Google Patents

Multi-view clustering method and system based on graph attention automatic encoder

Info

Publication number
CN115905903A
CN115905903A (application CN202211446136.6A)
Authority
CN
China
Prior art keywords
view
clustering
node
graph
feature representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211446136.6A
Other languages
Chinese (zh)
Inventor
尉秀梅
陈佃迎
姜雪松
陈珺
柴慧慧
马浩翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology filed Critical Qilu University of Technology
Priority to CN202211446136.6A priority Critical patent/CN115905903A/en
Publication of CN115905903A publication Critical patent/CN115905903A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a multi-view clustering method and system based on a graph attention autoencoder, relating to the technical field of multi-view clustering. The scheme comprises the following steps: selecting the most informative view from the different views of the same group of nodes; learning the graph structure and node content with a trained graph attention encoder, based on the most informative view and the node content information, to obtain a node feature representation; performing a specificity constraint on the node feature representation with an ℓ1,2-norm penalty to obtain a constrained node feature representation; and inputting the constrained node feature representation into a self-optimizing clustering module for clustering to obtain the final clustering result. To better suit the clustering task, the graph attention network is applied to multi-view graph clustering, and the graph structure and the node content are reconstructed simultaneously, so that both the graph structure and the content information of the nodes are well preserved in the latent representation.

Description

Multi-view clustering method and system based on graph attention automatic encoder
Technical Field
The invention belongs to the technical field of multi-view clustering, and particularly relates to a multi-view clustering method and system based on a graph attention autoencoder.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In recent years, more and more multi-view clustering methods have been proposed for analyzing and processing multimedia data. Most multi-view clustering algorithms comprise the following two steps: first, a shared similarity graph is constructed from the multi-view data, and then the similarity graph is clustered to obtain the clustering result. In practical multimedia applications, owing to the heterogeneity of multimedia acquisition sources, multi-view data often exhibit redundancy, correlation and diversity, which leaves the prior art with several problems:
1) Most clustering methods adopt a shallow model to process the complex relations in the multi-view graph, which severely limits their ability to model multi-view graph information;
The purpose of graph embedding is to learn a low-dimensional node representation while retaining the content information and topological structure of the nodes. Graph embedding methods have emerged over the last decade and, depending on the input information, can be roughly divided into two major categories: topology embedding (TSE) and content-enhanced graph embedding (CEGE).
The TSE methods take only the topology as input and learn a low-dimensional node representation from it. For example, Perozzi et al. propose a truncated random-walk algorithm to learn node representations, which converts the original graph structure information into a set of linear sequences. Rather than generating linear sequences, Cao et al. propose a deep neural network for learning graph representations (DNGR), built on a random-surfing model that can directly exploit topological structure information. Cavallari et al. propose a community embedding framework (ComE) that integrates community embedding, community detection and node embedding into a closed loop instead of embedding each node individually; to address the problem of an unknown number of communities, Cavallari et al. further propose learning finite and infinite community embeddings (ComE+). Although the above methods give good results, they consider only the graph structure, which limits their performance. Research has shown that multi-view data help to improve clustering performance; however, all the above methods use only the graph structure and node content of a single view, leading to non-ideal results.
Numerous multi-view clustering methods are dedicated to learning high-quality latent representations or affinity matrices shared by different views. Among them, deep multi-view clustering methods have attracted wide attention from researchers for their outstanding representation capability and fast inference speed: for example, Andrew et al. propose a new multi-view clustering algorithm combining a deep encoder with canonical correlation analysis (DCCA), and Wang et al. extend DCCA with a deep decoder to obtain the deep canonically correlated autoencoder (DCCAE), which learns multi-view representations better.
To better utilize the graph structure information in multiple views, Li et al. proposed a GCN-based multi-view learning method, Co-GCN; however, Co-GCN is designed for semi-supervised clustering. To address this problem, Fan et al. proposed a multi-view autoencoder (O2MAC) for graph-embedded clustering; although O2MAC is successful, it encodes the node content information of only a single view, so its performance is limited when processing single-view graph and node content data.
2) Many graph-embedding clustering methods have been developed in recent years, but none of them considers the community-specific distribution of the node representations, so their clustering performance is not ideal;
several clustering models have been proposed so far, whose core is to learn a low-dimensional, compact and continuous representation and then implement a classical clustering method on the learned representation to obtain a clustering label; although the clustering performance is improved, the clustering-specific distribution represented by the nodes is ignored, different communities are distributed on different characteristic dimensions, the characteristic distribution of the nodes is very disordered, the characteristics of the nodes are very similar even on most dimensions, and the algorithm can cluster all the nodes into the same community, so that the clustering result is low; in the above methods, only the graph structure and node content of the single-view graph are utilized, which results in non-ideal results; when a part of methods process the graph structure information by using multiple views, the method only encodes the node content information of a single view; when processing data of the single view graph and node contents, the performance is limited.
Therefore, how to fuse the multi-view information becomes a topic worth studying.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a multi-view clustering method and system based on a graph attention autoencoder. To better suit the clustering task, the graph attention network is applied to multi-view graph clustering, and the graph structure and node content are reconstructed simultaneously, so that both the graph structure and the content information of the nodes are well preserved in the latent representation.
To achieve the above object, one or more embodiments of the present invention provide the following technical solutions:
the invention provides a multi-view clustering method based on an automatic graph attention encoder;
the multi-view clustering method based on the graph attention automatic encoder comprises the following steps:
selecting a view with the largest information amount from different views of the same group of nodes;
based on the view with the largest information amount and the node content information, learning a graph structure and node content by using a trained graph attention encoder to obtain node feature representation;
performing a specificity constraint on the node feature representation using an ℓ1,2-norm penalty to obtain a constrained node feature representation;
inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result;
and reconstructing the constrained node feature representation by adding a multi-view decoder, and training the graph attention encoder by using reconstruction loss.
Further, the most informative view is obtained by calculating the modularity score of each graph view based on the clustering index and the adjacency matrix, and selecting the graph view with the highest score as the most informative view.
Further, the graph attention encoder also learns the importance of the neighbor nodes.
Furthermore, the importance of the neighbor nodes is learned by assigning different weights to the neighbors in the layer-by-layer graph attention strategy of the graph attention encoder, the weights representing the importance of each neighbor node to the current node.
Further, the ℓ1,2-norm penalty specifically constrains the node feature representation according to the formula:

$$L_{norm} = \beta \sum_{i=1}^{N} \lVert Z_i \rVert_1^2$$

where β is a trade-off parameter, $Z_i$ is the node feature representation of the i-th node, and N is the total number of nodes.
Further, in the multi-view decoder, a mapping function is added to change the mapping range of the node feature representation.
Further, the mapping function is specifically:

$$\phi(x) = x - \frac{1}{x}$$

where x is the node feature representation.
A second aspect of the invention provides a multi-view clustering system based on a graph attention auto-encoder.
The multi-view clustering system based on the graph attention autoencoder comprises a view selection module, a feature representation module, a feature constraint module and a cluster prediction module:
a view selection module configured to: selecting a view with the largest information amount from different views of the same group of nodes;
a feature representation module configured to: based on the view with the largest information amount and the node content information, a graph structure and node content are learned by using a graph attention encoder to obtain node feature representation;
a feature constraint module configured to: perform a specificity constraint on the node feature representation using an ℓ1,2-norm penalty to obtain a constrained node feature representation;
a cluster prediction module configured to: inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result;
and reconstructing the constrained node feature representation by adding a multi-view decoder, and training the graph attention encoder by using reconstruction loss.
A third aspect of the invention provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the steps of the multi-view clustering method based on a graph attention autoencoder according to the first aspect of the invention.
A fourth aspect of the present invention provides an electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps of the method for multi-view clustering based on a graph attention auto-encoder according to the first aspect of the present invention when executing the program.
The above one or more technical solutions have the following beneficial effects:
the invention provides a multi-view clustering Method (MCBGA) based on a graph attention automatic encoder, which uses the graph attention automatic encoder to perform an attribute multi-view graph clustering task, effectively integrates multi-view graph structures and content information to perform deep potential representation learning, combines node representation learning and clustering into a unified framework, and jointly optimizes embedded learning and graph clustering.
The invention uses an ℓ1,2-norm penalty to address the community-specific distribution of the node representations; the penalty plays an important role in learning the node representations and characterizes the clustering structure well, thereby improving the clustering results.
Experimental results on four benchmark data sets show that the algorithm of the present invention outperforms the most advanced graph clustering methods.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and together with the description serve to explain the invention, without limiting it.
Fig. 1 is a model configuration diagram of the first embodiment.
Fig. 2 is a flow chart of the method of the first embodiment.
Fig. 3 is a comparison of mapping functions of the first embodiment.
FIG. 4 is a graph comparing performance on the ACM data set of the first embodiment.
FIG. 5 is a graph comparing performance on a DBLP data set for the first embodiment.
FIG. 6 is a graph comparing performance on the IMDB data set of the first embodiment.
Fig. 7 is a MCBGA performance graph with additional views of the first embodiment.
Fig. 8 is a system configuration diagram of the second embodiment.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention; unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention; as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
The invention provides a multi-view clustering method based on a graph attention autoencoder, which uses the graph attention autoencoder to reconstruct the graph structure and node content simultaneously, so that the latent representation preserves both the graph structure and the content information of the nodes. The idea of the invention is as follows:
A shared representation is extracted from the most informative graph view and the content data using a graph attention autoencoder, and all views are then reconstructed from the shared representation; the graph attention autoencoder consists of one graph attention encoder and multiple decoders. Specifically, it learns node representations from the most informative view through the graph attention encoder, and reconstructs all views through multiple view-specific decoders, thereby using the multi-view graph structures and the node content.
To eliminate the uncertainty of a post-processing clustering operation, a clustering activation function is introduced, which benefits both node representation learning and node clustering. The inventors have found that the ℓ1,2-norm penalty plays an important role in characterizing the community-specific distribution of graph-structured data in the dimension space; it is therefore applied to learning the node representation, where it characterizes the clustering structure well and thereby improves the clustering results.
In addition, a self-training clustering target is designed so that the current cluster assignment approaches a target distribution better suited to the clustering task. By jointly optimizing the reconstruction loss and the clustering loss, the model can optimize node embedding and clustering simultaneously, the two improving each other within a unified framework.
Example one
This embodiment discloses a multi-view clustering method based on a graph attention autoencoder.
as shown in FIG. 1, to simultaneously represent multiple view structures A in a unified framework (1) ,...,A (M) And node content X, the embodiment provides a novel graph embedding clustering model, namely a graph attention automatic encoder, which comprises a graph attention encoder and a multi-view decoder, wherein the graph attention encoder is shared by all views, a shared representation is extracted from a view graph structure and content data, the multi-view decoder is designed, the multi-view data of the shared representation is reconstructed, and the graph attention encoder is trained by using reconstruction loss.
Given a graph $G=(V, E^{(1)},\ldots,E^{(M)}, X)$, where $V=\{v_1,\ldots,v_N\}$ denotes the set of nodes and $e_{ij}^{(m)} \in E^{(m)}$ denotes the association between nodes i and j in the m-th view, the topology of G can be described by the set of adjacency matrices $\{A^{(m)}\}_{m=1}^{M}$, where $A^{(m)}$ is the adjacency matrix of the m-th view: $A_{ij}^{(m)}=1$ if $e_{ij}^{(m)} \in E^{(m)}$ and $A_{ij}^{(m)}=0$ otherwise. $X=\{x_1,\ldots,x_N\}$ contains the attribute values representing the content information of the nodes.
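For illustration only, the construction of these per-view adjacency matrices can be sketched as follows (a minimal Python sketch assuming undirected views given as edge lists; all names are illustrative):

```python
import numpy as np

def build_adjacency(n_nodes, edge_lists):
    """edge_lists[m] holds the (i, j) pairs of E^(m); returns the set {A^(m)}."""
    views = []
    for edges in edge_lists:
        A = np.zeros((n_nodes, n_nodes))
        for i, j in edges:
            A[i, j] = A[j, i] = 1.0  # A_ij^(m) = 1 iff e_ij^(m) is in E^(m)
        views.append(A)
    return views
```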
The multi-view clustering method based on the graph attention autoencoder, as shown in FIG. 2, comprises:
step S01: selecting a view with the largest information amount from different views of the same group of nodes;
Different views describe the relations between different aspects of the same group of nodes; the content information is shared by all views, and shared information exists among the views. Moreover, in many scenarios there is often one most informative view that dominates community performance. Therefore, the content information and the information shared among the multiple views are extracted from the most informative view and the content data, and are then used to reconstruct all views.
The most informative view A and the node content information X are selected as input, and all views are reconstructed. A heuristic measure, modularity, is adopted here, and the most informative view is selected according to the result of the modularity function Q, specifically:
the adjacency matrix $A^{(m)}$ of each single view and the node content information X are input into a graph neural network (GNN) to learn node embeddings;
k-means is performed on the learned embeddings to obtain the clustering index;
based on the clustering index and the adjacency matrix $A^{(m)}$, the modularity score of each graph view is calculated, and the view A with the highest score is selected as the most informative view.
Modularity is used because it provides an objective metric for evaluating the cluster structure.
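For concreteness, this selection step can be sketched in Python as follows; it is a minimal sketch in which `embed_with_gnn` is a hypothetical placeholder for the GNN embedding routine (the disclosure does not fix a specific one), and the modularity function follows the standard Newman definition assumed to underlie the modularity score Q:

```python
import numpy as np
from sklearn.cluster import KMeans

def modularity(A, labels):
    """Newman modularity Q of the hard partition `labels` on adjacency A."""
    degrees = A.sum(axis=1)
    two_m = A.sum()                                  # twice the edge count
    same = labels[:, None] == labels[None, :]        # same-community mask
    expected = np.outer(degrees, degrees) / two_m    # null-model edge weight
    return ((A - expected) * same).sum() / two_m

def select_informative_view(adjacencies, X, n_clusters, embed_with_gnn):
    """Embed each view, k-means the embeddings, score each view by
    modularity, and return the index of the highest-scoring view."""
    scores = []
    for A in adjacencies:
        Z = embed_with_gnn(A, X)                     # per-view node embedding
        labels = KMeans(n_clusters).fit_predict(Z)   # clustering index
        scores.append(modularity(A, labels))
    return int(np.argmax(scores))
```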
Step S02: based on the view with the largest information amount and the node content information, learning a graph structure and node content by using a trained graph attention encoder to obtain node feature representation;
This embodiment adopts a two-layer nonlinear graph attention encoder to map the node content information X and the view A into a latent node feature representation Z; it learns the representation of each node by focusing attention on the node's neighbors, and combines the node content information with the graph structure in the latent node feature representation.
The most straightforward strategy for learning such a representation is to integrate the node representation equally with those of all its neighbors. To measure the importance of different neighbors, however, the layer-by-layer graph attention strategy of the graph attention encoder assigns different weights to the neighbor representations, indicating the importance of each neighbor node to the current node, specifically:
$$z_i^{(h+1)} = \omega\Big(\sum_{j \in N_i} \mu_{ij}\, z_j^{(h)}\Big) \tag{1}$$

where $z_i^{(h)}$ is the representation of node i at layer h, $N_i$ denotes the neighbors of node i, $\mu_{ij}$ is the attention coefficient indicating the importance of neighbor node j to node i, and ω is a nonlinear activation function.
To calculate the attention coefficient $\mu_{ij}$, the importance of node j is measured from two aspects: the topological structure and the attribute values. From the attribute values, the attention coefficient $\mu_{ij}$ can be expressed as a single-layer feedforward neural network, parameterized by a weight vector $\vec{a}$, applied to the concatenation of $x_i$ and $x_j$:

$$c_{ij} = \vec{a}^{\,T}\big[x_i \,\|\, x_j\big] \tag{2}$$
The attention coefficients $\mu_{ij}$ are typically normalized over all neighbors $j \in N_i$ with the softmax function, making them comparable across nodes:

$$\mu_{ij} = \operatorname{softmax}_j(c_{ij}) = \frac{\exp(c_{ij})}{\sum_{r \in N_i} \exp(c_{ir})} \tag{3}$$
Then, incorporating the topology P and an activation function ξ, the attention coefficient is finally expressed as:

$$\mu_{ij} = \frac{\exp\big(\xi\big(p_{ij}\,\vec{a}^{\,T}[x_i \,\|\, x_j]\big)\big)}{\sum_{r \in N_i} \exp\big(\xi\big(p_{ir}\,\vec{a}^{\,T}[x_i \,\|\, x_r]\big)\big)} \tag{4}$$
The graph attention encoder stacks two graph attention layers, taking $x_i = z_i^{(0)}$ as input:

$$z_i^{(1)} = \omega\Big(\sum_{j \in N_i} \mu_{ij}\, z_j^{(0)}\Big) \tag{5}$$

$$z_i^{(2)} = \omega\Big(\sum_{j \in N_i} \mu_{ij}\, z_j^{(1)}\Big) \tag{6}$$

In this way, the encoder encodes both the graph structure and the node content into the node feature representation, i.e., $z_i = z_i^{(2)}$.
It should be noted that the last layer of the encoder uses ReLU as the activation function to ensure that the inner product between any two points is non-negative.
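A PyTorch sketch of this two-layer encoder, following equations (1)-(6) as reconstructed above, is given below; the linear projection W, the choice of sigmoid for ξ and ReLU for ω, and the dense N × N attention computation are simplifying assumptions rather than the patent's exact implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """One attention layer: weighs each neighbor j of node i by mu_ij."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Parameter(torch.empty(2 * out_dim))  # attention vector
        nn.init.normal_(self.a, std=0.1)

    def forward(self, Z, A, P):
        # A: binary adjacency (assumed to include self-loops); P: topology weights
        H = self.W(Z)                                       # (N, out_dim)
        d = H.size(1)
        # c_ij = a^T [h_i || h_j], computed densely by splitting `a`
        c = (H @ self.a[:d])[:, None] + (H @ self.a[d:])[None, :]
        c = torch.sigmoid(P * c)                            # xi(p_ij * c_ij), eq. (4)
        c = c.masked_fill(A == 0, float("-inf"))            # attend to neighbors only
        mu = torch.softmax(c, dim=1)                        # normalization, eq. (3)
        return F.relu(mu @ H)                               # aggregation, eq. (1)

class GraphAttentionEncoder(nn.Module):
    """Two stacked attention layers; z_i = z_i^(2) as in eqs. (5)-(6)."""
    def __init__(self, in_dim, hidden_dim, embed_dim):
        super().__init__()
        self.layer1 = GraphAttentionLayer(in_dim, hidden_dim)
        self.layer2 = GraphAttentionLayer(hidden_dim, embed_dim)

    def forward(self, X, A, P):
        return self.layer2(self.layer1(X, A, P), A, P)  # final ReLU keeps Z >= 0
```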
Step S03: performing a specificity constraint on the node feature representation using an ℓ1,2-norm penalty to obtain a constrained node feature representation;
Because the communities of the node representations have a specific distribution, different communities are distributed over different feature dimensions and the feature distribution of the nodes is highly disordered. To make the node feature representation Z characterize the community structure well, in other words to make it more discriminative, an ℓ1,2-norm penalty is applied to obtain the constrained node feature representation $\tilde{Z}$.
With this constraint, the graph attention encoder is forced to capture the latent-space differences between the different clusters.
To make better use of the cluster structure, the ℓ1,2-norm penalty specifically constrains the node feature representation Z for better clustering; to this end, the ℓ1,2-norm penalty is defined as the community-specific constraint:

$$L_{norm} = \beta \sum_{i=1}^{N} \lVert Z_i \rVert_1^2 \tag{7}$$

where β is a trade-off parameter, $Z_i$ is the node feature representation of the i-th node, and N is the total number of nodes. Under formula (7), the elements within the squared ℓ1 norm of $Z_i$ compete with each other for survival, and at least one element of $Z_i$ survives (remains non-zero); in this way some discriminative features are retained for each community, providing flexibility in learning the node representation.
Step S04: inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result;
A self-optimizing clustering module is constructed with K-means; the constrained node feature representation is input into K-means for clustering, and the final clustering result, i.e., the community assignment of the nodes, is output.
To supervise the graph attention encoder so that it accurately extracts a representation shared by all views, a multi-view decoder is added to reconstruct the multi-view data $\hat{A}^{(1)},\ldots,\hat{A}^{(M)}$ from the constrained node feature representation $\tilde{Z}$, and the graph attention encoder is trained with the reconstruction loss. As shown in the multi-view decoder portion of FIG. 1, the decoder consists of M view-specific decoders, each predicting whether a link exists between two nodes in view $A^{(m)}$, where $W_m \in \mathbb{R}^{D \times D}$ is the specific weight of view $A^{(m)}$. The multi-view decoder, in effect a multi-view link-prediction layer based on the graph embedding, is expressed as:

$$\hat{A}^{(m)} = \sigma\big(\tilde{Z}\, W_m\, \tilde{Z}^{T}\big) \tag{8}$$
where σ is the activation function of the multi-view decoder, $\tilde{Z}$ is the constrained node feature representation, and $\tilde{Z}^{T}$ is its transpose.
Assume that all inner products are non-negative, i.e., $ZZ^T \geq 0$; since the activation function ReLU of the last encoding layer of the graph attention encoder is non-negative, $Z \geq 0$, so $\sigma(ZZ^T)$ lies in $[0.5, 1)$ and cannot be regarded as a probability.
To solve this problem, a mapping function φ is needed that maps $[0,+\infty)$ to $(-\infty,+\infty)$ so that $\sigma(\phi(ZZ^T))$ outputs a valid probability. The ideal mapping function should satisfy: when the input is large enough, φ should approximate y = x, as shown in FIG. 3. If the mapping function were log(x), it would be too insensitive to large values, meaning that σ(φ(x)) would approach 1 only slowly as x increases; this embodiment therefore designs the mapping function as:

$$\phi(x) = x - \frac{1}{x} \tag{9}$$

As can be seen from FIG. 3, the mapping function of formula (9) approaches y = x very quickly as x increases, and diverges quickly as x approaches 0, conforming to the conditions of the ideal mapping function φ.
Based on the above, the multi-view decoder is finally expressed as:

$$\hat{A}^{(m)} = \sigma\Big(\phi\big(\tilde{Z}\, W_m\, \tilde{Z}^{T}\big)\Big) \tag{10}$$

where $\hat{A}^{(m)}$ is the reconstructed view.
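A minimal PyTorch sketch of this decoder follows; the closed form of φ and the identity initialization of the view-specific weights $W_m$ are assumptions consistent with the description above, not verbatim details of the disclosure:

```python
import torch
import torch.nn as nn

def phi(x, eps=1e-8):
    # Maps [0, +inf) to (-inf, +inf): approaches y = x quickly for large x
    # and diverges quickly as x -> 0 (assumed closed form, see formula (9)).
    return x - 1.0 / (x + eps)

class MultiViewDecoder(nn.Module):
    """M view-specific link predictors, A_hat^(m) = sigma(phi(Z W_m Z^T))."""
    def __init__(self, n_views, embed_dim):
        super().__init__()
        self.weights = nn.ParameterList(
            [nn.Parameter(torch.eye(embed_dim)) for _ in range(n_views)]
        )

    def forward(self, Z):
        # One reconstructed adjacency per view, each a valid probability matrix
        return [torch.sigmoid(phi(Z @ W_m @ Z.T)) for W_m in self.weights]
```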
The graph attention autoencoder, comprising the graph attention encoder and the multi-view decoder, is trained as a whole to better learn the graph structure and content information across the multiple views. To optimize the clustering effect of the graph attention autoencoder through training, the overall objective function is defined as:
$$L = L_R + \lambda L_C + L_{norm} \tag{11}$$
where $L_R$ and $L_C$ are the reconstruction loss and the clustering loss, respectively, $L_{norm}$ is the ℓ1,2-norm penalty on Z serving as the community-specific constraint, and λ ≥ 0 is the coefficient balancing the two losses.
For the graph attention autoencoder, the sum of the reconstruction errors over the view data is minimized:

$$L_R = \sum_{m=1}^{M} L_R^{(m)} = \sum_{m=1}^{M} \operatorname{loss}\big(A^{(m)}, \hat{A}^{(m)}\big) \tag{12}$$

where $L_R^{(m)}$ is the reconstruction loss of view m, $L_R$ is the reconstruction loss over all views, and $A^{(m)}$ and $\hat{A}^{(m)}$ are the original and reconstructed views, respectively. Because the decoder adopts a multi-view structure, the gradients of the multiple decoders are propagated back through the shared graph encoder during back-propagation; the graph encoder therefore extracts a representation shared by all views during forward propagation.
Besides optimizing the reconstruction error, the clustering loss is also controlled; for this purpose a self-optimizing clustering module is added, and the constrained node feature representation is input into it to minimize the following objective:

$$L_C = KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log\frac{p_{ij}}{q_{ij}} \tag{13}$$

where $KL(\cdot\|\cdot)$ is the Kullback-Leibler divergence between the two distributions, and Q is the soft-label distribution whose entry $q_{ij}$ indicates the similarity between the constrained node feature representation $\tilde{z}_i$ of node i and the cluster center $u_j$:

$$q_{ij} = \frac{\big(1 + \lVert \tilde{z}_i - u_j \rVert^2\big)^{-1}}{\sum_{j'} \big(1 + \lVert \tilde{z}_i - u_{j'} \rVert^2\big)^{-1}} \tag{14}$$
$q_{ij}$ can be viewed as the soft cluster-assignment distribution of each node; $p_{ij}$ in formula (13) is the target distribution, defined as:

$$p_{ij} = \frac{q_{ij}^{2} \big/ \sum_i q_{ij}}{\sum_{j'} \big( q_{ij'}^{2} \big/ \sum_i q_{ij'} \big)} \tag{15}$$
Soft assignments with high probability (nodes near a community center) are considered trustworthy in Q, so the target distribution P raises Q to the second power to emphasize the role of these "confident assignments"; the clustering loss then forces the current distribution Q to approach the target distribution P, using the "confident assignments" as soft labels to supervise the learning of the embedding.
The target distribution P acts as the "ground-truth label" during training, but it also depends on the current soft assignment Q, which changes at every iteration. Updating P from Q at every iteration would be dangerous, because a constantly changing target hinders learning and convergence; to avoid instability in the self-optimization process, this embodiment updates P every 5 iterations in the experiments.
Minimizing the clustering loss helps the autoencoder shape the embedding space, using the embedded features themselves and separating the embedded points, for better clustering performance.
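Under formulas (13)-(15) as reconstructed above, the self-optimizing clustering module can be sketched in a few lines of PyTorch (a minimal sketch, not the patent's exact code):

```python
import torch

def soft_assignment(Z, centers):
    """q_ij of formula (14): Student-t similarity between embedding z_i and
    cluster center u_j, normalized over the centers."""
    q = 1.0 / (1.0 + torch.cdist(Z, centers) ** 2)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    """p_ij of formula (15): squares q and renormalizes per node, emphasizing
    the confident ('high probability') assignments."""
    weight = q ** 2 / q.sum(dim=0)
    return weight / weight.sum(dim=1, keepdim=True)

def clustering_loss(p, q):
    """L_C = KL(P || Q) of formula (13)."""
    return (p * torch.log(p / q)).sum()
```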
The model weights and cluster centers are updated using the target distribution P and the soft-label distribution Q, specifically:

(1) With the target distribution P fixed and given N samples, the gradient of $L_C$ with respect to the cluster center $u_j$ can be calculated as:

$$\frac{\partial L_C}{\partial u_j} = 2\sum_{i=1}^{N} \big(1 + \lVert \tilde{z}_i - u_j \rVert^2\big)^{-1}\big(q_{ij} - p_{ij}\big)\big(\tilde{z}_i - u_j\big) \tag{16}$$
(2) Given a learning rate λ, $u_j$ is updated as:

$$u_j = u_j - \lambda\, \frac{\partial L_C}{\partial u_j} \tag{17}$$
(3) The weight of the m-th view-specific decoder is updated as:

$$W_m = W_m - \lambda\, \frac{\partial L_R^{(m)}}{\partial W_m} \tag{18}$$
It can be seen that $W_m$ is related only to the reconstruction loss of view m, so the view-specific decoder weights capture view-specific local structure information, while the encoder weights extract a representation shared by all views.
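Putting the sketches above together, a minimal joint-training loop might look as follows; the binary cross-entropy form of the per-view loss in formula (12) is an assumption, and the update rules (16)-(18) are what automatic differentiation computes here implicitly:

```python
import torch
import torch.nn.functional as F

def train_mcbga(encoder, decoder, X, A_star, P, views, centers,
                beta, lam, n_iters=1000, refresh_every=5, lr=1e-3):
    """Joint optimization of formula (11): L = L_R + lam * L_C + L_norm.
    `centers` is an nn.Parameter of cluster centers (e.g., k-means init);
    `A_star` and `P` are the selected view and its topology weights."""
    params = list(encoder.parameters()) + list(decoder.parameters()) + [centers]
    opt = torch.optim.Adam(params, lr=lr)
    for it in range(n_iters):
        Z = encoder(X, A_star, P)                      # shared representation
        q = soft_assignment(Z, centers)
        if it % refresh_every == 0:                    # avoid a moving target
            p = target_distribution(q).detach()
        recon = sum(F.binary_cross_entropy(A_hat, A_m) # L_R, formula (12)
                    for A_hat, A_m in zip(decoder(Z), views))
        loss = recon + lam * clustering_loss(p, q) + l12_norm_penalty(Z, beta)
        opt.zero_grad(); loss.backward(); opt.step()
    return q.argmax(dim=1)                             # final community assignment
```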
The effect of the method of the embodiment is verified by comparative experiments.
Experimental data
ACM: a paper network from the ACM dataset; a dual-view graph is built using the co-paper (two papers written by the same author) and co-subject (two papers containing the same subjects) relationships. The paper features are the bag-of-words elements of the keywords.
DBLP: consists of three graphs, namely an author collaboration graph, a paper citation graph and a paper co-citation graph. The collaboration graph has 2401 author nodes and 8703 edges; the paper citation graph has 6000 paper nodes and 10003 edges; the paper co-citation graph has 6000 paper nodes and 141996 edges (two nodes are connected if they cite a common paper). Authors and papers are linked by 32048 authorship relations, and the linking of papers in the citation and co-citation graphs is based on paper identity. All authors and papers belong to three clusters representing research fields: artificial intelligence, computer graphics and computer networks.
IMDB: a movie network from the IMDB dataset; a multi-view graph is constructed using the co-actor (two movies share the same actor) and co-director (two movies share the same director) relationships. The movie features correspond to a bag of words representing the plots.
The clustering performance of the method of this embodiment was evaluated on the three data sets ACM, DBLP and IMDB, whose brief statistics are shown in Table 1.
TABLE 1 data set information
[Table body provided as an image in the original publication.]
Parameter setting and evaluation index
Four commonly used indicators were adopted: accuracy (ACC), F-score (F1), normalized mutual information (NMI) and adjusted Rand index (ARI). The ARI lies in the range [-1, 1] and reflects the degree of overlap between two partitions; for every indicator, a larger value means a better clustering result.
For the ACM dataset, training was run for 250 iterations because the dataset is small; for the DBLP and IMDB datasets, all autoencoder models were trained for 1000 iterations and optimized with the Adam algorithm. The learning rate λ of the autoencoder was set to 0.001, the embedding dimension of all methods was set to 32, and the convergence threshold of MCBGA was set to 0.1% with update period T = 20. For the remaining methods, the settings described in the corresponding papers were retained. Since all clustering algorithms depend on initialization, every method was repeated 10 times with random initialization.
Comparison method
GAE: a single-view autoencoder method;
X-avg: to utilize the multiple views of the network, method X is used to learn a node representation on each view, and all learned representations are then averaged;
MNE: a scalable multi-view network embedding model; for all multi-view graph embedding/clustering methods, the multi-view graph adjacency matrices are used as input;
RMSC: a robust multi-view spectral clustering method based on low-rank and sparse decomposition;
PwMC: a parameter-weighted multi-view graph clustering method;
SwMC: a self-weighted multi-view graph clustering method;
O2MA: a variant of O2MAC that contains no clustering loss in the objective function;
O2MAC: an attribute-based multi-view graph clustering method;
MCBGA: the multi-view clustering method based on a graph attention autoencoder proposed in this embodiment.
Results of the experiment
To evaluate the efficiency of the proposed method, the clustering performance on the ACM, DBLP and IMDB datasets was analyzed. Tables 2, 3 and 4 summarize the experimental results on the three benchmark datasets, where bold values represent the best performance; it can be seen that the method of this embodiment is significantly superior to all compared methods on most evaluation indicators.
Table 2 results of the ACM data set
Method ACC F1 NMI ARI
GAE 0.8216 0.8225 0.4914 0.5444
GAE-avg 0.6990 0.7025 0.4771 0.4378
MNE 0.6370 0.6479 0.2999 0.2486
RMSC 0.6315 0.5746 0.3973 0.3312
PwMC 0.4162 0.3783 0.0332 0.0395
SwMC 0.3831 0.4709 0.0838 0.0187
O2MA 0.8880 0.8894 0.6515 0.6987
O2MAC 0.9042* 0.9053* 0.6923* 0.7394*
MCBGA 0.9102 0.9223 0.7052 0.7451
TABLE 3 Experimental results for DBLP dataset
Method ACC F1 NMI ARI
GAE 0.8859 0.8743 0.6925 0.7410
GAE-avg 0.5558 0.5418 0.3072 0.2577
MNE - - - -
RMSC 0.8994 0.8248 0.7111 0.7647
PwMC 0.3253 0.2808 0.0190 0.0159
SwMC 0.6538 0.5602 0.3760 0.3800
O2MA 0.9040 0.8976 0.7257 0.7705
O2MAC 0.9074* 0.9013* 0.7287* 0.7780*
MCBGA 0.9156 0.9047 0.7365 0.7789
Table 4 experimental results of IMDB data set
[Table body provided as an image in the original publication.]
where "*" indicates the best performance among the baselines, the best results of all methods are shown in bold, and "-" indicates that the method ran out of memory on the dataset.
The visualization of the experimental results is shown in FIGS. 4-6. It can be seen from the visualization that the MCBGA results are superior to almost all baseline methods, indicating that the model proposed in this embodiment is effective. Comparing MCBGA with GAE-avg, O2MA and O2MAC further shows that the MCBGA of this embodiment is a more effective graph neural network for fusing multi-view information. In addition, compared with O2MAC, the MCBGA of this embodiment obtains better results on all three data sets; for example, on the ACM data set it improves on the four indicators ACC, F1, NMI and ARI of O2MAC by 0.6%, 1.8%, 1.1% and 0.7%, respectively. This is because the MCBGA of this embodiment uses the ℓ1,2-norm penalty to address the community-specific distribution of the node representations, which plays an important role in learning the node representations and characterizes the clustering structure well, thereby improving the clustering results. Compared with O2MA, the better results of MCBGA on the three data sets show that, with effective pre-training, the self-training clustering target can further improve the clustering performance. Note also that the results of all baselines on the IMDB dataset are lower than on the ACM and DBLP datasets, since it is difficult to obtain "highly confident" nodes on IMDB; in that case, the "highly confident" nodes may pull the low-confidence nodes into the wrong clusters.
The MCBGA algorithm proposed in this embodiment performs well on both the ACM and DBLP data sets, which shows the superiority of MCBGA on node clustering tasks. For example, on the ACM data set the clustering performance of MCBGA is significantly better than that of the GAE method, because MCBGA exploits the complementary information embedded in the multi-view data while the single-view method does not; in general, single-view clustering methods are inferior to multi-view clustering methods for exactly this reason.
From the experimental results, the following can be concluded: first, the embedding methods are clearly superior to the other methods, so graph embedding is a promising approach to the graph clustering problem; second, the deep learning method (i.e., GAE) achieves more competitive results than the other baselines, yet it can only use single-view graph and content information, so a well-designed deep neural network that integrates multiple graph views can achieve good results.
Ablation experiments
An ablation study was conducted to further analyze the importance of each module in the framework proposed by this example.
Table 5 ablation experiments of three data sets
[Table body provided as an image in the original publication.]
A comparison experiment with and without the ℓ1,2-norm penalty was set up. As can be seen from Table 5, the experimental results with the ℓ1,2-norm penalty are better, indicating that the ℓ1,2-norm penalty proposed in this embodiment helps to learn a more discriminative latent node representation for the node clustering task.
Secondly, the necessity of using modularity to select the informative graph view was verified: each graph view in turn was taken as the input of MCBGA while all graph views on the three data sets were reconstructed, and the results on the graph clustering task are listed in Table 6. It can be observed that the model obtains better results when the graph view with the higher modularity value is input to the encoder, verifying that modularity is a feasible criterion for informative graph view selection.
Table 6 Clustering results with different input views
[Table body provided as an image in the original publication.]
Selected views used in the experiment are shown in bold.
In the model proposed by this embodiment, multiple graph views are fused to improve clustering performance. To further study the impact of multiple views on learning embeddings for the clustering task, the performance of MCBGA was examined by adding views one by one on the DBLP dataset; the three views, co-conference, co-term and co-paper, were added to the model in that order. FIG. 7 plots the performance of MCBGA on the four indicators as views are added; the results show that the performance of the model proposed in this embodiment improves steadily as views are added one by one, so MCBGA provides a flexible framework for exploiting more views.
Example two
This embodiment discloses a multi-view clustering system based on a graph attention autoencoder.
as shown in fig. 8, the multi-view clustering system based on the graph attention automatic encoder comprises a view selection module, a feature representation module, a feature constraint module and a cluster prediction module:
a view selection module configured to: selecting a view with the largest information amount from different views of the same group of nodes;
a feature representation module configured to: based on the view with the largest information amount and the node content information, a graph structure and node content are learned by using a graph attention encoder to obtain node feature representation;
a feature constraint module configured to: perform a specificity constraint on the node feature representation using an ℓ1,2-norm penalty to obtain a constrained node feature representation;
a cluster prediction module configured to: input the constrained node feature representation into the multi-view decoder for prediction to obtain the final clustering result.
EXAMPLE III
An object of the present embodiment is to provide a computer-readable storage medium.
A computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in the method for multi-view clustering based on a graph attention auto-encoder according to the first embodiment of the present disclosure.
Example four
An object of the present embodiment is to provide an electronic device.
An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the method for multi-view clustering based on a graph attention automatic encoder according to the first embodiment of the present disclosure.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A multi-view clustering method based on a graph attention autoencoder, characterized by comprising the following steps:
selecting a view with the largest information amount from different views of the same group of nodes;
based on the view with the largest information amount and the node content information, learning a graph structure and node content by using a trained graph attention encoder to obtain node feature representation;
performing a specificity constraint on the node feature representation using an ℓ1,2-norm penalty to obtain a constrained node feature representation;
inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result;
and reconstructing the constrained node feature representation by adding a multi-view decoder, and training the graph attention encoder by using reconstruction loss.
2. The multi-view clustering method based on a graph attention autoencoder according to claim 1, wherein the most informative view is obtained by calculating the modularity score of each graph view based on the clustering index and the adjacency matrix, and selecting the graph view with the highest score as the most informative view.
3. The multi-view clustering method based on graph attention auto-encoder according to claim 1 characterized in that the graph attention encoder also learns the importance of neighboring nodes.
4. The multi-view clustering method based on graph attention automatic encoder as claimed in claim 3, characterized in that the learning of the importance of the neighbor node is to assign different weights to the neighbors in the layer-by-layer graph attention strategy of the graph attention encoder, representing the importance of the neighbor node to the current node.
5. The multi-view clustering method based on a graph attention autoencoder according to claim 1, wherein the node feature representation is specifically constrained with the ℓ1,2-norm penalty according to the formula:

$$L_{norm} = \beta \sum_{i=1}^{N} \lVert Z_i \rVert_1^2$$

where β is a trade-off parameter, $Z_i$ is the node feature representation of the i-th node, and N is the total number of nodes.
6. The method of multi-view clustering based on graph attention auto-encoder according to claim 1, characterized in that in the multi-view decoder, a mapping function is added to change the mapping range of the node feature representation.
7. The multi-view clustering method based on a graph attention autoencoder according to claim 6, wherein the mapping function is specifically:

$$\phi(x) = x - \frac{1}{x}$$

where x is the node feature representation.
8. A multi-view clustering system based on a graph attention autoencoder, characterized by comprising a view selection module, a feature representation module, a feature constraint module and a cluster prediction module:
a view selection module configured to: selecting a view with the largest information amount from different views of the same group of nodes;
a feature representation module configured to: based on the view with the largest information amount and the node content information, a graph structure and node content are learned by using a graph attention encoder to obtain node feature representation;
a feature constraint module configured to: perform a specificity constraint on the node feature representation using an ℓ1,2-norm penalty to obtain a constrained node feature representation;
a cluster prediction module configured to: inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result;
and a multi-view decoder is added to reconstruct the constrained node feature representation, and the graph attention encoder is trained by utilizing the reconstruction loss.
9. Computer readable storage medium, on which a program is stored which, when being executed by a processor, carries out the steps of the method for multi-view clustering based on graph attention auto-encoders according to any of claims 1-7.
10. Electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, characterized in that the processor when executing the program performs the steps in the method for multi-view clustering based on graph attention auto-encoder according to any of the claims 1-7.
CN202211446136.6A 2022-11-18 2022-11-18 Multi-view clustering method and system based on graph attention automatic encoder Pending CN115905903A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211446136.6A CN115905903A (en) 2022-11-18 2022-11-18 Multi-view clustering method and system based on graph attention automatic encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211446136.6A CN115905903A (en) 2022-11-18 2022-11-18 Multi-view clustering method and system based on graph attention automatic encoder

Publications (1)

Publication Number Publication Date
CN115905903A true CN115905903A (en) 2023-04-04

Family

ID=86487519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211446136.6A Pending CN115905903A (en) 2022-11-18 2022-11-18 Multi-view clustering method and system based on graph attention automatic encoder

Country Status (1)

Country Link
CN (1) CN115905903A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117009839A (en) * 2023-09-28 2023-11-07 之江实验室 Patient clustering method and device based on heterogeneous hypergraph neural network
CN117009839B (en) * 2023-09-28 2024-01-09 之江实验室 Patient clustering method and device based on heterogeneous hypergraph neural network

Similar Documents

Publication Publication Date Title
Cao et al. Class-specific soft voting based multiple extreme learning machines ensemble
CN108108854B (en) Urban road network link prediction method, system and storage medium
CN110807154A (en) Recommendation method and system based on hybrid deep learning model
Honkela et al. Variational learning and bits-back coding: an information-theoretic view to Bayesian learning
CN111126488A (en) Image identification method based on double attention
CN112417289A (en) Information intelligent recommendation method based on deep clustering
KR20210030063A (en) System and method for constructing a generative adversarial network model for image classification based on semi-supervised learning
CN113449802A (en) Graph classification method and device based on multi-granularity mutual information maximization
CN115905903A (en) Multi-view clustering method and system based on graph attention automatic encoder
CN115481727A (en) Intention recognition neural network generation and optimization method based on evolutionary computation
Mautz et al. Deep embedded cluster tree
CN109409434A (en) The method of liver diseases data classification Rule Extraction based on random forest
CN112286996A (en) Node embedding method based on network link and node attribute information
CN117056763A (en) Community discovery method based on variogram embedding
CN116304518A (en) Heterogeneous graph convolution neural network model construction method and system for information recommendation
CN115913995A (en) Cloud service dynamic QoS prediction method based on Kalman filtering correction
CN115660882A (en) Method for predicting user-to-user relationship in social network and multi-head mixed aggregation graph convolutional network
CN112200208B (en) Cloud workflow task execution time prediction method based on multi-dimensional feature fusion
Yu et al. Auto graph encoder-decoder for model compression and network acceleration
CN113673773A (en) Learning path recommendation method fusing knowledge background and learning time prediction
Zeng et al. Contextual bandit guided data farming for deep neural networks in manufacturing industrial internet
JP6230501B2 (en) Reduced feature generation apparatus, information processing apparatus, method, and program
CN114863234A (en) Graph representation learning method and system based on topological structure maintenance
CN116541593B (en) Course recommendation method based on hypergraph neural network
CN116187446B (en) Knowledge graph completion method, device and equipment based on self-adaptive attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination