CN115905903A

CN115905903A - Multi-view clustering method and system based on graph attention automatic encoder

Info

Publication number: CN115905903A
Application number: CN202211446136.6A
Authority: CN
Inventors: 尉秀梅; 陈佃迎; 姜雪松; 陈珺; 柴慧慧; 马浩翔
Original assignee: Qilu University of Technology
Current assignee: Qilu University of Technology
Priority date: 2022-11-18
Filing date: 2022-11-18
Publication date: 2023-04-04

Abstract

The invention provides a multi-view clustering method and a multi-view clustering system based on an automatic graph attention encoder, which relate to the technical field of multi-view clustering, and the specific scheme comprises the following steps: selecting a view with the largest information amount from different views of the same group of nodes; based on the view with the largest information amount and the node content information, learning a graph structure and node content by using a trained graph attention encoder to obtain node feature representation; carrying out specificity constraint on the node feature representation by adopting l1, 2-norm punishment to obtain the node feature representation after the punishment; inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result; in order to be more suitable for cluster tasks, the graph attention network is applied to multi-view graph clustering, and the graph structure and the node content are reconstructed at the same time, so that the graph structure and the content information of the nodes are well stored in the potential representation.

Description

Multi-view clustering method and system based on graph attention automatic encoder

Technical Field

The invention belongs to the technical field of multi-view clustering, and particularly relates to a multi-view clustering method and system based on an automatic graph attention encoder.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

In recent years, more and more multiview clustering has been proposed for analyzing and processing multimedia data, and most multiview clustering algorithms generally include the following two steps: firstly, constructing a shared similar graph from multi-view data, and then clustering the similar graph to obtain a clustering result; in practical multimedia applications, due to the heterogeneity of multimedia acquisition sources, multi-view data often has features such as redundancy, correlation, and diversity, which makes the prior art have several problems:

1) Most clustering methods adopt a shallow layer model to process the complex relation in the multi-view graph, and the modeling capability of the multi-view graph information is severely limited;

the purpose of graph embedding is to learn a low-dimensional node representation, and simultaneously retain the content information and the topological structure of the nodes; graph embedding methods have emerged in the last decade and can be roughly classified into two major categories, namely, topology embedding (TSE) and Content Enhanced Graph Embedding (CEGE), depending on the input information.

The TSE method takes only the topology as input and maps it to learn the low-dimensional node representation. For example, perozzi et al propose a truncated random walk algorithm to learn node representation, which converts the original graph structure information into a set of linear sequences, unlike generating linear sequences, cao et al propose a deep neural network (DNGR) for learning graph representation, which proposes a random surfing model that can directly utilize topology structure information; cavallari et al, which integrates community embedding, community detection, and node embedding into a closed loop, rather than embedding each node individually, propose a community embedding framework (ComE); to address the problem of unknown number of communities, cavallari et al propose learning limited and unlimited community inset (ComE +); although the above methods give good results, they only consider graph structures, which limits their performance; research has shown that multi-view data helps to improve clustering performance, however, all the above methods only use the graph structure and node content of a single view, resulting in non-ideal results.

A number of multi-view clustering methods are dedicated to learning high-quality potential representations or affinity matrices shared by different views, wherein depth multi-view clustering methods are widely focused by researchers due to their outstanding representation capability and fast inference speed, for example Andrew et al propose a new multi-view clustering algorithm combining a depth encoder with canonical correlation analysis (DCCA), and Wang et al propose a Depth Canonical Correlation Automatic Encoder (DCCAE) by introducing a depth decoder to extend DCCA for better learning multi-view representations.

To better utilize the graph structure information in multiple views, li et al proposed a GCN-based multi-view learning method, i.e., co-GCN, however, co-GCN was designed for semi-supervised clustering, and to solve this problem, fan et al proposed a multi-view auto-encoder (O2 MAC) for graph-embedded clustering, which, although O2MAC was successful, only encodes node content information for a single view, with limited performance when processing data for single view graphs and node content.

2) In recent years, a plurality of clustering methods of the embedded graph are developed, but none of the methods considers community-specific distribution represented by nodes, so that the clustering performance is not ideal;

several clustering models have been proposed so far, whose core is to learn a low-dimensional, compact and continuous representation and then implement a classical clustering method on the learned representation to obtain a clustering label; although the clustering performance is improved, the clustering-specific distribution represented by the nodes is ignored, different communities are distributed on different characteristic dimensions, the characteristic distribution of the nodes is very disordered, the characteristics of the nodes are very similar even on most dimensions, and the algorithm can cluster all the nodes into the same community, so that the clustering result is low; in the above methods, only the graph structure and node content of the single-view graph are utilized, which results in non-ideal results; when a part of methods process the graph structure information by using multiple views, the method only encodes the node content information of a single view; when processing data of the single view graph and node contents, the performance is limited.

Therefore, how to fuse the multi-view information becomes a topic worth studying.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention provides a multi-view clustering method and a multi-view clustering system based on a graph attention automatic encoder, which are used for applying a graph attention network to multi-view graph clustering and reconstructing a graph structure and node contents at the same time in order to be more suitable for a clustering task, so that the graph structure and content information of nodes are well stored in potential representation.

To achieve the above object, one or more embodiments of the present invention provide the following technical solutions:

the invention provides a multi-view clustering method based on an automatic graph attention encoder;

the multi-view clustering method based on the graph attention automatic encoder comprises the following steps:

selecting a view with the largest information amount from different views of the same group of nodes;

based on the view with the largest information amount and the node content information, learning a graph structure and node content by using a trained graph attention encoder to obtain node feature representation;

by using

Carrying out specific constraint on the node feature representation by the norm punishment to obtain a constrained node feature representation;

inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result;

and reconstructing the constrained node feature representation by adding a multi-view decoder, and training the graph attention encoder by using reconstruction loss.

Further, the view with the largest information amount is obtained by calculating a modularization score of each graph view based on the clustering index and the adjacency matrix, and selecting the graph view with the highest score as the view with the largest information amount.

Further, the graph attention encoder also learns the importance of the neighbor nodes.

Furthermore, the learning of the importance of the neighbor node is to assign different weights to the neighbor in a layer-by-layer graph attention strategy of the graph attention encoder, and represents the importance of the neighbor node to the current node.

Further, said use

The norm penalty specifically constrains the node feature representation, and the specific formula is as follows:

where β is a trade-off parameter, Z _i Is the node characteristic representation of the ith node, and N is the total number of nodes.

Further, in the multi-view decoder, a mapping function is added to change the mapping range of the node feature representation.

Further, the mapping function specifically includes:

wherein x is a node feature representation.

A second aspect of the invention provides a multi-view clustering system based on a graph attention auto-encoder.

The multi-view clustering system based on the graph attention automatic encoder comprises a view selection module, a feature representation module, a feature constraint module and a clustering prediction module, wherein the view selection module comprises a view selection module, a feature representation module, a feature constraint module and a clustering prediction module:

a view selection module configured to: selecting a view with the largest information amount from different views of the same group of nodes;

a feature representation module configured to: based on the view with the largest information amount and the node content information, a graph structure and node content are learned by using a graph attention encoder to obtain node feature representation;

a feature constraint module configured to: by using

Carrying out specific constraint on the node characteristic representation by the norm punishment to obtain a node characteristic representation after constraint;

a cluster prediction module configured to: inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result;

A third aspect of the invention provides a computer readable storage medium having stored thereon a program which, when being executed by a processor, carries out the steps of the method for multi-view clustering based on a map attention auto-encoder according to the first aspect of the invention.

A fourth aspect of the present invention provides an electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps of the method for multi-view clustering based on a graph attention auto-encoder according to the first aspect of the present invention when executing the program.

The above one or more technical solutions have the following beneficial effects:

the invention provides a multi-view clustering Method (MCBGA) based on a graph attention automatic encoder, which uses the graph attention automatic encoder to perform an attribute multi-view graph clustering task, effectively integrates multi-view graph structures and content information to perform deep potential representation learning, combines node representation learning and clustering into a unified framework, and jointly optimizes embedded learning and graph clustering.

The invention utilizes

And the norm punishment is used for solving the problem of community specific distribution represented by the nodes, plays an important role in learning the node representation, and well describes a clustering structure so as to improve a clustering result.

Experimental results on 4 reference data sets show that the algorithm of the present invention is superior to the most advanced graph clustering method.

Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the invention and together with the description serve to explain the invention and not to limit the invention.

Fig. 1 is a model configuration diagram of the first embodiment.

Fig. 2 is a flow chart of the method of the first embodiment.

Fig. 3 is a comparison of mapping functions of the first embodiment.

FIG. 4 is a graph comparing performance on the ACM data set of the first embodiment.

FIG. 5 is a graph comparing performance on a DBLP data set for the first embodiment.

FIG. 6 is a graph comparing performance on the IMDB data set of the first embodiment.

Fig. 7 is a MCBGA performance graph with additional views of the first embodiment.

Fig. 8 is a system configuration diagram of the second embodiment.

Detailed Description

The invention is further described with reference to the following figures and examples.

It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention; unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention; as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.

The invention provides a multi-view clustering method based on a graph attention automatic encoder, which uses the graph attention automatic encoder to reconstruct a graph structure and node content at the same time, so that the graph structure and the content information of the nodes are well stored by potential representation, and the thought of the invention is as follows:

extracting a shared representation from the graph view and the content data with the largest amount of information using a graph attention automatic encoder, and then reconstructing all views using the shared representation, the graph attention automatic encoder being composed of one graph attention encoder and a plurality of decoders; in particular, it learns node representations from views with the largest amount of information by a graph attention encoder, and reconstructs all views by multiple multi-view decoders, using multi-view graph structures and node content.

In order to eliminate the uncertainty of post-processing clustering operation, a clustering activation function is introduced, which is beneficial to better node representation learning and node clusteringClass; the inventors have found that

Norm penalties have an important role in characterizing the community-specific distribution of graph structure data in a dimensional space, which is then applied to the aspect of learning node representation, which well characterizes the clustering structure, thereby improving the clustering result.

In addition, a self-training clustering target is designed, so that the current clustering distribution tends to be more suitable for the target distribution of a clustering task; by jointly optimizing reconstruction loss and clustering loss, the model can simultaneously optimize node embedding and clustering and improve each other in a unified framework.

Example one

The embodiment discloses a multi-view clustering method based on an automatic graph attention encoder;

as shown in FIG. 1, to simultaneously represent multiple view structures A in a unified framework ⁽¹⁾ ,...,A ^(M) And node content X, the embodiment provides a novel graph embedding clustering model, namely a graph attention automatic encoder, which comprises a graph attention encoder and a multi-view decoder, wherein the graph attention encoder is shared by all views, a shared representation is extracted from a view graph structure and content data, the multi-view decoder is designed, the multi-view data of the shared representation is reconstructed, and the graph attention encoder is trained by using reconstruction loss.

Given a graph G = (V, E) ₁ ,...,E _M ,X)，

Representing a set of nodes, e _ij ^(m) ∈E _M Representing the association between nodes i and j in the mth view, the topology of graph G may be based on the set of adjacency matrices->

Is shown in (A) is ^(m) An adjacency matrix representing the mth view if>

Then->

Otherwise->

X＝{x ₁ ,...,x _n Is an attribute value representing content information of the node.

The multi-view clustering method based on the graph attention automatic encoder, as shown in fig. 2, includes:

step S01: selecting a view with the largest information amount from different views of the same group of nodes;

different views represent the relationship among different aspects of the same group of nodes, and the content information is shared by all the views, and shared information exists among the views; furthermore, in many scenarios, there is often a view that is the most informative that controls the performance of the community; therefore, the content information and the shared information between the plurality of views are extracted from the view and the content data having the largest amount of information, and then used to reconstruct the view.

Selecting a view A with most abundant information and node content information X as input, and reconstructing all views; here, heuristic measurement — modularization is adopted, and a view with the most abundant information is selected according to the result of the modularization function Q, specifically:

adjoining each single view to the matrix A ^(m) And the node content information X are respectively input into a graph neural network GNN to learn node embedding;

performing k-means on the learned embedding to obtain a clustering index of the embedding;

based on clustering index and adjacency matrix A ^(m) And calculating modularization scores of each view, and selecting the view A with the highest score as the view with the largest information amount.

Modularity is used because it provides an objective metric for evaluating the cluster structure.

Step S02: based on the view with the largest information amount and the node content information, learning a graph structure and node content by using a trained graph attention encoder to obtain node feature representation;

the embodiment adopts a two-layer non-linear graph attention encoder to map node content information X and a view A, obtain potential node feature representation Z, learn neighbor representation of each node by focusing attention on the neighbor thereof, and combine the node content information with a graph structure in the potential node feature representation.

The most straightforward strategy to learn neighbor representations is to integrate the node representation equally with all its neighbors; however, in order to measure the importance of various neighbors, in the layer-by-layer graph attention strategy of the graph attention encoder, different weights are given to neighbor representations, which represent the importance degree of a neighbor node to a current node, specifically:

wherein z is _i ^h Is a node-specific representation of node i, N _i Represents the neighbors of node i; mu.s _ij Representing the attention coefficient and representing the degree of importance of the neighboring node j to the node i, ω is a non-linear activation function.

To calculate the attention coefficient mu _ij The importance of the node j is measured from two aspects of a topological structure and an attribute value; from the attribute value, the attention coefficient μ _ij Can be expressed as x _i 、x _j And weight vector

Connected single layer feedforward neural network:

attention coefficient mu _ij The softmax function is typically used to match all neighborhoods j ∈ N _i Normalization was performed to make them convenient for comparison between nodes:

then, together with the topology P and the activation function ξ, the attention coefficient is finally expressed as:

/>

the graph attention encoder stacks two graph attention layers, x _i ＝z _i ⁰ As inputs:

in this way, the encoder encodes both the graph structure and the node content into a node feature representation, i.e., z _i ＝z _i ⁽²⁾ 。

It should be noted that the last layer of the encoder uses ReLU as the activation function to ensure that the inner product between any two points is non-negative.

Step S03: by using

because communities represented by the nodes have specific distribution, different communities are distributed on different characteristic dimensions, and the characteristic distribution of the nodes is very disordered; in order to make the node characterization Z well characterize the community structure, in other words to make the node characterization Z more discriminative, the term Z is used

Norm punishment to obtain the sumPost-beam node characterization

With this constraint, it is mandatory to use a graph attention encoder to capture potential spatial differences in different clusters.

To make better use of the cluster structure, use is made of

Norm penalty specifically constrains node feature representation Z for better clustering, for which ^ is greater than or equal to Z>

Norm penalty as a specific constraint for the community, defined as:

where β is a trade-off parameter, Z _i Is the node characteristic representation of the ith node, and N is the total number of the nodes; by the formula (7), Z _i Square of

Different elements in the norm compete with each other for survival, and Z _i Where at least one element survives (remains non-zero), by doing so, some discriminative features are retained for each community, providing some flexibility in learning node representation.

Step S04: inputting the constrained node feature representation into a self-optimization clustering module for clustering to obtain a final clustering result;

and constructing a self-optimizing clustering module by using K-means, inputting the constrained node feature representation into the K-means for clustering, and outputting a final clustering result, namely the community distribution of the nodes.

In order to accurately extract a shared representation of all views by a supervised graph attention encoder, a multi-view decoder is added, and a node feature representation after constraint is expressed

In reconstructing multi-view data>

The graph attention encoder is trained using the reconstruction loss. As shown in the multiview decoder portion of FIG. 1, the decoder consists of M specific view decoders

Composition, predictive View A ^(m) Whether there is a link between two nodes, where W _m ∈R ^D×D Is a view A ^(m) A specific weight of (a); multi-view decoder, which is actually a graph-based embedded multi-view link prediction layer, is represented as:

where σ is the activation function of the multi-view decoder,

is a node characteristic representation after a constraint, and>

is->

The transposed matrix of (2). />

Assuming that all inner products are non-negative, i.e. ZZ ^T Not less than 0; since the activation function ReLU of the last coding layer of the graph attention encoder is non-negative, Z ≧ 0, σ (ZZ) ^T ) Will be located at

Cannot be regarded as a probability.

To solve this problem, a mapping function φ will need to be found that will [0, + ∞) Mapping to (- ∞, + ∞) to make σ (ZZ) ^T ) Output the effective probability), the ideal mapping function should satisfy the requirements: when the input is large enough, the ideal φ should be approximately y = X, as shown in FIG. 3, if the mapping function is log (X), which is too insensitive to large values, which means σ (φ (X)) will slowly approach 1 as X increases, so the present embodiment designs the mapping function as:

as can be seen from fig. 3, as x increases, the mapping function shown in equation (9) approaches y = x very quickly, and when x approaches 0, the function approaches ∞quickly, conforming to the condition of the ideal mapping function \35205.

Based on the above, the multi-view decoder is represented as:

wherein the content of the first and second substances,

is the reconstructed view.

Integrally training a graph attention automatic encoder comprising a graph attention encoder and a multi-view decoder to better learn graph structure and content information among multiple views; by training the clustering effect of the autoencoder for attention to the optimization map, the overall objective function of the training is defined as:

L＝L _R +λL _C +L _norm (11)

wherein L is _R And L _C Reconstruction loss and clustering loss, L, respectively _norm Is to Z

Norm penalty as a specific constraint for the community, λ ≧ 0 is the coefficient controlling the balance of the two.

For the graph attention auto-encoder, the sum of reconstruction errors for each view data is minimized by:

wherein the content of the first and second substances,

for reconstruction loss of view m, L _R Reconstruction loss for all views, A ^(m) 、/>

Respectively an original view and a reconstructed view; because the decoder adopts a multi-view structure, the gradient of the multi-decoder can be transmitted through the information graph encoder in the process of back transmission; thus, the graphics encoder would extract a shared representation of all views while doing the forward propagation processing.

Besides optimizing reconstruction errors, clustering loss is controlled, a self-optimizing clustering module is added for the control, and constrained node feature representations are input into the self-optimizing clustering module, so that the following goals are minimized:

where KL (· |) is the divergence of the Kullback-Leibler between the two distributions, and Q is the distribution of the soft label to indicate the constrained node characterization of node i

U of cluster center _j Similarity between:

q _ij can be viewed as soft cluster partitioning of each nodeDistribution, p in formula (13) _ij Is the target distribution, defined as:

soft distributions (nodes near the center of the community) with high probability are considered trustworthy in Q, so the target distribution P raises Q to the square to emphasize the effect of those "confidence distributions" and then the clustering penalty forces the current distribution Q to approach the target distribution P, setting these "confidence distributions" as soft labels to supervise the embedded learning of Q.

The target distribution P acts as a "true label" in the training process, but also depends on the current soft allocation Q that is updated at each iteration; updating P with Q at each iteration is dangerous because the constant change in the target can hinder learning and convergence; to avoid instability in the self-optimization process, this example updates P every 5 iterations in the experiment.

Clustering loss is minimized to help the auto-encoder manipulate the embedding space with the features of embedding itself and the points of discrete embedding for better clustering performance.

Updating the model weight and the cluster center by using the target distribution P and the soft label distribution Q, which specifically comprises the following steps:

(1) Fixed target distribution P, given N samples, L _C Relative to the cluster center μ _j The gradient of (d) can be calculated as:

(2) Given a learning rate λ, update μ _j Comprises the following steps:

(3) Updating the weight of the mth view specific decoder:

it can be seen that W _m Is only related to the view m reconstruction loss, so the view specific decoder weights can capture the view specific local structure information, so the encoder weights can extract a shared representation of all views.

The effect of the method of the embodiment is verified by comparative experiments.

Experimental data

ACM this is a network of papers in ACM datasets, building a dual-view graph using co-paper (both papers are written by the same author) and co-subject (both papers contain the same subject matter) relationships. The paper features are elements of the word bag represented by the keywords.

DBLP, consisting of three graphs, namely a collaboration graph, a referenced graph of a paper and a co-referenced graph of a paper; there are 2401 author nodes and 8703 edges in the collaboration graph; the thesis reference graph has 6000 thesis nodes and 10003 edges; the referenced graph of the paper has 6000 paper nodes and 141996 edges (if they refer to a common paper, two nodes are connected; authors and papers are linked by 32048 authorship; the linking of papers in the referenced graph and the referenced graph is based on the identity of the papers; all authors and papers relate to three clusters representing research fields: artificial intelligence, computer graphics and computer networks).

IMDB, which is a network of movies from IMDB data sets, that use the relationship of co-actors (movies are shown by the same actor) and co-director (movies are shown by the same director) to construct a multi-view, movie features correspond to a collection of words representing stories.

The clustering performance of the method of the present embodiment was evaluated in three data sets, ACM, DBLP and IMDB, whose brief statistical data are shown in table 1.

TABLE 1 data set information

Parameter setting and evaluation index

Four commonly used indicators were used, accuracy (ACC), F-score (F1), normalized Mutual Information (NMI), and the grand index (ARI); wherein, ARI value range is [ -1,1], the larger the value is, the better the value is, the overlapping degree of the two divisions is reflected; for each index, a larger value means a better clustering result.

For the ACM dataset, the training was iterated 250 times because the dataset is small; for the DBLP and IMDB datasets, all the autoencoder models were trained for 1000 iterations and optimized using Adam's algorithm; the learning rate λ of the auto-encoder is set to 0.001, and the dimensions of all embedding methods are set to 32; the convergence threshold for MCBGA is set to 0.1%, update period T =20; for the rest of the methods, the settings described in the corresponding papers are retained; since all clustering algorithms rely on initialization, all methods are repeated 10 times using random initialization.

Comparison method

GAE, a single view auto-encoder method;

x-avg, in order to utilize multiple views of the network, using the X method to learn the node representation on each view, and then averaging all learned representations;

MNE is an extensible multi-view network embedding model, and for all multi-view graph embedding/clustering methods, a multi-view graph adjacency matrix is used as input;

RMSC is a robust multi-view spectral clustering method based on low rank and sparse decomposition;

PwMC is a parameter weighting multi-view graph clustering method;

SwMC is a self-weighting multi-view graph clustering method;

o2MA A variant of O2MAC that contains no clustering penalty in the objective function;

o2MAC, providing a multi-view graph clustering method based on attributes;

MCBGA this embodiment proposes a multi-view clustering method based on a graph attention auto-encoder.

Results of the experiment

In order to evaluate the efficiency of the proposed method, the clustering performance on ACM, DBLP and IMDB datasets was analyzed, tables 2, 3 and 4 summarize the experimental results on three baseline datasets, where bold values represent the best performance, the experimental results are shown in tables 2, 3 and 4, it can be seen that the method of this example is significantly superior to all methods in most evaluation indices.

Table 2 results of the ACM data set

Method	ACC	F1	NMI	ARI
					GAE	0.8216	0.8225	0.4914	0.5444
GAE-avg	0.6990	0.7025	0.4771	0.4378
					MNE	0.6370	0.6479	0.2999	0.2486
RMSC	0.6315	0.5746	0.3973	0.3312
					PwMC	0.4162	0.3783	0.0332	0.0395
SwMC	0.3831	0.4709	0.0838	0.0187
					O2MA	0.8880	0.8894	0.6515	0.6987
O2MAC	0.9042*	0.9053*	0.6923*	0.7394*
					MCBGA	0.9102	0.9223	0.7052	0.7451

TABLE 3 Experimental results for DBLP dataset

Method	ACC	F1	NMI	ARI
					GAE	0.8859	0.8743	0.6925	0.7410
GAE-avg	0.5558	0.5418	0.3072	0.2577
					MNE	-	-	-	-
RMSC	0.8994	0.8248	0.7111	0.7647
					PwMC	0.3253	0.2808	0.0190	0.0159
SwMC	0.6538	0.5602	0.3760	0.3800
					O2MA	0.9040	0.8976	0.7257	0.7705
O2MAC	0.9074*	0.9013*	0.7287*	0.7780*
					MCBGA	0.9156	0.9047	0.7365	0.7789

Table 4 experimental results of IMDB data set

Where "-" indicates the best performance of the baseline, the best results of all methods are shown in bold, and "-" indicates insufficient memory of the method on the data set.

Experimental results visualization as shown in fig. 4-5 below, it can be seen from the visualization that MCBGA results are almost superior to all baseline methods, indicating that the model proposed in this example is valid; then, by comparing the results of MCBGA with GAE-avg, O2MA and O2MAC, the MCBGA of the embodiment is a more effective graph neural network for fusing multi-view information; in addition, compared with the O2MAC, the MCBGA of the present embodiment has better results on three data sets, for example, on the ACM data set, the MCBGA of the present embodiment is improved by 0.6%, 1.8%, 1.1%, 0.7% compared with the four indexes ACC, F1, NMI, ARI of the O2MAC, respectively, because the MCBGA of the present embodiment may utilize the four indexes ACC, F1, NMI, ARI of the O2MAC

Norm punishment is used for solving the problem of community specificity distribution of node representation, plays an important role in learning node representation, and can well depict a clustering structure so as to improve a clustering result; compared with the O2MA, the better results of the MCBGA of the embodiment on the three data sets show that the clustering performance can be further improved after the self-training clustering target is subjected to effective pre-training; note also that the results for all baselines on the IMDB dataset are reduced compared to the results for both ACM and DBLP datasets, since it is difficult to do so on the IMDBObtaining a "highly confident" node; in this case, these "high confidence" nodes may limit the low confidence nodes to the wrong cluster.

The MCBGA algorithm provided by the embodiment has good performance on both ACM and DBLP data sets, which shows the superiority of MCBGA in node clustering tasks; for example, on the ACM data set, the MCBGA of the present embodiment has significantly improved clustering performance compared with the GAE method, because the MCBGA of the present embodiment utilizes complementary information embedded in the multi-view data, whereas the single-view method does not; in general, single-view based clustering methods are inferior to multi-view based clustering methods because multi-view methods can utilize complementary information embedded in multi-view data, whereas single-view methods cannot.

For the experimental results, the following results can be obtained: firstly, the embedding method is obviously superior to other methods, so that graph embedding is a promising method for solving the graph clustering problem; second, the deep learning approach (i.e., GAE) achieves more competitive results than other baselines, but it can only utilize single-view graphical and content information; a well-designed deep neural network integrating multiple image views is possible to achieve good effects.

Ablation experiments

An ablation study was conducted to further analyze the importance of each module in the framework proposed by this example.

Table 5 ablation experiments of three data sets

Is provided with

Comparison tests with norm penalties, it can be seen from Table 5 that at setting->

The experimental results of the norm penalty are better, indicating that the present embodiment proposes/>

Norm penalties help to learn better potential node discriminative representations for node clustering tasks.

Secondly, verifying the necessity of using modularity to select the information graph views, taking each graph view as the input of the MCBGA, reconstructing all graph views on the three data sets, and putting their results on the graph clustering task in table 6; it can be observed that if the graphics view with higher modularity value is input to the encoder, the model will get better results, verifying that modularity is a viable solution for information map view selection.

Table 6 aggregates the results across different input views.

/>

Selected views used in the experiment are shown in bold.

In the model provided by the embodiment, a plurality of graph views are fused to improve clustering performance; to further study the impact of multi-view on cluster task learning embedding, the performance of MCBGA was carefully studied by adding the view-by-view in the DBLP dataset; these three views are co-conference, co-term and co-paper, respectively, which are added to the model in order; fig. 7 is a graph of the four index performance of the MCBGA with additional views, and the results from the four indices show that the performance of the model proposed in this example steadily improves as views are added one by one, so the MCBGA provides a flexible framework to take advantage of more views.

Example two

The embodiment discloses a multi-view clustering system based on an automatic graph attention encoder;

as shown in fig. 8, the multi-view clustering system based on the graph attention automatic encoder comprises a view selection module, a feature representation module, a feature constraint module and a cluster prediction module:

a feature constraint module configured to: by using

a cluster prediction module configured to: and (4) representing the constrained node characteristics, inputting the constrained node characteristics into a multi-view decoder for prediction, and obtaining a final clustering result.

EXAMPLE III

An object of the present embodiment is to provide a computer-readable storage medium.

A computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in the method for multi-view clustering based on a graph attention auto-encoder according to the first embodiment of the present disclosure.

Example four

An object of the present embodiment is to provide an electronic device.

An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the method for multi-view clustering based on a graph attention automatic encoder according to the first embodiment of the present disclosure.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. The multi-view clustering method based on the graph attention automatic encoder is characterized by comprising the following steps:

carrying out specificity constraint on the node feature representation by adopting l1, 2-norm punishment to obtain the node feature representation after the punishment;

2. The method as claimed in claim 1, wherein the view with the largest amount of information is obtained by calculating a modularization score of each view based on the clustering index and the adjacency matrix, and selecting the view with the highest score as the view with the largest amount of information.

3. The multi-view clustering method based on graph attention auto-encoder according to claim 1 characterized in that the graph attention encoder also learns the importance of neighboring nodes.

4. The multi-view clustering method based on graph attention automatic encoder as claimed in claim 3, characterized in that the learning of the importance of the neighbor node is to assign different weights to the neighbors in the layer-by-layer graph attention strategy of the graph attention encoder, representing the importance of the neighbor node to the current node.

5. The multi-view clustering method based on graph attention auto-encoder according to claim 1, wherein the node feature representation is specifically constrained with l1, 2-norm penalties, and the specific formula is:

6. The method of multi-view clustering based on graph attention auto-encoder according to claim 1, characterized in that in the multi-view decoder, a mapping function is added to change the mapping range of the node feature representation.

7. The multi-view clustering method based on graph attention auto-encoder according to claim 6, characterized in that the mapping function is specifically:

wherein x is a node feature representation.

8. The multi-view clustering system based on the graph attention automatic encoder is characterized by comprising a view selection module, a feature representation module, a feature constraint module and a clustering prediction module, wherein the view selection module comprises a view selection module, a feature representation module, a feature constraint module and a clustering prediction module:

a feature constraint module configured to: carrying out specificity constraint on the node feature representation by adopting l1, 2-norm punishment to obtain the node feature representation after the punishment;

and a multi-view decoder is added to reconstruct the constrained node feature representation, and the graph attention encoder is trained by utilizing the reconstruction loss.

9. Computer readable storage medium, on which a program is stored which, when being executed by a processor, carries out the steps of the method for multi-view clustering based on graph attention auto-encoders according to any of claims 1-7.

10. Electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, characterized in that the processor when executing the program performs the steps in the method for multi-view clustering based on graph attention auto-encoder according to any of the claims 1-7.