LU503098B1 - A method and system for fused subspace clustering based on graph autoencoder - Google Patents
Classifications
- G06N3/042: Knowledge-based neural networks; logical representations of neural networks
- G06F18/23: Clustering techniques
- G06N3/0455: Auto-encoder networks; encoder-decoder networks
- G06N3/0464: Convolutional networks [CNN, ConvNet]
- G06N3/088: Non-supervised learning, e.g. competitive learning
- G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
Abstract
The present application discloses a method and system for fused subspace clustering based on a graph autoencoder, the method comprising the steps of: S1, initializing the graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix from the graph dataset; S2, mapping the graph dataset through the initialized graph convolution autoencoder to reconstruct the adjacency matrix of the graph dataset; S3, updating the parameters of the graph convolution encoder and the graph convolution decoder so as to minimize the reconstruction loss L_GAE; S4, extracting graph data features from each graph convolution layer of the updated graph convolution encoder and updating the self-expression coefficient matrix so as to minimize the fused subspace clustering loss L_SC; S5, calculating the reconstruction loss L_GAE and the fused subspace clustering loss L_SC using the updated graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix; S6, continuing to update the graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC; S7, transforming the finally obtained self-expression coefficient matrix into clustering labels by a spectral clustering algorithm. This application jointly optimizes the reconstruction loss and the fused subspace clustering loss, so that feature quality and clustering performance are iteratively improved.
Description
BL-5592 1
A METHOD AND SYSTEM FOR FUSED SUBSPACE CLUSTERING BASED ON GRAPH AUTOENCODER
The present application belongs to the field of clustering technology, and specifically relates to a method and system for fused subspace clustering based on graph autoencoder.
Cluster analysis is the most basic data analysis tool and the most important component of unsupervised learning, which is widely used in various fields and has an important impact on people's daily life.
Traditional clustering techniques, including division-based, hierarchy-based, and density-based algorithms, rely on hand-designed features. Moreover, for high-dimensional, complex data, hand-designed features often fail to meet practical needs. Since the feature quality of the data determines the upper bound on the performance of machine learning algorithms, low-quality manual features severely limit the performance of clustering algorithms.
Deep clustering techniques exploit the powerful representation learning ability of conventional neural networks (fully connected, convolutional, recurrent, etc.) to extract high-quality feature vectors, which are then analyzed with conventional clustering algorithms, greatly improving the clustering effect. However, conventional neural networks are only suited to independent and identically distributed samples and cannot effectively process graph data.
For example, the Chinese patent with application number CN201611027489.7 discloses a convolutional neural network-based text clustering method for social networks, comprising the following steps: Text preprocessing: Filter useless characters while converting to word vectors.
Feature mapping: The word vector is mapped to a binary feature vector available to the convolutional neural network model by a local feature-preserving algorithm as the target feature for convolutional neural network training. Convolutional Neural Network: The convolutional neural network training process is performed with a word vector as input and a binary feature vector as the target feature. K-means clustering: The clustering results are obtained based on the binary feature vectors output from the convolutional neural network using the unsupervised learning algorithm
K-means in machine learning. The scheme of this patent is based on a convolutional neural network for clustering, and thus cannot handle graph data effectively.
In response to the above-mentioned problems in the prior art, this application proposes a method and system for fused subspace clustering based on graph autoencoder with high clustering accuracy on graph data.
This application uses the following technical solution: A graph autoencoder based fused subspace clustering method, comprising the steps:
S1 Initialization of the graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix by the graph data set G(X, A), where X is the feature matrix, A is the adjacency matrix, and the graph convolution encoder and the graph convolution decoder form the graph convolution autoencoder;
S2 Mapping the graph dataset by initializing the graph convolutional autoencoder to reconstruct the adjacency matrix of the graph dataset;
S3 Calculating the reconstruction loss L_GAE of the graph convolutional autoencoder and updating the parameters of the graph convolutional encoder and the graph convolutional decoder so as to minimize L_GAE;
S4 Extracting graph data features from each graph convolution layer of the updated graph convolution encoder, calculating the fused subspace clustering loss L_SC using the extracted features and the feature matrix as input, and solving for the updated self-expression coefficient matrix so as to minimize L_SC;
S5 Computing the reconstruction loss L_GAE and the fused subspace clustering loss L_SC using the updated graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix;
S6 Continuing to update the graph convolution encoder parameters, the graph convolution decoder parameters, and the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC;
S7 Transforming the finally obtained self-expression coefficient matrix into clustering labels by a spectral clustering algorithm.
As a preferred embodiment, step S2 specifically comprises the steps of:
S2.1 Mapping the graph data set by the initialized graph convolution encoder to obtain the embedding features of the graph data;
S2.2 Mapping the embedded features by the initialized graph convolution decoder to reconstruct the adjacency matrix.
As a preferred solution, in step S2.1 the embedding features of the graph data are obtained by mapping the graph data set with the initialized graph convolution encoder, specifically using Z = Q(X, A; W);
In step S2.2, the embedded features Z are mapped by the initialized graph convolution decoder to reconstruct the adjacency matrix, specifically using Â = H(Z; Ω),
Where W = {W_1, W_2, ..., W_K} are the parameters of each graph convolution layer in the graph convolution encoder, Ω are the parameters of the graph convolution decoder, Â is the reconstructed adjacency matrix, Q denotes the graph convolution encoder mapping, and H denotes the graph convolution decoder mapping.
As a preferred option, the equation for calculating the reconstruction loss L_GAE in step S3 is specified as:
L_GAE = (1/2) ||A − Â||_F²
In step S4, the fused subspace clustering loss L_SC is specified as:
L_SC = ||C||_p + (λ / (2(K + 1))) Σ_{k=0}^{K} ||Z_k − Z_k C||_F²
Where K is the number of graph convolution layers of the graph convolution encoder, Z_1, Z_2, ..., Z_K are the graph data features extracted from each graph convolution layer of the updated graph convolution encoder in step S4, Z_0 = X is the original feature matrix, C is the self-expression coefficient matrix, ||·||_p denotes the p-norm, ||·||_F denotes the Frobenius norm, and λ is the balance coefficient of the self-expression loss.
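The two losses above can be sketched directly in NumPy. The following is an illustration only, not the patented implementation; the function names and the choice p = 1 are assumptions, and columns of each Z_k are taken as samples:

```python
import numpy as np

def reconstruction_loss(A, A_hat):
    """L_GAE = (1/2) * ||A - A_hat||_F^2: squared Frobenius distance between
    the original and the reconstructed adjacency matrix."""
    return 0.5 * np.linalg.norm(A - A_hat, "fro") ** 2

def fused_subspace_loss(C, Z_list, lam=1.0, p=1):
    """L_SC = ||C||_p + lam/(2(K+1)) * sum_{k=0}^{K} ||Z_k - Z_k C||_F^2.
    Z_list = [Z_0 = X, Z_1, ..., Z_K] holds the raw feature matrix plus the
    features of every graph-convolution layer; columns are samples."""
    K = len(Z_list) - 1
    reg = np.sum(np.abs(C) ** p) ** (1.0 / p)  # entrywise p-norm of C
    self_expr = sum(np.linalg.norm(Z - Z @ C, "fro") ** 2 for Z in Z_list)
    return reg + lam / (2.0 * (K + 1)) * self_expr
```

With C = 0 the self-expression term reduces to the summed feature energies scaled by λ/(2(K+1)), which is a convenient sanity check.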
As a preferred solution, step S6 specifically includes the following steps:
S6.1 Fixing the self-expression coefficient matrix and continuing to update the parameters of the graph convolution encoder and the parameters of the graph convolution decoder so as to minimize the joint loss L_GAE + L_SC;
S6.2 Fixing the parameters of the graph convolution encoder and the parameters of the graph convolution decoder, and continuing to update the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC.
As a preferred option, between steps S6 and S7 the method further includes the step of repeating steps S5-S6 for a predetermined number of iterations.
Accordingly, there is also provided a fused subspace clustering system based on a graph autoencoder, comprising an initialization module, a graph convolutional autoencoding module, a first update module, a second update module, and a clustering module connected in sequence, wherein the graph convolutional autoencoding module includes a graph convolutional encoding unit and a graph convolutional decoding unit;
An initialization module for initializing the graph convolution encoding unit parameters, graph convolution decoding unit parameters, and self-expression coefficient matrix from the graph data set G(X, A), where X is the feature matrix and A is the adjacency matrix;
A graph convolutional autoencoding module for mapping the graph dataset and reconstructing the adjacency matrix of the graph dataset;
A first update module for calculating the reconstruction loss L_GAE of the graph convolutional autoencoding module and updating the parameters of the graph convolutional encoding unit and the graph convolutional decoding unit so as to minimize L_GAE; and for calculating the fused subspace clustering loss L_SC, using as input the graph data features extracted from each graph convolutional layer of the updated graph convolutional encoding unit together with the feature matrix of the graph data set, and solving for the updated self-expression coefficient matrix so as to minimize L_SC;
A second update module for calculating the reconstruction loss L_GAE and the fused subspace clustering loss L_SC using the updated graph convolutional encoding unit parameters, graph convolutional decoding unit parameters, and self-expression coefficient matrix, and continuing to update these parameters and the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC;
A clustering module for transforming the finally obtained self-expression coefficient matrix into clustering labels by means of a spectral clustering algorithm.
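The clustering module's conversion of the self-expression coefficient matrix into labels can be sketched as follows. Building the affinity matrix as (|C| + |Cᵀ|)/2 and thresholding the second-smallest eigenvector of the normalized Laplacian are common choices assumed here for illustration, since the patent only specifies "a spectral clustering algorithm":

```python
import numpy as np

def labels_from_self_expression(C):
    """Two-cluster spectral split of the self-expression coefficient matrix:
    symmetrize |C| into an affinity, build the normalized Laplacian, and
    threshold the second-smallest eigenvector at its median."""
    W = 0.5 * (np.abs(C) + np.abs(C).T)
    d = W.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L_sym = np.eye(len(W)) - d_inv_sqrt @ W @ d_inv_sqrt
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    fiedler = vecs[:, 1]             # second-smallest eigenvector
    return (fiedler > np.median(fiedler)).astype(int)

# Near-block-diagonal C: samples {0, 1} and {2, 3} express each other
# strongly, with only a weak cross-subspace leakage of 0.05.
C = np.array([[0.0, 0.9, 0.05, 0.0],
              [0.9, 0.0, 0.0, 0.05],
              [0.05, 0.0, 0.0, 0.8],
              [0.0, 0.05, 0.8, 0.0]])
labels = labels_from_self_expression(C)
```

For more than two clusters, one would embed the nodes with the eigenvectors of the smallest eigenvalues and run k-means, as in standard normalized spectral clustering.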
As a preferred embodiment, the graph convolutional autoencoding module maps the graph data set by the graph convolutional encoding unit to obtain the embedding features of the graph data, specifically using Z = Q(X, A; W);
The graph convolutional autoencoding module maps the embedded features Z by means of the graph convolutional decoding unit to reconstruct the adjacency matrix, specifically using Â = H(Z; Ω);
The graph convolutional autoencoding module also includes an extraction unit, which is used to extract graph data features from each graph convolutional layer of the graph convolutional encoding unit;
Where W = {W_1, W_2, ..., W_K} are the parameters of each graph convolution layer in the graph convolution encoding unit, Ω are the parameters of the graph convolution decoding unit, Â is the reconstructed adjacency matrix, Q denotes the graph convolution encoding unit mapping, and H denotes the graph convolution decoding unit mapping.
As a preferred embodiment, the equation for calculating the reconstruction loss L_GAE in the first update module is specified as:
L_GAE = (1/2) ||A − Â||_F²
The equation for the fused subspace clustering loss L_SC in the first update module is specified as:
L_SC = ||C||_p + (λ / (2(K + 1))) Σ_{k=0}^{K} ||Z_k − Z_k C||_F²
Where K is the number of graph convolution layers of the graph convolution encoding unit, Z_1, Z_2, ..., Z_K are the graph data features extracted from each graph convolution layer of the updated graph convolution encoding unit, Z_0 = X is the original feature matrix, C is the self-expression coefficient matrix, ||·||_p denotes the p-norm, ||·||_F denotes the Frobenius norm, and λ is the balance coefficient of the self-expression loss.
As a preferred option, in the second update module the parameters of the graph convolutional encoding unit, the parameters of the graph convolutional decoding unit, and the self-expression coefficient matrix continue to be updated so as to minimize the joint loss L_GAE + L_SC, specifically as follows:
Fixing the self-expression coefficient matrix and continuing to update the parameters of the graph convolution encoding unit and the parameters of the graph convolution decoding unit so as to minimize the joint loss L_GAE + L_SC;
Fixing the parameters of the graph convolution encoding unit and the parameters of the graph convolution decoding unit, and continuing to update the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC.
The beneficial effects of this application are: (1) The cluster analysis method using graph autoencoder is able to effectively process graph data and learn multi-level feature representation of graph data while keeping the topology of the data unchanged. (2) The learned self-expression coefficient matrix is used as the supervised signal, and the graph convolutional autoencoder is jointly optimized using the graph convolutional autoencoder reconstruction loss and the fused subspace clustering loss, so that the feature quality and clustering performance are iteratively improved.
Brief description of the accompanying drawings
In order to more clearly illustrate the technical solutions in the embodiments or prior art of the present application, the following is a brief description of the accompanying drawings that need to be used in the description of the embodiments or prior art. It is obvious that the accompanying drawings in the following description are only some of the embodiments of the present application, and other accompanying drawings can be obtained based on them without any creative work for a person of ordinary skill in the art.
Figure 1 is a flow chart of a graph autoencoder based fused subspace clustering method described in the present application;
Figure 2 is a schematic diagram of the structure of a fused subspace clustering system based on graph autoencoder as described in this application.
The following illustrates the embodiment of the present application by specific concrete examples, and other advantages and efficacy of the present application can be readily understood by those skilled in the art as disclosed in this specification. The present application may also be implemented or applied by additionally different specific embodiments, and the details in this specification may also be modified or changed in various ways without departing from the spirit of
the present application based on different views and applications. It is to be noted that the following embodiments and the features in the embodiments can be combined with each other without conflict.
Embodiment I:
Referring to Figure 1, this embodiment provides a method of fused subspace clustering based on graph autoencoder, comprising the steps:
S1 Initialization of the graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix by the graph data set G(X, A), where X is the feature matrix, A is the adjacency matrix, and the graph convolution encoder and the graph convolution decoder form the graph convolution autoencoder;
Here, the self-expression coefficient matrix is the matrix of coefficients with which each sample is linearly represented by the other samples in the subspace clustering algorithm.
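The self-expression property can be illustrated on a toy dataset with two one-dimensional subspaces. The closed-form ridge-regularized solver and the regularizer τ below are assumptions made purely for illustration; the patent itself updates the coefficient matrix by gradient descent on the joint loss:

```python
import numpy as np

# Toy data: columns are samples; samples 0-1 lie on one line in R^3,
# samples 2-3 on another, i.e. two one-dimensional subspaces.
X = np.array([[1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])

# Closed-form ridge-regularized self-expression (illustrative only):
# C = argmin ||X - X C||_F^2 + tau * ||C||_F^2
#   = (X^T X + tau I)^{-1} X^T X
tau = 1e-3
n = X.shape[1]
gram = X.T @ X
C = np.linalg.solve(gram + tau * np.eye(n), gram)

# Because the two subspaces are independent, C comes out block-diagonal:
# each sample is expressed only by samples from its own subspace, which is
# exactly the structure spectral clustering later exploits.
```

The (near-)zero cross-subspace entries of C are what make the matrix usable as an affinity for clustering.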
S2 Mapping the graph dataset by initializing the graph convolutional autoencoder to reconstruct the adjacency matrix of the graph dataset;
S3 Calculating the reconstruction loss L_GAE of the graph convolutional autoencoder and updating the parameters of the graph convolutional encoder and the graph convolutional decoder so as to minimize L_GAE;
S4 Extracting graph data features from each graph convolution layer of the updated graph convolution encoder, calculating the fused subspace clustering loss L_SC using the extracted features and the feature matrix as input, and solving for the updated self-expression coefficient matrix so as to minimize L_SC;
S5 Computing the reconstruction loss L_GAE and the fused subspace clustering loss L_SC using the updated graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix;
S6 Continuing to update the graph convolution encoder parameters, the graph convolution decoder parameters, and the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC;
S7 Transforming the finally obtained self-expression coefficient matrix into clustering labels by a spectral clustering algorithm.
Specifically:
In step S2, the following steps are specifically included:
S2.1 Mapping the graph data set by the initialized graph convolution encoder to obtain the embedding features of the graph data;
S2.2 Mapping the embedded features by the initialized graph convolution decoder to reconstruct the adjacency matrix.
Further,
In step S2.1, the graph data set is mapped by the initialized graph convolution encoder to obtain the embedding features of the graph data, specifically using Z = Q(X, A; W);
In step S2.2, the embedded features Z are mapped by the initialized graph convolution decoder to reconstruct the adjacency matrix, specifically using Â = H(Z; Ω);
Where W = {W_1, W_2, ..., W_K} are the parameters of each graph convolution layer in the graph convolution encoder, Ω are the parameters of the graph convolution decoder, Â is the reconstructed adjacency matrix, Q denotes the graph convolution encoder mapping, H denotes the graph convolution decoder mapping, and the embedded features are the features output from the last graph convolution layer of the graph convolution encoder.
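A minimal sketch of the encoder and decoder mappings, assuming standard graph-convolution layers with symmetric adjacency normalization and a parameter-free inner-product decoder. All function and variable names here are illustrative, and the inner-product decoder is a common stand-in: the patent's decoder H carries its own parameters Ω:

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt

def gcn_encoder(X, A, weights):
    """Z = Q(X, A; W): stacked graph convolutions; returns the features of
    every layer Z_1, ..., Z_K, since L_SC uses all of them. Rows are nodes."""
    A_norm = normalize_adj(A)
    Z, layers = X, []
    for W in weights:
        Z = np.maximum(A_norm @ Z @ W, 0.0)  # ReLU graph convolution
        layers.append(Z)
    return layers

def inner_product_decoder(Z):
    """Reconstruct the adjacency matrix as sigmoid(Z Z^T)."""
    return 1.0 / (1.0 + np.exp(-Z @ Z.T))

rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
X = rng.standard_normal((3, 4))
weights = [rng.standard_normal((4, 8)), rng.standard_normal((8, 2))]
layers = gcn_encoder(X, A, weights)        # Z_1 and Z_2 (K = 2)
A_rec = inner_product_decoder(layers[-1])  # reconstructed adjacency
```

The list of per-layer features returned by the encoder is exactly what the extraction step S4 feeds into the fused subspace clustering loss.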
The equation for the reconstruction loss L_GAE in step S3 is specified as:
L_GAE = (1/2) ||A − Â||_F²
In step S4, the fused subspace clustering loss L_SC is specified as:
L_SC = ||C||_p + (λ / (2(K + 1))) Σ_{k=0}^{K} ||Z_k − Z_k C||_F²
Where K is the number of graph convolution layers of the graph convolution encoder, Z_1, Z_2, ..., Z_K are the graph data features extracted from each graph convolution layer of the updated graph convolution encoder in step S4, Z_0 = X is the original feature matrix, C is the self-expression coefficient matrix, ||·||_p denotes the p-norm, ||·||_F denotes the Frobenius norm, and λ is the balance coefficient of the self-expression loss.
Step S6 specifically includes the following steps:
S6.1 Fixing the self-expression coefficient matrix and continuing to update the parameters of the graph convolution encoder and the parameters of the graph convolution decoder so as to minimize the joint loss L_GAE + L_SC;
S6.2 Fixing the parameters of the graph convolution encoder and the parameters of the graph convolution decoder, and continuing to update the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC.
Between steps S6 and S7, the method also includes the step of repeating steps S5-S6 for a preset number of iterations; the preset number can be set to 10.
Gradient descent and backpropagation algorithms are used when updating the parameters of the graph convolutional encoder, the parameters of the graph convolutional decoder, and the self-expression coefficient matrix as described above.
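A minimal sketch of one such gradient update, restricted for brevity to the self-expression coefficient matrix with the encoder features held fixed (step S6.2). The λ value, learning rate, and the omission of the ||C||_p subgradient and of the backpropagation updates for the encoder/decoder parameters are simplifying assumptions:

```python
import numpy as np

def update_C(Z_list, C, lam=1.0, lr=0.01):
    """One gradient-descent step on C for the smooth part of L_SC,
    lam/(2(K+1)) * sum_k ||Z_k - Z_k C||_F^2, with the features in
    Z_list held fixed (columns are samples)."""
    K = len(Z_list) - 1
    grad = -(lam / (K + 1)) * sum(Z.T @ (Z - Z @ C) for Z in Z_list)
    return C - lr * grad

def self_expr_residual(Z_list, C):
    return sum(np.linalg.norm(Z - Z @ C, "fro") ** 2 for Z in Z_list)

rng = np.random.default_rng(1)
Z_list = [rng.standard_normal((5, 6)) for _ in range(3)]  # Z_0, Z_1, Z_2
C = np.zeros((6, 6))
before = self_expr_residual(Z_list, C)
for _ in range(200):
    C = update_C(Z_list, C)
after = self_expr_residual(Z_list, C)
```

In the full alternating scheme of S6.1/S6.2, an analogous gradient step (computed by backpropagation) would update W and Ω while C is held fixed.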
This application uses a clustering analysis method based on a graph autoencoder, which can effectively process graph data and learn a multi-level feature representation of the graph data while keeping the topology of the data unchanged. The learned self-expression coefficient matrix is used as a supervised signal to jointly optimize the graph convolutional autoencoder using the reconstruction loss and the fused subspace clustering loss, iteratively improving feature quality and clustering performance.
Further, the present embodiment is compared with two existing methods, DEC and GAE, on multiple datasets to verify the effectiveness of the method of this application (DEC uses a deep embedded clustering algorithm to obtain clustering results through a fully connected autoencoder; GAE also uses a graph convolutional autoencoding method, but lacks the subspace clustering algorithm applied in this application). The datasets used are:
Cora: a paper citation dataset containing 2,708 papers divided into seven categories (case-based, genetic algorithms, neural networks, probabilistic methods, reinforcement learning, rule-based learning, and theory), with a feature vector dimension of 1,433 per paper.
Citeseer: a paper citation dataset, similar to Cora, containing 3,327 samples, each of dimension 3,703, divided into 6 classes.
DBLP: a paper co-authorship dataset containing 4,058 nodes, each of dimension 334, divided into 4 classes.
The final clustering performance is shown in Table 1 below:
[Table 1: clustering performance of DEC, GAE, and the present application on each dataset; the numeric entries are not legible in the source text.]
It can be seen that the clustering performance of this application is significantly better than the two comparison methods.
Embodiment II:
Referring to Figure 2, this embodiment provides a fused subspace clustering system based on graph autoencoder, including sequentially connected initialization module, graph convolutional self-encoding module, first update module, second update module, and clustering module, wherein the graph convolutional self-encoding module includes a graph convolutional encoding unit as well as a graph convolutional decoding unit;
An initialization module for initializing the graph convolution encoding unit parameters, graph convolution decoding unit parameters, and self-expression coefficient matrix from the graph data set G(X, A), where X is the feature matrix and A is the adjacency matrix;
A graph convolutional autoencoding module for mapping the graph dataset and reconstructing the adjacency matrix of the graph dataset;
A first update module for calculating the reconstruction loss L_GAE of the graph convolutional autoencoding module and updating the parameters of the graph convolutional encoding unit and the graph convolutional decoding unit so as to minimize L_GAE; and for calculating the fused subspace clustering loss L_SC, using as input the graph data features extracted from each graph convolutional layer of the updated graph convolutional encoding unit together with the feature matrix of the graph data set, and solving for the updated self-expression coefficient matrix so as to minimize L_SC;
A second update module for calculating the reconstruction loss L_GAE and the fused subspace clustering loss L_SC using the updated graph convolutional encoding unit parameters, graph convolutional decoding unit parameters, and self-expression coefficient matrix, and continuing to update these parameters and the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC;
A clustering module for transforming the finally obtained self-expression coefficient matrix into clustering labels by means of a spectral clustering algorithm.
Specifically:
The graph convolutional autoencoding module maps the graph data set by the graph convolutional encoding unit to obtain the embedding features of the graph data, specifically using Z = Q(X, A; W);
The graph convolutional autoencoding module maps the embedded features Z by means of the graph convolutional decoding unit to reconstruct the adjacency matrix, specifically using Â = H(Z; Ω);
The graph convolutional autoencoding module also includes an extraction unit, which is used to extract graph data features from each graph convolutional layer of the graph convolutional encoding unit;
Where W = {W_1, W_2, ..., W_K} are the parameters of each graph convolutional layer in the graph convolutional encoding unit, Ω are the parameters of the graph convolutional decoding unit, Â is the reconstructed adjacency matrix, Q denotes the graph convolutional encoding unit mapping, and H denotes the graph convolutional decoding unit mapping.
The equation for calculating the reconstruction loss L_GAE in the first update module is specified as:
L_GAE = (1/2) ||A − Â||_F²
The fused subspace clustering loss L_SC in the first update module is calculated specifically as:
L_SC = ||C||_p + (λ / (2(K + 1))) Σ_{k=0}^{K} ||Z_k − Z_k C||_F²
Where K is the number of graph convolution layers of the graph convolution encoding unit, Z_1, Z_2, ..., Z_K are the graph data features extracted from each graph convolutional layer of the updated graph convolutional encoding unit, Z_0 = X is the original feature matrix, C is the self-expression coefficient matrix, ||·||_p denotes the p-norm, ||·||_F denotes the Frobenius norm, and λ is the balance coefficient of the self-expression loss.
In the second update module, the parameters of the graph convolutional encoding unit, the parameters of the graph convolutional decoding unit, and the self-expression coefficient matrix continue to be updated so as to minimize the joint loss L_GAE + L_SC, specifically as follows:
Fixing the self-expression coefficient matrix and continuing to update the parameters of the graph convolution encoding unit and the parameters of the graph convolution decoding unit so as to minimize the joint loss L_GAE + L_SC;
Fixing the parameters of the graph convolution encoding unit and the parameters of the graph convolution decoding unit, and continuing to update the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC.
Gradient descent and backpropagation algorithms are used for the above updates of the graph convolution encoding unit parameters, graph convolution decoding unit parameters, and self-expression coefficient matrix.
It should be noted that this embodiment provides a graph autoencoder-based fused subspace clustering system similar to embodiment I, which will not be further described herein.
The above described embodiments are only a description of the preferred embodiment of the present application, not a limitation of the scope of the present application. Without departing from the spirit of the design of the present application, all kinds of deformations and improvements made to the technical solution of the present application by a person of ordinary skill in the art shall fall within the scope of protection of the present application.
Claims (10)
1. A graph autoencoder based fused subspace clustering method characterized by comprising the steps:
S1. Initialization of the graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix by the graph data set G(X, A), where X is the feature matrix, A is the adjacency matrix, and the graph convolution encoder and the graph convolution decoder form the graph convolution autoencoder;
S2. Mapping the graph dataset by the initialized graph convolutional autoencoder to reconstruct the adjacency matrix of the graph dataset;
S3. Calculating the reconstruction loss L_GAE of the graph convolutional autoencoder and updating the parameters of the graph convolutional encoder and the graph convolutional decoder so as to minimize L_GAE;
S4. Extracting graph data features from each graph convolution layer of the updated graph convolution encoder, calculating the fused subspace clustering loss L_SC using the extracted features and the feature matrix as input, and solving for the updated self-expression coefficient matrix so as to minimize L_SC;
S5. Computing the reconstruction loss L_GAE and the fused subspace clustering loss L_SC using the updated graph convolution encoder parameters, graph convolution decoder parameters, and self-expression coefficient matrix;
S6. Continuing to update the graph convolution encoder parameters, the graph convolution decoder parameters, and the self-expression coefficient matrix so as to minimize the joint loss L_GAE + L_SC;
S7. Transforming the finally obtained self-expression coefficient matrix into clustering labels by a spectral clustering algorithm.
2. A graph autoencoder based fused subspace clustering method according to claim 1, characterized in that step S2 specifically comprises the steps of:
S2.1 Mapping the graph data set by initializing the graph convolution encoder to obtain the embedding features of the graph data;
S2.2 Mapping the embedded features by initializing the graph convolution decoder to reconstruct the adjacency matrix.
3. A method for fused subspace clustering based on graph autoencoders according to claim 2, characterized in that: In step S2.1, the graph data set is mapped by the initialized graph convolution encoder to obtain the embedding features of the graph data, specifically using Z = Q(X, A; W); In step S2.2, the embedded features Z are mapped by the initialized graph convolution decoder to reconstruct the adjacency matrix, specifically using Â = H(Z; Ω); Where W = {W1, W2, ..., WK} are the parameters of each graph convolution layer in the graph convolution encoder, Ω are the parameters of the graph convolution decoder, Â is the reconstructed adjacency matrix, Q denotes the graph convolution encoder mapping, and H denotes the graph convolution decoder mapping.
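The mappings Z = Q(X, A; W) and Â = H(Z; Ω) can be sketched as stacked graph convolutions followed by a decoder. In the numpy sketch below, the self-loop normalization D^{-1/2}(A + I)D^{-1/2}, the ReLU activation, and the parameterized inner-product decoder are illustrative assumptions; the claims leave the concrete layer forms open.

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization with self-loops, D^{-1/2}(A+I)D^{-1/2} (a common GCN convention)."""
    A_hat = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]

def encoder(X, A, Ws):
    """Z = Q(X, A; W): stacked graph convolution layers; returns per-layer features Z_1..Z_K."""
    A_norm = normalize_adj(A)
    Zs, Z = [], X
    for Wk in Ws:
        Z = np.maximum(A_norm @ Z @ Wk, 0.0)  # ReLU graph convolution (assumed activation)
        Zs.append(Z)
    return Zs

def decoder(Z, Omega):
    """A_rec = H(Z; Omega): a parameterized inner-product decoder (assumed form)."""
    S = Z @ Omega @ Z.T
    return 1.0 / (1.0 + np.exp(-S))  # sigmoid -> edge probabilities in [0, 1]
```

The extracted per-layer features Z_1, ..., Z_K are exactly what step S4 consumes alongside Z_0 = X.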
4. A graph autoencoder based fused subspace clustering method according to claim 3, characterized in that the equation for calculating the reconstruction loss L_GAE in step S3 is specified as: L_GAE = (1/2)‖A − Â‖_F²; In step S4, the fused subspace clustering loss L_SC is calculated specifically as: L_SC = ‖C‖_p + (λ/2)·Σ_{k=0}^{K} ‖Z_k − Z_k C‖_F²; Where K is the number of graph convolution layers of the graph convolution encoder, Z1, Z2, ..., ZK are the graph data features extracted from each graph convolution layer of the updated graph convolution encoder in step S4, Z0 = X is the original feature matrix, C is the self-expression coefficient matrix, ‖·‖_p denotes the p-norm, ‖·‖_F denotes the Frobenius norm, and λ is the balance coefficient of the self-expression loss.
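Given the quantities defined in claims 3-4, the two losses can be computed directly. The sketch below assumes node representations are stored as columns of each Z_k (so the self-expression Z_k ≈ Z_k C acts column-wise) and reads ‖C‖_p as the entry-wise p-norm; both are interpretive choices for illustration, not fixed by the claim.

```python
import numpy as np

def gae_loss(A, A_rec):
    """L_GAE = (1/2) * ||A - A_rec||_F^2 (reconstruction loss of step S3)."""
    return 0.5 * np.linalg.norm(A - A_rec, "fro") ** 2

def fused_sc_loss(C, Zs, lam=1.0, p=2):
    """L_SC = ||C||_p + (lam/2) * sum_k ||Z_k - Z_k C||_F^2,
    fusing the self-expression residual over Z_0 = X, Z_1, ..., Z_K.
    Each Z in Zs has nodes as columns; C is n x n."""
    reg = np.sum(np.abs(C) ** p) ** (1.0 / p)  # entry-wise p-norm of C (assumed reading)
    residual = sum(np.linalg.norm(Z - Z @ C, "fro") ** 2 for Z in Zs)
    return reg + 0.5 * lam * residual
```

For example, with C equal to the identity the residual term vanishes and only the regularizer ‖C‖_p remains.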
5. A graph autoencoder based fused subspace clustering method according to claim 4, characterized in that: step S6 specifically comprises the following steps:
S6.1 Fixing the self-expression coefficient matrix and continuing to update the parameters of the graph convolution encoder and the parameters of the graph convolution decoder with the aim of minimizing the joint loss L_GAE + L_SC;
S6.2 Fixing the parameters of the graph convolution encoder and the parameters of the graph convolution decoder, and continuing to update the self-expression coefficient matrix with the aim of minimizing the joint loss L_GAE + L_SC.
6. A graph autoencoder based fused subspace clustering method according to claim 3, characterized in that between step S6 and step S7 it further comprises: repeating steps S5-S6 until a preset number of iterations is reached.
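The alternation of claims 5-6 fixes one set of variables and descends the joint loss in the other, repeated for a preset number of iterations. The sketch below implements only the S6.2 half (updating C with the network fixed, where L_GAE is constant in C and drops out) by plain gradient descent; the step size and iteration count are illustrative assumptions, and the S6.1 network update would be carried out analogously by backpropagation through the autoencoder.

```python
import numpy as np

def sc_loss(C, Zs, lam):
    """L_SC with the Frobenius norm as the regularizer (p = 2, assumed)."""
    return np.linalg.norm(C, "fro") + 0.5 * lam * sum(
        np.linalg.norm(Z - Z @ C, "fro") ** 2 for Z in Zs)

def update_C(C, Zs, lam, lr=0.01, steps=200):
    """S6.2 sketch: with the network parameters fixed, descend the joint loss in C.
    Gradient: d||C||_F/dC = C/||C||_F, plus lam * Z^T (Z C - Z) per layer."""
    for _ in range(steps):
        grad = C / max(np.linalg.norm(C, "fro"), 1e-12)
        for Z in Zs:
            grad += lam * Z.T @ (Z @ C - Z)
        C = C - lr * grad
    return C
```

Starting from C = 0, a few hundred steps strictly decrease L_SC on random layer features, which is all this half-step is required to do before control returns to S6.1.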
7. A fused subspace clustering system based on graph autoencoder, characterized in that it includes a sequentially connected initialization module, a graph convolutional self-encoding module, a first update module, a second update module, and a clustering module, wherein the graph convolutional self-encoding module includes a graph convolutional encoding unit and a graph convolutional decoding unit;
The initialization module is for initializing the graph convolution encoding unit parameters, the graph convolution decoding unit parameters, and the self-expression coefficient matrix by the graph data set G(X, A), where X is the feature matrix and A is the adjacency matrix;
The graph convolutional self-encoding module is for mapping the graph dataset and reconstructing the adjacency matrix of the graph dataset;
The first update module is for calculating the reconstruction loss L_GAE of the graph convolutional self-encoding module and solving to update the parameters of the graph convolutional encoding unit and the parameters of the graph convolutional decoding unit with the aim of minimizing L_GAE; it is also for calculating the fused subspace clustering loss L_SC, using as input the graph data features extracted from each graph convolutional layer of the updated graph convolutional encoding unit and the feature matrix of the graph data set, and solving to update the self-expression coefficient matrix with the aim of minimizing L_SC;
The second update module is for calculating the reconstruction loss L_GAE and the fused subspace clustering loss L_SC using the updated graph convolution encoding unit parameters, graph convolution decoding unit parameters, and self-expression coefficient matrix, and for continuing to update the graph convolution encoding unit parameters, graph convolution decoding unit parameters, and self-expression coefficient matrix with the aim of minimizing the joint loss L_GAE + L_SC;
The clustering module is for transforming the finally obtained self-expression coefficient matrix into clustering labels by means of a spectral clustering algorithm.
8. A graph autoencoder based fused subspace clustering system according to claim 7, characterized in that: The graph convolutional self-encoding module maps the graph data set by the graph convolutional encoding unit to obtain the embedding features of the graph data, specifically using Z = Q(X, A; W); The graph convolutional self-encoding module maps the embedded features Z by means of the graph convolutional decoding unit to reconstruct the adjacency matrix, specifically using Â = H(Z; Ω); The graph convolutional self-encoding module also includes an extraction unit, which is used to extract graph data features from each graph convolutional layer of the graph convolutional encoding unit; Where W = {W1, W2, ..., WK} are the parameters of each graph convolution layer in the graph convolution encoding unit, Ω are the parameters of the graph convolution decoding unit, Â is the reconstructed adjacency matrix, Q denotes the graph convolution encoding unit mapping, and H denotes the graph convolution decoding unit mapping.
9. A graph autoencoder based fused subspace clustering system according to claim 8, characterized in that the equation for calculating the reconstruction loss L_GAE in the first update module is specified as: L_GAE = (1/2)‖A − Â‖_F²; The equation for the fused subspace clustering loss L_SC in the first update module is specified as: L_SC = ‖C‖_p + (λ/2)·Σ_{k=0}^{K} ‖Z_k − Z_k C‖_F²; Where K is the number of graph convolution layers of the graph convolution encoding unit, Z1, Z2, ..., ZK are the graph data features extracted from each graph convolution layer of the updated graph convolution encoding unit, Z0 = X is the original feature matrix, C is the self-expression coefficient matrix, ‖·‖_p denotes the p-norm, ‖·‖_F denotes the Frobenius norm, and λ is the balance coefficient of the self-expression loss.
10. A graph autoencoder based fused subspace clustering system according to claim 9, characterized in that continuing to update the parameters of the graph convolutional encoding unit, the parameters of the graph convolutional decoding unit, and the self-expression coefficient matrix with the aim of minimizing the joint loss L_GAE + L_SC in the second update module specifically comprises:
Fixing the self-expression coefficient matrix and continuing to update the parameters of the graph convolution encoding unit and the parameters of the graph convolution decoding unit with the aim of minimizing the joint loss L_GAE + L_SC;
Fixing the parameters of the graph convolution encoding unit and the parameters of the graph convolution decoding unit, and continuing to update the self-expression coefficient matrix with the aim of minimizing the joint loss L_GAE + L_SC.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110447565.4A CN113239984A (en) | 2021-04-25 | 2021-04-25 | Fusion subspace clustering method and system based on graph self-encoder |
Publications (1)
Publication Number | Publication Date |
---|---|
LU503098B1 true LU503098B1 (en) | 2023-03-23 |
Family
ID=77129270
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
LU503098A LU503098B1 (en) | 2021-04-25 | 2022-03-24 | A method and system for fused subspace clustering based on graph autoencoder |
Country Status (4)
Country | Link |
---|---|
CN (1) | CN113239984A (en) |
LU (1) | LU503098B1 (en) |
WO (1) | WO2022227957A1 (en) |
ZA (1) | ZA202207733B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113239984A (en) * | 2021-04-25 | 2021-08-10 | 浙江师范大学 | Fusion subspace clustering method and system based on graph self-encoder |
CN118317110B (en) * | 2024-06-11 | 2024-08-27 | 国网安徽省电力有限公司超高压分公司 | Substation information source channel joint coding enhanced image transmission method |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111144463B (en) * | 2019-12-17 | 2024-02-02 | 中国地质大学(武汉) | Hyperspectral image clustering method based on residual subspace clustering network |
CN112084328A (en) * | 2020-07-29 | 2020-12-15 | 浙江工业大学 | Scientific and technological thesis clustering analysis method based on variational graph self-encoder and K-Means |
CN112164067A (en) * | 2020-10-12 | 2021-01-01 | 西南科技大学 | Medical image segmentation method and device based on multi-mode subspace clustering |
CN113239984A (en) * | 2021-04-25 | 2021-08-10 | 浙江师范大学 | Fusion subspace clustering method and system based on graph self-encoder |
-
2021
- 2021-04-25 CN CN202110447565.4A patent/CN113239984A/en active Pending
-
2022
- 2022-03-24 WO PCT/CN2022/082644 patent/WO2022227957A1/en active Application Filing
- 2022-03-24 LU LU503098A patent/LU503098B1/en active IP Right Grant
- 2022-07-12 ZA ZA2022/07733A patent/ZA202207733B/en unknown
Also Published As
Publication number | Publication date |
---|---|
CN113239984A (en) | 2021-08-10 |
ZA202207733B (en) | 2022-07-27 |
WO2022227957A1 (en) | 2022-11-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
LU503098B1 (en) | A method and system for fused subspace clustering based on graph autoencoder | |
CN111291212B (en) | Zero sample sketch image retrieval method and system based on graph convolution neural network | |
CN111079532B (en) | Video content description method based on text self-encoder | |
CN111898364B (en) | Neural network relation extraction method, computer equipment and readable storage medium | |
CN111753024B (en) | Multi-source heterogeneous data entity alignment method oriented to public safety field | |
WO2018098892A1 (en) | End-to-end modelling method and system | |
CN112765370B (en) | Entity alignment method and device of knowledge graph, computer equipment and storage medium | |
CN113204674B (en) | Video-paragraph retrieval method and system based on local-overall graph inference network | |
CN113157957A (en) | Attribute graph document clustering method based on graph convolution neural network | |
CN113920379B (en) | Zero sample image classification method based on knowledge assistance | |
CN111522961A (en) | Attention mechanism and entity description based industrial map construction method | |
CN114091450A (en) | Judicial domain relation extraction method and system based on graph convolution network | |
CN111078895A (en) | Remote supervision entity relation extraction method based on denoising convolutional neural network | |
CN112925920A (en) | Smart community big data knowledge graph network community detection method | |
CN114118416A (en) | Variational graph automatic encoder method based on multi-task learning | |
CN116129902A (en) | Cross-modal alignment-based voice translation method and system | |
CN115496072A (en) | Relation extraction method based on comparison learning | |
CN114254108B (en) | Method, system and medium for generating Chinese text countermeasure sample | |
CN113836319B (en) | Knowledge completion method and system for fusion entity neighbors | |
CN113255569B (en) | 3D attitude estimation method based on image hole convolutional encoder decoder | |
CN114490954A (en) | Document level generation type event extraction method based on task adjustment | |
CN118038032A (en) | Point cloud semantic segmentation model based on super point embedding and clustering and training method thereof | |
CN113191150A (en) | Multi-feature fusion Chinese medical text named entity identification method | |
CN110197521B (en) | Visual text embedding method based on semantic structure representation | |
CN116861021A (en) | Cross-modal retrieval model construction method based on denoising and momentum distillation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FG | Patent granted |
Effective date: 20230323 |