CN113869404B - Adaptive graph convolution clustering method for paper network data - Google Patents
Adaptive graph convolution clustering method for paper network data
- Publication number: CN113869404B (application CN202111136030.1A)
- Authority
- CN
- China
- Prior art keywords: data, graph, self, representation, module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F18/23 — Pattern recognition; analysing; clustering techniques
- G06F18/22 — Pattern recognition; analysing; matching criteria, e.g. proximity measures
- G06N3/045 — Neural networks; architecture; combinations of networks
- G06N3/088 — Neural network learning methods; non-supervised learning, e.g. competitive learning
Abstract
The adaptive graph convolution clustering method for paper network data belongs to the field of data mining. The method first applies an adaptive graph convolution network to the deep graph convolution clustering task, adaptively updating the graph structure while learning an optimal data representation. Second, the method introduces a fusion module based on an attention mechanism, which fuses the data representations of two parallel networks layer by layer in a weighted manner while effectively relieving the over-smoothing problem of graph convolution networks. The method mainly solves the technical problems of mining the internal structure among samples, ensuring that the model captures more complete structural information, avoiding the negative influence of an inaccurate graph structure on clustering performance, and effectively fusing heterogeneous information.
Description
Technical Field
The method is applicable to fields such as data mining, machine learning and pattern recognition, and is particularly suited to clustering tasks on paper networks containing noise and outliers.
Background
With the development of social media, large numbers of images, videos and microblog posts circulate widely on the Internet, but most of these data are unlabeled, which makes data-driven classification tasks difficult. The underlying structural properties among the data can provide more discriminative information, which motivates the development of deep graph convolution clustering.
Wang Chun et al. proposed an end-to-end graph attention auto-encoder clustering model that effectively fuses the attribute and structure information of the data while using a self-supervision mechanism to guide the optimization of the network. Pan Shirui et al. proposed an adversarially regularized graph convolution auto-encoder that reconstructs the original data and graph structure, with adversarial training enhancing the robustness of the data representation. However, these graph embedding networks suffer from the over-smoothing problem, which compromises clustering performance. Bo Deyu et al. designed a transfer operator that delivers the data representation learned by the auto-encoder module to the corresponding graph convolution layer, while using a self-supervision mechanism to unify the two different deep neural architectures.
Existing graph-convolution-based clustering methods depend heavily on the quality of the initial graph structure, which is kept fixed during model optimization. In practice, however, the graph structure contains noise and outliers and can hardly describe the connections between data accurately, which degrades clustering performance. Moreover, these methods do not effectively fuse the attribute and structure information of the data.
To solve these problems, a paper clustering method based on a graph convolution network is proposed: an adaptive graph replaces the fixed graph to capture more complete structural information during model optimization, and a fusion module based on an attention mechanism is designed to extract more of the key discriminative information while effectively avoiding the over-smoothing problem of graph convolution networks.
To address the difficulty existing deep graph convolution clustering methods have with paper network data containing noise, the invention provides a paper clustering method based on a graph convolution network. The method first applies an adaptive graph convolution network to the deep graph convolution clustering task, adaptively updating the graph structure while learning an optimal data representation. Second, the method introduces a fusion module based on an attention mechanism, which fuses the data representations of two parallel networks layer by layer in a weighted manner while effectively relieving the over-smoothing problem of graph convolution networks. The method mainly solves the technical problems of mining the internal structure among samples, ensuring that the model captures more complete structural information, and effectively fusing heterogeneous information.
Disclosure of Invention
The adaptive graph convolution clustering method for paper network data overcomes the shortcomings of existing deep clustering methods. An adaptive graph convolution network is proposed in which an adaptive graph structure replaces the fixed graph structure during graph convolution, helping the model mine more complete internal structural information and avoiding the negative influence of an inaccurate graph structure on clustering performance. A fusion module based on an attention mechanism is proposed, which selectively weights heterogeneous information to extract key information, effectively relieving the over-smoothing problem of graph convolution networks. Fig. 1 shows the overall framework of the proposed method.
The invention is realized by the following technical scheme:
(1) Attribute information is first extracted from the input data using a self-encoder,
H^(l) = σ(W^(l)H^(l-1) + b^(l)), l = 1, 2, …, L
where H^(l) denotes the data representation learned by layer l of the self-encoder, W^(l) and b^(l) denote the learnable weight matrix and bias of layer l, L denotes the number of network layers of the model, and σ(·) denotes the nonlinear activation function; here ReLU is selected as the activation function.
At the same time, to preserve the characteristics of the original data as much as possible, the reconstruction error between the reconstructed data X̂ and the original input data X, measured by the squared Frobenius norm and averaged over the samples, is minimized, where X represents the bag-of-words features of the keywords of the samples in the dataset, N is the number of samples, and the Frobenius norm is defined as ‖M‖_F = (Σ_i Σ_j m_ij²)^(1/2).
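As an illustration, step (1) can be sketched in NumPy. This is a minimal sketch, not the patent's implementation: the layer sizes, weight initialization, and the row-major layout (H W rather than W H) are assumptions made for the toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def encoder_forward(X, weights, biases):
    """Stack of fully connected layers: H^(l) = ReLU(H^(l-1) W^(l) + b^(l))."""
    H = X
    for W, b in zip(weights, biases):
        H = relu(H @ W + b)
    return H

def reconstruction_loss(X, X_hat):
    """Squared Frobenius norm of (X - X_hat), averaged over the N samples."""
    N = X.shape[0]
    return np.sum((X - X_hat) ** 2) / (2 * N)

# Toy data: 6 samples with 4-dimensional bag-of-words features.
X = rng.random((6, 4))
W1 = rng.standard_normal((4, 3)) * 0.1   # encoder layer
b1 = np.zeros(3)
W2 = rng.standard_normal((3, 4)) * 0.1   # decoder layer mirroring the encoder
b2 = np.zeros(4)

H = encoder_forward(X, [W1], [b1])       # latent attribute representation
X_hat = relu(H @ W2 + b2)                # reconstructed data
loss = reconstruction_loss(X, X_hat)
```

In a trained model the weights would of course be learned by minimizing this loss; here they are random, so only the shapes and the non-negativity of the loss are meaningful.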
(2) High-level structural information of the data is captured by an adaptive graph convolution module.
Z^(l+1) = σ(A^(l+1)F^(l)U^(l+1)), l = 1, 2, …, L
where U^(l+1) denotes the learnable weight matrix of layer l+1 of the adaptive graph convolution module, Z^(l+1) is the node representation updated by layer l+1 of the module, A^(l+1) is the learned adaptive graph structure, which more accurately reflects the intrinsic structure between samples, and F^(l) is the fused representation obtained from the attention-based fusion module.
Specifically, an adjacency matrix Ã^(l+1) is constructed by computing the inner product of the fused representation F^(l), mining the potential similarities between samples. The learned adaptive graph Ã^(l+1) is then added to the original graph Â with balance coefficient ε, enhancing the quality of the initial graph structure; in the invention ε is set to 0.5.
Finally, so that the learned intermediate-layer data representation Z^(L/2) better reflects the dependencies between the data, the reconstruction error ‖A − Ā‖_F² between the reconstructed structure Ā and the original input graph structure A is minimized, where Ā is the adjacency matrix constructed from the inner product of the data representation Z^(L) of the last layer of the adaptive graph convolution module.
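The adaptive graph update of step (2) can be sketched as follows. The patent does not state how the inner-product similarity matrix is normalized, so the row-wise softmax used here is an assumption, as is the identity matrix standing in for the original graph Â in the toy example.

```python
import numpy as np

rng = np.random.default_rng(1)

def row_softmax(M):
    e = np.exp(M - M.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def adaptive_graph_update(F, A_orig, eps=0.5):
    """Build a similarity graph from inner products of the fused
    representation F, then blend it with the original graph A_orig
    using the balance coefficient eps (blend form is an assumption)."""
    A_learned = row_softmax(F @ F.T)       # normalized inner-product similarities
    return eps * A_orig + (1.0 - eps) * A_learned

def graph_conv_layer(A, F, U):
    """One adaptive graph convolution layer: Z^(l+1) = ReLU(A^(l+1) F^(l) U^(l+1))."""
    return np.maximum(A @ F @ U, 0.0)

N, d, d_out = 5, 3, 2
F = rng.random((N, d))                     # fused representation from the fusion module
A_orig = np.eye(N)                         # stand-in for the input paper graph
U = rng.standard_normal((d, d_out)) * 0.1  # learnable weight matrix

A_new = adaptive_graph_update(F, A_orig, eps=0.5)
Z = graph_conv_layer(A_new, F, U)
```

Because both the original (identity) graph and the softmax-normalized similarity graph are row-stochastic here, the blended graph's rows also sum to one, which keeps the convolution an averaging operation over neighbors.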
(3) A fusion module based on an attention mechanism is proposed to efficiently fuse the data representations extracted by the self-encoder module and the adaptive graph convolution module. Specifically, for the l-th layer of the network, the data representations H^(l) and Z^(l), learned by the self-encoding module and the adaptive graph convolution module respectively, are concatenated,
Y^(l) = [H^(l), Z^(l)] (5)
where [·,·] denotes the concatenation operation.
Based on the concatenated features Y^(l), different weights are assigned to H^(l) and Z^(l) according to their relative importance, finally yielding the fused representation F^(l),
a = f(Y^(l))
e = softmax(sigmoid(a)/τ)
W = mean(e)
F^(l) = W_1·Z^(l) + W_2·H^(l)
where W_1 is the weight coefficient assigned to Z^(l), W_2 is the weight coefficient assigned to H^(l), f(·) is a network consisting of three fully connected layers, and τ is a calibration coefficient, set to 10 in the invention; the sigmoid(·) function acts together with the calibration coefficient to avoid assigning a score close to 1 to the most relevant data representation.
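The fusion equations above can be sketched as a short NumPy function. A single linear map stands in for the three-fully-connected-layer network f(·), and taking the softmax over the two per-sample scores before averaging is a reading of the equations, not a confirmed detail of the patent.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def row_softmax(M):
    e = np.exp(M - M.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def attention_fusion(H, Z, f, tau=10.0):
    """Fuse H^(l) and Z^(l) with attention weights W = mean(e)."""
    Y = np.concatenate([H, Z], axis=1)     # Y^(l) = [H^(l), Z^(l)]
    a = f(Y)                               # two importance scores per sample
    e = row_softmax(sigmoid(a) / tau)      # calibrated attention scores
    w = e.mean(axis=0)                     # global weights (W_1, W_2)
    return w[0] * Z + w[1] * H, w

N, d = 4, 3
H = rng.random((N, d))                     # self-encoder representation
Z = rng.random((N, d))                     # graph convolution representation
M = rng.standard_normal((2 * d, 2)) * 0.1  # stand-in for f(.), assumed linear here
f = lambda Y: Y @ M

F, w = attention_fusion(H, Z, f)
```

Dividing by τ = 10 flattens the sigmoid outputs before the softmax, so neither branch receives a weight near 1 — exactly the calibration effect the text describes.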
(4) The self-supervised clustering module is referenced to train the end-to-end model.
where q_ij denotes the probability of assigning the i-th sample to the j-th cluster in the feature representation H^(L/2) learned by the self-encoder; the target distribution p_ij is obtained by amplifying q_ij and normalizing it; and t_ij denotes the probability of assigning the i-th sample to the j-th cluster in the intermediate-layer feature representation Z^(L/2) learned by the adaptive graph convolution module.
Finally, the proposed overall objective function is a weighted sum of the above losses,
where λ_1, λ_2 and λ_3 are hyper-parameters balancing the importance of the different losses, set to 1.0, 0.01 and 0.1, respectively.
The weights and biases in the model, including W^(l), b^(l) and U^(l+1), are randomly initialized, and the model is solved by minimizing the loss function to learn the weight and bias parameters. When the number of training iterations reaches 700, or the value of the loss function fluctuates within ±1%, the optimal data representation Z^(L/2) is obtained and fed into the softmax function to obtain the final clustering result C*,
C* = softmax(Z^(L/2)).
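The self-supervised clustering machinery of step (4) can be sketched as follows. The Student's t-kernel soft assignment and the square-and-normalize target distribution are the usual choices in deep clustering (as in DEC/SDCN, on which this line of work builds); the patent's exact formulas are not shown in this extraction, so they are assumptions here.

```python
import numpy as np

rng = np.random.default_rng(3)

def soft_assignment(Z, centers):
    """q_ij: Student's t-kernel similarity between embeddings and
    cluster centers, normalized per sample (assumed kernel)."""
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """p_ij: amplify q_ij by squaring and normalizing by cluster frequency."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_divergence(p, q):
    """KL(P || Q), the self-supervised clustering loss."""
    return np.sum(p * np.log(p / q))

Z = rng.random((8, 2))                 # toy intermediate-layer embeddings Z^(L/2)
centers = rng.random((3, 2))           # toy cluster centers
q = soft_assignment(Z, centers)
p = target_distribution(q)
loss = kl_divergence(p, q)
labels = q.argmax(axis=1)              # hard cluster readout, analogous to C*
```

Minimizing KL(P || Q) pulls the soft assignments toward the sharpened target distribution, which is what lets the module supervise training without labels.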
ACC, NMI, ARI and F1 are chosen as standard evaluation metrics; higher values indicate better performance.
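Of these metrics, clustering accuracy (ACC) needs an extra step that the others do not: predicted cluster IDs are arbitrary, so they must first be matched one-to-one to ground-truth labels. A standard sketch using the Hungarian algorithm (this matching procedure is the conventional definition of ACC, not something stated in the patent):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """ACC: fraction of correctly assigned samples under the best
    one-to-one mapping between predicted clusters and true labels,
    found with the Hungarian algorithm."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    k = int(max(y_true.max(), y_pred.max())) + 1
    # Contingency table: count[p, t] = samples in predicted cluster p with true label t.
    count = np.zeros((k, k), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        count[p, t] += 1
    # Maximize matched counts by minimizing (max - count).
    row, col = linear_sum_assignment(count.max() - count)
    return count[row, col].sum() / y_true.size

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 0, 0, 2, 2]   # same partition as y_true, labels permuted
acc = clustering_accuracy(y_true, y_pred)   # perfect partition -> 1.0
```

NMI and ARI are permutation-invariant by construction, so they can be computed directly (e.g. with scikit-learn's `normalized_mutual_info_score` and `adjusted_rand_score`) without this matching step.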
Drawings
Fig. 1 is a frame diagram of the present invention.
Detailed Description
Experiments demonstrate that the above method achieves a significant effect.
The method evaluates on six public datasets, including USPS, HHAR, REUT, DBLP, ACM and CITE datasets.
To verify the superiority of its clustering performance, the proposed graph-convolution-based paper clustering method (AGCC) is compared with several existing state-of-the-art clustering methods: K-means, AE, IDEC, GAE, DAEGC and SDCN.
The clustering results shown in Table 1 indicate that, in most cases, the clustering performance of the proposed adaptive graph convolution clustering method for paper network data is significantly better than that of the other comparison methods.
On the paper datasets ACM and CITE, which directly provide graph structures, the clustering performance of DAEGC is superior to IDEC. In contrast, on the datasets USPS, HHAR and REUT, where the initial graph structure is built by the K-nearest-neighbor method, GAE and DAEGC perform worse than AE and IDEC. It is believed that a graph constructed by K-nearest neighbors does not accurately describe the relationships between the data, leading to the poor clustering performance of GAE and DAEGC. A superior adaptive graph learning method is therefore necessary.
On the CITE and ACM datasets, the method improves greatly over the most important baseline, SDCN. SDCN uses a fixed graph structure during graph convolution, but the structural information between samples contains noise and outliers, which negatively impacts clustering performance. The continuously updated graph structure in the proposed method reflects the similarity between samples more accurately, thereby enhancing the performance of the graph convolution network. Furthermore, AGCC introduces a fusion module based on an attention mechanism that fully fuses the attribute and structure information of the data; these heterogeneous sources of information complement each other in learning an effective feature representation, resulting in a significant improvement in clustering performance, while also effectively relieving the over-smoothing problem of the graph convolution network.
Table 1: clustering performance contrast on six datasets
Claims (1)
1. An adaptive graph convolution clustering method for paper network data, characterized by comprising the following steps:
(1) Attribute information is first extracted from the input data using a self-encoder,
H^(l) = σ(W^(l)H^(l-1) + b^(l)), l = 1, 2, …, L
wherein H^(l) denotes the data representation learned by layer l of the self-encoder, W^(l) and b^(l) denote the learnable weight matrix and bias of layer l, L denotes the number of network layers of the model, σ(·) denotes the nonlinear activation function, and ReLU is selected as the activation function;
at the same time, to preserve the characteristics of the original data as much as possible, the reconstruction error between the reconstructed data X̂ and the original input data X, measured by the squared Frobenius norm and averaged over the samples, is minimized, wherein X represents the bag-of-words features of the keywords of the samples in the dataset, N is the number of samples, and the Frobenius norm is defined as ‖M‖_F = (Σ_i Σ_j m_ij²)^(1/2);
(2) Capturing high-order structural information of data through an adaptive graph convolution module;
Z^(l+1) = σ(A^(l+1)F^(l)U^(l+1)), l = 1, 2, …, L
wherein U^(l+1) denotes the learnable weight matrix of layer l+1 of the adaptive graph convolution module, Z^(l+1) is the node representation updated by layer l+1 of the module, A^(l+1) is the learned adaptive graph structure, and F^(l) is the fused representation obtained from the attention-based fusion module;
specifically, an adjacency matrix Ã^(l+1) is constructed by computing the inner product of the fused representation F^(l), mining the potential similarities between samples; the learned adaptive graph Ã^(l+1) is then added to the normalized original graph structure Â with balance coefficient ε, enhancing the quality of the original graph structure, wherein ε is set to 0.5;
finally, so that the learned intermediate-layer data representation Z^(L/2) better reflects the dependencies between the data, the reconstruction error ‖A − Ā‖_F² between the reconstructed structure Ā and the original graph structure A is minimized, wherein Ā is the adjacency matrix constructed from the inner product of the data representation Z^(L) of the last layer of the adaptive graph convolution module;
(3) A fusion module based on an attention mechanism is provided to efficiently fuse the data representations extracted by the self-encoder module and the adaptive graph convolution module; specifically, for the l-th layer of the network, the data representations H^(l) and Z^(l), learned by the self-encoding module and the adaptive graph convolution module respectively, are concatenated,
Y^(l) = [H^(l), Z^(l)],
wherein [·,·] denotes the concatenation operation;
based on the concatenated features Y^(l), different weights are assigned to H^(l) and Z^(l) according to their relative importance, finally yielding the fused representation F^(l),
a = f(Y^(l))
e = softmax(sigmoid(a)/τ)
W = mean(e)
F^(l) = W_1·Z^(l) + W_2·H^(l),
wherein W_1 is the weight coefficient assigned to Z^(l), W_2 is the weight coefficient assigned to H^(l), f(·) is a network consisting of three fully connected layers, τ is a calibration coefficient, and τ is set to 10;
(4) Training an end-to-end model by referring to a self-supervision clustering module;
wherein q_ij denotes the probability of assigning the i-th sample to the j-th cluster in the intermediate-layer feature representation H^(L/2) learned by the self-encoder; the target distribution p_ij is obtained by amplifying q_ij and normalizing it; and t_ij denotes the probability of assigning the i-th sample to the j-th cluster in the intermediate-layer feature representation Z^(L/2) learned by the adaptive graph convolution module;
finally, the proposed overall objective function is a weighted sum of the above losses,
wherein λ_1, λ_2 and λ_3 are hyper-parameters balancing the importance of the different losses, set to 1.0, 0.01 and 0.1, respectively;
the weights and biases in the model, including W^(l), b^(l) and U^(l+1), are randomly initialized, and the model is solved by minimizing the loss function to learn the weight and bias parameters; when the number of training iterations reaches 700, or the value of the loss function fluctuates within ±1%, the optimal data representation Z^(L/2) is obtained and fed into the softmax function to obtain the final clustering result C*,
C* = softmax(Z^(L/2)).
Priority Applications (1)
- CN202111136030.1A — CN113869404B (en), priority/filing date 2021-09-27 — Adaptive graph convolution clustering method for paper network data
Publications (2)
- CN113869404A — published 2021-12-31
- CN113869404B — granted 2024-05-28
Family ID: 78991234 (CN, status: Active)
Families Citing this family (3)
- CN114781553B (2022-06-20) — Binjiang Research Institute, Zhejiang University — Unsupervised patent clustering method based on parallel multi-graph convolution neural network
- CN114861072B (2022-07-05) — China Zheshang Bank Co., Ltd. — Graph convolution network recommendation method and device based on interlayer combination mechanism
- CN115114411B (2022-08-30) — Institute of Automation, Chinese Academy of Sciences — Prediction method and device based on knowledge graph, and electronic equipment
Citations (2)
- CN113128600A (2021-04-23) — Hubei Luojia Huanchuang Technology Co., Ltd. — Structured deep incomplete multi-view clustering method
- CN113157957A (2021-03-05) — Beijing University of Technology — Attribute graph document clustering method based on graph convolution neural network
Family Cites Families (1)
- CN111047182B (2019-12-10) — Beihang University — Airspace complexity evaluation method based on deep unsupervised learning
Non-Patent Citations (1)
- Jiang Zongli et al., "Heterogeneous network representation learning based on fused meta-path graph convolution," Computer Science, no. 7, 2020-04-08, pp. 231-235.
Legal Events
- PB01 — Publication
- SE01 — Entry into force of request for substantive examination
- GR01 — Patent grant