CN112766412A - Multi-view clustering method based on self-adaptive sparse graph learning - Google Patents
- Publication number
- CN112766412A (application number CN202110158287.0A)
- Authority
- CN
- China
- Prior art keywords
- view
- learning
- following
- clustering
- adaptive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Abstract
The invention discloses a multi-view clustering method based on adaptive sparse graph learning, comprising the following steps: first, for the data matrix of each view, a similarity matrix of that view is obtained through adaptive neighbor learning; then, the similarity matrices of the views are weighted automatically, and a shared similarity matrix with a sparse structure is learned under a sparsity constraint; finally, the objective function is optimized with an efficient iterative update algorithm based on the alternating direction method of multipliers (ADMM), and standard spectral clustering is performed on the shared similarity matrix to obtain the final clustering result. The method improves the quality of each view's similarity graph and enhances robustness to noise and outliers. Its computational complexity is approximately equal to that of single-view spectral clustering, so the model is fast to compute, and the framework is simple and easy to implement.
Description
Technical Field
The invention belongs to the technical field of data analysis, and particularly relates to a multi-view clustering method based on self-adaptive sparse graph learning.
Background
Clustering is a data analysis method commonly used in machine learning, pattern recognition, data mining, artificial intelligence and related fields. It aims to divide a data set into subsets of similar objects according to the characteristics of the data. With the rapid development of internet and sensor technology, the description of acquired data has evolved from a single view to ubiquitous multi-view descriptions, which provide more information for data analysis tasks and are semantically richer and more useful, but also more complex. Numerous studies have shown that multi-view learning is more effective and more robust than single-view learning and generalizes better. Multi-view learning models each view with its own function and improves generalization performance by jointly optimizing all of these functions, thereby effectively fusing information from different views.
The purpose of multi-view clustering is to group data points into a given number of groups by using the compatible and complementary information of multi-view data. Existing algorithms mainly include graph-based methods, matrix factorization methods, multiple kernel learning methods and subspace learning methods. Among them, graph-based methods have received wide attention because of their simplicity and efficiency. They can be subdivided into multi-view spectral clustering methods, which generally use a graph constructed by k-nearest neighbors (KNN), and multi-view subspace clustering methods, which generally use a graph constructed by a self-representation model such as sparse representation or low-rank representation.
Most multi-view clustering methods fuse multi-view information on the basis of graph models. Earlier methods often focused on fusing the information of two views and are not suitable for three or more views. A multi-view spectral clustering method based on sparse graph learning (S-MVSC) has been proposed, which learns a shared similarity matrix with a sparse structure from multiple views but neglects the quality differences between the individual similarity matrices. A self-weighted multi-view clustering method with multiple graphs (SwMC) has also been proposed, which learns a graph under a Laplacian rank constraint but requires a singular value decomposition in each iteration and is therefore very time-consuming.
Disclosure of Invention
To address the deficiencies of the prior art described in the background, the invention provides a multi-view clustering method based on adaptive sparse graph learning.
To achieve this purpose, the invention adopts the following technical solution:
A multi-view clustering method based on adaptive sparse graph learning comprises the following steps:
(1) Obtaining the similarity matrix of each view through adaptive neighbor learning on the data matrix
For the $i$-th data point $x_i^{(v)}$ of the $v$-th view, every data point $x_j^{(v)}$ of the $v$-th view is connected to $x_i^{(v)}$ as a neighbor with probability $s_{ij}^{(v)}$. Therefore, for all data points of the $v$-th view, the probabilities $s_{ij}^{(v)}$ are determined by solving the following problem:
where $s_i^{(v)}$ in $S^{(v)}$ has $k$ nonzero values and $k$ is the neighbor parameter. The result obtained after solving is as follows:
The similarity matrix of each view is learned adaptively from the data according to this solution;
(2) Shared similarity matrix learning
For multi-view data, $p$ similarity matrices $S^{(1)}, S^{(2)}, \ldots, S^{(p)}$ are first constructed, where $S^{(v)} \in \mathbb{R}^{n \times n}$ ($1 \le v \le p$). Parameters $\lambda$ and $\gamma$ are introduced and the following model is proposed:
s.t. $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$,
where $\alpha = [\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(p)}]$, $\lambda > 0$, $\gamma > 0$; solving the above model is converted into solving the following relaxed problem:
s.t. $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$,
where $\|S\|_1$ is the convex relaxation of $\|S\|_0$;
(3) Optimization algorithm
The relaxed problem is solved with the alternating direction method of multipliers (ADMM): $\alpha$ and $S$ are updated alternately, each with the other variable fixed, and the learned consensus similarity matrix with a sparse structure is used as the input of standard spectral clustering to obtain the clustering result.
Preferably, in step (1), when the above problem is solved directly, only the nearest data point becomes a neighbor of $x_i^{(v)}$, with probability 1, and no other data point can be a neighbor of $x_i^{(v)}$; the problem is therefore converted into solving the following problem:
Since the optimal solution of this problem is that all data points of the $v$-th view are neighbors of $x_i^{(v)}$ with the same probability $1/n$, the problem is further converted into solving the following problem:
Preferably, in step (3), when $S$ is updated, $\alpha$ is fixed and the relaxed problem reduces to solving the following problem:
A soft-threshold shrinkage operator is introduced:
where $\mu > 0$; the updated similarity matrix $S$ is then obtained as:
Preferably, in step (3), when $\alpha$ is updated, $S$ is fixed and the relaxed problem reduces to solving the following problem:
s.t. $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$;
s.t. $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$;
This is further converted into the following form, which is then solved by the alternating direction method of multipliers (ADMM):
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention constructs a similarity matrix for each view by adaptive neighbor learning and uses it as input, which improves the quality of the similarity graph constructed for each view.
(2) The multi-view clustering model based on adaptive sparse graph learning (ASGL) weights each view automatically and learns from the views a shared similarity matrix with a sparse structure as the input of standard spectral clustering. Because the model accounts for quality differences among views and the learned shared similarity matrix is sparse, noise from the different views is effectively eliminated and robustness to noise and outliers is improved.
(3) The ASGL model is fast and easy to implement. Excluding the time spent constructing the similarity matrices of all views and iteratively solving for the optimal shared similarity matrix, the computational complexity of ASGL is approximately equal to that of single-view spectral clustering.
(4) The ASGL model is optimized by an efficient iterative update algorithm based on the alternating direction method of multipliers (ADMM). Numerical experiments on six data sets, with comparison against several recent algorithms, verify the effectiveness of the ASGL method.
Drawings
Fig. 1 is a flow framework diagram of a multi-view clustering method based on adaptive sparse graph learning according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
1. Adaptive neighbor graph learning
For the $i$-th data point $x_i^{(v)}$ of the $v$-th view, every data point $x_j^{(v)}$ of the $v$-th view is connected to $x_i^{(v)}$ as a neighbor with probability $s_{ij}^{(v)}$. Usually, a small distance between $x_i^{(v)}$ and $x_j^{(v)}$ should be assigned a large probability $s_{ij}^{(v)}$. Thus, for all data points of the $v$-th view, a natural way to determine the probabilities $s_{ij}^{(v)}$ is to solve the following problem:
Equation (1) has a trivial solution: only the nearest data point becomes a neighbor of $x_i^{(v)}$, with probability 1, and no other data point can be a neighbor of $x_i^{(v)}$. On the other hand, if the distance information in the data is not considered at all, the following problem is solved:
Its optimal solution is that all data points of the $v$-th view are neighbors of $x_i^{(v)}$ with the same probability $1/n$. Combining equations (1) and (2) leads to solving the following problem:
where $s_i^{(v)}$ in $S^{(v)}$ has $k$ nonzero values and $k$ is the neighbor parameter. The final solution is as follows:
The similarity matrix of each view can thus be learned adaptively from the data by equation (4).
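The displayed equations of this section are not reproduced in the text, so the following Python sketch rests on an assumption: that equation (4) is the closed form commonly used in the adaptive-neighbors literature, $s_{ij} = (d_{i,k+1} - d_{ij}) / (k\,d_{i,k+1} - \sum_{h=1}^{k} d_{ih})$ for the $k$ nearest neighbors and $0$ otherwise, where $d_{ij}$ is the squared Euclidean distance and distances are sorted in ascending order. The function name `adaptive_neighbor_graph`, the use of squared Euclidean distance and the exclusion of each point from its own neighbor list are illustrative choices, not taken from the patent.

```python
import numpy as np

def adaptive_neighbor_graph(X, k):
    """Build an n x n similarity matrix with at most k nonzero entries per row.

    X: (n, d) data matrix of one view; k: neighbor parameter, k < n - 1.
    Assumes the closed form from the adaptive-neighbors literature (see lead-in).
    """
    n = X.shape[0]
    # Squared Euclidean distances between all pairs of points.
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    S = np.zeros((n, n))
    for i in range(n):
        d = D[i].copy()
        d[i] = np.inf                     # assumption: a point is not its own neighbor
        idx = np.argsort(d)[: k + 1]      # the k nearest neighbors and the (k+1)-th
        d_sorted = d[idx]
        denom = k * d_sorted[k] - np.sum(d_sorted[:k])
        if denom > 1e-12:
            S[i, idx[:k]] = (d_sorted[k] - d_sorted[:k]) / denom
        else:
            S[i, idx[:k]] = 1.0 / k       # degenerate case: all k+1 distances equal
    return S

# Illustrative usage on random data for a single view.
rng = np.random.default_rng(0)
X_view = rng.normal(size=(8, 3))
S_view = adaptive_neighbor_graph(X_view, k=3)
```

Each row of the result is nonnegative and sums to 1, which matches the reading of $s_{ij}^{(v)}$ as a neighbor probability, and the sparsity of each row is controlled directly by the neighbor parameter $k$.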
2. Shared similarity matrix learning
Because the performance of spectral clustering depends to a great extent on the quality of the similarity matrix, the invention studies a method for learning a shared similarity matrix from multiple input data views while accounting for the quality differences between the individual similarity matrices. For multi-view data, $p$ similarity matrices $S^{(1)}, S^{(2)}, \ldots, S^{(p)}$ are first constructed, where $S^{(v)} \in \mathbb{R}^{n \times n}$ ($1 \le v \le p$). Parameters $\lambda$ and $\gamma$ are introduced and the following model is proposed:
where $\alpha = [\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(p)}]$, $\lambda > 0$, $\gamma > 0$. Since solving equation (5) involves minimizing an $\ell_0$ norm, solving the above model is converted into solving the following relaxed problem:
where $\|S\|_1$ is the convex relaxation of $\|S\|_0$.
3. Optimization algorithm
An efficient iterative update algorithm based on the alternating direction method of multipliers (ADMM) is used: $\alpha$ and $S$ are updated alternately, each with the other variable fixed, and the learned shared similarity matrix is used to obtain the clustering result.
(1) Fix $\alpha$ and update $S$, which amounts to solving the following problem:
For the objective function of equation (7), we have:
Equation (8) is equivalent to the following:
To reduce the amount of computation, equation (9) is rewritten as follows:
A soft-threshold shrinkage operator is then introduced:
where $\mu > 0$; the solution of equation (10), i.e. the updated similarity matrix, is obtained as:
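The soft-threshold shrinkage operator named in this step has a standard elementwise definition, $\mathcal{S}_{\mu}(x) = \mathrm{sign}(x)\max(|x| - \mu, 0)$, which is the proximal operator of $\mu\|\cdot\|_1$ and is the usual reason an $\ell_1$ term paired with a quadratic term admits a closed-form update. The Python sketch below assumes this standard definition. The matrix it is applied to and the value of $\mu$ in the patent's update formula are not recoverable from this text, so the inputs shown are purely illustrative.

```python
import numpy as np

def soft_threshold(A, mu):
    """Elementwise soft-threshold (shrinkage) operator with threshold mu > 0.

    Entries with magnitude below mu become exactly zero; the rest are
    shrunk toward zero by mu. This is what produces a sparse matrix.
    """
    return np.sign(A) * np.maximum(np.abs(A) - mu, 0.0)

# Illustrative input: small entries are zeroed, large ones are shrunk.
A = np.array([[0.9, 0.05],
              [-0.3, 0.2]])
A_sparse = soft_threshold(A, 0.1)
```

Because small entries are set exactly to zero rather than merely reduced, applying the operator to a dense weighted combination of graphs yields a sparse graph, which is consistent with the sparse shared similarity matrix described in this section.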
(2) Fix $S$ and update $\alpha$, which amounts to solving the following problem:
Equation (14) is rewritten in the following form:
Equation (15) can be solved by an efficient iterative algorithm based on the alternating direction method of multipliers (ADMM).
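The constraints $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$ stated for the model mean that the view weights lie on the probability simplex. The ADMM iterations for equation (15) are not reproduced in this text, so the sketch below does not attempt them. It shows only one building block that solvers for simplex-constrained problems commonly use, the Euclidean projection onto the simplex by the well-known sort-based procedure; whether the patent's algorithm uses this projection is an assumption, and the name `project_to_simplex` is hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0                  # cumulative sums minus the target sum
    j = np.arange(1, len(v) + 1)
    # Largest (0-based) index for which the shifted entry stays positive.
    rho = np.nonzero(u - css / j > 0)[0][-1]
    theta = css[rho] / (rho + 1)              # shift applied to every entry
    return np.maximum(v - theta, 0.0)

# Illustrative usage: unnormalized view scores mapped to valid weights.
alpha = project_to_simplex(np.array([0.5, 0.5, 0.5]))
```

The returned vector is nonnegative and sums to 1, so it can be used directly as a set of view weights $\alpha$.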
The analysis shows that the solution for $\alpha$ depends on $S$ and the solution for $S$ depends on $\alpha$, so the original problem (6) is solved by optimizing $S$ and $\alpha$ alternately and iteratively.
The whole procedure for solving equation (6) is given in Algorithm 1:
The flow framework of the multi-view clustering method based on adaptive sparse graph learning (ASGL) is shown in Fig. 1.
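The last stage of the flow, standard spectral clustering on the learned shared similarity matrix, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the patent's implementation: it symmetrizes the input, uses the symmetric normalized Laplacian with row-normalized eigenvectors, and replaces a library k-means with a short Lloyd iteration using farthest-point initialization. The function name `spectral_clustering` and all of these choices are assumptions.

```python
import numpy as np

def spectral_clustering(S, n_clusters, n_iter=50):
    """Cluster the nodes of a nonnegative n x n similarity matrix S."""
    W = (S + S.T) / 2.0                          # symmetrize the graph
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    # Normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                  # eigenvalues in ascending order
    U = vecs[:, :n_clusters]                     # spectral embedding
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    # Minimal Lloyd k-means with deterministic farthest-point initialization.
    centers = [U[0]]
    for _ in range(1, n_clusters):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq_dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq_dist, axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels

# Illustrative usage: two dense blocks joined by weak edges.
S_shared = np.kron(np.eye(2), np.ones((3, 3))) + 0.01
cluster_labels = spectral_clustering(S_shared, n_clusters=2)
```

In practice a tested library implementation of spectral clustering would normally be preferred; the sketch only makes the eigen-decomposition and k-means steps explicit.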
To verify the clustering performance, clustering experiments were carried out on the 3-source text data set, the COIL20 toy data set, the MSRC image data set, the NUS image data set, the ORL face data set and the Outdoor Scene data set. The corresponding accuracy (ACC) values are 73.79%, 93.36%, 82.90%, 28.67%, 83.13% and 64.47%, respectively; the normalized mutual information (NMI) values are 67.45%, 97.01%, 73.75%, 16.34%, 93.19% and 55.58%; the adjusted Rand index (ARI) values are 57.63%, 92.73%, 67.95%, 10.12%, 79.34% and 46.20%; and the purity values are 81.07%, 94.78%, 83.29%, 31.10%, 86.85% and 66.55%. Compared with five recent multi-view clustering methods (S-MVSC, SwMC, AMGL, MLAN and AWP), the proposed ASGL method achieves the highest clustering results on the 6 experimental data sets under the 4 common clustering evaluation criteria.
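Of the four evaluation criteria listed, purity is the simplest to state: each cluster is credited with the size of its largest ground-truth class, and the credits are summed and divided by the number of samples. The Python sketch below assumes this standard definition; the function name `purity` and the toy labels are illustrative and unrelated to the reported experiments.

```python
import numpy as np

def purity(y_true, y_pred):
    """Fraction of samples that belong to the majority class of their cluster."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    total = 0
    for c in np.unique(y_pred):
        members = y_true[y_pred == c]            # true labels inside cluster c
        _, counts = np.unique(members, return_counts=True)
        total += counts.max()                    # size of the majority class
    return total / len(y_true)

# Toy example: one of five samples is placed in the wrong cluster.
score = purity([0, 0, 1, 1, 1], [1, 1, 0, 0, 1])   # 4 of 5 samples match, 0.8
```

Purity does not penalize over-segmentation, since it reaches 1 when every sample is its own cluster, which is one reason it is reported alongside accuracy, NMI and ARI rather than alone.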
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (4)
1. A multi-view clustering method based on adaptive sparse graph learning is characterized by comprising the following steps:
(1) Obtaining the similarity matrix of each view through adaptive neighbor learning on the data matrix
For the $i$-th data point $x_i^{(v)}$ of the $v$-th view, every data point $x_j^{(v)}$ of the $v$-th view is connected to $x_i^{(v)}$ as a neighbor with probability $s_{ij}^{(v)}$. Thus, for all data points of the $v$-th view, the probabilities $s_{ij}^{(v)}$ are determined by solving the following problem:
where $s_i^{(v)}$ in $S^{(v)}$ has $k$ nonzero values and $k$ is the neighbor parameter. The result obtained after solving is as follows:
The similarity matrix of each view is learned adaptively from the data according to this solution;
(2) shared similarity matrix learning
For multi-view data, $p$ similarity matrices $S^{(1)}, S^{(2)}, \ldots, S^{(p)}$ are first constructed, where $S^{(v)} \in \mathbb{R}^{n \times n}$ ($1 \le v \le p$). Parameters $\lambda$ and $\gamma$ are introduced and the following model is proposed:
s.t. $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$,
where $\alpha = [\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(p)}]$, $\lambda > 0$, $\gamma > 0$; solving the above model is converted into solving the following relaxed problem:
s.t. $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$,
where $\|S\|_1$ is the convex relaxation of $\|S\|_0$;
(3) optimization algorithm
The relaxed problem is solved with the alternating direction method of multipliers (ADMM): $\alpha$ and $S$ are updated alternately, each with the other variable fixed, and the learned shared similarity matrix with a sparse structure is used as the input of standard spectral clustering to obtain the clustering result.
2. The multi-view clustering method based on adaptive sparse graph learning as claimed in claim 1, wherein in step (1), when the above problem is solved directly, only the nearest data point becomes a neighbor of $x_i^{(v)}$, with probability 1, and no other data point can be a neighbor of $x_i^{(v)}$, so the following problem is solved:
Since the optimal solution of this problem is that all data points of the $v$-th view are neighbors of $x_i^{(v)}$ with the same probability $1/n$, the problem is further converted into solving the following problem:
3. The multi-view clustering method based on adaptive sparse graph learning as claimed in claim 1, wherein in step (3), when $S$ is updated, $\alpha$ is fixed and the relaxed problem is converted into solving the following problem:
A soft-threshold shrinkage operator is introduced:
where $\mu > 0$; the updated similarity matrix $S$ is then obtained as:
4. The multi-view clustering method based on adaptive sparse graph learning as claimed in claim 1, wherein in step (3), when $\alpha$ is updated, $S$ is fixed and the relaxed problem is converted into solving the following problem:
s.t. $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$;
s.t. $\alpha^{(v)} \ge 0,\ \alpha^{T}\mathbf{1}_p = 1$;
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110158287.0A CN112766412B (en) | 2021-02-05 | 2021-02-05 | Multi-view clustering method based on self-adaptive sparse graph learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112766412A (en) | 2021-05-07 |
CN112766412B (en) | 2023-11-07 |
Family
ID=75705070
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110158287.0A Active CN112766412B (en) | 2021-02-05 | 2021-02-05 | Multi-view clustering method based on self-adaptive sparse graph learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112766412B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190138786A1 (en) * | 2017-06-06 | 2019-05-09 | Sightline Innovation Inc. | System and method for identification and classification of objects |
CN108764276A (en) * | 2018-04-12 | 2018-11-06 | 西北大学 | A kind of robust weights multi-characters clusterl method automatically |
CN111008637A (en) * | 2018-10-08 | 2020-04-14 | 北京京东尚科信息技术有限公司 | Image classification method and system |
CN109508752A (en) * | 2018-12-20 | 2019-03-22 | 西北工业大学 | A kind of quick self-adapted neighbour's clustering method based on structuring anchor figure |
CN110378365A (en) * | 2019-06-03 | 2019-10-25 | 广东工业大学 | A kind of multiple view Subspace clustering method based on joint sub-space learning |
CN111401468A (en) * | 2020-03-26 | 2020-07-10 | 上海海事大学 | Weight self-updating multi-view spectral clustering method based on shared neighbor |
CN112148911A (en) * | 2020-08-19 | 2020-12-29 | 江苏大学 | Image clustering method of multi-view intrinsic low-rank structure |
Non-Patent Citations (6)
Title |
---|
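SONG YAN et al.: "Multi-view spectral clustering algorithm based on shared nearest neighbor", 《JOURNAL OF COMPUTER APPLICATIONS》, vol. 40, no. 11, pages 3211 - 3216 * |
YILING ZHANG et al.: "A multitask multiview clustering algorithm in heterogeneous situations based on LLE and LE", 《KNOWLEDGE-BASED SYSTEMS》, vol. 163, pages 776 - 786 * |
何云: "Research on dimensionality reduction and clustering algorithms for multi-view data", 《China Master's Theses Full-text Database, Information Science and Technology》, no. 2020, pages 138 - 821 * |
沈肖波: "Research on multi-view embedding learning methods and their applications", 《China Doctoral Dissertations Full-text Database, Information Science and Technology》, no. 2018, pages 138 - 45 * |
肖庆江: "Research on multi-view clustering methods based on dynamic neighbor learning", 《China Master's Theses Full-text Database, Information Science and Technology》, no. 2022, pages 138 - 133 * |
许楠: "Research on multi-view clustering methods based on subspace learning", 《China Master's Theses Full-text Database, Information Science and Technology》, no. 2020, pages 138 - 1014 * |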
Also Published As
Publication number | Publication date |
---|---|
CN112766412B (en) | 2023-11-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||