CN112766412A - Multi-view clustering method based on self-adaptive sparse graph learning - Google Patents

Publication number
CN112766412A
Authority
CN
China
Prior art keywords
view, learning, following, clustering, adaptive
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110158287.0A
Other languages
Chinese (zh)
Other versions
CN112766412B (en)
Inventor
肖庆江
黄奕轩
杜世强
石玉清
单广荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest Minzu University
Original Assignee
Northwest Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest Minzu University
Priority to CN202110158287.0A
Publication of CN112766412A
Application granted
Publication of CN112766412B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques

Abstract

The invention discloses a multi-view clustering method based on self-adaptive sparse graph learning, which comprises the following steps: first, for the data matrix of each view, a similarity matrix of that view is obtained through adaptive neighbor learning; then, the similarity matrices of the views are automatically weighted, and a shared similarity matrix with a sparse structure is learned under a sparsity constraint; finally, the objective function is optimized with an efficient iterative update algorithm based on the alternating direction method of multipliers (ADMM), and standard spectral clustering is performed on the shared similarity matrix to obtain the final clustering result. The method improves the quality of the similarity graph of each view and enhances robustness to noise and outliers. Its computational complexity is approximately equal to that of single-view spectral clustering, so the model is fast to compute, and the framework is simple and easy to implement.

Description

Multi-view clustering method based on self-adaptive sparse graph learning
Technical Field
The invention belongs to the technical field of data analysis, and particularly relates to a multi-view clustering method based on self-adaptive sparse graph learning.
Background
Clustering is a data analysis method commonly used in machine learning, pattern recognition, data mining, artificial intelligence and related fields; it aims to divide a data set into subsets of similar objects according to the characteristics of the data. With the rapid development of internet and sensor technology, the description of acquired data has evolved from a single view to ubiquitous multi-view descriptions, which provide more information for data analysis tasks and are semantically richer and more useful, but also more complex. Numerous studies have shown that multi-view learning is more effective and more robust and generalizes better than single-view learning. Multi-view learning learns one function to model each view and improves generalization performance by jointly optimizing all of the functions, thereby effectively fusing information from different views.
The purpose of multi-view clustering is to group data points into a given number of clusters by using the compatible and complementary information of multi-view data. Existing algorithms mainly include graph-based methods, matrix factorization methods, multi-kernel learning methods and subspace learning methods. Among them, graph-based methods have received wide attention because of their simplicity and efficiency. They can be subdivided into multi-view spectral clustering methods, which generally use a graph constructed by k-nearest neighbors (KNN), and multi-view subspace clustering methods, which generally use a graph constructed by a self-representation model such as sparse representation or low-rank representation.
Most multi-view clustering methods fuse multi-view information on the basis of graph models. Earlier methods often focused on fusing the information of two views and are not suitable for three or more views. A multi-view spectral clustering method based on sparse graph learning (S-MVSC) has been proposed, which learns a shared similarity matrix with a sparse structure from multiple views but neglects the quality differences between the similarity matrices of different views. A self-weighted multi-view clustering method with multiple graphs (SwMC) has also been proposed, which learns a Laplacian-rank-constrained graph, but it requires a singular value decomposition in every iteration and is therefore very time-consuming.
Disclosure of Invention
In view of the shortcomings identified in the background above, the invention provides a multi-view clustering method based on self-adaptive sparse graph learning that aims to overcome those shortcomings of the prior art.
In order to achieve the purpose, the invention adopts the technical scheme that:
a multi-view clustering method based on adaptive sparse graph learning comprises the following steps:
(1) obtaining the similarity matrix of each view through adaptive neighbor learning on the data matrix
For the i-th data point x_i^v of the v-th view, all data points of the v-th view
Figure BDA0002935279040000021
are connected to x_i^v as its neighbors with probability
Figure BDA0002935279040000022
Therefore, for all data points of the v-th view, the probability is determined by solving the following problem:
Figure BDA0002935279040000023
Figure BDA0002935279040000024
Figure BDA0002935279040000025
In S^v, each
Figure BDA0002935279040000026
has k nonzero values, where k is the neighbor parameter, and the solution obtained is:
Figure BDA0002935279040000027
where
Figure BDA0002935279040000028
the similarity matrix of each view is thus learned adaptively from the data according to the solution;
(2) shared similarity matrix learning
For multi-view data, first construct p similarity matrices S^(1), S^(2), ..., S^(p), where S^(v) ∈ R^(n×n) (1 ≤ v ≤ p); introducing parameters λ and γ, the following model is proposed:
Figure BDA0002935279040000029
s.t. α^(v) ≥ 0, α^T 1_p = 1,
where α = [α^(1), α^(2), ..., α^(p)], λ > 0 and γ > 0; solving the above model is converted into solving the following relaxed problem:
Figure BDA0002935279040000031
s.t. α^(v) ≥ 0, α^T 1_p = 1,
where ||S||_1 is the convex relaxation of ||S||_0;
(3) optimization algorithm
The relaxed problem is solved with the alternating direction method of multipliers (ADMM): α and S are updated alternately, each with the other variable held fixed, and the learned consensus similarity matrix with a sparse structure is used as the input of standard spectral clustering to obtain the clustering result.
Preferably, in step (1), when solving
Figure BDA0002935279040000032
because only the nearest data point would be a neighbor of
Figure BDA0002935279040000033
with probability 1, and all other data points could not be neighbors of
Figure BDA0002935279040000034
the following problem is solved instead:
Figure BDA0002935279040000035
Figure BDA0002935279040000036
since the optimal solution of this problem makes all data points of the v-th view neighbors of x_i^v with the same probability 1/n, the following problem is further solved:
Figure BDA0002935279040000037
Figure BDA0002935279040000038
and the final solution is
Figure BDA0002935279040000039
Preferably, in step (3), when S is updated, α is fixed and the relaxed problem reduces to solving the following problem:
Figure BDA0002935279040000041
For the objective function of
Figure BDA0002935279040000049
the following holds:
Figure BDA0002935279040000042
Therefore,
Figure BDA0002935279040000043
is equivalent to the following formula:
Figure BDA0002935279040000044
Defining
Figure BDA0002935279040000045
the formula is further simplified to
Figure BDA0002935279040000046
Introducing the soft-threshold shrinkage operator:
Figure BDA0002935279040000047
where μ > 0, the updated similarity matrix S is obtained as:
Figure BDA0002935279040000048
Preferably, in step (3), when α is updated, S is fixed and the relaxed problem reduces to solving the following problem:
Figure BDA0002935279040000051
s.t. α^(v) ≥ 0, α^T 1_p = 1;
Defining
Figure BDA0002935279040000052
this is equivalent to solving the following problem:
Figure BDA0002935279040000053
s.t. α^(v) ≥ 0, α^T 1_p = 1;
which is further rewritten in the following form:
Figure BDA0002935279040000054
and is then solved by the alternating direction method of multipliers (ADMM).
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention constructs the similarity matrix of each view by an adaptive neighbor learning method and uses it as input, thereby improving the quality of the similarity graph constructed for each view.
(2) The multi-view clustering model based on self-adaptive sparse graph learning (ASGL) automatically weights each view and learns from the views a shared similarity matrix with a sparse structure as the input of standard spectral clustering. Because the model takes the quality differences among views into account and the learned shared similarity matrix is sparse, noise from the different views is effectively suppressed and robustness to noise and outliers is improved.
(3) The ASGL model is fast and easy to implement: leaving aside the time spent constructing the similarity matrices of all views and iteratively solving for the optimal shared similarity matrix, the computational complexity of ASGL is approximately equal to that of single-view spectral clustering.
(4) The ASGL model is optimized by an efficient iterative update algorithm based on the alternating direction method of multipliers (ADMM), and numerical experiments on six data sets, with comparisons against several recent algorithms, verify the effectiveness of the ASGL method.
Drawings
Fig. 1 is a flow framework diagram of a multi-view clustering method based on adaptive sparse graph learning according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
1. Adaptive neighbor graph learning
For the i-th data point x_i^v of the v-th view, all data points of the v-th view
Figure BDA0002935279040000061
are connected to x_i^v as its neighbors with probability
Figure BDA0002935279040000062
Usually, a small distance
Figure BDA0002935279040000063
should be assigned a large probability
Figure BDA0002935279040000064
Thus, for all data points of the v-th view, a natural way to determine the probability
Figure BDA0002935279040000065
is to solve the following problem:
Figure BDA0002935279040000066
Equation (1) has a trivial solution: only the nearest data point is a neighbor of x_i^v with probability 1, and no other data point can be a neighbor of x_i^v. If, on the other hand, the distance information in the data is not considered, the following problem is solved:
Figure BDA0002935279040000067
The optimal solution of equation (2) is that all data points of the v-th view are neighbors of x_i^v with the same probability 1/n. Combining equations (1) and (2) leads to the following problem:
Figure BDA0002935279040000071
In S^v, each
Figure BDA0002935279040000072
has k nonzero values, where k is the neighbor parameter, and the final solution is:
Figure BDA0002935279040000073
where
Figure BDA0002935279040000074
The similarity matrix of each view can thus be learned adaptively from the data by equation (4).
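As an illustration only, the adaptive neighbor construction described above can be sketched in Python with NumPy. Equation (4) is available here only as an image, so the sketch ASSUMES the closed form commonly used in adaptive-neighbor graph learning, s_ij = (d_{i,k+1} - d_ij) / (k d_{i,k+1} - Σ_{h≤k} d_ih), with squared Euclidean distances sorted in ascending order; the function name `adaptive_neighbor_graph` and its arguments are hypothetical and do not come from the patent.

```python
import numpy as np

def adaptive_neighbor_graph(X, k):
    """Similarity matrix of one view with at most k nonzero entries per row.

    X: (n, d) data matrix of a single view; k: neighbor parameter, k < n - 1.
    ASSUMPTION: closed form s_ij = (d_{i,k+1} - d_ij) / (k*d_{i,k+1} - sum_{h<=k} d_ih),
    where d_ij is the squared Euclidean distance and the point itself is excluded.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)  # squared distances
    S = np.zeros((n, n))
    for i in range(n):
        d = D[i].copy()
        d[i] = np.inf                    # a point is not its own neighbor
        order = np.argsort(d)
        nn = order[:k]                   # indices of the k nearest points
        d_k1 = d[order[k]]               # distance to the (k+1)-th nearest point
        denom = k * d_k1 - d[nn].sum()
        if denom <= 1e-12:               # all k+1 distances tie: use uniform weights
            S[i, nn] = 1.0 / k
        else:
            S[i, nn] = (d_k1 - d[nn]) / denom
    return S
```

Under this assumed form, each row of the returned matrix is non-negative and sums to 1, so it can be read as the neighbor probabilities of one data point. One such matrix would be built per view before the fusion step.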
2. Shared similarity matrix learning
Because the performance of spectral clustering depends to a great extent on the quality of the similarity matrix, the invention studies a method for learning a shared similarity matrix from multiple input data views while taking the quality differences between the similarity matrices into account. For multi-view data, first construct p similarity matrices S^(1), S^(2), ..., S^(p), where S^(v) ∈ R^(n×n) (1 ≤ v ≤ p); introducing parameters λ and γ, the following model is proposed:
Figure BDA0002935279040000075
where α = [α^(1), α^(2), ..., α^(p)], λ > 0 and γ > 0. Since solving equation (5) involves minimizing the ℓ0 norm, the solution of the above model is converted into solving the following relaxed problem:
Figure BDA0002935279040000076
where ||S||_1 is the convex relaxation of ||S||_0.
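The quantities named in this section can be illustrated with a short Python sketch. Equations (5) and (6) are available here only as images, so the sketch does not reproduce the objective; it only shows the constraint on α (non-negative weights summing to 1), the two sparsity measures ||S||_0 and ||S||_1, and, as an ASSUMPTION about how automatic weighting is applied, a weighted sum of the per-view similarity matrices. All function names are hypothetical.

```python
import numpy as np

def is_valid_weight(alpha, tol=1e-9):
    """Check the constraint alpha^(v) >= 0 and alpha^T 1_p = 1."""
    alpha = np.asarray(alpha, dtype=float)
    return bool((alpha >= -tol).all() and abs(alpha.sum() - 1.0) <= tol)

def fuse_views(S_list, alpha):
    """Weighted sum of per-view similarity matrices (ASSUMED fusion form)."""
    if not is_valid_weight(alpha):
        raise ValueError("alpha must be non-negative and sum to 1")
    return sum(a * np.asarray(S, dtype=float) for a, S in zip(alpha, S_list))

def l0_norm(S):
    """Number of nonzero entries of S (non-convex sparsity measure)."""
    return int(np.count_nonzero(S))

def l1_norm(S):
    """Sum of absolute values of the entries of S (convex surrogate of l0)."""
    return float(np.abs(S).sum())
```

The ℓ1 norm is used in place of the ℓ0 norm because counting nonzero entries is combinatorial, whereas the sum of absolute values is convex and still favors sparse solutions.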
3. Optimization algorithm
An efficient iterative update algorithm based on the alternating direction method of multipliers (ADMM) is used: α and S are updated alternately, each with the other variable held fixed, and the shared similarity matrix is learned to obtain the clustering result.
(1) Fix α and update S, which reduces to solving the following problem:
Figure BDA0002935279040000081
For the objective function of equation (7), we have:
Figure BDA0002935279040000082
Equation (8) is equivalent to the following formula:
Figure BDA0002935279040000083
To reduce the amount of computation, equation (9) is rewritten as follows:
Figure BDA0002935279040000084
where
Figure BDA0002935279040000085
Introducing the soft-threshold shrinkage operator:
Figure BDA0002935279040000086
where μ > 0, the solution of equation (10) is obtained as:
Figure BDA0002935279040000087
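The soft-threshold shrinkage operator used in the S update can be sketched as follows. The patent's own definition is available here only as an image, so this sketch ASSUMES the standard element-wise form sign(x) · max(|x| - μ, 0), which is the minimizer of 0.5·||S - X||_F^2 + μ·||S||_1; the function name `soft_threshold` is hypothetical.

```python
import numpy as np

def soft_threshold(X, mu):
    """Element-wise soft-threshold shrinkage with threshold mu > 0.

    ASSUMPTION: standard form sign(x) * max(|x| - mu, 0). Entries with
    magnitude at most mu become zero; larger entries shrink toward zero by mu.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    X = np.asarray(X, dtype=float)
    return np.sign(X) * np.maximum(np.abs(X) - mu, 0.0)
```

Because small entries are set exactly to zero, applying the operator to a dense matrix yields a sparse one, which is how the ℓ1 term produces the sparse structure of the shared similarity matrix.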
(2) Fix S and update α, which reduces to solving the following problem:
Figure BDA0002935279040000088
Defining
Figure BDA0002935279040000089
this is equivalent to solving the following problem:
Figure BDA00029352790400000810
Equation (14) is rewritten in the following form:
Figure BDA0002935279040000091
Equation (15) can be solved by an efficient iterative algorithm based on the alternating direction method of multipliers (ADMM).
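To illustrate how ADMM handles the constraint α^(v) ≥ 0, α^T 1_p = 1, the following Python sketch solves a generic quadratic problem over the probability simplex. Equation (15) is available here only as an image, so the objective 0.5·α^T A α - b^T α (with A symmetric positive semidefinite) is an ASSUMED stand-in and not the patent's exact formula; the names `project_simplex` and `admm_simplex_qp`, the penalty `rho` and the iteration count are hypothetical choices.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1} (sort-based)."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                          # entries in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]    # last index kept positive
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def admm_simplex_qp(A, b, rho=1.0, n_iter=500):
    """Minimize 0.5*a^T A a - b^T a subject to a >= 0, sum(a) = 1 (ASSUMED form).

    Splitting a = z: the a-step is an unconstrained linear solve, the z-step
    projects onto the simplex, and u is the scaled dual variable.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    p = b.size
    z = np.full(p, 1.0 / p)                       # start from uniform weights
    u = np.zeros(p)
    M = A + rho * np.eye(p)
    for _ in range(n_iter):
        a = np.linalg.solve(M, b + rho * (z - u)) # quadratic step
        z = project_simplex(a + u)                # enforce the constraints
        u = u + a - z                             # dual update
    return z                                      # feasible by construction
```

Returning z rather than a guarantees that the result satisfies the constraints exactly, even if the loop stops before full convergence.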
By analysis, it can be easily found that the solution of α is related to S, and the solution of S is also related to α, so the original equation (6) is solved by alternately and iteratively optimizing S and α.
The whole process of solving equation (6) is shown in algorithm 1:
Figure BDA0002935279040000092
The flow framework of the multi-view clustering method based on adaptive sparse graph learning (ASGL) is shown in Fig. 1.
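The final step of the method, standard spectral clustering on the learned shared similarity matrix, can be sketched in Python with NumPy alone. The patent does not state which spectral clustering variant is used, so this sketch ASSUMES the normalized-Laplacian variant with row-normalized eigenvectors followed by a simple k-means; the function names, the farthest-point initialization and the iteration count are hypothetical.

```python
import numpy as np

def spectral_embedding(S, c):
    """Rows of the c eigenvectors of the normalized Laplacian, unit-normalized."""
    W = 0.5 * (S + S.T)                           # symmetrize the learned graph
    d = np.maximum(W.sum(axis=1), 1e-12)          # degrees, guarded against zero
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                   # eigenvalues in ascending order
    Y = vecs[:, :c]                               # c smallest eigenvalues
    norms = np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)
    return Y / norms

def kmeans(Y, c, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, c):
        dist = np.min([np.sum((Y - m) ** 2, axis=1) for m in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(c):
            if np.any(labels == j):               # keep old center if cluster is empty
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def spectral_clustering(S, c):
    """Cluster the n samples described by the n x n similarity matrix S into c groups."""
    return kmeans(spectral_embedding(np.asarray(S, dtype=float), c), c)
```

A library implementation of spectral clustering that accepts a precomputed affinity matrix could be substituted for these helper functions.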
To verify the effectiveness of the clustering results, clustering experiments were carried out on the 3-source text data set, the COIL20 object image data set, the MSRC image data set, the NUS image data set, the ORL face data set and the Outdoor Scene data set. In that order, the accuracy (ACC) values are 73.79%, 93.36%, 82.90%, 28.67%, 83.13% and 64.47%; the normalized mutual information (NMI) values are 67.45%, 97.01%, 73.75%, 16.34%, 93.19% and 55.58%; the adjusted Rand index (ARI) values are 57.63%, 92.73%, 67.95%, 10.12%, 79.34% and 46.20%; and the purity values are 81.07%, 94.78%, 83.29%, 31.10%, 86.85% and 66.55%. Compared with five recent multi-view clustering methods (S-MVSC, SwMC, AMGL, MLAN and AWP), the ASGL method provided by the invention achieves the best clustering results on all 6 experimental data sets under all 4 common clustering evaluation criteria.
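Two of the evaluation criteria named above, purity and normalized mutual information, can be computed from a contingency table as in the following Python sketch. The patent does not state which NMI normalization was used, so the geometric-mean normalization here is an ASSUMPTION, and the function names are hypothetical. The sketch does not reproduce the reported numbers.

```python
import numpy as np

def contingency(y_true, y_pred):
    """Table C[i, j] = number of samples with true class i and predicted cluster j."""
    _, ti = np.unique(np.asarray(y_true).ravel(), return_inverse=True)
    _, pi = np.unique(np.asarray(y_pred).ravel(), return_inverse=True)
    C = np.zeros((ti.max() + 1, pi.max() + 1))
    np.add.at(C, (ti, pi), 1)
    return C

def purity(y_true, y_pred):
    """Fraction of samples that belong to the majority class of their cluster."""
    C = contingency(y_true, y_pred)
    return float(C.max(axis=0).sum() / C.sum())

def nmi(y_true, y_pred):
    """Mutual information divided by sqrt(H(true) * H(pred)) (ASSUMED normalization)."""
    P = contingency(y_true, y_pred)
    P = P / P.sum()
    pt, pp = P.sum(axis=1), P.sum(axis=0)         # marginals, all strictly positive
    nz = P > 0
    mi = float(np.sum(P[nz] * np.log(P[nz] / np.outer(pt, pp)[nz])))
    ht = float(-np.sum(pt * np.log(pt)))
    hp = float(-np.sum(pp * np.log(pp)))
    if ht == 0.0 and hp == 0.0:                   # both labelings have one group
        return 1.0
    if ht == 0.0 or hp == 0.0:                    # only one labeling is trivial
        return 0.0
    return mi / np.sqrt(ht * hp)
```

Both measures are invariant to how the cluster labels are numbered, which is why they can be applied directly to clustering output without first matching cluster indices to class indices.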
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (4)

1. A multi-view clustering method based on adaptive sparse graph learning is characterized by comprising the following steps:
(1) obtaining the similarity matrix of each view through adaptive neighbor learning on the data matrix
For the i-th data point x_i^v of the v-th view, all data points of the v-th view
Figure FDA0002935279030000011
are connected to x_i^v as its neighbors with probability
Figure FDA0002935279030000012
Thus, for all data points of the v-th view, the probability is determined by solving the following problem:
Figure FDA0002935279030000013
Figure FDA0002935279030000014
Figure FDA0002935279030000015
In S^v, each
Figure FDA0002935279030000016
has k nonzero values, where k is the neighbor parameter, and the solution obtained is:
Figure FDA0002935279030000017
where
Figure FDA0002935279030000018
the similarity matrix of each view is learned adaptively from the data according to the solution;
(2) shared similarity matrix learning
For multi-view data, first construct p similarity matrices S^(1), S^(2), ..., S^(p), where S^(v) ∈ R^(n×n) (1 ≤ v ≤ p); introducing parameters λ and γ, the following model is proposed:
Figure FDA0002935279030000019
s.t. α^(v) ≥ 0, α^T 1_p = 1,
where α = [α^(1), α^(2), ..., α^(p)], λ > 0 and γ > 0; solving the above model is converted into solving the following relaxed problem:
Figure FDA0002935279030000021
s.t. α^(v) ≥ 0, α^T 1_p = 1,
where ||S||_1 is the convex relaxation of ||S||_0;
(3) optimization algorithm
The relaxed problem is solved with the alternating direction method of multipliers (ADMM): α and S are updated alternately, each with the other variable held fixed, and the learned shared similarity matrix with a sparse structure is used as the input of standard spectral clustering to obtain the clustering result.
2. The multi-view clustering method based on adaptive sparse graph learning as claimed in claim 1, wherein in step (1), when solving
Figure FDA0002935279030000022
because only the nearest data point would be a neighbor of
Figure FDA0002935279030000023
with probability 1, and all other data points could not be neighbors of
Figure FDA0002935279030000024
the following problem is solved instead:
Figure FDA0002935279030000025
Figure FDA0002935279030000026
since the optimal solution of this problem makes all data points of the v-th view neighbors of x_i^v with the same probability 1/n, the following problem is further solved:
Figure FDA0002935279030000027
Figure FDA0002935279030000028
and the final solution is
Figure FDA0002935279030000029
3. The multi-view clustering method based on adaptive sparse graph learning as claimed in claim 1, wherein in step (3), when S is updated, α is fixed and the relaxed problem reduces to solving the following problem:
Figure FDA0002935279030000031
For the objective function of
Figure FDA0002935279030000032
the following holds:
Figure FDA0002935279030000033
Therefore,
Figure FDA0002935279030000034
is equivalent to the following formula:
Figure FDA0002935279030000035
Defining
Figure FDA0002935279030000036
the formula is further simplified to
Figure FDA0002935279030000037
Introducing the soft-threshold shrinkage operator:
Figure FDA0002935279030000038
where μ > 0, the updated similarity matrix S is obtained as:
Figure FDA0002935279030000039
4. The multi-view clustering method based on adaptive sparse graph learning as claimed in claim 1, wherein in step (3), when α is updated, S is fixed and the relaxed problem reduces to solving the following problem:
Figure FDA0002935279030000041
s.t. α^(v) ≥ 0, α^T 1_p = 1;
Defining
Figure FDA0002935279030000042
this is equivalent to solving the following problem:
Figure FDA0002935279030000043
s.t. α^(v) ≥ 0, α^T 1_p = 1;
which is further rewritten in the following form:
Figure FDA0002935279030000044
and is then solved by the alternating direction method of multipliers (ADMM).
CN202110158287.0A 2021-02-05 2021-02-05 Multi-view clustering method based on self-adaptive sparse graph learning Active CN112766412B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110158287.0A CN112766412B (en) 2021-02-05 2021-02-05 Multi-view clustering method based on self-adaptive sparse graph learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110158287.0A CN112766412B (en) 2021-02-05 2021-02-05 Multi-view clustering method based on self-adaptive sparse graph learning

Publications (2)

Publication Number Publication Date
CN112766412A true CN112766412A (en) 2021-05-07
CN112766412B CN112766412B (en) 2023-11-07

Family

ID=75705070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110158287.0A Active CN112766412B (en) 2021-02-05 2021-02-05 Multi-view clustering method based on self-adaptive sparse graph learning

Country Status (1)

Country Link
CN (1) CN112766412B (en)

Also Published As

Publication number Publication date
CN112766412B (en) 2023-11-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant