CN110334770A - Graph clustering method based on robust rank-constrained sparse learning - Google Patents
Graph clustering method based on robust rank-constrained sparse learning
- Publication number
- CN110334770A (application CN201910615677.9A)
- Authority
- CN
- China
- Prior art keywords
- similarity graph
- objective function
- data
- iteration
- sparse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2323—Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Discrete Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a graph clustering method based on robust rank-constrained sparse learning, which addresses the poor robustness of existing graph clustering methods. The technical solution learns the data similarity graph S by sparse representation and combines it with the L2,1 norm to obtain an initial objective function; constructs an initial graph with the k-nearest-neighbor method and constrains the sought similarity graph S to the neighborhood of the initial graph; and adds a Laplacian rank constraint so that the rank equals the number of data points minus the number of connected components of the similarity graph, which yields the final objective function. The augmented Lagrange multiplier method converts the objective function from a constrained optimization problem into an unconstrained one; the variables in the objective function are optimized alternately; at the end of each iteration, the parameters of the augmented Lagrange multiplier method are updated; once the iteration meets the termination condition, the similarity matrix is decomposed according to the obtained solution to give the final clustering result. By constructing a high-quality graph and using the L2,1 norm, the invention improves the robustness of the method.
Description
Technical field
The present invention relates to a graph clustering method, and in particular to a graph clustering method based on robust rank-constrained sparse learning.
Background art
With the development of information technology, the amount of information that people encounter and must process every day is growing geometrically. Faced with information resources on this scale, how to organize and use them effectively has become an urgent problem. Against this background, the abundance of unlabeled data has greatly promoted the development of clustering, an important research direction in data mining. Clustering originates mainly from taxonomy; it is the process of grouping physical or abstract objects into multiple classes based on similarity. The process uses the features of the samples as the basis for classification, so that individuals within the same class are as homogeneous as possible and different classes are as heterogeneous as possible. Clustering is representative of "unsupervised learning": it requires neither manually supplied data labels nor prior knowledge, and it partitions the data points into different classes by a computation that finishes in finite time. Clustering can serve as a standalone data analysis tool that yields results directly, or, when the data volume is large, as a preprocessing step; for this reason clustering often appears alongside the concept of "big data".
Among the many clustering techniques, graph clustering offers relatively advanced performance and can mine the intrinsic structure of the data. Graph clustering is a class of clustering algorithms that models the data as points in space: it builds a data similarity graph from the distances between the data points in that space, and then post-processes the similarity graph to complete the clustering task. J. Huang et al., in "J. Huang, F. Nie, and H. Huang, A new simplex sparse learning model to measure data similarity for clustering, in Proc. IEEE Conf. Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015", proposed the simplex sparse representation graph clustering method. Building on sparse representation, this method computes the similarity between sample points with a constrained sparse representation and achieves good clustering results. F. Nie et al., in "F. Nie, X. Wang, M. I. Jordan, and H. Huang, The constrained Laplacian rank algorithm for graph-based clustering, in Proc. IEEE Conf. Thirtieth AAAI Conference on Artificial Intelligence, 2016", proposed the Laplacian rank-constrained graph clustering method. This method learns the data graph directly under a rank constraint, so that the clustering result can be read off directly from the connectivity of the data points in the similarity graph, which improves the speed and accuracy of clustering.
Both methods have limitations. First, the learned similarity graph cannot accurately reflect the relationships between the data points. Second, the error of each data point is squared in the objective function, so these methods are easily affected by outliers and noise and have poor robustness.
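The second limitation can be illustrated with a small numeric sketch. It is not taken from the patent: the residual matrix `E` and the convention that each column holds the error of one data point are assumptions made here for illustration. It compares how much of the total cost a single outlier column contributes under a squared-error cost and under the L2,1 norm.

```python
import math

def column_norms(E):
    """Euclidean norm of every column of E (assumed: one column per data point)."""
    rows, cols = len(E), len(E[0])
    return [math.sqrt(sum(E[i][j] ** 2 for i in range(rows))) for j in range(cols)]

def l21_norm(E):
    """L2,1 norm: the column norms are summed without squaring."""
    return sum(column_norms(E))

def squared_frobenius(E):
    """Squared-error cost: every column norm enters squared."""
    return sum(norm * norm for norm in column_norms(E))

# Four ordinary residual columns of norm 5 and one outlier column of norm 50.
E = [[3, 3, 3, 3, 30],
     [4, 4, 4, 4, 40]]

norms = column_norms(E)
share_l21 = norms[-1] / l21_norm(E)              # outlier's share of the L2,1 cost
share_sq = norms[-1] ** 2 / squared_frobenius(E)  # outlier's share of the squared cost
print(l21_norm(E), squared_frobenius(E), share_l21, share_sq)
```

In this toy case the outlier accounts for about 71% of the L2,1 cost but about 96% of the squared cost, which is the sense in which squaring the per-point error makes an objective more sensitive to outliers.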
Summary of the invention
To overcome the poor robustness of existing graph clustering methods, the present invention provides a graph clustering method based on robust rank-constrained sparse learning. The method learns the data similarity graph S by sparse representation, combined with the L2,1 norm, to obtain an initial objective function; constructs an initial graph with the k-nearest-neighbor method and constrains the sought similarity graph S to the neighborhood of the initial graph; and adds a Laplacian rank constraint so that the rank equals the number of data points minus the number of connected components of the similarity graph. These three terms are weighted by coefficients and summed to give the final objective function. The augmented Lagrange multiplier method converts the objective function from a constrained optimization problem into an unconstrained one. The variables in the objective function are first initialized; then, in each iteration, they are optimized alternately, one variable at a time with the other three held fixed. At the end of each iteration, the parameters of the augmented Lagrange multiplier method are updated. Once the iteration meets the termination condition, the similarity matrix is decomposed according to the obtained solution to give the final clustering result. By constructing a high-quality graph and using the L2,1 norm, the invention improves the robustness of the method.
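As an illustration of the k-nearest-neighbor initial graph mentioned above, the following is a minimal sketch. The patent text does not state how the edges of the initial graph are weighted, so the binary 0/1 weights, the symmetrization rule, and the function name `knn_graph` are assumptions made for this example.

```python
import math

def knn_graph(points, k):
    """Symmetric 0/1 adjacency matrix linking each point to its k nearest neighbours."""
    n = len(points)

    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    B = [[0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: distance(points[i], points[j]))
        for j in others[:k]:
            B[i][j] = 1
            B[j][i] = 1  # assumed rule: keep the edge if either endpoint selects it
    return B

# Two well-separated pairs of points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
B = knn_graph(points, k=1)
for row in B:
    print(row)
```

With `k=1` each point is linked only to its partner, so the initial graph already consists of two separate pairs; larger `k` would add edges between the pairs.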
The technical solution adopted by the present invention to solve its technical problem is a graph clustering method based on robust rank-constrained sparse learning, characterized by comprising the following steps:
Step 1: Learn the data similarity graph S by sparse representation, combined with the L2,1 norm. With X denoting the data, the initial objective function is: [formula not reproduced in the source text]
Step 2: Construct an initial graph B with the k-nearest-neighbor method and, through a regularization term [formula not reproduced in the source text], constrain the sought similarity graph S to the neighborhood of B;
Step 3: Constrain the rank of the Laplacian matrix L_S of S so that it equals the number of data points minus the number of connected components of the similarity graph, and obtain the clustering result directly from the connectivity of the data points in the similarity graph. Specifically, the constraint is imposed through: [formula not reproduced in the source text]
Step 4: Weight the above three terms by coefficients and sum them to obtain the final objective function: [formula not reproduced in the source text]
Step 5: Using the augmented Lagrange multiplier method, let E = X - XZ and Z = S, and convert the objective function to: [formula not reproduced in the source text]
Step 6: First initialize the variables E, Z, S and F contained therein; then, in each iteration, optimize the variables alternately, updating one variable while the other three are held fixed;
Step 7: At the end of each iteration, update the parameters of the augmented Lagrange multiplier method; after a finite number of iterations, the optimal solution is obtained step by step.
Step 8: Decompose S according to the obtained optimal solution to obtain the final clustering result.
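The formulas belonging to these steps are not available in this text. As a hedged sketch only, the wording of Steps 1 and 3 is consistent with a problem of the following shape. The symbols n (number of data points), c (number of connected components) and \sigma_i (the i-th smallest eigenvalue) are notation introduced here for illustration, and the regularization term of Step 2 and the weighting coefficients of Step 4 are deliberately left out because their exact form is not given.

```latex
\begin{align}
  &\min_{S}\ \lVert X - XS \rVert_{2,1}
    \quad \text{s.t.}\quad \operatorname{rank}(L_S) = n - c, \\
  &\sum_{i=1}^{c} \sigma_i(L_S)
    \;=\; \min_{F \in \mathbb{R}^{n \times c},\ F^{\top} F = I}
    \operatorname{Tr}\!\left(F^{\top} L_S F\right).
\end{align}
```

The second line is Ky Fan's theorem. Because L_S is positive semidefinite, driving this sum to zero gives L_S at least c zero eigenvalues, which is one common way a rank constraint of this kind is relaxed and would explain the auxiliary variable F in Step 6; whether the patent uses exactly this relaxation is an assumption.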
The beneficial effects of the present invention are as follows. The method learns the data similarity graph S by sparse representation, combined with the L2,1 norm, to obtain an initial objective function; constructs an initial graph with the k-nearest-neighbor method and constrains the sought similarity graph S to the neighborhood of the initial graph; and adds a Laplacian rank constraint so that the rank equals the number of data points minus the number of connected components of the similarity graph. These three terms are weighted by coefficients and summed to give the final objective function. The augmented Lagrange multiplier method converts the objective function from a constrained optimization problem into an unconstrained one. The variables in the objective function are first initialized; then, in each iteration, they are optimized alternately, one variable at a time with the other three held fixed. At the end of each iteration, the parameters of the augmented Lagrange multiplier method are updated. Once the iteration meets the termination condition, the similarity matrix is decomposed according to the obtained solution to give the final clustering result. By constructing a high-quality graph and using the L2,1 norm, the invention improves the robustness of the method.
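The update formulas used in the alternating optimization are not given in this text. As a hedged illustration of one building block that augmented-Lagrangian solvers with an L2,1 term commonly rely on, the sketch below implements column-wise shrinkage, the closed-form minimizer of tau*||E||_{2,1} + 0.5*||E - Q||_F^2. The names `l21_shrink`, `Q` and `tau` are hypothetical, and nothing here is claimed to be the patent's own update for E.

```python
import math

def l21_shrink(Q, tau):
    """Column-wise shrinkage: minimiser of tau*||E||_{2,1} + 0.5*||E - Q||_F^2."""
    rows, cols = len(Q), len(Q[0])
    E = [[0.0] * cols for _ in range(rows)]
    for j in range(cols):
        norm = math.sqrt(sum(Q[i][j] ** 2 for i in range(rows)))
        if norm > tau:
            scale = (norm - tau) / norm  # shorten the column by tau, keep its direction
            for i in range(rows):
                E[i][j] = scale * Q[i][j]
        # Columns whose norm is at most tau stay at zero.
    return E

Q = [[3.0, 0.3],
     [4.0, 0.4]]
E = l21_shrink(Q, tau=1.0)
print(E)
```

The first column (norm 5) is scaled by 0.8, while the second column (norm 0.5, below the threshold) is zeroed out. Small residual columns are suppressed entirely and large ones are only shortened by a fixed amount, which differs from the uniform scaling a squared penalty would produce.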
The present invention is described in detail below with reference to the accompanying drawing and specific embodiments.
Brief description of the drawings
Fig. 1 is a flowchart of the graph clustering method based on robust rank-constrained sparse learning according to the present invention.
Detailed description of the embodiments
Referring to Fig. 1, the specific steps of the graph clustering method based on robust rank-constrained sparse learning according to the present invention are as follows:
Step 1: Learn the data similarity graph S by sparse representation, combined with the L2,1 norm, to improve the quality of graph construction and reduce the influence of data noise and outliers. Specifically, with X denoting the data, the initial objective function is: [formula not reproduced in the source text]
Step 2: Construct an initial graph B with the k-nearest-neighbor method and, through a regularization term [formula not reproduced in the source text], constrain the sought similarity graph S to the neighborhood of B, so that the learned similarity graph accurately reflects the relationships between the data points;
Step 3: Add the Laplacian rank constraint, that is, constrain the rank of the Laplacian matrix L_S of S so that it equals the number of data points minus the number of connected components of the similarity graph. The clustering result can then be obtained directly from the connectivity of the data points in the similarity graph, with no subsequent post-processing, which improves the quality and efficiency of clustering. Specifically, the constraint is imposed through: [formula not reproduced in the source text]
Step 4: Weight the above three terms by coefficients and sum them to obtain the final objective function: [formula not reproduced in the source text]
Step 5: Using the augmented Lagrange multiplier method, let E = X - XZ and Z = S, and convert the objective function to: [formula not reproduced in the source text]
Step 6: First initialize the variables E, Z, S and F contained therein; then, in each iteration, optimize the variables alternately, updating one variable while the other three are held fixed;
Step 7: At the end of each iteration, update the parameters of the augmented Lagrange multiplier method; in this way the optimal solution is obtained step by step after a finite number of iterations.
Step 8: Decompose S according to the obtained solution to obtain the final clustering result.
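The link between the rank of the Laplacian and the connected components of the similarity graph, and the way a clustering can be read off from connectivity, can be checked on a toy example. The sketch below is an illustration under assumptions: the Laplacian is taken as D - W with W = (S + S^T)/2, any nonzero entry counts as an edge, and the helper names are hypothetical. Exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction

def laplacian(S):
    """L = D - W with W = (S + S^T) / 2 and D the diagonal degree matrix of W."""
    n = len(S)
    W = [[Fraction(S[i][j] + S[j][i]) / 2 for j in range(n)] for i in range(n)]
    return [[(sum(W[i]) if i == j else 0) - W[i][j] for j in range(n)] for i in range(n)]

def matrix_rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    rank = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(rank, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for i in range(rank + 1, len(M)):
            factor = M[i][col] / M[rank][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[rank])]
        rank += 1
    return rank

def connected_components(S):
    """Label each node with the index of its connected component (depth-first search)."""
    n = len(S)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and (S[i][j] != 0 or S[j][i] != 0):
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Similarity graph with two components: nodes {0, 1, 2} and nodes {3, 4}.
S = [[0, 1, 1, 0, 0],
     [1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 1, 0]]
labels = connected_components(S)
n, c = len(S), len(set(labels))
print(labels, matrix_rank(laplacian(S)), n - c)
```

Here the graph has 5 nodes and 2 components, the Laplacian has rank 3, and the component labels are the cluster assignment, so no separate post-processing step such as k-means is needed once the learned graph has the desired number of components.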
The effect of the invention is further illustrated by the following simulation experiment.
1. Simulation conditions.
The simulation was carried out with MATLAB on a machine with an i5-3470 3.2 GHz CPU, 4 GB of memory, and the Windows 7 operating system.
The COIL20 data set used in the experiment comes from D. Cai et al., "D. Cai, X. He, J. Han, and T. S. Huang, Graph regularized nonnegative matrix factorization for data representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 8, pp. 1548-1560, 2010". It contains 1440 samples in 20 classes, and each sample corresponds to a 32 × 32 image.
2. Simulation content.
A clustering experiment was carried out on the data set. To assess the effectiveness of the invention, the following methods were chosen for comparison: the Laplacian rank-constrained graph clustering method (CLR) proposed by F. Nie et al. in "F. Nie, X. Wang, M. I. Jordan, and H. Huang, The constrained Laplacian rank algorithm for graph-based clustering, in Proc. IEEE Conf. Thirtieth AAAI Conference on Artificial Intelligence, 2016", the simplex sparse representation graph clustering method proposed by J. Huang et al. in "J. Huang, F. Nie, and H. Huang, A new simplex sparse learning model to measure data similarity for clustering, in Proc. IEEE Conf. Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015", and several other basic clustering methods. The parameters were tuned, and accuracy (ACC), normalized mutual information (NMI) and purity (Purity) were computed. The comparison results are shown in Table 1.
Table 1. Comparison of experimental results
Method | K-means | NMF | NCut | CAN | CLR | SSR | This method |
---|---|---|---|---|---|---|---|
ACC | 0.55 | 0.47 | 0.49 | 0.82 | 0.83 | 0.69 | 0.88 |
NMI | 0.72 | 0.60 | 0.66 | 0.90 | 0.91 | 0.80 | 0.93 |
Purity | 0.60 | 0.50 | 0.56 | 0.86 | 0.87 | 0.75 | 0.89 |
As Table 1 shows, the clustering performance of this method is better than that of the other compared methods. The above simulation experiment verifies the effectiveness of the invention.
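For readers unfamiliar with the metrics in Table 1, the following sketch shows how purity and clustering accuracy are commonly defined. It is an assumption that the experiment used these standard definitions; the function names and the toy labels are made up for illustration. The accuracy function searches all one-to-one relabelings by brute force, which is only practical for a handful of clusters; for 20 classes an assignment solver such as the Hungarian algorithm would normally be used instead.

```python
from collections import Counter
from itertools import permutations

def purity(true_labels, pred_labels):
    """Each predicted cluster is credited with its majority true class."""
    covered = 0
    for cluster in set(pred_labels):
        members = [t for t, p in zip(true_labels, pred_labels) if p == cluster]
        covered += Counter(members).most_common(1)[0][1]
    return covered / len(true_labels)

def clustering_accuracy(true_labels, pred_labels):
    """Accuracy under the best one-to-one mapping of clusters to classes (brute force)."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(pred_labels))
    # Pad with None so surplus clusters can remain unmatched.
    targets = classes + [None] * max(0, len(clusters) - len(classes))
    best = 0
    for perm in permutations(targets, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[p] == t for t, p in zip(true_labels, pred_labels))
        best = max(best, hits)
    return best / len(true_labels)

# Toy example: the first true class is split across two predicted clusters.
true_labels = [0, 0, 0, 1, 1, 1]
pred_labels = [0, 0, 1, 2, 2, 2]
print(purity(true_labels, pred_labels), clustering_accuracy(true_labels, pred_labels))
```

In this example purity is 1.0 because every predicted cluster is pure, while accuracy is 5/6 because the one-to-one mapping cannot credit both fragments of the split class. This is why purity alone can overstate quality when a method produces too many clusters.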
Claims (1)
1. A graph clustering method based on robust rank-constrained sparse learning, characterized by comprising the following steps:
Step 1: Learn the data similarity graph S by sparse representation, combined with the L2,1 norm. With X denoting the data, the initial objective function is: [formula not reproduced in the source text]
Step 2: Construct an initial graph B with the k-nearest-neighbor method and, through a regularization term [formula not reproduced in the source text], constrain the sought similarity graph S to the neighborhood of B;
Step 3: Constrain the rank of the Laplacian matrix L_S of S so that it equals the number of data points minus the number of connected components of the similarity graph, and obtain the clustering result directly from the connectivity of the data points in the similarity graph. Specifically, the constraint is imposed through: [formula not reproduced in the source text]
Step 4: Weight the above three terms by coefficients and sum them to obtain the final objective function: [formula not reproduced in the source text]
Step 5: Using the augmented Lagrange multiplier method, let E = X - XZ and Z = S, and convert the objective function to: [formula not reproduced in the source text]
Step 6: First initialize the variables E, Z, S and F contained therein; then, in each iteration, optimize the variables alternately, updating one variable while the other three are held fixed;
Step 7: At the end of each iteration, update the parameters of the augmented Lagrange multiplier method; after a finite number of iterations, the optimal solution is obtained step by step;
Step 8: Decompose S according to the obtained optimal solution to obtain the final clustering result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910615677.9A CN110334770A (en) | 2019-07-09 | 2019-07-09 | Graph clustering method based on robust rank-constrained sparse learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910615677.9A CN110334770A (en) | 2019-07-09 | 2019-07-09 | Graph clustering method based on robust rank-constrained sparse learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110334770A (en) | 2019-10-15 |
Family
ID=68144812
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910615677.9A Pending CN110334770A (en) | 2019-07-09 | 2019-07-09 | Graph clustering method based on robust rank-constrained sparse learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110334770A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110837863A (en) * | 2019-11-07 | 2020-02-25 | 仲恺农业工程学院 | Graph node clustering method based on orthogonal robust nonnegative matrix factorization |
CN112926658A (en) * | 2021-02-26 | 2021-06-08 | 西安交通大学 | Image clustering method and device based on two-dimensional data embedding and adjacent topological graph |
CN112926658B (en) * | 2021-02-26 | 2023-03-21 | 西安交通大学 | Image clustering method and device based on two-dimensional data embedding and adjacent topological graph |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhou et al. | Subspace segmentation-based robust multiple kernel clustering | |
CN102592268B (en) | Method for segmenting foreground image | |
WO2020143321A1 (en) | Training sample data augmentation method based on variational autoencoder, storage medium and computer device | |
WO2020083073A1 (en) | Non-motorized vehicle image multi-label classification method, system, device and storage medium | |
CN102622610B (en) | Handwritten Uyghur character recognition method based on classifier integration | |
Yu et al. | Self-paced learning for k-means clustering algorithm | |
US8266078B2 (en) | Platform for learning based recognition research | |
CN102651128B (en) | Image set partitioning method based on sampling | |
CN110288605A (en) | Cell image segmentation method and device | |
CN104933445A (en) | Mass image classification method based on distributed K-means | |
CN112885415B (en) | Quick screening method for estrogen activity based on molecular surface point cloud | |
CN111539444A (en) | Gaussian mixture model method for modified mode recognition and statistical modeling | |
CN110334770A (en) | Graph clustering method based on robust rank-constrained sparse learning | |
CN103593855A (en) | Clustered image splitting method based on particle swarm optimization and spatial distance measurement | |
CN101968852A (en) | Entropy sequencing-based semi-supervision spectral clustering method for determining clustering number | |
Yu et al. | A recognition method of soybean leaf diseases based on an improved deep learning model | |
Chen et al. | LABIN: Balanced min cut for large-scale data | |
CN110135364A (en) | A kind of Objects recognition method and device | |
Wei et al. | Flexible high-dimensional unsupervised learning with missing data | |
CN108090461A (en) | Three-dimensional face identification method based on sparse features | |
Xu et al. | POEM: 1-bit point-wise operations based on expectation-maximization for efficient point cloud processing | |
Zhao et al. | A novel multi-view clustering method via low-rank and matrix-induced regularization | |
Lin et al. | Hyperbolic diffusion embedding and distance for hierarchical representation learning | |
Chen et al. | Rooted Mahalanobis distance based Gustafson-Kessel fuzzy C-means | |
CN108256569B (en) | Object identification method under complex background and used computer technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191015 |