CN114037853B

CN114037853B - Depth image clustering method based on Laplace rank constraint

Info

Publication number: CN114037853B
Application number: CN202111354109.1A
Authority: CN
Inventors: 李学龙; 韦腾飞; 赵阳
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2021-11-11
Filing date: 2021-11-11
Publication date: 2024-03-05
Anticipated expiration: 2041-11-11
Also published as: CN114037853A

Abstract

The invention provides a depth image clustering method based on Laplace rank constraint. First, an image data set is preprocessed to obtain an expanded data set. Next, an orthogonal constraint is introduced in the last layer of a spectral embedding network so that the cluster indicator vectors output by the network are mutually orthogonal, a similarity matrix constrained by the rank of the Laplacian matrix is introduced into the loss function, and the network is trained on the image data set. Finally, the trained network processes the images to be classified to obtain the classification result. The method obtains a good low-dimensional data representation, is suitable for clustering image data of different scales, can process large-scale image data sets efficiently, and has good practical value.

Description

Depth image clustering method based on Laplace rank constraint

Technical Field

The invention belongs to the field of machine learning, and particularly relates to a depth image clustering method based on Laplace rank constraint.

Background

Clustering is a fundamental method in machine learning, and its role has become increasingly prominent in the big-data era. Whatever the industry, when massive data must be analyzed, clustering is among the lowest-cost unsupervised data analysis methods, and cluster analysis is a primary data analysis tool in many fields, including mathematics, computer science, statistics, biology and economics. However, as data forms diversify, existing clustering methods struggle with multi-scale data and with data lying on complex manifolds.

Among conventional clustering methods, spectral clustering is described by Ulrike von Luxburg in "A Tutorial on Spectral Clustering," Statistics and Computing, vol. 17, no. 4, 2007, pp. 395-416. It uses a similarity measure between data points to characterize their distances, treats each data point as a node, builds a graph over the entire data set, and clusters the data by cutting the graph. Nie et al., in "The Constrained Laplacian Rank Algorithm for Graph-Based Clustering," AAAI'16: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016, pp. 1969-1976, propose introducing a rank constraint on the Laplacian matrix in graph-based clustering to obtain a cluster-friendly representation of the data, i.e., to learn a similarity matrix with a distinct block structure. Most of these methods focus on finding better similarity measures and optimal neighbor data points to improve clustering accuracy, but they are limited by their own computational architecture, have high complexity, and are difficult to apply to large-scale data sets.

Among deep-learning-based clustering algorithms, Junyuan Xie et al. propose the deep embedding clustering method in "Unsupervised Deep Embedding for Clustering Analysis," ICML'16: Proceedings of the 33rd International Conference on Machine Learning, Volume 48, 2016, pp. 478-487. Deep clustering methods greatly improve the ability to process large-scale data sets, but they lack interpretability when learning low-dimensional representations of the data, and they have difficulty obtaining low-dimensional embeddings well suited to clustering. Moreover, these methods are suitable only for data sets of particular scales, and their robustness is poor.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention provides a depth image clustering method based on Laplacian rank constraint. First, an image data set is preprocessed to obtain an expanded data set. Next, an orthogonal constraint is introduced in the last layer of a spectral embedding network so that the cluster indicator vectors output by the network are mutually orthogonal, a similarity matrix constrained by the rank of the Laplacian matrix is introduced into the loss function, and the network is trained on the image data set. Finally, the trained network processes the images to be classified to obtain the classification result. The method obtains a good low-dimensional representation of the data and flexibly handles image data of different scales, addressing the problems that existing methods cannot process large-scale data sets, have difficulty obtaining feature representations well suited to clustering, and have difficulty processing data in a variety of modal forms.

A depth image clustering method based on Laplace rank constraint is characterized by comprising the following steps:

Step 1: inputting an image data set, processing the images in the image data set by rotation, scaling and color transformation respectively to obtain an expanded image data set, and performing size normalization so that all images have the same size;

Step 2: randomly selecting m images from the image data set obtained in step 1 and inputting them into the spectral embedding network, applying an orthogonal constraint to the last layer of the network, calculating the network loss L, and updating the network weights θ through back-propagation until the loss converges, thereby obtaining a trained network;

wherein m satisfies 1 ≤ m ≤ n, and n is the total number of images contained in the image data set; convergence of the loss is judged by an early stopping mechanism;

the spectral embedding network is a 5-layer fully connected network whose layers 1 to 5 have dimensions 1024, 512, 256, 128 and k, respectively, where k denotes the number of clusters; the loss function L(θ) of the network is set as follows:

wherein y_i denotes the low-dimensional embedded feature output by the network for the i-th input image x_i; c_i and c_j denote the degrees of the nodes corresponding to the i-th input image x_i and the j-th input image x_j, respectively; s_{i,j} denotes the element in row i and column j of the similarity matrix S; and the similarity matrix S is calculated according to the following formula:

wherein s_i denotes the transpose of the vector formed by the elements of row i of the matrix S, L_S denotes the Laplacian matrix constructed from the input batch of image data, and γ is the regularization parameter, with value range (0, +∞);

the orthogonal constraint requires Y^T Y = I_k, wherein Y is the matrix whose columns are the indicator vectors output by the last layer of the network, and I_k denotes the identity matrix of order k;

Step 3: inputting the image data set to be processed into the spectral embedding network trained in step 2 to obtain the corresponding low-dimensional embedded vectors, and clustering these vectors with the K-means method to obtain the final clustering result.

The beneficial effects of the invention are as follows. Because the similarity matrix is calculated under a rank constraint on the Laplacian matrix, the implicit semantic relations between image data can be fully mined, the high computational complexity of existing image clustering algorithms on large-scale image data sets is addressed, and clustering accuracy is improved. Because the number of neighbors is selected adaptively during similarity calculation, the complexity of computing the similarity matrix is reduced, so the method remains efficient on large-scale image data sets. Because an orthogonal constraint is introduced in the last layer of the network, the cluster indicator vectors output by the network are guaranteed to be mutually orthogonal, which replaces the eigendecomposition step of conventional spectral clustering and reduces computation time. Because the adaptive Laplacian matrix calculation is embedded in the neural network, a better low-dimensional data representation is obtained, and the efficiency of the conventional similarity calculation is combined with the computational power of the neural network, so the method handles image data sets of different scales well. Because the network weights are updated continually through batch training, the method is suitable for cluster analysis of data sets of different scales and copes particularly well with large-scale data sets.

Drawings

Fig. 1 is a flowchart of a depth image clustering method based on laplacian rank constraint of the present invention.

Detailed Description

The invention is further described below with reference to the drawing and embodiments; the invention includes, but is not limited to, the following embodiments.

As shown in fig. 1, the invention provides a depth image clustering method based on laplacian rank constraint, which comprises the following specific implementation processes:

1. Data set augmentation and preprocessing

An image data set is input, and its images are processed by rotation, scaling and color transformation respectively, which expands the amount of data and enhances the expressive power of the data set. Size normalization is then applied so that all images have the same size. Let the resulting image data set be X = {x_1, x_2, …, x_n}, where x_i denotes the i-th image, i = 1, 2, …, n, n is the total number of images contained in the expanded image data set, and the images belong to k classes in total.
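By way of non-limiting illustration, the augmentation and size normalization described above can be sketched in Python with NumPy. The specific choices below (a 90-degree rotation, a central crop as the scaling transform, per-channel gains in [0.8, 1.2] as the color transformation, nearest-neighbor resizing, and the names `resize_nearest` and `augment`) are assumptions made for the sketch and are not specified in the text.

```python
import numpy as np

def resize_nearest(img, size):
    """Nearest-neighbor resize of an H x W x C image to size x size x C."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows[:, None], cols]

def augment(img, size=32, rng=None):
    """Return [original, rotated, scaled, recolored], each size x size x C.

    Assumes img is a float array with values in [0, 1].
    """
    rng = np.random.default_rng(0) if rng is None else rng
    base = resize_nearest(img, size)                 # size normalization
    rotated = np.rot90(base, k=1)                    # rotation (assumed 90 degrees)
    h, w = img.shape[:2]
    crop = img[h // 4: h - h // 4, w // 4: w - w // 4]
    scaled = resize_nearest(crop, size)              # scaling: zoom into the center
    gains = rng.uniform(0.8, 1.2, size=(1, 1, img.shape[2]))
    recolored = np.clip(base * gains, 0.0, 1.0)      # color transformation
    return [base, rotated, scaled, recolored]
```

Applying `augment` to every image and concatenating the results yields an expanded data set in which all images share one size.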

2. Constructing an adaptive similarity matrix

In order to fully mine the implicit semantic relation between image data, a Laplace matrix rank constraint is introduced to construct a similarity matrix S as follows:

where x_i denotes the i-th input image data, x_j denotes the j-th input image data, s_{i,j} denotes the element in row i and column j of the similarity matrix S, s_i denotes the transpose of the vector formed by the elements of row i of the matrix S, L_S denotes the Laplacian matrix of the data set X, and γ is an introduced regularization parameter with value range (0, +∞).
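The formula for S does not survive in this text. A plausible form, offered only as an assumption, is the adaptive-neighbor objective with a Laplacian rank constraint from the graph-based clustering literature cited in the Background, written with the symbols defined above:

```latex
\min_{S}\ \sum_{i,j=1}^{n}\left(\lVert x_i - x_j\rVert_2^{2}\, s_{i,j} + \gamma\, s_{i,j}^{2}\right)
\quad \text{s.t.}\quad s_i^{T}\mathbf{1} = 1,\quad 0 \le s_{i,j} \le 1,\quad \operatorname{rank}(L_S) = n - k
```

Under this reading, the constraint rank(L_S) = n - k forces the graph defined by S to have exactly k connected components, which is what gives the similarity matrix its block structure.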

The optimal solution for the above equation is:

where p denotes the preset number of neighbors, with 1 ≤ p ≤ n, and d_ij is calculated according to the following formula:

where λ is an introduced hyperparameter with value range (0, 1]; e_i and e_j denote the i-th and j-th row vectors of the embedding matrix E, respectively, and E is obtained, according to the properties of spectral clustering, by solving the following extremum problem:
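The three formulas referred to above are not reproduced in this text. The following Python sketch shows one way such a similarity matrix could be computed, under assumptions borrowed from the adaptive-neighbor literature: d_ij = ||x_i - x_j||² + λ||e_i - e_j||², and for each row i the p nearest neighbors receive the weight (d_{i,p+1} - d_ij) / (p·d_{i,p+1} - Σ_{h≤p} d_ih) while all other entries are zero. The function names and the tie-handling rule are hypothetical, and the sketch requires p ≤ n - 2 so that a (p+1)-th neighbor exists.

```python
import numpy as np

def pairwise_sq_dists(A):
    """Squared Euclidean distances between all rows of A."""
    sq = (A ** 2).sum(axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (A @ A.T)
    return np.maximum(d, 0.0)

def adaptive_neighbor_similarity(X, p, E=None, lam=1.0):
    """Similarity matrix with at most p nonzero entries per row; rows sum to 1."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)   # flatten images
    n = X.shape[0]
    assert 1 <= p <= n - 2, "the closed form needs a (p+1)-th neighbor"
    d = pairwise_sq_dists(X)
    if E is not None:                                    # optional embedding term
        d = d + lam * pairwise_sq_dists(np.asarray(E, dtype=float))
    np.fill_diagonal(d, np.inf)                          # exclude self-similarity
    S = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(d[i])                         # ascending distance
        nearest = order[:p]
        d_p1 = d[i, order[p]]                            # (p+1)-th smallest distance
        denom = p * d_p1 - d[i, nearest].sum()
        if denom > 1e-12:
            S[i, nearest] = (d_p1 - d[i, nearest]) / denom
        else:                                            # all distances tied (assumed rule)
            S[i, nearest] = 1.0 / p
    return S
```

Because each row has at most p nonzero entries, S is sparse, which is consistent with the stated aim of reducing the cost of similarity computation on large data sets.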

3. Setting the network and parameters

The weight parameters θ of the spectral embedding network are randomly initialized, and the batch size of the training data is set to m, with 1 ≤ m ≤ n. An orthogonal constraint is applied to the last layer of the network, i.e., Y^T Y = I_k is required, where Y is the matrix whose columns are the indicator vectors output by the last layer of the network and I_k denotes the identity matrix of order k.
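The text does not state how the constraint Y^T Y = I_k is enforced. One common option, assumed here purely for illustration, is to append a linear layer whose k × k weight matrix is obtained from a QR factorization of the unconstrained batch output. The name `orthogonal_layer_weights` is hypothetical.

```python
import numpy as np

def orthogonal_layer_weights(Y_tilde):
    """Return a k x k matrix W such that Y = Y_tilde @ W satisfies Y^T Y = I_k.

    Y_tilde is the m x k unconstrained output for one batch and must have
    full column rank (m >= k).
    """
    _, R = np.linalg.qr(Y_tilde)     # reduced QR: Y_tilde = Q R, Q^T Q = I_k
    return np.linalg.inv(R)          # Y_tilde @ inv(R) = Q

# Usage on a random batch of m = 64 outputs with k = 4 clusters
rng = np.random.default_rng(0)
Y_tilde = rng.normal(size=(64, 4))
Y = Y_tilde @ orthogonal_layer_weights(Y_tilde)
```

Since the columns of Y are orthonormal by construction, no separate eigendecomposition is needed to obtain orthogonal indicator vectors.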

The loss function L (θ) of the network is set as follows:

where y_i denotes the low-dimensional embedded feature output by the network for the i-th input image x_i; c_i and c_j denote the degrees of the nodes corresponding to the i-th input image x_i and the j-th input image x_j, respectively.
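The loss formula itself is not reproduced in this text. As a hedged sketch only, the code below assumes a degree-normalized form common in spectral embedding networks, L(θ) = (1/m²) Σ_{i,j} s_{i,j} ||y_i/c_i - y_j/c_j||², with the degree taken as the row sum c_i = Σ_j s_{i,j}. Both the form of the loss and the degree definition are assumptions; the patented formula may differ.

```python
import numpy as np

def spectral_embedding_loss(Y, S):
    """Assumed loss: (1/m^2) * sum_ij s_ij * ||y_i/c_i - y_j/c_j||^2.

    Y: m x k network outputs; S: m x m nonnegative similarity matrix whose
    rows all have a positive sum.
    """
    Y = np.asarray(Y, dtype=float)
    S = np.asarray(S, dtype=float)
    m = Y.shape[0]
    c = S.sum(axis=1)                              # node degrees (assumed row sums)
    Z = Y / c[:, None]                             # degree-normalized embeddings
    sq = (Z ** 2).sum(axis=1)
    dist = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T), 0.0)
    return float((S * dist).sum() / m ** 2)
```

Under this form the loss is small when strongly similar images receive nearby embeddings, which is the behavior the similarity matrix is meant to encourage.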

The spectral embedding network is a 5-layer fully connected network whose layers 1 to 5 have dimensions 1024, 512, 256, 128 and k, respectively; the network output is the low-dimensional embedded feature vector of the input data.
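For illustration, a forward pass through a network with the stated layer widths can be sketched as follows. The ReLU activations, the linear output layer, the He-style random initialization, and the names `init_network` and `forward` are assumptions; the text specifies only the layer dimensions.

```python
import numpy as np

def init_network(input_dim, k, rng=None):
    """Randomly initialize weights for layers of width 1024, 512, 256, 128, k."""
    rng = np.random.default_rng(0) if rng is None else rng
    dims = [input_dim, 1024, 512, 256, 128, k]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / d_in), size=(d_in, d_out)), np.zeros(d_out))
        for d_in, d_out in zip(dims[:-1], dims[1:])
    ]

def forward(params, X):
    """Map a batch of images to m x k embeddings."""
    H = np.asarray(X, dtype=float).reshape(len(X), -1)   # flatten each image
    for W, b in params[:-1]:
        H = np.maximum(H @ W + b, 0.0)                   # hidden layers with ReLU (assumed)
    W, b = params[-1]
    return H @ W + b                                     # linear output layer (assumed)
```

For 28×28 MNIST images and k = 10, `init_network(784, 10)` creates the five weight matrices, and `forward` returns one 10-dimensional vector per image.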

4. Network training

m images are randomly selected from the image data set obtained in step 1 and input into the spectral embedding network, the network loss L is calculated, and the network weights θ are updated through back-propagation until the loss converges (as judged by an early stopping mechanism), yielding a trained network.
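The early stopping rule is not detailed in the text. A common patience-based variant, assumed here for illustration, stops training once the loss has failed to improve by more than `min_delta` for `patience` consecutive checks. The class name and both parameters are hypothetical.

```python
class EarlyStopping:
    """Signal convergence when the loss stops improving (patience-based sketch)."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_steps = 0

    def step(self, loss):
        """Record one loss value; return True when training should stop."""
        if loss < self.best - self.min_delta:
            self.best = loss          # improvement: reset the counter
            self.bad_steps = 0
        else:
            self.bad_steps += 1       # no improvement on this check
        return self.bad_steps >= self.patience
```

In a training loop, `step` would be called with the batch or validation loss after each weight update, and the loop would exit the first time it returns True.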

5. Image clustering

The image data set to be classified is input into the spectral embedding network trained in step 4 to obtain the corresponding low-dimensional embedded vectors, which are then clustered with the K-means method to obtain the final clustering result.
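The final K-means step can be sketched with a plain Lloyd iteration. The initialization from randomly chosen data points, the iteration cap, and the function name `kmeans` are assumptions for the sketch; any standard K-means implementation could be used instead.

```python
import numpy as np

def kmeans(Y, k, n_iter=100, rng=None):
    """Cluster the rows of Y into k groups; return (labels, centers)."""
    rng = np.random.default_rng(0) if rng is None else rng
    Y = np.asarray(Y, dtype=float)
    centers = Y[rng.choice(len(Y), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each embedding to its nearest center
        d = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its members (keep it if empty)
        new_centers = centers.copy()
        for j in range(k):
            members = Y[labels == j]
            if len(members):
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Here Y would be the matrix of embedded vectors produced by the trained network, and `labels` gives the cluster index assigned to each image.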

The effects of the present invention can be further illustrated by the following experiments.

1. Experimental conditions

The invention was simulated in Python with related toolkits on an Ubuntu 20.04 operating system, with an i7-5930K CPU, a TITAN X (Pascal) GPU (12 GB) and 64 GB of memory. The data sets used in the experiments are MNIST and CIFAR-10. MNIST is a data set of handwritten digit images containing 70000 images of size 28×28. CIFAR-10 is a small color image data set for recognizing common objects, containing RGB color images in 10 categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck. Each image has a size of 32×32, each category has 6000 images, and the data set contains 50000 training images and 10000 test images in total.

2. Experimental details

The network is trained with the method of the invention and tested on the test set. Two quantitative indices, accuracy (ACC) and normalized mutual information (NMI), are computed for each method on each data set; the clustering results are shown in Table 1. The comparison methods are spectral clustering (SC), deep embedding clustering (DEC), the cosine heterogeneous model (HOE) and the denoising autoencoder (DAE). DEC is described in J. Xie, R. Girshick and A. Farhadi, "Unsupervised deep embedding for clustering analysis," in International Conference on Machine Learning, 2016, pp. 478-487. HOE is described in X. Peng, H. Zhu, J. Feng, C. Shen, H. Zhang and J. T. Zhou, "Deep clustering with sample-assignment invariance prior," IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 11, pp. 4857-4868, 2019. DAE is described in P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, P.-A. Manzagol and L. Bottou, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," Journal of Machine Learning Research, vol. 11, no. 12, 2010.
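For reference, the two indices can be computed as in the sketch below. ACC is taken as the best accuracy over all one-to-one matchings between clusters and classes, found here by brute force over permutations, which is practical only for small k (an assignment solver would be used for larger k). NMI is normalized by the geometric mean of the two entropies, which is one of several conventions. The text does not state which variants were used in the experiments, so these choices and the function names are assumptions.

```python
import numpy as np
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """Best-match accuracy over one-to-one cluster/class relabelings."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    k = max(len(classes), len(clusters))
    counts = np.zeros((k, k), dtype=int)          # counts[cluster, class]
    for ci, c in enumerate(clusters):
        for ti, t in enumerate(classes):
            counts[ci, ti] = np.sum((y_pred == c) & (y_true == t))
    best = max(sum(counts[i, perm[i]] for i in range(k))
               for perm in permutations(range(k)))
    return best / len(y_true)

def normalized_mutual_information(y_true, y_pred):
    """NMI = I(T; P) / sqrt(H(T) * H(P)), using natural logarithms."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    _, t_idx = np.unique(y_true, return_inverse=True)
    _, p_idx = np.unique(y_pred, return_inverse=True)
    t_idx, p_idx = t_idx.ravel(), p_idx.ravel()
    joint = np.zeros((t_idx.max() + 1, p_idx.max() + 1))
    np.add.at(joint, (t_idx, p_idx), 1.0)         # contingency table
    joint /= len(y_true)
    pt, pp = joint.sum(axis=1), joint.sum(axis=0)
    h_t = -(pt * np.log(pt)).sum()
    h_p = -(pp * np.log(pp)).sum()
    if h_t == 0 or h_p == 0:                      # a partition with one group
        return 1.0 if h_t == h_p else 0.0
    nz = joint > 0
    mi = (joint[nz] * np.log(joint[nz] / np.outer(pt, pp)[nz])).sum()
    return float(mi / np.sqrt(h_t * h_p))
```

Both indices are invariant to how the clusters are numbered, which is why they are suitable for comparing unsupervised methods against ground-truth labels.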

TABLE 1

As Table 1 shows, the method of the invention improves substantially on the comparison methods on both data sets, most notably on CIFAR-10, which indicates its robustness and good generalization on large-scale data sets.

Claims

1. A depth image clustering method based on Laplace rank constraint is characterized by comprising the following steps:

wherein y_i denotes the low-dimensional embedded feature output by the network for the i-th input image x_i; c_i and c_j denote the degrees of the nodes corresponding to the i-th input image x_i and the j-th input image x_j, respectively; s_{i,j} denotes the element in row i and column j of the similarity matrix S; and the similarity matrix S is calculated according to the following formula: