CN111310813A - A subspace clustering method and apparatus for latent low-rank representation - Google Patents
- Publication number
- CN111310813A (application number CN202010082142.2A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- rank
- low
- subspace
- clustering
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2323—Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Algebra (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Discrete Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical Field
The present invention relates to the technical field of pattern recognition computing, and in particular to a subspace clustering method and apparatus based on latent low-rank representation.
Background
With the progress of science and the development of artificial intelligence, pattern recognition has become the process of processing and analyzing the various forms of information that characterize things or phenomena in order to describe, identify, classify, and explain them. Subspace clustering arises in many application domains, such as images, video, and text.
The importance of subspaces naturally leads to the problem of subspace segmentation, whose goal is to partition (or group) the data into clusters, each corresponding to one subspace. The main challenge in subspace segmentation is how to handle the coupling between noise correction and data segmentation effectively. As a subspace segmentation algorithm, latent low-rank representation subspace clustering can be regarded as an enhanced version of low-rank representation that yields more accurate segmentation results. It can also automatically extract salient features from corrupted data, producing effective features for classification, and it has attracted wide attention in the related technical fields. Because latent low-rank representation takes unobserved data into account, it improves clustering performance and addresses the problem of insufficient samples. The related art generally uses the nuclear norm as the regularization term. However, the related art considers only the nuclear norm as an approximation of the rank function. When the singular values of the matrix are large, the definitions of the rank function and the nuclear norm show that the nuclear norm is too loose a relaxation to estimate the rank of the matrix accurately, so clustering performance drops, accuracy is low, and robustness is weak.
Summary of the Invention
To solve the problems in existing subspace clustering of insufficient samples for low-rank representation and of the weak robustness and insufficient performance of latent low-rank representation subspace clustering, the present invention provides a subspace clustering method and apparatus based on latent low-rank representation.
To achieve the above object of the invention, the following technical means are adopted:
A subspace clustering method based on latent low-rank representation, comprising the following steps:
S1. Acquire data and preprocess it to obtain a feature matrix;
S2. Based on the feature matrix, construct the optimization objective function of latent low-rank representation subspace clustering, using the Schatten-p norm as the regularization term in place of the rank function and the lp norm as the constraint function on the error term;
S3. Solve the optimization objective function to obtain a low-rank representation matrix;
S4. Compute an affinity matrix from the low-rank representation matrix;
S5. Segment the affinity matrix with a spectral clustering algorithm, thereby realizing latent low-rank representation subspace clustering of the data.
In the above scheme, latent low-rank representation subspace clustering, which takes unobserved data samples into account, is built on top of low-rank representation subspace clustering. The Schatten-p norm replaces the rank function as the regularization term, turning an NP-hard problem into a tractable one, and the lp norm is introduced to constrain the error term. This addresses the problems of insufficient samples for low-rank representation and of the intractable rank function, strengthens the robustness of latent low-rank representation subspace clustering, and improves its performance.
Preferably, after the feature matrix is obtained in step S1, each feature point in the feature matrix is also normalized. In this preferred scheme, normalization facilitates subsequent data processing.
Preferably, the optimization objective function of latent low-rank representation subspace clustering in step S2 is:
$\min_{Z,L,E} \|Z\|_{S_p}^p + \|L\|_{S_p}^p + \lambda \|E\|_p^p$ s.t. $X = XZ + LX + E$;
where Z is the subspace low-rank representation matrix, L is the subspace sparse representation matrix, X is the feature matrix, E is the reconstruction error matrix, and λ is the hyperparameter that controls the loss penalty; $\|\cdot\|_{S_p}$ is the Schatten-p norm, defined as $\|Z\|_{S_p} = \left(\sum_i \sigma_i^p\right)^{1/p}$ with $\sigma_i$ the i-th singular value of Z; $\|\cdot\|_p$ is the lp norm, defined as $\|E\|_p = \left(\sum_{i,j} |e_{ij}|^p\right)^{1/p}$.
Preferably, step S3 comprises the following steps:
Introduce auxiliary variables J and S into the optimization objective function, with Z = J and L = S:
$\min_{Z,L,E,J,S} \|J\|_{S_p}^p + \|S\|_{S_p}^p + \lambda \|E\|_p^p$ s.t. $X = XZ + LX + E$, $Z = J$, $L = S$
The constraints are then absorbed into an augmented Lagrangian function using the Lagrange multiplier method, and the variables of the augmented Lagrangian function are optimized iteratively by alternating updates until convergence, which yields the low-rank representation matrix.
Preferably, step S5 comprises the following steps:
Compute the degree matrix of the affinity matrix using the following formula:
$D_{i,i} = \sum_j S_{i,j}$, where the degree matrix is a square diagonal matrix, $D_{i,i}$ is the i-th diagonal element of the degree matrix, and $S_{i,j}$ is the element in row i and column j of the affinity matrix;
Compute the normalized Laplacian matrix L as $L = I - D^{-1/2} S D^{-1/2}$:
where D is the degree matrix, S is the affinity matrix, and I is the identity matrix;
Compute the eigenvectors of the Laplacian matrix L and arrange the k eigenvectors associated with the k smallest eigenvalues of L (equivalently, the k largest eigenvalues of $D^{-1/2} S D^{-1/2}$) as the columns of a matrix $X = [x_1, x_2, \ldots, x_k] \in R^{n \times k}$;
Normalize each row vector of this matrix to unit length to obtain the target matrix;
Cluster the rows of the target matrix with the K-means clustering method to obtain K clusters, thereby realizing latent low-rank representation subspace clustering.
The present invention also provides a subspace clustering apparatus based on latent low-rank representation, comprising:
a data preprocessing module, configured to acquire data and preprocess it to obtain a feature matrix;
an optimization objective function construction module, configured to construct, based on the feature matrix, the optimization objective function of latent low-rank representation subspace clustering, using the Schatten-p norm as the regularization term in place of the rank function and the lp norm as the constraint function on the error term;
a subspace representation matrix calculation module, configured to solve the optimization objective function to obtain a low-rank representation matrix;
an affinity matrix calculation module, configured to compute an affinity matrix from the low-rank representation matrix;
a subspace clustering module, configured to segment the affinity matrix with a spectral clustering algorithm, thereby realizing latent low-rank representation subspace clustering of the data.
In the above scheme, the optimization objective function construction module builds latent low-rank representation subspace clustering, which takes unobserved data samples into account, on top of low-rank representation subspace clustering. The Schatten-p norm replaces the rank function as the regularization term, turning an NP-hard problem into a tractable one, and the lp norm is introduced to constrain the error term. This addresses the problems of insufficient samples for low-rank representation and of the intractable rank function, strengthens the robustness of latent low-rank representation subspace clustering, and improves its performance.
Preferably, the data preprocessing module is further configured to normalize each feature point in the feature matrix after the feature matrix is obtained.
Preferably, the optimization objective function of latent low-rank representation subspace clustering constructed by the optimization objective function construction module is:
$\min_{Z,L,E} \|Z\|_{S_p}^p + \|L\|_{S_p}^p + \lambda \|E\|_p^p$ s.t. $X = XZ + LX + E$;
where Z is the subspace low-rank representation matrix, L is the subspace sparse representation matrix, X is the feature matrix, E is the reconstruction error matrix, and λ is the hyperparameter that controls the loss penalty; $\|\cdot\|_{S_p}$ is the Schatten-p norm, defined as $\|Z\|_{S_p} = \left(\sum_i \sigma_i^p\right)^{1/p}$, $0 < p \le \infty$, with $\sigma_i$ the i-th singular value of Z; $\|\cdot\|_p$ is the lp norm, defined as $\|E\|_p = \left(\sum_{i,j} |e_{ij}|^p\right)^{1/p}$.
Preferably, the subspace representation matrix calculation module is further configured to:
Introduce auxiliary variables J and S into the optimization objective function, with Z = J and L = S:
$\min_{Z,L,E,J,S} \|J\|_{S_p}^p + \|S\|_{S_p}^p + \lambda \|E\|_p^p$ s.t. $X = XZ + LX + E$, $Z = J$, $L = S$
The constraints are then absorbed into an augmented Lagrangian function using the Lagrange multiplier method, and the variables of the augmented Lagrangian function are optimized iteratively by alternating updates until convergence, which yields the low-rank representation matrix.
Preferably, the subspace clustering module is further configured to:
Compute the degree matrix of the affinity matrix using the following formula:
$D_{i,i} = \sum_j S_{i,j}$, where the degree matrix is a square diagonal matrix, $D_{i,i}$ is the i-th diagonal element of the degree matrix, and $S_{i,j}$ is the element in row i and column j of the affinity matrix;
Compute the normalized Laplacian matrix L as $L = I - D^{-1/2} S D^{-1/2}$:
where D is the degree matrix, S is the affinity matrix, and I is the identity matrix;
Compute the eigenvectors of the Laplacian matrix and arrange the k eigenvectors associated with the k smallest eigenvalues of L (equivalently, the k largest eigenvalues of $D^{-1/2} S D^{-1/2}$) as the columns of a matrix $X = [x_1, x_2, \ldots, x_k] \in R^{n \times k}$;
Normalize each row vector of this matrix to unit length to obtain the target matrix;
Cluster the rows of the target matrix with the K-means clustering method to obtain K clusters, thereby realizing latent low-rank representation subspace clustering.
Compared with the prior art, the beneficial effects of the technical solution of the present invention are:
Latent low-rank representation subspace clustering can take unobserved data samples into account, which solves the problem of insufficient samples for low-rank representation. The rank function is replaced by the Schatten-p norm, which approximates the rank better than the nuclear norm and turns an NP-hard problem into a tractable one, and the lp norm is introduced to constrain the error term in the optimization objective function of latent low-rank representation subspace clustering. The invention thereby improves the robustness and clustering performance of the algorithm and solves the problems in existing subspace clustering of insufficient samples for low-rank representation and of the weak robustness and insufficient performance of latent low-rank representation subspace clustering.
In addition, the present invention provides a corresponding implementation apparatus for the latent low-rank representation subspace clustering method, which makes the method more practical; the apparatus has the corresponding advantages.
Description of the Drawings
Fig. 1 is a flowchart of the method of the present invention.
Fig. 2 is a block diagram of the apparatus of the present invention.
Detailed Description
The accompanying drawings are for illustration only and shall not be construed as limiting this patent;
To better illustrate the embodiments, some parts in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of the actual product;
Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings.
The technical solution of the present invention is further described below with reference to the accompanying drawings and embodiments.
Embodiment 1
Embodiment 1 provides a subspace clustering method based on latent low-rank representation which, as shown in Fig. 1, comprises the following steps:
S1. Acquire data and preprocess it to obtain a feature matrix;
This preprocessing step may use any means commonly known in the art. For image data, for example, preprocessing consists of normalizing the target image, correcting its gray levels, and removing noise, and then extracting edges, regions, or textures from the target image as experimental features: Gabor features for face image data, HOG features for handwritten data sets. The resulting feature matrix $X = [x_1, x_2, \ldots, x_N] \in R^{D \times N}$ is composed of data vectors; each column vector of the feature matrix is the feature vector of one feature point, where D is the dimension of the feature space and N is the number of feature points;
To facilitate subsequent data processing, each feature point in the feature matrix is normalized:
Here $x'_i$ is the value of the i-th feature point after normalization and $x_i$ is its value before normalization.
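The normalization formula itself is not legible in this text, so the following is only a minimal sketch under the assumption that each feature point (a column of X) is scaled to unit Euclidean length. The function name `normalize_columns` and the guard `eps` are illustrative and are not taken from the source.

```python
import numpy as np

def normalize_columns(X, eps=1e-12):
    """Scale every column of X to unit Euclidean norm (assumed normalization)."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0, keepdims=True)  # one norm per feature point
    return X / np.maximum(norms, eps)                 # eps avoids division by zero

# Two feature points in a 2-dimensional feature space; the second is all zeros.
X = np.array([[3.0, 0.0],
              [4.0, 0.0]])
X_normalized = normalize_columns(X)
```

A different convention, such as min-max scaling, would fit the surrounding description equally well; only the column-wise layout of the feature matrix is taken from the text.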
S2. Based on the feature matrix, construct the optimization objective function of latent low-rank representation subspace clustering, using the Schatten-p norm as the regularization term in place of the rank function and the lp norm as the constraint function on the error term:
In this step, the objective function of latent low-rank representation subspace clustering is first formulated as a rank minimization problem:
s.t. $X = [X_0, X_H] Z + E$;
where rank(·) denotes the rank of a matrix, Z is the subspace low-rank representation matrix, X is the feature matrix, $X_0$ is the matrix of observed data samples, $X_H$ is the matrix of unobserved data samples, E is the reconstruction error matrix, and λ is the hyperparameter that controls the loss penalty;
The nuclear norm is commonly used as the best convex approximation of the rank function in order to obtain a low-rank matrix. To take full account of the observed data, that is, to find the optimal low-rank representation matrix, this embodiment instead uses the Schatten-p norm of the matrix as the regularization term that estimates the rank function and the lp norm as the constraint function on the error term, and combines them with latent low-rank representation to construct the objective function;
That is, in this step the optimization objective function of latent low-rank representation subspace clustering is constructed as:
$\min_{Z,L,E} \|Z\|_{S_p}^p + \|L\|_{S_p}^p + \lambda \|E\|_p^p$ s.t. $X = XZ + LX + E$;
where Z is the subspace low-rank representation matrix, L is the subspace sparse representation matrix, X is the feature matrix, and λ is the hyperparameter that controls the loss penalty; $\|\cdot\|_{S_p}$ is the Schatten-p norm, defined as $\|Z\|_{S_p} = \left(\sum_i \sigma_i^p\right)^{1/p}$, $0 < p \le \infty$, with $\sigma_i$ the i-th singular value of Z, and is used to enforce the low rank of the matrix. The rank is the number of nonzero singular values of a matrix; it is non-convex and discontinuous, so rank minimization is an NP-hard problem. The Schatten-p norm is a continuous surrogate for the rank (convex for p ≥ 1, and for 0 < p < 1 a non-convex quasi-norm that follows the rank more closely than the nuclear norm), so minimizing the Schatten-p norm approximately enforces the low-rank constraint. $\|\cdot\|_p$ is the lp norm, defined as $\|E\|_p = \left(\sum_{i,j} |e_{ij}|^p\right)^{1/p}$.
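To make the two norms concrete, the sketch below evaluates them with NumPy using their standard definitions in terms of singular values and matrix entries. The helper names `schatten_p_power`, `schatten_p_norm`, and `lp_norm` are illustrative and are not part of the source.

```python
import numpy as np

def schatten_p_power(A, p):
    """Sum of the p-th powers of the singular values of A."""
    sigma = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(sigma ** p))

def schatten_p_norm(A, p):
    """Schatten-p (quasi-)norm: (sum_i sigma_i^p)^(1/p)."""
    return schatten_p_power(A, p) ** (1.0 / p)

def lp_norm(E, p):
    """Entrywise lp (quasi-)norm: (sum_ij |e_ij|^p)^(1/p)."""
    return float(np.sum(np.abs(E) ** p) ** (1.0 / p))

# A rank-2 matrix with one large and one small singular value.
A = np.diag([100.0, 1.0])
rank_A = np.linalg.matrix_rank(A)        # 2
nuclear_A = schatten_p_norm(A, 1.0)      # p = 1 gives the nuclear norm, 101
surrogate_A = schatten_p_power(A, 0.1)   # about 2.58, much closer to the rank
```

The example illustrates the motivation given in the background: with a large singular value, the nuclear norm (101) overshoots the rank (2) badly, while the sum of singular values raised to a small power p stays close to it.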
S3. Solve the optimization objective function to obtain a low-rank representation matrix;
In this embodiment, the optimization objective function is converted into an equivalent problem that is easier to solve by introducing auxiliary variables J and S, with Z = J and L = S:
$\min_{Z,L,E,J,S} \|J\|_{S_p}^p + \|S\|_{S_p}^p + \lambda \|E\|_p^p$ s.t. $X = XZ + LX + E$, $Z = J$, $L = S$
The constraints are then absorbed into an augmented Lagrangian function using the Lagrange multiplier method, and the variables of the augmented Lagrangian function are optimized iteratively by alternating updates until convergence, which yields the low-rank representation matrix. Solving the augmented Lagrangian function comprises the following steps:
A1. Set the parameters and initialize Z = J = 0, L = S = 0, E = 0, $Y_1 = 0$, $Y_2 = 0$, $Y_3 = 0$, $\mu = 10^{-6}$, $\mu_{max} = 10^{6}$, $\rho = 1.1$, and $\varepsilon = 10^{-6}$, where $Y_1$, $Y_2$, $Y_3$ are the Lagrange multipliers, μ and $\mu_{max}$ are the penalty parameter and its upper bound, ρ is the update factor of the penalty parameter, and ε is the convergence threshold;
A2. Update J: $J = \arg\min_J \frac{1}{\mu}\|J\|_{S_p}^p + \frac{1}{2}\|J - G\|_F^2$, where $G = Z + Y_2/\mu$
Specifically, the optimal solution of J is $J^* = Q_G \Delta V_G^T$, where $Q_G$ and $V_G$ contain the left and right singular vectors of G, respectively, and Δ is a diagonal matrix whose entries are obtained from the following formula:
$\delta_i = \arg\min_{x \ge 0} \frac{1}{2}(x - \sigma_i)^2 + \frac{1}{\mu} x^p$, where $\delta_i$ and $\sigma_i$ are the i-th singular values of J and G, respectively. This scalar problem is solved as follows:
Define the threshold constant $v_1$. The optimal solution $x^*$ falls into two cases:
1) when $\sigma_i \le v_1$, $x^* = 0$; 2) when $\sigma_i > v_1$, $x^*$ is obtained by iterating $x^{(t+1)} = \sigma_i - \lambda p (x^{(t)})^{p-1}$, where λ here denotes the weight of the $x^p$ term in the scalar problem (1/μ for the J update);
The diagonal matrix Δ is constructed from the solutions $x^*$, which finally gives the optimal solution $J^* = Q_G \Delta V_G^T$.
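The J update can be sketched as a thresholding of the singular values of G. The constants in the source are not legible, so the threshold below is an assumption derived from elementary calculus: for $h(x) = \frac{1}{2}(x - \sigma)^2 + \lambda x^p$ with 0 < p ≤ 1, the point $v = (\lambda p (1 - p))^{1/(2-p)}$ is where $h''$ vanishes and $v_1 = v + \lambda p v^{p-1}$ is the smallest σ for which $h$ has a stationary point. The final comparison with $h(0)$ is an extra safeguard that the source does not mention. The names `prox_power` and `schatten_p_prox` are illustrative.

```python
import numpy as np

def prox_power(sigma, lam, p, n_iter=50):
    """Elementwise minimizer of 0.5*(x - sigma)**2 + lam*x**p over x >= 0."""
    assert lam > 0.0 and 0.0 < p <= 1.0
    sigma = np.atleast_1d(np.asarray(sigma, dtype=float))
    v = (lam * p * (1.0 - p)) ** (1.0 / (2.0 - p))   # inflection point of h
    v1 = v + lam * p * v ** (p - 1.0)                # assumed threshold
    x = np.zeros_like(sigma)
    active = sigma > v1                              # below the threshold x* = 0
    s = sigma[active]
    xa = s.copy()
    for _ in range(n_iter):                          # fixed-point iteration from the text
        xa = s - lam * p * xa ** (p - 1.0)
    # Keep the stationary point only if it beats x = 0 (safeguard, not in the source).
    better = 0.5 * (xa - s) ** 2 + lam * xa ** p < 0.5 * s ** 2
    x[active] = np.where(better, xa, 0.0)
    return x

def schatten_p_prox(G, lam, p):
    """Minimizer of lam*||J||_Sp^p + 0.5*||J - G||_F^2 via the SVD of G."""
    Q, sigma, Vt = np.linalg.svd(G, full_matrices=False)
    delta = prox_power(sigma, lam, p)                # shrunken singular values
    return (Q * delta) @ Vt                          # Q diag(delta) V^T

# With p = 1 the update reduces to ordinary singular value soft-thresholding.
J = schatten_p_prox(np.diag([3.0, 0.5]), lam=1.0, p=1.0)
x_half = prox_power([3.0, 0.2], lam=1.0, p=0.5)
```

In the J update the weight `lam` would be 1/μ. The same scalar routine, applied to the absolute values of the entries of a matrix and multiplied by their signs, gives an lp shrinkage of the kind used for the error term.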
A3. Update S: $S = \arg\min_S \frac{1}{\mu}\|S\|_{S_p}^p + \frac{1}{2}\|S - M\|_F^2$, where $M = L + Y_3/\mu$
Specifically, this is solved in the same way as in A2; the optimal solution is $S^* = Q_M \Delta V_M^T$, where $Q_M$ and $V_M$ contain the left and right singular vectors of M, respectively.
A4. Update Z: $Z = (I + X^T X)^{-1}\left(X^T(X - LX - E) + J + (X^T Y_1 - Y_2)/\mu\right)$
A5. Update L: $L = \left((X - XZ - E)X^T + S + (Y_1 X^T - Y_3)/\mu\right)(I + X X^T)^{-1}$
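The two closed-form updates A4 and A5 are linear systems, so they can be sketched with `numpy.linalg.solve` instead of an explicit inverse. The function names and argument order are illustrative; the formulas follow A4 and A5 with the multiplier terms grouped as $(X^T Y_1 - Y_2)/\mu$ and $(Y_1 X^T - Y_3)/\mu$, which is what setting the gradient of the augmented Lagrangian to zero gives.

```python
import numpy as np

def update_Z(X, L, E, J, Y1, Y2, mu):
    """A4: Z = (I + X^T X)^{-1} (X^T (X - L X - E) + J + (X^T Y1 - Y2) / mu)."""
    n = X.shape[1]
    rhs = X.T @ (X - L @ X - E) + J + (X.T @ Y1 - Y2) / mu
    return np.linalg.solve(np.eye(n) + X.T @ X, rhs)

def update_L(X, Z, E, S, Y1, Y3, mu):
    """A5: L = ((X - X Z - E) X^T + S + (Y1 X^T - Y3) / mu) (I + X X^T)^{-1}."""
    d = X.shape[0]
    rhs = (X - X @ Z - E) @ X.T + S + (Y1 @ X.T - Y3) / mu
    # Right-multiplying by the inverse of a symmetric matrix A equals solve(A, rhs^T)^T.
    return np.linalg.solve(np.eye(d) + X @ X.T, rhs.T).T

# Random data just to check that both updates zero the corresponding gradients.
rng = np.random.default_rng(0)
d, n, mu = 4, 6, 0.5
X, E, Y1 = (rng.standard_normal((d, n)) for _ in range(3))
L0, S, Y3 = (rng.standard_normal((d, d)) for _ in range(3))
J, Y2 = (rng.standard_normal((n, n)) for _ in range(2))

Z = update_Z(X, L0, E, J, Y1, Y2, mu)
grad_Z = -X.T @ Y1 + Y2 - mu * X.T @ (X - X @ Z - L0 @ X - E) + mu * (Z - J)
L1 = update_L(X, Z, E, S, Y1, Y3, mu)
grad_L = -Y1 @ X.T + Y3 - mu * (X - X @ Z - L1 @ X - E) @ X.T + mu * (L1 - S)
```

Both `grad_Z` and `grad_L` vanish up to rounding error, which confirms that the formulas are the stationary points of the quadratic subproblems.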
A6. Update E: $E = \arg\min_E \frac{\lambda}{\mu}\|E\|_p^p + \frac{1}{2}\|E - N\|_F^2$, where $N = X - XZ - LX + Y_1/\mu$
Specifically, this is solved in the same way as in A2 and A3, applied to each entry of N, where the optimal solution $x^*$ falls into three cases: 1) when the input value is less than $v_1$, $x^* = 0$; 2) when it equals $v_1$, $x^* = \upsilon$; 3) when it is greater than $v_1$, $x^*$ is obtained by iterating $x^{(t+1)} = \sigma_i - \lambda p (x^{(t)})^{p-1}$, with $\sigma_i$ the input value;
A7. Update the multipliers: $Y_1 = Y_1 + \mu(X - XZ - LX - E)$, $Y_2 = Y_2 + \mu(Z - J)$, $Y_3 = Y_3 + \mu(L - S)$
A8. Update the penalty parameter: $\mu = \min(\rho\mu, \mu_{max})$.
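Steps A7 and A8 can be sketched in a few lines. The stopping rule is an assumption: the text names a convergence threshold ε but does not state what is compared against it, so this sketch uses the largest absolute entry of the three constraint residuals, a common choice for this kind of alternating scheme. The function name `dual_update` is illustrative.

```python
import numpy as np

def dual_update(X, Z, L, E, J, S, Y1, Y2, Y3, mu,
                rho=1.1, mu_max=1e6, eps=1e-6):
    """A7/A8: update multipliers and penalty; report convergence (assumed rule)."""
    R1 = X - X @ Z - L @ X - E       # residual of X = XZ + LX + E
    R2 = Z - J                       # residual of Z = J
    R3 = L - S                       # residual of L = S
    Y1 = Y1 + mu * R1
    Y2 = Y2 + mu * R2
    Y3 = Y3 + mu * R3
    residual = max(np.abs(R1).max(), np.abs(R2).max(), np.abs(R3).max())
    mu = min(rho * mu, mu_max)       # grow the penalty, capped at mu_max
    return Y1, Y2, Y3, mu, bool(residual < eps)

# First iteration from the all-zero initialization of A1 on a toy 2x2 problem.
X = np.eye(2)
zero = np.zeros((2, 2))
Y1, Y2, Y3, mu, converged = dual_update(X, zero, zero, zero, zero, zero,
                                        zero, zero, zero, mu=1e-6)
# If the error term absorbs all of X, every constraint holds exactly.
done = dual_update(X, zero, zero, X, zero, zero, zero, zero, zero, mu=1e-6)[4]
```

In a full solver this function would be called at the end of every pass through A2 to A6, and the loop would stop once the returned flag is true.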
It should be noted that other methods may also be used in this step to solve the optimization problem; this embodiment gives only one such method as an example.
S4. Compute an affinity matrix from the low-rank representation matrix;
The affinity matrix S may be computed as $S = \|Z^T Z\|_2$, where Z is the representation matrix.
It should be noted that, besides the above method, other methods may also be used to compute the affinity matrix in this step; this embodiment gives only one such method as an example, and the present application places no limitation on it.
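Since the text allows other ways of building the affinity matrix, the sketch below uses a symmetrization that is common in the low-rank representation literature, $W = (|Z| + |Z|^T)/2$. This is a swapped-in alternative stated as an assumption, not the formula given above, and the name `affinity_from_representation` is illustrative.

```python
import numpy as np

def affinity_from_representation(Z):
    """Symmetric, nonnegative affinity (|Z| + |Z|^T) / 2 (assumed alternative)."""
    A = np.abs(np.asarray(Z, dtype=float))
    return (A + A.T) / 2.0

Z = np.array([[0.0, -2.0],
              [1.0,  0.0]])
W = affinity_from_representation(Z)
```

Whatever construction is chosen, the spectral clustering step that follows expects the affinity matrix to be symmetric with nonnegative entries, which this form guarantees.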
S5. Segment the affinity matrix with a spectral clustering algorithm, thereby realizing latent low-rank representation subspace clustering of the data:
Compute the degree matrix of the affinity matrix using the following formula:
$D_{i,i} = \sum_j S_{i,j}$, where the degree matrix is a square diagonal matrix, $D_{i,i}$ is the i-th diagonal element of the degree matrix, and $S_{i,j}$ is the element in row i and column j of the affinity matrix;
Compute the normalized Laplacian matrix L as $L = I - D^{-1/2} S D^{-1/2}$:
where D is the degree matrix, S is the affinity matrix, and I is the identity matrix;
Compute the eigenvectors of the Laplacian matrix L and arrange the k eigenvectors associated with the k smallest eigenvalues of L (equivalently, the k largest eigenvalues of $D^{-1/2} S D^{-1/2}$) as the columns of a matrix $X = [x_1, x_2, \ldots, x_k] \in R^{n \times k}$;
Normalize each row vector of this matrix to unit length to obtain the target matrix;
Cluster the rows of the target matrix with the K-means clustering method to obtain K clusters, thereby realizing latent low-rank representation subspace clustering.
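The spectral clustering steps of S5 can be sketched end to end as follows. The sketch assumes the normalized Laplacian $I - D^{-1/2} S D^{-1/2}$, for which the useful eigenvectors are those with the smallest eigenvalues (they are the same vectors as the top eigenvectors of $D^{-1/2} S D^{-1/2}$). The small K-means routine with farthest-point initialization is a stand-in chosen to keep the example dependency-free; the function name and its parameters are illustrative.

```python
import numpy as np

def spectral_clustering(S, k, n_iter=100):
    """Cluster n points from an n x n affinity matrix S into k groups."""
    S = np.asarray(S, dtype=float)
    deg = S.sum(axis=1)                                   # D_ii = sum_j S_ij
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    lap = (lap + lap.T) / 2.0                             # enforce exact symmetry
    _, vecs = np.linalg.eigh(lap)                         # eigenvalues ascending
    U = vecs[:, :k]                                       # k smallest eigenvalues
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # K-means on the rows of U, farthest-point initialization (deterministic).
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    labels = np.zeros(len(U), dtype=int)
    for _ in range(n_iter):
        d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Two disconnected groups: points 0-2 and points 3-4.
S = np.zeros((5, 5))
S[:3, :3] = 1.0
S[3:, 3:] = 1.0
labels = spectral_clustering(S, k=2)
```

On this block-diagonal affinity the first three points receive one label and the last two the other. For real data a library K-means with several random restarts would usually be preferable to the minimal routine used here.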
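Embodiment 2
Embodiment 2 provides a corresponding implementation apparatus for the latent low-rank representation subspace clustering method of Embodiment 1, which makes the method more practical. The subspace clustering apparatus based on latent low-rank representation provided by this embodiment is introduced below; the apparatus described below and the method described above may be read with reference to each other.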
如图2所示,该装置包括:As shown in Figure 2, the device includes:
数据预处理模块,用于获取数据并对其进行预处理,得到特征矩阵,对特征矩阵中的各特征点进行归一化处理;The data preprocessing module is used to obtain and preprocess the data to obtain a feature matrix, and normalize each feature point in the feature matrix;
优化目标函数构建模块,基于特征矩阵构建潜在低秩表示子空间聚类的目标函数,并利用Schatten-p范数作为正则项以代替秩函数、利用lp范数作为误差项的约束函数,从而构造出潜在低秩表示子空间聚类的优化目标函数;The optimization objective function building module is based on the feature matrix to construct a potential low-rank objective function representing subspace clustering, and the Schatten- p norm is used as the regular term to replace the rank function, and the lp norm is used as the constraint function of the error term, so that Constructing an optimization objective function that represents subspace clustering with potential low rank;
子空间表示矩阵计算模块,用于求解所述优化目标函数得到低秩表示矩阵;a subspace representation matrix calculation module, used for solving the optimization objective function to obtain a low-rank representation matrix;
亲和矩阵计算模块,用于基于所述低秩表示矩阵计算得到亲和矩阵;an affinity matrix calculation module for calculating an affinity matrix based on the low-rank representation matrix;
子空间聚类模块,用于利用谱聚类算法对所述亲和矩阵进行计算分割,实现所述数据的潜在低秩表示子空间聚类。The subspace clustering module is used to calculate and segment the affinity matrix by using the spectral clustering algorithm, so as to realize the subspace clustering of the potential low-rank representation of the data.
本发明实施例所提供的潜在低秩表示的子空间聚类装置的各功能模块的功能可根据上述方法实施例1中的方法具体实现,其具体实现过程可以参照上述方法实施例1的相关描述,此处不再赘述。The functions of each functional module of the subspace clustering apparatus for potential low-rank representation provided in this embodiment of the present invention may be specifically implemented according to the method in the foregoing method embodiment 1, and the specific implementation process may refer to the relevant description of the foregoing method embodiment 1 , and will not be repeated here.
The terms describing positional relationships in the drawings are for illustration only and shall not be construed as limiting this patent.
Obviously, the above embodiments are merely examples given to illustrate the present invention clearly, and do not limit its implementations. Those of ordinary skill in the art can make other changes or modifications in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010082142.2A CN111310813A (en) | 2020-02-07 | 2020-02-07 | A subspace clustering method and apparatus for latent low-rank representation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010082142.2A CN111310813A (en) | 2020-02-07 | 2020-02-07 | A subspace clustering method and apparatus for latent low-rank representation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111310813A (en) | 2020-06-19 |
Family
ID=71146932
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010082142.2A Pending CN111310813A (en) | 2020-02-07 | 2020-02-07 | A subspace clustering method and apparatus for latent low-rank representation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111310813A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111813982A (en) * | 2020-07-23 | 2020-10-23 | 中原工学院 | Data processing method and device for subspace clustering algorithm based on spectral clustering |
CN113420464A (en) * | 2021-07-22 | 2021-09-21 | 西南交通大学 | Aisle arrangement method considering robustness |
CN113627467A (en) * | 2021-07-01 | 2021-11-09 | 杭州电子科技大学 | Image clustering method based on non-convex approximation low-rank subspace |
WO2023065525A1 (en) * | 2021-10-22 | 2023-04-27 | 西安闻泰信息技术有限公司 | Object feature matrix determination method and apparatus, device, and storage medium |
CN116310462A (en) * | 2023-05-19 | 2023-06-23 | 浙江财经大学 | Image clustering method and device based on rank constraint self-expression |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103400143A (en) * | 2013-07-12 | 2013-11-20 | 中国科学院自动化研究所 | Data subspace clustering method based on multiple view angles |
CN106408530A (en) * | 2016-09-07 | 2017-02-15 | 厦门大学 | Sparse and low-rank matrix approximation-based hyperspectral image restoration method |
CN108460412A (en) * | 2018-02-11 | 2018-08-28 | 北京盛安同力科技开发有限公司 | A kind of image classification method based on subspace joint sparse low-rank Structure learning |
CN109685155A (en) * | 2018-12-29 | 2019-04-26 | 广东工业大学 | Subspace clustering method, device, equipment and storage medium based on multiple view |
CN110378365A (en) * | 2019-06-03 | 2019-10-25 | 广东工业大学 | A kind of multiple view Subspace clustering method based on joint sub-space learning |
Non-Patent Citations (3)
Title |
---|
FEIPING NIE ET AL: "Joint Schatten p-norm and lp-norm robust matrix completion for missing value recovery", 《KNOWLEDGE INFORMATION SYSTEMS》 * |
SONG YU ET AL: "Subspace clustering based on latent low rank representation with Frobenius norm minimization", 《NEUROCOMPUTING》 * |
李凯鑫: "基于低秩的子空间聚类算法", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111813982A (en) * | 2020-07-23 | 2020-10-23 | 中原工学院 | Data processing method and device for subspace clustering algorithm based on spectral clustering |
CN113627467A (en) * | 2021-07-01 | 2021-11-09 | 杭州电子科技大学 | Image clustering method based on non-convex approximation low-rank subspace |
CN113420464A (en) * | 2021-07-22 | 2021-09-21 | 西南交通大学 | Aisle arrangement method considering robustness |
CN113420464B (en) * | 2021-07-22 | 2022-04-19 | 西南交通大学 | Aisle arrangement method considering robustness |
WO2023065525A1 (en) * | 2021-10-22 | 2023-04-27 | 西安闻泰信息技术有限公司 | Object feature matrix determination method and apparatus, device, and storage medium |
CN116310462A (en) * | 2023-05-19 | 2023-06-23 | 浙江财经大学 | Image clustering method and device based on rank constraint self-expression |
CN116310462B (en) * | 2023-05-19 | 2023-08-11 | 浙江财经大学 | Image clustering method and device based on rank constraint self-expression |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Fang et al. | Robust latent subspace learning for image classification | |
CN111310813A (en) | A subspace clustering method and apparatus for latent low-rank representation | |
Zafar et al. | Face recognition with Bayesian convolutional networks for robust surveillance systems | |
Gao et al. | Angle 2DPCA: A new formulation for 2DPCA | |
Patel et al. | Kernel sparse subspace clustering | |
Zhang et al. | Learning structured low-rank representations for image classification | |
Li et al. | Learning low-rank and discriminative dictionary for image classification | |
Ma et al. | Sparse representation for face recognition based on discriminative low-rank dictionary learning | |
EP3084682B1 (en) | System and method for identifying faces in unconstrained media | |
Yi et al. | Unified sparse subspace learning via self-contained regression | |
Lu et al. | Low-rank 2-D neighborhood preserving projection for enhanced robust image representation | |
Li et al. | Mutual component analysis for heterogeneous face recognition | |
CN108681725A (en) | A kind of weighting sparse representation face identification method | |
CN112115881B (en) | Image feature extraction method based on robust identification feature learning | |
Yang et al. | Cross-domain visual representations via unsupervised graph alignment | |
CN108564061B (en) | Image identification method and system based on two-dimensional pivot analysis | |
Lu et al. | Nuclear norm-based 2DLPP for image classification | |
Cheng et al. | A minimax framework for classification with applications to images and high dimensional data | |
Zhang et al. | Noise modeling and representation based classification methods for face recognition | |
Abbad et al. | Application of MEEMD in post‐processing of dimensionality reduction methods for face recognition | |
Zhang et al. | Optimal discriminative projection for sparse representation-based classification via bilevel optimization | |
Chen et al. | Semi-supervised dictionary learning with label propagation for image classification | |
Sharma et al. | Pose‐invariant face recognition using curvelet neural network | |
Huang et al. | Locality-regularized linear regression discriminant analysis for feature extraction | |
Liu et al. | Bilaterally normalized scale-consistent sinkhorn distance for few-shot image classification |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200619 |