CN111310813A

CN111310813A - A subspace clustering method and apparatus for latent low-rank representation

Info

Publication number: CN111310813A
Application number: CN202010082142.2A
Authority: CN
Inventors: 曹江中; 符益兰; 戴青云
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2020-02-07
Filing date: 2020-02-07
Publication date: 2020-06-19

Abstract

The invention discloses a subspace clustering method and apparatus based on latent low-rank representation. Data are acquired and preprocessed to obtain a feature matrix. Subspace clustering is formulated with a latent low-rank representation that takes unobserved data samples into account, and the Schatten-p norm is used as the regularization term in place of the rank function, converting an NP-hard problem into a solvable one. The l_p norm is introduced to constrain the error term, giving the optimization objective function of latent low-rank representation subspace clustering. The objective function is then solved to obtain a low-rank representation matrix; an affinity matrix is computed from the low-rank representation matrix; and the affinity matrix is partitioned with a spectral clustering algorithm to achieve latent low-rank representation subspace clustering of the data. The method addresses the problems of insufficient samples in low-rank representation and the intractability of the rank function, enhances the robustness of latent low-rank representation subspace clustering, and improves its performance.

Description

A subspace clustering method and apparatus for latent low-rank representation

Technical Field

The present invention relates to the technical field of pattern recognition computing, and in particular to a subspace clustering method and apparatus for latent low-rank representation.

Background

With the progress of science and the development of artificial intelligence, pattern recognition has come to denote the processing and analysis of the various forms of information that characterize things or phenomena, in order to describe, identify, classify, and explain them. Subspace clustering appears in many application domains, such as images, video, and text.

The importance of subspaces naturally leads to the problem of subspace segmentation, whose goal is to partition (or group) the data into clusters, each corresponding to a subspace. The main challenge in subspace segmentation is how to handle the coupling between noise correction and data segmentation effectively. Latent low-rank representation subspace clustering is a subspace segmentation algorithm that can be regarded as an enhanced version of low-rank representation and yields more accurate segmentation results. It can also automatically extract salient features from corrupted data, producing effective features for classification, and has attracted wide attention in the related technical fields. Because latent low-rank representation takes the unobserved data into account, it improves clustering performance. To address the problem of insufficient samples, the related art generally uses the nuclear norm as the regularization term. However, the related art considers only the nuclear norm as an approximation of the rank function. When the singular values of the matrix are large, it follows from the definitions of the rank function and the nuclear norm that the latter is too loose a relaxation to estimate the rank of the matrix accurately, which lowers clustering performance, accuracy, and robustness.

Summary of the Invention

To solve the problems in existing subspace clustering of insufficient samples in low-rank representation, and of the weak robustness and inadequate performance of latent low-rank representation subspace clustering, the present invention provides a subspace clustering method and apparatus for latent low-rank representation.

To achieve the above object, the following technical means are adopted:

A subspace clustering method for latent low-rank representation, comprising the following steps:

S1. Acquire data and preprocess it to obtain a feature matrix;

S2. Based on the feature matrix, use the Schatten-p norm as the regularization term in place of the rank function, and use the l_p norm as the constraint function of the error term, to construct the optimization objective function of latent low-rank representation subspace clustering;

S3. Solve the optimization objective function to obtain a low-rank representation matrix;

S4. Calculate an affinity matrix based on the low-rank representation matrix;

S5. Partition the affinity matrix with a spectral clustering algorithm to achieve latent low-rank representation subspace clustering of the data.

In the above scheme, building on low-rank representation subspace clustering, latent low-rank representation subspace clustering is used to take unobserved data samples into account; the Schatten-p norm is used as the regularization term in place of the rank function, converting an NP-hard problem into a solvable one; and the l_p norm is introduced to constrain the error term. This addresses the problems of insufficient samples in low-rank representation and the intractability of the rank function, enhances the robustness of latent low-rank representation subspace clustering, and improves its performance.

Preferably, after the feature matrix is obtained in step S1, each feature point in the feature matrix is further normalized. In this preferred scheme, normalization facilitates subsequent data processing.

Preferably, the optimization objective function of latent low-rank representation subspace clustering in step S2 is specifically:

$$\min_{Z,L,E}\ \|Z\|_{S_p}^{p} + \|L\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E;$$

where Z is the subspace low-rank representation matrix, L is the subspace sparse representation matrix, X is the feature matrix, E is the reconstruction error matrix, and λ is the hyperparameter that controls the loss penalty;

$\|\cdot\|_{S_p}$ is the Schatten-p norm, defined as $\|Z\|_{S_p} = \left(\sum_{i} \sigma_i^{p}\right)^{1/p}$, where $\sigma_i$ is the i-th singular value of Z;

$\|\cdot\|_{p}$ is the l_p norm, defined as $\|E\|_{p} = \left(\sum_{i,j} |e_{ij}|^{p}\right)^{1/p}$.
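The two norms can be evaluated numerically as in the following minimal sketch. It assumes NumPy; the function names `schatten_p_norm` and `lp_norm` are hypothetical and chosen only for illustration. For p = 1 the Schatten-p norm reduces to the nuclear norm, and for p = 2 to the Frobenius norm.

```python
import numpy as np

def schatten_p_norm(Z, p):
    """Schatten-p (quasi-)norm: the l_p norm of the singular values of Z."""
    sigma = np.linalg.svd(Z, compute_uv=False)   # singular values only
    return float(np.sum(sigma ** p) ** (1.0 / p))

def lp_norm(E, p):
    """Entrywise l_p (quasi-)norm of the matrix E."""
    return float(np.sum(np.abs(E) ** p) ** (1.0 / p))

A = np.diag([3.0, 4.0])                  # singular values are 4 and 3
nuclear = schatten_p_norm(A, 1)          # sum of singular values
frobenius = schatten_p_norm(A, 2)        # square root of the sum of squares
```

With 0 < p < 1 both functions are quasi-norms rather than norms, which is the regime in which the Schatten-p term penalizes large singular values less heavily than the nuclear norm does.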

Preferably, step S3 specifically comprises the following steps:

Introduce auxiliary variables J and S into the optimization objective function, where Z = J and L = S:

$$\min_{Z,L,E,J,S}\ \|J\|_{S_p}^{p} + \|S\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E,\ Z = J,\ L = S$$

The constraints are then incorporated through the Lagrange multiplier method to form an augmented Lagrangian function, and the variables in the augmented Lagrangian function are optimized iteratively by an alternating method until convergence, yielding the low-rank representation matrix.
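For orientation, one plausible form of the augmented Lagrangian function is sketched below. It is a reconstruction under stated assumptions, not a formula reproduced from the patent: the latent term is written as LX, the form used by the update formulas of Embodiment 1; Y_1, Y_2, Y_3 denote the Lagrange multipliers and μ the penalty parameter named there; ⟨A, B⟩ = tr(AᵀB); and ‖·‖_F is the Frobenius norm.

```latex
\mathcal{L}_{\mu}(Z, L, E, J, S)
  = \|J\|_{S_p}^{p} + \|S\|_{S_p}^{p} + \lambda \|E\|_{p}^{p}
  + \langle Y_1,\, X - XZ - LX - E \rangle
  + \langle Y_2,\, Z - J \rangle
  + \langle Y_3,\, L - S \rangle
  + \frac{\mu}{2} \left( \|X - XZ - LX - E\|_F^{2}
  + \|Z - J\|_F^{2} + \|L - S\|_F^{2} \right)
```

Setting the gradient of this expression with respect to Z or L to zero gives linear systems of the kind solved in the closed-form updates of Embodiment 1, while the J, S, and E subproblems each reduce to a proximal problem with a quadratic term.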

Preferably, step S5 specifically comprises the following steps:

Calculate the degree matrix of the affinity matrix using the following formula:

$$D_{i,i} = \sum_{j} S_{i,j}$$

where the degree matrix is a square matrix, D_{i,i} is the diagonal element in the i-th row of the degree matrix, and S_{i,j} is the element in the i-th row and j-th column of the affinity matrix;

Compute the normalized Laplacian matrix L, where D is the degree matrix, S is the affinity matrix, and I is the identity matrix;

Compute the eigenvectors of the Laplacian matrix L, take the k eigenvectors with the largest eigenvalues, and arrange them as the columns of the matrix X = [x_1, x_2, …, x_k] ∈ R^{n×k};

Normalize each row vector of this matrix to a unit vector to obtain the target matrix;

Cluster the rows of the target matrix with the K-means clustering method to obtain K clusters, thereby achieving latent low-rank representation subspace clustering.
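The spectral clustering steps can be sketched as follows, assuming NumPy. Several choices here are assumptions rather than the patent's formulas: the sketch takes the k eigenvectors of the normalized affinity D^{-1/2} S D^{-1/2} with the largest eigenvalues (these are the same vectors as the k eigenvectors of I − D^{-1/2} S D^{-1/2} with the smallest eigenvalues), and K-means is a plain Lloyd iteration with a deterministic farthest-point initialization. The function name `spectral_clustering` is hypothetical.

```python
import numpy as np

def spectral_clustering(S, k, n_iter=100):
    """Cluster n points given an n x n symmetric, nonnegative affinity matrix S."""
    d = S.sum(axis=1)                                   # degrees D_ii = sum_j S_ij
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    A = d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]   # D^{-1/2} S D^{-1/2}
    _, vecs = np.linalg.eigh(A)                         # eigenvalues in ascending order
    X = vecs[:, -k:]                                    # k eigenvectors with largest eigenvalues
    # Turn every row into a unit vector to obtain the target matrix
    T = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)

    # K-means: deterministic farthest-point initialization of the centers
    centers = [T[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((T - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(T[np.argmax(dist)])
    centers = np.array(centers)

    # K-means: alternate assignment and center update until the centers stop moving
    for _ in range(n_iter):
        sq = ((T[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new_centers = np.array([
            T[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Two groups with strong within-group and weak between-group affinity
S_demo = np.array([[1.00, 0.90, 0.01, 0.01],
                   [0.90, 1.00, 0.01, 0.01],
                   [0.01, 0.01, 1.00, 0.80],
                   [0.01, 0.01, 0.80, 1.00]])
labels_demo = spectral_clustering(S_demo, k=2)
```

The row normalization places points of the same subspace close together on the unit sphere, which is why a simple K-means pass is sufficient at the end.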

The present invention also provides a subspace clustering apparatus for latent low-rank representation, comprising:

a data preprocessing module, configured to acquire data and preprocess it to obtain a feature matrix;

an optimization objective function building module, configured to use, based on the feature matrix, the Schatten-p norm as the regularization term in place of the rank function and the l_p norm as the constraint function of the error term, to construct the optimization objective function of latent low-rank representation subspace clustering;

a subspace representation matrix calculation module, configured to solve the optimization objective function to obtain a low-rank representation matrix;

an affinity matrix calculation module, configured to calculate an affinity matrix based on the low-rank representation matrix;

a subspace clustering module, configured to partition the affinity matrix with a spectral clustering algorithm to achieve latent low-rank representation subspace clustering of the data.

In the above scheme, the optimization objective function building module builds on low-rank representation subspace clustering and uses latent low-rank representation subspace clustering to take unobserved data samples into account; it uses the Schatten-p norm as the regularization term in place of the rank function, converting an NP-hard problem into a solvable one, and introduces the l_p norm to constrain the error term. This addresses the problems of insufficient samples in low-rank representation and the intractability of the rank function, enhances the robustness of latent low-rank representation subspace clustering, and improves its performance.

Preferably, the data preprocessing module is further configured to normalize each feature point in the feature matrix after the feature matrix is obtained.

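Preferably, the optimization objective function of latent low-rank representation subspace clustering constructed by the optimization objective function building module is specifically:

$$\min_{Z,L,E}\ \|Z\|_{S_p}^{p} + \|L\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E;$$

$\|\cdot\|_{S_p}$ is the Schatten-p norm, defined as $\|Z\|_{S_p} = \left(\sum_{i} \sigma_i^{p}\right)^{1/p}$, 0 < p ≤ ∞, where $\sigma_i$ is the i-th singular value of Z;

$\|\cdot\|_{p}$ is the l_p norm, defined as $\|E\|_{p} = \left(\sum_{i,j} |e_{ij}|^{p}\right)^{1/p}$.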

Preferably, the subspace representation matrix calculation module is further configured to solve the problem with auxiliary variables J and S:

$$\min_{Z,L,E,J,S}\ \|J\|_{S_p}^{p} + \|S\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E,\ Z = J,\ L = S$$

The constraints are incorporated through the Lagrange multiplier method to form an augmented Lagrangian function, and the variables in the augmented Lagrangian function are optimized iteratively by an alternating method until convergence, yielding the low-rank representation matrix.

Preferably, the subspace clustering module is further configured to:

compute the normalized Laplacian matrix L;

compute the eigenvectors of the Laplacian matrix, take the k eigenvectors with the largest eigenvalues, and arrange them as the columns of the matrix X = [x_1, x_2, …, x_k] ∈ R^{n×k};

Compared with the prior art, the beneficial effects of the technical solution of the present invention are as follows:

Subspace clustering with latent low-rank representation can incorporate unobserved data samples, which addresses the problem of insufficient samples in low-rank representation. The rank function is replaced with the Schatten-p norm, which approximates it better than the nuclear norm does, converting an NP-hard problem into a solvable one. The l_p norm is introduced to constrain the error term, and the optimization objective function of latent low-rank representation subspace clustering is constructed. The invention improves the robustness and clustering performance of the algorithm, and solves the problems in existing subspace clustering of insufficient samples in low-rank representation and of the weak robustness and inadequate performance of latent low-rank representation subspace clustering.

In addition, the present invention provides a corresponding apparatus for implementing the subspace clustering method based on latent low-rank representation, which makes the method more practical; the apparatus has the corresponding advantages.

Description of the Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 is a block diagram of the apparatus of the present invention.

Detailed Description

The drawings are for illustrative purposes only and shall not be construed as limiting this patent;

To better illustrate the embodiments, some components in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of the actual product;

Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings.

The technical solutions of the present invention are further described below with reference to the drawings and embodiments.

Embodiment 1

Embodiment 1 provides a subspace clustering method for latent low-rank representation which, as shown in FIG. 1, comprises the following steps:

S1. Acquire data and preprocess it to obtain a feature matrix:

The preprocessing step may use any commonly known means in the art. For image data, for example, preprocessing consists of normalizing the target image, correcting its gray levels, and removing noise, and then extracting edges, regions, or textures from the target image as experimental features: Gabor features are extracted for face image data, and HOG features for handwritten data sets. The resulting feature matrix X = [x_1, x_2, …, x_N] ∈ R^{D×N} is formed from the data vectors; each column vector of the feature matrix is the feature vector of one feature point, where D is the dimension of the feature space and N is the number of feature points;

To facilitate subsequent data processing, each feature point in the feature matrix is normalized, where x′_i is the value of the i-th feature point after normalization and x_i is its value before normalization.
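A minimal sketch of the normalization step follows, assuming NumPy. The normalization formula itself is not reproduced in this text, so scaling each feature point (column) to unit Euclidean length is an assumption chosen for illustration; the function name `normalize_columns` is hypothetical.

```python
import numpy as np

def normalize_columns(X, eps=1e-12):
    """Scale each column (feature point) of the D x N matrix X to unit Euclidean length.

    Unit-length scaling is an assumed choice; eps guards against all-zero columns.
    """
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    return X / np.maximum(norms, eps)

X_raw = np.array([[3.0, 0.0],
                  [4.0, 2.0]])
X_unit = normalize_columns(X_raw)   # columns become (0.6, 0.8) and (0.0, 1.0)
```

Putting all feature points on a common scale keeps a few large-magnitude samples from dominating the reconstruction error term.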

S2. Based on the feature matrix, use the Schatten-p norm as the regularization term in place of the rank function, and use the l_p norm as the constraint function of the error term, to construct the optimization objective function of latent low-rank representation subspace clustering:

In this step, the objective function of latent low-rank representation subspace clustering is first constructed as a rank minimization problem, subject to the constraint

s.t. X = [X_0, X_H]Z + E;

where rank(·) denotes the rank of a matrix, Z is the subspace low-rank representation matrix, X is the feature matrix, X_0 is the matrix of observed data samples, X_H is the matrix of unobserved data samples, E is the reconstruction error matrix, and λ is the hyperparameter that controls the loss penalty;

Conventionally, the nuclear norm is used as the best approximating norm of the rank function in order to obtain a low-rank matrix. To take the observed data fully into account, that is, to find the optimal low-rank representation matrix, this embodiment instead uses the Schatten-p norm of the matrix as the regularization term to estimate the rank function and the l_p norm as the constraint function of the error term, and combines them with latent low-rank representation to construct the objective function;

That is, in this step the optimization objective function of latent low-rank representation subspace clustering is constructed:

$$\min_{Z,L,E}\ \|Z\|_{S_p}^{p} + \|L\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E;$$

where Z is the subspace low-rank representation matrix, L is the subspace sparse representation matrix, X is the feature matrix, and λ is the hyperparameter that controls the loss penalty;

$\|\cdot\|_{S_p}$ is the Schatten-p norm, defined as $\|Z\|_{S_p} = \left(\sum_{i} \sigma_i^{p}\right)^{1/p}$, 0 < p ≤ ∞, and is used to impose the low-rank constraint on the matrix. The rank is the number of nonzero singular values of a matrix; the rank function is non-convex, so rank minimization is an NP-hard problem. The Schatten-p norm, which is convex for p ≥ 1, is used as an approximation of the rank, and minimizing it approximately enforces the low-rank constraint.

$\|\cdot\|_{p}$ is the l_p norm, defined as $\|E\|_{p} = \left(\sum_{i,j} |e_{ij}|^{p}\right)^{1/p}$.

S3. Solve the optimization objective function to obtain a low-rank representation matrix:

In this embodiment, the optimization objective function is converted into a convex optimization problem by introducing auxiliary variables J and S, where Z = J and L = S:

$$\min_{Z,L,E,J,S}\ \|J\|_{S_p}^{p} + \|S\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E,\ Z = J,\ L = S$$

The constraints are then incorporated through the Lagrange multiplier method to form an augmented Lagrangian function, and the variables in the augmented Lagrangian function are optimized iteratively by an alternating method until convergence, yielding the low-rank representation matrix. Solving the augmented Lagrangian function specifically comprises the following steps:

A1. Set the parameters and initialize Z = J = 0, L = S = 0, E = 0, Y_1 = 0, Y_2 = 0, Y_3 = 0, μ = 10^{-6}, max_u = 10^{6}, ρ = 1.1, and ε = 10^{-6}, where Y_1, Y_2, and Y_3 are the Lagrange multipliers, μ is the penalty parameter and max_u its upper bound, ρ is the update coefficient of the penalty parameter, and ε is the convergence threshold;

A2. Update J by solving the subproblem

$$J = \arg\min_{J}\ \frac{1}{\mu}\|J\|_{S_p}^{p} + \frac{1}{2}\|J - G\|_F^{2}$$

where G = Z + Y_2/μ.

Specifically, the optimal solution of J is obtained from the singular value decomposition of G: the left singular vectors Q_G and the right singular vectors of G are retained, and the singular values are replaced by the diagonal matrix Δ. Each diagonal entry of Δ is found by solving a scalar problem in which δ_i and σ_i are the i-th singular values of the matrices J and G, respectively.

The scalar problem is solved as follows. Once the threshold constant v_1 is defined, the optimal solution x^* falls into one of two cases:

1) when δ_i is less than or equal to v_1, x^* = 0; 2) when δ_i is greater than v_1, x^* is obtained by iterating x^{(k+1)} = δ_i − λp(x^{(k)})^{p−1};

The diagonal matrix Δ is constructed from the resulting values x^*, which finally gives the optimal solution of J.
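The J update can be sketched as below, assuming NumPy. The scalar routine follows the generalized soft-thresholding scheme from the non-convex sparse coding literature; its closed-form threshold is an assumption, since the patent's constant definition is not reproduced in this text. The names `gst` and `update_J` are hypothetical, the threshold weight 1/μ assumes a unit-weight Schatten-p term in the objective, and the sketch requires lam > 0 and 0 < p ≤ 1.

```python
import numpy as np

def gst(y, lam, p, n_iter=50):
    """Minimize 0.5*(x - y)**2 + lam*x**p over x >= 0, for y >= 0, lam > 0, 0 < p <= 1."""
    v = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p))
    v1 = v + lam * p * v ** (p - 1.0)        # threshold below which the minimizer is 0 (assumed form)
    if y <= v1:
        return 0.0
    x = y                                     # start the fixed-point iteration at y
    for _ in range(n_iter):
        x = y - lam * p * x ** (p - 1.0)
    return x

def update_J(Z, Y2, mu, p):
    """Shrink the singular values of G = Z + Y2/mu and rebuild J from the same singular vectors."""
    G = Z + Y2 / mu
    Q, sigma, Vt = np.linalg.svd(G, full_matrices=False)
    delta = np.array([gst(s, 1.0 / mu, p) for s in sigma])
    return (Q * delta) @ Vt                   # Q diag(delta) V^T

J_demo = update_J(np.diag([3.0, 1.0]), np.zeros((2, 2)), mu=1.0, p=1.0)
```

For p = 1 the routine reduces to ordinary soft-thresholding of the singular values, that is, the familiar singular value thresholding step used with the nuclear norm.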

A3. Update S by solving the corresponding subproblem for S, where M = L + Y_3/μ.

Specifically, the solution method is the same as in A2: the optimal solution is obtained from the singular value decomposition of M, where Q_M and its right counterpart contain the left and right singular vectors of M, respectively.

A4. Update Z: $Z = (I + X^{T}X)^{-1}\left(X^{T}(X - LX - E) + J + (X^{T}Y_1 - Y_2)/\mu\right)$

A5. Update L: $L = \left((X - XZ - E)X^{T} + S + (Y_1X^{T} - Y_3)/\mu\right)(I + XX^{T})^{-1}$
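The two closed-form updates can be sketched as follows, assuming NumPy. The function names `update_Z` and `update_L` are hypothetical. Instead of forming the matrix inverses explicitly, the sketch solves the equivalent linear systems, which is a numerical design choice and not something stated in the patent; the multiplier terms are grouped as (XᵀY_1 − Y_2)/μ and (Y_1Xᵀ − Y_3)/μ.

```python
import numpy as np

def update_Z(X, L, E, J, Y1, Y2, mu):
    """Z = (I + X^T X)^{-1} (X^T (X - L X - E) + J + (X^T Y1 - Y2) / mu)."""
    n = X.shape[1]
    rhs = X.T @ (X - L @ X - E) + J + (X.T @ Y1 - Y2) / mu
    return np.linalg.solve(np.eye(n) + X.T @ X, rhs)

def update_L(X, Z, E, S, Y1, Y3, mu):
    """L = ((X - X Z - E) X^T + S + (Y1 X^T - Y3) / mu) (I + X X^T)^{-1}."""
    d = X.shape[0]
    lhs = (X - X @ Z - E) @ X.T + S + (Y1 @ X.T - Y3) / mu
    # I + X X^T is symmetric, so right-multiplying by its inverse equals
    # solving the transposed system and transposing the result back
    return np.linalg.solve(np.eye(d) + X @ X.T, lhs.T).T

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((3, 5))          # D = 3 features, N = 5 samples
Z_demo = update_Z(X_demo, np.zeros((3, 3)), np.zeros((3, 5)),
                  np.zeros((5, 5)), np.zeros((3, 5)), np.zeros((5, 5)), mu=1.0)
```

Here Z is N × N and L is D × D, so the two systems have different sizes; both coefficient matrices are positive definite, so the solves are well posed.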

A6. Update E by solving the corresponding subproblem for E, where N = X − XZ − LX + Y_1/μ.

Specifically, the solution method is the same as in A2 and A3, where the optimal solution x^* falls into one of three cases: 1) when δ_i is less than v_1, x^* = 0; 2) when δ_i equals v_1, x^* = υ; 3) when δ_i is greater than v_1, x^* is obtained by iterating x^{(k+1)} = δ_i − λp(x^{(k)})^{p−1};
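A vectorized sketch of the E update follows, assuming NumPy. It applies the same thresholding rule entrywise to N, with the sign of each entry restored afterwards. The threshold formula and the weight λ/μ are assumptions (the subproblem formula is not reproduced in this text), ties exactly at the threshold are resolved to zero for simplicity, and the names `gst_elementwise` and `update_E` are hypothetical. The sketch requires lam > 0 and 0 < p ≤ 1.

```python
import numpy as np

def gst_elementwise(N, lam, p, n_iter=50):
    """Entrywise generalized soft-thresholding of N for lam > 0 and 0 < p <= 1."""
    v = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p))
    v1 = v + lam * p * v ** (p - 1.0)          # threshold (assumed closed form)
    mag = np.abs(N)
    keep = mag > v1                            # entries that survive the threshold
    y = mag[keep]
    x = y.copy()
    for _ in range(n_iter):                    # fixed-point iteration on the magnitudes
        x = y - lam * p * x ** (p - 1.0)
    out = np.zeros_like(N, dtype=float)
    out[keep] = np.sign(N[keep]) * x           # restore the original signs
    return out

def update_E(X, Z, L, Y1, mu, lam, p):
    """Threshold N = X - X Z - L X + Y1/mu entrywise with weight lam/mu."""
    N = X - X @ Z - L @ X + Y1 / mu
    return gst_elementwise(N, lam / mu, p)

E_demo = gst_elementwise(np.array([[3.0, -0.5], [-2.0, 0.0]]), lam=1.0, p=1.0)
```

Entries whose magnitude falls below the threshold are set exactly to zero, which is what makes the recovered error matrix sparse.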

A7. Update the multipliers: Y_1 = Y_1 + μ(X − XZ − LX − E), Y_2 = Y_2 + μ(Z − J), Y_3 = Y_3 + μ(L − S)

A8. Update the parameter: μ = min(ρμ, max_u).
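Steps A7 and A8 can be sketched together as below, assuming NumPy. The function name `update_multipliers` is hypothetical, and the returned residual (the largest absolute constraint violation, to be compared with the convergence threshold ε) is an assumed stopping criterion, since the text names ε without stating what it is compared against.

```python
import numpy as np

def update_multipliers(X, Z, L, E, J, S, Y1, Y2, Y3, mu, rho=1.1, max_u=1e6):
    """One pass of A7 (multiplier update) and A8 (penalty update)."""
    R1 = X - X @ Z - L @ X - E                 # residual of the reconstruction constraint
    R2 = Z - J                                 # residual of Z = J
    R3 = L - S                                 # residual of L = S
    Y1 = Y1 + mu * R1
    Y2 = Y2 + mu * R2
    Y3 = Y3 + mu * R3
    mu_next = min(rho * mu, max_u)             # grow the penalty, capped at max_u
    residual = max(np.abs(R1).max(), np.abs(R2).max(), np.abs(R3).max())
    return Y1, Y2, Y3, mu_next, residual

I2 = np.eye(2)
O2 = np.zeros((2, 2))
Y1_d, Y2_d, Y3_d, mu_d, res_d = update_multipliers(I2, O2, O2, O2, O2, O2, O2, O2, O2, mu=1e-6)
```

An outer loop would repeat steps A2 to A8 and stop once the residual drops below ε; with ρ = 1.1 the penalty grows geometrically from 10^{-6} until it reaches max_u.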

It should be noted that other methods may also be used in this step to solve the optimization problem of the objective function; this embodiment gives only one such method as an example.

S4. Calculate an affinity matrix based on the low-rank representation matrix:

The affinity matrix S may be calculated as S = ‖Z^T Z‖_2, where Z is the representation matrix.

It should be noted that, besides the above method, other methods may also be used in this step to calculate the affinity matrix; this embodiment gives only one such method as an example, and this application places no limitation on it.
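As an example of one such alternative, the following sketch builds the affinity matrix with the symmetrization (|Z| + |Zᵀ|)/2 that is widely used in subspace clustering. This is a swapped-in technique shown for illustration and is not the formula stated in this embodiment; the function name `affinity_from_representation` is hypothetical, and NumPy is assumed.

```python
import numpy as np

def affinity_from_representation(Z):
    """Symmetric, nonnegative affinity from a square representation matrix Z."""
    A = np.abs(Z)
    return 0.5 * (A + A.T)

Z_demo = np.array([[1.0, -2.0],
                   [0.0,  3.0]])
S_demo = affinity_from_representation(Z_demo)   # symmetric with nonnegative entries
```

Spectral clustering expects a symmetric matrix with nonnegative entries, which this construction guarantees regardless of the signs in Z.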

S5. Partition the affinity matrix with a spectral clustering algorithm to achieve latent low-rank representation subspace clustering of the data:

Compute the normalized Laplacian matrix L of the affinity matrix, then carry out the remaining spectral clustering steps described for step S5 above.

Embodiment 2

Embodiment 2 provides a corresponding apparatus for implementing the subspace clustering method for latent low-rank representation of Embodiment 1, which makes the method more practical. The subspace clustering apparatus for latent low-rank representation provided in this embodiment is introduced below; the apparatus described below and the method described above may be read with reference to each other.

As shown in FIG. 2, the apparatus comprises:

a data preprocessing module, configured to acquire data and preprocess it to obtain a feature matrix, and to normalize each feature point in the feature matrix;

an optimization objective function building module, configured to construct the objective function of latent low-rank representation subspace clustering based on the feature matrix, using the Schatten-p norm as the regularization term in place of the rank function and the l_p norm as the constraint function of the error term, thereby constructing the optimization objective function of latent low-rank representation subspace clustering;

The functions of the functional modules of the subspace clustering apparatus for latent low-rank representation provided in this embodiment may be implemented according to the method of Embodiment 1; for the specific implementation, refer to the description of Embodiment 1, which is not repeated here.

The terms describing positional relationships in the drawings are for illustrative purposes only and shall not be construed as limiting this patent;

Obviously, the above embodiments of the present invention are merely examples given to illustrate the invention clearly, and do not limit its implementations. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the claims of the present invention.

Claims

1. A subspace clustering method for latent low-rank representation, characterized by comprising the following steps:

S1. Acquire data and preprocess it to obtain a feature matrix;

S2. Based on the feature matrix, use the Schatten-p norm as the regularization term in place of the rank function, and use the l_p norm as the constraint function of the error term, to construct the optimization objective function of latent low-rank representation subspace clustering;

S3. Solve the optimization objective function to obtain a low-rank representation matrix;

S4. Calculate an affinity matrix based on the low-rank representation matrix;

S5. Partition the affinity matrix with a spectral clustering algorithm to achieve latent low-rank representation subspace clustering of the data.

2. The subspace clustering method for latent low-rank representation according to claim 1, wherein after the feature matrix is obtained in step S1, the method further comprises normalizing each feature point in the feature matrix.

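3. The subspace clustering method for latent low-rank representation according to claim 1, wherein the optimization objective function of latent low-rank representation subspace clustering in step S2 is specifically:

$$\min_{Z,L,E}\ \|Z\|_{S_p}^{p} + \|L\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E;$$

$\|\cdot\|_{S_p}$ is the Schatten-p norm, defined as $\|Z\|_{S_p} = \left(\sum_{i} \sigma_i^{p}\right)^{1/p}$, where $\sigma_i$ is the i-th singular value of Z;

$\|\cdot\|_{p}$ is the l_p norm, defined as $\|E\|_{p} = \left(\sum_{i,j} |e_{ij}|^{p}\right)^{1/p}$.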

4. The subspace clustering method for latent low-rank representation according to claim 3, wherein step S3 specifically comprises the following steps:

Introduce auxiliary variables J and S into the optimization objective function, where Z = J and L = S:

$$\min_{Z,L,E,J,S}\ \|J\|_{S_p}^{p} + \|S\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E,\ Z = J,\ L = S$$

The constraints are then incorporated through the Lagrange multiplier method to form an augmented Lagrangian function, and the variables in the augmented Lagrangian function are optimized iteratively by an alternating method until convergence, yielding the low-rank representation matrix.

5. The subspace clustering method for latent low-rank representation according to claim 4, wherein step S5 specifically comprises the following steps:

Calculate the degree matrix of the affinity matrix using the following formula:

$$D_{i,i} = \sum_{j} S_{i,j}$$

where the degree matrix is a square matrix, D_{i,i} is the diagonal element in the i-th row of the degree matrix, and S_{i,j} is the element in the i-th row and j-th column of the affinity matrix;

Compute the normalized Laplacian matrix L, where D is the degree matrix, S is the affinity matrix, and I is the identity matrix;

Compute the eigenvectors of the Laplacian matrix L, take the k eigenvectors with the largest eigenvalues, and arrange them as the columns of the matrix X = [x_1, x_2, …, x_k] ∈ R^{n×k};

Normalize each row vector of this matrix to a unit vector to obtain the target matrix;

Cluster the rows of the target matrix with the K-means clustering method to obtain K clusters, thereby achieving latent low-rank representation subspace clustering.

6. A subspace clustering apparatus for latent low-rank representation, comprising:

a data preprocessing module, configured to acquire data and preprocess it to obtain a feature matrix;

an optimization objective function building module, configured to construct the objective function of latent low-rank representation subspace clustering based on the feature matrix, using the Schatten-p norm as the regularization term in place of the rank function and the l_p norm as the constraint function of the error term, thereby constructing the optimization objective function of latent low-rank representation subspace clustering;

a subspace representation matrix calculation module, configured to solve the optimization objective function to obtain a low-rank representation matrix;

an affinity matrix calculation module, configured to calculate an affinity matrix based on the low-rank representation matrix;

a subspace clustering module, configured to partition the affinity matrix with a spectral clustering algorithm to achieve latent low-rank representation subspace clustering of the data.

7. The subspace clustering apparatus for latent low-rank representation according to claim 6, wherein the data preprocessing module is further configured to normalize each feature point in the feature matrix after the feature matrix is obtained.

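8. The subspace clustering apparatus for latent low-rank representation according to claim 6, wherein the optimization objective function of latent low-rank representation subspace clustering constructed by the optimization objective function building module is specifically:

$$\min_{Z,L,E}\ \|Z\|_{S_p}^{p} + \|L\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E;$$

$\|\cdot\|_{S_p}$ is the Schatten-p norm, defined as $\|Z\|_{S_p} = \left(\sum_{i} \sigma_i^{p}\right)^{1/p}$, where $\sigma_i$ is the i-th singular value of Z;

$\|\cdot\|_{p}$ is the l_p norm, defined as $\|E\|_{p} = \left(\sum_{i,j} |e_{ij}|^{p}\right)^{1/p}$.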

9. The subspace clustering apparatus for latent low-rank representation according to claim 8, wherein the subspace representation matrix calculation module is further configured to solve the problem with auxiliary variables J and S:

$$\min_{Z,L,E,J,S}\ \|J\|_{S_p}^{p} + \|S\|_{S_p}^{p} + \lambda\|E\|_{p}^{p} \quad \text{s.t.}\quad X = XZ + LX + E,\ Z = J,\ L = S$$

The constraints are incorporated through the Lagrange multiplier method to form an augmented Lagrangian function, and the variables in the augmented Lagrangian function are optimized iteratively by an alternating method until convergence, yielding the low-rank representation matrix.

10. The subspace clustering apparatus for latent low-rank representation according to claim 9, wherein the subspace clustering module is further configured to:

calculate the degree matrix of the affinity matrix using the formula $D_{i,i} = \sum_{j} S_{i,j}$;

compute the normalized Laplacian matrix L: