CN114821145A

CN114821145A - A clustering method for incomplete multi-view image data based on data inpainting

Info

Publication number: CN114821145A
Application number: CN202210740754.5A
Authority: CN
Inventors: 赵洪伟; 付强; 付立军; 李骜
Original assignee: Shandong Bim Information Technology Co ltd
Current assignee: Shandong Bim Information Technology Co ltd
Priority date: 2022-06-28
Filing date: 2022-06-28
Publication date: 2022-07-29
Anticipated expiration: 2042-06-28
Also published as: CN114821145B

Abstract

The invention belongs to the technical field of computer vision image clustering and provides a method for clustering incomplete multi-view image data based on data restoration, comprising the following steps: S1: inputting a multi-view image data set with missing samples; S2: repairing the missing multi-view image data set at the data level using a Pearson correlation coefficient calculation method, to obtain a complete multi-view image data set. Based on the similarity between the views of multi-view image data, the method repairs incomplete multi-view images accurately at the level of the sample dimensions using the Pearson correlation coefficient. The incomplete image data is filled in to form complete data, the filled-in values are as close as possible to the pixel values of the real image data, and the image data used for subsequent clustering model learning is thereby ensured to contain real and valid information.

Description

A clustering method for incomplete multi-view image data based on data inpainting

Technical Field

The invention relates to the technical field of computer vision image clustering, and in particular to a method for clustering incomplete multi-view image data based on data restoration.

Background

As technology has flourished, the ways of collecting image data have multiplied and the volume of image data has grown explosively. As a result, the acquired image data is often unlabeled, so that there is not enough image data to train a model. The emergence of clustering techniques makes it possible to categorize unlabeled image data.

However, for various objective reasons, image acquisition equipment often produces image data with missing parts (that is, the image data is incomplete), which causes the accuracy of image clustering models to drop sharply.

In view of this situation, there is an urgent need for a method for clustering incomplete multi-view image data based on data restoration, to overcome the shortcomings of current practice.

Summary of the Invention

The purpose of the embodiments of the present invention is to provide a method for clustering incomplete multi-view image data based on data restoration, which aims to solve the problem of how to repair incomplete multi-view image data.

The embodiments of the present invention are implemented as a method for clustering incomplete multi-view image data based on data restoration, comprising the following steps:

S1: input a multi-view image data set with missing samples;

S2: repair the missing multi-view image data set at the data level using a Pearson correlation coefficient calculation method, to obtain a complete multi-view image data set;

S3: construct a multi-kernel collaborative representation model and feed the complete multi-view image data set into it for learning, to obtain a robust representation of the image data that contains multiple kinds of valid information; each view yields its own robust representation;

S4: construct a robust subspace learning model and feed the robust representation into it for learning, to obtain a subspace representation of the image data; each view yields its own subspace representation;

S5: construct a low-rank tensor model, feed the subspace representations into it, and reorganize them into a three-dimensional form in order to explore the hidden information of the image data in three-dimensional space;

S6: define a joint representation model that combines the multi-kernel collaborative representation model, the robust subspace learning model and the low-rank tensor model;

S7: use the complete multi-view image data set to solve for the optimal value of each variable, thereby obtaining a highly discriminative fused subspace;

S8: feed the fused subspace into a clustering algorithm to obtain the required clustering result.

Preferably, in step S2, the formula for repairing the missing multi-view image data set at the data level based on the Pearson correlation coefficient calculation method is:

[Formula not reproduced in the source text.]

where p denotes the dimension of the current sample, v denotes the number of views, w denotes the number of missing samples, x^b denotes the five samples with the highest similarity to the current missing sample, and cor denotes the correlation coefficient between the current sample and sample x^b.

Preferably, in step S3, the multi-kernel collaborative representation model is:

[Formula not reproduced in the source text.]

where K denotes the kernel matrix obtained from the complete image data and H denotes the robust representation of the image data, whose size is given by c, the feature representation dimension, and n, the number of samples.

Preferably, in step S4, the robust subspace learning model is:

[Formula not reproduced in the source text.]

where α₁, α₂, γ₁ and γ₂ denote regularization parameters, ‖·‖_F denotes the Frobenius norm, tr(·) denotes the trace of a matrix, H₁ and H₂ denote the robust data representations of view 1 and view 2 respectively, K₁ and K₂ denote the kernel matrices obtained from view 1 and view 2 respectively, and G₁ and G₂ denote the robust subspace matrices of view 1 and view 2 respectively.

Preferably, in step S5, the low-rank tensor model is:

[Formula not reproduced in the source text.]

where G denotes the third-order tensor formed from the robust subspace matrices of view 1 and view 2, δ denotes the regularization constant, ‖·‖_F denotes the Frobenius norm of a tensor, and the two remaining symbols denote a low-rank tensor structure in general and the low-rank tensor structure of the tensor G in particular.

Preferably, in step S6, the joint representation model is:

[Formula not reproduced in the source text.]

Preferably, in step S7, the step of using the complete multi-view image data set to solve for the optimal value of each variable is:

following the alternating direction method of multipliers (ADMM), iteratively solve for the optimal value of one variable at a time while all other variables are held fixed.

Preferably, the method further comprises the following steps:

calculating the clustering accuracy on the incomplete multi-view image data;

calculating the normalized mutual information on the incomplete multi-view image data;

calculating the purity on the incomplete multi-view image data.
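The three evaluation measures listed above can be sketched as follows. This is a minimal illustration rather than the patent's own evaluation code: the function names are hypothetical, the accuracy uses the usual best one-to-one matching of clusters to classes (solved with SciPy's `linear_sum_assignment`), and the NMI is normalized by the arithmetic mean of the two entropies, which is one common convention and an assumption here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def contingency(y_true, y_pred):
    """Count matrix: rows index true classes, columns index predicted clusters."""
    _, t = np.unique(np.asarray(y_true).ravel(), return_inverse=True)
    _, p = np.unique(np.asarray(y_pred).ravel(), return_inverse=True)
    table = np.zeros((t.max() + 1, p.max() + 1), dtype=np.int64)
    np.add.at(table, (t, p), 1)
    return table


def clustering_accuracy(y_true, y_pred):
    """ACC: fraction of samples matched under the best cluster-to-class pairing."""
    table = contingency(y_true, y_pred)
    rows, cols = linear_sum_assignment(table, maximize=True)
    return float(table[rows, cols].sum() / table.sum())


def purity(y_true, y_pred):
    """PUR: each cluster is credited with its most frequent true class."""
    table = contingency(y_true, y_pred)
    return float(table.max(axis=0).sum() / table.sum())


def normalized_mutual_information(y_true, y_pred):
    """NMI, normalized by the mean of the two label entropies (an assumption)."""
    table = contingency(y_true, y_pred)
    joint = table / table.sum()
    pt, pp = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    mi = np.sum(joint[nz] * np.log(joint[nz] / np.outer(pt, pp)[nz]))
    h_t = -np.sum(pt * np.log(pt))
    h_p = -np.sum(pp * np.log(pp))
    denom = (h_t + h_p) / 2.0
    return float(mi / denom) if denom > 0 else 1.0
```

All three functions take the ground-truth labels and the predicted cluster labels as one-dimensional sequences; cluster identifiers do not need to coincide with class identifiers.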

Compared with the prior art, the embodiments of the present invention have the following beneficial effects:

(1) Based on the similarity between the views of multi-view image data, the embodiment uses a Pearson correlation coefficient calculation method to repair incomplete multi-view images accurately at the level of the sample dimensions. In filling in the incomplete image data to form complete data, the filled-in values are made as close as possible to the pixel values of the real image data, which ensures that the image data used for subsequent clustering model learning contains real and valid information;

(2) To address linear inseparability, the embodiment builds a multi-kernel collaborative model through kernel function learning, which is used to learn a robust data representation of the multi-view image data. It also constructs a robust subspace learning model, so that each view of the multi-view image data learns its own low-dimensional representation space. In addition, to better explore the latent cues that the low-dimensional representations carry along the third dimension, the embodiment introduces a low-rank tensor model to learn a fused subspace representation for the subsequent clustering algorithm;

(3) The embodiment develops an alternating-optimization numerical algorithm that uses the alternating direction method of multipliers to solve for the optimal value of each variable coupled in the objective function, and ensures convergence over the iterations.

Description of the Drawings

FIG. 1 is a flowchart of a method for clustering incomplete multi-view image data based on data restoration provided by an embodiment of the present invention.

Detailed Description

In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawing and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and not to limit it.

The specific implementation of the present invention is described in detail below with reference to specific embodiments.

Referring to FIG. 1, the method for clustering incomplete multi-view image data based on data restoration provided by an embodiment of the present invention comprises steps S1 to S8 set out above.

In an embodiment of the present invention, the formula for repairing the missing multi-view image data set at the data level based on the Pearson correlation coefficient calculation method is:

[Formula not reproduced in the source text.]

where ∑ denotes summation, p denotes the dimension of the current sample, v denotes the number of views, w denotes the number of missing samples, x^b (b = 1, 2, …, 5, with m denoting the sample dimension) denotes the five samples with the highest similarity to the current missing sample, and cor denotes the correlation coefficient between the current sample and sample x^b.
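Because the repair formula itself is not reproduced in this text, the following is only a rough sketch of the idea described by the symbol definitions: rank the complete samples by Pearson correlation with the incomplete sample, keep the five most similar, and fill the missing entries from them. The specific choices here are assumptions, not the patent's rule: similarity is measured in a second view where the sample is still observed, the fill value is a correlation-weighted mean, and the names `pearson` and `repair_missing_view` are hypothetical.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D vectors."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return float(a @ b / denom) if denom > 0 else 0.0


def repair_missing_view(observed, target, missing_idx, top_k=5):
    """Fill the rows of `target` listed in `missing_idx`.

    observed : (n, p1) array, a view in which every sample is present
    target   : (n, p2) array, the view with missing rows (stored as zeros)
    Assumed rule: each missing row becomes the correlation-weighted mean of
    the `top_k` complete samples most similar to it in the observed view.
    """
    repaired = target.astype(float).copy()
    missing = set(missing_idx)
    complete = [i for i in range(len(target)) if i not in missing]
    for m in missing_idx:
        cors = np.array([pearson(observed[m], observed[j]) for j in complete])
        best = np.argsort(cors)[::-1][:top_k]      # highest correlation first
        weights = np.clip(cors[best], 0.0, None)   # ignore negative correlations
        if weights.sum() == 0:
            weights = np.ones_like(weights)        # fall back to a plain mean
        neighbours = target[[complete[b] for b in best]]
        repaired[m] = weights @ neighbours / weights.sum()
    return repaired
```

The rows of both arrays are samples in this sketch; only the rows named in `missing_idx` are changed, and missing rows are never used as neighbours.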

In an embodiment of the present invention, the multi-kernel collaborative representation model is:

[Formula not reproduced in the source text.]

where tr(·) denotes the trace of a matrix, s.t. introduces the constraint, T denotes the matrix transpose, I denotes the identity matrix, H denotes the robust representation of the image data (c being the feature representation dimension and n the number of samples), and K denotes the kernel matrix obtained from the complete image data, computed as follows:

[Formula not reproduced in the source text.]

where X denotes the multi-view image data, the i-th column of the v-th view of X is one sample, i is the index of a sample in the image data, and s indexes the type of kernel mapping used in the formula above. The kernel functions form a set; five kernel functions are used in this embodiment.
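As a hedged illustration of building several kernel matrices for one view, the sketch below returns five kernels for a data matrix whose columns are samples, matching the column convention stated above. The text does not say which five kernel functions are used or how they are combined into K, so the choice here (one linear, one degree-2 polynomial, three Gaussian bandwidths) and the function name `kernel_set` are assumptions for illustration only.

```python
import numpy as np


def kernel_set(X):
    """Return a list of five (n, n) kernel matrices for one view.

    X : (d, n) array whose i-th column is the i-th sample.
    The five kernels chosen here are illustrative assumptions.
    """
    gram = X.T @ X                                   # linear kernel
    sq_norms = np.diag(gram)
    sq_dist = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram, 0.0)
    positive = sq_dist[sq_dist > 0]
    scale = np.median(positive) if positive.size else 1.0
    kernels = [gram, (gram + 1.0) ** 2]              # linear, polynomial (degree 2)
    for factor in (0.5, 1.0, 2.0):                   # three Gaussian bandwidths
        kernels.append(np.exp(-sq_dist / (2.0 * factor * scale)))
    return kernels
```

Each returned matrix is symmetric with one row and one column per sample, so it can stand in for a per-view kernel matrix such as K₁ or K₂ in the later models.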

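In an embodiment of the present invention, the robust subspace learning model is:

[Formula not reproduced in the source text.]

where α₁, α₂, γ₁ and γ₂ denote regularization parameters, ‖·‖_F denotes the Frobenius norm, tr(·) denotes the trace of a matrix, H₁ and H₂ denote the robust data representations of view 1 and view 2 respectively, K₁ and K₂ denote the kernel matrices obtained from view 1 and view 2 respectively, and G₁ and G₂ denote the robust subspace matrices of view 1 and view 2 respectively.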

In an embodiment of the present invention, the low-rank tensor model is:

[Formula not reproduced in the source text.]

where G denotes the third-order tensor formed from the robust subspace matrices of view 1 and view 2, δ denotes the regularization constant, ‖·‖_F denotes the Frobenius norm of a tensor, and the two remaining symbols denote a low-rank tensor structure in general and the low-rank tensor structure of the tensor G in particular.
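The reorganization of the two per-view subspace matrices into a third-order tensor, and the tensor Frobenius norm mentioned in the definitions, can be illustrated with a short sketch. The stacking axis and the function names are assumptions; the low-rank term of the model is not shown because its formula is not available in this text.

```python
import numpy as np


def stack_views(G1, G2):
    """Stack two (n, n) subspace matrices into an (n, n, 2) third-order tensor."""
    return np.stack([G1, G2], axis=2)


def tensor_frobenius_norm(T):
    """Square root of the sum of squared entries of a tensor."""
    return float(np.sqrt(np.sum(T ** 2)))
```

With this layout, slice `T[:, :, 0]` is the subspace matrix of view 1 and slice `T[:, :, 1]` is that of view 2.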

In an embodiment of the present invention, the joint representation model is:

[Formula not reproduced in the source text.]

In an embodiment of the present invention, the step of using the complete multi-view image data set to solve for the optimal value of each variable is:

following the alternating direction method of multipliers (ADMM), iteratively solve for the optimal value of one variable at a time while all other variables are held fixed;

Specifically:

1) Fix the other variables and drop the terms that do not depend on H₁. The minimization problem for the variable H₁ is then:

[Formula not reproduced in the source text.]

which can be rewritten as:

[Formula not reproduced in the source text.]

This problem can be solved by eigendecomposition: H₁ is the eigenvector matrix formed from the eigenvectors of the matrix M₁ that correspond to its c largest eigenvalues;
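The eigendecomposition step can be sketched as below. The definition of M₁ is not available in this text, so the sketch simply takes a symmetric matrix as input; the function name is hypothetical. For a symmetric matrix, the eigenvectors of the c largest eigenvalues maximize tr(HᵀMH) over matrices with orthonormal columns, which is the standard reason this kind of update has a closed form; that the patent's constraint has exactly this form is an assumption suggested by the symbol definitions (transpose, identity matrix).

```python
import numpy as np


def top_c_eigenvectors(M, c):
    """Columns are the eigenvectors of symmetric M for its c largest eigenvalues."""
    M = (M + M.T) / 2.0                    # symmetrize to guard against round-off
    eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :c]         # reverse, then keep the leading c columns
```

The same routine would apply unchanged to the H₂ update with M₂ in place of M₁.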

2) Fix the other variables and drop the terms that do not depend on G₁, which gives the following minimization objective for the variable G₁:

[Formula not reproduced in the source text.]

Setting the derivative of this objective to zero yields the closed-form solution:

[Formula not reproduced in the source text.]

3) Fix the other variables and drop the terms that do not depend on H₂. The minimization problem for the variable H₂ is then:

[Formula not reproduced in the source text.]

which can be rewritten as:

[Formula not reproduced in the source text.]

This problem can be solved by eigendecomposition: H₂ is the eigenvector matrix formed from the eigenvectors of the matrix M₂ that correspond to its c largest eigenvalues;

4) Fix the other variables and drop the terms that do not depend on G₂, which gives the following minimization objective for the variable G₂:

[Formulas not reproduced in the source text.]

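In an embodiment of the present invention, the method further comprises the following steps: calculating the clustering accuracy, the normalized mutual information and the purity on the incomplete multi-view image data.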

In summary, the embodiment of the present invention repairs incomplete multi-view image data into complete multi-view image data whose values are as close as possible to the real data, and computes a robust data representation of the complete image data using multiple kernel functions. Combined with the subspace learning method, it solves for a low-dimensional representation of the image data. To explore latent cues in the image data, a low-rank tensor model is used to explore the high-dimensional similarity between the views of the multi-view image data, yielding a highly discriminative fused subspace, which is fed into a clustering algorithm to obtain the image clustering result.

By way of further illustration, suppose incomplete multi-view image data is fed into the image clustering model of this embodiment; a result with an image clustering accuracy above 90% will be obtained;

The specific implementation is as follows:

This embodiment is validated on three public multi-view image data sets: one handwritten image data set and two face image data sets;

The handwritten image data set is the UCI data set, which contains handwritten images of the ten digits 0, 1, …, 9, 2000 samples in total. This embodiment selects 500 of these samples for validation and builds two views from them: the first view consists of 216-dimensional profile-correlation features and the second view consists of 76-dimensional Fourier coefficients;

One of the face data sets is the YALE data set, which contains face images of 15 people of different genders and ages under different lighting conditions and angles, with 11 images per person and 165 face images in total. This embodiment extracts two kinds of features from the YALE data set and uses them as the two views for validation;

The other face image data set is the ORL data set, which contains face images of 40 people under different lighting conditions and facial expressions, with 10 images per person and 400 images in total. This embodiment uses the gray-level intensity and the local binary pattern of each sample as the two views for validation;

Three commonly used clustering measures serve as evaluation metrics: accuracy (ACC), normalized mutual information (NMI) and purity (PUR). The experiments use incomplete multi-view image data: samples are selected at random according to a given missing rate and the pixel values of the selected samples are set to 0. The image clustering model of this embodiment is tested at different missing rates, ranging from 0.1 to 0.5 in steps of 0.1;
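The missing-data protocol described above (pick samples at random at a given missing rate and set their values to 0) can be sketched as follows. The function name, the one-sample-per-column layout, the rounding of the number of missing samples and the fixed seed are all assumptions made for illustration; the text does not specify them, nor whether the same samples are dropped in every view.

```python
import numpy as np


def drop_samples(X, missing_rate, rng):
    """Zero out a random subset of the columns (samples) of X.

    X : (d, n) array with one sample per column.
    Returns the corrupted copy and the sorted indices of the zeroed samples.
    """
    n = X.shape[1]
    n_missing = int(round(missing_rate * n))
    missing_idx = np.sort(rng.choice(n, size=n_missing, replace=False))
    corrupted = X.astype(float).copy()
    corrupted[:, missing_idx] = 0.0
    return corrupted, missing_idx


rng = np.random.default_rng(0)             # fixed seed, chosen only for repeatability
X = np.ones((4, 20))                       # toy stand-in for one view of a data set
for rate in (0.1, 0.2, 0.3, 0.4, 0.5):     # the missing rates used in the experiments
    corrupted, idx = drop_samples(X, rate, rng)
    print(rate, len(idx))
```

The returned index array records which samples were removed, which is what a repair step would need in order to know which entries to fill in.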

The experimental results in the table above show that this embodiment performs well on all three metrics;

On the YALE data set, when the missing rate reaches 0.4, the accuracy shows a downward trend, but the drop is small, indicating good stability;

On the other two data sets, the results remain stable as the missing rate increases, with only gentle fluctuation. This shows that the embodiment is robust when handling incomplete multi-view image data and is widely applicable in practice.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for clustering incomplete multi-view image data based on data restoration, characterized by comprising the following steps:

S1: inputting a multi-view image data set with missing samples;

S2: repairing the missing multi-view image data set at the data level based on a Pearson correlation coefficient calculation method, to obtain a complete multi-view image data set;

S3: constructing a multi-kernel collaborative representation model and feeding the complete multi-view image data set into the multi-kernel collaborative representation model for learning, to obtain a robust representation of the image data containing multiple kinds of valid information, wherein different views yield different robust representations;

S4: constructing a robust subspace learning model and feeding the robust representation into the robust subspace learning model for learning, to obtain a subspace representation of the image data, wherein different views yield different subspace representations;

S5: constructing a low-rank tensor model, feeding the subspace representations into the low-rank tensor model, and reorganizing them into a three-dimensional form to explore hidden information of the image data in three-dimensional space;

S6: defining a joint representation model, and placing the multi-kernel collaborative representation model, the robust subspace learning model and the low-rank tensor model into the joint representation model;

S7: solving for the optimal value of each variable using the complete multi-view image data set, so as to obtain a highly discriminative fused subspace;

S8: feeding the fused subspace into a clustering algorithm to obtain the required clustering result.

2. The method for clustering incomplete multi-view image data based on data restoration according to claim 1, wherein in step S2, the formula for repairing the missing multi-view image data set at the data level based on the Pearson correlation coefficient calculation method is:

[Formula not reproduced in the source text.]

wherein p denotes the dimension of the current sample, v denotes the number of views, w denotes the number of missing samples, x^b denotes the five samples with the highest similarity to the current missing sample, and cor denotes the correlation coefficient between the current sample and sample x^b.

3. The method for clustering incomplete multi-view image data based on data restoration according to claim 1, wherein in step S3, the multi-kernel collaborative representation model is:

[Formula not reproduced in the source text.]

wherein K denotes the kernel matrix obtained from the complete image data, and H denotes the robust representation of the image data, whose size is given by c, the feature representation dimension, and n, the number of samples.

4. The method for clustering incomplete multi-view image data based on data restoration according to claim 1, wherein in step S4, the robust subspace learning model is:

[Formula not reproduced in the source text.]

wherein α₁, α₂, γ₁ and γ₂ denote regularization parameters, ‖·‖_F denotes the Frobenius norm, tr(·) denotes the trace of a matrix, H₁ and H₂ denote the robust data representations of view 1 and view 2 respectively, K₁ and K₂ denote the kernel matrices obtained from view 1 and view 2 respectively, and G₁ and G₂ denote the robust subspace matrices of view 1 and view 2 respectively.

5. The method for clustering incomplete multi-view image data based on data restoration according to claim 1, wherein in step S5, the low-rank tensor model is:

[Formula not reproduced in the source text.]

wherein G denotes the third-order tensor formed from the robust subspace matrices of view 1 and view 2, δ denotes the regularization constant, ‖·‖_F denotes the Frobenius norm of a tensor, and the two remaining symbols denote a low-rank tensor structure in general and the low-rank tensor structure of the tensor G in particular.

6. The method for clustering incomplete multi-view image data based on data restoration according to claim 1, wherein in step S6, the joint representation model is:

[Formula not reproduced in the source text.]

7. The method for clustering incomplete multi-view image data based on data restoration according to claim 1, wherein in step S7, the step of solving for the optimal value of each variable using the complete multi-view image data set comprises:

following the alternating direction method of multipliers, iteratively solving for the optimal value of one variable at a time while the other variables are held fixed.

8. The method for clustering incomplete multi-view image data based on data restoration according to any one of claims 1-7, further comprising the steps of:

calculating the clustering accuracy on the incomplete multi-view image data;

calculating the normalized mutual information on the incomplete multi-view image data;

and calculating the purity on the incomplete multi-view image data.