CN112464005B

CN112464005B - Depth-enhanced image clustering method

Info

Publication number: CN112464005B
Application number: CN202011343296.9A
Authority: CN
Inventors: 陈志奎; 金珊; 高静; 李朋; 张佳宁; 宋鑫
Original assignee: Dalian University of Technology
Current assignee: Dalian University of Technology
Priority date: 2020-11-26
Filing date: 2020-11-26
Publication date: 2021-12-03
Anticipated expiration: 2040-11-26
Also published as: CN112464005A

Abstract

The invention provides a depth-enhanced image clustering method, which belongs to the technical field of image clustering and data mining, and comprises the following steps of 1) pre-training an encoding and decoding network and initializing a potential feature space; 2) initializing clustering centroids in a potential feature space by adopting a traditional K-means method, and distributing Bernoulli-logistic units to the centroids; 3) calculating a logistic regression parameter and Bernoulli distribution between the point and the unit; 4) a reward regression strategy is used for dynamically distributing temporary rewards, and the movement locus of each mass center is calculated by combining with the distribution of auxiliary targets; 5) and calculating weight, and iteratively optimizing the clustering unit until a convergence condition is met, thereby completing the depth enhanced image clustering process. The method is based on the reinforcement learning idea, utilizes the potential characteristics to express and adjust the clustering center of mass in a combined manner by using the reward regression strategy, fully applies all clustering information, particularly clustering information of adjacent areas to the clustering analysis process, effectively improves the clustering fuzzy problem in the interaction of environment and behavior, and effectively improves the clustering performance.

Description

Depth-enhanced image clustering method

Technical Field

The invention belongs to the technical field of image clustering and reinforcement learning, and relates to a depth image clustering method based on reinforcement learning.

Background

With the rapid development of the internet of things technology and the network information technology, the popularization range of electronic products such as smart phones and tablet computers is wider, more and more data can be collected, the data structure is more and more complex, and especially the amount of unstructured image data is increased explosively. The image data contains rich semantic information for research and use in various fields, but the rich semantic information in the data is difficult to accurately acquire due to the influence of complex data structure and high dimensionality. Therefore, a new method is urgently needed to be researched to deeply mine abundant information in massive image data.

The clustering usually performs data analysis in an unsupervised learning or self-supervised learning mode, is one of important research contents in the fields of data mining, image processing and the like, and effectively solves a plurality of problems of data mining by the idea of 'same type and different types'. Conventional clustering algorithms typically take a given data feature representation as input and then cluster the feature representation using different models, but for some high-dimensional, complex image data, it may be difficult to find its intrinsic pattern structure, i.e., "dimensional disaster," due to a lack of a sample similarity measure that is effective in a high-dimensional space. For high-dimensional data clustering, the conventional corresponding method comprises subspace clustering, feature dimension reduction, feature extraction and the like. In recent years, deep learning brings a new solution for high-dimensional and complex data clustering due to the unique advantages of the deep learning on feature representation, and a plurality of effective deep clustering algorithms are derived. The more representative deep clustering method, namely a Deep Embedded Clustering (DEC) algorithm, optimizes parameters and clustering distribution of the deep neural network simultaneously, and provides a powerful tool for clustering research.

At present, in the process of carrying out cluster analysis on large-scale data, the influence of potential feature representation on a clustering effect is fully considered by a deep clustering algorithm, so that the problem of dimension disaster is effectively solved. However, the existing deep clustering algorithm lacks consideration on the influence of the whole clustering environment, especially the adjacent area environment on the clustering effect, the similarity of the same kind is enhanced more and more along with the increase of the iteration times, and the heterogeneous difference is not obvious, so that the clustering of the partial image input points is fuzzy. In the process of interaction with the environment, reinforcement learning learns the strategy by acquiring rewards and guiding behaviors to maximize return, so that the accuracy of a clustering result is improved by guiding the walking direction of a prototype by adopting a reinforcement learning idea on the premise of not neglecting potential feature representation on the basis of deep clustering.

In summary, the invention provides a depth-enhanced image clustering method, which mainly considers the use problem of all clustering information in the iterative process of clustering analysis, and utilizes the clustering information after each iteration to adjust a clustering prototype and guide the clustering process.

Disclosure of Invention

The invention provides a depth-enhanced image clustering method. Firstly, in order to solve the dimensionality disaster, the method selects a deep self-encoder to perform dimensionality reduction on original image data, and obtains deep semantic features contained in the data. Secondly, the invention designs a Bernoulli-logistic unit to represent the clustering prototype, and effectively utilizes various available information to adjust the influence of the adjacent area on the clustering prototype. Finally, the invention adopts a reward strategy of reinforcement learning, distributes rewards for each cluster after the available information of each cluster is obtained, guides the behavior action of the clustering prototype in the clustering environment and obtains more accurate clustering results. In summary, the invention provides a depth-enhanced image clustering method, which adopts a learning mode of reward regression to learn the potential features of images from large-scale unlabeled data and perform clustering division so as to improve the Accuracy (ACC) of the clustering method, adjust the lander index (ARI) and standardize mutual information (NMI).

In order to achieve the purpose, the technical scheme adopted by the depth-enhanced image clustering method comprises the following steps:

step 1, pre-training an encoding and decoding network, and learning potential features of an image;

step 2, mining a clustering prototype in the potential feature space by adopting a K-means method, and distributing a Bernoulli-logistic unit for the clustering prototype;

step 3, randomly selecting a sample x_iCalculating the logic between the point and the clustering prototypeRegression probability and bernoulli distribution parameters;

step 4, dynamically distributing temporary rewards by using a reward regression strategy, and calculating motion tracks of various prototypes by combining Bernoulli distribution;

step 5, calculating weight, iteratively optimizing clustering prototypes until convergence conditions are met, and completing a deep enhanced clustering process;

the invention has the beneficial effects that: the invention designs a deep enhanced clustering method aiming at image data, considers the interaction problem of the clustering environment and the prototype wandering direction in the clustering process, designs a Bernoulli-logistic unit and dynamically updates clustering information. Meanwhile, the method adjusts the clustering prototype by using a reward regression strategy based on the latent characteristics of the image and by using the idea of reinforcement learning, fully applies all clustering information, especially the clustering information of the adjacent areas to image clustering analysis, and effectively solves the problem of clustering blur. The method disclosed by the invention is used for carrying out experiments on evaluation indexes ACC, ARI and NMI commonly used by the clustering method, and proves that the method can effectively improve the clustering accuracy.

Drawings

FIG. 1 is a frame diagram of a depth-enhanced image clustering method;

FIG. 2 is a flow chart of a method of the present invention.

Detailed Description

The following further describes embodiments of the present invention with reference to the drawings.

FIG. 1 is a frame diagram of a deep enhanced clustering method. Firstly, a latent feature representation of data is extracted by a deep self-encoder, and high-dimensional original image data is mapped to a low-dimensional feature space, so that the problem of dimension disaster of the high-dimensional data is solved. Secondly, mining the clustering mass center of the data by using a K-means method, initializing clustering prototypes, distributing Bernoulli-logistic units to the clustering prototypes, and storing clustering environment information in the iteration process. Then, the similarity between the data points and the clustering prototypes in the characteristic space is measured by using the Euclidean distance, and the logistic regression parameters of the clustering and the Bernoulli distribution with high confidence level are updated. Secondly, dynamically rewarding and punishing each prototype by using a reward regression strategy, updating the motion track of the clustering prototype by combining Bernoulli distribution, ensuring the use of all clustering environments, particularly environment information of adjacent areas, and finishing the interaction of the current clustering environment and the input point behaviors. Finally, the process is repeated using a reinforcement learning algorithm by jointly using bernoulli distribution and reward penalty until the convergence condition is satisfied.

The method comprises the following specific steps:

the original image data can provide more abundant and detailed information due to the characteristic of higher dimensionality, but at the same time, the understandability and usability of the data are greatly reduced due to the improvement of the data dimensionality. In order to solve the problem of the disaster of the dimension of the image data, the invention adopts a deep self-encoder model, minimizes the reconstruction error in an unsupervised mode for training, extracts the high-order features of the input data layer by layer in the process, reduces the dimension of the input data, and converts the complex input image data into a simple low-dimensional feature space.

The deep layer self-encoder network is formed by stacking a noise reduction self-encoder network, the noise reduction self-encoder network is formed by an encoding layer and a decoding layer, in the training process, the network randomly destroys the input of each noise reduction self-encoder, and then reconstructs the original input as the output to obtain the potential representation of the input data. The network can be defined as the following process:

setting image data x as the input of a noise reduction self-encoder; dropout () sets the partial input to 0 as a random mapping function; f. of_e、f_dAs a mapping function of the coding layer and the decoding layer, θ ═ W_e,b_e,W_d,b_dAre parameters of the network model; the potential features h output by the encoding layer will be the input to the decoding layer. Meanwhile, in order to ensure that the reconstructed image data x' is consistent with the original image data x as much as possible, a minimum square loss function is adopted

The method of (3) optimizes the model.

Specifically, in order to make the reconstructed feature vector contain all the information of the original feature vector, the activation functions of the coding layer and the decoding layer of the first noise reduction self-encoder are set as identity functions, and the activation functions of the coding layer and the decoding layer of the other noise reduction self-encoders are set as ReLu functions. After the construction of each noise reduction self-encoder network is completed, parameters of an encoding layer and a decoding layer are initialized by random distribution, then training is carried out by adopting a random gradient descent (SGD) back propagation mode, then the encoding layer and the decoding layer of each noise reduction self-encoder are disassembled, and an integral model frame of a deep self-encoder is combined according to input and output dimensions. Next, the SGD is adopted again to train the network model, so that the reconstruction loss is reduced to the maximum extent, a good deep self-encoder network model is generated, and the encoding part is selected as the original feature space to the potential feature space (f)_θX → H), where θ is a parameter of the input point, X is an original feature representation of the input point, and H is a potential feature representation of the input point.

In the process of pre-training the deep self-encoder, the iteration number is set to 300, the number of samples in each training is set to 256, and experiments prove that the above-mentioned super-parameters are set, so that more effective potential feature representation H of the original image data can be obtained.

Step 2, initializing a clustering prototype, and distributing Bernoulli-logistic units;

generating an input point x by using a trained deep self-coder model_iIs represented by the potential feature of (a)_iThe composition set H ═ H_i|h_i＝f_θ(x_i),x_iE.g. X, i is 1,2

And then, aggregating the prototypes on the potential feature representation H through a K-means clustering algorithm

And updating to obtain K initialized clustering prototypes.

Specifically, the purpose of the K-means clustering algorithm is to find K clustering prototypes by optimizing the following objective functions:

wherein d (c)_k,h_i) Is the input point h_iAnd clustering prototype c_kThe distance between the two adjacent channels is Euclidean distance, and the calculation process is as follows:

wherein n is the input point h_iAnd clustering prototype c_kOf (c) is calculated.

In order to effectively solve the process, a proper K value is selected according to the priori knowledge, and then a heuristic iteration method is adopted to select a proper clustering prototype. Specifically, K samples are randomly selected from a sample set as initial prototypes, the distance between each sample and each prototype is calculated according to the formula (6), and each sample is allocated to the nearest clustering prototype to obtain an initial clustering division result; then updating the prototype to obtain a new clustering division result; repeating the above process until the prototype is no longer usedThe change occurs, and the final clustering result is obtained as the clustering prototype set of the potential feature representation

Clustering prototypes obtained by using K-means

And constructing a Bernoulli-logistic unit BLlist ═ w, p, dw, fx } which contains clustering information of the clustering prototype under the current environment as the clustering prototype, wherein w is the weight of the clustering prototype under the current environment, p is the Bernoulli distribution coefficient of the clustering prototype, dw is the Euclidean distance of the prototype, and fx is the logistic regression coefficient of the prototype. The initial weight of the clustering prototype can be obtained through a K-means clustering algorithm, and the initial values of the other parameters are set to be 0 values.

Step 3, strengthening clustering;

after the latent feature extraction and cluster prototype initialization are completed, the main part of the invention, namely the initial non-linear mapping f to the original image data, will be performed_θAnd clustering prototypes in Bernoulli-logistic units

The strengthening process of (1). The strengthening process mainly comprises the following two steps. And 3-1, calculating the potential feature representation of the input points, the logistic regression parameters of the clustering prototypes and the Bernoulli distribution parameters, and dynamically updating the parameters into the Bernoulli-logistic unit. And 3-2, rewarding or punishing each clustering prototype in the current clustering environment by adopting a reward regression strategy, and learning clustering loss by combining parameters in a Bernoulli-logistic stetty unit. Obtaining the probability p corresponding to the clustering prototype closest to the potential feature of the input image by using the step 3-1 and the step 3-2_kIndicating variable y_iPrize value r_iAnd determining the motion trail of the prototype.

1) Bernoulli-logistic distribution

According to the method, the Bernoulli-logistic distribution is selected as the auxiliary target distribution to be used for the activation unit, namely, the similarity between the potential features of the input image and the clustering prototype is measured through the Bernoulli-logistic distribution, the clustering accuracy is improved, and the confidence coefficient of clustering distribution is improved.

Firstly, randomly selecting potential features h of an input image to interact with each Bernoulli-logistic unit, and calculating the Euclidean distance s between the potential features h of the input image and each Bernoulli-logistic unit_k＝d(h,w_k) Then, a logistic function is used to measure the similarity between the latent features h of the input image and the prototype to which each cell belongs, and the function is as follows:

after obtaining the probability distribution of the current point, estimating by using an auxiliary cost function, wherein the calculation formula is as follows:

p_k＝h(s_k)＝2×(1-F(s_k)) (8)

wherein p is_kBelonging to a cluster prototype c for the potential feature of the input image_kThe probability of the cluster is, when the potential feature of the input image is closer to a unit, the probability p of the corresponding unit_kThe larger the size, the smaller the size otherwise.

Because the influence of each clustering unit on the potential characteristics of the input image is not uniform due to the uncertainty of probability distribution under Bernoulli-logistic distribution, a formula (9) is designed to generate random seeds p, the random seeds p are compared with the probability obtained in the logical regression, an indication variable y is obtained, and the influence effect of each clustering prototype on the whole clustering result is balanced. The calculation formula is as follows:

the Bernoulli-logistic distribution obtained above is dynamically updated to the clustered Bernoulli-logistic unit along with an iterative process, and is used for iteratively updating the weight information of the clustering prototype to which the unit belongs.

2) Reward regression strategy

In order to fully utilize the influence of the clustering environment on the clustering result and to prominently consider the positive or negative effect of each front clustering on the back clustering, the invention adopts the idea of reinforcement learning, and selects a proper reward strategy to further clarify the learning direction for the set clustering prototype so as to seek the optimal result. For this, each clustering prototype c is calculated_kOf (a) metric value y_kAnd then, allocating an evaluation decision for each prototype by using an incentive regression strategy to dynamically update the behavior generated after interaction between the input point and each clustering prototype, rewarding the effective clustering prototype and performing punishment operation on the adjacent invalid region, thereby solving the problem that the consideration of all clusters, particularly the adjacent regions, in a clustering algorithm is insufficient. The specific strategy scheme is as follows:

wherein the prototype is a closer prototype, i.e. a more compact prototype

When the current clustering prototype needs to be matched with a more active action scheme, namely a reward signal is sent to the current clustering prototype, the current clustering prototype needs to be matched with a more active action scheme

Conversely, when the prototype is a distant prototype, i.e. the prototype is a distant prototype

When the prototype is described as an error prototype, a penalty signal is sent to the prototype, i.e. the prototype is interpreted as an error prototype

In doing so, the weight of the clustering prototype is not affected.

Step 4, updating the weight and optimizing a clustering prototype;

after the strengthening task of the clustering prototype is completed, the method adopts a strategy gradient algorithm to update the weight parameter of the prototype k corresponding to the input point x, and the initial updating formula is as follows:

wherein a > 0 is the learning rate, r is the enhancement signal obtained during the enhancement process, b_ikA reinforcement baseline; while

Is the unit weight w_ikDegree of transformation of corresponding features, the value being subject to probability density function g under continuous distribution_ik(y_k；w_ik,h_i) Is influenced by variations in, i.e. by, the latent features h of the input image_iAnd a weight w_ikThe determined current prototype indicates a variable y in the current environment_kThe influence of (a);

according to the result of the enhanced clustering task, combining the indication variable y_kWith allocation results of reward policies, and allocation of reinforcement baselines b_ikThe available final weight update formula is as follows:

Δw_ik＝ar_k(y_k-p_k)(-fx/(1-p_k))(w_ik-h_i) (12)

clustering prototype

The updating is performed by equation (12). And when the iteration times reach the preset maximum training times 60000 times, the whole clustering task is completed.

The method comprises the following steps:

the whole process of the invention is divided into three parts: a characteristic preprocessing process, a clustering prototype initialization process and a reinforced clustering process. Firstly, a deep self-encoder model is built, a decoder and an encoder of a noise reduction self-encoder are adopted to pre-train a network, and then the built encoding layer is utilized to map original high-dimensional image data to a low-dimensional potential feature space, so as to obtain potential feature representation of an image. Secondly, based on potential feature representation extracted from data in the feature preprocessing process, a traditional K-means algorithm is adopted to initialize clustering prototypes, and the clustering prototypes are stored in a Bernoulli-logistic unit mode. And finally, acquiring reward signals of each unit by adopting a reward regression strategy in reinforcement learning, and dynamically optimizing a clustering result by combining Bernoulli-logistic distribution until a clustering completion condition is met, wherein the specific flow is shown in figure 2.

And (4) verification result:

in the experiments of the present method, two general image data sets were selected: the MNIST writes the digital data set and the fast-MNIST data set to verify the effectiveness of the method, and the detailed information of the data sets is shown in Table 1.

MNIST handwritten digit data set: consisting of 70000 handwritten digits of 28 x 28 pixel size. The present invention reconstructs each image into a 784-dimensional vector.

The fast-MNIST dataset: consisting of 70000 apparel images of 28 x 28 pixel size. The present invention reconstructs each image into a 784-dimensional vector.

Table 1 details of the data set

Data set	Number of samples	Sample dimension	Number of categories
				MNIST	70000	784	10
Fashion-MNIST	70000	784	10

The method uses the traditional clustering evaluation criteria: clustering Accuracy (ACC), Adjusted Rand Index (ARI), and Normalized Mutual Information (NMI).

To verify the performance of the invention, 3 general typical clustering methods were chosen: the traditional unsupervised clustering method K means (K-means), the deep clustering method (AE + K-means) and the deep embedded clustering method (DEC) are compared.

The results of comparing the performance of ACC, ARI and NMI on MNIST and Fashion-MNIST data sets by the method proposed by the present invention are shown in tables 3 and 4.

Table 2 comparison of results on MNIST dataset for each experiment

Experiments	ACC	ARI	NMI
				K-means	0.5319	0.3633	0.4971
AE+K-means	0.8184	0.7421	0.7790
				DEC	0.8430	0.8181	0.8437
The invention	0.9292	0.8493	0.8438

TABLE 3 comparison of results on the Fashinon-MNIST dataset for each experiment

Experiments	ACC	ARI	NMI
				K-means	0.4758	0.3485	0.5122
AE+K-means	0.5713	0.4259	0.5764
				DEC	0.5829	0.4823	0.6404
The invention	0.6166	0.4871	0.6002

From tables 2 and 3, it can be observed that the method provided by the invention is superior to the comparative baseline method in both the evaluation indexes ACC and ARI of the data sets of MNIST and fast-MNIST, which proves the effectiveness of the invention. In particular, compared with the K-means method, the method has the advantages that the potential features of the image data can be extracted through the deep self-encoder network, and the clustering effect is improved. Compared with the AE + K-means method, the method has the advantages that the Bernoulli-logistic unit is adopted, the clustering information in the clustering unit is used for adjusting the cluster center, and the clustering performance is improved. Compared with the DEC method, the method has the advantages that the reward regression strategy is adopted, the valid clustering units are rewarded, meanwhile, invalid adjacent clustering units are punished, the influence of all clusters, particularly adjacent areas, on the clustering effect is fully considered, and the clustering performance is improved.

Claims

1. A depth-enhanced image clustering method is characterized by comprising the following steps:

the deep layer self-encoder network is formed by stacking a noise reduction self-encoder network, the noise reduction self-encoder network consists of an encoding layer and a decoding layer, in the training process, the network randomly destroys the input of each noise reduction self-encoder, and then reconstructs the original input as the output to obtain the potential representation of the input data; the denoising self-encoder network is defined as the following process:

setting image data x as the input of a noise reduction self-encoder; dropout () sets the partial input to 0 as a random mapping function; f. of_e、f_dAs a mapping function of the coding layer and the decoding layer, θ ═ W_e,b_e,W_d,b_dAre parameters of the network model; the potential characteristics h output by the coding layer are used as the input of the decoding layer; meanwhile, in order to ensure that the reconstructed image data x' is consistent with the original image data x as much as possible, a minimum square loss function is adopted

The method of (3) optimizes the model;

setting the activation functions of the coding layer and the decoding layer of the first noise reduction self-encoder as identity functions, and setting the activation functions of the coding layer and the decoding layer of the other noise reduction self-encoders as ReLu functions; after the network of each noise reduction self-encoder is built, parameters of a coding layer and a decoding layer are initialized by random distribution, then training is carried out by adopting a random gradient descent back propagation mode, then the coding layer and the decoding layer of each noise reduction self-encoder are disassembled, and an integral model frame of a deep self-encoder is combined according to input and output dimensions;

and training the network model by adopting the SGD again to reduce the reconstruction loss to the maximum extent, generating a good deep self-encoder network model, and selecting a coding part as a mapping tool f from an original characteristic space to a potential characteristic space_θX → H, where θ is a parameter of the input point, X is an original feature representation of the input point, and H is a potential feature representation of the input point, and finally a potential feature representation H of the original image data is obtained;

generating an input point x by adopting the deep self-coder model trained in the step 1_iIs represented by the potential feature of (a)_iThe composition set H ═ H_i|h_i＝f_θ(x_i),x_iE.g. X, i is 1,2

Updating to obtain K initialized clustering prototypes;

clustering prototypes obtained by using K-means

As a clustering prototype, constructing a bernoulli-logistic unit BLlist ═ { w, p, dw, fx }, which contains clustering information of the clustering prototype in the current environment, wherein w is the weight of the clustering prototype in the current environment, p is the bernoulli distribution coefficient of the clustering prototype, dw is the euclidean distance of the prototype, fx is the prototypeA logistic regression coefficient; obtaining the initial weight of a clustering prototype through a K-means clustering algorithm, and setting the initial values of the other parameters as 0 values;

step 3, strengthening clustering;

step 3-1, calculating the potential feature representation of the input points, the logistic regression parameters of the clustering prototypes and the Bernoulli distribution parameters, and dynamically updating the parameters into a Bernoulli-logistic unit;

p_k＝h(s_k)＝2×(1-F(s_k)) (8)

wherein p is_kBelonging to a cluster prototype c for the potential feature of the input image_kThe probability of the cluster is, when the potential feature of the input image is closer to a unit, the probability p of the corresponding unit_kThe larger the size, the smaller the size otherwise;

meanwhile, a random seed p is generated by the formula (9), and is compared with the probability obtained in the logical regression to obtain an indication variable y, so that the influence effect of each clustering prototype on the whole clustering result is balanced; the calculation formula is as follows:

the Bernoulli-logistic distribution obtained in the above way is dynamically updated to a Bernoulli-logistic unit of the clustering prototype along with the iterative process, and is used for iteratively updating the weight information of the clustering prototype to which the unit belongs;

step 3-2, reward or punishment is carried out on each clustering prototype under the current clustering environment by adopting a reward regression strategy, and clustering loss is learned by combining parameters in a Bernoulli-logistic stetty unit;

in calculating each clustering prototype c_kOf (a) metric value y_kThen, an evaluation decision is distributed to each prototype by using an incentive regression strategy to dynamically update behaviors generated after interaction between the input points and each clustering prototype, and punishment operation is carried out on adjacent invalid regions while an effective clustering prototype is rewarded, so that the problem that the consideration of all clusters, particularly the adjacent regions, in a clustering algorithm is insufficient is solved; the specific strategy scheme is as follows:

wherein the prototype is a closer prototype, i.e. a more compact prototype

In the process of carrying out the strategy, the weight of the clustering prototype is not influenced;

determining a prototype motion track by using the step 3-1 and the step 3-2;

step 4, updating the weight and optimizing a clustering prototype;

updating the weight parameters of the prototype k corresponding to the input point x by adopting a strategy gradient algorithm, wherein an initial updating formula is as follows:

according to the result of the enhanced clustering task, combining the indication variable y_kWith allocation results of reward policies, and allocation of reinforcement baselines b_ikWhen it is 0, the final weight update formula is as follows:

Δw_ik＝ar_k(y_k-p_k)(-fx/(1-p_k))(w_ik-h_i) (12)

clustering prototype

Updating by formula (12); and when the iteration times reach the preset maximum training times, finishing the whole clustering task.

2. The method for depth-enhanced image clustering according to claim 1, wherein the K-means clustering algorithm in step 1 aims to find K clustering prototypes by optimizing the following objective functions:

wherein d (c)_k,h_i) Is the input point h_iAnd clustering prototype c_kThe Euclidean distance is adopted as the distance between the two, and the calculation process is as follows:

wherein n is the input point h_iAnd clustering prototype c_kDimension (d);

firstly, randomly selecting K samples from a sample set as initial prototypes, calculating the distance between the samples and each prototype according to the formula (6), and distributing each sample to the nearest clustering prototype to obtain an initial clustering division result; then updating the prototype to obtain a new clustering division result; repeating the above process until the prototype is not changed any more, and obtaining the final clustering result as the clustering prototype set of the potential feature representation