CN110674848A

CN110674848A - High-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation

Info

Publication number: CN110674848A
Application number: CN201910819539.2A
Authority: CN
Inventors: 肖亮; 黄楠
Original assignee: Nanjing Tech University
Current assignee: Nanjing Tech University
Priority date: 2019-08-31
Filing date: 2019-08-31
Publication date: 2020-01-10

Abstract

The invention discloses a high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation, comprising the following steps: partitioning the high-dimensional dataset into non-overlapping data subsets using spatial proximity; constructing a structure dictionary and calculating the correlation between the structure dictionary and each data node; defining a bipartite graph for joint clustering of the high-dimensional data; constructing the adjacency matrix of the bipartite graph; constructing a bipartite graph segmentation optimization model; normalizing the adjacency matrix and computing its left and right singular vectors; and jointly clustering the left and right singular vectors with the K-means algorithm to obtain the final cluster labels. By using non-overlapping local neighborhood subsets and structure dictionary learning under a joint sparse representation constraint, the method simultaneously mines the local sparsity and self-similarity of the high-dimensional data. By using both the left and right singular vectors of the adjacency matrix, it clusters the high-dimensional data nodes and the structure dictionary atoms at the same time, which improves clustering accuracy and enhances robustness to noise.

Description

High-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation

Technical Field

The invention belongs to the technical field of high-dimensional data processing, and particularly relates to a high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation.

Background

With the rapid development of information technology, the data acquired across industries is growing exponentially in both scale and complexity. Much of this data is high-dimensional, with hundreds or thousands of dimensions or even more; hyperspectral remote sensing data is a typical example. Hyperspectral remote sensing uses an imaging spectrometer to acquire continuous, narrow-band image data with nanometer-scale spectral resolution over hundreds of spectral bands, ranging from visible light to short-wave infrared and even mid-infrared and thermal infrared. The data contains rich spatial and spectral information, and stacking the two yields a hyperspectral data cube, so the data integrates image and spectrum. It is widely used in military reconnaissance, environmental monitoring, geological exploration, crop assessment, disaster early warning and other fields. Because a hyperspectral image is an image cube that integrates image and spectrum, the core of its quantitative analysis is spectral analysis, and classification is one of the important tasks in understanding hyperspectral data. Hyperspectral image classification methods are either supervised or unsupervised, depending on whether sample label information is used. Since a large number of labeled training samples is generally difficult to obtain, unsupervised classification, i.e. clustering, has wider application value in this field. Unsupervised classification is also an important route to hyperspectral quantitative analysis. Its development trend is from unsupervised classification at the spectral-information level to unsupervised classification using spatial-spectral context, with a structured sparse representation learning mechanism introduced to mine the spatial-spectral context structure of hyperspectral images and obtain fine, sub-pixel-level classification results.

For unsupervised classification, existing high-dimensional data clustering algorithms mainly include center-based, distribution-based, density-based and connectivity-based methods. However, these methods are simple and lack the ability to handle data with complex structure: when the sample space is not convex they easily fall into local optima, and they perform poorly on high-dimensional data. Subspace-based clustering algorithms, which build on spectral graph theory, have attracted increasing academic attention in recent years. Compared with traditional clustering algorithms, they can cluster sample spaces of arbitrary shape and converge to a global optimum, so they handle many practical problems effectively and have considerable research value and application prospects. Spectral clustering, one kind of subspace clustering, is a graph partitioning technique that converts the high-dimensional data clustering problem into a graph partitioning optimization problem: a weighted graph is partitioned into disjoint subsets such that the sum of the weights of the edges connecting the subsets is minimized. However, spectral clustering uses only the row information of the adjacency matrix when forming the disjoint subsets, which tends to discard part of the information and degrades the overall accuracy. Furthermore, spectral clustering generally treats the clustering of high-dimensional data in one direction only, i.e. it clusters either the dictionary atoms or the sparse representation coefficients.

Disclosure of Invention

The invention discloses a high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation, which introduces a structured sparse representation learning mechanism to mine the context structure information of high-dimensional data and obtain fine, sub-pixel-level classification results.

The technical solution for realizing the purpose of the invention is as follows: a high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation comprises the following steps:

Step 1: partition the high-dimensional dataset into non-overlapping data subsets using spatial proximity;

Step 2: construct a structure dictionary and calculate the correlation between the structure dictionary and each data node, i.e. learn the joint sparse representation coefficients and the structure dictionary, using the context features of the high-dimensional data, in an optimization model with a joint sparse representation constraint;

Step 3: define a bipartite graph for joint clustering of the high-dimensional data, i.e. define an undirected bipartite graph comprising two disjoint vertex sets;

Step 4: construct the adjacency matrix of the bipartite graph, i.e. map the joint sparse representation coefficients to a non-negative adjacency matrix of the graph;

Step 5: construct the bipartite graph segmentation optimization model, i.e. build the optimization model from the adjacency matrix using the normalized cut criterion;

Step 6: normalize the adjacency matrix and compute its left and right singular vectors, i.e. compute the normalized adjacency matrix and decompose it by singular value decomposition to obtain the left and right singular vectors;

Step 7: jointly cluster the left and right singular vectors using the K-means algorithm to obtain the final cluster labels.

Further, in the first step, partitioning the high-dimensional dataset into non-overlapping data subsets using spatial proximity means dividing it into a number of non-overlapping w × w square spatial neighborhood subsets, where w ≥ 3.

Further, the second step constructs a structure dictionary and calculates the correlation between the structure dictionary and each data node. The specific process is as follows:

(1) For the i-th data node $y_i$, $1 \le i \le n$, where n is the number of samples, the spatial neighborhood subset containing $y_i$ is extracted from the high-dimensional data and denoted $\Gamma_i$:

$$\Gamma_i = [\,y_{i,1}, \ldots, y_{i,|\Gamma_i|}\,]$$

where $y_{i,|\Gamma_i|}$ denotes the $|\Gamma_i|$-th data node of the spatial neighborhood subset $\Gamma_i$;

(2) The structure dictionary learning model is defined in terms of the following quantities: $X = [x_1, \ldots, x_i, \ldots, x_n] \in \mathbb{R}^{m \times n}$ is the joint sparse representation coefficient matrix; $D = [d_1, \ldots, d_i, \ldots, d_m] \in \mathbb{R}^{d \times m}$ is the structure dictionary; the $\ell_2/\ell_1$ norm is the sum of the $\ell_2$ norms of the rows of a matrix; and a regularization parameter weights the sparsity term. The joint sparse representation coefficient matrix X provides the correlation between the high-dimensional data nodes and the dictionary atoms.

Further, the third step defines the bipartite graph for joint clustering of the high-dimensional data. The specific process is as follows:

Define an undirected bipartite graph $G = (\mathcal{D}, \mathcal{Y}, E)$ consisting of two disjoint vertex sets, where $\mathcal{D} = \{d_1, \ldots, d_m\}$ is the set of dictionary atoms and $\mathcal{Y} = \{y_1, \ldots, y_n\}$ is the set of high-dimensional data nodes. The two sets are connected by the edge set E, which holds all edge weights $E_{ij}$, where $E_{ij}$ is the weight of the edge between the i-th vertex and the j-th vertex of the bipartite graph; edges $E_{ij}$ exist only between the two heterogeneous vertex sets.

Further, the fourth step constructs the adjacency matrix of the bipartite graph by mapping the joint sparse representation coefficients to a non-negative adjacency matrix of the graph, specifically

$$W = \begin{bmatrix} 0 & A \\ A^{T} & 0 \end{bmatrix}$$

where $A = |X|$.

Further, the fifth step constructs the bipartite graph segmentation optimization model. The specific process is as follows:

(1) Consider a division into two clusters, and assume $(V_1, V_2)$ is a partition of the vertices of the bipartite graph. The normalized cut is adopted to partition the bipartite graph and can be written as:

$$\mathrm{Ncut}(V_1, V_2) = \frac{\mathrm{cut}(V_1, V_2)}{\mathrm{cut}(V_1, V_2) + \mathrm{within}(V_1)} + \frac{\mathrm{cut}(V_1, V_2)}{\mathrm{cut}(V_1, V_2) + \mathrm{within}(V_2)}$$

where $\mathrm{cut}(V_1, V_2) = \sum_{i \in V_1,\, j \in V_2} E_{ij}$ represents the accumulated edge weights between the clusters, and $\mathrm{within}(V_k) = \sum_{i \in V_k,\, j \in V_k} E_{ij}$, $k = 1, 2$, represents the accumulated edge weights within a cluster;

(2) Let q be the partition vector of the bipartite graph G, with $q_i = 1$ if vertex i belongs to the first cluster and $q_i = -1$ otherwise. The Rayleigh quotient of the vector q is equivalent to the segmentation optimization model in step (1), specifically:

$$\min_{q} \frac{q^{T} L q}{q^{T} D q}$$

where L is the Laplacian matrix of the graph and D is the diagonal matrix of vertex degrees;

the above formula is equivalent to:

$$\min_{q} \frac{q^{T} L q}{q^{T} D q} \quad \text{s.t.} \quad q^{T} D e = 0$$

where all elements of the vector e are equal to 1;

(3) The discrete partition vector q in the segmentation optimization model can be relaxed to a continuous vector z, specifically:

$$\min_{z \neq 0} \frac{z^{T} L z}{z^{T} D z} \quad \text{s.t.} \quad z^{T} D e = 0$$

Solving this problem corresponds to the generalized eigenvalue problem $L z = \lambda D z$, where z is the eigenvector.

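Further, the sixth step normalizes the adjacency matrix and computes its left and right singular vectors. The specific process is as follows:

(1) In the bipartite graph,

$$L = \begin{bmatrix} D_1 & -A \\ -A^{T} & D_2 \end{bmatrix}, \qquad D = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}$$

where $D_1(i,i) = \sum_j A_{ij}$ and $D_2(j,j) = \sum_i A_{ij}$ are diagonal matrices;

(2) Let $z = [z_1 \; z_2]^{T}$. The generalized eigenvalue problem $L z = \lambda D z$ is equivalent to

$$\begin{bmatrix} D_1 & -A \\ -A^{T} & D_2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \lambda \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$$

The above formula is equivalent to:

$$D_1 z_1 - A z_2 = \lambda D_1 z_1$$

$$-A^{T} z_1 + D_2 z_2 = \lambda D_2 z_2$$

Let $u = D_1^{1/2} z_1$ and $v = D_2^{1/2} z_2$. The above formula is then equivalent to:

$$D_1^{-1/2} A D_2^{-1/2} v = (1 - \lambda)\, u$$

$$D_2^{-1/2} A^{T} D_1^{-1/2} u = (1 - \lambda)\, v$$

Thus, the above equations are equivalent to the singular value decomposition of the normalized matrix $A_n = D_1^{-1/2} A D_2^{-1/2}$.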

Further, the seventh step uses the K-means algorithm to cluster the vector formed by stacking the left and right singular vectors, obtaining the final cluster labels.

Compared with the prior art, the invention has the following notable features: (1) it uses non-overlapping spatial neighborhood subsets; the nodes within a neighborhood subset usually lie in a low-dimensional subspace and are usually represented by dictionary atoms of the same class, which provides discriminability between the classes of the high-dimensional data; (2) dictionary learning under the joint sparse representation constrained optimization framework captures the inherent local sparsity and non-local self-similarity of the high-dimensional data and reduces the computational complexity and the parameter-setting effort; (3) bipartite graph segmentation of the high-dimensional data captures both the row and column information of the adjacency matrix and the correlation between the high-dimensional data nodes and the dictionary atoms.

Drawings

FIG. 1 is a flow chart of the high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation.

FIG. 2 is a schematic diagram of partitioning a high-dimensional dataset into non-overlapping data subsets.

FIG. 3 is a bipartite graph between high-dimensional data nodes and dictionary atoms.

FIG. 4(a) is the ground-truth land-cover map of the Indian Pines dataset.

FIG. 4(b) shows the clustering result on the Indian Pines dataset using the K-means method.

FIG. 4(c) shows the clustering result on the Indian Pines dataset using the CFSFDP method.

FIG. 4(d) shows the clustering result on the Indian Pines dataset using the high-dimensional data clustering method based on spectral dictionary learning and spectral clustering.

FIG. 4(e) shows the clustering result on the Indian Pines dataset using the high-dimensional data clustering method based on spatial dictionary learning and spectral clustering.

FIG. 4(f) shows the clustering result on the Indian Pines dataset using the high-dimensional data clustering method based on joint sparse representation and spectral clustering.

FIG. 4(g) shows the clustering result on the Indian Pines dataset using the high-dimensional data joint clustering method based on spectral dictionary learning and bipartite graph segmentation.

FIG. 4(h) shows the clustering result on the Indian Pines dataset using the high-dimensional data joint clustering method based on spatial dictionary learning and bipartite graph segmentation.

FIG. 4(i) shows the clustering result on the Indian Pines dataset using the method of the present invention.

FIG. 5(a) is the ground-truth land-cover map of the Pavia University dataset.

FIG. 5(b) shows the clustering result on the Pavia University dataset using the K-means method.

FIG. 5(c) shows the clustering result on the Pavia University dataset using the CFSFDP method.

FIG. 5(d) shows the clustering result on the Pavia University dataset using the high-dimensional data clustering method based on spectral dictionary learning and spectral clustering.

FIG. 5(e) shows the clustering result on the Pavia University dataset using the high-dimensional data clustering method based on spatial dictionary learning and spectral clustering.

FIG. 5(f) shows the clustering result on the Pavia University dataset using the high-dimensional data clustering method based on joint sparse representation and spectral clustering.

FIG. 5(g) shows the clustering result on the Pavia University dataset using the high-dimensional data joint clustering method based on spectral dictionary learning and bipartite graph segmentation.

FIG. 5(h) shows the clustering result on the Pavia University dataset using the high-dimensional data joint clustering method based on spatial dictionary learning and bipartite graph segmentation.

FIG. 5(i) shows the clustering result on the Pavia University dataset using the method of the present invention.

Detailed Description

The invention provides a high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation. It uses non-overlapping local neighborhood subsets and structure dictionary learning under a joint sparse representation constraint to simultaneously mine the local sparsity and self-similarity of high-dimensional data, reducing the computational complexity and the parameter-setting effort; and it partitions the high-dimensional data nodes through a bipartite graph to capture the row and column information of the adjacency matrix and the correlation between the high-dimensional data nodes and the dictionary atoms. The specific steps of the invention are described in detail below with reference to FIG. 1:

First, the high-dimensional dataset is partitioned into non-overlapping data subsets using spatial proximity. Taking the image shown in FIG. 4(a) as an example, a hyperspectral image $Y = [y_1, \ldots, y_i, \ldots, y_n] \in \mathbb{R}^{d \times n}$ with d = 200 and n = 21025 is input, and its pixels are divided into non-overlapping 5 × 5 square spatial neighborhood subsets. The division is shown in FIG. 2.
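By way of non-limiting illustration, the block partition of the first step can be sketched in Python with NumPy. The function name `partition_blocks`, the (height, width, bands) array layout and the requirement that both spatial sizes be multiples of w are assumptions of this sketch; the text does not specify how image borders are handled. A random array stands in for the 145 × 145 × 200 Indian Pines cube.

```python
import numpy as np

def partition_blocks(cube, w=5):
    """Split an (H, W, d) cube into non-overlapping w x w spatial blocks.

    Returns an array of shape (num_blocks, w*w, d); each block plays the
    role of one spatial neighborhood subset. H and W must be multiples of w
    (an assumption of this sketch).
    """
    H, W, d = cube.shape
    if H % w or W % w:
        raise ValueError("image height and width must be multiples of w")
    blocks = cube.reshape(H // w, w, W // w, w, d)   # split both spatial axes
    blocks = blocks.transpose(0, 2, 1, 3, 4)          # (block_row, block_col, w, w, d)
    return blocks.reshape(-1, w * w, d)

# Random stand-in with the dimensions quoted for the Indian Pines example.
cube = np.random.default_rng(0).random((145, 145, 200))
gamma = partition_blocks(cube, w=5)
print(gamma.shape)  # (841, 25, 200): 29 x 29 blocks of 25 pixels each
```

Each of the 841 blocks holds 25 pixels of dimension 200, so every pixel belongs to exactly one neighborhood subset.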

Second, a structure dictionary is constructed. The specific process is as follows:

(1) For the i-th pixel $y_i$ ($1 \le i \le 21025$), the 5 × 5 neighborhood subset containing $y_i$ is extracted from the hyperspectral image and denoted $\Gamma_i$:

$$\Gamma_i = [\,y_{i,1}, \ldots, y_{i,25}\,]$$

where $y_{i,25}$ denotes the $|\Gamma_i|$-th hyperspectral pixel of the spatial neighborhood subset $\Gamma_i$.

(2) The structure dictionary learning model is defined in terms of the following quantities: $X = [x_1, \ldots, x_i, \ldots, x_n] \in \mathbb{R}^{m \times n}$ is the joint sparse representation coefficient matrix; Y is the hyperspectral image; $D = [d_1, \ldots, d_i, \ldots, d_m] \in \mathbb{R}^{d \times m}$ is the structure dictionary; the $\ell_2/\ell_1$ norm is the sum of the $\ell_2$ norms of the rows of a matrix; and a regularization parameter weights the sparsity term. The joint sparse representation coefficient matrix X provides the correlation between the hyperspectral image pixels and the dictionary atoms.
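For illustration only, the $\ell_2/\ell_1$ norm mentioned above (the sum of the row-wise $\ell_2$ norms) can be computed as follows. The solver used for the dictionary learning model is not stated in the text; the row-wise shrinkage function `row_shrink` below is the standard proximal step for this norm and is shown purely as an assumption, to make visible why the penalty drives whole rows of a coefficient block to zero, so that the pixels of one neighborhood share the same few atoms. The names `l21_norm`, `row_shrink` and `tau` are hypothetical.

```python
import numpy as np

def l21_norm(X):
    """Sum of the l2 norms of the rows of X."""
    return np.linalg.norm(X, axis=1).sum()

def row_shrink(X, tau):
    """Proximal step for tau * l21_norm: shrink each row's l2 norm by tau.

    Rows whose norm is below tau become exactly zero (row-wise sparsity).
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * X

# Rows = dictionary atoms, columns = pixels of one neighborhood (toy values).
X = np.array([[3.0, 4.0],    # row norm 5
              [0.3, 0.4],    # row norm 0.5
              [0.0, 0.0]])   # unused atom
print(l21_norm(X))           # approximately 5.5
print(row_shrink(X, 1.0))    # first row scaled by 0.8, the other rows zeroed
```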

Third, the bipartite graph for hyperspectral image clustering is defined. The specific process is as follows:

Define an undirected bipartite graph $G = (\mathcal{D}, \mathcal{Y}, E)$ consisting of two disjoint vertex sets, where $\mathcal{D} = \{d_1, \ldots, d_m\}$ is the set of dictionary atoms and $\mathcal{Y} = \{y_1, \ldots, y_n\}$ is the set of hyperspectral image pixels. The two sets are connected by the edge set E, which holds all edge weights $E_{ij}$, where $E_{ij}$ is the weight of the edge between the i-th vertex and the j-th vertex of the bipartite graph; edges $E_{ij}$ exist only between the two heterogeneous vertex sets (vertices within the same set are not connected), as shown in FIG. 3.

Fourth, the adjacency matrix of the bipartite graph is constructed by mapping the joint sparse representation coefficients to a non-negative adjacency matrix of the graph, specifically

$$W = \begin{bmatrix} 0 & A \\ A^{T} & 0 \end{bmatrix}$$

where $A = |X|$.
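As a minimal sketch of this mapping, the following Python code takes the element-wise absolute value of a coefficient matrix X (atoms × data nodes) and places it in the off-diagonal blocks of a symmetric matrix. Ordering the vertices with the m dictionary atoms first and the n data nodes second, and the names `bipartite_adjacency` and `W`, are assumptions of this sketch; the zero diagonal blocks express that no edge joins two atoms or two data nodes.

```python
import numpy as np

def bipartite_adjacency(X):
    """Return A = |X| and the (m+n) x (m+n) adjacency matrix of the bipartite graph."""
    A = np.abs(X)                    # m x n non-negative edge weights
    m, n = A.shape
    W = np.zeros((m + n, m + n))
    W[:m, m:] = A                    # atom-to-node edges
    W[m:, :m] = A.T                  # node-to-atom edges (undirected graph)
    return A, W

# Toy coefficients: 2 atoms, 3 data nodes.
X = np.array([[0.9, -0.7, 0.0],
              [0.0,  0.1, -1.2]])
A, W = bipartite_adjacency(X)
print(W)
```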

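Fifth, the bipartite graph segmentation optimization model is constructed. The specific process is as follows:

(1) For simplicity, consider a division into two clusters, and assume $(V_1, V_2)$ is a partition of the vertices of the bipartite graph. To better divide the samples into two clusters while balancing the size of each cluster, the normalized cut is adopted to partition the bipartite graph; it can be written as:

$$\mathrm{Ncut}(V_1, V_2) = \frac{\mathrm{cut}(V_1, V_2)}{\mathrm{cut}(V_1, V_2) + \mathrm{within}(V_1)} + \frac{\mathrm{cut}(V_1, V_2)}{\mathrm{cut}(V_1, V_2) + \mathrm{within}(V_2)}$$

where $\mathrm{cut}(V_1, V_2) = \sum_{i \in V_1,\, j \in V_2} E_{ij}$ represents the accumulated edge weights between the clusters, and $\mathrm{within}(V_k) = \sum_{i \in V_k,\, j \in V_k} E_{ij}$, $k = 1, 2$, represents the accumulated edge weights within a cluster.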
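To make the normalized cut concrete, the following sketch evaluates it for one two-way partition of a small bipartite graph. It assumes the usual form of the criterion, in which the weight crossing the partition is divided, for each cluster, by that weight plus the weight accumulated inside the cluster. The function name `normalized_cut`, the toy weights and the vertex ordering (two atoms followed by three data nodes) are illustrative assumptions.

```python
import numpy as np

def normalized_cut(W, in_first):
    """Normalized cut of a two-way partition.

    W is a symmetric adjacency matrix; in_first is a boolean mask that is
    True for vertices of the first cluster.
    """
    in_first = np.asarray(in_first, dtype=bool)
    cut = W[np.ix_(in_first, ~in_first)].sum()        # weight between clusters
    within1 = W[np.ix_(in_first, in_first)].sum()     # weight inside cluster 1
    within2 = W[np.ix_(~in_first, ~in_first)].sum()   # weight inside cluster 2
    return cut / (cut + within1) + cut / (cut + within2)

# Toy bipartite graph: 2 atoms x 3 data nodes.
A = np.array([[2.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])
m, n = A.shape
W = np.block([[np.zeros((m, m)), A], [A.T, np.zeros((n, n))]])

# Cluster 1 = {atom 0, node 0, node 1}; cluster 2 = {atom 1, node 2}.
in_first = np.array([True, False, True, True, False])
ncut = normalized_cut(W, in_first)
print(ncut)  # 1/9 + 1/7
```

Only the edge of weight 1 between atom 1 and node 1 is cut, so the value is small; a partition that separates an atom from the nodes it mainly represents would score much higher.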

(2) Let q be the partition vector of the bipartite graph G, with $q_i = 1$ if vertex i belongs to the first cluster and $q_i = -1$ otherwise. The Rayleigh quotient of the vector q is equivalent to the segmentation optimization model in the previous step, specifically:

$$\min_{q} \frac{q^{T} L q}{q^{T} D q}$$

where L is the Laplacian matrix of the graph and D is the diagonal matrix of vertex degrees.

The above formula is equivalent to:

$$\min_{q} \frac{q^{T} L q}{q^{T} D q} \quad \text{s.t.} \quad q^{T} D e = 0$$

where all elements of the vector e are equal to 1.

(3) The discrete partition vector q in the segmentation optimization model of the previous step can be relaxed to a continuous vector z, specifically:

$$\min_{z \neq 0} \frac{z^{T} L z}{z^{T} D z} \quad \text{s.t.} \quad z^{T} D e = 0$$

Solving this problem corresponds to the generalized eigenvalue problem $L z = \lambda D z$, where z is the eigenvector.
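The relaxation can be illustrated numerically. The sketch below assumes the standard construction in which D is the diagonal matrix of weighted vertex degrees and the Laplacian is D minus the adjacency matrix, and it solves the generalized eigenvalue problem through the equivalent symmetric problem obtained by scaling with the inverse square root of D. The toy graph and all variable names are assumptions; the sign pattern of the eigenvector with the second-smallest eigenvalue gives a two-way partition.

```python
import numpy as np

# Toy bipartite graph: 2 atoms x 3 data nodes, two groups joined by a weak edge.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.01, 1.0]])
m, n = A.shape
W = np.block([[np.zeros((m, m)), A], [A.T, np.zeros((n, n))]])
deg = W.sum(axis=1)                       # diagonal of D
L = np.diag(deg) - W                      # graph Laplacian

# L z = lambda D z  <=>  (D^-1/2 L D^-1/2) y = lambda y  with  z = D^-1/2 y
d_isqrt = 1.0 / np.sqrt(deg)
L_sym = d_isqrt[:, None] * L * d_isqrt[None, :]
lam, Y = np.linalg.eigh(L_sym)            # eigenvalues in ascending order
z = d_isqrt * Y[:, 1]                     # eigenvector of the second-smallest eigenvalue

partition = z > 0                         # relaxed solution rounded by sign
print(lam[:2])
print(partition)                          # vertex order: atom 0, atom 1, node 0, node 1, node 2
```

The smallest eigenvalue is zero with a constant eigenvector, which carries no partition information; the constraint that z be D-orthogonal to the all-ones vector is what selects the next eigenvector.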

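Sixth, the adjacency matrix is normalized and its left and right singular vectors are computed, specifically:

(1) In the bipartite graph,

$$L = \begin{bmatrix} D_1 & -A \\ -A^{T} & D_2 \end{bmatrix}, \qquad D = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}$$

where $D_1(i,i) = \sum_j A_{ij}$ and $D_2(j,j) = \sum_i A_{ij}$ are diagonal matrices.

(2) Let $z = [z_1 \; z_2]^{T}$. The generalized eigenvalue problem $L z = \lambda D z$ is equivalent to

$$\begin{bmatrix} D_1 & -A \\ -A^{T} & D_2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \lambda \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$$

The above formula is equivalent to:

$$D_1 z_1 - A z_2 = \lambda D_1 z_1$$

$$-A^{T} z_1 + D_2 z_2 = \lambda D_2 z_2$$

Let $u = D_1^{1/2} z_1$ and $v = D_2^{1/2} z_2$. The above formula is then equivalent to:

$$D_1^{-1/2} A D_2^{-1/2} v = (1 - \lambda)\, u$$

$$D_2^{-1/2} A^{T} D_1^{-1/2} u = (1 - \lambda)\, v$$

Thus, the above equations are equivalent to the singular value decomposition of the normalized matrix $A_n = D_1^{-1/2} A D_2^{-1/2}$.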
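The equivalence with a singular value decomposition can be checked numerically. The sketch below forms the normalized matrix (A scaled on the left by the inverse square root of the row sums and on the right by the inverse square root of the column sums), takes its second singular triplet, and verifies that the rescaled singular vectors satisfy the two coupled equations with the eigenvalue equal to one minus the singular value. The toy matrix and variable names are assumptions, and every row and column of A is assumed to have a non-zero sum.

```python
import numpy as np

A = np.array([[2.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])
d1 = A.sum(axis=1)                          # D1(i,i) = sum_j A_ij
d2 = A.sum(axis=0)                          # D2(j,j) = sum_i A_ij
A_n = A / np.sqrt(np.outer(d1, d2))         # D1^-1/2 A D2^-1/2
U, s, Vt = np.linalg.svd(A_n, full_matrices=False)

# Second singular triplet -> second generalized eigenvector.
u, v, sigma = U[:, 1], Vt[1], s[1]
z1, z2 = u / np.sqrt(d1), v / np.sqrt(d2)
lam = 1.0 - sigma

res1 = d1 * z1 - A @ z2 - lam * d1 * z1      # D1 z1 - A z2 = lam D1 z1
res2 = -A.T @ z1 + d2 * z2 - lam * d2 * z2   # -A^T z1 + D2 z2 = lam D2 z2
print(s[0], np.abs(res1).max(), np.abs(res2).max())
```

The largest singular value of the normalized matrix is 1 and corresponds to the zero eigenvalue, which is why the informative vectors start from the second singular pair. Working with the m × n matrix instead of the (m+n) × (m+n) Laplacian also keeps the decomposition small when m is much smaller than n.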

Seventh, the left and right singular vectors are jointly clustered using the K-means algorithm to obtain the final cluster labels, i.e. the K-means algorithm is applied to the vector formed by stacking the left and right singular vectors.
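A compact end-to-end sketch of the sixth and seventh steps is given below. Because the exact stacked vector is not reproduced in the text, this sketch assumes the construction commonly used in spectral co-clustering: the singular vectors from the second onward, ceil(log2 k) of them, are rescaled by the inverse square roots of the row and column sums and stacked, atoms first and data nodes second. The small K-means routine with farthest-first initialization and the names `kmeans` and `co_cluster` are likewise illustrative assumptions, not the patented implementation.

```python
import numpy as np

def kmeans(Z, k, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-first initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def co_cluster(A, k):
    """Jointly cluster rows (atoms) and columns (data nodes) of a non-negative A."""
    d1, d2 = A.sum(axis=1), A.sum(axis=0)
    A_n = A / np.sqrt(np.outer(d1, d2))
    U, s, Vt = np.linalg.svd(A_n, full_matrices=False)
    r = max(1, int(np.ceil(np.log2(k))))              # assumed number of vectors
    Z = np.vstack([U[:, 1:r + 1] / np.sqrt(d1)[:, None],
                   Vt.T[:, 1:r + 1] / np.sqrt(d2)[:, None]])
    labels = kmeans(Z, k)
    return labels[:A.shape[0]], labels[A.shape[0]:]

# Toy |X|: 4 atoms x 6 data nodes with two blocks plus a weak background.
A = np.kron(np.eye(2), np.ones((2, 3))) + 0.01
atom_labels, node_labels = co_cluster(A, k=2)
print(atom_labels, node_labels)
```

Atoms and data nodes receive labels from the same K-means run, so each cluster contains both a group of data nodes and the dictionary atoms that mainly represent them.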

The method makes efficient use of the joint sparsity in the data and integrates the joint representation characteristics of the dictionary atoms and the coefficients, overcoming the limitation that traditional sparse clustering uses only the representation coefficients; it improves clustering accuracy and enhances robustness to noise. The method can be widely applied to the unsupervised classification of high-dimensional data in fields such as land resources, mineral surveying and precision agriculture.

The invention is further described in detail below with reference to examples of hyperspectral image clustering and the accompanying drawings.

Examples

(1) Simulation conditions

The simulation experiments use two real hyperspectral datasets: the Indian Pines dataset and the Pavia University dataset. The Indian Pines dataset is a hyperspectral remote sensing image acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Indian Pines test site in Indiana. The image contains 220 bands in total, with a spatial resolution of 20 m and an image size of 145 × 145. After removing 20 water-vapor absorption and low signal-to-noise-ratio bands (band numbers 104-…), the remaining 200 bands are used. The region contains 16 known land-cover classes, of which 8 are selected for the experiments to keep the classes balanced. The Pavia University dataset was acquired by the ROSIS sensor over Pavia; it contains 115 bands in total with an image size of 610 × 340, and after removing the noisy bands the remaining 103 bands are used. To limit the computational complexity, the invention uses a sub-image of size 200 × 100. All simulation experiments were run in MATLAB R2014a under the Windows 7 operating system.

The evaluation indexes used by the invention are clustering accuracy (ACC), Adjusted Rand Index (ARI), Adjusted Mutual Information (AMI), Normalized Mutual Information (NMI), homogeneity, completeness, V-measure (the harmonic mean of homogeneity and completeness) and the Fowlkes-Mallows Index (FMI).
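Because cluster labels are arbitrary, clustering accuracy is computed after the best one-to-one matching between clusters and ground-truth classes. The sketch below shows this for a small number of clusters by trying every permutation; the function name `clustering_accuracy` and the toy labels are assumptions, and the brute-force search is chosen only to keep the example dependency-free (for many clusters an assignment solver would be the usual choice).

```python
import numpy as np
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """Fraction of samples correctly labelled under the best cluster-to-class matching."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    k = max(len(classes), len(clusters))
    # C[i, j] = number of samples in cluster i whose true class is j
    C = np.zeros((k, k), dtype=int)
    for i, c in enumerate(clusters):
        for j, t in enumerate(classes):
            C[i, j] = np.sum((y_pred == c) & (y_true == t))
    best = max(sum(C[i, p[i]] for i in range(k)) for p in permutations(range(k)))
    return best / len(y_true)

y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [1, 1, 0, 0, 0, 2, 2, 2]   # same grouping up to relabelling, one error
acc = clustering_accuracy(y_true, y_pred)
print(acc)  # 0.875
```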

(2) Simulation content

The clustering performance of the algorithm is tested on real hyperspectral datasets. To evaluate its performance, the proposed high-dimensional data joint clustering method combining joint sparse representation and bipartite graph segmentation (BGP-JSDL) is compared with currently popular clustering algorithms. The compared methods are: K-means; CFSFDP; the high-dimensional data clustering method based on spectral dictionary learning and spectral clustering (SC-SDL); the high-dimensional data clustering method based on spatial dictionary learning and spectral clustering (SC-CDL); the high-dimensional data clustering method based on joint sparse representation and spectral clustering (SC-JSDL); the high-dimensional data joint clustering method based on spectral dictionary learning and bipartite graph segmentation (BGP-SDL); the high-dimensional data joint clustering method based on spatial dictionary learning and bipartite graph segmentation (BGP-CDL); and the high-dimensional data joint clustering method based on joint sparse representation and bipartite graph segmentation (BGP-JSDL).

(3) Analysis of simulation experiment results

Tables 1 and 2 compare the clustering accuracy and the other evaluation indexes of the different clustering algorithms on the two hyperspectral datasets.

TABLE 1 Quantitative evaluation of different clustering algorithms on the Indian Pines dataset (ACC, ARI, AMI, NMI, homogeneity, completeness, V-measure, FMI (%))

TABLE 2 Quantitative evaluation of different clustering algorithms on the Pavia University dataset (ACC, ARI, AMI, NMI, homogeneity, completeness, V-measure, FMI (%))

As can be seen from Table 1, on the Indian Pines dataset JSDL, by capturing the inherent local sparsity of the hyperspectral image and the discriminability of the dictionary, significantly improves clustering accuracy across the evaluation indexes compared with SDL and CDL. High-dimensional data clustering based on bipartite graph segmentation captures the correlation between the hyperspectral image pixels and the dictionary atoms and significantly improves clustering accuracy compared with the spectral-clustering-based methods. Table 2 shows that the same conclusions hold on the Pavia University dataset. The clustering result maps of the method on the two datasets are shown in FIG. 4 and FIG. 5. The simulation results on the two real datasets demonstrate the effectiveness of the method.

Claims

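1. A high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation, characterized by comprising the following steps:

step one, partitioning the high-dimensional dataset into non-overlapping data subsets using spatial proximity;

step two, constructing a structure dictionary and calculating the correlation between the structure dictionary and each data node, namely learning the joint sparse representation coefficients and the structure dictionary, using the context features of the high-dimensional data, in an optimization model with a joint sparse representation constraint;

step three, defining a bipartite graph for joint clustering of the high-dimensional data, namely defining an undirected bipartite graph comprising two disjoint vertex sets;

step four, constructing the adjacency matrix of the bipartite graph, namely mapping the joint sparse representation coefficients to a non-negative adjacency matrix of the graph;

step five, constructing the bipartite graph segmentation optimization model, namely building the optimization model from the adjacency matrix using the normalized cut criterion;

step six, normalizing the adjacency matrix and computing its left and right singular vectors, namely computing the normalized adjacency matrix and decomposing it by singular value decomposition to obtain the left and right singular vectors;

step seven, jointly clustering the left and right singular vectors using the K-means algorithm to obtain the final cluster labels.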

2. The high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation according to claim 1, wherein in the first step, partitioning the high-dimensional dataset into non-overlapping data subsets using spatial proximity means dividing the high-dimensional dataset into a number of non-overlapping w × w square spatial neighborhood subsets, where w ≥ 3.

3. The high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation according to claim 1, wherein the second step constructs a structure dictionary and calculates the correlation between the structure dictionary and each data node, and the specific process is as follows:

(1) for the i-th data node $y_i$, $1 \le i \le n$, where n is the number of samples, the spatial neighborhood subset containing $y_i$ is extracted from the high-dimensional data and denoted $\Gamma_i$:

$$\Gamma_i = [\,y_{i,1}, \ldots, y_{i,|\Gamma_i|}\,]$$

where $y_{i,|\Gamma_i|}$ denotes the $|\Gamma_i|$-th data node of the spatial neighborhood subset $\Gamma_i$;

(2) the structure dictionary learning model is defined in terms of the following quantities: $X = [x_1, \ldots, x_i, \ldots, x_n] \in \mathbb{R}^{m \times n}$ is the joint sparse representation coefficient matrix; $D = [d_1, \ldots, d_i, \ldots, d_m] \in \mathbb{R}^{d \times m}$ is the structure dictionary; the $\ell_2/\ell_1$ norm is the sum of the $\ell_2$ norms of the rows of a matrix; and a regularization parameter weights the sparsity term; the joint sparse representation coefficient matrix X provides the correlation between the high-dimensional data nodes and the dictionary atoms.

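4. The high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation according to claim 1, wherein the third step defines the bipartite graph for joint clustering of the high-dimensional data, and the specific process is as follows:

defining an undirected bipartite graph $G = (\mathcal{D}, \mathcal{Y}, E)$ consisting of two disjoint vertex sets, where $\mathcal{D} = \{d_1, \ldots, d_m\}$ is the set of dictionary atoms and $\mathcal{Y} = \{y_1, \ldots, y_n\}$ is the set of high-dimensional data nodes; the two sets are connected by the edge set E, which holds all edge weights $E_{ij}$, where $E_{ij}$ is the weight of the edge between the i-th vertex and the j-th vertex of the bipartite graph, and edges $E_{ij}$ exist only between the two heterogeneous vertex sets.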

5. The high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation according to claim 1, wherein the fourth step constructs the adjacency matrix of the bipartite graph by mapping the joint sparse representation coefficients to a non-negative adjacency matrix of the graph, specifically $W = \begin{bmatrix} 0 & A \\ A^{T} & 0 \end{bmatrix}$, where $A = |X|$.

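6. The high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation according to claim 1, wherein the fifth step constructs the bipartite graph segmentation optimization model, and the specific process is as follows:

(1) considering a division into two clusters, assume $(V_1, V_2)$ is a partition of the vertices of the bipartite graph; the normalized cut is adopted to partition the bipartite graph and can be written as:

$$\mathrm{Ncut}(V_1, V_2) = \frac{\mathrm{cut}(V_1, V_2)}{\mathrm{cut}(V_1, V_2) + \mathrm{within}(V_1)} + \frac{\mathrm{cut}(V_1, V_2)}{\mathrm{cut}(V_1, V_2) + \mathrm{within}(V_2)}$$

where $\mathrm{cut}(V_1, V_2) = \sum_{i \in V_1,\, j \in V_2} E_{ij}$ represents the accumulated edge weights between the clusters, and $\mathrm{within}(V_k) = \sum_{i \in V_k,\, j \in V_k} E_{ij}$, $k = 1, 2$, represents the accumulated edge weights within a cluster;

(2) let q be the partition vector of the bipartite graph G, with $q_i = 1$ if vertex i belongs to the first cluster and $q_i = -1$ otherwise; the Rayleigh quotient of the vector q is equivalent to the segmentation optimization model in step (1), specifically:

$$\min_{q} \frac{q^{T} L q}{q^{T} D q}$$

where L is the Laplacian matrix of the graph and D is the diagonal matrix of vertex degrees;

the above formula is equivalent to:

$$\min_{q} \frac{q^{T} L q}{q^{T} D q} \quad \text{s.t.} \quad q^{T} D e = 0$$

where all elements of the vector e are equal to 1;

(3) the discrete partition vector q in the segmentation optimization model is relaxed to a continuous vector z, specifically:

$$\min_{z \neq 0} \frac{z^{T} L z}{z^{T} D z} \quad \text{s.t.} \quad z^{T} D e = 0$$

solving this problem corresponds to the generalized eigenvalue problem $L z = \lambda D z$, where z is the eigenvector.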

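7. The high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation according to claim 1, wherein the sixth step normalizes the adjacency matrix and computes its left and right singular vectors, and the specific process is as follows:

(1) in the bipartite graph,

$$L = \begin{bmatrix} D_1 & -A \\ -A^{T} & D_2 \end{bmatrix}, \qquad D = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}$$

where $D_1(i,i) = \sum_j A_{ij}$ and $D_2(j,j) = \sum_i A_{ij}$ are diagonal matrices;

(2) let $z = [z_1 \; z_2]^{T}$; the generalized eigenvalue problem $L z = \lambda D z$ is equivalent to

$$\begin{bmatrix} D_1 & -A \\ -A^{T} & D_2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \lambda \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$$

the above formula is equivalent to:

$$D_1 z_1 - A z_2 = \lambda D_1 z_1$$

$$-A^{T} z_1 + D_2 z_2 = \lambda D_2 z_2$$

let $u = D_1^{1/2} z_1$ and $v = D_2^{1/2} z_2$; the above formula is then equivalent to:

$$D_1^{-1/2} A D_2^{-1/2} v = (1 - \lambda)\, u$$

$$D_2^{-1/2} A^{T} D_1^{-1/2} u = (1 - \lambda)\, v$$

thus, the above equations are equivalent to the singular value decomposition of the normalized matrix $A_n = D_1^{-1/2} A D_2^{-1/2}$.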

8. The high-dimensional data joint clustering method combining sparse representation and bipartite graph segmentation according to claim 7, wherein in the seventh step the K-means algorithm is used to cluster the vector formed by stacking the left and right singular vectors, obtaining the final cluster labels.