CN103399852A - Multi-channel spectrum clustering method based on local density estimation and neighbor relation spreading - Google Patents
Multi-channel spectrum clustering method based on local density estimation and neighbor relation spreading
- Publication number: CN103399852A
- Application number: CN2013102600621A
- Authority: CN (China)
- Prior art keywords: sample, matrix, similarity, neighbor relation
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a multi-channel spectral clustering method based on local density estimation and neighbor relation spreading, which mainly solves the problem that existing clustering methods cannot cluster data whose density is unevenly distributed. The method comprises the steps of: estimating the local density of each sample and appending it to the original data as an extra feature dimension; calculating a distance matrix, a threshold value and a similarity matrix, and initializing a neighbor relation matrix; updating the neighbor relation matrix and the similarity matrix, using the local maximum similar value to update the similarity of samples between subsets, and thereby obtaining an accurate affinity matrix; calculating the degree matrix and the normalized Laplacian matrix; and normalizing the spectrum matrix and obtaining the clustering result through the K-means algorithm. Compared with existing clustering technology, the method obtains a similarity matrix that better reflects the true similarity between samples, so the clustering result is more accurate and the robustness is better.
Description
Technical Field
The invention belongs to the technical field of cluster analysis, and relates to a construction method for improving an affinity matrix in spectral clustering. In particular to a multi-path spectral clustering method based on local density estimation and neighbor relation propagation, which can be used for systems such as data mining, image segmentation, machine learning and the like.
Background
Spectral clustering is built on spectral graph theory; in essence it converts the clustering problem into the optimal partition problem of a graph by means of spectral relaxation. First, for a given data set, an affinity matrix is defined to describe the similarity between data points; then the eigenvalues and eigenvectors of the normalized affinity matrix are calculated, and different data points are clustered by selecting appropriate eigenvectors. Compared with traditional clustering algorithms (such as K-means), this method can cluster data sets of arbitrary shape in the sample space and can converge to a globally optimal solution. Therefore, spectral clustering has been widely applied in fields such as image segmentation, computer vision, speech recognition and VLSI design.
In recent years, Shi and Malik established a normalized cut (Ncut) objective function based on 2-way partitioning according to spectral graph theory and designed a spectral clustering algorithm for image partitioning; Ng et al. later developed it into the NJW algorithm for k-way partitioning. These classical algorithms compute the similarity matrix W with a Gaussian function of the Euclidean distance, which makes it difficult to reflect the real similarity relations between the samples of a data set; in particular, for data sets with complex distribution structures and arbitrary shapes, the similarity matrix constructed in this way fails.
In order to obtain similarities that truly reflect the data, a number of improved methods have appeared, such as path-based similarity measures, new ways of defining the affinity graph based on manifold ranking, and similarity functions based on a density-sensitive distance measure. In 2012, Li et al. proposed an affinity matrix construction method based on neighbor relation propagation (SC-NP for short), which initializes the neighbor relations of samples according to a distance threshold ε and then, following the neighbor relation propagation principle, divides samples with high similarity into the same subset. Although this method can amplify the similarity of samples within the same subset, it measures the similarity of samples between subsets with the global minimum of the similarity matrix W, which reduces the similarity between different subsets belonging to the same class. Moreover, for data with uneven density distribution, samples of different classes are easily divided into the same subset, so the constructed affinity matrix cannot truly reflect the similarity between samples and the clustering result is inaccurate.
Disclosure of Invention
The invention aims to overcome the problems in the background art and provides a multi-path spectral clustering method based on local density estimation and neighbor relation propagation; by constructing a similarity matrix that truly reflects the similarity between samples, the clustering result becomes more accurate and stable.
The key technology for realizing the invention is a multi-path spectral clustering method based on local density estimation and neighbor relation propagation, whose concrete implementation steps comprise:
(1) Input the data set X = {x1, x2, ..., xn} ⊂ R^d, where xn represents the n-th sample in the data set, n is the number of samples, and d represents the dimension of a sample.
(2) Estimating local density of a sample
(2a) Find the K nearest neighbor samples of each sample x and form the set N(x) = {y1, y2, ..., yK} ⊂ R^d, where yK denotes the K-th nearest neighbor sample of x;
(2b) Calculate the distance set of sample x, D(x) = {d(yi, nearest(yi)) | i = 1, ..., K}, where nearest(yi) denotes the nearest neighbor sample of yi and d(yi, yj) denotes the Euclidean distance between yi and yj:

d(yi, yj) = ( Σ_{l=1}^{d} (yil - yjl)² )^(1/2)

where yil and yjl are the l-th dimension attribute values of the i-th and j-th samples, respectively;
(2c) Define the sample set near(x). First judge whether the elements of D(x) can be divided into two classes; if so, denote the class containing more elements as y'1, y'2, ..., y'm, where m is its number of samples. These m samples are considered to have a large influence on the density estimate of x, and near(x) = {y'i | i = 1, ..., m}; otherwise, the K samples as a whole are considered to have a large influence on the density estimate of x, and near(x) = N(x);
(2d) Calculate the discrimination density estimate of sample x with a Gaussian window over near(x), where near(x) is obtained according to step (2c), d(x, xi) is the Euclidean distance between samples x and xi, and σ² is the window width;
(2e) Calculate the individual density estimate of sample x from its 3rd nearest neighbor, where y3 is the 3rd nearest neighbor sample of x;
(2f) Compute the local density estimate of sample x, defined as the sum of the discrimination density estimate and the individual density estimate.
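A minimal numpy sketch of the density estimate of steps (2a)-(2f). It assumes the near(x) = N(x) branch of step (2c) for every sample and folds the discrimination and individual terms into a single Gaussian-window average over the K nearest neighbors, so `local_density`, `K` and `sigma` are illustrative simplifications, not the patent's exact formulas:

```python
import numpy as np

def local_density(X, K=7, sigma=1.0):
    """Simplified local density of steps (2a)-(2f): a Gaussian-window
    average over the K nearest neighbors of each sample (the patent's
    split of D(x) into two groups and its separate individual density
    term are folded into this single average)."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # pairwise Euclidean distances
    dens = np.empty(n)
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:K + 1]      # K nearest neighbors, self excluded
        dens[i] = np.exp(-dist[i, nbrs] ** 2 / (2 * sigma ** 2)).mean()
    return dens
```

For density-uneven data this gives tight-cluster samples larger values than sparse-cluster samples, which is what the (d+1)-th feature dimension of step (3) relies on.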
(3) Raise the dimension of the samples: take the local density of each sample in the set X as its (d+1)-th dimension, obtaining a new data set X* = {x*1, x*2, ..., x*n} ⊂ R^(d+1).
(4) Calculate the Euclidean distance between the samples of X* to obtain the distance matrix B = [bij]n×n.
(5) Calculate the similarity between the samples of X* to obtain the similarity matrix W = [wij]n×n, with wij = exp(-bij²/(2σ²)), where wij denotes the similarity between the i-th sample xi and the j-th sample xj.
(6) Calculate a threshold value ε from the distance matrix B, determine the neighbor relations between the samples corresponding to the elements of B, and obtain the initial neighbor relation matrix T.
(7) Update the neighbor relation matrix T and the similarity matrix W according to the neighbor relation propagation principle to obtain the affinity matrix A.
(8) Construct the degree matrix D and the Laplacian matrix Lsym = D^(-1/2) A D^(-1/2), where D is a diagonal matrix whose diagonal element dii = Σj aij denotes the degree of the i-th sample xi.
(9) Calculate the eigenvectors corresponding to the first k largest eigenvalues of Lsym, construct the matrix V, and normalize its rows to unit length to obtain the matrix Y = [yij]n×k, where yij = vij / (Σl vil²)^(1/2).
(10) Take each row of Y as a sample point in k-dimensional space and group the sample points into k classes with the K-means algorithm. If and only if the i-th row of Y is assigned to the j-th class, the sample point xi in the original data set is assigned to class j.
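Steps (8)-(10) can be sketched as follows, assuming an affinity matrix A has already been built; the K-means routine here is a bare-bones stand-in (farthest-point initialization, fixed iteration count), since the patent assumes a standard library implementation:

```python
import numpy as np

def spectral_partition(A, k, n_iter=50):
    """Steps (8)-(10) in minimal form: normalized Laplacian, top-k
    eigenvectors, row normalization, then a bare-bones K-means."""
    d = A.sum(axis=1)
    d_is = 1.0 / np.sqrt(d)
    Lsym = A * d_is[:, None] * d_is[None, :]          # D^{-1/2} A D^{-1/2}
    _, vecs = np.linalg.eigh(Lsym)                    # eigenvalues in ascending order
    V = vecs[:, -k:]                                  # eigenvectors of the k largest
    Y = V / np.linalg.norm(V, axis=1, keepdims=True)  # unit-length rows
    # bare-bones K-means with farthest-point initialization
    centers = [Y[0]]
    for _ in range(1, k):
        d2 = np.min(np.stack([((Y - c) ** 2).sum(-1) for c in centers]), axis=0)
        centers.append(Y[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((Y[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            pts = Y[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return labels
```

On a block-structured affinity matrix the rows of Y collapse to k nearly identical points per class, so even this crude K-means recovers the partition.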
The invention simultaneously considers the spatial-consistency constraint information and the density information of the data. Spatial consistency ensures that neighboring samples have a higher probability of belonging to the same class. Even when the data samples have arbitrary shapes and distribution structures, the element values of the affinity matrix can reflect the real similarity relations between the data, so the clustering result is more accurate and stable and the clustering effectiveness of the multi-path spectral clustering method is improved.
The invention has the following advantages:
(1) the method has the advantages that the samples of different classes are prevented from entering the same subset, so that the updated affinity matrix can truly reflect the similarity among the samples;
(2) accurate and stable clustering results can be obtained for data sets with uniform and non-uniform density distribution;
(3) compared with the prior art, the method has better clustering performance and robustness.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is the Data1 data set with a non-uniform density distribution, containing 2 classes of samples;
FIG. 3 is the Data2 data set with a non-uniform density distribution, containing 3 classes of samples;
FIG. 4 is a distribution diagram of the subsets of the Data1 data set obtained by the SC-NP method;
FIG. 5 is a distribution diagram of the subsets of the Data2 data set obtained by the SC-NP method;
FIG. 6 is a diagram of the clustering results of the Data1 data set obtained by the SC-NP method;
FIG. 7 is a diagram of the clustering results of the Data2 data set obtained by the SC-NP method;
FIG. 8 is a distribution diagram of the subsets of the Data1 data set obtained by the present invention;
FIG. 9 is a distribution diagram of the subsets of the Data2 data set obtained by the present invention;
FIG. 10 is a diagram of the clustering results of the Data1 data set obtained by the present invention;
FIG. 11 is a diagram of the clustering results of the Data2 data set obtained by the present invention;
FIG. 12 is a diagram of the clustering results of the Data1 data set obtained by the K-means method;
FIG. 13 is a diagram of the clustering results of the Data2 data set obtained by the K-means method;
FIG. 14 is a diagram of the clustering results of the Data1 data set obtained by the Ncut method;
FIG. 15 is a diagram of the clustering results of the Data2 data set obtained by the Ncut method;
FIG. 16 is a diagram of the clustering results of the Data1 data set obtained by the NJW method;
FIG. 17 is a diagram of the clustering results of the Data2 data set obtained by the NJW method;
FIG. 18 is a graph of the eigenvalues of the Data1 data set obtained by the present invention for different parameters σ;
FIG. 19 is a graph of the eigenvalues of the Data2 data set obtained by the present invention for different parameters σ;
FIG. 20 is the Data3 data set with a relatively uniform density distribution, containing 2 classes of samples;
FIG. 21 is the Data4 data set with a relatively uniform density distribution, containing 3 classes of samples;
FIG. 22 is a diagram of the clustering results of the Data3 data set obtained by the present invention;
FIG. 23 is a diagram of the clustering results of the Data4 data set obtained by the present invention.
Detailed Description
Introduction of basic theory
1. Spectral graph theory
Assuming that each data sample is regarded as a vertex of a graph, that the edges between vertices are weighted according to the similarity between samples, and that an undirected weighted graph G = (V, E) based on sample similarity is thereby constructed, the clustering problem can be converted into a graph partitioning problem.
Graph partitioning principle: the weights within sub-graphs are maximized and the weights between sub-graphs are minimized. The cost of dividing graph G into two sub-graphs V1 and V2 can be expressed as:

cut(V1, V2) = Σ_{u∈V1, v∈V2} w_uv

where V1 ∪ V2 = V and w_uv is the similarity of samples u and v. The normalized cut objective function based on a 2-way partition is then:

Ncut(V1, V2) = cut(V1, V2)/assoc(V1, V) + cut(V1, V2)/assoc(V2, V)

where assoc(Vi, V) = Σ_{u∈Vi, t∈V} w_ut. Minimizing the Ncut function is referred to as the normalized cut-set criterion; it simultaneously measures the similarity between samples within a class and the dissimilarity between samples of different classes.
If the graph is divided into several sub-graphs, the normalized cut objective function based on a k-way partition is:

Ncut(V1, V2, ..., Vk) = Σ_{i=1}^{k} cut(Vi, V \ Vi)/assoc(Vi, V)
the spectral clustering algorithm is to derive new characteristics of a clustering object through a matrix analysis theory, and cluster original data by using the new characteristics, wherein the theoretical basis of the spectral clustering algorithm is a Laplancian matrix which generally has three forms: in some embodiments, the compound is a compound of formula i (i) or (ii) and (iii) or (iii)uv]n×nD is a diagonal matrix,② the symmetrical form of the specification is represented as Lsym=D-1/2LD-1/2=I-D-1/2WD-1/2(ii) a ③ the normalized random walk form is expressed as Lrw=D-1L=I-D-1W。
When a 2-way partition is employed, let p = [pi], i ∈ V, be the indicator vector of V1, with pi = 1 if vertex i belongs to V1 and pi = -1 otherwise; the Ncut objective can then be written as a discrete function of an indicator-based vector x, denoted equation (5). Considering the constraint x^T W e = x^T D e = 0 and relaxing x to the continuous domain [-1, 1], equation (5) becomes the Rayleigh quotient problem

min_x (x^T (D - W) x) / (x^T D x), subject to x^T D e = 0, (6)

whose solutions are given by the generalized eigenvalue problem

(D - W) x = λ D x. (7)

According to the Rayleigh quotient principle, the optimal solution of equation (6) is the eigenvector x2 corresponding to the second smallest eigenvalue λ2 of equation (7). Since x2 contains the partition information of the graph, its elements can be divided into 2 classes according to a heuristic rule.
Similarly, the optimal solution of the k-way normalized cut objective can be derived: it is given by the eigenvectors corresponding to the k smallest eigenvalues of equation (7). The k-way clustering methods, such as the NJW algorithm, have been shown to achieve a better clustering effect than 2-way methods.
2. Density estimation
Let the data set X = {x1, x2, ..., xn} ⊂ R^d be independent and identically distributed samples of a random variable; the Parzen window density estimate of that variable is:

p̂(x) = (1/n) Σ_{i=1}^{n} K_σ(x - xi)    (8)

where K_σ(·) is a kernel function, σ² is the window width, and n is the number of samples. The kernel function is typically symmetric and bounded, i.e. K(-u) = K(u), |K(u)| < ∞, and ∫ K(u) du = 1.
Commonly used kernels include the Gaussian, Epanechnikov, Biweight and square-wave kernel functions. In the present invention the local density of the sample x is estimated with the Gaussian kernel, i.e.

p̂(x) = (1/n) Σ_{i=1}^{n} (2πσ²)^(-d/2) exp(-||x - xi||²/(2σ²))    (9)
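The Gaussian-kernel Parzen estimate is a direct transcription of the formula above; a short sketch, not patent-specific code:

```python
import numpy as np

def parzen_gaussian(x, X, sigma):
    """Parzen window estimate at point x: the mean over the samples of
    (2*pi*sigma^2)^(-d/2) * exp(-||x - x_i||^2 / (2*sigma^2))."""
    n, d = X.shape
    sq = ((X - x) ** 2).sum(axis=1)
    return ((2 * np.pi * sigma ** 2) ** (-d / 2) * np.exp(-sq / (2 * sigma ** 2))).mean()
```

The estimate is largest near concentrations of samples and can never exceed the kernel's peak value, which makes it a natural local density feature.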
Second, the multi-channel spectral clustering method of the invention based on local density estimation and neighbor relation propagation
Referring to fig. 1, the implementation of the present invention includes the following steps:
(1.1) Search for the K nearest neighbors of sample x, denoted by the set N(x) = {y1, y2, ..., yK} ⊂ R^d; in the invention K is taken to be 7. The distance set of sample x is calculated according to equation (10):

D(x) = {d(yi, nearest(yi)) | i = 1, ..., K}    (10)

where nearest(yi) denotes the nearest neighbor sample of yi and d(yi, yj) denotes the Euclidean distance between yi and yj.
(1.2) defining near (x) according to the criterion:
(1.2.1) If the elements of D(x) can be divided into two classes, let m be the number of samples corresponding to the class with more elements; these m samples have a larger influence on the density estimate of x. Denote the m samples as x'1, x'2, ..., x'm and define near(x) = {x'i | i = 1, ..., m};
(1.2.2) If the elements of D(x) cannot be divided into two classes, the K samples are considered to have a large influence on the density estimate, and near(x) = N(x) is defined.
The local density is extended as the (d+1)-th dimension of sample x, forming a new data set X* = {x*1, x*2, ..., x*n} ⊂ R^(d+1).
(3.1) Calculate the Euclidean distance bij between any two samples xi and xj, and construct the distance matrix B = [bij]n×n;
(3.2) Calculate the similarity wij between any two samples xi and xj, and construct the similarity matrix W = [wij]n×n, where wij = exp(-bij²/(2σ²));
(3.3) Determine the neighbor relations of the samples corresponding to the elements of the distance matrix B according to the threshold ε. Define the neighbor relation matrix T = [tij]n×n; at initialization all elements of T are set to 0, and the distance threshold ε is calculated from B. In the distance matrix B, if bij ≤ ε, set tij = 1 and tji = tij, indicating that samples xi and xj have a neighbor relation, written (xi, xj) ∈ R, where R denotes the neighbor relation.
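A minimal sketch of this initialization. The patent computes the threshold ε from the distance matrix B, but its exact formula is not reproduced in the text, so ε is passed as a parameter here, with the mean off-diagonal distance shown as one plausible, assumed choice:

```python
import numpy as np

def mean_offdiag(B):
    """One plausible threshold (an assumption, not the patent's
    formula): the mean off-diagonal distance of B."""
    n = B.shape[0]
    return (B.sum() - np.trace(B)) / (n * (n - 1))

def init_neighbor_matrix(B, eps):
    """Step (3.3): t_ij = 1 iff b_ij <= eps for i != j."""
    T = (B <= eps).astype(int)
    np.fill_diagonal(T, 0)       # a sample is not its own neighbor here
    return T
```

The resulting T is symmetric whenever B is, matching the tji = tij rule of the text.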
(3.4) According to the transfer (propagation) principle of the neighbor relation, update the neighbor relation matrix T and the similarity matrix W respectively.
(3.4.2) Define subsets: after propagation of the neighbor relation, each sample has a corresponding neighbor relation domain; neighbor domains containing exactly 1 sample are defined as single-sample sets, and neighbor domains containing more than 1 sample are defined as multi-sample sets. Single-sample sets and multi-sample sets are collectively referred to as subsets. If c mutually disjoint subsets C1, C2, ..., Cc are obtained by the neighbor relation propagation algorithm, each subset must satisfy 1 ≤ |Ci| < n, where 1 ≤ i ≤ c;
(3.4.3) The matrix T and the similarity of samples within a subset are updated as follows: if tij = 1, tjk = 1 and tik = 0 in T, update tik and tki to 1, and at the same time update the elements wik and wki of the similarity matrix W to min(wij, wjk);
(3.4.4) The similarity of samples between subsets is updated as follows: let the subsets C1, C2, ..., Cc contain m1, m2, ..., mc samples respectively, and define W(Ci, Cj) as the matrix formed by the similarities between the samples of Ci and Cj. The maximum of the sample similarities between the two subsets is selected as the local maximum similar value, expressed as MaxSim = max(W(Ci, Cj)), and the similarity of the samples between the subsets is updated to this local maximum.
(3.5) The updated similarity matrix W is taken as the affinity matrix A.
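Steps (3.4.2)-(3.4.4) can be sketched as follows. Subsets are taken as the connected components of the neighbor graph T; the intra-subset min-rule of step (3.4.3) is approximated by raising every within-component pair to at least the minimum direct-edge similarity of that component, and the inter-subset similarities are set to the local maximum MaxSim. This is a simplified reading of the update order, not the patent's exact procedure:

```python
import numpy as np
from collections import deque

def propagate(T, W):
    """Simplified neighbor-relation propagation: find the connected
    components of T (the subsets), amplify within-component
    similarities, and replace between-component similarities by the
    local maximum MaxSim = max(W(Ci, Cj))."""
    n = len(T)
    comp = -np.ones(n, dtype=int)
    c = 0
    for s in range(n):                           # connected components via BFS
        if comp[s] >= 0:
            continue
        comp[s] = c
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if T[u, v] and comp[v] < 0:
                    comp[v] = c
                    q.append(v)
        c += 1
    A = W.astype(float).copy()
    comps = [np.where(comp == a)[0] for a in range(c)]
    for a, ia in enumerate(comps):
        mask = T[np.ix_(ia, ia)].astype(bool)
        if mask.any():                           # intra-subset: raise closure pairs to
            m = W[np.ix_(ia, ia)][mask].min()    # the minimum direct-edge similarity
            A[np.ix_(ia, ia)] = np.maximum(A[np.ix_(ia, ia)], m)
        for ib in comps[a + 1:]:                 # inter-subset: local maximum MaxSim
            mx = W[np.ix_(ia, ib)].max()
            A[np.ix_(ia, ib)] = mx
            A[np.ix_(ib, ia)] = mx
    return A, comp
```

On a chain 0-1-2 with an isolated node 3, the closure pair (0, 2) is raised to min(w01, w12) and every similarity to node 3 becomes the local maximum of the original cross-similarities, mirroring the min-rule and MaxSim-rule of the text.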
D is a diagonal matrix whose diagonal element dii = Σj aij represents the degree of sample xi; the Laplacian matrix can then be represented as Lsym = D^(-1/2) A D^(-1/2).
(5.1) Let 1 = λ1 ≥ λ2 ≥ ... ≥ λk be the first k largest eigenvalues of Lsym and v1, v2, ..., vk the corresponding eigenvectors; construct the matrix V = [v1, v2, ..., vk] ∈ R^(n×k), where each vi is a column vector;
(5.2) Normalize each row of V to unit length to obtain the matrix Y, whose elements are yij = vij / (Σl vil²)^(1/2).
Take each row of Y as a sample in k-dimensional space and divide the samples into k classes with the K-means algorithm. If and only if the i-th row of Y is assigned to the j-th class, the sample xi in the original data set is classified into the j-th class.
The effects of the present invention can be further illustrated by the following experimental simulations.
1. Simulation conditions
Four artificial data sets are selected for the simulation experiments, shown in FIGS. 2, 3, 20 and 21 respectively; Data1 and Data2 in FIGS. 2 and 3 are data sets with uneven density distribution, while Data3 and Data4 in FIGS. 20 and 21 are data sets with even density distribution. The simulation software is MATLAB 7.1.
2. Simulation result
FIGS. 4 and 5 are the sample subset distribution diagrams obtained by the SC-NP method. It can be seen that some samples of different classes are divided into the same subset (e.g., the marked samples in FIG. 4 and FIG. 5), resulting in an inaccurate similarity matrix and thus erroneous clustering results, as shown in FIGS. 6 and 7.
FIGS. 8 and 9 are the sample subset distribution diagrams obtained by the method of the present invention. It can be seen that samples of different classes are divided into different subsets, so a more accurate similarity matrix is obtained and the clustering result is more accurate, as shown in FIGS. 10 and 11.
FIG. 12 and FIG. 13 are graphs of clustering results obtained by the K-means method; fig. 14 and 15 are graphs of clustering results obtained by the Ncut method; fig. 16 and 17 are graphs of clustering results obtained by the NJW method. As can be seen from the experimental result graph, the method of the invention is obviously superior to other clustering methods.
FIGS. 18 and 19 are the eigenvalue curves obtained by the method of the present invention for different parameters σ. The curve shapes corresponding to different values of σ are similar and the eigenvalues vary little between curves; within one curve, however, there are obvious abrupt-change points, such as eigenvalue number 3 in FIG. 18 and eigenvalue number 4 in FIG. 19. Accordingly, selecting the eigenvectors corresponding to the first k = 2 largest eigenvalues in FIG. 18 and the first k = 3 largest eigenvalues in FIG. 19 yields accurate clustering results, which shows that the method of the invention has good robustness to the parameter σ.
Fig. 22 and 23 are graphs of clustering results obtained by the method of the present invention, and it is obvious that the method of the present invention can also obtain accurate clustering results for data sets with uniformly distributed density.
Claims (3)
1. A multi-path spectral clustering method based on local density estimation and neighbor relation propagation comprises the following steps:
(1) Input the data set X = {x1, x2, ..., xn} ⊂ R^d, where xn represents the n-th sample in the data set, n is the number of samples, and d is the dimension of the samples;
(2) estimating the local density of the sample:
(2a) Find the K nearest neighbor samples of sample x and form the set N(x) = {y1, y2, ..., yK} ⊂ R^d, where yK denotes the K-th nearest neighbor sample of x;
(2b) Calculate the distance set of x: D(x) = {d(yi, nearest(yi)) | i = 1, ..., K}, where nearest(yi) denotes the nearest neighbor sample of yi and d(yi, yj) denotes the Euclidean distance between yi and yj:

d(yi, yj) = ( Σ_{l=1}^{d} (yil - yjl)² )^(1/2)

where yil and yjl represent the l-th dimension attribute values of the i-th and j-th samples, respectively;
(2c) Define the sample set near(x): first judge whether the elements of D(x) can be divided into two classes; if so, denote the class containing more elements as y'1, y'2, ..., y'm, where m is its number of samples; these m samples are considered to have a large influence on the density estimate of x, and near(x) = {y'i | i = 1, ..., m}; otherwise, the K samples as a whole are considered to have a large influence on the density estimate of x, and near(x) = N(x);
(2d) Calculate the discrimination density estimate of x over near(x), where near(x) is obtained according to step (2c), d(x, xi) is the Euclidean distance between samples x and xi, and σ² is the window width;
(2e) Calculate the individual density estimate of x, where y3 is the 3rd nearest neighbor sample of x;
(2f) Calculate the local density estimate of x, defined as the sum of the discrimination density estimate and the individual density estimate;
(3) Raise the sample dimension: extend the local density as the (d+1)-th dimension of sample x, forming a new data set X* = {x*1, x*2, ..., x*n} ⊂ R^(d+1);
(4) Calculate the Euclidean distance between the samples of X* to obtain the distance matrix B = [bij]n×n;
(5) Calculate the similarity between the samples of X* to obtain the similarity matrix W = [wij]n×n, wij = exp(-bij²/(2σ²)), where wij represents the similarity between the i-th sample and the j-th sample;
(6) Calculate a threshold value ε from the distance matrix B, determine the neighbor relations between the samples corresponding to the elements of B, and obtain the initial neighbor relation matrix T;
(7) respectively updating T and W according to a neighbor relation propagation principle to obtain an affinity matrix A;
(8) Construct the degree matrix D and the Laplacian matrix Lsym, where D is a diagonal matrix whose diagonal element dii = Σj aij denotes the degree of the i-th sample xi, and Lsym = D^(-1/2) A D^(-1/2);
(9) Calculate the eigenvectors corresponding to the first k largest eigenvalues of Lsym, construct the matrix V, and normalize its rows to unit length to obtain the matrix Y = [yij]n×k, where yij = vij / (Σl vil²)^(1/2);
(10) Take each row of Y as a sample point in k-dimensional space and cluster the sample points into k classes with the K-means algorithm; if and only if the i-th row of Y is assigned to the j-th class, the sample xi is assigned to class j.
2. The method of claim 1, wherein step (6) is performed as follows:
Define the neighbor relation matrix T = [tij]n×n; at initialization all elements of T are set to 0, and the distance threshold ε is calculated from B. In the distance matrix B, if bij ≤ ε, set tij = 1 and tji = tij, indicating that samples xi and xj have a neighbor relation, written (xi, xj) ∈ R, where R denotes the neighbor relation.
3. The method of claim 1, wherein step (7) is performed as follows:
(3.2) Define subsets: after propagation of the neighbor relation, each sample has a corresponding neighbor relation domain; neighbor domains containing exactly 1 sample are defined as single-sample sets, neighbor domains containing more than 1 sample are defined as multi-sample sets, and single-sample sets and multi-sample sets are collectively called subsets; if c mutually disjoint subsets C1, C2, ..., Cc are obtained by the neighbor relation propagation algorithm, each subset must satisfy 1 ≤ |Ci| < n, where 1 ≤ i ≤ c;
(3.3) Update the matrix T and the similarity of samples within a subset respectively: if tij = 1, tjk = 1 and tik = 0 in T, update tik and tki to 1, and at the same time update the elements wik and wki of the similarity matrix W to min(wij, wjk);
(3.4) Update the similarity of samples between subsets: let the subsets C1, C2, ..., Cc contain m1, m2, ..., mc samples respectively, and define W(Ci, Cj) as the matrix formed by the similarities between the samples of Ci and Cj; the maximum of the sample similarities between the two subsets is selected as the local maximum similar value, expressed as MaxSim = max(W(Ci, Cj)), and the similarity of the samples between the subsets is updated to this local maximum;
(3.5) The updated similarity matrix W is taken as the affinity matrix A.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013102600621A CN103399852A (en) | 2013-06-27 | 2013-06-27 | Multi-channel spectrum clustering method based on local density estimation and neighbor relation spreading |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103399852A true CN103399852A (en) | 2013-11-20 |
Family
ID=49563482
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103995821A (en) * | 2014-03-14 | 2014-08-20 | 盐城工学院 | Selective clustering integration method based on spectral clustering algorithm |
CN104517123A (en) * | 2014-12-24 | 2015-04-15 | 西安理工大学 | Sub-spatial clustering method guided by local motion feature similarity |
CN105912726A (en) * | 2016-05-13 | 2016-08-31 | 北京邮电大学 | Density centrality based sampling and detecting methods of unusual transaction data of virtual assets |
CN107480685A (en) * | 2016-06-08 | 2017-12-15 | 国家计算机网络与信息安全管理中心 | A kind of distributed power iteration clustering method and device based on GraphX |
CN108288076A (en) * | 2018-02-12 | 2018-07-17 | 深圳开思时代科技有限公司 | Auto parts machinery clustering method, device, electronic equipment and storage medium |
CN108322320A (en) * | 2017-01-18 | 2018-07-24 | 华为技术有限公司 | Business survival stress method and device |
CN108616457A (en) * | 2018-03-16 | 2018-10-02 | 广东电网有限责任公司茂名供电局 | A kind of method that adapted telecommunication access network service influenza is known |
CN109916627A (en) * | 2019-03-27 | 2019-06-21 | 西南石油大学 | Bearing fault detection and diagnosis based on Active Learning |
CN112258014A (en) * | 2020-10-17 | 2021-01-22 | 中国石油化工股份有限公司 | Clustering and grouping-based risk discrimination analysis method for heat exchangers |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102110173A (en) * | 2011-04-08 | 2011-06-29 | 华北电力大学(保定) | Improved multi-path spectral clustering method for affinity matrix |
2013
- 2013-06-27 CN CN2013102600621A patent/CN103399852A/en active Pending
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102110173A (en) * | 2011-04-08 | 2011-06-29 | 华北电力大学(保定) | Improved multi-path spectral clustering method for affinity matrix |
Non-Patent Citations (2)
Title |
---|
ZHOU LIN ET AL.: "Clustering Ensemble Algorithm Based on Spectral Clustering" (in Chinese), ACTA AUTOMATICA SINICA (《自动化学报》), vol. 38, no. 8, 31 August 2012 (2012-08-31) * |
LI XINYE ET AL.: "Improvement of the Multi-way Spectral Clustering Algorithm for Complex Structures" (in Chinese), JOURNAL OF BEIJING UNIVERSITY OF TECHNOLOGY (《北京工业大学学报》), vol. 39, no. 3, 31 March 2013 (2013-03-31) * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103995821B (en) * | 2014-03-14 | 2017-05-10 | 盐城工学院 | Selective clustering integration method based on spectral clustering algorithm |
CN103995821A (en) * | 2014-03-14 | 2014-08-20 | 盐城工学院 | Selective clustering integration method based on spectral clustering algorithm |
CN104517123A (en) * | 2014-12-24 | 2015-04-15 | 西安理工大学 | Sub-spatial clustering method guided by local motion feature similarity |
CN104517123B (en) * | 2014-12-24 | 2017-12-29 | 西安理工大学 | Subspace clustering method guided by local motion feature similarity |
CN105912726A (en) * | 2016-05-13 | 2016-08-31 | 北京邮电大学 | Density-centrality-based sampling and detection methods for abnormal virtual-asset transaction data |
CN107480685B (en) * | 2016-06-08 | 2021-02-23 | 国家计算机网络与信息安全管理中心 | GraphX-based distributed power iterative clustering method and device |
CN107480685A (en) * | 2016-06-08 | 2017-12-15 | 国家计算机网络与信息安全管理中心 | Distributed power iteration clustering method and device based on GraphX |
CN108322320A (en) * | 2017-01-18 | 2018-07-24 | 华为技术有限公司 | Service survivability analysis method and apparatus |
US11108619B2 (en) | 2017-01-18 | 2021-08-31 | Huawei Technologies Co., Ltd. | Service survivability analysis method and apparatus |
CN108288076A (en) * | 2018-02-12 | 2018-07-17 | 深圳开思时代科技有限公司 | Auto parts clustering method, device, electronic device and storage medium |
CN108616457A (en) * | 2018-03-16 | 2018-10-02 | 广东电网有限责任公司茂名供电局 | Service awareness method for power distribution and utilization communication access networks |
CN109916627A (en) * | 2019-03-27 | 2019-06-21 | 西南石油大学 | Bearing fault detection and diagnosis based on Active Learning |
CN112258014A (en) * | 2020-10-17 | 2021-01-22 | 中国石油化工股份有限公司 | Clustering and grouping-based risk discrimination analysis method for heat exchangers |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103399852A (en) | Multi-way spectral clustering method based on local density estimation and neighbor relation propagation | |
Hepp et al. | Approaches to regularized regression–a comparison between gradient boosting and the lasso | |
Dong et al. | Laplacian matrix learning for smooth graph signal representation | |
Alshammari et al. | Refining a k-nearest neighbor graph for a computationally efficient spectral clustering | |
CN110827921B (en) | Single cell clustering method and device, electronic equipment and storage medium | |
CN104732545B (en) | Texture image segmentation method combining sparse neighbor propagation and fast spectral clustering | |
Cho et al. | Authority-shift clustering: Hierarchical clustering by authority seeking on graphs | |
CN104298999B (en) | Hyperspectral feature learning method based on recursive autoencoding | |
Ramathilagam et al. | Extended Gaussian kernel version of fuzzy c-means in the problem of data analyzing | |
CN103985112B (en) | Image segmentation method based on improved multi-objective particle swarm optimization and clustering | |
CN103064941A (en) | Image retrieval method and device | |
CN104778480A (en) | Hierarchical spectral clustering method based on local density and geodesic distance | |
CN102110173A (en) | Improved multi-path spectral clustering method for affinity matrix | |
Emms et al. | Graph embedding using quantum commute times | |
Wang et al. | MCMC methods for Gaussian process models using fast approximations for the likelihood | |
Bazargan et al. | Bayesian model selection for complex geological structures using polynomial chaos proxy | |
CN103745232A (en) | Band migration-based hyperspectral image clustering method | |
Kiranmayee et al. | Explorative data analytics of brain tumour data using R | |
Yang et al. | Autonomous semantic community detection via adaptively weighted low-rank approximation | |
Rodrigues et al. | A complex networks approach for data clustering | |
CN104008197B (en) | Feature-weighted compact fuzzy distribution clustering method | |
Monteil et al. | The caRamel R package for automatic calibration by evolutionary multi objective algorithm | |
CN116844649B (en) | Interpretable cell data analysis method based on gene selection | |
Li et al. | High dimensional electromagnetic interference signal clustering based on SOM neural network | |
Koech et al. | K-means clustering of ontologies based on graph metrics |
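This bibliographic record does not reproduce the claimed algorithm, but the technique named in the title, multi-way spectral clustering with a density-aware similarity, follows a well-known pipeline: build an affinity matrix, normalize its graph Laplacian, embed the samples into the bottom k eigenvectors, and run k-means in the embedding. The sketch below is illustrative only, not the patented method: it uses a Zelnik-Manor/Perona-style local scale as a stand-in for the "local density estimation" step, and the function name and parameters are assumptions introduced here.

```python
import numpy as np

def multiway_spectral_clustering(X, k, n_neighbors=7):
    """Generic k-way spectral clustering with locally scaled affinities.

    Illustrative sketch of the standard pipeline (not the patent's method):
    locally scaled affinity -> normalized Laplacian -> spectral embedding
    -> k-means on the embedded rows.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    d = np.sqrt(d2)
    # Local scale sigma_i = distance to the n_neighbors-th nearest neighbor,
    # a simple proxy for local density (Zelnik-Manor & Perona style).
    sigma = np.sort(d, axis=1)[:, n_neighbors]
    # Locally scaled affinity: W_ij = exp(-d_ij^2 / (sigma_i * sigma_j)).
    W = np.exp(-d2 / (sigma[:, None] * sigma[None, :] + 1e-12))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    dinv = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    L = np.eye(n) - dinv[:, None] * W * dinv[None, :]
    # Embed into the k eigenvectors of the smallest eigenvalues
    # (eigh returns eigenvalues in ascending order) and row-normalize.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    # Deterministic farthest-point seeding, then Lloyd's k-means rounds.
    centers = [0]
    for _ in range(1, k):
        gap = ((U[:, None, :] - U[centers][None, :, :]) ** 2).sum(-1).min(1)
        centers.append(int(gap.argmax()))
    C = U[centers]
    for _ in range(100):
        labels = ((U[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        newC = np.stack([U[labels == j].mean(0) if np.any(labels == j) else C[j]
                         for j in range(k)])
        if np.allclose(newC, C):
            break
        C = newC
    return labels
```

On two well-separated blobs the cross-cluster affinities vanish, the Laplacian becomes block-diagonal, and the row-normalized embedding collapses each cluster to a single point, so k-means recovers the partition exactly; this is the standard intuition behind multi-way spectral methods.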
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 2013-11-20 |