CN103399852A - Multi-channel spectrum clustering method based on local density estimation and neighbor relation spreading - Google Patents
Multi-channel spectrum clustering method based on local density estimation and neighbor relation spreading
- Publication number: CN103399852A
- Application number: CN2013102600621A
- Authority: CN (China)
- Prior art keywords: sample, matrix, similarity, neighbor relation
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a multi-channel spectral clustering method based on local density estimation and neighbor relation spreading, which mainly solves the problem that existing clustering methods cannot cluster data whose density is unevenly distributed. The method comprises the steps of: estimating the local density of each sample and appending it to the original data as an extra feature dimension; calculating a distance matrix, a threshold value and a similarity matrix, and initializing a neighbor relation matrix; updating the neighbor relation matrix and the similarity matrix, using the local maximum similar value to update the similarity of samples between subsets, and thereby obtaining an accurate affinity matrix; calculating the degree matrix and the normalized Laplacian matrix; and normalizing the spectrum matrix and obtaining the clustering result through the K-means algorithm. Compared with existing clustering technology, the method obtains a similarity matrix that better reflects the true similarity between samples, so the clustering result is more accurate and the robustness is better.
Description
Technical Field
The invention belongs to the technical field of cluster analysis, and relates to a construction method for improving an affinity matrix in spectral clustering. In particular to a multi-path spectral clustering method based on local density estimation and neighbor relation propagation, which can be used for systems such as data mining, image segmentation, machine learning and the like.
Background
Spectral clustering is built on spectral graph theory; in essence it converts the clustering problem into the optimal partition problem of a graph by means of spectral relaxation. First, for a given data set, an affinity matrix is defined to describe the similarity between data points; then the eigenvalues and eigenvectors of the normalized affinity matrix are calculated, and different data points are clustered by selecting appropriate eigenvectors. Compared with traditional clustering algorithms (such as K-means), this method can cluster data sets of arbitrary shape in the sample space and can converge to a globally optimal solution. Therefore, spectral clustering has been widely applied in fields such as image segmentation, computer vision, speech recognition and VLSI design.
In recent years, Shi and Malik established a normalized cut (Ncut) objective function based on 2-way partitioning according to spectral graph theory and designed a spectral clustering algorithm for image partitioning; Ng et al. later developed it into the NJW algorithm for k-way partitioning. These classical algorithms compute the similarity matrix W with a Gaussian function of the Euclidean distance, which makes it difficult to reflect the real similarity relations between the samples of a data set; in particular, for data sets with complex distribution structures and arbitrary shapes, the similarity matrix constructed in this way fails.
In order to obtain similarities that truly reflect the data, a number of improved methods have appeared, such as path-based similarity measures, new ways of defining the affinity graph based on manifold ranking, and similarity functions based on a density-sensitive distance measure. In 2012, Li et al. proposed an affinity matrix construction method based on neighbor relation propagation (SC-NP for short), which initializes the neighbor relations of samples according to a distance threshold ε and then, following the neighbor relation propagation principle, divides samples with high similarity into the same subset. Although this method can amplify the similarity of samples within the same subset, it measures the similarity of samples between subsets with the global minimum of the similarity matrix W, which reduces the similarity between different subsets belonging to the same class. Moreover, for data with uneven density distribution, samples of different classes are easily divided into the same subset, so the constructed affinity matrix cannot truly reflect the similarity between samples and the clustering result is inaccurate.
Disclosure of Invention
The invention aims to overcome the problems in the background art and provides a multi-path spectral clustering method based on local density estimation and neighbor relation propagation; by constructing a similarity matrix that truly reflects the similarity between samples, the clustering result becomes more accurate and stable.
The key technology for realizing the invention is a multi-path spectral clustering method based on local density estimation and neighbor relation propagation, whose concrete implementation steps comprise:
(1) Input the data set X = {x1, x2, ..., xn} ⊂ R^d, where xn represents the n-th sample in the data set, n is the number of samples, and d represents the dimension of a sample.
(2) Estimating local density of a sample
(2a) Find the K nearest neighbor samples of each sample x and form the set N(x) = {y1, y2, ..., yK} ⊂ R^d, where yK denotes the K-th nearest neighbor sample of x;
(2b) Calculate the distance set of sample x, D(x) = {d(yi, nearest(yi)) | i = 1, ..., K}, where nearest(yi) denotes the nearest neighbor sample of yi and d(yi, yj) denotes the Euclidean distance between yi and yj:

d(yi, yj) = ( Σ_{l=1}^{d} (yil - yjl)² )^(1/2)

where yil and yjl are the l-th dimension attribute values of the i-th and j-th samples, respectively;
(2c) Define the sample set near(x). First judge whether the elements of D(x) can be divided into two classes; if so, denote the class containing more elements as y'1, y'2, ..., y'm, where m is its number of samples. These m samples are considered to have a large influence on the density estimate of x, and near(x) = {y'i | i = 1, ..., m}; otherwise, the K samples as a whole are considered to have a large influence on the density estimate of x, and near(x) = N(x);
(2d) Calculate the discrimination density estimate of sample x with a Gaussian window over near(x), where near(x) is obtained according to step (2c), d(x, xi) is the Euclidean distance between samples x and xi, and σ² is the window width;
(2e) Calculate the individual density estimate of sample x from its 3rd nearest neighbor, where y3 is the 3rd nearest neighbor sample of x;
(2f) Compute the local density estimate of sample x, defined as the sum of the discrimination density estimate and the individual density estimate.
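A minimal numpy sketch of the density estimate of steps (2a)-(2f). It assumes the near(x) = N(x) branch of step (2c) for every sample and folds the discrimination and individual terms into a single Gaussian-window average over the K nearest neighbors, so `local_density`, `K` and `sigma` are illustrative simplifications, not the patent's exact formulas:

```python
import numpy as np

def local_density(X, K=7, sigma=1.0):
    """Simplified local density of steps (2a)-(2f): a Gaussian-window
    average over the K nearest neighbors of each sample (the patent's
    split of D(x) into two groups and its separate individual density
    term are folded into this single average)."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # pairwise Euclidean distances
    dens = np.empty(n)
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:K + 1]      # K nearest neighbors, self excluded
        dens[i] = np.exp(-dist[i, nbrs] ** 2 / (2 * sigma ** 2)).mean()
    return dens
```

For density-uneven data this gives tight-cluster samples larger values than sparse-cluster samples, which is what the (d+1)-th feature dimension of step (3) relies on.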
(3) Raise the dimension of the samples: take the local density of each sample in the set X as its (d+1)-th dimension, obtaining a new data set X* = {x*1, x*2, ..., x*n} ⊂ R^(d+1).
(4) Calculate the Euclidean distance between the samples of X* to obtain the distance matrix B = [bij]n×n.
(5) Calculate the similarity between the samples of X* to obtain the similarity matrix W = [wij]n×n, with wij = exp(-bij²/(2σ²)), where wij denotes the similarity between the i-th sample xi and the j-th sample xj.
(6) Calculate a threshold value ε from the distance matrix B, determine the neighbor relations between the samples corresponding to the elements of B, and obtain the initial neighbor relation matrix T.
(7) Update the neighbor relation matrix T and the similarity matrix W according to the neighbor relation propagation principle to obtain the affinity matrix A.
(8) Construct the degree matrix D and the Laplacian matrix Lsym = D^(-1/2) A D^(-1/2), where D is a diagonal matrix whose diagonal element dii = Σj aij denotes the degree of the i-th sample xi.
(9) Calculate the eigenvectors corresponding to the first k largest eigenvalues of Lsym, construct the matrix V, and normalize its rows to unit length to obtain the matrix Y = [yij]n×k, where yij = vij / (Σl vil²)^(1/2).
(10) Take each row of Y as a sample point in k-dimensional space and group the sample points into k classes with the K-means algorithm. If and only if the i-th row of Y is assigned to the j-th class, the sample point xi in the original data set is assigned to class j.
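Steps (8)-(10) can be sketched as follows, assuming an affinity matrix A has already been built; the K-means routine here is a bare-bones stand-in (farthest-point initialization, fixed iteration count), since the patent assumes a standard library implementation:

```python
import numpy as np

def spectral_partition(A, k, n_iter=50):
    """Steps (8)-(10) in minimal form: normalized Laplacian, top-k
    eigenvectors, row normalization, then a bare-bones K-means."""
    d = A.sum(axis=1)
    d_is = 1.0 / np.sqrt(d)
    Lsym = A * d_is[:, None] * d_is[None, :]          # D^{-1/2} A D^{-1/2}
    _, vecs = np.linalg.eigh(Lsym)                    # eigenvalues in ascending order
    V = vecs[:, -k:]                                  # eigenvectors of the k largest
    Y = V / np.linalg.norm(V, axis=1, keepdims=True)  # unit-length rows
    # bare-bones K-means with farthest-point initialization
    centers = [Y[0]]
    for _ in range(1, k):
        d2 = np.min(np.stack([((Y - c) ** 2).sum(-1) for c in centers]), axis=0)
        centers.append(Y[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((Y[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            pts = Y[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return labels
```

On a block-structured affinity matrix the rows of Y collapse to k nearly identical points per class, so even this crude K-means recovers the partition.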
The invention simultaneously considers the spatial-consistency constraint information and the density information of the data. Spatial consistency ensures that neighboring samples have a higher probability of belonging to the same class. Even when the data samples have arbitrary shapes and distribution structures, the element values of the affinity matrix can reflect the real similarity relations between the data, so the clustering result is more accurate and stable and the clustering effectiveness of the multi-path spectral clustering method is improved.
The invention has the following advantages:
(1) the method has the advantages that the samples of different classes are prevented from entering the same subset, so that the updated affinity matrix can truly reflect the similarity among the samples;
(2) accurate and stable clustering results can be obtained for data sets with uniform and non-uniform density distribution;
(3) compared with the prior art, the method has better clustering performance and robustness.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is the Data1 data set with a non-uniform density distribution, containing 2 classes of samples;
FIG. 3 is the Data2 data set with a non-uniform density distribution, containing 3 classes of samples;
FIG. 4 is a distribution diagram of the subsets of the Data1 data set obtained by the SC-NP method;
FIG. 5 is a distribution diagram of the subsets of the Data2 data set obtained by the SC-NP method;
FIG. 6 is a diagram of the clustering results of the Data1 data set obtained by the SC-NP method;
FIG. 7 is a diagram of the clustering results of the Data2 data set obtained by the SC-NP method;
FIG. 8 is a distribution diagram of the subsets of the Data1 data set obtained by the present invention;
FIG. 9 is a distribution diagram of the subsets of the Data2 data set obtained by the present invention;
FIG. 10 is a diagram of the clustering results of the Data1 data set obtained by the present invention;
FIG. 11 is a diagram of the clustering results of the Data2 data set obtained by the present invention;
FIG. 12 is a diagram of the clustering results of the Data1 data set obtained by the K-means method;
FIG. 13 is a diagram of the clustering results of the Data2 data set obtained by the K-means method;
FIG. 14 is a diagram of the clustering results of the Data1 data set obtained by the Ncut method;
FIG. 15 is a diagram of the clustering results of the Data2 data set obtained by the Ncut method;
FIG. 16 is a diagram of the clustering results of the Data1 data set obtained by the NJW method;
FIG. 17 is a diagram of the clustering results of the Data2 data set obtained by the NJW method;
FIG. 18 is a graph of the eigenvalues of the Data1 data set obtained by the present invention for different parameters σ;
FIG. 19 is a graph of the eigenvalues of the Data2 data set obtained by the present invention for different parameters σ;
FIG. 20 is the Data3 data set with a relatively uniform density distribution, containing 2 classes of samples;
FIG. 21 is the Data4 data set with a relatively uniform density distribution, containing 3 classes of samples;
FIG. 22 is a diagram of the clustering results of the Data3 data set obtained by the present invention;
FIG. 23 is a diagram of the clustering results of the Data4 data set obtained by the present invention.
Detailed Description
Introduction of basic theory
1. Spectral graph theory
Assuming that each data sample is regarded as a vertex of a graph, that the edges between vertices are weighted according to the similarity between samples, and that an undirected weighted graph G = (V, E) based on sample similarity is thereby constructed, the clustering problem can be converted into a graph partitioning problem.
Graph partitioning principle: the weights within sub-graphs are maximized and the weights between sub-graphs are minimized. The cost of dividing graph G into two sub-graphs V1 and V2 can be expressed as:

cut(V1, V2) = Σ_{u∈V1, v∈V2} w_uv

where V1 ∪ V2 = V and w_uv is the similarity of samples u and v. The normalized cut objective function based on a 2-way partition is then:

Ncut(V1, V2) = cut(V1, V2)/assoc(V1, V) + cut(V1, V2)/assoc(V2, V)

where assoc(Vi, V) = Σ_{u∈Vi, t∈V} w_ut. Minimizing the Ncut function is referred to as the normalized cut-set criterion; it simultaneously measures the similarity between samples within a class and the dissimilarity between samples of different classes.
If the graph is divided into several sub-graphs, the normalized cut objective function based on a k-way partition is:

Ncut(V1, V2, ..., Vk) = Σ_{i=1}^{k} cut(Vi, V \ Vi)/assoc(Vi, V)
the spectral clustering algorithm is to derive new characteristics of a clustering object through a matrix analysis theory, and cluster original data by using the new characteristics, wherein the theoretical basis of the spectral clustering algorithm is a Laplancian matrix which generally has three forms: in some embodiments, the compound is a compound of formula i (i) or (ii) and (iii) or (iii)uv]n×nD is a diagonal matrix,② the symmetrical form of the specification is represented as Lsym=D-1/2LD-1/2=I-D-1/2WD-1/2(ii) a ③ the normalized random walk form is expressed as Lrw=D-1L=I-D-1W。
When a 2-way partition is employed, let p = [pi], i ∈ V, be the indicator vector of V1, with pi = 1 if vertex i belongs to V1 and pi = -1 otherwise; the Ncut objective can then be written as a discrete function of an indicator-based vector x, denoted equation (5). Considering the constraint x^T W e = x^T D e = 0 and relaxing x to the continuous domain [-1, 1], equation (5) becomes the Rayleigh quotient problem

min_x (x^T (D - W) x) / (x^T D x), subject to x^T D e = 0, (6)

whose solutions are given by the generalized eigenvalue problem

(D - W) x = λ D x. (7)

According to the Rayleigh quotient principle, the optimal solution of equation (6) is the eigenvector x2 corresponding to the second smallest eigenvalue λ2 of equation (7). Since x2 contains the partition information of the graph, its elements can be divided into 2 classes according to a heuristic rule.
Similarly, the optimal solution of the k-way normalized cut objective can be derived: it is given by the eigenvectors corresponding to the k smallest eigenvalues of equation (7). The k-way clustering methods, such as the NJW algorithm, have been shown to achieve a better clustering effect than 2-way methods.
2. Density estimation
Let the data set X = {x1, x2, ..., xn} ⊂ R^d be independent and identically distributed samples of a random variable; the Parzen window density estimate of that variable is:

p̂(x) = (1/n) Σ_{i=1}^{n} K_σ(x - xi)    (8)

where K_σ(·) is a kernel function, σ² is the window width, and n is the number of samples. The kernel function is typically symmetric and bounded, i.e. K(-u) = K(u), |K(u)| < ∞, and ∫ K(u) du = 1.
Commonly used kernels include the Gaussian, Epanechnikov, Biweight and square-wave kernel functions. In the present invention the local density of the sample x is estimated with the Gaussian kernel, i.e.

p̂(x) = (1/n) Σ_{i=1}^{n} (2πσ²)^(-d/2) exp(-||x - xi||²/(2σ²))    (9)
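The Gaussian-kernel Parzen estimate is a direct transcription of the formula above; a short sketch, not patent-specific code:

```python
import numpy as np

def parzen_gaussian(x, X, sigma):
    """Parzen window estimate at point x: the mean over the samples of
    (2*pi*sigma^2)^(-d/2) * exp(-||x - x_i||^2 / (2*sigma^2))."""
    n, d = X.shape
    sq = ((X - x) ** 2).sum(axis=1)
    return ((2 * np.pi * sigma ** 2) ** (-d / 2) * np.exp(-sq / (2 * sigma ** 2))).mean()
```

The estimate is largest near concentrations of samples and can never exceed the kernel's peak value, which makes it a natural local density feature.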
Second, the multi-channel spectral clustering method of the invention based on local density estimation and neighbor relation propagation
Referring to fig. 1, the implementation of the present invention includes the following steps:
(1.1) Search for the K nearest neighbors of sample x, denoted by the set N(x) = {y1, y2, ..., yK} ⊂ R^d; in the invention K is taken to be 7. The distance set of sample x is calculated according to equation (10):

D(x) = {d(yi, nearest(yi)) | i = 1, ..., K}    (10)

where nearest(yi) denotes the nearest neighbor sample of yi and d(yi, yj) denotes the Euclidean distance between yi and yj.
(1.2) defining near (x) according to the criterion:
(1.2.1) If the elements of D(x) can be divided into two classes, let m be the number of samples corresponding to the class with more elements; these m samples have a larger influence on the density estimate of x. Denote the m samples as x'1, x'2, ..., x'm and define near(x) = {x'i | i = 1, ..., m};
(1.2.2) If the elements of D(x) cannot be divided into two classes, the K samples are considered to have a large influence on the density estimate, and near(x) = N(x) is defined.
The local density is extended as the (d+1)-th dimension of sample x, forming a new data set X* = {x*1, x*2, ..., x*n} ⊂ R^(d+1).
(3.1) Calculate the Euclidean distance bij between any two samples xi and xj, and construct the distance matrix B = [bij]n×n;
(3.2) Calculate the similarity wij between any two samples xi and xj, and construct the similarity matrix W = [wij]n×n, where wij = exp(-bij²/(2σ²));
(3.3) Determine the neighbor relations of the samples corresponding to the elements of the distance matrix B according to the threshold ε. Define the neighbor relation matrix T = [tij]n×n; at initialization all elements of T are set to 0, and the distance threshold ε is calculated from B. In the distance matrix B, if bij ≤ ε, set tij = 1 and tji = tij, indicating that samples xi and xj have a neighbor relation, written (xi, xj) ∈ R, where R denotes the neighbor relation.
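A minimal sketch of this initialization. The patent computes the threshold ε from the distance matrix B, but its exact formula is not reproduced in the text, so ε is passed as a parameter here, with the mean off-diagonal distance shown as one plausible, assumed choice:

```python
import numpy as np

def mean_offdiag(B):
    """One plausible threshold (an assumption, not the patent's
    formula): the mean off-diagonal distance of B."""
    n = B.shape[0]
    return (B.sum() - np.trace(B)) / (n * (n - 1))

def init_neighbor_matrix(B, eps):
    """Step (3.3): t_ij = 1 iff b_ij <= eps for i != j."""
    T = (B <= eps).astype(int)
    np.fill_diagonal(T, 0)       # a sample is not its own neighbor here
    return T
```

The resulting T is symmetric whenever B is, matching the tji = tij rule of the text.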
(3.4) According to the transfer (propagation) principle of the neighbor relation, update the neighbor relation matrix T and the similarity matrix W respectively.
(3.4.2) Define subsets: after propagation of the neighbor relation, each sample has a corresponding neighbor relation domain; neighbor domains containing exactly 1 sample are defined as single-sample sets, and neighbor domains containing more than 1 sample are defined as multi-sample sets. Single-sample sets and multi-sample sets are collectively referred to as subsets. If c mutually disjoint subsets C1, C2, ..., Cc are obtained by the neighbor relation propagation algorithm, each subset must satisfy 1 ≤ |Ci| < n, where 1 ≤ i ≤ c;
(3.4.3) The matrix T and the similarity of samples within a subset are updated as follows: if tij = 1, tjk = 1 and tik = 0 in T, update tik and tki to 1, and at the same time update the elements wik and wki of the similarity matrix W to min(wij, wjk);
(3.4.4) The similarity of samples between subsets is updated as follows: let the subsets C1, C2, ..., Cc contain m1, m2, ..., mc samples respectively, and define W(Ci, Cj) as the matrix formed by the similarities between the samples of Ci and Cj. The maximum of the sample similarities between the two subsets is selected as the local maximum similar value, expressed as MaxSim = max(W(Ci, Cj)), and the similarity of the samples between the subsets is updated to this local maximum.
(3.5) The updated similarity matrix W is taken as the affinity matrix A.
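Steps (3.4.2)-(3.4.4) can be sketched as follows. Subsets are taken as the connected components of the neighbor graph T; the intra-subset min-rule of step (3.4.3) is approximated by raising every within-component pair to at least the minimum direct-edge similarity of that component, and the inter-subset similarities are set to the local maximum MaxSim. This is a simplified reading of the update order, not the patent's exact procedure:

```python
import numpy as np
from collections import deque

def propagate(T, W):
    """Simplified neighbor-relation propagation: find the connected
    components of T (the subsets), amplify within-component
    similarities, and replace between-component similarities by the
    local maximum MaxSim = max(W(Ci, Cj))."""
    n = len(T)
    comp = -np.ones(n, dtype=int)
    c = 0
    for s in range(n):                           # connected components via BFS
        if comp[s] >= 0:
            continue
        comp[s] = c
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if T[u, v] and comp[v] < 0:
                    comp[v] = c
                    q.append(v)
        c += 1
    A = W.astype(float).copy()
    comps = [np.where(comp == a)[0] for a in range(c)]
    for a, ia in enumerate(comps):
        mask = T[np.ix_(ia, ia)].astype(bool)
        if mask.any():                           # intra-subset: raise closure pairs to
            m = W[np.ix_(ia, ia)][mask].min()    # the minimum direct-edge similarity
            A[np.ix_(ia, ia)] = np.maximum(A[np.ix_(ia, ia)], m)
        for ib in comps[a + 1:]:                 # inter-subset: local maximum MaxSim
            mx = W[np.ix_(ia, ib)].max()
            A[np.ix_(ia, ib)] = mx
            A[np.ix_(ib, ia)] = mx
    return A, comp
```

On a chain 0-1-2 with an isolated node 3, the closure pair (0, 2) is raised to min(w01, w12) and every similarity to node 3 becomes the local maximum of the original cross-similarities, mirroring the min-rule and MaxSim-rule of the text.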
D is a diagonal matrix whose diagonal element dii = Σj aij represents the degree of sample xi; the Laplacian matrix can then be represented as Lsym = D^(-1/2) A D^(-1/2).
(5.1) Let 1 = λ1 ≥ λ2 ≥ ... ≥ λk be the first k largest eigenvalues of Lsym and v1, v2, ..., vk the corresponding eigenvectors; construct the matrix V = [v1, v2, ..., vk] ∈ R^(n×k), where each vi is a column vector;
(5.2) Normalize each row of V to unit length to obtain the matrix Y, whose elements are yij = vij / (Σl vil²)^(1/2).
Take each row of Y as a sample in k-dimensional space and divide the samples into k classes with the K-means algorithm. If and only if the i-th row of Y is assigned to the j-th class, the sample xi in the original data set is classified into the j-th class.
The effects of the present invention can be further illustrated by the following experimental simulations.
1. Simulation conditions
Four artificial data sets are selected for the simulation experiments, shown in FIGS. 2, 3, 20 and 21 respectively; Data1 and Data2 in FIGS. 2 and 3 are data sets with uneven density distribution, while Data3 and Data4 in FIGS. 20 and 21 are data sets with even density distribution. The simulation software is MATLAB 7.1.
2. Simulation result
FIGS. 4 and 5 are the sample subset distribution diagrams obtained by the SC-NP method. It can be seen that some samples of different classes are divided into the same subset (e.g., the marked samples in FIG. 4 and FIG. 5), resulting in an inaccurate similarity matrix and thus erroneous clustering results, as shown in FIGS. 6 and 7.
FIGS. 8 and 9 are the sample subset distribution diagrams obtained by the method of the present invention. It can be seen that samples of different classes are divided into different subsets, so a more accurate similarity matrix is obtained and the clustering result is more accurate, as shown in FIGS. 10 and 11.
FIG. 12 and FIG. 13 are graphs of clustering results obtained by the K-means method; fig. 14 and 15 are graphs of clustering results obtained by the Ncut method; fig. 16 and 17 are graphs of clustering results obtained by the NJW method. As can be seen from the experimental result graph, the method of the invention is obviously superior to other clustering methods.
FIGS. 18 and 19 are the eigenvalue curves obtained by the method of the present invention for different parameters σ. The curve shapes corresponding to different values of σ are similar and the eigenvalues vary little between curves; within one curve, however, there are obvious abrupt-change points, such as eigenvalue number 3 in FIG. 18 and eigenvalue number 4 in FIG. 19. Accordingly, selecting the eigenvectors corresponding to the first k = 2 largest eigenvalues in FIG. 18 and the first k = 3 largest eigenvalues in FIG. 19 yields accurate clustering results, which shows that the method of the invention has good robustness to the parameter σ.
Fig. 22 and 23 are graphs of clustering results obtained by the method of the present invention, and it is obvious that the method of the present invention can also obtain accurate clustering results for data sets with uniformly distributed density.
Claims (3)
1. A multi-path spectral clustering method based on local density estimation and neighbor relation propagation comprises the following steps:
(1) Input the data set X = {x1, x2, ..., xn} ⊂ R^d, where xn represents the n-th sample in the data set, n is the number of samples, and d is the dimension of the samples;
(2) estimating the local density of the sample:
(2a) Find the K nearest neighbor samples of sample x and form the set N(x) = {y1, y2, ..., yK} ⊂ R^d, where yK denotes the K-th nearest neighbor sample of x;
(2b) Calculate the distance set of x: D(x) = {d(yi, nearest(yi)) | i = 1, ..., K}, where nearest(yi) denotes the nearest neighbor sample of yi and d(yi, yj) denotes the Euclidean distance between yi and yj:

d(yi, yj) = ( Σ_{l=1}^{d} (yil - yjl)² )^(1/2)

where yil and yjl represent the l-th dimension attribute values of the i-th and j-th samples, respectively;
(2c) Define the sample set near(x): first judge whether the elements of D(x) can be divided into two classes; if so, denote the class containing more elements as y'1, y'2, ..., y'm, where m is its number of samples; these m samples are considered to have a large influence on the density estimate of x, and near(x) = {y'i | i = 1, ..., m}; otherwise, the K samples as a whole are considered to have a large influence on the density estimate of x, and near(x) = N(x);
(2d) Calculate the discrimination density estimate of x over near(x), where near(x) is obtained according to step (2c), d(x, xi) is the Euclidean distance between samples x and xi, and σ² is the window width;
(2e) Calculate the individual density estimate of x, where y3 is the 3rd nearest neighbor sample of x;
(2f) Calculate the local density estimate of x, defined as the sum of the discrimination density estimate and the individual density estimate;
(3) Raise the sample dimension: extend the local density as the (d+1)-th dimension of sample x, forming a new data set X* = {x*1, x*2, ..., x*n} ⊂ R^(d+1);
(4) Calculate the Euclidean distance between the samples of X* to obtain the distance matrix B = [bij]n×n;
(5) Calculate the similarity between the samples of X* to obtain the similarity matrix W = [wij]n×n, wij = exp(-bij²/(2σ²)), where wij represents the similarity between the i-th sample and the j-th sample;
(6) Calculate a threshold value ε from the distance matrix B, determine the neighbor relations between the samples corresponding to the elements of B, and obtain the initial neighbor relation matrix T;
(7) respectively updating T and W according to a neighbor relation propagation principle to obtain an affinity matrix A;
(8) Construct the degree matrix D and the Laplacian matrix Lsym, where D is a diagonal matrix whose diagonal element dii = Σj aij denotes the degree of the i-th sample xi, and Lsym = D^(-1/2) A D^(-1/2);
(9) Calculate the eigenvectors corresponding to the first k largest eigenvalues of Lsym, construct the matrix V, and normalize its rows to unit length to obtain the matrix Y = [yij]n×k, where yij = vij / (Σl vil²)^(1/2);
(10) Take each row of Y as a sample point in k-dimensional space and cluster the sample points into k classes with the K-means algorithm; if and only if the i-th row of Y is assigned to the j-th class, the sample xi is assigned to class j.
2. The method of claim 1, wherein step (6) is performed as follows:
Define the neighbor relation matrix T = [tij]n×n; at initialization all elements of T are set to 0, and the distance threshold ε is calculated from B. In the distance matrix B, if bij ≤ ε, set tij = 1 and tji = tij, indicating that samples xi and xj have a neighbor relation, written (xi, xj) ∈ R, where R denotes the neighbor relation.
3. The method of claim 1, wherein step (7) is performed as follows:
(3.2) Define subsets: after propagation of the neighbor relation, each sample has a corresponding neighbor relation domain; neighbor domains containing exactly 1 sample are defined as single-sample sets, neighbor domains containing more than 1 sample are defined as multi-sample sets, and single-sample sets and multi-sample sets are collectively called subsets; if c mutually disjoint subsets C1, C2, ..., Cc are obtained by the neighbor relation propagation algorithm, each subset must satisfy 1 ≤ |Ci| < n, where 1 ≤ i ≤ c;
(3.3) Update the matrix T and the similarity of samples within a subset respectively: if tij = 1, tjk = 1 and tik = 0 in T, update tik and tki to 1, and at the same time update the elements wik and wki of the similarity matrix W to min(wij, wjk);
(3.4) Update the similarity of samples between subsets: let the subsets C1, C2, ..., Cc contain m1, m2, ..., mc samples respectively, and define W(Ci, Cj) as the matrix formed by the similarities between the samples of Ci and Cj; the maximum of the sample similarities between the two subsets is selected as the local maximum similar value, expressed as MaxSim = max(W(Ci, Cj)), and the similarity of the samples between the subsets is updated to this local maximum;
(3.5) The updated similarity matrix W is taken as the affinity matrix A.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013102600621A CN103399852A (en) | 2013-06-27 | 2013-06-27 | Multi-channel spectrum clustering method based on local density estimation and neighbor relation spreading |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103399852A true CN103399852A (en) | 2013-11-20 |
Family
ID=49563482
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103995821A (en) * | 2014-03-14 | 2014-08-20 | 盐城工学院 | Selective clustering integration method based on spectral clustering algorithm |
CN104517123A (en) * | 2014-12-24 | 2015-04-15 | 西安理工大学 | Sub-spatial clustering method guided by local motion feature similarity |
CN105912726A (en) * | 2016-05-13 | 2016-08-31 | 北京邮电大学 | Density centrality based sampling and detecting methods of unusual transaction data of virtual assets |
CN107480685A (en) * | 2016-06-08 | 2017-12-15 | 国家计算机网络与信息安全管理中心 | A kind of distributed power iteration clustering method and device based on GraphX |
CN108288076A (en) * | 2018-02-12 | 2018-07-17 | 深圳开思时代科技有限公司 | Auto parts machinery clustering method, device, electronic equipment and storage medium |
CN108322320A (en) * | 2017-01-18 | 2018-07-24 | 华为技术有限公司 | Business survival stress method and device |
CN108616457A (en) * | 2018-03-16 | 2018-10-02 | 广东电网有限责任公司茂名供电局 | A kind of method that adapted telecommunication access network service influenza is known |
CN109916627A (en) * | 2019-03-27 | 2019-06-21 | 西南石油大学 | Bearing fault detection and diagnosis based on Active Learning |
CN112258014A (en) * | 2020-10-17 | 2021-01-22 | 中国石油化工股份有限公司 | Clustering and grouping-based risk discrimination analysis method for heat exchangers |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102110173A (en) * | 2011-04-08 | 2011-06-29 | 华北电力大学(保定) | Improved multi-path spectral clustering method for affinity matrix |
2013
- 2013-06-27 CN CN2013102600621A patent/CN103399852A/en active Pending
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102110173A (en) * | 2011-04-08 | 2011-06-29 | 华北电力大学(保定) | Improved multi-path spectral clustering method for affinity matrix |
Non-Patent Citations (2)
Title |
---|
ZHOU LIN ET AL.: "Clustering Ensemble Algorithm Based on Spectral Clustering" (in Chinese), ACTA AUTOMATICA SINICA (《自动化学报》), vol. 38, no. 8, 31 August 2012 (2012-08-31) * |
LI XINYE ET AL.: "Improvement of the Multi-way Spectral Clustering Algorithm for Complex Structures" (in Chinese), JOURNAL OF BEIJING UNIVERSITY OF TECHNOLOGY (《北京工业大学学报》), vol. 39, no. 3, 31 March 2013 (2013-03-31) * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103995821B (en) * | 2014-03-14 | 2017-05-10 | 盐城工学院 | Selective clustering integration method based on spectral clustering algorithm |
CN103995821A (en) * | 2014-03-14 | 2014-08-20 | 盐城工学院 | Selective clustering integration method based on spectral clustering algorithm |
CN104517123A (en) * | 2014-12-24 | 2015-04-15 | 西安理工大学 | Sub-spatial clustering method guided by local motion feature similarity |
CN104517123B (en) * | 2014-12-24 | 2017-12-29 | 西安理工大学 | Subspace clustering method guided by local motion feature similarity |
CN105912726A (en) * | 2016-05-13 | 2016-08-31 | 北京邮电大学 | Density-centrality-based sampling and detection methods for abnormal virtual-asset transaction data |
CN107480685B (en) * | 2016-06-08 | 2021-02-23 | 国家计算机网络与信息安全管理中心 | GraphX-based distributed power iterative clustering method and device |
CN107480685A (en) * | 2016-06-08 | 2017-12-15 | 国家计算机网络与信息安全管理中心 | Distributed power iteration clustering method and device based on GraphX |
CN108322320A (en) * | 2017-01-18 | 2018-07-24 | 华为技术有限公司 | Service survivability analysis method and apparatus |
US11108619B2 (en) | 2017-01-18 | 2021-08-31 | Huawei Technologies Co., Ltd. | Service survivability analysis method and apparatus |
CN108288076A (en) * | 2018-02-12 | 2018-07-17 | 深圳开思时代科技有限公司 | Auto parts clustering method, device, electronic device and storage medium |
CN108616457A (en) * | 2018-03-16 | 2018-10-02 | 广东电网有限责任公司茂名供电局 | Service awareness method for power distribution and utilization communication access networks |
CN109916627A (en) * | 2019-03-27 | 2019-06-21 | 西南石油大学 | Bearing fault detection and diagnosis based on Active Learning |
CN112258014A (en) * | 2020-10-17 | 2021-01-22 | 中国石油化工股份有限公司 | Clustering and grouping-based risk discrimination analysis method for heat exchangers |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103399852A (en) | Multi-way spectral clustering method based on local density estimation and neighbor relation propagation | |
Hepp et al. | Approaches to regularized regression–a comparison between gradient boosting and the lasso | |
Dong et al. | Laplacian matrix learning for smooth graph signal representation | |
Alshammari et al. | Refining a k-nearest neighbor graph for a computationally efficient spectral clustering | |
CN110827921B (en) | Single cell clustering method and device, electronic equipment and storage medium | |
CN104732545B (en) | Texture image segmentation method combining sparse neighbor propagation and fast spectral clustering | |
Cho et al. | Authority-shift clustering: Hierarchical clustering by authority seeking on graphs | |
CN104298999B (en) | Hyperspectral feature learning method based on recursive autoencoding | |
Ramathilagam et al. | Extended Gaussian kernel version of fuzzy c-means in the problem of data analyzing | |
CN103985112B (en) | Image segmentation method based on improved multi-objective particle swarm optimization and clustering | |
CN103064941A (en) | Image retrieval method and device | |
CN104778480A (en) | Hierarchical spectral clustering method based on local density and geodesic distance | |
CN102110173A (en) | Improved multi-path spectral clustering method for affinity matrix | |
Emms et al. | Graph embedding using quantum commute times | |
Wang et al. | MCMC methods for Gaussian process models using fast approximations for the likelihood | |
Bazargan et al. | Bayesian model selection for complex geological structures using polynomial chaos proxy | |
CN103745232A (en) | Band migration-based hyperspectral image clustering method | |
Kiranmayee et al. | Explorative data analytics of brain tumour data using R | |
Yang et al. | Autonomous semantic community detection via adaptively weighted low-rank approximation | |
Rodrigues et al. | A complex networks approach for data clustering | |
CN104008197B (en) | Feature-weighted compact fuzzy distribution clustering method | |
Monteil et al. | The caRamel R package for automatic calibration by evolutionary multi objective algorithm | |
CN116844649B (en) | Interpretable cell data analysis method based on gene selection | |
Li et al. | High dimensional electromagnetic interference signal clustering based on SOM neural network | |
Koech et al. | K-means clustering of ontologies based on graph metrics |
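This bibliographic record does not reproduce the claimed algorithm, but the technique named in the title, multi-way spectral clustering with a density-aware similarity, follows a well-known pipeline: build an affinity matrix, normalize its graph Laplacian, embed the samples into the bottom k eigenvectors, and run k-means in the embedding. The sketch below is illustrative only, not the patented method: it uses a Zelnik-Manor/Perona-style local scale as a stand-in for the "local density estimation" step, and the function name and parameters are assumptions introduced here.

```python
import numpy as np

def multiway_spectral_clustering(X, k, n_neighbors=7):
    """Generic k-way spectral clustering with locally scaled affinities.

    Illustrative sketch of the standard pipeline (not the patent's method):
    locally scaled affinity -> normalized Laplacian -> spectral embedding
    -> k-means on the embedded rows.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    d = np.sqrt(d2)
    # Local scale sigma_i = distance to the n_neighbors-th nearest neighbor,
    # a simple proxy for local density (Zelnik-Manor & Perona style).
    sigma = np.sort(d, axis=1)[:, n_neighbors]
    # Locally scaled affinity: W_ij = exp(-d_ij^2 / (sigma_i * sigma_j)).
    W = np.exp(-d2 / (sigma[:, None] * sigma[None, :] + 1e-12))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    dinv = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    L = np.eye(n) - dinv[:, None] * W * dinv[None, :]
    # Embed into the k eigenvectors of the smallest eigenvalues
    # (eigh returns eigenvalues in ascending order) and row-normalize.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    # Deterministic farthest-point seeding, then Lloyd's k-means rounds.
    centers = [0]
    for _ in range(1, k):
        gap = ((U[:, None, :] - U[centers][None, :, :]) ** 2).sum(-1).min(1)
        centers.append(int(gap.argmax()))
    C = U[centers]
    for _ in range(100):
        labels = ((U[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        newC = np.stack([U[labels == j].mean(0) if np.any(labels == j) else C[j]
                         for j in range(k)])
        if np.allclose(newC, C):
            break
        C = newC
    return labels
```

On two well-separated blobs the cross-cluster affinities vanish, the Laplacian becomes block-diagonal, and the row-normalized embedding collapses each cluster to a single point, so k-means recovers the partition exactly; this is the standard intuition behind multi-way spectral methods.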
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 2013-11-20 |