WO2022227956A1 - 一种基于局部核的最优邻居多核聚类方法及系统 - Google Patents

一种基于局部核的最优邻居多核聚类方法及系统 Download PDF

Info

Publication number
WO2022227956A1
WO2022227956A1 PCT/CN2022/082643 CN2022082643W WO2022227956A1 WO 2022227956 A1 WO2022227956 A1 WO 2022227956A1 CN 2022082643 W CN2022082643 W CN 2022082643W WO 2022227956 A1 WO2022227956 A1 WO 2022227956A1
Authority
WO
WIPO (PCT)
Prior art keywords
kernel
matrix
clustering
local
sample
Prior art date
Application number
PCT/CN2022/082643
Other languages
English (en)
French (fr)
Inventor
朱信忠
徐慧英
刘吉元
赵建民
Original Assignee
浙江师范大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江师范大学 filed Critical 浙江师范大学
Publication of WO2022227956A1 publication Critical patent/WO2022227956A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods

Definitions

  • the present application relates to the technical field of data analysis, and in particular, to a local kernel-based optimal neighbor multi-kernel clustering method and system.
  • Kernel clustering methods have been widely used in machine learning and data mining. It implicitly maps the original inseparable data to a high-dimensional Hilbert space. In this space, the corresponding sample points have clear decision boundaries. Then, the unlabeled data is divided into clusters using classical clustering algorithms, including k-means clustering, fuzzy c-means clustering, spectral clustering and Gaussian mixture model (GMM), etc.
  • GMM Gaussian mixture model
  • kernel clustering methods have achieved great success in a large number of practical applications, they can only use a single kernel to process data. Meanwhile, kernel functions can be classified into different types, such as polynomial kernel functions, Gaussian kernel functions, linear kernel functions, etc., and manual parameter selection is required.
  • the clustering task has no label information, how to choose the correct kernel function and its parameters is still an open question.
  • the characteristics of the samples may also be collected from different data sources. For example, a person's portrait can be described through various aspects such as the person's appearance, social network, and work habits. The most common approach is to concatenate all features into a vector, but this approach ignores the incomparability of different types of features.
  • the multi-kernel clustering (MKC) algorithm solves the above problems by fusing the complementary information of different kernel matrices, which can be roughly divided into three categories.
  • Methods in the first category construct cluster-consistent kernels by using low-rank optimization methods. For example, a shared low-rank matrix is first recovered from the transition probability matrix of multiple cores, which is then used as the input to the standard Markov chain method for clustering.
  • the second class of techniques uses the partitioning matrix generated from each kernel to compute their clustering results. First, perform kernel k-means clustering on each incomplete view to obtain multiple partition matrices, and then fuse the complementary information between different partition matrices to obtain the final solution.
  • the third class of algorithms builds consensus kernels during the clustering process. The basic assumption of most algorithms is that the optimal kernel can be represented as a weighted combination of pre-specified kernels. In addition to this, various regularization methods have been proposed to constrain the kernel weights.
  • Kernel alignment is an effective regularization method in the multi-kernel k-means algorithm, which forces all sample pairs to be equally aligned with the same ideal similarity.
  • This conflicts with the accepted concept that aligning two more distant samples with low similarity in high-dimensional space is not reliable.
  • the local kernel trick can solve this problem. It better captures the inherent characteristics of the samples in the data, constructs local kernels using the neighborhoods of each sample, and maximizes their aligned sum with the ideal similarity matrix.
  • local kernels can help clustering algorithms make better use of the information provided by closer pairs of samples.
  • the above MKC algorithm suffers from two problems: it does not fully consider the local density around a single data sample and overly restricts the representational ability of learning the optimal kernel.
  • the local kernel sets the number of neighborhoods of each sample to a global constant, which does not guarantee that all pairs of samples in the local kernel are close to each other. Alignments with distant sample pairs are notoriously less reliable. Therefore, such local kernels cannot minimize unreliability due to ignoring local densities around individual data samples.
  • most multi-kernel clustering algorithms assume that the optimal kernel is a weighted combination of pre-specified kernels, while ignoring some more robust kernels.
  • the purpose of this application is to provide a local kernel-based optimal neighbor multi-kernel clustering method and system for the defects of the prior art.
  • a local kernel-based optimal neighbor multi-kernel clustering method including:
  • calculating the kernel matrix of each view corresponding to the target data sample in step S2 is specifically: for the target data sample The p-th view in the kernel function is mapped to obtain the kernel matrix of the p-th view, which is expressed as:
  • represents the average distance between all target data samples
  • e represents a natural constant
  • K p (i,j) represents the value of the ith row and jth column in the kernel matrix of the pth view
  • m represents the number of views.
  • a local kernel-based optimal neighbor multi-kernel clustering objective function is established, which is expressed as:
  • H represents the division matrix
  • represents the combination coefficient
  • J represents the optimal neighbor kernel
  • n represents the number of all samples
  • represents the adaptive kernel similarity threshold
  • M represents the kernel relationship matrix
  • K ⁇ represents the matrix obtained by combining the kernel matrix according to the ⁇ coefficient, H T represents the transposition of the division matrix;
  • I k represents the k-order identity matrix;
  • step S4 is specifically:
  • a (i) S (i) S (i)T ;
  • a (i) represents intermediate variable,
  • a (i) S (i) S (i)T ;
  • In represents n -order identity matrix;
  • B represents the intermediate variable
  • Matrix J obtains the solution to the problem by removing the negative eigenvalues in matrix B;
  • ⁇ T represents the transpose of ⁇
  • M pq represents the relationship between the kernel matrices p and q;
  • K p , K q , and ⁇ all represent intermediate variables;
  • ⁇ p represents the value of position p in the ⁇ vector .
  • obj t+1 and obj t represent the value of the objective function of the t+1 and t-th iterations, respectively; ⁇ represents the set precision.
  • a local kernel-based optimal neighbor multi-kernel clustering system including:
  • the acquisition module is used to acquire clustering tasks and target data samples
  • the calculation module is used to calculate the kernel matrix of each view corresponding to the target data sample, and to centralize and normalize the kernel matrix to obtain the processed kernel matrix;
  • the solving module is used to solve the established objective function in a cyclic manner, and obtain the division matrix after view fusion;
  • the clustering module is used to perform k-means clustering on the obtained partition matrix to obtain the clustering result.
  • calculating the kernel matrix of each view corresponding to the target data sample in the calculation module is specifically: for the target data sample The p-th view in the kernel function is mapped to obtain the kernel matrix of the p-th view, which is expressed as:
  • represents the average distance between all target data samples
  • e represents a natural constant
  • K p (i,j) represents the value of the ith row and jth column in the kernel matrix of the pth view
  • m represents the number of views.
  • a local kernel-based optimal neighbor multi-kernel clustering objective function is established in the establishment module, which is expressed as:
  • H represents the division matrix
  • represents the combination coefficient
  • J represents the optimal neighbor kernel
  • n represents the number of all samples
  • represents the adaptive kernel similarity threshold
  • M represents the kernel relationship matrix
  • K ⁇ represents the matrix obtained by combining the kernel matrix according to the ⁇ coefficient, H T represents the transposition of the division matrix;
  • I k represents the k-order identity matrix;
  • solving module is specifically:
  • the first fixing module is used to fix J and ⁇ and optimize H;
  • a (i) S (i) S (i)T ;
  • a (i) represents intermediate variable,
  • a (i) S (i) S (i)T ;
  • In represents n -order identity matrix;
  • the second fixed module is used to fix H and ⁇ and optimize J;
  • B represents the intermediate variable
  • Matrix J obtains the solution to the problem by removing the negative eigenvalues in matrix B;
  • the third fixed module is used to fix H and J and optimize ⁇ ;
  • ⁇ T represents the transpose of ⁇
  • M pq represents the relationship between the kernel matrices p and q;
  • K p , K q , and ⁇ all represent intermediate variables;
  • ⁇ p represents the value of position p in the ⁇ vector .
  • termination conditions in the first fixed module, the second fixed module and the third fixed module are expressed as:
  • obj t+1 and obj t represent the value of the objective function of the t+1 and t-th iterations, respectively; ⁇ represents the set precision.
  • the present application proposes a novel local kernel-based optimal neighbor multi-kernel clustering method and system, which includes constructing an adaptive local kernel matrix, finding and constructing an optimal neighbor kernel matrix, and fusing the adaptive local kernel matrix.
  • the construction of the kernel matrix, the search of the optimal neighbor kernel matrix and the clustering are three parts, and they are integrated in the same objective formula to solve.
  • the method greatly improves the performance of the multi-core clustering algorithm, and the experimental results on four public datasets prove that the performance of the present application is superior to the existing algorithm.
  • Embodiment 1 is a flowchart of a local kernel-based optimal neighbor multi-kernel clustering method provided by Embodiment 1;
  • FIG. 2 is a schematic diagram showing the comparison of local cores provided in Embodiments 1 and 2.
  • FIG. 2 is a schematic diagram showing the comparison of local cores provided in Embodiments 1 and 2.
  • the present application provides a local kernel-based optimal neighbor multi-kernel clustering method and system.
  • a local kernel-based optimal neighbor multi-kernel clustering method provided by this embodiment, as shown in FIG. 1 includes:
  • a new method for multi-kernel clustering of optimal neighbors based on local kernels proposed in this embodiment includes constructing an adaptive local kernel matrix, finding and constructing an optimal neighbor kernel matrix, and fusing adaptive local kernels.
  • the construction of the matrix, the search of the optimal neighbor kernel matrix and the clustering are three parts, and they are integrated in the same objective formula to solve the problem, which greatly improves the clustering performance.
  • step S12 the kernel matrix of each view corresponding to the target data sample is calculated, and the kernel matrix is subjected to centering and normalization processing to obtain the processed kernel matrix.
  • the kernel function For the target data sample The vth view in the kernel function is mapped (the commonly used kernel functions include Gaussian kernel function, linear kernel function, etc.), this embodiment takes the Gaussian kernel function as an example, and finally the kernel matrix of the vth view is obtained, which is expressed as:
  • represents the average distance between all target data samples
  • e represents a natural constant
  • K p (i,j) represents the value of the ith row and jth column in the kernel matrix of the pth view
  • m represents the number of views.
  • each kernel matrix is centered and normalized, that is, the mean is 0 and the variance is 1.
  • step S13 according to the obtained processed kernel matrix, an optimal neighbor multi-kernel clustering objective function based on local kernel is established.
  • an adaptive local kernel matrix based on adaptation is adopted.
  • the adaptive local kernel matrix is constructed as follows: For the kernel matrix J, the local kernel matrix corresponding to the i-th sample consists of samples whose similarity is greater than ⁇ , which can be formally expressed as represents the ⁇ (i) nearest neighbor of the ith sample, and S (i)T represents the transpose of S (i) .
  • Figure 2 is its visualization.
  • Figure 2(a) is the kernel matrix J. The greater the similarity between samples, the higher the gray value, which is marked as 1, 0.75, 0.5 and 0.25 in turn.
  • is set to 0.75, subgraphs 2(b.1) and 2(c.2) can be obtained.
  • Figure 2(b.1) is the adaptive local kernel corresponding to the first sample
  • Figure 2(c.2) is the adaptive local kernel corresponding to the third sample.
  • This embodiment establishes the optimal neighbor multi-kernel clustering objective function based on local kernel, which is expressed as:
  • H represents the division matrix
  • represents the combination coefficient
  • J represents the optimal neighbor kernel
  • n represents the number of all samples
  • represents the adaptive kernel similarity threshold
  • M represents the kernel relationship matrix
  • K ⁇ represents the matrix obtained by combining the kernel matrix according to the ⁇ coefficient, H T represents the transposition of the division matrix;
  • I k represents the k-order identity matrix;
  • step S14 the established objective function is solved in a cyclic manner, and a division matrix after view fusion is obtained. Specifically:
  • a (i) S (i) S (i)T ;
  • a (i) represents intermediate variable,
  • a (i) S (i) S (i)T ;
  • In represents n -order identity matrix;
  • B represents the intermediate variable
  • Matrix J obtains the solution to the problem by removing the negative eigenvalues in matrix B;
  • ⁇ T represents the transpose of ⁇
  • M pq represents the relationship between the kernel matrices p and q;
  • K p , K q , and ⁇ all represent intermediate variables;
  • ⁇ p represents the value of position p in the ⁇ vector .
  • the prior art algorithm can be solved by the Lagrange multiplier method, and the algorithm is directly called in Matlab to solve.
  • steps S141, S142, and S143 need to be alternately performed by an alternate method until convergence, where the termination condition (ie, the convergence condition) is expressed as:
  • obj t+1 and obj t represent the value of the objective function of the t+1 and t-th iterations, respectively; ⁇ represents the set precision.
  • step S15 k-means clustering is performed on the obtained partition matrix to obtain a clustering result.
  • the adaptive local kernel matrix of this embodiment has different clustering performance obtained by the adopted method.
  • This embodiment can obtain better performance (represented by the clustering accuracy ACC), as shown in Table 1 below:
  • the purpose of this embodiment is to provide a local kernel-based optimal neighbor multi-kernel clustering method.
  • the method constructs an adaptive local kernel matrix and finds an optimal neighbor kernel around the linear combination of multiple pre-defined kernels, and uses this neighbor kernel for clustering.
  • these three processes are placed in the same objective formula for rotation optimization, and the final clustering result is obtained when the change of loss tends to be stable.
  • this embodiment also provides a local kernel-based optimal neighbor multi-kernel clustering system, including:
  • the acquisition module is used to acquire clustering tasks and target data samples
  • the calculation module is used to calculate the kernel matrix of each view corresponding to the target data sample, and to centralize and normalize the kernel matrix to obtain the processed kernel matrix;
  • the solving module is used to solve the established objective function in a cyclic manner, and obtain the division matrix after view fusion;
  • the clustering module is used to perform k-means clustering on the obtained partition matrix to obtain the clustering result.
  • calculating the kernel matrix of each view corresponding to the target data sample in the calculation module is specifically: for the target data sample The p-th view in the kernel function is mapped to obtain the kernel matrix of the p-th view, which is expressed as:
  • represents the average distance between all target data samples
  • e represents a natural constant
  • K p (i,j) represents the value of the ith row and jth column in the kernel matrix of the pth view
  • m represents the number of views.
  • an optimal neighbor multi-kernel clustering objective function based on local kernels is established, which is expressed as:
  • H represents the division matrix
  • represents the combination coefficient
  • J represents the optimal neighbor kernel
  • n represents the number of all samples
  • represents the adaptive kernel similarity threshold
  • M represents the kernel relationship matrix
  • K ⁇ represents the matrix obtained by combining the kernel matrix according to the ⁇ coefficient, H T represents the transposition of the division matrix;
  • I k represents the k-order identity matrix;
  • solving module is specifically:
  • the first fixing module is used to fix J and ⁇ and optimize H;
  • a (i) S (i) S (i)T ;
  • a (i) represents intermediate variable,
  • a (i) S (i) S (i)T ;
  • In represents n -order identity matrix;
  • the second fixed module is used to fix H and ⁇ and optimize J;
  • B represents the intermediate variable
  • Matrix J obtains the solution to the problem by removing the negative eigenvalues in matrix B;
  • the third fixed module is used to fix H and J and optimize ⁇ ;
  • ⁇ T represents the transpose of ⁇
  • M pq represents the relationship between the kernel matrices p and q;
  • K p , K q , and ⁇ all represent intermediate variables;
  • ⁇ p represents the value of position p in the ⁇ vector .
  • termination conditions in the first fixed module, the second fixed module and the third fixed module are expressed as:
  • obj t+1 and obj t represent the value of the objective function of the t+1 and t-th iterations, respectively; ⁇ represents the set precision.
  • this application proposes a novel local kernel-based optimal neighbor multi-kernel clustering system, which includes constructing an adaptive local kernel matrix, finding and constructing an optimal neighbor kernel matrix, and fusing the adaptive local kernel matrix.
  • a local kernel-based optimal neighbor multi-kernel clustering method provided in this embodiment differs from Embodiment 1 in that:
  • the main content of this embodiment includes the design of an adaptive local kernel, which is based on the pre-specified local density of a single data sample and the over-restriction of the representation ability of the optimal kernel learned in the current multi-kernel clustering algorithm.
  • the optimal kernel is located in the neighborhood of the linear combination of the kernels; the two techniques are used in a single multi-kernel clustering framework; the generalization range of the optimal neighborhood multi-kernel clustering algorithm based on adaptive local kernel is studied.
  • the above-mentioned adaptive local kernel is a sub-matrix of the kernel function, and its main function is to reflect the relationship between the sample and its neighborhood.
  • the ith adaptive local kernel of matrix K can be expressed as:
  • H (i) S (i) T H, is the identity matrix of size ⁇ (i) and ⁇ (i) varies with the density around the sample.
  • the adaptive local kernel proposed in this embodiment is derived from [M.Li, X.Liu, W.Lei, D.Yong, J.Yin, and E.Zhu, "Multiple kernel clustering with local kernel alignment maximization," in International The local kernel in Joint conference on Artificial Intelligence, 2016] is extended, and the size of the local kernel is directly set as a constant. However, this does not guarantee that all pairs of samples are in a local kernel of high similarity. On the contrary, this embodiment constructs the ith adaptive local kernel by selecting samples whose similarity to sample i is higher than the threshold ⁇ . Figure 2 comprehensively compares these two types of local kernels.
  • the proposed adaptive local kernel is usually faster than [M.Li, X.Liu, W.Lei, D.Yong , J.Yin, and E.Zhu, "Multiple kernel clustering with local kernel alignment maximization," in International Joint conference on Artificial Intelligence, 2016]], the local kernel is small, thus ensuring that all neighbors have relatively high similarity , and reduce the unreliability caused by further aligning sample pairs.
  • a is the original kernel matrix
  • b.1 and b.2 are in [M.Li, X.Liu, W.Lei, D.Yong, J.Yin, and E.Zhu, "Multiple kernel clustering with local kernel alignment maximization ,” in International Joint conference on Artificial Intelligence, 2016], the local kernels corresponding to 1/3 of the samples were generated. Its size ⁇ is fixed to 3; c.1 and c.2 are adaptive local kernels corresponding to 1/3 samples. The similarity to its neighbors is higher than ⁇ .
  • K (i) S (i)T KS (i)
  • J (i) S (i)T JS (i)
  • n is the number of all samples, is an identity matrix of size ⁇ (i) .
  • the optimal kernel J is used to connect the clustering process with the knowledge acquisition process. In this case, it utilizes the complementary information in the pre-specified kernels to help the clustering process and the information from the clusters to help the weight assignment of the pre-specified kernels as feedback.
  • a local kernel-based optimal neighbor multi-kernel clustering method provided in this embodiment differs from Embodiment 1 in that:
  • This embodiment is compared with existing methods on multiple data sets to verify the effectiveness of the present application.
  • Flower102 This dataset contains 8189 samples, uniformly distributed in 102 categories, with 4 kernel matrices.
  • This dataset includes 2000 samples, uniformly distributed in 10 classes, with 3 kernel matrices.
  • Caltech101 This dataset contains 1530 samples, uniformly distributed in 102 categories, with 25 kernel matrices.
  • Protein Fold This dataset includes 694 samples, uniformly distributed in 27 categories, with 12 kernel matrices.
  • the method has two hyperparameters, ⁇ and ⁇ .
  • represents the relative importance of the construction of the adaptive local kernel matrix and the optimal neighbor kernel matrix.
  • represents the similarity threshold between neighbor samples.
  • a grid search technique was used to select these two parameters, where ⁇ varied from 2 ⁇ 15 to 2 15 and ⁇ varied from ⁇ 0.5 to 0.5.
  • the clustering algorithm common accuracy rate (ACC) evaluation index was used for evaluation.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本申请公开了一种基于局部核的最优邻居多核聚类方法及系统,其中涉及的一种基于局部核的最优邻居多核聚类方法,包括:S11.获取聚类任务和目标数据样本;S12.计算与目标数据样本相对应的各个视图的核矩阵,并对核矩阵进行中心化和归一化处理,得到处理后的核矩阵;S13.根据得到的处理后的核矩阵,建立基于局部核的最优邻居多核聚类目标函数;S14.采用循环方式求解建立的目标函数,得到视图融合后的划分矩阵;S15.对得到的划分矩阵进行k均值聚类,得到聚类结果。

Description

一种基于局部核的最优邻居多核聚类方法及系统 技术领域
本申请涉及数据分析技术领域,尤其涉及一种基于局部核的最优邻居多核聚类方法及系统。
背景技术
核聚类方法已经广泛地应用于机器学习和数据挖掘领域。它隐式地将原始的不可分离数据映射到高维希尔伯特空间。在此空间中,对应的样本点具有明显的决策边界。然后,采用经典的聚类算法将未标记的数据划分为簇,包括k均值聚类,模糊c均值聚类,谱聚类和高斯混合模型(GMM)等。尽管核聚类方法在大量实际应用中都取得了巨大的成功,但它们只能使用单个核来处理数据。同时,核函数可分为不同类型,例如多项式核函数,高斯核函数,线性核函数等,并且需要手动进行参数选择。由于聚类任务没有标签信息,如何选择正确的核函数及其参数仍然是一个悬而未决的问题。与此同时,在实际的应用中,样本的特征还可能收集于不同的数据源。例如,人物画像可以通过人物的外貌特征、社交网络、处事习惯等多个方面进行描述。最常见的方法是将所有特征拼接到一个向量中,但是此种方法忽略了不同类型特征的不可比性。
多核聚类(MKC)算法通过融合不同核矩阵的互补信息来解决上述问题,其大致可以分为三类。第一类中的方法是通过使用低秩优化方法来构造聚类一致核。例如,首先从多核的转移概率矩阵中恢复一个共享的低秩矩阵,然后将其作为标准马尔可夫链方法的输入进行聚类。第二类技术使用从每个核生成的划分矩阵来计算它们的聚类结果。首先对每个不完全视图进行核k均值聚类得到多个划分矩阵,然后融合不同划分矩阵之间的互补信息,得到最终解。第三类算法在聚类过程中建立了一致核。大多数算法的基本假设是,最优核可以表示为预先指定的核的加权组合。除此之外,还提出了各种正则化方法来约束核权重。
核对齐是多核k-均值算法中一种有效的正则化方法,核对齐迫使所有样本对以相同的理想相似性相等地对齐。然而,这与公认的概念相冲突,即,在高维空间中以低相似性对齐两个更远的样本不可靠。局部核技巧可以解决这一 问题。它能更好地捕捉数据中样本的固有特征,利用每个样本的邻域构造局部核,并最大化它们与理想相似矩阵的对齐和。此外,局部核能够帮助聚类算法更好地利用更接近的样本对提供的信息。
上述MKC算法存在两个问题:没有充分考虑单个数据样本周围的局部密度和过度限制学习最优核的表示能力。具体来说,局部核将每个样本的邻域数设置为一个全局常数,这不能保证局部核中的所有样本对相互靠近。众所周知,与较远的样本对进行比对不太可靠。因此,由于忽略了单个数据样本周围的局部密度,这种局部核不能将不可靠性降到最低。同时,大多数多核聚类算法假设最优核是预先指定的核的加权组合,而忽略了一些更健壮的核。
发明内容
本申请的目的是针对现有技术的缺陷,提供了一种基于局部核的最优邻居多核聚类方法及系统。
为了实现以上目的,本申请采用以下技术方案:
一种基于局部核的最优邻居多核聚类方法,包括:
S1.获取聚类任务和目标数据样本;
S2.计算与目标数据样本相对应的各个视图的核矩阵,并对核矩阵进行中心化和归一化处理,得到处理后的核矩阵;
S3.根据得到的处理后的核矩阵,建立基于局部核的最优邻居多核聚类目标函数;
S4.采用循环方式求解建立的目标函数,得到视图融合后的划分矩阵;
S5.对得到的划分矩阵进行k均值聚类,得到聚类结果。
进一步的,所述步骤S2计算与目标数据样本相对应的各个视图的核矩阵具体为:对目标数据样本
Figure PCTCN2022082643-appb-000001
中的第p个视图进行核函数映射,得到第p个视图的核矩阵,表示为:
Figure PCTCN2022082643-appb-000002
其中,
Figure PCTCN2022082643-appb-000003
Figure PCTCN2022082643-appb-000004
表示第i,j个样本,σ表示所有目标数据样本之间距离的平均值;e表示自然常数;K p(i,j)表示第p个视图的核矩阵中第i行j列的值; m表示视图的个数。
进一步的,所述步骤S3中建立基于局部核的最优邻居多核聚类目标函数,表示为:
Figure PCTCN2022082643-appb-000005
s.t.H∈R n×k,H TH=I kT1 m=1,β p≥0,
Figure PCTCN2022082643-appb-000006
其中,H表示划分矩阵;β表示组合系数;J表示最优邻居核;n表示所有样本数;μ表示自适应核相似度阈值;M表示核关系矩阵;H (i)表示第i个样本对应的划分矩阵;H (i)T表示H (i)的转置;β T表示组合系数向量;M (i)表示第i个样本的最优邻居核矩阵的关系矩阵;ρ表示超参数,需要提前设定;K β表示核矩阵按照β系数组合后得到的矩阵,
Figure PCTCN2022082643-appb-000007
H T表示划分矩阵的转置;I k表示k阶单位矩阵;β p表示β向量位置p的值;
Figure PCTCN2022082643-appb-000008
表示对于所有p;J (i)=S (i)TJS (i)
Figure PCTCN2022082643-appb-000009
表示第i个样本的μ (i)最近邻域,S (i)T表示S (i)的转置;
Figure PCTCN2022082643-appb-000010
表示μ (i)的单位矩阵。
进一步的,所述步骤S4具体为:
S41.固定J和β,优化H;
将目标函数转化为:
Figure PCTCN2022082643-appb-000011
s.t.    H∈R n×k,H TH=I k
其中,A (i)=S (i)S (i)T;A (i)表示中间变量,A (i)=S (i)S (i)T;I n表示n阶单位矩阵;
通过对
Figure PCTCN2022082643-appb-000012
进行特征值分解得到问题的解;
S42.固定H和β,优化J;
将目标函数转化为:
Figure PCTCN2022082643-appb-000013
Figure PCTCN2022082643-appb-000014
其中,B表示中间变量;
Figure PCTCN2022082643-appb-000015
矩阵J通过将矩阵B中的负数特征值去除得到问题的解;
S43.固定H和J,优化β;
将目标函数转化为:
Figure PCTCN2022082643-appb-000016
Figure PCTCN2022082643-appb-000017
α=[α 1,…,α m],α p=-ρTr(JK p)
其中,α T表示α的转置;
Figure PCTCN2022082643-appb-000018
表示第i的样本的局部核矩阵p和q之间的关系;
Figure PCTCN2022082643-appb-000019
表示第p个核矩阵中第i个样本的局部核;
Figure PCTCN2022082643-appb-000020
表示第q个核矩阵中第i个样本的局部核;M pq表示核矩阵p和q之间的关系;K p、K q、α均表示中间变量;α p表示α向量中位置p的值。
进一步的,所述步骤S41、S42、S43中的终止条件表示为:
(obj t+1-boj t)/obj t≤ε
其中,obj t+1和obj t分别表示第t+1和第t轮迭代的目标函数的值;ε表示设定精度。
相应的,还提供一种基于局部核的最优邻居多核聚类系统,包括:
获取模块,用于获取聚类任务和目标数据样本;
计算模块,用于计算与目标数据样本相对应的各个视图的核矩阵,并对核矩阵进行中心化和归一化处理,得到处理后的核矩阵;
建立模块,用于根据得到的处理后的核矩阵,建立基于局部核的最优邻居多核聚类目标函数;
求解模块,用于采用循环方式求解建立的目标函数,得到视图融合后的划分矩阵;
聚类模块,用于对得到的划分矩阵进行k均值聚类,得到聚类结果。
进一步的,所述计算模块中计算与目标数据样本相对应的各个视图的核矩阵具体为:对目标数据样本
Figure PCTCN2022082643-appb-000021
中的第p个视图进行核函数映射,得到第p个视图的核矩阵,表示为:
Figure PCTCN2022082643-appb-000022
其中,
Figure PCTCN2022082643-appb-000023
Figure PCTCN2022082643-appb-000024
表示第i,j个样本,σ表示所有目标数据样本之间距离的平均值;e表示自然常数;K p(i,j)表示第p个视图的核矩阵中第i行j列的值; m表示视图的个数。
进一步的,所述建立模块中建立基于局部核的最优邻居多核聚类目标函数,表示为:
Figure PCTCN2022082643-appb-000025
s.t.H∈R n×k,H TH=I kT1 m=1,β p≥0,
Figure PCTCN2022082643-appb-000026
其中,H表示划分矩阵;β表示组合系数;J表示最优邻居核;n表示所有样本数;μ表示自适应核相似度阈值;M表示核关系矩阵;H (i)表示第i个样本对应的划分矩阵;H (i)T表示H (i)的转置;β T表示组合系数向量;M (i)表示第i个样本的最优邻居核矩阵的关系矩阵;ρ表示超参数,需要提前设定;K β表示核矩阵按照β系数组合后得到的矩阵,
Figure PCTCN2022082643-appb-000027
H T表示划分矩阵的转置;I k表示k阶单位矩阵;β p表示β向量位置p的值;
Figure PCTCN2022082643-appb-000028
表示对于所有p;J (i)=S (i)TJS (i)
Figure PCTCN2022082643-appb-000029
表示第i个样本的μ (i)最近邻域,S (i)T表示S (i)的转置;
Figure PCTCN2022082643-appb-000030
表示μ (i)的单位矩阵。
进一步的,所述求解模块具体为:
第一固定模块,用于固定J和β,优化H;
将目标函数转化为:
Figure PCTCN2022082643-appb-000031
s.t.    H∈R n×k,H TH=I k
其中,A (i)=S (i)S (i)T;A (i)表示中间变量,A (i)=S (i)S (i)T;I n表示n阶单位矩阵;
通过对
Figure PCTCN2022082643-appb-000032
进行特征值分解得到问题的解;
第二固定模块,用于固定H和β,优化J;
将目标函数转化为:
Figure PCTCN2022082643-appb-000033
Figure PCTCN2022082643-appb-000034
其中,B表示中间变量;
Figure PCTCN2022082643-appb-000035
矩阵J通过将矩阵B中的负数特征值去除得到问题的解;
第三固定模块,用于固定H和J,优化β;
将目标函数转化为:
Figure PCTCN2022082643-appb-000036
Figure PCTCN2022082643-appb-000037
α=[α 1,…,α m],α p=-ρTr(JK p)
其中,α T表示α的转置;
Figure PCTCN2022082643-appb-000038
表示第i的样本的局部核矩阵p和q之间的关系;
Figure PCTCN2022082643-appb-000039
表示第p个核矩阵中第i个样本的局部核;
Figure PCTCN2022082643-appb-000040
表示第q个核矩阵中第i个样本的局部核;M pq表示核矩阵p和q之间的关系;K p、K q、α均表示中间变量;α p表示α向量中位置p的值。
进一步的,所述第一固定模块、第二固定模块、第三固定模块中的终止条件表示为:
(obj t+1-obj t)/obj t≤ε
其中,obj t+1和obj t分别表示第t+1和第t轮迭代的目标函数的值;ε表示设定精度。
与现有技术相比,本申请提出了一种新颖的基于局部核的最优邻居多核聚类方法及系统,其包括构建自适应局部核矩阵、寻找构建最优邻居核矩阵和融合自适应局部核矩阵的构建、最优邻居核矩阵的寻找及聚类三个部分,并将之融合在同一个目标式中求解。该方法大幅提高了多核聚类算法的性能,且在四个公共数据集上的实验结果证明了本申请的性能优于现有算法。
附图说明
图1是实施例一提供的一种基于局部核的最优邻居多核聚类方法流程图;
图2是实施例一、二提供的局部核比较示意图。
具体实施方式
以下通过特定的具体实例说明本申请的实施方式,本领域技术人员可由本说明书所揭露的内容轻易地了解本申请的其他优点与功效。本申请还可以通过另外不同的具体实施方式加以实施或应用,本说明书中的各项细节也可以基于不同观点与应用,在没有背离本申请的精神下进行各种修饰或改变。需说明的是,在不冲突的情况下,以下实施例及实施例中的特征可以相互组合。
本申请针对现有缺陷,提供了一种基于局部核的最优邻居多核聚类方法及系统。
实施例一
本实施例提供的一种基于局部核的最优邻居多核聚类方法,如图1所示,包括:
S11.获取聚类任务和目标数据样本;
S12.计算与目标数据样本相对应的各个视图的核矩阵,并对核矩阵进行中心化和归一化处理,得到处理后的核矩阵;
S13.根据得到的处理后的核矩阵,建立基于局部核的最优邻居多核聚类目标函数;
S14.采用循环方式求解建立的目标函数,得到视图融合后的划分矩阵;
S15.对得到的划分矩阵进行k均值聚类,得到聚类结果。
本实施例提出的一种通过基于局部核的最优邻居多核聚类的新方法,相比现有方法,其包括构建自适应局部核矩阵、寻找构建最优邻居核矩阵和融合自适应局部核矩阵的构建、最优邻居核矩阵的寻找及聚类三个部分,并将之融合在同一个目标式中求解,大幅提高了聚类性能。
在步骤S12中,计算与目标数据样本相对应的各个视图的核矩阵,并对核矩阵进行中心化和归一化处理,得到处理后的核矩阵。
对目标数据样本
Figure PCTCN2022082643-appb-000041
中的第v个视图进行核函数映射(常用的核函数有高斯核函数,线性核函数等),本实施例以高斯核函数为例,最终得到第v个视图的核矩阵,表示为:
Figure PCTCN2022082643-appb-000042
其中,
Figure PCTCN2022082643-appb-000043
Figure PCTCN2022082643-appb-000044
表示第i,j个样本,σ表示所有目标数据样本之间距离的平均值;e表示自然常数;K p(i,j)表示第p个视图的核矩阵中第i行j列的值;m表示视图的个数。
由此,可得到v个视图的核矩阵
Figure PCTCN2022082643-appb-000045
接着,将各个核矩阵进行中心化和归一化,即平均值为0,方差为1。
在步骤S13中,根据得到的处理后的核矩阵,建立基于局部核的最优邻居 多核聚类目标函数。
在本实施例中,采用基于自适应的自适应局部核矩阵。自适应局部核矩阵构建如下:对于核矩阵J,第i个样本对应的局部核矩阵由与该样本相似度大于ζ的样本组成,可形式化表达为
Figure PCTCN2022082643-appb-000046
表示第i个样本的μ (i)最近邻域,S (i)T表示S (i)的转置。图2是其可视化。图2(a)为核矩阵J,样本间相似度越大,灰度值越高,依次标记为1、0.75、0.5和0.25。当将ζ设置为0.75,可得到子图2(b.1)、图2(c.2)。其中,图2(b.1)为第1个样本对应的自适应局部核,图2(c.2)为第3个样本对应的自适应局部核。
本实施例建立基于局部核的最优邻居多核聚类目标函数,表示为:
Figure PCTCN2022082643-appb-000047
s.t.H∈R n×k,H TH=I kT1 m=1,β p≥0,
Figure PCTCN2022082643-appb-000048
其中,H表示划分矩阵;β表示组合系数;J表示最优邻居核;n表示所有样本数;μ表示自适应核相似度阈值;M表示核关系矩阵;H (i)表示第i个样本对应的划分矩阵;H (i)T表示H (i)的转置;β T表示组合系数向量;M (i)表示第i个样本的最优邻居核矩阵的关系矩阵;ρ表示超参数,需要提前设定;K β表示核矩阵按照β系数组合后得到的矩阵,
Figure PCTCN2022082643-appb-000049
H T表示划分矩阵的转置;I k表示k阶单位矩阵;β p表示β向量位置p的值;
Figure PCTCN2022082643-appb-000050
表示对于所有p;J (i)=S (i)TJS (i)
Figure PCTCN2022082643-appb-000051
表示第i个样本的μ (i)最近邻域,S (i)T表示S (i)的转置;
Figure PCTCN2022082643-appb-000052
表示μ (i)的单位矩阵。
在步骤S14中,采用循环方式求解建立的目标函数,得到视图融合后的划分矩阵。具体为:
S141.固定J和β,优化H;
将目标函数转化为:
Figure PCTCN2022082643-appb-000053
s.t.    H∈R n×k,H TH=I k
其中,A (i)=S (i)S (i)T;A (i)表示中间变量,A (i)=S (i)S (i)T;I n表示n阶单位矩阵;
通过对
Figure PCTCN2022082643-appb-000054
进行特征值分解得到问题的解;
S142.固定H和β,优化J;
将目标函数转化为:
Figure PCTCN2022082643-appb-000055
Figure PCTCN2022082643-appb-000056
其中,B表示中间变量;
Figure PCTCN2022082643-appb-000057
矩阵J通过将矩阵B中的负数特征值去除得到问题的解;
S143.固定H和J,优化β;
将目标函数转化为:
Figure PCTCN2022082643-appb-000058
Figure PCTCN2022082643-appb-000059
α=[α 1,…,α m],α p=-ρTr(JK p)
其中,α T表示α的转置;
Figure PCTCN2022082643-appb-000060
表示第i的样本的局部核矩阵p和q之间的关系;
Figure PCTCN2022082643-appb-000061
表示第p个核矩阵中第i个样本的局部核;
Figure PCTCN2022082643-appb-000062
表示第q个核矩阵中第i个样本的局部核;M pq表示核矩阵p和q之间的关系;K p、K q、α均表示中间变量;α p表示α向量中位置p的值。
这是一个标准的QP问题,可以通过现有算法包进行求解。
现有技术算法可用拉格朗日乘子法进行求解,Matlab中直接调用算法求解。
在本实施例中,需要对步骤S141、S142、S143通过交替法进行交替执行,直至收敛,其中终止条件(即收敛条件)表示为:
(obj t+1-obj t)/obj t≤ε
其中,obj t+1和obj t分别表示第t+1和第t轮迭代的目标函数的值;ε表示设定精度。
在步骤S15中,对得到的划分矩阵进行k均值聚类,得到聚类结果。
对得到的划分矩阵进行k均值聚类,得到聚类结果,即对矩阵H进行标准的k均值聚类即可得到最终的聚类结果。
采用本实施例的自适应的自适应局部核矩阵与采用整个核矩阵ONKC相比,所采用的的方法取得的聚类性能不同。本实施例能得到更好的性能(使用聚类精度ACC来表示),如下表1:
ONKC 本方法
41.56 45.44
91.00 96.30
35.91 38.04
39.19 40.63
表1
传统的多核聚类算法没有充分的考虑样本间的局部密度,且严重限制了用于最终聚类的最优核的取值范围,导致所取得的性能不高。本实施例的目的在于提供一种基于局部核的最优邻居多核聚类方法。该方法通过构建自适应的局部核矩阵,并在多个预先定义核的线性组合周围寻找一个最优的邻居核,且利用此邻居核进行聚类。同时,这三个过程被放在同一个目标式中进行轮替优化,当损失的变化趋于稳定时,得到最终的聚类结果。
相应的,本实施例还提供一种基于局部核的最优邻居多核聚类系统,包括:
获取模块,用于获取聚类任务和目标数据样本;
计算模块,用于计算与目标数据样本相对应的各个视图的核矩阵,并对核矩阵进行中心化和归一化处理,得到处理后的核矩阵;
建立模块,用于根据得到的处理后的核矩阵,建立基于局部核的最优邻居多核聚类目标函数;
求解模块,用于采用循环方式求解建立的目标函数,得到视图融合后的划分矩阵;
聚类模块,用于对得到的划分矩阵进行k均值聚类,得到聚类结果。
进一步的,所述计算模块中计算与目标数据样本相对应的各个视图的核矩阵具体为:对目标数据样本
Figure PCTCN2022082643-appb-000063
中的第p个视图进行核函数映射,得到第p个视图的核矩阵,表示为:
Figure PCTCN2022082643-appb-000064
其中,
Figure PCTCN2022082643-appb-000065
Figure PCTCN2022082643-appb-000066
表示第i,j个样本,σ表示所有目标数据样本之间距离的平均值;e表示自然常数;K p(i,j)表示第p个视图的核矩阵中第i行j列的值;m表示视图的个数。
进一步的,所述建立模块中建立基于局部核的最优邻居多核聚类目标函数, 表示为:
Figure PCTCN2022082643-appb-000067
s.t.H∈R n×k,H TH=I kT1 m=1,β p≥0,
Figure PCTCN2022082643-appb-000068
其中,H表示划分矩阵;β表示组合系数;J表示最优邻居核;n表示所有样本数;μ表示自适应核相似度阈值;M表示核关系矩阵;H (i)表示第i个样本对应的划分矩阵;H (i)T表示H (i)的转置;β T表示组合系数向量;M (i)表示第i个样本的最优邻居核矩阵的关系矩阵;ρ表示超参数,需要提前设定;K β表示核矩阵按照β系数组合后得到的矩阵,
Figure PCTCN2022082643-appb-000069
H T表示划分矩阵的转置;I k表示k阶单位矩阵;β p表示β向量位置p的值;
Figure PCTCN2022082643-appb-000070
表示对于所有p;J (i)=S (i)TJS (i)
Figure PCTCN2022082643-appb-000071
表示第i个样本的μ (i)最近邻域,S (i)T表示S (i)的转置;
Figure PCTCN2022082643-appb-000072
表示μ (i)的单位矩阵。
进一步的,所述求解模块具体为:
第一固定模块,用于固定J和β,优化H;
将目标函数转化为:
Figure PCTCN2022082643-appb-000073
s.t.    H∈R n×k,H TH=I k
其中,A (i)=S (i)S (i)T;A (i)表示中间变量,A (i)=S (i)S (i)T;I n表示n阶单位矩阵;
通过对
Figure PCTCN2022082643-appb-000074
进行特征值分解得到问题的解;
第二固定模块,用于固定H和β,优化J;
将目标函数转化为:
Figure PCTCN2022082643-appb-000075
Figure PCTCN2022082643-appb-000076
其中,B表示中间变量;
Figure PCTCN2022082643-appb-000077
矩阵J通过将矩阵B中的负数特征值去除得到问题的解;
第三固定模块,用于固定H和J,优化β;
将目标函数转化为:
Figure PCTCN2022082643-appb-000078
Figure PCTCN2022082643-appb-000079
α=[α 1,…,α m],α p=-ρTr(JK p)
其中,α T表示α的转置;
Figure PCTCN2022082643-appb-000080
表示第i的样本的局部核矩阵p和q之间的关系;
Figure PCTCN2022082643-appb-000081
表示第p个核矩阵中第i个样本的局部核;
Figure PCTCN2022082643-appb-000082
表示第q个核矩阵中第i个样本的局部核;M pq表示核矩阵p和q之间的关系;K p、K q、α均表示中间变量;α p表示α向量中位置p的值。
进一步的,所述第一固定模块、第二固定模块、第三固定模块中的终止条件表示为:
(obj t+1-obj t)/obj t≤ε
其中,obj t+1和obj t分别表示第t+1和第t轮迭代的目标函数的值;ε表示设定精度。
与现有技术相比,本申请提出了一种新颖的基于局部核的最优邻居多核聚类系统,其包括构建自适应局部核矩阵、寻找构建最优邻居核矩阵和融合自适应局部核矩阵的构建、最优邻居核矩阵的寻找及聚类三个部分,并将之融合在同一个目标式中求解。该方法大幅提高了多核聚类算法的性能。
实施例二
本实施例提供的一种基于局部核的最优邻居多核聚类方法与实施例一的不同之处在于:
本实施例主要内容包括针对目前多核聚类算法中没有充分考虑单个数据样本的局部密度和过度限制学习到的最优核的表示能力这两个问题,设计了自适应局部核,并从预先指定的核的线性组合的邻域中定位出最优核;将这两种技术利用到单个多内核集群框架中;研究了基于自适应局部核的最优邻域多核聚类算法的推广范围。
上述的自适应局部核是核函数的一个子矩阵,主要功能是反映了样本与其邻域之间的关系。首先,定义阈值ζ,并且第i个样本的对应指标集合Ω (i)可以写为Ω (i)={j|K(i,j)≥ζ}然后,相应的指标矩阵
Figure PCTCN2022082643-appb-000083
Figure PCTCN2022082643-appb-000084
被定义为:
Figure PCTCN2022082643-appb-000085
矩阵K的第i个自适应局部核可以表示为:
Figure PCTCN2022082643-appb-000086
换言之,上述等式选择与第i个样本对应的核值大于ζ的μ (i)个相邻样本,并去掉其他样本。在多核k-均值中使用构造的局部核,并将矩阵诱导正则化λ的权重设置为1,可以重写为以下形式
Figure PCTCN2022082643-appb-000087
其中
Figure PCTCN2022082643-appb-000088
H (i)=S (i)TH,
Figure PCTCN2022082643-appb-000089
是大小为μ (i)且μ (i)的单位矩阵随样本周围的密度而变化。
本实施例提出的自适应局部核是从[M.Li,X.Liu,W.Lei,D.Yong,J.Yin,and E.Zhu,“Multiple kernel clustering with local kernel alignment maximization,”in International Joint conference on Artificial Intelligence,2016]中的局部核扩展而来的,直接将局部核的大小为常数。然而,这样不能保证所有样本对都在一个高相似度的局部核中。相反,本实施例通过选择与样本i的相似度高于阈值ζ的样本来构造第i个自适应局部核。图2全面比较了这两种类型的局部核。可以看出,[M.Li,X.Liu,W.Lei,D.Yong,J.Yin,and E.Zhu,“Multiple kernel clustering with local kernel alignment maximization,”in International Joint conference on Artificial Intelligence,2016]中生成的局部核具有相同的大小,而所提出的自适应局部核是由样本对的相似性所决定的。比较图2中的b.1、b.2和c.1、c.2,可以注意到,所提出的自适应局部核通常比[M.Li,X.Liu,W.Lei,D.Yong,J.Yin,and E.Zhu,“Multiple kernel clustering with local kernel alignment maximization,”in International Joint conference on Artificial Intelligence,2016]]中的局部核小,从而保证了所有邻居具有相对较高的相似性,并减少了进一步比对样本对带来的不可靠性。
如图2所示的局部核比较:方框的暗度表示样本对之间的相似程度。方框 越暗,对应的样本对越相似。a是原始核矩阵,b.1和b.2是在[M.Li,X.Liu,W.Lei,D.Yong,J.Yin,and E.Zhu,“Multiple kernel clustering with local kernel alignment maximization,”in International Joint conference on Artificial Intelligence,2016]中生成的对应于1/3样本的局部核。其大小μ固定为3;c.1和c.2是对应于1/3样本的自适应局部核。与它的邻居的相似性比ζ更高。
假设最优核(称为J)驻留在核组合的邻域中,表示为:
Figure PCTCN2022082643-appb-000090
这一假设在方程中得到了目标式,如下:
Figure PCTCN2022082643-appb-000091
Figure PCTCN2022082643-appb-000092
上述目标式中的目标由于对J的约束,J很难优化。观察到K β为聚类提供了先验知识,J更有可能在K β之间的差距较小的情况下达到最优值。本实施例没有显式地设置最大差距η,而是在聚类过程中学习实际差距,这就形成了最终目标式
Figure PCTCN2022082643-appb-000093
s.t.H∈R n×k,H TH=I kT1 m=1,β p≥0,
Figure PCTCN2022082643-appb-000094
其中,
Figure PCTCN2022082643-appb-000095
K (i)=S (i)TKS (i),J (i)=S (i)TJS (i)
Figure PCTCN2022082643-appb-000096
表示第i个样本的μ (i)最近邻域,其中n是所有样本数,
Figure PCTCN2022082643-appb-000097
是大小为μ (i)的单位矩阵。最优核J用作连接聚类过程与知识获取过程。在这种情况下,它利用预先指定的核中的互补信息来帮助聚类过程,并利用来自聚类的信息来帮助预先指定的核的权重分配作为反馈。
实施例三
本实施例提供的一种基于局部核的最优邻居多核聚类方法与实施例一的不同之处在于:
本实施例在多个数据集上与现有方法进行对比以验证本申请的有效性。
数据集:
Flower102:该数据集包含8189个样本,均匀分布在102个类别中,拥有 4个核矩阵。
Digital:该数据集包括2000个样本,均匀分布在10个类别中,拥有3个核矩阵。
Caltech101:该数据集包含1530个样本,均匀分布在102个类别中,拥有25个核矩阵。
Protein Fold:该数据集包括694个样本,均匀分布在27个类别中,拥有12个核矩阵。
以上数据集的统计信息如表2所示。
Figure PCTCN2022082643-appb-000098
表2
数据准备与参数设置:
在初始化阶段,按照[C.Cortes,M.Mohri,and A.Rostamizadeh,“Algorithms for learning kernels based on centered alignment,”Journal of Machine Learning Research,vol.13,no.2,pp.795–828,2012.]中所述的方法将核矩阵进行中心化。接着对其进行归一化,以便更好地将样本对之间的相似度值范围指定在-1到1之间。
该方法有两个超参数,分别为ρ和ξ。ρ代表自适应局部核矩阵的构建与最优邻居核矩阵两个过程的相对重要程度。ξ代表邻居样本之间的相似度阈值。采用网格搜索技术用于选择这两个参数,其中,ρ在2 -15到2 15之间变化,ξ在-0.5到0.5之间变化。
评价指标:
采用聚类算法通用准确率(ACC)评价指标进行评价。
本实施例在四个数据集上与文献中的三个多视图聚类方法进行对比,分别为RMKC[P.Zhou,L.Du,L.Shi,H.Wang,and Y.-D.Shen,“Recovery of corrupted multiple kernels for clustering,”in Twenty-Fourth  International Joint Conference on Artificial Intelligence,2015.]、RMKKM[L.Du,P.Zhou,L.Shi,H.Wang,M.Fan,W.Wang,and Y.-D.Shen,“Robust multiple kernel k-means using l21-norm,”in Twenty-Fourth International Joint Conference on Artificial Intelligence,2015.]和MKCMR[X.Liu,Y.Dou,J.Yin,L.Wang,and E.Zhu,“Multiple kernel k-means clustering with matrix-induced regularization,”in Thirtieth AAAI Conference on Artificial Intelligence,2016.]的对比结果如表3所示,本实施例的性能明显优于对比方法。
Figure PCTCN2022082643-appb-000099
表3
本实施例在四个公共数据集上的实验结果证明了本方法的性能优于现有算法。
注意,上述仅为本申请的较佳实施例及所运用技术原理。本领域技术人员会理解,本申请不限于这里所述的特定实施例,对本领域技术人员来说能够进行各种明显的变化、重新调整和替代而不会脱离本申请的保护范围。因此,虽然通过以上实施例对本申请进行了较为详细的说明,但是本申请不仅仅限于以上实施例,在不脱离本申请构思的情况下,还可以包括更多其他等效实施例,而本申请的范围由所附的权利要求范围决定。

Claims (10)

  1. 一种基于局部核的最优邻居多核聚类方法,其特征在于,包括:
    S1.获取聚类任务和目标数据样本;
    S2.计算与目标数据样本相对应的各个视图的核矩阵,并对核矩阵进行中心化和归一化处理,得到处理后的核矩阵;
    S3.根据得到的处理后的核矩阵,建立基于局部核的最优邻居多核聚类目标函数;
    S4.采用循环方式求解建立的目标函数,得到视图融合后的划分矩阵;
    S5.对得到的划分矩阵进行k均值聚类,得到聚类结果。
  2. 根据权利要求1所述的一种基于局部核的最优邻居多核聚类方法,其特征在于,所述步骤S2计算与目标数据样本相对应的各个视图的核矩阵具体为:对目标数据样本
    Figure PCTCN2022082643-appb-100001
    中的第p个视图进行核函数映射,得到第p个视图的核矩阵,表示为:
    Figure PCTCN2022082643-appb-100002
    其中,
    Figure PCTCN2022082643-appb-100003
    Figure PCTCN2022082643-appb-100004
    表示第i,j个样本,σ表示所有目标数据样本之间距离的平均值;e表示自然常数;K p(i,j)表示第p个视图的核矩阵中第i行j列的值;m表示视图的个数。
  3. 根据权利要求2所述的一种基于局部核的最优邻居多核聚类方法,其特征在于,所述步骤S3中建立基于局部核的最优邻居多核聚类目标函数,表示为:
    Figure PCTCN2022082643-appb-100005
    Figure PCTCN2022082643-appb-100006
    其中,H表示划分矩阵;β表示组合系数;J表示最优邻居核;n表示所有样本数;μ表示自适应核相似度阈值;M表示核关系矩阵;H (i)表示第i个样本对应的划分矩阵;H (i)T表示H (i)的转置;β T表示组合系数向量;M (i)表示第i个样本的最优邻居核矩阵的关系矩阵;ρ表示超参数,需要提前设定;K β表示核矩阵按照β系数组合后得到的矩阵,
    Figure PCTCN2022082643-appb-100007
    H T表示划分矩 阵的转置;I k表示k阶单位矩阵;β p表示β向量位置p的值;
    Figure PCTCN2022082643-appb-100008
    表示对于所有p;J (i)=S (i)TJS (i)
    Figure PCTCN2022082643-appb-100009
    表示第i个样本的μ (i)最近邻域,S (i)T表示S (i)的转置;
    Figure PCTCN2022082643-appb-100010
    表示μ (i)的单位矩阵。
  4. 根据权利要求3所述的一种基于局部核的最优邻居多核聚类方法,其特征在于,所述步骤S4具体为:
    S41.固定J和β,优化H;
    将目标函数转化为:
    Figure PCTCN2022082643-appb-100011
    s.t.H∈R n×k,H TH=I k
    其中,A (i)=S (i)S (i)T;A (i)表示中间变量,A (i)=S (i)S (i)T;I n表示n阶单位矩阵;
    通过对
    Figure PCTCN2022082643-appb-100012
    进行特征值分解得到问题的解;
    S42.固定H和β,优化J;
    将目标函数转化为:
    Figure PCTCN2022082643-appb-100013
    Figure PCTCN2022082643-appb-100014
    其中,B表示中间变量;
    Figure PCTCN2022082643-appb-100015
    矩阵J通过将矩阵B中的负数特征值去除得到问题的解;
    S43.固定H和J,优化β;
    将目标函数转化为:
    Figure PCTCN2022082643-appb-100016
    Figure PCTCN2022082643-appb-100017
    α=[α 1,…,α m],α p=-ρTr(JK p)
    其中,α T表示α的转置;
    Figure PCTCN2022082643-appb-100018
    表示第i的样本的局部核矩阵p和q之间的关系;
    Figure PCTCN2022082643-appb-100019
    表示第p个核矩阵中第i个样本的局部核;
    Figure PCTCN2022082643-appb-100020
    表示第q个核矩阵中第i个样本的局部核;M pq表示核矩阵p和q之间的关系;K p、K q、α均表示中间变量;α p表示α向量中位置p的值。
  5. 根据权利要求4所述的一种基于局部核的最优邻居多核聚类方法,其特征在于,所述步骤S41、S42、S43中的终止条件表示为:
    (obj t+1-obj t)/obj t≤ε
    其中,obj t+1和obj t分别表示第t+1和第t轮迭代的目标函数的值;ε表示设定精度。
  6. 一种基于局部核的最优邻居多核聚类系统,其特征在于,包括:
    获取模块,用于获取聚类任务和目标数据样本;
    计算模块,用于计算与目标数据样本相对应的各个视图的核矩阵,并对核矩阵进行中心化和归一化处理,得到处理后的核矩阵;
    建立模块,用于根据得到的处理后的核矩阵,建立基于局部核的最优邻居多核聚类目标函数;
    求解模块,用于采用循环方式求解建立的目标函数,得到视图融合后的划分矩阵;
    聚类模块,用于对得到的划分矩阵进行k均值聚类,得到聚类结果。
  7. 根据权利要求6所述的一种基于局部核的最优邻居多核聚类系统,其特征在于,所述计算模块中计算与目标数据样本相对应的各个视图的核矩阵具体为:对目标数据样本
    Figure PCTCN2022082643-appb-100021
    中的第p个视图进行核函数映射,得到第p个视图的核矩阵,表示为:
    Figure PCTCN2022082643-appb-100022
    其中,
    Figure PCTCN2022082643-appb-100023
    Figure PCTCN2022082643-appb-100024
    表示第i,j个样本,σ表示所有目标数据样本之间距离的平均值;e表示自然常数;K p(i,j)表示第p个视图的核矩阵中第i行j列的值;m表示视图的个数。
  8. 根据权利要求7所述的一种基于局部核的最优邻居多核聚类系统,其特征在于,所述建立模块中建立基于局部核的最优邻居多核聚类目标函数,表示为:
    Figure PCTCN2022082643-appb-100025
    Figure PCTCN2022082643-appb-100026
    其中,H表示划分矩阵;β表示组合系数;J表示最优邻居核;n表示所 有样本数;μ表示自适应核相似度阈值;M表示核关系矩阵;H (i)表示第i个样本对应的划分矩阵;H (i)T表示H (i)的转置;β T表示组合系数向量;M (i)表示第i个样本的最优邻居核矩阵的关系矩阵;ρ表示超参数,需要提前设定;K β表示核矩阵按照β系数组合后得到的矩阵,
    Figure PCTCN2022082643-appb-100027
    H T表示划分矩阵的转置;I k表示k阶单位矩阵;β p表示β向量位置p的值;
    Figure PCTCN2022082643-appb-100028
    表示对于所有p;J (i)=S (i)TJS (i)
    Figure PCTCN2022082643-appb-100029
    表示第i个样本的μ (i)最近邻域,S (i)T表示S (i)的转置;
    Figure PCTCN2022082643-appb-100030
    表示μ (i)的单位矩阵。
  9. 根据权利要求8所述的一种基于局部核的最优邻居多核聚类系统,其特征在于,所述求解模块具体为:
    第一固定模块,用于固定J和β,优化H;
    将目标函数转化为:
    Figure PCTCN2022082643-appb-100031
    s.t.H∈R n×k,H TH=I k
    其中,A (i)=S (i)S (i)T;A (i)表示中间变量,A (i)=S (i)S (i)T;I n表示n阶单位矩阵;
    通过对
    Figure PCTCN2022082643-appb-100032
    进行特征值分解得到问题的解;
    第二固定模块,用于固定H和β,优化J;
    将目标函数转化为:
    Figure PCTCN2022082643-appb-100033
    Figure PCTCN2022082643-appb-100034
    其中,B表示中间变量;
    Figure PCTCN2022082643-appb-100035
    矩阵J通过将矩阵B中的负数特征值去除得到问题的解;
    第三固定模块,用于固定H和J,优化β;
    将目标函数转化为:
    Figure PCTCN2022082643-appb-100036
    Figure PCTCN2022082643-appb-100037
    α=[α 1,…,α m],α p=-ρTr(JK p)
    其中,α T表示α的转置;
    Figure PCTCN2022082643-appb-100038
    表示第i的样本的局部核矩阵p和q之间的关系;
    Figure PCTCN2022082643-appb-100039
    表示第p个核矩阵中第i个样本的局部核;
    Figure PCTCN2022082643-appb-100040
    表示第q个核矩阵中第i个样本的局部核;M pq表示核矩阵p和q之间的关系;K p、K q、α均表示中间变量;α p表示α向量中位置p的值。
  10. 根据权利要求9所述的一种基于局部核的最优邻居多核聚类系统,其特征在于,所述第一固定模块、第二固定模块、第三固定模块中的终止条件表示为:
    (obj t+1-obj t)/obj t≤ε
    其中,obj t+1和obj t分别表示第t+1和第t轮迭代的目标函数的值;ε表示设定精度。
PCT/CN2022/082643 2021-04-25 2022-03-24 一种基于局部核的最优邻居多核聚类方法及系统 WO2022227956A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110447615.9A CN113269231A (zh) 2021-04-25 2021-04-25 一种基于局部核的最优邻居多核聚类方法及系统
CN202110447615.9 2021-04-25

Publications (1)

Publication Number Publication Date
WO2022227956A1 true WO2022227956A1 (zh) 2022-11-03

Family

ID=77229306

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/082643 WO2022227956A1 (zh) 2021-04-25 2022-03-24 一种基于局部核的最优邻居多核聚类方法及系统

Country Status (4)

Country Link
CN (1) CN113269231A (zh)
LU (1) LU503092B1 (zh)
WO (1) WO2022227956A1 (zh)
ZA (1) ZA202207734B (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117292162A (zh) * 2023-11-27 2023-12-26 烟台大学 一种多视图图像聚类的目标跟踪方法、系统、设备及介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269231A (zh) * 2021-04-25 2021-08-17 浙江师范大学 一种基于局部核的最优邻居多核聚类方法及系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145976A (zh) * 2018-08-14 2019-01-04 聚时科技(上海)有限公司 一种基于最优邻居核的多视图聚类机器学习方法
CN110188812A (zh) * 2019-05-24 2019-08-30 长沙理工大学 一种快速处理缺失异构数据的多核聚类方法
CN113269231A (zh) * 2021-04-25 2021-08-17 浙江师范大学 一种基于局部核的最优邻居多核聚类方法及系统

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145976A (zh) * 2018-08-14 2019-01-04 聚时科技(上海)有限公司 一种基于最优邻居核的多视图聚类机器学习方法
CN110188812A (zh) * 2019-05-24 2019-08-30 长沙理工大学 一种快速处理缺失异构数据的多核聚类方法
CN113269231A (zh) * 2021-04-25 2021-08-17 浙江师范大学 一种基于局部核的最优邻居多核聚类方法及系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU JIYUAN; LIU XINWANG; XIONG JIAN; LIAO QING; ZHOU SIHANG; WANG SIWEI; YANG YUEXIANG: "Optimal Neighborhood Multiple Kernel Clustering With Adaptive Local Kernels", IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, IEEE SERVICE CENTRE , LOS ALAMITOS , CA, US, vol. 34, no. 6, 4 August 2020 (2020-08-04), US , pages 2872 - 2885, XP011906946, ISSN: 1041-4347, DOI: 10.1109/TKDE.2020.3014104 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117292162A (zh) * 2023-11-27 2023-12-26 烟台大学 一种多视图图像聚类的目标跟踪方法、系统、设备及介质
CN117292162B (zh) * 2023-11-27 2024-03-08 烟台大学 一种多视图图像聚类的目标跟踪方法、系统、设备及介质

Also Published As

Publication number Publication date
LU503092B1 (fr) 2023-03-22
ZA202207734B (en) 2022-07-27
CN113269231A (zh) 2021-08-17

Similar Documents

Publication Publication Date Title
Wang et al. Late fusion multiple kernel clustering with proxy graph refinement
WO2022227956A1 (zh) 一种基于局部核的最优邻居多核聚类方法及系统
Mrabah et al. Deep clustering with a dynamic autoencoder: From reconstruction towards centroids construction
WO2022179241A1 (zh) 一种缺失条件下的高斯混合模型聚类机器学习方法
WO2022170840A1 (zh) 基于二部图的后期融合多视图聚类机器学习方法及系统
Wang et al. Feature selection and multi-kernel learning for adaptive graph regularized nonnegative matrix factorization
WO2021004361A1 (zh) 一种人脸美丽等级预测方法、装置及存储介质
Wu et al. Manifold kernel sparse representation of symmetric positive-definite matrices and its applications
Meng et al. Improving federated learning face recognition via privacy-agnostic clusters
Becker et al. Domain adaptation for microscopy imaging
Tai et al. Growing self-organizing map with cross insert for mixed-type data clustering
Tong et al. Federated nonconvex sparse learning
Zhou et al. Semantic adaptation network for unsupervised domain adaptation
Yang et al. Network topology inference from heterogeneous incomplete graph signals
Fernandes et al. Discriminative directional classifiers
Li et al. Adaptive weighted ensemble clustering via kernel learning and local information preservation
Li et al. Unsupervised domain adaptation via discriminative feature learning and classifier adaptation from center-based distances
CN112836629A (zh) 一种图像分类方法
Li et al. Unsupervised double weighted domain adaptation
Płoński et al. Visualizing random forest with self-organising map
Azad et al. Improved data classification using fuzzy euclidean hyperbox classifier
Huang et al. Kernelized convex hull approximation and its applications in data description tasks
Płoński et al. Improving performance of self-organising maps with distance metric learning method
Cho et al. Cooperative distribution alignment via jsd upper bound
Vahidipour et al. Comparing weighted combination of hierarchical clustering based on Cophenetic measure

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22794432

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22794432

Country of ref document: EP

Kind code of ref document: A1