CN105160357A

CN105160357A - Multimodal data subspace clustering method based on global consistency and local topology

Info

Publication number: CN105160357A
Application number: CN201510546959.XA
Authority: CN
Inventors: 赫然; 胡包钢; 樊艳波
Original assignee: Institute of Automation of Chinese Academy of Science
Current assignee: Institute of Automation of Chinese Academy of Science
Priority date: 2015-08-31
Filing date: 2015-08-31
Publication date: 2015-12-16

Abstract

The invention provides a multimodal data subspace clustering method based on global consistency and local topology. The method comprises: obtaining the Laplacian matrix corresponding to each modality's data; building a multimodal data subspace clustering model from the Laplacian matrices; obtaining, through the model, the self-expression matrix corresponding to each modality's data; selecting a first self-expression matrix from among the self-expression matrices of the modalities; and clustering with the first self-expression matrix to obtain a clustering result. The method can achieve better clustering performance and improved robustness.

Description

Multimodal data subspace clustering method based on global consistency and local topology

Technical field

The present invention relates to the field of computing, and in particular to a multimodal data subspace clustering method based on global consistency and local topology.

Background

With the development of science and technology and the growing reach of networks, collecting data has become ever easier and data volumes keep growing. Data are also becoming more diverse, and multimodal data of all kinds are increasingly common. Learning methods based on multimodal data are accordingly receiving more attention and study. Compared with single-modality data, multimodal data provide more, and more complex, information, so learning models built on multimodal data can often achieve better results and have better statistical properties.

In multimodal learning, multimodal data clustering has attracted wide attention and development because it can handle large-scale unlabelled data. Its aim is to use the features of the data under multiple modalities to group samples more accurately into their true categories. A key problem in multimodal data clustering is how best to establish and exploit the relationships between modalities, and much existing work addresses it. One line of work first learns a common feature representation from the features of the multiple modalities and then clusters on that common representation rather than on the original multimodal features, as in multimodal clustering based on non-negative matrix factorization. Another uses information from the different modalities to add constraint terms during model training. These methods can achieve good results. Subspace clustering based on spectral clustering has also been studied actively in recent years. Such methods typically assume that samples of the same class have similar self-expression coefficients and that samples close to one another in the space can linearly reconstruct each other. They first compute the self-expression matrix of the samples and then take it as the input to spectral clustering to produce the final clustering result. More robust results can be obtained by adding structural priors such as structured sparsity constraints and structured low-rank constraints.

However, although some current methods improve multimodal clustering performance to a certain extent, how to better mine and exploit information such as the correlations and differences between modalities remains a major challenge.

Summary of the invention

The multimodal data subspace clustering method based on global consistency and local topology provided by the invention can achieve better clustering performance and improved robustness.

According to one aspect of the invention, a multimodal data subspace clustering method based on global consistency and local topology is provided, comprising: obtaining the Laplacian matrix corresponding to each modality's data; building a multimodal data subspace clustering model from the Laplacian matrices; obtaining, through the multimodal data subspace clustering model, the self-expression matrix corresponding to each modality's data; selecting a first self-expression matrix from among the self-expression matrices corresponding to the modalities; and clustering with the first self-expression matrix to obtain a clustering result.

In the method provided by the embodiment of the invention, the Laplacian matrix corresponding to each modality's data is obtained, a multimodal data subspace clustering model is built from the Laplacian matrices, the self-expression matrix corresponding to each modality's data is obtained through the model, a first self-expression matrix is selected from among these self-expression matrices, and the first self-expression matrix is clustered by spectral clustering to obtain a clustering result. The method can thereby achieve better clustering performance and improved robustness.

Brief description of the drawings

Fig. 1 is a flowchart of the multimodal data subspace clustering method based on global consistency and local topology provided by an embodiment of the invention.

Detailed description of the embodiments

The multimodal data subspace clustering method based on global consistency and local topology provided by an embodiment of the invention is described in detail below with reference to the accompanying drawing.

Referring to Fig. 1, in step S101, the Laplacian matrix corresponding to each modality's data is obtained.

In step S102, a multimodal data subspace clustering model is built from the Laplacian matrices.

In step S103, the self-expression matrix corresponding to each modality's data is obtained through the multimodal data subspace clustering model.

In step S104, a first self-expression matrix is selected from among the self-expression matrices corresponding to the modalities.

Here, the first self-expression matrix is the optimal one among the self-expression matrices corresponding to the modalities. The optimal self-expression matrix may be chosen according to prior knowledge of the data, or determined by testing on a validation set.

In step S105, clustering is performed with the first self-expression matrix to obtain a clustering result.

Further, obtaining the Laplacian matrix corresponding to each modality's data comprises:

computing, with a Gaussian kernel function, the similarities for each modality's data;

obtaining the corresponding similarity matrix from the similarities;

obtaining the corresponding Laplacian matrix from the similarity matrix.

Here, to improve the efficiency of the algorithm, the similarity matrix may be constructed with, but is not limited to, the k-nearest-neighbour method. Specifically, for each modality a Gaussian kernel function is used to compute the similarity between each sample and its k nearest neighbours, giving a similarity matrix W_i, from which the Laplacian matrix L_i is obtained.
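The construction of W_i and L_i described above can be sketched as follows. This is a minimal illustration, not the implementation used in the patent: the function name `knn_gaussian_laplacian`, the bandwidth `sigma`, the symmetrisation of the k-nearest-neighbour graph and the choice of the unnormalised Laplacian D - W are all assumptions, since the text does not specify them.

```python
import numpy as np

def knn_gaussian_laplacian(X, k=5, sigma=1.0):
    """Return (W, L) for one modality; X has shape (d, n), one column per sample."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    # Pairwise squared Euclidean distances between samples (columns of X).
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    W = np.zeros((n, n))
    for a in range(n):
        # Indices of the k nearest neighbours of sample a, excluding a itself.
        nbrs = [b for b in np.argsort(D2[a]) if b != a][:k]
        W[a, nbrs] = np.exp(-D2[a, nbrs] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)            # symmetrise the k-NN graph (assumption)
    L = np.diag(W.sum(axis=1)) - W    # unnormalised Laplacian D - W (assumption)
    return W, L

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(4, 12))     # toy modality: 12 samples, 4 features
W_demo, L_demo = knn_gaussian_laplacian(X_demo, k=3)
```

Restricting each row to k neighbours keeps W_i sparse, which is the efficiency gain the paragraph above refers to.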

Further, obtaining the self-expression matrix corresponding to each modality's data through the multimodal data subspace clustering model comprises:

computing the self-expression matrix corresponding to each modality's data according to formula (1):

\langle Z \rangle = \arg\min_{Z} \sum_{i=1}^{m} \| X_{i} - X_{i} Z_{i} \|_{F}^{2} + \lambda \sum_{i=1}^{m} \sum_{j=1, j \neq i}^{m} \mathrm{tr}(Z_{i} L_{j} Z_{i}^{T}) + \beta \| Z \|_{*} + \rho \| Z \|_{F}^{2} \qquad (1)

where Z = [Z_1, Z_2, \ldots, Z_m]; X_i is the data of the i-th modality and Z_i is its self-expression matrix; \sum_{i=1}^{m} \| X_{i} - X_{i} Z_{i} \|_{F}^{2} is the reconstruction error of the modalities; L_j is the Laplacian matrix corresponding to the j-th modality; \sum_{i=1}^{m} \sum_{j=1, j \neq i}^{m} \mathrm{tr}(Z_{i} L_{j} Z_{i}^{T}) expresses the local topology invariance of the modalities; \| Z \|_{*} expresses the global consistency of the modalities; \| Z \|_{F}^{2} is the regularization term; and λ, β and ρ are weight parameters.
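To make the four terms of formula (1) concrete, the sketch below evaluates the objective for given matrices. It assumes that the local-topology term couples each Z_i with the Laplacians L_j of the other modalities, which is what the Z_i Σ L_j term of formula (5) implies. The function name `objective` and its argument layout are illustrative only.

```python
import numpy as np

def objective(Xs, Zs, Ls, lam, beta, rho):
    """Value of formula (1) for lists of per-modality X_i, Z_i and L_i."""
    m = len(Xs)
    # Reconstruction error: sum_i ||X_i - X_i Z_i||_F^2.
    recon = sum(np.linalg.norm(X - X @ Z, "fro") ** 2 for X, Z in zip(Xs, Zs))
    # Local topology term: sum_i sum_{j != i} tr(Z_i L_j Z_i^T) (assumed indexing).
    local = sum(np.trace(Zs[i] @ Ls[j] @ Zs[i].T)
                for i in range(m) for j in range(m) if j != i)
    Z = np.hstack(Zs)                 # Z = [Z_1, Z_2, ..., Z_m]
    return (recon + lam * local
            + beta * np.linalg.norm(Z, "nuc")        # nuclear norm ||Z||_*
            + rho * np.linalg.norm(Z, "fro") ** 2)   # regularization term
```

With Z_i equal to the identity the reconstruction error vanishes, so the remaining terms are what prevent this trivial solution from being optimal.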

Here, the nuclear norm \| Z \|_{*} is replaced by terms in an auxiliary matrix S, giving formula (2):

\begin{aligned} \langle Z \rangle = \arg\min_{Z} & \sum_{i=1}^{m} \| X_{i} - X_{i} Z_{i} \|_{F}^{2} + \lambda \sum_{i=1}^{m} \sum_{j=1, j \neq i}^{m} \mathrm{tr}(Z_{i} L_{j} Z_{i}^{T}) + \frac{\beta}{2} \sum_{i=1}^{m} \mathrm{tr}(Z_{i}^{T} S^{-1} Z_{i}) \\ & + \frac{\beta}{2} \mathrm{tr}(S) + \rho \| Z \|_{F}^{2} \end{aligned} \qquad (2)
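The replacement of the nuclear norm can be understood through a standard variational identity, stated here as background rather than as part of the patent text. It assumes ZZ^T is invertible; the μI term in formula (3) covers the singular case.

```latex
\|Z\|_{*} \;=\; \min_{S \succ 0}\; \tfrac{1}{2}\,\mathrm{tr}\!\left(Z^{T} S^{-1} Z\right) + \tfrac{1}{2}\,\mathrm{tr}(S),
\qquad \text{attained at } S = (Z Z^{T})^{1/2},
\\[4pt]
\mathrm{tr}\!\left(Z^{T} S^{-1} Z\right) \;=\; \sum_{i=1}^{m} \mathrm{tr}\!\left(Z_{i}^{T} S^{-1} Z_{i}\right)
\quad \text{for } Z = [Z_{1}, Z_{2}, \ldots, Z_{m}].
```

At S = (ZZ^T)^{1/2} both halves equal \tfrac{1}{2}\,\mathrm{tr}((ZZ^T)^{1/2}) = \tfrac{1}{2}\|Z\|_{*}, so the β terms reduce to β\|Z\|_{*} and the objective splits into one subproblem per Z_i once S is fixed.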

S and Z are optimized by alternating iteration to minimize formula (2).

When the self-expression matrices Z_i, i = 1, ..., m, are held fixed, S is updated by formula (3):

S = (Z Z^{T} + \mu I)^{1/2}, \quad Z = [Z_{1}, Z_{2}, \ldots, Z_{m}] \qquad (3)
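A minimal sketch of the S update in formula (3) follows. The matrix square root is computed by eigendecomposition, which is one of several possible choices. The function name `update_S` and the default value of `mu` are assumptions, since the text does not give a value for μ.

```python
import numpy as np

def update_S(Zs, mu=1e-6):
    """S = (Z Z^T + mu I)^(1/2) with Z = [Z_1, ..., Z_m], as in formula (3)."""
    Z = np.hstack(Zs)
    G = Z @ Z.T + mu * np.eye(Z.shape[0])   # symmetric positive definite
    w, V = np.linalg.eigh(G)                # G = V diag(w) V^T
    w = np.maximum(w, 0.0)                  # guard against rounding below zero
    return (V * np.sqrt(w)) @ V.T           # symmetric square root of G
```

For small `mu`, the trace of the returned matrix is close to the nuclear norm of Z, in line with the role of S in formula (2).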

When S is held fixed, each self-expression matrix Z_i, i = 1, ..., m, is updated by solving formula (4):

\langle Z_{i} \rangle = \arg\min_{Z_{i}} \| X_{i} - X_{i} Z_{i} \|_{F}^{2} + \lambda \sum_{j=1, j \neq i}^{m} \mathrm{tr}(Z_{i} L_{j} Z_{i}^{T}) + \frac{\beta}{2} \mathrm{tr}(Z_{i}^{T} S^{-1} Z_{i}) + \rho \| Z_{i} \|_{F}^{2} \qquad (4)

Setting the derivative of formula (4) with respect to Z_i to zero and rearranging gives formula (5):

\left( X_{i}^{T} X_{i} + \frac{\beta}{2} S^{-1} + \rho I \right) Z_{i} + \lambda Z_{i} \sum_{j=1, j \neq i}^{m} L_{j} = X_{i}^{T} X_{i} \qquad (5)
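Formula (5) has the form A Z_i + Z_i B = C, a Sylvester equation, so one way to carry out the Z_i update is with a standard Sylvester solver. The sketch below uses `scipy.linalg.solve_sylvester`; the function name `update_Zi`, its arguments and the toy data are illustrative assumptions, and at least one other modality's Laplacian must be supplied.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def update_Zi(X_i, S, other_Ls, lam, beta, rho):
    """Solve formula (5) for Z_i; other_Ls holds the L_j with j != i."""
    n = X_i.shape[1]
    G = X_i.T @ X_i                                    # right-hand side X_i^T X_i
    A = G + 0.5 * beta * np.linalg.inv(S) + rho * np.eye(n)
    B = lam * sum(other_Ls)                            # lambda * sum_{j != i} L_j
    return solve_sylvester(A, B, G)                    # solves A Z + Z B = G

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(3, 6))                       # toy modality, 6 samples
A_path = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
L_demo = np.diag(A_path.sum(axis=1)) - A_path          # Laplacian of a path graph
Z_demo = update_Zi(X_demo, np.eye(6), [L_demo], lam=0.1, beta=1.0, rho=0.01)
```

With ρ > 0 the matrix A is positive definite and B is positive semidefinite, so A and -B share no eigenvalue and the solution is unique.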

Further, clustering with the first self-expression matrix to obtain a clustering result comprises:

clustering the first self-expression matrix by spectral clustering to obtain the clustering result.

To illustrate the method in detail, it is applied to five commonly used multimodal clustering databases: Movies617, PASCAL-VOC, WiKiText-Image, Animal and 3-Sources. The Movies617 database contains 617 films in 17 categories; its two modalities are 1878-dimensional keyword features and 1398-dimensional cast features. PASCAL-VOC contains image-text pairs in 20 classes; removing samples with multiple category labels leaves 5649 samples. Considering also the time cost of some of the comparison methods, the first three classes are taken as the evaluation set, using 512-dimensional Gist features and 399-dimensional text word-frequency features. The WiKiText-Image database consists of 2866 image-text pairs in 10 categories; 60 samples are randomly selected from each category to form a test set of 600 samples, with 10-dimensional LDA features for the text and 128-dimensional SIFT features for the images. The Animal database consists of 30475 samples in 50 classes; the first ten categories are chosen and 50 samples are randomly selected from each to form the test set, with PyramidHOG (PHOG), colorSIFT and SURF features as the representations under the three modalities. The 3-Sources database contains 416 distinct news stories collected from BBC, Reuters and The Guardian, divided into 6 categories; the 169 stories reported by all three sources are used as the test set.

In summary, the Laplacian matrix under each modality is computed first. The data of each data set are then fed into the model, which is trained to obtain the self-expression matrix under each modality. The optimal self-expression matrix is selected from among the self-expression matrices of the modalities. Finally, the spectral clustering method Normalized Cut is applied to the optimal self-expression matrix, and the result is taken as the final clustering result.
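The final clustering step can be sketched as below. This is a simplified stand-in, not the Normalized Cut implementation used in the experiments: it embeds the samples with the eigenvectors of the normalised Laplacian and groups them with a small k-means. The affinity (|Z| + |Z^T|) / 2, the farthest-point initialisation and the function name `spectral_cluster` are assumptions not stated in the text.

```python
import numpy as np

def spectral_cluster(Z, n_clusters, n_iter=50):
    """Cluster samples from a self-expression matrix Z of shape (n, n)."""
    A = 0.5 * (np.abs(Z) + np.abs(Z.T))                # symmetric affinity (assumption)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1) + 1e-12)
    # Normalised Laplacian I - D^{-1/2} A D^{-1/2}.
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, V = np.linalg.eigh(L_sym)                       # eigenvalues in ascending order
    E = V[:, :n_clusters]                              # spectral embedding
    E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)
    # Farthest-point initialisation of the k-means centres (assumption).
    centers = [E[0]]
    for _ in range(1, n_clusters):
        dist = np.min([np.sum((E - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign each sample to its nearest centre, then recompute the centres.
        d2 = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(d2, axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = E[labels == c].mean(axis=0)
    return labels

# Toy self-expression matrix with two blocks, i.e. two subspaces.
Z_blocks = np.zeros((7, 7))
Z_blocks[:3, :3] = 1.0
Z_blocks[3:, 3:] = 1.0
labels_demo = spectral_cluster(Z_blocks, n_clusters=2)
```

On the block-diagonal toy matrix the two blocks are recovered as two clusters. Real self-expression matrices are only approximately block-diagonal, which is why the spectral step is needed.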

The above is merely a specific embodiment of the invention, and the scope of protection of the invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of the claims.

Claims

1. A multimodal data subspace clustering method based on global consistency and local topology, characterized in that the method comprises:

obtaining the Laplacian matrix corresponding to each modality's data;

building a multimodal data subspace clustering model from the Laplacian matrices;

obtaining, through the multimodal data subspace clustering model, the self-expression matrix corresponding to each modality's data;

selecting a first self-expression matrix from among the self-expression matrices corresponding to the modalities;

clustering with the first self-expression matrix to obtain a clustering result.

2. The method according to claim 1, characterized in that obtaining the Laplacian matrix corresponding to each modality's data comprises:

3. The method according to claim 1, characterized in that obtaining the self-expression matrix corresponding to each modality's data through the multimodal data subspace clustering model comprises:

computing the self-expression matrix corresponding to each modality's data according to the following formula:

\langle Z \rangle = \arg\min_{Z} \sum_{i=1}^{m} \| X_{i} - X_{i} Z_{i} \|_{F}^{2} + \lambda \sum_{i=1}^{m} \sum_{j=1, j \neq i}^{m} \mathrm{tr}(Z_{i} L_{j} Z_{i}^{T}) + \beta \| Z \|_{*} + \rho \| Z \|_{F}^{2}

where Z = [Z_1, Z_2, \ldots, Z_m]; X_i is the data of the i-th modality and Z_i is its self-expression matrix; \sum_{i=1}^{m} \| X_{i} - X_{i} Z_{i} \|_{F}^{2} is the reconstruction error of the modalities; L_j is the Laplacian matrix corresponding to the j-th modality; \sum_{i=1}^{m} \sum_{j=1, j \neq i}^{m} \mathrm{tr}(Z_{i} L_{j} Z_{i}^{T}) expresses the local topology invariance of the modalities; \| Z \|_{*} expresses the global consistency of the modalities; \| Z \|_{F}^{2} is the regularization term; and λ, β and ρ are weight parameters.

4. The method according to claim 1, characterized in that clustering with the first self-expression matrix to obtain a clustering result comprises: