CN111582321A - Tensor subspace learning algorithm based on HSIC maximization - Google Patents

Tensor subspace learning algorithm based on HSIC maximization

Info

Publication number
CN111582321A
CN111582321A (application CN202010303130.8A)
Authority
CN
China
Prior art keywords
data
tensor
rkhs
hsic
maximization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010303130.8A
Other languages
Chinese (zh)
Inventor
马争鸣
陈李创凯
甘伟超
冯伟佳
刘洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University filed Critical National Sun Yat Sen University
Priority to CN202010303130.8A
Publication of CN111582321A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for studying the dimension-reduction problem of multi-dimensional data. The invention uses a tensor to represent a multi-dimensional data set, where the first dimensions of the tensor represent the dimensions of the multi-dimensional data and the last dimension represents the number of data items contained in the data set. Because the mode product of a tensor and a matrix can change the size of a given dimension of the tensor, a tensor-data dimension-reduction model based on the tensor mode product is proposed, in which the matrix of the mode product is selectable and can be determined according to different criteria. The invention determines the matrix of the mode product according to the criterion of maximizing the HSIC between the two tensors before and after dimension reduction. The HSIC maps the two data sets onto two reproducing kernel Hilbert spaces (RKHS) and then measures the statistical dependence of the two mapped data sets using the Hilbert-Schmidt (HS) operator between the two RKHS. The advantage of the present invention is that the RKHS is selectable, and one can choose the RKHS with the best dimension-reduction effect for a given data set.

Description

Tensor subspace learning algorithm based on HSIC maximization
Technical Field
The invention belongs to the field of machine learning and relates to a new subspace-learning algorithm in manifold learning based on HSIC maximization, applied to the dimension reduction of tensor data. The original data and the dimension-reduced data are respectively mapped into two different reproducing kernel Hilbert spaces (RKHS), and the statistical dependence between the two data sets is measured with the Hilbert-Schmidt (HS) operator between the two RKHS; this determines the dimension-reduced subspace, whose data set retains the geometric structure of the original data set as far as possible. The dimension-reduction method thereby addresses the curse of dimensionality in machine learning.
Background
With the advent of the big-data era, problems related to the curse of dimensionality have become more and more serious, so subspace-learning algorithms are gaining increasing importance. Subspace learning is one family of dimension-reduction algorithms: it realizes the mapping from high-dimensional features to a low-dimensional space through projection, a classical dimension-reduction idea. In pattern recognition, most common dimension-reduction (projection) algorithms can be formulated as subspace learning, for example PCA, LDA, LPP and LLE. The main questions in subspace learning are how to compress features from a high-dimensional space to a low-dimensional space, what information needs to be retained, what criterion to adopt, and what properties the low-dimensional features should have.
The HSIC criterion measures the statistical dependence between data sets with the HS operator between two RKHS and maximizes this dependence, thereby determining the orthonormal basis of the reduced subspace. The most critical problem when mapping the original data set to an RKHS, however, is how to preserve the geometric structure of the original data; since the kernel determines the RKHS, the selection of the kernel is also an important issue.
Mathematically, a function that is symmetric, square-integrable and positive definite is called a kernel function. According to the Moore-Aronszajn theorem, given a kernel function k(x, y) there is exactly one Hilbert space H such that H is a reproducing kernel Hilbert space and k(x, y) is its reproducing kernel; hence, as soon as a kernel function is defined, an RKHS and its reproducing kernel are defined. Because the dimensionality of tensor data is high, the curse of dimensionality arises when tensor data are processed in machine-learning algorithms, so the data need dimension-reduction processing. Dimension reduction is the main application of manifold learning, and from the dimension-reduction point of view most manifold-learning algorithms are local-feature-preserving algorithms. This is likely due to the mathematical nature of manifolds: in mathematics, a manifold is defined as a space that is locally homeomorphic to Euclidean space. In recent years, manifold-learning-based local and global feature-preserving algorithms have been widely used. In many such algorithms, the (linear or nonlinear) global relationship between the high-dimensional and low-dimensional data is first assumed and then substituted into the manifold-learning objective function to determine that relationship.
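For illustration only (this sketch is not part of the claimed method; the function name, kernel choice and parameters are assumptions of this description), the following fragment builds a Gaussian kernel matrix for a small data set, with samples stored as columns, and numerically checks the symmetry and positive semi-definiteness that the kernel definition above requires:

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    X holds one sample per column (D x N), matching the convention
    used elsewhere in this description.
    """
    # Pairwise squared Euclidean distances between the columns of X.
    sq_norms = np.sum(X ** 2, axis=0)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X.T @ X
    return np.exp(-d2 / (2.0 * sigma ** 2))

X = np.random.randn(5, 20)                       # 20 samples in R^5
K = gaussian_kernel_matrix(X, sigma=2.0)

print(np.allclose(K, K.T))                       # symmetry
print(np.min(np.linalg.eigvalsh(K)) >= -1e-10)   # positive semi-definite up to rounding
```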
In contrast, subspace learning and tensor data dimensionality reduction are combined, the original tensor data are considered to be elements of a high-dimensional data space, and the data subjected to dimensionality reduction are considered to be elements of a learned low-dimensional subspace. Mapping the high-dimensional data and the low-dimensional data into two different RKHS, and measuring the statistical dependence of the two different RKHS by using an HS operator between the two different RKHS to make the dependence maximum. That is, the subspace is determined by finding the orthonormal basis of the target subspace using the criterion of HSIC maximization. By doing so, the data after dimensionality reduction can keep the geometric structure of the original data as much as possible, and a better dimensionality reduction effect is achieved.
Disclosure of Invention
In existing machine learning, much of the input data is non-Euclidean and high-dimensional, and linear operations cannot be carried out on it directly; the input data are therefore mapped onto an RKHS with a kernel function, and linear operations on the RKHS then handle many machine-learning problems well. The curse of dimensionality is addressed by performing dimension reduction on the RKHS. The kernel functions used in machine-learning algorithms are essentially fixed, so the RKHS mapped to by a given kernel function is also fixed, and different kernel functions generate different RKHS. Each RKHS corresponds to applications in different fields, which generalizes the application of manifold-learning dimension-reduction algorithms across fields. The invention therefore proposes a subspace-learning framework. Most manifold-learning objective functions can be simplified to the following form:
min_Y tr(Y L Y^T)
where tr(·) is the trace of a matrix, Y is the data after dimension reduction, and L is a symmetric positive semi-definite matrix derived from the high-dimensional data according to the particular manifold-learning algorithm. The high-dimensional data X and the low-dimensional data Y are assumed to be linearly related, that is, Y = W^T X, and the linear transformation matrix W is determined by the following manifold-learning objective function:
min_W tr(W^T X L X^T W)
the algorithm represented by this formula is LPP or a variant of LPP, theoretically tr (YLY)T) It can be said to be the objective function of any manifold learning algorithm.
The framework for dimension reduction of tensor data based on subspace learning is as follows. For a high-dimensional data set 𝒳 ∈ R^{L_1 × L_2 × ⋯ × L_N}, a subspace span(W) is found according to a certain criterion, and the coordinates of 𝒳 projected onto span(W) — also called the Fourier coefficients of 𝒳 — are obtained. Here span(W) is the space spanned by the column vectors of W, with W_n ∈ R^{L_n × J_n} and J_n ≪ L_n, n = 1, 2, …, N-1. For tensor data, the projection is expressed with the tensor mode product:

𝒴 = 𝒳 ×_1 W_1^T ×_2 W_2^T ⋯ ×_{N-1} W_{N-1}^T

Unfolding the tensor along mode N gives the matrix form:

Y_(N) = X_(N) (W_{N-1} ⊗ W_{N-2} ⊗ ⋯ ⊗ W_1)
for subspace learning, the orthonormal basis for the subspace should be determined according to a certain criterion, wherein,
Figure BDA0002454758300000027
a common criterion is the minimum distance criterion, namely: raw data
Figure BDA00024547583000000212
The distance from the projected data is minimal as follows:
Figure BDA0002454758300000028
this algorithm is now the so-called PCA algorithm. However, the algorithm proposed by the present invention is based on the criterion of HSIC maximization to determine the orthonormal basis of the subspace.
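To make the mode-product model concrete, the sketch below reduces every data mode of a toy tensor data set with mode products; the projection matrices are random orthonormal stand-ins for matrices that would be chosen by a criterion such as HSIC maximization, and the helper name is an assumption of this illustration:

```python
import numpy as np

def mode_product(T, M, mode):
    """Mode-`mode` product T x_mode M^T: contracts dimension `mode` of T
    with the rows of M, so size L_mode becomes M.shape[1]."""
    out = np.tensordot(T, M, axes=([mode], [0]))   # new axis ends up last
    return np.moveaxis(out, -1, mode)

# Data set: 40 images of size 32 x 24, stored as a 32 x 24 x 40 tensor
# (the last dimension indexes the samples, as in the description above).
X = np.random.randn(32, 24, 40)

W1 = np.linalg.qr(np.random.randn(32, 5))[0]   # orthonormal L_1 x J_1
W2 = np.linalg.qr(np.random.randn(24, 4))[0]   # orthonormal L_2 x J_2

Y = mode_product(mode_product(X, W1, 0), W2, 1)
print(Y.shape)   # (5, 4, 40): every sample reduced, the sample mode untouched
```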
The invention has the characteristics and significance that:
(1) The dimension-reduction problem of multi-dimensional data is studied here. Unlike most applications, which use a tensor to represent a single multi-dimensional datum, the first contribution here is to represent a multi-dimensional data set with one tensor, where the first dimensions of the tensor represent the dimensions of the multi-dimensional data and the last dimension represents the number of data items contained in the data set.
(2) The tensor-matrix mode product can change the size of a given dimension of a tensor. The second contribution of the invention is therefore a tensor-data dimension-reduction model based on the mode product, in which the matrix of the mode product is selectable and can be determined according to different criteria.
(3) The matrix of the mode product is determined by the criterion of maximizing the HSIC between the two tensors before and after dimension reduction. The HSIC maps the two data sets onto two reproducing kernel Hilbert spaces (RKHS) and then measures the statistical dependence of the two mapped data sets using the HS operator between the two RKHS.
Drawings
FIG. 1: a tensor subspace learning algorithm flow chart based on HSIC maximization.
Detailed Description
A tensor subspace learning algorithm based on HSIC maximization comprises the following specific contents:
In this invention, maximization of HSIC(X, Y) is used as the criterion for dimension reduction, namely:

max_Y HSIC(X, Y) = max_Y (1/(N-1)^2) tr(K_X C_N K_Y C_N)

where K_X and K_Y are the kernel matrices of X and Y, and C_N = I_N − (1/N) 1_N 1_N^T is the centering matrix.
In other words, the aim is to find a data set Y in a low-dimensional Euclidean space R^d that is as statistically dependent as possible, in the HSIC sense, on the data set X in the high-dimensional Euclidean space R^D; Y can then be regarded as the result of reducing the dimension of X. For ease of description, in the remainder of this document the algorithm presented here is referred to as HSIC. Compared with other dimension-reduction algorithms that impose a linear relationship (such as PCA, where Y = W^T X and W is a linear transformation matrix), the HSIC algorithm better respects the nature of the data itself.
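For reference, a minimal sketch of the empirical HSIC between two data sets follows, using the biased estimator tr(K_X C_N K_Y C_N)/(N-1)^2 with Gaussian kernels; the kernel choice, bandwidths and function names are assumptions of this illustration rather than prescriptions of the invention:

```python
import numpy as np

def gaussian_gram(Z, sigma=1.0):
    """Gram matrix for samples stored as columns of Z."""
    sq = np.sum(Z ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z.T @ Z
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between data sets X (D x N) and Y (d x N)."""
    N = X.shape[1]
    C = np.eye(N) - np.ones((N, N)) / N        # centering matrix C_N
    Kx = gaussian_gram(X, sigma_x)
    Ky = gaussian_gram(Y, sigma_y)
    return np.trace(Kx @ C @ Ky @ C) / (N - 1) ** 2

X = np.random.randn(10, 50)
Y_dependent = X[:3] + 0.1 * np.random.randn(3, 50)   # a function of X plus noise
Y_independent = np.random.randn(3, 50)

print(hsic(X, Y_dependent))     # typically noticeably larger than ...
print(hsic(X, Y_independent))   # ... the value for independent data
```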
In HSIC(X, Y), the dimension-reduction result Y is hidden inside the kernel matrix K_Y, which is inconvenient when solving the HSIC maximization problem above. To express Y explicitly, the kernel function of Y in HSIC is defined as k_Y : R^d × R^d → R such that, for any y', y'' ∈ R^d,
k_Y(y', y'') = y'^T y'' + κ k(y', y'')

where κ > 0 and the term κ k(y', y'') is added to guarantee, in theory, that k_Y is positive definite.
Clearly, the formula k_Y(y', y'') = y'^T y'' + κ k(y', y'') shows that k_Y is a kernel function, so the kernel matrix K_Y can be expressed as

K_Y = Y^T Y + κ K

where K is the kernel matrix generated by k.
Substituting this expression into the HSIC objective max_Y tr(K_X C_N K_Y C_N) yields:

max_Y [ tr(C_N K_X C_N Y^T Y) + κ tr(C_N K_X C_N K) ]
Since the term κ tr(C_N K_X C_N K) does not depend on Y, and N and κ are constants, the problem above is equivalent to the following:

max_Y tr(Y C_N K_X C_N Y^T)
Because Y = W^T X and X is the known original data set, choosing Y is equivalent to choosing W, so the above is equivalent to the following problem:

max_W tr(W^T X C_N K_X C_N X^T W)
Clearly, the above formula is simple to understand and to use, and the problem it presents is easy to solve. In fact, because the kernel matrix K_X is a symmetric positive definite matrix, the solution can be converted into computing the maximum of a Rayleigh quotient under the orthogonality condition W^T W = I_d. The Rayleigh-quotient problem is a standard matrix computation, and many existing routines are available for it.
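A minimal sketch of this last step in the linear case is given below, assuming a Gaussian kernel for K_X and illustrative function names: because M = X C_N K_X C_N X^T is symmetric, the trace maximization under W^T W = I_d is solved by the eigenvectors of M belonging to its d largest eigenvalues.

```python
import numpy as np

def hsic_max_linear_subspace(X, Kx, d):
    """max_W tr(W^T X C Kx C X^T W)  subject to  W^T W = I_d.

    X: original data, one sample per column (D x N).
    Kx: N x N kernel matrix of X.  Returns W (D x d) and Y = W^T X.
    """
    N = X.shape[1]
    C = np.eye(N) - np.ones((N, N)) / N          # centering matrix C_N
    M = X @ C @ Kx @ C @ X.T                     # symmetric D x D matrix
    vals, vecs = np.linalg.eigh(M)               # ascending eigenvalues
    W = vecs[:, -d:]                             # top-d eigenvectors
    return W, W.T @ X

# Usage with a Gaussian kernel on a toy data set.
X = np.random.randn(20, 100)
sq = np.sum(X ** 2, axis=0)
Kx = np.exp(-(sq[:, None] + sq[None, :] - 2 * X.T @ X) / 2.0)
W, Y = hsic_max_linear_subspace(X, Kx, d=5)
```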
The application of the above-described framework of subspace learning to high-dimensional tensor data is specifically as follows:
First, given tensor data 𝒳 ∈ R^{L_1 × ⋯ × L_{N-1} × L_N}, the aim is to convert it into low-dimensional tensor data 𝒴 ∈ R^{J_1 × ⋯ × J_{N-1} × L_N} such that the statistical dependence between 𝒳 and 𝒴 is maximal, where J_n ≪ L_n, n = 1, …, N-1. We then set

𝒴 = 𝒳 ×_1 A_1^T ×_2 A_2^T ⋯ ×_{N-1} A_{N-1}^T

where A_n ∈ R^{L_n × J_n}, n = 1, …, N-1, and 𝒴 contains the coordinates on the subspace. The task of dimension reduction is thus to find the matrices A_n, n = 1, …, N-1, and the HSIC maximization criterion is used to determine this subspace.
By mode-N unfolding, the tensor model

𝒴 = 𝒳 ×_1 A_1^T ×_2 A_2^T ⋯ ×_{N-1} A_{N-1}^T

can be converted into the matrix form

Y = W^T X

where X = X_(N)^T, Y = Y_(N)^T and W = A_{N-1} ⊗ A_{N-2} ⊗ ⋯ ⊗ A_1. Finally, substituting Y = W^T X into the HSIC maximization problem gives the final objective function:

max_W tr(W^T X C_N K_X C_N X^T W)

where C_N = I_N − (1/N) 1_N 1_N^T, K_X is the kernel matrix of X, and k_X(·, ·) is a kernel function.
Solving the problem revealed by this equation is also simple. In fact, since the matrix X C_N K_X C_N X^T is symmetric positive definite, the problem can be converted into computing the maximum of a Rayleigh quotient under the orthogonality condition W^T W = I. The objective function of the invention is therefore:

max_W tr(W^T X C_N K_X C_N X^T W),  subject to  W^T W = I

Applying the Lagrange multiplier method gives the eigenvalue problem

X C_N K_X C_N X^T W = W Λ

It is therefore only necessary to perform an eigenvalue decomposition of the matrix X C_N K_X C_N X^T and take the eigenvectors of the leading eigenvalues to form W, which completes the learning of the subspace.
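For concreteness, the sketch below applies this eigendecomposition step to tensor data after unfolding each sample into a vector; it assumes a Gaussian kernel and treats W as an unstructured projection, so the Kronecker structure W = A_{N-1} ⊗ ⋯ ⊗ A_1 described above is not enforced and the recovery of the individual factors A_n (for example by alternating optimization) is not shown. All names are illustrative:

```python
import numpy as np

def hsic_max_tensor_subspace(T, d, sigma=1.0):
    """Sketch of the final step for tensor data.

    T: data tensor whose LAST axis indexes the samples (L_1 x ... x L_{N-1} x L_N).
    Each sample is unfolded into a column of X, a Gaussian kernel matrix K_X is
    built on the samples, and W is taken from the top-d eigenvectors of
    X C K_X C X^T, as in the derivation above.
    """
    N = T.shape[-1]
    X = T.reshape(-1, N)                             # samples as columns
    sq = np.sum(X ** 2, axis=0)
    Kx = np.exp(-(sq[:, None] + sq[None, :] - 2 * X.T @ X) / (2 * sigma ** 2))
    C = np.eye(N) - np.ones((N, N)) / N
    M = X @ C @ Kx @ C @ X.T
    vals, vecs = np.linalg.eigh(M)                   # ascending eigenvalues
    W = vecs[:, -d:]                                 # leading eigenvectors form W
    return W, W.T @ X                                # reduced coordinates per sample

# Toy usage: 60 samples of 16 x 12 data.
T = np.random.randn(16, 12, 60)
W, Y = hsic_max_tensor_subspace(T, d=8)
print(W.shape, Y.shape)    # (192, 8) (8, 60)
```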

Claims (1)

1. A tensor subspace learning algorithm based on HSIC maximization is characterized in that:
A. a subspace-learning framework based on HSIC maximization is proposed; the HSIC criterion measures the statistical dependence between data sets in two reproducing kernel Hilbert spaces (RKHS), and this dependence reflects the geometric structure of the manifold through the information in the available data samples, so that the kernel mapping preserves the original geometric structure of the manifold;
B. applying the subspace learning framework to tensor data;
C. considering that tensor data have high dimensionality and suffer from the curse of dimensionality, processing the tensor data directly is complicated; the tensor data are therefore mapped to an RKHS and dimension-reduction processing is carried out there;
D. a multi-dimensional data set is expressed with a tensor, and dimension reduction is carried out with a tensor-data dimension-reduction model based on the tensor mode product;
E. the dimension-reduced data set uses the inner-product kernel as its reproducing kernel, which generates its RKHS;
F. the reproducing kernel of the original data set is selectable, with different reproducing kernels producing different RKHS;
G. the statistical dependence between the two data sets is measured with the HS operator between the two RKHS;
H. the matrix of the mode product is determined by the HSIC maximization criterion, which in turn determines the orthonormal basis of the subspace;
I. the algorithm of this patent is suitable for data sets in machine-learning applications such as face recognition and object classification.
CN202010303130.8A 2020-04-17 2020-04-17 Tensor subspace learning algorithm based on HSIC maximization Pending CN111582321A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010303130.8A CN111582321A (en) 2020-04-17 2020-04-17 Tensor subspace learning algorithm based on HSIC maximization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010303130.8A CN111582321A (en) 2020-04-17 2020-04-17 Tensor subspace learning algorithm based on HSIC maximization

Publications (1)

Publication Number Publication Date
CN111582321A 2020-08-25

Family

ID=72117517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010303130.8A Pending CN111582321A (en) 2020-04-17 2020-04-17 Tensor subspace learning algorithm based on HSIC maximization

Country Status (1)

Country Link
CN (1) CN111582321A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112287811A (en) * 2020-10-27 2021-01-29 广州番禺职业技术学院 Domain self-adaption method based on HSIC and RKHS subspace learning


Similar Documents

Publication Publication Date Title
Liu et al. Multiview dimension reduction via Hessian multiset canonical correlations
Shang A survey of functional principal component analysis
Guo et al. A feature fusion based forecasting model for financial time series
Zhang et al. Robust non-negative matrix factorization
Sirimongkolkasem et al. On regularisation methods for analysis of high dimensional data
Gallant et al. The relative efficiency of method of moments estimators
Guo et al. A stock market forecasting model combining two-directional two-dimensional principal component analysis and radial basis function neural network
Tenreiro Machado et al. Analysis of financial data series using fractional Fourier transform and multidimensional scaling
Xiao et al. Primal and dual alternating direction algorithms for ℓ 1-ℓ 1-norm minimization problems in compressive sensing
US20150074130A1 (en) Method and system for reducing data dimensionality
CN110334546B (en) Difference privacy high-dimensional data release protection method based on principal component analysis optimization
Yu et al. A classification scheme for ‘high-dimensional-small-sample-size’data using soda and ridge-SVM with microwave measurement applications
Yin et al. Nonnegative matrix factorization with bounded total variational regularization for face recognition
Lisboa et al. Cluster-based visualisation with scatter matrices
CN111582321A (en) Tensor subspace learning algorithm based on HSIC maximization
Fan et al. An efficient KPCA algorithm based on feature correlation evaluation
Li Estimation of large dynamic covariance matrices: A selective review
Mehrali et al. A Jensen–Gini measure of divergence with application in parameter estimation
Batalo et al. Temporal-stochastic tensor features for action recognition
Su et al. Regularized denoising latent subspace based linear regression for image classification
Lapko et al. Analysis of the ratio of the standard deviations of the kernel estimate of the probability density with independent and dependent random variables
Liu et al. Grey incidence analysis models
CN109165679B (en) Data processing method and device
Luo et al. Frequency Information Matters for Image Matting
Bannour Lahaw et al. A new greedy sparse recovery algorithm for fast solving sparse representation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200825