CN109615014B - KL divergence optimization-based 3D object data classification system and method - Google Patents


Publication number
CN109615014B
Authority
CN
China
Prior art keywords
data
divergence
view
module
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811540690.4A
Other languages
Chinese (zh)
Other versions
CN109615014A (en)
Inventor
高跃
吉书仪
赵曦滨
黄晋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201811540690.4A
Publication of CN109615014A
Application granted
Publication of CN109615014B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147Distances to closest patterns, e.g. nearest neighbour classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention relates to a method for classifying 3D object data based on KL divergence optimization, which comprises the following steps: preprocessing original data such as images and text, and modeling each object as a multidimensional distribution; selecting a certain number of triplets from the labeled training data for model training; taking the selected triplets as training data, applying a linear mapping A to all mean vectors, and learning the optimal linear mapping through iterative optimization, wherein the learning process follows the basic assumption of metric learning, namely that the distance between samples of the same class is reduced and the distance between samples of different classes is increased; optimizing with an intrinsic gradient descent algorithm, in which the gradient of the objective function is projected onto the tangent space of the manifold and Riemannian gradient descent is then performed on the manifold of symmetric positive definite (SPD) matrices endowed with the affine-invariant Riemannian metric; and calculating the KL divergence between the test set and the training set, and classifying the samples with a K nearest neighbor (KNN) classifier. The method can effectively improve the classification accuracy of the system and has more stable performance.

Description

KL divergence optimization-based 3D object data classification system and method
Technical Field
The invention belongs to the field of machine learning, and particularly relates to a system and a method for classifying data based on KL divergence optimization.
Background
With the development of information technology, data classification is becoming a research hotspot in academia and industry. Data classification refers to the process of automatically determining data categories from data content under a given classification hierarchy, and its applications include picture classification, text classification, speech classification, and so forth. A good classifier makes downstream use of the data easier. For example, the results of a preliminary text classification can be applied in fields such as text filtering, automatic classification of Web documents, digital libraries, word semantic analysis, and document organization and management.
In machine learning, data objects are often modeled as multidimensional distributions to characterize their features. How to measure the similarity between two data distributions therefore becomes a core problem in the classification task: the higher the similarity between two samples, the greater the probability that they belong to the same class. Common measures between probability distributions include the Jensen-Shannon divergence, the Earth Mover's Distance (EMD), the Maximum Mean Discrepancy, and the like. Among such measures, the Kullback-Leibler divergence (KL divergence), also known as relative entropy, is one of the most commonly used for measuring the similarity between two probability distributions, and is widely used in fields such as computer vision and pattern recognition. The KL divergence represents the information lost when the probability distribution Q is used to fit the true distribution P, and can therefore measure the similarity between data distributions well.
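As a small illustration of the information-loss reading of the KL divergence described above, the following sketch computes it for two discrete distributions. The function name and the example probabilities are hypothetical and chosen only for illustration; they do not come from the patent.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) for two discrete distributions given as lists of probabilities."""
    # Terms with p_i == 0 contribute nothing; q_i must be positive wherever p_i is.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]   # "true" distribution P (illustrative values)
q = [0.9, 0.1]   # distribution Q used to fit P (illustrative values)
d_pq = kl_divergence(p, q)
d_qp = kl_divergence(q, p)
```

The two directions give different values, which is a reminder that the KL divergence is not symmetric in its arguments.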
However, in the real world, data sources are very complex and data quality cannot be guaranteed. A collected dataset may contain noisy data, missing data, unbalanced data distributions, mixed data points that are difficult to distinguish, and the like. Under such circumstances, it is difficult for the conventional KL divergence to measure the similarity between two data samples precisely in Euclidean space, and the accuracy of data classification may be greatly affected. In other words, the conventional KL divergence does not always yield an optimal data representation.
Existing research on the KL divergence mostly follows two directions: using the KL divergence directly as a measure between multidimensional distributions, as in the variational autoencoder, and obtaining a more effective measure by approximating the KL divergence, for example through a variational upper bound or a Monte Carlo approximation. Most existing studies thus apply the KL divergence directly, while few focus on optimizing the KL divergence itself.
Disclosure of Invention
The invention aims to provide a data classification system and method based on KL divergence optimization, which optimize the traditional KL divergence, learn an optimal representation of the data, effectively improve the classification capability of existing systems, and have more stable performance.
The technical scheme of the invention is to provide a data classification system based on KL divergence optimization, which comprises a feature extraction module, a feature whitening module, a multi-view feature modeling module, a training data selection module, a feature mapping module, a multi-view sample similarity calculation module, a KL divergence-based optimization module, and a classification module based on the KL divergence under the optimal linear mapping, and is characterized in that:
the feature extraction module is used for extracting multi-view features of the original data from the original image and text data;
the feature whitening module uniformly projects the multi-view features extracted from the feature extraction module to the same low-dimensional space, performs whitening treatment on the features, reduces redundancy of the multi-view features extracted from the original data, removes correlation among the features of different samples, and reconverts the transformed data back to the original space;
the multi-view feature modeling module is used for modeling and characterizing the multi-view features processed by the feature whitening module;
the training data selection module is used for selecting a certain amount of triples from the labeled training data to perform model training;
after training data is selected, a feature mapping module generates a projection matrix, original data features are mapped to a new feature space, and in the new feature space, the distances between similar samples are reduced, and the distances between different types of samples are increased; then, the multi-view sample similarity calculation module measures the similarity of the multi-view samples by calculating the optimized KL divergence in the new feature space;
the KL divergence-based optimization module is used for modeling the optimization of the KL divergence as a minimization problem on the manifold of symmetric positive definite matrices; the model is trained repeatedly by using the feature mapping module, the multi-view sample similarity calculation module and the KL divergence-based optimization module until convergence, thereby learning the optimal linear mapping;
and the classification module is used for mapping the original data features to a new feature space by using the learned optimal linear mapping based on the KL divergence measurement under the optimal linear mapping, and classifying the test set samples by adopting a K neighbor classifier based on the KL divergence between the test set and the training set.
Further, each sample is modeled as a Gaussian distribution, and the two Gaussian distributions being compared are assumed to have the same covariance matrix; for each sample, the multi-view features are characterized by the mean vector and covariance matrix of the Gaussian distribution.
Further, each triplet includes samples belonging to the same class of objects as well as a sample belonging to a different class of objects.
The invention also provides a data classification method implemented by the data classification system based on KL divergence optimization, which comprises the following steps:
step 1, extracting characteristics of original data, namely extracting multi-view characteristics of the original data from the original data comprising images and texts;
step 2, performing whitening treatment on the extracted features, namely uniformly projecting the multi-view features to the same low-dimensional space after extracting the multi-view features from the original data, performing whitening treatment on the features, reducing redundancy of the multi-view features extracted from the original data, removing correlation among the features of different samples, and re-transforming the transformed data back to the original space;
step 3, modeling and characterizing the processed multi-view features;
step 4, selecting a certain amount of triples from the labeled training data as training data, wherein the distribution of the training data is the characteristic distribution formed by modeling the sample in the step 3;
step 5, taking the selected triples as training data, performing feature mapping, namely applying a linear mapping on all mean vectors, mapping the original data features to a new feature space, wherein the distances between similar samples are reduced, and the distances between different types of samples are increased in the new feature space;
step 6, calculating the similarity between the multi-view samples, namely measuring the similarity of the multi-view samples by calculating the optimized KL divergence in the new feature space of the mapping;
step 7, taking the selected triples as training data, and optimizing based on KL divergence; the model is repeatedly trained by utilizing a feature mapping module, a multi-view data similarity calculation module and an optimization module based on KL divergence until convergence, so that the optimal linear mapping is learned;
step 8, mapping the original data features to a new feature space by using the optimal linear mapping learned in step 7, and classifying the test set samples in the new feature space with a K nearest neighbor classifier.
Further, in step 7, a positive parameter γ is used to balance the effects of same-class data and different-class data, and γ is set as a function of the average KL divergence of the entire training dataset.
Further, in step 7, after the gradient of the objective function is projected onto the tangent space of the manifold, Riemannian gradient descent is performed on the manifold of symmetric positive definite (SPD) matrices endowed with the affine-invariant Riemannian metric; after symmetrization, the manifold structure of the learned linear mapping is preserved during each iteration of the optimization.
Further, in step 8, the KL divergence between each sample in the test set and each sample in the training set is calculated; a priority queue of size k, ordered from the largest distance to the smallest, is maintained to store the nearest-neighbor training tuples; k tuples are randomly selected from the training tuples as the initial nearest-neighbor tuples, the distances from the test tuple to these k tuples are calculated, and the training tuple labels and distances are stored in the priority queue.
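The priority-queue procedure of step 8 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text only describes the queue and its random initialization, so the scan over the remaining training tuples and the majority vote are assumed here from the standard KNN procedure, and all names (`knn_classify`, `dist_to_train`) are hypothetical.

```python
import heapq
import random
from collections import Counter

def knn_classify(dist_to_train, train_labels, k, seed=0):
    """Classify one test sample from its KL divergences to all training samples."""
    n = len(dist_to_train)
    rng = random.Random(seed)
    initial = rng.sample(range(n), k)      # k random training tuples as initial neighbours
    # heapq is a min-heap, so distances are negated to keep the largest one on top.
    heap = [(-dist_to_train[i], i) for i in initial]
    heapq.heapify(heap)
    seen = set(initial)
    for i in range(n):
        if i in seen:
            continue
        # Assumed standard KNN step: replace the farthest stored neighbour by a closer tuple.
        if dist_to_train[i] < -heap[0][0]:
            heapq.heapreplace(heap, (-dist_to_train[i], i))
    votes = Counter(train_labels[i] for _, i in heap)
    return votes.most_common(1)[0][0]

# Illustrative divergences from one test sample to six training samples.
dists = [0.2, 3.1, 0.4, 2.7, 0.3, 2.9]
labels = ["table", "chair", "table", "chair", "table", "chair"]
predicted = knn_classify(dists, labels, k=3)
```

Keeping the largest distance at the top of the queue means each new training tuple needs only one comparison to decide whether it displaces a stored neighbour.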
The invention has the beneficial effects that:
(1) The method and the system provided by the invention can effectively improve the classification accuracy of the system and have more stable performance. They can cope with problems that may exist in real scenes, such as noisy data, missing data, unbalanced data distributions, and mixed data points that are difficult to distinguish.
(2) The invention can be applied to classification of multi-view data.
(3) The method learns an optimal linear mapping from the labeled training data, mapping the original data space to a new feature space. In the new feature space, data samples from the same class are closer together, while data samples from different classes are farther apart. Compared with existing systems of the same type, the system and the method can effectively improve the classification capability of the system and have more stable performance.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the method of the present invention;
FIG. 3 is an explanatory diagram of the intrinsic gradient descent algorithm employed in the method of the present invention;
FIG. 4 is a comparison result of classification accuracy of the system with other systems applied to a 3D object recognition task, wherein the test data set is an NTU16 data set;
FIG. 5 is a comparison result of classification accuracy of the system with other systems applied to a 3D object recognition task, wherein the test data set is an NTU47 data set;
FIG. 6 is a comparison result of classification accuracy of the present system with other systems applied in text classification tasks, the test dataset being a TWITTER dataset;
FIG. 7 shows the change of the classification accuracy of the system when the parameters of the system are changed.
Detailed Description
The technical scheme of the invention will be described in detail with reference to fig. 1-5.
As shown in FIG. 1, this embodiment provides a KL divergence optimization-based data classification system, which comprises a feature extraction module, a feature whitening module, a multi-view feature modeling module, a training data selection module, a feature mapping module, a multi-view sample similarity calculation module, a KL divergence-based optimization module, and a classification module based on the KL divergence under the optimal linear mapping, wherein:
The feature extraction module is used for extracting multi-view features of the original data from original data such as images and text.
The feature whitening module uniformly projects the multi-view features extracted from the feature extraction module to the same low-dimensional space, performs whitening treatment on the features, reduces redundancy of the multi-view features extracted from the original data, removes correlation among the features of different samples, and reconverts the transformed data back to the original space.
The multi-view feature modeling module is used for modeling and characterizing the multi-view features processed by the feature whitening module.
Specifically, the system models each sample as a Gaussian distribution and assumes that the two Gaussian distributions being compared have the same covariance matrix. For each sample, the multi-view features are characterized by the mean vector and covariance matrix of the Gaussian distribution.
After the multi-view feature modeling is completed, the training data selection module is used for selecting a certain amount of triples from the labeled training data to perform model training.
Wherein: the triplets include samples belonging to the same class of objects as well as samples belonging to different classes of objects, i.e. one input triplet includes a pair of positive samples and a pair of negative samples. For example, assume that there are now two types of objects, a table and a chair, respectively. For table a, one possible triplet may be represented as table a, table B, and chair C, with table a and table B representing a pair of positive sample pairs and table a and chair C representing a pair of negative sample pairs.
After the training data is selected, the feature mapping module generates a projection matrix that maps the original data features to a new feature space. In the new feature space, the distances between samples of the same class are reduced and the distances between samples of different classes are increased, so that the mapped data are easier to classify.
After the training data is selected, similarity calculation between the training data is needed. The multi-view sample similarity calculation module measures similarity of the multi-view samples by calculating an optimized KL-divergence in the mapped new feature space.
The KL divergence-based optimization module is used for modeling the optimization of the KL divergence as a minimization problem on the manifold of symmetric positive definite (SPD) matrices. To solve this optimization problem, the embodiment adopts an intrinsic gradient descent algorithm, namely the Riemannian gradient descent method, and improves it by symmetrizing the update formula. In the update formula, exp denotes the exponential function with base e, $f(A_t)$ denotes the objective function of the linear mapping after t iterations, $\nabla f(A_t)$ denotes the corresponding gradient, and $\alpha$ denotes the learning rate.
In this way, the manifold structure of the learned linear mapping is preserved during each iteration of the optimization, which ensures that the learned optimal KL divergence metric is symmetric positive definite.
The model is trained repeatedly by using the feature mapping module, the multi-view sample similarity calculation module and the KL divergence-based optimization module until convergence, so as to learn the optimal linear mapping.
In general, the system takes the triples selected in the training data selection module as training data, and learns an optimal linear mapping based on the optimization of KL divergence, so that the mapped data is easier to classify.
And the classification module is used for mapping the original data features to a new feature space by using the learned optimal linear mapping based on the KL divergence measurement under the optimal linear mapping, and classifying the test set samples by adopting a K Nearest Neighbor (KNN) classifier based on the KL divergence between the test set and the training set.
The feature extraction module is the first module of the data classification system based on KL divergence optimization and extracts features from the original data. The features are then whitened as a preprocessing step, and multi-view feature modeling is performed.
After the system has modeled the multi-view features of the samples, it selects a certain number of triplets as training data.
In the training module, the system maps the features to a new feature space, measures the similarity of the multi-view samples with the optimized KL divergence after mapping, and then optimizes with the intrinsic gradient descent algorithm. The system repeats this process until convergence, at which point it has learned an optimal linear mapping. Finally, the system performs classification in the mapped new feature space by using the classification module.
In particular, the method is based on optimizing the KL divergence. In addition, the embodiment does not optimize the classification system with the traditional gradient descent method but with an intrinsic gradient descent algorithm, improved by symmetrization so that the learned linear mapping remains symmetric positive definite in each optimization iteration. Furthermore, the system can be applied to the classification of multi-view data, whereas many systems of the same type can only be applied to single-view data.
The embodiment also provides a data classification method based on KL divergence optimization, which comprises the following steps:
Step 1, extracting features of the original data.
First, multi-view features of original data are extracted from the original image, text, etc.
Taking the 3D object classification as an example, in this step 1, each 3D object is depicted by a set of views in different directions. For each view, a set of Convolutional Neural Network (CNN) features is extracted. In addition to the feature extraction process, view clustering is performed to generate a view cluster, and a representative set of views is selected from among the view clusters, removing some of the views that may be redundant. In this way, each object may be characterized by a representative set of views selected from the cluster of views. By this method we can perform multi-view feature extraction for the 3D object classification task.
Taking text classification as an example, in this step 1, each piece of text may be characterized by a bag-of-words (BOW) feature, and the distribution of words in the text may be used as the distribution of the text data. Specifically, stop words are first removed from all texts, and all remaining non-stop words are then embedded into a word vector space (the word2vec space); that is, the vector representation of each word is learned through a three-layer neural network (i.e., the word2vec model). Each text then obtains a normalized bag-of-words (nBOW) vector reflecting the frequency of occurrence of each non-stop word in the text. In this way, feature extraction can be performed for the text classification task.
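The nBOW vector described above can be sketched as follows. This is only an illustrative sketch: the stop-word list, the whitespace tokenization, and the function name `nbow` are assumptions made for the example, and the word2vec embedding step is not shown.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "is", "on", "of"}   # hypothetical, tiny stop-word list

def nbow(text, vocabulary):
    """Normalized bag-of-words vector over a fixed vocabulary of non-stop words."""
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocabulary)
    # Each entry is the relative frequency of one vocabulary word in the text.
    return [counts[w] / total if total else 0.0 for w in vocabulary]

vocab = ["cat", "sat", "mat"]
vec = nbow("The cat sat on the mat the cat", vocab)
```

Here `vec` is `[0.5, 0.25, 0.25]`: after stop-word removal the text contains "cat" twice and "sat" and "mat" once each.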
It should be noted that the system of the present invention places no particular requirements on the feature extraction process and method, so other features may also be used; convolutional neural network features and normalized bag-of-words vector features are just examples.
Step 2, performing whitening treatment on the extracted features.
After multi-view features are extracted from the original data, the multi-view features are projected to the same low-dimensional space in a unified mode, whitening processing is carried out on the features, redundancy of the multi-view features extracted from the original data is reduced, correlation among the features of different samples is removed, and the transformed data are transformed back to the original space.
Taking 3D object classification as an example, all views of an object are concatenated and uniformly projected to the same low-dimensional space by principal component analysis (PCA); the projected features are then whitened; finally, the different views of the transformed data are separated and transformed back to the original space.
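The project, whiten, and transform-back sequence of step 2 can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: the patent does not give the exact whitening formula, so standard PCA whitening (division by the square root of each eigenvalue) is assumed, and the function name, `n_components`, `eps`, and the synthetic data are hypothetical.

```python
import numpy as np

def pca_whiten(features, n_components, eps=1e-8):
    """Project onto the top principal components, whiten, and map back to the input space."""
    mean = features.mean(axis=0)
    centered = features - mean
    cov = centered.T @ centered / (len(features) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # indices of the largest eigenvalues
    basis, scale = eigvecs[:, order], eigvals[order]
    projected = centered @ basis                       # low-dimensional PCA coordinates
    whitened = projected / np.sqrt(scale + eps)        # decorrelated, unit variance
    restored = whitened @ basis.T                      # back in the original feature space
    return whitened, restored

rng = np.random.default_rng(0)
views = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))   # synthetic correlated features
whitened, restored = pca_whiten(views, n_components=3)
```

After whitening, the covariance of the low-dimensional coordinates is approximately the identity, which is what removes redundancy between feature dimensions.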
Step 3, modeling and characterizing the processed multi-view features.
The present system models each sample as a multidimensional Gaussian distribution and assumes that the two Gaussian distributions being compared have the same covariance matrix. For each sample, the multi-view features are characterized by the mean vector and covariance matrix of the sample's Gaussian distribution. For example, in an embodiment of the present invention, each 3D object and each piece of text is modeled as a multidimensional Gaussian distribution, and all 3D objects and all texts share the same covariance matrix.
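As a rough sketch of step 3, each sample's view features can be summarized by a mean vector and a covariance matrix. The patent does not say how the shared covariance matrix is obtained, so averaging the per-sample estimates below is purely an assumption for illustration, as are the function name and the synthetic data.

```python
import numpy as np

def model_as_gaussian(view_features):
    """Mean vector and covariance matrix of one sample's view features (rows = views)."""
    mu = view_features.mean(axis=0)
    sigma = np.cov(view_features, rowvar=False)
    return mu, sigma

rng = np.random.default_rng(0)
object_a = rng.normal(loc=0.0, size=(20, 4))   # synthetic: 20 views, 4-dimensional features
object_b = rng.normal(loc=1.0, size=(20, 4))
mu_a, sigma_a = model_as_gaussian(object_a)
mu_b, sigma_b = model_as_gaussian(object_b)
# Assumption: the shared covariance is taken as the average of the two estimates.
sigma_shared = (sigma_a + sigma_b) / 2
```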
Step 4, selecting a certain amount of triples from the labeled training data as training data, wherein the distribution of the training data is the characteristic distribution formed by modeling the sample in the step 3;
As shown in FIG. 2, because the training data are selected as triplets, the complexity of enumerating all triplets is $O(n^3)$; therefore, not all triplets are computed. For each sample in the training dataset, the system selects $k_i$ samples from the same class and $k_g$ samples from different classes for training. $k_i$ and $k_g$ are hyperparameters.
Taking 3D object classification as an example, for each object in the training set, the $k_i$ objects from the same class and the $k_g$ objects from different classes that are most similar to it (i.e., have the smallest KL divergence to it) are selected and used for gradient computation and updating in the subsequent optimization process.
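The triplet selection of step 4 can be sketched as follows, assuming a precomputed matrix `D` of pairwise KL divergences. The function name and the way positives and negatives are paired into triplets are assumptions for illustration; the patent only states how many same-class and different-class samples are chosen per sample.

```python
import numpy as np

def select_triplets(D, labels, k_i, k_g):
    """For each anchor, pair its k_i nearest same-class and k_g nearest other-class samples."""
    labels = np.asarray(labels)
    triplets = []
    for a in range(len(labels)):
        order = np.argsort(D[a])                                     # smallest divergence first
        same = [j for j in order if j != a and labels[j] == labels[a]][:k_i]
        diff = [j for j in order if labels[j] != labels[a]][:k_g]
        triplets += [(a, int(p), int(n)) for p in same for n in diff]
    return triplets

# Illustrative divergences for: table A, table B, chair C, chair D.
D = np.array([[0.0, 0.1, 2.0, 3.0],
              [0.1, 0.0, 2.5, 2.2],
              [2.0, 2.5, 0.0, 0.2],
              [3.0, 2.2, 0.2, 0.0]])
triplets = select_triplets(D, ["table", "table", "chair", "chair"], k_i=1, k_g=1)
```

With $k_i$ and $k_g$ fixed, the number of triplets grows linearly with the number of samples instead of cubically.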
Step 5, taking the selected triplets as training data to perform feature mapping, namely applying a linear mapping A to all mean vectors and mapping the original data features to a new feature space, in which the distances between samples of the same class become smaller and the distances between samples of different classes become larger.
Most existing similar systems measure the similarity between two samples by directly calculating the KL divergence between them. To better distinguish samples in different classes, the KL divergence-optimized data classification system makes the following improvement: a linear mapping $A$ is applied to all mean vectors, i.e., every $\mu_i$ is replaced with $A\mu_i$, so that the original data features are mapped to a new feature space. The objective function is shown in FIG. 2.
Taking 3D object classification as an example, the 3D object after feature mapping follows the Gaussian distribution $\theta_i = g(x; A\mu_i, \Sigma_i)$, where $A$ is the learned linear mapping, $\theta_i$ denotes the i-th object, $g$ denotes a Gaussian distribution, $\Sigma$ denotes a covariance matrix, and $\mu$ denotes a mean vector.
Step 6, calculating the similarity between the multi-view samples; i.e. the similarity of the multi-view samples is measured by calculating the optimized KL-divergence in the new feature space of the map. Note that the optimized KL divergence here is continuously updated, i.e. the similarity between the multi-view samples is continuously updated.
In embodiments of the present invention, each text is modeled as a multidimensional Gaussian distribution, so the similarity between two texts is measured by the KL divergence between them. The original KL divergence is expressed as

$$D_{KL}(P_1 \| P_2) = \frac{1}{2}\left[\log\frac{\det\Sigma_2}{\det\Sigma_1} - n + \mathrm{tr}\left(\Sigma_2^{-1}\Sigma_1\right) + (\mu_1-\mu_2)^{T}\Sigma_2^{-1}(\mu_1-\mu_2)\right]$$

where $D_{KL}(P_1 \| P_2)$ denotes the KL divergence between sample $P_1$ and sample $P_2$, log denotes the natural logarithm, det denotes the determinant of a matrix, $n$ denotes the feature dimension of the sample, tr denotes the trace of a matrix, $\Sigma$ denotes a covariance matrix, and $\mu$ denotes a mean vector. The method assumes that the two Gaussian distributions have the same covariance matrix, i.e., $\Sigma_1 = \Sigma_2 = \Sigma$. Therefore, the above formula can be simplified to

$$D_{KL}(P_1 \| P_2) = \frac{1}{2}(\mu_1-\mu_2)^{T}\Sigma^{-1}(\mu_1-\mu_2)$$

During each iteration, the learned mapping is continuously updated, and the similarity of the two samples is then measured by the optimized KL divergence in the mapped new feature space, expressed as

$$K_A(P_1 \| P_2) = \frac{1}{2}(A\mu_1-A\mu_2)^{T}\Sigma^{-1}(A\mu_1-A\mu_2)$$

where $K_A(P_1 \| P_2)$ denotes the optimized KL divergence measure between sample $P_1$ and sample $P_2$, $\Sigma$ denotes the covariance matrix, $\mu$ denotes a mean vector, and $A$ is the learned linear mapping.
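The three divergence expressions of step 6 can be checked numerically with the sketch below. It assumes the standard closed form of the KL divergence between two Gaussians and obtains the mapped version by substituting $A\mu$ for $\mu$, as the text describes; the function names and the example values are hypothetical.

```python
import numpy as np

def kl_gaussian(mu1, sigma1, mu2, sigma2):
    """Closed-form KL divergence between two multivariate Gaussians."""
    n = mu1.shape[0]
    inv2 = np.linalg.inv(sigma2)
    diff = mu1 - mu2
    _, logdet1 = np.linalg.slogdet(sigma1)
    _, logdet2 = np.linalg.slogdet(sigma2)
    return 0.5 * (logdet2 - logdet1 - n + np.trace(inv2 @ sigma1) + diff @ inv2 @ diff)

def kl_mapped(A, mu1, mu2, sigma):
    """Shared-covariance KL divergence after replacing every mean mu by A @ mu."""
    diff = A @ (mu1 - mu2)
    return 0.5 * diff @ np.linalg.solve(sigma, diff)

mu1 = np.array([0.0, 0.0])
mu2 = np.array([1.0, 2.0])
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
full = kl_gaussian(mu1, sigma, mu2, sigma)            # general formula with equal covariances
simple = kl_mapped(np.eye(2), mu1, mu2, sigma)        # simplified formula (A = identity)
scaled = kl_mapped(2.0 * np.eye(2), mu1, mu2, sigma)  # mapping A = 2I stretches the means
```

With equal covariances the log-determinant and trace terms cancel, so `full` and `simple` agree, and scaling the means by 2 multiplies the divergence by 4.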
Step 7, taking the selected triples as training data, and optimizing based on KL divergence; the model is repeatedly trained by utilizing a feature mapping module, a multi-view data similarity calculation module and an optimization module based on KL divergence until convergence, so that the optimal linear mapping is learned;
As shown in FIG. 3, the KL divergence optimization-based data classification system models the problem as a minimization problem on the manifold of symmetric positive definite matrices. In a specific implementation, the optimized KL divergences (with the linear mapping A applied) between all samples in the training data of step 6 are first calculated and grouped according to whether the two samples belong to the same class. The first term in the objective function is the sum of all KL divergences between samples from the same class, and the second term is the sum of all KL divergences between samples from different classes. All KL divergences mentioned here and below are optimized KL divergences.
In addition, unlike existing similar systems, the KL divergence optimization-based data classification system does not use a hinge loss function, but instead uses a new positive parameter $\gamma$ to balance the effects of same-class data and different-class data. In the present embodiment, $\gamma$ is set according to the distribution characteristics of the dataset, as a function of the average KL divergence of the entire training dataset. As the data of different classes move farther and farther apart, the entropy of the system becomes large because the data become uniform, and $\gamma$ tends to 1 (in this case the average KL divergence becomes very large and the corresponding term in $\gamma$ tends to 0, so $\gamma$ tends to 1). Thus, the modified triplet constraint with $\gamma$ describes the class characteristics well. In addition, $\lambda$ in the objective function is a parameter that balances the loss term and the regularization term, and its value is between 0 and 1. $n$ denotes the number of samples. $K_A(\theta_i \| \theta_j)$ denotes the KL divergence measure between sample $\theta_i$ and sample $\theta_j$ under the mapping.
In addition to the modified triplet constraint, the KL divergence optimization-based data classification system employs a new regularization term to prevent overfitting. Overfitting often occurs in such data classification systems, particularly in high-dimensional situations. To preserve the local topology of the input space, the system designs a regularizer that is based on the local topology and requires local neighbors to satisfy local smoothness, and adds this regularizer to the objective function. In the regularizer, $\beta_i$ is a positive real number that in effect represents the density function $p(X_i)$ of the input $X_i$. The system estimates the density function $p(X_i)$ using the Parzen window method (kernel density estimation).
In the Parzen window estimate, $k_h$ is a Gaussian kernel function and $h$ is the width of the kernel, which controls the influence of the distance between samples; $h$ is typically set to 0.4. $N_i$ is the index set of the neighbors of the kernel center $x_i$, and the size of $N_i$ is taken as 3.
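A Parzen window estimate restricted to the nearest neighbors can be sketched as below. The exact estimator is not given in this text, so the sketch makes explicit assumptions: the kernel is an unnormalized Gaussian `exp(-d^2 / (2 h^2))`, the estimate for sample i is the average kernel value over its 3 nearest neighbors, and the input is a precomputed matrix of pairwise distances. The function name `parzen_density` is hypothetical.

```python
import numpy as np

def parzen_density(D, h=0.4, n_neighbors=3):
    """Estimate a density value beta_i for every sample.

    D           : (n, n) array of pairwise distances between samples.
    h           : kernel width (0.4 in the embodiment).
    n_neighbors : size of the neighbor index set N_i (3 in the embodiment).
    Assumed estimator: mean Gaussian kernel value over the nearest neighbors.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    beta = np.empty(n)
    for i in range(n):
        others = np.delete(np.arange(n), i)            # exclude the sample itself
        order = np.argsort(D[i, others])[:n_neighbors]
        nbrs = others[order]                            # neighbor index set N_i
        beta[i] = np.mean(np.exp(-D[i, nbrs] ** 2 / (2.0 * h ** 2)))
    return beta

# Toy example: four points close together on a line and one distant outlier.
x = np.array([0.0, 0.1, 0.2, 0.3, 5.0])
D = np.abs(x[:, None] - x[None, :])
beta = parzen_density(D)
```

Samples in dense regions receive larger values than isolated samples, which is the property a density-weighted local smoothness regularizer relies on.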
$S_{ij}$ denotes the similarity between two samples and is calculated using a Gaussian kernel function, where $\sigma = \min D + \frac{1}{v}(\max D - \min D)$, and $\max D$ and $\min D$ denote the maximum and minimum KL divergence over all pairs of samples, respectively. $v$ is a control parameter and is set to 10 in the present system. $D_{ij}$ denotes the KL divergence between samples i and j. Note that $S_{ij}$ is calculated from the initial KL divergences of the training data and is not updated afterwards; that is, $S_{ij}$ is computed in the original feature space of the data. $KL_A(\theta_i \| \theta_j)$ denotes the KL divergence between sample $\theta_i$ and sample $\theta_j$ under the mapping A.
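The kernel width and the similarity matrix can be computed once from the initial divergences, for example as in the sketch below. The expression for σ follows the text; the kernel form `exp(-D_ij^2 / (2 σ^2))` is an assumption, because the exact Gaussian kernel expression is not given here. The function name `similarity_matrix` is hypothetical.

```python
import numpy as np

def similarity_matrix(D, v=10.0):
    """Compute sigma and the similarities S_ij from initial KL divergences.

    D : (n, n) array of pairwise KL divergences in the original feature space.
    v : control parameter (10 in the present system).
    """
    D = np.asarray(D, dtype=float)
    off_diag = ~np.eye(D.shape[0], dtype=bool)   # only pairs of distinct samples
    min_d, max_d = D[off_diag].min(), D[off_diag].max()
    sigma = min_d + (max_d - min_d) / v          # sigma = minD + (maxD - minD) / v
    S = np.exp(-D ** 2 / (2.0 * sigma ** 2))     # assumed Gaussian kernel form
    return S, sigma

D0 = np.array([[0.0, 1.0, 3.0],
               [1.0, 0.0, 2.0],
               [3.0, 2.0, 0.0]])
S, sigma = similarity_matrix(D0)
```

Because S is fixed after this step, it can be cached and reused in every optimization iteration.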
In addition, to solve the minimization problem on the manifold, the invention does not use the conventional gradient descent method but designs an intrinsic (internal) gradient descent algorithm. As shown in fig. 3, given the manifold of SPD (symmetric positive definite) matrices with an affine-invariant Riemannian metric, the method projects the gradient of the objective function into the tangent space of the manifold and then performs Riemannian gradient descent on the manifold. After symmetrization, the manifold structure of the learned linear mapping is preserved in every iteration of the optimization, which guarantees that the learned optimal KL divergence metric remains symmetric positive definite. In the update formula of the intrinsic gradient descent algorithm, exp denotes the exponential function with the natural constant e as its base, $f(A_t)$ denotes the objective function after t iterations of the linear mapping, $\nabla f(A_t)$ denotes the corresponding gradient, and α denotes the learning rate.
The feature mapping process, the multi-view sample similarity calculation process and the optimization process are repeated until convergence; the linear mapping A obtained at that point is the learned optimal linear mapping.
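One iteration of Riemannian gradient descent on the SPD manifold can be sketched as follows. The update formula of the invention is not reproduced in this text, so the sketch uses the standard update for the affine-invariant metric, $A_{t+1} = A_t^{1/2} \exp(-\alpha A_t^{1/2}\, \mathrm{sym}(\nabla f(A_t))\, A_t^{1/2}) A_t^{1/2}$, as an assumption. The function names and the toy objective are hypothetical.

```python
import numpy as np

def _sym(X):
    """Symmetrize a square matrix."""
    return 0.5 * (X + X.T)

def riemannian_step(A, euclid_grad, lr):
    """One descent step on the SPD manifold (affine-invariant metric).

    A           : current SPD matrix A_t.
    euclid_grad : Euclidean gradient of the objective at A_t.
    lr          : learning rate alpha.
    """
    G = _sym(euclid_grad)                       # symmetrized gradient
    w, V = np.linalg.eigh(A)
    A_half = (V * np.sqrt(w)) @ V.T             # A^{1/2} via eigendecomposition
    inner = _sym(A_half @ G @ A_half)
    wi, Vi = np.linalg.eigh(inner)
    exp_inner = (Vi * np.exp(-lr * wi)) @ Vi.T  # matrix exponential of -lr * inner
    return _sym(A_half @ exp_inner @ A_half)    # stays symmetric positive definite

# Toy objective f(A) = tr(A) + tr(A^{-1}), minimized at the identity matrix.
def f(A):
    return np.trace(A) + np.trace(np.linalg.inv(A))

def grad_f(A):
    A_inv = np.linalg.inv(A)
    return np.eye(A.shape[0]) - A_inv @ A_inv

A0 = np.diag([2.0, 0.5])
A1 = riemannian_step(A0, grad_f(A0), lr=0.1)
```

Unlike a plain Euclidean step followed by a projection, this update cannot leave the manifold: the result is a congruence transform of a matrix exponential and is therefore always symmetric positive definite.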
Step 8, mapping the original data features to a new feature space by using the optimal linear mapping learned in the step 7, and classifying test set samples in the new feature space by using a K Nearest Neighbor (KNN) classifier;
In the classification module, the KL divergence optimization-based data classification system adopts the K-nearest-neighbor algorithm. The KL divergence between each sample in the test set and each sample in the training set is calculated first. A priority queue of size k, ordered by distance from largest to smallest, is maintained to store the nearest-neighbor training tuples. k tuples are randomly selected from the training tuples as the initial nearest-neighbor tuples, the distances from the test tuple to these k tuples are calculated, and the training tuple labels and distances are stored in the priority queue. If the majority of the k most similar (i.e., nearest-neighbor) samples of a test sample in the feature space belong to a certain class, the test sample is determined to belong to that class as well.
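The priority-queue procedure can be sketched with a bounded max-heap, as below. This is an illustrative sketch under assumptions: the divergences from the test sample to all training samples are precomputed, the queue is seeded with the first k training tuples instead of k random ones, and the remaining tuples replace the current farthest neighbor whenever they are closer. The function name `knn_classify` is hypothetical.

```python
import heapq
from collections import Counter

def knn_classify(divergences, train_labels, k):
    """Classify one test sample by majority vote among its k nearest neighbors.

    divergences  : divergences[i] is the KL divergence from the test sample
                   to training sample i.
    train_labels : class label of each training sample.
    """
    heap = []  # max-heap on distance, stored as (-distance, index)
    for idx, dist in enumerate(divergences):
        if len(heap) < k:
            heapq.heappush(heap, (-dist, idx))      # fill the queue with k tuples
        elif dist < -heap[0][0]:
            heapq.heapreplace(heap, (-dist, idx))   # drop the farthest neighbor
    votes = Counter(train_labels[idx] for _, idx in heap)
    return votes.most_common(1)[0][0]

divs = [0.9, 0.1, 0.2, 0.8, 0.3]
labels = ["a", "b", "b", "a", "a"]
predicted = knn_classify(divs, labels, k=3)
```

Keeping the largest distance at the head of the queue makes the comparison against the current farthest neighbor a constant-time operation, so one test sample costs O(n log k) for n training samples.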
In this embodiment, the KL divergence optimization-based data classification system was tested on both 3D object recognition and text classification tasks. As shown in fig. 4, its classification accuracy is higher than that of existing systems.
As shown in figs. 4 to 6, the abscissa indicates that 20%, 30%, 40% and 50% of the data in the original data set are used as training data, and the ordinate indicates the corresponding classification accuracy for each proportion of training data.
The data classification system is labeled KLD-M in the figures. The plots show that it achieves higher classification accuracy than eight other mainstream data classification systems: covariance discriminative learning with partial least squares (CDL_PLS), Bhattacharyya distance (BD), covariance discriminative learning with linear discriminant analysis (CDL_LDA), manifold discriminant analysis (MDA), projection metric learning (PML), Log-Euclidean metric learning (LEML), Earth Mover's Distance (EMD), and conventional KL divergence (KLD).
As illustrated in fig. 5, the abscissa represents the range of λ, and the ordinate represents the corresponding classification accuracy for each value of λ. The graph shows that the KL divergence optimization-based data classification system maintains good performance when this parameter changes, which demonstrates the robustness of the system.
It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the invention can be made without departing from the principles of the invention and these modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.

Claims (7)

1. A KL divergence optimization-based 3D object data classification system, comprising a feature extraction module, a feature whitening module, a multi-view feature modeling module, a training data selection module, a feature mapping module, a multi-view sample similarity calculation module, an optimization module based on KL divergence, and a classification module based on the KL divergence measure under the optimal linear mapping, characterized in that:
the feature extraction module is used for extracting multi-view features of the original data from the original image and text data;
the feature whitening module uniformly projects the multi-view features extracted from the feature extraction module to the same low-dimensional space, performs whitening treatment on the features, reduces redundancy of the multi-view features extracted from the original data, removes correlation among the features of different samples, and reconverts the transformed data back to the original space;
the multi-view feature modeling module is used for modeling and characterizing the multi-view features processed by the feature whitening module;
the training data selection module is used for selecting a certain amount of triples from the labeled training data to perform model training;
after training data is selected, a feature mapping module generates a projection matrix, original data features are mapped to a new feature space, and in the new feature space, the distances between similar samples are reduced, and the distances between different types of samples are increased; then, the multi-view sample similarity calculation module measures the similarity of the multi-view samples by calculating the optimized KL divergence in the new feature space;
an optimization module based on KL divergence, which is used for modeling the optimization problem of KL divergence as a minimization problem on a positive definite matrix group manifold; continuously and repeatedly training the model by utilizing a feature mapping module, a multi-view data similarity calculation module and an optimization module based on KL divergence until convergence, thereby learning the optimal linear mapping;
and the classification module is used for mapping the original data features to a new feature space by using the learned optimal linear mapping based on the KL divergence measurement under the optimal linear mapping, and classifying the test set samples by adopting a K neighbor classifier based on the KL divergence between the test set and the training set.
2. The KL-divergence-optimized 3D object data classification system as recited in claim 1, wherein: modeling each sample as a gaussian distribution, and assuming that both gaussian distributions have the same covariance matrix; for each sample, the multi-view feature is characterized by a mean vector and covariance matrix of the gaussian distribution.
3. The KL-divergence-optimized 3D object data classification system as recited in claim 1, wherein: the triplets include samples belonging to the same class of objects and samples belonging to different classes of objects.
4. A 3D object data classification method implemented by the KL-divergence-optimization-based 3D object data classification system of claim 1, characterized in that: the data classification method comprises the following steps:
step 1, extracting characteristics of original data, namely extracting multi-view characteristics of the original data from the original data comprising images and texts;
step 2, performing whitening treatment on the extracted features, namely uniformly projecting the multi-view features to the same low-dimensional space after extracting the multi-view features from the original data, performing whitening treatment on the features, reducing redundancy of the multi-view features extracted from the original data, removing correlation among the features of different samples, and re-transforming the transformed data back to the original space;
step 3, modeling and characterizing the processed multi-view features;
step 4, selecting a certain amount of triples from the labeled training data as training data, wherein the distribution of the training data is the characteristic distribution formed by modeling the multi-view characteristics in the step 3;
step 5, taking the selected triples as training data, performing feature mapping, namely applying a linear mapping on all mean vectors, mapping the original data features to a new feature space, wherein the distances between similar samples are reduced, and the distances between different types of samples are increased in the new feature space;
step 6, calculating the similarity between the multi-view samples, namely measuring the similarity of the multi-view samples by calculating the optimized KL divergence in the new feature space of the mapping;
step 7, taking the selected triples as training data, and optimizing based on KL divergence; the model is repeatedly trained by utilizing a feature mapping module, a multi-view data similarity calculation module and an optimization module based on KL divergence until convergence, so that the optimal linear mapping is learned;
and 8, mapping the original data features to a new feature space by using the optimal linear mapping learned in the step 7, and classifying the test set samples in the new feature space by adopting a K nearest neighbor classifier based on KL divergence.
5. The 3D object data classification method of claim 4, wherein: in step 7, a positive parameter γ is adopted to balance the influence of same-class data and different-class data, and the parameter γ is set according to the average KL divergence of the entire training dataset.
6. The 3D object data classification method of claim 5, wherein: in step 7, the gradient of the objective function is projected into the tangent space of the manifold, and Riemannian gradient descent is then performed on the manifold of symmetric positive definite matrices given an affine-invariant Riemannian metric, so that the manifold structure of the learned linear mapping is preserved during each iteration of the optimization.
7. The 3D object data classification method of claim 5, wherein: in step 8, calculating the KL divergence of each sample in the test set and each sample in the training set; maintaining a priority queue with a size k from large to small according to the distance, and storing nearest neighbor training tuples; and randomly selecting k tuples from the training tuples as initial nearest neighbor tuples, respectively calculating the distances from the test tuple to the k tuples, and storing the training tuple marks and the distances into a priority queue.
CN201811540690.4A 2018-12-17 2018-12-17 KL divergence optimization-based 3D object data classification system and method Active CN109615014B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811540690.4A CN109615014B (en) 2018-12-17 2018-12-17 KL divergence optimization-based 3D object data classification system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811540690.4A CN109615014B (en) 2018-12-17 2018-12-17 KL divergence optimization-based 3D object data classification system and method

Publications (2)

Publication Number Publication Date
CN109615014A CN109615014A (en) 2019-04-12
CN109615014B true CN109615014B (en) 2023-08-22

Family

ID=66009466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811540690.4A Active CN109615014B (en) 2018-12-17 2018-12-17 KL divergence optimization-based 3D object data classification system and method

Country Status (1)

Country Link
CN (1) CN109615014B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110223275B (en) * 2019-05-28 2020-12-18 陕西师范大学 task-fMRI guided brain white matter fiber deep clustering method
CN110118657B (en) * 2019-06-21 2021-06-11 杭州安脉盛智能技术有限公司 Rolling bearing fault diagnosis method and system based on relative entropy and K nearest neighbor algorithm
CN112149699B (en) * 2019-06-28 2023-09-05 北京京东尚科信息技术有限公司 Method and device for generating model and method and device for identifying image
CN112949296A (en) * 2019-12-10 2021-06-11 医渡云(北京)技术有限公司 Riemann space-based word embedding method and device, medium and equipment
CN111259938B (en) * 2020-01-09 2022-04-12 浙江大学 Manifold learning and gradient lifting model-based image multi-label classification method
CN111738351B (en) * 2020-06-30 2023-12-19 创新奇智(重庆)科技有限公司 Model training method and device, storage medium and electronic equipment
CN113095731B (en) * 2021-05-10 2023-04-18 北京人人云图信息技术有限公司 Flight regulation and control method and system based on passenger flow time sequence clustering optimization
CN113298731A (en) * 2021-05-24 2021-08-24 Oppo广东移动通信有限公司 Image color migration method and device, computer readable medium and electronic equipment
CN113688773B (en) * 2021-09-03 2023-09-26 重庆大学 Storage tank dome displacement data restoration method and device based on deep learning
CN113655385B (en) * 2021-10-19 2022-02-08 深圳市德兰明海科技有限公司 Lithium battery SOC estimation method and device and computer readable storage medium
CN113887661B (en) * 2021-10-25 2022-06-03 济南大学 Image set classification method and system based on representation learning reconstruction residual analysis
CN114882262B (en) * 2022-05-07 2024-01-26 四川大学 Multi-view clustering method and system based on topological manifold
CN114662620B (en) * 2022-05-24 2022-10-21 岚图汽车科技有限公司 Automobile endurance load data processing method and device for market users
CN116687406B (en) * 2023-05-06 2024-01-02 粤港澳大湾区精准医学研究院(广州) Emotion recognition method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105574548A (en) * 2015-12-23 2016-05-11 北京化工大学 Hyperspectral data dimensionality-reduction method based on sparse and low-rank representation graph
CN106126474A (en) * 2016-04-13 2016-11-16 扬州大学 A kind of linear classification method embedded based on local spline
CN106951914A (en) * 2017-02-22 2017-07-14 江苏大学 The Electronic Nose that a kind of Optimization of Fuzzy discriminant vectorses are extracted differentiates vinegar kind method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8046317B2 (en) * 2007-12-31 2011-10-25 Yahoo! Inc. System and method of feature selection for text classification using subspace sampling
US8699789B2 (en) * 2011-09-12 2014-04-15 Xerox Corporation Document classification using multiple views

Also Published As

Publication number Publication date
CN109615014A (en) 2019-04-12

Similar Documents

Publication Publication Date Title
CN109615014B (en) KL divergence optimization-based 3D object data classification system and method
CN107622104B (en) Character image identification and marking method and system
Chen et al. Semi-supervised learning via regularized boosting working on multiple semi-supervised assumptions
CN108446689B (en) Face recognition method
CN113408605B (en) Hyperspectral image semi-supervised classification method based on small sample learning
WO2022126810A1 (en) Text clustering method
CN111652317B (en) Super-parameter image segmentation method based on Bayes deep learning
Yang et al. An ensemble classification algorithm for convolutional neural network based on AdaBoost
Wang et al. Markov topic models
CN110008365B (en) Image processing method, device and equipment and readable storage medium
CN110751027B (en) Pedestrian re-identification method based on deep multi-instance learning
CN106778834A (en) A kind of AP based on distance measure study clusters image labeling method
Chen et al. A saliency map fusion method based on weighted DS evidence theory
CN110830291B (en) Node classification method of heterogeneous information network based on meta-path
CN113674862A (en) Acute renal function injury onset prediction method based on machine learning
Zhi et al. Gray image segmentation based on fuzzy c-means and artificial bee colony optimization
Zhuang et al. A handwritten Chinese character recognition based on convolutional neural network and median filtering
CN110765285A (en) Multimedia information content control method and system based on visual characteristics
CN114358279A (en) Image recognition network model pruning method, device, equipment and storage medium
US11475684B1 (en) Methods and systems for performing noise-resistant computer vision techniques
Zhao et al. Safe semi-supervised classification algorithm combined with active learning sampling strategy
CN115439919B (en) Model updating method, device, equipment, storage medium and program product
Liu et al. Learning implicit labeling-importance and label correlation for multi-label feature selection with streaming labels
CN110717547A (en) Learning algorithm based on regression hypergraph
CN110287973A (en) A kind of image characteristic extracting method based on low-rank robust linear discriminant analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant