WO2000016250A1 - Data decomposition/reduction method for visualizing data clusters/sub-clusters

Publication number
WO2000016250A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
level
clusters
projection
visualization
Application number
PCT/US1999/021363
Other languages
French (fr)
Inventor
Joseph Y. Wang
Original Assignee
The Catholic University Of America
Application filed by The Catholic University Of America
Priority to JP2000570715A (JP2002525719A)
Priority to CA002310333A (CA2310333A1)
Priority to EP99946966A (EP1032918A1)
Priority to AU59262/99A (AU5926299A)
Publication of WO2000016250A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/231 Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/40 Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor

Definitions

  • the data set is subjected to a global principal component analysis to thereafter effect a topmost projection.
  • This step is initiated by determining the value of a variable W for the topmost projection in the hierarchy of projections.
  • W is directly found by evaluating the covariance matrix C_t.
  • APEX adaptive principal components extraction
  • the two-step expectation maximization (EM) algorithm can be applied to allow a standard finite normal mixture model (SFNM), i.e., where
  • the standard finite normal mixture (SFNM) modeling solution addresses the estimation of the regional parameters (π_k, θ_tk) and the detection of the structural parameter K_0 in the relationship
  • the EM algorithm is implemented as a two-step process, i.e., the E-step and the M-step as follows:
  • K a 7K 0 - 1 (i.e., the values of Akaike's Information Criterion (AIC) and the Minimum Description Length (MDL) for K with selection of a model in which K corresponds to the minimum of the
  • EQ. 9 are then used as the initial means of the respective submodels. Since the mixing proportions π_k are projection-invariant, a 2 x 2 unit matrix is assigned to the remaining parameters of the covariance matrix C_tk.
  • the expectation-maximization (EM) algorithm can again be applied to allow a standard finite normal mixture (SFNM) with K_0 submodels to be fitted to the data over t-space.
  • SFNM standard finite normal mixture
  • the corresponding EM algorithm can be derived by replacing all x in the E-step and the M-step equations, above, by t.
  • C_tk can be directly evaluated to obtain W_k as described above.
  • an algorithm termed the probabilistic adaptive principal component extraction (PAPEX) is applied as follows.
  • [PAPEX update equation for the principal-axis weight vector; truncated and garbled in the source text]
  • $a_k(i+1) = a_k(i) - \eta \left[ y_{1k}(i)\, y_{2k}(i) + y_{2k}^{2}(i)\, a_k(i) \right]$
  • W_k the eigenvector associated with the second largest eigenvalue of the covariance matrix C_k.
  • the determination of the parameters of the models at the third level can again be viewed as a two-step estimation problem, in which a further split of the models at the second level is determined within each of the subspaces over x-space, and then the parameters of the selected models are fine-tuned over t-space.
  • the learning of ⁇ k (x) can again be performed using the expectation-maximization (EM) algorithm and the model selection procedures described above.
  • the third-level EM algorithm has the same form as the EM algorithm at the second level, except that in the E-step, the posterior probability that a data point x_i belongs to submodel j is given by
  • EQ. 19 are then used to initialize the means of the respective submodels, and the expectation maximization (EM) algorithm can be applied to allow a standard finite normal mixture (SFNM) distribution with K_0 submodels to be fitted to the data over t-space.
  • the formulation can be derived by simply replacing all x in the second-level M-step by t. With the resulting z_i(k,j) in t-space, the PAPEX algorithm can be applied to estimate W_(k), in which the effective input values are expressed by
  • [EQ. 20: effective input values for the PAPEX algorithm, expressed in terms of z_i(k,j) and t_i; garbled in the source text]
  • the next-level visualization subspace is generated by plotting each data point t_i at the corresponding
  • A first exemplary two-level implementation of the present invention is shown in FIGS. 3, 4A, and 4B, in which the entire data set is present in the top-level projection and two local clusters within that top-level projection are each individually presented in FIGS. 4A and 4B.
  • the entire data set is subjected to principal component analysis as described above to obtain the principal axis or axes (axis A_x being representative) for the top-level display. Additionally, the axis (unnumbered) for each of the apparent clusters is displayed. Thereafter, the apparent centers of the two clusters are identified and the data subjected to the aforementioned processing to further reveal the local cluster of FIG. 4A and the local cluster of FIG. 4B.
  • A second exemplary two-level implementation of the present invention is shown in FIGS. 5, 6A, 6B, and 6C, in which the entire data set is present in the top-level projection and three local clusters within that top-level projection are each individually presented in FIGS. 6A, 6B, and 6C.
  • the entire data set is subjected to principal component analysis as described above to obtain the principal axis (A_x) and the axis (unnumbered) for each of the apparent clusters as displayed.
  • the t-space raw data set arises from a mixture of three Gaussians consisting of 300 data points as presented in FIG. 5.
  • two cloud-like clusters are well separated while a third cluster appears spaced in between the two well-separated cloud-like clusters.
  • the second level visual space is generated with a mixture of two local principal component axis subspaces where the line A x indicates the global principal axis.
  • the plot on the "right" of FIG. 5 shows evidence of further split.
  • a hierarchical model is adopted, which illustrates that there are indeed a total of three clusters within the data set, as shown in FIGS. 6A, 6B, and 6C.
  • An alternate visualization of the process flow of the present invention is shown in FIG.
  • the present invention has use in all applications requiring the analysis of data, particularly multi-dimensional data, for the purpose of optimally visualizing various underlying structures and distributions present within the universe of data. Applications include the detection of data clusters and sub-clusters and their relative relationships in areas of medical, industrial, and geophysical imaging, and digital library processing, for example.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

Higher-dimensionality data is subjected to a hierarchical visualization to allow the complete data set to be visualized in a top-down hierarchy in terms of clusters and sub-clusters at deeper levels. The data set is subjected to standard finite normal mixture models and probabilistic principal component projections, the parameters of which are estimated using expectation-maximization and principal component analysis under the Akaike Information Criterion (AIC) and the Minimum Description Length (MDL) criterion. The high-dimension raw data is processed using principal component analysis to reveal the dominant distribution of the data at a first level. Thereafter, the so-processed information is further processed to reveal sub-clusters within the primary clusters. The various clusters and sub-clusters at the various hierarchical levels are subjected to visual projection to reveal the underlying structure. The inventive schema has utility in all applications in which high-dimensionality multi-variate data is to be reduced to a two- or three-dimensional projection space to allow visual exploration of the underlying structure of the data set.

Description

DATA DECOMPOSITION/REDUCTION METHOD FOR VISUALIZING DATA CLUSTERS/SUB-CLUSTERS
Background Art
The present invention relates generically to the field of data analysis and data presentation and, more particularly, to the analysis of data sets having higher-dimensionality data points in order to optimally present the data in a lower dimensional order context, i.e., in a hierarchy of two- or three-dimensional visual contexts, to reveal data structures within the data set.
The visualization of data sets having a large number of data points with multiple variables or attributes associated with each data point represents a complex problem. In general, there is no way, a priori, to easily identify groups or subgroups of data points that have relational attributes such that structures and sub-structures existing within the data set can be visualized. Various techniques have been developed for processing the data sets to reveal internal structures as an aid to understanding the data. In general, a large data set will oftentimes have data points that are multi-variate, that is, a single data point can have a multitude of attributes, including attributes that are completely independent from one another or have some degree of inter-attribute relationship or dependency.
Any elementary visualization process involving the projection of the data set onto a two-dimensional visualization space using straightforward projection algorithms becomes progressively less adequate as the order of the data points increases. Thus, a single projection of a higher-order data set onto a visualization space may not be able to present all of the structures and sub-structures within the data set of interest in such a way that the structures or sub-structures can be visually distinguished or discriminated.
One form of presentation schema involves hierarchical visualization by which the data set is viewed at a highest-level, whole-data-set viewpoint. Thereafter, features within the highest-level projection are identified in accordance with an algorithm(s) or other identification criteria and those next-highest-level features are further processed to reveal their respective internal structure in another projection(s). This hierarchical process can be repeated for successive levels to present successively finer and more detailed views of the data set. Thus, in a hierarchical visualization scheme, an image tree is provided with the successively lower images of the tree revealing more detail.
One such hierarchical data visualization scheme is disclosed by C. M. Bishop and M. E. Tipping in an article entitled "A Hierarchical Latent Variable Model for Data Visualization," IEEE Trans. Pattern Anal. Machine Intell., Vol. 20, No. 3, pp. 282-293, March 1998. Bishop and Tipping present a hierarchical visualization algorithm based on a two-dimensional hierarchical mixture of latent variable models, the parameters of which are estimated using the expectation-maximization (EM) algorithm. The construction of the hierarchical tree proceeds top down so that structure decomposition is driven interactively by the user and optimal projection is determined by the maximum likelihood principle. A hierarchy of multiple two-dimensional visualization spaces is provided with the top-level projection displaying the entire data set and successive lower-level projections displaying clusters within the data set displayed at the top level. Further lower-level projections display sub-clusters and related internal structures within the data set.
Initially, the data set is subjected by Bishop and Tipping to a form of linear latent variable modelling to find a representation of the multidimensional data set in terms of two latent, or "hidden," variables that are determined indirectly from the data set. The modelling is similar to principal component analysis, but defines a probability density in the data space. In applying the Bishop and Tipping protocol, a single top-level latent variable model is generated with the posterior mean of each data point plotted in the latent space. Any cluster centers identified in this initial plot are used as the basis for initiating the next-lower-level analysis leading to a mixture of the latent variable models.
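As a rough illustration of a top-level linear latent variable plot of this kind, the following minimal sketch uses the well-known closed-form maximum-likelihood solution for probabilistic PCA and computes the posterior mean of each data point in a two-dimensional latent space. The function name `ppca_posterior_means` and the demo data are assumptions made for illustration; this is not the hierarchical algorithm of the cited article.

```python
import numpy as np

def ppca_posterior_means(T, q=2):
    """Closed-form maximum-likelihood PPCA fit and latent posterior means.

    T is an (N, d) array with d > q. Returns the (N, q) posterior means,
    the (d, q) loading matrix W, and the isotropic noise variance sigma2.
    """
    mu = T.mean(axis=0)
    S = np.cov(T, rowvar=False, bias=True)        # ML sample covariance
    vals, vecs = np.linalg.eigh(S)                # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]        # reorder to descending
    sigma2 = vals[q:].mean()                      # average discarded variance
    W = vecs[:, :q] * np.sqrt(np.maximum(vals[:q] - sigma2, 0.0))
    M = W.T @ W + sigma2 * np.eye(q)
    Z = np.linalg.solve(M, W.T @ (T - mu).T).T    # posterior mean of each point
    return Z, W, sigma2

# Synthetic demo data (illustrative only): 200 points in 5 dimensions.
rng = np.random.default_rng(0)
T_demo = rng.normal(size=(200, 5)) * np.array([4.0, 2.0, 0.5, 0.5, 0.5])
Z_demo, W_demo, sigma2_demo = ppca_posterior_means(T_demo)
```

Plotting the rows of `Z_demo` as points in the plane would give the kind of top-level latent-space view described above.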
There are two potential limitations associated with the Bishop and Tipping approach. First, although a probability density is defined in the data space through a latent variable model, the prior and order of the mixture model are heuristically selected and the conditional distribution is undesirably restricted to an isotropic Gaussian, which may misrepresent the true data structures and put the optimality of the formulation in doubt.
Secondly, the parameters, including the optimal projections, are determined by maximum likelihood; this criterion need not always lead to the most interesting or interpretable visualization plots.
Disclosure of Invention
The present invention provides a data decomposition/reduction method for visualizing large sets of multi-variate data including the processing of the multi-variate data down to two- or three-dimensional space in order to optimally reveal otherwise hidden structures within the data set including the principal data cluster or clusters at a first or top level of processing and additional sub-clusters within the principal data clusters in successive lower-level visualizations. The identification of the morphology of clusters and subclusters and inter-cluster separation and relative positioning within a large data set allows investigation of the underlying drive that created the data set morphology and the intra-data-set features.
The data set, constituted by a multitude of data points each having a plurality of attributes, is initially processed as a whole using multiple finite normal mixture models and hierarchical visualization spaces to develop the multi-level data visualization and interpretation. The top-level model and its projection explain the entire data set, revealing the presence of clusters and cluster relationships, while lower-level models and projections display internal structure within individual clusters, such as the presence of subclusters, which might not be apparent in the higher-level models and projections. With many complementary mixture models and visualization projections, each level is relatively simple, yet the complete hierarchy maintains overall flexibility and conveys considerable structural information. The arrangement combines (a) minimax entropy modeling by which the models are determined and various parameters estimated and (b) principal component analysis to optimize structure decomposition and dimensionality reduction.
The present invention advantageously performs a probabilistic principal component analysis to project the softly partitioned data space down to a desired two-dimensional visualization space to lead to an optimal dimensionality reduction allowing the best extraction and visualization of local clusters. The minimax entropy principle is used to select the model structures and estimate their parameter values, where the soft partitioning of the data set results in a standard finite normal mixture model with minimum conditional bias and variance. By performing the principal component analysis and minimax entropy modeling alternately, a complete hierarchy of complementary projections and refined models can be generated automatically, corresponding to a statistical description best fitted to the data.
The present invention treats structure decomposition and dimensionality reduction as two separate but complementary operations, where the criterion used to optimize dimensionality reduction is the separation of clusters rather than the maximum likelihood approach of Bishop and Tipping. The resulting projections, in turn, enhance the performance of structure decomposition at the next lower level.
Thereafter, a model selection procedure is applied to determine the number of subclusters inside each cluster at each level using information-theoretic criteria based upon the minimum of alternate calculations of the Akaike Information Criterion (AIC) and the minimum description length (MDL) criterion. This determination allows the process of the present invention to automatically determine whether a further split of a subspace should be implemented or whether to terminate the further processing.
A probabilistic adaptive principal component extraction (PAPEX) algorithm is also applied to estimate the desired number of principal axes. When the dimensionality of the raw data is high, this PAPEX approach is computationally very efficient. Lastly, the present invention defines a probability distribution in data space which naturally induces a corresponding distribution in projection space through a Radon transform. This defined probability distribution permits an independent procedure in determining values for the intrinsic model parameters without concurrent estimation of the projection mapping matrix.
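One way to read the statement about the induced distribution, assuming a linear projection x = W^T t with a full-column-rank d x 2 matrix W (the symbols f_t and f_x and the exact form of the mapping are assumptions made here for illustration): each normal component of a t-space mixture maps to a normal component in x-space, while the mixing proportions are unchanged by the projection.

```latex
f_t(\mathbf{t}) = \sum_{k=1}^{K_0} \pi_k \,
  \mathcal{N}\!\left(\mathbf{t};\, \boldsymbol{\mu}_{tk},\, \mathbf{C}_{tk}\right)
\quad\Longrightarrow\quad
f_x(\mathbf{x}) = \sum_{k=1}^{K_0} \pi_k \,
  \mathcal{N}\!\left(\mathbf{x};\, \mathbf{W}^{T}\boldsymbol{\mu}_{tk},\,
  \mathbf{W}^{T}\mathbf{C}_{tk}\mathbf{W}\right),
\qquad \mathbf{x} = \mathbf{W}^{T}\mathbf{t}.
```

This follows from the standard result that a linear map of a Gaussian random vector is again Gaussian, with the mean and covariance transformed as shown.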
In many data sets in which the data points are multi-variate, the underlying "drive" that gives rise to the data points often causes the points to form clusters because more than one variable may be a function of that same underlying drive.
In accordance with the present invention and as an initial step in processing the raw data set, the data set (designated herein as the t-space) is projected onto a single x-space (i.e., two-dimensional space), in which a descriptor W is determined from the sample covariance matrix C_t by fitting a single Gaussian model to the data set over t-space.
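The initial step can be sketched as follows, assuming that W is taken as the two leading eigenvectors of the sample covariance matrix and that the data are centered on the sample mean before projection. The function name `top_level_projection` and the synthetic data are illustrative assumptions, not part of the source.

```python
import numpy as np

def top_level_projection(T, n_axes=2):
    """Project the rows of T (N x d, t-space) onto the leading principal axes."""
    mu_t = T.mean(axis=0)                        # mean of the single Gaussian fit
    C_t = np.cov(T, rowvar=False)                # d x d sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C_t)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_axes]   # indices of the largest ones
    W = eigvecs[:, order]                        # d x n_axes projection matrix
    X = (T - mu_t) @ W                           # N x n_axes x-space coordinates
    return X, W, mu_t

# Synthetic demo data (illustrative only): 300 points in 5 dimensions.
rng = np.random.default_rng(0)
T = rng.normal(size=(300, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
X, W, mu_t = top_level_projection(T)
```

A scatter plot of the two columns of `X` would then serve as the top-level view of the whole data set.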
Thereafter, a value f(x) is determined for clusters K = 1, 2, ..., K_max, in which the values of π_k and θ_xk are initialized by the user and estimated by maximizing the likelihood over x-space.
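A minimal sketch of such a maximum-likelihood fit over x-space is given below, using the textbook expectation-maximization updates for a finite normal mixture. The function names, the equal initial mixing proportions, the identity initial covariances, and the small regularization term `reg` are assumptions for illustration; the source's own E-step and M-step equations are not reproduced here.

```python
import numpy as np

def component_densities(X, pis, means, covs):
    """Return the (N, K) array of pi_k * N(x_i; mean_k, cov_k)."""
    N, d = X.shape
    out = np.empty((N, len(pis)))
    for k in range(len(pis)):
        diff = X - means[k]
        maha = np.sum(diff * np.linalg.solve(covs[k], diff.T).T, axis=1)
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(covs[k]))
        out[:, k] = pis[k] * np.exp(-0.5 * maha) / norm
    return out

def fit_sfnm(X, init_means, n_iter=200, reg=1e-6):
    """Fit a K-component normal mixture by EM from user-supplied initial means."""
    X = np.asarray(X, dtype=float)
    N, d = X.shape
    means = np.array(init_means, dtype=float)
    K = means.shape[0]
    pis = np.full(K, 1.0 / K)                  # assumed equal initial proportions
    covs = np.stack([np.eye(d)] * K)           # assumed identity initial covariances
    for _ in range(n_iter):
        dens = component_densities(X, pis, means, covs)
        z = dens / dens.sum(axis=1, keepdims=True)   # E-step: soft memberships
        Nk = z.sum(axis=0)                           # M-step: re-estimate parameters
        pis = Nk / N
        means = (z.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - means[k]
            covs[k] = (z[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(d)
    dens = component_densities(X, pis, means, covs)
    z = dens / dens.sum(axis=1, keepdims=True)
    loglik = float(np.log(dens.sum(axis=1)).sum())
    return pis, means, covs, z, loglik

# Synthetic demo data (illustrative only): two well-separated 2-D clusters.
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal([-4.0, 0.0], 1.0, size=(150, 2)),
                    rng.normal([4.0, 0.0], 1.0, size=(150, 2))])
pis, means, covs, z, loglik = fit_sfnm(X_demo, init_means=[[-1.0, 0.0], [1.0, 0.0]])
```

The returned `z` holds the soft membership of every point in every component, and `loglik` is the quantity a model selection criterion would consume.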
After f(x) is determined, values of the Akaike Information Criterion (AIC) and the minimum description length (MDL) for the various cluster numbers K = 1, 2, ..., K_max are calculated, and a model is selected with a K_0 that corresponds to the minimum of the calculated AIC and MDL values. A value f(t) is then determined for K_0, in which the values of π_k, z_ik, μ_tk, and C_tk are further refined by maximizing the likelihood over t-space. W_k is determined by directly evaluating the covariance matrix C_tk or by learning from t_ik for k = 1, 2, ..., K_0.
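The selection of K_0 can be sketched as below, assuming the common textbook forms AIC = -2 log L + 2 K_a and MDL = -log L + 0.5 K_a log N, with K_a the number of free parameters of a K-component normal mixture with full covariances. These forms, the function names, and the demo log-likelihood values are assumptions for illustration; the source may define the criteria differently.

```python
import math

def n_free_params(K, d):
    """Free parameters: (K - 1) proportions, K*d means, K*d*(d+1)/2 covariances."""
    return (K - 1) + K * d + K * d * (d + 1) // 2

def aic(loglik, K, d):
    return -2.0 * loglik + 2.0 * n_free_params(K, d)

def mdl(loglik, K, d, N):
    return -loglik + 0.5 * n_free_params(K, d) * math.log(N)

def select_order(logliks, d, N):
    """Return the K minimizing AIC and the K minimizing MDL.

    logliks maps each candidate K to the maximized log-likelihood of its fit.
    """
    aic_scores = {K: aic(ll, K, d) for K, ll in logliks.items()}
    mdl_scores = {K: mdl(ll, K, d, N) for K, ll in logliks.items()}
    return min(aic_scores, key=aic_scores.get), min(mdl_scores, key=mdl_scores.get)

# Hypothetical log-likelihoods, made up only to exercise the functions.
demo_logliks = {1: -1400.0, 2: -1250.0, 3: -1180.0, 4: -1176.0}
k_aic, k_mdl = select_order(demo_logliks, d=2, N=300)
```

With these made-up values both criteria pick K = 3: the gain in log-likelihood from a fourth component is too small to offset the penalty on its additional parameters.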
Thereafter, x_ik = W_k^T (t_i - μ_tk) for k = 1, 2, ..., K_0 is plotted by projecting the data set onto multiple x-subspaces at the second level for visual evaluation by the user.
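A second-level projection for one cluster can be sketched as follows, assuming that the cluster mean and covariance are weighted by the soft memberships of the data points and that the two leading eigenvectors of the weighted covariance define the local x-subspace. The function name `local_projection` and the hard 0/1 stand-in memberships in the demo are illustrative assumptions.

```python
import numpy as np

def local_projection(T, z_k, n_axes=2):
    """Project t-space data onto the principal axes of one softly defined cluster.

    T is (N, d); z_k holds the N membership weights of the points in cluster k.
    """
    T = np.asarray(T, dtype=float)
    w = np.asarray(z_k, dtype=float)
    w = w / w.sum()                          # normalized membership weights
    mu_tk = w @ T                            # weighted cluster mean
    diff = T - mu_tk
    C_tk = (w[:, None] * diff).T @ diff      # weighted cluster covariance
    vals, vecs = np.linalg.eigh(C_tk)        # eigenvalues in ascending order
    W_k = vecs[:, ::-1][:, :n_axes]          # leading eigenvectors as columns
    X_k = diff @ W_k                         # local x-subspace coordinates
    return X_k, W_k, mu_tk, C_tk

# Synthetic demo data (illustrative only): two clusters in 4 dimensions.
rng = np.random.default_rng(2)
T_demo = np.vstack([rng.normal(0.0, 1.0, size=(100, 4)),
                    rng.normal(6.0, 1.0, size=(100, 4))])
z_first = np.r_[np.ones(100), np.zeros(100)]   # stand-in memberships for cluster 1
X_k, W_k, mu_tk, C_tk = local_projection(T_demo, z_first)
```

In practice the weights would come from a fitted mixture rather than being set by hand, and one such plot would be produced per cluster.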
Then G_k(t) is determined by repeating the above process steps to thus construct multiple x-subspaces at the third level; the hierarchy is completed under the information-theoretic criteria using the AIC and the MDL, and all x-space subspaces are plotted for visual evaluation.
The present invention advantageously provides a data decomposition/reduction method for visualizing data clusters/sub-clusters within a large data space that is optimally effective and computationally efficient.
Other objects and further scope of applicability of the present invention will become apparent from the detailed description to follow, taken in conjunction with the accompanying drawings, in which like parts are designated by like reference characters.
Brief Description of the Drawings
The present invention is described below, by way of example, with reference to the accompanying drawings, wherein:
FIG. 1 is a schematic block diagram of a system for processing a raw multivariate data set in accordance with the present invention;
FIG. 2 is a flow diagram of the process flow of the present invention;
FIG. 2A is an alternative visualization of the process flow of the present invention;
FIG. 3 is an example of the projection of a data set onto a 2-dimensional visualization space after determination of the principal axis;
FIG. 4A is a 2-dimensional visualization space of one of the clusters of FIG. 3;
FIG. 4B is a 2-dimensional visualization space of another of the clusters of FIG. 3;
FIG. 5 is an example of the projection of a data set onto a 2-dimensional visualization space after determination of the principal axis;
FIG. 6A is a 2-dimensional visualization space of one of the clusters of FIG. 5;
FIG. 6B is a 2-dimensional visualization space of a second of the clusters of FIG. 5; and
FIG. 6C is a 2-dimensional visualization space of a third of the clusters of FIG. 5.
Best Mode for Carrying Out the Invention
A processing system for implementing the dimensionality reduction using probabilistic principal component analysis and structure decomposition using adaptive expectation maximization methods for visualizing data in accordance with the present invention is shown in FIG. 1 and designated generally therein by the reference character 10. As shown, the system 10 includes a working memory 12 that accepts the raw multivariate data set, indicated at 14, and that bi-directionally interfaces with a processor 16. The processor 16 processes the raw t-space data set 14 as explained in more detail below and presents that data to a graphical user interface (GUI) 18, which presents a two- or three-dimensional visual presentation to the user, as also explained below. If desired, a plotter or printer 20 (or other hard copy output device) can be provided to generate a printed record of the display output of the graphical user interface (GUI). The processor 16 may take the form of a software- or firmware-programmed CPU, ALU, ASIC, or microprocessor, or a combination thereof.
As the initial step in processing the raw data, and as presented in FIG. 2, the data set is subjected to a global principal component analysis to effect a top-most projection. This step is initiated by determining the value of a variable W for the top-most projection in the hierarchy of projections. For relatively low dimensional data sets, W is found directly by evaluating the covariance matrix Ct. For higher dimensional data sets, only the top two eigenvectors of the covariance matrix of the data points are of interest; depending upon the dimensionality of the raw data, it may be computationally more efficient to apply the adaptive principal components extraction (APEX) algorithm described in Y. Wang, S. H. Lin, H. Li, and S. Y. Kung, "Data mapping by probabilistic modular networks and information theoretic criteria," IEEE Trans. Signal Processing, Vol. 46, No. 12, pp. 3378-3397, December 1998, to find W directly from the raw data points ti. After the data set is projected and displayed by its principal component axis, and on the basis of this single x-space and given a fixed K, the user then selects or identifies those points μxk on the plot corresponding to the centers of apparent clusters.
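By way of illustration only, the direct evaluation of the covariance matrix Ct for the top-most projection may be sketched in Python as follows. The sketch assumes that the data set is held as an N x d NumPy array and that W consists of the two eigenvectors of Ct having the largest eigenvalues; the function and variable names are hypothetical.

```python
import numpy as np


def top_level_projection(t):
    """Project N x d data onto its two leading principal axes."""
    mu_t = t.mean(axis=0)                      # global mean in t-space
    centered = t - mu_t
    c_t = centered.T @ centered / t.shape[0]   # sample covariance matrix C_t
    eigvals, eigvecs = np.linalg.eigh(c_t)     # eigenvalues in ascending order
    w = eigvecs[:, ::-1][:, :2]                # d x 2, top two eigenvectors
    x = centered @ w                           # N x 2 coordinates in x-space
    return x, w, mu_t
```

The returned array x can be passed to any two-dimensional scatter plot so that the apparent cluster centers can be picked from the display.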
The two-step expectation maximization (EM) algorithm can be applied to allow a standard finite normal mixture (SFNM) model to be fitted, i.e.,
Figure imgf000013_0001
where
$$ g(x \mid \theta_{xk}) = \int g(t \mid \theta_{tk}) \, \delta(x - W^T t + W^T \mu_t) \, dt $$
EQ. 2
and where the log-likelihood of projecting the data under the Radon Transform is
Figure imgf000013_0002
EQ. 3
The standard finite normal mixture (SFNM) modeling solution addresses the estimation of the regional parameters (πk, θtk) and the detection of the structural parameter K0 in the relationship
Figure imgf000013_0003
EQ. 4
based on the observations t. It has been shown that when K0 is given, the maximum likelihood (ML) estimate of the regional parameters can be obtained using the expectation maximization (EM) algorithm. Two observations can be made about the described approach. First, when the dimension of the data space is high, the implementation of the EM algorithm is very complex and time consuming. Second, the initialization of the EM algorithm is chosen heuristically, and this heuristic selection often leads to only a locally optimal solution. Therefore, it is reasonable to estimate the model parameter values first in the projected x-space and then to adjust or fine tune them in the data t-space. One natural criterion for determining the optimal parameter values is to minimize the distance between the standard finite normal mixture (SFNM) distribution f(x) and the data histogram fx. Relative entropy (Kullback-Leibler distance), as suggested by information theory, is a suitable measure of the reconstruction error, given by:
Figure imgf000014_0001
EQ. 5
When relative entropy is used as a distance measure, distance minimization is equivalent to maximum likelihood estimation, summarized by
$$ \mathcal{L} = \exp\left(-N\left[H(f_x) + D(f_x \,\|\, f)\right]\right) $$
EQ. 6
where H is the entropy calculator described by Y. Wang, S. H. Lin, H. Li, and S. Y. Kung in "Data mapping by probabilistic modular networks and information theoretic criteria," IEEE Trans. Signal Processing, Vol. 46, No. 12, pp. 3378-3397, December 1998.
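By way of illustration only, the relationship between the relative entropy and the likelihood may be sketched in Python for the discrete case, in which both the data histogram and the model are evaluated on the same set of bins. The binning, the function names, and the requirement that the model be positive wherever the histogram is positive are assumptions of this sketch.

```python
import numpy as np


def entropy(hist):
    """Shannon entropy H(hist) of a discrete distribution, in nats."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def relative_entropy(hist, model):
    """Kullback-Leibler distance D(hist || model) over common bins."""
    p = np.asarray(hist, dtype=float)
    q = np.asarray(model, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0   # bins with no histogram mass contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def log_likelihood(hist, model, n_samples):
    """Log of the likelihood written as exp(-N [H + D])."""
    return -n_samples * (entropy(hist) + relative_entropy(hist, model))
```

Because H(fx) does not depend on the model, lowering the relative entropy in this sketch always raises the log-likelihood, which is the equivalence stated above.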
The EM algorithm is implemented as a two-step process, i.e., the E-step and the M-step, as follows:
E-step:
$$ z_{ik}^{(n)} = \frac{\pi_k^{(n)} \, g(x_i \mid \theta_{xk}^{(n)})}{f(x_i \mid \pi^{(n)}, \theta_x^{(n)})} $$
EQ. 7
and the M-step:
Figure imgf000015_0001
$$ C_{xk}^{(n+1)} = \frac{\sum_{i=1}^{N} z_{ik}^{(n)} \, (x_i - \mu_{xk}^{(n+1)})(x_i - \mu_{xk}^{(n+1)})^T}{\sum_{i=1}^{N} z_{ik}^{(n)}} $$
EQ. 8C
In each complete processing cycle, the previous set of parameter values is used to determine the posterior probabilities zik using the E-step equation. These posterior probabilities are then used to obtain the new set of values πk, μxk, and Cxk using the appropriate M-step equations. The processing is continued until a minimum in the value of the relative entropy
Figure imgf000015_0002
is identified. This model selection criterion will determine the optimal number K0 unless it is already at a local minimum. The model selection procedure will then determine the optimal number K0 of models to fit at the next level down in the hierarchy using the two information theoretic criteria, where Ka = 7K0 - 1 (i.e., the values of Akaike's Information Criterion (AIC) and the Minimum Description Length (MDL) are calculated for each K, and the model is selected in which K corresponds to the minimum of the AIC and the MDL). The resulting points μtk(0) in data space, obtained by
Figure imgf000016_0001
EQ. 9
are then used as the initial means of the respective submodels. Since the mixing proportions π are projection-invariant, a 2 x 2 unit matrix is assigned to the remaining parameters of the covariance matrix Ctk. The expectation-maximization (EM) algorithm can again be applied to allow a standard finite normal mixture (SFNM) with K0 submodels to be fitted to the data over t-space.
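By way of illustration only, the E-step and M-step cycle described above may be sketched in Python for a Gaussian mixture. The sketch assumes the standard update equations for a finite normal mixture, equal initial mixing proportions, unit initial covariance matrices, and a fixed number of cycles in place of the relative entropy stopping test; all function and variable names are hypothetical.

```python
import numpy as np


def weighted_densities(x, pis, means, covs):
    """Return the N x K array of pi_k * g(x_i | mu_k, C_k)."""
    n, d = x.shape
    out = np.empty((n, len(pis)))
    for k in range(len(pis)):
        diff = x - means[k]
        inv = np.linalg.inv(covs[k])
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(covs[k]))
        quad = np.einsum("ni,ij,nj->n", diff, inv, diff)
        out[:, k] = pis[k] * np.exp(-0.5 * quad) / norm
    return out


def em_fit(x, init_means, n_iter=50):
    """Fit a K-component Gaussian mixture from user-selected centers."""
    n, d = x.shape
    means = np.array(init_means, dtype=float)
    k = means.shape[0]
    pis = np.full(k, 1.0 / k)          # assumed equal initial proportions
    covs = np.stack([np.eye(d)] * k)   # unit initial covariance matrices
    for _ in range(n_iter):
        # E-step: posterior probability z_ik of component k for point i.
        dens = weighted_densities(x, pis, means, covs)
        z = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate proportions, means, and covariances.
        nk = z.sum(axis=0)
        pis = nk / n
        means = (z.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - means[j]
            covs[j] = (z[:, j, None] * diff).T @ diff / nk[j]
    dens = weighted_densities(x, pis, means, covs)
    z = dens / dens.sum(axis=1, keepdims=True)
    loglik = float(np.log(dens.sum(axis=1)).sum())
    return pis, means, covs, z, loglik
```

The same routine applies unchanged to t-space data, since only the dimension d of the input array differs; the returned log-likelihood can be supplied to the AIC and MDL calculations for each candidate K.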
The corresponding EM algorithm can be derived by replacing all x in the E-step and the M-step equations, above, by t. With a soft partitioning of the data set to generate possible models for the next level projection using the EM algorithm, data points will now effectively belong to more than one cluster at any given level. Thus, the effective input values are tik = zik(ti - μtk) for an independent visualization subspace k in the hierarchy. Ctk can be directly evaluated to obtain Wk as described above. However, when the determination of Wk is based on a neural network learning scheme, an algorithm termed probabilistic adaptive principal component extraction (PAPEX) is applied as follows.
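By way of illustration only, the direct evaluation of Wk from the effective input values may be sketched in Python as follows. The sketch assumes that the covariance of the effective inputs is normalized by the sum of the posterior probabilities (the normalization does not change the eigenvectors) and that Wk consists of its two leading eigenvectors; the function and variable names are hypothetical.

```python
import numpy as np


def local_principal_axes(t, z_k, mu_tk):
    """Return W_k and the subspace coordinates for one cluster k.

    t is the N x d data array, z_k holds the N posterior probabilities
    z_ik for cluster k, and mu_tk is the cluster mean in t-space.
    """
    t_ik = z_k[:, None] * (t - mu_tk)        # effective inputs z_ik (t_i - mu_tk)
    c_tk = t_ik.T @ t_ik / z_k.sum()         # assumed normalization
    eigvals, eigvecs = np.linalg.eigh(c_tk)  # eigenvalues in ascending order
    w_k = eigvecs[:, ::-1][:, :2]            # d x 2 local principal axes
    x_ik = t_ik @ w_k                        # N x 2 coordinates in subspace k
    return w_k, x_ik
```

A point whose posterior probability for cluster k is near zero is mapped near the origin of subspace k, so in practice it contributes visibly only to the subspace of the cluster to which it mostly belongs.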
The feedforward weight vector wmk and the feedback weight vector ak are initialized to small random values at time i = 1, and a small positive value is assigned to the learning rate parameter η. For m = 1 and for i = 1, 2, ..., the value of
Figure imgf000017_0001
EQ. 10
is computed where
$$ w_{1k}(i+1) = w_{1k}(i) + \eta \left[ y_{1k}(i)\, t_{ik} - y_{1k}^2(i)\, w_{1k}(i) \right] $$
EQ. 11
For large values of i, w1k(i) → w1k, where w1k is the eigenvector associated with the largest eigenvalue of the covariance matrix Ck. Thereafter m is set equal to 2 and for i = 1, 2, ..., the following values are computed:
$$ y_{2k}(i) = w_{2k}^T(i)\, t_{ik} + a_k(i)\, y_{1k}(i) $$
EQ. 12
$$ w_{2k}(i+1) = w_{2k}(i) + \eta \left[ y_{2k}(i)\, t_{ik} - y_{2k}^2(i)\, w_{2k}(i) \right] $$
EQ. 13
and
$$ a_k(i+1) = a_k(i) - \eta \left[ y_{1k}(i)\, y_{2k}(i) + y_{2k}^2(i)\, a_k(i) \right] $$
EQ. 14
For large values of i, w2k(i) → w2k, where w2k is the eigenvector associated with the second largest eigenvalue of the covariance matrix Ck. Having determined the principal axes Wk of the mixture model at the second level, the visualization subspaces are then constructed by plotting each data point ti at the corresponding xik for k = 1, 2, ..., K0. Thus, if one particular point takes most of the contribution for a particular component, then that point will effectively be visible only on the corresponding subspace.
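By way of illustration only, the two-stage learning of the first two principal axes may be sketched in Python as follows. The sketch assumes the standard APEX update rules (an Oja-type feedforward update for each axis and an anti-Hebbian update for the single lateral weight), assumes that the effective input values have already been formed and centered, and uses a learning rate and a number of passes chosen only for this example; the function and variable names are hypothetical.

```python
import numpy as np


def apex_two_axes(t_k, eta=0.001, epochs=20, seed=0):
    """Learn two principal axes from the rows of t_k by APEX-style updates."""
    rng = np.random.default_rng(seed)
    d = t_k.shape[1]
    w1 = 0.01 * rng.standard_normal(d)   # feedforward weights, first axis
    w2 = 0.01 * rng.standard_normal(d)   # feedforward weights, second axis
    a = 0.01 * rng.standard_normal()     # lateral (feedback) weight
    # Stage m = 1: Oja-type rule for the leading axis.
    for _ in range(epochs):
        for t in t_k:
            y1 = w1 @ t
            w1 += eta * (y1 * t - y1 ** 2 * w1)
    # Stage m = 2: second axis, decorrelated from the first output.
    for _ in range(epochs):
        for t in t_k:
            y1 = w1 @ t
            y2 = w2 @ t + a * y1
            w2 += eta * (y2 * t - y2 ** 2 * w2)
            a -= eta * (y1 * y2 + y2 ** 2 * a)
    return np.column_stack([w1, w2])
```

The learned columns are only approximately orthonormal after a finite number of passes, so they may be normalized before being used as projection axes.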
The determination of the parameters of the models at the third level can again be viewed as a two-step estimation problem, in which a further split of the models at the second level is determined within each of the subspaces over x-space, and then the parameters of the selected models are fine tuned over t-space. Based on the plot of xik, the learning of ςk(x) can again be performed using the expectation-maximization (EM) algorithm and the model selection procedures described above. The third level EM algorithm has the same form as the EM algorithm at the second level, except that in the E-step, the posterior probability that a data point xi belongs to submodel j is given by
$$ z_{i(k,j)} = z_{ik} \, \frac{\pi_{j|k} \, g(x_i \mid \theta_{x(k,j)})}{\sum_{j'} \pi_{j'|k} \, g(x_i \mid \theta_{x(k,j')})} $$
EQ. 15
where zik are constants estimated from the second level of the hierarchy. The corresponding M-step in the expectation maximization algorithm is then given by
Figure imgf000019_0001
Similarly, the resulting points in data space
$$ \mu_{t(k,j)}(0) = W_k \, \mu_{x(k,j)} + \mu_{tk} $$
EQ. 19
are then used to initialize the means of the respective submodels, and the expectation maximization (EM) algorithm can be applied to allow a standard finite normal mixture (SFNM) distribution with K0 submodels to be fitted to the data over t-space. The formulation can be derived by simply replacing all x in the second level M-step by t. With the resulting zi(k,j) in t-space, the PAPEX algorithm can be applied to estimate W(k,j), in which the effective input values are expressed by
$$ t_{i(k,j)} = z_{i(k,j)} \, (t_i - \mu_{t(k,j)}) $$
EQ. 20
The next level visualization subspace is generated by plotting each data point ti at the corresponding
$$ x_{i(k,j)} = z_{i(k,j)} \, W_{(k,j)}^T \, (t_i - \mu_{t(k,j)}) $$
EQ. 21
values in the (k,j) subspace.
The construction of the entire tree structure hierarchy is automatically completed, with the flow diagram of FIG. 2 ending when no further data split is recommended by the information theoretic criteria (AIC and MDL) in all of the parent subspaces.
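By way of illustration only, the termination rule for the tree construction may be sketched in Python as a recursion. The two callbacks stand in for the model selection and the splitting steps and are hypothetical, as is the depth limit, which is added in this sketch only as a safeguard.

```python
def build_hierarchy(node, select_k, split, depth=0, max_depth=5):
    """Recursively split a subspace until no further split is recommended.

    select_k(node) returns the number of clusters recommended for the
    subspace (for example by AIC and MDL); split(node, k) returns the k
    child subspaces. Both callbacks are placeholders.
    """
    k0 = select_k(node)
    if k0 <= 1 or depth >= max_depth:
        return {"node": node, "children": []}   # leaf: no split recommended
    children = [
        build_hierarchy(child, select_k, split, depth + 1, max_depth)
        for child in split(node, k0)
    ]
    return {"node": node, "children": children}


def leaves(tree):
    """List the leaf subspaces of the hierarchy from left to right."""
    if not tree["children"]:
        return [tree["node"]]
    return [leaf for child in tree["children"] for leaf in leaves(child)]
```

For example, if two clusters are recommended at the top level and only the second of them is recommended for a further split into two, the recursion ends with three leaf subspaces.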
A first exemplary two-level implementation of the present invention is shown in FIGS. 3, 4A, and 4B, in which the entire data set is present in the top level projection and two local clusters within that top level projection are each individually presented in FIGS. 4A and 4B. As shown in FIG. 3, the entire data set is subject to principal component analysis as described above to obtain the principal axis or axes (axis Ax being representative) for the top level display. Additionally, the axis (unnumbered) for each of the apparent clusters is displayed. Thereafter, the apparent centers of the two clusters are identified and the data subject to the aforementioned processing to further reveal the local cluster of FIG. 4A and the local cluster of FIG. 4B.
A second exemplary two-level implementation of the present invention is shown in FIGS. 5, 6A, 6B, and 6C, in which the entire data set is present in the top level projection and three local clusters within that top level projection are each individually presented in FIGS. 6A, 6B, and 6C. As shown in FIG. 5, the entire data set is subject to principal component analysis as described above to obtain the principal axis (Ax) and the axis (unnumbered) for each of the apparent clusters as displayed. The t-space raw data set arises from a mixture of three Gaussians consisting of 300 data points as presented in FIG. 5. As shown, two cloud-like clusters are well separated while a third cluster appears spaced in between the two well-separated cloud-like clusters. By performing the same operations as described above, the second level visual space is generated with a mixture of two local principal component axis subspaces where the line Ax indicates the global principal axis. When the two information theoretic criteria (AIC and MDL) are applied to examine these two cluster plots, the plot on the "right" of FIG. 5 shows evidence of a further split. At the third level of data modeling, a hierarchical model is adopted, which illustrates that there are indeed a total of three clusters within the data set, as shown in FIGS. 6A, 6B, and 6C.
An alternate visualization of the process flow of the present invention is shown in FIG. 2A, in which the data is input and structured, and the high-level data set is then subject to algorithmic processing to iteratively effect the data structure decomposition, dimensionality reduction, and multiple model selection using the AIC/MDL criteria and to effect a best fit for the next subsequent projection. Thereafter, extraction by the above-described probabilistic adaptive principal component processing and the Radon transform is effected to generate the data cluster visualizations.
Industrial Applicability
The present invention has use in all applications requiring the analysis of data, particularly multi-dimensional data, for the purpose of optimally visualizing various underlying structures and distributions present within the universe of data. Applications include the detection of data clusters and sub-clusters and their relative relationships in areas of medical, industrial, geophysical imaging, and digital library processing, for example.
As will be apparent to those skilled in the art, various changes and modifications may be made to the illustrated data decomposition/reduction method for visualizing data clusters/sub-clusters of the present invention without departing from the spirit and scope of the invention as determined in the appended claims and their legal equivalent.
CROSS REFERENCE TO UNITED STATES PROVISIONAL PATENT APPLICATION
This application claims the benefit of the filing date of co-pending U.S. Provisional Patent Application No. 60/100,622 filed on September 17, 1998 by the same inventor herein and entitled "Hierarchical Minimax Entropy Modeling and Visualization for Data Representation and Analysis/Interpretation," the disclosure of which is incorporated herein by reference.

Claims

1. A method of processing a data set of a multitude of data points each having a dimensionality greater than at least three to provide a hierarchy of visualizations of the data set into an at least two-dimensional space including a top-level visualization and at least one second level visualization presenting at least one cluster K of the top-level visualized therein, comprising the steps of: providing, as the top-level visualization, a reduced dimension projection of the entire data set along at least a principal axis into an at least two-dimensional visualization space in which the dimensionality of the projected data set is reduced by principal component analysis of the data set to obtain a principal component projection axis; selecting at least one point on said first-mentioned visualization space corresponding to centers of apparent clusters; developing an optimum number of possible models for a second level projection; determining the optimum number of local clusters K for the second level projection by alternately calculating the Akaike information criteria and the minimum description length and using the minimum of the Akaike information criterion and minimum description length to determine the optimum number of local clusters K; determining the principal axes for visualization subspaces for the so-determined local clusters and projecting the data for at least one of the so-determined local clusters in a visualization space different from the first-mentioned visualization space.
2. The method of claim 1, wherein the principal component for the top-level projection is determined by directly evaluating the covariance matrix.
3. The method of claim 1, wherein the principal component for the top-level projection is determined by adaptive principal components extraction.
4. The method of claim 1, wherein the plurality of possible models for the next-to-top level projection are developed by successive cycles of the E-step and the M-step of expectation-maximization algorithm until a minimum relative entropy is attained.
5. A method of optimally processing a large data set of high dimensional (>3) data points to provide, by dimensional reduction, cluster analysis, and two-dimensional surface projection of a hierarchy of visual displays for the purpose of discerning data information relationships therein, comprising the steps of: a. providing, through principal component analysis, a top level visualization as a projection of the entire data set onto a two-dimensional visualization space defined by its principal component projection axis; b. selection by algorithm of an initial best estimate of data points on said first-mentioned visualization space corresponding to centers of apparent clusters; c. developing an optimal number of possible models for a second level projection; d. determining the optimal number of local clusters for the second level projections by calculating the minimum of the AIC or the MDL to determine the optimum number of second level clusters; e. determining the principal component axis of each second level cluster for projection onto respective two-dimensional subspaces for display visualization; and f. repeating steps c, d, and e until no further data point clusters are algorithmically detectable.
6. A method of optimally processing a large data set of high dimensional (>3) data points to provide, by dimensional reduction, cluster analysis and two-dimensional surface projection of a hierarchy of visual displays for the purpose of discerning data information relationships therein, comprising the steps of: a. providing, through principal component analysis, a top level visualization as a projection of the entire data set onto a two-dimensional visualization space defined by its principal component projection axis; b. heuristically selecting, from multiple competing choices, the initial best estimates of data points on said first-mentioned visualization space corresponding to centers of apparent clusters; c. developing an optimal number of possible models for a second level projection; d. determining the optimal number of local clusters for the second level projections by calculating the minimum of the AIC or the MDL to determine the optimum number of second level clusters; e. determining the principal component axis of each second level cluster for projection onto respective two-dimensional subspaces for display visualization; and f. repeating steps c, d, and e until no further data point clusters are heuristically detectable.
7. A system for processing a data set of a multitude of data points each having a dimensionality greater than at least three to provide a hierarchy of visualizations of the data set into an at least two-dimensional space including a top-level visualization and at least one second level visualization presenting at least one cluster K of the top-level visualized therein, characterized by: a processor having a cooperating memory containing a data set of a multitude of data points each having a dimensionality greater than at least three; a display for presenting one or more visualizations of the data set as processed by the processor; the processor providing, as the top-level visualization on the display, a reduced dimension projection of the entire data set along at least a principal axis into an at least two-dimensional visualization space in which the dimensionality of the projected data set is reduced by principal component analysis of the data set to obtain a principal component projection axis; the processor selecting at least one point on said first-mentioned visualization space corresponding to centers of apparent clusters; the processor thereafter developing an optimum number of possible models for a second level projection; the processor determining the optimum number of local clusters K for the second level projection by alternately calculating the Akaike information criteria and the minimum description length and using the minimum of the Akaike information criterion and minimum description length to determine the optimum number of local clusters K; the processor determining the principal axes for visualization subspaces for the so-determined local clusters and projecting the data for at least one of the so-determined local clusters in a visualization space on the display different from the first-mentioned visualization space.
8. The system of claim 7, wherein the principal component for the top-level projection is determined by directly evaluating the covariance matrix.
9. The system of claim 8, wherein the principal component for the top-level projection is determined by adaptive principal components extraction.
10. The method of claim 7, wherein the plurality of possible models for the next-to-top level projection are developed by successive cycles of the E-step and the M-step of expectation-maximization algorithm until a minimum relative entropy is attained.
11. A system for optimally processing a large data set of high dimensional (>3) data points to provide, by dimensional reduction, cluster analysis, and two-dimensional surface projection of a hierarchy of visual displays for the purpose of discerning data information relationships therein, characterized by: a processor having a cooperating memory containing a data set of a multitude of data points each having a dimensionality greater than at least three and a display for presenting one or more visualizations of the data set as processed by the processor; the processor, through principal component analysis, providing a top level visualization as a projection of the entire data set onto a two-dimensional visualization space in the display and defined by its principal component projection axis; the processor selecting by algorithm an initial best estimate of data points on said first-mentioned visualization space corresponding to centers of apparent clusters; the processor developing an optimal number of possible models for a second level projection; the processor determining the optimal number of local clusters for the second level projections by calculating the minimum of the AIC or the MDL to determine the optimum number of second level clusters; the processor determining the principal component axis of each second level cluster for projection onto respective two-dimensional subspaces for visualization on the display.
12. A system for optimally processing a large data set of high dimensional (>3) data points to provide, by dimensional reduction, cluster analysis and two-dimensional surface projection of a hierarchy of visual displays for the purpose of discerning data information relationships therein, characterized by: a processor having a cooperating memory containing a data set of a multitude of data points each having a dimensionality greater than at least three and a display for presenting one or more visualizations of the data set as processed by the processor; the processor effecting a principal component analysis of the data to provide a top level visualization as a projection of the entire data set onto a two-dimensional visualization space of the display and defined by its principal component projection axis; the processor, in response to a heuristic selection entered by a user, providing an initial best estimate of data points on said first-mentioned visualization space corresponding to centers of apparent clusters; the processor thereafter developing an optimal number of possible models for a second level projection; the processor determining the optimal number of local clusters for the second level projections by calculating the minimum of the AIC or the MDL to determine the optimum number of second level clusters; and the processor determining the principal component axis of each second level cluster for projection onto respective two-dimensional subspaces for display visualization by the display.
13. A computer automated process for generating a hierarchy of minimax entropy models and optimum visualization projections for high dimensional space data to improve data representation and interpretation, characterized by: structurally decomposing a high dimensional space data utilizing minimax entropy principles to develop a statistical framework for model identification to an optional number and kernel shape of local clusters from said data; dimensionally reducing said high dimensional space data by combining minimax entropy principles and principal component analysis to optimize data structure decomposition; iteratively and separately performing principal component analysis and minimax entropy model identification to generate a hierarchy of complementary projections and models to develop an intrinsic model to best-fit the high dimensional space data; and creating a substantially reduced dimensional visualization space to facilitate better data representation and interpretation of said data.
PCT/US1999/021363 1998-09-17 1999-09-17 Data decomposition/reduction method for visualizing data clusters/sub-clusters WO2000016250A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2000570715A JP2002525719A (en) 1998-09-17 1999-09-17 Data decomposition / reduction method for visualizing data clusters / subclusters
CA002310333A CA2310333A1 (en) 1998-09-17 1999-09-17 Data decomposition/reduction method for visualizing data clusters/sub-clusters
EP99946966A EP1032918A1 (en) 1998-09-17 1999-09-17 Data decomposition/reduction method for visualizing data clusters/sub-clusters
AU59262/99A AU5926299A (en) 1998-09-17 1999-09-17 Data decomposition/reduction method for visualizing data clusters/sub-clusters

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US10062298P 1998-09-17 1998-09-17
US60/100,622 1998-09-17
US39842199A 1999-09-17 1999-09-17
US09/398,421 1999-09-17

Publications (1)

Publication Number Publication Date
WO2000016250A1 true WO2000016250A1 (en) 2000-03-23

Family

ID=26797375

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/021363 WO2000016250A1 (en) 1998-09-17 1999-09-17 Data decomposition/reduction method for visualizing data clusters/sub-clusters

Country Status (5)

Country Link
EP (1) EP1032918A1 (en)
JP (1) JP2002525719A (en)
AU (1) AU5926299A (en)
CA (1) CA2310333A1 (en)
WO (1) WO2000016250A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4670010B2 (en) * 2005-10-17 2011-04-13 株式会社国際電気通信基礎技術研究所 Mobile object distribution estimation device, mobile object distribution estimation method, and mobile object distribution estimation program
US8239379B2 (en) * 2007-07-13 2012-08-07 Xerox Corporation Semi-supervised visual clustering
US20090232388A1 (en) * 2008-03-12 2009-09-17 Harris Corporation Registration of 3d point cloud data by creation of filtered density images
JP5332647B2 (en) * 2009-01-23 2013-11-06 日本電気株式会社 Model selection apparatus, model selection apparatus selection method, and program
JP6586764B2 (en) * 2015-04-17 2019-10-09 株式会社Ihi Data analysis apparatus and data analysis method
US11847132B2 (en) 2019-09-03 2023-12-19 International Business Machines Corporation Visualization and exploration of probabilistic models

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
AKAIKE H: "A NEW LOOK AT THE STATISTICAL MODEL IDENTIFICATION", IEEE TRANSACTIONS ON AUTOMATIC CONTROL,US,IEEE INC. NEW YORK, vol. AC-19, no. 6, December 1974 (1974-12-01), pages 716-723, XP000675871, ISSN: 0018-9286 *
ANONYMOUS: "Data Preprocessing With Clustering Algorithms.", IBM TECHNICAL DISCLOSURE BULLETIN, vol. 33, no. 10B, March 1991 (1991-03-01), New York, US, pages 26 - 27, XP000109861 *
ANONYMOUS: "Multivariate Statistical Data Reduction Method", IBM TECHNICAL DISCLOSURE BULLETIN, vol. 36, no. 4, April 1993 (1993-04-01), New York, US, pages 181 - 184, XP000364481 *
BISHOP C M ET AL: "A HIERARCHICAL LATENT VARIABLE MODEL FOR DATA VISUALIZATION", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,US,IEEE INC. NEW YORK, vol. 20, no. 3, March 1998 (1998-03-01), pages 281-293, XP000767918, ISSN: 0162-8828 *
CHATTERJEE C ET AL: "ON SELF-ORGANIZING ALGORITHMS AND NETWORKS FOR CLASS-SEPARABILITY FEATURES", IEEE TRANSACTIONS ON NEURAL NETWORKS,US,IEEE INC, NEW YORK, vol. 8, no. 3, May 1997 (1997-05-01), pages 663-678, XP000656917, ISSN: 1045-9227 *
JIANCHANG MAO ET AL: "ARTIFICIAL NEURAL NETWORKS FOR FEATURE EXTRACTION AND MULTIVARIATE DATA PROJECTION", IEEE TRANSACTIONS ON NEURAL NETWORKS,US,IEEE INC, NEW YORK, vol. 6, no. 2, 2 March 1995 (1995-03-02), pages 296-316, XP000492664, ISSN: 1045-9227 *
KUNG S Y ET AL: "ADAPTIVE PRINCIPAL COMPONENT EXTRACTION (APEX) AND APPLICATIONS", IEEE TRANSACTIONS ON SIGNAL PROCESSING,US,IEEE, INC. NEW YORK, vol. 42, no. 5, May 1994 (1994-05-01), pages 1202-1216, XP000460366, ISSN: 1053-587X *
PAO Y -H ET AL: "Visualization of pattern data through learning of non-linear variance-conserving dimension-reduction mapping", PATTERN RECOGNITION,US,PERGAMON PRESS INC. ELMSFORD, N.Y, vol. 30, no. 10, 1 October 1997 (1997-10-01), pages 1705-1717, XP004094254, ISSN: 0031-3203 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7440986B2 (en) 2003-12-17 2008-10-21 Internatioanl Business Machines Corporation Method for estimating storage requirements for a multi-dimensional clustering data configuration
US7912798B2 (en) 2003-12-17 2011-03-22 International Business Machines Corporation System for estimating storage requirements for a multi-dimensional clustering data configuration
US9424337B2 (en) 2013-07-09 2016-08-23 Sas Institute Inc. Number of clusters estimation
US9202178B2 (en) 2014-03-11 2015-12-01 Sas Institute Inc. Computerized cluster analysis framework for decorrelated cluster identification in datasets
CN105447001A (en) * 2014-08-04 2016-03-30 华为技术有限公司 Dimensionality reduction method and device for high dimensional data
US9996543B2 (en) 2016-01-06 2018-06-12 International Business Machines Corporation Compression and optimization of a specified schema that performs analytics on data within data systems
CN110287978A (en) * 2018-03-19 2019-09-27 国际商业机器公司 Computer-implemented method and computer system for supervised machine learning
CN110287978B (en) * 2018-03-19 2023-04-25 国际商业机器公司 Computer-implemented method and computer system for supervised machine learning

Also Published As

Publication number Publication date
AU5926299A (en) 2000-04-03
EP1032918A1 (en) 2000-09-06
JP2002525719A (en) 2002-08-13
CA2310333A1 (en) 2000-03-23

Similar Documents

Publication Publication Date Title
Stanford et al. Finding curvilinear features in spatial point patterns: principal curve clustering with noise
Tirandaz et al. A two-phase algorithm based on kurtosis curvelet energy and unsupervised spectral regression for segmentation of SAR images
Clausi K-means Iterative Fisher (KIF) unsupervised clustering algorithm applied to image texture segmentation
Ugarriza et al. Automatic image segmentation by dynamic region growth and multiresolution merging
Keuchel et al. Binary partitioning, perceptual grouping, and restoration with semidefinite programming
Seetharaman et al. Texture characterization, representation, description, and classification based on full range Gaussian Markov random field model with Bayesian approach
WO2000016250A1 (en) Data decomposition/reduction method for visualizing data clusters/sub-clusters
Krasnoshchekov et al. Order-k α-hulls and α-shapes
Allassonniere et al. A stochastic algorithm for probabilistic independent component analysis
Tsuchie et al. High-quality vertex clustering for surface mesh segmentation using Student-t mixture model
Bergamasco et al. A graph-based technique for semi-supervised segmentation of 3D surfaces
Lavoué et al. Markov Random Fields for Improving 3D Mesh Analysis and Segmentation.
AlZu'bi et al. 3D medical volume segmentation using hybrid multiresolution statistical approaches
Blanchet et al. Triplet Markov fields for the classification of complex structure data
Vilalta et al. An efficient approach to external cluster assessment with an application to martian topography
Huang et al. Texture classification by multi-model feature integration using Bayesian networks
Kouritzin et al. A graph theoretic approach to simulation and classification
Gehre et al. Feature Curve Co‐Completion in Noisy Data
Li et al. High resolution radar data fusion based on clustering algorithm
Marras et al. 3D geometric split–merge segmentation of brain MRI datasets
Guizilini et al. Iterative continuous convolution for 3d template matching and global localization
Huang et al. Image segmentation using an efficient rotationally invariant 3D region-based hidden Markov model
Roy et al. A finite mixture model based on pair-copula construction of multivariate distributions and its application to color image segmentation
Li Unsupervised texture segmentation using multiresolution Markov random fields
Dokur et al. Segmentation of medical images by using wavelet transform and incremental self-organizing map

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE Wipo information: entry into national phase

Ref document number: 1999946966

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2310333

Country of ref document: CA

Ref country code: CA

Ref document number: 2310333

Kind code of ref document: A

Format of ref document f/p: F

121 EP: The EPO has been informed by WIPO that EP was designated in this application
ENP Entry into the national phase

Ref country code: JP

Ref document number: 2000 570715

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 59262/99

Country of ref document: AU

WWP Wipo information: published in national office

Ref document number: 1999946966

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 1999946966

Country of ref document: EP