WO2005122065A1 - System and method for elimination of irrelevant and redundant features to improve cad performance - Google Patents

System and method for elimination of irrelevant and redundant features to improve cad performance

Info

Publication number
WO2005122065A1
WO2005122065A1 (PCT/US2005/019116)
Authority
WO
WIPO (PCT)
Prior art keywords
feature set
determining
reduced
vector
discriminant
Prior art date
Application number
PCT/US2005/019116
Other languages
French (fr)
Inventor
Murat Dundar
Original Assignee
Siemens Medical Solutions Usa, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Medical Solutions USA, Inc.
Publication of WO2005122065A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/211 Selection of the most significant subset of features
    • G06F 18/2115 Selection of the most significant subset of features by evaluating different subsets according to an optimisation criterion, e.g. class separability, forward selection or backward elimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2132 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G06F 18/21322 Rendering the within-class scatter matrix non-singular

Definitions

  • the present invention relates to image processing, and more particularly to a system and method for feature selection in an object detection system.
  • a computer-implemented method for processing an image includes identifying a plurality of candidates for an object of interest in the image, extracting a feature set for each candidate, determining a reduced feature set by removing at least one redundant feature from the feature set to maximize a Rayleigh quotient, determining at least one candidate of the plurality of candidates as a positive candidate based on the reduced feature set, and displaying the positive candidate for analysis of the object.
  • Determining the reduced feature set comprises initializing a discriminant vector and a regularization parameter, and determining, iteratively, the reduced feature set.
  • Determining, iteratively, the reduced feature set includes determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set, determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set, determining a transformation vector, updating the class scatter matrix and means according to the transformation vector, and determining the discriminant vector.
  • the method comprises comparing, at each iteration, each element of the discriminant vector to a threshold, and stopping the iterative determination of the reduced feature set upon determining that all elements are greater than the threshold.
  • the threshold is a user defined variable for controlling a degree to which features are eliminated.
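The iterative determination described in the bullets above can be illustrated in code. The following is a hedged numpy sketch under assumed details; the function name, the relative threshold, and the small ridge term `reg` are illustrative choices, not the patent's:

```python
import numpy as np

def iterative_feature_selection(X_pos, X_neg, threshold=0.05, reg=1e-6):
    """Illustrative sketch: repeatedly fit a Fisher-style discriminant,
    drop features whose normalized discriminant element falls below
    `threshold`, and stop once every surviving element exceeds it."""
    keep = np.arange(X_pos.shape[1])
    while True:
        Xp, Xn = X_pos[:, keep], X_neg[:, keep]
        mp, mn = Xp.mean(axis=0), Xn.mean(axis=0)
        # Within-class scatter in the reduced dimensional space.
        Sw = (Xp - mp).T @ (Xp - mp) + (Xn - mn).T @ (Xn - mn)
        alpha = np.linalg.solve(Sw + reg * np.eye(len(keep)), mp - mn)
        mask = np.abs(alpha) / np.abs(alpha).max() > threshold
        if mask.all():          # all elements exceed the threshold: stop
            return keep, alpha
        keep = keep[mask]       # eliminate low-weight features and iterate
```

Here `threshold` plays the role of the user-defined variable controlling the degree to which features are eliminated.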
  • a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for processing an image.
  • the method includes identifying a plurality of candidates for an object of interest in the image, extracting a feature set for each candidate, determining a reduced feature set by removing at least one redundant feature from the feature set to maximize a Rayleigh quotient, determining at least one candidate of the plurality of candidates as a positive candidate based on the reduced feature set, and displaying the positive candidate for analysis of the object.
  • a computer-implemented detection system comprises an object detection module determining a candidate object and a feature set for the candidate object, and a feature selection module coupled to the object detection module, wherein the feature selection module receives the feature set and generates a reduced feature set having a desirable value of a Rayleigh quotient, wherein the object detection module implements the reduced feature set for detecting an object in an image.
  • the feature selection module further includes an initialization module setting an initial value of a discriminant vector and a regularization parameter, a reduction module determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set, and a discriminant module determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set.
  • the feature selection module further includes a sparsity module determining a transformation vector, and an update module updating the class scatter matrix and means according to the transformation vector, wherein the sparsity module determines the discriminant vector given the updated class scatter matrix and means.
  • Figure 1 is a system according to an embodiment of the present disclosure
  • Figure 2 is a flow chart of a method according to an embodiment of the present disclosure
  • Figure 3 is a graph of testing error according to an embodiment of the present disclosure
  • Figure 4A is a graph of receiver operating characteristics (ROC) curves for training results according to an embodiment of the present disclosure
  • Figure 4B is a graph of receiver operating characteristics (ROC) curves for testing results according to an embodiment of the present disclosure
  • Figure 5 is a flow chart of a method according to an embodiment of the present disclosure
  • Figure 6 is a diagram of an object detection system according to an embodiment of the present disclosure.
  • the present invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof.
  • the present invention may be implemented in software as an application program tangibly embodied on a program storage device.
  • the application program may be uploaded to, and executed by, a machine comprising any suitable architecture.
  • a computer system 101 for implementing an image processing method can comprise, inter alia, a central processing unit (CPU) 102, a memory 103 and an input/output (I/O) interface 104.
  • the computer system 101 is generally coupled through the I/O interface 104 to a display 105 and various input devices 106 such as a mouse and keyboard.
  • the support circuits can include circuits such as cache, power supplies, clock circuits, and a communications bus.
  • the memory 103 can include random access memory (RAM), read only memory (ROM), disk drive, tape drive, etc., or a combination thereof.
  • the present invention can be implemented as a routine 107 that is stored in memory 103 and executed by the CPU 102 to process the signal from the signal source 108.
  • the computer system 101 is a general purpose computer system that becomes a specific purpose computer system when executing the routine 107 of the present invention.
  • the computer platform 101 also includes an operating system and micro instruction code.
  • the various processes and functions described herein may either be part of the micro instruction code or part of the application program (or a combination thereof) which is executed via the operating system.
  • various other peripheral devices may be connected to the computer platform such as an additional data storage device and a printing device.
  • a Computer-Aided Detection (CAD) system automatically identifies candidates for an object of interest in an image 201 given known characteristics such as the shape of an abnormality, e.g., a polyp, extracts features for each candidate 202, wherein a determined feature set is reduced (e.g., see Figure 5), labels candidates as positive or negative 203, and displays positive candidates to a radiologist for diagnosis 204.
  • the labeling or classification is performed by a classifier that has been trained offline from a training dataset and then frozen for use in the CAD system.
  • the training dataset is a database of images in which candidates have been labeled by an expert. The ability to generalize is important to the CAD system and thus the classifier: the classifier must correctly label new datasets.
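The four stages (201-204) can be sketched as a pipeline; every callable below is a hypothetical stand-in for a CAD component, not part of the patent:

```python
import numpy as np

def cad_pipeline(image, generate_candidates, extract_features, classify):
    """Illustrative CAD flow: candidate generation (201), feature
    extraction (202), labeling (203), and returning the positive
    candidates for display to a radiologist (204)."""
    candidates = generate_candidates(image)
    feats = np.array([extract_features(c) for c in candidates])
    labels = classify(feats)  # classifier trained offline, frozen at run time
    return [c for c, y in zip(candidates, labels) if y == 1]
```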
  • Classification performance is determined by the classification method used and the inherent class information available in the features provided.
  • the classification methods determine the best achievable separation between classes by exploiting the potential information available within the feature set. In real-world settings the number of features available can be more than needed, and while it might be expected that a large number of features would provide more discriminating power, this is not always the case.
  • with a limited number of training examples in a high-dimensional feature space, two classes can be separated in many ways; however, few of these separations will generalize well on new datasets. Thus, feature selection is important.
  • an automatic feature selection method is built into Fisher's Linear Discriminant (FLD).
  • the method identifies a feature subset by iteratively maximizing a ratio between and within class scatter matrices with respect to the discriminant coefficients and feature weights, respectively (see Figure 5).
  • the FLD arises in a special case when classes have a common covariance matrix.
  • FLD is a classification method that projects the high-dimensional data onto a line for a binary classification problem and performs classification in this one-dimensional space. This projection is chosen such that the ratio of the between- and within-class scatter matrices, i.e., the Rayleigh quotient, is maximized.
  • let X_i ∈ R^(d×l_i) be a matrix containing the training data points in d-dimensional space and l_i the number of labeled samples for class ω_i, i ∈ {±}. FLD is the projection α which maximizes the Rayleigh quotient J(α) = (α^T S_B α) / (α^T S_W α).
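The maximizer of the Rayleigh quotient has the classical closed form α ∝ S_W^(-1)(m_+ - m_-); a minimal numpy sketch (an illustration, not the patent's implementation):

```python
import numpy as np

def fld_direction(X_pos, X_neg):
    """Return the unit-norm FLD projection direction for two classes,
    maximizing (a^T S_B a) / (a^T S_W a)."""
    m_pos, m_neg = X_pos.mean(axis=0), X_neg.mean(axis=0)
    # Within-class scatter: sum of per-class centered outer products.
    Sw = (X_pos - m_pos).T @ (X_pos - m_pos) + (X_neg - m_neg).T @ (X_neg - m_neg)
    a = np.linalg.solve(Sw, m_pos - m_neg)  # direction prop. to Sw^{-1}(m+ - m-)
    return a / np.linalg.norm(a)
```

Projecting the data onto this direction reduces the binary classification problem to thresholding a scalar.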
  • according to an embodiment of the present disclosure, a sparse formulation of FLD is provided, incorporating a regularization constraint on the FLD.
  • a system and method eliminate those features determined to have limited impact on the objective function. Sparse Fisher Discriminant Analysis: Blindly fitting classifiers without appropriate regularization conditions yields over-fitted models. Methods for controlling model complexity are needed in modern data analysis. In particular, when the number of features available is large, an appropriate regularization can dramatically reduce the dimensionality and produce better generalization performance, as supported by learning theory.
  • a 1-norm penalty P(f) has been implemented in a sparse FLD formulation, which generates sparser feature subsets than a 2-norm penalty.
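The difference can be seen from the penalties' shrinkage operators: the 1-norm's proximal operator (soft-thresholding) zeroes small coefficients exactly, while 2-norm shrinkage only rescales them. A small illustrative sketch:

```python
import numpy as np

def prox_l1(v, lam):
    """Soft-thresholding, the proximal operator of the 1-norm penalty:
    coefficients within lam of zero are set exactly to zero."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def shrink_l2(v, lam):
    """Ridge-style 2-norm shrinkage: scales every coefficient,
    never producing exact zeros."""
    return v / (1.0 + lam)
```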
  • the feasible region is empty whenever δ_max ≤ 0 or δ_min > δ.
  • δ < δ_max should hold to achieve a sparse solution.
  • a linear transformation will ensure the class-mean difference is positive and standardize the sparsity constraint.
  • the noise features are added to the feature set one by one allowing us to observe the gradual change in the prediction capability of both approaches.
  • the error bars in Figure 3 are obtained by repeating the above process 100 times for each d, each time using a different training and testing set. Figure 3 illustrates testing error vs. d for artificial data.
  • Curve 301 corresponds to FLD and curve 302 corresponds to a sparse method according to an embodiment of the present disclosure.
  • for d = 3, with two redundant features, the prediction accuracy of the conventional FLD is decent.
  • the standard deviation in prediction error is smaller under a method according to an embodiment of the present disclosure indicating the elimination of one or both of the redundant features.
  • as d gets larger and noise features are added to the feature set, the performance of the conventional FLD deteriorates significantly, whereas the average prediction error for the proposed formulation remains around its initial level with some increase in the standard deviation.
  • the proposed method selects features two and three together 90% of the time.
  • Example 2, Colon Cancer; Data Sources and Domain Description: Colorectal cancer is the third most common cancer in both men and women. It is estimated that in 2004, nearly 147,000 cases of colon and rectal cancer will be diagnosed in the US, and more than 56,730 people will die from colon cancer. While there is wide consensus that screening patients is effective in decreasing advanced disease, only 44% of the eligible population undergoes any colorectal cancer screening. Multiple reasons have been identified for non-compliance, key among them patient comfort, bowel preparation, and cost.
  • Non-invasive virtual colonoscopy derived from computer tomographic (CT) images of the colon holds great promise as a screening method for colorectal cancer, particularly if CAD tools are developed to facilitate the efficiency of radiologists' efforts in detecting lesions.
  • CT computer tomographic
  • identifying (and removing) lesions (polyps) while the disease is still in a local stage yields very high survival rates, illustrating the critical need for early diagnosis.
  • the database of high-resolution CT images used in this study was obtained from NYU Medical Center, Cleveland Clinic Foundation, and two EU sites in Vienna and Belgium.
  • Training Data Patient and Polyp Info: There were 96 patients with 187 volumes. A total of 76 polyps were identified in this set with a total number of 9830 candidates. Testing Data Patient and Polyp Info: There were 67 patients with 133 volumes. A total of 53 polyps were identified in this set with a total number of 6616 candidates. A combined total of 207 features were extracted for each candidate by three imaging scientists. Feature Selection and Classification: In this experiment three feature selection methods were considered in a wrapper framework and their prediction performance compared on the Colon Dataset.
  • SFLD sparse formulation proposed in this study
  • SKFD Kernel Fisher Discriminant with linear loss and linear regularizer
  • GFLD greedy sequential forward-backward feature selection algorithm implemented with FLD
  • SFLD Sparse Fisher Linear Discriminant
  • LOPO Leave-One-Patient-Out
  • both views, e.g., the supine and the prone views, of one patient are left out of the training data.
  • the classifier is trained using the patients from the remaining set, and tested on both views of the "left-out" patient.
  • LOPO is superior to other cross-validation metrics such as leave-one-volume-out, leave-one-polyp-out or k-fold cross-validation because it simulates the actual use, wherein the CAD system processes both volumes for a new patient.
  • under such schemes, if a polyp is visible in both views, the corresponding candidates could be assigned to different folds; thus a classifier may be trained and tested on the same polyp (albeit in different views).
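LOPO fold generation can be sketched as follows; the (patient_id, view, features) tuple layout is an assumption made here for illustration:

```python
from collections import defaultdict

def lopo_folds(candidates):
    """Yield (patient_id, train, test) folds in which BOTH views of the
    held-out patient go to the test fold together, so a polyp visible in
    supine and prone views never spans train and test."""
    by_patient = defaultdict(list)
    for cand in candidates:
        by_patient[cand[0]].append(cand)
    for pid, test in by_patient.items():
        train = [c for c in candidates if c[0] != pid]
        yield pid, train, test
```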
  • a method is run for varying values of the regularization parameter δ. For each value, the Receiver Operating Characteristics (ROC) curve is obtained by evaluating the Leave-One-Patient-Out (LOPO) cross-validation performance of the sparse FLD method.
  • Kernel Fisher Discriminant with linear loss and linear regularizer (SKFD): In this approach there is a set of constraints for every data point in the training set, which leads to large optimization problems. To alleviate the computational burden of the mathematical programming formulation, Laplacian models may be implemented for both the loss function and the regularizer. This choice leads to a linear programming formulation instead of the quadratic programming formulation that is obtained when a Gaussian model is assumed for both the loss function and the regularizer. The linear programming formulation used is written as:
  • Greedy sequential forward-backward feature selection algorithm with FLD (GFLD): This approach starts with an empty subset and performs a forward selection succeeded by a backward attempt to eliminate a feature from the subset. During each iteration of the forward selection exactly one feature is added to the feature subset. To determine which feature to add, the algorithm tentatively adds to the candidate feature subset one feature that is not already selected and tests the LOPO performance of a classifier built on the tentative feature subset. The feature that results in the largest area under the ROC curve is added to the feature subset. During each iteration of the backward elimination the algorithm attempts to eliminate the feature whose removal results in the largest gain in ROC area. This process goes on until no or negligible improvement is gained.
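The greedy wrapper can be sketched generically; here `score` is a stand-in for the LOPO ROC-area evaluation, and the structure (strict-improvement forward steps, one-feature backward attempts) is an illustration of the scheme described above:

```python
def greedy_forward_backward(features, score):
    """Forward step: add the unselected feature giving the best score
    improvement. Backward step: drop one feature if that improves the
    score. Stop when a forward pass yields no improvement."""
    selected, best = [], float("-inf")
    improved = True
    while improved:
        improved = False
        best_f = None
        for f in features:                     # forward selection
            if f in selected:
                continue
            s = score(selected + [f])
            if s > best:
                best_f, best, improved = f, s, True
        if best_f is not None:
            selected.append(best_f)
        for f in list(selected):               # backward elimination attempt
            reduced = [g for g in selected if g != f]
            if reduced:
                s = score(reduced)
                if s > best:
                    selected, best = reduced, s
    return selected
```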
  • GFLD Greedy sequential forward-backward feature selection algorithm implemented with FLD
  • SKFD was run on a subset of the training dataset in which all the positive candidates and a random subset of size 1000 of the negative candidates were included.
  • Table 1 shows the number of features selected (d), the area under the ROC curve scaled by 100 (Area), and the sensitivity corresponding to 90% specificity (Sens) for all algorithms considered in this study. The values in parentheses show the corresponding values for the testing results.

    Algorithm   d    Area         Sens (%)
    SFLD        25   94.8 (94.9)  89 (87)
    SFLD-sub    17   94.7 (94.1)  92 (85)
    GFLD        17   94.3 (94.7)  85 (83)
    SKFD        18   88.0 (82.0)  65 (60)
    FLD         207  80.3 (89.1)  63 (77)

    TABLE 1
  • the ROC curves in Figure 4A demonstrate the LOPO performance of each method and those in Figure 4B show the performance on the test data set.
  • Table 1 shows the number of features selected (d), the area of the ROC curve scaled by 100 (Area) and the sensitivity corresponding to 90% specificity (Sens) for all algorithms considered in this study.
  • the sparse method (SFLD) and SFLD-sub outperform the greedy method, the conventional FLD, and SKFD on both the training and testing datasets.
  • SFLD-sub performs better than SFLD on the training data, while SFLD generalizes slightly better on the testing data. This is not surprising because SFLD-sub uses only a subset of the original training data.
  • GFLD performs almost as well as the SFLD-sub and SFLD methods, but at a much greater computational cost for selecting the features.
  • a computer-implemented detection system includes an object detection module determining a candidate object and a feature set for the candidate object 601.
  • the system includes a feature selection module 602 coupled to the object detection module 601, wherein the feature selection module 602 receives the feature set and generates a reduced feature set having a desirable value of a Rayleigh quotient, wherein the object detection module 601 implements the reduced feature set for detecting an object in an image.
  • a feature selection module includes an initialization module 603 setting an initial value of a discriminant vector and a regularization parameter, a reduction module 604 determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set, a discriminant module 605 determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set, a sparsity module 606 determining a transformation vector, and an update module 607 updating the class scatter matrix and means according to the transformation vector, wherein the sparsity module 606 determines the discriminant vector given the updated class scatter matrix and means.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A computer-implemented method for processing an image includes identifying a plurality of candidates for an object of interest in the image (201), extracting a feature set for each candidate, determining a reduced feature set by removing at least one redundant feature from the feature set to maximize a Rayleigh quotient (202), determining at least one candidate of the plurality of candidates as a positive candidate based on the reduced feature set (203), and displaying the positive candidate for analysis of the object (204).

Description

SYSTEM AND METHOD FOR ELIMINATION OF IRRELEVANT AND REDUNDANT FEATURES TO IMPROVE CAD PERFORMANCE
This application claims priority to U.S. Provisional Application Serial No. 60/576,115, filed on June 2, 2004, which is herein incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION
1. Technical Field: The present invention relates to image processing, and more particularly to a system and method for feature selection in an object detection system.
2. Discussion of Related Art: Features of medical images are typically identified by several imaging technicians working independently. As a result, technicians often identify the same or similar features. These features may be redundant or irrelevant, which may in turn impact classifier performance. Therefore, a need exists for a system and method of eliminating redundant and irrelevant features from a feature set.
SUMMARY OF THE INVENTION According to an embodiment of the present disclosure, a computer-implemented method for processing an image includes identifying a plurality of candidates for an object of interest in the image, extracting a feature set for each candidate, determining a reduced feature set by removing at least one redundant feature from the feature set to maximize a Rayleigh quotient, determining at least one candidate of the plurality of candidates as a positive candidate based on the reduced feature set, and displaying the positive candidate for analysis of the object. Determining the reduced feature set comprises initializing a discriminant vector and a regularization parameter, and determining, iteratively, the reduced feature set. Determining, iteratively, the reduced feature set includes determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set, determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set, determining a transformation vector, updating the class
scatter matrix and means according to the transformation vector, and determining the discriminant vector. The method comprises comparing, at each iteration, each element of the discriminant vector to a threshold, and stopping the iterative determination of the reduced feature set upon determining that all elements are greater than the threshold. The threshold is a user-defined variable for controlling a degree to which features are eliminated. The transformation vector and the discriminant vector can be determined as: min_{a ∈ R^d} tr(S_W ∘ (a a^T)) s.t. (m_+ - m_-)^T a = b, -δe ≤ a ≤ δe. According to an embodiment of the present disclosure, a program storage device is provided readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for processing an image. The method includes identifying a plurality of candidates for an object of interest in the image, extracting a feature set for each candidate, determining a reduced feature set by removing at least one redundant feature from the feature set to maximize a Rayleigh quotient, determining at least one candidate of the
plurality of candidates as a positive candidate based on the reduced feature set, and displaying the positive candidate for analysis of the object. According to an embodiment of the present disclosure, a computer- implemented detection system comprises an object detection module determining a candidate object and a feature set for the candidate object, and a feature selection module coupled to the object detection module, wherein the feature selection module receives the feature set and generates a reduced feature set having a desirable value of a Rayleigh quotient, wherein the object detection modules implements the reduced feature set for detecting an object in an image. The feature selection module further includes an initialization module setting an initial value of a discriminant vector and a regularization parameter, a reduction module determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set, and a discriminant module determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set. The feature selection module further includes a sparsity module determining a transformation vector, and an update module updating the class scatter matrix and means according to the transformation vector, wherein the sparsity module determines the discriminant vector given the updated class scatter matrix and means.
BRIEF DESCRIPTION OF THE DRAWINGS Preferred embodiments of the present invention will be described below in more detail, with reference to the accompanying drawings: Figure 1 is a system according to an embodiment of the present disclosure; Figure 2 is a flow chart of a method according to an embodiment of the present disclosure; Figure 3 is a graph of testing error according to an embodiment of the present disclosure; Figure 4A is a graph of receiver operating characteristics (ROC) curves for training results according to an embodiment of the present disclosure; Figure 4B is a graph of receiver operating characteristics (ROC) curves for testing results according to an embodiment of the present disclosure; Figure 5 is a flow chart of a method according to an embodiment of the present disclosure; and Figure 6 is a diagram of an object detection system according to an embodiment of the present disclosure.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS According to an embodiment of the present disclosure, irrelevant and redundant features are automatically eliminated from a feature set extracted from images, such as CT or MRI images. It is to be understood that the present invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof. In one embodiment, the present invention may be implemented in software as an application program tangibly embodied on a program storage device. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Referring to Figure 1, according to an embodiment of the present disclosure, a computer system 101 for implementing an image processing method can comprise, inter alia, a central processing unit (CPU) 102, a memory 103 and an input/output (I/O) interface 104. The computer system 101 is generally coupled through the I/O interface 104 to a display 105 and various input devices 106 such as a mouse and keyboard. The support circuits can include circuits such as cache, power supplies, clock circuits, and a communications bus. The memory 103 can include random access memory (RAM), read only memory (ROM), disk drive, tape drive, etc., or a combination thereof. The present invention can be implemented as a routine 107 that is stored in memory 103 and executed by the CPU 102 to process the signal from the signal source 108. As such, the computer system 101 is a general purpose computer system that becomes a specific purpose computer system when executing the routine 107 of the present invention. The computer platform 101 also includes an operating system and micro instruction code. The various processes and functions described herein may either be part of the micro instruction code or part of the application program (or a combination thereof) which is executed via the operating system.
In addition, various other peripheral devices may be connected to the computer platform such as an additional data storage device and a printing device. It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying figures may be implemented in software, the actual connections between the system components (or the process steps) may differ depending upon the manner in which the present invention is programmed. Given the teachings of the present invention provided herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present invention. Referring to Figure 2, a Computer-Aided Detection (CAD) system automatically identifies candidates for an object of interest in an image 201 given known characteristics such as the shape of an abnormality, e.g., a polyp, extracts features for each candidate 202, wherein a determined feature set is reduced (e.g., see Figure 5), labels candidates as positive or negative 203, and displays positive candidates to a radiologist for diagnosis 204. The labeling or classification is performed by a classifier that has been trained offline from a training dataset and then frozen for use in the CAD system. The training dataset is a database of images in which candidates have been labeled by an expert. The ability to generalize is important to the CAD system and thus the classifier: the classifier must correctly label new datasets. Because a large number of different classifiers can be built from the training data using classification methods, each with adjustable parameters, the choice of the classifier is important. Classification performance is determined by the classification method used and the inherent class information available in the features provided.
The classification methods determine the best achievable separation between classes by exploiting the potential information available within the feature set. In real-world settings the number of features available can be more than needed, and while it might be expected that a large number of features would provide more discriminating power, this is not always the case. With a limited number of training examples in a high-dimensional feature space two classes can be separated in many ways. However, few separations will generalize well on new datasets. Thus, feature selection is important. According to an embodiment of the present disclosure, an automatic feature selection method is built into Fisher's Linear Discriminant (FLD). The method identifies a feature subset by iteratively maximizing a ratio of between and within class scatter matrices with respect to the discriminant coefficients and feature weights, respectively (see Figure 5). The FLD arises in a special case when classes have a common covariance matrix. FLD is a classification method that projects the high-dimensional data onto a line for a binary classification problem and performs classification in this one-dimensional space. This projection is chosen such that the ratio of the between- and within-class scatter matrices, i.e., the Rayleigh quotient, is maximized. Let X_i ∈ R^(d×l_i) be a matrix containing the training data points in d-dimensional space and l_i the number of labeled samples for class ω_i, i ∈ {±}. FLD is the projection α which maximizes the Rayleigh quotient J(α) = (α^T S_B α) / (α^T S_W α),
where
S_B = (m_+ − m_−)(m_+ − m_−)^T
S_W = Σ_{i∈{+,−}} (X_i − m_i e_{l_i}^T)(X_i − m_i e_{l_i}^T)^T
are the between and within class scatter matrices respectively and
m_i = (1/l_i) X_i e_{l_i}
is the mean of class ω_i, and e_{l_i} is an l_i-dimensional vector of ones. Transforming the above problem into a convex quadratic programming problem provides algorithmic advantages. For example, notice that if α is a solution to Eq. (1), then so is any scalar multiple of it. Therefore, to avoid multiplicity of solutions, the constraint α^T S_B α = b^2 is imposed, which is equivalent to α^T (m_+ − m_−) = b, where b is some arbitrary positive scalar. The optimization problem of Eq. (1) then becomes,
Problem 1: min_{α∈R^d} α^T S_W α  subject to  α^T (m_+ − m_−) = b.

For binary classification problems the solution of this problem is

α* = b S_W^{−1} (m_+ − m_−) / ((m_+ − m_−)^T S_W^{−1} (m_+ − m_−)),

so that each element of the discriminant vector is a weighted sum of the differences between the class mean vectors, where the weighting coefficients are the rows of S_W^{−1}. According to this expansion, since S_W^{−1} is positive definite, every feature contributes to the final discriminant unless the difference of the class means along that feature is zero. If a given feature in the training set is redundant, its contribution to the final discriminant is artificial and undesirable. As a linear classifier, FLD is well suited to handle features of this sort provided that they do not dominate the feature set, that is, provided the ratio of redundant to relevant features is not significant. Although the contribution of a single redundant feature to the final discriminant is negligible, when several such features are present at the same time the overall impact can be significant, leading to poor prediction accuracy. Apart from this impact, in the context of FLD these undesirable features also pose numerical constraints on the computation of S_W^{−1}, especially when the number of training samples is limited. Indeed, when the number of features d is larger than the number of training samples l, S_W becomes ill-conditioned and its inverse does not exist. Hence eliminating the irrelevant and redundant features may provide a two-fold boost in performance. According to an embodiment of the present disclosure, a sparse formulation of FLD incorporates a regularization constraint on the FLD, and a system and method eliminate those features determined to have limited impact on the objective function. Sparse Fisher Discriminant Analysis: Blindly fitting classifiers without appropriate regularization conditions yields over-fitted models. Methods for controlling model complexity are needed in modern data analysis. 
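The closed-form solution of Problem 1 can be computed directly when S_W is invertible; a minimal NumPy sketch, with assumed variable names, where the columns of Xp and Xn are the d-dimensional samples of each class:

```python
import numpy as np

def fld_discriminant(Xp, Xn, b=1.0):
    """alpha* = b * Sw^-1 (m+ - m-) / ((m+ - m-)^T Sw^-1 (m+ - m-)),
    which by construction satisfies alpha^T (m+ - m-) = b."""
    mp, mn = Xp.mean(axis=1), Xn.mean(axis=1)
    # within-class scatter: sum over both classes of centered outer products
    Sw = np.cov(Xp, bias=True) * Xp.shape[1] + np.cov(Xn, bias=True) * Xn.shape[1]
    diff = mp - mn
    w = np.linalg.solve(Sw, diff)
    return b * w / (diff @ w)
```

When d exceeds the number of training samples, `np.linalg.solve` fails because S_W is singular, which is exactly the numerical difficulty the text describes and one motivation for the sparse formulation.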
In particular, when the number of features available is large, appropriate regularization can dramatically reduce the dimensionality and produce better generalization performance, a result supported by learning theory. For linear models of the form α^T x, as considered here, well-established regularization conditions include the 2-norm penalty and the 1-norm penalty on the weight vector α. A regularized model fitting problem can be written as:

f = min_α (error(α) + λ P(α))    (2)

where λ is called the regularization parameter. According to an embodiment of the present disclosure, the 1-norm penalty
P(α) = ||α||_1 = Σ_{i=1}^d |α_i|
has been implemented in a sparse FLD formulation, which generates sparser feature subsets than the 2-norm penalty. The regularized model fitting formulation of Eq. (2) has an equivalent formulation:

f = min_α error(α), subject to: P(α) ≤ γ    (3)

where the parameter γ plays a role similar to that of the regularization parameter λ in Eq. (2), trading off the training error against the penalty term. If α is required to be non-negative, the 1-norm of α can be determined as α^T e. With these new constraints, Problem 1 can be updated as follows to obtain Problem 2,
Problem 2 :
min_{α∈R^d} α^T S_W α  subject to  α^T (m_+ − m_−) = b,  α^T e ≤ γ,  α ≥ 0
The feasible set associated with Problem 1 is denoted by
Ω_1 = {α ∈ R^d : α^T (m_+ − m_−) = b}
and that associated with Problem 2 by
Ω_2 = {α ∈ R^d : α^T (m_+ − m_−) = b, α^T e ≤ γ, α ≥ 0},
and observe that Ω_2 ⊂ Ω_1.
The quantities

δ_max = max_i b / (m_+ − m_−)_i  and  δ_min = min_i b / (m_+ − m_−)_i

are defined, where i ∈ {1,...,d}. The set
Ω_2 is empty whenever δ_max < 0 or δ_min > γ. In addition to the feasibility constraints, γ < δ_max should hold to achieve a sparse solution. According to an embodiment of the present disclosure, a linear transformation will ensure α ≥ 0 and standardize the sparsity constraint. For simplicity and without loss of generality, S_W is assumed to be a diagonal matrix with elements λ_i, i = 1,...,d, where the λ_i are the eigenvalues of S_W.
Under this scenario a solution to Problem 1 is α* = b̂ [(m_+ − m_−)_1/λ_1, ..., (m_+ − m_−)_d/λ_d]^T, where b̂ = b / Σ_{i=1}^d (m_+ − m_−)_i^2/λ_i. A linear transformation is defined as D = diag(d_1, ..., d_d) = b̂ diag((m_+ − m_−)_1/λ_1, ..., (m_+ − m_−)_d/λ_d), such that x ↦ Dx, where diag indicates a diagonal matrix. With this transformation, Problem 2 takes the following form
Problem 3: min_{α∈R^d} α^T (D S_W D) α  subject to  α^T D (m_+ − m_−) = b,  α^T e ≤ γ,  α ≥ 0
Correspondingly, δ_max = max_i b λ_i / (b̂ (m_+ − m_−)_i^2) and δ_min = min_i b λ_i / (b̂ (m_+ − m_−)_i^2) are defined, where i ∈ {1,...,d}. Note that δ_min and δ_max are nonnegative, and hence both feasibility constraints are satisfied whenever γ ≥ δ_min. For γ ≥ d the globally optimum solution α* to Problem 3 is α* = [1,...,1]^T, i.e., the nonsparse solution. For γ < d, sparse solutions can be obtained. Unlike Problem 2, where the upper bound on γ depends on the mean vectors, here the upper bound is d, i.e., the number of features. The sparse formulation is a biconvex programming problem.
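The claim that the all-ones vector satisfies the transformed equality constraint can be checked numerically. The sketch below builds D for a diagonal S_W; the particular numbers are made up for illustration:

```python
import numpy as np

def scaling_matrix(diff, lam, b=1.0):
    """D = b_hat * diag((m+ - m-)_i / lambda_i), with b_hat chosen so that
    alpha = ones satisfies alpha^T D (m+ - m-) = b (diagonal Sw assumed)."""
    b_hat = b / np.sum(diff**2 / lam)
    return np.diag(b_hat * diff / lam)

diff = np.array([2.0, -1.0, 0.5])   # class-mean differences (made up)
lam = np.array([1.0, 4.0, 2.0])     # eigenvalues of a diagonal Sw (made up)
D = scaling_matrix(diff, lam, b=1.0)
# the equality constraint of Problem 3 holds at the all-ones (nonsparse) point
assert np.isclose(np.ones(3) @ (D @ diff), 1.0)
```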
Problem 4: min_{α,a∈R^d} α^T (S_W ∗ (aa^T)) α  subject to  α^T ((m_+ − m_−) ∗ a) = b,  a^T e ≤ γ,  a ≥ 0

where ∗ denotes the elementwise product. An initialization a = [1,...,1]^T is performed, and α* is solved for, e.g., as a solution to Problem 1. Then α is fixed at α* and a* is solved for, e.g., as a solution to Problem 3. The Iterative Feature Selection Method: Referring to Figure 5, successive feature elimination can be obtained by iteratively solving the above biconvex programming problem. (501) Set the discriminant vector to all ones and the dimensionality to d, and choose the regularization parameter γ much smaller than d: α^0 = e_d, d^0 = d, γ ≪ d. For each iteration i do the following: (502) Select the d^i features with a values greater than ε, d^i ≤ d^{i−1}, e.g., select the features whose corresponding element of the discriminant vector is greater than ε. (503) Determine the class scatter matrices and means in the d^i-dimensional (reduced) feature space. (504) Solve Problem 4 to obtain a^i, the transformation vector. (505) Using the newly obtained transformation vector, fix a to a^i and update the class scatter matrices and means. (506) Solve Problem 4 to obtain α^i, the discriminant. (507) Stop when all
α_j^i ≥ ε, e.g., stop if none of the elements of the discriminant vector is less than ε. ε is a threshold controlling how aggressively feature elimination is performed; ε may be user selected. Since α is truncated at each iteration, the above method is not guaranteed to converge. However, at any iteration i when d^i ≤ γ, sparseness would already be achieved and hence all α_j^i would be equal to one. Therefore the algorithm stops when d^i ≤ γ, at the latest.
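Steps 501-507 can be summarized in a short skeleton. The Problem 4 solver itself is not reproduced here; `solve_p4` is an assumed callable, supplied by the caller, that returns a vector over the currently selected feature indices:

```python
import numpy as np

def iterative_feature_selection(X, solve_p4, eps=0.1, gamma=None):
    """Skeleton of the iterative scheme (steps 501-507). solve_p4(idx, fix)
    stands in for the biconvex Problem 4 solver: called once with the
    discriminant fixed (returning a, steps 503-504) and once with the
    transformation fixed (returning alpha, steps 505-506)."""
    d = X.shape[1]
    if gamma is None:
        gamma = 0.1 * d                    # gamma << d (step 501)
    idx = np.arange(d)
    alpha = np.ones(d)                     # alpha^0 = e_d (step 501)
    while True:
        idx = idx[alpha > eps]             # (502) keep surviving features
        a = solve_p4(idx, fix="alpha")     # (503-504) transformation vector
        alpha = solve_p4(idx, fix="a")     # (505-506) discriminant
        if np.all(alpha >= eps) or len(idx) <= gamma:
            return idx, alpha              # (507) stop; at latest when d^i <= gamma
```

At each pass, features whose discriminant element falls below ε are dropped, so d^i is non-increasing, and the loop ends when no element falls below ε or when d^i ≤ γ.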
Experimental Results: A Toy Example. This experiment is adapted from Weston et al., Feature Selection for SVMs, Advances in Neural Information Processing Systems 13, pp. 668-674. Using artificial data, it is demonstrated that the performance of conventional FLD suffers from the presence of too many irrelevant features, whereas the proposed sparse approach produces better prediction accuracy by successfully handling these irrelevant features. The probability of y = 1 or y = −1 is equal. The first three features x_1, x_2, x_3 are drawn as x_i = y N(i, 5). Note that only one of these features is needed to discriminate one class from the other; the other two are redundant. The remaining features are drawn as x_i = N(0, 20); these features are noise. The noise features are added to the feature set one by one, allowing observation of the gradual change in the prediction capability of both approaches. The method is initialized with d = 3, e.g., starting with the first three features, and proceeds as follows. Samples are generated for training (e.g., 200) and for testing (e.g., 1000). Both approaches are trained and tested, the corresponding prediction errors are recorded, d is increased by one, and the above procedure is repeated until d = 20. For the proposed approach the best two features are selected. The error bars in Figure 3 are obtained by repeating the above process 100 times for each d, each time using a different training and testing set. Figure 3 illustrates testing error vs. d for the artificial data, comparing full dimensionality with the two-dimensional feature subset: curve 301 corresponds to FLD and curve 302 to a sparse method according to an embodiment of the present disclosure. Looking at the results, at d = 3, with two redundant features, the prediction accuracy of the conventional FLD is decent. 
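The artificial dataset just described can be reproduced with a short generator. The text's N(i, 5) and N(0, 20) are read here as mean/standard-deviation pairs; that parameterization is an assumption:

```python
import numpy as np

def make_toy_data(n, d, rng):
    """y is +/-1 with equal probability; features 1-3 are drawn as y*N(i, 5)
    (only their means carry class information); features 4..d are N(0, 20) noise."""
    y = rng.choice([-1, 1], size=n)
    X = np.empty((n, d))
    for i in range(3):
        X[:, i] = y * rng.normal(i + 1, 5, size=n)
    if d > 3:
        X[:, 3:] = rng.normal(0, 20, size=(n, d - 3))
    return X, y

X, y = make_toy_data(200, 20, np.random.default_rng(0))
```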
With the same two redundant features at d = 3, the standard deviation in prediction error is smaller under a method according to an embodiment of the present disclosure, indicating the elimination of one or both of the redundant features. As d gets larger and noise features are added to the feature set, the performance of the conventional FLD deteriorates significantly, whereas the average prediction error for the proposed formulation remains around its initial level with some increase in the standard deviation. Also, 90% of the time a method according to an embodiment of the present disclosure selects features two and three together; these are the two most powerful features in the set. Example 2: Colon Cancer; Data Sources and Domain Description. Colorectal cancer is the third most common cancer in both men and women. It is estimated that in 2004, nearly 147,000 cases of colon and rectal cancer will be diagnosed in the US, and more than 56,730 people will die from colon cancer. While there is wide consensus that screening patients is effective in decreasing advanced disease, only 44% of the eligible population undergoes any colorectal cancer screening. Multiple reasons have been identified for non-compliance, key among them patient comfort, bowel preparation, and cost. Non-invasive virtual colonoscopy derived from computer tomographic (CT) images of the colon holds great promise as a screening method for colorectal cancer, particularly if CAD tools are developed to facilitate the efficiency of radiologists' efforts in detecting lesions. In over 90% of cases colon cancer progresses from local stages (polyp adenomas) to advanced stages (colorectal cancer), which have very poor survival rates. However, identifying (and removing) lesions (polyps) while the disease is still in a local stage yields very high survival rates, illustrating the critical need for early diagnosis. 
The database of high-resolution CT images used in this study was obtained from NYU Medical Center, Cleveland Clinic Foundation, and two EU sites in Vienna and Belgium. The 163 patients were randomly partitioned into two groups: training (n=96) and test (n=67). The test group was sequestered and only used to evaluate the performance of the final system.
Training Data Patient and Polyp Information: There were 96 patients with 187 volumes. A total of 76 polyps were identified in this set, with a total of 9830 candidates. Testing Data Patient and Polyp Information: There were 67 patients with 133 volumes. A total of 53 polyps were identified in this set, with a total of 6616 candidates. A combined total of 207 features were extracted for each candidate by three imaging scientists. Feature Selection and Classification: In this experiment three feature selection methods were considered in a wrapper framework and their prediction performance compared on the colon dataset. These techniques are the sparse formulation proposed in this study (SFLD), the sparse formulation for Kernel Fisher Discriminant with linear loss and linear regularizer (SKFD), and a greedy sequential forward-backward feature selection algorithm implemented with FLD (GFLD). Sparse Fisher Linear Discriminant (SFLD): The choice of γ plays an important role in the generalization performance of a method according to an embodiment of the present disclosure. It regularizes the FLD by seeking a balance between the "goodness of fit", e.g., the Rayleigh
Quotient, and the number of features used to achieve this performance. The value of this parameter is estimated by cross validation. Leave-One-Patient-Out (LOPO) cross validation may be implemented. In this scheme, both views of one patient, e.g., the supine and the prone views, are left out of the training data. The classifier is trained using the patients from the remaining set and tested on both views of the "left-out" patient. LOPO is superior to other cross-validation schemes such as leave-one-volume-out, leave-one-polyp-out, or k-fold cross-validation because it simulates actual use, wherein the CAD system processes both volumes for a new patient. For instance, with any of the above alternative methods, if a polyp is visible in both views, the corresponding candidates could be assigned to different folds; thus a classifier could be trained and tested on the same polyp (albeit in different views). To find the optimum value of γ, the method is run for varying values of γ ∈ [1, d]. For each value of γ, the Receiver Operating Characteristic (ROC) curve is obtained by evaluating the Leave-One-Patient-Out (LOPO) cross validation performance of the sparse FLD method and
determining the area under this curve. The optimum value of γ is chosen as the value that results in the largest area. Kernel Fisher Discriminant with linear loss and linear regularizer (SKFD): In this approach there is a set of constraints for every data point in the training set, which leads to large optimization problems. To alleviate the computational burden of the mathematical programming formulation for this approach, Laplacian models may be implemented for both the loss function and the regularizer. This choice leads to a linear programming formulation instead of the quadratic programming formulation obtained when a Gaussian model is assumed for both the loss function and the regularizer. The linear programming formulation used is written as:
min_{α,β,ξ} e^T |ξ| + λ e^T |α|  subject to  X^T α − e β = y + ξ,  e_+^T ξ_+ = 0,  e_−^T ξ_− = 0
where e_± is the vector of ones whose size is the number of points in class ±. The final classifier for an unseen data point x is given by sign(α^T x − β). The regularization parameter is estimated by
LOPO. Greedy sequential forward-backward feature selection algorithm with FLD (GFLD): This approach starts with an empty subset and performs a forward selection followed by a backward attempt to eliminate a feature from the subset. During each iteration of the forward selection exactly one feature is added to the feature subset. To determine which feature to add, the algorithm tentatively adds to the candidate feature subset one feature that is not already selected and tests the LOPO performance of a classifier built on the tentative feature subset. The feature that results in the largest area under the ROC curve is added to the feature subset. During each iteration of the backward elimination the algorithm attempts to eliminate the feature whose removal results in the largest ROC area gain. This process continues until no or negligible improvement is gained. In this study the algorithm stops when the increase in the ROC area after a forward selection is less than 0.005. A total of 17 features is selected before this criterion is met. SKFD was run on a subset of the training dataset in which all of the positive candidates and a random subset of 1000 of the negative candidates were included. The five algorithms run were: 1. SFLD on the original training set. 2. GFLD on the original training set. 3. Conventional FLD on the original training set. 4. SKFD on the subset training set. 5. SFLD on the subset training set (denoted SFLD-sub).
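The forward-backward wrapper can be sketched as follows; `score` stands in for the LOPO ROC-area evaluation and is supplied by the caller:

```python
def greedy_forward_backward(features, score, min_gain=0.005):
    """Greedy wrapper selection: each round adds the feature whose inclusion
    gives the largest score (e.g. ROC area), then tries dropping one member;
    stops when the forward gain falls below min_gain."""
    selected, best = [], 0.0
    while True:
        # forward step: try each unselected feature
        gains = {f: score(selected + [f]) for f in features if f not in selected}
        if not gains:
            break
        f_best = max(gains, key=gains.get)
        if gains[f_best] - best < min_gain:
            break
        selected.append(f_best)
        best = gains[f_best]
        # backward step: drop a feature if doing so improves the score
        if len(selected) > 1:
            drops = {f: score([g for g in selected if g != f]) for f in selected}
            f_drop = max(drops, key=drops.get)
            if drops[f_drop] > best:
                selected.remove(f_drop)
                best = drops[f_drop]
    return selected, best
```

Because each forward step re-evaluates every remaining feature, the wrapper's cost grows much faster with d than that of the sparse formulation, which matches the d^3 vs. d^2 comparison reported below Table 1.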
Table 1: The number of features selected (d), the area under the ROC curve scaled by 100 (Area), and the sensitivity corresponding to 90% specificity (Sens) are shown for all algorithms considered in this study. The values in parentheses show the corresponding values for the testing results.

Algorithm   d    Area         Sens (%)
SFLD        25   94.8 (94.9)  89 (87)
SFLD-sub    17   94.7 (94.1)  92 (85)
GFLD        17   94.3 (94.7)  85 (83)
SKFD        18   88.0 (82.0)  65 (60)
FLD         207  80.3 (89.1)  63 (77)

TABLE 1

The ROC curves in Figure 3 demonstrate the LOPO performance of each method, and those in Figure 4 show the performance on the test data set. These results show that SFLD and SFLD-sub outperform the greedy method (GFLD), conventional FLD, and SKFD on both the training and testing datasets. Although SFLD-sub performs better than SFLD on the training data, SFLD generalizes slightly better on the testing data. This is not surprising because SFLD-sub uses only a subset of the original training data. GFLD performs almost as well as the SFLD-sub and SFLD methods, but the difference lies in the computational cost needed to select the features in GFLD: the computational cost of GFLD is proportional to d^3 whereas that of SFLD is proportional to d^2. According to an embodiment of the present disclosure, a method for sparse formulation of the Fisher Linear Discriminant is applied to medical images. The method is applicable to other images. Experimental results favor the proposed algorithm over two other feature selection/regularization techniques implemented in the FLD framework, both in terms of prediction accuracy and in terms of computational cost for large datasets. 
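The LOPO split used by all of the wrapper evaluations above can be sketched as a generator over patient identifiers (the function and variable names are illustrative):

```python
def lopo_splits(patient_ids):
    """Yield (train_idx, test_idx) pairs in which every candidate belonging
    to the held-out patient -- supine and prone views alike -- falls in the
    test fold, so a polyp is never split across folds."""
    patients = sorted(set(patient_ids))
    for p in patients:
        test = [i for i, pid in enumerate(patient_ids) if pid == p]
        train = [i for i, pid in enumerate(patient_ids) if pid != p]
        yield train, test
```

Both views of a patient share one identifier, so candidates from the same polyp always land in the same fold.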
Referring to Figure 6, a computer-implemented detection system includes an object detection module 601 determining a candidate object and a feature set for the candidate object. The system includes a feature selection module 602 coupled to the object detection module 601, wherein the feature selection module 602 receives the feature set and generates a reduced feature set having a desirable value of a Rayleigh quotient, and wherein the object detection module 601 implements the reduced feature set for detecting an object in an image. The feature selection module includes an initialization module 603 setting an initial value of a discriminant vector and a regularization parameter, a reduction module 604 determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set, a discriminant module 605 determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set, a sparsity module 606 determining a transformation vector, and an update module 607 updating the class scatter matrix and means according to the transformation vector, wherein the sparsity module 606 determines the discriminant vector given the updated class scatter matrix and means. Having described embodiments for a system and method for feature selection in an object detection system, it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments of the invention disclosed which are within the scope and spirit of the invention as defined by the appended claims. Having thus described the invention with the details and particularity required by the patent laws, what is claimed and desired to be protected by Letters Patent is set forth in the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A computer-implemented method for processing an image comprising: identifying a plurality of candidates for an object of interest in the image; extracting a feature set for each candidate; determining a reduced feature set by removing at least one redundant feature from the feature set to maximize a Rayleigh quotient; determining at least one candidate of the plurality of candidates as a positive candidate based on the reduced feature set; and displaying the positive candidate for analysis of the object.
2. The computer-implemented method of claim 1 , wherein determining the reduced feature set comprises: initializing a discriminant vector and a regularization parameter; and determining, iteratively, the reduced feature set.
3. The computer-implemented method of claim 2, wherein determining, iteratively, the reduced feature set comprises: determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set; determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set; determining a transformation vector; updating the class scatter matrix and means according to the transformation vector; and determining the discriminant vector.
4. The computer-implemented method of claim 2, further comprising: comparing, at each iteration, each element of the discriminant vector to a threshold; and stopping the iterative determination of the reduced feature set upon determining that all elements are greater than the threshold.
5. The computer-implemented method of claim 4, wherein the threshold is a user defined variable for controlling a degree to which features are eliminated.
6. The computer-implemented method of claim 2, wherein the transformation vector and the discriminant vector can be determined as: min_{α,a∈R^d} α^T (S_W ∗ (aa^T)) α  subject to  α^T ((m_+ − m_−) ∗ a) = b,  a^T e ≤ γ,  a ≥ 0.
7. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for processing an image, the method steps comprising: identifying a plurality of candidates for an object of interest in the image; extracting a feature set for each candidate; determining a reduced feature set by removing at least one redundant feature from the feature set to maximize a Rayleigh quotient; determining at least one candidate of the plurality of candidates as a positive candidate based on the reduced feature set; and displaying the positive candidate for analysis of the object.
8. The method of claim 7, wherein determining the reduced feature set comprises: initializing a discriminant vector and a regularization parameter; and determining, iteratively, the reduced feature set.
9. The method of claim 8, wherein determining, iteratively, the reduced feature set comprises: determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set; determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set; determining a transformation vector; updating the class scatter matrix and means according to the transformation vector; and determining the discriminant vector.
10. The method of claim 8, further comprising: comparing, at each iteration, each element of the discriminant vector to a threshold; and stopping the iterative determination of the reduced feature set upon determining that all elements are greater than the threshold.
11. The method of claim 10, wherein the threshold is a user defined variable for controlling a degree to which features are eliminated.
12. The method of claim 8, wherein the transformation vector and the discriminant vector can be determined as: min_{α,a∈R^d} α^T (S_W ∗ (aa^T)) α  subject to  α^T ((m_+ − m_−) ∗ a) = b,  a^T e ≤ γ,  a ≥ 0.
13. A computer-implemented detection system comprising: an object detection module determining a candidate object and a feature set for the candidate object; and a feature selection module coupled to the object detection module, wherein the feature selection module receives the feature set and generates a reduced feature set having a desirable value of a Rayleigh quotient, wherein the object detection modules implements the reduced feature set for detecting an object in an image.
14. The computer-implemented detection system of claim 13, wherein the feature selection module further comprises: an initialization module setting an initial value of a discriminant vector and a regularization parameter; a reduction module determining the reduced feature set according to the discriminant vector, wherein features of the feature set with an element of the discriminant vector greater than a threshold are selected as the reduced feature set; a discriminant module determining a class scatter matrix and mean in a reduced dimensional space defined by the reduced feature set; a sparsity module determining a transformation vector; and an update module updating the class scatter matrix and means according to the transformation vector, wherein the sparsity module determines the discriminant vector given the updated class scatter matrix and means.
PCT/US2005/019116 2004-06-02 2005-06-01 System and method for elimination of irrelevant and redundant features to improve cad performance WO2005122065A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US57611504P 2004-06-02 2004-06-02
US60/576,115 2004-06-02
US11/140,290 US20050281457A1 (en) 2004-06-02 2005-05-27 System and method for elimination of irrelevant and redundant features to improve cad performance
US11/140,290 2005-05-27

Publications (1)

Publication Number Publication Date
WO2005122065A1 true WO2005122065A1 (en) 2005-12-22

Family

ID=35480622

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/019116 WO2005122065A1 (en) 2004-06-02 2005-06-01 System and method for elimination of irrelevant and redundant features to improve cad performance

Country Status (2)

Country Link
US (1) US20050281457A1 (en)
WO (1) WO2005122065A1 (en)


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030172043A1 (en) * 1998-05-01 2003-09-11 Isabelle Guyon Methods of identifying patterns in biological systems and uses thereof



Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
I. GUYON AND A. ELISSEEFF: "An introduction to variable and feature selection", JOURNAL OF MACHINE LEARNING RESEARCH, vol. 3, March 2003 (2003-03-01), pages 1157 - 1182, XP002343161 *
J. WESTON ET AL: "Feature selection for SVMs", NEURAL INFORMATION PROCESSING SYSTEMS, vol. 13, 27 November 2000 (2000-11-27), pages 668 - 674, XP002343162 *
S. MIKA ET AL: "An improved training algorithm for kernel Fisher discriminants", PROCEEDINGS OF THE EIGHTH INTERNATIONAL WORKSHOP ON ARTIFICIAL INTELLIGENCE AND STATISTICS, 4 January 2001 (2001-01-04), pages 98 - 104, XP002343160 *

Also Published As

Publication number Publication date
US20050281457A1 (en) 2005-12-22


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 Ep: pct application non-entry in european phase