WO2009067655A3 - Methods of feature selection through local learning; breast and prostate cancer prognostic markers

Publication number
WO2009067655A3
WO2009067655A3 PCT/US2008/084325 US2008084325W WO2009067655A3 WO 2009067655 A3 WO2009067655 A3 WO 2009067655A3 US 2008084325 W US2008084325 W US 2008084325W WO 2009067655 A3 WO2009067655 A3 WO 2009067655A3
Authority
WO
WIPO (PCT)
Prior art keywords
prostate cancer
feature selection
local learning
features
breast
Prior art date
Application number
PCT/US2008/084325
Other languages
French (fr)
Other versions
WO2009067655A2 (en)
Inventor
Yijun Sun
Steve Goodison
Li Liu
William George Farmerie
Original Assignee
University Of Florida Research Foundation, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2007-11-21
Filing date
2008-11-21
Publication date
2009-09-03
Application filed by University Of Florida Research Foundation, Inc.
Publication of WO2009067655A2
Publication of WO2009067655A3

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis

Landscapes

  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Genetics & Genomics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Epidemiology (AREA)
  • Software Systems (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A method is provided that addresses the feature selection problem in the presence of copious irrelevant features. According to this method, feature selection can be accomplished by decomposing a given complex problem into a set of locally linear problems through local learning, and estimating the relevance of features globally within a large margin framework. Local learning allows one to capture the local structure of the data, while the global parameter estimation within a large margin framework allows one to avoid possible overfitting. This method addresses many major issues of the prior art, including their problems with computational complexity, solution accuracy, algorithm implementation, exportability of selected features, and extension to multiclass settings. Using the method, a small number of genes useful for predicting the occurrence of distal metastases in breast cancer patients were identified: LOC58509, CEGP1, AL080059, ATP5E, and PRAME. Also using the method, prostate cancer prognostic signatures based on gene expression alone and on gene expression in combination with a post-operative nomogram were derived. Genes determined to be particularly relevant to prostate cancer prognosis include PCOLN3, TGFB3, PAK3, RBM34, RPL23, EI24, FUT7, RICS Rho, MAP4K4, CUTL1, and ZNF324B.
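
The two-step idea in the abstract (local learning to reduce the problem to locally linear ones, then a global large-margin estimate of feature relevance) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent text: the function name `local_margin_weights`, the use of the nearest same-class and nearest other-class neighbor under a weighted L1 distance, the logistic loss with an L1 penalty on nonnegative weights, and the parameters `lam`, `lr`, `outer_iters`, and `inner_iters`. It also assumes every class has at least two samples.

```python
import numpy as np


def local_margin_weights(X, y, lam=0.05, lr=0.05, outer_iters=5, inner_iters=200):
    """Illustrative sketch (hypothetical, not the patented procedure):
    learn nonnegative feature weights from local nearest-neighbor margins."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    w = np.ones(d)                      # start with all features equally relevant
    idx = np.arange(n)
    same = y[:, None] == y[None, :]     # same[i, j] is True when labels match
    # Per-feature absolute differences between all sample pairs: shape (n, n, d).
    diffs = np.abs(X[:, None, :] - X[None, :, :])

    for _ in range(outer_iters):
        # Local step: find each sample's nearest hit (same class) and
        # nearest miss (other class) under the current weighted L1 distance.
        dist = diffs @ w                # shape (n, n)
        np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
        hit = np.where(same, dist, np.inf).argmin(axis=1)
        miss = np.where(~same, dist, np.inf).argmin(axis=1)
        # Per-feature margin vectors: distance to miss minus distance to hit.
        Z = diffs[idx, miss] - diffs[idx, hit]   # shape (n, d)

        # Global step: projected gradient descent on
        #   mean(log(1 + exp(-Z @ w))) + lam * sum(w),  subject to w >= 0.
        for _ in range(inner_iters):
            margins = Z @ w
            # Numerically stable sigmoid(-margins).
            s = np.exp(-np.logaddexp(0.0, margins))
            grad = -(Z * s[:, None]).mean(axis=0) + lam
            w = np.maximum(w - lr * grad, 0.0)   # keep weights nonnegative
    return w
```

In this sketch, features whose weights shrink to zero would be treated as irrelevant, and the remaining features could be ranked by weight. The pairwise difference array grows with the square of the number of samples, so this form only suits small data sets.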
PCT/US2008/084325 2007-11-21 2008-11-21 Methods of feature selection through local learning; breast and prostate cancer prognostic markers WO2009067655A2 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US98959207P 2007-11-21 2007-11-21
US60/989,592 2007-11-21
US4023208P 2008-03-28 2008-03-28
US4023708P 2008-03-28 2008-03-28
US61/040,237 2008-03-28
US61/040,232 2008-03-28

Publications (2)

Publication Number Publication Date
WO2009067655A2 WO2009067655A2 (en) 2009-05-28
WO2009067655A3 WO2009067655A3 (en) 2009-09-03

Family

ID=40668094

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/084325 WO2009067655A2 (en) 2007-11-21 2008-11-21 Methods of feature selection through local learning; breast and prostate cancer prognostic markers

Country Status (1)

Country Link
WO (1) WO2009067655A2 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FI20105252A0 (en) * 2010-03-12 2010-03-12 Medisapiens Oy METHOD, ORGANIZATION AND COMPUTER SOFTWARE PRODUCT FOR ANALYZING A BIOLOGICAL OR MEDICAL SAMPLE
WO2012107786A1 (en) 2011-02-09 2012-08-16 Rudjer Boskovic Institute System and method for blind extraction of features from measurement data
CN104063631B (en) * 2014-06-13 2017-07-18 周家锐 A kind of metabolism group characteristic analysis method and its system towards big data
PL3262190T3 (en) * 2015-02-24 2021-12-20 Ruprecht-Karls-Universität Heidelberg Biomarker panel for the detection of cancer
US11062229B1 (en) * 2016-02-18 2021-07-13 Deepmind Technologies Limited Training latent variable machine learning models using multi-sample objectives
GB201616912D0 (en) 2016-10-05 2016-11-16 University Of East Anglia Classification of cancer
CN107391433B (en) * 2017-06-30 2021-04-13 天津大学 Feature selection method based on KDE conditional entropy of mixed features
CN109248542B (en) * 2017-07-12 2021-03-05 中国石油化工股份有限公司 Method and system for determining optimal adsorption time of pressure swing adsorption device
CN108763873A (en) * 2018-05-28 2018-11-06 苏州大学 A kind of gene sorting method and relevant device
CN109599175A (en) * 2018-12-04 2019-04-09 中山大学孙逸仙纪念医院 A kind of analysis of joint destroys the device and method of progress probability
CN109735622A (en) * 2019-03-07 2019-05-10 天津市第三中心医院 LncRNA relevant to colorectal cancer and its application
CN110070916B (en) * 2019-04-29 2023-04-18 安徽大学 Historical data-based cancer disease gene characteristic selection method
CN110489660B (en) * 2019-07-22 2020-12-18 武汉大学 User economic condition portrait method of social media public data
CN113177604B (en) * 2021-05-14 2024-04-16 东北大学 High-dimensional data feature selection method based on improved L1 regularization and clustering

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030065535A1 (en) * 2001-05-01 2003-04-03 Structural Bioinformatics, Inc. Diagnosing inapparent diseases from common clinical tests using bayesian analysis
US20030233197A1 (en) * 2002-03-19 2003-12-18 Padilla Carlos E. Discrete bayesian analysis of data
US20040209290A1 (en) * 2003-01-15 2004-10-21 Cobleigh Melody A. Gene expression markers for breast cancer prognosis
US20040265830A1 (en) * 2001-10-17 2004-12-30 Aniko Szabo Methods for identifying differentially expressed genes by multivariate analysis of microaaray data
US20050048542A1 (en) * 2003-07-10 2005-03-03 Baker Joffre B. Expression profile algorithm and test for cancer prognosis

Also Published As

Publication number Publication date
WO2009067655A2 (en) 2009-05-28

Similar Documents

Publication Publication Date Title
WO2009067655A3 (en) Methods of feature selection through local learning; breast and prostate cancer prognostic markers
Ishida The effect of ICT development on economic growth and energy consumption in Japan
Solís-Lemus et al. Bayesian species delimitation combining multiple genes and traits in a unified framework
Friedman et al. Data analysis with Bayesian networks: A bootstrap approach
Shen et al. rMATS: robust and flexible detection of differential alternative splicing from replicate RNA-Seq data
Mastrandrea et al. Enhanced reconstruction of weighted networks from strengths and degrees
Lee et al. Locally adaptive spatial smoothing using conditional auto-regressive models
WO2009002949A3 (en) System, method and apparatus for predictive modeling of specially distributed data for location based commercial services
WO2020154830A1 (en) Techniques to detect fusible operators with machine learning
Shih et al. Reliability of readmission rates as a hospital quality measure in cardiac surgery
He et al. Robust twin boosting for feature selection from high-dimensional omics data with label noise
JP2009535644A5 (en)
JP2010092266A (en) Learning device, learning method and program
Bilal et al. Novel deep learning algorithm predicts the status of molecular pathways and key mutations in colorectal cancer from routine histology images
WO2020132572A1 (en) Source of origin deconvolution based on methylation fragments in cell-free-dna samples
Shao et al. Change point determination for a multivariate process using a two-stage hybrid scheme
Wang et al. MSB: a mean-shift-based approach for the analysis of structural variation in the genome
Asensio et al. Self-adaptive grids for noise mapping refinement
EP1939796A3 (en) Data processing apparatus, data processing method data processing program and computer readable medium
Maity et al. Testing in semiparametric models with interaction, with applications to gene–environment interactions
US20230090925A1 (en) Methylation fragment probabilistic noise model with noisy region filtration
CN114938248B (en) Method for building and demodulating underwater wireless optical communication demodulation model
Dutta et al. A hybrid ensemble model of kriging and neural network for ore grade estimation
Wu et al. A multifactor dimensionality reduction-logistic regression model of gene polymorphisms and an environmental interaction analysis in cancer research
US20190318802A1 (en) Method and apparatus for improved determination of node influence in a network

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 08851210

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 08851210

Country of ref document: EP

Kind code of ref document: A2