WO2009067655A3 - Methods of feature selection through local learning; breast and prostate cancer prognostic markers
- Publication number
- WO2009067655A3 (PCT/US2008/084325)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- prostate cancer
- feature selection
- local learning
- features
- breast
- Prior art date
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
Landscapes
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Genetics & Genomics (AREA)
- Theoretical Computer Science (AREA)
- Spectroscopy & Molecular Physics (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Epidemiology (AREA)
- Software Systems (AREA)
- Public Health (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Molecular Biology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
A method is provided that addresses the feature selection problem in the presence of copious irrelevant features. According to this method, feature selection is accomplished by decomposing a given complex problem into a set of locally linear problems through local learning, and by estimating the relevance of features globally within a large margin framework. Local learning captures the local structure of the data, while global parameter estimation within a large margin framework helps avoid overfitting. The method addresses many major issues of the prior art, including problems with computational complexity, solution accuracy, algorithm implementation, exportability of selected features, and extension to multiclass settings. Using the method, a small number of genes useful for predicting the occurrence of distal metastases in breast cancer patients were identified: LOC58509, CEGP1, AL080059, ATP5E, and PRAME. Also using the method, prostate cancer prognostic signatures were derived based on gene expression alone and on gene expression in combination with a post-operative nomogram. Genes determined to be particularly relevant to prostate cancer prognosis include PCOLN3, TGFB3, PAK3, RBM34, RPL23, EI24, FUT7, RICS Rho, MAP4K4, CUTL1, and ZNF324B.
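The two ingredients named in the abstract, local learning and global large-margin estimation of feature relevance, can be illustrated with a small Python sketch. It is a simplified RELIEF-style illustration under assumptions that are not taken from the patent text: hard nearest-hit and nearest-miss neighbours, a weighted L1 distance, a logistic loss with an L1 penalty, and projected gradient descent. All function names, parameters, and the toy data are hypothetical, and the sketch should not be read as the claimed method.

```python
import math
import random


def local_margin_vectors(X, y, w):
    """Per sample: |x - nearest miss| - |x - nearest hit|, feature by feature.

    Neighbours are chosen with the w-weighted L1 distance, so the globally
    estimated weights feed back into the local neighbourhoods.
    Assumes every class has at least two samples.
    """
    def dist(a, b):
        return sum(wj * abs(aj - bj) for wj, aj, bj in zip(w, a, b))

    Z = []
    for i, xi in enumerate(X):
        hits = [x for j, x in enumerate(X) if j != i and y[j] == y[i]]
        misses = [x for j, x in enumerate(X) if y[j] != y[i]]
        hit = min(hits, key=lambda x: dist(xi, x))
        miss = min(misses, key=lambda x: dist(xi, x))
        Z.append([abs(a - m) - abs(a - h) for a, h, m in zip(xi, hit, miss)])
    return Z


def select_features(X, y, lam=0.1, outer=5, inner=200, lr=0.05):
    """Return non-negative feature weights; larger means more relevant.

    Assumed objective: mean log(1 + exp(-w.z_n)) + lam * sum(w), with w >= 0.
    """
    d = len(X[0])
    w = [1.0] * d
    for _ in range(outer):
        Z = local_margin_vectors(X, y, w)  # local step
        for _ in range(inner):             # global large-margin step
            grad = [0.0] * d
            for z in Z:
                margin = sum(wj * zj for wj, zj in zip(w, z))
                # derivative of log(1 + exp(-margin)); clamp to avoid overflow
                s = -1.0 / (1.0 + math.exp(min(margin, 50.0)))
                for j in range(d):
                    grad[j] += s * z[j]
            # gradient step, then projection onto w >= 0
            w = [max(0.0, wj - lr * (gj / len(Z) + lam))
                 for wj, gj in zip(w, grad)]
    return w


def make_toy_data(n_per_class=20, n_noise=4, seed=0):
    """Hypothetical data: feature 0 separates the classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(2.0 * label, 0.3)]
            row += [rng.random() for _ in range(n_noise)]
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
weights = select_features(X, y)
print([round(wj, 3) for wj in weights])
```

In this toy setting the L1 penalty drives the weights of uninformative features toward zero, while the informative feature keeps a positive weight; ranking features by weight then gives the selection.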
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US98959207P | 2007-11-21 | 2007-11-21 | |
US60/989,592 | 2007-11-21 | | |
US4023208P | 2008-03-28 | 2008-03-28 | |
US4023708P | 2008-03-28 | 2008-03-28 | |
US61/040,237 | 2008-03-28 | | |
US61/040,232 | 2008-03-28 | | |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2009067655A2 (en) | 2009-05-28 |
WO2009067655A3 (en) | 2009-09-03 |
Family
ID=40668094
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2008/084325, WO2009067655A2 (en) | Methods of feature selection through local learning; breast and prostate cancer prognostic markers | 2007-11-21 | 2008-11-21 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2009067655A2 (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FI20105252A0 (en) * | 2010-03-12 | 2010-03-12 | Medisapiens Oy | METHOD, ORGANIZATION AND COMPUTER SOFTWARE PRODUCT FOR ANALYZING A BIOLOGICAL OR MEDICAL SAMPLE |
WO2012107786A1 (en) | 2011-02-09 | 2012-08-16 | Rudjer Boskovic Institute | System and method for blind extraction of features from measurement data |
CN104063631B (en) * | 2014-06-13 | 2017-07-18 | 周家锐 | A kind of metabolism group characteristic analysis method and its system towards big data |
PL3262190T3 (en) * | 2015-02-24 | 2021-12-20 | Ruprecht-Karls-Universität Heidelberg | Biomarker panel for the detection of cancer |
US11062229B1 (en) * | 2016-02-18 | 2021-07-13 | Deepmind Technologies Limited | Training latent variable machine learning models using multi-sample objectives |
GB201616912D0 (en) | 2016-10-05 | 2016-11-16 | University Of East Anglia | Classification of cancer |
CN107391433B (en) * | 2017-06-30 | 2021-04-13 | 天津大学 | Feature selection method based on KDE conditional entropy of mixed features |
CN109248542B (en) * | 2017-07-12 | 2021-03-05 | 中国石油化工股份有限公司 | Method and system for determining optimal adsorption time of pressure swing adsorption device |
CN108763873A (en) * | 2018-05-28 | 2018-11-06 | 苏州大学 | A kind of gene sorting method and relevant device |
CN109599175A (en) * | 2018-12-04 | 2019-04-09 | 中山大学孙逸仙纪念医院 | A kind of analysis of joint destroys the device and method of progress probability |
CN109735622A (en) * | 2019-03-07 | 2019-05-10 | 天津市第三中心医院 | LncRNA relevant to colorectal cancer and its application |
CN110070916B (en) * | 2019-04-29 | 2023-04-18 | 安徽大学 | Historical data-based cancer disease gene characteristic selection method |
CN110489660B (en) * | 2019-07-22 | 2020-12-18 | 武汉大学 | User economic condition portrait method of social media public data |
CN113177604B (en) * | 2021-05-14 | 2024-04-16 | 东北大学 | High-dimensional data feature selection method based on improved L1 regularization and clustering |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030065535A1 (en) * | 2001-05-01 | 2003-04-03 | Structural Bioinformatics, Inc. | Diagnosing inapparent diseases from common clinical tests using bayesian analysis |
US20030233197A1 (en) * | 2002-03-19 | 2003-12-18 | Padilla Carlos E. | Discrete bayesian analysis of data |
US20040209290A1 (en) * | 2003-01-15 | 2004-10-21 | Cobleigh Melody A. | Gene expression markers for breast cancer prognosis |
US20040265830A1 (en) * | 2001-10-17 | 2004-12-30 | Aniko Szabo | Methods for identifying differentially expressed genes by multivariate analysis of microarray data |
US20050048542A1 (en) * | 2003-07-10 | 2005-03-03 | Baker Joffre B. | Expression profile algorithm and test for cancer prognosis |
- 2008-11-21: WO application PCT/US2008/084325 (WO2009067655A2, en); status: active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2009067655A2 (en) | 2009-05-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 08851210; Country of ref document: EP; Kind code of ref document: A2 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 08851210; Country of ref document: EP; Kind code of ref document: A2 |