CN112437961A - Machine learning enabled biological polymer assembly - Google Patents
Machine learning enabled biological polymer assembly
- Publication number
- CN112437961A
- Authority
- CN
- China
- Prior art keywords
- assembly
- nucleotide
- learning model
- locations
- biopolymer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional [2D] or three-dimensional [3D] molecular structures, e.g. structural or functional relations or structure alignment
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/20—Sequence assembly
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/30—Unsupervised data analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Data Mining & Analysis (AREA)
- Chemical & Material Sciences (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Analytical Chemistry (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Epidemiology (AREA)
- Public Health (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Apparatus Associated With Microorganisms And Enzymes (AREA)
- Addition Polymer Or Copolymer, Post-Treatments, Or Chemical Modifications (AREA)
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201862671260P | 2018-05-14 | 2018-05-14 | |
| US62/671,260 | 2018-05-14 | | |
| US201862671884P | 2018-05-15 | 2018-05-15 | |
| US62/671,884 | 2018-05-15 | | |
| PCT/US2019/032065 WO2019222120A1 (en) | 2018-05-14 | 2019-05-13 | Machine learning enabled biological polymer assembly |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN112437961A (zh) | 2021-03-02 |
Family
ID=66669118
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201980047341.5A (Pending), published as CN112437961A (zh) | Machine learning enabled biological polymer assembly | 2018-05-14 | 2019-05-13 |
Country Status (10)
| Country | Link |
|---|---|
| US (1) | US20190348152A1 |
| EP (1) | EP3794596A1 |
| JP (1) | JP2021523479A |
| KR (1) | KR20210010488A |
| CN (1) | CN112437961A |
| AU (1) | AU2019270961A1 |
| BR (1) | BR112020022257A2 |
| CA (1) | CA3098876A1 |
| MX (1) | MX2020012278A |
| WO (1) | WO2019222120A1 |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3624068A1 (en) * | 2018-09-14 | 2020-03-18 | Covestro Deutschland AG | Method for improving prediction relating to the production of a polymeric product |
| US11664090B2 (en) * | 2020-06-11 | 2023-05-30 | Life Technologies Corporation | Basecaller with dilated convolutional neural network |
| EP4211691A1 (en) * | 2020-09-11 | 2023-07-19 | F. Hoffmann-La Roche AG | Deep-learning-based techniques for generating a consensus sequence from multiple noisy sequences |
| CA3214755A1 (en) * | 2021-04-09 | 2022-10-13 | Natalie CASTELLANA | Method for antibody identification from protein mixtures |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
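| CN102460155A (zh) * | 2009-04-29 | 2012-05-16 | 考利达基因组股份有限公司 | Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence |
| CN103797486A (zh) * | 2011-06-06 | 2014-05-14 | 皇家飞利浦有限公司 | Method for assembling nucleic acid sequence data |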
| US20150169824A1 (en) * | 2013-12-16 | 2015-06-18 | Complete Genomics, Inc. | Basecaller for dna sequencing using machine learning |
| CA2894317A1 (en) * | 2015-06-15 | 2016-12-15 | Deep Genomics Incorporated | Systems and methods for classifying, prioritizing and interpreting genetic variants and therapies using a deep neural network |
-
2019
- 2019-05-13 CA CA3098876A patent/CA3098876A1/en active Pending
- 2019-05-13 BR BR112020022257-7A patent/BR112020022257A2/pt not_active IP Right Cessation
- 2019-05-13 JP JP2020564123A patent/JP2021523479A/ja active Pending
- 2019-05-13 CN CN201980047341.5A patent/CN112437961A/zh active Pending
- 2019-05-13 MX MX2020012278A patent/MX2020012278A/es unknown
- 2019-05-13 WO PCT/US2019/032065 patent/WO2019222120A1/en not_active Ceased
- 2019-05-13 AU AU2019270961A patent/AU2019270961A1/en not_active Abandoned
- 2019-05-13 KR KR1020207035288A patent/KR20210010488A/ko not_active Ceased
- 2019-05-13 US US16/411,056 patent/US20190348152A1/en not_active Abandoned
- 2019-05-13 EP EP19727233.9A patent/EP3794596A1/en not_active Withdrawn
Non-Patent Citations (12)
| Title |
|---|
| ALLEX C F, ET AL.: "Neural network input representations that produce accurate consensus sequences from DNA fragment assemblies", BIOINFORMATICS, vol. 15, no. 9, 1 September 1999 (1999-09-01), pages 723 - 728, XP002267211, DOI: 10.1093/bioinformatics/15.9.723 * |
| HIRANUMA N, ET AL.: "DeepATAC: A deep-learning method to predict regulatory factor binding activity from ATAC-seq signals", COLD SPRING HARBOR LABORATORY, vol. 33, no. 14, 12 July 2017 (2017-07-12), pages 1 - 5 * |
| HOLMAN A G, ET AL.: "A Machine Learning Approach for Identifying Amino Acid Signatures in the HIV Env Gene Predictive of Dementia", PLOS ONE, vol. 7, no. 11, 14 November 2012 (2012-11-14), pages 2 * |
| KELLEY D R, ET AL.: "Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks", GENOME RESEARCH, vol. 26, no. 7, 1 July 2016 (2016-07-01), pages 990 - 999, XP055507160, DOI: 10.1101/gr.200535.115 * |
| KOH P W, ET AL.: "Denoising genome-wide histone ChIP-seq with convolutional neural networks", BIOINFORMATICS, vol. 33, no. 14, 24 July 2017 (2017-07-24), pages 1 - 9 * |
| LOMAN N J, ET AL.: "A complete bacterial genome assembled de novo using only nanopore sequencing data", BIORXIV, 11 March 2015 (2015-03-11), pages 1 - 30 * |
| LOMAN N J, ET AL.: "A complete bacterial genome assembled de novo using only nanopore sequencing data", NATURE METHODS, vol. 12, no. 8, 1 August 2015 (2015-08-01), pages 733 * |
| LUO R B, ET AL.: "Clairvoyante: a multi-task convolutional deep neural network for variant calling in Single Molecule Sequencing", BIORXIV, vol. 10, 28 April 2018 (2018-04-28), pages 1 - 20 * |
| RYAN P, ET AL.: "A universal SNP and small-indel variant caller using deep neural networks", NATURE BIOTECHNOLOGY, vol. 36, no. 10, 20 March 2018 (2018-03-20), pages 983 * |
| RYAN P, ET AL.: "Creating a universal SNP and small indel variant caller with deep neural networks", BIORXIV, 21 December 2016 (2016-12-21), pages 1 - 13 * |
| UMAROV R K, ET AL.: "Recognition of prokaryotic and eukaryotic promoters using convolutional deep learning neural networks", PLOS ONE, vol. 12, no. 2, 22 March 2017 (2017-03-22), pages 1 - 12 * |
| VASER R, ET AL.: "Fast and accurate de novo genome assembly from long uncorrected reads", GENOME RESEARCH, vol. 27, no. 5, 1 May 2017 (2017-05-01), pages 737 - 746, XP055608901, DOI: 10.1101/gr.214270.116 * |
Also Published As
| Publication number | Publication date |
|---|---|
| MX2020012278A (es) | 2021-01-29 |
| JP2021523479A (ja) | 2021-09-02 |
| WO2019222120A1 (en) | 2019-11-21 |
| KR20210010488A (ko) | 2021-01-27 |
| AU2019270961A1 (en) | 2020-11-19 |
| BR112020022257A2 (pt) | 2021-02-23 |
| EP3794596A1 (en) | 2021-03-24 |
| US20190348152A1 (en) | 2019-11-14 |
| CA3098876A1 (en) | 2019-11-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20250226056A1 (en) | | Variant classifier based on deep neural networks |
| KR102433458B1 (ko) | | Semi-supervised learning for training an ensemble of deep convolutional neural networks |
| CN112437961A (zh) | | Machine learning enabled biological polymer assembly |
| US20200176082A1 (en) | | Analysis of nanopore signal using a machine-learning technique |
| US11769073B2 (en) | | Methods and systems for producing an expanded training set for machine learning using biological sequences |
| US20230298698A1 (en) | | Methods and systems for sequence generation and prediction |
| WO2023197718A9 (zh) | | Method for predicting circular RNA IRES |
| CN109411020A (zh) | | Method for whole-genome sequence gap filling using long sequencing reads |
| Zych et al. | | reGenotyper: Detecting mislabeled samples in genetic data |
| JP2021523479A5 | | |
| CN108427865B (zh) | | Method for predicting associations between LncRNA and environmental factors |
| US20260004878A1 (en) | | Method for assuming organism or host, method for obtaining model for assuming organism or host, and computer device for performing the same |
| US20250253012A1 (en) | | Error Correction of Nucleic Acid Sequencing Reads |
| US10937523B2 (en) | | Methods, systems and computer readable storage media for generating accurate nucleotide sequences |
| WO2024130230A2 (en) | | Systems and methods for evaluation of expression patterns |
| CN114298214B (zh) | | Protein anomaly detection method based on ultra-large-scale evolutionary algorithm and hardware acceleration |
| US10216899B2 (en) | | Sentence construction for DNA classification |
| JPWO2019222120A5 | | |
| Agarwala | | Cross-Species Prediction of Transcription Factor Binding |
| Zagganas et al. | | Simplifying p-value calculation for the unbiased microRNA enrichment analysis, using ML-techniques. |
| Shaw | | Prediction of Isoform Functions and Interactions with ncRNAs via Deep Learning |
| Eraslan | | Enriching the characterization of complex clinical and molecular phenotypes with deep learning |
| Balaji | | Journey Into the Unknown: Graph and Machine Learning Based Approaches for Improved Characterization of Novel Pathogens |
| Fujimoto et al. | | Learning the language of genes: representing global codon bias with deep language models |
| US8762119B2 (en) | | Method, system and apparatus to predict and/or recognize and/or classify biological sequences |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210302 |