CA3098876A1 - Machine learning enabled biological polymer assembly

Info

Publication number
CA3098876A1
Authority
CA
Canada
Prior art keywords
assembly
locations
location
learning model
nucleotide
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3098876A
Other languages
English (en)
French (fr)
Inventor
Minh Duc Cao
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quantum Si Inc
Original Assignee
Quantum Si Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-05-14
Filing date
2019-05-13
Publication date
2019-11-21
Application filed by Quantum Si Inc

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/08 Learning methods
    • G16B15/00 ICT specially adapted for analysing two-dimensional [2D] or three-dimensional [3D] molecular structures, e.g. structural or functional relations or structure alignment
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G16B30/20 Sequence assembly
    • G16B40/30 Unsupervised data analysis
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
  • Addition Polymer Or Copolymer, Post-Treatments, Or Chemical Modifications (AREA)
CA3098876A 2018-05-14 2019-05-13 Machine learning enabled biological polymer assembly Pending CA3098876A1 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201862671260P 2018-05-14 2018-05-14
US62/671,260 2018-05-14
US201862671884P 2018-05-15 2018-05-15
US62/671,884 2018-05-15
PCT/US2019/032065 WO2019222120A1 (en) 2018-05-14 2019-05-13 Machine learning enabled biological polymer assembly

Publications (1)

Publication Number Publication Date
CA3098876A1 (en) 2019-11-21

Family

ID=66669118

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3098876A Pending CA3098876A1 (en) 2018-05-14 2019-05-13 Machine learning enabled biological polymer assembly

Country Status (10)

Country Link
US (1) US20190348152A1
EP (1) EP3794596A1
JP (1) JP2021523479A
KR (1) KR20210010488A
CN (1) CN112437961A
AU (1) AU2019270961A1
BR (1) BR112020022257A2
CA (1) CA3098876A1
MX (1) MX2020012278A
WO (1) WO2019222120A1

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3624068A1 (en) * 2018-09-14 2020-03-18 Covestro Deutschland AG Method for improving prediction relating to the production of a polymeric produc
US11664090B2 (en) * 2020-06-11 2023-05-30 Life Technologies Corporation Basecaller with dilated convolutional neural network
EP4211691A1 (en) * 2020-09-11 2023-07-19 F. Hoffmann-La Roche AG Deep-learning-based techniques for generating a consensus sequence from multiple noisy sequences
CA3214755A1 (en) * 2021-04-09 2022-10-13 Natalie CASTELLANA Method for antibody identification from protein mixtures

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010127045A2 (en) * 2009-04-29 2010-11-04 Complete Genomics, Inc. Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence
EP2718862B1 (en) * 2011-06-06 2018-10-31 Koninklijke Philips N.V. Method for assembly of nucleic acid sequence data
EP3084002A4 (en) * 2013-12-16 2017-08-23 Complete Genomics, Inc. Basecaller for dna sequencing using machine learning
CA2894317C (en) * 2015-06-15 2023-08-15 Deep Genomics Incorporated Systems and methods for classifying, prioritizing and interpreting genetic variants and therapies using a deep neural network

Also Published As

Publication number Publication date
MX2020012278A (es) 2021-01-29
JP2021523479A (ja) 2021-09-02
WO2019222120A1 (en) 2019-11-21
KR20210010488A (ko) 2021-01-27
AU2019270961A1 (en) 2020-11-19
BR112020022257A2 (pt) 2021-02-23
CN112437961A (zh) 2021-03-02
EP3794596A1 (en) 2021-03-24
US20190348152A1 (en) 2019-11-14

Similar Documents

Publication Publication Date Title
US20190348152A1 (en) Machine learning enabled biological polymer assembly
US20200176082A1 (en) Analysis of nanopore signal using a machine-learning technique
US11817180B2 (en) Systems and methods for analyzing nucleic acid sequences
KR102433458B1 (ko) Semi-supervised learning for training an ensemble of deep convolutional neural networks
Gross et al. CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction
Allen et al. JIGSAW, GeneZilla, and GlimmerHMM: puzzling out the features of human genes in the ENCODE regions
US20240242075A1 (en) Deep Learning-Based Pathogenicity Classifier for Promoter Single Nucleotide Variants (pSNVs)
US20250069699A1 (en) Methods and Systems for Discovery of Embedded Target Genes in Biosynthetic Gene Clusters
JP2021523479A5
US20260004878A1 (en) Method for assuming organism or host, method for obtaining model for assuming organism or host, and computer device for performing the same
Beiko et al. GANN: genetic algorithm neural networks for the detection of conserved combinations of features in DNA
EP4584789A1 (en) Pathogenicity prediction for protein mutations using amino acid score distributions
WO2005059707A2 (en) Estimating gene networks using inferential methods and biological constraints
US10937523B2 (en) Methods, systems and computer readable storage media for generating accurate nucleotide sequences
Grassi et al. A functional strategy to characterize expression Quantitative Trait Loci
JPWO2019222120A5
Shaw Prediction of Isoform Functions and Interactions with ncRNAs via Deep Learning
Ahsan Learning from watching evolution
Fujimoto et al. Learning the language of genes: representing global codon bias with deep language models
El-Attar et al. Machine learning approaches for predicting cardiovascular disease: a systematic review and meta-analysis
Zheng Deep learning predicts the impact of non-coding genetic variants in human traits and diseases
Agarwala Cross-Species Prediction of Transcription Factor Binding
Eraslan Enriching the characterization of complex clinical and molecular phenotypes with deep learning
John et al. Tools for sequence assembly and annotation
Munteanu Computational models to investigate binding mechanisms of regulatory proteins

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20240513

D00 Search and/or examination requested or commenced

Free format text: ST27 STATUS EVENT CODE: A-2-2-D10-D00-D120 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: VOLUNTARY SUBMISSION OF PRIOR ART RECEIVED

Effective date: 20241104

W00 Other event occurred

Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W111 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CORRESPONDENT DETERMINED COMPLIANT

Effective date: 20250124

D00 Search and/or examination requested or commenced

Free format text: ST27 STATUS EVENT CODE: A-2-2-D10-D00-D123 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: PRIOR ART DISCLOSURE DETERMINED COMPLIANT

Effective date: 20250320

W00 Other event occurred

Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: LETTER SENT

Effective date: 20250320

D15 Examination report completed

Free format text: ST27 STATUS EVENT CODE: A-2-2-D10-D15-D126 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: EXAMINER'S REPORT

Effective date: 20250512

B12 Application deemed to be withdrawn, abandoned or lapsed

Free format text: ST27 STATUS EVENT CODE: N-6-6-B10-B12-B303 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: DEEMED ABANDONED - FAILURE TO RESPOND TO AN EXAMINER'S REQUISITION

Effective date: 20250912

W00 Other event occurred

Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: LETTER SENT

Effective date: 20251202

U13 Renewal or maintenance fee not paid

Free format text: ST27 STATUS EVENT CODE: N-2-6-U10-U13-U300 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: DEEMED ABANDONED - FAILURE TO RESPOND TO MAINTENANCE FEE NOTICE

Effective date: 20260202