CA3142888A1 - Techniques for protein identification using machine learning and related systems and methods

Info

Publication number
CA3142888A1
Authority
CA
Canada
Prior art keywords
data
learning model
machine learning
amino acids
training
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3142888A
Other languages
English (en)
French (fr)
Inventor
Zhizhuo ZHANG
Sabrina RASHID
Bradley Robert Parry
Michael Meyer
Brian Reed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quantum Si Inc
Original Assignee
Quantum Si Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-06-12
Filing date
2020-06-12
Publication date
2020-12-17
Application filed by Quantum Si Inc
Publication of CA3142888A1
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 Sequence assembly
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0895 Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/30 Unsupervised data analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Bioethics (AREA)
  • Public Health (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
CA3142888A 2019-06-12 2020-06-12 Techniques for protein identification using machine learning and related systems and methods Pending CA3142888A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201962860750P 2019-06-12 2019-06-12
US62/860,750 2019-06-12
PCT/US2020/037541 WO2020252345A1 (en) 2019-06-12 2020-06-12 Techniques for protein identification using machine learning and related systems and methods

Publications (1)

Publication Number Publication Date
CA3142888A1 (en) 2020-12-17

Family

ID=71409529

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3142888A Pending CA3142888A1 (en) 2019-06-12 2020-06-12 Techniques for protein identification using machine learning and related systems and methods

Country Status (10)

Country Link
US (1) US20200395099A1
EP (1) EP3966824A1
JP (1) JP2022536343A
KR (1) KR20220019778A
CN (1) CN115989545A
AU (1) AU2020290510A1
BR (1) BR112021024915A2
CA (1) CA3142888A1
MX (1) MX2021015347A
WO (1) WO2020252345A1

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
BR112021008098A2 (pt) 2018-11-15 2021-08-10 Quantum-Si Incorporated Methods and compositions for protein sequencing
US11126890B2 (en) * 2019-04-18 2021-09-21 Adobe Inc. Robust training of large-scale object detectors with a noisy dataset
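CN114929897A (zh) * 2019-10-28 2022-08-19 宽腾矽公司 Methods of preparing an enriched sample for polypeptide sequencing
JP7592737B2 (ja) 2020-03-06 2024-12-02 ボストンジーン コーポレイション Determining tissue characteristics using multiplexed immunofluorescence imaging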
WO2021236983A2 (en) 2020-05-20 2021-11-25 Quantum-Si Incorporated Methods and compositions for protein sequencing
CA3227592A1 (en) * 2021-09-22 2023-03-30 Gregory KAPP Methods and systems for determining polypeptide interactions
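CN114118366B (zh) * 2021-11-15 2025-04-22 国网浙江省电力有限公司电力科学研究院 Method for detecting electric shock current in living bodies based on a long short-term memory neural network
CN114093415B (zh) * 2021-11-19 2022-06-03 中国科学院数学与系统科学研究院 Peptide detectability prediction method and system
CN114456926A (zh) * 2022-03-15 2022-05-10 常州市环境科学研究院 Automatic phytoplankton culture device and method capable of predicting growth trends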
US12587274B2 (en) 2023-03-28 2026-03-24 Quantum Generative Materials Llc Satellite optimization management system based on natural language input and artificial intelligence
WO2025057424A1 (ja) 2023-09-15 2025-03-20 富士通株式会社 Information processing program, information processing device, and information processing method
WO2025128525A1 (en) * 2023-12-11 2025-06-19 Research Development Foundation System and method for predicting microproteins
WO2025123211A1 (zh) * 2023-12-12 2025-06-19 深圳华大生命科学研究院 Construction and application of a polypeptide classifier
WO2025123212A1 (zh) * 2023-12-12 2025-06-19 深圳华大生命科学研究院 Polypeptide signal extraction method based on an object detection model
US12368503B2 (en) 2023-12-27 2025-07-22 Quantum Generative Materials Llc Intent-based satellite transmit management based on preexisting historical location and machine learning
US12603701B2 (en) 2023-12-27 2026-04-14 Quantum Generative Materials Llc Distributed satellite constellation management and control system
CN117744748B (zh) * 2024-02-20 2024-04-30 北京普译生物科技有限公司 Neural network model training method, base calling method and apparatus, and electronic device

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050119454A1 (en) * 2000-01-24 2005-06-02 The Cielo Institute, Inc. Algorithmic design of peptides for binding and/or modulation of the functions of receptors and/or other proteins
CA2466792A1 (en) * 2003-05-16 2004-11-16 Affinium Pharmaceuticals, Inc. Evaluation of spectra
EP2389585A2 (en) * 2009-01-22 2011-11-30 Li-Cor, Inc. Single molecule proteomics with dynamic probes
US20120015825A1 (en) * 2010-07-06 2012-01-19 Pacific Biosciences Of California, Inc. Analytical systems and methods with software mask
WO2016069124A1 (en) * 2014-09-15 2016-05-06 Board Of Regents, The University Of Texas System Improved single molecule peptide sequencing
EP4414988A3 (en) * 2013-01-31 2024-11-06 Codexis, Inc. Methods, systems, and software for identifying bio-molecules using models of multiplicative form
US9212996B2 (en) * 2013-08-05 2015-12-15 Tellspec, Inc. Analyzing and correlating spectra, identifying samples and their ingredients, and displaying related personalized information
BR112016006284B1 (pt) * 2013-09-27 2022-07-26 Codexis, Inc Computer-implemented method, computer program product, and computer system
MX384725B (es) * 2014-08-08 2025-03-14 Quantum Si Inc Integrated device with external light source for probing, detecting, and analyzing molecules
WO2017214320A1 (en) * 2016-06-07 2017-12-14 Edico Genome, Corp. Bioinformatics systems, apparatus, and methods for performing secondary and/or tertiary processing
EP3568782A1 (en) * 2017-01-13 2019-11-20 Massachusetts Institute Of Technology Machine learning based antibody design
EA201992476A1 (ru) * 2017-04-18 2020-02-25 Икс-Чем, Инк. Methods for identifying compounds
US11573239B2 (en) * 2017-07-17 2023-02-07 Bioinformatics Solutions Inc. Methods and systems for de novo peptide sequencing using deep learning
US11587644B2 (en) * 2017-07-28 2023-02-21 The Translational Genomics Research Institute Methods of profiling mass spectral data using neural networks
WO2019152943A1 (en) * 2018-02-02 2019-08-08 Arizona Board Of Regents, For And On Behalf Of, Arizona State University Methods, systems, and media for predicting functions of molecular sequences
KR102885910B1 (ko) * 2018-02-17 2025-11-13 리제너론 파마슈티칼스 인코포레이티드 GAN-CNN for MHC peptide binding prediction
US20210151123A1 (en) * 2018-03-08 2021-05-20 Jungla Inc. Interpretation of Genetic and Genomic Variants via an Integrated Computational and Experimental Deep Mutational Learning Framework
US20210239705A1 (en) * 2018-06-06 2021-08-05 Nautilus Biotechnology, Inc. Methods and applications of protein identification
BR112021008098A2 (pt) * 2018-11-15 2021-08-10 Quantum-Si Incorporated Methods and compositions for protein sequencing

Also Published As

Publication number Publication date
WO2020252345A9 (en) 2022-02-10
BR112021024915A2 (pt) 2022-01-18
JP2022536343A (ja) 2022-08-15
CN115989545A (zh) 2023-04-18
KR20220019778A (ko) 2022-02-17
MX2021015347A (es) 2022-04-06
US20200395099A1 (en) 2020-12-17
WO2020252345A1 (en) 2020-12-17
AU2020290510A1 (en) 2022-02-03
EP3966824A1 (en) 2022-03-16

Similar Documents

Publication Publication Date Title
US20200395099A1 (en) Techniques for protein identification using machine learning and related systems and methods
US20230207068A1 (en) Methods of Profiling Mass Spectral Data Using Neural Networks
AU2019211435B2 (en) Machine learning enabled pulse and base calling for sequencing devices
US20230114905A1 (en) Highly multiplexable analysis of proteins and proteomes
Pierleoni et al. PredGPI: a GPI-anchor predictor
Rich et al. Grading the commercial optical biosensor literature—Class of 2008:‘The Mighty Binders’
US10006919B2 (en) Peptide array quality control
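JP2018504587A (ja) Analysis and screening of cell secretion profiles
WO2023035745A1 (zh) Methods and apparatus for olfactory receptor screening, model training, and identification of alcoholic beverage products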
US20220277811A1 (en) Detecting False Positive Variant Calls In Next-Generation Sequencing
WO2023035744A1 (zh) Olfactory receptors, recombinant cells, kits, and uses thereof
Kirsch et al. Localizing genes to cerebellar layers by classifying ISH images
Colonna et al. Implementation and validation of single-cell genomics experiments in neuroscience
US20230360732A1 (en) Systems and methods for assessing and improving the quality of multiplex molecular assays
Smith et al. Estimating error rates for single molecule protein sequencing experiments
WO2012059748A1 (en) Method, apparatus and software for identifying cells
CN116741265A (zh) Machine learning-based nanopore protein sequencing data processing method and application thereof
US20240321393A1 (en) Cell-type optimization method and scanner
US20250299777A1 (en) Systems and methods of phenotype classification using shotgun analysis of nanopore signals
Pal et al. Dried biofluid droplet morphologies for automated and scalable disease classification
US20240087679A1 (en) Systems and methods of validating new affinity reagents
Mohamed Adaptable Biophysically-Interpretable Neural Networks in Genomics and Biomedicine
EP4195219A1 (en) Means and methods for the binary classification of ms1 maps and the recognition of discriminative features in proteomes
Sabherwal et al. Stacking ensemble-based machine learning for classification of Parkinson’s disease
CN121963841A (zh) Comprehensive prediction and screening method for anticancer peptides based on multimodal deep learning

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220926

U00 Fee paid

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED

Effective date: 20240717

U11 Full renewal or maintenance fee paid

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT DETERMINED COMPLIANT

Effective date: 20240717

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT PAID IN FULL

Effective date: 20240717

MFA Maintenance fee for application paid

Free format text: FEE DESCRIPTION TEXT: MF (APPLICATION, 4TH ANNIV.) - STANDARD

Year of fee payment: 4

U00 Fee paid

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED

Effective date: 20240718

U11 Full renewal or maintenance fee paid

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT DETERMINED COMPLIANT

Effective date: 20240718

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT PAID IN FULL

Effective date: 20240718

D15 Examination report completed

Free format text: ST27 STATUS EVENT CODE: A-2-2-D10-D15-D126 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: EXAMINER'S REPORT

Effective date: 20240814

U00 Fee paid

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U00-U107 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: REFUND REQUEST RECEIVED

Effective date: 20241105

T11 Administrative time limit extension requested

Free format text: ST27 STATUS EVENT CODE: A-2-2-T10-T11-T100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: EXTENSION OF TIME FOR TAKING ACTION REQUEST RECEIVED

Effective date: 20241213

P11 Amendment of application requested

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P11-P100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: AMENDMENT RECEIVED - RESPONSE TO EXAMINER'S REQUISITION

Effective date: 20250214

W00 Other event occurred

Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W111 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CORRESPONDENT DETERMINED COMPLIANT

Effective date: 20250225

T13 Administrative time limit extension granted

Free format text: ST27 STATUS EVENT CODE: A-2-2-T10-T13-T101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: EXTENSION OF TIME FOR TAKING ACTION REQUIREMENTS DETERMINED COMPLIANT

Effective date: 20250714

W00 Other event occurred

Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W111 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CORRESPONDENT DETERMINED COMPLIANT

Effective date: 20250714

P11 Amendment of application requested

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P11-P102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: AMENDMENT DETERMINED COMPLIANT

Effective date: 20250819

P13 Application amended

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P13-X000 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: APPLICATION AMENDED

Effective date: 20250819