CN117581304A - 基于压缩状态的碱基调用 - Google Patents

基于压缩状态的碱基调用 Download PDF

Info

Publication number
CN117581304A
CN117581304A CN202280045844.0A CN202280045844A CN117581304A CN 117581304 A CN117581304 A CN 117581304A CN 202280045844 A CN202280045844 A CN 202280045844A CN 117581304 A CN117581304 A CN 117581304A
Authority
CN
China
Prior art keywords
channel
sequencing
cluster
intensity
per
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280045844.0A
Other languages
English (en)
Chinese (zh)
Inventor
G·D·帕纳比
E·J·奥贾德
D·卡什夫哈吉吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inmair Ltd
Original Assignee
Inmair Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/944,948 external-priority patent/US20230087698A1/en
Application filed by Inmair Ltd filed Critical Inmair Ltd
Priority claimed from PCT/US2022/044293 external-priority patent/WO2023049215A1/en
Publication of CN117581304A publication Critical patent/CN117581304A/zh
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/94Hardware or software architectures specially adapted for image or video understanding
    • G06V10/955Hardware or software architectures specially adapted for image or video understanding using specific electronic processors
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/69Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/698Matching; Classification
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20Supervised data analysis
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Bioethics (AREA)
  • Public Health (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
CN202280045844.0A 2021-09-22 2022-09-21 基于压缩状态的碱基调用 Pending CN117581304A (zh)

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
US202163247301P 2021-09-22 2021-09-22
US202163247296P 2021-09-22 2021-09-22
US63/247296 2021-09-22
US63/247301 2021-09-22
US17/944948 2022-09-14
US17/944,948 US20230087698A1 (en) 2021-09-22 2022-09-14 Compressed state-based base calling
US17/944,809 US12412387B2 (en) 2021-09-22 2022-09-14 State-based base calling
US17/944809 2022-09-14
PCT/US2022/044293 WO2023049215A1 (en) 2021-09-22 2022-09-21 Compressed state-based base calling

Publications (1)

Publication Number Publication Date
CN117581304A true CN117581304A (zh) 2024-02-20

Family

ID=88067183

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202280045844.0A Pending CN117581304A (zh) 2021-09-22 2022-09-21 基于压缩状态的碱基调用
CN202280046324.1A Pending CN117581305A (zh) 2021-09-22 2022-09-21 基于状态的碱基调用

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202280046324.1A Pending CN117581305A (zh) 2021-09-22 2022-09-21 基于状态的碱基调用

Country Status (4)

Country Link
US (1) US12412387B2 (https=)
EP (2) EP4405956A2 (https=)
JP (2) JP2024534302A (https=)
CN (2) CN117581304A (https=)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023197153A1 (en) * 2022-04-12 2023-10-19 Orange Systems and methods to accelerate creation of and/or searching for digital twins on a computerized platform
US20250021801A1 (en) * 2023-07-12 2025-01-16 Canon Medical Systems Corporation Mapping method and apparatus

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3181696A1 (en) 2013-12-03 2015-06-11 Paul BELITZ Methods and systems for analyzing image data
EP3084002A4 (en) * 2013-12-16 2017-08-23 Complete Genomics, Inc. Basecaller for dna sequencing using machine learning
US11378544B2 (en) * 2018-01-08 2022-07-05 Illumina, Inc. High-throughput sequencing with semiconductor-based detection
US12073922B2 (en) * 2018-07-11 2024-08-27 Illumina, Inc. Deep learning-based framework for identifying sequence patterns that cause sequence-specific errors (SSEs)
US11783917B2 (en) 2019-03-21 2023-10-10 Illumina, Inc. Artificial intelligence-based base calling
WO2020191389A1 (en) 2019-03-21 2020-09-24 Illumina, Inc. Training data generation for artificial intelligence-based sequencing
US11593649B2 (en) 2019-05-16 2023-02-28 Illumina, Inc. Base calling using convolutions
US20210265009A1 (en) 2020-02-20 2021-08-26 Illumina, Inc. Artificial Intelligence-Based Base Calling of Index Sequences
CN121034400A (zh) * 2020-02-20 2025-11-28 因美纳有限公司 基于人工智能的多对多碱基判读
US12591780B2 (en) * 2020-02-20 2026-03-31 Illumina, Inc. Data compression for artificial intelligence-based base calling

Also Published As

Publication number Publication date
JP2024536665A (ja) 2024-10-08
CN117581305A (zh) 2024-02-20
EP4405955A1 (en) 2024-07-31
US20230298339A1 (en) 2023-09-21
EP4405956A2 (en) 2024-07-31
JP2024534302A (ja) 2024-09-20
US12412387B2 (en) 2025-09-09

Similar Documents

Publication Publication Date Title
US11908548B2 (en) Training data generation for artificial intelligence-based sequencing
JP7754822B2 (ja) 人工知能ベースのベースコーラの知識蒸留及び勾配プルーニングに基づく圧縮
US11347965B2 (en) Training data generation for artificial intelligence-based sequencing
WO2020205296A1 (en) Artificial intelligence-based generation of sequencing metadata
US20230343414A1 (en) Sequence-to-sequence base calling
NL2023311B9 (en) Artificial intelligence-based generation of sequencing metadata
NL2023310B1 (en) Training data generation for artificial intelligence-based sequencing
US12412387B2 (en) State-based base calling
US20230087698A1 (en) Compressed state-based base calling
WO2023049212A2 (en) State-based base calling
HK40076748A (en) Knowledge distillation and gradient pruning-based compression of artificial intelligence-based base caller
HK40076748B (en) Knowledge distillation and gradient pruning-based compression of artificial intelligence-based base caller
HK40058973B (en) Training data generation for artificial intelligence-based sequencing
HK40058973A (en) Training data generation for artificial intelligence-based sequencing

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination