CN104919466A - Database-driven primary analysis of raw sequencing data - Google Patents

Database-driven primary analysis of raw sequencing data

Info

Publication number
CN104919466A
Authority
CN
China
Prior art keywords
mer
sequence
database
sequences
arbitrary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201380065692.1A
Other languages
English (en)
Chinese (zh)
Inventor
L·戈蒂埃
O·伦德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Danmarks Tekniske Universitet
Original Assignee
Danmarks Tekniske Universitet
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Danmarks Tekniske Universitet
Publication of CN104919466A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G16B30/20 Sequence assembly
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
CN201380065692.1A 2012-10-15 2013-10-11 Database-driven primary analysis of raw sequencing data Pending CN104919466A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
EP12188538 2012-10-15
EP12188538.8 2012-10-15
PCT/EP2013/071280 WO2014060305A1 (en) 2012-10-15 2013-10-11 Database-driven primary analysis of raw sequencing data

Publications (1)

Publication Number Publication Date
CN104919466A (zh) 2015-09-16

Family

ID=47357889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201380065692.1A Pending CN104919466A (zh) 2013-10-11 Database-driven primary analysis of raw sequencing data

Country Status (5)

Country Link
US (1) US20150294065A1 (en)
EP (1) EP2915084A1 (en)
JP (1) JP2016502162A (en)
CN (1) CN104919466A (en)
WO (1) WO2014060305A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
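CN107683477A (zh) * 2015-06-05 2018-02-09 利姆博思医学科技有限责任公司 Data quality management system and method
CN108699601A (zh) * 2016-02-11 2018-10-23 The Board of Trustees of the Leland Stanford Junior University Third-generation sequencing alignment algorithm
CN111128303A (zh) * 2018-10-31 2020-05-08 深圳华大生命科学研究院 Method and system for determining the corresponding sequence in a target species based on a known sequence
WO2021196358A1 (zh) * 2020-04-02 2021-10-07 Shanghai ZJ Bio-Tech Co., Ltd. Method and device for identifying specific region in microorganism target fragment and use thereof
CN113744806A (zh) * 2021-06-23 2021-12-03 杭州圣庭医疗科技有限公司 Method for identifying fungal sequencing data based on a nanopore sequencer
CN118051654A (zh) * 2024-04-15 2024-05-17 北京嘉和海森健康科技有限公司 Data analysis method and apparatus, electronic device, and readable storage medium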

Families Citing this family (33)

Publication number Priority date Publication date Assignee Title
US9679104B2 (en) 2013-01-17 2017-06-13 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
US9792405B2 (en) 2013-01-17 2017-10-17 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
US10691775B2 (en) 2013-01-17 2020-06-23 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
US10847251B2 (en) 2013-01-17 2020-11-24 Illumina, Inc. Genomic infrastructure for on-site or cloud-based DNA and RNA processing and analysis
WO2014113736A1 (en) 2013-01-17 2014-07-24 Edico Genome Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
US10068054B2 (en) 2013-01-17 2018-09-04 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
AU2014324729B2 (en) 2013-09-26 2019-08-22 Five3 Genomics, Llc Systems, methods, and compositions for viral-associated tumors
NL2011817C2 (en) * 2013-11-19 2015-05-26 Genalice B V A method of generating a reference index data structure and method for finding a position of a data pattern in a reference data structure.
US9697327B2 (en) 2014-02-24 2017-07-04 Edico Genome Corporation Dynamic genome reference generation for improved NGS accuracy and reproducibility
US9618474B2 (en) 2014-12-18 2017-04-11 Edico Genome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
US10006910B2 (en) 2014-12-18 2018-06-26 Agilome, Inc. Chemically-sensitive field effect transistors, systems, and methods for manufacturing and using the same
US9859394B2 (en) 2014-12-18 2018-01-02 Agilome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
WO2016100049A1 (en) 2014-12-18 2016-06-23 Edico Genome Corporation Chemically-sensitive field effect transistor
US10020300B2 (en) 2014-12-18 2018-07-10 Agilome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
US9857328B2 (en) 2014-12-18 2018-01-02 Agilome, Inc. Chemically-sensitive field effect transistors, systems and methods for manufacturing and using the same
WO2016103148A1 (en) * 2014-12-23 2016-06-30 Koninklijke Philips N.V. Systems, methods, and apparatuses for sequence alignment
WO2016154154A2 (en) 2015-03-23 2016-09-29 Edico Genome Corporation Method and system for genomic visualization
AU2016253004B2 (en) * 2015-04-24 2022-10-06 University Of Utah Research Foundation Methods and systems for multiple taxonomic classification
US11194778B2 (en) * 2015-12-18 2021-12-07 International Business Machines Corporation Method and system for hybrid sort and hash-based query execution
US10068183B1 (en) 2017-02-23 2018-09-04 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on a quantum processing platform
US20170270245A1 (en) 2016-01-11 2017-09-21 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods for performing secondary and/or tertiary processing
WO2017201081A1 (en) 2016-05-16 2017-11-23 Agilome, Inc. Graphene fet devices, systems, and methods of using the same for sequencing nucleic acids
US10246704B1 (en) 2017-12-29 2019-04-02 Clear Labs, Inc. Detection of microorganisms in food samples and food processing facilities
US10597714B2 (en) 2017-12-29 2020-03-24 Clear Labs, Inc. Automated priming and library loading device
EP3731959B1 (en) 2017-12-29 2025-10-08 Clear Labs, Inc. Automated priming and library loading device
US11314781B2 (en) 2018-09-28 2022-04-26 International Business Machines Corporation Construction of reference database accurately representing complete set of data items for faster and tractable classification usage
US11830580B2 (en) 2018-09-30 2023-11-28 International Business Machines Corporation K-mer database for organism identification
US11347810B2 (en) 2018-12-20 2022-05-31 International Business Machines Corporation Methods of automatically and self-consistently correcting genome databases
US11515011B2 (en) * 2019-08-09 2022-11-29 International Business Machines Corporation K-mer based genomic reference data compression
KR20230009877A (ko) * 2020-05-08 2023-01-17 Illumina, Inc. Genome sequencing and detection techniques
JP2023541090A (ja) * 2020-09-15 2023-09-28 Illumina, Inc. Software-accelerated genomic read mapping
JP2025512716A (ja) 2022-03-08 2025-04-22 Illumina, Inc. Multi-pass software-accelerated genomic read mapping engine
CN120432002A (zh) * 2025-07-08 2025-08-05 四川国际旅行卫生保健中心(成都海关口岸门诊部) Rapid sequencing and source-tracing analysis system for imported infectious diseases

Citations (3)

Publication number Priority date Publication date Assignee Title
US20060286566A1 (en) * 2005-02-03 2006-12-21 Helicos Biosciences Corporation Detecting apparent mutations in nucleic acid sequences
US20120004111A1 (en) * 2007-11-21 2012-01-05 Cosmosid Inc. Direct identification and measurement of relative populations of microorganisms with direct dna sequencing and probabilistic methods
CN102332064A (zh) * 2011-10-07 2012-01-25 Jilin University Biological species identification method based on gene barcodes

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20120000411A1 (en) 2010-07-02 2012-01-05 Jim Scoledes Anchor device for coral rock

Non-Patent Citations (2)

Title
DAVID R. MATHOG: "Parallel BLAST on split databases", BIOINFORMATICS *
ZEMIN NING ET AL: "SSAHA: a fast search method for large DNA databases", GENOME RESEARCH *

Cited By (9)

Publication number Priority date Publication date Assignee Title
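CN107683477A (zh) * 2015-06-05 2018-02-09 利姆博思医学科技有限责任公司 Data quality management system and method
CN108699601A (zh) * 2016-02-11 2018-10-23 The Board of Trustees of the Leland Stanford Junior University Third-generation sequencing alignment algorithm
CN111128303A (zh) * 2018-10-31 2020-05-08 深圳华大生命科学研究院 Method and system for determining the corresponding sequence in a target species based on a known sequence
CN111128303B (zh) * 2018-10-31 2023-09-15 深圳华大生命科学研究院 Method and system for determining the corresponding sequence in a target species based on a known sequence
WO2021196358A1 (zh) * 2020-04-02 2021-10-07 Shanghai ZJ Bio-Tech Co., Ltd. Method and device for identifying specific region in microorganism target fragment and use thereof
US12308093B2 (en) 2020-04-02 2025-05-20 Shanghai ZJ Bio-Tech Co., Ltd. Method and device for identifying specific region in microorganism target fragment and use thereof
CN113744806A (zh) * 2021-06-23 2021-12-03 杭州圣庭医疗科技有限公司 Method for identifying fungal sequencing data based on a nanopore sequencer
CN113744806B (zh) * 2021-06-23 2024-03-12 杭州圣庭医疗科技有限公司 Method for identifying fungal sequencing data based on a nanopore sequencer
CN118051654A (zh) * 2024-04-15 2024-05-17 北京嘉和海森健康科技有限公司 Data analysis method and apparatus, electronic device, and readable storage medium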

Also Published As

Publication number Publication date
WO2014060305A1 (en) 2014-04-24
EP2915084A1 (en) 2015-09-09
US20150294065A1 (en) 2015-10-15
JP2016502162A (ja) 2016-01-21

Similar Documents

Publication Publication Date Title
CN104919466A (zh) Database-driven primary analysis of raw sequencing data
Coelho et al. Towards the biogeography of prokaryotic genes
Törönen et al. PANNZER—a practical tool for protein function prediction
Ekim et al. Minimizer-space de Bruijn graphs: Whole-genome assembly of long reads in minutes on a personal computer
Steinegger et al. Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold
Ondov et al. Mash: fast genome and metagenome distance estimation using MinHash
Al-Ghalith et al. NINJA-OPS: fast accurate marker gene alignment using concatenated ribosomes
Keegan et al. MG-RAST, a metagenomics service for analysis of microbial community structure and function
Freitas et al. Accurate read-based metagenome characterization using a hierarchical suite of unique signatures
US11676683B2 (en) Secure communication of sensitive genomic information using probabilistic data structures
Shi et al. Fast and accurate metagenotyping of the human gut microbiome with GT-Pro
Cumbie et al. GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences
US11037654B2 (en) Rapid genomic sequence classification using probabilistic data structures
Rangel-Pineros et al. VIRify: an integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models
CN111710364B (zh) Method, device, terminal, and storage medium for obtaining microbiota markers
Deng et al. Genotype and phenotype data standardization, utilization and integration in the big data era for agricultural sciences
Bálint et al. ContScout: sensitive detection and removal of contamination from annotated genomes
Shi et al. Maast: genotyping thousands of microbial strains efficiently
Sahlin Strobemers: an alternative to k-mers for sequence comparison
Depuydt et al. Run-length compressed metagenomic read classification with SMEM-finding and tagging
Khdhiri et al. refMLST: reference-based multilocus sequence typing enables universal bacterial typing
Saha et al. MSC: a metagenomic sequence classification algorithm
Wilke et al. MG-RAST manual for version 4, revision 3
Fatma et al. Metagenomics: a tool for haunting abandoned microbial community
Ndovie et al. Exploration of the genetic landscape of bacterial dsDNA viruses reveals an ANI gap amid extensive mosaicism

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150916