CN106104541B - Sequence data analysis device, DNA analysis system, and sequence data analysis method - Google Patents

Sequence data analysis device, DNA analysis system, and sequence data analysis method

Info

Publication number
CN106104541B
CN106104541B CN201580014840.6A CN201580014840A CN106104541B CN 106104541 B CN106104541 B CN 106104541B CN 201580014840 A CN201580014840 A CN 201580014840A CN 106104541 B CN106104541 B CN 106104541B
Authority
CN
China
Prior art keywords
sequence
mentioned
character
sample
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201580014840.6A
Other languages
English (en)
Chinese (zh)
Other versions
CN106104541A (zh)
Inventor
木村宏
木村宏一
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi High Tech Corp
Original Assignee
Hitachi High Technologies Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi High Technologies Corp
Publication of CN106104541A
Application granted
Publication of CN106104541B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Chemical & Material Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CN201580014840.6A 2014-04-03 2015-03-12 Sequence data analysis device, DNA analysis system, and sequence data analysis method Active CN106104541B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2014077278A JP6198659B2 (ja) 2014-04-03 Sequence data analysis device, DNA analysis system, and sequence data analysis method
JP2014-077278 2014-04-03
PCT/JP2015/057348 WO2015151758A1 (ja) 2014-04-03 2015-03-12 Sequence data analysis device, DNA analysis system, and sequence data analysis method

Publications (2)

Publication Number Publication Date
CN106104541A (zh) 2016-11-09
CN106104541B (zh) 2018-09-11

Family

ID=54240090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580014840.6A Active CN106104541B (zh) 2014-04-03 2015-03-12 Sequence data analysis device, DNA analysis system, and sequence data analysis method

Country Status (6)

Country Link
US (1) US10810239B2
JP (1) JP6198659B2
CN (1) CN106104541B
DE (1) DE112015001637T5
GB (1) GB2539596B
WO (1) WO2015151758A1

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10395759B2 (en) 2015-05-18 2019-08-27 Regeneron Pharmaceuticals, Inc. Methods and systems for copy number variant detection
US12071669B2 (en) 2016-02-12 2024-08-27 Regeneron Pharmaceuticals, Inc. Methods and systems for detection of abnormal karyotypes
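JP6653628B2 (ja) * 2016-06-16 2020-02-26 株式会社日立製作所 DNA sequence analysis device, DNA sequence analysis method, and DNA sequence analysis system
CN106446537B (zh) * 2016-09-18 2018-07-10 北京百度网讯科技有限公司 Method, device, and system for structural variation detection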
US20220199199A1 (en) * 2019-02-07 2022-06-23 Biokey Bv Biological sequence information handling
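WO2021134574A1 (zh) * 2019-12-31 2021-07-08 深圳华大智造科技有限公司 Method and device for creating a gene mutation dictionary and compressing genomic data using the gene mutation dictionary
CN111782609B (zh) * 2020-05-22 2023-10-13 北京和瑞精湛医学检验实验室有限公司 Method for quickly and evenly splitting FASTQ files into shards
WO2022054178A1 (ja) * 2020-09-09 2022-03-17 株式会社日立ハイテク Method and device for detecting structural variation in an individual genome
KR102265937B1 (ko) * 2020-12-21 2021-06-17 주식회사 모비젠 Method for analyzing sequence data and device therefor
CN114550828B (zh) * 2022-01-28 2024-12-10 赛纳生物科技(北京)有限公司 Gene sequence alignment method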

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103476946A (zh) * 2011-01-14 2013-12-25 关键基因股份有限公司 Genotyping based on paired-end random sequences
CN104937599A (zh) * 2013-02-28 2015-09-23 株式会社日立高新技术 Data analysis device and method therefor

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AUPS115502A0 (en) * 2002-03-18 2002-04-18 Diatech Pty Ltd Assessing data sets
US8428882B2 (en) 2005-06-14 2013-04-23 Agency For Science, Technology And Research Method of processing and/or genome mapping of diTag sequences
JP5183155B2 (ja) * 2007-11-06 2013-04-17 株式会社日立製作所 Batch search method and search system for large numbers of sequences

Also Published As

Publication number Publication date
DE112015001637T5 (de) 2017-02-09
GB2539596B (en) 2021-03-17
WO2015151758A1 (ja) 2015-10-08
JP6198659B2 (ja) 2017-09-20
GB201616668D0 (en) 2016-11-16
GB2539596A (en) 2016-12-21
CN106104541A (zh) 2016-11-09
JP2015197899A (ja) 2015-11-09
US10810239B2 (en) 2020-10-20
US20170017717A1 (en) 2017-01-19

Similar Documents

Publication Publication Date Title
CN106104541B (zh) Sequence data analysis device, DNA analysis system, and sequence data analysis method
US10777304B2 (en) Compressing, storing and searching sequence data
US12189693B2 (en) Method and system for document similarity analysis
KR101638594B1 (ko) DNA sequence search method and device
AU2015298543B2 (en) Methods and systems for data analysis and compression
US20130166518A1 (en) Compression Of Genomic Data File
US20110196872A1 (en) Computational Method for Comparing, Classifying, Indexing, and Cataloging of Electronically Stored Linear Information
CN104937599B (zh) Data analysis device and method therefor
US20130117246A1 (en) Methods of processing text data
EP3072076B1 (en) A method of generating a reference index data structure and method for finding a position of a data pattern in a reference data structure
US9886561B2 (en) Efficient encoding and storage and retrieval of genomic data
CN114503206A (zh) Systems and methods for efficiently identifying and extracting sequence paths in genome graphs
US11515011B2 (en) K-mer based genomic reference data compression
EP3418927B1 (en) Method and device for processing dna sequence
Eskandar et al. Lossless pangenome indexing using tag arrays
EP2795488B1 (en) Compressing, storing and searching sequence data
WO2022054178A1 (ja) Method and device for detecting structural variation in an individual genome
Lenadora et al. An adapter architecture for heterogeneous data processing in bioinformatics pipelines
US9129232B2 (en) Determination tree generating apparatus
Lee et al. BulkAligner: A novel sequence alignment algorithm based on graph theory and Trinity
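WO2016143062A1 (ja) Sequence data analysis device, DNA analysis system, and sequence data analysis method
CN117529779A (zh) Systems and methods for iterative and scalable population-scale variant analysis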
COLLIN et al. Supplementary Information For: An open-sourced bioinformatic pipeline for the processing of Next-Generation Sequencing derived nucleotide reads: Identification and authentication of ancient metagenomic DNA.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant