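CN105229651B - Method, apparatus, and storage medium for fast and secure retrieval of DNA sequences - Google Patents

Method, apparatus, and storage medium for fast and secure retrieval of DNA sequences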

Info

Publication number
CN105229651B
Authority
CN
China
Prior art keywords
dna
rna sequence
model
ctw
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201480029612.1A
Other languages
English (en)
Chinese (zh)
Other versions
CN105229651A (zh)
Inventor
T. Ignatenko
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips NV
Publication of CN105229651A
Application granted
Publication of CN105229651B
Expired - Fee Related (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/40 Encryption of genetic data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24561 Intermediate data storage techniques for performance improvement
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/50 Compression of genetic data

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Bioethics (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CN201480029612.1A 2013-05-23 2014-04-30 Method, apparatus, and storage medium for fast and secure retrieval of DNA sequences Expired - Fee Related CN105229651B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201361826619P 2013-05-23 2013-05-23
US61/826,619 2013-05-23
PCT/IB2014/061098 WO2014188290A2 (en) 2013-05-23 2014-04-30 Fast and secure retrieval of DNA sequences

Publications (2)

Publication Number Publication Date
CN105229651A (zh) 2016-01-06
CN105229651B (zh) 2018-10-19

Family

ID=50884965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480029612.1A Expired - Fee Related CN105229651B (zh) 2013-05-23 2014-04-30 Method, apparatus, and storage medium for fast and secure retrieval of DNA sequences

Country Status (5)

Country Link
US (1) US20160070859A1
EP (1) EP3000067A2
JP (1) JP6373977B2
CN (1) CN105229651B
WO (1) WO2014188290A2

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10116632B2 (en) * 2014-09-12 2018-10-30 New York University System, method and computer-accessible medium for secure and compressed transmission of genomic data
US10796000B2 (en) * 2016-06-11 2020-10-06 Intel Corporation Blockchain system with nucleobase sequencing as proof of work
US20190333607A1 (en) * 2016-06-29 2019-10-31 Koninklijke Philips N.V. Disease-oriented genomic anonymization
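CN106484865A (zh) * 2016-10-10 2017-03-08 Harbin Engineering University A four-character linked-list trie retrieval algorithm for the DNA k-mer index problem
CN106557668B (zh) * 2016-11-04 2019-04-05 Fujian Normal University DNA sequence similarity testing method based on LF entropy
CN107103207B (zh) * 2017-04-05 2020-07-03 Zhejiang University Precision medicine knowledge search system based on multi-omics variation features of cases, and implementation method
CN107526942B (zh) * 2017-07-18 2021-04-20 Sun Yat-sen University Reverse retrieval method for life-omics sequence data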
US12040058B2 (en) * 2019-01-17 2024-07-16 Flatiron Health, Inc. Systems and methods for providing clinical trial status information for patients
EP3799051A1 (en) * 2019-09-30 2021-03-31 Siemens Healthcare GmbH Intra-hospital genetic profile similar search
US11429615B2 (en) 2019-12-20 2022-08-30 Ancestry.Com Dna, Llc Linking individual datasets to a database

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040068332A1 (en) * 2001-02-20 2004-04-08 Irad Ben-Gal Stochastic modeling of spatial distributed sequences
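CN1701343A (zh) * 2002-09-20 2005-11-23 Board of Regents, The University of Texas System Computer program products, systems, and methods for information discovery and relational analysis
CN101124537A (zh) * 2004-11-12 2008-02-13 马克森斯公司 Knowledge discovery techniques that construct knowledge correlations using terms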

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Biological Sequence Compression Algorithm;Toshiko Matsumoto;《Genome Information 11》;20001231;43-52 *
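Mutual Information Based Distance Measures for Classification and Content Recognition with Applications to Genetics;Zaher Dawy et al.;《communication,ICC 2005》;20051231;820-824 *
Improvement and performance analysis of the lossless compression CTW algorithm;Sun Wenjie, Li Jian, Li Hongbo;《Electronic Measurement Technology》;20070731;Vol. 30 (No. 7);7-9 *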

Also Published As

Publication number Publication date
US20160070859A1 (en) 2016-03-10
EP3000067A2 (en) 2016-03-30
WO2014188290A2 (en) 2014-11-27
JP2016524749A (ja) 2016-08-18
JP6373977B2 (ja) 2018-08-15
CN105229651A (zh) 2016-01-06
WO2014188290A3 (en) 2015-01-22

Similar Documents

Publication Publication Date Title
CN105229651B (zh) Method, apparatus, and storage medium for fast and secure retrieval of DNA sequences
Ondov et al. Mash: fast genome and metagenome distance estimation using MinHash
Jain et al. A fast approximate algorithm for mapping long reads to large reference databases
Bradley et al. Ultrafast search of all deposited bacterial and viral genomic data
Ahmad et al. Techniques of data mining in healthcare: a review
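CN108475297B (zh) Methods, systems, and processes for determining transmission pathways of infectious agents
CN109074858B (zh) Hospital matching of de-identified healthcare databases without obvious quasi-identifiers
JP2019526851A (ja) Distributed machine learning systems, apparatus, and methods
JP6875498B6 (ja) Biological data providing method, biological data encryption method, and biological data processing apparatus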
Rasheed et al. 16S rRNA metagenome clustering and diversity estimation using locality sensitive hashing
US20180330054A1 (en) Rapid genomic sequence classification using probabilistic data structures
KR20190017738A (ko) Systems and methods for biological data management
US20200395095A1 (en) Method and system for generating and comparing genotypes
JP2023506271A (ja) Method and data processing apparatus for processing genetic data
US10116632B2 (en) System, method and computer-accessible medium for secure and compressed transmission of genomic data
Li et al. Biological data mining and its applications in healthcare
Li et al. GV-rep: A large-scale dataset for genetic variant representation learning
Alqahtani et al. Statistical mitogenome assembly with repeats
Behnam et al. A geometric interpretation for local alignment-free sequence comparison
CN116313153B (zh) Method and system for predicting adverse drug reactions by incorporating non-clinical data
Senelle et al. TB-annotator: a scalable web application that allows in-depth analysis of very large sets of publicly available Mycobacterium tuberculosis complex genomes
de Souza et al. Private detection of relatives in forensic genomics using homomorphic encryption
Ignatenko Fast and secure retrieval of DNA sequences
Kulohoma BMX: a tool for computing bacterial phyletic composition from orthologous maps
Shi et al. Multi-Perspective Natural Vector: A Novel Method for Viral Sequence Feature Extraction

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20181019

Termination date: 20200430
