BR112018007092A2 - DNA alignment using a hierarchical inverted index table - Google Patents

DNA alignment using a hierarchical inverted index table

Info

Publication number
BR112018007092A2
BR112018007092A2 BR112018007092A BR112018007092A BR112018007092A2 BR 112018007092 A2 BR112018007092 A2 BR 112018007092A2 BR 112018007092 A BR112018007092 A BR 112018007092A BR 112018007092 A BR112018007092 A BR 112018007092A BR 112018007092 A2 BR112018007092 A2 BR 112018007092A2
Authority
BR
Brazil
Prior art keywords
index table
hierarchical index
constructed
reference data
dna alignment
Prior art date
Application number
BR112018007092A
Other languages
English (en)
Other versions
BR112018007092B1 (pt)
Inventor
G Arastas Daemon
D Garmany Jan
A Hunt Martin
B Doerr Michael
V Wood Stephen
Original Assignee
Coherent Logix Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Coherent Logix Inc
Publication of BR112018007092A2
Publication of BR112018007092B1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/319 Inverted lists
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Databases & Information Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Bioethics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Epidemiology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A system and method for constructing a hierarchical index table usable for matching a search sequence to reference data. The index table may be constructed to contain entries associated with an exhaustive list of all subsequences of a given length, wherein each entry contains the number and locations of matches of each subsequence in the reference data. The hierarchical index table may be constructed in an iterative manner, wherein entries for each lengthened subsequence are selectively and iteratively constructed based on the number of matches being greater than each of a set of respective thresholds. The hierarchical index table may be used to search for matches between a search sequence and reference data, and to perform misfit identification and characterization upon each respective candidate match.
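The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the nested-dictionary layout, the function names (`build_index`, `lookup`, `mismatches`), and the parameters `base_len`, `extend_by`, and `thresholds` are all assumptions made for illustration. The first level holds an entry for every possible subsequence of length `base_len`; an entry is extended into a child table only when its match count exceeds the threshold for that level.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed alphabet for this sketch


def build_index(reference, base_len, extend_by, thresholds):
    """Build a nested table: subsequence -> count, locations, optional child table."""

    def make_entries(starts, offset, length, level):
        # Exhaustive list of all subsequences of the given length.
        entries = {
            "".join(k): {"count": 0, "locations": [], "children": None}
            for k in product(ALPHABET, repeat=length)
        }
        # Record where each subsequence occurs, keyed by the start of the full match.
        for pos in starts:
            piece = reference[pos + offset:pos + offset + length]
            if len(piece) == length and piece in entries:
                entries[piece]["locations"].append(pos)
                entries[piece]["count"] += 1
        # Selectively lengthen only the entries that match too often.
        if level < len(thresholds):
            for entry in entries.values():
                if entry["count"] > thresholds[level]:
                    entry["children"] = make_entries(
                        entry["locations"], offset + length, extend_by, level + 1
                    )
        return entries

    return make_entries(range(len(reference)), 0, base_len, 0)


def lookup(index, query, base_len, extend_by):
    """Walk the hierarchy as far as the query allows; return candidate start positions."""
    entries, offset, length, best = index, 0, base_len, []
    while True:
        piece = query[offset:offset + length]
        entry = entries.get(piece) if len(piece) == length else None
        if entry is None:
            return best  # query exhausted or contains a symbol outside ALPHABET
        best = entry["locations"]
        if entry["children"] is None:
            return best
        entries, offset, length = entry["children"], offset + length, extend_by


def mismatches(reference, query, pos):
    """Offsets within the query that differ from the reference at a candidate position."""
    return [i for i, (q, r) in enumerate(zip(query, reference[pos:])) if q != r]
```

For the reference `ACGTACGTACGA` with `base_len=2`, `extend_by=1`, and `thresholds=[2, 1]`, the entry for `AC` has three matches and is therefore extended, so `lookup(index, "ACGA", 2, 1)` narrows the candidates to `[8]`, while `GT` (two matches) stays at the first level. `mismatches` stands in for the mismatch characterization step applied to each candidate; a real aligner would also handle insertions and deletions.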
BR112018007092-0A 2015-10-21 2016-10-21 DNA alignment using a hierarchical inverted index table BR112018007092B1 (pt)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201562244541P 2015-10-21 2015-10-21
US62/244,541 2015-10-21
PCT/US2016/058183 WO2017070514A1 (en) 2015-10-21 2016-10-21 Dna alignment using a hierarchical inverted index table

Publications (2)

Publication Number Publication Date
BR112018007092A2 (pt) 2018-10-23
BR112018007092B1 (pt) 2024-02-20

Family

ID=58557902

Family Applications (1)

Application Number Title Priority Date Filing Date
BR112018007092-0A BR112018007092B1 (pt) 2015-10-21 2016-10-21 DNA alignment using a hierarchical inverted index table

Country Status (7)

Country Link
US (1) US11594301B2 (pt)
EP (1) EP3365821B1 (pt)
JP (1) JP6884143B2 (pt)
KR (1) KR20180072684A (pt)
CN (2) CN114783523A (pt)
BR (1) BR112018007092B1 (pt)
WO (1) WO2017070514A1 (pt)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8705623B2 (en) * 2009-10-02 2014-04-22 Texas Instruments Incorporated Line-based compression for digital image data
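CN112948446A (zh) * 2019-11-26 2021-06-11 Beijing Jingdong Zhenshi Information Technology Co Ltd Method and apparatus for matching product documents
CN111402959A (zh) * 2020-03-13 2020-07-10 Suzhou Inspur Intelligent Technology Co Ltd Sequence alignment method, system, device, and readable storage medium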
IL281960A (en) * 2021-04-01 2022-10-01 Zimmerman Israel A system and method for rapid statistical discovery of patterns

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08272824A (ja) * 1995-03-31 1996-10-18 Hitachi Software Eng Co Ltd Method for automatic retrieval of gene sequence data
US20040153255A1 (en) * 2003-02-03 2004-08-05 Ahn Tae-Jin Apparatus and method for encoding DNA sequence, and computer readable medium
JP4614949B2 (ja) * 2004-03-31 2011-01-19 Bio Think Tank Co Ltd Base sequence search apparatus and base sequence search method
US7702640B1 (en) * 2005-12-29 2010-04-20 Amazon Technologies, Inc. Stratified unbalanced trees for indexing of data items within a computer system
WO2007137225A2 (en) * 2006-05-19 2007-11-29 The University Of Chicago Method for indexing nucleic acid sequences for computer based searching
US8271206B2 (en) * 2008-04-21 2012-09-18 Softgenetics Llc DNA sequence assembly methods of short reads
WO2010104608A2 (en) * 2009-03-13 2010-09-16 Life Technologies Corporation Computer implemented method for indexing reference genome
CN101984445B (zh) 2010-03-04 2012-03-14 Shenzhen Huada Gene Technology Co Ltd Method and system for implementing typing based on sequencing of polymerase chain reaction products
US20140163900A1 (en) * 2012-06-02 2014-06-12 Whitehead Institute For Biomedical Research Analyzing short tandem repeats from high throughput sequencing data for genetic applications
US10381106B2 (en) * 2013-01-28 2019-08-13 Hasso-Plattner-Institut Fuer Softwaresystemtechnik Gmbh Efficient genomic read alignment in an in-memory database
WO2014145503A2 (en) * 2013-03-15 2014-09-18 Lieber Institute For Brain Development Sequence alignment using divide and conquer maximum oligonucleotide mapping (dcmom), apparatus, system and method related thereto
NL2011817C2 (en) * 2013-11-19 2015-05-26 Genalice B V A method of generating a reference index data structure and method for finding a position of a data pattern in a reference data structure.
US9886561B2 (en) * 2014-02-19 2018-02-06 The Regents Of The University Of California Efficient encoding and storage and retrieval of genomic data
NL2013120B1 (en) * 2014-07-03 2016-09-20 Genalice B V A method for finding associated positions of bases of a read on a reference genome.

Also Published As

Publication number Publication date
JP2018535484A (ja) 2018-11-29
CN108140071B (zh) 2022-04-29
KR20180072684A (ko) 2018-06-29
CN114783523A (zh) 2022-07-22
BR112018007092B1 (pt) 2024-02-20
EP3365821B1 (en) 2022-06-29
CN108140071A (zh) 2018-06-08
US11594301B2 (en) 2023-02-28
JP6884143B2 (ja) 2021-06-09
US20170116370A1 (en) 2017-04-27
WO2017070514A1 (en) 2017-04-27
EP3365821A4 (en) 2019-06-26
EP3365821A1 (en) 2018-08-29

Similar Documents

Publication Publication Date Title
BR112018007092A2 (pt) DNA alignment using a hierarchical inverted index table
CL2019000968A1 (es) Method and system for selective access to stored or transmitted bioinformatics data
BR112016024774A2 (pt) Website creation system implementable on a computing device, and method implementable on a computing device
BR112018076196A2 (pt) Method, and portable communication and access devices
CO2017011544A2 (es) System and method for extracting and sharing application-related user data
BR112017010353A2 (pt) Methods and systems for designing photovoltaic systems
BR112017019015A2 (pt) System that facilitates the use of user-entered keywords to search for related clinical concepts, and method for facilitating the use of user-entered keywords to search for related clinical concepts
BR112016016007A2 (pt) Indoor location determination using pattern matching of peer-to-peer devices
BR112015030417A8 (pt) Computer system, computer-implemented method, and system for natural language search results for intent queries
WO2013163644A3 (en) Updating a search index used to facilitate application searches
CO2017007032A2 (es) Updating language understanding classifier models for a personal digital assistant based on crowdsourcing
BR112017000635A2 (pt) Noise removal system and method for distributed acoustic sensing data
AR081313A1 (es) Method and system for categorizing intellectual property documents using claim analysis
BR112016020457A8 (pt) Method of searching for audio fingerprints stored in a database within an audio fingerprint detection system
BR112017009666A2 (pt) Method and device for data mining based on a social platform
CR20150552A (es) Language learning environment
GB2520878A (en) System and method for matching data using probabilistic modeling techniques
BR112017014029A2 (pt) Computer system for providing content, and computerized method
CL2019001242A1 (es) Dynamic selection of external power resources
BR112015025117A2 (pt) Method for vehicle and operator guidance by means of pattern recognition
CL2016000984A1 (es) System and method for implementing multi-faceted search queries
RU2013156495A (ru) Resolving semantic ambiguity using a semantic classifier
AR099945A1 (es) System and method for facilitating electronic transactions
BR112016021410A2 (pt) Transport system user inspection
BR112017000847A2 (pt) Retrieval/storage of images associated with events

Legal Events

Date Code Title Description
B06U Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]
B06A Patent application procedure suspended [chapter 6.1 patent gazette]
B09A Decision: intention to grant [chapter 9.1 patent gazette]
B16A Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]

Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 21/10/2016, SUBJECT TO THE LEGAL CONDITIONS