MX2019009680A - Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos. - Google Patents

Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos.

Info

Publication number
MX2019009680A
MX2019009680A MX2019009680A MX2019009680A MX2019009680A MX 2019009680 A MX2019009680 A MX 2019009680A MX 2019009680 A MX2019009680 A MX 2019009680A MX 2019009680 A MX2019009680 A MX 2019009680A MX 2019009680 A MX2019009680 A MX 2019009680A
Authority
MX
Mexico
Prior art keywords
data
compact representation
multiple genomic
bioinformatics data
class
Prior art date
Application number
MX2019009680A
Other languages
English (en)
Inventor
Zoia Giorgio
Renzi Daniele
Khoso Baluch Mohamed
Alberti Claudio
Original Assignee
Genomsys Sa
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from PCT/US2017/017842 external-priority patent/WO2018071055A1/en
Application filed by Genomsys Sa filed Critical Genomsys Sa
Publication of MX2019009680A publication Critical patent/MX2019009680A/es

Links

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/50Compression of genetic data
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/58Random or pseudo-random number generators
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10Sequence alignment; Homology search
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/10Ontologies; Annotations
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30Data warehousing; Computing architectures
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B99/00Subject matter not provided for in other groups of this subclass
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H04L9/0866Generation of secret information including derivation or calculation of cryptographic keys or passwords involving user or device identifiers, e.g. serial number, physical or biometrical information, DNA, hand-signature or measurable physical characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/30Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy
    • H04L9/3066Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy involving algebraic varieties, e.g. elliptic or hyper-elliptic curves
    • H04L9/3073Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy involving algebraic varieties, e.g. elliptic or hyper-elliptic curves involving pairings, e.g. identity based encryption [IBE], bilinear mappings or bilinear pairings, e.g. Weil or Tate pairing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/30Compression, e.g. Merkle-Damgard construction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/34Encoding or coding, e.g. Huffman coding or error correction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/88Medical equipments

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Public Health (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Epidemiology (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Genetics & Genomics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Mathematics (AREA)
  • Algebra (AREA)
  • Human Computer Interaction (AREA)

Abstract

Método y aparato para la compresión de datos de secuencias de genomas producidos por máquinas de secuenciación del genoma. Las lecturas de secuencias se codifican mediante su alineación con respecto a secuencias de referencia preexistentes o construidas, en donde el proceso de codificación está compuesto por una clasificación de las lecturas en clases de datos, a lo cual sigue la codificación de cada clase en términos de una multiplicidad de bloques de descriptores. Se utilizan modelos fuente y codificadores de entropía específicos para cada clase de datos en los que se dividen los datos y para cada bloque de descriptor asociado.
MX2019009680A 2017-02-14 2018-02-14 Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos. MX2019009680A (es)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
PCT/US2017/017842 WO2018071055A1 (en) 2016-10-11 2017-02-14 Method and apparatus for the compact representation of bioinformatics data
PCT/US2017/041591 WO2018071080A2 (en) 2016-10-11 2017-07-11 Method and systems for the representation and processing of bioinformatics data using reference sequences
PCT/US2018/018092 WO2018152143A1 (en) 2017-02-14 2018-02-14 Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors

Publications (1)

Publication Number Publication Date
MX2019009680A true MX2019009680A (es) 2019-10-09

Family

ID=68609803

Family Applications (1)

Application Number Title Priority Date Filing Date
MX2019009680A MX2019009680A (es) 2017-02-14 2018-02-14 Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos.

Country Status (10)

Country Link
EP (1) EP3583500A4 (es)
KR (1) KR20190113971A (es)
AU (1) AU2018221458B2 (es)
CA (1) CA3052824A1 (es)
EA (1) EA201991908A1 (es)
IL (1) IL268651A (es)
MX (1) MX2019009680A (es)
SG (1) SG11201907418YA (es)
WO (1) WO2018152143A1 (es)
ZA (1) ZA201905921B (es)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110189830B (zh) * 2019-05-24 2021-06-08 杭州火树科技有限公司 基于机器学习的电子病历词库训练方法
EP3896698A1 (en) 2020-04-15 2021-10-20 Genomsys SA Method and system for the efficient data compression in mpeg-g
KR102497634B1 (ko) * 2020-12-21 2023-02-08 부산대학교 산학협력단 문자 빈도 기반 서열 재정렬을 통한 fastq 데이터 압축 방법 및 장치

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1383911A4 (en) * 2001-04-02 2004-12-15 Cytoprint Inc METHOD AND APPARATUS FOR DISCOVERING, IDENTIFYING AND COMPARING BIOLOGICAL ACTIVITY MECHANISMS
US7698067B2 (en) * 2002-02-12 2010-04-13 International Business Machines Corporation Sequence pattern descriptors for transmembrane structural details
US7809765B2 (en) * 2007-08-24 2010-10-05 General Electric Company Sequence identification and analysis
KR101922129B1 (ko) * 2011-12-05 2018-11-26 삼성전자주식회사 차세대 시퀀싱을 이용하여 획득된 유전 정보를 압축 및 압축해제하는 방법 및 장치
US9679104B2 (en) * 2013-01-17 2017-06-13 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
CN103336916B (zh) * 2013-07-05 2016-04-06 中国科学院数学与系统科学研究院 一种测序序列映射方法及系统
US10902937B2 (en) * 2014-02-12 2021-01-26 International Business Machines Corporation Lossless compression of DNA sequences

Also Published As

Publication number Publication date
EA201991908A1 (ru) 2020-01-21
EP3583500A4 (en) 2020-12-16
SG11201907418YA (en) 2019-09-27
NZ757185A (en) 2021-05-28
EP3583500A1 (en) 2019-12-25
AU2018221458B2 (en) 2022-12-08
CA3052824A1 (en) 2018-08-23
WO2018152143A1 (en) 2018-08-23
IL268651A (en) 2019-10-31
ZA201905921B (en) 2021-05-26
KR20190113971A (ko) 2019-10-08
AU2018221458A1 (en) 2019-10-03

Similar Documents

Publication Publication Date Title
PH12019501879A1 (en) Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors
MX2019009680A (es) Metodo y aparato para la representacion compacta de datos de bioinformatica mediante el uso de multiples descriptores genomicos.
MX2013002086A (es) Metodo de codificacion de imagen, metodo de decodificacion de imagen, dispositivo de codificacion de imagen y dispositivo de decodificacion de imagen.
CO2019009919A2 (es) Método y sistemas para la compresión eficiente de lecturas de secuencias genómicas
MY176988A (en) Video-encoding method and video-encoding apparatus based on encoding units determined in a accordance with a tree structure, and video-decoding units determined in a accordance with a tree structure
MY165837A (en) Image decoding method, image coding method, image decoding apparatus, image coding apparatus, and image coding and decoding apparatus
MY167857A (en) Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
EP2503723A3 (en) Method and apparatus for transmitting and receiving control information in a broadcasting/communication system
PH12019500791A1 (en) Efficient data structures for bioinformatics information presentation
EP3754484A3 (en) Generating encoding software and decoding means
MX2021010964A (es) Metodo de codificacion de datos tridimensionales, metodo de decodificacion de datos tridimensionales, dispositivo codificador de datos tridimensionales y dispositivo decodificador de datos tridimensionales.
MX2021006632A (es) Metodo de codificacion de datos tridimensionales, metodo de decodificacion de datos tridimensionales, dispositivo de codificacion de datos tridimensionales y dispositivo de decodificacion de datos tridimensionales.
PH12019500793A1 (en) Method and apparatus for compact representation of bioinformatics data
EA201991906A1 (ru) Способ и системы для восстановления геномных референсных последовательностей из сжатых прочтений геномной последовательности
PH12017500791A1 (en) Image coding device, image coding method, image coding program, transmission device, transmission method, transmission program, image decoding device, image decoding method, image decoding program, reception device, reception method, and reception program
MX2019009681A (es) Metodo y sistemas para la compresion eficiente de lecturas de secuencias genomicas.
MX2020002143A (es) Metodos y aparatos para codificar y decodificar informacion de modo y dispositivo electronico.
EA201990923A1 (ru) Способ и устройство для доступа к биоинформационным данным, структурированным в виде единиц доступа
MX2022002930A (es) Metodo para la compresion de datos de secuencias del genoma.
TH122271B (th) วิธีการและเครื่องชุดมือสำหรับการเข้ารหัสวีดีโอ และวิธีการและชุดเครื่องมือสำหรับการถอดรหัสวิดีโอโดยการพิจารณาลำดับข้ามและ แยกตัว