SG11201903272XA - Method and systems for the representation and processing of bioinformatics data using reference sequences - Google Patents

Method and systems for the representation and processing of bioinformatics data using reference sequences

Info

Publication number
SG11201903272XA
Authority
SG
Singapore
Prior art keywords
pct
international
data
october
reference sequences
Prior art date
Application number
SG11201903272XA
Inventor
Claudio Alberti
Giorgio Zoia
Daniele Renzi
Mohamed Baluch
Original Assignee
Genomsys Sa
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-10-11
Filing date
2017-07-11
Publication date
2019-05-30
Priority claimed from PCT/EP2016/074311 (WO2018068830A1)
Priority claimed from PCT/EP2016/074307 (WO2018068829A1)
Priority claimed from PCT/EP2016/074297 (WO2018068827A1)
Priority claimed from PCT/EP2016/074301 (WO2018068828A1)
Application filed by Genomsys Sa
Publication of SG11201903272XA

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G16B30/20 Sequence assembly
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10 Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • G16B45/00 ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/10 Ontologies; Annotations
    • G16B50/30 Data warehousing; Computing architectures
    • G16B50/40 Encryption of genetic data
    • G16B50/50 Compression of genetic data
    • G16B99/00 Subject matter not provided for in other groups of this subclass
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3086 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
    • H03M7/70 Type of the data to be coded, other than image and sound

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Chemical & Material Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
  • Television Signal Processing For Recording (AREA)
  • Labeling Devices (AREA)

Abstract

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)
(19) World Intellectual Property Organization, International Bureau
(10) International Publication Number: WO 2018/071080 A2
(43) International Publication Date: 19 April 2018 (19.04.2018)
(51) International Patent Classification: G06F 19/20 (2011.01)
(21) International Application Number: PCT/US2017/041591
(22) International Filing Date: 11 July 2017 (11.07.2017)
(25) Filing Language: English
(26) Publication Language: English
(30) Priority Data:
PCT/EP2016/074311, 11 October 2016 (11.10.2016), EP
PCT/EP2016/074297, 11 October 2016 (11.10.2016), EP
PCT/EP2016/074307, 11 October 2016 (11.10.2016), EP
PCT/EP2016/074301, 11 October 2016 (11.10.2016), EP
PCT/US2017/017841, 14 February 2017 (14.02.2017), US
PCT/US2017/017842, 14 February 2017 (14.02.2017), US
(71) Applicant: GENOMSYS SA [CH/CH]; chemin de la Raye 13, 1024 Ecublens VD (CH).
(72) Inventor; and (71) Applicant: BALUCH, Mohamed, Khoso [US/US]; 4439 Woodsedge Ct, Chantilly, VA 20151 (US).
(72) Inventors: ALBERTI, Claudio; Chemin des Esserts 1, 1213 Petit-Lancy (CH). ZOIA, Giorgio; Chemin des Croix-Rouges 10, 1007 Lausanne (CH). RENZI, Daniele; Route Aloys-Fauquez 105, 1018 Lausanne (CH).
(74) Agent: BILICKI, Byron et al.; The Bilicki Law Firm, P.C., 1285 North Main Street, Jamestown, NY 14701 (US).
(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DJ, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JO, JP, KE, KG, KH, KN, KP, KR, KW, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG).
Declarations under Rule 4.17: as to the identity of the inventor (Rule 4.17(i))
Published: without international search report and to be republished upon receipt of that report (Rule 48.2(g))
(54) Title: METHOD AND SYSTEMS FOR THE REPRESENTATION AND PROCESSING OF BIOINFORMATICS DATA USING REFERENCE SEQUENCES
[Figure 15: the left-most read has N alignments and the right-most read has M alignments on the reference.]
(57) Abstract: Method and apparatus for the representation and processing of genome sequence data, produced by genome sequencing machines, when aligned on one or more reference sequences. Sequence reads are coded by aligning them with respect to pre-existing or constructed reference sequences. After the alignment, the coding process is composed of a classification of the reads into data classes, followed by the coding of each data class in terms of a multiplicity of descriptors layers. Specific source models and entropy coders are used for the coding of the sub-sets of descriptors used to represent each data class.
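The pipeline named in the abstract (align, classify reads into data classes, split each class into descriptor layers) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `AlignedRead` type, the three class names (`perfect`, `mismatch`, `unmapped`), the layer names, and the rule that a read overhanging the reference end is treated as unmapped. The final step, running a dedicated source model and entropy coder over each layer, is left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AlignedRead:
    name: str
    position: int  # 0-based start on the reference, or -1 if unaligned
    sequence: str


def reference_window(read, reference):
    """Return the slice of the reference that the read is aligned against."""
    return reference[read.position:read.position + len(read.sequence)]


def classify(read, reference):
    """Assign a read to one of three hypothetical data classes."""
    if read.position < 0:
        return "unmapped"
    window = reference_window(read, reference)
    if len(window) != len(read.sequence):
        return "unmapped"  # simplification: read overhangs the reference end
    return "perfect" if window == read.sequence else "mismatch"


def to_descriptors(read, reference, data_class):
    """Split one read into per-layer descriptor values (layer names are illustrative)."""
    if data_class == "unmapped":
        return {"sequence": [read.sequence]}
    layers = {"position": [read.position], "length": [len(read.sequence)]}
    if data_class == "mismatch":
        window = reference_window(read, reference)
        diffs = [(i, b) for i, (a, b) in enumerate(zip(window, read.sequence)) if a != b]
        layers["mismatch_count"] = [len(diffs)]
        layers["mismatch_offset"] = [i for i, _ in diffs]
        layers["mismatch_base"] = [b for _, b in diffs]
    return layers


def encode(reads, reference):
    """Group descriptor values by data class, then by layer."""
    streams = defaultdict(lambda: defaultdict(list))
    for read in reads:
        data_class = classify(read, reference)
        for layer, values in to_descriptors(read, reference, data_class).items():
            streams[data_class][layer].extend(values)
    return {cls: dict(layers) for cls, layers in streams.items()}


reference = "ACGTACGTAC"
reads = [
    AlignedRead("r1", 0, "ACGT"),   # matches the reference exactly
    AlignedRead("r2", 4, "ACCT"),   # one substitution at offset 2
    AlignedRead("r3", -1, "TTTT"),  # not aligned
]
streams = encode(reads, reference)
```

Grouping values by class and layer gives each stream a narrow value distribution (positions only, mismatch bases only, and so on), which is the property a per-layer entropy coder would exploit.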
SG11201903272XA 2016-10-11 2017-07-11 Method and systems for the representation and processing of bioinformatics data using reference sequences SG11201903272XA (en)

Applications Claiming Priority (7)

Application Number Publication Number Priority Date Filing Date Title
PCT/EP2016/074311 WO2018068830A1 (en) 2016-10-11 2016-10-11 Method and system for the transmission of bioinformatics data
PCT/EP2016/074307 WO2018068829A1 (en) 2016-10-11 2016-10-11 Method and apparatus for compact representation of bioinformatics data
PCT/EP2016/074297 WO2018068827A1 (en) 2016-10-11 2016-10-11 Efficient data structures for bioinformatics information representation
PCT/EP2016/074301 WO2018068828A1 (en) 2016-10-11 2016-10-11 Method and system for storing and accessing bioinformatics data
PCT/US2017/017841 WO2018071054A1 (en) 2016-10-11 2017-02-14 Method and system for selective access of stored or transmitted bioinformatics data
PCT/US2017/017842 WO2018071055A1 (en) 2016-10-11 2017-02-14 Method and apparatus for the compact representation of bioinformatics data
PCT/US2017/041591 WO2018071080A2 (en) 2016-10-11 2017-07-11 Method and systems for the representation and processing of bioinformatics data using reference sequences

Publications (1)

Publication Number Publication Date
SG11201903272XA (en) 2019-05-30

Family

ID=61905752

Family Applications (3)

Application Number Publication Number Priority Date Filing Date Title
SG11201903270RA SG11201903270RA (en) 2016-10-11 2017-02-14 Method and system for selective access of stored or transmitted bioinformatics data
SG11201903272XA SG11201903272XA (en) 2016-10-11 2017-07-11 Method and systems for the representation and processing of bioinformatics data using reference sequences
SG11201903271UA SG11201903271UA (en) 2016-10-11 2017-07-11 Method and systems for the indexing of bioinformatics data

Family Applications Before (1)

Application Number Publication Number Priority Date Filing Date Title
SG11201903270RA SG11201903270RA (en) 2016-10-11 2017-02-14 Method and system for selective access of stored or transmitted bioinformatics data

Family Applications After (1)

Application Number Publication Number Priority Date Filing Date Title
SG11201903271UA SG11201903271UA (en) 2016-10-11 2017-07-11 Method and systems for the indexing of bioinformatics data

Country Status (17)

Country Link
US (6) US20200042735A1 (en)
EP (3) EP3526694A4 (en)
JP (4) JP2020505702A (en)
KR (4) KR20190073426A (en)
CN (6) CN110168651A (en)
AU (3) AU2017342688A1 (en)
BR (7) BR112019007359A2 (en)
CA (3) CA3040138A1 (en)
CL (6) CL2019000972A1 (en)
CO (6) CO2019003639A2 (en)
EA (2) EA201990916A1 (en)
IL (3) IL265879B2 (en)
MX (2) MX2019004130A (en)
PE (7) PE20191058A1 (en)
PH (6) PH12019550060A1 (en)
SG (3) SG11201903270RA (en)
WO (4) WO2018071054A1 (en)

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2526598B (en) * 2014-05-29 2018-11-28 Imagination Tech Ltd Allocation of primitives to primitive blocks
US11574287B2 (en) 2017-10-10 2023-02-07 Text IQ, Inc. Automatic document classification
US11030324B2 (en) * 2017-11-30 2021-06-08 Koninklijke Philips N.V. Proactive resistance to re-identification of genomic data
WO2019191083A1 (en) * 2018-03-26 2019-10-03 Colorado State University Research Foundation Apparatuses, systems and methods for generating and tracking molecular digital signatures to ensure authenticity and integrity of synthetic dna molecules
EP3803881A1 (en) * 2018-05-31 2021-04-14 Koninklijke Philips N.V. System and method for allele interpretation using a graph-based reference genome
CN108753765B (en) * 2018-06-08 2020-12-08 中国科学院遗传与发育生物学研究所 Genome assembly method for constructing ultra-long continuous DNA sequence
US20200058379A1 (en) * 2018-08-20 2020-02-20 The Board Of Trustees Of The Leland Stanford Junior University Systems and Methods for Compressing Genetic Sequencing Data and Uses Thereof
GB2585816A (en) * 2018-12-12 2021-01-27 Univ York Proof-of-work for blockchain applications
US20210074381A1 (en) * 2019-09-11 2021-03-11 Enancio Method for the compression of genome sequence data
CN110797087B (en) * 2019-10-17 2020-11-03 南京医基云医疗数据研究院有限公司 Sequencing sequence processing method and device, storage medium and electronic equipment
JP2022553199A (en) * 2019-10-18 2022-12-22 コーニンクレッカ フィリップス エヌ ヴェ Systems and methods for effective compression, representation, and decompression of diverse tabular data
CN111243663B (en) * 2020-02-26 2022-06-07 西安交通大学 Gene variation detection method based on pattern growth algorithm
CN111370070B (en) * 2020-02-27 2023-10-27 中国科学院计算技术研究所 Compression processing method for big data gene sequencing file
US12006539B2 (en) 2020-03-17 2024-06-11 Western Digital Technologies, Inc. Reference-guided genome sequencing
US12014802B2 (en) * 2020-03-17 2024-06-18 Western Digital Technologies, Inc. Devices and methods for locating a sample read in a reference genome
US11837330B2 (en) 2020-03-18 2023-12-05 Western Digital Technologies, Inc. Reference-guided genome sequencing
EP3896698A1 (en) * 2020-04-15 2021-10-20 Genomsys SA Method and system for the efficient data compression in mpeg-g
CN111459208A (en) * 2020-04-17 2020-07-28 南京铁道职业技术学院 Control system and method for electric energy of subway power supply system
IL298101A (en) * 2020-09-14 2023-01-01 Illumina Inc Custom data files for personalized medicine
CN112836355B (en) * 2021-01-14 2023-04-18 西安科技大学 Method for predicting coal face roof pressure probability
ES2930699A1 (en) * 2021-06-10 2022-12-20 Veritas Intercontinental S L GENOMIC ANALYSIS METHOD IN A BIOINFORMATIC PLATFORM (Machine-translation by Google Translate, not legally binding)
CN113670643B (en) * 2021-08-30 2023-05-12 四川虹美智能科技有限公司 Intelligent air conditioner testing method and system
CN113643761B (en) * 2021-10-13 2022-01-18 苏州赛美科基因科技有限公司 Extraction method for data required by interpretation of second-generation sequencing result
US20230187020A1 (en) * 2021-12-15 2023-06-15 Illumina Software, Inc. Systems and methods for iterative and scalable population-scale variant analysis
CN115391284B (en) * 2022-10-31 2023-02-03 四川大学华西医院 Method, system and computer readable storage medium for quickly identifying gene data file
CN116541348B (en) * 2023-03-22 2023-09-26 河北热点科技股份有限公司 Intelligent data storage method and terminal query integrated machine
CN116739646B (en) * 2023-08-15 2023-11-24 南京易联阳光信息技术股份有限公司 Method and system for analyzing big data of network transaction
CN117153270B (en) * 2023-10-30 2024-02-02 吉林华瑞基因科技有限公司 Gene second-generation sequencing data processing method

Family Cites Families (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6303297B1 (en) * 1992-07-17 2001-10-16 Incyte Pharmaceuticals, Inc. Database for storage and analysis of full-length sequences
JP3429674B2 (en) 1998-04-28 2003-07-22 沖電気工業株式会社 Multiplex communication system
EP1410301A4 (en) * 2000-04-12 2008-01-23 Cleveland Clinic Foundation System for identifying and analyzing expression of are-containing genes
FR2820563B1 (en) * 2001-02-02 2003-05-16 Expway COMPRESSION / DECOMPRESSION PROCESS FOR A STRUCTURED DOCUMENT
US20040153255A1 (en) * 2003-02-03 2004-08-05 Ahn Tae-Jin Apparatus and method for encoding DNA sequence, and computer readable medium
DE10320711A1 (en) * 2003-05-08 2004-12-16 Siemens Ag Method and arrangement for setting up and updating a user interface for accessing information pages in a data network
WO2005024562A2 (en) * 2003-08-11 2005-03-17 Eloret Corporation System and method for pattern recognition in sequential data
US7805282B2 (en) * 2004-03-30 2010-09-28 New York University Process, software arrangement and computer-accessible medium for obtaining information associated with a haplotype
WO2006052242A1 (en) * 2004-11-08 2006-05-18 Seirad, Inc. Methods and systems for compressing and comparing genomic data
US20130332133A1 (en) * 2006-05-11 2013-12-12 Ramot At Tel Aviv University Ltd. Classification of Protein Sequences and Uses of Classified Proteins
SE531398C2 (en) 2007-02-16 2009-03-24 Scalado Ab Generating a data stream and identifying positions within a data stream
KR101369745B1 (en) * 2007-04-11 2014-03-07 삼성전자주식회사 Method and apparatus for multiplexing and demultiplexing asynchronous bitstreams
US8832112B2 (en) * 2008-06-17 2014-09-09 International Business Machines Corporation Encoded matrix index
US20110264377A1 (en) * 2008-11-14 2011-10-27 John Gerald Cleary Method and system for analysing data sequences
US20100217532A1 (en) * 2009-02-25 2010-08-26 University Of Delaware Systems and methods for identifying structurally or functionally significant amino acid sequences
DK2494060T3 (en) * 2009-10-30 2016-08-01 Synthetic Genomics Inc Coding of text for nucleic acid sequences
EP2362657B1 (en) * 2010-02-18 2013-04-24 Research In Motion Limited Parallel entropy coding and decoding methods and devices
WO2011143231A2 (en) * 2010-05-10 2011-11-17 The Broad Institute High throughput paired-end sequencing of large-insert clone libraries
EP3657507A1 (en) * 2010-05-25 2020-05-27 The Regents of The University of California Bambam: parallel comparative analysis of high-throughput sequencing data
CN103329138A (en) * 2011-01-19 2013-09-25 皇家飞利浦电子股份有限公司 Method for processing genomic data
US8982879B2 (en) * 2011-03-09 2015-03-17 Annai Systems Inc. Biological data networks and methods therefor
US20140249764A1 (en) * 2011-06-06 2014-09-04 Koninklijke Philips N.V. Method for Assembly of Nucleic Acid Sequence Data
HUE063990T2 (en) * 2011-06-16 2024-02-28 Ge Video Compression Llc Entropy coding supporting mode switching
US8707289B2 (en) * 2011-07-20 2014-04-22 Google Inc. Multiple application versions
WO2013050612A1 (en) * 2011-10-06 2013-04-11 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Entropy coding buffer arrangement
AU2012335955A1 (en) * 2011-11-07 2014-07-03 QIAGEN Redwood City, Inc. Methods and systems for identification of causal genomic variants
KR101922129B1 (en) * 2011-12-05 2018-11-26 삼성전자주식회사 Method and apparatus for compressing and decompressing genetic information using next generation sequencing(NGS)
CN104246689B (en) * 2011-12-08 2020-06-02 凡弗3基因组有限公司 Distributed system providing dynamic indexing and visualization of genomic data
EP2608096B1 (en) * 2011-12-24 2020-08-05 Tata Consultancy Services Ltd. Compression of genomic data file
US9600625B2 (en) * 2012-04-23 2017-03-21 Bina Technologies, Inc. Systems and methods for processing nucleic acid sequence data
CN103049680B (en) * 2012-12-29 2016-09-07 深圳先进技术研究院 gene sequencing data reading method and system
US9679104B2 (en) * 2013-01-17 2017-06-13 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
WO2014145503A2 (en) * 2013-03-15 2014-09-18 Lieber Institute For Brain Development Sequence alignment using divide and conquer maximum oligonucleotide mapping (dcmom), apparatus, system and method related thereto
JP6054790B2 (en) * 2013-03-28 2016-12-27 三菱スペース・ソフトウエア株式会社 Gene information storage device, gene information search device, gene information storage program, gene information search program, gene information storage method, gene information search method, and gene information search system
GB2512829B (en) * 2013-04-05 2015-05-27 Canon Kk Method and apparatus for encoding or decoding an image with inter layer motion information prediction according to motion information compression scheme
WO2014186604A1 (en) * 2013-05-15 2014-11-20 Edico Genome Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
KR101522087B1 (en) * 2013-06-19 2015-05-28 삼성에스디에스 주식회사 System and method for aligning genome sequnce considering mismatch
CN103336916B (en) * 2013-07-05 2016-04-06 中国科学院数学与系统科学研究院 A kind of sequencing sequence mapping method and system
US20150032711A1 (en) * 2013-07-06 2015-01-29 Victor Kunin Methods for identification of organisms, assigning reads to organisms, and identification of genes in metagenomic sequences
KR101493982B1 (en) * 2013-09-26 2015-02-23 대한민국 Coding system for cultivar identification and coding method using thereof
CN104699998A (en) * 2013-12-06 2015-06-10 国际商业机器公司 Method and device for compressing and decompressing genome
US10902937B2 (en) * 2014-02-12 2021-01-26 International Business Machines Corporation Lossless compression of DNA sequences
US9916313B2 (en) * 2014-02-14 2018-03-13 Sap Se Mapping of extensible datasets to relational database schemas
US9886561B2 (en) * 2014-02-19 2018-02-06 The Regents Of The University Of California Efficient encoding and storage and retrieval of genomic data
US9354922B2 (en) * 2014-04-02 2016-05-31 International Business Machines Corporation Metadata-driven workflows and integration with genomic data processing systems and techniques
US20150379195A1 (en) * 2014-06-25 2015-12-31 The Board Of Trustees Of The Leland Stanford Junior University Software haplotying of hla loci
GB2527588B (en) * 2014-06-27 2016-05-18 Gurulogic Microsystems Oy Encoder and decoder
US20160019339A1 (en) * 2014-07-06 2016-01-21 Mercator BioLogic Incorporated Bioinformatics tools, systems and methods for sequence assembly
US10230390B2 (en) * 2014-08-29 2019-03-12 Bonnie Berger Leighton Compressively-accelerated read mapping framework for next-generation sequencing
US10116632B2 (en) * 2014-09-12 2018-10-30 New York University System, method and computer-accessible medium for secure and compressed transmission of genomic data
US20160125130A1 (en) * 2014-11-05 2016-05-05 Agilent Technologies, Inc. Method for assigning target-enriched sequence reads to a genomic location
US20180181706A1 (en) * 2015-06-16 2018-06-28 Gottfried Wilhelm Leibniz Universitaet Hannover Method for Compressing Genomic Data
CN105956417A (en) * 2016-05-04 2016-09-21 西安电子科技大学 Similar base sequence query method based on editing distance in cloud environment
CN105975811B (en) * 2016-05-09 2019-03-15 管仁初 A kind of gene sequencing device intelligently compared

Also Published As

Publication number Publication date
JP2019537172A (en) 2019-12-19
CO2019003638A2 (en) 2019-08-30
WO2018071079A1 (en) 2018-04-19
KR20190073426A (en) 2019-06-26
CA3040138A1 (en) 2018-04-19
CN110121577B (en) 2023-09-19
EP3526657A1 (en) 2019-08-21
CN110121577A (en) 2019-08-13
CO2019003842A2 (en) 2019-08-30
PE20200323A1 (en) 2020-02-13
WO2018071054A1 (en) 2018-04-19
CN110603595B (en) 2023-08-08
IL265879B1 (en) 2023-09-01
EP3526707A4 (en) 2020-06-17
EP3526657A4 (en) 2020-07-01
KR20190117652A (en) 2019-10-16
CN110114830B (en) 2023-10-13
BR112019007363A2 (en) 2019-07-09
US20200051667A1 (en) 2020-02-13
SG11201903270RA (en) 2019-05-30
EP3526694A4 (en) 2020-08-12
CL2019000972A1 (en) 2019-08-23
BR112019007360A2 (en) 2019-07-09
PE20191227A1 (en) 2019-09-11
BR112019016230A2 (en) 2020-04-07
EP3526707A2 (en) 2019-08-21
PH12019550060A1 (en) 2019-12-16
IL265972A (en) 2019-06-30
KR20190069469A (en) 2019-06-19
SG11201903271UA (en) 2019-05-30
PE20191057A1 (en) 2019-08-06
PH12019550058A1 (en) 2019-12-16
CN110603595A (en) 2019-12-20
CL2019002276A1 (en) 2019-11-29
CO2019009920A2 (en) 2020-01-17
US20200035328A1 (en) 2020-01-30
WO2018071055A1 (en) 2018-04-19
PE20191058A1 (en) 2019-08-06
CL2019002275A1 (en) 2019-11-22
US20200042735A1 (en) 2020-02-06
AU2017341685A1 (en) 2019-05-02
PE20200226A1 (en) 2020-01-29
PH12019501881A1 (en) 2020-06-29
EP3526694A1 (en) 2019-08-21
EA201990916A1 (en) 2019-10-31
WO2018071080A2 (en) 2018-04-19
US11404143B2 (en) 2022-08-02
CA3040145A1 (en) 2018-04-19
CA3040147A1 (en) 2018-04-19
BR112019007357A2 (en) 2019-07-16
CO2019003639A2 (en) 2020-02-28
BR112019007359A2 (en) 2019-07-16
CN110114830A (en) 2019-08-09
BR112019016232A2 (en) 2020-04-07
IL265928A (en) 2019-05-30
CO2019003595A2 (en) 2019-08-30
IL265928B (en) 2020-10-29
PH12019550059A1 (en) 2019-12-16
IL265879A (en) 2019-06-30
JP2020500382A (en) 2020-01-09
PE20200227A1 (en) 2020-01-29
CN110678929B (en) 2024-04-16
JP2020505702A (en) 2020-02-20
PE20191056A1 (en) 2019-08-06
BR112019016236A2 (en) 2020-04-07
CL2019000973A1 (en) 2019-08-23
CN110506272A (en) 2019-11-26
CL2019002277A1 (en) 2019-11-22
CL2019000968A1 (en) 2019-08-23
AU2017342688A1 (en) 2019-05-02
CO2019009922A2 (en) 2020-01-17
US20190385702A1 (en) 2019-12-19
JP7079786B2 (en) 2022-06-02
JP2020500383A (en) 2020-01-09
IL265879B2 (en) 2024-01-01
CN110168651A (en) 2019-08-23
AU2017341684A1 (en) 2019-05-02
US20190214111A1 (en) 2019-07-11
PH12019501879A1 (en) 2020-06-29
KR20190062541A (en) 2019-06-05
EA201990917A1 (en) 2019-08-30
US20200051665A1 (en) 2020-02-13
CN110678929A (en) 2020-01-10
MX2019004128A (en) 2019-08-21
CN110506272B (en) 2023-08-01
WO2018071080A3 (en) 2018-06-28
PH12019550057A1 (en) 2020-01-20
MX2019004130A (en) 2020-01-30

Similar Documents

Publication Publication Date Title
SG11201903272XA (en) Method and systems for the representation and processing of bioinformatics data using reference sequences
SG11201908571UA (en) Leveraging sequence-based fecal microbial community survey data to identify a composite biomarker for colorectal cancer
SG11201903895XA (en) Blockchain data processing method and apparatus
SG11201804190YA (en) Method and system for blockchain variant using digital signatures
SG11201907090WA (en) Affine motion information derivation
SG11201903141QA (en) Business processing method and apparatus
SG11201806190TA (en) Isotachophoresis for purification of nucleic acids
SG11201811338VA (en) A novel botulinum neurotoxin and its derivatives
SG11201807334SA (en) Methods, compositions, and devices for information storage
SG11201901550WA (en) Method and apparatus for data processing
SG11201811709WA (en) Compounds, compositions, and methods for the treatment of disease
SG11201903604PA (en) Iot security service
SG11201805136TA (en) Group b adenovirus encoding an anti-tcr-complex antibody or fragment
SG11201804696RA (en) Techniques for metadata processing
SG11201805562QA (en) Genomic infrastructure for on-site or cloud-based dna and rna processing and analysis
SG11201806322QA (en) Maytansinoid derivatives, conjugates thereof, and methods of use
SG11201803593QA (en) Engineered nucleic-acid targeting nucleic acids
SG11201811343SA (en) System and methods for detecting online fraud
SG11201407888RA (en) Method of sequence determination using sequence tags
SG11201903287PA (en) Anti-respiratory syncytial virus antibodies, and methods of their generation and use
SG11201905463TA (en) Abstract enclave identity
SG11201901168UA (en) Apparatuses and methods including ferroelectric memory and for operating ferroelectric memory
SG11201809872TA (en) Using hardware based secure isolated region to prevent piracy and cheating on electronic devices
SG11201808420PA (en) Method for separation of magnesium and calcium ions from saline water, for improving the quality of soft and desalinated waters
SG11201804841VA (en) Hardware integrity check