GB2578709B - Context aware delta algorithm for genomic files - Google Patents
Context aware delta algorithm for genomic files
- Publication number
- GB2578709B (application number GB2003514.3A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- context aware
- delta algorithm
- genomic
- files
- genomic files
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/693,019 US11163726B2 (en) | 2017-08-31 | 2017-08-31 | Context aware delta algorithm for genomic files |
| PCT/IB2018/056009 WO2019043481A1 (en) | 2017-08-31 | 2018-08-09 | DELTA ALGORITHM SENSITIVE TO THE CONTEXT FOR GENOMIC FILES |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| GB202003514D0 (en) | 2020-04-29 |
| GB2578709A (en) | 2020-05-20 |
| GB2578709B (en) | 2020-09-23 |
Family
ID=65435154
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GB2003514.3A Active GB2578709B (en) | 2017-08-31 | 2018-08-09 | Context aware delta algorithm for genomic files |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US11163726B2 |
| JP (1) | JP7157141B2 |
| CN (1) | CN111095421B |
| GB (1) | GB2578709B |
| WO (1) | WO2019043481A1 |
Families Citing this family (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10554220B1 (en) * | 2019-01-30 | 2020-02-04 | International Business Machines Corporation | Managing compression and storage of genomic data |
| US11188503B2 (en) * | 2020-02-18 | 2021-11-30 | International Business Machines Corporation | Record-based matching in data compression |
| CN113012755B (zh) * | 2021-04-12 | 2023-10-27 | 聊城大学 | Genome ATCG retrieval method |
| US12580047B2 (en) | 2022-01-18 | 2026-03-17 | Dell Products L.P. | Biological sequence compression using sequence alignment |
| US12530320B2 (en) | 2022-01-18 | 2026-01-20 | Dell Products L.P. | File compression using sequence splits and sequence alignment |
| US12572509B2 (en) | 2022-01-18 | 2026-03-10 | Dell Products L.P. | Structure based file compression using sequence alignment |
| US12511260B2 (en) * | 2022-01-18 | 2025-12-30 | Dell Products L.P. | File compression using sequence alignment |
| US12353358B2 (en) | 2022-01-18 | 2025-07-08 | Dell Products L.P. | Adding content to compressed files using sequence alignment |
| US12339811B2 (en) | 2022-04-12 | 2025-06-24 | Dell Products L.P. | Compressing multiple dimension files using sequence alignment |
| US11977517B2 (en) | 2022-04-12 | 2024-05-07 | Dell Products L.P. | Warm start file compression using sequence alignment |
| US12216621B2 (en) * | 2022-04-12 | 2025-02-04 | Dell Products L.P. | Hyperparameter optimization in file compression using sequence alignment |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103546160A (zh) * | 2013-09-22 | 2014-01-29 | 上海交通大学 | Hierarchical gene sequence compression method based on multiple reference sequences |
| US8972201B2 (en) * | 2011-12-24 | 2015-03-03 | Tata Consultancy Services Limited | Compression of genomic data file |
| US20160306919A1 (en) * | 2013-12-06 | 2016-10-20 | International Business Machines Corporation | Genome compression and decompression |
Family Cites Families (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20040086861A1 (en) * | 2000-04-19 | 2004-05-06 | Satoshi Omori | Method and device for recording sequence information on nucleotides and amino acids |
| CN1680589A (zh) * | 2003-06-06 | 2005-10-12 | 李志广 | Screening of human leukocyte antigen typing probes for gene chips and method of application thereof |
| US7657383B2 (en) * | 2004-05-28 | 2010-02-02 | International Business Machines Corporation | Method, system, and apparatus for compactly storing a subject genome |
| WO2006052242A1 (en) * | 2004-11-08 | 2006-05-18 | Seirad, Inc. | Methods and systems for compressing and comparing genomic data |
| CN101535945A (zh) * | 2006-04-25 | 2009-09-16 | 英孚威尔公司 | Full-text query and search system and method of use thereof |
| US20110119240A1 (en) * | 2009-11-18 | 2011-05-19 | Dana Shapira | Method and system for generating a bidirectional delta file |
| WO2011106629A2 (en) | 2010-02-26 | 2011-09-01 | Life Technologies Corporation | Modified proteins and methods of making and using same |
| WO2012092515A2 (en) | 2010-12-30 | 2012-07-05 | Life Technologies Corporation | Methods, systems, and computer readable media for nucleic acid sequencing |
| EP2595076B1 (en) * | 2011-11-18 | 2019-05-15 | Tata Consultancy Services Limited | Compression of genomic data |
| US9715574B2 (en) * | 2011-12-20 | 2017-07-25 | Michael H. Baym | Compressing, storing and searching sequence data |
| GB2507751A (en) * | 2012-11-07 | 2014-05-14 | Ibm | Storing data files in a file system which provides reference data files |
| NL2012222C2 (en) * | 2014-02-06 | 2015-08-10 | Genalice B V | A method of storing/reconstructing a multitude of sequences in/from a data storage structure. |
| GB2530012A (en) * | 2014-08-05 | 2016-03-16 | Illumina Cambridge Ltd | Methods and systems for data analysis and compression |
2017
- 2017-08-31: US application US15/693,019, patent US11163726B2 (en), status Active
2018
- 2018-08-09: JP application JP2020509515A, patent JP7157141B2 (ja), status Active
- 2018-08-09: GB application GB2003514.3A, patent GB2578709B (en), status Active
- 2018-08-09: WO application PCT/IB2018/056009, publication WO2019043481A1 (en), status Ceased
- 2018-08-09: CN application CN201880054764.5A, patent CN111095421B (zh), status Active
Also Published As
| Publication number | Publication date |
|---|---|
| CN111095421A (zh) | 2020-05-01 |
| US20190065518A1 (en) | 2019-02-28 |
| JP2020533666A (ja) | 2020-11-19 |
| CN111095421B (zh) | 2024-02-02 |
| JP7157141B2 (ja) | 2022-10-19 |
| GB2578709A (en) | 2020-05-20 |
| GB202003514D0 (en) | 2020-04-29 |
| WO2019043481A1 (en) | 2019-03-07 |
| US11163726B2 (en) | 2021-11-02 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| GB2578709B (en) | Context aware delta algorithm for genomic files | |
| EP3237017A4 (en) | Systems and methods for genome modification and regulation | |
| EP3274813A4 (en) | Access files | |
| EP3241301A4 (en) | Encrypted file storage | |
| EP3132025A4 (en) | Methods and compositions for modifying genomic dna | |
| EP3157527A4 (en) | Ezh2 inhibitors for treating lymphoma | |
| EP3157536A4 (en) | Methods for treating overweight or obesity | |
| ZA201800782B (en) | Compounds and methods for inhibiting jak | |
| EP3117004A4 (en) | Genomic insulator elements and uses thereof | |
| EP3120278A4 (en) | Methods and systems for genome comparison | |
| EP3107902A4 (en) | Compounds and methods for inhibiting fascin | |
| EP3304275A4 (en) | Protection of data files | |
| PT3183295T (pt) | Fractionated alkylated cyclodextrin compositions and processes for preparing and using the same | |
| EP3164394A4 (en) | Gls1 inhibitors for treating disease | |
| EP3165625A4 (en) | Wire material for steel wire, and steel wire | |
| GB201711552D0 (en) | Secure file transfer | |
| AP2016009570A0 (en) | Triaminopyrimidine compounds useful for preventing or treating malaria | |
| EP3230188A4 (en) | Evacuation controller | |
| EP3224155A4 (en) | Non-slip cable tie | |
| EP3206347A4 (en) | Method for calling routing algorithm, sdn controller, and sdn-oaf | |
| EP3168190A4 (en) | Method for purifying chlorosilane | |
| EP3250218A4 (en) | Methods for treating obesity | |
| EP3244896A4 (en) | Methods for treating pulmonary hypertension | |
| EP3168973A4 (en) | Cross regulation circuit for multiple outputs and cross regulation method thereof | |
| EP3263608A4 (en) | Method for catalyst removal |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 746 | Register noted 'licences of right' (sect. 46/1977) | Effective date: 20201123 |