ES3050587T3 - Genomic information compression by configurable machine learning-based arithmetic coding
Info
- Publication number
- ES3050587T3
- Authority
- ES
- Spain
- Prior art keywords
- context
- encoding
- data
- type
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/50—Compression of genetic data
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/4006—Conversion to or from arithmetic code
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6064—Selection of Compressor
- H03M7/6076—Selection between compressors of the same type
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3068—Precoding preceding compression, e.g. Burrows-Wheeler transformation
- H03M7/3079—Context modeling
Landscapes
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Genetics & Genomics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Public Health (AREA)
- Evolutionary Computation (AREA)
- Epidemiology (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202063050193P | 2020-07-10 | 2020-07-10 | |
| PCT/EP2021/067960 WO2022008311A1 (en) | 2020-07-10 | 2021-06-30 | Genomic information compression by configurable machine learning-based arithmetic coding |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| ES3050587T3 (en) | 2025-12-22 |
Family
ID=76920753
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| ES21742062T (Active), published as ES3050587T3 (en) | 2020-07-10 | 2021-06-30 | Genomic information compression by configurable machine learning-based arithmetic coding |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US20230253074A1 |
| EP (1) | EP4179539B1 |
| JP (1) | JP7826277B2 |
| CN (1) | CN116018647A |
| ES (1) | ES3050587T3 |
| PL (1) | PL4179539T3 |
| WO (1) | WO2022008311A1 |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113810155B (zh) * | 2020-06-17 | 2022-11-18 | 华为技术有限公司 | Channel encoding and decoding method and communication apparatus |
| US11818399B2 (en) * | 2021-01-04 | 2023-11-14 | Tencent America LLC | Techniques for signaling neural network topology and parameters in the coded video stream |
| CN115391298B (zh) * | 2021-05-25 | 2026-03-27 | 戴尔产品有限公司 | Content-based dynamic hybrid data compression |
| EP4465207A4 (en) * | 2022-01-13 | 2025-10-15 | LG Electronics Inc | Method by which a receiving device performs end-to-end learning in a wireless communication system, receiving device, processing device, storage medium, method by which a transmitting device performs end-to-end learning, and transmitting device |
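| JP2025522817A (ja) * | 2022-06-30 | 2025-07-17 | 華為技術有限公司 | Adaptive selection of entropy coding parameters |
| CN115083530B (zh) * | 2022-08-22 | 2022-11-04 | 广州明领基因科技有限公司 | Gene sequencing data compression method, apparatus, terminal device, and storage medium |
| CN117692094B (zh) * | 2022-09-02 | 2026-03-20 | 北京邮电大学 | Encoding method, decoding method, encoding apparatus, decoding apparatus, and electronic device |
| CN116886104B (zh) * | 2023-09-08 | 2023-11-21 | 西安小草植物科技有限责任公司 | Artificial-intelligence-based smart medical data analysis method |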
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2002655A1 (de) * | 2006-03-29 | 2008-12-17 | Nokia Siemens Networks Gmbh & Co. Kg | Method and device for creating a data block for a scalable data stream |
| CN107529709B (zh) * | 2011-06-16 | 2019-05-07 | Ge视频压缩有限责任公司 | Decoder, encoder, methods for decoding and encoding video, and storage medium |
| CN106096327B (zh) * | 2016-06-07 | 2018-08-17 | 广州麦仑信息科技有限公司 | Gene trait recognition method based on Torch supervised deep learning |
| CN110663022B (zh) * | 2016-10-11 | 2024-03-15 | 耶诺姆希斯股份公司 | Method and apparatus for compact representation of bioinformatics data using genomic descriptors |
| EP3583250B1 (en) * | 2017-02-14 | 2023-07-12 | Genomsys SA | Method and systems for the efficient compression of genomic sequence reads |
| CN108306650A (zh) * | 2018-01-16 | 2018-07-20 | 厦门极元科技有限公司 | Compression method for gene sequencing data |
| PL4100954T3 (pl) * | 2020-02-07 | 2026-01-26 | Koninklijke Philips N.V. | Improved quality value compression framework in aligned sequencing data based on novel contexts |
2021
- 2021-06-30 US: application US18/015,089, published as US20230253074A1 (en), status: Pending
- 2021-06-30 PL: application PL21742062.9T, published as PL4179539T3 (pl), status: unknown
- 2021-06-30 ES: application ES21742062T, published as ES3050587T3 (es), status: Active
- 2021-06-30 EP: application EP21742062.9A, published as EP4179539B1 (en), status: Active
- 2021-06-30 WO: application PCT/EP2021/067960, published as WO2022008311A1 (en), status: Ceased
- 2021-06-30 JP: application JP2023500391A, published as JP7826277B2 (ja), status: Active
- 2021-06-30 CN: application CN202180056542.9A, published as CN116018647A (zh), status: Pending
Also Published As
| Publication number | Publication date |
|---|---|
| US20230253074A1 (en) | 2023-08-10 |
| JP7826277B2 (ja) | 2026-03-09 |
| EP4179539C0 (en) | 2025-10-01 |
| EP4179539A1 (en) | 2023-05-17 |
| CN116018647A (zh) | 2023-04-25 |
| EP4179539B1 (en) | 2025-10-01 |
| JP2023535131A (ja) | 2023-08-16 |
| WO2022008311A1 (en) | 2022-01-13 |
| PL4179539T3 (pl) | 2026-01-05 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| ES3050587T3 (en) | Genomic information compression by configurable machine learning-based arithmetic coding | |
| Benoit et al. | Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph | |
| Sabary et al. | Survey for a decade of coding for DNA storage | |
| US12596685B2 (en) | System and methods for bandwidth-efficient data encoding | |
| US20250047300A1 (en) | System and method for data processing and transformation using reference data structures | |
| US12368453B2 (en) | Multi-stage fully homomorphic encryption and compression system for secure data processing and analysis | |
| US20250055476A1 (en) | Deep learning-based data compression with protocol adaptation | |
| US11734231B2 (en) | System and methods for bandwidth-efficient encoding of genomic data | |
| US12436920B2 (en) | System and method for file type identification using machine learning | |
| US12423271B2 (en) | System and methods for adaptive bandwidth-efficient encoding of genomic data | |
| CN110915140B (zh) | Method for encoding and decoding quality values of a data structure | |
| US20250284393A1 (en) | System and Method for Compaction of Floating-Point Numbers Within a Dataset with Metadata Tagging | |
| US12218697B2 (en) | Event-driven data transmission using codebooks with protocol prediction and translation | |
| US12499092B2 (en) | System and method for sourceblock length optimization for data compaction | |
| US11769570B2 (en) | Method and systems for genome sequence compression | |
| CN119068992B (zh) | DNA encoding method satisfying biological constraints, terminal device, and storage medium | |
| US20250298510A1 (en) | System and Method for Hardware-Accelerated Determination of Compression Performance Using Field-Programmable Gate Array Implementation | |
| US12483269B2 (en) | System and method for encrypted data compression with a hardware management layer | |
| US20260079891A1 (en) | System and Method for Sourceblock Length Optimization for Data Compaction | |
| US12289121B2 (en) | Adaptive neural upsampling system for decoding lossy compressed data streams | |
| US20250202498A1 (en) | System and method for enhancing decompressed data streams | |
| US20250306760A1 (en) | System and Method for Hardware-Accelerated Real-Time Tracking of Codebook Compression Performance Using a Field-Programmable Gate Array | |
| US20250284395A1 (en) | System and Method for Hybrid Codebook Performance Estimation Without Generation | |
| US12192467B1 (en) | Arithmetic encoding and decoding method based on semantic source and related device | |
| US20260030214A1 (en) | System and Method for Stream Data Type Identification Using Machine Learning |