JP7826277B2 - Genomic information compression by configurable machine learning-based arithmetic coding - Google Patents
- Publication number
- JP7826277B2 (application number JP2023500391A)
- Authority
- JP
- Japan
- Prior art keywords
- context
- data
- training
- type
- encoding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/50—Compression of genetic data
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/4006—Conversion to or from arithmetic code
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6064—Selection of Compressor
- H03M7/6076—Selection between compressors of the same type
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3068—Precoding preceding compression, e.g. Burrows-Wheeler transformation
- H03M7/3079—Context modeling
Landscapes
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Genetics & Genomics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Public Health (AREA)
- Evolutionary Computation (AREA)
- Epidemiology (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202063050193P | 2020-07-10 | 2020-07-10 | |
| US63/050,193 | 2020-07-10 | | |
| PCT/EP2021/067960 WO2022008311A1 (en) | 2020-07-10 | 2021-06-30 | Genomic information compression by configurable machine learning-based arithmetic coding |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JP2023535131A (ja) | 2023-08-16 |
| JP2023535131A5 | 2023-08-23 |
| JP7826277B2 (ja) | 2026-03-09 |
Family
ID=76920753
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| JP2023500391A (Active; granted as JP7826277B2 (ja)) | Genomic information compression by configurable machine learning-based arithmetic coding | 2020-07-10 | 2021-06-30 |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US20230253074A1 |
| EP (1) | EP4179539B1 |
| JP (1) | JP7826277B2 |
| CN (1) | CN116018647A |
| ES (1) | ES3050587T3 |
| PL (1) | PL4179539T3 |
| WO (1) | WO2022008311A1 |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113810155B (zh) * | 2020-06-17 | 2022-11-18 | 华为技术有限公司 | Channel encoding and decoding method and communication apparatus |
| US11818399B2 (en) * | 2021-01-04 | 2023-11-14 | Tencent America LLC | Techniques for signaling neural network topology and parameters in the coded video stream |
| CN115391298B (zh) * | 2021-05-25 | 2026-03-27 | 戴尔产品有限公司 | Content-based dynamic hybrid data compression |
| EP4465207A4 (en) * | 2022-01-13 | 2025-10-15 | Lg Electronics Inc | METHOD BY WHICH A RECEIVING DEVICE PERFORMS END-TO-END LEARNING IN A WIRELESS COMMUNICATION SYSTEM, RECEIVING DEVICE, PROCESSING DEVICE, STORAGE MEDIUM, METHOD BY WHICH A TRANSMITTING DEVICE PERFORMS END-TO-END LEARNING, AND TRANSMITTING DEVICE |
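| JP2025522817A (ja) * | 2022-06-30 | 2025-07-17 | 華為技術有限公司 | Adaptive selection of entropy coding parameters |
| CN115083530B (zh) * | 2022-08-22 | 2022-11-04 | 广州明领基因科技有限公司 | Gene sequencing data compression method, apparatus, terminal device, and storage medium |
| CN117692094B (zh) * | 2022-09-02 | 2026-03-20 | 北京邮电大学 | Encoding method, decoding method, encoding apparatus, decoding apparatus, and electronic device |
| CN116886104B (zh) * | 2023-09-08 | 2023-11-21 | 西安小草植物科技有限责任公司 | Artificial intelligence-based smart medical data analysis method |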
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
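| CN101461243A (zh) | 2006-03-29 | 2009-06-17 | 诺基亚西门子通信有限责任两合公司 | Method and device for generating data blocks for a scalable data stream |
| CN110663022A (zh) | 2016-10-11 | 2020-01-07 | 耶诺姆希斯股份公司 | Method and apparatus for compact representation of bioinformatics data using multiple genomic descriptors |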
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107529709B (zh) * | 2011-06-16 | 2019-05-07 | Ge视频压缩有限责任公司 | Decoder, encoder, methods for decoding and encoding video, and storage medium |
| CN106096327B (zh) * | 2016-06-07 | 2018-08-17 | 广州麦仑信息科技有限公司 | Gene trait recognition method based on Torch supervised deep learning |
| EP3583250B1 (en) * | 2017-02-14 | 2023-07-12 | Genomsys SA | Method and systems for the efficient compression of genomic sequence reads |
| CN108306650A (zh) * | 2018-01-16 | 2018-07-20 | 厦门极元科技有限公司 | Compression method for gene sequencing data |
| PL4100954T3 (pl) * | 2020-02-07 | 2026-01-26 | Koninklijke Philips N.V. | Improved quality value compression framework in aligned sequencing data based on novel contexts |
2021
- 2021-06-30 US US18/015,089 patent/US20230253074A1/en active Pending
- 2021-06-30 PL PL21742062.9T patent/PL4179539T3/pl unknown
- 2021-06-30 ES ES21742062T patent/ES3050587T3/es active Active
- 2021-06-30 EP EP21742062.9A patent/EP4179539B1/en active Active
- 2021-06-30 WO PCT/EP2021/067960 patent/WO2022008311A1/en not_active Ceased
- 2021-06-30 JP JP2023500391A patent/JP7826277B2/ja active Active
- 2021-06-30 CN CN202180056542.9A patent/CN116018647A/zh active Pending
Non-Patent Citations (1)
| Title |
|---|
| W. Yang, Y. Lin, S. Wu, R. Yu, Improving Coding Efficiency of MPEG-G Standard Using Context-Based Arithmetic Coding, 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018, pp. 1177-1183, [online] [retrieved 2025-06-30], retrieved from <https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8621550> |
Also Published As
| Publication number | Publication date |
|---|---|
| US20230253074A1 (en) | 2023-08-10 |
| EP4179539C0 (en) | 2025-10-01 |
| EP4179539A1 (en) | 2023-05-17 |
| CN116018647A (zh) | 2023-04-25 |
| EP4179539B1 (en) | 2025-10-01 |
| JP2023535131A (ja) | 2023-08-16 |
| ES3050587T3 (en) | 2025-12-22 |
| WO2022008311A1 (en) | 2022-01-13 |
| PL4179539T3 (pl) | 2026-01-05 |
Similar Documents
| Publication | Title |
|---|---|
| JP7826277B2 (ja) | Genomic information compression by configurable machine learning-based arithmetic coding |
| Zheng et al. | In-network machine learning using programmable network devices: A survey |
| Benoit et al. | Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph |
| EP3534283B1 (en) | Classification of source data by neural network processing |
| EP3534284B1 (en) | Classification of source data by neural network processing |
| JP7372347B2 (ja) | Data compression method and computing device |
| Yu et al. | Two-level data compression using machine learning in time series database |
| JP7810664B2 (ja) | Quality score compression |
| WO2015180203A1 (zh) | Lossless compression system and compression method for high-throughput DNA sequencing quality scores |
| CN114048328B (zh) | Knowledge graph link prediction method and system based on transformation hypothesis and message passing |
| US20110208820A1 (en) | Method and system for message handling |
| CN107783998A (zh) | Data processing method and apparatus |
| US20230222354A1 (en) | A method for a distributed learning |
| CN114222998A (zh) | Feature dictionary for bandwidth enhancement |
| JP2021072540A (ja) | Image encoding device, decoding device, transmission system, and control method therefor |
| CN110362683A (zh) | Information steganography method, apparatus, and storage medium based on a recurrent neural network |
| JP7674340B2 (ja) | Method for compression of genome sequence data |
| US12423271B2 (en) | System and methods for adaptive bandwidth-efficient encoding of genomic data |
| US12218697B2 (en) | Event-driven data transmission using codebooks with protocol prediction and translation |
| WO2020070943A1 (ja) | Pattern recognition device and trained model |
| CN111008276B (zh) | Complete entity relation extraction method and apparatus |
| CN117995276A (zh) | Generative model-based missing data imputation method, electronic device, and medium |
| US20190057185A1 (en) | Compression/Decompression Method and Apparatus for Genomic Variant Call Data |
| CN116186202A (zh) | New word discovery method and system combining time-domain features |
| Pasquini et al. | Robust and Lightweight Modeling of IoT Network Behaviors From Raw Traffic Packets |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 2023-02-14 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2023-08-10 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2024-06-27 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
| 2025-07-08 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |
| 2025-08-29 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2025-10-21 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |
| 2025-12-19 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| | TRDD | Decision of grant or rejection written | |
| 2026-01-27 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
| 2026-02-25 | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7826277; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |