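JP7631330B2 - System and method for effective compression, representation and decompression of diverse tabulated data - Google Patents
System and method for effective compression, representation and decompression of diverse tabulated data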
- Publication number
- JP7631330B2 (application JP2022522858A)
- Authority
- JP
- Japan
- Prior art keywords
- information
- attributes
- file
- data
- chunks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/604—Tools and structures for managing or administering access control systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/10—Ontologies; Annotations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/50—Compression of genetic data
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3068—Precoding preceding compression, e.g. Burrows-Wheeler transformation
- H03M7/3079—Context modeling
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6064—Selection of Compressor
- H03M7/6082—Selection strategies
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/70—Type of the data to be coded, other than image and sound
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Medical Informatics (AREA)
- Biotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Genetics & Genomics (AREA)
- Automation & Control Theory (AREA)
- Computer Hardware Design (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2025017259A: JP2025069371A (ja) | 2019-10-18 | 2025-02-05 | System and method for effective compression, representation and decompression of diverse tabulated data |
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201962923141P | 2019-10-18 | 2019-10-18 | |
| US62/923,141 | 2019-10-18 | | |
| US202062956952P | 2020-01-03 | 2020-01-03 | |
| US62/956,952 | 2020-01-03 | | |
| PCT/EP2020/079298 WO2021074440A1 (en) | 2019-10-18 | 2020-10-17 | System and method for effective compression, representation and decompression of diverse tabulated data |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2025017259A (Division): JP2025069371A (ja) | 2019-10-18 | 2025-02-05 | System and method for effective compression, representation and decompression of diverse tabulated data |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JP2022553199A (ja) | 2022-12-22 |
| JP2022553199A5 | 2023-10-20 |
| JP7631330B2 (ja) | 2025-02-18 |
Family
ID=72915837
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022522858A (Active): JP7631330B2 (ja) | 2019-10-18 | 2020-10-17 | System and method for effective compression, representation and decompression of diverse tabulated data |
| JP2025017259A (Pending): JP2025069371A (ja) | 2019-10-18 | 2025-02-05 | System and method for effective compression, representation and decompression of diverse tabulated data |
Family Applications After (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2025017259A (Pending): JP2025069371A (ja) | 2019-10-18 | 2025-02-05 | System and method for effective compression, representation and decompression of diverse tabulated data |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US11916576B2 |
| EP (1) | EP4046279A1 |
| JP (2) | JP7631330B2 |
| CN (1) | CN114556482A |
| BR (1) | BR112022007331A2 |
| WO (1) | WO2021074440A1 |
Families Citing this family (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11463556B2 (en) * | 2020-11-18 | 2022-10-04 | Verizon Patent And Licensing Inc. | Systems and methods for packet-based file compression and storage |
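| CN114900571B (zh) * | 2022-07-13 | 2022-09-27 | 工业信息安全(四川)创新中心有限公司 | Method, device and medium for parsing trusted cryptographic instructions based on a template |
| WO2024148566A1 (zh) * | 2023-01-12 | 2024-07-18 | 华为技术有限公司 | Data compression and transmission method, apparatus, device and storage medium |
| CN116521063B (zh) * | 2023-03-31 | 2024-03-26 | 北京瑞风协同科技股份有限公司 | Method and apparatus for efficient reading and writing of HDF5 test data |
| CN117312309B (zh) * | 2023-09-20 | 2026-04-21 | 北京火山引擎科技有限公司 | Data processing method and apparatus for software products |
| CN117312261B (zh) * | 2023-11-29 | 2024-02-09 | 苏州元脑智能科技有限公司 | File compression encoding method, apparatus, storage medium and electronic device |
| CN120730504A (zh) * | 2024-03-29 | 2025-09-30 | 华为技术有限公司 | Communication method and apparatus |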
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130031092A1 (en) | 2010-04-26 | 2013-01-31 | Samsung Electronics Co., Ltd. | Method and apparatus for compressing genetic data |
| US20180089369A1 (en) | 2016-05-19 | 2018-03-29 | Seven Bridges Genomics Inc. | Systems and methods for sequence encoding, storage, and compression |
| WO2018071055A1 (en) | 2016-10-11 | 2018-04-19 | Genomsys Sa | Method and apparatus for the compact representation of bioinformatics data |
| JP2019537781A (ja) | 2016-10-11 | 2019-12-26 | Genomsys Sa | Method and system for storing and accessing bioinformatics data |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR101074010B1 (ko) * | 2009-09-04 | 2011-10-17 | (주)이스트소프트 | Block-unit data compression and restoration method and apparatus therefor |
| US8412462B1 (en) * | 2010-06-25 | 2013-04-02 | Annai Systems, Inc. | Methods and systems for processing genomic data |
| EP2595076B1 (en) * | 2011-11-18 | 2019-05-15 | Tata Consultancy Services Limited | Compression of genomic data |
| US8937564B2 (en) * | 2013-01-10 | 2015-01-20 | Infinidat Ltd. | System, method and non-transitory computer readable medium for compressing genetic information |
| US11998540B2 (en) | 2015-12-11 | 2024-06-04 | The General Hospital Corporation | Compositions and methods for treating drug-tolerant glioblastoma |
| US20170177597A1 (en) * | 2015-12-22 | 2017-06-22 | DNANEXUS, Inc. | Biological data systems |
| WO2017153456A1 (en) * | 2016-03-09 | 2017-09-14 | Sophia Genetics S.A. | Methods to compress, encrypt and retrieve genomic alignment data |
| SG11201903858XA (en) | 2016-10-28 | 2019-05-30 | Illumina Inc | Bioinformatics systems, apparatuses, and methods for performing secondary and/or tertiary processing |
| CN109712674B (zh) * | 2019-01-14 | 2023-06-30 | 深圳市泰尔迪恩生物信息科技有限公司 | Annotation database index structure, and method and system for rapid annotation of genetic variants |
| US10554220B1 (en) * | 2019-01-30 | 2020-02-04 | International Business Machines Corporation | Managing compression and storage of genomic data |
-
2020
- 2020-10-17 CN CN202080073109.1A patent/CN114556482A/zh active Pending
- 2020-10-17 EP EP20792983.7A patent/EP4046279A1/en active Pending
- 2020-10-17 BR BR112022007331A patent/BR112022007331A2/pt unknown
- 2020-10-17 US US17/767,070 patent/US11916576B2/en active Active
- 2020-10-17 WO PCT/EP2020/079298 patent/WO2021074440A1/en not_active Ceased
- 2020-10-17 JP JP2022522858A patent/JP7631330B2/ja active Active
-
2025
- 2025-02-05 JP JP2025017259A patent/JP2025069371A/ja active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| BR112022007331A2 (pt) | 2022-07-05 |
| CN114556482A (zh) | 2022-05-27 |
| US11916576B2 (en) | 2024-02-27 |
| US20220368347A1 (en) | 2022-11-17 |
| JP2022553199A (ja) | 2022-12-22 |
| JP2025069371A (ja) | 2025-04-30 |
| EP4046279A1 (en) | 2022-08-24 |
| WO2021074440A1 (en) | 2021-04-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7631330B2 (ja) | | System and method for effective compression, representation and decompression of diverse tabulated data |
| Holley et al. | | Bloom Filter Trie: an alignment-free and reference-free data structure for pan-genome storage |
| US11475034B2 (en) | | Schemaless to relational representation conversion |
| US11899641B2 (en) | | Trie-based indices for databases |
| US9805080B2 (en) | | Data driven relational algorithm formation for execution against big data |
| CN103026631B (zh) | | Method and system for compressing XML documents |
| CN109299086B (zh) | | Optimal sort key compression and index rebuilding |
| CN114556318A (zh) | | Customizable delimited text compression framework |
| Holley et al. | | Bloom filter trie–a data structure for pan-genome storage |
| CN113901006B (zh) | | Large-scale gene sequencing data storage and query system |
| HUP0101298A2 (hu) | | Data structure and method for producing multi-level indexes consisting of blocks, method for producing an index based on keys of data records, and method for producing balanced indexes |
| EP3526709B1 (en) | | Efficient data structures for bioinformatics information representation |
| JP7775215B2 (ja) | | Method for efficient data compression in MPEG-G, genomic encoder, genomic decoder and computer-readable medium |
| CN110168652B (zh) | | Method and system for storing and accessing bioinformatics data |
| CN107852173B (zh) | | Method and apparatus for performing search and retrieval on losslessly reduced data |
| CN108475508B (zh) | | Reduction of audio data and data stored on a block processing storage system |
| Brown et al. | | Improved pangenomic classification accuracy with chain statistics |
| US12445148B2 (en) | | System and method for effective compression representation and decompression of diverse tabulated data |
| Pandey et al. | | VariantStore: an index for large-scale genomic variant search |
| TWI816954B (zh) | | Method and apparatus for reconstructing a sequence of losslessly reduced data chunks, method and apparatus for determining metadata for prime data elements, and storage medium |
| Lichtenwalter et al. | | Genotypic data in relational databases: efficient storage and rapid retrieval |
| Luo et al. | | GSC: efficient lossless compression of VCF files with fast query |
| Tischler | | Low space external memory construction of the succinct permuted longest common prefix array |
| KR20250097132A (ko) | | Blockchain keyword search apparatus and method |
| Backman et al. | | Package ‘bioassayR’ |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20231012 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20231012 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20250107 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20250205 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7631330; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |