JP7631330B2 - System and method for effective compression, representation and decompression of diverse tabulated data - Google Patents

System and method for effective compression, representation and decompression of diverse tabulated data

Info

Publication number
JP7631330B2
JP7631330B2 JP2022522858A JP2022522858A JP7631330B2 JP 7631330 B2 JP7631330 B2 JP 7631330B2 JP 2022522858 A JP2022522858 A JP 2022522858A JP 2022522858 A JP2022522858 A JP 2022522858A JP 7631330 B2 JP7631330 B2 JP 7631330B2
Authority
JP
Japan
Prior art keywords
information
attributes
file
data
chunks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2022522858A
Other languages
English (en)
Japanese (ja)
Other versions
JP2022553199A5
JP2022553199A (ja)
Inventor
Shubham Chandak
Yee Him Cheung
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips NV
Publication of JP2022553199A
Publication of JP2022553199A5
Priority to JP2025017259A (JP2025069371A)
Application granted
Publication of JP7631330B2
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/604 Tools and structures for managing or administering access control systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/10 Ontologies; Annotations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/50 Compression of genetic data
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3068 Precoding preceding compression, e.g. Burrows-Wheeler transformation
    • H03M7/3079 Context modeling
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/60 General implementation details not specific to a particular type of compression
    • H03M7/6064 Selection of Compressor
    • H03M7/6082 Selection strategies
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/70 Type of the data to be coded, other than image and sound

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Physics & Mathematics
  • Health & Medical Sciences
  • Life Sciences & Earth Sciences
  • Databases & Information Systems
  • General Health & Medical Sciences
  • Bioethics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Bioinformatics & Computational Biology
  • Evolutionary Biology
  • Spectroscopy & Molecular Physics
  • Medical Informatics
  • Biotechnology
  • Bioinformatics & Cheminformatics
  • Biophysics
  • Data Mining & Analysis
  • Genetics & Genomics
  • Automation & Control Theory
  • Computer Hardware Design
  • Software Systems
  • Computer Security & Cryptography
  • Information Retrieval, Db Structures And Fs Structures Therefor
  • Compression, Expansion, Code Conversion, And Decoders
JP2022522858A 2019-10-18 2020-10-17 System and method for effective compression, representation and decompression of diverse tabulated data Active JP7631330B2 (ja)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2025017259A JP2025069371A (ja) 2019-10-18 2025-02-05 System and method for effective compression, representation and decompression of diverse tabulated data

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201962923141P 2019-10-18 2019-10-18
US62/923,141 2019-10-18
US202062956952P 2020-01-03 2020-01-03
US62/956,952 2020-01-03
PCT/EP2020/079298 WO2021074440A1 (en) 2019-10-18 2020-10-17 System and method for effective compression, representation and decompression of diverse tabulated data

Related Child Applications (1)

Application Number Title Priority Date Filing Date
JP2025017259A Division JP2025069371A (ja) 2019-10-18 2025-02-05 System and method for effective compression, representation and decompression of diverse tabulated data

Publications (3)

Publication Number Publication Date
JP2022553199A (ja) 2022-12-22
JP2022553199A5 2023-10-20
JP7631330B2 (ja) 2025-02-18

Family

ID=72915837

Family Applications (2)

Application Number Title Priority Date Filing Date
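JP2022522858A Active JP7631330B2 (ja) 2019-10-18 2020-10-17 System and method for effective compression, representation and decompression of diverse tabulated data
JP2025017259A Pending JP2025069371A (ja) 2019-10-18 2025-02-05 System and method for effective compression, representation and decompression of diverse tabulated data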

Family Applications After (1)

Application Number Title Priority Date Filing Date
JP2025017259A Pending JP2025069371A (ja) 2019-10-18 2025-02-05 System and method for effective compression, representation and decompression of diverse tabulated data

Country Status (6)

Country Link
US (1) US11916576B2
EP (1) EP4046279A1
JP (2) JP7631330B2
CN (1) CN114556482A
BR (1) BR112022007331A2
WO (1) WO2021074440A1

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11463556B2 (en) * 2020-11-18 2022-10-04 Verizon Patent And Licensing Inc. Systems and methods for packet-based file compression and storage
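CN114900571B (zh) * 2022-07-13 2022-09-27 工业信息安全(四川)创新中心有限公司 Method, device and medium for parsing trusted cryptographic instructions based on a template
WO2024148566A1 (zh) * 2023-01-12 2024-07-18 华为技术有限公司 Data compression and transmission method, apparatus, device and storage medium
CN116521063B (zh) * 2023-03-31 2024-03-26 北京瑞风协同科技股份有限公司 Method and apparatus for efficient reading and writing of HDF5 test data
CN117312309B (zh) * 2023-09-20 2026-04-21 北京火山引擎科技有限公司 Data processing method and apparatus for software products
CN117312261B (zh) * 2023-11-29 2024-02-09 苏州元脑智能科技有限公司 File compression coding method, apparatus, storage medium and electronic device
CN120730504A (zh) * 2024-03-29 2025-09-30 华为技术有限公司 Communication method and apparatus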

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130031092A1 (en) 2010-04-26 2013-01-31 Samsung Electronics Co., Ltd. Method and apparatus for compressing genetic data
US20180089369A1 (en) 2016-05-19 2018-03-29 Seven Bridges Genomics Inc. Systems and methods for sequence encoding, storage, and compression
WO2018071055A1 (en) 2016-10-11 2018-04-19 Genomsys Sa Method and apparatus for the compact representation of bioinformatics data
JP2019537781A (ja) 2016-10-11 2019-12-26 Genomsys Sa Method and system for storing and accessing bioinformatics data

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101074010B1 (ko) * 2009-09-04 2011-10-17 (주)이스트소프트 Block-unit data compression and decompression method and apparatus therefor
US8412462B1 (en) * 2010-06-25 2013-04-02 Annai Systems, Inc. Methods and systems for processing genomic data
EP2595076B1 (en) * 2011-11-18 2019-05-15 Tata Consultancy Services Limited Compression of genomic data
US8937564B2 (en) * 2013-01-10 2015-01-20 Infinidat Ltd. System, method and non-transitory computer readable medium for compressing genetic information
US11998540B2 (en) 2015-12-11 2024-06-04 The General Hospital Corporation Compositions and methods for treating drug-tolerant glioblastoma
US20170177597A1 (en) * 2015-12-22 2017-06-22 DNANEXUS, Inc. Biological data systems
WO2017153456A1 (en) * 2016-03-09 2017-09-14 Sophia Genetics S.A. Methods to compress, encrypt and retrieve genomic alignment data
SG11201903858XA (en) 2016-10-28 2019-05-30 Illumina Inc Bioinformatics systems, apparatuses, and methods for performing secondary and/or tertiary processing
CN109712674B (zh) * 2019-01-14 2023-06-30 深圳市泰尔迪恩生物信息科技有限公司 Annotation database index structure, and method and system for rapid annotation of genetic variants
US10554220B1 (en) * 2019-01-30 2020-02-04 International Business Machines Corporation Managing compression and storage of genomic data

Also Published As

Publication number Publication date
BR112022007331A2 (pt) 2022-07-05
CN114556482A (zh) 2022-05-27
US11916576B2 (en) 2024-02-27
US20220368347A1 (en) 2022-11-17
JP2022553199A (ja) 2022-12-22
JP2025069371A (ja) 2025-04-30
EP4046279A1 (en) 2022-08-24
WO2021074440A1 (en) 2021-04-22

Similar Documents

Publication Publication Date Title
JP7631330B2 (ja) System and method for effective compression, representation and decompression of diverse tabulated data
Holley et al. Bloom Filter Trie: an alignment-free and reference-free data structure for pan-genome storage
US11475034B2 (en) Schemaless to relational representation conversion
US11899641B2 (en) Trie-based indices for databases
US9805080B2 (en) Data driven relational algorithm formation for execution against big data
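CN103026631B (zh) Method and system for compressing XML documents
CN109299086B (zh) Optimal sort key compression and index rebuilding
CN114556318A (zh) Customizable delimited text compression framework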
Holley et al. Bloom filter trie–a data structure for pan-genome storage
CN113901006B (zh) Large-scale gene sequencing data storage and query system
HUP0101298A2 (hu) Data structure and method for producing multi-level indexes composed of blocks, method for producing an index based on the keys of data records, and method for producing balanced indexes
EP3526709B1 (en) Efficient data structures for bioinformatics information representation
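JP7775215B2 (ja) Method for efficient data compression in MPEG-G, genome encoder, genome decoder and computer-readable medium
CN110168652B (zh) Method and system for storing and accessing bioinformatics data
CN107852173B (zh) Method and apparatus for performing search and retrieval on losslessly reduced data
CN108475508B (zh) Reduction of audio data and data stored in a block processing storage system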
Brown et al. Improved pangenomic classification accuracy with chain statistics
US12445148B2 (en) System and method for effective compression representation and decompression of diverse tabulated data
Pandey et al. VariantStore: an index for large-scale genomic variant search
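TWI816954B (zh) Method and apparatus for reconstructing a sequence of losslessly reduced data chunks, method and apparatus for determining metadata for prime data elements, and storage medium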
Lichtenwalter et al. Genotypic data in relational databases: efficient storage and rapid retrieval
Luo et al. GSC: efficient lossless compression of VCF files with fast query
Tischler Low space external memory construction of the succinct permuted longest common prefix array
KR20250097132A (ko) Blockchain keyword search apparatus and method
Backman et al. Package ‘bioassayR’

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20231012

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20231012

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20250107

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20250205

R150 Certificate of patent or registration of utility model

Ref document number: 7631330

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150