CA3157786A1 - Customizable delimited text compression framework - Google Patents

Info

Publication number
CA3157786A1
CA3157786A1 CA3157786A CA3157786A CA3157786A1 CA 3157786 A1 CA3157786 A1 CA 3157786A1 CA 3157786 A CA3157786 A CA 3157786A CA 3157786 A CA3157786 A CA 3157786A CA 3157786 A1 CA3157786 A1 CA 3157786A1
Authority
CA
Canada
Prior art keywords
compression
data
schema
file
delimited text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3157786A
Other languages
English (en)
French (fr)
Inventor
Yee Him Cheung
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-10-18
Filing date
2020-10-15
Publication date
2021-04-22
Application filed by Koninklijke Philips NV

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/174: Redundancy elimination performed by the file system
    • G06F16/1744: Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/173: Customisation support for file systems, e.g. localisation, multi-language support, personalisation
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/12: Use of codes for handling textual entities
    • G06F40/123: Storage facilities
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/12: Use of codes for handling textual entities
    • G06F40/131: Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/183: Tabulation, i.e. one-dimensional [1D] positioning
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/50: Compression of genetic data
    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00: Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00: Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/60: General implementation details not specific to a particular type of compression
    • H03M7/6064: Selection of Compressor
    • H03M7/607: Selection between different types of compressors
    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00: Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/70: Type of the data to be coded, other than image and sound
    • H03M7/707: Structured documents, e.g. XML

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Genetics & Genomics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Document Processing Apparatus (AREA)
CA3157786A 2019-10-18 2020-10-15 Customizable delimited text compression framework Pending CA3157786A1 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201962923113P 2019-10-18 2019-10-18
US62/923,113 2019-10-18
US202062956941P 2020-01-03 2020-01-03
US62/956,941 2020-01-03
PCT/EP2020/078996 WO2021074272A1 (en) 2019-10-18 2020-10-15 Customizable delimited text compression framework

Publications (1)

Publication Number Publication Date
CA3157786A1 (en) 2021-04-22

Family

ID=72964653

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3157786A Pending CA3157786A1 (en) 2019-10-18 2020-10-15 Customizable delimited text compression framework

Country Status (7)

Country Link
US (1) US20240095218A1
EP (1) EP4046052A1
JP (1) JP7848681B2
CN (1) CN114556318A
BR (1) BR112022007396A2
CA (1) CA3157786A1
WO (1) WO2021074272A1

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102948214B1 (ko) * 2021-07-16 2026-04-03 주식회사 쏠리드 Fronthaul multiplexing device
US12387053B2 (en) 2022-01-27 2025-08-12 International Business Machines Corporation Large-scale text data encoding and compression
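CN117827775A (zh) * 2022-09-29 2024-04-05 华为技术有限公司 Data compression method and apparatus, computing device, and storage system
CN116521063B (zh) * 2023-03-31 2024-03-26 北京瑞风协同科技股份有限公司 Method and apparatus for efficient reading and writing of HDF5 test data
CN119166428B (zh) * 2024-11-21 2025-10-17 北京高阳捷迅信息技术有限公司 Big-data-based relational database backup and recovery method and system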

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2283591C (en) * 1997-03-07 2006-01-31 Intelligent Compression Technologies Data coding network
JP2000252832A (ja) * 1999-02-25 2000-09-14 Nikon Corp Data compression device and recording medium storing a data compression program
JP2005018672A (ja) * 2003-06-30 2005-01-20 Hitachi Ltd Method for compressing structured documents
GB2412978A (en) * 2004-04-07 2005-10-12 Hewlett Packard Development Co Method and system for compressing and decompressing hierarchical data structures
US9667269B2 (en) * 2009-04-30 2017-05-30 Oracle International Corporation Technique for compressing XML indexes
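JP5280425B2 (ja) * 2010-11-12 2013-09-04 シャープ株式会社 Image processing apparatus, image reading apparatus, image forming apparatus, image processing method, program, and recording medium therefor
KR101922129B1 (ko) * 2011-12-05 2018-11-26 삼성전자주식회사 Method and apparatus for compressing and decompressing genetic information obtained using next-generation sequencing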
CA2958478C (en) 2014-09-03 2019-04-16 Patrick Soon-Shiong Synthetic genomic variant-based secure transaction devices, systems and methods
JP6949970B2 (ja) 2016-10-11 2021-10-13 ゲノムシス エスアー Method and system for transmitting bioinformatics data
EA201990933A1 (ru) 2016-10-11 2019-11-29 Efficient data structures for representing bioinformatics information

Also Published As

Publication number Publication date
JP2023501093A (ja) 2023-01-18
BR112022007396A2 (pt) 2022-07-05
EP4046052A1 (en) 2022-08-24
JP7848681B2 (ja) 2026-04-21
CN114556318A (zh) 2022-05-27
US20240095218A1 (en) 2024-03-21
WO2021074272A1 (en) 2021-04-22

Similar Documents

Publication Publication Date Title
US20240095218A1 (en) Customizable delimited text compression framework
US10778441B2 (en) Redactable document signatures
US11916576B2 (en) System and method for effective compression, representation and decompression of diverse tabulated data
US11120018B2 (en) Spark query method and system supporting trusted computing
US10970281B2 (en) Searching for data using superset tree data structures
WO2018200294A1 (en) Parser for schema-free data exchange format
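JP6902104B2 (ja) Efficient data structures for bioinformatics information representation
CN115080651A (zh) Excel data import method and apparatus, electronic device, and storage medium
CN111095421A (zh) Context-aware delta algorithm for genetic files
RU2633178C2 (ru) Method and database system for indexing references to database documents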
Holley et al. Bloom filter trie–a data structure for pan-genome storage
Brown et al. Improved pangenomic classification accuracy with chain statistics
Aronson et al. Towards an engineering approach to file carver construction
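CN118523780B (zh) Method for decompressing and compressing SAS datasets and application thereof
CN118193567B (zh) Method, apparatus, device, and medium for generating query statements and querying business data
CN118692573A (zh) Genotype data compression and retrieval method, apparatus, device, and computer-readable storage medium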
US12445148B2 (en) System and method for effective compression representation and decompression of diverse tabulated data
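CN118260772A (zh) Vulnerability detection method and apparatus, and electronic device
CN121092504A (zh) Data processing method and apparatus, log processing method and apparatus, device, medium, and program product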
Tollefson Importing and Creating Data
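CN120066631A (zh) Text loading method, text file generation method, and related device
CN119690990A (zh) Database query statement checking method and apparatus, computer device, storage medium, and computer program product
CN121996323A (zh) Mini-program file compression method, mini-program file reading method, and electronic device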
US8667386B2 (en) Network client optimization
JP2016081376A (ja) URL classification server, URL classification method, and program

Legal Events

Date Code Title Description
P22 Classification modified

Free format text: ST27 STATUS EVENT CODE: A-1-1-P10-P22-P110 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CLASSIFICATION MODIFIED

Effective date: 20240912

MFA Maintenance fee for application paid

Free format text: FEE DESCRIPTION TEXT: MF (APPLICATION, 4TH ANNIV.) - STANDARD

Year of fee payment: 4

U00 Fee paid

Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED

Effective date: 20241009

U11 Full renewal or maintenance fee paid

Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT DETERMINED COMPLIANT

Effective date: 20241009

Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT PAID IN FULL

Effective date: 20241009

D11 Substantive examination requested

Free format text: ST27 STATUS EVENT CODE: A-1-1-D10-D11-D117 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: REQUEST FOR EXAMINATION RECEIVED

Effective date: 20241010

W00 Other event occurred

Free format text: ST27 STATUS EVENT CODE: A-1-1-W10-W00-W111 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CORRESPONDENT DETERMINED COMPLIANT

Effective date: 20241125

D00 Search and/or examination requested or commenced

Free format text: ST27 STATUS EVENT CODE: A-1-1-D10-D00-D118 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: REQUEST FOR EXAMINATION REQUIREMENTS DETERMINED COMPLIANT

Effective date: 20241212

D11 Substantive examination requested

Free format text: ST27 STATUS EVENT CODE: A-1-2-D10-D11-D155 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: ALL REQUIREMENTS FOR EXAMINATION DETERMINED COMPLIANT

Effective date: 20250314

W00 Other event occurred

Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: LETTER SENT

Effective date: 20250319

MFA Maintenance fee for application paid

Free format text: FEE DESCRIPTION TEXT: MF (APPLICATION, 5TH ANNIV.) - STANDARD

Year of fee payment: 5

U00 Fee paid

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED

Effective date: 20251007

U11 Full renewal or maintenance fee paid

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT PAID IN FULL

Effective date: 20251007

D15 Examination report completed

Free format text: ST27 STATUS EVENT CODE: A-2-2-D10-D15-D126 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: EXAMINER'S REPORT

Effective date: 20251212

P11 Amendment of application requested

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P11-P100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: AMENDMENT RECEIVED - RESPONSE TO EXAMINER'S REQUISITION

Effective date: 20260410

P11 Amendment of application requested

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P11-P102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: AMENDMENT DETERMINED COMPLIANT

Effective date: 20260422

P13 Application amended

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P13-X000 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: APPLICATION AMENDED

Effective date: 20260422

W00 Other event occurred

Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W111 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CORRESPONDENT DETERMINED COMPLIANT

Effective date: 20260422