CA3157786A1 - Customizable delimited text compression framework - Google Patents
Customizable delimited text compression framework
Info
- Publication number
- CA3157786A1
- Authority
- CA
- Canada
- Prior art keywords
- compression
- data
- schema
- file
- delimited text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/173—Customisation support for file systems, e.g. localisation, multi-language support, personalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/123—Storage facilities
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/131—Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/183—Tabulation, i.e. one-dimensional [1D] positioning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/50—Compression of genetic data
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6064—Selection of Compressor
- H03M7/607—Selection between different types of compressors
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/70—Type of the data to be coded, other than image and sound
- H03M7/707—Structured documents, e.g. XML
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioethics (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Genetics & Genomics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Document Processing Apparatus (AREA)
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201962923113P | 2019-10-18 | 2019-10-18 | |
| US62/923,113 | 2019-10-18 | | |
| US202062956941P | 2020-01-03 | 2020-01-03 | |
| US62/956,941 | 2020-01-03 | | |
| PCT/EP2020/078996 WO2021074272A1 (en) | 2019-10-18 | 2020-10-15 | Customizable delimited text compression framework |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CA3157786A1 (en) | 2021-04-22 |
Family
ID=72964653
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CA3157786A Pending CA3157786A1 (en) | Customizable delimited text compression framework | 2019-10-18 | 2020-10-15 |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US20240095218A1 |
| EP (1) | EP4046052A1 |
| JP (1) | JP7848681B2 |
| CN (1) | CN114556318A |
| BR (1) | BR112022007396A2 |
| CA (1) | CA3157786A1 |
| WO (1) | WO2021074272A1 |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR102948214B1 (ko) * | 2021-07-16 | 2026-04-03 | 주식회사 쏠리드 | Fronthaul multiplexing device |
| US12387053B2 (en) | 2022-01-27 | 2025-08-12 | International Business Machines Corporation | Large-scale text data encoding and compression |
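| CN117827775A (zh) * | 2022-09-29 | 2024-04-05 | 华为技术有限公司 | Data compression method, apparatus, computing device, and storage system |
| CN116521063B (zh) * | 2023-03-31 | 2024-03-26 | 北京瑞风协同科技股份有限公司 | Method and apparatus for efficient reading and writing of HDF5 test data |
| CN119166428B (zh) * | 2024-11-21 | 2025-10-17 | 北京高阳捷迅信息技术有限公司 | Big-data-based relational database backup and recovery method and system |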
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2283591C (en) * | 1997-03-07 | 2006-01-31 | Intelligent Compression Technologies | Data coding network |
| JP2000252832A (ja) * | 1999-02-25 | 2000-09-14 | Nikon Corp | Data compression apparatus and recording medium storing a data compression program |
| JP2005018672A (ja) * | 2003-06-30 | 2005-01-20 | Hitachi Ltd | Method for compressing structured documents |
| GB2412978A (en) * | 2004-04-07 | 2005-10-12 | Hewlett Packard Development Co | Method and system for compressing and decompressing hierarchical data structures |
| US9667269B2 (en) * | 2009-04-30 | 2017-05-30 | Oracle International Corporation | Technique for compressing XML indexes |
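| JP5280425B2 (ja) * | 2010-11-12 | 2013-09-04 | シャープ株式会社 | Image processing apparatus, image reading apparatus, image forming apparatus, image processing method, program, and recording medium therefor |
| KR101922129B1 (ko) * | 2011-12-05 | 2018-11-26 | 삼성전자주식회사 | Method and apparatus for compressing and decompressing genetic information obtained using next-generation sequencing |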
| CA2958478C (en) | 2014-09-03 | 2019-04-16 | Patrick Soon-Shiong | Synthetic genomic variant-based secure transaction devices, systems and methods |
| JP6949970B2 (ja) | 2016-10-11 | 2021-10-13 | ゲノムシス エスアー | Method and system for transmitting bioinformatics data |
| EA201990933A1 (ru) | 2016-10-11 | 2019-11-29 | | Efficient data structures for representing bioinformatics information |
-
2020
- 2020-10-15 JP JP2022522976A patent/JP7848681B2/ja active Active
- 2020-10-15 CA CA3157786A patent/CA3157786A1/en active Pending
- 2020-10-15 BR BR112022007396A patent/BR112022007396A2/pt unknown
- 2020-10-15 CN CN202080073005.0A patent/CN114556318A/zh active Pending
- 2020-10-15 US US17/768,878 patent/US20240095218A1/en active Pending
- 2020-10-15 EP EP20793605.5A patent/EP4046052A1/en active Pending
- 2020-10-15 WO PCT/EP2020/078996 patent/WO2021074272A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| JP2023501093A (ja) | 2023-01-18 |
| BR112022007396A2 (pt) | 2022-07-05 |
| EP4046052A1 (en) | 2022-08-24 |
| JP7848681B2 (ja) | 2026-04-21 |
| CN114556318A (zh) | 2022-05-27 |
| US20240095218A1 (en) | 2024-03-21 |
| WO2021074272A1 (en) | 2021-04-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20240095218A1 (en) | | Customizable deliminated text compression framework |
| US10778441B2 (en) | | Redactable document signatures |
| US11916576B2 (en) | | System and method for effective compression, representation and decompression of diverse tabulated data |
| US11120018B2 (en) | | Spark query method and system supporting trusted computing |
| US10970281B2 (en) | | Searching for data using superset tree data structures |
| WO2018200294A1 (en) | | Parser for schema-free data exchange format |
| JP6902104B2 (ja) | | Efficient data structures for bioinformatics information representation |
| CN115080651A (zh) | | Excel data import method, apparatus, electronic device, and storage medium |
| CN111095421A (zh) | | Context-aware delta algorithm for gene files |
| RU2633178C2 (ru) | | Method and database system for indexing references to database documents |
| Holley et al. | | Bloom filter trie–a data structure for pan-genome storage |
| Brown et al. | | Improved pangenomic classification accuracy with chain statistics |
| Aronson et al. | | Towards an engineering approach to file carver construction |
| CN118523780B (zh) | | Method for decompressing and compressing SAS datasets, and application thereof |
| CN118193567B (zh) | | Method, apparatus, device, and medium for generating query statements and querying business data |
| CN118692573A (zh) | | Genotype data compression and retrieval method, apparatus, device, and computer-readable storage medium |
| US12445148B2 (en) | | System and method for effective compression representation and decompression of diverse tabulated data |
| CN118260772A (zh) | | Vulnerability detection method, apparatus, and electronic device |
| CN121092504A (zh) | | Data processing method and apparatus, log processing method and apparatus, device, medium, and program product |
| Tollefson | | Importing and Creating Data |
| CN120066631A (zh) | | Text loading method, text file generation method, and related device |
| CN119690990A (zh) | | Database query statement checking method, apparatus, computer device, storage medium, and computer program product |
| CN121996323A (zh) | | Mini-program file compression method, mini-program file reading method, and electronic device |
| US8667386B2 (en) | | Network client optimization |
| JP2016081376A (ja) | | URL classification server, URL classification method, and program |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | P22 | Classification modified | Free format text: ST27 STATUS EVENT CODE: A-1-1-P10-P22-P110 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CLASSIFICATION MODIFIED; Effective date: 20240912 |
| | MFA | Maintenance fee for application paid | Free format text: FEE DESCRIPTION TEXT: MF (APPLICATION, 4TH ANNIV.) - STANDARD; Year of fee payment: 4 |
| | U00 | Fee paid | Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED; Effective date: 20241009 |
| | U11 | Full renewal or maintenance fee paid | Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT DETERMINED COMPLIANT; Effective date: 20241009. Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT PAID IN FULL; Effective date: 20241009 |
| | D11 | Substantive examination requested | Free format text: ST27 STATUS EVENT CODE: A-1-1-D10-D11-D117 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: REQUEST FOR EXAMINATION RECEIVED; Effective date: 20241010 |
| | W00 | Other event occurred | Free format text: ST27 STATUS EVENT CODE: A-1-1-W10-W00-W111 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CORRESPONDENT DETERMINED COMPLIANT; Effective date: 20241125 |
| | D00 | Search and/or examination requested or commenced | Free format text: ST27 STATUS EVENT CODE: A-1-1-D10-D00-D118 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: REQUEST FOR EXAMINATION REQUIREMENTS DETERMINED COMPLIANT; Effective date: 20241212 |
| | D11 | Substantive examination requested | Free format text: ST27 STATUS EVENT CODE: A-1-2-D10-D11-D155 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: ALL REQUIREMENTS FOR EXAMINATION DETERMINED COMPLIANT; Effective date: 20250314 |
| | W00 | Other event occurred | Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: LETTER SENT; Effective date: 20250319 |
| | MFA | Maintenance fee for application paid | Free format text: FEE DESCRIPTION TEXT: MF (APPLICATION, 5TH ANNIV.) - STANDARD; Year of fee payment: 5 |
| | U00 | Fee paid | Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED; Effective date: 20251007 |
| | U11 | Full renewal or maintenance fee paid | Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT PAID IN FULL; Effective date: 20251007 |
| | D15 | Examination report completed | Free format text: ST27 STATUS EVENT CODE: A-2-2-D10-D15-D126 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: EXAMINER'S REPORT; Effective date: 20251212 |
| | P11 | Amendment of application requested | Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P11-P100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: AMENDMENT RECEIVED - RESPONSE TO EXAMINER'S REQUISITION; Effective date: 20260410 |
| | P11 | Amendment of application requested | Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P11-P102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: AMENDMENT DETERMINED COMPLIANT; Effective date: 20260422 |
| | P13 | Application amended | Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P13-X000 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: APPLICATION AMENDED; Effective date: 20260422 |
| | W00 | Other event occurred | Free format text: ST27 STATUS EVENT CODE: A-2-2-W10-W00-W111 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CORRESPONDENT DETERMINED COMPLIANT; Effective date: 20260422 |