BR112022025042A2 - Quality score compression - Google Patents
Quality score compression
- Publication number
- BR112022025042A2
- Authority
- BR
- Brazil
- Prior art keywords
- nucleic acid
- data
- base
- quality scores
- quality score
- Prior art date
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/50—Compression of genetic data
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3068—Precoding preceding compression, e.g. Burrows-Wheeler transformation
- H03M7/3071—Prediction
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6011—Encoder aspects
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6017—Methods or arrangements to increase the throughput
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6017—Methods or arrangements to increase the throughput
- H03M7/6029—Pipelining
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/70—Type of the data to be coded, other than image and sound
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/70—Type of the data to be coded, other than image and sound
- H03M7/702—Software
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/70—Type of the data to be coded, other than image and sound
- H03M7/705—Unicode
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/70—Type of the data to be coded, other than image and sound
- H03M7/707—Structured documents, e.g. XML
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/4006—Conversion to or from arithmetic code
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biotechnology (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Applications Or Details Of Rotary Compressors (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202063110308P | 2020-11-05 | 2020-11-05 | |
| PCT/US2021/058364 WO2022099097A1 (en) | 2020-11-05 | 2021-11-05 | Quality score compression |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| BR112022025042A2 (pt) | 2023-05-09 |
Family
ID=78725748
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| BR112022025042A BR112022025042A2 (pt) | 2020-11-05 | 2021-11-05 | Quality score compression |
Country Status (12)
| Country | Link |
|---|---|
| US (4) | US11527307B2 |
| EP (1) | EP4241276A1 |
| JP (1) | JP7810664B2 |
| KR (1) | KR20230101760A |
| CN (1) | CN115668384A |
| AU (1) | AU2021376411A1 |
| BR (1) | BR112022025042A2 |
| CA (1) | CA3174208A1 |
| IL (2) | IL298981B2 |
| MX (1) | MX2022016020A |
| WO (1) | WO2022099097A1 |
| ZA (2) | ZA202304367B |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022099097A1 (en) * | 2020-11-05 | 2022-05-12 | Illumina, Inc. | Quality score compression |
| JP2022086403A (ja) * | 2020-11-30 | 2022-06-09 | キオクシア株式会社 | Memory system and information processing system |
| EP4490735A1 (en) | 2022-03-08 | 2025-01-15 | Illumina Inc | Multi-pass software-accelerated genomic read mapping engine |
| US11775172B1 (en) * | 2022-05-05 | 2023-10-03 | CELLGENTEK Corp. | Genome data compression and transmission method for FASTQ-formatted genome data |
| CN115662525B (zh) * | 2022-10-25 | 2026-04-21 | 湖南大学 | Sparsification processing method for quality score sequences of sequencing FASTQ files |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100205204A1 (en) | 2007-03-02 | 2010-08-12 | Research Organization Of Information And Systems | Homology retrieval system, homology retrieval apparatus, and homology retrieval method |
| US10090857B2 (en) * | 2010-04-26 | 2018-10-02 | Samsung Electronics Co., Ltd. | Method and apparatus for compressing genetic data |
| US20110288785A1 (en) * | 2010-05-18 | 2011-11-24 | Translational Genomics Research Institute (Tgen) | Compression of genomic base and annotation data |
| AU2012272161B2 (en) * | 2011-06-21 | 2015-12-24 | Illumina Cambridge Limited | Methods and systems for data analysis |
| US10777301B2 (en) * | 2012-07-13 | 2020-09-15 | Pacific Biosciences For California, Inc. | Hierarchical genome assembly method using single long insert library |
| US10847251B2 (en) * | 2013-01-17 | 2020-11-24 | Illumina, Inc. | Genomic infrastructure for on-site or cloud-based DNA and RNA processing and analysis |
| WO2014197377A2 (en) * | 2013-06-03 | 2014-12-11 | Good Start Genetics, Inc. | Methods and systems for storing sequence read data |
| WO2016081712A1 (en) * | 2014-11-19 | 2016-05-26 | Bigdatabio, Llc | Systems and methods for genomic manipulations and analysis |
| CN107851118A (zh) * | 2015-05-21 | 2018-03-27 | 基因福米卡数据系统有限公司 | Storage, transmission, and compression of next-generation sequencing data |
| JP6949970B2 (ja) | 2016-10-11 | 2021-10-13 | ゲノムシス エスアー | Method and system for transmitting bioinformatics data |
| CN110021349B (zh) * | 2017-07-31 | 2021-02-02 | 北京哲源科技有限责任公司 | Encoding method for genetic data |
| CN110111852A (zh) * | 2018-01-11 | 2019-08-09 | 广州明领基因科技有限公司 | Platform for fast lossless compression of massive DNA sequencing data |
| CN110797082A (zh) * | 2019-10-24 | 2020-02-14 | 福建和瑞基因科技有限公司 | Method and system for storing and reading gene sequencing data |
| WO2022099097A1 (en) | 2020-11-05 | 2022-05-12 | Illumina, Inc. | Quality score compression |
2021
- 2021-11-05 WO PCT/US2021/058364, WO2022099097A1 (en), Ceased
- 2021-11-05 CN CN202180039438.9A, CN115668384A (zh), Pending
- 2021-11-05 KR KR1020227044606A, KR20230101760A (ko), Pending
- 2021-11-05 EP EP21811704.2A, EP4241276A1 (en), Pending
- 2021-11-05 JP JP2022575435A, JP7810664B2 (ja), Active
- 2021-11-05 IL IL298981A, IL298981B2 (en), Unknown
- 2021-11-05 BR BR112022025042A, BR112022025042A2 (pt), Unknown
- 2021-11-05 CA CA3174208A, CA3174208A1 (en), Pending
- 2021-11-05 IL IL316156A, IL316156B1 (en), Unknown
- 2021-11-05 US US17/520,615, US11527307B2 (en), Active
- 2021-11-05 AU AU2021376411A, AU2021376411A1 (en), Pending
- 2021-11-05 MX MX2022016020A, MX2022016020A (es), Unknown
2022
- 2022-10-27 US US17/974,978, US11776663B2 (en), Active
2023
- 2023-04-13 ZA ZA2023/04367A, ZA202304367B (en), Unknown
- 2023-08-23 US US18/237,187, US12080385B2 (en), Active
2024
- 2024-04-17 ZA ZA2024/02955A, ZA202402955B (en), Unknown
- 2024-08-28 US US18/817,560, US20240420804A1 (en), Pending
Also Published As
| Publication number | Publication date |
|---|---|
| JP7810664B2 (ja) | 2026-02-03 |
| CN115668384A (zh) | 2023-01-31 |
| IL298981B2 (en) | 2025-03-01 |
| ZA202304367B (en) | 2023-12-20 |
| KR20230101760A (ko) | 2023-07-06 |
| IL316156B1 (en) | 2026-04-01 |
| IL298981A (en) | 2023-02-01 |
| WO2022099097A1 (en) | 2022-05-12 |
| ZA202402955B (en) | 2025-04-30 |
| IL316156A (en) | 2024-12-01 |
| US20220139502A1 (en) | 2022-05-05 |
| US20240420804A1 (en) | 2024-12-19 |
| JP2023547973A (ja) | 2023-11-15 |
| MX2022016020A (es) | 2023-02-02 |
| US20230040143A1 (en) | 2023-02-09 |
| US12080385B2 (en) | 2024-09-03 |
| AU2021376411A1 (en) | 2022-10-27 |
| US20240062853A1 (en) | 2024-02-22 |
| CA3174208A1 (en) | 2022-05-12 |
| US11527307B2 (en) | 2022-12-13 |
| IL298981B1 (en) | 2024-11-01 |
| EP4241276A1 (en) | 2023-09-13 |
| US11776663B2 (en) | 2023-10-03 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| BR112022025042A2 (pt) | Quality score compression | |
| CN108766437B (zh) | Speech recognition method and apparatus, computer device, and storage medium | |
| CN107609350B (zh) | Data processing method for a second-generation sequencing data analysis platform | |
| Kruskal | An overview of sequence comparison: Time warps, string edits, and macromolecules | |
| Karplus et al. | Evaluation of protein multiple alignments by SAM-T99 using the BAliBASE multiple alignment test set | |
| CN111881210B (zh) | Data synchronization method and apparatus, intranet server, and medium | |
| Dunn et al. | Agalma: an automated phylogenomics workflow | |
| Liu et al. | Voxtral | |
| US7929764B2 (en) | Identifying character information in media content | |
| CN106100641A (zh) | Multithreaded fast-storage lossless compression method and system for FASTQ data | |
| US20210286715A1 (en) | Test device, test method, and computer readable medium | |
| ATE210851T1 (de) | Method and apparatus for improving system performance in a data processing system | |
| BR112022015328A2 (pt) | Method and system for information compression | |
| CN105227949A (zh) | Automated testing method for Android set-top boxes | |
| CN113778902A (zh) | Method and apparatus for detecting test case coverage | |
| BR112021018792A2 (pt) | Method and apparatus for displaying video with lyrics, electronic device, and computer-readable medium | |
| Maronikolakis et al. | Wine is not vi n. on the compatibility of tokenizations across languages | |
| Nguyen | T-test distance and clustering criterion for speaker diarization | |
| CN108563688B (zh) | Method for recognizing character emotions in film and television scripts | |
| CN109784207B (zh) | Face recognition method, apparatus, and medium | |
| CN116028936B (zh) | Neural-network-based malicious code detection method, medium, and device | |
| US20220261430A1 (en) | Storage medium, information processing method, and information processing apparatus | |
| BRPI0511918A (pt) | Data processing apparatus and program for enabling a computer to execute data processing to process encoded data, program recording medium, data recording medium, and data structure of encoded data | |
| CN114974602A (zh) | Contrastive-learning-based diagnostic coding method and system | |
| JP7574857B2 (ja) | Information processing program, information processing method, and information processing apparatus | |