CN111279422A - Encoding/decoding method, encoder/decoder, and storage method and device - Google Patents

Encoding/decoding method, encoder/decoder, and storage method and device

Info

Publication number
CN111279422A
Authority
CN
China
Prior art keywords
sequence
bit
binary code
symbol group
code sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201880068914.8A
Other languages
English (en)
Other versions
CN111279422B (zh)
Inventor
黄小罗
陈世宏
林涛
陈泰
沈玥
徐讯
尹烨
杨焕明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-10-25
Filing date
2018-09-03
Publication date
2020-06-12
Application filed by BGI Shenzhen Co Ltd
Publication of CN111279422A
Application granted
Publication of CN111279422B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 Sequence assembly
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/02 Conversion to or from weighted codes, i.e. the weight given to a digit depending on the position of the digit within the block or code word
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

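The present disclosure relates to an encoding/decoding method, an encoder/decoder, and a storage method and device, in the technical field of data storage. The encoding method includes: determining the first position of an encoded sequence according to the first bit of a first binary code sequence, the first bit of a second binary code sequence, and a reference symbol, where the reference symbol is any one of four different symbols; and determining the current position of the encoded sequence according to the current bit of the first binary code sequence, the current bit of the second binary code sequence, and the previous position of the encoded sequence, where the current position is any position of the encoded sequence other than the first. The present disclosure can increase storage density and avoid high-GC and high-AT repeats in the encoded sequence.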
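The dependency structure described in the abstract can be sketched in Python as follows. This page does not reproduce the patent's actual mapping table, so the rule used here (shift the previous symbol's index by `2*a + b` modulo 4) is purely a hypothetical stand-in, as are the names `ALPHABET`, `encode`, and `decode` and the choice of "A" as the default reference symbol. The sketch only shows that each output symbol depends on one bit from each binary sequence plus the previous symbol, and that such a scheme can be inverted; it makes no attempt to reproduce the GC/AT-repeat behavior claimed by the patent.

```python
# Hypothetical illustration only: the real symbol table of CN111279422A
# is not given on this page. The alphabet order is also an assumption.
ALPHABET = "ACGT"


def encode(bits1, bits2, reference="A"):
    """Encode two equal-length bit sequences into one symbol sequence.

    Each output symbol is chosen from the pair (bits1[i], bits2[i]) and the
    previous output symbol; the first symbol uses `reference` instead.
    """
    if len(bits1) != len(bits2):
        raise ValueError("both binary sequences must have the same length")
    out = []
    prev = reference
    for a, b in zip(bits1, bits2):
        offset = 2 * a + b  # assumed rule, not taken from the patent
        cur = ALPHABET[(ALPHABET.index(prev) + offset) % 4]
        out.append(cur)
        prev = cur
    return "".join(out)


def decode(encoded, reference="A"):
    """Invert encode(): recover both bit sequences from the symbols."""
    bits1, bits2 = [], []
    prev = reference
    for cur in encoded:
        offset = (ALPHABET.index(cur) - ALPHABET.index(prev)) % 4
        bits1.append(offset >> 1)  # high bit came from the first sequence
        bits2.append(offset & 1)   # low bit came from the second sequence
        prev = cur
    return bits1, bits2


if __name__ == "__main__":
    seq = encode([1, 0, 1, 1], [0, 1, 1, 0])
    print(seq)          # GTGA
    print(decode(seq))  # ([1, 0, 1, 1], [0, 1, 1, 0])
```

Because every symbol carries one bit from each input sequence, a scheme of this shape stores two bits per symbol, which is consistent with the storage-density goal stated in the abstract.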

Description

PCT application in the national phase; the description has been published.

Claims (23)

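  1. PCT application in the national phase; the claims have been published.
CN201880068914.8A 2017-10-25 2018-09-03 Encoding/decoding method, encoder/decoder, and storage method and device Active CN111279422B (zh)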

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN2017110099002 2017-10-25
CN201711009900 2017-10-25
PCT/CN2018/103795 WO2019080653A1 (zh) 2017-10-25 2018-09-03 Encoding/decoding method, encoder/decoder, and storage method and device

Publications (2)

Publication Number Publication Date
CN111279422A (zh) 2020-06-12
CN111279422B (zh) 2023-12-22

Family

ID=66247716

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880068914.8A Active CN111279422B (zh) 2017-10-25 2018-09-03 Encoding/decoding method, encoder/decoder, and storage method and device

Country Status (3)

Country Link
US (1) US20200321079A1 (zh)
CN (1) CN111279422B (zh)
WO (1) WO2019080653A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113539371A (zh) * 2021-07-05 2021-10-22 南方科技大学 Sequence encoding method and device, and readable storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114758703B (zh) * 2022-06-14 2022-09-13 深圳先进技术研究院 Data information storage method based on recombinant plasmid DNA molecules

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
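CN1463497A (zh) * 2001-06-06 2003-12-24 精工爱普生株式会社 Decoder, decoding method, lookup table, and decoding program
US20060068405A1 (en) * 2004-01-27 2006-03-30 Alex Diber Methods and systems for annotating biomolecular sequences
CN101565746A (zh) * 2009-06-03 2009-10-28 东南大学 Signal-combination-encoded DNA ligation sequencing method with parity check
CN104850760A (zh) * 2015-03-27 2015-08-19 苏州泓迅生物科技有限公司 Synthetic DNA storage medium carrying encoded information, and information storage and reading method and application thereof
CN105022935A (zh) * 2014-04-22 2015-11-04 中国科学院青岛生物能源与过程研究所 Encoding method and decoding method for information storage using DNA
CN105550570A (zh) * 2015-12-02 2016-05-04 深圳市同创国芯电子有限公司 Encryption and decryption method and device applied to programmable devices
CN106022006A (zh) * 2016-06-02 2016-10-12 广州麦仑信息科技有限公司 Storage method for representing genetic information in binary
CN106845158A (zh) * 2017-02-17 2017-06-13 苏州泓迅生物科技股份有限公司 Method for storing information using DNA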

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2875458A2 (en) * 2012-07-19 2015-05-27 President and Fellows of Harvard College Methods of storing information using nucleic acids

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113539371A (zh) * 2021-07-05 2021-10-22 南方科技大学 Sequence encoding method and device, and readable storage medium
CN113539371B (zh) * 2021-07-05 2023-06-23 南方科技大学 Sequence encoding method and device, and readable storage medium

Also Published As

Publication number Publication date
CN111279422B (zh) 2023-12-22
US20200321079A1 (en) 2020-10-08
WO2019080653A1 (zh) 2019-05-02

Similar Documents

Publication Publication Date Title
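JP7079786B2 (ja) Method, computer-readable medium, and apparatus for accessing bioinformatics data structured in access units
JP7090148B2 (ja) DNA-based data storage and data retrieval
US8972201B2 (en) Compression of genomic data file
CN107609356B (zh) Coverless text information hiding method based on a tag model
CN105976303B (zh) Reversible information hiding and extraction method based on vector quantization
CN112288090B (zh) Method and device for processing DNA sequences storing data information
CN112527736B (zh) DNA-based data storage method, data recovery method, and terminal device
KR101969848B1 (ko) Method and apparatus for compressing genetic data
WO2020132935A1 (zh) Method and device for site-directed editing of nucleic acid sequences storing data
CN111279422B (zh) Encoding/decoding method, encoder/decoder, and storage method and device
US20090045987A1 (en) Method and apparatus for encoding/decoding metadata
CN110088839B (zh) Efficient data structure for bioinformatics information representation
KR20150092585A (ko) Method and apparatus for compressing genomic data based on binary images
JP2014197844A (ja) Encoder for encoding text into a matrix code symbol, and decoder for decoding a matrix code symbol
Zhang et al. A high storage density strategy for digital information based on synthetic DNA
CN111095423B (zh) Encoding/decoding method and device, and data processing device
CN111061722A (zh) Data compression and data decompression method, apparatus, and device
KR20050053996A (ko) Method and apparatus for efficiently decoding Huffman codes
US20230032409A1 (en) Method for Information Encoding and Decoding, and Method for Information Storage and Interpretation
CN110708074B (zh) Method, system, and medium for compressing and decompressing the CIGAR field of SAM and BAM files
WO2004070029A1 (en) Method to encode a dna sequence and to compress a dna sequence
WO2023206023A1 (zh) Encoding method and encoding device for DNA storage
KR101177092B1 (ko) Method for encoding object data into an image file, method for decoding object data, encoding and decoding apparatus, and recording medium therefor
CN117133360A (zh) Method for storing information in DNA, and corresponding information modification and reading methods
KR20210056822A (ko) Method for compressing and transmitting genomic data in FASTQ format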

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant