CN109074424A - Method for storing text information using DNA, decoding method therefor, and application thereof - Google Patents

Method for storing text information using DNA, decoding method therefor, and application thereof

Info

Publication number
CN109074424A
CN109074424A CN201680085320.9A CN201680085320A CN109074424A CN 109074424 A CN109074424 A CN 109074424A CN 201680085320 A CN201680085320 A CN 201680085320A CN 109074424 A CN109074424 A CN 109074424A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201680085320.9A
Other languages
English (en)
Other versions
CN109074424B (zh)
Inventor
沈玥
陈泰
刘龙英
陈世宏
王云
杨焕明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Publication of CN109074424A
Application granted
Publication of CN109074424B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/123 DNA computing
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H21/00 Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
    • C07H21/04 Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids with deoxyribosyl as saccharide radical
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/123 Storage facilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/263 Language identification
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C13/00 Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00
    • G11C13/0002 Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00 using resistive RAM [RRAM] elements
    • G11C13/0009 RRAM elements whose operation depends upon chemical change
    • G11C13/0014 RRAM elements whose operation depends upon chemical change comprising cells based on organic memory material
    • G11C13/0019 RRAM elements whose operation depends upon chemical change comprising cells based on organic memory material comprising bio-molecules
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C13/00 Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00
    • G11C13/02 Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00 using elements whose operation depends upon chemical change
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/20 Heterogeneous data integration
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B99/00 Subject matter not provided for in other groups of this subclass

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Genetics & Genomics (AREA)
  • Biotechnology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Biochemistry (AREA)
  • Organic Chemistry (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

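A method for encoding and storing text information using DNA as the storage medium, together with its decoding method and applications. The method for storing text information in DNA comprises: encoding the text as computer binary digits, then transcoding the binary digits into a DNA sequence; artificially synthesizing the DNA sequence that encodes the text information; locating the text by means of designed linkers; and assembling the DNA sequences that encode the text information in a preset order. Storing text information in DNA offers a small storage volume, large storage capacity, high stability, and low maintenance cost.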
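The pipeline in the abstract (text to binary, binary to bases, positional tagging, ordered assembly) can be sketched roughly as follows. This is only an illustration under stated assumptions: the UTF-8 encoding, the 2-bit mapping (00→A, 01→C, 10→G, 11→T), the fragment length, and the base-4 index tag standing in for the designed linkers are all hypothetical choices, not the scheme defined in the patent.

```python
# Illustrative sketch only. The bit-to-base mapping, the UTF-8 choice and the
# index tags are assumptions, not the encoding rules disclosed in the patent.
BASES = "ACGT"
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def text_to_dna(text: str) -> str:
    """Encode text as UTF-8 bytes, then map each pair of bits to one base."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna: str) -> str:
    """Reverse the mapping: bases to bits, bits to bytes, bytes to text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def index_to_bases(n: int, width: int) -> str:
    """Write a fragment index as a fixed-width base-4 number using A/C/G/T."""
    digits = []
    for _ in range(width):
        n, remainder = divmod(n, 4)
        digits.append(BASES[remainder])
    return "".join(reversed(digits))


def bases_to_index(tag: str) -> int:
    """Read a base-4 index tag back into an integer."""
    n = 0
    for base in tag:
        n = n * 4 + BASES.index(base)
    return n


def split_with_index(dna: str, payload_len: int = 16, index_len: int = 4) -> list:
    """Cut the sequence into fragments, each prefixed with a position tag.

    The tag is a hypothetical stand-in for the positional linkers.
    """
    fragments = []
    for n, start in enumerate(range(0, len(dna), payload_len)):
        if n >= 4 ** index_len:
            raise ValueError("too many fragments for the chosen index length")
        fragments.append(index_to_bases(n, index_len) + dna[start:start + payload_len])
    return fragments


def assemble(fragments: list, index_len: int = 4) -> str:
    """Sort fragments by their tag and join the payloads in the preset order."""
    ordered = sorted(fragments, key=lambda f: bases_to_index(f[:index_len]))
    return "".join(f[index_len:] for f in ordered)


if __name__ == "__main__":
    sequence = text_to_dna("DNA 存储")
    pieces = split_with_index(sequence)
    # Fragments may arrive in any order; the tags restore the original order.
    print(dna_to_text(assemble(list(reversed(pieces)))))
```

A real design would also have to respect synthesis and sequencing constraints (for example GC content and homopolymer runs) and add error correction, none of which this sketch attempts.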

Description

PCT national-phase application; the description has been published.

Claims (9)

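  1. PCT national-phase application; the claims have been published.
CN201680085320.9A 2016-05-04 2016-05-04 Method for storing text information using DNA, decoding method therefor, and application thereof Active CN109074424B (zh)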

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/081037 WO2017190297A1 (zh) 2016-05-04 2016-05-04 Method for storing text information using DNA, decoding method therefor, and application thereof

Publications (2)

Publication Number Publication Date
CN109074424A (zh) 2018-12-21
CN109074424B (zh) 2022-03-11

Family

ID=60202585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201680085320.9A Active CN109074424B (zh) 2016-05-04 2016-05-04 Method for storing text information using DNA, decoding method therefor, and application thereof

Country Status (4)

Country Link
US (1) US10839295B2 (zh)
EP (1) EP3470997A4 (zh)
CN (1) CN109074424B (zh)
WO (1) WO2017190297A1 (zh)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10650312B2 (en) 2016-11-16 2020-05-12 Catalog Technologies, Inc. Nucleic acid-based data storage
KR102521152B1 (ko) 2016-11-16 2023-04-13 카탈로그 테크놀로지스, 인크. System for nucleic acid-based data storage
WO2019178551A1 (en) 2018-03-16 2019-09-19 Catalog Technologies, Inc. Chemical methods for nucleic acid-based data storage
US11339423B2 (en) * 2018-03-18 2022-05-24 Bryan Bishop Systems and methods for data storage in nucleic acids
EP3794598A1 (en) 2018-05-16 2021-03-24 Catalog Technologies, Inc. Compositions and methods for nucleic acid-based data storage
GB2576304B (en) 2018-07-26 2020-09-09 Evonetix Ltd Accessing data storage provided using double-stranded nucleic acid molecules
CA3108869A1 (en) * 2018-08-10 2020-02-13 Nucleotrace Pty. Ltd. Systems and methods for identifying a product's identity
US11017170B2 (en) * 2018-09-27 2021-05-25 At&T Intellectual Property I, L.P. Encoding and storing text using DNA sequences
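CN109460822B (zh) * 2018-11-19 2021-11-12 天津大学 DNA-based information storage method
CN109943560A (zh) * 2018-11-22 2019-06-28 西藏自治区人民政府驻成都办事处医院 Method for storing Chinese character information based on a DNA carrier
KR20220017409A (ko) 2019-05-09 2022-02-11 카탈로그 테크놀로지스, 인크. Data structures and operations for searching, computing, and indexing in DNA-based data storage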
GB201907460D0 (en) 2019-05-27 2019-07-10 Vib Vzw A method of storing information in pools of nucleic acid molecules
US10956806B2 (en) 2019-06-10 2021-03-23 International Business Machines Corporation Efficient assembly of oligonucleotides for nucleic acid based data storage
WO2021072398A1 (en) 2019-10-11 2021-04-15 Catalog Technologies, Inc. Nucleic acid security and authentication
WO2021231493A1 (en) 2020-05-11 2021-11-18 Catalog Technologies, Inc. Programs and functions in dna-based data storage
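WO2021243605A1 (zh) * 2020-06-03 2021-12-09 深圳华大生命科学研究院 Method and device for generating DNA storage encoding and decoding rules, and DNA storage encoding and decoding method and device
CN112002376B (zh) * 2020-08-13 2024-03-19 中国海洋大学 Method for recording and reading information with DNA molecules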
US20220243252A1 (en) * 2021-02-03 2022-08-04 Seagate Technology Llc Isotope modified nucleotides for dna data storage
CN117751410A (zh) * 2021-12-17 2024-03-22 深圳华大生命科学研究院 Method and system for storing information using DNA
CN114758703B (zh) * 2022-06-14 2022-09-13 深圳先进技术研究院 Data information storage method based on recombinant plasmid DNA molecules

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050053968A1 (en) * 2003-03-31 2005-03-10 Council Of Scientific And Industrial Research Method for storing information in DNA
CN1791875A (zh) * 2003-05-29 2006-06-21 独立行政法人产业技术综合研究所 Method for designing DNA codes used as information carriers
US20080057546A1 (en) * 1999-03-18 2008-03-06 Complete Genomics As Methods of Cloning and Producing Fragment Chains with Readable Information Content
US20140315310A1 (en) * 2012-12-13 2014-10-23 Massachusetts Institute Of Technology Recombinase-based logic and memory systems
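CN104520864A (zh) * 2012-06-01 2015-04-15 欧洲分子生物学实验室 High-capacity storage of digital information in DNA
CN104850760A (zh) * 2015-03-27 2015-08-19 苏州泓迅生物科技有限公司 Artificially synthesized DNA storage medium carrying encoded information, and information storage and reading method and application
CN105022935A (zh) * 2014-04-22 2015-11-04 中国科学院青岛生物能源与过程研究所 Encoding method and decoding method for storing information using DNA
CN105119717A (zh) * 2015-07-21 2015-12-02 郑州轻工业学院 Encryption system and encryption method based on DNA coding
KR20160001455A (ko) * 2014-06-27 2016-01-06 한국생명공학연구원 DNA memory technology for data storage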

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004001060A2 (en) * 2002-06-20 2003-12-31 Bristol-Myers Squibb Company Identification and regulation of a g-protein coupled receptor, rai-3
US20040001371A1 (en) * 2002-06-26 2004-01-01 The Arizona Board Of Regents On Behalf Of The University Of Arizona Information storage and retrieval device using macromolecules as storage media
US20040153255A1 (en) 2003-02-03 2004-08-05 Ahn Tae-Jin Apparatus and method for encoding DNA sequence, and computer readable medium
US20170249345A1 (en) * 2014-10-18 2017-08-31 Girik Malik A biomolecule based data storage system
WO2016069913A1 (en) * 2014-10-29 2016-05-06 Massachusetts Institute Of Technology Dna cloaking technologies
US20170141793A1 (en) * 2015-11-13 2017-05-18 Microsoft Technology Licensing, Llc Error correction for nucleotide data stores

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GEORGE M. CHURCH ET AL: "Supplementary Materials for Next-Generation Digital Information Storage in DNA", 《SCIENCE》 *
S. M. HOSSEIN TABATABAEI YAZDI ET AL: "DNA-Based Storage: Trends and Methods", 《ARXIV》 *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
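CN109830263A (zh) * 2019-01-30 2019-05-31 东南大学 DNA storage method based on oligonucleotide sequence encoding and storage
CN109887549B (zh) * 2019-02-22 2023-01-20 天津大学 Data storage and restoration method and device
CN109887549A (zh) * 2019-02-22 2019-06-14 天津大学 Data storage and restoration method and device
CN114730616A (zh) * 2019-09-24 2022-07-08 深圳华大生命科学研究院 Information encoding and decoding method and device, storage medium, and information storage and interpretation method
CN111028883A (zh) * 2019-11-20 2020-04-17 广州达美智能科技有限公司 Boolean algebra-based gene processing method and device, and readable storage medium
CN111028883B (zh) * 2019-11-20 2023-07-18 广州达美智能科技有限公司 Boolean algebra-based gene processing method and device, and readable storage medium
CN111091876A (zh) * 2019-12-16 2020-05-01 中国科学院深圳先进技术研究院 DNA storage method, system, and electronic device
CN111091876B (zh) * 2019-12-16 2024-05-17 中国科学院深圳先进技术研究院 DNA storage method, system, and electronic device
CN111243670A (zh) * 2020-01-23 2020-06-05 天津大学 DNA information storage encoding method satisfying biological constraints
CN111368132A (zh) * 2020-02-28 2020-07-03 元码基因科技(北京)股份有限公司 Method for storing audio or video files based on DNA sequences, and storage medium
CN111680797A (zh) * 2020-05-08 2020-09-18 中国科学院计算技术研究所 DNA movable-type printing machine, and DNA-based data storage device and method
CN111680797B (zh) * 2020-05-08 2023-06-06 中国科学院计算技术研究所 DNA movable-type printing machine, and DNA-based data storage device and method
CN111737955A (zh) * 2020-06-24 2020-10-02 任兆瑞 Method for storing character dot matrices using DNA character codes
CN114058471A (zh) * 2020-07-29 2022-02-18 东南大学 Data storage device loaded with DNA-stored data, preparation method, and reading method
CN112100982B (zh) * 2020-08-07 2023-06-20 广州大学 DNA storage method, system, and storage medium
CN112100982A (zh) * 2020-08-07 2020-12-18 广州大学 DNA storage method, system, and storage medium
CN112382340A (zh) * 2020-11-25 2021-02-19 中国科学院深圳先进技术研究院 Encoding and decoding method and device for converting binary information to base sequences for DNA data storage
CN112802549A (zh) * 2021-01-26 2021-05-14 武汉大学 Encoding and decoding method for DNA sequence integrity checking and error correction
CN113744804B (zh) * 2021-06-21 2023-03-10 深圳先进技术研究院 Method, device, and storage apparatus for data storage using DNA
CN113744804A (zh) * 2021-06-21 2021-12-03 深圳先进技术研究院 Method, device, and storage apparatus for data storage using DNA
CN114898806A (zh) * 2022-05-25 2022-08-12 天津大学 DNA movable-type writing system and method
CN114958828A (zh) * 2022-06-14 2022-08-30 深圳先进技术研究院 Data information storage method based on DNA molecular media
CN114958828B (zh) * 2022-06-14 2024-04-19 深圳先进技术研究院 Data information storage method based on DNA molecular media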

Also Published As

Publication number Publication date
EP3470997A4 (en) 2020-04-01
EP3470997A1 (en) 2019-04-17
US20190138909A1 (en) 2019-05-09
CN109074424B (zh) 2022-03-11
WO2017190297A1 (zh) 2017-11-09
US10839295B2 (en) 2020-11-17

Similar Documents

Publication Publication Date Title
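CN109074424A (zh) Method for storing text information using DNA, decoding method therefor, and application thereof
WO2018149405A1 (zh) Method for storing and reading information
CN104850760B (zh) Information storage and reading method for artificially synthesized DNA storage media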
US20060281097A1 (en) Method of processing and/or genome mapping of ditag sequences
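CN103093121B (zh) Compressed storage and construction method for bidirectional multi-step de Bruijn graphs
CN111600609B (zh) DNA storage encoding method optimized for Chinese text storage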
WO2015176990A1 (en) Method and apparatus for storing information units in nucleic acid molecules and nucleic acid storage system
CN110442472B (zh) Hybrid error correction and data recovery method for DNA data storage
Brecher Name= struct: A practical approach to the sorry state of real-life chemical nomenclature
Zan et al. A hierarchical error correction strategy for text DNA storage
US8005621B2 (en) Transcript mapping method
Tan et al. NUS-IDS at FinCausal 2021: Dependency tree in graph neural network for better cause-effect span detection
CN103699819A (zh) Vertex extension method for variable-length k-mer queries based on multi-step bidirectional de Bruijn graphs
Gagie et al. Compressing and indexing aligned readsets
Milenkovic et al. DNA-based data storage systems: A review of implementations and code constructions
CN107145947A (zh) Information processing method, device, and electronic device
Zhao et al. DBTRG: De Bruijn Trim rotation graph encoding for reliable DNA storage
Cao et al. Achieve Handle Level Random Access in Encrypted DNA Archival Storage System via Frequency Dictionary Mapping Coding
Bo et al. An information hiding method for text by substituted conception
Al-Khafaji et al. A new Approach to DNA, RNA, and protein motifs templates Visualization and Analysis via compilation technique
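CN103699818A (zh) Bidirectional edge extension method for variable-length k-mer queries based on multi-step bidirectional de Bruijn graphs
JP2003101485A (ja) Information communication method, information recording method, encoder, and decoder using biopolymers as a communication or recording medium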
Nunes et al. A compressed suffix tree based implementation with low peak memory usage
Jona et al. A BioSequence Ontology from Molecular Structure
EP3803882A1 (en) A method of storing information using dna molecules

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1260446; Country of ref document: HK)
GR01 Patent grant