WO2019037117A1 - Encoding and decoding method, device, and data processing device - Google Patents

Encoding and decoding method, device, and data processing device

Info

Publication number
WO2019037117A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
information
nucleic acid
sequence
binary code
Prior art date
Application number
PCT/CN2017/099152
Other languages
English (en)
Chinese (zh)
Inventor
杨焕明
刘斯奇
汪建
Original Assignee
深圳华大基因研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因研究院
Priority to PCT/CN2017/099152
Priority to CN201780094012.7A (CN111095423B)
Publication of WO2019037117A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/40 Encryption of genetic data

Definitions

  • the present invention relates to the field of data processing technologies, and in particular, to an encoding method, an encoding device, a decoding method, a decoding device, a data processing device, and a computer readable storage medium.
  • the related technology mainly uses the secret key to convert the plaintext of the information into meaningless ciphertext to achieve the encryption effect.
  • The inventors have found that the above-mentioned related art has the following problems: complicated and cumbersome calculations are performed on the information using only a predetermined mathematical method, resulting in low encryption efficiency and low security; and existing methods for storing information in DNA require DNA synthesizers and sequencers, which are expensive, and their operation is time-consuming and labor-intensive. The inventors propose a solution to at least one of the above problems.
  • An object of the present invention is to provide an encoding technology solution with high encryption efficiency and high security, and another object of the present invention is to provide an information storage solution which is simple in operation and low in price.
  • An encoding method comprising: digitizing information to generate sequence data; dividing the sequence data into N data segments, N being an integer greater than 1; for each data segment, finding a corresponding nucleic acid fragment in a gene database and using the location information of the nucleic acid fragment in the gene database as an identifier of the data segment; and generating a sequence code according to the identifier corresponding to each data segment.
  • the digitizing process is to transcode the binary code corresponding to the information to generate the sequence data.
  • sequence data is data consisting of four deoxyribonucleotides of adenine A, cytosine C, guanine G, and thymine T.
  • 0 in the binary code is converted to A or T, and 1 is converted to C or G to generate the sequence data.
  • 01 in the binary code is converted to A, 00 is converted to T, 11 is converted to C, and 10 is converted to G to generate the sequence data.
  • the sequence data is a binary code corresponding to the information.
  • nucleic acid fragments in the gene database are transcoded into binary code prior to the searching step.
  • A or T in the gene database is converted to binary code 0, and C or G is converted to binary code 1.
  • A in the gene database is converted to binary code 01, T to binary code 00, C to binary code 11, and G to binary code 10.
  • the identifier comprises position information of the first symbol and the last symbol of the nucleic acid fragment in the gene database.
  • the identifier comprises positional information of a first symbol of the nucleic acid fragment in the gene database, and a length of the nucleic acid fragment.
  • the genetic database comprises one or more animal and/or plant and/or microbial genomic data.
  • the gene database comprises wild type genomic data and/or synthetic genomic data.
  • the gene database comprises human genomic data.
  • A decoding method including: obtaining an identifier corresponding to each data segment from the encoded data, where the encoded data is a sequence code generated by the encoding method according to any of the above embodiments; obtaining location information corresponding to each data segment according to the identifier; obtaining a corresponding nucleic acid fragment from the gene database according to the location information; generating sequence data based on the nucleic acid fragment; and obtaining information based on the sequence data.
  • An encoding apparatus including: an information digitizing module, configured to digitize information to generate sequence data; a data identifier determining module, connected to the information digitizing module and configured to divide the sequence data into N data segments, N being an integer greater than 1, to search the gene database for a nucleic acid fragment corresponding to each data segment, and to use the location information of the nucleic acid fragment in the gene database as an identifier of each data segment; and a code generation module, connected to the data identifier determining module and configured to generate a sequence code according to the identifier corresponding to each data segment.
  • The data identifier determining module further divides any data segment for which no corresponding nucleic acid fragment is found in the gene database into M data segments, and searches the gene database for the nucleic acid fragment corresponding to each of the M data segments, M being an integer greater than 1.
  • the information digitizing module transcodes the binary code corresponding to the information to generate the sequence data.
  • sequence data is data consisting of four deoxyribonucleotides of adenine A, cytosine C, guanine G, and thymine T.
  • The information digitizing module converts each 0 of the binary code to A or T and each 1 to C or G to generate the sequence data.
  • the information digitizing module converts 01 in the binary code to A, 00 to T, 11 to C, and 10 to G to generate the sequence data.
  • the sequence data is a binary code corresponding to the information.
  • The apparatus further includes a genetic data transcoding module, connected to the information digitizing module and to the data identifier determining module, and configured to transcode all the nucleic acid fragments in the gene database into binary code.
  • the genetic data transcoding module converts A or T in the gene database into binary code 0, and C or G is converted to binary code 1.
  • the genetic data transcoding module converts A in the gene database into binary code 01, T is converted to binary code 00, C is converted to binary code 11, and G is converted to binary code 10.
  • the identifier comprises position information of the first symbol and the last symbol of the nucleic acid fragment in the gene database.
  • The identifier includes position information of the first symbol of the nucleic acid fragment in the gene database, and the length of the nucleic acid fragment.
  • the information is at least one of text information, picture information, audio information, or video information.
  • the genetic database comprises one or more animal and/or plant and/or microbial genomic data.
  • the gene database comprises wild type genomic data and/or synthetic genomic data.
  • the gene database comprises human genomic data.
  • A decoding apparatus including: a data identifier obtaining module, configured to acquire an identifier corresponding to each data segment from the encoded data, where the encoded data is a sequence code generated by the encoding method according to any one of the foregoing embodiments or by the encoding device according to any one of the above embodiments; and a sequence obtaining module, connected to the data identifier obtaining module and configured to acquire location information corresponding to each data segment according to the identifier.
  • A data processing apparatus comprising: a memory and a processor coupled to the memory, the processor being configured to perform, based on instructions stored in the memory, the encoding method or decoding method in any of the above embodiments.
  • a computer readable storage medium having stored thereon a computer program that, when executed by a processor, implements an encoding method or a decoding method in any of the above embodiments.
  • An advantage of the present invention is that the information is encrypted by matching the sequence data of the information to be encrypted to the nucleic acid fragments in the gene database and encoding the corresponding position information as a sequence. Utilizing the ultra-high storage density of nucleic acids and a unique intermolecular recognition mechanism, encryption can be completed without complicated and cumbersome mathematical calculation of information, thereby improving encryption efficiency and security.
  • Another advantage of the present invention is that using it for information storage eliminates the need for expensive DNA synthesizers and sequencers: a computer with the associated encoding and decoding programs is all that is required to store information in nucleotide sequences. The sequences include wild-type or synthetic genomes of humans or other species, so storage capacity is unlimited, allowing an unlimited amount of information to be stored.
  • Figure 1 shows a flow chart of one embodiment of the encoding method of the present invention.
  • Fig. 2 shows a schematic diagram of one embodiment of the encoding/decoding method of the present invention.
  • Figure 3 shows a flow chart of one embodiment of the decoding method of the present invention.
  • Fig. 4 is a block diagram showing an embodiment of an encoding apparatus of the present invention.
  • Fig. 5 is a block diagram showing an embodiment of a decoding apparatus of the present invention.
  • Fig. 6 is a block diagram showing an embodiment of a data processing device of the present invention.
  • Figure 1 shows a flow chart of one embodiment of the encoding method of the present invention.
  • In step 110, the information is digitized to generate sequence data.
  • the digitizing process can include converting the information to a binary code.
  • the binary code is transcoded to generate sequence data, which may be a series of data arranged in order.
  • the information may be in any form such as text information, image information, or audio information.
  • Fig. 2 shows a schematic diagram of one embodiment of the encoding/decoding method of the present invention.
  • the information to be processed is text information 21, "What I cannot create, I do not understand. Look deep into nature, and then you will understand everything better.”
  • Each 0 in the corresponding binary code 22 can be converted to A or T, and each 1 to C or G, to generate sequence data 23 "AGACTGGCAGCTCTTTTGGTTTAGAGCGACTA".
  • A, C, G, and T correspond to adenine, cytosine, guanine, and thymine in DNA (Deoxyribonucleic Acid, deoxyribonucleic acid), respectively.
  • Other forms of sequence data may also be generated according to other conversion modes between 1, 0 and A, C, G, and T.
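  • As a minimal sketch of the two conversion modes described above (not the patent's reference implementation), the Python below maps bits to bases with the 2-bit table and maps bases back to bits with the 1-bit rule. The function names and the use of 8-bit ASCII for the text are assumptions made for illustration.

```python
# Illustrative sketch only; all function names are hypothetical.

# 2-bit conversion mode from the text: 01 -> A, 00 -> T, 11 -> C, 10 -> G.
TWO_BIT = {"01": "A", "00": "T", "11": "C", "10": "G"}

# 1-bit conversion mode from the text: A or T stand for 0, C or G stand for 1.
ONE_BIT_INVERSE = {"A": "0", "T": "0", "C": "1", "G": "1"}


def text_to_bits(text: str) -> str:
    """Digitize text as 8-bit ASCII (an assumed character encoding)."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def bits_to_bases_2bit(bits: str) -> str:
    """Transcode a bit string into A/C/G/T, two bits per base."""
    if len(bits) % 2:
        raise ValueError("bit string length must be even")
    return "".join(TWO_BIT[bits[i:i + 2]] for i in range(0, len(bits), 2))


def bases_to_bits_1bit(bases: str) -> str:
    """Transcode A/C/G/T back into bits, one bit per base."""
    return "".join(ONE_BIT_INVERSE[base] for base in bases)


def bits_to_text(bits: str) -> str:
    """Reassemble 8-bit groups into ASCII text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")
```

  • Under the ASCII assumption, applying `bases_to_bits_1bit` to sequence data 23 yields a 32-bit string that reads as the first four characters of text information 21. In the forward direction the 1-bit mode leaves a free choice between A and T (or between C and G) for every bit, so one binary code can map to many different sequences.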
  • the sequence data can be a binary code corresponding to the information.
  • all gene segments in the gene database need to be transcoded into binary code so that the binary code corresponding to the information can be found in the transformed gene database.
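  • A rough sketch of this variant, in which the gene database itself is transcoded so that a binary data segment can be searched for directly. The toy database string and the function names are hypothetical, and the 1-bit rule (A/T to 0, C/G to 1) is only one of the conversion modes the text allows.

```python
# Illustrative sketch only; names and the toy database are hypothetical.

BASE_TO_BIT = {"A": "0", "T": "0", "C": "1", "G": "1"}


def transcode_database(db_bases: str) -> str:
    """Convert every base of the gene database into one bit (A/T -> 0, C/G -> 1)."""
    return "".join(BASE_TO_BIT[base] for base in db_bases.upper())


def find_bits(db_bits: str, segment_bits: str) -> int:
    """Return the 1-based position of the first match, or 0 if there is none."""
    return db_bits.find(segment_bits) + 1


toy_db = "TTGACCATGCAA"  # made-up sequence, not SEQ ID NO: 1
toy_db_bits = transcode_database(toy_db)
```

  • Positions are reported 1-based here to match the way the description counts characters in the gene database; returning 0 for "not found" is a design choice of this sketch, not something the source specifies.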
  • any form of information can be mapped to the data stored in DNA, thereby linking the information with the genetic database, providing the necessary technical basis for the encryption of information. Further, the following steps can be used to encrypt and store information.
  • In step 120, the sequence data is divided into N data segments, N being an integer greater than one.
  • In step 130, for each data segment, the corresponding nucleic acid fragment is looked up in the gene database, and the position information of the nucleic acid fragment in the gene database is used as the identification of the data segment.
  • the position information of the first symbol of the nucleic acid fragment that matches the data fragment and the length of the nucleic acid fragment may also be saved as an identification of the data fragment.
  • a nucleic acid fragment refers to a fragment formed by a plurality of nucleotides linked end to end, and the nucleotide may be a deoxyribonucleotide or a ribonucleotide.
  • the nucleic acid fragment can be transcoded into a binary code according to certain rules as needed, and the nucleic acid fragment after transcoding refers to the binary code corresponding to the nucleic acid fragment.
  • the length of the nucleic acid fragment can be expressed by the number of nucleotides, that is, "nt"; each nucleotide is regarded as one character in the present invention, and the number of nucleotides can also be expressed by the number of characters.
  • Where the nucleic acid fragment has been transcoded into binary code in this way, its length can be expressed in bytes.
  • the size of N can be adjusted based on the search for nucleic acid fragments in the gene database.
  • If nucleic acid fragments corresponding to the data segments are not found in the gene database, the sequence data can be re-divided to obtain M data segments, and the gene database is searched for the nucleic acid fragment corresponding to each of the M data segments, M being an integer greater than one.
  • The re-divided data segments are shorter than the original data segments, so that corresponding nucleic acid fragments can be found in the gene database.
  • For example, a data segment for which no corresponding nucleic acid fragment can be found in the gene database can be divided into multiple parts, and the gene database is searched for the nucleic acid fragment corresponding to each part, improving the probability of a match and the efficiency of the search.
  • The gene database 24 may be the nucleotide sequence of the human nuclear pore-reporting protein gene (SEQ ID NO: 1), a database containing 4103 characters.
  • the sequence data 23 is divided into a plurality of data segments, each of which contains 2 characters. The same nucleic acid fragment as each data fragment is looked up in the gene database 24.
  • the position corresponding to the first character in the nucleic acid fragment and the length of the nucleic acid fragment are recorded as an identifier.
  • The data segment composed of the first two characters of the sequence data, AG, corresponds to the identifier 3856 2; that is, AG corresponds to the nucleic acid fragment of length 2 characters that starts at the 3856th character of the gene database.
  • Where no identical nucleic acid fragment is found for a data segment, the length of the data segment is reduced to 1 character and the gene database 24 is searched again.
  • In that case the new data segment A consists of the third character alone.
  • The identifier of this data segment is 3827 1; that is, A corresponds to the nucleic acid fragment of length 1 character at the 3827th character of the gene database.
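  • The lookup and re-division described above can be sketched as follows. This is an illustration under stated assumptions: the function name, the toy database, and the choice to split an unmatched segment into two halves are hypothetical; the source only says that unmatched segments are divided into shorter ones. Identifiers are (1-based start position, length) pairs, following the "3856 2" form.

```python
# Illustrative sketch only; names and the toy database are hypothetical.
from typing import List, Tuple

Identifier = Tuple[int, int]  # (1-based start position, length in characters)


def find_identifiers(sequence: str, database: str, seg_len: int = 2) -> List[Identifier]:
    """Map each data segment to a (position, length) identifier.

    A segment with no match is split into two halves that are looked up
    separately, mirroring the re-division into shorter data segments.
    """
    identifiers: List[Identifier] = []
    # Work list of segments still to be looked up, kept in original order.
    pending = [sequence[i:i + seg_len] for i in range(0, len(sequence), seg_len)]
    while pending:
        segment = pending.pop(0)
        index = database.find(segment)
        if index >= 0:
            identifiers.append((index + 1, len(segment)))
        elif len(segment) > 1:
            half = len(segment) // 2
            # Put both halves back at the front so output order is preserved.
            pending[:0] = [segment[:half], segment[half:]]
        else:
            raise LookupError(f"symbol {segment!r} is absent from the database")
    return identifiers
```

  • With the made-up database "GATTACAGG" and the sequence "AGACTG", the segments AG and AC are matched directly, while TG has no match and is looked up as T and G separately. A larger database makes longer segments matchable, which shortens the resulting list of identifiers.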
  • In step 140, a sequence code is generated based on the identifiers of the respective data segments.
  • the identifiers of the data segments can be stored in order to obtain the sequence code corresponding to the information.
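  • One possible way to store the identifiers in order is a whitespace-separated list of "position length" pairs, as sketched below. The textual layout and the function names are assumptions; the source shows individual identifiers such as "3856 2" but does not define a serialization format.

```python
# Illustrative sketch only; the serialization format is an assumption.
from typing import List, Tuple

Identifier = Tuple[int, int]  # (1-based start position, length)


def build_sequence_code(identifiers: List[Identifier]) -> str:
    """Join the identifiers, in segment order, into one sequence code string."""
    return " ".join(f"{position} {length}" for position, length in identifiers)


def parse_sequence_code(code: str) -> List[Identifier]:
    """Recover the ordered (position, length) pairs from a sequence code."""
    numbers = [int(token) for token in code.split()]
    if len(numbers) % 2:
        raise ValueError("sequence code must contain position/length pairs")
    return list(zip(numbers[0::2], numbers[1::2]))
```

  • Because the order of the pairs carries the order of the data segments, the parser returns a list rather than a set or dictionary.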
  • the gene database 24 in the embodiment shown in FIG. 2 described above has a small capacity, and thus the divided data segment length is also relatively small, and only the implementation process of the method is exemplarily illustrated.
  • a gene database storing a large number of gene sequences can be used as a database of encoding methods.
  • the sequence code consisting of these identifiers only contains the identifier of each data segment, which not only can realize information encryption, but also can improve storage efficiency.
  • the encoded data composed of the sequence encoding can be decoded by the inverse of the above steps.
  • Figure 3 shows a flow chart of one embodiment of the decoding method of the present invention.
  • In step 310, an identifier corresponding to each data segment is obtained from the encoded data.
  • The encoded data may be the sequence code 25 in FIG. 2.
  • In step 320, location information corresponding to each data segment is obtained according to the identifier.
  • a corresponding nucleic acid fragment is obtained from the gene database based on the location information.
  • the identifier 3856 2 in the sequence code 25 in FIG. 2 represents a nucleic acid fragment of the gene database 24 starting with the 3856th character and having a length of 2 characters.
  • In step 340, sequence data is generated from the nucleic acid fragments.
  • The acquired gene fragments can be combined to obtain the sequence data 23 "AGACTGGCAGCTCTTTTGGTTTAGAGCGACTA" in FIG. 2.
  • The sequence data 23 is transcoded into a binary code 22 "01010111011010000110000101110100" according to the transcoding relationship between A, C, G, T and 1, 0 employed in encoding.
  • In step 350, information is obtained from the sequence data.
  • the binary code 22 can be decoded into text information 21 "What I cannot create, I do not understand", thereby completing the decryption.
  • The sequence data of the information to be encrypted is matched to gene segments in the gene database, and the corresponding position information is encoded as a sequence code, thereby encrypting the information.
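  • The decoding steps above can be sketched as follows, assuming (position, length) identifiers with 1-based positions, the 1-bit transcoding rule, and 8-bit ASCII text. The function names and the toy database are hypothetical; the real gene database 24 (SEQ ID NO: 1) is not reproduced here.

```python
# Illustrative sketch only; names and encodings are assumptions.
from typing import List, Tuple

BASE_TO_BIT = {"A": "0", "T": "0", "C": "1", "G": "1"}


def fetch_sequence(identifiers: List[Tuple[int, int]], database: str) -> str:
    """Look up each (1-based position, length) fragment and concatenate them."""
    fragments = []
    for position, length in identifiers:
        start = position - 1
        if start < 0 or start + length > len(database):
            raise ValueError(f"identifier ({position}, {length}) is out of range")
        fragments.append(database[start:start + length])
    return "".join(fragments)


def sequence_to_text(sequence: str) -> str:
    """Transcode bases to bits (A/T -> 0, C/G -> 1), then bits to ASCII text."""
    bits = "".join(BASE_TO_BIT[base] for base in sequence)
    if len(bits) % 8:
        raise ValueError("sequence does not encode a whole number of bytes")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")
```

  • Decoding requires the same gene database and the same transcoding rule that were used for encoding; without them the identifiers are only a list of numbers.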
  • Fig. 4 is a block diagram showing an embodiment of an encoding apparatus of the present invention.
  • the apparatus includes an information digitization module 41, a data identification determination module 42, and an encoding generation module 43.
  • The information digitization module 41 digitizes the information to generate sequence data.
  • The information digitization module 41 transcodes the binary code corresponding to the information to generate sequence data. For example, it converts each 0 in the binary code to A or T and each 1 to C or G, or converts 01 to A, 00 to T, 11 to C, and 10 to G.
  • the sequence data is data composed of A, C, G, and T.
  • the apparatus further includes a genetic data transcoding module 44.
  • the gene data transcoding module 44 transcodes all of the nucleic acid fragments in the gene database into a binary code.
  • The data identification determining module 42 divides the sequence data into N data segments, N being an integer greater than 1; for each data segment, it searches the gene database for a corresponding nucleic acid fragment and uses the position information of that fragment in the gene database as the identification of the data segment.
  • the identification may include positional information of the first symbol and the last symbol of the nucleic acid fragment in the gene database, or the identification may include positional information of the first symbol of the nucleic acid fragment in the gene database, and the length of the nucleic acid fragment.
  • The data identification determining module 42 further divides any data segment for which no corresponding nucleic acid fragment is found in the gene database into M data segments, and searches the gene database for the nucleic acid fragment corresponding to each of the M data segments, M being an integer greater than one.
  • the code generation module 43 generates a sequence code based on the identifier corresponding to each data segment. For example, the sequence encoding may be sequentially generated in the order in which the respective data segments are divided.
  • Fig. 5 is a block diagram showing an embodiment of a decoding apparatus of the present invention.
  • the apparatus includes: a data identifier acquisition module 51, a sequence acquisition module 52, and an information generation module 53.
  • the data identifier obtaining module 51 acquires an identifier corresponding to each data segment from the encoded data, and the encoded data is a sequence encoding generated by the encoding method in the above embodiment or by the encoding device in the above embodiment.
  • the sequence obtaining module 52 acquires location information corresponding to each data segment according to the identifier, and acquires a corresponding nucleic acid fragment from the gene database according to the location information.
  • the information generating module 53 generates sequence data based on the nucleic acid fragments and acquires information based on the sequence data.
  • The sequence data of the information to be encrypted is matched to gene segments in the gene database, and the corresponding position information is encoded as a sequence code, thereby encrypting the information.
  • Fig. 6 is a block diagram showing an embodiment of a data processing device of the present invention.
  • The apparatus 6 of this embodiment includes a memory 61 and a processor 62 coupled to the memory 61, the processor 62 being configured to perform, based on instructions stored in the memory 61, the encoding method or decoding method in any one of the embodiments of the present invention.
  • the memory 61 may include, for example, a system memory, a fixed non-volatile storage medium, or the like.
  • the system memory stores, for example, an operating system, an application, a boot loader, a database, and other programs.
  • embodiments of the present invention can be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer usable program code.
  • the methods and systems of the present invention may be implemented in a number of ways.
  • the methods and systems of the present invention can be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above-described sequence of steps for the method is for illustrative purposes only, and the steps of the method of the present invention are not limited to the order specifically described above unless otherwise specifically stated.
  • the invention may also be embodied as a program recorded in a recording medium, the program comprising machine readable instructions for implementing the method according to the invention.
  • the invention also covers a recording medium storing a program for performing the method according to the invention.

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioethics (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to an encoding and decoding method, a device, and a data processing device, in the technical field of data processing. The encoding method comprises: digitizing information to generate sequence data (110); dividing the sequence data into N data segments (120), N being an integer greater than 1; searching a gene database for a nucleic acid fragment corresponding to each data segment, and using the position information of the nucleic acid fragment in the gene database as an identifier of each data segment (130); and generating a sequence code according to the identifier corresponding to each data segment (140). The method and device can be used to increase encryption efficiency and security.
PCT/CN2017/099152 2017-08-25 2017-08-25 Encoding and decoding method, device, and data processing device WO2019037117A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2017/099152 WO2019037117A1 (fr) 2017-08-25 2017-08-25 Encoding and decoding method, device, and data processing device
CN201780094012.7A CN111095423B (zh) 2017-08-25 2017-08-25 Encoding/decoding method and device, and data processing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/099152 WO2019037117A1 (fr) 2017-08-25 2017-08-25 Encoding and decoding method, device, and data processing device

Publications (1)

Publication Number Publication Date
WO2019037117A1 (fr) 2019-02-28

Family

ID=65439286

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/099152 WO2019037117A1 (fr) 2017-08-25 2017-08-25 Encoding and decoding method, device, and data processing device

Country Status (2)

Country Link
CN (1) CN111095423B (fr)
WO (1) WO2019037117A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
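CN112687338B (zh) * 2020-12-31 2022-01-11 云舟生物科技(广州)有限公司 Gene sequence storage and restoration method, computer storage medium, and electronic device
CN113380322B (zh) * 2021-06-25 2023-10-24 倍生生物科技(深圳)有限公司 Artificial nucleic acid sequence watermark encoding system, watermark character string, and encoding and decoding methods
CN113782102B (zh) * 2021-08-13 2022-12-13 中科碳元(深圳)生物科技有限公司 DNA data storage method, apparatus, device, and readable storage medium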

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080120079A1 (en) * 2005-02-11 2008-05-22 Smartgene Gmbh Computer-Implemented Method and Computer-Based System for Validating Dna Sequencing Data
CN103114127A (zh) * 2011-11-16 2013-05-22 中国科学院华南植物园 Cryptographic system based on a DNA chip
CN105022935A (zh) * 2014-04-22 2015-11-04 中国科学院青岛生物能源与过程研究所 Encoding method and decoding method for information storage using DNA
CN106845158A (zh) * 2017-02-17 2017-06-13 苏州泓迅生物科技股份有限公司 Method for information storage using DNA

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05324738A (ja) * 1992-05-20 1993-12-07 Fujitsu Ltd Homology classification method for gene databases
CN101420614B (zh) * 2008-11-28 2010-08-18 同济大学 Image compression method and device integrating hybrid coding and dictionary coding
EP2781072B1 (fr) * 2011-11-15 2015-10-21 Citrix Systems Inc. Systems and methods for compressing short text by dictionaries in a network
CN106506007A (zh) * 2015-09-08 2017-03-15 联发科技(新加坡)私人有限公司 Lossless data compression and decompression device and method

Also Published As

Publication number Publication date
CN111095423A (zh) 2020-05-01
CN111095423B (zh) 2023-07-21

Legal Events

Date Code Title Description
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 17922130; Country of ref document: EP; Kind code of ref document: A1