CN111440827A - Information storage medium, information storage method and application - Google Patents

Publication number
CN111440827A
Authority
CN
China
Prior art keywords
information storage
storage medium
information
gene
seq
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010443536.6A
Other languages
Chinese (zh)
Inventor
朱佑民
程倩
王梓旭
陶小倩
侯强波
杨平
柳伟强
邢妍婧
赵一凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Synbio Technologies Co ltd
Original Assignee
Suzhou Synbio Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Synbio Technologies Co ltd
Priority to CN202010443536.6A
Publication of CN111440827A
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/905 Stable introduction of foreign DNA into chromosome using homologous recombination in yeast
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/65 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/22 Vectors comprising a coding region that has been codon optimised for expression in a respective host

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Zoology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Wood Science & Technology (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mycology (AREA)
  • Biophysics (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Plant Pathology (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides an information storage medium, an information storage method and an application thereof, wherein the information storage medium is a nucleic acid molecule; the information storage medium comprises a fusion gene consisting of a codon sequence corresponding to the stored information and a fluorescent protein gene. The invention stores information using the codons corresponding to amino acids. The stored information can be read by sequencing, and it can also be inferred from the amino acid sequence of the corresponding protein, so that reading of the information is not affected even if the DNA is damaged. Furthermore, the stored codon sequence can in theory be translated into a fluorescent protein, and the information can be classified by observing the fluorescence color with a laser confocal microscope.

Description

Information storage medium, information storage method and application
Technical Field
The invention belongs to the technical field of information storage and relates to an information storage medium, an information storage method and an application thereof, and in particular to an information storage medium based on the codons corresponding to amino acids, together with the corresponding information storage method and application.
Background
Global digital information is growing rapidly, and the total amount of global digital information is expected to reach 35 zettabytes in 2020. Existing information storage devices cannot keep up with this explosive growth, and existing magnetic and optical storage media suffer from low storage density and poor durability. Magnetic tape cartridges are among the densest storage media, with a storage capacity of 185 TB and a storage density of about 10 GB/mm³; optical discs can reach a storage capacity of 1 PB and a storage density of 100 GB/mm³. Storage media based on magnetic and optical technologies have a limited lifetime, so to preserve information over the long term it is necessary to refresh and erase damaged data or to migrate the data in order to avoid data loss. For example, the storage life of a rotating disk is 3-5 years, and that of magnetic tape is 10-30 years. Energy consumption of the storage medium also matters: in 2010, U.S. data centers consumed 1.5% of the total electricity used in the United States, at a cost of up to $45 billion. Developing a next-generation digital information storage medium with higher storage density and greater durability is therefore one of the primary tasks in the field of digital information storage.
Owing to its high density and good long-term stability, DNA is an excellent candidate for future information storage media. DNA is extremely dense: the human genome is about 3 billion base pairs long, yet it fits inside cells tens of micrometers in diameter. The theoretical limit of the information storage density of DNA is 1 EB/mm³ (1 EB = 10⁹ GB); that is, 1 g of DNA can store 700 TB of data, equivalent to 14,000 Blu-ray discs of 50 GB or 233 hard disks of 3 TB, far exceeding existing magnetic and optical storage media. DNA is also extremely stable, and researchers predict that it can be preserved for 1 million years at -18 ℃.
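The capacity comparison above can be checked with simple arithmetic. The short sketch below uses only the figures quoted in this background section and assumes decimal units (1 TB = 1,000 GB); the variable names are illustrative.

```python
# Decimal storage units, expressed in gigabytes (an assumption of this sketch).
GB_PER_TB = 1_000
GB_PER_EB = 10**9

# 700 TB per gram of DNA, as quoted in the text.
capacity_per_gram_gb = 700 * GB_PER_TB

# Number of whole 50 GB Blu-ray discs and 3 TB hard disks with the same capacity.
bluray_discs = capacity_per_gram_gb // 50
hard_disks = capacity_per_gram_gb // (3 * GB_PER_TB)

# Theoretical DNA density (1 EB/mm^3) relative to the optical figure (100 GB/mm^3).
density_ratio = GB_PER_EB // 100

print(bluray_discs)   # 14000
print(hard_disks)     # 233
print(density_ratio)  # 10000000
```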
At present, the cost and time for gene synthesis and gene sequencing are exponentially reduced, greatly promoting the development of the field of DNA information storage media.
CN106845158A discloses a method for storing information by using DNA, which comprises (1) converting binary information of computer original files into quaternary system and further code-converting into DNA full sequence, wherein binary codes 00, 01, 10, 11 are correspondingly converted into A, T, C, G four deoxyribonucleotides respectively; (2) dividing the whole DNA sequence into a plurality of DNA fragments, and organizing and constructing an output DNA sequence which has the length of 90-110nt and comprises an insertion nucleotide coding sequence consisting of the DNA fragments, flanking primer sequences positioned at two ends and an index coding sequence positioned at the inner side of each flanking primer sequence; (3) synthesizing artificial DNA sequence based on the output DNA sequence and storing. The method has the remarkable advantages of good universality, capability of simplifying operation, improving the continuity, the storage efficiency and the density of DNA information storage, reducing the error rate, reducing the sequence synthesis and detection cost and the like.
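Step (1) of the prior-art method, the two-bit-per-base conversion, can be sketched as follows. This is only an illustration of the stated mapping (00, 01, 10, 11 to A, T, C, G); fragmentation, primers and index sequences from steps (2) and (3) are omitted, and the function names are hypothetical.

```python
# Mapping stated in the text: binary 00, 01, 10, 11 -> A, T, C, G.
BITS_TO_BASE = {"00": "A", "01": "T", "10": "C", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Convert bytes to a DNA string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(dna: str) -> bytes:
    """Invert bytes_to_dna; expects a length that is a multiple of four bases."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


encoded = bytes_to_dna(b"Hi")
print(encoded)                # TACATCCT
print(dna_to_bytes(encoded))  # b'Hi'
```

In such a base-level code every base carries two bits, so any single-base change alters the decoded data, which is the weakness discussed in the following paragraph.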
However, current DNA storage methods store information directly in units of bases, so that if the DNA is damaged, the stored data cannot be recovered. Therefore, the existing DNA information storage technology needs further improvement.
Disclosure of Invention
In view of the defects of the prior art and practical requirements, the invention provides an information storage medium, an information storage method and an application thereof. Information is stored using the codons corresponding to amino acids; the stored information can be read by sequencing, and it can also be inferred from the amino acid sequence of the corresponding protein, so that reading of the information is not affected even if the DNA is damaged. Furthermore, the stored codon sequence can in theory be translated into a fluorescent protein, and the information can be classified by observing the fluorescence color with a laser confocal microscope.
To achieve this purpose, the invention adopts the following technical scheme:
In a first aspect, the present invention provides an information storage medium which is a nucleic acid molecule;
the information storage medium comprises a fusion gene consisting of a codon sequence corresponding to the stored information and a fluorescent protein gene.
Preferably, each character of the stored information is represented by a number of consecutive amino acids, and each character of the stored information corresponds to a codon sequence consisting of a number of nucleotide residues.
In the invention, each character of the stored information is expressed as N continuous amino acids, and then the amino acids are converted into corresponding codons, namely, each character of the stored information corresponds to a codon sequence consisting of 3N nucleotide residues.
Preferably, each character of the stored information is represented by three consecutive amino acids, and each character of the stored information corresponds to a codon sequence consisting of 9 nucleotide residues.
In the present invention, each character of the stored information (Chinese character, English character or symbol) is converted into an amino acid sequence of three consecutive amino acids. Based on the twenty amino acids "G", "A", "V", "L", "I", "F", "W", "Y", "D", "N", "E", "K", "Q", "M", "S", "T", "C", "P", "H" and "R", any combination of three amino acids represents one Chinese character, so that at most 20 × 20 × 20 = 8,000 Chinese characters can be represented. Each amino acid is converted into a corresponding codon, which is codon-optimized for Saccharomyces cerevisiae; that is, each character of the stored information corresponds to a codon sequence consisting of 9 nucleotide residues. The codon sequence and a fluorescent protein gene are constructed as a fusion gene. The information storage medium constructed in this way is essentially a DNA sequence, and the stored information can be read by sequencing. Because the DNA sequence also corresponds to an amino acid sequence, the stored information can still be inferred from the amino acid sequence even after the DNA is damaged. In addition, the information storage medium can in theory be translated into a fluorescent protein, so that the information can be classified by observing the fluorescence color under a microscope.
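The character-to-codon conversion described above can be sketched as a base-20 encoding. The text does not disclose which amino acid triplet is assigned to which character, nor which codon is chosen for each residue, so the index-based assignment and the one-codon-per-residue table below are purely hypothetical assumptions made for illustration; the codons are valid under the standard genetic code.

```python
# The twenty residues in the order listed in the text.
AMINO_ACIDS = "GAVLIFWYDNEKQMSTCPHR"

# Hypothetical one-codon-per-residue table (an illustrative choice, not
# taken from the patent, which only says codons are optimized for yeast).
CODON = {
    "G": "GGT", "A": "GCT", "V": "GTT", "L": "TTG", "I": "ATT",
    "F": "TTC", "W": "TGG", "Y": "TAC", "D": "GAT", "N": "AAC",
    "E": "GAA", "K": "AAG", "Q": "CAA", "M": "ATG", "S": "TCT",
    "T": "ACT", "C": "TGT", "P": "CCA", "H": "CAC", "R": "AGA",
}

CAPACITY = len(AMINO_ACIDS) ** 3  # 20 x 20 x 20 = 8000 distinct triplets


def index_to_triplet(index: int) -> str:
    """Map a character index (0..7999) to three residues, base-20 digits."""
    if not 0 <= index < CAPACITY:
        raise ValueError("index must be in the range 0..7999")
    first, rest = divmod(index, 400)
    second, third = divmod(rest, 20)
    return AMINO_ACIDS[first] + AMINO_ACIDS[second] + AMINO_ACIDS[third]


def triplet_to_dna(triplet: str) -> str:
    """Convert three residues into a 9-nucleotide codon sequence."""
    return "".join(CODON[residue] for residue in triplet)


print(CAPACITY)                              # 8000
print(index_to_triplet(21))                  # GAA
print(triplet_to_dna(index_to_triplet(21)))  # GGTGCTGCT
```

Each character occupies exactly 9 nucleotides in this scheme, which matches the 3N rule stated earlier for N = 3.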
Preferably, the codon sequence corresponding to the stored information is located at the 5 'end of the fusion gene, and the fluorescent protein gene is located at the 3' end of the fusion gene.
Preferably, the fluorescent protein gene includes a green fluorescent protein gene or a red fluorescent protein gene.
Preferably, the information storage medium further includes a promoter and a terminator.
In the invention, a promoter and a terminator which are suitable for the saccharomyces cerevisiae are respectively connected with the 5 'end and the 3' end of the fusion gene.
Preferably, the promoter is the GAL1 promoter.
Preferably, the terminator is the CYC1 terminator.
Preferably, the information storage medium further includes a 5' recombination arm and a 3' recombination arm.
In the invention, the recombination arms are sequences suitable for homologous recombination in Saccharomyces cerevisiae, namely the sequences on either side of the target gene site corresponding to the sgRNA designed for CRISPR/Cas9 gene editing.
Preferably, the length of each of the 5' recombination arm and the 3' recombination arm is 50-200 bp, for example, 50bp, 60bp, 70bp, 80bp, 90bp, 100bp, 110bp, 120bp, 130bp, 140bp, 150bp, 160bp, 170bp, 180bp, 190bp or 200 bp.
Preferably, the information storage medium comprises, in order from 5' to 3', a 5' recombination arm, a GAL1 promoter, a codon sequence corresponding to the stored information, a green fluorescent protein gene, a CYC1 terminator and a 3' recombination arm.
Preferably, the green fluorescent protein gene comprises the nucleic acid sequence shown as SEQ ID NO. 1;
SEQ ID NO:1:
atggttagtaagggagaagagttatttacaggggtcgttcctatattagtagaacttgatggcgacgttaatggacataaatttagtgtttcaggtgaaggagaaggtgatgcaacgtacggtaaactgactctaaagttcatttgcaccaccggtaaattgcctgtaccgtggccaacactagttactacgttaacatacggcgtacagtgtttttcgagatatccagaccacatgaaacaacacgactttttcaaatccgcaatgccagaaggttacgtccaggaacgtactattttcttcaaagatgatggaaattataaaaccagggctgaagtgaaatttgaaggcgacactctagtgaacagaattgagttgaaggggattgatttcaaggaagacgggaacatactcggtcataagctggagtacaactataattcccataacgtctatattatggcggataagcaaaagaatggtatcaaggttaactttaaaatccggcacaatatcgaagatggctctgtacaattggccgatcattatcaacaaaatacacctattggagatggtcccgtgttgttaccagacaatcattacttgtcaacacaatctgctttaagcaaagatcccaatgagaaaagagatcatatggtcttgttagagtttgttactgccgctggtataactctgggtatggatgaactttataaataa。
Preferably, the GAL1 promoter comprises the nucleic acid sequence shown in SEQ ID NO. 2;
SEQ ID NO:2:
cggattagaagccgccgagcgggtgacagccctccgaaggaagactctcctccgtgcgtcctcgtcttcaccggtcgcgttcctgaaacgcagatgtgcctcgcgccgcactgctccgaacaataaagattctacaatactagcttttatggttatgaagaggaaaaattggcagtaacctggccccacaaaccttcaaatgaacgaatcaaattaacaaccataggatgataatgcgattagttttttagccttatttctggggtaattaatcagcgaagcgatgatttttgatctattaacagatatataaatgcaaaaactgcataaccactttaactaatactttcaacattttcggtttgtattacttcttattcaaatgtaataaaagtatcaacaaaaaattgttaatatacctctatactttaacgtcaaggag。
preferably, the CYC1 terminator comprises the nucleic acid sequence set forth in SEQ ID NO. 3;
SEQ ID NO:3:
tcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgc。
preferably, the 5' recombination arm comprises the nucleic acid sequence shown in SEQ ID NO. 4;
SEQ ID NO:4:
gaggatgtaataatactaatctcgaagatgccatctaatacatatagacatacatatatatatatatacattctatatattcttacccagattctttgaggtaagacggttgggttttatcttttgcagttggtactattaagaacaatcgaatcataagcattgcttacaaagaatacacatacgaaatattaacgata。
preferably, the 3' recombination arm comprises the nucleic acid sequence shown in SEQ ID NO. 5;
SEQ ID NO:5:
cgtgatttacatatactacaagtcgccagtgtaactcctcactgaatatgattcatacatacccgtatgtattaatgtataaatgttctcagagcaaattttatcgatatcttgtttgccagtggtatgcaggtttggcaaattttttaccataatatccgtttatagattctggaaccttaccaactttcttaccgcta。
in a second aspect, the present invention provides an information storage method, including:
integrating the information storage medium of the first aspect into the saccharomyces cerevisiae genome for information storage.
Preferably, the method comprises:
transforming the information storage medium of the first aspect into competent Saccharomyces cerevisiae simultaneously with a CRISPR/Cas9 gene editing plasmid targeting a Saccharomyces cerevisiae gene;
preliminarily screening positive colonies, and carrying out PCR detection to obtain positive clones into which the plasmid has been introduced;
and extracting the genome of the positive clones, and performing sequencing verification to obtain Saccharomyces cerevisiae integrated with the information storage medium.
Preferably, the CRISPR/Cas9 gene editing plasmid contains an sgRNA targeting the Saccharomyces cerevisiae ADE1 or ADE2 gene.
In the invention, a CRISPR/Cas9 gene editing plasmid targeting the Saccharomyces cerevisiae ADE1 or ADE2 gene is designed, so that the colony color changes from white to red after the ADE1 or ADE2 gene is edited, which facilitates preliminary screening of positive colonies by visual observation.
Preferably, the CRISPR/Cas9 gene editing plasmid contains an sgRNA that targets the 5' sequence of the Saccharomyces cerevisiae ADE1 or ADE2 gene.
Preferably, the sgRNA comprises a nucleic acid sequence shown in SEQ ID NO. 6-11;
SEQ ID NO:6:TCCTGCCCAGGCCGCTGAGC;
SEQ ID NO:7:ATTGTCAGAGGCTACATCAC;
SEQ ID NO:8:ACTCTGACAGTTTGGTCAAT;
SEQ ID NO:9:ACTTTACCTCTGGCCACCAA;
SEQ ID NO:10:GGACGGTATATTGCCATTGG;
SEQ ID NO:11:TATGTCTCTAACTTTACCTC.
Preferably, the information storage medium is prepared by the following method:
synthesizing a plurality of alternating forward and reverse sequences as PCR templates, and performing PCR amplification with designed head and tail primers to obtain the information storage medium, wherein adjacent forward and reverse sequences share an overlapping region.
Preferably, the length of each forward or reverse sequence is 55-70 bp, for example, 55bp, 56bp, 57bp, 58bp, 59bp, 60bp, 61bp, 62bp, 63bp, 64bp, 65bp, 66bp, 67bp, 68bp, 69bp or 70 bp.
Preferably, the length of the overlapping region is 20-25 bp, for example, 20bp, 21bp, 22bp, 23bp, 24bp or 25 bp.
Preferably, the length of the head and tail primers is 55-70 bp, for example, 55bp, 56bp, 57bp, 58bp, 59bp, 60bp, 61bp, 62bp, 63bp, 64bp, 65bp, 66bp, 67bp, 68bp, 69bp or 70 bp.
Preferably, the positive colonies are red.
Preferably, the sequencing comprises Sanger sequencing.
In a third aspect, the present invention provides an information storage kit comprising the information storage medium of the first aspect.
Preferably, the kit further comprises a CRISPR/Cas9 gene editing plasmid targeting the saccharomyces cerevisiae gene.
Preferably, the kit further comprises saccharomyces cerevisiae.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention stores information using the codons corresponding to amino acids; each character of the stored information corresponds to a codon sequence consisting of 9 nucleotide residues, and the codon sequence and the fluorescent protein gene are constructed as a fusion gene. The information storage medium constructed in this way is essentially a DNA sequence, so the stored information can be read by sequencing; because the DNA sequence also corresponds to an amino acid sequence, the stored information can still be inferred from the amino acid sequence even if the DNA is damaged, which solves the prior-art problem that data cannot be recovered once the DNA sequence is damaged;
(2) the information storage medium of the present invention can theoretically be translated into a protein that exhibits fluorescence, and information classification is performed by observing the fluorescence color under a microscope;
(3) the information storage medium has high storage density and good stability, and has wide application prospect in the technical field of information storage.
Drawings
FIG. 1 is a flow chart of a method of information storage;
FIG. 2 is a CRISPR/Cas9 gene editing plasmid map;
FIG. 3 shows that the color of the Saccharomyces cerevisiae colony changes from white to red after ADE1 gene knockout is successful;
FIG. 4 is a Sanger sequencing alignment after information storage.
Detailed Description
To further illustrate the technical means adopted by the present invention and the effects thereof, the present invention is further described below with reference to the embodiments and the accompanying drawings. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention.
Where specific techniques or conditions are not indicated in the examples, they follow the techniques or conditions described in the literature in the field or in the product specifications. Reagents or apparatus for which no manufacturer is indicated are conventional products that are commercially available through regular channels.
Example 1 information storage Experimental procedure
The flow chart of the information storage method constructed by the invention is shown in FIG. 1. First, each character of the information to be stored is converted into an amino acid sequence of three consecutive amino acids, and each amino acid is then converted into a corresponding codon to obtain the codon sequence corresponding to the stored information. The codon sequence is fused with a green fluorescent protein (GFP) gene to form a fusion gene, a promoter and a terminator are added, and homologous recombination arms are added to form the donor. An sgRNA targeting the Saccharomyces cerevisiae ADE1 or ADE2 gene is designed, and a CRISPR/Cas9 gene editing plasmid is constructed. The plasmid and the donor are co-transformed into Saccharomyces cerevisiae, positive colonies are preliminarily screened by colony color, and PCR detection is carried out to obtain positive clones into which the plasmid has been introduced. The genome of the positive clones is extracted and verified by sequencing to obtain Saccharomyces cerevisiae integrated with the information storage medium, and the stored information is deduced back from the sequencing result.
Example 2 selection of information and codon conversion
This embodiment selects the following information for storage:
"Hongxu Biotechnology GmbH (Hongxu technology, Suzhou) is a leading DNA technology company. The technical platform developed by the present company is a globally leading combined platform for synthesizing biologically complete DNA [figure BDA0002504799700000061] 1.0, [figure BDA0002504799700000062] 2.0 and [figure BDA0002504799700000063] 3.0. the method efficiently meets the customer requirements of humanized antibody library construction, genetic engineering vaccine development, industrial enzyme optimization, chromosome/genome synthesis, molecular assisted breeding, DNA information storage technology development and the like."
The information to be stored is converted into an amino acid sequence. Based on the twenty amino acids 'G', 'A', 'V', 'L', 'I', 'F', 'W', 'Y', 'D', 'N', 'E', 'K', 'Q', 'M', 'S', 'T', 'C', 'P', 'H' and 'R', any combination of three amino acids represents one Chinese character, so that at most 20 × 20 × 20 = 8,000 Chinese characters can be represented. The amino acid sequence is then converted into the corresponding codon sequence, which is optimized according to the codon preference of Saccharomyces cerevisiae, giving the nucleic acid sequence carrying the stored information shown as SEQ ID NO. 12:
cgcctggatagagtatacagattgaagagggtcgttagggcttaccgcgtagcacgtgcattgcgagtgtccagggccattaggctacctagattgtttcgaattgatagattcccgcgagtggataggggttgcagagcaaggagactgatcagagctaagagattaaagcgcgtagttcgtgcacttagagtatctagggttactagaggttctagattagagaggggcgaaagagcacaaagaggtaaaagaggaatgaggttatcgagaattgcgcgtggtgacaggatccccagggtcagtagagcagaaagattcccaagagttgacagattatggagaattatcagattgcaaagatttcccagagtggatcgtgcccacagagtctgccgtctatcaagagtctcccgcgctgagcggggttggagggtctggcgtggtttcagagctgatagggggcctaggggtaaaaggggtatgcgactttcgagagtccaacgcgcttgtagggcctacagagttgccagaggctataggggacttcgtattaccagactcagtaggatcgcaagaggcgatcgaattccaagggtccagagagcgtgcagactattgcgtgtgcaaagaggatggcgtgtatggagggctgtgcgagctgttaggatttcacgtgggcaacgcttctgtagaggacatcgaatacgcagaggggtccgcgtcatgcgagtggaacgattagttcgtatcagccgtggacagagattttgcagaggtcacagaatccggcgtttggggagagttatgagggtcgaaagggtgattagaatttctagaggacaacgcttttgtagaggccatagaataaggagaggtacgagagttatgagggttgagagattatggagattgtatcgggttaatcgggttaggcggattaaacgaatttgtcggatcgaaagaggtattcgtgtaaaaaggggtaatagattacgaaggattcaacgtatatttagactccatagagtaccacgtataggcagattgacaaggatagttagggctgctagaggcggtagattaaacagagcctggagattgatgagaattttacgtgctcacagagtttgtcgaatagttcgactgaacagaatctatagagttcacagattcagaaggatacagagaatcgtgagaatttggcgctttactcgtttgcatagaggtcgtagagcagctagaggaggtcggctattgaggctttcccgcgtacaaagagcttgtcgtatcgtgaggttatgtagatttcatagaatacatcgtgctggtcgtgccttcagagttctacgggccaatagagcaatgagaatagctaggggcgacagaattcctcgtgtttttagagttggaagaattatgagagcaagcagattatcaagagtatctagagccgaaagggcccatagagtatgtagactggcgagagcgacaagggctccaagaggtgcaaggataaataggctttgg。
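Reading the stored information back starts by translating the sequenced DNA into amino acids and grouping them in threes. The sketch below does this for the first 27 nucleotides of SEQ ID NO: 12 using the standard genetic code. The assignment of triplets to characters is not disclosed in the text, so the example stops at the triplets; the function names are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def character_triplets(dna: str) -> list:
    """Group the translated residues three at a time, one group per character."""
    protein = translate(dna)
    return [protein[i:i + 3] for i in range(0, len(protein), 3)]


prefix = "cgcctggatagagtatacagattgaag"  # first 27 nt of SEQ ID NO: 12
print(character_triplets(prefix))       # ['RLD', 'RVY', 'RLK']

# A synonymous change (cgc -> aga, both arginine) leaves the triplet intact.
print(character_triplets("agactggat"))  # ['RLD']
```

The second call illustrates the tolerance argued for in the description: base changes that preserve the encoded amino acid do not change the decoded triplet. Changes that alter the amino acid would still alter the decoded result.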
EXAMPLE 3 design and Synthesis of information storage Medium
The nucleic acid sequence carrying the stored information, shown as SEQ ID NO. 12, is fused via the C-terminus with the green fluorescent protein gene shown as SEQ ID NO. 1 to construct a fusion gene; a start codon is added at the 5' end and a stop codon at the 3' end; the GAL1 promoter (SEQ ID NO: 2) is added upstream and the CYC1 terminator (SEQ ID NO: 3) downstream; and homologous recombination arms (SEQ ID NO: 4-5) are added at both ends, giving the designed information storage medium shown as SEQ ID NO. 13;
SEQ ID NO:13:
gaggatgtaataatactaatctcgaagatgccatctaatacatatagacatacatatatatatatatacattctatatattcttacccagattctttgaggtaagacggttgggttttatcttttgcagttggtactattaagaacaatcgaatcataagcattgcttacaaagaatacacatacgaaatattaacgatacggattagaagccgccgagcgggtgacagccctccgaaggaagactctcctccgtgcgtcctcgtcttcaccggtcgcgttcctgaaacgcagatgtgcctcgcgccgcactgctccgaacaataaagattctacaatactagcttttatggttatgaagaggaaaaattggcagtaacctggccccacaaaccttcaaatgaacgaatcaaattaacaaccataggatgataatgcgattagttttttagccttatttctggggtaattaatcagcgaagcgatgatttttgatctattaacagatatataaatgcaaaaactgcataaccactttaactaatactttcaacattttcggtttgtattacttcttattcaaatgtaataaaagtatcaacaaaaaattgttaatatacctctatactttaacgtcaaggagatgcgcctggatagagtatacagattgaagagggtcgttagggcttaccgcgtagcacgtgcattgcgagtgtccagggccattaggctacctagattgtttcgaattgatagattcccgcgagtggataggggttgcagagcaaggagactgatcagagctaagagattaaagcgcgtagttcgtgcacttagagtatctagggttactagaggttctagattagagaggggcgaaagagcacaaagaggtaaaagaggaatgaggttatcgagaattgcgcgtggtgacaggatccccagggtcagtagagcagaaagattcccaagagttgacagattatggagaattatcagattgcaaagatttcccagagtggatcgtgcccacagagtctgccgtctatcaagagtctcccgcgctgagcggggttggagggtctggcgtggtttcagagctgatagggggcctaggggtaaaaggggtatgcgactttcgagagtccaacgcgcttgtagggcctacagagttgccagaggctataggggacttcgtattaccagactcagtaggatcgcaagaggcgatcgaattccaagggtccagagagcgtgcagactattgcgtgtgcaaagaggatggcgtgtatggagggctgtgcgagctgttaggatttcacgtgggcaacgcttctgtagaggacatcgaatacgcagaggggtccgcgtcatgcgagtggaacgattagttcgtatcagccgtggacagagattttgcagaggtcacagaatccggcgtttggggagagttatgagggtcgaaagggtgattagaatttctagaggacaacgcttttgtagaggccatagaataaggagaggtacgagagttatgagggttgagagattatggagattgtatcgggttaatcgggttaggcggattaaacgaatttgtcggatcgaaagaggtattcgtgtaaaaaggggtaatagattacgaaggattcaacgtatatttagactccatagagtaccacgtataggcagattgacaaggatagttagggctgctagaggcggtagattaaacagagcctggagattgatgagaattttacgtgctcacagagtttgtcgaatagttcgactgaacagaatctatagagttcacagattcagaaggatacagagaatcgtgagaatttggcgctttactcgtttgcatagaggtcgtagagcagctagaggaggtcggctattgaggctttcccgcgtacaaagagcttgtcgtatcgtgaggttatgtagatttcatagaatacatcgtgctggtcgtgccttcagagttctacgggc
caatagagcaatgagaatagctaggggcgacagaattcctcgtgtttttagagttggaagaattatgagagcaagcagattatcaagagtatctagagccgaaagggcccatagagtatgtagactggcgagagcgacaagggctccaagaggtgcaaggataaataggctttgggttagtaagggagaagagttatttacaggggtcgttcctatattagtagaacttgatggcgacgttaatggacataaatttagtgtttcaggtgaaggagaaggtgatgcaacgtacggtaaactgactctaaagttcatttgcaccaccggtaaattgcctgtaccgtggccaacactagttactacgttaacatacggcgtacagtgtttttcgagatatccagaccacatgaaacaacacgactttttcaaatccgcaatgccagaaggttacgtccaggaacgtactattttcttcaaagatgatggaaattataaaaccagggctgaagtgaaatttgaaggcgacactctagtgaacagaattgagttgaaggggattgatttcaaggaagacgggaacatactcggtcataagctggagtacaactataattcccataacgtctatattatggcggataagcaaaagaatggtatcaaggttaactttaaaatccggcacaatatcgaagatggctctgtacaattggccgatcattatcaacaaaatacacctattggagatggtcccgtgttgttaccagacaatcattacttgtcaacacaatctgctttaagcaaagatcccaatgagaaaagagatcatatggtcttgttagagtttgttactgccgctggtataactctgggtatggatgaactttataaataatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgccgtgatttacatatactacaagtcgccagtgtaactcctcactgaatatgattcatacatacccgtatgtattaatgtataaatgttctcagagcaaattttatcgatatcttgtttgccagtggtatgcaggtttggcaaattttttaccataatatccgtttatagattctggaaccttaccaactttcttaccgcta;
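The layout of SEQ ID NO: 13 can be sketched as a simple concatenation. The function below is a hypothetical helper, not the authors' tooling: it follows the order given in this example and, as in SEQ ID NO: 13, places a single start codon in front of the information sequence and joins the GFP gene without its own start codon. The short strings in the usage lines are toy placeholders, not the real SEQ ID NO: 1-5.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def assemble_donor(arm5, promoter, payload, gfp, terminator, arm3):
    """Concatenate the donor parts in 5' to 3' order (illustrative sketch)."""
    payload, gfp = payload.lower(), gfp.lower()
    if len(payload) % 3:
        raise ValueError("payload must be a whole number of codons")
    if not gfp.startswith("atg"):
        raise ValueError("GFP gene is expected to begin with a start codon")
    if len(gfp) % 3 or gfp[-3:] not in STOP_CODONS:
        raise ValueError("GFP gene is expected to end with an in-frame stop codon")
    # One start codon, then the stored information, then GFP minus its own ATG.
    orf = "atg" + payload + gfp[3:]
    return (arm5 + promoter + orf + terminator + arm3).lower()


# Toy placeholder parts, for illustration only.
donor = assemble_donor(
    arm5="aaaa",
    promoter="cccc",
    payload="cgcctggat",
    gfp="atggttagtaaataa",
    terminator="gggg",
    arm3="tttt",
)
print(donor)
```

Keeping the payload length a multiple of three is what keeps the GFP gene in frame behind the stored information.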
The information storage medium was synthesized as follows:
A plurality of alternating forward and reverse sequences of 55-70 bp were synthesized as PCR templates, with an overlapping region of 20-25 bp designed between adjacent forward and reverse sequences; PCR amplification was then carried out with the head and tail primers, using Kapa high-fidelity polymerase as the DNA polymerase, to obtain the information storage medium.
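The overlapping-fragment design described above can be sketched as follows. This is a minimal illustration that only tiles a template into alternating forward and reverse-complement fragments; the default length of 60 and overlap of 20 are assumptions chosen from within the stated ranges, and practical design considerations such as melting temperature are omitted. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligos(template: str, oligo_len: int = 60, overlap: int = 20):
    """Tile the template with overlapping fragments on alternating strands.

    Returns (start, strand, sequence) tuples; '-' fragments are given as
    reverse complements. The last fragment may be shorter than oligo_len.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos, start, index = [], 0, 0
    while True:
        fragment = template[start:start + oligo_len]
        strand = "+" if index % 2 == 0 else "-"
        sequence = fragment if strand == "+" else reverse_complement(fragment)
        oligos.append((start, strand, sequence))
        if start + oligo_len >= len(template):
            break
        start += step
        index += 1
    return oligos


template = "acgtgca" * 20  # 140 nt toy template, not a sequence from the patent
oligos = design_oligos(template)
for start, strand, sequence in oligos:
    print(start, strand, len(sequence))
```

With these defaults each fragment shares its last 20 nucleotides with the start of the next one, so neighbouring fragments on opposite strands can anneal and be extended during PCR.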
Example 4 Integration of the information storage medium into the Saccharomyces cerevisiae genome
First, 6 sgRNAs targeting the Saccharomyces cerevisiae ADE1 gene (SEQ ID NO. 6-11) were designed and inserted by recombination into a CRISPR/Cas9 gene editing plasmid (pRCC-K); the plasmid map is shown in figure 2;
the information storage medium shown as SEQ ID NO. 13 and the CRISPR/Cas9 gene editing plasmid were co-transformed into Saccharomyces cerevisiae and selected with G418; the CRISPR/Cas9 gene editing plasmid edits the ADE1 gene of Saccharomyces cerevisiae, and the colony color changes from white to red, as shown in figure 3;
red colonies were picked and tested by PCR to obtain positive clones carrying the plasmid;
the genome of the positive clones was extracted, the fragment near the ADE1 target site was amplified by PCR, and Sanger sequencing verification gave Saccharomyces cerevisiae with the information storage medium integrated; as shown in figure 4, the stored information was decoded from the sequencing result to recover the original information.
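To locate the stored information in a sequencing read of the integrated construct, it helps to know where each part of SEQ ID NO. 13 sits. The sketch below computes the coordinates from the part lengths in the sequence listing. One point is an inference, not something the text states: comparing SEQ ID NO. 13 with SEQ ID NO. 1-5 and 12 suggests that a single atg start codon precedes the information codons and that the green fluorescent protein gene is fused without its own atg. The function name `layout` is hypothetical.

```python
# Part lengths (bp) from the <211> fields of the sequence listing.
PARTS = [
    ("5' recombination arm (SEQ ID NO. 4)", 200),
    ("GAL1 promoter (SEQ ID NO. 2)", 442),
    ("start codon atg (inferred)", 3),
    ("information codons (SEQ ID NO. 12)", 1530),
    ("GFP gene without its own atg (SEQ ID NO. 1)", 720 - 3),
    ("CYC1 terminator (SEQ ID NO. 3)", 248),
    ("3' recombination arm (SEQ ID NO. 5)", 200),
]


def layout(parts):
    """Return (name, start, end) tuples with 1-based inclusive coordinates."""
    coords, pos = [], 1
    for name, length in parts:
        coords.append((name, pos, pos + length - 1))
        pos += length
    return coords


coords = layout(PARTS)
total = coords[-1][2]
for name, start, end in coords:
    print(f"{start:>5}-{end:<5} {name}")
print("total length:", total)               # 3340, matching <211> of SEQ ID NO. 13
print("9-nt character slots:", 1530 // 9)   # 170
```

Under these assumptions the information codons occupy positions 646-2175 of SEQ ID NO. 13, which is the region a read would need to cover for decoding.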
Example 5 Stability analysis of the information storage medium
Saccharomyces cerevisiae whose genome carries the integrated information storage medium shown as SEQ ID NO. 13 was stored for one month at -80 ℃, 20 ℃, 0 ℃ or 4 ℃; genomic DNA was then extracted, and the fragment near the ADE1 target site was amplified by PCR for Sanger sequencing.
The sequencing results show that after storage for one month at -80 ℃, 20 ℃, 0 ℃ or 4 ℃, the information stored in the DNA information storage medium was neither lost nor changed, and the sequencing results were consistent with the original information, indicating that the DNA information storage medium constructed by the invention is stable under these storage conditions.
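The consistency check between a sequencing result and the original sequence can be sketched as a position-by-position comparison. This is a simplified illustration: it assumes the read has already been trimmed and aligned to the reference at the same length, which real Sanger data would need first. The function name and both example reads are hypothetical.

```python
def mismatches(reference: str, read: str) -> list:
    """List (1-based position, reference base, read base) for every difference."""
    reference, read = reference.lower(), read.lower()
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length first")
    return [
        (i + 1, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base != read_base
    ]


reference = "cgcctggatagagtatac"  # first 18 nt of SEQ ID NO. 12
intact = "cgcctggatagagtatac"     # hypothetical read identical to the reference
changed = "cgcctggacagagtatac"    # hypothetical read with one substitution

print(mismatches(reference, intact))   # []
print(mismatches(reference, changed))  # [(9, 't', 'c')]
```

An empty list corresponds to the outcome reported above, where the sequencing result matches the original information.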
In summary, the invention stores information using the codons corresponding to amino acids. The stored information can be read by sequencing, and it can also be inferred from the amino acid sequence of the corresponding protein, so that reading of the information is not affected even if the DNA is damaged. This solves the technical problem in the prior art that stored information is lost when the DNA sequence is damaged, and is of significance in the technical field of information storage.
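The reading step can be illustrated with the standard genetic code. The claims state that each character is preferably represented by three consecutive amino acids (9 nucleotides); the table mapping amino acid triplets to characters is not given in this section, so the sketch below stops at the triplets. It also shows only one kind of robustness, a synonymous base change that leaves the amino acids unchanged; a change that alters an amino acid would not be masked this way. The function names and the mutated sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA to one-letter amino acids, ignoring a trailing partial codon."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def amino_acid_triplets(dna: str) -> list:
    """Group the translated sequence into blocks of three amino acids."""
    protein = translate(dna)
    return [protein[i:i + 3] for i in range(0, len(protein), 3)]


original = "cgcctggatagagtatac"  # first 18 nt of SEQ ID NO. 12
mutated = "cgtctggacagagtatac"   # hypothetical: two synonymous substitutions

print(amino_acid_triplets(original))                # ['RLD', 'RVY']
print(translate(original) == translate(mutated))    # True
```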
The applicant states that the present invention is illustrated in detail by the above examples, but the present invention is not limited to the above detailed methods, i.e. it is not meant that the present invention must rely on the above detailed methods for its implementation. It should be understood by those skilled in the art that any modification of the present invention, equivalent substitutions of the raw materials of the product of the present invention, addition of auxiliary components, selection of specific modes, etc., are within the scope and disclosure of the present invention.
SEQUENCE LISTING
<110> Suzhou Hongxn Biotechnology Ltd
<120> information storage medium, information storage method and application
<130>20200522
<160>13
<170>PatentIn version 3.3
<210>1
<211>720
<212>DNA
<213> Artificial sequence
<400>1
atggttagta agggagaaga gttatttaca ggggtcgttc ctatattagt agaacttgat 60
ggcgacgtta atggacataa atttagtgtt tcaggtgaag gagaaggtga tgcaacgtac 120
ggtaaactga ctctaaagtt catttgcacc accggtaaat tgcctgtacc gtggccaaca 180
ctagttacta cgttaacata cggcgtacag tgtttttcga gatatccaga ccacatgaaa 240
caacacgact ttttcaaatc cgcaatgcca gaaggttacg tccaggaacg tactattttc 300
ttcaaagatg atggaaatta taaaaccagg gctgaagtga aatttgaagg cgacactcta 360
gtgaacagaa ttgagttgaa ggggattgat ttcaaggaag acgggaacat actcggtcat 420
aagctggagt acaactataa ttcccataac gtctatatta tggcggataa gcaaaagaat 480
ggtatcaagg ttaactttaa aatccggcac aatatcgaag atggctctgt acaattggcc 540
gatcattatc aacaaaatac acctattgga gatggtcccg tgttgttacc agacaatcat 600
tacttgtcaa cacaatctgc tttaagcaaa gatcccaatg agaaaagaga tcatatggtc 660
ttgttagagt ttgttactgc cgctggtata actctgggta tggatgaact ttataaataa 720
<210>2
<211>442
<212>DNA
<213> Artificial sequence
<400>2
cggattagaa gccgccgagc gggtgacagc cctccgaagg aagactctcc tccgtgcgtc 60
ctcgtcttca ccggtcgcgt tcctgaaacg cagatgtgcc tcgcgccgca ctgctccgaa 120
caataaagat tctacaatac tagcttttat ggttatgaag aggaaaaatt ggcagtaacc 180
tggccccaca aaccttcaaa tgaacgaatc aaattaacaa ccataggatg ataatgcgat 240
tagtttttta gccttatttc tggggtaatt aatcagcgaa gcgatgattt ttgatctatt 300
aacagatata taaatgcaaa aactgcataa ccactttaac taatactttc aacattttcg 360
gtttgtatta cttcttattc aaatgtaata aaagtatcaa caaaaaattg ttaatatacc 420
tctatacttt aacgtcaagg ag 442
<210>3
<211>248
<212>DNA
<213> Artificial sequence
<400>3
tcatgtaatt agttatgtca cgcttacatt cacgccctcc ccccacatcc gctctaaccg 60
aaaaggaagg agttagacaa cctgaagtct aggtccctat ttattttttt atagttatgt 120
tagtattaag aacgttattt atatttcaaa tttttctttt ttttctgtac agacgcgtgt 180
acgcatgtaa cattatactg aaaaccttgc ttgagaaggt tttgggacgc tcgaaggctt 240
taatttgc 248
<210>4
<211>200
<212>DNA
<213> Artificial sequence
<400>4
gaggatgtaa taatactaat ctcgaagatg ccatctaata catatagaca tacatatata 60
tatatataca ttctatatat tcttacccag attctttgag gtaagacggt tgggttttat 120
cttttgcagt tggtactatt aagaacaatc gaatcataag cattgcttac aaagaataca 180
catacgaaat attaacgata 200
<210>5
<211>200
<212>DNA
<213> Artificial sequence
<400>5
cgtgatttac atatactaca agtcgccagt gtaactcctc actgaatatg attcatacat 60
acccgtatgt attaatgtat aaatgttctc agagcaaatt ttatcgatat cttgtttgcc 120
agtggtatgc aggtttggca aattttttac cataatatcc gtttatagat tctggaacct 180
taccaacttt cttaccgcta 200
<210>6
<211>20
<212>DNA
<213> Artificial sequence
<400>6
tcctgcccag gccgctgagc 20
<210>7
<211>20
<212>DNA
<213> Artificial sequence
<400>7
attgtcagag gctacatcac 20
<210>8
<211>20
<212>DNA
<213> Artificial sequence
<400>8
actctgacag tttggtcaat 20
<210>9
<211>20
<212>DNA
<213> Artificial sequence
<400>9
actttacctc tggccaccaa 20
<210>10
<211>20
<212>DNA
<213> Artificial sequence
<400>10
ggacggtata ttgccattgg 20
<210>11
<211>20
<212>DNA
<213> Artificial sequence
<400>11
tatgtctcta actttacctc 20
<210>12
<211>1530
<212>DNA
<213> Artificial sequence
<400>12
cgcctggata gagtatacag attgaagagg gtcgttaggg cttaccgcgt agcacgtgca 60
ttgcgagtgt ccagggccat taggctacct agattgtttc gaattgatag attcccgcga 120
gtggataggg gttgcagagc aaggagactg atcagagcta agagattaaa gcgcgtagtt 180
cgtgcactta gagtatctag ggttactaga ggttctagat tagagagggg cgaaagagca 240
caaagaggta aaagaggaat gaggttatcg agaattgcgc gtggtgacag gatccccagg 300
gtcagtagag cagaaagatt cccaagagtt gacagattat ggagaattat cagattgcaa 360
agatttccca gagtggatcg tgcccacaga gtctgccgtc tatcaagagt ctcccgcgct 420
gagcggggtt ggagggtctg gcgtggtttc agagctgata gggggcctag gggtaaaagg 480
ggtatgcgac tttcgagagt ccaacgcgct tgtagggcct acagagttgc cagaggctat 540
aggggacttc gtattaccag actcagtagg atcgcaagag gcgatcgaat tccaagggtc 600
cagagagcgt gcagactatt gcgtgtgcaa agaggatggc gtgtatggag ggctgtgcga 660
gctgttagga tttcacgtgg gcaacgcttc tgtagaggac atcgaatacg cagaggggtc 720
cgcgtcatgc gagtggaacg attagttcgt atcagccgtg gacagagatt ttgcagaggt 780
cacagaatcc ggcgtttggg gagagttatg agggtcgaaa gggtgattag aatttctaga 840
ggacaacgct tttgtagagg ccatagaata aggagaggta cgagagttat gagggttgag 900
agattatgga gattgtatcg ggttaatcgg gttaggcgga ttaaacgaat ttgtcggatc 960
gaaagaggta ttcgtgtaaa aaggggtaat agattacgaa ggattcaacg tatatttaga 1020
ctccatagag taccacgtat aggcagattg acaaggatag ttagggctgc tagaggcggt 1080
agattaaaca gagcctggag attgatgaga attttacgtg ctcacagagt ttgtcgaata 1140
gttcgactga acagaatcta tagagttcac agattcagaa ggatacagag aatcgtgaga 1200
atttggcgct ttactcgttt gcatagaggt cgtagagcag ctagaggagg tcggctattg 1260
aggctttccc gcgtacaaag agcttgtcgt atcgtgaggt tatgtagatt tcatagaata 1320
catcgtgctg gtcgtgcctt cagagttcta cgggccaata gagcaatgag aatagctagg 1380
ggcgacagaa ttcctcgtgt ttttagagtt ggaagaatta tgagagcaag cagattatca 1440
agagtatcta gagccgaaag ggcccataga gtatgtagac tggcgagagc gacaagggct 1500
ccaagaggtg caaggataaa taggctttgg 1530
<210>13
<211>3340
<212>DNA
<213> Artificial sequence
<400>13
gaggatgtaa taatactaat ctcgaagatg ccatctaata catatagaca tacatatata 60
tatatataca ttctatatat tcttacccag attctttgag gtaagacggt tgggttttat 120
cttttgcagt tggtactatt aagaacaatc gaatcataag cattgcttac aaagaataca 180
catacgaaat attaacgata cggattagaa gccgccgagc gggtgacagc cctccgaagg 240
aagactctcc tccgtgcgtc ctcgtcttca ccggtcgcgt tcctgaaacg cagatgtgcc 300
tcgcgccgca ctgctccgaa caataaagat tctacaatac tagcttttat ggttatgaag 360
aggaaaaatt ggcagtaacc tggccccaca aaccttcaaa tgaacgaatc aaattaacaa 420
ccataggatg ataatgcgat tagtttttta gccttatttc tggggtaatt aatcagcgaa 480
gcgatgattt ttgatctatt aacagatata taaatgcaaa aactgcataa ccactttaac 540
taatactttc aacattttcg gtttgtatta cttcttattc aaatgtaata aaagtatcaa 600
caaaaaattg ttaatatacc tctatacttt aacgtcaagg agatgcgcct ggatagagta 660
tacagattga agagggtcgt tagggcttac cgcgtagcac gtgcattgcg agtgtccagg 720
gccattaggc tacctagatt gtttcgaatt gatagattcc cgcgagtgga taggggttgc 780
agagcaagga gactgatcag agctaagaga ttaaagcgcg tagttcgtgc acttagagta 840
tctagggtta ctagaggttc tagattagag aggggcgaaa gagcacaaag aggtaaaaga 900
ggaatgaggt tatcgagaat tgcgcgtggt gacaggatcc ccagggtcag tagagcagaa 960
agattcccaa gagttgacag attatggaga attatcagat tgcaaagatt tcccagagtg 1020
gatcgtgccc acagagtctg ccgtctatca agagtctccc gcgctgagcg gggttggagg 1080
gtctggcgtg gtttcagagc tgataggggg cctaggggta aaaggggtat gcgactttcg 1140
agagtccaac gcgcttgtag ggcctacaga gttgccagag gctatagggg acttcgtatt 1200
accagactca gtaggatcgc aagaggcgat cgaattccaa gggtccagag agcgtgcaga 1260
ctattgcgtg tgcaaagagg atggcgtgta tggagggctg tgcgagctgt taggatttca 1320
cgtgggcaac gcttctgtag aggacatcga atacgcagag gggtccgcgt catgcgagtg 1380
gaacgattag ttcgtatcag ccgtggacag agattttgca gaggtcacag aatccggcgt 1440
ttggggagag ttatgagggt cgaaagggtg attagaattt ctagaggaca acgcttttgt 1500
agaggccata gaataaggag aggtacgaga gttatgaggg ttgagagatt atggagattg 1560
tatcgggtta atcgggttag gcggattaaa cgaatttgtc ggatcgaaag aggtattcgt 1620
gtaaaaaggg gtaatagatt acgaaggatt caacgtatat ttagactcca tagagtacca 1680
cgtataggca gattgacaag gatagttagg gctgctagag gcggtagatt aaacagagcc 1740
tggagattga tgagaatttt acgtgctcac agagtttgtc gaatagttcg actgaacaga 1800
atctatagag ttcacagatt cagaaggata cagagaatcg tgagaatttg gcgctttact 1860
cgtttgcata gaggtcgtag agcagctaga ggaggtcggc tattgaggct ttcccgcgta 1920
caaagagctt gtcgtatcgt gaggttatgt agatttcata gaatacatcg tgctggtcgt 1980
gccttcagag ttctacgggc caatagagca atgagaatag ctaggggcga cagaattcct 2040
cgtgttttta gagttggaag aattatgaga gcaagcagat tatcaagagt atctagagcc 2100
gaaagggccc atagagtatg tagactggcg agagcgacaa gggctccaag aggtgcaagg 2160
ataaataggc tttgggttag taagggagaa gagttattta caggggtcgt tcctatatta 2220
gtagaacttg atggcgacgt taatggacat aaatttagtg tttcaggtga aggagaaggt 2280
gatgcaacgt acggtaaact gactctaaag ttcatttgca ccaccggtaa attgcctgta 2340
ccgtggccaa cactagttac tacgttaaca tacggcgtac agtgtttttc gagatatcca 2400
gaccacatga aacaacacga ctttttcaaa tccgcaatgc cagaaggtta cgtccaggaa 2460
cgtactattt tcttcaaaga tgatggaaat tataaaacca gggctgaagt gaaatttgaa 2520
ggcgacactc tagtgaacag aattgagttg aaggggattg atttcaagga agacgggaac 2580
atactcggtc ataagctgga gtacaactat aattcccata acgtctatat tatggcggat 2640
aagcaaaaga atggtatcaa ggttaacttt aaaatccggc acaatatcga agatggctct 2700
gtacaattgg ccgatcatta tcaacaaaat acacctattg gagatggtcc cgtgttgtta 2760
ccagacaatc attacttgtc aacacaatct gctttaagca aagatcccaa tgagaaaaga 2820
gatcatatgg tcttgttaga gtttgttact gccgctggta taactctggg tatggatgaa 2880
ctttataaat aatcatgtaa ttagttatgt cacgcttaca ttcacgccct ccccccacat 2940
ccgctctaac cgaaaaggaa ggagttagac aacctgaagt ctaggtccct atttattttt 3000
ttatagttat gttagtatta agaacgttat ttatatttca aatttttctt ttttttctgt 3060
acagacgcgt gtacgcatgt aacattatac tgaaaacctt gcttgagaag gttttgggac 3120
gctcgaaggc tttaatttgc cgtgatttac atatactaca agtcgccagt gtaactcctc 3180
actgaatatg attcatacat acccgtatgt attaatgtat aaatgttctc agagcaaatt 3240
ttatcgatat cttgtttgcc agtggtatgc aggtttggca aattttttac cataatatcc 3300
gtttatagat tctggaacct taccaacttt cttaccgcta 3340

Claims (10)

1. An information storage medium, wherein the information storage medium is a nucleic acid molecule;
the information storage medium comprises a fusion gene consisting of a codon sequence corresponding to the stored information and a fluorescent protein gene.
2. The information storage medium of claim 1, wherein each character of the stored information is represented by a number of consecutive amino acids, each character of the stored information corresponding to a codon sequence consisting of a number of nucleotide residues;
preferably, each character of the stored information is represented by three consecutive amino acids, and each character of the stored information corresponds to a codon sequence consisting of 9 nucleotide residues;
preferably, the codon sequence corresponding to the stored information is positioned at the 5' end of the fusion gene, and the fluorescent protein gene is positioned at the 3' end of the fusion gene;
preferably, the fluorescent protein gene includes a green fluorescent protein gene or a red fluorescent protein gene.
3. The information storage medium of claim 1 or 2, further comprising a promoter and a terminator;
preferably, the promoter is the GAL1 promoter;
preferably, the terminator is the CYC1 terminator;
preferably, the information storage medium further comprises a 5' recombination arm and a 3' recombination arm;
preferably, the length of the 5' recombination arm and the 3' recombination arm is 50-200 bp.
4. The information storage medium of any one of claims 1 to 3, wherein the information storage medium comprises, in order from 5' to 3', a 5' recombination arm, a GAL1 promoter, a codon sequence corresponding to the stored information, a green fluorescent protein gene, a CYC1 terminator, and a 3' recombination arm;
preferably, the green fluorescent protein comprises a nucleic acid sequence shown as SEQ ID NO. 1;
preferably, the GAL1 promoter comprises the nucleic acid sequence shown in SEQ ID NO. 2;
preferably, the CYC1 terminator comprises the nucleic acid sequence set forth in SEQ ID NO. 3;
preferably, the 5' recombination arm comprises the nucleic acid sequence shown in SEQ ID NO. 4;
preferably, the 3' recombination arm comprises the nucleic acid sequence shown in SEQ ID NO. 5.
5. An information storage method, the method comprising:
integrating the information storage medium according to any one of claims 1 to 4 into the Saccharomyces cerevisiae genome for information storage.
6. The method of claim 5, wherein the method comprises:
transforming the information storage medium of any one of claims 1-4 into competent Saccharomyces cerevisiae simultaneously with a CRISPR/Cas9 gene editing plasmid targeting a Saccharomyces cerevisiae gene;
performing a preliminary screen for positive colonies, and carrying out PCR detection to obtain positive clones carrying the plasmid;
and extracting the genome of the positive clones, and performing sequencing verification to obtain the Saccharomyces cerevisiae integrated with the information storage medium.
7. The method of claim 5 or 6, wherein the CRISPR/Cas9 gene editing plasmid contains a sgRNA targeting the Saccharomyces cerevisiae ADE1 or ADE2 gene;
preferably, the CRISPR/Cas9 gene editing plasmid contains a sgRNA targeting the 5' sequence of the Saccharomyces cerevisiae ADE1 or ADE2 gene;
preferably, the sgRNA includes a nucleic acid sequence shown in SEQ ID NO. 6-11.
8. The method according to any one of claims 5 to 7, wherein the information storage medium is prepared by:
synthesizing a plurality of alternating forward-strand and reverse-strand sequences as PCR templates, wherein adjacent forward and reverse sequences have an overlapping region, and designing a head primer and a tail primer to carry out PCR amplification to obtain the information storage medium;
preferably, the length of the forward-strand and reverse-strand sequences is 55-70 bp;
preferably, the length of the overlapping region is 20-25 bp;
preferably, the length of the head primer and the tail primer is 55-70 bp.
9. The method of any one of claims 5-8, wherein the positive colonies are red;
preferably, the sequencing comprises Sanger sequencing.
10. An information storage kit, characterized in that the kit comprises the information storage medium according to any one of claims 1 to 4;
preferably, the kit further comprises a CRISPR/Cas9 gene editing plasmid targeting the saccharomyces cerevisiae gene;
preferably, the kit further comprises saccharomyces cerevisiae.
CN202010443536.6A 2020-05-22 2020-05-22 Information storage medium, information storage method and application Pending CN111440827A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010443536.6A CN111440827A (en) 2020-05-22 2020-05-22 Information storage medium, information storage method and application

Publications (1)

Publication Number Publication Date
CN111440827A true CN111440827A (en) 2020-07-24

Family

ID=71657058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010443536.6A Pending CN111440827A (en) 2020-05-22 2020-05-22 Information storage medium, information storage method and application

Country Status (1)

Country Link
CN (1) CN111440827A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103975063A (en) * 2011-11-23 2014-08-06 帝斯曼知识产权资产管理有限公司 Nucleic acid assembly system
CN104603273A (en) * 2012-03-12 2015-05-06 帝斯曼知识产权资产管理有限公司 Recombination system
CN106086061A (en) * 2016-07-27 2016-11-09 苏州泓迅生物科技有限公司 A kind of genes of brewing yeast group editor's carrier based on CRISPR Cas9 system and application thereof
CN106191099A (en) * 2016-07-27 2016-12-07 苏州泓迅生物科技有限公司 A kind of parallel multiple editor's carrier of genes of brewing yeast group based on CRISPR Cas9 system and application thereof
CN107723287A (en) * 2016-08-12 2018-02-23 中国科学院天津工业生物技术研究所 A kind of expression system for strengthening silk-fibroin production and preparing
CN108517331A (en) * 2018-03-19 2018-09-11 安徽希普生物科技有限公司 A kind of engineering bacteria construction method of amalgamation and expression antibacterial peptide and red fluorescent protein
CN109072243A (en) * 2016-02-18 2018-12-21 哈佛学院董事及会员团体 Pass through the method and system for the molecule record that CRISPR-CAS system carries out
CN109460822A (en) * 2018-11-19 2019-03-12 天津大学 Information storage means based on DNA
US20200063119A1 (en) * 2018-08-22 2020-02-27 Massachusetts Institute Of Technology In vitro dna writing for information storage

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112489721A (en) * 2020-11-25 2021-03-12 清华大学 Mirror image protein information storage and coding technology
CN112489721B (en) * 2020-11-25 2021-11-12 清华大学 Mirror image protein information storage and coding technology
CN113462710A (en) * 2021-06-30 2021-10-01 清华大学 Random rewriting DNA information storage method
CN113462710B (en) * 2021-06-30 2023-07-11 清华大学 DNA information storage method capable of randomly rewriting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200724)