CN114196700A - Method and kit for direct reprogramming of hepatocytes into islet-like cells - Google Patents

Method and kit for direct reprogramming of hepatocytes into islet-like cells

Info

Publication number
CN114196700A
Authority
CN
China
Prior art keywords
sequence
sgrnas
cells
casilio
puf domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010985873.8A
Other languages
Chinese (zh)
Inventor
杨晓菲
李富荣
王晗月
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Peoples Hospital
Original Assignee
Shenzhen Peoples Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Peoples Hospital
Priority to CN202010985873.8A
Publication of CN114196700A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06 Animal cells or tissues; Human cells or tissues
    • C12N5/0602 Vertebrate cells
    • C12N5/0676 Pancreatic cells
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C12N2510/00 Genetically modified cells
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/10 Plasmid DNA
    • C12N2800/106 Plasmid DNA for vertebrates
    • C12N2800/107 Plasmid DNA for vertebrates for mammalian

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Biomedical Technology (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Cell Biology (AREA)
  • Medicinal Chemistry (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention discloses a method and a kit for direct reprogramming of hepatocytes into islet-like cells. The invention provides a method for preparing insulin-secreting cells from hepatocytes, comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain recombinant cells, which are the insulin-secreting cells. The three functional elements of the Casilio system are the dCas9 protein, sgRNAs with PUF domain binding sites attached, and an effector protein fused to a PUF domain. The sgRNAs with PUF domain binding sites consist of 12 sgRNAs, whose target sequences are shown, in order, as sequences 1 to 12 in the sequence table. The invention provides a new research approach and technical means for islet cell regeneration, lays an experimental foundation for the clinical treatment of diabetes, and has clinical application value.

Description

Method and kit for direct reprogramming of hepatocytes into islet-like cells
Technical Field
The present invention relates to a method and kit for direct reprogramming of hepatocytes into islet-like cells.
Background
Diabetes mellitus is a chronic metabolic disease caused by insufficient insulin secretion or by insulin resistance. Allogeneic islet transplantation is currently one of the most promising treatments for diabetes, but the shortage of islet donors and immune rejection limit its clinical application.
Direct cell reprogramming is the direct transdifferentiation of one differentiated cell type into another without dedifferentiation and redifferentiation. The liver and the pancreas are both derived from the ventral foregut endoderm, which makes interconversion between liver cells and pancreatic cells possible. The most critical step in achieving direct reprogramming of hepatocytes into islet beta cells in vitro is activating the expression of pancreatic transcription factors. Akinci et al. screened three transcription factors, Pdx1, Ngn3 and MafA (PNM), from 20 transcription factors related to pancreatic beta cell development and directly reprogrammed exocrine cells into islet-like cells by retroviral delivery. Banga et al. injected an adenovirus carrying the PNM transcription factors in tandem into mice via the tail vein and found insulin-secreting cells (IPCs) only in the liver. Key transcription factors of pancreatic development, such as PNM, are therefore often used as target genes for direct reprogramming of liver cells into pancreatic beta cells. At present, the efficiency of direct reprogramming and the maturity of the resulting cells are low; traditional reprogramming methods cannot effectively open the regulatory network of the relevant endogenous islet cell genes, and the cells obtained cannot meet the requirements of clinical treatment.
CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/Cas9) is an important new gene editing technology that followed zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs). The Casilio system consists of the dCas9 protein, an sgRNA with one or more PUF domain binding sites attached (sgRNA-PBS), and an effector protein fused to a PUF domain (PUF fusion protein).
Disclosure of Invention
It is an object of the present invention to provide a method and kit for direct reprogramming of hepatocytes into islet-like cells.
The invention provides a method for preparing insulin-secreting cells from hepatocytes, comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain recombinant cells, which are the insulin-secreting cells.
The invention also provides a method for direct reprogramming of hepatocytes into insulin-secreting cells, comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes.
The invention also provides a recombinant cell, prepared by a method comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain the recombinant cell.
The invention also provides a kit comprising either recombinant expression vectors containing DNA encoding the three functional elements of the Casilio system, or the DNA encoding those three functional elements itself; the kit is used for preparing insulin-secreting cells from hepatocytes. The kit may further comprise PiggyBac transposase, DNA encoding PiggyBac transposase, or a recombinant expression vector containing DNA encoding PiggyBac transposase.
The invention also protects the use of recombinant expression vectors for the three functional elements of the Casilio system, or of DNA encoding the three functional elements of the Casilio system, in the preparation of a kit; the kit is used for preparing insulin-secreting cells from hepatocytes.
In any of the above, the three functional elements of the Casilio system are: the dCas9 protein, sgRNAs with PUF domain binding sites attached, and an effector protein fused to a PUF domain. The sgRNAs with PUF domain binding sites consist of 12 sgRNAs, whose target sequences are shown, in order, as sequences 1 to 12 in the sequence table.
Effector proteins fused to PUF domains are transcriptional activators fused to PUF domains.
The DNA coding for the three functional elements of the Casilio system can be introduced into hepatocytes simultaneously or in steps.
Specifically, the DNA encoding the dCas9 protein is introduced, the DNA encoding the effector protein fused to the PUF domain is introduced, and the DNA encoding the sgRNAs with PUF domain binding sites attached is introduced.
Specifically, to promote integration of the foreign DNA into the genomic DNA of the cell, the DNA encoding the dCas9 protein is introduced together with DNA encoding PiggyBac transposase.
Specifically, to promote integration of the foreign DNA into the genomic DNA of the cell, the DNA encoding the effector protein fused to the PUF domain is introduced together with DNA encoding PiggyBac transposase.
In particular, DNA encoding dCas9 protein is inducibly expressed in cells. The inducible expression is based on the tet-on system.
In particular, DNA encoding an effector protein fused to a PUF domain is inducibly expressed in a cell. The inducible expression is based on the tet-on system.
Specifically, a DNA encoding dCas9 protein is introduced into cells by a recombinant expression vector. The recombinant expression vector can be specifically PMax-NLS-dCas9 plasmid or PB3-PMax-dCas9 plasmid.
Specifically, the DNA encoding the effector protein fused to the PUF domain is introduced into cells via a recombinant expression vector. The recombinant expression vector can be the Pmax-NLSPUFa_p65-HSF1 plasmid or the PB3-neo(-)-Pmax-NLSPUFa_p65-HSF1 plasmid.
Specifically, the coding DNA of PiggyBac transposase is introduced into cells through a recombinant expression vector. The recombinant expression vector can be a PiggyBac plasmid.
The 12 sgRNAs are named, in order, Pdx1-sgRNA1, Pdx1-sgRNA2, Pdx1-sgRNA3, Pdx1-sgRNA4, Ngn3-sgRNA1, Ngn3-sgRNA2, Ngn3-sgRNA3, Ngn3-sgRNA4, MafA-sgRNA1, MafA-sgRNA2, MafA-sgRNA3 and MafA-sgRNA4.
Specifically, the DNA encoding the 12 sgRNAs is introduced into cells via 12 corresponding recombinant expression vectors. In the examples, the 12 recombinant expression vectors are recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3, Pdx1-gRNA4, Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3, Ngn3-gRNA4, MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4. The starting vector of all 12 recombinant expression vectors is the pX-sgRNA-5xPBSA plasmid.
The recombinant expression vectors containing DNA encoding the three functional elements of the Casilio system may specifically be a recombinant expression vector containing DNA encoding the dCas9 protein, a recombinant expression vector containing DNA encoding the effector protein fused to the PUF domain, and recombinant expression vectors containing DNA encoding the sgRNAs with PUF domain binding sites attached. The sgRNAs with PUF domain binding sites consist of 12 sgRNAs, whose target sequences are shown, in order, as sequences 1 to 12 in the sequence table. Accordingly, the recombinant expression vectors containing DNA encoding the sgRNAs with PUF domain binding sites attached consist of 12 recombinant expression vectors. In the examples, the 12 recombinant expression vectors are recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3, Pdx1-gRNA4, Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3, Ngn3-gRNA4, MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4. The starting vector of all 12 recombinant expression vectors is the pX-sgRNA-5xPBSA plasmid.
The recombinant expression vector containing the coding DNA of the PiggyBac transposase can be a PiggyBac plasmid.
The 12 sgRNAs differ only in the target sequence binding region. Each of the 12 sgRNAs consists, from upstream to downstream, of a target sequence binding region and a constant region. The constant region is shown as sequence 20 in the sequence table. The target sequence binding region is the RNA corresponding to the target sequence with the three nucleotides at its 3' end removed. The target sequences are those in Table 1.
The invention also protects sgRNA combinations as (a) or (b) below:
(a) the sgRNA combination consists of 12 sgRNAs; target sequences of the 12 sgRNAs are sequentially shown as a sequence 1 to a sequence 12 in a sequence table;
(b) sgRNA combinations consisting of 4 sgRNAs; the target sequences of the 4 sgRNAs are sequentially shown as a sequence 5 to a sequence 8 in a sequence table.
The invention also protects the use of the sgRNA combination in (c1), (c2) or (c3) below:
(c1) preparing insulin secreting cells from hepatocytes;
(c2) direct reprogramming of hepatocytes into insulin-secreting cells;
(c3) preparing a kit; the kit has the function of preparing insulin secreting cells by using liver cells.
The invention also provides a method for preparing insulin secreting cells from hepatocytes, comprising the steps of: the coding DNA of the three functional elements of the Casilio system is introduced into the liver cells to obtain recombinant cells, namely the insulin secreting cells.
The invention also provides a recombinant cell, and the preparation method comprises the following steps: DNA encoding three functional elements of the Casilio system was introduced into hepatocytes to obtain recombinant cells.
The invention also protects the application of three functional elements of the Casilio system in the preparation of the kit; the kit has the function of preparing insulin secreting cells by using liver cells.
The three functional elements of the Casilio system are: the dCas9 protein, sgRNAs with PUF domain binding sites attached, and an effector protein fused to a PUF domain. The sgRNAs with PUF domain binding sites consist of 4 sgRNAs, whose target sequences are shown, in order, as sequences 5 to 8 in the sequence table.
The 4 sgRNAs are named, in order, Ngn3-sgRNA1, Ngn3-sgRNA2, Ngn3-sgRNA3 and Ngn3-sgRNA4.
Specifically, the DNA encoding the 4 sgRNAs is introduced into cells via 4 recombinant expression vectors. In the examples, the 4 recombinant expression vectors are recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4. The starting vector of all 4 recombinant expression vectors is the pX-sgRNA-5xPBSA plasmid.
The recombinant expression vectors containing DNA encoding the three functional elements of the Casilio system may specifically be a recombinant expression vector containing DNA encoding the dCas9 protein, a recombinant expression vector containing DNA encoding the effector protein fused to the PUF domain, and recombinant expression vectors containing DNA encoding the sgRNAs with PUF domain binding sites attached. The sgRNAs with PUF domain binding sites consist of 4 sgRNAs, whose target sequences are shown, in order, as sequences 5 to 8 in the sequence table. Accordingly, the recombinant expression vectors containing DNA encoding the sgRNAs with PUF domain binding sites attached consist of 4 recombinant expression vectors. In the examples, the 4 recombinant expression vectors are recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4.
The 4 sgRNAs differ only in the target sequence binding region. Each of the 4 sgRNAs consists, from upstream to downstream, of a target sequence binding region and a constant region. The constant region is shown as sequence 20 in the sequence table. The target sequence binding region is the RNA corresponding to the target sequence with the three nucleotides at its 3' end removed. The target sequences are those in Table 1.
The amino acid sequence of the dCas9 protein is identical to that encoded by nucleotides 1125-5225 of sequence 16 in the sequence table.
The amino acid sequence of the PiggyBac transposase is identical to that encoded by nucleotides 953-2737 of sequence 14 in the sequence table.
The effector protein fused to the PUF domain comprises two segments, PUFa and P65-HSF1. The amino acid sequence of the PUFa domain is identical to that encoded by nucleotides 1248-2333 of sequence 15 in the sequence table. The amino acid sequence of P65-HSF1 is identical to that encoded by nucleotides 2445-3467 of sequence 15 in the sequence table.
The DNA encoding the dCas9 protein is shown as nucleotides 1125-5225 of sequence 16 in the sequence table.
The DNA encoding the PUFa domain is shown as nucleotides 1248-2333 of sequence 15 in the sequence table.
The DNA encoding P65-HSF1 is shown as nucleotides 2445-3467 of sequence 15 in the sequence table.
The DNA encoding the effector protein fused to the PUF domain is shown as nucleotides 1248-3467 of sequence 15 in the sequence table.
The DNA encoding the PiggyBac transposase is shown as nucleotides 953-2737 of sequence 14 in the sequence table.
Any of the above hepatocytes may specifically be HepG2 cells.
Direct reprogramming of hepatocytes into islet-like cells as currently practiced still suffers from low conversion efficiency and poor maturity, mainly because epigenetic barriers exist between different mature somatic cell types and the regulatory network of islet-related endogenous genes cannot be opened effectively.
Based on the CRISPR/dCas9 Casilio system, the inventors designed 4 gRNAs for each of the Pdx1, Ngn3 and MafA promoter regions, 12 gRNAs in total. Introducing 4 gRNAs together has an additive effect on activation efficiency compared with introducing a single gRNA. The Casilio system achieves specific activation of the target genes, and this regulation shows a dose effect. Furthermore, hepatocytes can be converted into islet-like cells either by targeted activation of the three endogenous key transcription factors PNM or by targeted activation of endogenous Ngn3 alone, which overcomes problems of introducing exogenous factors into living cells, such as poor specificity for the endogenous factors, short duration of expression and low expression level.
The inventors combined the CRISPR/dCas9-based Casilio system with the PB transposon system and used the transcriptional activator P65-HSF1 to alter histone modification states and activate efficient endogenous expression of the key pancreatic transcription factors PNM. The dCas9 and P65-HSF1 fragments were integrated into the genome of the HepG2 liver cell line, and hepatocytes were thereby directly reprogrammed into islet-like cells with a reprogramming efficiency of 10-15%. The invention requires no exogenous transcription regulatory factors, which is significant for the safety and specificity of biomedical research. The invention provides a new research approach and technical means for islet cell regeneration, lays an experimental foundation for the clinical treatment of diabetes, and has clinical application value.
Drawings
FIG. 1 is a schematic diagram of the sgRNA activation positions selected in the region 1000 bp upstream of the transcription start sites of the Pdx1, Ngn3 and MafA genes.
FIG. 2 shows the results of RT-PCR detection in step one of example 1.
FIG. 3 shows the results of immunofluorescence detection of cells in step two of example 1.
FIG. 4 shows the results of RT-PCR detection in step two of example 1.
FIG. 5 shows the results of Western Blot in step three of example 2.
FIG. 6 shows the results of immunofluorescence detection of cells in step three of example 2.
FIG. 7 is a photograph of the cell morphology observed in step three of example 2.
FIG. 8 shows the results of step one of example 3.
FIG. 9 is a fluorescence photograph of the cells obtained in step two of example 3 after culturing for 72 hours.
FIG. 10 is a fluorescence photograph of the pnm-gRNA group during culture in step two of example 3.
FIG. 11 shows the statistical results for FIG. 10.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention. The examples provided below serve as a guide for further modifications by a person skilled in the art and do not constitute a limitation of the invention in any way.
The experimental procedures in the following examples, unless otherwise indicated, are conventional and are carried out according to the techniques or conditions described in the literature in the field or according to the instructions of the products. Materials, reagents and the like used in the following examples are commercially available unless otherwise specified. The plasmids in the examples are all double-stranded, circularized DNA molecules. HepG2 cells: human liver cancer cells.
The pX-sgRNA-5xPBSA plasmid is shown as sequence 13 in the sequence table. In sequence 13, the insertion site lies between nucleotides 257 and 258; after a DNA molecule encoding a target sequence binding region is inserted, the recombinant plasmid expresses an sgRNA with 5 PUF domain binding sites attached. In sequence 13, nucleotides 262-343 encode the sgRNA scaffold, nucleotides 361-368 encode the 1st PUF domain binding site, nucleotides 372-379 encode the 2nd PUF domain binding site, nucleotides 383-390 encode the 3rd PUF domain binding site, nucleotides 394-401 encode the 4th PUF domain binding site, and nucleotides 405-412 encode the 5th PUF domain binding site.
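The coordinates given for sequence 13 can be sanity-checked with a short script. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive, as is usual for a sequence listing; the dictionary and helper names are illustrative and are not part of the original disclosure.

```python
# Feature coordinates on sequence 13 (pX-sgRNA-5xPBSA), copied from the text.
# Assumption: coordinates are 1-based and inclusive.
FEATURES = {
    "sgRNA backbone": (262, 343),
    "PBS 1": (361, 368),
    "PBS 2": (372, 379),
    "PBS 3": (383, 390),
    "PBS 4": (394, 401),
    "PBS 5": (405, 412),
}


def length(span):
    """Length in nucleotides of a 1-based inclusive (start, end) span."""
    start, end = span
    return end - start + 1


# The five PUF domain binding sites (PBS), in order along the plasmid.
pbs_spans = [FEATURES[f"PBS {i}"] for i in range(1, 6)]
pbs_lengths = [length(s) for s in pbs_spans]

# Nucleotides lying between the end of one PBS and the start of the next.
spacers = [nxt[0] - prev[1] - 1 for prev, nxt in zip(pbs_spans, pbs_spans[1:])]

print("backbone length:", length(FEATURES["sgRNA backbone"]))
print("PBS lengths:", pbs_lengths)
print("spacer lengths:", spacers)
```

Under that assumption, each binding site spans 8 nucleotides and consecutive sites are separated by 3 nucleotides, which agrees with the description of five regularly spaced binding sites.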
The piggyBac plasmid is shown as a sequence 14 in a sequence table. In the sequence 14, the 953-2737 th nucleotide codes for PiggyBac transposase (PBase).
Example 1 Design and comparison of sgRNAs
The Pmax-NLSPUFa_p65-HSF1 plasmid is shown as sequence 15 in the sequence table. In sequence 15, nucleotides 1008-3476 are the coding frame; nucleotides 1098-1169 encode 3 consecutive NLSs, nucleotides 1248-2333 encode the PUFa domain, and nucleotides 2445-3467 encode P65-HSF1.
The PMax-NLS-dCas9 plasmid is shown as sequence 16 in the sequence table. In sequence 16, nucleotides 1008-5315 are the coding frame; nucleotides 1095-1115 encode the NLS (nuclear localization signal), and nucleotides 1125-5225 encode the dCas9 protein.
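As a consistency check on the coordinates of sequences 15 and 16, the sketch below tests whether each annotated feature lies inside its coding frame, has a length divisible by three, and starts on a codon boundary of that frame. It assumes 1-based inclusive coordinates; the data structure and function name are illustrative only, and the check says nothing about the actual nucleotide content of the plasmids.

```python
# Coordinates copied from the description of sequences 15 and 16.
# Assumption: 1-based, inclusive coordinates.
PLASMIDS = {
    "Pmax-NLSPUFa_p65-HSF1 (sequence 15)": {
        "frame": (1008, 3476),
        "features": {
            "3x NLS": (1098, 1169),
            "PUFa": (1248, 2333),
            "P65-HSF1": (2445, 3467),
        },
    },
    "PMax-NLS-dCas9 (sequence 16)": {
        "frame": (1008, 5315),
        "features": {
            "NLS": (1095, 1115),
            "dCas9": (1125, 5225),
        },
    },
}


def check_in_frame(frame, span):
    """Return (codon_count, in_frame) for a feature within a coding frame."""
    frame_start, frame_end = frame
    start, end = span
    n = end - start + 1
    inside = frame_start <= start and end <= frame_end
    # In frame: whole number of codons, starting on a codon boundary.
    in_frame = inside and n % 3 == 0 and (start - frame_start) % 3 == 0
    return n // 3, in_frame


report = {}
for plasmid, info in PLASMIDS.items():
    for name, span in info["features"].items():
        report[name] = check_in_frame(info["frame"], span)
        print(f"{plasmid}: {name} -> codons={report[name][0]}, in frame={report[name][1]}")
```

With the stated coordinates, every feature passes the check; for example, the dCas9 segment spans 4101 nucleotides (1367 codons) and the PUFa segment spans 1086 nucleotides (362 codons).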
Step one. Establishment and verification of the Casilio system for targeted activation of Pdx1, Ngn3 and MafA
The sgRNA activation positions were selected in the region 1000 bp upstream of the transcription start sites of the Pdx1, Ngn3 and MafA genes. A schematic is shown in FIG. 1. Four gRNAs were designed for each gene; the target sequences are shown in Table 1.
TABLE 1
gRNA          Sequence table no.   Target sequence (5'→3')
Pdx1-gRNA1    Sequence 1           GAACCCACAGCCAGCGCGGACCGG
Pdx1-gRNA2    Sequence 2           GTTCAGCCGGGGGCCGTGATTGG
Pdx1-gRNA3    Sequence 3           GAACAAAAGCAGGTGCTCGCGGG
Pdx1-gRNA4    Sequence 4           GCTGGCGGTGCTCCCCAAAATGG
Ngn3-gRNA1    Sequence 5           GCCACCGGCCAATCAGCGCCGGG
Ngn3-gRNA2    Sequence 6           GGATTCCGGACAAAGGGCCGGGG
Ngn3-gRNA3    Sequence 7           GTGCTCTCTCGAGGGCGGGCTGGG
Ngn3-gRNA4    Sequence 8           GAGCCTCGTGTGGCTCTGGTCAGG
MafA-gRNA1    Sequence 9           GCGCAGGGAAAAGTTTCACGTGG
MafA-gRNA2    Sequence 10          GCCAGGTGTCTCGGGCGACCCCGG
MafA-gRNA3    Sequence 11          GCCGCCGCCTCGGGCTGCTCCGGG
MafA-gRNA4    Sequence 12          GCGTTTAGCCGTGGGAGGCGGGG
1. 12 recombinant plasmids were constructed.
Twelve recombinant plasmids were constructed from the starting vector, the pX-sgRNA-5xPBSA plasmid, by inserting an exogenous DNA molecule between nucleotides 257 and 258 of sequence 13 in the sequence table. The exogenous DNA molecule inserted to obtain each recombinant plasmid is as follows:
Recombinant plasmid Pdx1-gRNA1: nucleotides 1-21 of sequence 1 in the sequence table.
Recombinant plasmid Pdx1-gRNA2: nucleotides 1-20 of sequence 2 in the sequence table.
Recombinant plasmid Pdx1-gRNA3: nucleotides 1-20 of sequence 3 in the sequence table.
Recombinant plasmid Pdx1-gRNA4: nucleotides 1-20 of sequence 4 in the sequence table.
Recombinant plasmid Ngn3-gRNA1: nucleotides 1-20 of sequence 5 in the sequence table.
Recombinant plasmid Ngn3-gRNA2: nucleotides 1-20 of sequence 6 in the sequence table.
Recombinant plasmid Ngn3-gRNA3: nucleotides 1-21 of sequence 7 in the sequence table.
Recombinant plasmid Ngn3-gRNA4: nucleotides 1-21 of sequence 8 in the sequence table.
Recombinant plasmid MafA-gRNA1: nucleotides 1-20 of sequence 9 in the sequence table.
Recombinant plasmid MafA-gRNA2: nucleotides 1-21 of sequence 10 in the sequence table.
Recombinant plasmid MafA-gRNA3: nucleotides 1-21 of sequence 11 in the sequence table.
Recombinant plasmid MafA-gRNA4: nucleotides 1-20 of sequence 12 in the sequence table.
The 12 recombinant plasmids express the corresponding 12 sgRNAs. The 12 sgRNAs differ only in the target sequence binding region. Each of the 12 sgRNAs consists, from upstream to downstream, of a target sequence binding region and a constant region. The constant region is shown as sequence 20 in the sequence table. The target sequence binding region is the RNA corresponding to the target sequence with the three nucleotides at its 3' end removed.
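The rule for deriving each target sequence binding region from its Table 1 target sequence can be sketched as follows. This is a minimal illustration, assuming the binding region is simply the target sequence minus its last three nucleotides, written as RNA (T replaced by U); the function and variable names are hypothetical. The constant region (sequence 20) is not reproduced in this section, so the sketch stops at the binding region.

```python
# Target sequences (5'->3') copied from Table 1.
TARGETS = {
    "Pdx1-gRNA1": "GAACCCACAGCCAGCGCGGACCGG",
    "Pdx1-gRNA2": "GTTCAGCCGGGGGCCGTGATTGG",
    "Pdx1-gRNA3": "GAACAAAAGCAGGTGCTCGCGGG",
    "Pdx1-gRNA4": "GCTGGCGGTGCTCCCCAAAATGG",
    "Ngn3-gRNA1": "GCCACCGGCCAATCAGCGCCGGG",
    "Ngn3-gRNA2": "GGATTCCGGACAAAGGGCCGGGG",
    "Ngn3-gRNA3": "GTGCTCTCTCGAGGGCGGGCTGGG",
    "Ngn3-gRNA4": "GAGCCTCGTGTGGCTCTGGTCAGG",
    "MafA-gRNA1": "GCGCAGGGAAAAGTTTCACGTGG",
    "MafA-gRNA2": "GCCAGGTGTCTCGGGCGACCCCGG",
    "MafA-gRNA3": "GCCGCCGCCTCGGGCTGCTCCGGG",
    "MafA-gRNA4": "GCGTTTAGCCGTGGGAGGCGGGG",
}


def binding_region(target_dna: str) -> str:
    """Drop the three 3'-terminal nucleotides and write the rest as RNA."""
    return target_dna[:-3].replace("T", "U")


regions = {name: binding_region(seq) for name, seq in TARGETS.items()}
trimmed = {name: seq[-3:] for name, seq in TARGETS.items()}

for name, rna in regions.items():
    print(f"{name}\t{len(rna)} nt\t{rna}\t(removed: {trimmed[name]})")
```

The resulting binding regions are 20 or 21 nucleotides long, matching the nucleotide ranges listed for the 12 recombinant plasmids. Every removed triplet has the form NGG, which would be consistent with a Cas9 protospacer-adjacent motif, although the text itself does not say so.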
2. Co-transfected cells
Test cells: 293T cells.
The plasmid was co-transfected into test cells with lipofectamine3000, and 48 hours after transfection, the cells were collected and subjected to RT-PCR detection and immunofluorescence detection.
dCas9+p65-HSF1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid and the PMax-NLS-dCas9 plasmid at a mass ratio of 1:1.
pool-gRNA-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3 and Pdx1-gRNA4, at a mass ratio (in that order) of 1:1:1/4:1/4:1/4:1/4.
pool-gRNA-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4, at a mass ratio (in that order) of 1:1:1/4:1/4:1/4:1/4.
pool-gRNA-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmids MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4, at a mass ratio (in that order) of 1:1:1/4:1/4:1/4:1/4.
pnm-gRNA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3, Pdx1-gRNA4, Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3, Ngn3-gRNA4, MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4, at a mass ratio (in that order) of 1:1:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12.
Untransfected 293T cells treated in parallel served as the Control group (ctrl).
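The mass ratios listed for the groups above translate into per-plasmid amounts once a total DNA mass per transfection is fixed. The sketch below shows that arithmetic. The total of 3000 ng is a hypothetical value chosen only for illustration, since the text does not state the amount of DNA used, and `plasmid_masses` is a made-up helper name.

```python
from fractions import Fraction


def plasmid_masses(ratios, total_ng):
    """Split a total DNA mass (ng) across plasmids according to a mass ratio."""
    parts = [Fraction(r) for r in ratios]
    total = sum(parts)
    return [float(total_ng * p / total) for p in parts]


# Ratios as listed: PUFa-p65-HSF1 plasmid, dCas9 plasmid, then the sgRNA plasmids.
pool_ratio = ["1", "1", "1/4", "1/4", "1/4", "1/4"]
pnm_ratio = ["1", "1"] + ["1/12"] * 12

TOTAL_NG = 3000  # hypothetical total DNA per well; not given in the text

print(plasmid_masses(pool_ratio, TOTAL_NG))
print(plasmid_masses(pnm_ratio, TOTAL_NG))
```

In both designs the sgRNA plasmids together account for the same mass as each of the two effector plasmids, so pooling 4 or 12 sgRNA plasmids does not change the total sgRNA plasmid dose.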
3. RT-PCR detection
Total RNA was extracted from the cells and reverse transcribed to obtain cDNA. Using the cDNA as template, the level of the Pdx1 gene was detected in the pool-gRNA-Pdx1 group, the level of the Ngn3 gene in the pool-gRNA-Ngn3 group, and the level of the MafA gene in the pool-gRNA-MafA group. The levels of the Pdx1, Ngn3 and MafA genes were all detected in the dCas9+p65-HSF1 group, the Control group and the pnm-gRNA group. The GAPDH gene was used as the reference gene. The mRNA expression level of each gene relative to GAPDH was calculated by the 2^(-ΔΔCt) method.
For detection of the Pdx1 gene: the sequence of the upstream primer is 5'-GATTGGCGTTGTTTGTGGCT-3', and the sequence of the downstream primer is 5'-GCCGGCTTCTCTAAACAGGT-3'.
For detection of the Ngn3 gene: the sequence of the upstream primer is 5'-CGGTAGAAAGGATGACGCCT-3', and the sequence of the downstream primer is 5'-GGTCACTTCGTCTTCCGAGG-3'.
For detection of the MafA gene: the sequence of the upstream primer is 5'-AGAGCGAGAAGTGCCAACTC-3', and the sequence of the downstream primer is 5'-TGTACAGGTCCCGCTCTTTG-3'.
For detection of GAPDH gene: the sequence of the upstream primer is 5'-AGAAGGCTGGGGCTCATTTG-3', and the sequence of the downstream primer is 5'-AGGGGCCATCCACAGTCTTC-3'.
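The 2^(-ΔΔCt) calculation used in this step can be sketched as follows. This is a minimal illustration of the standard method, not the authors' analysis script. The function name is hypothetical and the Ct values are invented for the example; they are not data from the patent.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus the control group by 2^(-ddCt).

    ct_target / ct_ref: Ct of the target gene and of GAPDH in the test sample.
    ct_target_ctrl / ct_ref_ctrl: the same two Ct values in the Control group.
    """
    delta_sample = ct_target - ct_ref             # dCt of the test sample
    delta_control = ct_target_ctrl - ct_ref_ctrl  # dCt of the Control group
    return 2 ** -(delta_sample - delta_control)   # 2^(-ddCt)


# Hypothetical Ct values for illustration only.
fold = relative_expression(ct_target=24.0, ct_ref=18.0,
                           ct_target_ctrl=31.0, ct_ref_ctrl=18.0)
print(fold)
```

By construction the Control group evaluates to 1, which is why the results report background expression in the Control group as 1.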
The results are shown in FIG. 2 (mean of 4 replicates; * P < 0.05, ** P < 0.01, *** P < 0.001). Background gene expression in the Control group was set to 1. Compared with the Control group, the expression levels of the three genes in the dCas9+p65-HSF1 group (the no-sgRNA group) showed no significant difference. When the 4 gRNAs for a single gene were introduced, the Pdx1, Ngn3 and MafA genes were up-regulated 163-fold (P < 0.05), 503-fold (P < 0.05) and 12-fold (P < 0.001), respectively, relative to the Control group. When all 12 gRNAs were introduced simultaneously, the Pdx1 gene was up-regulated 62-fold (P < 0.001), the Ngn3 gene 301-fold (P < 0.01) and the MafA gene 12-fold (P < 0.05), relative to the Control group.
4. Immunofluorescence detection (IF detection)
The cells were fixed with 4% paraformaldehyde at room temperature for 20 min, blocked with 1% BSA at room temperature for 30 min, incubated with primary antibody at 4 ℃ overnight, washed 3 times with PBS-T (PBS containing 0.1% Triton X-100), incubated with the corresponding secondary antibody at room temperature for 2 h, washed 3 times with PBS-T, incubated with DAPI for 15 min, and then observed and photographed under a fluorescence microscope.
The primary antibody for detecting Pdx1 was a rabbit anti-human Pdx1 primary antibody (Abcam, working dilution 1:1000). The primary antibody for detecting Ngn3 was a mouse anti-human Ngn3 primary antibody (DSHB, Developmental Studies Hybridoma Bank, working dilution 1:50). The primary antibody for detecting MafA was a rabbit anti-human MafA primary antibody (Abcam, working dilution 1:1000).
The results are shown in FIG. 3. The results show that the Casilio system of the invention can realize the targeted activation of endogenous genes.
Secondly, screening for the optimal gRNAs targeting the Pdx1, Ngn3 and MafA promoter regions
The test cells were: 293T cells or HepG2 cells.
The plasmids were co-transfected into the test cells using Lipofectamine 3000; 48 hours after transfection, the cells were collected and subjected to RT-PCR detection according to the method of step one, 3.
dCas9+p65-HSF1 group (also referred to as the no-gRNA group): co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid and the PMax-NLS-dCas9 plasmid at a mass ratio of 1:1.
single-gRNA1-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid Pdx1-gRNA1 at a mass ratio (in that order) of 1:1:1.
single-gRNA2-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid Pdx1-gRNA2 at a mass ratio (in that order) of 1:1:1.
single-gRNA3-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid Pdx1-gRNA3 at a mass ratio (in that order) of 1:1:1.
single-gRNA4-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid Pdx1-gRNA4 at a mass ratio (in that order) of 1:1:1.
pool-gRNA-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3 and Pdx1-gRNA4 at a mass ratio (in that order) of 1:1:1/4:1/4:1/4:1/4.
single-gRNA1-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid Ngn3-gRNA1 at a mass ratio (in that order) of 1:1:1.
single-gRNA2-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid Ngn3-gRNA2 at a mass ratio (in that order) of 1:1:1.
single-gRNA3-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid Ngn3-gRNA3 at a mass ratio (in that order) of 1:1:1.
single-gRNA4-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid Ngn3-gRNA4 at a mass ratio (in that order) of 1:1:1.
pool-gRNA-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4 at a mass ratio (in that order) of 1:1:1/4:1/4:1/4:1/4.
single-gRNA1-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid MafA-gRNA1 at a mass ratio (in that order) of 1:1:1.
single-gRNA2-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid MafA-gRNA2 at a mass ratio (in that order) of 1:1:1.
single-gRNA3-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid MafA-gRNA3 at a mass ratio (in that order) of 1:1:1.
single-gRNA4-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmid MafA-gRNA4 at a mass ratio (in that order) of 1:1:1.
pool-gRNA-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and recombinant plasmids MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4 at a mass ratio (in that order) of 1:1:1/4:1/4:1/4:1/4.
Untransfected cells treated in parallel served as the Control group (ctrl).
The results are shown in FIG. 4 (A-C: 293T cells; D-F: HepG2 cells; mean of 4 replicates; * P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001). Background gene expression in the Control group was set to 1. Compared with the Control group, the expression levels of the three genes in the dCas9+p65-HSF1 group (the no-sgRNA group) showed no significant difference. In the 293T cell line: for the Pdx1 gene, gRNA3 and gRNA2 gave the highest up-regulation, 26-fold (P < 0.0001) and 16-fold (P < 0.0001), respectively; for the Ngn3 gene, gRNA3 and gRNA2 gave the highest up-regulation, 501-fold (P < 0.0001) and 398-fold (P < 0.0001), respectively; for the MafA gene, gRNA2 and gRNA4 gave the highest up-regulation, 3-fold and 2-fold, respectively. In the HepG2 cell line: for the Pdx1 gene, gRNA4 and gRNA3 gave the highest up-regulation, 223-fold (P < 0.05) and 82-fold (P < 0.001), respectively; for the Ngn3 gene, gRNA3 and gRNA2 gave the highest up-regulation, 234-fold (P < 0.001) and 95-fold (P < 0.01), respectively; for the MafA gene, gRNA2 and gRNA4 gave the highest up-regulation, 6-fold and 5-fold, respectively. For each of the three genes Pdx1, Ngn3 and MafA, the 4 gRNAs act synergistically in both 293T and HepG2 cells; that is, the activation efficiency of simultaneously transfecting the 4 gRNAs is higher than that of any single gRNA.
Example 2. Construction and identification of the lentivirus-derived stable Ins-EGFP-HepG2 cell line
The PB3-neo(-)-pmax-NLSPUFa_p65-HSF1 plasmid is shown as sequence 17 in the sequence table. In sequence 17, nucleotides 1603-3435 constitute one gene (template strand) and nucleotides 5132-7600 constitute another gene (coding strand). Nucleotides 1606-2349 encode the reverse tetracycline-controlled transactivator (rtTA), nucleotides 2350-2403 encode the 2A linker peptide, and nucleotides 2413-3420 encode the hygromycin resistance protein. Nucleotides 4753-4985 constitute the Tet response element (TRE, with 7 repeats of the TetO sequence) and nucleotides 4998-5070 constitute the PminCMV promoter. Nucleotides 5135-5200 encode a FLAG tag, nucleotides 5222-5293 encode 3 consecutive NLSs, nucleotides 5372-6457 encode the PUFa domain, and nucleotides 6569-7591 encode p65-HSF1.
The PB3-pmax-dCas9 plasmid is shown as sequence 18 in the sequence table. In sequence 18, nucleotides 1603-2832 constitute one gene (template strand) and nucleotides 4529-8728 constitute another gene (coding strand). Nucleotides 1606-2349 encode the reverse tetracycline-controlled transactivator (rtTA), nucleotides 2350-2403 encode the 2A linker peptide, and nucleotides 2413-2832 encode the blasticidin resistance protein. Nucleotides 4150-4382 constitute the Tet response element (TRE, with 7 repeats of the TetO sequence) and nucleotides 4395-4467 constitute the PminCMV promoter. Nucleotides 4532-4558 encode an HA tag, nucleotides 4562-4582 encode an NLS (nuclear localization signal), and nucleotides 4592-8692 encode the dCas9 protein.
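The coordinates given for the two plasmids are 1-based and inclusive, so a feature spanning nucleotides a to b has b - a + 1 nucleotides. As a quick consistency check, the sketch below computes the length of each coding feature listed above and confirms that it is a whole number of codons. The dictionary and function names are hypothetical; the coordinates are copied from the text.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


# Coding features of sequence 17 (PB3-neo(-)-pmax-NLSPUFa_p65-HSF1).
SEQ17_FEATURES = {
    "rtTA": (1606, 2349),
    "2A linker peptide": (2350, 2403),
    "hygromycin resistance": (2413, 3420),
    "FLAG tag": (5135, 5200),
    "3 x NLS": (5222, 5293),
    "PUFa domain": (5372, 6457),
    "p65-HSF1": (6569, 7591),
}

# Coding features of sequence 18 (PB3-pmax-dCas9).
SEQ18_FEATURES = {
    "rtTA": (1606, 2349),
    "2A linker peptide": (2350, 2403),
    "blasticidin resistance": (2413, 2832),
    "HA tag": (4532, 4558),
    "NLS": (4562, 4582),
    "dCas9": (4592, 8692),
}

for label, features in (("sequence 17", SEQ17_FEATURES),
                        ("sequence 18", SEQ18_FEATURES)):
    for name, (start, end) in features.items():
        n = feature_length(start, end)
        print(label, name, n, "nt", n // 3, "codons", "in frame:", n % 3 == 0)
```

Every listed coding feature comes out as a multiple of three nucleotides, which is what one expects if the stated boundaries fall on codon boundaries.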
Firstly, preparation of Ins-Promoter-HepG2 cells
1. The specific DNA molecule shown in sequence 19 of the sequence table (in sequence 19, nucleotides 1 to 1000 constitute the human Insulin promoter and nucleotides 1026 to 1745 constitute the EGFP gene) was inserted between the PacI and EcoRI restriction sites of the CV130 vector (GeneChem) to obtain the recombinant plasmid Ins-Promoter-EGFP.
2. 293T cells were co-transfected with the recombinant plasmid Ins-Promoter-EGFP, the virus packaging helper plasmid (Helper 1.0) and the virus packaging helper plasmid (Helper 2.0). The virus was harvested 48-72 h after transfection and then concentrated and purified to obtain the Ins-Promoter-EGFP lentivirus (in the Ins-Promoter-EGFP lentivirus, the human Insulin promoter drives expression of the EGFP gene).
3. HepG2 cells were seeded on cell culture plates and cultured for 24 h, and the Ins-Promoter-EGFP lentivirus (MOI = 20) was then added together with 5 μg/mL polybrene as an infection enhancer. 72 hours after lentiviral infection, the cells were selected with 1 μg/mL puromycin for 1 day and then with 0.5 μg/mL puromycin for 2 days to obtain recombinant cells, namely the Ins-Promoter-HepG2 cells. PCR detection showed that the EGFP reporter gene was inserted into the genomic DNA of the cells. In the recombinant cells, expression of the EGFP gene is driven by the human Insulin promoter, so the EGFP signal intensity can represent the insulin secretion level: the stronger the EGFP signal, the higher the insulin secretion level.
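For the infection in step 3, the volume of lentivirus needed to reach a given MOI follows from the number of cells and the viral titer: volume = MOI × cell number / titer. The sketch below shows this arithmetic. Only the MOI of 20 comes from the text. The cell number and the titer are hypothetical placeholders, and `virus_volume_ul` is a made-up helper name.

```python
def virus_volume_ul(moi: float, cell_count: float, titer_tu_per_ml: float) -> float:
    """Volume of virus stock (in microliters) needed for a target MOI.

    moi: transducing units (TU) wanted per cell.
    cell_count: number of cells in the well at the time of infection.
    titer_tu_per_ml: titer of the virus stock in TU per milliliter.
    """
    return moi * cell_count * 1000 / titer_tu_per_ml


MOI = 20            # from the text
CELLS = 1e5         # hypothetical number of HepG2 cells per well
TITER = 1e8         # hypothetical titer, TU/mL

print(virus_volume_ul(MOI, CELLS, TITER), "uL of virus stock")
```

With these placeholder values the well would need 20 µL of stock; the real volume depends on the measured titer of the concentrated virus.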
Secondly, preparation of Ins-EGFP-dCas9-PUFa-p65-HSF1 cells
1. The Ins-Promoter-HepG2 cells obtained in step one were seeded on cell culture plates and cultured for 24 h, then co-transfected with the PiggyBac plasmid and the PB3-pmax-dCas9 plasmid (mass ratio 1:2, in that order) using Lipofectamine 3000. After 48 h of culture, the cells were selected with 6 μg/mL blasticidin for 7 days to obtain recombinant cells, namely the Ins-EGFP-HepG2/TetO-dCas9 cells. The cells were then expanded in culture.
2. The Ins-EGFP-HepG2/TetO-dCas9 cells were seeded on cell culture plates and cultured for 24 h, then co-transfected with the PiggyBac plasmid and the PB3-neo(-)-pmax-NLSPUFa_p65-HSF1 plasmid (mass ratio 1:2, in that order) using Lipofectamine 3000. After 48 h of culture, the cells were selected with 200 μg/mL hygromycin for 15 days to obtain recombinant cells, namely the Ins-EGFP-dCas9-PUFa-p65-HSF1 cells. The cells were then expanded in culture.
Thirdly, identification of cells
1. Western Blot
The test cells were Ins-Promoter-HepG2 cells or Ins-EGFP-dCas9-PUFa-p65-HSF1 cells.
The test cells were induced with 1 μg/mL or 2 μg/mL doxycycline for 3 days and then subjected to Western Blot. A parallel control without doxycycline was included. The primary antibodies used in the Western Blot were a mouse anti-human Flag primary antibody (Sigma, working dilution 1:1000), a rabbit anti-human HA primary antibody (Abcam, working dilution 1:1000) and a Tubulin primary antibody (TransGen, working dilution 1:5000).
The Western Blot results are shown in FIG. 5. After doxycycline induction, expression of the dCas9 protein and the NLS-PUFa-p65-HSF1 protein could be detected in the Ins-EGFP-dCas9-PUFa-p65-HSF1 cells. The results show that the dCas9 gene and the NLS-PUFa-p65-HSF1 gene were successfully integrated into the genome of the HepG2 liver cell line by means of the PB transposon.
2. Immunofluorescence detection (IF detection)
The Ins-EGFP-dCas9-PUFa-p65-HSF1 cells were induced with 1 μg/mL doxycycline for 3 days and then subjected to immunofluorescence detection. A parallel control without doxycycline was included.
The IF detection method was as follows: the cells were fixed with 4% paraformaldehyde at room temperature for 20 min, blocked with 1% BSA at room temperature for 30 min, incubated with primary antibody at 4 ℃ overnight, washed 3 times with PBS-T (PBS containing 0.1% Triton X-100), incubated with the corresponding secondary antibody at room temperature for 2 h, washed 3 times with PBS-T, incubated with DAPI for 15 min, and then observed and photographed under a fluorescence microscope.
The primary antibody used for IF detection was either a mouse anti-human Flag primary antibody (Sigma, working dilution 1:1000) or a rabbit anti-human HA primary antibody (Abcam, working dilution 1:1000).
The immunofluorescence results are shown in FIG. 6. After doxycycline induction, expression of the dCas9 protein and the NLS-PUFa-p65-HSF1 protein could be detected in the Ins-EGFP-dCas9-PUFa-p65-HSF1 cells. Casilio-mediated activation of endogenous genes in the Ins-EGFP-dCas9-PUFa-p65-HSF1 cells therefore requires only the introduction of gRNAs in vitro.
3. Cell morphology observation
Photographs of HepG2 cells, Ins-Promoter-HepG2 cells, Ins-EGFP-HepG2/TetO-dCas9 cells and Ins-EGFP-dCas9-PUFa-p65-HSF1 cells are shown in FIG. 7. No significant difference in morphology was observed among them.
Example 3. Reprogramming of the liver cell line HepG2 into islet-like cells by the Casilio system
Test cells: Ins-EGFP-dCas9-PUFa-p65-HSF1 cells.
Firstly, activation of endogenous genes by the Casilio system
1. Co-transfection of cells
The plasmids were transfected into the test cells using Lipofectamine 3000; 48 hours after transfection, the cells were induced with 1 μg/mL doxycycline for 3 days and then collected.
pool-gRNA-Pdx1 group: co-transfection of recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3 and Pdx1-gRNA4, the four plasmids at equal mass ratio.
pool-gRNA-Ngn3 group: co-transfection of recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4, the four plasmids at equal mass ratio.
pool-gRNA-MafA group: co-transfection of recombinant plasmids MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4, the four plasmids at equal mass ratio.
pnm-gRNA group: co-transfection of recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3, Pdx1-gRNA4, Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3, Ngn3-gRNA4, MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4, the 12 plasmids at equal mass ratio.
Untransfected test cells treated in parallel served as the Control group (ctrl).
2. RT-PCR detection
See step one, 3 of example 1.
Background gene expression in the Control group was set to 1. The results are shown in A of FIG. 8. When the sgRNAs for each gene were introduced separately, the Pdx1 gene was up-regulated 156-fold (P < 0.05), the Ngn3 gene 275-fold (P < 0.001) and the MafA gene 6-fold. When the 12 gRNAs for the 3 genes were introduced simultaneously, expression of the Pdx1 gene was up-regulated 93-fold (P < 0.01), expression of the Ngn3 gene 404-fold (P < 0.001) and expression of the MafA gene 7-fold. The Casilio system can therefore efficiently activate Pdx1, Ngn3 and MafA (PNM) in a targeted manner.
3. Immunofluorescence detection (IF detection)
See step one, 4 of example 1.
Partial results are shown in B of FIG. 8. The IF results show that the Casilio system produces clear expression of the Pdx1 protein and the Ngn3 protein, further demonstrating that the Casilio system activates the endogenous genes.
Secondly, reprogramming of hepatocytes into insulin-secreting cells
1. Co-transfection of cells
The plasmids were transfected into the test cells using Lipofectamine 3000; 48 hours after transfection, the cells were induced with 1 μg/mL doxycycline for 3 days and then collected.
pool-gRNA-Ngn3 group: co-transfection of recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4, the four plasmids at equal mass ratio.
pnm-gRNA group: co-transfection of recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3, Pdx1-gRNA4, Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3, Ngn3-gRNA4, MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4, the 12 plasmids at equal mass ratio.
Untransfected test cells treated in parallel served as the Control group (ctrl).
2. After completion of step 1, the cells were collected and cultured in DMEM medium containing 10 mM nicotinamide, 20 ng/mL EGF and 5 nM Exendin-4.
After 72 hours of culture, fluorescence was observed; the photographs are shown in FIG. 9. Both the pool-gRNA-Ngn3 group and the pnm-gRNA group achieved reprogramming of hepatocytes into insulin-secreting cells. The proportion of EGFP-positive cells was 12.93% ± 1.11% in the pnm-gRNA group and 5.73% ± 1.22% in the pool-gRNA-Ngn3 group.
Fluorescence observations for the pnm-gRNA group after 24, 48, 72 or 96 hours of culture are shown in FIG. 10, and the statistical results are shown in FIG. 11. The number of reprogrammed cells in the pnm-gRNA group increased gradually over time, and the reprogramming efficiency was highest at 72 h of culture.
The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that the invention can be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The use of some of the essential features is possible within the scope of the claims attached below.
SEQUENCE LISTING
<110> Shenzhen People's Hospital
<120> method and kit for direct reprogramming of hepatocytes into islet-like cells
<130> GNCYX201739
<160> 20
<170> PatentIn version 3.5
<210> 1
<211> 24
<212> DNA
<213> Homo sapiens
<400> 1
gaacccacag ccagcgcgga ccgg 24
<210> 2
<211> 23
<212> DNA
<213> Homo sapiens
<400> 2
gttcagccgg gggccgtgat tgg 23
<210> 3
<211> 23
<212> DNA
<213> Homo sapiens
<400> 3
gaacaaaagc aggtgctcgc ggg 23
<210> 4
<211> 23
<212> DNA
<213> Homo sapiens
<400> 4
gctggcggtg ctccccaaaa tgg 23
<210> 5
<211> 23
<212> DNA
<213> Homo sapiens
<400> 5
gccaccggcc aatcagcgcc ggg 23
<210> 6
<211> 23
<212> DNA
<213> Homo sapiens
<400> 6
ggattccgga caaagggccg ggg 23
<210> 7
<211> 24
<212> DNA
<213> Homo sapiens
<400> 7
gtgctctctc gagggcgggc tggg 24
<210> 8
<211> 24
<212> DNA
<213> Homo sapiens
<400> 8
gagcctcgtg tggctctggt cagg 24
<210> 9
<211> 23
<212> DNA
<213> Homo sapiens
<400> 9
gcgcagggaa aagtttcacg tgg 23
<210> 10
<211> 24
<212> DNA
<213> Homo sapiens
<400> 10
gccaggtgtc tcgggcgacc ccgg 24
<210> 11
<211> 24
<212> DNA
<213> Homo sapiens
<400> 11
gccgccgcct cgggctgctc cggg 24
<210> 12
<211> 23
<212> DNA
<213> Homo sapiens
<400> 12
gcgtttagcc gtgggaggcg ggg 23
<210> 13
<211> 3255
<212> DNA
<213> Artificial sequence
<400> 13
ggcgcgccga gggcctattt cccatgattc cttcatattt gcatatacga tacaaggctg 60
ttagagagat aattggaatt aatttgactg taaacacaaa gatattagta caaaatacgt 120
gacgtagaaa gtaataattt cttgggtagt ttgcagtttt aaaattatgt tttaaaatgg 180
actatcatat gcttaccgta acttgaaagt atttcgattt cttggcttta tatatcttgt 240
ggaaaggacg aaacaccgtt taagagctat gctggaaaca gcatagcaag tttaaataag 300
gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgccaattgg gtctccagat 360
tgtatgtagc ctgtatgtag cctgtatgta gcctgtatgt agcctgtatg taagatcttt 420
ttttgtttta gagctagaaa tagcaagtta aaataaggct agtccgtagc gcgtgcgcca 480
attctgcaga caaatggcgg cgcgccgcgg ccgcaggaac ccctagtgat ggagttggcc 540
actccctctc tgcgcgggcc tcgctcgctc actgaggccg ggcgaccaaa ggtcgcccga 600
cgcccgggct ttgcccgggc ggcctcagtg agcgagcgag cgcgcagctg cctgcagggg 660
cgcctgatgc ggtattttct ccttacgcat ctgtgcggta tttcacaccg catacgtcaa 720
agcaaccata gtacgcgccc tgtagcggcg cattaagcgc ggcgggtgtg gtggttacgc 780
gcagcgtgac cgctacactt gccagcgccc tagcgcccgc tcctttcgct ttcttccctt 840
cctttctcgc cacgttcgcc ggctttcccc gtcaagctct aaatcggggg ctccctttag 900
ggttccgatt tagtgcttta cggcacctcg accccaaaaa acttgatttg ggtgatggtt 960
cacgtagtgg gccatcgccc tgatagacgg tttttcgccc tttgacgttg gagtccacgt 1020
tctttaatag tggactcttg ttccaaactg gaacaacact caaccctatc tcgggctatt 1080
cttttgattt ataagggatt ttgccgattt cggcctattg gttaaaaaat gagctgattt 1140
aacaaaaatt taacgcgaat tttaacaaaa tattaacgtt tacaatttta tggtgcactc 1200
tcagtacaat ctgctctgat gccgcatagt taagccagcc ccgacacccg ccaacacccg 1260
ctgacgcgcc ctgacgggct tgtctgctcc cggcatccgc ttacagacaa gctgtgaccg 1320
tctccgggag ctgcatgtgt cagaggtttt caccgtcatc accgaaacgc gcgagacgaa 1380
agggcctcgt gatacgccta tttttatagg ttaatgtcat gataataatg gtttcttaga 1440
cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta tttttctaaa 1500
tacattcaaa tatgtatccg ctcatgagac aataaccctg ataaatgctt caataatatt 1560
gaaaaaggaa gagtatgagt attcaacatt tccgtgtcgc ccttattccc ttttttgcgg 1620
cattttgcct tcctgttttt gctcacccag aaacgctggt gaaagtaaaa gatgctgaag 1680
atcagttggg tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg 1740
agagttttcg ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg 1800
gcgcggtatt atcccgtatt gacgccgggc aagagcaact cggtcgccgc atacactatt 1860
ctcagaatga cttggttgag tactcaccag tcacagaaaa gcatcttacg gatggcatga 1920
cagtaagaga attatgcagt gctgccataa ccatgagtga taacactgcg gccaacttac 1980
ttctgacaac gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc 2040
atgtaactcg ccttgatcgt tgggaaccgg agctgaatga agccatacca aacgacgagc 2100
gtgacaccac gatgcctgta gcaatggcaa caacgttgcg caaactatta actggcgaac 2160
tacttactct agcttcccgg caacaattaa tagactggat ggaggcggat aaagttgcag 2220
gaccacttct gcgctcggcc cttccggctg gctggtttat tgctgataaa tctggagccg 2280
gtgagcgtgg aagccgcggt atcattgcag cactggggcc agatggtaag ccctcccgta 2340
tcgtagttat ctacacgacg gggagtcagg caactatgga tgaacgaaat agacagatcg 2400
ctgagatagg tgcctcactg attaagcatt ggtaactgtc agaccaagtt tactcatata 2460
tactttagat tgatttaaaa cttcattttt aatttaaaag gatctaggtg aagatccttt 2520
ttgataatct catgaccaaa atcccttaac gtgagttttc gttccactga gcgtcagacc 2580
ccgtagaaaa gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 2640
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa 2700
ctctttttcc gaaggtaact ggcttcagca gagcgcagat accaaatact gtccttctag 2760
tgtagccgta gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc 2820
tgctaatcct gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg 2880
actcaagacg atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 2940
cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat 3000
gagaaagcgc cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg 3060
tcggaacagg agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc 3120
ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc 3180
ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 3240
cttttgctca catgt 3255
<210> 14
<211> 6354
<212> DNA
<213> Artificial sequence
<400> 14
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gcttggtacc 900
gagctcggat ccactagtaa cggccgccag tgtgctggaa ttcgccgcca ccatgggcag 960
cagcctggac gacgagcaca tcctgagcgc cctgctgcag agcgacgacg agctggtcgg 1020
cgaggacagc gacagcgagg tgagcgacca cgtgagcgag gacgacgtgc agtccgacac 1080
cgaggaggcc ttcatcgacg aggtgcacga ggtgcagcct accagcagcg gctccgagat 1140
cctggacgag cagaacgtga tcgagcagcc cggcagctcc ctggccagca acaggatcct 1200
gaccctgccc cagaggacca tcaggggcaa gaacaagcac tgctggtcca cctccaagcc 1260
caccaggcgg agcagggtgt ccgccctgaa catcgtgaga agccagaggg gccccaccag 1320
gatgtgcagg aacatctacg accccctgct gtgcttcaag ctgttcttca ccgacgagat 1380
catcagcgag atcgtgaagt ggaccaacgc cgagatcagc ctgaagaggc gggagagcat 1440
gacctccgcc accttcaggg acaccaacga ggacgagatc tacgccttct tcggcatcct 1500
ggtgatgacc gccgtgagga aggacaacca catgagcacc gacgacctgt tcgacagatc 1560
cctgagcatg gtgtacgtga gcgtgatgag cagggacaga ttcgacttcc tgatcagatg 1620
cctgaggatg gacgacaaga gcatcaggcc caccctgcgg gagaacgacg tgttcacccc 1680
cgtgagaaag atctgggacc tgttcatcca ccagtgcatc cagaactaca cccctggcgc 1740
ccacctgacc atcgacgagc agctgctggg cttcaggggc aggtgcccct tcagggtcta 1800
tatccccaac aagcccagca agtacggcat caagatcctg atgatgtgcg acagcggcac 1860
caagtacatg atcaacggca tgccctacct gggcaggggc acccagacca acggcgtgcc 1920
cctgggcgag tactacgtga aggagctgtc caagcccgtc cacggcagct gcagaaacat 1980
cacctgcgac aactggttca ccagcatccc cctggccaag aacctgctgc aggagcccta 2040
caagctgacc atcgtgggca ccgtgagaag caacaagaga gagatccccg aggtcctgaa 2100
gaacagcagg tccaggcccg tgggcaccag catgttctgc ttcgacggcc ccctgaccct 2160
ggtgtcctac aagcccaagc ccgccaagat ggtgtacctg ctgtccagct gcgacgagga 2220
cgccagcatc aacgagagca ccggcaagcc ccagatggtg atgtactaca accagaccaa 2280
gggcggcgtg gacaccctgg accagatgtg cagcgtgatg acctgcagca gaaagaccaa 2340
caggtggccc atggccctgc tgtacggcat gatcaacatc gcctgcatca acagcttcat 2400
catctacagc cacaacgtga gcagcaaggg cgagaaggtg cagagccgga aaaagttcat 2460
gcggaacctg tacatgggcc tgacctccag cttcatgagg aagaggctgg aggcccccac 2520
cctgaagaga tacctgaggg acaacatcag caacatcctg cccaaagagg tgcccggcac 2580
cagcgacgac agcaccgagg agcccgtgat gaagaagagg acctactgca cctactgtcc 2640
cagcaagatc agaagaaagg ccagcgccag ctgcaagaag tgtaagaagg tcatctgccg 2700
ggagcacaac atcgacatgt gccagagctg tttctgatga gcggccgctc gagcatgcat 2760
ctagagggcc ctattctata gtgtcaccta aatgctagag ctcgctgatc agcctcgact 2820
gtgccttcta gttgccagcc atctgttgtt tgcccctccc ccgtgccttc cttgaccctg 2880
gaaggtgcca ctcccactgt cctttcctaa taaaatgagg aaattgcatc gcattgtctg 2940
agtaggtgtc attctattct ggggggtggg gtggggcagg acagcaaggg ggaggattgg 3000
gaagacaata gcaggcatgc tggggatgcg gtgggctcta tggcttctga ggcggaaaga 3060
accagctggg gctctagggg gtatccccac gcgccctgta gcggcgcatt aagcgcggcg 3120
ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct 3180
ttcgctttct tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat 3240
cggggcatcc ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt 3300
gattagggtg atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg 3360
acgttggagt ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac 3420
cctatctcgg tctattcttt tgatttataa gggattttgg ggatttcggc ctattggtta 3480
aaaaatgagc tgatttaaca aaaatttaac gcgaattaat tctgtggaat gtgtgtcagt 3540
tagggtgtgg aaagtcccca ggctccccag gcaggcagaa gtatgcaaag catgcatctc 3600
aattagtcag caaccaggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa 3660
agcatgcatc tcaattagtc agcaaccata gtcccgcccc taactccgcc catcccgccc 3720
ctaactccgc ccagttccgc ccattctccg ccccatggct gactaatttt ttttatttat 3780
gcagaggccg aggccgcctc tgcctctgag ctattccaga agtagtgagg aggctttttt 3840
ggaggcctag gcttttgcaa aaagctcccc gaaatgaccg accaagcgac gcccaacctg 3900
ccatcacgag atttcgattc caccgccgcc ttctatgaaa ggttgggctt cggaatcgtt 3960
ttccgggacg ccggctggat gatcctccag cgcggggatc tcatgctgga gttcttcgcc 4020
caccccaact tgtttattgc agcttataat ggttacaaat aaagcaatag catcacaaat 4080
ttcacaaata aagcattttt ttcactgcat tctagttgtg gtttgtccaa actcatcaat 4140
gtatcttatc atgtctgtat accgtcgacc tctagctaga gcttggcgta atcatggtca 4200
tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga 4260
agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg 4320
cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc 4380
caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac 4440
tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 4500
cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 4560
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 4620
gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 4680
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 4740
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcaatgctca 4800
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 4860
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 4920
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 4980
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg 5040
acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 5100
tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 5160
attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 5220
gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 5280
ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 5340
taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 5400
ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag 5460
ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca 5520
gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact 5580
ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca 5640
gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 5700
tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 5760
atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg 5820
gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 5880
tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 5940
atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 6000
agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 6060
ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 6120
tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 6180
aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 6240
tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 6300
aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga cgtc 6354
<210> 15
<211> 5462
<212> DNA
<213> Artificial sequence
<400> 15
tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60
ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120
aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180
gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240
gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300
agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360
ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420
cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480
gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540
caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600
caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaataaccc 660
cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata taagcagagg 720
tcgtttagtg aaccgtcaga tcactagtag ctttattgcg gtagtttatc acagttaaat 780
tgctaacgca gtcagtgctc gactgatcac aggtaagtat caaggttaca agacaggttt 840
aaggaggcca atagaaactg ggcttgtcga gacagagaag attcttgcgt ttctgatagg 900
cacctattgg tcttactgac atccactttg cctttctctc cacagggaaa aaacaattga 960
caagtttgta caaaaaagca ggctccgaat tcaccggtgc cgccaccatg gactacaagg 1020
atcacgacgg tgactataag gatcatgaca tcgactataa ggacgatgac gataagatcg 1080
atggcggagg cggatctgat ccaaaaaaga agagaaaggt agatccaaaa aagaagagaa 1140
aggtagatcc aaaaaagaag agaaaggtag gatctaccgg atctagaaac gatggtggtg 1200
gtggaagcgg gggtgggggc agcggtggag ggggaagcgg gcgcgccggg atcctccccc 1260
ccaagaaaaa gaggaaggta tctagaggcc gcagccgcct tttggaagat tttcgaaaca 1320
accggtaccc caatttacaa ctgcgggaga ttgctggaca tataatggaa ttttcccaag 1380
accagcatgg gtccagattc attcagctga aactggagcg tgccacacca gctgagcgcc 1440
agcttgtctt caatgaaatc ctccaggctg cctaccaact catggtggat gtgtttggta 1500
attacgtcat tcagaagttc tttgaatttg gcagtcttga acagaagctg gctttggcag 1560
aacggattcg aggccacgtc ctgtcattgg cactacagat gtatggcagc cgtgttatcg 1620
agaaagctct tgagtttatt ccttcagacc agcagaatga gatggttcgg gaactagatg 1680
gccatgtctt gaagtgtgtg aaagatcaga atggcaatca cgtggttcag aaatgcattg 1740
aatgtgtaca gccccagtct ttgcaattta tcatcgacgc gtttaaggga caggtatttg 1800
ccttatccac acatccttat ggctgccgag tgattcagag aatcctggag cactgtctcc 1860
ctgaccagac actccctatt ttagaggagc ttcaccagca cacagagcag ctggtacagg 1920
atcaatatgg aaattatgta atccaacatg tactggagca cggtcgtcct gaggataaaa 1980
gcaaaattgt agcagaaatc cgaggcaatg tacttgtatt gagtcagcac aaatttgcaa 2040
gcaatgttgt ggagaagtgt gttactcacg cctcacgtac ggagcgcgct gtgctcatcg 2100
acgaggtgtg caccatgaac gacggtcccc acagtgcctt atacaccatg atgaaggacc 2160
agtatgccaa ctacgtggtc cagaagatga ttgacgtggc ggagccaggc cagcggaaga 2220
tcgtcatgca taagatccgg ccccacatcg caactcttcg taagtacacc tatggcaagc 2280
acattctggc caagctggag aagtactaca tgaagaacgg tgttgactta ggggacccaa 2340
agaagaagcg caaagtggat cctaaaaaga aaagaaaggt aggcggccgc gggggaggcg 2400
gttccggtgg cggcggaagc ggaggtggag gatcagggcc ggccggagga ggtggaagcg 2460
gaggaggagg aagcggagga ggaggtagcg gacctaagaa aaagaggaag gtggcggccg 2520
ctggatcccc ttcagggcag atcagcaacc aggccctggc tctggcccct agctccgctc 2580
cagtgctggc ccagactatg gtgccctcta gtgctatggt gcctctggcc cagccacctg 2640
ctccagcccc tgtgctgacc ccaggaccac cccagtcact gagcgctcca gtgcccaagt 2700
ctacacaggc cggcgagggg actctgagtg aagctctgct gcacctgcag ttcgacgctg 2760
atgaggacct gggagctctg ctggggaaca gcaccgatcc cggagtgttc acagatctgg 2820
cctccgtgga caactctgag tttcagcagc tgctgaatca gggcgtgtcc atgtctcata 2880
gtacagccga accaatgctg atggagtacc ccgaagccat tacccggctg gtgaccggca 2940
gccagcggcc ccccgacccc gctccaactc ccctgggaac cagcggcctg cctaatgggc 3000
tgtccggaga tgaagacttc tcaagcatcg ctgatatgga ctttagtgcc ctgctgtcac 3060
agatttcctc tagtgggcag ggaggaggtg gaagcggctt cagcgtggac accagtgccc 3120
tgctggacct gttcagcccc tcggtgaccg tgcccgacat gagcctgcct gaccttgaca 3180
gcagcctggc cagtatccaa gagctcctgt ctccccagga gccccccagg cctcccgagg 3240
cagagaacag cagcccggat tcagggaagc agctggtgca ctacacagcg cagccgctgt 3300
tcctgctgga ccccggctcc gtggacaccg ggagcaacga cctgccggtg ctgtttgagc 3360
tgggagaggg ctcctacttc tccgaagggg acggcttcgc cgaggacccc accatctccc 3420
tgctgacagg ctcggagcct cccaaagcca aggaccccac tgtctccatc gattgattaa 3480
ttaagaattc gacccagctt tcttgtacaa agtggttgat atccagcaca gtggcggccg 3540
ctcgagtcta gagggcccgc ggttcgaagg taagcctatc cctaaccctc tcctcggtct 3600
cgattctacg cgtaccggtt agcaattgtt ttttcgatga gtttggacaa accacaacta 3660
gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa 3720
ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt atgtttcagg 3780
ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa tgtggtactt 3840
aagaggggga gaccaaaggg cgagacgtta aggcctcacg tgacatgtga gcaaaaggcc 3900
agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat aggctccgcc 3960
cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac ccgacaggac 4020
tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc 4080
tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcata 4140
gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc 4200
acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca 4260
acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag 4320
cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac ggctacacta 4380
gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagttg 4440
gtagctcttg atccggcaaa caaaccacgc tggtagcggt ggtttttttg tttgcaagca 4500
gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 4560
tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgccgt ctcagaagaa 4620
ctcgtcaaga aggcgataga aggcgatgcg ctgcgaatcg ggagcggcga taccgtaaag 4680
cacgaggaag cggtcagccc attcgccgcc aagctcttca gcaatatcac gggtagccaa 4740
cgctatgtcc tgatagcggt ccgccacacc cagccggcca cagtcgatga atccagaaaa 4800
gcggccattt tccaccatga tattcggcaa gcaggcatcg ccatgggtca cgacgagatc 4860
ctcgccgtcg ggcatgctcg ccttgagcct ggcgaacagt tcggctggcg cgagcccctg 4920
atgctcttcg tccagatcat cctgatcgac aagaccggct tccatccgag tacgtgctcg 4980
ctcgatgcga tgtttcgctt ggtggtcgaa tgggcaggta gccggatcaa gcgtatgcag 5040
ccgccgcatt gcatcagcca tgatggatac tttctcggca ggagcaaggt gagatgacag 5100
gagatcctgc cccggcactt cgcccaatag cagccagtcc cttcccgctt cagtgacaac 5160
gtcgagcaca gctgcgcaag gaacgcccgt cgtggccagc cacgatagcc gcgctgcctc 5220
gtcttgcagt tcattcaggg caccggacag gtcggtcttg acaaaaagaa ccgggcgccc 5280
ctgcgctgac agccggaaca cggcggcatc agagcagccg attgtctgtt gtgcccagtc 5340
atagccgaat agcctctcca cccaagcggc cggagaacct gcgtgcaatc catcttgttc 5400
aatcataata ttattgaagc atttatcagg gttcgtctcg tcccggtctc ctcccatgca 5460
tg 5462
<210> 16
<211> 7301
<212> DNA
<213> Artificial sequence
<400> 16
tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60
ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120
aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180
gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240
gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300
agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360
ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420
cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480
gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540
caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600
caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaataaccc 660
cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata taagcagagg 720
tcgtttagtg aaccgtcaga tcactagtag ctttattgcg gtagtttatc acagttaaat 780
tgctaacgca gtcagtgctc gactgatcac aggtaagtat caaggttaca agacaggttt 840
aaggaggcca atagaaactg ggcttgtcga gacagagaag attcttgcgt ttctgatagg 900
cacctattgg tcttactgac atccactttg cctttctctc cacagggaaa aaacaattga 960
caagtttgta caaaaaagca ggctccgaat tcaccggtgc cgccaccatg atcgatggtg 1020
gcggtggtag cgggggaggc ggctccgggg gcggaggcag tatgtaccca tacgatgttc 1080
cagattacgc ttcgccgaag aaaaagcgca aggtcgaagc gtccgacaag aagtacagca 1140
tcggcctggc catcggcacc aactctgtgg gctgggccgt gatcaccgac gagtacaagg 1200
tgcccagcaa gaaattcaag gtgctgggca acaccgaccg gcacagcatc aagaagaacc 1260
tgatcggagc cctgctgttc gacagcggcg aaacagccga ggccacccgg ctgaagagaa 1320
ccgccagaag aagatacacc agacggaaga accggatctg ctatctgcaa gagatcttca 1380
gcaacgagat ggccaaggtg gacgacagct tcttccacag actggaagag tccttcctgg 1440
tggaagagga taagaagcac gagcggcacc ccatcttcgg caacatcgtg gacgaggtgg 1500
cctaccacga gaagtacccc accatctacc acctgagaaa gaaactggtg gacagcaccg 1560
acaaggccga cctgcggctg atctatctgg ccctggccca catgatcaag ttccggggcc 1620
acttcctgat cgagggcgac ctgaaccccg acaacagcga cgtggacaag ctgttcatcc 1680
agctggtgca gacctacaac cagctgttcg aggaaaaccc catcaacgcc agcggcgtgg 1740
acgccaaggc catcctgtct gccagactga gcaagagcag acggctggaa aatctgatcg 1800
cccagctgcc cggcgagaag aagaatggcc tgttcggcaa cctgattgcc ctgagcctgg 1860
gcctgacccc caacttcaag agcaacttcg acctggccga ggatgccaaa ctgcagctga 1920
gcaaggacac ctacgacgac gacctggaca acctgctggc ccagatcggc gaccagtacg 1980
ccgacctgtt tctggccgcc aagaacctgt ccgacgccat cctgctgagc gacatcctga 2040
gagtgaacac cgagatcacc aaggcccccc tgagcgcctc tatgatcaag agatacgacg 2100
agcaccacca ggacctgacc ctgctgaaag ctctcgtgcg gcagcagctg cctgagaagt 2160
acaaagagat tttcttcgac cagagcaaga acggctacgc cggctacatt gacggcggag 2220
ccagccagga agagttctac aagttcatca agcccatcct ggaaaagatg gacggcaccg 2280
aggaactgct cgtgaagctg aacagagagg acctgctgcg gaagcagcgg accttcgaca 2340
acggcagcat cccccaccag atccacctgg gagagctgca cgccattctg cggcggcagg 2400
aagattttta cccattcctg aaggacaacc gggaaaagat cgagaagatc ctgaccttcc 2460
gcatccccta ctacgtgggc cctctggcca ggggaaacag cagattcgcc tggatgacca 2520
gaaagagcga ggaaaccatc accccctgga acttcgagga agtggtggac aagggcgctt 2580
ccgcccagag cttcatcgag cggatgacca acttcgataa gaacctgccc aacgagaagg 2640
tgctgcccaa gcacagcctg ctgtacgagt acttcaccgt gtataacgag ctgaccaaag 2700
tgaaatacgt gaccgaggga atgagaaagc ccgccttcct gagcggcgag cagaaaaagg 2760
ccatcgtgga cctgctgttc aagaccaacc ggaaagtgac cgtgaagcag ctgaaagagg 2820
actacttcaa gaaaatcgag tgcttcgact ccgtggaaat ctccggcgtg gaagatcggt 2880
tcaacgcctc cctgggcaca taccacgatc tgctgaaaat tatcaaggac aaggacttcc 2940
tggacaatga ggaaaacgag gacattctgg aagatatcgt gctgaccctg acactgtttg 3000
aggacagaga gatgatcgag gaacggctga aaacctatgc ccacctgttc gacgacaaag 3060
tgatgaagca gctgaagcgg cggagataca ccggctgggg caggctgagc cggaagctga 3120
tcaacggcat ccgggacaag cagtccggca agacaatcct ggatttcctg aagtccgacg 3180
gcttcgccaa cagaaacttc atgcagctga tccacgacga cagcctgacc tttaaagagg 3240
acatccagaa agcccaggtg tccggccagg gcgatagcct gcacgagcac attgccaatc 3300
tggccggcag ccccgccatt aagaagggca tcctgcagac agtgaaggtg gtggacgagc 3360
tcgtgaaagt gatgggccgg cacaagcccg agaacatcgt gatcgaaatg gccagagaga 3420
accagaccac ccagaaggga cagaagaaca gccgcgagag aatgaagcgg atcgaagagg 3480
gcatcaaaga gctgggcagc cagatcctga aagaacaccc cgtggaaaac acccagctgc 3540
agaacgagaa gctgtacctg tactacctgc agaatgggcg ggatatgtac gtggaccagg 3600
aactggacat caaccggctg tccgactacg atgtggacgc catcgtgcct cagagctttc 3660
tgaaggacga ctccatcgac aacaaggtgc tgaccagaag cgacaagaac cggggcaaga 3720
gcgacaacgt gccctccgaa gaggtcgtga agaagatgaa gaactactgg cggcagctgc 3780
tgaacgccaa gctgattacc cagagaaagt tcgacaatct gaccaaggcc gagagaggcg 3840
gcctgagcga actggataag gccggcttca tcaagagaca gctggtggaa acccggcaga 3900
tcacaaagca cgtggcacag atcctggact cccggatgaa cactaagtac gacgagaatg 3960
acaagctgat ccgggaagtg aaagtgatca ccctgaagtc caagctggtg tccgatttcc 4020
ggaaggattt ccagttttac aaagtgcgcg agatcaacaa ctaccaccac gcccacgacg 4080
cctacctgaa cgccgtcgtg ggaaccgccc tgatcaaaaa gtaccctaag ctggaaagcg 4140
agttcgtgta cggcgactac aaggtgtacg acgtgcggaa gatgatcgcc aagagcgagc 4200
aggaaatcgg caaggctacc gccaagtact tcttctacag caacatcatg aactttttca 4260
agaccgagat taccctggcc aacggcgaga tccggaagcg gcctctgatc gagacaaacg 4320
gcgaaaccgg ggagatcgtg tgggataagg gccgggattt tgccaccgtg cggaaagtgc 4380
tgagcatgcc ccaagtgaat atcgtgaaaa agaccgaggt gcagacaggc ggcttcagca 4440
aagagtctat cctgcccaag aggaacagcg ataagctgat cgccagaaag aaggactggg 4500
accctaagaa gtacggcggc ttcgacagcc ccaccgtggc ctattctgtg ctggtggtgg 4560
ccaaagtgga aaagggcaag tccaagaaac tgaagagtgt gaaagagctg ctggggatca 4620
ccatcatgga aagaagcagc ttcgagaaga atcccatcga ctttctggaa gccaagggct 4680
acaaagaagt gaaaaaggac ctgatcatca agctgcctaa gtactccctg ttcgagctgg 4740
aaaacggccg gaagagaatg ctggcctctg ccggcgaact gcagaaggga aacgaactgg 4800
ccctgccctc caaatatgtg aacttcctgt acctggccag ccactatgag aagctgaagg 4860
gctcccccga ggataatgag cagaaacagc tgtttgtgga acagcacaag cactacctgg 4920
acgagatcat cgagcagatc agcgagttct ccaagagagt gatcctggcc gacgctaatc 4980
tggacaaagt gctgtccgcc tacaacaagc accgggataa gcccatcaga gagcaggccg 5040
agaatatcat ccacctgttt accctgacca atctgggagc ccctgccgcc ttcaagtact 5100
ttgacaccac catcgaccgg aagaggtaca ccagcaccaa agaggtgctg gacgccaccc 5160
tgatccacca gagcatcacc ggcctgtacg agacacggat cgacctgtct cagctgggag 5220
gcgacagccc caagaagaag agaaaggtgg aggccagcgg aggcggcggt agcggaggag 5280
gcgggtccgg cggcggcggt agtgggccgg cctgattaat taagaattcg acccagcttt 5340
cttgtacaaa gtggttgata tccagcacag tggcggccgc tcgagtctag agggcccgcg 5400
gttcgaaggt aagcctatcc ctaaccctct cctcggtctc gattctacgc gtaccggtta 5460
gcaattgttt tttcgatgag tttggacaaa ccacaactag aatgcagtga aaaaaatgct 5520
ttatttgtga aatttgtgat gctattgctt tatttgtaac cattataagc tgcaataaac 5580
aagttaacaa caacaattgc attcatttta tgtttcaggt tcagggggag gtgtgggagg 5640
ttttttaaag caagtaaaac ctctacaaat gtggtactta agagggggag accaaagggc 5700
gagacgttaa ggcctcacgt gacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 5760
aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 5820
atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 5880
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 5940
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 6000
gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 6060
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 6120
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 6180
cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 6240
gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 6300
aaaccacgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa 6360
aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa 6420
ctcacgttaa gggattttgg tcatgccgtc tcagaagaac tcgtcaagaa ggcgatagaa 6480
ggcgatgcgc tgcgaatcgg gagcggcgat accgtaaagc acgaggaagc ggtcagccca 6540
ttcgccgcca agctcttcag caatatcacg ggtagccaac gctatgtcct gatagcggtc 6600
cgccacaccc agccggccac agtcgatgaa tccagaaaag cggccatttt ccaccatgat 6660
attcggcaag caggcatcgc catgggtcac gacgagatcc tcgccgtcgg gcatgctcgc 6720
cttgagcctg gcgaacagtt cggctggcgc gagcccctga tgctcttcgt ccagatcatc 6780
ctgatcgaca agaccggctt ccatccgagt acgtgctcgc tcgatgcgat gtttcgcttg 6840
gtggtcgaat gggcaggtag ccggatcaag cgtatgcagc cgccgcattg catcagccat 6900
gatggatact ttctcggcag gagcaaggtg agatgacagg agatcctgcc ccggcacttc 6960
gcccaatagc agccagtccc ttcccgcttc agtgacaacg tcgagcacag ctgcgcaagg 7020
aacgcccgtc gtggccagcc acgatagccg cgctgcctcg tcttgcagtt cattcagggc 7080
accggacagg tcggtcttga caaaaagaac cgggcgcccc tgcgctgaca gccggaacac 7140
ggcggcatca gagcagccga ttgtctgttg tgcccagtca tagccgaata gcctctccac 7200
ccaagcggcc ggagaacctg cgtgcaatcc atcttgttca atcataatat tattgaagca 7260
tttatcaggg ttcgtctcgt cccggtctcc tcccatgcat g 7301
<210> 17
<211> 12550
<212> DNA
<213> Artificial sequence
<400> 17
agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc 60
acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 120
tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 180
ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagctca 240
gaattaaccc tcactaaagg gactagtcct gcaggtttaa acgaattcgc ccttttaacc 300
ctagaaagat agtctgcgta aaattgacgc atgcattctt gaaatattgc tctctctttc 360
taaatagcgc gaatccgtcg ctgtgcattt aggacatctc agtcgccgct tggagctccc 420
gtgaggcgtg cttgtcaatg cggtaagtgt cactgatttt gaactataac gaccgcgtga 480
gtcaaaatga cgcatgatta tcttttacgt gacttttaag atttaactca tacgataatt 540
atattgttat ttcatgttct acttacgtga taacttatta tatatatatt ttcttgttat 600
agatatcaac tagaatgcta gacctttcgt cttcaagaat tccgatcata ttcaataacc 660
cttaattagg tccctcgaag aggttcactg gcgcgttgga tccccgggta ccgagttggg 720
agctcacggg gacagccccc ccccaaagcc cccagggatg taattacgtc cctcccccgc 780
tagggggcag cagcgagccg cccggggctc cgctccggtc cggcgctccc cccgcatccc 840
cgagccggca gcgtgcgggg acagcccggg cacggggaag gtggcacggg atcgctttcc 900
tctgaacgct tctcgctgct ctttgagcct gcagacacct ggggggatac ggggaaaaag 960
ctttaggctg aaagagagat ttagaatgac agtctagtgg gagctcacgg ggacagcccc 1020
cccccaaagc ccccagggat gtaattacgt ccctcccccg ctagggggca gcagcgagcc 1080
gcccggggct ccgctccggt ccggcgctcc ccccgcatcc ccgagccggc agcgtgcggg 1140
gacagcccgg gcacggggaa ggtggcacgg gatcgctttc ctctgaacgc ttctcgctgc 1200
tctttgagcc tgcagacacc tggggggata cggggaaaaa gctttaggct gaaagagaga 1260
tttagaatga cagaactcga tttcattgca gactggcgcg ccgccttttt acggttcctg 1320
gccttttgct ggccttttgc tcacatgtca cgtgaggcct taacgtctcg ccctttggtc 1380
tccccctctt aagtaccaca tttgtagagg ttttacttgc tttaaaaaac ctcccacacc 1440
tccccctgaa cctgaaacat aaaatgaatg caattgttgt tgttaacttg tttattgcag 1500
cttataatgg ttacaaataa agcaatagca tcacaaattt cacaaataaa gcattttttt 1560
cactgcattc tagttgtggt ttgtccaaac tcatcggcta gcttacccgg ggagcatgtc 1620
aaggtcaaaa tcgtcaagag cgtcagcagg cagcatatca aggtcaaagt cgtcaagggc 1680
atcggctggg agcatgtcta agtcaaaatc gtcaagggcg tcggtcggcc cgccgctttc 1740
gcactttagc tgtttctcca ggccacatat gattagttcc aggccgaaaa ggaaggcagg 1800
ttcggctccc tgccggtcga acagctcaat tgcttgtctc agaagtgggg gcatagaatc 1860
ggtggtaggt gtctctcttt cctcttttgc tacttgatgc tcctgttcct ccaatacgca 1920
gcccagtgta aagtggccca cggcggacag agcgtacagt gcgttctcca gggagaagcc 1980
ttgctgacac aggaacgcga gctgattttc cagggtttcg tactgtttct ctgttgggcg 2040
ggtgccgaga tgcactttag ccccgtcgcg atgtgagagg agagcacagc ggtatgactt 2100
ggcgttgttc cgcagaaagt cttgccatga ctcgccttcc agggggcaga agtgggtatg 2160
atgcctgtcc agcatctcga ttggcagggc atcgagcagg gcccgcttgt tcttcacgtg 2220
ccagtacagg gtaggctgct caactcccag cttttgagcg agtttccttg tcgtcaggcc 2280
ttcgataccg acaccattga gtaattccag agctccgttt atgactttgc tcttgtccag 2340
tctagacatt ggaccagggt tttcttcaac atcaccacaa gtgaggagag aacctctacc 2400
ttcggcaccg ggttcctttg ccctcggacg agtgctgggg cgtcggtttc cactatcggc 2460
gagtacttct acacagccat cggtccagac ggccgcgctt ctgcgggcga tttgtgtacg 2520
cccgacagtc ccggctccgg atcggacgat tgcgtcgcat cgaccctgcg cccaagctgc 2580
atcatcgaaa ttgccgtcaa ccaagctctg atagagttgg tcaagaccaa tgcggagcat 2640
atacgcccgg agccgcggcg atcctgcaag ctccggatgc ctccgctcga agtagcgcgt 2700
ctgctgctcc atacaagcca accacggcct ccagaagaag atgttggcga cctcgtattg 2760
ggaatccccg aacatcgcct cgctccagtc aatgaccgct gttatgcggc cattgtccgt 2820
caggacattg ttggagccga aatccgcgtg cacgaggtgc cggacttcgg ggcagtcctc 2880
ggcccaaagc atcagctcat cgagagcctg cgcgacggac gcactgacgg tgtcgtccat 2940
cacagtttgc cagtgataca catggggatc agcaatcgcg catatgaaat cacgccatgt 3000
agtgtattga ccgattcctt gcggtccgaa tgggccgaac ccgctcgtct ggctaagatc 3060
ggccgcagcg atcgcatcca tggcctccgc gaccggctgc agaacagcgg gcagttcggt 3120
ttcaggcagg tcttgcaacg tgacaccctg tgcacggcgg gagatgcaat aggtcaggct 3180
ctcgctaaat tccccaatgt caagcacttc cggaatcggg agcgcggccg atgcaaagtg 3240
ccgataaaca taacgatctt tgtagaaacc atcggcgcag ctatttaccc gcaggacata 3300
tccacgccct cctacatcga agctgaaagc acgagattct tcgccctccg agagctgcat 3360
caggtcggag acgctgtcga acttttcgat cagaaacttc tcgacagacg tcgcggtgag 3420
ttcaggcttt ttcatggtgg cggcactagt aagggcgaat tcggagcctg cttttttgta 3480
caaacttgtt gatatctgca gaattccacc acactggact agtggatccg agctcggtac 3540
caagcttctt cacgacacct gaaatggaag aaaaaaactt tgaaccactg tctgaggctt 3600
gagaatgaac caagatccaa actcaaaaag ggcaaattcc aaggagaatt acatcaagtg 3660
ccaagctggc ctaacttcag tctccaccca ctcagtgtgg ggaaactcca tcgcataaaa 3720
cccctccccc caacctaaag acgacgtact ccaaaagctc gagaactaat cgaggtgcct 3780
ggacggcgcc cggtactccg tggagtcaca tgaagcgacg gctgaggacg gaaaggccct 3840
tttcctttgt gtgggtgact cacccgcccg ctctcccgag cgccgcgtcc tccattttga 3900
gctccctgca gcagggccgg gaagcggcca tctttccgct cacgcaactg gtgccgaccg 3960
ggccagcctt gccgcccagg gcggggcgat acacggcggc gcgaggccag gcaccagagc 4020
aggccggcca gcttgagact acccccgtcc gattctcggt ggccgcgctc gcaggccccg 4080
cctcgccgaa catgtgcgct gggacgcacg ggccccgtcg ccgcccgcgg ccccaaaaac 4140
cgaaatacca gtgtgcagat cttggcccgc atttacaaga ctatcttgcc agaaaaaaag 4200
cgtcgcagca ggtcatcaaa aattttaaat ggctagagac ttatcgaaag cagcgagaca 4260
ggcgcgaagg tgccaccaga ttcgcacgcg gcggccccag cgcccaggcc aggcctcaac 4320
tcaagcacga ggcgaagggg ctccttaagc gcaaggcctc gaactctccc acccacttcc 4380
aacccgaagc tcgggatcaa gaatcacgta ctgcagccag gtggaagtaa ttcaaggcac 4440
gcaagggcca taacccgtaa agaggccagg cccgcgggaa ccacacacgg cacttacctg 4500
tgttctggcg gcaaacccgt tgcgaaaaag aacgttcacg gcgactactg cacttatata 4560
cggttctccc ccaccctcgg gaaaaaggcg gagccagtac acgacatcac tttcccagtt 4620
taccccgcgc caccttctct aggcaccggt tcaattgccg acccctcccc ccaacttctc 4680
ggggactgtg ggcgatgtgc gctctgccca ctgacgggca ccggagcctc acgatcgata 4740
tgtcgagttt actccctatc agtgatagag aacgtatgtc gagtttactc cctatcagtg 4800
atagagaacg atgtcgagtt tactccctat cagtgataga gaacgtatgt cgagtttact 4860
ccctatcagt gatagagaac gtatgtcgag tttactccct atcagtgata gagaacgtat 4920
gtcgagttta tccctatcag tgatagagaa cgtatgtcga gtttactccc tatcagtgat 4980
agagaacgta tgtcgaggta ggcgtgtacg gtgggaggcc tatataagca gagctcgttt 5040
agtgaaccgt cagatcgcct ggagaattgg ctaggcaccg gtgacaagtt tgtacaaaaa 5100
agcaggctcc gaattcaccg gtgccgccac catggactac aaggatcacg acggtgacta 5160
taaggatcat gacatcgact ataaggacga tgacgataag atcgatggcg gaggcggatc 5220
tgatccaaaa aagaagagaa aggtagatcc aaaaaagaag agaaaggtag atccaaaaaa 5280
gaagagaaag gtaggatcta ccggatctag aaacgatggt ggtggtggaa gcgggggtgg 5340
gggcagcggt ggagggggaa gcgggcgcgc cgggatcctc ccccccaaga aaaagaggaa 5400
ggtatctaga ggccgcagcc gccttttgga agattttcga aacaaccggt accccaattt 5460
acaactgcgg gagattgctg gacatataat ggaattttcc caagaccagc atgggtccag 5520
attcattcag ctgaaactgg agcgtgccac accagctgag cgccagcttg tcttcaatga 5580
aatcctccag gctgcctacc aactcatggt ggatgtgttt ggtaattacg tcattcagaa 5640
gttctttgaa tttggcagtc ttgaacagaa gctggctttg gcagaacgga ttcgaggcca 5700
cgtcctgtca ttggcactac agatgtatgg cagccgtgtt atcgagaaag ctcttgagtt 5760
tattccttca gaccagcaga atgagatggt tcgggaacta gatggccatg tcttgaagtg 5820
tgtgaaagat cagaatggca atcacgtggt tcagaaatgc attgaatgtg tacagcccca 5880
gtctttgcaa tttatcatcg acgcgtttaa gggacaggta tttgccttat ccacacatcc 5940
ttatggctgc cgagtgattc agagaatcct ggagcactgt ctccctgacc agacactccc 6000
tattttagag gagcttcacc agcacacaga gcagctggta caggatcaat atggaaatta 6060
tgtaatccaa catgtactgg agcacggtcg tcctgaggat aaaagcaaaa ttgtagcaga 6120
aatccgaggc aatgtacttg tattgagtca gcacaaattt gcaagcaatg ttgtggagaa 6180
gtgtgttact cacgcctcac gtacggagcg cgctgtgctc atcgacgagg tgtgcaccat 6240
gaacgacggt ccccacagtg ccttatacac catgatgaag gaccagtatg ccaactacgt 6300
ggtccagaag atgattgacg tggcggagcc aggccagcgg aagatcgtca tgcataagat 6360
ccggccccac atcgcaactc ttcgtaagta cacctatggc aagcacattc tggccaagct 6420
ggagaagtac tacatgaaga acggtgttga cttaggggac ccaaagaaga agcgcaaagt 6480
ggatcctaaa aagaaaagaa aggtaggcgg ccgcggggga ggcggttccg gtggcggcgg 6540
aagcggaggt ggaggatcag ggccggccgg aggaggtgga agcggaggag gaggaagcgg 6600
aggaggaggt agcggaccta agaaaaagag gaaggtggcg gccgctggat ccccttcagg 6660
gcagatcagc aaccaggccc tggctctggc ccctagctcc gctccagtgc tggcccagac 6720
tatggtgccc tctagtgcta tggtgcctct ggcccagcca cctgctccag cccctgtgct 6780
gaccccagga ccaccccagt cactgagcgc tccagtgccc aagtctacac aggccggcga 6840
ggggactctg agtgaagctc tgctgcacct gcagttcgac gctgatgagg acctgggagc 6900
tctgctgggg aacagcaccg atcccggagt gttcacagat ctggcctccg tggacaactc 6960
tgagtttcag cagctgctga atcagggcgt gtccatgtct catagtacag ccgaaccaat 7020
gctgatggag taccccgaag ccattacccg gctggtgacc ggcagccagc ggccccccga 7080
ccccgctcca actcccctgg gaaccagcgg cctgcctaat gggctgtccg gagatgaaga 7140
cttctcaagc atcgctgata tggactttag tgccctgctg tcacagattt cctctagtgg 7200
gcagggagga ggtggaagcg gcttcagcgt ggacaccagt gccctgctgg acctgttcag 7260
cccctcggtg accgtgcccg acatgagcct gcctgacctt gacagcagcc tggccagtat 7320
ccaagagctc ctgtctcccc aggagccccc caggcctccc gaggcagaga acagcagccc 7380
ggattcaggg aagcagctgg tgcactacac agcgcagccg ctgttcctgc tggaccccgg 7440
ctccgtggac accgggagca acgacctgcc ggtgctgttt gagctgggag agggctccta 7500
cttctccgaa ggggacggct tcgccgagga ccccaccatc tccctgctga caggctcgga 7560
gcctcccaaa gccaaggacc ccactgtctc catcgattga ttaattaaga attcgaccca 7620
gctttcttgt acaaagtggt tgatatccag cacagtggcg gccgctcgag tctagagggc 7680
ccgcggttcg aaggtaagcc tatccctaac cctctcctcg gtctcgattc tacgcgtacc 7740
ggttaggggc ccgtttaaac ccgctgatca gcctcgactg tgccttctag ttgccagcca 7800
tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc 7860
ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca ttctattctg 7920
gggggtgggg tggggcagga cagcaagggg gaggattggg aagacaatag caggcatgct 7980
ggggatgcgg tgggctctat ggcgacgcgc ctggatcccc gggtaccgag ttgggagctc 8040
acggggacag ccccccccca aagcccccag ggatgtaatt acgtccctcc cccgctaggg 8100
ggcagcagcg agccgcccgg ggctccgctc cggtccggcg ctccccccgc atccccgagc 8160
cggcagcgtg cggggacagc ccgggcacgg ggaaggtggc acgggatcgc tttcctctga 8220
acgcttctcg ctgctctttg agcctgcaga cacctggggg gatacgggga aaaagcttta 8280
ggctgaaaga gagatttaga atgacagtct agtgggagct cacggggaca gccccccccc 8340
aaagccccca gggatgtaat tacgtccctc ccccgctagg gggcagcagc gagccgcccg 8400
gggctccgct ccggtccggc gctccccccg catccccgag ccggcagcgt gcggggacag 8460
cccgggcacg gggaaggtgg cacgggatcg ctttcctctg aacgcttctc gctgctcttt 8520
gagcctgcag acacctgggg ggatacgggg aaaaagcttt aggctgaaag agagatttag 8580
aatgacagaa ctcgatttca ttgcagactg gccggccact agtacgcgcc ggctcgacat 8640
actagttaaa agttttgtta ctttatagaa gaaattttga gtttttgttt ttttttaata 8700
aataaataaa cataaataaa ttgtttgttg aatttattat tagtatgtaa gtgtaaatat 8760
aataaaactt aatatctatt caaattaata aataaacctc gatatacaga ccgataaaac 8820
acatgcgtca attttacgca tgattatctt taacgtacgt cacaatatga ttatctttct 8880
agggttaaaa gggcgaattc gcggccgcta aattcaattc gccctatagt gagtcgtatt 8940
acaattcact ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac 9000
ttaatcgcct tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca 9060
ccgatcgccc ttcccaacag ttgcgcagcc tatacgtacg gcagtttaag gtttacacct 9120
ataaaagaga gagccgttat cgtctgtttg tggatgtaca gagtgatatt attgacacgc 9180
cggggcgacg gatggtgatc cccctggcca gtgcacgtct gctgtcagat aaagtctccc 9240
gtgaacttta cccggtggtg catatcgggg atgaaagctg gcgcatgatg accaccgata 9300
tggccagtgt gccggtctcc gttatcgggg aagaagtggc tgatctcagc caccgcgaaa 9360
atgacatcaa aaacgccatt aacctgatgt tctggggaat ataaatgtca ggcatgagat 9420
tatcaaaaag gatcttcacc tagatccttt tcacgtagaa agccagtccg cagaaacggt 9480
gctgaccccg gatgaatgtc agctactggg ctatctggac aagggaaaac gcaagcgcaa 9540
agagaaagca ggtagcttgc agtgggctta catggcgata gctagactgg gcggttttat 9600
ggacagcaag cgaaccggaa ttgccagctg gggcgccctc tggtaaggtt gggaagccct 9660
gcaaagtaaa ctggatggct ttcttgccgc caaggatctg atggcgcagg ggatcaagct 9720
ctgatcaaga gacaggatga ggatcgtttc gcatgattga acaagatgga ttgcacgcag 9780
gttctccggc cgcttgggtg gagaggctat tcggctatga ctgggcacaa cagacaatcg 9840
gctgctctga tgccgccgtg ttccggctgt cagcgcaggg gcgcccggtt ctttttgtca 9900
agaccgacct gtccggtgcc ctgaatgaac tgcaagacga ggcagcgcgg ctatcgtggc 9960
tggccacgac gggcgttcct tgcgcagctg tgctcgacgt tgtcactgaa gcgggaaggg 10020
actggctgct attgggcgaa gtgccggggc aggatctcct gtcatctcac cttgctcctg 10080
ccgagaaagt atccatcatg gctgatgcaa tgcggcggct gcatacgctt gatccggcta 10140
cctgcccatt cgaccaccaa gcgaaacatc gcatcgagcg agcacgtact cggatggaag 10200
ccggtcttgt cgatcaggat gatctggacg aagagcatca ggggctcgcg ccagccgaac 10260
tgttcgccag gctcaaggcg agcatgcccg acggcgagga tctcgtcgtg acccatggcg 10320
atgcctgctt gccgaatatc atggtggaaa atggccgctt ttctggattc atcgactgtg 10380
gccggctggg tgtggcggac cgctatcagg acatagcgtt ggctacccgt gatattgctg 10440
aagagcttgg cggcgaatgg gctgaccgct tcctcgtgct ttacggtatc gccgctcccg 10500
attcgcagcg catcgccttc tatcgccttc ttgacgagtt cttctgaatt attaacgctt 10560
acaatttcct gatgcggtat tttctcctta cgcatctgtg cggtatttca caccgcatca 10620
ggtggcactt ttcggggaaa tgtgcgcgga acccctattt gtttattttt ctaaatacat 10680
tcaaatatgt atccgctcat gagattatca aaaaggatct tcacctagat ccttttaaat 10740
taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac 10800
caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt 10860
gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt 10920
gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag 10980
ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct 11040
attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt 11100
gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc 11160
tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt 11220
agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg 11280
gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg 11340
actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct 11400
tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa agtgctcatc 11460
attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt 11520
tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt 11580
tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg 11640
aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta tcagggttat 11700
tgtctcatga ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc agaccccgta 11760
gaaaagatca aaggatcttc ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa 11820
acaaaaaaac caccgctacc agcggtggtt tgtttgccgg atcaagagct accaactctt 11880
tttccgaagg taactggctt cagcagagcg cagataccaa atactgttct tctagtgtag 11940
ccgtagttag gccaccactt caagaactct gtagcaccgc ctacatacct cgctctgcta 12000
atcctgttac cagtggctgc tgccagtggc gataagtcgt gtcttaccgg gttggactca 12060
agacgatagt taccggataa ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag 12120
cccagcttgg agcgaacgac ctacaccgaa ctgagatacc tacagcgtga gctatgagaa 12180
agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga 12240
acaggagagc gcacgaggga gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc 12300
gggtttcgcc acctctgact tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc 12360
ctatggaaaa acgccagcaa cgcggccttt ttacggttcc tggccttttg ctggcctttt 12420
gctcacatgt tctttcctgc gttatcccct gattctgtgg ataaccgtat taccgccttt 12480
gagtgagctg ataccgctcg ccgcagccga acgaccgagc gcagcgagtc agtgagcgag 12540
gaagcggaag 12550
<210> 18
<211> 13678
<212> DNA
<213> Artificial sequence
<400> 18
agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc 60
acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 120
tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 180
ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagctca 240
gaattaaccc tcactaaagg gactagtcct gcaggtttaa acgaattcgc ccttttaacc 300
ctagaaagat agtctgcgta aaattgacgc atgcattctt gaaatattgc tctctctttc 360
taaatagcgc gaatccgtcg ctgtgcattt aggacatctc agtcgccgct tggagctccc 420
gtgaggcgtg cttgtcaatg cggtaagtgt cactgatttt gaactataac gaccgcgtga 480
gtcaaaatga cgcatgatta tcttttacgt gacttttaag atttaactca tacgataatt 540
atattgttat ttcatgttct acttacgtga taacttatta tatatatatt ttcttgttat 600
agatatcaac tagaatgcta gacctttcgt cttcaagaat tccgatcata ttcaataacc 660
cttaattagg tccctcgaag aggttcactg gcgcgttgga tccccgggta ccgagttggg 720
agctcacggg gacagccccc ccccaaagcc cccagggatg taattacgtc cctcccccgc 780
tagggggcag cagcgagccg cccggggctc cgctccggtc cggcgctccc cccgcatccc 840
cgagccggca gcgtgcgggg acagcccggg cacggggaag gtggcacggg atcgctttcc 900
tctgaacgct tctcgctgct ctttgagcct gcagacacct ggggggatac ggggaaaaag 960
ctttaggctg aaagagagat ttagaatgac agtctagtgg gagctcacgg ggacagcccc 1020
cccccaaagc ccccagggat gtaattacgt ccctcccccg ctagggggca gcagcgagcc 1080
gcccggggct ccgctccggt ccggcgctcc ccccgcatcc ccgagccggc agcgtgcggg 1140
gacagcccgg gcacggggaa ggtggcacgg gatcgctttc ctctgaacgc ttctcgctgc 1200
tctttgagcc tgcagacacc tggggggata cggggaaaaa gctttaggct gaaagagaga 1260
tttagaatga cagaactcga tttcattgca gactggcgcg ccgccttttt acggttcctg 1320
gccttttgct ggccttttgc tcacatgtca cgtgaggcct taacgtctcg ccctttggtc 1380
tccccctctt aagtaccaca tttgtagagg ttttacttgc tttaaaaaac ctcccacacc 1440
tccccctgaa cctgaaacat aaaatgaatg caattgttgt tgttaacttg tttattgcag 1500
cttataatgg ttacaaataa agcaatagca tcacaaattt cacaaataaa gcattttttt 1560
cactgcattc tagttgtggt ttgtccaaac tcatcggcta gcttacccgg ggagcatgtc 1620
aaggtcaaaa tcgtcaagag cgtcagcagg cagcatatca aggtcaaagt cgtcaagggc 1680
atcggctggg agcatgtcta agtcaaaatc gtcaagggcg tcggtcggcc cgccgctttc 1740
gcactttagc tgtttctcca ggccacatat gattagttcc aggccgaaaa ggaaggcagg 1800
ttcggctccc tgccggtcga acagctcaat tgcttgtctc agaagtgggg gcatagaatc 1860
ggtggtaggt gtctctcttt cctcttttgc tacttgatgc tcctgttcct ccaatacgca 1920
gcccagtgta aagtggccca cggcggacag agcgtacagt gcgttctcca gggagaagcc 1980
ttgctgacac aggaacgcga gctgattttc cagggtttcg tactgtttct ctgttgggcg 2040
ggtgccgaga tgcactttag ccccgtcgcg atgtgagagg agagcacagc ggtatgactt 2100
ggcgttgttc cgcagaaagt cttgccatga ctcgccttcc agggggcaga agtgggtatg 2160
atgcctgtcc agcatctcga ttggcagggc atcgagcagg gcccgcttgt tcttcacgtg 2220
ccagtacagg gtaggctgct caactcccag cttttgagcg agtttccttg tcgtcaggcc 2280
ttcgataccg acaccattga gtaattccag agctccgttt atgactttgc tcttgtccag 2340
tctagacatt ggaccagggt tttcttcaac atcaccacaa gtgaggagag aacctctacc 2400
ttcggcaccg ggatttcggg tatatttgag tggaatgagt tcttcaatcg tagttttgac 2460
taacttgcca ttcatttcta ttaacacaaa acaatctggt gcatagtctg aaatcaactc 2520
cctacacata ccacaaggac ttaccactcg aatacttcta tctacttcgt cagaataagg 2580
gtgtctaaca gctacaatcg tgtcaaaatc cttttgtcca ttcgaaactg cactaccaat 2640
cgcaatggct tctgcacaaa cagttactcg tcctatatac gcttcaatat gtactgccga 2700
aatgatttct cctgttttcg tacgaattgc cgctcccaca tgatgtttat tatcctcata 2760
aagcattgta atcttctctg tcgctacttc tactaattct agatcctgtt gagaaatgtt 2820
aaatgttttc atggtggcgg cactagtaag ggcgaattcg gagcctgctt ttttgtacaa 2880
acttgttgat atctgcagaa ttccaccaca ctggactagt ggatccgagc tcggtaccaa 2940
gcttcttcac gacacctgaa atggaagaaa aaaactttga accactgtct gaggcttgag 3000
aatgaaccaa gatccaaact caaaaagggc aaattccaag gagaattaca tcaagtgcca 3060
agctggccta acttcagtct ccacccactc agtgtgggga aactccatcg cataaaaccc 3120
ctccccccaa cctaaagacg acgtactcca aaagctcgag aactaatcga ggtgcctgga 3180
cggcgcccgg tactccgtgg agtcacatga agcgacggct gaggacggaa aggccctttt 3240
cctttgtgtg ggtgactcac ccgcccgctc tcccgagcgc cgcgtcctcc attttgagct 3300
ccctgcagca gggccgggaa gcggccatct ttccgctcac gcaactggtg ccgaccgggc 3360
cagccttgcc gcccagggcg gggcgataca cggcggcgcg aggccaggca ccagagcagg 3420
ccggccagct tgagactacc cccgtccgat tctcggtggc cgcgctcgca ggccccgcct 3480
cgccgaacat gtgcgctggg acgcacgggc cccgtcgccg cccgcggccc caaaaaccga 3540
aataccagtg tgcagatctt ggcccgcatt tacaagacta tcttgccaga aaaaaagcgt 3600
cgcagcaggt catcaaaaat tttaaatggc tagagactta tcgaaagcag cgagacaggc 3660
gcgaaggtgc caccagattc gcacgcggcg gccccagcgc ccaggccagg cctcaactca 3720
agcacgaggc gaaggggctc cttaagcgca aggcctcgaa ctctcccacc cacttccaac 3780
ccgaagctcg ggatcaagaa tcacgtactg cagccaggtg gaagtaattc aaggcacgca 3840
agggccataa cccgtaaaga ggccaggccc gcgggaacca cacacggcac ttacctgtgt 3900
tctggcggca aacccgttgc gaaaaagaac gttcacggcg actactgcac ttatatacgg 3960
ttctccccca ccctcgggaa aaaggcggag ccagtacacg acatcacttt cccagtttac 4020
cccgcgccac cttctctagg caccggttca attgccgacc cctcccccca acttctcggg 4080
gactgtgggc gatgtgcgct ctgcccactg acgggcaccg gagcctcacg atcgatatgt 4140
cgagtttact ccctatcagt gatagagaac gtatgtcgag tttactccct atcagtgata 4200
gagaacgatg tcgagtttac tccctatcag tgatagagaa cgtatgtcga gtttactccc 4260
tatcagtgat agagaacgta tgtcgagttt actccctatc agtgatagag aacgtatgtc 4320
gagtttatcc ctatcagtga tagagaacgt atgtcgagtt tactccctat cagtgataga 4380
gaacgtatgt cgaggtaggc gtgtacggtg ggaggcctat ataagcagag ctcgtttagt 4440
gaaccgtcag atcgcctgga gaattggcta ggcaccggtg acaagtttgt acaaaaaagc 4500
aggctccgaa ttcaccggtg ccgccaccat gtacccatac gatgttccag attacgcttc 4560
gccgaagaaa aagcgcaagg tcgaagcgtc cgacaagaag tacagcatcg gcctggccat 4620
cggcaccaac tctgtgggct gggccgtgat caccgacgag tacaaggtgc ccagcaagaa 4680
attcaaggtg ctgggcaaca ccgaccggca cagcatcaag aagaacctga tcggagccct 4740
gctgttcgac agcggcgaaa cagccgaggc cacccggctg aagagaaccg ccagaagaag 4800
atacaccaga cggaagaacc ggatctgcta tctgcaagag atcttcagca acgagatggc 4860
caaggtggac gacagcttct tccacagact ggaagagtcc ttcctggtgg aagaggataa 4920
gaagcacgag cggcacccca tcttcggcaa catcgtggac gaggtggcct accacgagaa 4980
gtaccccacc atctaccacc tgagaaagaa actggtggac agcaccgaca aggccgacct 5040
gcggctgatc tatctggccc tggcccacat gatcaagttc cggggccact tcctgatcga 5100
gggcgacctg aaccccgaca acagcgacgt ggacaagctg ttcatccagc tggtgcagac 5160
ctacaaccag ctgttcgagg aaaaccccat caacgccagc ggcgtggacg ccaaggccat 5220
cctgtctgcc agactgagca agagcagacg gctggaaaat ctgatcgccc agctgcccgg 5280
cgagaagaag aatggcctgt tcggcaacct gattgccctg agcctgggcc tgacccccaa 5340
cttcaagagc aacttcgacc tggccgagga tgccaaactg cagctgagca aggacaccta 5400
cgacgacgac ctggacaacc tgctggccca gatcggcgac cagtacgccg acctgtttct 5460
ggccgccaag aacctgtccg acgccatcct gctgagcgac atcctgagag tgaacaccga 5520
gatcaccaag gcccccctga gcgcctctat gatcaagaga tacgacgagc accaccagga 5580
cctgaccctg ctgaaagctc tcgtgcggca gcagctgcct gagaagtaca aagagatttt 5640
cttcgaccag agcaagaacg gctacgccgg ctacattgac ggcggagcca gccaggaaga 5700
gttctacaag ttcatcaagc ccatcctgga aaagatggac ggcaccgagg aactgctcgt 5760
gaagctgaac agagaggacc tgctgcggaa gcagcggacc ttcgacaacg gcagcatccc 5820
ccaccagatc cacctgggag agctgcacgc cattctgcgg cggcaggaag atttttaccc 5880
attcctgaag gacaaccggg aaaagatcga gaagatcctg accttccgca tcccctacta 5940
cgtgggccct ctggccaggg gaaacagcag attcgcctgg atgaccagaa agagcgagga 6000
aaccatcacc ccctggaact tcgaggaagt ggtggacaag ggcgcttccg cccagagctt 6060
catcgagcgg atgaccaact tcgataagaa cctgcccaac gagaaggtgc tgcccaagca 6120
cagcctgctg tacgagtact tcaccgtgta taacgagctg accaaagtga aatacgtgac 6180
cgagggaatg agaaagcccg ccttcctgag cggcgagcag aaaaaggcca tcgtggacct 6240
gctgttcaag accaaccgga aagtgaccgt gaagcagctg aaagaggact acttcaagaa 6300
aatcgagtgc ttcgactccg tggaaatctc cggcgtggaa gatcggttca acgcctccct 6360
gggcacatac cacgatctgc tgaaaattat caaggacaag gacttcctgg acaatgagga 6420
aaacgaggac attctggaag atatcgtgct gaccctgaca ctgtttgagg acagagagat 6480
gatcgaggaa cggctgaaaa cctatgccca cctgttcgac gacaaagtga tgaagcagct 6540
gaagcggcgg agatacaccg gctggggcag gctgagccgg aagctgatca acggcatccg 6600
ggacaagcag tccggcaaga caatcctgga tttcctgaag tccgacggct tcgccaacag 6660
aaacttcatg cagctgatcc acgacgacag cctgaccttt aaagaggaca tccagaaagc 6720
ccaggtgtcc ggccagggcg atagcctgca cgagcacatt gccaatctgg ccggcagccc 6780
cgccattaag aagggcatcc tgcagacagt gaaggtggtg gacgagctcg tgaaagtgat 6840
gggccggcac aagcccgaga acatcgtgat cgaaatggcc agagagaacc agaccaccca 6900
gaagggacag aagaacagcc gcgagagaat gaagcggatc gaagagggca tcaaagagct 6960
gggcagccag atcctgaaag aacaccccgt ggaaaacacc cagctgcaga acgagaagct 7020
gtacctgtac tacctgcaga atgggcggga tatgtacgtg gaccaggaac tggacatcaa 7080
ccggctgtcc gactacgatg tggacgccat cgtgcctcag agctttctga aggacgactc 7140
catcgacaac aaggtgctga ccagaagcga caagaaccgg ggcaagagcg acaacgtgcc 7200
ctccgaagag gtcgtgaaga agatgaagaa ctactggcgg cagctgctga acgccaagct 7260
gattacccag agaaagttcg acaatctgac caaggccgag agaggcggcc tgagcgaact 7320
ggataaggcc ggcttcatca agagacagct ggtggaaacc cggcagatca caaagcacgt 7380
ggcacagatc ctggactccc ggatgaacac taagtacgac gagaatgaca agctgatccg 7440
ggaagtgaaa gtgatcaccc tgaagtccaa gctggtgtcc gatttccgga aggatttcca 7500
gttttacaaa gtgcgcgaga tcaacaacta ccaccacgcc cacgacgcct acctgaacgc 7560
cgtcgtggga accgccctga tcaaaaagta ccctaagctg gaaagcgagt tcgtgtacgg 7620
cgactacaag gtgtacgacg tgcggaagat gatcgccaag agcgagcagg aaatcggcaa 7680
ggctaccgcc aagtacttct tctacagcaa catcatgaac tttttcaaga ccgagattac 7740
cctggccaac ggcgagatcc ggaagcggcc tctgatcgag acaaacggcg aaaccgggga 7800
gatcgtgtgg gataagggcc gggattttgc caccgtgcgg aaagtgctga gcatgcccca 7860
agtgaatatc gtgaaaaaga ccgaggtgca gacaggcggc ttcagcaaag agtctatcct 7920
gcccaagagg aacagcgata agctgatcgc cagaaagaag gactgggacc ctaagaagta 7980
cggcggcttc gacagcccca ccgtggccta ttctgtgctg gtggtggcca aagtggaaaa 8040
gggcaagtcc aagaaactga agagtgtgaa agagctgctg gggatcacca tcatggaaag 8100
aagcagcttc gagaagaatc ccatcgactt tctggaagcc aagggctaca aagaagtgaa 8160
aaaggacctg atcatcaagc tgcctaagta ctccctgttc gagctggaaa acggccggaa 8220
gagaatgctg gcctctgccg gcgaactgca gaagggaaac gaactggccc tgccctccaa 8280
atatgtgaac ttcctgtacc tggccagcca ctatgagaag ctgaagggct cccccgagga 8340
taatgagcag aaacagctgt ttgtggaaca gcacaagcac tacctggacg agatcatcga 8400
gcagatcagc gagttctcca agagagtgat cctggccgac gctaatctgg acaaagtgct 8460
gtccgcctac aacaagcacc gggataagcc catcagagag caggccgaga atatcatcca 8520
cctgtttacc ctgaccaatc tgggagcccc tgccgccttc aagtactttg acaccaccat 8580
cgaccggaag aggtacacca gcaccaaaga ggtgctggac gccaccctga tccaccagag 8640
catcaccggc ctgtacgaga cacggatcga cctgtctcag ctgggaggcg acagccccaa 8700
gaagaagaga aaggtggagg ccagctgatt aattaagaat tcgacccagc tttcttgtac 8760
aaagtggttg atatccagca cagtggcggc cgctcgagtc tagagggccc gcggttcgaa 8820
ggtaagccta tccctaaccc tctcctcggt ctcgattcta cgcgtaccgg ttaggggccc 8880
gtttaaaccc gctgatcagc ctcgactgtg ccttctagtt gccagccatc tgttgtttgc 8940
ccctcccccg tgccttcctt gaccctggaa ggtgccactc ccactgtcct ttcctaataa 9000
aatgaggaaa ttgcatcgca ttgtctgagt aggtgtcatt ctattctggg gggtggggtg 9060
gggcaggaca gcaaggggga ggattgggaa gacaatagca ggcatgctgg ggatgcggtg 9120
ggctctatgg cgacgcgcct ggatccccgg gtaccgagtt gggagctcac ggggacagcc 9180
cccccccaaa gcccccaggg atgtaattac gtccctcccc cgctaggggg cagcagcgag 9240
ccgcccgggg ctccgctccg gtccggcgct ccccccgcat ccccgagccg gcagcgtgcg 9300
gggacagccc gggcacgggg aaggtggcac gggatcgctt tcctctgaac gcttctcgct 9360
gctctttgag cctgcagaca cctgggggga tacggggaaa aagctttagg ctgaaagaga 9420
gatttagaat gacagtctag tgggagctca cggggacagc ccccccccaa agcccccagg 9480
gatgtaatta cgtccctccc ccgctagggg gcagcagcga gccgcccggg gctccgctcc 9540
ggtccggcgc tccccccgca tccccgagcc ggcagcgtgc ggggacagcc cgggcacggg 9600
gaaggtggca cgggatcgct ttcctctgaa cgcttctcgc tgctctttga gcctgcagac 9660
acctgggggg atacggggaa aaagctttag gctgaaagag agatttagaa tgacagaact 9720
cgatttcatt gcagactggc cggccactag tacgcgccgg ctcgacatac tagttaaaag 9780
ttttgttact ttatagaaga aattttgagt ttttgttttt ttttaataaa taaataaaca 9840
taaataaatt gtttgttgaa tttattatta gtatgtaagt gtaaatataa taaaacttaa 9900
tatctattca aattaataaa taaacctcga tatacagacc gataaaacac atgcgtcaat 9960
tttacgcatg attatcttta acgtacgtca caatatgatt atctttctag ggttaaaagg 10020
gcgaattcgc ggccgctaaa ttcaattcgc cctatagtga gtcgtattac aattcactgg 10080
ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt tacccaactt aatcgccttg 10140
cagcacatcc ccctttcgcc agctggcgta atagcgaaga ggcccgcacc gatcgccctt 10200
cccaacagtt gcgcagccta tacgtacggc agtttaaggt ttacacctat aaaagagaga 10260
gccgttatcg tctgtttgtg gatgtacaga gtgatattat tgacacgccg gggcgacgga 10320
tggtgatccc cctggccagt gcacgtctgc tgtcagataa agtctcccgt gaactttacc 10380
cggtggtgca tatcggggat gaaagctggc gcatgatgac caccgatatg gccagtgtgc 10440
cggtctccgt tatcggggaa gaagtggctg atctcagcca ccgcgaaaat gacatcaaaa 10500
acgccattaa cctgatgttc tggggaatat aaatgtcagg catgagatta tcaaaaagga 10560
tcttcaccta gatccttttc acgtagaaag ccagtccgca gaaacggtgc tgaccccgga 10620
tgaatgtcag ctactgggct atctggacaa gggaaaacgc aagcgcaaag agaaagcagg 10680
tagcttgcag tgggcttaca tggcgatagc tagactgggc ggttttatgg acagcaagcg 10740
aaccggaatt gccagctggg gcgccctctg gtaaggttgg gaagccctgc aaagtaaact 10800
ggatggcttt cttgccgcca aggatctgat ggcgcagggg atcaagctct gatcaagaga 10860
caggatgagg atcgtttcgc atgattgaac aagatggatt gcacgcaggt tctccggccg 10920
cttgggtgga gaggctattc ggctatgact gggcacaaca gacaatcggc tgctctgatg 10980
ccgccgtgtt ccggctgtca gcgcaggggc gcccggttct ttttgtcaag accgacctgt 11040
ccggtgccct gaatgaactg caagacgagg cagcgcggct atcgtggctg gccacgacgg 11100
gcgttccttg cgcagctgtg ctcgacgttg tcactgaagc gggaagggac tggctgctat 11160
tgggcgaagt gccggggcag gatctcctgt catctcacct tgctcctgcc gagaaagtat 11220
ccatcatggc tgatgcaatg cggcggctgc atacgcttga tccggctacc tgcccattcg 11280
accaccaagc gaaacatcgc atcgagcgag cacgtactcg gatggaagcc ggtcttgtcg 11340
atcaggatga tctggacgaa gagcatcagg ggctcgcgcc agccgaactg ttcgccaggc 11400
tcaaggcgag catgcccgac ggcgaggatc tcgtcgtgac ccatggcgat gcctgcttgc 11460
cgaatatcat ggtggaaaat ggccgctttt ctggattcat cgactgtggc cggctgggtg 11520
tggcggaccg ctatcaggac atagcgttgg ctacccgtga tattgctgaa gagcttggcg 11580
gcgaatgggc tgaccgcttc ctcgtgcttt acggtatcgc cgctcccgat tcgcagcgca 11640
tcgccttcta tcgccttctt gacgagttct tctgaattat taacgcttac aatttcctga 11700
tgcggtattt tctccttacg catctgtgcg gtatttcaca ccgcatcagg tggcactttt 11760
cggggaaatg tgcgcggaac ccctatttgt ttatttttct aaatacattc aaatatgtat 11820
ccgctcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt 11880
tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc 11940
agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc 12000
gtcgtgtaga taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata 12060
ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg 12120
gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc 12180
cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 12240
acaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 12300
cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 12360
cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca 12420
ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac 12480
tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca 12540
atacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt 12600
tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc 12660
actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca 12720
aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 12780
ctcatactct tcctttttca atattattga agcatttatc agggttattg tctcatgacc 12840
aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga aaagatcaaa 12900
ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca 12960
ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt tccgaaggta 13020
actggcttca gcagagcgca gataccaaat actgttcttc tagtgtagcc gtagttaggc 13080
caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat cctgttacca 13140
gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag acgatagtta 13200
ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc cagcttggag 13260
cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag cgccacgctt 13320
cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac aggagagcgc 13380
acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg gtttcgccac 13440
ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac 13500
gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc tcacatgttc 13560
tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga gtgagctgat 13620
accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga agcggaag 13678
<210> 19
<211> 1745
<212> DNA
<213> Artificial sequence
<400> 19
acacggaaga tgaggtccga gtggcctgct gaggacttgc tgcttgtccc caggtcccca 60
ggtcatgccc tccttctgcc accctgggga gctgagggcc tcagctgggg ctgctgtcct 120
aaggcagggt gggaactagg cagccagcag ggaggggacc cctccctcac tcccactctc 180
ccacccccac caccttggcc catccatggc ggcatcttgg gccatccggg actggggaca 240
ggggtcctgg ggacaggggt gtggggacag gggtcctggg gacaggggtc tggggacagg 300
ggtcctgggg acaggggtgt ggggacaggg gtgtggggac aggggtgtgg ggacaggggt 360
cctggggaca ggggtctggg gacaggggtc tgaggacagg ggtgtgggga caggggtgtg 420
gggacagggg tgtggggaca ggggtgtggg gacaggggtc tggggacagg ggtccggggg 480
acaggggtgt ggggacaggg gtgtggggac aggggtgtgg ggacaggggt ctggggacag 540
gggtgtgggg acaggggtcc tggggacagg ggtgtgggga taggggtgtg gggacagggg 600
tgtggggaca ggggtgtggg gacaggggtc tggggacagc agcgcaaaga gccccgccct 660
gcagcctcca gctctcctgg tctaatgtgg aaagtggccc aggtgagggc tttgctctcc 720
tggagacatt tgcccccagc tgtgagcagg gacaggtctg gccaccgggc ccctggttaa 780
gactctaatg acccgctggt cctgaggaag aggtgctgac gaccaaggag atcttcccac 840
agacccagca ccagggaaat ggtccggaaa ttgcagcctc agcccccagc catctgccga 900
cccccccacc ccaggcccta atgggccagg cggcaggggt tgagaggtag gggagatggg 960
ctctgagact ataaagccag cgggggccca gcagccctca ggatccccgg gtaccggtcg 1020
ccaccatggt gagcaagggc gaggagctgt tcaccggggt ggtgcccatc ctggtcgagc 1080
tggacggcga cgtaaacggc cacaagttca gcgtgtcagg cgagggcgag ggcgatgcca 1140
cctacggcaa gctgaccctg aagttcatct gcaccaccgg caagctgccc gtgccctggc 1200
ccaccctcgt gaccaccctg acctacggcg tgcagtgctt cagccgctac cccgaccaca 1260
tgaagcagca cgacttcttc aagtccgcca tgccagaagg ctacgtccag gagcgcacca 1320
tcttcttcaa ggacgacggc aactacaaga cccgcgccga ggtgaagttc gagggcgaca 1380
ccctggtgaa ccgcatcgag ctgaagggca tcgacttcaa ggaggacggc aacatcctgg 1440
ggcacaagct ggagtacaac tacaacagcc acaacgtcta tatcatggcc gacaagcaga 1500
agaacggcat caaggtgaac ttcaagatcc gccacaacat cgaggacggc agcgtgcagc 1560
tcgccgacca ctaccagcag aacacaccca tcggcgacgg ccccgtgctg ctgcccgaca 1620
accactacct gagcacccag tccgccctga gcaaagaccc caacgagaag cgcgatcaca 1680
tggtcctgct ggagttcgtg accgccgccg ggatcactct cggcatggac gagctgtaca 1740
agtaa 1745
<210> 20
<211> 155
<212> RNA
<213> Artificial sequence
<400> 20
guuuaagagc uaugcuggaa acagcauagc aaguuuaaau aaggcuaguc cguuaucaac 60
uugaaaaagu ggcaccgagu cggugccaau ugggucucca gauuguaugu agccuguaug 120
uagccuguau guagccugua uguagccugu augua 155
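Each sequence in the listing is laid out as space-separated groups of residues followed by a running residue count, with the total length declared in the `<211>` field. The following is a minimal Python sketch, assuming that layout, for checking a transcribed sequence against its declared length, using sequence 20 as input. The function name `parse_st25_residues` and the motif check are illustrative and not part of the application; the 8-nt motif `uguaugua` is simply a repeat visible in the 3' tail of sequence 20.

```python
def parse_st25_residues(lines):
    """Join sequence-listing lines into one residue string.

    Each line holds space-separated residue groups followed by a running
    residue count; the count and all whitespace are dropped.
    """
    residues = []
    for line in lines:
        for token in line.split():
            if token.isdigit():  # trailing running count, e.g. "60"
                continue
            residues.append(token.lower())
    return "".join(residues)


# Lines of sequence 20, copied from the listing above.
SEQ20_LINES = [
    "guuuaagagc uaugcuggaa acagcauagc aaguuuaaau aaggcuaguc cguuaucaac 60",
    "uugaaaaagu ggcaccgagu cggugccaau ugggucucca gauuguaugu agccuguaug 120",
    "uagccuguau guagccugua uguagccugu augua 155",
]
DECLARED_LENGTH = 155  # value of the <211> field for sequence 20

seq20 = parse_st25_residues(SEQ20_LINES)

# The parsed length should equal the declared length.
length_ok = len(seq20) == DECLARED_LENGTH

# <212> declares RNA, so only a, c, g and u are expected.
is_rna = set(seq20) <= set("acgu")

# Count non-overlapping copies of the repeated 8-nt motif in the tail.
motif_count = seq20.count("uguaugua")

print(length_ok, is_rna, motif_count)
```

The same function can be applied to the DNA sequences by swapping the expected alphabet to `acgt` and the declared length to the corresponding `<211>` value.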

Claims (10)

1. A method for preparing insulin-secreting cells from hepatocytes, comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain recombinant cells, the recombinant cells being insulin-secreting cells;
the three functional elements of the Casilio system are: a dCas9 protein, an sgRNA with a PUF domain binding site, and an effector protein fused to a PUF domain;
the sgRNAs with PUF domain binding sites consist of 12 sgRNAs; the target sequences of the 12 sgRNAs are shown, in order, as SEQ ID NO: 1 to SEQ ID NO: 12 of the sequence listing.
2. A method for directly reprogramming hepatocytes into insulin-secreting cells, comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes;
the three functional elements of the Casilio system are: a dCas9 protein, an sgRNA with a PUF domain binding site, and an effector protein fused to a PUF domain;
the sgRNAs with PUF domain binding sites consist of 12 sgRNAs; the target sequences of the 12 sgRNAs are shown, in order, as SEQ ID NO: 1 to SEQ ID NO: 12 of the sequence listing.
3. A recombinant cell prepared by a method comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain recombinant cells;
the three functional elements of the Casilio system are: a dCas9 protein, an sgRNA with a PUF domain binding site, and an effector protein fused to a PUF domain;
the sgRNAs with PUF domain binding sites consist of 12 sgRNAs; the target sequences of the 12 sgRNAs are shown, in order, as SEQ ID NO: 1 to SEQ ID NO: 12 of the sequence listing.
4. A kit comprising DNA encoding the three functional elements of the Casilio system, or a recombinant expression vector containing the DNA encoding the three functional elements of the Casilio system;
the three functional elements of the Casilio system are: a dCas9 protein, an sgRNA with a PUF domain binding site, and an effector protein fused to a PUF domain;
the sgRNAs with PUF domain binding sites consist of 12 sgRNAs; the target sequences of the 12 sgRNAs are shown, in order, as SEQ ID NO: 1 to SEQ ID NO: 12 of the sequence listing;
the kit is used for preparing insulin-secreting cells from hepatocytes.
5. Use of DNA encoding the three functional elements of the Casilio system, or of a recombinant expression vector containing that DNA, in the preparation of a kit;
the three functional elements of the Casilio system are: a dCas9 protein, an sgRNA with a PUF domain binding site, and an effector protein fused to a PUF domain;
the sgRNAs with PUF domain binding sites consist of 12 sgRNAs; the target sequences of the 12 sgRNAs are shown, in order, as SEQ ID NO: 1 to SEQ ID NO: 12 of the sequence listing;
the kit is used for preparing insulin-secreting cells from hepatocytes.
6. An sgRNA combination, being (a) or (b) as follows:
(a) an sgRNA combination consisting of 12 sgRNAs; the target sequences of the 12 sgRNAs are shown, in order, as SEQ ID NO: 1 to SEQ ID NO: 12 of the sequence listing;
(b) an sgRNA combination consisting of 4 sgRNAs; the target sequences of the 4 sgRNAs are shown, in order, as SEQ ID NO: 5 to SEQ ID NO: 8 of the sequence listing.
7. Use of the sgRNA combination of claim 6, the use being (c1), (c2) or (c3):
(c1) preparing insulin-secreting cells from hepatocytes;
(c2) directly reprogramming hepatocytes into insulin-secreting cells;
(c3) preparing a kit; the kit is used for preparing insulin-secreting cells from hepatocytes.
8. A method for preparing insulin-secreting cells from hepatocytes, comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain recombinant cells, the recombinant cells being insulin-secreting cells;
the three functional elements of the Casilio system are: a dCas9 protein, an sgRNA with a PUF domain binding site, and an effector protein fused to a PUF domain;
the sgRNAs with PUF domain binding sites consist of 4 sgRNAs; the target sequences of the 4 sgRNAs are shown, in order, as SEQ ID NO: 5 to SEQ ID NO: 8 of the sequence listing.
9. A recombinant cell prepared by a method comprising the following step: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain recombinant cells;
the three functional elements of the Casilio system are: a dCas9 protein, an sgRNA with a PUF domain binding site, and an effector protein fused to a PUF domain;
the sgRNAs with PUF domain binding sites consist of 4 sgRNAs; the target sequences of the 4 sgRNAs are shown, in order, as SEQ ID NO: 5 to SEQ ID NO: 8 of the sequence listing.
10. Use of the three functional elements of the Casilio system in the preparation of a kit;
the three functional elements of the Casilio system are: a dCas9 protein, an sgRNA with a PUF domain binding site, and an effector protein fused to a PUF domain;
the sgRNAs with PUF domain binding sites consist of 4 sgRNAs; the target sequences of the 4 sgRNAs are shown, in order, as SEQ ID NO: 5 to SEQ ID NO: 8 of the sequence listing;
the kit is used for preparing insulin-secreting cells from hepatocytes.
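The claims refer to two sgRNA combinations that differ only in which entries of the sequence listing supply the target sequences. As a reading aid, the relationship can be written down in a few lines of Python. This is a hypothetical sketch: the class and constant names are invented for illustration, and only the sequence numbers and the three element names come from the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SgRNACombination:
    """An sgRNA set identified by sequence numbers in the listing."""
    label: str
    target_seq_ids: tuple


# Combination (a): 12 sgRNAs, target sequences 1 to 12.
COMBINATION_A = SgRNACombination("(a)", tuple(range(1, 13)))
# Combination (b): 4 sgRNAs, target sequences 5 to 8.
COMBINATION_B = SgRNACombination("(b)", tuple(range(5, 9)))

# The three functional elements of the Casilio system named in the claims.
CASILIO_ELEMENTS = (
    "dCas9 protein",
    "sgRNA with a PUF domain binding site",
    "effector protein fused to a PUF domain",
)


def is_subset(inner, outer):
    """Return True if every target sequence of `inner` is also in `outer`."""
    return set(inner.target_seq_ids) <= set(outer.target_seq_ids)


print(is_subset(COMBINATION_B, COMBINATION_A))
```

Written this way, it is easy to see that the 4-sgRNA combination reuses target sequences that also appear in the 12-sgRNA combination.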
CN202010985873.8A 2020-09-18 2020-09-18 Method and kit for direct reprogramming of hepatocytes into islet-like cells Pending CN114196700A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010985873.8A CN114196700A (en) 2020-09-18 2020-09-18 Method and kit for direct reprogramming of hepatocytes into islet-like cells

Publications (1)

Publication Number Publication Date
CN114196700A (en) 2022-03-18

Family

ID=80645317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010985873.8A Pending CN114196700A (en) 2020-09-18 2020-09-18 Method and kit for direct reprogramming of hepatocytes into islet-like cells

Country Status (1)

Country Link
CN (1) CN114196700A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105002142A (en) * 2015-07-09 2015-10-28 Shenzhen People's Hospital Method for directly reprogramming mouse hepatocyte into islet beta cell, and application thereof
CN108913692A (en) * 2018-07-17 2018-11-30 Zhejiang University sgRNA selectively targeting the SATB1 gene and its application in transcriptional activation

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
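YOU et al.: "Advancements and obstacles of CRISPR-Cas9 technology in translational research", MOL THER METHODS CLIN DEV *
ZHU et al.: "Human pancreatic beta-like cells converted from fibroblasts", NAT COMMUN, vol. 7, no. 1 *
WANG Hanyue et al.: "Highly efficient targeted activation of endogenous genes in hepatocytes for direct reprogramming into islet-like cells", Chinese Journal of Tissue Engineering Research, page 1 *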

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220318