CN114196700A

CN114196700A - Method and kit for direct reprogramming of hepatocytes into islet-like cells

Info

Publication number: CN114196700A
Application number: CN202010985873.8A
Authority: CN
Inventors: 杨晓菲; 李富荣; 王晗月
Original assignee: Shenzhen Peoples Hospital
Current assignee: Shenzhen Peoples Hospital
Priority date: 2020-09-18
Filing date: 2020-09-18
Publication date: 2022-03-18

Abstract

The invention discloses a method and a kit for direct reprogramming of hepatocytes into islet-like cells. The invention provides a method for preparing insulin-secreting cells from hepatocytes, comprising the following step: DNA encoding the three functional elements of the Casilio system is introduced into hepatocytes to obtain recombinant cells, which are the insulin-secreting cells. The three functional elements of the Casilio system are the dCas9 protein, sgRNA with attached PUF domain binding sites, and an effector protein fused to a PUF domain. The sgRNA with attached PUF domain binding sites consists of 12 sgRNAs, whose target sequences are shown in order as sequences 1 to 12 in the sequence table. The invention provides a new research approach and technical means for islet cell regeneration, lays an experimental foundation for the clinical treatment of diabetes, and has clinical application value.

Description

Method and kit for direct reprogramming of hepatocytes into islet-like cells

Technical Field

The present invention relates to a method and kit for direct reprogramming of hepatocytes into islet-like cells.

Background

Diabetes mellitus is a chronic metabolic disease caused by insufficient insulin secretion or by insulin resistance. Allogeneic islet transplantation is currently one of the most promising treatments for diabetes, but the shortage of islet donors and immune rejection limit its clinical application.

Direct cell reprogramming is the direct transdifferentiation of one differentiated cell type into another without dedifferentiation and redifferentiation. The liver and the pancreas are both derived from the ventral foregut endoderm, which makes interconversion between liver cells and pancreatic cells possible. The most critical inducing factor for direct reprogramming of hepatocytes into islet beta cells in vitro is the activation of pancreatic transcription factor expression. Akinci et al. screened three transcription factors, Pdx1, Ngn3 and MafA (PNM), out of 20 transcription factors related to pancreatic beta cell development, and directly reprogrammed exocrine cells into islet-like cells using retrovirus. Banga et al. injected an adenovirus carrying the PNM transcription factors in tandem into mice via the tail vein and found insulin-secreting cells (IPCs) only in the liver. Key transcription factors for pancreatic development, such as PNM, are therefore often used as target genes for direct reprogramming of liver cells into pancreatic beta cells. At present, however, the efficiency of direct cell reprogramming and the maturity of the resulting cells are low; traditional reprogramming methods cannot effectively open the regulatory network of islet-related endogenous genes, and the cells obtained cannot meet the requirements of clinical treatment.

CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/Cas9) is an important gene editing technology that followed zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs). The Casilio system consists of the dCas9 protein, an sgRNA to which one or more PUF domain binding sites are attached (sgRNA-PBS), and an effector protein fused to a PUF domain (PUF fusion protein).

Disclosure of Invention

It is an object of the present invention to provide a method and kit for direct reprogramming of hepatocytes into islet-like cells.

The invention provides a method for preparing insulin secreting cells from hepatocytes, which comprises the following steps: the coding DNA of the three functional elements of the Casilio system is introduced into the liver cells to obtain recombinant cells, namely the insulin secreting cells.

The present invention also provides a method for direct reprogramming of hepatocytes into insulin-secreting cells, comprising the following step: DNA encoding the three functional elements of the Casilio system is introduced into hepatocytes.

The invention also provides a recombinant cell, prepared by a method comprising the following step: DNA encoding the three functional elements of the Casilio system is introduced into hepatocytes to obtain the recombinant cell.

The invention also provides a kit comprising either DNA encoding the three functional elements of the Casilio system or recombinant expression vectors containing that DNA; the kit is used to prepare insulin-secreting cells from hepatocytes. The kit may further comprise PiggyBac transposase, DNA encoding PiggyBac transposase, or a recombinant expression vector containing DNA encoding PiggyBac transposase.

The invention also protects the use of DNA encoding the three functional elements of the Casilio system, or of recombinant expression vectors containing that DNA, in the preparation of a kit; the kit is used to prepare insulin-secreting cells from hepatocytes.

In any of the above, the three functional elements of the Casilio system are: the dCas9 protein, sgRNA with attached PUF domain binding sites, and an effector protein fused to a PUF domain. The sgRNA with attached PUF domain binding sites consists of 12 sgRNAs, whose target sequences are shown in order as sequences 1 to 12 in the sequence table.

Effector proteins fused to PUF domains are transcriptional activators fused to PUF domains.

The DNA coding for the three functional elements of the Casilio system can be introduced into hepatocytes simultaneously or in steps.

Specifically, a DNA encoding dCas9 protein was introduced, a DNA encoding effector protein fused to the PUF domain was introduced, and a DNA encoding sgRNA having a PUF domain binding site attached thereto was introduced.

Specifically, in order to promote integration of foreign DNA into the genomic DNA of the cell, DNA encoding dCas9 protein was introduced together with DNA encoding PiggyBac transposase.

Specifically, in order to promote integration of foreign DNA into the genomic DNA of the cell, DNA encoding an effector protein fused to the PUF domain is introduced together with DNA encoding a PiggyBac transposase.

In particular, DNA encoding dCas9 protein is inducibly expressed in cells. The inducible expression is based on the tet-on system.

In particular, DNA encoding an effector protein fused to a PUF domain is inducibly expressed in a cell. The inducible expression is based on the tet-on system.

Specifically, a DNA encoding dCas9 protein is introduced into cells by a recombinant expression vector. The recombinant expression vector can be specifically PMax-NLS-dCas9 plasmid or PB3-PMax-dCas9 plasmid.

Specifically, DNA encoding an effector protein fused to a PUF domain is introduced into a cell via a recombinant expression vector. The recombinant expression vector can be the Pmax-NLSPUFa_p65-HSF1 plasmid or the PB3-neo(-)-Pmax-NLSPUFa_p65-HSF1 plasmid.

Specifically, the coding DNA of PiggyBac transposase is introduced into cells through a recombinant expression vector. The recombinant expression vector can be a PiggyBac plasmid.

The 12 sgRNAs are named, in order, Pdx1-sgRNA1, Pdx1-sgRNA2, Pdx1-sgRNA3, Pdx1-sgRNA4, Ngn3-sgRNA1, Ngn3-sgRNA2, Ngn3-sgRNA3, Ngn3-sgRNA4, MafA-sgRNA1, MafA-sgRNA2, MafA-sgRNA3 and MafA-sgRNA4.

Specifically, DNA encoding the 12 sgRNAs is introduced into cells via 12 corresponding recombinant expression vectors. The 12 recombinant expression vectors are the recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3, Pdx1-gRNA4, Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3, Ngn3-gRNA4, MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4 of the examples. The starting vector of all 12 recombinant expression vectors is the pX-sgRNA-5xPBSA plasmid.

The recombinant expression vectors containing DNA encoding the three functional elements of the Casilio system may specifically be a recombinant expression vector containing DNA encoding the dCas9 protein, a recombinant expression vector containing DNA encoding the effector protein fused to the PUF domain, and recombinant expression vectors containing DNA encoding the sgRNA with attached PUF domain binding sites. The sgRNA with attached PUF domain binding sites consists of 12 sgRNAs, whose target sequences are shown in order as sequences 1 to 12 in the sequence table. Accordingly, the recombinant expression vectors containing DNA encoding the sgRNA with attached PUF domain binding sites consist of 12 recombinant expression vectors. The 12 recombinant expression vectors are the recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3, Pdx1-gRNA4, Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3, Ngn3-gRNA4, MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4 of the examples. The starting vector of all 12 recombinant expression vectors is the pX-sgRNA-5xPBSA plasmid.

The recombinant expression vector containing the coding DNA of the PiggyBac transposase can be a PiggyBac plasmid.

The 12 sgRNAs differ only in the target sequence binding region. Each of the 12 sgRNAs consists, from upstream to downstream, of a target sequence binding region and a constant region. The constant region is shown as sequence 20 in the sequence table. The target sequence binding region is the RNA corresponding to the target sequence after three nucleotides have been removed from its 3' end. The target sequences are those in Table 1.

The invention also protects sgRNA combinations as (a) or (b) below:

(a) the sgRNA combination consists of 12 sgRNAs; target sequences of the 12 sgRNAs are sequentially shown as a sequence 1 to a sequence 12 in a sequence table;

(b) sgRNA combinations consisting of 4 sgRNAs; the target sequences of the 4 sgRNAs are sequentially shown as a sequence 5 to a sequence 8 in a sequence table.

The invention also protects the use of the sgRNA combination for (c1), (c2) or (c3) below:

(c1) preparing insulin secreting cells from hepatocytes;

(c2) direct reprogramming of hepatocytes into insulin-secreting cells;

(c3) preparing a kit; the kit has the function of preparing insulin secreting cells by using liver cells.

The invention also provides a method for preparing insulin secreting cells from hepatocytes, comprising the steps of: the coding DNA of the three functional elements of the Casilio system is introduced into the liver cells to obtain recombinant cells, namely the insulin secreting cells.

The invention also protects the application of three functional elements of the Casilio system in the preparation of the kit; the kit has the function of preparing insulin secreting cells by using liver cells.

The three functional elements of the Casilio system are: the dCas9 protein, sgRNA with attached PUF domain binding sites, and an effector protein fused to a PUF domain. The sgRNA with attached PUF domain binding sites consists of 4 sgRNAs, whose target sequences are shown in order as sequences 5 to 8 in the sequence table.

The 4 sgRNAs are named, in order, Ngn3-sgRNA1, Ngn3-sgRNA2, Ngn3-sgRNA3 and Ngn3-sgRNA4.

Specifically, DNA encoding the 4 sgRNAs is introduced into cells via 4 recombinant expression vectors. The 4 recombinant expression vectors are the recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4 of the examples. The starting vector of all 4 recombinant expression vectors is the pX-sgRNA-5xPBSA plasmid.

The recombinant expression vectors containing DNA encoding the three functional elements of the Casilio system may specifically be a recombinant expression vector containing DNA encoding the dCas9 protein, a recombinant expression vector containing DNA encoding the effector protein fused to the PUF domain, and recombinant expression vectors containing DNA encoding the sgRNA with attached PUF domain binding sites. The sgRNA with attached PUF domain binding sites consists of 4 sgRNAs, whose target sequences are shown in order as sequences 5 to 8 in the sequence table. Accordingly, the recombinant expression vectors containing DNA encoding the sgRNA with attached PUF domain binding sites consist of 4 recombinant expression vectors. The 4 recombinant expression vectors are the recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4 of the examples.

The 4 sgRNAs differ only in the target sequence binding region. Each of the 4 sgRNAs consists, from upstream to downstream, of a target sequence binding region and a constant region. The constant region is shown as sequence 20 in the sequence table. The target sequence binding region is the RNA corresponding to the target sequence after three nucleotides have been removed from its 3' end. The target sequences are those in Table 1.

The amino acid sequence of the dCas9 protein is the amino acid sequence encoded by nucleotides 1125-5225 of sequence 16 in the sequence table.

The amino acid sequence of the PiggyBac transposase is the amino acid sequence encoded by nucleotides 953-2737 of sequence 14 in the sequence table.

The effector protein fused to the PUF domain comprises two segments, PUFa and P65-HSF1. The amino acid sequence of the PUFa domain is the amino acid sequence encoded by nucleotides 1248-2333 of sequence 15 in the sequence table. The amino acid sequence of P65-HSF1 is the amino acid sequence encoded by nucleotides 2445-3467 of sequence 15.

The DNA encoding the dCas9 protein is shown as nucleotides 1125-5225 of sequence 16 in the sequence table.

The DNA encoding the PUFa domain is shown as nucleotides 1248-2333 of sequence 15 in the sequence table.

The DNA encoding P65-HSF1 is shown as nucleotides 2445-3467 of sequence 15 in the sequence table.

The DNA encoding the effector protein fused to the PUF domain is shown as nucleotides 1248-3467 of sequence 15 in the sequence table.

The DNA encoding the PiggyBac transposase is shown as nucleotides 953-2737 of sequence 14 in the sequence table.
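The nucleotide ranges quoted above can be sanity-checked with simple arithmetic. The following is a minimal sketch, not part of the disclosed method: it only takes the 1-based, inclusive coordinates stated in the description and reports each segment's length and whether that length is a multiple of three. The helper names (`segment_length`, `summarize`) are illustrative assumptions, and the codon count does not say whether a stop codon is included, since the description does not state this.

```python
# Nucleotide ranges as quoted in the description (1-based, inclusive).
CODING_RANGES = {
    "dCas9 (sequence 16)": (1125, 5225),
    "PUFa domain (sequence 15)": (1248, 2333),
    "P65-HSF1 (sequence 15)": (2445, 3467),
    "PUFa + P65-HSF1 fusion (sequence 15)": (1248, 3467),
    "PiggyBac transposase (sequence 14)": (953, 2737),
}


def segment_length(start, end):
    """Length of a 1-based, inclusive nucleotide range."""
    return end - start + 1


def summarize(ranges):
    """Map each name to (length in nt, number of whole codons, length divisible by 3)."""
    summary = {}
    for name, (start, end) in ranges.items():
        length = segment_length(start, end)
        summary[name] = (length, length // 3, length % 3 == 0)
    return summary


if __name__ == "__main__":
    for name, (length, codons, in_frame) in summarize(CODING_RANGES).items():
        print(f"{name}: {length} nt, {codons} codons, multiple of 3: {in_frame}")
```

A check of this kind only confirms that the quoted coordinates are internally consistent; it says nothing about the sequences themselves, which are given in the sequence table.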

Any of the above hepatocytes may specifically be HepG2 cells.

Current methods for direct reprogramming of hepatocytes into islet-like cells still suffer from low conversion efficiency and poor maturity, mainly because epigenetic barriers exist between different mature somatic cell types and the regulatory network of islet-related endogenous genes cannot be effectively opened.

Based on the CRISPR/dCas9 Casilio system, the inventors designed 4 gRNAs for each of the Pdx1, Ngn3 and MafA promoter regions, 12 gRNAs in total. Introducing the 4 gRNAs together has an additive effect on activation relative to introducing a single gRNA; the Casilio system achieves specific activation of the target gene, although this regulation is dose-dependent. Hepatocytes can be converted into islet-like cells either by targeted activation of the three key endogenous transcription factors PNM or by targeted activation of endogenous Ngn3 alone, which avoids the problems of introducing exogenous factors into living cells, such as poor specificity for endogenous factors, short duration of expression and low expression level.

The inventors combine the CRISPR/dCas9-based Casilio system with the PB transposon system, use the transcriptional activator P65-HSF1 to change the histone modification state, and activate efficient endogenous expression of the key pancreatic transcription factors PNM. The dCas9 and P65-HSF1 fragments are integrated into the genome of the HepG2 hepatocyte line, and the hepatocytes are thereby directly reprogrammed into islet-like cells with a reprogramming efficiency of 10-15%. The invention requires no exogenous transcription regulatory factor, which is of significance for the safety and specificity of biomedical research. The invention provides a new research approach and technical means for islet cell regeneration, lays an experimental foundation for the clinical treatment of diabetes, and has clinical application value.

Drawings

FIG. 1 is a schematic diagram of the sgRNA activation positions selected according to the 1000 bp region upstream of the transcription start sites of the Pdx1, Ngn3 and MafA genes.

FIG. 2 is a graph showing the results of RT-PCR assay in step one of example 1.

FIG. 3 is a graph showing the results of immunofluorescence detection in step two of example 1.

FIG. 4 is a graph showing the results of RT-PCR detection in step two of example 1.

FIG. 5 is a graph showing the results of Western Blot in step three of example 2.

FIG. 6 is a graph showing the results of immunofluorescence detection in step three of example 2.

FIG. 7 is a photograph showing the observation of cell morphology in step three of example 2.

FIG. 8 is a graph showing the results of the first step in example 3.

FIG. 9 is a photograph of the cells obtained in step two of example 3 under fluorescence after culturing for 72 hours.

FIG. 10 is a photograph under fluorescence of the pnm-gRNA group during the culture in step two of example 3.

Fig. 11 is the statistical results of fig. 10.

Detailed Description

The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention. The examples provided below serve as a guide for further modifications by a person skilled in the art and do not constitute a limitation of the invention in any way.

The experimental procedures in the following examples, unless otherwise indicated, are conventional and are carried out according to the techniques or conditions described in the literature in the field or according to the instructions of the products. Materials, reagents and the like used in the following examples are commercially available unless otherwise specified. The plasmids in the examples are all double-stranded, circularized DNA molecules. HepG2 cells: human liver cancer cells.

The pX-sgRNA-5xPBSA plasmid is shown as sequence 13 in the sequence table. In sequence 13, the insertion site lies between nucleotides 257 and 258; after a DNA molecule encoding a target sequence binding region is inserted, the recombinant plasmid expresses an sgRNA with 5 PUF domain binding sites attached. In sequence 13, nucleotides 262-343 encode the sgRNA scaffold, nucleotides 361-368 encode the 1st PUF domain binding site, nucleotides 372-379 the 2nd, nucleotides 383-390 the 3rd, nucleotides 394-401 the 4th, and nucleotides 405-412 the 5th.

The PiggyBac plasmid is shown as sequence 14 in the sequence table. In sequence 14, nucleotides 953-2737 encode PiggyBac transposase (PBase).
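The layout of the five PUF domain binding sites can be read directly off the coordinates given for sequence 13. The sketch below, offered as an illustration only, derives the length of each site and of the spacers between consecutive sites from those coordinates; the function names are hypothetical and nothing here depends on the actual nucleotide sequence.

```python
# Coordinates in sequence 13 as quoted in the description (1-based, inclusive).
SCAFFOLD = (262, 343)
PBS_SITES = [(361, 368), (372, 379), (383, 390), (394, 401), (405, 412)]


def site_lengths(sites):
    """Length in nucleotides of each binding site."""
    return [end - start + 1 for start, end in sites]


def spacer_lengths(sites):
    """Number of nucleotides lying between each pair of consecutive sites."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(sites, sites[1:])]


if __name__ == "__main__":
    print("scaffold length:", SCAFFOLD[1] - SCAFFOLD[0] + 1)
    print("site lengths:", site_lengths(PBS_SITES))
    print("spacer lengths:", spacer_lengths(PBS_SITES))
    print("scaffold-to-first-site gap:", PBS_SITES[0][0] - SCAFFOLD[1] - 1)
```

On the quoted coordinates, every site spans 8 nucleotides and consecutive sites are separated by 3 nucleotides.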

Example 1. Design and comparison of sgRNAs

The Pmax-NLSPUFa_p65-HSF1 plasmid is shown as sequence 15 in the sequence table. In sequence 15, nucleotides 1008-3476 are the coding frame, nucleotides 1098-1169 encode 3 consecutive NLSs, nucleotides 1248-2333 encode the PUFa domain, and nucleotides 2445-3467 encode P65-HSF1.

The PMax-NLS-dCas9 plasmid is shown as sequence 16 in the sequence table. In sequence 16, nucleotides 1008-5315 are the coding frame, nucleotides 1095-1115 encode the NLS (nuclear localization signal), and nucleotides 1125-5225 encode the dCas9 protein.

Step one: Establishment and verification of the Casilio system for targeted activation of Pdx1, Ngn3 and MafA

The sgRNA activation positions were selected according to the 1000 bp region upstream of the transcription start sites of the Pdx1, Ngn3 and MafA genes. A schematic is shown in FIG. 1. Four gRNAs were designed for each gene, and the target sequences are shown in Table 1.

TABLE 1

Target sequence (sequence number in the sequence table)	Sequence (5'→3')
Pdx1-gRNA1 target sequence (sequence 1)	GAACCCACAGCCAGCGCGGACCGG
Pdx1-gRNA2 target sequence (sequence 2)	GTTCAGCCGGGGGCCGTGATTGG
Pdx1-gRNA3 target sequence (sequence 3)	GAACAAAAGCAGGTGCTCGCGGG
Pdx1-gRNA4 target sequence (sequence 4)	GCTGGCGGTGCTCCCCAAAATGG
Ngn3-gRNA1 target sequence (sequence 5)	GCCACCGGCCAATCAGCGCCGGG
Ngn3-gRNA2 target sequence (sequence 6)	GGATTCCGGACAAAGGGCCGGGG
Ngn3-gRNA3 target sequence (sequence 7)	GTGCTCTCTCGAGGGCGGGCTGGG
Ngn3-gRNA4 target sequence (sequence 8)	GAGCCTCGTGTGGCTCTGGTCAGG
MafA-gRNA1 target sequence (sequence 9)	GCGCAGGGAAAAGTTTCACGTGG
MafA-gRNA2 target sequence (sequence 10)	GCCAGGTGTCTCGGGCGACCCCGG
MafA-gRNA3 target sequence (sequence 11)	GCCGCCGCCTCGGGCTGCTCCGGG
MafA-gRNA4 target sequence (sequence 12)	GCGTTTAGCCGTGGGAGGCGGGG

1. Construction of 12 recombinant plasmids

Twelve recombinant plasmids were constructed from the starting vector, the pX-sgRNA-5xPBSA plasmid (the exogenous DNA molecule is inserted between nucleotides 257 and 258 of sequence 13 in the sequence table). The exogenous DNA molecule inserted to obtain each recombinant plasmid is as follows. Recombinant plasmid Pdx1-gRNA1: nucleotides 1-21 of sequence 1 in the sequence table. Recombinant plasmid Pdx1-gRNA2: nucleotides 1-20 of sequence 2. Recombinant plasmid Pdx1-gRNA3: nucleotides 1-20 of sequence 3. Recombinant plasmid Pdx1-gRNA4: nucleotides 1-20 of sequence 4. Recombinant plasmid Ngn3-gRNA1: nucleotides 1-20 of sequence 5. Recombinant plasmid Ngn3-gRNA2: nucleotides 1-20 of sequence 6. Recombinant plasmid Ngn3-gRNA3: nucleotides 1-21 of sequence 7. Recombinant plasmid Ngn3-gRNA4: nucleotides 1-21 of sequence 8. Recombinant plasmid MafA-gRNA1: nucleotides 1-20 of sequence 9. Recombinant plasmid MafA-gRNA2: nucleotides 1-21 of sequence 10. Recombinant plasmid MafA-gRNA3: nucleotides 1-21 of sequence 11. Recombinant plasmid MafA-gRNA4: nucleotides 1-20 of sequence 12.

The 12 recombinant plasmids express the corresponding 12 sgRNAs. The 12 sgRNAs differ only in the target sequence binding region. Each of the 12 sgRNAs consists, from upstream to downstream, of a target sequence binding region and a constant region. The constant region is shown as sequence 20 in the sequence table. The target sequence binding region is the RNA corresponding to the target sequence after three nucleotides have been removed from its 3' end.
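The rule for deriving each target sequence binding region (remove three nucleotides from the 3' end of the target sequence, then take the corresponding RNA) can be sketched as follows. This is an illustrative sketch only: the target sequences are copied from Table 1, while the function names and the reading of "corresponding RNA" as a simple T-to-U substitution are assumptions made here. The three removed nucleotides all end in GG in Table 1, which is consistent with an NGG PAM, although the description does not use that term.

```python
# Target sequences copied from Table 1 (5'->3', DNA).
TARGETS = {
    "Pdx1-gRNA1": "GAACCCACAGCCAGCGCGGACCGG",
    "Pdx1-gRNA2": "GTTCAGCCGGGGGCCGTGATTGG",
    "Pdx1-gRNA3": "GAACAAAAGCAGGTGCTCGCGGG",
    "Pdx1-gRNA4": "GCTGGCGGTGCTCCCCAAAATGG",
    "Ngn3-gRNA1": "GCCACCGGCCAATCAGCGCCGGG",
    "Ngn3-gRNA2": "GGATTCCGGACAAAGGGCCGGGG",
    "Ngn3-gRNA3": "GTGCTCTCTCGAGGGCGGGCTGGG",
    "Ngn3-gRNA4": "GAGCCTCGTGTGGCTCTGGTCAGG",
    "MafA-gRNA1": "GCGCAGGGAAAAGTTTCACGTGG",
    "MafA-gRNA2": "GCCAGGTGTCTCGGGCGACCCCGG",
    "MafA-gRNA3": "GCCGCCGCCTCGGGCTGCTCCGGG",
    "MafA-gRNA4": "GCGTTTAGCCGTGGGAGGCGGGG",
}


def removed_tail(target_dna, trim=3):
    """The `trim` nucleotides removed from the 3' end of the target sequence."""
    return target_dna.upper()[-trim:]


def binding_region(target_dna, trim=3):
    """Drop `trim` nucleotides from the 3' end, then write the rest as RNA (T -> U)."""
    return target_dna.upper()[:-trim].replace("T", "U")


if __name__ == "__main__":
    for name, target in TARGETS.items():
        region = binding_region(target)
        print(f"{name}: {region} ({len(region)} nt), removed: {removed_tail(target)}")
```

The resulting binding regions are 20 or 21 nucleotides long, matching the nucleotide ranges (1-20 or 1-21) listed for the exogenous DNA molecules inserted into each recombinant plasmid.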

2. Co-transfection of cells

Test cells: 293T cells.

The plasmids were co-transfected into the test cells with Lipofectamine 3000. 48 hours after transfection, the cells were collected and subjected to RT-PCR detection and immunofluorescence detection.

dCas9+p65-HSF1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid and the PMax-NLS-dCas9 plasmid at a mass ratio of 1:1.

pool-gRNA-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid, recombinant plasmid Pdx1-gRNA1, recombinant plasmid Pdx1-gRNA2, recombinant plasmid Pdx1-gRNA3 and recombinant plasmid Pdx1-gRNA4, at a mass ratio, in that order, of 1:1:1/4:1/4:1/4:1/4.

pool-gRNA-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid, recombinant plasmid Ngn3-gRNA1, recombinant plasmid Ngn3-gRNA2, recombinant plasmid Ngn3-gRNA3 and recombinant plasmid Ngn3-gRNA4, at a mass ratio, in that order, of 1:1:1/4:1/4:1/4:1/4.

pool-gRNA-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid, recombinant plasmid MafA-gRNA1, recombinant plasmid MafA-gRNA2, recombinant plasmid MafA-gRNA3 and recombinant plasmid MafA-gRNA4, at a mass ratio, in that order, of 1:1:1/4:1/4:1/4:1/4.

pnm-gRNA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid, recombinant plasmid Pdx1-gRNA1, recombinant plasmid Pdx1-gRNA2, recombinant plasmid Pdx1-gRNA3, recombinant plasmid Pdx1-gRNA4, recombinant plasmid Ngn3-gRNA1, recombinant plasmid Ngn3-gRNA2, recombinant plasmid Ngn3-gRNA3, recombinant plasmid Ngn3-gRNA4, recombinant plasmid MafA-gRNA1, recombinant plasmid MafA-gRNA2, recombinant plasmid MafA-gRNA3 and recombinant plasmid MafA-gRNA4, at a mass ratio, in that order, of 1:1:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12:1/12.

Untransfected 293T cells were processed in parallel as the Control group (ctrl).
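In each pooled group the gRNA plasmids together make up one part by mass, the same as each of the other two plasmids, so the total gRNA plasmid mass stays constant whether 4 or 12 gRNA plasmids are used. A minimal sketch of this arithmetic follows, assuming a total DNA mass per transfection that is purely hypothetical (the description gives only ratios, not amounts); the function names are likewise illustrative.

```python
from fractions import Fraction


def ratio_parts(n_grna):
    """Mass-ratio parts: effector plasmid, dCas9 plasmid, then n gRNA plasmids at 1/n each."""
    return [Fraction(1), Fraction(1)] + [Fraction(1, n_grna)] * n_grna


def plasmid_masses(total_ng, n_grna):
    """Split a total DNA mass (hypothetical, in ng) across plasmids according to the ratio."""
    parts = ratio_parts(n_grna)
    total_parts = sum(parts)
    return [float(total_ng * part / total_parts) for part in parts]


if __name__ == "__main__":
    # 3000 ng is an arbitrary illustrative total, not a value from the description.
    print("pool group (4 gRNA plasmids):", plasmid_masses(3000, 4))
    print("pnm group (12 gRNA plasmids):", plasmid_masses(3000, 12))
```

With this ratio each individual gRNA plasmid in the pnm-gRNA group is present at one third of the mass it has in a single-gene pool group, which is relevant when comparing activation levels between the two designs.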

3. RT-PCR detection

Cells were collected, total RNA was extracted, and cDNA was obtained by reverse transcription. Using the cDNA as template, the level of the Pdx1 gene was detected in the pool-gRNA-Pdx1 group, the level of the Ngn3 gene in the pool-gRNA-Ngn3 group, and the level of the MafA gene in the pool-gRNA-MafA group. In the dCas9+p65-HSF1 group, the Control group and the pnm-gRNA group, the levels of the Pdx1, Ngn3 and MafA genes were all detected. The GAPDH gene was used as the reference gene. The mRNA expression level of each gene relative to GAPDH was calculated by the 2^(-ΔΔCt) method.
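The 2^(-ΔΔCt) calculation can be sketched as below. This is the standard form of the method, written under the usual assumption of roughly 100% amplification efficiency for both the target and the reference gene; the function name and all Ct values are hypothetical and are not data from this document.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene versus the control group by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene, e.g. GAPDH), computed per group
    ddCt = dCt(sample group) - dCt(control group)
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values chosen only to illustrate the arithmetic.
    fold = relative_expression(ct_target_sample=22.0, ct_ref_sample=18.0,
                               ct_target_control=30.0, ct_ref_control=18.0)
    print(f"fold change vs control: {fold:.0f}")
```

By construction the control group evaluates to 1, which matches the convention of setting background expression in the Control group to 1.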

For detecting Pdx1 gene: the sequence of the upstream primer is 5'-GATTGGCGTTGTTTGTGGCT-3', and the sequence of the downstream primer is 5'-GCCGGCTTCTCTAAACAGGT-3'.

For detection of the Ngn3 gene: the sequence of the upstream primer is 5'-CGGTAGAAAGGATGACGCCT-3', and the sequence of the downstream primer is 5'-GGTCACTTCGTCTTCCGAGG-3'.

For detection of the MafA gene: the sequence of the upstream primer is 5'-AGAGCGAGAAGTGCCAACTC-3', and the sequence of the downstream primer is 5'-TGTACAGGTCCCGCTCTTTG-3'.

For detection of GAPDH gene: the sequence of the upstream primer is 5'-AGAAGGCTGGGGCTCATTTG-3', and the sequence of the downstream primer is 5'-AGGGGCCATCCACAGTCTTC-3'.

The results are shown in FIG. 2 (mean of 4 replicates; the significance markers denote P<0.05, P<0.01 and P<0.001). Background gene expression in the Control group is set to 1. Compared with the Control group, the expression levels of the three genes in the dCas9+p65-HSF1 group (the group without sgRNA) showed no significant difference. The results show that, compared with the Control group, adding the 4 gRNAs for the Pdx1 gene, the Ngn3 gene or the MafA gene up-regulated the corresponding gene 163-fold (P<0.05), 503-fold (P<0.05) and 12-fold (P<0.001) respectively; when all 12 gRNAs were added simultaneously, the expression level of the Pdx1 gene was up-regulated 62-fold (P<0.001), that of the Ngn3 gene 301-fold (P<0.01), and that of the MafA gene 12-fold (P<0.05).

4. Immunofluorescence detection (IF detection)

The cells were fixed with 4% paraformaldehyde at room temperature for 20 min, blocked with 1% BSA at room temperature for 30 min, incubated with the primary antibody at 4 ℃ overnight, washed 3 times with PBS-T (PBS containing 0.1% Triton X-100), incubated with the corresponding secondary antibody at room temperature for 2 h, washed 3 times with PBS-T, incubated with DAPI for 15 min, and then observed and photographed under a fluorescence microscope.

The primary antibody for detecting Pdx1 was a rabbit anti-human Pdx1 primary antibody, used at a working dilution of 1:1000 (Abcam). The primary antibody for detecting Ngn3 was a mouse anti-human Ngn3 primary antibody, used at a working dilution of 1:50 (DSHB, Developmental Studies Hybridoma Bank). The primary antibody for detecting MafA was a rabbit anti-human MafA primary antibody, used at a working dilution of 1:1000 (Abcam).

The results are shown in FIG. 3. The results show that the Casilio system of the invention can realize the targeted activation of endogenous genes.

Secondly, screening the optimal gRNA aiming at Pdx1, Ngn3 and MafA promoter regions

The test cells were: 293T cells or HepG2 cells.

The plasmid was co-transfected into test cells with lipofectamine3000, and 48 hours after transfection, the cells were collected and subjected to RT-PCR detection according to the method of step one, 3.

dCas9+p65-HSF1 group (also referred to as the no-gRNA group): co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid and the PMax-NLS-dCas9 plasmid, at a mass ratio of 1:1.

single-gRNA1-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid Pdx1-gRNA1, at a mass ratio of 1:1:1 in that order.

single-gRNA2-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid Pdx1-gRNA2, at a mass ratio of 1:1:1 in that order.

single-gRNA3-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid Pdx1-gRNA3, at a mass ratio of 1:1:1 in that order.

single-gRNA4-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid Pdx1-gRNA4, at a mass ratio of 1:1:1 in that order.

pool-gRNA-Pdx1 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid, the recombinant plasmid Pdx1-gRNA1, the recombinant plasmid Pdx1-gRNA2, the recombinant plasmid Pdx1-gRNA3 and the recombinant plasmid Pdx1-gRNA4, at a mass ratio of 1:1:1/4:1/4:1/4:1/4 in that order.

single-gRNA1-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid Ngn3-gRNA1, at a mass ratio of 1:1:1 in that order.

single-gRNA2-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid Ngn3-gRNA2, at a mass ratio of 1:1:1 in that order.

single-gRNA3-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid Ngn3-gRNA3, at a mass ratio of 1:1:1 in that order.

single-gRNA4-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid Ngn3-gRNA4, at a mass ratio of 1:1:1 in that order.

pool-gRNA-Ngn3 group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid, the recombinant plasmid Ngn3-gRNA1, the recombinant plasmid Ngn3-gRNA2, the recombinant plasmid Ngn3-gRNA3 and the recombinant plasmid Ngn3-gRNA4, at a mass ratio of 1:1:1/4:1/4:1/4:1/4 in that order.

single-gRNA1-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid MafA-gRNA1, at a mass ratio of 1:1:1 in that order.

single-gRNA2-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid MafA-gRNA2, at a mass ratio of 1:1:1 in that order.

single-gRNA3-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid MafA-gRNA3, at a mass ratio of 1:1:1 in that order.

single-gRNA4-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid and the recombinant plasmid MafA-gRNA4, at a mass ratio of 1:1:1 in that order.

pool-gRNA-MafA group: co-transfection of the Pmax-NLSPUFa_p65-HSF1 plasmid, the PMax-NLS-dCas9 plasmid, the recombinant plasmid MafA-gRNA1, the recombinant plasmid MafA-gRNA2, the recombinant plasmid MafA-gRNA3 and the recombinant plasmid MafA-gRNA4, at a mass ratio of 1:1:1/4:1/4:1/4:1/4 in that order.

Untransfected cells treated in parallel served as the Control group (ctrl).
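The mass ratios of the transfection groups can be converted into per-plasmid DNA amounts with simple arithmetic. The following is a minimal sketch only: the total DNA amount per well (`TOTAL_NG`) and the helper name `plasmid_masses` are hypothetical and are not specified in this description.

```python
from fractions import Fraction


def plasmid_masses(ratio, total_ng):
    """Split a total DNA mass (ng) among plasmids according to a mass ratio.

    `ratio` is a list of strings such as "1" or "1/4", in plasmid order.
    """
    parts = [Fraction(r) for r in ratio]
    total_parts = sum(parts)
    return [float(total_ng * p / total_parts) for p in parts]


# Order: effector plasmid, dCas9 plasmid, then the gRNA plasmid(s).
single_ratio = ["1", "1", "1"]
pool_ratio = ["1", "1", "1/4", "1/4", "1/4", "1/4"]

TOTAL_NG = 1500  # hypothetical total DNA per well; not taken from the text

single = plasmid_masses(single_ratio, TOTAL_NG)
pool = plasmid_masses(pool_ratio, TOTAL_NG)

print("single-gRNA group (ng):", single)
print("pool-gRNA group (ng):  ", pool)
```

With these ratios, the four gRNA plasmids of a pool group together carry the same DNA mass as the single gRNA plasmid of a single-gRNA group, so the groups differ only in how the gRNA portion is divided.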

The results are shown in FIG. 4 (A-C: 293T cells; D-F: HepG2 cells; mean of 4 replicates; *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001). The background expression of each gene in the Control group was set to 1. Compared with the Control group, the expression levels of the three genes in the dCas9+p65-HSF1 group (no-gRNA group) showed no significant difference.

In the 293T cell line: for the Pdx1 gene, gRNA3 and gRNA2 gave the highest up-regulation, 26-fold (P<0.0001) and 16-fold (P<0.0001) respectively; for the Ngn3 gene, gRNA3 and gRNA2 gave the highest up-regulation, 501-fold (P<0.0001) and 398-fold (P<0.0001) respectively; for the MafA gene, gRNA2 and gRNA4 gave the highest up-regulation, 3-fold and 2-fold respectively.

In the HepG2 cell line: for the Pdx1 gene, gRNA4 and gRNA3 gave the highest up-regulation, 223-fold (P<0.05) and 82-fold (P<0.001) respectively; for the Ngn3 gene, gRNA3 and gRNA2 gave the highest up-regulation, 234-fold (P<0.001) and 95-fold (P<0.01) respectively; for the MafA gene, gRNA2 and gRNA4 gave the highest up-regulation, 6-fold and 5-fold respectively.

For each of the three genes Pdx1, Ngn3 and MafA, the 4 gRNAs acted synergistically in both 293T and HepG2 cells, i.e. the activation efficiency of simultaneously transfecting the 4 gRNAs was higher than that of any single gRNA.
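Fold changes normalized to a Control group set to 1 are commonly computed from RT-PCR Ct values with the 2^-ΔΔCt method. The sketch below assumes that method, which this passage does not specify; the reference gene and all Ct values are hypothetical illustrations, not measured data.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method (assumed, not stated in the text).

    dCt  = Ct(target gene) - Ct(reference gene), computed per group
    ddCt = dCt(sample) - dCt(control)
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    dd_ct = d_ct_sample - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target gene amplifies 8 cycles earlier in the
# transfected sample than in the control, with an unchanged reference gene.
control = fold_change(33.0, 18.0, 33.0, 18.0)    # control against itself
activated = fold_change(25.0, 18.0, 33.0, 18.0)  # transfected sample

print(control, activated)
```

By construction the Control group evaluates to 1, and every additional cycle of earlier amplification doubles the reported fold change.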

Example 2 Construction and identification of an Ins-EGFP-HepG2 cell line with lentivirus-mediated stable expression

The PB3-neo(-)-pmax-NLSPUFa_p65-HSF1 plasmid is shown as sequence 17 in the sequence table. In sequence 17, nucleotides 1603-3435 constitute one gene (template strand) and nucleotides 5132-7600 constitute another gene (coding strand). In sequence 17, nucleotides 1606-2349 encode the reverse Tet transcriptional activator (rtTA), nucleotides 2350-2403 encode the 2A linker peptide, and nucleotides 2413-3420 encode the hygromycin resistance protein. In sequence 17, nucleotides 4753-4985 constitute the Tet Response Element (TRE) (with 7 repeats of the TetO sequence) and nucleotides 4998-5070 constitute the PminCMV promoter. In sequence 17, nucleotides 5135-5200 encode a FLAG tag, nucleotides 5222-5293 encode 3 consecutive NLSs, nucleotides 5372-6457 encode the PUFa domain, and nucleotides 6569-7591 encode p65-HSF1.

The PB3-pmax-dCas9 plasmid is shown as sequence 18 in the sequence table. In sequence 18, nucleotides 1603-2832 constitute one gene (template strand) and nucleotides 4529-8728 constitute the other gene (coding strand). In sequence 18, nucleotides 1606-2349 encode the reverse Tet transcriptional activator (rtTA), nucleotides 2350-2403 encode the 2A linker peptide, and nucleotides 2413-2832 encode the blasticidin resistance protein. In sequence 18, nucleotides 4150-4382 constitute the Tet Response Element (TRE) (with 7 repeats of the TetO sequence) and nucleotides 4395-4467 constitute the PminCMV promoter. In sequence 18, nucleotides 4532-4558 encode an HA tag, nucleotides 4562-4582 encode an NLS (nuclear localization signal), and nucleotides 4592-8692 encode the dCas9 protein.
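The annotations above use 1-based, inclusive nucleotide positions. As a quick consistency check, the following sketch tabulates the stated coordinates, computes each feature length, and shows how such a range maps onto a 0-based Python slice. The dictionary and function names are illustrative only, and the short test string stands in for the real plasmid sequence.

```python
# Feature coordinates as stated for sequence 17 and sequence 18 (1-based, inclusive).
FEATURES_SEQ17 = {
    "rtTA": (1606, 2349),
    "2A": (2350, 2403),
    "hygromycin resistance": (2413, 3420),
    "TRE": (4753, 4985),
    "PminCMV": (4998, 5070),
    "FLAG": (5135, 5200),
    "3xNLS": (5222, 5293),
    "PUFa": (5372, 6457),
    "p65-HSF1": (6569, 7591),
}
FEATURES_SEQ18 = {
    "rtTA": (1606, 2349),
    "2A": (2350, 2403),
    "blasticidin resistance": (2413, 2832),
    "TRE": (4150, 4382),
    "PminCMV": (4395, 4467),
    "HA": (4532, 4558),
    "NLS": (4562, 4582),
    "dCas9": (4592, 8692),
}


def feature_length(start, end):
    """Length in nucleotides of a 1-based inclusive range."""
    return end - start + 1


def extract(seq, start, end):
    """Return the subsequence for a 1-based inclusive range."""
    return seq[start - 1:end]


for name, (start, end) in FEATURES_SEQ18.items():
    print(f"{name}: {feature_length(start, end)} nt")
```

Under these coordinates the TRE and PminCMV elements have the same lengths in both plasmids, and every protein-coding range is a whole number of codons.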

Firstly, preparation of Ins-Promoter-HepG2 cells

1. The specific DNA molecule shown in sequence 19 of the sequence table (in sequence 19, nucleotides 1 to 1000 constitute the human insulin promoter and nucleotides 1026 to 1745 constitute the EGFP gene) was inserted between the PacI and EcoRI restriction sites of the CV130 vector (Genechem) to obtain the recombinant plasmid Ins-Promoter-EGFP.

2. 293T cells were co-transfected with the recombinant plasmid Ins-Promoter-EGFP and the virus packaging helper plasmids (Helper 1.0 and Helper 2.0). The virus was harvested 48-72 h after transfection and then concentrated and purified to obtain the Ins-Promoter-EGFP lentivirus (in which the human insulin promoter drives expression of the EGFP gene).

3. HepG2 cells were seeded on cell culture plates and cultured for 24 h, after which the Ins-Promoter-EGFP lentivirus was added (MOI 20) together with 5 μg/mL polybrene as an infection enhancer. 72 hours after lentiviral infection, the cells were selected with 1 μg/mL puromycin for 1 day and then with 0.5 μg/mL puromycin for 2 days to obtain recombinant cells, namely the Ins-Promoter-HepG2 cells. PCR detection showed that the EGFP reporter gene had been inserted into the genomic DNA of the cells. In the recombinant cells, expression of the EGFP gene is driven by the human insulin promoter, so the insulin secretion level can be represented by the EGFP signal intensity: the stronger the EGFP signal, the higher the insulin secretion level.
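The volume of lentivirus needed to reach a given MOI follows from the number of cells and the virus titer. The sketch below uses the MOI of 20 stated above; the cell number and the titer are hypothetical placeholders, since neither is given in this description.

```python
def virus_volume_ul(cell_count, moi, titer_tu_per_ml):
    """Volume of virus stock (in microliters) needed for a target MOI.

    MOI = transducing units (TU) added per cell, so
    TU needed = cell_count * moi, and volume = TU needed / titer.
    """
    tu_needed = cell_count * moi
    return tu_needed * 1000.0 / titer_tu_per_ml


MOI = 20              # from the protocol above
CELLS_PER_WELL = 1e5  # hypothetical cell number at the time of infection
TITER = 1e8           # hypothetical titer of the concentrated virus, TU/mL

volume = virus_volume_ul(CELLS_PER_WELL, MOI, TITER)
print(f"Add {volume:.1f} uL of virus stock per well")
```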

Secondly, preparation of Ins-EGFP-dCas9-PUFa-p65-HSF1 cells

1. The Ins-Promoter-HepG2 cells obtained in step one were seeded on a cell culture plate and cultured for 24 h, then co-transfected with the PiggyBac plasmid and the PB3-pmax-dCas9 plasmid (at a mass ratio of 1:2 in that order) using Lipofectamine 3000, cultured for 48 h, and selected with 6 μg/mL blasticidin for 7 days to obtain recombinant cells, namely the Ins-EGFP-HepG2/TetO-dCas9 cells. The cells were then expanded in culture.

2. The Ins-EGFP-HepG2/TetO-dCas9 cells were seeded on a cell culture plate and cultured for 24 h, then co-transfected with the PiggyBac plasmid and the PB3-neo(-)-pmax-NLSPUFa_p65-HSF1 plasmid (at a mass ratio of 1:2 in that order) using Lipofectamine 3000, cultured for 48 h, and selected with 200 μg/mL hygromycin for 15 days to obtain recombinant cells, namely the Ins-EGFP-dCas9-PUFa-p65-HSF1 cells. The cells were then expanded in culture.

Thirdly, identification of cells

1. Western Blot

The test cells were Ins-Promoter-HepG2 cells or Ins-EGFP-dCas9-PUFa-p65-HSF1 cells.

The test cells were induced with 1 μg/mL or 2 μg/mL doxycycline for 3 days and then subjected to Western Blot. A parallel control without doxycycline was included. The primary antibodies used in the Western Blot were a mouse anti-human Flag primary antibody (Sigma, working dilution 1:1000), a rabbit anti-human HA primary antibody (Abcam, working dilution 1:1000) and a Tubulin primary antibody (TransGen, working dilution 1:5000).

The Western Blot results are shown in FIG. 5. After doxycycline induction, expression of the dCas9 protein and the NLS-PUFa-p65-HSF1 protein was detected in the Ins-EGFP-dCas9-PUFa-p65-HSF1 cells. The results show that the dCas9 gene and the NLS-PUFa-p65-HSF1 gene were successfully integrated into the genome of the HepG2 liver cell line by means of the PiggyBac (PB) transposon.

2. Immunofluorescence detection (IF detection)

The Ins-EGFP-dCas9-PUFa-p65-HSF1 cells were induced with 1 μg/mL doxycycline for 3 days and then subjected to immunofluorescence detection. A parallel control without doxycycline was included.

The IF detection method was as follows: the cells were fixed with 4% paraformaldehyde at room temperature for 20 min, blocked with 1% BSA at room temperature for 30 min, incubated with the primary antibody at 4 ℃ overnight, washed 3 times with PBS-T (PBS containing 0.1% Triton X-100), incubated with the corresponding secondary antibody at room temperature for 2 h, washed 3 times with PBS-T, incubated with DAPI for 15 min, and then observed and photographed under a fluorescence microscope.

The primary antibodies used for IF detection were a mouse anti-human Flag primary antibody (Sigma, working dilution 1:1000) and a rabbit anti-human HA primary antibody (Abcam, working dilution 1:1000).

The immunofluorescence results are shown in FIG. 6. After doxycycline induction, expression of the dCas9 protein and the NLS-PUFa-p65-HSF1 protein was detected in the Ins-EGFP-dCas9-PUFa-p65-HSF1 cells. In these cells, therefore, introducing only the gRNAs in vitro is sufficient for the Casilio system to activate endogenous genes.

3. Cell morphology observation

Photographs of HepG2 cells, Ins-Promoter-HepG2 cells, Ins-EGFP-HepG2/TetO-dCas9 cells and Ins-EGFP-dCas9-PUFa-p65-HSF1 cells are shown in FIG. 7. No significant difference in morphology was observed among them.

Example 3 Reprogramming of the liver cell line HepG2 into islet-like cells by the Casilio system

Test cells: Ins-EGFP-dCas9-PUFa-p65-HSF1 cells.

Firstly, activation of endogenous genes by the Casilio system

1. Co-transfection of cells

The plasmids were transfected into the test cells using Lipofectamine 3000. 48 hours after transfection, the cells were induced with 1 μg/mL doxycycline for 3 days and then collected.

pool-gRNA-Pdx1 group: co-transfection of the recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3 and Pdx1-gRNA4, with the four plasmids at equal mass ratio.

pool-gRNA-Ngn3 group: co-transfection of the recombinant plasmids Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3 and Ngn3-gRNA4, with the four plasmids at equal mass ratio.

pool-gRNA-MafA group: co-transfection of the recombinant plasmids MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4, with the four plasmids at equal mass ratio.

pnm-gRNA group: co-transfection of the recombinant plasmids Pdx1-gRNA1, Pdx1-gRNA2, Pdx1-gRNA3, Pdx1-gRNA4, Ngn3-gRNA1, Ngn3-gRNA2, Ngn3-gRNA3, Ngn3-gRNA4, MafA-gRNA1, MafA-gRNA2, MafA-gRNA3 and MafA-gRNA4, with the 12 plasmids at equal mass ratio.

Untransfected test cells treated in parallel served as the Control group (ctrl).

2. RT-PCR detection

See item 3 of step one of Example 1.

The background expression of each gene in the Control group was set to 1. The results are shown in A of FIG. 8. When the gRNAs for each gene were introduced separately, the Pdx1 gene was up-regulated 156-fold (P<0.05), the Ngn3 gene 275-fold (P<0.001), and the MafA gene 6-fold. When the 12 gRNAs targeting the 3 genes were introduced simultaneously, expression of the Pdx1 gene was up-regulated 93-fold (P<0.01), that of the Ngn3 gene 404-fold (P<0.001), and that of the MafA gene 7-fold. The Casilio system can thus efficiently activate PNM in a targeted manner.

3. Immune cell fluorescence detection (IF detection)

See item 4 of step one of Example 1.

Partial results are shown in B of FIG. 8. IF results show that the Casilio system can realize obvious expression of Pdx1 protein and Ngn3 protein, and further prove that the Casilio system activates endogenous genes.

Secondly, reprogramming of the hepatocyte to the insulin-secreting cell

1. Co-transfection of cells

2. After completion of step 1, the cells were collected and cultured in DMEM medium containing 10 mM nicotinamide, 20 ng/mL EGF and 5 nM Exendin-4.
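The supplement volumes for the induction medium follow from the dilution relation C1 × V1 = C2 × V2. The sketch below uses the final concentrations stated above; the stock concentrations and the 10 mL batch volume are hypothetical and would need to be replaced with the values of the actual reagents.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ml):
    """Stock volume (microliters) from C1*V1 = C2*V2.

    `final_conc` and `stock_conc` must be given in the same unit.
    """
    return final_conc * final_volume_ml * 1000.0 / stock_conc


FINAL_VOLUME_ML = 10.0  # hypothetical batch volume of DMEM

# name: (final concentration from the text, hypothetical stock, shared unit)
SUPPLEMENTS = {
    "nicotinamide": (10.0, 1000.0, "mM"),    # assumed 1 M stock
    "EGF": (20.0, 100000.0, "ng/mL"),        # assumed 100 ug/mL stock
    "Exendin-4": (5.0, 100000.0, "nM"),      # assumed 100 uM stock
}

volumes = {
    name: stock_volume_ul(final, stock, FINAL_VOLUME_ML)
    for name, (final, stock, _unit) in SUPPLEMENTS.items()
}

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL per {FINAL_VOLUME_ML:.0f} mL medium")
```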

After 72 hours of culture, fluorescence was observed; the photographs are shown in FIG. 9. Both the pool-gRNA-Ngn3 group and the pnm-gRNA group achieved reprogramming of hepatocytes into insulin-secreting cells. The proportion of EGFP-positive cells was 12.93% ± 1.11% in the pnm-gRNA group and 5.73% ± 1.22% in the pool-gRNA-Ngn3 group.

Fluorescence observations for the pnm-gRNA group after 24, 48, 72 and 96 hours of culture are shown in FIG. 10, and the statistical results are shown in FIG. 11. The number of reprogrammed cells in the pnm-gRNA group increased over time, and the reprogramming efficiency was highest at 72 h of culture.
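A proportion of EGFP-positive cells reported as "mean ± spread" can be derived from per-replicate cell counts as sketched below. The counts are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state whether the reported spread is a standard deviation or a standard error.

```python
import statistics


def positive_percentages(counts):
    """Percent EGFP-positive cells per replicate from (positive, total) pairs."""
    return [100.0 * positive / total for positive, total in counts]


def summarize(percentages):
    """Mean and sample standard deviation across replicates."""
    return statistics.mean(percentages), statistics.stdev(percentages)


# Hypothetical counts (EGFP-positive cells, total cells) for three replicates.
replicate_counts = [(130, 1000), (120, 1000), (140, 1000)]

percentages = positive_percentages(replicate_counts)
mean_pct, sd_pct = summarize(percentages)
print(f"EGFP-positive cells: {mean_pct:.2f}% ± {sd_pct:.2f}%")
```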

The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced over a wide range of equivalent parameters, concentrations and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that the invention can be further modified. In general, this application is intended to cover any variations, uses or adaptations of the invention that follow the principles of the invention, including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains. Some of the essential features may be applied within the scope of the appended claims.

SEQUENCE LISTING

<110> Shenzhen People's Hospital

<120> method and kit for direct reprogramming of hepatocytes into islet-like cells

<130> GNCYX201739

<160> 20

<170> PatentIn version 3.5

<210> 1

<211> 24

<212> DNA

<213> Homo sapiens

<400> 1

gaacccacag ccagcgcgga ccgg 24

<210> 2

<211> 23

<212> DNA

<213> Homo sapiens

<400> 2

gttcagccgg gggccgtgat tgg 23

<210> 3

<211> 23

<212> DNA

<213> Homo sapiens

<400> 3

gaacaaaagc aggtgctcgc ggg 23

<210> 4

<211> 23

<212> DNA

<213> Homo sapiens

<400> 4

gctggcggtg ctccccaaaa tgg 23

<210> 5

<211> 23

<212> DNA

<213> Homo sapiens

<400> 5

gccaccggcc aatcagcgcc ggg 23

<210> 6

<211> 23

<212> DNA

<213> Homo sapiens

<400> 6

ggattccgga caaagggccg ggg 23

<210> 7

<211> 24

<212> DNA

<213> Homo sapiens

<400> 7

gtgctctctc gagggcgggc tggg 24

<210> 8

<211> 24

<212> DNA

<213> Homo sapiens

<400> 8

gagcctcgtg tggctctggt cagg 24

<210> 9

<211> 23

<212> DNA

<213> Homo sapiens

<400> 9

gcgcagggaa aagtttcacg tgg 23

<210> 10

<211> 24

<212> DNA

<213> Homo sapiens

<400> 10

gccaggtgtc tcgggcgacc ccgg 24

<210> 11

<211> 24

<212> DNA

<213> Homo sapiens

<400> 11

gccgccgcct cgggctgctc cggg 24

<210> 12

<211> 23

<212> DNA

<213> Homo sapiens

<400> 12

gcgtttagcc gtgggaggcg ggg 23

<210> 13

<211> 3255

<212> DNA

<213> Artificial sequence

<400> 13

ggcgcgccga gggcctattt cccatgattc cttcatattt gcatatacga tacaaggctg 60

ttagagagat aattggaatt aatttgactg taaacacaaa gatattagta caaaatacgt 120

gacgtagaaa gtaataattt cttgggtagt ttgcagtttt aaaattatgt tttaaaatgg 180

actatcatat gcttaccgta acttgaaagt atttcgattt cttggcttta tatatcttgt 240

ggaaaggacg aaacaccgtt taagagctat gctggaaaca gcatagcaag tttaaataag 300

gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgccaattgg gtctccagat 360

tgtatgtagc ctgtatgtag cctgtatgta gcctgtatgt agcctgtatg taagatcttt 420

ttttgtttta gagctagaaa tagcaagtta aaataaggct agtccgtagc gcgtgcgcca 480

attctgcaga caaatggcgg cgcgccgcgg ccgcaggaac ccctagtgat ggagttggcc 540

actccctctc tgcgcgggcc tcgctcgctc actgaggccg ggcgaccaaa ggtcgcccga 600

cgcccgggct ttgcccgggc ggcctcagtg agcgagcgag cgcgcagctg cctgcagggg 660

cgcctgatgc ggtattttct ccttacgcat ctgtgcggta tttcacaccg catacgtcaa 720

agcaaccata gtacgcgccc tgtagcggcg cattaagcgc ggcgggtgtg gtggttacgc 780

gcagcgtgac cgctacactt gccagcgccc tagcgcccgc tcctttcgct ttcttccctt 840

cctttctcgc cacgttcgcc ggctttcccc gtcaagctct aaatcggggg ctccctttag 900

ggttccgatt tagtgcttta cggcacctcg accccaaaaa acttgatttg ggtgatggtt 960

cacgtagtgg gccatcgccc tgatagacgg tttttcgccc tttgacgttg gagtccacgt 1020

tctttaatag tggactcttg ttccaaactg gaacaacact caaccctatc tcgggctatt 1080

cttttgattt ataagggatt ttgccgattt cggcctattg gttaaaaaat gagctgattt 1140

aacaaaaatt taacgcgaat tttaacaaaa tattaacgtt tacaatttta tggtgcactc 1200

tcagtacaat ctgctctgat gccgcatagt taagccagcc ccgacacccg ccaacacccg 1260

ctgacgcgcc ctgacgggct tgtctgctcc cggcatccgc ttacagacaa gctgtgaccg 1320

tctccgggag ctgcatgtgt cagaggtttt caccgtcatc accgaaacgc gcgagacgaa 1380

agggcctcgt gatacgccta tttttatagg ttaatgtcat gataataatg gtttcttaga 1440

cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta tttttctaaa 1500

tacattcaaa tatgtatccg ctcatgagac aataaccctg ataaatgctt caataatatt 1560

gaaaaaggaa gagtatgagt attcaacatt tccgtgtcgc ccttattccc ttttttgcgg 1620

cattttgcct tcctgttttt gctcacccag aaacgctggt gaaagtaaaa gatgctgaag 1680

atcagttggg tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg 1740

agagttttcg ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg 1800

gcgcggtatt atcccgtatt gacgccgggc aagagcaact cggtcgccgc atacactatt 1860

ctcagaatga cttggttgag tactcaccag tcacagaaaa gcatcttacg gatggcatga 1920

cagtaagaga attatgcagt gctgccataa ccatgagtga taacactgcg gccaacttac 1980

ttctgacaac gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc 2040

atgtaactcg ccttgatcgt tgggaaccgg agctgaatga agccatacca aacgacgagc 2100

gtgacaccac gatgcctgta gcaatggcaa caacgttgcg caaactatta actggcgaac 2160

tacttactct agcttcccgg caacaattaa tagactggat ggaggcggat aaagttgcag 2220

gaccacttct gcgctcggcc cttccggctg gctggtttat tgctgataaa tctggagccg 2280

gtgagcgtgg aagccgcggt atcattgcag cactggggcc agatggtaag ccctcccgta 2340

tcgtagttat ctacacgacg gggagtcagg caactatgga tgaacgaaat agacagatcg 2400

ctgagatagg tgcctcactg attaagcatt ggtaactgtc agaccaagtt tactcatata 2460

tactttagat tgatttaaaa cttcattttt aatttaaaag gatctaggtg aagatccttt 2520

ttgataatct catgaccaaa atcccttaac gtgagttttc gttccactga gcgtcagacc 2580

ccgtagaaaa gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 2640

tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa 2700

ctctttttcc gaaggtaact ggcttcagca gagcgcagat accaaatact gtccttctag 2760

tgtagccgta gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc 2820

tgctaatcct gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg 2880

actcaagacg atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 2940

cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat 3000

gagaaagcgc cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg 3060

tcggaacagg agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc 3120

ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc 3180

ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 3240

cttttgctca catgt 3255

<210> 14

<211> 6354

<212> DNA

<213> Artificial sequence

<400> 14

gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60

ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120

cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180

ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240

gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300

tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360

cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420

attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt 480

atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540

atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600

tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660

actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720

aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780

gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840

ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gcttggtacc 900

gagctcggat ccactagtaa cggccgccag tgtgctggaa ttcgccgcca ccatgggcag 960

cagcctggac gacgagcaca tcctgagcgc cctgctgcag agcgacgacg agctggtcgg 1020

cgaggacagc gacagcgagg tgagcgacca cgtgagcgag gacgacgtgc agtccgacac 1080

cgaggaggcc ttcatcgacg aggtgcacga ggtgcagcct accagcagcg gctccgagat 1140

cctggacgag cagaacgtga tcgagcagcc cggcagctcc ctggccagca acaggatcct 1200

gaccctgccc cagaggacca tcaggggcaa gaacaagcac tgctggtcca cctccaagcc 1260

caccaggcgg agcagggtgt ccgccctgaa catcgtgaga agccagaggg gccccaccag 1320

gatgtgcagg aacatctacg accccctgct gtgcttcaag ctgttcttca ccgacgagat 1380

catcagcgag atcgtgaagt ggaccaacgc cgagatcagc ctgaagaggc gggagagcat 1440

gacctccgcc accttcaggg acaccaacga ggacgagatc tacgccttct tcggcatcct 1500

ggtgatgacc gccgtgagga aggacaacca catgagcacc gacgacctgt tcgacagatc 1560

cctgagcatg gtgtacgtga gcgtgatgag cagggacaga ttcgacttcc tgatcagatg 1620

cctgaggatg gacgacaaga gcatcaggcc caccctgcgg gagaacgacg tgttcacccc 1680

cgtgagaaag atctgggacc tgttcatcca ccagtgcatc cagaactaca cccctggcgc 1740

ccacctgacc atcgacgagc agctgctggg cttcaggggc aggtgcccct tcagggtcta 1800

tatccccaac aagcccagca agtacggcat caagatcctg atgatgtgcg acagcggcac 1860

caagtacatg atcaacggca tgccctacct gggcaggggc acccagacca acggcgtgcc 1920

cctgggcgag tactacgtga aggagctgtc caagcccgtc cacggcagct gcagaaacat 1980

cacctgcgac aactggttca ccagcatccc cctggccaag aacctgctgc aggagcccta 2040

caagctgacc atcgtgggca ccgtgagaag caacaagaga gagatccccg aggtcctgaa 2100

gaacagcagg tccaggcccg tgggcaccag catgttctgc ttcgacggcc ccctgaccct 2160

ggtgtcctac aagcccaagc ccgccaagat ggtgtacctg ctgtccagct gcgacgagga 2220

cgccagcatc aacgagagca ccggcaagcc ccagatggtg atgtactaca accagaccaa 2280

gggcggcgtg gacaccctgg accagatgtg cagcgtgatg acctgcagca gaaagaccaa 2340

caggtggccc atggccctgc tgtacggcat gatcaacatc gcctgcatca acagcttcat 2400

catctacagc cacaacgtga gcagcaaggg cgagaaggtg cagagccgga aaaagttcat 2460

gcggaacctg tacatgggcc tgacctccag cttcatgagg aagaggctgg aggcccccac 2520

cctgaagaga tacctgaggg acaacatcag caacatcctg cccaaagagg tgcccggcac 2580

cagcgacgac agcaccgagg agcccgtgat gaagaagagg acctactgca cctactgtcc 2640

cagcaagatc agaagaaagg ccagcgccag ctgcaagaag tgtaagaagg tcatctgccg 2700

ggagcacaac atcgacatgt gccagagctg tttctgatga gcggccgctc gagcatgcat 2760

ctagagggcc ctattctata gtgtcaccta aatgctagag ctcgctgatc agcctcgact 2820

gtgccttcta gttgccagcc atctgttgtt tgcccctccc ccgtgccttc cttgaccctg 2880

gaaggtgcca ctcccactgt cctttcctaa taaaatgagg aaattgcatc gcattgtctg 2940

agtaggtgtc attctattct ggggggtggg gtggggcagg acagcaaggg ggaggattgg 3000

gaagacaata gcaggcatgc tggggatgcg gtgggctcta tggcttctga ggcggaaaga 3060

accagctggg gctctagggg gtatccccac gcgccctgta gcggcgcatt aagcgcggcg 3120

ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct 3180

ttcgctttct tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat 3240

cggggcatcc ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt 3300

gattagggtg atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg 3360

acgttggagt ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac 3420

cctatctcgg tctattcttt tgatttataa gggattttgg ggatttcggc ctattggtta 3480

aaaaatgagc tgatttaaca aaaatttaac gcgaattaat tctgtggaat gtgtgtcagt 3540

tagggtgtgg aaagtcccca ggctccccag gcaggcagaa gtatgcaaag catgcatctc 3600

aattagtcag caaccaggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa 3660

agcatgcatc tcaattagtc agcaaccata gtcccgcccc taactccgcc catcccgccc 3720

ctaactccgc ccagttccgc ccattctccg ccccatggct gactaatttt ttttatttat 3780

gcagaggccg aggccgcctc tgcctctgag ctattccaga agtagtgagg aggctttttt 3840

ggaggcctag gcttttgcaa aaagctcccc gaaatgaccg accaagcgac gcccaacctg 3900

ccatcacgag atttcgattc caccgccgcc ttctatgaaa ggttgggctt cggaatcgtt 3960

ttccgggacg ccggctggat gatcctccag cgcggggatc tcatgctgga gttcttcgcc 4020

caccccaact tgtttattgc agcttataat ggttacaaat aaagcaatag catcacaaat 4080

ttcacaaata aagcattttt ttcactgcat tctagttgtg gtttgtccaa actcatcaat 4140

gtatcttatc atgtctgtat accgtcgacc tctagctaga gcttggcgta atcatggtca 4200

tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga 4260

agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg 4320

cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc 4380

caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac 4440

tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 4500

cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 4560

aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 4620

gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 4680

agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 4740

cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcaatgctca 4800

cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 4860

ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 4920

gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 4980

tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg 5040

acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 5100

tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 5160

attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 5220

gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 5280

ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 5340

taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 5400

ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag 5460

ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca 5520

gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact 5580

ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca 5640

gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 5700

tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 5760

atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg 5820

gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 5880

tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 5940

atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 6000

agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 6060

ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 6120

tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 6180

aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 6240

tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 6300

aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga cgtc 6354

<210> 15

<211> 5462

<212> DNA

<213> Artificial sequence

<400> 15

tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60

ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120

aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180

gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240

gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300

agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360

ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420

cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480

gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540

caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600

caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaataaccc 660

cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata taagcagagg 720

tcgtttagtg aaccgtcaga tcactagtag ctttattgcg gtagtttatc acagttaaat 780

tgctaacgca gtcagtgctc gactgatcac aggtaagtat caaggttaca agacaggttt 840

aaggaggcca atagaaactg ggcttgtcga gacagagaag attcttgcgt ttctgatagg 900

cacctattgg tcttactgac atccactttg cctttctctc cacagggaaa aaacaattga 960

caagtttgta caaaaaagca ggctccgaat tcaccggtgc cgccaccatg gactacaagg 1020

atcacgacgg tgactataag gatcatgaca tcgactataa ggacgatgac gataagatcg 1080

atggcggagg cggatctgat ccaaaaaaga agagaaaggt agatccaaaa aagaagagaa 1140

aggtagatcc aaaaaagaag agaaaggtag gatctaccgg atctagaaac gatggtggtg 1200

gtggaagcgg gggtgggggc agcggtggag ggggaagcgg gcgcgccggg atcctccccc 1260

ccaagaaaaa gaggaaggta tctagaggcc gcagccgcct tttggaagat tttcgaaaca 1320

accggtaccc caatttacaa ctgcgggaga ttgctggaca tataatggaa ttttcccaag 1380

accagcatgg gtccagattc attcagctga aactggagcg tgccacacca gctgagcgcc 1440

agcttgtctt caatgaaatc ctccaggctg cctaccaact catggtggat gtgtttggta 1500

attacgtcat tcagaagttc tttgaatttg gcagtcttga acagaagctg gctttggcag 1560

aacggattcg aggccacgtc ctgtcattgg cactacagat gtatggcagc cgtgttatcg 1620

agaaagctct tgagtttatt ccttcagacc agcagaatga gatggttcgg gaactagatg 1680

gccatgtctt gaagtgtgtg aaagatcaga atggcaatca cgtggttcag aaatgcattg 1740

aatgtgtaca gccccagtct ttgcaattta tcatcgacgc gtttaaggga caggtatttg 1800

ccttatccac acatccttat ggctgccgag tgattcagag aatcctggag cactgtctcc 1860

ctgaccagac actccctatt ttagaggagc ttcaccagca cacagagcag ctggtacagg 1920

atcaatatgg aaattatgta atccaacatg tactggagca cggtcgtcct gaggataaaa 1980

gcaaaattgt agcagaaatc cgaggcaatg tacttgtatt gagtcagcac aaatttgcaa 2040

gcaatgttgt ggagaagtgt gttactcacg cctcacgtac ggagcgcgct gtgctcatcg 2100

acgaggtgtg caccatgaac gacggtcccc acagtgcctt atacaccatg atgaaggacc 2160

agtatgccaa ctacgtggtc cagaagatga ttgacgtggc ggagccaggc cagcggaaga 2220

tcgtcatgca taagatccgg ccccacatcg caactcttcg taagtacacc tatggcaagc 2280

acattctggc caagctggag aagtactaca tgaagaacgg tgttgactta ggggacccaa 2340

agaagaagcg caaagtggat cctaaaaaga aaagaaaggt aggcggccgc gggggaggcg 2400

gttccggtgg cggcggaagc ggaggtggag gatcagggcc ggccggagga ggtggaagcg 2460

gaggaggagg aagcggagga ggaggtagcg gacctaagaa aaagaggaag gtggcggccg 2520

ctggatcccc ttcagggcag atcagcaacc aggccctggc tctggcccct agctccgctc 2580

cagtgctggc ccagactatg gtgccctcta gtgctatggt gcctctggcc cagccacctg 2640

ctccagcccc tgtgctgacc ccaggaccac cccagtcact gagcgctcca gtgcccaagt 2700

ctacacaggc cggcgagggg actctgagtg aagctctgct gcacctgcag ttcgacgctg 2760

atgaggacct gggagctctg ctggggaaca gcaccgatcc cggagtgttc acagatctgg 2820

cctccgtgga caactctgag tttcagcagc tgctgaatca gggcgtgtcc atgtctcata 2880

gtacagccga accaatgctg atggagtacc ccgaagccat tacccggctg gtgaccggca 2940

gccagcggcc ccccgacccc gctccaactc ccctgggaac cagcggcctg cctaatgggc 3000

tgtccggaga tgaagacttc tcaagcatcg ctgatatgga ctttagtgcc ctgctgtcac 3060

agatttcctc tagtgggcag ggaggaggtg gaagcggctt cagcgtggac accagtgccc 3120

tgctggacct gttcagcccc tcggtgaccg tgcccgacat gagcctgcct gaccttgaca 3180

gcagcctggc cagtatccaa gagctcctgt ctccccagga gccccccagg cctcccgagg 3240

cagagaacag cagcccggat tcagggaagc agctggtgca ctacacagcg cagccgctgt 3300

tcctgctgga ccccggctcc gtggacaccg ggagcaacga cctgccggtg ctgtttgagc 3360

tgggagaggg ctcctacttc tccgaagggg acggcttcgc cgaggacccc accatctccc 3420

tgctgacagg ctcggagcct cccaaagcca aggaccccac tgtctccatc gattgattaa 3480

ttaagaattc gacccagctt tcttgtacaa agtggttgat atccagcaca gtggcggccg 3540

ctcgagtcta gagggcccgc ggttcgaagg taagcctatc cctaaccctc tcctcggtct 3600

cgattctacg cgtaccggtt agcaattgtt ttttcgatga gtttggacaa accacaacta 3660

gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa 3720

ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt atgtttcagg 3780

ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa tgtggtactt 3840

aagaggggga gaccaaaggg cgagacgtta aggcctcacg tgacatgtga gcaaaaggcc 3900

agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat aggctccgcc 3960

cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac ccgacaggac 4020

tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc 4080

tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcata 4140

gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc 4200

acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca 4260

acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag 4320

cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac ggctacacta 4380

gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagttg 4440

gtagctcttg atccggcaaa caaaccacgc tggtagcggt ggtttttttg tttgcaagca 4500

gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 4560

tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgccgt ctcagaagaa 4620

ctcgtcaaga aggcgataga aggcgatgcg ctgcgaatcg ggagcggcga taccgtaaag 4680

cacgaggaag cggtcagccc attcgccgcc aagctcttca gcaatatcac gggtagccaa 4740

cgctatgtcc tgatagcggt ccgccacacc cagccggcca cagtcgatga atccagaaaa 4800

gcggccattt tccaccatga tattcggcaa gcaggcatcg ccatgggtca cgacgagatc 4860

ctcgccgtcg ggcatgctcg ccttgagcct ggcgaacagt tcggctggcg cgagcccctg 4920

atgctcttcg tccagatcat cctgatcgac aagaccggct tccatccgag tacgtgctcg 4980

ctcgatgcga tgtttcgctt ggtggtcgaa tgggcaggta gccggatcaa gcgtatgcag 5040

ccgccgcatt gcatcagcca tgatggatac tttctcggca ggagcaaggt gagatgacag 5100

gagatcctgc cccggcactt cgcccaatag cagccagtcc cttcccgctt cagtgacaac 5160

gtcgagcaca gctgcgcaag gaacgcccgt cgtggccagc cacgatagcc gcgctgcctc 5220

gtcttgcagt tcattcaggg caccggacag gtcggtcttg acaaaaagaa ccgggcgccc 5280

ctgcgctgac agccggaaca cggcggcatc agagcagccg attgtctgtt gtgcccagtc 5340

atagccgaat agcctctcca cccaagcggc cggagaacct gcgtgcaatc catcttgttc 5400

aatcataata ttattgaagc atttatcagg gttcgtctcg tcccggtctc ctcccatgca 5460

tg 5462

<210> 16

<211> 7301

<212> DNA

<213> Artificial sequence

<400> 16

tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60

ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120

aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180

gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240

gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300

agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360

ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420

cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480

gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540

caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600

caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaataaccc 660

cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata taagcagagg 720

tcgtttagtg aaccgtcaga tcactagtag ctttattgcg gtagtttatc acagttaaat 780

tgctaacgca gtcagtgctc gactgatcac aggtaagtat caaggttaca agacaggttt 840

aaggaggcca atagaaactg ggcttgtcga gacagagaag attcttgcgt ttctgatagg 900

cacctattgg tcttactgac atccactttg cctttctctc cacagggaaa aaacaattga 960

caagtttgta caaaaaagca ggctccgaat tcaccggtgc cgccaccatg atcgatggtg 1020

gcggtggtag cgggggaggc ggctccgggg gcggaggcag tatgtaccca tacgatgttc 1080

cagattacgc ttcgccgaag aaaaagcgca aggtcgaagc gtccgacaag aagtacagca 1140

tcggcctggc catcggcacc aactctgtgg gctgggccgt gatcaccgac gagtacaagg 1200

tgcccagcaa gaaattcaag gtgctgggca acaccgaccg gcacagcatc aagaagaacc 1260

tgatcggagc cctgctgttc gacagcggcg aaacagccga ggccacccgg ctgaagagaa 1320

ccgccagaag aagatacacc agacggaaga accggatctg ctatctgcaa gagatcttca 1380

gcaacgagat ggccaaggtg gacgacagct tcttccacag actggaagag tccttcctgg 1440

tggaagagga taagaagcac gagcggcacc ccatcttcgg caacatcgtg gacgaggtgg 1500

cctaccacga gaagtacccc accatctacc acctgagaaa gaaactggtg gacagcaccg 1560

acaaggccga cctgcggctg atctatctgg ccctggccca catgatcaag ttccggggcc 1620

acttcctgat cgagggcgac ctgaaccccg acaacagcga cgtggacaag ctgttcatcc 1680

agctggtgca gacctacaac cagctgttcg aggaaaaccc catcaacgcc agcggcgtgg 1740

acgccaaggc catcctgtct gccagactga gcaagagcag acggctggaa aatctgatcg 1800

cccagctgcc cggcgagaag aagaatggcc tgttcggcaa cctgattgcc ctgagcctgg 1860

gcctgacccc caacttcaag agcaacttcg acctggccga ggatgccaaa ctgcagctga 1920

gcaaggacac ctacgacgac gacctggaca acctgctggc ccagatcggc gaccagtacg 1980

ccgacctgtt tctggccgcc aagaacctgt ccgacgccat cctgctgagc gacatcctga 2040

gagtgaacac cgagatcacc aaggcccccc tgagcgcctc tatgatcaag agatacgacg 2100

agcaccacca ggacctgacc ctgctgaaag ctctcgtgcg gcagcagctg cctgagaagt 2160

acaaagagat tttcttcgac cagagcaaga acggctacgc cggctacatt gacggcggag 2220

ccagccagga agagttctac aagttcatca agcccatcct ggaaaagatg gacggcaccg 2280

aggaactgct cgtgaagctg aacagagagg acctgctgcg gaagcagcgg accttcgaca 2340

acggcagcat cccccaccag atccacctgg gagagctgca cgccattctg cggcggcagg 2400

aagattttta cccattcctg aaggacaacc gggaaaagat cgagaagatc ctgaccttcc 2460

gcatccccta ctacgtgggc cctctggcca ggggaaacag cagattcgcc tggatgacca 2520

gaaagagcga ggaaaccatc accccctgga acttcgagga agtggtggac aagggcgctt 2580

ccgcccagag cttcatcgag cggatgacca acttcgataa gaacctgccc aacgagaagg 2640

tgctgcccaa gcacagcctg ctgtacgagt acttcaccgt gtataacgag ctgaccaaag 2700

tgaaatacgt gaccgaggga atgagaaagc ccgccttcct gagcggcgag cagaaaaagg 2760

ccatcgtgga cctgctgttc aagaccaacc ggaaagtgac cgtgaagcag ctgaaagagg 2820

actacttcaa gaaaatcgag tgcttcgact ccgtggaaat ctccggcgtg gaagatcggt 2880

tcaacgcctc cctgggcaca taccacgatc tgctgaaaat tatcaaggac aaggacttcc 2940

tggacaatga ggaaaacgag gacattctgg aagatatcgt gctgaccctg acactgtttg 3000

aggacagaga gatgatcgag gaacggctga aaacctatgc ccacctgttc gacgacaaag 3060

tgatgaagca gctgaagcgg cggagataca ccggctgggg caggctgagc cggaagctga 3120

tcaacggcat ccgggacaag cagtccggca agacaatcct ggatttcctg aagtccgacg 3180

gcttcgccaa cagaaacttc atgcagctga tccacgacga cagcctgacc tttaaagagg 3240

acatccagaa agcccaggtg tccggccagg gcgatagcct gcacgagcac attgccaatc 3300

tggccggcag ccccgccatt aagaagggca tcctgcagac agtgaaggtg gtggacgagc 3360

tcgtgaaagt gatgggccgg cacaagcccg agaacatcgt gatcgaaatg gccagagaga 3420

accagaccac ccagaaggga cagaagaaca gccgcgagag aatgaagcgg atcgaagagg 3480

gcatcaaaga gctgggcagc cagatcctga aagaacaccc cgtggaaaac acccagctgc 3540

agaacgagaa gctgtacctg tactacctgc agaatgggcg ggatatgtac gtggaccagg 3600

aactggacat caaccggctg tccgactacg atgtggacgc catcgtgcct cagagctttc 3660

tgaaggacga ctccatcgac aacaaggtgc tgaccagaag cgacaagaac cggggcaaga 3720

gcgacaacgt gccctccgaa gaggtcgtga agaagatgaa gaactactgg cggcagctgc 3780

tgaacgccaa gctgattacc cagagaaagt tcgacaatct gaccaaggcc gagagaggcg 3840

gcctgagcga actggataag gccggcttca tcaagagaca gctggtggaa acccggcaga 3900

tcacaaagca cgtggcacag atcctggact cccggatgaa cactaagtac gacgagaatg 3960

acaagctgat ccgggaagtg aaagtgatca ccctgaagtc caagctggtg tccgatttcc 4020

ggaaggattt ccagttttac aaagtgcgcg agatcaacaa ctaccaccac gcccacgacg 4080

cctacctgaa cgccgtcgtg ggaaccgccc tgatcaaaaa gtaccctaag ctggaaagcg 4140

agttcgtgta cggcgactac aaggtgtacg acgtgcggaa gatgatcgcc aagagcgagc 4200

aggaaatcgg caaggctacc gccaagtact tcttctacag caacatcatg aactttttca 4260

agaccgagat taccctggcc aacggcgaga tccggaagcg gcctctgatc gagacaaacg 4320

gcgaaaccgg ggagatcgtg tgggataagg gccgggattt tgccaccgtg cggaaagtgc 4380

tgagcatgcc ccaagtgaat atcgtgaaaa agaccgaggt gcagacaggc ggcttcagca 4440

aagagtctat cctgcccaag aggaacagcg ataagctgat cgccagaaag aaggactggg 4500

accctaagaa gtacggcggc ttcgacagcc ccaccgtggc ctattctgtg ctggtggtgg 4560

ccaaagtgga aaagggcaag tccaagaaac tgaagagtgt gaaagagctg ctggggatca 4620

ccatcatgga aagaagcagc ttcgagaaga atcccatcga ctttctggaa gccaagggct 4680

acaaagaagt gaaaaaggac ctgatcatca agctgcctaa gtactccctg ttcgagctgg 4740

aaaacggccg gaagagaatg ctggcctctg ccggcgaact gcagaaggga aacgaactgg 4800

ccctgccctc caaatatgtg aacttcctgt acctggccag ccactatgag aagctgaagg 4860

gctcccccga ggataatgag cagaaacagc tgtttgtgga acagcacaag cactacctgg 4920

acgagatcat cgagcagatc agcgagttct ccaagagagt gatcctggcc gacgctaatc 4980

tggacaaagt gctgtccgcc tacaacaagc accgggataa gcccatcaga gagcaggccg 5040

agaatatcat ccacctgttt accctgacca atctgggagc ccctgccgcc ttcaagtact 5100

ttgacaccac catcgaccgg aagaggtaca ccagcaccaa agaggtgctg gacgccaccc 5160

tgatccacca gagcatcacc ggcctgtacg agacacggat cgacctgtct cagctgggag 5220

gcgacagccc caagaagaag agaaaggtgg aggccagcgg aggcggcggt agcggaggag 5280

gcgggtccgg cggcggcggt agtgggccgg cctgattaat taagaattcg acccagcttt 5340

cttgtacaaa gtggttgata tccagcacag tggcggccgc tcgagtctag agggcccgcg 5400

gttcgaaggt aagcctatcc ctaaccctct cctcggtctc gattctacgc gtaccggtta 5460

gcaattgttt tttcgatgag tttggacaaa ccacaactag aatgcagtga aaaaaatgct 5520

ttatttgtga aatttgtgat gctattgctt tatttgtaac cattataagc tgcaataaac 5580

aagttaacaa caacaattgc attcatttta tgtttcaggt tcagggggag gtgtgggagg 5640

ttttttaaag caagtaaaac ctctacaaat gtggtactta agagggggag accaaagggc 5700

gagacgttaa ggcctcacgt gacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 5760

aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 5820

atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 5880

cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 5940

ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 6000

gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 6060

accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 6120

cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 6180

cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 6240

gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 6300

aaaccacgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa 6360

aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa 6420

ctcacgttaa gggattttgg tcatgccgtc tcagaagaac tcgtcaagaa ggcgatagaa 6480

ggcgatgcgc tgcgaatcgg gagcggcgat accgtaaagc acgaggaagc ggtcagccca 6540

ttcgccgcca agctcttcag caatatcacg ggtagccaac gctatgtcct gatagcggtc 6600

cgccacaccc agccggccac agtcgatgaa tccagaaaag cggccatttt ccaccatgat 6660

attcggcaag caggcatcgc catgggtcac gacgagatcc tcgccgtcgg gcatgctcgc 6720

cttgagcctg gcgaacagtt cggctggcgc gagcccctga tgctcttcgt ccagatcatc 6780

ctgatcgaca agaccggctt ccatccgagt acgtgctcgc tcgatgcgat gtttcgcttg 6840

gtggtcgaat gggcaggtag ccggatcaag cgtatgcagc cgccgcattg catcagccat 6900

gatggatact ttctcggcag gagcaaggtg agatgacagg agatcctgcc ccggcacttc 6960

gcccaatagc agccagtccc ttcccgcttc agtgacaacg tcgagcacag ctgcgcaagg 7020

aacgcccgtc gtggccagcc acgatagccg cgctgcctcg tcttgcagtt cattcagggc 7080

accggacagg tcggtcttga caaaaagaac cgggcgcccc tgcgctgaca gccggaacac 7140

ggcggcatca gagcagccga ttgtctgttg tgcccagtca tagccgaata gcctctccac 7200

ccaagcggcc ggagaacctg cgtgcaatcc atcttgttca atcataatat tattgaagca 7260

tttatcaggg ttcgtctcgt cccggtctcc tcccatgcat g 7301

<210> 17

<211> 12550

<212> DNA

<213> Artificial sequence

<400> 17

agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc 60

acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 120

tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 180

ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagctca 240

gaattaaccc tcactaaagg gactagtcct gcaggtttaa acgaattcgc ccttttaacc 300

ctagaaagat agtctgcgta aaattgacgc atgcattctt gaaatattgc tctctctttc 360

taaatagcgc gaatccgtcg ctgtgcattt aggacatctc agtcgccgct tggagctccc 420

gtgaggcgtg cttgtcaatg cggtaagtgt cactgatttt gaactataac gaccgcgtga 480

gtcaaaatga cgcatgatta tcttttacgt gacttttaag atttaactca tacgataatt 540

atattgttat ttcatgttct acttacgtga taacttatta tatatatatt ttcttgttat 600

agatatcaac tagaatgcta gacctttcgt cttcaagaat tccgatcata ttcaataacc 660

cttaattagg tccctcgaag aggttcactg gcgcgttgga tccccgggta ccgagttggg 720

agctcacggg gacagccccc ccccaaagcc cccagggatg taattacgtc cctcccccgc 780

tagggggcag cagcgagccg cccggggctc cgctccggtc cggcgctccc cccgcatccc 840

cgagccggca gcgtgcgggg acagcccggg cacggggaag gtggcacggg atcgctttcc 900

tctgaacgct tctcgctgct ctttgagcct gcagacacct ggggggatac ggggaaaaag 960

ctttaggctg aaagagagat ttagaatgac agtctagtgg gagctcacgg ggacagcccc 1020

cccccaaagc ccccagggat gtaattacgt ccctcccccg ctagggggca gcagcgagcc 1080

gcccggggct ccgctccggt ccggcgctcc ccccgcatcc ccgagccggc agcgtgcggg 1140

gacagcccgg gcacggggaa ggtggcacgg gatcgctttc ctctgaacgc ttctcgctgc 1200

tctttgagcc tgcagacacc tggggggata cggggaaaaa gctttaggct gaaagagaga 1260

tttagaatga cagaactcga tttcattgca gactggcgcg ccgccttttt acggttcctg 1320

gccttttgct ggccttttgc tcacatgtca cgtgaggcct taacgtctcg ccctttggtc 1380

tccccctctt aagtaccaca tttgtagagg ttttacttgc tttaaaaaac ctcccacacc 1440

tccccctgaa cctgaaacat aaaatgaatg caattgttgt tgttaacttg tttattgcag 1500

cttataatgg ttacaaataa agcaatagca tcacaaattt cacaaataaa gcattttttt 1560

cactgcattc tagttgtggt ttgtccaaac tcatcggcta gcttacccgg ggagcatgtc 1620

aaggtcaaaa tcgtcaagag cgtcagcagg cagcatatca aggtcaaagt cgtcaagggc 1680

atcggctggg agcatgtcta agtcaaaatc gtcaagggcg tcggtcggcc cgccgctttc 1740

gcactttagc tgtttctcca ggccacatat gattagttcc aggccgaaaa ggaaggcagg 1800

ttcggctccc tgccggtcga acagctcaat tgcttgtctc agaagtgggg gcatagaatc 1860

ggtggtaggt gtctctcttt cctcttttgc tacttgatgc tcctgttcct ccaatacgca 1920

gcccagtgta aagtggccca cggcggacag agcgtacagt gcgttctcca gggagaagcc 1980

ttgctgacac aggaacgcga gctgattttc cagggtttcg tactgtttct ctgttgggcg 2040

ggtgccgaga tgcactttag ccccgtcgcg atgtgagagg agagcacagc ggtatgactt 2100

ggcgttgttc cgcagaaagt cttgccatga ctcgccttcc agggggcaga agtgggtatg 2160

atgcctgtcc agcatctcga ttggcagggc atcgagcagg gcccgcttgt tcttcacgtg 2220

ccagtacagg gtaggctgct caactcccag cttttgagcg agtttccttg tcgtcaggcc 2280

ttcgataccg acaccattga gtaattccag agctccgttt atgactttgc tcttgtccag 2340

tctagacatt ggaccagggt tttcttcaac atcaccacaa gtgaggagag aacctctacc 2400

ttcggcaccg ggttcctttg ccctcggacg agtgctgggg cgtcggtttc cactatcggc 2460

gagtacttct acacagccat cggtccagac ggccgcgctt ctgcgggcga tttgtgtacg 2520

cccgacagtc ccggctccgg atcggacgat tgcgtcgcat cgaccctgcg cccaagctgc 2580

atcatcgaaa ttgccgtcaa ccaagctctg atagagttgg tcaagaccaa tgcggagcat 2640

atacgcccgg agccgcggcg atcctgcaag ctccggatgc ctccgctcga agtagcgcgt 2700

ctgctgctcc atacaagcca accacggcct ccagaagaag atgttggcga cctcgtattg 2760

ggaatccccg aacatcgcct cgctccagtc aatgaccgct gttatgcggc cattgtccgt 2820

caggacattg ttggagccga aatccgcgtg cacgaggtgc cggacttcgg ggcagtcctc 2880

ggcccaaagc atcagctcat cgagagcctg cgcgacggac gcactgacgg tgtcgtccat 2940

cacagtttgc cagtgataca catggggatc agcaatcgcg catatgaaat cacgccatgt 3000

agtgtattga ccgattcctt gcggtccgaa tgggccgaac ccgctcgtct ggctaagatc 3060

ggccgcagcg atcgcatcca tggcctccgc gaccggctgc agaacagcgg gcagttcggt 3120

ttcaggcagg tcttgcaacg tgacaccctg tgcacggcgg gagatgcaat aggtcaggct 3180

ctcgctaaat tccccaatgt caagcacttc cggaatcggg agcgcggccg atgcaaagtg 3240

ccgataaaca taacgatctt tgtagaaacc atcggcgcag ctatttaccc gcaggacata 3300

tccacgccct cctacatcga agctgaaagc acgagattct tcgccctccg agagctgcat 3360

caggtcggag acgctgtcga acttttcgat cagaaacttc tcgacagacg tcgcggtgag 3420

ttcaggcttt ttcatggtgg cggcactagt aagggcgaat tcggagcctg cttttttgta 3480

caaacttgtt gatatctgca gaattccacc acactggact agtggatccg agctcggtac 3540

caagcttctt cacgacacct gaaatggaag aaaaaaactt tgaaccactg tctgaggctt 3600

gagaatgaac caagatccaa actcaaaaag ggcaaattcc aaggagaatt acatcaagtg 3660

ccaagctggc ctaacttcag tctccaccca ctcagtgtgg ggaaactcca tcgcataaaa 3720

cccctccccc caacctaaag acgacgtact ccaaaagctc gagaactaat cgaggtgcct 3780

ggacggcgcc cggtactccg tggagtcaca tgaagcgacg gctgaggacg gaaaggccct 3840

tttcctttgt gtgggtgact cacccgcccg ctctcccgag cgccgcgtcc tccattttga 3900

gctccctgca gcagggccgg gaagcggcca tctttccgct cacgcaactg gtgccgaccg 3960

ggccagcctt gccgcccagg gcggggcgat acacggcggc gcgaggccag gcaccagagc 4020

aggccggcca gcttgagact acccccgtcc gattctcggt ggccgcgctc gcaggccccg 4080

cctcgccgaa catgtgcgct gggacgcacg ggccccgtcg ccgcccgcgg ccccaaaaac 4140

cgaaatacca gtgtgcagat cttggcccgc atttacaaga ctatcttgcc agaaaaaaag 4200

cgtcgcagca ggtcatcaaa aattttaaat ggctagagac ttatcgaaag cagcgagaca 4260

ggcgcgaagg tgccaccaga ttcgcacgcg gcggccccag cgcccaggcc aggcctcaac 4320

tcaagcacga ggcgaagggg ctccttaagc gcaaggcctc gaactctccc acccacttcc 4380

aacccgaagc tcgggatcaa gaatcacgta ctgcagccag gtggaagtaa ttcaaggcac 4440

gcaagggcca taacccgtaa agaggccagg cccgcgggaa ccacacacgg cacttacctg 4500

tgttctggcg gcaaacccgt tgcgaaaaag aacgttcacg gcgactactg cacttatata 4560

cggttctccc ccaccctcgg gaaaaaggcg gagccagtac acgacatcac tttcccagtt 4620

taccccgcgc caccttctct aggcaccggt tcaattgccg acccctcccc ccaacttctc 4680

ggggactgtg ggcgatgtgc gctctgccca ctgacgggca ccggagcctc acgatcgata 4740

tgtcgagttt actccctatc agtgatagag aacgtatgtc gagtttactc cctatcagtg 4800

atagagaacg atgtcgagtt tactccctat cagtgataga gaacgtatgt cgagtttact 4860

ccctatcagt gatagagaac gtatgtcgag tttactccct atcagtgata gagaacgtat 4920

gtcgagttta tccctatcag tgatagagaa cgtatgtcga gtttactccc tatcagtgat 4980

agagaacgta tgtcgaggta ggcgtgtacg gtgggaggcc tatataagca gagctcgttt 5040

agtgaaccgt cagatcgcct ggagaattgg ctaggcaccg gtgacaagtt tgtacaaaaa 5100

agcaggctcc gaattcaccg gtgccgccac catggactac aaggatcacg acggtgacta 5160

taaggatcat gacatcgact ataaggacga tgacgataag atcgatggcg gaggcggatc 5220

tgatccaaaa aagaagagaa aggtagatcc aaaaaagaag agaaaggtag atccaaaaaa 5280

gaagagaaag gtaggatcta ccggatctag aaacgatggt ggtggtggaa gcgggggtgg 5340

gggcagcggt ggagggggaa gcgggcgcgc cgggatcctc ccccccaaga aaaagaggaa 5400

ggtatctaga ggccgcagcc gccttttgga agattttcga aacaaccggt accccaattt 5460

acaactgcgg gagattgctg gacatataat ggaattttcc caagaccagc atgggtccag 5520

attcattcag ctgaaactgg agcgtgccac accagctgag cgccagcttg tcttcaatga 5580

aatcctccag gctgcctacc aactcatggt ggatgtgttt ggtaattacg tcattcagaa 5640

gttctttgaa tttggcagtc ttgaacagaa gctggctttg gcagaacgga ttcgaggcca 5700

cgtcctgtca ttggcactac agatgtatgg cagccgtgtt atcgagaaag ctcttgagtt 5760

tattccttca gaccagcaga atgagatggt tcgggaacta gatggccatg tcttgaagtg 5820

tgtgaaagat cagaatggca atcacgtggt tcagaaatgc attgaatgtg tacagcccca 5880

gtctttgcaa tttatcatcg acgcgtttaa gggacaggta tttgccttat ccacacatcc 5940

ttatggctgc cgagtgattc agagaatcct ggagcactgt ctccctgacc agacactccc 6000

tattttagag gagcttcacc agcacacaga gcagctggta caggatcaat atggaaatta 6060

tgtaatccaa catgtactgg agcacggtcg tcctgaggat aaaagcaaaa ttgtagcaga 6120

aatccgaggc aatgtacttg tattgagtca gcacaaattt gcaagcaatg ttgtggagaa 6180

gtgtgttact cacgcctcac gtacggagcg cgctgtgctc atcgacgagg tgtgcaccat 6240

gaacgacggt ccccacagtg ccttatacac catgatgaag gaccagtatg ccaactacgt 6300

ggtccagaag atgattgacg tggcggagcc aggccagcgg aagatcgtca tgcataagat 6360

ccggccccac atcgcaactc ttcgtaagta cacctatggc aagcacattc tggccaagct 6420

ggagaagtac tacatgaaga acggtgttga cttaggggac ccaaagaaga agcgcaaagt 6480

ggatcctaaa aagaaaagaa aggtaggcgg ccgcggggga ggcggttccg gtggcggcgg 6540

aagcggaggt ggaggatcag ggccggccgg aggaggtgga agcggaggag gaggaagcgg 6600

aggaggaggt agcggaccta agaaaaagag gaaggtggcg gccgctggat ccccttcagg 6660

gcagatcagc aaccaggccc tggctctggc ccctagctcc gctccagtgc tggcccagac 6720

tatggtgccc tctagtgcta tggtgcctct ggcccagcca cctgctccag cccctgtgct 6780

gaccccagga ccaccccagt cactgagcgc tccagtgccc aagtctacac aggccggcga 6840

ggggactctg agtgaagctc tgctgcacct gcagttcgac gctgatgagg acctgggagc 6900

tctgctgggg aacagcaccg atcccggagt gttcacagat ctggcctccg tggacaactc 6960

tgagtttcag cagctgctga atcagggcgt gtccatgtct catagtacag ccgaaccaat 7020

gctgatggag taccccgaag ccattacccg gctggtgacc ggcagccagc ggccccccga 7080

ccccgctcca actcccctgg gaaccagcgg cctgcctaat gggctgtccg gagatgaaga 7140

cttctcaagc atcgctgata tggactttag tgccctgctg tcacagattt cctctagtgg 7200

gcagggagga ggtggaagcg gcttcagcgt ggacaccagt gccctgctgg acctgttcag 7260

cccctcggtg accgtgcccg acatgagcct gcctgacctt gacagcagcc tggccagtat 7320

ccaagagctc ctgtctcccc aggagccccc caggcctccc gaggcagaga acagcagccc 7380

ggattcaggg aagcagctgg tgcactacac agcgcagccg ctgttcctgc tggaccccgg 7440

ctccgtggac accgggagca acgacctgcc ggtgctgttt gagctgggag agggctccta 7500

cttctccgaa ggggacggct tcgccgagga ccccaccatc tccctgctga caggctcgga 7560

gcctcccaaa gccaaggacc ccactgtctc catcgattga ttaattaaga attcgaccca 7620

gctttcttgt acaaagtggt tgatatccag cacagtggcg gccgctcgag tctagagggc 7680

ccgcggttcg aaggtaagcc tatccctaac cctctcctcg gtctcgattc tacgcgtacc 7740

ggttaggggc ccgtttaaac ccgctgatca gcctcgactg tgccttctag ttgccagcca 7800

tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc 7860

ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca ttctattctg 7920

gggggtgggg tggggcagga cagcaagggg gaggattggg aagacaatag caggcatgct 7980

ggggatgcgg tgggctctat ggcgacgcgc ctggatcccc gggtaccgag ttgggagctc 8040

acggggacag ccccccccca aagcccccag ggatgtaatt acgtccctcc cccgctaggg 8100

ggcagcagcg agccgcccgg ggctccgctc cggtccggcg ctccccccgc atccccgagc 8160

cggcagcgtg cggggacagc ccgggcacgg ggaaggtggc acgggatcgc tttcctctga 8220

acgcttctcg ctgctctttg agcctgcaga cacctggggg gatacgggga aaaagcttta 8280

ggctgaaaga gagatttaga atgacagtct agtgggagct cacggggaca gccccccccc 8340

aaagccccca gggatgtaat tacgtccctc ccccgctagg gggcagcagc gagccgcccg 8400

gggctccgct ccggtccggc gctccccccg catccccgag ccggcagcgt gcggggacag 8460

cccgggcacg gggaaggtgg cacgggatcg ctttcctctg aacgcttctc gctgctcttt 8520

gagcctgcag acacctgggg ggatacgggg aaaaagcttt aggctgaaag agagatttag 8580

aatgacagaa ctcgatttca ttgcagactg gccggccact agtacgcgcc ggctcgacat 8640

actagttaaa agttttgtta ctttatagaa gaaattttga gtttttgttt ttttttaata 8700

aataaataaa cataaataaa ttgtttgttg aatttattat tagtatgtaa gtgtaaatat 8760

aataaaactt aatatctatt caaattaata aataaacctc gatatacaga ccgataaaac 8820

acatgcgtca attttacgca tgattatctt taacgtacgt cacaatatga ttatctttct 8880

agggttaaaa gggcgaattc gcggccgcta aattcaattc gccctatagt gagtcgtatt 8940

acaattcact ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac 9000

ttaatcgcct tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca 9060

ccgatcgccc ttcccaacag ttgcgcagcc tatacgtacg gcagtttaag gtttacacct 9120

ataaaagaga gagccgttat cgtctgtttg tggatgtaca gagtgatatt attgacacgc 9180

cggggcgacg gatggtgatc cccctggcca gtgcacgtct gctgtcagat aaagtctccc 9240

gtgaacttta cccggtggtg catatcgggg atgaaagctg gcgcatgatg accaccgata 9300

tggccagtgt gccggtctcc gttatcgggg aagaagtggc tgatctcagc caccgcgaaa 9360

atgacatcaa aaacgccatt aacctgatgt tctggggaat ataaatgtca ggcatgagat 9420

tatcaaaaag gatcttcacc tagatccttt tcacgtagaa agccagtccg cagaaacggt 9480

gctgaccccg gatgaatgtc agctactggg ctatctggac aagggaaaac gcaagcgcaa 9540

agagaaagca ggtagcttgc agtgggctta catggcgata gctagactgg gcggttttat 9600

ggacagcaag cgaaccggaa ttgccagctg gggcgccctc tggtaaggtt gggaagccct 9660

gcaaagtaaa ctggatggct ttcttgccgc caaggatctg atggcgcagg ggatcaagct 9720

ctgatcaaga gacaggatga ggatcgtttc gcatgattga acaagatgga ttgcacgcag 9780

gttctccggc cgcttgggtg gagaggctat tcggctatga ctgggcacaa cagacaatcg 9840

gctgctctga tgccgccgtg ttccggctgt cagcgcaggg gcgcccggtt ctttttgtca 9900

agaccgacct gtccggtgcc ctgaatgaac tgcaagacga ggcagcgcgg ctatcgtggc 9960

tggccacgac gggcgttcct tgcgcagctg tgctcgacgt tgtcactgaa gcgggaaggg 10020

actggctgct attgggcgaa gtgccggggc aggatctcct gtcatctcac cttgctcctg 10080

ccgagaaagt atccatcatg gctgatgcaa tgcggcggct gcatacgctt gatccggcta 10140

cctgcccatt cgaccaccaa gcgaaacatc gcatcgagcg agcacgtact cggatggaag 10200

ccggtcttgt cgatcaggat gatctggacg aagagcatca ggggctcgcg ccagccgaac 10260

tgttcgccag gctcaaggcg agcatgcccg acggcgagga tctcgtcgtg acccatggcg 10320

atgcctgctt gccgaatatc atggtggaaa atggccgctt ttctggattc atcgactgtg 10380

gccggctggg tgtggcggac cgctatcagg acatagcgtt ggctacccgt gatattgctg 10440

aagagcttgg cggcgaatgg gctgaccgct tcctcgtgct ttacggtatc gccgctcccg 10500

attcgcagcg catcgccttc tatcgccttc ttgacgagtt cttctgaatt attaacgctt 10560

acaatttcct gatgcggtat tttctcctta cgcatctgtg cggtatttca caccgcatca 10620

ggtggcactt ttcggggaaa tgtgcgcgga acccctattt gtttattttt ctaaatacat 10680

tcaaatatgt atccgctcat gagattatca aaaaggatct tcacctagat ccttttaaat 10740

taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac 10800

caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt 10860

gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt 10920

gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag 10980

ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct 11040

attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt 11100

gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc 11160

tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt 11220

agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg 11280

gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg 11340

actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct 11400

tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa agtgctcatc 11460

attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt 11520

tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt 11580

tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg 11640

aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta tcagggttat 11700

tgtctcatga ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc agaccccgta 11760

gaaaagatca aaggatcttc ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa 11820

acaaaaaaac caccgctacc agcggtggtt tgtttgccgg atcaagagct accaactctt 11880

tttccgaagg taactggctt cagcagagcg cagataccaa atactgttct tctagtgtag 11940

ccgtagttag gccaccactt caagaactct gtagcaccgc ctacatacct cgctctgcta 12000

atcctgttac cagtggctgc tgccagtggc gataagtcgt gtcttaccgg gttggactca 12060

agacgatagt taccggataa ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag 12120

cccagcttgg agcgaacgac ctacaccgaa ctgagatacc tacagcgtga gctatgagaa 12180

agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga 12240

acaggagagc gcacgaggga gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc 12300

gggtttcgcc acctctgact tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc 12360

ctatggaaaa acgccagcaa cgcggccttt ttacggttcc tggccttttg ctggcctttt 12420

gctcacatgt tctttcctgc gttatcccct gattctgtgg ataaccgtat taccgccttt 12480

gagtgagctg ataccgctcg ccgcagccga acgaccgagc gcagcgagtc agtgagcgag 12540

gaagcggaag 12550

<210> 18

<211> 13678

<212> DNA

<213> Artificial sequence

<400> 18

agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc 60

acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 120

tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 180

ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagctca 240

gaattaaccc tcactaaagg gactagtcct gcaggtttaa acgaattcgc ccttttaacc 300

ctagaaagat agtctgcgta aaattgacgc atgcattctt gaaatattgc tctctctttc 360

taaatagcgc gaatccgtcg ctgtgcattt aggacatctc agtcgccgct tggagctccc 420

gtgaggcgtg cttgtcaatg cggtaagtgt cactgatttt gaactataac gaccgcgtga 480

gtcaaaatga cgcatgatta tcttttacgt gacttttaag atttaactca tacgataatt 540

atattgttat ttcatgttct acttacgtga taacttatta tatatatatt ttcttgttat 600

agatatcaac tagaatgcta gacctttcgt cttcaagaat tccgatcata ttcaataacc 660

cttaattagg tccctcgaag aggttcactg gcgcgttgga tccccgggta ccgagttggg 720

agctcacggg gacagccccc ccccaaagcc cccagggatg taattacgtc cctcccccgc 780

tagggggcag cagcgagccg cccggggctc cgctccggtc cggcgctccc cccgcatccc 840

cgagccggca gcgtgcgggg acagcccggg cacggggaag gtggcacggg atcgctttcc 900

tctgaacgct tctcgctgct ctttgagcct gcagacacct ggggggatac ggggaaaaag 960

ctttaggctg aaagagagat ttagaatgac agtctagtgg gagctcacgg ggacagcccc 1020

cccccaaagc ccccagggat gtaattacgt ccctcccccg ctagggggca gcagcgagcc 1080

gcccggggct ccgctccggt ccggcgctcc ccccgcatcc ccgagccggc agcgtgcggg 1140

gacagcccgg gcacggggaa ggtggcacgg gatcgctttc ctctgaacgc ttctcgctgc 1200

tctttgagcc tgcagacacc tggggggata cggggaaaaa gctttaggct gaaagagaga 1260

tttagaatga cagaactcga tttcattgca gactggcgcg ccgccttttt acggttcctg 1320

gccttttgct ggccttttgc tcacatgtca cgtgaggcct taacgtctcg ccctttggtc 1380

tccccctctt aagtaccaca tttgtagagg ttttacttgc tttaaaaaac ctcccacacc 1440

tccccctgaa cctgaaacat aaaatgaatg caattgttgt tgttaacttg tttattgcag 1500

cttataatgg ttacaaataa agcaatagca tcacaaattt cacaaataaa gcattttttt 1560

cactgcattc tagttgtggt ttgtccaaac tcatcggcta gcttacccgg ggagcatgtc 1620

aaggtcaaaa tcgtcaagag cgtcagcagg cagcatatca aggtcaaagt cgtcaagggc 1680

atcggctggg agcatgtcta agtcaaaatc gtcaagggcg tcggtcggcc cgccgctttc 1740

gcactttagc tgtttctcca ggccacatat gattagttcc aggccgaaaa ggaaggcagg 1800

ttcggctccc tgccggtcga acagctcaat tgcttgtctc agaagtgggg gcatagaatc 1860

ggtggtaggt gtctctcttt cctcttttgc tacttgatgc tcctgttcct ccaatacgca 1920

gcccagtgta aagtggccca cggcggacag agcgtacagt gcgttctcca gggagaagcc 1980

ttgctgacac aggaacgcga gctgattttc cagggtttcg tactgtttct ctgttgggcg 2040

ggtgccgaga tgcactttag ccccgtcgcg atgtgagagg agagcacagc ggtatgactt 2100

ggcgttgttc cgcagaaagt cttgccatga ctcgccttcc agggggcaga agtgggtatg 2160

atgcctgtcc agcatctcga ttggcagggc atcgagcagg gcccgcttgt tcttcacgtg 2220

ccagtacagg gtaggctgct caactcccag cttttgagcg agtttccttg tcgtcaggcc 2280

ttcgataccg acaccattga gtaattccag agctccgttt atgactttgc tcttgtccag 2340

tctagacatt ggaccagggt tttcttcaac atcaccacaa gtgaggagag aacctctacc 2400

ttcggcaccg ggatttcggg tatatttgag tggaatgagt tcttcaatcg tagttttgac 2460

taacttgcca ttcatttcta ttaacacaaa acaatctggt gcatagtctg aaatcaactc 2520

cctacacata ccacaaggac ttaccactcg aatacttcta tctacttcgt cagaataagg 2580

gtgtctaaca gctacaatcg tgtcaaaatc cttttgtcca ttcgaaactg cactaccaat 2640

cgcaatggct tctgcacaaa cagttactcg tcctatatac gcttcaatat gtactgccga 2700

aatgatttct cctgttttcg tacgaattgc cgctcccaca tgatgtttat tatcctcata 2760

aagcattgta atcttctctg tcgctacttc tactaattct agatcctgtt gagaaatgtt 2820

aaatgttttc atggtggcgg cactagtaag ggcgaattcg gagcctgctt ttttgtacaa 2880

acttgttgat atctgcagaa ttccaccaca ctggactagt ggatccgagc tcggtaccaa 2940

gcttcttcac gacacctgaa atggaagaaa aaaactttga accactgtct gaggcttgag 3000

aatgaaccaa gatccaaact caaaaagggc aaattccaag gagaattaca tcaagtgcca 3060

agctggccta acttcagtct ccacccactc agtgtgggga aactccatcg cataaaaccc 3120

ctccccccaa cctaaagacg acgtactcca aaagctcgag aactaatcga ggtgcctgga 3180

cggcgcccgg tactccgtgg agtcacatga agcgacggct gaggacggaa aggccctttt 3240

cctttgtgtg ggtgactcac ccgcccgctc tcccgagcgc cgcgtcctcc attttgagct 3300

ccctgcagca gggccgggaa gcggccatct ttccgctcac gcaactggtg ccgaccgggc 3360

cagccttgcc gcccagggcg gggcgataca cggcggcgcg aggccaggca ccagagcagg 3420

ccggccagct tgagactacc cccgtccgat tctcggtggc cgcgctcgca ggccccgcct 3480

cgccgaacat gtgcgctggg acgcacgggc cccgtcgccg cccgcggccc caaaaaccga 3540

aataccagtg tgcagatctt ggcccgcatt tacaagacta tcttgccaga aaaaaagcgt 3600

cgcagcaggt catcaaaaat tttaaatggc tagagactta tcgaaagcag cgagacaggc 3660

gcgaaggtgc caccagattc gcacgcggcg gccccagcgc ccaggccagg cctcaactca 3720

agcacgaggc gaaggggctc cttaagcgca aggcctcgaa ctctcccacc cacttccaac 3780

ccgaagctcg ggatcaagaa tcacgtactg cagccaggtg gaagtaattc aaggcacgca 3840

agggccataa cccgtaaaga ggccaggccc gcgggaacca cacacggcac ttacctgtgt 3900

tctggcggca aacccgttgc gaaaaagaac gttcacggcg actactgcac ttatatacgg 3960

ttctccccca ccctcgggaa aaaggcggag ccagtacacg acatcacttt cccagtttac 4020

cccgcgccac cttctctagg caccggttca attgccgacc cctcccccca acttctcggg 4080

gactgtgggc gatgtgcgct ctgcccactg acgggcaccg gagcctcacg atcgatatgt 4140

cgagtttact ccctatcagt gatagagaac gtatgtcgag tttactccct atcagtgata 4200

gagaacgatg tcgagtttac tccctatcag tgatagagaa cgtatgtcga gtttactccc 4260

tatcagtgat agagaacgta tgtcgagttt actccctatc agtgatagag aacgtatgtc 4320

gagtttatcc ctatcagtga tagagaacgt atgtcgagtt tactccctat cagtgataga 4380

gaacgtatgt cgaggtaggc gtgtacggtg ggaggcctat ataagcagag ctcgtttagt 4440

gaaccgtcag atcgcctgga gaattggcta ggcaccggtg acaagtttgt acaaaaaagc 4500

aggctccgaa ttcaccggtg ccgccaccat gtacccatac gatgttccag attacgcttc 4560

gccgaagaaa aagcgcaagg tcgaagcgtc cgacaagaag tacagcatcg gcctggccat 4620

cggcaccaac tctgtgggct gggccgtgat caccgacgag tacaaggtgc ccagcaagaa 4680

attcaaggtg ctgggcaaca ccgaccggca cagcatcaag aagaacctga tcggagccct 4740

gctgttcgac agcggcgaaa cagccgaggc cacccggctg aagagaaccg ccagaagaag 4800

atacaccaga cggaagaacc ggatctgcta tctgcaagag atcttcagca acgagatggc 4860

caaggtggac gacagcttct tccacagact ggaagagtcc ttcctggtgg aagaggataa 4920

gaagcacgag cggcacccca tcttcggcaa catcgtggac gaggtggcct accacgagaa 4980

gtaccccacc atctaccacc tgagaaagaa actggtggac agcaccgaca aggccgacct 5040

gcggctgatc tatctggccc tggcccacat gatcaagttc cggggccact tcctgatcga 5100

gggcgacctg aaccccgaca acagcgacgt ggacaagctg ttcatccagc tggtgcagac 5160

ctacaaccag ctgttcgagg aaaaccccat caacgccagc ggcgtggacg ccaaggccat 5220

cctgtctgcc agactgagca agagcagacg gctggaaaat ctgatcgccc agctgcccgg 5280

cgagaagaag aatggcctgt tcggcaacct gattgccctg agcctgggcc tgacccccaa 5340

cttcaagagc aacttcgacc tggccgagga tgccaaactg cagctgagca aggacaccta 5400

cgacgacgac ctggacaacc tgctggccca gatcggcgac cagtacgccg acctgtttct 5460

ggccgccaag aacctgtccg acgccatcct gctgagcgac atcctgagag tgaacaccga 5520

gatcaccaag gcccccctga gcgcctctat gatcaagaga tacgacgagc accaccagga 5580

cctgaccctg ctgaaagctc tcgtgcggca gcagctgcct gagaagtaca aagagatttt 5640

cttcgaccag agcaagaacg gctacgccgg ctacattgac ggcggagcca gccaggaaga 5700

gttctacaag ttcatcaagc ccatcctgga aaagatggac ggcaccgagg aactgctcgt 5760

gaagctgaac agagaggacc tgctgcggaa gcagcggacc ttcgacaacg gcagcatccc 5820

ccaccagatc cacctgggag agctgcacgc cattctgcgg cggcaggaag atttttaccc 5880

attcctgaag gacaaccggg aaaagatcga gaagatcctg accttccgca tcccctacta 5940

cgtgggccct ctggccaggg gaaacagcag attcgcctgg atgaccagaa agagcgagga 6000

aaccatcacc ccctggaact tcgaggaagt ggtggacaag ggcgcttccg cccagagctt 6060

catcgagcgg atgaccaact tcgataagaa cctgcccaac gagaaggtgc tgcccaagca 6120

cagcctgctg tacgagtact tcaccgtgta taacgagctg accaaagtga aatacgtgac 6180

cgagggaatg agaaagcccg ccttcctgag cggcgagcag aaaaaggcca tcgtggacct 6240

gctgttcaag accaaccgga aagtgaccgt gaagcagctg aaagaggact acttcaagaa 6300

aatcgagtgc ttcgactccg tggaaatctc cggcgtggaa gatcggttca acgcctccct 6360

gggcacatac cacgatctgc tgaaaattat caaggacaag gacttcctgg acaatgagga 6420

aaacgaggac attctggaag atatcgtgct gaccctgaca ctgtttgagg acagagagat 6480

gatcgaggaa cggctgaaaa cctatgccca cctgttcgac gacaaagtga tgaagcagct 6540

gaagcggcgg agatacaccg gctggggcag gctgagccgg aagctgatca acggcatccg 6600

ggacaagcag tccggcaaga caatcctgga tttcctgaag tccgacggct tcgccaacag 6660

aaacttcatg cagctgatcc acgacgacag cctgaccttt aaagaggaca tccagaaagc 6720

ccaggtgtcc ggccagggcg atagcctgca cgagcacatt gccaatctgg ccggcagccc 6780

cgccattaag aagggcatcc tgcagacagt gaaggtggtg gacgagctcg tgaaagtgat 6840

gggccggcac aagcccgaga acatcgtgat cgaaatggcc agagagaacc agaccaccca 6900

gaagggacag aagaacagcc gcgagagaat gaagcggatc gaagagggca tcaaagagct 6960

gggcagccag atcctgaaag aacaccccgt ggaaaacacc cagctgcaga acgagaagct 7020

gtacctgtac tacctgcaga atgggcggga tatgtacgtg gaccaggaac tggacatcaa 7080

ccggctgtcc gactacgatg tggacgccat cgtgcctcag agctttctga aggacgactc 7140

catcgacaac aaggtgctga ccagaagcga caagaaccgg ggcaagagcg acaacgtgcc 7200

ctccgaagag gtcgtgaaga agatgaagaa ctactggcgg cagctgctga acgccaagct 7260

gattacccag agaaagttcg acaatctgac caaggccgag agaggcggcc tgagcgaact 7320

ggataaggcc ggcttcatca agagacagct ggtggaaacc cggcagatca caaagcacgt 7380

ggcacagatc ctggactccc ggatgaacac taagtacgac gagaatgaca agctgatccg 7440

ggaagtgaaa gtgatcaccc tgaagtccaa gctggtgtcc gatttccgga aggatttcca 7500

gttttacaaa gtgcgcgaga tcaacaacta ccaccacgcc cacgacgcct acctgaacgc 7560

cgtcgtggga accgccctga tcaaaaagta ccctaagctg gaaagcgagt tcgtgtacgg 7620

cgactacaag gtgtacgacg tgcggaagat gatcgccaag agcgagcagg aaatcggcaa 7680

ggctaccgcc aagtacttct tctacagcaa catcatgaac tttttcaaga ccgagattac 7740

cctggccaac ggcgagatcc ggaagcggcc tctgatcgag acaaacggcg aaaccgggga 7800

gatcgtgtgg gataagggcc gggattttgc caccgtgcgg aaagtgctga gcatgcccca 7860

agtgaatatc gtgaaaaaga ccgaggtgca gacaggcggc ttcagcaaag agtctatcct 7920

gcccaagagg aacagcgata agctgatcgc cagaaagaag gactgggacc ctaagaagta 7980

cggcggcttc gacagcccca ccgtggccta ttctgtgctg gtggtggcca aagtggaaaa 8040

gggcaagtcc aagaaactga agagtgtgaa agagctgctg gggatcacca tcatggaaag 8100

aagcagcttc gagaagaatc ccatcgactt tctggaagcc aagggctaca aagaagtgaa 8160

aaaggacctg atcatcaagc tgcctaagta ctccctgttc gagctggaaa acggccggaa 8220

gagaatgctg gcctctgccg gcgaactgca gaagggaaac gaactggccc tgccctccaa 8280

atatgtgaac ttcctgtacc tggccagcca ctatgagaag ctgaagggct cccccgagga 8340

taatgagcag aaacagctgt ttgtggaaca gcacaagcac tacctggacg agatcatcga 8400

gcagatcagc gagttctcca agagagtgat cctggccgac gctaatctgg acaaagtgct 8460

gtccgcctac aacaagcacc gggataagcc catcagagag caggccgaga atatcatcca 8520

cctgtttacc ctgaccaatc tgggagcccc tgccgccttc aagtactttg acaccaccat 8580

cgaccggaag aggtacacca gcaccaaaga ggtgctggac gccaccctga tccaccagag 8640

catcaccggc ctgtacgaga cacggatcga cctgtctcag ctgggaggcg acagccccaa 8700

gaagaagaga aaggtggagg ccagctgatt aattaagaat tcgacccagc tttcttgtac 8760

aaagtggttg atatccagca cagtggcggc cgctcgagtc tagagggccc gcggttcgaa 8820

ggtaagccta tccctaaccc tctcctcggt ctcgattcta cgcgtaccgg ttaggggccc 8880

gtttaaaccc gctgatcagc ctcgactgtg ccttctagtt gccagccatc tgttgtttgc 8940

ccctcccccg tgccttcctt gaccctggaa ggtgccactc ccactgtcct ttcctaataa 9000

aatgaggaaa ttgcatcgca ttgtctgagt aggtgtcatt ctattctggg gggtggggtg 9060

gggcaggaca gcaaggggga ggattgggaa gacaatagca ggcatgctgg ggatgcggtg 9120

ggctctatgg cgacgcgcct ggatccccgg gtaccgagtt gggagctcac ggggacagcc 9180

cccccccaaa gcccccaggg atgtaattac gtccctcccc cgctaggggg cagcagcgag 9240

ccgcccgggg ctccgctccg gtccggcgct ccccccgcat ccccgagccg gcagcgtgcg 9300

gggacagccc gggcacgggg aaggtggcac gggatcgctt tcctctgaac gcttctcgct 9360

gctctttgag cctgcagaca cctgggggga tacggggaaa aagctttagg ctgaaagaga 9420

gatttagaat gacagtctag tgggagctca cggggacagc ccccccccaa agcccccagg 9480

gatgtaatta cgtccctccc ccgctagggg gcagcagcga gccgcccggg gctccgctcc 9540

ggtccggcgc tccccccgca tccccgagcc ggcagcgtgc ggggacagcc cgggcacggg 9600

gaaggtggca cgggatcgct ttcctctgaa cgcttctcgc tgctctttga gcctgcagac 9660

acctgggggg atacggggaa aaagctttag gctgaaagag agatttagaa tgacagaact 9720

cgatttcatt gcagactggc cggccactag tacgcgccgg ctcgacatac tagttaaaag 9780

ttttgttact ttatagaaga aattttgagt ttttgttttt ttttaataaa taaataaaca 9840

taaataaatt gtttgttgaa tttattatta gtatgtaagt gtaaatataa taaaacttaa 9900

tatctattca aattaataaa taaacctcga tatacagacc gataaaacac atgcgtcaat 9960

tttacgcatg attatcttta acgtacgtca caatatgatt atctttctag ggttaaaagg 10020

gcgaattcgc ggccgctaaa ttcaattcgc cctatagtga gtcgtattac aattcactgg 10080

ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt tacccaactt aatcgccttg 10140

cagcacatcc ccctttcgcc agctggcgta atagcgaaga ggcccgcacc gatcgccctt 10200

cccaacagtt gcgcagccta tacgtacggc agtttaaggt ttacacctat aaaagagaga 10260

gccgttatcg tctgtttgtg gatgtacaga gtgatattat tgacacgccg gggcgacgga 10320

tggtgatccc cctggccagt gcacgtctgc tgtcagataa agtctcccgt gaactttacc 10380

cggtggtgca tatcggggat gaaagctggc gcatgatgac caccgatatg gccagtgtgc 10440

cggtctccgt tatcggggaa gaagtggctg atctcagcca ccgcgaaaat gacatcaaaa 10500

acgccattaa cctgatgttc tggggaatat aaatgtcagg catgagatta tcaaaaagga 10560

tcttcaccta gatccttttc acgtagaaag ccagtccgca gaaacggtgc tgaccccgga 10620

tgaatgtcag ctactgggct atctggacaa gggaaaacgc aagcgcaaag agaaagcagg 10680

tagcttgcag tgggcttaca tggcgatagc tagactgggc ggttttatgg acagcaagcg 10740

aaccggaatt gccagctggg gcgccctctg gtaaggttgg gaagccctgc aaagtaaact 10800

ggatggcttt cttgccgcca aggatctgat ggcgcagggg atcaagctct gatcaagaga 10860

caggatgagg atcgtttcgc atgattgaac aagatggatt gcacgcaggt tctccggccg 10920

cttgggtgga gaggctattc ggctatgact gggcacaaca gacaatcggc tgctctgatg 10980

ccgccgtgtt ccggctgtca gcgcaggggc gcccggttct ttttgtcaag accgacctgt 11040

ccggtgccct gaatgaactg caagacgagg cagcgcggct atcgtggctg gccacgacgg 11100

gcgttccttg cgcagctgtg ctcgacgttg tcactgaagc gggaagggac tggctgctat 11160

tgggcgaagt gccggggcag gatctcctgt catctcacct tgctcctgcc gagaaagtat 11220

ccatcatggc tgatgcaatg cggcggctgc atacgcttga tccggctacc tgcccattcg 11280

accaccaagc gaaacatcgc atcgagcgag cacgtactcg gatggaagcc ggtcttgtcg 11340

atcaggatga tctggacgaa gagcatcagg ggctcgcgcc agccgaactg ttcgccaggc 11400

tcaaggcgag catgcccgac ggcgaggatc tcgtcgtgac ccatggcgat gcctgcttgc 11460

cgaatatcat ggtggaaaat ggccgctttt ctggattcat cgactgtggc cggctgggtg 11520

tggcggaccg ctatcaggac atagcgttgg ctacccgtga tattgctgaa gagcttggcg 11580

gcgaatgggc tgaccgcttc ctcgtgcttt acggtatcgc cgctcccgat tcgcagcgca 11640

tcgccttcta tcgccttctt gacgagttct tctgaattat taacgcttac aatttcctga 11700

tgcggtattt tctccttacg catctgtgcg gtatttcaca ccgcatcagg tggcactttt 11760

cggggaaatg tgcgcggaac ccctatttgt ttatttttct aaatacattc aaatatgtat 11820

ccgctcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt 11880

tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc 11940

agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc 12000

gtcgtgtaga taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata 12060

ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg 12120

gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc 12180

cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 12240

acaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 12300

cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 12360

cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca 12420

ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac 12480

tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca 12540

atacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt 12600

tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc 12660

actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca 12720

aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 12780

ctcatactct tcctttttca atattattga agcatttatc agggttattg tctcatgacc 12840

aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga aaagatcaaa 12900

ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca 12960

ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt tccgaaggta 13020

actggcttca gcagagcgca gataccaaat actgttcttc tagtgtagcc gtagttaggc 13080

caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat cctgttacca 13140

gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag acgatagtta 13200

ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc cagcttggag 13260

cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag cgccacgctt 13320

cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac aggagagcgc 13380

acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg gtttcgccac 13440

ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac 13500

gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc tcacatgttc 13560

tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga gtgagctgat 13620

accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga agcggaag 13678

<210> 19

<211> 1745

<212> DNA

<213> Artificial sequence

<400> 19

acacggaaga tgaggtccga gtggcctgct gaggacttgc tgcttgtccc caggtcccca 60

ggtcatgccc tccttctgcc accctgggga gctgagggcc tcagctgggg ctgctgtcct 120

aaggcagggt gggaactagg cagccagcag ggaggggacc cctccctcac tcccactctc 180

ccacccccac caccttggcc catccatggc ggcatcttgg gccatccggg actggggaca 240

ggggtcctgg ggacaggggt gtggggacag gggtcctggg gacaggggtc tggggacagg 300

ggtcctgggg acaggggtgt ggggacaggg gtgtggggac aggggtgtgg ggacaggggt 360

cctggggaca ggggtctggg gacaggggtc tgaggacagg ggtgtgggga caggggtgtg 420

gggacagggg tgtggggaca ggggtgtggg gacaggggtc tggggacagg ggtccggggg 480

acaggggtgt ggggacaggg gtgtggggac aggggtgtgg ggacaggggt ctggggacag 540

gggtgtgggg acaggggtcc tggggacagg ggtgtgggga taggggtgtg gggacagggg 600

tgtggggaca ggggtgtggg gacaggggtc tggggacagc agcgcaaaga gccccgccct 660

gcagcctcca gctctcctgg tctaatgtgg aaagtggccc aggtgagggc tttgctctcc 720

tggagacatt tgcccccagc tgtgagcagg gacaggtctg gccaccgggc ccctggttaa 780

gactctaatg acccgctggt cctgaggaag aggtgctgac gaccaaggag atcttcccac 840

agacccagca ccagggaaat ggtccggaaa ttgcagcctc agcccccagc catctgccga 900

cccccccacc ccaggcccta atgggccagg cggcaggggt tgagaggtag gggagatggg 960

ctctgagact ataaagccag cgggggccca gcagccctca ggatccccgg gtaccggtcg 1020

ccaccatggt gagcaagggc gaggagctgt tcaccggggt ggtgcccatc ctggtcgagc 1080

tggacggcga cgtaaacggc cacaagttca gcgtgtcagg cgagggcgag ggcgatgcca 1140

cctacggcaa gctgaccctg aagttcatct gcaccaccgg caagctgccc gtgccctggc 1200

ccaccctcgt gaccaccctg acctacggcg tgcagtgctt cagccgctac cccgaccaca 1260

tgaagcagca cgacttcttc aagtccgcca tgccagaagg ctacgtccag gagcgcacca 1320

tcttcttcaa ggacgacggc aactacaaga cccgcgccga ggtgaagttc gagggcgaca 1380

ccctggtgaa ccgcatcgag ctgaagggca tcgacttcaa ggaggacggc aacatcctgg 1440

ggcacaagct ggagtacaac tacaacagcc acaacgtcta tatcatggcc gacaagcaga 1500

agaacggcat caaggtgaac ttcaagatcc gccacaacat cgaggacggc agcgtgcagc 1560

tcgccgacca ctaccagcag aacacaccca tcggcgacgg ccccgtgctg ctgcccgaca 1620

accactacct gagcacccag tccgccctga gcaaagaccc caacgagaag cgcgatcaca 1680

tggtcctgct ggagttcgtg accgccgccg ggatcactct cggcatggac gagctgtaca 1740

agtaa 1745

<210> 20

<211> 155

<212> RNA

<213> Artificial sequence

<400> 20

guuuaagagc uaugcuggaa acagcauagc aaguuuaaau aaggcuaguc cguuaucaac 60

uugaaaaagu ggcaccgagu cggugccaau ugggucucca gauuguaugu agccuguaug 120

uagccuguau guagccugua uguagccugu augua 155
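The rows of sequence 20 can be checked mechanically against the declared `<211>` length. The following is a minimal sketch, not part of the original filing: it joins the rows, verifies each running count, and counts a repeated 8-nt motif. The motif `uguaugua` is an assumption made here for illustration (the listing itself does not name the PUF domain binding site), and the function names `parse_listing` and `count_motif` are hypothetical helpers.

```python
import re

# Rows of sequence 20, copied from the listing above
# (groups of 10 nucleotides followed by a running count).
LISTING_ROWS = """
guuuaagagc uaugcuggaa acagcauagc aaguuuaaau aaggcuaguc cguuaucaac 60
uugaaaaagu ggcaccgagu cggugccaau ugggucucca gauuguaugu agccuguaug 120
uagccuguau guagccugua uguagccugu augua 155
"""

DECLARED_LENGTH = 155            # value of the <211> field for sequence 20
ASSUMED_PBS_MOTIF = "uguaugua"   # assumption: not stated in the listing itself


def parse_listing(rows: str) -> str:
    """Join listing rows into one sequence, checking every running count."""
    sequence = ""
    for row in rows.splitlines():
        if not row.strip():
            continue  # skip blank separator lines
        *groups, count = row.split()
        sequence += "".join(groups)
        if len(sequence) != int(count):
            raise ValueError(
                f"running count {count} does not match parsed length {len(sequence)}"
            )
    return sequence


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, including overlapping ones."""
    return len(re.findall(f"(?={re.escape(motif)})", sequence))


seq20 = parse_listing(LISTING_ROWS)
length_matches = len(seq20) == DECLARED_LENGTH
motif_copies = count_motif(seq20, ASSUMED_PBS_MOTIF)
print(length_matches, motif_copies)
```

The same parser can be applied to any other entry of the listing by pasting its rows and its `<211>` value; a mismatch in a running count raises an error at the first inconsistent row.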

Claims

1. A method for preparing insulin-secreting cells from hepatocytes, comprising the step of: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain recombinant cells, which are the insulin-secreting cells;

the three functional elements of the Casilio system are: a dCas9 protein, sgRNAs bearing PUF domain binding sites, and an effector protein fused to a PUF domain;

the sgRNAs bearing PUF domain binding sites consist of 12 sgRNAs; the target sequences of the 12 sgRNAs are shown, in order, as sequence 1 to sequence 12 in the sequence listing.

2. A method of reprogramming hepatocytes directly to insulin-secreting cells comprising the steps of: introducing into hepatocytes DNA encoding three functional elements of the Casilio system;

3. A recombinant cell prepared by a method comprising the steps of: introducing coding DNA of three functional elements of a Casilio system into hepatocytes to obtain recombinant cells;

4. A kit comprising a recombinant expression vector comprising the three functional elements of the Casilio system or the DNA encoding the three functional elements of the Casilio system;

the sgRNAs bearing PUF domain binding sites consist of 12 sgRNAs; the target sequences of the 12 sgRNAs are shown, in order, as sequence 1 to sequence 12 in the sequence listing;

the kit is used for preparing insulin-secreting cells from hepatocytes.

5. Use of DNA encoding the three functional elements of the Casilio system, or of a recombinant expression vector containing that DNA, in the preparation of a kit;

6. An sgRNA combination of (a) or (b) as follows:

7. The sgRNA combination of claim 6 for use as (c1), (c2) or (c3):

(c1) preparing insulin-secreting cells from hepatocytes;

(c2) direct reprogramming of hepatocytes to insulin-secreting cells;

8. A method for preparing insulin-secreting cells from hepatocytes, comprising the step of: introducing DNA encoding the three functional elements of the Casilio system into hepatocytes to obtain recombinant cells, which are the insulin-secreting cells;

the sgRNAs bearing PUF domain binding sites consist of 4 sgRNAs; the target sequences of the 4 sgRNAs are shown, in order, as sequence 5 to sequence 8 in the sequence listing.

9. A recombinant cell prepared by a method comprising the steps of: introducing coding DNA of three functional elements of a Casilio system into hepatocytes to obtain recombinant cells;

10. Use of the three functional elements of the Casilio system in the preparation of a kit;

the sgRNAs bearing PUF domain binding sites consist of 4 sgRNAs; the target sequences of the 4 sgRNAs are shown, in order, as sequence 5 to sequence 8 in the sequence listing;