CN117089572A

CN117089572A - Low off-target base editor and construction thereof

Info

Publication number: CN117089572A
Application number: CN202210518789.4A
Authority: CN
Inventors: 张学礼; 毕昌昊; 赵东东; 侯雪亭; 王杰; 魏占东; 陈旭旭
Original assignee: Tianjin Institute of Industrial Biotechnology of CAS
Current assignee: Tianjin Institute of Industrial Biotechnology of CAS
Priority date: 2022-05-13
Filing date: 2022-05-13
Publication date: 2023-11-21

Abstract

The application discloses a low off-target base editor and its construction. The disclosed base editor contains or expresses a Cas9 mutant, which is a mutant protein obtained by mutating amino acid residues between positions 1010 and 1031 of Cas9. By modifying base editors (BEs) that contain Streptococcus pyogenes Cas9 or mutants thereof, the application constructs low off-target base editors, including low off-target CBE, low off-target ABE and low off-target GBE, which markedly reduce the off-target activity of the base editor in mammalian cells. The base editors have broad application prospects in gene therapy, drug screening, construction of animal and plant models, and the like.

Description

Low off-target base editor and construction thereof

Technical Field

The application relates to the technical field of biotechnology and gene editing, in particular to a low off-target base editor and construction thereof.

Background

Genome editing refers to the targeted design and efficient modification of cells at the genome level. CRISPR/Cas9 genome editing is simple to design, convenient to operate and highly efficient, and has been applied successfully to genome editing in a wide range of target cells. It mainly uses a guide RNA (gRNA) to direct the Cas9 protein to cut precisely at a genomic target site, producing a DNA double-strand break (DSB), which the host cell repairs by its own non-homologous end joining (NHEJ) or homology-directed repair (HDR) pathway. Specific editing of a single base is difficult to achieve in this way: the outcome of a double-strand break is uncertain, HDR occurs at low frequency, and NHEJ causes random insertions or deletions of bases. Conventional CRISPR/Cas techniques therefore have clear drawbacks for single-base editing.

CRISPR base editors (BEs) overcome this drawback of conventional CRISPR/Cas technology. A base editor is a fusion of a deaminase with either a Cas9 lacking DNA-cleavage activity (dCas9) or a Cas9 that cleaves only a single DNA strand (nCas9). Guided by a gRNA, it introduces a precise point mutation at the target site, and it has great application prospects in the treatment of genetic diseases caused by gene mutations. Existing base editors mainly comprise: the cytosine base editor (CBE), which converts cytosine to thymine within the editing window of the target sequence (C > T); the adenine base editor (ABE), which converts adenine to guanine within the editing window (A > G); and the newer glycosylase base editor (GBE), which converts cytosine to adenine (C > A) in E. coli and specifically converts cytosine to guanine (C > G) in mammalian cells.

CRISPR/Cas9 has a serious off-target effect: it can cut the DNA double strand at unintended genomic sites, which creates potential risks and is a major factor limiting the clinical application of CRISPR/Cas9 gene editing. Many clinical trials based on CRISPR/Cas9 gene editing have now started in China and abroad, so reducing the off-target effect is an urgent problem. CRISPR/Cas9 identifies the site to be edited mainly through complementary base pairing between the targeting sequence of the gRNA and the target DNA; however, Cas9 can sometimes still edit when the targeting sequence of the gRNA does not match the genomic DNA perfectly, which produces off-target effects. High-fidelity Cas9 proteins constructed by protein engineering or directed evolution (Cas9-HF1, HypaCas9, evoCas9, etc.) reduce off-target editing, but they also reduce editing efficiency at the target site.

CRISPR base editors use dCas9 or nCas9 together with a gRNA to recognise the genomic target site, deaminate the target base with the deaminase, and then rely on the cell's DNA repair system to complete base editing at the target site. As with CRISPR/Cas9 genome editing, off-target editing can occur at imperfectly matched sites.

Disclosure of Invention

The application aims to solve the technical problem of reducing off-target of a base editor.

To solve the above technical problem, the application first provides a base editor that contains or expresses a Cas9 mutant, wherein the Cas9 mutant is a mutant protein obtained by mutating amino acid residues between positions 1010 and 1031 of Cas9, or a mutant protein obtained by mutating the amino acid residues at positions 1010, 1013, 1014, 1016, 1018, 1019, 1027 and/or 1031 of Cas9, or a mutant protein obtained by applying any one or more of the following mutations to Cas9:

M1) mutating the tyrosine residue at position 1010 of Cas9 to an aspartic acid residue;

M2) mutating the tyrosine residue at position 1013 of Cas9 to an aspartic acid residue;

M3) mutating the lysine residue at position 1014 of Cas9 to a proline residue;

M4) mutating the tyrosine residue at position 1016 of Cas9 to an aspartic acid residue;

M5) mutating the valine residue at position 1018 of Cas9 to an aspartic acid residue;

M6) mutating the arginine residue at position 1019 of Cas9 to an aspartic acid residue;

M7) mutating the glutamine residue at position 1027 of Cas9 to an aspartic acid residue;

M8) mutating the lysine residue at position 1031 of Cas9 to an aspartic acid residue.

In the above base editor, the Cas9 may be a protein represented by sequence 2 or sequence 6 in the sequence table.

In the above base editor, the Cas9 mutant may be a mutant protein obtained by applying the seven mutations M1), M2), M4), M5), M6), M7) and M8) to Cas9, or a mutant protein obtained by applying the five mutations M1), M2), M5), M7) and M8) to Cas9, or a mutant protein obtained by applying the eight mutations M1) to M8) to Cas9.
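The three combinations can be written down compactly. The following is a minimal sketch rather than part of the application: the names `SUBSTITUTIONS`, `VARIANTS` and `apply_variant` are illustrative, and the 22-residue wild-type window is assumed to be the translation of nucleotides 3028 to 3093 of the Cas9 gene quoted in Example 1. The five-mutation set follows the positions given in Example 1 (1010, 1013, 1018, 1027 and 1031).

```python
# Illustrative sketch: apply the M1)-M8) substitutions to residues 1010-1031 of Cas9.
WINDOW_START = 1010
# Assumption: residues 1010-1031, translated from the wild-type nucleotides in Example 1.
WT_WINDOW = "YGDYKVYDVRKMIAKSEQEIGK"

# label -> (position, wild-type residue, replacement residue)
SUBSTITUTIONS = {
    "M1": (1010, "Y", "D"),
    "M2": (1013, "Y", "D"),
    "M3": (1014, "K", "P"),
    "M4": (1016, "Y", "D"),
    "M5": (1018, "V", "D"),
    "M6": (1019, "R", "D"),
    "M7": (1027, "Q", "D"),
    "M8": (1031, "K", "D"),
}

# Hypothetical keys for the three combinations described in the text.
VARIANTS = {
    "seven": ["M1", "M2", "M4", "M5", "M6", "M7", "M8"],
    "five": ["M1", "M2", "M5", "M7", "M8"],
    "eight": ["M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8"],
}


def apply_variant(labels, window=WT_WINDOW, start=WINDOW_START):
    """Return the window after applying the listed substitutions."""
    residues = list(window)
    for label in labels:
        position, wild_type, replacement = SUBSTITUTIONS[label]
        index = position - start
        # Guard against numbering errors: the expected residue must be present.
        if residues[index] != wild_type:
            raise ValueError(f"{label}: expected {wild_type} at {position}")
        residues[index] = replacement
    return "".join(residues)


for name, labels in VARIANTS.items():
    print(name, apply_variant(labels))
```

The residue check inside `apply_variant` is the useful part of the design: an off-by-one in the numbering raises an error instead of silently mutating the wrong position.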

The above base editor may further contain or express an sgRNA targeting the target sequence and/or a domain with base-modifying activity.

The domain with base-modifying activity may be a domain with deaminase activity, or a mutant, homolog or polypeptide fragment that has all or part of the deaminase activity. Specifically, the domain with base-modifying activity is an adenine deaminase, or a mutant, homolog or polypeptide fragment having all or part of the adenine deaminase activity; or the domain with base-modifying activity is a cytosine deaminase, or a mutant, homolog or polypeptide fragment having all or part of the cytosine deaminase activity.

In particular, the base editor may also contain the components of a base editor other than Cas9 or its mutants, such as an sgRNA targeting the target sequence and/or a domain with base-modifying activity. Further, the base editor may contain the components, other than Cas9 or its mutants, of a CBE base editor (e.g., BE4max or hyA3A-BE4max), of an ABE base editor (e.g., NG-ABEmax or ABE8e), or of a GBE base editor (e.g., APOBEC_nCas9_Ung).

The Cas9 mutants also fall within the scope of the present application.

The application also provides a biological material associated with the Cas9 mutant, which is any one of the following B1) to B5):

B1) a nucleic acid molecule encoding the Cas9 mutant;

B2) an expression cassette comprising the nucleic acid molecule of B1);

B3) a recombinant vector comprising the nucleic acid molecule of B1), or a recombinant vector comprising the expression cassette of B2);

B4) a recombinant microorganism comprising the nucleic acid molecule of B1), or a recombinant microorganism comprising the expression cassette of B2), or a recombinant microorganism comprising the recombinant vector of B3);

B5) a cell line containing the nucleic acid molecule of B1), or a cell line containing the expression cassette of B2).

In the above biological material, the nucleic acid molecule of B1) may be a mutant gene obtained by mutating the nucleotide sequence between positions 3028 and 3093 of the Cas9 gene.

Further, the nucleic acid molecule of B1) may be a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 1 in the sequence table to sequence 10 or sequence 11, or a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 5 in the sequence table to sequence 12, or a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 9 in the sequence table to sequence 13 or sequence 14.
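The nucleotide range 3028 to 3093 and the residue range 1010 to 1031 describe the same window. A short sketch of the arithmetic, assuming 1-based numbering that starts at the first nucleotide of the coding sequence (the function names are illustrative):

```python
def codon_to_nt_range(residue):
    """Return the 1-based (first, last) nucleotide positions of a codon."""
    first = 3 * (residue - 1) + 1
    return first, first + 2


def nt_to_residue(nt_position):
    """Return the 1-based residue number whose codon contains a nucleotide."""
    return (nt_position - 1) // 3 + 1


window_start, _ = codon_to_nt_range(1010)
_, window_end = codon_to_nt_range(1031)
print(window_start, window_end)            # 3028 3093
print(window_end - window_start + 1)       # 66 nucleotides, i.e. 22 codons
```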

In the above biological material, the expression cassette of B2) containing the nucleic acid molecule encoding the Cas9 mutant (the Cas9 mutant gene expression cassette) refers to DNA capable of expressing the Cas9 mutant in a host cell. This DNA may include not only a promoter that initiates transcription of the Cas9 mutant gene but also a terminator that terminates its transcription. Further, the expression cassette may also include an enhancer sequence.

Recombinant vectors containing the Cas9 mutant gene expression cassettes can be constructed using existing expression vectors.

In the above biological material, the vector may be a plasmid, cosmid, phage or viral vector.

The recombinant vector of B3) may be BE4maxM, BE4maxM2, NG-ABEmaxM, ABE8eM, APOBEC_nCas9_UngM2 or APOBEC_nCas9_UngM3;

BE4maxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the BE4max plasmid to sequence 10;

BE4maxM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the BE4max plasmid to sequence 11;

NG-ABEmaxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 5) in the NG-ABEmax plasmid to sequence 12;

ABE8eM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the ABE8e plasmid to sequence 10;

APOBEC_nCas9_UngM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 9) in the APOBEC_nCas9_Ung plasmid to sequence 13;

APOBEC_nCas9_UngM3 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 9) in the APOBEC_nCas9_Ung plasmid to sequence 14.

In the above application, the microorganism may be yeast, bacteria, algae or fungi.

In the above applications, the cell line does not include propagation material.

The application also provides a product comprising the base editor, or the Cas9 mutant, or the biological material.

The product may also contain one or more pharmaceutically acceptable carriers.

The application also provides any of the following uses of the base editor, the Cas9 mutant, the biological material, or the product:

y1) converting a cytosine nucleotide residue in a biological cell or animal or subject to a thymine nucleotide residue;

y2) converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;

y3) converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;

y4) preparing a product for converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject;

y5) preparing a product for converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;

y6) preparing a product for converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;

y7) serving as, or preparing, a single-base editing reagent or kit;

y8) serving as, or preparing, a medicament for gene therapy;

y9) treating or preventing a disease;

y10) for the preparation of a product for the treatment or prophylaxis of a disease.

The application also provides any one of the following methods:

x1) a method of converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target cytosine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize conversion of the target cytosine nucleotide residue into thymine nucleotide residue;

x2) a method of converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target adenine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize the conversion of the target adenine nucleotide residue into guanine nucleotide residue;

x3) a method of converting a cytosine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target cytosine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize the conversion of the target cytosine nucleotide residue into a guanine nucleotide residue.

In the above, the biological cell may be a mammalian cell; the animal may be a mammal.

The application constructs a low off-target base editor comprising low off-target CBE, low off-target ABE and low off-target GBE by modifying the Base Editor (BE) containing the streptococcus pyogenes Cas9 or mutants thereof, and can remarkably reduce the off-target activity of the base editor in mammalian cells. The base editor has wide application prospect in gene therapy, drug screening, animal and plant model construction and the like.

Detailed Description

The following detailed description of the application is provided in connection with the accompanying drawings that are presented to illustrate the application and not to limit the scope thereof. The examples provided below are intended as guidelines for further modifications by one of ordinary skill in the art and are not to be construed as limiting the application in any way.

The experimental methods in the following examples, unless otherwise specified, are conventional methods, and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents, instruments and the like used in the examples described below are commercially available unless otherwise specified. The quantitative tests in the following examples were all set up in triplicate and the results averaged. In the following examples, unless otherwise specified, the 1 st position of each nucleotide sequence in the sequence listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA.

The medium used hereinafter was DMEM medium (Thermo Fisher) containing 10% fetal bovine serum (Gibco).

Example 1: preparation and application of low off-target CBE genome editor

In this example, the nCas9 of the commonly used CBE base editors BE4max (Addgene: 112093) and hyA3A-BE4max (Addgene: 157943) was mutated at Y (tyrosine) 1010, Y (tyrosine) 1013, Y (tyrosine) 1016, V (valine) 1018, R (arginine) 1019, Q (glutamine) 1027 and K (lysine) 1031, each to D (aspartic acid) (the mutated nCas9 protein is designated nCas9-M), to construct BE4maxM and hyA3A-BE4maxM, respectively. BE4maxM2 was constructed by mutating Y (tyrosine) 1010, Y (tyrosine) 1013, V (valine) 1018, Q (glutamine) 1027 and K (lysine) 1031 of the nCas9 in BE4max (Addgene: 112093) to D (aspartic acid) (the mutated nCas9 protein is designated nCas9-M2). The sequences of nCas9, nCas9-M and nCas9-M2 are sequence 2, sequence 3 and sequence 4 in the sequence table, respectively.

In mammalian cells, the genomic RNF2 site was base-edited with BE4max, BE4maxM, hyA3A-BE4max, hyA3A-BE4maxM and BE4maxM2, using gRNAs with and without mismatches. The results were as follows: with the mismatch-free gRNA, the editing efficiencies of BE4max, BE4maxM and BE4maxM2 did not differ markedly, nor did those of hyA3A-BE4max and hyA3A-BE4maxM; with the mismatched gRNAs, the editing efficiencies of BE4maxM and BE4maxM2 were markedly lower than that of BE4max, and the editing efficiency of hyA3A-BE4maxM was markedly lower than that of hyA3A-BE4max.

The experimental procedure was as follows:

Preparation of plasmids:

Using the BE4max plasmid (Addgene: 112093) as the template, point mutations were introduced into the nCas9 gene of the plasmid by the primer-embedding method with the primer pair P1 and P2 in Table 6; the resulting mutant plasmid is BE4maxM. Using the BE4max plasmid (Addgene: 112093) as the template, point mutations were introduced into the nCas9 gene of the plasmid by the primer-embedding method with the primer pair P1 and P3 in Table 6; the resulting mutant plasmid is BE4maxM2.

The BE4max plasmid contains the nCas9 gene shown in sequence 1 in the sequence table and can express the nCas9 shown in sequence 2. BE4maxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the nCas9 gene (sequence 1) in the BE4max plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 10); the mutated nCas9 gene is designated the nCas9-M gene, and BE4maxM contains the nCas9-M gene and can express the nCas9-M shown in sequence 3. BE4maxM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the nCas9 gene (sequence 1) in the BE4max plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGTACGACGACCGGAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 11); the mutated nCas9 gene is designated the nCas9-M2 gene, and BE4maxM2 contains the nCas9-M2 gene and can express the nCas9-M2 shown in sequence 4.
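The 66-nucleotide windows quoted above can be checked against the stated amino acid changes by translating each window and comparing it with the wild type. The sketch below is illustrative only: it uses the standard genetic code, assumes the window starts at codon 1010, and the names `translate` and `substitutions` are not from the application.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitutions(wild_type_dna, mutant_dna, first_residue=1010):
    """List amino acid changes such as 'Y1010D' between two aligned windows."""
    wild_type, mutant = translate(wild_type_dna), translate(mutant_dna)
    return [
        f"{a}{first_residue + i}{b}"
        for i, (a, b) in enumerate(zip(wild_type, mutant))
        if a != b
    ]


# Positions 3028-3093 of the nCas9 gene (sequence 1) and the two replacements.
WILD_TYPE = "TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG"
SEQUENCE_10 = "GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"
SEQUENCE_11 = "GACGGCGACGACAAGGTGTACGACGACCGGAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"

print(substitutions(WILD_TYPE, SEQUENCE_10))  # seven changes (nCas9-M)
print(substitutions(WILD_TYPE, SEQUENCE_11))  # five changes (nCas9-M2)
```

Under these assumptions, sequence 10 yields the seven substitutions of nCas9-M and sequence 11 the five substitutions of nCas9-M2, matching the description of the two mutants.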

Using the hyA3A-BE4max plasmid (Addgene: 157943) as the template, point mutations were introduced into the nCas9 gene of the plasmid by the primer-embedding method with the primer pair P1 and P2 in Table 6; the resulting mutant plasmid is hyA3A-BE4maxM.

The hyA3A-BE4max plasmid contains the nCas9 gene shown in sequence 1 in the sequence table and can express the nCas9 shown in sequence 2; hyA3A-BE4maxM is a mutant plasmid obtained by replacing the nCas9 gene (sequence 1) in the hyA3A-BE4max plasmid with the nCas9-M gene, and can express the nCas9-M shown in sequence 3.

Using the pGL3-U6-sgRNA-PGK-puromycin plasmid (Addgene: 51133) as the template, pre-spacer sequences were inserted by the primer-embedding method to obtain recombinant plasmids expressing gRNAs that target the genomic RNF2 site (gRNA plasmids for short): a gRNA plasmid without mismatch (gRNA1), a gRNA plasmid with a mismatch (gRNA2) and a gRNA plasmid with a deletion mismatch (gRNA3).

Primers used for the mismatch free gRNA plasmid (gRNA 1) are P10 and P11 in Table 6; primers used for the mismatched gRNA plasmid (gRNA 2) are P10 and P12 in Table 6; the primers used for the deletion mismatch containing gRNA plasmid (gRNA 3) are P10 and P13 in Table 6.

The mismatch-free gRNA plasmid (gRNA1) is the plasmid obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA1; the mismatched gRNA plasmid (gRNA2) is the plasmid obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA2; the deletion-mismatch gRNA plasmid (gRNA3) is the plasmid obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA3.

Plasmid transfection: HEK293T cells (ATCC, CRL-3216) were plated in 24-well plates at 5 × 10⁵ cells. When the cells in each well had grown to 40%-60% confluence, each base editor plasmid was co-transfected into the HEK293T cells with a mismatch-free or mismatch-containing gRNA plasmid targeting the genomic RNF2 site (the pre-spacer sequences of the gRNA plasmids are shown in Table 1), using 600 ng of base editor plasmid and 300 ng of gRNA plasmid with Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Each plasmid combination was transfected in 3 replicates, and 5 μg/ml puromycin (Merck, USA) was added to the medium 24 hours after transfection. Genomic DNA was extracted with a rapid DNA extraction solution (Epicentre, USA) 120 hours after transfection, the region around the editing site was amplified by PCR with Taq DNA polymerase (Kangji, China), and the PCR products were sequenced. The primers used for PCR amplification are P4 and P5 in Table 6.

TABLE 1 Pre-spacer sequences of the gRNA plasmids for the RNF2 genomic editing site

gRNA	Pre-spacer sequence of target gene (5'-3')
gRNA1 (targeting RNF2, without mismatch)	GTCATCTTAGTCATTACCTG (sequence 15)
gRNA2 (targeting RNF2, with mismatch)	GTaATCTTAGTCATTACCTG (sequence 16)
gRNA3 (targeting RNF2, with deletion mismatch)	G-CATCTTAGTCATTACCTG (sequence 17)
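The differences between the three pre-spacer (protospacer) sequences in Table 1 can be located programmatically. This is a minimal sketch with an illustrative helper name, `describe_mismatches`; it treats the table entries as already aligned, with a lowercase letter marking a substitution and `-` marking a deleted base.

```python
def describe_mismatches(reference, variant):
    """Compare an aligned variant against the reference, position by position.

    Positions are 1-based from the 5' end. A '-' in the variant is reported
    as a deletion of the reference base; any other difference is reported
    as a substitution.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for position, (ref, var) in enumerate(zip(reference, variant.upper()), start=1):
        if var == "-":
            changes.append(f"del{ref}{position}")
        elif var != ref:
            changes.append(f"{ref}{position}{var}")
    return changes


GRNA1 = "GTCATCTTAGTCATTACCTG"  # sequence 15, no mismatch
GRNA2 = "GTaATCTTAGTCATTACCTG"  # sequence 16, one substitution
GRNA3 = "G-CATCTTAGTCATTACCTG"  # sequence 17, one deletion

print(describe_mismatches(GRNA1, GRNA2))  # ['C3A']
print(describe_mismatches(GRNA1, GRNA3))  # ['delT2']
print(GRNA1[5])                           # 'C', the edited base at position 6
```

Both mismatches fall within the three bases at the 5' end of the pre-spacer, while the edited cytosine sits at position 6.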

The percentage at which the C in the genome corresponding to the sixth position of sequence 15 in the sequence table was mutated to T, i.e., the C-T editing efficiency, was determined, and the results are shown in Table 2. The results show that: with the mismatch-free gRNA1, the editing efficiencies of BE4max and BE4maxM did not differ significantly, those of BE4max and BE4maxM2 did not differ significantly, and those of hyA3A-BE4max and hyA3A-BE4maxM did not differ significantly; with the mismatch-containing gRNA2 and gRNA3, BE4maxM and BE4maxM2 each showed significantly lower editing efficiency than BE4max, and hyA3A-BE4maxM showed significantly lower editing efficiency than hyA3A-BE4max.

TABLE 2 C-T editing efficiency of different gRNAs at the RNF2 site (C6, the cytosine at the sixth position from the 5' end) with different cytosine base editors

Example 2: preparation and application of low off-target ABE genome editor

In this example, NG-ABEmaxM and ABE8eM were constructed by mutating Y (tyrosine) 1010, Y (tyrosine) 1013, Y (tyrosine) 1016, V (valine) 1018, R (arginine) 1019, Q (glutamine) 1027 and K (lysine) 1031 of the commonly used ABE base editors NG-ABEmax (Addgene: 124163) and ABE8e (Addgene: 138489), respectively, to D (aspartic acid). The protein obtained by mutating NG-nCas9 is NG-nCas9-M, and the sequences of NG-nCas9 and NG-nCas9-M are sequence 6 and sequence 7 in the sequence table, respectively; the protein obtained by mutating nCas9 is nCas9-M, and the sequences of nCas9 and nCas9-M are sequence 2 and sequence 3 in the sequence table, respectively.

In mammalian cells, genomic ABCA3 locus was base edited with NG-ABEmax, NG-ABEmaxM, ABE8e, ABE8eM, respectively, using mismatched and non-mismatched gRNA, and as a result found: when the gRNA without mismatch is used at the editing site, the editing efficiency of NG-ABEmax and NG-ABEmaxM is not greatly different, and the editing efficiency of ABE8e and ABE8eM is not greatly different; when mismatched gRNA is used at the editing site, the editing efficiency of NG-ABEmaxM is significantly lower than NG-ABEmax, and the editing efficiency of ABE8eM is significantly lower than ABE8e.

The experimental procedure was as follows:

Preparation of plasmids:

Using the NG-ABEmax plasmid (Addgene: 124163) as the template, point mutations were introduced into the NG-nCas9 gene of the plasmid by the primer-embedding method with the primer pair P1 and P2 in Table 6; the resulting mutant plasmid is NG-ABEmaxM.

The NG-ABEmax plasmid contains a NG-nCas9 gene shown in a sequence 5 in a sequence table, and the NG-ABEmax can express the NG-nCas9 shown in a sequence 6; the NG-ABEmaxM is a mutant plasmid obtained by mutating the 3028-3093 th position of the NG-nCas9 gene (sequence 5) in the NG-ABEmax plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 12), the mutated gene of the NG-nCas9 gene is named as the NG-nCas9-M gene, and the NG-ABEmaxM contains the NG-nCas9-M gene and can express the NG-nCas9-M shown in the sequence 7.

Using the ABE8e plasmid (Addgene: 138489) as the template, point mutations were introduced into the nCas9 gene of the plasmid by the primer-embedding method with the primer pair P1 and P2 in Table 6; the resulting mutant plasmid is ABE8eM.

The ABE8e plasmid contains an nCas9 gene shown in a sequence 1 in a sequence table, and the ABE8e can express nCas9 shown in a sequence 2; ABE8eM is a mutant plasmid obtained by replacing the nCas9 gene (sequence 1) in ABE8e plasmid with the nCas9-M gene, and ABE8eM can express nCas9-M shown in sequence 3.

Using the pGL3-U6-sgRNA-PGK-puromycin plasmid (Addgene: 51133) as the template, pre-spacer sequences were inserted by the primer-embedding method to obtain recombinant plasmids expressing gRNAs that target the genomic ABCA3 site (gRNA plasmids for short): a gRNA plasmid without mismatch (gRNA4), a gRNA plasmid with a mismatch (gRNA5) and a gRNA plasmid with a deletion mismatch (gRNA6).

Primers used for the mismatch free gRNA plasmid (gRNA 4) are P10 and P14 in Table 6; the primers used for the mismatched gRNA plasmid (gRNA 5) are P10 and P15 in Table 6; the primers used for the deletion mismatch containing gRNA plasmid (gRNA 6) are P10 and P16 in Table 6.

The mismatch-free gRNA plasmid (gRNA4) is the plasmid obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA4; the mismatched gRNA plasmid (gRNA5) is the plasmid obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA5; the deletion-mismatch gRNA plasmid (gRNA6) is the plasmid obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA6.

Plasmid transfection: HEK293T cells (ATCC, CRL-3216) were plated in 24-well plates at 5 × 10⁵ cells. When the cells in each well had grown to 40%-60% confluence, each base editor plasmid was co-transfected into the HEK293T cells with a mismatch-free or mismatch-containing gRNA plasmid targeting the ABCA3 site (the pre-spacer sequences of the gRNA plasmids are shown in Table 3), using 600 ng of base editor plasmid and 300 ng of gRNA plasmid with Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Each plasmid combination was transfected in 3 replicates, and 5 μg/ml puromycin (Merck, USA) was added to the medium 24 hours after transfection. Genomic DNA was extracted with a rapid DNA extraction solution (Epicentre, USA) 120 hours after transfection, the region around the editing site was amplified by PCR with Taq DNA polymerase (Kangji, China), and the PCR products were sequenced. The primers used for PCR amplification are P6 and P7 in Table 6.

TABLE 3 Pre-spacer sequences of the gRNA plasmids for the ABCA3 genomic editing site

gRNA	Pre-spacer sequence of target gene (5'-3')
gRNA4 (targeting ABCA3, without mismatch)	GAAGAGCAGGGTCATGAAGG (sequence 18)
gRNA5 (targeting ABCA3, with mismatch)	GAtGAGCAGGGTCATGAAGG (sequence 19)
gRNA6 (targeting ABCA3, with deletion mismatch)	G-AGAGCAGGGTCATGAAGG (sequence 20)

The percentage at which the A in the genome corresponding to the fifth position of sequence 18 in the sequence table was mutated to G, i.e., the A-G editing efficiency, was determined, and the results are shown in Table 4. The results show that: with the mismatch-free gRNA4, the editing efficiencies of NG-ABEmax and NG-ABEmaxM did not differ significantly, and those of ABE8e and ABE8eM did not differ significantly; with the mismatch-containing gRNA5 and gRNA6, NG-ABEmaxM showed significantly lower editing efficiency than NG-ABEmax, and ABE8eM showed significantly lower editing efficiency than ABE8e.

TABLE 4 A-G editing efficiency of different gRNAs at the ABCA3 site (A5, the adenine at the fifth position from the 5' end) with different adenine base editors

gRNA	NG-ABEmax	NG-ABEmaxM	ABE8e	ABE8eM
gRNA4	53.4±4.5%	46.8±4.3%	85.4±6.2%	78.6±6.5%
gRNA5	50.3±3.4%	12.1±2.3%	78.2±1.3%	15.9±3.7%
gRNA6	48.2±7.8%	10.8±5.6%	70.9±5.6%	19.8±7.2%
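One way to read Table 4 is to express editing with a mismatched gRNA as a fraction of editing with the matched gRNA4 for the same editor. The sketch below is an illustrative summary, not an analysis from the application: it uses only the mean values from the table, ignores the reported deviations, and the name `retained_fraction` is hypothetical.

```python
# Mean A-to-G editing efficiencies (%) copied from Table 4.
TABLE_4 = {
    "NG-ABEmax": {"gRNA4": 53.4, "gRNA5": 50.3, "gRNA6": 48.2},
    "NG-ABEmaxM": {"gRNA4": 46.8, "gRNA5": 12.1, "gRNA6": 10.8},
    "ABE8e": {"gRNA4": 85.4, "gRNA5": 78.2, "gRNA6": 70.9},
    "ABE8eM": {"gRNA4": 78.6, "gRNA5": 15.9, "gRNA6": 19.8},
}


def retained_fraction(editor, grna, matched="gRNA4"):
    """Editing with a mismatched gRNA divided by editing with the matched gRNA."""
    return TABLE_4[editor][grna] / TABLE_4[editor][matched]


for editor in TABLE_4:
    for grna in ("gRNA5", "gRNA6"):
        print(f"{editor:<11} {grna}: {retained_fraction(editor, grna):.2f}")
```

A value close to 1 means the editor tolerates the mismatch almost fully; the mutant editors retain a much smaller fraction than their parents while their matched-gRNA efficiency stays close to the parent's.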

Example 3: preparation and application of low off-target GBE genome editor

In this example, APOBEC_nCas9_UngM2 was constructed by mutating Y (tyrosine) 1010, Y (tyrosine) 1013, Y (tyrosine) 1016, V (valine) 1018, R (arginine) 1019, Q (glutamine) 1027 and K (lysine) 1031 of the commonly used GBE base editor APOBEC_nCas9_Ung (MolecularCloud: MC_0101154) to D (aspartic acid); the protein obtained by mutating nCas9 is nCas9-M, and the amino acid sequences of nCas9 and nCas9-M are sequence 2 and sequence 3 in the sequence table, respectively. APOBEC_nCas9_UngM3 was constructed by mutating Y (tyrosine) 1010, Y (tyrosine) 1013, Y (tyrosine) 1016, V (valine) 1018, R (arginine) 1019, Q (glutamine) 1027 and K (lysine) 1031 in APOBEC_nCas9_Ung to D (aspartic acid) and K (lysine) 1014 to P (proline); the protein obtained by mutating nCas9 is nCas9-M3, and the amino acid sequences of nCas9 and nCas9-M3 are sequence 2 and sequence 8 in the sequence table, respectively.

In mammalian cells, the genomic RNF2 site was base-edited with APOBEC_nCas9_Ung, APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3, using gRNAs with and without mismatches. The results were as follows: with the mismatch-free gRNA, the editing efficiency of APOBEC_nCas9_Ung did not differ markedly from that of APOBEC_nCas9_UngM2 or APOBEC_nCas9_UngM3; with the mismatched gRNAs, the editing efficiencies of both APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3 were significantly lower than that of APOBEC_nCas9_Ung.

The experimental procedure was as follows:

Preparation of plasmids:

Using the APOBEC_nCas9_Ung plasmid (MolecularCloud: MC_0101154) as the template, point mutations were introduced into the nCas9 gene of the plasmid by the primer-embedding method with the primer pair P1 and P8 in Table 6; the resulting mutant plasmid is APOBEC_nCas9_UngM2. Using the APOBEC_nCas9_Ung plasmid as the template, point mutations were introduced into the nCas9 gene of the plasmid by the primer-embedding method with the primer pair P1 and P9 in Table 6; the resulting mutant plasmid is APOBEC_nCas9_UngM3.

The APOBEC_nCas9_Ung plasmid contains the nCas9' gene shown in sequence 9 in the sequence table and can express the nCas9 shown in sequence 2. APOBEC_nCas9_UngM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the nCas9' gene (sequence 9) in the APOBEC_nCas9_Ung plasmid from TACGGGGACTACAAGGTTTACGATGTGCGCAAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 13); the mutated nCas9' gene is designated the nCas9-M' gene, and APOBEC_nCas9_UngM2 can express the nCas9-M shown in sequence 3. APOBEC_nCas9_UngM3 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the nCas9' gene (sequence 9) in the APOBEC_nCas9_Ung plasmid from TACGGGGACTACAAGGTTTACGATGTGCGCAAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAG to GACGGCGACGACCCCGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 14); the mutated nCas9' gene is designated the nCas9-M3 gene, and APOBEC_nCas9_UngM3 contains the nCas9-M3 gene and can express the nCas9-M3 shown in sequence 8.
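A quick consistency check is whether a primer pair actually carries the whole 66-nucleotide replacement window. The sketch below assumes that the forward primer P1 encodes the 3' part of the window and that the reverse primers P8 and P9 encode the 5' part on the opposite strand; that reading is inferred from the sequences in Table 6, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def longest_prefix_in(window, sequence):
    """Length of the longest prefix of window that occurs in sequence."""
    for length in range(len(window), 0, -1):
        if window[:length] in sequence:
            return length
    return 0


def longest_suffix_in(window, sequence):
    """Length of the longest suffix of window that occurs in sequence."""
    for length in range(len(window), 0, -1):
        if window[-length:] in sequence:
            return length
    return 0


def primers_encode_window(window, forward, reverse):
    """True if the two primers together cover every base of the window."""
    head = longest_prefix_in(window, reverse_complement(reverse))
    tail = longest_suffix_in(window, forward)
    return head + tail >= len(window)


SEQUENCE_13 = "GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"
SEQUENCE_14 = "GACGGCGACGACCCCGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"
P1 = "CCACGTCTCAGATGATCGCCAAGAGCGAGGACGAAATCGGCGACGCTACCGCCAAGTACTTCT"
P8 = "CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGGACTCCAGC"
P9 = "CCACGTCTCACATCTTGTCGTCGTCGTCCACGGGGTCGTCGCCGTCCACGAACTCGGACTCCAGC"

print(primers_encode_window(SEQUENCE_13, P1, P8))  # pair used for UngM2
print(primers_encode_window(SEQUENCE_14, P1, P9))  # pair used for UngM3
print(primers_encode_window(SEQUENCE_14, P1, P8))  # mismatched pair, as a control
```

The control pairing is included because P8 and P9 differ only at the codon for residue 1014, so swapping them should leave part of sequence 14 uncovered.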

Recombinant plasmids expressing gRNAs that target the genomic RNF2 site (abbreviated as gRNA plasmids): the gRNA plasmid without mismatch (gRNA1 of Example 1), the gRNA plasmid with a mismatch (gRNA2 of Example 1), and the gRNA plasmid with a deletion mismatch (gRNA3 of Example 1).
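How a matched, a mismatched and a deletion gRNA differ can be illustrated by comparing spacer sequences. The sketch below rests on assumptions that this section does not state: that primers P11, P12 and P13 of Table 6 encode gRNA1, gRNA2 and gRNA3, that each primer starts with a BsaI site plus overhang (`CCAGGTCTCAACC`), and that the spacer ends where the gRNA scaffold (`GTTTTAGAGC`) begins. The helper names are illustrative.

```python
# Primer sequences copied from Table 6.
P11 = "CCAGGTCTCAACCGTCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC"
P12 = "CCAGGTCTCAACCGTAATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC"
P13 = "CCAGGTCTCAACCGCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC"

PREFIX = "CCAGGTCTCAACC"         # assumed: BsaI site (GGTCTC) plus cloning overhang
SCAFFOLD_START = "GTTTTAGAGC"    # assumed: first bases of the gRNA scaffold


def spacer(primer):
    """Return the bases between the assumed prefix and the scaffold start."""
    if not primer.startswith(PREFIX):
        raise ValueError("primer does not start with the assumed prefix")
    return primer[len(PREFIX):primer.index(SCAFFOLD_START)]


def mismatches(reference, other):
    """List (1-based position, reference base, other base) for equal-length spacers."""
    if len(reference) != len(other):
        raise ValueError("spacers differ in length; compare lengths instead")
    return [
        (pos, r, o)
        for pos, (r, o) in enumerate(zip(reference, other), start=1)
        if r != o
    ]


ref = spacer(P11)
print(ref, len(ref))
print(mismatches(ref, spacer(P12)))
print(spacer(P13), len(spacer(P13)))
```

Under these assumptions the P11 spacer is 20 nt long with a cytosine at position 6, the P12 spacer differs from it at a single position, and the P13 spacer is one base shorter.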

The percentage of C-to-G mutation at the cytosine in the genome corresponding to position 6 of sequence 9 in the sequence listing, i.e. the C-G editing efficiency, was determined, and the results are shown in Table 5. The results show that: for gRNA1, which contains no mismatch, there is no significant difference in editing efficiency between APOBEC_nCas9_Ung and APOBEC_nCas9_UngM2 or APOBEC_nCas9_UngM3; for the mismatch-containing gRNA2 and gRNA3, both APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3 show significantly lower editing efficiency than APOBEC_nCas9_Ung.

TABLE 5 C-G editing efficiency of different cytosine base editors with different gRNAs at the RNF2 site (C6, the cytosine at the sixth position from the 5' end)

gRNA	APOBEC_nCas9_Ung	APOBEC_nCas9_UngM	APOBEC_nCas9_UngM2
gRNA1	27.3±3.2%	22.3±4.2%	24.4±5.6%
gRNA2	18.6±2.4%	2.4±1.2%	7.8±4.7%
gRNA2	21.3±2.8%	3.2±2.4%	9.6±1.5%
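One way to read Table 5 is to express each editor's efficiency with a mismatched gRNA as a fraction of its own efficiency with the matched gRNA1. The sketch below is an illustrative calculation that is not part of the application: it uses only the mean values, ignores the standard deviations, and keeps the column and row labels exactly as printed in Table 5.

```python
# Mean C-G editing efficiencies (%) copied from Table 5.
# Column and row labels are reproduced as printed in the table.
EDITORS = ["APOBEC_nCas9_Ung", "APOBEC_nCas9_UngM", "APOBEC_nCas9_UngM2"]
ROWS = [
    ("gRNA1", [27.3, 22.3, 24.4]),  # matched gRNA
    ("gRNA2", [18.6, 2.4, 7.8]),    # second row of the table
    ("gRNA2", [21.3, 3.2, 9.6]),    # third row of the table
]


def retained_fraction(matched, mismatched):
    """Fraction of the matched-gRNA efficiency that remains with a mismatched gRNA."""
    return mismatched / matched


matched_values = ROWS[0][1]
retained = [
    [retained_fraction(m, x) for m, x in zip(matched_values, values)]
    for _, values in ROWS[1:]
]

for (label, _), fractions in zip(ROWS[1:], retained):
    for editor, fraction in zip(EDITORS, fractions):
        print(f"{label}\t{editor}\t{fraction:.2f}")
```

A lower retained fraction means the editor tolerates the mismatch less, which is the behaviour sought in a low off-target editor.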

TABLE 6 Primers

Primer	Sequence
P1	CCACGTCTCAGATGATCGCCAAGAGCGAGGACGAAATCGGCGACGCTACCGCCAAGTACTTCT
P2	CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGCTTTCCAGC
P3	CCACGTCTCACATCTTCCGGTCGTCGTACACCTTGTCGTCGCCGTCCACGAACTCGCTTTCCAGC
P4	ACATTCAGACCATAGCACTTCC
P5	GTCTTCCTTGGTGCCTTATCAG
P6	ACAGCACGGCTACATTTGG
P7	CCAGGAGTTTGAGCAAGATGAG
P8	CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGGACTCCAGC
P9	CCACGTCTCACATCTTGTCGTCGTCGTCCACGGGGTCGTCGCCGTCCACGAACTCGGACTCCAGC
P10	CCAGGTCTCACGGTGTTTCGTCCTTTCCACAAGATATATAAAGC
P11	CCAGGTCTCAACCGTCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P12	CCAGGTCTCAACCGTAATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P13	CCAGGTCTCAACCGCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P14	CCAGGTCTCAACCGAAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P15	CCAGGTCTCAACCGATGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P16	CCAGGTCTCAACCGAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
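The mutagenic primer pairs in Table 6 can be checked against the mutated segments given in the description (sequences 13 and 14). The sketch below is one possible reading rather than the applicant's stated procedure: it assumes that P1 is the forward primer, that P8 and P9 are reverse primers, and that the mutated segment is split between the two primers of a pair with a short overlap. The function names are illustrative. The `CGTCTC` motif at the 5' end of P1, P8 and P9 is the recognition sequence of the type IIS enzyme BsmBI (Esp3I); its use for assembly is an assumption here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Primer sequences copied from Table 6.
P1 = "CCACGTCTCAGATGATCGCCAAGAGCGAGGACGAAATCGGCGACGCTACCGCCAAGTACTTCT"
P8 = "CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGGACTCCAGC"
P9 = "CCACGTCTCACATCTTGTCGTCGTCGTCCACGGGGTCGTCGCCGTCCACGAACTCGGACTCCAGC"

# Mutated segments copied from the description.
SEQ13 = "GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"
SEQ14 = "GACGGCGACGACCCCGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"


def primer_pair_covers(mutant, forward, reverse):
    """True if the mutant segment can be split so that its 5' part lies in the
    reverse primer (read on the top strand) and its 3' part in the forward primer."""
    top = revcomp(reverse)
    return any(
        mutant[:cut] in top and mutant[cut:] in forward
        for cut in range(1, len(mutant))
    )


print(primer_pair_covers(SEQ13, P1, P8))
print(primer_pair_covers(SEQ14, P1, P9))
print(primer_pair_covers(SEQ14, P1, P8))
print(all("CGTCTC" in p for p in (P1, P8, P9)))
```

Under these assumptions, the pair P1 and P8 carries the whole of sequence 13 and the pair P1 and P9 carries the whole of sequence 14, which matches the plasmids the description assigns to each pair.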

The present application has been described in detail above. It will be apparent to those skilled in the art that the application can be practiced over a wide range of equivalent parameters, concentrations and conditions without departing from its spirit and scope and without undue experimentation. While the application has been described with respect to specific embodiments, it will be appreciated that it may be further modified. In general, this application is intended to cover any variations, uses or adaptations that follow its principles, including departures from the present disclosure that come within known or customary practice in the art to which the application pertains. Some of the essential features may be applied within the scope of the claims that follow.

Sequence listing

<110> Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences

<120> a low off-target base editor and construction thereof

<160> 20

<170>PatentIn version 3.5

<210> 1

<211> 4104

<212> DNA

<213> Artificial sequence

<400> 1

atggacaagaagtacagcatcggcctggccatcggcaccaactctgtgggctgggccgtg 60

atcaccgacgagtacaaggtgcccagcaagaaattcaaggtgctgggcaacaccgaccgg 120

cacagcatcaagaagaacctgatcggagccctgctgttcgacagcggcgaaacagccgag 180

gccacccggctgaagagaaccgccagaagaagatacaccagacggaagaaccggatctgc 240

tatctgcaagagatcttcagcaacgagatggccaaggtggacgacagcttcttccacaga 300

ctggaagagtccttcctggtggaagaggataagaagcacgagcggcaccccatcttcggc 360

aacatcgtggacgaggtggcctaccacgagaagtaccccaccatctaccacctgagaaag 420

aaactggtggacagcaccgacaaggccgacctgcggctgatctatctggccctggcccac 480

atgatcaagttccggggccacttcctgatcgagggcgacctgaaccccgacaacagcgac 540

gtggacaagctgttcatccagctggtgcagacctacaaccagctgttcgaggaaaacccc 600

atcaacgccagcggcgtggacgccaaggccatcctgtctgccagactgagcaagagcaga 660

cggctggaaaatctgatcgcccagctgcccggcgagaagaagaatggcctgttcggaaac 720

ctgattgccctgagcctgggcctgacccccaacttcaagagcaacttcgacctggccgag 780

gatgccaaactgcagctgagcaaggacacctacgacgacgacctggacaacctgctggcc 840

cagatcggcgaccagtacgccgacctgtttctggccgccaagaacctgtccgacgccatc 900

ctgctgagcgacatcctgagagtgaacaccgagatcaccaaggcccccctgagcgcctct 960

atgatcaagagatacgacgagcaccaccaggacctgaccctgctgaaagctctcgtgcgg 1020

cagcagctgcctgagaagtacaaagagattttcttcgaccagagcaagaacggctacgcc 1080

ggctacattgacggcggagccagccaggaagagttctacaagttcatcaagcccatcctg 1140

gaaaagatggacggcaccgaggaactgctcgtgaagctgaacagagaggacctgctgcgg 1200

aagcagcggaccttcgacaacggcagcatcccccaccagatccacctgggagagctgcac 1260

gccattctgcggcggcaggaagatttttacccattcctgaaggacaaccgggaaaagatc 1320

gagaagatcctgaccttccgcatcccctactacgtgggccctctggccaggggaaacagc 1380

agattcgcctggatgaccagaaagagcgaggaaaccatcaccccctggaacttcgaggaa 1440

gtggtggacaagggcgcttccgcccagagcttcatcgagcggatgaccaacttcgataag 1500

aacctgcccaacgagaaggtgctgcccaagcacagcctgctgtacgagtacttcaccgtg 1560

tataacgagctgaccaaagtgaaatacgtgaccgagggaatgagaaagcccgccttcctg 1620

agcggcgagcagaaaaaggccatcgtggacctgctgttcaagaccaaccggaaagtgacc 1680

gtgaagcagctgaaagaggactacttcaagaaaatcgagtgcttcgactccgtggaaatc 1740

tccggcgtggaagatcggttcaacgcctccctgggcacataccacgatctgctgaaaatt 1800

atcaaggacaaggacttcctggacaatgaggaaaacgaggacattctggaagatatcgtg 1860

ctgaccctgacactgtttgaggacagagagatgatcgaggaacggctgaaaacctatgcc 1920

cacctgttcgacgacaaagtgatgaagcagctgaagcggcggagatacaccggctggggc 1980

aggctgagccggaagctgatcaacggcatccgggacaagcagtccggcaagacaatcctg 2040

gatttcctgaagtccgacggcttcgccaacagaaacttcatgcagctgatccacgacgac 2100

agcctgacctttaaagaggacatccagaaagcccaggtgtccggccagggcgatagcctg 2160

cacgagcacattgccaatctggccggcagccccgccattaagaagggcatcctgcagaca 2220

gtgaaggtggtggacgagctcgtgaaagtgatgggccggcacaagcccgagaacatcgtg 2280

atcgaaatggccagagagaaccagaccacccagaagggacagaagaacagccgcgagaga 2340

atgaagcggatcgaagagggcatcaaagagctgggcagccagatcctgaaagaacacccc 2400

gtggaaaacacccagctgcagaacgagaagctgtacctgtactacctgcagaatgggcgg 2460

gatatgtacgtggaccaggaactggacatcaaccggctgtccgactacgatgtggaccat 2520

atcgtgcctcagagctttctgaaggacgactccatcgacaacaaggtgctgaccagaagc 2580

gacaagaaccggggcaagagcgacaacgtgccctccgaagaggtcgtgaagaagatgaag 2640

aactactggcggcagctgctgaacgccaagctgattacccagagaaagttcgacaatctg 2700

accaaggccgagagaggcggcctgagcgaactggataaggccggcttcatcaagagacag 2760

ctggtggaaacccggcagatcacaaagcacgtggcacagatcctggactcccggatgaac 2820

actaagtacgacgagaatgacaagctgatccgggaagtgaaagtgatcaccctgaagtcc 2880

aagctggtgtccgatttccggaaggatttccagttttacaaagtgcgcgagatcaacaac 2940

taccaccacgcccacgacgcctacctgaacgccgtcgtgggaaccgccctgatcaaaaag 3000

taccctaagctggaaagcgagttcgtgtacggcgactacaaggtgtacgacgtgcggaag 3060

atgatcgccaagagcgagcaggaaatcggcaaggctaccgccaagtacttcttctacagc 3120

aacatcatgaactttttcaagaccgagattaccctggccaacggcgagatccggaagcgg 3180

cctctgatcgagacaaacggcgaaaccggggagatcgtgtgggataagggccgggatttt 3240

gccaccgtgcggaaagtgctgagcatgccccaagtgaatatcgtgaaaaagaccgaggtg 3300

cagacaggcggcttcagcaaagagtctatcctgcccaagaggaacagcgataagctgatc 3360

gccagaaagaaggactgggaccctaagaagtacggcggcttcgacagccccaccgtggcc 3420

tattctgtgctggtggtggccaaagtggaaaagggcaagtccaagaaactgaagagtgtg 3480

aaagagctgctggggatcaccatcatggaaagaagcagcttcgagaagaatcccatcgac 3540

tttctggaagccaagggctacaaagaagtgaaaaaggacctgatcatcaagctgcctaag 3600

tactccctgttcgagctggaaaacggccggaagagaatgctggcctctgccggcgaactg 3660

cagaagggaaacgaactggccctgccctccaaatatgtgaacttcctgtacctggccagc 3720

cactatgagaagctgaagggctcccccgaggataatgagcagaaacagctgtttgtggaa 3780

cagcacaagcactacctggacgagatcatcgagcagatcagcgagttctccaagagagtg 3840

atcctggccgacgctaatctggacaaagtgctgtccgcctacaacaagcaccgggataag 3900

cccatcagagagcaggccgagaatatcatccacctgtttaccctgaccaatctgggagcc 3960

cctgccgccttcaagtactttgacaccaccatcgaccggaagaggtacaccagcaccaaa 4020

gaggtgctggacgccaccctgatccaccagagcatcaccggcctgtacgagacacggatc 4080

gacctgtctcagctgggaggtgac 4104

<210> 2

<211> 1368

<212> PRT

<213> artificial sequence

<400> 2

Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly

1205 1210 1215

Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

<210> 3

<211> 1368

<212> PRT

<213> artificial sequence

<400> 3

Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Asp Gly Asp Asp Lys Val Asp Asp Asp Asp Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly

1205 1210 1215

Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

<210> 4

<211> 1368

<212> PRT

<213> artificial sequence

<400> 4

Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Asp Gly Asp Asp Lys Val Tyr Asp Asp Arg Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly

1205 1210 1215

Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

<210> 5

<211> 4104

<212> DNA

<213> artificial sequence

<400> 5

atggacaagaagtacagcatcggcctggccatcggcaccaactctgtgggctgggccgtg 60

atcaccgacgagtacaaggtgcccagcaagaaattcaaggtgctgggcaacaccgaccgg 120

cacagcatcaagaagaacctgatcggagccctgctgttcgacagcggcgaaacagccgag 180

gccacccggctgaagagaaccgccagaagaagatacaccagacggaagaaccggatctgc 240

tatctgcaagagatcttcagcaacgagatggccaaggtggacgacagcttcttccacaga 300

ctggaagagtccttcctggtggaagaggataagaagcacgagcggcaccccatcttcggc 360

aacatcgtggacgaggtggcctaccacgagaagtaccccaccatctaccacctgagaaag 420

aaactggtggacagcaccgacaaggccgacctgcggctgatctatctggccctggcccac 480

atgatcaagttccggggccacttcctgatcgagggcgacctgaaccccgacaacagcgac 540

gtggacaagctgttcatccagctggtgcagacctacaaccagctgttcgaggaaaacccc 600

atcaacgccagcggcgtggacgccaaggccatcctgtctgccagactgagcaagagcaga 660

cggctggaaaatctgatcgcccagctgcccggcgagaagaagaatggcctgttcggaaac 720

ctgattgccctgagcctgggcctgacccccaacttcaagagcaacttcgacctggccgag 780

gatgccaaactgcagctgagcaaggacacctacgacgacgacctggacaacctgctggcc 840

cagatcggcgaccagtacgccgacctgtttctggccgccaagaacctgtccgacgccatc 900

ctgctgagcgacatcctgagagtgaacaccgagatcaccaaggcccccctgagcgcctct 960

atgatcaagagatacgacgagcaccaccaggacctgaccctgctgaaagctctcgtgcgg 1020

cagcagctgcctgagaagtacaaagagattttcttcgaccagagcaagaacggctacgcc 1080

ggctacattgacggcggagccagccaggaagagttctacaagttcatcaagcccatcctg 1140

gaaaagatggacggcaccgaggaactgctcgtgaagctgaacagagaggacctgctgcgg 1200

aagcagcggaccttcgacaacggcagcatcccccaccagatccacctgggagagctgcac 1260

gccattctgcggcggcaggaagatttttacccattcctgaaggacaaccgggaaaagatc 1320

gagaagatcctgaccttccgcatcccctactacgtgggccctctggccaggggaaacagc 1380

agattcgcctggatgaccagaaagagcgaggaaaccatcaccccctggaacttcgaggaa 1440

gtggtggacaagggcgcttccgcccagagcttcatcgagcggatgaccaacttcgataag 1500

aacctgcccaacgagaaggtgctgcccaagcacagcctgctgtacgagtacttcaccgtg 1560

tataacgagctgaccaaagtgaaatacgtgaccgagggaatgagaaagcccgccttcctg 1620

agcggcgagcagaaaaaggccatcgtggacctgctgttcaagaccaaccggaaagtgacc 1680

gtgaagcagctgaaagaggactacttcaagaaaatcgagtgcttcgactccgtggaaatc 1740

tccggcgtggaagatcggttcaacgcctccctgggcacataccacgatctgctgaaaatt 1800

atcaaggacaaggacttcctggacaatgaggaaaacgaggacattctggaagatatcgtg 1860

ctgaccctgacactgtttgaggacagagagatgatcgaggaacggctgaaaacctatgcc 1920

cacctgttcgacgacaaagtgatgaagcagctgaagcggcggagatacaccggctggggc 1980

aggctgagccggaagctgatcaacggcatccgggacaagcagtccggcaagacaatcctg 2040

gatttcctgaagtccgacggcttcgccaacagaaacttcatgcagctgatccacgacgac 2100

agcctgacctttaaagaggacatccagaaagcccaggtgtccggccagggcgatagcctg 2160

cacgagcacattgccaatctggccggcagccccgccattaagaagggcatcctgcagaca 2220

gtgaaggtggtggacgagctcgtgaaagtgatgggccggcacaagcccgagaacatcgtg 2280

atcgaaatggccagagagaaccagaccacccagaagggacagaagaacagccgcgagaga 2340

atgaagcggatcgaagagggcatcaaagagctgggcagccagatcctgaaagaacacccc 2400

gtggaaaacacccagctgcagaacgagaagctgtacctgtactacctgcagaatgggcgg 2460

gatatgtacgtggaccaggaactggacatcaaccggctgtccgactacgatgtggaccat 2520

atcgtgcctcagagctttctgaaggacgactccatcgacaacaaggtgctgaccagaagc 2580

gacaagaaccggggcaagagcgacaacgtgccctccgaagaggtcgtgaagaagatgaag 2640

aactactggcggcagctgctgaacgccaagctgattacccagagaaagttcgacaatctg 2700

accaaggccgagagaggcggcctgagcgaactggataaggccggcttcatcaagagacag 2760

ctggtggaaacccggcagatcacaaagcacgtggcacagatcctggactcccggatgaac 2820

actaagtacgacgagaatgacaagctgatccgggaagtgaaagtgatcaccctgaagtcc 2880

aagctggtgtccgatttccggaaggatttccagttttacaaagtgcgcgagatcaacaac 2940

taccaccacgcccacgacgcctacctgaacgccgtcgtgggaaccgccctgatcaaaaag 3000

taccctaagctggaaagcgagttcgtgtacggcgactacaaggtgtacgacgtgcggaag 3060

atgatcgccaagagcgagcaggaaatcggcaaggctaccgccaagtacttcttctacagc 3120

aacatcatgaactttttcaagaccgagattaccctggccaacggcgagatccggaagcgg 3180

cctctgatcgagacaaacggcgaaaccggggagatcgtgtgggataagggccgggatttt 3240

gccaccgtgcggaaagtgctgagcatgccccaagtgaatatcgtgaaaaagaccgaggtg 3300

cagacaggcggcttcagcaaagagtctatcaggcccaagaggaacagcgataagctgatc 3360

gccagaaagaaggactgggaccctaagaagtacggcggcttcgtcagccccaccgtggcc 3420

tattctgtgctggtggtggccaaagtggaaaagggcaagtccaagaaactgaagagtgtg 3480

aaagagctgctggggatcaccatcatggaaagaagcagcttcgagaagaatcccatcgac 3540

tttctggaagccaagggctacaaagaagtgaaaaaggacctgatcatcaagctgcctaag 3600

tactccctgttcgagctggaaaacggccggaagagaatgctggcctctgccagattcctg 3660

cagaagggaaacgaactggccctgccctccaaatatgtgaacttcctgtacctggccagc 3720

cactatgagaagctgaagggctcccccgaggataatgagcagaaacagctgtttgtggaa 3780

cagcacaagcactacctggacgagatcatcgagcagatcagcgagttctccaagagagtg 3840

atcctggccgacgctaatctggacaaagtgctgtccgcctacaacaagcaccgggataag 3900

cccatcagagagcaggccgagaatatcatccacctgtttaccctgaccaatctgggagcc 3960

cctagggccttcaagtactttgacaccaccatcgaccggaaggtgtacaggagcaccaaa 4020

gaggtgctggacgccaccctgatccaccagagcatcaccggcctgtacgagacacggatc 4080

gacctgtctcagctgggaggtgac 4104

<210> 6

<211> 1368

<212> PRT

<213> artificial sequence

<400> 6

Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Val Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg

1205 1210 1215

Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Val Tyr Arg Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

<210> 7

<211> 1368

<212> PRT

<213> artificial sequence

<400> 7

Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Asp Gly Asp Asp Lys Val Asp Asp Asp Asp Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Val Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg

1205 1210 1215

Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Val Tyr Arg Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

<210> 8

<211> 1368

<212> PRT

<213> artificial sequence

<400> 8

Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Asp Gly Asp Asp Pro Val Asp Asp Asp Asp Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly

1205 1210 1215

Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

<210> 9

<211> 4104

<212> DNA

<213> artificial sequence

<400> 9

atggacaagaagtactcgatcggcctcgccatcgggacgaactcagttggctgggccgtg 60

atcaccgacgagtacaaggtgccctctaagaagttcaaggtcctggggaacaccgaccgc 120

cattccatcaagaagaacctcatcggcgctctcctgttcgacagcggggagaccgctgag 180

gctacgaggctcaagagaaccgctaggcgccggtacacgagaaggaagaacaggatctgc 240

tacctccaagagattttctccaacgagatggccaaggttgacgattcattcttccaccgc 300

ctggaggagtctttcctcgtggaggaggataagaagcacgagcggcatcccatcttcggc 360

aacatcgtggacgaggttgcctaccacgagaagtaccctacgatctaccatctgcggaag 420

aagctcgtggactccaccgataaggcggacctcagactgatctacctcgctctggcccac 480

atgatcaagttccgcggccatttcctgatcgagggggatctcaacccagacaacagcgat 540

gttgacaagctgttcatccaactcgtgcagacctacaaccaactcttcgaggagaacccg 600

atcaacgcctctggcgtggacgcgaaggctatcctgtccgcgaggctctcgaagtccagg 660

aggctggagaacctgatcgctcagctcccaggcgagaagaagaacggcctgttcgggaac 720

ctcatcgctctcagcctggggctcaccccgaacttcaagtcgaacttcgatctcgctgag 780

gacgccaagctgcaactctccaaggacacctacgacgatgacctcgataacctcctggcc 840

cagatcggcgatcaatacgcggacctgttcctcgctgccaagaacctgtcggacgccatc 900

ctcctgtcagatatcctccgcgtgaacaccgagatcacgaaggctccactctctgcctcc 960

atgatcaagcgctacgacgagcaccatcaggatctgaccctcctgaaggcgctggtccgc 1020

caacagctcccggagaagtacaaggagattttcttcgatcagtcgaagaacggctacgct 1080

gggtacatcgacggcggggcctcacaagaggagttctacaagttcatcaagccaatcctg 1140

gagaagatggacggcacggaggagctcctggtgaagctcaacagggaggacctcctgcgg 1200

aagcagagaaccttcgataacggcagcatcccccaccaaatccatctcggggagctgcac 1260

gccatcctgagaaggcaagaggacttctaccctttcctcaaggataaccgggagaagatc 1320

gagaagatcctgaccttcagaatcccatactacgtcggccctctcgcgcgggggaactca 1380

agattcgcttggatgacccgcaagtctgaggagaccatcacgccgtggaacttcgaggag 1440

gtggtggacaagggcgctagcgctcagtcgttcatcgagaggatgaccaacttcgacaag 1500

aacctgcccaacgagaaggtgctccctaagcactcgctcctgtacgagtacttcaccgtc 1560

tacaacgagctcacgaaggtgaagtacgtcaccgagggcatgcgcaagccagcgttcctg 1620

tccggggagcagaagaaggctatcgtggacctcctgttcaagaccaaccggaaggtcacg 1680

gttaagcaactcaaggaggactacttcaagaagatcgagtgcttcgattcggtcgagatc 1740

agcggcgttgaggaccgcttcaacgccagcctcgggacctaccacgatctcctgaagatc 1800

atcaaggataaggacttcctggacaacgaggagaacgaggatatcctggaggacatcgtg 1860

ctgaccctcacgctgttcgaggacagggagatgatcgaggagcgcctgaagacgtacgcc 1920

catctcttcgatgacaaggtcatgaagcaactcaagcgccggagatacaccggctggggg 1980

aggctgtcccgcaagctcatcaacggcatccgggacaagcagtccgggaagaccatcctc 2040

gacttcctcaagagcgatggcttcgccaacaggaacttcatgcaactgatccacgatgac 2100

agcctcaccttcaaggaggatatccaaaaggctcaagtgagcggccagggggactcgctg 2160

cacgagcatatcgcgaacctcgctggctcccccgcgatcaagaagggcatcctccagacc 2220

gtgaaggttgtggacgagctcgtgaaggtcatgggccggcacaagcctgagaacatcgtc 2280

atcgagatggccagagagaaccaaaccacgcagaaggggcaaaagaactctagggagcgc 2340

atgaagcgcatcgaggagggcatcaaggagctggggtcccaaatcctcaaggagcaccca 2400

gtggagaacacccaactgcagaacgagaagctctacctgtactacctccagaacggcagg 2460

gatatgtacgtggaccaagagctggatatcaaccgcctcagcgattacgacgtcgatcat 2520

atcgttccccagtctttcctgaaggatgactccatcgacaacaaggtcctcaccaggtcg 2580

gacaagaaccgcggcaagtcagataacgttccatctgaggaggtcgttaagaagatgaag 2640

aactactggaggcagctcctgaacgccaagctgatcacgcaaaggaagttcgacaacctc 2700

accaaggctgagagaggcgggctctcagagctggacaaggccggcttcatcaagcggcag 2760

ctggtcgagaccagacaaatcacgaagcacgttgcgcaaatcctcgactctcggatgaac 2820

acgaagtacgatgagaacgacaagctgatcagggaggttaaggtgatcaccctgaagtct 2880

aagctcgtctccgacttcaggaaggatttccagttctacaaggttcgcgagatcaacaac 2940

taccaccatgcccatgacgcttacctcaacgctgtggtcggcaccgctctgatcaagaag 3000

tacccaaagctggagtccgagttcgtgtacggggactacaaggtttacgatgtgcgcaag 3060

atgatcgccaagtcggagcaagagatcggcaaggctaccgccaagtacttcttctactca 3120

aacatcatgaacttcttcaagaccgagatcacgctggccaacggcgagatccggaagaga 3180

ccgctcatcgagaccaacggcgagacgggggagatcgtgtgggacaagggcagggatttc 3240

gcgaccgtccgcaaggttctctccatgccccaggtgaacatcgtcaagaagaccgaggtc 3300

caaacgggcgggttctcaaaggagtctatcctgcctaagcggaacagcgacaagctcatc 3360

gccagaaagaaggactgggacccaaagaagtacggcgggttcgacagccctaccgtggcc 3420

tactcggtcctggttgtggcgaaggttgagaagggcaagtccaagaagctcaagagcgtg 3480

aaggagctcctggggatcaccatcatggagaggtccagcttcgagaagaacccaatcgac 3540

ttcctggaggccaagggctacaaggaggtgaagaaggacctgatcatcaagctcccgaag 3600

tactctctcttcgagctggagaacggcaggaagagaatgctggcttccgctggcgagctc 3660

cagaaggggaacgagctcgcgctgccaagcaagtacgtgaacttcctctacctggcttcc 3720

cactacgagaagctcaagggcagcccggaggacaacgagcaaaagcagctgttcgtcgag 3780

cagcacaagcattacctcgacgagatcatcgagcaaatctccgagttcagcaagcgcgtg 3840

atcctcgccgacgcgaacctggataaggtcctctccgcctacaacaagcaccgggacaag 3900

cccatcagagagcaagcggagaacatcatccatctcttcaccctgacgaacctcggcgct 3960

cctgctgctttcaagtacttcgacaccacgatcgatcggaagagatacacctccacgaag 4020

gaggtcctggacgcgaccctcatccaccagtcgatcaccggcctgtacgagacgaggatc 4080

gacctctcacaactcggcggggat 4104

<210> 10

<211> 66

<212> DNA

<213> artificial sequence

<400> 10

gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60

ggcgac 66

<210> 11

<211> 66

<212> DNA

<213> artificial sequence

<400> 11

gacggcgacgacaaggtgtacgacgaccggaagatgatcgccaagagcgaggacgaaatc 60

ggcgac 66

<210> 12

<211> 66

<212> DNA

<213> artificial sequence

<400> 12

gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60

ggcgac 66

<210> 13

<211> 66

<212> DNA

<213> artificial sequence

<400> 13

gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60

ggcgac 66

<210> 14

<211> 66

<212> DNA

<213> artificial sequence

<400> 14

gacggcgacgaccccgtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60

ggcgac 66
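
As a cross-check on the 66-nt replacement fragments listed above, the following minimal Python sketch translates three of them with the standard genetic code and reports the residue changes relative to a wild-type window. The wild-type fragment is copied from nucleotides 3028 to 3093 of the 4104-nt Cas9 coding sequence that immediately precedes sequence 6 in this listing; using it as the reference for all three fragments is an assumption made for illustration. The helper names (`translate`, `substitutions`) and the `Y1010D`-style labels are likewise illustrative and are not part of the application.

```python
# Standard genetic code (NCBI translation table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitutions(wild_type_dna, mutant_dna, first_residue=1010):
    """List residue changes, numbering the first codon as `first_residue`."""
    wt, mut = translate(wild_type_dna), translate(mutant_dna)
    return [
        f"{w}{first_residue + i}{m}"
        for i, (w, m) in enumerate(zip(wt, mut))
        if w != m
    ]


# Nucleotides 3028-3093 of the coding sequence preceding sequence 6 (assumed reference).
WT_3028_3093 = (
    "tacggcgactacaaggtgtacgacgtgcggaag"
    "atgatcgccaagagcgagcaggaaatcggcaag"
)
SEQ_10 = "gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc" "ggcgac"
SEQ_11 = "gacggcgacgacaaggtgtacgacgaccggaagatgatcgccaagagcgaggacgaaatc" "ggcgac"
SEQ_14 = "gacggcgacgaccccgtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc" "ggcgac"

for name, seq in (("sequence 10", SEQ_10), ("sequence 11", SEQ_11), ("sequence 14", SEQ_14)):
    print(name, translate(seq), substitutions(WT_3028_3093, seq))
```

Under these assumptions, sequence 10 differs from the reference window at seven residues (1010, 1013, 1016, 1018, 1019, 1027 and 1031), sequence 11 at five (1010, 1013, 1018, 1027 and 1031), and sequence 14 at eight, the additional change being lysine to proline at position 1014.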

<210> 15

<211> 20

<212> DNA

<213> artificial sequence

<400> 15

gtcatcttagtcattacctg 20

<210> 16

<211> 20

<212> DNA

<213> artificial sequence

<400> 16

gtaatcttagtcattacctg 20

<210> 17

<211> 19

<212> DNA

<213> artificial sequence

<400> 17

gcatcttagtcattacctg 19

<210> 18

<211> 20

<212> DNA

<213> artificial sequence

<400> 18

gaagagcagggtcatgaagg 20

<210> 19

<211> 20

<212> DNA

<213> artificial sequence

<400> 19

gatgagcagggtcatgaagg 20

<210> 20

<211> 19

<212> DNA

<213> artificial sequence

<400> 20

gagagcagggtcatgaagg 19
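
This part of the listing does not state what role sequences 15 to 20 play. Read purely as strings, they form two groups of three closely related sequences, and the short Python sketch below shows one way to make the differences explicit. The function names are hypothetical, and treating sequences 15 and 18 as the reference of each group is an assumption made only for this comparison.

```python
def mismatches(reference, variant):
    """Return (position, reference_base, variant_base) for equal-length sequences (1-based)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (i, r, v)
        for i, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]


def single_deletions(reference, variant):
    """Return every 1-based position in `reference` whose removal yields `variant`."""
    return [
        i + 1
        for i in range(len(reference))
        if reference[:i] + reference[i + 1:] == variant
    ]


SEQ_15 = "gtcatcttagtcattacctg"
SEQ_16 = "gtaatcttagtcattacctg"
SEQ_17 = "gcatcttagtcattacctg"
SEQ_18 = "gaagagcagggtcatgaagg"
SEQ_19 = "gatgagcagggtcatgaagg"
SEQ_20 = "gagagcagggtcatgaagg"

print(mismatches(SEQ_15, SEQ_16))        # one substitution at position 3
print(single_deletions(SEQ_15, SEQ_17))  # one deleted base
print(mismatches(SEQ_18, SEQ_19))        # one substitution at position 3
print(single_deletions(SEQ_18, SEQ_20))  # one deleted base, position ambiguous within a repeat
```

Sequences 16 and 19 each differ from the assumed reference by one substitution at position 3. Sequences 17 and 20 are one base shorter, consistent with a single-base deletion near the 5' end.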

Claims

1. A base editor comprising or expressing a Cas9 mutant, wherein the Cas9 mutant is a mutant protein obtained by mutating amino acid residues between positions 1010 and 1031 of Cas9, or a mutant protein obtained by mutating amino acid residues 1010, 1013, 1014, 1016, 1018, 1019, 1027 and/or 1031 of Cas9, or a mutant protein obtained by subjecting Cas9 to any one or more of the following mutations:

M3) mutating the lysine residue at position 1014 of Cas9 to a proline residue;

2. The base editor of claim 1, wherein the Cas9 is the protein shown in sequence 2 or sequence 6 of the sequence listing.

3. The base editor of claim 1 or 2, wherein the Cas9 mutant is a mutant protein obtained by introducing the seven mutations M1), M2), M4), M5), M6), M7) and M8) into Cas9, or a mutant protein obtained by introducing the five mutations M1), M2), M5), M7) and M8) into Cas9, or a mutant protein obtained by introducing the eight mutations M1) to M8) into Cas9.

4. The base editor of any one of claims 1-3, wherein the base editor further contains or expresses an sgRNA targeting the target sequence and/or a domain having base modification activity.

5. The Cas9 mutant of any one of claims 1-4.

6. A biological material associated with the Cas9 mutant of any one of claims 1-4, which is any one of the following B1) to B5):

B1) a nucleic acid molecule encoding the Cas9 mutant of any one of claims 1-4;

B2) an expression cassette comprising the nucleic acid molecule of B1);

7. The biological material according to claim 6, wherein the nucleic acid molecule of B1) is a mutant gene obtained by mutating the nucleotide sequence between positions 3028 and 3093 of the Cas9 gene;

further, the nucleic acid molecule of B1) is obtained by replacing positions 3028 to 3093 of the Cas9 gene shown in sequence 1 of the sequence listing with sequence 10 or sequence 11, or by replacing positions 3028 to 3093 of the Cas9 gene shown in sequence 5 of the sequence listing with sequence 12, or by replacing positions 3028 to 3093 of the Cas9 gene shown in sequence 9 of the sequence listing with sequence 13 or sequence 14.
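
The nucleotide window recited in claim 7 corresponds arithmetically to the residue window recited in claim 1. A minimal Python sketch of that mapping follows, assuming an uninterrupted coding sequence in which nucleotide 1 is the first base of codon 1; the function name `codon_span` is illustrative only.

```python
def codon_span(first_residue, last_residue):
    """Map an inclusive residue range to 1-based inclusive nucleotide coordinates.

    Assumes a continuous open reading frame whose first codon encodes residue 1.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("invalid residue range")
    start = 3 * (first_residue - 1) + 1  # first base of the first codon
    end = 3 * last_residue               # last base of the last codon
    return start, end


start, end = codon_span(1010, 1031)
print(start, end, end - start + 1)  # 3028 3093 66
print(codon_span(1, 1368))          # (1, 4104)
```

Residues 1010 to 1031 span 22 codons, that is 66 nucleotides, which matches the length given for sequences 10 to 14. The 1368-residue proteins likewise correspond to the 4104-nt coding sequences in the listing.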

8. A product comprising the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7.

9. Any one of the following uses of the base editor of any one of claims 1-4, the Cas9 mutant of claim 5, the biological material of claim 6 or 7, or the product of claim 8:

y7) as, or in the preparation of, a single-base editing reagent or kit;

y8) as, or in the preparation of, a medicament for gene therapy;

y9) treating or preventing a disease;

10. A method, being any one of the following:

x1) a method of converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing of a cytosine nucleotide residue of interest using the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7, or the product of claim 8 to effect conversion of the cytosine nucleotide residue of interest to a thymine nucleotide residue;

x2) a method of converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on an adenine nucleotide residue of interest using the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7, or the product of claim 8, to convert the adenine nucleotide residue of interest to a guanine nucleotide residue;

x3) a method of converting a cytosine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: the conversion of a cytosine nucleotide residue of interest into a guanine nucleotide residue is accomplished by base editing the cytosine nucleotide residue of interest with a base editor as defined in any one of claims 1-4, or a Cas9 mutant as defined in claim 5, or a biological material as defined in claim 6 or 7, or a product as defined in claim 8.
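
The three conversions recited in x1) to x3) can be illustrated at the level of a sequence string. The Python sketch below is only a toy: it does not model editing windows, strand choice, or efficiency, and the names `CONVERSIONS` and `apply_conversion` as well as the example sequences are hypothetical.

```python
# Base conversions recited in the three methods: source base -> product base.
CONVERSIONS = {
    "x1": ("C", "T"),  # cytosine to thymine
    "x2": ("A", "G"),  # adenine to guanine
    "x3": ("C", "G"),  # cytosine to guanine
}


def apply_conversion(sequence, position, method):
    """Return `sequence` with the base at 1-based `position` converted per `method`."""
    source, target = CONVERSIONS[method]
    seq = sequence.upper()
    if not 1 <= position <= len(seq):
        raise IndexError("position outside the sequence")
    if seq[position - 1] != source:
        raise ValueError(f"base at position {position} is not {source}")
    return seq[:position - 1] + target + seq[position:]


print(apply_conversion("GTCATC", 3, "x1"))  # GTTATC
print(apply_conversion("GAAGAG", 2, "x2"))  # GGAGAG
print(apply_conversion("GTCATC", 3, "x3"))  # GTGATC
```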

11. The use according to claim 9, or the method according to claim 10, characterized in that: the biological cell is a mammalian cell; the animal is a mammal.