CN117089572A - Low off-target base editor and construction thereof

Publication number
CN117089572A
Authority
CN
China
Legal status
Pending
Application number
CN202210518789.4A
Other languages
Chinese (zh)
Inventor
张学礼
毕昌昊
赵东东
侯雪亭
王杰
魏占东
陈旭旭
Current Assignee
Tianjin Institute of Industrial Biotechnology of CAS
Original Assignee
Tianjin Institute of Industrial Biotechnology of CAS

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C12N15/1024 In vivo mutagenesis using high mutation rate "mutator" host strains by inserting genetic material, e.g. encoding an error prone polymerase, disrupting a gene for mismatch repair
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/10 Plasmid DNA
    • C12N2800/106 Plasmid DNA for vertebrates
    • C12N2800/107 Plasmid DNA for vertebrates for mammalian

Abstract

The application discloses a low off-target base editor and its construction. The disclosed base editor contains or expresses a Cas9 mutant, which is a mutant protein obtained by mutating amino acid residues between positions 1010 and 1031 of Cas9. By modifying base editors (BEs) that contain Streptococcus pyogenes Cas9 or mutants thereof, the application constructs low off-target base editors, including a low off-target CBE, a low off-target ABE and a low off-target GBE, which significantly reduce the off-target activity of base editors in mammalian cells. These base editors have broad application prospects in gene therapy, drug screening, construction of animal and plant models, and the like.

Description

Low off-target base editor and construction thereof
Technical Field
The application relates to the technical field of biotechnology and gene editing, in particular to a low off-target base editor and construction thereof.
Background
Genome editing refers to the designed, efficient modification of cells at the genome level. CRISPR/Cas9 genome editing is simple to design, convenient to operate and highly efficient, and has been successfully applied to genome editing in many kinds of target cells. It mainly uses a guide RNA (gRNA) to direct the Cas9 protein to cut precisely at a targeted genomic site, producing a DNA double-strand break (DSB), which the host cell repairs by its own non-homologous end joining (NHEJ) or homology-directed repair (HDR) pathway. Specific editing of a single base is difficult to achieve in this way: DNA double-strand breaks carry considerable uncertainty, HDR occurs at low frequency, and NHEJ causes random insertions or deletions of bases. Conventional CRISPR/Cas techniques therefore have clear drawbacks for single-base editing.
The CRISPR base editor (BE) overcomes the shortcomings of conventional CRISPR/Cas technology in single-base editing. It is an editor formed by fusing a deaminase to either a Cas9 without DNA cleavage activity (dCas9) or a Cas9 with single-strand DNA cleavage activity (nCas9). Guided by a gRNA, it can introduce precise point mutations at a target site, and it has great application prospects in the treatment of genetic diseases caused by gene mutations. Existing base editors mainly comprise: the cytosine base editor (CBE), which converts cytosine nucleotides within the editing window of the target sequence to thymine nucleotides (C > T); the adenine base editor (ABE), which converts adenine nucleotides within the editing window to guanine nucleotides (A > G); and the newer glycosylase base editor (GBE), which edits cytosine nucleotides to adenine nucleotides (C > A) in E. coli and specifically to guanine nucleotides (C > G) in mammalian cells.
CRISPR/Cas9 has a serious off-target effect: it can cut the DNA double strand at unintended genomic sites, creating potential risks, and this is a major factor limiting the clinical application of CRISPR/Cas9 gene editing. Many clinical trials based on CRISPR/Cas9 gene editing have now been started in China and abroad, and reducing the off-target effect is an urgent problem. CRISPR/Cas9 identifies the site to be edited mainly through complementary base pairing between the targeting sequence of the gRNA and the target DNA; however, Cas9 can sometimes still edit when the targeting sequence of the gRNA does not perfectly match the genomic DNA, which results in off-target effects. High-fidelity Cas9 proteins (Cas9-HF1, HypaCas9, evoCas9, etc.) constructed by protein engineering or directed evolution can reduce off-target activity, but they also reduce editing efficiency at the target site.
The CRISPR base editor mainly uses the dCas9 or nCas9 protein together with a gRNA to recognize a genomic target site, deaminates the target base with a deaminase, and then relies on the cell's DNA repair system to complete base editing at the target site. As with CRISPR/Cas9 genome editing, off-target editing can occur at incompletely matched sites.
Disclosure of Invention
The application aims to solve the technical problem of reducing off-target of a base editor.
To solve the above technical problem, the present application first provides a base editor that contains or expresses a Cas9 mutant, wherein the Cas9 mutant is a mutant protein obtained by mutating amino acid residues between positions 1010 and 1031 of Cas9, or a mutant protein obtained by mutating the amino acid residues at positions 1010, 1013, 1014, 1016, 1018, 1019, 1027 and/or 1031 of Cas9, or a mutant protein obtained by subjecting Cas9 to any one or more of the following mutations:
M1) mutating the tyrosine residue at position 1010 of Cas9 to an aspartic acid residue;
M2) mutating the tyrosine residue at position 1013 of Cas9 to an aspartic acid residue;
M3) mutating the lysine residue at position 1014 of Cas9 to a proline residue;
M4) mutating the tyrosine residue at position 1016 of Cas9 to an aspartic acid residue;
M5) mutating the valine residue at position 1018 of Cas9 to an aspartic acid residue;
M6) mutating the arginine residue at position 1019 of Cas9 to an aspartic acid residue;
M7) mutating the glutamine residue at position 1027 of Cas9 to an aspartic acid residue;
M8) mutating the lysine residue at position 1031 of Cas9 to an aspartic acid residue.
In the above base editor, the Cas9 may be a protein represented by sequence 2 or sequence 6 in the sequence table.
In the above base editor, the Cas9 mutant may be a mutant protein obtained by subjecting Cas9 to the seven mutations M1), M2), M4), M5), M6), M7) and M8), or a mutant protein obtained by subjecting Cas9 to the five mutations M1), M2), M5), M7) and M8), or a mutant protein obtained by subjecting Cas9 to the eight mutations M1) to M8).
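The mutation combinations above can be written out explicitly. The following is a minimal sketch, not part of the application: the variant labels (`seven`, `five`, `eight`) and the helper names are hypothetical, the wild-type window `YGDYKVYDVRKMIAKSEQEIGK` is assumed to be the translation of nucleotides 3028 to 3093 quoted in Example 1, and the five-mutation set follows the positions listed for nCas9-M2 in Example 1 (1010, 1013, 1018, 1027 and 1031).

```python
# Mutations M1)-M8) as (wild-type residue, position, replacement residue).
MUTATIONS = {
    "M1": ("Y", 1010, "D"),
    "M2": ("Y", 1013, "D"),
    "M3": ("K", 1014, "P"),
    "M4": ("Y", 1016, "D"),
    "M5": ("V", 1018, "D"),
    "M6": ("R", 1019, "D"),
    "M7": ("Q", 1027, "D"),
    "M8": ("K", 1031, "D"),
}

# Hypothetical labels for the three combinations named in the text.
VARIANTS = {
    "seven": ["M1", "M2", "M4", "M5", "M6", "M7", "M8"],
    "five": ["M1", "M2", "M5", "M7", "M8"],
    "eight": ["M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8"],
}

# Residues 1010-1031 of Cas9 (assumed: translation of nt 3028-3093 in Example 1).
WINDOW_START = 1010
WT_WINDOW = "YGDYKVYDVRKMIAKSEQEIGK"


def apply_mutations(window, start, names):
    """Return the window with the named mutations applied."""
    residues = list(window)
    for name in names:
        wild_type, position, replacement = MUTATIONS[name]
        index = position - start
        # Guard against applying a mutation to the wrong residue.
        if residues[index] != wild_type:
            raise ValueError(f"{name}: expected {wild_type} at {position}")
        residues[index] = replacement
    return "".join(residues)


def notation(names):
    """Return the mutations in conventional notation, e.g. Y1010D."""
    return [f"{wt}{pos}{new}" for wt, pos, new in (MUTATIONS[n] for n in names)]


for label, names in VARIANTS.items():
    print(label, apply_mutations(WT_WINDOW, WINDOW_START, names), notation(names))
```

The wild-type check in `apply_mutations` is a design choice of this sketch: it makes a numbering offset fail loudly instead of silently mutating the wrong residue.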
The above base editor may also contain or express an sgRNA targeting the target sequence and/or a domain with base-modifying activity.
The domain having base-modifying activity may be a domain having deaminase activity, or a mutant, homolog or polypeptide fragment thereof that has all or part of the deaminase activity. Specifically, the domain having base-modifying activity is an adenine deaminase, or a mutant, homolog or polypeptide fragment having all or part of the adenine deaminase activity; or the domain having base-modifying activity is a cytosine deaminase, or a mutant, homolog or polypeptide fragment having all or part of the cytosine deaminase activity.
In particular, the base editor may also contain the components of a base editor other than Cas9 or its mutants, such as an sgRNA targeting the target sequence and/or a domain with base-modifying activity. Further, the base editor may contain the components of a CBE base editor (e.g., BE4max or hyA3A-BE4max) other than Cas9 or its mutants, or the components of an ABE base editor (e.g., NG-ABEmax or ABE8e) other than Cas9 or its mutants, or the components of a GBE base editor (e.g., APOBEC_nCas9_Ung) other than Cas9 or its mutants.
The Cas9 mutants also fall within the scope of the present application.
The application also provides a biological material associated with the Cas9 mutant, which is any one of the following B1) to B5):
B1) a nucleic acid molecule encoding the Cas9 mutant;
B2) an expression cassette comprising the nucleic acid molecule of B1);
B3) a recombinant vector comprising the nucleic acid molecule of B1) or a recombinant vector comprising the expression cassette of B2);
B4) a recombinant microorganism comprising the nucleic acid molecule of B1), or a recombinant microorganism comprising the expression cassette of B2), or a recombinant microorganism comprising the recombinant vector of B3);
B5) a cell line containing the nucleic acid molecule of B1) or a cell line containing the expression cassette of B2).
In the above biological material, the nucleic acid molecule of B1) may be a mutant gene obtained by mutating the nucleotide sequence between positions 3028 and 3093 of the Cas9 gene.
Further, the nucleic acid molecule of B1) may be a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 1 in the sequence table to sequence 10 or sequence 11, or a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 5 in the sequence table to sequence 12, or a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 9 in the sequence table to sequence 13 or sequence 14.
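Nucleotide positions 3028 to 3093 of the gene and amino acid positions 1010 to 1031 of the protein describe the same region, assuming 1-based numbering that starts at the first nucleotide of the start codon. A short sketch of the conversion (the function names are illustrative only):

```python
def residues_to_nucleotides(first_residue, last_residue):
    """Map a 1-based residue range to the 1-based nucleotide range encoding it."""
    start = (first_residue - 1) * 3 + 1  # first base of the first codon
    end = last_residue * 3               # third base of the last codon
    return start, end


def nucleotide_to_residue(position):
    """Map a 1-based nucleotide position to the 1-based residue it encodes."""
    return (position - 1) // 3 + 1


print(residues_to_nucleotides(1010, 1031))
print(nucleotide_to_residue(3028), nucleotide_to_residue(3093))
```

The range covers 22 codons, that is 66 nucleotides, which matches the length of the replacement sequences quoted in the examples.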
In the above biological material, the expression cassette of B2) containing the nucleic acid molecule encoding the Cas9 mutant (the Cas9 mutant gene expression cassette) refers to DNA capable of expressing the Cas9 mutant in a host cell. The DNA may include not only a promoter that initiates transcription of the Cas9 mutant gene but also a terminator that terminates its transcription. Further, the expression cassette may also include an enhancer sequence.
Recombinant vectors containing the Cas9 mutant gene expression cassettes can be constructed using existing expression vectors.
In the above biological material, the vector may be a plasmid, cosmid, phage or viral vector.
The recombinant vector of B3) may be BE4maxM, BE4maxM2, NG-ABEmaxM, ABE8eM, APOBEC_nCas9_UngM2 or APOBEC_nCas9_UngM3;
BE4maxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the BE4max plasmid to sequence 10;
BE4maxM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the BE4max plasmid to sequence 11;
NG-ABEmaxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 5) in the NG-ABEmax plasmid to sequence 12;
ABE8eM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the ABE8e plasmid to sequence 10;
APOBEC_nCas9_UngM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 9) in the APOBEC_nCas9_Ung plasmid to sequence 13;
APOBEC_nCas9_UngM3 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 9) in the APOBEC_nCas9_Ung plasmid to sequence 14.
In the above biological material, the microorganism may be yeast, bacteria, algae or fungi.
In the above biological material, the cell line does not include propagation material.
The application also provides a product comprising the base editor, or the Cas9 mutant, or the biological material.
The product may also contain one or more pharmaceutically acceptable carriers.
The application also provides use of the base editor, or the Cas9 mutant, or the biological material, or the product, in any of the following:
y1) converting a cytosine nucleotide residue in a biological cell or animal or subject to a thymine nucleotide residue;
y2) converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;
y3) converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;
y4) preparing a product for converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject;
y5) preparing a product for converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;
y6) preparing a product for converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;
y7) as or preparing a single base editing reagent or kit;
y8) as or for the preparation of a medicament for gene therapy;
y9) treating or preventing a disease;
y10) for the preparation of a product for the treatment or prophylaxis of a disease.
The application also provides any one of the following methods:
x1) a method of converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target cytosine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize conversion of the target cytosine nucleotide residue into thymine nucleotide residue;
x2) a method of converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target adenine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize the conversion of the target adenine nucleotide residue into guanine nucleotide residue;
x3) a method of converting a cytosine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target cytosine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize the conversion of the target cytosine nucleotide residue into guanine nucleotide residue.
In the above, the biological cell may be a mammalian cell; the animal may be a mammal.
By modifying base editors (BEs) that contain Streptococcus pyogenes Cas9 or mutants thereof, the application constructs low off-target base editors, including a low off-target CBE, a low off-target ABE and a low off-target GBE, which significantly reduce the off-target activity of base editors in mammalian cells. These base editors have broad application prospects in gene therapy, drug screening, construction of animal and plant models, and the like.
Detailed Description
The following detailed description of the application is provided in connection with the accompanying drawings that are presented to illustrate the application and not to limit the scope thereof. The examples provided below are intended as guidelines for further modifications by one of ordinary skill in the art and are not to be construed as limiting the application in any way.
The experimental methods in the following examples, unless otherwise specified, are conventional methods, and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents, instruments and the like used in the examples described below are commercially available unless otherwise specified. The quantitative tests in the following examples were all set up in triplicate and the results averaged. In the following examples, unless otherwise specified, the 1 st position of each nucleotide sequence in the sequence listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA.
The medium used hereinafter was DMEM medium (Thermo Fisher) containing 10% fetal bovine serum (Gibco).
Example 1: Preparation and application of low off-target CBE genome editors
In this example, seven residues of the nCas9 in the commonly used CBE base editors BE4max (Addgene: 112093) and hyA3A-BE4max (Addgene: 157943), namely Y (tyrosine) at position 1010, Y (tyrosine) at position 1013, Y (tyrosine) at position 1016, V (valine) at position 1018, R (arginine) at position 1019, Q (glutamine) at position 1027 and K (lysine) at position 1031, were each mutated to D (aspartic acid) (the mutated nCas9 protein is designated nCas9-M), giving BE4maxM and hyA3A-BE4maxM, respectively. BE4maxM2 was constructed by mutating Y (tyrosine) at position 1010, Y (tyrosine) at position 1013, V (valine) at position 1018, Q (glutamine) at position 1027 and K (lysine) at position 1031 of the nCas9 in BE4max (Addgene: 112093) to D (aspartic acid) (the mutated nCas9 protein is designated nCas9-M2). The sequences of nCas9, nCas9-M and nCas9-M2 are sequence 2, sequence 3 and sequence 4 in the sequence table, respectively.
In mammalian cells, the genomic RNF2 site was base-edited with BE4max, BE4maxM, hyA3A-BE4max, hyA3A-BE4maxM and BE4maxM2, using mismatch-free and mismatch-containing gRNAs. The results were as follows. With a mismatch-free gRNA at the editing site, the editing efficiencies of BE4max, BE4maxM and BE4maxM2 did not differ significantly, nor did those of hyA3A-BE4max and hyA3A-BE4maxM. With a mismatched gRNA at the editing site, the editing efficiency of BE4maxM was significantly lower than that of BE4max, the editing efficiency of BE4maxM2 was significantly lower than that of BE4max, and the editing efficiency of hyA3A-BE4maxM was significantly lower than that of hyA3A-BE4max.
The experimental procedure was as follows:
Preparation of plasmids:
Using the BE4max plasmid (Addgene: 112093) as the template, the nCas9 gene in the BE4max plasmid was point-mutated by the primer-embedding method with the primer pair consisting of P1 and P2 in Table 6; the resulting mutant plasmid is BE4maxM. Using the BE4max plasmid (Addgene: 112093) as the template, the nCas9 gene in the BE4max plasmid was point-mutated by the primer-embedding method with the primer pair consisting of P1 and P3 in Table 6; the resulting mutant plasmid is BE4maxM2.
The BE4max plasmid contains the nCas9 gene shown in sequence 1 in the sequence table and can express the nCas9 shown in sequence 2. BE4maxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the nCas9 gene (sequence 1) in the BE4max plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 10); the mutated nCas9 gene is designated the nCas9-M gene, and BE4maxM contains the nCas9-M gene and can express the nCas9-M shown in sequence 3. BE4maxM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the nCas9 gene (sequence 1) in the BE4max plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGTACGACGACCGGAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 11); the mutated nCas9 gene is designated the nCas9-M2 gene, and BE4maxM2 contains the nCas9-M2 gene and can express the nCas9-M2 shown in sequence 4.
Using the hyA3A-BE4max plasmid (Addgene: 157943) as the template, the nCas9 gene in the hyA3A-BE4max plasmid was point-mutated by the primer-embedding method with the primer pair consisting of P1 and P2 in Table 6; the resulting mutant plasmid is hyA3A-BE4maxM.
The hyA3A-BE4max plasmid contains the nCas9 gene shown in sequence 1 in the sequence table and can express the nCas9 shown in sequence 2. hyA3A-BE4maxM is a mutant plasmid obtained by replacing the nCas9 gene (sequence 1) in the hyA3A-BE4max plasmid with the nCas9-M gene, and it can express the nCas9-M shown in sequence 3.
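The amino acid changes encoded by sequence 10 and sequence 11 can be checked by translating the quoted 66-nucleotide windows and comparing them with the wild-type window. The sketch below uses the standard genetic code and assumes the window starts at codon 1010; the helper names are illustrative, not from the application.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Nucleotides 3028-3093 as quoted in the text.
WILD_TYPE = "TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG"
SEQUENCE_10 = "GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"
SEQUENCE_11 = "GACGGCGACGACAAGGTGTACGACGACCGGAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitutions(wild_type, mutant, first_residue=1010):
    """List amino acid substitutions between two in-frame windows of equal length."""
    changes = []
    for offset, (wt, mut) in enumerate(zip(translate(wild_type), translate(mutant))):
        if wt != mut:
            changes.append(f"{wt}{first_residue + offset}{mut}")
    return changes


print(translate(WILD_TYPE))
print(substitutions(WILD_TYPE, SEQUENCE_10))
print(substitutions(WILD_TYPE, SEQUENCE_11))
```

Run on the quoted windows, this reports seven substitutions for sequence 10 and five for sequence 11, in agreement with the residue lists given for nCas9-M and nCas9-M2.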
Using the pGL3-U6-sgRNA-PGK-puromycin plasmid (Addgene: 51133) as the template, pre-spacer sequences were inserted by the primer-embedding method to obtain recombinant plasmids expressing gRNAs that target the genomic RNF2 site (gRNA plasmids for short): a mismatch-free gRNA plasmid (gRNA1), a mismatch-containing gRNA plasmid (gRNA2) and a deletion-mismatch-containing gRNA plasmid (gRNA3).
The primers used for the mismatch-free gRNA plasmid (gRNA1) are P10 and P11 in Table 6; the primers used for the mismatch-containing gRNA plasmid (gRNA2) are P10 and P12 in Table 6; the primers used for the deletion-mismatch-containing gRNA plasmid (gRNA3) are P10 and P13 in Table 6.
The mismatch-free gRNA plasmid (gRNA1) is the plasmid obtained by replacing positions 322 to 343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA1; the mismatch-containing gRNA plasmid (gRNA2) is the plasmid obtained by replacing positions 322 to 343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA2; the deletion-mismatch-containing gRNA plasmid (gRNA3) is the plasmid obtained by replacing positions 322 to 343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA3.
Plasmid transfection: HEK293T cells (ATCC, CRL-3216) were seeded in 24-well plates at a density of 5 × 10^5. When each well had grown to 40%-60% confluence, the different base editor plasmids were each co-transfected into the HEK293T cells with a mismatch-free or mismatch-containing gRNA plasmid targeting the genomic RNF2 site (the pre-spacer sequences of the gRNA plasmids are shown in Table 1), using 600 ng of base editor plasmid and 300 ng of gRNA plasmid with Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Each plasmid combination was transfected in 3 replicates, and 5 μg/ml puromycin (Merck, USA) was added to the medium 24 hours after transfection. Genomic DNA was extracted with a rapid DNA extraction solution (Epicentre, USA) 120 hours after transfection, the region around the editing site was PCR-amplified with Taq DNA polymerase (Kangji, China), and the PCR products were sequenced. The primers used for PCR amplification are P4 and P5 in Table 6.
TABLE 1. Pre-spacer sequences of the gRNA plasmids for the RNF2 genome editing site
gRNA                                              Pre-spacer sequence of target gene (5'-3')
gRNA1 (targeting RNF2, no mismatch)               GTCATCTTAGTCATTACCTG (sequence 15)
gRNA2 (targeting RNF2, containing a mismatch)     GTaATCTTAGTCATTACCTG (sequence 16)
gRNA3 (targeting RNF2, with deletion mismatch)    G-CATCTTAGTCATTACCTG (sequence 17)
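The position and type of each mismatch in Table 1 can be read off by comparing every pre-spacer with the mismatch-free one, position by position. This is a minimal sketch under the assumption that the sequences are already aligned as printed, with a lowercase letter marking a substitution and `-` marking a deleted base; the function name is illustrative.

```python
PERFECT = "GTCATCTTAGTCATTACCTG"  # gRNA1, sequence 15

GUIDES = {
    "gRNA2": "GTaATCTTAGTCATTACCTG",  # sequence 16
    "gRNA3": "G-CATCTTAGTCATTACCTG",  # sequence 17
}


def differences(reference, guide):
    """Return (position, reference base, change) for each differing position."""
    if len(reference) != len(guide):
        raise ValueError("sequences must be aligned to the same length")
    found = []
    for position, (ref_base, base) in enumerate(zip(reference, guide), start=1):
        if base == "-":
            found.append((position, ref_base, "deleted"))
        elif base.upper() != ref_base.upper():
            found.append((position, ref_base, base.upper()))
    return found


for name, sequence in GUIDES.items():
    print(name, differences(PERFECT, sequence))
```

For the sequences in Table 1 this gives a C-to-A substitution at position 3 for gRNA2 and a deletion of the T at position 2 for gRNA3, both counted from the 5' end.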
The percentage of the C in the genome corresponding to the sixth position of sequence 15 in the sequence table that was mutated to T, i.e., the C-T editing efficiency, was counted, and the results are shown in Table 2. The results show the following. For the mismatch-free gRNA1, the editing efficiencies of BE4max and BE4maxM did not differ significantly, those of BE4max and BE4maxM2 did not differ significantly, and those of hyA3A-BE4max and hyA3A-BE4maxM did not differ significantly. For the mismatched gRNA2 and gRNA3, BE4maxM significantly reduced the editing efficiency compared with BE4max, BE4maxM2 significantly reduced the editing efficiency compared with BE4max, and hyA3A-BE4maxM significantly reduced the editing efficiency compared with hyA3A-BE4max.
TABLE 2. C-T editing efficiency of different cytosine base editors with different gRNAs at the RNF2 site (C6, the cytosine at the sixth position from the 5' end)
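The editing efficiency described above is a percentage at a single position, averaged over the three replicate transfections. The sketch below shows one way to compute it. It assumes per-replicate base counts at the target position are available; the application does not state how the sequencing signal is quantified, and the counts in the usage lines are made-up placeholders, not experimental data.

```python
def editing_efficiency(base_counts, edited_base):
    """Percentage of observations at the target position that carry the edited base."""
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no observations at the target position")
    return 100.0 * base_counts.get(edited_base, 0) / total


def mean_efficiency(replicates, edited_base):
    """Average the per-replicate efficiencies (three replicates in the examples)."""
    values = [editing_efficiency(counts, edited_base) for counts in replicates]
    return sum(values) / len(values)


# Placeholder counts for illustration only (hypothetical, not from Table 2).
placeholder_replicates = [
    {"C": 50, "T": 50},
    {"C": 40, "T": 60},
    {"C": 60, "T": 40},
]
print(mean_efficiency(placeholder_replicates, "T"))
```

The same functions apply to A-G editing by passing `"G"` as the edited base.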
Example 2: preparation and application of low off-target ABE genome editor
In this example, NG-ABEmaxM and ABE8eM were constructed by mutating Y (tyrosine) at position 1010, Y (tyrosine) at position 1013, Y (tyrosine) at position 1016, V (valine) at position 1018, R (arginine) at position 1019, Q (glutamine) at position 1027 and K (lysine) at position 1031 of the NG-nCas9 in the commonly used ABE base editor NG-ABEmax (Addgene: 124163) and of the nCas9 in ABE8e (Addgene: 138489), respectively, to D (aspartic acid). The mutated NG-nCas9 protein is designated NG-nCas9-M, and the sequences of NG-nCas9 and NG-nCas9-M are sequence 6 and sequence 7 in the sequence table, respectively; the mutated nCas9 protein is designated nCas9-M, and the sequences of nCas9 and nCas9-M are sequence 2 and sequence 3 in the sequence table, respectively.
In mammalian cells, the genomic ABCA3 site was base-edited with NG-ABEmax, NG-ABEmaxM, ABE8e and ABE8eM, using mismatch-free and mismatch-containing gRNAs. The results were as follows. With a mismatch-free gRNA at the editing site, the editing efficiencies of NG-ABEmax and NG-ABEmaxM did not differ greatly, nor did those of ABE8e and ABE8eM. With a mismatched gRNA at the editing site, the editing efficiency of NG-ABEmaxM was significantly lower than that of NG-ABEmax, and the editing efficiency of ABE8eM was significantly lower than that of ABE8e.
The experimental procedure was as follows:
Preparation of plasmids:
Using the NG-ABEmax plasmid (Addgene: 124163) as the template, the NG-nCas9 gene in the NG-ABEmax plasmid was point-mutated by the primer-embedding method with the primer pair consisting of P1 and P2 in Table 6; the resulting mutant plasmid is NG-ABEmaxM.
The NG-ABEmax plasmid contains the NG-nCas9 gene shown in sequence 5 in the sequence table and can express the NG-nCas9 shown in sequence 6. NG-ABEmaxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the NG-nCas9 gene (sequence 5) in the NG-ABEmax plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 12); the mutated NG-nCas9 gene is designated the NG-nCas9-M gene, and NG-ABEmaxM contains the NG-nCas9-M gene and can express the NG-nCas9-M shown in sequence 7.
Using the ABE8e plasmid (Addgene: 138489) as the template, the nCas9 gene in the ABE8e plasmid was point-mutated by the primer-embedding method with the primer pair consisting of P1 and P2 in Table 6; the resulting mutant plasmid is ABE8eM.
The ABE8e plasmid contains the nCas9 gene shown in sequence 1 in the sequence table and can express the nCas9 shown in sequence 2. ABE8eM is a mutant plasmid obtained by replacing the nCas9 gene (sequence 1) in the ABE8e plasmid with the nCas9-M gene, and it can express the nCas9-M shown in sequence 3.
Using the pGL3-U6-sgRNA-PGK-puromycin plasmid (Addgene: 51133) as the template, pre-spacer sequences were inserted by the primer-embedding method to obtain recombinant plasmids expressing gRNAs that target the genomic ABCA3 site (gRNA plasmids for short): a mismatch-free gRNA plasmid (gRNA4), a mismatch-containing gRNA plasmid (gRNA5) and a deletion-mismatch-containing gRNA plasmid (gRNA6).
The primers used for the mismatch-free gRNA plasmid (gRNA4) are P10 and P14 in Table 6; the primers used for the mismatch-containing gRNA plasmid (gRNA5) are P10 and P15 in Table 6; the primers used for the deletion-mismatch-containing gRNA plasmid (gRNA6) are P10 and P16 in Table 6.
The mismatch-free gRNA plasmid (gRNA4) is the plasmid obtained by replacing positions 322 to 343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA4; the mismatch-containing gRNA plasmid (gRNA5) is the plasmid obtained by replacing positions 322 to 343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA5; the deletion-mismatch-containing gRNA plasmid (gRNA6) is the plasmid obtained by replacing positions 322 to 343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA6.
Plasmid transfection: HEK293T cells (ATCC, CRL-3216) were seeded in 24-well plates at 5 × 10^5 cells. When the cells in each well had grown to 40%-60% confluence, they were co-transfected with one of the base editor plasmids and either the mismatch-free gRNA plasmid or a mismatch-containing gRNA plasmid targeting the ABCA3 site (the pre-spacer sequences of the gRNA plasmids are shown in Table 3), using 600 ng of base editor plasmid and 300 ng of gRNA plasmid with Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Each plasmid combination was transfected in 3 replicates, and 5 μg/ml puromycin (Merck, USA) was added to the medium 24 hours after transfection. Genomic DNA was extracted with a rapid DNA extraction solution (Epicentre, USA) 120 hours after transfection, the region surrounding the editing site was amplified by PCR with Taq DNA polymerase (Kangji, China), and the PCR products were sequenced. The primers used for PCR amplification are P6 and P7 in Table 6.
TABLE 3 Pre-spacer sequences of the gRNA plasmids for the ABCA3 genomic editing site
gRNA Pre-spacer sequence of target gene (5'-3')
gRNA4 (targeting ABCA3 without mismatch) GAAGAGCAGGGTCATGAAGG (sequence 18)
gRNA5 (targeting ABCA3 with mismatch) GAtGAGCAGGGTCATGAAGG (sequence 19)
gRNA6 (targeting ABCA3 with deletion mismatch) G-AGAGCAGGGTCATGAAGG (sequence 20)
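To make the differences among the three pre-spacers in Table 3 explicit, the following minimal sketch compares gRNA5 and gRNA6 with gRNA4 position by position. It assumes the sequences are aligned exactly as printed, with the lower-case letter marking the substituted base and "-" marking the deleted base; `describe_mismatches` is a hypothetical helper, not something described in the application.

```python
def describe_mismatches(reference, variant):
    """Compare two aligned pre-spacers; '-' in the variant marks a deleted base.

    Returns a list of (1-based position, reference base, change).
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    pairs = zip(reference.upper(), variant.upper())
    for position, (ref, var) in enumerate(pairs, start=1):
        if var == "-":
            differences.append((position, ref, "deletion"))
        elif ref != var:
            differences.append((position, ref, var))
    return differences


# Pre-spacer sequences as printed in Table 3 (5'-3').
GRNA4 = "GAAGAGCAGGGTCATGAAGG"
GRNA5 = "GAtGAGCAGGGTCATGAAGG"
GRNA6 = "G-AGAGCAGGGTCATGAAGG"

for name, seq in (("gRNA5", GRNA5), ("gRNA6", GRNA6)):
    print(name, describe_mismatches(GRNA4, seq))
```

Read this way, gRNA5 carries a single substitution at position 3 and gRNA6 a single-base deletion at position 2 relative to gRNA4.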
The percentage of A mutated to G at the genomic position corresponding to the fifth position of sequence 12 in the sequence table, i.e. the A-G editing efficiency, was determined, and the results are shown in Table 4. The results show that, with the mismatch-free gRNA4, the editing efficiencies of NG-ABEmax and NG-ABEmaxM do not differ significantly, nor do those of ABE8e and ABE8eM; with the mismatch-containing gRNA5 and gRNA6, the editing efficiency of NG-ABEmaxM is significantly lower than that of NG-ABEmax, and that of ABE8eM is significantly lower than that of ABE8e.
TABLE 4 A-G editing efficiency of different adenine base editors with different gRNAs at the ABCA3 site (A5, the fifth base from the 5' end, an adenine)
gRNA NG-ABEmax NG-ABEmaxM ABE8e ABE8eM
gRNA4 53.4±4.5% 46.8±4.3% 85.4±6.2% 78.6±6.5%
gRNA5 50.3±3.4% 12.1±2.3% 78.2±1.3% 15.9±3.7%
gRNA6 48.2±7.8% 10.8±5.6% 70.9±5.6% 19.8±7.2%
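One way to read Table 4 is to express each editor's efficiency with a mismatched gRNA as a fraction of its efficiency with the matched gRNA4. The sketch below does this with the mean values only and ignores the reported ± spread; the "retained fraction" metric and the name `retained_fraction` are illustrative assumptions, not quantities defined in the application.

```python
# Mean A-G editing efficiencies (%) copied from Table 4.
EDITING = {
    "NG-ABEmax": {"gRNA4": 53.4, "gRNA5": 50.3, "gRNA6": 48.2},
    "NG-ABEmaxM": {"gRNA4": 46.8, "gRNA5": 12.1, "gRNA6": 10.8},
    "ABE8e": {"gRNA4": 85.4, "gRNA5": 78.2, "gRNA6": 70.9},
    "ABE8eM": {"gRNA4": 78.6, "gRNA5": 15.9, "gRNA6": 19.8},
}


def retained_fraction(editor, grna, reference="gRNA4"):
    """Editing with a mismatched gRNA relative to the matched gRNA."""
    return EDITING[editor][grna] / EDITING[editor][reference]


for editor in EDITING:
    fractions = {g: round(retained_fraction(editor, g), 2) for g in ("gRNA5", "gRNA6")}
    print(editor, fractions)
```

On these means the unmutated editors keep most of their activity with the mismatched gRNAs, whereas the mutated editors keep roughly a quarter or less.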
Example 3: preparation and application of low off-target GBE genome editor
In this example, residues 1010Y (tyrosine), 1013Y (tyrosine), 1016Y (tyrosine), 1018V (valine), 1019R (arginine), 1027Q (glutamine) and 1031K (lysine) of nCas9 in the commonly used GBE base editor APOBEC_nCas9_Ung (molecular clone: MC_0101154) were each mutated to D (aspartic acid) to construct APOBEC_nCas9_UngM2; the mutated nCas9 protein is nCas9-M, and the amino acid sequences of nCas9 and nCas9-M are sequence 2 and sequence 3 in the sequence table, respectively. Separately, the same seven residues in APOBEC_nCas9_Ung were mutated to D (aspartic acid) and 1014K (lysine) was additionally mutated to P (proline) to construct APOBEC_nCas9_UngM3; the mutated nCas9 protein is nCas9-M3, and the amino acid sequences of nCas9 and nCas9-M3 are sequence 2 and sequence 8 in the sequence table, respectively.
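The substitutions listed above can be applied programmatically to the relevant stretch of nCas9. The following sketch is illustrative only: it takes residues 1009-1031 of sequence 2 in one-letter code, writes each mutation in the conventional form such as `Y1010D`, and checks the wild-type residue before replacing it. The helper `apply_mutations` and the variable names are hypothetical.

```python
import re


def apply_mutations(window, first_position, mutations):
    """Apply substitutions such as 'Y1010D' to a protein window.

    `first_position` is the residue number of window[0].
    """
    residues = list(window)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        index = position - first_position
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation} lies outside the window")
        if residues[index] != old:
            raise ValueError(f"{mutation}: found {residues[index]} at {position}")
        residues[index] = new
    return "".join(residues)


# Residues 1009-1031 of nCas9 (sequence 2), one-letter code.
WT_WINDOW = "VYGDYKVYDVRKMIAKSEQEIGK"
M2 = ["Y1010D", "Y1013D", "Y1016D", "V1018D", "R1019D", "Q1027D", "K1031D"]
M3 = M2 + ["K1014P"]

print(apply_mutations(WT_WINDOW, 1009, M2))  # window as in nCas9-M
print(apply_mutations(WT_WINDOW, 1009, M3))  # window as in nCas9-M3
```

The first result can be compared by eye with residues 1009-1031 of sequence 3 in the sequence listing.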
In mammalian cells, the genomic RNF2 site was base-edited with APOBEC_nCas9_Ung, APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3, using gRNAs with and without mismatches. The results were as follows: when a mismatch-free gRNA was used at the editing site, the editing efficiency of APOBEC_nCas9_Ung did not differ greatly from that of APOBEC_nCas9_UngM2 or APOBEC_nCas9_UngM3; when mismatched gRNAs were used at the editing site, the editing efficiencies of both APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3 were significantly lower than that of APOBEC_nCas9_Ung.
The experimental procedure was as follows:
preparation of plasmids:
Using the APOBEC_nCas9_Ung plasmid (molecular clone: MC_0101154) as a template, the nCas9 gene in the plasmid was point-mutated by primer embedding with the primer pair P1 and P8 in Table 6; the resulting mutant plasmid is APOBEC_nCas9_UngM2. Likewise, using the APOBEC_nCas9_Ung plasmid as a template, the nCas9 gene in the plasmid was point-mutated by primer embedding with the primer pair P1 and P9 in Table 6; the resulting mutant plasmid is APOBEC_nCas9_UngM3.
The APOBEC_nCas9_Ung plasmid contains the nCas9' gene shown in sequence 9 of the sequence table and can express the nCas9 shown in sequence 2. APOBEC_nCas9_UngM2 is the mutant plasmid obtained by mutating positions 3028-3093 of the nCas9' gene (sequence 9) in the APOBEC_nCas9_Ung plasmid from TACGGGGACTACAAGGTTTACGATGTGCGCAAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 13); the mutated nCas9' gene is designated the nCas9-M' gene, and APOBEC_nCas9_UngM2 can express the nCas9-M shown in sequence 3. APOBEC_nCas9_UngM3 is the mutant plasmid obtained by mutating positions 3028-3093 of the nCas9' gene (sequence 9) in the APOBEC_nCas9_Ung plasmid from TACGGGGACTACAAGGTTTACGATGTGCGCAAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAG to GACGGCGACGACCCCGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 14); the mutated nCas9' gene is designated the nCas9-M3 gene, and APOBEC_nCas9_UngM3 contains the nCas9-M3 gene and can express the nCas9-M3 shown in sequence 8.
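As a consistency check between the mutated fragments quoted above and the mutagenic primers in Table 6, the sketch below tests whether each 66-nt fragment can be split into a prefix found in the reverse complement of the reverse primer (P8 or P9) and a suffix found in the forward primer (P1). This is only an illustration with hypothetical helper names; the application does not describe such a check, and the roles of P1 as forward primer and P8/P9 as reverse primers are inferred from their sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def covered_by_primers(fragment, forward, reverse):
    """True if some prefix of `fragment` lies in the reverse complement of
    `reverse` and the remaining suffix lies in `forward`."""
    reverse_rc = reverse_complement(reverse)
    return any(
        fragment[:k] in reverse_rc and fragment[k:] in forward
        for k in range(len(fragment) + 1)
    )


# Primer sequences copied from Table 6.
P1 = "CCACGTCTCAGATGATCGCCAAGAGCGAGGACGAAATCGGCGACGCTACCGCCAAGTACTTCT"
P8 = "CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGGACTCCAGC"
P9 = "CCACGTCTCACATCTTGTCGTCGTCGTCCACGGGGTCGTCGCCGTCCACGAACTCGGACTCCAGC"

# Mutated fragments (sequences 13 and 14) as quoted in the text.
SEQ13 = "GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"
SEQ14 = "GACGGCGACGACCCCGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"

print(covered_by_primers(SEQ13, P1, P8))  # P1/P8 pair used for APOBEC_nCas9_UngM2
print(covered_by_primers(SEQ14, P1, P9))  # P1/P9 pair used for APOBEC_nCas9_UngM3
```

Swapping the reverse primers (for example, testing sequence 14 against P8) serves as a negative control for the check.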
The recombinant plasmids expressing gRNAs that target the genomic RNF2 site (gRNA plasmids for short) were the mismatch-free gRNA plasmid (gRNA1 of Example 1), the mismatch-containing gRNA plasmid (gRNA2 of Example 1) and the deletion-mismatch-containing gRNA plasmid (gRNA3 of Example 1).
Plasmid transfection: HEK293T cells (ATCC, CRL-3216) were seeded in 24-well plates at 5 × 10^5 cells. When the cells in each well had grown to 40%-60% confluence, they were co-transfected with one of the base editor plasmids and either the mismatch-free gRNA plasmid or a mismatch-containing gRNA plasmid targeting the genomic RNF2 site (the pre-spacer sequences of the gRNA plasmids are shown in Table 1), using 600 ng of base editor plasmid and 300 ng of gRNA plasmid with Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Each plasmid combination was transfected in 3 replicates, and 5 μg/ml puromycin (Merck, USA) was added to the medium 24 hours after transfection. Genomic DNA was extracted with a rapid DNA extraction solution (Epicentre, USA) 120 hours after transfection, the region surrounding the editing site was amplified by PCR with Taq DNA polymerase (Kangji, China), and the PCR products were sequenced. The primers used for PCR amplification are P4 and P5 in Table 6.
The percentage of C mutated to G at the genomic position corresponding to the sixth position of sequence 9 in the sequence table, i.e. the C-G editing efficiency, was determined, and the results are shown in Table 5. The results show that, with the mismatch-free gRNA1, the editing efficiencies of APOBEC_nCas9_Ung, APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3 do not differ significantly; with the mismatch-containing gRNA2 and gRNA3, the editing efficiencies of both APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3 are significantly lower than that of APOBEC_nCas9_Ung.
TABLE 5 C-G editing efficiency of different cytosine base editors with different gRNAs at the RNF2 site (C6, the sixth base from the 5' end, a cytosine)
gRNA APOBEC_nCas9_Ung APOBEC_nCas9_UngM2 APOBEC_nCas9_UngM3
gRNA1 27.3±3.2% 22.3±4.2% 24.4±5.6%
gRNA2 18.6±2.4% 2.4±1.2% 7.8±4.7%
gRNA3 21.3±2.8% 3.2±2.4% 9.6±1.5%
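The application does not state which statistical test underlies "significantly". Purely as an illustration, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics, assuming that each ± value in Table 5 is a standard deviation and that n = 3 (the number of transfection replicates). The example compares the first two data columns of the second row (18.6±2.4% and 2.4±1.2%); `welch_t` is a hypothetical helper.

```python
import math


def welch_t(mean1, sd1, mean2, sd2, n1=3, n2=3):
    """Welch's t statistic and approximate degrees of freedom
    computed from means, standard deviations and sample sizes."""
    var1, var2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(var1 + var2)
    df = (var1 + var2) ** 2 / (var1 ** 2 / (n1 - 1) + var2 ** 2 / (n2 - 1))
    return t, df


# Second row of Table 5: unmutated editor versus the first mutant column.
# Assumption: "±" denotes the standard deviation of 3 replicates.
t, df = welch_t(18.6, 2.4, 2.4, 1.2)
print(f"t = {t:.2f}, df = {df:.2f}")
```

Converting the statistic to a p-value requires the t distribution, for example from a statistics package, and is left out of this sketch.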
TABLE 6 primers
Primer(s) Sequence(s)
P1 CCACGTCTCAGATGATCGCCAAGAGCGAGGACGAAATCGGCGACGCTACCGCCAAGTACTTCT
P2 CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGCTTTCCAGC
P3 CCACGTCTCACATCTTCCGGTCGTCGTACACCTTGTCGTCGCCGTCCACGAACTCGCTTTCCAGC
P4 ACATTCAGACCATAGCACTTCC
P5 GTCTTCCTTGGTGCCTTATCAG
P6 ACAGCACGGCTACATTTGG
P7 CCAGGAGTTTGAGCAAGATGAG
P8 CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGGACTCCAGC
P9 CCACGTCTCACATCTTGTCGTCGTCGTCCACGGGGTCGTCGCCGTCCACGAACTCGGACTCCAGC
P10 CCAGGTCTCACGGTGTTTCGTCCTTTCCACAAGATATATAAAGC
P11 CCAGGTCTCAACCGTCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P12 CCAGGTCTCAACCGTAATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P13 CCAGGTCTCAACCGCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P14 CCAGGTCTCAACCGAAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P15 CCAGGTCTCAACCGATGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
P16 CCAGGTCTCAACCGAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC
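The gRNA primers P14-P16 can be cross-checked against the pre-spacer sequences of gRNA4-gRNA6 in Table 3. The minimal sketch below tests whether each primer contains the corresponding pre-spacer (with the "-" deletion marker removed) immediately followed by the first 20 nt of the sequence shared by these primers downstream of the pre-spacer. The helper `embedded` and the constant names are hypothetical.

```python
# First 20 nt shared by P14-P16 directly downstream of the pre-spacer.
SCAFFOLD_START = "GTTTTAGAGCTAGAAATAGC"

# Pre-spacer sequences as printed in Table 3.
PRE_SPACERS = {
    "gRNA4": "GAAGAGCAGGGTCATGAAGG",
    "gRNA5": "GAtGAGCAGGGTCATGAAGG",
    "gRNA6": "G-AGAGCAGGGTCATGAAGG",
}

# Primers P14, P15 and P16 copied from Table 6.
PRIMERS = {
    "gRNA4": "CCAGGTCTCAACCGAAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
    "gRNA5": "CCAGGTCTCAACCGATGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
    "gRNA6": "CCAGGTCTCAACCGAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
}


def embedded(pre_spacer, primer):
    """True if the primer carries the pre-spacer followed by the scaffold start."""
    insert = pre_spacer.replace("-", "").upper()
    return insert + SCAFFOLD_START in primer


for name in PRE_SPACERS:
    print(name, embedded(PRE_SPACERS[name], PRIMERS[name]))
```

The same check could be applied to P11-P13 once the gRNA1-gRNA3 pre-spacers from Table 1 are supplied.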
The present application has been described in detail above. It will be apparent to those skilled in the art that the application can be practiced over a wide range of equivalent parameters, concentrations and conditions without departing from its spirit and scope and without undue experimentation. Although the application has been described with reference to specific embodiments, it will be appreciated that it may be further modified. In general, this application is intended to cover any variations, uses or adaptations that follow the principles of the application, including such departures from the present disclosure as come within known or customary practice in the art to which the application pertains. Some of the essential features may be applied within the scope of the claims that follow.
Sequence listing
<110> Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences
<120> A low off-target base editor and construction thereof
<160> 20
<170>PatentIn version 3.5
<210> 1
<211> 4104
<212> DNA
<213> Artificial sequence (Artificial sequence)
<400> 1
atggacaagaagtacagcatcggcctggccatcggcaccaactctgtgggctgggccgtg 60
atcaccgacgagtacaaggtgcccagcaagaaattcaaggtgctgggcaacaccgaccgg 120
cacagcatcaagaagaacctgatcggagccctgctgttcgacagcggcgaaacagccgag 180
gccacccggctgaagagaaccgccagaagaagatacaccagacggaagaaccggatctgc 240
tatctgcaagagatcttcagcaacgagatggccaaggtggacgacagcttcttccacaga 300
ctggaagagtccttcctggtggaagaggataagaagcacgagcggcaccccatcttcggc 360
aacatcgtggacgaggtggcctaccacgagaagtaccccaccatctaccacctgagaaag 420
aaactggtggacagcaccgacaaggccgacctgcggctgatctatctggccctggcccac 480
atgatcaagttccggggccacttcctgatcgagggcgacctgaaccccgacaacagcgac 540
gtggacaagctgttcatccagctggtgcagacctacaaccagctgttcgaggaaaacccc 600
atcaacgccagcggcgtggacgccaaggccatcctgtctgccagactgagcaagagcaga 660
cggctggaaaatctgatcgcccagctgcccggcgagaagaagaatggcctgttcggaaac 720
ctgattgccctgagcctgggcctgacccccaacttcaagagcaacttcgacctggccgag 780
gatgccaaactgcagctgagcaaggacacctacgacgacgacctggacaacctgctggcc 840
cagatcggcgaccagtacgccgacctgtttctggccgccaagaacctgtccgacgccatc 900
ctgctgagcgacatcctgagagtgaacaccgagatcaccaaggcccccctgagcgcctct 960
atgatcaagagatacgacgagcaccaccaggacctgaccctgctgaaagctctcgtgcgg 1020
cagcagctgcctgagaagtacaaagagattttcttcgaccagagcaagaacggctacgcc 1080
ggctacattgacggcggagccagccaggaagagttctacaagttcatcaagcccatcctg 1140
gaaaagatggacggcaccgaggaactgctcgtgaagctgaacagagaggacctgctgcgg 1200
aagcagcggaccttcgacaacggcagcatcccccaccagatccacctgggagagctgcac 1260
gccattctgcggcggcaggaagatttttacccattcctgaaggacaaccgggaaaagatc 1320
gagaagatcctgaccttccgcatcccctactacgtgggccctctggccaggggaaacagc 1380
agattcgcctggatgaccagaaagagcgaggaaaccatcaccccctggaacttcgaggaa 1440
gtggtggacaagggcgcttccgcccagagcttcatcgagcggatgaccaacttcgataag 1500
aacctgcccaacgagaaggtgctgcccaagcacagcctgctgtacgagtacttcaccgtg 1560
tataacgagctgaccaaagtgaaatacgtgaccgagggaatgagaaagcccgccttcctg 1620
agcggcgagcagaaaaaggccatcgtggacctgctgttcaagaccaaccggaaagtgacc 1680
gtgaagcagctgaaagaggactacttcaagaaaatcgagtgcttcgactccgtggaaatc 1740
tccggcgtggaagatcggttcaacgcctccctgggcacataccacgatctgctgaaaatt 1800
atcaaggacaaggacttcctggacaatgaggaaaacgaggacattctggaagatatcgtg 1860
ctgaccctgacactgtttgaggacagagagatgatcgaggaacggctgaaaacctatgcc 1920
cacctgttcgacgacaaagtgatgaagcagctgaagcggcggagatacaccggctggggc 1980
aggctgagccggaagctgatcaacggcatccgggacaagcagtccggcaagacaatcctg 2040
gatttcctgaagtccgacggcttcgccaacagaaacttcatgcagctgatccacgacgac 2100
agcctgacctttaaagaggacatccagaaagcccaggtgtccggccagggcgatagcctg 2160
cacgagcacattgccaatctggccggcagccccgccattaagaagggcatcctgcagaca 2220
gtgaaggtggtggacgagctcgtgaaagtgatgggccggcacaagcccgagaacatcgtg 2280
atcgaaatggccagagagaaccagaccacccagaagggacagaagaacagccgcgagaga 2340
atgaagcggatcgaagagggcatcaaagagctgggcagccagatcctgaaagaacacccc 2400
gtggaaaacacccagctgcagaacgagaagctgtacctgtactacctgcagaatgggcgg 2460
gatatgtacgtggaccaggaactggacatcaaccggctgtccgactacgatgtggaccat 2520
atcgtgcctcagagctttctgaaggacgactccatcgacaacaaggtgctgaccagaagc 2580
gacaagaaccggggcaagagcgacaacgtgccctccgaagaggtcgtgaagaagatgaag 2640
aactactggcggcagctgctgaacgccaagctgattacccagagaaagttcgacaatctg 2700
accaaggccgagagaggcggcctgagcgaactggataaggccggcttcatcaagagacag 2760
ctggtggaaacccggcagatcacaaagcacgtggcacagatcctggactcccggatgaac 2820
actaagtacgacgagaatgacaagctgatccgggaagtgaaagtgatcaccctgaagtcc 2880
aagctggtgtccgatttccggaaggatttccagttttacaaagtgcgcgagatcaacaac 2940
taccaccacgcccacgacgcctacctgaacgccgtcgtgggaaccgccctgatcaaaaag 3000
taccctaagctggaaagcgagttcgtgtacggcgactacaaggtgtacgacgtgcggaag 3060
atgatcgccaagagcgagcaggaaatcggcaaggctaccgccaagtacttcttctacagc 3120
aacatcatgaactttttcaagaccgagattaccctggccaacggcgagatccggaagcgg 3180
cctctgatcgagacaaacggcgaaaccggggagatcgtgtgggataagggccgggatttt 3240
gccaccgtgcggaaagtgctgagcatgccccaagtgaatatcgtgaaaaagaccgaggtg 3300
cagacaggcggcttcagcaaagagtctatcctgcccaagaggaacagcgataagctgatc 3360
gccagaaagaaggactgggaccctaagaagtacggcggcttcgacagccccaccgtggcc 3420
tattctgtgctggtggtggccaaagtggaaaagggcaagtccaagaaactgaagagtgtg 3480
aaagagctgctggggatcaccatcatggaaagaagcagcttcgagaagaatcccatcgac 3540
tttctggaagccaagggctacaaagaagtgaaaaaggacctgatcatcaagctgcctaag 3600
tactccctgttcgagctggaaaacggccggaagagaatgctggcctctgccggcgaactg 3660
cagaagggaaacgaactggccctgccctccaaatatgtgaacttcctgtacctggccagc 3720
cactatgagaagctgaagggctcccccgaggataatgagcagaaacagctgtttgtggaa 3780
cagcacaagcactacctggacgagatcatcgagcagatcagcgagttctccaagagagtg 3840
atcctggccgacgctaatctggacaaagtgctgtccgcctacaacaagcaccgggataag 3900
cccatcagagagcaggccgagaatatcatccacctgtttaccctgaccaatctgggagcc 3960
cctgccgccttcaagtactttgacaccaccatcgaccggaagaggtacaccagcaccaaa 4020
gaggtgctggacgccaccctgatccaccagagcatcaccggcctgtacgagacacggatc 4080
gacctgtctcagctgggaggtgac 4104
<210> 2
<211> 1368
<212> PRT
<213> artificial sequence
<400> 2
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 3
<211> 1368
<212> PRT
<213> artificial sequence
<400> 3
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Asp Gly Asp Asp Lys Val Asp Asp Asp Asp Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 4
<211> 1368
<212> PRT
<213> artificial sequence
<400> 4
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile AsnAsn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val GlyThr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Asp Gly Asp Asp Lys Val Tyr Asp Asp Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
ThrLys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
GlyLeu Tyr Glu ThrArg Ile Asp Leu Ser Gln Leu GlyGly Asp
1355 1360 1365
<210> 5
<211> 4104
<212> DNA
<213> artificial sequence
<400> 5
atggacaagaagtacagcatcggcctggccatcggcaccaactctgtgggctgggccgtg 60
atcaccgacgagtacaaggtgcccagcaagaaattcaaggtgctgggcaacaccgaccgg 120
cacagcatcaagaagaacctgatcggagccctgctgttcgacagcggcgaaacagccgag 180
gccacccggctgaagagaaccgccagaagaagatacaccagacggaagaaccggatctgc 240
tatctgcaagagatcttcagcaacgagatggccaaggtggacgacagcttcttccacaga 300
ctggaagagtccttcctggtggaagaggataagaagcacgagcggcaccccatcttcggc 360
aacatcgtggacgaggtggcctaccacgagaagtaccccaccatctaccacctgagaaag 420
aaactggtggacagcaccgacaaggccgacctgcggctgatctatctggccctggcccac 480
atgatcaagttccggggccacttcctgatcgagggcgacctgaaccccgacaacagcgac 540
gtggacaagctgttcatccagctggtgcagacctacaaccagctgttcgaggaaaacccc 600
atcaacgccagcggcgtggacgccaaggccatcctgtctgccagactgagcaagagcaga 660
cggctggaaaatctgatcgcccagctgcccggcgagaagaagaatggcctgttcggaaac 720
ctgattgccctgagcctgggcctgacccccaacttcaagagcaacttcgacctggccgag 780
gatgccaaactgcagctgagcaaggacacctacgacgacgacctggacaacctgctggcc 840
cagatcggcgaccagtacgccgacctgtttctggccgccaagaacctgtccgacgccatc 900
ctgctgagcgacatcctgagagtgaacaccgagatcaccaaggcccccctgagcgcctct 960
atgatcaagagatacgacgagcaccaccaggacctgaccctgctgaaagctctcgtgcgg 1020
cagcagctgcctgagaagtacaaagagattttcttcgaccagagcaagaacggctacgcc 1080
ggctacattgacggcggagccagccaggaagagttctacaagttcatcaagcccatcctg 1140
gaaaagatggacggcaccgaggaactgctcgtgaagctgaacagagaggacctgctgcgg 1200
aagcagcggaccttcgacaacggcagcatcccccaccagatccacctgggagagctgcac 1260
gccattctgcggcggcaggaagatttttacccattcctgaaggacaaccgggaaaagatc 1320
gagaagatcctgaccttccgcatcccctactacgtgggccctctggccaggggaaacagc 1380
agattcgcctggatgaccagaaagagcgaggaaaccatcaccccctggaacttcgaggaa 1440
gtggtggacaagggcgcttccgcccagagcttcatcgagcggatgaccaacttcgataag 1500
aacctgcccaacgagaaggtgctgcccaagcacagcctgctgtacgagtacttcaccgtg 1560
tataacgagctgaccaaagtgaaatacgtgaccgagggaatgagaaagcccgccttcctg 1620
agcggcgagcagaaaaaggccatcgtggacctgctgttcaagaccaaccggaaagtgacc 1680
gtgaagcagctgaaagaggactacttcaagaaaatcgagtgcttcgactccgtggaaatc 1740
tccggcgtggaagatcggttcaacgcctccctgggcacataccacgatctgctgaaaatt 1800
atcaaggacaaggacttcctggacaatgaggaaaacgaggacattctggaagatatcgtg 1860
ctgaccctgacactgtttgaggacagagagatgatcgaggaacggctgaaaacctatgcc 1920
cacctgttcgacgacaaagtgatgaagcagctgaagcggcggagatacaccggctggggc 1980
aggctgagccggaagctgatcaacggcatccgggacaagcagtccggcaagacaatcctg 2040
gatttcctgaagtccgacggcttcgccaacagaaacttcatgcagctgatccacgacgac 2100
agcctgacctttaaagaggacatccagaaagcccaggtgtccggccagggcgatagcctg 2160
cacgagcacattgccaatctggccggcagccccgccattaagaagggcatcctgcagaca 2220
gtgaaggtggtggacgagctcgtgaaagtgatgggccggcacaagcccgagaacatcgtg 2280
atcgaaatggccagagagaaccagaccacccagaagggacagaagaacagccgcgagaga 2340
atgaagcggatcgaagagggcatcaaagagctgggcagccagatcctgaaagaacacccc 2400
gtggaaaacacccagctgcagaacgagaagctgtacctgtactacctgcagaatgggcgg 2460
gatatgtacgtggaccaggaactggacatcaaccggctgtccgactacgatgtggaccat 2520
atcgtgcctcagagctttctgaaggacgactccatcgacaacaaggtgctgaccagaagc 2580
gacaagaaccggggcaagagcgacaacgtgccctccgaagaggtcgtgaagaagatgaag 2640
aactactggcggcagctgctgaacgccaagctgattacccagagaaagttcgacaatctg 2700
accaaggccgagagaggcggcctgagcgaactggataaggccggcttcatcaagagacag 2760
ctggtggaaacccggcagatcacaaagcacgtggcacagatcctggactcccggatgaac 2820
actaagtacgacgagaatgacaagctgatccgggaagtgaaagtgatcaccctgaagtcc 2880
aagctggtgtccgatttccggaaggatttccagttttacaaagtgcgcgagatcaacaac 2940
taccaccacgcccacgacgcctacctgaacgccgtcgtgggaaccgccctgatcaaaaag 3000
taccctaagctggaaagcgagttcgtgtacggcgactacaaggtgtacgacgtgcggaag 3060
atgatcgccaagagcgagcaggaaatcggcaaggctaccgccaagtacttcttctacagc 3120
aacatcatgaactttttcaagaccgagattaccctggccaacggcgagatccggaagcgg 3180
cctctgatcgagacaaacggcgaaaccggggagatcgtgtgggataagggccgggatttt 3240
gccaccgtgcggaaagtgctgagcatgccccaagtgaatatcgtgaaaaagaccgaggtg 3300
cagacaggcggcttcagcaaagagtctatcaggcccaagaggaacagcgataagctgatc 3360
gccagaaagaaggactgggaccctaagaagtacggcggcttcgtcagccccaccgtggcc 3420
tattctgtgctggtggtggccaaagtggaaaagggcaagtccaagaaactgaagagtgtg 3480
aaagagctgctggggatcaccatcatggaaagaagcagcttcgagaagaatcccatcgac 3540
tttctggaagccaagggctacaaagaagtgaaaaaggacctgatcatcaagctgcctaag 3600
tactccctgttcgagctggaaaacggccggaagagaatgctggcctctgccagattcctg 3660
cagaagggaaacgaactggccctgccctccaaatatgtgaacttcctgtacctggccagc 3720
cactatgagaagctgaagggctcccccgaggataatgagcagaaacagctgtttgtggaa 3780
cagcacaagcactacctggacgagatcatcgagcagatcagcgagttctccaagagagtg 3840
atcctggccgacgctaatctggacaaagtgctgtccgcctacaacaagcaccgggataag 3900
cccatcagagagcaggccgagaatatcatccacctgtttaccctgaccaatctgggagcc 3960
cctagggccttcaagtactttgacaccaccatcgaccggaaggtgtacaggagcaccaaa 4020
gaggtgctggacgccaccctgatccaccagagcatcaccggcctgtacgagacacggatc 4080
gacctgtctcagctgggaggtgac 4104
<210> 6
<211> 1368
<212> PRT
<213> artificial sequence
<400> 6
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile GlyThrAsn Ser Val
1 5 10 15
GlyTrp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu GlyAsnThr Asp Arg His Ser Ile Lys LysAsn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala ThrArg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe GlyAsn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe ArgGly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu GluAsn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser ArgArg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys LysAsnGly Leu Phe GlyAsn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val AsnThr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys AsnGly Tyr Ala Gly Tyr Ile Asp GlyGly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
GlyThr Glu Glu Leu Leu Val Lys Leu AsnArg Glu Asp Leu LeuArg
385 390 395 400
Lys Gln ArgThr Phe Asp AsnGly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu ArgArg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp AsnArg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala ArgGlyAsn Ser Arg Phe Ala Trp
450 455 460
Met ThrArg Lys Ser Glu GluThr Ile Thr Pro TrpAsn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile AsnAsn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val GlyThr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Val Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
Phe Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Arg Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Val Tyr Arg Ser
1325 1330 1335
ThrLys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
GlyLeu Tyr Glu ThrArg Ile Asp Leu Ser Gln Leu GlyGly Asp
1355 1360 1365
<210> 7
<211> 1368
<212> PRT
<213> artificial sequence
<400> 7
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile GlyThrAsn Ser Val
1 5 10 15
GlyTrp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu GlyAsnThr Asp Arg His Ser Ile Lys LysAsn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala ThrArg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe GlyAsn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe ArgGly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu GluAsn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser ArgArg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys LysAsnGly Leu Phe GlyAsn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val AsnThr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys AsnGly Tyr Ala Gly Tyr Ile Asp GlyGly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
GlyThr Glu Glu Leu Leu Val Lys Leu AsnArg Glu Asp Leu LeuArg
385 390 395 400
Lys Gln ArgThr Phe Asp AsnGly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu ArgArg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp AsnArg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala ArgGlyAsn Ser Arg Phe Ala Trp
450 455 460
Met ThrArg Lys Ser Glu GluThr Ile Thr Pro TrpAsn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile AsnAsn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val GlyThr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Asp Gly Asp Asp Lys Val Asp Asp Asp Asp Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Val Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
Phe Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Arg Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Val Tyr Arg Ser
1325 1330 1335
ThrLys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
GlyLeu Tyr Glu ThrArg Ile Asp Leu Ser Gln Leu GlyGly Asp
1355 1360 1365
<210> 8
<211> 1368
<212> PRT
<213> artificial sequence
<400> 8
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile GlyThrAsn Ser Val
1 5 10 15
GlyTrp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu GlyAsnThr Asp Arg His Ser Ile Lys LysAsn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala ThrArg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe GlyAsn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe ArgGly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu GluAsn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser ArgArg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys LysAsnGly Leu Phe GlyAsn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val AsnThr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys AsnGly Tyr Ala Gly Tyr Ile Asp GlyGly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
GlyThr Glu Glu Leu Leu Val Lys Leu AsnArg Glu Asp Leu LeuArg
385 390 395 400
Lys Gln ArgThr Phe Asp AsnGly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu ArgArg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp AsnArg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala ArgGlyAsn Ser Arg Phe Ala Trp
450 455 460
Met ThrArg Lys Ser Glu GluThr Ile Thr Pro TrpAsn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile AsnAsn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val GlyThr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Asp Gly Asp Asp Pro Val Asp Asp Asp Asp Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
ThrLys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
GlyLeu Tyr Glu ThrArg Ile Asp Leu Ser Gln Leu GlyGly Asp
1355 1360 1365
<210> 9
<211> 4104
<212> DNA
<213> artificial sequence
<400> 9
atggacaagaagtactcgatcggcctcgccatcgggacgaactcagttggctgggccgtg 60
atcaccgacgagtacaaggtgccctctaagaagttcaaggtcctggggaacaccgaccgc 120
cattccatcaagaagaacctcatcggcgctctcctgttcgacagcggggagaccgctgag 180
gctacgaggctcaagagaaccgctaggcgccggtacacgagaaggaagaacaggatctgc 240
tacctccaagagattttctccaacgagatggccaaggttgacgattcattcttccaccgc 300
ctggaggagtctttcctcgtggaggaggataagaagcacgagcggcatcccatcttcggc 360
aacatcgtggacgaggttgcctaccacgagaagtaccctacgatctaccatctgcggaag 420
aagctcgtggactccaccgataaggcggacctcagactgatctacctcgctctggcccac 480
atgatcaagttccgcggccatttcctgatcgagggggatctcaacccagacaacagcgat 540
gttgacaagctgttcatccaactcgtgcagacctacaaccaactcttcgaggagaacccg 600
atcaacgcctctggcgtggacgcgaaggctatcctgtccgcgaggctctcgaagtccagg 660
aggctggagaacctgatcgctcagctcccaggcgagaagaagaacggcctgttcgggaac 720
ctcatcgctctcagcctggggctcaccccgaacttcaagtcgaacttcgatctcgctgag 780
gacgccaagctgcaactctccaaggacacctacgacgatgacctcgataacctcctggcc 840
cagatcggcgatcaatacgcggacctgttcctcgctgccaagaacctgtcggacgccatc 900
ctcctgtcagatatcctccgcgtgaacaccgagatcacgaaggctccactctctgcctcc 960
atgatcaagcgctacgacgagcaccatcaggatctgaccctcctgaaggcgctggtccgc 1020
caacagctcccggagaagtacaaggagattttcttcgatcagtcgaagaacggctacgct 1080
gggtacatcgacggcggggcctcacaagaggagttctacaagttcatcaagccaatcctg 1140
gagaagatggacggcacggaggagctcctggtgaagctcaacagggaggacctcctgcgg 1200
aagcagagaaccttcgataacggcagcatcccccaccaaatccatctcggggagctgcac 1260
gccatcctgagaaggcaagaggacttctaccctttcctcaaggataaccgggagaagatc 1320
gagaagatcctgaccttcagaatcccatactacgtcggccctctcgcgcgggggaactca 1380
agattcgcttggatgacccgcaagtctgaggagaccatcacgccgtggaacttcgaggag 1440
gtggtggacaagggcgctagcgctcagtcgttcatcgagaggatgaccaacttcgacaag 1500
aacctgcccaacgagaaggtgctccctaagcactcgctcctgtacgagtacttcaccgtc 1560
tacaacgagctcacgaaggtgaagtacgtcaccgagggcatgcgcaagccagcgttcctg 1620
tccggggagcagaagaaggctatcgtggacctcctgttcaagaccaaccggaaggtcacg 1680
gttaagcaactcaaggaggactacttcaagaagatcgagtgcttcgattcggtcgagatc 1740
agcggcgttgaggaccgcttcaacgccagcctcgggacctaccacgatctcctgaagatc 1800
atcaaggataaggacttcctggacaacgaggagaacgaggatatcctggaggacatcgtg 1860
ctgaccctcacgctgttcgaggacagggagatgatcgaggagcgcctgaagacgtacgcc 1920
catctcttcgatgacaaggtcatgaagcaactcaagcgccggagatacaccggctggggg 1980
aggctgtcccgcaagctcatcaacggcatccgggacaagcagtccgggaagaccatcctc 2040
gacttcctcaagagcgatggcttcgccaacaggaacttcatgcaactgatccacgatgac 2100
agcctcaccttcaaggaggatatccaaaaggctcaagtgagcggccagggggactcgctg 2160
cacgagcatatcgcgaacctcgctggctcccccgcgatcaagaagggcatcctccagacc 2220
gtgaaggttgtggacgagctcgtgaaggtcatgggccggcacaagcctgagaacatcgtc 2280
atcgagatggccagagagaaccaaaccacgcagaaggggcaaaagaactctagggagcgc 2340
atgaagcgcatcgaggagggcatcaaggagctggggtcccaaatcctcaaggagcaccca 2400
gtggagaacacccaactgcagaacgagaagctctacctgtactacctccagaacggcagg 2460
gatatgtacgtggaccaagagctggatatcaaccgcctcagcgattacgacgtcgatcat 2520
atcgttccccagtctttcctgaaggatgactccatcgacaacaaggtcctcaccaggtcg 2580
gacaagaaccgcggcaagtcagataacgttccatctgaggaggtcgttaagaagatgaag 2640
aactactggaggcagctcctgaacgccaagctgatcacgcaaaggaagttcgacaacctc 2700
accaaggctgagagaggcgggctctcagagctggacaaggccggcttcatcaagcggcag 2760
ctggtcgagaccagacaaatcacgaagcacgttgcgcaaatcctcgactctcggatgaac 2820
acgaagtacgatgagaacgacaagctgatcagggaggttaaggtgatcaccctgaagtct 2880
aagctcgtctccgacttcaggaaggatttccagttctacaaggttcgcgagatcaacaac 2940
taccaccatgcccatgacgcttacctcaacgctgtggtcggcaccgctctgatcaagaag 3000
tacccaaagctggagtccgagttcgtgtacggggactacaaggtttacgatgtgcgcaag 3060
atgatcgccaagtcggagcaagagatcggcaaggctaccgccaagtacttcttctactca 3120
aacatcatgaacttcttcaagaccgagatcacgctggccaacggcgagatccggaagaga 3180
ccgctcatcgagaccaacggcgagacgggggagatcgtgtgggacaagggcagggatttc 3240
gcgaccgtccgcaaggttctctccatgccccaggtgaacatcgtcaagaagaccgaggtc 3300
caaacgggcgggttctcaaaggagtctatcctgcctaagcggaacagcgacaagctcatc 3360
gccagaaagaaggactgggacccaaagaagtacggcgggttcgacagccctaccgtggcc 3420
tactcggtcctggttgtggcgaaggttgagaagggcaagtccaagaagctcaagagcgtg 3480
aaggagctcctggggatcaccatcatggagaggtccagcttcgagaagaacccaatcgac 3540
ttcctggaggccaagggctacaaggaggtgaagaaggacctgatcatcaagctcccgaag 3600
tactctctcttcgagctggagaacggcaggaagagaatgctggcttccgctggcgagctc 3660
cagaaggggaacgagctcgcgctgccaagcaagtacgtgaacttcctctacctggcttcc 3720
cactacgagaagctcaagggcagcccggaggacaacgagcaaaagcagctgttcgtcgag 3780
cagcacaagcattacctcgacgagatcatcgagcaaatctccgagttcagcaagcgcgtg 3840
atcctcgccgacgcgaacctggataaggtcctctccgcctacaacaagcaccgggacaag 3900
cccatcagagagcaagcggagaacatcatccatctcttcaccctgacgaacctcggcgct 3960
cctgctgctttcaagtacttcgacaccacgatcgatcggaagagatacacctccacgaag 4020
gaggtcctggacgcgaccctcatccaccagtcgatcaccggcctgtacgagacgaggatc 4080
gacctctcacaactcggcggggat 4104
<210> 10
<211> 66
<212> DNA
<213> artificial sequence
<400> 10
gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 11
<211> 66
<212> DNA
<213> artificial sequence
<400> 11
gacggcgacgacaaggtgtacgacgaccggaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 12
<211> 66
<212> DNA
<213> artificial sequence
<400> 12
gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 13
<211> 66
<212> DNA
<213> artificial sequence
<400> 13
gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 14
<211> 66
<212> DNA
<213> artificial sequence
<400> 14
gacggcgacgaccccgtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 15
<211> 20
<212> DNA
<213> artificial sequence
<400> 15
gtcatcttagtcattacctg 20
<210> 16
<211> 20
<212> DNA
<213> artificial sequence
<400> 16
gtaatcttagtcattacctg 20
<210> 17
<211> 19
<212> DNA
<213> artificial sequence
<400> 17
gcatcttagtcattacctg 19
<210> 18
<211> 20
<212> DNA
<213> artificial sequence
<400> 18
gaagagcagggtcatgaagg 20
<210> 19
<211> 20
<212> DNA
<213> artificial sequence
<400> 19
gatgagcagggtcatgaagg 20
<210> 20
<211> 19
<212> DNA
<213> artificial sequence
<400> 20
gagagcagggtcatgaagg 19

Claims (11)

1. A base editor comprising or expressing a Cas9 mutant, the Cas9 mutant being a mutant protein obtained by mutating amino acid residues between positions 1010-1031 of Cas9, or a mutant protein obtained by mutating amino acid residues 1010, 1013, 1014, 1016, 1018, 1019, 1027 and/or 1031 of Cas9, or a mutant protein obtained by mutating Cas9 by any one or more of the following:
m1) mutating the tyrosine residue at position 1010 of Cas9 to an aspartic acid residue;
m2) mutating the tyrosine residue at position 1013 of Cas9 to an aspartic acid residue;
m3) mutating the lysine residue at position 1014 of Cas9 to a proline residue;
m4) mutating the tyrosine residue at position 1016 of Cas9 to an aspartic acid residue;
m5) mutating the valine residue at position 1018 of Cas9 to an aspartic acid residue;
m6) mutating the arginine residue at position 1019 of Cas9 to an aspartic acid residue;
m7) mutating the glutamine residue at position 1027 of Cas9 to an aspartic acid residue;
m8) mutating the lysine residue at position 1031 of Cas9 to an aspartic acid residue.
2. The base editor of claim 1 wherein: the Cas9 is the protein shown as sequence 2 or sequence 6 in the sequence table.
3. The base editor of claim 1 or 2, wherein: the Cas9 mutant is a mutant protein obtained by performing the seven mutations m1), m2), m4), m5), m6), m7) and m8) on Cas9, or a mutant protein obtained by performing the five mutations m1), m2), m5), m7) and m8) on Cas9, or a mutant protein obtained by performing all eight mutations m1) to m8) on Cas9.
4. The base editor of any one of claims 1-3 wherein: the base editor further contains or expresses an sgRNA targeting a target sequence and/or a domain with base modification activity.
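The substitutions m1) to m8) and the three combinations recited in claim 3 can be sketched as a small lookup table plus a function that applies them. This is only an illustrative sketch: the names `CLAIMED`, `COMBINATIONS`, `placeholder_cas9` and `apply_mutations` are hypothetical, and the placeholder protein is not the Cas9 of the sequence table. It is filled with `X` everywhere except the eight claimed wild-type residues, and its length of 1368 is an assumption (4104 nt / 3, the length of the coding sequence shown in the sequence listing).

```python
# (label, 1-based position, wild-type residue, mutant residue), as recited in claim 1.
CLAIMED = [
    ("m1", 1010, "Y", "D"),
    ("m2", 1013, "Y", "D"),
    ("m3", 1014, "K", "P"),
    ("m4", 1016, "Y", "D"),
    ("m5", 1018, "V", "D"),
    ("m6", 1019, "R", "D"),
    ("m7", 1027, "Q", "D"),
    ("m8", 1031, "K", "D"),
]

# The three combinations named in claim 3.
COMBINATIONS = {
    "seven": {"m1", "m2", "m4", "m5", "m6", "m7", "m8"},
    "five": {"m1", "m2", "m5", "m7", "m8"},
    "eight": {label for label, _, _, _ in CLAIMED},
}


def placeholder_cas9(length=1368):
    """Hypothetical stand-in protein: 'X' except at the claimed positions."""
    residues = ["X"] * length
    for _, position, wild_type_residue, _ in CLAIMED:
        residues[position - 1] = wild_type_residue
    return "".join(residues)


def apply_mutations(protein, labels):
    """Return a copy of `protein` carrying the substitutions named in `labels`."""
    residues = list(protein)
    for label, position, wild_type_residue, mutant_residue in CLAIMED:
        if label not in labels:
            continue
        found = residues[position - 1]
        if found != wild_type_residue:
            # Guard against applying a substitution to the wrong residue.
            raise ValueError(
                f"{label}: expected {wild_type_residue} at {position}, found {found}"
            )
        residues[position - 1] = mutant_residue
    return "".join(residues)


wild_type = placeholder_cas9()
five_fold = apply_mutations(wild_type, COMBINATIONS["five"])
print(five_fold[1009:1031])  # residues 1010-1031 of the five-fold mutant
```

The wild-type check makes the function fail loudly if the numbering of the input protein does not match the numbering used in the claims, which is the most likely source of error when a different Cas9 sequence is substituted for the placeholder.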
5. The Cas9 mutant of any one of claims 1-4.
6. A biological material associated with the Cas9 mutant of any one of claims 1-4, which is any one of the following B1) to B5):
B1) a nucleic acid molecule encoding the Cas9 mutant of any one of claims 1-4;
B2) an expression cassette comprising the nucleic acid molecule of B1);
B3) a recombinant vector comprising the nucleic acid molecule of B1) or a recombinant vector comprising the expression cassette of B2);
B4) a recombinant microorganism comprising the nucleic acid molecule of B1), or a recombinant microorganism comprising the expression cassette of B2), or a recombinant microorganism comprising the recombinant vector of B3);
B5) a cell line containing the nucleic acid molecule of B1) or a cell line containing the expression cassette of B2).
7. The biological material according to claim 6, wherein: the nucleic acid molecule of B1) is a mutant gene obtained by mutating the nucleotide sequence between positions 3028 and 3093 of the Cas9 gene;
further, the nucleic acid molecule of B1) is obtained by mutating positions 3028 to 3093 of the Cas9 gene shown as sequence 1 in the sequence table into sequence 10 or sequence 11, or by mutating positions 3028 to 3093 of the Cas9 gene shown as sequence 5 in the sequence table into sequence 12, or by mutating positions 3028 to 3093 of the Cas9 gene shown as sequence 9 in the sequence table into sequence 13 or sequence 14.
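The relationship between the nucleotide window 3028 to 3093 and the residue window 1010 to 1031 can be checked with a short sketch. It assumes the gene is numbered from the first nucleotide of the first codon and that the standard genetic code applies; both are assumptions, not statements from the claims. The three 66-nt strings are copied from entries 10, 11 and 14 of the sequence listing, and the helper names (`codon_span`, `translate`, `claimed_mutations_present`) are hypothetical.

```python
# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# (label, residue position, mutant residue), as recited in claim 1.
CLAIMED = [
    ("m1", 1010, "D"), ("m2", 1013, "D"), ("m3", 1014, "P"), ("m4", 1016, "D"),
    ("m5", 1018, "D"), ("m6", 1019, "D"), ("m7", 1027, "D"), ("m8", 1031, "D"),
]
WINDOW_START = 1010  # first residue encoded by the 66-nt window

# 66-nt replacement windows copied from the sequence listing.
SEQ_10 = "gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc" "ggcgac"
SEQ_11 = "gacggcgacgacaaggtgtacgacgaccggaagatgatcgccaagagcgaggacgaaatc" "ggcgac"
SEQ_14 = "gacggcgacgaccccgtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc" "ggcgac"


def codon_span(position):
    """1-based (first, last) nucleotide coordinates of a 1-based residue position."""
    return 3 * (position - 1) + 1, 3 * position


def translate(dna):
    """Translate an in-frame DNA string with the standard genetic code."""
    dna = dna.lower()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def claimed_mutations_present(window_dna):
    """List the claim-1 labels whose mutant residue is encoded by the window."""
    peptide = translate(window_dna)
    return [
        label for label, position, mutant in CLAIMED
        if peptide[position - WINDOW_START] == mutant
    ]


print(codon_span(1010), codon_span(1031))
for name, dna in (("10", SEQ_10), ("11", SEQ_11), ("14", SEQ_14)):
    print(name, translate(dna), claimed_mutations_present(dna))
```

Under these assumptions, residue 1010 starts at nucleotide 3028 and residue 1031 ends at nucleotide 3093, so the 66-nt window covers exactly 22 codons. The label lists printed for the three windows can be compared with the seven-, five- and eight-mutation combinations recited in claim 3.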
8. A product comprising the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7.
9. Use of the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7, or the product of claim 8, in any one of the following:
y1) converting a cytosine nucleotide residue in a biological cell or animal or subject to a thymine nucleotide residue;
y2) converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;
y3) converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;
y4) preparing a product for converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject;
y5) preparing a product for converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;
y6) preparing a product for converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;
y7) serving as, or preparing, a single-base editing reagent or kit;
y8) serving as, or preparing, a medicament for gene therapy;
y9) treating or preventing a disease;
y10) preparing a product for treating or preventing a disease.
10. Any one of the following methods:
x1) a method of converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing of a cytosine nucleotide residue of interest using the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7, or the product of claim 8 to effect conversion of the cytosine nucleotide residue of interest to a thymine nucleotide residue;
x2) a method of converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on an adenine nucleotide residue of interest using the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7, or the product of claim 8, to convert the adenine nucleotide residue of interest to a guanine nucleotide residue;
x3) a method of converting a cytosine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: the conversion of a cytosine nucleotide residue of interest into a guanine nucleotide residue is accomplished by base editing the cytosine nucleotide residue of interest with a base editor as defined in any one of claims 1-4, or a Cas9 mutant as defined in claim 5, or a biological material as defined in claim 6 or 7, or a product as defined in claim 8.
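The three conversions named in x1) to x3) can be illustrated at the level of sequence outcome only. The sketch below is a toy model under stated assumptions: it says nothing about how the base editor selects or modifies a residue, the function name `convert_base` is hypothetical, and the example string is an arbitrary placeholder rather than a sequence from this document.

```python
# Outcome of each conversion: (base before editing, base after editing).
CONVERSIONS = {
    "C>T": ("C", "T"),  # x1) cytosine to thymine
    "A>G": ("A", "G"),  # x2) adenine to guanine
    "C>G": ("C", "G"),  # x3) cytosine to guanine
}


def convert_base(sequence, position, conversion):
    """Return `sequence` with the base at 1-based `position` converted.

    Raises ValueError if the base at that position is not the source base
    of the requested conversion.
    """
    source, product = CONVERSIONS[conversion]
    index = position - 1
    found = sequence[index].upper()
    if found != source:
        raise ValueError(
            f"{conversion} needs {source} at position {position}, found {found}"
        )
    return sequence[:index] + product + sequence[index + 1:]


example = "GATTACA"  # arbitrary placeholder sequence
print(convert_base(example, 6, "C>T"))
print(convert_base(example, 2, "A>G"))
print(convert_base(example, 6, "C>G"))
```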
11. The use according to claim 9, or the method according to claim 10, characterized in that: the biological cell is a mammalian cell; the animal is a mammal.
CN202210518789.4A 2022-05-13 2022-05-13 Low off-target base editor and construction thereof Pending CN117089572A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210518789.4A CN117089572A (en) 2022-05-13 2022-05-13 Low off-target base editor and construction thereof

Publications (1)

Publication Number Publication Date
CN117089572A true CN117089572A (en) 2023-11-21

Family

ID=88777528

Country Status (1)

Country Link
CN (1) CN117089572A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination