CN110029096B - Adenine base editing tool and application thereof - Google Patents

Info

Publication number
CN110029096B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910382569.1A
Other languages
Chinese (zh)
Other versions
CN110029096A (en)
Inventor
黄诗圣
李向阳
黄行许
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ShanghaiTech University
Original Assignee
ShanghaiTech University
Application filed by ShanghaiTech University
Priority to CN201910382569.1A
Publication of CN110029096A
Application granted
Publication of CN110029096B
Legal status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04002 Adenine deaminase (3.5.4.2)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Abstract

The invention relates to the technical field of biology, in particular to an adenine base editing tool and application thereof. The invention provides a fusion protein comprising an ecTadA-ecTadA* dimer fragment and an SpCas9-NG D10A nickase fragment, wherein the ecTadA-ecTadA* dimer fragment comprises an ecTadA fragment and an ecTadA* fragment. The fusion protein provided by the invention can use NG as the PAM sequence and convert A at positions 4-7 from the 5' end of the sgRNA into G, thereby increasing the number of sites that can be targeted for base editing.

Description

Adenine base editing tool and application thereof
Technical Field
The invention relates to the technical field of biology, in particular to an adenine base editing tool and application thereof.
Background
CRISPR/Cas9 is currently the most widely used gene editing technology. Guided by an sgRNA, Cas9 reaches the designated region and exerts its nuclease activity, cleaving between the third and fourth base pairs upstream of the PAM. The resulting DNA double-strand breaks (DSBs) trigger the cell's own DNA repair mechanisms, which fall broadly into HDR (homology-directed repair) and NHEJ (non-homologous end joining). HDR repairs the break precisely using a template, whereas NHEJ introduces random insertions or deletions; NHEJ dominates the repair process.
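The cleavage position described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `find_cut_sites`, the 20-nt protospacer length, the NGG pattern and the demo sequence are all hypothetical and are not taken from the patent, and only the forward strand is scanned.

```python
import re

def find_cut_sites(seq, pam_regex="[ACGT]GG", spacer_len=20):
    """Return (protospacer, PAM, cut_index) for each forward-strand site.

    cut_index is a 0-based index into seq: the blunt cut falls between
    seq[cut_index - 1] and seq[cut_index], i.e. between the 3rd and 4th
    bases upstream of the PAM.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_regex))
    sites = []
    for match in pattern.finditer(seq):
        pam_start = match.start() + spacer_len
        sites.append((match.group(1), match.group(2), pam_start - 3))
    return sites

# Hypothetical demo sequence: 20-nt protospacer followed by an AGG PAM.
demo = "TTTT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TTT"
for protospacer, pam, cut in find_cut_sites(demo):
    print(protospacer, pam, demo[:cut] + "|" + demo[cut:])
```

The printed `|` marks where the double-strand break would be expected for that site.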
The advent of CRISPR/Cas9 has made gene manipulation very convenient, but precise genome editing cannot be achieved through the random insertions or deletions of NHEJ, and supplying a homologous recombination vector or single-stranded DNA (ssODN) after cleavage is inefficient and time-consuming. Furthermore, DSBs caused by Cas9 cleavage may lead to large-fragment deletions in the genome, which poses a safety hazard.
David Liu et al. of Harvard University reported that fusing a deaminase to Cas9 D10A nickase (nCas9), in which the RuvC domain is inactivated, can introduce point mutations (C-to-T or A-to-G) at single bases of the genome without causing DSBs. There are two such tools: the cytosine base editor (CBE) and the adenine base editor (ABE).
The fusion protein of cytosine deaminase or adenine deaminase with nCas9 reaches the target site under the guidance of the sgRNA and binds to the DNA strand complementary to the sgRNA. Cytosine deaminase deaminates cytosine (C) within a certain surrounding range to form uracil (U); U pairs with adenine (A), and after DNA replication U is ultimately replaced by thymine (T), the base complementary to A. Similarly, adenine deaminase deaminates adenine (A) within a certain surrounding range to form hypoxanthine (I); I pairs with cytosine (C), and after DNA replication I is ultimately replaced by guanine (G), the base complementary to C. C-to-T or A-to-G conversion is thereby achieved.
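The two conversion routes described above (A to I to G, and C to U to T) can be traced with a minimal Python sketch. The names `DEAMINATION`, `READ_AS` and `edit_outcome` are hypothetical, and the model is deliberately idealized: every listed position carrying the target base is deaminated, and replication always resolves the intermediate to the final base.

```python
# Deamination step for each editor type: (target base, deaminated product).
DEAMINATION = {"ABE": ("A", "I"), "CBE": ("C", "U")}
# Base that finally replaces the deaminated intermediate after replication.
READ_AS = {"I": "G", "U": "T"}

def edit_outcome(strand, positions, editor):
    """Return (intermediate, final) strands after idealized base editing.

    positions are 1-based; positions not carrying the editor's target
    base are left unchanged.
    """
    target, product = DEAMINATION[editor]
    bases = list(strand.upper())
    for pos in positions:
        if bases[pos - 1] == target:
            bases[pos - 1] = product
    intermediate = "".join(bases)
    final = "".join(READ_AS.get(base, base) for base in intermediate)
    return intermediate, final

print(edit_outcome("GACA", [2, 4], "ABE"))
print(edit_outcome("GACA", [3], "CBE"))
```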
Disclosure of Invention
In view of the above-mentioned drawbacks of the prior art, an object of the present invention is to provide an adenine base editing tool and its use for solving the problems in the prior art.
To achieve the above and other related objects, in one aspect, the present invention provides a fusion protein comprising an ecTadA-ecTadA* dimer fragment and an SpCas9-NG D10A nickase fragment, wherein the ecTadA-ecTadA* dimer fragment comprises an ecTadA fragment and an ecTadA* fragment.
In some embodiments of the invention, the amino acid sequence of the ecTadA fragment comprises:
a) An amino acid sequence as shown in SEQ ID NO. 57; or
b) An amino acid sequence having 80% or more sequence similarity to SEQ ID NO. 57 and having the function of the amino acid sequence defined in a), preferably being capable of forming a dimer with an ecTadA* fragment, the dimer having adenine deaminase activity.
In some embodiments of the invention, the amino acid sequence of the ecTadA* fragment comprises:
c) An amino acid sequence as shown in SEQ ID NO. 58; or
d) An amino acid sequence having 80% or more sequence similarity to SEQ ID NO. 58 and having the function of the amino acid sequence defined in c), preferably being capable of forming a dimer with an ecTadA fragment, the dimer having adenine deaminase activity.
In some embodiments of the invention, the amino acid sequence of the SpCas9-NG D10A nickase fragment comprises:
e) An amino acid sequence as shown in SEQ ID NO. 59; or
f) An amino acid sequence having 80% or more sequence similarity to SEQ ID NO. 59 and having the function of the amino acid sequence defined in e), preferably being capable of recognizing NG as PAM.
In some embodiments of the invention, the fusion protein comprises, in order from the 5' end to the 3' end, an ecTadA-ecTadA* dimer fragment and an SpCas9-NG D10A nickase fragment.
In some embodiments of the invention, the ecTadA-ecTadA* dimer fragment comprises, in order from the 5' end to the 3' end, an ecTadA fragment and an ecTadA* fragment.
In some embodiments of the invention, the fusion protein further comprises a nuclear localization signal fragment, preferably located at the 5' and/or 3' end of the ecTadA-ecTadA* dimer fragment and the SpCas9-NG D10A nickase fragment; preferably, the amino acid sequence of the nuclear localization signal fragment is shown in SEQ ID NO. 60.
In some embodiments of the invention, the amino acid sequence of the fusion protein is shown as SEQ ID No. 61.
In another aspect, the invention provides an isolated polynucleotide encoding the fusion protein.
In another aspect the invention provides a construct comprising said isolated polynucleotide.
In another aspect, the invention provides an expression system comprising said construct or said polynucleotide integrated into the genome.
In some embodiments of the invention, the host cell of the expression system is selected from eukaryotic cells or prokaryotic cells, preferably from mouse cells, human cells, more preferably from mouse brain neuroma cells, human embryonic kidney cells, or human cervical cancer cells, more preferably from N2a cells, HEK293FT cells, or Hela cells.
In a further aspect the invention provides the use of said fusion protein, said isolated polynucleotide, said construct or said expression system in gene editing.
In some embodiments of the invention, the use is in particular in gene editing of eukaryotic organisms.
In another aspect, the invention provides a base editing system comprising the fusion protein, the base editing system further comprising sgRNA.
In another aspect, the present invention provides a gene editing method comprising: carrying out gene editing by means of the fusion protein or the base editing system.
Drawings
FIG. 1 shows a schematic representation of the structure of the ABEmax-NG plasmid constructed in example 1.
FIG. 2 shows the experimental results of Example 2 of the present invention, wherein a is a schematic diagram of the enhanced green fluorescent protein reporter system; b shows microscopy images of the reporter system in HEK293FT cells; c shows the flow cytometry results; d shows the Sanger sequencing results.
FIG. 3 shows the experimental results of Example 3 of the present invention, wherein a shows the deep-seq analysis of ABEmax-NG editing at 16 endogenous gene loci in N2a cells; b shows the base distribution at the corresponding editing sites; c shows the corresponding mutation efficiency, mutation by-products and indel statistics; d shows the deep-seq analysis of ABEmax-NG editing at 4 endogenous gene loci in mouse embryos; e shows the corresponding mutation efficiency, mutation by-products and indel statistics.
FIG. 4 shows the experimental results of Example 3 of the present invention, wherein a shows the adjacent off-target results of ABEmax-NG editing at 16 endogenous gene loci in N2a cells; b shows the adjacent off-target results of ABEmax-NG editing at 4 endogenous gene loci in mouse embryos.
FIG. 5 shows the experimental results of Example 4 of the present invention, wherein a shows the editing results at the splice sites of 4 endogenous genes in N2a cells using ABEmax-NG; b shows the corresponding RNA splicing detection results; c shows the corresponding Sanger sequencing results, verifying the newly emerging splice isoforms.
FIG. 6 shows the experimental results of Example 5 of the present invention, wherein a is a schematic diagram of the BBS2 gene splice acceptor site mutant mouse obtained using ABEmax-NG; b shows the genotyping results of the mutant mice; c shows the genotyping results of different tissues of the mutant mice; d shows the RNA isoform detection results of different tissues of the mutant mice.
Detailed Description
Through extensive exploratory research, the inventors provide a fusion protein that is a novel adenine base editing tool and can recognize NG as PAM, thereby broadening the targeting range of base editing; the invention was completed on this basis.
The first aspect of the present invention provides a fusion protein comprising an ecTadA-ecTadA* dimer fragment and an SpCas9-NG D10A nickase fragment, wherein the ecTadA-ecTadA* dimer fragment comprises an ecTadA fragment and an ecTadA* fragment. Using NG as the PAM sequence and in cooperation with an sgRNA for the target region, the fusion protein achieves efficient A-to-G base editing at positions 4-7 from the 5' end of the sgRNA in the target region, with high mutation accuracy and low adjacent off-target editing.
In the fusion protein provided by the invention, the amino acid sequence of the ecTadA fragment may include: a) an amino acid sequence as shown in SEQ ID NO. 57; or b) an amino acid sequence having 80% or more sequence similarity to SEQ ID NO. 57 and having the function of the amino acid sequence defined in a). Specifically, the amino acid sequence in b) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO. 57 by substitution, deletion or addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, 1 to 3, 1, 2 or 3) amino acids, or by addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, 1 to 3, 1, 2 or 3) amino acids at the N-terminus and/or the C-terminus, and that has the function of the polypeptide fragment shown in SEQ ID NO. 57, for example the ability to form a dimer with an ecTadA* fragment, the dimer having adenine deaminase activity, more specifically the ability to deaminate adenine (A) to produce hypoxanthine (I). The amino acid sequence in b) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO. 57.
In the fusion protein provided by the present invention, the amino acid sequence of the ecTadA* fragment may include: c) an amino acid sequence as shown in SEQ ID NO. 58; or d) an amino acid sequence having 80% or more sequence similarity to SEQ ID NO. 58 and having the function of the amino acid sequence defined in c). Specifically, the amino acid sequence in d) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO. 58 by substitution, deletion or addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, 1 to 3, 1, 2 or 3) amino acids, or by addition of one or more (specifically, 1 to 50, 1 to 30, 1 to 20, 1 to 10, 1 to 5, 1 to 3, 1, 2 or 3) amino acids at the N-terminus and/or the C-terminus, and that has the function of the polypeptide fragment shown in SEQ ID NO. 58, for example the ability to form a dimer with an ecTadA fragment, the dimer having adenine deaminase activity, more specifically the ability to deaminate adenine (A) to produce hypoxanthine (I). The amino acid sequence in d) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO. 58.
In the fusion protein provided by the invention, the amino acid sequence of the SpCas9-NG D10A nickase fragment may include: e) an amino acid sequence as shown in SEQ ID NO. 59; or f) an amino acid sequence having 80% or more sequence similarity to SEQ ID NO. 59 and having the function of the amino acid sequence defined in e). Specifically, the amino acid sequence in f) refers to a polypeptide fragment that is obtained from the amino acid sequence shown in SEQ ID NO. 59 by substitution, deletion or addition of one or more (specifically, 1-50, 1-30, 1-20, 1-10, 1-5, 1-3, 1, 2 or 3) amino acids, or by addition of one or more (specifically, 1-50, 1-30, 1-20, 1-10, 1-5, 1-3, 1, 2 or 3) amino acids at the N-terminus and/or the C-terminus, and that has the function of the polypeptide fragment shown in SEQ ID NO. 59, for example the ability to recognize NG as PAM; specifically, with an NG sequence as the PAM, it can cooperate with the sgRNA for a specific target site and the ecTadA-ecTadA* dimer fragment to achieve A-to-G editing of bases at positions 4-7 from the 5' end of the sgRNA. The amino acid sequence in f) may have 80%, 85%, 90%, 93%, 95%, 97%, or 99% or more similarity to SEQ ID NO. 59. Target recognition by CRISPR/Cas9 systems typically requires a protospacer adjacent motif (PAM) next to the target site. Cas9 from Streptococcus pyogenes (SpCas9), the Cas9 enzyme most frequently used for genome editing, can only recognize PAMs of the NGG sequence, which limits the range of the genome that can be targeted, whereas the SpCas9-NG D10A nickase fragment of the present invention can recognize NG sequences as PAM.
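To make the effect of relaxing the PAM from NGG to NG concrete, the following Python sketch counts candidate PAM positions on both strands of a sequence. It is an illustration only: `revcomp`, `count_target_sites`, the 20-nt protospacer requirement and the demo sequence are assumptions introduced here, not part of the patent.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def count_target_sites(seq, pam, spacer_len=20):
    """Count PAM matches on both strands that leave room for a protospacer.

    pam uses N as a wildcard, e.g. "NGG" or "NG". A match at index i is
    counted only if at least spacer_len bases lie upstream of it.
    """
    total = 0
    for strand in (seq.upper(), revcomp(seq)):
        for i in range(spacer_len, len(strand) - len(pam) + 1):
            window = strand[i:i + len(pam)]
            if all(p == "N" or p == b for p, b in zip(pam, window)):
                total += 1
    return total

# Hypothetical demo sequence.
demo = "A" * 20 + "CGTAGG"
print("NGG sites:", count_target_sites(demo, "NGG"))
print("NG sites: ", count_target_sites(demo, "NG"))
```

Every NGG match is also an NG match, so the NG count can never be smaller than the NGG count for the same sequence.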
In the fusion proteins provided by the present invention, the substitution, deletion or addition may be a conservative amino acid substitution. The term "conservative amino acid substitution" refers specifically to the replacement of an amino acid residue by another amino acid residue having a similar side chain. Families of amino acid residues with similar side chains are known to those skilled in the art and include, but are not limited to, basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Conservative amino acid substitutions may more particularly include, but are not limited to, the cases listed in the following tables. The numbers in Table 1 (amino acid similarity matrix) represent the similarity between two amino acids; a substitution is considered conservative when the number is 0 or more. Table 2 is an exemplary scheme of conservative amino acid substitutions.
TABLE 1
    C   G   P   S   A   T   D   E   N   Q   H   K   R   V   M   I   L   F   Y   W
W  -8  -7  -6  -2  -6  -5  -7  -7  -4  -5  -3  -3   2  -6  -4  -5  -2   0   0  17
Y   0  -5  -5  -3  -3  -3  -4  -4  -2  -4   0  -4  -5  -2  -2  -1  -1   7  10
F  -4  -5  -5  -3  -4  -3  -6  -5  -4  -5  -2  -5  -4  -1   0   1   2   9
L  -6  -4  -3  -3  -2  -2  -4  -3  -3  -2  -2  -3  -3   2   4   2   6
I  -2  -3  -2  -1  -1   0  -2  -2  -2  -2  -2  -2  -2   4   2   5
M  -5  -3  -2  -2  -1  -1  -3  -2   0  -1  -2   0   0   2   6
V  -2  -1  -1  -1   0   0  -2  -2  -2  -2  -2  -2  -2   4
R  -4  -3   0   0  -2  -1  -1  -1   0   1   2   3   6
K  -5  -2  -1   0  -1   0   0   0   1   1   0   5
H  -3  -2   0  -1  -1  -1   1   1   2   3   6
Q  -5  -1   0  -1   0  -1   2   2   1   4
N  -4   0  -1   1   0   0   2   1   2
E  -5   0  -1   0   0   0   3   4
D  -5   1  -1   0   0   0   4
T  -2   0   0   1   1   3
A  -2   1   1   1   2
S   0   1   1   1
P  -3  -1   6
G  -3   5
C  12
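The rule stated for Table 1 (a substitution counts as conservative when the similarity number is 0 or more) can be applied programmatically. The Python sketch below transcribes the lower-triangular matrix of Table 1 and looks up pairs symmetrically; the names `ORDER`, `ROWS`, `SCORES`, `similarity` and `is_conservative` are hypothetical helpers introduced for illustration.

```python
# Column order of Table 1.
ORDER = "C G P S A T D E N Q H K R V M I L F Y W".split()

# Lower-triangular rows of Table 1: row label, then one score per column
# up to and including the row's own amino acid.
ROWS = """
W -8 -7 -6 -2 -6 -5 -7 -7 -4 -5 -3 -3 2 -6 -4 -5 -2 0 0 17
Y 0 -5 -5 -3 -3 -3 -4 -4 -2 -4 0 -4 -5 -2 -2 -1 -1 7 10
F -4 -5 -5 -3 -4 -3 -6 -5 -4 -5 -2 -5 -4 -1 0 1 2 9
L -6 -4 -3 -3 -2 -2 -4 -3 -3 -2 -2 -3 -3 2 4 2 6
I -2 -3 -2 -1 -1 0 -2 -2 -2 -2 -2 -2 -2 4 2 5
M -5 -3 -2 -2 -1 -1 -3 -2 0 -1 -2 0 0 2 6
V -2 -1 -1 -1 0 0 -2 -2 -2 -2 -2 -2 -2 4
R -4 -3 0 0 -2 -1 -1 -1 0 1 2 3 6
K -5 -2 -1 0 -1 0 0 0 1 1 0 5
H -3 -2 0 -1 -1 -1 1 1 2 3 6
Q -5 -1 0 -1 0 -1 2 2 1 4
N -4 0 -1 1 0 0 2 1 2
E -5 0 -1 0 0 0 3 4
D -5 1 -1 0 0 0 4
T -2 0 0 1 1 3
A -2 1 1 1 2
S 0 1 1 1
P -3 -1 6
G -3 5
C 12
"""

def build_scores(rows=ROWS, order=ORDER):
    """Map each unordered amino acid pair to its Table 1 score."""
    scores = {}
    for line in rows.strip().splitlines():
        row_aa, *values = line.split()
        for col_aa, value in zip(order, values):
            scores[frozenset((row_aa, col_aa))] = int(value)
    return scores

SCORES = build_scores()

def similarity(a, b):
    return SCORES[frozenset((a.upper(), b.upper()))]

def is_conservative(a, b):
    """Apply the stated rule: a score of 0 or more is conservative."""
    return similarity(a, b) >= 0

print(similarity("D", "E"), is_conservative("D", "E"))
print(similarity("W", "C"), is_conservative("W", "C"))
```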
TABLE 2
[Table 2 is provided only as an image in the original publication: Figure BDA0002053811490000061]
The fusion protein provided by the invention may further comprise a nuclear localization signal (NLS) fragment, which may be located at the N-terminus of the ecTadA-ecTadA* dimer fragment or at the C-terminus of the SpCas9-NG D10A nickase fragment. The nuclear localization signal fragment may comprise the amino acid sequence shown in SEQ ID NO. 60.
In the fusion protein provided by the invention, the fusion protein may comprise, in order from the 5' end to the 3' end, an ecTadA-ecTadA* dimer fragment and an SpCas9-NG D10A nickase fragment, and the ecTadA-ecTadA* dimer fragment may comprise, in order from the 5' end to the 3' end, an ecTadA fragment and an ecTadA* fragment. In one embodiment of the invention, the amino acid sequence of the fusion protein is shown in SEQ ID NO. 61.
In a second aspect, the invention provides an isolated polynucleotide encoding the fusion protein provided in the first aspect of the invention.
In a third aspect, the invention provides a construct comprising the isolated polynucleotide provided in the second aspect of the invention. Such a construct can generally be obtained by inserting the isolated polynucleotide into a suitable expression vector, which can be selected by one of skill in the art; for example, such expression vectors may include, but are not limited to, pCMV expression vectors, pSV2 expression vectors, pGL3 expression vectors, and the like.
In a fourth aspect, the invention provides an expression system that comprises the construct provided in the third aspect of the invention or has the exogenous isolated polynucleotide provided in the second aspect of the invention integrated into its genome. The expression system may be a host cell capable of expressing the fusion protein described above; the fusion protein can cooperate with the sgRNA so that it is localized to the target region, enabling base editing of the target region. In another embodiment of the present invention, the host cell may be a eukaryotic cell and/or a prokaryotic cell, more specifically a mouse cell, a human cell, etc., more specifically a mouse brain neuroma cell, a human embryonic kidney cell, a human cervical cancer cell, etc., more specifically an N2a cell, a HEK293FT cell, a Hela cell, etc.
The fifth aspect of the present invention provides the use of the fusion protein provided in the first aspect of the present invention, or the isolated polynucleotide provided in the second aspect of the present invention, or the construct provided in the third aspect of the present invention, or the expression system provided in the fourth aspect of the present invention in gene editing, preferably in gene editing of a eukaryotic organism. The eukaryotic organism may specifically be a metazoan, including but not limited to mice and the like. The uses may specifically be, but are not limited to, base editing from A to G (more specifically, A-to-G base editing at positions 4-7 from the 5' end of the sgRNA in the target region), editing of splice acceptor/donor sites to modulate RNA splicing, construction of mouse disease models using the present tool, or treatment of human disease, etc. In a specific embodiment of the present invention, the genes to be edited may be CHRNE (ID: 11448), SIX6 (ID: 20476), ITPR1 (ID: 16438), TMEM67 (ID: 329795), LMBR1 (ID: 56873), NFIX (ID: 18032), DES (ID: 13346), BHLHA9 (ID: 320522), NDUFS1 (ID: 227197), HOXD13 (ID: 15433), AKR1C19 (ID: 432720), LMNA (ID: 16905), WNT5A (ID: 22418), SUFU (ID: 24069), GJA8 (ID: 14616), EYA1 (ID: 14048), BBS2 (ID: 67378), OFD1 (ID: 237222), MYO7A (ID: 17921), SEPN1 (ID: 74777), and the like. In another embodiment of the present invention, the object being edited may be an embryo, a cell, or the like.
In a sixth aspect, the invention provides a base editing system comprising the fusion protein provided in the first aspect of the invention, the base editing system further comprising an sgRNA. One skilled in the art can select appropriate sgRNAs targeting a specific site based on the region of the gene to be edited. For example, the sequence of the sgRNA may be at least partially complementary to the target region so that it can cooperate with the fusion protein and localize the fusion protein to the target region, thereby achieving A-to-G base editing at positions 4-7 from the 5' end of the sgRNA in the target region, specifically adenine deamination, that is, conversion of adenine (A) into hypoxanthine (I). The base editing system provided by the invention greatly broadens the targetable range of the genome: it can use the NG sequence as PAM, achieves A-to-G base editing at positions 4-7 from the 5' end of the sgRNA target region, and has high mutation accuracy and low adjacent off-target editing. In one embodiment of the invention, the sgRNA may target genes such as CHRNE, SIX6, ITPR1, TMEM67, LMBR1, NFIX, DES, BHLHA9, NDUFS1, HOXD13, AKR1C19, LMNA, WNT5A, SUFU, GJA8, EYA1, BBS2, OFD1, MYO7A, SEPN1, etc. In another embodiment of the present invention, the object being edited may be an embryo, a cell, or the like.
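The editing window described above (A at positions 4-7 counted from the 5' end of the sgRNA) can be located with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, the example protospacer is the 20-nt targeting portion of the sgRNA oligo given as SEQ ID NO. 3 in Example 2, and the model idealizes editing by converting every A inside the window.

```python
def editable_adenines(protospacer, window=(4, 7)):
    """Return 1-based positions of A inside the editing window."""
    start, end = window
    return [
        pos
        for pos, base in enumerate(protospacer.upper(), start=1)
        if base == "A" and start <= pos <= end
    ]

def apply_abe(protospacer, window=(4, 7)):
    """Idealized outcome: every A inside the window becomes G."""
    hits = set(editable_adenines(protospacer, window))
    return "".join(
        "G" if pos in hits else base
        for pos, base in enumerate(protospacer.upper(), start=1)
    )

# Targeting portion of the sgRNA oligo from Example 2 (SEQ ID NO. 3 without
# its 4-nt cloning overhang).
spacer = "AGCACTACACGCCGTAGGTC"
print(editable_adenines(spacer))
print(apply_abe(spacer))
```

In practice, editing efficiency differs between positions within the window, so a real outcome is a distribution over alleles rather than a single edited sequence.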
A seventh aspect of the present invention provides a base editing method comprising: performing gene editing by means of the fusion protein provided in the first aspect of the present invention or the base editing system provided in the sixth aspect of the present invention. For example, the gene editing method may include: culturing the expression system provided in the fourth aspect of the invention under appropriate conditions so as to express the fusion protein, wherein the fusion protein can carry out base editing on a target region in the presence of an sgRNA that cooperates with the fusion protein and targets that region. Methods of providing conditions under which the sgRNA is present are known to those skilled in the art; for example, an expression system capable of expressing the sgRNA may be cultured under appropriate conditions, and this expression system may be a host cell comprising an expression vector that contains a polynucleotide encoding the sgRNA, or a host cell having a polynucleotide encoding the sgRNA integrated in its chromosome. In a specific embodiment of the present invention, the gene editing is in vitro gene editing.
By combining SpCas9-NG, which recognizes NG PAMs, with ABE, the invention provides a novel adenine base editing tool. The fusion protein provided by the invention can use NG as the PAM sequence and convert A at positions 4-7 from the 5' end of the sgRNA into G, which increases the number of target sites for base editing, makes more sgRNAs available for editing splice acceptor/donor sites, and enlarges the number of genes in the genome whose RNA splicing can be modulated by base editing. In addition, the fusion protein has the advantages of high editing accuracy and low adjacent off-target editing, and has good prospects for industrialization.
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present invention with reference to specific examples. The invention may also be practiced or carried out in other embodiments, and the details of the present description may be modified or varied without departing from the spirit and scope of the present invention.
Before the embodiments of the invention are explained in further detail, it is to be understood that the invention is not limited in its scope to the particular embodiments described below; it is also to be understood that the terminology used in the examples of the invention is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the invention; in the description and claims of the invention, the singular forms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise.
Where numerical ranges are provided in the examples, it is understood that, unless otherwise stated herein, both endpoints of each numerical range and any value between the two endpoints may be selected. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In addition to the specific methods, devices and materials used in the embodiments, any methods, devices and materials of the prior art similar or equivalent to those described in the embodiments of the present invention may be used to practice the present invention, according to the knowledge of one skilled in the art and the description of the present invention.
Unless otherwise indicated, the experimental methods, detection methods, and preparation methods disclosed in the present invention employ techniques conventional in the art of molecular biology, biochemistry, chromatin structure and analysis, analytical chemistry, cell culture, recombinant DNA techniques, and related arts. These techniques are well described in the prior art literature; see, in particular, Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, Chromatin (P.M. Wassarman and A.P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, Chromatin Protocols (P.B. Becker, ed.), Humana Press, Totowa, 1999, etc.
Example 1
The ABEmax-NG plasmid was constructed first: seven amino acid mutations (R1335V/L1111R/D1135V/G1218R/E1219F/A1322R/T1337R) were introduced into the ABEmax plasmid (purchased from Addgene, #112095) using the Mut Express II Fast Mutagenesis Kit V (Vazyme, C214-02). The sequence of the constructed ABEmax-NG plasmid is shown in SEQ ID NO. 1 of the attached sequence listing.
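The mutation notation used above (wild-type residue, 1-based position, new residue) can be parsed and applied with a short Python sketch. The helper names `parse_mutations` and `apply_mutations` are hypothetical, and the toy protein in the usage line is invented for illustration; the real SpCas9 sequence is not reproduced here.

```python
import re

# The seven substitutions listed in Example 1.
MUTATIONS = "R1335V/L1111R/D1135V/G1218R/E1219F/A1322R/T1337R"

def parse_mutations(spec):
    """Parse 'R1335V/L1111R/...' into (wild_type, position, new) tuples,
    sorted by position."""
    parsed = []
    for token in spec.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError("unrecognized mutation: %r" % token)
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return sorted(parsed, key=lambda item: item[1])

def apply_mutations(protein, mutations):
    """Apply substitutions, checking the wild-type residue at each position."""
    residues = list(protein)
    for wild_type, position, new in mutations:
        if residues[position - 1] != wild_type:
            raise ValueError(
                "expected %s at position %d, found %s"
                % (wild_type, position, residues[position - 1])
            )
        residues[position - 1] = new
    return "".join(residues)

for wild_type, position, new in parse_mutations(MUTATIONS):
    print(position, wild_type, "->", new)

# Toy usage on an invented 3-residue sequence.
print(apply_mutations("MKR", [("R", 3, "V")]))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors when the same list is applied to a differently numbered construct.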
Example 2
In this example, the editing ability of ABEmax-NG was verified in HEK293FT cells using an enhanced green fluorescent protein reporter system.
1.1 Construction of the enhanced green fluorescent protein reporter system plasmid
Two base mutations were introduced into the enhanced green fluorescent protein expression vector using the Mut Express II Fast Mutagenesis Kit V (Vazyme, C214-02). The first is at the third position of the threonine-63 codon, which was mutated to T, A or G; this does not alter the amino acid sequence of the enhanced green fluorescent protein but provides a variety of PAM sequences for testing recognition by ABEmax-NG. The second is at the first position of the glutamine-69 codon, which was mutated to T, converting the codon to a stop codon and thereby abolishing the fluorescence of the enhanced green fluorescent protein. The sequence of the constructed enhanced green fluorescent protein reporter system plasmid is shown in SEQ ID NO. 2 of the attached sequence listing.
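The logic of the two reporter mutations can be checked against the standard genetic code with a small Python sketch. The table below is only an excerpt of the standard code covering the codons involved, the helper names are hypothetical, and the patent text does not state which threonine or glutamine codon the vector actually uses, so all candidates are checked.

```python
# Excerpt of the standard genetic code: only the codons needed here.
CODON_TABLE = {
    "ACT": "Thr", "ACC": "Thr", "ACA": "Thr", "ACG": "Thr",
    "CAA": "Gln", "CAG": "Gln",
    "TAA": "Stop", "TAG": "Stop",
}

def substitute(codon, position, base):
    """Return codon with the base at the 1-based position replaced."""
    bases = list(codon.upper())
    bases[position - 1] = base
    return "".join(bases)

def third_position_is_silent(codon):
    """True if every base at the third position encodes the same residue."""
    original = CODON_TABLE[codon]
    return all(
        CODON_TABLE[substitute(codon, 3, base)] == original for base in "ACGT"
    )

def first_position_to_t(codon):
    """Residue (or Stop) encoded after mutating the first base to T."""
    return CODON_TABLE[substitute(codon, 1, "T")]

for thr_codon in ("ACT", "ACC", "ACA", "ACG"):
    print(thr_codon, "third position silent:", third_position_is_silent(thr_codon))
for gln_codon in ("CAA", "CAG"):
    print(gln_codon, "->", substitute(gln_codon, 1, "T"), first_position_to_t(gln_codon))
```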
1.2 Construction of sgRNA plasmids
The sgRNA was designed and oligos were synthesized; the upstream sequence was 5'-accgAGCACTACACGCCGTAGGTC-3' (SEQ ID NO. 3) and the downstream sequence was 5'-aaacGACCTACGGCGTGTAGTGCT-3' (SEQ ID NO. 4). The upstream and downstream oligos were annealed (95 °C for 5 min; 95 °C to 85 °C at 2 °C/s; 85 °C to 25 °C at 0.1 °C/s; hold at 4 °C) and ligated into the pGL3-U6-sgRNA vector (Addgene #51133) linearized with BsaI (NEB: R0539L). The linearization system was: pGL3-U6-sgRNA 2 μg; buffer (NEB: R0539L) 6 μL; BsaI 2 μL; ddH2O to 60 μL. Digestion was carried out overnight at 37 °C. The ligation system was: T4 ligation buffer (NEB: M0202L) 1 μL; linearized vector 20 ng; annealed oligo fragment (10 μM) 5 μL; T4 ligase (NEB: M0202L) 0.5 μL; ddH2O to 10 μL. Ligation was carried out overnight at 16 °C. The ligated vector was transformed, and colonies were picked and identified. Positive clones were grown with shaking, the plasmid was extracted (Axygen: AP-MN-P-250G), and its concentration was measured.
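The relationship between the two oligos in 1.2 is that the upstream oligo carries the spacer behind an `accg` overhang and the downstream oligo carries the reverse complement of the spacer behind an `aaac` overhang. The sketch below reproduces SEQ ID NO. 3 and SEQ ID NO. 4 from the 20-nt spacer; the function names are hypothetical, and the overhangs are simply those visible in the listed oligos.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def design_sgrna_oligos(spacer, top_overhang="accg", bottom_overhang="aaac"):
    """Build the upstream/downstream oligo pair for a given spacer."""
    spacer = spacer.upper()
    if set(spacer) - set("ACGT"):
        raise ValueError("spacer must contain only A, C, G and T")
    upstream = top_overhang + spacer
    downstream = bottom_overhang + reverse_complement(spacer)
    return upstream, downstream


# Spacer taken from SEQ ID NO. 3 (the part after the 'accg' overhang).
upstream, downstream = design_sgrna_oligos("AGCACTACACGCCGTAGGTC")
print(upstream)    # accgAGCACTACACGCCGTAGGTC
print(downstream)  # aaacGACCTACGGCGTGTAGTGCT
```

The same function applied to the spacer of SEQ ID NO. 5 yields the sequence listed as SEQ ID NO. 6, so it can also serve as a consistency check on the oligo pairs in the sequence listing.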
1.3 Cell culture, transfection and identification
HEK293FT cells (purchased from ATCC) were seeded in high-glucose DMEM (HyClone, SH30022.01B) supplemented with 10% FBS and 1% (v/v) penicillin-streptomycin (Gibco). When the cells reached 80% confluence, the medium was replaced with DMEM containing 10% serum for 2 hours so that the cells recovered to an optimal state. The amounts of plasmid transfected per well were 1 μg of ABEmax-NG plasmid, 0.5 μg of sgRNA plasmid, and 0.5 μg of mutant enhanced green fluorescent protein expression plasmid. The plasmids were mixed in 50 μL of Opti-MEM medium (Gibco, 11058021). […] μL of Lipofectamine 2000 transfection reagent (Thermo, 11668019) was mixed into a separate 50 μL of Opti-MEM medium, mixed well, and allowed to stand for 5 minutes. The Opti-MEM containing the plasmids was added to the Opti-MEM containing Lipofectamine 2000, mixed gently, and allowed to stand for 20 minutes. The resulting mixture was added to each well of a 12-well plate. The medium was replaced with DMEM containing 10% FBS 6 hours after transfection. Fluorescence was observed and photographed under a microscope 48 hours after transfection, and the proportion of fluorescent cells was measured by flow cytometry. GFP-positive cells were sorted and lysed for genotyping; the lysis buffer consisted of 50 mM KCl, 1.5 mM MgCl2, 10 mM Tris pH 8.0, 0.5% Nonidet P-40, 0.5% Tween 20, and 100 μg/mL proteinase K. The sequence near the target was amplified by PCR, and the amplified product was purified and identified by Sanger sequencing. The amplification system was: 2× buffer (Vazyme, P505) 25 μL; dNTP 1 μL; F (10 pmol/μL) 1 μL; R (10 pmol/μL) 1 μL; template 1 μL; DNA polymerase (Vazyme, P505) 0.5 μL; ddH2O to 50 μL.
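The instruction to add ddH2O to 50 μL in the amplification system is simple arithmetic, and the same arithmetic scales the recipe to several reactions. The following is a minimal sketch using the per-reaction volumes listed in 1.3; the function names and the choice of eight reactions are hypothetical.

```python
def water_to_final_volume(components_ul, final_volume_ul):
    """Volume of water (uL) needed to reach the final reaction volume."""
    used = sum(components_ul.values())
    if used > final_volume_ul:
        raise ValueError("components already exceed the final volume")
    return final_volume_ul - used


def scale_reaction(components_ul, final_volume_ul, reactions):
    """Scale a single-reaction recipe, water included, to `reactions` tubes."""
    recipe = dict(components_ul)
    recipe["ddH2O"] = water_to_final_volume(components_ul, final_volume_ul)
    return {name: volume * reactions for name, volume in recipe.items()}


# Per-reaction volumes (uL) from the amplification system in 1.3.
pcr_components = {
    "2x buffer": 25.0,
    "dNTP": 1.0,
    "primer F": 1.0,
    "primer R": 1.0,
    "template": 1.0,
    "DNA polymerase": 0.5,
}

water = water_to_final_volume(pcr_components, 50.0)  # 20.5 uL
eight_reactions = scale_reaction(pcr_components, 50.0, 8)
```

In practice the template is usually added per tube rather than to a shared mix; the sketch scales it with the other components only to keep the arithmetic visible.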
The amplified PCR product was purified as follows: three volumes of PCR-A (Axygen: AP-PCR-250G) were added, and the mixture was loaded onto a column and centrifuged at 12000 rpm for 1 minute; 700 μL of W2 was added and the column was centrifuged for 1 minute; the flow-through was discarded, another 700 μL of W2 was added, and the column was centrifuged for 1 minute; the flow-through was discarded and the empty column was centrifuged for 1 minute; 20 μL of water was added for elution. The results are shown in Fig. 2. ABEmax-NG recognized NG PAMs in HEK293FT cells, repaired the mutated enhanced green fluorescent protein, and restored its fluorescence.
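How the reporter presents an NG PAM, and why an A-to-G change restores fluorescence, can be traced on the sequence itself. The sketch below uses a 40-nt fragment copied from SEQ ID NO. 2 (positions 701 to 740, where `N` is the variable base at the threonine-63 codon) and the spacer from SEQ ID NO. 3. The variable names are hypothetical, and the edit is simulated by plain string substitution, not by any model of deaminase activity.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Reverse complement; N is kept as N."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


# Coding-strand fragment of the reporter, SEQ ID NO. 2 positions 701-740.
REPORTER_FRAGMENT = "TCGTGACCACNCTGACCTACGGCGTGTAGTGCTTCAGCCG"
SPACER = "AGCACTACACGCCGTAGGTC"  # spacer of SEQ ID NO. 3

# The spacer matches the strand opposite to the coding strand.
protospacer_strand = reverse_complement(REPORTER_FRAGMENT)
start = protospacer_strand.find(SPACER)
pam = protospacer_strand[start + len(SPACER): start + len(SPACER) + 3]

# Locate the stop codon on the coding strand and the adenine opposite its T.
stop_index = REPORTER_FRAGMENT.find("TAG")
adenine_index = len(REPORTER_FRAGMENT) - 1 - stop_index
protospacer_position = adenine_index - start + 1  # 1-based, PAM-distal end = 1

# Simulate the A-to-G change and read the codon back on the coding strand.
edited_strand = (
    protospacer_strand[:adenine_index] + "G" + protospacer_strand[adenine_index + 1:]
)
edited_codon = reverse_complement(edited_strand)[stop_index:stop_index + 3]

print(pam)                   # AGN
print(protospacer_position)  # 7
print(edited_codon)          # CAG
```

The PAM comes out as AGN, so changing the third base of the threonine-63 codon changes only the third PAM position while the G in the second position is kept, and the adenine whose conversion turns TAG back into CAG sits at protospacer position 7.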
Example 3
In this example, endogenous gene loci were edited in N2a cells and mouse embryos using ABEmax-NG.
2.1 Construction of sgRNA plasmids
Sixteen mouse endogenous genes were selected: CHRNE (ID: 11448), SIX6 (ID: 20476), ITPR1 (ID: 16438), TMEM67 (ID: 329795), LMBR1 (ID: 56873), NFIX (ID: 18032), DES (ID: 13346), BHLHA9 (ID: 320522), NDUFS1 (ID: 227197), HOXD13 (ID: 15433), AKR1C19 (ID: 432720), LMNA (ID: 16905), WNT5A (ID: 22418), SUFU (ID: 24069), GJA8 (ID: 14616), EYA1 (ID: 14048). The oligos used are shown in the attached sequence listing as SEQ ID NO. 5-36. The sgRNA plasmids were constructed as described in 1.2.
Four mouse endogenous genes were selected: BBS2 (ID: 67378), OFD1 (ID: 237222), MYO7A (ID: 17921), SEPN1 (ID: 74777). sgRNAs were designed, and oligos were synthesized, annealed, and ligated into the linearized pUC57-T7-sgRNA vector (Addgene: 51132) as described in 1.2. The linearization system was as described in 1.2. The oligos used are shown in the attached sequence listing as SEQ ID NO. 37-44.
2.2 Cell culture, transfection and identification
N2a cells (purchased from ATCC) were cultured and transfected as described in 1.3. The amounts of plasmid transfected were 1 μg of ABEmax-NG and 0.5 μg of sgRNA expression vector plasmid; ABEmax was used as a control. GFP-positive cells were sorted 72 hours after transfection; lysis, PCR amplification, and purification were performed as described in 1.3, and the products were identified by second-generation sequencing. The results are shown in Fig. 3 and Fig. 4. ABEmax-NG recognized NG PAMs in N2a cells and efficiently edited endogenous gene loci, with high accuracy, few by-products, and low proximal off-target editing.
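Editing efficiency and by-product rate are typically read from sequencing data as base fractions at the target position. The following is a minimal sketch of that calculation only: it assumes reads that are already aligned and of equal length, the eight toy reads are invented for illustration, and it does not represent the analysis pipeline actually used for Fig. 3 and Fig. 4.

```python
from collections import Counter


def base_fractions(aligned_reads, position):
    """Fraction of each base at a 0-based position across aligned reads."""
    counts = Counter(read[position] for read in aligned_reads)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return {base: count / total for base, count in counts.items()}


def editing_summary(aligned_reads, position, reference_base="A", edited_base="G"):
    """Split reads into edited (A to G), unedited, and other (by-product)."""
    fractions = base_fractions(aligned_reads, position)
    edited = fractions.get(edited_base, 0.0)
    unedited = fractions.get(reference_base, 0.0)
    return {
        "edited": edited,
        "unedited": unedited,
        "by_product": 1.0 - edited - unedited,
    }


# Invented toy reads; the target adenine is at 0-based position 2.
toy_reads = ["CCGTT"] * 5 + ["CCATT"] * 2 + ["CCCTT"]
summary = editing_summary(toy_reads, position=2)
```

With these toy reads the summary is 62.5% edited, 25% unedited, and 12.5% by-product. Applying the same function to neighbouring adenines gives a comparable measure of proximal off-target editing.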
2.3 In vitro transcription of sgRNA
A fragment containing the sgRNA was amplified using the constructed pUC57-T7-sgRNA as the template, with the following primers: F: 5'-TCTCGCGCGTTTCGGTGATGACGG-3' (SEQ ID NO. 45); R: 5'-AAAAAAAGCACCGACTCGGTGCCACTTTTTC-3' (SEQ ID NO. 46). The amplification system was: 2× buffer (Vazyme, P505) 25 μL; dNTP 1 μL; F (10 pmol/μL) 2 μL; R (10 pmol/μL) 2 μL; template 1 ng; DNA polymerase (Vazyme, P505) 0.5 μL; ddH2O to 50 μL. The amplified PCR product was purified as follows: 4 μL of RNAsecure (Life: AM7005) was added per 100 μL volume and the mixture was held at 60 °C for 15 minutes; three volumes of PCR-A (Axygen: AP-PCR-250G) were added, and the mixture was loaded onto a column and centrifuged at 12000 rpm for 1 minute; 500 μL of W2 was added and the column was centrifuged for 1 minute; the empty column was centrifuged for 1 minute; 20 μL of RNase-free water was added for elution.
Transcription was performed using an in vitro transcription kit (Ambion, Life Technologies, AM1354) as follows:
The reaction system was: reaction buffer 1 μL; enzyme mix 1 μL; A 1 μL; T 1 μL; G 1 μL; C 1 μL; template 800 ng; H2O to 10 μL. The system was mixed and reacted at 37 °C for 5 hours. 1 μL of DNase was added and reacted at 37 °C for 15 minutes. The transcribed sgRNA was recovered using a recovery kit (Ambion, Life Technologies, AM1908) as follows: 90 μL of Elution solution was added to the reaction from the previous step and the mixture was transferred to a 1.5 mL EP tube; 350 μL of Binding solution was added and mixed; 250 μL of absolute ethanol was added and mixed; the mixture was loaded onto a column and centrifuged at 10000 rpm for 30 seconds, and the flow-through was discarded; 500 μL of Washing solution was added, the column was centrifuged at 10000 rpm for 30 seconds, and the flow-through was discarded; the empty column was centrifuged for 1 minute; the collection tube was replaced and 100 μL of Elution solution was added for elution; 10 μL of ammonium acetate (Ambion, Life Technologies, AM1908) was added and mixed; 275 μL of absolute ethanol was added and mixed; the sample was placed at -20 °C for 30 minutes, while 70% ethanol was prepared and also placed at -20 °C; the sample was centrifuged at 13000 rpm for 15 minutes at 4 °C; the supernatant was discarded and 500 μL of 70% ethanol was added; the sample was centrifuged for 5 minutes, the liquid was removed, and the pellet was air-dried for 5 minutes; 20 μL of water was added to dissolve the pellet; 1 μL of the sample was used to measure the concentration.
2.4 In vitro transcription of ABEmax-NG and ABEmax
Digestion of the plasmid. This step linearizes the plasmid. The system was: plasmid 10 μg; buffer I (NEB: R0539L) 10 μL; BbsI (NEB: R0539L) 4 μL; H2O to 100 μL. After mixing, the plasmid was digested overnight at 37 °C.
Recovery of the linearized plasmid. 4 μL of RNAsecure (Life: AM7005) was added to the digestion product and reacted at 60 °C for 10 minutes. The remaining steps used a recovery kit (QIAGEN: 28004): 5 volumes of buffer PB were added and the mixture was passed through a column; 750 μL of buffer PE was added and the column was centrifuged; the empty column was centrifuged for 1 minute; the DNA was eluted with 10 μL of water and the concentration was measured.
In vitro transcription. The system was assembled in the following order, as required by the kit (Invitrogen: AM1345): 1 μg of linearized vector; 10 μL of 2× NTP/ARCA; water to 20 μL; 2 μL of T7 enzyme mix; 2 μL of 10× reaction buffer. After mixing, the reaction was carried out at 37 °C for 2 hours. 1 μL of DNase was added and reacted for 15 minutes.
Poly(A) tailing. The transcript was tailed to ensure the stability of the transcribed mRNA. The system was: 20 μL of reaction product; 36 μL of H2O; 20 μL of 5× E-PAP buffer; 10 μL of 25 mM MnCl2; 10 μL of ATP solution; 4 μL of PEP. The reaction system was mixed and reacted at 37 °C for 30 minutes.
Recovery. A recovery kit (QIAGEN: 74104) was used, as follows: 350 μL of buffer RLT was added to the reaction product of the previous step; 250 μL of absolute ethanol was added, and the mixture was passed through a column and centrifuged; 500 μL of RPE was added and the column was centrifuged, and this wash was repeated once; the empty column was centrifuged; 30 μL of water was added for elution. After the concentration was measured, the product was stored at -80 °C.
2.5 Mouse embryo injection, in vitro culture and identification
A mixture of ABEmax-NG or ABEmax (100 ng/μL) and sgRNA (50 ng/μL) was injected into 1-cell embryos. The embryos were cultured in vitro to E4.5 in KSOM medium (Millipore, MR-106-D) at 37 °C and 5% CO2. Embryos were transferred to 200 μL tubes and 5 μL of alkaline lysis solution (200 mM KOH/50 mM dithiothreitol) was added. After incubation at 65 °C for 10 minutes, neutralization solution (900 mM Tris-HCl, pH 8.3/300 mM KCl/200 mM HCl), 5 μL of 400 μM random primer (GenScript, Nanjing, China), 6 μL of 10× PCR buffer (Takara, Dalian, China), 3 μL of dNTPs (2.5 mM), and 1 μL of Taq polymerase (Takara, Dalian, China) were added, and water was added to 60 μL. PCR was performed for 50 cycles, each cycle consisting of 92 °C for 1 minute; 37 °C for 2 minutes, followed by a ramp from 37 °C to 55 °C at 10 s per degree; and 55 °C for 4 minutes. The amplified product was used as the PCR template; PCR amplification and purification of the target fragment were performed as described in 1.3, and the purified product was identified by second-generation sequencing. The results are shown in Fig. 3 and Fig. 4. ABEmax-NG recognized NG PAMs in mouse embryos and efficiently edited endogenous gene loci, with high accuracy, few by-products, and low proximal off-target editing.
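Preparing the injection mixture at the stated final concentrations (100 ng/μL for the editor and 50 ng/μL for the sgRNA) is a dilution calculation of the form C1·V1 = C2·V2. The sketch below is illustrative only: the stock concentrations (500 and 250 ng/μL) and the 20 μL final volume are hypothetical values, since the source does not state them, and the function name is likewise an assumption.

```python
def injection_mix(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Volumes (uL) of each stock and of water for the requested final mix."""
    volumes = {}
    for name, target in target_ng_per_ul.items():
        stock = stock_ng_per_ul[name]
        if stock < target:
            raise ValueError(f"{name} stock is too dilute for the target")
        # C1 * V1 = C2 * V2, solved for V1.
        volumes[name] = target * final_volume_ul / stock
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to fit in the final volume")
    volumes["water"] = water
    return volumes


# Target concentrations from 2.5; stock concentrations are hypothetical.
mix = injection_mix(
    stock_ng_per_ul={"ABEmax-NG": 500.0, "sgRNA": 250.0},
    target_ng_per_ul={"ABEmax-NG": 100.0, "sgRNA": 50.0},
    final_volume_ul=20.0,
)
```

With these assumed stocks the mix is 4 μL of each component plus 12 μL of water; measured stock concentrations from 2.3 and 2.4 would replace the hypothetical values.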
Example 4
In this example, ABEmax-NG was used to regulate RNA splicing of endogenous genes in N2a cells.
3.1 Construction of sgRNA plasmids
Four mouse endogenous genes were selected: BBS2 (ID: 67378), OFD1 (ID: 237222), MYO7A (ID: 17921), SEPN1 (ID: 74777). sgRNAs were designed, and the oligos used are shown in the attached sequence listing as SEQ ID NO. 47-54. The sgRNA plasmids were constructed as described in 1.2.
3.2 Cell culture, transfection and identification
N2a cells (purchased from ATCC) were cultured and transfected as described in 1.3. The amounts of plasmid transfected were 1 μg of ABEmax-NG and 0.5 μg of sgRNA expression vector plasmid; ABEmax was used as a control. GFP-positive cells were sorted 72 hours after transfection and cultured to confluence in 24-well plates. A portion of the cells was lysed, PCR-amplified, and purified as described in 1.3, and the product was identified by second-generation sequencing. Total RNA was extracted from the remaining cells by the TRIzol method, reverse transcription was performed with HiScript II Q RT SuperMix (Vazyme, R222), and the reverse transcription product was PCR-amplified as described in 1.3. RNA isoforms were separated by agarose gel electrophoresis, and the separated isoforms were identified by Sanger sequencing.
The results are shown in Fig. 5. ABEmax-NG recognized NG PAMs in N2a cells and efficiently edited the splice sites of endogenous genes, thereby altering RNA splicing of those genes.
Example 5
In this example, ABEmax-NG was used to obtain BBS2 splice-site mutant mice, demonstrating that ABEmax-NG can be an effective tool for making RNA splicing model mice.
4.1 Construction of sgRNA plasmids
An sgRNA for editing the splice site of the mouse BBS2 gene (ID: 67378) was designed; the oligos used are shown in the attached sequence listing as SEQ ID NO. 55-56. The pUC57-T7-sgRNA vector plasmid was constructed as described in 2.1.
4.2 In vitro transcription
In vitro transcription of the sgRNA and ABEmax-NG was performed as described in 2.3 and 2.4.
4.3 Injection and transplantation of mouse embryos
Mouse embryos were injected as described in 2.5. The injected embryos were transferred to surrogate mice (background: ICR strain).
4.4 Genotyping of mice and detection of RNA splicing
Mouse tail tissue was collected, genomic DNA was extracted by the phenol-chloroform method, PCR amplification and purification were performed as described in 1.3, and the product was identified by Sanger sequencing. Different tissues were collected from the mice; genomic DNA was extracted by the phenol-chloroform method, PCR amplification and purification were performed as described in 1.3, and the product was identified by second-generation sequencing. RNA extraction, reverse transcription, amplification, and purification were performed as described in 3.2, and the product was identified by second-generation sequencing. The results are shown in Fig. 6. BBS2 splice-site mutant mice were successfully obtained using ABEmax-NG, and the splice-site mutations and the corresponding RNA splicing changes were detected in various tissues.
In summary, the present invention effectively overcomes the disadvantages of the prior art and has high industrial utility value.
The above embodiments merely illustrate the principles of the present invention and its effects, and are not intended to limit the invention. Those skilled in the art may modify or vary the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.
Sequence listing
<110> ShanghaiTech University
<120> an adenine base editing tool and use thereof
<160> 61
<170> SIPOSequenceListing 1.0
<210> 1
<211> 8811
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 1
atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60
cccagtacat gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120
ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180
cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240
atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300
ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360
agagatccgc ggccgctaat acgactcact atagggagag ccgccaccat gaaacggaca 420
gccgacggaa gcgagttcga gtcaccaaag aagaagcgga aagtctctga agtcgagttt 480
agccacgagt attggatgag gcacgcactg accctggcaa agcgagcatg ggatgaaaga 540
gaagtccccg tgggcgccgt gctggtgcac aacaatagag tgatcggaga gggatggaac 600
aggccaatcg gccgccacga ccctaccgca cacgcagaga tcatggcact gaggcaggga 660
ggcctggtca tgcagaatta ccgcctgatc gatgccaccc tgtatgtgac actggagcca 720
tgcgtgatgt gcgcaggagc aatgatccac agcaggatcg gaagagtggt gttcggagca 780
cgggacgcca agaccggcgc agcaggctcc ctgatggatg tgctgcacca ccccggcatg 840
aaccaccggg tggagatcac agagggaatc ctggcagacg agtgcgccgc cctgctgagc 900
gatttcttta gaatgcggag acaggagatc aaggcccaga agaaggcaca gagctccacc 960
gactctggag gatctagcgg aggatcctct ggaagcgaga caccaggcac aagcgagtcc 1020
gccacaccag agagctccgg cggctcctcc ggaggatcct ctgaggtgga gttttcccac 1080
gagtactgga tgagacatgc cctgaccctg gccaagaggg cacgcgatga gagggaggtg 1140
cctgtgggag ccgtgctggt gctgaacaat agagtgatcg gcgagggctg gaacagagcc 1200
atcggcctgc acgacccaac agcccatgcc gaaattatgg ccctgagaca gggcggcctg 1260
gtcatgcaga actacagact gattgacgcc accctgtacg tgacattcga gccttgcgtg 1320
atgtgcgccg gcgccatgat ccactctagg atcggccgcg tggtgtttgg cgtgaggaac 1380
gcaaaaaccg gcgccgcagg ctccctgatg gacgtgctgc actaccccgg catgaatcac 1440
cgcgtcgaaa ttaccgaggg aatcctggca gatgaatgtg ccgccctgct gtgctatttc 1500
tttcggatgc ctagacaggt gttcaatgct cagaagaagg cccagagctc caccgactcc 1560
ggaggatcta gcggaggctc ctctggctct gagacacctg gcacaagcga gagcgcaaca 1620
cctgaaagca gcgggggcag cagcgggggg tcagacaaga agtacagcat cggcctggcc 1680
atcggcacca actctgtggg ctgggccgtg atcaccgacg agtacaaggt gcccagcaag 1740
aaattcaagg tgctgggcaa caccgaccgg cacagcatca agaagaacct gatcggagcc 1800
ctgctgttcg acagcggcga aacagccgag gccacccggc tgaagagaac cgccagaaga 1860
agatacacca gacggaagaa ccggatctgc tatctgcaag agatcttcag caacgagatg 1920
gccaaggtgg acgacagctt cttccacaga ctggaagagt ccttcctggt ggaagaggat 1980
aagaagcacg agcggcaccc catcttcggc aacatcgtgg acgaggtggc ctaccacgag 2040
aagtacccca ccatctacca cctgagaaag aaactggtgg acagcaccga caaggccgac 2100
ctgcggctga tctatctggc cctggcccac atgatcaagt tccggggcca cttcctgatc 2160
gagggcgacc tgaaccccga caacagcgac gtggacaagc tgttcatcca gctggtgcag 2220
acctacaacc agctgttcga ggaaaacccc atcaacgcca gcggcgtgga cgccaaggcc 2280
atcctgtctg ccagactgag caagagcaga cggctggaaa atctgatcgc ccagctgccc 2340
ggcgagaaga agaatggcct gttcggaaac ctgattgccc tgagcctggg cctgaccccc 2400
aacttcaaga gcaacttcga cctggccgag gatgccaaac tgcagctgag caaggacacc 2460
tacgacgacg acctggacaa cctgctggcc cagatcggcg accagtacgc cgacctgttt 2520
ctggccgcca agaacctgtc cgacgccatc ctgctgagcg acatcctgag agtgaacacc 2580
gagatcacca aggcccccct gagcgcctct atgatcaaga gatacgacga gcaccaccag 2640
gacctgaccc tgctgaaagc tctcgtgcgg cagcagctgc ctgagaagta caaagagatt 2700
ttcttcgacc agagcaagaa cggctacgcc ggctacattg acggcggagc cagccaggaa 2760
gagttctaca agttcatcaa gcccatcctg gaaaagatgg acggcaccga ggaactgctc 2820
gtgaagctga acagagagga cctgctgcgg aagcagcgga ccttcgacaa cggcagcatc 2880
ccccaccaga tccacctggg agagctgcac gccattctgc ggcggcagga agatttttac 2940
ccattcctga aggacaaccg ggaaaagatc gagaagatcc tgaccttccg catcccctac 3000
tacgtgggcc ctctggccag gggaaacagc agattcgcct ggatgaccag aaagagcgag 3060
gaaaccatca ccccctggaa cttcgaggaa gtggtggaca agggcgcttc cgcccagagc 3120
ttcatcgagc ggatgaccaa cttcgataag aacctgccca acgagaaggt gctgcccaag 3180
cacagcctgc tgtacgagta cttcaccgtg tataacgagc tgaccaaagt gaaatacgtg 3240
accgagggaa tgagaaagcc cgccttcctg agcggcgagc agaaaaaggc catcgtggac 3300
ctgctgttca agaccaaccg gaaagtgacc gtgaagcagc tgaaagagga ctacttcaag 3360
aaaatcgagt gcttcgactc cgtggaaatc tccggcgtgg aagatcggtt caacgcctcc 3420
ctgggcacat accacgatct gctgaaaatt atcaaggaca aggacttcct ggacaatgag 3480
gaaaacgagg acattctgga agatatcgtg ctgaccctga cactgtttga ggacagagag 3540
atgatcgagg aacggctgaa aacctatgcc cacctgttcg acgacaaagt gatgaagcag 3600
ctgaagcggc ggagatacac cggctggggc aggctgagcc ggaagctgat caacggcatc 3660
cgggacaagc agtccggcaa gacaatcctg gatttcctga agtccgacgg cttcgccaac 3720
agaaacttca tgcagctgat ccacgacgac agcctgacct ttaaagagga catccagaaa 3780
gcccaggtgt ccggccaggg cgatagcctg cacgagcaca ttgccaatct ggccggcagc 3840
cccgccatta agaagggcat cctgcagaca gtgaaggtgg tggacgagct cgtgaaagtg 3900
atgggccggc acaagcccga gaacatcgtg atcgaaatgg ccagagagaa ccagaccacc 3960
cagaagggac agaagaacag ccgcgagaga atgaagcgga tcgaagaggg catcaaagag 4020
ctgggcagcc agatcctgaa agaacacccc gtggaaaaca cccagctgca gaacgagaag 4080
ctgtacctgt actacctgca gaatgggcgg gatatgtacg tggaccagga actggacatc 4140
aaccggctgt ccgactacga tgtggaccat atcgtgcctc agagctttct gaaggacgac 4200
tccatcgaca acaaggtgct gaccagaagc gacaagaacc ggggcaagag cgacaacgtg 4260
ccctccgaag aggtcgtgaa gaagatgaag aactactggc ggcagctgct gaacgccaag 4320
ctgattaccc agagaaagtt cgacaatctg accaaggccg agagaggcgg cctgagcgaa 4380
ctggataagg ccggcttcat caagagacag ctggtggaaa cccggcagat cacaaagcac 4440
gtggcacaga tcctggactc ccggatgaac actaagtacg acgagaatga caagctgatc 4500
cgggaagtga aagtgatcac cctgaagtcc aagctggtgt ccgatttccg gaaggatttc 4560
cagttttaca aagtgcgcga gatcaacaac taccaccacg cccacgacgc ctacctgaac 4620
gccgtcgtgg gaaccgccct gatcaaaaag taccctaagc tggaaagcga gttcgtgtac 4680
ggcgactaca aggtgtacga cgtgcggaag atgatcgcca agagcgagca ggaaatcggc 4740
aaggctaccg ccaagtactt cttctacagc aacatcatga actttttcaa gaccgagatt 4800
accctggcca acggcgagat ccggaagcgg cctctgatcg agacaaacgg cgaaaccggg 4860
gagatcgtgt gggataaggg ccgggatttt gccaccgtgc ggaaagtgct gagcatgccc 4920
caagtgaata tcgtgaaaaa gaccgaggtg cagacaggcg gcttcagcaa agagtctatc 4980
agacccaaga ggaacagcga taagctgatc gccagaaaga aggactggga ccctaagaag 5040
tacggcggct tcgtgagccc caccgtggcc tattctgtgc tggtggtggc caaagtggaa 5100
aagggcaagt ccaagaaact gaagagtgtg aaagagctgc tggggatcac catcatggaa 5160
agaagcagct tcgagaagaa tcccatcgac tttctggaag ccaagggcta caaagaagtg 5220
aaaaaggacc tgatcatcaa gctgcctaag tactccctgt tcgagctgga aaacggccgg 5280
aagagaatgc tggcctctgc cagattcctg cagaagggaa acgaactggc cctgccctcc 5340
aaatatgtga acttcctgta cctggccagc cactatgaga agctgaaggg ctcccccgag 5400
gataatgagc agaaacagct gtttgtggaa cagcacaagc actacctgga cgagatcatc 5460
gagcagatca gcgagttctc caagagagtg atcctggccg acgctaatct ggacaaagtg 5520
ctgtccgcct acaacaagca ccgggataag cccatcagag agcaggccga gaatatcatc 5580
cacctgttta ccctgaccaa tctgggagcc cctagagcct tcaagtactt tgacaccacc 5640
atcgaccgga aggtgtacag aagcaccaaa gaggtgctgg acgccaccct gatccaccag 5700
agcatcaccg gcctgtacga gacacggatc gacctgtctc agctgggagg tgactctggc 5760
ggctcaaaaa gaaccgccga cggcagcgaa ttcgagccca agaagaagag gaaagtctaa 5820
ccggtcatca tcaccatcac cattgagttt aaacccgctg atcagcctcg actgtgcctt 5880
ctagttgcca gccatctgtt gtttgcccct cccccgtgcc ttccttgacc ctggaaggtg 5940
ccactcccac tgtcctttcc taataaaatg aggaaattgc atcgcattgt ctgagtaggt 6000
gtcattctat tctggggggt ggggtggggc aggacagcaa gggggaggat tgggaagaca 6060
atagcaggca tgctggggat gcggtgggct ctatggcttc tgaggcggaa agaaccagct 6120
ggggctcgat accgtcgacc tctagctaga gcttggcgta atcatggtca tagctgtttc 6180
ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt 6240
gtaaagccta gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc 6300
ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg 6360
ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct 6420
cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca 6480
cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga 6540
accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc 6600
acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg 6660
cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat 6720
acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt 6780
atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc 6840
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg 6900
acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg 6960
gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg 7020
gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg 7080
gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca 7140
gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac actcagtgga 7200
acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga 7260
tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt 7320
ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt 7380
catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat 7440
ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag 7500
caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct 7560
ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt 7620
tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg 7680
cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca 7740
aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt 7800
tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat 7860
gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac 7920
cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa 7980
aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt 8040
tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt 8100
tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa 8160
gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt 8220
atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa 8280
taggggttcc gcgcacattt ccccgaaaag tgccacctga cgtcgacgga tcgggagatc 8340
gatctcccga tcccctaggg tcgactctca gtacaatctg ctctgatgcc gcatagttaa 8400
gccagtatct gctccctgct tgtgtgttgg aggtcgctga gtagtgcgcg agcaaaattt 8460
aagctacaac aaggcaaggc ttgaccgaca attgcatgaa gaatctgctt agggttaggc 8520
gttttgcgct gcttcgcgat gtacgggcca gatatacgcg ttgacattga ttattgacta 8580
gttattaata gtaatcaatt acggggtcat tagttcatag cccatatatg gagttccgcg 8640
ttacataact tacggtaaat ggcccgcctg gctgaccgcc caacgacccc cgcccattga 8700
cgtcaataat gacgtatgtt cccatagtaa cgccaatagg gactttccat tgacgtcaat 8760
gggtggagta tttacggtaa actgcccact tggcagtaca tcaagtgtat c 8811
<210> 2
<211> 4368
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 2
ggggttgggg ttgcgccttt tccaaggcag ccctgggttt gcgcagggac gcggctgctc 60
tgggcgtggt tccgggaaac gcagcggcgc cgaccctggg actcgcacat tcttcacgtc 120
cgttcgcagc gtcacccgga tcttcgccgc tacccttgtg ggccccccgg cgacgcttcc 180
tgctccgccc ctaagtcggg aaggttcctt gcggttcgcg gcgtgccgga cgtgacaaac 240
ggaagccgca cgtctcacta gtaccctcgc agacggacag cgccagggag caatggcagc 300
gcgccgaccg cgatgggctg tggccaatag cggctgctca gcagggcgcg ccgagagcag 360
cggccgggaa ggggcggtgc gggaggcggg gtgtggggcg gtagtgtggg ccctgttcct 420
gcccgcgcgg tgttccgcat tctgcaagcc tccggagcgc acgtcggcag tcggctccct 480
cgttgaccga atcaccgacc tctctcccca gggggatcca tggtgagcaa gggcgaggag 540
ctgttcaccg gggtggtgcc catcctggtc gagctggacg gcgacgtaaa cggccacaag 600
ttcagcgtgt ccggcgaggg cgagggcgat gccacctacg gcaagctgac cctgaagttc 660
atctgcacca ccggcaagct gcccgtgccc tggcccaccc tcgtgaccac nctgacctac 720
ggcgtgtagt gcttcagccg ctaccccgac cacatgaagc agcacgactt cttcaagtcc 780
gccatgcccg aaggctacgt ccaggagcgc accatcttct tcaaggacga cggcaactac 840
aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg tgaaccgcat cgagctgaag 900
ggcatcgact tcaaggagga cggcaacatc ctggggcaca agctggagta caactacaac 960
agccacaacg tctatatcat ggccgacaag cagaagaacg gcatcaaggt gaacttcaag 1020
atccgccaca acatcgagga cggcagcgtg cagctcgccg accactacca gcagaacacc 1080
cccatcggcg acggccccgt gctgctgccc gacaaccact acctgagcac ccagtccgcc 1140
ctgagcaaag accccaacga gaagcgcgat cacatggtcc tgctggagtt cgtgaccgcc 1200
gccgggatca ctctcggcat ggacgagctg tacaagtaaa gcggccgcga ctctagatca 1260
taatcagcca taccacattt gtagaggttt tacttgcttt aaaaaacctc ccacacctcc 1320
ccctgaacct gaaacataaa atgaatgcaa ttgttgttgt taacttgttt attgcagctt 1380
ataatggtta caaataaagc aatagcatca caaatttcac aaataaagca tttttttcac 1440
tgcattctag ttgtggtttg tccaaactca tcaatgtatc ttagtcgacc gatgcccttg 1500
agagccttca acccagtcag ctccttccgg tgggcgcggg gcatgactat cgtcgccgca 1560
cttatgactg tcttctttat catgcaactc gtaggacagg tgccggcagc gctcttccgc 1620
ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 1680
ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 1740
agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 1800
taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 1860
cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 1920
tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 1980
gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 2040
gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 2100
tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 2160
gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 2220
cggctacact agaagaacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 2280
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 2340
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 2400
ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 2460
attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 2520
ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 2580
tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 2640
aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgggaccc 2700
acgctcaccg gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag 2760
aagtggtcct gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag 2820
agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt 2880
ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg 2940
agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt 3000
tgtcagaagt aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc 3060
tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc 3120
attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa 3180
taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg 3240
aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc 3300
caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag 3360
gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt 3420
cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt 3480
tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc 3540
acctgacgcg ccctgtagcg gcgcattaag cgcggcgggt gtggtggtta cgcgcagcgt 3600
gaccgctaca cttgccagcg ccctagcgcc cgctcctttc gctttcttcc cttcctttct 3660
cgccacgttc gccggctttc cccgtcaagc tctaaatcgg gggctccctt tagggttccg 3720
atttagtgct ttacggcacc tcgaccccaa aaaacttgat tagggtgatg gttcacgtag 3780
tgggccatcg ccctgataga cggtttttcg ccctttgacg ttggagtcca cgttctttaa 3840
tagtggactc ttgttccaaa ctggaacaac actcaaccct atctcggtct attcttttga 3900
tttataaggg attttgccga tttcggccta ttggttaaaa aatgagctga tttaacaaaa 3960
atttaacgcg aattttaaca aaatattaac gcttacaatt tgccattcgc cattcaggct 4020
gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc agcccaagct 4080
accatgataa gtaagtaata ttaaggtacg ggaggtactt ggagcggccg caataaaata 4140
tctttatttt cattacatct gtgtgttggt tttttgtgtg aatcgatagt actaacatac 4200
gctctccatc aaaacaaaac gaaacaaaac aaactagcaa aataggctgt ccccagtgca 4260
agtgcaggtg ccagaacatt tctctatcga taggtaccga ttagtgaacg gatctcgacg 4320
gtatcgatca cgagactagc cagagatcca ctttggccgc ggctcgag 4368
<210> 3
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 3
accgagcact acacgccgta ggtc 24
<210> 4
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 4
aaacgaccta cggcgtgtag tgct 24
<210> 5
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 5
accgcaatcc agacactggt ggtc 24
<210> 6
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 6
aaacgaccac cagtgtctgg attg 24
<210> 7
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 7
accgcgggca gcgaccatag gaag 24
<210> 8
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 8
aaaccttcct atggtcgctg cccg 24
<210> 9
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 9
accgatggaa agcagacacg atag 24
<210> 10
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 10
aaacctatcg tgtctgcttt ccat 24
<210> 11
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 11
accggaacat gaactcttac gact 24
<210> 12
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 12
aaacagtcgt aagagttcat gttc 24
<210> 13
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 13
accgcctcta ttgtgctgtc atgt 24
<210> 14
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 14
aaacacatga cagcacaata gagg 24
<210> 15
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 15
accgcagcag ctcgtccttc actg 24
<210> 16
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 16
aaaccagtga aggacgagct gctg 24
<210> 17
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 17
accgtattac agaaaccagc cccg 24
<210> 18
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 18
aaaccggggc tggtttctgt aata 24
<210> 19
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 19
accgggctaa cgtgcgggag cgca 24
<210> 20
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 20
aaactgcgct cccgcacgtt agcc 24
<210> 21
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 21
accgattgat gtaatggatg cagt 24
<210> 22
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 22
aaacactgca tccattacat caat 24
<210> 23
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 23
accggtttca gaatcgaagg gtga 24
<210> 24
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 24
aaactcaccc ttcgattctg aaac 24
<210> 25
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 25
accgagacat attcctcact acaa 24
<210> 26
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 26
aaacttgtag tgaggaatat gtct 24
<210> 27
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 27
accggcgcat ggccacttcc tgtg 24
<210> 28
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 28
aaaccacagg aagtggccat gcgc 24
<210> 29
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 29
accgcttgta tcaggaccac atgc 24
<210> 30
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 30
aaacgcatgt ggtcctgata caag 24
<210> 31
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 31
accgaacgtg atggccatgt cgcc 24
<210> 32
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 32
aaacggcgac atggccatca cgtt 24
<210> 33
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 33
accgagccag actctgccga tgac 24
<210> 34
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 34
aaacgtcatc ggcagagtct ggct 24
<210> 35
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 35
accgtttgga aggaaagtgg tata 24
<210> 36
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 36
aaactatacc actttccttc caaa 24
<210> 37
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 37
taggcgggca gcgaccatag gaag 24
<210> 38
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 38
aaaccttcct atggtcgctg cccg 24
<210> 39
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 39
taggggctaa cgtgcgggag cgca 24
<210> 40
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 40
aaactgcgct cccgcacgtt agcc 24
<210> 41
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 41
taggagacat attcctcact acaa 24
<210> 42
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 42
aaacttgtag tgaggaatat gtct 24
<210> 43
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 43
taggtttgga aggaaagtgg tata 24
<210> 44
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 44
aaactatacc actttccttc caaa 24
<210> 45
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 45
tctcgcgcgt ttcggtgatg acgg 24
<210> 46
<211> 31
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 46
aaaaaaagca ccgactcggt gccacttttt c 31
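Each record in this listing uses numeric identifiers: <210> for the sequence number, <211> for the length, <212> for the molecule type, <213> for the organism, and <400> for the sequence itself. As a minimal sketch, the Python below parses one nucleotide record and compares the declared length with the number of bases actually listed. The function name and the returned dictionary keys are hypothetical.

```python
import re

def parse_record(text: str) -> dict:
    """Parse one nucleotide record from the listing into a dict."""
    fields = {}
    sequence_parts = []
    in_sequence = False
    for line in text.strip().splitlines():
        match = re.match(r"<(\d{3})>\s*(.*)", line)
        if match:
            tag, value = match.groups()
            fields[tag] = value.strip()
            in_sequence = tag == "400"
        elif in_sequence:
            # Sequence lines end with a running base count; drop it.
            tokens = line.split()
            if tokens and tokens[-1].isdigit():
                tokens = tokens[:-1]
            sequence_parts.extend(tokens)
    return {
        "seq_id": int(fields["210"]),
        "declared_length": int(fields["211"]),
        "molecule_type": fields["212"],
        "sequence": "".join(sequence_parts),
    }

# Record text copied from SEQ ID NO: 46.
record_46 = """
<210> 46
<211> 31
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 46
aaaaaaagca ccgactcggt gccacttttt c 31
"""

parsed = parse_record(record_46)
print(parsed["seq_id"], len(parsed["sequence"]) == parsed["declared_length"])
```

Protein records would need a different sequence reader, since their residues are three-letter codes with position numbers on alternate lines.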
<210> 47
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 47
accggttcag gttactggag acaa 24
<210> 48
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 48
aaacttgtct ccagtaacct gaac 24
<210> 49
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 49
accgctgata cctgaagtgt gtcc 24
<210> 50
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 50
aaacggacac acttcaggta tcag 24
<210> 51
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 51
accgcctcag gaggacgacc tggc 24
<210> 52
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 52
aaacgccagg tcgtcctcct gagg 24
<210> 53
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 53
accgcactca ccggaacatc acgg 24
<210> 54
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 54
aaacccgtga tgttccggtg agtg 24
<210> 55
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 55
tagggttcag gttactggag acaa 24
<210> 56
<211> 24
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 56
aaacttgtct ccagtaacct gaac 24
<210> 57
<211> 166
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 57
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
1 5 10 15
Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala Val
20 25 30
Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro Ile
35 40 45
Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
50 55 60
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
65 70 75 80
Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
85 90 95
Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala
100 105 110
Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His Arg
115 120 125
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
130 135 140
Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys
145 150 155 160
Ala Gln Ser Ser Thr Asp
165
<210> 58
<211> 166
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 58
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
1 5 10 15
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
20 25 30
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
35 40 45
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
50 55 60
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
65 70 75 80
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
85 90 95
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
100 105 110
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
115 120 125
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
130 135 140
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
145 150 155 160
Ala Gln Ser Ser Thr Asp
165
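SEQ ID NOs: 57 and 58 are both 166 residues long and differ at only a few positions. The Python sketch below lists those differences by position-wise comparison. Positions follow the numbering printed in this listing, which starts at the first Ser. The function name is hypothetical.

```python
# Residues copied from SEQ ID NO: 57.
SEQ_57 = """
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala Val
Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro Ile
Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala
Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His Arg
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys
Ala Gln Ser Ser Thr Asp
""".split()

# Residues copied from SEQ ID NO: 58.
SEQ_58 = """
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
Ala Gln Ser Ser Thr Asp
""".split()

def substitutions(reference, variant):
    """List (position, reference_residue, variant_residue), 1-based listing positions."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length for a position-wise comparison")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]

differences = substitutions(SEQ_57, SEQ_58)
for position, ref, var in differences:
    print(f"{ref}{position}{var}")
print(len(differences), "substitutions")
```

A position-wise comparison is enough here because the two sequences have the same length; sequences with insertions or deletions would need a proper alignment.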
<210> 59
<211> 1367
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 59
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1025 1030 1035 1040
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile
1045 1050 1055
Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1060 1065 1070
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1075 1080 1085
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe
1090 1095 1100
Ser Lys Glu Ser Ile Arg Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
1105 1110 1115 1120
Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Val Ser Pro
1125 1130 1135
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1140 1145 1150
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
1155 1160 1165
Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1170 1175 1180
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr
1185 1190 1195 1200
Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1205 1210 1215
Arg Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr
1250 1255 1260
Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1265 1270 1275 1280
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His
1285 1290 1295
Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1300 1305 1310
Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala Phe Lys Tyr Phe Asp Thr
1315 1320 1325
Thr Ile Asp Arg Lys Val Tyr Arg Ser Thr Lys Glu Val Leu Asp Ala
1330 1335 1340
Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp
1345 1350 1355 1360
Leu Ser Gln Leu Gly Gly Asp
1365
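The claims describe SEQ ID NO: 59 as a SpCas9-NG D10A nickase. The listing for SEQ ID NO: 59 begins Asp Lys Lys Tyr with no leading Met, so a position given in the usual Met-inclusive numbering appears one position earlier here. The Python sketch below makes that offset explicit and looks up the residue at conventional position 10. The helper name is hypothetical, and the offset rule is an assumption about the numbering convention, not something stated in the patent.

```python
# First 16 residues of SEQ ID NO: 59, copied from the listing.
SEQ_59_START = (
    "Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly"
).split()

def residue_at(residues, conventional_position, has_initiator_met=False):
    """Return the residue at a conventionally numbered position.

    Assumption: conventional numbering counts an initiator Met as residue 1.
    If the listed fragment omits that Met, shift the index down by one.
    """
    if has_initiator_met:
        listing_position = conventional_position
    else:
        listing_position = conventional_position - 1
    return residues[listing_position - 1]  # convert 1-based to 0-based

print(residue_at(SEQ_59_START, 10))
```

With the offset applied, conventional position 10 maps to position 9 of the listing, which reads Ala, consistent with the D10A designation.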
<210> 60
<211> 18
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 60
Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys Arg
1 5 10 15
Lys Val
<210> 61
<211> 1803
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 61
Met Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys
1 5 10 15
Arg Lys Val Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His
20 25 30
Ala Leu Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val
35 40 45
Gly Ala Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn
50 55 60
Arg Pro Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala
65 70 75 80
Leu Arg Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala
85 90 95
Thr Leu Tyr Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met
100 105 110
Ile His Ser Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys
115 120 125
Thr Gly Ala Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met
130 135 140
Asn His Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala
145 150 155 160
Ala Leu Leu Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala
165 170 175
Gln Lys Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly
180 185 190
Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu
195 200 205
Ser Ser Gly Gly Ser Ser Gly Gly Ser Ser Glu Val Glu Phe Ser His
210 215 220
Glu Tyr Trp Met Arg His Ala Leu Thr Leu Ala Lys Arg Ala Arg Asp
225 230 235 240
Glu Arg Glu Val Pro Val Gly Ala Val Leu Val Leu Asn Asn Arg Val
245 250 255
Ile Gly Glu Gly Trp Asn Arg Ala Ile Gly Leu His Asp Pro Thr Ala
260 265 270
His Ala Glu Ile Met Ala Leu Arg Gln Gly Gly Leu Val Met Gln Asn
275 280 285
Tyr Arg Leu Ile Asp Ala Thr Leu Tyr Val Thr Phe Glu Pro Cys Val
290 295 300
Met Cys Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val Phe
305 310 315 320
Gly Val Arg Asn Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp Val
325 330 335
Leu His Tyr Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly Ile
340 345 350
Leu Ala Asp Glu Cys Ala Ala Leu Leu Cys Tyr Phe Phe Arg Met Pro
355 360 365
Arg Gln Val Phe Asn Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp Ser
370 375 380
Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser
385 390 395 400
Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Asp
405 410 415
Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp
420 425 430
Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val
435 440 445
Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala
450 455 460
Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg
465 470 475 480
Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu
485 490 495
Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe
500 505 510
His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu
515 520 525
Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu
530 535 540
Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr
545 550 555 560
Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile
565 570 575
Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn
580 585 590
Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln
595 600 605
Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala
610 615 620
Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile
625 630 635 640
Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile
645 650 655
Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu
660 665 670
Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp
675 680 685
Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe
690 695 700
Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu
705 710 715 720
Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile
725 730 735
Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu
740 745 750
Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln
755 760 765
Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu
770 775 780
Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr
785 790 795 800
Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln
805 810 815
Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu
820 825 830
Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys
835 840 845
Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr
850 855 860
Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr
865 870 875 880
Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val
885 890 895
Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe
900 905 910
Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu
915 920 925
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val
930 935 940
Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys
945 950 955 960
Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys
965 970 975
Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val
980 985 990
Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr
995 1000 1005
His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu
1010 1015 1020
Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe
1025 1030 1035 1040
Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu
1045 1050 1055
Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly
1060 1065 1070
Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln
1075 1080 1085
Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn
1090 1095 1100
Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu
1105 1110 1115 1120
Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu
1125 1130 1135
His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu
1140 1145 1150
Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His
1155 1160 1165
Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr Thr
1170 1175 1180
Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu Glu
1185 1190 1195 1200
Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val Glu
1205 1210 1215
Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn
1220 1225 1230
Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser
1235 1240 1245
Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp Asp
1250 1255 1260
Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys
1265 1270 1275 1280
Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr
1285 1290 1295
Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp
1300 1305 1310
Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala
1315 1320 1325
Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His
1330 1335 1340
Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn
1345 1350 1355 1360
Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu
1365 1370 1375
Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile
1380 1385 1390
Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly
1395 1400 1405
Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr
1410 1415 1420
Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser Glu
1425 1430 1435 1440
Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile
1445 1450 1455
Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg
1460 1465 1470
Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp
1475 1480 1485
Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met Pro
1490 1495 1500
Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser
1505 1510 1515 1520
Lys Glu Ser Ile Arg Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg
1525 1530 1535
Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Val Ser Pro Thr
1540 1545 1550
Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser
1555 1560 1565
Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu
1570 1575 1580
Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly
1585 1590 1595 1600
Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1605 1610 1615
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1620 1625 1630
Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1635 1640 1645
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro Glu
1650 1655 1660
Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr Leu
1665 1670 1675 1680
Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile Leu
1685 1690 1695
Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His Arg
1700 1705 1710
Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr
1715 1720 1725
Leu Thr Asn Leu Gly Ala Pro Arg Ala Phe Lys Tyr Phe Asp Thr Thr
1730 1735 1740
Ile Asp Arg Lys Val Tyr Arg Ser Thr Lys Glu Val Leu Asp Ala Thr
1745 1750 1755 1760
Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu
1765 1770 1775
Ser Gln Leu Gly Gly Asp Ser Gly Gly Ser Lys Arg Thr Ala Asp Gly
1780 1785 1790
Ser Glu Phe Glu Pro Lys Lys Lys Arg Lys Val
1795 1800

Claims (11)

1. An adenine base editor comprising an ecTadA-ecTadA dimer fragment and a SpCas9-NG D10A nickase fragment, wherein the ecTadA-ecTadA dimer fragment comprises an ecTad fragment and an ecTadA fragment, and the amino acid sequence of the SpCas9-NG D10A nickase fragment is the amino acid sequence shown as SEQ ID NO. 59;
the adenine base editor sequentially comprises the ecTadA-ecTadA dimer fragment and the SpCas9-NG D10A nickase fragment from the 5' end to the 3' end;
the ecTadA-ecTadA dimer fragment sequentially comprises an ecTad fragment and an ecTadA fragment from the 5' end to the 3' end;
the adenine base editor also comprises a nuclear localization signal fragment, and the amino acid sequence of the nuclear localization signal fragment is shown as SEQ ID NO. 60;
the amino acid sequence of the adenine base editor is shown as SEQ ID NO. 61.
2. The adenine base editor according to claim 1, wherein the amino acid sequence of the ecTadA fragment is the amino acid sequence shown in SEQ ID NO. 57.
3. The adenine base editor of claim 1, wherein the amino acid sequence of the ecTadA fragment is the amino acid sequence shown in SEQ ID NO. 58.
4. An isolated polynucleotide encoding the adenine base editor of any one of claims 1-3.
5. A construct comprising the isolated polynucleotide of claim 4.
6. An expression system comprising the construct of claim 5, or having the exogenous polynucleotide of claim 4 integrated into its genome.
7. The expression system of claim 6, wherein the host cell of the expression system is selected from eukaryotic cells and prokaryotic cells.
8. Use of an adenine base editor according to any one of claims 1 to 3, an isolated polynucleotide according to claim 4, a construct according to claim 5 or an expression system according to any one of claims 6 to 7 in gene editing.
9. The use according to claim 8, wherein the gene editing is gene editing of a eukaryote.
10. A base editing system comprising the adenine base editor of any one of claims 1-3 and sgRNA.
11. A method of gene editing comprising: gene editing by the adenine base editor according to any one of claims 1 to 3 or the base editing system according to claim 10.
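Claim 1 defines the editor by its parts (SEQ ID NOs: 57 to 60) and by the full sequence (SEQ ID NO: 61). As a rough consistency check, the Python sketch below adds up the segment lengths in N-to-C order and derives each segment's coordinates. The lengths for SEQ ID NOs: 57 to 60 come from their <211> fields. The initiator Met, the two 32-residue linkers, the 4-residue linker, and the 17-residue C-terminal NLS are not named in the claims; their lengths were counted from the SEQ ID NO: 61 listing and are assumptions of this sketch.

```python
# Segment lengths, in order from the start of SEQ ID NO: 61.
# Entries without a SEQ ID NO were counted from the listing (assumption).
SEGMENTS = [
    ("initiator Met", 1),
    ("NLS (SEQ ID NO: 60)", 18),
    ("ecTadA (SEQ ID NO: 57)", 166),
    ("linker 1", 32),
    ("ecTadA (SEQ ID NO: 58)", 166),
    ("linker 2", 32),
    ("SpCas9-NG D10A nickase (SEQ ID NO: 59)", 1367),
    ("Ser-Gly-Gly-Ser linker", 4),
    ("C-terminal NLS", 17),
]

def segment_ranges(segments):
    """Return (name, start, end) tuples with 1-based inclusive coordinates."""
    ranges = []
    start = 1
    for name, length in segments:
        end = start + length - 1
        ranges.append((name, start, end))
        start = end + 1
    return ranges

ranges = segment_ranges(SEGMENTS)
for name, start, end in ranges:
    print(f"{start:>4}-{end:<4} {name}")

total_length = ranges[-1][2]
print("total:", total_length)
```

The total agrees with the <211> value of 1803 given for SEQ ID NO: 61, and the nickase segment starts at residue 416, where the listing reads Asp Lys Lys Tyr.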
CN201910382569.1A 2019-05-09 2019-05-09 Adenine base editing tool and application thereof Active CN110029096B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910382569.1A CN110029096B (en) 2019-05-09 2019-05-09 Adenine base editing tool and application thereof

Publications (2)

Publication Number Publication Date
CN110029096A (en) 2019-07-19
CN110029096B (en) 2023-05-12

Family

ID=67241642

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910382569.1A Active CN110029096B (en) 2019-05-09 2019-05-09 Adenine base editing tool and application thereof

Country Status (1)

Country Link
CN (1) CN110029096B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110467679B (en) * 2019-08-06 2021-04-23 广州大学 Fusion protein, base editing tool and method and application thereof
CN110551760B (en) * 2019-08-08 2022-11-18 复旦大学 CRISPR/Sa-SeqCas9 gene editing system and application thereof
CN110511286B (en) * 2019-08-29 2022-08-02 上海科技大学 RNA base editing molecule
CN110951773B (en) * 2019-12-20 2022-04-12 北京市农林科学院 Application of FNLS-sABE system in creating rice herbicide resistant material
CN110982818B (en) * 2019-12-20 2022-03-08 北京市农林科学院 Application of nuclear localization signal F4NLS in efficient creation of rice herbicide resistant material
CN110951736B (en) * 2019-12-20 2023-03-14 北京市农林科学院 Nuclear localization signal F4NLS and application thereof in improving base editing efficiency and expanding editable base range
CN110964742B (en) * 2019-12-20 2022-03-01 北京市农林科学院 Preparation method of herbicide-resistant rice
CN115380111A (en) * 2020-01-30 2022-11-22 成对植物服务股份有限公司 Compositions, systems, and methods for base diversification
EP4103705A4 (en) * 2020-02-14 2024-02-28 Ohio State Innovation Foundation Nucleobase editors and methods of use thereof
CN114058607B (en) * 2020-07-31 2024-02-27 上海科技大学 Fusion protein for editing C to U base, and preparation method and application thereof
CN112080513A (en) * 2020-09-16 2020-12-15 中国农业科学院植物保护研究所 Rice artificial genome editing system with expanded editing range and application thereof
CN112143753A (en) * 2020-09-17 2020-12-29 中国农业科学院植物保护研究所 Adenine base editor and related biological material and application thereof
CN112266420B (en) * 2020-10-30 2022-08-09 华南农业大学 Plant efficient cytosine single-base editor and construction and application thereof
CN112126637B (en) * 2020-11-20 2021-02-09 中国农业科学院植物保护研究所 Adenosine deaminase and related biological material and application thereof
US20240189457A1 (en) * 2021-04-02 2024-06-13 Shanghaitech University Gene therapy for treating beta-hemoglobinopathies
CN115161305B (en) * 2021-04-02 2023-05-12 上海科技大学 Fusion protein comprising double-base editor and preparation method and application thereof
CN113201517B (en) * 2021-05-12 2022-11-01 广州大学 Cytosine single base editor tool and application thereof
CN116064517A (en) * 2022-07-29 2023-05-05 之江实验室 Production mode of pilot editing gRNA and application thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018071868A1 (en) * 2016-10-14 2018-04-19 President And Fellows Of Harvard College Aav delivery of nucleobase editors
CN108715861A (en) * 2018-04-26 2018-10-30 上海科技大学 Base editing tool and application thereof
CN108822217A (en) * 2018-02-23 2018-11-16 上海科技大学 Gene base editor
CN109385425A (en) * 2018-11-13 2019-02-26 中山大学 High-specificity ABE base editing system and its application in β-hemoglobinopathy
CN109652439A (en) * 2018-12-27 2019-04-19 宜春学院 Method for improving broad-spectrum rice blast resistance using a CRISPR/Cas9-mediated adenine base editing system
CN109706185A (en) * 2019-02-01 2019-05-03 国家卫生计生委科学技术研究所 Method for realizing gene knockout by mutating the initiation codon with a base editing system, and application thereof

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Expanded base editing in rice and wheat using a Cas9-adenosine deaminase fusion; Chao Li et al.; Genome Biology; 20180529; Vol. 19, No. 59; pp. 1-9 *
Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction; Luke W Koblan et al.; Nature Biotechnology; 20180529; Vol. 36; pp. 843-847 *
Increasing targeting scope of adenosine base editors in mouse and rat embryos through fusion of TadA deaminase with Cas9 variants; Lei Yang et al.; Protein Cell; 20181231; Vol. 9, No. 9; p. 814, left column, paragraph 2; p. 815, Fig. 1A; Supplementary Material p. 23, Sequence 1 *
Lei Yang et al. Increasing targeting scope of adenosine base editors in mouse and rat embryos through fusion of TadA deaminase with Cas9 variants. Protein Cell. 2018, Vol. 9, No. 9, pp. 814-819. *
Base editing systems and their potential application in constructing multi-site mutation models; Tao Wanyu et al.; Chemistry of Life; 20201231; Vol. 40, No. 11; pp. 1917-1923 *

Also Published As

Publication number Publication date
CN110029096A (en) 2019-07-19

Similar Documents

Publication Publication Date Title
CN110029096B (en) Adenine base editing tool and application thereof
CN110656123B (en) Method for screening sgRNA high-efficiency action target based on CRISPR-Cas13d system and application
RU2766680C1 (en) New versions of hyaluronidase and a pharmaceutical composition containing them
AU2019206054B2 (en) Production of heterologous polypeptides in microalgae, microalgal extracellular bodies, compositions, and methods of making and uses thereof
US6773920B1 (en) Delivery of functional protein sequences by translocating polypeptides
CN109312360B (en) Transposon-based transfection system for primary cells
DK2713712T3 (en) TRANSGEN CHICKEN, INCLUDING AN INACTIVATED IMMUNGLOBULIN GENE
CN111662884B (en) Pseudovirus, packaging method thereof and drug evaluation system
CN112680434B (en) Method for improving secretory expression of protein glutaminase
KR20210139265A (en) Adenosine deaminase base editor for modifying nucleobases in target sequences and methods of using the same
KR20210023833A (en) How to edit single base polymorphisms using a programmable base editor system
KR20210041008A (en) Multi-effector nucleobase editor for modifying nucleic acid target sequences and methods of using the same
CN109706185B (en) Method for realizing gene knockout based on base editing system mutation initiation codon and application
CN108431221A (en) Genetic tool for converting Clostridium bacterium
KR20210124280A (en) Nucleobase editor with reduced off-target deamination and method for modifying nucleobase target sequence using same
CN108410787A (en) A kind of recombined bacillus subtilis of synthesis new tetroses of lactoyl-N- and its construction method and application
CN108949693A (en) A kind of pair of T cell immune detection point access carries out the method and application of gene knockout
CN114836443B (en) Recombinant coxsackievirus A10VLP and application thereof
KR20220066289A (en) Compositions and methods for editing mutations that enable transcription or expression
CN110938651A (en) Targeting vector, method for constructing BAC clone by targeting and integrating exogenous gene to mouse F4/80 exon 22 site and application
CN111500641A (en) Preparation method of pig with human nerve growth factor gene
KR20210084596A (en) H52 IBV vaccine with heterologous spike protein
CN111534543A (en) Eukaryotic CRISPR/Cas9 knockout system, basic vector, vector and cell line
KR102009273B1 (en) Recombinant foot-and-mouth disease virus expressing protective antigen of type O-TAW97
KR101535070B1 (en) Recomnication expression vector of vascular growth factor and the vascular growth factor expressing stem cell line thereof

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant