CN110951742B - Method for realizing plant gene replacement without generating DNA double-strand break - Google Patents

Publication number
CN110951742B
CN110951742B CN201911405281.8A CN201911405281A CN110951742B CN 110951742 B CN110951742 B CN 110951742B CN 201911405281 A CN201911405281 A CN 201911405281A CN 110951742 B CN110951742 B CN 110951742B
Authority
CN
China
Prior art keywords
sequence
dna
leu
plant
lys
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911405281.8A
Other languages
Chinese (zh)
Other versions
CN110951742A (en
Inventor
杨进孝
徐雯
康桂婷
李璐
赵思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Academy of Agriculture and Forestry Sciences
Original Assignee
Beijing Academy of Agriculture and Forestry Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Academy of Agriculture and Forestry Sciences
Priority to CN201911405281.8A
Publication of CN110951742A
Application granted
Publication of CN110951742B
Legal status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/65 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218 Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/1022 Transferases (2.) transferring aldehyde or ketonic groups (2.2)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y202/00 Transferases transferring aldehyde or ketonic groups (2.2)
    • C12Y202/01 Transketolases and transaldolases (2.2.1)
    • C12Y202/01006 Acetolactate synthase (2.2.1.6)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Abstract

The invention provides a method for realizing gene replacement without generating DNA double-strand break. The method comprises the following steps: introducing an sgRNA, a Cas9 nickase and a donor DNA into a target plant; the sgRNA targets the target sequence of DNA fragment A; the donor DNA comprises, in order, the DNA fragment A target sequence, DNA fragment B and the DNA fragment A target sequence; DNA fragment B is a DNA molecule obtained by mutating one or more bases of DNA fragment A; under the guidance of the sgRNA, the Cas9 nickase generates single-strand DNA nicks at the DNA fragment A target sequence in the target plant genome and at the DNA fragment A target sequences in the donor DNA, and DNA fragment A in the target plant genome is replaced with DNA fragment B through the repair mechanism of the target plant, so that plant gene replacement is realized.

Description

Method for realizing plant gene replacement without generating DNA double-strand break
Technical Field
The invention belongs to the technical field of biology, and particularly relates to a method for realizing plant gene replacement without generating DNA double-strand break.
Background
The probability of precise gene replacement mediated by a long DNA template in cells is very low, but introducing a DNA double-strand break (DSB) near the site to be replaced can significantly increase that probability. CRISPR-Cas9 technology has become a powerful genome editing tool and is widely applied in many tissues and cells. The CRISPR/Cas9 protein-RNA complex is directed to the target site by a guide RNA and cleaves the DNA to generate a DSB, thereby increasing the efficiency of precise gene replacement mediated by a long DNA template. After the DSB is produced, the organism initiates DNA repair. There are two general repair mechanisms. One is non-homologous end joining (NHEJ), which accounts for most repair events and generally produces random indels (insertions or deletions). The other is homology-directed repair (HDR), which uses a sister chromatid or an exogenous DNA donor as the repair template to achieve precise repair of the gene. In animal cells, the specific principle of this repair is as follows: the CtIP enzyme initiates end resection at the DSB, generating a protruding 3' single-stranded DNA (ssDNA) tail; the ssDNA is recognized by the recombinase Rad51 and forms a complex with it, invades the donor DNA template, and anneals to the homologous fragment; a new DNA strand is then synthesized using the donor DNA as the template, completing the repair. When the sequence between the homology arms of the donor DNA carries an exogenous mutation, this mutation is introduced into the DNA strand during repair, achieving precise site-directed replacement. HDR begins with the generation of a DSB, and because NHEJ repair is far more likely than HDR, samples that undergo precise replacement also acquire many by-products such as indels, which cause large-fragment deletion, displacement and the like in the DNA.
To increase the proportion of precise HDR products relative to imprecise products, researchers have tried using the D10A mutant of Cas9, a nickase in which one nuclease domain is inactivated, to generate a single-strand nick on the DNA. In animals, nick-initiated HDR produced fewer by-products than DSB-initiated HDR, but its HDR efficiency was also somewhat lower. In plants, DSB-induced HDR has achieved precise replacement in different genes, but whether nicks induced by Cas9 D10A can achieve precise HDR replacement, whether their efficiency is lower than that of DSB-induced HDR, and whether by-products are reduced have not been reported.
Disclosure of Invention
The invention aims to provide a method for realizing plant gene replacement without generating DNA double-strand break.
The method for realizing plant gene replacement without generating DNA double-strand break provided by the invention comprises the following steps: introducing an sgRNA, a Cas9 nickase (Cas9n) or a variant thereof, and a donor DNA into a target plant;
the sgRNA targets the target sequence of DNA fragment A;
the donor DNA sequentially comprises the DNA fragment A target sequence, the DNA fragment B and the DNA fragment A target sequence;
the DNA fragment B is a DNA molecule obtained by mutating the DNA fragment A by one or more bases;
under the guidance of the sgRNA, the Cas9 nickase or the variant thereof generates single-stranded DNA nicks at a DNA fragment A target sequence in a target plant genome and a DNA fragment A target sequence in the donor DNA, and replaces the DNA fragment A in the target plant genome with the DNA fragment B through a repair mechanism in a target plant body, so that plant gene replacement is realized.
In the above method, the DNA fragment A may be any fragment of the target plant genome; the DNA fragment B is obtained by mutating one or several bases of the DNA fragment A and is a fragment on the donor DNA. The base mutation may be a base substitution and/or a base insertion and/or a base deletion.
In practical application, after the sgRNA/Cas9n system and the donor DNA (that is, the DNA fragment B with the target sequence added at both ends) are introduced into a target plant, the DNA fragment A in the target plant genome can be replaced by the DNA fragment B, realizing gene replacement. Gene replacement can introduce mutation sites (such as base substitutions, base insertions or base deletions) into the target plant genome, thereby changing the functional amino acid sites and/or the type and/or the activity and/or the content of the corresponding protein expressed in the target plant and yielding a plant mutant with a particular function or trait. In a specific embodiment of the present invention, the base mutation is specifically a base substitution.
Furthermore, the sizes of the DNA fragment A and the DNA fragment B can be 200-2000 bp, 200-1500 bp or 200-1000 bp.
Furthermore, the sizes of the DNA fragment A and the DNA fragment B are both 694 bp.
The DNA fragment A is the DNA molecule shown at positions 1300-1993 of sequence 5.
The DNA fragment B is the DNA molecule shown at positions 10695-11388 of sequence 1.
In a specific embodiment of the invention, the DNA fragment A consists, in order, of a 636 bp ALS gene fragment and the 58 bp fragment downstream of it. The DNA fragment B is the DNA molecule obtained by mutating the DNA fragment A from G to T at position 344, from G to T at position 581, from G to C at position 336, from G to C at position 339, from A to G at position 342, and from G to C at position 396. After the DNA fragment A in the rice genome is replaced by the DNA fragment B, the G-to-T mutations at positions 344 and 581 change amino acid 548 of the ALS protein expressed in rice from tryptophan (Trp) to leucine (Leu) and amino acid 627 from serine (Ser) to isoleucine (Ile), producing a precisely edited plant with herbicide resistance.
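The substitution list above can be applied mechanically. The following is a minimal Python sketch and not part of the patent: the helper name apply_substitutions and the toy sequence are hypothetical, because the real DNA fragment A (positions 1300-1993 of sequence 5) is not reproduced in this text. The helper checks the reference base before each change so that a coordinate error is caught early.

```python
# Hypothetical helper (not from the patent): apply 1-based point substitutions
# to a DNA fragment, verifying the expected reference base before each change.
def apply_substitutions(fragment, substitutions):
    bases = list(fragment.upper())
    for position, (ref, alt) in sorted(substitutions.items()):
        index = position - 1  # convert 1-based position to 0-based index
        if bases[index] != ref:
            raise ValueError(
                f"expected {ref} at position {position}, found {bases[index]}"
            )
        bases[index] = alt
    return "".join(bases)


# Substitutions listed in the text for DNA fragment A (1-based positions).
FRAGMENT_B_SUBSTITUTIONS = {
    336: ("G", "C"),
    339: ("G", "C"),
    342: ("A", "G"),
    344: ("G", "T"),
    396: ("G", "C"),
    581: ("G", "T"),
}

# Toy stand-in sequence (an assumption for illustration only).
toy_fragment = "ACGTGGA"
toy_mutated = apply_substitutions(toy_fragment, {5: ("G", "T")})
```

With the real 694 bp fragment A as input, passing FRAGMENT_B_SUBSTITUTIONS would be expected to yield fragment B, provided the positions are counted from the first base of fragment A.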
In the above method, the sgRNA has the following structure: tRNA, RNA transcribed from the DNA fragment A target sequence, and sgRNA backbone, in that order;
the tRNA is a1) or a2) or a3):
a1) an RNA molecule obtained by replacing T with U in positions 474-550 of sequence 1;
a2) an RNA molecule that has the same function and is obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule of a1);
a3) an RNA molecule that has 75% or more identity with the nucleotide sequence defined in a1) or a2) and has the same function;
the sgRNA backbone is b1) or b2) or b3):
b1) an RNA molecule obtained by replacing T with U in positions 571-646 of sequence 1;
b2) an RNA molecule that has the same function and is obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule of b1);
b3) an RNA molecule that has 75% or more identity with the nucleotide sequence defined in b1) or b2) and has the same function.
In the method, the donor DNA sequentially consists of the DNA fragment A target sequence, the PAM sequence, the DNA fragment B, the DNA fragment A target sequence and the PAM sequence. The DNA fragment A target sequence is specifically an ST215 target sequence; the ST215 target sequence is 551-570 th site of the sequence 1. The donor DNA is particularly 10672-11411 th site of the sequence 1.
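The coordinates given for the donor DNA can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 3-nt PAM length is inferred from the coordinates (a 23-nt flanking site minus the 20-nt ST215 target), and the placeholder bases in the toy donor are not the real sequences.

```python
def span_length(start, end):
    """Length of a 1-based, inclusive coordinate span."""
    return end - start + 1


def build_donor(target, pam, fragment_b):
    """Concatenate target + PAM + fragment B + target + PAM, as described."""
    site = target + pam
    return site + fragment_b + site


# Coordinates on sequence 1, taken from the description.
TARGET = span_length(551, 570)          # ST215 target sequence
LEFT_SITE = span_length(10672, 10694)   # 5' target + PAM on the donor
FRAGMENT_B = span_length(10695, 11388)  # DNA fragment B
RIGHT_SITE = span_length(11389, 11411)  # 3' target + PAM on the donor
DONOR = span_length(10672, 11411)       # whole donor DNA
PAM = LEFT_SITE - TARGET                # inferred PAM length

# Toy donor built from placeholder bases (assumption, not the real sequence).
toy_donor = build_donor("A" * TARGET, "TGG", "C" * FRAGMENT_B)
```

The parts add up: 23 + 694 + 23 = 740 nt, which matches the span 10672-11411 given for the donor DNA.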
In the above method, the Cas9 nickase may be a Cas9 D10A nickase or a Cas9 H840A nickase;
the Cas9 nickase variants include Cas9 derived from different bacteria (such as SaCas9, SaCas9-KKH and the like), Cas9 variants recognizing different PAMs (such as xCas9, Cas9-NG, Cas9-VQR, Cas9-VRER and the like), high-fidelity Cas9 variants (such as HypaCas9, eSpCas9(1.1), Cas9-HF1 and the like) and the like.
Further, the Cas9 D10A nickase is the SpCas9n protein;
the SpCas9n protein is C1) or C2):
C1) a protein whose amino acid sequence is shown in sequence 3;
C2) a protein that has the same function and is obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 3 of the sequence listing;
further, the gene encoding the SpCas9n protein is c1) or c2) or c3):
c1) a cDNA molecule or DNA molecule shown at positions 2877-6977 of sequence 1 in the sequence listing;
c2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in c1) and encodes the SpCas9n protein;
c3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in c1) or c2) and encodes the SpCas9n protein.
In the above method, the Cas9 nickase or the variant thereof carries a nuclear localization signal. The nuclear localization signal may be BP NLS, virD2 NLS, or SV40 NLS. The number of nuclear localization signals may be 1 or 2 or more.
Further, the nuclear localization signal is the SV40 NLS. The amino acid sequence of the SV40 NLS is sequence 2. The number of nuclear localization signals is 8.
Furthermore, the coding sequence of the SV40 NLS is positions 2742-2762 of sequence 1. The Cas9 nickase or variant thereof carries 4 SV40 NLSs at each end.
In the above method, the method for introducing the sgRNA, the Cas9 nickase or the variant thereof, and the donor DNA into the plant of interest comprises the steps of: introducing a DNA molecule that transcribes the sgRNA, a gene encoding the Cas9 nickase or a variant thereof, and a donor DNA into a plant of interest.
In the above method, the sgRNA is a tRNA-sgRNA. The tRNA-sgRNA transcribed from its DNA molecule is an immature RNA precursor; the tRNA in the precursor is excised by two enzymes (RNase P and RNase Z) to yield mature RNA. The number of independent mature RNAs obtained equals the number of targets in the recombinant expression vector, and each mature RNA consists, in order, of the RNA transcribed from the target sequence and the sgRNA backbone, or of the RNA transcribed from the target sequence, the sgRNA backbone and a few residual tRNA bases. In a specific embodiment of the invention, the recombinant expression vector contains one target.
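The precursor processing described above can be pictured with a simplified sketch. The code is an illustration under stated assumptions: the function names are hypothetical, the tRNA is assumed to be excised cleanly with no residual bases, and the toy inputs are placeholder bases of the right lengths rather than the real sequences. Only the lengths come from the coordinates on sequence 1.

```python
def transcribe(dna):
    """Convert a DNA sense strand to RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def mature_sgrna(trna_dna, target_dna, backbone_dna):
    """Model the precursor tRNA-target-backbone and remove the tRNA.

    Simplification (assumption): RNase P and RNase Z excise the tRNA cleanly,
    so no residual tRNA bases remain on the mature RNA.
    """
    precursor = transcribe(trna_dna + target_dna + backbone_dna)
    return precursor[len(trna_dna):]


# Element lengths derived from the coordinates on sequence 1.
TRNA_LEN = 550 - 474 + 1      # tRNA, positions 474-550
TARGET_LEN = 570 - 551 + 1    # ST215 target, positions 551-570
BACKBONE_LEN = 646 - 571 + 1  # sgRNA backbone, positions 571-646

# Toy inputs with the right lengths but placeholder bases.
toy_mature = mature_sgrna("A" * TRNA_LEN, "ACGT" * 5, "G" * BACKBONE_LEN)
```

Under these assumptions the mature RNA is 20 + 76 = 96 nt long; any residual tRNA bases, which the text allows for, would add to that length.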
Further, the sgRNA-transcribed DNA molecule, the gene encoding the Cas9 nickase or a variant thereof, and the donor DNA are introduced into the plant of interest via a recombinant expression vector. The sgRNA-transcribed DNA molecule, the gene encoding the Cas9 nickase or a variant thereof, and the donor DNA can be introduced into a target plant through the same recombinant expression vector, or can be introduced into a target plant through two or more recombinant expression vectors.
In a specific embodiment of the invention, the sgRNA-transcribed DNA molecule, the gene encoding the Cas9 nickase or a variant thereof and the donor DNA are introduced into the plant of interest via the same recombinant expression vector. The recombinant expression vector comprises an expression cassette consisting of a promoter, the DNA molecule for transcribing the sgRNA and a terminator in sequence, and an expression cassette consisting of a promoter, the coding gene of the Cas9 nickase or the variant thereof and a terminator in sequence.
Furthermore, the nucleotide sequence of the recombinant expression vector is shown as a sequence 1.
The application of the method in plant gene editing or plant mutant preparation or plant gene replacement efficiency improvement or by-product reduction in plant gene replacement also belongs to the protection scope of the invention.
The invention finally provides method I, method II, method III or method IV:
Method I is a method for plant gene editing, comprising the following step: replacing the target gene fragment in the plant genome according to the above method, thereby achieving plant gene editing. The editing may specifically be a base substitution.
Method II is a method for preparing a plant mutant, comprising the following step: replacing the target gene fragment in the plant genome according to the above method to obtain the plant mutant. The plant mutant may specifically be a herbicide-resistant mutant.
Method III is a method for improving plant gene replacement efficiency, comprising the following step: replacing the target gene fragment in the plant genome according to the above method. The replacement efficiency may specifically be the HDR replacement efficiency.
Method IV is a method for reducing the by-products of plant gene replacement, comprising the following step: replacing the target gene fragment in the plant genome according to the above method. The reduction in by-products is reflected in the fact that the products obtained by replacing the target gene fragment according to the above method contain no additional indels.
In the above method or use, the plant is any one of the following d1)-d3):
d1) a monocot or dicot;
d2) a gramineous plant;
d3) rice (e.g., Nipponbare).
The principle of the method for realizing gene replacement without generating DNA double-strand break provided by the invention is as follows: under the guidance of the sgRNA, the Cas9 D10A nickase generates a single-strand DNA nick at the genomic ALS target site; the donor DNA on the vector contains the herbicide-resistance mutation sites, and its 5' end and 3' end each carry one copy of the target sequence, at each of which the Cas9 D10A/sgRNA complex generates a nick; the repair mechanisms of the plant then precisely replace the genomic fragment with the donor DNA, yielding precisely edited plants with herbicide resistance.
The invention has the following advantages:
1. High efficiency: the efficiency of precise HDR replacement induced by a single-strand nick was 2-3 times higher than that induced by a DSB.
2. Few by-products: plants with nick-initiated precise HDR replacement contain no random indels, whereas DSB-initiated precise HDR replacement produces a large number of indels.
3. Low cost: the method for realizing gene replacement without generating DNA double-strand break only requires that the vector carry the Cas9n-related elements and the corresponding donor DNA, and precise replacement can be achieved by Agrobacterium infection.
The invention provides a method for realizing gene replacement without generating DNA double-strand break. Experiments prove that: the method for realizing gene replacement without generating DNA double-strand break realizes the accurate replacement of the endogenous acetolactate synthase ALS gene in rice, and obtains an accurate editing plant with herbicide resistance.
Drawings
FIG. 1 is a schematic diagram of the construction of the precise replacement vectors.
FIG. 2 is a schematic diagram of the working principle of nick-induced precise replacement of ALS.
FIG. 3 is a schematic diagram of the working principle of DSB-induced precise replacement of ALS.
FIG. 4 shows the specific-primer amplification results for plants obtained by nick-initiated precise replacement of ALS.
FIG. 5 shows the specific-primer amplification results for plants obtained by DSB-initiated precise replacement of ALS.
FIG. 6 shows the precise replacement outcomes of the nick and DSB approaches.
FIG. 7 is a schematic diagram of the construction of the precise replacement vector.
FIG. 8 shows the specific-primer amplification results for plants transformed with the conventional sgRNA/Cas9n vector.
FIG. 9 shows the specific-primer amplification results for plants transformed with the optimized esgRNA/Cas9n-P2A-Hpt vector.
Detailed Description
The following examples are given to facilitate a better understanding of the invention, but do not limit the invention.
The experimental procedures in the following examples are conventional unless otherwise specified. The test materials used in the following examples were purchased from a conventional biochemical reagent store unless otherwise specified. In the following examples, the 1 st position of each nucleotide sequence in the sequence Listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA, unless otherwise specified.
Primer pair P1 consists of primer HDR-F: 5'-gcgccccgattctcttatgtc-3' and primer HDR-R: 5'-acotatcctcacactggacg-3', and is used to detect whether precise replacement has occurred in a plant.
Primer pair P2 consists of primer gALS-F: 5'-atccccaggttacaaccacctg-3' and primer gALS-R: 5'-cacttaactcagagctattgcatag-3', and is used to amplify the genomic ALS sequence for sequencing.
HDR seedlings are those of the obtained T0 seedlings that contain the corresponding mutation sites brought in by the donor.
HDR replacement efficiency based on T0 seedlings = number of T0 seedlings in which precise replacement occurred / total number of T0 seedlings obtained.
HDR replacement efficiency based on the number of calli = number of T0 seedlings in which precise replacement occurred / total number of calli initially infected.
Indels efficiency = number of precisely replaced T0 seedlings in which indels occurred / total number of precisely replaced T0 seedlings.
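The three ratios defined above can be written as small functions. This is a minimal sketch: the function names are hypothetical, and the counts in the example are made-up numbers chosen only to show the arithmetic, not experimental results from the patent.

```python
def _ratio(numerator, denominator):
    """Divide two counts, rejecting an empty or negative denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be a positive count")
    return numerator / denominator


def hdr_efficiency_by_seedlings(precise_t0, total_t0):
    """Precisely replaced T0 seedlings / all T0 seedlings obtained."""
    return _ratio(precise_t0, total_t0)


def hdr_efficiency_by_calli(precise_t0, infected_calli):
    """Precisely replaced T0 seedlings / calli initially infected."""
    return _ratio(precise_t0, infected_calli)


def indel_efficiency(precise_t0_with_indels, precise_t0):
    """Precisely replaced T0 seedlings carrying indels / all precisely replaced T0 seedlings."""
    return _ratio(precise_t0_with_indels, precise_t0)


# Made-up counts for illustration only (not data from the patent).
example_by_seedlings = hdr_efficiency_by_seedlings(6, 120)
example_by_calli = hdr_efficiency_by_calli(6, 300)
example_indels = indel_efficiency(0, 6)
```

Note that the two HDR efficiencies share a numerator and differ only in the denominator, so the callus-based value can never exceed the seedling-based one unless more seedlings than calli are obtained.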
Nipponbare rice: described in the reference: The effects of sodium nitroprusside and its photolysis products on the growth of Nipponbare rice seedlings and the expression of 5 hormone marker genes [J]. Proceedings of university of south Henan (Nature edition), 2017 (2): 48-52; available to the public from the Beijing Academy of Agriculture and Forestry Sciences.
Recovery medium: N6 solid medium containing 200 mg/L timentin.
Screening medium 1: N6 solid medium containing 50 mg/L hygromycin.
Screening medium 2: N6 solid medium containing 0.4 μM bispyribac-sodium.
Differentiation medium: N6 solid medium containing 2 mg/L KT, 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
Rooting medium: N6 solid medium containing 0.2 mg/L NAA, 0.5 g/L glutamic acid, 0.5 g/L proline and 0.28 μM bispyribac-sodium.
In the following embodiments, the amino acid sequence of ALS protein is shown as sequence 6 in the sequence table, and the coding gene sequence is shown as 1 st to 1935 th site of sequence 5 in the sequence table.
Example 1 construction of vector for realizing Gene replacement without generating DNA double-stranded breaks and application thereof in Rice Gene replacement
1. Description of the principles of construction and replacement of recombinant expression vectors
1. Construction of recombinant expression vectors
The following recombinant expression vectors were artificially synthesized, each of which was a circular plasmid: an sgRNA/Cas9 recombinant expression vector and an sgRNA/Cas9n recombinant expression vector. The sgRNA/Cas9 recombinant expression vector and the sgRNA/Cas9n recombinant expression vector are shown in a structural schematic diagram in FIG. 1. The specific structural descriptions are as follows:
The sequence of the sgRNA/Cas9n recombinant expression vector is sequence 1 in the sequence listing. In sequence 1, positions 131-467 are the OsU3 promoter sequence, positions 474-550 the tRNA sequence, positions 551-570 the ST215 target sequence, positions 571-646 the sgRNA backbone sequence, and positions 647-937 the OsU3 terminator sequence. Positions 944-2657 are the OsUbq3 promoter sequence; positions 2742-2762, 2772-2792, 2802-2822, 2832-2852, 6996-7016, 7026-7046, 7056-7076 and 7086-7106 are coding sequences of the SV40 nuclear localization signal (encoding the SV40 nuclear localization signal shown in sequence 2); and positions 2877-6977 are the coding sequence of the SpCas9n protein (encoding the SpCas9n protein shown in sequence 3). Positions 7122-7376 are the Nos terminator sequence; positions 7405-9397 are the ZmUbi1 promoter sequence, positions 9404-10429 the hygromycin phosphotransferase coding sequence, and positions 10456-10671 the CaMV35S terminator sequence. Positions 10672-10694 and 11389-11411 are the ST215 target-site sequences (each consisting of the ST215 target sequence and a PAM sequence), and positions 10695-11388 are the ALS donor DNA sequence.
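The feature coordinates listed above lend themselves to a quick consistency check. The sketch below is illustrative: the feature labels and helper names are hypothetical, while the coordinates are transcribed from the paragraph above. It computes each feature's length and confirms that no two features overlap.

```python
# SV40 NLS coding copies: eight 21-nt spans, listed by start position.
NLS_STARTS = [2742, 2772, 2802, 2832, 6996, 7026, 7056, 7086]

# Feature coordinates on sequence 1 (1-based, inclusive), from the description.
FEATURES = [
    ("OsU3 promoter", 131, 467),
    ("tRNA", 474, 550),
    ("ST215 target", 551, 570),
    ("sgRNA backbone", 571, 646),
    ("OsU3 terminator", 647, 937),
    ("OsUbq3 promoter", 944, 2657),
    ("SpCas9n coding sequence", 2877, 6977),
    ("Nos terminator", 7122, 7376),
    ("ZmUbi1 promoter", 7405, 9397),
    ("hygromycin phosphotransferase coding sequence", 9404, 10429),
    ("CaMV35S terminator", 10456, 10671),
    ("ST215 target + PAM (5' of donor)", 10672, 10694),
    ("ALS donor DNA", 10695, 11388),
    ("ST215 target + PAM (3' of donor)", 11389, 11411),
] + [(f"SV40 NLS {i}", s, s + 20) for i, s in enumerate(NLS_STARTS, start=1)]


def feature_lengths(features):
    """Map each feature name to its length in nucleotides."""
    return {name: end - start + 1 for name, start, end in features}


def overlapping_pairs(features):
    """Return names of adjacent features (sorted by start) that overlap."""
    ordered = sorted(features, key=lambda f: f[1])
    return [(a[0], b[0]) for a, b in zip(ordered, ordered[1:]) if b[1] <= a[2]]


LENGTHS = feature_lengths(FEATURES)
OVERLAPS = overlapping_pairs(FEATURES)
```

The SpCas9n coding span is 4101 nt and each SV40 NLS span is 21 nt, both multiples of three, as expected for coding sequences.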
The sequence of the sgRNA/Cas9 recombinant expression vector is obtained by replacing positions 2877-6977 of sequence 1 with the SpCas9 protein coding sequence shown in sequence 4, keeping the other sequences unchanged.
2. Precise substitution principle of recombinant expression vector
1) Precise replacement principle of sgRNA/Cas9n recombinant expression vector
The sgRNA/Cas9n recombinant expression vector achieves precise replacement guided by single-strand nicks; the schematic diagram is shown in FIG. 2.
The sgRNA/Cas9n recombinant expression vector comprises the following elements: sgRNA, Cas9n and donor DNA.
The sgRNA targets the ST215 target sequence.
Donor DNA: the donor DNA consists of, in order, an ST215 target sequence, an ALS donor DNA sequence, and an ST215 target sequence.
The ALS donor DNA sequence (positions 10695-11388 of sequence 1) is a DNA molecule obtained by mutating DNA fragment A. DNA fragment A is the 694 bp fragment at positions 1300-1993 of sequence 5, a fragment of the rice genome that consists, in order, of a 636 bp ALS gene fragment and the 58 bp fragment downstream of it. The mutations comprise functional-site mutations and synonymous-site mutations.
Functional-site mutations: position 344 of DNA fragment A is mutated from G to T (changing amino acid 548 of the rice ALS protein from tryptophan (Trp) to leucine (Leu)), and position 581 is mutated from G to T (changing amino acid 627 of the rice ALS protein from serine (Ser) to isoleucine (Ile)). Rice in which amino acid 548 of the ALS protein is changed from Trp to Leu and amino acid 627 from Ser to Ile is resistant to the herbicide bispyribac-sodium. Position 548 of the rice ALS amino acid sequence is designated the W548L functional mutation site, and position 627 the S627I functional mutation site.
Synonymous-site mutations: to facilitate the later design of specific detection primers for identifying precisely replaced mutants, position 336 of DNA fragment A is mutated from G to C (corresponding to amino acid 545 of the rice ALS protein), position 339 from G to C (amino acid 546), position 342 from A to G (amino acid 547), and position 396 from G to C (amino acid 565). The synonymous mutations do not change the amino acids at the corresponding positions of the rice ALS protein. Positions 545, 546, 547 and 565 of the rice ALS amino acid sequence are designated the 545, 546, 547 and P565 synonymous mutation sites, respectively.
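The correspondence between positions in DNA fragment A and ALS amino acid positions follows from the coordinates given in this document: fragment A starts at position 1300 of sequence 5, and the ALS coding sequence occupies positions 1-1935 of sequence 5. The sketch below works through that mapping; the function and constant names are hypothetical.

```python
FRAGMENT_A_START = 1300  # first base of DNA fragment A on sequence 5
CDS_START = 1            # first base of the ALS coding sequence on sequence 5


def fragment_to_codon(fragment_pos):
    """Map a 1-based position in fragment A to (codon number, base within codon)."""
    cds_pos = FRAGMENT_A_START + fragment_pos - 1 - (CDS_START - 1)
    codon = (cds_pos - 1) // 3 + 1
    base_in_codon = (cds_pos - 1) % 3 + 1
    return codon, base_in_codon


# Positions mutated in fragment A, as listed in the text.
FUNCTIONAL_POSITIONS = [344, 581]
SYNONYMOUS_POSITIONS = [336, 339, 342, 396]

CODON_MAP = {
    pos: fragment_to_codon(pos)
    for pos in FUNCTIONAL_POSITIONS + SYNONYMOUS_POSITIONS
}
```

The two functional mutations fall on the second base of codons 548 and 627, and the four synonymous mutations fall on the third base of codons 545, 546, 547 and 565. Because tryptophan is encoded only by TGG, a G-to-T change at its second base gives TTG (leucine), consistent with the W548L designation.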
Guided by the ST215 target, the Cas9n/sgRNA complex generates two single-strand nicks at the target sequences of the donor in the vector and, at the same time, one single-strand nick at the target sequence in DNA fragment A of the rice genome; this scheme is denoted triNicks (three nicks). Through the in vivo repair mechanism of rice, the donor DNA precisely replaces the sequence at the nick site in the rice genome (DNA fragment A in the rice genome is replaced by the ALS donor DNA), so that the mutation sites on the ALS donor DNA are introduced into the rice genome. The resulting gene-replaced plant is herbicide-resistant.
2) Precise replacement principle of sgRNA/Cas9 recombinant expression vector
The sgRNA/Cas9 recombinant expression vector achieves precise replacement guided by DSBs; a schematic diagram is shown in FIG. 3.
The sgRNA/Cas9 recombinant expression vector comprises the following elements: sgRNA, Cas9, and donor DNA.
The sgRNA targets the ST215 target sequence.
Donor DNA: the donor DNA consists of, in order, an ST215 target sequence, the ALS donor DNA sequence, and an ST215 target sequence.
Guided by the ST215 target, the Cas9/sgRNA complex generates two DSBs at the target sequences of the donor in the vector and one DSB at the target sequence in the rice genome; this scheme is denoted triDSBs (three DSBs). Through the in vivo repair mechanism of rice, the donor DNA precisely replaces the sequence at the DSB site in the rice genome (DNA fragment A in the rice genome is replaced by the ALS donor DNA), so that the mutation sites on the ALS donor DNA are introduced into the rice genome. The resulting gene-replaced plant is herbicide-resistant.
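The "three sites" in triNicks and triDSBs follow directly from the donor layout: one target copy on each side of the ALS donor DNA, plus the single endogenous copy in the genome. The sketch below illustrates that bookkeeping in Python. The target, payload, and genome strings are short hypothetical placeholders, not the ST215 target or any sequence from the sequence listing, and the helper names are likewise assumptions.

```python
def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites(seq, target):
    """Count occurrences of target on either strand (overlaps allowed)."""
    seq, target = seq.upper(), target.upper()
    hits = 0
    for query in {target, revcomp(target)}:  # set avoids double counting palindromes
        start = seq.find(query)
        while start != -1:
            hits += 1
            start = seq.find(query, start + 1)
    return hits


def build_donor(target, payload):
    """Donor layout from the text: target, ALS donor DNA, target, in that order."""
    return target + payload + target


# Hypothetical placeholder sequences, for illustration only.
TARGET = "GCTAGTCCGATTGACCTAGATGG"
PAYLOAD = "ACACACACAC" * 3
GENOME = "TTTTT" + TARGET + "AAAAA"

donor = build_donor(TARGET, PAYLOAD)
total_sites = count_sites(donor, TARGET) + count_sites(GENOME, TARGET)
print(total_sites)
```

With Cas9n each of these sites receives a single-strand nick, and with Cas9 each receives a double-strand break; the site count is the same in both schemes.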
2. Obtaining of Rice Positive resistant callus
The sgRNA/Cas9n and sgRNA/Cas9 recombinant expression vectors obtained in step one were each processed according to steps 1-7 below:
1. The vector was introduced into Agrobacterium EHA105 (product of Shanghai Diego Biotechnology Ltd., CAT #: AC 1010) to obtain recombinant Agrobacterium.
2. The recombinant Agrobacterium was cultured in YEP medium containing 50 μg/ml kanamycin and 25 μg/ml rifampicin, with shaking at 28 °C and 150 rpm, until the target OD600 was reached. The culture was then centrifuged at room temperature at 10000 rpm for 1 min, and the cells were resuspended in infection solution (N6 liquid medium in which the sugar is replaced by glucose and sucrose, at concentrations of 10 g/L and 20 g/L respectively) and diluted to OD600 = 0.2 to obtain the Agrobacterium infection solution.
3. Mature seeds of the rice variety Nipponbare were dehulled, placed in a 100 mL Erlenmeyer flask, soaked in 70% (v/v) aqueous ethanol for 30 sec, then transferred to 25% (v/v) aqueous sodium hypochlorite and sterilized with shaking at 120 rpm for 30 min. The seeds were washed 3 times with sterile water, blotted dry on filter paper, placed embryo-down on N6 solid medium, and cultured in the dark at 28 °C for 4-6 weeks to obtain rice callus.
4. After step 3, the rice callus was soaked for 10 min in Agrobacterium infection solution A (obtained by adding acetosyringone to the Agrobacterium infection solution at a volume ratio of acetosyringone to Agrobacterium infection solution of 25 μl : 50 ml), then placed on a culture dish lined with two layers of sterile filter paper (containing about 200 ml of infection solution without Agrobacterium) and cultured in the dark at 21 °C for 1 day.
5. The rice callus obtained in step 4 was placed on recovery medium and cultured in the dark at 25-28 °C for 3 days.
6. The rice callus obtained in step 5 was placed on screening medium 1 and cultured in the dark at 28 °C for 2 weeks.
7. The rice callus obtained in step 6 was transferred to screening medium 2 and cultured in the dark at 28 °C for 4 weeks to obtain resistant rice callus.
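Steps 2 and 4 above involve two small calculations: diluting the resuspended cells to OD600 = 0.2, and adding acetosyringone at 25 μl per 50 ml of infection solution. The following is a minimal Python sketch of that arithmetic. It assumes that OD600 scales linearly with cell density over the working range, so that C1·V1 = C2·V2 applies, and that the OD600 of the resuspended cells is measured by the operator; the value 1.0 in the example is hypothetical, as are the function names.

```python
def dilution_volumes(od_stock, od_target, final_volume_ml):
    """Return (stock volume, diluent volume) in ml so the mix reaches od_target.

    Uses C1*V1 = C2*V2, which assumes OD600 is proportional to cell density.
    """
    if od_target <= 0 or od_stock < od_target:
        raise ValueError("stock OD600 must be at least the positive target OD600")
    stock_ml = od_target * final_volume_ml / od_stock
    return stock_ml, final_volume_ml - stock_ml


def acetosyringone_ul(infection_solution_ml):
    """Acetosyringone volume (μl) at the stated ratio of 25 μl per 50 ml."""
    return 25.0 * infection_solution_ml / 50.0


if __name__ == "__main__":
    # Hypothetical measured OD600 of the resuspended cells: 1.0
    stock, diluent = dilution_volumes(od_stock=1.0, od_target=0.2, final_volume_ml=50.0)
    print(f"{stock:.1f} ml suspension + {diluent:.1f} ml infection solution")
    print(f"{acetosyringone_ul(50.0):.1f} ul acetosyringone")
```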
3. Obtaining of T0 seedlings of rice
1. The resistant rice callus obtained in step two was placed on differentiation medium and cultured under light at 25 °C for about 1 month.
2. The differentiated plantlets were transferred to rooting medium and cultured under light at 25 °C for 2 weeks to obtain herbicide-resistant rice T0 seedlings.
4. Accurate replacement plant identification
1. After screening on rooting medium containing herbicide, genomic DNA was extracted from each surviving rice T0 seedling and used as the template for PCR amplification with a primer pair consisting of primer HDR-F (5'-gcgcgccgattctcttatgc-3') and primer HDR-R (5'-acctatcctcacaggcaactgacg-3'). The PCR amplification product was subjected to agarose gel electrophoresis and judged as follows: if the product contains a DNA fragment of about 833 bp, the corresponding rice T0 seedling is a precisely replaced positive T0 seedling; if the product does not contain a DNA fragment of about 833 bp, the corresponding rice T0 seedling is not precisely replaced.
2. The genomic ALS gene sequence of each precisely replaced positive T0 seedling selected in step 1 was PCR-amplified with a primer pair that includes primer gALS-F (5'-…); the PCR amplification product was subjected to first-generation (Sanger) sequencing to determine whether the corresponding sites were precisely replaced.
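The gel-based call in step 1 is a simple presence/absence rule on a band of about 833 bp. The sketch below expresses that rule in Python and adds a basic sanity check on the two HDR primers (length and GC fraction). The ±50 bp tolerance is a hypothetical choice, since the text only says "about 833 bp", and the function names are assumptions.

```python
HDR_F = "gcgcgccgattctcttatgc"      # primer HDR-F from the text
HDR_R = "acctatcctcacaggcaactgacg"  # primer HDR-R from the text
EXPECTED_BP = 833                   # diagnostic fragment size from the text


def gc_fraction(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.lower()
    return sum(base in "gc" for base in primer) / len(primer)


def is_precise_replacement(band_sizes_bp, expected_bp=EXPECTED_BP, tolerance_bp=50):
    """True if any observed band lies within the tolerance of the expected size.

    tolerance_bp is an illustrative assumption, not a value from the source.
    """
    return any(abs(size - expected_bp) <= tolerance_bp for size in band_sizes_bp)


if __name__ == "__main__":
    for name, primer in (("HDR-F", HDR_F), ("HDR-R", HDR_R)):
        print(name, len(primer), "nt, GC fraction", round(gc_fraction(primer), 2))
    print(is_precise_replacement([830]))  # band near 833 bp
    print(is_precise_replacement([]))     # no band
```

A positive call from this rule is only a preliminary screen; as step 2 states, sequencing is still needed to confirm the replacement.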
5. Analysis of results
1. Accurate replacement plant obtained by preliminary screening of primers in rice T0 seedling
The triNicks scheme yielded 32 transgenic positive seedlings in total (independent transformation events), all of which survived screening on rooting medium containing herbicide. Screening with the HDR-F and HDR-R primers identified 5 transgenic seedlings as precisely replaced plants; the PCR detection results are shown in FIG. 4.
The triDSBs scheme yielded 23 transgenic positive seedlings in total (independent transformation events), of which only 9 survived screening on rooting medium containing herbicide. Screening with the HDR-F and HDR-R primers identified 2 transgenic seedlings as precisely replaced plants; the PCR detection results are shown in FIG. 5.
2. Accurate replacement plant confirmed by sequencing in rice T0 seedling and corresponding efficiency
The first-generation sequencing results show that all plants with a positive PCR result were precisely replaced at the corresponding sites. The probability of precise replacement among T0 seedlings was therefore 15.6% (5/32) for the triNicks scheme and 8.7% (2/23) for the triDSBs scheme. In terms of overall transformation efficiency, calculated with the 840 pieces of resistant callus initially infected as the denominator, the probability of precise replacement was 0.6% (5/840) for the triNicks scheme and 0.2% (2/840) for the triDSBs scheme.
Taken together, the precise replacement efficiency of the triNicks scheme among T0 seedlings was 1.8 times that of the triDSBs scheme, and 3 times that of the triDSBs scheme when based on the initially infected calli. This demonstrates that the precise replacement efficiency of the nick-induced triNicks scheme is significantly higher than that of the DSB-induced triDSBs scheme.
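The percentages and fold changes above can be recomputed from the raw counts. The sketch below does so in Python, using exact fractions for the ratios; the variable and function names are assumptions. Note that the "3 times" figure corresponds to dividing the rounded percentages (0.6% by 0.2%), whereas the ratio of the raw counts with the common denominator of 840 is 5/2 = 2.5.

```python
from fractions import Fraction

CALLI_INFECTED = 840  # calli initially infected, per scheme, from the text

# Raw counts from the text: precisely replaced plants and transgenic-positive T0 seedlings.
COUNTS = {
    "triNicks": {"precise": 5, "t0_positive": 32},
    "triDSBs": {"precise": 2, "t0_positive": 23},
}


def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


def efficiencies(counts):
    """Return (T0-based, callus-based) precise replacement rates as exact fractions."""
    return (Fraction(counts["precise"], counts["t0_positive"]),
            Fraction(counts["precise"], CALLI_INFECTED))


nick_t0, nick_calli = efficiencies(COUNTS["triNicks"])
dsb_t0, dsb_calli = efficiencies(COUNTS["triDSBs"])

fold_t0 = nick_t0 / dsb_t0           # ratio among T0 seedlings
fold_calli = nick_calli / dsb_calli  # ratio based on infected calli

if __name__ == "__main__":
    for name, c in COUNTS.items():
        print(name, percent(c["precise"], c["t0_positive"]),
              percent(c["precise"], CALLI_INFECTED))
    print(float(fold_t0), float(fold_calli))
```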
In addition, the PCR amplification products of the precisely replaced samples were ligated into pEASY-B (Beijing TransGen Biotech), and 4-6 positive clones per sample were picked for sequencing to confirm the specific form of replacement; the sequencing results are shown in FIG. 6. All 5 T0 seedlings from the triNicks scheme were heterozygous replacement mutants with no additional indels in the genome, whereas both T0 seedlings from the triDSBs scheme carried an insertion of one base G at the genomic ST215 target site. This indicates that the nick-induced triNicks scheme produces fewer by-products in the genome than the DSB-induced triDSBs scheme.
Further analysis of the precisely replaced samples from the triNicks scheme showed that the replacement patterns at W548, P565 and S627 differed between samples: replacement occurred at all three sites simultaneously (mutant pattern 1), at W548 and P565 (mutant pattern 2), at W548 and S627 (mutant pattern 3), and at W548 and S627 (mutant pattern 4). The efficiency of each mutant pattern is shown in Table 1.
TABLE 1
Figure BDA0002348464030000091
Example 2 optimization of sgRNA/Cas9n recombinant expression vector and application thereof in rice gene replacement
1. Description of the principles of construction and replacement of recombinant expression vectors
1. Construction of recombinant expression vectors
The following recombinant expression vectors were artificially synthesized, each of which was a circular plasmid: sgRNA/Cas9n recombinant expression vector and esgRNA/Cas9n-P2A-Hpt recombinant expression vector. The sgRNA/Cas9n recombinant expression vector and the esgRNA/Cas9n-P2A-Hpt recombinant expression vector are shown in a structural schematic diagram in FIG. 7. The specific structural descriptions are as follows:
The sequence of the sgRNA/Cas9n recombinant expression vector is sequence 1 in the sequence listing.
The sequence of the esgRNA/Cas9n-P2A-Hpt recombinant expression vector is obtained by replacing positions 571-646 of sequence 1 with the esgRNA scaffold sequence shown in sequence 7 and replacing positions 7107-10671 of sequence 1 with the P2A-Hpt-tNos sequence shown in sequence 8, with all other sequences unchanged. In sequence 8, positions 22-78 are the coding sequence of the self-cleaving oligopeptide P2A (encoding the self-cleaving oligopeptide P2A shown in sequence 9), positions 79-1104 are the coding sequence of hygromycin phosphotransferase (encoding the hygromycin phosphotransferase shown in sequence 10), and positions 1111-1365 are the Nos terminator sequence.
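The derivation of the optimized vector is a pair of coordinate-based substitutions on sequence 1. The sketch below shows one way to apply such substitutions in Python, using the 1-based inclusive coordinates of the text and working from right to left so that earlier coordinates stay valid. The function names are assumptions, and the demonstration uses a toy string rather than the actual sequences 1, 7 and 8. The feature coordinates of sequence 8 are taken from the text and are used only to check feature lengths.

```python
def replace_ranges(seq, edits):
    """Apply (start, end, replacement) edits given in 1-based inclusive coordinates.

    All coordinates refer to the original sequence; edits are applied from the
    highest start position downwards so that earlier coordinates are unaffected.
    """
    ordered = sorted(edits, key=lambda e: e[0], reverse=True)
    for (start_lo, end_lo, _), (start_hi, _, _) in zip(ordered[1:], ordered):
        if end_lo >= start_hi:
            raise ValueError("edit ranges overlap")
    for start, end, replacement in ordered:
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# Intervals of sequence 1 that are replaced, from the text.
REPLACED_IN_SEQ1 = {"sgRNA scaffold": (571, 646), "Cas9n downstream region": (7107, 10671)}

# Feature coordinates within sequence 8, from the text.
SEQ8_FEATURES = {"P2A coding": (22, 78), "Hpt coding": (79, 1104), "Nos terminator": (1111, 1365)}

if __name__ == "__main__":
    print(replace_ranges("ABCDEFGHIJ", [(2, 3, "x"), (7, 9, "yyyy")]))
    for name, (start, end) in SEQ8_FEATURES.items():
        print(name, span_length(start, end), "nt")
```

The two coding features have lengths divisible by three (57 nt and 1026 nt), which is what an in-frame P2A-Hpt fusion requires.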
2. Explanation of the exact replacement principle for recombinant expression vectors
1) Precise replacement principle of sgRNA/Cas9n recombinant expression vector
Same as the precise replacement principle of the sgRNA/Cas9n recombinant expression vector in Example 1.
2) Precise replacement principle of esgRNA/Cas9n-P2A-Hpt recombinant expression vector
The precise replacement principle of the esgRNA/Cas9n-P2A-Hpt recombinant expression vector is the same as that of the sgRNA/Cas9n recombinant expression vector. The only differences are that the sgRNA scaffold of the sgRNA/Cas9n recombinant expression vector is replaced by the esgRNA scaffold sequence, and that the separate Cas9n expression cassette and the separate hygromycin phosphotransferase expression cassette of the sgRNA/Cas9n recombinant expression vector are replaced by a single expression cassette that co-expresses Cas9n and hygromycin phosphotransferase linked by the self-cleaving oligopeptide P2A.
2. Obtaining of Rice Positive resistant callus
The sgRNA/Cas9n and esgRNA/Cas9n-P2A-Hpt vectors obtained in step one were each processed according to the method in step two of Example 1.
3. Obtaining of T0 seedling of rice
The same procedure as in step three of example 1.
4. Accurate replacement plant identification
The same procedure as in step four of example 1.
5. Analysis of results
1. Accurate replacement plant obtained by preliminary screening of primers in rice T0 seedling
The sgRNA/Cas9n ordinary vector yielded 32 transgenic positive seedlings in total (independent transformation events), all of which survived screening on rooting medium containing herbicide. Screening with the HDR-F and HDR-R primers identified 5 transgenic seedlings as precisely replaced plants; the PCR detection results are shown in FIG. 8.
The esgRNA/Cas9n-P2A-Hpt optimized vector yielded 55 transgenic positive seedlings in total (independent transformation events), of which 24 survived screening on rooting medium containing herbicide. Screening with the HDR-F and HDR-R primers identified 12 transgenic seedlings as precisely replaced plants; the PCR detection results are shown in FIG. 9.
2. Accurate replacement plant confirmed by sequencing in rice T0 seedling and corresponding efficiency
The first-generation sequencing results show that all plants with a positive PCR result were precisely replaced at the corresponding sites, with no additional indels. The probability of precise replacement among T0 seedlings was therefore 15.6% (5/32) for the sgRNA/Cas9n ordinary vector and 21.8% (12/55) for the esgRNA/Cas9n-P2A-Hpt optimized vector. In terms of overall transformation efficiency, calculated with the 840 pieces of resistant callus initially infected as the denominator, the probability of precise replacement was 0.6% (5/840) for the sgRNA/Cas9n ordinary vector and 1.4% (12/840) for the esgRNA/Cas9n-P2A-Hpt optimized vector.
In conclusion, the precise replacement efficiency of the esgRNA/Cas9n-P2A-Hpt optimized vector among T0 seedlings is 1.4 times that of the sgRNA/Cas9n ordinary vector, and 2.3 times that of the sgRNA/Cas9n ordinary vector when based on the initially infected calli. The precise replacement efficiency of the esgRNA/Cas9n-P2A-Hpt optimized vector is thus higher than that of the sgRNA/Cas9n ordinary vector.
TABLE 2
Figure BDA0002348464030000101
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the technical principle of the present invention, and such modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Sequence listing
<110> Beijing Academy of Agriculture and Forestry Sciences
<120> a method for realizing gene replacement without causing DNA double strand break
<160>10
<170>PatentIn version 3.5
<210>1
<211>11586
<212>DNA
<213>Artificial Sequence
<400>1
ggtggcagga tatattgtgg tgtaaacatg gcactagcct caccgtcttc gcagacgagg 60
ccgctaagtc gcagctacgc tctcaacggc actgactagg tagtttaaac gtgcacttaa 120
ttaaggtacc gaagcaactt aaagttatca ggcatgcatg gatcttggag gaatcagatg 180
tgcagtcagg gaccatagca caagacaggc gtcttctact ggtgctacca gcaaatgctg 240
gaagccggga acactgggta cgttggaaac cacgtgatgt gaagaagtaa gataaactgt 300
aggagaaaag catttcgtag tgggccatga agcctttcag gacatgtatt gcagtatggg 360
ccggcccatt acgcaattgg acgacaacaa agactagtat tagtaccacc tcggctatcc 420
acatagatca aagctgattt aaaagagttg tgcagatgat ccgtggcgga tccaacaaag 480
caccagtggt ctagtggtag aatagtaccc tgccacggta cagacccggg ttcgattccc 540
ggctggtgca atttgggtat ggtggtgcaa gttttagagc tagaaatagc aagttaaaat 600
aaggctagtc cgttatcaac ttgaaaaagt ggcaccgagt cggtgctttt ttttttcgtt 660
ttgcattgag ttttctccgt cgcatgtttg cagttttatt ttccgttttg cattgaaatt 720
tctccgtctc atgtttgcag cgtgttcaaa aagtacgcag ctgtatttca cttatttacg 780
gcgccacatt ttcatgccgt ttgtgccaac tatcccgagc tagtgaatac agcttggctt 840
cacacaacac tggtgacccg ctgacctgct cgtacctcgt accgtcgtac ggcacagcat 900
ttggaattaa agggtgtgat cgatactgct tgctgctaag cttacaaatt cgggtcaagg 960
cggaagccag cgcgccaccc cacgtcagca aatacggagg cgcggggttg acggcgtcac 1020
ccggtcctaa cggcgaccaa caaaccagcc agaagaaatt acagtaaaaa aaaagtaaat 1080
tgcactttga tccacctttt attacctaag tctcaatttg gatcaccctt aaacctatct 1140
tttcaatttg ggccgggttg tggtttggac taccatgaac aacttttcgt catgtctaac 1200
ttccctttca gcaaacatat gaaccatata tagaggagat cggccgtata ctagagctga 1260
tgtgtttaag gtcgttgatt gcacgagaaa aaaaaatcca aatcgcaaca atagcaaatt 1320
tatctggttc aaagtgaaaa gatatgttta aaggtagtcc aaagtaaaac ttatagataa 1380
taaaatgtgg tccaaagcgt aattcactca aaaaaaatca acgagacgtg taccaaacgg 1440
agacaaacgg catcttctcg aaatttccca accgctcgct cgcccgcctc gtcttcccgg 1500
aaaccgcggt ggtttcagcg tggcggattc tccaagcaga cggagacgtc acggcacggg 1560
actcctccca ccacccaacc gccataaata ccagccccct catctcctct cctcgcatca 1620
gctccacccc cgaaaaattt ctccccaatc tcgcgaggct ctcgtcgtcg aatcgaatcc 1680
tctcgcgtcc tcaaggtacg ctgcttctcc tctcctcgct tcgtttcgat tcgatttcgg 1740
acgggtgagg ttgttttgtt gctagatccg attggtggtt agggttgtcg atgtgattat 1800
cgtgagatgt ttaggggttg tagatctgat ggttgtgatt tgggcacggt tggttcgata 1860
ggtggaatcg tggttaggtt ttgggattgg atgttggttc tgatgattgg ggggaatttt 1920
tacggttaga tgaattgttg gatgattcga ttggggaaat cggtgtagat ctgttgggga 1980
attgtggaac tagtcatgcc tgagtgattg gtgcgatttg tagcgtgttc catcttgtag 2040
gccttgttgc gagcatgttc agatctactg ttccgctctt gattgagtta ttggtgccat 2100
gggttggtgc aaacacaggc tttaatatgt tatatctgtt ttgtgtttga tgtagatctg 2160
tagggtagtt cttcttagac atggttcaat tatgtagctt gtgcgtttcg atttgatttc 2220
atatgttcac agattagata atgatgaact cttttaatta attgtcaatg gtaaatagga 2280
agtcttgtcg ctatatctgt cataatgatc tcatgttact atctgccagt aatttatgct 2340
aagaactata ttagaatatc atgttacaat ctgtagtaat atcatgttac aatctgtagt 2400
tcatctatat aatctattgt ggtaatttct ttttactatc tgtgtgaaga ttattgccac 2460
tagttcattc tacttatttc tgaagttcag gatacgtgtg ctgttactac ctatctgaat 2520
acatgtgtga tgtgcctgtt actatctttt tgaatacatg tatgttctgt tggaatatgt 2580
ttgctgtttg atccgttgtt gtgtccttaa tcttgtgcta gttcttaccc tatctgtttg 2640
gtgattattt cttgcagtac gtaagcatgg actacaagga ccacgacggg gattacaaag 2700
accacgacat agactacaag gatgacgatg acaaaatggc accgaagaaa aaaaggaagg 2760
tcggcggctc cccgaagaaa aaaaggaagg tcggcggctc cccgaagaaa aaaaggaagg 2820
tcggcggctc cccgaagaaa aaaaggaagg tcggaatcca tggcgttcca gctgccgaca 2880
agaagtactc catcggcctc gccatcggca ccaacagcgt cggctgggcg gtgatcaccg 2940
acgagtacaa ggtcccgtcc aagaagttca aggtcctggg caacaccgac cgccactcca 3000
tcaagaagaa cctcatcggc gccctcctct tcgactccgg cgagacggcg gaggcgaccc 3060
gcctcaagcg caccgcccgc cgccgctaca cccgccgcaa gaaccgcatc tgctacctcc 3120
aggagatctt ctccaacgag atggcgaagg tcgacgactc cttcttccac cgcctcgagg 3180
agtccttcct cgtggaggag gacaagaagc acgagcgcca ccccatcttc ggcaacatcg 3240
tcgacgaggt cgcctaccac gagaagtacc ccactatcta ccaccttcgt aagaagcttg 3300
ttgactctac tgataaggct gatcttcgtc tcatctacct tgctctcgct cacatgatca 3360
agttccgtgg tcacttcctt atcgagggtg accttaaccc tgataactcc gacgtggaca 3420
agctcttcat ccagctcgtc cagacctaca accagctctt cgaggagaac cctatcaacg 3480
cttccggtgt cgacgctaag gcgatccttt ccgctaggct ctccaagtcc aggcgtctcg 3540
agaacctcat cgcccagctc cctggtgaga agaagaacgg tcttttcggt aacctcatcg 3600
ctctctccct cggtctgacc cctaacttca agtccaactt cgacctcgct gaggacgcta 3660
agcttcagct ctccaaggat acctacgacg atgatctcga caacctcctc gctcagattg 3720
gagatcagta cgctgatctc ttccttgctg ctaagaacct ctccgatgct atcctccttt 3780
cggatatcct tagggttaac actgagatca ctaaggctcc tctttctgct tccatgatca 3840
agcgctacga cgagcaccac caggacctca ccctcctcaa ggctcttgtt cgtcagcagc 3900
tccccgagaa gtacaaggag atcttcttcg accagtccaa gaacggctac gccggttaca 3960
ttgacggtgg agctagccag gaggagttct acaagttcat caagccaatc cttgagaaga 4020
tggatggtac tgaggagctt ctcgttaagc ttaaccgtga ggacctcctt aggaagcaga 4080
ggactttcga taacggctct atccctcacc agatccacct tggtgagctt cacgccatcc 4140
ttcgtaggca ggaggacttc taccctttcc tcaaggacaa ccgtgagaag atcgagaaga 4200
tccttacttt ccgtattcct tactacgttg gtcctcttgc tcgtggtaac tcccgtttcg 4260
cttggatgac taggaagtcc gaggagacta tcaccccttg gaacttcgag gaggttgttg 4320
acaagggtgc ttccgcccag tccttcatcg agcgcatgac caacttcgac aagaacctcc 4380
ccaacgagaa ggtcctcccc aagcactccc tcctctacga gtacttcacg gtctacaacg 4440
agctcaccaa ggtcaagtac gtcaccgagg gtatgcgcaa gcctgccttc ctctccggcg 4500
agcagaagaa ggctatcgtt gacctcctct tcaagaccaa ccgcaaggtc accgtcaagc 4560
agctcaagga ggactacttc aagaagatcg agtgcttcga ctccgtcgag atcagcggcg 4620
ttgaggaccg tttcaacgct tctctcggta cctaccacga tctcctcaag atcatcaagg 4680
acaaggactt cctcgacaac gaggagaacg aggacatcct cgaggacatc gtcctcactc 4740
ttactctctt cgaggatagg gagatgatcg aggagaggct caagacttac gctcatctct 4800
tcgatgacaa ggttatgaag cagctcaagc gtcgccgtta caccggttgg ggtaggctct 4860
cccgcaagct catcaacggt atcagggata agcagagcgg caagactatc ctcgacttcc 4920
tcaagtctga tggtttcgct aacaggaact tcatgcagct catccacgat gactctctta 4980
ccttcaagga ggatattcag aaggctcagg tgtccggtca gggcgactct ctccacgagc 5040
acattgctaa ccttgctggt tcccctgcta tcaagaaggg catccttcag actgttaagg 5100
ttgtcgatga gcttgtcaag gttatgggtc gtcacaagcc tgagaacatc gtcatcgaga 5160
tggctcgtga gaaccagact acccagaagg gtcagaagaa ctcgagggag cgcatgaaga 5220
ggattgagga gggtatcaag gagcttggtt ctcagatcct taaggagcac cctgtcgaga 5280
acacccagct ccagaacgag aagctctacc tctactacct ccagaacggt agggatatgt 5340
acgttgacca ggagctcgac atcaacaggc tttctgacta cgacgtcgac cacattgttc 5400
ctcagtcttt ccttaaggat gactccatcg acaacaaggt cctcacgagg tccgacaaga 5460
acaggggtaa gtcggacaac gtcccttccg aggaggttgt caagaagatg aagaactact 5520
ggaggcagct tctcaacgct aagctcatta cccagaggaa gttcgacaac ctcacgaagg 5580
ctgagagggg tggcctttcc gagcttgaca aggctggttt catcaagagg cagcttgttg 5640
agacgaggca gattaccaag cacgttgctc agatcctcga ttctaggatg aacaccaagt 5700
acgacgagaa cgacaagctc atccgcgagg tcaaggtgat caccctcaag tccaagctcg 5760
tctccgactt ccgcaaggac ttccagttct acaaggtccg cgagatcaac aactaccacc 5820
acgctcacga tgcttacctt aacgctgtcg ttggtaccgc tcttatcaag aagtacccta 5880
agcttgagtc cgagttcgtc tacggtgact acaaggtcta cgacgttcgt aagatgatcg 5940
ccaagtccga gcaggagatc ggcaaggcca ccgccaagta cttcttctac tccaacatca 6000
tgaacttctt caagaccgag atcaccctcg ccaacggcga gatccgcaag cgccctctta 6060
tcgagacgaa cggtgagact ggtgagatcg tttgggacaa gggtcgcgac ttcgctactg 6120
ttcgcaaggt cctttctatg cctcaggtta acatcgtcaa gaagaccgag gtccagaccg 6180
gtggcttctc caaggagtct atccttccaa agagaaactc ggacaagctc atcgctagga 6240
agaaggattg ggaccctaag aagtacggtg gtttcgactc ccctactgtc gcctactccg 6300
tcctcgtggt cgccaaggtg gagaagggta agtcgaagaa gctcaagtcc gtcaaggagc 6360
tcctcggcat caccatcatg gagcgctcct ccttcgagaa gaacccgatc gacttcctcg 6420
aggccaaggg ctacaaggag gtcaagaagg acctcatcat caagctcccc aagtactctc 6480
ttttcgagct cgagaacggt cgtaagagga tgctggcttc cgctggtgag ctccagaagg 6540
gtaacgagct tgctcttcct tccaagtacg tgaacttcct ctacctcgcc tcccactacg 6600
agaagctcaa gggttcccct gaggataacg agcagaagca gctcttcgtg gagcagcaca 6660
agcactacct cgacgagatc atcgagcaga tctccgagtt ctccaagcgc gtcatcctcg 6720
ctgacgctaa cctcgacaag gtcctctccg cctacaacaa gcaccgcgac aagcccatcc 6780
gcgagcaggc cgagaacatc atccacctct tcacgctcac gaacctcggc gcccctgctg 6840
ctttcaagta cttcgacacc accatcgaca ggaagcgtta cacgtccacc aaggaggttc 6900
tcgacgctac tctcatccac cagtccatca ccggtcttta cgagactcgt atcgaccttt 6960
cccagcttgg tggtgatgac gatgacaaaa tggcaccgaa gaaaaaaagg aaggtcggcg 7020
gctccccgaa gaaaaaaagg aaggtcggcg gctccccgaa gaaaaaaagg aaggtcggcg 7080
gctccccgaa gaaaaaaagg aaggtcggaa tccatggcta gtcccgatcg ttcaaacatt 7140
tggcaataaa gtttcttaag attgaatcct gttgccggtc ttgcgatgat tatcatataa 7200
tttctgttga attacgttaa gcatgtaata attaacatgt aatgcatgac gttatttatg 7260
aggtgggttt ttatgattag agtcccgcaa ttatacattt aatacgcgat agaaaacaaa 7320
atatagcgcg caaactagga taaattatcg cgcgcggtgt catctatgtt actagaggcg 7380
cgcctggtgg atcgtccgcc taggctgcag tgcagcgtga cccggtcgtg cccctctcta 7440
gagataatga gcattgcatg tctaagttat aaaaaattac cacatatttt ttttgtcaca 7500
cttgtttgaa gtgcagttta tctatcttta tacatatatt taaactttac tctacgaata 7560
atataatcta tagtactaca ataatatcag tgttttagag aatcatataa atgaacagtt 7620
agacatggtc taaaggacaa ttgagtattt tgacaacagg actctacagt tttatctttt 7680
tagtgtgcat gtgttctcct ttttttttgc aaatagcttc acctatataa tacttcatcc 7740
attttattag tacatccatt tagggtttag ggttaatggt ttttatagac taattttttt 7800
agtacatcta ttttattcta ttttagcctc taaattaaga aaactaaaac tctattttag 7860
tttttttatt taataattta gatataaaat agaataaaat aaagtgacta aaaattaaac 7920
aaataccctt taagaaatta aaaaaactaa ggaaacattt ttcttgtttc gagtagataa 7980
tgccagcctg ttaaacgccg tcgacgagtc taacggacac caaccagcga accagcagcg 8040
tcgcgtcggg ccaagcgaag cagacggcac ggcatctctg tcgctgcctc tggacccctc 8100
tcgagagttc cgctccaccg ttggacttgc tccgctgtcg gcatccagaa attgcgtggc 8160
ggagcggcag acgtgagccg gcacggcagg cggcctcctc ctcctctcac ggcaccggca 8220
gctacggggg attcctttcc caccgctcct tcgctttccc ttcctcgccc gccgtaataa 8280
atagacaccc cctccacacc ctctttcccc aacctcgtgt tgttcggagc gcacacacac 8340
acaaccagat ctcccccaaa tccacccgtc ggcacctccg cttcaaggta cgccgctcgt 8400
cctccccccc cccccctctc taccttctct agatcggcgt tccggtccat ggttagggcc 8460
cggtagttct acttctgttc atgtttgtgt tagatccgtg tttgtgttag atccgtgctg 8520
ctagcgttcg tacacggatg cgacctgtac gtcagacacg ttctgattgc taacttgcca 8580
gtgtttctct ttggggaatc ctgggatggc tctagccgtt ccgcagacgg gatcgatttc 8640
atgatttttt ttgtttcgtt gcatagggtt tggtttgccc ttttccttta tttcaatata 8700
tgccgtgcac ttgtttgtcg ggtcatcttt tcatgctttt ttttgtcttg gttgtgatga 8760
tgtggtctgg ttgggcggtc gttctagatc ggagtagaat tctgtttcaa actacctggt 8820
ggatttatta attttggatc tgtatgtgtg tgccatacat attcatagtt acgaattgaa 8880
gatgatggat ggaaatatcg atctaggata ggtatacatg ttgatgcggg ttttactgat 8940
gcatatacag agatgctttt tgttcgcttg gttgtgatga tgtggtgtgg ttgggcggtc 9000
gttcattcgt tctagatcgg agtagaatac tgtttcaaac tacctggtgt atttattaat 9060
tttggaactg tatgtgtgtg tcatacatct tcatagttac gagtttaaga tggatggaaa 9120
tatcgatcta ggataggtat acatgttgat gtgggtttta ctgatgcata tacatgatgg 9180
catatgcagc atctattcat atgctctaac cttgagtacc tatctattat aataaacaag 9240
tatgttttat aattattttg atcttgatat acttggatga tggcatatgc agcagctata 9300
tgtggatttt tttagccctg ccttcatacg ctatttattt gcttggtact gtttcttttg 9360
tcgatgctca ccctgttgtt tggtgttact tctgcaggag ctcatgaaaa agcctgaact 9420
caccgcgacg tctgtcgaga agtttctgat cgaaaagttc gacagcgtct ccgacctgat 9480
gcagctctcg gagggcgaag aatctcgtgc tttcagcttc gatgtaggag ggcgtggata 9540
tgtcctgcgg gtaaatagct gcgccgatgg tttctacaaa gatcgttatg tttatcggca 9600
ctttgcatcg gccgcgctcc cgattccgga agtgcttgac attggggagt ttagcgagag 9660
cctgacctat tgcatctccc gccgttcaca gggtgtcacg ttgcaagacc tgcctgaaac 9720
cgaactgccc gctgttctac aaccggtcgc ggaggctatg gatgcgatcg ctgcggccga 9780
tcttagccag acgagcgggt tcggcccatt cggaccgcaa ggaatcggtc aatacactac 9840
atggcgtgat ttcatatgcg cgattgctga tccccatgtg tatcactggc aaactgtgat 9900
ggacgacacc gtcagtgcgt ccgtcgcgca ggctctcgat gagctgatgc tttgggccga 9960
ggactgcccc gaagtccggc acctcgtgca cgcggatttc ggctccaaca atgtcctgac 10020
ggacaatggc cgcataacag cggtcattga ctggagcgag gcgatgttcg gggattccca 10080
atacgaggtc gccaacatct tcttctggag gccgtggttg gcttgtatgg agcagcagac 10140
gcgctacttc gagcggaggc atccggagct tgcaggatcg ccacgactcc gggcgtatat 10200
gctccgcatt ggtcttgacc aactctatca gagcttggtt gacggcaatt tcgatgatgc 10260
agcttgggcg cagggtcgat gcgacgcaat cgtccgatcc ggagccggga ctgtcgggcg 10320
tacacaaatc gcccgcagaa gcgcggccgt ctggaccgat ggctgtgtag aagtactcgc 10380
cgatagtgga aaccgacgcc ccagcactcg tccgagggca aagaaataga gtagatgccg 10440
accgggatct gtcgatcgac aagctcgagt ttctccataa taatgtgtga gtagttccca 10500
gataagggaa ttagggttcc tatagggttt cgctcatgtg ttgagcatat aagaaaccct 10560
tagtatgtat ttgtatttgt aaaatacttc tatcaataaa atttctaatt cctaaaacca 10620
aaatccagta ctaaaatcca gatcccccga attaattcgg cgttaattca gatttgggta 10680
tggtggtgca acgggaagag atcccaccgc aatatgccat tcaggtgctg gatgagctga 10740
cgaaaggtga ggcaatcatc gctactggtg ttgggcagca ccagatgtgg gcggcacaat 10800
attacaccta caagcggcca cggcagtggc tgtcttcggc tggtctgggc gcaatgggat 10860
ttgggctgcc tgctgcagct ggtgcttctg tggctaaccc aggtgtcaca gttgttgata 10920
ttgatgggga tggtagcttc ctcatgaaca ttcaggagct ggcattgatc cgcattgaga 10980
acctccctgt gaaggtgatg gtgttgaaca accaacattt gggtatggtc gtccagttgg 11040
aggataggtt ttacaaggcg aatagggcgc atacatactt gggcaacccc gaatgtgaga 11100
gcgagatata tccagatttt gtgactattg ctaaggggtt caatattcct gcagtccgtg 11160
taacaaagaa gagtgaagtc cgtgccgcca tcaagaagat gctcgagact ccagggccat 11220
acttgttgga tatcatcgtc ccgcaccagg agcatgtgct gcctatgatc ccaattgggg 11280
gcgcattcaa ggacatgatc ctggatggtg atggcaggac tgtgtattaa tctataatct 11340
gtatgttggc aaagcaccag cccggcctat gtttgacctg aatgacccat ttgggtatgg 11400
tggtgcaacg gcctgcagga cgcgtttaat taagtgcacg cggccgccta cttagtcaag 11460
agcctcgcac gcgactgtca cgcggccagg atcgcctcgt gagcctcgca atctgtacct 11520
agtgtttaaa ctatcagtgt ttgacaggat atattggcgg gtaaacctaa gagaaaagag 11580
cgttta 11586
<210>2
<211>7
<212>PRT
<213>Artificial Sequence
<400>2
Pro Lys Lys Lys Arg Lys Val
1 5
<210>3
<211>1367
<212>PRT
<213>Artificial Sequence
<400>3
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1025 1030 1035
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1040 1045 1050
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1055 1060 1065
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1070 1075 1080
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1085 1090 1095
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1100 1105 1110
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1115 1120 1125
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1130 1135 1140
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1145 1150 1155
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1160 1165 1170
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1175 1180 1185
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1190 1195 1200
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1205 1210 1215
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1220 1225 1230
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
1265 1270 1275
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1280 1285 1290
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1295 1300 1305
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1310 1315 1320
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
1325 1330 1335
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1340 1345 1350
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210>4
<211>4101
<212>DNA
<213>Artificial Sequence
<400>4
gacaagaagt actccatcgg cctcgacatc ggcaccaaca gcgtcggctg ggcggtgatc 60
accgacgagt acaaggtccc gtccaagaag ttcaaggtcc tgggcaacac cgaccgccac 120
tccatcaaga agaacctcat cggcgccctc ctcttcgact ccggcgagac ggcggaggcg 180
acccgcctca agcgcaccgc ccgccgccgc tacacccgcc gcaagaaccg catctgctac 240
ctccaggaga tcttctccaa cgagatggcg aaggtcgacg actccttctt ccaccgcctc 300
gaggagtcct tcctcgtgga ggaggacaag aagcacgagc gccaccccat cttcggcaac 360
atcgtcgacg aggtcgccta ccacgagaag taccccacta tctaccacct tcgtaagaag 420
cttgttgact ctactgataa ggctgatctt cgtctcatct accttgctct cgctcacatg 480
atcaagttcc gtggtcactt ccttatcgag ggtgacctta accctgataa ctccgacgtg 540
gacaagctct tcatccagct cgtccagacc tacaaccagc tcttcgagga gaaccctatc 600
aacgcttccg gtgtcgacgc taaggcgatc ctttccgcta ggctctccaa gtccaggcgt 660
ctcgagaacc tcatcgccca gctccctggt gagaagaaga acggtctttt cggtaacctc 720
atcgctctct ccctcggtct gacccctaac ttcaagtcca acttcgacct cgctgaggac 780
gctaagcttc agctctccaa ggatacctac gacgatgatc tcgacaacct cctcgctcag 840
attggagatc agtacgctga tctcttcctt gctgctaaga acctctccga tgctatcctc 900
ctttcggata tccttagggt taacactgag atcactaagg ctcctctttc tgcttccatg 960
atcaagcgct acgacgagca ccaccaggac ctcaccctcc tcaaggctct tgttcgtcag 1020
cagctccccg agaagtacaa ggagatcttc ttcgaccagt ccaagaacgg ctacgccggt 1080
tacattgacg gtggagctag ccaggaggag ttctacaagt tcatcaagcc aatccttgag 1140
aagatggatg gtactgagga gcttctcgtt aagcttaacc gtgaggacct ccttaggaag 1200
cagaggactt tcgataacgg ctctatccct caccagatcc accttggtga gcttcacgcc 1260
atccttcgta ggcaggagga cttctaccct ttcctcaagg acaaccgtga gaagatcgag 1320
aagatcctta ctttccgtat tccttactac gttggtcctc ttgctcgtgg taactcccgt 1380
ttcgcttgga tgactaggaa gtccgaggag actatcaccc cttggaactt cgaggaggtt 1440
gttgacaagg gtgcttccgc ccagtccttc atcgagcgca tgaccaactt cgacaagaac 1500
ctccccaacg agaaggtcct ccccaagcac tccctcctct acgagtactt cacggtctac 1560
aacgagctca ccaaggtcaa gtacgtcacc gagggtatgc gcaagcctgc cttcctctcc 1620
ggcgagcaga agaaggctat cgttgacctc ctcttcaaga ccaaccgcaa ggtcaccgtc 1680
aagcagctca aggaggacta cttcaagaag atcgagtgct tcgactccgt cgagatcagc 1740
ggcgttgagg accgtttcaa cgcttctctc ggtacctacc acgatctcct caagatcatc 1800
aaggacaagg acttcctcga caacgaggag aacgaggaca tcctcgagga catcgtcctc 1860
actcttactc tcttcgagga tagggagatg atcgaggaga ggctcaagac ttacgctcat 1920
ctcttcgatg acaaggttat gaagcagctc aagcgtcgcc gttacaccgg ttggggtagg 1980
ctctcccgca agctcatcaa cggtatcagg gataagcaga gcggcaagac tatcctcgac 2040
ttcctcaagt ctgatggttt cgctaacagg aacttcatgc agctcatcca cgatgactct 2100
cttaccttca aggaggatat tcagaaggct caggtgtccg gtcagggcga ctctctccac 2160
gagcacattg ctaaccttgc tggttcccct gctatcaaga agggcatcct tcagactgtt 2220
aaggttgtcg atgagcttgt caaggttatg ggtcgtcaca agcctgagaa catcgtcatc 2280
gagatggctc gtgagaacca gactacccag aagggtcaga agaactcgag ggagcgcatg 2340
aagaggattg aggagggtat caaggagctt ggttctcaga tccttaagga gcaccctgtc 2400
gagaacaccc agctccagaa cgagaagctc tacctctact acctccagaa cggtagggat 2460
atgtacgttg accaggagct cgacatcaac aggctttctg actacgacgt cgaccacatt 2520
gttcctcagt ctttccttaa ggatgactcc atcgacaaca aggtcctcac gaggtccgac 2580
aagaacaggg gtaagtcgga caacgtccct tccgaggagg ttgtcaagaa gatgaagaac 2640
tactggaggc agcttctcaa cgctaagctc attacccaga ggaagttcga caacctcacg 2700
aaggctgaga ggggtggcct ttccgagctt gacaaggctg gtttcatcaa gaggcagctt 2760
gttgagacga ggcagattac caagcacgtt gctcagatcc tcgattctag gatgaacacc 2820
aagtacgacg agaacgacaa gctcatccgc gaggtcaagg tgatcaccct caagtccaag 2880
ctcgtctccg acttccgcaa ggacttccag ttctacaagg tccgcgagat caacaactac 2940
caccacgctc acgatgctta ccttaacgct gtcgttggta ccgctcttat caagaagtac 3000
cctaagcttg agtccgagtt cgtctacggt gactacaagg tctacgacgt tcgtaagatg 3060
atcgccaagt ccgagcagga gatcggcaag gccaccgcca agtacttctt ctactccaac 3120
atcatgaact tcttcaagac cgagatcacc ctcgccaacg gcgagatccg caagcgccct 3180
cttatcgaga cgaacggtga gactggtgag atcgtttggg acaagggtcg cgacttcgct 3240
actgttcgca aggtcctttc tatgcctcag gttaacatcg tcaagaagac cgaggtccag 3300
accggtggct tctccaagga gtctatcctt ccaaagagaa actcggacaa gctcatcgct 3360
aggaagaagg attgggaccc taagaagtac ggtggtttcg actcccctac tgtcgcctac 3420
tccgtcctcg tggtcgccaa ggtggagaag ggtaagtcga agaagctcaa gtccgtcaag 3480
gagctcctcg gcatcaccat catggagcgc tcctccttcg agaagaaccc gatcgacttc 3540
ctcgaggcca agggctacaa ggaggtcaag aaggacctca tcatcaagct ccccaagtac 3600
tctcttttcg agctcgagaa cggtcgtaag aggatgctgg cttccgctgg tgagctccag 3660
aagggtaacg agcttgctct tccttccaag tacgtgaact tcctctacct cgcctcccac 3720
tacgagaagc tcaagggttc ccctgaggat aacgagcaga agcagctctt cgtggagcag 3780
cacaagcact acctcgacga gatcatcgag cagatctccg agttctccaa gcgcgtcatc 3840
ctcgctgacg ctaacctcga caaggtcctc tccgcctaca acaagcaccg cgacaagccc 3900
atccgcgagc aggccgagaa catcatccac ctcttcacgc tcacgaacct cggcgcccct 3960
gctgctttca agtacttcga caccaccatc gacaggaagc gttacacgtc caccaaggag 4020
gttctcgacg ctactctcat ccaccagtcc atcaccggtc tttacgagac tcgtatcgac 4080
ctttcccagc ttggtggtga t 4101
<210>5
<211>2936
<212>DNA
<213>Artificial Sequence
<400>5
atggctacga ccgccgcggc cgcggccgcc gccctgtccg ccgccgcgac ggccaagacc 60
ggccgtaaga accaccagcg acaccacgtc cttcccgctc gaggccgggt gggggcggcg 120
gcggtcaggt gctcggcggt gtccccggtc accccgccgt ccccggcgcc gccggccacg 180
ccgctccggc cgtgggggcc ggccgagccc cgcaagggcg cggacatcct cgtggaggcg 240
ctggagcggt gcggcgtcag cgacgtgttc gcctacccgg gcggcgcgtc catggagatc 300
caccaggcgc tgacgcgctc cccggtcatc accaaccacc tcttccgcca cgagcagggc 360
gaggcgttcg cggcgtccgg gtacgcgcgc gcgtccggcc gcgtcggggt ctgcgtcgcc 420
acctccggcc ccggggcaac caacctcgtg tccgcgctcg ccgacgcgct gctcgactcc 480
gtcccgatgg tcgccatcac gggccaggtc ccccgccgca tgatcggcac cgacgccttc 540
caggagacgc ccatagtcga ggtcacccgc tccatcacca agcacaatta ccttgtcctt 600
gatgtggagg acatcccccg cgtcatacag gaagccttct tcctcgcgtc ctcgggccgt 660
cctggcccgg tgctggtcga catccccaag gacatccagc agcagatggc cgtgccggtc 720
tgggacacct cgatgaatct accagggtac atcgcacgcc tgcccaagcc acccgcgaca 780
gaattgcttg agcaggtctt gcgtctggtt ggcgagtcac ggcgcccgat tctctatgtc 840
ggtggtggct gctctgcatc tggtgacgaa ttgcgctggt ttgttgagct gactggtatc 900
ccagttacaa ccactctgat gggcctcggc aatttcccca gtgacgaccc gttgtccctg 960
cgcatgcttg ggatgcatgg cacggtgtac gcaaattatg ccgtggataa ggctgacctg 1020
ttgcttgcgt ttggtgtgcg gtttgatgat cgtgtgacag ggaaaattga ggcttttgca 1080
agcagggcca agattgtgca cattgacatt gatccagcag agattggaaa gaacaagcaa 1140
ccacatgtgt caatttgcgc agatgttaag cttgctttac agggcttgaa tgctctgcta 1200
caacagagca caacaaagac aagttctgat tttagtgcat ggcacaatga gttggaccag 1260
cagaagaggg agtttcctct ggggtacaaa acttttggtg aagagatccc accgcaatat 1320
gccattcagg tgctggatga gctgacgaaa ggtgaggcaa tcatcgctac tggtgttggg 1380
cagcaccaga tgtgggcggc acaatattac acctacaagc ggccacggca gtggctgtct 1440
tcggctggtc tgggcgcaat gggatttggg ctgcctgctg cagctggtgc ttctgtggct 1500
aacccaggtg tcacagttgt tgatattgat ggggatggta gcttcctcat gaacattcag 1560
gagctggcat tgatccgcat tgagaacctc cctgtgaagg tgatggtgtt gaacaaccaa 1620
catttgggta tggtggtgca atgggaggat aggttttaca aggcgaatag ggcgcataca 1680
tacttgggca acccggaatg tgagagcgag atatatccag attttgtgac tattgctaag 1740
gggttcaata ttcctgcagt ccgtgtaaca aagaagagtg aagtccgtgc cgccatcaag 1800
aagatgctcg agactccagg gccatacttg ttggatatca tcgtcccgca ccaggagcat 1860
gtgctgccta tgatcccaag tgggggcgca ttcaaggaca tgatcctgga tggtgatggc 1920
aggactgtgt attaatctat aatctgtatg ttggcaaagc accagcccgg cctatgtttg 1980
acctgaatga cccataaaga gtggtatgcc tatgatgttt gtatgtgctc tatcaataac 2040
taaggtgtca actatgaacc atatgctctt ctgttttact tgtttgatgt gcttggcatg 2100
gtaatcctaa ttagcttcct gctgtctagg tttgtagtgt gttgttttct gtaggcatat 2160
gcatcacaag atatcatgta agtttcttgt cctacatatc aataataaga gaataaagta 2220
cttctatgca atagctctga gttaagtgtt tcaacaattt ctgaacttct gaacttatgt 2280
ttgctcaact gtcatcacac gaagtactct ccttgtaact acattttccc caagacttta 2340
aatcccctca gttacagcaa aaaataaact ttgcatctac tgttttccct ctcttcggtc 2400
gatcttattg ggtactacta tagagagagg ctgcatgaag tatttccttt ttctgtttag 2460
ttatgccgtg taaattagca tccatgcaaa atagatgaaa aatcaagcta ttcctgactg 2520
ctaaggatta tttttggcat aatgtattct tatatactcc ctccgtccca tattataagg 2580
gattttgagt ttttgtttat actgtttgac cactcgtctt attcaaaaaa ttttagaatt 2640
attatttatt ttttttgtga cttactttat tatctaaagt actttaagca caattttcgt 2700
attttatatt tgcacaaatt ttttgaataa gacgaatggt caaacaatac aaataaaaat 2760
tcaaaatccc ttataatatg ggacggaggt atgatagttg gtgaactgct acgtattgcc 2820
atttgacatt ttttggatta tgcaattttg ctgtctatag tgctctaatc aattcgcaat 2880
cccgaccttg gagtattggt ctcatggaac ccctcatctg agtaatctcc atattt 2936
<210>6
<211>644
<212>PRT
<213>Artificial Sequence
<400>6
Met Ala Thr Thr Ala Ala Ala Ala Ala Ala Ala Leu Ser Ala Ala Ala
1 5 10 15
Thr Ala Lys Thr Gly Arg Lys Asn His Gln Arg His His Val Leu Pro
20 25 30
Ala Arg Gly Arg Val Gly Ala Ala Ala Val Arg Cys Ser Ala Val Ser
35 40 45
Pro Val Thr Pro Pro Ser Pro Ala Pro Pro Ala Thr Pro Leu Arg Pro
50 55 60
Trp Gly Pro Ala Glu Pro Arg Lys Gly Ala Asp Ile Leu Val Glu Ala
65 70 75 80
Leu Glu Arg Cys Gly Val Ser Asp Val Phe Ala Tyr Pro Gly Gly Ala
85 90 95
Ser Met Glu Ile His Gln Ala Leu Thr Arg Ser Pro Val Ile Thr Asn
100 105 110
His Leu Phe Arg His Glu Gln Gly Glu Ala Phe Ala Ala Ser Gly Tyr
115 120 125
Ala Arg Ala Ser Gly Arg Val Gly Val Cys Val Ala Thr Ser Gly Pro
130 135 140
Gly Ala Thr Asn Leu Val Ser Ala Leu Ala Asp Ala Leu Leu Asp Ser
145 150 155 160
Val Pro Met Val Ala Ile Thr Gly Gln Val Pro Arg Arg Met Ile Gly
165 170 175
Thr Asp Ala Phe Gln Glu Thr Pro Ile Val Glu Val Thr Arg Ser Ile
180 185 190
Thr Lys His Asn Tyr Leu Val Leu Asp Val Glu Asp Ile Pro Arg Val
195 200 205
Ile Gln Glu Ala Phe Phe Leu Ala Ser Ser Gly Arg Pro Gly Pro Val
210 215 220
Leu Val Asp Ile Pro Lys Asp Ile Gln Gln Gln Met Ala Val Pro Val
225 230 235 240
Trp Asp Thr Ser Met Asn Leu Pro Gly Tyr Ile Ala Arg Leu Pro Lys
245 250 255
Pro Pro Ala Thr Glu Leu Leu Glu Gln Val Leu Arg Leu Val Gly Glu
260 265 270
Ser Arg Arg Pro Ile Leu Tyr Val Gly Gly Gly Cys Ser Ala Ser Gly
275 280 285
Asp Glu Leu Arg Trp Phe Val Glu Leu Thr Gly Ile Pro Val Thr Thr
290 295 300
Thr Leu Met Gly Leu Gly Asn Phe Pro Ser Asp Asp Pro Leu Ser Leu
305 310 315 320
Arg Met Leu Gly Met His Gly Thr Val Tyr Ala Asn Tyr Ala Val Asp
325 330 335
Lys Ala Asp Leu Leu Leu Ala Phe Gly Val Arg Phe Asp Asp Arg Val
340 345 350
Thr Gly Lys Ile Glu Ala Phe Ala Ser Arg Ala Lys Ile Val His Ile
355 360 365
Asp Ile Asp Pro Ala Glu Ile Gly Lys Asn Lys Gln Pro His Val Ser
370 375 380
Ile Cys Ala Asp Val Lys Leu Ala Leu Gln Gly Leu Asn Ala Leu Leu
385 390 395 400
Gln Gln Ser Thr Thr Lys Thr Ser Ser Asp Phe Ser Ala Trp His Asn
405 410 415
Glu Leu Asp Gln Gln Lys Arg Glu Phe Pro Leu Gly Tyr Lys Thr Phe
420 425 430
Gly Glu Glu Ile Pro Pro Gln Tyr Ala Ile Gln Val Leu Asp Glu Leu
435 440 445
Thr Lys Gly Glu Ala Ile Ile Ala Thr Gly Val Gly Gln His Gln Met
450 455 460
Trp Ala Ala Gln Tyr Tyr Thr Tyr Lys Arg Pro Arg Gln Trp Leu Ser
465 470 475 480
Ser Ala Gly Leu Gly Ala Met Gly Phe Gly Leu Pro Ala Ala Ala Gly
485 490 495
Ala Ser Val Ala Asn Pro Gly Val Thr Val Val Asp Ile Asp Gly Asp
500 505 510
Gly Ser Phe Leu Met Asn Ile Gln Glu Leu Ala Leu Ile Arg Ile Glu
515 520 525
Asn Leu Pro Val Lys Val Met Val Leu Asn Asn Gln His Leu Gly Met
530 535 540
Val Val Gln Trp Glu Asp Arg Phe Tyr Lys Ala Asn Arg Ala His Thr
545 550 555 560
Tyr Leu Gly Asn Pro Glu Cys Glu Ser Glu Ile Tyr Pro Asp Phe Val
565 570 575
Thr Ile Ala Lys Gly Phe Asn Ile Pro Ala Val Arg Val Thr Lys Lys
580 585 590
Ser Glu Val Arg Ala Ala Ile Lys Lys Met Leu Glu Thr Pro Gly Pro
595 600 605
Tyr Leu Leu Asp Ile Ile Val Pro His Gln Glu His Val Leu Pro Met
610 615 620
Ile Pro Ser Gly Gly Ala Phe Lys Asp Met Ile Leu Asp Gly Asp Gly
625 630 635 640
Arg Thr Val Tyr
<210>7
<211>86
<212>DNA
<213>Artificial Sequence
<400>7
gtttcagagc tatgctggaa acagcatagc aagttgaaat aaggctagtc cgttatcaac 60
ttgaaaaagt ggcaccgagt cggtgc 86
<210>8
<211>1373
<212>DNA
<213>Artificial Sequence
<400>8
ggaatccatg gcggatcagg agccaccaac ttctccctcc tcaagcaggc cggcgacgtg 60
gaggagaacc cgggcccaat gaaaaagcct gaactcaccg cgacgtctgt cgagaagttt 120
ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc tctcggaggg cgaagaatct 180
cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc tgcgggtaaa tagctgcgcc 240
gatggtttct acaaagatcg ttatgtttat cggcactttg catcggccgc gctcccgatt 300
ccggaagtgc ttgacattgg ggagtttagc gagagcctga cctattgcat ctcccgccgt 360
tcacagggtg tcacgttgca agacctgcct gaaaccgaac tgcccgctgt tctacaaccg 420
gtcgcggagg ctatggatgc gatcgctgcg gccgatctta gccagacgag cgggttcggc 480
ccattcggac cgcaaggaat cggtcaatac actacatggc gtgatttcat atgcgcgatt 540
gctgatcccc atgtgtatca ctggcaaact gtgatggacg acaccgtcag tgcgtccgtc 600
gcgcaggctc tcgatgagct gatgctttgg gccgaggact gccccgaagt ccggcacctc 660
gtgcacgcgg atttcggctc caacaatgtc ctgacggaca atggccgcat aacagcggtc 720
attgactgga gcgaggcgat gttcggggat tcccaatacg aggtcgccaa catcttcttc 780
tggaggccgt ggttggcttg tatggagcag cagacgcgct acttcgagcg gaggcatccg 840
gagcttgcag gatcgccacg actccgggcg tatatgctcc gcattggtct tgaccaactc 900
tatcagagct tggttgacgg caatttcgat gatgcagctt gggcgcaggg tcgatgcgac 960
gcaatcgtcc gatccggagc cgggactgtc gggcgtacac aaatcgcccg cagaagcgcg 1020
gccgtctgga ccgatggctg tgtagaagta ctcgccgata gtggaaaccg acgccccagc 1080
actcgtccga gggcaaagaa atagactagt tcccgatcgt tcaaacattt ggcaataaag 1140
tttcttaaga ttgaatcctg ttgccggtct tgcgatgatt atcatataat ttctgttgaa 1200
ttacgttaag catgtaataa ttaacatgta atgcatgacg ttatttatga ggtgggtttt 1260
tatgattaga gtcccgcaat tatacattta atacgcgata gaaaacaaaa tatagcgcgc 1320
aaactaggat aaattatcgc gcgcggtgtc atctatgtta ctagaggcgc gcc 1373
<210>9
<211>19
<212>PRT
<213>Artificial Sequence
<400>9
Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val Glu Glu Asn
1 5 10 15
Pro Gly Pro
<210>10
<211>341
<212>PRT
<213>Artificial Sequence
<400>10
Met Lys Lys Pro Glu Leu Thr Ala Thr Ser Val Glu Lys Phe Leu Ile
1 5 10 15
Glu Lys Phe Asp Ser Val Ser Asp Leu Met Gln Leu Ser Glu Gly Glu
20 25 30
Glu Ser Arg Ala Phe Ser Phe Asp Val Gly Gly Arg Gly Tyr Val Leu
35 40 45
Arg Val Asn Ser Cys Ala Asp Gly Phe Tyr Lys Asp Arg Tyr Val Tyr
50 55 60
Arg His Phe Ala Ser Ala Ala Leu Pro Ile Pro Glu Val Leu Asp Ile
65 70 75 80
Gly Glu Phe Ser Glu Ser Leu Thr Tyr Cys Ile Ser Arg Arg Ser Gln
85 90 95
Gly Val Thr Leu Gln Asp Leu Pro Glu Thr Glu Leu Pro Ala Val Leu
100 105 110
Gln Pro Val Ala Glu Ala Met Asp Ala Ile Ala Ala Ala Asp Leu Ser
115 120 125
Gln Thr Ser Gly Phe Gly Pro Phe Gly Pro Gln Gly Ile Gly Gln Tyr
130 135 140
Thr Thr Trp Arg Asp Phe Ile Cys Ala Ile Ala Asp Pro His Val Tyr
145 150 155 160
His Trp Gln Thr Val Met Asp Asp Thr Val Ser Ala Ser Val Ala Gln
165 170 175
Ala Leu Asp Glu Leu Met Leu Trp Ala Glu Asp Cys Pro Glu Val Arg
180 185 190
His Leu Val His Ala Asp Phe Gly Ser Asn Asn Val Leu Thr Asp Asn
195 200 205
Gly Arg Ile Thr Ala Val Ile Asp Trp Ser Glu Ala Met Phe Gly Asp
210 215 220
Ser Gln Tyr Glu Val Ala Asn Ile Phe Phe Trp Arg Pro Trp Leu Ala
225 230 235 240
Cys Met Glu Gln Gln Thr Arg Tyr Phe Glu Arg Arg His Pro Glu Leu
245 250 255
Ala Gly Ser Pro Arg Leu Arg Ala Tyr Met Leu Arg Ile Gly Leu Asp
260 265 270
Gln Leu Tyr Gln Ser Leu Val Asp Gly Asn Phe Asp Asp Ala Ala Trp
275 280 285
Ala Gln Gly Arg Cys Asp Ala Ile Val Arg Ser Gly Ala Gly Thr Val
290 295 300
Gly Arg Thr Gln Ile Ala Arg Arg Ser Ala Ala Val Trp Thr Asp Gly
305 310 315 320
Cys Val Glu Val Leu Ala Asp Ser Gly Asn Arg Arg Pro Ser Thr Arg
325 330 335
Pro Arg Ala Lys Lys
340
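
As an informal consistency check on the sequence listing (not part of the patent text), the sketch below translates nucleotides 961-1008 of sequence 4 with the standard genetic code and compares the result with residues 321-336 of sequence 3. It assumes that sequence 4 is the coding sequence for the protein of sequence 3, which the listing does not state explicitly. The names `translate`, `CODON_TABLE` and `THREE_TO_ONE` are illustrative.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Three-letter to one-letter amino acid codes, as used in the PRT records.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 961-1008 of sequence 4, copied from the listing with spaces removed.
SEQ4_961_1008 = "atcaagcgctacgacgagcaccaccaggacctcaccctcctcaaggct"
# Residues 321-336 of sequence 3, copied from the listing.
SEQ3_321_336 = "Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala"

expected = "".join(THREE_TO_ONE[res] for res in SEQ3_321_336.split())
print(translate(SEQ4_961_1008) == expected)  # True
```

Codon 321 starts at nucleotide 3 × 320 + 1 = 961, which is why the two excerpts line up. The same comparison could be run over the full records once the numbering lines are stripped.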

Claims (15)

1. A method for realizing plant gene replacement without generating a DNA double-strand break, comprising the following step: introducing an sgRNA, a Cas9 nickase and a donor DNA into a target plant;
the sgRNA targets the DNA fragment A target sequence;
the donor DNA comprises, in order, the DNA fragment A target sequence, DNA fragment B and the DNA fragment A target sequence;
DNA fragment B is a DNA molecule obtained by mutating one or more bases of DNA fragment A;
under the guidance of the sgRNA, the Cas9 nickase generates single-strand DNA nicks at the DNA fragment A target sequence in the target plant genome and at the DNA fragment A target sequence in the donor DNA, and DNA fragment A in the target plant genome is replaced with DNA fragment B through a repair mechanism of the target plant, thereby realizing plant gene replacement;
DNA fragment A is the DNA molecule shown at positions 1300-1993 of sequence 5;
DNA fragment B is the DNA molecule shown at positions 10695-11388 of sequence 1;
the structure of the sgRNA is: tRNA - RNA transcribed from the DNA fragment A target sequence - sgRNA backbone;
the tRNA is the RNA molecule obtained by replacing T with U in positions 474-550 of sequence 1;
the sgRNA backbone is the RNA molecule obtained by replacing T with U in positions 571-646 of sequence 1;
the DNA fragment A target sequence is positions 551-570 of sequence 1;
the sequence of the donor DNA is positions 10672-11411 of sequence 1;
the Cas9 nickase is a Cas9 D10A nickase;
the Cas9 D10A nickase is a SpCas9n protein;
the amino acid sequence of the SpCas9n protein is shown as sequence 3.
2. The method of claim 1, wherein:
the gene encoding the SpCas9n protein is a cDNA molecule or DNA molecule shown at positions 2877-6977 of sequence 1 in the sequence listing.
3. The method according to claim 1 or 2, characterized in that:
the Cas9 nickase carries a nuclear localization signal;
the nuclear localization signal is SV40 NLS;
the amino acid sequence of the SV40 NLS is shown as sequence 2.
4. The method according to claim 1 or 2, characterized in that: the sgRNA, the Cas9 nickase and the donor DNA are introduced into the target plant as follows:
the DNA molecule from which the sgRNA is transcribed, the gene encoding the Cas9 nickase and the donor DNA are introduced into the target plant via a recombinant expression vector;
the recombinant expression vector comprises an expression cassette consisting, in order, of a promoter, the DNA molecule from which the sgRNA is transcribed and a terminator, and an expression cassette consisting, in order, of a promoter, the gene encoding the Cas9 nickase and a terminator;
the nucleotide sequence of the recombinant expression vector is shown as a sequence 1.
5. The method according to claim 1 or 2, characterized in that: the plant is rice.
6. Use of the method of any one of claims 1-4 for gene editing in plants.
7. Use of the method of any of claims 1-4 for the preparation of plant mutants.
8. Use of the method of any one of claims 1-4 for increasing the efficiency of gene replacement in plants.
9. Use of the method of any one of claims 1 to 4 for reducing the production of by-products of gene replacement in plants.
10. Use according to any one of claims 6 to 9, characterized in that: the plant is rice.
11. A method of plant gene editing, comprising the following step: replacing a target gene fragment in the plant genome according to the method of any one of claims 1 to 4, thereby editing the plant gene.
12. A method of making a plant mutant, comprising the following step: replacing a target gene fragment in the plant genome according to the method of any one of claims 1 to 4 to obtain a plant mutant.
13. A method for improving plant gene replacement efficiency, comprising the following step: replacing a target gene fragment in the plant genome according to the method of any one of claims 1 to 4.
14. A method for reducing by-products generated by plant gene replacement, comprising the following step: replacing a target gene fragment in the plant genome according to the method of any one of claims 1 to 4.
15. The method according to any one of claims 11-14, wherein: the plant is rice.
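
The position ranges recited in claims 1 and 2 can be cross-checked with a few lines of arithmetic. The sketch below is an illustration, not part of the claims. It treats every range as 1-based and inclusive (an assumption consistent with the sequence-listing numbering), and the helper names `span_length`, `extract` and `to_rna` are hypothetical. Because sequence 1 is not reproduced here, the extraction and T-to-U step is shown on a toy string.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive position range such as '551-570'."""
    return end - start + 1


def extract(seq: str, start: int, end: int) -> str:
    """Return positions start..end (1-based, inclusive) of seq."""
    return seq[start - 1:end]


def to_rna(dna: str) -> str:
    """Replace T with U, as the claims describe for the tRNA and sgRNA backbone."""
    return dna.upper().replace("T", "U")


# Position ranges copied from claims 1 and 2.
SPANS = {
    "fragment A (sequence 5)": (1300, 1993),
    "fragment B (sequence 1)": (10695, 11388),
    "donor DNA (sequence 1)": (10672, 11411),
    "tRNA (sequence 1)": (474, 550),
    "target sequence (sequence 1)": (551, 570),
    "sgRNA backbone (sequence 1)": (571, 646),
    "SpCas9n coding gene (sequence 1)": (2877, 6977),
}

lengths = {name: span_length(*span) for name, span in SPANS.items()}
for name, n in lengths.items():
    print(f"{name}: {n} nt")

# Flanks of the donor on either side of fragment B.
donor_start, donor_end = SPANS["donor DNA (sequence 1)"]
b_start, b_end = SPANS["fragment B (sequence 1)"]
left_flank = b_start - donor_start
right_flank = donor_end - b_end
print(f"donor flanks: {left_flank} nt and {right_flank} nt")

# Toy demonstration of extraction followed by T-to-U replacement.
print(to_rna(extract("ACGTTGCA", 2, 5)))  # CGUU
```

With the claimed coordinates, fragments A and B are both 694 nt, the donor is 740 nt with 23 nt on each side of fragment B, and the SpCas9n coding range is 4101 nt, which matches the <211> length of sequence 4. The tRNA, target sequence and sgRNA backbone ranges are contiguous (474-646). The claims do not explain why each donor flank is 3 nt longer than the 20-nt target sequence; one possible reading, not stated in the claims, is that each flank also carries a 3-nt PAM.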

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911405281.8A CN110951742B (en) 2019-12-31 2019-12-31 Method for realizing plant gene replacement without generating DNA double-strand break

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911405281.8A CN110951742B (en) 2019-12-31 2019-12-31 Method for realizing plant gene replacement without generating DNA double-strand break

Publications (2)

Publication Number Publication Date
CN110951742A (en) 2020-04-03
CN110951742B (en) 2022-10-21

Family

ID=69985126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911405281.8A Active CN110951742B (en) 2019-12-31 2019-12-31 Method for realizing plant gene replacement without generating DNA double-strand break

Country Status (1)

Country Link
CN (1) CN110951742B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111321171A (en) * 2018-12-14 2020-06-23 江苏集萃药康生物科技有限公司 Method for preparing gene targeting animal model by applying CRISPR/Cas9 mediated ES targeting technology

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109456973A (en) * 2018-12-28 2019-03-12 Beijing Academy of Agriculture and Forestry Sciences Application of the SpCas9n&PmCDA1&UGI base editing system in plant gene editing
CN110551752A (en) * 2019-08-30 2019-12-10 Beijing Academy of Agriculture and Forestry Sciences xCas9n-epBE base editing system and application thereof in genome base replacement

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Engineering Herbicide-Resistant Rice Plants through CRISPR/Cas9-Mediated Homologous Recombination of Acetolactate Synthase; Yongwei Sun et al.; Molecular Plant; 2016-04-04; Vol. 9, No. 4; p. 628 right column to p. 630 left column, and Figure 1 *

Also Published As

Publication number Publication date
CN110951742A (en) 2020-04-03

Similar Documents

Publication Publication Date Title
CN106957855B (en) Method for targeted knockout of rice dwarf gene SD1 by using CRISPR/Cas9 technology
WO2020007331A1 (en) Method for site-specific mutagenesis of medicago sativa gene by using crispr/cas9 system
CN110951743B (en) Method for improving plant gene replacement efficiency
AU2019297209B2 (en) Method of obtaining multi-leaf alfalfa material by means of MsPALM1 artificial site-directed mutant
Xue et al. Transformation of five grape rootstocks with plant virus genes and a virE2 gene from Agrobacterium tumefaciens
CN109705198B (en) Application of OsCKX7 protein and coding gene thereof in regulation and control of resistance to plant sheath blight
CN107475210B (en) Rice bacterial leaf blight resistance related gene OsABA2 and application thereof
Pavese et al. First report of CRISPR/Cas9 gene editing in Castanea sativa Mill
US11365423B2 (en) Method of obtaining multileaflet Medicago sativa materials by means of MsPALM1 artificial site-directed mutants
CN110951742B (en) Method for realizing plant gene replacement without generating DNA double-strand break
CN113265403A (en) Soybean Dt1 gene editing site and application thereof
CN112646011A (en) Protein PHD-Finger17 related to plant stress resistance and coding gene and application thereof
CN116179589B (en) SlPRMT5 gene and application of protein thereof in regulation and control of tomato fruit yield
EP3709792B1 (en) Plant promoter for transgene expression
CN113493803B (en) Alfalfa CRISPR/Cas9 genome editing system and application thereof
CN111875689B (en) Method for creating male sterile line by using tomato green stem close linkage marker
CN111411123B (en) Method for simultaneously improving rice fragrance and bacterial leaf blight resistance by using CRISPR/Cas9 system and expression vector
CN114438056A (en) CasF2 protein, CRISPR/Cas gene editing system and application thereof in plant gene editing
RUZYATI et al. Construction of CRISPR/Cas9_gRNA-OsCKX2 module cassette and its introduction into rice cv. Mentik Wangi mediated by Agrobacterium tumefaciens
CN113699165A (en) Nucleic acid for reducing height of corn strain and application thereof
CN113151314A (en) Plant ACCase mutant gene and application thereof
CN111647590A (en) Adenylate cyclase containing FYVE structural domain and coding gene and application thereof
CN112080513A (en) Rice artificial genome editing system with expanded editing range and application thereof
CN106086063B (en) RNAi vector constructed based on isocaudarner and application thereof
CN111019968B (en) Application of NTS/dNTS combination in preparation of plant mutant

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant