CN110964742A - Preparation method of herbicide-resistant rice - Google Patents

Preparation method of herbicide-resistant rice

Publication number
CN110964742A
Authority
CN
China
Prior art keywords
sequence
leu
lys
ala
glu
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911323192.9A
Other languages
Chinese (zh)
Other versions
CN110964742B (en)
Inventor
王飞鹏
杨进孝
冯峰
宋金岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Academy of Agriculture and Forestry Sciences
Original Assignee
Beijing Academy of Agriculture and Forestry Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Academy of Agriculture and Forestry Sciences
Priority to CN201911323192.9A
Publication of CN110964742A
Application granted
Publication of CN110964742B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218 Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C12N15/8271 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
    • C12N15/8274 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for herbicide resistance

Abstract

The invention discloses a method for preparing herbicide-resistant rice. The method comprises expressing an esgRNA, an adenine deaminase, a Cas9 nuclease and the nuclear localization signal bpNLS in recipient rice. Guided by the esgRNA, the adenine deaminase, the Cas9 nuclease and the nuclear localization signal bpNLS mutate position 7 of the OsACC1 gene target sequence in the recipient rice genome from A to G, yielding rice that is resistant to ACCase-inhibiting herbicides.

Description

Preparation method of herbicide-resistant rice
Technical Field
The invention belongs to the technical field of biology, and particularly relates to a preparation method of herbicide-resistant rice.
Background
CRISPR-Cas9 technology has become a powerful genome editing tool and is widely applied in many tissues and cells. The CRISPR/Cas9 protein-RNA complex is directed to the target by a guide RNA and cleaves it to generate a DNA double-strand break (DSB); the organism then initiates a DNA repair mechanism to repair the DSB. There are generally two repair mechanisms: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ usually dominates, so repair produces random indels (insertions or deletions) far more often than precise repair. The use of HDR for precise base substitution is therefore severely limited, because HDR is inefficient and requires a DNA template.
In 2017, David Liu's laboratory reported a novel adenine base editor (ABE). Through seven rounds of evolution, the researchers obtained a variant of the Escherichia coli tRNA adenosine deaminase (ecTadA) which, when fused to the N-terminus of Cas9 nickase (Cas9n), directly replaces a single base A with G (guanine) in cells, greatly improving A-to-G base editing efficiency without generating a DSB or initiating HDR. The process is as follows: when an sgRNA containing a genome-targeting sequence binds the ecTadA&Cas9n fusion, the complex is directed to the target; ecTadA deaminates adenine (A) on the unpaired single-stranded DNA to inosine (I), which is read as G during DNA repair; Cas9n cleaves a phosphodiester bond of the paired DNA strand, and repair of that strand introduces a cytosine (C) opposite I. A C-G pair is then formed in the subsequent repair process, completing the A-to-G conversion.
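The deamination-then-repair logic described above can be illustrated with a toy string model. This is only a sketch: the 20-nt protospacer is a made-up placeholder rather than a sequence from the patent, and `abe_edit` is a hypothetical helper name.

```python
def abe_edit(protospacer: str, position: int) -> str:
    """Model an adenine base edit at a 1-based position of the editable strand.

    Step 1: the deaminase converts A to inosine (written here as "I").
    Step 2: repair reads I as G, so the site ends up as G.
    """
    if protospacer[position - 1] != "A":
        raise ValueError(f"position {position} is not an A")
    deaminated = protospacer[: position - 1] + "I" + protospacer[position:]
    return deaminated.replace("I", "G")


# Hypothetical protospacer with an A at position 7 (not taken from the patent).
toy = "CCTTCCACCTTCCTTCCTTC"
print(abe_edit(toy, 7))
```

The model ignores the editing window, bystander adenines and strand chemistry; it only shows the A to I to G bookkeeping.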
Single-base editing systems are widely used to create single-base mutant materials in crops. Acetyl-coenzyme A carboxylase (ACCase) is a key enzyme regulating fatty acid biosynthesis. Since their development began in 1975, ACCase-inhibiting herbicides have grown in variety and are now among the most important herbicides in the world. They fall into three classes: the earliest developed APP (aryloxyphenoxypropionate) and CHD (cyclohexanedione) classes and the later developed DEN (phenylpyrazoline) class. With the use of ACCase-inhibiting herbicides, many weeds have developed resistance. Eight amino acid substitutions in ACCase control weed resistance to the different classes of ACCase-inhibiting herbicides: glutamine 1756 → glutamic acid, isoleucine 1781 → leucine or valine, tryptophan 1999 → cysteine or serine, tryptophan 2027 → cysteine, isoleucine 2041 → asparagine or valine, aspartic acid 2078 → glycine, cysteine 2088 → arginine, and glycine 2096 → alanine. Of these, substitutions at position 1781, 2078 or 2088 confer resistance to all ACCase-inhibiting herbicides. Cysteine (C) at position 2099 of rice OsACC1 corresponds exactly to cysteine 2088 in weeds, and mutating it to arginine (R), i.e. C2099R, makes rice resistant to ACCase-inhibiting herbicides. C2099R can be obtained by single-base editing, but most base editing systems reported so far for producing ACCase-inhibiting herbicide-resistant rice through the C2099R mutation have low editing efficiency, and the resulting mutants carry mutations at other positions in addition to position 2099 of OsACC1, which makes molecular identification of the mutants costly.
Disclosure of Invention
The invention aims to provide a method for preparing rice resistant to ACCase-inhibiting herbicides.
The method for preparing ACCase-inhibiting herbicide-resistant rice provided by the invention comprises expressing an esgRNA, an adenine deaminase, a Cas9 nuclease and the nuclear localization signal bpNLS in recipient rice;
the esgRNA targets an OsACC1 gene target sequence; the OsACC1 gene target sequence contains a target site;
the target site is the base A complementary to the base T at position 6295 of sequence 14;
guided by the esgRNA, the adenine deaminase, the Cas9 nuclease and the nuclear localization signal bpNLS mutate the target site in the recipient rice genome from base A to base G, yielding ACCase-inhibiting herbicide-resistant rice in which position 2099 of the OsACC1 protein is mutated from cysteine to arginine;
the nuclear localization signal bpNLS is A1) or A2):
A1) a protein whose amino acid sequence is shown in sequence 7;
A2) a protein that is derived from the amino acid sequence shown in sequence 7 of the sequence listing by substitution and/or deletion and/or addition of one or more amino acid residues and has the same function.
In the above method, the number of nuclear localization signals bpNLS may be 1, or 2 or more. In an embodiment of the invention, the number of nuclear localization signals bpNLS is 2.
The coding gene of the nuclear localization signal bpNLS is a1), a2) or a3):
a1) a cDNA molecule or DNA molecule shown at positions 3796-3846 of sequence 1 in the sequence listing;
a2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in a1) and encodes the nuclear localization signal bpNLS;
a3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in a1) or a2) and encodes the nuclear localization signal bpNLS.
In the above method, the OsACC1 gene target sequence is sequence 12, and the target site is the base A at position 7 of sequence 12.
In the above method, the structure of the esgRNA is: RNA transcribed from the OsACC1 gene target sequence, followed by the esgRNA backbone;
the esgRNA backbone is 1), 2) or 3):
1) an RNA molecule obtained by replacing T with U in positions 617-702 of sequence 1;
2) an RNA molecule that is derived from the RNA molecule of 1) by substitution and/or deletion and/or addition of one or more nucleotides and has the same function;
3) an RNA molecule that has 75% or more identity with the nucleotide sequence defined in 1) or 2) and has the same function.
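Positions in this document are 1-based and inclusive, and the esgRNA backbone is defined by a T-to-U replacement over such an interval. The sketch below shows both conventions. The input string is a repeating placeholder standing in for sequence 1 (the real vector sequence is in the sequence listing), and the helper names `subseq` and `to_rna` are hypothetical.

```python
def subseq(seq: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive interval [start, end] of seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("interval out of range")
    return seq[start - 1 : end]


def to_rna(dna: str) -> str:
    """Write a 5'->3' DNA sequence as RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


# Placeholder only: NOT the real sequence 1.
placeholder = "ACGTTGCA" * 100
backbone_dna = subseq(placeholder, 617, 702)
print(len(backbone_dna), to_rna(backbone_dna)[:10])
```

With the real sequence 1 in place of the placeholder, the same two calls would give the 86-nt backbone of item 1).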
In the above method, the Cas9 nuclease is the Cas9n protein;
the Cas9n protein is C1) or C2):
C1) a protein whose amino acid sequence is shown in sequence 4;
C2) a protein that is derived from the amino acid sequence shown in sequence 4 of the sequence listing by substitution and/or deletion and/or addition of one or more amino acid residues and has the same function.
The adenine deaminase is the ecTadA protein and/or the ecTadA* protein, in particular the ecTadA protein together with the ecTadA* protein;
the ecTadA protein is D1) or D2):
D1) a protein whose amino acid sequence is shown in sequence 2;
D2) a protein that is derived from the amino acid sequence shown in sequence 2 of the sequence listing by substitution and/or deletion and/or addition of one or more amino acid residues and has the same function;
the ecTadA* protein is E1) or E2):
E1) a protein whose amino acid sequence is shown in sequence 3;
E2) a protein that is derived from the amino acid sequence shown in sequence 3 of the sequence listing by substitution and/or deletion and/or addition of one or more amino acid residues and has the same function.
The protein of A2), C2), D2) or E2) is a protein that has 75% or more identity with the amino acid sequence of the protein shown in sequence 7, sequence 4, sequence 2 or sequence 3, respectively, and has the same function. The identity of 75% or more is 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity.
The protein of A2), C2), D2) or E2) may be synthesized artificially, or obtained by synthesizing its coding gene and then expressing it biologically.
The coding gene of the protein of A2), C2), D2) or E2) can be obtained from the DNA sequence shown at positions 3796-3846 (encoding the protein of sequence 7), 5035-9135 (encoding the protein of sequence 4), 3847-4344 (encoding the protein of sequence 2) or 4441-4938 (encoding the protein of sequence 3) of sequence 1 by deleting the codons of one or several amino acid residues, and/or introducing missense mutations of one or several base pairs, and/or attaching the coding sequence of a tag to its 5' end and/or 3' end.
Further, the coding gene of the Cas9n protein is c1), c2) or c3):
c1) a cDNA molecule or DNA molecule shown at positions 5035-9135 of sequence 1 in the sequence listing;
c2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in c1) and encodes the Cas9n;
c3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in c1) or c2) and encodes the Cas9n.
The coding gene of the ecTadA protein is d1), d2) or d3):
d1) a cDNA molecule or DNA molecule shown at positions 3847-4344 of sequence 1 in the sequence listing;
d2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in d1) and encodes the ecTadA;
d3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in d1) or d2) and encodes the ecTadA;
the coding gene of the ecTadA* protein is e1), e2) or e3):
e1) a cDNA molecule or DNA molecule shown at positions 4441-4938 of sequence 1 in the sequence listing;
e2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in e1) and encodes the ecTadA*;
e3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in e1) or e2) and encodes the ecTadA*.
The nucleotide sequences of the invention encoding the nuclear localization signal bpNLS, the Cas9n and the ecTadA can readily be mutated by a person of ordinary skill in the art using known methods such as directed evolution and point mutation. Artificially modified nucleotide sequences that have 75% or more identity with those nucleotide sequences are derived from the nucleotide sequences of the invention and are equivalent to them, as long as they encode the nuclear localization signal bpNLS, the Cas9n or the ecTadA and have the same function.
The term "identity" as used herein refers to sequence similarity to a native nucleic acid sequence. "Identity" includes nucleotide sequences that are 75% or more, 85% or more, 90% or more, or 95% or more identical to the nucleotide sequence encoding the protein consisting of the amino acid sequence shown in sequence 7, 4, 2 or 3 of the invention. Identity can be assessed by eye or with computer software. With computer software, the identity between two or more sequences can be expressed as a percentage (%), which can be used to assess the identity between related sequences.
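As a rough illustration of expressing identity as a percentage, the sketch below compares two sequences of equal length position by position. It is a deliberately simplified assumption: real comparisons normally use alignment software that handles gaps and length differences, and the patent does not name a particular program. The variant sequence is invented for the example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


BPNLS = "KRTADGSEFEPKKKRKV"    # sequence 7, as quoted in the description
VARIANT = "KRTADGSEFEPKKKRKA"  # hypothetical variant with one substitution

identity = percent_identity(BPNLS, VARIANT)
print(f"{identity:.1f}% identity; meets the 75% threshold: {identity >= 75}")
```

Whether such a variant keeps the same function is a separate, experimental question that the percentage alone cannot answer.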
The stringent conditions are: hybridization followed by washing the membrane twice for 5 min each at 68 °C in a solution of 2×SSC and 0.1% SDS, and twice for 15 min each at 68 °C in a solution of 0.5×SSC and 0.1% SDS; alternatively, hybridization and membrane washing at 65 °C in a solution of 0.1×SSPE (or 0.1×SSC) and 0.1% SDS.
The above identity of 75% or more may be 80%, 85%, 90%, or 95% or more.
In the above method, the esgRNA, adenine deaminase, Cas9 nuclease and nuclear localization signal bpNLS are expressed in recipient rice by introducing into the recipient rice a DNA molecule that transcribes the esgRNA, the coding gene of the ecTadA protein, the coding gene of the Cas9n protein and the coding gene of the nuclear localization signal bpNLS.
Furthermore, the DNA molecule that transcribes the esgRNA, the coding gene of the ecTadA protein, the coding gene of the Cas9n protein and the coding gene of the nuclear localization signal bpNLS are introduced into the recipient rice through a recombinant expression vector.
Furthermore, the recombinant expression vector comprises an expression cassette consisting, in order, of a promoter, the coding gene of the nuclear localization signal bpNLS, the coding gene of the ecTadA protein, the coding gene of the Cas9n protein, the coding gene of the nuclear localization signal bpNLS and a terminator.
In a specific embodiment of the invention, the recombinant expression vector is the sABE-1 recombinant expression vector.
The nucleotide sequence of the sABE-1 recombinant expression vector is sequence 1 in the sequence listing.
In the above method, guided by the esgRNA, the adenine deaminase, the Cas9 nuclease and the nuclear localization signal bpNLS mutate the base A at position 690 (5'-3' direction) of the non-coding strand (the reverse complement of sequence 14) of the OsACC1 protein coding gene in the recipient rice genome to base G, and thereby mutate the base T at position 6295 (5'-3' direction) of the coding strand (sequence 14) to base C. The codon for cysteine (C) at position 2099 of the OsACC1 protein thus changes from "TGC" to "CGC", so that position 2099 of the OsACC1 protein is mutated from cysteine (C) to arginine (R); the rice mutant carrying this mutation is the ACCase-inhibiting herbicide-resistant rice of the invention. The ACCase-inhibiting herbicide-resistant rice is resistant to ACCase-inhibiting herbicides. The ACCase-inhibiting herbicide is specifically galapectium.
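The coordinate relationships in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the coding-sequence length of 6984 nt is not stated in the text but inferred from the two positions it quotes (690 + 6295 - 1).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def codon_span(aa_position: int) -> tuple[int, int]:
    """1-based first and last nucleotide of the codon for a 1-based residue."""
    first = 3 * (aa_position - 1) + 1
    return first, first + 2


def mirrored_position(position: int, length: int) -> int:
    """Position of the paired base when counting 5'->3' on the other strand."""
    return length - position + 1


first, last = codon_span(2099)      # codon of residue 2099 starts at 6295
implied_length = 690 + 6295 - 1     # inferred, not stated in the text
print(first, last, mirrored_position(first, implied_length))

# The Cys codon TGC reads GCA on the non-coding strand; editing its A to G
# gives GCG, which reads CGC (Arg) on the coding strand.
non_coding = reverse_complement("TGC")
edited = non_coding.replace("A", "G")
print(non_coding, edited, reverse_complement(edited))
```

This makes explicit why an A-to-G edit on the non-coding strand appears as a T-to-C change at the first base of codon 2099.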
The ACCase-inhibiting herbicide-resistant rice is also referred to as the C2099R mutant or the A7>G7 mutant. The C2099R mutant is a rice mutant in which position 2099 of the OsACC1 protein (sequence 13) is mutated from cysteine (C) to arginine (R). The A7>G7 mutant is a rice mutant in which position 7 of the OsACC1 gene target sequence (sequence 12) is mutated from base A to base G.
Use of the method to improve the A·G base substitution efficiency at the target site, or to prepare ACCase-inhibiting herbicide-resistant rice carrying the A·G base substitution only at the target site, also falls within the protection scope of the invention; the target site is position 7 of the OsACC1 gene target sequence.
In the above use, the A·G base substitution is a mutation from base A to base G.
In the method or the use, the amino acid sequence of the OsACC1 protein is shown in sequence 13, and the coding gene sequence of the OsACC1 protein is shown in sequence 14 (LOC4338322).
In practice, the OsACC1 gene may exist as different alleles in different rice varieties; specifically, the amino acid sequence of the OsACC1 protein in a given variety may differ from that of the OsACC1 protein of the invention by substitution and/or deletion and/or addition of 1, or 2 or more, amino acids. In addition, annotations of the exon sequences of the OsACC1 gene differ, so the length of the annotated OsACC1 amino acid sequence may differ from that of the OsACC1 protein of the invention (for example, position 2099 of the OsACC1 protein of the invention is position 2186 of the OsACC1 protein encoded by the OsACC1 coding gene sequence LOC_Os05g22940; the residues are the same, but the amino acid sequence lengths differ). Consequently, the nucleotide position of the target site in the OsACC1 gene, and the position of the resulting amino acid mutation in the OsACC1 protein, may differ among rice varieties from position 6295 of sequence 14 and position 2099 of sequence 13. Any such change in the target site position and/or the amino acid mutation position falls within the protection scope of the invention, as long as the ACCase-inhibiting herbicide-resistant rice is obtained by the method of the invention.
In the method or the application, the rice variety can be Nipponbare.
The beneficial effects of the invention are as follows: the method improves the A·G base substitution efficiency at the target site (position 7 of sequence 12), and the resulting mutants are mutated only at the target site, which greatly improves the efficiency of creating the C2099R mutant.
The invention provides a method for preparing ACCase-inhibiting herbicide-resistant rice, comprising expressing an esgRNA, an adenine deaminase, a Cas9 nuclease and the nuclear localization signal bpNLS in recipient rice. Guided by the esgRNA, the adenine deaminase, the Cas9 nuclease and the nuclear localization signal bpNLS mutate position 7 of the OsACC1 gene target sequence in the recipient rice genome from A to G, yielding rice that is resistant to ACCase-inhibiting herbicides.
Drawings
FIG. 1 is a schematic diagram of the recombinant expression vectors of the four adenine base editors.
FIG. 2 shows the change at the target site in the target sequence during creation of the C2099R mutant.
FIG. 3 shows the sequencing result of the target sequence (peak diagram of the reverse complement of the target sequence) in positive T0 rice seedlings carrying the A7>G7 mutation.
FIG. 4 shows the effect of 1680 μg/L Geranium on the growth of WT seedlings and positive T0 rice seedlings carrying the A7>G7 mutation.
FIG. 5 shows heading and seed setting of positive T0 rice seedlings carrying the A7>G7 mutation.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention. The experimental procedures in the following examples are conventional unless otherwise specified. Materials, reagents, instruments and the like used in the following examples are commercially available unless otherwise specified. In the following examples, unless otherwise specified, the 1 st position of each nucleotide sequence in the sequence listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA.
Primer pair T consists of primer T-F (5'-GCATTGCTGGACTTCAACC-3') and primer T-R (5'-CAAACCGTATCGCAATCTGAG-3') and is used to amplify the target OsACC-T.
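Basic properties of the two primers can be tabulated with a short script. The sketch below uses only the primer sequences quoted above; the helper names are hypothetical, and GC content is shown as a generic sanity check rather than a criterion the patent itself applies.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMERS = {
    "T-F": "GCATTGCTGGACTTCAACC",
    "T-R": "CAAACCGTATCGCAATCTGAG",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


for name, seq in PRIMERS.items():
    # Length, GC content, and the strand the primer anneals to.
    print(name, len(seq), f"{gc_percent(seq):.1f}", reverse_complement(seq))
```

The reverse complement of each primer is the stretch of template it binds, which is what one would search for when locating the amplicon in sequence 14 or the genome.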
A·G base substitution efficiency = number of positive T0 seedlings in which the A·G base substitution occurred / total number of positive T0 seedlings analyzed × 100%.
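The efficiency definition translates directly into a small function. This is a minimal sketch; the function name is hypothetical and the seedling counts in the usage line are invented for illustration, not results from the patent.

```python
def substitution_efficiency(edited_seedlings: int, analyzed_seedlings: int) -> float:
    """A-to-G substitution efficiency in percent.

    edited_seedlings: positive T0 seedlings carrying the substitution.
    analyzed_seedlings: all positive T0 seedlings analyzed.
    """
    if analyzed_seedlings <= 0:
        raise ValueError("at least one seedling must be analyzed")
    if not 0 <= edited_seedlings <= analyzed_seedlings:
        raise ValueError("edited count must be between 0 and the total")
    return edited_seedlings / analyzed_seedlings * 100


# Hypothetical counts, for illustration only.
print(substitution_efficiency(12, 48))
```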
Nipponbare rice: reference: The effects of sodium nitroprusside and its photolysis products on the growth of Nipponbare rice seedlings and the expression of 5 hormone marker genes [J]. Journal of Henan Normal University (Natural Science Edition), 2017(2): 48-52. Available to the public from the Beijing Academy of Agriculture and Forestry Sciences.
Recovery medium: N6 solid medium containing 200 mg/L timentin.
Selection medium: N6 solid medium containing 50 mg/L hygromycin.
Differentiation medium: N6 solid medium containing 2 mg/L KT, 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
Rooting medium: N6 solid medium containing 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
The amino acid sequence of the OsACC1 protein in the following examples is shown as sequence 13, and the coding gene sequence of the OsACC1 protein is shown as sequence 14.
Example 1. Preparation of ACCase-inhibiting herbicide-resistant rice
First, design and construction of recombinant expression vectors
1. Design of recombinant expression vectors
The invention uses four adenine base editors to prepare ACCase-inhibiting herbicide-resistant rice; the structures of the recombinant expression vectors of the four adenine base editors are shown schematically in FIG. 1. Details are as follows:
sABE system: a bpNLS nuclear localization signal is added before the ecTadA element and after the Cas9n element of the ecTadA&ecTadA*&Cas9n base editing system. The amino acid sequence of the bpNLS nuclear localization signal is KRTADGSEFEPKKKRKV (sequence 7). This design is designated bpNLS-bpNLS.
FNLS-sABE system: a 3×Flag1&1×NLS1 nuclear localization signal is added before the ecTadA element of the ecTadA&ecTadA*&Cas9n base editing system, and a 1×NLS1 nuclear localization signal is added after the Cas9n element. The amino acid sequence of 3×Flag1&1×NLS1 is DYKDHDGDYKDHDIDYKDDDDKMAPKKKRKV (sequence 8), in which DYKDHDGDYKDHDIDYKDDDDK is the 3×Flag1 tag protein and PKKKRKV is the NLS1 protein. The 1×NLS1 nuclear localization signal comprises one NLS1 protein, and its amino acid sequence is PKKKRKV (sequence 9). This design is designated 3×Flag1&1×NLS1-1×NLS1 (FNLS).
F4NLS-sABE system: a 3×Flag2&4×NLS2 nuclear localization signal is added before the ecTadA element of the ecTadA&ecTadA*&Cas9n base editing system, and a 4×NLS2 nuclear localization signal is added after the Cas9n element. The 3×Flag2&4×NLS2 nuclear localization signal comprises, in order, one 3×Flag2 tag protein and four NLS2 proteins, and its amino acid sequence is DYKDHDGDYKDHDIDYKDDDDKMAPKKKRKVGGSPKKKRKVGGSPKKKRKVGGSPKKKRKV (sequence 10), in which DYKDHDGDYKDHDIDYKDDDDK is the 3×Flag2 tag protein and each PKKKRKV is an NLS2 protein. The 4×NLS2 nuclear localization signal comprises four NLS2 proteins, and its amino acid sequence is PKKKRKVGGSPKKKRKVGGSPKKKRKVGGSPKKKRKV (sequence 11). This design is designated 3×Flag2&4×NLS2-4×NLS2 (F4NLS).
4NLS-sABE system: a 4×NLS2 nuclear localization signal is added after the Cas9n element of the ecTadA&ecTadA*&Cas9n base editing system. This design is designated 4×NLS2 (4NLS).
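The composition of the tag and localization-signal sequences listed above can be checked by rebuilding them from their parts. The sketch below is an illustration, not part of the patent: the labels `SPACER` ("MA") and `LINKER` ("GGS") are names chosen here for the residues that sit between the parts in sequences 8, 10 and 11, and the helper functions are hypothetical.

```python
FLAG3 = "DYKDHDGDYKDHDIDYKDDDDK"  # 3xFlag tag portion of sequences 8 and 10
NLS = "PKKKRKV"                   # sequence 9
SPACER = "MA"                     # residues between tag and first NLS (our label)
LINKER = "GGS"                    # residues between NLS copies (our label)

SEQ8 = "DYKDHDGDYKDHDIDYKDDDDKMAPKKKRKV"
SEQ10 = "DYKDHDGDYKDHDIDYKDDDDKMAPKKKRKVGGSPKKKRKVGGSPKKKRKVGGSPKKKRKV"
SEQ11 = "PKKKRKVGGSPKKKRKVGGSPKKKRKVGGSPKKKRKV"


def build_signal(n_nls: int, with_flag: bool) -> str:
    """Assemble an optional 3xFlag tag followed by n linker-joined NLS copies."""
    nls_block = LINKER.join([NLS] * n_nls)
    return (FLAG3 + SPACER if with_flag else "") + nls_block


def nls_nt_ranges(signal: str) -> list[tuple[int, int]]:
    """1-based nucleotide ranges of each NLS copy in the coding sequence."""
    ranges = []
    start = signal.find(NLS)
    while start != -1:
        ranges.append((3 * start + 1, 3 * (start + len(NLS))))
        start = signal.find(NLS, start + len(NLS))
    return ranges


print(build_signal(1, True) == SEQ8)
print(build_signal(4, True) == SEQ10, build_signal(4, False) == SEQ11)
print(nls_nt_ranges(SEQ10))
```

The nucleotide ranges computed for sequence 10 can be compared with the NLS2 positions given for sequence 6 in the vector construction section.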
The recombinant expression vectors of all four adenine base editors contain an esgRNA that targets the OsACC1 gene target sequence shown in sequence 12; position 7 of the target sequence is the target site. The four adenine base editors can mutate to G the base A on the non-coding strand of the OsACC1 protein coding gene in the recipient rice genome that pairs with position 6295 of the coding strand, and thereby mutate the base T at position 6295 of the coding strand (sequence 14) to C. The codon for cysteine (C) at position 2099 of the OsACC1 protein thus changes from "TGC" to "CGC" (FIG. 2), so that position 2099 of the OsACC1 protein is mutated from cysteine (C) to arginine (R). The rice mutant carrying this mutation is the ACCase-inhibiting herbicide-resistant rice of the invention and is referred to as the C2099R mutant or A7>G7 mutant. The C2099R mutant (A7>G7 mutant) makes rice resistant to ACCase-inhibiting herbicides.
2. Construction of recombinant expression vectors
The recombinant expression vectors corresponding to the four adenine base editors of step 1 were synthesized artificially; each vector is a circular plasmid:
recombinant expression vector containing bpNLS-bpNLS: sABE-1;
recombinant expression vector containing 3×Flag1&1×NLS1-1×NLS1 (FNLS): FNLS-sABE-1;
recombinant expression vector containing 3×Flag2&4×NLS2-4×NLS2 (F4NLS): F4NLS-sABE-1;
recombinant expression vector containing 4×NLS2 (4NLS): 4NLS-sABE-1.
The nucleotide sequence of the sABE-1 recombinant expression vector is sequence 1 in the sequence listing. In sequence 1, positions 131-596 are the OsU6a promoter, positions 710-1090 are the OsU3 promoter, and positions 1204-1945 are the OsU6c promoter; positions 597-702, 1091-1196 and 1946-2051 are esgRNA nucleotide sequences, in which positions 597-616, 1091-1110 and 1946-1965 are the T1, T2 and OsACC-T target sequences respectively, and positions 617-702, 1111-1196 and 1966-2051 are esgRNA backbone nucleotide sequences; positions 2070-3783 are the OsUbq3 promoter; positions 3796-3846 are a bpNLS nucleotide sequence; positions 3847-4344 are the ecTadA coding sequence (without a stop codon), encoding the ecTadA protein shown in sequence 2; positions 4441-4938 are the ecTadA* coding sequence (without a stop codon), encoding the ecTadA* protein shown in sequence 3; positions 5035-9135 are the Cas9n coding sequence (without a stop codon), encoding the Cas9n protein shown in sequence 4; positions 9136-9186 are a bpNLS nucleotide sequence; positions 9529-9781 are the Nos terminator; positions 9822-11814 are the ZmUbi1 promoter, positions 11821-12846 are the hygromycin phosphotransferase coding sequence, and positions 12873-13088 are the CaMV35S polyA sequence. The three targets in the sABE-1 recombinant expression vector are T1, T2 and OsACC-T; their sequences are shown in Table 1.
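A feature map like this is easy to sanity-check in code. The sketch below records the coordinates quoted above as a table and verifies two simple properties: features do not overlap, and every protein-coding interval has a length divisible by 3. The table layout, the `coding` flag and the function names are assumptions made for this illustration; the label "ecTadA*" for positions 4441-4938 follows the ecTadA&ecTadA*&Cas9n naming used in the design section.

```python
# (name, start, end, coding) with 1-based inclusive coordinates in sequence 1.
FEATURES = [
    ("OsU6a promoter", 131, 596, False),
    ("T1 target", 597, 616, False),
    ("esgRNA backbone 1", 617, 702, False),
    ("OsU3 promoter", 710, 1090, False),
    ("T2 target", 1091, 1110, False),
    ("esgRNA backbone 2", 1111, 1196, False),
    ("OsU6c promoter", 1204, 1945, False),
    ("OsACC-T target", 1946, 1965, False),
    ("esgRNA backbone 3", 1966, 2051, False),
    ("OsUbq3 promoter", 2070, 3783, False),
    ("bpNLS (first)", 3796, 3846, True),
    ("ecTadA", 3847, 4344, True),
    ("ecTadA*", 4441, 4938, True),
    ("Cas9n", 5035, 9135, True),
    ("bpNLS (second)", 9136, 9186, True),
    ("Nos terminator", 9529, 9781, False),
    ("ZmUbi1 promoter", 9822, 11814, False),
    ("hygromycin phosphotransferase", 11821, 12846, True),
    ("CaMV35S polyA", 12873, 13088, False),
]


def feature_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def check_features(features) -> list[str]:
    """Return a list of problems: overlaps and coding lengths not divisible by 3."""
    problems = []
    previous_end = 0
    for name, start, end, coding in sorted(features, key=lambda f: f[1]):
        if start <= previous_end:
            problems.append(f"{name} overlaps the preceding feature")
        if coding and feature_length(start, end) % 3 != 0:
            problems.append(f"{name} length is not a multiple of 3")
        previous_end = end
    return problems


print(check_features(FEATURES) or "no problems found")
for name, start, end, coding in FEATURES:
    if coding:
        print(name, feature_length(start, end) // 3, "codons")
```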
The sequence of the FNLS-sABE-1 recombinant expression vector is obtained from the sequence of the sABE-1 recombinant expression vector by replacing the bpNLS nucleotide sequence at positions 3796-3846 of sequence 1 with sequence 5, replacing the bpNLS nucleotide sequence at positions 9136-9186 of sequence 1 with the nucleotide sequence at positions 73-93 of sequence 5, and keeping all other sequences unchanged. In sequence 5, positions 1-66 are the 3×Flag1 nucleotide sequence and positions 73-93 are the NLS1 nucleotide sequence; sequence 5 contains one NLS1 nucleotide sequence in total.
The sequence of the F4NLS-sABE-1 recombinant expression vector is obtained from the sequence of the sABE-1 recombinant expression vector by replacing the bpNLS nucleotide sequence at positions 3796-3846 of sequence 1 with sequence 6, replacing the bpNLS nucleotide sequence at positions 9136-9186 of sequence 1 with the nucleotide sequence at positions 55-201 of sequence 6, and keeping all other sequences unchanged. In sequence 6, positions 1-66 are the 3×Flag2 nucleotide sequence, and positions 73-93, 103-123, 133-153 and 163-183 are NLS2 nucleotide sequences; sequence 6 contains four NLS2 nucleotide sequences in total.
The sequence of the 4NLS-sABE-1 recombinant expression vector is obtained from the sequence of the sABE-1 recombinant expression vector by deleting the bpNLS nucleotide sequence at positions 3796-3846 of sequence 1, replacing the bpNLS nucleotide sequence at positions 9136-9186 of sequence 1 with the nucleotide sequence at positions 55-201 of sequence 6, and keeping all other sequences unchanged.
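The three derived vector sequences are each defined by coordinate-based replacements on sequence 1. The following is a minimal sketch of how such replacements could be applied in silico, assuming sequences 1, 5 and 6 are available as plain strings (sequences 5 and 6 are not reproduced in this section, so they are passed in as arguments). The function names are hypothetical. Replacements are applied from the 3' end towards the 5' end so that the original coordinates remain valid for every edit.

```python
BPNLS_FIRST = (3796, 3846)   # first bpNLS in sequence 1 (1-based, inclusive)
BPNLS_SECOND = (9136, 9186)  # second bpNLS in sequence 1 (1-based, inclusive)


def subsequence(seq, start, end):
    """Return positions start..end of seq (1-based, inclusive)."""
    return seq[start - 1:end]


def apply_replacements(sequence, replacements):
    """Replace (start, end, new_text) ranges given in ORIGINAL coordinates.

    Ranges are processed right to left, so an earlier replacement never
    shifts the coordinates of a later one. An empty new_text is a deletion.
    """
    original_length = len(sequence)
    next_start = original_length + 1
    for start, end, new_text in sorted(replacements, key=lambda r: r[0], reverse=True):
        if not 1 <= start <= end <= original_length:
            raise ValueError(f"range {start}-{end} is outside the sequence")
        if end >= next_start:
            raise ValueError("replacement ranges overlap")
        sequence = sequence[:start - 1] + new_text + sequence[end:]
        next_start = start
    return sequence


def derive_vectors(sabe1, seq5, seq6):
    """Build the three derived vector sequences as described in the text."""
    nls1 = subsequence(seq5, 73, 93)       # NLS1 region of sequence 5
    four_nls2 = subsequence(seq6, 55, 201)  # positions 55-201 of sequence 6
    return {
        "FNLS-sABE-1": apply_replacements(
            sabe1, [(*BPNLS_FIRST, seq5), (*BPNLS_SECOND, nls1)]),
        "F4NLS-sABE-1": apply_replacements(
            sabe1, [(*BPNLS_FIRST, seq6), (*BPNLS_SECOND, four_nls2)]),
        "4NLS-sABE-1": apply_replacements(
            sabe1, [(*BPNLS_FIRST, ""), (*BPNLS_SECOND, four_nls2)]),
    }


if __name__ == "__main__":
    # Toy strings only; real use would pass sequences 1, 5 and 6 from the listing.
    toy = derive_vectors("A" * 9200, "C" * 93, "G" * 201)
    for name, seq in toy.items():
        print(name, len(seq))
```

Working in original coordinates keeps each vector definition readable against the description, since the positions quoted in the text can be used unchanged.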
The target nucleotide sequence and the corresponding PAM sequence of each vector are shown in table 1.
TABLE 1
(Table 1 is provided as an image in the original publication: Figure BDA0002327692880000071.)
Second, obtaining positive T0 rice seedlings
The sABE-1, FNLS-sABE-1, F4NLS-sABE-1 and 4NLS-sABE-1 vectors obtained in step one were each processed according to the following steps 1-9:
1. The vector was introduced into Agrobacterium EHA105 (product of Shanghai Diego Biotechnology Ltd., CAT #: AC1010) to obtain recombinant Agrobacterium.
2. The recombinant Agrobacterium was cultured in medium (YEP medium containing 50 μg/ml kanamycin and 25 μg/ml rifampicin) with shaking at 28 ℃ and 150 rpm until the target OD600 was reached, and then centrifuged at room temperature at 10000 rpm for 1 min; the cells were resuspended in infection solution (N6 liquid medium in which the sugar component is replaced with glucose and sucrose, at concentrations of 10 g/L and 20 g/L, respectively) and diluted to an OD600 of 0.2 to obtain the Agrobacterium infection solution.
3. Mature seeds of the rice variety Nipponbare were dehulled, placed in a 100 mL Erlenmeyer flask, soaked in 70% (v/v) aqueous ethanol for 30 sec, then placed in 25% (v/v) aqueous sodium hypochlorite and sterilized with shaking at 120 rpm for 30 min, rinsed 3 times with sterile water, and blotted dry with filter paper; the seeds were then placed embryo-side down on N6 solid medium and cultured in the dark at 28 ℃ for 4-6 weeks to obtain rice callus.
4. After step 3, the rice callus was soaked for 10 min in Agrobacterium infection solution A (the liquid obtained by adding acetosyringone to the Agrobacterium infection solution at a volume ratio of acetosyringone to Agrobacterium infection solution of 25 μl : 50 ml), then placed on a culture dish lined with two layers of sterilized filter paper (containing about 200 ml of infection solution without Agrobacterium) and cultured in the dark at 21 ℃ for 1 day.
5. The rice callus obtained in step 4 was placed on recovery medium and cultured in the dark at 25-28 ℃ for 3 days.
6. The rice callus obtained in step 5 was placed on screening medium and cultured in the dark at 28 ℃ for 2 weeks.
7. The rice callus obtained in step 6 was placed on fresh screening medium and cultured in the dark at 28 ℃ for a further 2 weeks to obtain resistant rice callus.
8. The resistant rice callus obtained in step 7 was placed on differentiation medium and cultured under light at 25 ℃ for about 1 month; the differentiated plantlets were transplanted onto rooting medium and cultured under light at 25 ℃ for 2 weeks to obtain rice T0 seedlings.
9. Genomic DNA was extracted from the rice T0 seedlings and used as the template for PCR amplification with a primer pair consisting of primer F (5'-CCGAGGAGACTATCACCCCT-3') and primer R (5'-CGACCCATAACCTTGACAAGC-3'); the PCR amplification product was subjected to agarose gel electrophoresis and judged as follows: if the PCR amplification product contains a DNA fragment of about 853 bp, the corresponding rice T0 seedling is a rice positive T0 seedling; if the PCR amplification product does not contain a DNA fragment of about 853 bp, the corresponding rice T0 seedling is not a rice positive T0 seedling.
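The expected product size in step 9 can be reproduced with a simple in-silico PCR. The following is a minimal sketch under stated assumptions: it uses exact string matching only (no mismatches, no melting-temperature model), and the template in the usage example is a toy sequence built around the two primers rather than sequence 1 itself. As far as can be read from the listing of sequence 1 as printed, primer F matches positions 6437-6456 and the reverse complement of primer R matches positions 7269-7289, both within the Cas9n coding region, which is consistent with a product of 853 bp. The function names are hypothetical.

```python
PRIMER_F = "CCGAGGAGACTATCACCCCT"
PRIMER_R = "CGACCCATAACCTTGACAAGC"
EXPECTED_BP = 853

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward=PRIMER_F, reverse=PRIMER_R):
    """Length of the product primed on the top strand, or None if no product.

    The forward primer must match the template exactly, and the reverse
    primer must match the opposite strand downstream of the forward primer.
    """
    top = template.upper()
    start = top.find(forward.upper())
    if start == -1:
        return None
    reverse_site = top.find(reverse_complement(reverse), start + len(forward))
    if reverse_site == -1:
        return None
    return reverse_site + len(reverse) - start


def is_positive(template):
    """True if the template yields a product of the expected size."""
    return amplicon_length(template) == EXPECTED_BP


if __name__ == "__main__":
    # Toy template: primer sites separated by 812 nt of filler (assumption).
    toy = "A" * 15 + PRIMER_F + "T" * 812 + reverse_complement(PRIMER_R) + "A" * 15
    print(amplicon_length(toy), is_positive(toy))
    print(amplicon_length("ACGT" * 300))
```

Because both primers lie in the transgene, a seedling without the T-DNA is expected to give no product, which is what the `None` return value models.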
Third, result analysis
1. For each vector, the genomic DNA of the rice positive T0 seedlings obtained in step two was used as the template, and the OsACC-T target was amplified by PCR with primers to obtain a PCR amplification product.
2. The PCR amplification product obtained in step 1 was subjected to Sanger sequencing and analysis. Only the target region was analyzed in the sequencing results. The number of positive T0 seedlings with an A·G base substitution at the OsACC-T target was counted and the A·G base substitution efficiency was calculated; the results are shown in Table 2.
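The counting in step 2 can be sketched as follows. This is an illustration under stated assumptions, not the analysis actually used: the 20-nt reference is the stretch read at positions 1946-1965 of sequence 1 as printed in the listing (the OsACC-T target), each read is assumed to be already trimmed to the target window in the same orientation, heterozygous Sanger traces are not modeled, and efficiency is assumed to mean edited positive seedlings divided by all positive seedlings analyzed. The reads in the usage example are hypothetical and are not data from Table 2.

```python
# Target window read from positions 1946-1965 of sequence 1 (assumption:
# transcribed by hand from the listing; position 7 is an A).
OSACC_T_TARGET = "CATAGCACTCAATGCGGTCT"


def a_to_g_positions(reference, read):
    """1-based positions where the reference has A and the read has G."""
    if len(reference) != len(read):
        raise ValueError("read must be trimmed to the target window")
    pairs = zip(reference.upper(), read.upper())
    return [i for i, (ref, obs) in enumerate(pairs, start=1)
            if ref == "A" and obs == "G"]


def is_single_a7_to_g7(read, reference=OSACC_T_TARGET):
    """True if the only difference from the reference is A>G at position 7."""
    differences = [i for i, (ref, obs)
                   in enumerate(zip(reference.upper(), read.upper()), start=1)
                   if ref != obs]
    return len(read) == len(reference) and differences == [7]


def substitution_efficiency(edited, total):
    """Percentage of analyzed positive seedlings carrying the substitution."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * edited / total


if __name__ == "__main__":
    # Hypothetical reads for four seedlings (illustration only).
    reads = [
        "CATAGCGCTCAATGCGGTCT",  # A7>G7
        "CATAGCACTCAATGCGGTCT",  # unedited
        "CATAGCGCTCAATGCGGTCT",  # A7>G7
        "CATAGCGCTCAATGCGGTCT",  # A7>G7
    ]
    edited = sum(is_single_a7_to_g7(r) for r in reads)
    print(f"{substitution_efficiency(edited, len(reads)):.1f}%")
```

Checking for a single difference at position 7, rather than only for any A>G change, mirrors the statement that only the target site is mutated.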
The nuclear localization signal used by the sABE system is bpNLS-bpNLS; the nuclear localization signal used by the FNLS-sABE system is 3×Flag1&1×NLS1-1×NLS1; the nuclear localization signal used by the F4NLS-sABE system is 3×Flag2&4×NLS2-4×NLS2; the nuclear localization signal used by the 4NLS-sABE system is 4×NLS2.
The results show that all four systems (the sABE system, the FNLS-sABE system, the F4NLS-sABE system and the 4NLS-sABE system) generated only the single A7>G7 mutant (Figure 3), with A·G base substitution efficiencies all above 50%. The FNLS-sABE system had the highest A·G base substitution efficiency, 78.3%; the sABE system and the F4NLS-sABE system followed, with 66.7% and 61.9%, respectively; and the 4NLS-sABE system had the lowest, 52.2%. Therefore, the three systems sABE, FNLS-sABE and F4NLS-sABE generate only the single A7>G7 mutant (only the target site is mutated, and no other site is mutated) with high A·G base substitution efficiency, which greatly improves the efficiency of creating the A7>G7 mutant.
TABLE 2
(Table 2 is provided as images in the original publication: Figures BDA0002327692880000081 and BDA0002327692880000091.)
Example 2. Assay of the ACCase-inhibiting herbicide resistance of the A7>G7 mutant
First, herbicide spraying experiment
Fifteen rice positive T0 seedlings carrying the A7>G7 mutation and fifteen rice positive T0 seedlings without the A7>G7 mutation (denoted WT seedlings), obtained in step three of Example 1, were randomly selected and each subjected to the following steps 1-4:
1. When the positive T0 rice seedlings had grown to about 15-20 cm on the rooting medium, the lid of the culture dish was opened, clean water was added to harden the seedlings, and the seedlings were cultured under light at 25 ℃ for 5 days.
2. After hardening, the T0 seedlings were removed from the medium, residual medium was rinsed from the roots with clean water, and the seedlings were transplanted into small pots in a greenhouse and transferred to large pots for further culture after 15 days.
3. At the 6-10 leaf stage, three rice positive T0 seedlings carrying the A7>G7 mutation and three rice positive T0 seedlings without the A7>G7 mutation were randomly selected to form one group, giving five groups in total. Each group was sprayed with 1 L of the herbicide Gallant Super (69806-34-4, Dow AgroSciences; active ingredient haloxyfop-P-methyl, 108 g/L) at a concentration of 0 μg/L, 42 μg/L, 84 μg/L, 168 μg/L or 1680 μg/L, respectively, using a sprayer, once every two weeks, for 3 sprayings in total.
4. When the rice positive T0 seedlings without the A7>G7 mutation had stopped growing or even withered, while the rice positive T0 seedlings carrying the A7>G7 mutation headed and set seed normally, the plants were photographed.
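The group layout and spray dilutions in step 3 can be sketched as follows. This is a hedged illustration only: it assumes the stated concentrations refer to the active ingredient (the text does not say whether they refer to the active ingredient or to the formulated product), that the spray solution is prepared by diluting the 108 g/L formulation directly in water, and that seedlings are assigned to groups by a seeded shuffle. The function names and seedling identifiers are hypothetical.

```python
import random

STOCK_G_PER_L = 108.0                  # active ingredient in the formulation
DOSES_UG_PER_L = [0, 42, 84, 168, 1680]


def stock_volume_ul(target_ug_per_l, final_volume_l=1.0, stock_g_per_l=STOCK_G_PER_L):
    """Microlitres of formulation needed for the target concentration.

    1 g/L equals 1 ug/uL, so the stock holds stock_g_per_l ug per uL.
    """
    if target_ug_per_l < 0 or final_volume_l <= 0:
        raise ValueError("concentration must be >= 0 and volume > 0")
    required_ug = target_ug_per_l * final_volume_l
    return required_ug / stock_g_per_l


def assign_groups(mutant_ids, wt_ids, doses=DOSES_UG_PER_L, per_group=3, seed=0):
    """Randomly split mutant and WT seedlings into one group per dose."""
    mutants, wts = list(mutant_ids), list(wt_ids)
    needed = per_group * len(doses)
    if len(mutants) < needed or len(wts) < needed:
        raise ValueError("not enough seedlings for the requested design")
    rng = random.Random(seed)  # seeded so the layout can be reproduced
    rng.shuffle(mutants)
    rng.shuffle(wts)
    return {
        dose: {
            "mutant": mutants[i * per_group:(i + 1) * per_group],
            "WT": wts[i * per_group:(i + 1) * per_group],
        }
        for i, dose in enumerate(doses)
    }


if __name__ == "__main__":
    for dose in DOSES_UG_PER_L:
        print(f"{dose} ug/L: {stock_volume_ul(dose):.3f} uL formulation per 1 L")
    layout = assign_groups([f"M{i}" for i in range(1, 16)],
                           [f"WT{i}" for i in range(1, 16)])
    print(layout[1680])
```

Under these assumptions the volumes of formulation are below 20 μL per litre even at the highest dose, so in practice an intermediate dilution would probably be easier to pipette accurately.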
Second, result analysis
The herbicide used, whose active ingredient is haloxyfop-P-methyl, belongs to the aryloxyphenoxypropionate (APP) class of ACCase-inhibiting herbicides. Of the five concentrations tested (0 μg/L, 42 μg/L, 84 μg/L, 168 μg/L and 1680 μg/L), the first four had no significant inhibitory effect on the rice positive T0 seedlings without the A7>G7 mutation, i.e., the WT seedlings all grew normally; only the 1680 μg/L treatment significantly inhibited the WT seedlings, whereas the rice positive T0 seedlings carrying the A7>G7 mutation grew normally (Figure 4) and also headed and set seed normally (Figure 5). It can therefore be seen that rice positive T0 seedlings carrying the A7>G7 mutation are resistant to ACCase-inhibiting herbicides.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the technical principle of the present invention, and such improvements and refinements shall also be regarded as falling within the protection scope of the present invention.
Sequence listing
<110> Beijing Academy of Agriculture and Forestry Sciences
<120> Preparation method of herbicide-resistant rice
<160>14
<170>SIPOSequenceListing 1.0
<210>1
<211>19494
<212>DNA
<213>Artificial Sequence
<400>1
ggtggcagga tatattgtgg tgtaaacatg gcactagcct caccgtcttc gcagacgagg 60
ccgctaagtc gcagctacgc tctcaacggc actgactagg tagtttaaac gtgcacttaa 120
ttaaggtacc tggaatcggc agcaaaggat tttttcctgt agttttccca caaccatttt 180
ttaccatccg aatgatagga taggaaaaat atccaagtga acagtattcc tataaaattc 240
ccgtaaaaag cctgcaatcc gaatgagccc tgaagtctga actagccggt cacctgtaca 300
ggctatcgag atgccataca agagacggta gtaggaacta ggaagacgat ggttgattcg 360
tcaggcgaaa tcgtcgtcct gcagtcgcat ctatgggcct ggacggaata ggggaaaaag 420
ttggccggat aggagggaaa ggcccaggtg cttacgtgcg aggtaggcct gggctctcag 480
cacttcgatt cgttggcacc ggggtaggat gcaatagaga gcaacgttta gtaccacctc 540
gcttagctag agcaaactgg actgccttat atgcgcgggt gctggcttgg ctgccgagtg 600
cacggtgtcc gtggccgttt cagagctatg ctggaaacag catagcaagt tgaaataagg 660
ctagtccgtt atcaacttga aaaagtggca ccgagtcggt gcttttttta ggaatcttta 720
aacatacgaa cagatcactt aaagttcttc tgaagcaact taaagttatc aggcatgcat 780
ggatcttgga ggaatcagat gtgcagtcag ggaccatagc acaagacagg cgtcttctac 840
tggtgctacc agcaaatgct ggaagccggg aacactgggt acgttggaaa ccacgtgtga 900
tgtgaaggag taagataaac tgtaggagaa aagcatttcg tagtgggcca tgaagccttt 960
caggacatgt attgcagtat gggccggccc attacgcaat tggacgacaa caaagactag 1020
tattagtacc acctcggcta tccacataga tcaaagctgg tttaaaagag ttgtgcagat 1080
gatccgtggc gttgatagca agataaaccc gtttcagagc tatgctggaa acagcatagc 1140
aagttgaaat aaggctagtc cgttatcaac ttgaaaaagt ggcaccgagt cggtgctttt 1200
tttctcatta gcggtatgca tgttggtaga agtcggagat gtaaataatt ttcattatat 1260
aaaaaaggta cttcgagaaa aataaatgca tacgaattaa ttctttttat gttttttaaa 1320
ccaagtatat agaatttatt gatggttaaa atttcaaaaa tatgacgaga gaaaggttaa 1380
acgtacggca tatacttctg aacagagagg gaatatgggg tttttgttgc tcccaacaat 1440
tcttaagcac gtaaaggaaa aaagcacatt atccacattg tacttccaga gatatgtaca 1500
gcattacgta ggtacgtttt ctttttcttc ccggagagat gatacaataa tcatgtaaac 1560
ccagaattta aaaaatattc tttactataa aaattttaat tagggaacgt attatttttt 1620
acatgacacc ttttgagaaa gagggacttg taatatggga caaatgaaca atttctaaga 1680
aatgggcata tgactctcag tacaatggac caaattccct ccagtcggcc cagcaataca 1740
aagggaaaga aatgaggggg cccacaggcc acggcccact tttctccgtg gtggggagat 1800
ccagctagag gtccggccca caagtggccc ttgccccgtg ggacggtggg attgcagagc 1860
gcgtgggcgg aaacaacagt ttagtaccac ctcgctcacg caacgacgcg accacttgct 1920
tataagctgc tgcgctgagg ctcagcatag cactcaatgc ggtctgtttc agagctatgc 1980
tggaaacagc atagcaagtt gaaataaggc tagtccgtta tcaacttgaa aaagtggcac 2040
cgagtcggtg cttttttttt tttaagctta caaattcggg tcaaggcgga agccagcgcg 2100
ccaccccacg tcagcaaata cggaggcgcg gggttgacgg cgtcacccgg tcctaacggc 2160
gaccaacaaa ccagccagaa gaaattacag taaaaaaaaa gtaaattgca ctttgatcca 2220
ccttttatta cctaagtctc aatttggatc acccttaaac ctatcttttc aatttgggcc 2280
gggttgtggt ttggactacc atgaacaact tttcgtcatg tctaacttcc ctttcagcaa 2340
acatatgaac catatataga ggagatcggc cgtatactag agctgatgtg tttaaggtcg 2400
ttgattgcac gagaaaaaaa aatccaaatc gcaacaatag caaatttatc tggttcaaag 2460
tgaaaagata tgtttaaagg tagtccaaag taaaacttat agataataaa atgtggtcca 2520
aagcgtaatt cactcaaaaa aaatcaacga gacgtgtacc aaacggagac aaacggcatc 2580
ttctcgaaat ttcccaaccg ctcgctcgcc cgcctcgtct tcccggaaac cgcggtggtt 2640
tcagcgtggc ggattctcca agcagacgga gacgtcacgg cacgggactc ctcccaccac 2700
ccaaccgcca taaataccag ccccctcatc tcctctcctc gcatcagctc cacccccgaa 2760
aaatttctcc ccaatctcgc gaggctctcg tcgtcgaatc gaatcctctc gcgtcctcaa 2820
ggtacgctgc ttctcctctc ctcgcttcgt ttcgattcga tttcggacgg gtgaggttgt 2880
tttgttgcta gatccgattg gtggttaggg ttgtcgatgt gattatcgtg agatgtttag 2940
gggttgtaga tctgatggtt gtgatttggg cacggttggt tcgataggtg gaatcgtggt 3000
taggttttgg gattggatgt tggttctgat gattgggggg aatttttacg gttagatgaa 3060
ttgttggatg attcgattgg ggaaatcggt gtagatctgt tggggaattg tggaactagt 3120
catgcctgag tgattggtgc gatttgtagc gtgttccatc ttgtaggcct tgttgcgagc 3180
atgttcagat ctactgttcc gctcttgatt gagttattgg tgccatgggt tggtgcaaac 3240
acaggcttta atatgttata tctgttttgt gtttgatgta gatctgtagg gtagttcttc 3300
ttagacatgg ttcaattatg tagcttgtgc gtttcgattt gatttcatat gttcacagat 3360
tagataatga tgaactcttt taattaattg tcaatggtaa ataggaagtc ttgtcgctat 3420
atctgtcata atgatctcat gttactatct gccagtaatt tatgctaaga actatattag 3480
aatatcatgt tacaatctgt agtaatatca tgttacaatc tgtagttcat ctatataatc 3540
tattgtggta atttcttttt actatctgtg tgaagattat tgccactagt tcattctact 3600
tatttctgaa gttcaggata cgtgtgctgt tactacctat ctgaatacat gtgtgatgtg 3660
cctgttacta tctttttgaa tacatgtatg ttctgttgga atatgtttgc tgtttgatcc 3720
gttgttgtgt ccttaatctt gtgctagttc ttaccctatc tgtttggtga ttatttcttg 3780
cagtacgtaa gcatgaagag gaccgccgac ggcagcgagt tcgagccgaa gaagaagagg 3840
aaggtgtccg aggtggagtt ctcccacgag tactggatga ggcacgcact caccctcgca 3900
aagagggcat gggacgagag ggaggtgcct gtgggagcag tgctcgtgca caacaacagg 3960
gtgatcggag agggatggaa caggcctatc ggaaggcacg accctaccgc acacgcagag 4020
atcatggcac tcaggcaggg aggcctcgtg atgcagaact acaggctcat cgacgccacc 4080
ctctacgtga ccctcgagcc ttgcgtgatg tgcgcaggag ccatgatcca ctccaggatc 4140
ggaagggtgg tgttcggagc aagggacgca aagaccggag cagccggctc cctcatggac 4200
gtgctccacc acccgggcat gaaccacagg gtggagatca ccgagggaat cctcgcagac 4260
gagtgcgcag ccctcctctc cgacttcttc aggatgagga ggcaggagat caaggcccag 4320
aagaaggccc agtcctccac cgactccggc ggctcatcag gcggctcctc cggctccgag 4380
acaccgggca cctccgagtc cgccaccccg gagtcctccg gcggctcctc cggcggctcc 4440
tccgaggtgg agttctccca cgagtactgg atgaggcacg cactcaccct cgcaaagagg 4500
gcaagggacg agagggaggt gcctgtggga gcagtgctcg tgctcaacaa cagggtgatc 4560
ggagagggat ggaacagggc aatcggcctc cacgacccta ccgcacacgc agagatcatg 4620
gcactcaggc agggaggcct cgtgatgcag aactacaggc tcatcgacgc caccctctac 4680
gtgaccttcg agccttgcgt gatgtgcgca ggagccatga tccactccag gatcggcagg 4740
gtggtgttcg gcgtgaggaa cgcaaagacc ggagcagcag gctccctcat ggacgtgctc 4800
cactacccgg gcatgaacca cagggtggag atcaccgagg gaatcctcgc agacgagtgc 4860
gcagccctcc tctgctactt cttcaggatg ccgaggcagg tgttcaacgc ccagaagaag 4920
gcccagtcct ccaccgactc cggcggctca tcaggcggct cctccggctc cgagacaccg 4980
ggcacctccg agtccgccac cccggagtcc tccggcggct cctccggcgg ctccgacaag 5040
aagtactcca tcggcctcgc catcggcacc aacagcgtcg gctgggcggt gatcaccgac 5100
gagtacaagg tcccgtccaa gaagttcaag gtcctgggca acaccgaccg ccactccatc 5160
aagaagaacc tcatcggcgc cctcctcttc gactccggcg agacggcgga ggcgacccgc 5220
ctcaagcgca ccgcccgccg ccgctacacc cgccgcaaga accgcatctg ctacctccag 5280
gagatcttct ccaacgagat ggcgaaggtc gacgactcct tcttccaccg cctcgaggag 5340
tccttcctcg tggaggagga caagaagcac gagcgccacc ccatcttcgg caacatcgtc 5400
gacgaggtcg cctaccacga gaagtacccc actatctacc accttcgtaa gaagcttgtt 5460
gactctactg ataaggctga tcttcgtctc atctaccttg ctctcgctca catgatcaag 5520
ttccgtggtc acttccttat cgagggtgac cttaaccctg ataactccga cgtggacaag 5580
ctcttcatcc agctcgtcca gacctacaac cagctcttcg aggagaaccc tatcaacgct 5640
tccggtgtcg acgctaaggc gatcctttcc gctaggctct ccaagtccag gcgtctcgag 5700
aacctcatcg cccagctccc tggtgagaag aagaacggtc ttttcggtaa cctcatcgct 5760
ctctccctcg gtctgacccc taacttcaag tccaacttcg acctcgctga ggacgctaag 5820
cttcagctct ccaaggatac ctacgacgat gatctcgaca acctcctcgc tcagattgga 5880
gatcagtacg ctgatctctt ccttgctgct aagaacctct ccgatgctat cctcctttcg 5940
gatatcctta gggttaacac tgagatcact aaggctcctc tttctgcttc catgatcaag 6000
cgctacgacg agcaccacca ggacctcacc ctcctcaagg ctcttgttcg tcagcagctc 6060
cccgagaagt acaaggagat cttcttcgac cagtccaaga acggctacgc cggttacatt 6120
gacggtggag ctagccagga ggagttctac aagttcatca agccaatcct tgagaagatg 6180
gatggtactg aggagcttct cgttaagctt aaccgtgagg acctccttag gaagcagagg 6240
actttcgata acggctctat ccctcaccag atccaccttg gtgagcttca cgccatcctt 6300
cgtaggcagg aggacttcta ccctttcctc aaggacaacc gtgagaagat cgagaagatc 6360
cttactttcc gtattcctta ctacgttggt cctcttgctc gtggtaactc ccgtttcgct 6420
tggatgacta ggaagtccga ggagactatc accccttgga acttcgagga ggttgttgac 6480
aagggtgctt ccgcccagtc cttcatcgag cgcatgacca acttcgacaa gaacctcccc 6540
aacgagaagg tcctccccaa gcactccctc ctctacgagt acttcacggt ctacaacgag 6600
ctcaccaagg tcaagtacgt caccgagggt atgcgcaagc ctgccttcct ctccggcgag 6660
cagaagaagg ctatcgttga cctcctcttc aagaccaacc gcaaggtcac cgtcaagcag 6720
ctcaaggagg actacttcaa gaagatcgag tgcttcgact ccgtcgagat cagcggcgtt 6780
gaggaccgtt tcaacgcttc tctcggtacc taccacgatc tcctcaagat catcaaggac 6840
aaggacttcc tcgacaacga ggagaacgag gacatcctcg aggacatcgt cctcactctt 6900
actctcttcg aggataggga gatgatcgag gagaggctca agacttacgc tcatctcttc 6960
gatgacaagg ttatgaagca gctcaagcgt cgccgttaca ccggttgggg taggctctcc 7020
cgcaagctca tcaacggtat cagggataag cagagcggca agactatcct cgacttcctc 7080
aagtctgatg gtttcgctaa caggaacttc atgcagctca tccacgatga ctctcttacc 7140
ttcaaggagg atattcagaa ggctcaggtg tccggtcagg gcgactctct ccacgagcac 7200
attgctaacc ttgctggttc ccctgctatc aagaagggca tccttcagac tgttaaggtt 7260
gtcgatgagc ttgtcaaggt tatgggtcgt cacaagcctg agaacatcgt catcgagatg 7320
gctcgtgaga accagactac ccagaagggt cagaagaact cgagggagcg catgaagagg 7380
attgaggagg gtatcaagga gcttggttct cagatcctta aggagcaccc tgtcgagaac 7440
acccagctcc agaacgagaa gctctacctc tactacctcc agaacggtag ggatatgtac 7500
gttgaccagg agctcgacat caacaggctt tctgactacg acgtcgacca cattgttcct 7560
cagtctttcc ttaaggatga ctccatcgac aacaaggtcc tcacgaggtc cgacaagaac 7620
aggggtaagt cggacaacgt cccttccgag gaggttgtca agaagatgaa gaactactgg 7680
aggcagcttc tcaacgctaa gctcattacc cagaggaagt tcgacaacct cacgaaggct 7740
gagaggggtg gcctttccga gcttgacaag gctggtttca tcaagaggca gcttgttgag 7800
acgaggcaga ttaccaagca cgttgctcag atcctcgatt ctaggatgaa caccaagtac 7860
gacgagaacg acaagctcat ccgcgaggtc aaggtgatca ccctcaagtc caagctcgtc 7920
tccgacttcc gcaaggactt ccagttctac aaggtccgcg agatcaacaa ctaccaccac 7980
gctcacgatg cttaccttaa cgctgtcgtt ggtaccgctc ttatcaagaa gtaccctaag 8040
cttgagtccg agttcgtcta cggtgactac aaggtctacg acgttcgtaa gatgatcgcc 8100
aagtccgagc aggagatcgg caaggccacc gccaagtact tcttctactc caacatcatg 8160
aacttcttca agaccgagat caccctcgcc aacggcgaga tccgcaagcg ccctcttatc 8220
gagacgaacg gtgagactgg tgagatcgtt tgggacaagg gtcgcgactt cgctactgtt 8280
cgcaaggtcc tttctatgcc tcaggttaac atcgtcaaga agaccgaggt ccagaccggt 8340
ggcttctcca aggagtctat ccttccaaag agaaactcgg acaagctcat cgctaggaag 8400
aaggattggg accctaagaa gtacggtggt ttcgactccc ctactgtcgc ctactccgtc 8460
ctcgtggtcg ccaaggtgga gaagggtaag tcgaagaagc tcaagtccgt caaggagctc 8520
ctcggcatca ccatcatgga gcgctcctcc ttcgagaaga acccgatcga cttcctcgag 8580
gccaagggct acaaggaggt caagaaggac ctcatcatca agctccccaa gtactctctt 8640
ttcgagctcg agaacggtcg taagaggatg ctggcttccg ctggtgagct ccagaagggt 8700
aacgagcttg ctcttccttc caagtacgtg aacttcctct acctcgcctc ccactacgag 8760
aagctcaagg gttcccctga ggataacgag cagaagcagc tcttcgtgga gcagcacaag 8820
cactacctcg acgagatcat cgagcagatc tccgagttct ccaagcgcgt catcctcgct 8880
gacgctaacc tcgacaaggt cctctccgcc tacaacaagc accgcgacaa gcccatccgc 8940
gagcaggccg agaacatcat ccacctcttc acgctcacga acctcggcgc ccctgctgct 9000
ttcaagtact tcgacaccac catcgacagg aagcgttaca cgtccaccaa ggaggttctc 9060
gacgctactc tcatccacca gtccatcacc ggtctttacg agactcgtat cgacctttcc 9120
cagcttggtg gtgataagag gaccgccgac ggcagcgagt tcgagccgaa gaagaagagg 9180
aaggtgtaga ctagttcagc cagtttggtg gagctgccga tgtgcctggt cgtcccgagc 9240
ctctgttcgt caagtatttg tggtgctgat gtctacttgt gtctggttta atggaccatc 9300
gagtccgtat gatatgttag ttttatgaaa cagtttcctg tgggacagca gtatgcttta 9360
tgaataagtt ggatttgaac ctaaatatgt gctcaatttg ctcatttgca tctcattcct 9420
gttgatgttt tatctgagtt gcaagtttga aaatgctgca tattcttatt aaatcgtcat 9480
ttacttttat cttaatgagc tttgcaatgg cctatgggat ataaaagaga tcgttcaaac 9540
atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat gattatcata 9600
taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat gacgttattt 9660
atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc gatagaaaac 9720
aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat gttactagat 9780
cggcgcctgt ccgggcgcgc ctggtggatc gtccgcctag gctgcagtgc agcgtgaccc 9840
ggtcgtgccc ctctctagag ataatgagca ttgcatgtct aagttataaa aaattaccac 9900
atattttttt tgtcacactt gtttgaagtg cagtttatct atctttatac atatatttaa 9960
actttactct acgaataata taatctatag tactacaata atatcagtgt tttagagaat 10020
catataaatg aacagttaga catggtctaa aggacaattg agtattttga caacaggact 10080
ctacagtttt atctttttag tgtgcatgtg ttctcctttt tttttgcaaa tagcttcacc 10140
tatataatac ttcatccatt ttattagtac atccatttag ggtttagggt taatggtttt 10200
tatagactaa tttttttagt acatctattt tattctattt tagcctctaa attaagaaaa 10260
ctaaaactct attttagttt ttttatttaa taatttagat ataaaataga ataaaataaa 10320
gtgactaaaa attaaacaaa taccctttaa gaaattaaaa aaactaagga aacatttttc 10380
ttgtttcgag tagataatgc cagcctgtta aacgccgtcg acgagtctaa cggacaccaa 10440
ccagcgaacc agcagcgtcg cgtcgggcca agcgaagcag acggcacggc atctctgtcg 10500
ctgcctctgg acccctctcg agagttccgc tccaccgttg gacttgctcc gctgtcggca 10560
tccagaaatt gcgtggcgga gcggcagacg tgagccggca cggcaggcgg cctcctcctc 10620
ctctcacggc accggcagct acgggggatt cctttcccac cgctccttcg ctttcccttc 10680
ctcgcccgcc gtaataaata gacaccccct ccacaccctc tttccccaac ctcgtgttgt 10740
tcggagcgca cacacacaca accagatctc ccccaaatcc acccgtcggc acctccgctt 10800
caaggtacgc cgctcgtcct cccccccccc ccctctctac cttctctaga tcggcgttcc 10860
ggtccatggt tagggcccgg tagttctact tctgttcatg tttgtgttag atccgtgttt 10920
gtgttagatc cgtgctgcta gcgttcgtac acggatgcga cctgtacgtc agacacgttc 10980
tgattgctaa cttgccagtg tttctctttg gggaatcctg ggatggctct agccgttccg 11040
cagacgggat cgatttcatg attttttttg tttcgttgca tagggtttgg tttgcccttt 11100
tcctttattt caatatatgc cgtgcacttg tttgtcgggt catcttttca tgcttttttt 11160
tgtcttggtt gtgatgatgt ggtctggttg ggcggtcgtt ctagatcgga gtagaattct 11220
gtttcaaact acctggtgga tttattaatt ttggatctgt atgtgtgtgc catacatatt 11280
catagttacg aattgaagat gatggatgga aatatcgatc taggataggt atacatgttg 11340
atgcgggttt tactgatgca tatacagaga tgctttttgt tcgcttggtt gtgatgatgt 11400
ggtgtggttg ggcggtcgtt cattcgttct agatcggagt agaatactgt ttcaaactac 11460
ctggtgtatt tattaatttt ggaactgtat gtgtgtgtca tacatcttca tagttacgag 11520
tttaagatgg atggaaatat cgatctagga taggtataca tgttgatgtg ggttttactg 11580
atgcatatac atgatggcat atgcagcatc tattcatatg ctctaacctt gagtacctat 11640
ctattataat aaacaagtat gttttataat tattttgatc ttgatatact tggatgatgg 11700
catatgcagc agctatatgt ggattttttt agccctgcct tcatacgcta tttatttgct 11760
tggtactgtt tcttttgtcg atgctcaccc tgttgtttgg tgttacttct gcaggagctc 11820
atgaaaaagc ctgaactcac cgcgacgtct gtcgagaagt ttctgatcga aaagttcgac 11880
agcgtctccg acctgatgca gctctcggag ggcgaagaat ctcgtgcttt cagcttcgat 11940
gtaggagggc gtggatatgt cctgcgggta aatagctgcg ccgatggttt ctacaaagat 12000
cgttatgttt atcggcactt tgcatcggcc gcgctcccga ttccggaagt gcttgacatt 12060
ggggagttta gcgagagcct gacctattgc atctcccgcc gttcacaggg tgtcacgttg 12120
caagacctgc ctgaaaccga actgcccgct gttctacaac cggtcgcgga ggctatggat 12180
gcgatcgctg cggccgatct tagccagacg agcgggttcg gcccattcgg accgcaagga 12240
atcggtcaat acactacatg gcgtgatttc atatgcgcga ttgctgatcc ccatgtgtat 12300
cactggcaaa ctgtgatgga cgacaccgtc agtgcgtccg tcgcgcaggc tctcgatgag 12360
ctgatgcttt gggccgagga ctgccccgaa gtccggcacc tcgtgcacgc ggatttcggc 12420
tccaacaatg tcctgacgga caatggccgc ataacagcgg tcattgactg gagcgaggcg 12480
atgttcgggg attcccaata cgaggtcgcc aacatcttct tctggaggcc gtggttggct 12540
tgtatggagc agcagacgcg ctacttcgag cggaggcatc cggagcttgc aggatcgcca 12600
cgactccggg cgtatatgct ccgcattggt cttgaccaac tctatcagag cttggttgac 12660
ggcaatttcg atgatgcagc ttgggcgcag ggtcgatgcg acgcaatcgt ccgatccgga 12720
gccgggactg tcgggcgtac acaaatcgcc cgcagaagcg cggccgtctg gaccgatggc 12780
tgtgtagaag tactcgccga tagtggaaac cgacgcccca gcactcgtcc gagggcaaag 12840
aaatagagta gatgccgacc gggatctgtc gatcgacaag ctcgagtttc tccataataa 12900
tgtgtgagta gttcccagat aagggaatta gggttcctat agggtttcgc tcatgtgttg 12960
agcatataag aaacccttag tatgtatttg tatttgtaaa atacttctat caataaaatt 13020
tctaattcct aaaaccaaaa tccagtacta aaatccagat cccccgaatt aattcggcgt 13080
taattcagcc tgcaggacgc gtttaattaa gtgcacgcgg ccgcctactt agtcaagagc 13140
ctcgcacgcg actgtcacgc ggccaggatc gcctcgtgag cctcgcaatc tgtacctagt 13200
gtttaaacta tcagtgtttg acaggatata ttggcgggta aacctaagag aaaagagcgt 13260
ttattagaat aacggatatt taaaagggcg tgaaaaggtt tatccgttcg tccatttgta 13320
tgtgcatgcc aaccacaggg ttcccctcgg gatcaaagta ctttgatcca acccctccgc 13380
tgctatagtg cagtcggctt ctgacgttca gtgcagccgt cttctgaaaa cgacatgtcg 13440
cacaagtcct aagttacgcg acaggctgcc gccctgccct tttcctggcg ttttcttgtc 13500
gcgtgtttta gtcgcataaa gtagaatact tgcgactaga accggagaca ttacgccatg 13560
aacaagagcg ccgccgctgg cctgctgggc tatgcccgcg tcagcaccga cgaccaggac 13620
ttgaccaacc aacgggccga actgcacgcg gccggctgca ccaagctgtt ttccgagaag 13680
atcaccggca ccaggcgcga ccgcccggag ctggccagga tgcttgacca cctacgccct 13740
ggcgacgttg tgacagtgac caggctagac cgcctggccc gcagcacccg cgacctactg 13800
gacattgccg agcgcatcca ggaggccggc gcgggcctgc gtagcctggc agagccgtgg 13860
gccgacacca ccacgccggc cggccgcatg gtgttgaccg tgttcgccgg cattgccgag 13920
ttcgagcgtt ccctaatcat cgaccgcacc cggagcgggc gcgaggccgc caaggcccga 13980
ggcgtgaagt ttggcccccg ccctaccctc accccggcac agatcgcgca cgcccgcgag 14040
ctgatcgacc aggaaggccg caccgtgaaa gaggcggctg cactgcttgg cgtgcatcgc 14100
tcgaccctgt accgcgcact tgagcgcagc gaggaagtga cgcccaccga ggccaggcgg 14160
cgcggtgcct tccgtgagga cgcattgacc gaggccgacg ccctggcggc cgccgagaat 14220
gaacgccaag aggaacaagc atgaaaccgc accaggacgg ccaggacgaa ccgtttttca 14280
ttaccgaaga gatcgaggcg gagatgatcg cggccgggta cgtgttcgag ccgcccgcgc 14340
acgtctcaac cgtgcggctg catgaaatcc tggccggttt gtctgatgcc aagctggcgg 14400
cctggccggc cagcttggcc gctgaagaaa ccgagcgccg ccgtctaaaa aggtgatgtg 14460
tatttgagta aaacagcttg cgtcatgcgg tcgctgcgta tatgatgcga tgagtaaata 14520
aacaaatacg caaggggaac gcatgaaggt tatcgctgta cttaaccaga aaggcgggtc 14580
aggcaagacg accatcgcaa cccatctagc ccgcgccctg caactcgccg gggccgatgt 14640
tctgttagtc gattccgatc cccagggcag tgcccgcgat tgggcggccg tgcgggaaga 14700
tcaaccgcta accgttgtcg gcatcgaccg cccgacgatt gaccgcgacg tgaaggccat 14760
cggccggcgc gacttcgtag tgatcgacgg agcgccccag gcggcggact tggctgtgtc 14820
cgcgatcaag gcagccgact tcgtgctgat tccggtgcag ccaagccctt acgacatatg 14880
ggccaccgcc gacctggtgg agctggttaa gcagcgcatt gaggtcacgg atggaaggct 14940
acaagcggcc tttgtcgtgt cgcgggcgat caaaggcacg cgcatcggcg gtgaggttgc 15000
cgaggcgctg gccgggtacg agctgcccat tcttgagtcc cgtatcacgc agcgcgtgag 15060
ctacccaggc actgccgccg ccggcacaac cgttcttgaa tcagaacccg agggcgacgc 15120
tgcccgcgag gtccaggcgc tggccgctga aattaaatca aaactcattt gagttaatga 15180
ggtaaagaga aaatgagcaa aagcacaaac acgctaagtg ccggccgtcc gagcgcacgc 15240
agcagcaagg ctgcaacgtt ggccagcctg gcagacacgc cagccatgaa gcgggtcaac 15300
tttcagttgc cggcggagga tcacaccaag ctgaagatgt acgcggtacg ccaaggcaag 15360
accattaccg agctgctatc tgaatacatc gcgcagctac cagagtaaat gagcaaatga 15420
ataaatgagt agatgaattt tagcggctaa aggaggcggc atggaaaatc aagaacaacc 15480
aggcaccgac gccgtggaat gccccatgtg tggaggaacg ggcggttggc caggcgtaag 15540
cggctgggtt gtctgccggc cctgcaatgg cactggaacc cccaagcccg aggaatcggc 15600
gtgacggtcg caaaccatcc ggcccggtac aaatcggcgc ggcgctgggt gatgacctgg 15660
tggagaagtt gaaggccgcg caggccgccc agcggcaacg catcgaggca gaagcacgcc 15720
ccggtgaatc gtggcaagcg gccgctgatc gaatccgcaa agaatcccgg caaccgccgg 15780
cagccggtgc gccgtcgatt aggaagccgc ccaagggcga cgagcaacca gattttttcg 15840
ttccgatgct ctatgacgtg ggcacccgcg atagtcgcag catcatggac gtggccgttt 15900
tccgtctgtc gaagcgtgac cgacgagctg gcgaggtgat ccgctacgag cttccagacg 15960
ggcacgtaga ggtttccgca gggccggccg gcatggccag tgtgtgggat tacgacctgg 16020
tactgatggc ggtttcccat ctaaccgaat ccatgaaccg ataccgggaa gggaagggag 16080
acaagcccgg ccgcgtgttc cgtccacacg ttgcggacgt actcaagttc tgccggcgag 16140
ccgatggcgg aaagcagaaa gacgacctgg tagaaacctg cattcggtta aacaccacgc 16200
acgttgccat gcagcgtacg aagaaggcca agaacggccg cctggtgacg gtatccgagg 16260
gtgaagcctt gattagccgc tacaagatcg taaagagcga aaccgggcgg ccggagtaca 16320
tcgagatcga gctagctgat tggatgtacc gcgagatcac agaaggcaag aacccggacg 16380
tgctgacggt tcaccccgat tactttttga tcgatcccgg catcggccgt tttctctacc 16440
gcctggcacg ccgcgccgca ggcaaggcag aagccagatg gttgttcaag acgatctacg 16500
aacgcagtgg cagcgccgga gagttcaaga agttctgttt caccgtgcgc aagctgatcg 16560
ggtcaaatga cctgccggag tacgatttga aggaggaggc ggggcaggct ggcccgatcc 16620
tagtcatgcg ctaccgcaac ctgatcgagg gcgaagcatc cgccggttcc taatgtacgg 16680
agcagatgct agggcaaatt gccctagcag gggaaaaagg tcgaaaaggt ctctttcctg 16740
tggatagcac gtacattggg aacccaaagc cgtacattgg gaaccggaac ccgtacattg 16800
ggaacccaaa gccgtacatt gggaaccggt cacacatgta agtgactgat ataaaagaga 16860
aaaaaggcga tttttccgcc taaaactctt taaaacttat taaaactctt aaaacccgcc 16920
tggcctgtgc ataactgtct ggccagcgca cagccgaaga gctgcaaaaa gcgcctaccc 16980
ttcggtcgct gcgctcccta cgccccgccg cttcgcgtcg gcctatcgcg gccgctggcc 17040
gctcaaaaat ggctggccta cggccaggca atctaccagg gcgcggacaa gccgcgccgt 17100
cgccactcga ccgccggcgc ccacatcaag gcaccctgcc tcgcgcgttt cggtgatgac 17160
ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct gtaagcggat 17220
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg tcggggcgca 17280
gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat gcggcatcag 17340
agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga tgcgtaagga 17400
gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 17460
ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 17520
caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 17580
aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 17640
atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 17700
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 17760
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 17820
gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 17880
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 17940
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 18000
cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 18060
gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 18120
aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 18180
aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 18240
actcacgtta agggattttg gtcatgcatt ctaggtacta aaacaattca tccagtaaaa 18300
tataatattt tattttctcc caatcaggct tgatccccag taagtcaaaa aatagctcga 18360
catactgttc ttccccgata tcctccctga tcgaccggac gcagaaggca atgtcatacc 18420
acttgtccgc cctgccgctt ctcccaagat caataaagcc acttactttg ccatctttca 18480
caaagatgtt gctgtctccc aggtcgccgt gggaaaagac aagttcctct tcgggctttt 18540
ccgtctttaa aaaatcatac agctcgcgcg gatctttaaa tggagtgtct tcttcccagt 18600
tttcgcaatc cacatcggcc agatcgttat tcagtaagta atccaattcg gctaagcggc 18660
tgtctaagct attcgtatag ggacaatccg atatgtcgat ggagtgaaag agcctgatgc 18720
actccgcata cagctcgata atcttttcag ggctttgttc atcttcatac tcttccgagc 18780
aaaggacgcc atcggcctca ctcatgagca gattgctcca gccatcatgc cgttcaaagt 18840
gcaggacctt tggaacaggc agctttcctt ccagccatag catcatgtcc ttttcccgtt 18900
ccacatcata ggtggtccct ttataccggc tgtccgtcat ttttaaatat aggttttcat 18960
tttctcccac cagcttatat accttagcag gagacattcc ttccgtatct tttacgcagc 19020
ggtatttttc gatcagtttt ttcaattccg gtgatattct cattttagcc atttattatt 19080
tccttcctct tttctacagt atttaaagat accccaagaa gctaattata acaagacgaa 19140
ctccaattca ctgttccttg cattctaaaa ccttaaatac cagaaaacag ctttttcaaa 19200
gttgttttca aagttggcgt ataacatagt atcgacggag ccgattttga aaccgcggtg 19260
atcacaggca gcaacgctct gtcatcgtta caatcaacat gctaccctcc gcgagatcat 19320
ccgtgtttca aacccggcag cttagttgcc gttcttccga atagcatcgg taacatgagc 19380
aaagtctgcc gccttacaac ggctctcccg ctgacgccgt cccggactga tgggctgcct 19440
gtatcgagtg gtgattttgt gccgagctgc cggtcgggga gctgttggct ggct 19494
<210>2
<211>166
<212>PRT
<213>Artificial Sequence
<400>2
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
1 5 10 15
Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala Val
20 25 30
Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro Ile
35 40 45
Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
50 55 60
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
65 70 75 80
Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
85 90 95
Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala
100 105 110
Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His Arg
115 120 125
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
130 135 140
Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys
145 150 155 160
Ala Gln Ser Ser Thr Asp
165
<210>3
<211>166
<212>PRT
<213>Artificial Sequence
<400>3
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
1 5 10 15
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
20 25 30
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
35 40 45
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
50 55 60
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
65 70 75 80
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
85 90 95
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
100 105 110
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
115 120 125
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
130 135 140
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
145 150 155 160
Ala Gln Ser Ser Thr Asp
165
<210>4
<211>1367
<212>PRT
<213>Artificial Sequence
<400>4
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn
1025 1030 1035 1040
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile
1045 1050 1055
Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val
1060 1065 1070
Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met
1075 1080 1085
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe
1090 1095 1100
Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
1105 1110 1115 1120
Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro
1125 1130 1135
Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1140 1145 1150
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
1155 1160 1165
Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1170 1175 1180
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr
1185 1190 1195 1200
Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1205 1210 1215
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr
1250 1255 1260
Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile
1265 1270 1275 1280
Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His
1285 1290 1295
Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe
1300 1305 1310
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr
1315 1320 1325
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala
1330 1335 1340
Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp
1345 1350 1355 1360
Leu Ser Gln Leu Gly Gly Asp
1365
<210>5
<211>117
<212>DNA
<213>Artificial Sequence
<400>5
gactacaagg accacgacgg cgactacaag gatcatgaca tcgactacaa ggacgacgac 60
gacaagatgg ccccgaagaa gaagaggaaa gtgggcatcc acggcgtgcc ggccgcc 117
<210>6
<211>207
<212>DNA
<213>Artificial Sequence
<400>6
gactacaagg accacgacgg ggattacaaa gaccacgaca tagactacaa ggatgacgat 60
gacaaaatgg caccgaagaa aaaaaggaag gtcggcggct ccccgaagaa aaaaaggaag 120
gtcggcggct ccccgaagaa aaaaaggaag gtcggcggct ccccgaagaa aaaaaggaag 180
gtcggaatcc atggcgttcc agctgcc 207
<210>7
<211>17
<212>PRT
<213>Artificial Sequence
<400>7
Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys Lys Arg Lys
1 5 10 15
Val
<210>8
<211>31
<212>PRT
<213>Artificial Sequence
<400>8
Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr
1 5 10 15
Lys Asp Asp Asp Asp Lys Met Ala Pro Lys Lys Lys Arg Lys Val
20 25 30
<210>9
<211>7
<212>PRT
<213>Artificial Sequence
<400>9
Pro Lys Lys Lys Arg Lys Val
1 5
<210>10
<211>61
<212>PRT
<213>Artificial Sequence
<400>10
Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr
1 5 10 15
Lys Asp Asp Asp Asp Lys Met Ala Pro Lys Lys Lys Arg Lys Val Gly
20 25 30
Gly Ser Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Pro Lys Lys Lys
35 40 45
Arg Lys Val Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
50 55 60
<210>11
<211>37
<212>PRT
<213>Artificial Sequence
<400>11
Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Pro Lys Lys Lys Arg Lys
1 5 10 15
Val Gly Gly Ser Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Pro Lys
20 25 30
Lys Lys Arg Lys Val
35
<210>12
<211>20
<212>DNA
<213>Artificial Sequence
<400>12
catagcactc aatgcggtct 20
<210>13
<211>2327
<212>PRT
<213>Artificial Sequence
<400>13
Met Thr Ser Thr His Val Ala Thr Leu Gly Val Gly Ala Gln Ala Pro
1 5 10 15
Pro Arg His Gln Lys Lys Ser Ala Gly Thr Ala Phe Val Ser Ser Gly
20 25 30
Ser Ser Arg Pro Ser Tyr Arg Lys Asn Gly Gln Arg Thr Arg Ser Leu
35 40 45
Arg Glu Glu Ser Asn Gly Gly Val Ser Asp Ser Lys Lys Leu Asn His
50 55 60
Ser Ile Arg Gln Gly Leu Ala Gly Ile Ile Asp Leu Pro Asn Asp Ala
65 70 75 80
Ala Ser Glu Val Asp Ile Ser His Gly Ser Glu Asp Pro Arg Gly Pro
85 90 95
Thr Val Pro Gly Ser Tyr Gln Met Asn Gly Ile Ile Asn Glu Thr His
100 105 110
Asn Gly Arg His Ala Ser Val Ser Lys Val Val Glu Phe Cys Thr Ala
115 120 125
Leu Gly Gly Lys Thr Pro Ile His Ser Val Leu Val Ala Asn Asn Gly
130 135 140
Met Ala Ala Ala Lys Phe Met Arg Ser Val Arg Thr Trp Ala Asn Asp
145 150 155 160
Thr Phe Gly Ser Glu Lys Ala Ile Gln Leu Ile Ala Met Ala Thr Pro
165 170 175
Glu Asp Leu Arg Ile Asn Ala Glu His Ile Arg Ile Ala Asp Gln Phe
180 185 190
Val Glu Val Pro Gly Gly Thr Asn Asn Asn Asn Tyr Ala Asn Val Gln
195 200 205
Leu Ile Val Glu Ile Ala Glu Arg Thr Gly Val Ser Ala Val Trp Pro
210 215 220
Gly Trp Gly His Ala Ser Glu Asn Pro Glu Leu Pro Asp Ala Leu Thr
225 230 235 240
Ala Lys Gly Ile Val Phe Leu Gly Pro Pro Ala Ser Ser Met His Ala
245 250 255
Leu Gly Asp Lys Val Gly Ser Ala Leu Ile Ala Gln Ala Ala Gly Val
260 265 270
Pro Thr Leu Ala Trp Ser Gly Ser His Val Glu Val Pro Leu Glu Cys
275 280 285
Cys Leu Asp Ser Ile Pro Asp Glu Met Tyr Arg Lys Ala Cys Val Thr
290 295 300
Thr Thr Glu Glu Ala Val Ala Ser Cys Gln Val Val Gly Tyr Pro Ala
305 310 315 320
Met Ile Lys Ala Ser Trp Gly Gly Gly Gly Lys Gly Ile Arg Lys Val
325 330 335
His Asn Asp Asp Glu Val Arg Thr Leu Phe Lys Gln Val Gln Gly Glu
340 345 350
Val Pro Gly Ser Pro Ile Phe Ile Met Arg Leu Ala Ala Gln Ser Arg
355 360 365
His Leu Glu Val Gln Leu Leu Cys Asp Gln Tyr Gly Asn Val Ala Ala
370 375 380
Leu His Ser Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Ile
385 390 395 400
Glu Glu Gly Pro Val Thr Val Ala Pro Arg Glu Thr Val Lys Glu Leu
405 410 415
Glu Gln Ala Ala Arg Arg Leu Ala Lys Ala Val Gly Tyr Val Gly Ala
420 425 430
Ala Thr Val Glu Tyr Leu Tyr Ser Met Glu Thr Gly Glu Tyr Tyr Phe
435 440 445
Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His Pro Val Thr Glu Trp
450 455 460
Ile Ala Glu Val Asn Leu Pro Ala Ala Gln Val Ala Val Gly Met Gly
465 470 475 480
Ile Pro Leu Trp Gln Ile Pro Glu Ile Arg Arg Phe Tyr Gly Met Asn
485 490 495
His Gly Gly Gly Tyr Asp Leu Trp Arg Lys Thr Ala Ala Leu Ala Thr
500 505 510
Pro Phe Asn Phe Asp Glu Val Asp Ser Lys Trp Pro Lys Gly His Cys
515 520 525
Val Ala Val Arg Ile Thr Ser Glu Asp Pro Asp Asp Gly Phe Lys Pro
530 535 540
Thr Gly Gly Lys Val Lys Glu Ile Ser Phe Lys Ser Lys Pro Asn Val
545 550 555 560
Trp Ala Tyr Phe Ser Val Lys Ser Gly Gly Gly Ile His Glu Phe Ala
565 570 575
Asp Ser Gln Phe Gly His Val Phe Ala Tyr Gly Thr Thr Arg Ser Ala
580 585 590
Ala Ile Thr Thr Met Ala Leu Ala Leu Lys Glu Val Gln Ile Arg Gly
595 600 605
Glu Ile His Ser Asn Val Asp Tyr Thr Val Asp Leu Leu Asn Ala Ser
610 615 620
Asp Phe Arg Glu Asn Lys Ile His Thr Gly Trp Leu Asp Thr Arg Ile
625 630 635 640
Ala Met Arg Val Gln Ala Glu Arg Pro Pro Trp Tyr Ile Ser Val Val
645 650 655
Gly Gly Ala Leu Tyr Lys Thr Val Thr Ala Asn Thr Ala Thr Val Ser
660 665 670
Asp Tyr Val Gly Tyr Leu Thr Lys Gly Gln Ile Pro Pro Lys His Ile
675 680 685
Ser Leu Val Tyr Thr Thr Val Ala Leu Asn Ile Asp Gly Lys Lys Tyr
690 695 700
Thr Ile Asp Thr Val Arg Ser Gly His Gly Ser Tyr Arg Leu Arg Met
705 710 715 720
Asn Gly Ser Thr Val Asp Ala Asn Val Gln Ile Leu Cys Asp Gly Gly
725 730 735
Leu Leu Met Gln Leu Asp Gly Asn Ser His Val Ile Tyr Ala Glu Glu
740 745 750
Glu Ala Ser Gly Thr Arg Leu Leu Ile Asp Gly Lys Thr Cys Met Leu
755 760 765
Gln Asn Asp His Asp Pro Ser Lys Leu Leu Ala Glu Thr Pro Cys Lys
770 775 780
Leu Leu Arg Phe Leu Val Ala Asp Gly Ala His Val Asp Ala Asp Val
785 790 795 800
Pro Tyr Ala Glu Val Glu Val Met Lys Met Cys Met Pro Leu Leu Ser
805 810 815
Pro Ala Ser Gly Val Ile His Val Val Met Ser Glu Gly Gln Ala Met
820 825 830
Gln Ala Gly Asp Leu Ile Ala Arg Leu Asp Leu Asp Asp Pro Ser Ala
835 840 845
Val Lys Arg Ala Glu Pro Phe Glu Asp Thr Phe Pro Gln Met Gly Leu
850 855 860
Pro Ile Ala Ala Ser Gly Gln Val His Lys Leu Cys Ala Ala Ser Leu
865 870 875 880
Asn Ala Cys Arg Met Ile Leu Ala Gly Tyr Glu His Asp Ile Asp Lys
885 890 895
Val Val Pro Glu Leu Val Tyr Cys Leu Asp Thr Pro Glu Leu Pro Phe
900 905 910
Leu Gln Trp Glu Glu Leu Met Ser Val Leu Ala Thr Arg Leu Pro Arg
915 920 925
Asn Leu Lys Ser Glu Leu Glu Gly Lys Tyr Glu Glu Tyr Lys Val Lys
930 935 940
Phe Asp Ser Gly Ile Ile Asn Asp Phe Pro Ala Asn Met Leu Arg Val
945 950 955 960
Ile Ile Glu Glu Asn Leu Ala Cys Gly Ser Glu Lys Glu Lys Ala Thr
965 970 975
Asn Glu Arg Leu Val Glu Pro Leu Met Ser Leu Leu Lys Ser Tyr Glu
980 985 990
Gly Gly Arg Glu Ser His Ala His Phe Val Val Lys Ser Leu Phe Glu
995 1000 1005
Glu Tyr Leu Tyr Val Glu Glu Leu Phe Ser Asp Gly Ile Gln Ser Asp
1010 1015 1020
Val Ile Glu Arg Leu Arg Leu Gln His Ser Lys Asp Leu Gln Lys Val
1025 1030 1035 1040
Val Asp Ile Val Leu Ser His Gln Ser Val Arg Asn Lys Thr Lys Leu
1045 1050 1055
Ile Leu Lys Leu Met Glu Ser Leu Val Tyr Pro Asn Pro Ala Ala Tyr
1060 1065 1070
Arg Asp Gln Leu Ile Arg Phe Ser Ser Leu Asn His Lys Ala Tyr Tyr
1075 1080 1085
Lys Leu Ala Leu Lys Ala Ser Glu Leu Leu Glu Gln Thr Lys Leu Ser
1090 1095 1100
Glu Leu Arg Ala Arg Ile Ala Arg Ser Leu Ser Glu Leu Glu Met Phe
1105 1110 1115 1120
Thr Glu Glu Ser Lys Gly Leu Ser Met His Lys Arg Glu Ile Ala Ile
1125 1130 1135
Lys Glu Ser Met Glu Asp Leu Val Thr Ala Pro Leu Pro Val Glu Asp
1140 1145 1150
Ala Leu Ile Ser Leu Phe Asp Cys Ser Asp Thr Thr Val Gln Gln Arg
1155 1160 1165
Val Ile Glu Thr Tyr Ile Ala Arg Leu Tyr Gln Pro His Leu Val Lys
1170 1175 1180
Asp Ser Ile Lys Met Lys Trp Ile Glu Ser Gly Val Ile Ala Leu Trp
1185 1190 1195 1200
Glu Phe Pro Glu Gly His Phe Asp Ala Arg Asn Gly Gly Ala Val Leu
1205 1210 1215
Gly Asp Lys Arg Trp Gly Ala Met Val Ile Val Lys Ser Leu Glu Ser
1220 1225 1230
Leu Ser Met Ala Ile Arg Phe Ala Leu Lys Glu Thr Ser His Tyr Thr
1235 1240 1245
Ser Ser Glu Gly Asn Met Met His Ile Ala Leu Leu Gly Ala Asp Asn
1250 1255 1260
Lys Met His Ile Ile Gln Glu Ser Gly Asp Asp Ala Asp Arg Ile Ala
1265 1270 1275 1280
Lys Leu Pro Leu Ile Leu Lys Asp Asn Val Thr Asp Leu His Ala Ser
1285 1290 1295
Gly Val Lys Thr Ile Ser Phe Ile Val Gln Arg Asp Glu Ala Arg Met
1300 1305 1310
Thr Met Arg Arg Thr Phe Leu Trp Ser Asp Glu Lys Leu Ser Tyr Glu
1315 1320 1325
Glu Glu Pro Ile Leu Arg His Val Glu Pro Pro Leu Ser Ala Leu Leu
1330 1335 1340
Glu Leu Asp Lys Leu Lys Val Lys Gly Tyr Asn Glu Met Lys Tyr Thr
1345 1350 1355 1360
Pro Ser Arg Asp Arg Gln Trp His Ile Tyr Thr Leu Arg Asn Thr Glu
1365 1370 1375
Asn Pro Lys Met Leu His Arg Val Phe Phe Arg Thr Leu Val Arg Gln
1380 1385 1390
Pro Ser Val Ser Asn Lys Phe Ser Ser Gly Gln Ile Gly Asp Met Glu
1395 1400 1405
Val Gly Ser Ala Glu Glu Pro Leu Ser Phe Thr Ser Thr Ser Ile Leu
1410 1415 1420
Arg Ser Leu Met Thr Ala Ile Glu Glu Leu Glu Leu His Ala Ile Arg
1425 1430 1435 1440
Thr Gly His Ser His Met Tyr Leu His Val Leu Lys Glu Gln Lys Leu
1445 1450 1455
Leu Asp Leu Val Pro Val Ser Gly Asn Thr Val Leu Asp Val Gly Gln
1460 1465 1470
Asp Glu Ala Thr Ala Tyr Ser Leu Leu Lys Glu Met Ala Met Lys Ile
1475 1480 1485
His Glu Leu Val Gly Ala Arg Met His His Leu Ser Val Cys Gln Trp
1490 1495 1500
Glu Val Lys Leu Lys Leu Asp Cys Asp Gly Pro Ala Ser Gly Thr Trp
1505 1510 1515 1520
Arg Ile Val Thr Thr Asn Val Thr Ser His Thr Cys Thr Val Asp Ile
1525 1530 1535
Tyr Arg Glu Met Glu Asp Lys Glu Ser Arg Lys Leu Val Tyr His Pro
1540 1545 1550
Ala Thr Pro Ala Ala Gly Pro Leu His Gly Val Ala Leu Asn Asn Pro
1555 1560 1565
Tyr Gln Pro Leu Ser Val Ile Asp Leu Lys Arg Cys Ser Ala Arg Asn
1570 1575 1580
Asn Arg Thr Thr Tyr Cys Tyr Asp Phe Pro Leu Ala Phe Glu Thr Ala
1585 1590 1595 1600
Val Arg Lys Ser Trp Ser Ser Ser Thr Ser Gly Ala Ser Lys Gly Val
1605 1610 1615
Glu Asn Ala Gln Cys Tyr Val Lys Ala Thr Glu Leu Val Phe Ala Asp
1620 1625 1630
Lys His Gly Ser Trp Gly Thr Pro Leu Val Gln Met Asp Arg Pro Ala
1635 1640 1645
Gly Leu Asn Asp Ile Gly Met Val Ala Trp Thr Leu Lys Met Ser Thr
1650 1655 1660
Pro Glu Phe Pro Ser Gly Arg Glu Ile Ile Val Val Ala Asn Asp Ile
1665 1670 1675 1680
Thr Phe Arg Ala Gly Ser Phe Gly Pro Arg Glu Asp Ala Phe Phe Glu
1685 1690 1695
Ala Val Thr Asn Leu Ala Cys Glu Lys Lys Leu Pro Leu Ile Tyr Leu
1700 1705 1710
Ala Ala Asn Ser Gly Ala Arg Ile Gly Ile Ala Asp Glu Val Lys Ser
1715 1720 1725
Cys Phe Arg Val Gly Trp Ser Asp Asp Gly Ser Pro Glu Arg Gly Phe
1730 1735 1740
Gln Tyr Ile Tyr Leu Ser Glu Glu Asp Tyr Ala Arg Ile Gly Thr Ser
1745 1750 1755 1760
Val Ile Ala His Lys Met Gln Leu Asp Ser Gly Glu Ile Arg Trp Val
1765 1770 1775
Ile Asp Ser Val Val Gly Lys Glu Asp Gly Leu Gly Val Glu Asn Ile
1780 1785 1790
His Gly Ser Ala Ala Ile Ala Ser Ala Tyr Ser Arg Ala Tyr Lys Glu
1795 1800 1805
Thr Phe Thr Leu Thr Phe Val Thr Gly Arg Thr Val Gly Ile Gly Ala
1810 1815 1820
Tyr Leu Ala Arg Leu Gly Ile Arg Cys Ile Gln Arg Leu Asp Gln Pro
1825 1830 1835 1840
Ile Ile Leu Thr Gly Tyr Ser Ala Leu Asn Lys Leu Leu Gly Arg Glu
1845 1850 1855
Val Tyr Ser Ser His Met Gln Leu Gly Gly Pro Lys Ile Met Ala Thr
1860 1865 1870
Asn Gly Val Val His Leu Thr Val Ser Asp Asp Leu Glu Gly Val Ser
1875 1880 1885
Asn Ile Leu Arg Trp Leu Ser Tyr Val Pro Ala Tyr Ile Gly Gly Pro
1890 1895 1900
Leu Pro Val Thr Thr Pro Leu Asp Pro Pro Asp Arg Pro Val Ala Tyr
1905 1910 1915 1920
Ile Pro Glu Asn Ser Cys Asp Pro Arg Ala Ala Ile Arg Gly Val Asp
1925 1930 1935
Asp Ser Gln Gly Lys Trp Leu Gly Gly Met Phe Asp Lys Asp Ser Phe
1940 1945 1950
Val Glu Thr Phe Glu Gly Trp Ala Lys Thr Val Val Thr Gly Arg Ala
1955 1960 1965
Lys Leu Gly Gly Ile Pro Val Gly Val Ile Ala Val Glu Thr Gln Thr
1970 1975 1980
Met Met Gln Thr Ile Pro Ala Asp Pro Gly Gln Leu Asp Ser Arg Glu
1985 1990 1995 2000
Gln Ser Val Pro Arg Ala Gly Gln Val Trp Phe Pro Asp Ser Ala Thr
2005 2010 2015
Lys Thr Ala Gln Ala Leu Leu Asp Phe Asn Arg Glu Gly Leu Pro Leu
2020 2025 2030
Phe Ile Leu Ala Asn Trp Arg Gly Phe Ser Gly Gly Gln Arg Asp Leu
2035 2040 2045
Phe Glu Gly Ile Leu Gln Ala Gly Ser Thr Ile Val Glu Asn Leu Arg
2050 2055 2060
Thr Tyr Asn Gln Pro Ala Phe Val Tyr Ile Pro Met Ala Ala Glu Leu
2065 2070 2075 2080
Arg Gly Gly Ala Trp Val Val Val Asp Ser Lys Ile Asn Pro Asp Arg
2085 2090 2095
Ile Glu Cys Tyr Ala Glu Arg Thr Ala Lys Gly Asn Val Leu Glu Pro
2100 2105 2110
Gln Gly Leu Ile Glu Ile Lys Phe Arg Ser Glu Glu Leu Gln Asp Cys
2115 2120 2125
Met Ser Arg Leu Asp Pro Thr Leu Ile Asp Leu Lys Ala Lys Leu Glu
2130 2135 2140
Val Ala Asn Lys Asn Gly Ser Ala Asp Thr Lys Ser Leu Gln Glu Asn
2145 2150 2155 2160
Ile Glu Ala Arg Thr Lys Gln Leu Met Pro Leu Tyr Thr Gln Ile Ala
2165 2170 2175
Ile Arg Phe Ala Glu Leu His Asp Thr Ser Leu Arg Met Ala Ala Lys
2180 2185 2190
Gly Val Ile Lys Lys Val Val Asp Trp Glu Glu Ser Arg Ser Phe Phe
2195 2200 2205
Tyr Lys Arg Leu Arg Arg Arg Ile Ser Glu Asp Val Leu Ala Lys Glu
2210 2215 2220
Ile Arg Ala Val Ala Gly Glu Gln Phe Ser His Gln Pro Ala Ile Glu
2225 2230 2235 2240
Leu Ile Lys Lys Trp Tyr Ser Ala Ser His Ala Ala Glu Trp Asp Asp
2245 2250 2255
Asp Asp Ala Phe Val Ala Trp Met Asp Asn Pro Glu Asn Tyr Lys Asp
2260 2265 2270
Tyr Ile Gln Tyr Leu Lys Ala Gln Arg Val Ser Gln Ser Leu Ser Ser
2275 2280 2285
Leu Ser Asp Ser Ser Ser Asp Leu Gln Ala Leu Pro Gln Gly Leu Ser
2290 2295 2300
Met Leu Leu Asp Lys Met Asp Pro Ser Arg Arg Ala Gln Leu Val Glu
2305 2310 2315 2320
Glu Ile Arg Lys Val Leu Gly
2325
<210>14
<211>6984
<212>DNA
<213>Artificial Sequence
<400>14
atgacatcca cacatgtggc gacattggga gttggtgccc aggcacctcc tcgtcaccag 60
aaaaagtcag ctggcactgc atttgtatca tctgggtcat caagaccctc ataccgaaag 120
aatggtcagc gtactcggtc acttagggaa gaaagcaatg gaggagtgtc tgattccaaa 180
aagcttaacc actctattcg ccaaggtctt gctggcatca ttgacctccc aaatgacgca 240
gcttcagaag ttgatatttc acatggttcc gaagatccca gggggcctac ggtcccaggt 300
tcctaccaaa tgaatgggat tatcaatgaa acacataatg ggaggcatgc ttcagtctcc 360
aaggttgttg agttttgtac ggcacttggt ggcaaaacac caattcacag tgtattagtg 420
gccaacaatg gaatggcagc agctaagttc atgcggagtg tccgaacatg ggctaatgat 480
acttttggat cagagaaggc aattcagctg atagctatgg caactccgga ggatctgagg 540
ataaatgcag agcacatcag aattgccgat caatttgtag aggtacctgg tggaacaaac 600
aacaacaact atgcaaatgt ccaactcata gtggagatag cagagagaac aggtgtttct 660
gctgtttggc ctggttgggg tcatgcatct gagaatcctg aacttccaga tgcgctgact 720
gcaaaaggaa ttgtttttct tgggccacca gcatcatcaa tgcatgcatt aggagacaag 780
gttggctcag ctctcattgc tcaagcagct ggagttccaa cacttgcttg gagtggatca 840
catgtggaag ttcctctgga gtgttgcttg gactcaatac ctgatgagat gtatagaaaa 900
gcttgtgtta ctaccacaga ggaagcagtt gcaagttgtc aggtggttgg ttatcctgcc 960
atgattaagg catcttgggg tggtggtggt aaaggaataa ggaaggttca taatgatgat 1020
gaggttagga cattatttaa gcaagttcaa ggcgaagtac ctggttcccc aatatttatc 1080
atgaggctag ctgctcagag tcgacatctt gaagttcagt tgctttgtga tcaatatggc 1140
aacgtagcag cacttcacag tcgagattgc agtgtacaac ggcgacacca aaagataatc 1200
gaggaaggac cagttactgt tgctcctcgt gagactgtga aagagcttga gcaggcagca 1260
cggaggcttg ctaaagctgt gggttatgtt ggtgctgcta ctgttgaata cctttacagc 1320
atggaaactg gtgaatatta ttttctggaa cttaatccac ggctacaggt tgagcatcct 1380
gtcactgagt ggatagctga agtaaatttg cctgcggctc aagttgctgt tggaatgggt 1440
ataccccttt ggcagattcc agagatcagg cgcttctacg gaatgaacca tggaggaggc 1500
tatgaccttt ggaggaaaac agcagctcta gcgactccat ttaactttga tgaagtagat 1560
tctaaatggc caaaaggcca ctgcgtagct gttagaataa ctagcgagga tccagatgat 1620
gggtttaagc ctactggtgg aaaagtaaag gagataagtt tcaagagtaa accaaatgtt 1680
tgggcctatt tctcagtaaa gtctggtgga ggcatccatg aattcgctga ttctcagttc 1740
ggacatgttt ttgcgtatgg aactactaga tcggcagcaa taactaccat ggctcttgca 1800
ctaaaagagg ttcaaattcg tggagaaatt cattcaaacg tagactacac agttgaccta 1860
ttaaatgcct cagattttag agaaaataag attcatactg gttggctgga taccaggata 1920
gccatgcgtg ttcaagctga gaggcctcca tggtatattt cagtcgttgg aggggcttta 1980
tataaaacag taactgccaa cacggccact gtttctgatt atgttggtta tcttaccaag 2040
ggccagattc caccaaagca tatatccctt gtctatacga ctgttgcttt gaatatagat 2100
gggaaaaaat atacaatcga tactgtgagg agtggacatg gtagctacag attgcgaatg 2160
aatggatcaa cggttgacgc aaatgtacaa atattatgtg atggtgggct tttaatgcag 2220
ctggatggaa acagccatgt aatttatgct gaagaagagg ccagtggtac acgacttctt 2280
attgatggaa agacatgcat gttacagaat gaccatgacc catcaaagtt attagctgag 2340
acaccatgca aacttcttcg tttcttggtt gctgatggtg ctcatgttga tgctgatgta 2400
ccatatgcgg aagttgaggt tatgaagatg tgcatgcccc tcttatcacc cgcttctggt 2460
gtcatacatg ttgtaatgtc tgagggccaa gcaatgcagg ctggtgatct tatagctagg 2520
ctggatcttg atgacccttc tgctgttaag agagctgagc cgttcgaaga tacttttcca 2580
caaatgggtc tccctattgc tgcttctggc caagttcaca aattatgtgc tgcaagtctg 2640
aatgcttgtc gaatgatcct tgcggggtat gagcatgata ttgacaaggt tgtgccagag 2700
ttggtatact gcctagacac tccggagctt cctttcctgc agtgggagga gcttatgtct 2760
gttttagcaa ctagacttcc aagaaatctt aaaagtgagt tggagggcaa atatgaggaa 2820
tacaaagtaa aatttgactc tgggataatc aatgatttcc ctgccaatat gctacgagtg 2880
ataattgagg aaaatcttgc atgtggttct gagaaggaga aggctacaaa tgagaggctt 2940
gttgagcctc ttatgagcct actgaagtca tatgagggtg ggagagaaag tcatgctcac 3000
tttgttgtca agtccctttt tgaggagtat ctctatgttg aagaattgtt cagtgatgga 3060
attcagtctg atgtgattga gcgtctgcgc cttcaacata gtaaagacct acagaaggtc 3120
gtagacattg tgttgtccca ccagagtgtt agaaataaaa ctaagctgat actaaaactc 3180
atggagagtc tggtctatcc aaatcctgct gcctacaggg atcaattgat tcgcttttct 3240
tcccttaatc acaaagcgta ttacaagttg gcacttaaag ctagtgaact tcttgaacaa 3300
acaaaactta gtgagctccg tgcaagaata gcaaggagcc tttcagagct ggagatgttt 3360
actgaggaaa gcaagggtct ctccatgcat aagcgagaaa ttgccattaa ggagagcatg 3420
gaagatttag tcactgctcc actgccagtt gaagatgcgc tcatttcttt atttgattgt 3480
agtgatacaa ctgttcaaca gagagtgatt gagacttata tagctcgatt ataccagcct 3540
catcttgtaa aggacagtat caaaatgaaa tggatagaat cgggtgttat tgctttatgg 3600
gaatttcctg aagggcattt tgatgcaaga aatggaggag cggttcttgg tgacaaaaga 3660
tggggtgcca tggtcattgt caagtctctt gaatcacttt caatggccat tagatttgca 3720
ctaaaggaga catcacacta cactagctct gagggcaata tgatgcatat tgctttgttg 3780
ggtgctgata ataagatgca tataattcaa gaaagtggtg atgatgctga cagaatagcc 3840
aaacttccct tgatactaaa ggataatgta accgatctgc atgcctctgg tgtgaaaaca 3900
ataagtttca ttgttcaaag agatgaagca cggatgacaa tgcgtcgtac cttcctttgg 3960
tctgatgaaa agctttctta tgaggaagag ccaattctcc ggcatgtgga acctcctctt 4020
tctgcacttc ttgagttgga caagttgaaa gtgaaaggat acaatgaaat gaagtatacc 4080
ccatcacggg atcgtcaatg gcatatctac acacttagaa atactgaaaa ccccaaaatg 4140
ttgcaccggg tatttttccg aacccttgtc aggcaaccca gtgtatccaa caagttttct 4200
tcgggccaga ttggtgacat ggaagttggg agtgctgaag aacctctgtc atttacatca 4260
accagcatat taagatcttt gatgactgct atagaggaat tggagcttca cgcaattaga 4320
actggccatt cacacatgta tttgcatgta ttgaaagaac aaaagcttct tgatcttgtt 4380
ccagtttcag ggaatacagt tttggatgtt ggtcaagatg aagctactgc atattcactt 4440
ttaaaagaaa tggctatgaa gatacatgaa cttgttggtg caagaatgca ccatctttct 4500
gtatgccaat gggaagtgaa acttaagttg gactgcgatg gtcctgccag tggtacctgg 4560
aggattgtaa caaccaatgt tactagtcac acttgcactg tggatatcta ccgtgagatg 4620
gaagataaag aatcacggaa gttagtatac catcccgcca ctccggcggc tggtcctctg 4680
catggtgtgg cactgaataa tccatatcag cctttgagtg tcattgatct caaacgctgt 4740
tctgctagga ataatagaac tacatactgc tatgattttc cactggcatt tgaaactgca 4800
gtgaggaagt catggtcctc tagtacctct ggtgcttcta aaggtgttga aaatgcccaa 4860
tgttatgtta aagctacaga gttggtattt gcggacaaac atgggtcatg gggcactcct 4920
ttagttcaaa tggaccggcc tgctgggctc aatgacattg gtatggtagc ttggaccttg 4980
aagatgtcca ctcctgaatt tcctagtggt agggagatta ttgttgttgc aaatgatatt 5040
acgttcagag ctggatcatt tggcccaagg gaagatgcat tttttgaagc tgttaccaac 5100
ctagcctgtg agaagaaact tcctcttatt tatttggcag caaattctgg tgctcgaatt 5160
ggcatagcag atgaagtgaa atcttgcttc cgtgttgggt ggtctgatga tggcagccct 5220
gaacgtgggt ttcagtacat ttatctaagc gaagaagact atgctcgtat tggcacttct 5280
gtcatagcac ataagatgca gctagacagt ggtgaaatta ggtgggttat tgattctgtt 5340
gtgggcaagg aagatggact tggtgtggag aatatacatg gaagtgctgc tattgccagt 5400
gcttattcta gggcatataa ggagacattt acacttacat ttgtgactgg aagaactgtt 5460
ggaataggag cttatcttgc tcgacttggc atccggtgca tacagcgtct tgaccagcct 5520
attattctta caggctattc tgcactgaac aagcttcttg ggcgggaagt gtacagctcc 5580
cacatgcagt tgggtggtcc caaaatcatg gcaactaatg gtgttgtcca tcttactgtt 5640
tcagatgacc ttgaaggcgt ttctaatata ttgaggtggc tcagttatgt tcctgcctac 5700
attggtggac cacttccagt aacaacaccg ttggacccac cggacagacc tgttgcatac 5760
attcctgaga actcgtgtga tcctcgagcg gctatccgtg gtgttgatga cagccaaggg 5820
aaatggttag gtggtatgtt tgataaagac agctttgtgg aaacatttga aggttgggct 5880
aagacagtgg ttactggcag agcaaagctt ggtggaattc cagtgggtgt gatagctgtg 5940
gagactcaga ccatgatgca aactatccct gctgaccctg gtcagcttga ttcccgtgag 6000
caatctgttc ctcgtgctgg acaagtgtgg tttccagatt ctgcaaccaa gactgcgcag 6060
gcattgctgg acttcaaccg tgaaggatta cctctgttca tcctcgctaa ctggagaggc 6120
ttctctggtg gacaaagaga tctttttgaa ggaattcttc aggctggctc gactattgtt 6180
gagaacctta ggacatacaa tcagcctgcc tttgtctaca ttcccatggc tgcagagcta 6240
cgaggagggg cttgggttgt ggttgatagc aagataaacc cagaccgcat tgagtgctat 6300
gctgagagga ctgcaaaagg caatgttctg gaaccgcaag ggttaattga gatcaagttc 6360
aggtcagagg aactccagga ttgcatgagt cggcttgacc caacattaat tgatctgaaa 6420
gcaaaactcg aagtagcaaa taaaaatgga agtgctgaca caaaatcgct tcaagaaaat 6480
atagaagctc gaacaaaaca gttgatgcct ctatatactc agattgcgat acggtttgct 6540
gaattgcatg atacatccct cagaatggct gcgaaaggtg tgattaagaa agttgtggac 6600
tgggaagaat cacgatcttt cttctataag agattacgga ggaggatctc tgaggatgtt 6660
cttgcaaaag aaattagagc tgtagcaggt gagcagtttt cccaccaacc agcaatcgag 6720
ctgatcaaga aatggtattc agcttcacat gcagctgaat gggatgatga cgatgctttt 6780
gttgcttgga tggataaccc tgaaaactac aaggattata ttcaatatct taaggctcaa 6840
agagtatccc aatccctctc aagtctttca gattccagct cagatttgca agccctgcca 6900
cagggtcttt ccatgttact agataagatg gatccctcta gaagagctca acttgttgaa 6960
gaaatcagga aggtccttgg ttga 6984

Claims (10)

1. A method for preparing rice resistant to ACCase-inhibiting herbicides, comprising the step of causing a recipient rice to express an esgRNA, an adenine deaminase, a Cas9 nuclease and a nuclear localization signal bpNLS;
the esgRNA targets an OsACC1 gene target sequence; the OsACC1 gene target sequence contains a target site;
the target site is the base A complementary to the base T at position 6295 of sequence 14;
under the guidance of the esgRNA, the adenine deaminase, the Cas9 nuclease and the nuclear localization signal bpNLS can mutate the target site in the recipient rice genome from base A to base G, thereby obtaining rice that is resistant to ACCase-inhibiting herbicides and in which position 2099 of the OsACC1 protein is mutated from cysteine to arginine;
the nuclear localization signal bpNLS is A1) or A2):
A1) a protein whose amino acid sequence is shown in sequence 7;
A2) a protein having the same function, obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 7 of the sequence listing.
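As an illustration only (not part of the claims), the following minimal Python sketch shows why an A-to-G change on the complementary strand at position 6295 of sequence 14 corresponds to a cysteine-to-arginine change at residue 2099. It assumes 1-based coordinates on sequence 14, uses the 20-nucleotide window at positions 6281-6300 copied from the sequence listing, and relies on a deliberately partial codon table; the helper names are hypothetical.

```python
# Window of sequence 14 covering positions 6281-6300 (copied from the listing).
WINDOW_START = 6281
WINDOW = "cagaccgcattgagtgctat"

# Partial codon table: only the two codons needed for this illustration.
CODON_TABLE = {"tgc": "Cys", "cgc": "Arg"}


def codon_start(aa_position: int) -> int:
    """Return the 1-based position of the first base of a codon in a CDS."""
    return (aa_position - 1) * 3 + 1


def codon_at(window: str, window_start: int, nt_position: int) -> str:
    """Return the three bases beginning at nt_position (1-based, CDS coordinates)."""
    offset = nt_position - window_start
    return window[offset:offset + 3]


start = codon_start(2099)
wild_type = codon_at(WINDOW, WINDOW_START, start)

# An A->G edit on the complementary strand reads as T->C on the coding strand,
# so the first base of the codon changes from t to c.
assert wild_type[0] == "t"
edited = "c" + wild_type[1:]

print(start, wild_type, CODON_TABLE[wild_type], "->", edited, CODON_TABLE[edited])
```

The sketch only checks the arithmetic linking nucleotide position 6295 to codon 2099; it says nothing about editing efficiency or about other bases in the editing window.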
2. The method of claim 1, wherein: the number of nuclear localization signals bpNLS is 2;
and/or the gene encoding the nuclear localization signal bpNLS is a1) or a2) or a3):
a1) a cDNA molecule or DNA molecule shown in positions 3796-3846 of sequence 1 in the sequence listing;
a2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in a1) and encoding the nuclear localization signal bpNLS of claim 1;
a3) a cDNA molecule or DNA molecule which hybridizes under stringent conditions with the nucleotide sequence defined in a1) or a2) and which encodes the nuclear localization signal bpNLS of claim 1;
and/or the OsACC1 gene target sequence is sequence 12, and the target site is the base A at position 7 of sequence 12.
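The relationship between the target sequence (sequence 12) and the coding sequence (sequence 14) can be checked with a short, purely illustrative Python sketch. It assumes 1-based coordinates, uses the 30-nucleotide window at positions 6281-6310 of sequence 14 copied from the sequence listing, and introduces hypothetical helper names; it is not part of the claimed method.

```python
# Sequence 12 (20 nt) and a window of sequence 14 (positions 6281-6310),
# both copied from the sequence listing.
TARGET = "catagcactcaatgcggtct"
WINDOW_START = 6281
WINDOW = "cagaccgcattgagtgctatgctgagagga"


def reverse_complement(seq: str) -> str:
    """Reverse-complement a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


# The target lies on the strand complementary to sequence 14, so its
# reverse complement should appear inside the coding-strand window.
rc = reverse_complement(TARGET)
offset = WINDOW.find(rc)
first = WINDOW_START + offset      # coding-strand base paired with the target's last base
last = first + len(TARGET) - 1     # coding-strand base paired with the target's first base


def paired_position(target_pos: int) -> int:
    """Map a 1-based position in the target to the paired position in sequence 14."""
    return last - (target_pos - 1)


pos = paired_position(7)
print(TARGET[6], "at target position 7 pairs with",
      WINDOW[pos - WINDOW_START], "at position", pos)
```

Under these assumptions, the base A at position 7 of sequence 12 pairs with the base T at position 6295 of sequence 14, which agrees with the target site defined in claim 1.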
3. The method according to claim 1 or 2, wherein: the esgRNA has the following structure: the RNA transcribed from the OsACC1 gene target sequence, joined to an esgRNA backbone (target-sequence RNA-esgRNA backbone);
the esgRNA backbone is 1) or 2) or 3):
1) the RNA molecule obtained by replacing T with U in positions 617-702 of sequence 1;
2) an RNA molecule having the same function, obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule shown in 1);
3) an RNA molecule having 75% or more identity with the nucleotide sequence defined in 1) or 2) and having the same function.
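The esgRNA layout described in claim 3 (target-sequence RNA joined to the backbone, with T replaced by U) can be sketched as follows. The two DNA strings are short made-up placeholders, not sequence 12 or positions 617-702 of sequence 1, and the function names are hypothetical.

```python
# Minimal sketch: assemble an esgRNA as "target RNA + backbone RNA".
# Assumption: both inputs are given as DNA (A, C, G, T) and the RNA
# form is obtained by replacing every T with U, as in limb 1).

DNA_ALPHABET = set("ACGT")


def dna_to_rna(dna: str) -> str:
    """Return the RNA spelling of a DNA string (T replaced with U)."""
    dna = dna.upper()
    unexpected = set(dna) - DNA_ALPHABET
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.replace("T", "U")


def build_esgrna(target_dna: str, backbone_dna: str) -> str:
    """Join the target-sequence RNA to the backbone RNA, in that order."""
    return dna_to_rna(target_dna) + dna_to_rna(backbone_dna)


if __name__ == "__main__":
    # Placeholder target with an A at position 7 (1-based), mirroring
    # the position recited for sequence 12; the bases are invented.
    target = "ACGTACATGCATGCATGCAT"
    backbone = "GTTTTAGAGC"  # invented stand-in for the backbone
    assert target[7 - 1] == "A"
    print(build_esgrna(target, backbone))
```

With the real sequences, `target` would be sequence 12 and `backbone` would be the 1-based, inclusive slice 617-702 of sequence 1.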
4. The method according to any one of claims 1 to 3, wherein: the Cas9 nuclease is a Cas9n protein;
the Cas9n protein is C1) or C2):
C1) a protein whose amino acid sequence is shown in sequence 4;
C2) a protein having the same function, obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 4 of the sequence listing.
5. The method according to any one of claims 1 to 4, wherein: the coding gene of the Cas9n protein is c1) or c2) or c3):
c1) a cDNA molecule or DNA molecule shown at positions 5035-9135 of sequence 1 in the sequence listing;
c2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in c1) and encoding the Cas9n protein of claim 4;
c3) a cDNA molecule or DNA molecule which hybridizes under stringent conditions with the nucleotide sequence defined in c1) or c2) and which encodes the Cas9n protein of claim 4.
6. The method according to any one of claims 1 to 5, wherein: the adenine deaminase is one or both of two ecTadA proteins, the first defined by D1) or D2) and the second defined by E1) or E2);
the first ecTadA protein is D1) or D2):
D1) a protein whose amino acid sequence is shown in sequence 2;
D2) a protein having the same function, obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 2 of the sequence listing;
the second ecTadA protein is E1) or E2):
E1) a protein whose amino acid sequence is shown in sequence 3;
E2) a protein having the same function, obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 3 of the sequence listing.
7. The method according to any one of claims 1 to 6, wherein: the coding gene of the ecTadA protein defined by D1) or D2) of claim 6 is d1) or d2) or d3):
d1) a cDNA molecule or DNA molecule shown at positions 3847-4344 of sequence 1 in the sequence listing;
d2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in d1) and encoding said ecTadA protein of claim 6;
d3) a cDNA molecule or DNA molecule which hybridizes under stringent conditions with the nucleotide sequence defined in d1) or d2) and which encodes said ecTadA protein of claim 6;
the coding gene of the ecTadA protein defined by E1) or E2) of claim 6 is e1) or e2) or e3):
e1) a cDNA molecule or DNA molecule shown at positions 4441-4938 of sequence 1 in the sequence listing;
e2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in e1) and encoding said ecTadA protein of claim 6;
e3) a cDNA molecule or DNA molecule which hybridizes under stringent conditions with the nucleotide sequence defined in e1) or e2) and which encodes said ecTadA protein of claim 6.
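The position ranges recited in claims 2, 3, 5 and 7 read as 1-based, inclusive coordinates on sequence 1, so each segment spans end - start + 1 bases. The sketch below tabulates those ranges, checks that the coding segments are whole numbers of codons, and lists the stretches between consecutive segments. The labels are shorthand for this example, and what lies in the intervening stretches is not stated in the claims.

```python
# Minimal sketch: lengths of the sequence 1 segments named in the claims.
# Assumption: ranges are 1-based and inclusive, as is usual for
# sequence-listing coordinates.

SEGMENTS = [
    ("esgRNA backbone, claim 3", 617, 702),
    ("bpNLS coding gene a1), claim 2", 3796, 3846),
    ("ecTadA coding gene d1), claim 7", 3847, 4344),
    ("ecTadA coding gene e1), claim 7", 4441, 4938),
    ("Cas9n coding gene c1), claim 5", 5035, 9135),
]


def segment_length(start: int, end: int) -> int:
    """Number of bases in a 1-based, inclusive range."""
    if start < 1 or end < start:
        raise ValueError("invalid 1-based inclusive range")
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive range out of a Python string."""
    return sequence[start - 1:end]


def gaps_between(segments):
    """Ranges lying between consecutive segments, sorted by start."""
    ordered = sorted(segments, key=lambda item: item[1])
    gaps = []
    for (_, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


if __name__ == "__main__":
    for label, start, end in SEGMENTS:
        length = segment_length(start, end)
        note = ""
        if "coding gene" in label:
            note = f", {length // 3} codons, remainder {length % 3}"
        print(f"{label}: {start}-{end}, {length} nt{note}")
    print("gaps:", gaps_between(SEGMENTS))
```

Under this reading the bpNLS range ends at 3846 and the first ecTadA range begins at 3847, so those two coding segments are directly adjacent in sequence 1.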
8. The method according to any one of claims 1 to 7, wherein: the recipient rice is caused to express the esgRNA, the adenine deaminase, the Cas9 nuclease and the nuclear localization signal bpNLS by introducing into the recipient rice a DNA molecule that transcribes the esgRNA, the coding gene of the ecTadA protein, the coding gene of the Cas9n protein and the coding gene of the nuclear localization signal bpNLS.
9. The method of claim 8, wherein: the DNA molecule that transcribes the esgRNA, the coding gene of the ecTadA protein, the coding gene of the Cas9n protein and the coding gene of the nuclear localization signal bpNLS are introduced into the recipient rice by means of a recombinant expression vector;
the recombinant expression vector comprises an expression cassette consisting, in order, of a promoter, the coding gene of the nuclear localization signal bpNLS, the coding gene of the ecTadA protein, the coding gene of the Cas9n protein, the coding gene of the nuclear localization signal bpNLS, and a terminator.
10. Use of the method of any one of claims 1 to 9 for increasing the efficiency of A·G base substitution at a target site;
or, use of the method of any one of claims 1 to 9 for preparing ACCase-inhibiting-herbicide-resistant rice in which A·G base substitution occurs only at the target site;
the target site is the base A complementary to the base T at position 6295 of sequence 14;
the A·G base substitution is the mutation of base A to base G.
CN201911323192.9A 2019-12-20 2019-12-20 Preparation method of herbicide-resistant rice Active CN110964742B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911323192.9A CN110964742B (en) 2019-12-20 2019-12-20 Preparation method of herbicide-resistant rice

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911323192.9A CN110964742B (en) 2019-12-20 2019-12-20 Preparation method of herbicide-resistant rice

Publications (2)

Publication Number Publication Date
CN110964742A (en) 2020-04-07
CN110964742B (en) 2022-03-01

Family

ID=70035454

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911323192.9A Active CN110964742B (en) 2019-12-20 2019-12-20 Preparation method of herbicide-resistant rice

Country Status (1)

Country Link
CN (1) CN110964742B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110914426A (en) * 2017-03-23 2020-03-24 哈佛大学的校长及成员们 Nucleobase editors comprising nucleic acid programmable DNA binding proteins
CN111801345A (en) * 2017-07-28 2020-10-20 哈佛大学的校长及成员们 Methods and compositions using an evolved base editor for Phage Assisted Continuous Evolution (PACE)
CN111757937A (en) * 2017-10-16 2020-10-09 布罗德研究所股份有限公司 Use of adenosine base editor
WO2019147073A1 (en) * 2018-01-25 2019-08-01 주식회사 툴젠 Method for identifying base editing by using adenosine deaminase
WO2019226953A1 (en) * 2018-05-23 2019-11-28 The Broad Institute, Inc. Base editors and uses thereof
CN110029096A (en) * 2019-05-09 2019-07-19 上海科技大学 A kind of adenine base edit tool and application thereof
CN110407945A (en) * 2019-06-14 2019-11-05 上海科技大学 A kind of adenine base edit tool and application thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHAO LI等: "Expanded base editing in rice and wheat using a Cas9-adenosine deaminase fusion", 《GENOME BIOLOGY》 *
XIAOSHUANG LIU等: "A CRISPR-Cas9-mediated domain-specific base-editing screen enables functional assessment of ACCase variants in rice", 《PLANT BIOTECHNOLOGY JOURNAL》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112410308A (en) * 2020-11-20 2021-02-26 江苏省农业科学院 Application of ACCase mutant gene of rice and protein thereof in herbicide resistance of plants
CN112410308B (en) * 2020-11-20 2023-11-10 江苏省农业科学院 Rice ACCase mutant gene and application of rice ACCase mutant gene protein in herbicide resistance of plants

Also Published As

Publication number Publication date
CN110964742B (en) 2022-03-01

Similar Documents

Publication Publication Date Title
CN111394369B (en) Glyphosate-resistant EPSPS mutant gene, plant genetic transformation screening vector containing glyphosate-resistant EPSPS mutant gene and application of glyphosate-resistant EPSPS mutant gene
CN111378679B (en) Gene expression assembly, cloning vector constructed by same and application of cloning vector
CN109679989A (en) A method of improving base editing system editorial efficiency
CN110951736B (en) Nuclear localization signal F4NLS and application thereof in improving base editing efficiency and expanding editable base range
CN108642061B (en) Ogura CMS sterility restorer gene RfoB, RfoB plant expression vector and application thereof
CN110964742B (en) Preparation method of herbicide-resistant rice
CN110951773B (en) Application of FNLS-sABE system in creating rice herbicide resistant material
CN110982818B (en) Application of nuclear localization signal F4NLS in efficient creation of rice herbicide resistant material
CN113430225A (en) Vector for analyzing expression specificity of plant promoter, preparation method and application thereof
CN112538477B (en) Application of xCas9 gene editing system in genome editing
CN101892259B (en) SiRNA plant gene expression vector and construction method and application thereof
CN113564177B (en) Method for improving crop yield by regulating wheat ARE1 gene through CRISPR/Cas9 technology
CN111961126B (en) Application of TaVQ25 gene in regulation and control of resistance of wheat to powdery mildew and banded sclerotial blight
CN111961684B (en) Method for improving disease resistance of wheat by inhibiting expression of TaVQ5 gene in wheat
CN112280799B (en) Method for site-directed mutagenesis of hevea brasiliensis or dandelion gene by using CRISPR/Cas9 system
CN103173488B (en) Method for quickly screening paddy transgenes by novel fusion tag
CN107988226A (en) A kind of identification and application of the special High-expression promoter of Rice Callus
CN112941100B (en) Genetic transformation method of elytrigia intermedium and special primer thereof
CN110195067B (en) Method for cultivating glufosinate-ammonium-resistant rape
CN113462697B (en) Degradable glyphosate-resistant gene, plant expression vector, cultivation method of degradable glyphosate-resistant transgenic rice and application
KR101773365B1 (en) Gene transfer vector for co-expressing two foreign genes using soybean mosaic virus
CN113355352B (en) Method for modifying virus expression vector based on TuMV-phe virus gene of Apostichopus japonicus
KR102281973B1 (en) Polycistronic Expression System for Plants
CN110484560B (en) Method for producing barren-resistant rice containing HVUL2H20083.2 gene
KR20180111664A (en) Manufacturing method of mutant strain having increased phytoene productivity and mutant strain manufactured same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant