CN110982818A - Use of the nuclear localization signal F4NLS in the efficient creation of herbicide-resistant rice material

Info

Publication number: CN110982818A
Application number: CN201911323608.7A
Authority: CN (China)
Prior art keywords: sequence, leu, protein, lys, ala
Legal status: Granted, active (the legal status is an assumption and is not a legal conclusion)
Other versions: CN110982818B (granted publication)
Inventors: 张成伟, 杨进孝, 王飞鹏, 徐雯
Current Assignee: Beijing Academy of Agriculture and Forestry Sciences
Original Assignee: Beijing Academy of Agriculture and Forestry Sciences

Classifications

    • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/8218: Vectors or expression systems specially adapted for plant cells; Methods for controlling, regulating or enhancing expression of transgenes in plant cells; Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C12N15/8274: Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, for stress resistance, for herbicide resistance
    • C12N9/22: Hydrolases (3) acting on ester bonds (3.1); Ribonucleases RNAses, DNAses
    • C12N9/78: Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C12N9/90: Isomerases (5.)
    • C12Y305/04002: Adenine deaminase (3.5.4.2)
    • C12Y604/01002: Acetyl-CoA carboxylase (6.4.1.2)
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]


Abstract

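The invention discloses the use of the nuclear localization signal F4NLS in the efficient creation of herbicide-resistant rice material. The nuclear localization signal F4NLS consists of nuclear localization signal A and nuclear localization signal B; nuclear localization signal A comprises a 3*Flag2 tag protein and an NLS2 protein, and nuclear localization signal B comprises the NLS2 protein. The invention also discloses a method for preparing rice resistant to ACCase-inhibiting herbicides. The method comprises the step of expressing an esgRNA, an adenine deaminase, a Cas9 nuclease, nuclear localization signal A and nuclear localization signal B in recipient rice. Guided by the esgRNA, the adenine deaminase, the Cas9 nuclease, nuclear localization signal A and nuclear localization signal B can mutate position 7 of the OsACC1 gene target sequence in the recipient rice genome from A to G, thereby yielding rice resistant to ACCase-inhibiting herbicides.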

Description

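Use of the nuclear localization signal F4NLS in the efficient creation of herbicide-resistant rice material
Technical field
The invention belongs to the field of biotechnology and specifically relates to the use of the nuclear localization signal F4NLS in the efficient creation of herbicide-resistant rice material.
Background art
CRISPR-Cas9 technology has become a powerful means of genome editing and is widely applied in many tissues and cells. The CRISPR/Cas9 protein-RNA complex is directed to the target site by a guide RNA and cleaves it to produce a DNA double-strand break (DSB), whereupon the organism triggers its DNA repair machinery to repair the DSB. There are generally two repair pathways: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ usually predominates, so repair yields random indels (insertions or deletions) far more often than precise repair. For precise base substitution, HDR is inefficient and requires a DNA template, so its use for this purpose is severely limited.
In 2017, David Liu's laboratory reported a new type of adenine base editor (ABE). Through seven rounds of evolution, the researchers fused an Escherichia coli tRNA adenosine deaminase (ecTadA) to the 5' end of Cas9 nickase (Cas9n). In cells this directly replaces a single base A with G (guanine) without generating a DSB or invoking HDR repair, greatly increasing the efficiency of A-to-G base editing. The process is as follows: when an sgRNA containing the genome-targeting sequence binds ecTadA&ecTadA&Cas9n, the complex localizes to the target site, and ecTadA catalyzes deamination of an A on the unpaired single-stranded DNA to inosine (I). During DNA repair, I is read as G; Cas9n cleaves a phosphodiester bond in the paired DNA strand, and a cytosine (C) is introduced to pair with I. Subsequent repair finally produces a C-G pair, completing the A-to-G conversion.
Single-base editing systems are widely used to create single-base mutant crop material. Acetyl-coenzyme A carboxylase (ACCase) is a key enzyme regulating fatty acid biosynthesis. Since ACCase-inhibiting herbicides were first developed in 1975, the number of products has grown steadily, and they are now among the most important herbicides worldwide. ACCase-inhibiting herbicides fall into three classes: the earliest, APP (aryloxyphenoxypropionates) and CHD (cyclohexanediones), and the later DEN (phenylpyrazoline). With the use of ACCase-inhibiting herbicides, many weeds have developed resistance. Substitutions at eight amino acid positions in ACCase govern weed resistance to the different classes of ACCase-inhibiting herbicides: glutamine 1756 → glutamic acid; isoleucine 1781 → leucine or valine; tryptophan 1999 → cysteine or serine; tryptophan 2027 → cysteine; isoleucine 2041 → asparagine or valine; asparagine 2078 → glycine; cysteine 2088 → arginine; and glycine 2096 → alanine. Of these, mutations at positions 1781, 2078 or 2088 confer resistance to all ACCase-inhibiting herbicides. The cysteine (C) at position 2099 of rice OsACC1 corresponds exactly to the cysteine at position 2088 in the weeds above; mutating it to arginine (R), i.e. C2099R, makes rice resistant to ACCase-inhibiting herbicides. C2099R can be obtained by single-base editing, but most base editing systems reported so far for creating ACCase-inhibiting-herbicide-resistant rice through C2099R have low editing efficiency, and the resulting mutants do not carry the mutation at position 2099 of OsACC1 alone: some also carry mutations at other sites, which makes molecular identification of the mutants costly.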
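Summary of the invention
The object of the invention is to provide a method for preparing rice resistant to ACCase-inhibiting herbicides.
The method provided by the invention for preparing rice resistant to ACCase-inhibiting herbicides comprises the step of expressing an esgRNA, an adenine deaminase, a Cas9 nuclease, nuclear localization signal A and nuclear localization signal B in recipient rice;
the esgRNA targets an OsACC1 gene target sequence; the OsACC1 gene target sequence contains a target site;
the target site is the base A complementary to the base T at position 6295 of SEQ ID NO: 14;
guided by the esgRNA, the adenine deaminase, the Cas9 nuclease, nuclear localization signal A and nuclear localization signal B can mutate the target site in the recipient rice genome from base A to base G, thereby yielding rice resistant to ACCase-inhibiting herbicides in which position 2099 of the OsACC1 protein is mutated from cysteine to arginine;
nuclear localization signal A comprises a 3*Flag2 tag protein and an NLS2 protein;
nuclear localization signal B comprises the NLS2 protein;
the amino acid sequence of the 3*Flag2 tag protein is positions 1-22 of SEQ ID NO: 10;
the amino acid sequence of the NLS2 protein is SEQ ID NO: 9.
In the above method, nuclear localization signal A may contain one, two or more copies of the 3*Flag2 tag protein and one, two or more copies of the NLS2 protein. In the specific embodiments of the invention, it contains one copy of the 3*Flag2 tag protein and four copies of the NLS2 protein.
Nuclear localization signal B may contain one, two or more copies of the NLS2 protein. In the specific embodiments of the invention, it contains four copies of the NLS2 protein.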
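Further, nuclear localization signal A is A1) or A2):
A1) a protein whose amino acid sequence is SEQ ID NO: 10;
A2) a protein with the same function derived from the amino acid sequence of SEQ ID NO: 10 in the sequence listing by substitution and/or deletion and/or addition of one or several amino acid residues;
nuclear localization signal B is B1) or B2):
B1) a protein whose amino acid sequence is SEQ ID NO: 11;
B2) a protein with the same function derived from the amino acid sequence of SEQ ID NO: 11 in the sequence listing by substitution and/or deletion and/or addition of one or several amino acid residues;
Still further, the coding gene sequence of nuclear localization signal A is a1), a2) or a3):
a1) the cDNA molecule or DNA molecule shown at positions 1-183 of SEQ ID NO: 6 in the sequence listing;
a2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in a1) and encodes nuclear localization signal A;
a3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in a1) or a2) and encodes nuclear localization signal A;
the coding gene sequence of nuclear localization signal B is b1), b2) or b3):
b1) the cDNA molecule or DNA molecule shown at positions 73-183 of SEQ ID NO: 6 in the sequence listing;
b2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in b1) and encodes nuclear localization signal B;
b3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in b1) or b2) and encodes nuclear localization signal B.
In the above method, the OsACC1 gene target sequence is SEQ ID NO: 12, and the target site is the base A at position 7 of SEQ ID NO: 12.
In the above method, the esgRNA has the following structure: RNA transcribed from the OsACC1 gene target sequence, followed by the esgRNA scaffold;
the esgRNA scaffold is 1), 2) or 3):
1) the RNA molecule obtained by replacing T with U in positions 617-702 of SEQ ID NO: 1;
2) an RNA molecule with the same function derived from the RNA molecule of 1) by substitution and/or deletion and/or addition of one or several nucleotides;
3) an RNA molecule that has 75% or more identity with the nucleotide sequence defined in 1) or 2) and has the same function.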
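The esgRNA layout described above (target-derived RNA followed by the scaffold, with T replaced by U) can be sketched as follows. This is a minimal illustration, not the authors' tooling: the two DNA strings were read off positions 1946-1965 (OsACC-T target) and 617-702 (scaffold) of SEQ ID NO: 1 as printed in the sequence listing, and the function names `to_rna` and `build_esgrna` are hypothetical.

```python
# Strings copied from SEQ ID NO: 1 as printed in the sequence listing
# (assumption: positions 1946-1965 and 617-702 were read off correctly).
TARGET_DNA = "catagcactcaatgcggtct"
SCAFFOLD_DNA = (
    "gtttcagagctatgctggaaacagcatagcaagttgaaataagg"
    "ctagtccgttatcaacttgaaaaagtggcaccgagtcggtgc"
)


def to_rna(dna: str) -> str:
    """Return the RNA spelling of a DNA string (T replaced by U)."""
    return dna.upper().replace("T", "U")


def build_esgrna(target_dna: str, scaffold_dna: str) -> str:
    """Concatenate target-derived RNA and scaffold RNA, 5' to 3'."""
    return to_rna(target_dna) + to_rna(scaffold_dna)


esgrna = build_esgrna(TARGET_DNA, SCAFFOLD_DNA)
print(len(SCAFFOLD_DNA), len(esgrna))
```

The scaffold length of 86 nt agrees with the coordinate span 617-702 (702 - 617 + 1 = 86).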
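In the above method, the Cas9 nuclease is a Cas9n protein;
the Cas9n protein is C1) or C2):
C1) a protein whose amino acid sequence is SEQ ID NO: 4;
C2) a protein with the same function derived from the amino acid sequence of SEQ ID NO: 4 in the sequence listing by substitution and/or deletion and/or addition of one or several amino acid residues.
The adenine deaminase is an ecTadA protein and/or an ecTadA* protein; specifically, an ecTadA protein and an ecTadA* protein;
the ecTadA protein is D1) or D2):
D1) a protein whose amino acid sequence is SEQ ID NO: 2;
D2) a protein with the same function derived from the amino acid sequence of SEQ ID NO: 2 in the sequence listing by substitution and/or deletion and/or addition of one or several amino acid residues;
the ecTadA* protein is E1) or E2):
E1) a protein whose amino acid sequence is SEQ ID NO: 3;
E2) a protein with the same function derived from the amino acid sequence of SEQ ID NO: 3 in the sequence listing by substitution and/or deletion and/or addition of one or several amino acid residues.
The proteins of A2), B2), C2), D2) or E2) above are proteins that have 75% or more identity with the amino acid sequence of the protein of SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 4, SEQ ID NO: 2 or SEQ ID NO: 3, respectively, and have the same function. Having 75% or more identity means having 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity.
The proteins of A2), B2), C2), D2) or E2) above can be synthesized artificially, or obtained by first synthesizing the coding gene and then expressing it biologically.
The coding genes of the proteins of A2), B2), C2), D2) or E2) above can be obtained from the DNA sequence shown at positions 1-183 of SEQ ID NO: 6 (encoding the protein of SEQ ID NO: 10), positions 73-183 of SEQ ID NO: 6 (encoding the protein of SEQ ID NO: 11), positions 5035-9135 of SEQ ID NO: 1 (encoding the protein of SEQ ID NO: 4), positions 3847-4344 of SEQ ID NO: 1 (encoding the protein of SEQ ID NO: 2) or positions 4441-4938 of SEQ ID NO: 1 (encoding the protein of SEQ ID NO: 3) by deleting the codons of one or several amino acid residues, and/or introducing missense mutations at one or several base pairs, and/or attaching the coding sequence of a tag shown in the table above to its 5′ end and/or 3′ end.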
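Further, the coding gene of the Cas9n protein is c1), c2) or c3):
c1) the cDNA molecule or DNA molecule shown at positions 5035-9135 of SEQ ID NO: 1 in the sequence listing;
c2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in c1) and encodes the Cas9n;
c3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in c1) or c2) and encodes the Cas9n.
The coding gene of the ecTadA protein is d1), d2) or d3):
d1) the cDNA molecule or DNA molecule shown at positions 3847-4344 of SEQ ID NO: 1 in the sequence listing;
d2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in d1) and encodes the ecTadA;
d3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in d1) or d2) and encodes the ecTadA;
the coding gene of the ecTadA* protein is e1), e2) or e3):
e1) the cDNA molecule or DNA molecule shown at positions 4441-4938 of SEQ ID NO: 1 in the sequence listing;
e2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in e1) and encodes the ecTadA*;
e3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in e1) or e2) and encodes the ecTadA*.
A person of ordinary skill in the art can readily mutate the nucleotide sequences of the invention encoding nuclear localization signal A, nuclear localization signal B, the Cas9n, the ecTadA and the ecTadA* using known methods such as directed evolution and point mutation. Artificially modified nucleotides that have 75% or more identity with those nucleotide sequences are derived from, and equivalent to, the sequences of the invention, provided that they encode nuclear localization signal A, nuclear localization signal B, the Cas9n, the ecTadA or the ecTadA* and have the same function.
The term "identity" as used here refers to sequence similarity to a native nucleic acid sequence. "Identity" includes nucleotide sequences that have 75% or more, or 85% or more, or 90% or more, or 95% or more identity with the nucleotide sequences of the invention encoding the proteins consisting of the amino acid sequences of SEQ ID NO: 10, 11, 4, 2 and 3. Identity can be evaluated by eye or with computer software. With computer software, the identity between two or more sequences can be expressed as a percentage (%), which can be used to evaluate the identity between related sequences.
The stringent conditions are: hybridization and two membrane washes of 5 min each at 68°C in a solution of 2×SSC, 0.1% SDS, followed by hybridization and two membrane washes of 15 min each at 68°C in a solution of 0.5×SSC, 0.1% SDS; or hybridization and membrane washing at 65°C in a solution of 0.1×SSPE (or 0.1×SSC), 0.1% SDS.
The 75% or more identity mentioned above may be 80%, 85%, 90% or 95% or more identity.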
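The text leaves the identity calculation to "computer software" without naming a tool or a formula. As a rough sketch only, the percentage can be computed over two sequences that have already been aligned to equal length, counting identical columns over all aligned columns. The function names, the gap character `-` and the choice of denominator are assumptions; alignment programs differ on the denominator, so results near a threshold should be checked with the tool actually used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True if identity is at least the threshold (75% is the floor in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# NLS peptide PKKKRKV against a hypothetical single-substitution variant.
print(round(percent_identity("PKKKRKV", "PKKKRKA"), 1))
```

For the seven-residue peptide, one substitution leaves 6 of 7 positions identical, which stays above the 75% floor.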
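In the above method, the esgRNA, the adenine deaminase, the Cas9 nuclease, nuclear localization signal A and nuclear localization signal B are expressed in the recipient rice by introducing into it the DNA molecule transcribing the esgRNA, the coding gene of the ecTadA protein, the coding gene of the ecTadA* protein, the coding gene of the Cas9n protein, the coding gene of nuclear localization signal A and the coding gene of nuclear localization signal B.
Further, the DNA molecule transcribing the esgRNA, the coding gene of the ecTadA protein, the coding gene of the ecTadA* protein, the coding gene of the Cas9n protein, the coding gene of nuclear localization signal A and the coding gene of nuclear localization signal B are introduced into the recipient rice by means of a recombinant expression vector.
Still further, the recombinant expression vector comprises an expression cassette consisting, in order, of a promoter, the coding gene of nuclear localization signal A, the coding gene of the ecTadA protein, the coding gene of the ecTadA* protein, the coding gene of the Cas9n protein, the coding gene of nuclear localization signal B and a terminator.
In the specific embodiments of the invention, the recombinant expression vector is the F4NLS-sABE-1 recombinant expression vector.
The F4NLS-sABE-1 recombinant expression vector is the sequence obtained from the sABE-1 recombinant expression vector sequence by replacing the bpNLS nucleotide sequence at positions 3796-3846 of SEQ ID NO: 1 with SEQ ID NO: 6, replacing the bpNLS nucleotide sequence at positions 9136-9186 with the nucleotide sequence at positions 55-201 of SEQ ID NO: 6, and leaving all other sequences unchanged.
In the above method, guided by the esgRNA, the adenine deaminase, the Cas9 nuclease, nuclear localization signal A and nuclear localization signal B can mutate the base A at position 690 (5'-3' direction) of the non-coding strand of the gene encoding the OsACC1 protein (the reverse complement of SEQ ID NO: 14) in the recipient rice genome to base G. This in turn changes the base T at position 6295 (5'-3' direction) of the coding strand (SEQ ID NO: 14) to base C, so that the codon for the cysteine (C) at position 2099 of the OsACC1 protein changes from "TGC" to "CGC" and position 2099 of the OsACC1 protein is mutated from cysteine (C) to arginine (R). A rice mutant carrying this mutation is the ACCase-inhibiting-herbicide-resistant rice of the invention, which is resistant to ACCase-inhibiting herbicides. The ACCase-inhibiting herbicide is specifically Gallant (盖草能).
The ACCase-inhibiting-herbicide-resistant rice of the invention is also referred to as the C2099R mutant or the A7>G7 mutant. The C2099R mutant is a rice mutant in which position 2099 of the OsACC1 protein (SEQ ID NO: 13) is mutated from cysteine (C) to arginine (R). The A7>G7 mutant is a rice mutant in which position 7 of the OsACC1 gene target sequence (SEQ ID NO: 12) is mutated from base A to base G.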
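The relationship between the A7>G7 edit on the target (non-coding) strand and the TGC → CGC codon change on the coding strand can be traced with a short sketch. The protospacer below was read off positions 1946-1965 of SEQ ID NO: 1 as printed in the sequence listing and is assumed here to equal SEQ ID NO: 12; the function names are hypothetical, and the codon table holds only the two codons the text discusses.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Assumption: OsACC-T protospacer as read from SEQ ID NO: 1, positions 1946-1965.
OSACC_T = "CATAGCACTCAATGCGGTCT"

# Only the two codons named in the text.
CODON_TO_AA = {"TGC": "Cys", "CGC": "Arg"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def edit_a_to_g(protospacer: str, position: int) -> str:
    """Replace the A at a 1-based protospacer position with G."""
    index = position - 1
    if protospacer[index] != "A":
        raise ValueError(f"position {position} is not A")
    return protospacer[:index] + "G" + protospacer[index + 1:]


def codon_at_target(protospacer: str, edited_position: int = 7) -> str:
    """Coding-strand codon whose first base pairs with the edited position.

    Coding position 6295 equals 3 * 2098 + 1, so the complementary base is
    the first base of codon 2099.
    """
    coding = reverse_complement(protospacer)
    start = len(protospacer) - edited_position  # 0-based index on coding strand
    return coding[start:start + 3]


wild_type = codon_at_target(OSACC_T)
mutant = codon_at_target(edit_a_to_g(OSACC_T, 7))
print(wild_type, CODON_TO_AA[wild_type], "->", mutant, CODON_TO_AA[mutant])
# prints: TGC Cys -> CGC Arg
```

Reading the codon from the reverse complement mirrors the text: the deaminase acts on the A of the non-coding strand, and the coding strand then carries C in place of T.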
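The use of the above method to increase the A·G base substitution efficiency at the target site, or to prepare ACCase-inhibiting-herbicide-resistant rice in which only the target site undergoes A·G base substitution, also falls within the scope of protection of the invention; the target site is position 7 of the OsACC1 gene target sequence.
In the above use, the A·G base substitution is a mutation from base A to base G.
In the above method or use, the amino acid sequence of the OsACC1 protein is SEQ ID NO: 13, and the coding gene sequence of the OsACC1 protein is SEQ ID NO: 14 (LOC4338322).
In practice, the OsACC1 gene may have different alleles in different rice cultivars, so that the amino acid sequence of the OsACC1 protein may differ from that of the invention by substitution and/or deletion and/or addition of one, two or more amino acids. In addition, different practitioners annotate the exons of the OsACC1 gene differently, so that the length of the OsACC1 amino acid sequence may differ from that of the invention (for example, position 2099 of the OsACC1 protein of the invention is position 2186 of the OsACC1 protein encoded by the coding gene sequence LOC_Os05g22940; the two are the same site, but the amino acid sequences differ in length). Consequently, the OsACC1 nucleotide position corresponding to the target site, and the position of the resulting amino acid mutation in the OsACC1 protein, may vary between rice cultivars, i.e. they need not be position 6295 of SEQ ID NO: 14 or position 2099 of SEQ ID NO: 13. Regardless of such changes in the target site and/or in the amino acid mutation site, any ACCase-inhibiting-herbicide-resistant rice obtained with the method of the invention falls within the scope of protection of the invention.
In the above method or use, the rice cultivar may specifically be Nipponbare.
Beneficial effects of the invention: the method of the invention increases the A·G base substitution efficiency at the target site (position 7 of SEQ ID NO: 12), and the resulting mutants are mutated at the target site only, which greatly improves the efficiency of creating C2099R mutants.
The invention provides a method for preparing rice resistant to ACCase-inhibiting herbicides, comprising the step of expressing an esgRNA, an adenine deaminase, a Cas9 nuclease, nuclear localization signal A and nuclear localization signal B in recipient rice. Guided by the esgRNA, the adenine deaminase, the Cas9 nuclease, nuclear localization signal A and nuclear localization signal B can mutate position 7 of the OsACC1 gene target sequence in the recipient rice genome from A to G, thereby yielding rice that is resistant to ACCase-inhibiting herbicides.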
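Brief description of the drawings
Figure 1 is a schematic diagram of the recombinant expression vector structures of the four adenine base editors.
Figure 2 shows the change at the target site within the target sequence during creation of the C2099R mutant.
Figure 3 shows the sequencing results for the target sequence (chromatogram of the reverse complement of the target sequence) in positive T0 rice seedlings carrying the A7>G7 mutation.
Figure 4 shows the effect of 1680 µg/L Gallant on the growth of WT seedlings and of positive T0 rice seedlings carrying the A7>G7 mutation.
Figure 5 shows heading and seed setting of positive T0 rice seedlings carrying the A7>G7 mutation.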
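Detailed description of the embodiments
The invention is described in further detail below with reference to specific embodiments. The examples are given only to illustrate the invention, not to limit its scope. Unless otherwise specified, the experimental methods in the following examples are conventional methods, and the materials, reagents and instruments used are commercially available. Unless otherwise specified, the first position of each nucleotide sequence in the sequence listing is the 5′-terminal nucleotide of the corresponding DNA/RNA and the last position is the 3′-terminal nucleotide.
Primer pair T consists of primer T-F: 5’-GCATTGCTGGACTTCAACC-3’ and primer T-R: 5’-CAAACCGTATCGCAATCTGAG-3’, and is used to amplify the target OsACC-T.
A·G base substitution efficiency = number of positive T0 seedlings with A·G base substitution / total number of positive T0 seedlings analyzed × 100%.
Nipponbare rice: reference: 梁卫红, 王高华, 杜京尧, et al. Effects of sodium nitroprusside and its photolysis products on seedling growth and expression of five hormone marker genes in Nipponbare rice [J]. Journal of Henan Normal University (Natural Science Edition), 2017(2): 48-52. Available to the public from the Beijing Academy of Agriculture and Forestry Sciences.
Recovery medium: N6 solid medium containing 200 mg/L Timentin.
Selection medium: N6 solid medium containing 50 mg/L hygromycin.
Differentiation medium: N6 solid medium containing 2 mg/L KT, 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
Rooting medium: N6 solid medium containing 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
In the following examples, the amino acid sequence of the OsACC1 protein is SEQ ID NO: 13, and the coding gene sequence of the OsACC1 protein is SEQ ID NO: 14.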
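The efficiency formula above is a simple ratio; a minimal sketch follows. The function name and the seedling counts in the usage line are hypothetical (the per-vector counts are in a table that is not reproduced in the text).

```python
def substitution_efficiency(edited_seedlings: int, total_positive_seedlings: int) -> float:
    """A·G base substitution efficiency in percent.

    edited_seedlings: positive T0 seedlings showing the A-to-G substitution.
    total_positive_seedlings: all positive T0 seedlings analyzed.
    """
    if total_positive_seedlings <= 0:
        raise ValueError("at least one positive T0 seedling must be analyzed")
    if not 0 <= edited_seedlings <= total_positive_seedlings:
        raise ValueError("edited count must lie between 0 and the total")
    return edited_seedlings / total_positive_seedlings * 100.0


# Hypothetical counts, for illustration only.
print(substitution_efficiency(15, 20))
```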
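Example 1. A method for preparing rice resistant to ACCase-inhibiting herbicides
I. Design and construction of recombinant expression vectors
1. Design of the recombinant expression vectors
The invention uses four types of adenine base editor to prepare rice resistant to ACCase-inhibiting herbicides. Schematic diagrams of the recombinant expression vector structures of the four types are shown in Figure 1. The details are as follows:
sABE system: in the ecTadA&ecTadA*&Cas9n base editing system, a bpNLS nuclear localization signal is added before the ecTadA element and a bpNLS nuclear localization signal is added after the Cas9n element. The amino acid sequence of the bpNLS nuclear localization signal is KRTADGSEFEPKKKRKV (SEQ ID NO: 7). This design is denoted bpNLS-bpNLS.
FNLS-sABE system: in the ecTadA&ecTadA*&Cas9n base editing system, a 3*Flag1&1*NLS1 nuclear localization signal is added before the ecTadA element and a 1*NLS1 nuclear localization signal is added after the Cas9n element. The amino acid sequence of 3*Flag1&1*NLS1 is as follows:
Figure BDA0002327792220000061
(SEQ ID NO: 8), in which the amino acid sequence of the 3*Flag1 tag protein is underlined and the amino acid sequence of the NLS1 protein is marked with a wavy line. The 1*NLS1 nuclear localization signal comprises one NLS1 protein; the amino acid sequence of 1*NLS1 is PKKKRKV (SEQ ID NO: 9). This design is denoted 3*Flag1&1*NLS1-1*NLS1 (FNLS).
F4NLS-sABE system: in the ecTadA&ecTadA*&Cas9n base editing system, a 3*Flag2&4*NLS2 nuclear localization signal is added before the ecTadA element and a 4*NLS2 nuclear localization signal is added after the Cas9n element. The 3*Flag2&4*NLS2 nuclear localization signal comprises, in order, one 3*Flag2 tag protein and four NLS2 proteins; its amino acid sequence is as follows:
Figure BDA0002327792220000062
(SEQ ID NO: 10), in which the amino acid sequence of the 3*Flag2 tag protein is underlined and the amino acid sequence of the NLS2 protein is marked with a wavy line. The 4*NLS2 nuclear localization signal comprises four NLS2 proteins; its amino acid sequence is as follows:
Figure BDA0002327792220000071
(SEQ ID NO: 11). This design is denoted 3*Flag2&4*NLS2-4*NLS2 (F4NLS).
4NLS-sABE system: in the ecTadA&ecTadA*&Cas9n base editing system, a 4*NLS2 nuclear localization signal is added after the Cas9n element. This design is denoted 4*NLS2 (4NLS).
The recombinant expression vectors of all four adenine base editors include an esgRNA that targets the OsACC1 gene target sequence of SEQ ID NO: 12; position 7 of this target sequence is the target site. The four adenine base editors can mutate the base A on the non-coding strand of the gene encoding the OsACC1 protein in the recipient rice genome (the base complementary to position 6295 of the coding strand) to base G. This in turn changes the base T at position 6295 of the coding strand (SEQ ID NO: 14) to base C, so that the codon for the cysteine (C) at position 2099 of the OsACC1 protein changes from "TGC" to "CGC" (Figure 2) and position 2099 of the OsACC1 protein is mutated from cysteine (C) to arginine (R). A rice mutant carrying this mutation is the ACCase-inhibiting-herbicide-resistant rice of the invention; it is denoted the C2099R mutant and is the A7>G7 mutant referred to below. The C2099R mutation (A7>G7 mutation) makes rice resistant to ACCase-inhibiting herbicides.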
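2. Construction of the recombinant expression vectors
The recombinant expression vectors for the four adenine base editors of step 1 were synthesized artificially; each vector is a circular plasmid:
a recombinant expression vector containing bpNLS-bpNLS: sABE-1;
a recombinant expression vector containing 3*Flag1&1*NLS1-1*NLS1 (FNLS): FNLS-sABE-1;
a recombinant expression vector containing 3*Flag2&4*NLS2-4*NLS2 (F4NLS): F4NLS-sABE-1;
a recombinant expression vector containing 4*NLS2 (4NLS): 4NLS-sABE-1.
The nucleotide sequence of the sABE-1 recombinant expression vector is SEQ ID NO: 1 in the sequence listing. In SEQ ID NO: 1, positions 131-596 are the nucleotide sequence of the OsU6a promoter, positions 710-1090 that of the OsU3 promoter, and positions 1204-1945 that of the OsU6c promoter; positions 597-702, 1091-1196 and 1946-2051 are all esgRNA nucleotide sequences; positions 597-616, 1091-1110 and 1946-1965 are the T1, T2 and OsACC-T target sequences, respectively; and positions 617-702, 1111-1196 and 1966-2051 are all esgRNA scaffold nucleotide sequences. Positions 2070-3783 are the nucleotide sequence of the OsUbq3 promoter; positions 3796-3846 are a bpNLS nucleotide sequence; positions 3847-4344 are the coding sequence of the ecTadA protein (without a stop codon), encoding the ecTadA protein of SEQ ID NO: 2; positions 4441-4938 are the coding sequence of the ecTadA* protein (without a stop codon), encoding the ecTadA* protein of SEQ ID NO: 3; positions 5035-9135 are the coding sequence of the Cas9n protein (without a stop codon), encoding the Cas9n protein of SEQ ID NO: 4; positions 9136-9186 are a bpNLS nucleotide sequence; positions 9529-9781 are the Nos terminator sequence; positions 9822-11814 are the nucleotide sequence of the ZmUbi1 promoter; positions 11821-12846 are the coding sequence of hygromycin phosphotransferase; and positions 12873-13088 are the nucleotide sequence of the CaMV35S polyA. The three targets in the sABE-1 recombinant expression vector are T1, T2 and OsACC-T; their sequences are given in Table 1.
The FNLS-sABE-1 recombinant expression vector is the sequence obtained from the sABE-1 recombinant expression vector sequence by replacing the bpNLS nucleotide sequence at positions 3796-3846 of SEQ ID NO: 1 with SEQ ID NO: 5, replacing the bpNLS nucleotide sequence at positions 9136-9186 with the nucleotide sequence at positions 73-93 of SEQ ID NO: 5, and leaving all other sequences unchanged. In SEQ ID NO: 5, positions 1-66 are the 3*Flag1 nucleotide sequence and positions 73-93 are the NLS1 nucleotide sequence; SEQ ID NO: 5 contains one NLS1 nucleotide sequence in total.
The F4NLS-sABE-1 recombinant expression vector is the sequence obtained from the sABE-1 recombinant expression vector sequence by replacing the bpNLS nucleotide sequence at positions 3796-3846 of SEQ ID NO: 1 with SEQ ID NO: 6, replacing the bpNLS nucleotide sequence at positions 9136-9186 with the nucleotide sequence at positions 55-201 of SEQ ID NO: 6, and leaving all other sequences unchanged. In SEQ ID NO: 6, positions 1-66 are the 3*Flag2 nucleotide sequence, and positions 73-93, 103-123, 133-153 and 163-183 are all NLS2 nucleotide sequences; SEQ ID NO: 6 contains four NLS2 nucleotide sequences in total.
The 4NLS-sABE-1 recombinant expression vector is the sequence obtained from the sABE-1 recombinant expression vector sequence by deleting the bpNLS nucleotide sequence at positions 3796-3846 of SEQ ID NO: 1, replacing the bpNLS nucleotide sequence at positions 9136-9186 with the nucleotide sequence at positions 55-201 of SEQ ID NO: 6, and leaving all other sequences unchanged.
The target nucleotide sequence of each vector and the corresponding PAM sequence are shown in Table 1.
Table 1
Figure BDA0002327792220000081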
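The coordinate map of SEQ ID NO: 1 given above lends itself to a quick consistency check: coding regions should span a multiple of three nucleotides, and the bpNLS stretch should translate to the peptide stated for SEQ ID NO: 7. The sketch below assumes 1-based inclusive coordinates as used in the text. The bpNLS DNA string was read off positions 3796-3846 of SEQ ID NO: 1 as printed in the sequence listing, the feature labels and function names are hypothetical, and the codon table is the standard genetic code.

```python
# 1-based inclusive coordinates on SEQ ID NO: 1, as stated in the text.
FEATURES = {
    "OsU6a promoter": (131, 596),
    "T1 target": (597, 616),
    "esgRNA scaffold 1": (617, 702),
    "OsACC-T target": (1946, 1965),
    "bpNLS (before ecTadA)": (3796, 3846),
    "ecTadA CDS": (3847, 4344),
    "ecTadA* CDS": (4441, 4938),
    "Cas9n CDS": (5035, 9135),
    "bpNLS (after Cas9n)": (9136, 9186),
}

# Standard genetic code, bases ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# Assumption: copied from positions 3796-3846 of the printed listing.
BPNLS_DNA = "aagaggaccgccgacggcagcgagttcgagccgaagaagaagaggaaggtg"


def feature_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive interval out of a sequence string."""
    return sequence[start - 1:end]


def translate(dna: str) -> str:
    """Translate an in-frame DNA string with the standard genetic code."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


for name, (start, end) in FEATURES.items():
    print(f"{name}: {feature_length(start, end)} nt")
print(translate(BPNLS_DNA))
```

The three coding regions come out at 498, 498 and 4101 nt (166, 166 and 1367 codons), and the 51-nt bpNLS stretch corresponds to 17 residues, the length of the peptide given for SEQ ID NO: 7.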
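II. Obtaining positive T0 rice seedlings
The sABE-1, FNLS-sABE-1, F4NLS-sABE-1 and 4NLS-sABE-1 vectors obtained in step I were each processed according to steps 1-9 below:
1. Introduce the vector into Agrobacterium EHA105 (product of Shanghai Weidi Biotechnology Co., Ltd., CAT#: AC1010) to obtain recombinant Agrobacterium.
2. Culture the recombinant Agrobacterium in medium (YEP medium containing 50 μg/ml kanamycin and 25 μg/ml rifampicin) at 28°C with shaking at 150 rpm to an OD600 of 1.0-2.0; centrifuge at 10000 rpm for 1 min at room temperature; resuspend the cells in infection solution (N6 liquid medium in which the sugar is replaced by glucose and sucrose, at 10 g/L and 20 g/L respectively) and dilute to an OD600 of 0.2 to obtain the Agrobacterium infection solution.
3. Dehusk mature seeds of rice cultivar Nipponbare, place them in a 100 mL Erlenmeyer flask, soak in 70% (v/v) aqueous ethanol for 30 sec, then sterilize in 25% (v/v) aqueous sodium hypochlorite with shaking at 120 rpm for 30 min, rinse 3 times with sterile water and blot dry with filter paper. Place the seeds embryo-side down on N6 solid medium and culture in the dark at 28°C for 4-6 weeks to obtain rice callus.
4. After step 3, soak the rice callus for 10 min in Agrobacterium infection solution A (obtained by adding acetosyringone to the Agrobacterium infection solution at a volume ratio of 25 μl acetosyringone to 50 ml infection solution), then place it on a Petri dish lined with two layers of sterile filter paper (containing about 200 ml of infection solution without Agrobacterium) and culture in the dark at 21°C for 1 day.
5. Transfer the rice callus from step 4 to recovery medium and culture in the dark at 25-28°C for 3 days.
6. Transfer the rice callus from step 5 to selection medium and culture in the dark at 28°C for 2 weeks.
7. Transfer the rice callus from step 6 to fresh selection medium and culture in the dark at 28°C for 2 weeks to obtain resistant rice callus.
8. Transfer the resistant rice callus from step 7 to differentiation medium and culture under light at 25°C for about 1 month; move the differentiated plantlets to rooting medium and culture under light at 25°C for 2 weeks to obtain T0 rice seedlings.
9. Extract genomic DNA from the T0 rice seedlings and use it as the template for PCR amplification with the primer pair consisting of primer F (5’-CCGAGGAGACTATCACCCCT-3’) and primer R (5’-CGACCCATAACCTTGACAAGC-3’). Run the PCR product on an agarose gel and judge as follows: if the PCR product contains a DNA fragment of about 853 bp, the T0 seedling is a positive T0 rice seedling; if it does not, the T0 seedling is not a positive T0 rice seedling.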
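The expected band size in step 9 can be reproduced with a small in-silico PCR sketch. On SEQ ID NO: 1 as printed in the sequence listing, primer F matches from position 6437 and the reverse complement of primer R ends at position 7289, which gives 7289 - 6437 + 1 = 853 bp. Because the full vector sequence is too long to embed, the code below runs on a toy template built from the two primers; the function names and the ±30 bp gel tolerance are assumptions, not part of the protocol.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER_F = "CCGAGGAGACTATCACCCCT"
PRIMER_R = "CGACCCATAACCTTGACAAGC"


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the product bounded by exact primer matches, or None.

    Only the top strand is searched and only perfect matches count.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start)
    if site == -1:
        return None
    return site + len(reverse_site) - start


def is_positive(band_sizes_bp, expected_bp: int = 853, tolerance_bp: int = 30) -> bool:
    """Call a seedling positive if any observed band is near the expected size.

    tolerance_bp is a hypothetical allowance for gel resolution.
    """
    return any(abs(size - expected_bp) <= tolerance_bp for size in band_sizes_bp)


# Toy template: primer F, filler, reverse complement of primer R (20 + 812 + 21).
toy_template = PRIMER_F + "A" * 812 + reverse_complement(PRIMER_R)
print(amplicon_length(toy_template, PRIMER_F, PRIMER_R))
```

Both primers lie within the Cas9n coding region (positions 5035-9135), so the band reports presence of the transgene rather than editing at the OsACC-T target.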
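III. Analysis of results
1. For each vector, genomic DNA of the positive T0 rice seedlings obtained in step II was used as the template, and the OsACC-T target was amplified by PCR with primer pair T to obtain the PCR product.
2. The PCR product from step 1 was Sanger sequenced and analyzed. Only the target region was analyzed. The number of positive T0 seedlings with A·G base substitution at the OsACC-T target was counted and the A·G base substitution efficiency calculated; the results are shown in Table 2.
The sABE system uses the nuclear localization signal bpNLS-bpNLS; the FNLS-sABE system uses 3*Flag1&1*NLS1-1*NLS1; the F4NLS-sABE system uses 3*Flag2&4*NLS2-4*NLS2; the 4NLS-sABE system uses 4*NLS2.
The results show that all four systems (sABE, FNLS-sABE, F4NLS-sABE and 4NLS-sABE) produce only the single A7>G7 mutant (Figure 3), with A·G base substitution efficiencies all above 50%. The FNLS-sABE system has the highest A·G base substitution efficiency, 78.3%; the sABE and F4NLS-sABE systems follow at 66.7% and 61.9%, respectively; the 4NLS-sABE system is lowest at 52.2%. Thus the sABE, FNLS-sABE and F4NLS-sABE systems produce only the single A7>G7 mutant (only the target site is mutated; no other site is mutated) with high A·G base substitution efficiency, greatly improving the efficiency of creating A7>G7 mutants.
Table 2
Figure BDA0002327792220000091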
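Since Table 2 survives only as an image reference, the efficiencies quoted in the text can be collected in a small structure for comparison. This sketch uses only the four percentages and NLS layouts stated in the description; the field names are hypothetical, and seedling counts are left out because they are not given in the text.

```python
from typing import NamedTuple, Optional


class EditorResult(NamedTuple):
    n_terminal_signal: Optional[str]  # signal placed before ecTadA, if any
    c_terminal_signal: str            # signal placed after Cas9n
    efficiency_percent: float         # A·G substitution efficiency from the text


REPORTED = {
    "sABE": EditorResult("bpNLS", "bpNLS", 66.7),
    "FNLS-sABE": EditorResult("3*Flag1&1*NLS1", "1*NLS1", 78.3),
    "F4NLS-sABE": EditorResult("3*Flag2&4*NLS2", "4*NLS2", 61.9),
    "4NLS-sABE": EditorResult(None, "4*NLS2", 52.2),
}

# Rank systems from highest to lowest reported efficiency.
ranked = sorted(REPORTED, key=lambda name: REPORTED[name].efficiency_percent, reverse=True)
for name in ranked:
    result = REPORTED[name]
    print(f"{name}: {result.efficiency_percent}%")
```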
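Example 2. Testing A7>G7 mutants for resistance to ACCase-inhibiting herbicides
I. Herbicide spraying experiment
Fifteen positive T0 rice seedlings carrying the A7>G7 mutation and fifteen positive T0 rice seedlings without the A7>G7 mutation (denoted WT seedlings), all obtained in step III of Example 1, were selected at random and handled according to steps 1-4 below:
1. When the positive T0 rice seedlings had grown to about 15-20 cm in rooting medium, the lid of the culture vessel was opened and water was added to harden the seedlings, which were cultured under light at 25°C for 5 days.
2. After hardening, the T0 seedlings were removed from the medium, residual medium was washed from the roots with water, and the seedlings were transplanted into small pots in the greenhouse and moved to large pots 15 days later.
3. At the 6-10 leaf stage, three positive T0 rice seedlings carrying the A7>G7 mutation and three without it were randomly selected to form one group, giving five groups in total. Each group was sprayed with 1 L of Gallant (69806-34-4, Dow AgroSciences, containing 108 g/L of the active ingredient haloxyfop) at 0 µg/L, 42 µg/L, 84 µg/L, 168 µg/L or 1680 µg/L, respectively, using a sprayer, once every two weeks for a total of 3 applications.
4. Photographs were taken when the positive T0 rice seedlings without the A7>G7 mutation had stopped growing or withered, and when the positive T0 rice seedlings carrying the A7>G7 mutation were heading and setting seed normally.
II. Analysis of results
Gallant is an APP-type ACCase-inhibiting herbicide. Of the five concentrations (0 µg/L, 42 µg/L, 84 µg/L, 168 µg/L and 1680 µg/L), the first four had no obvious inhibitory effect on positive T0 rice seedlings without the A7>G7 mutation, i.e. the WT seedlings grew normally. Only 1680 µg/L Gallant clearly inhibited the WT seedlings, whereas the positive T0 rice seedlings carrying the A7>G7 mutation grew normally (Figure 4) and also headed and set seed normally (Figure 5). Positive T0 rice seedlings carrying the A7>G7 mutation are therefore resistant to ACCase-inhibiting herbicides.
The above are only preferred embodiments of the invention. It should be noted that a person of ordinary skill in the art can make improvements and refinements without departing from the technical principle of the invention, and such improvements and refinements should also be regarded as within the scope of protection of the invention.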
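Preparing the five spray solutions in step 3 amounts to a stock dilution. The sketch below assumes that the stated concentrations refer to the active ingredient and that the 108 g/L product is diluted directly into the 1 L spray volume; neither point is spelled out in the text, and the function name is hypothetical. It relies on the unit identity 1 g/L = 1 µg/µL.

```python
STOCK_G_PER_L = 108.0  # active ingredient content stated for the product


def stock_volume_ul(target_ug_per_l: float,
                    final_volume_l: float = 1.0,
                    stock_g_per_l: float = STOCK_G_PER_L) -> float:
    """Microlitres of stock needed for the requested working concentration.

    Assumes the target concentration is expressed as active ingredient.
    """
    if target_ug_per_l < 0 or final_volume_l <= 0 or stock_g_per_l <= 0:
        raise ValueError("concentrations and volumes must be positive")
    mass_ug = target_ug_per_l * final_volume_l
    stock_ug_per_ul = stock_g_per_l  # 1 g/L equals 1 µg/µL
    return mass_ug / stock_ug_per_ul


for concentration in (0, 42, 84, 168, 1680):
    print(concentration, "µg/L ->", round(stock_volume_ul(concentration), 2), "µL stock per 1 L")
```

The lower doses call for less than 2 µL of stock per litre, which in practice would usually be handled through an intermediate dilution.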
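Sequence Listing
<110> Beijing Academy of Agriculture and Forestry Sciences
<120> Use of the nuclear localization signal F4NLS in the efficient creation of herbicide-resistant rice material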
<160>14
<170>PatentIn version 3.5
<210>1
<211>19494
<212>DNA
<213>Artificial Sequence
<400>1
ggtggcagga tatattgtgg tgtaaacatg gcactagcct caccgtcttc gcagacgagg 60
ccgctaagtc gcagctacgc tctcaacggc actgactagg tagtttaaac gtgcacttaa 120
ttaaggtacc tggaatcggc agcaaaggat tttttcctgt agttttccca caaccatttt 180
ttaccatccg aatgatagga taggaaaaat atccaagtga acagtattcc tataaaattc 240
ccgtaaaaag cctgcaatcc gaatgagccc tgaagtctga actagccggt cacctgtaca 300
ggctatcgag atgccataca agagacggta gtaggaacta ggaagacgat ggttgattcg 360
tcaggcgaaa tcgtcgtcct gcagtcgcat ctatgggcct ggacggaata ggggaaaaag 420
ttggccggat aggagggaaa ggcccaggtg cttacgtgcg aggtaggcct gggctctcag 480
cacttcgatt cgttggcacc ggggtaggat gcaatagaga gcaacgttta gtaccacctc 540
gcttagctag agcaaactgg actgccttat atgcgcgggt gctggcttgg ctgccgagtg 600
cacggtgtcc gtggccgttt cagagctatg ctggaaacag catagcaagt tgaaataagg 660
ctagtccgtt atcaacttga aaaagtggca ccgagtcggt gcttttttta ggaatcttta 720
aacatacgaa cagatcactt aaagttcttc tgaagcaact taaagttatc aggcatgcat 780
ggatcttgga ggaatcagat gtgcagtcag ggaccatagc acaagacagg cgtcttctac 840
tggtgctacc agcaaatgct ggaagccggg aacactgggt acgttggaaa ccacgtgtga 900
tgtgaaggag taagataaac tgtaggagaa aagcatttcg tagtgggcca tgaagccttt 960
caggacatgt attgcagtat gggccggccc attacgcaat tggacgacaa caaagactag 1020
tattagtacc acctcggcta tccacataga tcaaagctgg tttaaaagag ttgtgcagat 1080
gatccgtggc gttgatagca agataaaccc gtttcagagc tatgctggaa acagcatagc 1140
aagttgaaat aaggctagtc cgttatcaac ttgaaaaagt ggcaccgagt cggtgctttt 1200
tttctcatta gcggtatgca tgttggtaga agtcggagat gtaaataatt ttcattatat 1260
aaaaaaggta cttcgagaaa aataaatgca tacgaattaa ttctttttat gttttttaaa 1320
ccaagtatat agaatttatt gatggttaaa atttcaaaaa tatgacgaga gaaaggttaa 1380
acgtacggca tatacttctg aacagagagg gaatatgggg tttttgttgc tcccaacaat 1440
tcttaagcac gtaaaggaaa aaagcacatt atccacattg tacttccaga gatatgtaca 1500
gcattacgta ggtacgtttt ctttttcttc ccggagagat gatacaataa tcatgtaaac 1560
ccagaattta aaaaatattc tttactataa aaattttaat tagggaacgt attatttttt 1620
acatgacacc ttttgagaaa gagggacttg taatatggga caaatgaaca atttctaaga 1680
aatgggcata tgactctcag tacaatggac caaattccct ccagtcggcc cagcaataca 1740
aagggaaaga aatgaggggg cccacaggcc acggcccact tttctccgtg gtggggagat 1800
ccagctagag gtccggccca caagtggccc ttgccccgtg ggacggtggg attgcagagc 1860
gcgtgggcgg aaacaacagt ttagtaccac ctcgctcacg caacgacgcg accacttgct 1920
tataagctgc tgcgctgagg ctcagcatag cactcaatgc ggtctgtttc agagctatgc 1980
tggaaacagc atagcaagtt gaaataaggc tagtccgtta tcaacttgaa aaagtggcac 2040
cgagtcggtg cttttttttt tttaagctta caaattcggg tcaaggcgga agccagcgcg 2100
ccaccccacg tcagcaaata cggaggcgcg gggttgacgg cgtcacccgg tcctaacggc 2160
gaccaacaaa ccagccagaa gaaattacag taaaaaaaaa gtaaattgca ctttgatcca 2220
ccttttatta cctaagtctc aatttggatc acccttaaac ctatcttttc aatttgggcc 2280
gggttgtggt ttggactacc atgaacaact tttcgtcatg tctaacttcc ctttcagcaa 2340
acatatgaac catatataga ggagatcggc cgtatactag agctgatgtg tttaaggtcg 2400
ttgattgcac gagaaaaaaa aatccaaatc gcaacaatag caaatttatc tggttcaaag 2460
tgaaaagata tgtttaaagg tagtccaaag taaaacttat agataataaa atgtggtcca 2520
aagcgtaatt cactcaaaaa aaatcaacga gacgtgtacc aaacggagac aaacggcatc 2580
ttctcgaaat ttcccaaccg ctcgctcgcc cgcctcgtct tcccggaaac cgcggtggtt 2640
tcagcgtggc ggattctcca agcagacgga gacgtcacgg cacgggactc ctcccaccac 2700
ccaaccgcca taaataccag ccccctcatc tcctctcctc gcatcagctc cacccccgaa 2760
aaatttctcc ccaatctcgc gaggctctcg tcgtcgaatc gaatcctctc gcgtcctcaa 2820
ggtacgctgc ttctcctctc ctcgcttcgt ttcgattcga tttcggacgg gtgaggttgt 2880
tttgttgcta gatccgattg gtggttaggg ttgtcgatgt gattatcgtg agatgtttag 2940
gggttgtaga tctgatggtt gtgatttggg cacggttggt tcgataggtg gaatcgtggt 3000
taggttttgg gattggatgt tggttctgat gattgggggg aatttttacg gttagatgaa 3060
ttgttggatg attcgattgg ggaaatcggt gtagatctgt tggggaattg tggaactagt 3120
catgcctgag tgattggtgc gatttgtagc gtgttccatc ttgtaggcct tgttgcgagc 3180
atgttcagat ctactgttcc gctcttgatt gagttattgg tgccatgggt tggtgcaaac 3240
acaggcttta atatgttata tctgttttgt gtttgatgta gatctgtagg gtagttcttc 3300
ttagacatgg ttcaattatg tagcttgtgc gtttcgattt gatttcatat gttcacagat 3360
tagataatga tgaactcttt taattaattg tcaatggtaa ataggaagtc ttgtcgctat 3420
atctgtcata atgatctcat gttactatct gccagtaatt tatgctaaga actatattag 3480
aatatcatgt tacaatctgt agtaatatca tgttacaatc tgtagttcat ctatataatc 3540
tattgtggta atttcttttt actatctgtg tgaagattat tgccactagt tcattctact 3600
tatttctgaa gttcaggata cgtgtgctgt tactacctat ctgaatacat gtgtgatgtg 3660
cctgttacta tctttttgaa tacatgtatg ttctgttgga atatgtttgc tgtttgatcc 3720
gttgttgtgt ccttaatctt gtgctagttc ttaccctatc tgtttggtga ttatttcttg 3780
cagtacgtaa gcatgaagag gaccgccgac ggcagcgagt tcgagccgaa gaagaagagg 3840
aaggtgtccg aggtggagtt ctcccacgag tactggatga ggcacgcact caccctcgca 3900
aagagggcat gggacgagag ggaggtgcct gtgggagcag tgctcgtgca caacaacagg 3960
gtgatcggag agggatggaa caggcctatc ggaaggcacg accctaccgc acacgcagag 4020
atcatggcac tcaggcaggg aggcctcgtg atgcagaact acaggctcat cgacgccacc 4080
ctctacgtga ccctcgagcc ttgcgtgatg tgcgcaggag ccatgatcca ctccaggatc 4140
ggaagggtgg tgttcggagc aagggacgca aagaccggag cagccggctc cctcatggac 4200
gtgctccacc acccgggcat gaaccacagg gtggagatca ccgagggaat cctcgcagac 4260
gagtgcgcag ccctcctctc cgacttcttc aggatgagga ggcaggagat caaggcccag 4320
aagaaggccc agtcctccac cgactccggc ggctcatcag gcggctcctc cggctccgag 4380
acaccgggca cctccgagtc cgccaccccg gagtcctccg gcggctcctc cggcggctcc 4440
tccgaggtgg agttctccca cgagtactgg atgaggcacg cactcaccct cgcaaagagg 4500
gcaagggacg agagggaggt gcctgtggga gcagtgctcg tgctcaacaa cagggtgatc 4560
ggagagggat ggaacagggc aatcggcctc cacgacccta ccgcacacgc agagatcatg 4620
gcactcaggc agggaggcct cgtgatgcag aactacaggc tcatcgacgc caccctctac 4680
gtgaccttcg agccttgcgt gatgtgcgca ggagccatga tccactccag gatcggcagg 4740
gtggtgttcg gcgtgaggaa cgcaaagacc ggagcagcag gctccctcat ggacgtgctc 4800
cactacccgg gcatgaacca cagggtggag atcaccgagg gaatcctcgc agacgagtgc 4860
gcagccctcc tctgctactt cttcaggatg ccgaggcagg tgttcaacgc ccagaagaag 4920
gcccagtcct ccaccgactc cggcggctca tcaggcggct cctccggctc cgagacaccg 4980
ggcacctccg agtccgccac cccggagtcc tccggcggct cctccggcgg ctccgacaag 5040
aagtactcca tcggcctcgc catcggcacc aacagcgtcg gctgggcggt gatcaccgac 5100
gagtacaagg tcccgtccaa gaagttcaag gtcctgggca acaccgaccg ccactccatc 5160
aagaagaacc tcatcggcgc cctcctcttc gactccggcg agacggcgga ggcgacccgc 5220
ctcaagcgca ccgcccgccg ccgctacacc cgccgcaaga accgcatctg ctacctccag 5280
gagatcttct ccaacgagat ggcgaaggtc gacgactcct tcttccaccg cctcgaggag 5340
tccttcctcg tggaggagga caagaagcac gagcgccacc ccatcttcgg caacatcgtc 5400
gacgaggtcg cctaccacga gaagtacccc actatctacc accttcgtaa gaagcttgtt 5460
gactctactg ataaggctga tcttcgtctc atctaccttg ctctcgctca catgatcaag 5520
ttccgtggtc acttccttat cgagggtgac cttaaccctg ataactccga cgtggacaag 5580
ctcttcatcc agctcgtcca gacctacaac cagctcttcg aggagaaccc tatcaacgct 5640
tccggtgtcg acgctaaggc gatcctttcc gctaggctct ccaagtccag gcgtctcgag 5700
aacctcatcg cccagctccc tggtgagaag aagaacggtc ttttcggtaa cctcatcgct 5760
ctctccctcg gtctgacccc taacttcaag tccaacttcg acctcgctga ggacgctaag 5820
cttcagctct ccaaggatac ctacgacgat gatctcgaca acctcctcgc tcagattgga 5880
gatcagtacg ctgatctctt ccttgctgct aagaacctct ccgatgctat cctcctttcg 5940
gatatcctta gggttaacac tgagatcact aaggctcctc tttctgcttc catgatcaag 6000
cgctacgacg agcaccacca ggacctcacc ctcctcaagg ctcttgttcg tcagcagctc 6060
cccgagaagt acaaggagat cttcttcgac cagtccaaga acggctacgc cggttacatt 6120
gacggtggag ctagccagga ggagttctac aagttcatca agccaatcct tgagaagatg 6180
gatggtactg aggagcttct cgttaagctt aaccgtgagg acctccttag gaagcagagg 6240
actttcgata acggctctat ccctcaccag atccaccttg gtgagcttca cgccatcctt 6300
cgtaggcagg aggacttcta ccctttcctc aaggacaacc gtgagaagat cgagaagatc 6360
cttactttcc gtattcctta ctacgttggt cctcttgctc gtggtaactc ccgtttcgct 6420
tggatgacta ggaagtccga ggagactatc accccttgga acttcgagga ggttgttgac 6480
aagggtgctt ccgcccagtc cttcatcgag cgcatgacca acttcgacaa gaacctcccc 6540
aacgagaagg tcctccccaa gcactccctc ctctacgagt acttcacggt ctacaacgag 6600
ctcaccaagg tcaagtacgt caccgagggt atgcgcaagc ctgccttcct ctccggcgag 6660
cagaagaagg ctatcgttga cctcctcttc aagaccaacc gcaaggtcac cgtcaagcag 6720
ctcaaggagg actacttcaa gaagatcgag tgcttcgact ccgtcgagat cagcggcgtt 6780
gaggaccgtt tcaacgcttc tctcggtacc taccacgatc tcctcaagat catcaaggac 6840
aaggacttcc tcgacaacga ggagaacgag gacatcctcg aggacatcgt cctcactctt 6900
actctcttcg aggataggga gatgatcgag gagaggctca agacttacgc tcatctcttc 6960
gatgacaagg ttatgaagca gctcaagcgt cgccgttaca ccggttgggg taggctctcc 7020
cgcaagctca tcaacggtat cagggataag cagagcggca agactatcct cgacttcctc 7080
aagtctgatg gtttcgctaa caggaacttc atgcagctca tccacgatga ctctcttacc 7140
ttcaaggagg atattcagaa ggctcaggtg tccggtcagg gcgactctct ccacgagcac 7200
attgctaacc ttgctggttc ccctgctatc aagaagggca tccttcagac tgttaaggtt 7260
gtcgatgagc ttgtcaaggt tatgggtcgt cacaagcctg agaacatcgt catcgagatg 7320
gctcgtgaga accagactac ccagaagggt cagaagaact cgagggagcg catgaagagg 7380
attgaggagg gtatcaagga gcttggttct cagatcctta aggagcaccc tgtcgagaac 7440
acccagctcc agaacgagaa gctctacctc tactacctcc agaacggtag ggatatgtac 7500
gttgaccagg agctcgacat caacaggctt tctgactacg acgtcgacca cattgttcct 7560
cagtctttcc ttaaggatga ctccatcgac aacaaggtcc tcacgaggtc cgacaagaac 7620
aggggtaagt cggacaacgt cccttccgag gaggttgtca agaagatgaa gaactactgg 7680
aggcagcttc tcaacgctaa gctcattacc cagaggaagt tcgacaacct cacgaaggct 7740
gagaggggtg gcctttccga gcttgacaag gctggtttca tcaagaggca gcttgttgag 7800
acgaggcaga ttaccaagca cgttgctcag atcctcgatt ctaggatgaa caccaagtac 7860
gacgagaacg acaagctcat ccgcgaggtc aaggtgatca ccctcaagtc caagctcgtc 7920
tccgacttcc gcaaggactt ccagttctac aaggtccgcg agatcaacaa ctaccaccac 7980
gctcacgatg cttaccttaa cgctgtcgtt ggtaccgctc ttatcaagaa gtaccctaag 8040
cttgagtccg agttcgtcta cggtgactac aaggtctacg acgttcgtaa gatgatcgcc 8100
aagtccgagc aggagatcgg caaggccacc gccaagtact tcttctactc caacatcatg 8160
aacttcttca agaccgagat caccctcgcc aacggcgaga tccgcaagcg ccctcttatc 8220
gagacgaacg gtgagactgg tgagatcgtt tgggacaagg gtcgcgactt cgctactgtt 8280
cgcaaggtcc tttctatgcc tcaggttaac atcgtcaaga agaccgaggt ccagaccggt 8340
ggcttctcca aggagtctat ccttccaaag agaaactcgg acaagctcat cgctaggaag 8400
aaggattggg accctaagaa gtacggtggt ttcgactccc ctactgtcgc ctactccgtc 8460
ctcgtggtcg ccaaggtgga gaagggtaag tcgaagaagc tcaagtccgt caaggagctc 8520
ctcggcatca ccatcatgga gcgctcctcc ttcgagaaga acccgatcga cttcctcgag 8580
gccaagggct acaaggaggt caagaaggac ctcatcatca agctccccaa gtactctctt 8640
ttcgagctcg agaacggtcg taagaggatg ctggcttccg ctggtgagct ccagaagggt 8700
aacgagcttg ctcttccttc caagtacgtg aacttcctct acctcgcctc ccactacgag 8760
aagctcaagg gttcccctga ggataacgag cagaagcagc tcttcgtgga gcagcacaag 8820
cactacctcg acgagatcat cgagcagatc tccgagttct ccaagcgcgt catcctcgct 8880
gacgctaacc tcgacaaggt cctctccgcc tacaacaagc accgcgacaa gcccatccgc 8940
gagcaggccg agaacatcat ccacctcttc acgctcacga acctcggcgc ccctgctgct 9000
ttcaagtact tcgacaccac catcgacagg aagcgttaca cgtccaccaa ggaggttctc 9060
gacgctactc tcatccacca gtccatcacc ggtctttacg agactcgtat cgacctttcc 9120
cagcttggtg gtgataagag gaccgccgac ggcagcgagt tcgagccgaa gaagaagagg 9180
aaggtgtaga ctagttcagc cagtttggtg gagctgccga tgtgcctggt cgtcccgagc 9240
ctctgttcgt caagtatttg tggtgctgat gtctacttgt gtctggttta atggaccatc 9300
gagtccgtat gatatgttag ttttatgaaa cagtttcctg tgggacagca gtatgcttta 9360
tgaataagtt ggatttgaac ctaaatatgt gctcaatttg ctcatttgca tctcattcct 9420
gttgatgttt tatctgagtt gcaagtttga aaatgctgca tattcttatt aaatcgtcat 9480
ttacttttat cttaatgagc tttgcaatgg cctatgggat ataaaagaga tcgttcaaac 9540
atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat gattatcata 9600
taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat gacgttattt 9660
atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc gatagaaaac 9720
aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat gttactagat 9780
cggcgcctgt ccgggcgcgc ctggtggatc gtccgcctag gctgcagtgc agcgtgaccc 9840
ggtcgtgccc ctctctagag ataatgagca ttgcatgtct aagttataaa aaattaccac 9900
atattttttt tgtcacactt gtttgaagtg cagtttatct atctttatac atatatttaa 9960
actttactct acgaataata taatctatag tactacaata atatcagtgt tttagagaat 10020
catataaatg aacagttaga catggtctaa aggacaattg agtattttga caacaggact 10080
ctacagtttt atctttttag tgtgcatgtg ttctcctttt tttttgcaaa tagcttcacc 10140
tatataatac ttcatccatt ttattagtac atccatttag ggtttagggt taatggtttt 10200
tatagactaa tttttttagt acatctattt tattctattt tagcctctaa attaagaaaa 10260
ctaaaactct attttagttt ttttatttaa taatttagat ataaaataga ataaaataaa 10320
gtgactaaaa attaaacaaa taccctttaa gaaattaaaa aaactaagga aacatttttc 10380
ttgtttcgag tagataatgc cagcctgtta aacgccgtcg acgagtctaa cggacaccaa 10440
ccagcgaacc agcagcgtcg cgtcgggcca agcgaagcag acggcacggc atctctgtcg 10500
ctgcctctgg acccctctcg agagttccgc tccaccgttg gacttgctcc gctgtcggca 10560
tccagaaatt gcgtggcgga gcggcagacg tgagccggca cggcaggcgg cctcctcctc 10620
ctctcacggc accggcagct acgggggatt cctttcccac cgctccttcg ctttcccttc 10680
ctcgcccgcc gtaataaata gacaccccct ccacaccctc tttccccaac ctcgtgttgt 10740
tcggagcgca cacacacaca accagatctc ccccaaatcc acccgtcggc acctccgctt 10800
caaggtacgc cgctcgtcct cccccccccc ccctctctac cttctctaga tcggcgttcc 10860
ggtccatggt tagggcccgg tagttctact tctgttcatg tttgtgttag atccgtgttt 10920
gtgttagatc cgtgctgcta gcgttcgtac acggatgcga cctgtacgtc agacacgttc 10980
tgattgctaa cttgccagtg tttctctttg gggaatcctg ggatggctct agccgttccg 11040
cagacgggat cgatttcatg attttttttg tttcgttgca tagggtttgg tttgcccttt 11100
tcctttattt caatatatgc cgtgcacttg tttgtcgggt catcttttca tgcttttttt 11160
tgtcttggtt gtgatgatgt ggtctggttg ggcggtcgtt ctagatcgga gtagaattct 11220
gtttcaaact acctggtgga tttattaatt ttggatctgt atgtgtgtgc catacatatt 11280
catagttacg aattgaagat gatggatgga aatatcgatc taggataggt atacatgttg 11340
atgcgggttt tactgatgca tatacagaga tgctttttgt tcgcttggtt gtgatgatgt 11400
ggtgtggttg ggcggtcgtt cattcgttct agatcggagt agaatactgt ttcaaactac 11460
ctggtgtatt tattaatttt ggaactgtat gtgtgtgtca tacatcttca tagttacgag 11520
tttaagatgg atggaaatat cgatctagga taggtataca tgttgatgtg ggttttactg 11580
atgcatatac atgatggcat atgcagcatc tattcatatg ctctaacctt gagtacctat 11640
ctattataat aaacaagtat gttttataat tattttgatc ttgatatact tggatgatgg 11700
catatgcagc agctatatgt ggattttttt agccctgcct tcatacgcta tttatttgct 11760
tggtactgtt tcttttgtcg atgctcaccc tgttgtttgg tgttacttct gcaggagctc 11820
atgaaaaagc ctgaactcac cgcgacgtct gtcgagaagt ttctgatcga aaagttcgac 11880
agcgtctccg acctgatgca gctctcggag ggcgaagaat ctcgtgcttt cagcttcgat 11940
gtaggagggc gtggatatgt cctgcgggta aatagctgcg ccgatggttt ctacaaagat 12000
cgttatgttt atcggcactt tgcatcggcc gcgctcccga ttccggaagt gcttgacatt 12060
ggggagttta gcgagagcct gacctattgc atctcccgcc gttcacaggg tgtcacgttg 12120
caagacctgc ctgaaaccga actgcccgct gttctacaac cggtcgcgga ggctatggat 12180
gcgatcgctg cggccgatct tagccagacg agcgggttcg gcccattcgg accgcaagga 12240
atcggtcaat acactacatg gcgtgatttc atatgcgcga ttgctgatcc ccatgtgtat 12300
cactggcaaa ctgtgatgga cgacaccgtc agtgcgtccg tcgcgcaggc tctcgatgag 12360
ctgatgcttt gggccgagga ctgccccgaa gtccggcacc tcgtgcacgc ggatttcggc 12420
tccaacaatg tcctgacgga caatggccgc ataacagcgg tcattgactg gagcgaggcg 12480
atgttcgggg attcccaata cgaggtcgcc aacatcttct tctggaggcc gtggttggct 12540
tgtatggagc agcagacgcg ctacttcgag cggaggcatc cggagcttgc aggatcgcca 12600
cgactccggg cgtatatgct ccgcattggt cttgaccaac tctatcagag cttggttgac 12660
ggcaatttcg atgatgcagc ttgggcgcag ggtcgatgcg acgcaatcgt ccgatccgga 12720
gccgggactg tcgggcgtac acaaatcgcc cgcagaagcg cggccgtctg gaccgatggc 12780
tgtgtagaag tactcgccga tagtggaaac cgacgcccca gcactcgtcc gagggcaaag 12840
aaatagagta gatgccgacc gggatctgtc gatcgacaag ctcgagtttc tccataataa 12900
tgtgtgagta gttcccagat aagggaatta gggttcctat agggtttcgc tcatgtgttg 12960
agcatataag aaacccttag tatgtatttg tatttgtaaa atacttctat caataaaatt 13020
tctaattcct aaaaccaaaa tccagtacta aaatccagat cccccgaatt aattcggcgt 13080
taattcagcc tgcaggacgc gtttaattaa gtgcacgcgg ccgcctactt agtcaagagc 13140
ctcgcacgcg actgtcacgc ggccaggatc gcctcgtgag cctcgcaatc tgtacctagt 13200
gtttaaacta tcagtgtttg acaggatata ttggcgggta aacctaagag aaaagagcgt 13260
ttattagaat aacggatatt taaaagggcg tgaaaaggtt tatccgttcg tccatttgta 13320
tgtgcatgcc aaccacaggg ttcccctcgg gatcaaagta ctttgatcca acccctccgc 13380
tgctatagtg cagtcggctt ctgacgttca gtgcagccgt cttctgaaaa cgacatgtcg 13440
cacaagtcct aagttacgcg acaggctgcc gccctgccct tttcctggcg ttttcttgtc 13500
gcgtgtttta gtcgcataaa gtagaatact tgcgactaga accggagaca ttacgccatg 13560
aacaagagcg ccgccgctgg cctgctgggc tatgcccgcg tcagcaccga cgaccaggac 13620
ttgaccaacc aacgggccga actgcacgcg gccggctgca ccaagctgtt ttccgagaag 13680
atcaccggca ccaggcgcga ccgcccggag ctggccagga tgcttgacca cctacgccct 13740
ggcgacgttg tgacagtgac caggctagac cgcctggccc gcagcacccg cgacctactg 13800
gacattgccg agcgcatcca ggaggccggc gcgggcctgc gtagcctggc agagccgtgg 13860
gccgacacca ccacgccggc cggccgcatg gtgttgaccg tgttcgccgg cattgccgag 13920
ttcgagcgtt ccctaatcat cgaccgcacc cggagcgggc gcgaggccgc caaggcccga 13980
ggcgtgaagt ttggcccccg ccctaccctc accccggcac agatcgcgca cgcccgcgag 14040
ctgatcgacc aggaaggccg caccgtgaaa gaggcggctg cactgcttgg cgtgcatcgc 14100
tcgaccctgt accgcgcact tgagcgcagc gaggaagtga cgcccaccga ggccaggcgg 14160
cgcggtgcct tccgtgagga cgcattgacc gaggccgacg ccctggcggc cgccgagaat 14220
gaacgccaag aggaacaagc atgaaaccgc accaggacgg ccaggacgaa ccgtttttca 14280
ttaccgaaga gatcgaggcg gagatgatcg cggccgggta cgtgttcgag ccgcccgcgc 14340
acgtctcaac cgtgcggctg catgaaatcc tggccggttt gtctgatgcc aagctggcgg 14400
cctggccggc cagcttggcc gctgaagaaa ccgagcgccg ccgtctaaaa aggtgatgtg 14460
tatttgagta aaacagcttg cgtcatgcgg tcgctgcgta tatgatgcga tgagtaaata 14520
aacaaatacg caaggggaac gcatgaaggt tatcgctgta cttaaccaga aaggcgggtc 14580
aggcaagacg accatcgcaa cccatctagc ccgcgccctg caactcgccg gggccgatgt 14640
tctgttagtc gattccgatc cccagggcag tgcccgcgat tgggcggccg tgcgggaaga 14700
tcaaccgcta accgttgtcg gcatcgaccg cccgacgatt gaccgcgacg tgaaggccat 14760
cggccggcgc gacttcgtag tgatcgacgg agcgccccag gcggcggact tggctgtgtc 14820
cgcgatcaag gcagccgact tcgtgctgat tccggtgcag ccaagccctt acgacatatg 14880
ggccaccgcc gacctggtgg agctggttaa gcagcgcatt gaggtcacgg atggaaggct 14940
acaagcggcc tttgtcgtgt cgcgggcgat caaaggcacg cgcatcggcg gtgaggttgc 15000
cgaggcgctg gccgggtacg agctgcccat tcttgagtcc cgtatcacgc agcgcgtgag 15060
ctacccaggc actgccgccg ccggcacaac cgttcttgaa tcagaacccg agggcgacgc 15120
tgcccgcgag gtccaggcgc tggccgctga aattaaatca aaactcattt gagttaatga 15180
ggtaaagaga aaatgagcaa aagcacaaac acgctaagtg ccggccgtcc gagcgcacgc 15240
agcagcaagg ctgcaacgtt ggccagcctg gcagacacgc cagccatgaa gcgggtcaac 15300
tttcagttgc cggcggagga tcacaccaag ctgaagatgt acgcggtacg ccaaggcaag 15360
accattaccg agctgctatc tgaatacatc gcgcagctac cagagtaaat gagcaaatga 15420
ataaatgagt agatgaattt tagcggctaa aggaggcggc atggaaaatc aagaacaacc 15480
aggcaccgac gccgtggaat gccccatgtg tggaggaacg ggcggttggc caggcgtaag 15540
cggctgggtt gtctgccggc cctgcaatgg cactggaacc cccaagcccg aggaatcggc 15600
gtgacggtcg caaaccatcc ggcccggtac aaatcggcgc ggcgctgggt gatgacctgg 15660
tggagaagtt gaaggccgcg caggccgccc agcggcaacg catcgaggca gaagcacgcc 15720
ccggtgaatc gtggcaagcg gccgctgatc gaatccgcaa agaatcccgg caaccgccgg 15780
cagccggtgc gccgtcgatt aggaagccgc ccaagggcga cgagcaacca gattttttcg 15840
ttccgatgct ctatgacgtg ggcacccgcg atagtcgcag catcatggac gtggccgttt 15900
tccgtctgtc gaagcgtgac cgacgagctg gcgaggtgat ccgctacgag cttccagacg 15960
ggcacgtaga ggtttccgca gggccggccg gcatggccag tgtgtgggat tacgacctgg 16020
tactgatggc ggtttcccat ctaaccgaat ccatgaaccg ataccgggaa gggaagggag 16080
acaagcccgg ccgcgtgttc cgtccacacg ttgcggacgt actcaagttc tgccggcgag 16140
ccgatggcgg aaagcagaaa gacgacctgg tagaaacctg cattcggtta aacaccacgc 16200
acgttgccat gcagcgtacg aagaaggcca agaacggccg cctggtgacg gtatccgagg 16260
gtgaagcctt gattagccgc tacaagatcg taaagagcga aaccgggcgg ccggagtaca 16320
tcgagatcga gctagctgat tggatgtacc gcgagatcac agaaggcaag aacccggacg 16380
tgctgacggt tcaccccgat tactttttga tcgatcccgg catcggccgt tttctctacc 16440
gcctggcacg ccgcgccgca ggcaaggcag aagccagatg gttgttcaag acgatctacg 16500
aacgcagtgg cagcgccgga gagttcaaga agttctgttt caccgtgcgc aagctgatcg 16560
ggtcaaatga cctgccggag tacgatttga aggaggaggc ggggcaggct ggcccgatcc 16620
tagtcatgcg ctaccgcaac ctgatcgagg gcgaagcatc cgccggttcc taatgtacgg 16680
agcagatgct agggcaaatt gccctagcag gggaaaaagg tcgaaaaggt ctctttcctg 16740
tggatagcac gtacattggg aacccaaagc cgtacattgg gaaccggaac ccgtacattg 16800
ggaacccaaa gccgtacatt gggaaccggt cacacatgta agtgactgat ataaaagaga 16860
aaaaaggcga tttttccgcc taaaactctt taaaacttat taaaactctt aaaacccgcc 16920
tggcctgtgc ataactgtct ggccagcgca cagccgaaga gctgcaaaaa gcgcctaccc 16980
ttcggtcgct gcgctcccta cgccccgccg cttcgcgtcg gcctatcgcg gccgctggcc 17040
gctcaaaaat ggctggccta cggccaggca atctaccagg gcgcggacaa gccgcgccgt 17100
cgccactcga ccgccggcgc ccacatcaag gcaccctgcc tcgcgcgttt cggtgatgac 17160
ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct gtaagcggat 17220
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg tcggggcgca 17280
gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat gcggcatcag 17340
agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga tgcgtaagga 17400
gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 17460
ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 17520
caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 17580
aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 17640
atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 17700
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 17760
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 17820
gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 17880
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 17940
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 18000
cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 18060
gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 18120
aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 18180
aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 18240
actcacgtta agggattttg gtcatgcatt ctaggtacta aaacaattca tccagtaaaa 18300
tataatattt tattttctcc caatcaggct tgatccccag taagtcaaaa aatagctcga 18360
catactgttc ttccccgata tcctccctga tcgaccggac gcagaaggca atgtcatacc 18420
acttgtccgc cctgccgctt ctcccaagat caataaagcc acttactttg ccatctttca 18480
caaagatgtt gctgtctccc aggtcgccgt gggaaaagac aagttcctct tcgggctttt 18540
ccgtctttaa aaaatcatac agctcgcgcg gatctttaaa tggagtgtct tcttcccagt 18600
tttcgcaatc cacatcggcc agatcgttat tcagtaagta atccaattcg gctaagcggc 18660
tgtctaagct attcgtatag ggacaatccg atatgtcgat ggagtgaaag agcctgatgc 18720
actccgcata cagctcgata atcttttcag ggctttgttc atcttcatac tcttccgagc 18780
aaaggacgcc atcggcctca ctcatgagca gattgctcca gccatcatgc cgttcaaagt 18840
gcaggacctt tggaacaggc agctttcctt ccagccatag catcatgtcc ttttcccgtt 18900
ccacatcata ggtggtccct ttataccggc tgtccgtcat ttttaaatat aggttttcat 18960
tttctcccac cagcttatat accttagcag gagacattcc ttccgtatct tttacgcagc 19020
ggtatttttc gatcagtttt ttcaattccg gtgatattct cattttagcc atttattatt 19080
tccttcctct tttctacagt atttaaagat accccaagaa gctaattata acaagacgaa 19140
ctccaattca ctgttccttg cattctaaaa ccttaaatac cagaaaacag ctttttcaaa 19200
gttgttttca aagttggcgt ataacatagt atcgacggag ccgattttga aaccgcggtg 19260
atcacaggca gcaacgctct gtcatcgtta caatcaacat gctaccctcc gcgagatcat 19320
ccgtgtttca aacccggcag cttagttgcc gttcttccga atagcatcgg taacatgagc 19380
aaagtctgcc gccttacaac ggctctcccg ctgacgccgt cccggactga tgggctgcct 19440
gtatcgagtg gtgattttgt gccgagctgc cggtcgggga gctgttggct ggct 19494
<210>2
<211>166
<212>PRT
<213>Artificial Sequence
<400>2
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
1 5 10 15
Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala Val
20 25 30
Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro Ile
35 40 45
Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
50 55 60
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
65 70 75 80
Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
85 90 95
Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala
100 105 110
Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His Arg
115 120 125
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
130 135 140
Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys
145 150 155 160
Ala Gln Ser Ser Thr Asp
165
<210>3
<211>166
<212>PRT
<213>Artificial Sequence
<400>3
Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr
1 5 10 15
Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala Val
20 25 30
Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala Ile
35 40 45
Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln
50 55 60
Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr
65 70 75 80
Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser
85 90 95
Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly Ala
100 105 110
Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His Arg
115 120 125
Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu
130 135 140
Cys Tyr Phe Phe Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys Lys
145 150 155 160
Ala Gln Ser Ser Thr Asp
165
<210>4
<211>1367
<212>PRT
<213>Artificial Sequence
<400>4
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1025 1030 1035
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1040 1045 1050
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1055 1060 1065
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1070 1075 1080
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1085 1090 1095
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1100 1105 1110
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1115 1120 1125
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1130 1135 1140
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1145 1150 1155
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1160 1165 1170
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1175 1180 1185
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1190 1195 1200
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1205 1210 1215
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1220 1225 1230
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
1265 1270 1275
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1280 1285 1290
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1295 1300 1305
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1310 1315 1320
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
1325 1330 1335
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1340 1345 1350
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210>5
<211>117
<212>DNA
<213>Artificial Sequence
<400>5
gactacaagg accacgacgg cgactacaag gatcatgaca tcgactacaa ggacgacgac 60
gacaagatgg ccccgaagaa gaagaggaaa gtgggcatcc acggcgtgcc ggccgcc 117
<210>6
<211>207
<212>DNA
<213>Artificial Sequence
<400>6
gactacaagg accacgacgg ggattacaaa gaccacgaca tagactacaa ggatgacgat 60
gacaaaatgg caccgaagaa aaaaaggaag gtcggcggct ccccgaagaa aaaaaggaag 120
gtcggcggct ccccgaagaa aaaaaggaag gtcggcggct ccccgaagaa aaaaaggaag 180
gtcggaatcc atggcgttcc agctgcc 207
<210>7
<211>17
<212>PRT
<213>Artificial Sequence
<400>7
Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys Lys Arg Lys
1 5 10 15
Val
<210>8
<211>31
<212>PRT
<213>Artificial Sequence
<400>8
Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr
1 5 10 15
Lys Asp Asp Asp Asp Lys Met Ala Pro Lys Lys Lys Arg Lys Val
20 25 30
<210>9
<211>7
<212>PRT
<213>Artificial Sequence
<400>9
Pro Lys Lys Lys Arg Lys Val
1 5
<210>10
<211>61
<212>PRT
<213>Artificial Sequence
<400>10
Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr
1 5 10 15
Lys Asp Asp Asp Asp Lys Met Ala Pro Lys Lys Lys Arg Lys Val Gly
20 25 30
Gly Ser Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Pro Lys Lys Lys
35 40 45
Arg Lys Val Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
50 55 60
<210>11
<211>37
<212>PRT
<213>Artificial Sequence
<400>11
Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Pro Lys Lys Lys Arg Lys
1 5 10 15
Val Gly Gly Ser Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Pro Lys
20 25 30
Lys Lys Arg Lys Val
35
<210>12
<211>20
<212>DNA
<213>Artificial Sequence
<400>12
catagcactc aatgcggtct 20
<210>13
<211>2327
<212>PRT
<213>Artificial Sequence
<400>13
Met Thr Ser Thr His Val Ala Thr Leu Gly Val Gly Ala Gln Ala Pro
1 5 10 15
Pro Arg His Gln Lys Lys Ser Ala Gly Thr Ala Phe Val Ser Ser Gly
20 25 30
Ser Ser Arg Pro Ser Tyr Arg Lys Asn Gly Gln Arg Thr Arg Ser Leu
35 40 45
Arg Glu Glu Ser Asn Gly Gly Val Ser Asp Ser Lys Lys Leu Asn His
50 55 60
Ser Ile Arg Gln Gly Leu Ala Gly Ile Ile Asp Leu Pro Asn Asp Ala
65 70 75 80
Ala Ser Glu Val Asp Ile Ser His Gly Ser Glu Asp Pro Arg Gly Pro
85 90 95
Thr Val Pro Gly Ser Tyr Gln Met Asn Gly Ile Ile Asn Glu Thr His
100 105 110
Asn Gly Arg His Ala Ser Val Ser Lys Val Val Glu Phe Cys Thr Ala
115 120 125
Leu Gly Gly Lys Thr Pro Ile His Ser Val Leu Val Ala Asn Asn Gly
130 135 140
Met Ala Ala Ala Lys Phe Met Arg Ser Val Arg Thr Trp Ala Asn Asp
145 150 155 160
Thr Phe Gly Ser Glu Lys Ala Ile Gln Leu Ile Ala Met Ala Thr Pro
165 170 175
Glu Asp Leu Arg Ile Asn Ala Glu His Ile Arg Ile Ala Asp Gln Phe
180 185 190
Val Glu Val Pro Gly Gly Thr Asn Asn Asn Asn Tyr Ala Asn Val Gln
195 200 205
Leu Ile Val Glu Ile Ala Glu Arg Thr Gly Val Ser Ala Val Trp Pro
210 215 220
Gly Trp Gly His Ala Ser Glu Asn Pro Glu Leu Pro Asp Ala Leu Thr
225 230 235 240
Ala Lys Gly Ile Val Phe Leu Gly Pro Pro Ala Ser Ser Met His Ala
245 250 255
Leu Gly Asp Lys Val Gly Ser Ala Leu Ile Ala Gln Ala Ala Gly Val
260 265 270
Pro Thr Leu Ala Trp Ser Gly Ser His Val Glu Val Pro Leu Glu Cys
275 280 285
Cys Leu Asp Ser Ile Pro Asp Glu Met Tyr Arg Lys Ala Cys Val Thr
290 295 300
Thr Thr Glu Glu Ala Val Ala Ser Cys Gln Val Val Gly Tyr Pro Ala
305 310 315 320
Met Ile Lys Ala Ser Trp Gly Gly Gly Gly Lys Gly Ile Arg Lys Val
325 330 335
His Asn Asp Asp Glu Val Arg Thr Leu Phe Lys Gln Val Gln Gly Glu
340 345 350
Val Pro Gly Ser Pro Ile Phe Ile Met Arg Leu Ala Ala Gln Ser Arg
355 360 365
His Leu Glu Val Gln Leu Leu Cys Asp Gln Tyr Gly Asn Val Ala Ala
370 375 380
Leu His Ser Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Ile
385 390 395 400
Glu Glu Gly Pro Val Thr Val Ala Pro Arg Glu Thr Val Lys Glu Leu
405 410 415
Glu Gln Ala Ala Arg Arg Leu Ala Lys Ala Val Gly Tyr Val Gly Ala
420 425 430
Ala Thr Val Glu Tyr Leu Tyr Ser Met Glu Thr Gly Glu Tyr Tyr Phe
435 440 445
Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His Pro Val Thr Glu Trp
450 455 460
Ile Ala Glu Val Asn Leu Pro Ala Ala Gln Val Ala Val Gly Met Gly
465 470 475 480
Ile Pro Leu Trp Gln Ile Pro Glu Ile Arg Arg Phe Tyr Gly Met Asn
485 490 495
His Gly Gly Gly Tyr Asp Leu Trp Arg Lys Thr Ala Ala Leu Ala Thr
500 505 510
Pro Phe Asn Phe Asp Glu Val Asp Ser Lys Trp Pro Lys Gly His Cys
515 520 525
Val Ala Val Arg Ile Thr Ser Glu Asp Pro Asp Asp Gly Phe Lys Pro
530 535 540
Thr Gly Gly Lys Val Lys Glu Ile Ser Phe Lys Ser Lys Pro Asn Val
545 550 555 560
Trp Ala Tyr Phe Ser Val Lys Ser Gly Gly Gly Ile His Glu Phe Ala
565 570 575
Asp Ser Gln Phe Gly His Val Phe Ala Tyr Gly Thr Thr Arg Ser Ala
580 585 590
Ala Ile Thr Thr Met Ala Leu Ala Leu Lys Glu Val Gln Ile Arg Gly
595 600 605
Glu Ile His Ser Asn Val Asp Tyr Thr Val Asp Leu Leu Asn Ala Ser
610 615 620
Asp Phe Arg Glu Asn Lys Ile His Thr Gly Trp Leu Asp Thr Arg Ile
625 630 635 640
Ala Met Arg Val Gln Ala Glu Arg Pro Pro Trp Tyr Ile Ser Val Val
645 650 655
Gly Gly Ala Leu Tyr Lys Thr Val Thr Ala Asn Thr Ala Thr Val Ser
660 665 670
Asp Tyr Val Gly Tyr Leu Thr Lys Gly Gln Ile Pro Pro Lys His Ile
675 680 685
Ser Leu Val Tyr Thr Thr Val Ala Leu Asn Ile Asp Gly Lys Lys Tyr
690 695 700
Thr Ile Asp Thr Val Arg Ser Gly His Gly Ser Tyr Arg Leu Arg Met
705 710 715 720
Asn Gly Ser Thr Val Asp Ala Asn Val Gln Ile Leu Cys Asp Gly Gly
725 730 735
Leu Leu Met Gln Leu Asp Gly Asn Ser His Val Ile Tyr Ala Glu Glu
740 745 750
Glu Ala Ser Gly Thr Arg Leu Leu Ile Asp Gly Lys Thr Cys Met Leu
755 760 765
Gln Asn Asp His Asp Pro Ser Lys Leu Leu Ala Glu Thr Pro Cys Lys
770 775 780
Leu Leu Arg Phe Leu Val Ala Asp Gly Ala His Val Asp Ala Asp Val
785 790 795 800
Pro Tyr Ala Glu Val Glu Val Met Lys Met Cys Met Pro Leu Leu Ser
805 810 815
Pro Ala Ser Gly Val Ile His Val Val Met Ser Glu Gly Gln Ala Met
820 825 830
Gln Ala Gly Asp Leu Ile Ala Arg Leu Asp Leu Asp Asp Pro Ser Ala
835 840 845
Val Lys Arg Ala Glu Pro Phe Glu Asp Thr Phe Pro Gln Met Gly Leu
850 855 860
Pro Ile Ala Ala Ser Gly Gln Val His Lys Leu Cys Ala Ala Ser Leu
865 870 875 880
Asn Ala Cys Arg Met Ile Leu Ala Gly Tyr Glu His Asp Ile Asp Lys
885 890 895
Val Val Pro Glu Leu Val Tyr Cys Leu Asp Thr Pro Glu Leu Pro Phe
900 905 910
Leu Gln Trp Glu Glu Leu Met Ser Val Leu Ala Thr Arg Leu Pro Arg
915 920 925
Asn Leu Lys Ser Glu Leu Glu Gly Lys Tyr Glu Glu Tyr Lys Val Lys
930 935 940
Phe Asp Ser Gly Ile Ile Asn Asp Phe Pro Ala Asn Met Leu Arg Val
945 950 955 960
Ile Ile Glu Glu Asn Leu Ala Cys Gly Ser Glu Lys Glu Lys Ala Thr
965 970 975
Asn Glu Arg Leu Val Glu Pro Leu Met Ser Leu Leu Lys Ser Tyr Glu
980 985 990
Gly Gly Arg Glu Ser His Ala His Phe Val Val Lys Ser Leu Phe Glu
995 1000 1005
Glu Tyr Leu Tyr Val Glu Glu Leu Phe Ser Asp Gly Ile Gln Ser
1010 1015 1020
Asp Val Ile Glu Arg Leu Arg Leu Gln His Ser Lys Asp Leu Gln
1025 1030 1035
Lys Val Val Asp Ile Val Leu Ser His Gln Ser Val Arg Asn Lys
1040 1045 1050
Thr Lys Leu Ile Leu Lys Leu Met Glu Ser Leu Val Tyr Pro Asn
1055 1060 1065
Pro Ala Ala Tyr Arg Asp Gln Leu Ile Arg Phe Ser Ser Leu Asn
1070 1075 1080
His Lys Ala Tyr Tyr Lys Leu Ala Leu Lys Ala Ser Glu Leu Leu
1085 1090 1095
Glu Gln Thr Lys Leu Ser Glu Leu Arg Ala Arg Ile Ala Arg Ser
1100 1105 1110
Leu Ser Glu Leu Glu Met Phe Thr Glu Glu Ser Lys Gly Leu Ser
1115 1120 1125
Met His Lys Arg Glu Ile Ala Ile Lys Glu Ser Met Glu Asp Leu
1130 1135 1140
Val Thr Ala Pro Leu Pro Val Glu Asp Ala Leu Ile Ser Leu Phe
1145 1150 1155
Asp Cys Ser Asp Thr Thr Val Gln Gln Arg Val Ile Glu Thr Tyr
1160 1165 1170
Ile Ala Arg Leu Tyr Gln Pro His Leu Val Lys Asp Ser Ile Lys
1175 1180 1185
Met Lys Trp Ile Glu Ser Gly Val Ile Ala Leu Trp Glu Phe Pro
1190 1195 1200
Glu Gly His Phe Asp Ala Arg Asn Gly Gly Ala Val Leu Gly Asp
1205 1210 1215
Lys Arg Trp Gly Ala Met Val Ile Val Lys Ser Leu Glu Ser Leu
1220 1225 1230
Ser Met Ala Ile Arg Phe Ala Leu Lys Glu Thr Ser His Tyr Thr
1235 1240 1245
Ser Ser Glu Gly Asn Met Met His Ile Ala Leu Leu Gly Ala Asp
1250 1255 1260
Asn Lys Met His Ile Ile Gln Glu Ser Gly Asp Asp Ala Asp Arg
1265 1270 1275
Ile Ala Lys Leu Pro Leu Ile Leu Lys Asp Asn Val Thr Asp Leu
1280 1285 1290
His Ala Ser Gly Val Lys Thr Ile Ser Phe Ile Val Gln Arg Asp
1295 1300 1305
Glu Ala Arg Met Thr Met Arg Arg Thr Phe Leu Trp Ser Asp Glu
1310 1315 1320
Lys Leu Ser Tyr Glu Glu Glu Pro Ile Leu Arg His Val Glu Pro
1325 1330 1335
Pro Leu Ser Ala Leu Leu Glu Leu Asp Lys Leu Lys Val Lys Gly
1340 1345 1350
Tyr Asn Glu Met Lys Tyr Thr Pro Ser Arg Asp Arg Gln Trp His
1355 1360 1365
Ile Tyr Thr Leu Arg Asn Thr Glu Asn Pro Lys Met Leu His Arg
1370 1375 1380
Val Phe Phe Arg Thr Leu Val Arg Gln Pro Ser Val Ser Asn Lys
1385 1390 1395
Phe Ser Ser Gly Gln Ile Gly Asp Met Glu Val Gly Ser Ala Glu
1400 1405 1410
Glu Pro Leu Ser Phe Thr Ser Thr Ser Ile Leu Arg Ser Leu Met
1415 1420 1425
Thr Ala Ile Glu Glu Leu Glu Leu His Ala Ile Arg Thr Gly His
1430 1435 1440
Ser His Met Tyr Leu His Val Leu Lys Glu Gln Lys Leu Leu Asp
1445 1450 1455
Leu Val Pro Val Ser Gly Asn Thr Val Leu Asp Val Gly Gln Asp
1460 1465 1470
Glu Ala Thr Ala Tyr Ser Leu Leu Lys Glu Met Ala Met Lys Ile
1475 1480 1485
His Glu Leu Val Gly Ala Arg Met His His Leu Ser Val Cys Gln
1490 1495 1500
Trp Glu Val Lys Leu Lys Leu Asp Cys Asp Gly Pro Ala Ser Gly
1505 1510 1515
Thr Trp Arg Ile Val Thr Thr Asn Val Thr Ser His Thr Cys Thr
1520 1525 1530
Val Asp Ile Tyr Arg Glu Met Glu Asp Lys Glu Ser Arg Lys Leu
1535 1540 1545
Val Tyr His Pro Ala Thr Pro Ala Ala Gly Pro Leu His Gly Val
1550 1555 1560
Ala Leu Asn Asn Pro Tyr Gln Pro Leu Ser Val Ile Asp Leu Lys
1565 1570 1575
Arg Cys Ser Ala Arg Asn Asn Arg Thr Thr Tyr Cys Tyr Asp Phe
1580 1585 1590
Pro Leu Ala Phe Glu Thr Ala Val Arg Lys Ser Trp Ser Ser Ser
1595 1600 1605
Thr Ser Gly Ala Ser Lys Gly Val Glu Asn Ala Gln Cys Tyr Val
1610 1615 1620
Lys Ala Thr Glu Leu Val Phe Ala Asp Lys His Gly Ser Trp Gly
1625 1630 1635
Thr Pro Leu Val Gln Met Asp Arg Pro Ala Gly Leu Asn Asp Ile
1640 1645 1650
Gly Met Val Ala Trp Thr Leu Lys Met Ser Thr Pro Glu Phe Pro
1655 1660 1665
Ser Gly Arg Glu Ile Ile Val Val Ala Asn Asp Ile Thr Phe Arg
1670 1675 1680
Ala Gly Ser Phe Gly Pro Arg Glu Asp Ala Phe Phe Glu Ala Val
1685 1690 1695
Thr Asn Leu Ala Cys Glu Lys Lys Leu Pro Leu Ile Tyr Leu Ala
1700 1705 1710
Ala Asn Ser Gly Ala Arg Ile Gly Ile Ala Asp Glu Val Lys Ser
1715 1720 1725
Cys Phe Arg Val Gly Trp Ser Asp Asp Gly Ser Pro Glu Arg Gly
1730 1735 1740
Phe Gln Tyr Ile Tyr Leu Ser Glu Glu Asp Tyr Ala Arg Ile Gly
1745 1750 1755
Thr Ser Val Ile Ala His Lys Met Gln Leu Asp Ser Gly Glu Ile
1760 1765 1770
Arg Trp Val Ile Asp Ser Val Val Gly Lys Glu Asp Gly Leu Gly
1775 1780 1785
Val Glu Asn Ile His Gly Ser Ala Ala Ile Ala Ser Ala Tyr Ser
1790 1795 1800
Arg Ala Tyr Lys Glu Thr Phe Thr Leu Thr Phe Val Thr Gly Arg
1805 1810 1815
Thr Val Gly Ile Gly Ala Tyr Leu Ala Arg Leu Gly Ile Arg Cys
1820 1825 1830
Ile Gln Arg Leu Asp Gln Pro Ile Ile Leu Thr Gly Tyr Ser Ala
1835 1840 1845
Leu Asn Lys Leu Leu Gly Arg Glu Val Tyr Ser Ser His Met Gln
1850 1855 1860
Leu Gly Gly Pro Lys Ile Met Ala Thr Asn Gly Val Val His Leu
1865 1870 1875
Thr Val Ser Asp Asp Leu Glu Gly Val Ser Asn Ile Leu Arg Trp
1880 1885 1890
Leu Ser Tyr Val Pro Ala Tyr Ile Gly Gly Pro Leu Pro Val Thr
1895 1900 1905
Thr Pro Leu Asp Pro Pro Asp Arg Pro Val Ala Tyr Ile Pro Glu
1910 1915 1920
Asn Ser Cys Asp Pro Arg Ala Ala Ile Arg Gly Val Asp Asp Ser
1925 1930 1935
Gln Gly Lys Trp Leu Gly Gly Met Phe Asp Lys Asp Ser Phe Val
1940 1945 1950
Glu Thr Phe Glu Gly Trp Ala Lys Thr Val Val Thr Gly Arg Ala
1955 1960 1965
Lys Leu Gly Gly Ile Pro Val Gly Val Ile Ala Val Glu Thr Gln
1970 1975 1980
Thr Met Met Gln Thr Ile Pro Ala Asp Pro Gly Gln Leu Asp Ser
1985 1990 1995
Arg Glu Gln Ser Val Pro Arg Ala Gly Gln Val Trp Phe Pro Asp
2000 2005 2010
Ser Ala Thr Lys Thr Ala Gln Ala Leu Leu Asp Phe Asn Arg Glu
2015 2020 2025
Gly Leu Pro Leu Phe Ile Leu Ala Asn Trp Arg Gly Phe Ser Gly
2030 2035 2040
Gly Gln Arg Asp Leu Phe Glu Gly Ile Leu Gln Ala Gly Ser Thr
2045 2050 2055
Ile Val Glu Asn Leu Arg Thr Tyr Asn Gln Pro Ala Phe Val Tyr
2060 2065 2070
Ile Pro Met Ala Ala Glu Leu Arg Gly Gly Ala Trp Val Val Val
2075 2080 2085
Asp Ser Lys Ile Asn Pro Asp Arg Ile Glu Cys Tyr Ala Glu Arg
2090 2095 2100
Thr Ala Lys Gly Asn Val Leu Glu Pro Gln Gly Leu Ile Glu Ile
2105 2110 2115
Lys Phe Arg Ser Glu Glu Leu Gln Asp Cys Met Ser Arg Leu Asp
2120 2125 2130
Pro Thr Leu Ile Asp Leu Lys Ala Lys Leu Glu Val Ala Asn Lys
2135 2140 2145
Asn Gly Ser Ala Asp Thr Lys Ser Leu Gln Glu Asn Ile Glu Ala
2150 2155 2160
Arg Thr Lys Gln Leu Met Pro Leu Tyr Thr Gln Ile Ala Ile Arg
2165 2170 2175
Phe Ala Glu Leu His Asp Thr Ser Leu Arg Met Ala Ala Lys Gly
2180 2185 2190
Val Ile Lys Lys Val Val Asp Trp Glu Glu Ser Arg Ser Phe Phe
2195 2200 2205
Tyr Lys Arg Leu Arg Arg Arg Ile Ser Glu Asp Val Leu Ala Lys
2210 2215 2220
Glu Ile Arg Ala Val Ala Gly Glu Gln Phe Ser His Gln Pro Ala
2225 2230 2235
Ile Glu Leu Ile Lys Lys Trp Tyr Ser Ala Ser His Ala Ala Glu
2240 2245 2250
Trp Asp Asp Asp Asp Ala Phe Val Ala Trp Met Asp Asn Pro Glu
2255 2260 2265
Asn Tyr Lys Asp Tyr Ile Gln Tyr Leu Lys Ala Gln Arg Val Ser
2270 2275 2280
Gln Ser Leu Ser Ser Leu Ser Asp Ser Ser Ser Asp Leu Gln Ala
2285 2290 2295
Leu Pro Gln Gly Leu Ser Met Leu Leu Asp Lys Met Asp Pro Ser
2300 2305 2310
Arg Arg Ala Gln Leu Val Glu Glu Ile Arg Lys Val Leu Gly
2315 2320 2325
<210>14
<211>6984
<212>DNA
<213>Artificial Sequence
<400>14
atgacatcca cacatgtggc gacattggga gttggtgccc aggcacctcc tcgtcaccag 60
aaaaagtcag ctggcactgc atttgtatca tctgggtcat caagaccctc ataccgaaag 120
aatggtcagc gtactcggtc acttagggaa gaaagcaatg gaggagtgtc tgattccaaa 180
aagcttaacc actctattcg ccaaggtctt gctggcatca ttgacctccc aaatgacgca 240
gcttcagaag ttgatatttc acatggttcc gaagatccca gggggcctac ggtcccaggt 300
tcctaccaaa tgaatgggat tatcaatgaa acacataatg ggaggcatgc ttcagtctcc 360
aaggttgttg agttttgtac ggcacttggt ggcaaaacac caattcacag tgtattagtg 420
gccaacaatg gaatggcagc agctaagttc atgcggagtg tccgaacatg ggctaatgat 480
acttttggat cagagaaggc aattcagctg atagctatgg caactccgga ggatctgagg 540
ataaatgcag agcacatcag aattgccgat caatttgtag aggtacctgg tggaacaaac 600
aacaacaact atgcaaatgt ccaactcata gtggagatag cagagagaac aggtgtttct 660
gctgtttggc ctggttgggg tcatgcatct gagaatcctg aacttccaga tgcgctgact 720
gcaaaaggaa ttgtttttct tgggccacca gcatcatcaa tgcatgcatt aggagacaag 780
gttggctcag ctctcattgc tcaagcagct ggagttccaa cacttgcttg gagtggatca 840
catgtggaag ttcctctgga gtgttgcttg gactcaatac ctgatgagat gtatagaaaa 900
gcttgtgtta ctaccacaga ggaagcagtt gcaagttgtc aggtggttgg ttatcctgcc 960
atgattaagg catcttgggg tggtggtggt aaaggaataa ggaaggttca taatgatgat 1020
gaggttagga cattatttaa gcaagttcaa ggcgaagtac ctggttcccc aatatttatc 1080
atgaggctag ctgctcagag tcgacatctt gaagttcagt tgctttgtga tcaatatggc 1140
aacgtagcag cacttcacag tcgagattgc agtgtacaac ggcgacacca aaagataatc 1200
gaggaaggac cagttactgt tgctcctcgt gagactgtga aagagcttga gcaggcagca 1260
cggaggcttg ctaaagctgt gggttatgtt ggtgctgcta ctgttgaata cctttacagc 1320
atggaaactg gtgaatatta ttttctggaa cttaatccac ggctacaggt tgagcatcct 1380
gtcactgagt ggatagctga agtaaatttg cctgcggctc aagttgctgt tggaatgggt 1440
ataccccttt ggcagattcc agagatcagg cgcttctacg gaatgaacca tggaggaggc 1500
tatgaccttt ggaggaaaac agcagctcta gcgactccat ttaactttga tgaagtagat 1560
tctaaatggc caaaaggcca ctgcgtagct gttagaataa ctagcgagga tccagatgat 1620
gggtttaagc ctactggtgg aaaagtaaag gagataagtt tcaagagtaa accaaatgtt 1680
tgggcctatt tctcagtaaa gtctggtgga ggcatccatg aattcgctga ttctcagttc 1740
ggacatgttt ttgcgtatgg aactactaga tcggcagcaa taactaccat ggctcttgca 1800
ctaaaagagg ttcaaattcg tggagaaatt cattcaaacg tagactacac agttgaccta 1860
ttaaatgcct cagattttag agaaaataag attcatactg gttggctgga taccaggata 1920
gccatgcgtg ttcaagctga gaggcctcca tggtatattt cagtcgttgg aggggcttta 1980
tataaaacag taactgccaa cacggccact gtttctgatt atgttggtta tcttaccaag 2040
ggccagattc caccaaagca tatatccctt gtctatacga ctgttgcttt gaatatagat 2100
gggaaaaaat atacaatcga tactgtgagg agtggacatg gtagctacag attgcgaatg 2160
aatggatcaa cggttgacgc aaatgtacaa atattatgtg atggtgggct tttaatgcag 2220
ctggatggaa acagccatgt aatttatgct gaagaagagg ccagtggtac acgacttctt 2280
attgatggaa agacatgcat gttacagaat gaccatgacc catcaaagtt attagctgag 2340
acaccatgca aacttcttcg tttcttggtt gctgatggtg ctcatgttga tgctgatgta 2400
ccatatgcgg aagttgaggt tatgaagatg tgcatgcccc tcttatcacc cgcttctggt 2460
gtcatacatg ttgtaatgtc tgagggccaa gcaatgcagg ctggtgatct tatagctagg 2520
ctggatcttg atgacccttc tgctgttaag agagctgagc cgttcgaaga tacttttcca 2580
caaatgggtc tccctattgc tgcttctggc caagttcaca aattatgtgc tgcaagtctg 2640
aatgcttgtc gaatgatcct tgcggggtat gagcatgata ttgacaaggt tgtgccagag 2700
ttggtatact gcctagacac tccggagctt cctttcctgc agtgggagga gcttatgtct 2760
gttttagcaa ctagacttcc aagaaatctt aaaagtgagt tggagggcaa atatgaggaa 2820
tacaaagtaa aatttgactc tgggataatc aatgatttcc ctgccaatat gctacgagtg 2880
ataattgagg aaaatcttgc atgtggttct gagaaggaga aggctacaaa tgagaggctt 2940
gttgagcctc ttatgagcct actgaagtca tatgagggtg ggagagaaag tcatgctcac 3000
tttgttgtca agtccctttt tgaggagtat ctctatgttg aagaattgtt cagtgatgga 3060
attcagtctg atgtgattga gcgtctgcgc cttcaacata gtaaagacct acagaaggtc 3120
gtagacattg tgttgtccca ccagagtgtt agaaataaaa ctaagctgat actaaaactc 3180
atggagagtc tggtctatcc aaatcctgct gcctacaggg atcaattgat tcgcttttct 3240
tcccttaatc acaaagcgta ttacaagttg gcacttaaag ctagtgaact tcttgaacaa 3300
acaaaactta gtgagctccg tgcaagaata gcaaggagcc tttcagagct ggagatgttt 3360
actgaggaaa gcaagggtct ctccatgcat aagcgagaaa ttgccattaa ggagagcatg 3420
gaagatttag tcactgctcc actgccagtt gaagatgcgc tcatttcttt atttgattgt 3480
agtgatacaa ctgttcaaca gagagtgatt gagacttata tagctcgatt ataccagcct 3540
catcttgtaa aggacagtat caaaatgaaa tggatagaat cgggtgttat tgctttatgg 3600
gaatttcctg aagggcattt tgatgcaaga aatggaggag cggttcttgg tgacaaaaga 3660
tggggtgcca tggtcattgt caagtctctt gaatcacttt caatggccat tagatttgca 3720
ctaaaggaga catcacacta cactagctct gagggcaata tgatgcatat tgctttgttg 3780
ggtgctgata ataagatgca tataattcaa gaaagtggtg atgatgctga cagaatagcc 3840
aaacttccct tgatactaaa ggataatgta accgatctgc atgcctctgg tgtgaaaaca 3900
ataagtttca ttgttcaaag agatgaagca cggatgacaa tgcgtcgtac cttcctttgg 3960
tctgatgaaa agctttctta tgaggaagag ccaattctcc ggcatgtgga acctcctctt 4020
tctgcacttc ttgagttgga caagttgaaa gtgaaaggat acaatgaaat gaagtatacc 4080
ccatcacggg atcgtcaatg gcatatctac acacttagaa atactgaaaa ccccaaaatg 4140
ttgcaccggg tatttttccg aacccttgtc aggcaaccca gtgtatccaa caagttttct 4200
tcgggccaga ttggtgacat ggaagttggg agtgctgaag aacctctgtc atttacatca 4260
accagcatat taagatcttt gatgactgct atagaggaat tggagcttca cgcaattaga 4320
actggccatt cacacatgta tttgcatgta ttgaaagaac aaaagcttct tgatcttgtt 4380
ccagtttcag ggaatacagt tttggatgtt ggtcaagatg aagctactgc atattcactt 4440
ttaaaagaaa tggctatgaa gatacatgaa cttgttggtg caagaatgca ccatctttct 4500
gtatgccaat gggaagtgaa acttaagttg gactgcgatg gtcctgccag tggtacctgg 4560
aggattgtaa caaccaatgt tactagtcac acttgcactg tggatatcta ccgtgagatg 4620
gaagataaag aatcacggaa gttagtatac catcccgcca ctccggcggc tggtcctctg 4680
catggtgtgg cactgaataa tccatatcag cctttgagtg tcattgatct caaacgctgt 4740
tctgctagga ataatagaac tacatactgc tatgattttc cactggcatt tgaaactgca 4800
gtgaggaagt catggtcctc tagtacctct ggtgcttcta aaggtgttga aaatgcccaa 4860
tgttatgtta aagctacaga gttggtattt gcggacaaac atgggtcatg gggcactcct 4920
ttagttcaaa tggaccggcc tgctgggctc aatgacattg gtatggtagc ttggaccttg 4980
aagatgtcca ctcctgaatt tcctagtggt agggagatta ttgttgttgc aaatgatatt 5040
acgttcagag ctggatcatt tggcccaagg gaagatgcat tttttgaagc tgttaccaac 5100
ctagcctgtg agaagaaact tcctcttatt tatttggcag caaattctgg tgctcgaatt 5160
ggcatagcag atgaagtgaa atcttgcttc cgtgttgggt ggtctgatga tggcagccct 5220
gaacgtgggt ttcagtacat ttatctaagc gaagaagact atgctcgtat tggcacttct 5280
gtcatagcac ataagatgca gctagacagt ggtgaaatta ggtgggttat tgattctgtt 5340
gtgggcaagg aagatggact tggtgtggag aatatacatg gaagtgctgc tattgccagt 5400
gcttattcta gggcatataa ggagacattt acacttacat ttgtgactgg aagaactgtt 5460
ggaataggag cttatcttgc tcgacttggc atccggtgca tacagcgtct tgaccagcct 5520
attattctta caggctattc tgcactgaac aagcttcttg ggcgggaagt gtacagctcc 5580
cacatgcagt tgggtggtcc caaaatcatg gcaactaatg gtgttgtcca tcttactgtt 5640
tcagatgacc ttgaaggcgt ttctaatata ttgaggtggc tcagttatgt tcctgcctac 5700
attggtggac cacttccagt aacaacaccg ttggacccac cggacagacc tgttgcatac 5760
attcctgaga actcgtgtga tcctcgagcg gctatccgtg gtgttgatga cagccaaggg 5820
aaatggttag gtggtatgtt tgataaagac agctttgtgg aaacatttga aggttgggct 5880
aagacagtgg ttactggcag agcaaagctt ggtggaattc cagtgggtgt gatagctgtg 5940
gagactcaga ccatgatgca aactatccct gctgaccctg gtcagcttga ttcccgtgag 6000
caatctgttc ctcgtgctgg acaagtgtgg tttccagatt ctgcaaccaa gactgcgcag 6060
gcattgctgg acttcaaccg tgaaggatta cctctgttca tcctcgctaa ctggagaggc 6120
ttctctggtg gacaaagaga tctttttgaa ggaattcttc aggctggctc gactattgtt 6180
gagaacctta ggacatacaa tcagcctgcc tttgtctaca ttcccatggc tgcagagcta 6240
cgaggagggg cttgggttgt ggttgatagc aagataaacc cagaccgcat tgagtgctat 6300
gctgagagga ctgcaaaagg caatgttctg gaaccgcaag ggttaattga gatcaagttc 6360
aggtcagagg aactccagga ttgcatgagt cggcttgacc caacattaat tgatctgaaa 6420
gcaaaactcg aagtagcaaa taaaaatgga agtgctgaca caaaatcgct tcaagaaaat 6480
atagaagctc gaacaaaaca gttgatgcct ctatatactc agattgcgat acggtttgct 6540
gaattgcatg atacatccct cagaatggct gcgaaaggtg tgattaagaa agttgtggac 6600
tgggaagaat cacgatcttt cttctataag agattacgga ggaggatctc tgaggatgtt 6660
cttgcaaaag aaattagagc tgtagcaggt gagcagtttt cccaccaacc agcaatcgag 6720
ctgatcaaga aatggtattc agcttcacat gcagctgaat gggatgatga cgatgctttt 6780
gttgcttgga tggataaccc tgaaaactac aaggattata ttcaatatct taaggctcaa 6840
agagtatccc aatccctctc aagtctttca gattccagct cagatttgca agccctgcca 6900
cagggtcttt ccatgttact agataagatg gatccctcta gaagagctca acttgttgaa 6960
gaaatcagga aggtccttgg ttga 6984
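As a reading aid (not part of the patent text), the DNA listing format above can be checked mechanically: each line holds up to six 10-base groups followed by a running position counter. The sketch below assumes the standard genetic code; the helper names `parse_listing_line` and `translate` are hypothetical. It strips the counter from the final line of sequence 14 and translates it, and the result can be compared with the last residues of the protein listing (Glu Ile Arg Lys Val Leu Gly) followed by a stop codon.

```python
# Reading aid only; helper names are hypothetical, the standard genetic code is assumed.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_listing_line(line):
    """Split one DNA line of the listing into (bases, end_position)."""
    *groups, counter = line.split()
    return "".join(groups).upper(), int(counter)


def translate(dna):
    """Translate an in-frame DNA string, one letter per codon ('*' = stop)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Final line of sequence 14, copied from the listing above.
last_line = "gaaatcagga aggtccttgg ttga 6984"
bases, end = parse_listing_line(last_line)
start = end - len(bases) + 1      # first position covered by this line
tail_protein = translate(bases)   # the line starts in frame (start % 3 == 1)
total_codons = end // 3           # coding codons plus the stop codon

print(start, end, tail_protein, total_codons)
```

The codon count is consistent with the 2327 residues of the protein listing plus one stop codon (2328 × 3 = 6984).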

Claims (10)

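1. A method for preparing rice resistant to ACCase-inhibiting herbicides, comprising the step of expressing an esgRNA, an adenine deaminase, a Cas9 nuclease, nuclear localization signal A and nuclear localization signal B in recipient rice;
the esgRNA targets an OsACC1 gene target sequence; the OsACC1 gene target sequence contains a target site;
the target site is the base A complementary to the base T at position 6295 of sequence 14;
under the guidance of the esgRNA, the adenine deaminase, the Cas9 nuclease, the nuclear localization signal A and the nuclear localization signal B can mutate the target site in the recipient rice genome from base A to base G, thereby obtaining ACCase-inhibiting-herbicide-resistant rice in which position 2099 of the OsACC1 protein is mutated from cysteine to arginine;
the nuclear localization signal A comprises a 3*Flag2 tag protein and an NLS2 protein;
the nuclear localization signal B comprises the NLS2 protein;
the amino acid sequence of the 3*Flag2 tag protein is positions 1-22 of sequence 10;
the amino acid sequence of the NLS2 protein is sequence 9.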
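2. The method according to claim 1, characterized in that: the nuclear localization signal A comprises one 3*Flag2 tag protein and four NLS2 proteins;
and/or, the nuclear localization signal B comprises four NLS2 proteins;
and/or, the nuclear localization signal A is A1) or A2):
A1) a protein whose amino acid sequence is shown in sequence 10;
A2) a protein having the same function, obtained by substitution and/or deletion and/or addition of one or several amino acid residues in the amino acid sequence shown in sequence 10 of the sequence listing;
and/or, the nuclear localization signal B is B1) or B2):
B1) a protein whose amino acid sequence is shown in sequence 11;
B2) a protein having the same function, obtained by substitution and/or deletion and/or addition of one or several amino acid residues in the amino acid sequence shown in sequence 11 of the sequence listing;
and/or, the coding gene sequence of the nuclear localization signal A is a1) or a2) or a3):
a1) the cDNA molecule or DNA molecule shown at positions 1-183 of sequence 6 of the sequence listing;
a2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in a1) and encodes the nuclear localization signal A described in claim 1 or 2;
a3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in a1) or a2) and encodes the nuclear localization signal A described in claim 1 or 2;
and/or, the coding gene sequence of the nuclear localization signal B is b1) or b2) or b3):
b1) the cDNA molecule or DNA molecule shown at positions 73-183 of sequence 6 of the sequence listing;
b2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in b1) and encodes the nuclear localization signal B described in claim 1 or 2;
b3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in b1) or b2) and encodes the nuclear localization signal B described in claim 1 or 2;
and/or, the OsACC1 gene target sequence is sequence 12, and the target site is the base A at position 7 of sequence 12.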
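3. The method according to claim 1 or 2, characterized in that:
the esgRNA has the following structure: RNA transcribed from the OsACC1 gene target sequence-esgRNA scaffold;
the esgRNA scaffold is 1) or 2) or 3):
1) the RNA molecule obtained by replacing T with U in positions 617-702 of sequence 1;
2) an RNA molecule having the same function, obtained by substitution and/or deletion and/or addition of one or several nucleotides in the RNA molecule shown in 1);
3) an RNA molecule that has 75% or more identity with the nucleotide sequence defined in 1) or 2) and has the same function.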
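4. The method according to any one of claims 1-3, characterized in that: the Cas9 nuclease is a Cas9n protein;
the Cas9n protein is C1) or C2):
C1) a protein whose amino acid sequence is shown in sequence 4;
C2) a protein having the same function, obtained by substitution and/or deletion and/or addition of one or several amino acid residues in the amino acid sequence shown in sequence 4 of the sequence listing.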
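5. The method according to any one of claims 1-4, characterized in that: the coding gene of the Cas9n protein is c1) or c2) or c3):
c1) the cDNA molecule or DNA molecule shown at positions 5035-9135 of sequence 1 of the sequence listing;
c2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in c1) and encodes the Cas9n described in claim 4;
c3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in c1) or c2) and encodes the Cas9n described in claim 4.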
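6. The method according to any one of claims 1-5, characterized in that: the adenine deaminase is an ecTadA protein and/or an ecTadA* protein;
the ecTadA protein is D1) or D2):
D1) a protein whose amino acid sequence is shown in sequence 2;
D2) a protein having the same function, obtained by substitution and/or deletion and/or addition of one or several amino acid residues in the amino acid sequence shown in sequence 2 of the sequence listing;
the ecTadA* protein is E1) or E2):
E1) a protein whose amino acid sequence is shown in sequence 3;
E2) a protein having the same function, obtained by substitution and/or deletion and/or addition of one or several amino acid residues in the amino acid sequence shown in sequence 3 of the sequence listing.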
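7. The method according to any one of claims 1-6, characterized in that: the coding gene of the ecTadA protein is d1) or d2) or d3):
d1) the cDNA molecule or DNA molecule shown at positions 3847-4344 of sequence 1 of the sequence listing;
d2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in d1) and encodes the ecTadA described in claim 6;
d3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in d1) or d2) and encodes the ecTadA described in claim 6;
the coding gene of the ecTadA* protein is e1) or e2) or e3):
e1) the cDNA molecule or DNA molecule shown at positions 4441-4938 of sequence 1 of the sequence listing;
e2) a cDNA molecule or DNA molecule that has 75% or more identity with the nucleotide sequence defined in e1) and encodes the ecTadA* described in claim 6;
e3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions with the nucleotide sequence defined in e1) or e2) and encodes the ecTadA* described in claim 6.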
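8. The method according to any one of claims 1-7, characterized in that: the esgRNA, the adenine deaminase, the Cas9 nuclease, the nuclear localization signal A and the nuclear localization signal B are expressed in the recipient rice by introducing into the recipient rice a DNA molecule that transcribes the esgRNA, the coding gene of the ecTadA protein, the coding gene of the ecTadA* protein, the coding gene of the Cas9n protein, the coding gene of the nuclear localization signal A and the coding gene of the nuclear localization signal B.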
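9. The method according to claim 8, characterized in that: the DNA molecule that transcribes the esgRNA, the coding gene of the ecTadA protein, the coding gene of the ecTadA* protein, the coding gene of the Cas9n protein, the coding gene of the nuclear localization signal A and the coding gene of the nuclear localization signal B are introduced into the recipient rice by means of a recombinant expression vector;
the recombinant expression vector comprises an expression cassette consisting, in order, of a promoter, the coding gene of the nuclear localization signal A, the coding gene of the ecTadA protein, the coding gene of the ecTadA* protein, the coding gene of the Cas9n protein, the coding gene of the nuclear localization signal B, and a terminator.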
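10. Use of the method according to any one of claims 1-9 for increasing the A·G base substitution efficiency at a target site;
or, use of the method according to any one of claims 1-9 for preparing ACCase-inhibiting-herbicide-resistant rice in which the A·G base substitution occurs only at the target site;
the target site is the base A complementary to the base T at position 6295 of sequence 14;
the A·G base substitution is a mutation from base A to base G.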
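Claims 1 and 10 link the target site (the base A complementary to the T at position 6295 of sequence 14) to the cysteine-to-arginine change at position 2099 of the OsACC1 protein. The following sketch is only an illustration of that arithmetic, assuming the standard genetic code; the variable names are hypothetical, and the fragment is positions 6241-6300 of sequence 14 as copied from the sequence listing. It shows that position 6295 is the first base of codon 2099, and that an A to G change on the complementary strand corresponds to T to C on the coding strand.

```python
# Illustration only; variable names are hypothetical, standard genetic code assumed.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Sequence 14, positions 6241-6300 (coding strand), copied from the listing.
FRAGMENT_START = 6241
FRAGMENT = (
    "cgaggagggg cttgggttgt ggttgatagc aagataaacc cagaccgcat tgagtgctat"
    .replace(" ", "")
    .upper()
)
TARGET_POS = 6295  # position named in claims 1 and 10

offset = TARGET_POS - FRAGMENT_START
coding_base = FRAGMENT[offset]            # base on the coding strand
template_base = COMPLEMENT[coding_base]   # base on the complementary strand

codon_number = (TARGET_POS - 1) // 3 + 1  # 1-based codon containing the site
codon_offset = (codon_number - 1) * 3 + 1 - FRAGMENT_START
wild_codon = FRAGMENT[codon_offset:codon_offset + 3]

# A -> G on the complementary strand is T -> C on the coding strand.
edited_coding_base = COMPLEMENT["G"]
edited = FRAGMENT[:offset] + edited_coding_base + FRAGMENT[offset + 1:]
edited_codon = edited[codon_offset:codon_offset + 3]

print(codon_number, wild_codon, CODON_TABLE[wild_codon],
      "->", edited_codon, CODON_TABLE[edited_codon])
```

Under these assumptions the wild-type codon TGC (Cys) becomes CGC (Arg), which matches the substitution stated in claim 1.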
CN201911323608.7A 2019-12-20 2019-12-20 Application of nuclear localization signal F4NLS in efficient creation of herbicide-resistant rice material Active CN110982818B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911323608.7A CN110982818B (zh) 2019-12-20 2019-12-20 Application of nuclear localization signal F4NLS in efficient creation of herbicide-resistant rice material

Publications (2)

Publication Number Publication Date
CN110982818A true CN110982818A (zh) 2020-04-10
CN110982818B CN110982818B (zh) 2022-03-08

Family

ID=70073311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911323608.7A Active CN110982818B (zh) 2019-12-20 2019-12-20 Application of nuclear localization signal F4NLS in efficient creation of herbicide-resistant rice material

Country Status (1)

Country Link
CN (1) CN110982818B (zh)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant