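The present invention is described in detail below with reference to examples, which are not to be construed as limiting the invention.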
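Example 1
Preliminary linkage analysis
To determine the approximate position of the Pi-b gene on the linkage map of rice chromosome 2, linkage analysis was first performed using DNA markers. The material was a segregating population of 94 plants obtained by self-fertilization of the F1 progeny of two backcrosses between Sasanishiki and the F1 of a Sasanishiki × TohokuIL9 cross. The analysis showed that the Pi-b gene lies between the RFLP markers C2782 and C379 on chromosome 2 and cosegregates with R1792, R257, and R2821 (Japanese Society of Breeding, 87th Meeting; Fig. 1).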
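To illustrate what cosegregation means in practice, the following is a minimal sketch that compares per-plant scores at the gene with per-plant scores at a marker. The score lists are hypothetical toy data, not results from the 94-plant population, and the function names are illustrative. It also assumes that each plant has already been assigned a genotype at the resistance locus.

```python
def discordant_plants(gene_scores, marker_scores):
    """Return the indices of plants whose marker score differs from the gene score."""
    if len(gene_scores) != len(marker_scores):
        raise ValueError("score lists must cover the same plants")
    return [
        i
        for i, (g, m) in enumerate(zip(gene_scores, marker_scores))
        if g != m
    ]


def cosegregates(gene_scores, marker_scores):
    """A marker cosegregates with the gene if no plant shows a discordant score."""
    return not discordant_plants(gene_scores, marker_scores)


# Hypothetical toy scores for eight plants (NOT data from the patent):
# A = homozygous for the susceptible-parent allele, H = heterozygous,
# B = homozygous for the resistant-parent allele.
gene = list("AHBHHABH")
marker_tight = list("AHBHHABH")   # identical pattern: cosegregating marker
marker_flank = list("AHBHAABH")   # plant 4 differs: one recombinant

print(cosegregates(gene, marker_tight))
print(discordant_plants(gene, marker_flank))
```

Markers that show discordant plants flank the gene; markers with none cannot be separated from it in the population analyzed.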
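Example 2
Fine-scale linkage analysis
A large segregating population was analyzed in order to isolate the gene. From the 94-plant population described above, 20 plants heterozygous at the Pi-b locus were selected, and about 20,000 seeds of the segregating population, including self-fertilized seeds, were used for the analysis (Japanese Society of Breeding, 89th Meeting). Pool sampling was used to reduce the workload (Churchill et al., Proc. Natl. Acad. Sci. USA 90:16-20 (1993)).
Improving the precision of linkage analysis requires more DNA markers near the target gene and a larger sampled population. YAC clones found by the preliminary linkage analysis to carry the Pi-b locus were therefore subcloned to increase the number of DNA markers near the Pi-b gene (Monna et al., Theor. Appl. Genet. 94:170-176 (1997)). Linkage analysis with the large population narrowed the Pi-b locus to the region between the RFLP markers S1916 and G7073. In addition, the Pi-b gene cosegregated with three RFLP markers (G7010, G7021, and G7023; Fig. 2).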
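Example 3
Alignment of the Pi-b locus with cosmid clones
To narrow the Pi-b locus further, alignment was carried out with cosmid clones. Genomic DNA was extracted by the CTAB method from TohokuIL9, which carries the resistance gene. The DNA was then partially digested with the restriction endonuclease Sau3A. Fragments of about 30-50 kb were fractionated from the digest by sucrose density gradient centrifugation. The resulting DNA fragments and the cosmid vector SuperCos (Stratagene; Wahl et al., Proc. Natl. Acad. Sci. USA 84:2160-2164 (1987)) were used to construct a cosmid library. The library was screened with five DNA clones near the Pi-b locus (S1916, G7010, G7021, G7023, and G7030) as probes. Six cosmid clones were selected (COS140, COS147, COS117, COS137, COS205, and COS207). Construction of restriction maps of these clones and examination of their overlapping regions showed that the Pi-b locus lies in the genomic region covered by three clones (COS140, COS147, and COS117; Fig. 2).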
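Example 4
Determination of the candidate genomic region by sequence analysis
The three aligned cosmid clones presumed to contain the Pi-b gene were subcloned, and their nucleotide sequences were partially analyzed. The sequences obtained were analyzed by BLAST homology searches against public nucleotide databases. The partial nucleotide sequences of a 2.3 kb clone from COS140 and a 4.6 kb clone from COS147 were found to contain a nucleotide binding site (NBS), which is common to the resistance genes of several plants, for example the disease resistance gene RPM1 of Arabidopsis. These nucleotide sequences were therefore expected to be candidate regions for the Pi-b gene.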
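As an illustration of the kind of NBS signature referred to above, the following minimal sketch scans a short stretch of the Pi-b protein for the P-loop (Walker A) consensus. The fragment is residues 379-394 of SEQ ID NO:1, copied from the sequence listing. The pattern `[AG]x(4)GK[ST]` is the commonly used consensus for this motif and is an assumption of this sketch, not a pattern stated in the text; the function names are illustrative.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Residues 379-394 of SEQ ID NO:1, as written in the sequence listing.
FRAGMENT = "Val Gln Val Ile Ser Val Trp Gly Met Gly Gly Leu Gly Lys Thr Thr"
FRAGMENT_START = 379

# Assumed P-loop consensus: [AG]-x(4)-G-K-[ST].
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def to_one_letter(three_letter_seq):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


def find_p_loop(three_letter_seq, start_residue):
    """Return (first residue, last residue, matched text) or None."""
    match = P_LOOP.search(to_one_letter(three_letter_seq))
    if match is None:
        return None
    first = start_residue + match.start()
    last = start_residue + match.end() - 1
    return first, last, match.group()


print(find_p_loop(FRAGMENT, FRAGMENT_START))
```

The match begins at residue 386, which falls within the P-loop position (amino acids 386-395) reported in Example 8.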
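Example 5
Isolation and sequence analysis of cDNA
cDNA was isolated to test whether the candidate region revealed by nucleotide sequence analysis is expressed in the resistant variety TohokuIL9. Seeds of TohokuIL9 were germinated and, when the seedlings had four leaves, they were inoculated with the rice blast fungus TH68-141 (race 003) by a standard method. Leaves were collected at three time points after inoculation: 6, 12, and 24 hours. Messenger RNA was extracted from the samples, and a cDNA library was constructed. The 2.3 kb fragment of the candidate genomic region was further subcloned to give a 1 kb fragment (positions 3471-4507 of SEQ ID NO:3), which was used as a probe to screen the library. Eight cDNA clones were selected. Sequence analysis of the clones showed that the nucleotide sequence of c23 matched that of cosmid clone COS140 completely. The candidate genomic region is therefore expressed in TohokuIL9. The selected clone c23 is about 4 kb long and was presumed to contain almost the entire region of the gene. The complete nucleotide sequence of this clone was determined (SEQ ID NO:2).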
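Example 6
Analysis of the expression pattern of the candidate cDNA
The candidate cDNA region showed different expression patterns in a susceptible variety (Sasanishiki) and a resistant variety (TohokuIL9). Both varieties were inoculated at the 4-leaf stage with rice blast fungus race 003, and leaves were collected 6, 12, and 24 hours after inoculation. mRNA was then extracted and used as the template for RT-PCR. The primers SEQ ID NO:4/5′-AGGGAAAAATGGAAATGTGC-3′ (antisense) and SEQ ID NO:5/5′-AGTAACCTTCTGCTGCCCAA-3′ (sense) were designed from the nucleotide sequence of cDNA clone c23 and used in RT-PCR to amplify this region specifically. The PCR conditions were: 1 cycle of 94°C for 2 minutes; 30 cycles of 94°C for 1 minute, 55°C for 2 minutes, and 72°C for 3 minutes; and 1 cycle of 72°C for 7 minutes. After PCR, specific amplification was detected in the resistant variety TohokuIL9, but no amplification was detected with mRNA from the susceptible variety Sasanishiki (Fig. 3). This indicates that cDNA clone c23 is expressed specifically in the resistant variety. In parallel, the primers SEQ ID NO:6/5′-TTACCATCCCAGCAATCAGC-3′ (sense) and SEQ ID NO:7/5′-AGACACCCTGCCACACAACA-3′ (antisense) were designed from the nucleotide sequence of the 4.6 kb fragment, which contains an NBS and is adjacent to the c23 region, and RT-PCR was performed with them. The 4.6 kb fragment region was amplified in neither the susceptible variety (Sasanishiki) nor the resistant variety (TohokuIL9) (Fig. 3). This further indicates that the genomic region corresponding to clone c23 is the Pi-b locus.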
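For amplification to occur, the two primers of a pair must anneal to opposite strands of the template. The sketch below is one way to check this for SEQ ID NO:4 and SEQ ID NO:5 against two short stretches of the c23 cDNA, copied from SEQ ID NO:2 with the spaces removed. The function names are illustrative, and the labels "listed" and "complementary" refer only to the strand as it is written in the sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(primer, template):
    """Find a primer on a template.

    Returns ("listed", index) if the primer occurs in the template as written,
    ("complementary", index) if its reverse complement occurs instead,
    or None if neither is found. Indices are 0-based within the template.
    """
    index = template.find(primer)
    if index != -1:
        return ("listed", index)
    index = template.find(reverse_complement(primer))
    if index != -1:
        return ("complementary", index)
    return None


SEQ_ID_4 = "AGGGAAAAATGGAAATGTGC"
SEQ_ID_5 = "AGTAACCTTCTGCTGCCCAA"

# Short stretches of SEQ ID NO:2 that surround the two primer sites.
FRAGMENT_A = "CATTGCTCAGGGAAAAATGGAAATGTGCACAACCTT"
FRAGMENT_B = "CATCGTTGGGCAGCAGAAGGTTACTCAACTGCA"

site_4 = locate(SEQ_ID_4, FRAGMENT_A)
site_5 = locate(SEQ_ID_5, FRAGMENT_B)
print(site_4, site_5)
print("opposite strands:", site_4[0] != site_5[0])
```

The same two functions can be run against the full cDNA sequence to obtain the primer coordinates and the expected product length.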
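Example 7
Sequence analysis of the genomic DNA of the Pi-b candidate gene
The complete nucleotide sequence of the genomic region corresponding to cDNA clone c23 was determined. Cosmid clone COS140 was subcloned by digestion with five different restriction enzymes, and the nucleotide sequences of the resulting subclones were determined from both ends as far as possible. Regions not covered by this analysis were shortened by deletion and then sequenced. The determined region extends over 10.3 kb (SEQ ID NO:3).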
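Example 8
Structure of the Pi-b gene
The Pi-b candidate cDNA c23 is 3925 base pairs long and contains an ORF of 3618 base pairs, which consists of three exons separated by two introns. The Pi-b translation product is a protein of 1205 amino acid residues (SEQ ID NO:1). It contains two NBS motifs found in many resistance genes (the P-loop at amino acids 386-395 and kinase 2 at 474-484) and three conserved regions (region 1 at amino acids 503-513, region 2 at 572-583, and region 3 at 631-638). These regions show high homology with the conserved regions of known resistance genes such as RPM1 (Fig. 4). The gene also contains, on the 3′ side, 12 imperfect leucine-rich repeats (LRR, amino acids 755-1058). These structures are highly homologous to previously reported resistance genes of the NBS-LRR class. From these results, the inventors concluded that the cDNA analyzed above and the corresponding genomic region are the rice blast resistance gene Pi-b.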
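The lengths quoted above can be cross-checked against the feature coordinates in the sequence listing (CDS 82..3696 in SEQ ID NO:2; exons 3630..4586, 5927..6682, and 6991..8973 in SEQ ID NO:3). The following sketch does that arithmetic. The helper names are illustrative, and counting the stop codon as part of the 3618 bp ORF is an assumption made here to reconcile the figures.

```python
# Feature coordinates (1-based, inclusive) taken from the sequence listing.
CDS = (82, 3696)                                     # SEQ ID NO:2
EXONS = [(3630, 4586), (5927, 6682), (6991, 8973)]   # SEQ ID NO:3


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def intron_spans(exons):
    """Intervals lying between consecutive exons."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


cds_nt = span_length(*CDS)        # coding nucleotides without the stop codon
codons = cds_nt // 3              # number of amino acid residues
orf_with_stop = cds_nt + 3        # assumption: the ORF figure includes the stop codon

exon_lengths = [span_length(*exon) for exon in EXONS]
introns = intron_spans(EXONS)
intron_lengths = [span_length(*intron) for intron in introns]

print("CDS nucleotides:", cds_nt, "codons:", codons, "ORF with stop:", orf_with_stop)
print("exon lengths:", exon_lengths, "total:", sum(exon_lengths))
print("introns:", introns, "lengths:", intron_lengths)
```

Under these assumptions the CDS gives 1205 codons and a 3618 bp ORF, in agreement with the text, and the three annotated exons total 3696 nucleotides, the same value as the CDS end coordinate in SEQ ID NO:2.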
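Sequence listing
(1) General information
(i) Applicant: NAKAGAWARA Masahiro, Director General of THE NATIONAL INSTITUTE OF AGROBIOLOGICAL RESOURCES (NIAR), MAFF SOCIETY for TECHNO-INNOVATION of AGRICULTURE, FORESTRY and FISHERIES
(ii) Title of invention: Rice gene resistant to rice blast
(iii) Number of sequences: 7
(v) Computer readable form:
(A) Medium type: diskette, 3.50 inch, 1.44 Mb storage
(B) Computer: IBM PC
(C) Operating system: MS-DOS version 3.30 or later
(D) Software:
(2) Information for SEQ ID NO:1
(i) Sequence characteristics:
(A) Length: 1205
(B) Type: amino acid
(D) Topology: linear
(ii) Molecule type: protein
(xi) Sequence description: SEQ ID NO:1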
Met Met Arg Ser Phe Met Met Glu Ala His
1 5 10
Glu Glu Gln Asp Asn Ser Lys Val Val Lys Thr Trp Val Lys Gln Val
15 20 25
Arg Asp Thr Ala Tyr Asp Val Glu Asp Ser Leu Gln Asp Phe Ala Val
30 35 40
His Leu Lys Arg Pro Ser Trp Trp Arg Phe Pro Arg Thr Leu Leu Glu
45 50 55
Arg His Arg Val Ala Lys Gln Met Lys Glu Leu Arg Asn Lys Val Glu
60 65 70
Asp Val Ser Gln Arg Asn Val Arg Tyr His Leu Ile Lys Gly Ser Ala
75 80 85 90
Lys Ala Thr Ile Asn Ser Thr Glu Gln Ser Ser Val Ile Ala Thr Ala
95 100 105
Ile Phe Gly Ile Asp Asp Ala Arg Arg Ala Ala Lys Gln Asp Asn Gln
110 115 120
Arg Val Asp Leu Val Gln Leu Ile Asn Ser Glu Asp Gln Asp Leu Lys
125 130 135
Val Ile Ala Val Trp Gly Thr Ser Gly Asp Met Gly Gln Thr Thr Ile
140 145 150
Ile Arg Met Ala Tyr Glu Asn Pro Asp Val Gln Ile Arg Phe Pro Cys
155 160 165 170
Arg Ala Trp Val Arg Val Met His Pro Phe Ser Pro Arg Asp Phe Val
175 180 185
Gln Ser Leu Val Asn Gln Leu His Ala Thr Gln Gly Val Glu Ala Leu
190 195 200
Leu Glu Lys Glu Lys Thr Glu Gln Asp Leu Ala Lys Lys Phe Asn Gly
205 210 215
Cys Val Asn Asp Arg Lys Cys Leu Ile Val Leu Asn Asp Leu Ser Thr
220 225 230
Ile Glu Glu Trp Asp Gln Ile Lys Lys Cys Phe Gln Lys Cys Arg Lys
235 240 245 250
Gly Ser Arg Ile Ile Val Ser Ser Thr Gln Val Glu Val Ala Ser Leu
255 260 265
Cys Ala Gly Gln Glu Ser Gln Ala Ser Glu Leu Lys Gln Leu Ser Ala
270 275 280
Asp Gln Thr Leu Tyr Ala Phe Tyr Asp Lys Gly Ser Gln Ile Ile Glu
285 290 295
Asp Ser Val Lys Pro Val Ser Ile Ser Asp Val Ala Ile Thr Ser Thr
300 305 310
Asn Asn His Thr Val Ala His Gly Glu Ile Ile Asp Asp Gln Ser Met
315 320 325 330
Asp Ala Asp Glu Lys Lys Val Ala Arg Lys Ser Leu Thr Arg Ile Arg
335 340 345
Thr Ser Val Gly Ala Ser Glu Glu Ser Gln Leu Ile Gly Arg Glu Lys
350 355 360
Glu Ile Ser Glu Ile Thr His Leu Ile Leu Asn Asn Asp Ser Gln Gln
365 370 375
Val Gln Val Ile Ser Val Trp Gly Met Gly Gly Leu Gly Lys Thr Thr
380 385 390
Leu Val Ser Gly Val Tyr Gln Ser Pro Arg Leu Ser Asp Lys Phe Asp
395 400 405 410
Lys Tyr Val Phe Val Thr Ile Met Arg Pro Phe Ile Leu Val Glu Leu
415 420 425
Leu Arg Ser Leu Ala Glu Gln Leu His Lys Gly Ser Ser Lys Lys Glu
430 435 440
Glu Leu Leu Glu Asn Arg Val Ser Ser Lys Lys Ser Leu Ala Ser Met
445 450 455
Glu Asp Thr Glu Leu Thr Gly Gln Leu Lys Arg Leu Leu Glu Lys Lys
460 465 470
Ser Cys Leu Ile Val Leu Asp Asp Phe Ser Asp Thr Ser Glu Trp Asp
475 480 485 490
Gln Ile Lys Pro Thr Leu Phe Pro Leu Leu Glu Lys Thr Ser Arg Ile
495 500 505
Ile Val Thr Thr Arg Lys Glu Asn Ile Ala Asn His Cys Ser Gly Lys
510 515 520
Asn Gly Asn Val His Asn Leu Lys Val Leu Lys His Asn Asp Ala Leu
525 530 535
Cys Leu Leu Ser Glu Lys Val Phe Glu Glu Ala Thr Tyr Leu Asp Asp
540 545 550
Gln Asn Asn Pro Glu Leu Val Lys Glu Ala Lys Gln Ile Leu Lys Lys
555 560 565 570
Cys Asp Gly Leu Pro Leu Ala Ile Val Val Ile Gly Gly Phe Leu Ala
575 580 585
Asn Arg Pro Lys Thr Pro Glu Glu Trp Arg Lys Leu Asn Glu Asn Ile
590 595 600
Asn Ala Glu Leu Glu Met Asn Pro Glu Leu Gly Met Ile Arg Thr Val
605 610 615
Leu Glu Lys Ser Tyr Asp Gly Leu Pro Tyr His Leu Lys Ser Cys Phe
620 625 630
Leu Tyr Leu Ser Ile Phe Pro Glu Asp Gln Ile Ile Ser Arg Arg Arg
635 640 645 650
Leu Val His Arg Trp Ala Ala Glu Gly Tyr Ser Thr Ala Ala His Gly
655 660 665
Lys Ser Ala Ile Glu Ile Ala Asn Gly Tyr Phe Met Glu Leu Lys Asn
670 675 680
Arg Ser Met Ile Leu Pro Phe Gln Gln Ser Gly Ser Ser Arg Lys Ser
685 690 695
Ile Asp Ser Cys Lys Val His Asp Leu Met Arg Asp Ile Ala Ile Ser
700 705 710
Lys Ser Thr Glu Glu Asn Leu Val Phe Arg Val Glu Glu Gly Cys Ser
715 720 725 730
Ala Tyr Ile His Gly Ala Ile Arg His Leu Ala Ile Ser Ser Asn Trp
735 740 745
Lys Gly Asp Lys Ser Glu Phe Glu Gly Ile Val Asp Leu Ser Arg Ile
750 755 760
Arg Ser Leu Ser Leu Phe Gly Asp Trp Lys Pro Phe Phe Val Tyr Gly
765 770 775
Lys Met Arg Phe Ile Arg Val Leu Asp Phe Glu Gly Thr Arg Gly Leu
780 785 790
Glu Tyr His His Leu Asp Gln Ile Trp Lys Leu Asn His Leu Lys Phe
795 800 805 810
Leu Ser Leu Arg Gly Cys Tyr Arg Ile Asp Leu Leu Pro Asp Leu Leu
815 820 825
Gly Asn Leu Arg Gln Leu Gln Met Leu Asp Ile Arg Gly Thr Tyr Val
830 835 840
Lys Ala Leu Pro Lys Thr Ile Ile Lys Leu Gln Lys Leu Gln Tyr Ile
845 850 855
His Ala Gly Arg Lys Thr Asp Tyr Val Trp Glu Glu Lys His Ser Leu
860 865 870
Met Gln Arg Cys Arg Lys Val Gly Cys Ile Cys Ala Thr Cys Cys Leu
875 880 885 890
Pro Leu Leu Cys Glu Met Tyr Gly Pro Leu His Lys Ala Leu Ala Arg
895 900 905
Arg Asp Ala Trp Thr Phe Ala Cys Cys Val Lys Phe Pro Ser Ile Met
910 915 920
Thr Gly Val His Glu Glu Glu Gly Ala Met Val Pro Ser Gly Ile Arg
925 930 935
Lys Leu Lys Asp Leu His Thr Leu Arg Asn Ile Asn Val Gly Arg Gly
940 945 950
Asn Ala Ile Leu Arg Asp Ile Gly Met Leu Thr Gly Leu His Lys Leu
955 960 965 970
Gly Val Ala Gly Ile Asn Lys Lys Asn Gly Arg Ala Phe Arg Leu Ala
975 980 985
Ile Ser Asn Leu Asn Lys Leu Glu Ser Leu Ser Val Ser Ser Ala Gly
990 995 1000
Met Pro Gly Leu Cys Gly Cys Leu Asp Asp Ile Ser Ser Pro Pro Glu
1005 1010 1015
Asn Leu Gln Ser Leu Lys Leu Tyr Gly Ser Leu Lys Thr Leu Pro Glu
1020 1025 1030
Trp Ile Lys Glu Leu Gln His Leu Val Lys Leu Lys Leu Val Ser Thr
1035 1040 1045 1050
Arg Leu Leu Glu His Asp Val Ala Met Glu Phe Leu Gly Glu Leu Pro
1055 1060 1065
Lys Val Glu Ile Leu Val Ile Ser Pro Phe Lys Ser Glu Glu Ile His
1070 1075 1080
Phe Lys Pro Pro Gln Thr Gly Thr Ala Phe Val Ser Leu Arg Val Leu
1085 1090 1095
Lys Leu Ala Gly Leu Trp Gly Ile Lys Ser Val Lys Phe Glu Glu Gly
1100 1105 1110
Thr Met Pro Lys Leu Glu Arg Leu Gln Val Gln Gly Arg Ile Glu Asn
1115 1120 1125 1130
Glu Ile Gly Phe Ser Gly Leu Glu Phe Leu Gln Asn Ile Asn Glu Val
1135 1140 1145
Gln Leu Ser Val Trp Phe Pro Thr Asp His Asp Arg Ile Arg Ala Ala
1150 1155 1160
Arg Ala Ala Gly Ala Asp Tyr Glu Thr Ala Trp Glu Glu Glu Val Gln
1165 1170 1175
Glu Ala Arg Arg Lys Gly Gly Glu Leu Lys Arg Lys Ile Arg Glu Gln
1180 1185 1190
Leu Ala Arg Asn Pro Asn Gln Pro Ile Ile Thr
1195 1200 1205
(2) Information for SEQ ID NO:2:
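(i) Sequence characteristics:
(A) Length: 3925 base pairs
(B) Type: nucleic acid
(C) Strandedness: double
(D) Topology: linear
(ii) Molecule type: cDNA to mRNA
(ix) Feature:
(A) Name/key: CDS
(B) Location: 82..3696
(C) Identification method: E
GCAAAATCTG CATTTGCTGA GGAGGTGGCC TTGCAGCTTG GTATCCAGAA AGACCACACA 60
TTTGTTGCAG ATGAGCTTGA G ATG ATG AGG TCT TTC ATG ATG GAG GCG CAC 111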
Met Met Arg Ser Phe Met Met Glu Ala His
1 5 10GAG GAG CAA GAT AAC AGC AAG GTG GTC AAG ACT TGG GTG AAG CAA GTC 159Glu Glu Gln Asp Asn Ser Lys Val Val Lys Thr Trp Val Lys Gln Val
15 20 25CGT GAC ACT GCC TAT GAT GTT GAG GAC AGC CTC CAG GAT TTC GCT GTT 207Arg Asp Thr Ala Tyr Asp Val Glu Asp Ser Leu Gln Asp Phe Ala Val
30 35 40CAT CTT AAG AGG CCA TCC TGG TGG CGA TTT CCT CGT ACG CTG CTC GAG 255His Leu Lys Arg Pro Ser Trp Trp Arg Phe Pro Arg Thr Leu Leu Glu
45 50 55CGG CAC CGT GTG GCC AAG CAG ATG AAG GAG CTT AGG AAC AAG GTC GAG 303Arg His Arg Val Ala Lys Gln Met Lys Glu Leu Arg Asn Lys Val Glu
60 65 70GAT GTC AGC CAG AGG AAT GTG CGG TAC CAC CTC ATC AAG GGC TCT GCC 351Asp Val Ser Gln Arg Asn Val Arg Tyr His Leu Ile Lys Gly Ser Ala75 80 85 90AAG GCC ACC ATC AAT TCC ACT GAG CAA TCT AGC GTT ATT GCT ACA GCC 399Lys Ala Thr Ile Asn Ser Thr Glu Gln Ser Ser Val Ile Ala Thr Ala
95 100 105ATA TTC GGC ATT GAC GAT GCA AGG CGT GCC GCA AAG CAG GAC AAT CAG 447Ile Phe Gly Ile Asp Asp Ala Arg Arg Ala Ala Lys Gln Asp Asn Gln
110 115 120AGA GTG GAT CTT GTC CAA CTA ATC AAC AGT GAG GAT CAG GAC CTA AAA 495Arg Val Asp Leu Val Gln Leu Ile Asn Ser Glu Asp Gln Asp Leu Lys
125 130 135GTG ATC GCG GTC TGG GGA ACA AGT GGT GAT ATG GGC CAA ACA ACA ATA 543Val Ile Ala Val Trp Gly Thr Ser Gly Asp Met Gly Gln Thr Thr Ile
140 145 150ATC AGG ATG GCT TAT GAG AAC CCA GAT GTC CAA ATC AGA TTC CCA TGC 591Ile Arg Met Ala Tyr Glu Asn Pro Asp Val Gln Ile Arg Phe Pro Cys155 160 165 170CGT GCA TGG GTA AGG GTG ATG CAT CCT TTC AGT CCA AGA GAC TTT GTC 639Arg Ala Trp Val Arg Val Met His Pro Phe Ser Pro Arg Asp Phe Val
175 180 185CAG AGC TTG GTG AAT CAG CTT CAT GCA ACC CAA GGG GTT GAA GCT CTG 687Gln Ser Leu Val Asn Gln Leu His Ala Thr Gln Gly Val Glu Ala Leu
190 195 200TTG GAG AAA GAG AAG ACA GAA CAA GAT TTA GCT AAG AAA TTC AAT GGA 735Leu Glu Lys Glu Lys Thr Glu Gln Asp Leu Ala Lys Lys Phe Asn Gly
205 210 215TGT GTG AAT GAT AGG AAG TGT CTA ATT GTG CTT AAT GAC CTA TCC ACC 783Cys Val Asn Asp Arg Lys Cys Leu Ile Val Leu Asn Asp Leu Ser Thr
220 225 230ATT GAA GAG TGG GAC CAG ATT AAG AAA TGC TTC CAA AAA TGC AGG AAA 831Ile Glu Glu Trp Asp Gln Ile Lys Lys Cys Phe Gln Lys Cys Arg Lys235 240 245 250GGA AGC CGA ATC ATA GTG TCA AGC ACT CAA GTT GAA GTT GCA AGC TTA 879Gly Ser Arg Ile Ile Val Ser Ser Thr Gln Val Glu Val Ala Ser Leu
255 260 265TGT GCT GGG CAA GAA AGC CAA GCC TCA GAG CTA AAG CAA TTG TCT GCT 927Cys Ala Gly Gln Glu Ser Gln Ala Ser Glu Leu Lys Gln Leu Ser Ala
270 275 280GAT CAG ACC CTT TAC GCA TTC TAC GAC AAG GGT TCC CAA ATT ATA GAG 975Asp Gln Thr Leu Tyr Ala Phe Tyr Asp Lys Gly Ser Gln Ile Ile Glu
285 290 295GAT TCA GTG AAG CCA GTG TCT ATC TCG GAT GTG GCC ATC ACA AGT ACA 1023Asp Ser Val Lys Pro Val Ser Ile Ser Asp Val Ala Ile Thr Ser Thr
300 305 310AAC AAT CAT ACA GTG GCC CAT GGT GAG ATT ATA GAT GAT CAA TCA ATG 1071Asn Asn His Thr Val Ala His Gly Glu Ile Ile Asp Asp Gln Ser Met315 320 325 330GAT GCT GAT GAG AAG AAG GTG GCT AGA AAG AGT CTT ACT CGC ATT AGG 1119Asp Ala Asp Glu Lys Lys Val Ala Arg Lys Ser Leu Thr Arg Ile Arg
335 340 345ACA AGT GTT GGT GCT TCG GAG GAA TCA CAA CTT ATT GGG CGA GAG AAA 1167Thr Ser Val Gly Ala Ser Glu Glu Ser Gln Leu Ile Gly Arg Glu Lys
350 355 360GAA ATA TCT GAA ATA ACA CAC TTA ATT TTA AAC AAT GAT AGC CAG CAG 1215Glu Ile Ser Glu Ile Thr His Leu Ile Leu Asn Asn Asp Ser Gln Gln
365 370 375GTT CAG GTG ATC TCT GTG TGG GGA ATG GGT GGC CTT GGA AAA ACC ACC 1263Val Gln Val Ile Ser Val Trp Gly Met Gly Gly Leu Gly Lys Thr Thr
380 385 390CTA GTA AGC GGT GTT TAT CAA AGC CCA AGG CTG AGT GAT AAG TTT GAC 1311Leu Val Ser Gly Val Tyr Gln Ser Pro Arg Leu Ser Asp Lys Phe Asp395 400 405 410AAG TAT GTT TTT GTC ACA ATC ATG CGT CCT TTC ATT CTT GTA GAG CTC 1359Lys Tyr Val Phe Val Thr Ile Met Arg Pro Phe Ile Leu Val Glu Leu
415 420 425CTT AGG AGT TTG GCT GAG CAA CTA CAT AAA GGA TCT TCT AAG AAG GAA 1407Leu Arg Ser Leu Ala Glu Gln Leu His Lys Gly Ser Ser Lys Lys Glu
430 435 440GAA CTG TTA GAA AAT AGA GTC AGC AGT AAG AAA TCA CTA GCA TCG ATG 1455Glu Leu Leu Glu Asn Arg Val Ser Ser Lys Lys Ser Leu Ala Ser Met
445 450 455GAG GAT ACC GAG TTG ACT GGG CAG TTG AAA AGG CTT TTA GAA AAG AAA 1503Glu Asp Thr Glu Leu Thr Gly Gln Leu Lys Arg Leu Leu Glu Lys Lys
460 465 470AGT TGC TTG ATT GTT CTA GAT GAT TTC TCA GAT ACC TCA GAA TGG GAC 1551Ser Cys Leu Ile Val Leu Asp Asp Phe Ser Asp Thr Ser Glu Trp Asp475 480 485 490CAG ATA AAA CCA ACG TTA TTC CCC CTG TTG GAA AAG ACA AGC CGA ATA 1599Gln Ile Lys Pro Thr Leu Phe Pro Leu Leu Glu Lys Thr Ser Arg Ile
495 500 505ATT GTG ACT ACA AGA AAA GAG AAT ATT GCC AAC CAT TGC TCA GGG AAA 1647Ile Val Thr Thr Arg Lys Glu Asn Ile Ala Asn His Cys Ser Gly Lys
510 515 520AAT GGA AAT GTG CAC AAC CTT AAA GTT CTT AAA CAT AAT GAT GCA TTG 1695Asn Gly Asn Val His Asn Leu Lys Val Leu Lys His Asn Asp Ala Leu
525 530 535TGC CTC TTG AGT GAG AAG GTA TTT GAG GAG GCT ACA TAT TTG GAT GAT 1743Cys Leu Leu Ser Glu Lys Val Phe Glu Glu Ala Thr Tyr Leu Asp Asp
540 545 550CAG AAC AAT CCA GAG TTG GTT AAA GAA GCA AAA CAA ATC CTA AAG AAG 1791Gln Asn Asn Pro Glu Leu Val Lys Glu Ala Lys Gln Ile Leu Lys Lys555 560 565 570TGC GAT GGA CTG CCC CTT GCA ATA GTT GTC ATA GGT GGA TTC TTG GCA 1839Cys Asp Gly Leu Pro Leu Ala Ile Val Val Ile Gly Gly Phe Leu Ala
575 580 585AAC CGA CCA AAG ACC CCA GAA GAG TGG AGA AAA TTG AAC GAG AAT ATC 1887Asn Arg Pro Lys Thr Pro Glu Glu Trp Arg Lys Leu Asn Glu Asn Ile
590 595 600AAT GCT GAG TTG GAA ATG AAT CCA GAG CTT GGA ATG ATA AGA ACC GTC 1935Asn Ala Glu Leu Glu Met Asn Pro Glu Leu Gly Met Ile Arg Thr Val
605 610 615CTT GAA AAA AGC TAT GAT GGT TTA CCA TAC CAT CTC AAG TCA TGT TTT 1983Leu Glu Lys Ser Tyr Asp Gly Leu Pro Tyr His Leu Lys Ser Cys Phe
620 625 630TTA TAT CTG TCC ATT TTC CCT GAA GAC CAG ATC ATT AGT CGA AGG CGT 2031Leu Tyr Leu Ser Ile Phe Pro Glu Asp Gln Ile Ile Ser Arg Arg Arg635 640 645 650TTG GTG CAT CGT TGG GCA GCA GAA GGT TAC TCA ACT GCA GCA CAT GGG 2079Leu Val His Arg Trp Ala Ala Glu Gly Tyr Ser Thr Ala Ala His Gly
655 660 665AAA TCT GCC ATT GAA ATA GCT AAC GGC TAC TTC ATG GAA CTC AAG AAT 2127Lys Ser Ala Ile Glu Ile Ala Asn Gly Tyr Phe Met Glu Leu Lys Asn
670 675 680AGA AGC ATG ATT TTA CCA TTC CAG CAA TCA GGT AGC AGC AGG AAA TCA 2175Arg Ser Met Ile Leu Pro Phe Gln Gln Ser Gly Ser Ser Arg Lys Ser
685 690 695ATT GAC TCT TGC AAA GTC CAT GAT CTC ATG CGT GAC ATC GCC ATC TCA 2223Ile Asp Ser Cys Lys Val His Asp Leu Met Arg Asp Ile Ala Ile Ser
700 705 710AAG TCA ACG GAG GAA AAC CTT GTT TTT AGG GTG GAG GAA GGC TGC AGC 2271Lys Ser Thr Glu Glu Asn Leu Val Phe Arg Val Glu Glu Gly Cys Ser715 720 725 730GCG TAC ATA CAT GGT GCA ATT CGT CAT CTT GCT ATA AGT AGC AAC TGG 2319Ala Tyr Ile His Gly Ala Ile Arg His Leu Ala Ile Ser Ser Asn Trp
735 740 745AAG GGA GAT AAG AGT GAA TTC GAG GGC ATA GTG GAC CTG TCC CGA ATA 2367Lys Gly Asp Lys Ser Glu Phe Glu Gly Ile Val Asp Leu Ser Arg Ile
750 755 760CGA TCG TTA TCT CTG TTT GGG GAT TGG AAG CCA TTT TTT GTT TAT GGC 2415Arg Ser Leu Ser Leu Phe Gly Asp Trp Lys Pro Phe Phe Val Tyr Gly
765 770 775AAG ATG AGG TTT ATA CGA GTG CTT GAC TTT GAA GGG ACT AGA GGT CTA 2463Lys Met Arg Phe Ile Arg Val Leu Asp Phe Glu Gly Thr Arg Gly Leu
780 785 790GAA TAT CAT CAC CTT GAT CAG ATT TGG AAG CTT AAT CAC CTA AAA TTC 2511Glu Tyr His His Leu Asp Gln Ile Trp Lys Leu Asn His Leu Lys Phe795 800 805 810CTT TCT CTA CGA GGA TGC TAT CGT ATT GAT CTA CTG CCA GAT TTA CTG 2559Leu Ser Leu Arg Gly Cys Tyr Arg Ile Asp Leu Leu Pro Asp Leu Leu
815 820 825GGC AAC CTG AGG CAA CTC CAG ATG CTA GAC ATC AGA GGT ACA TAT GTA 2607Gly Asn Leu Arg Gln Leu Gln Met Leu Asp Ile Arg Gly Thr Tyr Val
830 835 840AAG GCT TTG CCA AAA ACC ATC ATC AAG CTT CAG AAG CTA CAG TAC ATT 2655Lys Ala Leu Pro Lys Thr Ile Ile Lys Leu Gln Lys Leu Gln Tyr Ile
845 850 855CAT GCT GGG CGC AAA ACA GAC TAT GTA TGG GAG GAA AAG CAT AGT TTA 2703His Ala Gly Arg Lys Thr Asp Tyr Val Trp Glu Glu Lys His Ser Leu
860 865 870ATG CAG AGG TGT CGT AAG GTG GGA TGT ATA TGT GCA ACA TGT TGC CTC 2751Met Gln Arg Cys Arg Lys Val Gly Cys Ile Cys Ala Thr Cys Cys Leu875 880 885 890CCT CTT CTT TGC GAA ATG TAT GGC CCT CTC CAT AAG GCC CTA GCC CGG 2799Pro Leu Leu Cys Glu Met Tyr Gly Pro Leu His Lys Ala Leu Ala Arg
895 900 905CGT GAT GCG TGG ACT TTC GCT TGC TGC GTG AAA TTC CCA TCT ATC ATG 2847Arg Asp Ala Trp Thr Phe Ala Cys Cys Val Lys Phe Pro Ser Ile Met
910 915 920ACG GGA GTA CAT GAA GAG GAA GGC GCT ATG GTG CCA AGT GGG ATT AGA 2895Thr Gly Val His Glu Glu Glu Gly Ala Met Val Pro Ser Gly Ile Arg
925 930 935AAA CTG AAA GAC TTG CAC ACA CTA AGG AAC ATA AAT GTC GGA AGG GGA 2943Lys Leu Lys Asp Leu His Thr Leu Arg Asn Ile Asn Val Gly Arg Gly
940 945 950AAT GCC ATC CTA CGA GAT ATC GGA ATG CTC ACA GGA TTA CAC AAG TTA 2991Asn Ala Ile Leu Arg Asp Ile Gly Met Leu Thr Gly Leu His Lys Leu955 960 965 970GGA GTG GCT GGC ATC AAC AAG AAG AAT GGA CGA GCG TTT CGC TTG GCC 3039Gly Val Ala Gly Ile Asn Lys Lys Asn Gly Arg Ala Phe Arg Leu Ala
975 980 985ATT TCC AAC CTC AAC AAG CTG GAA TCA CTG TCT GTG AGT TCA GCA GGG 3087Ile Ser Asn Leu Asn Lys Leu Glu Ser Leu Ser Val Ser Ser Ala Gly
990 995 1000ATG CCG GGC TTG TGT GGT TGC TTG GAT GAT ATA TCC TCG CCT CCG GAA 3135Met Pro Gly Leu Cys Gly Cys Leu Asp Asp Ile Ser Ser Pro Pro Glu
1005 1010 1015AAC CTA CAG AGC CTC AAG CTG TAC GGC AGT TTG AAA ACG TTG CCG GAA 3183Asn Leu Gln Ser Leu Lys Leu Tyr Gly Ser Leu Lys Thr Leu Pro Glu
1020 1025 1030TGG ATC AAG GAG CTC CAG CAT CTC GTG AAG TTA AAA CTA GTG AGT ACT 3231Trp Ile Lys Glu Leu Gln His Leu Val Lys Leu Lys Leu Val Ser Thr1035 1040 1045 1050AGG CTA TTG GAG CAC GAC GTT GCT ATG GAA TTC CTT GGG GAA CTA CCG 3279Arg Leu Leu Glu His Asp Val Ala Met Glu Phe Leu Gly Glu Leu Pro
1055 1060 1065AAG GTG GAA ATT CTA GTT ATT TCA CCG TTT AAG AGT GAA GAA ATT CAT 3327Lys Val Glu Ile Leu Val Ile Ser Pro Phe Lys Ser Glu Glu Ile His
1070 1075 1080TTC AAG CCT CCG CAG ACT GGG ACT GCT TTT GTA AGC CTC AGG GTG CTC 3375Phe Lys Pro Pro Gln Thr Gly Thr Ala Phe Val Ser Leu Arg Val Leu
1085 1090 1095AAG CTT GCA GGA TTA TGG GGC ATC AAA TCA GTG AAG TTT GAG GAA GGA 3423Lys Leu Ala Gly Leu Trp Gly Ile Lys Ser Val Lys Phe Glu Glu Gly1100 1105 1110ACA ATG CCC AAA CTT GAG AGG CTG CAG GTC CAA GGG CGA ATA GAA AAT 3471Thr Met Pro Lys Leu Glu Arg Leu Gln Val Gln Gly Arg Ile Glu Asn1115 1120 1125 1130GAA ATT GGC TTT TCT GGG TTA GAG TTT CTC CAA AAC ATC AAC GAA GTC 3519Glu Ile Gly Phe Ser Gly Leu Glu Phe Leu Gln Asn Ile Asn Glu Val
1135 1140 1145CAG CTC AGT GTT TGG TTT CCC ACG GAT CAT GAT AGG ATA AGA GCC GCG 3567Gln Leu Ser Val Trp Phe Pro Thr Asp His Asp Arg Ile Arg Ala Ala
1150 1155 1160CGC GCC GCG GGC GCT GAT TAT GAG ACT GCC TGG GAG GAA GAG GTA CAG 3615Arg Ala Ala Gly Ala Asp Tyr Glu Thr Ala Trp Glu Glu Glu Val Gln
1165 1170 1175
GAA GCA AGG CGC AAG GGA GGT GAA CTG AAG AGG AAA ATC CGA GAA CAG 3663
Glu Ala Arg Arg Lys Gly Gly Glu Leu Lys Arg Lys Ile Arg Glu Gln
1180 1185 1190
CTT GCT CGG AAT CCA AAC CAA CCC ATC ATT ACC TGAGCTCCTT TGGAGTTACT 3716
Leu Ala Arg Asn Pro Asn Gln Pro Ile Ile Thr
1195 1200 1205
TTGCCGTGCT CCATACTATC CTACAAGTGA GATCCTCTGC AGTACTGCAT GCTCACTGAC 3776
ATGTGGACCC GAGGGGCTGT GGGGCCCACA TGTCAGTGAG CAGTACTGTG CAGTACTGCA 3836
GAGGACCTGC ATCCACTATC CTATATTATA ATGGATTGTA CTATCGATCC AACTATTCAG 3896
ATTAACTCTA TACTAGTGAA CTTATTTTT 3925
(2) Information for SEQ ID NO:3:
(i) Sequence characteristics:
(A) Length: 10322 base pairs
(B) Type: nucleic acid
(C) Strandedness: double
(D) Topology: linear
(ii) Molecule type: genomic DNA
(ix) Feature:
(A) Name/key: exon
(B) Location: 3630..4586
(C) Identification method: E
(ix) Feature:
(A) Name/key: exon
(B) Location: 5927..6682
(C) Identification method: E
(ix) Feature:
(A) Name/key: exon
(B) Location: 6991..8973
(C) Identification method: E
(xi) Sequence description: SEQ ID NO:3:
CGGCCGCATA ATACGACTCA CTATAGGGAT CTCCTCTAGA GTTACTTTGC CGTGCTCCAT 60
ACTATCCTAT TCTATATTGG ATTATACTAT CGATCCAACG ATTCAGATTA ACTCTATACT 120
AGTGAAGTCT ACACTTATGG TATGGGTAAT ATACATATGT AGTATAGTAT AGCATAAGGG 180
TATTTCATTT TGCAGGTTAG CCGTTTATCT GCTGGTGCTC CTCTTGCTGT AGTAGTGTTG 240
TTGGTGTTGC TGCTGATGAC CTAAAATGCT TGCATGTTTC TATCATGTTC TCCATAATGT 300
AGTATCATGT ACTCCATCTT CCTTGTTGGT TTTTGTCCAT AATCTCCACC TTGGCAGCTT 360
GCATCATCTT ACTCTCGAGC TTGTCCACCT TGAGATTCAA CTCCTGGAAC GCGGCTCCCA 420
GTTCATCCAC CCTCTTCTCC ACGGCAGGAA TCCGTGACTC CACCGTACGC TTGAGATCTT 480
GGTACTCCGC CCTGGTGCGC TCATCAGCCT CAACTCGTTT CTTCTCATTC CCTTCCACAA 540
GTTGCAGAAG GAGGTCCAAC TTCTTATCAG TCTCCATGGC CTCGGATCTG GATCAGGTAC 600
CTACTGCTCT CGCTCCGAAT TCCGCGAACC TTAGGGGGCA AGTTTCCTTT TCGCGGTGCC 660
GATCCGAAGA TCAGCTCCAA TCCACCCCAA GGAACAATTT CACCGCAGAA TCAAGAGAAT 720
TTGAGAAGCA AGAGAGGCTC TGATACCAGA TTGTCAGGAT CTCAAGAAAT CAGCAAAGAA 780
CAACAAGAAC ACACAAGGAT TCAGGCAACT AGTTTGGATT GATCTGCTCC AACCCAACAG 840
GATTGAGCCT TCCGCCGCCA CCGCCACCGA GTTGCCAGTT CATAGTTGTC TTTCTCGAGT 900
TCATCTTATT TATACAGTAG TATCTCCCTA CTCACACGAC ACACACAGTA GCCAGCTGTA 960
CAACAGATAG CTGGGCTACG CAACCCACTC GGACCCATGG TAACGAGGAT TGGGCTTTGG 1020
CCCTCTTGTG GGTCTTGCTC TTCCTGGAGT AGTAGTCTGT ATCTCCTCCT CCTGGACTTC 1080
AGCTTCTGCT TCATCAGGTT CTCCTTCTTC AGGTTCCTTC TCTCCCTGTT CAGCTTCTGC 1140
TTCATCAGGT TCTCCTTCTT CAGTACCCAT AGTGACAGGC AGGTTCCTGA CAAAATTCTG 1200
CTCGTTTGCG ACCAATGGTA GTGATCATAG TTGCAACCAG GAGGGGGGGG GGGGAAATCG 1260
CCGTCCCCTC CGCTCCTCTC CCGTCGTCCC CAACGCCTCG TTCGCGCATT TCGTTGAACA 1320
CCATGACGGC GCCGAATTCG CAGTGTCCGC ACATCTCCTC CTCCCCCGTC CTCTCCAAAC 1380
CCCAAACCCT ATCTCCACCC CCGAGGCAGG CGCCCCCATG CACTTGTAAG TCGATTGGAT 1440
GTCCTGTCCC AGAAGACATA TCGAGCGAGG AGGCGGAGGG GGACGAAGGG AACATATCGA 1500
GCGAGGAAGC GGAGGGTGGA TCGGCATCCC CCATTTCAAG GTACTATACT AGTCCATTAT 1560
AGTAGTAGTG CTTTTGCATC TTAGAAAAAA AAATATGTTC ATTAGCCATT GAGAGCTTCT 1620
GAAGTTGTTG ATTTTGTTCC AACCCCAACT GTGAGTTTCA GTTCAGGTCA TCCACTGATT 1680
TTCACTATGC CAATTCTCTG AAACAACTTT
ACCACTGTCA CATGAACACA CTGAAACAGT 1740TTGGTGTAGA CGTGTAGTGA AGAATGTAGC ATATATACCT TCACTTAATT TTTCTTGCAA 1800TTATTGGCCA TTACTAGTTA TGCGAGGTAG AAGTGTTCTA AGGTACTGTA TCATTTTTAT 1860GTACTAATTA ATTAAGTTTA ATAAAAACTT TTATTATCTA AAAATAAATG ACTATTACTA 1920GCTCGGTACT CCCTTTATTT TATATTATAA GACGTTTTGA TTTTTTTATA TACAACTTTC 1980TTTAAGTTTG ATTATACTTA TAAAAAATTA GCAAACATAT ATATTTTTTT TACATTAATA 2040GTGCAAGTGA GCACGCTTAA ATGCATTGTA CTTCCTTCGT AAAAAAACAT CAAACTTTTA 2100CGGACGAATA TGGATAAATG CATATCTAAA TTCATCCTCA ATAATTGATT CTTTTTGGAG 2160GAGTACAATT GGTTGGTGCG CTTTGTCCTT GGACCCTACA ATAATGATGA TTGTTTCTTT 2220AATCTATTGA CCTTGACTTA CCACATGGGC TATGTTTATC CCTTCCTGAA TCCTGAGCAC 2280TGACTACCGA GGCACCGAGT GTGAGCGGCA ACGGCGGTCA GGGAGCAGGC GTGGCTCGTC 2340GGCGAGCGGC TACGGGCAAC GGCGCCTTGG CGTCAGGCAT CCGCCGTCAC TCACCTCAAG 2400CTTGCGGGCT CTGCGACCAC CCTCTCATAG TCATAGGCCA CAGAAGGTGT AGTAGTACTT 2460CATACATTTC GAGCAGTTTC TTTCAGATTG TTTGTTTTTG AGCTTCTAAT TTTGGGATGC 2520ATTAGATAGT GATGAAAGCC TGAATTATTG GAATTTTGGT GTTGGTACTC ACACTCTCAC 2580AGTCAGAACA TACTCCTATA TATTTTGCAG CACATTTGCC TTGTGCGTGC TGTTCGTCTG 2640TTCCACTCGT GAACATCAGA CGCGAAGATT ATAGATTCAC CCCTGTTCAC AGATTCAGGT 2700ACTGCCAATT GCCTGGATGA ACACCAGTCC ATTTGCTCTC TTTCGCCTTA CAATTTTTCT 2760CTGCATTGTA CTAGCAGCCG TAGCTCGAAA GCCTCGAATA TGATTCCTTT TCAAGATTTT 2820ATATTTATGG AATATAATTC ACTTTTAAGA TGCCTTGATG GTGAAATAGT AGACATGTGA 2880GACTCCAAAT CTCGTCCTAA AAGAGCATGG AGGTAAAAAA AGAAAAAGGT AGACATCGCT 2940ATTGTAGACA TGGAGAGCTG GAATACGATT ACTTTCAAGA TATTATATTC AATGAGCATT 3000CATTCTTACA CATATGCCAC AAAGGTAAAA AAAAACAGAG AAAGAGAGAG AGAGGGGAAA 3060GAAGCCAAGT TCTTTCTTCT ACTATCATTT AGGTTGAGTT CGTTTGTTAA GGTTCCCAAC 3120CTACGATTCC TCGTTTCCCG CGTGCACGAT TCCCAAACTA CTAAATGGTA TGCTTTTTAA 3180AATATTTCGT AGAAAAATTG CTTTAAAAAA TCATATTAAT TTATTTTTTA AGTTGTTTAG 3240CTAATACTCA ATTAATCATG CATTAATTTG CCGCTCCGTT TTAGTGGAAG TCATCTGAAA 3300GGATCAAAGG AAGCAACACC AAGTCCTTAT TTCGACTCCG ACTCTCTCAC TCTCGCCATT 3360TATTCTTTTC TTTCTGTTAT TTTAAAAGTT GCTACTTTAG CTTCAGCCAC GTGAATTCTT 
3420GATATTTCAT TATTTTTCTC ATCAAACAAT AGCATCTTCT TCTGGAAATC GAATTCAGGG 3480CTTATATGTT GCTTATTCTG ATATATAGGT CTGTCACGAG GCGTATGATC ATCAACTCTG 3540CCACAAAATC CATTCAAAAA TAGAACAGAG CAATGGAGGC GACGGCGCTG AGTGTGGGCA 3600AATCCGTGCT GAATGGAGCG CTTGGCTACG CAAAATCTGC ATTTGCTGAG GAGGTGGCCT 3660TGCAGCTTGG TATCCAGAAA GACCACACAT TTGTTGCAGA TGAGCTTGAG ATG ATG 3716
Met Met
1AGG TCT TTC ATG ATG GAG GCG CAC GAG GAG CAA GAT AAC AGC AAG GTG 3764Arg Ser Phe Met Met Glu Ala His Glu Glu Gln Asp Asn Ser Lys Val
5 10 15GTC AAG ACT TGG GTG AAG CAA GTC CGT GAC ACT GCC TAT GAT GTT GAG 3812Val Lys Thr Trp Val Lys Gln Val Arg Asp Thr Ala Tyr Asp Val Glu
20 25 30GAC AGC CTC CAG GAT TTC GCT GTT CAT CTT AAG AGG CCA TCC TGG TGG 3860Asp Ser Leu Gln Asp Phe Ala Val His Leu Lys Arg Pro Ser Trp Trp35 40 45 50CGA TTT CCT CGT ACG CTG CTC GAG CGG CAC CGT GTG GCC AAG CAG ATG 3908Arg Phe Pro Arg Thr Leu Leu Glu Arg His Arg Val Ala Lys Gln Met
55 60 65AAG GAG CTT AGG AAC AAG GTC GAG GAT GTC AGC CAG AGG AAT GTG CGG 3956Lys Glu Leu Arg Asn Lys Val Glu Asp Val Ser Gln Arg Asn Val Arg
70 75 80TAC CAC CTC ATC AAG GGC TCT GCC AAG GCC ACC ATC AAT TCC ACT GAG 4004Tyr His Leu Ile Lys Gly Ser Ala Lys Ala Thr Ile Asn Ser Thr Glu
85 90 95CAA TCT AGC GTT ATT GCT ACA GCC ATA TTC GGC ATT GAC GAT GCA AGG 4052Gln Ser Ser Val Ile Ala Thr Ala Ile Phe Gly Ile Asp Asp Ala Arg
100 105 110CGT GCC GCA AAG CAG GAC AAT CAG AGA GTG GAT CTT GTC CAA CTA ATC 4100Arg Ala Ala Lys Gln Asp Asn Gln Arg Val Asp Leu Val Gln Leu Ile115 120 125 130AAC AGT GAG GAT CAG GAC CTA AAA GTG ATC GCG GTC TGG GGA ACA AGT 4148Asn Ser Glu Asp Gln Asp Leu Lys Val Ile Ala Val Trp Gly Thr Ser
135 140 145GGT GAT ATG GGC CAA ACA ACA ATA ATC AGG ATG GCT TAT GAG AAC CCA 4196Gly Asp Met Gly Gln Thr Thr Ile Ile Arg Met Ala Tyr Glu Asn Pro
150 155 160GAT GTC CAA ATC AGA TTC CCA TGC CGT GCA TGG GTA AGG GTG ATG CAT 4244Asp Val Gln Ile Arg Phe Pro Cys Arg Ala Trp Val Arg Val Met His
165 170 175CCT TTC AGT CCA AGA GAC TTT GTC CAG AGC TTG GTG AAT CAG CTT CAT 4292Pro Phe Ser Pro Arg Asp Phe Val Gln Ser Leu Val Asn Gln Leu His
180 185 190GCA ACC CAA GGG GTT GAA GCT CTG TTG GAG AAA GAG AAG ACA GAA CAA 4340Ala Thr Gln Gly Val Glu Ala Leu Leu Glu Lys Glu Lys Thr Glu Gln195 200 205 210GAT TTA GCT AAG AAA TTC AAT GGA TGT GTG AAT GAT AGG AAG TGT CTA 4388Asp Leu Ala Lys Lys Phe Asn Gly Cys Val Asn Asp Arg Lys Cys Leu
215 220 225ATT GTG CTT AAT GAC CTA TCC ACC ATT GAA GAG TGG GAC CAG ATT AAG 4436Ile Val Leu Asn Asp Leu Ser Thr Ile Glu Glu Trp Asp Gln Ile Lys
230 235 240AAA TGC TTC CAA AAA TGC AGG AAA GGA AGC CGA ATC ATA GTG TCA AGC 4484Lys Cys Phe Gln Lys Cys Arg Lys Gly Ser Arg Ile Ile Val Ser Ser
245 250 255ACT CAA GTT GAA GTT GCA AGC TTA TGT GCT GGG CAA GAA AGC CAA GCC 4532Thr Gln Val Glu Val Ala Ser Leu Cys Ala Gly Gln Glu Ser Gln Ala
260 265 270TCA GAG CTA AAG CAA TTG TCT GCT GAT CAG ACC CTT TAC GCA TTC TAC 4580Ser Glu Leu Lys Gln Leu Ser Ala Asp Gln Thr Leu Tyr Ala Phe Tyr275 280 285 290GAC AAG GTAATATACT TGCTCTTCAA GCATACCTCT CGATATCATT TTTAATTCAG 4636Asp LysTTATGCCTTT AGTAATTTCT AATTCAATTG TGTATAGGCT AGTTGAAGTG CGTGGGAGTT 4696ACCATTCCAT TAGAAACACA TGACCTAATG CAACTAACAA GTGCTCCTCC TGTTCTCTCT 4756CATTTGCCTT TTGGGAATGC ATGCACTCAA CATTTTAAGA TTACAGCCAA AATATATGTA 4816TTTGGATTTG TCAAAACAAA GATGTATGCT AGAAAAAGAA ATGGTCTAAT ACAGGTTTAC 4876AAATAAGACA ACGATGCAAA AAGGGCAACT AAAAACATAT TGATTCCCTC ATCTGCCACT 4936GCAATTGCCT TAAATTCTAG TCCATTCTAC TATCTCCGTT TCATATTATA AGTCACTCTA 4996GTTTTTTTCC AGTCAAACTT CTTTAGTTTG ACCAAGTTTA TACAAAAATT TAGCAACATA 5056TCCAACACGA AATTAGTTTC ATTAAATGTA GCATTGAATA TATTTTGATA GTATGTTTGT 5116TTTGTGTTGA AAATGCTGCT ATATTTTTTA AAAAAACTTG GTCAAACCTA AACAAGTTTG 5176ACTAGGAGAA AAGTCGAAAC GACTTATAAT ATGAAATAGA GGGAGAATGT TCGAAGTTTG 5236GCTAACGGTC AATGCTAGTG CTTTAAGTGG GTAAGCCGCA AATCCAATTA TAGGCCAAAA 5296TACATGGGTT TGTGGCTTAT TTTGGCTATA AGTGGGTTTC GCGGGTTAGC CACTTACACC 5356CCTAGTCAAT GCTAATGAAA GTAGAAGTGA TGCTATTCAA GGAAAATGTA TTGGATACCG 5416AGATTGCCTT GAATAAAGAA TAAAATTGAG GTAGTAGATT GGATAATAGA TTGACCCACA 5476AAATTGTACA AGTATGTAAT GTAGCACAAG TCCTCTTTGC ACAATTAAAA TTTTGAAGCT 5536CCTATTTCAC AAATAATTTT GATATGGATT AATTGATTTC ATATCCAATT CGCACAGTTT 5596ATTGAATTTG GAGATTTATT TCCTCTATAT GTGAGAGATG ATTGTAAAAT GGGCAAATCT 5656AGCAAATGCA TCCTCTCATC CTTTGGATTA AATGTAGTGT ACTTATCCCA TTATTTTAAA 5716GTTAAATTAA TACATATTTT ATTGAACAGT CAGATATACG TTTTTCAAAA TAGGATCCAA 5776AACTAAGGTT TATACTAGAC TGCAAATTAA TGAAAGGAAT TATCATTATT GTTTTGTATA 5836CTTTCATGAC CGAAAACAAG GCTAAACACT ATCCATGTAT GAAAATTTAA GGCTAAAAGT 5896TGTTCTTAAT CATTGCTCCC TTTTGTTTAG GGT TCC CAA ATT ATA GAG GAT TCA 5950
Gly Ser Gln Ile Ile Glu Asp Ser
295 300GTG AAG CCA GTG TCT ATC TCG GAT GTG GCC ATC ACA AGT ACA AAC AAT 5998Val Lys Pro Val Ser Ile Ser Asp Val Ala Ile Thr Ser Thr Asn Asn
305 310 315CAT ACA GTG GCC CAT GGT GAG ATT ATA GAT GAT CAA TCA ATG GAT GCT 6046His Thr Val Ala His Gly Glu Ile Ile Asp Asp Gln Ser Met Asp Ala
320 325 330GAT GAG AAG AAG GTG GCT AGA AAG AGT CTT ACT CGC ATT AGG ACA AGT 6094Asp Glu Lys Lys Val Ala Arg Lys Ser Leu Thr Arg Ile Arg Thr Ser
335 340 345GTT GGT GCT TCG GAG GAA TCA CAA CTT ATT GGG CGA GAG AAA GAA ATA 6142Val Gly Ala Ser Glu Glu Ser Gln Leu Ile Gly Arg Glu Lys Glu Ile
350 355 360TCT GAA ATA ACA CAC TTA ATT TTA AAC AAT GAT AGC CAG CAG GTT CAG 6190Ser Glu Ile Thr His Leu Ile Leu Asn Asn Asp Ser Gln Gln Val Gln365 370 375 380GTG ATC TCT GTG TGG GGA ATG GGT GGC CTT GGA AAA ACC ACC CTA GTA 6238Val Ile Ser Val Trp Gly Met Gly Gly Leu Gly Lys Thr Thr Leu Val
385 390 395AGC GGT GTT TAT CAA AGC CCA AGG CTG AGT GAT AAG TTT GAC AAG TAT 6286Ser Gly Val Tyr Gln Ser Pro Arg Leu Ser Asp Lys Phe Asp Lys Tyr
400 405 410GTT TTT GTC ACA ATC ATG CGT CCT TTC ATT CTT GTA GAG CTC CTT AGG 6334Val Phe Val Thr Ile Met Arg Pro Phe Ile Leu Val Glu Leu Leu Arg
415 420 425AGT TTG GCT GAG CAA CTA CAT AAA GGA TCT TCT AAG AAG GAA GAA CTG 6382Ser Leu Ala Glu Gln Leu His Lys Gly Ser Ser Lys Lys Glu Glu Leu
430 435 440TTA GAA AAT AGA GTC AGC AGT AAG AAA TCA CTA GCA TCG ATG GAG GAT 6430Leu Glu Asn Arg Val Ser Ser Lys Lys Ser Leu Ala Ser Met Glu Asp445 450 455 460ACC GAG TTG ACT GGG CAG TTG AAA AGG CTT TTA GAA AAG AAA AGT TGC 6478Thr Glu Leu Thr Gly Gln Leu Lys Arg Leu Leu Glu Lys Lys Ser Cys
465 470 475TTG ATT GTT CTA GAT GAT TTC TCA GAT ACC TCA GAA TGG GAC CAG ATA 6526Leu Ile Val Leu Asp Asp Phe Ser Asp Thr Ser Glu Trp Asp Gln Ile
480 485 490AAA CCA ACG TTA TTC CCC CTG TTG GAA AAG ACA AGC CGA ATA ATT GTG 6574Lys Pro Thr Leu Phe Pro Leu Leu Glu Lys Thr Ser Arg Ile Ile Val
495 500 505ACT ACA AGA AAA GAG AAT ATT GCC AAC CAT TGC TCA GGG AAA AAT GGA 6622Thr Thr Arg Lys Glu Asn Ile Ala Asn His Cys Ser Gly Lys Asn Gly
510 515 520AAT GTG CAC AAC CTT AAA GTT CTT AAA CAT AAT GAT GCA TTG TGC CTC 6670Asn Val His Asn Leu Lys Val Leu Lys His Asn Asp Ala Leu Cys Leu525 530 535 540TTG AGT GAG AAG GTAATATAAG TGTGCTCCAT TTTTCTTGGT TTGATATTCT 6722Leu Ser Glu LysTTTAATCATT TGAGTTATCC AATCAAGATG ATATTTGTGC ATGCAGAAAT AGCATATACT 6782AGATTCATAT ACAACTTAAT CTGTTCTCAC AACAATAGCA ATGCAGTTCC TAAAATGACC 6842TGCATTGGAT GGACGTTAGA TGTGACTTTG TTTTTGTATG TAATGGTGGC CTTCATTCCT 6902TAGTTTTAAT AGTAAAGACG TATTTCTAAA TTTAATTTTT TTTGTTTTAC TTTAGAGCAC 6962AATAAAGCTT AAATTGTATC AATGTCAG GTA TTT GAG GAG GCT ACA TAT TTG 7014
Val Phe Glu Glu Ala Thr Tyr Leu
                               545                 550
GAT GAT CAG AAC AAT CCA GAG TTG GTT AAA GAA GCA AAA CAA ATC CTA  7062
Asp Asp Gln Asn Asn Pro Glu Leu Val Lys Glu Ala Lys Gln Ile Leu
        555                 560                 565
AAG AAG TGC GAT GGA CTG CCC CTT GCA ATA GTT GTC ATA GGT GGA TTC  7110
Lys Lys Cys Asp Gly Leu Pro Leu Ala Ile Val Val Ile Gly Gly Phe
    570                 575                 580
TTG GCA AAC CGA CCA AAG ACC CCA GAA GAG TGG AGA AAA TTG AAC GAG  7158
Leu Ala Asn Arg Pro Lys Thr Pro Glu Glu Trp Arg Lys Leu Asn Glu
585                 590                 595                 600
AAT ATC AAT GCT GAG TTG GAA ATG AAT CCA GAG CTT GGA ATG ATA AGA  7206
Asn Ile Asn Ala Glu Leu Glu Met Asn Pro Glu Leu Gly Met Ile Arg
                605                 610                 615
ACC GTC CTT GAA AAA AGC TAT GAT GGT TTA CCA TAC CAT CTC AAG TCA  7254
Thr Val Leu Glu Lys Ser Tyr Asp Gly Leu Pro Tyr His Leu Lys Ser
            620                 625                 630
TGT TTT TTA TAT CTG TCC ATT TTC CCT GAA GAC CAG ATC ATT AGT CGA  7302
Cys Phe Leu Tyr Leu Ser Ile Phe Pro Glu Asp Gln Ile Ile Ser Arg
        635                 640                 645
AGG CGT TTG GTG CAT CGT TGG GCA GCA GAA GGT TAC TCA ACT GCA GCA  7350
Arg Arg Leu Val His Arg Trp Ala Ala Glu Gly Tyr Ser Thr Ala Ala
    650                 655                 660
CAT GGG AAA TCT GCC ATT GAA ATA GCT AAC GGC TAC TTC ATG GAA CTC  7398
His Gly Lys Ser Ala Ile Glu Ile Ala Asn Gly Tyr Phe Met Glu Leu
665                 670                 675                 680
AAG AAT AGA AGC ATG ATT TTA CCA TTC CAG CAA TCA GGT AGC AGC AGG  7446
Lys Asn Arg Ser Met Ile Leu Pro Phe Gln Gln Ser Gly Ser Ser Arg
                685                 690                 695
AAA TCA ATT GAC TCT TGC AAA GTC CAT GAT CTC ATG CGT GAC ATC GCC  7494
Lys Ser Ile Asp Ser Cys Lys Val His Asp Leu Met Arg Asp Ile Ala
            700                 705                 710
ATC TCA AAG TCA ACG GAG GAA AAC CTT GTT TTT AGG GTG GAG GAA GGC  7542
Ile Ser Lys Ser Thr Glu Glu Asn Leu Val Phe Arg Val Glu Glu Gly
        715                 720                 725
TGC AGC GCG TAC ATA CAT GGT GCA ATT CGT CAT CTT GCT ATA AGT AGC  7590
Cys Ser Ala Tyr Ile His Gly Ala Ile Arg His Leu Ala Ile Ser Ser
    730                 735                 740
AAC TGG AAG GGA GAT AAG AGT GAA TTC GAG GGC ATA GTG GAC CTG TCC  7638
Asn Trp Lys Gly Asp Lys Ser Glu Phe Glu Gly Ile Val Asp Leu Ser
745                 750                 755                 760
CGA ATA CGA TCG TTA TCT CTG TTT GGG GAT TGG AAG CCA TTT TTT GTT  7686
Arg Ile Arg Ser Leu Ser Leu Phe Gly Asp Trp Lys Pro Phe Phe Val
                765                 770                 775
TAT GGC AAG ATG AGG TTT ATA CGA GTG CTT GAC TTT GAA GGG ACT AGA  7734
Tyr Gly Lys Met Arg Phe Ile Arg Val Leu Asp Phe Glu Gly Thr Arg
            780                 785                 790
GGT CTA GAA TAT CAT CAC CTT GAT CAG ATT TGG AAG CTT AAT CAC CTA  7782
Gly Leu Glu Tyr His His Leu Asp Gln Ile Trp Lys Leu Asn His Leu
        795                 800                 805
AAA TTC CTT TCT CTA CGA GGA TGC TAT CGT ATT GAT CTA CTG CCA GAT  7830
Lys Phe Leu Ser Leu Arg Gly Cys Tyr Arg Ile Asp Leu Leu Pro Asp
    810                 815                 820
TTA CTG GGC AAC CTG AGG CAA CTC CAG ATG CTA GAC ATC AGA GGT ACA  7878
Leu Leu Gly Asn Leu Arg Gln Leu Gln Met Leu Asp Ile Arg Gly Thr
825                 830                 835                 840
TAT GTA AAG GCT TTG CCA AAA ACC ATC ATC AAG CTT CAG AAG CTA CAG  7926
Tyr Val Lys Ala Leu Pro Lys Thr Ile Ile Lys Leu Gln Lys Leu Gln
                845                 850                 855
TAC ATT CAT GCT GGG CGC AAA ACA GAC TAT GTA TGG GAG GAA AAG CAT  7974
Tyr Ile His Ala Gly Arg Lys Thr Asp Tyr Val Trp Glu Glu Lys His
            860                 865                 870
AGT TTA ATG CAG AGG TGT CGT AAG GTG GGA TGT ATA TGT GCA ACA TGT  8022
Ser Leu Met Gln Arg Cys Arg Lys Val Gly Cys Ile Cys Ala Thr Cys
        875                 880                 885
TGC CTC CCT CTT CTT TGC GAA ATG TAT GGC CCT CTC CAT AAG GCC CTA  8070
Cys Leu Pro Leu Leu Cys Glu Met Tyr Gly Pro Leu His Lys Ala Leu
    890                 895                 900
GCC CGG CGT GAT GCG TGG ACT TTC GCT TGC TGC GTG AAA TTC CCA TCT  8118
Ala Arg Arg Asp Ala Trp Thr Phe Ala Cys Cys Val Lys Phe Pro Ser
905                 910                 915                 920
ATC ATG ACG GGA GTA CAT GAA GAG GAA GGC GCT ATG GTG CCA AGT GGG  8166
Ile Met Thr Gly Val His Glu Glu Glu Gly Ala Met Val Pro Ser Gly
                925                 930                 935
ATT AGA AAA CTG AAA GAC TTG CAC ACA CTA AGG AAC ATA AAT GTC GGA  8214
Ile Arg Lys Leu Lys Asp Leu His Thr Leu Arg Asn Ile Asn Val Gly
            940                 945                 950
AGG GGA AAT GCC ATC CTA CGA GAT ATC GGA ATG CTC ACA GGA TTA CAC  8262
Arg Gly Asn Ala Ile Leu Arg Asp Ile Gly Met Leu Thr Gly Leu His
        955                 960                 965
AAG TTA GGA GTG GCT GGC ATC AAC AAG AAG AAT GGA CGA GCG TTT CGC  8310
Lys Leu Gly Val Ala Gly Ile Asn Lys Lys Asn Gly Arg Ala Phe Arg
    970                 975                 980
TTG GCC ATT TCC AAC CTC AAC AAG CTG GAA TCA CTG TCT GTG AGT TCA  8358
Leu Ala Ile Ser Asn Leu Asn Lys Leu Glu Ser Leu Ser Val Ser Ser
985                 990                 995                 1000
GCA GGG ATG CCG GGC TTG TGT GGT TGC TTG GAT GAT ATA TCC TCG CCT  8406
Ala Gly Met Pro Gly Leu Cys Gly Cys Leu Asp Asp Ile Ser Ser Pro
                1005                1010                1015
CCG GAA AAC CTA CAG AGC CTC AAG CTG TAC GGC AGT TTG AAA ACG TTG  8454
Pro Glu Asn Leu Gln Ser Leu Lys Leu Tyr Gly Ser Leu Lys Thr Leu
            1020                1025                1030
CCG GAA TGG ATC AAG GAG CTC CAG CAT CTC GTG AAG TTA AAA CTA GTG  8502
Pro Glu Trp Ile Lys Glu Leu Gln His Leu Val Lys Leu Lys Leu Val
        1035                1040                1045
AGT ACT AGG CTA TTG GAG CAC GAC GTT GCT ATG GAA TTC CTT GGG GAA  8550
Ser Thr Arg Leu Leu Glu His Asp Val Ala Met Glu Phe Leu Gly Glu
    1050                1055                1060
CTA CCG AAG GTG GAA ATT CTA GTT ATT TCA CCG TTT AAG AGT GAA GAA  8598
Leu Pro Lys Val Glu Ile Leu Val Ile Ser Pro Phe Lys Ser Glu Glu
1065                1070                1075                1080
ATT CAT TTC AAG CCT CCG CAG ACT GGG ACT GCT TTT GTA AGC CTC AGG  8646
Ile His Phe Lys Pro Pro Gln Thr Gly Thr Ala Phe Val Ser Leu Arg
                1085                1090                1095
GTG CTC AAG CTT GCA GGA TTA TGG GGC ATC AAA TCA GTG AAG TTT GAG  8694
Val Leu Lys Leu Ala Gly Leu Trp Gly Ile Lys Ser Val Lys Phe Glu
            1100                1105                1110
GAA GGA ACA ATG CCC AAA CTT GAG AGG CTG CAG GTC CAA GGG CGA ATA  8742
Glu Gly Thr Met Pro Lys Leu Glu Arg Leu Gln Val Gln Gly Arg Ile
        1115                1120                1125
GAA AAT GAA ATT GGC TTT TCT GGG TTA GAG TTT CTC CAA AAC ATC AAC  8790
Glu Asn Glu Ile Gly Phe Ser Gly Leu Glu Phe Leu Gln Asn Ile Asn
    1130                1135                1140
GAA GTC CAG CTC AGT GTT TGG TTT CCC ACG GAT CAT GAT AGG ATA AGA  8838
Glu Val Gln Leu Ser Val Trp Phe Pro Thr Asp His Asp Arg Ile Arg
1145                1150                1155                1160
GCC GCG CGC GCC GCG GGC GCT GAT TAT GAG ACT GCC TGG GAG GAA GAG  8886
Ala Ala Arg Ala Ala Gly Ala Asp Tyr Glu Thr Ala Trp Glu Glu Glu
                1165                1170                1175
GTA CAG GAA GCA AGG CGC AAG GGA GGT GAA CTG AAG AGG AAA ATC CGA  8934
Val Gln Glu Ala Arg Arg Lys Gly Gly Glu Leu Lys Arg Lys Ile Arg
            1180                1185                1190
GAA CAG CTT GCT CGG AAT CCA AAC CAA CCC ATC ATT ACC TGAGCTCCTT  8983
Glu Gln Leu Ala Arg Asn Pro Asn Gln Pro Ile Ile Thr
        1195                1200                1205
TGGAGTTACT TTGCCGTGCT CCATACTATC CTACAAGTGA GATCCTCTGC AGTACTGCAT  9043
GCTCACTGAC ATGTGGACCC GAGGGGCTGT GGGGCCCACA TGTCAGTGAG CAGTACTGTG  9103
CAGTACTGCA GAGGACCTGC ATCCACTATC CTATATTATA ATGGATTGTA CTATCGATCC  9163
AACTATTCAG ATTAACTCTA TACTAGTGAA CTTATTTTTT TTTGCCGGGC CGGCAAATAG  9223
CTGGTCGATG TATATTAAGA ATAAGAAAGG GAATGTACAA GATAGCGCGG TGCGTCAATG  9283
CACCACCATT ACAGACGTAA AAGGAAAGCT AAAATCTCAC AGAATGAGTT GCTACAGAGT  9343
GACACATGGG GCTAACAAGA CCTGCAGCTA TCCAAGTCTC CCATTCATCC CCCATGGCAG  9403
AACAGAACTG GGGAACCGTT GCCGCGATCC CTTCAAACAC CCTTGCGTTT CGCTCTTTCG  9463
AAATCAACCA GGTTACAAGG ATCACCCTTG CATCGAACGT TTTGCGGTCA ACCTTAGCAA  9523
CAGATTTCCG GGCTGCAAGC CACCAATCAG CAAAATCAGC CGACGATGAG GAGCACGAAA  9583
GGACCAGGCG TGTGCGCACC TGACCTCAAA TCTCCTGGGT GTAAGAGCAG CCCACGAAGA  9643
TGTGCTGGCA GGTTTCCCCG TCATTGGAGC AGAAATAGCA CACCGGAGCA AGCTTCCATC  9703
TGTGACGTTG TAGATTGTTG GCAGTGAGGC AAGCATTGCG CTCGGCGAGA AACATAAAGA  9763
ACTTACATCT CGCCGGGGCA AGAGACTTCC AAATAATGGT ATACATATGT AGTATATAGT  9823
ATAGTATAGT ATAGTATAAG GGTATTCATT TTGCAGGTTA GCGGTTATCT GCTGCTGTTC  9883
CTCCTGCTGC GGCGTGCTGG AGTAGTGTTG TTGGTGGTGG TGCTGATGAC CTAAAATGCT  9943
TGCTTGTTTC TATCAAGTTC TCCAGAATGT AGTATGTACT GCATCTTGTT GATTTTTGTC  10003
CATAAACGGA TTGCATTATC TGTATATGAC CCAATCAACA ATAAACGGTG TTGCATTTTG  10063
TTCCTAAAAG CTCTTAGAGT CTGACCAGTT ATCTCTGTAC GCATCTTCAT GCTGTTCTTT  10123
GGGCACTGGT CATGGTTAAA TCACAGTTCA CCGAAACTTA TTTTCTGTAG ACTTATTCTG  10183
AAATACTGAG AAATTGAAAT GTAGTAACTA TTGTCTGTAG ACTGCTTTCT CGTTTTTCTT  10243
TTGCGGTCGC CATCTCCAGT CAGTATCTAC AGAAGAAGAG CCAATGCAGC CTATTGTCCT  10303
TTTTTTGCCG GGTCGGCCG  10322
(2) Information for SEQ ID NO:4:
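(i) Sequence characteristics:
(A) Length: 20 base pairs
(B) Type: nucleic acid
(C) Strandedness: single
(D) Topology: linear
(ii) Molecule type: other nucleic acid, synthetic DNA
(ix) Sequence description: SEQ ID NO:4:
AGGGAAAAAT GGAAATGTGC
(2) Information for SEQ ID NO:5: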
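(i) Sequence characteristics:
(A) Length: 20 base pairs
(B) Type: nucleic acid
(C) Strandedness: single
(D) Topology: linear
(ii) Molecule type: other nucleic acid, synthetic DNA
(ix) Sequence description: SEQ ID NO:5:
AGTAACCTTC TGCTGCCCAA
(2) Information for SEQ ID NO:6: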
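(i) Sequence characteristics:
(A) Length: 20 base pairs
(B) Type: nucleic acid
(C) Strandedness: single
(D) Topology: linear
(ii) Molecule type: other nucleic acid, synthetic DNA
(ix) Sequence description: SEQ ID NO:6:
TTACCATCCC AGCAATCAGC
(2) Information for SEQ ID NO:7: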
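(i) Sequence characteristics:
(A) Length: 20 base pairs
(B) Type: nucleic acid
(C) Strandedness: single
(D) Topology: linear
(ii) Molecule type: other nucleic acid, synthetic DNA
(ix) Sequence description: SEQ ID NO:7:
AGACACCCTG CCACACAACA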
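
As an illustration only, the following minimal Python sketch shows one way to check where a primer pair from a listing such as this one would anneal on a template and how long the resulting amplicon would be. The function names (`normalize`, `reverse_complement`, `locate_amplicon`) and the toy template are hypothetical and not part of this disclosure. The sketch assumes exact matching, with the first primer identical to the template strand as written and the second primer complementary to it.

```python
# Illustrative sketch: locate a primer pair on a template and report the amplicon span.
# All helper names and the toy template are assumptions made for this example.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq: str) -> str:
    """Uppercase the input and keep only A/C/G/T, dropping layout spaces and digits."""
    return "".join(ch for ch in seq.upper() if ch in "ACGT")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return normalize(seq).translate(_COMPLEMENT)[::-1]


def locate_amplicon(template: str, primer_a: str, primer_b: str):
    """Return the 0-based half-open span (start, end) of the amplicon, or None.

    primer_a must occur in the template as written; primer_b must occur as its
    reverse complement downstream of primer_a. Only exact matches are found.
    """
    t = normalize(template)
    a = normalize(primer_a)
    b_rc = reverse_complement(primer_b)
    start = t.find(a)
    if start == -1:
        return None
    end = t.find(b_rc, start + len(a))
    if end == -1:
        return None
    return start, end + len(b_rc)


SEQ_ID_NO_4 = "AGGGAAAAAT GGAAATGTGC"
SEQ_ID_NO_5 = "AGTAACCTTC TGCTGCCCAA"

# Hypothetical toy template: primer 4, a 20-base spacer, then the reverse
# complement of primer 5, flanked by two bases on each side.
toy_template = (
    "GG" + normalize(SEQ_ID_NO_4) + "ACGT" * 5 + reverse_complement(SEQ_ID_NO_5) + "CC"
)
span = locate_amplicon(toy_template, SEQ_ID_NO_4, SEQ_ID_NO_5)
print(span, span[1] - span[0])
```

To apply the same check to a real template, the sequence rows of a listing can be passed directly as `template`, since `normalize` discards the spacing and position numbers; note that it would also discard ambiguity codes such as N, which this sketch does not handle.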