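Fibroblast growth factor 21 variants, fusion proteins thereof and uses thereof
Technical Field
The present invention belongs to the field of biopharmaceuticals. Specifically, the invention relates to variants of fibroblast growth factor 21. More specifically, the invention also relates to fusion proteins comprising a fibroblast growth factor 21 variant, a GLP-1 variant and an Fc sequence, and to uses thereof.
Background Art
The sedentary lifestyle and excessive caloric intake of modern populations have aggravated the global epidemics of obesity, non-alcoholic fatty liver disease and type 2 diabetes. These defects in energy metabolism can further lead to serious cardiovascular disease and even tumorigenesis. Means of effectively treating obesity and its complications are currently limited, and there is an urgent need for new drugs that have few side effects and can correct imbalances in energy metabolism.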
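Fibroblast growth factor 21 (FGF21) is a member of the fibroblast growth factor (FGF) family. It is an important metabolic regulator that participates in regulating the balance of energy, glucose and lipid metabolism by activating FGF receptors (FGFRs), which belong to the family of transmembrane receptor tyrosine kinases, together with the co-receptor β-klotho (KLB) (Sonoda J, Chen MZ, Baruch A. Hormone Molecular Biology and Clinical Investigation, 2017, 30(2):1-13). Wild-type human FGF21 is a secreted polypeptide of 181 amino acids whose amino acid sequence shares 81% homology with that of mouse FGF21. The N-terminus of the human FGF21 sequence is involved in the interaction with FGFRs, whereas the C-terminal sequence is essential for binding the co-receptor KLB (Micanovic R, Raches DW, Dunbar JD, et al. Journal of Cellular Physiology, 2009, 219(2):227-234). FGF21 relieves hyperglycemia, lowers triglyceride levels and improves lipid metabolism mainly by activating AMPK/SIRT1/PGC1α (Chau MD, Gao J, Yang Q, et al. Proceedings of the National Academy of Sciences USA, 2010, 107(28):12553-12558). FGF21 is regarded as an effective target for treating various metabolic diseases. For example, injecting recombinant FGF21 protein into mice and human subjects can lower serum glucose, triglyceride and cholesterol levels, increase insulin sensitivity, promote energy metabolism, and reduce fatty liver and obesity (Hecht R, Li YS, Sun J, et al. PLoS One, 2012, 7(11):e49345; Kharitonenkov A, Beals JM, Micanovic R, et al. PLoS One, 2013, 8(3):e58575). However, the in vivo half-life of FGF21 is very short, only 0.5 to 2 h in primates. In addition, FGF21 in blood is readily cleaved by the protease DPPIV at the P2 and P4 sites at the N-terminus and by fibroblast activation protein (FAP) at the P171 site at the C-terminus, and thereby loses activity (Sonoda J, Chen MZ, Baruch A. Hormone Molecular Biology and Clinical Investigation, 2017, 30(2):1-13). These problems are major challenges in developing FGF21 as a drug for treating metabolic diseases.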
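Glucagon-like peptide-1 (GLP-1) is a member of the glucagon peptide family. It is an endogenous incretin involved in glucose transport and metabolism (Lee S, Lee DY. Annals of Pediatric Endocrinology & Metabolism, 2017, 22(1):15-26). GLP-1 exists in two forms in the human body: GLP-1(7-36), which is secreted mainly by pancreatic tissue, and GLP-1(7-37), which is secreted mainly by the intestine. GLP-1 activates the downstream cAMP-dependent signaling pathway by activating the GLP-1 receptor (GLP-1R), a member of the G protein-coupled receptor family. GLP-1 receptor agonists are currently a major focus in the treatment of type 2 diabetes, and several such drugs have been approved for clinical use in type 2 diabetes, for example liraglutide from Novo Nordisk and dulaglutide from Eli Lilly. These GLP-1 receptor agonists also reduce body weight, but they do so mainly by suppressing appetite and limiting food intake, which lowers patients' quality of life (Glaesner W, Vick AM, Millican R, et al. Diabetes/Metabolism Research and Reviews, 2010, 26(4):287-296).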
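Research on fusion proteins has made considerable progress in recent years, and their eventual clinical application looks increasingly promising. In general, however, preparing a fusion protein directly from wild-type protein sequences affects its spatial structure and therefore its activity. Patent application No. CN201280057819.0 discloses novel proteins comprising fibroblast growth factor 21 (FGF21) and other metabolic regulators known to improve the metabolic profile of the subject to whom they are administered, including variants thereof. It also discloses methods for treating FGF21-related diseases, GLP-1-related diseases and exendin-4-related diseases, including metabolic conditions. However, the activity of the fusion proteins obtained in that disclosure is not very high, so frequent dosing is required in actual clinical use, and clinical compliance still needs to be improved.
Therefore, there remains a need for therapeutic agents for FGF21-related diseases that have higher activity and better compliance.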
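Summary of the Invention
Accordingly, the object of the invention is to address the deficiencies of the prior art by providing a fusion protein having both GLP-1 and FGF21 activity; the invention also provides a method for preparing the protein and uses thereof. The invention further provides the use of the protein in treating or preventing metabolic diseases, including obesity, hyperlipidemia, diabetes and cardio-cerebrovascular diseases. Compared with the prior art, the fusion protein provided by the invention has high activity, a long half-life and a novel structure, and can significantly reduce blood glucose, blood lipids and body weight and improve fat metabolism.
In one aspect, the invention provides a variant of human fibroblast growth factor 21 (FGF21), the amino acid sequence of which is shown in the following general formula I:
General formula I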
DSSPLLQFGGQVRQX15YLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFREX94LLEDGYNVYQSEAHGLPLHX114PGNKSPHRDPAPRGPX130RFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYX176S
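wherein X15 is Arg or Val;
X94 is Leu or Arg;
X114 is Leu or Cys;
X130 is Ala or Cys;
X176 is Ala or Glu;
and exactly one of X94 and X114 is Leu, and at most one of X130 and X176 is Ala;
Preferably, its amino acid sequence is as shown in any one of SEQ ID NO:1-4.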
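The constraints of general formula I can be checked mechanically. The following is a minimal sketch, not part of the original disclosure: the template string is general formula I with the five variable positions turned into named fields, and the residue choices for Fv2 to Fv5 are read off the sequences given in this document as SEQ ID NO:1-4. The function names `satisfies_formula_i` and `build_variant` are hypothetical and introduced only for illustration.

```python
# General formula I with the five variable positions as named fields.
FORMULA_I = (
    "DSSPLLQFGGQVRQ{X15}"
    "YLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRE{X94}"
    "LLEDGYNVYQSEAHGLPLH{X114}"
    "PGNKSPHRDPAPRGP{X130}"
    "RFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSY{X176}S"
)

# Residues permitted at each variable position (one-letter codes).
ALLOWED = {
    "X15": {"R", "V"},
    "X94": {"L", "R"},
    "X114": {"L", "C"},
    "X130": {"A", "C"},
    "X176": {"A", "E"},
}


def satisfies_formula_i(choice):
    """Return True if the residue choice obeys general formula I."""
    if set(choice) != set(ALLOWED):
        return False
    if any(choice[pos] not in ALLOWED[pos] for pos in ALLOWED):
        return False
    # Exactly one of X94 and X114 is Leu.
    exactly_one_leu = (choice["X94"] == "L") != (choice["X114"] == "L")
    # At most one of X130 and X176 is Ala.
    at_most_one_ala = not (choice["X130"] == "A" and choice["X176"] == "A")
    return exactly_one_leu and at_most_one_ala


def build_variant(choice):
    """Fill the template; raise ValueError for a disallowed combination."""
    if not satisfies_formula_i(choice):
        raise ValueError("residue choice violates general formula I")
    return FORMULA_I.format(**choice)


# Residue choices read off SEQ ID NO:1-4 as listed in this document.
VARIANTS = {
    "Fv2": dict(X15="R", X94="L", X114="C", X130="C", X176="A"),
    "Fv3": dict(X15="V", X94="L", X114="C", X130="C", X176="A"),
    "Fv4": dict(X15="V", X94="R", X114="L", X130="A", X176="E"),
    "Fv5": dict(X15="V", X94="L", X114="C", X130="C", X176="E"),
}

if __name__ == "__main__":
    for name, choice in VARIANTS.items():
        print(name, len(build_variant(choice)))
```

A combination with Leu at both X94 and X114 is rejected by the check, because the formula requires exactly one Leu across those two positions.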
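In another aspect, the invention provides a fusion protein represented by one of the following general formulae:
G-L-Fc-L-F; or
G-L-G-L-Fc-L-F;
wherein F represents the human FGF21 variant of the invention;
G represents a GLP-1 variant (GLP-1v), the amino acid sequence of which is shown in SEQ ID NO:5;
L represents a linker sequence;
Fc represents a human or animal immunoglobulin or a subtype or variant thereof, human or animal albumin or a variant thereof, or PEG.
In the fusion protein according to the invention, the linker L has the general formula (GGGGS)n, where n is an integer from 0 to 5; preferably, n is 3;
Preferably, Fc represents an IgG4 Fc fragment; more preferably, the Fc comprises the amino acid sequence shown in SEQ ID NO:17.
In the fusion protein according to the invention, the fusion protein further comprises other antigens, functional amino acid sequences and/or a signal peptide sequence; preferably, the functional amino acid sequence is a histidine tag or a GST tag;
Preferably, the amino acid sequence of the fusion protein is as shown in any one of SEQ ID NO:6-9, 18 or 24-26.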
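In another aspect, the invention provides a fusion gene, wherein the fusion gene contains a nucleotide sequence encoding the human FGF21 variant or the fusion protein of the invention; the nucleotide sequence encoding the FGF21 variant is as shown in any one of SEQ ID NO:20-23;
the nucleotide sequence encoding the fusion protein is as shown in any one of SEQ ID NO:10-13, 19 or 27-29.
In a further aspect, the invention provides an expression construct, wherein the expression construct contains a nucleotide sequence encoding the human FGF21 variant or the fusion protein of the invention.
In the expression construct according to the invention, the expression construct is a prokaryotic expression construct; preferably, the prokaryotic expression construct is a vector of the pET series;
or the expression construct is a eukaryotic expression construct; preferably, the eukaryotic expression construct is a plasmid DNA vector, preferably a pVAX1 vector or a pSV1.0 vector; a recombinant viral vector, preferably a recombinant vaccinia virus vector, a recombinant adenovirus vector or a recombinant adeno-associated virus vector; or a retroviral vector, preferably an HIV viral vector or a lentiviral vector.
In another aspect, the invention provides a host cell, wherein the host cell comprises the expression construct of the invention;
Preferably, when the expression construct is a prokaryotic expression construct, the host cell is a prokaryotic cell, preferably a bacterial cell; or when the expression construct is a eukaryotic expression construct, the host cell is a eukaryotic cell, preferably a mammalian cell, more preferably a CHO cell.
In another aspect, the invention provides a pharmaceutical composition comprising the human FGF21 variant or the fusion protein according to the invention.
In another aspect, the invention provides a method for preparing the human FGF21 variant or the fusion protein, the method comprising the step of cloning the nucleotide sequence encoding the variant or the fusion protein into an expression vector;
Preferably, the preparation method comprises the following steps:
1) constructing the nucleic acid sequence of the above human FGF21 variant or fusion protein;
2) constructing an expression vector containing the nucleic acid sequence of step 1);
3) transfecting or transforming a host cell with the expression vector of step 2) and expressing the nucleic acid sequence in the host cell;
4) purifying the protein expressed in step 3);
More preferably, in step 3), the host cell is a CHO-S cell.
The invention also provides the use of the above human FGF21 variant or fusion protein, fusion gene, expression construct, host cell or pharmaceutical composition in the preparation of a medicament for diabetes, obesity, hyperlipidemia and cardio-cerebrovascular diseases.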
The amino acid sequences of the human FGF21 variants (FGF21v) Fv2, Fv3, Fv4 and Fv5 of the invention are shown in SEQ ID NO:1, 2, 3 and 4, respectively:
Fv2 SEQ ID NO:1
DSSPLLQFGGQVRQRYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRELLLEDGYNVYQSEAHGLPLHCPGNKSPHRDPAPRGPCRFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYAS
Fv3 SEQ ID NO:2
DSSPLLQFGGQVRQVYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRELLLEDGYNVYQSEAHGLPLHCPGNKSPHRDPAPRGPCRFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYAS
Fv4 SEQ ID NO:3
DSSPLLQFGGQVRQVYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRERLLEDGYNVYQSEAHGLPLHLPGNKSPHRDPAPRGPARFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYES
Fv5 SEQ ID NO:4
DSSPLLQFGGQVRQVYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRELLLEDGYNVYQSEAHGLPLHCPGNKSPHRDPAPRGPCRFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYES
The nucleotide sequences of the human FGF21 variants (FGF21v) Fv2, Fv3, Fv4 and Fv5 of the invention are shown in SEQ ID NO:20, 21, 22 and 23, respectively:
Fv2 SEQ ID NO:20
GACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGCGGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACTGCCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCATGCCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGCTTCCTGA
Fv3 SEQ ID NO:21
GACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACTGCCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCATGCCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGCTTCCTGA
Fv4 SEQ ID NO:22
GACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCGGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACCTGCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCAGCTCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGAGTCCTGA
Fv5 SEQ ID NO:23
GACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACTGCCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCATGCCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGAGTCCTGA
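A coding sequence can be cross-checked against its amino acid sequence by translating it with the standard genetic code. The sketch below is illustrative only and assumes the standard code applies; `translate` is a hypothetical helper name. The short test strings are the first 45 nucleotides of SEQ ID NO:20 and the last 12 nucleotides of SEQ ID NO:23 as listed above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for i in range(0, len(dna), 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 45 nt of SEQ ID NO:20 (Fv2): expected to give residues 1-15.
    print(translate("GACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGCGG"))
    # Last 12 nt of SEQ ID NO:23 (Fv5), including the TGA stop codon.
    print(translate("TACGAGTCCTGA"))
```

Translating a full coding sequence and comparing the result with the corresponding amino acid sequence is a quick consistency check before cloning.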
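Compared with the prior art, the invention has the following advantages:
In embodiments of the invention, the activity of the fusion proteins of the invention was evaluated in a low-density-lipoprotein-deficient mouse model and a glucose-load model in normal mice, with dulaglutide as the positive control drug. The results show that the fusion proteins of the invention are effective in treating hyperlipidemia and have a more pronounced advantage over the control drug.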
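Brief Description of the Drawings
Embodiments of the invention are described in detail below with reference to the drawings, in which:
Figure 1: Plasmid map of the pcDNA3.4-fusion protein construct
Figure 2: Effect of the wild-type GF protein and its mutants on phosphorylated AMPK and total AMPK in HepG2 cells. con denotes cells not treated with drug. * denotes a significant difference compared with con (p<0.05); ** denotes a highly significant difference compared with con (p<0.001); ## denotes a highly significant difference compared with GF (p<0.001)
Figure 3: Effect of different proteins on luciferase expression in HEK293-GLP1R/β-klotho/CRE-Luciferase cells. (A) Comparison of the EC50 values of the different GGFvn proteins with those of the corresponding GFvn proteins, n = 2-5. (B) Comparison of the EC50 values of G, GFv5 and GGFv5.
Figure 4: Effect of GFv5 and GGFv5 on body weight (A) and food intake (B) of ldlr-/- mice. * denotes a significant difference compared with the con group (p<0.05); # denotes a significant difference compared with the G group (p<0.05); $ denotes a significant difference compared with GFv5 (p<0.05).
Figure 5: Effect of GFv5 and GGFv5 on blood lipids of ldlr-/- mice. * denotes a significant difference compared with the con group (p<0.05); # denotes a significant difference compared with the G group (p<0.05); $ denotes a significant difference compared with GFv5 (p<0.05).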
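Detailed Description of the Embodiments
The invention is described below with reference to specific examples. Those skilled in the art will understand that these examples serve only to illustrate the invention and do not limit its scope of protection in any way.
Unless otherwise specified, the reagents used in the following examples are of analytical grade and are commercially available through regular channels.
Example 1 Preparation of the fusion proteins of the invention
The fusion proteins were prepared using techniques conventional in the art, specifically by the following steps: a pcDNA3.4 plasmid containing the fusion protein was constructed with the pcDNA3.4-TOPO TA cloning kit (purchased from Invitrogen (Shanghai) Trading Co., Ltd.); the plasmid map is shown in Figure 1. ExpiCHO-S cells were transfected with this plasmid, and the protein was expressed with the ExpiCHO expression system (purchased from Invitrogen (Shanghai) Trading Co., Ltd.).
The fusion proteins of the invention were obtained after purification as follows. The supernatant was filtered through a 0.22 μm membrane to remove cell debris. A HiTrap MabSelect SuRe protein A affinity column (purchased from GE) was treated with 5 column volumes of equilibration buffer (5.6 mM NaH2PO4, 14.4 mM Na2HPO4, 0.15 M NaCl, pH 7.2), and the supernatant was then loaded. After loading, loosely bound contaminating proteins were washed off to baseline with wash buffer (5.6 mM NaH2PO4·H2O, 14.4 mM Na2HPO4, 0.5 M NaCl, pH 7.2). The protein was then eluted with 50 mM citric acid/sodium citrate buffer (containing 0.02% Tween-80 and 5% mannitol, pH 3.2), and the pH was adjusted to 7.0 with 1 M Tris-Cl (pH 8.0). The purified sample was sterilized by filtration through a 0.22 μm membrane and stored at 4°C.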
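Specifically, the fusion proteins of the invention have the general formulae G-L-Fc-L-Fv2, G-L-Fc-L-Fv3, G-L-Fc-L-Fv4 and G-L-Fc-L-Fv5; their amino acid sequences are shown in SEQ ID NO:6-9, respectively, and their nucleotide sequences in SEQ ID NO:10-13, respectively;
G-L-Fc-L-Fv2 SEQ ID NO:6 (wherein the bold part is the amino acid sequence of the GLP-1 variant, the bold italic part is the amino acid sequence of the linker, and the underlined part is the amino acid sequence of Fc)
G-L-Fc-L-Fv3 SEQ ID NO:7 (wherein the bold part is the amino acid sequence of the GLP-1 variant, the bold italic part is the amino acid sequence of the linker, and the underlined part is the amino acid sequence of Fc)
G-L-Fc-L-Fv4 SEQ ID NO:8 (wherein the bold part is the amino acid sequence of the GLP-1 variant, the bold italic part is the amino acid sequence of the linker, and the underlined part is the amino acid sequence of Fc)
G-L-Fc-L-Fv5 SEQ ID NO:9 (wherein the bold part is the amino acid sequence of the GLP-1 variant, the bold italic part is the amino acid sequence of the linker, and the underlined part is the amino acid sequence of Fc)
G-L-Fc-L-Fv2 SEQ ID NO:10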
CACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGCGGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACTGCCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCATGCCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGCTTCCTGA
G-L-Fc-L-Fv3 SEQ ID NO:11
CACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACTGCCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCATGCCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGCTTCCTGA
G-L-Fc-L-Fv4 SEQ ID NO:12
CACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCGGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACCTGCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCAGCTCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGAGTCCTGA
G-L-Fc-L-Fv5 SEQ ID NO:13
CACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACTGCCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCATGCCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGAGTCCTGA
Further, the inventors prepared the wild-type G-L-Fc-L-F fusion protein, the amino acid sequence of which is shown in SEQ ID NO:14:
HGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSAESKYGPPCPPCPAPEAAGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSQEDPEVQFNWYVDGVEVHNAKTKPREEQFNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKGLPSSIEKTISKAKGQPREPQVYTLPPSQEEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSRLTVDKSRWQEGNVFSCSVMHEALHNHYTQKSLSLSLGGGGGSGGGGSGGGGSHPIPDSSPLLQFGGQVRQRYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRELLLEDGYNVYQSEAHGLPLHLPGNKSPHRDPAPRGPARFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGPSQGRSPSYAS
Its nucleotide sequence is shown in SEQ ID NO:15:
CACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCCACCCCATCCCTGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGCGGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACCTGCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCAGCTCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGACCTTCCCAGGGCCGAAGCCCCAGCTACGCTTCCTGA
The nucleotide sequence of the signal peptide used is shown in SEQ ID NO:16:
ATGCCGTCTTCTGTCTCGTGGGGCATCCTCCTGCTGGCAGGCCTGTGCTGCCTGGTCCCTGTCTCCCTGGCT
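Example 2 Effect of the fusion proteins of the invention on the AMPK signaling pathway in HepG2 cells
HepG2 cells (a gift from the Academy of Military Medical Sciences) were cultured in DMEM containing 10% FBS to more than 90% confluence, digested and resuspended, and seeded into 6-well plates at 2.5×10^5 cells per well with 2 mL of DMEM containing 10% FBS per well. The cells were cultured overnight at 37°C, 5% CO2 and saturated humidity to 70%-80% confluence. The medium was discarded and replaced with fresh, pre-warmed serum-free DMEM. After a further 6 h of culture, 100 nM of the purified wild-type fusion protein G-L-Fc-L-F (GF) or of one of the four mutants G-L-Fc-L-Fv2 (GFv2), G-L-Fc-L-Fv3 (GFv3), G-L-Fc-L-Fv4 (GFv4) and G-L-Fc-L-Fv5 (GFv5) was added. After 24 h of treatment, the culture supernatant was discarded and the cells were collected by digestion, washed once with pre-chilled PBS and lysed with RIPA lysis buffer containing 1% PMSF (purchased from 北京康为世纪生物科技有限公司), and total cellular protein was extracted according to the manufacturer's instructions. 15 μL of total protein was analyzed by Western blot for the levels of total AMPK (AMPKα antibody) and phosphorylated AMPK (pAMPK, phospho-AMPKα (Thr172) antibody) in the cells (both antibodies purchased from Cell Signaling Technologies).
The results are shown in Figure 2. After treatment with the wild-type GF fusion protein or any of the four GF mutants, the level of AMPK phosphorylation in HepG2 cells (an increased pAMPK/AMPK ratio) was significantly higher than in the control group (con), indicating that all of the proteins are active. In particular, AMPK phosphorylation in HepG2 cells treated with the mutants GFv3 and GFv5 was significantly higher than with the GF protein, indicating that these two mutated proteins have greater activity than the wild-type protein.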
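Example 3 Comparison of activation of the GLP-1 receptor and the FGF21 receptor by the GF fusion protein and its mutants
HEK293 cells expressing GLP1R, the FGF21 co-receptor (β-klotho) and a CRE-luciferase inducible expression system (HEK293-GLP1R/β-klotho/CRE-Luciferase) were cultured in DMEM containing 10% FBS to more than 90% confluence, digested and resuspended, and seeded into 96-well plates at 4×10^4 cells per well with 100 μL of DMEM containing 10% FBS per well. The cells were cultured overnight at 37°C, 5% CO2 and saturated humidity. The next day, a concentration series (0, 0.001, 0.01, 0.1, 1, 10 and 100 nM) of the wild-type fusion protein G-L-Fc-L-F (GF) and of the four mutants G-L-Fc-L-Fv2 (GFv2), G-L-Fc-L-Fv3 (GFv3), G-L-Fc-L-Fv4 (GFv4) and G-L-Fc-L-Fv5 (GFv5) was added. After 6-8 h of treatment, the culture supernatant was discarded, the cells were washed twice with PBS, and the cells were lysed and luciferase expression was measured according to the kit instructions (single luciferase reporter gene assay kit, purchased from 北京原平皓生物技术有限公司). The data were analyzed with GraphPad Prism software, and the EC50 value obtained for each GF protein is given in Table 1. The results show that the EC50 values of all four mutants are lower than that of the wild-type fusion protein GF, indicating that they activate the two receptors simultaneously more effectively. The mutant GFv5 has the lowest EC50 value, indicating that it has the highest activity.
Table 1: Activity of the GF proteins in HEK293-GLP1R/β-klotho/CRE-Luciferase cells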
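For readers unfamiliar with how an EC50 is obtained from a concentration series, the sketch below fits a four-parameter logistic curve by a simple grid search over log10(EC50). It is an illustration under stated assumptions, not the analysis performed in the example, which used GraphPad Prism: the bottom, top and Hill slope are assumed known, the function names `four_pl` and `fit_ec50` are hypothetical, and the responses in the demonstration are synthetic values generated from the model itself rather than measured data.

```python
import math


def four_pl(conc, bottom, top, ec50, hill=1.0):
    """Four-parameter logistic response; conc and ec50 share one unit (nM)."""
    if conc <= 0:
        return bottom  # zero dose gives the baseline response
    return bottom + (top - bottom) * conc**hill / (ec50**hill + conc**hill)


def fit_ec50(concs, responses, bottom, top, hill=1.0):
    """Grid-search the EC50 that minimizes the sum of squared errors.

    log10(EC50) is scanned from -4.00 to 3.00 in steps of 0.01.
    """
    best_log, best_sse = None, math.inf
    for i in range(-400, 301):
        log_ec50 = i / 100
        ec50 = 10**log_ec50
        sse = sum(
            (four_pl(c, bottom, top, ec50, hill) - r) ** 2
            for c, r in zip(concs, responses)
        )
        if sse < best_sse:
            best_log, best_sse = log_ec50, sse
    return 10**best_log


if __name__ == "__main__":
    # Concentration series from the example (nM).
    concs = [0, 0.001, 0.01, 0.1, 1, 10, 100]
    # Synthetic responses generated with a known EC50 of 0.5 nM.
    synthetic = [four_pl(c, 0.0, 1.0, 0.5) for c in concs]
    print(fit_ec50(concs, synthetic, 0.0, 1.0))
```

A lower fitted EC50 means that less protein is needed to reach the half-maximal luciferase signal, which is how the example ranks the mutants against the wild-type protein.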
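Example 4 Construction and expression of GGF fusion proteins with a new structure and analysis of their activity
On the basis of the four GF mutants, the new-structure fusion proteins GGFv2, GGFv3, GGFv4 and GGFv5 were constructed and expressed.
Specifically, the new-structure fusion proteins have the general formulae G-L-G-L-Fc-L-Fv2 (GGFv2), G-L-G-L-Fc-L-Fv3 (GGFv3), G-L-G-L-Fc-L-Fv4 (GGFv4) and G-L-G-L-Fc-L-Fv5 (GGFv5); their amino acid sequences are shown in SEQ ID NO:24-26 and 18, respectively, and their nucleotide sequences in SEQ ID NO:27-29 and 19, respectively;
The amino acid sequence of G-L-G-L-Fc-L-Fv2 is shown in SEQ ID NO:24: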
HGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSHGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSAESKYGPPCPPCPAPEAAGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSQEDPEVQFNWYVDGVEVHNAKTKPREEQFNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKGLPSSIEKTISKAKGQPREPQVYTLPPSQEEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSRLTVDKSRWQEGNVFSCSVMHEALHNHYTQKSLSLSLGGGGGSGGGGSGGGGSDSSPLLQFGGQVRQRYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRELLLEDGYNVYQSEAHGLPLHCPGNKSPHRDPAPRGPCRFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYAS
The amino acid sequence of G-L-G-L-Fc-L-Fv3 is shown in SEQ ID NO:25:
HGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSHGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSAESKYGPPCPPCPAPEAAGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSQEDPEVQFNWYVDGVEVHNAKTKPREEQFNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKGLPSSIEKTISKAKGQPREPQVYTLPPSQEEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSRLTVDKSRWQEGNVFSCSVMHEALHNHYTQKSLSLSLGGGGGSGGGGSGGGGSDSSPLLQFGGQVRQVYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRELLLEDGYNVYQSEAHGLPLHCPGNKSPHRDPAPRGPCRFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYAS
The amino acid sequence of G-L-G-L-Fc-L-Fv4 is shown in SEQ ID NO:26:
HGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSHGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSAESKYGPPCPPCPAPEAAGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSQEDPEVQFNWYVDGVEVHNAKTKPREEQFNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKGLPSSIEKTISKAKGQPREPQVYTLPPSQEEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSRLTVDKSRWQEGNVFSCSVMHEALHNHYTQKSLSLSLGGGGGSGGGGSGGGGSDSSPLLQFGGQVRQVYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRERLLEDGYNVYQSEAHGLPLHLPGNKSPHRDPAPRGPARFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYES
The amino acid sequence of G-L-G-L-Fc-L-Fv5 is shown in SEQ ID NO:18:
HGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSHGEGTFTSDVSSYLEEQAAKEFIAWLVKGGGGGGGSGGGGSGGGGSAESKYGPPCPPCPAPEAAGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSQEDPEVQFNWYVDGVEVHNAKTKPREEQFNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKGLPSSIEKTISKAKGQPREPQVYTLPPSQEEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSRLTVDKSRWQEGNVFSCSVMHEALHNHYTQKSLSLSLGGGGGSGGGGSGGGGSDSSPLLQFGGQVRQVYLYTDDAQQTEAHLEIREDGTVGGAADQSPESLLQLKALKPGVIQILGVKTSRFLCQRPDGALYGSLHFDPEACSFRELLLEDGYNVYQSEAHGLPLHCPGNKSPHRDPAPRGPCRFLPLPGLPPALPEPPGILAPQPPDVGSSDPLSMVGGSQGRSPSYES
The nucleotide sequence of G-L-G-L-Fc-L-Fv2 is shown in SEQ ID NO:27:
CATGGCGAAGGGACCTTTACCAGTGATGTAAGTTCTTATTTGGAAGAGCAAGCTGCCAAGGAATTCATTGCTTGGCTGGTGAAAGGCGGCGGAGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCCACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCGGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACCTGCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCAGCTCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGAGTCCTGA
The nucleotide sequence of G-L-G-L-Fc-L-Fv3 is shown in SEQ ID NO:28:
CATGGCGAAGGGACCTTTACCAGTGATGTAAGTTCTTATTTGGAAGAGCAAGCTGCCAAGGAATTCATTGCTTGGCTGGTGAAAGGCGGCGGAGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCCACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACTGCCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCATGCCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGCTTCCTGA
The nucleotide sequence of G-L-G-L-Fc-L-Fv4 is shown in SEQ ID NO:29:
CATGGCGAAGGGACCTTTACCAGTGATGTAAGTTCTTATTTGGAAGAGCAAGCTGCCAAGGAATTCATTGCTTGGCTGGTGAAAGGCGGCGGAGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCCACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCGGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACCTGCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCAGCTCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGAGTCCTGA
The nucleotide sequence of G-L-G-L-Fc-L-Fv5 is shown in SEQ ID NO:19:
CATGGCGAAGGGACCTTTACCAGTGATGTAAGTTCTTATTTGGAAGAGCAAGCTGCCAAGGAATTCATTGCTTGGCTGGTGAAAGGCGGCGGAGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCCACGGCGAGGGCACCTTCACCTCCGACGTGTCCTCCTATCTCGAGGAGCAGGCCGCCAAGGAATTCATCGCCTGGCTGGTGAAGGGCGGCGGCGGTGGTGGTGGCTCCGGAGGCGGCGGCTCTGGTGGCGGTGGCAGCGCTGAGTCCAAATATGGTCCCCCATGCCCACCCTGCCCAGCACCTGAGGCCGCCGGGGGACCATCAGTCTTCCTGTTCCCCCCAAAACCCAAGGACACTCTCATGATCTCCCGGACCCCTGAGGTCACGTGCGTGGTGGTGGACGTGAGCCAGGAAGACCCCGAGGTCCAGTTCAACTGGTACGTGGATGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTTCAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAACGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGGCCTCCCGTCCTCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAGCCACAGGTGTACACCCTGCCCCCATCCCAGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTACCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAGGCTAACCGTGGACAAGAGCAGGTGGCAGGAGGGGAATGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACACAGAAGAGCCTCTCCCTGTCTCTGGGTGGCGGAGGCGGAAGCGGAGGCGGAGGAAGCGGCGGTGGCGGCAGCGACTCCAGTCCTCTCCTGCAATTCGGGGGCCAAGTCCGGCAGGTGTACCTCTACACAGATGATGCCCAGCAGACAGAAGCCCACCTGGAGATCAGGGAGGATGGGACGGTGGGGGGCGCTGCTGACCAGAGCCCCGAAAGTCTCCTGCAGCTGAAAGCCTTGAAGCCGGGAGTTATTCAAATCTTGGGAGTCAAGACATCCAGGTTCCTGTGCCAGCGGCCAGATGGGGCCCTGTATGGATCGCTCCACTTTGACCCTGAGGCCTGCAGCTTCCGGGAGCTGCTTCTTGAGGACGGATACAATGTTTACCAGTCCGAAGCCCACGGCCTCCCGCTGCACTGCCCAGGGAACAAGTCCCCACACCGGGACCCTGCACCCCGAGGACCATGCCGCTTCCTGCCACTACCAGGCCTGCCCCCCGCACTCCCGGAGCCACCCGGAATCCTGGCCCCCCAGCCCCCCGATGTGGGCTCCTCGGACCCTCTGAGCATGGTGGGAGGCTCCCAGGGCCGAAGCCCCAGCTACGAGTCCTGA
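The four purified GGFv proteins were evaluated for activation of the GLP-1 receptor and the FGF21 receptor in HEK293-GLP1R/β-klotho/CRE-Luciferase cells, using the method described in Example 3. As shown in Figure 3A, the EC50 values of all four GGFv proteins were lower than those of the corresponding GFv proteins, indicating that the activity of the new-structure fusion proteins is improved in every case, with GGFv5 showing the greatest improvement. As shown in Figure 3B, the EC50 values of both GGFv5 and GFv5 were lower than that of the drug dulaglutide (G, purchased from Eli Lilly, USA), and GGFv5 had the lowest value, indicating that it has the highest activity.
Example 5 Verification of the biological activity of the bifunctional proteins in hyperlipidemic model mice
Twenty-four low-density-lipoprotein-deficient mice (Ldlr-/- mice) aged 4-8 weeks (purchased from 江苏集萃药康生物科技有限公司) were fed a high-fat diet (60% fat, purchased from 北京柏奥生物科技有限公司) for 2 weeks to establish hyperlipidemic model mice. The mice were divided into 4 groups according to random body weight: a control group (con, normal saline), a G group (dulaglutide), a GFv5 group (GFv5 protein) and a GGFv5 group (GGFv5 protein), with 6 mice per group. All groups were dosed by subcutaneous injection at 20 nmol/kg twice a week. Random body weight was measured and recorded weekly. After 4 weeks of treatment, serum biochemical parameters were measured: blood was collected from the orbit, serum was separated by centrifugation at 3000 rpm for 10 min, and the samples were sent to 北京北方生科医学技术有限公司 for measurement of triglycerides (TG), total cholesterol (CHOL), high-density lipoprotein (HDL) and low-density lipoprotein (LDL).
The results in Figure 4 show that after four weeks of treatment with the three drugs, body weight ranked GGFv5 group < GFv5 group < G group; all three were significantly lower than the con group, and body weight in the GFv5 group was also significantly lower than in the G group. The GGFv5 group already differed significantly from the control and G groups after 3 weeks of treatment and from the GFv5 group after 4 weeks, and over the 4 weeks of observation the body weight of the whole group barely increased relative to the pre-dose weight (body weight growth rate -0.48±2.23%). Food intake during treatment was significantly lower than in the con group only in the G group; food intake in the GGFv5 and GFv5 groups did not differ significantly from the control group. This indicates that the difference in body weight between these two groups and the control group was not caused by reduced food intake, whereas the effect of drug G on body weight is very likely related to reduced food intake. These results show that, under a high-fat diet, administration of GFv5 or GGFv5 controls weight gain in mice well, with GGFv5 giving the better effect.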
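The body weight growth rate quoted above is a group summary of per-animal percentage changes. The sketch below shows one way such a figure can be computed. It is illustrative only: the document does not state whether the ± term is a standard deviation or a standard error, so the sample standard deviation used here is an assumption, the function names are hypothetical, and the weights in the demonstration are made-up values, not data from the study.

```python
from statistics import mean, stdev


def percent_change(before, after):
    """Per-animal body weight change in percent of the pre-dose weight."""
    return [(a - b) / b * 100 for b, a in zip(before, after)]


def summarize(before, after):
    """Return (mean, sample standard deviation) of the percent changes."""
    changes = percent_change(before, after)
    return mean(changes), stdev(changes)


if __name__ == "__main__":
    # Made-up weights in grams for six animals; not data from the study.
    pre_dose = [21.0, 22.5, 20.8, 23.1, 21.7, 22.0]
    week_4 = [21.2, 22.1, 20.9, 22.8, 21.9, 21.8]
    m, s = summarize(pre_dose, week_4)
    print(f"{m:.2f} ± {s:.2f} %")
```

A mean close to zero with a spread of a few percent corresponds to a group whose weight is essentially unchanged over the observation period.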
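The results in Figure 5 show that, compared with the con group, serum triglycerides (TG) were significantly reduced in the G group; cholesterol (CHOL) and TG were significantly reduced in the GFv5 group; and CHOL, TG and low-density lipoprotein (LDL-C) were all significantly reduced in the GGFv5 group. In addition, CHOL, TG and LDL-C in the GGFv5 group were all significantly lower than in the G group, and TG and LDL-C in the GGFv5 group also differed significantly from the GFv5 group. This demonstrates that GFv5 and GGFv5 are effective in treating hyperlipidemia, with GGFv5 showing the more pronounced advantage.
The above description of specific embodiments does not limit the invention. Those skilled in the art may make various changes or modifications based on the invention, and such changes or modifications fall within the scope of the appended claims as long as they do not depart from the spirit of the invention.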
Sequence Listing
<110> 北京双因生物科技有限公司 段海峰
<120> Fibroblast growth factor 21 variants, fusion proteins thereof and uses thereof
<160> 29
<170> SIPOSequenceListing 1.0
<210> 1
<211> 177
<212> PRT
<213> Artificial Sequence
<400> 1
Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln Arg Tyr
1 5 10 15
Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu Ile Arg
20 25 30
Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu Ser Leu
35 40 45
Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu Gly Val
50 55 60
Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu Tyr Gly
65 70 75 80
Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu Leu Leu
85 90 95
Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu Pro Leu
100 105 110
His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro Arg Gly
115 120 125
Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu Pro Glu
130 135 140
Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser Ser Asp
145 150 155 160
Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser Tyr Ala
165 170 175
Ser
<210> 2
<211> 177
<212> PRT
<213> Artificial Sequence
<400> 2
Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln Val Tyr
1 5 10 15
Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu Ile Arg
20 25 30
Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu Ser Leu
35 40 45
Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu Gly Val
50 55 60
Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu Tyr Gly
65 70 75 80
Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu Leu Leu
85 90 95
Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu Pro Leu
100 105 110
His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro Arg Gly
115 120 125
Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu Pro Glu
130 135 140
Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser Ser Asp
145 150 155 160
Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser Tyr Ala
165 170 175
Ser
<210> 3
<211> 177
<212> PRT
<213> Artificial Sequence
<400> 3
Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln Val Tyr
1 5 10 15
Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu Ile Arg
20 25 30
Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu Ser Leu
35 40 45
Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu Gly Val
50 55 60
Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu Tyr Gly
65 70 75 80
Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Arg Leu Leu
85 90 95
Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu Pro Leu
100 105 110
His Leu Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro Arg Gly
115 120 125
Pro Ala Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu Pro Glu
130 135 140
Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser Ser Asp
145 150 155 160
Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser Tyr Glu
165 170 175
Ser
<210> 4
<211> 177
<212> PRT
<213> Artificial Sequence
<400> 4
Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln Val Tyr
1 5 10 15
Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu Ile Arg
20 25 30
Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu Ser Leu
35 40 45
Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu Gly Val
50 55 60
Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu Tyr Gly
65 70 75 80
Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu Leu Leu
85 90 95
Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu Pro Leu
100 105 110
His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro Arg Gly
115 120 125
Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu Pro Glu
130 135 140
Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser Ser Asp
145 150 155 160
Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser Tyr Glu
165 170 175
Ser
<210> 5
<211> 30
<212> PRT
<213> Artificial Sequence
<400> 5
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly
20 25 30
<210> 6
<211> 467
<212> PRT
<213> Artificial Sequence
<400> 6
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu
35 40 45
Ser Lys Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala
50 55 60
Gly Gly Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu
65 70 75 80
Met Ile Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser
85 90 95
Gln Glu Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu
100 105 110
Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr
115 120 125
Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn
130 135 140
Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser
145 150 155 160
Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln
165 170 175
Val Tyr Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val
180 185 190
Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val
195 200 205
Glu Trp Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro
210 215 220
Pro Val Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr
225 230 235 240
Val Asp Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val
245 250 255
Met His Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu
260 265 270
Ser Leu Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly
275 280 285
Gly Ser Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln
290 295 300
Arg Tyr Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu
305 310 315 320
Ile Arg Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu
325 330 335
Ser Leu Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu
340 345 350
Gly Val Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu
355 360 365
Tyr Gly Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu
370 375 380
Leu Leu Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu
385 390 395 400
Pro Leu His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro
405 410 415
Arg Gly Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu
420 425 430
Pro Glu Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser
435 440 445
Ser Asp Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser
450 455 460
Tyr Ala Ser
465
<210> 7
<211> 467
<212> PRT
<213> Artificial Sequence
<400> 7
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu
35 40 45
Ser Lys Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala
50 55 60
Gly Gly Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu
65 70 75 80
Met Ile Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser
85 90 95
Gln Glu Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu
100 105 110
Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr
115 120 125
Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn
130 135 140
Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser
145 150 155 160
Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln
165 170 175
Val Tyr Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val
180 185 190
Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val
195 200 205
Glu Trp Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro
210 215 220
Pro Val Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr
225 230 235 240
Val Asp Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val
245 250 255
Met His Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu
260 265 270
Ser Leu Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly
275 280 285
Gly Ser Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln
290 295 300
Val Tyr Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu
305 310 315 320
Ile Arg Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu
325 330 335
Ser Leu Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu
340 345 350
Gly Val Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu
355 360 365
Tyr Gly Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu
370 375 380
Leu Leu Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu
385 390 395 400
Pro Leu His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro
405 410 415
Arg Gly Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu
420 425 430
Pro Glu Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser
435 440 445
Ser Asp Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser
450 455 460
Tyr Ala Ser
465
<210> 8
<211> 467
<212> PRT
<213> Artificial Sequence
<400> 8
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu
35 40 45
Ser Lys Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala
50 55 60
Gly Gly Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu
65 70 75 80
Met Ile Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser
85 90 95
Gln Glu Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu
100 105 110
Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr
115 120 125
Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn
130 135 140
Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser
145 150 155 160
Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln
165 170 175
Val Tyr Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val
180 185 190
Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val
195 200 205
Glu Trp Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro
210 215 220
Pro Val Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr
225 230 235 240
Val Asp Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val
245 250 255
Met His Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu
260 265 270
Ser Leu Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly
275 280 285
Gly Ser Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln
290 295 300
Val Tyr Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu
305 310 315 320
Ile Arg Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu
325 330 335
Ser Leu Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu
340 345 350
Gly Val Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu
355 360 365
Tyr Gly Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Arg
370 375 380
Leu Leu Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu
385 390 395 400
Pro Leu His Leu Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro
405 410 415
Arg Gly Pro Ala Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu
420 425 430
Pro Glu Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser
435 440 445
Ser Asp Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser
450 455 460
Tyr Glu Ser
465
<210> 9
<211> 467
<212> PRT
<213> Artificial Sequence
<400> 9
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu
35 40 45
Ser Lys Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala
50 55 60
Gly Gly Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu
65 70 75 80
Met Ile Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser
85 90 95
Gln Glu Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu
100 105 110
Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr
115 120 125
Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn
130 135 140
Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser
145 150 155 160
Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln
165 170 175
Val Tyr Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val
180 185 190
Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val
195 200 205
Glu Trp Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro
210 215 220
Pro Val Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr
225 230 235 240
Val Asp Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val
245 250 255
Met His Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu
260 265 270
Ser Leu Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly
275 280 285
Gly Ser Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln
290 295 300
Val Tyr Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu
305 310 315 320
Ile Arg Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu
325 330 335
Ser Leu Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu
340 345 350
Gly Val Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu
355 360 365
Tyr Gly Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu
370 375 380
Leu Leu Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu
385 390 395 400
Pro Leu His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro
405 410 415
Arg Gly Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu
420 425 430
Pro Glu Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser
435 440 445
Ser Asp Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser
450 455 460
Tyr Glu Ser
465
<210> 10
<211> 1404
<212> DNA
<213> Artificial Sequence
<400> 10
cacggcgagg gcaccttcac ctccgacgtg tcctcctatc tcgaggagca ggccgccaag 60
gaattcatcg cctggctggt gaagggcggc ggcggtggtg gtggctccgg aggcggcggc 120
tctggtggcg gtggcagcgc tgagtccaaa tatggtcccc catgcccacc ctgcccagca 180
cctgaggccg ccgggggacc atcagtcttc ctgttccccc caaaacccaa ggacactctc 240
atgatctccc ggacccctga ggtcacgtgc gtggtggtgg acgtgagcca ggaagacccc 300
gaggtccagt tcaactggta cgtggatggc gtggaggtgc ataatgccaa gacaaagccg 360
cgggaggagc agttcaacag cacgtaccgt gtggtcagcg tcctcaccgt cctgcaccag 420
gactggctga acggcaagga gtacaagtgc aaggtctcca acaaaggcct cccgtcctcc 480
atcgagaaaa ccatctccaa agccaaaggg cagccccgag agccacaggt gtacaccctg 540
cccccatccc aggaggagat gaccaagaac caggtcagcc tgacctgcct ggtcaaaggc 600
ttctacccca gcgacatcgc cgtggagtgg gagagcaatg ggcagccgga gaacaactac 660
aagaccacgc ctcccgtgct ggactccgac ggctccttct tcctctacag caggctaacc 720
gtggacaaga gcaggtggca ggaggggaat gtcttctcat gctccgtgat gcatgaggct 780
ctgcacaacc actacacaca gaagagcctc tccctgtctc tgggtggcgg aggcggaagc 840
ggaggcggag gaagcggcgg tggcggcagc gactccagtc ctctcctgca attcgggggc 900
caagtccggc agcggtacct ctacacagat gatgcccagc agacagaagc ccacctggag 960
atcagggagg atgggacggt ggggggcgct gctgaccaga gccccgaaag tctcctgcag 1020
ctgaaagcct tgaagccggg agttattcaa atcttgggag tcaagacatc caggttcctg 1080
tgccagcggc cagatggggc cctgtatgga tcgctccact ttgaccctga ggcctgcagc 1140
ttccgggagc tgcttcttga ggacggatac aatgtttacc agtccgaagc ccacggcctc 1200
ccgctgcact gcccagggaa caagtcccca caccgggacc ctgcaccccg aggaccatgc 1260
cgcttcctgc cactaccagg cctgcccccc gcactcccgg agccacccgg aatcctggcc 1320
ccccagcccc ccgatgtggg ctcctcggac cctctgagca tggtgggagg ctcccagggc 1380
cgaagcccca gctacgcttc ctga 1404
<210> 11
<211> 1404
<212> DNA
<213> Artificial Sequence
<400> 11
cacggcgagg gcaccttcac ctccgacgtg tcctcctatc tcgaggagca ggccgccaag 60
gaattcatcg cctggctggt gaagggcggc ggcggtggtg gtggctccgg aggcggcggc 120
tctggtggcg gtggcagcgc tgagtccaaa tatggtcccc catgcccacc ctgcccagca 180
cctgaggccg ccgggggacc atcagtcttc ctgttccccc caaaacccaa ggacactctc 240
atgatctccc ggacccctga ggtcacgtgc gtggtggtgg acgtgagcca ggaagacccc 300
gaggtccagt tcaactggta cgtggatggc gtggaggtgc ataatgccaa gacaaagccg 360
cgggaggagc agttcaacag cacgtaccgt gtggtcagcg tcctcaccgt cctgcaccag 420
gactggctga acggcaagga gtacaagtgc aaggtctcca acaaaggcct cccgtcctcc 480
atcgagaaaa ccatctccaa agccaaaggg cagccccgag agccacaggt gtacaccctg 540
cccccatccc aggaggagat gaccaagaac caggtcagcc tgacctgcct ggtcaaaggc 600
ttctacccca gcgacatcgc cgtggagtgg gagagcaatg ggcagccgga gaacaactac 660
aagaccacgc ctcccgtgct ggactccgac ggctccttct tcctctacag caggctaacc 720
gtggacaaga gcaggtggca ggaggggaat gtcttctcat gctccgtgat gcatgaggct 780
ctgcacaacc actacacaca gaagagcctc tccctgtctc tgggtggcgg aggcggaagc 840
ggaggcggag gaagcggcgg tggcggcagc gactccagtc ctctcctgca attcgggggc 900
caagtccggc aggtgtacct ctacacagat gatgcccagc agacagaagc ccacctggag 960
atcagggagg atgggacggt ggggggcgct gctgaccaga gccccgaaag tctcctgcag 1020
ctgaaagcct tgaagccggg agttattcaa atcttgggag tcaagacatc caggttcctg 1080
tgccagcggc cagatggggc cctgtatgga tcgctccact ttgaccctga ggcctgcagc 1140
ttccgggagc tgcttcttga ggacggatac aatgtttacc agtccgaagc ccacggcctc 1200
ccgctgcact gcccagggaa caagtcccca caccgggacc ctgcaccccg aggaccatgc 1260
cgcttcctgc cactaccagg cctgcccccc gcactcccgg agccacccgg aatcctggcc 1320
ccccagcccc ccgatgtggg ctcctcggac cctctgagca tggtgggagg ctcccagggc 1380
cgaagcccca gctacgcttc ctga 1404
<210> 12
<211> 1404
<212> DNA
<213> Artificial Sequence
<400> 12
cacggcgagg gcaccttcac ctccgacgtg tcctcctatc tcgaggagca ggccgccaag 60
gaattcatcg cctggctggt gaagggcggc ggcggtggtg gtggctccgg aggcggcggc 120
tctggtggcg gtggcagcgc tgagtccaaa tatggtcccc catgcccacc ctgcccagca 180
cctgaggccg ccgggggacc atcagtcttc ctgttccccc caaaacccaa ggacactctc 240
atgatctccc ggacccctga ggtcacgtgc gtggtggtgg acgtgagcca ggaagacccc 300
gaggtccagt tcaactggta cgtggatggc gtggaggtgc ataatgccaa gacaaagccg 360
cgggaggagc agttcaacag cacgtaccgt gtggtcagcg tcctcaccgt cctgcaccag 420
gactggctga acggcaagga gtacaagtgc aaggtctcca acaaaggcct cccgtcctcc 480
atcgagaaaa ccatctccaa agccaaaggg cagccccgag agccacaggt gtacaccctg 540
cccccatccc aggaggagat gaccaagaac caggtcagcc tgacctgcct ggtcaaaggc 600
ttctacccca gcgacatcgc cgtggagtgg gagagcaatg ggcagccgga gaacaactac 660
aagaccacgc ctcccgtgct ggactccgac ggctccttct tcctctacag caggctaacc 720
gtggacaaga gcaggtggca ggaggggaat gtcttctcat gctccgtgat gcatgaggct 780
ctgcacaacc actacacaca gaagagcctc tccctgtctc tgggtggcgg aggcggaagc 840
ggaggcggag gaagcggcgg tggcggcagc gactccagtc ctctcctgca attcgggggc 900
caagtccggc aggtgtacct ctacacagat gatgcccagc agacagaagc ccacctggag 960
atcagggagg atgggacggt ggggggcgct gctgaccaga gccccgaaag tctcctgcag 1020
ctgaaagcct tgaagccggg agttattcaa atcttgggag tcaagacatc caggttcctg 1080
tgccagcggc cagatggggc cctgtatgga tcgctccact ttgaccctga ggcctgcagc 1140
ttccgggagc ggcttcttga ggacggatac aatgtttacc agtccgaagc ccacggcctc 1200
ccgctgcacc tgccagggaa caagtcccca caccgggacc ctgcaccccg aggaccagct 1260
cgcttcctgc cactaccagg cctgcccccc gcactcccgg agccacccgg aatcctggcc 1320
ccccagcccc ccgatgtggg ctcctcggac cctctgagca tggtgggagg ctcccagggc 1380
cgaagcccca gctacgagtc ctga 1404
<210> 13
<211> 1404
<212> DNA
<213> Artificial Sequence
<400> 13
cacggcgagg gcaccttcac ctccgacgtg tcctcctatc tcgaggagca ggccgccaag 60
gaattcatcg cctggctggt gaagggcggc ggcggtggtg gtggctccgg aggcggcggc 120
tctggtggcg gtggcagcgc tgagtccaaa tatggtcccc catgcccacc ctgcccagca 180
cctgaggccg ccgggggacc atcagtcttc ctgttccccc caaaacccaa ggacactctc 240
atgatctccc ggacccctga ggtcacgtgc gtggtggtgg acgtgagcca ggaagacccc 300
gaggtccagt tcaactggta cgtggatggc gtggaggtgc ataatgccaa gacaaagccg 360
cgggaggagc agttcaacag cacgtaccgt gtggtcagcg tcctcaccgt cctgcaccag 420
gactggctga acggcaagga gtacaagtgc aaggtctcca acaaaggcct cccgtcctcc 480
atcgagaaaa ccatctccaa agccaaaggg cagccccgag agccacaggt gtacaccctg 540
cccccatccc aggaggagat gaccaagaac caggtcagcc tgacctgcct ggtcaaaggc 600
ttctacccca gcgacatcgc cgtggagtgg gagagcaatg ggcagccgga gaacaactac 660
aagaccacgc ctcccgtgct ggactccgac ggctccttct tcctctacag caggctaacc 720
gtggacaaga gcaggtggca ggaggggaat gtcttctcat gctccgtgat gcatgaggct 780
ctgcacaacc actacacaca gaagagcctc tccctgtctc tgggtggcgg aggcggaagc 840
ggaggcggag gaagcggcgg tggcggcagc gactccagtc ctctcctgca attcgggggc 900
caagtccggc aggtgtacct ctacacagat gatgcccagc agacagaagc ccacctggag 960
atcagggagg atgggacggt ggggggcgct gctgaccaga gccccgaaag tctcctgcag 1020
ctgaaagcct tgaagccggg agttattcaa atcttgggag tcaagacatc caggttcctg 1080
tgccagcggc cagatggggc cctgtatgga tcgctccact ttgaccctga ggcctgcagc 1140
ttccgggagc tgcttcttga ggacggatac aatgtttacc agtccgaagc ccacggcctc 1200
ccgctgcact gcccagggaa caagtcccca caccgggacc ctgcaccccg aggaccatgc 1260
cgcttcctgc cactaccagg cctgcccccc gcactcccgg agccacccgg aatcctggcc 1320
ccccagcccc ccgatgtggg ctcctcggac cctctgagca tggtgggagg ctcccagggc 1380
cgaagcccca gctacgagtc ctga 1404
<210> 14
<211> 471
<212> PRT
<213> Artificial Sequence
<400> 14
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu
35 40 45
Ser Lys Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala
50 55 60
Gly Gly Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu
65 70 75 80
Met Ile Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser
85 90 95
Gln Glu Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu
100 105 110
Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr
115 120 125
Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn
130 135 140
Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser
145 150 155 160
Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln
165 170 175
Val Tyr Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val
180 185 190
Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val
195 200 205
Glu Trp Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro
210 215 220
Pro Val Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr
225 230 235 240
Val Asp Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val
245 250 255
Met His Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu
260 265 270
Ser Leu Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly
275 280 285
Gly Ser His Pro Ile Pro Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly
290 295 300
Gln Val Arg Gln Arg Tyr Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu
305 310 315 320
Ala His Leu Glu Ile Arg Glu Asp Gly Thr Val Gly Gly Ala Ala Asp
325 330 335
Gln Ser Pro Glu Ser Leu Leu Gln Leu Lys Ala Leu Lys Pro Gly Val
340 345 350
Ile Gln Ile Leu Gly Val Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro
355 360 365
Asp Gly Ala Leu Tyr Gly Ser Leu His Phe Asp Pro Glu Ala Cys Ser
370 375 380
Phe Arg Glu Leu Leu Leu Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu
385 390 395 400
Ala His Gly Leu Pro Leu His Leu Pro Gly Asn Lys Ser Pro His Arg
405 410 415
Asp Pro Ala Pro Arg Gly Pro Ala Arg Phe Leu Pro Leu Pro Gly Leu
420 425 430
Pro Pro Ala Leu Pro Glu Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro
435 440 445
Asp Val Gly Ser Ser Asp Pro Leu Ser Met Val Gly Pro Ser Gln Gly
450 455 460
Arg Ser Pro Ser Tyr Ala Ser
465 470
<210> 15
<211> 1416
<212> DNA
<213> Artificial Sequence
<400> 15
cacggcgagg gcaccttcac ctccgacgtg tcctcctatc tcgaggagca ggccgccaag 60
gaattcatcg cctggctggt gaagggcggc ggcggtggtg gtggctccgg aggcggcggc 120
tctggtggcg gtggcagcgc tgagtccaaa tatggtcccc catgcccacc ctgcccagca 180
cctgaggccg ccgggggacc atcagtcttc ctgttccccc caaaacccaa ggacactctc 240
atgatctccc ggacccctga ggtcacgtgc gtggtggtgg acgtgagcca ggaagacccc 300
gaggtccagt tcaactggta cgtggatggc gtggaggtgc ataatgccaa gacaaagccg 360
cgggaggagc agttcaacag cacgtaccgt gtggtcagcg tcctcaccgt cctgcaccag 420
gactggctga acggcaagga gtacaagtgc aaggtctcca acaaaggcct cccgtcctcc 480
atcgagaaaa ccatctccaa agccaaaggg cagccccgag agccacaggt gtacaccctg 540
cccccatccc aggaggagat gaccaagaac caggtcagcc tgacctgcct ggtcaaaggc 600
ttctacccca gcgacatcgc cgtggagtgg gagagcaatg ggcagccgga gaacaactac 660
aagaccacgc ctcccgtgct ggactccgac ggctccttct tcctctacag caggctaacc 720
gtggacaaga gcaggtggca ggaggggaat gtcttctcat gctccgtgat gcatgaggct 780
ctgcacaacc actacacaca gaagagcctc tccctgtctc tgggtggcgg aggcggaagc 840
ggaggcggag gaagcggcgg tggcggcagc caccccatcc ctgactccag tcctctcctg 900
caattcgggg gccaagtccg gcagcggtac ctctacacag atgatgccca gcagacagaa 960
gcccacctgg agatcaggga ggatgggacg gtggggggcg ctgctgacca gagccccgaa 1020
agtctcctgc agctgaaagc cttgaagccg ggagttattc aaatcttggg agtcaagaca 1080
tccaggttcc tgtgccagcg gccagatggg gccctgtatg gatcgctcca ctttgaccct 1140
gaggcctgca gcttccggga gctgcttctt gaggacggat acaatgttta ccagtccgaa 1200
gcccacggcc tcccgctgca cctgccaggg aacaagtccc cacaccggga ccctgcaccc 1260
cgaggaccag ctcgcttcct gccactacca ggcctgcccc ccgcactccc ggagccaccc 1320
ggaatcctgg ccccccagcc ccccgatgtg ggctcctcgg accctctgag catggtggga 1380
ccttcccagg gccgaagccc cagctacgct tcctga 1416
<210> 16
<211> 72
<212> DNA
<213> Artificial Sequence
<400> 16
atgccgtctt ctgtctcgtg gggcatcctc ctgctggcag gcctgtgctg cctggtccct 60
gtctccctgg ct 72
<210> 17
<211> 229
<212> PRT
<213> Artificial Sequence
<400> 17
Ala Glu Ser Lys Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu
1 5 10 15
Ala Ala Gly Gly Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp
20 25 30
Thr Leu Met Ile Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp
35 40 45
Val Ser Gln Glu Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly
50 55 60
Val Glu Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn
65 70 75 80
Ser Thr Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp
85 90 95
Leu Asn Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro
100 105 110
Ser Ser Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu
115 120 125
Pro Gln Val Tyr Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn
130 135 140
Gln Val Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile
145 150 155 160
Ala Val Glu Trp Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr
165 170 175
Thr Pro Pro Val Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg
180 185 190
Leu Thr Val Asp Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys
195 200 205
Ser Val Met His Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu
210 215 220
Ser Leu Ser Leu Gly
225
<210> 18
<211> 513
<212> PRT
<213> Artificial Sequence
<400> 18
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser His Gly
35 40 45
Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu Gln Ala
50 55 60
Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly Gly Gly
65 70 75 80
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu Ser Lys
85 90 95
Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala Gly Gly
100 105 110
Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile
115 120 125
Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser Gln Glu
130 135 140
Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu Val His
145 150 155 160
Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr Tyr Arg
165 170 175
Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys
180 185 190
Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser Ile Glu
195 200 205
Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr
210 215 220
Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val Ser Leu
225 230 235 240
Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp
245 250 255
Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val
260 265 270
Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr Val Asp
275 280 285
Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val Met His
290 295 300
Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Leu
305 310 315 320
Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
325 330 335
Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln Val Tyr
340 345 350
Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu Ile Arg
355 360 365
Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu Ser Leu
370 375 380
Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu Gly Val
385 390 395 400
Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu Tyr Gly
405 410 415
Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu Leu Leu
420 425 430
Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu Pro Leu
435 440 445
His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro Arg Gly
450 455 460
Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu Pro Glu
465 470 475 480
Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser Ser Asp
485 490 495
Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser Tyr Glu
500 505 510
Ser
<210> 19
<211> 1542
<212> DNA
<213> Artificial Sequence
<400> 19
catggcgaag ggacctttac cagtgatgta agttcttatt tggaagagca agctgccaag 60
gaattcattg cttggctggt gaaaggcggc ggaggcggag gcggaagcgg aggcggagga 120
agcggcggtg gcggcagcca cggcgagggc accttcacct ccgacgtgtc ctcctatctc 180
gaggagcagg ccgccaagga attcatcgcc tggctggtga agggcggcgg cggtggtggt 240
ggctccggag gcggcggctc tggtggcggt ggcagcgctg agtccaaata tggtccccca 300
tgcccaccct gcccagcacc tgaggccgcc gggggaccat cagtcttcct gttcccccca 360
aaacccaagg acactctcat gatctcccgg acccctgagg tcacgtgcgt ggtggtggac 420
gtgagccagg aagaccccga ggtccagttc aactggtacg tggatggcgt ggaggtgcat 480
aatgccaaga caaagccgcg ggaggagcag ttcaacagca cgtaccgtgt ggtcagcgtc 540
ctcaccgtcc tgcaccagga ctggctgaac ggcaaggagt acaagtgcaa ggtctccaac 600
aaaggcctcc cgtcctccat cgagaaaacc atctccaaag ccaaagggca gccccgagag 660
ccacaggtgt acaccctgcc cccatcccag gaggagatga ccaagaacca ggtcagcctg 720
acctgcctgg tcaaaggctt ctaccccagc gacatcgccg tggagtggga gagcaatggg 780
cagccggaga acaactacaa gaccacgcct cccgtgctgg actccgacgg ctccttcttc 840
ctctacagca ggctaaccgt ggacaagagc aggtggcagg aggggaatgt cttctcatgc 900
tccgtgatgc atgaggctct gcacaaccac tacacacaga agagcctctc cctgtctctg 960
ggtggcggag gcggaagcgg aggcggagga agcggcggtg gcggcagcga ctccagtcct 1020
ctcctgcaat tcgggggcca agtccggcag gtgtacctct acacagatga tgcccagcag 1080
acagaagccc acctggagat cagggaggat gggacggtgg ggggcgctgc tgaccagagc 1140
cccgaaagtc tcctgcagct gaaagccttg aagccgggag ttattcaaat cttgggagtc 1200
aagacatcca ggttcctgtg ccagcggcca gatggggccc tgtatggatc gctccacttt 1260
gaccctgagg cctgcagctt ccgggagctg cttcttgagg acggatacaa tgtttaccag 1320
tccgaagccc acggcctccc gctgcactgc ccagggaaca agtccccaca ccgggaccct 1380
gcaccccgag gaccatgccg cttcctgcca ctaccaggcc tgccccccgc actcccggag 1440
ccacccggaa tcctggcccc ccagcccccc gatgtgggct cctcggaccc tctgagcatg 1500
gtgggaggct cccagggccg aagccccagc tacgagtcct ga 1542
<210> 20
<211> 534
<212> DNA
<213> Artificial Sequence
<400> 20
gactccagtc ctctcctgca attcgggggc caagtccggc agcggtacct ctacacagat 60
gatgcccagc agacagaagc ccacctggag atcagggagg atgggacggt ggggggcgct 120
gctgaccaga gccccgaaag tctcctgcag ctgaaagcct tgaagccggg agttattcaa 180
atcttgggag tcaagacatc caggttcctg tgccagcggc cagatggggc cctgtatgga 240
tcgctccact ttgaccctga ggcctgcagc ttccgggagc tgcttcttga ggacggatac 300
aatgtttacc agtccgaagc ccacggcctc ccgctgcact gcccagggaa caagtcccca 360
caccgggacc ctgcaccccg aggaccatgc cgcttcctgc cactaccagg cctgcccccc 420
gcactcccgg agccacccgg aatcctggcc ccccagcccc ccgatgtggg ctcctcggac 480
cctctgagca tggtgggagg ctcccagggc cgaagcccca gctacgcttc ctga 534
<210> 21
<211> 534
<212> DNA
<213> Artificial Sequence
<400> 21
gactccagtc ctctcctgca attcgggggc caagtccggc aggtgtacct ctacacagat 60
gatgcccagc agacagaagc ccacctggag atcagggagg atgggacggt ggggggcgct 120
gctgaccaga gccccgaaag tctcctgcag ctgaaagcct tgaagccggg agttattcaa 180
atcttgggag tcaagacatc caggttcctg tgccagcggc cagatggggc cctgtatgga 240
tcgctccact ttgaccctga ggcctgcagc ttccgggagc tgcttcttga ggacggatac 300
aatgtttacc agtccgaagc ccacggcctc ccgctgcact gcccagggaa caagtcccca 360
caccgggacc ctgcaccccg aggaccatgc cgcttcctgc cactaccagg cctgcccccc 420
gcactcccgg agccacccgg aatcctggcc ccccagcccc ccgatgtggg ctcctcggac 480
cctctgagca tggtgggagg ctcccagggc cgaagcccca gctacgcttc ctga 534
<210> 22
<211> 534
<212> DNA
<213> Artificial Sequence
<400> 22
gactccagtc ctctcctgca attcgggggc caagtccggc aggtgtacct ctacacagat 60
gatgcccagc agacagaagc ccacctggag atcagggagg atgggacggt ggggggcgct 120
gctgaccaga gccccgaaag tctcctgcag ctgaaagcct tgaagccggg agttattcaa 180
atcttgggag tcaagacatc caggttcctg tgccagcggc cagatggggc cctgtatgga 240
tcgctccact ttgaccctga ggcctgcagc ttccgggagc ggcttcttga ggacggatac 300
aatgtttacc agtccgaagc ccacggcctc ccgctgcacc tgccagggaa caagtcccca 360
caccgggacc ctgcaccccg aggaccagct cgcttcctgc cactaccagg cctgcccccc 420
gcactcccgg agccacccgg aatcctggcc ccccagcccc ccgatgtggg ctcctcggac 480
cctctgagca tggtgggagg ctcccagggc cgaagcccca gctacgagtc ctga 534
<210> 23
<211> 534
<212> DNA
<213> Artificial Sequence
<400> 23
gactccagtc ctctcctgca attcgggggc caagtccggc aggtgtacct ctacacagat 60
gatgcccagc agacagaagc ccacctggag atcagggagg atgggacggt ggggggcgct 120
gctgaccaga gccccgaaag tctcctgcag ctgaaagcct tgaagccggg agttattcaa 180
atcttgggag tcaagacatc caggttcctg tgccagcggc cagatggggc cctgtatgga 240
tcgctccact ttgaccctga ggcctgcagc ttccgggagc tgcttcttga ggacggatac 300
aatgtttacc agtccgaagc ccacggcctc ccgctgcact gcccagggaa caagtcccca 360
caccgggacc ctgcaccccg aggaccatgc cgcttcctgc cactaccagg cctgcccccc 420
gcactcccgg agccacccgg aatcctggcc ccccagcccc ccgatgtggg ctcctcggac 480
cctctgagca tggtgggagg ctcccagggc cgaagcccca gctacgagtc ctga 534
<210> 24
<211> 513
<212> PRT
<213> Artificial Sequence
<400> 24
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser His Gly
35 40 45
Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu Gln Ala
50 55 60
Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly Gly Gly
65 70 75 80
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu Ser Lys
85 90 95
Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala Gly Gly
100 105 110
Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile
115 120 125
Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser Gln Glu
130 135 140
Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu Val His
145 150 155 160
Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr Tyr Arg
165 170 175
Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys
180 185 190
Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser Ile Glu
195 200 205
Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr
210 215 220
Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val Ser Leu
225 230 235 240
Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp
245 250 255
Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val
260 265 270
Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr Val Asp
275 280 285
Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val Met His
290 295 300
Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Leu
305 310 315 320
Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
325 330 335
Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln Arg Tyr
340 345 350
Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu Ile Arg
355 360 365
Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu Ser Leu
370 375 380
Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu Gly Val
385 390 395 400
Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu Tyr Gly
405 410 415
Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu Leu Leu
420 425 430
Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu Pro Leu
435 440 445
His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro Arg Gly
450 455 460
Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu Pro Glu
465 470 475 480
Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser Ser Asp
485 490 495
Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser Tyr Ala
500 505 510
Ser
<210> 25
<211> 513
<212> PRT
<213> Artificial Sequence
<400> 25
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser His Gly
35 40 45
Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu Gln Ala
50 55 60
Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly Gly Gly
65 70 75 80
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu Ser Lys
85 90 95
Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala Gly Gly
100 105 110
Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile
115 120 125
Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser Gln Glu
130 135 140
Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu Val His
145 150 155 160
Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr Tyr Arg
165 170 175
Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys
180 185 190
Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser Ile Glu
195 200 205
Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr
210 215 220
Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val Ser Leu
225 230 235 240
Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp
245 250 255
Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val
260 265 270
Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr Val Asp
275 280 285
Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val Met His
290 295 300
Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Leu
305 310 315 320
Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
325 330 335
Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln Val Tyr
340 345 350
Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu Ile Arg
355 360 365
Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu Ser Leu
370 375 380
Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu Gly Val
385 390 395 400
Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu Tyr Gly
405 410 415
Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Leu Leu Leu
420 425 430
Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu Pro Leu
435 440 445
His Cys Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro Arg Gly
450 455 460
Pro Cys Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu Pro Glu
465 470 475 480
Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser Ser Asp
485 490 495
Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser Tyr Ala
500 505 510
Ser
<210> 26
<211> 513
<212> PRT
<213> Artificial Sequence
<400> 26
His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly
20 25 30
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser His Gly
35 40 45
Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu Gln Ala
50 55 60
Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Gly Gly Gly Gly Gly
65 70 75 80
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Glu Ser Lys
85 90 95
Tyr Gly Pro Pro Cys Pro Pro Cys Pro Ala Pro Glu Ala Ala Gly Gly
100 105 110
Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile
115 120 125
Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser Gln Glu
130 135 140
Asp Pro Glu Val Gln Phe Asn Trp Tyr Val Asp Gly Val Glu Val His
145 150 155 160
Asn Ala Lys Thr Lys Pro Arg Glu Glu Gln Phe Asn Ser Thr Tyr Arg
165 170 175
Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys
180 185 190
Glu Tyr Lys Cys Lys Val Ser Asn Lys Gly Leu Pro Ser Ser Ile Glu
195 200 205
Lys Thr Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr
210 215 220
Thr Leu Pro Pro Ser Gln Glu Glu Met Thr Lys Asn Gln Val Ser Leu
225 230 235 240
Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp
245 250 255
Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val
260 265 270
Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Arg Leu Thr Val Asp
275 280 285
Lys Ser Arg Trp Gln Glu Gly Asn Val Phe Ser Cys Ser Val Met His
290 295 300
Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Leu
305 310 315 320
Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
325 330 335
Asp Ser Ser Pro Leu Leu Gln Phe Gly Gly Gln Val Arg Gln Val Tyr
340 345 350
Leu Tyr Thr Asp Asp Ala Gln Gln Thr Glu Ala His Leu Glu Ile Arg
355 360 365
Glu Asp Gly Thr Val Gly Gly Ala Ala Asp Gln Ser Pro Glu Ser Leu
370 375 380
Leu Gln Leu Lys Ala Leu Lys Pro Gly Val Ile Gln Ile Leu Gly Val
385 390 395 400
Lys Thr Ser Arg Phe Leu Cys Gln Arg Pro Asp Gly Ala Leu Tyr Gly
405 410 415
Ser Leu His Phe Asp Pro Glu Ala Cys Ser Phe Arg Glu Arg Leu Leu
420 425 430
Glu Asp Gly Tyr Asn Val Tyr Gln Ser Glu Ala His Gly Leu Pro Leu
435 440 445
His Leu Pro Gly Asn Lys Ser Pro His Arg Asp Pro Ala Pro Arg Gly
450 455 460
Pro Ala Arg Phe Leu Pro Leu Pro Gly Leu Pro Pro Ala Leu Pro Glu
465 470 475 480
Pro Pro Gly Ile Leu Ala Pro Gln Pro Pro Asp Val Gly Ser Ser Asp
485 490 495
Pro Leu Ser Met Val Gly Gly Ser Gln Gly Arg Ser Pro Ser Tyr Glu
500 505 510
Ser
<210> 27
<211> 1542
<212> DNA
<213> Artificial Sequence
<400> 27
catggcgaag ggacctttac cagtgatgta agttcttatt tggaagagca agctgccaag 60
gaattcattg cttggctggt gaaaggcggc ggaggcggag gcggaagcgg aggcggagga 120
agcggcggtg gcggcagcca cggcgagggc accttcacct ccgacgtgtc ctcctatctc 180
gaggagcagg ccgccaagga attcatcgcc tggctggtga agggcggcgg cggtggtggt 240
ggctccggag gcggcggctc tggtggcggt ggcagcgctg agtccaaata tggtccccca 300
tgcccaccct gcccagcacc tgaggccgcc gggggaccat cagtcttcct gttcccccca 360
aaacccaagg acactctcat gatctcccgg acccctgagg tcacgtgcgt ggtggtggac 420
gtgagccagg aagaccccga ggtccagttc aactggtacg tggatggcgt ggaggtgcat 480
aatgccaaga caaagccgcg ggaggagcag ttcaacagca cgtaccgtgt ggtcagcgtc 540
ctcaccgtcc tgcaccagga ctggctgaac ggcaaggagt acaagtgcaa ggtctccaac 600
aaaggcctcc cgtcctccat cgagaaaacc atctccaaag ccaaagggca gccccgagag 660
ccacaggtgt acaccctgcc cccatcccag gaggagatga ccaagaacca ggtcagcctg 720
acctgcctgg tcaaaggctt ctaccccagc gacatcgccg tggagtggga gagcaatggg 780
cagccggaga acaactacaa gaccacgcct cccgtgctgg actccgacgg ctccttcttc 840
ctctacagca ggctaaccgt ggacaagagc aggtggcagg aggggaatgt cttctcatgc 900
tccgtgatgc atgaggctct gcacaaccac tacacacaga agagcctctc cctgtctctg 960
ggtggcggag gcggaagcgg aggcggagga agcggcggtg gcggcagcga ctccagtcct 1020
ctcctgcaat tcgggggcca agtccggcag gtgtacctct acacagatga tgcccagcag 1080
acagaagccc acctggagat cagggaggat gggacggtgg ggggcgctgc tgaccagagc 1140
cccgaaagtc tcctgcagct gaaagccttg aagccgggag ttattcaaat cttgggagtc 1200
aagacatcca ggttcctgtg ccagcggcca gatggggccc tgtatggatc gctccacttt 1260
gaccctgagg cctgcagctt ccgggagcgg cttcttgagg acggatacaa tgtttaccag 1320
tccgaagccc acggcctccc gctgcacctg ccagggaaca agtccccaca ccgggaccct 1380
gcaccccgag gaccagctcg cttcctgcca ctaccaggcc tgccccccgc actcccggag 1440
ccacccggaa tcctggcccc ccagcccccc gatgtgggct cctcggaccc tctgagcatg 1500
gtgggaggct cccagggccg aagccccagc tacgagtcct ga 1542
<210> 28
<211> 1542
<212> DNA
<213> Artificial Sequence
<400> 28
catggcgaag ggacctttac cagtgatgta agttcttatt tggaagagca agctgccaag 60
gaattcattg cttggctggt gaaaggcggc ggaggcggag gcggaagcgg aggcggagga 120
agcggcggtg gcggcagcca cggcgagggc accttcacct ccgacgtgtc ctcctatctc 180
gaggagcagg ccgccaagga attcatcgcc tggctggtga agggcggcgg cggtggtggt 240
ggctccggag gcggcggctc tggtggcggt ggcagcgctg agtccaaata tggtccccca 300
tgcccaccct gcccagcacc tgaggccgcc gggggaccat cagtcttcct gttcccccca 360
aaacccaagg acactctcat gatctcccgg acccctgagg tcacgtgcgt ggtggtggac 420
gtgagccagg aagaccccga ggtccagttc aactggtacg tggatggcgt ggaggtgcat 480
aatgccaaga caaagccgcg ggaggagcag ttcaacagca cgtaccgtgt ggtcagcgtc 540
ctcaccgtcc tgcaccagga ctggctgaac ggcaaggagt acaagtgcaa ggtctccaac 600
aaaggcctcc cgtcctccat cgagaaaacc atctccaaag ccaaagggca gccccgagag 660
ccacaggtgt acaccctgcc cccatcccag gaggagatga ccaagaacca ggtcagcctg 720
acctgcctgg tcaaaggctt ctaccccagc gacatcgccg tggagtggga gagcaatggg 780
cagccggaga acaactacaa gaccacgcct cccgtgctgg actccgacgg ctccttcttc 840
ctctacagca ggctaaccgt ggacaagagc aggtggcagg aggggaatgt cttctcatgc 900
tccgtgatgc atgaggctct gcacaaccac tacacacaga agagcctctc cctgtctctg 960
ggtggcggag gcggaagcgg aggcggagga agcggcggtg gcggcagcga ctccagtcct 1020
ctcctgcaat tcgggggcca agtccggcag gtgtacctct acacagatga tgcccagcag 1080
acagaagccc acctggagat cagggaggat gggacggtgg ggggcgctgc tgaccagagc 1140
cccgaaagtc tcctgcagct gaaagccttg aagccgggag ttattcaaat cttgggagtc 1200
aagacatcca ggttcctgtg ccagcggcca gatggggccc tgtatggatc gctccacttt 1260
gaccctgagg cctgcagctt ccgggagctg cttcttgagg acggatacaa tgtttaccag 1320
tccgaagccc acggcctccc gctgcactgc ccagggaaca agtccccaca ccgggaccct 1380
gcaccccgag gaccatgccg cttcctgcca ctaccaggcc tgccccccgc actcccggag 1440
ccacccggaa tcctggcccc ccagcccccc gatgtgggct cctcggaccc tctgagcatg 1500
gtgggaggct cccagggccg aagccccagc tacgcttcct ga 1542
<210> 29
<211> 1542
<212> DNA
<213> Artificial Sequence
<400> 29
catggcgaag ggacctttac cagtgatgta agttcttatt tggaagagca agctgccaag 60
gaattcattg cttggctggt gaaaggcggc ggaggcggag gcggaagcgg aggcggagga 120
agcggcggtg gcggcagcca cggcgagggc accttcacct ccgacgtgtc ctcctatctc 180
gaggagcagg ccgccaagga attcatcgcc tggctggtga agggcggcgg cggtggtggt 240
ggctccggag gcggcggctc tggtggcggt ggcagcgctg agtccaaata tggtccccca 300
tgcccaccct gcccagcacc tgaggccgcc gggggaccat cagtcttcct gttcccccca 360
aaacccaagg acactctcat gatctcccgg acccctgagg tcacgtgcgt ggtggtggac 420
gtgagccagg aagaccccga ggtccagttc aactggtacg tggatggcgt ggaggtgcat 480
aatgccaaga caaagccgcg ggaggagcag ttcaacagca cgtaccgtgt ggtcagcgtc 540
ctcaccgtcc tgcaccagga ctggctgaac ggcaaggagt acaagtgcaa ggtctccaac 600
aaaggcctcc cgtcctccat cgagaaaacc atctccaaag ccaaagggca gccccgagag 660
ccacaggtgt acaccctgcc cccatcccag gaggagatga ccaagaacca ggtcagcctg 720
acctgcctgg tcaaaggctt ctaccccagc gacatcgccg tggagtggga gagcaatggg 780
cagccggaga acaactacaa gaccacgcct cccgtgctgg actccgacgg ctccttcttc 840
ctctacagca ggctaaccgt ggacaagagc aggtggcagg aggggaatgt cttctcatgc 900
tccgtgatgc atgaggctct gcacaaccac tacacacaga agagcctctc cctgtctctg 960
ggtggcggag gcggaagcgg aggcggagga agcggcggtg gcggcagcga ctccagtcct 1020
ctcctgcaat tcgggggcca agtccggcag gtgtacctct acacagatga tgcccagcag 1080
acagaagccc acctggagat cagggaggat gggacggtgg ggggcgctgc tgaccagagc 1140
cccgaaagtc tcctgcagct gaaagccttg aagccgggag ttattcaaat cttgggagtc 1200
aagacatcca ggttcctgtg ccagcggcca gatggggccc tgtatggatc gctccacttt 1260
gaccctgagg cctgcagctt ccgggagcgg cttcttgagg acggatacaa tgtttaccag 1320
tccgaagccc acggcctccc gctgcacctg ccagggaaca agtccccaca ccgggaccct 1380
gcaccccgag gaccagctcg cttcctgcca ctaccaggcc tgccccccgc actcccggag 1440
ccacccggaa tcctggcccc ccagcccccc gatgtgggct cctcggaccc tctgagcatg 1500
gtgggaggct cccagggccg aagccccagc tacgagtcct ga 1542