AU5266298A - Helicobacter polypeptides and corresponding polynucleotide molecules - Google Patents
Helicobacter polypeptides and corresponding polynucleotide molecules
- Publication number
- AU5266298A (AU 52662/98 A)
- Authority
- AU
- Australia
- Prior art keywords
- ghpo
- seq
- leu
- lys
- ser
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
- 108090000765 processed proteins & peptides Proteins 0.000 title claims description 200
- 102000004196 processed proteins & peptides Human genes 0.000 title claims description 190
- 229920001184 polypeptide Polymers 0.000 title claims description 188
- 102000040430 polynucleotide Human genes 0.000 title claims description 91
- 108091033319 polynucleotide Proteins 0.000 title claims description 91
- 239000002157 polynucleotide Substances 0.000 title claims description 91
- 241000589989 Helicobacter Species 0.000 title claims description 59
- 238000000034 method Methods 0.000 claims description 88
- 108020004414 DNA Proteins 0.000 claims description 67
- 239000013598 vector Substances 0.000 claims description 54
- 239000000203 mixture Substances 0.000 claims description 50
- 210000004027 cell Anatomy 0.000 claims description 36
- 239000013612 plasmid Substances 0.000 claims description 36
- 239000002671 adjuvant Substances 0.000 claims description 30
- 241000124008 Mammalia Species 0.000 claims description 25
- 108010046334 Urease Proteins 0.000 claims description 21
- 206010019375 Helicobacter infections Diseases 0.000 claims description 18
- 230000001580 bacterial effect Effects 0.000 claims description 16
- 150000001875 compounds Chemical class 0.000 claims description 16
- 102000053602 DNA Human genes 0.000 claims description 13
- 239000003085 diluting agent Substances 0.000 claims description 12
- -1 metronidazole Chemical compound 0.000 claims description 10
- 210000004962 mammalian cell Anatomy 0.000 claims description 9
- ULGZDMOVFRHVEP-RWJQBGPGSA-N Erythromycin Chemical compound O([C@@H]1[C@@H](C)C(=O)O[C@@H]([C@@]([C@H](O)[C@@H](C)C(=O)[C@H](C)C[C@@](C)(O)[C@H](O[C@H]2[C@@H]([C@H](C[C@@H](C)O2)N(C)C)O)[C@H]1C)(C)O)CC)[C@H]1C[C@@](C)(OC)[C@@H](O)[C@H](C)O1 ULGZDMOVFRHVEP-RWJQBGPGSA-N 0.000 claims description 8
- 230000008569 process Effects 0.000 claims description 8
- 239000013603 viral vector Substances 0.000 claims description 7
- 239000004098 Tetracycline Substances 0.000 claims description 6
- 235000019364 tetracycline Nutrition 0.000 claims description 6
- 150000003522 tetracyclines Chemical class 0.000 claims description 6
- 241000607626 Vibrio cholerae Species 0.000 claims description 5
- 229960002180 tetracycline Drugs 0.000 claims description 5
- 229930101283 tetracycline Natural products 0.000 claims description 5
- 239000003242 anti bacterial agent Substances 0.000 claims description 4
- 150000001621 bismuth Chemical class 0.000 claims description 4
- ZQUAVILLCXTKTF-UHFFFAOYSA-H bismuth;tripotassium;2-hydroxypropane-1,2,3-tricarboxylate Chemical compound [K+].[K+].[K+].[Bi+3].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O.[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O ZQUAVILLCXTKTF-UHFFFAOYSA-H 0.000 claims description 4
- 229960003276 erythromycin Drugs 0.000 claims description 4
- 239000002731 stomach secretion inhibitor Substances 0.000 claims description 4
- 229940118696 vibrio cholerae Drugs 0.000 claims description 4
- BVPWJMCABCPUQY-UHFFFAOYSA-N 4-amino-5-chloro-2-methoxy-N-[1-(phenylmethyl)-4-piperidinyl]benzamide Chemical compound COC1=CC(N)=C(Cl)C=C1C(=O)NC1CCN(CC=2C=CC=CC=2)CC1 BVPWJMCABCPUQY-UHFFFAOYSA-N 0.000 claims description 3
- 230000003115 biocidal effect Effects 0.000 claims description 3
- 238000004113 cell culture Methods 0.000 claims description 3
- 150000003180 prostaglandins Chemical class 0.000 claims description 3
- 229940126409 proton pump inhibitor Drugs 0.000 claims description 3
- 239000000612 proton pump inhibitor Substances 0.000 claims description 3
- 229960000620 ranitidine Drugs 0.000 claims description 3
- VMXUWOKSQNHOCA-LCYFTJDESA-N ranitidine Chemical compound [O-][N+](=O)/C=C(/NC)NCCSCC1=CC=C(CN(C)C)O1 VMXUWOKSQNHOCA-LCYFTJDESA-N 0.000 claims description 3
- SUBDBMMJDZJVOS-UHFFFAOYSA-N 5-methoxy-2-{[(4-methoxy-3,5-dimethylpyridin-2-yl)methyl]sulfinyl}-1H-benzimidazole Chemical compound N=1C2=CC(OC)=CC=C2NC=1S(=O)CC1=NC=C(C)C(OC)=C1C SUBDBMMJDZJVOS-UHFFFAOYSA-N 0.000 claims description 2
- 241000186660 Lactobacillus Species 0.000 claims description 2
- IQPSEEYGBUAQFF-UHFFFAOYSA-N Pantoprazole Chemical compound COC1=CC=NC(CS(=O)C=2NC3=CC=C(OC(F)F)C=C3N=2)=C1OC IQPSEEYGBUAQFF-UHFFFAOYSA-N 0.000 claims description 2
- SMTZFNFIKUPEJC-UHFFFAOYSA-N Roxane Chemical compound CC(=O)OCC(=O)NCCCOC1=CC=CC(CN2CCCCC2)=C1 SMTZFNFIKUPEJC-UHFFFAOYSA-N 0.000 claims description 2
- 241000607142 Salmonella Species 0.000 claims description 2
- 241000607768 Shigella Species 0.000 claims description 2
- 241000194017 Streptococcus Species 0.000 claims description 2
- LSQZJLSUYDQPKJ-NJBDSQKTSA-N amoxicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=C(O)C=C1 LSQZJLSUYDQPKJ-NJBDSQKTSA-N 0.000 claims description 2
- 229960003022 amoxicillin Drugs 0.000 claims description 2
- 229960004645 bismuth subcitrate Drugs 0.000 claims description 2
- ZREIPSZUJIFJNP-UHFFFAOYSA-K bismuth subsalicylate Chemical compound C1=CC=C2O[Bi](O)OC(=O)C2=C1 ZREIPSZUJIFJNP-UHFFFAOYSA-K 0.000 claims description 2
- 229960000782 bismuth subsalicylate Drugs 0.000 claims description 2
- 229960001380 cimetidine Drugs 0.000 claims description 2
- CCGSUNCLSOWKJO-UHFFFAOYSA-N cimetidine Chemical compound N#CNC(=N/C)\NCCSCC1=NC=N[C]1C CCGSUNCLSOWKJO-UHFFFAOYSA-N 0.000 claims description 2
- 229960002626 clarithromycin Drugs 0.000 claims description 2
- AGOYDEPGAOXOCK-KCBOHYOISA-N clarithromycin Chemical compound O([C@@H]1[C@@H](C)C(=O)O[C@@H]([C@@]([C@H](O)[C@@H](C)C(=O)[C@H](C)C[C@](C)([C@H](O[C@H]2[C@@H]([C@H](C[C@@H](C)O2)N(C)C)O)[C@H]1C)OC)(C)O)CC)[C@H]1C[C@@](C)(OC)[C@@H](O)[C@H](C)O1 AGOYDEPGAOXOCK-KCBOHYOISA-N 0.000 claims description 2
- 238000012258 culturing Methods 0.000 claims description 2
- 229960003559 enprostil Drugs 0.000 claims description 2
- XUFQPHANEAPEMJ-UHFFFAOYSA-N famotidine Chemical compound NC(N)=NC1=NC(CSCCC(N)=NS(N)(=O)=O)=CS1 XUFQPHANEAPEMJ-UHFFFAOYSA-N 0.000 claims description 2
- 229960001596 famotidine Drugs 0.000 claims description 2
- 229940039696 lactobacillus Drugs 0.000 claims description 2
- 229960003174 lansoprazole Drugs 0.000 claims description 2
- MJIHNNLFOKEZEW-UHFFFAOYSA-N lansoprazole Chemical compound CC1=C(OCC(F)(F)F)C=CN=C1CS(=O)C1=NC2=CC=CC=C2N1 MJIHNNLFOKEZEW-UHFFFAOYSA-N 0.000 claims description 2
- OJLOPKGSLYJEMD-URPKTTJQSA-N methyl 7-[(1r,2r,3r)-3-hydroxy-2-[(1e)-4-hydroxy-4-methyloct-1-en-1-yl]-5-oxocyclopentyl]heptanoate Chemical compound CCCCC(C)(O)C\C=C\[C@H]1[C@H](O)CC(=O)[C@@H]1CCCCCCC(=O)OC OJLOPKGSLYJEMD-URPKTTJQSA-N 0.000 claims description 2
- PTOJVMZPWPAXER-VFJVYMGBSA-N methyl 7-[(1r,2r,3r)-3-hydroxy-2-[(e,3r)-3-hydroxy-4-phenoxybut-1-enyl]-5-oxocyclopentyl]hepta-4,5-dienoate Chemical compound O[C@@H]1CC(=O)[C@H](CC=C=CCCC(=O)OC)[C@H]1\C=C\[C@@H](O)COC1=CC=CC=C1 PTOJVMZPWPAXER-VFJVYMGBSA-N 0.000 claims description 2
- 229960005249 misoprostol Drugs 0.000 claims description 2
- 229960004872 nizatidine Drugs 0.000 claims description 2
- SGXXNSQHWDMGGP-IZZDOVSWSA-N nizatidine Chemical compound [O-][N+](=O)\C=C(/NC)NCCSCC1=CSC(CN(C)C)=N1 SGXXNSQHWDMGGP-IZZDOVSWSA-N 0.000 claims description 2
- 229960000381 omeprazole Drugs 0.000 claims description 2
- LSQZJLSUYDQPKJ-UHFFFAOYSA-N p-Hydroxyampicillin Natural products O=C1N2C(C(O)=O)C(C)(C)SC2C1NC(=O)C(N)C1=CC=C(O)C=C1 LSQZJLSUYDQPKJ-UHFFFAOYSA-N 0.000 claims description 2
- 229960005019 pantoprazole Drugs 0.000 claims description 2
- 229960003320 roxatidine Drugs 0.000 claims description 2
- 125000003275 alpha amino acid group Chemical group 0.000 claims 7
- 239000003485 histamine H2 receptor antagonist Substances 0.000 claims 2
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 436
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 276
- 108090000623 proteins and genes Proteins 0.000 description 136
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 86
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 81
- 102000004169 proteins and genes Human genes 0.000 description 73
- 150000001413 amino acids Chemical group 0.000 description 68
- 235000018102 proteins Nutrition 0.000 description 68
- 239000013615 primer Substances 0.000 description 61
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 57
- 108091007433 antigens Proteins 0.000 description 48
- 102000036639 antigens Human genes 0.000 description 48
- 229960005486 vaccine Drugs 0.000 description 48
- 239000000427 antigen Substances 0.000 description 47
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 46
- 235000001014 amino acid Nutrition 0.000 description 44
- 239000012634 fragment Substances 0.000 description 42
- 229940024606 amino acid Drugs 0.000 description 41
- 108010050848 glycylleucine Proteins 0.000 description 39
- 239000000523 sample Substances 0.000 description 36
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 32
- 241000880493 Leptailurus serval Species 0.000 description 32
- 239000000872 buffer Substances 0.000 description 30
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 30
- 238000003752 polymerase chain reaction Methods 0.000 description 29
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 28
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 28
- 239000004202 carbamide Substances 0.000 description 28
- 108010076504 Protein Sorting Signals Proteins 0.000 description 27
- 108091026890 Coding region Proteins 0.000 description 25
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 25
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 24
- 108010054155 lysyllysine Proteins 0.000 description 24
- 230000004927 fusion Effects 0.000 description 22
- 238000010367 cloning Methods 0.000 description 21
- 102000039446 nucleic acids Human genes 0.000 description 21
- 108020004707 nucleic acids Proteins 0.000 description 21
- 150000007523 nucleic acids Chemical class 0.000 description 21
- 239000002773 nucleotide Substances 0.000 description 21
- 125000003729 nucleotide group Chemical group 0.000 description 21
- 239000008188 pellet Substances 0.000 description 21
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 20
- 241000588724 Escherichia coli Species 0.000 description 20
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 20
- 108010034529 leucyl-lysine Proteins 0.000 description 20
- 108010009298 lysylglutamic acid Proteins 0.000 description 20
- 239000000047 product Substances 0.000 description 20
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 19
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Chemical compound NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 19
- 108010073969 valyllysine Proteins 0.000 description 19
- 230000003321 amplification Effects 0.000 description 17
- 238000003199 nucleic acid amplification method Methods 0.000 description 17
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 16
- 210000004899 c-terminal region Anatomy 0.000 description 16
- 230000001225 therapeutic effect Effects 0.000 description 16
- 108020001507 fusion proteins Proteins 0.000 description 15
- 102000037865 fusion proteins Human genes 0.000 description 15
- 108010081551 glycylphenylalanine Proteins 0.000 description 15
- 230000028993 immune response Effects 0.000 description 15
- 208000015181 infectious disease Diseases 0.000 description 15
- 238000000746 purification Methods 0.000 description 15
- 239000011780 sodium chloride Substances 0.000 description 15
- 108010051242 phenylalanylserine Proteins 0.000 description 14
- 230000009466 transformation Effects 0.000 description 14
- 241000590017 Helicobacter felis Species 0.000 description 13
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 13
- 108010092854 aspartyllysine Proteins 0.000 description 13
- 238000005119 centrifugation Methods 0.000 description 13
- 239000003153 chemical reaction reagent Substances 0.000 description 13
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 13
- 239000000499 gel Substances 0.000 description 13
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 13
- 239000012528 membrane Substances 0.000 description 13
- 238000002360 preparation method Methods 0.000 description 13
- 239000000243 solution Substances 0.000 description 13
- 239000004475 Arginine Substances 0.000 description 12
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 12
- 241000699670 Mus sp. Species 0.000 description 12
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 12
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 12
- 229960003121 arginine Drugs 0.000 description 12
- 108010038633 aspartylglutamate Proteins 0.000 description 12
- 238000006243 chemical reaction Methods 0.000 description 12
- 239000002502 liposome Substances 0.000 description 12
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 12
- 239000002253 acid Substances 0.000 description 11
- 108010005233 alanylglutamic acid Proteins 0.000 description 11
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 11
- 238000012217 deletion Methods 0.000 description 11
- 230000000694 effects Effects 0.000 description 11
- 108010037850 glycylvaline Proteins 0.000 description 11
- 238000009396 hybridization Methods 0.000 description 11
- 210000004379 membrane Anatomy 0.000 description 11
- HEGSGKPQLMEBJL-RKQHYHRCSA-N octyl beta-D-glucopyranoside Chemical compound CCCCCCCCO[C@@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O HEGSGKPQLMEBJL-RKQHYHRCSA-N 0.000 description 11
- 230000000069 prophylactic effect Effects 0.000 description 11
- 108700026244 Open Reading Frames Proteins 0.000 description 10
- 239000012472 biological sample Substances 0.000 description 10
- 230000037430 deletion Effects 0.000 description 10
- 108010079547 glutamylmethionine Proteins 0.000 description 10
- 108010003700 lysyl aspartic acid Proteins 0.000 description 10
- 108010064235 lysylglycine Proteins 0.000 description 10
- 108010061238 threonyl-glycine Proteins 0.000 description 10
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 9
- ZLCLYFGMKFCDCN-XPUUQOCRSA-N Gly-Ser-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CO)NC(=O)CN)C(O)=O ZLCLYFGMKFCDCN-XPUUQOCRSA-N 0.000 description 9
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 9
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 9
- MJWVXZABPOKJJF-ACRUOGEOSA-N Leu-Phe-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MJWVXZABPOKJJF-ACRUOGEOSA-N 0.000 description 9
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 9
- LUTDBHBIHHREDC-IHRRRGAJSA-N Lys-Pro-Lys Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O LUTDBHBIHHREDC-IHRRRGAJSA-N 0.000 description 9
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 9
- 108010077245 asparaginyl-proline Proteins 0.000 description 9
- 238000003776 cleavage reaction Methods 0.000 description 9
- 238000002405 diagnostic procedure Methods 0.000 description 9
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 9
- 238000003780 insertion Methods 0.000 description 9
- 230000037431 insertion Effects 0.000 description 9
- 108010057821 leucylproline Proteins 0.000 description 9
- 239000008194 pharmaceutical composition Substances 0.000 description 9
- 238000010561 standard procedure Methods 0.000 description 9
- 108010049048 Cholera Toxin Proteins 0.000 description 8
- 102000009016 Cholera Toxin Human genes 0.000 description 8
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 description 8
- ODKSFYDXXFIFQN-BYPYZUCNSA-N L-arginine Chemical compound OC(=O)[C@@H](N)CCCN=C(N)N ODKSFYDXXFIFQN-BYPYZUCNSA-N 0.000 description 8
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 8
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 8
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 8
- 150000007513 acids Chemical class 0.000 description 8
- 239000003795 chemical substances by application Substances 0.000 description 8
- 108010049041 glutamylalanine Proteins 0.000 description 8
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 8
- 108010087823 glycyltyrosine Proteins 0.000 description 8
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 8
- 108010017391 lysylvaline Proteins 0.000 description 8
- 108010005942 methionylglycine Proteins 0.000 description 8
- 239000002243 precursor Substances 0.000 description 8
- 108010071207 serylmethionine Proteins 0.000 description 8
- 102000055501 telomere Human genes 0.000 description 8
- 108091035539 telomere Proteins 0.000 description 8
- 108010051110 tyrosyl-lysine Proteins 0.000 description 8
- HMRWQTHUDVXMGH-GUBZILKMSA-N Ala-Glu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HMRWQTHUDVXMGH-GUBZILKMSA-N 0.000 description 7
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 7
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 7
- 241000590006 Helicobacter mustelae Species 0.000 description 7
- 108010065920 Insulin Lispro Proteins 0.000 description 7
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 7
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 7
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 7
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 7
- 108010047495 alanylglycine Proteins 0.000 description 7
- 108010087924 alanylproline Proteins 0.000 description 7
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 7
- 108010013835 arginine glutamate Proteins 0.000 description 7
- 125000002091 cationic group Chemical group 0.000 description 7
- 210000000349 chromosome Anatomy 0.000 description 7
- 238000010790 dilution Methods 0.000 description 7
- 239000012895 dilution Substances 0.000 description 7
- 230000002496 gastric effect Effects 0.000 description 7
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 7
- 230000003053 immunization Effects 0.000 description 7
- 238000002649 immunization Methods 0.000 description 7
- 230000001939 inductive effect Effects 0.000 description 7
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 7
- 239000002609 medium Substances 0.000 description 7
- 230000035772 mutation Effects 0.000 description 7
- 230000001681 protective effect Effects 0.000 description 7
- 230000007017 scission Effects 0.000 description 7
- 108010026333 seryl-proline Proteins 0.000 description 7
- 239000006228 supernatant Substances 0.000 description 7
- 238000011282 treatment Methods 0.000 description 7
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 6
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 6
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 6
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 6
- PUUPMDXIHCOPJU-HJGDQZAQSA-N Asn-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O PUUPMDXIHCOPJU-HJGDQZAQSA-N 0.000 description 6
- 102100021935 C-C motif chemokine 26 Human genes 0.000 description 6
- 108020004705 Codon Proteins 0.000 description 6
- YBAFDPFAUTYYRW-YUMQZZPRSA-N Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O YBAFDPFAUTYYRW-YUMQZZPRSA-N 0.000 description 6
- OQXDUSZKISQQSS-GUBZILKMSA-N Glu-Lys-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OQXDUSZKISQQSS-GUBZILKMSA-N 0.000 description 6
- QSTLUOIOYLYLLF-WDSKDSINSA-N Gly-Asp-Glu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QSTLUOIOYLYLLF-WDSKDSINSA-N 0.000 description 6
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 6
- 101000897493 Homo sapiens C-C motif chemokine 26 Proteins 0.000 description 6
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 6
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 6
- LZHJZLHSRGWBBE-IHRRRGAJSA-N Leu-Lys-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LZHJZLHSRGWBBE-IHRRRGAJSA-N 0.000 description 6
- ABHIXYDMILIUKV-CIUDSAMLSA-N Lys-Asn-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ABHIXYDMILIUKV-CIUDSAMLSA-N 0.000 description 6
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 6
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 6
- JTKGCYOOJLUETJ-ULQDDVLXSA-N Phe-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JTKGCYOOJLUETJ-ULQDDVLXSA-N 0.000 description 6
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 6
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 6
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 6
- MJOUSKQHAIARKI-JYJNAYRXSA-N Val-Phe-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 MJOUSKQHAIARKI-JYJNAYRXSA-N 0.000 description 6
- 108010070783 alanyltyrosine Proteins 0.000 description 6
- 230000000890 antigenic effect Effects 0.000 description 6
- 230000008827 biological function Effects 0.000 description 6
- 239000012620 biological material Substances 0.000 description 6
- 229960005091 chloramphenicol Drugs 0.000 description 6
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 6
- 238000010276 construction Methods 0.000 description 6
- 238000013461 design Methods 0.000 description 6
- 230000002068 genetic effect Effects 0.000 description 6
- 108010025306 histidylleucine Proteins 0.000 description 6
- 238000007918 intramuscular administration Methods 0.000 description 6
- 238000004519 manufacturing process Methods 0.000 description 6
- 239000000463 material Substances 0.000 description 6
- 108010090894 prolylleucine Proteins 0.000 description 6
- 241000894007 species Species 0.000 description 6
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 6
- 238000007920 subcutaneous administration Methods 0.000 description 6
- 238000006467 substitution reaction Methods 0.000 description 6
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 5
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 5
- QXRNAOYBCYVZCD-BQBZGAKWSA-N Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN QXRNAOYBCYVZCD-BQBZGAKWSA-N 0.000 description 5
- 108010006591 Apoenzymes Proteins 0.000 description 5
- NONSEUUPKITYQT-BQBZGAKWSA-N Arg-Asn-Gly Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N)CN=C(N)N NONSEUUPKITYQT-BQBZGAKWSA-N 0.000 description 5
- IJYZHIOOBGIINM-WDSKDSINSA-N Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N IJYZHIOOBGIINM-WDSKDSINSA-N 0.000 description 5
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 5
- ZAESWDKAMDVHLL-RCOVLWMOSA-N Asn-Val-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O ZAESWDKAMDVHLL-RCOVLWMOSA-N 0.000 description 5
- 241000894006 Bacteria Species 0.000 description 5
- 208000007882 Gastritis Diseases 0.000 description 5
- MPZWMIIOPAPAKE-BQBZGAKWSA-N Glu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N MPZWMIIOPAPAKE-BQBZGAKWSA-N 0.000 description 5
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 5
- SWRVAQHFBRZVNX-GUBZILKMSA-N Glu-Lys-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SWRVAQHFBRZVNX-GUBZILKMSA-N 0.000 description 5
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 5
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 5
- FXLVSYVJDPCIHH-STQMWFEESA-N Gly-Phe-Arg Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FXLVSYVJDPCIHH-STQMWFEESA-N 0.000 description 5
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 5
- 239000004471 Glycine Substances 0.000 description 5
- 102000013462 Interleukin-12 Human genes 0.000 description 5
- 108010065805 Interleukin-12 Proteins 0.000 description 5
- 102000000588 Interleukin-2 Human genes 0.000 description 5
- 108010002350 Interleukin-2 Proteins 0.000 description 5
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 5
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 5
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 5
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 5
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 5
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 5
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 5
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 5
- WSXTWLJHTLRFLW-SRVKXCTJSA-N Lys-Ala-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O WSXTWLJHTLRFLW-SRVKXCTJSA-N 0.000 description 5
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 5
- HKCCVDWHHTVVPN-CIUDSAMLSA-N Lys-Asp-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O HKCCVDWHHTVVPN-CIUDSAMLSA-N 0.000 description 5
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 5
- OVAOHZIOUBEQCJ-IHRRRGAJSA-N Lys-Leu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OVAOHZIOUBEQCJ-IHRRRGAJSA-N 0.000 description 5
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 5
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 5
- HAQLBBVZAGMESV-IHRRRGAJSA-N Met-Lys-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O HAQLBBVZAGMESV-IHRRRGAJSA-N 0.000 description 5
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 5
- 108091028043 Nucleic acid sequence Proteins 0.000 description 5
- 241000283973 Oryctolagus cuniculus Species 0.000 description 5
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 5
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 5
- PBUXMVYWOSKHMF-WDSKDSINSA-N Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CO PBUXMVYWOSKHMF-WDSKDSINSA-N 0.000 description 5
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 5
- 239000007983 Tris buffer Substances 0.000 description 5
- DKKHULUSOSWGHS-UWJYBYFXSA-N Tyr-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N DKKHULUSOSWGHS-UWJYBYFXSA-N 0.000 description 5
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 5
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 5
- UEHRGZCNLSWGHK-DLOVCJGASA-N Val-Glu-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UEHRGZCNLSWGHK-DLOVCJGASA-N 0.000 description 5
- DJEVQCWNMQOABE-RCOVLWMOSA-N Val-Gly-Asp Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N DJEVQCWNMQOABE-RCOVLWMOSA-N 0.000 description 5
- URIRWLJVWHYLET-ONGXEEELSA-N Val-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C URIRWLJVWHYLET-ONGXEEELSA-N 0.000 description 5
- XXWBHOWRARMUOC-NHCYSSNCSA-N Val-Lys-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N XXWBHOWRARMUOC-NHCYSSNCSA-N 0.000 description 5
- DIOSYUIWOQCXNR-ONGXEEELSA-N Val-Lys-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O DIOSYUIWOQCXNR-ONGXEEELSA-N 0.000 description 5
- 238000007792 addition Methods 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 5
- 108010047857 aspartylglycine Proteins 0.000 description 5
- 238000004587 chromatography analysis Methods 0.000 description 5
- 238000009472 formulation Methods 0.000 description 5
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 5
- 108010015792 glycyllysine Proteins 0.000 description 5
- 230000002209 hydrophobic effect Effects 0.000 description 5
- 238000002347 injection Methods 0.000 description 5
- 239000007924 injection Substances 0.000 description 5
- 229940117681 interleukin-12 Drugs 0.000 description 5
- 229930027917 kanamycin Natural products 0.000 description 5
- 229960000318 kanamycin Drugs 0.000 description 5
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 5
- 229930182823 kanamycin A Natural products 0.000 description 5
- 108010076756 leucyl-alanyl-phenylalanine Proteins 0.000 description 5
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 5
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 5
- 108010038320 lysylphenylalanine Proteins 0.000 description 5
- 108010034507 methionyltryptophan Proteins 0.000 description 5
- 239000011859 microparticle Substances 0.000 description 5
- 230000004899 motility Effects 0.000 description 5
- 239000008363 phosphate buffer Substances 0.000 description 5
- 108091008146 restriction endonucleases Proteins 0.000 description 5
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 5
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 5
- 238000013518 transcription Methods 0.000 description 5
- 230000035897 transcription Effects 0.000 description 5
- 230000002103 transcriptional effect Effects 0.000 description 5
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 5
- MFMDKJIPHSWSBM-GUBZILKMSA-N Ala-Lys-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFMDKJIPHSWSBM-GUBZILKMSA-N 0.000 description 4
- VJVQKGYHIZPSNS-FXQIFTODSA-N Ala-Ser-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N VJVQKGYHIZPSNS-FXQIFTODSA-N 0.000 description 4
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 4
- XAXMJQUMRJAFCH-CQDKDKBSSA-N Ala-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 XAXMJQUMRJAFCH-CQDKDKBSSA-N 0.000 description 4
- XUUXCWCKKCZEAW-YFKPBYRVSA-N Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N XUUXCWCKKCZEAW-YFKPBYRVSA-N 0.000 description 4
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 4
- PSUXEQYPYZLNER-QXEWZRGKSA-N Arg-Val-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PSUXEQYPYZLNER-QXEWZRGKSA-N 0.000 description 4
- XYOVHPDDWCEUDY-CIUDSAMLSA-N Asn-Ala-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O XYOVHPDDWCEUDY-CIUDSAMLSA-N 0.000 description 4
- SLKLLQWZQHXYSV-CIUDSAMLSA-N Asn-Ala-Lys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O SLKLLQWZQHXYSV-CIUDSAMLSA-N 0.000 description 4
- ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N Asn-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N 0.000 description 4
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 4
- FBODFHMLALOPHP-GUBZILKMSA-N Asn-Lys-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O FBODFHMLALOPHP-GUBZILKMSA-N 0.000 description 4
- NYGILGUOUOXGMJ-YUMQZZPRSA-N Asn-Lys-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O NYGILGUOUOXGMJ-YUMQZZPRSA-N 0.000 description 4
- AWXDRZJQCVHCIT-DCAQKATOSA-N Asn-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O AWXDRZJQCVHCIT-DCAQKATOSA-N 0.000 description 4
- PIABYSIYPGLLDQ-XVSYOHENSA-N Asn-Thr-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PIABYSIYPGLLDQ-XVSYOHENSA-N 0.000 description 4
- VPPXTHJNTYDNFJ-CIUDSAMLSA-N Asp-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N VPPXTHJNTYDNFJ-CIUDSAMLSA-N 0.000 description 4
- YNQIDCRRTWGHJD-ZLUOBGJFSA-N Asp-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(O)=O YNQIDCRRTWGHJD-ZLUOBGJFSA-N 0.000 description 4
- UGIBTKGQVWFTGX-BIIVOSGPSA-N Asp-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O UGIBTKGQVWFTGX-BIIVOSGPSA-N 0.000 description 4
- VZNOVQKGJQJOCS-SRVKXCTJSA-N Asp-Asp-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VZNOVQKGJQJOCS-SRVKXCTJSA-N 0.000 description 4
- OVPHVTCDVYYTHN-AVGNSLFASA-N Asp-Glu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OVPHVTCDVYYTHN-AVGNSLFASA-N 0.000 description 4
- 102000004127 Cytokines Human genes 0.000 description 4
- 108090000695 Cytokines Proteins 0.000 description 4
- 108090000204 Dipeptidase 1 Proteins 0.000 description 4
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 4
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 4
- SXGAGTVDWKQYCX-BQBZGAKWSA-N Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SXGAGTVDWKQYCX-BQBZGAKWSA-N 0.000 description 4
- LHIPZASLKPYDPI-AVGNSLFASA-N Glu-Phe-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LHIPZASLKPYDPI-AVGNSLFASA-N 0.000 description 4
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 4
- JVYNYWXHZWVJEF-NUMRIWBASA-N Glu-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O JVYNYWXHZWVJEF-NUMRIWBASA-N 0.000 description 4
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 4
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 4
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 4
- CCBIBMKQNXHNIN-ZETCQYMHSA-N Gly-Leu-Gly Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CCBIBMKQNXHNIN-ZETCQYMHSA-N 0.000 description 4
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 4
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 4
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 4
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 4
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 4
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 4
- ZRALSGWEFCBTJO-UHFFFAOYSA-N Guanidine Chemical compound NC(N)=N ZRALSGWEFCBTJO-UHFFFAOYSA-N 0.000 description 4
- 241000282412 Homo Species 0.000 description 4
- XIRYQRLFHWWWTC-QEJZJMRPSA-N Leu-Ala-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XIRYQRLFHWWWTC-QEJZJMRPSA-N 0.000 description 4
- HXWALXSAVBLTPK-NUTKFTJISA-N Leu-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(C)C)N HXWALXSAVBLTPK-NUTKFTJISA-N 0.000 description 4
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 4
- LLBQJYDYOLIQAI-JYJNAYRXSA-N Leu-Glu-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LLBQJYDYOLIQAI-JYJNAYRXSA-N 0.000 description 4
- LESXFEZIFXFIQR-LURJTMIESA-N Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(O)=O LESXFEZIFXFIQR-LURJTMIESA-N 0.000 description 4
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 4
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 4
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 4
- IBSGMIPRBMPMHE-IHRRRGAJSA-N Leu-Met-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O IBSGMIPRBMPMHE-IHRRRGAJSA-N 0.000 description 4
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 4
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 4
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 4
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 4
- WALVCOOOKULCQM-ULQDDVLXSA-N Lys-Arg-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WALVCOOOKULCQM-ULQDDVLXSA-N 0.000 description 4
- KPJJOZUXFOLGMQ-CIUDSAMLSA-N Lys-Asp-Asn Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N KPJJOZUXFOLGMQ-CIUDSAMLSA-N 0.000 description 4
- QIJVAFLRMVBHMU-KKUMJFAQSA-N Lys-Asp-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QIJVAFLRMVBHMU-KKUMJFAQSA-N 0.000 description 4
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 4
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 4
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 4
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 4
- PLOUVAYOMTYJRG-JXUBOQSCSA-N Lys-Thr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PLOUVAYOMTYJRG-JXUBOQSCSA-N 0.000 description 4
- RMOKGALPSPOYKE-KATARQTJSA-N Lys-Thr-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMOKGALPSPOYKE-KATARQTJSA-N 0.000 description 4
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 4
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 4
- SOAYQFDWEIWPPR-IHRRRGAJSA-N Met-Ser-Tyr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O SOAYQFDWEIWPPR-IHRRRGAJSA-N 0.000 description 4
- GLUBLISJVJFHQS-VIFPVBQESA-N Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 GLUBLISJVJFHQS-VIFPVBQESA-N 0.000 description 4
- SMFGCTXUBWEPKM-KBPBESRZSA-N Phe-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 SMFGCTXUBWEPKM-KBPBESRZSA-N 0.000 description 4
- DOXQMJCSSYZSNM-BZSNNMDCSA-N Phe-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O DOXQMJCSSYZSNM-BZSNNMDCSA-N 0.000 description 4
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 4
- IAOZOFPONWDXNT-IXOXFDKPSA-N Phe-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IAOZOFPONWDXNT-IXOXFDKPSA-N 0.000 description 4
- RUDOLGWDSKQQFF-DCAQKATOSA-N Pro-Leu-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O RUDOLGWDSKQQFF-DCAQKATOSA-N 0.000 description 4
- PRKWBYCXBBSLSK-GUBZILKMSA-N Pro-Ser-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O PRKWBYCXBBSLSK-GUBZILKMSA-N 0.000 description 4
- 241000293869 Salmonella enterica subsp. enterica serovar Typhimurium Species 0.000 description 4
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 4
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 4
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 4
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 4
- PIQRHJQWEPWFJG-UWJYBYFXSA-N Ser-Tyr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PIQRHJQWEPWFJG-UWJYBYFXSA-N 0.000 description 4
- FHXGMDRKJHKLKW-QWRGUYRKSA-N Ser-Tyr-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 FHXGMDRKJHKLKW-QWRGUYRKSA-N 0.000 description 4
- 238000002105 Southern blotting Methods 0.000 description 4
- XSEPSRUDSPHMPX-KATARQTJSA-N Thr-Lys-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O XSEPSRUDSPHMPX-KATARQTJSA-N 0.000 description 4
- QOLYAJSZHIJCTO-VQVTYTSYSA-N Thr-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O QOLYAJSZHIJCTO-VQVTYTSYSA-N 0.000 description 4
- DEGCBBCMYWNJNA-RHYQMDGZSA-N Thr-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O DEGCBBCMYWNJNA-RHYQMDGZSA-N 0.000 description 4
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 4
- SYOMXKPPFZRELL-ONGXEEELSA-N Val-Gly-Lys Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N SYOMXKPPFZRELL-ONGXEEELSA-N 0.000 description 4
- LAYSXAOGWHKNED-XPUUQOCRSA-N Val-Gly-Ser Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LAYSXAOGWHKNED-XPUUQOCRSA-N 0.000 description 4
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 4
- PZTZYZUTCPZWJH-FXQIFTODSA-N Val-Ser-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PZTZYZUTCPZWJH-FXQIFTODSA-N 0.000 description 4
- LCHZBEUVGAVMKS-RHYQMDGZSA-N Val-Thr-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)[C@@H](C)O)C(O)=O LCHZBEUVGAVMKS-RHYQMDGZSA-N 0.000 description 4
- LLJLBRRXKZTTRD-GUBZILKMSA-N Val-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N LLJLBRRXKZTTRD-GUBZILKMSA-N 0.000 description 4
- 238000002835 absorbance Methods 0.000 description 4
- 125000000129 anionic group Chemical group 0.000 description 4
- 238000013459 approach Methods 0.000 description 4
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 4
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 4
- 108010062796 arginyllysine Proteins 0.000 description 4
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 4
- 108010093581 aspartyl-proline Proteins 0.000 description 4
- 238000001514 detection method Methods 0.000 description 4
- 239000003814 drug Substances 0.000 description 4
- 239000013604 expression vector Substances 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 4
- 238000001476 gene delivery Methods 0.000 description 4
- 108010073628 glutamyl-valyl-phenylalanine Proteins 0.000 description 4
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 4
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 4
- 108010089804 glycyl-threonine Proteins 0.000 description 4
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 4
- 229960000789 guanidine hydrochloride Drugs 0.000 description 4
- PJJJBBJSCAKJQF-UHFFFAOYSA-N guanidinium chloride Chemical compound [Cl-].NC(N)=[NH2+] PJJJBBJSCAKJQF-UHFFFAOYSA-N 0.000 description 4
- 108010040030 histidinoalanine Proteins 0.000 description 4
- 108010092114 histidylphenylalanine Proteins 0.000 description 4
- 108010085325 histidylproline Proteins 0.000 description 4
- 108010018006 histidylserine Proteins 0.000 description 4
- 230000000521 hyperimmunizing effect Effects 0.000 description 4
- 238000011534 incubation Methods 0.000 description 4
- 238000001990 intravenous administration Methods 0.000 description 4
- 150000002632 lipids Chemical class 0.000 description 4
- 108010056582 methionylglutamic acid Proteins 0.000 description 4
- 230000007935 neutral effect Effects 0.000 description 4
- PXHVJJICTQNCMI-UHFFFAOYSA-N nickel Substances [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 4
- 244000052769 pathogen Species 0.000 description 4
- 108010012581 phenylalanylglutamate Proteins 0.000 description 4
- 230000037452 priming Effects 0.000 description 4
- 108010031719 prolyl-serine Proteins 0.000 description 4
- 230000002829 reductive effect Effects 0.000 description 4
- 230000002441 reversible effect Effects 0.000 description 4
- 108010015840 seryl-prolyl-lysyl-lysine Proteins 0.000 description 4
- 239000012064 sodium phosphate buffer Substances 0.000 description 4
- 238000001179 sorption measurement Methods 0.000 description 4
- 239000003053 toxin Substances 0.000 description 4
- 231100000765 toxin Toxicity 0.000 description 4
- 108700012359 toxins Proteins 0.000 description 4
- 108010029384 tryptophyl-histidine Proteins 0.000 description 4
- 108010077037 tyrosyl-tyrosyl-phenylalanine Proteins 0.000 description 4
- 108010078580 tyrosylleucine Proteins 0.000 description 4
- 238000011144 upstream manufacturing Methods 0.000 description 4
- 239000003981 vehicle Substances 0.000 description 4
- GORKKVHIBWAQHM-GCJQMDKQSA-N Ala-Asn-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GORKKVHIBWAQHM-GCJQMDKQSA-N 0.000 description 3
- XAGIMRPOEJSYER-CIUDSAMLSA-N Ala-Cys-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N XAGIMRPOEJSYER-CIUDSAMLSA-N 0.000 description 3
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 3
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 3
- MNZHHDPWDWQJCQ-YUMQZZPRSA-N Ala-Leu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O MNZHHDPWDWQJCQ-YUMQZZPRSA-N 0.000 description 3
- FUKFQILQFQKHLE-DCAQKATOSA-N Ala-Lys-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O FUKFQILQFQKHLE-DCAQKATOSA-N 0.000 description 3
- XRUJOVRWNMBAAA-NHCYSSNCSA-N Ala-Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 XRUJOVRWNMBAAA-NHCYSSNCSA-N 0.000 description 3
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 3
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 3
- JJHBEVZAZXZREW-LFSVMHDDSA-N Ala-Thr-Phe Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O JJHBEVZAZXZREW-LFSVMHDDSA-N 0.000 description 3
- ZJLORAAXDAJLDC-CQDKDKBSSA-N Ala-Tyr-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O ZJLORAAXDAJLDC-CQDKDKBSSA-N 0.000 description 3
- SBVJJNJLFWSJOV-UBHSHLNASA-N Arg-Ala-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SBVJJNJLFWSJOV-UBHSHLNASA-N 0.000 description 3
- BVBKBQRPOJFCQM-DCAQKATOSA-N Arg-Asn-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BVBKBQRPOJFCQM-DCAQKATOSA-N 0.000 description 3
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 3
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 3
- NUHQMYUWLUSRJX-BIIVOSGPSA-N Asn-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N NUHQMYUWLUSRJX-BIIVOSGPSA-N 0.000 description 3
- RJUHZPRQRQLCFL-IMJSIDKUSA-N Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(O)=O RJUHZPRQRQLCFL-IMJSIDKUSA-N 0.000 description 3
- ULRPXVNMIIYDDJ-ACZMJKKPSA-N Asn-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N ULRPXVNMIIYDDJ-ACZMJKKPSA-N 0.000 description 3
- KHCNTVRVAYCPQE-CIUDSAMLSA-N Asn-Lys-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O KHCNTVRVAYCPQE-CIUDSAMLSA-N 0.000 description 3
- GIQCDTKOIPUDSG-GARJFASQSA-N Asn-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N)C(=O)O GIQCDTKOIPUDSG-GARJFASQSA-N 0.000 description 3
- HZZIFFOVHLWGCS-KKUMJFAQSA-N Asn-Phe-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O HZZIFFOVHLWGCS-KKUMJFAQSA-N 0.000 description 3
- YXVAESUIQFDBHN-SRVKXCTJSA-N Asn-Phe-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O YXVAESUIQFDBHN-SRVKXCTJSA-N 0.000 description 3
- JTXVXGXTRXMOFJ-FXQIFTODSA-N Asn-Pro-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O JTXVXGXTRXMOFJ-FXQIFTODSA-N 0.000 description 3
- REQUGIWGOGSOEZ-ZLUOBGJFSA-N Asn-Ser-Asn Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)C(=O)N REQUGIWGOGSOEZ-ZLUOBGJFSA-N 0.000 description 3
- HNXWVVHIGTZTBO-LKXGYXEUSA-N Asn-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O HNXWVVHIGTZTBO-LKXGYXEUSA-N 0.000 description 3
- NCXTYSVDWLAQGZ-ZKWXMUAHSA-N Asn-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O NCXTYSVDWLAQGZ-ZKWXMUAHSA-N 0.000 description 3
- HPASIOLTWSNMFB-OLHMAJIHSA-N Asn-Thr-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O HPASIOLTWSNMFB-OLHMAJIHSA-N 0.000 description 3
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 3
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 3
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 3
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 3
- 101710112752 Cytotoxin Proteins 0.000 description 3
- 108010090461 DFG peptide Proteins 0.000 description 3
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 3
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 3
- 238000002965 ELISA Methods 0.000 description 3
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 3
- RJONUNZIMUXUOI-GUBZILKMSA-N Glu-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N RJONUNZIMUXUOI-GUBZILKMSA-N 0.000 description 3
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 3
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 3
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 3
- CHDWDBPJOZVZSE-KKUMJFAQSA-N Glu-Phe-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O CHDWDBPJOZVZSE-KKUMJFAQSA-N 0.000 description 3
- QOXDAWODGSIDDI-GUBZILKMSA-N Glu-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N QOXDAWODGSIDDI-GUBZILKMSA-N 0.000 description 3
- UXJHNZODTMHWRD-WHFBIAKZSA-N Gly-Asn-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O UXJHNZODTMHWRD-WHFBIAKZSA-N 0.000 description 3
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 3
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 3
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 3
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 3
- INLIXXRWNUKVCF-JTQLQIEISA-N Gly-Gly-Tyr Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 INLIXXRWNUKVCF-JTQLQIEISA-N 0.000 description 3
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 3
- UHPAZODVFFYEEL-QWRGUYRKSA-N Gly-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN UHPAZODVFFYEEL-QWRGUYRKSA-N 0.000 description 3
- TVUWMSBGMVAHSJ-KBPBESRZSA-N Gly-Leu-Phe Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TVUWMSBGMVAHSJ-KBPBESRZSA-N 0.000 description 3
- HQSKKSLNLSTONK-JTQLQIEISA-N Gly-Tyr-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 HQSKKSLNLSTONK-JTQLQIEISA-N 0.000 description 3
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 3
- 241000590002 Helicobacter pylori Species 0.000 description 3
- VCDNHBNNPCDBKV-DLOVCJGASA-N His-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N VCDNHBNNPCDBKV-DLOVCJGASA-N 0.000 description 3
- MBSSHYPAEHPSGY-LSJOCFKGSA-N His-Ala-Met Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O MBSSHYPAEHPSGY-LSJOCFKGSA-N 0.000 description 3
- GGXUJBKENKVYNV-ULQDDVLXSA-N His-Val-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N GGXUJBKENKVYNV-ULQDDVLXSA-N 0.000 description 3
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 3
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 3
- QOOWRKBDDXQRHC-BQBZGAKWSA-N L-lysyl-L-alanine Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN QOOWRKBDDXQRHC-BQBZGAKWSA-N 0.000 description 3
- HBJZFCIVFIBNSV-DCAQKATOSA-N Leu-Arg-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O HBJZFCIVFIBNSV-DCAQKATOSA-N 0.000 description 3
- MDVZJYGNAGLPGJ-KKUMJFAQSA-N Leu-Asn-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MDVZJYGNAGLPGJ-KKUMJFAQSA-N 0.000 description 3
- MMEDVBWCMGRKKC-GARJFASQSA-N Leu-Asp-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N MMEDVBWCMGRKKC-GARJFASQSA-N 0.000 description 3
- LAGPXKYZCCTSGQ-JYJNAYRXSA-N Leu-Glu-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LAGPXKYZCCTSGQ-JYJNAYRXSA-N 0.000 description 3
- BABSVXFGKFLIGW-UWVGGRQHSA-N Leu-Gly-Arg Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N BABSVXFGKFLIGW-UWVGGRQHSA-N 0.000 description 3
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 3
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 3
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 3
- OVZLLFONXILPDZ-VOAKCMCISA-N Leu-Lys-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OVZLLFONXILPDZ-VOAKCMCISA-N 0.000 description 3
- FLNPJLDPGMLWAU-UWVGGRQHSA-N Leu-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(C)C FLNPJLDPGMLWAU-UWVGGRQHSA-N 0.000 description 3
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 3
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 3
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 3
- 239000006142 Luria-Bertani Agar Substances 0.000 description 3
- GGAPIOORBXHMNY-ULQDDVLXSA-N Lys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)O GGAPIOORBXHMNY-ULQDDVLXSA-N 0.000 description 3
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 3
- NTBFKPBULZGXQL-KKUMJFAQSA-N Lys-Asp-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTBFKPBULZGXQL-KKUMJFAQSA-N 0.000 description 3
- GJJQCBVRWDGLMQ-GUBZILKMSA-N Lys-Glu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O GJJQCBVRWDGLMQ-GUBZILKMSA-N 0.000 description 3
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 3
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 3
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 3
- PGLGNCVOWIORQE-SRVKXCTJSA-N Lys-His-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O PGLGNCVOWIORQE-SRVKXCTJSA-N 0.000 description 3
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 3
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 3
- YXPJCVNIDDKGOE-MELADBBJSA-N Lys-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N)C(=O)O YXPJCVNIDDKGOE-MELADBBJSA-N 0.000 description 3
- YSZNURNVYFUEHC-BQBZGAKWSA-N Lys-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O YSZNURNVYFUEHC-BQBZGAKWSA-N 0.000 description 3
- ZOKVLMBYDSIDKG-CSMHCCOUSA-N Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ZOKVLMBYDSIDKG-CSMHCCOUSA-N 0.000 description 3
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 3
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 3
- MDDUIRLQCYVRDO-NHCYSSNCSA-N Lys-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN MDDUIRLQCYVRDO-NHCYSSNCSA-N 0.000 description 3
- UGCIQUYEJIEHKX-GVXVVHGQSA-N Lys-Val-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O UGCIQUYEJIEHKX-GVXVVHGQSA-N 0.000 description 3
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 3
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 3
- AETNZPKUUYYYEK-CIUDSAMLSA-N Met-Glu-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O AETNZPKUUYYYEK-CIUDSAMLSA-N 0.000 description 3
- QXOHLNCNYLGICT-YFKPBYRVSA-N Met-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(O)=O QXOHLNCNYLGICT-YFKPBYRVSA-N 0.000 description 3
- IUYCGMNKIZDRQI-BQBZGAKWSA-N Met-Gly-Ala Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O IUYCGMNKIZDRQI-BQBZGAKWSA-N 0.000 description 3
- MXEASDMFHUKOGE-ULQDDVLXSA-N Met-His-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N MXEASDMFHUKOGE-ULQDDVLXSA-N 0.000 description 3
- WPTHAGXMYDRPFD-SRVKXCTJSA-N Met-Lys-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O WPTHAGXMYDRPFD-SRVKXCTJSA-N 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 241001529936 Murinae Species 0.000 description 3
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 3
- 108010079364 N-glycylalanine Proteins 0.000 description 3
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 3
- 239000000020 Nitrocellulose Substances 0.000 description 3
- 208000008469 Peptic Ulcer Diseases 0.000 description 3
- 102000003992 Peroxidases Human genes 0.000 description 3
- 108010081690 Pertussis Toxin Proteins 0.000 description 3
- UHRNIXJAGGLKHP-DLOVCJGASA-N Phe-Ala-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O UHRNIXJAGGLKHP-DLOVCJGASA-N 0.000 description 3
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 3
- CDNPIRSCAFMMBE-SRVKXCTJSA-N Phe-Asn-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O CDNPIRSCAFMMBE-SRVKXCTJSA-N 0.000 description 3
- DDYIRGBOZVKRFR-AVGNSLFASA-N Phe-Asp-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N DDYIRGBOZVKRFR-AVGNSLFASA-N 0.000 description 3
- OJUMUUXGSXUZJZ-SRVKXCTJSA-N Phe-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OJUMUUXGSXUZJZ-SRVKXCTJSA-N 0.000 description 3
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 3
- GKZIWHRNKRBEOH-HOTGVXAUSA-N Phe-Phe Chemical compound C([C@H]([NH3+])C(=O)N[C@@H](CC=1C=CC=CC=1)C([O-])=O)C1=CC=CC=C1 GKZIWHRNKRBEOH-HOTGVXAUSA-N 0.000 description 3
- KAJLHCWRWDSROH-BZSNNMDCSA-N Phe-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 KAJLHCWRWDSROH-BZSNNMDCSA-N 0.000 description 3
- WWPAHTZOWURIMR-ULQDDVLXSA-N Phe-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 WWPAHTZOWURIMR-ULQDDVLXSA-N 0.000 description 3
- BPCLGWHVPVTTFM-QWRGUYRKSA-N Phe-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O BPCLGWHVPVTTFM-QWRGUYRKSA-N 0.000 description 3
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 3
- KLYYKKGCPOGDPE-OEAJRASXSA-N Phe-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O KLYYKKGCPOGDPE-OEAJRASXSA-N 0.000 description 3
- PTDAGKJHZBGDKD-OEAJRASXSA-N Phe-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O PTDAGKJHZBGDKD-OEAJRASXSA-N 0.000 description 3
- MJOJSHOTYWABPR-WIRXVTQYSA-N Phe-Trp-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 MJOJSHOTYWABPR-WIRXVTQYSA-N 0.000 description 3
- QTDBZORPVYTRJU-KKXDTOCCSA-N Phe-Tyr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O QTDBZORPVYTRJU-KKXDTOCCSA-N 0.000 description 3
- DBNGDEAQXGFGRA-ACRUOGEOSA-N Phe-Tyr-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DBNGDEAQXGFGRA-ACRUOGEOSA-N 0.000 description 3
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 3
- FMLRRBDLBJLJIK-DCAQKATOSA-N Pro-Leu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FMLRRBDLBJLJIK-DCAQKATOSA-N 0.000 description 3
- WHNJMTHJGCEKGA-ULQDDVLXSA-N Pro-Phe-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O WHNJMTHJGCEKGA-ULQDDVLXSA-N 0.000 description 3
- UIUWGMRJTWHIJZ-ULQDDVLXSA-N Pro-Tyr-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O UIUWGMRJTWHIJZ-ULQDDVLXSA-N 0.000 description 3
- 229920005654 Sephadex Polymers 0.000 description 3
- 239000012507 Sephadex™ Substances 0.000 description 3
- 229920002684 Sepharose Polymers 0.000 description 3
- HBOABDXGTMMDSE-GUBZILKMSA-N Ser-Arg-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O HBOABDXGTMMDSE-GUBZILKMSA-N 0.000 description 3
- XVAUJOAYHWWNQF-ZLUOBGJFSA-N Ser-Asn-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O XVAUJOAYHWWNQF-ZLUOBGJFSA-N 0.000 description 3
- FYUIFUJFNCLUIX-XVYDVKMFSA-N Ser-His-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O FYUIFUJFNCLUIX-XVYDVKMFSA-N 0.000 description 3
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 3
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 description 3
- XVWDJUROVRQKAE-KKUMJFAQSA-N Ser-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=CC=C1 XVWDJUROVRQKAE-KKUMJFAQSA-N 0.000 description 3
- QMCDMHWAKMUGJE-IHRRRGAJSA-N Ser-Phe-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O QMCDMHWAKMUGJE-IHRRRGAJSA-N 0.000 description 3
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 3
- LDEBVRIURYMKQS-WISUUJSJSA-N Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO LDEBVRIURYMKQS-WISUUJSJSA-N 0.000 description 3
- HAYADTTXNZFUDM-IHRRRGAJSA-N Ser-Tyr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O HAYADTTXNZFUDM-IHRRRGAJSA-N 0.000 description 3
- JZRYFUGREMECBH-XPUUQOCRSA-N Ser-Val-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O JZRYFUGREMECBH-XPUUQOCRSA-N 0.000 description 3
- YEDSOSIKVUMIJE-DCAQKATOSA-N Ser-Val-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O YEDSOSIKVUMIJE-DCAQKATOSA-N 0.000 description 3
- 108091081024 Start codon Proteins 0.000 description 3
- RRRRCRYTLZVCEN-HJGDQZAQSA-N Thr-Leu-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O RRRRCRYTLZVCEN-HJGDQZAQSA-N 0.000 description 3
- MEBDIIKMUUNBSB-RPTUDFQQSA-N Thr-Phe-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MEBDIIKMUUNBSB-RPTUDFQQSA-N 0.000 description 3
- DSGIVWSDDRDJIO-ZXXMMSQZSA-N Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DSGIVWSDDRDJIO-ZXXMMSQZSA-N 0.000 description 3
- CRWOSTCODDFEKZ-HRCADAONSA-N Tyr-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O CRWOSTCODDFEKZ-HRCADAONSA-N 0.000 description 3
- XMNDQSYABVWZRK-BZSNNMDCSA-N Tyr-Asn-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XMNDQSYABVWZRK-BZSNNMDCSA-N 0.000 description 3
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 3
- NVZVJIUDICCMHZ-BZSNNMDCSA-N Tyr-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O NVZVJIUDICCMHZ-BZSNNMDCSA-N 0.000 description 3
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 3
- AGDDLOQMXUQPDY-BZSNNMDCSA-N Tyr-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O AGDDLOQMXUQPDY-BZSNNMDCSA-N 0.000 description 3
- COYSIHFOCOMGCF-WPRPVWTQSA-N Val-Arg-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-WPRPVWTQSA-N 0.000 description 3
- UDLYXGYWTVOIKU-QXEWZRGKSA-N Val-Asn-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UDLYXGYWTVOIKU-QXEWZRGKSA-N 0.000 description 3
- DBOXBUDEAJVKRE-LSJOCFKGSA-N Val-Asn-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DBOXBUDEAJVKRE-LSJOCFKGSA-N 0.000 description 3
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 3
- JTWIMNMUYLQNPI-WPRPVWTQSA-N Val-Gly-Arg Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N JTWIMNMUYLQNPI-WPRPVWTQSA-N 0.000 description 3
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 3
- FEXILLGKGGTLRI-NHCYSSNCSA-N Val-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N FEXILLGKGGTLRI-NHCYSSNCSA-N 0.000 description 3
- AEMPCGRFEZTWIF-IHRRRGAJSA-N Val-Leu-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O AEMPCGRFEZTWIF-IHRRRGAJSA-N 0.000 description 3
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 3
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 3
- AIWLHFZYOUUJGB-UFYCRDLUSA-N Val-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 AIWLHFZYOUUJGB-UFYCRDLUSA-N 0.000 description 3
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 3
- KSFXWENSJABBFI-ZKWXMUAHSA-N Val-Ser-Asn Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KSFXWENSJABBFI-ZKWXMUAHSA-N 0.000 description 3
- QPJSIBAOZBVELU-BPNCWPANSA-N Val-Tyr-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N QPJSIBAOZBVELU-BPNCWPANSA-N 0.000 description 3
- IOUPEELXVYPCPG-UHFFFAOYSA-N Valylglycine Chemical compound CC(C)C(N)C(=O)NCC(O)=O IOUPEELXVYPCPG-UHFFFAOYSA-N 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 238000001042 affinity chromatography Methods 0.000 description 3
- 108010070944 alanylhistidine Proteins 0.000 description 3
- 108010011559 alanylphenylalanine Proteins 0.000 description 3
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 3
- 102000006635 beta-lactamase Human genes 0.000 description 3
- 210000004369 blood Anatomy 0.000 description 3
- 239000008280 blood Substances 0.000 description 3
- 229940098773 bovine serum albumin Drugs 0.000 description 3
- 239000007853 buffer solution Substances 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- 239000013611 chromosomal DNA Substances 0.000 description 3
- 230000000295 complement effect Effects 0.000 description 3
- 108091036078 conserved sequence Proteins 0.000 description 3
- 231100000599 cytotoxic agent Toxicity 0.000 description 3
- 239000002619 cytotoxin Substances 0.000 description 3
- 238000000502 dialysis Methods 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- 108010054813 diprotin B Proteins 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 210000003495 flagella Anatomy 0.000 description 3
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 3
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 3
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 3
- 108010020688 glycylhistidine Proteins 0.000 description 3
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 3
- 239000010931 gold Substances 0.000 description 3
- 229910052737 gold Inorganic materials 0.000 description 3
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 3
- 108010028295 histidylhistidine Proteins 0.000 description 3
- 230000002163 immunogen Effects 0.000 description 3
- 239000012535 impurity Substances 0.000 description 3
- 239000004615 ingredient Substances 0.000 description 3
- 230000000968 intestinal effect Effects 0.000 description 3
- 238000007912 intraperitoneal administration Methods 0.000 description 3
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 3
- 108010053037 kyotorphin Proteins 0.000 description 3
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 3
- 108010091871 leucylmethionine Proteins 0.000 description 3
- 230000029226 lipidation Effects 0.000 description 3
- 108700023046 methionyl-leucyl-phenylalanine Proteins 0.000 description 3
- 108010068488 methionylphenylalanine Proteins 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 238000002703 mutagenesis Methods 0.000 description 3
- 231100000350 mutagenesis Toxicity 0.000 description 3
- 229920001220 nitrocellulose Polymers 0.000 description 3
- 238000007911 parenteral administration Methods 0.000 description 3
- 108040007629 peroxidase activity proteins Proteins 0.000 description 3
- 108010070409 phenylalanyl-glycyl-glycine Proteins 0.000 description 3
- 108010018625 phenylalanylarginine Proteins 0.000 description 3
- 229920000136 polysorbate Polymers 0.000 description 3
- 230000002685 pulmonary effect Effects 0.000 description 3
- 210000002966 serum Anatomy 0.000 description 3
- 108010048818 seryl-histidine Proteins 0.000 description 3
- 239000007787 solid Substances 0.000 description 3
- PFNFFQXMRSDOHW-UHFFFAOYSA-N spermine Chemical compound NCCCNCCCCNCCCN PFNFFQXMRSDOHW-UHFFFAOYSA-N 0.000 description 3
- 210000002784 stomach Anatomy 0.000 description 3
- 229960005322 streptomycin Drugs 0.000 description 3
- 239000000758 substrate Substances 0.000 description 3
- 239000000725 suspension Substances 0.000 description 3
- 210000001519 tissue Anatomy 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 108010038745 tryptophylglycine Proteins 0.000 description 3
- 241000701161 unidentified adenovirus Species 0.000 description 3
- 210000001635 urinary tract Anatomy 0.000 description 3
- 238000002255 vaccination Methods 0.000 description 3
- 108010009962 valyltyrosine Proteins 0.000 description 3
- 230000003612 virological effect Effects 0.000 description 3
- 238000001262 western blot Methods 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 2
- PAHHYDSPOXDASW-VGWMRTNUSA-N (2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-1-[(2s)-2-amino-3-hydroxypropanoyl]pyrrolidine-2-carbonyl]amino]hexanoyl]amino]hexanoic acid Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO PAHHYDSPOXDASW-VGWMRTNUSA-N 0.000 description 2
- LEBVLXFERQHONN-UHFFFAOYSA-N 1-butyl-N-(2,6-dimethylphenyl)piperidine-2-carboxamide Chemical compound CCCCN1CCCCC1C(=O)NC1=C(C)C=CC=C1C LEBVLXFERQHONN-UHFFFAOYSA-N 0.000 description 2
- LDGWQMRUWMSZIU-LQDDAWAPSA-M 2,3-bis[(z)-octadec-9-enoxy]propyl-trimethylazanium;chloride Chemical compound [Cl-].CCCCCCCC\C=C/CCCCCCCCOCC(C[N+](C)(C)C)OCCCCCCCC\C=C/CCCCCCCC LDGWQMRUWMSZIU-LQDDAWAPSA-M 0.000 description 2
- HZAXFHJVJLSVMW-UHFFFAOYSA-N 2-Aminoethan-1-ol Chemical compound NCCO HZAXFHJVJLSVMW-UHFFFAOYSA-N 0.000 description 2
- DQPMXYDFWRYWQV-UHFFFAOYSA-N 2-[[6-amino-2-[[2-[(2-amino-3-methylbutanoyl)amino]-3-hydroxybutanoyl]amino]hexanoyl]amino]acetic acid Chemical compound CC(C)C(N)C(=O)NC(C(C)O)C(=O)NC(CCCCN)C(=O)NCC(O)=O DQPMXYDFWRYWQV-UHFFFAOYSA-N 0.000 description 2
- 229920001817 Agar Polymers 0.000 description 2
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 2
- CCUAQNUWXLYFRA-IMJSIDKUSA-N Ala-Asn Chemical compound C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC(N)=O CCUAQNUWXLYFRA-IMJSIDKUSA-N 0.000 description 2
- PXKLCFFSVLKOJM-ACZMJKKPSA-N Ala-Asn-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PXKLCFFSVLKOJM-ACZMJKKPSA-N 0.000 description 2
- SHYYAQLDNVHPFT-DLOVCJGASA-N Ala-Asn-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SHYYAQLDNVHPFT-DLOVCJGASA-N 0.000 description 2
- XQJAFSDFQZPYCU-UWJYBYFXSA-N Ala-Asn-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N XQJAFSDFQZPYCU-UWJYBYFXSA-N 0.000 description 2
- ZIBWKCRKNFYTPT-ZKWXMUAHSA-N Ala-Asn-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZIBWKCRKNFYTPT-ZKWXMUAHSA-N 0.000 description 2
- GSCLWXDNIMNIJE-ZLUOBGJFSA-N Ala-Asp-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O GSCLWXDNIMNIJE-ZLUOBGJFSA-N 0.000 description 2
- BTYTYHBSJKQBQA-GCJQMDKQSA-N Ala-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N)O BTYTYHBSJKQBQA-GCJQMDKQSA-N 0.000 description 2
- KUDREHRZRIVKHS-UWJYBYFXSA-N Ala-Asp-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KUDREHRZRIVKHS-UWJYBYFXSA-N 0.000 description 2
- OILNWMNBLIHXQK-ZLUOBGJFSA-N Ala-Cys-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O OILNWMNBLIHXQK-ZLUOBGJFSA-N 0.000 description 2
- IXTPACPAXIOCRG-ACZMJKKPSA-N Ala-Glu-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N IXTPACPAXIOCRG-ACZMJKKPSA-N 0.000 description 2
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 2
- CWEAKSWWKHGTRJ-BQBZGAKWSA-N Ala-Gly-Met Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O CWEAKSWWKHGTRJ-BQBZGAKWSA-N 0.000 description 2
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 2
- KMGOBAQSCKTBGD-DLOVCJGASA-N Ala-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CN=CN1 KMGOBAQSCKTBGD-DLOVCJGASA-N 0.000 description 2
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 2
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 2
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 2
- UWIQWPWWZUHBAO-ZLIFDBKOSA-N Ala-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)CC(C)C)C(O)=O)=CNC2=C1 UWIQWPWWZUHBAO-ZLIFDBKOSA-N 0.000 description 2
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 2
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 2
- OQWQTGBOFPJOIF-DLOVCJGASA-N Ala-Lys-His Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N OQWQTGBOFPJOIF-DLOVCJGASA-N 0.000 description 2
- CHFFHQUVXHEGBY-GARJFASQSA-N Ala-Lys-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N CHFFHQUVXHEGBY-GARJFASQSA-N 0.000 description 2
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 2
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 2
- BFMIRJBURUXDRG-DLOVCJGASA-N Ala-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 BFMIRJBURUXDRG-DLOVCJGASA-N 0.000 description 2
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 2
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 2
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 2
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 2
- PHQXWZGXKAFWAZ-ZLIFDBKOSA-N Ala-Trp-Lys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 PHQXWZGXKAFWAZ-ZLIFDBKOSA-N 0.000 description 2
- JNJHNBXBGNJESC-KKXDTOCCSA-N Ala-Tyr-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JNJHNBXBGNJESC-KKXDTOCCSA-N 0.000 description 2
- VWVPYNGMOCSSGK-GUBZILKMSA-N Arg-Arg-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O VWVPYNGMOCSSGK-GUBZILKMSA-N 0.000 description 2
- UXJCMQFPDWCHKX-DCAQKATOSA-N Arg-Arg-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UXJCMQFPDWCHKX-DCAQKATOSA-N 0.000 description 2
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 2
- RWCLSUOSKWTXLA-FXQIFTODSA-N Arg-Asp-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RWCLSUOSKWTXLA-FXQIFTODSA-N 0.000 description 2
- PQWTZSNVWSOFFK-FXQIFTODSA-N Arg-Asp-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)CN=C(N)N PQWTZSNVWSOFFK-FXQIFTODSA-N 0.000 description 2
- KMSHNDWHPWXPEC-BQBZGAKWSA-N Arg-Asp-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KMSHNDWHPWXPEC-BQBZGAKWSA-N 0.000 description 2
- UFBURHXMKFQVLM-CIUDSAMLSA-N Arg-Glu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UFBURHXMKFQVLM-CIUDSAMLSA-N 0.000 description 2
- OQCWXQJLCDPRHV-UWVGGRQHSA-N Arg-Gly-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O OQCWXQJLCDPRHV-UWVGGRQHSA-N 0.000 description 2
- WVNFNPGXYADPPO-BQBZGAKWSA-N Arg-Gly-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O WVNFNPGXYADPPO-BQBZGAKWSA-N 0.000 description 2
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 2
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 2
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 2
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 2
- QBQVKUNBCAFXSV-ULQDDVLXSA-N Arg-Lys-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QBQVKUNBCAFXSV-ULQDDVLXSA-N 0.000 description 2
- DTBPLQNKYCYUOM-JYJNAYRXSA-N Arg-Met-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DTBPLQNKYCYUOM-JYJNAYRXSA-N 0.000 description 2
- FVBZXNSRIDVYJS-AVGNSLFASA-N Arg-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N FVBZXNSRIDVYJS-AVGNSLFASA-N 0.000 description 2
- AUZAXCPWMDBWEE-HJGDQZAQSA-N Arg-Thr-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O AUZAXCPWMDBWEE-HJGDQZAQSA-N 0.000 description 2
- INOIAEUXVVNJKA-XGEHTFHBSA-N Arg-Thr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O INOIAEUXVVNJKA-XGEHTFHBSA-N 0.000 description 2
- UTSMXMABBPFVJP-SZMVWBNQSA-N Arg-Val-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UTSMXMABBPFVJP-SZMVWBNQSA-N 0.000 description 2
- 206010003445 Ascites Diseases 0.000 description 2
- NXVGBGZQQFDUTM-XVYDVKMFSA-N Asn-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N NXVGBGZQQFDUTM-XVYDVKMFSA-N 0.000 description 2
- PTNFNTOBUDWHNZ-GUBZILKMSA-N Asn-Arg-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O PTNFNTOBUDWHNZ-GUBZILKMSA-N 0.000 description 2
- HUZGPXBILPMCHM-IHRRRGAJSA-N Asn-Arg-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HUZGPXBILPMCHM-IHRRRGAJSA-N 0.000 description 2
- DAPLJWATMAXPPZ-CIUDSAMLSA-N Asn-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O DAPLJWATMAXPPZ-CIUDSAMLSA-N 0.000 description 2
- NLCDVZJDEXIDDL-BIIVOSGPSA-N Asn-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O NLCDVZJDEXIDDL-BIIVOSGPSA-N 0.000 description 2
- VKCOHFFSTKCXEQ-OLHMAJIHSA-N Asn-Asn-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VKCOHFFSTKCXEQ-OLHMAJIHSA-N 0.000 description 2
- XVVOVPFMILMHPX-ZLUOBGJFSA-N Asn-Asp-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XVVOVPFMILMHPX-ZLUOBGJFSA-N 0.000 description 2
- ZDOQDYFZNGASEY-BIIVOSGPSA-N Asn-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N)C(=O)O ZDOQDYFZNGASEY-BIIVOSGPSA-N 0.000 description 2
- VYLVOMUVLMGCRF-ZLUOBGJFSA-N Asn-Asp-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O VYLVOMUVLMGCRF-ZLUOBGJFSA-N 0.000 description 2
- IYVSIZAXNLOKFQ-BYULHYEWSA-N Asn-Asp-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IYVSIZAXNLOKFQ-BYULHYEWSA-N 0.000 description 2
- TWXZVVXRRRRSLT-IMJSIDKUSA-N Asn-Cys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O TWXZVVXRRRRSLT-IMJSIDKUSA-N 0.000 description 2
- VWJFQGXPYOPXJH-ZLUOBGJFSA-N Asn-Cys-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)C(=O)N VWJFQGXPYOPXJH-ZLUOBGJFSA-N 0.000 description 2
- XVAPVJNJGLWGCS-ACZMJKKPSA-N Asn-Glu-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVAPVJNJGLWGCS-ACZMJKKPSA-N 0.000 description 2
- OGMDXNFGPOPZTK-GUBZILKMSA-N Asn-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N OGMDXNFGPOPZTK-GUBZILKMSA-N 0.000 description 2
- JQSWHKKUZMTOIH-QWRGUYRKSA-N Asn-Gly-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N JQSWHKKUZMTOIH-QWRGUYRKSA-N 0.000 description 2
- FTCGGKNCJZOPNB-WHFBIAKZSA-N Asn-Gly-Ser Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FTCGGKNCJZOPNB-WHFBIAKZSA-N 0.000 description 2
- JGIAYNNXZKKKOW-KKUMJFAQSA-N Asn-His-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N JGIAYNNXZKKKOW-KKUMJFAQSA-N 0.000 description 2
- WQLJRNRLHWJIRW-KKUMJFAQSA-N Asn-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)O WQLJRNRLHWJIRW-KKUMJFAQSA-N 0.000 description 2
- NLRJGXZWTKXRHP-DCAQKATOSA-N Asn-Leu-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NLRJGXZWTKXRHP-DCAQKATOSA-N 0.000 description 2
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 2
- QJMCHPGWFZZRID-BQBZGAKWSA-N Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O QJMCHPGWFZZRID-BQBZGAKWSA-N 0.000 description 2
- FTSAJSADJCMDHH-CIUDSAMLSA-N Asn-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N FTSAJSADJCMDHH-CIUDSAMLSA-N 0.000 description 2
- NLDNNZKUSLAYFW-NHCYSSNCSA-N Asn-Lys-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O NLDNNZKUSLAYFW-NHCYSSNCSA-N 0.000 description 2
- PPCORQFLAZWUNO-QWRGUYRKSA-N Asn-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N PPCORQFLAZWUNO-QWRGUYRKSA-N 0.000 description 2
- QXOPPIDJKPEKCW-GUBZILKMSA-N Asn-Pro-Arg Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)N)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O QXOPPIDJKPEKCW-GUBZILKMSA-N 0.000 description 2
- XTMZYFMTYJNABC-ZLUOBGJFSA-N Asn-Ser-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N XTMZYFMTYJNABC-ZLUOBGJFSA-N 0.000 description 2
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 2
- XHTUGJCAEYOZOR-UBHSHLNASA-N Asn-Ser-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O XHTUGJCAEYOZOR-UBHSHLNASA-N 0.000 description 2
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 2
- ZUFPUBYQYWCMDB-NUMRIWBASA-N Asn-Thr-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZUFPUBYQYWCMDB-NUMRIWBASA-N 0.000 description 2
- JBDLMLZNDRLDIX-HJGDQZAQSA-N Asn-Thr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O JBDLMLZNDRLDIX-HJGDQZAQSA-N 0.000 description 2
- KZYSHAMXEBPJBD-JRQIVUDYSA-N Asn-Thr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZYSHAMXEBPJBD-JRQIVUDYSA-N 0.000 description 2
- BCADFFUQHIMQAA-KKHAAJSZSA-N Asn-Thr-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BCADFFUQHIMQAA-KKHAAJSZSA-N 0.000 description 2
- ULZOQOKFYMXHPZ-AQZXSJQPSA-N Asn-Trp-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ULZOQOKFYMXHPZ-AQZXSJQPSA-N 0.000 description 2
- IPAQILGYEQFCFO-NYVOZVTQSA-N Asn-Trp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)NC(=O)[C@H](CC(=O)N)N IPAQILGYEQFCFO-NYVOZVTQSA-N 0.000 description 2
- DATSKXOXPUAOLK-KKUMJFAQSA-N Asn-Tyr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DATSKXOXPUAOLK-KKUMJFAQSA-N 0.000 description 2
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 2
- FAEIQWHBRBWUBN-FXQIFTODSA-N Asp-Arg-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N FAEIQWHBRBWUBN-FXQIFTODSA-N 0.000 description 2
- ILJQISGMGXRZQQ-IHRRRGAJSA-N Asp-Arg-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ILJQISGMGXRZQQ-IHRRRGAJSA-N 0.000 description 2
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 2
- PJERDVUTUDZPGX-ZKWXMUAHSA-N Asp-Cys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CC(O)=O PJERDVUTUDZPGX-ZKWXMUAHSA-N 0.000 description 2
- GHODABZPVZMWCE-FXQIFTODSA-N Asp-Glu-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GHODABZPVZMWCE-FXQIFTODSA-N 0.000 description 2
- VFUXXFVCYZPOQG-WDSKDSINSA-N Asp-Glu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VFUXXFVCYZPOQG-WDSKDSINSA-N 0.000 description 2
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 2
- WBDWQKRLTVCDSY-WHFBIAKZSA-N Asp-Gly-Asp Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O WBDWQKRLTVCDSY-WHFBIAKZSA-N 0.000 description 2
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 2
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 2
- NVFSJIXJZCDICF-SRVKXCTJSA-N Asp-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N NVFSJIXJZCDICF-SRVKXCTJSA-N 0.000 description 2
- DPNWSMBUYCLEDG-CIUDSAMLSA-N Asp-Lys-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O DPNWSMBUYCLEDG-CIUDSAMLSA-N 0.000 description 2
- JUWISGAGWSDGDH-KKUMJFAQSA-N Asp-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=CC=C1 JUWISGAGWSDGDH-KKUMJFAQSA-N 0.000 description 2
- USNJAPJZSGTTPX-XVSYOHENSA-N Asp-Phe-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O USNJAPJZSGTTPX-XVSYOHENSA-N 0.000 description 2
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 2
- HICVMZCGVFKTPM-BQBZGAKWSA-N Asp-Pro-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HICVMZCGVFKTPM-BQBZGAKWSA-N 0.000 description 2
- FAUPLTGRUBTXNU-FXQIFTODSA-N Asp-Pro-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O FAUPLTGRUBTXNU-FXQIFTODSA-N 0.000 description 2
- DWBZEJHQQIURML-IMJSIDKUSA-N Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O DWBZEJHQQIURML-IMJSIDKUSA-N 0.000 description 2
- OFYVKOXTTDCUIL-FXQIFTODSA-N Asp-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N OFYVKOXTTDCUIL-FXQIFTODSA-N 0.000 description 2
- ZQFRDAZBTSFGGW-SRVKXCTJSA-N Asp-Ser-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZQFRDAZBTSFGGW-SRVKXCTJSA-N 0.000 description 2
- YIDFBWRHIYOYAA-LKXGYXEUSA-N Asp-Ser-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YIDFBWRHIYOYAA-LKXGYXEUSA-N 0.000 description 2
- MNQMTYSEKZHIDF-GCJQMDKQSA-N Asp-Thr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O MNQMTYSEKZHIDF-GCJQMDKQSA-N 0.000 description 2
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 2
- KNDCWFXCFKSEBM-AVGNSLFASA-N Asp-Tyr-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O KNDCWFXCFKSEBM-AVGNSLFASA-N 0.000 description 2
- ZQFZEBRNAMXXJV-KKUMJFAQSA-N Asp-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O ZQFZEBRNAMXXJV-KKUMJFAQSA-N 0.000 description 2
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 2
- JGLWFWXGOINXEA-YDHLFZDLSA-N Asp-Val-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JGLWFWXGOINXEA-YDHLFZDLSA-N 0.000 description 2
- 101100505161 Caenorhabditis elegans mel-32 gene Proteins 0.000 description 2
- 241000178270 Canarypox virus Species 0.000 description 2
- 241000283707 Capra Species 0.000 description 2
- 108091033380 Coding strand Proteins 0.000 description 2
- LBOLGUYQEPZSKM-YUMQZZPRSA-N Cys-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CS)N LBOLGUYQEPZSKM-YUMQZZPRSA-N 0.000 description 2
- SSNJZBGOMNLSLA-CIUDSAMLSA-N Cys-Leu-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O SSNJZBGOMNLSLA-CIUDSAMLSA-N 0.000 description 2
- RAGIABZNLPZBGS-FXQIFTODSA-N Cys-Pro-Cys Chemical compound N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(O)=O RAGIABZNLPZBGS-FXQIFTODSA-N 0.000 description 2
- HJXSYJVCMUOUNY-SRVKXCTJSA-N Cys-Ser-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N HJXSYJVCMUOUNY-SRVKXCTJSA-N 0.000 description 2
- YNJBLTDKTMKEET-ZLUOBGJFSA-N Cys-Ser-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O YNJBLTDKTMKEET-ZLUOBGJFSA-N 0.000 description 2
- DQGIAOGALAQBGK-BWBBJGPYSA-N Cys-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N)O DQGIAOGALAQBGK-BWBBJGPYSA-N 0.000 description 2
- 241000701022 Cytomegalovirus Species 0.000 description 2
- 102000004190 Enzymes Human genes 0.000 description 2
- 108090000790 Enzymes Proteins 0.000 description 2
- 108060002716 Exonuclease Proteins 0.000 description 2
- LKDIBBOKUAASNP-FXQIFTODSA-N Glu-Ala-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LKDIBBOKUAASNP-FXQIFTODSA-N 0.000 description 2
- BPDVTFBJZNBHEU-HGNGGELXSA-N Glu-Ala-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 BPDVTFBJZNBHEU-HGNGGELXSA-N 0.000 description 2
- MXOODARRORARSU-ACZMJKKPSA-N Glu-Ala-Ser Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N MXOODARRORARSU-ACZMJKKPSA-N 0.000 description 2
- WOMUDRVDJMHTCV-DCAQKATOSA-N Glu-Arg-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WOMUDRVDJMHTCV-DCAQKATOSA-N 0.000 description 2
- CGYDXNKRIMJMLV-GUBZILKMSA-N Glu-Arg-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O CGYDXNKRIMJMLV-GUBZILKMSA-N 0.000 description 2
- WOSRKEJQESVHGA-CIUDSAMLSA-N Glu-Arg-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O WOSRKEJQESVHGA-CIUDSAMLSA-N 0.000 description 2
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 2
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 2
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 2
- KOSRFJWDECSPRO-WDSKDSINSA-N Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O KOSRFJWDECSPRO-WDSKDSINSA-N 0.000 description 2
- ILGFBUGLBSAQQB-GUBZILKMSA-N Glu-Glu-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ILGFBUGLBSAQQB-GUBZILKMSA-N 0.000 description 2
- PHONAZGUEGIOEM-GLLZPBPUSA-N Glu-Glu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PHONAZGUEGIOEM-GLLZPBPUSA-N 0.000 description 2
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 2
- AIGROOHQXCACHL-WDSKDSINSA-N Glu-Gly-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O AIGROOHQXCACHL-WDSKDSINSA-N 0.000 description 2
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 2
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 2
- VOORMNJKNBGYGK-YUMQZZPRSA-N Glu-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N VOORMNJKNBGYGK-YUMQZZPRSA-N 0.000 description 2
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 2
- LZMQSTPFYJLVJB-GUBZILKMSA-N Glu-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N LZMQSTPFYJLVJB-GUBZILKMSA-N 0.000 description 2
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 2
- UGSVSNXPJJDJKL-SDDRHHMPSA-N Glu-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N UGSVSNXPJJDJKL-SDDRHHMPSA-N 0.000 description 2
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 2
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 2
- SUIAHERNFYRBDZ-GVXVVHGQSA-N Glu-Lys-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O SUIAHERNFYRBDZ-GVXVVHGQSA-N 0.000 description 2
- LGWUJBCIFGVBSJ-CIUDSAMLSA-N Glu-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N LGWUJBCIFGVBSJ-CIUDSAMLSA-N 0.000 description 2
- YTRBQAQSUDSIQE-FHWLQOOXSA-N Glu-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 YTRBQAQSUDSIQE-FHWLQOOXSA-N 0.000 description 2
- MIIGESVJEBDJMP-FHWLQOOXSA-N Glu-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 MIIGESVJEBDJMP-FHWLQOOXSA-N 0.000 description 2
- YBTCBQBIJKGSJP-BQBZGAKWSA-N Glu-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O YBTCBQBIJKGSJP-BQBZGAKWSA-N 0.000 description 2
- UQHGAYSULGRWRG-WHFBIAKZSA-N Glu-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UQHGAYSULGRWRG-WHFBIAKZSA-N 0.000 description 2
- ALMBZBOCGSVSAI-ACZMJKKPSA-N Glu-Ser-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ALMBZBOCGSVSAI-ACZMJKKPSA-N 0.000 description 2
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 2
- GTFYQOVVVJASOA-ACZMJKKPSA-N Glu-Ser-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N GTFYQOVVVJASOA-ACZMJKKPSA-N 0.000 description 2
- DLISPGXMKZTWQG-IFFSRLJSSA-N Glu-Thr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O DLISPGXMKZTWQG-IFFSRLJSSA-N 0.000 description 2
- LLEUXCDZPQOJMY-AAEUAGOBSA-N Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(O)=O)=CNC2=C1 LLEUXCDZPQOJMY-AAEUAGOBSA-N 0.000 description 2
- HVKAAUOFFTUSAA-XDTLVQLUSA-N Glu-Tyr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O HVKAAUOFFTUSAA-XDTLVQLUSA-N 0.000 description 2
- KXRORHJIRAOQPG-SOUVJXGZSA-N Glu-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O KXRORHJIRAOQPG-SOUVJXGZSA-N 0.000 description 2
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 2
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 2
- FVGOGEGGQLNZGH-DZKIICNBSA-N Glu-Val-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FVGOGEGGQLNZGH-DZKIICNBSA-N 0.000 description 2
- QXUPRMQJDWJDFR-NRPADANISA-N Glu-Val-Ser Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXUPRMQJDWJDFR-NRPADANISA-N 0.000 description 2
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 2
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 2
- MFVQGXGQRIXBPK-WDSKDSINSA-N Gly-Ala-Glu Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFVQGXGQRIXBPK-WDSKDSINSA-N 0.000 description 2
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 2
- PHONXOACARQMPM-BQBZGAKWSA-N Gly-Ala-Met Chemical compound [H]NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O PHONXOACARQMPM-BQBZGAKWSA-N 0.000 description 2
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 2
- WKJKBELXHCTHIJ-WPRPVWTQSA-N Gly-Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N WKJKBELXHCTHIJ-WPRPVWTQSA-N 0.000 description 2
- GGEJHJIXRBTJPD-BYPYZUCNSA-N Gly-Asn-Gly Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GGEJHJIXRBTJPD-BYPYZUCNSA-N 0.000 description 2
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 2
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 2
- XQHSBNVACKQWAV-WHFBIAKZSA-N Gly-Asp-Asn Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XQHSBNVACKQWAV-WHFBIAKZSA-N 0.000 description 2
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 2
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 2
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 2
- HPAIKDPJURGQLN-KBPBESRZSA-N Gly-His-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CNC=N1 HPAIKDPJURGQLN-KBPBESRZSA-N 0.000 description 2
- YNIMVVJTPWCUJH-KBPBESRZSA-N Gly-His-Tyr Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YNIMVVJTPWCUJH-KBPBESRZSA-N 0.000 description 2
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 2
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 2
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 2
- CLNSYANKYVMZNM-UWVGGRQHSA-N Gly-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CLNSYANKYVMZNM-UWVGGRQHSA-N 0.000 description 2
- SJLKKOZFHSJJAW-YUMQZZPRSA-N Gly-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN SJLKKOZFHSJJAW-YUMQZZPRSA-N 0.000 description 2
- MTBIKIMYHUWBRX-QWRGUYRKSA-N Gly-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN MTBIKIMYHUWBRX-QWRGUYRKSA-N 0.000 description 2
- YLEIWGJJBFBFHC-KBPBESRZSA-N Gly-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 YLEIWGJJBFBFHC-KBPBESRZSA-N 0.000 description 2
- FEUPVVCGQLNXNP-IRXDYDNUSA-N Gly-Phe-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FEUPVVCGQLNXNP-IRXDYDNUSA-N 0.000 description 2
- QAMMIGULQSIRCD-IRXDYDNUSA-N Gly-Phe-Tyr Chemical compound C([C@H](NC(=O)C[NH3+])C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C([O-])=O)C1=CC=CC=C1 QAMMIGULQSIRCD-IRXDYDNUSA-N 0.000 description 2
- OCPPBNKYGYSLOE-IUCAKERBSA-N Gly-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN OCPPBNKYGYSLOE-IUCAKERBSA-N 0.000 description 2
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 2
- FGPLUIQCSKGLTI-WDSKDSINSA-N Gly-Ser-Glu Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O FGPLUIQCSKGLTI-WDSKDSINSA-N 0.000 description 2
- SOEGEPHNZOISMT-BYPYZUCNSA-N Gly-Ser-Gly Chemical compound NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O SOEGEPHNZOISMT-BYPYZUCNSA-N 0.000 description 2
- JQFILXICXLDTRR-FBCQKBJTSA-N Gly-Thr-Gly Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)NCC(O)=O JQFILXICXLDTRR-FBCQKBJTSA-N 0.000 description 2
- LLWQVJNHMYBLLK-CDMKHQONSA-N Gly-Thr-Phe Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLWQVJNHMYBLLK-CDMKHQONSA-N 0.000 description 2
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 2
- NIOPEYHPOBWLQO-KBPBESRZSA-N Gly-Trp-Glu Chemical compound NCC(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CCC(O)=O)C(O)=O NIOPEYHPOBWLQO-KBPBESRZSA-N 0.000 description 2
- YJDALMUYJIENAG-QWRGUYRKSA-N Gly-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN)O YJDALMUYJIENAG-QWRGUYRKSA-N 0.000 description 2
- GNNJKUYDWFIBTK-QWRGUYRKSA-N Gly-Tyr-Asp Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O GNNJKUYDWFIBTK-QWRGUYRKSA-N 0.000 description 2
- DNAZKGFYFRGZIH-QWRGUYRKSA-N Gly-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 DNAZKGFYFRGZIH-QWRGUYRKSA-N 0.000 description 2
- NGBGZCUWFVVJKC-IRXDYDNUSA-N Gly-Tyr-Tyr Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 NGBGZCUWFVVJKC-IRXDYDNUSA-N 0.000 description 2
- RYAOJUMWLWUGNW-QMMMGPOBSA-N Gly-Val-Gly Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O RYAOJUMWLWUGNW-QMMMGPOBSA-N 0.000 description 2
- BNMRSWQOHIQTFL-JSGCOSHPSA-N Gly-Val-Phe Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 BNMRSWQOHIQTFL-JSGCOSHPSA-N 0.000 description 2
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 2
- IZVICCORZOSGPT-JSGCOSHPSA-N Gly-Val-Tyr Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IZVICCORZOSGPT-JSGCOSHPSA-N 0.000 description 2
- VPZXBVLAVMBEQI-VKHMYHEASA-N Glycyl-alanine Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 2
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 2
- 241000303608 Helicobacter felis ATCC 49179 Species 0.000 description 2
- ZNPRMNDAFQKATM-LKTVYLICSA-N His-Ala-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZNPRMNDAFQKATM-LKTVYLICSA-N 0.000 description 2
- NOQPTNXSGNPJNS-YUMQZZPRSA-N His-Asn-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O NOQPTNXSGNPJNS-YUMQZZPRSA-N 0.000 description 2
- JCOSMKPAOYDKRO-AVGNSLFASA-N His-Glu-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N JCOSMKPAOYDKRO-AVGNSLFASA-N 0.000 description 2
- FYTCLUIYTYFGPT-YUMQZZPRSA-N His-Gly-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FYTCLUIYTYFGPT-YUMQZZPRSA-N 0.000 description 2
- OQDLKDUVMTUPPG-AVGNSLFASA-N His-Leu-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OQDLKDUVMTUPPG-AVGNSLFASA-N 0.000 description 2
- GJMHMDKCJPQJOI-IHRRRGAJSA-N His-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CN=CN1 GJMHMDKCJPQJOI-IHRRRGAJSA-N 0.000 description 2
- RLAOTFTXBFQJDV-KKUMJFAQSA-N His-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CN=CN1 RLAOTFTXBFQJDV-KKUMJFAQSA-N 0.000 description 2
- YAEKRYQASVCDLK-JYJNAYRXSA-N His-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N YAEKRYQASVCDLK-JYJNAYRXSA-N 0.000 description 2
- SVVULKPWDBIPCO-BZSNNMDCSA-N His-Phe-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SVVULKPWDBIPCO-BZSNNMDCSA-N 0.000 description 2
- PBVQWNDMFFCPIZ-ULQDDVLXSA-N His-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 PBVQWNDMFFCPIZ-ULQDDVLXSA-N 0.000 description 2
- PZAJPILZRFPYJJ-SRVKXCTJSA-N His-Ser-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O PZAJPILZRFPYJJ-SRVKXCTJSA-N 0.000 description 2
- FBVHRDXSCYELMI-PBCZWWQYSA-N His-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N)O FBVHRDXSCYELMI-PBCZWWQYSA-N 0.000 description 2
- HTOOKGDPMXSJSY-STQMWFEESA-N His-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 HTOOKGDPMXSJSY-STQMWFEESA-N 0.000 description 2
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 2
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 2
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 2
- XBBKIIGCUMBKCO-JXUBOQSCSA-N Leu-Ala-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XBBKIIGCUMBKCO-JXUBOQSCSA-N 0.000 description 2
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 2
- UCOCBWDBHCUPQP-DCAQKATOSA-N Leu-Arg-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O UCOCBWDBHCUPQP-DCAQKATOSA-N 0.000 description 2
- MLTRLIITQPXHBJ-BQBZGAKWSA-N Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O MLTRLIITQPXHBJ-BQBZGAKWSA-N 0.000 description 2
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 2
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 2
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 2
- TWQIYNGNYNJUFM-NHCYSSNCSA-N Leu-Asn-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TWQIYNGNYNJUFM-NHCYSSNCSA-N 0.000 description 2
- BPANDPNDMJHFEV-CIUDSAMLSA-N Leu-Asp-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O BPANDPNDMJHFEV-CIUDSAMLSA-N 0.000 description 2
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 2
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 2
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 2
- QLQHWWCSCLZUMA-KKUMJFAQSA-N Leu-Asp-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QLQHWWCSCLZUMA-KKUMJFAQSA-N 0.000 description 2
- WCTCIIAGNMFYAO-DCAQKATOSA-N Leu-Cys-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O WCTCIIAGNMFYAO-DCAQKATOSA-N 0.000 description 2
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 2
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 2
- PRZVBIAOPFGAQF-SRVKXCTJSA-N Leu-Glu-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O PRZVBIAOPFGAQF-SRVKXCTJSA-N 0.000 description 2
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 2
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 2
- CSFVADKICPDRRF-KKUMJFAQSA-N Leu-His-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CN=CN1 CSFVADKICPDRRF-KKUMJFAQSA-N 0.000 description 2
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 2
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 2
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 2
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 2
- UCNNZELZXFXXJQ-BZSNNMDCSA-N Leu-Leu-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCNNZELZXFXXJQ-BZSNNMDCSA-N 0.000 description 2
- OTXBNHIUIHNGAO-UWVGGRQHSA-N Leu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN OTXBNHIUIHNGAO-UWVGGRQHSA-N 0.000 description 2
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 2
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 2
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 2
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 2
- ONPJGOIVICHWBW-BZSNNMDCSA-N Leu-Lys-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ONPJGOIVICHWBW-BZSNNMDCSA-N 0.000 description 2
- BJWKOATWNQJPSK-SRVKXCTJSA-N Leu-Met-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BJWKOATWNQJPSK-SRVKXCTJSA-N 0.000 description 2
- JVTYXRRFZCEPPK-RHYQMDGZSA-N Leu-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(C)C)N)O JVTYXRRFZCEPPK-RHYQMDGZSA-N 0.000 description 2
- GCXGCIYIHXSKAY-ULQDDVLXSA-N Leu-Phe-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GCXGCIYIHXSKAY-ULQDDVLXSA-N 0.000 description 2
- WXDRGWBQZIMJDE-ULQDDVLXSA-N Leu-Phe-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O WXDRGWBQZIMJDE-ULQDDVLXSA-N 0.000 description 2
- UHNQRAFSEBGZFZ-YESZJQIVSA-N Leu-Phe-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N UHNQRAFSEBGZFZ-YESZJQIVSA-N 0.000 description 2
- FYPWFNKQVVEELI-ULQDDVLXSA-N Leu-Phe-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 FYPWFNKQVVEELI-ULQDDVLXSA-N 0.000 description 2
- RRVCZCNFXIFGRA-DCAQKATOSA-N Leu-Pro-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RRVCZCNFXIFGRA-DCAQKATOSA-N 0.000 description 2
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 2
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 2
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 2
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 2
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 2
- SQUFDMCWMFOEBA-KKUMJFAQSA-N Leu-Ser-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SQUFDMCWMFOEBA-KKUMJFAQSA-N 0.000 description 2
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 2
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 2
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 2
- WBRJVRXEGQIDRK-XIRDDKMYSA-N Leu-Trp-Ser Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 WBRJVRXEGQIDRK-XIRDDKMYSA-N 0.000 description 2
- WFCKERTZVCQXKH-KBPBESRZSA-N Leu-Tyr-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O WFCKERTZVCQXKH-KBPBESRZSA-N 0.000 description 2
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 2
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 2
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 2
- CGHXMODRYJISSK-NHCYSSNCSA-N Leu-Val-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O CGHXMODRYJISSK-NHCYSSNCSA-N 0.000 description 2
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 2
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 2
- 108090001030 Lipoproteins Proteins 0.000 description 2
- 102000004895 Lipoproteins Human genes 0.000 description 2
- 239000006137 Luria-Bertani broth Substances 0.000 description 2
- VHFFQUSNFFIZBT-CIUDSAMLSA-N Lys-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N VHFFQUSNFFIZBT-CIUDSAMLSA-N 0.000 description 2
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 2
- UWKNTTJNVSYXPC-CIUDSAMLSA-N Lys-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN UWKNTTJNVSYXPC-CIUDSAMLSA-N 0.000 description 2
- NQCJGQHHYZNUDK-DCAQKATOSA-N Lys-Arg-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCN=C(N)N NQCJGQHHYZNUDK-DCAQKATOSA-N 0.000 description 2
- DGAAQRAUOFHBFJ-CIUDSAMLSA-N Lys-Asn-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O DGAAQRAUOFHBFJ-CIUDSAMLSA-N 0.000 description 2
- WLCYCADOWRMSAJ-CIUDSAMLSA-N Lys-Asn-Cys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(O)=O WLCYCADOWRMSAJ-CIUDSAMLSA-N 0.000 description 2
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 2
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 2
- DUTMKEAPLLUGNO-JYJNAYRXSA-N Lys-Glu-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DUTMKEAPLLUGNO-JYJNAYRXSA-N 0.000 description 2
- HGNRJCINZYHNOU-LURJTMIESA-N Lys-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(O)=O HGNRJCINZYHNOU-LURJTMIESA-N 0.000 description 2
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 2
- PRCHKVGXZVTALR-KKUMJFAQSA-N Lys-His-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCCN)N PRCHKVGXZVTALR-KKUMJFAQSA-N 0.000 description 2
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 2
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 2
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 2
- AHFOKDZWPPGJAZ-SRVKXCTJSA-N Lys-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)O)N AHFOKDZWPPGJAZ-SRVKXCTJSA-N 0.000 description 2
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 2
- XBZOQGHZGQLEQO-IUCAKERBSA-N Lys-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN XBZOQGHZGQLEQO-IUCAKERBSA-N 0.000 description 2
- ZCWWVXAXWUAEPZ-SRVKXCTJSA-N Lys-Met-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZCWWVXAXWUAEPZ-SRVKXCTJSA-N 0.000 description 2
- AZOFEHCPMBRNFD-BZSNNMDCSA-N Lys-Phe-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 AZOFEHCPMBRNFD-BZSNNMDCSA-N 0.000 description 2
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 2
- WGILOYIKJVQUPT-DCAQKATOSA-N Lys-Pro-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WGILOYIKJVQUPT-DCAQKATOSA-N 0.000 description 2
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 2
- DIBZLYZXTSVGLN-CIUDSAMLSA-N Lys-Ser-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O DIBZLYZXTSVGLN-CIUDSAMLSA-N 0.000 description 2
- YRNRVKTYDSLKMD-KKUMJFAQSA-N Lys-Ser-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YRNRVKTYDSLKMD-KKUMJFAQSA-N 0.000 description 2
- WAAZECNCPVGPIV-RHYQMDGZSA-N Lys-Thr-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O WAAZECNCPVGPIV-RHYQMDGZSA-N 0.000 description 2
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 2
- RVKIPWVMZANZLI-ZFWWWQNUSA-N Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-ZFWWWQNUSA-N 0.000 description 2
- MYTOTTSMVMWVJN-STQMWFEESA-N Lys-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MYTOTTSMVMWVJN-STQMWFEESA-N 0.000 description 2
- VKCPHIOZDWUFSW-ONGXEEELSA-N Lys-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN VKCPHIOZDWUFSW-ONGXEEELSA-N 0.000 description 2
- GILLQRYAWOMHED-DCAQKATOSA-N Lys-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN GILLQRYAWOMHED-DCAQKATOSA-N 0.000 description 2
- QEVRUYFHWJJUHZ-DCAQKATOSA-N Met-Ala-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(C)C QEVRUYFHWJJUHZ-DCAQKATOSA-N 0.000 description 2
- WYEXWKAWMNJKPN-UBHSHLNASA-N Met-Ala-Phe Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCSC)N WYEXWKAWMNJKPN-UBHSHLNASA-N 0.000 description 2
- JMEWFDUAFKVAAT-WDSKDSINSA-N Met-Asn Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CC(N)=O JMEWFDUAFKVAAT-WDSKDSINSA-N 0.000 description 2
- IVCPHARVJUYDPA-FXQIFTODSA-N Met-Asn-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IVCPHARVJUYDPA-FXQIFTODSA-N 0.000 description 2
- QTZXSYBVOSXBEJ-WDSKDSINSA-N Met-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O QTZXSYBVOSXBEJ-WDSKDSINSA-N 0.000 description 2
- OSOLWRWQADPDIQ-DCAQKATOSA-N Met-Asp-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OSOLWRWQADPDIQ-DCAQKATOSA-N 0.000 description 2
- MTBVQFFQMXHCPC-CIUDSAMLSA-N Met-Glu-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MTBVQFFQMXHCPC-CIUDSAMLSA-N 0.000 description 2
- YORIKIDJCPKBON-YUMQZZPRSA-N Met-Glu-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YORIKIDJCPKBON-YUMQZZPRSA-N 0.000 description 2
- MYAPQOBHGWJZOM-UWVGGRQHSA-N Met-Gly-Leu Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C MYAPQOBHGWJZOM-UWVGGRQHSA-N 0.000 description 2
- MHQXIBRPDKXDGZ-ZFWWWQNUSA-N Met-Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)[C@@H](N)CCSC)C(O)=O)=CNC2=C1 MHQXIBRPDKXDGZ-ZFWWWQNUSA-N 0.000 description 2
- KMSMNUFBNCHMII-IHRRRGAJSA-N Met-Leu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN KMSMNUFBNCHMII-IHRRRGAJSA-N 0.000 description 2
- LBNFTWKGISQVEE-AVGNSLFASA-N Met-Leu-Met Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCSC LBNFTWKGISQVEE-AVGNSLFASA-N 0.000 description 2
- ZRACLHJYVRBJFC-ULQDDVLXSA-N Met-Lys-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZRACLHJYVRBJFC-ULQDDVLXSA-N 0.000 description 2
- WTHGNAAQXISJHP-AVGNSLFASA-N Met-Lys-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WTHGNAAQXISJHP-AVGNSLFASA-N 0.000 description 2
- ILKCLLLOGPDNIP-RCWTZXSCSA-N Met-Met-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ILKCLLLOGPDNIP-RCWTZXSCSA-N 0.000 description 2
- VQILILSLEFDECU-GUBZILKMSA-N Met-Pro-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O VQILILSLEFDECU-GUBZILKMSA-N 0.000 description 2
- PHKBGZKVOJCIMZ-SRVKXCTJSA-N Met-Pro-Arg Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PHKBGZKVOJCIMZ-SRVKXCTJSA-N 0.000 description 2
- WXXNVZMWHOLNRJ-AVGNSLFASA-N Met-Pro-Lys Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O WXXNVZMWHOLNRJ-AVGNSLFASA-N 0.000 description 2
- LXCSZPUQKMTXNW-BQBZGAKWSA-N Met-Ser-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O LXCSZPUQKMTXNW-BQBZGAKWSA-N 0.000 description 2
- KAKJTZWHIUWTTD-VQVTYTSYSA-N Met-Thr Chemical compound CSCC[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)O)C([O-])=O KAKJTZWHIUWTTD-VQVTYTSYSA-N 0.000 description 2
- IHRFZLQEQVHXFA-RHYQMDGZSA-N Met-Thr-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCCN IHRFZLQEQVHXFA-RHYQMDGZSA-N 0.000 description 2
- QYIGOFGUOVTAHK-ZJDVBMNYSA-N Met-Thr-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QYIGOFGUOVTAHK-ZJDVBMNYSA-N 0.000 description 2
- XYVRXLDSCKEYES-JSGCOSHPSA-N Met-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCSC)C(O)=O)=CNC2=C1 XYVRXLDSCKEYES-JSGCOSHPSA-N 0.000 description 2
- KLGIQJRMFHIGCQ-ZFWWWQNUSA-N Met-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCSC)C(=O)NCC(O)=O)=CNC2=C1 KLGIQJRMFHIGCQ-ZFWWWQNUSA-N 0.000 description 2
- LPNWWHBFXPNHJG-AVGNSLFASA-N Met-Val-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN LPNWWHBFXPNHJG-AVGNSLFASA-N 0.000 description 2
- 241000699666 Mus <mouse, genus> Species 0.000 description 2
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 2
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 2
- CHJJGSNFBQVOTG-UHFFFAOYSA-N N-methyl-guanidine Natural products CNC(N)=N CHJJGSNFBQVOTG-UHFFFAOYSA-N 0.000 description 2
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 2
- 108010065395 Neuropep-1 Proteins 0.000 description 2
- 102000007079 Peptide Fragments Human genes 0.000 description 2
- 108010033276 Peptide Fragments Proteins 0.000 description 2
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 2
- BKWJQWJPZMUWEG-LFSVMHDDSA-N Phe-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BKWJQWJPZMUWEG-LFSVMHDDSA-N 0.000 description 2
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 2
- AGYXCMYVTBYGCT-ULQDDVLXSA-N Phe-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O AGYXCMYVTBYGCT-ULQDDVLXSA-N 0.000 description 2
- IWRZUGHCHFZYQZ-UFYCRDLUSA-N Phe-Arg-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 IWRZUGHCHFZYQZ-UFYCRDLUSA-N 0.000 description 2
- HTTYNOXBBOWZTB-SRVKXCTJSA-N Phe-Asn-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HTTYNOXBBOWZTB-SRVKXCTJSA-N 0.000 description 2
- JOXIIFVCSATTDH-IHPCNDPISA-N Phe-Asn-Trp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N JOXIIFVCSATTDH-IHPCNDPISA-N 0.000 description 2
- WFDAEEUZPZSMOG-SRVKXCTJSA-N Phe-Cys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O WFDAEEUZPZSMOG-SRVKXCTJSA-N 0.000 description 2
- XXAOSEUPEMQJOF-KKUMJFAQSA-N Phe-Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XXAOSEUPEMQJOF-KKUMJFAQSA-N 0.000 description 2
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 2
- RFEXGCASCQGGHZ-STQMWFEESA-N Phe-Gly-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O RFEXGCASCQGGHZ-STQMWFEESA-N 0.000 description 2
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 2
- APJPXSFJBMMOLW-KBPBESRZSA-N Phe-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 APJPXSFJBMMOLW-KBPBESRZSA-N 0.000 description 2
- BIYWZVCPZIFGPY-QWRGUYRKSA-N Phe-Gly-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CO)C(O)=O BIYWZVCPZIFGPY-QWRGUYRKSA-N 0.000 description 2
- HNFUGJUZJRYUHN-JSGCOSHPSA-N Phe-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HNFUGJUZJRYUHN-JSGCOSHPSA-N 0.000 description 2
- YTILBRIUASDGBL-BZSNNMDCSA-N Phe-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 YTILBRIUASDGBL-BZSNNMDCSA-N 0.000 description 2
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 2
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 2
- KNYPNEYICHHLQL-ACRUOGEOSA-N Phe-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 KNYPNEYICHHLQL-ACRUOGEOSA-N 0.000 description 2
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 2
- JKJSIYKSGIDHPM-WBAXXEDZSA-N Phe-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O JKJSIYKSGIDHPM-WBAXXEDZSA-N 0.000 description 2
- IWZRODDWOSIXPZ-IRXDYDNUSA-N Phe-Phe-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)NCC(O)=O)C1=CC=CC=C1 IWZRODDWOSIXPZ-IRXDYDNUSA-N 0.000 description 2
- JDMKQHSHKJHAHR-UHFFFAOYSA-N Phe-Phe-Leu-Tyr Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)CC1=CC=CC=C1 JDMKQHSHKJHAHR-UHFFFAOYSA-N 0.000 description 2
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 2
- GPLWGAYGROGDEN-BZSNNMDCSA-N Phe-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O GPLWGAYGROGDEN-BZSNNMDCSA-N 0.000 description 2
- DSXPMZMSJHOKKK-HJOGWXRNSA-N Phe-Phe-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DSXPMZMSJHOKKK-HJOGWXRNSA-N 0.000 description 2
- WEQJQNWXCSUVMA-RYUDHWBXSA-N Phe-Pro Chemical compound C([C@H]([NH3+])C(=O)N1[C@@H](CCC1)C([O-])=O)C1=CC=CC=C1 WEQJQNWXCSUVMA-RYUDHWBXSA-N 0.000 description 2
- XOHJOMKCRLHGCY-UNQGMJICSA-N Phe-Pro-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOHJOMKCRLHGCY-UNQGMJICSA-N 0.000 description 2
- ROHDXJUFQVRDAV-UWVGGRQHSA-N Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ROHDXJUFQVRDAV-UWVGGRQHSA-N 0.000 description 2
- AFNJAQVMTIQTCB-DLOVCJGASA-N Phe-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 AFNJAQVMTIQTCB-DLOVCJGASA-N 0.000 description 2
- GMWNQSGWWGKTSF-LFSVMHDDSA-N Phe-Thr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMWNQSGWWGKTSF-LFSVMHDDSA-N 0.000 description 2
- XNQMZHLAYFWSGJ-HTUGSXCWSA-N Phe-Thr-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XNQMZHLAYFWSGJ-HTUGSXCWSA-N 0.000 description 2
- VGTJSEYTVMAASM-RPTUDFQQSA-N Phe-Thr-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VGTJSEYTVMAASM-RPTUDFQQSA-N 0.000 description 2
- JLDZQPPLTJTJLE-IHPCNDPISA-N Phe-Trp-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC(=O)O)C(=O)O)N JLDZQPPLTJTJLE-IHPCNDPISA-N 0.000 description 2
- FSXRLASFHBWESK-HOTGVXAUSA-N Phe-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 FSXRLASFHBWESK-HOTGVXAUSA-N 0.000 description 2
- AGTHXWTYCLLYMC-FHWLQOOXSA-N Phe-Tyr-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=CC=C1 AGTHXWTYCLLYMC-FHWLQOOXSA-N 0.000 description 2
- ZYNBEWGJFXTBDU-ACRUOGEOSA-N Phe-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N ZYNBEWGJFXTBDU-ACRUOGEOSA-N 0.000 description 2
- FRMKIPSIZSFTTE-HJOGWXRNSA-N Phe-Tyr-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FRMKIPSIZSFTTE-HJOGWXRNSA-N 0.000 description 2
- KIQUCMUULDXTAZ-HJOGWXRNSA-N Phe-Tyr-Tyr Chemical compound N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O KIQUCMUULDXTAZ-HJOGWXRNSA-N 0.000 description 2
- GOUWCZRDTWTODO-YDHLFZDLSA-N Phe-Val-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O GOUWCZRDTWTODO-YDHLFZDLSA-N 0.000 description 2
- 102000017033 Porins Human genes 0.000 description 2
- 108010013381 Porins Proteins 0.000 description 2
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 2
- OBVCYFIHIIYIQF-CIUDSAMLSA-N Pro-Asn-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OBVCYFIHIIYIQF-CIUDSAMLSA-N 0.000 description 2
- KQCCDMFIALWGTL-GUBZILKMSA-N Pro-Asn-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 KQCCDMFIALWGTL-GUBZILKMSA-N 0.000 description 2
- MLQVJYMFASXBGZ-IHRRRGAJSA-N Pro-Asn-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O MLQVJYMFASXBGZ-IHRRRGAJSA-N 0.000 description 2
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 2
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 2
- FXGIMYRVJJEIIM-UWVGGRQHSA-N Pro-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FXGIMYRVJJEIIM-UWVGGRQHSA-N 0.000 description 2
- DRKAXLDECUGLFE-ULQDDVLXSA-N Pro-Leu-Phe Chemical compound CC(C)C[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O DRKAXLDECUGLFE-ULQDDVLXSA-N 0.000 description 2
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 2
- RVQDZELMXZRSSI-IUCAKERBSA-N Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 RVQDZELMXZRSSI-IUCAKERBSA-N 0.000 description 2
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 2
- CDGABSWLRMECHC-IHRRRGAJSA-N Pro-Lys-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O CDGABSWLRMECHC-IHRRRGAJSA-N 0.000 description 2
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 2
- PUQRDHNIOONJJN-AVGNSLFASA-N Pro-Lys-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O PUQRDHNIOONJJN-AVGNSLFASA-N 0.000 description 2
- SMFQZMGHCODUPQ-ULQDDVLXSA-N Pro-Lys-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SMFQZMGHCODUPQ-ULQDDVLXSA-N 0.000 description 2
- WFIVLLFYUZZWOD-RHYQMDGZSA-N Pro-Lys-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WFIVLLFYUZZWOD-RHYQMDGZSA-N 0.000 description 2
- SWRNSCMUXRLHCR-ULQDDVLXSA-N Pro-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 SWRNSCMUXRLHCR-ULQDDVLXSA-N 0.000 description 2
- GOMUXSCOIWIJFP-GUBZILKMSA-N Pro-Ser-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GOMUXSCOIWIJFP-GUBZILKMSA-N 0.000 description 2
- LNICFEXCAHIJOR-DCAQKATOSA-N Pro-Ser-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LNICFEXCAHIJOR-DCAQKATOSA-N 0.000 description 2
- BVTYXOFTHDXSNI-IHRRRGAJSA-N Pro-Tyr-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 BVTYXOFTHDXSNI-IHRRRGAJSA-N 0.000 description 2
- VEUACYMXJKXALX-IHRRRGAJSA-N Pro-Tyr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VEUACYMXJKXALX-IHRRRGAJSA-N 0.000 description 2
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 2
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 2
- 108010019653 Pwo polymerase Proteins 0.000 description 2
- 108010025216 RVF peptide Proteins 0.000 description 2
- 108091028664 Ribonucleotide Proteins 0.000 description 2
- BKOKTRCZXRIQPX-ZLUOBGJFSA-N Ser-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N BKOKTRCZXRIQPX-ZLUOBGJFSA-N 0.000 description 2
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 2
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 2
- QGMLKFGTGXWAHF-IHRRRGAJSA-N Ser-Arg-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QGMLKFGTGXWAHF-IHRRRGAJSA-N 0.000 description 2
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 2
- VBKBDLMWICBSCY-IMJSIDKUSA-N Ser-Asp Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O VBKBDLMWICBSCY-IMJSIDKUSA-N 0.000 description 2
- KNZQGAUEYZJUSQ-ZLUOBGJFSA-N Ser-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N KNZQGAUEYZJUSQ-ZLUOBGJFSA-N 0.000 description 2
- CNIIKZQXBBQHCX-FXQIFTODSA-N Ser-Asp-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O CNIIKZQXBBQHCX-FXQIFTODSA-N 0.000 description 2
- MESDJCNHLZBMEP-ZLUOBGJFSA-N Ser-Asp-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MESDJCNHLZBMEP-ZLUOBGJFSA-N 0.000 description 2
- DBIDZNUXSLXVRG-FXQIFTODSA-N Ser-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N DBIDZNUXSLXVRG-FXQIFTODSA-N 0.000 description 2
- RFBKULCUBJAQFT-BIIVOSGPSA-N Ser-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CO)N)C(=O)O RFBKULCUBJAQFT-BIIVOSGPSA-N 0.000 description 2
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 2
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 2
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 2
- JFWDJFULOLKQFY-QWRGUYRKSA-N Ser-Gly-Phe Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JFWDJFULOLKQFY-QWRGUYRKSA-N 0.000 description 2
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 2
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 2
- JEHPKECJCALLRW-CUJWVEQBSA-N Ser-His-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEHPKECJCALLRW-CUJWVEQBSA-N 0.000 description 2
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 2
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 2
- KCGIREHVWRXNDH-GARJFASQSA-N Ser-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N KCGIREHVWRXNDH-GARJFASQSA-N 0.000 description 2
- JWOBLHJRDADHLN-KKUMJFAQSA-N Ser-Leu-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JWOBLHJRDADHLN-KKUMJFAQSA-N 0.000 description 2
- HDBOEVPDIDDEPC-CIUDSAMLSA-N Ser-Lys-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O HDBOEVPDIDDEPC-CIUDSAMLSA-N 0.000 description 2
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 2
- WGDYNRCOQRERLZ-KKUMJFAQSA-N Ser-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N WGDYNRCOQRERLZ-KKUMJFAQSA-N 0.000 description 2
- XNXRTQZTFVMJIJ-DCAQKATOSA-N Ser-Met-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O XNXRTQZTFVMJIJ-DCAQKATOSA-N 0.000 description 2
- ASGYVPAVFNDZMA-GUBZILKMSA-N Ser-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)N ASGYVPAVFNDZMA-GUBZILKMSA-N 0.000 description 2
- RRVFEDGUXSYWOW-BZSNNMDCSA-N Ser-Phe-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RRVFEDGUXSYWOW-BZSNNMDCSA-N 0.000 description 2
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 2
- RHAPJNVNWDBFQI-BQBZGAKWSA-N Ser-Pro-Gly Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RHAPJNVNWDBFQI-BQBZGAKWSA-N 0.000 description 2
- NMZXJDSKEGFDLJ-DCAQKATOSA-N Ser-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CCCCN)C(=O)O NMZXJDSKEGFDLJ-DCAQKATOSA-N 0.000 description 2
- OVQZAFXWIWNYKA-GUBZILKMSA-N Ser-Pro-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CO)N OVQZAFXWIWNYKA-GUBZILKMSA-N 0.000 description 2
- XZKQVQKUZMAADP-IMJSIDKUSA-N Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(O)=O XZKQVQKUZMAADP-IMJSIDKUSA-N 0.000 description 2
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 2
- WLJPJRGQRNCIQS-ZLUOBGJFSA-N Ser-Ser-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O WLJPJRGQRNCIQS-ZLUOBGJFSA-N 0.000 description 2
- PPCZVWHJWJFTFN-ZLUOBGJFSA-N Ser-Ser-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPCZVWHJWJFTFN-ZLUOBGJFSA-N 0.000 description 2
- AABIBDJHSKIMJK-FXQIFTODSA-N Ser-Ser-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O AABIBDJHSKIMJK-FXQIFTODSA-N 0.000 description 2
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 2
- OLKICIBQRVSQMA-SRVKXCTJSA-N Ser-Ser-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OLKICIBQRVSQMA-SRVKXCTJSA-N 0.000 description 2
- KKKVOZNCLALMPV-XKBZYTNZSA-N Ser-Thr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KKKVOZNCLALMPV-XKBZYTNZSA-N 0.000 description 2
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 2
- DYEGLQRVMBWQLD-IXOXFDKPSA-N Ser-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CO)N)O DYEGLQRVMBWQLD-IXOXFDKPSA-N 0.000 description 2
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 2
- OJFFAQFRCVPHNN-JYBASQMISA-N Ser-Thr-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O OJFFAQFRCVPHNN-JYBASQMISA-N 0.000 description 2
- HAUVENOGHPECML-BPUTZDHNSA-N Ser-Trp-Val Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CO)=CNC2=C1 HAUVENOGHPECML-BPUTZDHNSA-N 0.000 description 2
- VVKVHAOOUGNDPJ-SRVKXCTJSA-N Ser-Tyr-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VVKVHAOOUGNDPJ-SRVKXCTJSA-N 0.000 description 2
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 2
- PCMZJFMUYWIERL-ZKWXMUAHSA-N Ser-Val-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PCMZJFMUYWIERL-ZKWXMUAHSA-N 0.000 description 2
- SGZVZUCRAVSPKQ-FXQIFTODSA-N Ser-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N SGZVZUCRAVSPKQ-FXQIFTODSA-N 0.000 description 2
- ODRUTDLAONAVDV-IHRRRGAJSA-N Ser-Val-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ODRUTDLAONAVDV-IHRRRGAJSA-N 0.000 description 2
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 2
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 2
- 241000607762 Shigella flexneri Species 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- 241000256251 Spodoptera frugiperda Species 0.000 description 2
- 229930006000 Sucrose Natural products 0.000 description 2
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 2
- NHUHCSRWZMLRLA-UHFFFAOYSA-N Sulfisoxazole Chemical compound CC1=NOC(NS(=O)(=O)C=2C=CC(N)=CC=2)=C1C NHUHCSRWZMLRLA-UHFFFAOYSA-N 0.000 description 2
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 2
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 2
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 2
- TWLMXDWFVNEFFK-FJXKBIBVSA-N Thr-Arg-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O TWLMXDWFVNEFFK-FJXKBIBVSA-N 0.000 description 2
- UQTNIFUCMBFWEJ-IWGUZYHVSA-N Thr-Asn Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O UQTNIFUCMBFWEJ-IWGUZYHVSA-N 0.000 description 2
- SWIKDOUVROTZCW-GCJQMDKQSA-N Thr-Asn-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O SWIKDOUVROTZCW-GCJQMDKQSA-N 0.000 description 2
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 2
- TZKPNGDGUVREEB-FOHZUACHSA-N Thr-Asn-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O TZKPNGDGUVREEB-FOHZUACHSA-N 0.000 description 2
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 2
- IOWJRKAVLALBQB-IWGUZYHVSA-N Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O IOWJRKAVLALBQB-IWGUZYHVSA-N 0.000 description 2
- GNHRVXYZKWSJTF-HJGDQZAQSA-N Thr-Asp-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GNHRVXYZKWSJTF-HJGDQZAQSA-N 0.000 description 2
- KRPKYGOFYUNIGM-XVSYOHENSA-N Thr-Asp-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O KRPKYGOFYUNIGM-XVSYOHENSA-N 0.000 description 2
- UZJDBCHMIQXLOQ-HEIBUPTGSA-N Thr-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O UZJDBCHMIQXLOQ-HEIBUPTGSA-N 0.000 description 2
- VGYBYGQXZJDZJU-XQXXSGGOSA-N Thr-Glu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VGYBYGQXZJDZJU-XQXXSGGOSA-N 0.000 description 2
- HJOSVGCWOTYJFG-WDCWCFNPSA-N Thr-Glu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O HJOSVGCWOTYJFG-WDCWCFNPSA-N 0.000 description 2
- XOTBWOCSLMBGMF-SUSMZKCASA-N Thr-Glu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOTBWOCSLMBGMF-SUSMZKCASA-N 0.000 description 2
- BNGDYRRHRGOPHX-IFFSRLJSSA-N Thr-Glu-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O BNGDYRRHRGOPHX-IFFSRLJSSA-N 0.000 description 2
- YZUWGFXVVZQJEI-PMVVWTBXSA-N Thr-Gly-His Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O YZUWGFXVVZQJEI-PMVVWTBXSA-N 0.000 description 2
- IGGFFPOIFHZYKC-PBCZWWQYSA-N Thr-His-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O IGGFFPOIFHZYKC-PBCZWWQYSA-N 0.000 description 2
- BQBCIBCLXBKYHW-CSMHCCOUSA-N Thr-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O BQBCIBCLXBKYHW-CSMHCCOUSA-N 0.000 description 2
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 2
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 2
- FIFDDJFLNVAVMS-RHYQMDGZSA-N Thr-Leu-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O FIFDDJFLNVAVMS-RHYQMDGZSA-N 0.000 description 2
- PRNGXSILMXSWQQ-OEAJRASXSA-N Thr-Leu-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PRNGXSILMXSWQQ-OEAJRASXSA-N 0.000 description 2
- BDGBHYCAZJPLHX-HJGDQZAQSA-N Thr-Lys-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O BDGBHYCAZJPLHX-HJGDQZAQSA-N 0.000 description 2
- ZSPQUTWLWGWTPS-HJGDQZAQSA-N Thr-Lys-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZSPQUTWLWGWTPS-HJGDQZAQSA-N 0.000 description 2
- SPVHQURZJCUDQC-VOAKCMCISA-N Thr-Lys-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O SPVHQURZJCUDQC-VOAKCMCISA-N 0.000 description 2
- LHNNQVXITHUCAB-QTKMDUPCSA-N Thr-Met-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O LHNNQVXITHUCAB-QTKMDUPCSA-N 0.000 description 2
- IQHUITKNHOKGFC-MIMYLULJSA-N Thr-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IQHUITKNHOKGFC-MIMYLULJSA-N 0.000 description 2
- NZRUWPIYECBYRK-HTUGSXCWSA-N Thr-Phe-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O NZRUWPIYECBYRK-HTUGSXCWSA-N 0.000 description 2
- GFRIEEKFXOVPIR-RHYQMDGZSA-N Thr-Pro-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O GFRIEEKFXOVPIR-RHYQMDGZSA-N 0.000 description 2
- OLFOOYQTTQSSRK-UNQGMJICSA-N Thr-Pro-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLFOOYQTTQSSRK-UNQGMJICSA-N 0.000 description 2
- GVMXJJAJLIEASL-ZJDVBMNYSA-N Thr-Pro-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O GVMXJJAJLIEASL-ZJDVBMNYSA-N 0.000 description 2
- XZUBGOYOGDRYFC-XGEHTFHBSA-N Thr-Ser-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O XZUBGOYOGDRYFC-XGEHTFHBSA-N 0.000 description 2
- WKGAAMOJPMBBMC-IXOXFDKPSA-N Thr-Ser-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WKGAAMOJPMBBMC-IXOXFDKPSA-N 0.000 description 2
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 2
- QYDKSNXSBXZPFK-ZJDVBMNYSA-N Thr-Thr-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYDKSNXSBXZPFK-ZJDVBMNYSA-N 0.000 description 2
- ZMYCLHFLHRVOEA-HEIBUPTGSA-N Thr-Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ZMYCLHFLHRVOEA-HEIBUPTGSA-N 0.000 description 2
- ZOCJFNXUVSGBQI-HSHDSVGOSA-N Thr-Trp-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O ZOCJFNXUVSGBQI-HSHDSVGOSA-N 0.000 description 2
- JAWUQFCGNVEDRN-MEYUZBJRSA-N Thr-Tyr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O JAWUQFCGNVEDRN-MEYUZBJRSA-N 0.000 description 2
- YOPQYBJJNSIQGZ-JNPHEJMOSA-N Thr-Tyr-Tyr Chemical compound C([C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 YOPQYBJJNSIQGZ-JNPHEJMOSA-N 0.000 description 2
- DXDMNBJJEXYMLA-UBHSHLNASA-N Trp-Asn-Asp Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O)=CNC2=C1 DXDMNBJJEXYMLA-UBHSHLNASA-N 0.000 description 2
- GKUROEIXVURAAO-BPUTZDHNSA-N Trp-Asp-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GKUROEIXVURAAO-BPUTZDHNSA-N 0.000 description 2
- IQGJAHMZWBTRIF-UBHSHLNASA-N Trp-Asp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N IQGJAHMZWBTRIF-UBHSHLNASA-N 0.000 description 2
- FNOQJVHFVLVMOS-AAEUAGOBSA-N Trp-Gly-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N FNOQJVHFVLVMOS-AAEUAGOBSA-N 0.000 description 2
- NXQAOORHSYJRGH-AAEUAGOBSA-N Trp-Gly-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 NXQAOORHSYJRGH-AAEUAGOBSA-N 0.000 description 2
- CCZXBOFIBYQLEV-IHPCNDPISA-N Trp-Leu-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(O)=O CCZXBOFIBYQLEV-IHPCNDPISA-N 0.000 description 2
- UQHPXCFAHVTWFU-BVSLBCMMSA-N Trp-Phe-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O UQHPXCFAHVTWFU-BVSLBCMMSA-N 0.000 description 2
- ARKBYVBCEOWRNR-UBHSHLNASA-N Trp-Ser-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O ARKBYVBCEOWRNR-UBHSHLNASA-N 0.000 description 2
- ABRICLFKFRFDKS-IHPCNDPISA-N Trp-Ser-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=C(O)C=C1 ABRICLFKFRFDKS-IHPCNDPISA-N 0.000 description 2
- SGQSAIFDESQBRA-IHPCNDPISA-N Trp-Tyr-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SGQSAIFDESQBRA-IHPCNDPISA-N 0.000 description 2
- ZJPSMXCFEKMZFE-IHPCNDPISA-N Trp-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O ZJPSMXCFEKMZFE-IHPCNDPISA-N 0.000 description 2
- SDNVRAKIJVKAGS-LKTVYLICSA-N Tyr-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N SDNVRAKIJVKAGS-LKTVYLICSA-N 0.000 description 2
- TVOGEPLDNYTAHD-CQDKDKBSSA-N Tyr-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 TVOGEPLDNYTAHD-CQDKDKBSSA-N 0.000 description 2
- JXNRXNCCROJZFB-RYUDHWBXSA-N Tyr-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JXNRXNCCROJZFB-RYUDHWBXSA-N 0.000 description 2
- WDIJBEWLXLQQKD-ULQDDVLXSA-N Tyr-Arg-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O WDIJBEWLXLQQKD-ULQDDVLXSA-N 0.000 description 2
- CKKFTIQYURNSEI-IHRRRGAJSA-N Tyr-Asn-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CKKFTIQYURNSEI-IHRRRGAJSA-N 0.000 description 2
- JWHOIHCOHMZSAR-QWRGUYRKSA-N Tyr-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JWHOIHCOHMZSAR-QWRGUYRKSA-N 0.000 description 2
- HDSKHCBAVVWPCQ-FHWLQOOXSA-N Tyr-Glu-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HDSKHCBAVVWPCQ-FHWLQOOXSA-N 0.000 description 2
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 2
- IJUTXXAXQODRMW-KBPBESRZSA-N Tyr-Gly-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O IJUTXXAXQODRMW-KBPBESRZSA-N 0.000 description 2
- ULHJJQYGMWONTD-HKUYNNGSSA-N Tyr-Gly-Trp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ULHJJQYGMWONTD-HKUYNNGSSA-N 0.000 description 2
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 2
- PRONOHBTMLNXCZ-BZSNNMDCSA-N Tyr-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PRONOHBTMLNXCZ-BZSNNMDCSA-N 0.000 description 2
- WDGDKHLSDIOXQC-ACRUOGEOSA-N Tyr-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 WDGDKHLSDIOXQC-ACRUOGEOSA-N 0.000 description 2
- FMXFHNSFABRVFZ-BZSNNMDCSA-N Tyr-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FMXFHNSFABRVFZ-BZSNNMDCSA-N 0.000 description 2
- ZOBLBMGJKVJVEV-BZSNNMDCSA-N Tyr-Lys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O ZOBLBMGJKVJVEV-BZSNNMDCSA-N 0.000 description 2
- UBKKNELWDCBNCF-STQMWFEESA-N Tyr-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UBKKNELWDCBNCF-STQMWFEESA-N 0.000 description 2
- HNERGSKJJZQGEA-JYJNAYRXSA-N Tyr-Met-Met Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N HNERGSKJJZQGEA-JYJNAYRXSA-N 0.000 description 2
- WTTRJMAZPDHPGS-KKXDTOCCSA-N Tyr-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(O)=O WTTRJMAZPDHPGS-KKXDTOCCSA-N 0.000 description 2
- LRHBBGDMBLFYGL-FHWLQOOXSA-N Tyr-Phe-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LRHBBGDMBLFYGL-FHWLQOOXSA-N 0.000 description 2
- WURLIFOWSMBUAR-SLFFLAALSA-N Tyr-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CC=C(C=C3)O)N)C(=O)O WURLIFOWSMBUAR-SLFFLAALSA-N 0.000 description 2
- MQGGXGKQSVEQHR-KKUMJFAQSA-N Tyr-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 MQGGXGKQSVEQHR-KKUMJFAQSA-N 0.000 description 2
- MDXLPNRXCFOBTL-BZSNNMDCSA-N Tyr-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MDXLPNRXCFOBTL-BZSNNMDCSA-N 0.000 description 2
- ZZDYJFVIKVSUFA-WLTAIBSBSA-N Tyr-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O ZZDYJFVIKVSUFA-WLTAIBSBSA-N 0.000 description 2
- VSYROIRKNBCULO-BWAGICSOSA-N Tyr-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)O VSYROIRKNBCULO-BWAGICSOSA-N 0.000 description 2
- PWKMJDQXKCENMF-MEYUZBJRSA-N Tyr-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O PWKMJDQXKCENMF-MEYUZBJRSA-N 0.000 description 2
- GPLTZEMVOCZVAV-UFYCRDLUSA-N Tyr-Tyr-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 GPLTZEMVOCZVAV-UFYCRDLUSA-N 0.000 description 2
- KSGKJSFPWSMJHK-JNPHEJMOSA-N Tyr-Tyr-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSGKJSFPWSMJHK-JNPHEJMOSA-N 0.000 description 2
- 241000700618 Vaccinia virus Species 0.000 description 2
- JFAWZADYPRMRCO-UBHSHLNASA-N Val-Ala-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JFAWZADYPRMRCO-UBHSHLNASA-N 0.000 description 2
- LABUITCFCAABSV-BPNCWPANSA-N Val-Ala-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-BPNCWPANSA-N 0.000 description 2
- LABUITCFCAABSV-UHFFFAOYSA-N Val-Ala-Tyr Natural products CC(C)C(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-UHFFFAOYSA-N 0.000 description 2
- JLFKWDAZBRYCGX-ZKWXMUAHSA-N Val-Asn-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N JLFKWDAZBRYCGX-ZKWXMUAHSA-N 0.000 description 2
- HHSILIQTHXABKM-YDHLFZDLSA-N Val-Asp-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](Cc1ccccc1)C(O)=O HHSILIQTHXABKM-YDHLFZDLSA-N 0.000 description 2
- PMXBARDFIAPBGK-DZKIICNBSA-N Val-Glu-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PMXBARDFIAPBGK-DZKIICNBSA-N 0.000 description 2
- NXRAUQGGHPCJIB-RCOVLWMOSA-N Val-Gly-Asn Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O NXRAUQGGHPCJIB-RCOVLWMOSA-N 0.000 description 2
- BVWPHWLFGRCECJ-JSGCOSHPSA-N Val-Gly-Tyr Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N BVWPHWLFGRCECJ-JSGCOSHPSA-N 0.000 description 2
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 2
- JKHXYJKMNSSFFL-IUCAKERBSA-N Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN JKHXYJKMNSSFFL-IUCAKERBSA-N 0.000 description 2
- JAKHAONCJJZVHT-DCAQKATOSA-N Val-Lys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N JAKHAONCJJZVHT-DCAQKATOSA-N 0.000 description 2
- SBJCTAZFSZXWSR-AVGNSLFASA-N Val-Met-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N SBJCTAZFSZXWSR-AVGNSLFASA-N 0.000 description 2
- WSUWDIVCPOJFCX-TUAOUCFPSA-N Val-Met-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N WSUWDIVCPOJFCX-TUAOUCFPSA-N 0.000 description 2
- VNGKMNPAENRGDC-JYJNAYRXSA-N Val-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 VNGKMNPAENRGDC-JYJNAYRXSA-N 0.000 description 2
- NZGOVKLVQNOEKP-YDHLFZDLSA-N Val-Phe-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NZGOVKLVQNOEKP-YDHLFZDLSA-N 0.000 description 2
- WMRWZYSRQUORHJ-YDHLFZDLSA-N Val-Phe-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WMRWZYSRQUORHJ-YDHLFZDLSA-N 0.000 description 2
- JMCOXFSCTGKLLB-FKBYEOEOSA-N Val-Phe-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N JMCOXFSCTGKLLB-FKBYEOEOSA-N 0.000 description 2
- GIAZPLMMQOERPN-YUMQZZPRSA-N Val-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(O)=O GIAZPLMMQOERPN-YUMQZZPRSA-N 0.000 description 2
- XBJKAZATRJBDCU-GUBZILKMSA-N Val-Pro-Ala Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O XBJKAZATRJBDCU-GUBZILKMSA-N 0.000 description 2
- VIKZGAUAKQZDOF-NRPADANISA-N Val-Ser-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O VIKZGAUAKQZDOF-NRPADANISA-N 0.000 description 2
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 2
- GBIUHAYJGWVNLN-AEJSXWLSSA-N Val-Ser-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N GBIUHAYJGWVNLN-AEJSXWLSSA-N 0.000 description 2
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 2
- QHSSPPHOHJSTML-HOCLYGCPSA-N Val-Trp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)NCC(=O)O)N QHSSPPHOHJSTML-HOCLYGCPSA-N 0.000 description 2
- VTIAEOKFUJJBTC-YDHLFZDLSA-N Val-Tyr-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VTIAEOKFUJJBTC-YDHLFZDLSA-N 0.000 description 2
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 2
- BGTDGENDNWGMDQ-KJEVXHAQSA-N Val-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N)O BGTDGENDNWGMDQ-KJEVXHAQSA-N 0.000 description 2
- OWFGFHQMSBTKLX-UFYCRDLUSA-N Val-Tyr-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N OWFGFHQMSBTKLX-UFYCRDLUSA-N 0.000 description 2
- 238000001261 affinity purification Methods 0.000 description 2
- 239000008272 agar Substances 0.000 description 2
- 238000000246 agarose gel electrophoresis Methods 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 2
- 108010031014 alanyl-histidyl-leucyl-leucine Proteins 0.000 description 2
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 2
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 2
- 125000000217 alkyl group Chemical group 0.000 description 2
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 2
- 229910052782 aluminium Inorganic materials 0.000 description 2
- 229960000723 ampicillin Drugs 0.000 description 2
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- 230000000692 anti-sense effect Effects 0.000 description 2
- 229940088710 antibiotic agent Drugs 0.000 description 2
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 2
- 108010089975 arginyl-glycyl-aspartyl-serine Proteins 0.000 description 2
- 108010068265 aspartyltyrosine Proteins 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 230000002238 attenuated effect Effects 0.000 description 2
- 238000001574 biopsy Methods 0.000 description 2
- 229960003150 bupivacaine Drugs 0.000 description 2
- 208000023652 chronic gastritis Diseases 0.000 description 2
- 238000004737 colorimetric analysis Methods 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 238000012790 confirmation Methods 0.000 description 2
- 101150028842 ctxA gene Proteins 0.000 description 2
- 230000001351 cycling effect Effects 0.000 description 2
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 2
- 235000018417 cysteine Nutrition 0.000 description 2
- 108010004073 cysteinylcysteine Proteins 0.000 description 2
- 239000003398 denaturant Substances 0.000 description 2
- PSLWZOIUBRXAQW-UHFFFAOYSA-M dimethyl(dioctadecyl)azanium;bromide Chemical compound [Br-].CCCCCCCCCCCCCCCCCC[N+](C)(C)CCCCCCCCCCCCCCCCCC PSLWZOIUBRXAQW-UHFFFAOYSA-M 0.000 description 2
- SWSQBOPZIKWTGO-UHFFFAOYSA-N dimethylaminoamidine Natural products CN(C)C(N)=N SWSQBOPZIKWTGO-UHFFFAOYSA-N 0.000 description 2
- 239000003937 drug carrier Substances 0.000 description 2
- 238000010828 elution Methods 0.000 description 2
- 229940088598 enzyme Drugs 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 102000013165 exonuclease Human genes 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- 206010017758 gastric cancer Diseases 0.000 description 2
- 208000010749 gastric carcinoma Diseases 0.000 description 2
- 210000001156 gastric mucosa Anatomy 0.000 description 2
- 238000010353 genetic engineering Methods 0.000 description 2
- 108010037389 glutamyl-cysteinyl-lysine Proteins 0.000 description 2
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 2
- 108010075431 glycyl-alanyl-phenylalanine Proteins 0.000 description 2
- 108010078326 glycyl-glycyl-valine Proteins 0.000 description 2
- 108010054666 glycyl-leucyl-glycyl-glycine Proteins 0.000 description 2
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 2
- 108010045126 glycyl-tyrosyl-glycine Proteins 0.000 description 2
- 229960004198 guanidine Drugs 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 229940037467 helicobacter pylori Drugs 0.000 description 2
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 230000036039 immunity Effects 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 2
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 2
- 238000011068 loading method Methods 0.000 description 2
- 108010057952 lysyl-phenylalanyl-lysine Proteins 0.000 description 2
- 229910001629 magnesium chloride Inorganic materials 0.000 description 2
- 230000013011 mating Effects 0.000 description 2
- 239000004005 microsphere Substances 0.000 description 2
- 229940035032 monophosphoryl lipid a Drugs 0.000 description 2
- 238000010172 mouse model Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 229940126578 oral vaccine Drugs 0.000 description 2
- 230000036961 partial effect Effects 0.000 description 2
- 208000011906 peptic ulcer disease Diseases 0.000 description 2
- 108010024607 phenylalanylalanine Proteins 0.000 description 2
- 108010073101 phenylalanylleucine Proteins 0.000 description 2
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 2
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 2
- 239000002987 primer (paints) Substances 0.000 description 2
- 108010020755 prolyl-glycyl-glycine Proteins 0.000 description 2
- 239000011541 reaction mixture Substances 0.000 description 2
- 238000003259 recombinant expression Methods 0.000 description 2
- 238000004153 renaturation Methods 0.000 description 2
- 230000010076 replication Effects 0.000 description 2
- 239000011347 resin Substances 0.000 description 2
- 229920005989 resin Polymers 0.000 description 2
- 239000002336 ribonucleotide Substances 0.000 description 2
- 125000002652 ribonucleotide group Chemical group 0.000 description 2
- 150000003839 salts Chemical class 0.000 description 2
- 229920006395 saturated elastomer Polymers 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 230000028327 secretion Effects 0.000 description 2
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 2
- 238000010186 staining Methods 0.000 description 2
- 201000000498 stomach carcinoma Diseases 0.000 description 2
- 239000005720 sucrose Substances 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 108700004896 tripeptide FEG Proteins 0.000 description 2
- 108010080629 tryptophan-leucine Proteins 0.000 description 2
- 108010045269 tryptophyltryptophan Proteins 0.000 description 2
- WFKWXMTUELFFGS-UHFFFAOYSA-N tungsten Chemical compound [W] WFKWXMTUELFFGS-UHFFFAOYSA-N 0.000 description 2
- 229910052721 tungsten Inorganic materials 0.000 description 2
- 239000010937 tungsten Substances 0.000 description 2
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 2
- 108010035534 tyrosyl-leucyl-alanine Proteins 0.000 description 2
- 101150080234 vacA gene Proteins 0.000 description 2
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 2
- NOUIAHOPEGZYFE-JPLJXNOCSA-N (3S)-4-[[(2S)-1-[[(1S)-1-carboxy-2-(4-hydroxyphenyl)ethyl]amino]-3-methyl-1-oxobutan-2-yl]amino]-3-[[(2S)-2,6-diaminohexanoyl]amino]-4-oxobutanoic acid Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NOUIAHOPEGZYFE-JPLJXNOCSA-N 0.000 description 1
- RKDVKSZUMVYZHH-UHFFFAOYSA-N 1,4-dioxane-2,5-dione Chemical compound O=C1COC(=O)CO1 RKDVKSZUMVYZHH-UHFFFAOYSA-N 0.000 description 1
- VSWPGAIWKHPTKX-UHFFFAOYSA-N 1-methyl-10-[2-(4-methyl-1-piperazinyl)-1-oxoethyl]-5H-thieno[3,4-b][1,5]benzodiazepin-4-one Chemical compound C1CN(C)CCN1CC(=O)N1C2=CC=CC=C2NC(=O)C2=CSC(C)=C21 VSWPGAIWKHPTKX-UHFFFAOYSA-N 0.000 description 1
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 1
- RYOFERRMXDATKG-YEUCEMRASA-N 2,3-bis[(z)-octadec-9-enoxy]propyl-trimethylazanium Chemical compound CCCCCCCC\C=C/CCCCCCCCOCC(C[N+](C)(C)C)OCCCCCCCC\C=C/CCCCCCCC RYOFERRMXDATKG-YEUCEMRASA-N 0.000 description 1
- KSXTUUUQYQYKCR-LQDDAWAPSA-M 2,3-bis[[(z)-octadec-9-enoyl]oxy]propyl-trimethylazanium;chloride Chemical compound [Cl-].CCCCCCCC\C=C/CCCCCCCC(=O)OCC(C[N+](C)(C)C)OC(=O)CCCCCCC\C=C/CCCCCCCC KSXTUUUQYQYKCR-LQDDAWAPSA-M 0.000 description 1
- DQVAZKGVGKHQDS-UHFFFAOYSA-N 2-[[1-[2-[(2-amino-4-methylpentanoyl)amino]-4-methylpentanoyl]pyrrolidine-2-carbonyl]amino]-4-methylpentanoic acid Chemical compound CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(=O)NC(CC(C)C)C(O)=O DQVAZKGVGKHQDS-UHFFFAOYSA-N 0.000 description 1
- ONEGZXHXCLCVRF-UHFFFAOYSA-N 2-[[2-[[1-(2-amino-3-methylbutanoyl)pyrrolidine-2-carbonyl]amino]-3-methylbutanoyl]amino]-4-methylpentanoic acid Chemical compound CC(C)CC(C(O)=O)NC(=O)C(C(C)C)NC(=O)C1CCCN1C(=O)C(N)C(C)C ONEGZXHXCLCVRF-UHFFFAOYSA-N 0.000 description 1
- OTEWWRBKGONZBW-UHFFFAOYSA-N 2-[[2-[[2-[(2-azaniumylacetyl)amino]-4-methylpentanoyl]amino]acetyl]amino]acetate Chemical compound NCC(=O)NC(CC(C)C)C(=O)NCC(=O)NCC(O)=O OTEWWRBKGONZBW-UHFFFAOYSA-N 0.000 description 1
- ASJSAQIRZKANQN-CRCLSJGQSA-N 2-deoxy-D-ribose Chemical group OC[C@@H](O)[C@@H](O)CC=O ASJSAQIRZKANQN-CRCLSJGQSA-N 0.000 description 1
- FVFVNNKYKYZTJU-UHFFFAOYSA-N 6-chloro-1,3,5-triazine-2,4-diamine Chemical compound NC1=NC(N)=NC(Cl)=N1 FVFVNNKYKYZTJU-UHFFFAOYSA-N 0.000 description 1
- WRDABNWSWOHGMS-UHFFFAOYSA-N AEBSF hydrochloride Chemical compound Cl.NCCC1=CC=C(S(F)(=O)=O)C=C1 WRDABNWSWOHGMS-UHFFFAOYSA-N 0.000 description 1
- 102100039819 Actin, alpha cardiac muscle 1 Human genes 0.000 description 1
- UWQJHXKARZWDIJ-ZLUOBGJFSA-N Ala-Ala-Cys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(O)=O UWQJHXKARZWDIJ-ZLUOBGJFSA-N 0.000 description 1
- PJNSIUPOXFBHDM-GUBZILKMSA-N Ala-Arg-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O PJNSIUPOXFBHDM-GUBZILKMSA-N 0.000 description 1
- LBJYAILUMSUTAM-ZLUOBGJFSA-N Ala-Asn-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O LBJYAILUMSUTAM-ZLUOBGJFSA-N 0.000 description 1
- HGRBNYQIMKTUNT-XVYDVKMFSA-N Ala-Asn-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HGRBNYQIMKTUNT-XVYDVKMFSA-N 0.000 description 1
- XQGIRPGAVLFKBJ-CIUDSAMLSA-N Ala-Asn-Lys Chemical compound N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)O XQGIRPGAVLFKBJ-CIUDSAMLSA-N 0.000 description 1
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 1
- NWVVKQZOVSTDBQ-CIUDSAMLSA-N Ala-Glu-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NWVVKQZOVSTDBQ-CIUDSAMLSA-N 0.000 description 1
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 1
- NHLAEBFGWPXFGI-WHFBIAKZSA-N Ala-Gly-Asn Chemical compound C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N NHLAEBFGWPXFGI-WHFBIAKZSA-N 0.000 description 1
- RDIKFPRVLJLMER-BQBZGAKWSA-N Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)N RDIKFPRVLJLMER-BQBZGAKWSA-N 0.000 description 1
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 1
- LBYMZCVBOKYZNS-CIUDSAMLSA-N Ala-Leu-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O LBYMZCVBOKYZNS-CIUDSAMLSA-N 0.000 description 1
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 1
- SDZRIBWEVVRDQI-CIUDSAMLSA-N Ala-Lys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O SDZRIBWEVVRDQI-CIUDSAMLSA-N 0.000 description 1
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 1
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 1
- FSHURBQASBLAPO-WDSKDSINSA-N Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)N FSHURBQASBLAPO-WDSKDSINSA-N 0.000 description 1
- XSTZMVAYYCJTNR-DCAQKATOSA-N Ala-Met-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O XSTZMVAYYCJTNR-DCAQKATOSA-N 0.000 description 1
- OMNVYXHOSHNURL-WPRPVWTQSA-N Ala-Phe Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OMNVYXHOSHNURL-WPRPVWTQSA-N 0.000 description 1
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 1
- JAQNUEWEJWBVAY-WBAXXEDZSA-N Ala-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 JAQNUEWEJWBVAY-WBAXXEDZSA-N 0.000 description 1
- YCRAFFCYWOUEOF-DLOVCJGASA-N Ala-Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 YCRAFFCYWOUEOF-DLOVCJGASA-N 0.000 description 1
- XAXHGSOBFPIRFG-LSJOCFKGSA-N Ala-Pro-His Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XAXHGSOBFPIRFG-LSJOCFKGSA-N 0.000 description 1
- BTRULDJUUVGRNE-DCAQKATOSA-N Ala-Pro-Lys Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O BTRULDJUUVGRNE-DCAQKATOSA-N 0.000 description 1
- KLALXKYLOMZDQT-ZLUOBGJFSA-N Ala-Ser-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KLALXKYLOMZDQT-ZLUOBGJFSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 1
- NCQMBSJGJMYKCK-ZLUOBGJFSA-N Ala-Ser-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O NCQMBSJGJMYKCK-ZLUOBGJFSA-N 0.000 description 1
- ARHJJAAWNWOACN-FXQIFTODSA-N Ala-Ser-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O ARHJJAAWNWOACN-FXQIFTODSA-N 0.000 description 1
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 1
- ALZVPLKYDKJKQU-XVKPBYJWSA-N Ala-Tyr Chemical compound C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ALZVPLKYDKJKQU-XVKPBYJWSA-N 0.000 description 1
- AENHOIXXHKNIQL-AUTRQRHGSA-N Ala-Tyr-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H]([NH3+])C)CC1=CC=C(O)C=C1 AENHOIXXHKNIQL-AUTRQRHGSA-N 0.000 description 1
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 1
- YJHKTAMKPGFJCT-NRPADANISA-N Ala-Val-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O YJHKTAMKPGFJCT-NRPADANISA-N 0.000 description 1
- XCIGOVDXZULBBV-DCAQKATOSA-N Ala-Val-Lys Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCCCN)C(O)=O XCIGOVDXZULBBV-DCAQKATOSA-N 0.000 description 1
- DHONNEYAZPNGSG-UBHSHLNASA-N Ala-Val-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DHONNEYAZPNGSG-UBHSHLNASA-N 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 101100480489 Arabidopsis thaliana TAAC gene Proteins 0.000 description 1
- DFCIPNHFKOQAME-FXQIFTODSA-N Arg-Ala-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFCIPNHFKOQAME-FXQIFTODSA-N 0.000 description 1
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 1
- DCGLNNVKIZXQOJ-FXQIFTODSA-N Arg-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N DCGLNNVKIZXQOJ-FXQIFTODSA-N 0.000 description 1
- NUBPTCMEOCKWDO-DCAQKATOSA-N Arg-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N NUBPTCMEOCKWDO-DCAQKATOSA-N 0.000 description 1
- ITVINTQUZMQWJR-QXEWZRGKSA-N Arg-Asn-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O ITVINTQUZMQWJR-QXEWZRGKSA-N 0.000 description 1
- PBSOQGZLPFVXPU-YUMQZZPRSA-N Arg-Glu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PBSOQGZLPFVXPU-YUMQZZPRSA-N 0.000 description 1
- YKZJPIPFKGYHKY-DCAQKATOSA-N Arg-Leu-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKZJPIPFKGYHKY-DCAQKATOSA-N 0.000 description 1
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 1
- XUGATJVGQUGQKY-ULQDDVLXSA-N Arg-Lys-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XUGATJVGQUGQKY-ULQDDVLXSA-N 0.000 description 1
- NPAVRDPEFVKELR-DCAQKATOSA-N Arg-Lys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NPAVRDPEFVKELR-DCAQKATOSA-N 0.000 description 1
- OISWSORSLQOGFV-AVGNSLFASA-N Arg-Met-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CCCN=C(N)N OISWSORSLQOGFV-AVGNSLFASA-N 0.000 description 1
- FKQITMVNILRUCQ-IHRRRGAJSA-N Arg-Phe-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O FKQITMVNILRUCQ-IHRRRGAJSA-N 0.000 description 1
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 1
- ISJWBVIYRBAXEB-CIUDSAMLSA-N Arg-Ser-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O ISJWBVIYRBAXEB-CIUDSAMLSA-N 0.000 description 1
- KMFPQTITXUKJOV-DCAQKATOSA-N Arg-Ser-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O KMFPQTITXUKJOV-DCAQKATOSA-N 0.000 description 1
- JOTRDIXZHNQYGP-DCAQKATOSA-N Arg-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N JOTRDIXZHNQYGP-DCAQKATOSA-N 0.000 description 1
- ICRHGPYYXMWHIE-LPEHRKFASA-N Arg-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ICRHGPYYXMWHIE-LPEHRKFASA-N 0.000 description 1
- ASQKVGRCKOFKIU-KZVJFYERSA-N Arg-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ASQKVGRCKOFKIU-KZVJFYERSA-N 0.000 description 1
- UZSQXCMNUPKLCC-FJXKBIBVSA-N Arg-Thr-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UZSQXCMNUPKLCC-FJXKBIBVSA-N 0.000 description 1
- KSHJMDSNSKDJPU-QTKMDUPCSA-N Arg-Thr-His Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KSHJMDSNSKDJPU-QTKMDUPCSA-N 0.000 description 1
- FSPQNLYOFCXUCE-BPUTZDHNSA-N Arg-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FSPQNLYOFCXUCE-BPUTZDHNSA-N 0.000 description 1
- XOZYYXMHMIEJET-XIRDDKMYSA-N Arg-Trp-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O XOZYYXMHMIEJET-XIRDDKMYSA-N 0.000 description 1
- UGJLILSJKSBVIR-ZFWWWQNUSA-N Arg-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)NCC(O)=O)=CNC2=C1 UGJLILSJKSBVIR-ZFWWWQNUSA-N 0.000 description 1
- XRLOBFSLPCHYLQ-ULQDDVLXSA-N Arg-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O XRLOBFSLPCHYLQ-ULQDDVLXSA-N 0.000 description 1
- LFWOQHSQNCKXRU-UFYCRDLUSA-N Arg-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 LFWOQHSQNCKXRU-UFYCRDLUSA-N 0.000 description 1
- VYZBPPBKFCHCIS-WPRPVWTQSA-N Arg-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N VYZBPPBKFCHCIS-WPRPVWTQSA-N 0.000 description 1
- 241000238421 Arthropoda Species 0.000 description 1
- SJUXYGVRSGTPMC-IMJSIDKUSA-N Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O SJUXYGVRSGTPMC-IMJSIDKUSA-N 0.000 description 1
- BRCVLJZIIFBSPF-ZLUOBGJFSA-N Asn-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N BRCVLJZIIFBSPF-ZLUOBGJFSA-N 0.000 description 1
- QQEWINYJRFBLNN-DLOVCJGASA-N Asn-Ala-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QQEWINYJRFBLNN-DLOVCJGASA-N 0.000 description 1
- QEYJFBMTSMLPKZ-ZKWXMUAHSA-N Asn-Ala-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O QEYJFBMTSMLPKZ-ZKWXMUAHSA-N 0.000 description 1
- NPDLYUOYAGBHFB-WDSKDSINSA-N Asn-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NPDLYUOYAGBHFB-WDSKDSINSA-N 0.000 description 1
- GMRGSBAMMMVDGG-GUBZILKMSA-N Asn-Arg-Arg Chemical compound C(C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N GMRGSBAMMMVDGG-GUBZILKMSA-N 0.000 description 1
- GXMSVVBIAMWMKO-BQBZGAKWSA-N Asn-Arg-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N GXMSVVBIAMWMKO-BQBZGAKWSA-N 0.000 description 1
- MFFOYNGMOYFPBD-DCAQKATOSA-N Asn-Arg-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O MFFOYNGMOYFPBD-DCAQKATOSA-N 0.000 description 1
- DQTIWTULBGLJBL-DCAQKATOSA-N Asn-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N DQTIWTULBGLJBL-DCAQKATOSA-N 0.000 description 1
- GOVUDFOGXOONFT-VEVYYDQMSA-N Asn-Arg-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GOVUDFOGXOONFT-VEVYYDQMSA-N 0.000 description 1
- KSBHCUSPLWRVEK-ZLUOBGJFSA-N Asn-Asn-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KSBHCUSPLWRVEK-ZLUOBGJFSA-N 0.000 description 1
- LDSFSKFATNBTBV-UHFFFAOYSA-N Asn-Asn-Gly-His Chemical compound NC(=O)CC(N)C(=O)NC(CC(N)=O)C(=O)NCC(=O)NC(C(O)=O)CC1=CN=CN1 LDSFSKFATNBTBV-UHFFFAOYSA-N 0.000 description 1
- AYZAWXAPBAYCHO-CIUDSAMLSA-N Asn-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N AYZAWXAPBAYCHO-CIUDSAMLSA-N 0.000 description 1
- APHUDFFMXFYRKP-CIUDSAMLSA-N Asn-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N APHUDFFMXFYRKP-CIUDSAMLSA-N 0.000 description 1
- BVLIJXXSXBUGEC-SRVKXCTJSA-N Asn-Asn-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVLIJXXSXBUGEC-SRVKXCTJSA-N 0.000 description 1
- HZYFHQOWCFUSOV-IMJSIDKUSA-N Asn-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(O)=O HZYFHQOWCFUSOV-IMJSIDKUSA-N 0.000 description 1
- BHQQRVARKXWXPP-ACZMJKKPSA-N Asn-Asp-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N BHQQRVARKXWXPP-ACZMJKKPSA-N 0.000 description 1
- XSGBIBGAMKTHMY-WHFBIAKZSA-N Asn-Asp-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O XSGBIBGAMKTHMY-WHFBIAKZSA-N 0.000 description 1
- FJIRXKVEDFLLOQ-SRVKXCTJSA-N Asn-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)N)N FJIRXKVEDFLLOQ-SRVKXCTJSA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 1
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 1
- CTQIOCMSIJATNX-WHFBIAKZSA-N Asn-Gly-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O CTQIOCMSIJATNX-WHFBIAKZSA-N 0.000 description 1
- ODBSSLHUFPJRED-CIUDSAMLSA-N Asn-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N ODBSSLHUFPJRED-CIUDSAMLSA-N 0.000 description 1
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 1
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 1
- YVXRYLVELQYAEQ-SRVKXCTJSA-N Asn-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N YVXRYLVELQYAEQ-SRVKXCTJSA-N 0.000 description 1
- JEEFEQCRXKPQHC-KKUMJFAQSA-N Asn-Leu-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JEEFEQCRXKPQHC-KKUMJFAQSA-N 0.000 description 1
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 1
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 1
- NUCUBYIUPVYGPP-XIRDDKMYSA-N Asn-Leu-Trp Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CC(N)=O)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O NUCUBYIUPVYGPP-XIRDDKMYSA-N 0.000 description 1
- NCFJQJRLQJEECD-NHCYSSNCSA-N Asn-Leu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O NCFJQJRLQJEECD-NHCYSSNCSA-N 0.000 description 1
- ORJQQZIXTOYGGH-SRVKXCTJSA-N Asn-Lys-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ORJQQZIXTOYGGH-SRVKXCTJSA-N 0.000 description 1
- HMUKKNAMNSXDBB-CIUDSAMLSA-N Asn-Met-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMUKKNAMNSXDBB-CIUDSAMLSA-N 0.000 description 1
- KEUNWIXNKVWCFL-FXQIFTODSA-N Asn-Met-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O KEUNWIXNKVWCFL-FXQIFTODSA-N 0.000 description 1
- FTNRWCPWDWRPAV-BZSNNMDCSA-N Asn-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CC(N)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FTNRWCPWDWRPAV-BZSNNMDCSA-N 0.000 description 1
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 1
- SZNGQSBRHFMZLT-IHRRRGAJSA-N Asn-Pro-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SZNGQSBRHFMZLT-IHRRRGAJSA-N 0.000 description 1
- SUIJFTJDTJKSRK-IHRRRGAJSA-N Asn-Pro-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUIJFTJDTJKSRK-IHRRRGAJSA-N 0.000 description 1
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 description 1
- DOURAOODTFJRIC-CIUDSAMLSA-N Asn-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N DOURAOODTFJRIC-CIUDSAMLSA-N 0.000 description 1
- MKJBPDLENBUHQU-CIUDSAMLSA-N Asn-Ser-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O MKJBPDLENBUHQU-CIUDSAMLSA-N 0.000 description 1
- VBKIFHUVGLOJKT-FKZODXBYSA-N Asn-Thr Chemical compound C[C@@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)N)O VBKIFHUVGLOJKT-FKZODXBYSA-N 0.000 description 1
- JXMREEPBRANWBY-VEVYYDQMSA-N Asn-Thr-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JXMREEPBRANWBY-VEVYYDQMSA-N 0.000 description 1
- AMGQTNHANMRPOE-LKXGYXEUSA-N Asn-Thr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O AMGQTNHANMRPOE-LKXGYXEUSA-N 0.000 description 1
- XEGZSHSPQNDNRH-JRQIVUDYSA-N Asn-Tyr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XEGZSHSPQNDNRH-JRQIVUDYSA-N 0.000 description 1
- LRCIOEVFVGXZKB-BZSNNMDCSA-N Asn-Tyr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LRCIOEVFVGXZKB-BZSNNMDCSA-N 0.000 description 1
- KWBQPGIYEZKDEG-FSPLSTOPSA-N Asn-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O KWBQPGIYEZKDEG-FSPLSTOPSA-N 0.000 description 1
- XLDMSQYOYXINSZ-QXEWZRGKSA-N Asn-Val-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XLDMSQYOYXINSZ-QXEWZRGKSA-N 0.000 description 1
- MJIJBEYEHBKTIM-BYULHYEWSA-N Asn-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MJIJBEYEHBKTIM-BYULHYEWSA-N 0.000 description 1
- MYRLSKYSMXNLLA-LAEOZQHASA-N Asn-Val-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MYRLSKYSMXNLLA-LAEOZQHASA-N 0.000 description 1
- LMIWYCWRJVMAIQ-NHCYSSNCSA-N Asn-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N LMIWYCWRJVMAIQ-NHCYSSNCSA-N 0.000 description 1
- XOQYDFCQPWAMSA-KKHAAJSZSA-N Asn-Val-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOQYDFCQPWAMSA-KKHAAJSZSA-N 0.000 description 1
- QXNGSPZMGFEZNO-QRTARXTBSA-N Asn-Val-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O QXNGSPZMGFEZNO-QRTARXTBSA-N 0.000 description 1
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 1
- CXBOKJPLEYUPGB-FXQIFTODSA-N Asp-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N CXBOKJPLEYUPGB-FXQIFTODSA-N 0.000 description 1
- NECWUSYTYSIFNC-DLOVCJGASA-N Asp-Ala-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 NECWUSYTYSIFNC-DLOVCJGASA-N 0.000 description 1
- PSZNHSNIGMJYOZ-WDSKDSINSA-N Asp-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PSZNHSNIGMJYOZ-WDSKDSINSA-N 0.000 description 1
- ZLGKHJHFYSRUBH-FXQIFTODSA-N Asp-Arg-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLGKHJHFYSRUBH-FXQIFTODSA-N 0.000 description 1
- IXIWEFWRKIUMQX-DCAQKATOSA-N Asp-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O IXIWEFWRKIUMQX-DCAQKATOSA-N 0.000 description 1
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 1
- VBVKSAFJPVXMFJ-CIUDSAMLSA-N Asp-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N VBVKSAFJPVXMFJ-CIUDSAMLSA-N 0.000 description 1
- XACXDSRQIXRMNS-OLHMAJIHSA-N Asp-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)O XACXDSRQIXRMNS-OLHMAJIHSA-N 0.000 description 1
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 1
- WCFCYFDBMNFSPA-ACZMJKKPSA-N Asp-Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O WCFCYFDBMNFSPA-ACZMJKKPSA-N 0.000 description 1
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 1
- CKAJHWFHHFSCDT-WHFBIAKZSA-N Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O CKAJHWFHHFSCDT-WHFBIAKZSA-N 0.000 description 1
- DGKCOYGQLNWNCJ-ACZMJKKPSA-N Asp-Glu-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O DGKCOYGQLNWNCJ-ACZMJKKPSA-N 0.000 description 1
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 1
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 1
- KFAFUJMGHVVYRC-DCAQKATOSA-N Asp-Leu-Met Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O KFAFUJMGHVVYRC-DCAQKATOSA-N 0.000 description 1
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 1
- UZFHNLYQWMGUHU-DCAQKATOSA-N Asp-Lys-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UZFHNLYQWMGUHU-DCAQKATOSA-N 0.000 description 1
- NZWDWXSWUQCNMG-GARJFASQSA-N Asp-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N)C(=O)O NZWDWXSWUQCNMG-GARJFASQSA-N 0.000 description 1
- RXBGWGRSWXOBGK-KKUMJFAQSA-N Asp-Lys-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RXBGWGRSWXOBGK-KKUMJFAQSA-N 0.000 description 1
- MYLZFUMPZCPJCJ-NHCYSSNCSA-N Asp-Lys-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MYLZFUMPZCPJCJ-NHCYSSNCSA-N 0.000 description 1
- VMVUDJUXJKDGNR-FXQIFTODSA-N Asp-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N VMVUDJUXJKDGNR-FXQIFTODSA-N 0.000 description 1
- LKVKODXGSAFOFY-VEVYYDQMSA-N Asp-Met-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LKVKODXGSAFOFY-VEVYYDQMSA-N 0.000 description 1
- DJCAHYVLMSRBFR-QXEWZRGKSA-N Asp-Met-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(O)=O DJCAHYVLMSRBFR-QXEWZRGKSA-N 0.000 description 1
- IDDMGSKZQDEDGA-SRVKXCTJSA-N Asp-Phe-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 IDDMGSKZQDEDGA-SRVKXCTJSA-N 0.000 description 1
- WZUZGDANRQPCDD-SRVKXCTJSA-N Asp-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N WZUZGDANRQPCDD-SRVKXCTJSA-N 0.000 description 1
- QJHOOKBAHRJPPX-QWRGUYRKSA-N Asp-Phe-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 QJHOOKBAHRJPPX-QWRGUYRKSA-N 0.000 description 1
- NONWUQAWAANERO-BZSNNMDCSA-N Asp-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 NONWUQAWAANERO-BZSNNMDCSA-N 0.000 description 1
- UKGGPJNBONZZCM-WDSKDSINSA-N Asp-Pro Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O UKGGPJNBONZZCM-WDSKDSINSA-N 0.000 description 1
- ZKAOJVJQGVUIIU-GUBZILKMSA-N Asp-Pro-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ZKAOJVJQGVUIIU-GUBZILKMSA-N 0.000 description 1
- YFGUZQQCSDZRBN-DCAQKATOSA-N Asp-Pro-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O YFGUZQQCSDZRBN-DCAQKATOSA-N 0.000 description 1
- XXAMCEGRCZQGEM-ZLUOBGJFSA-N Asp-Ser-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O XXAMCEGRCZQGEM-ZLUOBGJFSA-N 0.000 description 1
- VNXQRBXEQXLERQ-CIUDSAMLSA-N Asp-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N VNXQRBXEQXLERQ-CIUDSAMLSA-N 0.000 description 1
- NAAAPCLFJPURAM-HJGDQZAQSA-N Asp-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O NAAAPCLFJPURAM-HJGDQZAQSA-N 0.000 description 1
- GXHDGYOXPNQCKM-XVSYOHENSA-N Asp-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GXHDGYOXPNQCKM-XVSYOHENSA-N 0.000 description 1
- AWPWHMVCSISSQK-QWRGUYRKSA-N Asp-Tyr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O AWPWHMVCSISSQK-QWRGUYRKSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 208000004300 Atrophic Gastritis Diseases 0.000 description 1
- 241000271566 Aves Species 0.000 description 1
- 108020000946 Bacterial DNA Proteins 0.000 description 1
- 231100000699 Bacterial toxin Toxicity 0.000 description 1
- 241001288393 Belgica Species 0.000 description 1
- 241000537222 Betabaculovirus Species 0.000 description 1
- BVKZGUZCCUSVTD-UHFFFAOYSA-M Bicarbonate Chemical compound OC([O-])=O BVKZGUZCCUSVTD-UHFFFAOYSA-M 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- 241000589562 Brucella Species 0.000 description 1
- 101100512078 Caenorhabditis elegans lys-1 gene Proteins 0.000 description 1
- VTYYLEPIZMXCLO-UHFFFAOYSA-L Calcium carbonate Chemical class [Ca+2].[O-]C([O-])=O VTYYLEPIZMXCLO-UHFFFAOYSA-L 0.000 description 1
- 101100348617 Candida albicans (strain SC5314 / ATCC MYA-2876) NIK1 gene Proteins 0.000 description 1
- 241000701931 Canine parvovirus Species 0.000 description 1
- 241000282472 Canis lupus familiaris Species 0.000 description 1
- BQENDLAVTKRQMS-SBBGFIFASA-L Carbenoxolone sodium Chemical compound [Na+].[Na+].C([C@H]1C2=CC(=O)[C@H]34)[C@@](C)(C([O-])=O)CC[C@]1(C)CC[C@@]2(C)[C@]4(C)CC[C@@H]1[C@]3(C)CC[C@H](OC(=O)CCC([O-])=O)C1(C)C BQENDLAVTKRQMS-SBBGFIFASA-L 0.000 description 1
- 102100035882 Catalase Human genes 0.000 description 1
- 108010053835 Catalase Proteins 0.000 description 1
- 206010061041 Chlamydial infection Diseases 0.000 description 1
- 241000193163 Clostridioides difficile Species 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 102100032202 Cornulin Human genes 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- 241001137307 Cyprinodon variegatus Species 0.000 description 1
- NOCCABSVTRONIN-CIUDSAMLSA-N Cys-Ala-Leu Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CS)N NOCCABSVTRONIN-CIUDSAMLSA-N 0.000 description 1
- CIVXDCMSSFGWAL-YUMQZZPRSA-N Cys-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CS)N CIVXDCMSSFGWAL-YUMQZZPRSA-N 0.000 description 1
- WTEACWBAULENKE-SRVKXCTJSA-N Cys-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CS)N WTEACWBAULENKE-SRVKXCTJSA-N 0.000 description 1
- BOMGEMDZTNZESV-QWRGUYRKSA-N Cys-Tyr-Gly Chemical compound SC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 BOMGEMDZTNZESV-QWRGUYRKSA-N 0.000 description 1
- KZZYVYWSXMFYEC-DCAQKATOSA-N Cys-Val-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KZZYVYWSXMFYEC-DCAQKATOSA-N 0.000 description 1
- 238000007399 DNA isolation Methods 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 238000011238 DNA vaccination Methods 0.000 description 1
- 102100036912 Desmin Human genes 0.000 description 1
- 108010044052 Desmin Proteins 0.000 description 1
- 241001669680 Dormitator maculatus Species 0.000 description 1
- 238000012286 ELISA Assay Methods 0.000 description 1
- 101000686777 Escherichia phage T7 T7 RNA polymerase Proteins 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 108010040721 Flagellin Proteins 0.000 description 1
- 241001200922 Gagata Species 0.000 description 1
- 208000036495 Gastritis atrophic Diseases 0.000 description 1
- JZDHUJAFXGNDSB-WHFBIAKZSA-N Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O JZDHUJAFXGNDSB-WHFBIAKZSA-N 0.000 description 1
- AVZHGSCDKIQZPQ-CIUDSAMLSA-N Glu-Arg-Ala Chemical compound C[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AVZHGSCDKIQZPQ-CIUDSAMLSA-N 0.000 description 1
- KEBACWCLVOXFNC-DCAQKATOSA-N Glu-Arg-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O KEBACWCLVOXFNC-DCAQKATOSA-N 0.000 description 1
- AKJRHDMTEJXTPV-ACZMJKKPSA-N Glu-Asn-Ala Chemical compound C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AKJRHDMTEJXTPV-ACZMJKKPSA-N 0.000 description 1
- SVZIKUHLRKVZIF-GUBZILKMSA-N Glu-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N SVZIKUHLRKVZIF-GUBZILKMSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- BUZMZDDKFCSKOT-CIUDSAMLSA-N Glu-Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 1
- QYPKJXSMLMREKF-BPUTZDHNSA-N Glu-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N QYPKJXSMLMREKF-BPUTZDHNSA-N 0.000 description 1
- BUAKRRKDHSSIKK-IHRRRGAJSA-N Glu-Glu-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 BUAKRRKDHSSIKK-IHRRRGAJSA-N 0.000 description 1
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 1
- UHVIQGKBMXEVGN-WDSKDSINSA-N Glu-Gly-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UHVIQGKBMXEVGN-WDSKDSINSA-N 0.000 description 1
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 1
- XIKYNVKEUINBGL-IUCAKERBSA-N Glu-His-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O XIKYNVKEUINBGL-IUCAKERBSA-N 0.000 description 1
- DVLZZEPUNFEUBW-AVGNSLFASA-N Glu-His-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N DVLZZEPUNFEUBW-AVGNSLFASA-N 0.000 description 1
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 1
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 1
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 1
- FBEJIDRSQCGFJI-GUBZILKMSA-N Glu-Leu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FBEJIDRSQCGFJI-GUBZILKMSA-N 0.000 description 1
- CUPSDFQZTVVTSK-GUBZILKMSA-N Glu-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O CUPSDFQZTVVTSK-GUBZILKMSA-N 0.000 description 1
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 1
- QDMVXRNLOPTPIE-WDCWCFNPSA-N Glu-Lys-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QDMVXRNLOPTPIE-WDCWCFNPSA-N 0.000 description 1
- AQNYKMCFCCZEEL-JYJNAYRXSA-N Glu-Lys-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AQNYKMCFCCZEEL-JYJNAYRXSA-N 0.000 description 1
- ZQYZDDXTNQXUJH-CIUDSAMLSA-N Glu-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(=O)O)N ZQYZDDXTNQXUJH-CIUDSAMLSA-N 0.000 description 1
- AOCARQDSFTWWFT-DCAQKATOSA-N Glu-Met-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AOCARQDSFTWWFT-DCAQKATOSA-N 0.000 description 1
- ZWMYUDZLXAQHCK-CIUDSAMLSA-N Glu-Met-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O ZWMYUDZLXAQHCK-CIUDSAMLSA-N 0.000 description 1
- QMOSCLNJVKSHHU-YUMQZZPRSA-N Glu-Met-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O QMOSCLNJVKSHHU-YUMQZZPRSA-N 0.000 description 1
- SOEPMWQCTJITPZ-SRVKXCTJSA-N Glu-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N SOEPMWQCTJITPZ-SRVKXCTJSA-N 0.000 description 1
- XMBSYZWANAQXEV-QWRGUYRKSA-N Glu-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-QWRGUYRKSA-N 0.000 description 1
- CQAHWYDHKUWYIX-YUMQZZPRSA-N Glu-Pro-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O CQAHWYDHKUWYIX-YUMQZZPRSA-N 0.000 description 1
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 1
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 1
- DMYACXMQUABZIQ-NRPADANISA-N Glu-Ser-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O DMYACXMQUABZIQ-NRPADANISA-N 0.000 description 1
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 1
- CQGBSALYGOXQPE-HTUGSXCWSA-N Glu-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O CQGBSALYGOXQPE-HTUGSXCWSA-N 0.000 description 1
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 1
- HHSKZJZWQFPSKN-AVGNSLFASA-N Glu-Tyr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O HHSKZJZWQFPSKN-AVGNSLFASA-N 0.000 description 1
- HQTDNEZTGZUWSY-XVKPBYJWSA-N Glu-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)NCC(O)=O HQTDNEZTGZUWSY-XVKPBYJWSA-N 0.000 description 1
- 108010070675 Glutathione transferase Proteins 0.000 description 1
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 1
- BRFJMRSRMOMIMU-WHFBIAKZSA-N Gly-Ala-Asn Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O BRFJMRSRMOMIMU-WHFBIAKZSA-N 0.000 description 1
- JBRBACJPBZNFMF-YUMQZZPRSA-N Gly-Ala-Lys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN JBRBACJPBZNFMF-YUMQZZPRSA-N 0.000 description 1
- GWCRIHNSVMOBEQ-BQBZGAKWSA-N Gly-Arg-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O GWCRIHNSVMOBEQ-BQBZGAKWSA-N 0.000 description 1
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 1
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 1
- ZRZILYKEJBMFHY-BQBZGAKWSA-N Gly-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN ZRZILYKEJBMFHY-BQBZGAKWSA-N 0.000 description 1
- TZOVVRJYUDETQG-RCOVLWMOSA-N Gly-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN TZOVVRJYUDETQG-RCOVLWMOSA-N 0.000 description 1
- IEFJWDNGDZAYNZ-BYPYZUCNSA-N Gly-Glu Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(O)=O IEFJWDNGDZAYNZ-BYPYZUCNSA-N 0.000 description 1
- MOJKRXIRAZPZLW-WDSKDSINSA-N Gly-Glu-Ala Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MOJKRXIRAZPZLW-WDSKDSINSA-N 0.000 description 1
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 1
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 1
- LHRXAHLCRMQBGJ-RYUDHWBXSA-N Gly-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN LHRXAHLCRMQBGJ-RYUDHWBXSA-N 0.000 description 1
- BEQGFMIBZFNROK-JGVFFNPUSA-N Gly-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)CN)C(=O)O BEQGFMIBZFNROK-JGVFFNPUSA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 1
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 1
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 1
- FSPVILZGHUJOHS-QWRGUYRKSA-N Gly-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CNC=N1 FSPVILZGHUJOHS-QWRGUYRKSA-N 0.000 description 1
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 1
- YSDLIYZLOTZZNP-UWVGGRQHSA-N Gly-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN YSDLIYZLOTZZNP-UWVGGRQHSA-N 0.000 description 1
- JPAACTMBBBGAAR-HOTGVXAUSA-N Gly-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)CN)CC(C)C)C(O)=O)=CNC2=C1 JPAACTMBBBGAAR-HOTGVXAUSA-N 0.000 description 1
- BXICSAQLIHFDDL-YUMQZZPRSA-N Gly-Lys-Asn Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O BXICSAQLIHFDDL-YUMQZZPRSA-N 0.000 description 1
- LOEANKRDMMVOGZ-YUMQZZPRSA-N Gly-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O LOEANKRDMMVOGZ-YUMQZZPRSA-N 0.000 description 1
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 1
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 1
- PCPOYRCAHPJXII-UWVGGRQHSA-N Gly-Lys-Met Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O PCPOYRCAHPJXII-UWVGGRQHSA-N 0.000 description 1
- MHZXESQPPXOING-KBPBESRZSA-N Gly-Lys-Phe Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MHZXESQPPXOING-KBPBESRZSA-N 0.000 description 1
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 1
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 1
- OQQKUTVULYLCDG-ONGXEEELSA-N Gly-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)CN)C(O)=O OQQKUTVULYLCDG-ONGXEEELSA-N 0.000 description 1
- WMGHDYWNHNLGBV-ONGXEEELSA-N Gly-Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 WMGHDYWNHNLGBV-ONGXEEELSA-N 0.000 description 1
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 1
- WZSHYFGOLPXPLL-RYUDHWBXSA-N Gly-Phe-Glu Chemical compound NCC(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CCC(O)=O)C(O)=O WZSHYFGOLPXPLL-RYUDHWBXSA-N 0.000 description 1
- VDCRBJACQKOSMS-JSGCOSHPSA-N Gly-Phe-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O VDCRBJACQKOSMS-JSGCOSHPSA-N 0.000 description 1
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 1
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 1
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 1
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 1
- XBGGUPMXALFZOT-VIFPVBQESA-N Gly-Tyr Chemical compound NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-VIFPVBQESA-N 0.000 description 1
- UIQGJYUEQDOODF-KWQFWETISA-N Gly-Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 UIQGJYUEQDOODF-KWQFWETISA-N 0.000 description 1
- UVTSZKIATYSKIR-RYUDHWBXSA-N Gly-Tyr-Glu Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O UVTSZKIATYSKIR-RYUDHWBXSA-N 0.000 description 1
- OCRQUYDOYKCOQG-IRXDYDNUSA-N Gly-Tyr-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 OCRQUYDOYKCOQG-IRXDYDNUSA-N 0.000 description 1
- YDIDLLVFCYSXNY-RCOVLWMOSA-N Gly-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN YDIDLLVFCYSXNY-RCOVLWMOSA-N 0.000 description 1
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 1
- JZNWSCPGTDBMEW-UHFFFAOYSA-N Glycerophosphorylethanolamin Natural products NCCOP(O)(=O)OCC(O)CO JZNWSCPGTDBMEW-UHFFFAOYSA-N 0.000 description 1
- 208000028861 Helicobacter pylori infectious disease Diseases 0.000 description 1
- 102100029100 Hematopoietic prostaglandin D synthase Human genes 0.000 description 1
- AFPFGFUGETYOSY-HGNGGELXSA-N His-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AFPFGFUGETYOSY-HGNGGELXSA-N 0.000 description 1
- VSLXGYMEHVAJBH-DLOVCJGASA-N His-Ala-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O VSLXGYMEHVAJBH-DLOVCJGASA-N 0.000 description 1
- ZIMTWPHIKZEHSE-UWVGGRQHSA-N His-Arg-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O ZIMTWPHIKZEHSE-UWVGGRQHSA-N 0.000 description 1
- AVQOSMRPITVTRB-CIUDSAMLSA-N His-Asn-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AVQOSMRPITVTRB-CIUDSAMLSA-N 0.000 description 1
- DFHVLUKTTVTCKY-PBCZWWQYSA-N His-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N)O DFHVLUKTTVTCKY-PBCZWWQYSA-N 0.000 description 1
- WZOGEMJIZBNFBK-CIUDSAMLSA-N His-Asp-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O WZOGEMJIZBNFBK-CIUDSAMLSA-N 0.000 description 1
- LSQHWKPPOFDHHZ-YUMQZZPRSA-N His-Asp-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N LSQHWKPPOFDHHZ-YUMQZZPRSA-N 0.000 description 1
- FDQYIRHBVVUTJF-ZETCQYMHSA-N His-Gly-Gly Chemical compound [O-]C(=O)CNC(=O)CNC(=O)[C@@H]([NH3+])CC1=CN=CN1 FDQYIRHBVVUTJF-ZETCQYMHSA-N 0.000 description 1
- NQKRILCJYCASDV-QWRGUYRKSA-N His-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 NQKRILCJYCASDV-QWRGUYRKSA-N 0.000 description 1
- YAALVYQFVJNXIV-KKUMJFAQSA-N His-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 YAALVYQFVJNXIV-KKUMJFAQSA-N 0.000 description 1
- RNMNYMDTESKEAJ-KKUMJFAQSA-N His-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 RNMNYMDTESKEAJ-KKUMJFAQSA-N 0.000 description 1
- KHUFDBQXGLEIHC-BZSNNMDCSA-N His-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 KHUFDBQXGLEIHC-BZSNNMDCSA-N 0.000 description 1
- CZVQSYNVUHAILZ-UWVGGRQHSA-N His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 CZVQSYNVUHAILZ-UWVGGRQHSA-N 0.000 description 1
- ZFDKSLBEWYCOCS-BZSNNMDCSA-N His-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1NC=NC=1)C1=CC=CC=C1 ZFDKSLBEWYCOCS-BZSNNMDCSA-N 0.000 description 1
- ULRFSEJGSHYLQI-YESZJQIVSA-N His-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CN=CN3)N)C(=O)O ULRFSEJGSHYLQI-YESZJQIVSA-N 0.000 description 1
- PGXZHYYGOPKYKM-IHRRRGAJSA-N His-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CN=CN2)N)C(=O)N[C@@H](CCCCN)C(=O)O PGXZHYYGOPKYKM-IHRRRGAJSA-N 0.000 description 1
- STGQSBKUYSPPIG-CIUDSAMLSA-N His-Ser-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 STGQSBKUYSPPIG-CIUDSAMLSA-N 0.000 description 1
- FFKJUTZARGRVTH-KKUMJFAQSA-N His-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FFKJUTZARGRVTH-KKUMJFAQSA-N 0.000 description 1
- WRPDZHJNLYNFFT-GEVIPFJHSA-N His-Thr Chemical compound C[C@@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N)O WRPDZHJNLYNFFT-GEVIPFJHSA-N 0.000 description 1
- CCUSLCQWVMWTIS-IXOXFDKPSA-N His-Thr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O CCUSLCQWVMWTIS-IXOXFDKPSA-N 0.000 description 1
- UWSMZKRTOZEGDD-CUJWVEQBSA-N His-Thr-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O UWSMZKRTOZEGDD-CUJWVEQBSA-N 0.000 description 1
- RNVUQLOKVIPNEM-BZSNNMDCSA-N His-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O RNVUQLOKVIPNEM-BZSNNMDCSA-N 0.000 description 1
- MRVZCDSYLJXKKX-ACRUOGEOSA-N His-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CN=CN3)N MRVZCDSYLJXKKX-ACRUOGEOSA-N 0.000 description 1
- XGBVLRJLHUVCNK-DCAQKATOSA-N His-Val-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O XGBVLRJLHUVCNK-DCAQKATOSA-N 0.000 description 1
- 101710169678 Histidine-rich protein Proteins 0.000 description 1
- 108010025076 Holoenzymes Proteins 0.000 description 1
- 101000959247 Homo sapiens Actin, alpha cardiac muscle 1 Proteins 0.000 description 1
- 101000920981 Homo sapiens Cornulin Proteins 0.000 description 1
- 102000009786 Immunoglobulin Constant Regions Human genes 0.000 description 1
- 108010009817 Immunoglobulin Constant Regions Proteins 0.000 description 1
- 102000001706 Immunoglobulin Fab Fragments Human genes 0.000 description 1
- 108010054477 Immunoglobulin Fab Fragments Proteins 0.000 description 1
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 1
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 1
- 229930010555 Inosine Natural products 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- 102100024319 Intestinal-type alkaline phosphatase Human genes 0.000 description 1
- 101710184243 Intestinal-type alkaline phosphatase Proteins 0.000 description 1
- 241000235058 Komagataella pastoris Species 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 1
- 108010054278 Lac Repressors Proteins 0.000 description 1
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 1
- KWTVLKBOQATPHJ-SRVKXCTJSA-N Leu-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N KWTVLKBOQATPHJ-SRVKXCTJSA-N 0.000 description 1
- SUPVSFFZWVOEOI-CQDKDKBSSA-N Leu-Ala-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-CQDKDKBSSA-N 0.000 description 1
- SUPVSFFZWVOEOI-UHFFFAOYSA-N Leu-Ala-Tyr Natural products CC(C)CC(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-UHFFFAOYSA-N 0.000 description 1
- HASRFYOMVPJRPU-SRVKXCTJSA-N Leu-Arg-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HASRFYOMVPJRPU-SRVKXCTJSA-N 0.000 description 1
- KSZCCRIGNVSHFH-UWVGGRQHSA-N Leu-Arg-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O KSZCCRIGNVSHFH-UWVGGRQHSA-N 0.000 description 1
- QUAAUWNLWMLERT-IHRRRGAJSA-N Leu-Arg-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O QUAAUWNLWMLERT-IHRRRGAJSA-N 0.000 description 1
- VCSBGUACOYUIGD-CIUDSAMLSA-N Leu-Asn-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VCSBGUACOYUIGD-CIUDSAMLSA-N 0.000 description 1
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 1
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 1
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 1
- NHHKSOGJYNQENP-SRVKXCTJSA-N Leu-Cys-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N NHHKSOGJYNQENP-SRVKXCTJSA-N 0.000 description 1
- YORLGJINWYYIMX-KKUMJFAQSA-N Leu-Cys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YORLGJINWYYIMX-KKUMJFAQSA-N 0.000 description 1
- PNUCWVAGVNLUMW-CIUDSAMLSA-N Leu-Cys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O PNUCWVAGVNLUMW-CIUDSAMLSA-N 0.000 description 1
- NFNVDJGXRFEYTK-YUMQZZPRSA-N Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O NFNVDJGXRFEYTK-YUMQZZPRSA-N 0.000 description 1
- QPXBPQUGXHURGP-UWVGGRQHSA-N Leu-Gly-Met Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CCSC)C(=O)O)N QPXBPQUGXHURGP-UWVGGRQHSA-N 0.000 description 1
- UCDHVOALNXENLC-KBPBESRZSA-N Leu-Gly-Tyr Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UCDHVOALNXENLC-KBPBESRZSA-N 0.000 description 1
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- IFMPDNRWZZEZSL-SRVKXCTJSA-N Leu-Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O IFMPDNRWZZEZSL-SRVKXCTJSA-N 0.000 description 1
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 1
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 1
- VVQJGYPTIYOFBR-IHRRRGAJSA-N Leu-Lys-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N VVQJGYPTIYOFBR-IHRRRGAJSA-N 0.000 description 1
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 1
- MJTOYIHCKVQICL-ULQDDVLXSA-N Leu-Met-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N MJTOYIHCKVQICL-ULQDDVLXSA-N 0.000 description 1
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 1
- LQUIENKUVKPNIC-ULQDDVLXSA-N Leu-Met-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LQUIENKUVKPNIC-ULQDDVLXSA-N 0.000 description 1
- NJMXCOOEFLMZSR-AVGNSLFASA-N Leu-Met-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O NJMXCOOEFLMZSR-AVGNSLFASA-N 0.000 description 1
- INCJJHQRZGQLFC-KBPBESRZSA-N Leu-Phe-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O INCJJHQRZGQLFC-KBPBESRZSA-N 0.000 description 1
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 1
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 1
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 1
- MUCIDQMDOYQYBR-IHRRRGAJSA-N Leu-Pro-His Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N MUCIDQMDOYQYBR-IHRRRGAJSA-N 0.000 description 1
- KWLWZYMNUZJKMZ-IHRRRGAJSA-N Leu-Pro-Leu Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O KWLWZYMNUZJKMZ-IHRRRGAJSA-N 0.000 description 1
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 1
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 1
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 1
- KIZIOFNVSOSKJI-CIUDSAMLSA-N Leu-Ser-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N KIZIOFNVSOSKJI-CIUDSAMLSA-N 0.000 description 1
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 1
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 1
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 1
- VDIARPPNADFEAV-WEDXCCLWSA-N Leu-Thr-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O VDIARPPNADFEAV-WEDXCCLWSA-N 0.000 description 1
- LSLUTXRANSUGFY-XIRDDKMYSA-N Leu-Trp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O LSLUTXRANSUGFY-XIRDDKMYSA-N 0.000 description 1
- HOMFINRJHIIZNJ-HOCLYGCPSA-N Leu-Trp-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O HOMFINRJHIIZNJ-HOCLYGCPSA-N 0.000 description 1
- FPFOYSCDUWTZBF-IHPCNDPISA-N Leu-Trp-Leu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H]([NH3+])CC(C)C)C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 FPFOYSCDUWTZBF-IHPCNDPISA-N 0.000 description 1
- RIHIGSWBLHSGLV-CQDKDKBSSA-N Leu-Tyr-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O RIHIGSWBLHSGLV-CQDKDKBSSA-N 0.000 description 1
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 1
- SEOXPEFQEOYURL-PMVMPFDFSA-N Leu-Tyr-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O SEOXPEFQEOYURL-PMVMPFDFSA-N 0.000 description 1
- YIRIDPUGZKHMHT-ACRUOGEOSA-N Leu-Tyr-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YIRIDPUGZKHMHT-ACRUOGEOSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- FZIJIFCXUCZHOL-CIUDSAMLSA-N Lys-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN FZIJIFCXUCZHOL-CIUDSAMLSA-N 0.000 description 1
- RVOMPSJXSRPFJT-DCAQKATOSA-N Lys-Ala-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVOMPSJXSRPFJT-DCAQKATOSA-N 0.000 description 1
- MPOHDJKRBLVGCT-CIUDSAMLSA-N Lys-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N MPOHDJKRBLVGCT-CIUDSAMLSA-N 0.000 description 1
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 1
- CLBGMWIYPYAZPR-AVGNSLFASA-N Lys-Arg-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O CLBGMWIYPYAZPR-AVGNSLFASA-N 0.000 description 1
- ALSRJRIWBNENFY-DCAQKATOSA-N Lys-Arg-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O ALSRJRIWBNENFY-DCAQKATOSA-N 0.000 description 1
- SJNZALDHDUYDBU-IHRRRGAJSA-N Lys-Arg-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(O)=O SJNZALDHDUYDBU-IHRRRGAJSA-N 0.000 description 1
- CKSXSQUVEYCDIW-AVGNSLFASA-N Lys-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N CKSXSQUVEYCDIW-AVGNSLFASA-N 0.000 description 1
- NTSPQIONFJUMJV-AVGNSLFASA-N Lys-Arg-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O NTSPQIONFJUMJV-AVGNSLFASA-N 0.000 description 1
- JPNRPAJITHRXRH-BQBZGAKWSA-N Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O JPNRPAJITHRXRH-BQBZGAKWSA-N 0.000 description 1
- MKBIVWXCFINCLE-SRVKXCTJSA-N Lys-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N MKBIVWXCFINCLE-SRVKXCTJSA-N 0.000 description 1
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 1
- HGZHSNBZDOLMLH-DCAQKATOSA-N Lys-Asn-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N HGZHSNBZDOLMLH-DCAQKATOSA-N 0.000 description 1
- YVSHZSUKQHNDHD-KKUMJFAQSA-N Lys-Asn-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N YVSHZSUKQHNDHD-KKUMJFAQSA-N 0.000 description 1
- QUCDKEKDPYISNX-HJGDQZAQSA-N Lys-Asn-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QUCDKEKDPYISNX-HJGDQZAQSA-N 0.000 description 1
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 1
- SSJBMGCZZXCGJJ-DCAQKATOSA-N Lys-Asp-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O SSJBMGCZZXCGJJ-DCAQKATOSA-N 0.000 description 1
- YEIYAQQKADPIBJ-GARJFASQSA-N Lys-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O YEIYAQQKADPIBJ-GARJFASQSA-N 0.000 description 1
- GKFNXYMAMKJSKD-NHCYSSNCSA-N Lys-Asp-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GKFNXYMAMKJSKD-NHCYSSNCSA-N 0.000 description 1
- OPTCSTACHGNULU-DCAQKATOSA-N Lys-Cys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CCCCN OPTCSTACHGNULU-DCAQKATOSA-N 0.000 description 1
- DRCILAJNUJKAHC-SRVKXCTJSA-N Lys-Glu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DRCILAJNUJKAHC-SRVKXCTJSA-N 0.000 description 1
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 1
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 1
- WGLAORUKDGRINI-WDCWCFNPSA-N Lys-Glu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGLAORUKDGRINI-WDCWCFNPSA-N 0.000 description 1
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 1
- IGRMTQMIDNDFAA-UWVGGRQHSA-N Lys-His Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 IGRMTQMIDNDFAA-UWVGGRQHSA-N 0.000 description 1
- ZMMDPRTXLAEMOD-BZSNNMDCSA-N Lys-His-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZMMDPRTXLAEMOD-BZSNNMDCSA-N 0.000 description 1
- GNLJXWBNLAIPEP-MELADBBJSA-N Lys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCCN)N)C(=O)O GNLJXWBNLAIPEP-MELADBBJSA-N 0.000 description 1
- ATIPDCIQTUXABX-UWVGGRQHSA-N Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ATIPDCIQTUXABX-UWVGGRQHSA-N 0.000 description 1
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 1
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 1
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 1
- RIJCHEVHFWMDKD-SRVKXCTJSA-N Lys-Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RIJCHEVHFWMDKD-SRVKXCTJSA-N 0.000 description 1
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 1
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 1
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 1
- QQPSCXKFDSORFT-IHRRRGAJSA-N Lys-Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN QQPSCXKFDSORFT-IHRRRGAJSA-N 0.000 description 1
- URGPVYGVWLIRGT-DCAQKATOSA-N Lys-Met-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O URGPVYGVWLIRGT-DCAQKATOSA-N 0.000 description 1
- SPNKGZFASINBMR-IHRRRGAJSA-N Lys-Met-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N SPNKGZFASINBMR-IHRRRGAJSA-N 0.000 description 1
- ALEVUGKHINJNIF-QEJZJMRPSA-N Lys-Phe-Ala Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 ALEVUGKHINJNIF-QEJZJMRPSA-N 0.000 description 1
- MSSJJDVQTFTLIF-KBPBESRZSA-N Lys-Phe-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O MSSJJDVQTFTLIF-KBPBESRZSA-N 0.000 description 1
- BPDXWKVZNCKUGG-BZSNNMDCSA-N Lys-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCCN)N BPDXWKVZNCKUGG-BZSNNMDCSA-N 0.000 description 1
- LNMKRJJLEFASGA-BZSNNMDCSA-N Lys-Phe-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LNMKRJJLEFASGA-BZSNNMDCSA-N 0.000 description 1
- AEIIJFBQVGYVEV-YESZJQIVSA-N Lys-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CCCCN)N)C(=O)O AEIIJFBQVGYVEV-YESZJQIVSA-N 0.000 description 1
- WLXGMVVHTIUPHE-ULQDDVLXSA-N Lys-Phe-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O WLXGMVVHTIUPHE-ULQDDVLXSA-N 0.000 description 1
- YTJFXEDRUOQGSP-DCAQKATOSA-N Lys-Pro-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YTJFXEDRUOQGSP-DCAQKATOSA-N 0.000 description 1
- GHKXHCMRAUYLBS-CIUDSAMLSA-N Lys-Ser-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O GHKXHCMRAUYLBS-CIUDSAMLSA-N 0.000 description 1
- SQXZLVXQXWILKW-KKUMJFAQSA-N Lys-Ser-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQXZLVXQXWILKW-KKUMJFAQSA-N 0.000 description 1
- WZVSHTFTCYOFPL-GARJFASQSA-N Lys-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N)C(=O)O WZVSHTFTCYOFPL-GARJFASQSA-N 0.000 description 1
- MEQLGHAMAUPOSJ-DCAQKATOSA-N Lys-Ser-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O MEQLGHAMAUPOSJ-DCAQKATOSA-N 0.000 description 1
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 1
- CUHGAUZONORRIC-HJGDQZAQSA-N Lys-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O CUHGAUZONORRIC-HJGDQZAQSA-N 0.000 description 1
- SUZVLFWOCKHWET-CQDKDKBSSA-N Lys-Tyr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O SUZVLFWOCKHWET-CQDKDKBSSA-N 0.000 description 1
- PELXPRPDQRFBGQ-KKUMJFAQSA-N Lys-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O PELXPRPDQRFBGQ-KKUMJFAQSA-N 0.000 description 1
- LMMBAXJRYSXCOQ-ACRUOGEOSA-N Lys-Tyr-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O LMMBAXJRYSXCOQ-ACRUOGEOSA-N 0.000 description 1
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 1
- FPQMQEOVSKMVMA-ACRUOGEOSA-N Lys-Tyr-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)NC(=O)[C@H](CCCCN)N)O FPQMQEOVSKMVMA-ACRUOGEOSA-N 0.000 description 1
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 1
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 1
- 101710117393 Membrane-associated lipoprotein Proteins 0.000 description 1
- GAELMDJMQDUDLJ-BQBZGAKWSA-N Met-Ala-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O GAELMDJMQDUDLJ-BQBZGAKWSA-N 0.000 description 1
- FRWZTWWOORIIBA-FXQIFTODSA-N Met-Asn-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N FRWZTWWOORIIBA-FXQIFTODSA-N 0.000 description 1
- ADHNYKZHPOEULM-BQBZGAKWSA-N Met-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O ADHNYKZHPOEULM-BQBZGAKWSA-N 0.000 description 1
- VZBXCMCHIHEPBL-SRVKXCTJSA-N Met-Glu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN VZBXCMCHIHEPBL-SRVKXCTJSA-N 0.000 description 1
- HLQWFLJOJRFXHO-CIUDSAMLSA-N Met-Glu-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O HLQWFLJOJRFXHO-CIUDSAMLSA-N 0.000 description 1
- FYRUJIJAUPHUNB-IUCAKERBSA-N Met-Gly-Arg Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N FYRUJIJAUPHUNB-IUCAKERBSA-N 0.000 description 1
- YCUSPBPZVJDMII-YUMQZZPRSA-N Met-Gly-Glu Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O YCUSPBPZVJDMII-YUMQZZPRSA-N 0.000 description 1
- SXWQMBGNFXAGAT-FJXKBIBVSA-N Met-Gly-Thr Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SXWQMBGNFXAGAT-FJXKBIBVSA-N 0.000 description 1
- BMHIFARYXOJDLD-WPRPVWTQSA-N Met-Gly-Val Chemical compound [H]N[C@@H](CCSC)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O BMHIFARYXOJDLD-WPRPVWTQSA-N 0.000 description 1
- DYTWOWJWJCBFLE-IHRRRGAJSA-N Met-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)CC1=CNC=N1 DYTWOWJWJCBFLE-IHRRRGAJSA-N 0.000 description 1
- RBGLBUDVQVPTEG-DCAQKATOSA-N Met-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCSC)N RBGLBUDVQVPTEG-DCAQKATOSA-N 0.000 description 1
- HZVXPUHLTZRQEL-UWVGGRQHSA-N Met-Leu-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O HZVXPUHLTZRQEL-UWVGGRQHSA-N 0.000 description 1
- CHDYFPCQVUOJEB-ULQDDVLXSA-N Met-Leu-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 CHDYFPCQVUOJEB-ULQDDVLXSA-N 0.000 description 1
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 1
- YLBUMXYVQCHBPR-ULQDDVLXSA-N Met-Leu-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YLBUMXYVQCHBPR-ULQDDVLXSA-N 0.000 description 1
- YYEIFXZOBZVDPH-DCAQKATOSA-N Met-Lys-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O YYEIFXZOBZVDPH-DCAQKATOSA-N 0.000 description 1
- RDLSEGZJMYGFNS-FXQIFTODSA-N Met-Ser-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RDLSEGZJMYGFNS-FXQIFTODSA-N 0.000 description 1
- LHXFNWBNRBWMNV-DCAQKATOSA-N Met-Ser-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LHXFNWBNRBWMNV-DCAQKATOSA-N 0.000 description 1
- UXJHNUBJSQQIOC-SZMVWBNQSA-N Met-Trp-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(O)=O UXJHNUBJSQQIOC-SZMVWBNQSA-N 0.000 description 1
- TWEWRDAAIYBJTO-ULQDDVLXSA-N Met-Tyr-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N TWEWRDAAIYBJTO-ULQDDVLXSA-N 0.000 description 1
- FZDOBWIKRQORAC-ULQDDVLXSA-N Met-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCSC)N FZDOBWIKRQORAC-ULQDDVLXSA-N 0.000 description 1
- PVSPJQWHEIQTEH-JYJNAYRXSA-N Met-Val-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PVSPJQWHEIQTEH-JYJNAYRXSA-N 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 101100476480 Mus musculus S100a8 gene Proteins 0.000 description 1
- 229940121948 Muscarinic receptor antagonist Drugs 0.000 description 1
- 241000282339 Mustela Species 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- 101100205189 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) leu-5 gene Proteins 0.000 description 1
- VEQPNABPJHWNSG-UHFFFAOYSA-N Nickel(2+) Chemical compound [Ni+2] VEQPNABPJHWNSG-UHFFFAOYSA-N 0.000 description 1
- 238000000636 Northern blotting Methods 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- 241000283977 Oryctolagus Species 0.000 description 1
- 108700006640 OspA Proteins 0.000 description 1
- 101710116435 Outer membrane protein Proteins 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 229930182555 Penicillin Natural products 0.000 description 1
- MIDZLCFIAINOQN-WPRPVWTQSA-N Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 MIDZLCFIAINOQN-WPRPVWTQSA-N 0.000 description 1
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 1
- LSXGADJXBDFXQU-DLOVCJGASA-N Phe-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 LSXGADJXBDFXQU-DLOVCJGASA-N 0.000 description 1
- AYPMIIKUMNADSU-IHRRRGAJSA-N Phe-Arg-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O AYPMIIKUMNADSU-IHRRRGAJSA-N 0.000 description 1
- BXNGIHFNNNSEOS-UWVGGRQHSA-N Phe-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 BXNGIHFNNNSEOS-UWVGGRQHSA-N 0.000 description 1
- LXVFHIBXOWJTKZ-BZSNNMDCSA-N Phe-Asn-Tyr Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O LXVFHIBXOWJTKZ-BZSNNMDCSA-N 0.000 description 1
- JIYJYFIXQTYDNF-YDHLFZDLSA-N Phe-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N JIYJYFIXQTYDNF-YDHLFZDLSA-N 0.000 description 1
- JXWLMUIXUXLIJR-QWRGUYRKSA-N Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JXWLMUIXUXLIJR-QWRGUYRKSA-N 0.000 description 1
- HOYQLNNGMHXZDW-KKUMJFAQSA-N Phe-Glu-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O HOYQLNNGMHXZDW-KKUMJFAQSA-N 0.000 description 1
- UAMFZRNCIFFMLE-FHWLQOOXSA-N Phe-Glu-Tyr Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N UAMFZRNCIFFMLE-FHWLQOOXSA-N 0.000 description 1
- NAXPHWZXEXNDIW-JTQLQIEISA-N Phe-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 NAXPHWZXEXNDIW-JTQLQIEISA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- QPVFUAUFEBPIPT-CDMKHQONSA-N Phe-Gly-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O QPVFUAUFEBPIPT-CDMKHQONSA-N 0.000 description 1
- WFHRXJOZEXUKLV-IRXDYDNUSA-N Phe-Gly-Tyr Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 WFHRXJOZEXUKLV-IRXDYDNUSA-N 0.000 description 1
- OHUXOEXBXPZKPT-STQMWFEESA-N Phe-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=CC=C1 OHUXOEXBXPZKPT-STQMWFEESA-N 0.000 description 1
- KXUZHWXENMYOHC-QEJZJMRPSA-N Phe-Leu-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O KXUZHWXENMYOHC-QEJZJMRPSA-N 0.000 description 1
- METZZBCMDXHFMK-BZSNNMDCSA-N Phe-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N METZZBCMDXHFMK-BZSNNMDCSA-N 0.000 description 1
- OSBADCBXAMSPQD-YESZJQIVSA-N Phe-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N OSBADCBXAMSPQD-YESZJQIVSA-N 0.000 description 1
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 1
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 1
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 1
- GPSMLZQVIIYLDK-ULQDDVLXSA-N Phe-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O GPSMLZQVIIYLDK-ULQDDVLXSA-N 0.000 description 1
- PYOHODCEOHCZBM-RYUDHWBXSA-N Phe-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 PYOHODCEOHCZBM-RYUDHWBXSA-N 0.000 description 1
- ACJULKNZOCRWEI-ULQDDVLXSA-N Phe-Met-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O ACJULKNZOCRWEI-ULQDDVLXSA-N 0.000 description 1
- FQUUYTNBMIBOHS-IHRRRGAJSA-N Phe-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N FQUUYTNBMIBOHS-IHRRRGAJSA-N 0.000 description 1
- OWSLLRKCHLTUND-BZSNNMDCSA-N Phe-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OWSLLRKCHLTUND-BZSNNMDCSA-N 0.000 description 1
- DEZCWWXTRAKZKJ-UFYCRDLUSA-N Phe-Phe-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O DEZCWWXTRAKZKJ-UFYCRDLUSA-N 0.000 description 1
- MGLBSROLWAWCKN-FCLVOEFKSA-N Phe-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MGLBSROLWAWCKN-FCLVOEFKSA-N 0.000 description 1
- RVEVENLSADZUMS-IHRRRGAJSA-N Phe-Pro-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RVEVENLSADZUMS-IHRRRGAJSA-N 0.000 description 1
- UNBFGVQVQGXXCK-KKUMJFAQSA-N Phe-Ser-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O UNBFGVQVQGXXCK-KKUMJFAQSA-N 0.000 description 1
- GKRCCTYAGQPMMP-IHRRRGAJSA-N Phe-Ser-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O GKRCCTYAGQPMMP-IHRRRGAJSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 1
- JHSRGEODDALISP-XVSYOHENSA-N Phe-Thr-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O JHSRGEODDALISP-XVSYOHENSA-N 0.000 description 1
- AOKZOUGUMLBPSS-PMVMPFDFSA-N Phe-Trp-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O AOKZOUGUMLBPSS-PMVMPFDFSA-N 0.000 description 1
- BAONJAHBAUDJKA-BZSNNMDCSA-N Phe-Tyr-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 BAONJAHBAUDJKA-BZSNNMDCSA-N 0.000 description 1
- NHHZWPNMYQUNEH-ACRUOGEOSA-N Phe-Tyr-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N NHHZWPNMYQUNEH-ACRUOGEOSA-N 0.000 description 1
- FXEKNHAJIMHRFJ-ULQDDVLXSA-N Phe-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N FXEKNHAJIMHRFJ-ULQDDVLXSA-N 0.000 description 1
- GNZCMRRSXOBHLC-JYJNAYRXSA-N Phe-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N GNZCMRRSXOBHLC-JYJNAYRXSA-N 0.000 description 1
- 102000013566 Plasminogen Human genes 0.000 description 1
- 108010051456 Plasminogen Proteins 0.000 description 1
- 239000004952 Polyamide Substances 0.000 description 1
- 229920001213 Polysorbate 20 Polymers 0.000 description 1
- FYQSMXKJYTZYRP-DCAQKATOSA-N Pro-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 FYQSMXKJYTZYRP-DCAQKATOSA-N 0.000 description 1
- CYQQWUPHIZVCNY-GUBZILKMSA-N Pro-Arg-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CYQQWUPHIZVCNY-GUBZILKMSA-N 0.000 description 1
- UVKNEILZSJMKSR-FXQIFTODSA-N Pro-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 UVKNEILZSJMKSR-FXQIFTODSA-N 0.000 description 1
- FUVBEZJCRMHWEM-FXQIFTODSA-N Pro-Asn-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FUVBEZJCRMHWEM-FXQIFTODSA-N 0.000 description 1
- KPDRZQUWJKTMBP-DCAQKATOSA-N Pro-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 KPDRZQUWJKTMBP-DCAQKATOSA-N 0.000 description 1
- HXOLCSYHGRNXJJ-IHRRRGAJSA-N Pro-Asp-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HXOLCSYHGRNXJJ-IHRRRGAJSA-N 0.000 description 1
- HAAQQNHQZBOWFO-LURJTMIESA-N Pro-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1 HAAQQNHQZBOWFO-LURJTMIESA-N 0.000 description 1
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 1
- AJCRQOHDLCBHFA-SRVKXCTJSA-N Pro-His-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AJCRQOHDLCBHFA-SRVKXCTJSA-N 0.000 description 1
- XYSXOCIWCPFOCG-IHRRRGAJSA-N Pro-Leu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XYSXOCIWCPFOCG-IHRRRGAJSA-N 0.000 description 1
- HATVCTYBNCNMAA-AVGNSLFASA-N Pro-Leu-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O HATVCTYBNCNMAA-AVGNSLFASA-N 0.000 description 1
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 1
- JUJCUYWRJMFJJF-AVGNSLFASA-N Pro-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 JUJCUYWRJMFJJF-AVGNSLFASA-N 0.000 description 1
- ZLXKLMHAMDENIO-DCAQKATOSA-N Pro-Lys-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLXKLMHAMDENIO-DCAQKATOSA-N 0.000 description 1
- ABSSTGUCBCDKMU-UWVGGRQHSA-N Pro-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 ABSSTGUCBCDKMU-UWVGGRQHSA-N 0.000 description 1
- JFBJPBZSTMXGKL-JYJNAYRXSA-N Pro-Met-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JFBJPBZSTMXGKL-JYJNAYRXSA-N 0.000 description 1
- HOTVCUAVDQHUDB-UFYCRDLUSA-N Pro-Phe-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 HOTVCUAVDQHUDB-UFYCRDLUSA-N 0.000 description 1
- GFHOSBYCLACKEK-GUBZILKMSA-N Pro-Pro-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O GFHOSBYCLACKEK-GUBZILKMSA-N 0.000 description 1
- SVXXJYJCRNKDDE-AVGNSLFASA-N Pro-Pro-His Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H]1NCCC1)C1=CN=CN1 SVXXJYJCRNKDDE-AVGNSLFASA-N 0.000 description 1
- WVXQQUWOKUZIEG-VEVYYDQMSA-N Pro-Thr-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O WVXQQUWOKUZIEG-VEVYYDQMSA-N 0.000 description 1
- MDAWMJUZHBQTBO-XGEHTFHBSA-N Pro-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@@H]1CCCN1)O MDAWMJUZHBQTBO-XGEHTFHBSA-N 0.000 description 1
- UEKYKRQIAQHOOZ-KBPBESRZSA-N Pro-Trp Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)[O-])C(=O)[C@@H]1CCC[NH2+]1 UEKYKRQIAQHOOZ-KBPBESRZSA-N 0.000 description 1
- VDHGTOHMHHQSKG-JYJNAYRXSA-N Pro-Val-Phe Chemical compound CC(C)[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O VDHGTOHMHHQSKG-JYJNAYRXSA-N 0.000 description 1
- YDTUEBLEAVANFH-RCWTZXSCSA-N Pro-Val-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 YDTUEBLEAVANFH-RCWTZXSCSA-N 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- KDCGOANMDULRCW-UHFFFAOYSA-N Purine Natural products N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 1
- 108010003201 RGH 0205 Proteins 0.000 description 1
- 108010091086 Recombinases Proteins 0.000 description 1
- 241000606651 Rickettsiales Species 0.000 description 1
- 241000714474 Rous sarcoma virus Species 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 1
- 101100007329 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) COS1 gene Proteins 0.000 description 1
- 241001222774 Salmonella enterica subsp. enterica serovar Minnesota Species 0.000 description 1
- 208000006268 Sarcoma 180 Diseases 0.000 description 1
- 241000710961 Semliki Forest virus Species 0.000 description 1
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 1
- MWMKFWJYRRGXOR-ZLUOBGJFSA-N Ser-Ala-Asn Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC(N)=O)C)CO MWMKFWJYRRGXOR-ZLUOBGJFSA-N 0.000 description 1
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 1
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 1
- VQBLHWSPVYYZTB-DCAQKATOSA-N Ser-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N VQBLHWSPVYYZTB-DCAQKATOSA-N 0.000 description 1
- QFBNNYNWKYKVJO-DCAQKATOSA-N Ser-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N QFBNNYNWKYKVJO-DCAQKATOSA-N 0.000 description 1
- HZWAHWQZPSXNCB-BPUTZDHNSA-N Ser-Arg-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O HZWAHWQZPSXNCB-BPUTZDHNSA-N 0.000 description 1
- UBRXAVQWXOWRSJ-ZLUOBGJFSA-N Ser-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)C(=O)N UBRXAVQWXOWRSJ-ZLUOBGJFSA-N 0.000 description 1
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 1
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 1
- FFOKMZOAVHEWET-IMJSIDKUSA-N Ser-Cys Chemical compound OC[C@H](N)C(=O)N[C@@H](CS)C(O)=O FFOKMZOAVHEWET-IMJSIDKUSA-N 0.000 description 1
- ZHYMUFQVKGJNRM-ZLUOBGJFSA-N Ser-Cys-Asn Chemical compound OC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC(N)=O ZHYMUFQVKGJNRM-ZLUOBGJFSA-N 0.000 description 1
- SWIQQMYVHIXPEK-FXQIFTODSA-N Ser-Cys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O SWIQQMYVHIXPEK-FXQIFTODSA-N 0.000 description 1
- SQBLRDDJTUJDMV-ACZMJKKPSA-N Ser-Glu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQBLRDDJTUJDMV-ACZMJKKPSA-N 0.000 description 1
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 1
- WOUIMBGNEUWXQG-VKHMYHEASA-N Ser-Gly Chemical compound OC[C@H](N)C(=O)NCC(O)=O WOUIMBGNEUWXQG-VKHMYHEASA-N 0.000 description 1
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 1
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 1
- YZMPDHTZJJCGEI-BQBZGAKWSA-N Ser-His Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 YZMPDHTZJJCGEI-BQBZGAKWSA-N 0.000 description 1
- NFDYGNFETJVMSE-BQBZGAKWSA-N Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CO NFDYGNFETJVMSE-BQBZGAKWSA-N 0.000 description 1
- QYSFWUIXDFJUDW-DCAQKATOSA-N Ser-Leu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYSFWUIXDFJUDW-DCAQKATOSA-N 0.000 description 1
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 1
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 1
- GVIGVIOEYBOTCB-XIRDDKMYSA-N Ser-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC(C)C)C(O)=O)=CNC2=C1 GVIGVIOEYBOTCB-XIRDDKMYSA-N 0.000 description 1
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 1
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 1
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 1
- VIIJCAQMJBHSJH-FXQIFTODSA-N Ser-Met-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O VIIJCAQMJBHSJH-FXQIFTODSA-N 0.000 description 1
- JJUNLJTUIKFPRF-BPUTZDHNSA-N Ser-Met-Trp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CO)N JJUNLJTUIKFPRF-BPUTZDHNSA-N 0.000 description 1
- BUYHXYIUQUBEQP-AVGNSLFASA-N Ser-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N BUYHXYIUQUBEQP-AVGNSLFASA-N 0.000 description 1
- XKFJENWJGHMDLI-QWRGUYRKSA-N Ser-Phe-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O XKFJENWJGHMDLI-QWRGUYRKSA-N 0.000 description 1
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 1
- MHVXPTAMDHLTHB-IHPCNDPISA-N Ser-Phe-Trp Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 MHVXPTAMDHLTHB-IHPCNDPISA-N 0.000 description 1
- NUEHQDHDLDXCRU-GUBZILKMSA-N Ser-Pro-Arg Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NUEHQDHDLDXCRU-GUBZILKMSA-N 0.000 description 1
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 1
- QUGRFWPMPVIAPW-IHRRRGAJSA-N Ser-Pro-Phe Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QUGRFWPMPVIAPW-IHRRRGAJSA-N 0.000 description 1
- VFWQQZMRKFOGLE-ZLUOBGJFSA-N Ser-Ser-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)O VFWQQZMRKFOGLE-ZLUOBGJFSA-N 0.000 description 1
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 1
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 1
- LZLREEUGSYITMX-JQWIXIFHSA-N Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)N)C(O)=O)=CNC2=C1 LZLREEUGSYITMX-JQWIXIFHSA-N 0.000 description 1
- AXKJPUBALUNJEO-UBHSHLNASA-N Ser-Trp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O AXKJPUBALUNJEO-UBHSHLNASA-N 0.000 description 1
- XPVIVVLLLOFBRH-XIRDDKMYSA-N Ser-Trp-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@@H](N)CO)C(O)=O XPVIVVLLLOFBRH-XIRDDKMYSA-N 0.000 description 1
- KIEIJCFVGZCUAS-MELADBBJSA-N Ser-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N)C(=O)O KIEIJCFVGZCUAS-MELADBBJSA-N 0.000 description 1
- OSFZCEQJLWCIBG-BZSNNMDCSA-N Ser-Tyr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OSFZCEQJLWCIBG-BZSNNMDCSA-N 0.000 description 1
- LGIMRDKGABDMBN-DCAQKATOSA-N Ser-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N LGIMRDKGABDMBN-DCAQKATOSA-N 0.000 description 1
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 1
- 108010034546 Serratia marcescens nuclease Proteins 0.000 description 1
- 101710084578 Short neurotoxin 1 Proteins 0.000 description 1
- UIIMBOGNXHQVGW-DEQYMQKBSA-M Sodium bicarbonate-14C Chemical compound [Na+].O[14C]([O-])=O UIIMBOGNXHQVGW-DEQYMQKBSA-M 0.000 description 1
- 208000007107 Stomach Ulcer Diseases 0.000 description 1
- 241000194026 Streptococcus gordonii Species 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 101000865057 Thermococcus litoralis DNA polymerase Proteins 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical class OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- CAGTXGDOIFXLPC-KZVJFYERSA-N Thr-Arg-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N CAGTXGDOIFXLPC-KZVJFYERSA-N 0.000 description 1
- UNURFMVMXLENAZ-KJEVXHAQSA-N Thr-Arg-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UNURFMVMXLENAZ-KJEVXHAQSA-N 0.000 description 1
- JTEICXDKGWKRRV-HJGDQZAQSA-N Thr-Asn-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O JTEICXDKGWKRRV-HJGDQZAQSA-N 0.000 description 1
- LXWZOMSOUAMOIA-JIOCBJNQSA-N Thr-Asn-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N)O LXWZOMSOUAMOIA-JIOCBJNQSA-N 0.000 description 1
- YOSLMIPKOUAHKI-OLHMAJIHSA-N Thr-Asp-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O YOSLMIPKOUAHKI-OLHMAJIHSA-N 0.000 description 1
- NLSNVZAREYQMGR-HJGDQZAQSA-N Thr-Asp-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NLSNVZAREYQMGR-HJGDQZAQSA-N 0.000 description 1
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 1
- KWQBJOUOSNJDRR-XAVMHZPKSA-N Thr-Cys-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N)O KWQBJOUOSNJDRR-XAVMHZPKSA-N 0.000 description 1
- BECPPKYKPSRKCP-ZDLURKLDSA-N Thr-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O BECPPKYKPSRKCP-ZDLURKLDSA-N 0.000 description 1
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 1
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 1
- KCRQEJSKXAIULJ-FJXKBIBVSA-N Thr-Gly-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O KCRQEJSKXAIULJ-FJXKBIBVSA-N 0.000 description 1
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- ZTPXSEUVYNNZRB-CDMKHQONSA-N Thr-Gly-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZTPXSEUVYNNZRB-CDMKHQONSA-N 0.000 description 1
- SIMKLINEDYOTKL-MBLNEYKQSA-N Thr-His-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C)C(=O)O)N)O SIMKLINEDYOTKL-MBLNEYKQSA-N 0.000 description 1
- FDALPRWYVKJCLL-PMVVWTBXSA-N Thr-His-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O FDALPRWYVKJCLL-PMVVWTBXSA-N 0.000 description 1
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 1
- KRDSCBLRHORMRK-JXUBOQSCSA-N Thr-Lys-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O KRDSCBLRHORMRK-JXUBOQSCSA-N 0.000 description 1
- SCSVNSNWUTYSFO-WDCWCFNPSA-N Thr-Lys-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O SCSVNSNWUTYSFO-WDCWCFNPSA-N 0.000 description 1
- JLNMFGCJODTXDH-WEDXCCLWSA-N Thr-Lys-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O JLNMFGCJODTXDH-WEDXCCLWSA-N 0.000 description 1
- MGJLBZFUXUGMML-VOAKCMCISA-N Thr-Lys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MGJLBZFUXUGMML-VOAKCMCISA-N 0.000 description 1
- DXPURPNJDFCKKO-RHYQMDGZSA-N Thr-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DXPURPNJDFCKKO-RHYQMDGZSA-N 0.000 description 1
- WRUWXBBEFUTJOU-XGEHTFHBSA-N Thr-Met-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N)O WRUWXBBEFUTJOU-XGEHTFHBSA-N 0.000 description 1
- KZURUCDWKDEAFZ-XVSYOHENSA-N Thr-Phe-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O KZURUCDWKDEAFZ-XVSYOHENSA-N 0.000 description 1
- GYUUYCIXELGTJS-MEYUZBJRSA-N Thr-Phe-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O GYUUYCIXELGTJS-MEYUZBJRSA-N 0.000 description 1
- VEIKMWOMUYMMMK-FCLVOEFKSA-N Thr-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 VEIKMWOMUYMMMK-FCLVOEFKSA-N 0.000 description 1
- BDENGIGFTNYZSJ-RCWTZXSCSA-N Thr-Pro-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(O)=O BDENGIGFTNYZSJ-RCWTZXSCSA-N 0.000 description 1
- MROIJTGJGIDEEJ-RCWTZXSCSA-N Thr-Pro-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 MROIJTGJGIDEEJ-RCWTZXSCSA-N 0.000 description 1
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 1
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 1
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 1
- CJEHCEOXPLASCK-MEYUZBJRSA-N Thr-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=C(O)C=C1 CJEHCEOXPLASCK-MEYUZBJRSA-N 0.000 description 1
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 1
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- ATJFFYVFTNAWJD-UHFFFAOYSA-N Tin Chemical compound [Sn] ATJFFYVFTNAWJD-UHFFFAOYSA-N 0.000 description 1
- 101710182532 Toxin a Proteins 0.000 description 1
- 102000008579 Transposases Human genes 0.000 description 1
- 108010020764 Transposases Proteins 0.000 description 1
- HYVLNORXQGKONN-NUTKFTJISA-N Trp-Ala-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 HYVLNORXQGKONN-NUTKFTJISA-N 0.000 description 1
- MHNHRNHJMXAVHZ-AAEUAGOBSA-N Trp-Asn-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N MHNHRNHJMXAVHZ-AAEUAGOBSA-N 0.000 description 1
- NKUIXQOJUAEIET-AQZXSJQPSA-N Trp-Asp-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@H](O)C)C(O)=O)=CNC2=C1 NKUIXQOJUAEIET-AQZXSJQPSA-N 0.000 description 1
- PKUJMYZNJMRHEZ-XIRDDKMYSA-N Trp-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PKUJMYZNJMRHEZ-XIRDDKMYSA-N 0.000 description 1
- UYKREHOKELZSPB-JTQLQIEISA-N Trp-Gly Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(O)=O)=CNC2=C1 UYKREHOKELZSPB-JTQLQIEISA-N 0.000 description 1
- BEWOXKJJMBKRQL-AAEUAGOBSA-N Trp-Gly-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N BEWOXKJJMBKRQL-AAEUAGOBSA-N 0.000 description 1
- LYMVXFSTACVOLP-ZFWWWQNUSA-N Trp-Leu Chemical compound C1=CC=C2C(C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 LYMVXFSTACVOLP-ZFWWWQNUSA-N 0.000 description 1
- WXEQUSQNDDJEDZ-NYVOZVTQSA-N Trp-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WXEQUSQNDDJEDZ-NYVOZVTQSA-N 0.000 description 1
- XQMGDVVKFRLQKH-BBRMVZONSA-N Trp-Val-Gly Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O)=CNC2=C1 XQMGDVVKFRLQKH-BBRMVZONSA-N 0.000 description 1
- IEESWNWYUOETOT-BVSLBCMMSA-N Trp-Val-Phe Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(=O)N[C@@H](Cc1ccccc1)C(O)=O IEESWNWYUOETOT-BVSLBCMMSA-N 0.000 description 1
- BABINGWMZBWXIX-BPUTZDHNSA-N Trp-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N BABINGWMZBWXIX-BPUTZDHNSA-N 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- IELISNUVHBKYBX-XDTLVQLUSA-N Tyr-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IELISNUVHBKYBX-XDTLVQLUSA-N 0.000 description 1
- XGEUYEOEZYFHRL-KKXDTOCCSA-N Tyr-Ala-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 XGEUYEOEZYFHRL-KKXDTOCCSA-N 0.000 description 1
- NOXKHHXSHQFSGJ-FQPOAREZSA-N Tyr-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NOXKHHXSHQFSGJ-FQPOAREZSA-N 0.000 description 1
- PZXUIGWOEWWFQM-SRVKXCTJSA-N Tyr-Asn-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O PZXUIGWOEWWFQM-SRVKXCTJSA-N 0.000 description 1
- DANHCMVVXDXOHN-SRVKXCTJSA-N Tyr-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DANHCMVVXDXOHN-SRVKXCTJSA-N 0.000 description 1
- VFJIWSJKZJTQII-SRVKXCTJSA-N Tyr-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O VFJIWSJKZJTQII-SRVKXCTJSA-N 0.000 description 1
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 1
- FMOSEWZYZPMJAL-KKUMJFAQSA-N Tyr-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N FMOSEWZYZPMJAL-KKUMJFAQSA-N 0.000 description 1
- ZRPLVTZTKPPSBT-AVGNSLFASA-N Tyr-Glu-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZRPLVTZTKPPSBT-AVGNSLFASA-N 0.000 description 1
- CNLKDWSAORJEMW-KWQFWETISA-N Tyr-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O CNLKDWSAORJEMW-KWQFWETISA-N 0.000 description 1
- HIINQLBHPIQYHN-JTQLQIEISA-N Tyr-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 1
- KCPFDGNYAMKZQP-KBPBESRZSA-N Tyr-Gly-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O KCPFDGNYAMKZQP-KBPBESRZSA-N 0.000 description 1
- JKUZFODWJGEQAP-KBPBESRZSA-N Tyr-Gly-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O JKUZFODWJGEQAP-KBPBESRZSA-N 0.000 description 1
- NOOMDULIORCDNF-IRXDYDNUSA-N Tyr-Gly-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NOOMDULIORCDNF-IRXDYDNUSA-N 0.000 description 1
- NMKJPMCEKQHRPD-IRXDYDNUSA-N Tyr-Gly-Tyr Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 NMKJPMCEKQHRPD-IRXDYDNUSA-N 0.000 description 1
- CTDPLKMBVALCGN-JSGCOSHPSA-N Tyr-Gly-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O CTDPLKMBVALCGN-JSGCOSHPSA-N 0.000 description 1
- FIRUOPRJKCBLST-KKUMJFAQSA-N Tyr-His-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O FIRUOPRJKCBLST-KKUMJFAQSA-N 0.000 description 1
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 1
- JHORGUYURUBVOM-KKUMJFAQSA-N Tyr-His-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O JHORGUYURUBVOM-KKUMJFAQSA-N 0.000 description 1
- AUEJLPRZGVVDNU-STQMWFEESA-N Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-STQMWFEESA-N 0.000 description 1
- MVFQLSPDMMFCMW-KKUMJFAQSA-N Tyr-Leu-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O MVFQLSPDMMFCMW-KKUMJFAQSA-N 0.000 description 1
- DAOREBHZAKCOEN-ULQDDVLXSA-N Tyr-Leu-Met Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O DAOREBHZAKCOEN-ULQDDVLXSA-N 0.000 description 1
- CNNVVEPJTFOGHI-ACRUOGEOSA-N Tyr-Lys-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNNVVEPJTFOGHI-ACRUOGEOSA-N 0.000 description 1
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 1
- OGPKMBOPMDTEDM-IHRRRGAJSA-N Tyr-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N OGPKMBOPMDTEDM-IHRRRGAJSA-N 0.000 description 1
- PYJKETPLFITNKS-IHRRRGAJSA-N Tyr-Pro-Asn Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O PYJKETPLFITNKS-IHRRRGAJSA-N 0.000 description 1
- SZEIFUXUTBBQFQ-STQMWFEESA-N Tyr-Pro-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SZEIFUXUTBBQFQ-STQMWFEESA-N 0.000 description 1
- YYLHVUCSTXXKBS-IHRRRGAJSA-N Tyr-Pro-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YYLHVUCSTXXKBS-IHRRRGAJSA-N 0.000 description 1
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 1
- NHOVZGFNTGMYMI-KKUMJFAQSA-N Tyr-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NHOVZGFNTGMYMI-KKUMJFAQSA-N 0.000 description 1
- HRHYJNLMIJWGLF-BZSNNMDCSA-N Tyr-Ser-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 HRHYJNLMIJWGLF-BZSNNMDCSA-N 0.000 description 1
- LUMQYLVYUIRHHU-YJRXYDGGSA-N Tyr-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LUMQYLVYUIRHHU-YJRXYDGGSA-N 0.000 description 1
- BIVIUZRBCAUNPW-JRQIVUDYSA-N Tyr-Thr-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O BIVIUZRBCAUNPW-JRQIVUDYSA-N 0.000 description 1
- GAKBTSMAPGLQFA-JNPHEJMOSA-N Tyr-Thr-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 GAKBTSMAPGLQFA-JNPHEJMOSA-N 0.000 description 1
- BXJQKVDPRMLGKN-PMVMPFDFSA-N Tyr-Trp-Leu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(O)=O)C1=CC=C(O)C=C1 BXJQKVDPRMLGKN-PMVMPFDFSA-N 0.000 description 1
- LVILBTSHPTWDGE-PMVMPFDFSA-N Tyr-Trp-Lys Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCCCN)C(O)=O)C1=CC=C(O)C=C1 LVILBTSHPTWDGE-PMVMPFDFSA-N 0.000 description 1
- UUJHRSTVQCFDPA-UFYCRDLUSA-N Tyr-Tyr-Val Chemical compound C([C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 UUJHRSTVQCFDPA-UFYCRDLUSA-N 0.000 description 1
- 206010046865 Vaccinia virus infection Diseases 0.000 description 1
- 101800000970 Vacuolating cytotoxin Proteins 0.000 description 1
- YFOCMOVJBQDBCE-NRPADANISA-N Val-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N YFOCMOVJBQDBCE-NRPADANISA-N 0.000 description 1
- AZSHAZJLOZQYAY-FXQIFTODSA-N Val-Ala-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O AZSHAZJLOZQYAY-FXQIFTODSA-N 0.000 description 1
- UUYCNAXCCDNULB-QXEWZRGKSA-N Val-Arg-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O UUYCNAXCCDNULB-QXEWZRGKSA-N 0.000 description 1
- CWOSXNKDOACNJN-BZSNNMDCSA-N Val-Arg-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N CWOSXNKDOACNJN-BZSNNMDCSA-N 0.000 description 1
- WKWJJQZZZBBWKV-JYJNAYRXSA-N Val-Arg-Tyr Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WKWJJQZZZBBWKV-JYJNAYRXSA-N 0.000 description 1
- AUMNPAUHKUNHHN-BYULHYEWSA-N Val-Asn-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N AUMNPAUHKUNHHN-BYULHYEWSA-N 0.000 description 1
- OGNMURQZFMHFFD-NHCYSSNCSA-N Val-Asn-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N OGNMURQZFMHFFD-NHCYSSNCSA-N 0.000 description 1
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 1
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 1
- CJDZKZFMAXGUOJ-IHRRRGAJSA-N Val-Cys-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N CJDZKZFMAXGUOJ-IHRRRGAJSA-N 0.000 description 1
- UPJONISHZRADBH-XPUUQOCRSA-N Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UPJONISHZRADBH-XPUUQOCRSA-N 0.000 description 1
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 1
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 1
- YDPFWRVQHFWBKI-GVXVVHGQSA-N Val-Glu-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N YDPFWRVQHFWBKI-GVXVVHGQSA-N 0.000 description 1
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 1
- FEFZWCSXEMVSPO-LSJOCFKGSA-N Val-His-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](C)C(O)=O FEFZWCSXEMVSPO-LSJOCFKGSA-N 0.000 description 1
- ZIGZPYJXIWLQFC-QTKMDUPCSA-N Val-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C(C)C)N)O ZIGZPYJXIWLQFC-QTKMDUPCSA-N 0.000 description 1
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 1
- ZZGPVSZDZQRJQY-ULQDDVLXSA-N Val-Leu-Phe Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](Cc1ccccc1)C(O)=O ZZGPVSZDZQRJQY-ULQDDVLXSA-N 0.000 description 1
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 1
- GVJUTBOZZBTBIG-AVGNSLFASA-N Val-Lys-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N GVJUTBOZZBTBIG-AVGNSLFASA-N 0.000 description 1
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 1
- XPKCFQZDQGVJCX-RHYQMDGZSA-N Val-Lys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N)O XPKCFQZDQGVJCX-RHYQMDGZSA-N 0.000 description 1
- MJFSRZZJQWZHFQ-SRVKXCTJSA-N Val-Met-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)O)N MJFSRZZJQWZHFQ-SRVKXCTJSA-N 0.000 description 1
- HJSLDXZAZGFPDK-ULQDDVLXSA-N Val-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C(C)C)N HJSLDXZAZGFPDK-ULQDDVLXSA-N 0.000 description 1
- BCBFMJYTNKDALA-UFYCRDLUSA-N Val-Phe-Phe Chemical compound N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O BCBFMJYTNKDALA-UFYCRDLUSA-N 0.000 description 1
- SJRUJQFQVLMZFW-WPRPVWTQSA-N Val-Pro-Gly Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SJRUJQFQVLMZFW-WPRPVWTQSA-N 0.000 description 1
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 1
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 1
- HWNYVQMOLCYHEA-IHRRRGAJSA-N Val-Ser-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N HWNYVQMOLCYHEA-IHRRRGAJSA-N 0.000 description 1
- NZYNRRGJJVSSTJ-GUBZILKMSA-N Val-Ser-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NZYNRRGJJVSSTJ-GUBZILKMSA-N 0.000 description 1
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 1
- JXWGBRRVTRAZQA-ULQDDVLXSA-N Val-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N JXWGBRRVTRAZQA-ULQDDVLXSA-N 0.000 description 1
- IECQJCJNPJVUSB-IHRRRGAJSA-N Val-Tyr-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(O)=O IECQJCJNPJVUSB-IHRRRGAJSA-N 0.000 description 1
- ZNGPROMGGGFOAA-JYJNAYRXSA-N Val-Tyr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 ZNGPROMGGGFOAA-JYJNAYRXSA-N 0.000 description 1
- AEFJNECXZCODJM-UWVGGRQHSA-N Val-Val-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)NCC([O-])=O AEFJNECXZCODJM-UWVGGRQHSA-N 0.000 description 1
- AOILQMZPNLUXCM-AVGNSLFASA-N Val-Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN AOILQMZPNLUXCM-AVGNSLFASA-N 0.000 description 1
- WHNSHJJNWNSTSU-BZSNNMDCSA-N Val-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 WHNSHJJNWNSTSU-BZSNNMDCSA-N 0.000 description 1
- YKZVPMUGEJXEOR-JYJNAYRXSA-N Val-Val-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N YKZVPMUGEJXEOR-JYJNAYRXSA-N 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- PTTGRYBBCYZPSL-UHFFFAOYSA-H [Al+3].[Al+3].OOP([O-])([O-])=O.OOP([O-])([O-])=O.OOP([O-])([O-])=O Chemical compound [Al+3].[Al+3].OOP([O-])([O-])=O.OOP([O-])([O-])=O.OOP([O-])([O-])=O PTTGRYBBCYZPSL-UHFFFAOYSA-H 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 239000013543 active substance Substances 0.000 description 1
- 230000001154 acute effect Effects 0.000 description 1
- 230000000240 adjuvant effect Effects 0.000 description 1
- 238000012387 aerosolization Methods 0.000 description 1
- 238000003450 affinity purification method Methods 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010041407 alanylaspartic acid Proteins 0.000 description 1
- AZDRQVAHHNSJOQ-UHFFFAOYSA-N alumane Chemical class [AlH3] AZDRQVAHHNSJOQ-UHFFFAOYSA-N 0.000 description 1
- WNROFYMDJYEPJX-UHFFFAOYSA-K aluminium hydroxide Chemical compound [OH-].[OH-].[OH-].[Al+3] WNROFYMDJYEPJX-UHFFFAOYSA-K 0.000 description 1
- ILRRQNADMUWWFW-UHFFFAOYSA-K aluminium phosphate Chemical compound O1[Al]2OP1(=O)O2 ILRRQNADMUWWFW-UHFFFAOYSA-K 0.000 description 1
- 229940047712 aluminum hydroxyphosphate Drugs 0.000 description 1
- 229940126575 aminoglycoside Drugs 0.000 description 1
- 229940069428 antacid Drugs 0.000 description 1
- 239000003159 antacid agent Substances 0.000 description 1
- 230000001458 anti-acid effect Effects 0.000 description 1
- PYMYPHUHKUWMLA-WDCZJNDASA-N arabinose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 1
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 1
- 229960003589 arginine hydrochloride Drugs 0.000 description 1
- 108010089442 arginyl-leucyl-alanyl-arginine Proteins 0.000 description 1
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 230000000621 autoagglutination Effects 0.000 description 1
- 239000000688 bacterial toxin Substances 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 108010036170 bicitropeptide Proteins 0.000 description 1
- 239000003833 bile salt Substances 0.000 description 1
- 229940093761 bile salts Drugs 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 239000007975 buffered saline Substances 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 244000309466 calf Species 0.000 description 1
- 229960000530 carbenoxolone Drugs 0.000 description 1
- BVKZGUZCCUSVTD-UHFFFAOYSA-N carbonic acid Chemical class OC(O)=O BVKZGUZCCUSVTD-UHFFFAOYSA-N 0.000 description 1
- 101150055766 cat gene Proteins 0.000 description 1
- 229960001668 cefuroxime Drugs 0.000 description 1
- JFPVXVDWJQMJEE-IZRZKJBUSA-N cefuroxime Chemical compound N([C@@H]1C(N2C(=C(COC(N)=O)CS[C@@H]21)C(O)=O)=O)C(=O)\C(=N/OC)C1=CC=CO1 JFPVXVDWJQMJEE-IZRZKJBUSA-N 0.000 description 1
- 239000013592 cell lysate Substances 0.000 description 1
- 230000004700 cellular uptake Effects 0.000 description 1
- 230000003196 chaotropic effect Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000013043 chemical agent Substances 0.000 description 1
- 150000001841 cholesterols Chemical class 0.000 description 1
- 239000000812 cholinergic antagonist Substances 0.000 description 1
- 208000016644 chronic atrophic gastritis Diseases 0.000 description 1
- 230000001684 chronic effect Effects 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000009260 cross reactivity Effects 0.000 description 1
- ATDGTVJJHBUTRL-UHFFFAOYSA-N cyanogen bromide Chemical compound BrC#N ATDGTVJJHBUTRL-UHFFFAOYSA-N 0.000 description 1
- 108010060199 cysteinylproline Proteins 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 210000005045 desmin Anatomy 0.000 description 1
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 239000001177 diphosphate Substances 0.000 description 1
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 1
- 235000011180 diphosphates Nutrition 0.000 description 1
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 238000009510 drug design Methods 0.000 description 1
- 208000000718 duodenal ulcer Diseases 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000006872 enzymatic polymerization reaction Methods 0.000 description 1
- 210000002919 epithelial cell Anatomy 0.000 description 1
- 239000003797 essential amino acid Substances 0.000 description 1
- 235000020776 essential amino acid Nutrition 0.000 description 1
- 125000004185 ester group Chemical group 0.000 description 1
- 230000029142 excretion Effects 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 230000037433 frameshift Effects 0.000 description 1
- 210000001035 gastrointestinal tract Anatomy 0.000 description 1
- 238000001415 gene therapy Methods 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 1
- 101150089730 gly-10 gene Proteins 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 1
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 1
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 1
- ZWCXYZRRTRDGQE-SORVKSEFSA-N gramicidina Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](CC(C)C)NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)NC(=O)[C@@H](CC(C)C)NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)NC(=O)[C@@H](CC(C)C)NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](C(C)C)NC(=O)[C@H](C)NC(=O)[C@H](NC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](NC=O)C(C)C)CC(C)C)C(=O)NCCO)=CNC2=C1 ZWCXYZRRTRDGQE-SORVKSEFSA-N 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 230000006801 homologous recombination Effects 0.000 description 1
- 230000003100 immobilizing effect Effects 0.000 description 1
- 238000003119 immunoblot Methods 0.000 description 1
- 230000005847 immunogenicity Effects 0.000 description 1
- 238000010324 immunological assay Methods 0.000 description 1
- 230000001771 impaired effect Effects 0.000 description 1
- 210000003000 inclusion body Anatomy 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- 238000002743 insertional mutagenesis Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 238000007852 inverse PCR Methods 0.000 description 1
- 238000005342 ion exchange Methods 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 101150027374 irgA gene Proteins 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 108010071397 lactoferrin receptors Proteins 0.000 description 1
- DVCSNHXRZUVYAM-BQBZGAKWSA-N leu-asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC(O)=O DVCSNHXRZUVYAM-BQBZGAKWSA-N 0.000 description 1
- 108010000761 leucylarginine Proteins 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 239000003120 macrolide antibiotic agent Substances 0.000 description 1
- 229940041033 macrolides Drugs 0.000 description 1
- 229910052943 magnesium sulfate Inorganic materials 0.000 description 1
- 238000003760 magnetic stirring Methods 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000002844 melting Methods 0.000 description 1
- 230000008018 melting Effects 0.000 description 1
- 229910052751 metal Inorganic materials 0.000 description 1
- 239000002184 metal Substances 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 210000004877 mucosa Anatomy 0.000 description 1
- 230000016379 mucosal immune response Effects 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 210000000663 muscle cell Anatomy 0.000 description 1
- 229910052759 nickel Inorganic materials 0.000 description 1
- 229910001453 nickel ion Inorganic materials 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 210000000633 nuclear envelope Anatomy 0.000 description 1
- 239000006174 pH buffer Substances 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 230000008506 pathogenesis Effects 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 150000002960 penicillins Chemical class 0.000 description 1
- 229940101070 pepto-bismol Drugs 0.000 description 1
- 210000001322 periplasm Anatomy 0.000 description 1
- 230000035699 permeability Effects 0.000 description 1
- 239000008024 pharmaceutical diluent Substances 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 108010089198 phenylalanyl-prolyl-arginine Proteins 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 239000002953 phosphate buffered saline Substances 0.000 description 1
- 150000008104 phosphatidylethanolamines Chemical class 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 108010025488 pinealon Proteins 0.000 description 1
- 229960004633 pirenzepine Drugs 0.000 description 1
- RMHMFHUVIITRHF-UHFFFAOYSA-N pirenzepine Chemical compound C1CN(C)CCN1CC(=O)N1C2=NC=CC=C2NC(=O)C2=CC=CC=C21 RMHMFHUVIITRHF-UHFFFAOYSA-N 0.000 description 1
- 229920000747 poly(lactic acid) Polymers 0.000 description 1
- 229920001606 poly(lactic acid-co-glycolic acid) Polymers 0.000 description 1
- 229920002627 poly(phosphazenes) Polymers 0.000 description 1
- 229920002647 polyamide Polymers 0.000 description 1
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 1
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 1
- 239000011148 porous material Substances 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000001376 precipitating effect Effects 0.000 description 1
- 229960003857 proglumide Drugs 0.000 description 1
- 108010004914 prolylarginine Proteins 0.000 description 1
- 108010029020 prolylglycine Proteins 0.000 description 1
- 108010053725 prolylvaline Proteins 0.000 description 1
- 230000001915 proofreading effect Effects 0.000 description 1
- 125000001436 propyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])[H] 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 230000005180 public health Effects 0.000 description 1
- 239000012264 purified product Substances 0.000 description 1
- 150000007660 quinolones Chemical class 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 230000009257 reactivity Effects 0.000 description 1
- 229940044551 receptor antagonist Drugs 0.000 description 1
- 239000002464 receptor antagonist Substances 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 229960001225 rifampicin Drugs 0.000 description 1
- JQXXHWHPUNPDRT-WLSIYKJHSA-N rifampicin Chemical compound O([C@](C1=O)(C)O/C=C/[C@@H]([C@H]([C@@H](OC(C)=O)[C@H](C)[C@H](O)[C@H](C)[C@@H](O)[C@@H](C)\C=C\C=C(C)/C(=O)NC=2C(O)=C3C([O-])=C4C)C)OC)C4=C1C3=C(O)C=2\C=N\N1CC[NH+](C)CC1 JQXXHWHPUNPDRT-WLSIYKJHSA-N 0.000 description 1
- 229930182490 saponin Natural products 0.000 description 1
- 150000007949 saponins Chemical class 0.000 description 1
- 235000017709 saponins Nutrition 0.000 description 1
- 230000009962 secretion pathway Effects 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 238000004513 sizing Methods 0.000 description 1
- 235000020183 skimmed milk Nutrition 0.000 description 1
- 239000001509 sodium citrate Substances 0.000 description 1
- NLJMYIDDQXHKNR-UHFFFAOYSA-K sodium citrate Chemical compound O.O.[Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O NLJMYIDDQXHKNR-UHFFFAOYSA-K 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 238000005063 solubilization Methods 0.000 description 1
- 230000007928 solubilization Effects 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 238000011895 specific detection Methods 0.000 description 1
- 229940063675 spermine Drugs 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 229960004291 sucralfate Drugs 0.000 description 1
- MNQYNQBOVCBZIQ-JQOFMKNESA-A sucralfate Chemical compound O[Al](O)OS(=O)(=O)O[C@@H]1[C@@H](OS(=O)(=O)O[Al](O)O)[C@H](OS(=O)(=O)O[Al](O)O)[C@@H](COS(=O)(=O)O[Al](O)O)O[C@H]1O[C@@]1(COS(=O)(=O)O[Al](O)O)[C@@H](OS(=O)(=O)O[Al](O)O)[C@H](OS(=O)(=O)O[Al](O)O)[C@@H](OS(=O)(=O)O[Al](O)O)O1 MNQYNQBOVCBZIQ-JQOFMKNESA-A 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 238000007910 systemic administration Methods 0.000 description 1
- 230000009885 systemic effect Effects 0.000 description 1
- 239000001648 tannin Substances 0.000 description 1
- 229950004351 telenzepine Drugs 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 229940040944 tetracyclines Drugs 0.000 description 1
- RTKIYNMVFMVABJ-UHFFFAOYSA-L thimerosal Chemical compound [Na+].CC[Hg]SC1=CC=CC=C1C([O-])=O RTKIYNMVFMVABJ-UHFFFAOYSA-L 0.000 description 1
- 108010001055 thymocartin Proteins 0.000 description 1
- 101150097091 tnpA gene Proteins 0.000 description 1
- 231100000419 toxicity Toxicity 0.000 description 1
- 230000001988 toxicity Effects 0.000 description 1
- 238000000844 transformation Methods 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 230000032258 transport Effects 0.000 description 1
- 230000017105 transposition Effects 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 108010005834 tyrosyl-alanyl-glycine Proteins 0.000 description 1
- 108010068794 tyrosyl-tyrosyl-glutamyl-glutamic acid Proteins 0.000 description 1
- 108010003137 tyrosyltyrosine Proteins 0.000 description 1
- 229940125575 vaccine candidate Drugs 0.000 description 1
- 208000007089 vaccinia Diseases 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 229940126580 vector vaccine Drugs 0.000 description 1
- 229940023147 viral vector vaccine Drugs 0.000 description 1
- 239000011782 vitamin Substances 0.000 description 1
- 235000013343 vitamin Nutrition 0.000 description 1
- 229940088594 vitamin Drugs 0.000 description 1
- 229930003231 vitamin Natural products 0.000 description 1
- 150000003722 vitamin derivatives Chemical class 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 230000003442 weekly effect Effects 0.000 description 1
- 108010027345 wheylin-1 peptide Proteins 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
- 150000003952 β-lactams Chemical class 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/12—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria
- C07K16/1203—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria from Gram-negative bacteria
- C07K16/121—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria from Gram-negative bacteria from Helicobacter (Campylobacter) (G)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/205—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Campylobacter (G)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Immunology (AREA)
- Gastroenterology & Hepatology (AREA)
- Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Saccharide Compounds (AREA)
- Peptides Or Proteins (AREA)
Description
HELICOBACTER POLYPEPTIDES AND CORRESPONDING POLYNUCLEOTIDE MOLECULES
The invention relates to Helicobacter antigens and corresponding polynucleotide molecules that can be used in methods to prevent or treat Helicobacter infection in mammals, such as humans.
Background of the Invention
Helicobacter is a genus of spiral, gram-negative bacteria that colonize the gastrointestinal tracts of mammals. Several species colonize the stomach, most notably H. pylori, H. heilmannii, H. felis, and H. mustelae. Although H. pylori is the species most commonly associated with human infection, H. heilmannii and H. felis have also been isolated from humans, but at lower frequencies than H. pylori. Helicobacter infects over 50% of adult populations in developed countries and nearly 100% in developing countries and some Pacific rim countries, making it one of the most prevalent infections worldwide.
Helicobacter is routinely recovered from gastric biopsies of humans with histological evidence of gastritis and peptic ulceration. Indeed, H. pylori is now recognized as an important pathogen of humans, in that the chronic gastritis it causes is a risk factor for the development of peptic ulcer diseases and gastric carcinoma. It is thus highly desirable to develop safe and effective vaccines for preventing and treating Helicobacter infection.
A number of Helicobacter antigens have been characterized or isolated. These include urease, which is composed of two structural subunits of approximately 30 and 67 kDa (Hu et al., Infect. Immun. 58:992, 1990; Dunn et al., J. Biol. Chem. 265:9464, 1990; Evans et al., Microbial Pathogenesis 10:15, 1991; Labigne et al., J. Bact. 173:1920, 1991); the 87 kDa vacuolar cytotoxin (VacA) (Cover et al., J. Biol. Chem. 267:10570, 1992; Phadnis et al., Infect. Immun. 62:1557, 1994; WO 93/18150); a 128 kDa immunodominant antigen associated with the cytotoxin (CagA, also called TagA; WO 93/18150; U.S. Patent No. 5,403,924); 13 and 58 kDa heat shock proteins HspA and HspB (Suerbaum et al., Mol. Microbiol. 14:959, 1994; WO 93/18150); a 54 kDa catalase (Hazell et al., J. Gen. Microbiol. 137:57, 1991); a 15 kDa histidine-rich protein (Hpn) (Gilbert et al., Infect. Immun. 63:2682, 1995); a 20 kDa membrane-associated lipoprotein (Kostrzynska et al., J. Bact. 176:5938, 1994); a 30 kDa outer membrane protein (Bolin et al., J. Clin. Microbiol. 33:381, 1995); a lactoferrin receptor (FR 2,724,936); and several porins, designated HopA, HopB, HopC, HopD, and HopE, which have molecular weights of 48-67 kDa (Exner et al., Infect. Immun. 63:1567, 1995; Doig et al., J. Bact. 177:5447, 1995). Some of these proteins have been proposed as potential vaccine antigens. In particular, urease is believed to be a vaccine candidate (WO 94/9823; WO 95/22987; WO 95/3824; Michetti et al., Gastroenterology 107:1002, 1994). Nevertheless, it is thought that several antigens may ultimately be necessary in a vaccine.
Summary of the Invention
The invention provides polynucleotide molecules that encode Helicobacter polypeptides, designated GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO 191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252, which can be used, e.g., in methods to prevent, treat, or diagnose Helicobacter infection. The polypeptides of the invention include those having the amino acid sequences shown in SEQ ID NOs:2-170 (even numbers), as well as mature forms of proteins having sequences shown in SEQ ID NOs:2-170 in their unprocessed forms, and fragments thereof. Those skilled in the art will understand that the invention also includes polynucleotide molecules that encode mutants and derivatives of these polypeptides, which can result from the addition, deletion, or substitution of non-essential amino acids, as is described further below. In addition to the polynucleotide molecules described above, the invention includes the corresponding polypeptides (i.e., polypeptides encoded by the polynucleotide molecules of the invention, or fragments thereof), and monospecific antibodies that specifically bind to these polypeptides.
The present invention has many applications and includes expression cassettes, vectors, and cells transformed or transfected with the polynucleotides of the invention. Accordingly, the present invention provides (i) methods for producing polypeptides of the invention in recombinant host systems and related expression cassettes, vectors, and transformed or transfected cells; (ii) live vaccine vectors, such as pox virus, Salmonella typhimurium, and Vibrio cholerae vectors, that contain polynucleotides of the invention (such vaccine vectors being useful in, e.g., methods for preventing or treating Helicobacter infection) in combination with a diluent or carrier, and related pharmaceutical compositions and associated therapeutic and/or prophylactic methods; (iii) therapeutic and/or prophylactic methods involving administration of polynucleotide molecules, either in a naked form or formulated with a delivery vehicle, polypeptides or mixtures of polypeptides, or monospecific antibodies of the invention, and related pharmaceutical compositions; (iv) methods for detecting the presence of Helicobacter in biological samples, which can involve the use of polynucleotide molecules, monospecific antibodies, or polypeptides of the invention; and (v) methods for purifying polypeptides of the invention by antibody-based affinity chromatography.
Brief Description of the Drawings
Fig. 1A is a diagrammatic representation of transposon TnMax9, which is a derivative of the TnMax transposon system (Haas et al., Gene 130:23-31, 1993). The mini-transposon carries the blaM gene, which is the β-lactamase gene lacking a promoter and a signal sequence, next to the inverted repeats (IR) and the M13 forward (M13-FP) and reverse (M13-RP1) primer binding sites. The resolution site (res) and an origin of replication (ori fd) are located between the blaM gene and the constitutive catGC-resistance gene. The transposase tnpA and resolvase tnpR genes are located outside of the mini-transposon and are under the control of the inducible Ptrc promoter. The lacIq gene encodes the Lac repressor.
Fig. 1B is a diagrammatic representation of plasmid pMin2. pMin2 contains a multiple cloning site, the tetracycline resistance gene (tet), an origin of transfer (oriT), an origin of replication (oriColE1), a transcriptional terminator (tfd), and a weak, constitutive promoter (Piga). H. pylori chromosome fragments were introduced into the BglII and ClaI sites of pMin2.
Detailed Description
Open reading frames (ORFs) encoding new, full length polypeptides, designated GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO 191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252, have been identified in the H. pylori genome. These polypeptides can be used, for example, in vaccination methods for preventing or treating Helicobacter infection. Some of the new polypeptides are secreted polypeptides that can be produced in their mature forms (i.e., as polypeptides that have been exported through class II or class III secretion pathways) or as precursors that include signal peptides, which can be removed in the course of excretion/secretion by cleavage at the N-terminal end of the mature form. (The cleavage site is located at the C-terminal end of the signal peptide, adjacent to the mature form.)
According to a first aspect of the invention, there are provided isolated polynucleotides that encode the precursor and mature forms of Helicobacter GHPO 13, GHPO 73, GHPO 90, GHPO 107, GHPO 136, GHPO 191, GHPO 213, GHPO 240, GHPO 408, GHPO 411, GHPO 419, GHPO 431, GHPO 474, GHPO 591, GHPO 596, GHPO 699, GHPO 724, GHPO 730, GHPO 761, GHPO 804, GHPO 805, GHPO 812, GHPO 879, GHPO 888, GHPO 986, GHPO 1056, GHPO 1081, GHPO 1100, GHPO 1140, GHPO 1148, GHPO 1200, GHPO 1212, GHPO 1258, GHPO 1263, GHPO 1273, GHPO 1284, GHPO 1299, GHPO 1327, GHPO 1346, GHPO 1378, GHPO 1412, GHPO 1443, GHPO 1466, GHPO 1476, GHPO 1536, GHPO 1559, GHPO 427, GHPO 1045, GHPO 1262, GHPO 1688, GHPO 1538, GHPO 346, GHPO 1012, GHPO 470, GHPO 1398, GHPO 1550, GHPO 276, GHPO 1501, GHPO 706, GHPO 1001, GHPO 732, GHPO 329, GHPO 574, GHPO 1190, GHPO 1374, GHPO 1620, GHPO 956, HPO 98, GHPO 689, GHPO 208, GHPO 296, GHPO 726, GHPO 1026, GHPO 1301, GHPO 1536, GHPO 166, GHPO 253, GHPO 297, GHPO 615, GHPO 1278, GHPO 1282, GHPO 1420, GHPO 1484, GHPO 1719, and GHPO 1252.
An isolated polynucleotide of the invention encodes: (i) a polypeptide having an amino acid sequence that is homologous to a Helicobacter amino acid sequence of a polypeptide, the Helicobacter amino acid sequence being selected from the group consisting of the amino acid sequences shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or (ii) a derivative of the polypeptide.
In addition to the full-length polypeptides encoded by the polynucleotides of the invention, as set forth above, polynucleotides included in the invention can also encode polypeptides that lack signal sequences, as well as other polypeptide or peptide fragments of the full-length polypeptides.
The term "isolated polynucleotide" is defined as a polynucleotide that is removed from the environment in which it naturally occurs. For example, a naturally occurring DNA molecule present in the genome of a living bacterium or as part of a gene bank is not isolated, but the same molecule, separated from the remaining part of the bacterial genome as a result of, e.g., a cloning event (amplification), is "isolated." Typically, an isolated DNA molecule is free from DNA regions (e.g., coding regions) with which it is immediately contiguous, at the 5' or 3' ends, in the naturally occurring genome. Such isolated polynucleotides can be part of a vector or a composition and still be isolated, as such a vector or composition is not part of its natural environment.
A polynucleotide of the invention can consist of RNA or DNA (e.g., cDNA, genomic DNA, or synthetic DNA), or modifications or combinations of RNA or DNA. The polynucleotide can be double-stranded or single-stranded and, if single-stranded, can be the coding (sense) strand or the non-coding (anti-sense) strand. The sequences that encode polypeptides of the invention, as shown in any of SEQ ID NOs:2-170 (even numbers), can be (a) the coding sequence as shown in any of SEQ ID NOs:1-169 (odd numbers); (b) a ribonucleotide sequence derived by transcription of (a); or (c) a different coding sequence that, as a result of the redundancy or degeneracy of the genetic code, encodes the same polypeptides as the polynucleotide molecules having the sequences illustrated in any of SEQ ID NOs:1-169 (odd numbers). The polypeptide can be one that is naturally secreted or excreted by, e.g., H. felis, H. mustelae, H. heilmannii, or H. pylori.
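Options (b) and (c) above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and uses two short, made-up coding sequences that are not taken from the sequence listing; the function names transcribe and translate are likewise illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (an assumption
# made for illustration; '*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def transcribe(dna: str) -> str:
    """Return the ribonucleotide sequence of a coding (sense) DNA strand."""
    return dna.upper().replace("T", "U")


def translate(dna: str) -> str:
    """Translate a coding DNA sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical coding sequences that differ at the nucleotide level
# but, because the code is degenerate, encode the same peptide.
seq_a = "ATGCTTAAA"
seq_b = "ATGTTGAAG"
print(transcribe(seq_a))
print(translate(seq_a), translate(seq_b))
```

Here the two sequences differ at three nucleotide positions, yet both translate to the same three-residue peptide, which is the situation contemplated by option (c).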
By "polypeptide" or "protein" is meant any chain of amino acids, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation). Both terms are used interchangeably in the present application.
By "homologous amino acid sequence" is meant an amino acid sequence that differs from an amino acid sequence shown in any of SEQ ID NOs:2-170 (even numbers), or an amino acid sequence encoded by the nucleotide sequence of any of SEQ ID NOs:1-169 (odd numbers), by one or more non-conservative amino acid substitutions, deletions, or additions located at positions at which they do not destroy the specific antigenicity of the polypeptide. Preferably, such a sequence is at least 75%, more preferably at least 80%, and most preferably at least 90% identical to an amino acid sequence shown in any of SEQ ID NOs:2-170 (even numbers).
Homologous amino acid sequences include sequences that are identical or substantially identical to an amino acid sequence as shown in any of SEQ ID NOs:2-170 (even numbers). By "amino acid sequence that is substantially identical" is meant a sequence that is at least 90%, preferably at least 95%, more preferably at least 97%, and most preferably at least 99% identical to an amino acid sequence of reference and that differs from the sequence of reference, if at all, by a majority of conservative amino acid substitutions.
Conservative amino acid substitutions typically include substitutions among amino acids of the same class. These classes include, for example, amino acids having uncharged polar side chains, such as asparagine, glutamine, serine, threonine, and tyrosine; amino acids having basic side chains, such as lysine, arginine, and histidine; amino acids having acidic side chains, such as aspartic acid and glutamic acid; and amino acids having nonpolar side chains, such as glycine, alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan, and cysteine.
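The classes listed above can be encoded directly as a lookup, as in the following minimal Python sketch. The one-letter codes, the class labels, and the function name is_conservative are illustrative choices; treating an unchanged residue as "not a substitution" is also an assumption of this sketch.

```python
# Amino acid classes as listed in the text, written with one-letter codes.
CLASSES = {
    "uncharged polar": "NQSTY",   # Asn, Gln, Ser, Thr, Tyr
    "basic": "KRH",               # Lys, Arg, His
    "acidic": "DE",               # Asp, Glu
    "nonpolar": "GAVLIPFMWC",     # Gly, Ala, Val, Leu, Ile, Pro, Phe, Met, Trp, Cys
}

# Reverse lookup: residue -> class label.
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the replacement is a different residue from the same class."""
    if original == replacement:
        return False  # no substitution took place
    return CLASS_OF[original] == CLASS_OF[replacement]


print(is_conservative("K", "R"))  # Lys -> Arg, both basic
print(is_conservative("D", "K"))  # Asp -> Lys, acidic to basic
```

Under this grouping, a lysine-to-arginine change counts as conservative, whereas an aspartic acid-to-lysine change does not.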
Homology can be measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Similar amino acid sequences are aligned to obtain the maximum degree of homology (i.e., identity). To this end, it may be necessary to artificially introduce gaps into the sequence. Once the optimal alignment has been set up, the degree of homology (i.e., identity) is established by recording all of the positions in which the amino acids of both sequences are identical, relative to the total number of positions. Homologous polynucleotide sequences are defined in a similar way.
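The identity calculation described above can be sketched in a few lines of Python. The sketch assumes that the two sequences have already been aligned (with "-" marking an introduced gap) and that the "total number of positions" is the length of the gapped alignment; the text does not specify how gap columns are counted, so that choice, the function name percent_identity, and the example sequences are all assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical pre-aligned sequences: one gap column and one mismatch
# out of seven columns.
a = "MKT-LLA"
b = "MKTALIA"
print(round(percent_identity(a, b), 1))
```

In this example five of the seven columns are identical, so the two sequences are about 71% identical and would fall below the 75% threshold given above for a homologous amino acid sequence.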
Preferably, a homologous sequence is one that is at least 45%, more preferably at least 60%, and most preferably at least 85% identical to a coding sequence of any of SEQ ID NOs:1-169 (odd numbers).
Polypeptides having a sequence homologous to any one of the sequences shown in SEQ ID NOs:2-170 (even numbers) include naturally occurring allelic variants, as well as mutants or any other non-naturally occurring variants that are analogous, in terms of antigenicity, to a polypeptide having a sequence as shown in any one of SEQ ID NOs:2-170 (even numbers).
As is known in the art, an allelic variant is an alternate form of a polypeptide that is characterized as having a substitution, deletion, or addition of one or more amino acids that does not alter the biological function of the polypeptide. By "biological function" is meant a function of the polypeptide in the cells in which it naturally occurs, even if the function is not necessary for the growth or survival of the cells. For example, the biological function of a porin is to allow the entry into cells of compounds present in the extracellular medium. The biological function is distinct from the antigenic function. A polypeptide can have more than one biological function.
Allelic variants are very common in nature. For example, a bacterial species, e.g., H. pylori, is usually represented by a variety of strains that differ from each other by minor allelic variations. Indeed, a polypeptide that fulfills the same biological function in different strains can have an amino acid sequence that is not identical in each of the strains. Such an allelic variation can be equally reflected at the polynucleotide level.
Support for the use of allelic variants of polypeptide antigens comes from, e.g., studies of the Helicobacter urease antigen. The amino acid sequence of Helicobacter urease varies widely from species to species, yet cross-species protection occurs, indicating that the urease molecule, when used as an immunogen, is highly tolerant of amino acid variations. Even among different strains of the single species H. pylori, there are amino acid sequence variations. For example, although the amino acid sequences of the UreA and UreB subunits of H. pylori and H. felis ureases differ from one another by 26.5% and 11.8%, respectively (Ferrero et al., Molecular Microbiology 9(2):323-333, 1993), it has been shown that H. pylori urease protects mice from H. felis infection (Michetti et al., Gastroenterology 107:1002, 1994). In addition, it has been shown that the individual structural subunits of urease, UreA and UreB, which contain distinct amino acid sequences, are both protective antigens against Helicobacter infection (Michetti et al., supra). Similarly, Cuenca et al. (Gastroenterology 110:1770, 1996) showed that therapeutic immunization of H. mustelae-infected ferrets with H. pylori urease was effective at eradicating H. mustelae infection. Further, several urease variants have been reported to be effective vaccine antigens, including, e.g., recombinant UreA + UreB apoenzyme expressed from pORV142 (UreA and UreB sequences derived from H. pylori strain CPM630; Lee et al., J. Infect. Dis. 172:161, 1995); recombinant UreA + UreB apoenzyme expressed from pORV214 (UreA and UreB sequences differ from H. pylori strain CPM630 by one and two amino acid changes, respectively; Lee et al., supra, 1995); a UreA-glutathione-S-transferase fusion protein (UreA sequence from H. pylori strain ATCC 43504; Thomas et al., Acta Gastro-Enterologica Belgica 56:54, 1993); UreA + UreB holoenzyme purified from H. pylori strain NCTC11637 (Marchetti et al., Science 267:1655, 1995); a UreA-MBP fusion protein (UreA from H. pylori strain 85P; Ferrero et al., Infection and Immunity 62:4981, 1994); a UreB-MBP fusion protein (UreB from H. pylori strain 85P; Ferrero et al., supra); a UreA-MBP fusion protein (UreA from H. felis strain ATCC 49179; Ferrero et al., supra); a UreB-MBP fusion protein (UreB from H. felis strain ATCC 49179; Ferrero et al., supra); and a 37 kDa fragment of UreB containing amino acids 220-569 (Dore-Davin et al., "A 37 kD fragment of UreB is sufficient to confer protection against Helicobacter felis infection in mice"). Finally, Thomas et al. (supra) showed that oral immunization of mice with crude sonicates of H. pylori protected mice from subsequent challenge with H. felis.
Polynucleotides, e.g., DNA molecules, encoding allelic variants can easily be obtained by polymerase chain reaction (PCR) amplification of genomic bacterial DNA extracted by conventional methods. This involves the use of synthetic oligonucleotide primers matching sequences that are upstream and downstream of the 5' and 3' ends of the coding region. Suitable primers can be designed based on the nucleotide sequence information provided in any of SEQ ID NOs: 1-169 (odd numbers). Typically, a primer consists of 10 to 40, preferably 15 to 25 nucleotides. It can also be advantageous to select primers containing C and G nucleotides in proportions sufficient to ensure efficient hybridization, e.g., an amount of C and G nucleotides of at least 40%, preferably 50%, of the total nucleotide amount. Those skilled in the art can readily design primers that can be used to isolate the polynucleotides of the invention from different Helicobacter strains. Experimental conditions for carrying out PCR can readily be determined by one skilled in the art and an illustration of carrying out PCR is provided in the Examples below. As is well known in the art, restriction endonuclease recognition sites that contain, typically, 4 to 6 nucleotides (for example, the sequences 5'-GGATCC-3' (BamHI) or 5'-CTCGAG-3' (XhoI)), can be included on the 5' ends of the primers. Restriction sites can be selected by those skilled in the art so that the amplified DNA can be conveniently cloned into an appropriately digested vector, such as a plasmid.
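The primer guidelines stated above (10 to 40 nucleotides, at least 40% C and G nucleotides, an optional restriction site on the 5' end) can be illustrated with a short Python sketch. This is only an illustration: the function names and the example core sequence are hypothetical and are not taken from the sequence listing.

```python
# Hypothetical helpers illustrating the primer guidelines in the text.
BAMHI_SITE = "GGATCC"  # 5'-GGATCC-3' (BamHI), as cited in the text
XHOI_SITE = "CTCGAG"   # 5'-CTCGAG-3' (XhoI), as cited in the text


def gc_percent(seq: str) -> float:
    """Return the percentage of C and G nucleotides in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def primer_meets_guidelines(core: str, min_len: int = 10, max_len: int = 40,
                            min_gc: float = 40.0) -> bool:
    """Check the length range and minimum C+G content given in the text."""
    return min_len <= len(core) <= max_len and gc_percent(core) >= min_gc


def add_5prime_site(core: str, site: str) -> str:
    """Prepend a restriction site to the 5' end of the gene-specific core."""
    return site.upper() + core.upper()


# Made-up 20-nucleotide core sequence, used for illustration only.
example_core = "ATGCGTACCGGTAAGCTTGC"
forward_primer = add_5prime_site(example_core, BAMHI_SITE)
```

In this sketch the guideline check is applied to the gene-specific core only, since the added restriction site does not hybridize to the template in the first amplification cycles; that choice is an assumption of the example, not a requirement stated in the text.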
Useful homologs that do not occur naturally can be designed using known methods for identifying regions of an antigen that are likely to be tolerant of amino acid sequence changes and/or deletions. For example,
sequences of the antigen from different species can be compared to identify conserved sequences.
Polypeptide derivatives that are encoded by polynucleotides of the invention include, e.g., fragments, polypeptides having large internal deletions derived from full-length polypeptides, and fusion proteins. Polypeptide fragments of the invention can be derived from a polypeptide having a sequence homologous to any of the sequences of SEQ ID NOs:2-170 (even numbers), to the extent that the fragments retain the substantial antigenicity of the parent polypeptide (specific antigenicity). Polypeptide derivatives can also be constructed by large internal deletions that remove a substantial part of the parent polypeptide, while retaining specific antigenicity. Generally, polypeptide derivatives should be about at least 12 amino acids in length to maintain antigenicity. Advantageously, they can be at least 20 amino acids, preferably at least 50 amino acids, more preferably at least 75 amino acids, and most preferably at least 100 amino acids in length.
Useful polypeptide derivatives, e.g., polypeptide fragments, can be designed using computer-assisted analysis of amino acid sequences in order to identify sites in protein antigens having potential as surface-exposed, antigenic regions (Hughes et al, Infect. Immun. 60(9):3497, 1992). For example, the Laser Gene Program from DNA Star can be used to obtain hydrophilicity, antigenic index, and intensity index plots for the polypeptides of the invention. This program can also be used to obtain information about homologies of the polypeptides with known protein motifs. One skilled in the art can readily use the information provided in such plots to select peptide fragments for use as vaccine antigens. For example, fragments spanning regions of the plots in which the antigenic index is relatively high can be selected. One can also select fragments spanning regions in which both the antigenic index and the
intensity plots are relatively high. Fragments containing conserved sequences, particularly hydrophilic conserved sequences, can also be selected.
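The idea of scanning a sequence for hydrophilic stretches can be sketched as a sliding-window average. The sketch below is not the Laser Gene algorithm and does not compute an antigenic index; it swaps in the published Kyte-Doolittle hydropathy scale with the sign reversed, and the window size of 7 and all function names are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values; higher means more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilicity_profile(seq: str, window: int = 7) -> list:
    """Average sign-reversed hydropathy over each window of the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [-KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(seq: str, window: int = 7) -> tuple:
    """Return (start index, fragment, score) of the most hydrophilic window."""
    seq = seq.upper()
    profile = hydrophilicity_profile(seq, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]
```

A fragment chosen this way would be only a candidate; whether it retains specific antigenicity has to be established experimentally.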
Polypeptide fragments and polypeptides having large internal deletions can be used for revealing epitopes that are otherwise masked in the parent polypeptide and that may be of importance for inducing a protective T cell-dependent immune response. Deletions can also remove immunodominant regions of high variability among strains.
It is an accepted practice in the field of immunology to use fragments and variants of protein immunogens as vaccines, as all that is required to induce an immune response to a protein is a small (e.g., 8 to 10 amino acids) immunogenic region of the protein. This has been done for a number of vaccines against pathogens other than Helicobacter. For example, short synthetic peptides corresponding to surface-exposed antigens of pathogens such as murine mammary tumor virus (peptide containing 11 amino acids; Dion et al., Virology 179:474-477, 1990), Semliki Forest virus (peptide containing 16 amino acids; Snijders et al, J. Gen. Virol. 72:557-565, 1991), and canine parvovirus (2 overlapping peptides, each containing 15 amino acids; Langeveld et al, Vaccine 12(15):1473-1480, 1994) have been shown to be effective vaccine antigens against their respective pathogens. Polynucleotides encoding polypeptide fragments and polypeptides having large internal deletions can be constructed using standard methods (see, e.g., Ausubel et al, Current Protocols in Molecular Biology, John Wiley & Sons Inc., 1994), for example, by PCR, including inverse PCR, by restriction enzyme treatment of the cloned DNA molecules, or by the method of Kunkel et al. (Proc. Natl. Acad. Sci. USA 82:448, 1985; biological material available at Stratagene).
A polypeptide derivative can also be produced as a fusion polypeptide that contains a polypeptide or a polypeptide derivative of the invention fused, e.g., at the N- or C-terminal end, to any other polypeptide (hereinafter referred to as a peptide tail). Such a product can be easily obtained by translation of a genetic fusion, i.e., a hybrid gene. Vectors for expressing fusion polypeptides are commercially available, and include the pMal-c2 or pMal-p2 systems of New England Biolabs, in which the peptide tail is a maltose binding protein, the glutathione-S-transferase system of Pharmacia, or the His-Tag system available from Novagen. These and other expression systems provide convenient means for further purification of polypeptides and derivatives of the invention.
Another particular example of fusion polypeptides included in the invention is a polypeptide or polypeptide derivative of the invention fused to a polypeptide having adjuvant activity, such as, e.g., subunit B of either cholera toxin or E. coli heat-labile toxin. Several possibilities can be used for producing such fusion proteins. First, the polypeptide of the invention can be fused to the
N-terminal end or, preferably, to the C-terminal end of the polypeptide having adjuvant activity. Second, a polypeptide fragment of the invention can be fused within the amino acid sequence of the polypeptide having adjuvant activity. Spacer sequences can also be included, if desired.
As stated above, the polynucleotides of the invention encode Helicobacter polypeptides in precursor or mature form. They can also encode hybrid precursors containing heterologous signal peptides, which can mature into polypeptides of the invention. By "heterologous signal peptide" is meant a signal peptide that is not found in the naturally-occurring precursor of a polypeptide of the invention.
A polynucleotide of the invention hybridizes, preferably under stringent conditions, to a polynucleotide having a sequence as shown in any of SEQ ID NOs: 1-169 (odd numbers). Hybridization procedures are, e.g., described by Ausubel et al. (supra); Silhavy et al. (Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1984); and Davis et al. (A Manual for Genetic Engineering: Advanced Bacterial Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1980). Important parameters that can be considered for optimizing hybridization conditions are reflected in the following formula, which facilitates calculation of the melting temperature (Tm), which is the temperature above which two complementary DNA strands separate from one another (Casey et al, Nucl. Acid Res. 4:1539, 1977): Tm = 81.5 + 0.5 x (% G+C) + 1.6 log (positive ion concentration) - 0.6 x (% formamide). Under appropriate stringency conditions, hybridization temperature (Th) is approximately 20 to 40°C, 20 to 25 °C, or, preferably, 30 to 40°C below the calculated Tm. Those skilled in the art will understand that optimal temperature and salt conditions can be readily determined empirically in preliminary experiments using conventional procedures. For example, stringent conditions can be achieved, both for pre-hybridizing and hybridizing incubations, (i) within 4-16 hours at 42°C, in 6 x SSC containing
50% formamide or (ii) within 4-16 hours at 65 °C in an aqueous 6 x SSC solution (1 M NaCl, 0.1 M sodium citrate (pH 7.0)). For polynucleotides containing 30 to 600 nucleotides, the above formula is used and then is corrected by subtracting (600/polynucleotide size in base pairs). Stringency conditions are defined by a Th that is 5 to 10°C below Tm.
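As a worked illustration, the Tm formula and the length correction described above can be written as a small Python function. The coefficients are copied exactly as printed in the text and have not been checked against the cited reference; the use of a base-10 logarithm, a molar cation concentration, and the function names are assumptions of this sketch.

```python
import math


def tm_long(gc_percent: float, cation_molar: float,
            formamide_percent: float = 0.0, length_bp: int = None) -> float:
    """Tm as printed in the text, with the optional 600/size correction.

    Assumes log means log10 and the ion concentration is in mol/L.
    """
    if cation_molar <= 0:
        raise ValueError("cation concentration must be positive")
    tm = (81.5 + 0.5 * gc_percent
          + 1.6 * math.log10(cation_molar)
          - 0.6 * formamide_percent)
    # Correction stated in the text for polynucleotides of 30 to 600 nucleotides.
    if length_bp is not None and 30 <= length_bp <= 600:
        tm -= 600.0 / length_bp
    return tm


def stringent_th_range(tm: float) -> tuple:
    """Hybridization temperature window 5 to 10 degrees C below Tm."""
    return (tm - 10.0, tm - 5.0)
```

For example, with 50% G+C, 1 M positive ion and 50% formamide the function gives 81.5 + 25 + 0 - 30 = 76.5°C, and `stringent_th_range(76.5)` returns the window from 66.5 to 71.5°C.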
Hybridization conditions with oligonucleotides shorter than 20-30 bases do not precisely follow the rules set forth above. In such cases, the
formula for calculating the Tm is as follows: Tm = 4 x (G+C) + 2 (A+T). For example, an 18 nucleotide fragment of 50% G+C would have an approximate Tm of 54°C.
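The short-oligonucleotide rule can be checked with a few lines of Python. The function name and the 18-nucleotide test sequence are made up for illustration; the arithmetic follows the formula Tm = 4 x (G+C) + 2 x (A+T) given above.

```python
def tm_short(seq: str) -> int:
    """Tm estimate for short oligonucleotides: 4 x (G+C) + 2 x (A+T)."""
    seq = seq.upper()
    gc = sum(base in "GC" for base in seq)
    at = sum(base in "AT" for base in seq)
    if gc + at != len(seq):
        raise ValueError("sequence may contain only A, C, G and T")
    return 4 * gc + 2 * at


# Hypothetical 18-mer with 9 G/C and 9 A/T nucleotides (50% G+C).
example_oligo = "GCGCGCGCGATATATATA"
example_tm = tm_short(example_oligo)  # 4 * 9 + 2 * 9 = 54
```

This reproduces the 54°C figure quoted in the text for an 18-nucleotide fragment of 50% G+C.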
A polynucleotide molecule of the invention, containing RNA, DNA, or modifications or combinations thereof, can have various applications. For example, a polynucleotide molecule can be used (i) in a process for producing the encoded polypeptide in a recombinant host system, (ii) in the construction of vaccine vectors, such as poxviruses, which are further used in methods and compositions for preventing and/or treating Helicobacter infection, (iii) as a vaccine agent, in a naked form or formulated with a delivery vehicle, and (iv) in the construction of attenuated Helicobacter strains that can over-express a polynucleotide of the invention or express it in a non-toxic, mutated form.
According to a second aspect of the invention, there is therefore provided (i) an expression cassette containing a polynucleotide molecule of the invention placed under the control of elements (e.g., a promoter) required for expression; (ii) an expression vector containing an expression cassette of the invention; (iii) a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, as well as (iv) a process for producing a polypeptide or polypeptide derivative encoded by a polynucleotide of the invention, which involves culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, under conditions that allow expression of the polynucleotide molecule of the invention, and recovering the encoded polypeptide or polypeptide derivative from the cell culture.
A recombinant expression system can be selected from procaryotic and eucaryotic hosts. Eucaryotic hosts include, for example, yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris), mammalian cells (e.g., COS1,
NIH3T3, or JEG3 cells), arthropod cells (e.g., Spodoptera frugiperda (SF9) cells), and plant cells. Preferably, a procaryotic host such as E. coli is used. Bacterial and eucaryotic cells are available from a number of different sources that are known to those skilled in the art, e.g., the American Type Culture Collection (ATCC; Rockville, Maryland).
The choice of the expression cassette will depend on the host system selected, as well as the features desired for the expressed polypeptide. For example, it may be useful to produce a polypeptide of the invention in a particular lipidated form or any other form. Typically, an expression cassette includes a constitutive or inducible promoter that is functional in the selected host system; a ribosome binding site; a start codon (ATG); if necessary, a region encoding a signal peptide, e.g., a lipidation signal peptide; a polynucleotide molecule of the invention; a stop codon; and, optionally, a 3' terminal region (translation and/or transcription terminator). The signal peptide-encoding region is adjacent to the polynucleotide of the invention and is placed in the proper reading frame. The signal peptide-encoding region can be homologous or heterologous to the polynucleotide molecule encoding the mature polypeptide and it can be specific to the secretion apparatus of the host used for expression. The open reading frame constituted by the polynucleotide molecule of the invention, alone or together with the signal peptide, is placed under the control of the promoter so that transcription and translation occur in the host system. Promoters and signal peptide-encoding regions are widely known and available to those skilled in the art and include, for example, the promoter of Salmonella typhimurium (and derivatives) that is inducible by arabinose (promoter araB) and is functional in Gram-negative bacteria such as E. coli (U.S. Patent No. 5,028,530; Cagnon et al, Protein Engineering 4(7):843, 1991); the promoter of the bacteriophage T7 RNA polymerase gene,
which is functional in a number of E. coli strains expressing T7 polymerase (U.S. Patent No. 4,952,496); the OspA lipidation signal peptide; and RlpB lipidation signal peptide (Takase et al, J. Bact. 169:5692, 1987).
The expression cassette is typically part of an expression vector, which is selected for its ability to replicate in the chosen expression system. Expression vectors (e.g., plasmids or viral vectors) can be chosen from, for example, those described in Pouwels et al. (Cloning Vectors: A Laboratory Manual, 1985, Supp. 1987), and can be purchased from various commercial sources. Methods for transforming or transfecting host cells with expression vectors are well known in the art and will depend on the host system selected, as described in Ausubel et al. (supra).
Upon expression, a recombinant polypeptide of the invention (or a polypeptide derivative) is produced and remains in the intracellular compartment, is secreted/excreted in the extracellular medium or in the periplasmic space, or is embedded in the cellular membrane. The polypeptide can then be recovered in a substantially purified form from the cell extract or from the supernatant after centrifugation of the cell culture. Typically, the recombinant polypeptide can be purified by antibody-based affinity purification or by any other method known in the art, such as by genetic fusion to a small affinity-binding domain. Antibody-based affinity purification methods are also available for purifying a polypeptide of the invention extracted from a Helicobacter strain. Antibodies useful for immunoaffinity purification of the polypeptides of the invention can be obtained using methods described below.
Polynucleotides of the invention can also be used in DNA vaccination methods, using either a viral or bacterial host as gene delivery vehicle (live vaccine vector) or administering the gene in a free form, e.g.,
inserted into a plasmid. Therapeutic or prophylactic efficacy of a polynucleotide of the invention can be evaluated as is described below.
Accordingly, in a third aspect of the invention, there is provided (i) a vaccine vector such as a poxvirus, containing a polynucleotide molecule of the invention placed under the control of elements required for expression; (ii) a composition of matter containing a vaccine vector of the invention, together with a diluent or carrier; (iii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a vaccine vector of the invention; (iv) a method for inducing an immune response against Helicobacter in a mammal (e.g., a human; alternatively, the method can be used in veterinary applications for treating or preventing Helicobacter infection of animals, e.g., cats or birds), which involves administering to the mammal an immunogenically effective amount of a vaccine vector of the invention to elicit an immune response, e.g., a protective or therapeutic immune response to Helicobacter; and (v) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, which involves administering a prophylactic or therapeutic amount of a vaccine vector of the invention to an individual in need. Additionally, the third aspect of the invention encompasses the use of a vaccine vector of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
A vaccine vector of the invention can express one or several polypeptides or derivatives of the invention, as well as at least one additional Helicobacter antigen such as a urease apoenzyme or a subunit, fragment, homolog, mutant, or derivative thereof. In addition, it can express a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), that enhances the immune response. Thus, a vaccine vector can include an additional
polynucleotide molecule encoding, e.g., urease subunit A, B, or both, or a cytokine, placed under the control of elements required for expression in a mammalian cell.
Alternatively, a composition of the invention can include several vaccine vectors, each of which is capable of expressing a polypeptide or derivative of the invention. A composition can also contain a vaccine vector capable of expressing an additional Helicobacter antigen, such as urease apoenzyme, a subunit, fragment, homolog, mutant, or derivative thereof, or a cytokine such as IL-2 or IL-12.
In vaccination methods for treating or preventing infection in a mammal, a vaccine vector of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, oral, gastric, pulmonary, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route, or a combination thereof. Preferred routes depend upon the choice of the vaccine vector. The administration can be achieved in a single dose or repeated at intervals. The appropriate dosage depends on various parameters that are understood by those skilled in the art, such as the nature of the vaccine vector itself, the route of administration, and the condition of the mammal to be vaccinated (e.g., the weight, age, and general health of the mammal).
Live vaccine vectors that can be used in the invention include viral vectors, such as adenoviruses and poxviruses, as well as bacterial vectors, e.g., Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin (BCG), and Streptococcus. An example of an adenovirus vector, as well as a method for constructing an adenovirus vector capable of expressing a polynucleotide molecule of the invention, is described in U.S. Patent No.
4,920,209. Poxvirus vectors that can be used in the invention include, e.g., vaccinia and canary pox viruses, which are described in U.S. Patent No. 4,722,848 and U.S. Patent No. 5,364,773, respectively (also see, e.g., Tartaglia et al, Virology 188:217, 1992, for a description of a vaccinia virus vector, and Taylor et al, Vaccine 13:539, 1995, for a description of a canary poxvirus vector). Poxvirus vectors capable of expressing a polynucleotide of the invention can be obtained by homologous recombination, as described in Kieny et al. (Nature 312:163, 1984) so that the polynucleotide of the invention is inserted into the viral genome under appropriate conditions for expression in mammalian cells. Generally, the dose of viral vector vaccine, for therapeutic or prophylactic use, can be from about 1x10⁴ to about 1x10¹¹, advantageously from about 1x10⁷ to about 1x10¹⁰, or, preferably, from about 1x10⁷ to about 1x10⁹ plaque-forming units per kilogram. Preferably, viral vectors are administered parenterally, for example, in 3 doses that are 4 weeks apart. Those skilled in the art will recognize that it is preferable to avoid adding a chemical adjuvant to a composition containing a viral vector of the invention, thereby minimizing the immune response to the viral vector itself.
Non-toxigenic Vibrio cholerae mutant strains that can be used in live oral vaccines are described by Mekalanos et al. (Nature 306:551, 1983) and in U.S. Patent No. 4,882,278 (strain in which a substantial amount of the coding sequence of each of the two ctxA alleles has been deleted so that no functional cholera toxin is produced); WO 92/11354 (strain in which the irgA locus is inactivated by mutation; this mutation can be combined in a single strain with ctxA mutations); and WO 94/1533 (deletion mutant lacking functional ctxA and attRSI DNA sequences). These strains can be genetically engineered to express heterologous antigens, as described in WO 94/19482. An effective vaccine dose of a V. cholerae strain capable of expressing a
polypeptide or polypeptide derivative encoded by a polynucleotide molecule of the invention can contain, e.g., about 1x10⁵ to about 1x10⁹, preferably about 1x10⁶ to about 1x10⁸, viable bacteria in an appropriate volume for the selected route of administration. Preferred routes of administration include all mucosal routes, but, most preferably, these vectors are administered intranasally or orally.
Attenuated Salmonella typhimurium strains, genetically engineered for recombinant expression of heterologous antigens, and their use as oral vaccines, are described by Nakayama et al. (Bio/Technology 6:693, 1988) and in WO 92/11361. Preferred routes of administration for these vectors include all mucosal routes. Most preferably, the vectors are administered intranasally or orally.
Other bacterial strains useful as vaccine vectors are described by High et al. (EMBO J. 11:1991, 1992) and Sizemore et al. (Science 270:299, 1995; Shigella flexneri); Medaglini et al. (Proc. Natl. Acad. Sci. USA 92:6868, 1995; Streptococcus gordonii); and Flynn (Cell. Mol. Biol. 40 (suppl. I):31, 1994), and in WO 88/6626, WO 90/0594, WO 91/13157, WO 92/1796, and WO 92/21376 (Bacille Calmette Guerin). In bacterial vectors, a polynucleotide of the invention can be inserted into the bacterial genome or it can remain in a free state, for example, carried on a plasmid.
An adjuvant can also be added to a composition containing a bacterial vector vaccine. A number of adjuvants that can be used are known to those skilled in the art. For example, preferred adjuvants can be selected from the list provided below. According to a fourth aspect of the invention, there is also provided
(i) a composition of matter containing a polynucleotide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing
a therapeutically or prophylactically effective amount of a polynucleotide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polynucleotide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmanii) infection, by administering a prophylactic or therapeutic amount of a polynucleotide of the invention to an individual in need of such treatment. Additionally, the fourth aspect of the invention encompasses the use of a polynucleotide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection. The fourth aspect of the invention preferably includes the use of a polynucleotide molecule placed under conditions for expression in a mammalian cell, e.g., in a plasmid that is unable to replicate in mammalian cells and to substantially integrate into a mammalian genome.
Polynucleotides (for example, DNA or RNA molecules) of the invention can also be administered as such to a mammal as a vaccine. When a DNA molecule of the invention is used, it can be in the form of a plasmid that is unable to replicate in a mammalian cell and unable to integrate into the mammalian genome. Typically, a DNA molecule is placed under the control of a promoter suitable for expression in a mammalian cell. The promoter can function ubiquitously or tissue-specifically. Examples of non-tissue specific promoters include the early Cytomegalovirus (CMV) promoter (U.S. Patent No. 4,168,062) and the Rous Sarcoma Virus promoter (Norton et al, Molec. Cell Biol. 5:281, 1985). The desmin promoter (Li et al, Gene 78:243, 1989; Li et al, J. Biol. Chem. 266:6562, 1991; Li et al, J. Biol. Chem. 268: 10403, 1993) is tissue-specific and drives expression in muscle cells. More generally,
useful promoters and vectors are described, e.g., in WO 94/21797 and by Hartikka et al. (Human Gene Therapy 7: 1205, 1996).
For DNA/RNA vaccination, the polynucleotide of the invention can encode a precursor or a mature form of a polypeptide of the invention. When it encodes a precursor form, the precursor sequence can be homologous or heterologous. In the latter case, a eucaryotic leader sequence can be used, such as the leader sequence of the tissue-type plasminogen activator (tPA).
A composition of the invention can contain one or several polynucleotides of the invention. It can also contain at least one additional polynucleotide encoding another Helicobacter antigen, such as urease subunit A, B, or both, or a fragment, derivative, mutant, or analog thereof. A polynucleotide encoding a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), can also be added to the composition so that the immune response is enhanced. These additional polynucleotides are placed under appropriate control for expression. Advantageously, DNA molecules of the invention and/or additional DNA molecules to be included in the same composition are carried in the same plasmid.
Standard methods can be used in the preparation of therapeutic polynucleotides of the invention. For example, a polynucleotide can be used in a naked form, free of any delivery vehicles, such as anionic liposomes, cationic lipids, microparticles, e.g., gold microparticles, precipitating agents, e.g., calcium phosphate, or any other transfection-facilitating agent. In this case, the polynucleotide can be simply diluted in a physiologically acceptable solution, such as sterile saline or sterile buffered saline, with or without a carrier. When present, the carrier preferably is isotonic, hypotonic, or weakly hypertonic, and has a relatively low ionic strength, such as provided by a sucrose solution, e.g., a solution containing 20% sucrose.
Alternatively, a polynucleotide can be associated with agents that assist in cellular uptake. It can be, e.g., (i) complemented with a chemical agent that modifies cellular permeability, such as bupivacaine (see, e.g., WO 94/16737), (ii) encapsulated into liposomes, or (iii) associated with cationic lipids or silica, gold, or tungsten microparticles.
Anionic and neutral liposomes are well-known in the art (see, e.g., Liposomes: A Practical Approach, RRC New, Ed., IRL Press, 1990, for a detailed description of methods for making liposomes) and are useful for delivering a large range of products, including polynucleotides. Cationic lipids can also be used for gene delivery. Such lipids include, for example, Lipofectin™, which is also known as DOTMA (N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride), DOTAP (1,2-bis(oleyloxy)-3-(trimethylammonio)propane), DDAB (dimethyldioctadecylammonium bromide), DOGS (dioctadecylamidologlycyl spermine), and cholesterol derivatives. A description of these cationic lipids can be found in EP 187,702, WO 90/11092, U.S. Patent No. 5,283,185, WO 91/15501, WO 95/26356, and U.S. Patent No. 5,527,928. Cationic lipids for gene delivery are preferably used in association with a neutral lipid, such as DOPE (dioleyl phosphatidylethanolamine; WO 90/11092). Other transfection-facilitating compounds can be added to a formulation containing cationic liposomes. A number of them are described in, e.g., WO 93/18759, WO 93/19768, WO 94/25608, and WO 95/2397. They include, e.g., spermine derivatives useful for facilitating the transport of DNA through the nuclear membrane (see, for example, WO 93/18759) and membrane-permeabilizing compounds such as GALA, Gramicidine S, and cationic bile salts (see, for example, WO 93/19768).
Gold or tungsten microparticles can also be used for gene delivery, as described in WO 91/359, WO 93/17706, and by Tang et al. (Nature 356: 152, 1992). In this case, the microparticle-coated polynucleotides can be injected via intradermal or intraepidermal routes using a needleless injection device ("gene gun"), such as those described in U.S. Patent No. 4,945,050, U.S. Patent No. 5,015,580, and WO 94/24263.
The amount of DNA to be used in a vaccine depends, e.g., on the strength of the promoter used in the DNA construct, the immunogenicity of the expressed gene product, the condition of the mammal intended for administration (e.g., the weight, age, and general health of the mammal), the mode of administration, and the type of formulation. In general, a therapeutically or prophylactically effective dose from about 1 μg to about 1 mg, preferably, from about 10 μg to about 800 μg, and, more preferably, from about 25 μg to about 250 μg, can be administered to a human adult. The administration can be achieved in a single dose or repeated at intervals.
The route of administration can be any conventional route used in the vaccine field. As general guidance, a polynucleotide of the invention can be administered via a mucosal surface, e.g., an ocular, intranasal, pulmonary, oral, intestinal, rectal, vaginal, or urinary tract surface, or via a parenteral route, e.g., by an intravenous, subcutaneous, intraperitoneal, intradermal, intraepidermal, or intramuscular route. The choice of administration route will depend on, e.g., the formulation that is selected. A polynucleotide formulated in association with bupivacaine is advantageously administered into muscle. When a neutral or anionic liposome or a cationic lipid, such as DOTMA, is used, the formulation can be advantageously administered via intravenous, intranasal (for example, by aerosolization), intramuscular, intradermal, and subcutaneous routes. A polynucleotide in a naked form can advantageously be administered
via the intramuscular, intradermal, or subcutaneous routes. Although not absolutely required, such a composition can also contain an adjuvant. A systemic adjuvant that does not require concomitant administration in order to exhibit an adjuvant effect is preferable.
The sequence information provided in the present application enables the design of specific nucleotide probes and primers that can be used in diagnostic methods. Accordingly, in a fifth aspect of the invention, there is provided a nucleotide probe or primer having a sequence found in, or derived by degeneracy of the genetic code from, a sequence shown in any of SEQ ID NOs: 1-169 (odd numbers).
The term "probe" as used in the present application refers to a DNA (preferably single stranded) or RNA molecule (or modifications or combinations thereof) that hybridizes under the stringent conditions, as defined above, to a polynucleotide molecule having a sequence homologous to any of those shown in SEQ ID NOs: 1-169 (odd numbers), or to a complementary or anti-sense sequence of any of those shown in SEQ ID NOs: 1-169 (odd numbers). Generally, probes are significantly shorter than the full-length sequences shown in SEQ ID NOs: 1-169 (odd numbers). For example, they can contain from about 5 to about 100, preferably from about 10 to about 80, nucleotides. In particular, probes have sequences that are at least 75%, preferably at least 85%, more preferably 95% homologous to a portion of a sequence as shown in any of SEQ ID NOs: 1-169 (odd numbers) or a sequence complementary to any of such sequences.
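The length and similarity figures given for probes can be illustrated with a simple ungapped comparison in Python. This is a sketch under stated assumptions: it treats "homologous" as plain percent identity over an ungapped window, which is a simplification, and the function names and example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_ungapped_match(probe: str, target: str) -> tuple:
    """Slide the probe along the target; return (offset, best identity)."""
    if len(probe) > len(target):
        raise ValueError("probe must not be longer than target")
    scored = [(percent_identity(probe, target[i:i + len(probe)]), i)
              for i in range(len(target) - len(probe) + 1)]
    # Highest identity wins; ties go to the leftmost offset.
    best_identity, offset = max(scored, key=lambda s: (s[0], -s[1]))
    return offset, best_identity


def probe_in_range(probe: str, target: str, min_identity: float = 75.0,
                   min_len: int = 5, max_len: int = 100) -> bool:
    """Apply the length range and minimum identity quoted in the text."""
    if not (min_len <= len(probe) <= max_len):
        return False
    return best_ungapped_match(probe, target)[1] >= min_identity
```

Passing a computational check of this kind says nothing about hybridization behavior, which depends on the stringency conditions actually used.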
Probes can contain modified bases, such as inosine, methyl-5-deoxycytidine, deoxyuridine, dimethylamino-5-deoxyuridine, or diamino-2,6-purine. Sugar or phosphate residues can also be modified or substituted. For example, a deoxyribose residue can be replaced by a polyamide (Nielsen et al.,
Science 254: 1497, 1991) and phosphate residues can be replaced by ester groups, such as diphosphate, alkyl, arylphosphonate, and phosphorothioate esters. In addition, the 2'-hydroxyl group on ribonucleotides can be modified by addition of, e.g., alkyl groups. Probes of the invention can be used in diagnostic tests or as capture or detection probes. Such capture probes can be immobilized on solid supports, directly or indirectly, by covalent means or by passive adsorption. A detection probe can be labeled by a detectable label, for example, a label selected from radioactive isotopes; enzymes, such as peroxidase and alkaline phosphatase; enzymes that are able to hydrolyze a chromogenic, fluorogenic, or luminescent substrate; compounds that are chromogenic, fluorogenic, or luminescent; nucleotide base analogs; and biotin.
Probes of the invention can be used in any conventional hybridization method, such as in dot blot methods (Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1982), Southern blot methods (Southern, J. Mol. Biol. 98:503, 1975), northern blot methods (identical to Southern blot methods, with the exception that RNA is used as the target), or a sandwich method (Dunn et al., Cell 12:23, 1977). As is known in the art, the latter technique involves the use of a specific capture probe and a specific detection probe that have nucleotide sequences that are at least partially different from each other.
Primers used in the invention usually contain about 10 to 40 nucleotides and are used to initiate enzymatic polymerization of DNA in an amplification process (e.g., PCR), an elongation process, or a reverse transcription method. In a diagnostic method involving PCR, the primers can be labeled.
Thus, the invention also encompasses (i) a reagent containing a probe of the invention for detecting and/or identifying the presence of Helicobacter in a biological material; (ii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which (a) a sample is recovered or derived from the biological material, (b) DNA or RNA is extracted from the material and denatured, and (c) the sample is exposed to a probe of the invention, for example, a capture probe, a detection probe, or both, under stringent hybridization conditions, so that hybridization is detected; and (iii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which (a) a sample is recovered or derived from the biological material, (b) DNA is extracted therefrom, (c) the extracted DNA is contacted with at least one, or, preferably two, primers of the invention, and amplified by the polymerase chain reaction, and (d) an amplified DNA molecule is produced.
As mentioned above, polypeptides that can be produced by expression of the polynucleotides of the invention can be used as vaccine antigens. Accordingly, a sixth aspect of the invention features a substantially purified polypeptide or polypeptide derivative having an amino acid sequence encoded by a polynucleotide of the invention. A "substantially purified polypeptide" is defined as a polypeptide that is separated from the environment in which it naturally occurs and/or a polypeptide that is free of most of the other polypeptides that are present in the environment in which it was synthesized. The polypeptides of the invention can be purified from a natural source, such as a Helicobacter strain, or can be produced using recombinant methods.
Homologous polypeptides or polypeptide derivatives encoded by polynucleotides of the invention can be screened for specific antigenicity by
testing cross-reactivity with an antiserum raised against a polypeptide having an amino acid sequence as shown in any of SEQ ID NOs:2-170 (even numbers). Briefly, a monospecific hyperimmune antiserum can be raised against a purified reference polypeptide as such or as a fusion polypeptide, for example, an expression product of MBP, GST, or His-tag systems, or a synthetic peptide predicted to be antigenic. The homologous polypeptide or derivative that is screened for specific antigenicity can be produced as such or as a fusion polypeptide. In the latter case, and if the antiserum is also raised against a fusion polypeptide, two different fusion systems are employed. Specific antigenicity can be determined using a number of methods, including Western blot (Towbin et al, Proc. Natl. Acad. Sci. USA 76:4350, 1979), dot blot, and ELISA methods, as described below.
In a Western blot assay, the product to be screened, either as a purified preparation or a total E. coli extract, is fractionated by SDS-PAGE, as described, for example, by Laemmli (Nature 227:680, 1970). After being transferred to a filter, such as a nitrocellulose membrane, the material is incubated with the monospecific hyperimmune antiserum, which is diluted in a range of dilutions from about 1:50 to about 1:5,000, preferably from about 1:100 to about 1:500. Specific antigenicity is shown once a band corresponding to the product exhibits reactivity at any of the dilutions in the range.
In an ELISA assay, the product to be screened can be used as the coating antigen. A purified preparation is preferred, but a whole cell extract can also be used. Briefly, about 100 μl of a preparation of about 10 μg protein/ml is distributed into wells of a 96-well ELISA plate. The plate is incubated for about 2 hours at 37°C, then overnight at 4°C. The plate is washed with phosphate-buffered saline (PBS) containing 0.05% Tween 20 (PBS/Tween buffer) and the wells are saturated with 250 μl PBS containing 1% bovine serum albumin (BSA), to prevent non-specific antibody binding. After 1 hour of incubation at 37°C, the plate is washed with PBS/Tween buffer. The antiserum is serially diluted in PBS/Tween buffer containing 0.5% BSA, and 100 μl of each dilution is added per well. The plate is incubated for 90 minutes at 37°C, washed, and evaluated using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when the specific antibodies used were raised in rabbits. Incubation is carried out for about 90 minutes at 37°C and the plate is washed. The reaction is developed with the appropriate substrate and the reaction is measured by colorimetry (absorbance measured spectrophotometrically). Under these experimental conditions, a positive reaction is shown once an O.D. value of 1.0 is detected with a dilution of at least about 1:50, preferably of at least about 1:500.
In a dot blot assay, a purified product is preferred, although a whole cell extract can be used. Briefly, a solution of the product at a concentration of about 100 μg/ml is serially diluted two-fold with 50 mM Tris-HCl (pH 7.5). One hundred μl of each dilution is applied to a filter, such as a 0.45 μm nitrocellulose membrane, set in a 96-well dot blot apparatus (Biorad). The buffer is removed by applying vacuum to the system. Wells are washed by addition of 50 mM Tris-HCl (pH 7.5) and the membrane is air-dried. The membrane is saturated in blocking buffer (50 mM Tris-HCl (pH 7.5), 0.15 M NaCl, 10 g/l skim milk) and incubated with an antiserum diluted from about 1:50 to about 1:5,000, preferably about 1:500. The reaction is detected using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when rabbit antibodies are used. Incubation is carried out for about 90 minutes at 37°C and the blot is washed. The reaction is developed with the appropriate substrate and stopped. The reaction is then measured visually by the appearance of a colored spot, e.g., by colorimetry. Under these experimental conditions, a positive reaction is associated with detection of a colored spot for reactions carried out with a dilution of at least about 1:50, preferably, of at least about 1:500.
Therapeutic or prophylactic efficacy of a polypeptide or polypeptide derivative of the invention can be evaluated as described below.
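The dilution arithmetic and the ELISA positivity criterion described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the O.D. readings are made-up example values, and the positivity rule is one reading of "an O.D. value of 1.0 is detected with a dilution of at least about 1:50".

```python
def twofold_series(start_ug_per_ml: float, steps: int) -> list[float]:
    """Concentrations (ug/ml) of a two-fold serial dilution, starting undiluted."""
    return [start_ug_per_ml / (2 ** i) for i in range(steps)]


def ug_per_spot(conc_ug_per_ml: float, volume_ul: float = 100.0) -> float:
    """Mass of product (ug) applied when `volume_ul` of a dilution is spotted."""
    return conc_ug_per_ml * volume_ul / 1000.0


def elisa_positive(readings: dict[int, float],
                   od_cutoff: float = 1.0,
                   min_dilution: int = 50) -> bool:
    """True if any antiserum dilution of at least 1:min_dilution reaches the cutoff.

    `readings` maps the dilution factor (50 means 1:50) to the measured O.D.
    """
    return any(od >= od_cutoff
               for dilution, od in readings.items()
               if dilution >= min_dilution)


# Dot blot: 100 ug/ml product, four two-fold dilutions, 100 ul per well.
for conc in twofold_series(100.0, 4):
    print(f"{conc:6.2f} ug/ml -> {ug_per_spot(conc):5.2f} ug per spot")

# ELISA: hypothetical O.D. readings for three antiserum dilutions.
example = {50: 1.4, 500: 0.6, 5000: 0.1}
print("positive at 1:50 or higher:", elisa_positive(example))
print("positive at 1:500 or higher:", elisa_positive(example, min_dilution=500))
```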
According to a seventh aspect of the invention, there is provided (i) a composition of matter containing a polypeptide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polypeptide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polypeptide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmannii) infection, by administering a prophylactic or therapeutic amount of a polypeptide of the invention to an individual in need of such treatment. Additionally, this aspect of the invention includes the use of a polypeptide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
The immunogenic compositions of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, pulmonary, oral, gastric, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route. The choice of the administration route depends upon a number of parameters, such as the
adjuvant used. For example, if a mucosal adjuvant is used, the intranasal or oral route will be preferred, and if a lipid formulation or an aluminum compound is used, a parenteral route will be preferred. In the latter case, the subcutaneous or intramuscular route is most preferred. The choice of administration route can also depend upon the nature of the vaccine agent. For example, a polypeptide of the invention fused to CTB or to LTB will be best administered to a mucosal surface.
A composition of the invention can contain one or several polypeptides or derivatives of the invention. It can also contain at least one additional Helicobacter antigen, such as the urease apoenzyme, or a subunit, fragment, homolog, mutant, or derivative thereof.
For use in a composition of the invention, a polypeptide or polypeptide derivative can be formulated into or with liposomes, such as neutral or anionic liposomes, microspheres, ISCOMS, or virus-like particles (VLPs), to facilitate delivery and/or enhance the immune response. These compounds are readily available to those skilled in the art; for example, see Liposomes: A Practical Approach (supra). Adjuvants other than liposomes can also be used in the invention and are well known in the art (see, for example, the list provided below).
Administration can be achieved in a single dose or repeated as necessary at appropriate intervals that can be determined by those skilled in the art. For example, a priming dose can be followed by three booster doses at weekly or monthly intervals. An appropriate dose depends on various parameters, including the nature of the recipient (e.g., whether the recipient is an adult or an infant), the particular vaccine antigen, the route and frequency of administration, the presence/absence or type of adjuvant, and the desired effect (e.g., protection and/or treatment), and can be readily determined by one skilled
in the art. In general, a vaccine antigen of the invention can be administered mucosally in an amount ranging from about 10 μg to about 500 mg, preferably from about 1 mg to about 200 mg. For a parenteral route of administration, the dose usually should not exceed about 1 mg, and is, preferably, about 100 μg. When used as components of a vaccine, the polynucleotides and polypeptides of the invention can be used sequentially as part of a multi-step immunization process. For example, a mammal can be initially primed with a vaccine vector of the invention, such as a pox virus, e.g., via a parenteral route, and then boosted twice with a polypeptide encoded by the vaccine vector, e.g., via the mucosal route. In another example, liposomes associated with a polypeptide or polypeptide derivative of the invention can be used for priming, with boosting being carried out mucosally using a soluble polypeptide or polypeptide derivative of the invention, in combination with a mucosal adjuvant (e.g., LT). Polypeptides and polypeptide derivatives of the invention can also be used as diagnostic reagents for detecting the presence of anti-Helicobacter antibodies, e.g., in blood samples. Such polypeptides can be about 5 to about 80, preferably, about 10 to about 50, amino acids in length and can be labeled or unlabeled, depending upon the diagnostic method. Diagnostic methods involving such a reagent are described below.
Upon expression of a polynucleotide molecule of the invention, a polypeptide or polypeptide derivative is produced and can be purified using known methods. For example, the polypeptide or polypeptide derivative can be produced as a fusion protein containing a fused tail that facilitates purification. The fusion product can be used to immunize a small mammal, e.g., a mouse or a rabbit, in order to raise monospecific antibodies against the polypeptide or polypeptide derivative. The eighth aspect of the invention thus provides a
monospecific antibody that binds to a polypeptide or polypeptide derivative of the invention.
By "monospecific antibody" is meant an antibody that is capable of reacting with a unique, naturally-occurring Helicobacter polypeptide. An antibody of the invention can be polyclonal or monoclonal. Monospecific antibodies can be recombinant, e.g., chimeric (e.g., consisting of a variable region of murine origin and a human constant region), humanized (e.g., a human immunoglobulin constant region and a variable region of animal, e.g., murine, origin), and/or single chain. Both polyclonal and monospecific antibodies can also be in the form of immunoglobulin fragments, e.g., F(ab')2 or Fab fragments. The antibodies of the invention can be of any isotype, e.g., IgG or IgA, and polyclonal antibodies can be of a single isotype or can contain a mixture of isotypes.
The antibodies of the invention, which can be raised against a polypeptide or polypeptide derivative of the invention, can be produced and identified using standard immunological assays, e.g., Western blot assays, dot blot assays, or ELISA (see, e.g., Coligan et al., Current Protocols in Immunology, John Wiley & Sons, Inc., New York, NY, 1994). The antibodies can be used in diagnostic methods to detect the presence of Helicobacter antigens in a sample, such as a biological sample. The antibodies can also be used in affinity chromatography methods for purifying a polypeptide or polypeptide derivative of the invention. As is discussed further below, the antibodies can also be used in prophylactic and therapeutic passive immunization methods.
Accordingly, a ninth aspect of the invention provides (i) a reagent for detecting the presence of Helicobacter in a biological sample that contains an antibody, polypeptide, or polypeptide derivative of the invention; and (ii) a diagnostic method for detecting the presence of Helicobacter in a biological sample, by contacting the biological sample with an antibody, a polypeptide, or a polypeptide derivative of the invention, so that an immune complex is formed, and detecting the complex as an indication of the presence of Helicobacter in the sample or the organism from which the sample was derived. The immune complex is formed between a component of the sample and the antibody, polypeptide, or polypeptide derivative, and any unbound material can be removed prior to detecting the complex. A polypeptide reagent can be used for detecting the presence of anti-Helicobacter antibodies in a sample, e.g., a blood sample, while an antibody of the invention can be used for screening a sample, such as a gastric extract or biopsy sample, for the presence of Helicobacter polypeptides.
For use in diagnostic methods, the reagent (e.g., the antibody, polypeptide, or polypeptide derivative of the invention) can be in a free state or can be immobilized on a solid support, such as, for example, on the interior surface of a tube or on the surface, or within pores, of a bead. Immobilization can be achieved using direct or indirect means. Direct means include passive adsorption (i.e., non-covalent binding) or covalent binding between the support and the reagent. By "indirect means" is meant that an anti-reagent compound that interacts with the reagent is first attached to the solid support. For example, if a polypeptide reagent is used, an antibody that binds to it can serve as an anti-reagent, provided that it binds to an epitope that is not involved in recognition of antibodies in biological samples. Indirect means can also employ a ligand-receptor system, for example, a molecule, such as a vitamin, can be grafted onto the polypeptide reagent and the corresponding receptor can be immobilized on the solid phase. This concept is illustrated by the well known biotin-streptavidin system. Alternatively, indirect means can be used,
e.g., by adding to the reagent a peptide tail, chemically or by genetic engineering, and immobilizing the grafted or fused product by passive adsorption or covalent linkage of the peptide tail.
According to a tenth aspect of the invention, there is provided a process for purifying from a biological sample a polypeptide or polypeptide derivative of the invention, which involves carrying out antibody-based affinity chromatography with the biological sample, wherein the antibody is a monospecific antibody of the invention.
For use in a purification process of the invention, the antibody can be polyclonal or monospecific, and preferably is of the IgG type. Purified IgGs can be prepared from an antiserum using standard methods (see, e.g., Coligan et al, supra). Conventional chromatography supports, as well as standard methods for grafting antibodies, are described, for example, by Harlow et al. (Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1988).
Briefly, a biological sample, such as an H. pylori extract, preferably in a buffer solution, is applied to a chromatography material, which is, preferably, equilibrated with the buffer used to dilute the biological sample, so that the polypeptide or polypeptide derivative of the invention (i.e., the antigen) is allowed to adsorb onto the material. The chromatography material, such as a gel or a resin coupled to an antibody of the invention, can be in batch form or in a column. The unbound components are washed off and the antigen is eluted with an appropriate elution buffer, such as a glycine buffer, a buffer containing a chaotropic agent, e.g., guanidine HCl, or a buffer having high salt concentration (e.g., 3 M MgCl2). Eluted fractions are recovered and the presence of the antigen is detected, e.g., by measuring the absorbance at 280 nm.
An antibody of the invention can be screened for therapeutic efficacy as follows. According to an eleventh aspect of the invention, there is provided (i) a composition of matter containing a monospecific antibody of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a monospecific antibody of the invention; and (iii) a method for treating or preventing Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmannii) infection, by administering a therapeutic or prophylactic amount of a monospecific antibody of the invention to an individual in need of such treatment. In addition, the eleventh aspect of the invention includes the use of a monospecific antibody of the invention in the preparation of a medicament for treating or preventing Helicobacter infection.
The monospecific antibody can be polyclonal or monoclonal, and is, preferably, predominantly of the IgA isotype. In passive immunization methods, the antibody is administered to a mucosal surface of a mammal, e.g., the gastric mucosa, e.g., orally or intragastrically, optionally, in the presence of a bicarbonate buffer. Alternatively, systemic administration, not requiring a bicarbonate buffer, can be carried out. A monospecific antibody of the invention can be administered as a single active agent or as a mixture with at least one additional monospecific antibody specific for a different Helicobacter polypeptide. The amount of antibody and the particular regimen used can be readily determined by one skilled in the art. For example, daily administration of about 100 to 1,000 mg of antibody over one week, or three doses per day of about 100 to 1,000 mg of antibody over two or three days, can be effective regimens for most purposes.
Therapeutic or prophylactic efficacy can be evaluated using standard methods in the art, e.g., by measuring induction of a mucosal immune response or induction of protective and/or therapeutic immunity using, e.g., the H. felis mouse model and the procedures described by Lee et al. (Eur. J. Gastroenterology & Hepatology 7:303, 1995) or Lee et al. (J. Infect. Dis. 172:161, 1995). Those skilled in the art will recognize that the H. felis strain of the model can be replaced with another Helicobacter strain. For example, the efficacy of polynucleotide molecules and polypeptides from H. pylori is, preferably, evaluated in a mouse model using an H. pylori strain. Protection can be determined by comparing the degree of Helicobacter infection in the gastric tissue assessed by, for example, urease activity, bacterial counts, or gastritis, to that of a control group. Protection is shown when infection is reduced by comparison to the control group. Such an evaluation can be made for polynucleotides, vaccine vectors, polypeptides, and polypeptide derivatives, as well as for antibodies of the invention.
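The comparison to a control group described above can be sketched as a simple percent-reduction calculation. This is a minimal illustration, assuming that the infection readout (for example, urease activity or bacterial counts) is summarized by its group mean; the function names and the example values are hypothetical, and the text does not prescribe a particular statistical test.

```python
from statistics import mean


def percent_reduction(treated: list[float], control: list[float]) -> float:
    """Percent reduction of the mean infection readout relative to the control group."""
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control group mean must be positive")
    return 100.0 * (control_mean - mean(treated)) / control_mean


def shows_protection(treated: list[float], control: list[float]) -> bool:
    """Assumed criterion: any reduction of the mean readout versus control."""
    return percent_reduction(treated, control) > 0.0


# Hypothetical urease activity readings (arbitrary units) per mouse.
treated_group = [0.25, 0.75]
control_group = [1.5, 2.5]
print(f"reduction: {percent_reduction(treated_group, control_group):.1f}%")
print("protection shown:", shows_protection(treated_group, control_group))
```

In practice, a significance test on the two groups would normally accompany such a comparison; the choice of test is left open here.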
For example, various doses of an antibody of the invention can be administered to the gastric mucosa of mice previously challenged with an H. pylori strain as described, e.g., by Lee et al. (supra). Then, after an appropriate period of time, the bacterial load of the mucosa can be estimated by assessing urease activity, as compared to a control. Reduced urease activity indicates that the antibody is therapeutically effective. Adjuvants that can be used in any of the vaccine compositions described above are described as follows. Adjuvants for parenteral administration include, for example, aluminum compounds, such as aluminum hydroxide, aluminum phosphate, and aluminum hydroxy phosphate. The antigen can be precipitated with, or adsorbed onto, the aluminum compound using standard methods. Other adjuvants, such as RIBI (ImmunoChem, Hamilton, MT), can also be used in parenteral administration.
Adjuvants that can be used for mucosal administration include, for example, bacterial toxins, e.g., the cholera toxin (CT), the E. coli heat-labile toxin (LT), the Clostridium difficile toxin A, the pertussis toxin (PT), and combinations, subunits, toxoids, or mutants thereof. For example, a purified preparation of native cholera toxin subunit B (CTB) can be used. Fragments, homologs, derivatives, and fusions to any of these toxins can also be used, provided that they retain adjuvant activity. Preferably, a mutant having reduced toxicity is used. Suitable mutants are described, e.g., in WO 95/17211 (Arg-7-Lys CT mutant), WO 96/6627 (Arg-192-Gly LT mutant), and WO 95/34323 (Arg-9-Lys and Glu-129-Gly PT mutant). Additional LT mutants that can be used in the methods and compositions of the invention include, e.g., Ser-63-Lys, Ala-69-Gly, Glu-110-Asp, and Glu-112-Asp mutants. Other adjuvants, such as the bacterial monophosphoryl lipid A (MPLA) of, e.g., E. coli, Salmonella minnesota, Salmonella typhimurium, or Shigella flexneri; saponins; and polylactide glycolide (PLGA) microspheres, can also be used in mucosal administration. Adjuvants useful for both mucosal and parenteral administration, such as polyphosphazene (WO 95/2415), can also be used.
Any pharmaceutical composition of the invention, containing a polynucleotide, polypeptide, polypeptide derivative, or antibody of the invention, can be manufactured using standard methods. It can be formulated with a pharmaceutically acceptable diluent or carrier, e.g., water or a saline solution, such as phosphate buffered saline, optionally, including a bicarbonate salt, such as sodium bicarbonate, e.g., 0.1 to 0.5 M. Bicarbonate can advantageously be added to compositions intended for oral or intragastric administration. In general, a diluent or carrier can be selected on the basis of the mode and route of administration, and standard pharmaceutical practice. Suitable pharmaceutical carriers and diluents, as well as pharmaceutical necessities for their use in pharmaceutical formulations, are described in Remington's Pharmaceutical Sciences, a standard reference text in this field, and in the USP/NF.
The invention also includes methods in which gastroduodenal infections, such as Helicobacter infection, are treated by oral administration of a Helicobacter polypeptide of the invention and a mucosal adjuvant, in combination with an antibiotic, an antisecretory agent, a bismuth salt, an antacid, sucralfate, or a combination thereof. Examples of such compounds that can be administered with the vaccine antigen and an adjuvant are antibiotics, including, e.g., macrolides, tetracyclines, β-lactams, aminoglycosides, quinolones, penicillins, and derivatives thereof (specific examples of antibiotics that can be used in the invention include, e.g., amoxicillin, clarithromycin, tetracycline, metronidazole, erythromycin, and cefuroxime); antisecretory agents, including, e.g., H2-receptor antagonists (e.g., cimetidine, ranitidine, famotidine, nizatidine, and roxatidine), proton pump inhibitors (e.g., omeprazole, lansoprazole, and pantoprazole), prostaglandin analogs (e.g., misoprostol and enprostil), and anticholinergic agents (e.g., pirenzepine, telenzepine, carbenoxolone, and proglumide); and bismuth salts, including colloidal bismuth subcitrate, tripotassium dicitrate bismuthate, bismuth subsalicylate, bicitropeptide, and pepto-bismol (see, e.g., Goodwin et al., Helicobacter pylori, Biology and Clinical Practice, CRC Press, Boca Raton, FL, pp 366-395, 1993; Physicians' Desk Reference, 49th edn., Medical Economics Data Production Company, Montvale, New Jersey, 1995). In addition, compounds containing more than one of the above-listed components coupled together, e.g., ranitidine coupled to bismuth subcitrate, can be used. The invention also includes compositions for carrying out these methods, i.e., compositions containing a Helicobacter
antigen (or antigens) of the invention, an adjuvant, and one or more of the above-listed compounds, in a pharmaceutically acceptable carrier or diluent. Amounts of the above-listed compounds used in the methods and compositions of the invention can readily be determined by one skilled in the art. In addition, one skilled in the art can readily design treatment/immunization schedules. For example, the non-vaccine components can be administered on days 1-14, and the vaccine antigen + adjuvant can be administered on days 7, 14, 21, and 28.
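The example schedule above (non-vaccine components on days 1-14, vaccine antigen plus adjuvant on days 7, 14, 21, and 28) can be laid out as a day-by-day calendar. The sketch below is illustrative only; the function name and the labels are hypothetical, and the default days simply restate the example in the text.

```python
def build_schedule(drug_days=range(1, 15),
                   vaccine_days=(7, 14, 21, 28)) -> dict[int, list[str]]:
    """Map each treatment day to the components administered on that day."""
    drug_set, vaccine_set = set(drug_days), set(vaccine_days)
    schedule: dict[int, list[str]] = {}
    for day in sorted(drug_set | vaccine_set):
        items = []
        if day in drug_set:
            items.append("non-vaccine components")
        if day in vaccine_set:
            items.append("vaccine antigen + adjuvant")
        schedule[day] = items
    return schedule


schedule = build_schedule()
for day, items in schedule.items():
    print(f"day {day:2d}: {', '.join(items)}")
```

With the default arguments, days 7 and 14 carry both the non-vaccine components and the vaccine, while days 21 and 28 carry the vaccine alone.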
Methods and pharmaceutical compositions of the invention can be used to treat or to prevent Helicobacter infections and, accordingly, gastroduodenal diseases associated with these infections, including acute, chronic, and atrophic gastritis, and peptic ulcer diseases, e.g., gastric and duodenal ulcers.
The clones of the invention were originally isolated by a transposon shuttle mutagenesis method. Briefly, in this method, a TnMax9 mini-blaM transposon was used for insertional mutagenesis of an H. pylori gene library established in E. coli. 192 E. coli clones expressing active β-lactamase fusion proteins were obtained, indicating that the corresponding target plasmids carry H. pylori genes encoding extracytoplasmic proteins. Individual mutants were transferred onto the chromosome of H. pylori P1 or P12 by natural transformation, resulting in 135 distinct H. pylori mutants. This method is described in further detail, as follows.
The transposon TnMax9 (Kahrs et al., Gene 167:53, 1995) was used to generate mutations in an H. pylori library in E. coli. As illustrated in Fig. 1A, TnMax9 contains, in addition to a catGC-resistance gene close to the inverted repeat (IR), an unexpressed open reading frame encoding β-lactamase without a promoter or signal sequence (mature β-lactamase, blaM; Kahrs et al., supra). For production of extracytoplasmic BlaM fusion proteins resulting in ampicillin-resistant (ampR) clones, expression of the cloned H. pylori genes in E. coli is obligatory. The minimal vector pMin2 (Kahrs et al., supra; see Fig. 1B), containing a weak constitutive promoter (Piga) upstream of the multiple cloning site, was used for construction of the H. pylori library to ensure expression of H. pylori genes in E. coli.
In construction of the library, H. pylori DNA was partially digested with Sau3A and HpaII, size fractionated by preparative agarose gel electrophoresis, and 3-6 kilobase fragments were ligated into the BglII and ClaI sites of pMin2. The library was introduced into E. coli strain E181(pTnMax9), which is a derivative of HB101 containing the TnMax9 transposon, by electroporation. This generated approximately 2,400 independent transformants. More than 95% of the plasmids contained an insert of between 3 and 6 kilobases, showing that the 1.7 megabase H. pylori chromosome was statistically covered. Since not every plasmid could be expected to contain a target gene carrying an export signal, the library was partitioned into a total of 198 pools (24 pools of 20 clones and 174 pools of 11 clones). Using a cotton swab, either eleven or twenty individual colonies were inoculated in 0.5 ml LB medium in eppendorf tubes, vortexed, and 100 μl of the suspension was spread on LB agar plates supplemented with tetracycline and chloramphenicol to select for maintenance of both plasmids. Insertion of TnMax9 into the target plasmids was induced with 100 mM isopropyl-β-D-thiogalactoside (IPTG) separately for each pool (Haas et al., Gene 130:23-21, 1993). Plasmids were transferred into E145 by triparental mating, in which 25 μl of the donor strain (E181), 25 μl of the mobilisator (HB101(pRK2013)), and 50 μl of the recipient strain (E145) were mixed from corresponding bacterial suspensions (O.D.550 = 10). The matings were performed for 2-3 hours at 37°C on nitrocellulose filters, which were placed on LB plates. Bacteria were suspended in 1 ml LB and aliquots were spread on LB plates containing chloramphenicol, tetracycline, and rifampicin. Each pool gave rise to chloramphenicol-resistant transconjugates in E145, demonstrating that both transposition and conjugation were successful. Generally, several thousand chloramphenicol-resistant transconjugates were obtained, but the number of ampR colonies varied in different pools, ranging from one to several hundred colonies. Two ampR colonies from each positive pool were isolated, plasmid DNA was extracted, and the DNA was characterized by further restriction analysis. Only those TnMax9 insertions of a single pool that mapped in obviously different plasmid clones, or in markedly different regions of the same clone, were used further.
From 158 of the 198 pools, ampicillin-resistant E145 transconjugates were obtained (80%), showing that in several pools, TnMax9 inserted into expressed genes, resulting in production of extracytoplasmic BlaM fusion proteins. Thus, a total of 192 ampR E145 clones could be isolated by conjugal transfer of plasmids from 198 pools.
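The figures quoted for the library can be checked with a short calculation. The pool arithmetic and the percentage of positive pools follow directly from the numbers in the text. The coverage estimate is an added illustration: it uses the standard Clarke-Carbon relation P = 1 - (1 - f)^N, which the text does not name, and it assumes a mean insert size of 4.5 kilobases (the midpoint of the 3-6 kilobase range). The function names are hypothetical.

```python
def total_clones(pools: list[tuple[int, int]]) -> int:
    """Total clones given (number_of_pools, clones_per_pool) pairs."""
    return sum(n_pools * size for n_pools, size in pools)


def coverage_probability(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Clarke-Carbon estimate of the probability that a given sequence
    is present in a library of `n_clones` random inserts."""
    fraction = insert_kb / genome_kb
    return 1.0 - (1.0 - fraction) ** n_clones


clones = total_clones([(24, 20), (174, 11)])
positive_pools_pct = 100.0 * 158 / 198
# Assumption: 4.5 kb mean insert; 1.7 Mb chromosome as stated in the text.
coverage = coverage_probability(2400, insert_kb=4.5, genome_kb=1700.0)

print(f"clones distributed over 198 pools: {clones}")
print(f"pools yielding ampR transconjugates: {positive_pools_pct:.0f}%")
print(f"estimated probability that a given locus is represented: {coverage:.3f}")
```

The pooled clone count (2,394) agrees with the approximately 2,400 independent transformants reported, and the estimate is consistent with the statement that the chromosome was statistically covered.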
To analyze the mutant library, it was determined whether defined gene sequences inactivated by TnMax9 were represented once or several times in the whole library. Five transposon-containing plasmids conferring an ampR phenotype to E145 (pMu7, pMu13, pMu75, pMu94, and pMu110) were randomly selected and DNA fragments flanking the TnMax9 insert were isolated and used as probes in Southern hybridization of 120 ampR clones. The hybridization probes isolated from clones pMu7, pMu75, and pMu94 were between 0.9 and 1.1 kilobases in size, and hybridized exclusively with the inserts of the homologous plasmids. In contrast, the flanking regions of clones pMu13 and pMu110 were 4.0 and 5.5 kilobases, respectively. They each hybridized with the homologous plasmids, and with one additional clone of the library. Such a result was expected, since the longer the hybridization probe, the higher the chance that it finds a homologous sequence in the library.
In order to verify the insertion of the transposon into distinct ORFs encoding putative exported proteins, the DNA of five representative ampR mutant clones (pMu7, pMu12, pMu18, pMu20, and pMu26) was sequenced, taking advantage of the M13 forward and reverse primers on TnMax9 (Fig. 1A). This analysis revealed that the mini-transposon was inserted into different sequences in each plasmid, thereby interrupting ORFs encoding putative proteins. For two clones, the sequences located upstream of the blaM gene revealed a putative ribosome-binding site and a potential translational start codon (ATG). Other clones either revealed an ORF spanning the complete sequence (approximately 400 base pairs upstream and downstream of the TnMax9 insertion) or terminating shortly after the site of insertion. The partial protein sequences from different ORFs were used for database searches, but no significant homologies with known proteins were found.
In a further approach, it was determined whether a known gene, like vacA, encoding the extracellular vacuolating cytotoxin of H. pylori, could be identified using this method and how often such a mutation would be represented in the mutant library. Total cell lysates of the 135 mutants were tested in an immunoblot using the H. pylori cytotoxin-specific rabbit antiserum AK197 (Schmitt et al, Mol. Microbiol. 12:307-319, 1994). Two mutants were identified that no longer produced the cytotoxin antigen (mutants P1-26 and P1-47) and partial DNA sequencing of the insertion sites revealed that TnMax9 was inserted at distinct positions in the vacA gene, 56 and 53 codons downstream of the ATG start codon.
Thus, the characterization of the mutant collection confirmed that a representative gene library was constructed in E. coli, in which target genes encoding exported H. pylori proteins were efficiently tagged by TnMax9.
In order to establish a collection of mutants lacking distinct exported proteins, the mutations had to be transferred back into the H. pylori chromosome. By means of natural transformation, 86 plasmids could be transformed into the original strain P1. H. pylori strains P1 or P12, which were naturally competent for DNA transformation, were transformed with circular plasmid DNA (0.2-0.5 mg/transformation). Transformations to streptomycin resistance were performed with chromosomal DNA (1 mg/transformation), isolated from a streptomycin-resistant NCTC11637 H. pylori mutant according to the procedure described in Haas et al. (Mol. Microbiol. 8:753-760). Selection was performed on serum plates containing 4 mg/ml chloramphenicol or 500 mg/ml streptomycin. The transformation frequency for a given mutant was calculated as the number of chloramphenicol-, streptomycin-, or erythromycin-resistant colonies per cfu (average of three experiments). In those plasmids that did not transform strain P1 directly, the blaM gene was deleted by NotI digestion and the plasmid was religated. This procedure, which gave a twenty- to thirty-fold higher frequency of transformation as compared to the same plasmid containing blaM, resulted in 36 additional mutant P1 strains. The blaM-deletion plasmids that still did not transform strain P1 were used to transform the heterologous H. pylori strain P12, which has an approximately 10-fold higher transformation frequency than P1. This resulted in thirteen further mutants.
Thus, from the 192 ampR plasmids, a total of 135 H. pylori mutants (122 mutants in P1 and 13 mutants in P12) were finally obtained by selection for chloramphenicol resistance (70%). The transformation frequency varied between different plasmids in the range of 1 x 10^-5 to 1 x 10^-7. The remaining plasmids did not result in any transformants. The collection was frozen as individual mutants in stock cultures at -70°C. To verify the correct insertion of the mini-transposon into the H. pylori chromosome, ten representative mutants were tested by Southern hybridization of chromosomal DNA using catGC DNA and the vector pMin2 as probes. Consistent with our previous experience concerning TnMax9-based shuttle mutagenesis of H. pylori, the mini-transposon was, in all cases, inserted into the chromosome without integration of the vector DNA, which probably means by a double cross-over, rather than by a single cross-over event. As judged from the hybridization pattern obtained with the cat gene as a probe, it appears that TnMax9 is located in different regions of the chromosome, showing that distinct target genes have been interrupted in individual mutants.
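The counts quoted for the library and the mutant collection can be cross-checked with a few lines of arithmetic. The following is only a bookkeeping sketch: every number is taken from the text above, and the variable names are illustrative.

```python
# Bookkeeping sketch for the mutant library; all counts are quoted from the text.
pools = {20: 24, 11: 174}          # clones per pool -> number of pools
total_pools = sum(pools.values())
total_clones = sum(size * count for size, count in pools.items())

amp_positive_pools = 158           # pools yielding ampR E145 transconjugates
amp_plasmids = 192                 # ampR clones isolated in total

# H. pylori mutants obtained in three successive rounds of transformation
mutants = {
    "P1, direct transformation": 86,
    "P1, after blaM deletion": 36,
    "P12, after blaM deletion": 13,
}
total_mutants = sum(mutants.values())

pool_yield = round(100 * amp_positive_pools / total_pools)
mutant_yield = round(100 * total_mutants / amp_plasmids)

print(f"{total_pools} pools, {total_clones} pooled clones")
print(f"{pool_yield}% of pools gave ampR transconjugates")
print(f"{total_mutants} mutants from {amp_plasmids} plasmids ({mutant_yield}%)")
```

The pooled clone count (2,394) agrees with the approximately 2,400 independent transformants reported, and the two percentages reproduce the 80% and 70% figures given in the text.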
The mutants were analyzed for motility, transformation competence, and adherence to KatoIII cells. Screening of the H. pylori mutant collection allowed identification of mutants impaired in motility, natural transformation competence, and adherence to gastric epithelial cell lines. Motility mutants could be grouped into distinct classes: (i) mutants lacking the major flagellin subunit FlaA and intact flagella; (ii) mutants with apparently normal flagella, but reduced motility; and (iii) mutants with obviously normal flagella, but completely abolished motility. Two independent mutations, which exhibited defects in natural competence for genetic transformation, mapped to different genetic loci. In addition, two independent mutants were isolated by their failure to bind to the human gastric carcinoma cell line KatoIII. Both mutants carried a transposon in the same gene, approximately 0.8 kilobases apart, and showed decreased autoagglutination when compared to the wild type strain.
Sequences of clones obtained using the above-described transposon shuttle mutagenesis method were used to identify intact genes, lacking inserted transposons, in the H. pylori genome, as is described below in Example 5.
The invention is further illustrated by the following examples. Example 1 describes identification of genes, such as genes that encode the polypeptides of the invention, in the Helicobacter genome, as well as identification of signal sequences and primer design for amplification of genes lacking signal sequences. Example 2 describes cloning of DNA encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13 into a vector that provides a histidine tag, and production and purification of the resulting his-tagged fusion proteins. Example 3 describes methods for cloning DNA encoding the polypeptides of the invention so that they can be produced without his-tags, and Example 4 describes methods for purifying recombinantly produced polypeptides of the invention. Example 5 describes methods for obtaining the nucleic acids of the invention from the deposited clones. Example 6 describes purification of recombinant H. pylori antigen GHPO 1190.
EXAMPLE 1: Identification of genes in the H. pylori genome, identification of signal sequences, and primer design for amplification of genes lacking signal sequences
1.A. Creating H. pylori genomic databases
The H. pylori genome was provided as a text file containing a single contiguous string of nucleotides that had been determined to be 1.76 Megabases in length. The complete genome was split into 17 separate files using the program SPLIT (Creativity in Action), giving rise to 16 contigs, each containing 100,000 nucleotides, and a 17th contig containing the remaining 76,000 nucleotides. A header was added to each of the 17 files using the format: >hpg0.txt (representing contig 1), >hpg1.txt (representing contig 2), etc. The resulting 17 files, named hpg0 through hpg16, were then copied together to form one file that represented the plus strand of the complete H. pylori genome. The constructed database was given the designation "H." A negative strand database of the H. pylori genome was created similarly by first creating a reverse complement of the positive strand using the program SeqPup (D.G. Gilbert, Indiana University Biology Department) and then performing the same procedure as described above for the plus strand. This database was given the designation "N."
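The database construction described above can be sketched in a few lines. This is an illustration only, not the SPLIT or SeqPup programs themselves: the function names, the 60-character line width, and the toy sequence are assumptions, while the 100,000-nucleotide contig size and the ">hpgN.txt" header format follow the text.

```python
# Sketch of the "H" (plus strand) and "N" (minus strand) database construction.
# Function names and the toy input are hypothetical; only the contig size and
# the header format come from the text.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def split_into_contigs(genome, size=100_000, prefix="hpg"):
    """Cut a genome string into consecutive contigs with >hpgN.txt headers."""
    contigs = []
    for index, start in enumerate(range(0, len(genome), size)):
        header = f">{prefix}{index}.txt"
        contigs.append((header, genome[start:start + size]))
    return contigs

def to_fasta(contigs, width=60):
    """Join (header, sequence) pairs into one FASTA-style text database."""
    lines = []
    for header, seq in contigs:
        lines.append(header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

# Toy genome of 25 nucleotides, cut into contigs of 10 for demonstration.
toy_genome = "ACGT" * 6 + "A"
plus_db = to_fasta(split_into_contigs(toy_genome, size=10))
minus_db = to_fasta(split_into_contigs(reverse_complement(toy_genome), size=10))
print(plus_db)
```

With the default size of 100,000, the last contig simply holds whatever remains after the full-length contigs have been cut.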
The regions predicted to encode open reading frames (ORFs) were defined for the complete H. pylori genome using the program GENEMARK™ (Borodovsky et al, Comp. Chem. 17:123, 1993). A database was created from a text file containing an annotated version of all ORFs predicted to be encoded by the H. pylori genome for both the plus and minus strands, and was given the designation "O." Each ORF was assigned a number indicating its location on the genome and its position relative to other genes. No manipulation of the text file was required.
1.B. Searching the H. pylori databases
The databases constructed as described above were searched using the program FASTA (Pearson et al, Proc. Natl. Acad. Sci. USA 85:2444-2448, 1988). FASTA was used for searching a DNA sequence against either of the gene databases ("H" and/or "N"), or a peptide sequence against the ORF library ("O"). TFASTX was used to search a peptide sequence against all possible reading frames of a DNA database ("H" and/or "N" libraries). FASTX, which also resolves potential frameshifts, was used for searching the translated reading frames of a DNA sequence against either a DNA database, or a peptide sequence against the protein database.
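Searching a peptide against "all possible reading frames" of a DNA database presupposes translation of the DNA in six frames, three on each strand. The sketch below shows only that translation step, using the standard genetic code; it is not the TFASTX or FASTX implementation, it does no alignment or frameshift handling, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_translation(dna):
    """Return the translations of frames +1..+3 and -1..-3 of a DNA string."""
    dna = dna.upper()
    minus = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(minus[offset:])
    return frames

for frame, peptide in six_frame_translation("ATGGCCTAA").items():
    print(frame, peptide)
```

A stop codon is reported as "*", so an open reading frame appears as a stretch of residues between two asterisks in one of the six frames.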
1.C. Isolation of DNA sequences from the H. pylori genome
The FASTA searches against the constructed DNA databases identified exact nucleotide coordinates on one or more of the isolated contigs, and therefore the location of the target DNA. Once the exact location of the target sequence was known, the contig identified to carry the gene was exported into the software package MapDraw (DNAStar, Inc.) and the gene was isolated. Gene sequences with flanking DNA were then excised and copied into the EditSeq software package (DNAStar, Inc.) for further analysis.
1.D. Identification of signal sequences
The deduced protein encoded by a target gene sequence was analyzed using the PROTEAN software package (DNAStar, Inc.). This analysis predicts those areas of the protein that are hydrophobic by using the Kyte-Doolittle algorithm, and identifies any potential polar residues preceding the hydrophobic core region, an arrangement that is typical for many signal sequences. For confirmation, the target protein was then searched against a PROSITE database (DNAStar, Inc.) consisting of motifs and signatures. The identification of predicted prokaryotic lipid attachment sites is characteristic of many signal sequences, and of hydrophobic regions in general. Where the two approaches agreed at the N-terminus of a protein, putative cleavage sites were sought. Specifically, this includes the presence of either an Alanine (A), Serine (S), or Glycine (G) residue immediately after the core hydrophobic region. In the case of lipoproteins, a Cysteine (C) residue would be identified as the +1 residue, post-cleavage.
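The hydrophobicity analysis described above can be illustrated with a sliding-window average over the published Kyte-Doolittle hydropathy scale. This is a minimal sketch, not the PROTEAN implementation: the window length, the function names, and the toy peptide are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(protein, window=9):
    """Mean hydropathy of every window of the given length along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def most_hydrophobic_window(protein, window=9):
    """Return (start index, mean hydropathy) of the most hydrophobic window."""
    profile = hydropathy_profile(protein, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]

def residue_after_core(protein, start, window):
    """Residue immediately following the hydrophobic core, or None at the end."""
    end = start + window
    return protein[end] if end < len(protein) else None

# Hypothetical N-terminus: polar residues, a hydrophobic core, then Ala.
toy = "MKKKLLLLLLLASQDE"
start, mean = most_hydrophobic_window(toy, window=7)
print(start, round(mean, 2), residue_after_core(toy, start, 7))
```

For the toy peptide, the core window is followed by an Alanine, which is one of the residues (A, S, or G) that the text associates with a putative cleavage site; a Cysteine at that position would instead suggest a lipoprotein.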
1.E. Rational design of PCR primers based on the identification of signal sequences
To clone gene sequences as N-terminal translational fusions for the generation of recombinant proteins with N-terminal Histidine tags, the gene sequence that specifies the signal sequence is omitted. The 5'-end of the gene-specific portion of the N-terminal primer is designed to start at the first codon beyond the cleavage site. In the case of lipoproteins, the 5'-end of the N-terminal primer begins at the second codon, immediately after the modifiable residue at position +1 post-cleavage. The omission of the signal sequence from the recombinant allows for one-step purification, and potential problems associated with insertion of signal sequences in the membrane of the host strain carrying the hybrid construct are avoided.
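The primer design rule can be sketched as follows. The GCC clamp and the BamHI and XhoI recognition sequences are those used for the primers listed in Example 2; the function names, the length of the gene-specific portion, and the toy gene are hypothetical.

```python
# Sketch of the primer design rule; the gene and parameter names are illustrative.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def n_terminal_primer(gene, signal_codons, lipoprotein=False,
                      clamp="GCC", site="GGATCC", length=21):
    """Clamp + restriction site + gene-specific part starting after the signal.

    signal_codons is the number of codons up to and including the cleavage
    site. For lipoproteins the +1 codon (Cys) is skipped as well.
    """
    start = 3 * (signal_codons + (1 if lipoprotein else 0))
    return clamp + site + gene[start:start + length]

def c_terminal_primer(gene, clamp="GCC", site="CTCGAG", length=21):
    """Clamp + restriction site + reverse complement of the 3' end of the gene."""
    return clamp + site + reverse_complement(gene[-length:])

# Hypothetical gene: ATG AAA CTG CTG GCT | TGT GGT AGC CAT TAA
toy_gene = "ATGAAACTGCTGGCTTGTGGTAGCCATTAA"
print(n_terminal_primer(toy_gene, signal_codons=5, length=9))
print(n_terminal_primer(toy_gene, signal_codons=5, lipoprotein=True, length=9))
print(c_terminal_primer(toy_gene, length=9))
```

In practice the length of the gene-specific portion would be chosen to give a suitable annealing temperature; the fixed length used here is a simplification.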
EXAMPLE 2: Preparation of isolated DNA encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13, and production of these polypeptides as histidine-tagged fusion proteins
2.A. Preparation of genomic DNA from Helicobacter pylori
Helicobacter pylori strain ORV2001, stored in LB medium containing 50% glycerol at -70 °C, is grown on Columbia agar containing 7% sheep blood for 48 hours under microaerophilic conditions (8-10% CO2, 5-7% O2, 85-87% N2). Cells are harvested, washed with phosphate buffered saline (PBS) (pH 7.2), and DNA is then extracted from the cells using the Rapid Prep Genomic DNA Isolation kit (Pharmacia Biotech).
2.B. PCR amplification
DNA molecules encoding GHPO 732, GHPO 419, GHPO 1398, GHPO 706, GHPO 1190, GHPO 986, GHPO 1420, GHPO 1299, and GHPO 13 are amplified from genomic DNA, prepared as described above, by the Polymerase Chain Reaction (PCR) using the following primers:
GHPO 732 (HPO 64):
N-terminal primer: 5'-GCCGGATCCATGACTTATGGGTATGGGGAA-3' (SEQ ID NO: 171); and
C-terminal primer: 5'-GCCCTCGAGACTTTTATTGATTCACCATTTCATT-3' (SEQ ID NO: 172).
GHPO 419 (HPO 54):
N-terminal primer: 5'-GCCGGATCCATCGCTGAAGAAAATGGGGCG-3' (SEQ ID NO: 173); and
C-terminal primer: 5'-GCCCGGCCGCCCTAAAAACTATAAACATAACTC-3' (SEQ ID NO: 174).
GHPO 1398 (HPO 15):
N-terminal primer: 5'-GCCGGATCCGGTATTAGGAAGCTTATACCATC-3' (SEQ ID NO: 175); and
C-terminal primer: 5'-GCCCTCGAGAAGTTCTATTTTTAATTCCTTGAGAG-3' (SEQ ID NO: 176).
GHPO 706 (HPO 50):
N-terminal primer: 5'-GCCGGATCCTCTGATAGCCATAAAGAAAAAAAGGAC-3' (SEQ ID NO: 177); and
C-terminal primer: 5'-GCCCTCGAGATCTTTAGAAATCAACCCCCAAAGC-3' (SEQ ID NO: 178).
GHPO 1190 (HPO 76):
N-terminal primer: 5'-GCCGGATCCGACTTAGAACATTTTAACACGCTC-3' (SEQ ID NO: 179); and
C-terminal primer: 5'-GCCCTCGAGTCATTTTAAACGACTCAAAACAAA-3' (SEQ ID NO: 180).
GHPO 986:
N-terminal primer: 5'-GCCGGATCCGGCCAAAGCGTGCGCACTTATTGG-3' (SEQ ID NO: 181); and
C-terminal primer: 5'-GCCCTCGAGTTATTGTTCCAACCCCCACGCATC-3' (SEQ ID NO: 182).
GHPO 1420:
N-terminal primer: 5'-GCCGGATCCAAGAGCAATGCTGATGACAAACC-3' (SEQ ID NO: 183); and
C-terminal primer: 5'-GCCCTCGAGTTATGAGTTAAAGCCCCTTGTCC-3' (SEQ ID NO: 184).
GHPO 1299:
N-terminal primer: 5'-GCCGGATCCGAATCAGTAAAAACAGGAAAAAC-3' (SEQ ID NO: 185); and
C-terminal primer: 5'-GCCCTCGAGCGGCTCTTTGGAGTTTTATTG-3' (SEQ ID NO: 186).
GHPO 13:
N-terminal primer: 5'-GCCGGATCCATCATTCCCTCTCGCTCTATGG-3' (SEQ ID NO: 187); and
C-terminal primer: 5'-GCCCTCGAGACCTTAATGCGTTGCGTTTTCTTT-3' (SEQ ID NO: 188).
The N-terminal and C-terminal primers for each clone both include a 5' clamp and a restriction enzyme recognition sequence for cloning purposes (BamHI (GGATCC) and XhoI (CTCGAG) or NotI (CGGCCG) recognition sequences). The N-terminal primer is designed so that the amplified product does not encode the signal sequence and the potential cleavage site.
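The layout of the primers (a 5' clamp, then a six-base recognition sequence, then the gene-specific portion) can be checked mechanically. The sketch below parses three of the primer pairs listed above; the helper name is illustrative, the GCC clamp is read off the listed sequences, and the enzyme labels simply repeat those given in the text.

```python
# Recognition sequences and enzyme names as stated in the text.
SITES = {"GGATCC": "BamHI", "CTCGAG": "XhoI", "CGGCCG": "NotI"}

# Subset of the primers listed above: (N-terminal, C-terminal).
PRIMERS = {
    "GHPO 732": ("GCCGGATCCATGACTTATGGGTATGGGGAA",
                 "GCCCTCGAGACTTTTATTGATTCACCATTTCATT"),
    "GHPO 419": ("GCCGGATCCATCGCTGAAGAAAATGGGGCG",
                 "GCCCGGCCGCCCTAAAAACTATAAACATAACTC"),
    "GHPO 1190": ("GCCGGATCCGACTTAGAACATTTTAACACGCTC",
                  "GCCCTCGAGTCATTTTAAACGACTCAAAACAAA"),
}

def parse_primer(primer, clamp="GCC"):
    """Split a primer into (enzyme name, gene-specific portion)."""
    if not primer.startswith(clamp):
        raise ValueError(f"primer does not start with the {clamp} clamp")
    site = primer[len(clamp):len(clamp) + 6]
    if site not in SITES:
        raise ValueError(f"unrecognized site {site}")
    return SITES[site], primer[len(clamp) + 6:]

for clone, (forward, reverse) in PRIMERS.items():
    f_enzyme, f_specific = parse_primer(forward)
    r_enzyme, r_specific = parse_primer(reverse)
    print(f"{clone}: {f_enzyme}/{r_enzyme}, "
          f"gene-specific lengths {len(f_specific)} and {len(r_specific)}")
```

The enzyme pair reported for each clone is the pair that would be used to digest both the PCR product and the vector before ligation.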
Amplification of gene-specific DNA is carried out using Pwo DNA Polymerase (Boehringer Mannheim), which is a proof-reading polymerase, according to general guidance provided by the manufacturer. Because of the exonuclease activity of the polymerase, two reaction mixtures (mixtures 1 and 2) are first prepared separately and combined just prior to amplification. These mixtures are as follows:

Ingredient (final conc.)       Mixture 1 (μl)   Mixture 2 (μl)
distilled H2O                  160              79
dNTPs (200 μM each)            40               -
10x PCR buffer                 -                20
primers (100 nM each)          1                -
DNA template (200 ng),         2                -
  as obtained in 5.A.

(10x PCR buffer contains 100 mM Tris-HCl (pH 8.85), 250 mM KCl, 50 mM (NH4)2SO4, 20 mM MgSO4)

Amplification is carried out as follows:

Cycling conditions             Temp (°C)   Time (min)   Number of cycles
Initial denaturing step        96          4            1
Denaturing step                94          0.5          20
Annealing step                 50          1            20
Extension step                 72          1            20
Final extension step           72          5            1
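As a quick check on the program tabulated above, the total programmed hold time can be added up. This sketch counts only the hold times stated in the table; ramping between temperatures depends on the thermocycler and is not included.

```python
# (step, temperature in deg C, minutes per cycle, number of cycles), from the table
PROGRAM = [
    ("initial denaturing", 96, 4.0, 1),
    ("denaturing", 94, 0.5, 20),
    ("annealing", 50, 1.0, 20),
    ("extension", 72, 1.0, 20),
    ("final extension", 72, 5.0, 1),
]

def total_hold_minutes(program):
    """Sum of hold time multiplied by cycle count over all steps."""
    return sum(minutes * cycles for _, _, minutes, cycles in program)

print(total_hold_minutes(PROGRAM))  # 59.0
```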
2.C. Transformation and selection of transformants
A single PCR product is thus amplified and is then digested at 37 °C for 2 hours with BamHI and XhoI or NotI concurrently in a 20 μl reaction volume. The digested product is ligated to similarly cleaved pET28a (Novagen) that is dephosphorylated prior to the ligation by treatment with Calf Intestinal Alkaline Phosphatase (CIP). The gene fusion constructed in this manner allows one-step affinity purification of the resulting fusion protein because of the presence of histidine residues at the N-terminus of the fusion protein, which are encoded by the vector.
The ligation reaction (20 μl) is carried out at 14°C overnight and then is used to transform 100 μl fresh E. coli XL1-Blue competent cells (Novagen). The cells are incubated on ice for 2 hours, heat-shocked at 42 °C for 30 seconds, and returned to ice for 90 seconds. The samples are then added to 1 ml LB broth in the absence of selection and grown at 37 °C for 2 hours.
The cells are plated out on LB agar containing kanamycin (50 μg/ml) at a 10x and neat dilution and incubated overnight at 37°C. The following day, 50 colonies are picked onto secondary plates and incubated at 37 °C overnight.
Five colonies are picked into 3 ml LB broth supplemented with kanamycin (100 μg/ml) and are grown overnight at 37°C. Plasmid DNA is extracted using the Qiagen mini-prep method and is quantitated by agarose gel electrophoresis.
PCR is performed with the gene-specific primers under the conditions set forth above and transformant DNA is confirmed to contain the desired insert. If PCR-positive, one of the five plasmid DNA samples (500 ng) extracted from the E. coli XL1-Blue cells is used to transform competent BL21 (λDE3) E. coli cells (Novagen; as described previously). Transformants (10) are picked onto LB agar plates containing kanamycin (50 μg/ml) and stored as a research stock in LB containing 50% glycerol.
2.D. Purification of recombinant proteins
One ml of frozen glycerol stock prepared as described in 2.C. is used to inoculate 50 ml of LB medium containing 25 μg/ml of kanamycin in a 250 ml Erlenmeyer flask. The flask is incubated at 37°C for 2 hours or until the absorbance at 600 nm (OD600) reaches 0.4-1.0. Growth is stopped by placing the flask at 4°C overnight. The following day, 10 ml of the overnight culture are used to inoculate 240 ml LB medium containing kanamycin (25 μg/ml), with the initial OD600 about 0.02-0.04. Four flasks are inoculated for each ORF. The cells are grown to an OD600 of 1.0 (about 2 hours at 37°C), a 1 ml sample is harvested by centrifugation, and the sample is analyzed by SDS-PAGE to detect any leaky expression. The remaining culture is induced with 1 mM IPTG and the induced cultures are grown for an additional 2 hours at 37°C.
The final OD600 is taken and the cells are harvested by centrifugation at 5,000 x g for 15 minutes at 4°C. The supernatant is discarded and the pellets are resuspended in 50 mM Tris-HCl (pH 8.0), 2 mM EDTA. Two hundred and fifty ml of buffer are used for 1 liter of culture and the cells are recovered by centrifugation at 12,000 x g for 20 minutes. The supernatant is discarded and the pellets are stored at -45 °C.
2.E. Protein purification
Pellets obtained from 2.D. are thawed and resuspended in 95 ml of 50 mM Tris-HCl (pH 8.0). Pefabloc and lysozyme are added to final concentrations of 100 μM and 100 μg/ml, respectively. The mixture is homogenized with magnetic stirring at 5°C for 30 minutes. Benzonase (Merck) is added at a 1 U/ml final concentration, in the presence of 10 mM MgCl2, to ensure total digestion of the DNA. The suspension is sonicated (Branson Sonifier 450) for 3 cycles of 2 minutes each at maximum output. The homogenate is centrifuged at 19,000 x g for 15 minutes and both the supernatant and the pellet are analyzed by SDS-PAGE to determine whether the target protein is located in the soluble or the insoluble fraction, as is described further below.
2.E.1. Soluble fraction
If the target protein is produced in a soluble form (i.e., in the supernatant obtained in 2.E.), NaCl and imidazole are added to the supernatant to give final concentrations of 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, and 10 mM imidazole (buffer A). The mixture is filtered through a 0.45 μm membrane and loaded onto an IMAC column (Pharmacia HiTrap chelating Sepharose; 1 ml), which has been charged with nickel ions according to the manufacturer's recommendations. After loading, the column is washed with 50 column volumes of buffer A and the recombinant target protein is eluted with 5 ml of buffer B (50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 500 mM imidazole).
The elution profile is monitored by measuring the absorbance of the fractions at 280 nm. Fractions corresponding to the protein peak are pooled, dialyzed against PBS containing 0.5 M arginine, filtered through a 0.22 μm membrane, and stored at -45°C.
2.E.2. Insoluble fraction
If the target protein is expressed in the insoluble fraction (pellets obtained from 2.E.), purification is conducted under denaturing conditions. NaCl, imidazole, and urea are added to the resuspended pellet to final concentrations of 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 10 mM imidazole, and 6 M urea (buffer C). After complete solubilization, the mixture is filtered through a 0.45 μm membrane and loaded onto an IMAC column.
The purification procedures on the IMAC column are the same as described in 2.E.1., except that 6 M urea is included in all buffers used and 10 column volumes of buffer C are used to wash the column after protein loading, instead of 50 column volumes.
The protein fractions eluted from the IMAC column with buffer D (buffer C containing 500 mM imidazole) are pooled. Arginine is added to the solution to a final concentration of 0.5 M and the mixture is dialyzed against PBS containing 0.5 M arginine and various concentrations of urea (4 M, 3 M, 2 M, 1 M, and 0.5 M) to progressively decrease the concentration of urea. The final dialysate is filtered through a 0.22 μm membrane and stored at -45°C.
Alternatively, when the above purification process is not as efficient as it should be, two other processes may be used as follows.
A first alternative involves the use of a mild denaturant, N-octyl glucoside (NOG). Briefly, a pellet obtained in 2.E. is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi and is clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 50 mM NaPO4 (pH 7.5) containing 1-2% weight/volume NOG, and homogenized. The NOG-soluble impurities are removed by centrifugation. The pellet is extracted once more by repeating the preceding extraction step. The pellet is dissolved in 8 M urea, 50 mM Tris (pH 8.0). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH 8.0), and is dialyzed against 1 M arginine for 24-48 hours to remove the urea. The final dialysate is filtered through a 0.22 μm membrane and stored at -45°C.
A second alternative involves the use of a strong denaturant, such as guanidine hydrochloride. Briefly, a pellet obtained in 2.E. is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi and clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 6 M guanidine hydrochloride, and passed through an IMAC column charged with Ni2+. The bound antigen is eluted with 8 M urea (pH 8.5). Beta-mercaptoethanol is added to the eluted protein to a final concentration of 1 mM, then the eluted protein is passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid. Protein eluted from the column is slowly added to 4 volumes of 50 mM phosphate buffer (pH 7.0). The protein remains in solution.
2.F. Evaluation of the protective activity of the purified protein
Groups of 10 Swiss Webster mice (Taconic Labs) are immunized rectally with 25 μg of the purified recombinant protein, admixed with 1 μg of cholera toxin (Berna) in physiological buffer. Mice are immunized on days 0, 7, 14, and 21. Fourteen days after the last immunization, the mice are challenged with H. pylori strain ORV2001 grown in liquid media (the cells are grown on agar plates, as described in 2.A., and, after harvest, the cells are resuspended in Brucella broth; the flasks are then incubated overnight at 37 °C). Fourteen days after challenge, the mice are sacrificed and their stomachs are removed. The amount of H. pylori is determined by measuring the urease activity in the stomach and by culture.
2.G. Production of monospecific polyclonal antibodies
2.G.1. Hyperimmune rabbit antiserum
New Zealand rabbits are injected both subcutaneously and intramuscularly with 100 μg of a purified fusion polypeptide, as obtained in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a total volume of approximately 2 ml. Twenty one and 42 days after the initial injection, booster doses, which are identical to priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way. Fifteen days after the last injection, animal serum is recovered, decomplemented, and filtered through a 0.45 μm membrane.
2.G.2. Mouse hyperimmune ascites fluid
Ten mice are injected subcutaneously with 10-50 μg of a purified fusion polypeptide as obtained in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a volume of approximately 200 μl. Seven and 14 days after the initial injection, booster doses, which are identical to the priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way. Twenty-one and 28 days after the initial injection, mice receive 50 μg of the antigen alone intraperitoneally. On day 21, mice are also injected intraperitoneally with sarcoma 180/TG cells CM26684 (Lennette et al, Diagnostic Procedures for Viral, Rickettsial, and Chlamydial Infections, 5th Ed. Washington DC, American Public Health Association, 1979). Ascites fluid is collected 10-13 days after the last injection.
EXAMPLE 3: Methods for producing transcriptional fusions lacking His-tags
Methods for amplification and cloning of DNA encoding the polypeptides of the invention as transcriptional fusions lacking His-tags are described as follows. Two PCR primers for each clone are designed based upon the sequences of the polynucleotides that encode them (SEQ ID NOs: 1-169 (odd numbers)). These primers can be used to amplify DNA encoding the polypeptides of the invention from any Helicobacter pylori strain, including, for example, ORV2001 and the strain deposited as ATCC deposit number 43579, as well as from other Helicobacter species.
The N-terminal primers are designed to include the ribosome binding site of the target gene, the ATG start site, and any signal sequence and cleavage site. The N-terminal primers can include a 5' clamp and a restriction endonuclease recognition site, such as that for BamHI (GGATCC), which facilitates subsequent cloning. Similarly, the C-terminal primers can include a restriction endonuclease recognition site, such as that for XhoI (CTCGAG), which can be used in subsequent cloning, and a TAA stop codon.
Amplification of genes encoding the polypeptides of the invention is carried out using Thermalase DNA Polymerase under the conditions described above in Example 2. Alternatively, Vent DNA polymerase (New England Biolabs), Pwo DNA polymerase (Boehringer Mannheim), or Taq DNA polymerase (Appligene) can be used, according to instructions provided by the manufacturers.
A single PCR product for each clone is amplified and cloned into appropriately cleaved pET 24 (e.g., BamHI-XhoI cleaved pET 24), resulting in construction of a transcriptional fusion that permits expression of the proteins without His-tags. The expressed products can be purified as denatured proteins that are refolded by dialysis into 1 M arginine.
Cloning into pET 24 allows transcription of the genes from the T7 promoter, which is supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding sites of the genes, and thereby expression of the complete ORF. The amplification, digestion, and cloning protocols are as described above for constructing translational fusions.
Amplification of clone GHPO 1190 DNA
Design of PCR primers for cloning
Two PCR primers are designed based on the complete gene sequence (see Table 1). The N-terminal primer (FC1) is designed to include the ribosome binding site of the target gene, the ATG start site, and the signal sequence (with cleavage site). It includes a clamp (GCC) at the 5' most end, and a SacI recognition sequence (GAGCTC) for cloning purposes.
The C-terminal primer (RN2) includes an XhoI recognition sequence for cloning purposes, and the natural TAA stop codon.
N-terminal primer (FC1):
5'-GCCGAGCTCCAAGCAAAAAAATGTCAATTAAAAGGG-3' (SEQ ID NO: 189)
C-terminal primer (RN2): 5'-GCCCTCGAGGTCTAAATTAGAATAAGTGTTGTT-3' (SEQ ID NO: 190)
Amplification of each specified gene can be achieved by employing FC1/RN2 primers for any of the genes described (see Table 1).
PCR conditions
Amplification of gene-specific DNA is carried out using Pwo DNA Polymerase (Boehringer Mannheim) under the following conditions. Due to the exonuclease activity of the polymerase, two reaction mixtures are prepared separately and combined just prior to amplification.
Reaction ingredients:

Ingredient (final conc.)       Mixture 1 (μl)   Mixture 2 (μl)
distilled H2O                  160              79
dNTPs (200 μM each)            40               -
10X buffer                     -                20
primer 1 (100 nM)              1                -
primer 2 (100 nM)              1                -
Template (200 ng)              2                0

Cycling conditions             Temp (°C)   Time (min)   Number of cycles
Initial denaturing step        96          4            1
Denaturing step                94          0.5          20
Annealing step                 50          1            20
Extension step                 72          1            20
Final extension step           72          1            1
A single PCR product of 624 basepairs is amplified and cloned into SacI-XhoI cleaved pET 24, allowing construction of a transcriptional fusion and expression of GHPO 1190 antigen in the absence of a His-tag. In this instance, expressed product can be purified as a denatured protein that is refolded by dialysis into 1 M arginine.
Cloning into pET 24 allows transcription from the T7 promoter, supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding site for GHPO 1190, and thereby expression of the complete ORF. The amplification, restriction, and cloning protocols are as previously described for constructing translational fusions.
EXAMPLE 4: Purification of the polypeptides of the invention by immunoaffinity
4.A. Purification of specific IgGs
An immune serum, as prepared in section 2.G., is applied to a protein A Sepharose Fast Flow column (Pharmacia) equilibrated in 100 mM Tris-HCl (pH 8.0). The resin is washed by applying 10 column volumes of 100 mM Tris-HCl and 10 volumes of 10 mM Tris-HCl (pH 8.0) to the column. IgG antibodies are eluted with 0.1 M glycine buffer (pH 3.0) and are collected as 5 ml fractions to which is added 0.25 ml 1 M Tris-HCl (pH 8.0). The optical density of the eluate is measured at 280 nm and the fractions containing the IgG antibodies are pooled, dialyzed against 50 mM Tris-HCl (pH 8.0), and, if necessary, stored frozen at -70°C.
4.B. Preparation of the column An appropriate amount of CNBr-activated Sepharose 4B gel ( 1 g of dried gel provides for approximately 3.5 ml of hydrated gel; gel capacity is from 5 to 10 mg coupled IgG/ml of gel) manufactured by Pharmacia (17-0430- 01) is suspended in 1 mM HC1 buffer and washed with a buchner by adding small quantities of 1 mM HC1 buffer. The total volume of buffer is 200 ml per gram of gel.
Purified IgG antibodies are dialyzed for 4 hours at 20±5 °C against 50 volumes of 500 mM sodium phosphate buffer (pH 7.5). The antibodies are then diluted in 500 mM phosphate buffer (pH 7.5) to a final concentration of 3 mg/ml. IgG antibodies are mixed with the gel overnight at 5±3 °C. The gel is packed into a chromatography column and is washed with 2 column volumes of 500 mM phosphate buffer (pH 7.5), and 1 column volume of 50 mM sodium phosphate buffer, containing 500 mM NaCl (pH 7.5). The gel is then transferred to a tube, mixed with 100 mM ethanolamine (pH 7.5) for 4 hours at room temperature, and washed twice with 2 column volumes of PBS. The gel is then stored in 1/10,000 PBS/merthiolate. The amount of IgG antibodies coupled to the gel is determined by measuring the optical density (OD) at 280 nm of the IgG solution and the direct eluate, plus washings.
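The coupling yield described above is a difference calculation: IgG offered to the gel minus IgG recovered in the direct eluate and washings. A minimal Python sketch follows. It assumes an absorbance of about 1.4 at 280 nm for a 1 mg/ml IgG solution in a 1 cm path, a commonly used approximation that is not stated in the text, and the example readings and volumes are hypothetical.

```python
def igg_mg(od280, volume_ml, a280_per_mg_ml=1.4):
    """Estimate IgG mass from OD280 (assumed: A280 of 1.4 = 1 mg/ml, 1 cm path)."""
    return od280 / a280_per_mg_ml * volume_ml


def coupled_igg_mg(od_start, vol_start_ml, od_unbound, vol_unbound_ml,
                   a280_per_mg_ml=1.4):
    """IgG coupled to the gel = IgG applied - IgG in direct eluate plus washings."""
    applied = igg_mg(od_start, vol_start_ml, a280_per_mg_ml)
    unbound = igg_mg(od_unbound, vol_unbound_ml, a280_per_mg_ml)
    return applied - unbound


# Hypothetical readings: 10 ml of IgG solution at OD280 = 4.2 (about 3 mg/ml),
# and 15 ml of pooled direct eluate plus washings at OD280 = 0.28.
example = coupled_igg_mg(4.2, 10.0, 0.28, 15.0)
print(f"Coupled IgG: {example:.1f} mg")
```

Dividing the result by the hydrated gel volume gives a value that can be compared with the stated capacity of 5 to 10 mg coupled IgG per ml of gel.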
4.C. Adsorption and elution of the antigen
An antigen solution in 50 mM Tris-HCl (pH 8.0), 2 mM EDTA, for example, the supernatant obtained in 3.E. or the solubilized pellet obtained in 3.E., after centrifugation and filtration through a 0.45 μm membrane, is applied to a column equilibrated with 50 mM Tris-HCl (pH 8.0), 2 mM EDTA, at a flow rate of about 10 ml/hour. The column is then washed with 20 volumes of 50 mM Tris-HCl (pH 8.0), 2 mM EDTA. Alternatively, adsorption can be achieved by mixing overnight at 5±3 °C.
The adsorbed gel is washed with 2 to 6 volumes of 10 mM sodium phosphate buffer (pH 6.8) and the antigen is eluted with 100 mM glycine buffer (pH 2.5). The eluate is recovered in 3 ml fractions, to each of which is added 150 μl of 1 M sodium phosphate buffer (pH 8.0). Absorbance is measured at 280 nm for each fraction; those fractions containing the antigen are pooled and stored at -20°C.
EXAMPLE 5: Preparation of isolated DNA encoding the polypeptides of the invention from the deposited clones.
As mentioned above, plasmids containing nucleic acids encoding GHPO 1190 (formerly HPO76, ATCC# 98197), GHPO 1212 (formerly HPO18, ATCC# 98210), GHPO 1012 (formerly HPO121, ATCC# 98201), GHPO 1501 (formerly HPO45, ATCC# 98208), GHPO 1688 (formerly HPO101, ATCC# 98198), GHPO 346 (formerly HPO116, ATCC# 98200), GHPO 1200 (formerly HPO7, ATCC# 98211), GHPO 1538 (formerly HPO104, ATCC# 98199), GHPO 1398 (formerly HPO15, ATCC# 98214), GHPO 1001 (formerly HPO58, ATCC# 98206), GHPO 470 (formerly HPO132, ATCC# 98202), GHPO 689 (formerly HPO9, ATCC# 98203), GHPO 1550 (formerly HPO38, ATCC# 98204), GHPO 1620 (formerly HPO87, ATCC# 98205), GHPO 574 (formerly HPO71, ATCC# 98217), GHPO 329 (formerly HPO70, ATCC# 98219), GHPO 1374 (formerly HPO80, ATCC# 98215), GHPO 956 (formerly HPO95, ATCC# 98216), HPO 98 (ATCC# 98218), GHPO 1346 (formerly HPO57, ATCC# 98220), GHPO 706 (formerly HPO50, ATCC# 98207), GHPO 732 (formerly HPO64, ATCC# 98213), GHPO 419 (formerly HPO54, ATCC# 98212), and GHPO 276 (formerly HPO42, ATCC# 98209) were deposited in E. coli strain DH5α under the Budapest Treaty with the American Type Culture Collection (ATCC; Rockville, Maryland) on October 9, 1996 and were designated with the accession numbers indicated in parentheses above. These plasmids each contain a genomic DNA BglII-ClaI insert from H. pylori strain P1 or P12 (referred to as 69-A and 888-0 in Haas et al, Mol. Microbiol. (1993) 8:753). Each of the inserts is disrupted by the presence of transposon TnMax9 (Kahrs et al, Gene (1995) 167:53). DNA molecules lacking the transposon can be amplified from the plasmids using standard PCR techniques, such as inverse and recombinant PCR (see, e.g., Innis et al, supra), so that a full-length H. pylori insert is reconstituted. For example, the H. pylori sequences flanking the transposon can each be amplified by PCR, and then ligated together to form the full-length H. pylori gene lacking the transposon. Primers that can be used in these methods for each of the twenty-four deposited clones of the invention are shown in Table 1. The locations of insertion of the transposon in each of the deposited clones are between the nucleotides indicated in parentheses after the name of each clone, as follows: HPO101 (497-498), GHPO 1538 (428-429), GHPO 346 (433-444), GHPO 1012 (463-464), GHPO 132 (408-409), GHPO 1212 (226-227), GHPO 1550 (347-348), GHPO 276 (372-373), GHPO 1501 (299-300), GHPO 706 (29-293), GHPO 419 (351-352), GHPO 1346 (266-267), GHPO 1001 (434-435), GHPO 732 (224-225), GHPO 329 (114-115), GHPO 574 (274-275), GHPO 1190 (412-413), GHPO 1200 (349-350), GHPO 1374 (105-106), GHPO 1620 (26-27), GHPO 956 (64-65), HPO 98 (43-44), and GHPO 689 (346-347).
EXAMPLE 6: Purification of recombinant H. pylori antigen from GHPO 1190.
A pellet of E. coli expressing GHPO 1190 is homogenized in 5 mM imidazole, 500 mM sodium chloride, 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi, and clarified by centrifugation at 4000-5000 g.
Method 1
The pellet containing cloned protein is suspended in buffer containing 2% N-octyl glucoside (NOG) and is homogenized. The NOG-soluble protein is removed by centrifugation. The pellet is extracted one more time with 2% NOG. After centrifugation, the pellet is dissolved in 8 M urea. The urea-solubilized protein is diluted with an equal volume of 2 M arginine and dialyzed against 1 M arginine for 24-48 hours to remove urea. The cloned protein remains in solution. SDS-PAGE and Coomassie staining, followed by densitometric scanning, shows that the protein is 80-85% pure cloned antigen.
Method 2
The pellet containing cloned protein is solubilized in 6 M guanidine hydrochloride and is passed through an IMAC column charged with Ni²⁺. The bound antigen is eluted with 8 M urea (pH 8.5). β-Mercaptoethanol is added to the eluted protein to a final concentration of 1 mM, and the protein is then passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid. Protein eluted from the Sephadex G-25 column is slowly added to 4 volumes of 50 mM phosphate (pH 7.0). The protein remains in solution.
Purification of recombinant proteins
Recombinant proteins expressed as Histidine-tagged fusion proteins can be solubilized and purified by using a metal affinity column (nickel column). The bound protein can be eluted with imidazole buffer, with or without urea, or by using low pH buffers, with or without urea. Urea or guanidine hydrochloride-denatured proteins can then be renatured using appropriate renaturing buffers. With a number of recombinant H. pylori antigens (HpaA and clone GHPO 1190), renaturation conditions using arginine hydrochloride (0.25-1 M) have been determined.
Recombinant proteins without a His-tag can be solubilized and purified using immunoaffinity, ion-exchange, sizing, and/or hydrophobic chromatography. Proteins expressed as insoluble aggregates in inclusion bodies can be solubilized in denaturing agents, such as 8 M urea or 6 M guanidine hydrochloride. Appropriate folding and renaturation can readily be determined by one skilled in the art.
The above pellet containing cloned protein is suspended in 50 mM NaPO4 (pH 7.5) containing 1% weight/volume N-octyl glucoside (NOG) and mixed vigorously. The NOG-soluble impurities are removed by centrifugation. The remaining pellet is extracted one more time with the 1% NOG solution to further remove impurities. After centrifugation, the pellet is solubilized in 8 M urea, 50 mM Tris (pH 8.0). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH 8.0), and is dialyzed against 1 M arginine, 50 mM Tris, 50 mM NaCl (pH 8.0) for 24-48 hours to remove urea. The cloned protein remains in solution following dialysis. SDS-PAGE and Coomassie staining followed by densitometric scanning shows that the protein is 80-85% pure cloned antigen.
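The dilution step can be checked with simple mixing arithmetic: combining equal volumes of the urea and arginine solutions halves each solute, so the mixture starts dialysis at about 4 M urea and 1 M arginine, the same arginine concentration as the dialysis buffer. The Python sketch below illustrates this under the assumption of ideal, additive volumes; the `mix` helper is a hypothetical name.

```python
def mix(components):
    """Final molar concentrations after combining solutions.

    components: list of (volume, {solute: molar concentration}) pairs.
    Volumes are assumed to be additive (ideal mixing).
    """
    total_volume = sum(volume for volume, _ in components)
    moles = {}
    for volume, concentrations in components:
        for solute, molar in concentrations.items():
            moles[solute] = moles.get(solute, 0.0) + volume * molar
    return {solute: n / total_volume for solute, n in moles.items()}


# One volume of 8 M urea, 50 mM Tris plus one volume of 2 M arginine, 50 mM Tris.
after_dilution = mix([
    (1.0, {"urea": 8.0, "Tris": 0.05}),
    (1.0, {"arginine": 2.0, "Tris": 0.05}),
])
print(after_dilution)
```

Because arginine is already at 1 M after mixing, dialysis against 1 M arginine mainly removes urea, which is consistent with the stated purpose of the dialysis step.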
Other embodiments are within the following claims.
RECONSTRUCTION OF A COMPLETE ORF BY RECOMBINANT PCR
F denotes forward primer
R denotes reverse primer
C denotes coding strand
N denotes non-coding strand
All FC1 and RN2 primers have incorporated at their 5' end a clamp and a recognition sequence for cloning purposes
GGC clamp present for amplification and cloning of entire gene sequence from chromosomal DNA
[X] denotes any nucleotide sequence not present in the completed gene sequence
() Identifies region of overlap between the two original PCR products, and is consistently 10 nucleotides long for each clone
Clone  Primer  nt positions  Primer sequence (5' - 3')                 Length  Tm (°C)
No.    type    of gene seq.

76     FC1     304-330       GCC [X] CAAGCAAAAAAATGTCAATTAAAAGGG       27      70
       RN1     413-391       TAAGTCCATACGATAGCCTATG                    22      62
       FC2     404-436       (TATGGAACTTA) GAACATTTTAACACGCTCTATTA     33      60
       RN2     927-904       GCC [X] GTCTAAATTAGAATAAGTGTTGTT          24      60

18     FC1     101-124       GCC [X] AATATATGGGAACTTAATGAGAAT          24      60
       RN1     227-206       TGCGAGATTTAACCTGTTTTCA                    22      60
       FC2     218-249       (AAATCTCGCA) GAAATCTTTCACAAGCGAGCAA       32      60
       RN2     922-901       GCC [X] ATGTCATGTCAAACTATGAAGC            22      60

121    FC1     141-164       GCC [X] TCACAATGGATAAAAACAACAACA          24      62
       RN1     451-473       GCCCTTTTGTTTAGGGGTTAG                     21      62
       FC2     455-485       (ACAAAAGGGC) TTTTTAGAGCATGTGAGCCATC       32      62
       RN2     814-796       GCC [X] CTGTCCAAATCAGCCACCC               19      60

45     FC1     1-26          GCC [X] ATGAAAAGATTTGATTTGTTTTTATC        26      62
       RN1     299-278       AAGCCGTATTGTTTGTTTTGGC                    22      62
       FC2     290-323       (AATACGGCTTTAAAGCTATAGAAAATTTAAACGC)      34      60
       RN2     603-582       GCC [X] TTAAATATCCCAATCCTGCCAC            22      62

101    FC1     308-332       GCC [X] GAAGGATTTATTATGATTAAAAGAA         25      60
       RN1     497-474       AACCTAATTTGAAATTCAAACCAT                  24      60
       FC2     488-519       (AAATTAGGTT) TTGTAGGCTTTGCCAATAAATG       32      60
       RN2     893-869       GCC [X] AAGGAATAAATTAGAAAGTGAAGAA         25      62

116    FC1     236-259       GCC [X] CGCATTGATTTGATGAATAAACC           23      62
       RN1     434-416       CGCCTATAACCGCTCCATT                       19      60
       FC2     425-456       (GTTATAGGCG) ATAAAGGTTTAACGCAGCTAAG       32      60
       RN2     812-790       GCC [X] CTCACTAAAAAGCAATTTTTGAG           23      60

7      FC1     195-220       GCC [X] TAAGGAATGAAGTTGATAAAATTTGT        26      64
       RN1     349-327       GCATTTTCATTCATTCTTTGGAC                   23      60
       FC2     339-371       (ATGAAAATGC) ACGCCCAAATAATAAGGAAGTA       32      60
       RN2     738-717       GCC [X] GGATTTATTGAGCTTTCCCCTT            22      62

104    FC1     251-271       GCC [X] AAAGGGCGAAAATGAGCAAGA             21      60
       RN1     429-407       TAAAATAACCAACAGAGTGATCA                   23      60
       FC2     420-452       (GGTTATTTTA) GTGGATATTTGGGTTTATAGCGA      33      62
       RN2     784-761       GCC [X] TTTTTTAAGAATCACTTTCTTCGG          24      62

58     FC1     118-143       GCC [X] ATAGGAACAAGCATGTTTTTTAAAAC        26      66
       RN1     434-413       TGAAGTCTTGCGATTTTTGCTT                    22      60
       FC2     425-454       (CAAGACTTCA) AAAAAGAAGGAGCGGTTGCC         30      60
       RN2     650-630       GCC [X] CTGGCTTATTGCGTATCATC              20      60

132    FC1     294-314       GGC [X] GGAAGAATAATGCTCGCTTCC             21      62
       RN1     409-378       ACTGGAGTGTGGATAAAACTAT                    22      60
       FC2     400-430       (ACACTCCAGT) AGATGCTTTCCCGGATATTTC        31      60
       RN2     761-741       GCC [X] CTATTCTCCAGGGATATGGCC             21      64

       FC1     211-233       GCC [X] GATGGATTTTTTATGGGGGTGAG           23      64
       RN1     347-328       GGCACTGCCGCAGATTCTA                       19      60
       FC2     338-370       (CGGCAGTGCC) TTTAGCCTATTATTTAGAAGCGA      33      60
       RN2     686-665       GCC [X] ATGGTATTTGTCTAAGACCCTC            22      62

38     FC1     220-242       GCC [X] AAAAGGGTTTTAAATAATGGCTG           23      60
       RN1     348-327       ACAAGGATAAAAAACGCGCTAA                    22      60
       FC2     239-371       (TTATCCTTGT) TGCTGGCTTGGTTTTTTTTAATT      33      60
       RN2     597-575       GCC [X] AAGATTCTAAAAGGGCTTCAAAT           23      60

71     FC1     1-25          GCC [X] ATGTTGAAATTTAAATATGGTTTGA         25      60
       RN1     274-254       AAACCCCACTCTTATCATCGG                     21      62
       FC2     265-294       (AGTGGGGTTT) TTTTAGGGGGTGGGTATGCT         30      60
       RN2     524-505       GCC [X] GAGCCTACAGGTTGCTTGC               20      60

70     FC1     1-23          GCC [X] ATGGTATTTGACAGAACAATCAG           23      62
       RN1     115-96        GAAAAGCCACCCCGCTTATT                      20      60
       FC2     106-137       (GTGGCTTTTC) AAAAAGAGTGGGTGCAACAATT       32      60
       RN2     495-471       GCC [X] TTAGGAATAGCATAACAAACAAACG         25      66

80     FC1     1-25          GCC [X] ATGTTAGAAAAATTGATTGAAAGAG         25      62
       RN1     106-95        TGAACACATAGCCTAAAACCAC                    21      62
       FC2     97-127        (TATGTGTTCA) TGAAAGAGTTGTGGCACATGC        31      62
       RN2     435-415       GCC [X] TTATGCGATAGGGGGCGTATC             21      66

95     FC1     1-27          GCC [X] ATGAAAAAATTTTTTTCTCAATCTTT        27      60
       RN1     64-46         TGGCCAGTAGCGCGTTCAT                       19      60
       FC2     55-98         (CTACTGGCCA) TGGATGGCAATGGCGTTTTTTTAG     34      68
       RN2     432-408       GCC [X] TTATTGATGAACATTAACCATTAAA         25      60

98     FC1     1-22          GCC [X] ATGAAAACCTTTAAAAACCTGC            22      58
       RN1     43-23         TAGCGATCAGGCTAAAACAGA                     21      60
       FC2     34-62         (CTGATCGCTA) TGAGTTGGCTCCAAGCGGA          29      60
       RN2     336-313       GCC [X] TTAAAACTCATAGCGTTTTTCAAT          24      60

1 2    FC1     18-51         GCC [X] GAGAGTAGTGGCAGAGTTTATGCTGATTCC    34      9 !
       RN1     380-351       (AACTTTTC) TCTATCCCAATTCGTTACGCTC         30      64
       FC2     366-396       (GGATAGA) GAAAAGTTTGGCGTCAAAAGTTGG        31      68
       RN2     822-801       GCC [X] GGCTTAAACTGGAACGGATTTC            22      64

50     FC1     140-170       GCC [X] TAAAGTTTGCTAAAAAGATGGTTTTAATTT    31      76
       RN1     297-270       (GACTTCTAAAG) CGTCCTTTTTTTCTTTA           28      56
       FC2     287-314       (CTTTA) GAAGTCATTAAACAAAGAGGGGT           29      64
       RN2     607-584       GCC [X] CCCATCTTTAGAAATCAACCCCCA          24      70

64     FC1     23-50         GCC [X] GAAATCAAGGAGTTTGTATGCAACAGCG      28      80
       RN1     225-149       (A) AGCTTTTCATTATCTTCCCCATAAGC            27      74
       FC2     216-244       (TGAAAAGCT) TTTAGCGAAGCGATCAAGCC          29      60
       RN2     1039-1012     GCC [X] CCCAATACTTTTATTGATTCACCATTTC      28      74

54     FC1     21-48         GCC [X] CAATAAAACACCAAAATGAATGAGTTAC      28      68
       RN1     352-327       (A) GATTTTGTTTTGAGCGTTAGAAATG             26      66
       FC2     345-376                                                 32      62
       RN2     1280-1255     GCC [X] GCATTTACCCCCTAAAAACTATAAAC        26      70

15     FC1     14-35         GCC [X] CTGAAGGGTGTATGGTATTAGG            22      64
       RN1     157-132       (C) ACCATACATGTATCCTGCATTAATG             26      68
       FC2     147-179       (CATGTATGGT) GTAGCAAAGAATTTTAAGGAGGC      33      64
       RN2     377-349       GCC [X] CGTTAAAACTAAAGTTCTATTTTTAATTC     29      70

57     FC1     13-39         GCC [X] GTAAGGAATGAGATGATAAAGAGTTGG       27      74
       RN1     267-244       (T) GGAATATTCTGATCCACGCCATC               24      68
       FC2     258-294       (GAATATTCC) AAAAGCCGTTTTTTATTACAGAAGAC    37      76
       RN2     957-934       GCC [X] CTAAACTCTGGCTTATTGCGTATC          24      68

87     FC1     1-22          GCC [X] ATGCGTTTATTATTGTGGTGGG            22      62
       RN1     27-3          (C) AATACCCACCACAATAATAAACGCAT            25      66
       FC2     18-50         (GTGGGTATT) GGTATTATCGCTCTTTTTAAATCC      33      64
       RN2     519-498       GCC [X] TTAAATTTTTAGGGAAAGGGTA            22      62
CONDITIONS FOR RECOMBINANT PCR
Two independent PCR reactions are carried out with the FC1/RN1 and FC2/RN2 primer pairs under the same conditions proposed for cloning genes for expression.
After 20 cycles, the product of each reaction is used as template for a further 20 cycles with FC1/RN2 only.
The product will encompass the full-length gene minus the transposon.
The presence of restriction sites at the 5' ends of these primers allows for cloning/expression studies.
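The overlap design behind recombinant PCR can be illustrated in a few lines of Python. The sketch below checks that the parenthesized 5' overlap of an FC2 primer is the reverse complement of the first 10 nucleotides of the matching RN1 primer, using the clone 18 primers from the table above, and then joins two fragments through a shared overlap. The `join_by_overlap` helper and the two toy fragments are hypothetical and stand in for the real FC1/RN1 and FC2/RN2 products.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def join_by_overlap(left, right, overlap_len=10):
    """Join two fragments that share an identical overlap, keeping it once."""
    if left[-overlap_len:] != right[:overlap_len]:
        raise ValueError("fragments do not share the expected overlap")
    return left + right[overlap_len:]


# Clone 18 primers as listed in the table above.
rn1 = "TGCGAGATTTAACCTGTTTTCA"
fc2_overlap = "AAATCTCGCA"
overlap_matches = reverse_complement(rn1[:10]) == fc2_overlap

# Hypothetical toy fragments (not real gene sequence) sharing that overlap.
left_product = "ATGGCT" + fc2_overlap
right_product = fc2_overlap + "GAAATC"
full_length = join_by_overlap(left_product, right_product)
print(overlap_matches, full_length)
```

In the second round of cycling the overlapping ends of the two first-round products anneal to each other and are extended, so amplification with FC1 and RN2 alone yields the joined product; the string join above models only the end result, not the reaction itself.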
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: ORAVAX, INC.
(ii) TITLE OF THE INVENTION: HELICOBACTER POLYPEPTIDES AND CORRESPONDING POLYNUCLEOTIDE MOLECULES
(iii) NUMBER OF SEQUENCES: 190
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Clark & Elbing LLP
(B) STREET: 176 Federal Street
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02110-2214
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: UNKNOWN
(B) FILING DATE: 14-NOV-1997
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/749,051
(B) FILING DATE: 14-NOV-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/831,309
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/834,705
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/833,457
(B) FILING DATE: 1-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/881,227
(B) FILING DATE: 24-JUN-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/902,615
(B) FILING DATE: 29-JUL-1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Clark, Paul T.
(B) REGISTRATION NUMBER: 30,175
(C) REFERENCE/DOCKET NUMBER: 06132/028WO1
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617-428-0200
(B) TELEFAX: 617-428-7045
(C) TELEX:
(2) INFORMATION FOR SEQ ID NO : 1 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 989 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 71...940
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 :
CTATGACGAT TGTCTCGCTT TTAGAAAACA CTCTAATCGC TTTTGAAAAA CAACAAAGGA 60
AGGGATTTTA ATG AAA TTT TTA CGC TCT GTT TAT GCA TTT TGC TCC AGT 109
Met Lys Phe Leu Arg Ser Val Tyr Ala Phe Cys Ser Ser
1 5 10
TGG GTA GGG ACG ATT GTT ATT GTG CTG TTG GTT ATC TTT TTT ATC GCG 157
Trp Val Gly Thr Ile Val Ile Val Leu Leu Val Ile Phe Phe Ile Ala
15 20 25
CAA GCC TTT ATC ATT CCC TCT CGC TCT ATG GTT GGC ACG CTC TAT GAG 205
Gln Ala Phe Ile Ile Pro Ser Arg Ser Met Val Gly Thr Leu Tyr Glu
30 35 40 45
GGC GAC ATG CTC TTT GTC AAA AAG TTT TCT TAC GGC ATA CCC ATT CCT 253
Gly Asp Met Leu Phe Val Lys Lys Phe Ser Tyr Gly Ile Pro Ile Pro
50 55 60
AAA ATC CCA TGG ATT GAG CTT CCT GTT ATG CCT GAT TTT AAA AAT AAC 301
Lys Ile Pro Trp Ile Glu Leu Pro Val Met Pro Asp Phe Lys Asn Asn
65 70 75
GGA CAT TTG ATA GAG GGG GAT CGC CCT AAG CGT GGC GAA GTG GTG GTG 349
Gly His Leu Ile Glu Gly Asp Arg Pro Lys Arg Gly Glu Val Val Val
80 85 90
TTT ATC CCT CCC CAT GAA AAA AAG TCT TAC TAT GTT AAA AGG AAT TTT 397
Phe Ile Pro Pro His Glu Lys Lys Ser Tyr Tyr Val Lys Arg Asn Phe
95 100 105
GCC ATT GGA GGC GAT GAG GTG TTG TTC ACT AAT GAG GGT TTT TAT TTG 445
Ala Ile Gly Gly Asp Glu Val Leu Phe Thr Asn Glu Gly Phe Tyr Leu
110 115 120 125
CAC CCT TTT GAG AGC GAC ACG GAC AAA AAT TAC ATC GCT AAA CAT TAC 493
His Pro Phe Glu Ser Asp Thr Asp Lys Asn Tyr Ile Ala Lys His Tyr
130 135 140
CCT AAC GCC ATG ACA AAA GAA TTT ATG GGT AAA ATT TTT GTT TTA AAC 541
Pro Asn Ala Met Thr Lys Glu Phe Met Gly Lys Ile Phe Val Leu Asn
145 150 155
CCT TAT AAA AAT GAG CAT CCG GGT ATC CAT TAC CAA AAA GAC AAT GAA 589
Pro Tyr Lys Asn Glu His Pro Gly Ile His Tyr Gln Lys Asp Asn Glu
160 165 170
ACC TTC CAC TTA ATG GAG CAA TTA GCC ACT CAA GGC GCA GAA GCT AAT 637
Thr Phe His Leu Met Glu Gln Leu Ala Thr Gln Gly Ala Glu Ala Asn
175 180 185
ATC AGC ATG CAA CTC ATT CAA ATG GAG GGC GAA AAG GTG TTT TAT AAG 685
Ile Ser Met Gln Leu Ile Gln Met Glu Gly Glu Lys Val Phe Tyr Lys
190 195 200 205
AAA ATC AAT GAC GAT GAA TTT TTC ATG ATC GGC GAC AAC AGA GAC AAT 733
Lys Ile Asn Asp Asp Glu Phe Phe Met Ile Gly Asp Asn Arg Asp Asn
210 215 220
TCT AGC GAC TCG CGC TTT TGG GGG AGT GTG GCT TAT AAA AAC ATC GTG 781
Ser Ser Asp Ser Arg Phe Trp Gly Ser Val Ala Tyr Lys Asn Ile Val
225 230 235
GGT TCG CCA TGG TTT GTT TAT TTC AGT TTG AGT TTA AAA AAT AGC CTA 829
Gly Ser Pro Trp Phe Val Tyr Phe Ser Leu Ser Leu Lys Asn Ser Leu
240 245 250
GAA ATG GAT GCA GAA AAT AAC CCT AAA AAA CGC TAT CTG GTG CGT TGG 877
Glu Met Asp Ala Glu Asn Asn Pro Lys Lys Arg Tyr Leu Val Arg Trp
255 260 265
GAA CGC ATG TTT AAA AGC GTT GGA GGC TTA GAA AAA ATC ATT AAA AAA 925
Glu Arg Met Phe Lys Ser Val Gly Gly Leu Glu Lys Ile Ile Lys Lys
270 275 280 285
GAA AAC GCA ACG CAT TAAGGTTTTT TGTGCAATTT TTTGATTTCT CTTTAGAAAG T 981
Glu Asn Ala Thr His
290
TTTATTAC 989
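The relationship between the coding-sequence feature and the listed translation can be checked mechanically. The Python sketch below, which assumes the standard genetic code, translates the first thirteen codons of SEQ ID NO: 1 and confirms that the feature location 71...940 spans 290 codons, matching the 290 residues of the translation. The helper names `translate` and `cds_codons` are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence into one-letter amino acid codes."""
    cds = cds.replace(" ", "").upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def cds_codons(start, end):
    """Number of complete codons in a 1-based, inclusive feature location."""
    return (end - start + 1) // 3


# First thirteen codons of the coding sequence in SEQ ID NO: 1.
first_codons = "ATG AAA TTT TTA CGC TCT GTT TAT GCA TTT TGC TCC AGT"
peptide = translate(first_codons)
n_codons = cds_codons(71, 940)
print(peptide, n_codons)
```

The same length check can be applied to the other nucleotide entries, for example 112...471 against the 120-residue translation of SEQ ID NO: 3.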
(2) INFORMATION FOR SEQ ID NO : 2 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 290 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 :
Met Lys Phe Leu Arg Ser Val Tyr Ala Phe Cys Ser Ser Trp Val Gly
1 5 10 15
Thr Ile Val Ile Val Leu Leu Val Ile Phe Phe Ile Ala Gln Ala Phe
20 25 30
Ile Ile Pro Ser Arg Ser Met Val Gly Thr Leu Tyr Glu Gly Asp Met
35 40 45
Leu Phe Val Lys Lys Phe Ser Tyr Gly Ile Pro Ile Pro Lys Ile Pro
50 55 60
Trp Ile Glu Leu Pro Val Met Pro Asp Phe Lys Asn Asn Gly His Leu
65 70 75 80
Ile Glu Gly Asp Arg Pro Lys Arg Gly Glu Val Val Val Phe Ile Pro
85 90 95
Pro His Glu Lys Lys Ser Tyr Tyr Val Lys Arg Asn Phe Ala Ile Gly
100 105 110
Gly Asp Glu Val Leu Phe Thr Asn Glu Gly Phe Tyr Leu His Pro Phe
115 120 125
Glu Ser Asp Thr Asp Lys Asn Tyr Ile Ala Lys His Tyr Pro Asn Ala
130 135 140
Met Thr Lys Glu Phe Met Gly Lys Ile Phe Val Leu Asn Pro Tyr Lys
145 150 155 160
Asn Glu His Pro Gly Ile His Tyr Gln Lys Asp Asn Glu Thr Phe His
165 170 175
Leu Met Glu Gln Leu Ala Thr Gln Gly Ala Glu Ala Asn Ile Ser Met
180 185 190
Gln Leu Ile Gln Met Glu Gly Glu Lys Val Phe Tyr Lys Lys Ile Asn
195 200 205
Asp Asp Glu Phe Phe Met Ile Gly Asp Asn Arg Asp Asn Ser Ser Asp
210 215 220
Ser Arg Phe Trp Gly Ser Val Ala Tyr Lys Asn Ile Val Gly Ser Pro
225 230 235 240
Trp Phe Val Tyr Phe Ser Leu Ser Leu Lys Asn Ser Leu Glu Met Asp
245 250 255
Ala Glu Asn Asn Pro Lys Lys Arg Tyr Leu Val Arg Trp Glu Arg Met
260 265 270
Phe Lys Ser Val Gly Gly Leu Glu Lys Ile Ile Lys Lys Glu Asn Ala
275 280 285
Thr His
290
(2) INFORMATION FOR SEQ ID NO : 3 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 112...471
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 :
GGATTTTTTA GAGCTCTTAG TCAATGATAA TGTGGTAGAA ACGATTGAAA AAGGCTTTGT 60
GATAGGTTTT GGAGCGGGGG ATATTACCTA TCAATTAAGA GGCGAAATGT A ATG GGT 117
Met Gly
1
GCA GTG GTT GTT TTA TTT TTA ACG CTG GTT TTA TTG TTT TTA GTT TTA 165
Ala Val Val Val Leu Phe Leu Thr Leu Val Leu Leu Phe Leu Val Leu
5 10 15
AGG GAT TTT GGT TTA GCA AGC CCC AAA CAA AAG ATT TTA GCT TTT TTA 213
Arg Asp Phe Gly Leu Ala Ser Pro Lys Gln Lys Ile Leu Ala Phe Leu
20 25 30
ATC GTA GGG ATT ATA GGA GCG AGC ATC AGC GTT TAT ACT TAC AAG CAA 261
Ile Val Gly Ile Ile Gly Ala Ser Ile Ser Val Tyr Thr Tyr Lys Gln
35 40 45 50
AAC CAA CAA AAC CAA CAA GAG ATC GCT TTG CAA AGA GCG TTT TTA AGG 309
Asn Gln Gln Asn Gln Gln Glu Ile Ala Leu Gln Arg Ala Phe Leu Arg
55 60 65
GGG GAA ACC TTG TTG TGT AAA GGC ATT AAA GTC AAT AAC CAA ACC TTT 357
Gly Glu Thr Leu Leu Cys Lys Gly Ile Lys Val Asn Asn Gln Thr Phe
70 75 80
AAT TTA GTG AGC GGA ACT TTA AGC TTT TTA GGC AAA AAA CAA ACC CCT 405
Asn Leu Val Ser Gly Thr Leu Ser Phe Leu Gly Lys Lys Gln Thr Pro
85 90 95
ATG AAA GAC GTT CTT GTG GAT TTG GAT TCT TGT CAG ACG CTC CAA AAA 453
Met Lys Asp Val Leu Val Asp Leu Asp Ser Cys Gln Thr Leu Gln Lys
100 105 110
GAT CCC TTA ATC CAA CCC TAATGATGAA TAATAATAAT ACCCCACCCA AACCCCTA 509
Asp Pro Leu Ile Gln Pro
115 120
GAAGA 514
(2) INFORMATION FOR SEQ ID NO : 4 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 120 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 :
Met Gly Ala Val Val Val Leu Phe Leu Thr Leu Val Leu Leu Phe Leu
1 5 10 15
Val Leu Arg Asp Phe Gly Leu Ala Ser Pro Lys Gln Lys Ile Leu Ala
20 25 30
Phe Leu Ile Val Gly Ile Ile Gly Ala Ser Ile Ser Val Tyr Thr Tyr
35 40 45
Lys Gln Asn Gln Gln Asn Gln Gln Glu Ile Ala Leu Gln Arg Ala Phe
50 55 60
Leu Arg Gly Glu Thr Leu Leu Cys Lys Gly Ile Lys Val Asn Asn Gln
65 70 75 80
Thr Phe Asn Leu Val Ser Gly Thr Leu Ser Phe Leu Gly Lys Lys Gln
85 90 95
Thr Pro Met Lys Asp Val Leu Val Asp Leu Asp Ser Cys Gln Thr Leu
100 105 110
Gln Lys Asp Pro Leu Ile Gln Pro
115 120
(2) INFORMATION FOR SEQ ID NO : 5 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1233 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 135...1049
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 :
GTTTTTAATT TAATATTCAT TAAGCTTTTG TGGCTATTCC ATTTTAATTT TGTTTTTCAT 60
TAAAACCCAA TCTAAAATCT TATTTTTATG ATAAAATACC TAATCATAAT ATCAAATCTT 120
AAACCAACGA AACC ATG AAA AAA GCT CTC TTA CTA ACT CTC TCT CTC TCG 170
Met Lys Lys Ala Leu Leu Leu Thr Leu Ser Leu Ser
1 5 10
TTC TGG CTC CAC GCT GAA AGG AAT GGA TTT TAT TTA GGT TTA AAT TTT 218
Phe Trp Leu His Ala Glu Arg Asn Gly Phe Tyr Leu Gly Leu Asn Phe
15 20 25
CTA GAA GGA AGC TAT ATT AAA GGA CAA GGT AGC ATC GGC AAA AAA GCT 266
Leu Glu Gly Ser Tyr Ile Lys Gly Gln Gly Ser Ile Gly Lys Lys Ala
30 35 40
TCA GCA GAA AAC GCC TTA AAT GAA GCG ATC AAT AAC GCA AAA AAT TCA 314
Ser Ala Glu Asn Ala Leu Asn Glu Ala Ile Asn Asn Ala Lys Asn Ser
45 50 55 60
TTA TTC CCT AAC ACA AAA GCC ATA AGA GAT GCA CAA AAC GCC TTA AAT 362
Leu Phe Pro Asn Thr Lys Ala Ile Arg Asp Ala Gln Asn Ala Leu Asn
65 70 75
GCA GTG AAA GAT TCA AAC AAA ATC GCT AGC CGA TTC GCA GGA AAT GGT 410
Ala Val Lys Asp Ser Asn Lys Ile Ala Ser Arg Phe Ala Gly Asn Gly
80 85 90
GGA TCG GGC GGT CTT TTT AAT GAG CTC AGC TTT GGG TAT AAA TAT TTT 458
Gly Ser Gly Gly Leu Phe Asn Glu Leu Ser Phe Gly Tyr Lys Tyr Phe
95 100 105
TTG GGT AAA AAA AGG ATT ATA GGG TTT AGG CAC TCT CTT TTT TTC GGT 506
Leu Gly Lys Lys Arg Ile Ile Gly Phe Arg His Ser Leu Phe Phe Gly
110 115 120
TAC CAA CTT GGT GGC GTT GGT TCT GTT CCT GGT AGC GGT TTA ATC GTT 554
Tyr Gln Leu Gly Gly Val Gly Ser Val Pro Gly Ser Gly Leu Ile Val
125 130 135 140
TTT TTA CCC TAT GGT TTC AAT ACG GAT TTG CTC ATT AAT TGG ACT AAC 602
Phe Leu Pro Tyr Gly Phe Asn Thr Asp Leu Leu Ile Asn Trp Thr Asn
145 150 155
GAT AAG CGA GCG TCC CAA AAA TAT GTT GAA CGA AGG GTA AAA GGG CTC 650
Asp Lys Arg Ala Ser Gln Lys Tyr Val Glu Arg Arg Val Lys Gly Leu
160 165 170
TCT ATA TTT TAC AAA GAT ATG ACC GGC AGA ACG CTA GAC GCT AAT ACA 698
Ser Ile Phe Tyr Lys Asp Met Thr Gly Arg Thr Leu Asp Ala Asn Thr
175 180 185
TTA AAA AAA GCA TCA AGG CAT GTA TTT AGA AAA TCT TCA GGG CTT GTG 746
Leu Lys Lys Ala Ser Arg His Val Phe Arg Lys Ser Ser Gly Leu Val
190 195 200
ATT GGC ATG GAA CTA GGG GGT AGC ACT TGG TTT GCA AGT AAC AAT CTC 794
Ile Gly Met Glu Leu Gly Gly Ser Thr Trp Phe Ala Ser Asn Asn Leu
205 210 215 220
ACC CCT TTC AAT CAA GTC AAG AGT CGC ACG ATT TTT CAG TTG CAA GGA 842
Thr Pro Phe Asn Gln Val Lys Ser Arg Thr Ile Phe Gln Leu Gln Gly
225 230 235
AAA TTT GGC GTT CGT TGG AAT AAT GAT GAA TAC GAT ATT GAT CGC TAT 890
Lys Phe Gly Val Arg Trp Asn Asn Asp Glu Tyr Asp Ile Asp Arg Tyr
240 245 250
GGC GAT GAA ATC TAT CTT GGA GGT TCT AGT GTT GAA TTA GGG GTT AAA 938
Gly Asp Glu Ile Tyr Leu Gly Gly Ser Ser Val Glu Leu Gly Val Lys
255 260 265
GTG CCA GCG TTT AAA GTC AAT TAC TAT AGC GAT GAT TAT GGG GAT AAA 986
Val Pro Ala Phe Lys Val Asn Tyr Tyr Ser Asp Asp Tyr Gly Asp Lys
270 275 280
TTG GAT TAT AAA AGA GTG GTG AGC GTT TAT CTT AAC TAT ACA TAT AAC 1034
Leu Asp Tyr Lys Arg Val Val Ser Val Tyr Leu Asn Tyr Thr Tyr Asn
285 290 295 300
TTT AAA AAC AAA CAT TAAAACACGC TTTTTACCGC TCTTTAGTTG GTTTTTTAAA A 1090
Phe Lys Asn Lys His
305
AACCTTATTT TTTATTAGCT TGAAACTCTT CAAAGCCTTT TTTTCTCAAT TGGCATGCCG 1150
GGCATTTATC GCAACCATAA CCATAAGCAT GCAAAATCTT TCGCTCTCCT TGATAGCAGG 1210
TGTGCGTTTC TTTGATGACT AAA 1233
(2) INFORMATION FOR SEQ ID NO : 6 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 :
Met Lys Lys Ala Leu Leu Leu Thr Leu Ser Leu Ser Phe Trp Leu His
1 5 10 15
Ala Glu Arg Asn Gly Phe Tyr Leu Gly Leu Asn Phe Leu Glu Gly Ser
20 25 30
Tyr Ile Lys Gly Gln Gly Ser Ile Gly Lys Lys Ala Ser Ala Glu Asn
35 40 45
Ala Leu Asn Glu Ala Ile Asn Asn Ala Lys Asn Ser Leu Phe Pro Asn
50 55 60
Thr Lys Ala Ile Arg Asp Ala Gln Asn Ala Leu Asn Ala Val Lys Asp
65 70 75 80
Ser Asn Lys Ile Ala Ser Arg Phe Ala Gly Asn Gly Gly Ser Gly Gly
85 90 95
Leu Phe Asn Glu Leu Ser Phe Gly Tyr Lys Tyr Phe Leu Gly Lys Lys
100 105 110
Arg Ile Ile Gly Phe Arg His Ser Leu Phe Phe Gly Tyr Gln Leu Gly
115 120 125
Gly Val Gly Ser Val Pro Gly Ser Gly Leu Ile Val Phe Leu Pro Tyr
130 135 140
Gly Phe Asn Thr Asp Leu Leu Ile Asn Trp Thr Asn Asp Lys Arg Ala
145 150 155 160
Ser Gln Lys Tyr Val Glu Arg Arg Val Lys Gly Leu Ser Ile Phe Tyr
165 170 175
Lys Asp Met Thr Gly Arg Thr Leu Asp Ala Asn Thr Leu Lys Lys Ala
180 185 190
Ser Arg His Val Phe Arg Lys Ser Ser Gly Leu Val Ile Gly Met Glu
195 200 205
Leu Gly Gly Ser Thr Trp Phe Ala Ser Asn Asn Leu Thr Pro Phe Asn
210 215 220
Gln Val Lys Ser Arg Thr Ile Phe Gln Leu Gln Gly Lys Phe Gly Val
225 230 235 240
Arg Trp Asn Asn Asp Glu Tyr Asp Ile Asp Arg Tyr Gly Asp Glu Ile
245 250 255
Tyr Leu Gly Gly Ser Ser Val Glu Leu Gly Val Lys Val Pro Ala Phe
260 265 270
Lys Val Asn Tyr Tyr Ser Asp Asp Tyr Gly Asp Lys Leu Asp Tyr Lys
275 280 285
Arg Val Val Ser Val Tyr Leu Asn Tyr Thr Tyr Asn Phe Lys Asn Lys
290 295 300
His
305
(2) INFORMATION FOR SEQ ID NO : 7 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3012 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 142...2682
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 :
AATGACGGCT CTAAACCAAA CGATTTGACT TCTCCAAAAG AAGCCTCTCA AGAATCTCAA 60
AAAAATGAAG CTCCAAAAAA TGAAGTTCAA AGAAATGAAG CTCAAAAAGA AACCCCCCAA 120
TCCAATCAAA CGCCTAAAGA A ATG AAA GTC AAG TCC ATT TCT TAT GTC GGG 171
Met Lys Val Lys Ser lie Ser Tyr Val Gly 1 5 10
CTT TCT TAC ATG TCT GAC ATG CTC GCT AAT GAA ATT GTA AAG ATT CGT 219 Leu Ser Tyr Met Ser Asp Met Leu Ala Asn Glu lie Val Lys lie Arg 15 20 25
GTG GGC GAT ATT GTG GAT TCT AAA AAA ATA GAC ACC GCT GTT TTG GCT 267 Val Gly Asp lie Val Asp Ser Lys Lys lie Asp Thr Ala Val Leu Ala 30 35 40
TTG TTC AAT CAA GGG TAT TTT AAA GAC GTT TAT GCC ACT TTT GAA GGC 315 Leu Phe Asn Gin Gly Tyr Phe Lys Asp Val Tyr Ala Thr Phe Glu Gly 45 50 55
- 83 -
GGC ATA TTA GAG TTT CAT TTT GAT GAA AAA GCC AGG ATT GCC GGG GTA 363 Gly lie Leu Glu Phe His Phe Asp Glu Lys Ala Arg lie Ala Gly Val 60 65 70
GAA ATC AAG GGT TAT GGG ACT GAA AAG GAA AAA GAC GGC TTA AAA TCC 411 Glu lie Lys Gly Tyr Gly Thr Glu Lys Glu Lys Asp Gly Leu Lys Ser 75 80 85 90
CAA ATG GGG ATC AAA AAG GGC GAC ACC TTT GAT GAG CAA AAA TTA GAG 459 Gin Met Gly lie Lys Lys Gly Asp Thr Phe Asp Glu Gin Lys Leu Glu 95 100 105
CAT GCT AAA ACG GCT TTA AAA ACC GCT TTA GAG GGG CAG GGC TAT TAT 507 His Ala Lys Thr Ala Leu Lys Thr Ala Leu Glu Gly Gin Gly Tyr Tyr 110 115 120
GGG AGC GTG GTG GAG GTG CGC ACA GAA AAG GTC AGT GAG GGT GCA TTA 555 Gly Ser Val Val Glu Val Arg Thr Glu Lys Val Ser Glu Gly Ala Leu 125 130 135
TTG ATC GTG TTT GAT GTG AAT AGG GGG GAT AGC ATT TAT ATC AAA CAA 603 Leu lie Val Phe Asp Val Asn Arg Gly Asp Ser lie Tyr lie Lys Gin 140 145 150
TCC ATT TAT GAG GGA AGC GCG AAA TTA AAA CGC CGC ATG ATT GAA TCT 651 Ser lie Tyr Glu Gly Ser Ala Lys Leu Lys Arg Arg Met lie Glu Ser 155 160 165 170
TTG AGT GCG AAC AAG CAA CGA GAT TTC ATG GGC TGG ATG TGG GGC TTG 699 Leu Ser Ala Asn Lys Gin Arg Asp Phe Met Gly Trp Met Trp Gly Leu 175 180 185
AAT GAC GGG AAA TTG CGT TTA GAT CAA CTA GAA TAC GAT TCT ATG CGT 747 Asn Asp Gly Lys Leu Arg Leu Asp Gin Leu Glu Tyr Asp Ser Met Arg 190 195 200
ATC CAA GAT GTG TAT ATG CGT AGG GGT TAC TTA GAC GCT CAT ATT TCT 795 lie Gin Asp Val Tyr Met Arg Arg Gly Tyr Leu Asp Ala His lie Ser 205 210 215
TCG CCT TTT TTG AAA ACG GAT TTT TCT ACC CAT GAC GCT AAG CTT CAT 843 Ser Pro Phe Leu Lys Thr Asp Phe Ser Thr His Asp Ala Lys Leu His 220 225 230
TAT AAA GTC AAA GAG GGG ATC CAA TAC AGG ATT TCA GAC ATT TTA ATA 891 Tyr Lys Val Lys Glu Gly lie Gin Tyr Arg lie Ser Asp lie Leu lie 235 240 245 250
GAG ATT GAC AAC CCG GTA GTC CCC TTA AAA ACC TTA GAA AAA GCG CTT 939 Glu lie Asp Asn Pro Val Val Pro Leu Lys Thr Leu Glu Lys Ala Leu 255 260 265
AAA GTG AAA AGG AAA GAT GTC TTT AAT ATT GAG CAT TTA AGA GCG GAT 987 Lys Val Lys Arg Lys Asp Val Phe Asn lie Glu His Leu Arg Ala Asp 270 275 280
- 84 -
GCG CAA ATT TTA AAA ACC GAA ATC GCC GAT AAG GGT TAT GCG TTT GCG 1035 Ala Gln Ile Leu Lys Thr Glu Ile Ala Asp Lys Gly Tyr Ala Phe Ala 285 290 295
GTG GTG AAG CCA GAC TTG GAT AAA GAT GAA AAA AAC GGG CTT GTG AAA 1083 Val Val Lys Pro Asp Leu Asp Lys Asp Glu Lys Asn Gly Leu Val Lys 300 305 310
GTC ATT TAT CGT ATT GAA GTG GGC GAT ATG GTG TAT ATC AAT GAT GTC 1131 Val Ile Tyr Arg Ile Glu Val Gly Asp Met Val Tyr Ile Asn Asp Val 315 320 325 330
ATC ATT TCA GGG AAC CAG CGC ACG AGC GAT AGG ATC ATT AGA AGG GAG 1179 Ile Ile Ser Gly Asn Gln Arg Thr Ser Asp Arg Ile Ile Arg Arg Glu 335 340 345
TTA TTG TTA GGG CCT AAG GAT AAA TAC AAC TTG ACC AAA CTG AGA AAT 1227 Leu Leu Leu Gly Pro Lys Asp Lys Tyr Asn Leu Thr Lys Leu Arg Asn 350 355 360
TCC GAA AAT TCT TTA AGG CGT TTA GGA TTC TTC TCT AAA GTC AAA ATT 1275 Ser Glu Asn Ser Leu Arg Arg Leu Gly Phe Phe Ser Lys Val Lys Ile 365 370 375
GAA GAA AAA AGG GTT AAT AGC TCA CTC ATG GAT TTA TTA GTG AGC GTA 1323 Glu Glu Lys Arg Val Asn Ser Ser Leu Met Asp Leu Leu Val Ser Val 380 385 390
GAA GAG GGG CGT ACT GGG CAG TTG CAA TTT GGG TTA GGC TAT GGC TCT 1371 Glu Glu Gly Arg Thr Gly Gln Leu Gln Phe Gly Leu Gly Tyr Gly Ser 395 400 405 410
TAT GGA GGG CTT ATG CTT AAT GGG AGC GTG AGC GAA AGA AAC CTT TTT 1419
Tyr Gly Gly Leu Met Leu Asn Gly Ser Val Ser Glu Arg Asn Leu Phe
415 420 425
GGC ACA GGG CAA AGC ATG AGC TTG TAT GCT AAC ATC GCT ACA GGG GGG 1467
Gly Thr Gly Gln Ser Met Ser Leu Tyr Ala Asn Ile Ala Thr Gly Gly
430 435 440
GGT AGA TCT TAT CCG GGC ATG CCA AAA GGA GCG GGG CGT ATG TTT GCC 1515 Gly Arg Ser Tyr Pro Gly Met Pro Lys Gly Ala Gly Arg Met Phe Ala 445 450 455
GGG AAT TTG AGC TTG ACT AAT CCA AGG ATT TTT GAC AGC TGG TAT AGC 1563 Gly Asn Leu Ser Leu Thr Asn Pro Arg Ile Phe Asp Ser Trp Tyr Ser 460 465 470
TCT ACG ATC AAC CTT TAT GCG GAT TAC AGG ATA AGC TAC CAA TAC ATC 1611 Ser Thr Ile Asn Leu Tyr Ala Asp Tyr Arg Ile Ser Tyr Gln Tyr Ile 475 480 485 490
CAA CAA GGC GGG GGC TTT GGG GTG AAT GTC GGG CGC ATG CTG GGT AAT 1659 Gln Gln Gly Gly Gly Phe Gly Val Asn Val Gly Arg Met Leu Gly Asn 495 500 505
AGA ACC CAT GTG AGC TTA GGG TAT AAC TTG AAT GTT ACC AAA CTC CTT 1707
Arg Thr His Val Ser Leu Gly Tyr Asn Leu Asn Val Thr Lys Leu Leu 510 515 520
GGT TTC AGC AGC CCT TTA TAC AAC CGC TAC TAT TCC TCT GTT AAT GAA 1755
Gly Phe Ser Ser Pro Leu Tyr Asn Arg Tyr Tyr Ser Ser Val Asn Glu
525 530 535
GTG GTT TCT CCA AGG CAA TGT TCT ACC CCC GCA TCG GTG ATT ATC AAT 1803
Val Val Ser Pro Arg Gln Cys Ser Thr Pro Ala Ser Val Ile Ile Asn 540 545 550
CGC TTA TCA GGC GGT AAA ACC CCC TTA CAA CCT GAA AGC TGT TCT AGT 1851
Arg Leu Ser Gly Gly Lys Thr Pro Leu Gln Pro Glu Ser Cys Ser Ser 555 560 565 570
CCT GGA GCG ATC ACC ACT TCA CCA GAA ATA AGA GGT ATT TGG GAT AGG 1899
Pro Gly Ala Ile Thr Thr Ser Pro Glu Ile Arg Gly Ile Trp Asp Arg 575 580 585
GAT TAC CAT ACG CCT ATC ACC AGC TCT TTC ACC CTT GAT GTG AGC TAT 1947
Asp Tyr His Thr Pro Ile Thr Ser Ser Phe Thr Leu Asp Val Ser Tyr 590 595 600
GAC AAC ACC GAT GAT TAT TAC TTC CCT AGA AAT GGG GTT ATC TTT AGT 1995
Asp Asn Thr Asp Asp Tyr Tyr Phe Pro Arg Asn Gly Val Ile Phe Ser
605 610 615
TCC TAT GCG ACG ATG TCT GGC TTG CCA AGC TCT GGC ACG CTC AAT TCT 2043
Ser Tyr Ala Thr Met Ser Gly Leu Pro Ser Ser Gly Thr Leu Asn Ser 620 625 630
TGG AAC GGG TTA GGC GGG AAT GTC CGT AAC ACC AAA GTT TAT GGT AAA 2091
Trp Asn Gly Leu Gly Gly Asn Val Arg Asn Thr Lys Val Tyr Gly Lys 635 640 645 650
TTC GCC GCT TAC CAC CAT TTG CAA AAA TAT TTA TTG ATA GAT TTG ATC 2139
Phe Ala Ala Tyr His His Leu Gln Lys Tyr Leu Leu Ile Asp Leu Ile 655 660 665
GCT CGC TTT AAA ACG CAA GGA GGT TAT ATC TTT AGG TAT AAC ACC GAT 2187
Ala Arg Phe Lys Thr Gln Gly Gly Tyr Ile Phe Arg Tyr Asn Thr Asp 670 675 680
GAT TAC TTG CCC TTA AAC TCC ACC TTC TAC ATG GGG GGC GTA ACC ACG 2235
Asp Tyr Leu Pro Leu Asn Ser Thr Phe Tyr Met Gly Gly Val Thr Thr
685 690 695
GTG AGA GGC TTT AGG AAC GGA TCG GTT ACT CCT AAA GAT GAG TTT GGC 2283
Val Arg Gly Phe Arg Asn Gly Ser Val Thr Pro Lys Asp Glu Phe Gly 700 705 710
TTG TGG CTT GGA GGC GAT GGG ATT TTT ACC GCT TCT ACT GAA TTG AGC 2331
Leu Trp Leu Gly Gly Asp Gly Ile Phe Thr Ala Ser Thr Glu Leu Ser 715 720 725 730
TAT GGG GTG CTA AAG GCG GCT AAA ATG CGC TTA GCG TGG TTT TTT GAC 2379 Tyr Gly Val Leu Lys Ala Ala Lys Met Arg Leu Ala Trp Phe Phe Asp 735 740 745
TTT GGT TTC TTA ACC TTT AAA ACC CCA ACT AGA GGG AGT TTT TTC TAT 2427 Phe Gly Phe Leu Thr Phe Lys Thr Pro Thr Arg Gly Ser Phe Phe Tyr 750 755 760
AAC GCT CCT GTT ACG ACA GCG AAT TTT AAA GAT TAT GGC GTT ATA GGG 2475 Asn Ala Pro Val Thr Thr Ala Asn Phe Lys Asp Tyr Gly Val Ile Gly 765 770 775
GCT GGG TTT GAA AGA GCG ACT TGG AGG GCT TCC ACA GGC TTG CAG ATT 2523 Ala Gly Phe Glu Arg Ala Thr Trp Arg Ala Ser Thr Gly Leu Gln Ile 780 785 790
GAA TGG ATT TCG CCC ATG GGG CCT TTG GTG TTG ATT TTC CCT ATA GCG 2571 Glu Trp Ile Ser Pro Met Gly Pro Leu Val Leu Ile Phe Pro Ile Ala 795 800 805 810
TTT TTC AAC CAA TGG GGC GAT GGC AAT GGC AAG AAA TGT AAA GGG CTA 2619 Phe Phe Asn Gln Trp Gly Asp Gly Asn Gly Lys Lys Cys Lys Gly Leu 815 820 825
TGC TTC AAC CCT AAC ATG GAC GAT TAC ACG CAA CAC TTT GAA TTT TCT 2667 Cys Phe Asn Pro Asn Met Asp Asp Tyr Thr Gln His Phe Glu Phe Ser 830 835 840
ATG GGA ACA AGG TTT TAAAATGCGC ATCAACAGAG AAGAAATTTT GGATTTAATG A 2723 Met Gly Thr Arg Phe 845
AAAACGCGCC CTTGAAAGAA TTGGGGCAAA GGGCTTTGAG GGTGAAGCAA CGCTTGCACC 2783
CTGAAAACTT GACGACTTTT ATTGTGGATA GGAATATCAA TTACACCAAT ATTTGTTTTG 2843
TGGATTGCAA GTTTTGCGCG TTCAAACGCA CCTTAAAAGA AAAAGACGCC TATGTGTTGA 2903
GCTATGAAGA AATTGATCAA AAGATTGAAG AATTGCTCGC TATTGGCGGC ACGCAGATCC 2963
TTTTTCAAGG GGGGGTGCAC CCGCAGCTAA AGATTGATTA TTATGAGAA 3012
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 847 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Met Lys Val Lys Ser Ile Ser Tyr Val Gly Leu Ser Tyr Met Ser Asp
1 5 10 15
Met Leu Ala Asn Glu Ile Val Lys Ile Arg Val Gly Asp Ile Val Asp 20 25 30
Ser Lys Lys Ile Asp Thr Ala Val Leu Ala Leu Phe Asn Gln Gly Tyr
35 40 45
Phe Lys Asp Val Tyr Ala Thr Phe Glu Gly Gly Ile Leu Glu Phe His
50 55 60
Phe Asp Glu Lys Ala Arg Ile Ala Gly Val Glu Ile Lys Gly Tyr Gly 65 70 75 80
Thr Glu Lys Glu Lys Asp Gly Leu Lys Ser Gln Met Gly Ile Lys Lys
85 90 95
Gly Asp Thr Phe Asp Glu Gln Lys Leu Glu His Ala Lys Thr Ala Leu
100 105 110
Lys Thr Ala Leu Glu Gly Gln Gly Tyr Tyr Gly Ser Val Val Glu Val
115 120 125
Arg Thr Glu Lys Val Ser Glu Gly Ala Leu Leu Ile Val Phe Asp Val
130 135 140
Asn Arg Gly Asp Ser Ile Tyr Ile Lys Gln Ser Ile Tyr Glu Gly Ser 145 150 155 160
Ala Lys Leu Lys Arg Arg Met Ile Glu Ser Leu Ser Ala Asn Lys Gln
165 170 175
Arg Asp Phe Met Gly Trp Met Trp Gly Leu Asn Asp Gly Lys Leu Arg
180 185 190
Leu Asp Gln Leu Glu Tyr Asp Ser Met Arg Ile Gln Asp Val Tyr Met
195 200 205
Arg Arg Gly Tyr Leu Asp Ala His Ile Ser Ser Pro Phe Leu Lys Thr
210 215 220
Asp Phe Ser Thr His Asp Ala Lys Leu His Tyr Lys Val Lys Glu Gly 225 230 235 240
Ile Gln Tyr Arg Ile Ser Asp Ile Leu Ile Glu Ile Asp Asn Pro Val
245 250 255
Val Pro Leu Lys Thr Leu Glu Lys Ala Leu Lys Val Lys Arg Lys Asp
260 265 270
Val Phe Asn Ile Glu His Leu Arg Ala Asp Ala Gln Ile Leu Lys Thr
275 280 285
Glu Ile Ala Asp Lys Gly Tyr Ala Phe Ala Val Val Lys Pro Asp Leu
290 295 300
Asp Lys Asp Glu Lys Asn Gly Leu Val Lys Val Ile Tyr Arg Ile Glu 305 310 315 320
Val Gly Asp Met Val Tyr Ile Asn Asp Val Ile Ile Ser Gly Asn Gln
325 330 335
Arg Thr Ser Asp Arg Ile Ile Arg Arg Glu Leu Leu Leu Gly Pro Lys
340 345 350
Asp Lys Tyr Asn Leu Thr Lys Leu Arg Asn Ser Glu Asn Ser Leu Arg
355 360 365
Arg Leu Gly Phe Phe Ser Lys Val Lys Ile Glu Glu Lys Arg Val Asn
370 375 380
Ser Ser Leu Met Asp Leu Leu Val Ser Val Glu Glu Gly Arg Thr Gly 385 390 395 400
Gln Leu Gln Phe Gly Leu Gly Tyr Gly Ser Tyr Gly Gly Leu Met Leu
405 410 415
Asn Gly Ser Val Ser Glu Arg Asn Leu Phe Gly Thr Gly Gln Ser Met
420 425 430
Ser Leu Tyr Ala Asn Ile Ala Thr Gly Gly Gly Arg Ser Tyr Pro Gly
435 440 445
Met Pro Lys Gly Ala Gly Arg Met Phe Ala Gly Asn Leu Ser Leu Thr
450 455 460
Asn Pro Arg Ile Phe Asp Ser Trp Tyr Ser Ser Thr Ile Asn Leu Tyr
465 470 475 480
Ala Asp Tyr Arg Ile Ser Tyr Gln Tyr Ile Gln Gln Gly Gly Gly Phe
485 490 495
Gly Val Asn Val Gly Arg Met Leu Gly Asn Arg Thr His Val Ser Leu
500 505 510
Gly Tyr Asn Leu Asn Val Thr Lys Leu Leu Gly Phe Ser Ser Pro Leu
515 520 525
Tyr Asn Arg Tyr Tyr Ser Ser Val Asn Glu Val Val Ser Pro Arg Gln
530 535 540
Cys Ser Thr Pro Ala Ser Val Ile Ile Asn Arg Leu Ser Gly Gly Lys 545 550 555 560
Thr Pro Leu Gln Pro Glu Ser Cys Ser Ser Pro Gly Ala Ile Thr Thr
565 570 575
Ser Pro Glu Ile Arg Gly Ile Trp Asp Arg Asp Tyr His Thr Pro Ile
580 585 590
Thr Ser Ser Phe Thr Leu Asp Val Ser Tyr Asp Asn Thr Asp Asp Tyr
595 600 605
Tyr Phe Pro Arg Asn Gly Val Ile Phe Ser Ser Tyr Ala Thr Met Ser
610 615 620
Gly Leu Pro Ser Ser Gly Thr Leu Asn Ser Trp Asn Gly Leu Gly Gly 625 630 635 640
Asn Val Arg Asn Thr Lys Val Tyr Gly Lys Phe Ala Ala Tyr His His
645 650 655
Leu Gln Lys Tyr Leu Leu Ile Asp Leu Ile Ala Arg Phe Lys Thr Gln
660 665 670
Gly Gly Tyr Ile Phe Arg Tyr Asn Thr Asp Asp Tyr Leu Pro Leu Asn
675 680 685
Ser Thr Phe Tyr Met Gly Gly Val Thr Thr Val Arg Gly Phe Arg Asn
690 695 700
Gly Ser Val Thr Pro Lys Asp Glu Phe Gly Leu Trp Leu Gly Gly Asp 705 710 715 720
Gly Ile Phe Thr Ala Ser Thr Glu Leu Ser Tyr Gly Val Leu Lys Ala
725 730 735
Ala Lys Met Arg Leu Ala Trp Phe Phe Asp Phe Gly Phe Leu Thr Phe
740 745 750
Lys Thr Pro Thr Arg Gly Ser Phe Phe Tyr Asn Ala Pro Val Thr Thr
755 760 765
Ala Asn Phe Lys Asp Tyr Gly Val Ile Gly Ala Gly Phe Glu Arg Ala
770 775 780
Thr Trp Arg Ala Ser Thr Gly Leu Gln Ile Glu Trp Ile Ser Pro Met 785 790 795 800
Gly Pro Leu Val Leu Ile Phe Pro Ile Ala Phe Phe Asn Gln Trp Gly
805 810 815
Asp Gly Asn Gly Lys Lys Cys Lys Gly Leu Cys Phe Asn Pro Asn Met
820 825 830
Asp Asp Tyr Thr Gln His Phe Glu Phe Ser Met Gly Thr Arg Phe 835 840 845
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1032 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 149...913
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
ATGTTTTGTG TTGCAAAAAC AAAACAGACC AATAAAGGCA TCACTTTTAA AAGCGTTGTT 60
TAGGGGGGTT TGGTTATTGG TGTTTGATTA GAATAGGGTT GTTTTTAATT TTCTTTTAAG 120
AGGAGTTTTT ACTTTTTTAA GGGTTTTT ATG GAT ATT TAT GCG TTA TAT ATA 172
Met Asp Ile Tyr Ala Leu Tyr Ile 1 5
GCG ATA GGG CTT TTT ACT GGC ATT CTA TCA GGG ATT TTT GGC ATT GGT 220 Ala Ile Gly Leu Phe Thr Gly Ile Leu Ser Gly Ile Phe Gly Ile Gly 10 15 20
GGG GGG TTG ATC ATT GTC CCT ATC ATG CTC GCA ACC GGG CAT TCT TTT 268 Gly Gly Leu Ile Ile Val Pro Ile Met Leu Ala Thr Gly His Ser Phe 25 30 35 40
GAA GAA TCC ATT GGG ATT TCC ATT TTG CAA ATG GCG CTT TCA TCG TTC 316 Glu Glu Ser Ile Gly Ile Ser Ile Leu Gln Met Ala Leu Ser Ser Phe 45 50 55
GTG GGC TCT GTT TTG AAT TTC AAA AAA AAA TCG CTT GAT TTT TCT TTA 364 Val Gly Ser Val Leu Asn Phe Lys Lys Lys Ser Leu Asp Phe Ser Leu 60 65 70
GGC TTG TTG ATA GGG GCA GGG GGG CTG ATA GGG GCG AGT TTT AGC GGA 412 Gly Leu Leu Ile Gly Ala Gly Gly Leu Ile Gly Ala Ser Phe Ser Gly 75 80 85
TTT GTT TTA AAA ATC GTT TCC AGT AAA ATT TTA ATG GTT ATT TTC GCG 460 Phe Val Leu Lys Ile Val Ser Ser Lys Ile Leu Met Val Ile Phe Ala 90 95 100
CTT TTA GTC GTG TAT TCT ATG ATC CAA TTT GTT TTG AAA CCC AAA AAA 508 Leu Leu Val Val Tyr Ser Met Ile Gln Phe Val Leu Lys Pro Lys Lys 105 110 115 120
AAA GAT TTG ATA GCG GAT ACT AAA CGC TAT CAT CTG CAA GGT TTG AAA 556 Lys Asp Leu Ile Ala Asp Thr Lys Arg Tyr His Leu Gln Gly Leu Lys 125 130 135
TTA TTT TTA ATT GGC ACG CTC ACA GGG TTT TTT GCT ATC ACT TTA GGG 604 Leu Phe Leu Ile Gly Thr Leu Thr Gly Phe Phe Ala Ile Thr Leu Gly 140 145 150
ATT GGT GGG GGG ATG CTC ATG GTG CCT TTG ATG CAT TAT TTT TTA GGG 652 Ile Gly Gly Gly Met Leu Met Val Pro Leu Met His Tyr Phe Leu Gly 155 160 165
TAT GAT TCT AAA AAA TGC GTG GCT CTA GGG TTA TTT TTC ATC TTG TTT 700 Tyr Asp Ser Lys Lys Cys Val Ala Leu Gly Leu Phe Phe Ile Leu Phe 170 175 180
TCT TCT ATT TCA GGA GCT TTT TCT TTA ATG TAT CAC CAC ATC ATC AAT 748 Ser Ser Ile Ser Gly Ala Phe Ser Leu Met Tyr His His Ile Ile Asn 185 190 195 200
AAA GAA GTG CTC TTA GCA GGG GCG ATT GTG GGA TTA GGA TCT GTT ATG 796 Lys Glu Val Leu Leu Ala Gly Ala Ile Val Gly Leu Gly Ser Val Met 205 210 215
GGC GTG AGC ATT GGG ATT AAA TGG ATC ATG GGG CTT TTG AAT GAA AAA 844 Gly Val Ser Ile Gly Ile Lys Trp Ile Met Gly Leu Leu Asn Glu Lys 220 225 230
ATG CAT AAA GCT TTG ATT TTA GGG GTG TAT GGT TTG TCG CTA TTG ATT 892 Met His Lys Ala Leu Ile Leu Gly Val Tyr Gly Leu Ser Leu Leu Ile 235 240 245
GTT TTA TAC AAA CTC TTT TTT TAATTGATGG TTTTATACCA CTACTATTTT AAGA 947 Val Leu Tyr Lys Leu Phe Phe 250 255
CCCCTAAGAG TTTCCCTTTA GAGTATTTGC ATTTGTGCGC TAATGAGAGC CATTTATTGA 1007
GATTGGATTT TGATGCGGCC AATTT 1032
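As an illustrative consistency check (not part of the original listing), the coding-sequence LOCATION of a nucleotide entry can be compared with the LENGTH stated for the corresponding protein entry. The sketch below assumes that the coordinates are 1-based and inclusive and that the stop codon lies outside the stated range; the function name cds_codon_count is hypothetical.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in a coding range.

    Assumes 1-based, inclusive coordinates, as the LOCATION fields of
    this listing appear to use (an assumption, not stated in the text).
    """
    length = end - start + 1
    if length <= 0:
        raise ValueError("end must not precede start")
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    # SEQ ID NO: 9 gives LOCATION 149...913; SEQ ID NO: 10 gives 255 amino acids.
    print(cds_codon_count(149, 913))
```

Under these assumptions the range 149...913 spans 765 nucleotides, that is 255 codons, which agrees with the 255 amino acids stated for SEQ ID NO: 10.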
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Met Asp Ile Tyr Ala Leu Tyr Ile Ala Ile Gly Leu Phe Thr Gly Ile
1 5 10 15
Leu Ser Gly Ile Phe Gly Ile Gly Gly Gly Leu Ile Ile Val Pro Ile
20 25 30
Met Leu Ala Thr Gly His Ser Phe Glu Glu Ser Ile Gly Ile Ser Ile
35 40 45
Leu Gln Met Ala Leu Ser Ser Phe Val Gly Ser Val Leu Asn Phe Lys
50 55 60
Lys Lys Ser Leu Asp Phe Ser Leu Gly Leu Leu Ile Gly Ala Gly Gly
65 70 75 80
Leu Ile Gly Ala Ser Phe Ser Gly Phe Val Leu Lys Ile Val Ser Ser
85 90 95
Lys Ile Leu Met Val Ile Phe Ala Leu Leu Val Val Tyr Ser Met Ile
100 105 110
Gln Phe Val Leu Lys Pro Lys Lys Lys Asp Leu Ile Ala Asp Thr Lys
115 120 125
Arg Tyr His Leu Gln Gly Leu Lys Leu Phe Leu Ile Gly Thr Leu Thr
130 135 140
Gly Phe Phe Ala Ile Thr Leu Gly Ile Gly Gly Gly Met Leu Met Val 145 150 155 160
Pro Leu Met His Tyr Phe Leu Gly Tyr Asp Ser Lys Lys Cys Val Ala
165 170 175
Leu Gly Leu Phe Phe Ile Leu Phe Ser Ser Ile Ser Gly Ala Phe Ser
180 185 190
Leu Met Tyr His His Ile Ile Asn Lys Glu Val Leu Leu Ala Gly Ala
195 200 205
Ile Val Gly Leu Gly Ser Val Met Gly Val Ser Ile Gly Ile Lys Trp
210 215 220
Ile Met Gly Leu Leu Asn Glu Lys Met His Lys Ala Leu Ile Leu Gly 225 230 235 240
Val Tyr Gly Leu Ser Leu Leu Ile Val Leu Tyr Lys Leu Phe Phe 245 250 255
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1057 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 66...980
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AAGAGCATGC GAGAGAGCAT AGAGGAATTT TTTAATCAAG AAATGTTGCA AAGTGAAGTG 60
CCGTT ATG GGT AGA ATT GAA TCA AAA AAG CGT TTG AAA GCG CTT GTT TTT 110
Met Gly Arg Ile Glu Ser Lys Lys Arg Leu Lys Ala Leu Val Phe
1 5 10 15
TTA GCC AGC TTG GGG GTT TTG TGG GGC AAT AGC GCT GAA AAA ACG CCT 158 Leu Ala Ser Leu Gly Val Leu Trp Gly Asn Ser Ala Glu Lys Thr Pro 20 25 30
TTT TTT AAA ACG AAA AAC CAC ATT TAT CTA GGT TTT AGG CTA GGC ACA 206 Phe Phe Lys Thr Lys Asn His Ile Tyr Leu Gly Phe Arg Leu Gly Thr 35 40 45
GGA GCC AAT GTG CAC ACG AGC ATG TGG CAA CAA GCC TAT AAA GAC AAC 254 Gly Ala Asn Val His Thr Ser Met Trp Gln Gln Ala Tyr Lys Asp Asn 50 55 60
CCC ACC TGC CCT GGT AGC GTG TGT TAT GGC GAG AAA TTA GAA GCC CAT 302 Pro Thr Cys Pro Gly Ser Val Cys Tyr Gly Glu Lys Leu Glu Ala His 65 70 75
TAT CAG GGG GGT AAA AAC CTG TCT TAT ACC GGG CAA ATA GGC GAT GAA 350 Tyr Gln Gly Gly Lys Asn Leu Ser Tyr Thr Gly Gln Ile Gly Asp Glu 80 85 90 95
ATA GCT TTT GAT AAA CAC CAT ATT TTA GGC TTA AGG GTG TGG GGG GAT 398 Ile Ala Phe Asp Lys His His Ile Leu Gly Leu Arg Val Trp Gly Asp 100 105 110
GTA GAA TAC GCT AAA GCG CAA TTA GGT CAA AAA GTG GGG GGT AAT ACC 446 Val Glu Tyr Ala Lys Ala Gln Leu Gly Gln Lys Val Gly Gly Asn Thr 115 120 125
CTT TTA TCC CAA GCC AAT TAT GAC CCA AAC GCG ATT AAA ACC TAC GAT 494 Leu Leu Ser Gln Ala Asn Tyr Asp Pro Asn Ala Ile Lys Thr Tyr Asp 130 135 140
TCT GCT TCA AAC ACT CAA GGC CCT TTA GTT TTG CAA AAA ACC CCA AGC 542 Ser Ala Ser Asn Thr Gln Gly Pro Leu Val Leu Gln Lys Thr Pro Ser 145 150 155
CCT CAA AAC TTC CTT TTC AAT AAC GGG CAT TTC ATG GCG TTT GGT TTG 590 Pro Gln Asn Phe Leu Phe Asn Asn Gly His Phe Met Ala Phe Gly Leu 160 165 170 175
AAC GTG AAT GTG TTT GTT AAC CTC CCT ATA GAC ACC CTT TTA AAA CTC 638 Asn Val Asn Val Phe Val Asn Leu Pro Ile Asp Thr Leu Leu Lys Leu 180 185 190
GCT TTA AAA ACA GAA AAA ATG CTG TTT TTT AAA ATA GGC GTG TTT GGT 686 Ala Leu Lys Thr Glu Lys Met Leu Phe Phe Lys Ile Gly Val Phe Gly 195 200 205
GGG GGC GGG GTG GAA TAC GCA ATA TTA TGG AGT CCT AAC TAT CAA AAT 734 Gly Gly Gly Val Glu Tyr Ala Ile Leu Trp Ser Pro Asn Tyr Gln Asn 210 215 220
CAA AAC ACG AAA CAA GGC GAT AAA TTT TTT GCA GCG GGT GGG GGG TTT 782 Gln Asn Thr Lys Gln Gly Asp Lys Phe Phe Ala Ala Gly Gly Gly Phe 225 230 235
TTT GTG AAT TTT GGG GGT TCT TTG TAT ATA GGC AAA CGC AAC CGC TTC 830 Phe Val Asn Phe Gly Gly Ser Leu Tyr Ile Gly Lys Arg Asn Arg Phe 240 245 250 255
AAT GTG GGG TTA AAA ATC CCT TAC TAT AGC TTG AGC GCG CAA AGT TGG 878 Asn Val Gly Leu Lys Ile Pro Tyr Tyr Ser Leu Ser Ala Gln Ser Trp 260 265 270
AAA AAC TTT GGC TCT AGC AAT GTG TGG CAG CAA CAA ACG ATC CGA CAA 926 Lys Asn Phe Gly Ser Ser Asn Val Trp Gln Gln Gln Thr Ile Arg Gln 275 280 285
AAC TTC AGC GTT TTT AGG AAT AAA GAA GTT TTT GTC AGC TAC GCG TTC 974 Asn Phe Ser Val Phe Arg Asn Lys Glu Val Phe Val Ser Tyr Ala Phe 290 295 300
TTG TTT TAGTTTGGAT TCGTTCTCAT TAAACACTGA TGATAAAATT CAAAAGATGG TT 1032 Leu Phe 305
TTATCGTTAC AAAATTCAAC ATTTC 1057
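The paired nucleotide and amino acid lines in this listing can be cross-checked by translating the codons directly. The following is a minimal sketch, assuming the standard genetic code (whose amino acid assignments bacterial translation tables share); the names CODON_TABLE, THREE_LETTER and translate are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter residue codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


if __name__ == "__main__":
    # First eight codons of the coding sequence of SEQ ID NO: 11.
    print(" ".join(translate("ATG GGT AGA ATT GAA TCA AAA AAG")))
```

Under this assumption the codon ATT translates to Ile and CAA to Gln, the two residues most often garbled in scanned copies of listings of this kind.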
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
Met Gly Arg Ile Glu Ser Lys Lys Arg Leu Lys Ala Leu Val Phe Leu
1 5 10 15
Ala Ser Leu Gly Val Leu Trp Gly Asn Ser Ala Glu Lys Thr Pro Phe
20 25 30
Phe Lys Thr Lys Asn His Ile Tyr Leu Gly Phe Arg Leu Gly Thr Gly
35 40 45
Ala Asn Val His Thr Ser Met Trp Gln Gln Ala Tyr Lys Asp Asn Pro
50 55 60
Thr Cys Pro Gly Ser Val Cys Tyr Gly Glu Lys Leu Glu Ala His Tyr 65 70 75 80
Gln Gly Gly Lys Asn Leu Ser Tyr Thr Gly Gln Ile Gly Asp Glu Ile
85 90 95
Ala Phe Asp Lys His His Ile Leu Gly Leu Arg Val Trp Gly Asp Val
100 105 110
Glu Tyr Ala Lys Ala Gln Leu Gly Gln Lys Val Gly Gly Asn Thr Leu
115 120 125
Leu Ser Gln Ala Asn Tyr Asp Pro Asn Ala Ile Lys Thr Tyr Asp Ser
130 135 140
Ala Ser Asn Thr Gln Gly Pro Leu Val Leu Gln Lys Thr Pro Ser Pro 145 150 155 160
Gln Asn Phe Leu Phe Asn Asn Gly His Phe Met Ala Phe Gly Leu Asn
165 170 175
Val Asn Val Phe Val Asn Leu Pro Ile Asp Thr Leu Leu Lys Leu Ala
180 185 190
Leu Lys Thr Glu Lys Met Leu Phe Phe Lys Ile Gly Val Phe Gly Gly
195 200 205
Gly Gly Val Glu Tyr Ala Ile Leu Trp Ser Pro Asn Tyr Gln Asn Gln
210 215 220
Asn Thr Lys Gln Gly Asp Lys Phe Phe Ala Ala Gly Gly Gly Phe Phe 225 230 235 240
Val Asn Phe Gly Gly Ser Leu Tyr Ile Gly Lys Arg Asn Arg Phe Asn
245 250 255
Val Gly Leu Lys Ile Pro Tyr Tyr Ser Leu Ser Ala Gln Ser Trp Lys
260 265 270
Asn Phe Gly Ser Ser Asn Val Trp Gln Gln Gln Thr Ile Arg Gln Asn 275 280 285
Phe Ser Val Phe Arg Asn Lys Glu Val Phe Val Ser Tyr Ala Phe Leu
290 295 300
Phe 305
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 624 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 77...535
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TATTAGTTGG TTTAATACGC TATAATCTGT GTGCCAACAT TGTGTGGCTC AAATCATTTT 60
TAAAAGGGGT TTTATA ATG GAA AAC AAC GAA AAT CAT GAG AAA TTG AAT GGC 112
Met Glu Asn Asn Glu Asn His Glu Lys Leu Asn Gly 1 5 10
GTT TTG CGC AAG TTT TTA GGC GAT GCG TTC ACG CTT GAT GGG AAA GAA 160 Val Leu Arg Lys Phe Leu Gly Asp Ala Phe Thr Leu Asp Gly Lys Glu 15 20 25
GGA GGA TTG AAT ATG GAA AAA TTG CGC GAA GCC ATT AAA AAA GAA AAA 208 Gly Gly Leu Asn Met Glu Lys Leu Arg Glu Ala Ile Lys Lys Glu Lys 30 35 40
CCA ATC ATG AAT ATT TTG CTC ATG GGA GCT ACT GGG GTG GGT AAA AGC 256 Pro Ile Met Asn Ile Leu Leu Met Gly Ala Thr Gly Val Gly Lys Ser 45 50 55 60
TCG CTC ATT AAC GCT CTA TTC GGT AAG GAA GTA GCT AAA GCA GGT GTA 304 Ser Leu Ile Asn Ala Leu Phe Gly Lys Glu Val Ala Lys Ala Gly Val 65 70 75
GGA AAA CCC ATC ACT CAG CAT CTT GAA AAA TAT GTT GAT GAA GAA AAA 352 Gly Lys Pro Ile Thr Gln His Leu Glu Lys Tyr Val Asp Glu Glu Lys 80 85 90
GGC TTG ATT TTA TGG GAC ACT AAA GGC ATT GAA GAT AAA GAT TAT GAA 400 Gly Leu Ile Leu Trp Asp Thr Lys Gly Ile Glu Asp Lys Asp Tyr Glu 95 100 105
AAT ACC TTG GAA AGC ATT AAA AAA GAA ATG GAA GAT TCT TTT AAA ACG 448 Asn Thr Leu Glu Ser Ile Lys Lys Glu Met Glu Asp Ser Phe Lys Thr 110 115 120
CTT GAT GAA AAA GAG GCT ATT GAT GTG GCG TAT CTG TGC GTT AAA GAG 496 Leu Asp Glu Lys Glu Ala Ile Asp Val Ala Tyr Leu Cys Val Lys Glu 125 130 135 140
ACT TCT GGT AGG GTT CAA GAG AGA GAG AGA GAG AGT TAT TAAGCTTTAC TA 547 Thr Ser Gly Arg Val Gln Glu Arg Glu Arg Glu Ser Tyr 145 150
AAAAATGGAA TATCCCAACG ATTTTCGTTT TCACCAACAC ACAAGAAAAA GCCGGCGATG 607
CCTTTGTTAA AAAAACT 624
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 153 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Met Glu Asn Asn Glu Asn His Glu Lys Leu Asn Gly Val Leu Arg Lys
1 5 10 15
Phe Leu Gly Asp Ala Phe Thr Leu Asp Gly Lys Glu Gly Gly Leu Asn
20 25 30
Met Glu Lys Leu Arg Glu Ala Ile Lys Lys Glu Lys Pro Ile Met Asn
35 40 45
Ile Leu Leu Met Gly Ala Thr Gly Val Gly Lys Ser Ser Leu Ile Asn
50 55 60
Ala Leu Phe Gly Lys Glu Val Ala Lys Ala Gly Val Gly Lys Pro Ile 65 70 75 80
Thr Gln His Leu Glu Lys Tyr Val Asp Glu Glu Lys Gly Leu Ile Leu
85 90 95
Trp Asp Thr Lys Gly Ile Glu Asp Lys Asp Tyr Glu Asn Thr Leu Glu
100 105 110
Ser Ile Lys Lys Glu Met Glu Asp Ser Phe Lys Thr Leu Asp Glu Lys
115 120 125
Glu Ala Ile Asp Val Ala Tyr Leu Cys Val Lys Glu Thr Ser Gly Arg
130 135 140
Val Gln Glu Arg Glu Arg Glu Ser Tyr 145 150
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1083 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 155...1033
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GATGTTGTTA AGTCGTTGTT TATTATGTTA CACTAAAAGC TTAAATAAAA GGGCATAAGG 60
GATAAAGGGA GTGTTAGTAG ATAGTTTTAA TAGGGTTATT GACTATATTA GGGTTTCTGT 120
AACCAAACAG TGCAATTTCA GGTGTCAGTA TTGC ATG CCT GCT ACG CCA TTA AAT 175
Met Pro Ala Thr Pro Leu Asn 1 5
TTT TTT GAT AAT GAA GAA TTA TTG CCT TTG GAT AAT GTT TTA GAA TTT 223 Phe Phe Asp Asn Glu Glu Leu Leu Pro Leu Asp Asn Val Leu Glu Phe 10 15 20
CTC AAA ATC GCC ATT GAT GAG GGC GTT AAA AAA ATT AGA ATC ACG GGT 271 Leu Lys Ile Ala Ile Asp Glu Gly Val Lys Lys Ile Arg Ile Thr Gly 25 30 35
GGG GAG CCG CTA TTA CGC AAA GGC TTA GAT GAA TTT ATC GCT AAA TTG 319 Gly Glu Pro Leu Leu Arg Lys Gly Leu Asp Glu Phe Ile Ala Lys Leu 40 45 50 55
CAC GCT TAC AAT AAA GAA GTG GAG TTA GTT TTA AGC ACT AAT GGT TTT 367 His Ala Tyr Asn Lys Glu Val Glu Leu Val Leu Ser Thr Asn Gly Phe 60 65 70
TTA CTC AAA AAA ATG GCT AAG GAT TTA AAA AAT GCC GGG TTA GCG CAA 415 Leu Leu Lys Lys Met Ala Lys Asp Leu Lys Asn Ala Gly Leu Ala Gln 75 80 85
GTG AAT GTT TCA TTG GAT TCT TTA AAA AGC GAT AGG GTT TTA AAA ATC 463 Val Asn Val Ser Leu Asp Ser Leu Lys Ser Asp Arg Val Leu Lys Ile 90 95 100
TCT CAA AAA GAC GCT CTT AAA AAC ACG CTA GAA GGG ATT GAA GAG TCT 511 Ser Gln Lys Asp Ala Leu Lys Asn Thr Leu Glu Gly Ile Glu Glu Ser 105 110 115
TTG AAA GTG GGT TTA AAA CTC AAA TTA AAC ACG GTT GTG ATA AAA AGC 559 Leu Lys Val Gly Leu Lys Leu Lys Leu Asn Thr Val Val Ile Lys Ser 120 125 130 135
GTT AAT GAT GAT GAA ATC TTA GAG CTT TTA GAA TAC GCA AAA AAT AGG 607 Val Asn Asp Asp Glu Ile Leu Glu Leu Leu Glu Tyr Ala Lys Asn Arg 140 145 150
CAT ATA CAA ATC CGC TAC ATT GAA TTT ATG GAA AAC ACG CAT GCT AAA 655 His Ile Gln Ile Arg Tyr Ile Glu Phe Met Glu Asn Thr His Ala Lys 155 160 165
AGT TTG GTT AAA GGC TTG AAA GAG CGA GAA ATT TTA GAT TTG ATC GCT 703
Ser Leu Val Lys Gly Leu Lys Glu Arg Glu Ile Leu Asp Leu Ile Ala 170 175 180
CAA AAA TAT CAA ATC ATT GAG GCA GAA AAA CCC AAA CAA GGG TCT TCT 751 Gln Lys Tyr Gln Ile Ile Glu Ala Glu Lys Pro Lys Gln Gly Ser Ser 185 190 195
AAA ATC TAC ACG CTA GAA AAT GGC TAT CAA TTT GGC ATT ATC GCT CCG 799 Lys Ile Tyr Thr Leu Glu Asn Gly Tyr Gln Phe Gly Ile Ile Ala Pro 200 205 210 215
CAT AGC GAT GAT TTT TGC CAA TCT TGC AAT CGT ATC CGT TTG GCT TCT 847 His Ser Asp Asp Phe Cys Gln Ser Cys Asn Arg Ile Arg Leu Ala Ser 220 225 230
GAT GGT AAG ATT TGC CCA TGT TTA TAC TAT CAA GAC GCC ATA GAC GCT 895 Asp Gly Lys Ile Cys Pro Cys Leu Tyr Tyr Gln Asp Ala Ile Asp Ala 235 240 245
AAA GAG GCG ATC ATC AAT AAG GAT ACA AAA AAT ATA AAA AGG CTT TTA 943 Lys Glu Ala Ile Ile Asn Lys Asp Thr Lys Asn Ile Lys Arg Leu Leu 250 255 260
AAG CAA TCT GTC ATC AAT AAA CCA GAA AAA AAC ATG TGG AAT GAT AAA 991 Lys Gln Ser Val Ile Asn Lys Pro Glu Lys Asn Met Trp Asn Asp Lys 265 270 275
AAC AGC GAA ACT CCC ACA AGG GCG TTT TAC TAC ACA GGG GGG TAGGGGAGT 1042 Asn Ser Glu Thr Pro Thr Arg Ala Phe Tyr Tyr Thr Gly Gly 280 285 290
AAAATATTTA TTATTTTAAA CCTTTTTATT AAAAATAAGG C 1083
(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 293 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Met Pro Ala Thr Pro Leu Asn Phe Phe Asp Asn Glu Glu Leu Leu Pro
1 5 10 15
Leu Asp Asn Val Leu Glu Phe Leu Lys Ile Ala Ile Asp Glu Gly Val
20 25 30
Lys Lys Ile Arg Ile Thr Gly Gly Glu Pro Leu Leu Arg Lys Gly Leu
35 40 45
Asp Glu Phe Ile Ala Lys Leu His Ala Tyr Asn Lys Glu Val Glu Leu
50 55 60
Val Leu Ser Thr Asn Gly Phe Leu Leu Lys Lys Met Ala Lys Asp Leu
65 70 75 80
Lys Asn Ala Gly Leu Ala Gln Val Asn Val Ser Leu Asp Ser Leu Lys
85 90 95
Ser Asp Arg Val Leu Lys Ile Ser Gln Lys Asp Ala Leu Lys Asn Thr
100 105 110
Leu Glu Gly Ile Glu Glu Ser Leu Lys Val Gly Leu Lys Leu Lys Leu
115 120 125
Asn Thr Val Val Ile Lys Ser Val Asn Asp Asp Glu Ile Leu Glu Leu
130 135 140
Leu Glu Tyr Ala Lys Asn Arg His Ile Gln Ile Arg Tyr Ile Glu Phe 145 150 155 160
Met Glu Asn Thr His Ala Lys Ser Leu Val Lys Gly Leu Lys Glu Arg
165 170 175
Glu Ile Leu Asp Leu Ile Ala Gln Lys Tyr Gln Ile Ile Glu Ala Glu
180 185 190
Lys Pro Lys Gln Gly Ser Ser Lys Ile Tyr Thr Leu Glu Asn Gly Tyr
195 200 205
Gln Phe Gly Ile Ile Ala Pro His Ser Asp Asp Phe Cys Gln Ser Cys
210 215 220
Asn Arg Ile Arg Leu Ala Ser Asp Gly Lys Ile Cys Pro Cys Leu Tyr 225 230 235 240
Tyr Gln Asp Ala Ile Asp Ala Lys Glu Ala Ile Ile Asn Lys Asp Thr
245 250 255
Lys Asn Ile Lys Arg Leu Leu Lys Gln Ser Val Ile Asn Lys Pro Glu
260 265 270
Lys Asn Met Trp Asn Asp Lys Asn Ser Glu Thr Pro Thr Arg Ala Phe
275 280 285
Tyr Tyr Thr Gly Gly 290
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1181 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 121...1137
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ACTTCTCAAT CAGCGAGCTA TCATGCAAGG CCTTATGTGG TGGATACCGC TTTTTTACGA 60
TACGATTACA AAGATGTTTT TGGGTTTAAG GCGGGGCGCT ATGAAGCGAA TATTGATTTC 120
ATG AGC GGA TCG AAT CAA GGG TGG GAA GTG TAT TAT CAG CCC TAT AAG 168
Met Ser Gly Ser Asn Gln Gly Trp Glu Val Tyr Tyr Gln Pro Tyr Lys 1 5 10 15
ACT GAA ACG CAA AGG TTA AGG TTT TGG TGG TGG AGT TCT TTT GGG AGA 216
Thr Glu Thr Gln Arg Leu Arg Phe Trp Trp Trp Ser Ser Phe Gly Arg 20 25 30
GGT TTA GCG TTC AAC TCT TGG ATT TAT GAG TTT TTT GCG ACG GTG CCT 264 Gly Leu Ala Phe Asn Ser Trp Ile Tyr Glu Phe Phe Ala Thr Val Pro 35 40 45
TAT TTG AAA AAG GGA GGC AAT CCT AAT AAC AGC AAC GAT TTC ATC AAT 312 Tyr Leu Lys Lys Gly Gly Asn Pro Asn Asn Ser Asn Asp Phe Ile Asn 50 55 60
TAT GGC TGG CAT GGA ATC ACC ACA ACC TAT TCT TAT AAA GGT TTA GAC 360 Tyr Gly Trp His Gly Ile Thr Thr Thr Tyr Ser Tyr Lys Gly Leu Asp 65 70 75 80
GCT CAA TTT TTT TAT TAT TTT GCG CCT AAG ACT TAT AAC GCT CCT GGC 408 Ala Gln Phe Phe Tyr Tyr Phe Ala Pro Lys Thr Tyr Asn Ala Pro Gly 85 90 95
TTT AAG CTG GTC TAT GAC ACG AAT AGG AAT TTT CAA AAT GTA GGC TTT 456 Phe Lys Leu Val Tyr Asp Thr Asn Arg Asn Phe Gln Asn Val Gly Phe 100 105 110
CGC TCT CAA AGC ATG ATC ATG ACA ACC TTT CCT TTA TAC TAT AGA GGG 504 Arg Ser Gln Ser Met Ile Met Thr Thr Phe Pro Leu Tyr Tyr Arg Gly 115 120 125
TGG TAT AAC CCA GAG ACA AAC ACT TAT AGT TTA GAA GAC AGC ACG CCT 552 Trp Tyr Asn Pro Glu Thr Asn Thr Tyr Ser Leu Glu Asp Ser Thr Pro 130 135 140
CAT GGC TCG TTG TTG GGG AGG AAT GGC GTT ACT TTA AAT ATC CGC CAG 600 His Gly Ser Leu Leu Gly Arg Asn Gly Val Thr Leu Asn Ile Arg Gln 145 150 155 160
GTT TTT TGG TGG GAT AAT TTC AAC TGG TCC ATT GGC TTT TAT AAC ACC 648 Val Phe Trp Trp Asp Asn Phe Asn Trp Ser Ile Gly Phe Tyr Asn Thr 165 170 175
TTT GGC AAT TCG GAC GCT TTT TTA GGC TCT CAC ACG ATG CCA AGG GGT 696 Phe Gly Asn Ser Asp Ala Phe Leu Gly Ser His Thr Met Pro Arg Gly 180 185 190
AAT AAC ACT TCC TAT ATC GGT AGT GAA ATC TCC ATA ACG ACT AGG CAT 744 Asn Asn Thr Ser Tyr Ile Gly Ser Glu Ile Ser Ile Thr Thr Arg His 195 200 205
GCC GGA ATG ATT GGC TAT GAT TTT TGG GAT AAT ACG GCT TAT GAT GGG 792 Ala Gly Met Ile Gly Tyr Asp Phe Trp Asp Asn Thr Ala Tyr Asp Gly 210 215 220
CTA GCT GAT GCG ATC ACT AAC GCT AAC ACT TTC ACT TTT TAC ACT TCT 840 Leu Ala Asp Ala Ile Thr Asn Ala Asn Thr Phe Thr Phe Tyr Thr Ser 225 230 235 240
GTT GGA GGG ATC CAT AAG CGT TTT GCA TGG CAT GTT TTT GGG CGC GTC 888
Val Gly Gly Ile His Lys Arg Phe Ala Trp His Val Phe Gly Arg Val
245 250 255
TCT CAT GCG AAT AAA AAC GCG TTA GGG CAA GTG GGG AGG GCT AAT GAA 936
Ser His Ala Asn Lys Asn Ala Leu Gly Gln Val Gly Arg Ala Asn Glu 260 265 270
TAT TCC TTG CAA TTC AAC GCG AGC TAT GCG TTC ACT GAA TCA ATC CTT 984 Tyr Ser Leu Gln Phe Asn Ala Ser Tyr Ala Phe Thr Glu Ser Ile Leu 275 280 285
CTT AAC TTT AGG ATC ACT TAT TAT GGG GCT AGG ATC AAT AAA GGG TAT 1032 Leu Asn Phe Arg Ile Thr Tyr Tyr Gly Ala Arg Ile Asn Lys Gly Tyr 290 295 300
CAA GCG GGG TAT TTT GGA GCG CCC AAA TTC AAT AAC CCT GAT GGC GAT 1080 Gln Ala Gly Tyr Phe Gly Ala Pro Lys Phe Asn Asn Pro Asp Gly Asp 305 310 315 320
TTT AGC GCT AAT TAC CAA GAC AGA AGT TAC ATG ATG ACC AAC CTC ACG 1128 Phe Ser Ala Asn Tyr Gln Asp Arg Ser Tyr Met Met Thr Asn Leu Thr 325 330 335
CTG AAG TTT TGATTTCCAA TCACAGCGAG TTAAAAACAC TCCAAGGCAT TTTT 1181
Leu Lys Phe
(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 339 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Met Ser Gly Ser Asn Gln Gly Trp Glu Val Tyr Tyr Gln Pro Tyr Lys
1 5 10 15
Thr Glu Thr Gln Arg Leu Arg Phe Trp Trp Trp Ser Ser Phe Gly Arg
20 25 30
Gly Leu Ala Phe Asn Ser Trp Ile Tyr Glu Phe Phe Ala Thr Val Pro
35 40 45
Tyr Leu Lys Lys Gly Gly Asn Pro Asn Asn Ser Asn Asp Phe Ile Asn
50 55 60
Tyr Gly Trp His Gly Ile Thr Thr Thr Tyr Ser Tyr Lys Gly Leu Asp 65 70 75 80
Ala Gln Phe Phe Tyr Tyr Phe Ala Pro Lys Thr Tyr Asn Ala Pro Gly
85 90 95
Phe Lys Leu Val Tyr Asp Thr Asn Arg Asn Phe Gln Asn Val Gly Phe 100 105 110
Arg Ser Gln Ser Met Ile Met Thr Thr Phe Pro Leu Tyr Tyr Arg Gly
115 120 125
Trp Tyr Asn Pro Glu Thr Asn Thr Tyr Ser Leu Glu Asp Ser Thr Pro
130 135 140
His Gly Ser Leu Leu Gly Arg Asn Gly Val Thr Leu Asn Ile Arg Gln 145 150 155 160
Val Phe Trp Trp Asp Asn Phe Asn Trp Ser Ile Gly Phe Tyr Asn Thr
165 170 175
Phe Gly Asn Ser Asp Ala Phe Leu Gly Ser His Thr Met Pro Arg Gly
180 185 190
Asn Asn Thr Ser Tyr Ile Gly Ser Glu Ile Ser Ile Thr Thr Arg His
195 200 205
Ala Gly Met Ile Gly Tyr Asp Phe Trp Asp Asn Thr Ala Tyr Asp Gly
210 215 220
Leu Ala Asp Ala Ile Thr Asn Ala Asn Thr Phe Thr Phe Tyr Thr Ser 225 230 235 240
Val Gly Gly Ile His Lys Arg Phe Ala Trp His Val Phe Gly Arg Val
245 250 255
Ser His Ala Asn Lys Asn Ala Leu Gly Gln Val Gly Arg Ala Asn Glu
260 265 270
Tyr Ser Leu Gln Phe Asn Ala Ser Tyr Ala Phe Thr Glu Ser Ile Leu
275 280 285
Leu Asn Phe Arg Ile Thr Tyr Tyr Gly Ala Arg Ile Asn Lys Gly Tyr
290 295 300
Gln Ala Gly Tyr Phe Gly Ala Pro Lys Phe Asn Asn Pro Asp Gly Asp 305 310 315 320
Phe Ser Ala Asn Tyr Gln Asp Arg Ser Tyr Met Met Thr Asn Leu Thr
325 330 335
Leu Lys Phe
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 959 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 133...879
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TAAGGAAATG AGTTTTTATA TCATAAAATA AAGTAACCGA GAAAAATCTT TCTCTAAAAA 60
TAATACTTTT TTAGTTATAA TAACAATTTT GTTTTTTCAA AAACAATAAT TACTATATTT 120
AGGATTTTAA GA ATG AAT GAC AAG CGT TTT AGA AAA TAT TGT AGT TTT TCT 171
Met Asn Asp Lys Arg Phe Arg Lys Tyr Cys Ser Phe Ser 1 5 10
ATT TTT TTG TCC TTA TTA GGA ACG TTT GAA TTA GAG GCT AAA GAA GAA 219 Ile Phe Leu Ser Leu Leu Gly Thr Phe Glu Leu Glu Ala Lys Glu Glu 15 20 25
GAA GAA AAA GAA GAA AGA AAG ACA GAA AGG AAA AAA GAA AAG AAC GCC 267 Glu Glu Lys Glu Glu Arg Lys Thr Glu Arg Lys Lys Glu Lys Asn Ala 30 35 40 45
CAA CAC ACT CTA GGC AAG GTT ACC ACT CAA GCG GCT AAA ATC TTT AAC 315 Gin His Thr Leu Gly Lys Val Thr Thr Gin Ala Ala Lys He Phe Asn 50 55 60
TAC AAC AAC CAG ACA ACC ATT TCA AGT AAG GAA TTA GAA AGA AGG CAA 363 Tyr Asn Asn Gin Thr Thr He Ser Ser Lys Glu Leu Glu Arg Arg Gin 65 70 75
GCC AAC CAA ATC AGC GAC ATG TTT AGA AGA AAC CCT AAT ATC AAT GTG 411 Ala Asn Gin He Ser Asp Met Phe Arg Arg Asn Pro Asn He Asn Val 80 85 90
GGC GGT GGT GCG GTG ATA GCG CAA AAA ATT TAT GTG CGC GGT ATT GAA 459 Gly Gly Gly Ala Val He Ala Gin Lys He Tyr Val Arg Gly He Glu 95 100 105
GAC AGA TTG GCT CGG GTT ACG GTG GAT GGG GCG GCG CAA ATG GGT GCA 507 Asp Arg Leu Ala Arg Val Thr Val Asp Gly Ala Ala Gin Met Gly Ala 110 115 120 125
AGC TAT GGG CAT CAA GGC AAT ACG ATC ATT GAC CCT GGA ATG CTT AAA 555 Ser Tyr Gly His Gin Gly Asn Thr He He Asp Pro Gly Met Leu Lys 130 135 140
AGC GTG GTG GTT ACT AAA GGG GCG GCT CAA GCG AGC GCG GGG CCT ATG 603 Ser Val Val Val Thr Lys Gly Ala Ala Gin Ala Ser Ala Gly Pro Met 145 150 155
GCT TTG ATT GGC GCG ATT AAA ATG GAG ACT AAA AGT GCT AGC GAT TTT 651 Ala Leu He Gly Ala He Lys Met Glu Thr Lys Ser Ala Ser Asp Phe 160 165 170
ATC CCT AAA GGT AAA GAC TAC GCC ATA AGT GGG GCT GCC ACT TTT TTA 699 He Pro Lys Gly Lys Asp Tyr Ala He Ser Gly Ala Ala Thr Phe Leu 175 180 185
ACC AAC TTT GGG GAT CGA GAA ACC GTG ATG GGC GCT TAT CGT CAT AAT 747 Thr Asn Phe Gly Asp Arg Glu Thr Val Met Gly Ala Tyr Arg His Asn 190 195 200 205
CAT TTT GAT GCG CTT TTG TAT TAC ACG CAT CAA AAT ATT TTT TAC TAT 795 His Phe Asp Ala Leu Leu Tyr Tyr Thr His Gin Asn He Phe Tyr Tyr 210 215 220
CGT GAT GGG GAT AAT GCT ACA AAA GAT CTC TTT AGA CCT AAA GCG GAG 843 Arg Asp Gly Asp Asn Ala Thr Lys Asp Leu Phe Arg Pro Lys Ala Glu 225 230 235
AAT AAA GTT ACA GAA GTC CTA GCG AGC AAA ACA ATG TGATGGCTAA GATCAA 895 Asn Lys Val Thr Glu Val Leu Ala Ser Lys Thr Met 240 245
TGGTTATTTG AGCGAAAGGG ATATTTTAAC GCTCAGTTAT AACATGACCA GAGACAACGC 955
TAAC 959
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 249 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Met Asn Asp Lys Arg Phe Arg Lys Tyr Cys Ser Phe Ser Ile Phe Leu
1 5 10 15
Ser Leu Leu Gly Thr Phe Glu Leu Glu Ala Lys Glu Glu Glu Glu Lys
20 25 30
Glu Glu Arg Lys Thr Glu Arg Lys Lys Glu Lys Asn Ala Gln His Thr
35 40 45
Leu Gly Lys Val Thr Thr Gln Ala Ala Lys Ile Phe Asn Tyr Asn Asn
50 55 60
Gln Thr Thr Ile Ser Ser Lys Glu Leu Glu Arg Arg Gln Ala Asn Gln
65 70 75 80
Ile Ser Asp Met Phe Arg Arg Asn Pro Asn Ile Asn Val Gly Gly Gly
85 90 95
Ala Val Ile Ala Gln Lys Ile Tyr Val Arg Gly Ile Glu Asp Arg Leu
100 105 110
Ala Arg Val Thr Val Asp Gly Ala Ala Gln Met Gly Ala Ser Tyr Gly
115 120 125
His Gln Gly Asn Thr Ile Ile Asp Pro Gly Met Leu Lys Ser Val Val
130 135 140
Val Thr Lys Gly Ala Ala Gln Ala Ser Ala Gly Pro Met Ala Leu Ile
145 150 155 160
Gly Ala Ile Lys Met Glu Thr Lys Ser Ala Ser Asp Phe Ile Pro Lys
165 170 175
Gly Lys Asp Tyr Ala Ile Ser Gly Ala Ala Thr Phe Leu Thr Asn Phe
180 185 190
Gly Asp Arg Glu Thr Val Met Gly Ala Tyr Arg His Asn His Phe Asp
195 200 205
Ala Leu Leu Tyr Tyr Thr His Gln Asn Ile Phe Tyr Tyr Arg Asp Gly
210 215 220
Asp Asn Ala Thr Lys Asp Leu Phe Arg Pro Lys Ala Glu Asn Lys Val
225 230 235 240
Thr Glu Val Leu Ala Ser Lys Thr Met
245
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1306 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 40...1266 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 40...219 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 220...1266 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
TTTGACAGCT TATCATTTGG CAATAAAACA CCAAAATGA ATG AGT TAC ACA AAA 54
Met Ser Tyr Thr Lys -60
AAA TAC TCA ACA CCA CCC AAC CGG CGT AAA ATG CAA AAC ATT ATC GCT 102
Lys Tyr Ser Thr Pro Pro Asn Arg Arg Lys Met Gln Asn Ile Ile Ala
-55 -50 -45 -40
ATT AAA AGA TCC TCT AGA GTC GAC CTG CAG GCA TGC AAG CTA GCT TTC 150 He Lys Arg Ser Ser Arg Val Asp Leu Gin Ala Cys Lys Leu Ala Phe -35 -30 -25
GCG AGC TCG AGA TCA CCC ATG CAA TTT CAA AAA ACC TTA TTT CCT TTA 198 Ala Ser Ser Arg Ser Pro Met Gin Phe Gin Lys Thr Leu Phe Pro Leu -20 -15 -10
CCC TTA TTA TTT TTA TCT TGT TGT ATC GCT GAA GAA AAT GGG GCG TAT 246 Pro Leu Leu Phe Leu Ser Cys Cys He Ala Glu Glu Asn Gly Ala Tyr -5 1 5
GCG AGC GTG GGG TTT GAA TAT TCC ATT AGT CAT GCC GTT GAG CAT AAT 294 Ala Ser Val Gly Phe Glu Tyr Ser He Ser His Ala Val Glu His Asn 10 15 20 25
AAC CCT TTT TTA AAT CAA GAA CGC ATC CAA ATC ATT TCT AAC GCT CAA 342 Asn Pro Phe Leu Asn Gin Glu Arg He Gin He He Ser Asn Ala Gin 30 35 40
AAC AAA ATC TAT AAA CTC AAT CAA GTC AAA AAT GAA ATC ACA AGC ATG 390
Asn Lys He Tyr Lys Leu Asn Gin Val Lys Asn Glu He Thr Ser Met 45 50 55
CAA AAC ACC TTT AAT TAC ATC AAC AAC GCT TTA AAA AAC AAT GCT AAA 438 Gin Asn Thr Phe Asn Tyr He Asn Asn Ala Leu Lys Asn Asn Ala Lys 60 65 70
TTA ACC CCC ACT GAA ATC CAA GCT GAG AAA TAC TAC CTC CAA TCC ACC 486 Leu Thr Pro Thr Glu He Gin Ala Glu Lys Tyr Tyr Leu Gin Ser Thr 75 80 85
CTT CAA AAC ATT GAA AAA ATA GTC ACA CTT AGC GGT GGC GTT GCA TCT 534 Leu Gin Asn He Glu Lys He Val Thr Leu Ser Gly Gly Val Ala Ser 90 95 100 105
AAC CCC AAA CTA GTC CAA GCG TTG GAA AAA ATG CAA GAA CCC ATT ACT 582 Asn Pro Lys Leu Val Gin Ala Leu Glu Lys Met Gin Glu Pro He Thr 110 115 120
AAC CCT TTA GAA TTA GCA GAA AAC TTA AGA AAT TTA GAA TTG CAA TTT 630 Asn Pro Leu Glu Leu Ala Glu Asn Leu Arg Asn Leu Glu Leu Gin Phe 125 130 135
GCT CAA TCT CAA AAC CGC ATG CTT TCT TCT TTA TCT TCT CAA ACC GCT 678 Ala Gin Ser Gin Asn Arg Met Leu Ser Ser Leu Ser Ser Gin Thr Ala 140 145 150
CAA ATT TCA AAT TCT TTG AAC GCG CTT GAT CCC AGC TCT TAT TCT AAA 726 Gin He Ser Asn Ser Leu Asn Ala Leu Asp Pro Ser Ser Tyr Ser Lys 155 160 165
AAC ATT TCA AGC ATG TCT GGG GTG AGT TTG AGC GTA GGG TAT AAG CAT 774 Asn He Ser Ser Met Ser Gly Val Ser Leu Ser Val Gly Tyr Lys His 170 175 180 185
TTC TTT ACT AAG AAA AAA AAT CAA GGG TTT CGC TAT TAC TTG TTT TAT 822 Phe Phe Thr Lys Lys Lys Asn Gin Gly Phe Arg Tyr Tyr Leu Phe Tyr 190 195 200
GAC TAT GGT TAC ACT AAC TTT GGT TTT GTG GGT AAT GGC TTT GAT GGT 870 Asp Tyr Gly Tyr Thr Asn Phe Gly Phe Val Gly Asn Gly Phe Asp Gly 205 210 215
TTA GGC AAA ATG AAT AAC CAC CTC TAT GGG CTT GGA ATA AAC TAC CTT 918 Leu Gly Lys Met Asn Asn His Leu Tyr Gly Leu Gly He Asn Tyr Leu 220 225 230
TAT AAT TTC ATT GAT AAT GCA CAA AAA CAT TCG AGC GTG GGT TTT TAT 966 Tyr Asn Phe He Asp Asn Ala Gin Lys His Ser Ser Val Gly Phe Tyr 235 240 245
GCG GGT TTT GCT TTG GCG GGG AAT TCG TGG GTA GGG AAT GGT TTA GGC 1014 Ala Gly Phe Ala Leu Ala Gly Asn Ser Trp Val Gly Asn Gly Leu Gly 250 255 260 265
ATG TGG GTG AGC CAA ACG GAT TTT ATC AAC AAT TAC TTG ATG GGC TAT 1062 Met Trp Val Ser Gin Thr Asp Phe He Asn Asn Tyr Leu Met Gly Tyr 270 275 280
CAA GCT AAA ATA CAC ACG AAC TTT TTC CAG ATC CCT TTG AAT TTT GGG 1110 Gin Ala Lys He His Thr Asn Phe Phe Gin He Pro Leu Asn Phe Gly 285 290 295
GTT CGT GTG AAT GTC AAT AGG CAT AAC GGA TTT GAA ATG GGC CTA AAA 1158 Val Arg Val Asn Val Asn Arg His Asn Gly Phe Glu Met Gly Leu Lys 300 305 310
ATC CCT TTA GCG GTG AAT TCC TTT TAT GAA ACG CAT GGC AAA GGG TTA 1206 He Pro Leu Ala Val Asn Ser Phe Tyr Glu Thr His Gly Lys Gly Leu 315 320 325
AAC ACT TCC CTC TTT TTC AAA CGC CTT GTG GTG TTT AAT GTG AGT TAT 1254 Asn Thr Ser Leu Phe Phe Lys Arg Leu Val Val Phe Asn Val Ser Tyr 330 335 340 345
GTT TAT AGT TTT TAGGGGGTAA ATGCCTTCAA ACGCTCTTTT GATTGAAGAA 1306
Val Tyr Ser Phe
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 409 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
Met Ser Tyr Thr Lys Lys Tyr Ser Thr Pro Pro Asn Arg Arg Lys Met -60 -55 -50 -45
Gin Asn He He Ala He Lys Arg Ser Ser Arg Val Asp Leu Gin Ala
-40 -35 -30
Cys Lys Leu Ala Phe Ala Ser Ser Arg Ser Pro Met Gin Phe Gin Lys
-25 -20 -15
Thr Leu Phe Pro Leu Pro Leu Leu Phe Leu Ser Cys Cys He Ala Glu
-10 -5 1
Glu Asn Gly Ala Tyr Ala Ser Val Gly Phe Glu Tyr Ser He Ser His
5 10 15 20
Ala Val Glu His Asn Asn Pro Phe Leu Asn Gin Glu Arg He Gin He
25 30 35
He Ser Asn Ala Gin Asn Lys He Tyr Lys Leu Asn Gin Val Lys Asn
40 45 50
Glu He Thr Ser Met Gin Asn Thr Phe Asn Tyr He Asn Asn Ala Leu
55 60 65
Lys Asn Asn Ala Lys Leu Thr Pro Thr Glu He Gin Ala Glu Lys Tyr
70 75 80
Tyr Leu Gin Ser Thr Leu Gin Asn He Glu Lys He Val Thr Leu Ser 85 90 95 100
Gly Gly Val Ala Ser Asn Pro Lys Leu Val Gin Ala Leu Glu Lys Met
105 110 115
Gin Glu Pro He Thr Asn Pro Leu Glu Leu Ala Glu Asn Leu Arg Asn
120 125 130
Leu Glu Leu Gin Phe Ala Gin Ser Gin Asn Arg Met Leu Ser Ser Leu
135 140 145
Ser Ser Gin Thr Ala Gin He Ser Asn Ser Leu Asn Ala Leu Asp Pro
150 155 160
Ser Ser Tyr Ser Lys Asn He Ser Ser Met Ser Gly Val Ser Leu Ser 165 170 175 180
Val Gly Tyr Lys His Phe Phe Thr Lys Lys Lys Asn Gin Gly Phe Arg
185 190 195
Tyr Tyr Leu Phe Tyr Asp Tyr Gly Tyr Thr Asn Phe Gly Phe Val Gly
200 205 210
Asn Gly Phe Asp Gly Leu Gly Lys Met Asn Asn His Leu Tyr Gly Leu
215 220 225
Gly He Asn Tyr Leu Tyr Asn Phe He Asp Asn Ala Gin Lys His Ser
230 235 240
Ser Val Gly Phe Tyr Ala Gly Phe Ala Leu Ala Gly Asn Ser Trp Val 245 250 255 260
Gly Asn Gly Leu Gly Met Trp Val Ser Gin Thr Asp Phe He Asn Asn
265 270 275
Tyr Leu Met Gly Tyr Gin Ala Lys He His Thr Asn Phe Phe Gin He
280 285 290
Pro Leu Asn Phe Gly Val Arg Val Asn Val Asn Arg His Asn Gly Phe
295 300 305
Glu Met Gly Leu Lys He Pro Leu Ala Val Asn Ser Phe Tyr Glu Thr
310 315 320
His Gly Lys Gly Leu Asn Thr Ser Leu Phe Phe Lys Arg Leu Val Val 325 330 335 340
Phe Asn Val Ser Tyr Val Tyr Ser Phe 345
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1030 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 342...824 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CACTCTAAGC GTCAAACTCT CTTTTTCTTT AGAGGAAGAA AGCAAGCGGA TCCATCTTAA 60
AGCCTTACAA AATATCTTAA ATAACGCTAA AAGCGCGCAT TTTAAATTTG TTTTAGAGAG 120
CCAAAACGCC GCTCAATCTA TTATAGAAAT TCAAAGCCTC TTGAAACAAC TCTCCTTAAA 180
AAATAATGAA ATCTTTTTAA TGCCTTTAGG CACAAATAAC AACGAGCTAG ACAAAAATCT 240
AAAAACCCTA GCCCCCCTAG CCATAAAGCA TGGTTTCAGG CTAAGCGATA GGCTTCATAT 300
CCGCTTGTGG GATAATCAAA AAGGGTTTTA AAAAGTTAAT C ATG ACC ATC AAA GTT 356
Met Thr He Lys Val 1 5
TTT TCG CCC AAA TAC CCC ACT GAA TTA GAA GAA TTT TAT GCT GAG CGT 404 Phe Ser Pro Lys Tyr Pro Thr Glu Leu Glu Glu Phe Tyr Ala Glu Arg 10 15 20
ATC GCT GAC AAC CCT TTA GGG TTT ATC CAA CGC TTG GAT CTT TTG CCT 452 He Ala Asp Asn Pro Leu Gly Phe He Gin Arg Leu Asp Leu Leu Pro 25 30 35
AGT ATT AGC GGG TTC GTT CAA AAA TTG CGC GAG CAT GGC GGG GAA TTT 500 Ser He Ser Gly Phe Val Gin Lys Leu Arg Glu His Gly Gly Glu Phe 40 45 50
TTT GAA ATG AGA GAG GGT AAC AAG CTC ATT GGG ATT TGT GGG CTT AAT 548 Phe Glu Met Arg Glu Gly Asn Lys Leu He Gly He Cys Gly Leu Asn 55 60 65
CCT ATC AAT CAA ACA GAA GCC GAG CTG TGC AAA TTC CAC ATA AAT AGT 596 Pro He Asn Gin Thr Glu Ala Glu Leu Cys Lys Phe His He Asn Ser 70 75 80 85
GCT TAT CAA TCC CAA GGG CTA GGT CAA AAA CTC TAT GAG AGC GTG GAG 644 Ala Tyr Gin Ser Gin Gly Leu Gly Gin Lys Leu Tyr Glu Ser Val Glu 90 95 100
AAA TAC GCT TTC ATT AAA GGC TAT ACT AAA ATC TCT CTG CAT GTG AGC 692 Lys Tyr Ala Phe He Lys Gly Tyr Thr Lys He Ser Leu His Val Ser 105 110 115
AAA AGC CAA ATC AAG GCA TGC AAC CTC TAT CAA AAG CTG GGT TTT GTG 740
Lys Ser Gin He Lys Ala Cys Asn Leu Tyr Gin Lys Leu Gly Phe Val
120 125 130
CAC ATC AAA GAA GAG GAT TGC GTG GTG GAG TTG GGC GAA GAG ACT TTG 788
His He Lys Glu Glu Asp Cys Val Val Glu Leu Gly Glu Glu Thr Leu
135 140 145
ATT TTC CCC ACT CTT TTT ATG GAA AAG ATT TTG TCT TGATTGGTGC ATCCAT 840 He Phe Pro Thr Leu Phe Met Glu Lys He Leu Ser 150 155 160
TTGACACACG CCCAAGCGAC ATTCAAACTA TCAAACTTTC ATTAACACAA CCCAATTAAC 900
GCTAAATAAA CCCTAAAACA AACACTCGTT GTTAAAATTT TGTTTTTCAA GCGCTTCGCA 960
AAGTTTTAGA AGCCCTATTT AGGGGTTAAC GCTAAAATAG GCTATCAAAA CTACTTTAAT 1020
GATTTTATAG 1030
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 161 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
Met Thr Ile Lys Val Phe Ser Pro Lys Tyr Pro Thr Glu Leu Glu Glu
1 5 10 15
Phe Tyr Ala Glu Arg Ile Ala Asp Asn Pro Leu Gly Phe Ile Gln Arg
20 25 30
Leu Asp Leu Leu Pro Ser Ile Ser Gly Phe Val Gln Lys Leu Arg Glu
35 40 45
His Gly Gly Glu Phe Phe Glu Met Arg Glu Gly Asn Lys Leu Ile Gly
50 55 60
Ile Cys Gly Leu Asn Pro Ile Asn Gln Thr Glu Ala Glu Leu Cys Lys
65 70 75 80
Phe His Ile Asn Ser Ala Tyr Gln Ser Gln Gly Leu Gly Gln Lys Leu
85 90 95
Tyr Glu Ser Val Glu Lys Tyr Ala Phe Ile Lys Gly Tyr Thr Lys Ile
100 105 110
Ser Leu His Val Ser Lys Ser Gln Ile Lys Ala Cys Asn Leu Tyr Gln
115 120 125
Lys Leu Gly Phe Val His Ile Lys Glu Glu Asp Cys Val Val Glu Leu
130 135 140
Gly Glu Glu Thr Leu Ile Phe Pro Thr Leu Phe Met Glu Lys Ile Leu
145 150 155 160
Ser
(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1477 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 374...1267 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CGTGGAGTTT TTTAGGCATT TCTTTATATT CATTCAATAA CGCTTGCGCG GGCAATTCTT 60
CAACTAAAAT CTCTACTAAC AATTCATCTG AATGCAAAAT CTCAATTCTC CCTAAAAAAC 120
AAAATCACTT TTAAGACTAA ATCATGTTAG AATTATACTT GAATTTACAC TCAGTTTAGT 180
TTATTTCTTA ATACAAAAGG TAGGCGTTTT GAAACATTTA ACCCCACTCA CTCACACCAT 240
CTTTAAAGCC TTATGGCTAG GCACAGCCTT AAGTGCATCT TTAAGTTTAG CCGCAACAGA 300
AAGCCCCACT AAAACAGAGC CTAAGCCCGC TAAAGGGGTT AAAAACAAGC CCAAATCGCC 360
CGTTACTAAA GTC ATG ATG ACC AAT TGC GAC AAT ATT AAA GAT TTT AAC 409 Met Met Thr Asn Cys Asp Asn He Lys Asp Phe Asn 1 5 10
GCT AAG CAA AAA GAA GTC TTA AAA GCC GCT TAT CAA TTC GGC TCT AAA 457 Ala Lys Gin Lys Glu Val Leu Lys Ala Ala Tyr Gin Phe Gly Ser Lys 15 20 25
GAA AAT TTA GGC TAT GAA ATG GCA GGC ATT GCA TGG AAA GAG TCA TGC 505 Glu Asn Leu Gly Tyr Glu Met Ala Gly He Ala Trp Lys Glu Ser Cys 30 35 40
GCA GGG GTT TAT AAA ATC AAT TTT TCG GAT CCG AGC GCG GGC GTG TAT 553 Ala Gly Val Tyr Lys He Asn Phe Ser Asp Pro Ser Ala Gly Val Tyr 45 50 55 60
CAT TCT TAT ATC CCA AGC GTT CTA AAA AGC TAT GGG CAT AAT GAT AGC 601 His Ser Tyr He Pro Ser Val Leu Lys Ser Tyr Gly His Asn Asp Ser 65 70 75
CCC TTT TTG CGT AAT GTG ATG GGG GAA TTG CTC ATT AAA GAC GAT GCG 649 Pro Phe Leu Arg Asn Val Met Gly Glu Leu Leu He Lys Asp Asp Ala 80 85 90
TTT GCT TCT GAA GTG GCT TTA AAA GAG TTG CTC TAT TGG AAA ACA CGC 697 Phe Ala Ser Glu Val Ala Leu Lys Glu Leu Leu Tyr Trp Lys Thr Arg 95 100 105
TAC CAT GAC AAT TTA AAA GAC ATG ATT AAA TCT TAC AAC AAG GGC AGT 745 Tyr His Asp Asn Leu Lys Asp Met He Lys Ser Tyr Asn Lys Gly Ser 110 115 120
CGT TGG GAA AGG AGC GAA AAA TCT AAC GCT GAT GCT GAA AAA TAT TAC 793 Arg Trp Glu Arg Ser Glu Lys Ser Asn Ala Asp Ala Glu Lys Tyr Tyr 125 130 135 140
GAA GAG ATA CAA GAC AGA ATC AGG CGT TTG AAA GAA TCT AAA ATC TTT 841 Glu Glu He Gin Asp Arg He Arg Arg Leu Lys Glu Ser Lys He Phe 145 150 155
GAT TCG CAG TCT AGT AAT GAC CAA GAA TTG CAA AAA AGC GCT AAT AGC 889 Asp Ser Gin Ser Ser Asn Asp Gin Glu Leu Gin Lys Ser Ala Asn Ser 160 165 170
AAC CTG GAT TTA GAC CCT ATC GGC AAC GCC ATG CCC CAA GCC TTA ATT 937 Asn Leu Asp Leu Asp Pro He Gly Asn Ala Met Pro Gin Ala Leu He 175 180 185
GCC AAA GAA ACT AAA ATA GAA GAA ACC CAA GCA GAA AAA TCC CAA GAA 985 Ala Lys Glu Thr Lys He Glu Glu Thr Gin Ala Glu Lys Ser Gin Glu 190 195 200
ATG AAA GAG ACA ACT AGC GAG CAA ACA AAA AGT AAG CCA GAA AAA GCA 1033 Met Lys Glu Thr Thr Ser Glu Gin Thr Lys Ser Lys Pro Glu Lys Ala 205 210 215 220
AAA GAT AAA CCC ATG TAT TTG GCG CAA ATC AAC AGC ACT GAT TTC ACA 1081 Lys Asp Lys Pro Met Tyr Leu Ala Gin He Asn Ser Thr Asp Phe Thr 225 230 235
CCC GTT AAA AAA AGC CCC AAA AAA CCG GCT AAA GTG AGC CAA AAA CAC 1129 Pro Val Lys Lys Ser Pro Lys Lys Pro Ala Lys Val Ser Gin Lys His 240 245 250
TCC TTT AAG AAT AAC ATT AAA AAT AAT GTA AAA AAC AAC GCC AAA ACC 1177 Ser Phe Lys Asn Asn He Lys Asn Asn Val Lys Asn Asn Ala Lys Thr 255 260 265
GCT TCC AAA AAA CAA GAA ATG TGC AAA AAT TGC TCT CCA GGG CAA AGG 1225 Ala Ser Lys Lys Gin Glu Met Cys Lys Asn Cys Ser Pro Gly Gin Arg 270 275 280
AAT GCG ATT TTA GCT AAC CAC ATC ACT CTC ATG CAA GAG CTT TAAAAAGTC 1276 Asn Ala He Leu Ala Asn His He Thr Leu Met Gin Glu Leu 285 290 295
CTAAAAATGG CGCAAAAAAC TCTTTTGATT ATCACTGATG GCATTGGGTA TCGTAAAGAT 1336
AGCGATCATA ACGCTTTCTT CCATGCCAAA AAACCCACTT ATGATTTGAT GTTTAAAACC 1396
TTGCCTTATA GCCTGATTGA TACGCATGGC TTGAGCGTGG GCTTACCTAA GGGGCAAATG 1456
GGAAATTCTG AAGTGGGGCA T 1477
(2) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 298 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
Met Met Thr Asn Cys Asp Asn He Lys Asp Phe Asn Ala Lys Gin Lys
1 5 10 15
Glu Val Leu Lys Ala Ala Tyr Gin Phe Gly Ser Lys Glu Asn Leu Gly
20 25 30
Tyr Glu Met Ala Gly He Ala Trp Lys Glu Ser Cys Ala Gly Val Tyr
35 40 45
Lys He Asn Phe Ser Asp Pro Ser Ala Gly Val Tyr His Ser Tyr He
50 55 60
Pro Ser Val Leu Lys Ser Tyr Gly His Asn Asp Ser Pro Phe Leu Arg 65 70 75 80
Asn Val Met Gly Glu Leu Leu He Lys Asp Asp Ala Phe Ala Ser Glu
85 90 95
Val Ala Leu Lys Glu Leu Leu Tyr Trp Lys Thr Arg Tyr His Asp Asn
100 105 110
Leu Lys Asp Met He Lys Ser Tyr Asn Lys Gly Ser Arg Trp Glu Arg
115 120 125
Ser Glu Lys Ser Asn Ala Asp Ala Glu Lys Tyr Tyr Glu Glu He Gin
130 135 140
Asp Arg He Arg Arg Leu Lys Glu Ser Lys He Phe Asp Ser Gin Ser 145 150 155 160
Ser Asn Asp Gin Glu Leu Gin Lys Ser Ala Asn Ser Asn Leu Asp Leu
165 170 175
Asp Pro He Gly Asn Ala Met Pro Gin Ala Leu He Ala Lys Glu Thr
180 185 190
Lys He Glu Glu Thr Gin Ala Glu Lys Ser Gin Glu Met Lys Glu Thr
195 200 205
Thr Ser Glu Gin Thr Lys Ser Lys Pro Glu Lys Ala Lys Asp Lys Pro
210 215 220
Met Tyr Leu Ala Gin He Asn Ser Thr Asp Phe Thr Pro Val Lys Lys 225 230 235 240
Ser Pro Lys Lys Pro Ala Lys Val Ser Gin Lys His Ser Phe Lys Asn
245 250 255
Asn He Lys Asn Asn Val Lys Asn Asn Ala Lys Thr Ala Ser Lys Lys
260 265 270
Gin Glu Met Cys Lys Asn Cys Ser Pro Gly Gin Arg Asn Ala He Leu
275 280 285
Ala Asn His He Thr Leu Met Gin Glu Leu 290 295
(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 141...1340 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
TTAGTGTTGA TTTTTTTATC GTTAGTGTTT GTGCGTCCTT TAGAGGCTTT GAGCGTGTTT 60
ATGGGGTTGT ATTTGATTTA TGGCATCATT CGGTGGCTCT TTTTAATGGT AAAAATTATT 120
TTTAATAAAA ATAAAAGCGC ATG AAA GAA TCT TTT TAC ATA GAG GGA ATG 170
Met Lys Glu Ser Phe Tyr Ile Glu Gly Met
1 5 10
ACT TGC ACG GCG TGT TCT AGC GGG ATT GAA CGC TCT TTG GGG CGT AAG 218 Thr Cys Thr Ala Cys Ser Ser Gly He Glu Arg Ser Leu Gly Arg Lys 15 20 25
AGT TTT GTG AAA AAA ATA GAA GTG AGC CTT TTA AAT AAG AGC GCT AAC 266 Ser Phe Val Lys Lys He Glu Val Ser Leu Leu Asn Lys Ser Ala Asn
30 35 40
ATT GAA TTT GAC GAA AAC CAA ACC AAT TTA GAC GAA ATT TTT AAA CTC 314 He Glu Phe Asp Glu Asn Gin Thr Asn Leu Asp Glu He Phe Lys Leu 45 50 55
ATT GAA AAG CTA GGC TAT AGC CCT AAA AAA GCT CTG ACA AAA GAA AAA 362 He Glu Lys Leu Gly Tyr Ser Pro Lys Lys Ala Leu Thr Lys Glu Lys 60 65 70
AAA GAA TTT TTT AGC CCT AAT GTT AAA TTA GCG TTA GCG GTT ATT TTC 410 Lys Glu Phe Phe Ser Pro Asn Val Lys Leu Ala Leu Ala Val He Phe 75 80 85 90
ACG CTT TTT GTG GTG TAT CTT TCT ATG GGG GCG ATG CTT AGC CCT AGC 458 Thr Leu Phe Val Val Tyr Leu Ser Met Gly Ala Met Leu Ser Pro Ser 95 100 105
CTT TTA CCT GAA AGC TTG CTT GCA ATT GAT AAT CAT AGT AAT TTT TTA 506 Leu Leu Pro Glu Ser Leu Leu Ala He Asp Asn His Ser Asn Phe Leu 110 115 120
AAC GCT TGC TTA CAG CTT ATA GGC GCA CTC ATT GTC ATG CAT TTG GGG 554 Asn Ala Cys Leu Gin Leu He Gly Ala Leu He Val Met His Leu Gly 125 130 135
AGG GAT TTT TAC ATT CAA GGG TTT AAA GCC TTA TGG CAC AGA CAA CCC 602 Arg Asp Phe Tyr He Gin Gly Phe Lys Ala Leu Trp His Arg Gin Pro 140 145 150
AAC ATG AGC AGC CTT ATC GCC ATA GGC ACA AGC GCT GCC TTA ATT TCA 650 Asn Met Ser Ser Leu He Ala He Gly Thr Ser Ala Ala Leu He Ser 155 160 165 170
AGC CTG TGG CAA TTG TAT TTG GTC TAT ACC AAT CAT TAT ACC GAT CAG 698 Ser Leu Trp Gin Leu Tyr Leu Val Tyr Thr Asn His Tyr Thr Asp Gin 175 180 185
TGG TCT TAT GGG CAT TAT TAT TTT GAA AGC GTG TGC GTG ATT TTA ATG 746 Trp Ser Tyr Gly His Tyr Tyr Phe Glu Ser Val Cys Val He Leu Met 190 195 200
TTT GTG ATG GTG GGC AAA CGC ATT GAA AAT GTT TCT AAA GAC AAA GCT 794 Phe Val Met Val Gly Lys Arg He Glu Asn Val Ser Lys Asp Lys Ala 205 210 215
TTA GAC GCT ATG CAA GCC TTG ATG AAA AAC GCC CCA AAA ACC GCC CTT 842 Leu Asp Ala Met Gin Ala Leu Met Lys Asn Ala Pro Lys Thr Ala Leu 220 225 230
AAA ATG CAA AAT AAC CAA CAG ATT GAA GTT TTA GTG GAT AGC ATT GTG 890 Lys Met Gin Asn Asn Gin Gin He Glu Val Leu Val Asp Ser He Val 235 240 245 250
GTG GGG GAT ATT CTA AAA GTC CTC CCT GGA AGC GCG ATT GCG GTG GAT 938
Val Gly Asp He Leu Lys Val Leu Pro Gly Ser Ala He Ala Val Asp 255 260 265
GGT GAA ATC ATA GAG GGC GAA GGG GAA TTA GAT GAG AGC ATG TTG AGC 986 Gly Glu He He Glu Gly Glu Gly Glu Leu Asp Glu Ser Met Leu Ser 270 275 280
GGC GAA GCG TTG CCG GTT TAT AAA AAA GTC GGC GAT AAA GTC TTT TCA 1034 Gly Glu Ala Leu Pro Val Tyr Lys Lys Val Gly Asp Lys Val Phe Ser 285 290 295
GGG ACA TTC AAT AGC CAC ACG AGT TTT TTA ATG AAA GCC ACG CAA AAC 1082 Gly Thr Phe Asn Ser His Thr Ser Phe Leu Met Lys Ala Thr Gin Asn 300 305 310
AAC AAA AAC AGC ACC TTG TCT CAA ATT ATA GAA ATG ATT TAT AAC GCT 1130 Asn Lys Asn Ser Thr Leu Ser Gin He He Glu Met He Tyr Asn Ala 315 320 325 330
CAA AGT TCA AAG GCA GAG ATT TCT CGC TTA GCG GAT AAG GTT TCA AGC 1178 Gin Ser Ser Lys Ala Glu He Ser Arg Leu Ala Asp Lys Val Ser Ser 335 340 345
GTG TTT GTG CCA AGC GTG ATC GCT ATT TCT ATT TTA GCG TTT GTG GTG 1226 Val Phe Val Pro Ser Val He Ala He Ser He Leu Ala Phe Val Val 350 355 360
TGG CTC ATC ATT GCA CCT AAG CCC GAT TTT TGG TGG AAT TTT GGA ATC 1274
Trp Leu He He Ala Pro Lys Pro Asp Phe Trp Trp Asn Phe Gly He
365 370 375
GCT TTA GAA GTG TTT GTA TCG GTT TTA GTG ATT TCT TGC CCT TGC GCT 1322
Ala Leu Glu Val Phe Val Ser Val Leu Val He Ser Cys Pro Cys Ala 380 385 390
TTA GGA TTG CTA CGC CTA TGAGCATTTT AGTAGCGAAC CAGAAAGCGA GTTCTTTA 1378 Leu Gly Leu Leu Arg Leu 395 400
GGGTTATTTT TTAAAGACGC TAAAAGTTTA GAAAAAGCAA GGCTAGTCAA TACGATCGTT 1438
TTTGATAAAA CCGGCACGCT CACTAACGGC AAGCCTGTCG TTAAAAGCGT TCATTCTAAG 1498
ATAGAATTAT TAGAGTT 1515
(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 400 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
Met Lys Glu Ser Phe Tyr He Glu Gly Met Thr Cys Thr Ala Cys Ser
1 5 10 15
Ser Gly He Glu Arg Ser Leu Gly Arg Lys Ser Phe Val Lys Lys He
20 25 30
Glu Val Ser Leu Leu Asn Lys Ser Ala Asn He Glu Phe Asp Glu Asn
35 40 45
Gin Thr Asn Leu Asp Glu He Phe Lys Leu He Glu Lys Leu Gly Tyr
50 55 60
Ser Pro Lys Lys Ala Leu Thr Lys Glu Lys Lys Glu Phe Phe Ser Pro 65 70 75 80
Asn Val Lys Leu Ala Leu Ala Val He Phe Thr Leu Phe Val Val Tyr
85 90 95
Leu Ser Met Gly Ala Met Leu Ser Pro Ser Leu Leu Pro Glu Ser Leu
100 105 110
Leu Ala He Asp Asn His Ser Asn Phe Leu Asn Ala Cys Leu Gin Leu
115 120 125
He Gly Ala Leu He Val Met His Leu Gly Arg Asp Phe Tyr He Gin
130 135 140
Gly Phe Lys Ala Leu Trp His Arg Gin Pro Asn Met Ser Ser Leu He 145 150 155 160
Ala He Gly Thr Ser Ala Ala Leu He Ser Ser Leu Trp Gin Leu Tyr
165 170 175
Leu Val Tyr Thr Asn His Tyr Thr Asp Gin Trp Ser Tyr Gly His Tyr
180 185 190
Tyr Phe Glu Ser Val Cys Val He Leu Met Phe Val Met Val Gly Lys
195 200 205
Arg He Glu Asn Val Ser Lys Asp Lys Ala Leu Asp Ala Met Gin Ala
210 215 220
Leu Met Lys Asn Ala Pro Lys Thr Ala Leu Lys Met Gin Asn Asn Gin 225 230 235 240
Gin He Glu Val Leu Val Asp Ser He Val Val Gly Asp He Leu Lys
245 250 255
Val Leu Pro Gly Ser Ala He Ala Val Asp Gly Glu He He Glu Gly
260 265 270
Glu Gly Glu Leu Asp Glu Ser Met Leu Ser Gly Glu Ala Leu Pro Val
275 280 285
Tyr Lys Lys Val Gly Asp Lys Val Phe Ser Gly Thr Phe Asn Ser His
290 295 300
Thr Ser Phe Leu Met Lys Ala Thr Gin Asn Asn Lys Asn Ser Thr Leu 305 310 315 320
Ser Gin He He Glu Met He Tyr Asn Ala Gin Ser Ser Lys Ala Glu
325 330 335
He Ser Arg Leu Ala Asp Lys Val Ser Ser Val Phe Val Pro Ser Val
340 345 350
He Ala He Ser He Leu Ala Phe Val Val Trp Leu He He Ala Pro
355 360 365
Lys Pro Asp Phe Trp Trp Asn Phe Gly He Ala Leu Glu Val Phe Val
370 375 380
Ser Val Leu Val He Ser Cys Pro Cys Ala Leu Gly Leu Leu Arg Leu 385 390 395 400
(2) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1443 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 76...1389 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
ACTTTAAAAA ACCCCCTTAA AAAGGTTTTT AGGTATAATT AGCGATCTTT TAGTTTCAAA 60
TAGTAGAGAG ATGGG ATG AAA AAA ATA TGG CTT TTA GTG TGG GGC TTG TGT 111
Met Lys Lys Ile Trp Leu Leu Val Trp Gly Leu Cys
1 5 10
TCT TGG GTG TTT TTG CAT GCG ATA GAG ATG ATA GAA AAA GCC CCT ACA 159 Ser Trp Val Phe Leu His Ala He Glu Met He Glu Lys Ala Pro Thr 15 20 25
AAT GTA GAG GAT AGA GAC AAA GCC CCC CAT TTG TTG CTT TTA GCA GGG 207 Asn Val Glu Asp Arg Asp Lys Ala Pro His Leu Leu Leu Leu Ala Gly 30 35 40
ATT CAA GGC GAT GAG CCT GGT GGG TTT AAT GCA ACT AAT TTG TTT TTA 255 He Gin Gly Asp Glu Pro Gly Gly Phe Asn Ala Thr Asn Leu Phe Leu 45 50 55 60
ATG CAT TAT AGC GTT TTA AAA GGT TTG GTT GAA GTG GTT CCT GTA TTG 303 Met His Tyr Ser Val Leu Lys Gly Leu Val Glu Val Val Pro Val Leu 65 70 75
AAT AAG CCT TCC ATG TTA AGA AAT CAT AGG GGC TTG TAT GGG GAT ATG 351 Asn Lys Pro Ser Met Leu Arg Asn His Arg Gly Leu Tyr Gly Asp Met 80 85 90
AAC CGC AAA TTT GCC GCT TTA GAC AAG AAT GAC CCT GAA TAC CCC ACT 399 Asn Arg Lys Phe Ala Ala Leu Asp Lys Asn Asp Pro Glu Tyr Pro Thr 95 100 105
ATC CAG GAA ATC AAA TCC TTG ATT GCA AAA CCC AGT ATA GAC GCT GTC 447 He Gin Glu He Lys Ser Leu He Ala Lys Pro Ser He Asp Ala Val 110 115 120
TTG CAT TTG CAT GAT GGC GGT GGG TAT TAC CGC CCT GTT TAT GTT GAT 495 Leu His Leu His Asp Gly Gly Gly Tyr Tyr Arg Pro Val Tyr Val Asp 125 130 135 140
GCG ATG CTC AAT CCT AAG CGC TGG GGG AAT TGC TTT ATT ATT GAT CAA 543 Ala Met Leu Asn Pro Lys Arg Trp Gly Asn Cys Phe He He Asp Gin 145 150 155
GAT GAG GTT AAA GGG GCG AAA TTC CCT AAT TTG CTT GCT TTT GCA AAC 591 Asp Glu Val Lys Gly Ala Lys Phe Pro Asn Leu Leu Ala Phe Ala Asn 160 165 170
AAT ACG ATT GAG AGT ATC AAC GCC CAT TTA TTG CAC CCC ATT GAA GAG 639 Asn Thr He Glu Ser He Asn Ala His Leu Leu His Pro He Glu Glu 175 180 185
TAT CAT TTA AAA AAC ACG CGC ACC GCG CAA GGC GAT ACA GAA ATG CAA 687 Tyr His Leu Lys Asn Thr Arg Thr Ala Gin Gly Asp Thr Glu Met Gin 190 195 200
AAA GCC CTA ACT TTT TAT GCG ATC AAC CAA AAA AAG AGC GCT TTT GCC 735 Lys Ala Leu Thr Phe Tyr Ala He Asn Gin Lys Lys Ser Ala Phe Ala 205 210 215 220
AAT GAA GCT AGC AAA GAA CTC CCT TTA GCA TCA AGG GTG TTT TAC CAC 783 Asn Glu Ala Ser Lys Glu Leu Pro Leu Ala Ser Arg Val Phe Tyr His 225 230 235
CTG CAA GCC ATT GAG GGC TTA CTC AAT CAG CTC AAT ATC CCT TTT AAG 831 Leu Gin Ala He Glu Gly Leu Leu Asn Gin Leu Asn He Pro Phe Lys 240 245 250
CGC GAT TTT GAT CTT AAC CCT AAC AGC GTG CAT GCC CTA ATC AAT GAT 879 Arg Asp Phe Asp Leu Asn Pro Asn Ser Val His Ala Leu He Asn Asp 255 260 265
AAA AAC TTG TGG GCA AAA ATC AGC TCT TTG CCT AAA ATG CCC CTT TTT 927 Lys Asn Leu Trp Ala Lys He Ser Ser Leu Pro Lys Met Pro Leu Phe 270 275 280
AAC TTG CGC CCT AAA CTC AAT CAT TTC CCC TTA CCC CAC AAC ACT AAA 975 Asn Leu Arg Pro Lys Leu Asn His Phe Pro Leu Pro His Asn Thr Lys 285 290 295 300
ATC CCA CAA ATC CCC ATA GAG AGC AAC GCT TAC ATT GTA GGG CTA GTC 1023 He Pro Gin He Pro He Glu Ser Asn Ala Tyr He Val Gly Leu Val 305 310 315
AAA AAT AAA CAA GAA GTG TTT TTA AAA TAC GGC AAC AAG CTC ATG ACA 1071 Lys Asn Lys Gin Glu Val Phe Leu Lys Tyr Gly Asn Lys Leu Met Thr 320 325 330
CGA TTA TCG CCT TTT TAC ATA GAG TTT GAT CCT TCT TTA GAA GAA GTG 1119 Arg Leu Ser Pro Phe Tyr He Glu Phe Asp Pro Ser Leu Glu Glu Val 335 340 345
AAA ATG CAA ATT GAC AAT AAG GAT CAA ATG GTT AAA ATA GGG AGC GTG 1167 Lys Met Gin He Asp Asn Lys Asp Gin Met Val Lys He Gly Ser Val 350 355 360
GTT GAA GTG AAA GAG AGT TTT TAT ATC CAT GCT ATG GAC AAT ATC CGT 1215 Val Glu Val Lys Glu Ser Phe Tyr He His Ala Met Asp Asn He Arg 365 370 375 380
GCG AAT GTG ATT GGC TTT AGC GTT TCT AAT GAA AAT AAG CCT AAT GAA 1263 Ala Asn Val He Gly Phe Ser Val Ser Asn Glu Asn Lys Pro Asn Glu 385 390 395
GCG GGT TAT ACG ATT AAA TTT AAA GAT TTT CAA AAA CGC TTT TCA TTG 1311 Ala Gly Tyr Thr He Lys Phe Lys Asp Phe Gin Lys Arg Phe Ser Leu 400 405 410
GAC AAG CAA GAA AGG ATC TAT CGC ATA GAA TTT TAT AAA AAC AAC GCG 1359 Asp Lys Gin Glu Arg He Tyr Arg He Glu Phe Tyr Lys Asn Asn Ala 415 420 425
TTT AGC GGG ATG ATC TTA GTG AAA TTT GTG TAGGAATGGA TAAATCTCAT TGC 1412 Phe Ser Gly Met He Leu Val Lys Phe Val 430 435
CTTTTAACAT TCAAGGGTTT TGGTATTTTT T 1443
(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
Met Lys Lys He Trp Leu Leu Val Trp Gly Leu Cys Ser Trp Val Phe
1 5 10 15
Leu His Ala He Glu Met He Glu Lys Ala Pro Thr Asn Val Glu Asp
20 25 30
Arg Asp Lys Ala Pro His Leu Leu Leu Leu Ala Gly He Gin Gly Asp
35 40 45
Glu Pro Gly Gly Phe Asn Ala Thr Asn Leu Phe Leu Met His Tyr Ser
50 55 60
Val Leu Lys Gly Leu Val Glu Val Val Pro Val Leu Asn Lys Pro Ser 65 70 75 80
Met Leu Arg Asn His Arg Gly Leu Tyr Gly Asp Met Asn Arg Lys Phe
85 90 95
Ala Ala Leu Asp Lys Asn Asp Pro Glu Tyr Pro Thr He Gin Glu He
100 105 110
Lys Ser Leu He Ala Lys Pro Ser He Asp Ala Val Leu His Leu His
115 120 125
Asp Gly Gly Gly Tyr Tyr Arg Pro Val Tyr Val Asp Ala Met Leu Asn
130 135 140
Pro Lys Arg Trp Gly Asn Cys Phe He He Asp Gin Asp Glu Val Lys 145 150 155 160
Gly Ala Lys Phe Pro Asn Leu Leu Ala Phe Ala Asn Asn Thr He Glu
165 170 175
Ser Ile Asn Ala His Leu Leu His Pro Ile Glu Glu Tyr His Leu Lys
180 185 190
Asn Thr Arg Thr Ala Gin Gly Asp Thr Glu Met Gin Lys Ala Leu Thr
195 200 205
Phe Tyr Ala He Asn Gin Lys Lys Ser Ala Phe Ala Asn Glu Ala Ser
210 215 220
Lys Glu Leu Pro Leu Ala Ser Arg Val Phe Tyr His Leu Gin Ala He 225 230 235 240
Glu Gly Leu Leu Asn Gin Leu Asn He Pro Phe Lys Arg Asp Phe Asp
245 250 255
Leu Asn Pro Asn Ser Val His Ala Leu He Asn Asp Lys Asn Leu Trp
260 265 270
Ala Lys He Ser Ser Leu Pro Lys Met Pro Leu Phe Asn Leu Arg Pro
275 280 285
Lys Leu Asn His Phe Pro Leu Pro His Asn Thr Lys He Pro Gin He
290 295 300
Pro He Glu Ser Asn Ala Tyr He Val Gly Leu Val Lys Asn Lys Gin 305 310 315 320
Glu Val Phe Leu Lys Tyr Gly Asn Lys Leu Met Thr Arg Leu Ser Pro
325 330 335
Phe Tyr He Glu Phe Asp Pro Ser Leu Glu Glu Val Lys Met Gin He
340 345 350
Asp Asn Lys Asp Gin Met Val Lys He Gly Ser Val Val Glu Val Lys
355 360 365
Glu Ser Phe Tyr He His Ala Met Asp Asn He Arg Ala Asn Val He
370 375 380
Gly Phe Ser Val Ser Asn Glu Asn Lys Pro Asn Glu Ala Gly Tyr Thr 385 390 395 400
He Lys Phe Lys Asp Phe Gin Lys Arg Phe Ser Leu Asp Lys Gin Glu
405 410 415
Arg He Tyr Arg He Glu Phe Tyr Lys Asn Asn Ala Phe Ser Gly Met
420 425 430
He Leu Val Lys Phe Val 435
(2) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1280 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 66...1223 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
ATCAATACCC CTTAAATAAA AGATATAATG CTGTATTATA AGCTAGTTTT AATTACAATT 60
TTCAA ATG TTA AGG AAA AAC ATT TTA GCT TAC TAT GGG GCG AAT TTT CTC 110
Met Leu Arg Lys Asn He Leu Ala Tyr Tyr Gly Ala Asn Phe Leu
1 5 10 15
TTA ATC ATC GCT CAA AGC TTA CCC CAT GCG ATT TTA ACC CCC TTG TTG 158 Leu He He Ala Gin Ser Leu Pro His Ala He Leu Thr Pro Leu Leu 20 25 30
CTT TCT AAA GGG CTT AGT TTG AGT GAA ATC TTG CTC GTG CAA ACC TTT 206 Leu Ser Lys Gly Leu Ser Leu Ser Glu He Leu Leu Val Gin Thr Phe 35 40 45
TTT AGC TTT TGC GTG CTA GTG GCT GAA TAC CCA AGC GGC GTT TTA GCG 254 Phe Ser Phe Cys Val Leu Val Ala Glu Tyr Pro Ser Gly Val Leu Ala 50 55 60
GAT TTG ATG AGC CGA AAA AAT TTA TTC CTG GTT TCT AAT GCC TTT TTA 302 Asp Leu Met Ser Arg Lys Asn Leu Phe Leu Val Ser Asn Ala Phe Leu 65 70 75
ATC GCT AGT TTT TCG TTT GTG CTG TTT TTT GAT AGC TTT ATT TTC ATG 350 He Ala Ser Phe Ser Phe Val Leu Phe Phe Asp Ser Phe He Phe Met 80 85 90 95
CTT TTA GCG TGG GGG TTG TAT GGT TTG TAT AGC GCA TGC TCT AGC GGC 398 Leu Leu Ala Trp Gly Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly 100 105 110
ACG ATT GAA GCT TCA CTC ATC ACA GAC ATT AAG GAA AAC AAA AAA GAT 446 Thr He Glu Ala Ser Leu He Thr Asp He Lys Glu Asn Lys Lys Asp 115 120 125
TTA TCC AAG TTT TTA GCC AAA AAC AAT CAA ATT ACT TAT TTA GGC ATG 494 Leu Ser Lys Phe Leu Ala Lys Asn Asn Gin He Thr Tyr Leu Gly Met 130 135 140
ATT ATA GGG AGT TCT TTG GGA TCG TTT TTG TAT CTC AAA GTC CAT GCG 542 He He Gly Ser Ser Leu Gly Ser Phe Leu Tyr Leu Lys Val His Ala 145 150 155
ATG CTG TAT ATT GTG GGG ATT TTT TTA ATC ATG CTC TGT GTG CTA ACG 590 Met Leu Tyr He Val Gly He Phe Leu He Met Leu Cys Val Leu Thr 160 165 170 175
ATC ATT TTT TAT TTT AAA GAG AAA GAA GGG GAT TTT AAA AGC CAA AAA 638 He He Phe Tyr Phe Lys Glu Lys Glu Gly Asp Phe Lys Ser Gin Lys 180 185 190
AGC CTG AAA CTC CTT AAA GAG CAA GTC AAA GGC AGT CTT AAA GAG CTT 686 Ser Leu Lys Leu Leu Lys Glu Gin Val Lys Gly Ser Leu Lys Glu Leu 195 200 205
AAA GAT AAC CCC AAA CTT AAA ATT CTG TTA GTG GGG CAT TTG ATT ACG 734 Lys Asp Asn Pro Lys Leu Lys He Leu Leu Val Gly His Leu He Thr 210 215 220
CCC GTC TTT TTT ATG AGC CAT TTT CAA ATG TGG CAA GCG TAT TTT TTA 782 Pro Val Phe Phe Met Ser His Phe Gin Met Trp Gin Ala Tyr Phe Leu 225 230 235
AAA CAA GGC GTT AAA GAG CAA TAC CTT TTT GTG TTT TAT ATC GCT TTT 830 Lys Gin Gly Val Lys Glu Gin Tyr Leu Phe Val Phe Tyr He Ala Phe 240 245 250 255
CAA GTG ATT TCT ATT CTC ATT CAT TTT TTA AAA GCC TCT AGT TAT AGC 878 Gin Val He Ser He Leu He His Phe Leu Lys Ala Ser Ser Tyr Ser 260 265 270
CAA AAA ATC GCC TTG AGT TCG CTT GTG GTG TTG TTA GGC GTT AGC CCC 926 Gin Lys He Ala Leu Ser Ser Leu Val Val Leu Leu Gly Val Ser Pro 275 280 285
TTA TTG CTT AGC AAT ATC CCT TAT TGT TTC ATA GGG GTG TAT GCG CTC 974
Leu Leu Leu Ser Asn He Pro Tyr Cys Phe He Gly Val Tyr Ala Leu
290 295 300
ATG GTG GCG TTT TTC ACT TAC ATG AGC TAT TGC TTA AAC TAT CAA TTC 1022
Met Val Ala Phe Phe Thr Tyr Met Ser Tyr Cys Leu Asn Tyr Gin Phe
305 310 315
TCC AAA TTC GTT TCT AAA AAC AAC ATT TCC TCG CTC TCA TCG CTT TTA 1070 Ser Lys Phe Val Ser Lys Asn Asn He Ser Ser Leu Ser Ser Leu Leu 320 325 330 335
TCA AGC TGT GTG CGC GTG GTC TCT GTG CTA ATC TTA TCG CTC AGC AGT 1118 Ser Ser Cys Val Arg Val Val Ser Val Leu He Leu Ser Leu Ser Ser 340 345 350
CTG GAA CTG CGT TAC TTC TCA CCC CTA ACT ATC ATA ACC ATG CAT TTT 1166 Leu Glu Leu Arg Tyr Phe Ser Pro Leu Thr He He Thr Met His Phe 355 360 365
GCC TTG ACG CTT ATC ATC CTC TTT TTC TTT TTG TAT AAG GCT AAG CCG 1214 Ala Leu Thr Leu He He Leu Phe Phe Phe Leu Tyr Lys Ala Lys Pro 370 375 380
TTT GAT GAG TGAGCGGCTT TAAGAGTGCA ACCTTTTAGC GATTTCTATA GCAACATCA 1272 Phe Asp Glu 385
TAGCCATG 1280
(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 386 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu Leu
1 5 10 15
Ile Ile Ala Gln Ser Leu Pro His Ala Ile Leu Thr Pro Leu Leu Leu
20 25 30
Ser Lys Gly Leu Ser Leu Ser Glu Ile Leu Leu Val Gln Thr Phe Phe
35 40 45
Ser Phe Cys Val Leu Val Ala Glu Tyr Pro Ser Gly Val Leu Ala Asp
50 55 60
Leu Met Ser Arg Lys Asn Leu Phe Leu Val Ser Asn Ala Phe Leu Ile
65 70 75 80
Ala Ser Phe Ser Phe Val Leu Phe Phe Asp Ser Phe Ile Phe Met Leu
85 90 95
Leu Ala Trp Gly Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly Thr
100 105 110
Ile Glu Ala Ser Leu Ile Thr Asp Ile Lys Glu Asn Lys Lys Asp Leu
115 120 125
Ser Lys Phe Leu Ala Lys Asn Asn Gln Ile Thr Tyr Leu Gly Met Ile
130 135 140
Ile Gly Ser Ser Leu Gly Ser Phe Leu Tyr Leu Lys Val His Ala Met
145 150 155 160
Leu Tyr Ile Val Gly Ile Phe Leu Ile Met Leu Cys Val Leu Thr Ile
165 170 175
Ile Phe Tyr Phe Lys Glu Lys Glu Gly Asp Phe Lys Ser Gln Lys Ser
180 185 190
Leu Lys Leu Leu Lys Glu Gln Val Lys Gly Ser Leu Lys Glu Leu Lys
195 200 205
Asp Asn Pro Lys Leu Lys Ile Leu Leu Val Gly His Leu Ile Thr Pro
210 215 220
Val Phe Phe Met Ser His Phe Gln Met Trp Gln Ala Tyr Phe Leu Lys
225 230 235 240
Gln Gly Val Lys Glu Gln Tyr Leu Phe Val Phe Tyr Ile Ala Phe Gln
245 250 255
Val Ile Ser Ile Leu Ile His Phe Leu Lys Ala Ser Ser Tyr Ser Gln
260 265 270
Lys Ile Ala Leu Ser Ser Leu Val Val Leu Leu Gly Val Ser Pro Leu
275 280 285
Leu Leu Ser Asn Ile Pro Tyr Cys Phe Ile Gly Val Tyr Ala Leu Met
290 295 300
Val Ala Phe Phe Thr Tyr Met Ser Tyr Cys Leu Asn Tyr Gln Phe Ser
305 310 315 320
Lys Phe Val Ser Lys Asn Asn Ile Ser Ser Leu Ser Ser Leu Leu Ser
325 330 335
Ser Cys Val Arg Val Val Ser Val Leu Ile Leu Ser Leu Ser Ser Leu
340 345 350
Glu Leu Arg Tyr Phe Ser Pro Leu Thr Ile Ile Thr Met His Phe Ala
355 360 365
Leu Thr Leu Ile Ile Leu Phe Phe Phe Leu Tyr Lys Ala Lys Pro Phe
370 375 380
Asp Glu
385
(2) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1264 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 51...1205
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
ATTAAATATG ACTATATACA CTACAACAAT AAGATTTTGA AAGGTTGGTA ATG GAA 56
Met Glu
1
TCA GTA AAA ACA GGA AAA ACA AAT AAG GTT GGC AAG AAT ACA GAG ATG 104 Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr Glu Met 5 10 15
GCT AAT ACA AAG GCA AAT AAA GAG GCT CAT TTT AAA CAA GCG AGC ACC 152 Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gin Ala Ser Thr 20 25 30
ATT ACA AAT ATA ATC AGA TCA ATT CGT GGG ATT TTT ACA AAA ATT GCA 200 He Thr Asn He He Arg Ser He Arg Gly He Phe Thr Lys He Ala 35 40 45 50
AAG AAA GTT AGA GGA CTT GTA AAA AAA CAC CCC AAG AAA AGC AGT GCG 248 Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser Ser Ala 55 60 65
GCA TTA GTA GTA TTG ACC CAT ATT GCG TGC AAG AAA GCG AAA GAA TTA 296 Ala Leu Val Val Leu Thr His He Ala Cys Lys Lys Ala Lys Glu Leu 70 75 80
GAC GAT AAA GTC CAA GAT AAA TCC AAA CAA GCT GAA AAA GAA AAT CAA 344 Asp Asp Lys Val Gin Asp Lys Ser Lys Gin Ala Glu Lys Glu Asn Gin 85 90 95
ATC AAT TGG TGG AAA TAT TCA GGA TTA ACA ATA GCG ACA AGT TTA TTA 392 He Asn Trp Trp Lys Tyr Ser Gly Leu Thr He Ala Thr Ser Leu Leu 100 105 110
TTA GCC GCT TGT AGC ACT GGT GAT GTT AGT GAA CAA ATA GAA CTA GAA 440 Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gin He Glu Leu Glu 115 120 125 130
CAA GAA AAA CAA AAG ACG AGC AAT ATA GAG ACT AAC AAT CAA ATA AAA 488 Gin Glu Lys Gin Lys Thr Ser Asn He Glu Thr Asn Asn Gin He Lys 135 140 145
GTA GAA CAA GAA AAA CAA AAG ACA AGC AAT ATA GAG ACT AAT AAT CAA 536 Val Glu Gin Glu Lys Gin Lys Thr Ser Asn He Glu Thr Asn Asn Gin 150 155 160
ATA AAA GTA GAA CAA GAA CAA CAG AAA ACA GAA CAA GAA MGA CAG AAA 584 He Lys Val Glu Gin Glu Gin Gin Lys Thr Glu Gin Glu Xaa Gin Lys 165 170 175
ACA GAA CAA GAA AGA CAG AAG ACA GAA CAA GAA AAA CAA AAG ACC ATT 632 Thr Glu Gin Glu Arg Gin Lys Thr Glu Gin Glu Lys Gin Lys Thr He 180 185 190
AAA ACA CAG AAA GAT TTC ATT AAA TAT GTA GAA CAA AAT TGC CAA GAA 680 Lys Thr Gin Lys Asp Phe He Lys Tyr Val Glu Gin Asn Cys Gin Glu 195 200 205 210
AAT CAT AAT CAA TTC TTT ATT GAA AAA GGA GGA ATT AAG GCT GGT ATT 728 Asn His Asn Gin Phe Phe He Glu Lys Gly Gly He Lys Ala Gly He 215 220 225
GGT ATA GAA GTA GAA GCT GAA TGC AAA ACC CCT AAA CCT GCA AAA ACC 776 Gly He Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr 230 235 240
AAT CAA ACC CCT ATC CAG CCA AAA CAC CTC CCA AAC TCT AAA CAA CCC 824 Asn Gin Thr Pro He Gin Pro Lys His Leu Pro Asn Ser Lys Gin Pro 245 250 255
CGC TCT CAA AGA GGA TCA AAA GCG CAA GAG CTT ATC GCT TAT TTG CAA 872 Arg Ser Gin Arg Gly Ser Lys Ala Gin Glu Leu He Ala Tyr Leu Gin 260 265 270
AAA GAG CTA GAA TCT CTG CCC TAT TCA CAA AAA GCT ATC GCT AAA CAA 920 Lys Glu Leu Glu Ser Leu Pro Tyr Ser Gin Lys Ala He Ala Lys Gin 275 280 285 290
GTG GAT TTT TAT AGA CCA AGT TCT ATC GCT TAT TTA GAA CTA GAC CCT 968 Val Asp Phe Tyr Arg Pro Ser Ser He Ala Tyr Leu Glu Leu Asp Pro 295 300 305
AGA GAT TTT AAT GTT ACA GAA GAA TGG CAA AAA GAA AAT TTA AAA ATA 1016 Arg Asp Phe Asn Val Thr Glu Glu Trp Gin Lys Glu Asn Leu Lys He 310 315 320
CGC TCT AAA GCT CAA GCT AAA ATG CTT GAA ATG AGG AGT TTA AAA CCA 1064 Arg Ser Lys Ala Gin Ala Lys Met Leu Glu Met Arg Ser Leu Lys Pro 325 330 335
GAC TCA CAA GCC CAC CTT TCA ACC TCT CAA AGC CTT TTG TTC GTT CAA 1112 Asp Ser Gin Ala His Leu Ser Thr Ser Gin Ser Leu Leu Phe Val Gin 340 345 350
AAA ATA TTT GCT GAT GTT AAT AAA GAA ATA AAA GTA GTT GCT AAT ACT 1160 Lys He Phe Ala Asp Val Asn Lys Glu He Lys Val Val Ala Asn Thr 355 360 365 370
GAA AAG AAA GCA GAA AAA GCG GGT TAT GGT TAT AGT AAA AGG ATG TAGGC 1210 Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met 375 380 385
ATAAGAAAAC ACCATAAAAT CGTTCTTAGC TTATTTATAG TATTTTAAAA ACTC 1264
(2) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 385 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr
1 5 10 15
Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala
20 25 30
Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys
35 40 45
Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser
50 55 60
Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys
65 70 75 80
Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu
85 90 95
Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser
100 105 110
Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln Ile Glu
115 120 125
Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln
130 135 140
Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn
145 150 155 160
Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Glu Gln Glu Xaa
165 170 175
Gln Lys Thr Glu Gln Glu Arg Gln Lys Thr Glu Gln Glu Lys Gln Lys
180 185 190
Thr Ile Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys
195 200 205
Gln Glu Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala
210 215 220
Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala
225 230 235 240
Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys
245 250 255
Gln Pro Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr
260 265 270
Leu Gln Lys Glu Leu Glu Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala
275 280 285
Lys Gln Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu
290 295 300
Asp Pro Arg Asp Phe Asn Val Thr Glu Glu Trp Gln Lys Glu Asn Leu
305 310 315 320
Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Ser Leu
325 330 335
Lys Pro Asp Ser Gln Ala His Leu Ser Thr Ser Gln Ser Leu Leu Phe
340 345 350
Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile Lys Val Val Ala
355 360 365
Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg
370 375 380
Met
385
(2) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 410 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 62...340
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
ATTCATTTAC TTTTGAGAAA TATAATTCTC TCGCTTTTAA GATCATCACA AGGAGTTTCG 60
T ATG AAA AAG CAA ATC TTG ACA GGT GTT TTA TTA TCA GTT TTG GCA GTG 109
Met Lys Lys Gln Ile Leu Thr Gly Val Leu Leu Ser Val Leu Ala Val
1 5 10 15
AGT TCT GCA TAC GCT CAC AAA GAT AAA AAA GAC GCC AAA AAA CCT AAA 157
Ser Ser Ala Tyr Ala His Lys Asp Lys Lys Asp Ala Lys Lys Pro Lys
20 25 30
TTT AGC ACA GAA TTA GTC GTG GCT CAA AAC GAC AAA AAA GAC GCT AAA 205
Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys Asp Ala Lys
35 40 45
AAA CCT AAA TTT AGC ACA GAA TTA GTC GTG GCT CAA AAC GAC AAA AAA 253
Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys
50 55 60
GAC GCT AAA AAA CCT AAA TTT AGC ACA GAA TTA GTC GTG GCT CAA AAC 301
Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn
65 70 75 80
GAC AAA AAA GAC GCT AAA AAA CCT AAA AAC TCA GTG GTC TAATGGCTTT GA 352
Asp Lys Lys Asp Ala Lys Lys Pro Lys Asn Ser Val Val
85 90
CTCTAAAAAA GCGTTTTTAA AAACGCTTTT TTGGATATTA TCCTATAATT TCCTACCA 410
(2) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
Met Lys Lys Gln Ile Leu Thr Gly Val Leu Leu Ser Val Leu Ala Val
1 5 10 15
Ser Ser Ala Tyr Ala His Lys Asp Lys Lys Asp Ala Lys Lys Pro Lys
20 25 30
Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys Asp Ala Lys
35 40 45
Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn Asp Lys Lys
50 55 60
Asp Ala Lys Lys Pro Lys Phe Ser Thr Glu Leu Val Val Ala Gln Asn
65 70 75 80
Asp Lys Lys Asp Ala Lys Lys Pro Lys Asn Ser Val Val
85 90
(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2097 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 67...2046
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
TAAAAACCCC TATCATAGGG CGTGGCATGA AGAAAAAAGC AAAAGTCTTT TGGTATTGTT 60
TTAATC ATG ATT TAT TGG TTG TAT TTG GCG GTC TTT TTT TTG TTG AGC 108
Met He Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu Leu Ser 1 5 10
GCA TTA GAC GCT AAA GAA ATC GCT ATG CAA CGA TTT GAC AAA CAA AAC 156 Ala Leu Asp Ala Lys Glu He Ala Met Gin Arg Phe Asp Lys Gin Asn 15 20 25 30
CAT AAG ATT TTT GAA ATC CTT GCG GAT AAA GTG AGC GCT AAA GAC AAT 204 His Lys He Phe Glu He Leu Ala Asp Lys Val Ser Ala Lys Asp Asn
35 40 45
GTG ATA ACC GCA TCA GGG AAT GCG ATC TTA TTG AAT TAT GAT GTG TAT 252 Val He Thr Ala Ser Gly Asn Ala He Leu Leu Asn Tyr Asp Val Tyr 50 55 60
ATT CTA GCG GAC AAG GTG CGT TAT GAC ACT AAA ACC AAA GAA GCG TTA 300 He Leu Ala Asp Lys Val Arg Tyr Asp Thr Lys Thr Lys Glu Ala Leu 65 70 75
TTA GAG GGG AAT ATC AAG GTT TAT AGG GGC GAG GGT TTG CTC GTT AAA 348 Leu Glu Gly Asn He Lys Val Tyr Arg Gly Glu Gly Leu Leu Val Lys 80 85 90
ACC GAT TAC GTG AAA TTG AGT TTG AAT GAA AAA TAT GAA ATC ATT TTC 396 Thr Asp Tyr Val Lys Leu Ser Leu Asn Glu Lys Tyr Glu He He Phe 95 100 105 110
CCC TTT TAT GTC CAA GAC AGC GTG AGC GGG ATT TGG GTG AGC GCG GAT 444 Pro Phe Tyr Val Gin Asp Ser Val Ser Gly He Trp Val Ser Ala Asp 115 120 125
ATT GCC AGC GGA AAG GAT CAA AAA TAT AAG GTT AAA AAC ATG AGC ACT 492 He Ala Ser Gly Lys Asp Gin Lys Tyr Lys Val Lys Asn Met Ser Thr 130 135 140
TCA GGG TGC AGC ATT GAT AAC CCC ATT TGG CAT GTC AAT GCG ACT TCA 540 Ser Gly Cys Ser He Asp Asn Pro He Trp His Val Asn Ala Thr Ser 145 150 155
GGC TCA TTC AAC ATG CAA AAA TCG CAT TTG TCT ATG TGG AAT CCT AAG 588 Gly Ser Phe Asn Met Gin Lys Ser His Leu Ser Met Trp Asn Pro Lys 160 165 170
ATC TAT GTC GGT GAT ATT CCT GTA TTG TAT TTG CCC TAT ATT TTC ATG 636 He Tyr Val Gly Asp He Pro Val Leu Tyr Leu Pro Tyr He Phe Met 175 180 185 190
TCC ACG AGC AAT AAA AGA ACT ACT GGG TTT TTA TAC CCT GAG TTT GGC 684 Ser Thr Ser Asn Lys Arg Thr Thr Gly Phe Leu Tyr Pro Glu Phe Gly 195 200 205
ACT TCC AAC TTA GAC GGC TTT ATT TAT TTG CAA CCC TTT TAT TTA GCC 732 Thr Ser Asn Leu Asp Gly Phe He Tyr Leu Gin Pro Phe Tyr Leu Ala 210 215 220
CCC AAA AAC TCA TGG GAT ATG ACC TTT ACC CCA CAA ATC CGC TAT AAA 780 Pro Lys Asn Ser Trp Asp Met Thr Phe Thr Pro Gin He Arg Tyr Lys 225 230 235
AGG GGT TTT GGC TTG AAT TTT GAA GCG CGC TAC ATT AAC TCT AAA AAC 828 Arg Gly Phe Gly Leu Asn Phe Glu Ala Arg Tyr He Asn Ser Lys Asn 240 245 250
GAC AGG TTT TTA TTC AAC GCG CGC TAT TTT AGG AAT TAC ACC CAA TAT 876 Asp Arg Phe Leu Phe Asn Ala Arg Tyr Phe Arg Asn Tyr Thr Gin Tyr
255 260 265 270
GTC AAA CGC TAC GAT TTG AGG AAT CAA AAT ATC TAC GGG TTT GAA TTT 924 Val Lys Arg Tyr Asp Leu Arg Asn Gin Asn He Tyr Gly Phe Glu Phe 275 280 285
TTA AGC TCT AGC AGG GAC ACT TTA CAA AAA TAC TTC CAC CTT AAG TCT 972 Leu Ser Ser Ser Arg Asp Thr Leu Gin Lys Tyr Phe His Leu Lys Ser 290 295 300
AAT ATT GAC AAC GGG CAT TAC ATT GAC TTT TTA TAC ATG AAC GAT TTG 1020 Asn He Asp Asn Gly His Tyr He Asp Phe Leu Tyr Met Asn Asp Leu 305 310 315
GAC TAT GTG CGT TTT GAA AAG GTT AAT AAG CGT ATC ACA GAC GCC ACG 1068 Asp Tyr Val Arg Phe Glu Lys Val Asn Lys Arg He Thr Asp Ala Thr 320 325 330
CAC ATG TCT AGG GCG AAT TAC TAT TTG CAA ACA GAA AAC AAT TAT TAC 1116 His Met Ser Arg Ala Asn Tyr Tyr Leu Gin Thr Glu Asn Asn Tyr Tyr 335 340 345 350
GGC TTG AAT ATC AAG TAT TTT TTA AAC CTG AAT AAA ATC AAC AAT AAC 1164 Gly Leu Asn He Lys Tyr Phe Leu Asn Leu Asn Lys He Asn Asn Asn 355 360 365
CGC ACT TTC CAA TCT GTC CCT AAT TTG CAA TAC CAT AAA TAT TTA AAT 1212 Arg Thr Phe Gin Ser Val Pro Asn Leu Gin Tyr His Lys Tyr Leu Asn 370 375 380
TCT TTG TAT TTT AGA AAT TTG TTG TAT TCG GTG GAT TAT CAG TTT AGA 1260 Ser Leu Tyr Phe Arg Asn Leu Leu Tyr Ser Val Asp Tyr Gin Phe Arg 385 390 395
AAC ACC GCA AGA GAG ATT GGT TAT GGC TAT GTG CAA AAC GCT TTG AAT 1308 Asn Thr Ala Arg Glu He Gly Tyr Gly Tyr Val Gin Asn Ala Leu Asn 400 405 410
GTG CCG GTG GGC TTG CAA TTT TCT TTG TTT AAA AAG TAT TTG TCT TTA 1356 Val Pro Val Gly Leu Gin Phe Ser Leu Phe Lys Lys Tyr Leu Ser Leu 415 420 425 430
GGG CTT TGG AAT GAT CTC CAA CTA TCT AAT GTG GCT TTA ATG CAA TCT 1404 Gly Leu Trp Asn Asp Leu Gin Leu Ser Asn Val Ala Leu Met Gin Ser 435 440 445
AAA AAT TCC TTC GTG CCT ACG ATC CCT AAT GAA TCA AGG GAA TTT GGG 1452 Lys Asn Ser Phe Val Pro Thr He Pro Asn Glu Ser Arg Glu Phe Gly 450 455 460
AAT TTT GTG TCT TCA AAT TTT TCC ATG TAT GTC AAT ACG GAT TTG GCT 1500 Asn Phe Val Ser Ser Asn Phe Ser Met Tyr Val Asn Thr Asp Leu Ala 465 470 475
AGA GAA TAC AAC AAG CTT TTC CAC ACG ATC CAA CTA GAA GCG ATT TTC 1548
Arg Glu Tyr Asn Lys Leu Phe His Thr He Gin Leu Glu Ala He Phe 480 485 490
AAC ATC CCT TAT TAC ACC TTT AAA AAC GGC TTA TTT TCT CAA AAC ATG 1596 Asn He Pro Tyr Tyr Thr Phe Lys Asn Gly Leu Phe Ser Gin Asn Met 495 500 505 510
TAT GCT TTA AGC GCG CAA GCC TTA AAC AGC TAC ACT TCG CCT TTA TTG 1644 Tyr Ala Leu Ser Ala Gin Ala Leu Asn Ser Tyr Thr Ser Pro Leu Leu 515 520 525
AGA GAT TAT GAT TAT CAA GGG CGT TTG TAT GAC TCG GTG TGG AAT CCT 1692 Arg Asp Tyr Asp Tyr Gin Gly Arg Leu Tyr Asp Ser Val Trp Asn Pro 530 535 540
AGC AGT ATT TTA CCT AGC AAT GCG AGC AAC AAG ACG GTG GAT TTA ACC 1740 Ser Ser He Leu Pro Ser Asn Ala Ser Asn Lys Thr Val Asp Leu Thr 545 550 555
CTA ACG CAA TAC CTT TAT GGC TTA GGG GGG CAA GAG TTA TTG TAT TTT 1788 Leu Thr Gln Tyr Leu Tyr Gly Leu Gly Gly Gln Glu Leu Leu Tyr Phe 560 565 570
AAA ATA TCG CAA CTC ATC AAT CTT GAC GAT AAA GTT TCG CCC TTT AGA 1836 Lys He Ser Gin Leu He Asn Leu Asp Asp Lys Val Ser Pro Phe Arg 575 580 585 590
ATG CCA CTA GAG AGC AAG ATC GGG TTT TCG CCC TTA ACG GGA TTG AAC 1884 Met Pro Leu Glu Ser Lys He Gly Phe Ser Pro Leu Thr Gly Leu Asn 595 600 605
ATC TTT GGG AAT GTC TTT TAT TCG TTT TAT CAA AAC CGC TTA GAA GAA 1932 He Phe Gly Asn Val Phe Tyr Ser Phe Tyr Gin Asn Arg Leu Glu Glu 610 615 620
ATC TCT GTG AAC GCC AAT TAC CAA CGC AAG TTT TTA AGC TTT AAC CTC 1980 He Ser Val Asn Ala Asn Tyr Gin Arg Lys Phe Leu Ser Phe Asn Leu 625 630 635
TCT TAT TTT TTA AAA AAC AAT TTT AGC AGT GGG ATT AAT AGC ATT GTA 2028 Ser Tyr Phe Leu Lys Asn Asn Phe Ser Ser Gly He Asn Ser He Val 640 645 650
GAA AAT CTG CGG ATT ATT TAAAGGCGGG TTTTAGCAAC GACTTTGGCT ATTTTTCC 2084 Glu Asn Leu Arg He He 655 660
ATGAGCGCGG ATG 2097
(2) INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 660 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
Met He Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu Leu Ser Ala Leu
1 5 10 15
Asp Ala Lys Glu He Ala Met Gin Arg Phe Asp Lys Gin Asn His Lys
20 25 30
He Phe Glu He Leu Ala Asp Lys Val Ser Ala Lys Asp Asn Val He
35 40 45
Thr Ala Ser Gly Asn Ala He Leu Leu Asn Tyr Asp Val Tyr He Leu
50 55 60
Ala Asp Lys Val Arg Tyr Asp Thr Lys Thr Lys Glu Ala Leu Leu Glu 65 70 75 80
Gly Asn He Lys Val Tyr Arg Gly Glu Gly Leu Leu Val Lys Thr Asp
85 90 95
Tyr Val Lys Leu Ser Leu Asn Glu Lys Tyr Glu He He Phe Pro Phe
100 105 110
Tyr Val Gin Asp Ser Val Ser Gly He Trp Val Ser Ala Asp He Ala
115 120 125
Ser Gly Lys Asp Gin Lys Tyr Lys Val Lys Asn Met Ser Thr Ser Gly
130 135 140
Cys Ser He Asp Asn Pro He Trp His Val Asn Ala Thr Ser Gly Ser 145 150 155 160
Phe Asn Met Gin Lys Ser His Leu Ser Met Trp Asn Pro Lys He Tyr
165 170 175
Val Gly Asp He Pro Val Leu Tyr Leu Pro Tyr He Phe Met Ser Thr
180 185 190
Ser Asn Lys Arg Thr Thr Gly Phe Leu Tyr Pro Glu Phe Gly Thr Ser
195 200 205
Asn Leu Asp Gly Phe He Tyr Leu Gin Pro Phe Tyr Leu Ala Pro Lys
210 215 220
Asn Ser Trp Asp Met Thr Phe Thr Pro Gin He Arg Tyr Lys Arg Gly 225 230 235 240
Phe Gly Leu Asn Phe Glu Ala Arg Tyr He Asn Ser Lys Asn Asp Arg
245 250 255
Phe Leu Phe Asn Ala Arg Tyr Phe Arg Asn Tyr Thr Gin Tyr Val Lys
260 265 270
Arg Tyr Asp Leu Arg Asn Gin Asn He Tyr Gly Phe Glu Phe Leu Ser
275 280 285
Ser Ser Arg Asp Thr Leu Gin Lys Tyr Phe His Leu Lys Ser Asn He
290 295 300
Asp Asn Gly His Tyr He Asp Phe Leu Tyr Met Asn Asp Leu Asp Tyr 305 310 315 320
Val Arg Phe Glu Lys Val Asn Lys Arg He Thr Asp Ala Thr His Met
325 330 335
Ser Arg Ala Asn Tyr Tyr Leu Gin Thr Glu Asn Asn Tyr Tyr Gly Leu
340 345 350
Asn He Lys Tyr Phe Leu Asn Leu Asn Lys He Asn Asn Asn Arg Thr
355 360 365
Phe Gin Ser Val Pro Asn Leu Gin Tyr His Lys Tyr Leu Asn Ser Leu
370 375 380
Tyr Phe Arg Asn Leu Leu Tyr Ser Val Asp Tyr Gin Phe Arg Asn Thr 385 390 395 400
Ala Arg Glu He Gly Tyr Gly Tyr Val Gin Asn Ala Leu Asn Val Pro
405 410 415
Val Gly Leu Gin Phe Ser Leu Phe Lys Lys Tyr Leu Ser Leu Gly Leu
420 425 430
Trp Asn Asp Leu Gin Leu Ser Asn Val Ala Leu Met Gin Ser Lys Asn
435 440 445
Ser Phe Val Pro Thr He Pro Asn Glu Ser Arg Glu Phe Gly Asn Phe
450 455 460
Val Ser Ser Asn Phe Ser Met Tyr Val Asn Thr Asp Leu Ala Arg Glu 465 470 475 480
Tyr Asn Lys Leu Phe His Thr He Gin Leu Glu Ala He Phe Asn He
485 490 495
Pro Tyr Tyr Thr Phe Lys Asn Gly Leu Phe Ser Gin Asn Met Tyr Ala
500 505 510
Leu Ser Ala Gin Ala Leu Asn Ser Tyr Thr Ser Pro Leu Leu Arg Asp
515 520 525
Tyr Asp Tyr Gin Gly Arg Leu Tyr Asp Ser Val Trp Asn Pro Ser Ser
530 535 540
He Leu Pro Ser Asn Ala Ser Asn Lys Thr Val Asp Leu Thr Leu Thr 545 550 555 560
Gin Tyr Leu Tyr Gly Leu Gly Gly Gin Glu Leu Leu Tyr Phe Lys He
565 570 575
Ser Gin Leu He Asn Leu Asp Asp Lys Val Ser Pro Phe Arg Met Pro
580 585 590
Leu Glu Ser Lys He Gly Phe Ser Pro Leu Thr Gly Leu Asn He Phe
595 600 605
Gly Asn Val Phe Tyr Ser Phe Tyr Gin Asn Arg Leu Glu Glu He Ser
610 615 620
Val Asn Ala Asn Tyr Gin Arg Lys Phe Leu Ser Phe Asn Leu Ser Tyr 625 630 635 640
Phe Leu Lys Asn Asn Phe Ser Ser Gly He Asn Ser He Val Glu Asn
645 650 655
Leu Arg He He 660
(2) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 961 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 168...764
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
ATGCCGATTA AATGCATGCT GATTAAATGA ATGAAAAGAG TCCAAACCAC CGCCTTTAAC 60
GCACCACGCT TGAAATTAAA ACTAAATTTT AGTGTATTCT TAGCAAATTT TAGATAAGAT 120
CAAGCGTGAT TTTTTCTAAA TTTTAGGCAT TTAAGGAATC AGTGTTT ATG ACA AGC 176
Met Thr Ser 1
GCT CTG TTA GGC TTA CAA ATT GTT TTA GCG GTA TTG ATT GTG GTG GTG 224 Ala Leu Leu Gly Leu Gin He Val Leu Ala Val Leu He Val Val Val 5 10 15
GTT TTG TTG CAA AAA AGT TCT AGC ATC GGC TTA GGG GCT TAT AGC GGG 272 Val Leu Leu Gin Lys Ser Ser Ser He Gly Leu Gly Ala Tyr Ser Gly 20 25 30 35
AGT AAT GAG TCT TTA TTT GGC GCT AAA GGG CCT GCA AGC TTT ATG GCG 320 Ser Asn Glu Ser Leu Phe Gly Ala Lys Gly Pro Ala Ser Phe Met Ala 40 45 50
AAA TTA ACC ATG TTT TTA GGG CTG TTA TTT GTC ATC AAC ACC ATC GCT 368 Lys Leu Thr Met Phe Leu Gly Leu Leu Phe Val He Asn Thr He Ala 55 60 65
TTG GGC TAT TTT TAC AAC AAA GAA TAC GGC AAG AGC GTT TTA GAT GAG 416 Leu Gly Tyr Phe Tyr Asn Lys Glu Tyr Gly Lys Ser Val Leu Asp Glu 70 75 80
ACT AAA ACC AAC AAA GAA CTT TCG CCC CTA GTC CCT GCC ACC GGC ACG 464 Thr Lys Thr Asn Lys Glu Leu Ser Pro Leu Val Pro Ala Thr Gly Thr 85 90 95
CTT AAC CCT GCA CTT AAT CCC ACA TTA AAC CCA ACG CTC AAC CCT TTA 512 Leu Asn Pro Ala Leu Asn Pro Thr Leu Asn Pro Thr Leu Asn Pro Leu 100 105 110 115
GAG CAA GCC CCA ACT AAT CCT TTA ATG CCA CAA CAA ACG CCT AAC GAA 560 Glu Gin Ala Pro Thr Asn Pro Leu Met Pro Gin Gin Thr Pro Asn Glu 120 125 130
CTC CCT AAA GAG CCA GCC AAA ACG CCT TCT GTT GAA AGC CCC AAA CAG 608 Leu Pro Lys Glu Pro Ala Lys Thr Pro Ser Val Glu Ser Pro Lys Gin 135 140 145
AAT GAA AAG AAT GAA AAG AAT GAC GCC AAA GAG AAT GGT ATA AAG GGT 656 Asn Glu Lys Asn Glu Lys Asn Asp Ala Lys Glu Asn Gly He Lys Gly 150 155 160
GTT GAA AAA ACC AAA GAG AAC GCC AAA ACG CCC CCA ACC ACC CAC CAA 704 Val Glu Lys Thr Lys Glu Asn Ala Lys Thr Pro Pro Thr Thr His Gin 165 170 175
AAG CCT AAA ACG CAT GCA ACG CAA ACC AAC GCC CAT ACC AAC CAA AAA 752 Lys Pro Lys Thr His Ala Thr Gin Thr Asn Ala His Thr Asn Gin Lys 180 185 190 195
AAG GAT GAA AAA TAATGTTACA GGCCATTTAT AACGAAACCA AAGATCTGAT GCAAA 809 Lys Asp Glu Lys
AAAGCATTCA AGCTTTAAAC AGGGATTTTT CCACTCTAAG GAGCGCGAAA GTTTCAGTCA 869
ATATTTTAGA TCACATCAAA GTGGATTATT ACGGCACGCC CACGGCATTA AATCAAGTCG 929
GATCCGTGAT GAGCTTGGAT GCGACCACCC TT 961
(2) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 199 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
Met Thr Ser Ala Leu Leu Gly Leu Gln Ile Val Leu Ala Val Leu Ile
1 5 10 15
Val Val Val Val Leu Leu Gln Lys Ser Ser Ser Ile Gly Leu Gly Ala
20 25 30
Tyr Ser Gly Ser Asn Glu Ser Leu Phe Gly Ala Lys Gly Pro Ala Ser
35 40 45
Phe Met Ala Lys Leu Thr Met Phe Leu Gly Leu Leu Phe Val Ile Asn
50 55 60
Thr Ile Ala Leu Gly Tyr Phe Tyr Asn Lys Glu Tyr Gly Lys Ser Val
65 70 75 80
Leu Asp Glu Thr Lys Thr Asn Lys Glu Leu Ser Pro Leu Val Pro Ala
85 90 95
Thr Gly Thr Leu Asn Pro Ala Leu Asn Pro Thr Leu Asn Pro Thr Leu
100 105 110
Asn Pro Leu Glu Gln Ala Pro Thr Asn Pro Leu Met Pro Gln Gln Thr
115 120 125
Pro Asn Glu Leu Pro Lys Glu Pro Ala Lys Thr Pro Ser Val Glu Ser
130 135 140
Pro Lys Gln Asn Glu Lys Asn Glu Lys Asn Asp Ala Lys Glu Asn Gly
145 150 155 160
Ile Lys Gly Val Glu Lys Thr Lys Glu Asn Ala Lys Thr Pro Pro Thr
165 170 175
Thr His Gln Lys Pro Lys Thr His Ala Thr Gln Thr Asn Ala His Thr
180 185 190
Asn Gln Lys Lys Asp Glu Lys
195
(2) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1058 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 325...879
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
CCTAGTCCCT GCCACCGGCA CGCTTAACCC TGCACTTAAT CCCACATTAA ACCCAACGCT 60
CAACCCTTTA GAGCAAGCCC CAACTAATCC TTTAATGCCA CAACAAACGC CTAACGAACT 120
CCCTAAAGAG CCAGCCAAAA CGCCTTCTGT TGAAAGCCCC AAACAGAATG AAAAGAATGA 180
AAAGAATGAC GCCAAAGAGA ATGGTATAAA GGGTGTTGAA AAAACCAAAG AGAACGCCAA 240
AACGCCCCCA ACCACCCACC AAAAGCCTAA AACGCATGCA ACGCAAACCA ACGCCCATAC 300
CAACCAAAAA AAGGATGAAA AATA ATG TTA CAG GCC ATT TAT AAC GAA ACC 351
Met Leu Gin Ala He Tyr Asn Glu Thr 1 5
AAA GAT CTG ATG CAA AAA AGC ATT CAA GCT TTA AAC AGG GAT TTT TCC 399 Lys Asp Leu Met Gin Lys Ser He Gin Ala Leu Asn Arg Asp Phe Ser 10 15 20 25
ACT CTA AGG AGC GCG AAA GTT TCA GTC AAT ATT TTA GAT CAC ATC AAA 447 Thr Leu Arg Ser Ala Lys Val Ser Val Asn He Leu Asp His He Lys 30 35 40
GTG GAT TAT TAC GGC ACG CCC ACG GCA TTA AAT CAA GTC GGA TCC GTG 495 Val Asp Tyr Tyr Gly Thr Pro Thr Ala Leu Asn Gin Val Gly Ser Val 45 50 55
ATG AGC TTG GAT GCG ACC ACC CTT CAA ATC AGC CCA TGG GAA AAA AAC 543 Met Ser Leu Asp Ala Thr Thr Leu Gin He Ser Pro Trp Glu Lys Asn 60 65 70
CTG CTC AAA GAA ATT GAA AGA TCC ATT CAA GAA GCC AAT ATT GGT GTC 591 Leu Leu Lys Glu He Glu Arg Ser He Gin Glu Ala Asn He Gly Val 75 80 85
AAT CCT AAT AAC GAC GGC GAA ACG ATC AAG CTT TTT TTC CCG CCC ATG 639 Asn Pro Asn Asn Asp Gly Glu Thr He Lys Leu Phe Phe Pro Pro Met 90 95 100 105
ACA AGT GAG CAA AGA AAA CTC ATC GCA AAA GAC GCC AAA GCG ATG GGT 687 Thr Ser Glu Gin Arg Lys Leu He Ala Lys Asp Ala Lys Ala Met Gly 110 115 120
GAA AAG GCT AAA GTG GCT GTG AGG AAT ATC CGC CAA GAT GCT AAC AAC 735 Glu Lys Ala Lys Val Ala Val Arg Asn He Arg Gin Asp Ala Asn Asn 125 130 135
CAG GTG AAA AAA TTA GAA AAA GAC AAA GAA ATC AGC GAA GAT GAA AGC 783 Gin Val Lys Lys Leu Glu Lys Asp Lys Glu He Ser Glu Asp Glu Ser 140 145 150
AAA AAA GCC CAA GAG CAG ATC CAA AAA ATC ACC GAT GAA GCC ATT AAA 831 Lys Lys Ala Gin Glu Gin He Gin Lys He Thr Asp Glu Ala He Lys 155 160 165
AAA ATT GAT GAA AGC GTG AAA AAC AAA GAA GAC GCG ATC TTA AAG GTC T 880 Lys He Asp Glu Ser Val Lys Asn Lys Glu Asp Ala He Leu Lys Val 170 175 180 185
AAACCATGGA TATTAAGGCA TGTTATCAAA ACGCTAAAGC GTTATTAGAG GGGCATTTCT 940
TGCTCAGCAG TGGGTTTCAT TCCAATTATT ATTTGCAATC CGCTAAAGTT TTAGAAGATC 1000
CCAAACTAGC CGAACAATTA GCGCTAGAAT TAGCCAAACA AATCCAAGAA GCTCATTT 1058
(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
Met Leu Gln Ala Ile Tyr Asn Glu Thr Lys Asp Leu Met Gln Lys Ser
1 5 10 15
Ile Gln Ala Leu Asn Arg Asp Phe Ser Thr Leu Arg Ser Ala Lys Val
20 25 30
Ser Val Asn Ile Leu Asp His Ile Lys Val Asp Tyr Tyr Gly Thr Pro
35 40 45
Thr Ala Leu Asn Gln Val Gly Ser Val Met Ser Leu Asp Ala Thr Thr
50 55 60
Leu Gln Ile Ser Pro Trp Glu Lys Asn Leu Leu Lys Glu Ile Glu Arg
65 70 75 80
Ser Ile Gln Glu Ala Asn Ile Gly Val Asn Pro Asn Asn Asp Gly Glu
85 90 95
Thr Ile Lys Leu Phe Phe Pro Pro Met Thr Ser Glu Gln Arg Lys Leu
100 105 110
Ile Ala Lys Asp Ala Lys Ala Met Gly Glu Lys Ala Lys Val Ala Val
115 120 125
Arg Asn Ile Arg Gln Asp Ala Asn Asn Gln Val Lys Lys Leu Glu Lys
130 135 140
Asp Lys Glu Ile Ser Glu Asp Glu Ser Lys Lys Ala Gln Glu Gln Ile
145 150 155 160
Gln Lys Ile Thr Asp Glu Ala Ile Lys Lys Ile Asp Glu Ser Val Lys
165 170 175
Asn Lys Glu Asp Ala Ile Leu Lys Val
180 185
(2) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1669 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 163...1389
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
GAGTGGATGA AAAAGACACT TTCAATTTTG CAAAAATTGG CTATGAACAG GGCAAGGGCG 60
AAGAATTAAA AGAAGTAGAA GAAAAGCATG CGTTTAAGAA AATCCCTTTT GTCAAAGATT 120
TGCACAAAAT CGCCCCCACT ATCTTAAAAA AGAGGCTATA AA ATG GCT CAA AAT 174
Met Ala Gln Asn
1
TTC ACG AAA CTC AAC CCC CAG TTT GAA AAC ATC ATT TTT GAA CAT GAC 222 Phe Thr Lys Leu Asn Pro Gin Phe Glu Asn He He Phe Glu His Asp 5 10 15 20
GAC AAC CAA ATG ATT TTA AAC TTT GGC CCC CAA CAC CCC AGT AGT CAT 270 Asp Asn Gin Met He Leu Asn Phe Gly Pro Gin His Pro Ser Ser His 25 30 35
GGG CAA TTG CGC TTG ATT TTG GAA TTA GAG GGC GAA AAA ATC ATT AAG 318 Gly Gin Leu Arg Leu He Leu Glu Leu Glu Gly Glu Lys He He Lys 40 45 50
GCT ACC CCT GAA ATT GGC TAC TTG CAT AGA GGC TGT GAA AAG TTA GGC 366 Ala Thr Pro Glu He Gly Tyr Leu His Arg Gly Cys Glu Lys Leu Gly 55 60 65
GAA AAC ATG ACC TAT AAC GAA TAC ATG CCC ACT ACT GAT AGA TTG GAT 414 Glu Asn Met Thr Tyr Asn Glu Tyr Met Pro Thr Thr Asp Arg Leu Asp 70 75 80
TAC ACT TCT TCT ACC AGC AAT AAT TAC GCT TAC GCT TAT GCG GTA GAG 462 Tyr Thr Ser Ser Thr Ser Asn Asn Tyr Ala Tyr Ala Tyr Ala Val Glu 85 90 95 100
ACC TTA CTC AAT TTA GAA ATC CCA CGC CGA GCG CAG GTG ATC CGC ACG 510 Thr Leu Leu Asn Leu Glu He Pro Arg Arg Ala Gin Val He Arg Thr 105 110 115
ATT TTA CTA GAG CTT AAC CGC ATG ATC TCA CAC ATC TTT TTT ATC AGC 558 He Leu Leu Glu Leu Asn Arg Met He Ser His He Phe Phe He Ser 120 125 130
GTG CAT GCT TTA GAT GTG GGG GCG ATG AGC GTG TTT TTG TAT GCG TTT 606 Val His Ala Leu Asp Val Gly Ala Met Ser Val Phe Leu Tyr Ala Phe 135 140 145
AAA ACG AGG GAA TAC GGC TTG GAT TTG ATG GAG GAT TAT TGC GGG GCT 654 Lys Thr Arg Glu Tyr Gly Leu Asp Leu Met Glu Asp Tyr Cys Gly Ala 150 155 160
AGG CTC ACG CAT AAC GCT ATA AGG ATT GGG GGC GTG CCT TTA GAT TTA 702
Arg Leu Thr His Asn Ala He Arg He Gly Gly Val Pro Leu Asp Leu 165 170 175 180
CCC CCT AAT TGG TTA GAA GGC TTA AAA AAG TTT TTA GGC GAA ATG AGG 750 Pro Pro Asn Trp Leu Glu Gly Leu Lys Lys Phe Leu Gly Glu Met Arg 185 190 195
GAA TGC AAA AAA CTC ATT CAA GGC TTA TTG GAT AAG AAT CGC ATT TGG 798 Glu Cys Lys Lys Leu He Gin Gly Leu Leu Asp Lys Asn Arg He Trp 200 205 210
CGG ATG CGC TTG GAA AAT GTG GGC GTT GTA ACG CAA AAA ATG GCG CAA 846 Arg Met Arg Leu Glu Asn Val Gly Val Val Thr Gin Lys Met Ala Gin 215 220 225
AGC TGG GGC ATG AGC GGT ATC ATG TTA AGA GGG ACT GGG ATC GCT TAT 894 Ser Trp Gly Met Ser Gly He Met Leu Arg Gly Thr Gly He Ala Tyr 230 235 240
GAC ATC AGA AAA GAA GAG CCT TAT GAG CTT TAT AAA GAG CTT GAT TTT 942 Asp He Arg Lys Glu Glu Pro Tyr Glu Leu Tyr Lys Glu Leu Asp Phe 245 250 255 260
GAT GTG CCG GTG GGC AAT TAT GGC GAT AGT TAT GAT AGG TAT TGT TTG 990 Asp Val Pro Val Gly Asn Tyr Gly Asp Ser Tyr Asp Arg Tyr Cys Leu 265 270 275
TAT ATG TTA GAA ATT GAT GAA AGC GTT CGC ATC ATT GAA CAG CTC ATT 1038 Tyr Met Leu Glu He Asp Glu Ser Val Arg He He Glu Gin Leu He 280 285 290
CCT ATG TAT GCT AAA ACC GAT ACG CCT ATC ATG GCT CAA AAC CCG CAT 1086 Pro Met Tyr Ala Lys Thr Asp Thr Pro He Met Ala Gin Asn Pro His 295 300 305
TAT ATT TCC GCC CCT AAA GAA GAT ATA ATG ACG CAA AAC TAC GCC TTG 1134 Tyr He Ser Ala Pro Lys Glu Asp He Met Thr Gin Asn Tyr Ala Leu 310 315 320
ATG CAG CAT TTT GTT TTA GTG GCT CAG GGC ATG CGT CCG CCC GTT GGG 1182 Met Gin His Phe Val Leu Val Ala Gin Gly Met Arg Pro Pro Val Gly 325 330 335 340
GAA GTG TAT GCC CCC ACA GAA AGC CCT AAA GGG GAA TTA GGG TTT TTT 1230 Glu Val Tyr Ala Pro Thr Glu Ser Pro Lys Gly Glu Leu Gly Phe Phe 345 350 355
ATC CAT TCA GAG GGC GAG CCT TAC CCT CAC AGG CTA AAA ATC AGA GCC 1278 He His Ser Glu Gly Glu Pro Tyr Pro His Arg Leu Lys He Arg Ala 360 365 370
CCT AGC TTT TAT CAC ATT GGG GCT TTG AGC GAC ATT TTA GTG GGG CAA 1326 Pro Ser Phe Tyr His He Gly Ala Leu Ser Asp He Leu Val Gly Gin 375 380 385
TAT TTA GCG GAT GCA GTA ACC GTG ATT GGC TCA ACC AAT GCG GTG TTT 1374 Tyr Leu Ala Asp Ala Val Thr Val He Gly Ser Thr Asn Ala Val Phe 390 395 400
GGC GAG GTG GAT AGA TGAAACGCTT TGATTTACGC CCCTTAAAAG CGGGTATTTT T 1430
Gly Glu Val Asp Arg
405
GAACGCTTAG AAGAATTGAT TGAAAAAGAA ATGCAACCTA ATGAAGTCGC TATTTTCATG 1490
TTTGAAGTGG GGGATTTTTC TAATATCCCT AAGAGCGCTG AATTTATCCA ATCTAAAGGG 1550
CATGAGCTCC TCAATTCTTT GCGTTTCAAT CAAGCGGATT GGACGATTGT CGTGAGAAAA 1610
AAGGCTTGAT TTTGAGCGGC TTTAACCCCT TAAATTCTCC CTTAGTCGCA AGCTCTTCT 1669
(2) INFORMATION FOR SEQ ID NO: 44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 409 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
Met Ala Gln Asn Phe Thr Lys Leu Asn Pro Gln Phe Glu Asn Ile Ile
1 5 10 15
Phe Glu His Asp Asp Asn Gln Met Ile Leu Asn Phe Gly Pro Gln His
20 25 30
Pro Ser Ser His Gly Gln Leu Arg Leu Ile Leu Glu Leu Glu Gly Glu
35 40 45
Lys Ile Ile Lys Ala Thr Pro Glu Ile Gly Tyr Leu His Arg Gly Cys
50 55 60
Glu Lys Leu Gly Glu Asn Met Thr Tyr Asn Glu Tyr Met Pro Thr Thr
65 70 75 80
Asp Arg Leu Asp Tyr Thr Ser Ser Thr Ser Asn Asn Tyr Ala Tyr Ala
85 90 95
Tyr Ala Val Glu Thr Leu Leu Asn Leu Glu Ile Pro Arg Arg Ala Gln
100 105 110
Val Ile Arg Thr Ile Leu Leu Glu Leu Asn Arg Met Ile Ser His Ile
115 120 125
Phe Phe Ile Ser Val His Ala Leu Asp Val Gly Ala Met Ser Val Phe
130 135 140
Leu Tyr Ala Phe Lys Thr Arg Glu Tyr Gly Leu Asp Leu Met Glu Asp
145 150 155 160
Tyr Cys Gly Ala Arg Leu Thr His Asn Ala Ile Arg Ile Gly Gly Val
165 170 175
Pro Leu Asp Leu Pro Pro Asn Trp Leu Glu Gly Leu Lys Lys Phe Leu
180 185 190
Gly Glu Met Arg Glu Cys Lys Lys Leu Ile Gln Gly Leu Leu Asp Lys
195 200 205
Asn Arg Ile Trp Arg Met Arg Leu Glu Asn Val Gly Val Val Thr Gln
210 215 220
Lys Met Ala Gln Ser Trp Gly Met Ser Gly Ile Met Leu Arg Gly Thr
225 230 235 240
Gly Ile Ala Tyr Asp Ile Arg Lys Glu Glu Pro Tyr Glu Leu Tyr Lys
245 250 255
Glu Leu Asp Phe Asp Val Pro Val Gly Asn Tyr Gly Asp Ser Tyr Asp
260 265 270
Arg Tyr Cys Leu Tyr Met Leu Glu Ile Asp Glu Ser Val Arg Ile Ile
275 280 285
Glu Gln Leu Ile Pro Met Tyr Ala Lys Thr Asp Thr Pro Ile Met Ala
290 295 300
Gln Asn Pro His Tyr Ile Ser Ala Pro Lys Glu Asp Ile Met Thr Gln
305 310 315 320
Asn Tyr Ala Leu Met Gln His Phe Val Leu Val Ala Gln Gly Met Arg
325 330 335
Pro Pro Val Gly Glu Val Tyr Ala Pro Thr Glu Ser Pro Lys Gly Glu
340 345 350
Leu Gly Phe Phe Ile His Ser Glu Gly Glu Pro Tyr Pro His Arg Leu
355 360 365
Lys Ile Arg Ala Pro Ser Phe Tyr His Ile Gly Ala Leu Ser Asp Ile
370 375 380
Leu Val Gly Gln Tyr Leu Ala Asp Ala Val Thr Val Ile Gly Ser Thr
385 390 395 400
Asn Ala Val Phe Gly Glu Val Asp Arg
405
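As a consistency check on the two entries above, the coding range given for SEQ ID NO: 43 (163...1389) should span exactly as many codons as SEQ ID NO: 44 has residues (409). The following is a minimal Python sketch, assuming the LOCATION field is 1-based, inclusive, and excludes the stop codon; the function name is illustrative and not part of the listing.

```python
def codons_in_range(start: int, end: int) -> int:
    """Return the number of whole codons in a 1-based, inclusive range."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    return length // 3


# SEQ ID NO: 43 gives LOCATION 163...1389; SEQ ID NO: 44 gives LENGTH 409.
assert codons_in_range(163, 1389) == 409
```

A mismatch between the two numbers would point to a transcription error in either the LOCATION field or the stated protein length.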
(2) INFORMATION FOR SEQ ID NO: 45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 869 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 358...732 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
TAACTTGTGG TTAACTACCG CCAGACTCCT TTTGAGTTTG GCAAACGCGC CAATGAGTTC 60
TTTAGGCATT TTTTCAGTGC CGATCTTAAA GTTTTCAAGA CTGCGTTGCG TTTGAGCCCC 120
CCAATATTGG CTATCATTTA CTTTGATTTC GCCCATCGTG TCATGTTCAA TTCTAAATTG 180
CATGCTAATC CTTTGAAATT TGATTTTAAA ACCTTAAAAA AATAGCATAA ACTCTTATAC 240
CTTCTACTTA AAAACCCTAA TTTTTTAAAC ACCATTTCCA CAATTTTTAC ACAAAAGAGG 300
GTTATTATCC GTTCGCAACA AGAATTTTCT TGTTATCTTA ATGTAAAGGT CAAAACG ATG 360
Met 1
AAA AAG TTA GCC GCT TTA TTT TTA GTA AGC GTG TTG GGG GTT ATG GGT 408 Lys Lys Leu Ala Ala Leu Phe Leu Val Ser Val Leu Gly Val Met Gly 5 10 15
TTA AAC GCA TGG GAG CAA ACC CTA AAA GCT AAT GAC TTG GAA GTG AAA 456 Leu Asn Ala Trp Glu Gln Thr Leu Lys Ala Asn Asp Leu Glu Val Lys 20 25 30
ATC AAA TCC GTG GGT AAC CCC ATT AAA GGC GAT AAC ACT TTC ATT CTC 504 Ile Lys Ser Val Gly Asn Pro Ile Lys Gly Asp Asn Thr Phe Ile Leu 35 40 45
AGC CCC ACT TTA AAA GGT AAG GCT TTA GAA AAA GCT ATC GTT AGG GTG 552 Ser Pro Thr Leu Lys Gly Lys Ala Leu Glu Lys Ala Ile Val Arg Val 50 55 60 65
CAG TTT ATG ATG CCT GAA ATG CCC GGC ATG CCA GCG ATG AAA GAA ATG 600 Gln Phe Met Met Pro Glu Met Pro Gly Met Pro Ala Met Lys Glu Met 70 75 80
GCG CAA GTG AGT GAA AAA AAC GGC CTT TAT GAA GCT AAA ACC AAT CTT 648 Ala Gln Val Ser Glu Lys Asn Gly Leu Tyr Glu Ala Lys Thr Asn Leu 85 90 95
TCT ATG AAC GGG ACA TGG CAG GTT AGG GTG GAT ATT AAA TCT AAA GAG 696 Ser Met Asn Gly Thr Trp Gln Val Arg Val Asp Ile Lys Ser Lys Glu 100 105 110
GGT CAG GTT TAT CGC GCT AAA ACA AGC CTG GAT TTA TAAGAGCATG CTATCT 748 Gly Gln Val Tyr Arg Ala Lys Thr Ser Leu Asp Leu 115 120 125
TTTATAAGCG CGTTTGATAA AAGGGGCGTT TCAATACGCC TTTTAACAGC CTTGTTACTG 808
CTTTTTAGTT TGGGTTTGGC TAAAGATTTA GAGATCCAAT CTTTTGTGGC TAAATACCTT 868
T 869
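The residue line printed under each row of codons can be reproduced by translating the coding sequence with the standard genetic code. The following is a minimal Python sketch, not part of the original listing; the names `CODON_TABLE` and `translate` are illustrative, and the input is the first eight codons of the SEQ ID NO: 45 coding sequence (positions 358 to 381).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop spaces between codons
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATG AAA AAG TTA GCC GCT TTA TTT"))  # MKKLAALF
```

The one-letter result MKKLAALF corresponds to Met Lys Lys Leu Ala Ala Leu Phe, the first eight residues shown for this coding sequence.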
(2) INFORMATION FOR SEQ ID NO: 46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 125 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
Met Lys Lys Leu Ala Ala Leu Phe Leu Val Ser Val Leu Gly Val Met
1 5 10 15
Gly Leu Asn Ala Trp Glu Gln Thr Leu Lys Ala Asn Asp Leu Glu Val
20 25 30
Lys Ile Lys Ser Val Gly Asn Pro Ile Lys Gly Asp Asn Thr Phe Ile
35 40 45
Leu Ser Pro Thr Leu Lys Gly Lys Ala Leu Glu Lys Ala Ile Val Arg
50 55 60
Val Gln Phe Met Met Pro Glu Met Pro Gly Met Pro Ala Met Lys Glu
65 70 75 80
Met Ala Gln Val Ser Glu Lys Asn Gly Leu Tyr Glu Ala Lys Thr Asn
85 90 95
Leu Ser Met Asn Gly Thr Trp Gln Val Arg Val Asp Ile Lys Ser Lys
100 105 110
Glu Gly Gln Val Tyr Arg Ala Lys Thr Ser Leu Asp Leu
115 120 125
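Sequence search and alignment tools generally expect one-letter residue codes rather than the three-letter codes used in this listing. The following is a minimal Python sketch of the conversion, not part of the original listing; `THREE_TO_ONE` and `to_one_letter` are illustrative names, and the input is the first row of SEQ ID NO: 46 together with its position markers.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter codes to one-letter codes,
    skipping the numeric position markers printed under each row."""
    return "".join(
        THREE_TO_ONE[token]
        for token in residues.split()
        if not token.isdigit()
    )


row = "Met Lys Lys Leu Ala Ala Leu Phe Leu Val Ser Val Leu Gly Val Met 1 5 10 15"
print(to_one_letter(row))  # MKKLAALFLVSVLGVM
```

An unrecognized token raises `KeyError`, which makes transcription errors in the residue codes easy to spot.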
(2) INFORMATION FOR SEQ ID NO: 47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1217 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 73...1152 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
TCCATGCGTT TTGATGCGAT TTTAAAAAAT CTTTGGGTAT TTTAGCATGC CAATGGTTAA 60
AAAAAGGTGG TT ATG AAT GGT TTT TGC GCT AGA CTA CGA GCC ATA ACT CAT 111 Met Asn Gly Phe Cys Ala Arg Leu Arg Ala Ile Thr His 1 5 10
AAT GAA AGA TTA AAA ATG AAA ATA GCG GTA TTA CTC AGT GGG GGG GTG 159 Asn Glu Arg Leu Lys Met Lys He Ala Val Leu Leu Ser Gly Gly Val 15 20 25
GAT AGC TCT TAT AGC GCT TAT AGC TTA AAA GAG CAA GGG CAT GAA TTA 207 Asp Ser Ser Tyr Ser Ala Tyr Ser Leu Lys Glu Gin Gly His Glu Leu 30 35 40 45
GTG GGG ATT TAT TTA AAA CTC CAT GCG AGT GAA AAA AAG CAT GAT TTA 255 Val Gly He Tyr Leu Lys Leu His Ala Ser Glu Lys Lys His Asp Leu 50 55 60
TAC ATC AAA AAC GCT CAA AAA GCA TGC GAG TTT TTA GGC ATT CCT TTA 303 Tyr He Lys Asn Ala Gin Lys Ala Cys Glu Phe Leu Gly He Pro Leu 65 70 75
GAG GTG TTG GAT TTT CAA AAG GAT TTT AAA AGC GCG GTT TAT GAT GAA 351 Glu Val Leu Asp Phe Gin Lys Asp Phe Lys Ser Ala Val Tyr Asp Glu 80 85 90
TTT ATC AAC GCC TAT GAA GAA GGG CAA ACC CCA AAC CCT TGT GCG TTG 399 Phe He Asn Ala Tyr Glu Glu Gly Gin Thr Pro Asn Pro Cys Ala Leu 95 100 105
TGC AAC CCT TTA ATG AAG TTT GGG CTA GCT TTG GAT CAC GCT TTA AAA 447
Cys Asn Pro Leu Met Lys Phe Gly Leu Ala Leu Asp His Ala Leu Lys 110 115 120 125
TTA GGG TGT GAA AAG ATC GCT ACC GGG CAT TAT GCG AGA GTC AAA GAA 495 Leu Gly Cys Glu Lys He Ala Thr Gly His Tyr Ala Arg Val Lys Glu 130 135 140
ATT GAC AAA ATA AGT TAT ATT CAA GAG GCT TTG GAT AAA ACT AAA GAT 543
He Asp Lys He Ser Tyr He Gin Glu Ala Leu Asp Lys Thr Lys Asp
145 150 155
CAG AGC TAT TTT TTA TAC GCT TTA GAG CAT GAA GTG ATC GCT AAA TTG 591
Gin Ser Tyr Phe Leu Tyr Ala Leu Glu His Glu Val He Ala Lys Leu 160 165 170
GTG TTC CCT TTA GGG GAT TTG CTA AAA AAG GAT ATT AAG CCT TTA GCC 639
Val Phe Pro Leu Gly Asp Leu Leu Lys Lys Asp He Lys Pro Leu Ala 175 180 185
TTG AAT GCG ATG CCT TTT TTA GGC ACT TTA GAG ACT TAT AAG GAA TCT 687
Leu Asn Ala Met Pro Phe Leu Gly Thr Leu Glu Thr Tyr Lys Glu Ser 190 195 200 205
CAA GAA ATC TGC TTT GTG GAA AAA AGC TAC ATT GAC ACT TTA AAA AAG 735
Gin Glu He Cys Phe Val Glu Lys Ser Tyr He Asp Thr Leu Lys Lys 210 215 220
CAT GTT GAA GTG GAA AAA GAG GGC GTG GTG AAA AAC CTA CAA GGC GAA 783
His Val Glu Val Glu Lys Glu Gly Val Val Lys Asn Leu Gin Gly Glu
225 230 235
GTC ATT GGC ACG CAT AAA GGC TAT ATG CAA TAC ACG ATT GGC AAA CGC 831
Val He Gly Thr His Lys Gly Tyr Met Gin Tyr Thr He Gly Lys Arg 240 245 250
AAA GGC TTT AGT ATT AAA GGC GCG TTA GAG CCG CAT TTT GTG GTG GGG 879
Lys Gly Phe Ser Ile Lys Gly Ala Leu Glu Pro His Phe Val Val Gly 255 260 265
ATT GAC GCT AAA AAG AAC GAG CTA GTC GTG GGC AAA AAA GAA GAT CTC 927
He Asp Ala Lys Lys Asn Glu Leu Val Val Gly Lys Lys Glu Asp Leu 270 275 280 285
GCC ACG CAT TCG CTT AAG GCT AAA AAC AAA TCT TTA ATG AAA GAT TTT 975
Ala Thr His Ser Leu Lys Ala Lys Asn Lys Ser Leu Met Lys Asp Phe 290 295 300
AAA GAT GGC GAA TAT TTT ATC AAG GCT CGT TAC AGG AGC GTG CCT GCT 1023
Lys Asp Gly Glu Tyr Phe He Lys Ala Arg Tyr Arg Ser Val Pro Ala
305 310 315
AAA GCG CAT GTG AGT TTG AAA GAT GAG GTG ATT GAA GTG GGG TTT AAA 1071
Lys Ala His Val Ser Leu Lys Asp Glu Val He Glu Val Gly Phe Lys 320 325 330
GAG CCT TTT TAT GGC GTG GCT AAA GGG CAA GCT TTG GTC GTT TAT AAA 1119 Glu Pro Phe Tyr Gly Val Ala Lys Gly Gin Ala Leu Val Val Tyr Lys 335 340 345
GAT GAC ATC TTG CTT GGT GGG GGC GTG ATT GTT TAAAAACTAA AGAACTAAGA 1172 Asp Asp He Leu Leu Gly Gly Gly Val He Val 350 355 360
GATACGCCTT TTGGCAGTCT CTTAATGTTT TATTGAATAG GCGTT 1217
(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 360 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
Met Asn Gly Phe Cys Ala Arg Leu Arg Ala Ile Thr His Asn Glu Arg
1 5 10 15
Leu Lys Met Lys Ile Ala Val Leu Leu Ser Gly Gly Val Asp Ser Ser
20 25 30
Tyr Ser Ala Tyr Ser Leu Lys Glu Gln Gly His Glu Leu Val Gly Ile
35 40 45
Tyr Leu Lys Leu His Ala Ser Glu Lys Lys His Asp Leu Tyr Ile Lys
50 55 60
Asn Ala Gln Lys Ala Cys Glu Phe Leu Gly Ile Pro Leu Glu Val Leu
65 70 75 80
Asp Phe Gln Lys Asp Phe Lys Ser Ala Val Tyr Asp Glu Phe Ile Asn
85 90 95
Ala Tyr Glu Glu Gly Gln Thr Pro Asn Pro Cys Ala Leu Cys Asn Pro
100 105 110
Leu Met Lys Phe Gly Leu Ala Leu Asp His Ala Leu Lys Leu Gly Cys
115 120 125
Glu Lys Ile Ala Thr Gly His Tyr Ala Arg Val Lys Glu Ile Asp Lys
130 135 140
Ile Ser Tyr Ile Gln Glu Ala Leu Asp Lys Thr Lys Asp Gln Ser Tyr
145 150 155 160
Phe Leu Tyr Ala Leu Glu His Glu Val Ile Ala Lys Leu Val Phe Pro
165 170 175
Leu Gly Asp Leu Leu Lys Lys Asp Ile Lys Pro Leu Ala Leu Asn Ala
180 185 190
Met Pro Phe Leu Gly Thr Leu Glu Thr Tyr Lys Glu Ser Gln Glu Ile
195 200 205
Cys Phe Val Glu Lys Ser Tyr Ile Asp Thr Leu Lys Lys His Val Glu
210 215 220
Val Glu Lys Glu Gly Val Val Lys Asn Leu Gln Gly Glu Val Ile Gly
225 230 235 240
Thr His Lys Gly Tyr Met Gln Tyr Thr Ile Gly Lys Arg Lys Gly Phe
245 250 255
Ser Ile Lys Gly Ala Leu Glu Pro His Phe Val Val Gly Ile Asp Ala
260 265 270
Lys Lys Asn Glu Leu Val Val Gly Lys Lys Glu Asp Leu Ala Thr His
275 280 285
Ser Leu Lys Ala Lys Asn Lys Ser Leu Met Lys Asp Phe Lys Asp Gly
290 295 300
Glu Tyr Phe Ile Lys Ala Arg Tyr Arg Ser Val Pro Ala Lys Ala His
305 310 315 320
Val Ser Leu Lys Asp Glu Val Ile Glu Val Gly Phe Lys Glu Pro Phe
325 330 335
Tyr Gly Val Ala Lys Gly Gln Ala Leu Val Val Tyr Lys Asp Asp Ile
340 345 350
Leu Leu Gly Gly Gly Val Ile Val
355 360
(2) INFORMATION FOR SEQ ID NO: 49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 975 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 191...793 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
ACATTACACA TATCTGTCGC TAAAACGCGC CGCTTCACTA AACCCACTGA TTGTAAAAAT 60
TTGTCTATTC GCATGCGTTT ATTTTACCCT ATTCTTTAAG TTTTTATCCA TAACTTATAA 120
GGGTTTTAGT TTTAGCATGT TAGCATTCAG CCACCACTCT TTTTAAGGAA TTTGTTTGAA 180
GTTTCAAATT ATG AGT TTG TTA GCC ACT CTT TTA TTA GCC TCT TGC TTG 229 Met Ser Leu Leu Ala Thr Leu Leu Leu Ala Ser Cys Leu 1 5 10
CCC CCC AAA GGC CAT CAT TCT GGT TTG GTG AAT CTT TAT ATC GCT CAT 277 Pro Pro Lys Gly His His Ser Gly Leu Val Asn Leu Tyr Ile Ala His 15 20 25
CAA GGC CAA AGC GTG CGC ACT TAT TGG CGC AAA GTG GAT AGA GGA GTT 325 Gln Gly Gln Ser Val Arg Thr Tyr Trp Arg Lys Val Asp Arg Gly Val 30 35 40 45
ATC GCT AAA CAC AAT GAA GCG CTT AAA AAA GAT CCT AAA GCA AAG CTC 373 Ile Ala Lys His Asn Glu Ala Leu Lys Lys Asp Pro Lys Ala Lys Leu 50 55 60
AAA GAC CCC AGG GGG CCT TTA TTC ATG CTA GGG AGT GAG CGC TTC ATG 421 Lys Asp Pro Arg Gly Pro Leu Phe Met Leu Gly Ser Glu Arg Phe Met 65 70 75
CTT TTA TGG AAA AAC CGC TAC GCT TTA GCC AAG CCC CAA TCG TTC AGG 469 Leu Leu Trp Lys Asn Arg Tyr Ala Leu Ala Lys Pro Gln Ser Phe Arg 80 85 90
CTA GAG CCT GGT TTT TAT TAC TTG GAT TCT TTT AGC GTG GAA ACT CAA 517 Leu Glu Pro Gly Phe Tyr Tyr Leu Asp Ser Phe Ser Val Glu Thr Gln 95 100 105
AAA GGC GTC TTG CAG AGC GCT CCT GGC TAT TCA TAT ACT AAA AAT GGC 565 Lys Gly Val Leu Gln Ser Ala Pro Gly Tyr Ser Tyr Thr Lys Asn Gly 110 115 120 125
TAT GAT TTC AAA AAC AAC CGC CCC TTT TTC CTG GCC TTT GAA GTC AAA 613 Tyr Asp Phe Lys Asn Asn Arg Pro Phe Phe Leu Ala Phe Glu Val Lys 130 135 140
CCT GAT GGC AAA ACC ATT CTT CCT AGC GTG GAA TTA AGC CTG ATT AAA 661 Pro Asp Gly Lys Thr Ile Leu Pro Ser Val Glu Leu Ser Leu Ile Lys
145 150 155
ACC CCT AGA GGC TTT TTA GGG GTG TTC TTG TTT GAT AAT AAT GAA AAG 709 Thr Pro Arg Gly Phe Leu Gly Val Phe Leu Phe Asp Asn Asn Glu Lys 160 165 170
GGG ACT AAC GCC AAG TGG ATT GAG GGG AGT TTG AAT TTA AAG CTT AAA 757 Gly Thr Asn Ala Lys Trp Ile Glu Gly Ser Leu Asn Leu Lys Leu Lys 175 180 185
AAC GCT TCC TTT AAA GAT GCG TGG GGG TTG GAA CAA TAAAGCATGA AGTGAT 809 Asn Ala Ser Phe Lys Asp Ala Trp Gly Leu Glu Gln 190 195 200
CGCTTGCTTT TCGTAAGCTC TTTATGATTA GATTGTAAAA AAATGCCTTG AGTATTTTTT 869
AGATTTTATT ACCCCTATTC AATTGGAACA AAGCCATTAA ATTTTTAAAA ACTTTTAAAA 929
ACGATAAACA TAATCCGCGC TCCAAGTAAC ATAGCTTTCA AAAATG 975
(2) INFORMATION FOR SEQ ID NO: 50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 201 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
Met Ser Leu Leu Ala Thr Leu Leu Leu Ala Ser Cys Leu Pro Pro Lys
1 5 10 15
Gly His His Ser Gly Leu Val Asn Leu Tyr Ile Ala His Gln Gly Gln
20 25 30
Ser Val Arg Thr Tyr Trp Arg Lys Val Asp Arg Gly Val Ile Ala Lys
35 40 45
His Asn Glu Ala Leu Lys Lys Asp Pro Lys Ala Lys Leu Lys Asp Pro
50 55 60
Arg Gly Pro Leu Phe Met Leu Gly Ser Glu Arg Phe Met Leu Leu Trp
65 70 75 80
Lys Asn Arg Tyr Ala Leu Ala Lys Pro Gln Ser Phe Arg Leu Glu Pro
85 90 95
Gly Phe Tyr Tyr Leu Asp Ser Phe Ser Val Glu Thr Gln Lys Gly Val
100 105 110
Leu Gln Ser Ala Pro Gly Tyr Ser Tyr Thr Lys Asn Gly Tyr Asp Phe
115 120 125
Lys Asn Asn Arg Pro Phe Phe Leu Ala Phe Glu Val Lys Pro Asp Gly
130 135 140
Lys Thr Ile Leu Pro Ser Val Glu Leu Ser Leu Ile Lys Thr Pro Arg
145 150 155 160
Gly Phe Leu Gly Val Phe Leu Phe Asp Asn Asn Glu Lys Gly Thr Asn
165 170 175
Ala Lys Trp Ile Glu Gly Ser Leu Asn Leu Lys Leu Lys Asn Ala Ser
180 185 190
Phe Lys Asp Ala Trp Gly Leu Glu Gln
195 200
(2) INFORMATION FOR SEQ ID NO: 51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 90...1076 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
TATAAATACA TCGTTTCATT AGCGAATTTA ATGGCGTTAA GCGATCATAT TGATTTATTT 60
TATGAATTTG TTTATTAAGG GAAAAAATC ATG TCA AAT AGC ATG TTG GAT AAA 113
Met Ser Asn Ser Met Leu Asp Lys 1 5
AAT AAA GCG ATT CTT ACA GGG GGT GGG GCT TTA TTA TTA GGG CTA ATC 161
Asn Lys Ala He Leu Thr Gly Gly Gly Ala Leu Leu Leu Gly Leu He 10 15 20
GTG CTT TTT TAT TTA GCT TAT CGC CCT AAG GCT GAA GTG TTG CAA GGG 209
Val Leu Phe Tyr Leu Ala Tyr Arg Pro Lys Ala Glu Val Leu Gin Gly 25 30 35 40
TTT TTG GAA GCC AGA GAA TAC AGC GTG AGC TCC AAA GTC CCT GGC CGC 257
Phe Leu Glu Ala Arg Glu Tyr Ser Val Ser Ser Lys Val Pro Gly Arg 45 50 55
ATT GAA AAG GTG TTT GTT AAA AAA GGC GAT CAC ATT AAA AAG GGC GAT 305 He Glu Lys Val Phe Val Lys Lys Gly Asp His He Lys Lys Gly Asp 60 65 70
TTG GTT TTT AGC ATT TCT AGC CCT GAA TTA GAA GCC AAA CTC GCT CAA 353 Leu Val Phe Ser He Ser Ser Pro Glu Leu Glu Ala Lys Leu Ala Gin 75 80 85
GCT GAA GCC GGG CAT AAA GCC GCT AAA GCG CTT AGC GAT GAA GTC AAA 401 Ala Glu Ala Gly His Lys Ala Ala Lys Ala Leu Ser Asp Glu Val Lys 90 95 100
AGA GGC TCA AGA GAC GAA ACG ATT AAT TCT GCG AGA GAC GTT TGG CAA 449 Arg Gly Ser Arg Asp Glu Thr He Asn Ser Ala Arg Asp Val Trp Gin 105 110 115 120
GCA GCC AAA TCC CAA GCC ACT TTA GCC AAA GAG ACT TAT AAG CGC GTT 497 Ala Ala Lys Ser Gin Ala Thr Leu Ala Lys Glu Thr Tyr Lys Arg Val 125 130 135
CAA GAT TTG TAT GAT AAT GGC GTG GCG AGC TTG CAA AAG CGC GAT GAA 545 Gin Asp Leu Tyr Asp Asn Gly Val Ala Ser Leu Gin Lys Arg Asp Glu 140 145 150
GCC TAT GCG GCT TAT GAA AGC ACT AAA TAC AAC GAG AGC GCG GCT TAC 593 Ala Tyr Ala Ala Tyr Glu Ser Thr Lys Tyr Asn Glu Ser Ala Ala Tyr 155 160 165
CAA AAG TAT AAA ATG GCT TTA GGG GGG GCG AGC TCT GAA AGT AAG ATT 641 Gin Lys Tyr Lys Met Ala Leu Gly Gly Ala Ser Ser Glu Ser Lys He 170 175 180
GCC GCT AAG GCT AAA GAG AGC GCG GCT TTA GGG CAA GTG AAT GAA GTG 689 Ala Ala Lys Ala Lys Glu Ser Ala Ala Leu Gly Gin Val Asn Glu Val 185 190 195 200
GAG TCT TAT TTA AAA GAC GTC AAA GCG ACA GCC CCA ATT GAT GGG GAA 737 Glu Ser Tyr Leu Lys Asp Val Lys Ala Thr Ala Pro He Asp Gly Glu 205 210 215
GTG AGT AAC GTG CTT TTA AGC GGT GGC GAG CTT AGC CCT AAG GGT TTT 785 Val Ser Asn Val Leu Leu Ser Gly Gly Glu Leu Ser Pro Lys Gly Phe 220 225 230
CCT GTG GTT TTA ATG ATA GAT TTA AAG GAT AGT TGG TTA AAA ATC AGC 833 Pro Val Val Leu Met He Asp Leu Lys Asp Ser Trp Leu Lys He Ser 235 240 245
GTG CCT GAA AAG TAT TTG AAC GAG TTT AAA GTG GGT AAG GAA TTT GAA 881 Val Pro Glu Lys Tyr Leu Asn Glu Phe Lys Val Gly Lys Glu Phe Glu 250 255 260
GGC TAT ATC CCG GCG TTG AAA AAA AGC ACG AAA TTC AGG GTC AAA TAT 929 Gly Tyr He Pro Ala Leu Lys Lys Ser Thr Lys Phe Arg Val Lys Tyr 265 270 275 280
TTG AGC GTG ATG GGG GAT TTT GCG ACT TGG AAA GCG ACG AAT AAT TCC 977 Leu Ser Val Met Gly Asp Phe Ala Thr Trp Lys Ala Thr Asn Asn Ser 285 290 295
AAC ACT TAC GAC ATG AAA AGC TAT GAA GTG GAA GCC ATA CCC TTA GAA 1025 Asn Thr Tyr Asp Met Lys Ser Tyr Glu Val Glu Ala He Pro Leu Glu 300 305 310
GAG TTG GAA AAT TTT AGG GTA GGG ATG AGC GTG TTA GTT ACC ATT AAA 1073 Glu Leu Glu Asn Phe Arg Val Gly Met Ser Val Leu Val Thr He Lys 315 320 325
CCT TAAAAAGGAT TGTTTTGTTC AGATTGATAA GCGCATGGGT 1116
Pro
(2) INFORMATION FOR SEQ ID NO: 52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
Met Ser Asn Ser Met Leu Asp Lys Asn Lys Ala Ile Leu Thr Gly Gly
1 5 10 15
Gly Ala Leu Leu Leu Gly Leu Ile Val Leu Phe Tyr Leu Ala Tyr Arg
20 25 30
Pro Lys Ala Glu Val Leu Gln Gly Phe Leu Glu Ala Arg Glu Tyr Ser
35 40 45
Val Ser Ser Lys Val Pro Gly Arg Ile Glu Lys Val Phe Val Lys Lys
50 55 60
Gly Asp His Ile Lys Lys Gly Asp Leu Val Phe Ser Ile Ser Ser Pro
65 70 75 80
Glu Leu Glu Ala Lys Leu Ala Gln Ala Glu Ala Gly His Lys Ala Ala
85 90 95
Lys Ala Leu Ser Asp Glu Val Lys Arg Gly Ser Arg Asp Glu Thr Ile
100 105 110
Asn Ser Ala Arg Asp Val Trp Gln Ala Ala Lys Ser Gln Ala Thr Leu
115 120 125
Ala Lys Glu Thr Tyr Lys Arg Val Gln Asp Leu Tyr Asp Asn Gly Val
130 135 140
Ala Ser Leu Gln Lys Arg Asp Glu Ala Tyr Ala Ala Tyr Glu Ser Thr
145 150 155 160
Lys Tyr Asn Glu Ser Ala Ala Tyr Gln Lys Tyr Lys Met Ala Leu Gly
165 170 175
Gly Ala Ser Ser Glu Ser Lys Ile Ala Ala Lys Ala Lys Glu Ser Ala
180 185 190
Ala Leu Gly Gln Val Asn Glu Val Glu Ser Tyr Leu Lys Asp Val Lys
195 200 205
Ala Thr Ala Pro Ile Asp Gly Glu Val Ser Asn Val Leu Leu Ser Gly
210 215 220
Gly Glu Leu Ser Pro Lys Gly Phe Pro Val Val Leu Met Ile Asp Leu
225 230 235 240
Lys Asp Ser Trp Leu Lys Ile Ser Val Pro Glu Lys Tyr Leu Asn Glu
245 250 255
Phe Lys Val Gly Lys Glu Phe Glu Gly Tyr Ile Pro Ala Leu Lys Lys
260 265 270
Ser Thr Lys Phe Arg Val Lys Tyr Leu Ser Val Met Gly Asp Phe Ala
275 280 285
Thr Trp Lys Ala Thr Asn Asn Ser Asn Thr Tyr Asp Met Lys Ser Tyr
290 295 300
Glu Val Glu Ala Ile Pro Leu Glu Glu Leu Glu Asn Phe Arg Val Gly
305 310 315 320
Met Ser Val Leu Val Thr Ile Lys Pro
325
(2) INFORMATION FOR SEQ ID NO: 53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 94...1467 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
AAATAAAATA GCGATCATTA TAACATGTTG CTTTTTAAGT GAAAGCGTTA AGTTGTTAGG 60
GTATAGTGGC TTAAAAATTT TAGGATATTG AGA ATG CTT GAA ACT TCT AGC CAT 114
Met Leu Glu Thr Ser Ser His 1 5
TTT TTA AAA TCG TTT CGC TTG AAG CGT TAT ATA GGG TTT TTA TTG ATT 162 Phe Leu Lys Ser Phe Arg Leu Lys Arg Tyr He Gly Phe Leu Leu He 10 15 20
TCT TTA GCG TTA TTA ATC ACG CCC TTT GTT CGC ATT GAT GGG GCG CAT 210 Ser Leu Ala Leu Leu He Thr Pro Phe Val Arg He Asp Gly Ala His 25 30 35
TTG TTT TTG ATC TCT TTT GAG CAT AAG CAA CTG CAT TTT TTA GGC AAG 258 Leu Phe Leu He Ser Phe Glu His Lys Gin Leu His Phe Leu Gly Lys 40 45 50 55
ATC TTT AGC GCT GAA GAA TTG CAA GTC ATG CCT TTT ATG GTT ATT TTG 306 He Phe Ser Ala Glu Glu Leu Gin Val Met Pro Phe Met Val He Leu
60 65 70
CTT TTT ATA GGG ATT TTT TTC ATC ACC ACT AGC CTT GGG CGT GTG TGG 354 Leu Phe He Gly He Phe Phe He Thr Thr Ser Leu Gly Arg Val Trp 75 80 85
TGC GGT TGG GCT TGC CCG CAA ACC TTT TTA AGG GTG CTT TAT AGA GAT 402 Cys Gly Trp Ala Cys Pro Gin Thr Phe Leu Arg Val Leu Tyr Arg Asp 90 95 100
GTG ATT GAA ACC AAG ATT TTC AAA CTC CAT AAA AAG ATC AGC AAC AAG 450 Val He Glu Thr Lys He Phe Lys Leu His Lys Lys He Ser Asn Lys 105 110 115
CAA GAA AGC CCT AAA AAC ACC CCA AGC TAC AAG ATC CGT AAA GTA TTG 498 Gin Glu Ser Pro Lys Asn Thr Pro Ser Tyr Lys He Arg Lys Val Leu 120 125 130 135
AGC GTT TTA TTG TTC GCT CCT GTT GTG GCG GGG CTA ATG ATG TTG TTT 546 Ser Val Leu Leu Phe Ala Pro Val Val Ala Gly Leu Met Met Leu Phe 140 145 150
TTC TTT TAT TTC ATC GCC CCA GAA GAT TTT TTT ATG TAT CTT AAA AAC 594 Phe Phe Tyr Phe He Ala Pro Glu Asp Phe Phe Met Tyr Leu Lys Asn 155 160 165
CCT AGC GAT CAC CCT ATT GCT ATG GGT TTT TGG CTT TTT AGC ACG GCT 642 Pro Ser Asp His Pro He Ala Met Gly Phe Trp Leu Phe Ser Thr Ala 170 175 180
GTG GTG CTA TTT GAT ATA GTG GTG GTT GCG GAG CGT TTT TGC ATT TAT 690 Val Val Leu Phe Asp He Val Val Val Ala Glu Arg Phe Cys He Tyr 185 190 195
TTA TGC CCT TAC GCT AGG GTG CAA TCG GTG TTG TAT GAC AAT GAC ACC 738 Leu Cys Pro Tyr Ala Arg Val Gin Ser Val Leu Tyr Asp Asn Asp Thr 200 205 210 215
TTA AAC CCT ATT TAT GAT GAA AAG CGC GGC GGA GCG CTT TAT AAT AAT 786 Leu Asn Pro He Tyr Asp Glu Lys Arg Gly Gly Ala Leu Tyr Asn Asn 220 225 230
CAG GGC CAT CTC TTC CCC TTA CCT CCC AAA AAA CGC AGC CCA GAA AAC 834 Gin Gly His Leu Phe Pro Leu Pro Pro Lys Lys Arg Ser Pro Glu Asn 235 240 245
GAA TGC GTG AAT TGT TTG CAT TGC GTG CAG GTT TGC CCC ACG CAT ATT 882 Glu Cys Val Asn Cys Leu His Cys Val Gin Val Cys Pro Thr His He 250 255 260
GAC ATC AGG AAG GGC TTG CAA TTA GAA TGC ATC AAT TGT TTA GAA TGC 930 Asp He Arg Lys Gly Leu Gin Leu Glu Cys He Asn Cys Leu Glu Cys 265 270 275
GTG GAT GCA TGC ACG ATT ACC ATG GCT AAA TTT AAC CGC CCT TCA CTC 978
Val Asp Ala Cys Thr He Thr Met Ala Lys Phe Asn Arg Pro Ser Leu 280 285 290 295
ATC CAA TGG TCT TCA ACT AAC GCT ATT AAT ACG CGC CAA AAA GTG CAC 1026
He Gin Trp Ser Ser Thr Asn Ala He Asn Thr Arg Gin Lys Val His
300 305 310
CTG GTG CGT TTA AAA ACG ATC GCT TAC ATG GGG GTT ATC GCT ATT GTG 1074
Leu Val Arg Leu Lys Thr He Ala Tyr Met Gly Val He Ala He Val 315 320 325
ATC GCT CTT TTA GCC ATC ACT TCG TTT AAA AAA GAA CGC ATG CTC TTA 1122
He Ala Leu Leu Ala He Thr Ser Phe Lys Lys Glu Arg Met Leu Leu 330 335 340
GAC ATT AAC CGC AAC AGC GAT CTG TAT GAA TTG CGC TCT AGC GGG TAT 1170
Asp He Asn Arg Asn Ser Asp Leu Tyr Glu Leu Arg Ser Ser Gly Tyr 345 350 355
GTG GAT AAC GAT TAC GTG TTT TTA TTC CAC AAC ACG GAC AAT AAA GAC 1218
Val Asp Asn Asp Tyr Val Phe Leu Phe His Asn Thr Asp Asn Lys Asp 360 365 370 375
CAT GAG TTT TAT TTC AAA GTT TTA GGG CAA AAA GAC ATT CAG ATC AAA 1266
His Glu Phe Tyr Phe Lys Val Leu Gly Gin Lys Asp He Gin He Lys
380 385 390
AAG CCT TTA AAT CCT ATC GCC ATT AAA GCC GGG CAA AAG ATT AAA GCG 1314
Lys Pro Leu Asn Pro He Ala He Lys Ala Gly Gin Lys He Lys Ala 395 400 405
GTA GTG ATT TTA AGA AAA CCC CTA AAG AGT AAC GCC ACA GAA TAC AAG 1362
Val Val He Leu Arg Lys Pro Leu Lys Ser Asn Ala Thr Glu Tyr Lys 410 415 420
AAC GCT AAA GAC GCT CTA ATC CCC ATT ACC ATA CAA GCT TAT AGC GCG 1410
Asn Ala Lys Asp Ala Leu He Pro He Thr He Gin Ala Tyr Ser Ala 425 430 435
GAC GAT AAG AAT ATT ACG ATA GAA AGG GAA TCG GTG TTT ATT GCA CCA 1458
Asp Asp Lys Asn He Thr He Glu Arg Glu Ser Val Phe He Ala Pro 440 445 450 455
AGT GAG GAT TGAAGCCTAA AACTAGCGTT CAATCACTTC ATAAGGCAAG CCTTGTT 1514 Ser Glu Asp
(2) INFORMATION FOR SEQ ID NO: 54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 458 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
Met Leu Glu Thr Ser Ser His Phe Leu Lys Ser Phe Arg Leu Lys Arg
1 5 10 15
Tyr Ile Gly Phe Leu Leu Ile Ser Leu Ala Leu Leu Ile Thr Pro Phe
20 25 30
Val Arg Ile Asp Gly Ala His Leu Phe Leu Ile Ser Phe Glu His Lys
35 40 45
Gln Leu His Phe Leu Gly Lys Ile Phe Ser Ala Glu Glu Leu Gln Val
50 55 60
Met Pro Phe Met Val Ile Leu Leu Phe Ile Gly Ile Phe Phe Ile Thr
65 70 75 80
Thr Ser Leu Gly Arg Val Trp Cys Gly Trp Ala Cys Pro Gln Thr Phe
85 90 95
Leu Arg Val Leu Tyr Arg Asp Val Ile Glu Thr Lys Ile Phe Lys Leu
100 105 110
His Lys Lys Ile Ser Asn Lys Gln Glu Ser Pro Lys Asn Thr Pro Ser
115 120 125
Tyr Lys Ile Arg Lys Val Leu Ser Val Leu Leu Phe Ala Pro Val Val
130 135 140
Ala Gly Leu Met Met Leu Phe Phe Phe Tyr Phe Ile Ala Pro Glu Asp
145 150 155 160
Phe Phe Met Tyr Leu Lys Asn Pro Ser Asp His Pro Ile Ala Met Gly
165 170 175
Phe Trp Leu Phe Ser Thr Ala Val Val Leu Phe Asp Ile Val Val Val
180 185 190
Ala Glu Arg Phe Cys Ile Tyr Leu Cys Pro Tyr Ala Arg Val Gln Ser
195 200 205
Val Leu Tyr Asp Asn Asp Thr Leu Asn Pro Ile Tyr Asp Glu Lys Arg
210 215 220
Gly Gly Ala Leu Tyr Asn Asn Gln Gly His Leu Phe Pro Leu Pro Pro
225 230 235 240
Lys Lys Arg Ser Pro Glu Asn Glu Cys Val Asn Cys Leu His Cys Val
245 250 255
Gln Val Cys Pro Thr His Ile Asp Ile Arg Lys Gly Leu Gln Leu Glu
260 265 270
Cys Ile Asn Cys Leu Glu Cys Val Asp Ala Cys Thr Ile Thr Met Ala
275 280 285
Lys Phe Asn Arg Pro Ser Leu Ile Gln Trp Ser Ser Thr Asn Ala Ile
290 295 300
Asn Thr Arg Gln Lys Val His Leu Val Arg Leu Lys Thr Ile Ala Tyr
305 310 315 320
Met Gly Val Ile Ala Ile Val Ile Ala Leu Leu Ala Ile Thr Ser Phe
325 330 335
Lys Lys Glu Arg Met Leu Leu Asp Ile Asn Arg Asn Ser Asp Leu Tyr
340 345 350
Glu Leu Arg Ser Ser Gly Tyr Val Asp Asn Asp Tyr Val Phe Leu Phe
355 360 365
His Asn Thr Asp Asn Lys Asp His Glu Phe Tyr Phe Lys Val Leu Gly
370 375 380
Gln Lys Asp Ile Gln Ile Lys Lys Pro Leu Asn Pro Ile Ala Ile Lys
385 390 395 400
Ala Gly Gln Lys Ile Lys Ala Val Val Ile Leu Arg Lys Pro Leu Lys
405 410 415
Ser Asn Ala Thr Glu Tyr Lys Asn Ala Lys Asp Ala Leu Ile Pro Ile
420 425 430
Thr Ile Gln Ala Tyr Ser Ala Asp Asp Lys Asn Ile Thr Ile Glu Arg
435 440 445
Glu Ser Val Phe Ile Ala Pro Ser Glu Asp
450 455
(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 990 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 228...782 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
ACGATTTGAT CAATAACGAA AATAAAATTG ATGAAATCAA TAATGAAGAA AACGCTGATC 60
CTTCGCAAAA AAGAACGAAC AACGTTTTGC AACGAGCCAC TAACCACCAA GACAATCTCA 120
ATTCCCCACT CAACAGGAAG TATTAAAGTG TGAAACTTTT TTCAAAGGAT TTATTTAAAA 180
AAGTAACCCC TTTATTTTTA AGCGTTTATT TTTTAAACCC CACCATT ATG CAA GCC 236
Met Gln Ala 1
AAA AGC CGT TTT TAT GTG GCT TCT CAA TAC CAG GTG GGG AAA ATG ATC 284 Lys Ser Arg Phe Tyr Val Ala Ser Gln Tyr Gln Val Gly Lys Met Ile 5 10 15
ATG AAA AAA TAC AAC GAT CTC AAA CGC ACG ATT GAA GGG GCG AGC TTT 332 Met Lys Lys Tyr Asn Asp Leu Lys Arg Thr Ile Glu Gly Ala Ser Phe 20 25 30 35
TCT TTA GGC TGG GAG ATT AAC CCC ACT AAC TAC TGG TTT TAT TCG CGC 380 Ser Leu Gly Trp Glu Ile Asn Pro Thr Asn Tyr Trp Phe Tyr Ser Arg 40 45 50
TAT TAC TTT TTT ATG GAT TAC GGG AAT GTC ATT CTC AAT AAA AGA ACG 428 Tyr Tyr Phe Phe Met Asp Tyr Gly Asn Val Ile Leu Asn Lys Arg Thr 55 60 65
GGC GCT CAA GCG AAC ATG TTC ACT TAT GGC TTT GGG GGG GAT TTG ATT 476 Gly Ala Gln Ala Asn Met Phe Thr Tyr Gly Phe Gly Gly Asp Leu Ile 70 75 80
GTG GAA TAC AAT AAA AAC CCC TTG TAT GTA TTT TCT CTT TTT TAT GGC 524
Val Glu Tyr Asn Lys Asn Pro Leu Tyr Val Phe Ser Leu Phe Tyr Gly
85 90 95
ATG CAA GTT GCT GAA AAC ACA TGG ACG ATT TCC AAA CAC AGC GCG AAT 572
Met Gln Val Ala Glu Asn Thr Trp Thr Ile Ser Lys His Ser Ala Asn
100 105 110 115
TTC ATC ATT GAC GAT TGG CGC AGC ATT CAA GGG TTT TCG CTC AAA ACT 620 Phe Ile Ile Asp Asp Trp Arg Ser Ile Gln Gly Phe Ser Leu Lys Thr 120 125 130
TCC AAT TTT AGG ATG TTG GGT TTA GTG GGG TTT AAA TTC CAA ACC GTG 668 Ser Asn Phe Arg Met Leu Gly Leu Val Gly Phe Lys Phe Gln Thr Val 135 140 145
CTA TTC CAC CAT GAC GCA AGT ATT GAA GTG GGG ATC AAA TGG CCT TTT 716 Leu Phe His His Asp Ala Ser Ile Glu Val Gly Ile Lys Trp Pro Phe 150 155 160
GCT TTT GAA TAC GAC TCA GCC TTT GTA AGG CTT TTT TCT GTC TTT ATT 764 Ala Phe Glu Tyr Asp Ser Ala Phe Val Arg Leu Phe Ser Val Phe Ile 165 170 175
TCG CAC ACT TTC TAC CTT TAAACTAATT CCAACCCTAC CGGGCAATGA TCGCTCCC 820 Ser His Thr Phe Tyr Leu 180 185
TAAAATATCT TTATAGATTA AAGCGTCTTT TAAGCGCGTT TTTAAAGGGT TAGAGCATAA 880
AAAATAATCA ATGCGCCAAC CAATGTTTTT ATCCCTTGCT TGTTGCATGT AACTCCACCA 940
GGTGTAAGCC TTTTCTTTGT TAGGGTAAAA ATAACGGAAA GTGTCAATAA 990
(2) INFORMATION FOR SEQ ID NO: 56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
Met Gln Ala Lys Ser Arg Phe Tyr Val Ala Ser Gln Tyr Gln Val Gly
1 5 10 15
Lys Met Ile Met Lys Lys Tyr Asn Asp Leu Lys Arg Thr Ile Glu Gly
20 25 30
Ala Ser Phe Ser Leu Gly Trp Glu Ile Asn Pro Thr Asn Tyr Trp Phe
35 40 45
Tyr Ser Arg Tyr Tyr Phe Phe Met Asp Tyr Gly Asn Val Ile Leu Asn
50 55 60
Lys Arg Thr Gly Ala Gln Ala Asn Met Phe Thr Tyr Gly Phe Gly Gly
65 70 75 80
Asp Leu Ile Val Glu Tyr Asn Lys Asn Pro Leu Tyr Val Phe Ser Leu
85 90 95
Phe Tyr Gly Met Gln Val Ala Glu Asn Thr Trp Thr Ile Ser Lys His
100 105 110
Ser Ala Asn Phe Ile Ile Asp Asp Trp Arg Ser Ile Gln Gly Phe Ser
115 120 125
Leu Lys Thr Ser Asn Phe Arg Met Leu Gly Leu Val Gly Phe Lys Phe
130 135 140
Gln Thr Val Leu Phe His His Asp Ala Ser Ile Glu Val Gly Ile Lys
145 150 155 160
Trp Pro Phe Ala Phe Glu Tyr Asp Ser Ala Phe Val Arg Leu Phe Ser
165 170 175
Val Phe Ile Ser His Thr Phe Tyr Leu
180 185
(2) INFORMATION FOR SEQ ID NO: 57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1161 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 109...1113 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
ATCTTACCTT TATCTTTTAA GATTTTATGA AAAATAGTTT CATTTTTACT ATTGTTATTT 60 TCTTAGTAAT GTTATAATCG CTTTATAAAT CATACAAAAA GGATCGCT ATG TTA GTT 117
Met Leu Val 1
ACT CGC TTT AAA AAA GCT TTC ATT TCT TAT TCT TTA GGC GTG CTT GTC 165 Thr Arg Phe Lys Lys Ala Phe He Ser Tyr Ser Leu Gly Val Leu Val 5 10 15
GCT TCA TTA TGG TTG AAC GTG TGC AAC GCT TCA GCG CAA GAA GTC AAA 213 Ala Ser Leu Trp Leu Asn Val Cys Asn Ala Ser Ala Gin Glu Val Lys 20 25 30 35
GTC AAG GAT TAT TTC GGG GAG CAA ACC ATC AAG CTT CCT GTT TCT AAA 261 Val Lys Asp Tyr Phe Gly Glu Gin Thr He Lys Leu Pro Val Ser Lys 40 45 50
ATA GCC TAT ATA GGG AGC TAT GTA GAA GTG CCT GCC ATG CTT AAT GTT 309 He Ala Tyr He Gly Ser Tyr Val Glu Val Pro Ala Met Leu Asn Val 55 60 65
TGG AAT AGG GTT GTA GGC GTT TCG GAT TAC GCT TTT AAA GAC GAT ATT 357 Trp Asn Arg Val Val Gly Val Ser Asp Tyr Ala Phe Lys Asp Asp He 70 75 80
GTC AAA GCC ACT CTC AAA GGC GAA GAT CTT AAA CGC GTC AAA CAC ATG 405
Val Lys Ala Thr Leu Lys Gly Glu Asp Leu Lys Arg Val Lys His Met 85 90 95
AGC ACT GAT CAT ACA GCC GCG CTA AAT GTA GAG CTT TTA AAA AAG CTT 453
Ser Thr Asp His Thr Ala Ala Leu Asn Val Glu Leu Leu Lys Lys Leu 100 105 110 115
AGC CCT GAT CTT GTG GTA ACC TTT GTG GGC AAC CCT AAA GCG GTA GAG 501
Ser Pro Asp Leu Val Val Thr Phe Val Gly Asn Pro Lys Ala Val Glu 120 125 130
CAT GCG AAA AAA TTT GGT ATA TCA TTT CTT TCT TTT CAA GAG ACA ACG 549
His Ala Lys Lys Phe Gly He Ser Phe Leu Ser Phe Gin Glu Thr Thr
135 140 145
ATT GCA GAG GCC ATG CAG GCC ATG CAA GCT CAA GCC ACG GTT TTA GAG 597
He Ala Glu Ala Met Gin Ala Met Gin Ala Gin Ala Thr Val Leu Glu 150 155 160
ATT GAC GCT TCC AAA AAA TTC GCC AAA ATG CAA GAA ACT TTG GAT TTT 645
He Asp Ala Ser Lys Lys Phe Ala Lys Met Gin Glu Thr Leu Asp Phe 165 170 175
ATT GCT GAG CGT TTG AAA AAT GTC AAA AAG AAA AAG GGG GTG GAG CTT 693
He Ala Glu Arg Leu Lys Asn Val Lys Lys Lys Lys Gly Val Glu Leu 180 185 190 195
TTC CAT AAA GCC AAT AAA ATC AGC GGC CAT CAA GCC ATT AGC TCA GAC 741
Phe His Lys Ala Asn Lys He Ser Gly His Gin Ala He Ser Ser Asp 200 205 210
ATT TTA GAA AAA GGG GGC ATA GAC AAT TTT GGC TTG AAA TAT GTC AAA 789
He Leu Glu Lys Gly Gly He Asp Asn Phe Gly Leu Lys Tyr Val Lys
215 220 225
TTT GGG CGT GCT GAC ATT AGC GTG GAA AAA ATC GTT AAA GAA AAC CCT 837
Phe Gly Arg Ala Asp He Ser Val Glu Lys He Val Lys Glu Asn Pro 230 235 240
GAG ATT ATC TTT ATT TGG TGG ATA AGC CCA CTC ACG CCT GAA GAT GTG 885
Glu He He Phe He Trp Trp He Ser Pro Leu Thr Pro Glu Asp Val 245 250 255
TTA AAC AAC CCC AAA TTT GCT ACC ATC AAA GCC ATT AAA AAC AAG CAG 933
Leu Asn Asn Pro Lys Phe Ala Thr He Lys Ala He Lys Asn Lys Gin 260 265 270 275
GTT TAT AAA CTC CCC ACA ATG GAT ATT GGC GGG CCT AGA GCC CCA CTC 981
Val Tyr Lys Leu Pro Thr Met Asp He Gly Gly Pro Arg Ala Pro Leu
280 285 290
ATA AGT CTT TTT ATC GCT CTA AAA GCC CAC CCT GAA GCC TTT AAG GGC 1029
He Ser Leu Phe He Ala Leu Lys Ala His Pro Glu Ala Phe Lys Gly
295 300 305
GTG GAT ATT AAT GCG ATG GTT AAA GAC TAC TAT AAA GTG GTT TTT GAT 1077 Val Asp He Asn Ala Met Val Lys Asp Tyr Tyr Lys Val Val Phe Asp 310 315 320
TTG AAT GAT GCA GAG GTT GAG CCC TTT TTA TGG CAT TAATTTTTAA AAAGGG 1129 Leu Asn Asp Ala Glu Val Glu Pro Phe Leu Trp His 325 330 335
GTTGATGTTT TTAGCCTTTC GTGTATCGCG CT 1161
(2) INFORMATION FOR SEQ ID NO: 58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 335 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
Met Leu Val Thr Arg Phe Lys Lys Ala Phe Ile Ser Tyr Ser Leu Gly
1 5 10 15
Val Leu Val Ala Ser Leu Trp Leu Asn Val Cys Asn Ala Ser Ala Gln
20 25 30
Glu Val Lys Val Lys Asp Tyr Phe Gly Glu Gln Thr Ile Lys Leu Pro
35 40 45
Val Ser Lys Ile Ala Tyr Ile Gly Ser Tyr Val Glu Val Pro Ala Met
50 55 60
Leu Asn Val Trp Asn Arg Val Val Gly Val Ser Asp Tyr Ala Phe Lys
65 70 75 80
Asp Asp Ile Val Lys Ala Thr Leu Lys Gly Glu Asp Leu Lys Arg Val
85 90 95
Lys His Met Ser Thr Asp His Thr Ala Ala Leu Asn Val Glu Leu Leu
100 105 110
Lys Lys Leu Ser Pro Asp Leu Val Val Thr Phe Val Gly Asn Pro Lys
115 120 125
Ala Val Glu His Ala Lys Lys Phe Gly Ile Ser Phe Leu Ser Phe Gln
130 135 140
Glu Thr Thr Ile Ala Glu Ala Met Gln Ala Met Gln Ala Gln Ala Thr
145 150 155 160
Val Leu Glu Ile Asp Ala Ser Lys Lys Phe Ala Lys Met Gln Glu Thr
165 170 175
Leu Asp Phe Ile Ala Glu Arg Leu Lys Asn Val Lys Lys Lys Lys Gly
180 185 190
Val Glu Leu Phe His Lys Ala Asn Lys Ile Ser Gly His Gln Ala Ile
195 200 205
Ser Ser Asp Ile Leu Glu Lys Gly Gly Ile Asp Asn Phe Gly Leu Lys
210 215 220
Tyr Val Lys Phe Gly Arg Ala Asp Ile Ser Val Glu Lys Ile Val Lys
225 230 235 240
Glu Asn Pro Glu Ile Ile Phe Ile Trp Trp Ile Ser Pro Leu Thr Pro
245 250 255
Glu Asp Val Leu Asn Asn Pro Lys Phe Ala Thr Ile Lys Ala Ile Lys
260 265 270
Asn Lys Gln Val Tyr Lys Leu Pro Thr Met Asp Ile Gly Gly Pro Arg
275 280 285
Ala Pro Leu Ile Ser Leu Phe Ile Ala Leu Lys Ala His Pro Glu Ala
290 295 300
Phe Lys Gly Val Asp Ile Asn Ala Met Val Lys Asp Tyr Tyr Lys Val
305 310 315 320
Val Phe Asp Leu Asn Asp Ala Glu Val Glu Pro Phe Leu Trp His
325 330 335
(2) INFORMATION FOR SEQ ID NO: 59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 800 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 121...669 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
TTATTCGCAT GCATTAGCTA TTATTGAAGC TCAAAGCATT CAAGCGCATT TATTCTTAGA 60
TGAAATCAAA CAAAGCCAAA AAGAAAAGAA AAAATTCCCC ACTTTCAAAG GAGGTTTTTA 120
ATG CGT TGG TGG TGT TTT TTG GTG TGT TGT TTT GGT ATT TTA AGC GTG 168 Met Arg Trp Trp Cys Phe Leu Val Cys Cys Phe Gly He Leu Ser Val 1 5 10 15
ATG GAC GCT AAA AAA TTA GAG AAT AAG AAT TTG AAA AAA GAA AGA GAG 216 Met Asp Ala Lys Lys Leu Glu Asn Lys Asn Leu Lys Lys Glu Arg Glu 20 25 30
CTT TTA GAG ATT ACT GGC AAC CAA TTT GTA GCG AAC GAC AAA ACC AAA 264 Leu Leu Glu He Thr Gly Asn Gin Phe Val Ala Asn Asp Lys Thr Lys 35 40 45
ACC GCT GTT ATT CAA GGC AAT GTG CAG ATC AAA AAG GGT AAA GAC CGG 312 Thr Ala Val He Gin Gly Asn Val Gin He Lys Lys Gly Lys Asp Arg 50 55 60
TTG TTT GCG GAC AAG GTG AGC GTG TTT TTA AAC GAT AAA CGA AAG CCA 360 Leu Phe Ala Asp Lys Val Ser Val Phe Leu Asn Asp Lys Arg Lys Pro 65 70 75 80
GAG CGC TAT GAA GCC ACA GGG AAC ACG CAT TTT AAC ATC TTT ACA GAG 408 Glu Arg Tyr Glu Ala Thr Gly Asn Thr His Phe Asn He Phe Thr Glu 85 90 95
GAC AAT CGT GAA ATC AGC GGG AGT GCT GAC AAG CTC ATT TAT AAC GCG 456 Asp Asn Arg Glu He Ser Gly Ser Ala Asp Lys Leu He Tyr Asn Ala 100 105 110
CTG AAT GGG GAA TAC AAA TTA TTG CAA AAT GCG GTG GTT AGA GAA GTG 504 Leu Asn Gly Glu Tyr Lys Leu Leu Gin Asn Ala Val Val Arg Glu Val 115 120 125
GGG AAA TCC AAT GTC ATC ACC GGC GAT GAA ATC ATT TTA AAC AAA ACT 552 Gly Lys Ser Asn Val He Thr Gly Asp Glu He He Leu Asn Lys Thr 130 135 140
AAG GGT TAT GCT GAT GTG TTG GGG AGC GCG AAA CGG CCC GCT AAA TTC 600 Lys Gly Tyr Ala Asp Val Leu Gly Ser Ala Lys Arg Pro Ala Lys Phe 145 150 155 160
GTG TTT GAT ATG GAA GAT ATT AAT GAA GAA AAT CGT AAG GCT AAA TTG 648 Val Phe Asp Met Glu Asp He Asn Glu Glu Asn Arg Lys Ala Lys Leu 165 170 175
AAG AAG AAA GGC GAA AAA CCA TGATTGTCAT TAAAGACGCT CATTTTCTCA CTTC 703 Lys Lys Lys Gly Glu Lys Pro 180
TTCAAGCCAA CTTTTTCAAT GCCCTGCGAG TTTGACTTCT GAAATGGTGG TTTTAGGGCG 763 CAGCAATGTA GGCAAAAGCT CGTTTATTAA TACCTTG 800
(2) INFORMATION FOR SEQ ID NO: 60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 183 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
Met Arg Trp Trp Cys Phe Leu Val Cys Cys Phe Gly Ile Leu Ser Val
1 5 10 15
Met Asp Ala Lys Lys Leu Glu Asn Lys Asn Leu Lys Lys Glu Arg Glu
20 25 30
Leu Leu Glu Ile Thr Gly Asn Gln Phe Val Ala Asn Asp Lys Thr Lys
35 40 45
Thr Ala Val Ile Gln Gly Asn Val Gln Ile Lys Lys Gly Lys Asp Arg
50 55 60
Leu Phe Ala Asp Lys Val Ser Val Phe Leu Asn Asp Lys Arg Lys Pro
65 70 75 80
Glu Arg Tyr Glu Ala Thr Gly Asn Thr His Phe Asn Ile Phe Thr Glu
85 90 95
Asp Asn Arg Glu Ile Ser Gly Ser Ala Asp Lys Leu Ile Tyr Asn Ala
100 105 110
Leu Asn Gly Glu Tyr Lys Leu Leu Gln Asn Ala Val Val Arg Glu Val
115 120 125
Gly Lys Ser Asn Val Ile Thr Gly Asp Glu Ile Ile Leu Asn Lys Thr
130 135 140
Lys Gly Tyr Ala Asp Val Leu Gly Ser Ala Lys Arg Pro Ala Lys Phe
145 150 155 160
Val Phe Asp Met Glu Asp Ile Asn Glu Glu Asn Arg Lys Ala Lys Leu
165 170 175
Lys Lys Lys Gly Glu Lys Pro
180
(2) INFORMATION FOR SEQ ID NO: 61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 724 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 88...618 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
GTGATGATTG AAAACATGTG AAAGAGCGTT TTTTAAGCTT TTAAATGGTG TTTGAATGCG 60
AAAAAAAGGC TAATACTATC ATAAGGA ATG AAG TTG ATA AAA TTT GTG CGT AAT 114
Met Lys Leu He Lys Phe Val Arg Asn 1 5
GTG GTT TTG TTC ATT TTA ACG GCG ATC TTT TTA GCG TTC ATG CTT TTG 162 Val Val Leu Phe He Leu Thr Ala He Phe Leu Ala Phe Met Leu Leu 10 15 20 25
GTG AGT TAT TGC ATG CCC CAT TAT AGC GCG GCT GTC ATT AGC GGG GTG 210 Val Ser Tyr Cys Met Pro His Tyr Ser Ala Ala Val He Ser Gly Val 30 35 40
GAA GTC AAA AGA ATG AAT GAA AAT GAA AAC ACG CCC AAT AAT AAG GAA 258 Glu Val Lys Arg Met Asn Glu Asn Glu Asn Thr Pro Asn Asn Lys Glu 45 50 55
GTA AAA ACC CTT GCT AGA GAT GTC TAT TTT GTG CAA ACT TAC GAC CCT 306 Val Lys Thr Leu Ala Arg Asp Val Tyr Phe Val Gin Thr Tyr Asp Pro 60 65 70
AAA GAT CAA AAA AGC GTA ACC GTT TAT CGT AAC GAA GAC ACG CGC TTT 354 Lys Asp Gin Lys Ser Val Thr Val Tyr Arg Asn Glu Asp Thr Arg Phe 75 80 85
AGC TTC CCT TTT TAT TTT AAG TTT AAT TCG GCT GAT ATT TCA GCC CTC 402 Ser Phe Pro Phe Tyr Phe Lys Phe Asn Ser Ala Asp He Ser Ala Leu
90 95 100 105
GCT CAA AGT TTA ATC AAT CAG CAA GTG GAA GTG AAA TAC TAT GGT TGG 450 Ala Gin Ser Leu He Asn Gin Gin Val Glu Val Lys Tyr Tyr Gly Trp 110 115 120
CGG ATC AAT TTG TTT AAC ATG TTC CCT AAT GTG ATT TTT TTA AAG CCC 498 Arg He Asn Leu Phe Asn Met Phe Pro Asn Val He Phe Leu Lys Pro 125 130 135
TTA AAA GAG AGC ACT GAC ATT TCA AAG CCC ATT TTT AGC TGG ATT TTA 546 Leu Lys Glu Ser Thr Asp He Ser Lys Pro He Phe Ser Trp He Leu 140 145 150
TAC GCT TTG CTG TTA ATG GGC TTT TTT ATC AGC GCG CGT TCT GTT TGC 594 Tyr Ala Leu Leu Leu Met Gly Phe Phe He Ser Ala Arg Ser Val Cys 155 160 165
ACT TTA TTT AAG AGC AAA GCT CAT TAAAACTTTT AGGCTTTGTT GGAAAATCAC 648 Thr Leu Phe Lys Ser Lys Ala His 170 175
AATGGGGTTA TTGGAGCGTG TATTAAAAAG CTCAATATAG GGCAAGCTGA TGCTGTGAAA 708 AGCGGTGTTG TTTCCT 724
(2) INFORMATION FOR SEQ ID NO: 62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 177 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
Met Lys Leu Ile Lys Phe Val Arg Asn Val Val Leu Phe Ile Leu Thr
1 5 10 15
Ala Ile Phe Leu Ala Phe Met Leu Leu Val Ser Tyr Cys Met Pro His
20 25 30
Tyr Ser Ala Ala Val Ile Ser Gly Val Glu Val Lys Arg Met Asn Glu
35 40 45
Asn Glu Asn Thr Pro Asn Asn Lys Glu Val Lys Thr Leu Ala Arg Asp
50 55 60
Val Tyr Phe Val Gln Thr Tyr Asp Pro Lys Asp Gln Lys Ser Val Thr
65 70 75 80
Val Tyr Arg Asn Glu Asp Thr Arg Phe Ser Phe Pro Phe Tyr Phe Lys
85 90 95
Phe Asn Ser Ala Asp Ile Ser Ala Leu Ala Gln Ser Leu Ile Asn Gln
100 105 110
Gln Val Glu Val Lys Tyr Tyr Gly Trp Arg Ile Asn Leu Phe Asn Met
115 120 125
Phe Pro Asn Val Ile Phe Leu Lys Pro Leu Lys Glu Ser Thr Asp Ile
130 135 140
Ser Lys Pro Ile Phe Ser Trp Ile Leu Tyr Ala Leu Leu Leu Met Gly
145 150 155 160
Phe Phe Ile Ser Ala Arg Ser Val Cys Thr Leu Phe Lys Ser Lys Ala
165 170 175
His
(2) INFORMATION FOR SEQ ID NO: 63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 982 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 117...911 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 117...167 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 168...911 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
CCACATTTAA GGTAGAAACC ACTCAATTAG ATGTAAAAAT TCCAAACGGC AACCAAAAAA 60 TGGTTAAAAA GGACACAATA AACCCCAAAA ATGAAATTTA AATATATGGG AACTTA ATG 119
Met -17
AGA ATT TTT TTT GTT ATC ATG GGA CTT GTG TTT TTT GGT TGC ACC AGT 167 Arg He Phe Phe Val He Met Gly Leu Val Phe Phe Gly Cys Thr Ser -15 -10 -5
AAG GTG CAT GAG ATG AAA AAA AGC CCT TGC ACC TTG TAT GAA AAC AGG 215 Lys Val His Glu Met Lys Lys Ser Pro Cys Thr Leu Tyr Glu Asn Arg 1 5 10 15
TTA AAT CTC GCA GAA ATC TTT CAC AAG CGA GCA ATT GAT CTA TTT AGA 263 Leu Asn Leu Ala Glu He Phe His Lys Arg Ala He Asp Leu Phe Arg 20 25 30
GAG CTT TTA AGC CAC CAA GAA AAG CAT TTA GAA AAC AAG CTT TCT GGT 311 Glu Leu Leu Ser His Gin Glu Lys His Leu Glu Asn Lys Leu Ser Gly
35 40 45
TTT TCG GTG AGT GAT TTG GAC ATG CAA AGC GTG TTT CGG CTG GAA AGA 359 Phe Ser Val Ser Asp Leu Asp Met Gin Ser Val Phe Arg Leu Glu Arg 50 55 60
AAC CGC TTG AAA ATC GCT TAC AAG CTC TTA GGC TTG ATG AGT TTT ATC 407 Asn Arg Leu Lys He Ala Tyr Lys Leu Leu Gly Leu Met Ser Phe He 65 70 75 80
GCT CTT ATT TTA GCG ATC GTG TTA ATC AGT CTT CTA CCC TTA CAA AAA 455 Ala Leu He Leu Ala He Val Leu He Ser Leu Leu Pro Leu Gin Lys 85 90 95
ACC GAA CAC CAT TTC GTG GAT TTT TTA AAC CAG GAC AAG CAT TAC GTC 503 Thr Glu His His Phe Val Asp Phe Leu Asn Gin Asp Lys His Tyr Val 100 105 110
ATT ATC CAA AGA GCG GAT AAA AGC ATT TCC AGT AAT GAA GCG TTG GCT 551 He He Gin Arg Ala Asp Lys Ser He Ser Ser Asn Glu Ala Leu Ala 115 120 125
CGT TCG CTC ATT GGG GCG TAT GTG TTA AAC CGA GAG AGC ATT AAC CGC 599 Arg Ser Leu He Gly Ala Tyr Val Leu Asn Arg Glu Ser He Asn Arg 130 135 140
ATT GAC GAT AAA TCG CGC TAT GAA TTG GTG CGC TTG CAA AGC AGT TCT 647 He Asp Asp Lys Ser Arg Tyr Glu Leu Val Arg Leu Gin Ser Ser Ser 145 150 155 160
AAA GTG TGG CAA CGC TTT GAA GAT TTG ATT AAA ACC CAA AAC AGC ATT 695 Lys Val Trp Gin Arg Phe Glu Asp Leu He Lys Thr Gin Asn Ser He 165 170 175
TAT GTG CAA AGC CAT TTG GAA AGA GAA GTC CAT ATC GTC AAT ATT GCG 743 Tyr Val Gin Ser His Leu Glu Arg Glu Val His He Val Asn He Ala 180 185 190
ATC TAT CAG CAA GAC AAT AAC CCC ATT GCG AGC GTC TCC ATT GCC GCT 791 He Tyr Gin Gin Asp Asn Asn Pro He Ala Ser Val Ser He Ala Ala 195 200 205
AAA CTT TTG AAT GAA AAC AAG CTG GTG TAT GAA AAG CGT TAT AAA ATC 839 Lys Leu Leu Asn Glu Asn Lys Leu Val Tyr Glu Lys Arg Tyr Lys He 210 215 220
GTA TTG AGT TAT TTG TTT GAC ACC CCG ATG AAT TCA AGC TTG CAA GCT 887 Val Leu Ser Tyr Leu Phe Asp Thr Pro Met Asn Ser Ser Leu Gin Ala 225 230 235 240
TGC AAG CTC TCA GGC TTC ATA GTT TGACATGACA TATAGATGAG CTTTATGCGG 941 Cys Lys Leu Ser Gly Phe He Val 245
TACGATTATC ACAGAATGGC TAACGCAGCA GGCACCGAGT A 982
(2) INFORMATION FOR SEQ ID NO: 64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
Met Arg Ile Phe Phe Val Ile Met Gly Leu Val Phe Phe Gly Cys Thr
-17 -15 -10 -5
Ser Lys Val His Glu Met Lys Lys Ser Pro Cys Thr Leu Tyr Glu Asn
1 5 10 15
Arg Leu Asn Leu Ala Glu Ile Phe His Lys Arg Ala Ile Asp Leu Phe
20 25 30
Arg Glu Leu Leu Ser His Gln Glu Lys His Leu Glu Asn Lys Leu Ser
35 40 45
Gly Phe Ser Val Ser Asp Leu Asp Met Gln Ser Val Phe Arg Leu Glu
50 55 60
Arg Asn Arg Leu Lys Ile Ala Tyr Lys Leu Leu Gly Leu Met Ser Phe
65 70 75
Ile Ala Leu Ile Leu Ala Ile Val Leu Ile Ser Leu Leu Pro Leu Gln
80 85 90 95
Lys Thr Glu His His Phe Val Asp Phe Leu Asn Gln Asp Lys His Tyr
100 105 110
Val Ile Ile Gln Arg Ala Asp Lys Ser Ile Ser Ser Asn Glu Ala Leu
115 120 125
Ala Arg Ser Leu Ile Gly Ala Tyr Val Leu Asn Arg Glu Ser Ile Asn
130 135 140
Arg Ile Asp Asp Lys Ser Arg Tyr Glu Leu Val Arg Leu Gln Ser Ser
145 150 155
Ser Lys Val Trp Gln Arg Phe Glu Asp Leu Ile Lys Thr Gln Asn Ser
160 165 170 175
Ile Tyr Val Gln Ser His Leu Glu Arg Glu Val His Ile Val Asn Ile
180 185 190
Ala Ile Tyr Gln Gln Asp Asn Asn Pro Ile Ala Ser Val Ser Ile Ala
195 200 205
Ala Lys Leu Leu Asn Glu Asn Lys Leu Val Tyr Glu Lys Arg Tyr Lys
210 215 220
Ile Val Leu Ser Tyr Leu Phe Asp Thr Pro Met Asn Ser Ser Leu Gln
225 230 235
Ala Cys Lys Leu Ser Gly Phe Ile Val
240 245
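The residue numbering in SEQ ID NO: 63 and 64 follows from the FEATURE entries: the sig_peptide (117...167) spans 17 codons, numbered -17 to -1, and the mat_peptide (168...911) begins at residue 1. The following is a minimal Python sketch of that convention, given for illustration only; the function name `listing_number` is hypothetical and is not part of the listing.

```python
def listing_number(index: int, signal_length: int) -> int:
    """Map a 0-based residue index in the precursor to its listing number.

    Signal-peptide residues are numbered negatively (there is no residue 0);
    the first residue of the mature peptide is numbered 1.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    if index < signal_length:
        return index - signal_length
    return index - signal_length + 1


# Coordinates taken from the FEATURE table of SEQ ID NO: 63.
SIG_START, SIG_END = 117, 167
MAT_START, MAT_END = 168, 911

signal_length = (SIG_END - SIG_START + 1) // 3  # codons in sig_peptide
mature_length = (MAT_END - MAT_START + 1) // 3  # codons in mat_peptide
total_length = signal_length + mature_length    # residues in the precursor
```

Under these assumptions the precursor length equals the 265 amino acids stated for SEQ ID NO: 64, and the last residue (Val) receives the number 248.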
(2) INFORMATION FOR SEQ ID NO: 65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2059 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 183...1961 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
GATATTTGTT TTGTTGGGGG TTAGGTTTTT GTTTAAGAAA GTTTTTTAAA ACTAAAGAAG 60
CGCTTAAAAC AGAACCTTTT GTTTTTTAGG TTTTATTTTT TACTTTGGCT TGTTTTCAAA 120
AGTCATTTTG ATTTCTAAAA ATAGTCTATA ATGCTCGCAA GAGATATTTT TTAAGGTTAT 180
CA ATG AAA GCT ATA AAA ATA CTT TTT ATA ATG ACA CTC AGT TTA AAC 227 Met Lys Ala He Lys He Leu Phe He Met Thr Leu Ser Leu Asn 1 5 10 15
GCT ATC AGC GTG AAT AGG GCG TTG TTT GAT TTA AAA GAT TCG CAA TTA 275 Ala He Ser Val Asn Arg Ala Leu Phe Asp Leu Lys Asp Ser Gin Leu 20 25 30
AAA GGG GAA TTA ACG CCA AAA ATA GTG AAT TTT GGG GGT TAT AAA AGC 323 Lys Gly Glu Leu Thr Pro Lys He Val Asn Phe Gly Gly Tyr Lys Ser 35 40 45
AGC ACT GAA GAG TGG GGG GCT ACG GCT TTA AAC TAT ATC AAT GCG GCT 371 Ser Thr Glu Glu Trp Gly Ala Thr Ala Leu Asn Tyr He Asn Ala Ala 50 55 60
AAT GGC GAT GCG AAA AAA TTC AGC ACT CTA GTG GAA AAA ATG CGT TTT 419 Asn Gly Asp Ala Lys Lys Phe Ser Thr Leu Val Glu Lys Met Arg Phe 65 70 75
AAC TCC GGT ATA TTG GGG AAT TTA AGA GTG CAT GCA CGT TTG AGG CAA 467 Asn Ser Gly He Leu Gly Asn Leu Arg Val His Ala Arg Leu Arg Gin 80 85 90 95
GCC CTA AAA TTG CAA AAG AAT TTG AAA TAT TGC CTT AAA ATC ATC GCT 515 Ala Leu Lys Leu Gin Lys Asn Leu Lys Tyr Cys Leu Lys He He Ala 100 105 110
AGG GAT TCT TTT TAT AGC TAC CGC ACC GGT ATT TAT ATC CCC TTA GGC 563 Arg Asp Ser Phe Tyr Ser Tyr Arg Thr Gly He Tyr He Pro Leu Gly 115 120 125
ATT TCT TTA AAA GAT CAA AAA ACG GCT CAA AAA ATG CTC GCT GAT TTG 611 He Ser Leu Lys Asp Gin Lys Thr Ala Gin Lys Met Leu Ala Asp Leu 130 135 140
AGC GTG GTA GGG GCG TAT CTT AAA AAA CAA CAA GAG AAT GAA AAG GCT 659 Ser Val Val Gly Ala Tyr Leu Lys Lys Gin Gin Glu Asn Glu Lys Ala
145 150 155
CAA AGC CCT TAT TAC AGA AAC AAC AAC TAT TAC AAC TCT TAC TAT AGC 707 Gin Ser Pro Tyr Tyr Arg Asn Asn Asn Tyr Tyr Asn Ser Tyr Tyr Ser 160 165 170 175
CCT TAT TAC GGA ATG TAT GGT ATG TAT GGC ATG GGC ATG TAT GGA ATG 755 Pro Tyr Tyr Gly Met Tyr Gly Met Tyr Gly Met Gly Met Tyr Gly Met 180 185 190
TAT GGC ATG GGC ATG TAT GAT TTT TAT GAC TTT TAT GAT GGC ATG TAT 803 Tyr Gly Met Gly Met Tyr Asp Phe Tyr Asp Phe Tyr Asp Gly Met Tyr 195 200 205
GGA TTC TAC CCT AAC ATG TTT TTC ATG ATG CAA GTT CAA GAT TAC TTG 851 Gly Phe Tyr Pro Asn Met Phe Phe Met Met Gin Val Gin Asp Tyr Leu 210 215 220
ATG TTA GAA AAT TAC ATG TAT GCG CTC GAT CAA GAA GAG ATT TTA GAT 899 Met Leu Glu Asn Tyr Met Tyr Ala Leu Asp Gin Glu Glu He Leu Asp 225 230 235
CAT GAC GCT TCT ACT GAC CAA CTT GAT ACG CCT ACT GAT GAT GAC AAA 947 His Asp Ala Ser Thr Asp Gin Leu Asp Thr Pro Thr Asp Asp Asp Lys 240 245 250 255
GAC GAT AAA GAC GAT AAA TCC TTA CAG CAG GCA AAT CTT ATG AAC TTT 995 Asp Asp Lys Asp Asp Lys Ser Leu Gin Gin Ala Asn Leu Met Asn Phe 260 265 270
TAT CGT GAT CCC AAA TTC AGC AAA GGC ATT CAA ACC AAC CGC TTG AAT 1043
Tyr Arg Asp Pro Lys Phe Ser Lys Gly He Gin Thr Asn Arg Leu Asn
275 280 285
AGC GCT TTA GTC AAT TTA GAC AAC AGT CGC ATG CTC AAA GAC AAT TCG 1091
Ser Ala Leu Val Asn Leu Asp Asn Ser Arg Met Leu Lys Asp Asn Ser
290 295 300
CTT TTC CAC ACT AAA GCC ATG CCC ACT AAA AGC GTG GAT GCG ATA ACT 1139 Leu Phe His Thr Lys Ala Met Pro Thr Lys Ser Val Asp Ala He Thr 305 310 315
TCT CAA GCC AAA GAG CTT AAC CAT TTA GTG GGG CAA ATC AAA GAA ATG 1187 Ser Gin Ala Lys Glu Leu Asn His Leu Val Gly Gin He Lys Glu Met 320 325 330 335
AAG CAA GAC GGG GCG AGT CCT AGT AAG ATT GAT TCA GTT GTC AAT AAA 1235 Lys Gin Asp Gly Ala Ser Pro Ser Lys He Asp Ser Val Val Asn Lys 340 345 350
GCT ATG GAA GTG AGG GAC AAG CTA GAC AAT AAT CTC AAC CAA CTA GAC 1283 Ala Met Glu Val Arg Asp Lys Leu Asp Asn Asn Leu Asn Gin Leu Asp 355 360 365
AAT GAC TTA AAA GAT CAA AAA GGG CTT TCA AGC GAG CAA CAA GCT CAA 1331 Asn Asp Leu Lys Asp Gin Lys Gly Leu Ser Ser Glu Gin Gin Ala Gin
370 375 380
GTG GAT AAA GCC CTA GAC AGC GTG CAA CAA TTA AGC CAT AGC AGC GAT 1379 Val Asp Lys Ala Leu Asp Ser Val Gin Gin Leu Ser His Ser Ser Asp 385 390 395
GTG GTG GGG AAT TAT TTA GAC GGG AGT TTG AAA ATT GAT GGC GAT GAT 1427 Val Val Gly Asn Tyr Leu Asp Gly Ser Leu Lys He Asp Gly Asp Asp 400 405 410 415
AGA GAT GAT TTG AAT GAT GCG ATG AAT AAC CCT ATG CAA CAA CCC GTG 1475 Arg Asp Asp Leu Asn Asp Ala Met Asn Asn Pro Met Gin Gin Pro Val 420 425 430
CAA CAA ACG CCT ACT AGC AAC ATG GCC GAC ACC CAT GCA AAT GAC AGC 1523 Gin Gin Thr Pro Thr Ser Asn Met Ala Asp Thr His Ala Asn Asp Ser 435 440 445
AAG GAT CAA GGG AGT AAC GCG CTC ATA AAC CCT AAC AGC GCC ACT AAC 1571 Lys Asp Gin Gly Ser Asn Ala Leu He Asn Pro Asn Ser Ala Thr Asn 450 455 460
GCC GAC GAC ACT CAC ACT GAC GAT ACT CAC ACT GAC ACT AAC ACC ACA 1619 Ala Asp Asp Thr His Thr Asp Asp Thr His Thr Asp Thr Asn Thr Thr 465 470 475
AAC GAT GCT AGC ACC ACT GAC ACC CCC ACT GAC GAT AAA GAT GCT AGC 1667 Asn Asp Ala Ser Thr Thr Asp Thr Pro Thr Asp Asp Lys Asp Ala Ser 480 485 490 495
GGC TTG AAC AAT ACC GGC GAT ATG AAT AAC ACG GAT ACC GGC AAC ACG 1715 Gly Leu Asn Asn Thr Gly Asp Met Asn Asn Thr Asp Thr Gly Asn Thr 500 505 510
GAC ACC GGC AAT ACG GAT ACC GGT AAC ACT GAT GAT ATG AGC AAC ATG 1763 Asp Thr Gly Asn Thr Asp Thr Gly Asn Thr Asp Asp Met Ser Asn Met 515 520 525
AAC AAC GGC AAC GAT GAT ACG GGT AAC GCT AAT GAC GAC ATG AGC AAC 1811 Asn Asn Gly Asn Asp Asp Thr Gly Asn Ala Asn Asp Asp Met Ser Asn 530 535 540
GGC AAC GAC ATG GGC GAT GAT TTG AAC AAC GCG AAC GAT ATG AAC GAC 1859 Gly Asn Asp Met Gly Asp Asp Leu Asn Asn Ala Asn Asp Met Asn Asp 545 550 555
GAC ATG GGT AAT GGC AAC GAT GAC ATG GGC GAT ATG GGG GAT ATG AAC 1907 Asp Met Gly Asn Gly Asn Asp Asp Met Gly Asp Met Gly Asp Met Asn 560 565 570 575
GAC GAT ATG GGT GGC GAT ATG GGA GAC ATG GGG GAT ATG GGC GAT ATG 1955 Asp Asp Met Gly Gly Asp Met Gly Asp Met Gly Asp Met Gly Asp Met 580 585 590
GGG AAT TGAGATTAAC CCCAATATCA AAGAGTGATA GCCAAAACTT TAAGGAATAT TT 2013
Gly Asn
TTATAGTAAA AACGATTCTT TTAAGGTAAT AGGGGGGATA TTTTGC 2059
(2) INFORMATION FOR SEQ ID NO: 66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 593 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
Met Lys Ala He Lys He Leu Phe He Met Thr Leu Ser Leu Asn Ala
1 5 10 15
He Ser Val Asn Arg Ala Leu Phe Asp Leu Lys Asp Ser Gin Leu Lys
20 25 30
Gly Glu Leu Thr Pro Lys He Val Asn Phe Gly Gly Tyr Lys Ser Ser
35 40 45
Thr Glu Glu Trp Gly Ala Thr Ala Leu Asn Tyr He Asn Ala Ala Asn
50 55 60
Gly Asp Ala Lys Lys Phe Ser Thr Leu Val Glu Lys Met Arg Phe Asn 65 70 75 80
Ser Gly He Leu Gly Asn Leu Arg Val His Ala Arg Leu Arg Gin Ala
85 90 95
Leu Lys Leu Gin Lys Asn Leu Lys Tyr Cys Leu Lys He He Ala Arg
100 105 110
Asp Ser Phe Tyr Ser Tyr Arg Thr Gly He Tyr He Pro Leu Gly He
115 120 125
Ser Leu Lys Asp Gin Lys Thr Ala Gin Lys Met Leu Ala Asp Leu Ser
130 135 140
Val Val Gly Ala Tyr Leu Lys Lys Gin Gin Glu Asn Glu Lys Ala Gin 145 150 155 160
Ser Pro Tyr Tyr Arg Asn Asn Asn Tyr Tyr Asn Ser Tyr Tyr Ser Pro
165 170 175
Tyr Tyr Gly Met Tyr Gly Met Tyr Gly Met Gly Met Tyr Gly Met Tyr
180 185 190
Gly Met Gly Met Tyr Asp Phe Tyr Asp Phe Tyr Asp Gly Met Tyr Gly
195 200 205
Phe Tyr Pro Asn Met Phe Phe Met Met Gin Val Gin Asp Tyr Leu Met
210 215 220
Leu Glu Asn Tyr Met Tyr Ala Leu Asp Gin Glu Glu He Leu Asp His 225 230 235 240
Asp Ala Ser Thr Asp Gin Leu Asp Thr Pro Thr Asp Asp Asp Lys Asp
245 250 255
Asp Lys Asp Asp Lys Ser Leu Gin Gin Ala Asn Leu Met Asn Phe Tyr
260 265 270
Arg Asp Pro Lys Phe Ser Lys Gly He Gin Thr Asn Arg Leu Asn Ser
275 280 285
Ala Leu Val Asn Leu Asp Asn Ser Arg Met Leu Lys Asp Asn Ser Leu
290 295 300
Phe His Thr Lys Ala Met Pro Thr Lys Ser Val Asp Ala He Thr Ser 305 310 315 320
Gin Ala Lys Glu Leu Asn His Leu Val Gly Gin He Lys Glu Met Lys
325 330 335
Gin Asp Gly Ala Ser Pro Ser Lys He Asp Ser Val Val Asn Lys Ala
340 345 350
Met Glu Val Arg Asp Lys Leu Asp Asn Asn Leu Asn Gin Leu Asp Asn
355 360 365
Asp Leu Lys Asp Gin Lys Gly Leu Ser Ser Glu Gin Gin Ala Gin Val
370 375 380
Asp Lys Ala Leu Asp Ser Val Gin Gin Leu Ser His Ser Ser Asp Val 385 390 395 400
Val Gly Asn Tyr Leu Asp Gly Ser Leu Lys He Asp Gly Asp Asp Arg
405 410 415
Asp Asp Leu Asn Asp Ala Met Asn Asn Pro Met Gin Gin Pro Val Gin
420 425 430
Gin Thr Pro Thr Ser Asn Met Ala Asp Thr His Ala Asn Asp Ser Lys
435 440 445
Asp Gin Gly Ser Asn Ala Leu He Asn Pro Asn Ser Ala Thr Asn Ala
450 455 460
Asp Asp Thr His Thr Asp Asp Thr His Thr Asp Thr Asn Thr Thr Asn 465 470 475 480
Asp Ala Ser Thr Thr Asp Thr Pro Thr Asp Asp Lys Asp Ala Ser Gly
485 490 495
Leu Asn Asn Thr Gly Asp Met Asn Asn Thr Asp Thr Gly Asn Thr Asp
500 505 510
Thr Gly Asn Thr Asp Thr Gly Asn Thr Asp Asp Met Ser Asn Met Asn
515 520 525
Asn Gly Asn Asp Asp Thr Gly Asn Ala Asn Asp Asp Met Ser Asn Gly
530 535 540
Asn Asp Met Gly Asp Asp Leu Asn Asn Ala Asn Asp Met Asn Asp Asp 545 550 555 560
Met Gly Asn Gly Asn Asp Asp Met Gly Asp Met Gly Asp Met Asn Asp
565 570 575
Asp Met Gly Gly Asp Met Gly Asp Met Gly Asp Met Gly Asp Met Gly
580 585 590
Asn
(2) INFORMATION FOR SEQ ID NO: 67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1527 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 112...1461 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
AATGAGCGAT TTGAAAGATT TTGTCAATAA AACTTCAAGC CCTTTAAATG CGAATTGATT 60
TTCTTATATT ATGATTACGA TTTATCAATT TAAAACATTT GGAGAAAGAC A ATG AGT 117
Met Ser
1
ATG GAA TTT GAT GCT GTT ATT ATT GGA GGT GGG GTT TCA GGG TGC GCG 165 Met Glu Phe Asp Ala Val He He Gly Gly Gly Val Ser Gly Cys Ala 5 10 15
ACC TTT TAT ACT TTG AGC GAA TAC AGC TCT TTA AAG CGC GTG GCT ATC 213 Thr Phe Tyr Thr Leu Ser Glu Tyr Ser Ser Leu Lys Arg Val Ala He 20 25 30
GTG GAA AAA TGC TCT AAA TTG GCT CAA ATC AGC TCC AGC GCT AAA GCT 261 Val Glu Lys Cys Ser Lys Leu Ala Gin He Ser Ser Ser Ala Lys Ala 35 40 45 50
AAT TCG CAA ACC ATT CAT GAT GGC TCT ATT GAA ACG AAT TAC ACT CCC 309 Asn Ser Gin Thr He His Asp Gly Ser He Glu Thr Asn Tyr Thr Pro 55 60 65
GAA AAA GCT AAA AAA GTG CGT TTG AGC GCT TAT AAG ACC AGG CAA TAC 357 Glu Lys Ala Lys Lys Val Arg Leu Ser Ala Tyr Lys Thr Arg Gin Tyr 70 75 80
GCT TTG AAT AAA GGC TTG CAA AAT GAA GTG ATT TTT GAA ACC CAG AAA 405 Ala Leu Asn Lys Gly Leu Gin Asn Glu Val He Phe Glu Thr Gin Lys 85 90 95
ATG GCT ATA GGC GTG GGC GAT GAA GAA TGC GAG TTC ATG AAA AAA CGC 453 Met Ala He Gly Val Gly Asp Glu Glu Cys Glu Phe Met Lys Lys Arg 100 105 110
TAC GAA TCT TTT AAA GAA ATC TTT GTG GGG TTA GAA GAA TTT GAC AAG 501 Tyr Glu Ser Phe Lys Glu He Phe Val Gly Leu Glu Glu Phe Asp Lys 115 120 125 130
CAA AAG ATT AAA GAA TTA GAG CCT AAT GTG ATT TTA GGG GCT AAT GGC 549 Gin Lys He Lys Glu Leu Glu Pro Asn Val He Leu Gly Ala Asn Gly 135 140 145
ATA GAC AGG CAT GAA AAC ATT ATC GGG CAT GGG TAT AGA AAG GAT TGG 597 He Asp Arg His Glu Asn He He Gly His Gly Tyr Arg Lys Asp Trp 150 155 160
AGC ACC ATG AAT TTT GCG AAG TTG AGT GAA AAC TTC GTT GAA GAA GCC 645 Ser Thr Met Asn Phe Ala Lys Leu Ser Glu Asn Phe Val Glu Glu Ala 165 170 175
CTA AAA TTA AAG CCT AAC AAC CAG GTG TTT TTG AAT TTC AAA GTG AAA 693 Leu Lys Leu Lys Pro Asn Asn Gin Val Phe Leu Asn Phe Lys Val Lys 180 185 190
AAG ATT GAA AAA CGC AAC GAC ACT TAC GCC GTA ATT TCA GAA GAC GCT 741 Lys He Glu Lys Arg Asn Asp Thr Tyr Ala Val He Ser Glu Asp Ala 195 200 205 210
GAA GAA GTG TAT GCT AAA TTC GTG CTG GTC AAT GCC GGC TCT TAC GCT 789 Glu Glu Val Tyr Ala Lys Phe Val Leu Val Asn Ala Gly Ser Tyr Ala 215 220 225
TTG CCT TTG GCT CAG AGC ATG GGC TAT GGC CTA GAT TTA GGG TGC TTG 837 Leu Pro Leu Ala Gin Ser Met Gly Tyr Gly Leu Asp Leu Gly Cys Leu 230 235 240
CCT GTG GCG GGC AGC TTT TAT TTT GTG CCG GAT TTA TTA AGG GGT AAG 885 Pro Val Ala Gly Ser Phe Tyr Phe Val Pro Asp Leu Leu Arg Gly Lys 245 250 255
GTT TAT ACC GTT CAA AAC CCC AAA CTC CCT TTT GCA GCC GTG CAT GGC 933 Val Tyr Thr Val Gin Asn Pro Lys Leu Pro Phe Ala Ala Val His Gly 260 265 270
GAC CCT GAT GCC GTC ATT AAA GGA AAA ACA CGA ATC GGG CCT ACC GCT 981 Asp Pro Asp Ala Val Ile Lys Gly Lys Thr Arg Ile Gly Pro Thr Ala 275 280 285 290
TTA ACG ATG CCT AAA TTA GAA CGC AAC AAA TGT TGG CTT AAG GGC ATT 1029 Leu Thr Met Pro Lys Leu Glu Arg Asn Lys Cys Trp Leu Lys Gly He 295 300 305
AGC TTG GAA TTG TTG AAA ATG GAT TTG AAT AAA GAT GTG TTT AAA ATT 1077 Ser Leu Glu Leu Leu Lys Met Asp Leu Asn Lys Asp Val Phe Lys He 310 315 320
GCG TTT GAT TTG ATG AGC GAT AAA GAA ATC CGA AAT TAT GTG TTT AAA 1125 Ala Phe Asp Leu Met Ser Asp Lys Glu He Arg Asn Tyr Val Phe Lys 325 330 335
AAC ATG GTT TTT GAA TTG CCC ATT ATC GGT AAA AGG AAA TTT TTA AAA 1173 Asn Met Val Phe Glu Leu Pro He He Gly Lys Arg Lys Phe Leu Lys 340 345 350
GAC GCT CAA AAA ATC ATC CCC TCT CTT AGC CTA GAA GAT CTA GAA TAC 1221 Asp Ala Gin Lys He He Pro Ser Leu Ser Leu Glu Asp Leu Glu Tyr 355 360 365 370
GCT CAT GGT TTT GGT GAA GTG CGC CCG CAA GTT TTA GAC AGA ACC AAG 1269 Ala His Gly Phe Gly Glu Val Arg Pro Gin Val Leu Asp Arg Thr Lys 375 380 385
CGA AAA CTG GAA TTA GGC GAA AAA AAG ATT TGC ACC CAT AAA GGC ATC 1317 Arg Lys Leu Glu Leu Gly Glu Lys Lys He Cys Thr His Lys Gly He 390 395 400
ACT TTT AAC ATG ACC CCT TCT CCA GGC GCG ACG AGT TGT TTG CAA AAC 1365 Thr Phe Asn Met Thr Pro Ser Pro Gly Ala Thr Ser Cys Leu Gin Asn 405 410 415
GCC CTT GTG GAT TCC CAA GAA ATC GCT GCG TAT TTG GGC GAG AGC TTT 1413 Ala Leu Val Asp Ser Gin Glu He Ala Ala Tyr Leu Gly Glu Ser Phe 420 425 430
GAA TTA GAA CGC TTT TAT AAA GAT TTA TCC CCA GAA GAA TTG GAA AAT T 1462 Glu Leu Glu Arg Phe Tyr Lys Asp Leu Ser Pro Glu Glu Leu Glu Asn 435 440 445 450
AAAAACGCAT GCAAAAAGAA CAAGAAGCCC AAGAAATCGC TAAAAAAGCC GTTAAAATCG 1522 TGTTT 1527
(2) INFORMATION FOR SEQ ID NO: 68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 450 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
Met Ser Met Glu Phe Asp Ala Val He He Gly Gly Gly Val Ser Gly
1 5 10 15
Cys Ala Thr Phe Tyr Thr Leu Ser Glu Tyr Ser Ser Leu Lys Arg Val
20 25 30
Ala He Val Glu Lys Cys Ser Lys Leu Ala Gin He Ser Ser Ser Ala
35 40 45
Lys Ala Asn Ser Gin Thr He His Asp Gly Ser He Glu Thr Asn Tyr
50 55 60
Thr Pro Glu Lys Ala Lys Lys Val Arg Leu Ser Ala Tyr Lys Thr Arg 65 70 75 80
Gin Tyr Ala Leu Asn Lys Gly Leu Gin Asn Glu Val He Phe Glu Thr
85 90 95
Gin Lys Met Ala He Gly Val Gly Asp Glu Glu Cys Glu Phe Met Lys
100 105 110
Lys Arg Tyr Glu Ser Phe Lys Glu He Phe Val Gly Leu Glu Glu Phe
115 120 125
Asp Lys Gin Lys He Lys Glu Leu Glu Pro Asn Val He Leu Gly Ala
130 135 140
Asn Gly He Asp Arg His Glu Asn He He Gly His Gly Tyr Arg Lys 145 150 155 160
Asp Trp Ser Thr Met Asn Phe Ala Lys Leu Ser Glu Asn Phe Val Glu
165 170 175
Glu Ala Leu Lys Leu Lys Pro Asn Asn Gin Val Phe Leu Asn Phe Lys
180 185 190
Val Lys Lys He Glu Lys Arg Asn Asp Thr Tyr Ala Val He Ser Glu
195 200 205
Asp Ala Glu Glu Val Tyr Ala Lys Phe Val Leu Val Asn Ala Gly Ser
210 215 220
Tyr Ala Leu Pro Leu Ala Gin Ser Met Gly Tyr Gly Leu Asp Leu Gly 225 230 235 240
Cys Leu Pro Val Ala Gly Ser Phe Tyr Phe Val Pro Asp Leu Leu Arg 245 250 255
- 174 -
Gly Lys Val Tyr Thr Val Gin Asn Pro Lys Leu Pro Phe Ala Ala Val
260 265 270
His Gly Asp Pro Asp Ala Val He Lys Gly Lys Thr Arg He Gly Pro
275 280 285
Thr Ala Leu Thr Met Pro Lys Leu Glu Arg Asn Lys Cys Trp Leu Lys
290 295 300
Gly He Ser Leu Glu Leu Leu Lys Met Asp Leu Asn Lys Asp Val Phe 305 310 315 320
Lys He Ala Phe Asp Leu Met Ser Asp Lys Glu He Arg Asn Tyr Val
325 330 335
Phe Lys Asn Met Val Phe Glu Leu Pro He He Gly Lys Arg Lys Phe
340 345 350
Leu Lys Asp Ala Gin Lys He He Pro Ser Leu Ser Leu Glu Asp Leu
355 360 365
Glu Tyr Ala His Gly Phe Gly Glu Val Arg Pro Gin Val Leu Asp Arg
370 375 380
Thr Lys Arg Lys Leu Glu Leu Gly Glu Lys Lys He Cys Thr His Lys 385 390 395 400
Gly He Thr Phe Asn Met Thr Pro Ser Pro Gly Ala Thr Ser Cys Leu
405 410 415
Gin Asn Ala Leu Val Asp Ser Gin Glu He Ala Ala Tyr Leu Gly Glu
420 425 430
Ser Phe Glu Leu Glu Arg Phe Tyr Lys Asp Leu Ser Pro Glu Glu Leu
435 440 445
Glu Asn 450
(2) INFORMATION FOR SEQ ID NO: 69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 653 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 63...590 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CTAGATTTAA TTTTAAAGTT ATATAATTAA ACCACAAAAT CCTTTTTTAA AAGAAACTAA 60 GC ATG CCA AAA CCC AAG AAA AAC ACC CTC CCC TGT AGC CTT TCT GTC 107 Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val 1 5 10 15
AAA ATG TCT TAT TTC ATG CGC TTT CTC ATT AAA TGG CGC ACC CGC TCT 155 Lys Met Ser Tyr Phe Met Arg Phe Leu Ile Lys Trp Arg Thr Arg Ser 20 25 30
TTA AGC CAT AAA ATG ATG ACT CTC ATT CAA ATC TTA AGC ATT CTG GCT 203
Leu Ser His Lys Met Met Thr Leu Ile Gln Ile Leu Ser Ile Leu Ala
35 40 45
TTA GCG AGC AAG GCC AGT GAA GAT TTA GAA GAG CAA CTC AAA AAA ATC 251
Leu Ala Ser Lys Ala Ser Glu Asp Leu Glu Glu Gln Leu Lys Lys Ile
50 55 60
AAA GAT TAC ATT TAT AGA ACC CTA AAC GCT AAA ATC GCA TCG GAT GTG 299
Lys Asp Tyr Ile Tyr Arg Thr Leu Asn Ala Lys Ile Ala Ser Asp Val 65 70 75
TAT AAC CGA GTG CTT ATT TTA GTG AAT GAA TAT TGC ACT AAT GAA GAA 347
Tyr Asn Arg Val Leu Ile Leu Val Asn Glu Tyr Cys Thr Asn Glu Glu 80 85 90 95
TTG TTT GAC AAA GAG AGC GTT AAA ATT TCA GAT TTA CTC ATT CAA GAC 395
Leu Phe Asp Lys Glu Ser Val Lys Ile Ser Asp Leu Leu Ile Gln Asp
100 105 110
ATT CAG CTT TAC GCT TTA GTG GAT GAA ATG CTT AAA GAA GAT AAA TAT 443
Ile Gln Leu Tyr Ala Leu Val Asp Glu Met Leu Lys Glu Asp Lys Tyr
115 120 125
CAA GTC CAG CAC ACC ATT TTA AAG GGC ATC ATC AAA CGC AAA TAC GAT 491
Gln Val Gln His Thr Ile Leu Lys Gly Ile Ile Lys Arg Lys Tyr Asp
130 135 140
GAA GCC TAC TCG CTC AAT AGC GAA GAC AGG ATT CTT TTA GAA TAC CAA 539
Glu Ala Tyr Ser Leu Asn Ser Glu Asp Arg Ile Leu Leu Glu Tyr Gln 145 150 155
GAA CGC TTG CTA GAA CAC TCA CAC GCG TCT TTT TCA AAT AAA AAA TTC 587
Glu Arg Leu Leu Glu His Ser His Ala Ser Phe Ser Asn Lys Lys Phe 160 165 170 175
AAA TGATTTGAAA GCGTTACTTG CCCTGCTTTT TGGGCTTTTA TTGAAAAAGG GCTTTA 646 Lys
AAATGAG 653
(2) INFORMATION FOR SEQ ID NO: 70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 176 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
Met Pro Lys Pro Lys Lys Asn Thr Leu Pro Cys Ser Leu Ser Val Lys
1 5 10 15
Met Ser Tyr Phe Met Arg Phe Leu Ile Lys Trp Arg Thr Arg Ser Leu
20 25 30
Ser His Lys Met Met Thr Leu Ile Gln Ile Leu Ser Ile Leu Ala Leu
35 40 45
Ala Ser Lys Ala Ser Glu Asp Leu Glu Glu Gln Leu Lys Lys Ile Lys
50 55 60
Asp Tyr Ile Tyr Arg Thr Leu Asn Ala Lys Ile Ala Ser Asp Val Tyr
65 70 75 80
Asn Arg Val Leu Ile Leu Val Asn Glu Tyr Cys Thr Asn Glu Glu Leu
85 90 95
Phe Asp Lys Glu Ser Val Lys Ile Ser Asp Leu Leu Ile Gln Asp Ile
100 105 110
Gln Leu Tyr Ala Leu Val Asp Glu Met Leu Lys Glu Asp Lys Tyr Gln
115 120 125
Val Gln His Thr Ile Leu Lys Gly Ile Ile Lys Arg Lys Tyr Asp Glu
130 135 140
Ala Tyr Ser Leu Asn Ser Glu Asp Arg Ile Leu Leu Glu Tyr Gln Glu
145 150 155 160
Arg Leu Leu Glu His Ser His Ala Ser Phe Ser Asn Lys Lys Phe Lys
165 170 175
(2) INFORMATION FOR SEQ ID NO: 71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1883 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 91...1833 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
AAGCGTTAAA TTCCAATCAA AAACCATCGT ATCGGTGTTA ATATTGTGTA AAAATTAATG 60 TTATGAATCT CTTGTATTAA AAGGACTTCA ATG AAA AAA TTG GTT TTA GTC ATC 114
Met Lys Lys Leu Val Leu Val Ile 1 5
TTT TTA ACG CTA GCG CTT TCA ATA TCT GCA AAA GAA GTC AAA ATA GTG 162 Phe Leu Thr Leu Ala Leu Ser Ile Ser Ala Lys Glu Val Lys Ile Val 10 15 20
TTT TTA GAA ACT TCA GAC ATT CAT GGG CGG CTT TTT TCG TAT GAT TAT 210 Phe Leu Glu Thr Ser Asp Ile His Gly Arg Leu Phe Ser Tyr Asp Tyr 25 30 35 40
GCG ATT GGC GAG CAA AAA CCC AAT AAC GGC TTG ACA AGG ATT GCG ACT 258 Ala Ile Gly Glu Gln Lys Pro Asn Asn Gly Leu Thr Arg Ile Ala Thr 45 50 55
TTA ATC AAA AAG CAA AGG GCT GAG AAT AAA AAT GTG GTT TTG ATT GAC 306 Leu Ile Lys Lys Gln Arg Ala Glu Asn Lys Asn Val Val Leu Ile Asp 60 65 70
AGC GGG GAT TTG TTG CAA GGC AAT AGC GCG GAG TTG TTT AAT GAT GAG 354 Ser Gly Asp Leu Leu Gln Gly Asn Ser Ala Glu Leu Phe Asn Asp Glu 75 80 85
CCA ATT CAT CCG CTA GTT AGA GCT GAA AAC GAT TTG AAA TTT GAC ATT 402 Pro Ile His Pro Leu Val Arg Ala Glu Asn Asp Leu Lys Phe Asp Ile 90 95 100
CGT GTG CTT GGC AAT CAC GAG TTT AAT TTC AGT AAA GAT TTT TTA GAA 450 Arg Val Leu Gly Asn His Glu Phe Asn Phe Ser Lys Asp Phe Leu Glu 105 110 115 120
AAG AAT ATT AAG GGG TTT AAT GGC GAT GTC ATG AAT GCG AAT ATC ATT 498 Lys Asn Ile Lys Gly Phe Asn Gly Asp Val Met Asn Ala Asn Ile Ile 125 130 135
AAA ATT GCG GAC AAT AAG CCG TTT GTA AAA CCT TAT ATT ATT AAA AAA 546 Lys Ile Ala Asp Asn Lys Pro Phe Val Lys Pro Tyr Ile Ile Lys Lys 140 145 150
ATT GAT GGC GTG AGG GTG GCG GTT GTG GGG TAT GTG GTG GCG CAC ATC 594 Ile Asp Gly Val Arg Val Ala Val Val Gly Tyr Val Val Ala His Ile 155 160 165
CCA ACT TGG GAG GCC TCT ACG CCT GAA CAT TTT GCA GGA TTG AAG TTT 642 Pro Thr Trp Glu Ala Ser Thr Pro Glu His Phe Ala Gly Leu Lys Phe 170 175 180
TTG GAC GCT GAA GAA GCG TTA AAA AAG ACC TTA AAA GAG TTG AAA GGG 690 Leu Asp Ala Glu Glu Ala Leu Lys Lys Thr Leu Lys Glu Leu Lys Gly 185 190 195 200
AAG TAT GAT ATT TTG ATT GGC GCT TTT CAT TTG GGG CGA GAA GAT GAG 738 Lys Tyr Asp Ile Leu Ile Gly Ala Phe His Leu Gly Arg Glu Asp Glu 205 210 215
AAA GGT GGC GAC GGG ATA CCG GAT TTA GCG AAA AAA TTC CCG CAA TTT 786 Lys Gly Gly Asp Gly Ile Pro Asp Leu Ala Lys Lys Phe Pro Gln Phe 220 225 230
GAC ATC ATT TTT GCA GGG CAT GAG CAT GCG GTT TAT AAC ACC AAA GTA 834 Asp Ile Ile Phe Ala Gly His Glu His Ala Val Tyr Asn Thr Lys Val 235 240 245
GGG AAA GTG CAT ACC ATT GAG CCT GGA GCG TAT GGG GCT TAT CTG GCA 882 Gly Lys Val His Thr Ile Glu Pro Gly Ala Tyr Gly Ala Tyr Leu Ala 250 255 260
AAG GGC GTG GTG GTA TTT GAC ACT AAA ACG AAG AAA AAA ATT ATA ACG 930 Lys Gly Val Val Val Phe Asp Thr Lys Thr Lys Lys Lys Ile Ile Thr 265 270 275 280
ACT GAA AAT TTA CCC ACA AAA GAT GTG CCA GAA GAT GAA GAA TTA GCG 978 Thr Glu Asn Leu Pro Thr Lys Asp Val Pro Glu Asp Glu Glu Leu Ala 285 290 295
AAA AAA TAC GAA TAT GTG GAT AAA AAA TCA AAA GAA TAC GCT AAT GAA 1026 Lys Lys Tyr Glu Tyr Val Asp Lys Lys Ser Lys Glu Tyr Ala Asn Glu 300 305 310
GTG GTT GGC GAA GTT ACA AAA ACC TTT ATT GAC AGG CCT GAT TTT ATC 1074
Val Val Gly Glu Val Thr Lys Thr Phe Ile Asp Arg Pro Asp Phe Ile
315 320 325
ACA GGA GAA GAA AAA ATC ACC ACG ATG CCC ACC GCC GCC TTG CAA GAA 1122
Thr Gly Glu Glu Lys Ile Thr Thr Met Pro Thr Ala Ala Leu Gln Glu
330 335 340
ACA CCG GTG ATA GAA TTG ATT AAT AAA GTG CAA AAA TAT TAC GCA AAA 1170 Thr Pro Val Ile Glu Leu Ile Asn Lys Val Gln Lys Tyr Tyr Ala Lys 345 350 355 360
GCC GAT GTT TCA GCG GCA GCC TTA TTC AAT TTT GGG GCG AAT TTG AAA 1218 Ala Asp Val Ser Ala Ala Ala Leu Phe Asn Phe Gly Ala Asn Leu Lys 365 370 375
AAA GGG CCT TTC AAA AGA AAA GAT GTC ACT TAT ATT TAC AAG TTC GCT 1266 Lys Gly Pro Phe Lys Arg Lys Asp Val Thr Tyr Ile Tyr Lys Phe Ala 380 385 390
AAT ACG CTC ATT GGA GTG CGT ATA ACG GGT GAA AAT CTG TTG AAA TAC 1314 Asn Thr Leu Ile Gly Val Arg Ile Thr Gly Glu Asn Leu Leu Lys Tyr 395 400 405
ATG GAA TGG TCA TAC CGA TTT TAC AAT CAG TTG CAA CCA GGA GAT TTG 1362 Met Glu Trp Ser Tyr Arg Phe Tyr Asn Gln Leu Gln Pro Gly Asp Leu 410 415 420
ACG ATC AGT TTT AAT GAA AAC ATT CGC GGC TAT AAC TTT GAT ATG TTT 1410 Thr Ile Ser Phe Asn Glu Asn Ile Arg Gly Tyr Asn Phe Asp Met Phe 425 430 435 440
TCT GGC GTG AAA TAC CAG GTT GAT GTT ACA AAA CCC GCC GGA CAA AGG 1458 Ser Gly Val Lys Tyr Gln Val Asp Val Thr Lys Pro Ala Gly Gln Arg 445 450 455
ATT ATC AAT CCG ACA ATC AAC AAC AAA CCC ATT GAC CCC AAA GCC ATC 1506 Ile Ile Asn Pro Thr Ile Asn Asn Lys Pro Ile Asp Pro Lys Ala Ile 460 465 470
TAT AAA TTA GCG ATC AAC AAT TAC CGA TTC GGA ACA TTA TCC ACG ACA 1554 Tyr Lys Leu Ala Ile Asn Asn Tyr Arg Phe Gly Thr Leu Ser Thr Thr 475 480 485
TTG AAT TTG GTT ACA GAC GCT GMT AGG TAT TAT AAT TCT TAC GAT GAA 1602 Leu Asn Leu Val Thr Asp Ala Xaa Arg Tyr Tyr Asn Ser Tyr Asp Glu 490 495 500
CTG CAA GAT AAT GGG CAA ATA CGA GAT TTG ATC ATC AAA TAC ATC ACG 1650 Leu Gln Asp Asn Gly Gln Ile Arg Asp Leu Ile Ile Lys Tyr Ile Thr 505 510 515 520
GAA GAA AAA GGT GGG AAG GTA ACC CCT GAA TTG GAG GGT AAT TGG GAA 1698 Glu Glu Lys Gly Gly Lys Val Thr Pro Glu Leu Glu Gly Asn Trp Glu 525 530 535
ATC ATC AAC TAC GAT TTC AAA AAC CCG TTG TTG GAA AAA TTG AGA GAA 1746 Ile Ile Asn Tyr Asp Phe Lys Asn Pro Leu Leu Glu Lys Leu Arg Glu 540 545 550
AAA TTA AAA GAG GGG AGC ATC AAA ATC CCC ACC TCA AAG GAT GGG AGG 1794 Lys Leu Lys Glu Gly Ser Ile Lys Ile Pro Thr Ser Lys Asp Gly Arg 555 560 565
ACT TTG AAT GTC AAA TCC ATT AAA GAG AGT GAA GTT AAA TAAAATT 1840 Thr Leu Asn Val Lys Ser Ile Lys Glu Ser Glu Val Lys 570 575 580
TTTTATTTTT ATTATTTTAT CTTTAAGCCT AACTTAAAAA AGG 1883
(2) INFORMATION FOR SEQ ID NO: 72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
Met Lys Lys Leu Val Leu Val Ile Phe Leu Thr Leu Ala Leu Ser Ile
1 5 10 15
Ser Ala Lys Glu Val Lys Ile Val Phe Leu Glu Thr Ser Asp Ile His
20 25 30
Gly Arg Leu Phe Ser Tyr Asp Tyr Ala Ile Gly Glu Gln Lys Pro Asn
35 40 45
Asn Gly Leu Thr Arg Ile Ala Thr Leu Ile Lys Lys Gln Arg Ala Glu
50 55 60
Asn Lys Asn Val Val Leu Ile Asp Ser Gly Asp Leu Leu Gln Gly Asn
65 70 75 80
Ser Ala Glu Leu Phe Asn Asp Glu Pro Ile His Pro Leu Val Arg Ala
85 90 95
Glu Asn Asp Leu Lys Phe Asp Ile Arg Val Leu Gly Asn His Glu Phe
100 105 110
Asn Phe Ser Lys Asp Phe Leu Glu Lys Asn Ile Lys Gly Phe Asn Gly
115 120 125
Asp Val Met Asn Ala Asn Ile Ile Lys Ile Ala Asp Asn Lys Pro Phe
130 135 140
Val Lys Pro Tyr Ile Ile Lys Lys Ile Asp Gly Val Arg Val Ala Val
145 150 155 160
Val Gly Tyr Val Val Ala His Ile Pro Thr Trp Glu Ala Ser Thr Pro
165 170 175
Glu His Phe Ala Gly Leu Lys Phe Leu Asp Ala Glu Glu Ala Leu Lys
180 185 190
Lys Thr Leu Lys Glu Leu Lys Gly Lys Tyr Asp Ile Leu Ile Gly Ala
195 200 205
Phe His Leu Gly Arg Glu Asp Glu Lys Gly Gly Asp Gly Ile Pro Asp
210 215 220
Leu Ala Lys Lys Phe Pro Gln Phe Asp Ile Ile Phe Ala Gly His Glu
225 230 235 240
His Ala Val Tyr Asn Thr Lys Val Gly Lys Val His Thr Ile Glu Pro
245 250 255
Gly Ala Tyr Gly Ala Tyr Leu Ala Lys Gly Val Val Val Phe Asp Thr
260 265 270
Lys Thr Lys Lys Lys Ile Ile Thr Thr Glu Asn Leu Pro Thr Lys Asp
275 280 285
Val Pro Glu Asp Glu Glu Leu Ala Lys Lys Tyr Glu Tyr Val Asp Lys
290 295 300
Lys Ser Lys Glu Tyr Ala Asn Glu Val Val Gly Glu Val Thr Lys Thr
305 310 315 320
Phe Ile Asp Arg Pro Asp Phe Ile Thr Gly Glu Glu Lys Ile Thr Thr
325 330 335
Met Pro Thr Ala Ala Leu Gln Glu Thr Pro Val Ile Glu Leu Ile Asn
340 345 350
Lys Val Gln Lys Tyr Tyr Ala Lys Ala Asp Val Ser Ala Ala Ala Leu
355 360 365
Phe Asn Phe Gly Ala Asn Leu Lys Lys Gly Pro Phe Lys Arg Lys Asp
370 375 380
Val Thr Tyr Ile Tyr Lys Phe Ala Asn Thr Leu Ile Gly Val Arg Ile
385 390 395 400
Thr Gly Glu Asn Leu Leu Lys Tyr Met Glu Trp Ser Tyr Arg Phe Tyr
405 410 415
Asn Gln Leu Gln Pro Gly Asp Leu Thr Ile Ser Phe Asn Glu Asn Ile
420 425 430
Arg Gly Tyr Asn Phe Asp Met Phe Ser Gly Val Lys Tyr Gln Val Asp
435 440 445
Val Thr Lys Pro Ala Gly Gln Arg Ile Ile Asn Pro Thr Ile Asn Asn
450 455 460
Lys Pro Ile Asp Pro Lys Ala Ile Tyr Lys Leu Ala Ile Asn Asn Tyr
465 470 475 480
Arg Phe Gly Thr Leu Ser Thr Thr Leu Asn Leu Val Thr Asp Ala Xaa
485 490 495
Arg Tyr Tyr Asn Ser Tyr Asp Glu Leu Gln Asp Asn Gly Gln Ile Arg
500 505 510
Asp Leu Ile Ile Lys Tyr Ile Thr Glu Glu Lys Gly Gly Lys Val Thr
515 520 525
Pro Glu Leu Glu Gly Asn Trp Glu Ile Ile Asn Tyr Asp Phe Lys Asn
530 535 540
Pro Leu Leu Glu Lys Leu Arg Glu Lys Leu Lys Glu Gly Ser Ile Lys
545 550 555 560
Ile Pro Thr Ser Lys Asp Gly Arg Thr Leu Asn Val Lys Ser Ile Lys
565 570 575
Glu Ser Glu Val Lys
580
(2) INFORMATION FOR SEQ ID NO: 73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1339 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 68...1252 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
CCAATCGTTT AATAGCGATT AAATATGACT ATATACACTA CAACAATAAG ATTTTGAAAG 60
GTTGGTA ATG GAA TCA GTA AAA ACA GGA AAA ACA AAT AAG GTT GGC AAG 109
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys 1 5 10
AAT ACA GAG ATG GCT AAT ACA AAG GCA AAT AAA GAG GCT CAT TTT AAA 157
Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys
15 20 25 30
CAA GCG AGC ACC ATT ACA AAT ATA ATC AGA TCA ATT CGT GGG ATT TTT 205
Gln Ala Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe 35 40 45
ACA AAA ATT GCA AAG AAA GTT AGA GGA CTT GTA AAA AAA CAC CCC AAG 253
Thr Lys Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys
50 55 60
AAA AGC AGT GCG GCA TTA GTA GTA TTG ACC CAT ATT GCG TGC AAG AAA 301
Lys Ser Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys 65 70 75
GCG AAA GAA TTA GAC GAT AAA GTC CAA GAT AAA TCC AAA CAA GCT GAA 349
Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu 80 85 90
AAA GAA AAT CAA ATC AAT TGG TGG AAA TAT TCA GGA TTA ACA ATA GCG 397
Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala
95 100 105 110
ACA AGT TTA TTA TTA GCC GCT TGT AGC ACT GGT GAT GTT AGT GAA CAA 445
Thr Ser Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln 115 120 125
ATA GAA CTA GAA CAA GAA AAA CAA AAG ACG AGC AAT ATA GAG ACT AAC 493
Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn
130 135 140
AAT CAA ATA AAA GTA GAA CAA GAA AAA CAA AAG ACA AGC AAT ATA GAG 541
Asn Gln Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu
145 150 155
ACT AAT AAT CAA ATA AAA GTA GAA CAA GAA CAA CAA AAG ACA AGC AAT 589
Thr Asn Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Ser Asn 160 165 170
ACA CAG AAA GAT TTG GTT AAA GAA CAG AAA GAT TTG GTT AAA GAA CAG 637 Thr Gln Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln 175 180 185 190
AAA GAT TTG GTT AAA GAA CAG AAA GAT TTG GTT AAA GAA CAG AAA GAT 685 Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp 195 200 205
TTG GTT AAA ACA CAG AAA GAT TTC ATT AAA TAT GTA GAA CAA AAT TGC 733 Leu Val Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys 210 215 220
CAA GAA AAT CAT AAT CAA TTC TTT ATT GAA AAA GGA GGA ATT AAG GCT 781 Gln Glu Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala 225 230 235
GGT ATT GGT ATA GAA GTA GAA GCT GAA TGC AAA ACC CCT AAA CCT GCA 829 Gly Ile Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala 240 245 250
AAA ACC AAT CAA ACC CCT ATC CAG CCA AAA CAC CTC CCA AAC TCT AAA 877 Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys 255 260 265 270
CAA CCC CGC TCT CAA AGA GGA TCA AAA GCG CAA GAG CTT ATC GCT TAT 925 Gln Pro Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr 275 280 285
TTG CAA AAA GAG CTA GAA TTT CTG CCC TAT TCG CAA AAA GCT ATC GCT 973 Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala 290 295 300
AAA CAA GTG GAT TTT TAC AGG CCA AGT TCT ATC GCT TAT TTA GAA CTA 1021 Lys Gln Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu 305 310 315
GAT CCT AGA GAT TTT AAG GTT ACA GAA GAA TGG CAA AAA GAA AAT CTA 1069 Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu 320 325 330
AAA ATA CGC TCT AAA GCT CAA GCT AAA ATG CTT GAA ATG AGA AAC CCA 1117 Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asn Pro 335 340 345 350
CAA GCC CAC CTT TCA AAC TCT CAA AGC CTT TTG TTC GTT CAA AAA ATA 1165 Gln Ala His Leu Ser Asn Ser Gln Ser Leu Leu Phe Val Gln Lys Ile 355 360 365
TTT GCT GAT GTT AAT AAA GAA ATA GAA GCA GTT GCT AAT ACT GAA AAG 1213 Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys 370 375 380
AAA GCA GAA AAA GCG GGT TAT GGT TAT AGT AAA AGG ATG TAGCGGTTAA AA 1264 Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met 385 390 395
ACATTGCACC AAGTTTTTAA TTATCTGTCG GCTTTTGAAA ACATTTTTTA TGGTAGCGTT 1324 ATTTGGCAAT AAAAG 1339
(2) INFORMATION FOR SEQ ID NO: 74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 395 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr
1 5 10 15
Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Ala His Phe Lys Gln Ala
20 25 30
Ser Thr Ile Thr Asn Ile Ile Arg Ser Ile Arg Gly Ile Phe Thr Lys
35 40 45
Ile Ala Lys Lys Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser
50 55 60
Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys
65 70 75 80
Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu
85 90 95
Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser
100 105 110
Leu Leu Leu Ala Ala Cys Ser Thr Gly Asp Val Ser Glu Gln Ile Glu
115 120 125
Leu Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn Asn Gln
130 135 140
Ile Lys Val Glu Gln Glu Lys Gln Lys Thr Ser Asn Ile Glu Thr Asn
145 150 155 160
Asn Gln Ile Lys Val Glu Gln Glu Gln Gln Lys Thr Ser Asn Thr Gln
165 170 175
Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp
180 185 190
Leu Val Lys Glu Gln Lys Asp Leu Val Lys Glu Gln Lys Asp Leu Val
195 200 205
Lys Thr Gln Lys Asp Phe Ile Lys Tyr Val Glu Gln Asn Cys Gln Glu
210 215 220
Asn His Asn Gln Phe Phe Ile Glu Lys Gly Gly Ile Lys Ala Gly Ile
225 230 235 240
Gly Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr
245 250 255
Asn Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro
260 265 270
Arg Ser Gln Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln
275 280 285
Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln
290 295 300
Val Asp Phe Tyr Arg Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro
305 310 315 320
Arg Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile
325 330 335
Arg Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asn Pro Gln Ala
340 345 350
His Leu Ser Asn Ser Gln Ser Leu Leu Phe Val Gln Lys Ile Phe Ala
355 360 365
Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu Lys Lys Ala
370 375 380
Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met
385 390 395
(2) INFORMATION FOR SEQ ID NO: 75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 904 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 70...864 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
TAATAACTCA ATCCCATTTG AATGGCATTT TTAAGCCAAA TTGCTACTAT CTTTGGCTAA 60 AGGTTAAAC ATG ATT AAA CAA ACC CTC ATC ATT CTT GCC CCT TTT TTT ATC 111 Met Ile Lys Gln Thr Leu Ile Ile Leu Ala Pro Phe Phe Ile 1 5 10
GCA ACG CTG TTG TAT TTT TTA GGC GCA CCG GAT GGG TTA AGA CCT AAC 159 Ala Thr Leu Leu Tyr Phe Leu Gly Ala Pro Asp Gly Leu Arg Pro Asn 15 20 25 30
GCT TGG CTT TAT TTT TGT ATT TTC ATG GGC ATG ATT ATA GGG CTA ATT 207 Ala Trp Leu Tyr Phe Cys Ile Phe Met Gly Met Ile Ile Gly Leu Ile 35 40 45
TTA GAG CCG GTG CCA TCA GGT TTA ATA GCG CTA AGC GCG TTA GTG CTG 255 Leu Glu Pro Val Pro Ser Gly Leu Ile Ala Leu Ser Ala Leu Val Leu 50 55 60
TGT ATA GCG TTA AAA ATT GGA GCG AGC GAT AAA GTA GCG AGC GCT AAT 303 Cys Ile Ala Leu Lys Ile Gly Ala Ser Asp Lys Val Ala Ser Ala Asn 65 70 75
AAG GCT ATT TCG TGG GGT TTG AGC GGG TAT GCG AAT AAA ACG GTG TGG 351 Lys Ala Ile Ser Trp Gly Leu Ser Gly Tyr Ala Asn Lys Thr Val Trp 80 85 90
CTT GTG TTT GTC GCT TTC ATT TTG GGT TTA GGG TAT GAA AAA AGC TTG 399 Leu Val Phe Val Ala Phe Ile Leu Gly Leu Gly Tyr Glu Lys Ser Leu 95 100 105 110
TTA GGG AAA CGG ATC GCT CTT TTA CTG ATT AGG TTT TTA GGG CAA ACC 447 Leu Gly Lys Arg Ile Ala Leu Leu Leu Ile Arg Phe Leu Gly Gln Thr 115 120 125
CCT TTA GGT TTA GGC TAT GCG ATT GGT TTG AGC GAA TTG TGT CTA GCC 495 Pro Leu Gly Leu Gly Tyr Ala Ile Gly Leu Ser Glu Leu Cys Leu Ala 130 135 140
CCT TTT ATC CCT AGC AAC TCC GCT AGA AGT GGA GGC ATA CTC TAT CCC 543 Pro Phe Ile Pro Ser Asn Ser Ala Arg Ser Gly Gly Ile Leu Tyr Pro 145 150 155
ATC GTT TCA TCT ATC CCG CCT TTA ATG GGA TCT ACT CCA AAT AAT AAC 591 Ile Val Ser Ser Ile Pro Pro Leu Met Gly Ser Thr Pro Asn Asn Asn 160 165 170
CCT GAC AAA ATC GGC GCG TAT TTG ATG TGG GTC GCT TTG GCT TCA ACT 639 Pro Asp Lys Ile Gly Ala Tyr Leu Met Trp Val Ala Leu Ala Ser Thr 175 180 185 190
TGC ATC ACT TCG TCC ATG TTT TTA ACC GCG CTC GCT CCT AAC CCC CTA 687 Cys Ile Thr Ser Ser Met Phe Leu Thr Ala Leu Ala Pro Asn Pro Leu 195 200 205
GCA ATG GAA ATC GCT GCC AAA ATG GGC GTG AAT GAA ATC TCA TGG TTT 735 Ala Met Glu Ile Ala Ala Lys Met Gly Val Asn Glu Ile Ser Trp Phe 210 215 220
TCG TGG TTT TTA GCG TTC TTG CCT TGT GGG GTG GTT TTG ATC TTG CTT 783 Ser Trp Phe Leu Ala Phe Leu Pro Cys Gly Val Val Leu Ile Leu Leu 225 230 235
GTG CCT TTA TTG GCG TAT AAA ACC TGC AAA CCC ACC TTA AAA GGC TCA 831 Val Pro Leu Leu Ala Tyr Lys Thr Cys Lys Pro Thr Leu Lys Gly Ser 240 245 250
AAA GAA GTG AGT TTG TGG GCC AAA AAA AGG AAT TAGAGGGCAT GGGGAGGTTT 884
Lys Glu Val Ser Leu Trp Ala Lys Lys Arg Asn
255 260 265
TCTTTAAAAG AAATTTTAAT 904
(2) INFORMATION FOR SEQ ID NO: 76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
Met Ile Lys Gln Thr Leu Ile Ile Leu Ala Pro Phe Phe Ile Ala Thr
1 5 10 15
Leu Leu Tyr Phe Leu Gly Ala Pro Asp Gly Leu Arg Pro Asn Ala Trp
20 25 30
Leu Tyr Phe Cys Ile Phe Met Gly Met Ile Ile Gly Leu Ile Leu Glu
35 40 45
Pro Val Pro Ser Gly Leu Ile Ala Leu Ser Ala Leu Val Leu Cys Ile
50 55 60
Ala Leu Lys Ile Gly Ala Ser Asp Lys Val Ala Ser Ala Asn Lys Ala
65 70 75 80
Ile Ser Trp Gly Leu Ser Gly Tyr Ala Asn Lys Thr Val Trp Leu Val
85 90 95
Phe Val Ala Phe Ile Leu Gly Leu Gly Tyr Glu Lys Ser Leu Leu Gly
100 105 110
Lys Arg Ile Ala Leu Leu Leu Ile Arg Phe Leu Gly Gln Thr Pro Leu
115 120 125
Gly Leu Gly Tyr Ala Ile Gly Leu Ser Glu Leu Cys Leu Ala Pro Phe
130 135 140
Ile Pro Ser Asn Ser Ala Arg Ser Gly Gly Ile Leu Tyr Pro Ile Val
145 150 155 160
Ser Ser Ile Pro Pro Leu Met Gly Ser Thr Pro Asn Asn Asn Pro Asp
165 170 175
Lys Ile Gly Ala Tyr Leu Met Trp Val Ala Leu Ala Ser Thr Cys Ile
180 185 190
Thr Ser Ser Met Phe Leu Thr Ala Leu Ala Pro Asn Pro Leu Ala Met
195 200 205
Glu Ile Ala Ala Lys Met Gly Val Asn Glu Ile Ser Trp Phe Ser Trp
210 215 220
Phe Leu Ala Phe Leu Pro Cys Gly Val Val Leu Ile Leu Leu Val Pro
225 230 235 240
Leu Leu Ala Tyr Lys Thr Cys Lys Pro Thr Leu Lys Gly Ser Lys Glu
245 250 255
Val Ser Leu Trp Ala Lys Lys Arg Asn
260 265
(2) INFORMATION FOR SEQ ID NO: 77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1194 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 152...1069 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
TTTTAAGCGG TTTCCCTAAA ATAGGTTTTT AATCAATTTA ATCCAAAGTT GAATTTATTT 60 TTTGACAATA TTATACTATA ATAACCAATT AGATTGGGGT TTTACTGATT TTTCTTTGTG 120 TGAGCTTTGG CTTAGTTTTG TAAGGAATGA G ATG ATA AAG AGT TGG ACT AAA 172
Met Ile Lys Ser Trp Thr Lys 1 5
AAG TGG TTT TTG ATT TTA TTT TTA ATG GCA AGT TGT TCC AGT TAT TTG 220 Lys Trp Phe Leu Ile Leu Phe Leu Met Ala Ser Cys Ser Ser Tyr Leu 10 15 20
GTG GCT ACA ACC GGT GAG AAA TAT TTT AAA ATG GCT ACT CAA GCC TTT 268 Val Ala Thr Thr Gly Glu Lys Tyr Phe Lys Met Ala Thr Gln Ala Phe 25 30 35
AAG AGA GGG GAC TAC CAT AAA GCG GTG GCT TTT TAT AAG AGG AGC TGT 316 Lys Arg Gly Asp Tyr His Lys Ala Val Ala Phe Tyr Lys Arg Ser Cys 40 45 50 55
AAT TTA AGG GTG GGG GTT GGT TGC ACG AGT TTA GGC TCT ATG TAT GAA 364 Asn Leu Arg Val Gly Val Gly Cys Thr Ser Leu Gly Ser Met Tyr Glu 60 65 70
GAT GGC GAT GGC GTG GAT CAG AAT ATT ACA AAA GCC GTT TTT TAT TAC 412 Asp Gly Asp Gly Val Asp Gln Asn Ile Thr Lys Ala Val Phe Tyr Tyr 75 80 85
AGA AGA GGG TGT AAT TTA AGG AAT CAT CTC GCT TGC GCG AGT CTA GGC 460 Arg Arg Gly Cys Asn Leu Arg Asn His Leu Ala Cys Ala Ser Leu Gly 90 95 100
TCT ATG TAT GAA GAT GGC GAT GGT GTG CAA AAA AAC CTT CCA AAG GCT 508 Ser Met Tyr Glu Asp Gly Asp Gly Val Gln Lys Asn Leu Pro Lys Ala 105 110 115
ATC TAT TAT TAC AGG AGA GGG TGC CAC TTA AAG GGT GGG GTG AGC TGT 556 Ile Tyr Tyr Tyr Arg Arg Gly Cys His Leu Lys Gly Gly Val Ser Cys 120 125 130 135
GGG AGT TTA GGT TTT ATG TAT TTT AAT GGC ACG GGC GTT AAG CAA AAT 604 Gly Ser Leu Gly Phe Met Tyr Phe Asn Gly Thr Gly Val Lys Gln Asn 140 145 150
TAT GCC AAA GCC CTT TTT CTT TCT AAA TAC GCT TGC AGT TTG AAT TAC 652 Tyr Ala Lys Ala Leu Phe Leu Ser Lys Tyr Ala Cys Ser Leu Asn Tyr 155 160 165
GGC ATT AGT TGT AAC TTT GTA GGG TAT ATG TAT AGG AAC GCC AAA GGC 700
Gly Ile Ser Cys Asn Phe Val Gly Tyr Met Tyr Arg Asn Ala Lys Gly 170 175 180
GTA CAG AAG GAT TTG AAA AAA GCC CTT GCG AAT TTT AAA AGA GGG TGC 748 Val Gln Lys Asp Leu Lys Lys Ala Leu Ala Asn Phe Lys Arg Gly Cys 185 190 195
CAT TTG AAA GAC GGA GCG AGT TGT GTG AGC TTG GGA TAC ATG TAT GAA 796 His Leu Lys Asp Gly Ala Ser Cys Val Ser Leu Gly Tyr Met Tyr Glu 200 205 210 215
GTC GGT ATG GAT GTC AAA CAA AAT GGA GAG CAA GCC TTG AAT CTT TAT 844 Val Gly Met Asp Val Lys Gln Asn Gly Glu Gln Ala Leu Asn Leu Tyr 220 225 230
AAA AAG GGT TGT TAT TTA AAA AGG GGG AGC GGT TGT CAT AAT GTG GCG 892 Lys Lys Gly Cys Tyr Leu Lys Arg Gly Ser Gly Cys His Asn Val Ala 235 240 245
GTG ATG TAT TAC ACC GGT AAG GGC GTT CCA AAG GAT TTA GAT AAA GCC 940 Val Met Tyr Tyr Thr Gly Lys Gly Val Pro Lys Asp Leu Asp Lys Ala 250 255 260
ATT TCG TAT TAT AAG AAA GGT TGC ACT CTA GGC TTT AGT GGT AGC TGT 988 Ile Ser Tyr Tyr Lys Lys Gly Cys Thr Leu Gly Phe Ser Gly Ser Cys 265 270 275
AAA GTG TTA GAA GAA GTG ATT GGC AAG AAG TCT GAT GAT TTG CAA GAT 1036 Lys Val Leu Glu Glu Val Ile Gly Lys Lys Ser Asp Asp Leu Gln Asp 280 285 290 295
GAC GCG CAA AAC GAC ACG CAA GAT GAT ATG CAA TAAGTTAAAG CTTATGGACT 1089 Asp Ala Gln Asn Asp Thr Gln Asp Asp Met Gln 300 305
AATGATTAAA ACTCATCTTA TAGAAATCTT TCTACTCTCT TGTTATCAAA TAGGGATTAA 1149 GCGTCTCTAT TGATGGGTAT TGAGACTAAA AATCTGCAAA TCTAG 1194
(2) INFORMATION FOR SEQ ID NO: 78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
Met Ile Lys Ser Trp Thr Lys Lys Trp Phe Leu Ile Leu Phe Leu Met
1 5 10 15
Ala Ser Cys Ser Ser Tyr Leu Val Ala Thr Thr Gly Glu Lys Tyr Phe
20 25 30
Lys Met Ala Thr Gln Ala Phe Lys Arg Gly Asp Tyr His Lys Ala Val
35 40 45
Ala Phe Tyr Lys Arg Ser Cys Asn Leu Arg Val Gly Val Gly Cys Thr
50 55 60
Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val Asp Gln Asn Ile
65 70 75 80
Thr Lys Ala Val Phe Tyr Tyr Arg Arg Gly Cys Asn Leu Arg Asn His
85 90 95
Leu Ala Cys Ala Ser Leu Gly Ser Met Tyr Glu Asp Gly Asp Gly Val
100 105 110
Gln Lys Asn Leu Pro Lys Ala Ile Tyr Tyr Tyr Arg Arg Gly Cys His
115 120 125
Leu Lys Gly Gly Val Ser Cys Gly Ser Leu Gly Phe Met Tyr Phe Asn
130 135 140
Gly Thr Gly Val Lys Gln Asn Tyr Ala Lys Ala Leu Phe Leu Ser Lys
145 150 155 160
Tyr Ala Cys Ser Leu Asn Tyr Gly Ile Ser Cys Asn Phe Val Gly Tyr
165 170 175
Met Tyr Arg Asn Ala Lys Gly Val Gln Lys Asp Leu Lys Lys Ala Leu
180 185 190
Ala Asn Phe Lys Arg Gly Cys His Leu Lys Asp Gly Ala Ser Cys Val
195 200 205
Ser Leu Gly Tyr Met Tyr Glu Val Gly Met Asp Val Lys Gln Asn Gly
210 215 220
Glu Gln Ala Leu Asn Leu Tyr Lys Lys Gly Cys Tyr Leu Lys Arg Gly
225 230 235 240
Ser Gly Cys His Asn Val Ala Val Met Tyr Tyr Thr Gly Lys Gly Val
245 250 255
Pro Lys Asp Leu Asp Lys Ala Ile Ser Tyr Tyr Lys Lys Gly Cys Thr
260 265 270
Leu Gly Phe Ser Gly Ser Cys Lys Val Leu Glu Glu Val Ile Gly Lys
275 280 285
Lys Ser Asp Asp Leu Gln Asp Asp Ala Gln Asn Asp Thr Gln Asp Asp
290 295 300
Met Gln
305
(2) INFORMATION FOR SEQ ID NO: 79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1001 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 101...865 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
TTTGTTTATA AGAAAAATTA TTTCAAATGT AGTAGAATTA AGGCAGTGTT TTTGCGTCAA 60 GCGATTTTAG GTTAATTTTG AGTTTTTAGG AGCAGTTTTT ATG CAA CAA GAA GAG 115
Met Gln Gln Glu Glu 1 5
ATT ATA GAG GGT TAT TAT GGT GCT AGC AAA GGG CTT AAA AAG AGC GGT 163 Ile Ile Glu Gly Tyr Tyr Gly Ala Ser Lys Gly Leu Lys Lys Ser Gly 10 15 20
ATT TAT GCC AAG CTG GAT TTT TTA CAG AGC GCT ACG GGC TTG ATT TTA 211 Ile Tyr Ala Lys Leu Asp Phe Leu Gln Ser Ala Thr Gly Leu Ile Leu 25 30 35
GCG CTC TTT ATG ATA GCA CAC ATG TTT TTA GTC TCA AGT ATC TTG ATT 259 Ala Leu Phe Met Ile Ala His Met Phe Leu Val Ser Ser Ile Leu Ile 40 45 50
AGC GAT GAA GCC ATG TAT AAA GTG GCG AAA TTT TTT GAA GGG AGC TTG 307 Ser Asp Glu Ala Met Tyr Lys Val Ala Lys Phe Phe Glu Gly Ser Leu 55 60 65
TTT TTA AAA GCG GGC GAG CCG GCT ATT GTG AGC GTG GTT GCA GCA GGG 355 Phe Leu Lys Ala Gly Glu Pro Ala Ile Val Ser Val Val Ala Ala Gly 70 75 80 85
ATT ATT CTT ATT TTA GTC GCG CAT GCT TTT TTG GCG TTA AGG AAA TTC 403 Ile Ile Leu Ile Leu Val Ala His Ala Phe Leu Ala Leu Arg Lys Phe 90 95 100
CCT ATC AAT TAC AGG CAA TAC AAG GTT TTT AAA ACC CAT AAG CAT TTG 451 Pro Ile Asn Tyr Arg Gln Tyr Lys Val Phe Lys Thr His Lys His Leu 105 110 115
ATG AAA CAT GGC GAT ACG AGC TTG TGG TTT ATT CAA GCC CTC ACC GGG 499 Met Lys His Gly Asp Thr Ser Leu Trp Phe Ile Gln Ala Leu Thr Gly 120 125 130
TTT GCG ATG TTT TTC TTA GCG AGT ATC CAC TTA TTT GTC ATG CTC ACA 547 Phe Ala Met Phe Phe Leu Ala Ser Ile His Leu Phe Val Met Leu Thr 135 140 145
GAG CCT GAA AGT ATT GGG CCT CAT GGT TCA AGC TAT CGT TTT GTA ACG 595
Glu Pro Glu Ser Ile Gly Pro His Gly Ser Ser Tyr Arg Phe Val Thr
150 155 160 165
CAA AAC TTT TGG CTT TTG TAT ATT TTC TTA TTG TTT GCC GTA GAA TTG 643
Gln Asn Phe Trp Leu Leu Tyr Ile Phe Leu Leu Phe Ala Val Glu Leu 170 175 180
CAT GGC TCT ATT GGG TTG TAT CGT TTA GCG ATC AAA TGG GGG TGG TTT 691 His Gly Ser Ile Gly Leu Tyr Arg Leu Ala Ile Lys Trp Gly Trp Phe 185 190 195
AAA AAT GTG AGC ATT CAA GGT TTG AGA AAA GTC AAA TGG GCG ATG AGC 739 Lys Asn Val Ser Ile Gln Gly Leu Arg Lys Val Lys Trp Ala Met Ser 200 205 210
GTG TTT TTT ATT GTT TTA GGG CTT TGC ACC TAT GGG GCT TAC ATT AAA 787
Val Phe Phe Ile Val Leu Gly Leu Cys Thr Tyr Gly Ala Tyr Ile Lys 215 220 225
AAA GGT TTA GAA AAT AAG GAA AAT GGC ATT AAA ACC ATG CAA GAA GCC 835 Lys Gly Leu Glu Asn Lys Glu Asn Gly Ile Lys Thr Met Gln Glu Ala 230 235 240 245
ATA GAA GCT GAT GGG AAA TTC CAC AAA GAA TAAGGGTAGA AAATGAAAAT AAC 888 Ile Glu Ala Asp Gly Lys Phe His Lys Glu 250 255
ATATTGTGAT GCGCTAATTA TTGGAGGCGG ACTAGCTGGG TTAAGGGCTA GTATCGCATG 948 CAAACAAAAG GGTTTAAACA CCATCGTTTT AAGCCTAGTG CCTGTCAGGC GTT 1001
(2) INFORMATION FOR SEQ ID NO: 80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 80 :
Met Gln Gln Glu Glu Ile Ile Glu Gly Tyr Tyr Gly Ala Ser Lys Gly
1 5 10 15
Leu Lys Lys Ser Gly Ile Tyr Ala Lys Leu Asp Phe Leu Gln Ser Ala
20 25 30
Thr Gly Leu Ile Leu Ala Leu Phe Met Ile Ala His Met Phe Leu Val
35 40 45
Ser Ser Ile Leu Ile Ser Asp Glu Ala Met Tyr Lys Val Ala Lys Phe
50 55 60
Phe Glu Gly Ser Leu Phe Leu Lys Ala Gly Glu Pro Ala Ile Val Ser
65 70 75 80
Val Val Ala Ala Gly Ile Ile Leu Ile Leu Val Ala His Ala Phe Leu
85 90 95
Ala Leu Arg Lys Phe Pro Ile Asn Tyr Arg Gln Tyr Lys Val Phe Lys
100 105 110
Thr His Lys His Leu Met Lys His Gly Asp Thr Ser Leu Trp Phe Ile
115 120 125
Gln Ala Leu Thr Gly Phe Ala Met Phe Phe Leu Ala Ser Ile His Leu
130 135 140
Phe Val Met Leu Thr Glu Pro Glu Ser Ile Gly Pro His Gly Ser Ser
145 150 155 160
Tyr Arg Phe Val Thr Gln Asn Phe Trp Leu Leu Tyr Ile Phe Leu Leu
165 170 175
Phe Ala Val Glu Leu His Gly Ser Ile Gly Leu Tyr Arg Leu Ala Ile
180 185 190
Lys Trp Gly Trp Phe Lys Asn Val Ser Ile Gln Gly Leu Arg Lys Val
195 200 205
Lys Trp Ala Met Ser Val Phe Phe Ile Val Leu Gly Leu Cys Thr Tyr
210 215 220
Gly Ala Tyr Ile Lys Lys Gly Leu Glu Asn Lys Glu Asn Gly Ile Lys
225 230 235 240
Thr Met Gln Glu Ala Ile Glu Ala Asp Gly Lys Phe His Lys Glu
245 250 255
(2) INFORMATION FOR SEQ ID NO: 81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 975 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 82...912 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
TTTTAAAATT AAAGAAAATT TTTTTTAAAG ATTATCACTC TTTTTTGATA AAGTAATCAT 60
TTAAAATTTA GGGAGTTTTT T ATG GAA GAA TCA ACA GCG TTT ATT TTG GCT 111
Met Glu Glu Ser Thr Ala Phe Ile Leu Ala 1 5 10
CTT GTG GGG CTA TTC ACC GGC ATT ACC GCC GGG TTT TTT GGT ATT GGT 159
Leu Val Gly Leu Phe Thr Gly Ile Thr Ala Gly Phe Phe Gly Ile Gly 15 20 25
GGG GGG GAG ATT GTC GTC CCT AGC GCG ATT TTT GCC CAT TTT AGC TAT 207
Gly Gly Glu Ile Val Val Pro Ser Ala Ile Phe Ala His Phe Ser Tyr 30 35 40
AGC CAT GCG GTG GGT ATT TCG CTC ATG CAA ATG CTT TTT TCT TCA GTG 255
Ser His Ala Val Gly Ile Ser Leu Met Gln Met Leu Phe Ser Ser Val
45 50 55
GTC GGC TCT ATC ATC AAT TAC AAA AAG GGC TTA TTG GAT TTG AGA GAA 303
Val Gly Ser Ile Ile Asn Tyr Lys Lys Gly Leu Leu Asp Leu Arg Glu
60 65 70
GGC TCA TTT GCC GCG CTT GGA GGG CTA ATG GGA GCG ATT TTA GGG AGC 351
Gly Ser Phe Ala Ala Leu Gly Gly Leu Met Gly Ala Ile Leu Gly Ser 75 80 85 90
TTT ATC TTA AAA ATC ATT GAC GAT AAA ATT TTA ATG GCG GTG TTT GTG 399
Phe Ile Leu Lys Ile Ile Asp Asp Lys Ile Leu Met Ala Val Phe Val 95 100 105
GTG GTG GTG TGC TAC ACC TTT ATC AAA TAC GCT TTT TCT AGC AAC AAG 447
Val Val Val Cys Tyr Thr Phe Ile Lys Tyr Ala Phe Ser Ser Asn Lys 110 115 120
AAA CCC AAG CAT TTT GAA GAA ATG CAT TTT GAT TTG CAT GCG AAT AAC 495 Lys Pro Lys His Phe Glu Glu Met His Phe Asp Leu His Ala Asn Asn 125 130 135
AAA ACG CCC GAA AAA AAG CGC GCA ATC CCT TTT GTG TCT ATG GAT AGA 543 Lys Thr Pro Glu Lys Lys Arg Ala He Pro Phe Val Ser Met Asp Arg 140 145 150
ACG CAT GGG GTT TTG ATG CTC GCC GGT TTT GTT ACC GGC ATC TTT TCT 591 Thr His Gly Val Leu Met Leu Ala Gly Phe Val Thr Gly He Phe Ser 155 160 165 170
ATC CCA CTA GGC ATG GGT GGA GGG ATT TTA ATG GTG CCG TTT TTG GGC 639 He Pro Leu Gly Met Gly Gly Gly He Leu Met Val Pro Phe Leu Gly 175 180 185
TAT TTT TTG AAA TAC GAT TCT AAA AAA ATC GTG CCT TTG GGG CTA TTT 687 Tyr Phe Leu Lys Tyr Asp Ser Lys Lys He Val Pro Leu Gly Leu Phe 190 195 200
TTT GTG GTG TTC GCT TCT TTA TCT GGG GTC ATC TCT CTT TAT AAC GGG 735 Phe Val Val Phe Ala Ser Leu Ser Gly Val He Ser Leu Tyr Asn Gly 205 210 215
AGG GTT CTT GAT AAT ATA AGC GTT CAA GCG GGG GTG ATT ACC GGC ATT 783 Arg Val Leu Asp Asn He Ser Val Gin Ala Gly Val He Thr Gly He 220 225 230
GGA GCG TTT TTA GGC GTG GGC ATT GGC ATC AAG CTT ATC GCT TTG GCT 831 Gly Ala Phe Leu Gly Val Gly He Gly He Lys Leu He Ala Leu Ala 235 240 245 250
AAT GAA AAG GTG CAT AAA ATC CTG TTG CTC CTT ATT TAT GCT TTA AGC 879 Asn Glu Lys Val His Lys He Leu Leu Leu Leu He Tyr Ala Leu Ser 255 260 265
ATT TTA GCG ACT TTA CAC AAG CTC ATT ATG GGG TAAATCTAAA AACGCTTCTA 932 He Leu Ala Thr Leu His Lys Leu He Met Gly 270 275
GGGCATTTTT AAAATTAATA TCAAAGAGCT TTCACCAGCA AGC 975
(2) INFORMATION FOR SEQ ID NO: 82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
Met Glu Glu Ser Thr Ala Phe Ile Leu Ala Leu Val Gly Leu Phe Thr
1 5 10 15
Gly Ile Thr Ala Gly Phe Phe Gly Ile Gly Gly Gly Glu Ile Val Val
20 25 30
Pro Ser Ala Ile Phe Ala His Phe Ser Tyr Ser His Ala Val Gly Ile
35 40 45
Ser Leu Met Gln Met Leu Phe Ser Ser Val Val Gly Ser Ile Ile Asn
50 55 60
Tyr Lys Lys Gly Leu Leu Asp Leu Arg Glu Gly Ser Phe Ala Ala Leu
65 70 75 80
Gly Gly Leu Met Gly Ala Ile Leu Gly Ser Phe Ile Leu Lys Ile Ile
85 90 95
Asp Asp Lys Ile Leu Met Ala Val Phe Val Val Val Val Cys Tyr Thr
100 105 110
Phe Ile Lys Tyr Ala Phe Ser Ser Asn Lys Lys Pro Lys His Phe Glu
115 120 125
Glu Met His Phe Asp Leu His Ala Asn Asn Lys Thr Pro Glu Lys Lys
130 135 140
Arg Ala Ile Pro Phe Val Ser Met Asp Arg Thr His Gly Val Leu Met
145 150 155 160
Leu Ala Gly Phe Val Thr Gly Ile Phe Ser Ile Pro Leu Gly Met Gly
165 170 175
Gly Gly Ile Leu Met Val Pro Phe Leu Gly Tyr Phe Leu Lys Tyr Asp
180 185 190
Ser Lys Lys Ile Val Pro Leu Gly Leu Phe Phe Val Val Phe Ala Ser
195 200 205
Leu Ser Gly Val Ile Ser Leu Tyr Asn Gly Arg Val Leu Asp Asn Ile
210 215 220
Ser Val Gln Ala Gly Val Ile Thr Gly Ile Gly Ala Phe Leu Gly Val
225 230 235 240
Gly Ile Gly Ile Lys Leu Ile Ala Leu Ala Asn Glu Lys Val His Lys
245 250 255
Ile Leu Leu Leu Leu Ile Tyr Ala Leu Ser Ile Leu Ala Thr Leu His
260 265 270
Lys Leu Ile Met Gly
275
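As a consistency check on the listing, the coding-sequence LOCATION of each nucleotide entry can be compared with the LENGTH of the corresponding protein entry. The sketch below is illustrative only and is not part of the original listing. The helper name codon_count is hypothetical, and it assumes that LOCATION is 1-based, inclusive, and excludes the stop codon, which is what the entries in this listing appear to follow.

```python
# Illustrative check, not part of the original sequence listing.
# Assumption: LOCATION is 1-based, inclusive, and excludes the stop codon.

def codon_count(start, end):
    """Return the number of whole codons in a coding sequence start..end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3

# (nucleotide SEQ ID NO, CDS start, CDS end, protein SEQ ID NO, protein length)
# Values copied from the SEQUENCE CHARACTERISTICS and FEATURE fields.
ENTRIES = [
    (81, 82, 912, 82, 277),
    (83, 220, 1482, 84, 421),
    (85, 207, 746, 86, 180),
    (87, 151, 1299, 88, 383),
    (89, 51, 464, 90, 138),
    (91, 68, 1600, 92, 511),
]

for nt_id, start, end, aa_id, aa_len in ENTRIES:
    codons = codon_count(start, end)
    status = "consistent" if codons == aa_len else "MISMATCH"
    print(f"SEQ ID NO: {nt_id} -> SEQ ID NO: {aa_id}: "
          f"{codons} codons vs {aa_len} amino acids ({status})")
```

A mismatch reported by such a check would point to a transcription error in either the LOCATION field or the LENGTH field of the affected pair.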
(2) INFORMATION FOR SEQ ID NO: 83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1667 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 220...1482 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
AAGCGCGAGC TATATGAGGA ATTTTAGCTT CTATGTGGGC TATTCAGTCG GTTTTTAAGG 60
AAGGCTCTTG ATGAAAAATA CCAATACAAA AGAGATAAAG AATACAAGAA TGAAAAAAGG 120
TTATAGTCAA TACCACGCGC TCAAAAAAGG GCTTTTAAAA ACGCTCTGCT TTTTAGCCTT 180
CCTTTAAGCG TGGCGTTAGC TGAAGACGAT GGCTTTTAT ATG GGA GTG GGC TAT 234
Met Gly Val Gly Tyr 1 5
CAA ATC GGC GGC GCG CAA CAA AAT ATC GAT AAC AAA GGC AGC ACC CTA 282 Gin He Gly Gly Ala Gin Gin Asn He Asp Asn Lys Gly Ser Thr Leu 10 15 20
AGG AAT AAT GTC ATT AAT AAT TTC CGC CAA GTG GGC GTG GGT ATG GCA 330 Arg Asn Asn Val He Asn Asn Phe Arg Gin Val Gly Val Gly Met Ala 25 30 35
GGG GGT AAT GGG CTT TTA GCC TTA GCG ACA AAC ACG ACC ATG GAC GCT 378 Gly Gly Asn Gly Leu Leu Ala Leu Ala Thr Asn Thr Thr Met Asp Ala 40 45 50
CTT TTA GGG ATA GGC AAC CAA ATT GTC AAT ACT AAT ACA ACT GTT AGC 426 Leu Leu Gly He Gly Asn Gin He Val Asn Thr Asn Thr Thr Val Ser 55 60 65
AAC AAC AAC GCA GAA TTA ACC CAG TTT AAA AAA ATA CTC CCT CAA ATT 474 Asn Asn Asn Ala Glu Leu Thr Gin Phe Lys Lys He Leu Pro Gin He 70 75 80 85
GAG CAA CGC TTT GAA ACG AAT AAA AAC GCT TAT AGC GTT CAA GCC TTG 522 Glu Gin Arg Phe Glu Thr Asn Lys Asn Ala Tyr Ser Val Gin Ala Leu 90 95 100
CAA GTG TAT TTG AGT AAT GTG CTT TAT AAC TTG GTT AAT AAT AGT AAT 570 Gin Val Tyr Leu Ser Asn Val Leu Tyr Asn Leu Val Asn Asn Ser Asn
105 110 115
AAT GGC AGT AAT AAT GGA GTC GTT CCT GAA TAT GTA GGA ATT ATA AAA 618 Asn Gly Ser Asn Asn Gly Val Val Pro Glu Tyr Val Gly He He Lys 120 125 130
GTT CTC TAT GGT TCT CAA AAT GAA TTC AGT CTC TTA GCC ACG GAG AGT 666 Val Leu Tyr Gly Ser Gin Asn Glu Phe Ser Leu Leu Ala Thr Glu Ser 135 140 145
GTG GTG CTT TTA AAC GCG CTT ACA AGG GTG AAT CTG GAT AGT AAT TCG 714 Val Val Leu Leu Asn Ala Leu Thr Arg Val Asn Leu Asp Ser Asn Ser 150 155 160 165
GTG TTT TTA AAA GGG CTA TTA GCC CAA ATG CAG CTT TTT AAT GAC ACT 762 Val Phe Leu Lys Gly Leu Leu Ala Gin Met Gin Leu Phe Asn Asp Thr 170 175 180
TCT TCA GCA AAG CTA GGC CAG ATC GCA GAA AAC TTG AAG AAC GGT GGT 810 Ser Ser Ala Lys Leu Gly Gin He Ala Glu Asn Leu Lys Asn Gly Gly 185 190 195
GCA GGA TCA ATG CTC CAA AAG GAT GTG AAA ACC ATC TCG GAT CGA ATC 858
Ala Gly Ser Met Leu Gln Lys Asp Val Lys Thr Ile Ser Asp Arg Ile 200 205 210
GCT ACT TAC CAA GAG AAT CTA AAA CAG CTA GGA GGG ATG CTA AAG AAT 906 Ala Thr Tyr Gin Glu Asn Leu Lys Gin Leu Gly Gly Met Leu Lys Asn 215 220 225
TAC GAT GAA CCC TAC TTG CCC CAA TTT GGG CCA GGC ACA AGC TCT CAG 954 Tyr Asp Glu Pro Tyr Leu Pro Gin Phe Gly Pro Gly Thr Ser Ser Gin 230 235 240 245
CAT GGG GTT ATT AAT GGC TTT GGC ATT CAA GTG GGC TAT AAG CAA TTT 1002 His Gly Val He Asn Gly Phe Gly He Gin Val Gly Tyr Lys Gin Phe 250 255 260
TTT GGG AAC AAG CGG AAT ATA GGC TTA CGA TAT TAC GCT TTC TTT GAT 1050 Phe Gly Asn Lys Arg Asn He Gly Leu Arg Tyr Tyr Ala Phe Phe Asp 265 270 275
TAT GGC TTT ACG CAA TTG GGC AGT CTT AGC AGC GCC GTT AAA GCG AAT 1098 Tyr Gly Phe Thr Gin Leu Gly Ser Leu Ser Ser Ala Val Lys Ala Asn 280 285 290
ATC TTT ACT TAT GGC GCT GGC ACG GAC TTT TTA TGG AAT ATC TTT AGA 1146 He Phe Thr Tyr Gly Ala Gly Thr Asp Phe Leu Trp Asn He Phe Arg 295 300 305
AGG GTT TTT AGC GAT CAG TCC TTG AAT GTG GGG GTG TTT GGG GGC ATT 1194 Arg Val Phe Ser Asp Gin Ser Leu Asn Val Gly Val Phe Gly Gly He 310 315 320 325
CAA ATA GCG GGT AAC ACT TGG GAT AGC TCT TTA AGA GGT CAA ATT GAA 1242 Gin He Ala Gly Asn Thr Trp Asp Ser Ser Leu Arg Gly Gin He Glu 330 335 340
AAC TCG TTT AAA GAA TAC CCC ACT CCC ACG AAT TTC CAA TTT TTG TTT 1290 Asn Ser Phe Lys Glu Tyr Pro Thr Pro Thr Asn Phe Gin Phe Leu Phe 345 350 355
AAT TTG GGT TTA AGG GCT CAT TTT GCC AGC ACC ATG CAC CGC CGG TTT 1338 Asn Leu Gly Leu Arg Ala His Phe Ala Ser Thr Met His Arg Arg Phe 360 365 370
TTG AGC GCG TCT CAA AGC ATT CAG CAT GGG ATG GAA TTT GGC GTG AAA 1386 Leu Ser Ala Ser Gin Ser He Gin His Gly Met Glu Phe Gly Val Lys 375 380 385
ATC CCG GCT ATC AAT CAA AGG TAT TTG AGG GCC AAT GGG GCT GAT GTG 1434 He Pro Ala He Asn Gin Arg Tyr Leu Arg Ala Asn Gly Ala Asp Val 390 395 400 405
GAT TAC AGG CGT TTG TAT GCG TTC TAT ATC AAT TAC ACG ATA GGT TTT T 1483 Asp Tyr Arg Arg Leu Tyr Ala Phe Tyr He Asn Tyr Thr He Gly Phe 410 415 420
AAGCTCTTTT TAGGGCTTAT AAAGAGGCTT TTTACTTTTT TTTTGGTATT CTAACAAGCT 1543
TTTAAATAAT CCAATCTACT TTGTTTTAAG GATAATATTT TATGGCAGAT GTCGTTGTGG 1603
GGATCCAGTG GGGAGATGAG GGGAAGGGAA AAATTGTTGA TAGGATCGCT AAAGATTATG 1663
ACTT 1667
(2) INFORMATION FOR SEQ ID NO: 84:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 421 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
Met Gly Val Gly Tyr Gln Ile Gly Gly Ala Gln Gln Asn Ile Asp Asn
1 5 10 15
Lys Gly Ser Thr Leu Arg Asn Asn Val Ile Asn Asn Phe Arg Gln Val
20 25 30
Gly Val Gly Met Ala Gly Gly Asn Gly Leu Leu Ala Leu Ala Thr Asn
35 40 45
Thr Thr Met Asp Ala Leu Leu Gly Ile Gly Asn Gln Ile Val Asn Thr
50 55 60
Asn Thr Thr Val Ser Asn Asn Asn Ala Glu Leu Thr Gln Phe Lys Lys
65 70 75 80
Ile Leu Pro Gln Ile Glu Gln Arg Phe Glu Thr Asn Lys Asn Ala Tyr
85 90 95
Ser Val Gln Ala Leu Gln Val Tyr Leu Ser Asn Val Leu Tyr Asn Leu
100 105 110
Val Asn Asn Ser Asn Asn Gly Ser Asn Asn Gly Val Val Pro Glu Tyr
115 120 125
Val Gly Ile Ile Lys Val Leu Tyr Gly Ser Gln Asn Glu Phe Ser Leu
130 135 140
Leu Ala Thr Glu Ser Val Val Leu Leu Asn Ala Leu Thr Arg Val Asn
145 150 155 160
Leu Asp Ser Asn Ser Val Phe Leu Lys Gly Leu Leu Ala Gln Met Gln
165 170 175
Leu Phe Asn Asp Thr Ser Ser Ala Lys Leu Gly Gln Ile Ala Glu Asn
180 185 190
Leu Lys Asn Gly Gly Ala Gly Ser Met Leu Gln Lys Asp Val Lys Thr
195 200 205
Ile Ser Asp Arg Ile Ala Thr Tyr Gln Glu Asn Leu Lys Gln Leu Gly
210 215 220
Gly Met Leu Lys Asn Tyr Asp Glu Pro Tyr Leu Pro Gln Phe Gly Pro
225 230 235 240
Gly Thr Ser Ser Gln His Gly Val Ile Asn Gly Phe Gly Ile Gln Val
245 250 255
Gly Tyr Lys Gln Phe Phe Gly Asn Lys Arg Asn Ile Gly Leu Arg Tyr
260 265 270
Tyr Ala Phe Phe Asp Tyr Gly Phe Thr Gln Leu Gly Ser Leu Ser Ser
275 280 285
Ala Val Lys Ala Asn Ile Phe Thr Tyr Gly Ala Gly Thr Asp Phe Leu
290 295 300
Trp Asn Ile Phe Arg Arg Val Phe Ser Asp Gln Ser Leu Asn Val Gly
305 310 315 320
Val Phe Gly Gly Ile Gln Ile Ala Gly Asn Thr Trp Asp Ser Ser Leu
325 330 335
Arg Gly Gln Ile Glu Asn Ser Phe Lys Glu Tyr Pro Thr Pro Thr Asn
340 345 350
Phe Gln Phe Leu Phe Asn Leu Gly Leu Arg Ala His Phe Ala Ser Thr
355 360 365
Met His Arg Arg Phe Leu Ser Ala Ser Gln Ser Ile Gln His Gly Met
370 375 380
Glu Phe Gly Val Lys Ile Pro Ala Ile Asn Gln Arg Tyr Leu Arg Ala
385 390 395 400
Asn Gly Ala Asp Val Asp Tyr Arg Arg Leu Tyr Ala Phe Tyr Ile Asn
405 410 415
Tyr Thr Ile Gly Phe
420
(2) INFORMATION FOR SEQ ID NO: 85:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 926 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 207...746 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
CCCCTTAATT GCAGATGTTT TGCAAGAGGG ATTGCGTGGC GTCTATCATT CTAGAGAGAT 60
AGACTTTGTA GAAAAAGTGG TTGTTTTAGA CAGCTGTCAA ATCCACCAAA AAGCGTTAAT 120
GCATTTGCAA GAAACTTTGA TGATAGAAGT GGATAGGCTT GATTTTTCTT TAGTGGAGCG 180
CTTGAACATT TTAGCGCGCA TGGAGA ATG AAA AGC ATG CGT TTT AGT TAC ATT 233
Met Lys Ser Met Arg Phe Ser Tyr Ile 1 5
GAG CCA AGA GCG AAA TAC CTT ATC AGC AAG CTT TCT AAA ATT TGG GTT 281 Glu Pro Arg Ala Lys Tyr Leu He Ser Lys Leu Ser Lys He Trp Val 10 15 20 25
TTT TAC ATT TTT TTA TCT TTT GTG GTA ATA GGG GGG TTA GTG TGG TTT 329 Phe Tyr He Phe Leu Ser Phe Val Val He Gly Gly Leu Val Trp Phe 30 35 40
ATG CAC AAC GCC ATT AAA AGC ACT CAA GAC AAC GCG TCC AGT TTG ACG 377 Met His Asn Ala He Lys Ser Thr Gin Asp Asn Ala Ser Ser Leu Thr 45 50 55
ATC CAA GAA AGG CTC TAC CGC CAT GAA ATC AGC CGC TTA CAG GTT AAG 425 He Gin Glu Arg Leu Tyr Arg His Glu He Ser Arg Leu Gin Val Lys 60 65 70
ACT GAT GAA ACC TTA AAA CTC ATT AAA GAA GCC AAA AAG CGT TTG AAT 473 Thr Asp Glu Thr Leu Lys Leu He Lys Glu Ala Lys Lys Arg Leu Asn 75 80 85
TAT AAC GAT GAT ATA CGA GAT GTT TTG CAA GGG CTT TTG AAT ATT GTG 521 Tyr Asn Asp Asp He Arg Asp Val Leu Gin Gly Leu Leu Asn He Val 90 95 100 105
CCG GAT TCC ATC ACT ATT AAT AGC ATT GAA ATA GAC CAG CAA AGC GTG 569 Pro Asp Ser He Thr He Asn Ser He Glu He Asp Gin Gin Ser Val 110 115 120
GTT GTT AGC GGT AAA ACC CCT TCT AAA GAA GCC TTT TAT TTT TTG TTT 617 Val Val Ser Gly Lys Thr Pro Ser Lys Glu Ala Phe Tyr Phe Leu Phe 125 130 135
CAA AAC AAA CTA AAC CCC ATG TTT GAT TAT TCT AGG GCG GAA TTT TTC 665 Gin Asn Lys Leu Asn Pro Met Phe Asp Tyr Ser Arg Ala Glu Phe Phe 140 145 150
CCC TTA AGC GAT GGG TGG TTT AAT TTT GTC TCC ACC AAC TTT TCT AAT 713
Pro Leu Ser Asp Gly Trp Phe Asn Phe Val Ser Thr Asn Phe Ser Asn
155 160 165
TCC TTA CTG ATA AAA AAT CCG GAG TCT ATT AAA TGAAGCCATT GCATTTTTCA 766
Ser Leu Leu Ile Lys Asn Pro Glu Ser Ile Lys 170 175 180
CACCTGGACA GAGAGCAATC AGGCGATGTG GGGTTTATCA TTAAAAACCT CGTTTTTTTA 826 GGGGTTTTTT CCTTATTGGG TTGGTTGAAT ACCGAGTATT TTCTATGGCC TAGCATGCTG 886 GAATTAAAAA AAATCCTTTT AGAAGAAAAT CGTAAAAAAA 926
(2) INFORMATION FOR SEQ ID NO: 86:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
Met Lys Ser Met Arg Phe Ser Tyr Ile Glu Pro Arg Ala Lys Tyr Leu
1 5 10 15
Ile Ser Lys Leu Ser Lys Ile Trp Val Phe Tyr Ile Phe Leu Ser Phe
20 25 30
Val Val Ile Gly Gly Leu Val Trp Phe Met His Asn Ala Ile Lys Ser
35 40 45
Thr Gln Asp Asn Ala Ser Ser Leu Thr Ile Gln Glu Arg Leu Tyr Arg
50 55 60
His Glu Ile Ser Arg Leu Gln Val Lys Thr Asp Glu Thr Leu Lys Leu
65 70 75 80
Ile Lys Glu Ala Lys Lys Arg Leu Asn Tyr Asn Asp Asp Ile Arg Asp
85 90 95
Val Leu Gln Gly Leu Leu Asn Ile Val Pro Asp Ser Ile Thr Ile Asn
100 105 110
Ser Ile Glu Ile Asp Gln Gln Ser Val Val Val Ser Gly Lys Thr Pro
115 120 125
Ser Lys Glu Ala Phe Tyr Phe Leu Phe Gln Asn Lys Leu Asn Pro Met
130 135 140
Phe Asp Tyr Ser Arg Ala Glu Phe Phe Pro Leu Ser Asp Gly Trp Phe
145 150 155 160
Asn Phe Val Ser Thr Asn Phe Ser Asn Ser Leu Leu Ile Lys Asn Pro
165 170 175
Glu Ser Ile Lys
180
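The three-letter residues shown beneath each codon line can be reproduced by translating the coding sequence codon by codon. The following is a minimal sketch, not part of the original listing, assuming the standard genetic code applies to these sequences. The names CODON_TABLE, THREE_LETTER and translate are hypothetical. It translates the first nine codons of the coding sequence of SEQ ID NO: 85 (positions 207 to 233).

```python
# Illustrative sketch, not part of the original sequence listing.
# Assumption: the standard genetic code applies.

BASES = "TCAG"
# One-letter amino acids for all 64 codons, ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def translate(dna):
    """Translate DNA (whitespace ignored) into three-letter residues.

    Translation stops at the first stop codon; a trailing partial
    codon is ignored.
    """
    dna = "".join(dna.split()).upper()
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        residues.append(THREE_LETTER[amino_acid])
    return residues

# First nine codons of the coding sequence of SEQ ID NO: 85.
first_codons = "ATG AAA AGC ATG CGT TTT AGT TAC ATT"
print(" ".join(translate(first_codons)))
# prints: Met Lys Ser Met Arg Phe Ser Tyr Ile
```

The output agrees with the first nine residues of SEQ ID NO: 86; in particular, ATT translates to Ile and CAA to Gln.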
(2) INFORMATION FOR SEQ ID NO: 87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1440 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 151...1299 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
AGCACTTTCG CTTTTCATTG TTTTGATGCG ACTTCTAGTT TCAGGCTTTT GCAAGTGTTA 60 AACGATGAGG TGAGCGATGC GTTTTTAATC ATACAAGATT TTAAAGAACA GCGCATCATT 120 CATAAAATCA TTCAAACCCA TTTCAAACGC ATG TGC GTG GTT TTG AGC GTG AAA 174
Met Cys Val Val Leu Ser Val Lys 1 5
AGA GAT GGT GAA AAA ACT TTA GAA AAT AAT GAA GAA AAT AAA GAT GAA 222 Arg Asp Gly Glu Lys Thr Leu Glu Asn Asn Glu Glu Asn Lys Asp Glu 10 15 20
AAG CTT ATT TTG ATT GAT GAA TTT GAA GTT TTA GCC AAT AAA TTC ATT 270 Lys Leu He Leu He Asp Glu Phe Glu Val Leu Ala Asn Lys Phe He 25 30 35 40
TCT CGT TTG CCC AAT ATC CCT AGC ACC CCT AGA GAG TTT GGG TTA GGC 318 Ser Arg Leu Pro Asn He Pro Ser Thr Pro Arg Glu Phe Gly Leu Gly 45 50 55
AAG GGC GAG ATC ATG GAG ATT GAT GTG CCT TTT GGG AGT ATT TTT GCT 366 Lys Gly Glu He Met Glu He Asp Val Pro Phe Gly Ser He Phe Ala
60 65 70
TAC AGA CAC ATT GGC TCT ATC AGA CAA AAA GAA TAC AGG ATT GTA GGG 414 Tyr Arg His He Gly Ser He Arg Gin Lys Glu Tyr Arg He Val Gly 75 80 85
CTT TAT CGC AAC GAT GTT TTG TTG CTC TCC ACT AAA TCT TTA GTT ATC 462 Leu Tyr Arg Asn Asp Val Leu Leu Leu Ser Thr Lys Ser Leu Val He 90 95 100
CAG CCG CGA GAC ATT CTC TTA GTG GCG GGT AAT CCG GAA ATT TTG AAT 510 Gin Pro Arg Asp He Leu Leu Val Ala Gly Asn Pro Glu He Leu Asn 105 110 115 120
GCG GTG TAT CTT CAA GTC AAA AGC AAT GTG GGG CAG TTC CCA GCC CCC 558 Ala Val Tyr Leu Gin Val Lys Ser Asn Val Gly Gin Phe Pro Ala Pro 125 130 135
TTT GGT AAG AGC ATT TAT TTA TAC ATT GAT ATG CGT TTG CAG AAC AGA 606 Phe Gly Lys Ser He Tyr Leu Tyr He Asp Met Arg Leu Gin Asn Arg 140 145 150
AAA GCG ATG ATG CGC GAT GTG TAT CAA GCC TTG TTT TTG CAC AAA CAT 654 Lys Ala Met Met Arg Asp Val Tyr Gin Ala Leu Phe Leu His Lys His 155 160 165
TTA AAG AGC TAC AAG CTC TAC ATT CAG GTT TTA CAC CCC ACT AGC CCT 702 Leu Lys Ser Tyr Lys Leu Tyr He Gin Val Leu His Pro Thr Ser Pro 170 175 180
AAG TTT TAC CAT AAA TTT TTA GCG CTA GAA ACC GAA AGC ATT GAA GTG 750 Lys Phe Tyr His Lys Phe Leu Ala Leu Glu Thr Glu Ser He Glu Val 185 190 195 200
AAT TTT GAT TTT TAC AGG AAA AGT TTT ATC CAA AAA CTC CAT GAA GAC 798 Asn Phe Asp Phe Tyr Arg Lys Ser Phe He Gin Lys Leu His Glu Asp 205 210 215
CAC CAG AAA AAA ATG GGC CTA ATC GTG GTA GGC AGA GAG CTT TTT TTA 846 His Gin Lys Lys Met Gly Leu He Val Val Gly Arg Glu Leu Phe Leu 220 225 230
TCT AAA AAA CAC CGA AAG GCC TTG TAT AAA ACA GCC ACC CCA GTT TAT 894 Ser Lys Lys His Arg Lys Ala Leu Tyr Lys Thr Ala Thr Pro Val Tyr 235 240 245
AAA ACC AAC ACT TCT GGC TTG TCT AAA ACC TCT CAA AGC GTG GTG GTA 942 Lys Thr Asn Thr Ser Gly Leu Ser Lys Thr Ser Gin Ser Val Val Val 250 255 260
TTG AAT GAA AGT TTG GAT ATT AAT GAG GAC ATG TCT TCA GTG ATT TTT 990 Leu Asn Glu Ser Leu Asp He Asn Glu Asp Met Ser Ser Val He Phe 265 270 275 280
GAT GTG TCT ATG CAA ATG GAT TTG GGC TTG TTG CTC TAT GAT TTT GAC 1038
Asp Val Ser Met Gln Met Asp Leu Gly Leu Leu Leu Tyr Asp Phe Asp 285 290 295
CCT AAC AAG CGC TAT AAA AAC GAG ATT GTC AAT CAT TAT GAA AAT TTA 1086 Pro Asn Lys Arg Tyr Lys Asn Glu He Val Asn His Tyr Glu Asn Leu 300 305 310
GCC AAC GCG TTC AAC CGC AAG ATT GAG ATT TTC CAA ACC GAT ATT AGA 1134 Ala Asn Ala Phe Asn Arg Lys He Glu He Phe Gin Thr Asp He Arg 315 320 325
AAT CCT ATC ATG TAT CTC AAT TCT TTA AGA AAT CCC ATT TTG CAT TTC 1182 Asn Pro He Met Tyr Leu Asn Ser Leu Arg Asn Pro He Leu His Phe 330 335 340
ATG CCT TTT GAA GAG TGC ATC ACG CAC ACG CGC TTT TGG TGG TTT TTA 1230 Met Pro Phe Glu Glu Cys He Thr His Thr Arg Phe Trp Trp Phe Leu 345 350 355 360
TCC ACT AAA GTG GAA AAA TTA GCG TTT TTA AAC GAT GAT AAC CCT CAA 1278 Ser Thr Lys Val Glu Lys Leu Ala Phe Leu Asn Asp Asp Asn Pro Gin
365 370 375
ATT TTT ATC CCT GTA GCG GAG TGAAAGAATG CAAGAAATTT TAATCCCTTT AAAA 1333 He Phe He Pro Val Ala Glu 380
GAAAAAAACT ATAAAGTGTT TTTGGGGGAA CTGCCTGAAA TAAAATTGAA ACAAAAAGCC 1393 CTCATCATTA GCGATAGCAT CGTAGCCGGG TTGCATTTGC CCTATTT 1440
(2) INFORMATION FOR SEQ ID NO: 88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 383 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
Met Cys Val Val Leu Ser Val Lys Arg Asp Gly Glu Lys Thr Leu Glu
1 5 10 15
Asn Asn Glu Glu Asn Lys Asp Glu Lys Leu Ile Leu Ile Asp Glu Phe
20 25 30
Glu Val Leu Ala Asn Lys Phe Ile Ser Arg Leu Pro Asn Ile Pro Ser
35 40 45
Thr Pro Arg Glu Phe Gly Leu Gly Lys Gly Glu Ile Met Glu Ile Asp
50 55 60
Val Pro Phe Gly Ser Ile Phe Ala Tyr Arg His Ile Gly Ser Ile Arg
65 70 75 80
Gln Lys Glu Tyr Arg Ile Val Gly Leu Tyr Arg Asn Asp Val Leu Leu
85 90 95
Leu Ser Thr Lys Ser Leu Val Ile Gln Pro Arg Asp Ile Leu Leu Val
100 105 110
Ala Gly Asn Pro Glu Ile Leu Asn Ala Val Tyr Leu Gln Val Lys Ser
115 120 125
Asn Val Gly Gln Phe Pro Ala Pro Phe Gly Lys Ser Ile Tyr Leu Tyr
130 135 140
Ile Asp Met Arg Leu Gln Asn Arg Lys Ala Met Met Arg Asp Val Tyr
145 150 155 160
Gln Ala Leu Phe Leu His Lys His Leu Lys Ser Tyr Lys Leu Tyr Ile
165 170 175
Gln Val Leu His Pro Thr Ser Pro Lys Phe Tyr His Lys Phe Leu Ala
180 185 190
Leu Glu Thr Glu Ser Ile Glu Val Asn Phe Asp Phe Tyr Arg Lys Ser
195 200 205
Phe Ile Gln Lys Leu His Glu Asp His Gln Lys Lys Met Gly Leu Ile
210 215 220
Val Val Gly Arg Glu Leu Phe Leu Ser Lys Lys His Arg Lys Ala Leu
225 230 235 240
Tyr Lys Thr Ala Thr Pro Val Tyr Lys Thr Asn Thr Ser Gly Leu Ser
245 250 255
Lys Thr Ser Gln Ser Val Val Val Leu Asn Glu Ser Leu Asp Ile Asn
260 265 270
Glu Asp Met Ser Ser Val Ile Phe Asp Val Ser Met Gln Met Asp Leu
275 280 285
Gly Leu Leu Leu Tyr Asp Phe Asp Pro Asn Lys Arg Tyr Lys Asn Glu
290 295 300
Ile Val Asn His Tyr Glu Asn Leu Ala Asn Ala Phe Asn Arg Lys Ile
305 310 315 320
Glu Ile Phe Gln Thr Asp Ile Arg Asn Pro Ile Met Tyr Leu Asn Ser
325 330 335
Leu Arg Asn Pro Ile Leu His Phe Met Pro Phe Glu Glu Cys Ile Thr
340 345 350
His Thr Arg Phe Trp Trp Phe Leu Ser Thr Lys Val Glu Lys Leu Ala
355 360 365
Phe Leu Asn Asp Asp Asn Pro Gln Ile Phe Ile Pro Val Ala Glu
370 375 380
(2) INFORMATION FOR SEQ ID NO: 89:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 517 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 51...464 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
AGATTTCATT CGAGGTAGAA AATACATTGA AAAAGCGTGT GAATTAAACG ATG GTA 56
Met Val 1
GGG GGT GGA ACG GTA AAA AAA GAC TTG AAG AAA GCC ATT CAA TAC TAT 104 Gly Gly Gly Thr Val Lys Lys Asp Leu Lys Lys Ala He Gin Tyr Tyr 5 10 15
GTT AAA GCG TGT GAA TTG AAT GAA ATG TTT GGG TGT CTG TCA TTA GTT 152 Val Lys Ala Cys Glu Leu Asn Glu Met Phe Gly Cys Leu Ser Leu Val 20 25 30
TCG AAC TCT CAA ATA AAC AAA CAA AAA CTC TTT CAA TAT CTC TCT AAA 200 Ser Asn Ser Gin He Asn Lys Gin Lys Leu Phe Gin Tyr Leu Ser Lys 35 40 45 50
GCT TGT GAA TTA AAT AGT GGT AAT GGA TGT AGG TTT TTA GGG GAT TTT 248 Ala Cys Glu Leu Asn Ser Gly Asn Gly Cys Arg Phe Leu Gly Asp Phe 55 60 65
TAT GAG AAT GGA AAA TAT GTA AAA AAG GAT TTA AGA AAA GCT GCT CAA 296 Tyr Glu Asn Gly Lys Tyr Val Lys Lys Asp Leu Arg Lys Ala Ala Gin 70 75 80
TAC TAC TCT AAA GCT TGT GGA TTA AAT GAT CAA GAT GGG TGT TTA ATA 344 Tyr Tyr Ser Lys Ala Cys Gly Leu Asn Asp Gin Asp Gly Cys Leu He 85 90 95
CTA GGA TAT AAG CAA TAT GCT GGC AAG GGC GTA GTC AAA AAT GAA AAA 392 Leu Gly Tyr Lys Gin Tyr Ala Gly Lys Gly Val Val Lys Asn Glu Lys 100 105 110
CAA GCG GTG AAA ACC TTT GAA AAG GCT TGT AGG TTA GGA TCT GAA GAC 440 Gin Ala Val Lys Thr Phe Glu Lys Ala Cys Arg Leu Gly Ser Glu Asp 115 120 125 130
GCA TGT GGT ATT TTA AAC AAC TAC TAGATTTGAA ATAAATGCTG TTTTTTAGCT 494 Ala Cys Gly He Leu Asn Asn Tyr 135
GGCTTTCATG TTTTTGTAAC CCC 517
(2) INFORMATION FOR SEQ ID NO: 90:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
Met Val Gly Gly Gly Thr Val Lys Lys Asp Leu Lys Lys Ala Ile Gln
1 5 10 15
Tyr Tyr Val Lys Ala Cys Glu Leu Asn Glu Met Phe Gly Cys Leu Ser
20 25 30
Leu Val Ser Asn Ser Gln Ile Asn Lys Gln Lys Leu Phe Gln Tyr Leu
35 40 45
Ser Lys Ala Cys Glu Leu Asn Ser Gly Asn Gly Cys Arg Phe Leu Gly
50 55 60
Asp Phe Tyr Glu Asn Gly Lys Tyr Val Lys Lys Asp Leu Arg Lys Ala
65 70 75 80
Ala Gln Tyr Tyr Ser Lys Ala Cys Gly Leu Asn Asp Gln Asp Gly Cys
85 90 95
Leu Ile Leu Gly Tyr Lys Gln Tyr Ala Gly Lys Gly Val Val Lys Asn
100 105 110
Glu Lys Gln Ala Val Lys Thr Phe Glu Lys Ala Cys Arg Leu Gly Ser
115 120 125
Glu Asp Ala Cys Gly Ile Leu Asn Asn Tyr
130 135
(2) INFORMATION FOR SEQ ID NO: 91:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1663 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 68...1600 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
AAATGTTAGA AACCCTTACA AAACAAGCTA ATATATTCTA TTCAATTTGC CTCAAGGACA 60 AACAAAC ATG AAA AAA CTT CTT TAT ACC ATA CTC GCG CTT CTT TTA ATC 109 Met Lys Lys Leu Leu Tyr Thr He Leu Ala Leu Leu Leu He 1 5 10
GGC CTT TTA ACA ATC TAT CTC ATC CTT TTT ACA GAA TGG GGG AAT AAG 157 Gly Leu Leu Thr He Tyr Leu He Leu Phe Thr Glu Trp Gly Asn Lys 15 20 25 30
ATC ATC GCT TCG TAT ATA GAG AAA AAA ATC AAC CCG AAC GAG CAC TAC 205 He He Ala Ser Tyr He Glu Lys Lys He Asn Pro Asn Glu His Tyr 35 40 45
TTG AGC GTT AAA ACC TTT AAA TTG AGA TTC AAC TCT TTG GAT TTT AAA 253 Leu Ser Val Lys Thr Phe Lys Leu Arg Phe Asn Ser Leu Asp Phe Lys 50 55 60
GCT CAA GCC AAC GAT GAT TCC ACG CTC ATT CTT AAG GGG GAT TTT TCA 301 Ala Gin Ala Asn Asp Asp Ser Thr Leu He Leu Lys Gly Asp Phe Ser
65 70 75
CTT TTA AAG CAA AGC GTA AAT TTG AAT TAC CAT ATA GAT ATT AAA GAT 349 Leu Leu Lys Gin Ser Val Asn Leu Asn Tyr His He Asp He Lys Asp 80 85 90
TTA CGC TCT TTC AAA GAA TGG ATA CCC TAC CCT TTA AGG GGG GCT GTT 397 Leu Arg Ser Phe Lys Glu Trp He Pro Tyr Pro Leu Arg Gly Ala Val 95 100 105 110
ATC ACT TCT GGG AAT ATT AAA GGG CAT AGA AAA GCC CTT ATG ATT CAA 445 He Thr Ser Gly Asn He Lys Gly His Arg Lys Ala Leu Met He Gin 115 120 125
GGC GTC TCT AAT GTG GCT CAA TCC CAC ACT GCC TAC AAT GCC CTT TTA 493 Gly Val Ser Asn Val Ala Gin Ser His Thr Ala Tyr Asn Ala Leu Leu 130 135 140
GAT GAT TTC AAG CTT TCT CGC TTA AAT TTG AAC GCA CAA GAC GCC AAT 541 Asp Asp Phe Lys Leu Ser Arg Leu Asn Leu Asn Ala Gin Asp Ala Asn 145 150 155
TTA GAA GAT TTG CTT TAT TTA ATC AAT CGC CCC GCT TAT GCG AAC GCA 589 Leu Glu Asp Leu Leu Tyr Leu He Asn Arg Pro Ala Tyr Ala Asn Ala 160 165 170
AAA GTG TCC TTA CAG GCG GAT TTT AAC TCT CTA AAG CCT TTA GAG GGG 637 Lys Val Ser Leu Gin Ala Asp Phe Asn Ser Leu Lys Pro Leu Glu Gly 175 180 185 190
CAT TTG ATC CTA ACA GCT AAT AAC GCT TTA ATC AAT AAC GCC CTA ATC 685 His Leu He Leu Thr Ala Asn Asn Ala Leu He Asn Asn Ala Leu He 195 200 205
AAT CAA ATT TTT CAT TTA AAC CTT AAA GAC ACG CTT GTT TTC AGC CTC 733 Asn Gin He Phe His Leu Asn Leu Lys Asp Thr Leu Val Phe Ser Leu 210 215 220
TCG CAT TCA AGC GAC TTT AAA GGA AAC AAA GCC ATC AGC GAT ACC ACC 781 Ser His Ser Ser Asp Phe Lys Gly Asn Lys Ala He Ser Asp Thr Thr 225 230 235
CTG ACT AGC CCT TTA GCC AAT TTC AAA GCC CTA AAA AGC GAA TAC CTT 829 Leu Thr Ser Pro Leu Ala Asn Phe Lys Ala Leu Lys Ser Glu Tyr Leu 240 245 250
TTC TCT ATT TTA AAA CTC AAC GCC CCC TAC ACT TTA GAA ATC CCC AAT 877 Phe Ser He Leu Lys Leu Asn Ala Pro Tyr Thr Leu Glu He Pro Asn 255 260 265 270
CTA GCC AAA CTC TAT AAC ATT ACC AAC CAC CCC TTA AAA GGG AGC TTG 925 Leu Ala Lys Leu Tyr Asn He Thr Asn His Pro Leu Lys Gly Ser Leu 275 280 285
ACT TTA AAA GGC GCT ATA GAA CAA AGC CCC AAA CTT TTA AAA GTC AGC 973
Thr Leu Lys Gly Ala Ile Glu Gln Ser Pro Lys Leu Leu Lys Val Ser 290 295 300
GGC CAT TCA AAT TTA CTA GAC GGC GCG CTG GAT TTC ACG CTT TTA AAT 1021
Gly His Ser Asn Leu Leu Asp Gly Ala Leu Asp Phe Thr Leu Leu Asn 305 310 315
AAA GAT TTG AAA GGG CGT TTT TCC AAT ATT TCC ACT TTA AAA GCT TTA 1069
Lys Asp Leu Lys Gly Arg Phe Ser Asn Ile Ser Thr Leu Lys Ala Leu 320 325 330
GAT TTA TTC CAT TAC CCT AAG TTT TTC CAA TCC GTT GCA GAC GCT AAT 1117
Asp Leu Phe His Tyr Pro Lys Phe Phe Gln Ser Val Ala Asp Ala Asn
335 340 345 350
TTG GAT TAT GAT CTT ATC GCT AAG CAA GGC GTA TTG AAA GCC CGC CTA 1165
Leu Asp Tyr Asp Leu Ile Ala Lys Gln Gly Val Leu Lys Ala Arg Leu 355 360 365
AAA AAC GCA AGA TTC CTC AAA AAT GCA TTC AGC GAT TTT CTC TAC TCC 1213
Lys Asn Ala Arg Phe Leu Lys Asn Ala Phe Ser Asp Phe Leu Tyr Ser 370 375 380
ATT TCT AAA TTT GAT ATT ACA AAA GAA ATT TAT AAC GAT GCC AAT CTG 1261
Ile Ser Lys Phe Asp Ile Thr Lys Glu Ile Tyr Asn Asp Ala Asn Leu 385 390 395
GTA AGC CAA ATC AAC CAG CAA CGC CTG CTC TCT GAT CTG AGT TTA AAA 1309
Val Ser Gln Ile Asn Gln Gln Arg Leu Leu Ser Asp Leu Ser Leu Lys 400 405 410
AGC CCC AAA ACC CAA TTG AAA ATC CAT AAC GGT TTG TTG GAT TTA AAC 1357
Ser Pro Lys Thr Gln Leu Lys Ile His Asn Gly Leu Leu Asp Leu Asn
415 420 425 430
ACC AAA CAA ATG AAC ATG CTC ATG GAT GCG GAA ATT TTA AAA TTC ATT 1405
Thr Lys Gln Met Asn Met Leu Met Asp Ala Glu Ile Leu Lys Phe Ile 435 440 445
TTT AAA ATG AAA CTT CAA GGC AAC ATG CAC CAG CCA AAA TTT TCT CTC 1453
Phe Lys Met Lys Leu Gln Gly Asn Met His Gln Pro Lys Phe Ser Leu 450 455 460
ATT TTA AAC GAA AAA GCC ATT CAG CAA AAC TTG CAA CAA GGC TTG AAA 1501
Ile Leu Asn Glu Lys Ala Ile Gln Gln Asn Leu Gln Gln Gly Leu Lys 465 470 475
GAA ATC TTA AAA AAC GAC ACC CTT AAA AAA GGT TTA GAT CAT TTG CTT 1549
Glu Ile Leu Lys Asn Asp Thr Leu Lys Lys Gly Leu Asp His Leu Leu 480 485 490
AAA GAT GAT AAG CTC AAA GAA AAG CTT GAA AAA GGG CTT AAG GGG CTT 1597
Lys Asp Asp Lys Leu Lys Glu Lys Leu Glu Lys Gly Leu Lys Gly Leu
495 500 505 510
TTT TAAAAATTTT AAAGGATAGA AATGGCGCAC ATTTTAGTTA GCGGGGCGAC TTCAGG 1656
Phe
GTTTGGA 1663
(2) INFORMATION FOR SEQ ID NO: 92:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
Met Lys Lys Leu Leu Tyr Thr Ile Leu Ala Leu Leu Leu Ile Gly Leu
1 5 10 15
Leu Thr Ile Tyr Leu Ile Leu Phe Thr Glu Trp Gly Asn Lys Ile Ile
20 25 30
Ala Ser Tyr Ile Glu Lys Lys Ile Asn Pro Asn Glu His Tyr Leu Ser
35 40 45
Val Lys Thr Phe Lys Leu Arg Phe Asn Ser Leu Asp Phe Lys Ala Gln
50 55 60
Ala Asn Asp Asp Ser Thr Leu Ile Leu Lys Gly Asp Phe Ser Leu Leu
65 70 75 80
Lys Gln Ser Val Asn Leu Asn Tyr His Ile Asp Ile Lys Asp Leu Arg
85 90 95
Ser Phe Lys Glu Trp Ile Pro Tyr Pro Leu Arg Gly Ala Val Ile Thr
100 105 110
Ser Gly Asn Ile Lys Gly His Arg Lys Ala Leu Met Ile Gln Gly Val
115 120 125
Ser Asn Val Ala Gln Ser His Thr Ala Tyr Asn Ala Leu Leu Asp Asp
130 135 140
Phe Lys Leu Ser Arg Leu Asn Leu Asn Ala Gln Asp Ala Asn Leu Glu
145 150 155 160
Asp Leu Leu Tyr Leu Ile Asn Arg Pro Ala Tyr Ala Asn Ala Lys Val
165 170 175
Ser Leu Gln Ala Asp Phe Asn Ser Leu Lys Pro Leu Glu Gly His Leu
180 185 190
Ile Leu Thr Ala Asn Asn Ala Leu Ile Asn Asn Ala Leu Ile Asn Gln
195 200 205
Ile Phe His Leu Asn Leu Lys Asp Thr Leu Val Phe Ser Leu Ser His
210 215 220
Ser Ser Asp Phe Lys Gly Asn Lys Ala Ile Ser Asp Thr Thr Leu Thr
225 230 235 240
Ser Pro Leu Ala Asn Phe Lys Ala Leu Lys Ser Glu Tyr Leu Phe Ser
245 250 255
Ile Leu Lys Leu Asn Ala Pro Tyr Thr Leu Glu Ile Pro Asn Leu Ala
260 265 270
Lys Leu Tyr Asn Ile Thr Asn His Pro Leu Lys Gly Ser Leu Thr Leu
275 280 285
Lys Gly Ala Ile Glu Gln Ser Pro Lys Leu Leu Lys Val Ser Gly His
290 295 300
Ser Asn Leu Leu Asp Gly Ala Leu Asp Phe Thr Leu Leu Asn Lys Asp
305 310 315 320
Leu Lys Gly Arg Phe Ser Asn Ile Ser Thr Leu Lys Ala Leu Asp Leu
325 330 335
Phe His Tyr Pro Lys Phe Phe Gln Ser Val Ala Asp Ala Asn Leu Asp
340 345 350
Tyr Asp Leu Ile Ala Lys Gln Gly Val Leu Lys Ala Arg Leu Lys Asn
355 360 365
Ala Arg Phe Leu Lys Asn Ala Phe Ser Asp Phe Leu Tyr Ser Ile Ser
370 375 380
Lys Phe Asp Ile Thr Lys Glu Ile Tyr Asn Asp Ala Asn Leu Val Ser
385 390 395 400
Gln Ile Asn Gln Gln Arg Leu Leu Ser Asp Leu Ser Leu Lys Ser Pro
405 410 415
Lys Thr Gln Leu Lys Ile His Asn Gly Leu Leu Asp Leu Asn Thr Lys
420 425 430
Gln Met Asn Met Leu Met Asp Ala Glu Ile Leu Lys Phe Ile Phe Lys
435 440 445
Met Lys Leu Gln Gly Asn Met His Gln Pro Lys Phe Ser Leu Ile Leu
450 455 460
Asn Glu Lys Ala Ile Gln Gln Asn Leu Gln Gln Gly Leu Lys Glu Ile
465 470 475 480
Leu Lys Asn Asp Thr Leu Lys Lys Gly Leu Asp His Leu Leu Lys Asp
485 490 495
Asp Lys Leu Lys Glu Lys Leu Glu Lys Gly Leu Lys Gly Leu Phe
500 505 510
(2) INFORMATION FOR SEQ ID NO: 93:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 947 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 292...645 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
AGTGCATAAA CGCACAGACC CCAAAAATGA AAGCTATTTT TGGCTAGGGC TACACCCTTT 60
AGAATGGCAA AAGCGCGAAA ATGAAGACAG ACTCTCTGAT TTTGACGCTA TTGCTTCAAA 120
CCATGCCTCT ATCACGCCTT TAAATTTAGA CTTAACCAGT TATGATGATT TGAAAAGTTT 180
GGAATCTTGG CATGAGGGAA TGTTAAAGTG AGTAAAAAGC ACCGCTTGGC TTTTTTAGGG 240
CTAATTGTTG GGGTTCTATT CTTCTTTAGT GCGTGTGAGC ACCGCCTGCA C ATG GGG 297
Met Gly 1
TAT TAT TCA GAA GTT ACA GGG GAT TAT TTG TTC AAT TAT AAT TCC ACT 345 Tyr Tyr Ser Glu Val Thr Gly Asp Tyr Leu Phe Asn Tyr Asn Ser Thr
5 10 15
ATC GTG GTG GCT TAT GAC AGA AGC GAT GCG ATG ACT TCT TAT TAT ATC 393 He Val Val Ala Tyr Asp Arg Ser Asp Ala Met Thr Ser Tyr Tyr He 20 25 30
AAT GTG ATT GTT TAT GAA TTG CAA AAA TTA GGC TTT TAC AAT GTC TTC 441 Asn Val He Val Tyr Glu Leu Gin Lys Leu Gly Phe Tyr Asn Val Phe 35 40 45 50
ACG CAA GCG GAA TTC CCA CTA GAT AAA GCC AAA AAT GTG ATC TAT GCG 489 Thr Gin Ala Glu Phe Pro Leu Asp Lys Ala Lys Asn Val He Tyr Ala 55 60 65
CGC ATT GTC CGT AAC ATC TCA GCT GTG CCG TTC TAC CAA TAC AAT TAC 537 Arg He Val Arg Asn He Ser Ala Val Pro Phe Tyr Gin Tyr Asn Tyr 70 75 80
CAA CTG ATT GAT CAA GTC AAT AAG CCT TGT TAT TTT CTT GGG GGG CAG 585 Gin Leu He Asp Gin Val Asn Lys Pro Cys Tyr Phe Leu Gly Gly Gin 85 90 95
TTT TAT TGC TCT CAA ACC CTA CGG ATT ATT ACG CTA TCA ATG GCT TTA 633 Phe Tyr Cys Ser Gin Thr Leu Arg He He Thr Leu Ser Met Ala Leu 100 105 110
GCG AGC AAA TTT TAATGAGTGC TAATTCGCAT TTTATTTTAG ATTGGTATGA TGTGG 690
Ala Ser Lys Phe
115
TGTTGCAAAA ACGGGTTTTA TATGTGGATG GGAGCGTGAG CGGGAGGACT TGCGGCTATC 750
AGATGCTGTA TAGGGATTTG ATTAAAAGCA CGATCAAACG CATTGATTTT AACCGCCCTG 810
AACGCTACTA CTACAATTTA AGACTGCCCC TTTATCAGCC ATGTTATAGG CAATGAAATG 870
GTTATCAGGC GATTGTATCA ATTTTGCGCT AGCCATGTGG TGCGCAATTG CTCTTCTTTA 930
AAATGCGCTC AAAATAT 947
(2) INFORMATION FOR SEQ ID NO: 94:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 118 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
Met Gly Tyr Tyr Ser Glu Val Thr Gly Asp Tyr Leu Phe Asn Tyr Asn
1 5 10 15
Ser Thr Ile Val Val Ala Tyr Asp Arg Ser Asp Ala Met Thr Ser Tyr
20 25 30
Tyr Ile Asn Val Ile Val Tyr Glu Leu Gln Lys Leu Gly Phe Tyr Asn
35 40 45
Val Phe Thr Gln Ala Glu Phe Pro Leu Asp Lys Ala Lys Asn Val Ile
50 55 60
Tyr Ala Arg Ile Val Arg Asn Ile Ser Ala Val Pro Phe Tyr Gln Tyr
65 70 75 80
Asn Tyr Gln Leu Ile Asp Gln Val Asn Lys Pro Cys Tyr Phe Leu Gly
85 90 95
Gly Gln Phe Tyr Cys Ser Gln Thr Leu Arg Ile Ile Thr Leu Ser Met
100 105 110
Ala Leu Ala Ser Lys Phe
115
(2) INFORMATION FOR SEQ ID NO: 95:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 875 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 348...716 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
TGCGGAGGGA ATGTCTATGA TAAAATCTCA GAAAAATTTG TAGAAAAAGT GGATAACGGG 60
TTTTGAAAAT TTTAATCCTT TTTTTTATCT GTTTAAACGC ATTGTTCGCC CTAGATTCAA 120
ACGCACTTAA AGCAGAGATT AAAGAAGTTT ACCTTAAAGA ATACAAAGAC TTAAAATTAG 180
AAATTGAAAC CATTAACTTA GAAATCCCAG AGCGCTTTTC TAACGCTTCC ATTTTAAGCT 240
ATGAATTAAA CGCTTCCAAT AAGCTTAAAA AAGATGGGGT CGTGTTTTTA AGGTTGGAAA 300
ATGATCCTAA TTTACGCCTA CCGGTGCGTT ATAGCGTGAT AGGCAGC ATG CAG GCT 356
Met Gin Ala 1
TTT AAA AGC GTT AGC GCG ATT AAA AAA GAT GAA AAC ATC ACC GCT AAT 404 Phe Lys Ser Val Ser Ala He Lys Lys Asp Glu Asn He Thr Ala Asn 5 10 15
AAC ACT CAA AAA GAG CGC ATT TTG TTT GGT GCG CTT TCT AAC CCC TTA 452 Asn Thr Gin Lys Glu Arg He Leu Phe Gly Ala Leu Ser Asn Pro Leu 20 25 30 35
TTA GAG GGC GCG ATT GAT AAA GTG AGC GCG AAA AAT TTT ATC CCC CCT 500 Leu Glu Gly Ala He Asp Lys Val Ser Ala Lys Asn Phe He Pro Pro 40 45 50
AAC ACG CTT TTA AGC ACG GAT AAA ACC CAA GCT TTA ATT ATC GTG CGT 548 Asn Thr Leu Leu Ser Thr Asp Lys Thr Gin Ala Leu He He Val Arg 55 60 65
AAA AAT GAC ATT ATC ACC GGG GTG TAT GAA GAG GGG CAA ATC AGC ATA 596 Lys Asn Asp He He Thr Gly Val Tyr Glu Glu Gly Gin He Ser He
70 75 80
GAA ATA AGC CTA AAA GCC CTA GAA AAT GGC GCG CTT AAT CAA ATC ATT 644 Glu He Ser Leu Lys Ala Leu Glu Asn Gly Ala Leu Asn Gin He He 85 90 95
CAA GCG AAA AAT TTA GAA AGC AAT AAA ATA CTC AAA GCA AAA GTG TTG 692 Gin Ala Lys Asn Leu Glu Ser Asn Lys He Leu Lys Ala Lys Val Leu 100 105 110 115
AGC AGC TCT AAA GCG CAA ATC TTA TAAAGGACAT TCATGAAATT GGTTTTAGGC 746 Ser Ser Ser Lys Ala Gin He Leu 120
ATCAGTGGAG CGAGCGGGAT ACCCCTAGCC TTGCGGTTTT TAGAAAAATT ACCCAAAGAA 806 ATTGAAGTTT TTGTCGTGGC GTCTAAAAAC GCGCATGTCG TGGCGTTAGA AGAATCTAAT 866 ATTAACCTT 875
(2) INFORMATION FOR SEQ ID NO: 96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
Met Gln Ala Phe Lys Ser Val Ser Ala Ile Lys Lys Asp Glu Asn Ile
1 5 10 15
Thr Ala Asn Asn Thr Gln Lys Glu Arg Ile Leu Phe Gly Ala Leu Ser
20 25 30
Asn Pro Leu Leu Glu Gly Ala Ile Asp Lys Val Ser Ala Lys Asn Phe
35 40 45
Ile Pro Pro Asn Thr Leu Leu Ser Thr Asp Lys Thr Gln Ala Leu Ile
50 55 60
Ile Val Arg Lys Asn Asp Ile Ile Thr Gly Val Tyr Glu Glu Gly Gln
65 70 75 80
Ile Ser Ile Glu Ile Ser Leu Lys Ala Leu Glu Asn Gly Ala Leu Asn
85 90 95
Gln Ile Ile Gln Ala Lys Asn Leu Glu Ser Asn Lys Ile Leu Lys Ala
100 105 110
Lys Val Leu Ser Ser Ser Lys Ala Gln Ile Leu
115 120
(2) INFORMATION FOR SEQ ID NO: 97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 394 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 160...345 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
GGCATCACTT TTAACATGAC CCCTTCTCCA GGCGCGACGA GTTGTTTGCA AAACGCCCTT 60 GTGGATTCCC AAGAAATCGC TGCGTATTTG GGCGAGAGCT TTGAATTAGA ACGCTTTTAT 120 AAAGATTTAT CCCCAGAAGA ATTGGAAAAT TAAAAACGC ATG CAA AAA GAA CAA 174
Met Gin Lys Glu Gin 1 5
GAA GCC CAA GAA ATC GCT AAA AAA GCC GTT AAA ATC GTG TTT TTT TTA 222 Glu Ala Gin Glu He Ala Lys Lys Ala Val Lys He Val Phe Phe Leu 10 15 20
GGG CTT GTG GTG GTG CTT TTG ATG ATG ATA AAC CTT TAC ATG CTC ATC 270 Gly Leu Val Val Val Leu Leu Met Met He Asn Leu Tyr Met Leu He 25 30 35
AAT CAA ATC AAC GCG AGC GCT CAA ATG AGC CAC CAA ATC AAA AAG ATA 318 Asn Gin He Asn Ala Ser Ala Gin Met Ser His Gin He Lys Lys He 40 45 50
GAA GAA AGG CTT AAT CAG GAG CAA AAA TAAAAAAGGC TTTTTGGTAT TTTTACG 372 Glu Glu Arg Leu Asn Gin Glu Gin Lys 55 60
ATCAAATAGT AAAGAGCTTA TC 394
(2) INFORMATION FOR SEQ ID NO: 98:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
Met Gln Lys Glu Gln Glu Ala Gln Glu Ile Ala Lys Lys Ala Val Lys
1 5 10 15
Ile Val Phe Phe Leu Gly Leu Val Val Val Leu Leu Met Met Ile Asn
20 25 30
Leu Tyr Met Leu Ile Asn Gln Ile Asn Ala Ser Ala Gln Met Ser His
35 40 45
Gln Ile Lys Lys Ile Glu Glu Arg Leu Asn Gln Glu Gln Lys
50 55 60
(2) INFORMATION FOR SEQ ID NO: 99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 982 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 320...880 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 320...400 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 401...880 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
AGATTAGCAG CAGCAGGGAT TTTTAAATTC CTGGCCAACA GGGGCGGTTG GAAAAAAATA 60 CGATTAAAAA GGCAAACGCT TTGAAAGTAT TTTTTCATAG AAATTCCCTT TTGTTAAATG 120 ATTGAAGTTG GTGATTATAC CTATTTGTAT CTTAAAAATT TGATTTTAAA AGTTTGAGAT 180 GGTTTTGTAG GTGTATCCCA CTTATCCAAT TTATATCAAT ATTTTCACTC TAAAACCCTC 240 ATCCTTGATA AAAAATTAAA CCTTTTAGAA AAATAACCGA TTTTAGGGTG TAACTTTAAT 300 TCAACAAGAA GGATTTATT ATG ATT AAA AGA ATT GCT TGT ATT TTA AGC TTG 352 Met He Lys Arg He Ala Cys He Leu Ser Leu -27 -25 -20
AGT GCG AGT TTA GCG CTG GCT GGC GAA GTG AAT GGG TTT TTC ATG GGT 400 Ser Ala Ser Leu Ala Leu Ala Gly Glu Val Asn Gly Phe Phe Met Gly -15 -10 -5
GCG GGT TAT CAG CAA GGT CGT TAT GGT CCT TAT AAC AGC AAT TAC TCT 448 Ala Gly Tyr Gin Gin Gly Arg Tyr Gly Pro Tyr Asn Ser Asn Tyr Ser 1 5 10 15
GAT TGG CGC CAT GGC AAT GAT CTT TAT GGT TTG AAT TTC AAA TTA GGT 496 Asp Trp Arg His Gly Asn Asp Leu Tyr Gly Leu Asn Phe Lys Leu Gly 20 25 30
TTT GTA GGC TTT GCC AAT AAA TGG TTT GGG GCT AGG GTG TAT GGC TTT 544 Phe Val Gly Phe Ala Asn Lys Trp Phe Gly Ala Arg Val Tyr Gly Phe 35 40 45
TTA GAT TGG TTT AAC ACT TCA GGG ACA GAA CAC ACC AAA ACC AAT TTG 592
Leu Asp Trp Phe Asn Thr Ser Gly Thr Glu His Thr Lys Thr Asn Leu 50 55 60
CTC ACC TAT GGT GGC GGT GGC GAT TTG ATT GTC AAT CTC ATT CCT TTG 640 Leu Thr Tyr Gly Gly Gly Gly Asp Leu He Val Asn Leu He Pro Leu 65 70 75 80
GAT AAA TTC GCT CTA GGT CTC ATC GGT GGC GTT CAA TTA GCC GGA AAC 688 Asp Lys Phe Ala Leu Gly Leu He Gly Gly Val Gin Leu Ala Gly Asn 85 90 95
ACT TGG ATG TTC CCT TAT GAT GTC AAT CAA ACG AGA TTC CAG TTC TTA 736 Thr Trp Met Phe Pro Tyr Asp Val Asn Gin Thr Arg Phe Gin Phe Leu 100 105 110
TGG AAT TTA GGC GGA AGA ATG CGT GTT GGG GAT CGC AGT GCG TTT GAA 784 Trp Asn Leu Gly Gly Arg Met Arg Val Gly Asp Arg Ser Ala Phe Glu 115 120 125
GCA GGC GTG AAA TTC CCT ATG GTT AAT CAA GGC AAC AAA GAT GTT AGG 832 Ala Gly Val Lys Phe Pro Met Val Asn Gin Gly Asn Lys Asp Val Arg 130 135 140
GCT TAT CCG CTA CTA TTC TTG GGT ATG TGG ATT ATG TTC TTC ACT TTC T 881 Ala Tyr Pro Leu Leu Phe Leu Gly Met Trp He Met Phe Phe Thr Phe 145 150 155 160
AATTTATTCC TTTCATTCGC TCTTCTTCAT CAAATCAACC CTAACCCACT CTTAAAAGGT 941 TGGGGTTCAA AAATCTTTTT CATAAATAAA ATTTGCCTTA A 982
(2) INFORMATION FOR SEQ ID NO: 100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
Met He Lys Arg He Ala Cys He Leu Ser Leu Ser Ala Ser Leu Ala
-27 -25 -20 -15
Leu Ala Gly Glu Val Asn Gly Phe Phe Met Gly Ala Gly Tyr Gin Gin
-10 -5 1 5
Gly Arg Tyr Gly Pro Tyr Asn Ser Asn Tyr Ser Asp Trp Arg His Gly
10 15 20
Asn Asp Leu Tyr Gly Leu Asn Phe Lys Leu Gly Phe Val Gly Phe Ala
25 30 35
Asn Lys Trp Phe Gly Ala Arg Val Tyr Gly Phe Leu Asp Trp Phe Asn
40 45 50
Thr Ser Gly Thr Glu His Thr Lys Thr Asn Leu Leu Thr Tyr Gly Gly 55 60 65
Gly Gly Asp Leu Ile Val Asn Leu Ile Pro Leu Asp Lys Phe Ala Leu
70 75 80 85
Gly Leu He Gly Gly Val Gin Leu Ala Gly Asn Thr Trp Met Phe Pro
90 95 100
Tyr Asp Val Asn Gin Thr Arg Phe Gin Phe Leu Trp Asn Leu Gly Gly
105 110 115
Arg Met Arg Val Gly Asp Arg Ser Ala Phe Glu Ala Gly Val Lys Phe
120 125 130
Pro Met Val Asn Gin Gly Asn Lys Asp Val Arg Ala Tyr Pro Leu Leu
135 140 145
Phe Leu Gly Met Trp He Met Phe Phe Thr Phe
150 155 160
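The feature coordinates given for SEQ ID NO: 99 can be cross-checked against the residue numbering of SEQ ID NO: 100 with simple arithmetic. The Python sketch below is illustrative only and is not part of the sequence listing. The helper name `feature_codons` is hypothetical, and the sketch assumes that LOCATION ranges are 1-based, inclusive, and exclude the stop codon.

```python
def feature_codons(start: int, end: int) -> int:
    """Return the number of whole codons in a 1-based, inclusive range.

    Assumption (not stated in the listing): LOCATION coordinates are
    inclusive on both ends and do not include the stop codon.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    return length // 3


# Coordinates copied from the FEATURE block of SEQ ID NO: 99.
cds = feature_codons(320, 880)          # full coding sequence
sig_peptide = feature_codons(320, 400)  # residues numbered -27 to -1
mat_peptide = feature_codons(401, 880)  # residues numbered 1 to 160

# The signal and mature parts should add up to the full coding sequence,
# which should match the stated LENGTH of SEQ ID NO: 100 (187 amino acids).
assert sig_peptide + mat_peptide == cds == 187
print(cds, sig_peptide, mat_peptide)
```

Under these assumptions the three ranges are mutually consistent: 27 signal-peptide residues plus 160 mature residues give the 187 amino acids stated for SEQ ID NO: 100.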
(2) INFORMATION FOR SEQ ID NO: 101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 843 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 262...777 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
CCAATGGAGG CGTTTCCAAA AACCCAAACG GGCGCTTTTT AAAGAAAAAT CTCAAAAAAT 60
TCAGGGAGCA AGCGGTAAAA ATCGTAGAAA AACGCTTGAT AAAAGAGAAT ATGCAACTGA 120
GCGATTTTAA TGAAGAAGAA TTAAAAATCA TGTTTGAAGC TGAAGAAAAA AGGTTGTTAG 180
AGCAAATCCA CGCTAAAGAA TTGAAAGAAA AGCAAGAAAA AACCACCAAG CATTTTAAAG 240
AAGTTTGGGA AAAGGGCGAA A ATG AGC AAG AAA AAT AGC GTA ATT TCT GGT 291
Met Ser Lys Lys Asn Ser Val He Ser Gly 1 5 10
TTA ATG AAT TTT TTT AGC GAA AAG AAT GAA CGC TGG CTC TTA GCC CAC 339 Leu Met Asn Phe Phe Ser Glu Lys Asn Glu Arg Trp Leu Leu Ala His 15 20 25
AGG CAC ACG AGA GGG TTT GTG ATA GTG GCG TGG CTT TTT CGG TTT AAA 387 Arg His Thr Arg Gly Phe Val He Val Ala Trp Leu Phe Arg Phe Lys 30 35 40
AGC ATT GCG TTT TCT ATT TTG ATC ACT CTG TTG GTT ATT TTA GTG GAT 435 Ser He Ala Phe Ser He Leu He Thr Leu Leu Val He Leu Val Asp 45 50 55
ATT TGG GTT TAT AGC GAT GTG CGT CAG TTT TTA TTG GAC ACT TCT AGC 483 He Trp Val Tyr Ser Asp Val Arg Gin Phe Leu Leu Asp Thr Ser Ser
60 65 70
TCT TTT ATT TGG CTT TTG ATC GCT TTA CTA ATC AAG TGG GGC GTG ATT 531 Ser Phe He Trp Leu Leu He Ala Leu Leu He Lys Trp Gly Val He 75 80 85 90
GTC ATA AGC GCA CGT AAA TGC TAC CAA TTC AGC CAA AAA ATG TTT ACG 579 Val He Ser Ala Arg Lys Cys Tyr Gin Phe Ser Gin Lys Met Phe Thr 95 100 105
CTC ATT CAA AGA AAA AGG CAA ATC AGA GAG AAT TTA AAA AAC CGC TCC 627 Leu He Gin Arg Lys Arg Gin He Arg Glu Asn Leu Lys Asn Arg Ser 110 115 120
AAC TAC AAA GAT ACC AAA AAT GCG GAA AAA CTC TCT AGC ATC GCT GAA 675 Asn Tyr Lys Asp Thr Lys Asn Ala Glu Lys Leu Ser Ser He Ala Glu 125 130 135
GAA ATC ATT TCA AAA AAA CAA GAA GAG TCC CGC CCC AAA GAA GAT TCT 723 Glu He He Ser Lys Lys Gin Glu Glu Ser Arg Pro Lys Glu Asp Ser 140 145 150
AAT CAT GAA AAC CAT AAA GAA AAG CTT TCT AAC ATT ACC GAA GAA AGT 771 Asn His Glu Asn His Lys Glu Lys Leu Ser Asn He Thr Glu Glu Ser 155 160 165 170
GAT TCT TAAAAAACAA GAGGAATTGA AAAGCTAAAA AGGATAGGGG GGGATTACCC AA 829 Asp Ser
AGCATATTGG AGGG 843
(2) INFORMATION FOR SEQ ID NO: 102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
Met Ser Lys Lys Asn Ser Val He Ser Gly Leu Met Asn Phe Phe Ser
1 5 10 15
Glu Lys Asn Glu Arg Trp Leu Leu Ala His Arg His Thr Arg Gly Phe
20 25 30
Val He Val Ala Trp Leu Phe Arg Phe Lys Ser He Ala Phe Ser He
35 40 45
Leu He Thr Leu Leu Val He Leu Val Asp He Trp Val Tyr Ser Asp 50 55 60
Val Arg Gln Phe Leu Leu Asp Thr Ser Ser Ser Phe Ile Trp Leu Leu 65 70 75 80
He Ala Leu Leu He Lys Trp Gly Val He Val He Ser Ala Arg Lys
85 90 95
Cys Tyr Gin Phe Ser Gin Lys Met Phe Thr Leu He Gin Arg Lys Arg
100 105 110
Gin He Arg Glu Asn Leu Lys Asn Arg Ser Asn Tyr Lys Asp Thr Lys
115 120 125
Asn Ala Glu Lys Leu Ser Ser He Ala Glu Glu He He Ser Lys Lys
130 135 140
Gin Glu Glu Ser Arg Pro Lys Glu Asp Ser Asn His Glu Asn His Lys 145 150 155 160
Glu Lys Leu Ser Asn He Thr Glu Glu Ser Asp Ser 165 170
(2) INFORMATION FOR SEQ ID NO: 103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1047 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 34...1005 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
AGAAAGAAAC CATTCAAGGA ACGCATTGAT TTG ATG AAT AAA CCA TTT TTA ATC 54
Met Asn Lys Pro Phe Leu He 1 5
TTA CTC ATA GCC CTA ATT GTC TTT AGC GGC TGT AAC ATG AGA AAA TAT 102 Leu Leu He Ala Leu He Val Phe Ser Gly Cys Asn Met Arg Lys Tyr 10 15 20
TTC AAA CCC GCT AAA CAC CAA ATT AAA GGC GAA GCG TAT TTC CCT AAC 150 Phe Lys Pro Ala Lys His Gin He Lys Gly Glu Ala Tyr Phe Pro Asn 25 30 35
CAT TTG CAA GAA AGT ATC GTT TCG TCT AAT CGT TAT GGA GCC ATT TTG 198 His Leu Gin Glu Ser He Val Ser Ser Asn Arg Tyr Gly Ala He Leu 40 45 50 55
AAA AAT GGA GCG GTT ATA GGC GAT AAA GGT TTA ACG CAG CTA AGA ATC 246 Lys Asn Gly Ala Val He Gly Asp Lys Gly Leu Thr Gin Leu Arg He 60 65 70
GGT AAG AAC TTC AAT TAC GAA AGC AGT TTT TTA AAT GAG AGT CAA GGG 294
Gly Lys Asn Phe Asn Tyr Glu Ser Ser Phe Leu Asn Glu Ser Gln Gly
75 80 85
TTT TTT ATT CTT GCG CAA GAT TGT TTG AAC AAG ATT GAT AAA AAA ACA 342
Phe Phe He Leu Ala Gin Asp Cys Leu Asn Lys He Asp Lys Lys Thr
90 95 100
AAC AAA AGC AAG GTG GCT AAG ACT GAA GAA ACG GAA TTG AAA TTA AAG 390
Asn Lys Ser Lys Val Ala Lys Thr Glu Glu Thr Glu Leu Lys Leu Lys
105 110 115
GGC GTT GAA GCG GAA GTC CAA GAT AAA GTC TGT CAT CAA GTG GAA TTG 438
Gly Val Glu Ala Glu Val Gin Asp Lys Val Cys His Gin Val Glu Leu
120 125 130 135
ATT AGC AAT AAC CCT AAC GCC AGC CAA CAA TCT ATC GTT ATT CCT TTG 486
He Ser Asn Asn Pro Asn Ala Ser Gin Gin Ser He Val He Pro Leu 140 145 150
GAG ACT TTT GCC TTG AGC GCA AGC GTT AAA GGG AAT CTT TTA GCG GTG 534
Glu Thr Phe Ala Leu Ser Ala Ser Val Lys Gly Asn Leu Leu Ala Val
155 160 165
GTG TTA GCG GAC AAT TCA GCG AAC TTA TAC GAC ATC ACT TCT CAA AAA 582
Val Leu Ala Asp Asn Ser Ala Asn Leu Tyr Asp He Thr Ser Gin Lys
170 175 180
TTG CTT TTT AGT GAG AAA GGT TCC CCA AGC ACC ACG ATC AAT TCT TTA 630
Leu Leu Phe Ser Glu Lys Gly Ser Pro Ser Thr Thr He Asn Ser Leu
185 190 195
ATG GCG ATG CCT ATT TTT ATG GAT ACG GTC GTG GTG TTC CCC ATG CTA 678
Met Ala Met Pro He Phe Met Asp Thr Val Val Val Phe Pro Met Leu
200 205 210 215
GAT GGG CGC TTG TTG GTC GTG GAT TAT GTG CAC GGA AAC CCT ACG CCT 726
Asp Gly Arg Leu Leu Val Val Asp Tyr Val His Gly Asn Pro Thr Pro 220 225 230
ATT AGA AAC ATT GTT ATC AGC AGC GAT AAG TTT TTT AAC AAT ATC ACC 774
He Arg Asn He Val He Ser Ser Asp Lys Phe Phe Asn Asn He Thr
235 240 245
TAC CTT ATC GTA GAT GGC AAT AAC ATG ATC GCT TCT ACA GGG AAA AGG 822
Tyr Leu He Val Asp Gly Asn Asn Met He Ala Ser Thr Gly Lys Arg
250 255 260
ATA CTC TCA GTA GTG AGC GGT CAA GAG TTC AAC TAT GAT GGG GAT ATT 870
He Leu Ser Val Val Ser Gly Gin Glu Phe Asn Tyr Asp Gly Asp He
265 270 275
GTG GAT TTG CTT TAT GAT AAG GGG ACT TTA TAT GTG CTC ACG CTA GAC 918
Val Asp Leu Leu Tyr Asp Lys Gly Thr Leu Tyr Val Leu Thr Leu Asp
280 285 290 295
GGG CAG ATT TTG CAA ATG GAT AAG AGT TTG AGG GAA TTA AAC AGC GTG 966 Gly Gln Ile Leu Gln Met Asp Lys Ser Leu Arg Glu Leu Asn Ser Val 300 305 310
AAA CTG CCT NTC NTC GCT CAA CAC GAT TGT ATT AAA CCA TAATAAATTG TA 1017 Lys Leu Pro Xaa Xaa Ala Gin His Asp Cys He Lys Pro 315 320
TTCTTTAGAA AAACGAGGGT ATGTGATAGA 1047
(2) INFORMATION FOR SEQ ID NO: 104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 324 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
Met Asn Lys Pro Phe Leu He Leu Leu He Ala Leu He Val Phe Ser
1 5 10 15
Gly Cys Asn Met Arg Lys Tyr Phe Lys Pro Ala Lys His Gin He Lys
20 25 30
Gly Glu Ala Tyr Phe Pro Asn His Leu Gin Glu Ser He Val Ser Ser
35 40 45
Asn Arg Tyr Gly Ala He Leu Lys Asn Gly Ala Val He Gly Asp Lys
50 55 60
Gly Leu Thr Gin Leu Arg He Gly Lys Asn Phe Asn Tyr Glu Ser Ser 65 70 75 80
Phe Leu Asn Glu Ser Gin Gly Phe Phe He Leu Ala Gin Asp Cys Leu
85 90 95
Asn Lys He Asp Lys Lys Thr Asn Lys Ser Lys Val Ala Lys Thr Glu
100 105 110
Glu Thr Glu Leu Lys Leu Lys Gly Val Glu Ala Glu Val Gin Asp Lys
115 120 125
Val Cys His Gin Val Glu Leu He Ser Asn Asn Pro Asn Ala Ser Gin
130 135 140
Gin Ser He Val He Pro Leu Glu Thr Phe Ala Leu Ser Ala Ser Val 145 150 155 160
Lys Gly Asn Leu Leu Ala Val Val Leu Ala Asp Asn Ser Ala Asn Leu
165 170 175
Tyr Asp He Thr Ser Gin Lys Leu Leu Phe Ser Glu Lys Gly Ser Pro
180 185 190
Ser Thr Thr He Asn Ser Leu Met Ala Met Pro He Phe Met Asp Thr
195 200 205
Val Val Val Phe Pro Met Leu Asp Gly Arg Leu Leu Val Val Asp Tyr
210 215 220
Val His Gly Asn Pro Thr Pro He Arg Asn He Val He Ser Ser Asp 225 230 235 240
Lys Phe Phe Asn Asn He Thr Tyr Leu He Val Asp Gly Asn Asn Met
245 250 255
He Ala Ser Thr Gly Lys Arg He Leu Ser Val Val Ser Gly Gin Glu
260 265 270
Phe Asn Tyr Asp Gly Asp He Val Asp Leu Leu Tyr Asp Lys Gly Thr
275 280 285
Leu Tyr Val Leu Thr Leu Asp Gly Gin He Leu Gin Met Asp Lys Ser
290 295 300
Leu Arg Glu Leu Asn Ser Val Lys Leu Pro Xaa Xaa Ala Gin His Asp 305 310 315 320
Cys He Lys Pro
(2) INFORMATION FOR SEQ ID NO: 105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1968 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 153...1793 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 153...219 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
TCTGGGGGCA TTGCTTACCC TACTACTCGC TTGAAACGCC CAAGCCTGAT CCAATCTCAT 60 AAAGATTCTA ATCGCAATTT TAAAACCATC ACTTTTTGGC TCGTTCCCAC AAAAAGCCAC 120 GCAACTTACT ACATCATTAA GGTTTAATCA CA ATG GAT AAA AAC AAC AAT AAT 173
Met Asp Lys Asn Asn Asn Asn -20
CTC CGC TTG ATT TTA GCG ATC GCT CTG TCT TTC TTG TTT ATC GCT CTT 221 Leu Arg Leu He Leu Ala He Ala Leu Ser Phe Leu Phe He Ala Leu -15 -10 -5 1
TAT AGC TAT TTT TTC CAA AAA CCA AAC AAA ACA ACA ACC CAA ACC ACA 269 Tyr Ser Tyr Phe Phe Gin Lys Pro Asn Lys Thr Thr Thr Gin Thr Thr 5 10 15
AAG CAA GAA ACA ACC AAC AAC CAT ACA GCA ACA AGT CCT AAC GCG CCC 317 Lys Gin Glu Thr Thr Asn Asn His Thr Ala Thr Ser Pro Asn Ala Pro 20 25 30
AAC GCC CAA CAT TTT AGC ACC ACT CAA ACA ACC CCC CAA GAG AAT TTG 365 Asn Ala Gin His Phe Ser Thr Thr Gin Thr Thr Pro Gin Glu Asn Leu 35 40 45
CTA AGC ACG ATT TC TTT GAG CAT GCC AGG ATT GAA ATT GAT TCT TTA 413
Leu Ser Thr He Ser Phe Glu His Ala Arg He Glu He Asp Ser Leu
50 55 60 65
GGG CGC ATC AAA CAG GTT TAT CTC AAG GAT AAA AAG TAT CTA ACC CCT 461
Gly Arg He Lys Gin Val Tyr Leu Lys Asp Lys Lys Tyr Leu Thr Pro 70 75 80
AAA CAA AAG GGC TTT TTA GAG CAT GTG GGC CAT CTT TTT AGC TCC AAA 509
Lys Gin Lys Gly Phe Leu Glu His Val Gly His Leu Phe Ser Ser Lys 85 90 95
GAA AAC GCG CAA CCC CCC CTA AAA GAG CTC CCC CTT TTA GCA GCC GAT 557
Glu Asn Ala Gin Pro Pro Leu Lys Glu Leu Pro Leu Leu Ala Ala Asp 100 105 110
AAA CTC AAG CCT TTA GAA GTG CGT TTT TTA GAC CCT ACG CTC AAT AAC 605
Lys Leu Lys Pro Leu Glu Val Arg Phe Leu Asp Pro Thr Leu Asn Asn 115 120 125
AAA GCG TTC AAC ACC CCT TAT AGC GCT TCA AAA ACC ACT CTT GGG CCT 653
Lys Ala Phe Asn Thr Pro Tyr Ser Ala Ser Lys Thr Thr Leu Gly Pro
130 135 140 145
AAC GAA CAG CTT GTT TTA ACC CAA GAT TTA GGC ACT CTT AGC ATC ATT 701
Asn Glu Gin Leu Val Leu Thr Gin Asp Leu Gly Thr Leu Ser He He 150 155 160
AAA ACC CTG ACT TTC TAT GAT GAT TTG CAT TAT GAT TTA AAA ATC GCA 749
Lys Thr Leu Thr Phe Tyr Asp Asp Leu His Tyr Asp Leu Lys He Ala 165 170 175
TTC AAA TCG CCC AAT AAC CTT ATC CCT AGC TAT GTG ATC ACC AAT GGT 797
Phe Lys Ser Pro Asn Asn Leu He Pro Ser Tyr Val He Thr Asn Gly 180 185 190
TAC AGG CCG GTG GCT GAT TTG GAC AGC TAC ACC TTT TCA GGC GTG CTT 845
Tyr Arg Pro Val Ala Asp Leu Asp Ser Tyr Thr Phe Ser Gly Val Leu 195 200 205
TTA GAA AAT AGC GAC AAA AAA ATT GAA AAA ATT GAA GAT AAA GAC GCT 893
Leu Glu Asn Ser Asp Lys Lys He Glu Lys He Glu Asp Lys Asp Ala
210 215 220 225
AAA GAA ATC AAA CGC TTT TCT AAC ACC CTC TTT TTA TCC AGC GTG GAT 941
Lys Glu He Lys Arg Phe Ser Asn Thr Leu Phe Leu Ser Ser Val Asp 230 235 240
AGG TAT TTC ACC ACC TTG CTT TTC ACT AAA GAT CCT CAA GGT TTT GAA 989
Arg Tyr Phe Thr Thr Leu Leu Phe Thr Lys Asp Pro Gin Gly Phe Glu 245 250 255
GCC TTA ATT GAT TCA GAA ATC GGC ACT AAA AAC CCC TTA GGG TTC ATT 1037
Ala Leu He Asp Ser Glu He Gly Thr Lys Asn Pro Leu Gly Phe He 260 265 270
TCC CTT AAA AAT GAA GCG AAT TTG CAT GGC TAT ATT GGC CCT AAG GAT 1085
Ser Leu Lys Asn Glu Ala Asn Leu His Gly Tyr He Gly Pro Lys Asp
275 280 285
TAC CGC TCT TTG AAA GCG ATT TCA CCC ATG CTC ACC GAT GTG ATA GAG 1133
Tyr Arg Ser Leu Lys Ala He Ser Pro Met Leu Thr Asp Val He Glu
290 295 300 305
TAT GGC TTA ATC ACT TTC TTT GCA AAA GGC GTG TTT GTT TTA CTG GAT 1181
Tyr Gly Leu He Thr Phe Phe Ala Lys Gly Val Phe Val Leu Leu Asp 310 315 320
TAT TTG TAT CAA TTC GTG GGC AAT TGG GGT TGG GCT ATC ATT CTT TTA 1229
Tyr Leu Tyr Gin Phe Val Gly Asn Trp Gly Trp Ala He He Leu Leu 325 330 335
ACG ATT ATC GTG CGC ATC ATC CTT TAT CCT TTA AGC TAT AAG GGC ATG 1277
Thr He He Val Arg He He Leu Tyr Pro Leu Ser Tyr Lys Gly Met
340 345 350
GTG AGC ATG CAA AAG CTC AAA GAA TTA GCC CCT AAA ATG AAA GAA CTC 1325
Val Ser Met Gin Lys Leu Lys Glu Leu Ala Pro Lys Met Lys Glu Leu
355 360 365
CAA GAA AAA TAC AAG GGC GAA CCC CAA AAA TTG CAA GCC CAC ATG ATG 1373
Gin Glu Lys Tyr Lys Gly Glu Pro Gin Lys Leu Gin Ala His Met Met
370 375 380 385
CAG CTT TAC AAA AAA CAT GGG GCT AAC CCA CTA GGG GGT TGT CTG CCC 1421
Gin Leu Tyr Lys Lys His Gly Ala Asn Pro Leu Gly Gly Cys Leu Pro 390 395 400
TTA ATC TTA CAA ATC CCG GTG TTT TTT GCC ATT TAT AGA GTG CTT TAT 1469
Leu He Leu Gin He Pro Val Phe Phe Ala He Tyr Arg Val Leu Tyr 405 410 415
AAC GCT GTG GAA TTG AAA AGC TCA GAG TGG ATC TTA TGG ATT CAT GAT 1517
Asn Ala Val Glu Leu Lys Ser Ser Glu Trp He Leu Trp He His Asp
420 425 430
TTA TCC ATC ATG GAT CCG TAT TTT ATT TTA CCG CTT CTT ATG GGA GCG 1565
Leu Ser He Met Asp Pro Tyr Phe He Leu Pro Leu Leu Met Gly Ala
435 440 445
TCT ATG TAT TGG CAC CAA AGC GTT ACG CCA AAC ACC ATG ACC GAT CCC 1613
Ser Met Tyr Trp His Gin Ser Val Thr Pro Asn Thr Met Thr Asp Pro
450 455 460 465
ATG CAA GCA AAG ATT TTT AAA CTC TTA CCC CTA TTA TTC ACA ATC TTT 1661
Met Gin Ala Lys He Phe Lys Leu Leu Pro Leu Leu Phe Thr He Phe 470 475 480
TTA ATC ACT TTC CCG GCA GGG TTA GTC TTG TAT TGG ACC ACG AAC AAC 1709
Leu He Thr Phe Pro Ala Gly Leu Val Leu Tyr Trp Thr Thr Asn Asn 485 490 495
ATC CTT TCG GTG TTG CAA CAA CTC ATC ATC AAT AAA GTC TTA GAG AAT 1757 Ile Leu Ser Val Leu Gln Gln Leu Ile Ile Asn Lys Val Leu Glu Asn 500 505 510
AAA AAA CGC ATG CAT GCG CAA AAC AAA AAG GAA CAT TGATGCAAAA TTTTAT 1809 Lys Lys Arg Met His Ala Gin Asn Lys Lys Glu His 515 520 525
TGAAATCAAA GCCAAAACCT TAGAAGAAGC CCTCATTCAA GCTTCTATCG CCTTGAATTG 1869 CCCCATTATT AATTTGCAAT ACGAAGTCAT TCAAACGCCC TCTAAAGGGT TTTTAAGCAT 1929 TGGTAAAAAA GAAGCCATTA TCTTAGCGGG CGTTAAAGA 1968
(2) INFORMATION FOR SEQ ID NO: 106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 547 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...22 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
Met Asp Lys Asn Asn Asn Asn Leu Arg Leu He Leu Ala He Ala Leu
-20 -15 -10
Ser Phe Leu Phe He Ala Leu Tyr Ser Tyr Phe Phe Gin Lys Pro Asn
-5 1 5 10
Lys Thr Thr Thr Gin Thr Thr Lys Gin Glu Thr Thr Asn Asn His Thr
15 20 25
Ala Thr Ser Pro Asn Ala Pro Asn Ala Gin His Phe Ser Thr Thr Gin
30 35 40
Thr Thr Pro Gin Glu Asn Leu Leu Ser Thr He Ser Phe Glu His Ala
45 50 55
Arg He Glu He Asp Ser Leu Gly Arg He Lys Gin Val Tyr Leu Lys
60 65 70
Asp Lys Lys Tyr Leu Thr Pro Lys Gin Lys Gly Phe Leu Glu His Val
75 80 85 90
Gly His Leu Phe Ser Ser Lys Glu Asn Ala Gin Pro Pro Leu Lys Glu
95 100 105
Leu Pro Leu Leu Ala Ala Asp Lys Leu Lys Pro Leu Glu Val Arg Phe
110 115 120
Leu Asp Pro Thr Leu Asn Asn Lys Ala Phe Asn Thr Pro Tyr Ser Ala
125 130 135
Ser Lys Thr Thr Leu Gly Pro Asn Glu Gin Leu Val Leu Thr Gin Asp
140 145 150
Leu Gly Thr Leu Ser He He Lys Thr Leu Thr Phe Tyr Asp Asp Leu
155 160 165 170
His Tyr Asp Leu Lys He Ala Phe Lys Ser Pro Asn Asn Leu He Pro
175 180 185
Ser Tyr Val He Thr Asn Gly Tyr Arg Pro Val Ala Asp Leu Asp Ser
190 195 200
Tyr Thr Phe Ser Gly Val Leu Leu Glu Asn Ser Asp Lys Lys He Glu
205 210 215
Lys He Glu Asp Lys Asp Ala Lys Glu He Lys Arg Phe Ser Asn Thr
220 225 230
Leu Phe Leu Ser Ser Val Asp Arg Tyr Phe Thr Thr Leu Leu Phe Thr 235 240 245 250
Lys Asp Pro Gin Gly Phe Glu Ala Leu He Asp Ser Glu He Gly Thr
255 260 265
Lys Asn Pro Leu Gly Phe He Ser Leu Lys Asn Glu Ala Asn Leu His
270 275 280
Gly Tyr He Gly Pro Lys Asp Tyr Arg Ser Leu Lys Ala He Ser Pro
285 290 295
Met Leu Thr Asp Val He Glu Tyr Gly Leu He Thr Phe Phe Ala Lys
300 305 310
Gly Val Phe Val Leu Leu Asp Tyr Leu Tyr Gin Phe Val Gly Asn Trp 315 320 325 330
Gly Trp Ala He He Leu Leu Thr He He Val Arg He He Leu Tyr
335 340 345
Pro Leu Ser Tyr Lys Gly Met Val Ser Met Gin Lys Leu Lys Glu Leu
350 355 360
Ala Pro Lys Met Lys Glu Leu Gin Glu Lys Tyr Lys Gly Glu Pro Gin
365 370 375
Lys Leu Gin Ala His Met Met Gin Leu Tyr Lys Lys His Gly Ala Asn
380 385 390
Pro Leu Gly Gly Cys Leu Pro Leu He Leu Gin He Pro Val Phe Phe 395 400 405 410
Ala He Tyr Arg Val Leu Tyr Asn Ala Val Glu Leu Lys Ser Ser Glu
415 420 425
Trp He Leu Trp He His Asp Leu Ser He Met Asp Pro Tyr Phe He
430 435 440
Leu Pro Leu Leu Met Gly Ala Ser Met Tyr Trp His Gin Ser Val Thr
445 450 455
Pro Asn Thr Met Thr Asp Pro Met Gin Ala Lys He Phe Lys Leu Leu
460 465 470
Pro Leu Leu Phe Thr He Phe Leu He Thr Phe Pro Ala Gly Leu Val 475 480 485 490
Leu Tyr Trp Thr Thr Asn Asn He Leu Ser Val Leu Gin Gin Leu He
495 500 505
He Asn Lys Val Leu Glu Asn Lys Lys Arg Met His Ala Gin Asn Lys
510 515 520
Lys Glu His 525
(2) INFORMATION FOR SEQ ID NO: 107:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3280 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 151...3207 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 151...241 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:
TAAAGGTTTT AGGCCTGTGG TGGTTCAAGT TTTAGAAGAG CGCAGCAAGA TTTTTATCGT 60
GAACGCTCAA AATTTACACC CTAATGACAG CGTGGCAGTG GGGTCATTGA TAGGGTTAAA 120
AGGCATGATC AACAATTTAG GGGAGGAATG ATG CTC GCT TCC ATT ATT GAA TTT 174
Met Leu Ala Ser He He Glu Phe -30 -25
TCC TTA CGC CAA AGA GTG ATC GTG ATT GTT GGT GCG ATT CTT ATT TTA 222 Ser Leu Arg Gin Arg Val He Val He Val Gly Ala He Leu He Leu -20 -15 -10
TTT TTT GGG ACT TAT AGT TTT ATC AAC ACT CCA GTG GAC GCT TTC CCG 270 Phe Phe Gly Thr Tyr Ser Phe He Asn Thr Pro Val Asp Ala Phe Pro -5 1 5 10
GAT ATT TCG CCC ACT CAA GTT AAA ATC ATT TTA AAA CTC CCC GGC TCT 318 Asp He Ser Pro Thr Gin Val Lys He He Leu Lys Leu Pro Gly Ser 15 20 25
AGC CCT GAA GAA ATG GAA AAC AAC ATC GTG CGC CCT TTA GAA TTG GAG 366 Ser Pro Glu Glu Met Glu Asn Asn He Val Arg Pro Leu Glu Leu Glu 30 35 40
CTT TTA GGC TTG AAA GGG CAA AAA TCT TTA AGG AGT GTT TCA AAA TAT 414 Leu Leu Gly Leu Lys Gly Gin Lys Ser Leu Arg Ser Val Ser Lys Tyr 45 50 55
TCT ATT TCA GAT ATT ACG ATA GAT TTT GAT GAC AGC GTG GAT ATT TAT 462 Ser He Ser Asp He Thr He Asp Phe Asp Asp Ser Val Asp He Tyr 60 65 70
TTA GCG AGG AAT ATT GTC AAT GAG CGC TTG AGC AGC GTG ATG AAA GAT 510 Leu Ala Arg Asn He Val Asn Glu Arg Leu Ser Ser Val Met Lys Asp 75 80 85 90
TTA CCC GTG GGG GTT GAG GGG GGC ATG GCG CCC ATT GTT ACG CCG CTA 558 Leu Pro Val Gly Val Glu Gly Gly Met Ala Pro He Val Thr Pro Leu 95 100 105
TCA GAT ATC TTT ATG TTC ACT ATT GAT GGC AAT ATC ACT GAG ATA GAA 606
Ser Asp He Phe Met Phe Thr He Asp Gly Asn He Thr Glu He Glu 110 115 120
AAA CGA CAG CTT TTA GAT TTT GTG ATC CGC CCA CAA TTA AGA ATG ATT 654
Lys Arg Gin Leu Leu Asp Phe Val He Arg Pro Gin Leu Arg Met He
125 130 135
AGC GGC GTA GCA GAT GTC AAT TCC ATT GGA GGC TTT AGC AGA GCG TTT 702
Ser Gly Val Ala Asp Val Asn Ser He Gly Gly Phe Ser Arg Ala Phe
140 145 150
GTG ATC GTG CCG GAT TTT AAT GAC ATG GCA AGG CTT GGG GTG AGT ATT 750
Val He Val Pro Asp Phe Asn Asp Met Ala Arg Leu Gly Val Ser He 155 160 165 170
TCT GAT TTA GAA TCG GCT GTG AGA GTG AAT TTA AGA AAC AGC GGA GCG 798
Ser Asp Leu Glu Ser Ala Val Arg Val Asn Leu Arg Asn Ser Gly Ala 175 180 185
GGG CGC GTG GAT AGA GAT GGC GAA ACC TTT TTA GTC AAA ATC CAA ACC 846
Gly Arg Val Asp Arg Asp Gly Glu Thr Phe Leu Val Lys He Gin Thr 190 195 200
GCT TCT TTG AGT TTA GAA GAC ATT GGC AAA ATC ACC GTT TCC ACT AAT 894
Ala Ser Leu Ser Leu Glu Asp He Gly Lys He Thr Val Ser Thr Asn
205 210 215
TTA GGG CAT TTG CAC ATT AAG GAT TTT GCG AAA GTC ATC AGC CAG TCT 942
Leu Gly His Leu His He Lys Asp Phe Ala Lys Val He Ser Gin Ser
220 225 230
CGC ACC CGT TTG GGG TTT GTT ACT AAA GAT GGC GTG GGC GAG ACC ACA 990
Arg Thr Arg Leu Gly Phe Val Thr Lys Asp Gly Val Gly Glu Thr Thr 235 240 245 250
GAA GGC TTG GTG CTT TCT TTA AAA GAC GCT AAC ACC AAA GAA ATC ATC 1038
Glu Gly Leu Val Leu Ser Leu Lys Asp Ala Asn Thr Lys Glu He He 255 260 265
ACT CAA GTG TAT CAA AAA CTA GAA GAA TTA AAA CCC TTT TTA CCG AAT 1086
Thr Gin Val Tyr Gin Lys Leu Glu Glu Leu Lys Pro Phe Leu Pro Asn 270 275 280
GGC GTG TCC ATT AAT GTT TTT TAT GAT CGC TCA GAA TTT ACG CAA AAA 1134
Gly Val Ser He Asn Val Phe Tyr Asp Arg Ser Glu Phe Thr Gin Lys
285 290 295
GCC ATT GCC ACC GTT TCT AAA ACC CTC ATT GAA GCC GTT GTT TTA ATC 1182
Ala He Ala Thr Val Ser Lys Thr Leu He Glu Ala Val Val Leu He
300 305 310
ATC ATC ACG CTC TTT TTA TTT TTA GGG AAT TTG AGG GCG AGC GTG GCT 1230
He He Thr Leu Phe Leu Phe Leu Gly Asn Leu Arg Ala Ser Val Ala 315 320 325 330
GTG GGG GTG ATT TTA CCT TTA AGC TTG TCC GTG GCG TTT ATT TTT ATC 1278
Val Gly Val He Leu Pro Leu Ser Leu Ser Val Ala Phe He Phe He 335 340 345
AAG TTT AGC GAT CTG ACT TTA AAT TTG ATG AGT TTA GGG GGA TTG GTT 1326
Lys Phe Ser Asp Leu Thr Leu Asn Leu Met Ser Leu Gly Gly Leu Val 350 355 360
ATC GCT ATA GGC ATG CTC ATT GAC TCA GCC GTG GTG GTG GTG GAA AAC 1374
Ile Ala Ile Gly Met Leu Ile Asp Ser Ala Val Val Val Val Glu Asn 365 370 375
GCT TTT GAA AAA TTA AGC GCT AAC ACT AAA ACC ACT AAA CTC CAT GCA 1422
Ala Phe Glu Lys Leu Ser Ala Asn Thr Lys Thr Thr Lys Leu His Ala
380 385 390
ATC TAT CGT TCG TGT AAA GAA ATC GCT GTT TCA GTG GTG AGC GGG GTG 1470
Ile Tyr Arg Ser Cys Lys Glu Ile Ala Val Ser Val Val Ser Gly Val
395 400 405 410
GTG ATC ATC ATT GTG TTT TTT GTG CCG ATT TTA ACC TTA CAG GGG TTA 1518
Val Ile Ile Ile Val Phe Phe Val Pro Ile Leu Thr Leu Gln Gly Leu 415 420 425
GAG GGT AAG ATG TTT AGG CCT TTA GCG CAA AGC ATT GTG TAT GCG CTT 1566
Glu Gly Lys Met Phe Arg Pro Leu Ala Gln Ser Ile Val Tyr Ala Leu 430 435 440
TTA GGC ACT TTA GTT CTA TCT ATT ACA ATC ATT CCT GTA GTC AGC TCT 1614
Leu Gly Thr Leu Val Leu Ser Ile Thr Ile Ile Pro Val Val Ser Ser 445 450 455
CTT GTC TTA AAA GCC ACG CCC CAT AGC GAA ACC TTT TTA ACG AGG TTT 1662
Leu Val Leu Lys Ala Thr Pro His Ser Glu Thr Phe Leu Thr Arg Phe
460 465 470
TTA AAC AGA ATC TAC GCC CCT TTA TTG GAA TTT TTT GTG CAT AAC CCT 1710
Leu Asn Arg Ile Tyr Ala Pro Leu Leu Glu Phe Phe Val His Asn Pro
475 480 485 490
AAA AAA GTG ATT TTA GGA GCG TTT GTT TTT TTA ATC GCA AGC CTT TCT 1758
Lys Lys Val Ile Leu Gly Ala Phe Val Phe Leu Ile Ala Ser Leu Ser 495 500 505
TTA TTC CCT TTT GTG GGG AAG AAT TTC ATG CCC GTT TTA GAT GAG GGC 1806
Leu Phe Pro Phe Val Gly Lys Asn Phe Met Pro Val Leu Asp Glu Gly 510 515 520
GAT GTG GTT TTG AGC GTG GAA ACC ACC CCT TCT ATT TCT TTA GAT CAA 1854
Asp Val Val Leu Ser Val Glu Thr Thr Pro Ser Ile Ser Leu Asp Gln 525 530 535
TCT AGG GAT CTC ATG CTA AAC ATT GAG AGC GCG ATT AAA AAG CAT GTC 1902
Ser Arg Asp Leu Met Leu Asn Ile Glu Ser Ala Ile Lys Lys His Val
540 545 550
AAG GAA GTT AAA AGC ATT GTC GCG CGC ACA GGG AGC GAT GAA TTG GGG 1950
Lys Glu Val Lys Ser Ile Val Ala Arg Thr Gly Ser Asp Glu Leu Gly 555 560 565 570
CTG GAT TTA GGA GGT TTG AAT CAA ACC GAT ACT TTT ATT TCT TTT ATT 1998
Leu Asp Leu Gly Gly Leu Asn Gln Thr Asp Thr Phe Ile Ser Phe Ile 575 580 585
CCT AAA AAA GAA TGG AGC GTT AAA ACC AAA GAT GAA TTA TTA GAA AAA 2046
Pro Lys Lys Glu Trp Ser Val Lys Thr Lys Asp Glu Leu Leu Glu Lys 590 595 600
ATC ATG GAT TCT TTA AAA GAC TTT AAG GGG ATT AAC TTT TCT TTC ACC 2094
Ile Met Asp Ser Leu Lys Asp Phe Lys Gly Ile Asn Phe Ser Phe Thr
605 610 615
CAA CCC ATT GAA ATG AGA ATT TCT GAA ATG CTG ACA GGG GTT AGG GGG 2142
Gln Pro Ile Glu Met Arg Ile Ser Glu Met Leu Thr Gly Val Arg Gly 620 625 630
GAT TTA GCG GTT AAG ATT TTT GGA GAT GGT ATT AGC GAA TTG AAT GAA 2190
Asp Leu Ala Val Lys Ile Phe Gly Asp Gly Ile Ser Glu Leu Asn Glu 635 640 645 650
TTG AGT TTT CAA ATC GCG CAA GCT CTA AAA GGG ATT AAA GGA TCT AGT 2238
Leu Ser Phe Gln Ile Ala Gln Ala Leu Lys Gly Ile Lys Gly Ser Ser 655 660 665
GAA GTT TTA ACC ACG CTT AAT GAG GGC GTG AAT TAT TTG TAT GTA ACC 2286
Glu Val Leu Thr Thr Leu Asn Glu Gly Val Asn Tyr Leu Tyr Val Thr 670 675 680
CCT AAT AAA GAA TCG ATG GCG GAT GTG GGG ATC ACT AGC GAT GAA TTT 2334
Pro Asn Lys Glu Ser Met Ala Asp Val Gly Ile Thr Ser Asp Glu Phe
685 690 695
TCC AAG TTT TTA AAA TCC GCT TTA GAG GGC TTG GTT GTA GAT GTG ATC 2382
Ser Lys Phe Leu Lys Ser Ala Leu Glu Gly Leu Val Val Asp Val Ile 700 705 710
CCT ACA GGG ATT TCA CGC ACG CCA GTG ATG ATC CGC CAA GAG AGC GAT 2430
Pro Thr Gly Ile Ser Arg Thr Pro Val Met Ile Arg Gln Glu Ser Asp 715 720 725 730
TTT GCA AGC TCT ATC ACT AAA ATC AAA AGT TTA GCC TTG ACT TCA AAA 2478
Phe Ala Ser Ser Ile Thr Lys Ile Lys Ser Leu Ala Leu Thr Ser Lys 735 740 745
TAT GGC GTT TTA GTG CCT ATC ACT TCT ATC GCC AAA ATT GAA GAA GTG 2526
Tyr Gly Val Leu Val Pro Ile Thr Ser Ile Ala Lys Ile Glu Glu Val 750 755 760
GAT GGC CCT GTT TCT GTT GTG CGT GAA AAT TCA ATG CGC ATG AGC GTG 2574
Asp Gly Pro Val Ser Val Val Arg Glu Asn Ser Met Arg Met Ser Val
765 770 775
GTT CGC AGT AAT GTG GTG GGG CGC GAT TTG AAA TCT TTT GTA GAA GAG 2622
Val Arg Ser Asn Val Val Gly Arg Asp Leu Lys Ser Phe Val Glu Glu
780 785 790
GCT AAA AAA GTG ATC GCT CAA AAC ATC AAA CTC CCT CCC AGC TAC TAT 2670
Ala Lys Lys Val Ile Ala Gln Asn Ile Lys Leu Pro Pro Ser Tyr Tyr 795 800 805 810
ATC ACT TAT GGG GGG CAG TTT GAA AAC CAG CAA CGG GCC AAT AAA AGG 2718
Ile Thr Tyr Gly Gly Gln Phe Glu Asn Gln Gln Arg Ala Asn Lys Arg 815 820 825
CTC TCC ACC GTT ATC CCT TTA AGC ATC TTA GCG ATT TTT TTC ATT CTT 2766
Leu Ser Thr Val Ile Pro Leu Ser Ile Leu Ala Ile Phe Phe Ile Leu 830 835 840
TTT TTC ACT TTT AAA AGC ATT CCT TTA GCC TTG CTC ATT CTT TTG AAT 2814
Phe Phe Thr Phe Lys Ser Ile Pro Leu Ala Leu Leu Ile Leu Leu Asn 845 850 855
ATC CCT TTT GCG GTT ACC GGA GGC CTT ATT GCG TTG TTT GCG GTC GGG 2862
Ile Pro Phe Ala Val Thr Gly Gly Leu Ile Ala Leu Phe Ala Val Gly
860 865 870
GAG TAT ATT TCA GTG CCA GCG AGC GTG GGC TTT ATC GCT CTT TTT GGG 2910
Glu Tyr Ile Ser Val Pro Ala Ser Val Gly Phe Ile Ala Leu Phe Gly 875 880 885 890
ATT GCG GTT TTA AAT GGC GTG GTG ATG ATA GGC TAT TTT AAA GAG CTT 2958
Ile Ala Val Leu Asn Gly Val Val Met Ile Gly Tyr Phe Lys Glu Leu 895 900 905
CTC TTG CAA GGG AAA AGC GTA GAA GAA TGC GTT TTA TTG GGC GCT AAA 3006
Leu Leu Gln Gly Lys Ser Val Glu Glu Cys Val Leu Leu Gly Ala Lys 910 915 920
AGG CGT TTG AGA CCG GTT TTA ATG ACC GCT TGC ATT GCC GGT TTG GGT 3054
Arg Arg Leu Arg Pro Val Leu Met Thr Ala Cys Ile Ala Gly Leu Gly 925 930 935
TTG CTC CCT TTA TTA TTT TCT CAT AGC GTG GGA TCA GAA GTC CAA AAA 3102
Leu Leu Pro Leu Leu Phe Ser His Ser Val Gly Ser Glu Val Gln Lys
940 945 950
CCT TTA GCG ATC GTG GTG CTT GGA GGC TTG GTT ACC TCA AGC GCT CTA 3150
Pro Leu Ala Ile Val Val Leu Gly Gly Leu Val Thr Ser Ser Ala Leu 955 960 965 970
ACC TTA CTC CTA CTG CCG CCA ATG TTT ATG CTC ATC GCT AAA AAG ATT 3198
Thr Leu Leu Leu Leu Pro Pro Met Phe Met Leu Ile Ala Lys Lys Ile 975 980 985
AAA ATC GTT TGAGTTAAAG GATTTCACAT GCTCGCTTTA GAAATTTATA TTGATATTT 3256
Lys Ile Val
GTTTGAAAGA CGCTTTAATA GATT 3280
(2) INFORMATION FOR SEQ ID NO: 108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1019 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...30 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
Met Leu Ala Ser Ile Ile Glu Phe Ser Leu Arg Gln Arg Val Ile Val -30 -25 -20 -15
Ile Val Gly Ala Ile Leu Ile Leu Phe Phe Gly Thr Tyr Ser Phe Ile
-10 -5 1
Asn Thr Pro Val Asp Ala Phe Pro Asp Ile Ser Pro Thr Gln Val Lys
5 10 15
Ile Ile Leu Lys Leu Pro Gly Ser Ser Pro Glu Glu Met Glu Asn Asn
20 25 30
Ile Val Arg Pro Leu Glu Leu Glu Leu Leu Gly Leu Lys Gly Gln Lys 35 40 45 50
Ser Leu Arg Ser Val Ser Lys Tyr Ser Ile Ser Asp Ile Thr Ile Asp
55 60 65
Phe Asp Asp Ser Val Asp Ile Tyr Leu Ala Arg Asn Ile Val Asn Glu
70 75 80
Arg Leu Ser Ser Val Met Lys Asp Leu Pro Val Gly Val Glu Gly Gly
85 90 95
Met Ala Pro Ile Val Thr Pro Leu Ser Asp Ile Phe Met Phe Thr Ile
100 105 110
Asp Gly Asn Ile Thr Glu Ile Glu Lys Arg Gln Leu Leu Asp Phe Val 115 120 125 130
Ile Arg Pro Gln Leu Arg Met Ile Ser Gly Val Ala Asp Val Asn Ser
135 140 145
Ile Gly Gly Phe Ser Arg Ala Phe Val Ile Val Pro Asp Phe Asn Asp
150 155 160
Met Ala Arg Leu Gly Val Ser Ile Ser Asp Leu Glu Ser Ala Val Arg
165 170 175
Val Asn Leu Arg Asn Ser Gly Ala Gly Arg Val Asp Arg Asp Gly Glu
180 185 190
Thr Phe Leu Val Lys Ile Gln Thr Ala Ser Leu Ser Leu Glu Asp Ile 195 200 205 210
Gly Lys Ile Thr Val Ser Thr Asn Leu Gly His Leu His Ile Lys Asp
215 220 225
Phe Ala Lys Val Ile Ser Gln Ser Arg Thr Arg Leu Gly Phe Val Thr
230 235 240
Lys Asp Gly Val Gly Glu Thr Thr Glu Gly Leu Val Leu Ser Leu Lys
245 250 255
Asp Ala Asn Thr Lys Glu Ile Ile Thr Gln Val Tyr Gln Lys Leu Glu
260 265 270
Glu Leu Lys Pro Phe Leu Pro Asn Gly Val Ser Ile Asn Val Phe Tyr 275 280 285 290
Asp Arg Ser Glu Phe Thr Gln Lys Ala Ile Ala Thr Val Ser Lys Thr
295 300 305
Leu Ile Glu Ala Val Val Leu Ile Ile Ile Thr Leu Phe Leu Phe Leu
310 315 320
Gly Asn Leu Arg Ala Ser Val Ala Val Gly Val Ile Leu Pro Leu Ser
325 330 335
Leu Ser Val Ala Phe Ile Phe Ile Lys Phe Ser Asp Leu Thr Leu Asn
340 345 350
Leu Met Ser Leu Gly Gly Leu Val Ile Ala Ile Gly Met Leu Ile Asp 355 360 365 370
Ser Ala Val Val Val Val Glu Asn Ala Phe Glu Lys Leu Ser Ala Asn
375 380 385
Thr Lys Thr Thr Lys Leu His Ala Ile Tyr Arg Ser Cys Lys Glu Ile
390 395 400
Ala Val Ser Val Val Ser Gly Val Val Ile Ile Ile Val Phe Phe Val
405 410 415
Pro Ile Leu Thr Leu Gln Gly Leu Glu Gly Lys Met Phe Arg Pro Leu
420 425 430
Ala Gln Ser Ile Val Tyr Ala Leu Leu Gly Thr Leu Val Leu Ser Ile 435 440 445 450
Thr Ile Ile Pro Val Val Ser Ser Leu Val Leu Lys Ala Thr Pro His
455 460 465
Ser Glu Thr Phe Leu Thr Arg Phe Leu Asn Arg Ile Tyr Ala Pro Leu
470 475 480
Leu Glu Phe Phe Val His Asn Pro Lys Lys Val Ile Leu Gly Ala Phe
485 490 495
Val Phe Leu Ile Ala Ser Leu Ser Leu Phe Pro Phe Val Gly Lys Asn
500 505 510
Phe Met Pro Val Leu Asp Glu Gly Asp Val Val Leu Ser Val Glu Thr 515 520 525 530
Thr Pro Ser Ile Ser Leu Asp Gln Ser Arg Asp Leu Met Leu Asn Ile
535 540 545
Glu Ser Ala Ile Lys Lys His Val Lys Glu Val Lys Ser Ile Val Ala
550 555 560
Arg Thr Gly Ser Asp Glu Leu Gly Leu Asp Leu Gly Gly Leu Asn Gln
565 570 575
Thr Asp Thr Phe Ile Ser Phe Ile Pro Lys Lys Glu Trp Ser Val Lys
580 585 590
Thr Lys Asp Glu Leu Leu Glu Lys Ile Met Asp Ser Leu Lys Asp Phe 595 600 605 610
Lys Gly Ile Asn Phe Ser Phe Thr Gln Pro Ile Glu Met Arg Ile Ser
615 620 625
Glu Met Leu Thr Gly Val Arg Gly Asp Leu Ala Val Lys Ile Phe Gly
630 635 640
Asp Gly Ile Ser Glu Leu Asn Glu Leu Ser Phe Gln Ile Ala Gln Ala
645 650 655
Leu Lys Gly Ile Lys Gly Ser Ser Glu Val Leu Thr Thr Leu Asn Glu 660 665 670
Gly Val Asn Tyr Leu Tyr Val Thr Pro Asn Lys Glu Ser Met Ala Asp 675 680 685 690
Val Gly Ile Thr Ser Asp Glu Phe Ser Lys Phe Leu Lys Ser Ala Leu
695 700 705
Glu Gly Leu Val Val Asp Val Ile Pro Thr Gly Ile Ser Arg Thr Pro
710 715 720
Val Met Ile Arg Gln Glu Ser Asp Phe Ala Ser Ser Ile Thr Lys Ile
725 730 735
Lys Ser Leu Ala Leu Thr Ser Lys Tyr Gly Val Leu Val Pro Ile Thr
740 745 750
Ser Ile Ala Lys Ile Glu Glu Val Asp Gly Pro Val Ser Val Val Arg 755 760 765 770
Glu Asn Ser Met Arg Met Ser Val Val Arg Ser Asn Val Val Gly Arg
775 780 785
Asp Leu Lys Ser Phe Val Glu Glu Ala Lys Lys Val Ile Ala Gln Asn
790 795 800
Ile Lys Leu Pro Pro Ser Tyr Tyr Ile Thr Tyr Gly Gly Gln Phe Glu
805 810 815
Asn Gln Gln Arg Ala Asn Lys Arg Leu Ser Thr Val Ile Pro Leu Ser
820 825 830
Ile Leu Ala Ile Phe Phe Ile Leu Phe Phe Thr Phe Lys Ser Ile Pro 835 840 845 850
Leu Ala Leu Leu Ile Leu Leu Asn Ile Pro Phe Ala Val Thr Gly Gly
855 860 865
Leu Ile Ala Leu Phe Ala Val Gly Glu Tyr Ile Ser Val Pro Ala Ser
870 875 880
Val Gly Phe Ile Ala Leu Phe Gly Ile Ala Val Leu Asn Gly Val Val
885 890 895
Met Ile Gly Tyr Phe Lys Glu Leu Leu Leu Gln Gly Lys Ser Val Glu
900 905 910
Glu Cys Val Leu Leu Gly Ala Lys Arg Arg Leu Arg Pro Val Leu Met 915 920 925 930
Thr Ala Cys Ile Ala Gly Leu Gly Leu Leu Pro Leu Leu Phe Ser His
935 940 945
Ser Val Gly Ser Glu Val Gln Lys Pro Leu Ala Ile Val Val Leu Gly
950 955 960
Gly Leu Val Thr Ser Ser Ala Leu Thr Leu Leu Leu Leu Pro Pro Met
965 970 975
Phe Met Leu Ile Ala Lys Lys Ile Lys Ile Val 980 985
(2) INFORMATION FOR SEQ ID NO: 109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 898 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 86...835 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 86...161 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
GCATAAAATA AACAAACATT AAGTAAGGCT TATCAATATT TGATTACAAT TATAAGGGTT 60
ACATTTTTTT AATAGGAGAT ATACC ATG CTA GGA AAC GTT AAA AAA ACC CTT 112
Met Leu Gly Asn Val Lys Lys Thr Leu
-25 -20
TTT GGG GTC TTG TGT TTG GGC ACG TTG TGT TTG AGA GGG TTA ATG GCA 160
Phe Gly Val Leu Cys Leu Gly Thr Leu Cys Leu Arg Gly Leu Met Ala -15 -10 -5
GAG CCA GAC GCT AAA GAG CTT GTT AAT TTA GGC ATA GAG AGC GCG AAG 208
Glu Pro Asp Ala Lys Glu Leu Val Asn Leu Gly Ile Glu Ser Ala Lys 1 5 10 15
AAG CAA GAT TTC GCT CAA GCT AAA ACG CAT TTT GAA AAA GCT TGT GAG 256
Lys Gln Asp Phe Ala Gln Ala Lys Thr His Phe Glu Lys Ala Cys Glu
20 25 30
TTA AAA AAT GGC TTT GGA TGT GTT TTT TTA GGG GCG TTC TAT GAA GAA 304
Leu Lys Asn Gly Phe Gly Cys Val Phe Leu Gly Ala Phe Tyr Glu Glu 35 40 45
GGG AAA GGA GTG GGA AAA GAC TTG AAA AAA GCC ATC CAA TTT TAC ACT 352
Gly Lys Gly Val Gly Lys Asp Leu Lys Lys Ala Ile Gln Phe Tyr Thr 50 55 60
AAA GGT TGT GAA TTA AAT GAT GGT TAT GGG TGT AAC CTG CTA GGA AAT 400
Lys Gly Cys Glu Leu Asn Asp Gly Tyr Gly Cys Asn Leu Leu Gly Asn 65 70 75 80
TTA TAC TAT AAC GGA CAA GGC GTG TCA AAA GAC GCT AAA AAA GCC TCA 448
Leu Tyr Tyr Asn Gly Gln Gly Val Ser Lys Asp Ala Lys Lys Ala Ser 85 90 95
CAA TAC TAC TCT AAA GCT TGC GAC TTA AAC CAT GCT GAA GGG TGT ATG 496
Gln Tyr Tyr Ser Lys Ala Cys Asp Leu Asn His Ala Glu Gly Cys Met
100 105 110
GTA TTA GGA AGC TTA CAC CAT TAT GGC GTA GGC ACG CCT AAG GAT TTA 544
Val Leu Gly Ser Leu His His Tyr Gly Val Gly Thr Pro Lys Asp Leu 115 120 125
AGA AAG GCT CTT GAT TTG TAT GAA AAA GCT TGC GAT TTA AAA GAC AGC 592
Arg Lys Ala Leu Asp Leu Tyr Glu Lys Ala Cys Asp Leu Lys Asp Ser 130 135 140
CCA GGG TGT ATT AAT GCA GGA TAT ATA TAT AGT GTA ACA AAG AAT TTT 640
Pro Gly Cys Ile Asn Ala Gly Tyr Ile Tyr Ser Val Thr Lys Asn Phe 145 150 155 160
AAG GAG GCT ATC GTT CGT TAT TCT AAA GCA TGC GAA TTA AAA GAT GGT 688
Lys Glu Ala Ile Val Arg Tyr Ser Lys Ala Cys Glu Leu Lys Asp Gly 165 170 175
AGG GGG TGT TAT AAT TTA GGG GTT ATG CAA TAC AAC GCT CAA GGT ACA 736
Arg Gly Cys Tyr Asn Leu Gly Val Met Gln Tyr Asn Ala Gln Gly Thr 180 185 190
GCA AAG GAC GAA AAG CAA GCG GTA GAA AAC TTT AAA AAA GGC TGC AAA 784
Ala Lys Asp Glu Lys Gln Ala Val Glu Asn Phe Lys Lys Gly Cys Lys
195 200 205
TCA AGC GTT AAA GAA GCA TGC GAC GCT CTC AAG GAA TTA AAA ATA GAA 832
Ser Ser Val Lys Glu Ala Cys Asp Ala Leu Lys Glu Leu Lys Ile Glu 210 215 220
CTT TAATTTCAAT GAAGTTAGCT AAACGCTGCG TTTAGCTGGC TTTTACGCTT TTTATA 891
Leu
225
TTTTAAG 898
(2) INFORMATION FOR SEQ ID NO: 110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 250 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...25 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
Met Leu Gly Asn Val Lys Lys Thr Leu Phe Gly Val Leu Cys Leu Gly -25 -20 -15 -10
Thr Leu Cys Leu Arg Gly Leu Met Ala Glu Pro Asp Ala Lys Glu Leu
-5 1 5
Val Asn Leu Gly Ile Glu Ser Ala Lys Lys Gln Asp Phe Ala Gln Ala
10 15 20
Lys Thr His Phe Glu Lys Ala Cys Glu Leu Lys Asn Gly Phe Gly Cys
25 30 35
Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly Lys Gly Val Gly Lys Asp 40 45 50 55
Leu Lys Lys Ala Ile Gln Phe Tyr Thr Lys Gly Cys Glu Leu Asn Asp
60 65 70
Gly Tyr Gly Cys Asn Leu Leu Gly Asn Leu Tyr Tyr Asn Gly Gln Gly
75 80 85
Val Ser Lys Asp Ala Lys Lys Ala Ser Gln Tyr Tyr Ser Lys Ala Cys
90 95 100
Asp Leu Asn His Ala Glu Gly Cys Met Val Leu Gly Ser Leu His His
105 110 115
Tyr Gly Val Gly Thr Pro Lys Asp Leu Arg Lys Ala Leu Asp Leu Tyr 120 125 130 135
Glu Lys Ala Cys Asp Leu Lys Asp Ser Pro Gly Cys Ile Asn Ala Gly
140 145 150
Tyr Ile Tyr Ser Val Thr Lys Asn Phe Lys Glu Ala Ile Val Arg Tyr
155 160 165
Ser Lys Ala Cys Glu Leu Lys Asp Gly Arg Gly Cys Tyr Asn Leu Gly
170 175 180
Val Met Gln Tyr Asn Ala Gln Gly Thr Ala Lys Asp Glu Lys Gln Ala
185 190 195
Val Glu Asn Phe Lys Lys Gly Cys Lys Ser Ser Val Lys Glu Ala Cys 200 205 210 215
Asp Ala Leu Lys Glu Leu Lys Ile Glu Leu 220 225
(2) INFORMATION FOR SEQ ID NO: 111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1079 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 169...834 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 169...289 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
CAAAAAAAAA AAAAAACAAT TTCAGTTTCT TATTAGCTAG GTTTGATTAA AATGAAAAGC 60
TTTTATGTGT TTAAACTTCA TTGTCTTAAA ACTTTTAAGA GCAATTTTAA AATTCGTTGG 120
CGTATAATAT CCGTTTTGAA TGAACTACTA AAAAAAGGGT TTTAAATA ATG GCT GAA 177
Met Ala Glu -40
AAT TCT TTC AAA AAT GTT TCC ACA CAA CCC AAA GTA TTT TTC TTA TTG 225
Asn Ser Phe Lys Asn Val Ser Thr Gln Pro Lys Val Phe Phe Leu Leu -35 -30 -25
CCA GCT AAA ACC CTG TTT CTT TTA GGA GGC GTT TTT AGC GCG TTT TTT 273
Pro Ala Lys Thr Leu Phe Leu Leu Gly Gly Val Phe Ser Ala Phe Phe
-20 -15 -10
ATC CTT ATT GCT GGC TTG GTT TTT TTT GAT TAT GCT CAT TTG ATG GAC 321
Ile Leu Ile Ala Gly Leu Val Phe Phe Asp Tyr Ala His Leu Met Asp -5 1 5 10
AAT GCC ATT TTT AAT TTT GCG CGT TCA ACC CCC TTT AAT TCC AGC CCT 369
Asn Ala Ile Phe Asn Phe Ala Arg Ser Thr Pro Phe Asn Ser Ser Pro
15 20 25
ATT TTA ACT CTA ATC CTC CAA AAT ATC GCT AAT TTA GGC TCT TCT CAA 417
Ile Leu Thr Leu Ile Leu Gln Asn Ile Ala Asn Leu Gly Ser Ser Gln 30 35 40
TTC GTG TTG CCT TTG AGT TTG TTG GTG GGG GTG TTT TTA AGC CTT TAT 465
Phe Val Leu Pro Leu Ser Leu Leu Val Gly Val Phe Leu Ser Leu Tyr
45 50 55
CGC AGA AAC TTA GTG CTT GGG GTG TGG TTT GTG TTA AGC GTG ATC TTG 513
Arg Arg Asn Leu Val Leu Gly Val Trp Phe Val Leu Ser Val Ile Leu
60 65 70 75
TTT GAA GCC CTT TTA GAA TCT TTA AAA CAC CTT TTT GCA TAT TCC ATT 561
Phe Glu Ala Leu Leu Glu Ser Leu Lys His Leu Phe Ala Tyr Ser Ile 80 85 90
CAG TGG CTT TCG CGC AGC GCT AAT TTC CCT AAC GCT ACT GCG CTT TCT 609
Gln Trp Leu Ser Arg Ser Ala Asn Phe Pro Asn Ala Thr Ala Leu Ser 95 100 105
TTA GTG CTA TTT TAT GGG TTG CTT ATT TTA TTG ATA CCC CAT TTA ATC 657
Leu Val Leu Phe Tyr Gly Leu Leu Ile Leu Leu Ile Pro His Leu Ile 110 115 120
ACG CAT CAA ACG CTT AAA AAT GTT CTT TTT TAT AGC TTA TTT GGT TTG 705
Thr His Gln Thr Leu Lys Asn Val Leu Phe Tyr Ser Leu Phe Gly Leu
125 130 135
ATT TTT TTA ATA GGG TTA GCA CTG ATT GTT TTA GGG GTT TCT TTC AGT 753
Ile Phe Leu Ile Gly Leu Ala Leu Ile Val Leu Gly Val Ser Phe Ser
140 145 150 155
AGT GTT TTA GGA GGG TTT TGT TTA GGG GCG TTA GGG GCT TGT TTT TCC 801
Ser Val Leu Gly Gly Phe Cys Leu Gly Ala Leu Gly Ala Cys Phe Ser 160 165 170
ATA GGG ATT TAT TTG AGC GTG TTT CAA AAG ATC TAAACGAACG GCTTAAAAGA 854
Ile Gly Ile Tyr Leu Ser Val Phe Gln Lys Ile 175 180
ATGAAAATTT TATCAAGGTT TTAATATTGG ATTTAAAGGT ATTATTGCAA CGGATTGTTG 914
ATTTTTTCAT CAAGCTCAAT AAAAAGCAAA AAATCGCCCT GATTGCAGCT GGGGTTTTGA 974
TCACGGCTTT GCTTGTGTTT TTATTGCTCT ATCCCTTTAA AGAAAAAGAC TACACGCAAG 1034
GGGGTTATGG GGTTTTATTT GAAGGTTTAG ACTCTAGCGA TAACG 1079
(2) INFORMATION FOR SEQ ID NO: 112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 222 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...40 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
Met Ala Glu Asn Ser Phe Lys Asn Val Ser Thr Gln Pro Lys Val Phe -40 -35 -30 -25
Phe Leu Leu Pro Ala Lys Thr Leu Phe Leu Leu Gly Gly Val Phe Ser
-20 -15 -10
Ala Phe Phe Ile Leu Ile Ala Gly Leu Val Phe Phe Asp Tyr Ala His
-5 1 5
Leu Met Asp Asn Ala Ile Phe Asn Phe Ala Arg Ser Thr Pro Phe Asn
10 15 20
Ser Ser Pro Ile Leu Thr Leu Ile Leu Gln Asn Ile Ala Asn Leu Gly 25 30 35 40
Ser Ser Gln Phe Val Leu Pro Leu Ser Leu Leu Val Gly Val Phe Leu
45 50 55
Ser Leu Tyr Arg Arg Asn Leu Val Leu Gly Val Trp Phe Val Leu Ser
60 65 70
Val Ile Leu Phe Glu Ala Leu Leu Glu Ser Leu Lys His Leu Phe Ala
75 80 85
Tyr Ser Ile Gln Trp Leu Ser Arg Ser Ala Asn Phe Pro Asn Ala Thr
90 95 100
Ala Leu Ser Leu Val Leu Phe Tyr Gly Leu Leu Ile Leu Leu Ile Pro 105 110 115 120
His Leu Ile Thr His Gln Thr Leu Lys Asn Val Leu Phe Tyr Ser Leu
125 130 135
Phe Gly Leu Ile Phe Leu Ile Gly Leu Ala Leu Ile Val Leu Gly Val
140 145 150
Ser Phe Ser Ser Val Leu Gly Gly Phe Cys Leu Gly Ala Leu Gly Ala
155 160 165
Cys Phe Ser Ile Gly Ile Tyr Leu Ser Val Phe Gln Lys Ile 170 175 180
(2) INFORMATION FOR SEQ ID NO: 113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 962 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 97...912 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 97...217 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:
TTTTATTGAA TGTGTTGTAA TGTTTTTAAG GTATAATAAA CTCTTTTTAA GTCAAGCAAT 60
AAAGTTTGCA ACCTGATGAG AGTAATAATA GAGTTT ATG CTG ATT TCA TTA AAA 114
Met Leu Ile Ser Leu Lys -40 -35
ACA TTC CTA AAA ATA TTA TTG AAA ATA TTC CTA AAA ACC TTC CAA AAG 162
Thr Phe Leu Lys Ile Leu Leu Lys Ile Phe Leu Lys Thr Phe Gln Lys -30 -25 -20
ATT TGG GTA GTT TGC GTT ATT ATT TGG GGG TTA GGC TGT AGT TTT TTA 210
Ile Trp Val Val Cys Val Ile Ile Trp Gly Leu Gly Cys Ser Phe Leu -15 -10 -5
AAC GCT AAC AGC ATT CAA TTA GAA GAA ACG CTC AGA CGA AGC CCT AAA 258
Asn Ala Asn Ser Ile Gln Leu Glu Glu Thr Leu Arg Arg Ser Pro Lys 1 5 10
AAT CTT ATT TGG CAA CAC TTT AAA AAG AAG TTT AAA AAG AGC AAC ACG 306
Asn Leu Ile Trp Gln His Phe Lys Lys Lys Phe Lys Lys Ser Asn Thr 15 20 25 30
ATC CCT TAT GCC CCA AAT AGC CGT TGG AAA TAT TTA GGC ACG AGC ATA 354
Ile Pro Tyr Ala Pro Asn Ser Arg Trp Lys Tyr Leu Gly Thr Ser Ile 35 40 45
GGG ATT TTA GGC GTG TCT TTG GTG ATA GGG ATT GTG GGG CTG TAT CTC 402
Gly Ile Leu Gly Val Ser Leu Val Ile Gly Ile Val Gly Leu Tyr Leu 50 55 60
ATG CCA GAG AGC GTA ACG AAT TGG GAT AAA GAA AAG TTT GGG ATC AAA 450
Met Pro Glu Ser Val Thr Asn Trp Asp Lys Glu Lys Phe Gly Ile Lys 65 70 75
AGT TGG TTT GAA AAT GTC CGC ATG GGG CCA AAA CTG GAC AAT GAT AGT 498
Ser Trp Phe Glu Asn Val Arg Met Gly Pro Lys Leu Asp Asn Asp Ser 80 85 90
TTT ATT TTT AAT GAA ATT TTG CAC CCT TAT TTT GGG GCT ATG TAT TAT 546
Phe Ile Phe Asn Glu Ile Leu His Pro Tyr Phe Gly Ala Met Tyr Tyr 95 100 105 110
ATG CAA CCG CGC ATG GCT GGA TTT AGC TGG ATG GCA TCA GCG TTT TTT 594
Met Gln Pro Arg Met Ala Gly Phe Ser Trp Met Ala Ser Ala Phe Phe 115 120 125
TCT TTT ATC ACT TCC ACG CTT TTT TGG GAA TAT GGC TTG GAA GCG TTT 642
Ser Phe Ile Thr Ser Thr Leu Phe Trp Glu Tyr Gly Leu Glu Ala Phe 130 135 140
GTG GAA GTG CCT AGC TGG CAG GAT TTA GTG ATC ACG CCT TTA TTA GGC 690
Val Glu Val Pro Ser Trp Gln Asp Leu Val Ile Thr Pro Leu Leu Gly 145 150 155
TCC ATT TTA GGG GAG GGG TTT TAT CAG CTC ACG CGC TAT ATC CAA CGC 738
Ser Ile Leu Gly Glu Gly Phe Tyr Gln Leu Thr Arg Tyr Ile Gln Arg 160 165 170
AAT GAA GGC AAG CTT TTT GGC TCT TTA TTT TTA GGG CGT TTA GTC ATC 786
Asn Glu Gly Lys Leu Phe Gly Ser Leu Phe Leu Gly Arg Leu Val Ile 175 180 185 190
GCT CTT ATG GAT CCT ATC GGT TTT ATC ATT AGG GAT TTA GGA CTT GGG 834
Ala Leu Met Asp Pro Ile Gly Phe Ile Ile Arg Asp Leu Gly Leu Gly 195 200 205
GAA GCT TTA GGG ATT TAT AAT AAA CAC GAA ATC CGT TCC AGC TTA AGC 882
Glu Ala Leu Gly Ile Tyr Asn Lys His Glu Ile Arg Ser Ser Leu Ser 210 215 220
CCC AAT GGT TTG AAT TTG ACT TAC AAA TTT TAAGAGCTTA AAATTTAAGA AAA 935
Pro Asn Gly Leu Asn Leu Thr Tyr Lys Phe 225 230
TTATAAAGAG TTTTGATAGA ATACCTT 962
(2) INFORMATION FOR SEQ ID NO: 114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 272 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...40 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:
Met Leu Ile Ser Leu Lys Thr Phe Leu Lys Ile Leu Leu Lys Ile Phe -40 -35 -30 -25
Leu Lys Thr Phe Gln Lys Ile Trp Val Val Cys Val Ile Ile Trp Gly
-20 -15 -10
Leu Gly Cys Ser Phe Leu Asn Ala Asn Ser Ile Gln Leu Glu Glu Thr
-5 1 5
Leu Arg Arg Ser Pro Lys Asn Leu Ile Trp Gln His Phe Lys Lys Lys
10 15 20
Phe Lys Lys Ser Asn Thr Ile Pro Tyr Ala Pro Asn Ser Arg Trp Lys 25 30 35 40
Tyr Leu Gly Thr Ser Ile Gly Ile Leu Gly Val Ser Leu Val Ile Gly
45 50 55
Ile Val Gly Leu Tyr Leu Met Pro Glu Ser Val Thr Asn Trp Asp Lys
60 65 70
Glu Lys Phe Gly Ile Lys Ser Trp Phe Glu Asn Val Arg Met Gly Pro
75 80 85
Lys Leu Asp Asn Asp Ser Phe Ile Phe Asn Glu Ile Leu His Pro Tyr
90 95 100
Phe Gly Ala Met Tyr Tyr Met Gln Pro Arg Met Ala Gly Phe Ser Trp 105 110 115 120
Met Ala Ser Ala Phe Phe Ser Phe Ile Thr Ser Thr Leu Phe Trp Glu
125 130 135
Tyr Gly Leu Glu Ala Phe Val Glu Val Pro Ser Trp Gln Asp Leu Val
140 145 150
Ile Thr Pro Leu Leu Gly Ser Ile Leu Gly Glu Gly Phe Tyr Gln Leu
155 160 165
Thr Arg Tyr Ile Gln Arg Asn Glu Gly Lys Leu Phe Gly Ser Leu Phe
170 175 180
Leu Gly Arg Leu Val Ile Ala Leu Met Asp Pro Ile Gly Phe Ile Ile 185 190 195 200
Arg Asp Leu Gly Leu Gly Glu Ala Leu Gly Ile Tyr Asn Lys His Glu
205 210 215
Ile Arg Ser Ser Leu Ser Pro Asn Gly Leu Asn Leu Thr Tyr Lys Phe 220 225 230
(2) INFORMATION FOR SEQ ID NO: 115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 216...1202
(D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 216...273 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:
AAATTAAACG AGTTTGGTTT AGAGCCGTAT TTAGGGTTTT TGCACCCCCA TTTAACCAAT 60
GATTTTGAAA ATAACCCTAA TGAGCAATCA GCGCTCTTTG TCTTGCCCCT TTCAGCGGTT 120
AGCGCTCTTA ATGTGCATGC ACTCAAATTT GTGTTGTTGG AAGCGTTACC CTAAAACGCT 180
ATTTTTAAAA TAATCCATTA AAATAAAGGC GAGGA ATG AAA AGA TTT GTT TTG 233
Met Lys Arg Phe Val Leu -15
TTT TTA TTG TTC ATG TGC GTT TGC GTT CAA GCT TAC GCC GAG CAA GAT 281
Phe Leu Leu Phe Met Cys Val Cys Val Gln Ala Tyr Ala Glu Gln Asp -10 -5 1
TAC TTT TTT AGG GAT TTT AAA TCT AGA GAT TTG CCC CAA AAA CTC CAT 329
Tyr Phe Phe Arg Asp Phe Lys Ser Arg Asp Leu Pro Gln Lys Leu His 5 10 15
CTT GAT AAA AAG CTC TCC CAA ACA ATA CAG CCA TGC ATG CAA CTT AAC 377
Leu Asp Lys Lys Leu Ser Gln Thr Ile Gln Pro Cys Met Gln Leu Asn 20 25 30 35
GCA TCA AAA CAC TAC ACT TCT ACC GGG GTT AGA GAG CCT GAT AAA TGC 425
Ala Ser Lys His Tyr Thr Ser Thr Gly Val Arg Glu Pro Asp Lys Cys 40 45 50
ACA AAG AGT TTT AAA AAA TCC GCT CTC ATG TCC TAT GAC TTA GCG CTA 473
Thr Lys Ser Phe Lys Lys Ser Ala Leu Met Ser Tyr Asp Leu Ala Leu 55 60 65
GGT TAT TTG GTG AGT AAG AAT AAG CAA TAC GGC TTA AAG GCT ATA GAA 521
Gly Tyr Leu Val Ser Lys Asn Lys Gln Tyr Gly Leu Lys Ala Ile Glu 70 75 80
ATT TTA AAC GCT TGG GCT AAA GAG CTT CAA AGC GTG GAT ACT TAT CAG 569
Ile Leu Asn Ala Trp Ala Lys Glu Leu Gln Ser Val Asp Thr Tyr Gln 85 90 95
AGC GAG GAT AAT ATC AAT TTT TAC ATG CCT TAT ATG AAC ATG GCT TAT 617
Ser Glu Asp Asn Ile Asn Phe Tyr Met Pro Tyr Met Asn Met Ala Tyr 100 105 110 115
TGG TTT GTC AAA AAG GCG TTT CCT AGC CCA GAA TAT GAA GAT TTC ATT 665
Trp Phe Val Lys Lys Ala Phe Pro Ser Pro Glu Tyr Glu Asp Phe Ile 120 125 130
AAG CGG ATG CGC CAG TAT TCT CAA TCA GCT CTT AAC ACT AAC CAT GGG 713
Lys Arg Met Arg Gln Tyr Ser Gln Ser Ala Leu Asn Thr Asn His Gly 135 140 145
GCG TGG GGC ATT CTT TTT GAT GTG AGT TCT GCG CTA GCG TTA GAC GAT 761
Ala Trp Gly Ile Leu Phe Asp Val Ser Ser Ala Leu Ala Leu Asp Asp 150 155 160
AAT GCC CTT TTG CAC AAT AGC GCT AAT CGG TGG CAG GAG TGG GTG TTT 809
Asn Ala Leu Leu His Asn Ser Ala Asn Arg Trp Gln Glu Trp Val Phe 165 170 175
AAA GCC ATA GAT GAG AAT GGG GTT ATT GNT AGC GCG ATC ACT AGG AGC 857
Lys Ala Ile Asp Glu Asn Gly Val Ile Xaa Ser Ala Ile Thr Arg Ser 180 185 190 195
GAT ACG AGC GAT TAT CAT GGC GGC CCT ACA AAG GGC ATT AAG GGG ATA 905
Asp Thr Ser Asp Tyr His Gly Gly Pro Thr Lys Gly Ile Lys Gly Ile 200 205 210
GCT TAT ACC AAT TTC GCG CTT CTT GCG CTA ACC ATA TCA GGC GAA TTG 953
Ala Tyr Thr Asn Phe Ala Leu Leu Ala Leu Thr Ile Ser Gly Glu Leu 215 220 225
CTT TTT GAG AAC GGG TAT GAT TTG TGG GGT AGT GGA GCT GGG AAA AGG 1001
Leu Phe Glu Asn Gly Tyr Asp Leu Trp Gly Ser Gly Ala Gly Lys Arg 230 235 240
CTC TCT GTG GCG TAT AAC AAA GTT GCA ACA TGG ATT TTA AAC CCT GAA 1049
Leu Ser Val Ala Tyr Asn Lys Val Ala Thr Trp Ile Leu Asn Pro Glu 245 250 255
ACT TTC CCT TAT TTC CAG CCT AAC CTT ATC GGG GTG CAT AAC AAC GCC 1097
Thr Phe Pro Tyr Phe Gln Pro Asn Leu Ile Gly Val His Asn Asn Ala 260 265 270 275
TAT TTC ATT ATT TTA GCC AAG CAT TAT TCT AGC CCT AGT GCA AAT GAG 1145
Tyr Phe Ile Ile Leu Ala Lys His Tyr Ser Ser Pro Ser Ala Asn Glu 280 285 290
CTT TTA AAG CAA GGC GAT TTA CAC GAA GAT GGT TTC AGG CTG AAA CTC 1193
Leu Leu Lys Gln Gly Asp Leu His Glu Asp Gly Phe Arg Leu Lys Leu 295 300 305
CGA TCG CCA TGAATTTTTC TGTATCCAAG GTTAGCCTTA AGGATGGCCA TGCGCTTTA 1251
Arg Ser Pro 310
ACCTTTTGAT GAATGGTTCA GAAAGTTTGT TTCAGTCAGC ATTATTTACA AAAAGAGTTT 1311
AAAATAAACG CAATTGTATC TCTTGAGTCG TCTTTAGAGT GCAAATGATT ATCAAAATGA 1371
ATCGTTTTAG TTGTAAGCGT GCTTATTTAC ACTAAAATAA TAAGCGTTAT T 1422
(2) INFORMATION FOR SEQ ID NO: 116:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...19 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
Met Lys Arg Phe Val Leu Phe Leu Leu Phe Met Cys Val Cys Val Gln
-15 -10 -5
Ala Tyr Ala Glu Gln Asp Tyr Phe Phe Arg Asp Phe Lys Ser Arg Asp
1 5 10
Leu Pro Gln Lys Leu His Leu Asp Lys Lys Leu Ser Gln Thr Ile Gln
15 20 25
Pro Cys Met Gln Leu Asn Ala Ser Lys His Tyr Thr Ser Thr Gly Val 30 35 40 45
Arg Glu Pro Asp Lys Cys Thr Lys Ser Phe Lys Lys Ser Ala Leu Met
50 55 60
Ser Tyr Asp Leu Ala Leu Gly Tyr Leu Val Ser Lys Asn Lys Gln Tyr
65 70 75
Gly Leu Lys Ala Ile Glu Ile Leu Asn Ala Trp Ala Lys Glu Leu Gln
80 85 90
Ser Val Asp Thr Tyr Gln Ser Glu Asp Asn Ile Asn Phe Tyr Met Pro
95 100 105
Tyr Met Asn Met Ala Tyr Trp Phe Val Lys Lys Ala Phe Pro Ser Pro 110 115 120 125
Glu Tyr Glu Asp Phe Ile Lys Arg Met Arg Gln Tyr Ser Gln Ser Ala
130 135 140
Leu Asn Thr Asn His Gly Ala Trp Gly Ile Leu Phe Asp Val Ser Ser
145 150 155
Ala Leu Ala Leu Asp Asp Asn Ala Leu Leu His Asn Ser Ala Asn Arg
160 165 170
Trp Gln Glu Trp Val Phe Lys Ala Ile Asp Glu Asn Gly Val Ile Xaa
175 180 185
Ser Ala Ile Thr Arg Ser Asp Thr Ser Asp Tyr His Gly Gly Pro Thr 190 195 200 205
Lys Gly Ile Lys Gly Ile Ala Tyr Thr Asn Phe Ala Leu Leu Ala Leu
210 215 220
Thr Ile Ser Gly Glu Leu Leu Phe Glu Asn Gly Tyr Asp Leu Trp Gly
225 230 235
Ser Gly Ala Gly Lys Arg Leu Ser Val Ala Tyr Asn Lys Val Ala Thr
240 245 250
Trp Ile Leu Asn Pro Glu Thr Phe Pro Tyr Phe Gln Pro Asn Leu Ile
255 260 265
Gly Val His Asn Asn Ala Tyr Phe Ile Ile Leu Ala Lys His Tyr Ser 270 275 280 285
Ser Pro Ser Ala Asn Glu Leu Leu Lys Gln Gly Asp Leu His Glu Asp
290 295 300
Gly Phe Arg Leu Lys Leu Arg Ser Pro 305 310
(2) INFORMATION FOR SEQ ID NO: 117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 157...987 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 157...226 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:
AGCGGTAAAA TCGCTGAAGA AAACAACGCT AAAGAATTTT TTAACCACCC GAAATCTCAA 60
AGAGCGCAAA AATTTTTAGA AACTTTCCAT TTTTTAGGGA GCTGTTAAAT AAAGTTTGCT 120
AAAAAGATGA TTCTAATTTC AAAAAAAGGT GTTTTT ATG AAA ACA AAC GGG CTT 174
Met Lys Thr Asn Gly Leu -20
TTT AAA ATG TGG GGG CTG TTT TTA GTT TTA ATC GCT TTA GTC TTT AAT 222
Phe Lys Met Trp Gly Leu Phe Leu Val Leu Ile Ala Leu Val Phe Asn -15 -10 -5
GCA TGT TCT GAT AGC CAT AAA GAA AAA AAG GAC GCT TTA GAA GTC ATT 270
Ala Cys Ser Asp Ser His Lys Glu Lys Lys Asp Ala Leu Glu Val Ile 1 5 10 15
AAA CAA AGA GGG GTT TTA AAA GTG GGG GTT TTT AGC GAT AAG CCT CCT 318
Lys Gln Arg Gly Val Leu Lys Val Gly Val Phe Ser Asp Lys Pro Pro 20 25 30
TTT GGC TCT GTG GAT TCT AAA GGG AAA TAT CAA GGC TAT GAT GTA GTT 366
Phe Gly Ser Val Asp Ser Lys Gly Lys Tyr Gln Gly Tyr Asp Val Val 35 40 45
ATT GCT AAA CGC ATG GCT CTT GAT TTA TTG GGC GAT GAA AAT AAG ATT 414
Ile Ala Lys Arg Met Ala Leu Asp Leu Leu Gly Asp Glu Asn Lys Ile 50 55 60
GAG TTT ATT CCT GTA GAA GCT TCA GCT AGG GTG GAA TTT TTA AAA GCC 462
Glu Phe Ile Pro Val Glu Ala Ser Ala Arg Val Glu Phe Leu Lys Ala 65 70 75
AAT AAA GTG GAT ATT ATC ATG GCT AAT TTC ACG CGC ACT AAA GAA AGA 510
Asn Lys Val Asp Ile Ile Met Ala Asn Phe Thr Arg Thr Lys Glu Arg 80 85 90 95
GAA AAA GTC GTG GAT TTC GCT AAG CCG TAT ATG AAA GTC GCT TTA GGG 558
Glu Lys Val Val Asp Phe Ala Lys Pro Tyr Met Lys Val Ala Leu Gly 100 105 110
GTG GTT TCT AAA GAT GGG GTC ATT AAA AAT ATA GAA GAG TTG AAA GAT 606
Val Val Ser Lys Asp Gly Val Ile Lys Asn Ile Glu Glu Leu Lys Asp 115 120 125
AAA GAG TTG ATT GTG AAT AAA GGC ACG ACA GCG GAT TTT TAT TTC ACT 654
Lys Glu Leu Ile Val Asn Lys Gly Thr Thr Ala Asp Phe Tyr Phe Thr 130 135 140
AAA AAT TAC CCC AAT ATC AAG CTT TTG AAA TTT GAG CAA AAT ACA GAG 702
Lys Asn Tyr Pro Asn Ile Lys Leu Leu Lys Phe Glu Gln Asn Thr Glu 145 150 155
ACT TTT TTA GCC CTT TTA AAC AAT AAG GCT ACC GCT CTA GCC CAT GAC 750
Thr Phe Leu Ala Leu Leu Asn Asn Lys Ala Thr Ala Leu Ala His Asp 160 165 170 175
AAC ACT TTA TTG CTC GCT TGG ACG AAA CAA CAC CCT GAA TTT AAA TTA 798
Asn Thr Leu Leu Leu Ala Trp Thr Lys Gln His Pro Glu Phe Lys Leu 180 185 190
GGC ATT ACA AGC CTT GGC GAT AAG GAT GTG ATC GCT CCA GCG ATT AAA 846
Gly Ile Thr Ser Leu Gly Asp Lys Asp Val Ile Ala Pro Ala Ile Lys 195 200 205
AAA GGC AAC CCC AAG CTT TTA GAA TGG TTG AAT AAC GAA ATA GAT TCC 894
Lys Gly Asn Pro Lys Leu Leu Glu Trp Leu Asn Asn Glu Ile Asp Ser 210 215 220
CTC ATT TCT AGC GAC TTC TTA AAA GAA GCT TAT CAA GAG ACT TTA GCA 942
Leu Ile Ser Ser Asp Phe Leu Lys Glu Ala Tyr Gln Glu Thr Leu Ala 225 230 235
CCT GTT TAT GGC GAT GAA ATC AAA CCG GAA GAA ATT ATT TTT GAA TGATT 992
Pro Val Tyr Gly Asp Glu Ile Lys Pro Glu Glu Ile Ile Phe Glu 240 245 250
TCTTTAGGCT TTGAATTCTT GACAGGGTGC GTTTTTATTG CTAAATTAGC AATTTTGTGA 1052 TCTTTTTGTT TTTCATTTTG AGATATAT 1080
(2) INFORMATION FOR SEQ ID NO: 118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...23 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:
Met Lys Thr Asn Gly Leu Phe Lys Met Trp Gly Leu Phe Leu Val Leu
-20 -15 -10
Ile Ala Leu Val Phe Asn Ala Cys Ser Asp Ser His Lys Glu Lys Lys
-5 1 5
Asp Ala Leu Glu Val Ile Lys Gln Arg Gly Val Leu Lys Val Gly Val
10 15 20 25
Phe Ser Asp Lys Pro Pro Phe Gly Ser Val Asp Ser Lys Gly Lys Tyr
30 35 40
Gln Gly Tyr Asp Val Val Ile Ala Lys Arg Met Ala Leu Asp Leu Leu
45 50 55
Gly Asp Glu Asn Lys Ile Glu Phe Ile Pro Val Glu Ala Ser Ala Arg
60 65 70
Val Glu Phe Leu Lys Ala Asn Lys Val Asp Ile Ile Met Ala Asn Phe
75 80 85
Thr Arg Thr Lys Glu Arg Glu Lys Val Val Asp Phe Ala Lys Pro Tyr
90 95 100 105
Met Lys Val Ala Leu Gly Val Val Ser Lys Asp Gly Val Ile Lys Asn
110 115 120
Ile Glu Glu Leu Lys Asp Lys Glu Leu Ile Val Asn Lys Gly Thr Thr
125 130 135
Ala Asp Phe Tyr Phe Thr Lys Asn Tyr Pro Asn Ile Lys Leu Leu Lys
140 145 150
Phe Glu Gln Asn Thr Glu Thr Phe Leu Ala Leu Leu Asn Asn Lys Ala
155 160 165
Thr Ala Leu Ala His Asp Asn Thr Leu Leu Leu Ala Trp Thr Lys Gln
170 175 180 185
His Pro Glu Phe Lys Leu Gly Ile Thr Ser Leu Gly Asp Lys Asp Val
190 195 200
Ile Ala Pro Ala Ile Lys Lys Gly Asn Pro Lys Leu Leu Glu Trp Leu
205 210 215
Asn Asn Glu Ile Asp Ser Leu Ile Ser Ser Asp Phe Leu Lys Glu Ala
220 225 230
Tyr Gln Glu Thr Leu Ala Pro Val Tyr Gly Asp Glu Ile Lys Pro Glu
235 240 245
Glu Ile Ile Phe Glu
250
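The coding-sequence LOCATION given for SEQ ID NO: 117 (157...987) and the LENGTH given for SEQ ID NO: 118 (277 amino acids) can be cross-checked arithmetically. The following is a minimal sketch only, assuming the LOCATION range is 1-based, inclusive, and excludes the stop codon; the function name codon_count is hypothetical and not part of the listing.

```python
def codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in a 1-based, inclusive range.

    Assumption (not stated in the listing): the range covers only
    translated codons, so its length must be a multiple of three.
    """
    length = end - start + 1
    if length <= 0:
        raise ValueError("end must not be smaller than start")
    if length % 3 != 0:
        raise ValueError(f"range of {length} nt is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    # Values copied from the listing: coding sequence 157...987 of SEQ ID NO: 117.
    print(codon_count(157, 987))  # 277, matching LENGTH of SEQ ID NO: 118
```

The same check can be applied to any other nucleotide/protein pair in the listing by substituting its LOCATION values.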
(2) INFORMATION FOR SEQ ID NO: 119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1114 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 37...1050 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
CGAGCTATCA CAACAAATCA ATTTGTAGGA ACAAGC ATG TTT TTT AAA ACT TAT 54
Met Phe Phe Lys Thr Tyr
1 5
CAA AAA TTA TTG GGT GCG AGC TGT TTG ACG TTG TAT TTA GCG GGC TGT 102
Gln Lys Leu Leu Gly Ala Ser Cys Leu Thr Leu Tyr Leu Ala Gly Cys
10 15 20
GGG AGT GAT AGT AGC GAG CCA TTG GTG GGA ATT GAA AAA AAT AGC TTC 150
Gly Ser Asp Ser Ser Glu Pro Leu Val Gly Ile Glu Lys Asn Ser Phe
25 30 35
AAT TCT ACC GTG AAA ATC ATT TCT AAA ACC GAC AAC ATA GAA ATC CAA 198
Asn Ser Thr Val Lys Ile Ile Ser Lys Thr Asp Asn Ile Glu Ile Gln
40 45 50
GAC TTG AAG CTC AAT CGT GGC AAT TGT GAG CAT GAT CAA AAT TTC TTG 246
Asp Leu Lys Leu Asn Arg Gly Asn Cys Glu His Asp Gln Asn Phe Leu
55 60 65 70
GTA AAG TTA ATC CAA GAA ACA GCC AAT ACA TAC CTG TTT GCA TCA GAA 294
Val Lys Leu Ile Gln Glu Thr Ala Asn Thr Tyr Leu Phe Ala Ser Glu
75 80 85
AAA GAA AAA GCG ATC AAA AAC CAC CAA GCA AAA ATC GCA AGA CTT CAA 342
Lys Glu Lys Ala Ile Lys Asn His Gln Ala Lys Ile Ala Arg Leu Gln
90 95 100
AAA GAT TTA GAA GAA CTC ACA CAG CAT GTG CAA CAA TCC AAT AAT CTT 390
Lys Asp Leu Glu Glu Leu Thr Gln His Val Gln Gln Ser Asn Asn Leu
105 110 115
GAT AAA TTG TTA GAA AAT GGA GGA CTA TTC GTT AGT GGC CAT GAT TAT 438
Asp Lys Leu Leu Glu Asn Gly Gly Leu Phe Val Ser Gly His Asp Tyr
120 125 130
AAA TAT ACA AAA GAT GAT AAC CCA ATA TAT GTT GTT AAG AGG ATG CTT 486
Lys Tyr Thr Lys Asp Asp Asn Pro Ile Tyr Val Val Lys Arg Met Leu
135 140 145 150
GAT AAC CTT GAT AGC TAT AAA TAT GAA TCA GAC GAC GTG CTA GAC GTG 534
Asp Asn Leu Asp Ser Tyr Lys Tyr Glu Ser Asp Asp Val Leu Asp Val
155 160 165
CCA TAT GAG AAG CTA TTG GAA ATA AGC ATT GCT ATT GAA GAC ACT AAA 582
Pro Tyr Glu Lys Leu Leu Glu Ile Ser Ile Ala Ile Glu Asp Thr Lys
170 175 180
AAC CCC AAA GAC TAC CCT TAT ATC AAC CTT AAA GAA CTC AAA AAA TTA 630
Asn Pro Lys Asp Tyr Pro Tyr Ile Asn Leu Lys Glu Leu Lys Lys Leu
185 190 195
ATA GAT AGT ATT ATT GAT GAT CAT GGT TAT ATG GCC GAT GGC TTT TTG 678
Ile Asp Ser Ile Ile Asp Asp His Gly Tyr Met Ala Asp Gly Phe Leu
200 205 210
AAT GAA TAT TCT AAT AGG GTA TCA AAA AAA GGT CTC CAA ATC CTT GCT 726
Asn Glu Tyr Ser Asn Arg Val Ser Lys Lys Gly Leu Gln Ile Leu Ala
215 220 225 230
AAA CTA AAA TCC ATG TGG CCT AGC GTA GGG AAA TTT TAT TTC GCC TCT 774
Lys Leu Lys Ser Met Trp Pro Ser Val Gly Lys Phe Tyr Phe Ala Ser
235 240 245
TTG AAA GAG GCT ATC CCA AGG CAT GCC AAA GAA GTT ACT GAC AAG ATG 822
Leu Lys Glu Ala Ile Pro Arg His Ala Lys Glu Val Thr Asp Lys Met
250 255 260
ATT AGC TCT GAA GAA AAA TCT ATC AAA GCC AAT CAA GTC AAA CTC ACT 870
Ile Ser Ser Glu Glu Lys Ser Ile Lys Ala Asn Gln Val Lys Leu Thr
265 270 275
GAA GCG AAG CAA GAT ATT GAC AAA ATG GAA AAA ATC ATT AAA GAT TTA 918
Glu Ala Lys Gln Asp Ile Asp Lys Met Glu Lys Ile Ile Lys Asp Leu
280 285 290
GAA AGC AAG AAA AAC ACC TTA TCA GTG TAT TTA AAA TTT GGA GAA AGT 966
Glu Ser Lys Lys Asn Thr Leu Ser Val Tyr Leu Lys Phe Gly Glu Ser
295 300 305 310
TTC ACA GCG CAT TAT AAG TGT CAA AAT CTC ATA GAA GTT GGA GTC AAA 1014
Phe Thr Ala His Tyr Lys Cys Gln Asn Leu Ile Glu Val Gly Val Lys
315 320 325
ACC GAT AAA GGC TCC TGG ACT TTC AAC TTT AAC AGA TAAATCAGGC AAATAT 1066
Thr Asp Lys Gly Ser Trp Thr Phe Asn Phe Asn Arg
330 335
GGACAATAGC ACAGACAGAG CAAAAATCCT TATAGAAGAG CTTAAAAT 1114
(2) INFORMATION FOR SEQ ID NO: 120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 338 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:
Met Phe Phe Lys Thr Tyr Gln Lys Leu Leu Gly Ala Ser Cys Leu Thr
1 5 10 15
Leu Tyr Leu Ala Gly Cys Gly Ser Asp Ser Ser Glu Pro Leu Val Gly
20 25 30
Ile Glu Lys Asn Ser Phe Asn Ser Thr Val Lys Ile Ile Ser Lys Thr
35 40 45
Asp Asn Ile Glu Ile Gln Asp Leu Lys Leu Asn Arg Gly Asn Cys Glu
50 55 60
His Asp Gln Asn Phe Leu Val Lys Leu Ile Gln Glu Thr Ala Asn Thr
65 70 75 80
Tyr Leu Phe Ala Ser Glu Lys Glu Lys Ala Ile Lys Asn His Gln Ala
85 90 95
Lys Ile Ala Arg Leu Gln Lys Asp Leu Glu Glu Leu Thr Gln His Val
100 105 110
Gln Gln Ser Asn Asn Leu Asp Lys Leu Leu Glu Asn Gly Gly Leu Phe
115 120 125
Val Ser Gly His Asp Tyr Lys Tyr Thr Lys Asp Asp Asn Pro Ile Tyr
130 135 140
Val Val Lys Arg Met Leu Asp Asn Leu Asp Ser Tyr Lys Tyr Glu Ser
145 150 155 160
Asp Asp Val Leu Asp Val Pro Tyr Glu Lys Leu Leu Glu Ile Ser Ile
165 170 175
Ala Ile Glu Asp Thr Lys Asn Pro Lys Asp Tyr Pro Tyr Ile Asn Leu
180 185 190
Lys Glu Leu Lys Lys Leu Ile Asp Ser Ile Ile Asp Asp His Gly Tyr
195 200 205
Met Ala Asp Gly Phe Leu Asn Glu Tyr Ser Asn Arg Val Ser Lys Lys
210 215 220
Gly Leu Gln Ile Leu Ala Lys Leu Lys Ser Met Trp Pro Ser Val Gly
225 230 235 240
Lys Phe Tyr Phe Ala Ser Leu Lys Glu Ala Ile Pro Arg His Ala Lys
245 250 255
Glu Val Thr Asp Lys Met Ile Ser Ser Glu Glu Lys Ser Ile Lys Ala
260 265 270
Asn Gln Val Lys Leu Thr Glu Ala Lys Gln Asp Ile Asp Lys Met Glu
275 280 285
Lys Ile Ile Lys Asp Leu Glu Ser Lys Lys Asn Thr Leu Ser Val Tyr
290 295 300
Leu Lys Phe Gly Glu Ser Phe Thr Ala His Tyr Lys Cys Gln Asn Leu
305 310 315 320
Ile Glu Val Gly Val Lys Thr Asp Lys Gly Ser Trp Thr Phe Asn Phe
325 330 335
Asn Arg
(2) INFORMATION FOR SEQ ID NO: 121:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1101 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 40...1026 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 40...99 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 100...1026 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:
GGTTATACCG AAAAAACAAT ATGAAATCAA GGAGTTTGT ATG CAA CAG CGT CAT 54
Met Gln Gln Arg His
-20
TTA GGC CCT TTA AAA GTG GGT GCA TTA GCT CTA GGG TGC ATG GGC ATG 102
Leu Gly Pro Leu Lys Val Gly Ala Leu Ala Leu Gly Cys Met Gly Met
-15 -10 -5 1
ACT TAT GGG TAT GGG GAA GTC CAT GAT AAA AAG CAG ATG GTT AAA CTT 150
Thr Tyr Gly Tyr Gly Glu Val His Asp Lys Lys Gln Met Val Lys Leu
5 10 15
ATC CAT AAG GCT TTG GAA TTG GGT ATT AAC TTT TTT GAC ACT GCA GAG 198
Ile His Lys Ala Leu Glu Leu Gly Ile Asn Phe Phe Asp Thr Ala Glu
20 25 30
GCT TAT GGG GAA GAT AAT GAA AAG CTT TTA GCG AAG CGA TCA AGC CTT 246
Ala Tyr Gly Glu Asp Asn Glu Lys Leu Leu Ala Lys Arg Ser Ser Leu
35 40 45
ATT AAA GAC AAG GTT GTG GTA GCG AGC AAG TTT GGG ATT TAC TAC GCA 294
Ile Lys Asp Lys Val Val Val Ala Ser Lys Phe Gly Ile Tyr Tyr Ala
50 55 60 65
GAT CCT AAT GAC AAA TAC GCA ACC ATG TTT TTA GAC TCC AGT TCT AAC 342
Asp Pro Asn Asp Lys Tyr Ala Thr Met Phe Leu Asp Ser Ser Ser Asn
70 75 80
CGC ATT AAG AGT GCC ATT GAA GGG AGT TTG AAA CGC TTA AAA GTA GAA 390
Arg Ile Lys Ser Ala Ile Glu Gly Ser Leu Lys Arg Leu Lys Val Glu
85 90 95
TGC ATT GAT TTA TAC TAC CAA CAC CGC ATG GAT ACT AAC ACG CCC ATA 438
Cys Ile Asp Leu Tyr Tyr Gln His Arg Met Asp Thr Asn Thr Pro Ile
100 105 110
GAA GAA GTG GCA GAA GTT ATG CAA GCT CTT ATT AAA GAA GGA AAA ATT 486
Glu Glu Val Ala Glu Val Met Gln Ala Leu Ile Lys Glu Gly Lys Ile
115 120 125
AAA GCT TGG GGG ATG AGT GAG GCA GGG TTA TCT AGC ATC CAA AAA GCC 534
Lys Ala Trp Gly Met Ser Glu Ala Gly Leu Ser Ser Ile Gln Lys Ala
130 135 140 145
CAT CAA ATT TGC CCT TTA AGC GCG TTG CAG AGC GAA TAT TCC TTG TGG 582
His Gln Ile Cys Pro Leu Ser Ala Leu Gln Ser Glu Tyr Ser Leu Trp
150 155 160
TGG CGC GAA CCT GAA AAA GAG ATT TTA GGT TTT TTA GAA AAA GAA AAA 630
Trp Arg Glu Pro Glu Lys Glu Ile Leu Gly Phe Leu Glu Lys Glu Lys
165 170 175
ATT GGC TTT GTC GCT TTT TCG CCT TTG GGT AAG GGG TTT TTA GGC GCG 678
Ile Gly Phe Val Ala Phe Ser Pro Leu Gly Lys Gly Phe Leu Gly Ala
180 185 190
AAA TTT GAA AAA AAT GCT ACC TTC GCT AGT GAA GAT TTT AGA AGC GTT 726
Lys Phe Glu Lys Asn Ala Thr Phe Ala Ser Glu Asp Phe Arg Ser Val
195 200 205
TCT CCT AGG TTT AAT CAA GAA AAT CTA GCC AAA AAT TAC GTC TTG GTG 774
Ser Pro Arg Phe Asn Gln Glu Asn Leu Ala Lys Asn Tyr Val Leu Val
210 215 220 225
GAA TTA ATC CAA GAT CAT GCA CAC GCT AAA GGC GTT ACA CCA GCC CAA 822
Glu Leu Ile Gln Asp His Ala His Ala Lys Gly Val Thr Pro Ala Gln
230 235 240
CTG GCT CTC TCG TGG ATT TTG CAC ACG CAA AAA ATC ATT GTC CCT CTC 870
Leu Ala Leu Ser Trp Ile Leu His Thr Gln Lys Ile Ile Val Pro Leu
245 250 255
TTT GGC ACC ACC AAA GAA TCC AGG CTC ATA GAA AAT ATA GGG GCT TTG 918
Phe Gly Thr Thr Lys Glu Ser Arg Leu Ile Glu Asn Ile Gly Ala Leu
260 265 270
CAG GTT TCT TGG AGT CAA AAA GAA TTG GAG ATT TTT CAA AAA GAA TTG 966
Gln Val Ser Trp Ser Gln Lys Glu Leu Glu Ile Phe Gln Lys Glu Leu
275 280 285
ACT GCA ATC AAA ATA GAA GGG GCC CGC TAC CCT GAA AGA ATC AAT GAA 1014
Thr Ala Ile Lys Ile Glu Gly Ala Arg Tyr Pro Glu Arg Ile Asn Glu
290 295 300 305
ATG GTG AAT CAA TAAAAGTATT GGGTATTTAT AATTGCATTG GCTCTTTTAA AAGAG 1071
Met Val Asn Gln
ATTGAGCGTT ATTTCCTGTT TGTCAGTGTG 1101
(2) INFORMATION FOR SEQ ID NO: 122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
Met Gln Gln Arg His Leu Gly Pro Leu Lys Val Gly Ala Leu Ala Leu
-20 -15 -10 -5
Gly Cys Met Gly Met Thr Tyr Gly Tyr Gly Glu Val His Asp Lys Lys
1 5 10
Gln Met Val Lys Leu Ile His Lys Ala Leu Glu Leu Gly Ile Asn Phe
15 20 25
Phe Asp Thr Ala Glu Ala Tyr Gly Glu Asp Asn Glu Lys Leu Leu Ala
30 35 40
Lys Arg Ser Ser Leu Ile Lys Asp Lys Val Val Val Ala Ser Lys Phe
45 50 55 60
Gly Ile Tyr Tyr Ala Asp Pro Asn Asp Lys Tyr Ala Thr Met Phe Leu
65 70 75
Asp Ser Ser Ser Asn Arg Ile Lys Ser Ala Ile Glu Gly Ser Leu Lys
80 85 90
Arg Leu Lys Val Glu Cys Ile Asp Leu Tyr Tyr Gln His Arg Met Asp
95 100 105
Thr Asn Thr Pro Ile Glu Glu Val Ala Glu Val Met Gln Ala Leu Ile
110 115 120
Lys Glu Gly Lys Ile Lys Ala Trp Gly Met Ser Glu Ala Gly Leu Ser
125 130 135 140
Ser Ile Gln Lys Ala His Gln Ile Cys Pro Leu Ser Ala Leu Gln Ser
145 150 155
Glu Tyr Ser Leu Trp Trp Arg Glu Pro Glu Lys Glu Ile Leu Gly Phe
160 165 170
Leu Glu Lys Glu Lys Ile Gly Phe Val Ala Phe Ser Pro Leu Gly Lys
175 180 185
Gly Phe Leu Gly Ala Lys Phe Glu Lys Asn Ala Thr Phe Ala Ser Glu
190 195 200
Asp Phe Arg Ser Val Ser Pro Arg Phe Asn Gln Glu Asn Leu Ala Lys
205 210 215 220
Asn Tyr Val Leu Val Glu Leu Ile Gln Asp His Ala His Ala Lys Gly
225 230 235
Val Thr Pro Ala Gln Leu Ala Leu Ser Trp Ile Leu His Thr Gln Lys
240 245 250
Ile Ile Val Pro Leu Phe Gly Thr Thr Lys Glu Ser Arg Leu Ile Glu
255 260 265
Asn Ile Gly Ala Leu Gln Val Ser Trp Ser Gln Lys Glu Leu Glu Ile
270 275 280
Phe Gln Lys Glu Leu Thr Ala Ile Lys Ile Glu Gly Ala Arg Tyr Pro
285 290 295 300
Glu Arg Ile Asn Glu Met Val Asn Gln
305
(2) INFORMATION FOR SEQ ID NO: 123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 955 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 126...806 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 126...237 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
GTCAGCCTTT AAAGGTTTCA TTATAGCAAA GAATATTATT TTTTTATTCC TTGCGTTTTC 60
TGTGCGTTTG TGGGGCAAAT AAGATATAAT CGCCTTTTTA AAATTCATTT TTTAAAGGGG 120
TTTGA ATG GTA TTT GAC AGA ACA ATC AGC GTA AGA GAA AAA AAA GCG GCT 170
Met Val Phe Asp Arg Thr Ile Ser Val Arg Glu Lys Lys Ala Ala
-35 -30 -25
AAA ACG CTT GGG ATT ATT GGG ATC GTC TTT TTT ATT TTG TTT GGC ATC 218
Lys Thr Leu Gly Ile Ile Gly Ile Val Phe Phe Ile Leu Phe Gly Ile
-20 -15 -10
GTG ATA AGC GGG GTG GCT TTT CAA AAA GAG TGG GTG CAA CAA TTG GAT 266
Val Ile Ser Gly Val Ala Phe Gln Lys Glu Trp Val Gln Gln Leu Asp
-5 1 5 10
TTA TTT TTT ATA GAC TTG ATC CGC AAC CCT GCC CCC ATT CAA AAA AGC 314
Leu Phe Phe Ile Asp Leu Ile Arg Asn Pro Ala Pro Ile Gln Lys Ser
15 20 25
GCG TGG CTT TCT TTC GTG TTT TTT AGC ACT TGG TTT GCA CAA AGC AAG 362
Ala Trp Leu Ser Phe Val Phe Phe Ser Thr Trp Phe Ala Gln Ser Lys
30 35 40
CTC ACC ACT CCT ATA GCC TTA CTC ATT GGC TTG TGG TTT GGG TTT CAA 410
Leu Thr Thr Pro Ile Ala Leu Leu Ile Gly Leu Trp Phe Gly Phe Gln
45 50 55
AAA CGC ATC GCT TTG GGG GTG TGG TTT TTC TTT AGC ATC TTA TTA GGT 458
Lys Arg Ile Ala Leu Gly Val Trp Phe Phe Phe Ser Ile Leu Leu Gly
60 65 70
GAA TTC ACC TTA AAA TCC CTT AAG CTT TTA GTG GCG CGC CCA CGG CCT 506
Glu Phe Thr Leu Lys Ser Leu Lys Leu Leu Val Ala Arg Pro Arg Pro
75 80 85 90
GTA ACC AAT GGC GAA TTG GTT TTC GCG CAT GGC TTT AGT TTC CCT AGC 554
Val Thr Asn Gly Glu Leu Val Phe Ala His Gly Phe Ser Phe Pro Ser
95 100 105
GGG CAT GCT TTG GCT TCA GCG CTT TTT TAC GGC TCT TTG GCG TTG TTG 602
Gly His Ala Leu Ala Ser Ala Leu Phe Tyr Gly Ser Leu Ala Leu Leu
110 115 120
TTA TGC TAT TCT AAC GCC AAC AAT CGC ATT AAA ACG ATT ATT GCT GTG 650
Leu Cys Tyr Ser Asn Ala Asn Asn Arg Ile Lys Thr Ile Ile Ala Val
125 130 135
GTT TTG CTT TTT TGG ATT TTT TTA ATG GCG TAT GAT AGG GTT TAT TTA 698
Val Leu Leu Phe Trp Ile Phe Leu Met Ala Tyr Asp Arg Val Tyr Leu
140 145 150
GGG GTG CAT TAC CCT AGC GAT GTT TTA GGA GGG TTT TTA TTA GGG ATT 746
Gly Val His Tyr Pro Ser Asp Val Leu Gly Gly Phe Leu Leu Gly Ile
155 160 165 170
GCT TGG TCG TGC TGC TCT TTA GCG CTT TAT TTA GGG TTT TTG AAA CGC 794
Ala Trp Ser Cys Cys Ser Leu Ala Leu Tyr Leu Gly Phe Leu Lys Arg
175 180 185
CCT TAT AAT CAA TAAAGGCTTT ATTTAACCAA ACACTGACAA CTAAAATTTT TAAAA 851
Pro Tyr Asn Gln
190
TTCTATTTTT TGATAAAACT CATTCTCTTA AGGGGATAGG GGGTATTTTG CGATAATACC 911
CCCTTAACCC CCTTAAGAAA CCCCCTAACC CCCAAGACCG CTTT 955
(2) INFORMATION FOR SEQ ID NO: 124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 227 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...37 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:
Met Val Phe Asp Arg Thr Ile Ser Val Arg Glu Lys Lys Ala Ala Lys
-35 -30 -25
Thr Leu Gly Ile Ile Gly Ile Val Phe Phe Ile Leu Phe Gly Ile Val
-20 -15 -10
Ile Ser Gly Val Ala Phe Gln Lys Glu Trp Val Gln Gln Leu Asp Leu
-5 1 5 10
Phe Phe Ile Asp Leu Ile Arg Asn Pro Ala Pro Ile Gln Lys Ser Ala
15 20 25
Trp Leu Ser Phe Val Phe Phe Ser Thr Trp Phe Ala Gln Ser Lys Leu
30 35 40
Thr Thr Pro Ile Ala Leu Leu Ile Gly Leu Trp Phe Gly Phe Gln Lys
45 50 55
Arg Ile Ala Leu Gly Val Trp Phe Phe Phe Ser Ile Leu Leu Gly Glu
60 65 70 75
Phe Thr Leu Lys Ser Leu Lys Leu Leu Val Ala Arg Pro Arg Pro Val
80 85 90
Thr Asn Gly Glu Leu Val Phe Ala His Gly Phe Ser Phe Pro Ser Gly
95 100 105
His Ala Leu Ala Ser Ala Leu Phe Tyr Gly Ser Leu Ala Leu Leu Leu
110 115 120
Cys Tyr Ser Asn Ala Asn Asn Arg Ile Lys Thr Ile Ile Ala Val Val
125 130 135
Leu Leu Phe Trp Ile Phe Leu Met Ala Tyr Asp Arg Val Tyr Leu Gly
140 145 150 155
Val His Tyr Pro Ser Asp Val Leu Gly Gly Phe Leu Leu Gly Ile Ala
160 165 170
Trp Ser Cys Cys Ser Leu Ala Leu Tyr Leu Gly Phe Leu Lys Arg Pro
175 180 185
Tyr Asn Gln
190
(2) INFORMATION FOR SEQ ID NO: 125:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1183 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 91...1032 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 91...148 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
CTTAAAAGAA ACTTCGCAAA CCTTTTTATA TTATTTTAAA AGCACTAATA TTTATTATAT 60
TAGTTACAAC TATTTATTGT AAAGGCTAAA ATG TTG AAA TTT AAA TAT GGT TTG 114
Met Leu Lys Phe Lys Tyr Gly Leu
-15
ATT TAT ATC GCG CTC ATA CTA GGA CTT CAA GCG ACA GAT TAT GAC AAT 162
Ile Tyr Ile Ala Leu Ile Leu Gly Leu Gln Ala Thr Asp Tyr Asp Asn
-10 -5 1 5
TTA GAA GAA GAA AAC CAA CAA TTA GAT GAA AAA ATA AAC CAT TTA AAG 210
Leu Glu Glu Glu Asn Gln Gln Leu Asp Glu Lys Ile Asn His Leu Lys
10 15 20
CAA CAG CTC ACC GAA AAA GGG GTT TCG CCC AAA GAG ATG GAT AAG GAT 258
Gln Gln Leu Thr Glu Lys Gly Val Ser Pro Lys Glu Met Asp Lys Asp
25 30 35
AAG TTT GAA GAA GAA TAC ATC AAT CGA TCT TAT CCT AAA ATT TCT TCC 306
Lys Phe Glu Glu Glu Tyr Ile Asn Arg Ser Tyr Pro Lys Ile Ser Ser
40 45 50
AAG AAA AAA GAG AAA TTG CTC AAA TCT TTT TCC ATA GCC GAT GAT AAG 354
Lys Lys Lys Glu Lys Leu Leu Lys Ser Phe Ser Ile Ala Asp Asp Lys
55 60 65
AGT GGG GTT TTT TTA GGG GGT GGG TAT GCT TAT GGG GAA CTT AAC TTG 402
Ser Gly Val Phe Leu Gly Gly Gly Tyr Ala Tyr Gly Glu Leu Asn Leu
70 75 80 85
TCT TAT CAA GGG GAA ATG TTA GAC AGA TAC GGC GCG AAT GCC CCT AGC 450
Ser Tyr Gln Gly Glu Met Leu Asp Arg Tyr Gly Ala Asn Ala Pro Ser
90 95 100
GCG TTT AAA AAC AAT ATC AAT ATT AAC GCT CCT GTT TCT ATG ATT AGC 498
Ala Phe Lys Asn Asn Ile Asn Ile Asn Ala Pro Val Ser Met Ile Ser
105 110 115
GCT AAA TTT GGG TAT CAA AAA TAC TTT GTG TCT TAT TTT GGG ACA CGA 546
Ala Lys Phe Gly Tyr Gln Lys Tyr Phe Val Ser Tyr Phe Gly Thr Arg
120 125 130
TTT TAT GGG GAT TTA TTG CTT GGG GGT GGG GCA TTA AAA GAG GAT GCA 594
Phe Tyr Gly Asp Leu Leu Leu Gly Gly Gly Ala Leu Lys Glu Asp Ala
135 140 145
ATC AAG CAG CCT GTA GGC TCG TTT ATT TAT GTT TTA GGG GCT GTC AAT 642
Ile Lys Gln Pro Val Gly Ser Phe Ile Tyr Val Leu Gly Ala Val Asn
150 155 160 165
ACC GAT TTA TTG TTT GAT ATG CCT TTA GAT TTT AAA ACT AAA AAG CAT 690
Thr Asp Leu Leu Phe Asp Met Pro Leu Asp Phe Lys Thr Lys Lys His
170 175 180
TTT TTA GGC GTT TAT GCG GGT TTT GGG ATA GGG CTT ATG CTC TAT CAA 738
Phe Leu Gly Val Tyr Ala Gly Phe Gly Ile Gly Leu Met Leu Tyr Gln
185 190 195
GAC AGG CCT AAT CAA AAC GGG AGG AAT TTA GTA GTG GGG GGC TAT TCA 786
Asp Arg Pro Asn Gln Asn Gly Arg Asn Leu Val Val Gly Gly Tyr Ser
200 205 210
AGC CCT AAT TTT TTA TGG AAA TCT TTG ATT GAA GTG GAT TAC ACT TTT 834
Ser Pro Asn Phe Leu Trp Lys Ser Leu Ile Glu Val Asp Tyr Thr Phe
215 220 225
AAT GTG GGC GTG AGT TTA ACG CTT TAT AGG AAA CAC CGT TTA GAG ATT 882
Asn Val Gly Val Ser Leu Thr Leu Tyr Arg Lys His Arg Leu Glu Ile
230 235 240 245
GGC ACA AAA TTG CCG ATT AGC TAT TTG AGA ATG GGA GTG GAA GAG GGA 930
Gly Thr Lys Leu Pro Ile Ser Tyr Leu Arg Met Gly Val Glu Glu Gly
250 255 260
GCG ATT TAT CAA AAT AAA GAA GAT GAT GAG CGT TTG TTG GTT TCG GCT 978
Ala Ile Tyr Gln Asn Lys Glu Asp Asp Glu Arg Leu Leu Val Ser Ala
265 270 275
AAC AAC CAG TTC AAG CGA TCC AGT TTT TTA TTA GTG AAT TAT GCG TTT 1026
Asn Asn Gln Phe Lys Arg Ser Ser Phe Leu Leu Val Asn Tyr Ala Phe
280 285 290
ATT TTT TAAGGCTTGA TCTTGGAGTT AAGGTTTAAA ATTTTAGCGT TAGTCGTTTT AA 1084
Ile Phe
295
TTTTAGGGGG TTATTTGATT TTTAACGCTT TAATCACAAA ACCCAGAGCT TTAAGTTTTA 1144
GTTTAAATAG CAAAGAGGGT GCGCTTAATG ACAATGATG 1183
(2) INFORMATION FOR SEQ ID NO: 126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 314 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...19
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
Met Leu Lys Phe Lys Tyr Gly Leu Ile Tyr Ile Ala Leu Ile Leu Gly
-15 -10 -5
Leu Gln Ala Thr Asp Tyr Asp Asn Leu Glu Glu Glu Asn Gln Gln Leu
1 5 10
Asp Glu Lys Ile Asn His Leu Lys Gln Gln Leu Thr Glu Lys Gly Val
15 20 25
Ser Pro Lys Glu Met Asp Lys Asp Lys Phe Glu Glu Glu Tyr Ile Asn
30 35 40 45
Arg Ser Tyr Pro Lys Ile Ser Ser Lys Lys Lys Glu Lys Leu Leu Lys
50 55 60
Ser Phe Ser Ile Ala Asp Asp Lys Ser Gly Val Phe Leu Gly Gly Gly
65 70 75
Tyr Ala Tyr Gly Glu Leu Asn Leu Ser Tyr Gln Gly Glu Met Leu Asp
80 85 90
Arg Tyr Gly Ala Asn Ala Pro Ser Ala Phe Lys Asn Asn Ile Asn Ile
95 100 105
Asn Ala Pro Val Ser Met Ile Ser Ala Lys Phe Gly Tyr Gln Lys Tyr
110 115 120 125
Phe Val Ser Tyr Phe Gly Thr Arg Phe Tyr Gly Asp Leu Leu Leu Gly
130 135 140
Gly Gly Ala Leu Lys Glu Asp Ala Ile Lys Gln Pro Val Gly Ser Phe
145 150 155
Ile Tyr Val Leu Gly Ala Val Asn Thr Asp Leu Leu Phe Asp Met Pro
160 165 170
Leu Asp Phe Lys Thr Lys Lys His Phe Leu Gly Val Tyr Ala Gly Phe
175 180 185
Gly Ile Gly Leu Met Leu Tyr Gln Asp Arg Pro Asn Gln Asn Gly Arg
190 195 200 205
Asn Leu Val Val Gly Gly Tyr Ser Ser Pro Asn Phe Leu Trp Lys Ser
210 215 220
Leu Ile Glu Val Asp Tyr Thr Phe Asn Val Gly Val Ser Leu Thr Leu
225 230 235
Tyr Arg Lys His Arg Leu Glu Ile Gly Thr Lys Leu Pro Ile Ser Tyr
240 245 250
Leu Arg Met Gly Val Glu Glu Gly Ala Ile Tyr Gln Asn Lys Glu Asp
255 260 265
Asp Glu Arg Leu Leu Val Ser Ala Asn Asn Gln Phe Lys Arg Ser Ser
270 275 280 285
Phe Leu Leu Val Asn Tyr Ala Phe Ile Phe
290 295
(2) INFORMATION FOR SEQ ID NO: 127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1851 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 238...1665 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 238...313 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
GAGCTAGTTT TAAAAAGTTA GTTTTGTTTT AAAAAGTTAA TACTATTTTG AAGCACTCCT 60
ATTCAGATGG CTAAGGCACA CAAGAAATTA GGGGACTCTG CTGTATTCCT ACCCTGAAGC 120
GTTACCCTAA AATCCTATTG CATAGGTCTA AATAAGAGCT TAGGGATCAT TTTAGCCATA 180
AAAAGCTTAT GTTTTCATTA AAAATGTTAT GATACGCTCA AATAGTCAAG CAAAAAA ATG 240
Met
-25
TCA ATT AAA AGG GTT AGA TTG AAA ATA TTC GTT CTG TTG ATG TCG GTA 288
Ser Ile Lys Arg Val Arg Leu Lys Ile Phe Val Leu Leu Met Ser Val
-20 -15 -10
ATT TTA GGA ATA TCA TTA ACA GGT TGC ATA GGC TAT CGT ATG GAC TTA 336
Ile Leu Gly Ile Ser Leu Thr Gly Cys Ile Gly Tyr Arg Met Asp Leu
-5 1 5
GAA CAT TTT AAC ACG CTC TAT TAT GAA GAA AGC CCT AAA AAA GCT TAT 384
Glu His Phe Asn Thr Leu Tyr Tyr Glu Glu Ser Pro Lys Lys Ala Tyr
10 15 20
GAA TAT TCC AAA CAA TTC ACT AAG AAA AAA AAG AAC GCT CTT TTA TGG 432
Glu Tyr Ser Lys Gln Phe Thr Lys Lys Lys Lys Asn Ala Leu Leu Trp
25 30 35 40
GAC TTG CAA AAC GGC TTG AGC GCT TTA TAC GCC AGA GAT TAC CAG ACT 480
Asp Leu Gln Asn Gly Leu Ser Ala Leu Tyr Ala Arg Asp Tyr Gln Thr
45 50 55
TCT TTA GGG GTA TTA GAT CAA GCC GAG CAA CGC TTT GAT AAA ACG CAA 528
Ser Leu Gly Val Leu Asp Gln Ala Glu Gln Arg Phe Asp Lys Thr Gln
60 65 70
AGC GCT TTT ACA AGA GGG GCT GGT TAT GTG GGC GCT ACC ATG ATT AAT 576
Ser Ala Phe Thr Arg Gly Ala Gly Tyr Val Gly Ala Thr Met Ile Asn
75 80 85
GAT AAT GTG CGC GCT TAT GGG GGG AAT ATT TAT GAG GGC GTT TTA ATC 624
Asp Asn Val Arg Ala Tyr Gly Gly Asn Ile Tyr Glu Gly Val Leu Ile
90 95 100
AAT TAT TAC AAA GCG ATA GAC TAC ATG CTT TTA AAC GAT AGC GCG AAA 672
Asn Tyr Tyr Lys Ala Ile Asp Tyr Met Leu Leu Asn Asp Ser Ala Lys
105 110 115 120
GCT AGG GTG CAA TTC AAC CGT GCG AAC GAA CGC CAG CGC AGG GCT AAA 720
Ala Arg Val Gln Phe Asn Arg Ala Asn Glu Arg Gln Arg Arg Ala Lys
125 130 135
GAA TTT TAT TAT GAG GAA GTG CAA AAA GCC ATT AAA GAG ATC GAT TCT 768
Glu Phe Tyr Tyr Glu Glu Val Gln Lys Ala Ile Lys Glu Ile Asp Ser
140 145 150
AGC AAA AAG CAC AAT ATT AAT ATG GAA CGC TCT AGG GTG GAA GTG AGC 816
Ser Lys Lys His Asn Ile Asn Met Glu Arg Ser Arg Val Glu Val Ser
155 160 165
GAG ATT TTA AAC AAC ACC TAT TCT AAT TTA GAC AAA TAC GAA GCT TAT 864
Glu Ile Leu Asn Asn Thr Tyr Ser Asn Leu Asp Lys Tyr Glu Ala Tyr
170 175 180
CAG GGC TTA CTT AAC CCG GCG GTT TCG TAT CTC TCA GGG TTG TTT TAC 912
Gln Gly Leu Leu Asn Pro Ala Val Ser Tyr Leu Ser Gly Leu Phe Tyr
185 190 195 200
GCT TTA AAT GGG GAT GAG AAT AAG GGA TTA GGC TAT CTT AAT GAA GCC 960
Ala Leu Asn Gly Asp Glu Asn Lys Gly Leu Gly Tyr Leu Asn Glu Ala
205 210 215
TAT GGG ATC AGT CAA AGC CCT TTT GTA GCC CAA GAC TTG GTT TTT TTC 1008
Tyr Gly Ile Ser Gln Ser Pro Phe Val Ala Gln Asp Leu Val Phe Phe
220 225 230
AAA AAC CCT AAC AGG AGC CAT TTC ACT TGG ATC ATC ATT GAA GAT GGT 1056
Lys Asn Pro Asn Arg Ser His Phe Thr Trp Ile Ile Ile Glu Asp Gly
235 240 245
AAA GAG CCG CAA AAA AGC GAA TTT AAA ATT GAT GTG CCT ATT TTT ATG 1104
Lys Glu Pro Gln Lys Ser Glu Phe Lys Ile Asp Val Pro Ile Phe Met
250 255 260
ATC GAT TCG GTT TAT AAC GTG AGT ATA GCC TTG CCC AAG CTA GAA AAA 1152
Ile Asp Ser Val Tyr Asn Val Ser Ile Ala Leu Pro Lys Leu Glu Lys
265 270 275 280
GGG GAA GCG TTT TAT CAA AAT TTC ACT CTC AAA GAT GGA GAA AAA GTA 1200
Gly Glu Ala Phe Tyr Gln Asn Phe Thr Leu Lys Asp Gly Glu Lys Val
285 290 295
ACG CCC TTT GAC ACT TTA GCC TCA ATA GAT GCG GTG GTC GCT AGC GAA 1248
Thr Pro Phe Asp Thr Leu Ala Ser Ile Asp Ala Val Val Ala Ser Glu
300 305 310
TTC AGG AAG CAG TTG CCC TAC ATT ATC ACT AGG GCT ATT TTA TCG GCC 1296
Phe Arg Lys Gln Leu Pro Tyr Ile Ile Thr Arg Ala Ile Leu Ser Ala
315 320 325
ACT TTT AAA GTG GGC ATG CAA GCG GTG GCG AAC TAT TAT TTG GGG TTT 1344
Thr Phe Lys Val Gly Met Gln Ala Val Ala Asn Tyr Tyr Leu Gly Phe
330 335 340
GTT GGA GGG TTA GTA ACT TCC TTG TAT TCA GGT GTG AGC ACC TTT GCA 1392
Val Gly Gly Leu Val Thr Ser Leu Tyr Ser Gly Val Ser Thr Phe Ala
345 350 355 360
GAC ACT AGA AGC ACG AGC ATT TTT GCC CAT AAA ATC TAC CTC ATG CGC 1440
Asp Thr Arg Ser Thr Ser Ile Phe Ala His Lys Ile Tyr Leu Met Arg
365 370 375
ATT AAA AAC AAA GCC TTT GAA AGT TAT GAA GTT CGA GCC GAT TCC ATT 1488
Ile Lys Asn Lys Ala Phe Glu Ser Tyr Glu Val Arg Ala Asp Ser Ile
380 385 390
GAC GCT TTT TCG TTT TCA TTA AAG CCT TGT AAA AGA TCG CTT GAA AGC 1536
Asp Ala Phe Ser Phe Ser Leu Lys Pro Cys Lys Arg Ser Leu Glu Ser
395 400 405
CCT AAA ATC ATT GAC GCT AGG GAA TTG CTT TCT GGG TTT GTA GCA GCC 1584
Pro Lys Ile Ile Asp Ala Arg Glu Leu Leu Ser Gly Phe Val Ala Ala
410 415 420
CCA CAA ATC TTT TGC TCT AAC CGC CAT AAT ATT TTA TAC GTG CGC AGT 1632
Pro Gln Ile Phe Cys Ser Asn Arg His Asn Ile Leu Tyr Val Arg Ser
425 430 435 440
TTT AAA AAC GGG TTT GTT TTG AGT CGT TTA AAA TGATTTCAAA ACCCCCACCA 1685
Phe Lys Asn Gly Phe Val Leu Ser Arg Leu Lys
445 450
AAGGAATTTT AGTTTTTAAG TGTCGTTGGC ATTAAACGCA AACACGATAT AATTATAAAA 1745
CGATACGAAA ACCTAAATTA AGGGGAAGTC ATGGCTGATA GTTTAGCGGG CATTGATCAA 1805
GTTACGAGTT TGCATAAAAA TAACGAGTTA CAATTGTTGT GTTTCA 1851
(2) INFORMATION FOR SEQ ID NO: 128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...25 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
Met Ser Ile Lys Arg Val Arg Leu Lys Ile Phe Val Leu Leu Met Ser
-25 -20 -15 -10
Val Ile Leu Gly Ile Ser Leu Thr Gly Cys Ile Gly Tyr Arg Met Asp
-5 1 5
Leu Glu His Phe Asn Thr Leu Tyr Tyr Glu Glu Ser Pro Lys Lys Ala
10 15 20
Tyr Glu Tyr Ser Lys Gln Phe Thr Lys Lys Lys Lys Asn Ala Leu Leu
25 30 35
Trp Asp Leu Gln Asn Gly Leu Ser Ala Leu Tyr Ala Arg Asp Tyr Gln
40 45 50 55
Thr Ser Leu Gly Val Leu Asp Gln Ala Glu Gln Arg Phe Asp Lys Thr
60 65 70
Gln Ser Ala Phe Thr Arg Gly Ala Gly Tyr Val Gly Ala Thr Met Ile
75 80 85
Asn Asp Asn Val Arg Ala Tyr Gly Gly Asn Ile Tyr Glu Gly Val Leu
90 95 100
Ile Asn Tyr Tyr Lys Ala Ile Asp Tyr Met Leu Leu Asn Asp Ser Ala
105 110 115
Lys Ala Arg Val Gln Phe Asn Arg Ala Asn Glu Arg Gln Arg Arg Ala
120 125 130 135
Lys Glu Phe Tyr Tyr Glu Glu Val Gln Lys Ala Ile Lys Glu Ile Asp
140 145 150
Ser Ser Lys Lys His Asn Ile Asn Met Glu Arg Ser Arg Val Glu Val
155 160 165
Ser Glu Ile Leu Asn Asn Thr Tyr Ser Asn Leu Asp Lys Tyr Glu Ala
170 175 180
Tyr Gln Gly Leu Leu Asn Pro Ala Val Ser Tyr Leu Ser Gly Leu Phe
185 190 195
Tyr Ala Leu Asn Gly Asp Glu Asn Lys Gly Leu Gly Tyr Leu Asn Glu
200 205 210 215
Ala Tyr Gly Ile Ser Gln Ser Pro Phe Val Ala Gln Asp Leu Val Phe
220 225 230
Phe Lys Asn Pro Asn Arg Ser His Phe Thr Trp Ile Ile Ile Glu Asp
235 240 245
Gly Lys Glu Pro Gln Lys Ser Glu Phe Lys Ile Asp Val Pro Ile Phe
250 255 260
Met Ile Asp Ser Val Tyr Asn Val Ser Ile Ala Leu Pro Lys Leu Glu
265 270 275
Lys Gly Glu Ala Phe Tyr Gln Asn Phe Thr Leu Lys Asp Gly Glu Lys
280 285 290 295
Val Thr Pro Phe Asp Thr Leu Ala Ser Ile Asp Ala Val Val Ala Ser
300 305 310
Glu Phe Arg Lys Gln Leu Pro Tyr Ile Ile Thr Arg Ala Ile Leu Ser
315 320 325
Ala Thr Phe Lys Val Gly Met Gln Ala Val Ala Asn Tyr Tyr Leu Gly
330 335 340
Phe Val Gly Gly Leu Val Thr Ser Leu Tyr Ser Gly Val Ser Thr Phe
345 350 355
Ala Asp Thr Arg Ser Thr Ser Ile Phe Ala His Lys Ile Tyr Leu Met
360 365 370 375
Arg Ile Lys Asn Lys Ala Phe Glu Ser Tyr Glu Val Arg Ala Asp Ser
380 385 390
Ile Asp Ala Phe Ser Phe Ser Leu Lys Pro Cys Lys Arg Ser Leu Glu
395 400 405
Ser Pro Lys Ile Ile Asp Ala Arg Glu Leu Leu Ser Gly Phe Val Ala
410 415 420
Ala Pro Gln Ile Phe Cys Ser Asn Arg His Asn Ile Leu Tyr Val Arg
425 430 435
Ser Phe Lys Asn Gly Phe Val Leu Ser Arg Leu Lys
440 445 450
(2) INFORMATION FOR SEQ ID NO: 129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 435 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...432 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
ATG TTA GAA AAA TTG ATT GAA AGA GTG TTG TTT GCC ACT CGT TGG TTG 48
Met Leu Glu Lys Leu Ile Glu Arg Val Leu Phe Ala Thr Arg Trp Leu
1 5 10 15
CTA GCC CCT TTA TGT ATT GCC ATG TCG TTA GTG CTG GTG GTT TTA GGC 96
Leu Ala Pro Leu Cys Ile Ala Met Ser Leu Val Leu Val Val Leu Gly
20 25 30
TAT GTG TTC ATG AAA GAG TTG TGG CAC ATG CTC AGC CAT TTA AAC ACG 144
Tyr Val Phe Met Lys Glu Leu Trp His Met Leu Ser His Leu Asn Thr
35 40 45
ATC AGC GAA ACG GAT TTG GTT TTA TCA GCC TTA GGA TTA GTG GAT TTG 192
Ile Ser Glu Thr Asp Leu Val Leu Ser Ala Leu Gly Leu Val Asp Leu
50 55 60
TTG TTT ATG GCC GGG CTT GTT TTA ATG GTG TTA CTC GCC AGT TAT GAA 240
Leu Phe Met Ala Gly Leu Val Leu Met Val Leu Leu Ala Ser Tyr Glu
65 70 75 80
AGC TTT GTT TCT AAA TTA GAC AAG GTG GAT GCC AGT GAA ATC ACT TGG 288
Ser Phe Val Ser Lys Leu Asp Lys Val Asp Ala Ser Glu Ile Thr Trp
85 90 95
CTA AAG CAC ACG GAT TTT AAC GCT TTA AAA TTA AAG GTT TCA CTC TCC 336
Leu Lys His Thr Asp Phe Asn Ala Leu Lys Leu Lys Val Ser Leu Ser
100 105 110
ATT GTA GCG ATT TCA GCG ATT TTC TTG CTC AAA CGC TAC ATG AGT TTA 384
Ile Val Ala Ile Ser Ala Ile Phe Leu Leu Lys Arg Tyr Met Ser Leu
115 120 125
GAA AGA TGT TTT ATC CCA GCA TTC CCT AAG GAT ACG CCC CCT ATC GCA T 433
Glu Arg Cys Phe Ile Pro Ala Phe Pro Lys Asp Thr Pro Pro Ile Ala
130 135 140
AA 435
(2) INFORMATION FOR SEQ ID NO: 130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 144 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:
Met Leu Glu Lys Leu Ile Glu Arg Val Leu Phe Ala Thr Arg Trp Leu
1 5 10 15
Leu Ala Pro Leu Cys Ile Ala Met Ser Leu Val Leu Val Val Leu Gly
20 25 30
Tyr Val Phe Met Lys Glu Leu Trp His Met Leu Ser His Leu Asn Thr
35 40 45
Ile Ser Glu Thr Asp Leu Val Leu Ser Ala Leu Gly Leu Val Asp Leu
50 55 60
Leu Phe Met Ala Gly Leu Val Leu Met Val Leu Leu Ala Ser Tyr Glu
65 70 75 80
Ser Phe Val Ser Lys Leu Asp Lys Val Asp Ala Ser Glu Ile Thr Trp
85 90 95
Leu Lys His Thr Asp Phe Asn Ala Leu Lys Leu Lys Val Ser Leu Ser
100 105 110
Ile Val Ala Ile Ser Ala Ile Phe Leu Leu Lys Arg Tyr Met Ser Leu
115 120 125
Glu Arg Cys Phe Ile Pro Ala Phe Pro Lys Asp Thr Pro Pro Ile Ala
130 135 140
(2) INFORMATION FOR SEQ ID NO: 131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2234 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 213...2081 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 213...273 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:
ATCATAAAAT GTAAAAATAC TCAAAGCATC GCATCAAGCA ATATAGCGAT CTGAAAAGAG 60
GCTCACAATT GAGCTAAAGC CCGCTTTTTA GGGATAAATA AAAAGCGTTT TCAAATTGCA 120
TGGGTAACTT TATGGGGCGA AGCGTTTCTA AATTTTGGTA TAATCGCTAG AAATTGTGAG 180
AAAGATTCTA TCTTGTTTGA GTGGGGTTTC GC ATG CGT TTA TTA TTG TGG TGG 233
Met Arg Leu Leu Leu Trp Trp
-20 -15
GTA TTG GTA TTA TCG CTC TTT TTA AAT CCT TTG AGA GCG GTT GAA GAG 281
Val Leu Val Leu Ser Leu Phe Leu Asn Pro Leu Arg Ala Val Glu Glu
-10 -5 1
CAT GAA ACA GAT GCG GTG GAT TTG TTT TTG ATT TTC AAT CAA ATC AAC 329
His Glu Thr Asp Ala Val Asp Leu Phe Leu Ile Phe Asn Gln Ile Asn
5 10 15
CAG CTC AAT CAA GTC ATT GAA ACT TAC AAA AAA AAC CCT GAA AGA AGC 377
Gln Leu Asn Gln Val Ile Glu Thr Tyr Lys Lys Asn Pro Glu Arg Ser
20 25 30 35
GCT GAA ATC TCT CTG TAT AAC ACC CAA AAG AAT GAC TTG ATT AAA AGT 425
Ala Glu Ile Ser Leu Tyr Asn Thr Gln Lys Asn Asp Leu Ile Lys Ser
40 45 50
TTG ACT TCT AAA GTG TTG AAT GAA AGG GAT AAG ATC GGG ATT GAT ATC 473
Leu Thr Ser Lys Val Leu Asn Glu Arg Asp Lys Ile Gly Ile Asp Ile
55 60 65
AAT CAA AAT TTA AAA GAG CAG GAA AAA ATC AAA AAG CGT TTG TCT AAA 521
Asn Gln Asn Leu Lys Glu Gln Glu Lys Ile Lys Lys Arg Leu Ser Lys
70 75 80
AGC ATT AAT GGC GAT GAT TTC TAC ACT TTC ATG AAA GAC AGA TTG TCT 569
Ser Ile Asn Gly Asp Asp Phe Tyr Thr Phe Met Lys Asp Arg Leu Ser
85 90 95
TTA GAT ATT TTG TTG ATA GAT GAA ATT TTG TAT CGT TTT ATA GAT AAA 617
Leu Asp Ile Leu Leu Ile Asp Glu Ile Leu Tyr Arg Phe Ile Asp Lys
100 105 110 115
ATC AGG AGC AGT ATT GAT ATT TTT AGC GAA CAA AAA GAT GTA GAA AGC 665
Ile Arg Ser Ser Ile Asp Ile Phe Ser Glu Gln Lys Asp Val Glu Ser
120 125 130
ATC AGC GAT GCT TTC CTT TTG CGT TTA GGG CAA TTC AAA CTC TAC ACT 713
Ile Ser Asp Ala Phe Leu Leu Arg Leu Gly Gln Phe Lys Leu Tyr Thr
135 140 145
TTC CCT AAA AAT TTA GGC AAT GTC AAA ATG CAT GAA TTA GAG CAG ATG 761
Phe Pro Lys Asn Leu Gly Asn Val Lys Met His Glu Leu Glu Gln Met
150 155 160
TTT AGC GAT TAT GAA TTG CGT TTG AAC ACT TAC ACC GAA GTC TTG CGT 809
Phe Ser Asp Tyr Glu Leu Arg Leu Asn Thr Tyr Thr Glu Val Leu Arg
165 170 175
TAC ATT AAA AAC CAC CCT AAA GAA GTG CTT CCT AAA AAC TTG ATC ATG 857
Tyr Ile Lys Asn His Pro Lys Glu Val Leu Pro Lys Asn Leu Ile Met
180 185 190 195
GAA GTG AAT ATG GAT TTT GTG TTA AAC AAA ATC AGC AAG GTT TTG CCT 905
Glu Val Asn Met Asp Phe Val Leu Asn Lys Ile Ser Lys Val Leu Pro
200 205 210
TTC ACA ACC CAT AGC TTG CAA GTG AGT AAA ATC GTG CTA GCT TTG ACG 953
Phe Thr Thr His Ser Leu Gln Val Ser Lys Ile Val Leu Ala Leu Thr
215 220 225
ATT TTA GCC TTA TTG CTG GGT TTA AGG AAG TTG ATC ACT TGG CTT TTA 1001
Ile Leu Ala Leu Leu Leu Gly Leu Arg Lys Leu Ile Thr Trp Leu Leu
230 235 240
GCC TTA TTG TTA GAT CGT ATT TTT GAA ATC ATG CAG CGC AAT AAA AAA 1049
Ala Leu Leu Leu Asp Arg Ile Phe Glu Ile Met Gln Arg Asn Lys Lys
245 250 255
ATG CAT GTC AAT GTG CAA AAG AGC ATT GTT TCG CCG GTT TCT GTC TTT 1097
Met His Val Asn Val Gln Lys Ser Ile Val Ser Pro Val Ser Val Phe
260 265 270 275
TTA GCC CTA TTT AGT TGC GAT GTG GCT TTA GAT ATT TTC TAC TAC CCT 1145
Leu Ala Leu Phe Ser Cys Asp Val Ala Leu Asp Ile Phe Tyr Tyr Pro
280 285 290
AAC GCA TCG CCC CCT AAA GTT TCT ATG TGG GTG GGC GCG GTG TAT ATC 1193
Asn Ala Ser Pro Pro Lys Val Ser Met Trp Val Gly Ala Val Tyr Ile
295 300 305
ATG CTT TTA GCA TGG TTA GTG ATA GCG CTT TTT AAA GGC TAT GGG GAA 1241
Met Leu Leu Ala Trp Leu Val Ile Ala Leu Phe Lys Gly Tyr Gly Glu
310 315 320
GCG TTA GTT ACG AAT ATG GCT ACC AAA AGC ACG CAC AAT TTT AGA AAA 1289
Ala Leu Val Thr Asn Met Ala Thr Lys Ser Thr His Asn Phe Arg Lys
325 330 335
GAA GTG ATC AAC TTG ATT TTA AAA GTC GTG TAT TTT TTG ATC TTT ATT 1337
Glu Val Ile Asn Leu Ile Leu Lys Val Val Tyr Phe Leu Ile Phe Ile
340 345 350 355
GTC GCG CTT TTA GGG GTT TTG AAA CAA CTA GGG TTT AAC GTT TCA GCC 1385
Val Ala Leu Leu Gly Val Leu Lys Gln Leu Gly Phe Asn Val Ser Ala
360 365 370
ATC ATC GCT TCT TTA GGG ATT GGG GGG TTA GCG GTG GCT TTG GCG GTT 1433
Ile Ile Ala Ser Leu Gly Ile Gly Gly Leu Ala Val Ala Leu Ala Val
375 380 385
AAA GAT GTG TTA GCG AAT TTT TTT GCT TCG GTC ATT TTA TTA TTA GAC 1481
Lys Asp Val Leu Ala Asn Phe Phe Ala Ser Val Ile Leu Leu Leu Asp
390 395 400
AAT TCG TTT TCT CAA GGG GAT TGG ATC GTG TGC GGT GAA GTG GAG GGC 1529
Asn Ser Phe Ser Gln Gly Asp Trp Ile Val Cys Gly Glu Val Glu Gly
405 410 415
ACG GTG GTG GAA ATG GGG TTA AGG CGC ACC ACG ATC AGA GCC TTT GAC 1577
Thr Val Val Glu Met Gly Leu Arg Arg Thr Thr Ile Arg Ala Phe Asp
420 425 430 435
AAC GCT CTT TTG TCC GTG CCT AAT TCA GAA TTA GCC GGA AAA CCC ATC 1625
Asn Ala Leu Leu Ser Val Pro Asn Ser Glu Leu Ala Gly Lys Pro Ile
440 445 450
AGG AAT TGG AGC CGT CGT AAA GTG GGA AGG CGT ATT AAA ATG GAA ATA 1673
Arg Asn Trp Ser Arg Arg Lys Val Gly Arg Arg Ile Lys Met Glu Ile
455 460 465
GGC TTA ACT TAT AGC TCC AGT CAA AGC GCT TTA CAG CTT TGC GTG AAA 1721
Gly Leu Thr Tyr Ser Ser Ser Gln Ser Ala Leu Gln Leu Cys Val Lys
470 475 480
GAC ATT AAA GAA ATG TTA GAA AAC CAC CCT AAA ATC GCT AAC GGA GCC 1769
Asp Ile Lys Glu Met Leu Glu Asn His Pro Lys Ile Ala Asn Gly Ala
485 490 495
GAT AGC GCT TTG CAA AAT GTG AGC GAT TAC CGC TAC ATG TTT AAA AAA 1817
Asp Ser Ala Leu Gln Asn Val Ser Asp Tyr Arg Tyr Met Phe Lys Lys
500 505 510 515
GAT ATT GTT TCT ATT GAT GAT TTT TTA GGG TAT AAA AAC AAT TTG TTT 1865
Asp Ile Val Ser Ile Asp Asp Phe Leu Gly Tyr Lys Asn Asn Leu Phe
520 525 530
GTC TTT TTA GAT CAG TTT GCG GAC AGC TCT ATT AAT ATT TTA GTG TAT 1913
Val Phe Leu Asp Gln Phe Ala Asp Ser Ser Ile Asn Ile Leu Val Tyr
535 540 545
TGC TTT TCT AAG ACA GTG GTT TGG GAA GAG TGG CTA GAA GTC AAA GAA 1961
Cys Phe Ser Lys Thr Val Val Trp Glu Glu Trp Leu Glu Val Lys Glu
550 555 560
GAT GTG ATG CTA AAA ATC ATG GGG ATT GTA GAA AAG CAC CAT TTG AGT 2009
Asp Val Met Leu Lys Ile Met Gly Ile Val Glu Lys His His Leu Ser
565 570 575
TTT GCT TTC CCA TCA CAG AGT TTG TAT GTG GAG AGT TTG CCA GAA GTT 2057
Phe Ala Phe Pro Ser Gln Ser Leu Tyr Val Glu Ser Leu Pro Glu Val
580 585 590 595
AGC CTG AAA GAA GGG GCT AAA ATC TGAAATTATT GGTAGATGTA TTCTTTGGTT 2111
Ser Leu Lys Glu Gly Ala Lys Ile
600
AAGGGGAAAG TGTTATCCAC GCTGTTGGTT AAAAGCAATT GGAATAAATC CGCGCTCCCC 2171
ACCCTAAAGG CGGATGCGCA AGTCCTTAAA TACAGATCCC ACATGCGGAT AAAGCGTTCG 2231
TCA 2234
(2) INFORMATION FOR SEQ ID NO: 132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 623 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...20 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
Met Arg Leu Leu Leu Trp Trp Val Leu Val Leu Ser Leu Phe Leu Asn
-20 -15 -10 -5
Pro Leu Arg Ala Val Glu Glu His Glu Thr Asp Ala Val Asp Leu Phe
1 5 10
Leu Ile Phe Asn Gln Ile Asn Gln Leu Asn Gln Val Ile Glu Thr Tyr
15 20 25
Lys Lys Asn Pro Glu Arg Ser Ala Glu Ile Ser Leu Tyr Asn Thr Gln
30 35 40
Lys Asn Asp Leu Ile Lys Ser Leu Thr Ser Lys Val Leu Asn Glu Arg
45 50 55 60
Asp Lys Ile Gly Ile Asp Ile Asn Gln Asn Leu Lys Glu Gln Glu Lys
65 70 75
Ile Lys Lys Arg Leu Ser Lys Ser Ile Asn Gly Asp Asp Phe Tyr Thr
80 85 90
Phe Met Lys Asp Arg Leu Ser Leu Asp Ile Leu Leu Ile Asp Glu Ile
95 100 105
Leu Tyr Arg Phe Ile Asp Lys Ile Arg Ser Ser Ile Asp Ile Phe Ser
110 115 120
Glu Gln Lys Asp Val Glu Ser Ile Ser Asp Ala Phe Leu Leu Arg Leu
125 130 135 140
Gly Gln Phe Lys Leu Tyr Thr Phe Pro Lys Asn Leu Gly Asn Val Lys
145 150 155
Met His Glu Leu Glu Gln Met Phe Ser Asp Tyr Glu Leu Arg Leu Asn
160 165 170
Thr Tyr Thr Glu Val Leu Arg Tyr Ile Lys Asn His Pro Lys Glu Val
175 180 185
Leu Pro Lys Asn Leu Ile Met Glu Val Asn Met Asp Phe Val Leu Asn
190 195 200
Lys Ile Ser Lys Val Leu Pro Phe Thr Thr His Ser Leu Gln Val Ser
205 210 215 220
Lys Ile Val Leu Ala Leu Thr Ile Leu Ala Leu Leu Leu Gly Leu Arg
225 230 235
Lys Leu Ile Thr Trp Leu Leu Ala Leu Leu Leu Asp Arg Ile Phe Glu
240 245 250
Ile Met Gln Arg Asn Lys Lys Met His Val Asn Val Gln Lys Ser Ile
255 260 265
Val Ser Pro Val Ser Val Phe Leu Ala Leu Phe Ser Cys Asp Val Ala
270 275 280
Leu Asp Ile Phe Tyr Tyr Pro Asn Ala Ser Pro Pro Lys Val Ser Met
285 290 295 300
Trp Val Gly Ala Val Tyr Ile Met Leu Leu Ala Trp Leu Val Ile Ala
305 310 315
Leu Phe Lys Gly Tyr Gly Glu Ala Leu Val Thr Asn Met Ala Thr Lys
320 325 330
Ser Thr His Asn Phe Arg Lys Glu Val Ile Asn Leu Ile Leu Lys Val
335 340 345
Val Tyr Phe Leu Ile Phe Ile Val Ala Leu Leu Gly Val Leu Lys Gln
350 355 360
Leu Gly Phe Asn Val Ser Ala Ile Ile Ala Ser Leu Gly Ile Gly Gly
365 370 375 380
Leu Ala Val Ala Leu Ala Val Lys Asp Val Leu Ala Asn Phe Phe Ala
385 390 395
Ser Val Ile Leu Leu Leu Asp Asn Ser Phe Ser Gln Gly Asp Trp Ile
400 405 410
Val Cys Gly Glu Val Glu Gly Thr Val Val Glu Met Gly Leu Arg Arg
415 420 425
Thr Thr Ile Arg Ala Phe Asp Asn Ala Leu Leu Ser Val Pro Asn Ser
430 435 440
Glu Leu Ala Gly Lys Pro Ile Arg Asn Trp Ser Arg Arg Lys Val Gly
445 450 455 460
Arg Arg Ile Lys Met Glu Ile Gly Leu Thr Tyr Ser Ser Ser Gln Ser
465 470 475
Ala Leu Gln Leu Cys Val Lys Asp Ile Lys Glu Met Leu Glu Asn His
480 485 490
Pro Lys Ile Ala Asn Gly Ala Asp Ser Ala Leu Gln Asn Val Ser Asp
495 500 505
Tyr Arg Tyr Met Phe Lys Lys Asp Ile Val Ser Ile Asp Asp Phe Leu
510 515 520
Gly Tyr Lys Asn Asn Leu Phe Val Phe Leu Asp Gln Phe Ala Asp Ser
525 530 535 540
Ser Ile Asn Ile Leu Val Tyr Cys Phe Ser Lys Thr Val Val Trp Glu
545 550 555
Glu Trp Leu Glu Val Lys Glu Asp Val Met Leu Lys Ile Met Gly Ile
560 565 570
Val Glu Lys His His Leu Ser Phe Ala Phe Pro Ser Gln Ser Leu Tyr
575 580 585
Val Glu Ser Leu Pro Glu Val Ser Leu Lys Glu Gly Ala Lys Ile
590 595 600
(2) INFORMATION FOR SEQ ID NO: 133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...429 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 1...93 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 94...429 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
ATG AAA AAA TTT TTT TCT CAA TCT TTA TTA GCT TTG ATT GTG TCT ATG 48
Met Lys Lys Phe Phe Ser Gln Ser Leu Leu Ala Leu Ile Val Ser Met
-31 -30 -25 -20
AAC GCG CTA CTG GCC ATG GAT GGC AAT GGC GTT TTT TTA GGG GCG GGT 96
Asn Ala Leu Leu Ala Met Asp Gly Asn Gly Val Phe Leu Gly Ala Gly
-15 -10 -5 1
TAT TTG CAA GGG CAA GCC CAA ATG CAT GCG GAT ATT AAT TCT CAA AAA 144
Tyr Leu Gln Gly Gln Ala Gln Met His Ala Asp Ile Asn Ser Gln Lys
5 10 15
CAA GCC ACT AAC GCT ACT ATC AAA GGC TTT GAT GCG CTT TTA GGG TAT 192
Gln Ala Thr Asn Ala Thr Ile Lys Gly Phe Asp Ala Leu Leu Gly Tyr
20 25 30
CAA TTT TTC TTT GGG AAA TAC TTT GGC TTG CGT GCT TAT GGG TTT TTT 240
Gln Phe Phe Phe Gly Lys Tyr Phe Gly Leu Arg Ala Tyr Gly Phe Phe
35 40 45
GAC TAC GCT CAT GCC AAT TCT ATT AGG CTT AAA AAC CCT AAC TAT AAC 288
Asp Tyr Ala His Ala Asn Ser Ile Arg Leu Lys Asn Pro Asn Tyr Asn
50 55 60 65
AGC GAA GTG GCG CAA TTG GCG GGT CAA ATT CTT GGG AAA CAA GAA ATC 336
Ser Glu Val Ala Gln Leu Ala Gly Gln Ile Leu Gly Lys Gln Glu Ile
70 75 80
AAT CGC TTA ACG AGC CTT GCT GAT CCT AAA ACC TTT GAG CCA AAC ATG 384
Asn Arg Leu Thr Ser Leu Ala Asp Pro Lys Thr Phe Glu Pro Asn Met
85 90 95
CTC ACT TAT GGG GGG GCT ATG GAT TTA ATG GTT AAT GTT CAT CAA TAA 432
Leu Thr Tyr Gly Gly Ala Met Asp Leu Met Val Asn Val His Gln
100 105 110
(2) INFORMATION FOR SEQ ID NO: 134:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
Met Lys Lys Phe Phe Ser Gln Ser Leu Leu Ala Leu Ile Val Ser Met
-31 -30 -25 -20
Asn Ala Leu Leu Ala Met Asp Gly Asn Gly Val Phe Leu Gly Ala Gly
-15 -10 -5 1
Tyr Leu Gln Gly Gln Ala Gln Met His Ala Asp Ile Asn Ser Gln Lys
5 10 15
Gln Ala Thr Asn Ala Thr Ile Lys Gly Phe Asp Ala Leu Leu Gly Tyr
20 25 30
Gln Phe Phe Phe Gly Lys Tyr Phe Gly Leu Arg Ala Tyr Gly Phe Phe
35 40 45
Asp Tyr Ala His Ala Asn Ser Ile Arg Leu Lys Asn Pro Asn Tyr Asn
50 55 60 65
Ser Glu Val Ala Gln Leu Ala Gly Gln Ile Leu Gly Lys Gln Glu Ile
70 75 80
Asn Arg Leu Thr Ser Leu Ala Asp Pro Lys Thr Phe Glu Pro Asn Met
85 90 95
Leu Thr Tyr Gly Gly Ala Met Asp Leu Met Val Asn Val His Gln
100 105 110
(2) INFORMATION FOR SEQ ID NO: 135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 336 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...333 (D) OTHER INFORMATION:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 1...60 (D) OTHER INFORMATION:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 61...333 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
ATG AAA ACC TTT AAA AAC CTG CTC TGT TTT AGC CTG ATC GCT ATG AGT 48
Met Lys Thr Phe Lys Asn Leu Leu Cys Phe Ser Leu Ile Ala Met Ser
-20 -15 -10 -5
TGG CTC CAA GCG GAC ATG TTG GAT AAT TTC ACT AGG GCC ATT AAC AGC 96
Trp Leu Gln Ala Asp Met Leu Asp Asn Phe Thr Arg Ala Ile Asn Ser
1 5 10
TAC ACC ACT AAA AAG CTT AAT GAA ATC AAG GAT CAA GTC AAT AGC GCT 144
Tyr Thr Thr Lys Lys Leu Asn Glu Ile Lys Asp Gln Val Asn Ser Ala
15 20 25
AAC CCT ACT AAA AAT CAC AAT ACC ACT TAT AAC GCT AAT GGC ATG CTC 192
Asn Pro Thr Lys Asn His Asn Thr Thr Tyr Asn Ala Asn Gly Met Leu
30 35 40
ATT AAC ATT GAT TGT AAA GTC TTA AAA AAT AAC TTC TAT TCG GTG TGT 240
Ile Asn Ile Asp Cys Lys Val Leu Lys Asn Asn Phe Tyr Ser Val Cys
45 50 55 60
TAT TCT AGC GAG TTA AAA AAC CCT ATT TAT GGC GTG AGC GTG TTG TTT 288
Tyr Ser Ser Glu Leu Lys Asn Pro Ile Tyr Gly Val Ser Val Leu Phe
65 70 75
GGG GAT TTA GTG GAT AAA AAT AAT ATT GAA AAA CGC TAT GAG TTT TAA 336
Gly Asp Leu Val Asp Lys Asn Asn Ile Glu Lys Arg Tyr Glu Phe
80 85 90
(2) INFORMATION FOR SEQ ID NO: 136:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:
Met Lys Thr Phe Lys Asn Leu Leu Cys Phe Ser Leu Ile Ala Met Ser
-20 -15 -10 -5
Trp Leu Gln Ala Asp Met Leu Asp Asn Phe Thr Arg Ala Ile Asn Ser
1 5 10
Tyr Thr Thr Lys Lys Leu Asn Glu Ile Lys Asp Gln Val Asn Ser Ala
15 20 25
Asn Pro Thr Lys Asn His Asn Thr Thr Tyr Asn Ala Asn Gly Met Leu
30 35 40
Ile Asn Ile Asp Cys Lys Val Leu Lys Asn Asn Phe Tyr Ser Val Cys
45 50 55 60
Tyr Ser Ser Glu Leu Lys Asn Pro Ile Tyr Gly Val Ser Val Leu Phe
65 70 75
Gly Asp Leu Val Asp Lys Asn Asn Ile Glu Lys Arg Tyr Glu Phe
80 85 90
(2) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2185 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 81...2069 (D) OTHER INFORMATION:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 81...144 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:
GTAAAAAATG GCTTATCTGT TCTAGCCTAC TCCCCTTATT TTTTCTTAAT CCCTTAGCGG 60
CAGAAGATGA TGGGTTTTTT ATG GGG GTG AGT TAT CAA ACT TCT CTA GCT 110
Met Gly Val Ser Tyr Gln Thr Ser Leu Ala
-20 -15
ATT CAA AGG GTG GAT AAC TCA GGG CTT AAC GCC AGT CAA GCC GCA TCC 158
Ile Gln Arg Val Asp Asn Ser Gly Leu Asn Ala Ser Gln Ala Ala Ser
-10 -5 1 5
ACC TAC ATC CGC CAG AAC GCT ATC GCT CTA GAA TCT GCG GCG GTG CCT 206
Thr Tyr Ile Arg Gln Asn Ala Ile Ala Leu Glu Ser Ala Ala Val Pro
10 15 20
TTA GCC TAT TAT TTA GAA GCG ATG GGC CAA CAA ACC AGG GTT TTA ATG 254
Leu Ala Tyr Tyr Leu Glu Ala Met Gly Gln Gln Thr Arg Val Leu Met
25 30 35
CAA ATG CTC TGC CCT GAT CCT TCC AAA CGC TGT TTG CTC TAT GCT GGA 302
Gln Met Leu Cys Pro Asp Pro Ser Lys Arg Cys Leu Leu Tyr Ala Gly
40 45 50
GGT TAT AAA AAC GGA TCA AGT AAT ACT AAC GGC GAT ACA GGC AAC AAC 350
Gly Tyr Lys Asn Gly Ser Ser Asn Thr Asn Gly Asp Thr Gly Asn Asn
55 60 65
CCC CCA AGA GGC AAT GTC AAT GCC ACC TTT GAT ATG CAA TCT CTA GTC 398
Pro Pro Arg Gly Asn Val Asn Ala Thr Phe Asp Met Gln Ser Leu Val
70 75 80 85
AAT AAT TTA AAC AAG CTC ACC CAA CTC ATC GGC GAG ACT TTA ATC CGT 446
Asn Asn Leu Asn Lys Leu Thr Gln Leu Ile Gly Glu Thr Leu Ile Arg
90 95 100
AAC CCT GAA AAT CTT TCT AAC GCC AAA GTC TTT AAT GTC AAA TTT GGC 494
Asn Pro Glu Asn Leu Ser Asn Ala Lys Val Phe Asn Val Lys Phe Gly
105 110 115
AAT CAA AGC ACT GTT ATT GCA TTG CCT GAG GGT CTA GCC AAT ACC ATG 542
Asn Gln Ser Thr Val Ile Ala Leu Pro Glu Gly Leu Ala Asn Thr Met
120 125 130
AAC GCT TTA AAC GAT GAT ATT ACC AAC GCT TTA ACC ACG CTC TGG TAT 590
Asn Ala Leu Asn Asp Asp Ile Thr Asn Ala Leu Thr Thr Leu Trp Tyr
135 140 145
AAC CAA ACC TTA ACG AAT AAA TCT TTT AAT AGC GGT AAT TCC GTG AAT 638
Asn Gln Thr Leu Thr Asn Lys Ser Phe Asn Ser Gly Asn Ser Val Asn
150 155 160 165
TTT AGC CCC CAA GTC TTG CAA CAC CTT TTA CAA GAC GGC TTA GCC ACA 686
Phe Ser Pro Gln Val Leu Gln His Leu Leu Gln Asp Gly Leu Ala Thr
170 175 180
AGT AAT CAA ACC ATT TGC AGC ACT CAA AAC CAA TGC ACC GCC ACC AAT 734
Ser Asn Gln Thr Ile Cys Ser Thr Gln Asn Gln Cys Thr Ala Thr Asn
185 190 195
GAA GCT AAA TCT ATC GCT CAA AAC GCC CAA AAC ATC TTC CAG GCT TTA 782
Glu Ala Lys Ser Ile Ala Gln Asn Ala Gln Asn Ile Phe Gln Ala Leu
200 205 210
ATG CAA GCA GGG ATT TTA GGG GGC TTA GCC AAT GAA AAG CAA TTT GGC 830
Met Gln Ala Gly Ile Leu Gly Gly Leu Ala Asn Glu Lys Gln Phe Gly
215 220 225
TTC ACT TAC AAC AAA GCC CCT AAT GGT AGC GAT TCC CAA CAA GGC TAC 878
Phe Thr Tyr Asn Lys Ala Pro Asn Gly Ser Asp Ser Gln Gln Gly Tyr
230 235 240 245
CAA AGC TTT AGC GGC CCG GGT TAT TAC ACT AAA AAC GGC GCT AAT GGC 926
Gln Ser Phe Ser Gly Pro Gly Tyr Tyr Thr Lys Asn Gly Ala Asn Gly
250 255 260
ACT ACC CAA GCG CCC TTG AAA GCA TTA CCC GCT GGA GCG ACA ATT GGA 974
Thr Thr Gln Ala Pro Leu Lys Ala Leu Pro Ala Gly Ala Thr Ile Gly
265 270 275
TCA GGC AAT GGC CAA TAC ACC TAC CAC CCC AGC TCG GCA GTC TAT TAT 1022
Ser Gly Asn Gly Gln Tyr Thr Tyr His Pro Ser Ser Ala Val Tyr Tyr
280 285 290
TTA GCC GAT AGC ATC ATT GCT AAT GGC ATC ACC GCT TCT ATG ATT TTT 1070
Leu Ala Asp Ser Ile Ile Ala Asn Gly Ile Thr Ala Ser Met Ile Phe
295 300 305
TCA GGC ATG CAA AAT TTC GCC AAT AAA GCC GCT AAA CTG ACA GGC ACT 1118
Ser Gly Met Gln Asn Phe Ala Asn Lys Ala Ala Lys Leu Thr Gly Thr
310 315 320 325
TCA AGC TAT AGC CAG ATG CAA GAT GCG ATC AAT TAC GGG GAA AGC TTG 1166
Ser Ser Tyr Ser Gln Met Gln Asp Ala Ile Asn Tyr Gly Glu Ser Leu
330 335 340
CTC AGT AAC ACC GTA GCG TAT GGG GAT TTC ATC ACC AAT TGG GTC GCC 1214
Leu Ser Asn Thr Val Ala Tyr Gly Asp Phe Ile Thr Asn Trp Val Ala
345 350 355
CCC TAT TTG GAT TTA AAC AAC AAA GGT TTG AAT TTC TTG CCT AGC TAT 1262
Pro Tyr Leu Asp Leu Asn Asn Lys Gly Leu Asn Phe Leu Pro Ser Tyr
360 365 370
GGG GGG CAA TTG AAT GGT GCT AAT CAT CAA ACC CCA CAA TTA ACC CCG 1310
Gly Gly Gln Leu Asn Gly Ala Asn His Gln Thr Pro Gln Leu Thr Pro
375 380 385
CAA CAA GCC CAA CAA GAG CAA AAA GTC ATC ATG AAC CAA CTA GAG CAA 1358
Gln Gln Ala Gln Gln Glu Gln Lys Val Ile Met Asn Gln Leu Glu Gln
390 395 400 405
GCC ACA AAC GCC CCC ACC CCC GCG CAA ATA AAC AGG ATT TTA GCC AAC 1406
Ala Thr Asn Ala Pro Thr Pro Ala Gln Ile Asn Arg Ile Leu Ala Asn
410 415 420
CCC TAT TCC CCC ACG GCA AAA ACT TTA ATG GCT TAT GGG CTT TAT CGC 1454
Pro Tyr Ser Pro Thr Ala Lys Thr Leu Met Ala Tyr Gly Leu Tyr Arg
425 430 435
TCT AAA GCA GTG ATT GGC GGG GTG ATT GAT GAA ATG CAA ACT AAA GTG 1502
Ser Lys Ala Val Ile Gly Gly Val Ile Asp Glu Met Gln Thr Lys Val
440 445 450
AAT CAA GTC TAT CAA ATG GGC TTT GCT AGG AAT TTT TTG GAG CAT AAC 1550
Asn Gln Val Tyr Gln Met Gly Phe Ala Arg Asn Phe Leu Glu His Asn
455 460 465
TCT AAT TCT AAT AAC ATG AAC GGC TTT GGC GTG AAA ATG GGC TAT AAG 1598
Ser Asn Ser Asn Asn Met Asn Gly Phe Gly Val Lys Met Gly Tyr Lys
470 475 480 485
CAA TTC TTT GGC AAA AAG CGC ATG TTT GGG CTT AGG TAT TAT GGT TTT 1646
Gln Phe Phe Gly Lys Lys Arg Met Phe Gly Leu Arg Tyr Tyr Gly Phe
490 495 500
TAT GAT TTT GGT TAC GCT CAA TTT GGC GCA GAA TCT TCT TTA GTG AAA 1694
Tyr Asp Phe Gly Tyr Ala Gln Phe Gly Ala Glu Ser Ser Leu Val Lys
505 510 515
GCC ACC CTC TCT AGC TAT GGG GCA GGC ACA GAC TTT CTT TAT AAT GTT 1742
Ala Thr Leu Ser Ser Tyr Gly Ala Gly Thr Asp Phe Leu Tyr Asn Val
520 525 530
TTT ACC CGA AAA AGA GGG ACT GAA GCG ATA GAT ATC GGT TTT TTT GCC 1790
Phe Thr Arg Lys Arg Gly Thr Glu Ala Ile Asp Ile Gly Phe Phe Ala
535 540 545
GGT ATC CAA CTT GCA GGG CAA ACT TGG AAA ACG AAT TTT TTA GAT CAA 1838
Gly Ile Gln Leu Ala Gly Gln Thr Trp Lys Thr Asn Phe Leu Asp Gln
550 555 560 565
GTG GAT GGC AAC CAT CTT AAA CCC AAA GAC ACT TCT TTC CAA TTC CTT 1886
Val Asp Gly Asn His Leu Lys Pro Lys Asp Thr Ser Phe Gln Phe Leu
570 575 580
TTT GAT TTA GGC ATA AGG ACC AAT TTT TCC AAA ATC GCT CAT CAA AAA 1934
Phe Asp Leu Gly Ile Arg Thr Asn Phe Ser Lys Ile Ala His Gln Lys
585 590 595
AGA TCC CGT TTT TCT CAA GGG ATA GAA TTT GGC CTT AAA ATA CCG GTG 1982
Arg Ser Arg Phe Ser Gln Gly Ile Glu Phe Gly Leu Lys Ile Pro Val
600 605 610
CTT TAT CAC ACC TAT TAC CAA TCA GAA GGC GTT ACA GCG AAG TAT AGA 2030
Leu Tyr His Thr Tyr Tyr Gln Ser Glu Gly Val Thr Ala Lys Tyr Arg
615 620 625
AGA GCC TTT AGT TTT TAT GTG GGC TAC AAC ATA GGC TTT TGATTAAACA AA 2081
Arg Ala Phe Ser Phe Tyr Val Gly Tyr Asn Ile Gly Phe
630 635 640
ATAAGGGAAA AATATGATAA AAAAAGCTAG AAAATTCATA CCATTCTTTT TAATTGGCTC 2141
CCTCTTAGCT GAAGACAATG GCTGGTATAT GTCTGTAGGC TATC 2185
(2) INFORMATION FOR SEQ ID NO: 138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 663 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (ix) FEATURE:
(A) NAME/KEY: Signal Sequence
(B) LOCATION: 1...21 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
Met Gly Val Ser Tyr Gln Thr Ser Leu Ala Ile Gln Arg Val Asp Asn
-20 -15 -10
Ser Gly Leu Asn Ala Ser Gln Ala Ala Ser Thr Tyr Ile Arg Gln Asn
-5 1 5 10
Ala Ile Ala Leu Glu Ser Ala Ala Val Pro Leu Ala Tyr Tyr Leu Glu
15 20 25
Ala Met Gly Gln Gln Thr Arg Val Leu Met Gln Met Leu Cys Pro Asp
30 35 40
Pro Ser Lys Arg Cys Leu Leu Tyr Ala Gly Gly Tyr Lys Asn Gly Ser
45 50 55
Ser Asn Thr Asn Gly Asp Thr Gly Asn Asn Pro Pro Arg Gly Asn Val
60 65 70 75
Asn Ala Thr Phe Asp Met Gln Ser Leu Val Asn Asn Leu Asn Lys Leu
80 85 90
Thr Gln Leu Ile Gly Glu Thr Leu Ile Arg Asn Pro Glu Asn Leu Ser
95 100 105
Asn Ala Lys Val Phe Asn Val Lys Phe Gly Asn Gln Ser Thr Val Ile
110 115 120
Ala Leu Pro Glu Gly Leu Ala Asn Thr Met Asn Ala Leu Asn Asp Asp
125 130 135
Ile Thr Asn Ala Leu Thr Thr Leu Trp Tyr Asn Gln Thr Leu Thr Asn
140 145 150 155
Lys Ser Phe Asn Ser Gly Asn Ser Val Asn Phe Ser Pro Gln Val Leu
160 165 170
Gln His Leu Leu Gln Asp Gly Leu Ala Thr Ser Asn Gln Thr Ile Cys
175 180 185
Ser Thr Gln Asn Gln Cys Thr Ala Thr Asn Glu Ala Lys Ser Ile Ala
190 195 200
Gln Asn Ala Gln Asn Ile Phe Gln Ala Leu Met Gln Ala Gly Ile Leu
205 210 215
Gly Gly Leu Ala Asn Glu Lys Gln Phe Gly Phe Thr Tyr Asn Lys Ala
220 225 230 235
Pro Asn Gly Ser Asp Ser Gln Gln Gly Tyr Gln Ser Phe Ser Gly Pro
240 245 250
Gly Tyr Tyr Thr Lys Asn Gly Ala Asn Gly Thr Thr Gln Ala Pro Leu
255 260 265
Lys Ala Leu Pro Ala Gly Ala Thr Ile Gly Ser Gly Asn Gly Gln Tyr
270 275 280
Thr Tyr His Pro Ser Ser Ala Val Tyr Tyr Leu Ala Asp Ser Ile Ile
285 290 295
Ala Asn Gly Ile Thr Ala Ser Met Ile Phe Ser Gly Met Gln Asn Phe
300 305 310 315
Ala Asn Lys Ala Ala Lys Leu Thr Gly Thr Ser Ser Tyr Ser Gln Met
320 325 330
Gln Asp Ala Ile Asn Tyr Gly Glu Ser Leu Leu Ser Asn Thr Val Ala
335 340 345
Tyr Gly Asp Phe Ile Thr Asn Trp Val Ala Pro Tyr Leu Asp Leu Asn
350 355 360
Asn Lys Gly Leu Asn Phe Leu Pro Ser Tyr Gly Gly Gln Leu Asn Gly
365 370 375
Ala Asn His Gln Thr Pro Gln Leu Thr Pro Gln Gln Ala Gln Gln Glu
380 385 390 395
Gln Lys Val Ile Met Asn Gln Leu Glu Gln Ala Thr Asn Ala Pro Thr
400 405 410
Pro Ala Gln Ile Asn Arg Ile Leu Ala Asn Pro Tyr Ser Pro Thr Ala
415 420 425
Lys Thr Leu Met Ala Tyr Gly Leu Tyr Arg Ser Lys Ala Val Ile Gly
430 435 440
Gly Val Ile Asp Glu Met Gln Thr Lys Val Asn Gln Val Tyr Gln Met
445 450 455
Gly Phe Ala Arg Asn Phe Leu Glu His Asn Ser Asn Ser Asn Asn Met
460 465 470 475
Asn Gly Phe Gly Val Lys Met Gly Tyr Lys Gln Phe Phe Gly Lys Lys
480 485 490
Arg Met Phe Gly Leu Arg Tyr Tyr Gly Phe Tyr Asp Phe Gly Tyr Ala
495 500 505
Gln Phe Gly Ala Glu Ser Ser Leu Val Lys Ala Thr Leu Ser Ser Tyr
510 515 520
Gly Ala Gly Thr Asp Phe Leu Tyr Asn Val Phe Thr Arg Lys Arg Gly
525 530 535
Thr Glu Ala Ile Asp Ile Gly Phe Phe Ala Gly Ile Gln Leu Ala Gly
540 545 550 555
Gln Thr Trp Lys Thr Asn Phe Leu Asp Gln Val Asp Gly Asn His Leu
560 565 570
Lys Pro Lys Asp Thr Ser Phe Gln Phe Leu Phe Asp Leu Gly Ile Arg
575 580 585
Thr Asn Phe Ser Lys Ile Ala His Gln Lys Arg Ser Arg Phe Ser Gln
590 595 600
Gly Ile Glu Phe Gly Leu Lys Ile Pro Val Leu Tyr His Thr Tyr Tyr
605 610 615
Gln Ser Glu Gly Val Thr Ala Lys Tyr Arg Arg Ala Phe Ser Phe Tyr
620 625 630 635
Val Gly Tyr Asn Ile Gly Phe
640
(2) INFORMATION FOR SEQ ID NO: 139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 51...1160
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
ATTATTTTTA ATCTTGCATG AAATCTTAAA TATAGAATTA GTCCCTTTGG ATG GGA 56
Met Gly
1
TTT TCN CTC GCG CTA GGC TAT TTG TGT TTG TTT ATA TTC GTT TTA AGC 104
Phe Xaa Leu Ala Leu Gly Tyr Leu Cys Leu Phe Ile Phe Val Leu Ser
5 10 15
GCT TCT TTA ATC TCT GAA AAA GCC TTA TCC AAG CAG TAT TTG CAA ACC 152
Ala Ser Leu Ile Ser Glu Lys Ala Leu Ser Lys Gln Tyr Leu Gln Thr
20 25 30
GCT AAA GAT AAA ATC ACC TCT TTA AAG AAT TTA AAA GTC ATC GCC ATT 200
Ala Lys Asp Lys Ile Thr Ser Leu Lys Asn Leu Lys Val Ile Ala Ile
35 40 45 50
ACC GGA AGC TTT GGG AAA ACC AGC ACC AAA AAT TTC TTG CTT CAA ATC 248
Thr Gly Ser Phe Gly Lys Thr Ser Thr Lys Asn Phe Leu Leu Gln Ile
55 60 65
TTA CAA ACC ACA TTC AAC GCG CAT GCA AGC CCC AAA AGC GTC AAT ACC 296
Leu Gln Thr Thr Phe Asn Ala His Ala Ser Pro Lys Ser Val Asn Thr
70 75 80
CTT TTA GGG CTT GCG AAT GAT ATT AAT CAG AAT TTA GAC GAT AGG AGT 344
Leu Leu Gly Leu Ala Asn Asp Ile Asn Gln Asn Leu Asp Asp Arg Ser
85 90 95
GAA ATC TAT ATC GCT GAA GCC GGG GCA AGG AAT AAG GGC GAT ATT AAA 392
Glu Ile Tyr Ile Ala Glu Ala Gly Ala Arg Asn Lys Gly Asp Ile Lys
100 105 110
GAA ATC ACC TGT CTC ATT GAA CCG CAC CTT GTT GTG GTT GCA GAA GTG 440
Glu Ile Thr Cys Leu Ile Glu Pro His Leu Val Val Val Ala Glu Val
115 120 125 130
GGC GAA CAG CAT TTA GAA TAC TTT AAA ACT TTA GAA AAT ATT TGC GAG 488
Gly Glu Gln His Leu Glu Tyr Phe Lys Thr Leu Glu Asn Ile Cys Glu
135 140 145
ACT AAA GCG GAA TTA TTG GAT TCC AAA CGC TTA GAA AAA GCC TTT TGT 536
Thr Lys Ala Glu Leu Leu Asp Ser Lys Arg Leu Glu Lys Ala Phe Cys
150 155 160
TAC TCG GTG GAA AAG ATC AAG CCC TAT GCC CCT AAA GAT AGC CCT TTA 584
Tyr Ser Val Glu Lys Ile Lys Pro Tyr Ala Pro Lys Asp Ser Pro Leu
165 170 175
ATA GAC TAT TCT AGC CTG GTT AAA AAC ATC CAA TCC ACT TTA AAA GGC 632
Ile Asp Tyr Ser Ser Leu Val Lys Asn Ile Gln Ser Thr Leu Lys Gly
180 185 190
ACT TCT TTT GAA ATG CTT ATA GGT AGC GTT TGG GAA AGA TTT GAA ACA 680
Thr Ser Phe Glu Met Leu Ile Gly Ser Val Trp Glu Arg Phe Glu Thr
195 200 205 210
AAG GTT CTA GGG GAG TTT AGC GCT TAT AAT ATC GCT TCA GCC ATT TTA 728
Lys Val Leu Gly Glu Phe Ser Ala Tyr Asn Ile Ala Ser Ala Ile Leu
215 220 225
ATC GCT AAG CAT TTA GGC TTA GAG ACC GAA AGG ATC AAA CGG CTT GTT 776
Ile Ala Lys His Leu Gly Leu Glu Thr Glu Arg Ile Lys Arg Leu Val
230 235 240
TTA GAA CTC AAC CCT ATT GCT CAT CGT TTG CAA CTT TTG GAA GTG AAT 824
Leu Glu Leu Asn Pro Ile Ala His Arg Leu Gln Leu Leu Glu Val Asn
245 250 255
CAA AAA ATC ATC ATA GAC GAT AGC TTT AAT GGG AAT TTA AAG GGC ATG 872
Gln Lys Ile Ile Ile Asp Asp Ser Phe Asn Gly Asn Leu Lys Gly Met
260 265 270
TTA GAG GGC ATT CGT TTA GCG AGT TTG CAC AAA GGG CGT AAA GTC ATT 920
Leu Glu Gly Ile Arg Leu Ala Ser Leu His Lys Gly Arg Lys Val Ile
275 280 285 290
GTA ACA CCG GGC TTA GTG GAA AGC AAT ACA GAA AGT AAT GAG GCT TTA 968
Val Thr Pro Gly Leu Val Glu Ser Asn Thr Glu Ser Asn Glu Ala Leu
295 300 305
GCG CAA AAA ATA GAC GGG GTT TTT GAT GTC GCT ATC ATC ACA GGG GAG 1016
Ala Gln Lys Ile Asp Gly Val Phe Asp Val Ala Ile Ile Thr Gly Glu
310 315 320
TTG AAT TCC AAA ACG ATT GCT TCA CAA TTG AAA ACC CCC CAA AAA ATC 1064
Leu Asn Ser Lys Thr Ile Ala Ser Gln Leu Lys Thr Pro Gln Lys Ile
325 330 335
TTA CTC AAG GAT AAG GCG CAA TTG GAA AAT ATC TTA CAA GCC ACC ACG 1112
Leu Leu Lys Asp Lys Ala Gln Leu Glu Asn Ile Leu Gln Ala Thr Thr
340 345 350
ATT CAA GGC GAT TTG ATT TTA TTC GCT AAT GAC GCC CCT AAT TAC ATT T 1161
Ile Gln Gly Asp Leu Ile Leu Phe Ala Asn Asp Ala Pro Asn Tyr Ile
355 360 365 370
AGGAAATGAA CATGCAACAT TTATACGCTC CTTGGCGCGA AAGTTATTTG AA 1213
(2) INFORMATION FOR SEQ ID NO: 140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 370 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:
Met Gly Phe Xaa Leu Ala Leu Gly Tyr Leu Cys Leu Phe Ile Phe Val
1 5 10 15
Leu Ser Ala Ser Leu Ile Ser Glu Lys Ala Leu Ser Lys Gln Tyr Leu
20 25 30
Gln Thr Ala Lys Asp Lys Ile Thr Ser Leu Lys Asn Leu Lys Val Ile
35 40 45
Ala Ile Thr Gly Ser Phe Gly Lys Thr Ser Thr Lys Asn Phe Leu Leu
50 55 60
Gln Ile Leu Gln Thr Thr Phe Asn Ala His Ala Ser Pro Lys Ser Val
65 70 75 80
Asn Thr Leu Leu Gly Leu Ala Asn Asp Ile Asn Gln Asn Leu Asp Asp
85 90 95
Arg Ser Glu Ile Tyr Ile Ala Glu Ala Gly Ala Arg Asn Lys Gly Asp
100 105 110
Ile Lys Glu Ile Thr Cys Leu Ile Glu Pro His Leu Val Val Val Ala
115 120 125
Glu Val Gly Glu Gln His Leu Glu Tyr Phe Lys Thr Leu Glu Asn Ile
130 135 140
Cys Glu Thr Lys Ala Glu Leu Leu Asp Ser Lys Arg Leu Glu Lys Ala
145 150 155 160
Phe Cys Tyr Ser Val Glu Lys Ile Lys Pro Tyr Ala Pro Lys Asp Ser
165 170 175
Pro Leu Ile Asp Tyr Ser Ser Leu Val Lys Asn Ile Gln Ser Thr Leu
180 185 190
Lys Gly Thr Ser Phe Glu Met Leu Ile Gly Ser Val Trp Glu Arg Phe
195 200 205
Glu Thr Lys Val Leu Gly Glu Phe Ser Ala Tyr Asn Ile Ala Ser Ala
210 215 220
Ile Leu Ile Ala Lys His Leu Gly Leu Glu Thr Glu Arg Ile Lys Arg
225 230 235 240
Leu Val Leu Glu Leu Asn Pro Ile Ala His Arg Leu Gln Leu Leu Glu
245 250 255
Val Asn Gln Lys Ile Ile Ile Asp Asp Ser Phe Asn Gly Asn Leu Lys
260 265 270
Gly Met Leu Glu Gly Ile Arg Leu Ala Ser Leu His Lys Gly Arg Lys
275 280 285
Val Ile Val Thr Pro Gly Leu Val Glu Ser Asn Thr Glu Ser Asn Glu
290 295 300
Ala Leu Ala Gln Lys Ile Asp Gly Val Phe Asp Val Ala Ile Ile Thr
305 310 315 320
Gly Glu Leu Asn Ser Lys Thr Ile Ala Ser Gln Leu Lys Thr Pro Gln
325 330 335
Lys Ile Leu Leu Lys Asp Lys Ala Gln Leu Glu Asn Ile Leu Gln Ala
340 345 350
Thr Thr Ile Gln Gly Asp Leu Ile Leu Phe Ala Asn Asp Ala Pro Asn
355 360 365
Tyr Ile
370
(2) INFORMATION FOR SEQ ID NO: 141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 360 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 82...270 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
ACTTAAAGGC ATAAAAACCT TAAGCTTTTT GAGTTTCAAA AGGGTTTCAA GCTTTTTATA 60
AGACTTTTTT TGAATGAGTA A GGA GAA AAT ATT TTG TTC CAT AAA CTG ATC 111
Gly Glu Asn Ile Leu Phe His Lys Leu Ile 1 5 10
TTA ACA TGC TTT TTA GCG CTT GTA GCA ATA ACC ATT CAA GCT TGC GGT 159 Leu Thr Cys Phe Leu Ala Leu Val Ala Ile Thr Ile Gln Ala Cys Gly 15 20 25
TAT AAA GCC CCT CCA TTC AAT GAA AAA CCC GCT AAA AAA ACT TCA AAC 207 Tyr Lys Ala Pro Pro Phe Asn Glu Lys Pro Ala Lys Lys Thr Ser Asn 30 35 40
AGC TCT AAT TCT TCT ATG CAA ACG CCC ACC AAC AGC ACC ACG CCA GAA 255 Ser Ser Asn Ser Ser Met Gln Thr Pro Thr Asn Ser Thr Thr Pro Glu 45 50 55
TTT TTA AAT CAG CCT TAAAATCACT GCTCTTGTTT AAGGGCTTTG ATTTCTAGGG T 311 Phe Leu Asn Gln Pro 60
TTTTGTGGCT AACTTTTGAN STTCGCTTTC ATCATGCGTT ACCATAATG 360
(2) INFORMATION FOR SEQ ID NO: 142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
Gly Glu Asn Ile Leu Phe His Lys Leu Ile Leu Thr Cys Phe Leu Ala
1 5 10 15
Leu Val Ala Ile Thr Ile Gln Ala Cys Gly Tyr Lys Ala Pro Pro Phe
20 25 30
Asn Glu Lys Pro Ala Lys Lys Thr Ser Asn Ser Ser Asn Ser Ser Met
35 40 45
Gln Thr Pro Thr Asn Ser Thr Thr Pro Glu Phe Leu Asn Gln Pro
50 55 60
(2) INFORMATION FOR SEQ ID NO: 143:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1024 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 115...921 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
AGTTGGCAAA AACGCAGAGA CAGTAACGCA AAGGCAAATA AAGAGACTCA TTTTAAACAA 60 GCGAATGCCA TTACAAATAT AATCAGATCA GTTGGTGGGT TTTTTACAAA GATT ATG 117
Met 1
AAG AGA GTT AGA GAA CTT GTA AAA AAA CAT CCC GAG AAA AGC AGT GTG 165 Lys Arg Val Arg Glu Leu Val Lys Lys His Pro Glu Lys Ser Ser Val 5 10 15
GCA TTA GTA GTA TTA ACC CAT GCT GCA TGC AAG AAA GCG AAA GAA TTG 213 Ala Leu Val Val Leu Thr His Ala Ala Cys Lys Lys Ala Lys Glu Leu 20 25 30
GAC GAT AAA GTC CAG GAT AAA TCC AAA CAA GCT GAA AAA GAA AAT CAA 261 Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn Gln 35 40 45
ATC AAT TGG TGG AAA TAT TCA GGA TTA ACA ATA GCG ACA AGT TTA TTA 309 Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser Leu Leu 50 55 60 65
TTA GCC GCT TGT AGT GTT GGT GAT ATT GAT AAA CAG ATA GAG TTA GAA 357 Leu Ala Ala Cys Ser Val Gly Asp Ile Asp Lys Gln Ile Glu Leu Glu 70 75 80
CAA GAA AAA AAG GAA GCT GAA AAC GCT AGG GAT AGA GCG AAC AAG AGT 405 Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn Lys Ser 85 90 95
GGG ATA GAA CTG GAA CAG GAA AAA CAA AAG ACC ATT AAA GAA CAA AAA 453 Gly Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ile Lys Glu Gln Lys 100 105 110
GAT TTA GTT AAA AAA GCA GAA CAA AAT TGC CAA GAA AAT CAT GGC CAA 501 Asp Leu Val Lys Lys Ala Glu Gln Asn Cys Gln Glu Asn His Gly Gln 115 120 125
TTC TTT ATG AAA AAA TTA GGA ATT AAG GGT GGC ATT GCT ATA GAA GTA 549 Phe Phe Met Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu Val 130 135 140 145
GAA GCT GAA TGC AAA ACC CCT AAA CCT GCA AAA ACC AAT CAA ACC CCT 597 Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro 150 155 160
ATC CAG CCA AAA CAC CTC CCC AAC TCT AAA CAA CCC CAC TCT CAA AGA 645 Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His Ser Gln Arg 165 170 175
GGA TCA AAA GCG CAA GAG CTT ATC GCT TAT TTG CAA AAA GAG TTA GAA 693 Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu Glu 180 185 190
TCT CTG CCC TAT TCA CAA AAA GCT ATC GCT AAA CAA GTG AAT TTT TAC 741 Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr 195 200 205
AGG CCA AGT TCT GTC GCT TAT TTA GAA CTA GAC CCT AGA GAT TTT AAG 789 Arg Pro Ser Ser Val Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys 210 215 220 225
GTT ACA GAA GAA TGG CAA AAA GAA AAT CTA AAA ATA CGC TCT AAA GCT 837 Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala 230 235 240
CAA GCT AAA ATG CTT GGA AAT GAG AAA CCC ACA AGC CCA CCT TTC AAC 885 Gln Ala Lys Met Leu Gly Asn Glu Lys Pro Thr Ser Pro Pro Phe Asn 245 250 255
CTC TCA AAG CCT TTT GTT CGT TCA AAA AAT ATT TGC TGATGTTAAT AAAGAA 937 Leu Ser Lys Pro Phe Val Arg Ser Lys Asn Ile Cys 260 265
ATAGAAGCAG TTGCTAATAC TGAAAAGAAA GCAGAAAAAG MGGGTTATGG TTATAGTAAA 997 AGGATGTAGG CATAAGAAAA TAAGAAC 1024
(2) INFORMATION FOR SEQ ID NO: 144:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 269 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
Met Lys Arg Val Arg Glu Leu Val Lys Lys His Pro Glu Lys Ser Ser
1 5 10 15
Val Ala Leu Val Val Leu Thr His Ala Ala Cys Lys Lys Ala Lys Glu
20 25 30
Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu Asn
35 40 45
Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Thr Ser Leu
50 55 60
Leu Leu Ala Ala Cys Ser Val Gly Asp Ile Asp Lys Gln Ile Glu Leu
65 70 75 80
Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn Lys
85 90 95
Ser Gly Ile Glu Leu Glu Gln Glu Lys Gln Lys Thr Ile Lys Glu Gln
100 105 110
Lys Asp Leu Val Lys Lys Ala Glu Gln Asn Cys Gln Glu Asn His Gly
115 120 125
Gln Phe Phe Met Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala Ile Glu
130 135 140
Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn Gln Thr
145 150 155 160
Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His Ser Gln
165 170 175
Arg Gly Ser Lys Ala Gln Glu Leu Ile Ala Tyr Leu Gln Lys Glu Leu
180 185 190
Glu Ser Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val Asn Phe
195 200 205
Tyr Arg Pro Ser Ser Val Ala Tyr Leu Glu Leu Asp Pro Arg Asp Phe
210 215 220
Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg Ser Lys
225 230 235 240
Ala Gln Ala Lys Met Leu Gly Asn Glu Lys Pro Thr Ser Pro Pro Phe
245 250 255
Asn Leu Ser Lys Pro Phe Val Arg Ser Lys Asn Ile Cys
260 265
(2) INFORMATION FOR SEQ ID NO: 145:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 669 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 88...603 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:
AAAATAAGGA GGAATTGTTT GATTTTACGA TTGGCTGGAG CAAGCGTTTT AACGGCTTGT 60 GTCTTTTCGG GGTGTTTTTT TTTAAAA ATG TTT GAT AAA AAA CTT TCT AGT AAC 114
Met Phe Asp Lys Lys Leu Ser Ser Asn 1 5
GAT TGG CAT ATC CAA AAA GTG GAA ATG AAC CAT CAA GTC TAT GAC ATT 162 Asp Trp His Ile Gln Lys Val Glu Met Asn His Gln Val Tyr Asp Ile 10 15 20 25
GAA ACC ATG CTC GCT GAT AGC GCT TTT AGA GAG CAT GAA GAA GAG CAA 210 Glu Thr Met Leu Ala Asp Ser Ala Phe Arg Glu His Glu Glu Glu Gln 30 35 40
GAT TCC TCT CTA AAT ACC GCT TTG CCT GAA GAT AAA ACA GCG ATT GAA 258 Asp Ser Ser Leu Asn Thr Ala Leu Pro Glu Asp Lys Thr Ala Ile Glu 45 50 55
GCC AAA GAG CAA GAG CAA AAA GAA AAA AGA AAA CGC TGG TAT GAG CTT 306 Ala Lys Glu Gln Glu Gln Lys Glu Lys Arg Lys Arg Trp Tyr Glu Leu 60 65 70
TTT AAA AAG AAA CCA AAG CCC AAA AGC TCT ATG GGA GAG TTT GTG TTT 354 Phe Lys Lys Lys Pro Lys Pro Lys Ser Ser Met Gly Glu Phe Val Phe 75 80 85
GAT CAA AAA GAA AAT CGT ATT TAT GGC AAA GGC TAT TGC AAC CGG TAT 402 Asp Gln Lys Glu Asn Arg Ile Tyr Gly Lys Gly Tyr Cys Asn Arg Tyr 90 95 100 105
TTT GCC AGC TAT GTA TGG CAG GGC GAT AGG CAC ATT GGG ATT GAA GAT 450 Phe Ala Ser Tyr Val Trp Gln Gly Asp Arg His Ile Gly Ile Glu Asp 110 115 120
AGC GGG ATT TCA AGA AAA GTG TGT AAA GAT GAG CAT TTA ATG GCG TTT 498 Ser Gly Ile Ser Arg Lys Val Cys Lys Asp Glu His Leu Met Ala Phe 125 130 135
GAA TTG GAA TTT ATG GAG AAT TTT AAG GGT AAT TTT ACG GTA ACT AAG 546 Glu Leu Glu Phe Met Glu Asn Phe Lys Gly Asn Phe Thr Val Thr Lys 140 145 150
GGC AAG GAC ACG CTC ATT TTA GAC AAC CAA AAA ATG AAA ATT TAT TTG 594 Gly Lys Asp Thr Leu Ile Leu Asp Asn Gln Lys Met Lys Ile Tyr Leu 155 160 165
AAA ACG CCT TGAGTGGGTT TTTGATTTCA AAACAATCTA AGATCACTAA ATTAGGGAT 652
Lys Thr Pro
170
TAAAAAGAAA TTTTTAA 669
(2) INFORMATION FOR SEQ ID NO: 146:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
Met Phe Asp Lys Lys Leu Ser Ser Asn Asp Trp His Ile Gln Lys Val
1 5 10 15
Glu Met Asn His Gln Val Tyr Asp Ile Glu Thr Met Leu Ala Asp Ser
20 25 30
Ala Phe Arg Glu His Glu Glu Glu Gln Asp Ser Ser Leu Asn Thr Ala
35 40 45
Leu Pro Glu Asp Lys Thr Ala Ile Glu Ala Lys Glu Gln Glu Gln Lys
50 55 60
Glu Lys Arg Lys Arg Trp Tyr Glu Leu Phe Lys Lys Lys Pro Lys Pro
65 70 75 80
Lys Ser Ser Met Gly Glu Phe Val Phe Asp Gln Lys Glu Asn Arg Ile
85 90 95
Tyr Gly Lys Gly Tyr Cys Asn Arg Tyr Phe Ala Ser Tyr Val Trp Gln
100 105 110
Gly Asp Arg His Ile Gly Ile Glu Asp Ser Gly Ile Ser Arg Lys Val
115 120 125
Cys Lys Asp Glu His Leu Met Ala Phe Glu Leu Glu Phe Met Glu Asn
130 135 140
Phe Lys Gly Asn Phe Thr Val Thr Lys Gly Lys Asp Thr Leu Ile Leu
145 150 155 160
Asp Asn Gln Lys Met Lys Ile Tyr Leu Lys Thr Pro
165 170
(2) INFORMATION FOR SEQ ID NO: 147:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1350 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 87...1280 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:
ATCAATCTAA CTTGAGTGGA TTTTTCGTAT TAGTTTCCAT GATATAATTT TGAAAAGTAA 60
GATTGTTTTT TAAAAAAAGG TTGGTA ATG GAA TCA GTA AAA ACA GGA AAA ACA 113
Met Glu Ser Val Lys Thr Gly Lys Thr 1 5
AAT AAG GTT GGC AAG AAT ACA GAG ATG GCT AAT ACA AAG GCA AAT AAA 161 Asn Lys Val Gly Lys Asn Thr Glu Met Ala Asn Thr Lys Ala Asn Lys 10 15 20 25
GAG ACT CAT TTT AAA CAA GTG AGC GCC ATT ACA AAT ATA ATC AGA TCA 209 Glu Thr His Phe Lys Gln Val Ser Ala Ile Thr Asn Ile Ile Arg Ser 30 35 40
GTT GGT GGG TTT TTT ACA AAA ATT GCA AAG AGA GTT AGA GGA CTT GTA 257 Val Gly Gly Phe Phe Thr Lys Ile Ala Lys Arg Val Arg Gly Leu Val 45 50 55
AAA AAA CAC CCC AAG AAA AGC AGT GCG GCA TTA GTA GTA TTG ACC CAT 305 Lys Lys His Pro Lys Lys Ser Ser Ala Ala Leu Val Val Leu Thr His 60 65 70
ATT GCG TGC AAG AAA GCG AAA GAA TTA GAC GAT AAA GTC CAA GAT AAA 353 Ile Ala Cys Lys Lys Ala Lys Glu Leu Asp Asp Lys Val Gln Asp Lys 75 80 85
TCC AAA CAA GCT GAA AAA GAA AAT CAA ATC AAT TGG TGG AAA TAT TCA 401 Ser Lys Gln Ala Glu Lys Glu Asn Gln Ile Asn Trp Trp Lys Tyr Ser 90 95 100 105
GGA TTA ACA ATA GCG GCA AGT TTA TTA TTA GCC GCT TGT AGC GCT GGT 449 Gly Leu Thr Ile Ala Ala Ser Leu Leu Leu Ala Ala Cys Ser Ala Gly 110 115 120
GAT ACT GAT AAA CAG ATA GAA CTA GAA CAA GAA AAA AAG GAA GCT GAA 497 Asp Thr Asp Lys Gln Ile Glu Leu Glu Gln Glu Lys Lys Glu Ala Glu 125 130 135
AAC GCT AGG GAT AGA GCG AAC AAG AGT GGG ATA GAA CTA GAA CAA GAA 545 Asn Ala Arg Asp Arg Ala Asn Lys Ser Gly Ile Glu Leu Glu Gln Glu 140 145 150
AGA CAG AAA ACA AAC AAG AGT GGG ATA GAA CTC GCT AAT AGT CAA ATA 593 Arg Gln Lys Thr Asn Lys Ser Gly Ile Glu Leu Ala Asn Ser Gln Ile 155 160 165
AAA GCA GAA CAA GAA AGA CAA AAG ACA GAA CAA GAA AAA CAA AAA GCA 641 Lys Ala Glu Gln Glu Arg Gln Lys Thr Glu Gln Glu Lys Gln Lys Ala 170 175 180 185
AAT AAG AGT GCG ATA GAG TTA GAA CAG CAA AAA CAA AAG ACC ATT AAT 689 Asn Lys Ser Ala Ile Glu Leu Glu Gln Gln Lys Gln Lys Thr Ile Asn 190 195 200
ACA CAA AGA GAT TTG ATT AAA GAA CAG AAA GAT TTC ATT AAA GAA ACA 737 Thr Gln Arg Asp Leu Ile Lys Glu Gln Lys Asp Phe Ile Lys Glu Thr
205 210 215
GAA CAA AAT TGC CAA GAA AAT CAT AAT CAA TTC TTT ATT AAA AAA TTA 785 Glu Gln Asn Cys Gln Glu Asn His Asn Gln Phe Phe Ile Lys Lys Leu 220 225 230
GGA ATT AAG GGT GGC ATT GCT ATA GAA GTA GAA GCT GAA TGC AAA ACC 833 Gly Ile Lys Gly Gly Ile Ala Ile Glu Val Glu Ala Glu Cys Lys Thr 235 240 245
CCT AAA CCT GCA AAA ACC AAT CAA ACC CCT ATC CAG CCA AAA CAC CTC 881 Pro Lys Pro Ala Lys Thr Asn Gln Thr Pro Ile Gln Pro Lys His Leu 250 255 260 265
CCA AAC TCT AAA CAA CCT CAT TCT CAA AGA GGA TCA AAA GCG CAA GAG 929 Pro Asn Ser Lys Gln Pro His Ser Gln Arg Gly Ser Lys Ala Gln Glu 270 275 280
TTT ATC GCT TAT TTG CAA AAA GAG CTA GAA TTT CTG CCC TAT TCG CAA 977 Phe Ile Ala Tyr Leu Gln Lys Glu Leu Glu Phe Leu Pro Tyr Ser Gln 285 290 295
AAA GCT ATC GCT AAA CAA GTG AAT TTC TAT AAA CCA AGT TCT ATC GCT 1025 Lys Ala Ile Ala Lys Gln Val Asn Phe Tyr Lys Pro Ser Ser Ile Ala 300 305 310
TAT TTA GAA CTA GAT CCT AGA GAT TTT AAG GTT ACA GAA GAA TGG CAA 1073 Tyr Leu Glu Leu Asp Pro Arg Asp Phe Lys Val Thr Glu Glu Trp Gln 315 320 325
AAA GAA AAT CTA AAA ATA CGC TCT AAA GCT CAA GCT AAA ATG CTT GAA 1121 Lys Glu Asn Leu Lys Ile Arg Ser Lys Ala Gln Ala Lys Met Leu Glu 330 335 340 345
ATG AGG GAT TTA AAA CCA GAC CCA CAA GCC CAC CTT CCA ACC TCT CAA 1169
Met Arg Asp Leu Lys Pro Asp Pro Gln Ala His Leu Pro Thr Ser Gln
350 355 360
AGC CTT TTG TTC GTT CAA AAA ATA TTT GCT GAT GTT AAT AAA GAA ATA 1217
Ser Leu Leu Phe Val Gln Lys Ile Phe Ala Asp Val Asn Lys Glu Ile
365 370 375
GAA GCA GTT GCT AAT ACT GAA AAG AAA GCA GAA AAA GCG GGT TAT GGT 1265 Glu Ala Val Ala Asn Thr Glu Lys Lys Ala Glu Lys Ala Gly Tyr Gly 380 385 390
TAT AGT AAA AGG ATG TAGGCATAAG AAAATAAGAA CACCATAAAA TCGTTTTTAG C 1321 Tyr Ser Lys Arg Met 395
TTCTAGGAGA CATCAGTCAG TTTCTTGCC 1350
(2) INFORMATION FOR SEQ ID NO: 148:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 398 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:
Met Glu Ser Val Lys Thr Gly Lys Thr Asn Lys Val Gly Lys Asn Thr
1 5 10 15
Glu Met Ala Asn Thr Lys Ala Asn Lys Glu Thr His Phe Lys Gln Val
20 25 30
Ser Ala Ile Thr Asn Ile Ile Arg Ser Val Gly Gly Phe Phe Thr Lys
35 40 45
Ile Ala Lys Arg Val Arg Gly Leu Val Lys Lys His Pro Lys Lys Ser
50 55 60
Ser Ala Ala Leu Val Val Leu Thr His Ile Ala Cys Lys Lys Ala Lys
65 70 75 80
Glu Leu Asp Asp Lys Val Gln Asp Lys Ser Lys Gln Ala Glu Lys Glu
85 90 95
Asn Gln Ile Asn Trp Trp Lys Tyr Ser Gly Leu Thr Ile Ala Ala Ser
100 105 110
Leu Leu Leu Ala Ala Cys Ser Ala Gly Asp Thr Asp Lys Gln Ile Glu
115 120 125
Leu Glu Gln Glu Lys Lys Glu Ala Glu Asn Ala Arg Asp Arg Ala Asn
130 135 140
Lys Ser Gly Ile Glu Leu Glu Gln Glu Arg Gln Lys Thr Asn Lys Ser
145 150 155 160
Gly Ile Glu Leu Ala Asn Ser Gln Ile Lys Ala Glu Gln Glu Arg Gln
165 170 175
Lys Thr Glu Gln Glu Lys Gln Lys Ala Asn Lys Ser Ala Ile Glu Leu
180 185 190
Glu Gln Gln Lys Gln Lys Thr Ile Asn Thr Gln Arg Asp Leu Ile Lys
195 200 205
Glu Gln Lys Asp Phe Ile Lys Glu Thr Glu Gln Asn Cys Gln Glu Asn
210 215 220
His Asn Gln Phe Phe Ile Lys Lys Leu Gly Ile Lys Gly Gly Ile Ala
225 230 235 240
Ile Glu Val Glu Ala Glu Cys Lys Thr Pro Lys Pro Ala Lys Thr Asn
245 250 255
Gln Thr Pro Ile Gln Pro Lys His Leu Pro Asn Ser Lys Gln Pro His
260 265 270
Ser Gln Arg Gly Ser Lys Ala Gln Glu Phe Ile Ala Tyr Leu Gln Lys
275 280 285
Glu Leu Glu Phe Leu Pro Tyr Ser Gln Lys Ala Ile Ala Lys Gln Val
290 295 300
Asn Phe Tyr Lys Pro Ser Ser Ile Ala Tyr Leu Glu Leu Asp Pro Arg
305 310 315 320
Asp Phe Lys Val Thr Glu Glu Trp Gln Lys Glu Asn Leu Lys Ile Arg
325 330 335
Ser Lys Ala Gln Ala Lys Met Leu Glu Met Arg Asp Leu Lys Pro Asp
340 345 350
Pro Gln Ala His Leu Pro Thr Ser Gln Ser Leu Leu Phe Val Gln Lys
355 360 365
Ile Phe Ala Asp Val Asn Lys Glu Ile Glu Ala Val Ala Asn Thr Glu
370 375 380
Lys Lys Ala Glu Lys Ala Gly Tyr Gly Tyr Ser Lys Arg Met
385 390 395
(2) INFORMATION FOR SEQ ID NO: 149:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 709 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 336...443 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
TAAGGGATAT TGCTAACGAT TAAGCTGTAT TGGAAGAGTT TATTTTGCAA GAATTAATCT 60
TGCCTTGTGT GATTAGTAAC ACAAGGCAAG TGTGATAAAC CCTACTACAA TTTCAATTCA 120
AGGAGCCTAA CTAAAATAAA ATGAACAATT TCAGTTAGGG CTTTATTATA GCAAAAATTA 180
TCTAAGATTA CAAAGGGTAG CGTTTCTGTT TTTGGATTTA GAGCGTTATT TTGATTGTTT 240
TGAGTTTAAT TTACTTTTTG TTTAATAATA AATCTTAACT ATCATAAATG TACAATTAAA 300
GTATTTAAAA AAATTTTAAA ACAAAAGGAT ATAAA ATG AAA ACC ATT AGA AAT 353
Met Lys Thr Ile Arg Asn 1 5
AGC GTG TTT ATT GGA GCG TCT TTA CTC GGC GGT TGC GCT AGC GTT GAG 401 Ser Val Phe Ile Gly Ala Ser Leu Leu Gly Gly Cys Ala Ser Val Glu 10 15 20
GCT TAT TTT GAC GCT TTG CAT GTT GCT CGC GTT AAA GAC GCT TGTTTATAG 452 Ala Tyr Phe Asp Ala Leu His Val Ala Arg Val Lys Asp Ala 25 30 35
AAAAAGAAGC ACACCACACG CCCAAAGACT TTGATAGCCC TTACCACACT GACTAAACCG 512
GCACTAGGTT TTAGTTGGGG GTTTTTAGGG GTGTTATTTT AGATACTCTC TGTTCCCTTA 572
AAGAAAATAA ATTTCTACCA TAAAATAAAA TCTTAAATTA AGGCGACTAA AACCCCACTT 632
TTAAAAAATT AAAAAGCGTT AAGTAAGACT TATCCAAAAA GCAAAGAAAA TCAATTTTTC 692
CAACCACTTT TTTTAAG 709
(2) INFORMATION FOR SEQ ID NO: 150:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:
Met Lys Thr Ile Arg Asn Ser Val Phe Ile Gly Ala Ser Leu Leu Gly
1 5 10 15
Gly Cys Ala Ser Val Glu Ala Tyr Phe Asp Ala Leu His Val Ala Arg
20 25 30
Val Lys Asp Ala 35
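As an illustrative cross-check (not part of the original listing), the short coding sequence of SEQ ID NO: 149 (positions 336...443) can be translated and compared with the 36 residues of SEQ ID NO: 150. The sketch below assumes the standard genetic code, which the listing does not state explicitly; the names CODON_TABLE, THREE_LETTER, CDS_149 and translate are hypothetical helpers introduced only for this example.

```python
# Standard genetic code (an assumption; the listing does not name a table),
# with codons enumerated in T, C, A, G order for each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Codons copied from the coding region of SEQ ID NO: 149 (336...443).
CDS_149 = (
    "ATG AAA ACC ATT AGA AAT "
    "AGC GTG TTT ATT GGA GCG TCT TTA CTC GGC GGT TGC GCT AGC GTT GAG "
    "GCT TAT TTT GAC GCT TTG CAT GTT GCT CGC GTT AAA GAC GCT"
).replace(" ", "")


def translate(cds):
    """Translate a coding sequence into one-letter amino acid codes."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


protein = translate(CDS_149)
print(len(protein))
print(" ".join(THREE_LETTER[aa] for aa in protein))
```

The printed three-letter string can be compared residue by residue with SEQ ID NO: 150; in particular, every ATT codon should appear as Ile.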
(2) INFORMATION FOR SEQ ID NO: 151:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 888 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 19...837 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:
AGATAGGAAT GTAAAGGA ATG GAA TTT ATG AAA AAG TTT GTA GCT TTA GGG 51
Met Glu Phe Met Lys Lys Phe Val Ala Leu Gly 1 5 10
CTT CTA TCC GCA GTT TTA AGC TCT TCG TTG TTA GCC GAA GGT GAT GGT 99 Leu Leu Ser Ala Val Leu Ser Ser Ser Leu Leu Ala Glu Gly Asp Gly 15 20 25
GTT TAT ATA GGG ACT AAT TAT CAG CTT GGA CAA GCC CGT TTG AAT AGT 147 Val Tyr Ile Gly Thr Asn Tyr Gln Leu Gly Gln Ala Arg Leu Asn Ser 30 35 40
AAT ATT TAT AAT ACA GGG GAT TGC ACA GGG AGT GTT GTA GGT TGC CCC 195 Asn Ile Tyr Asn Thr Gly Asp Cys Thr Gly Ser Val Val Gly Cys Pro 45 50 55
CCA GGT CTT ACC GCT AAT AAG CAT AAT CCA GGA GGC ACC AAT ATC AAT 243 Pro Gly Leu Thr Ala Asn Lys His Asn Pro Gly Gly Thr Asn Ile Asn 60 65 70 75
TGG CAT GCT AAA TAC GCT AAT GGG GCT TTG AAT GGT CTT GGG TTG AAT 291 Trp His Ala Lys Tyr Ala Asn Gly Ala Leu Asn Gly Leu Gly Leu Asn 80 85 90
GTG GGT TAT AAG AAG TTC TTC CAG TTC AAG TCT TTT GAT ATG ACA AGC 339 Val Gly Tyr Lys Lys Phe Phe Gln Phe Lys Ser Phe Asp Met Thr Ser 95 100 105
AAG TGG TTT GGT TTT AGA GTG TAT GGG CTT TTT GAT TAT GGG CAT GCC 387 Lys Trp Phe Gly Phe Arg Val Tyr Gly Leu Phe Asp Tyr Gly His Ala 110 115 120
ACT TTA GGC AAG CAA GTT TAT GCA CCT AAT AAA ATC CAG TTG GAT ATG 435 Thr Leu Gly Lys Gln Val Tyr Ala Pro Asn Lys Ile Gln Leu Asp Met 125 130 135
GTC TCT TGG GGT GTG GGG AGC GAT TTG TTA GCT GAT ATT ATT GAT AAC 483 Val Ser Trp Gly Val Gly Ser Asp Leu Leu Ala Asp Ile Ile Asp Asn 140 145 150 155
GAT AAC GCT TCT TTT GGT ATT TTT GGT GGG GTC GCT ATC GGC GGT AAC 531 Asp Asn Ala Ser Phe Gly Ile Phe Gly Gly Val Ala Ile Gly Gly Asn 160 165 170
ACT TGG AAA AGC TCA GCG GCA AAC TAT TGG AAA GAG CAA ATC ATT GAA 579 Thr Trp Lys Ser Ser Ala Ala Asn Tyr Trp Lys Glu Gln Ile Ile Glu 175 180 185
GCT AAG GGT CCT GAT GTT TGT ACC CCT ACT TAT TGT AAC CCT AAC GCT 627 Ala Lys Gly Pro Asp Val Cys Thr Pro Thr Tyr Cys Asn Pro Asn Ala 190 195 200
CCT TAT AGC ACC AAA ACT TCA ACC GTC GCT TTT CAG GTA TGG TTG AAT 675 Pro Tyr Ser Thr Lys Thr Ser Thr Val Ala Phe Gln Val Trp Leu Asn 205 210 215
TTT GGG GTG AGA GCC AAT ATT TAC AAG CAT AAT GGC GTA GAG TTT GGC 723 Phe Gly Val Arg Ala Asn Ile Tyr Lys His Asn Gly Val Glu Phe Gly 220 225 230 235
GTG AGA GTG CCG CTA CTC ATC AAC AAG TTT TTG AGT GCG GGT CCT AAC 771 Val Arg Val Pro Leu Leu Ile Asn Lys Phe Leu Ser Ala Gly Pro Asn 240 245 250
GCT ACT AAT CTT TAT TAC CAT TTG AAA CGG GAT TAT TCG CTT TAT TTA 819 Ala Thr Asn Leu Tyr Tyr His Leu Lys Arg Asp Tyr Ser Leu Tyr Leu 255 260 265
GGG TAT AAC TAC ACT TTT TAAACCCTTT AAAAGGGTGT CTTTAAGCCC TTTTTAGT 875 Gly Tyr Asn Tyr Thr Phe 270
CCTTATAAAA AGG 888
(2) INFORMATION FOR SEQ ID NO: 152:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:
Met Glu Phe Met Lys Lys Phe Val Ala Leu Gly Leu Leu Ser Ala Val
1 5 10 15
Leu Ser Ser Ser Leu Leu Ala Glu Gly Asp Gly Val Tyr Ile Gly Thr
20 25 30
Asn Tyr Gln Leu Gly Gln Ala Arg Leu Asn Ser Asn Ile Tyr Asn Thr
35 40 45
Gly Asp Cys Thr Gly Ser Val Val Gly Cys Pro Pro Gly Leu Thr Ala
50 55 60
Asn Lys His Asn Pro Gly Gly Thr Asn Ile Asn Trp His Ala Lys Tyr
65 70 75 80
Ala Asn Gly Ala Leu Asn Gly Leu Gly Leu Asn Val Gly Tyr Lys Lys
85 90 95
Phe Phe Gln Phe Lys Ser Phe Asp Met Thr Ser Lys Trp Phe Gly Phe
100 105 110
Arg Val Tyr Gly Leu Phe Asp Tyr Gly His Ala Thr Leu Gly Lys Gln
115 120 125
Val Tyr Ala Pro Asn Lys Ile Gln Leu Asp Met Val Ser Trp Gly Val
130 135 140
Gly Ser Asp Leu Leu Ala Asp Ile Ile Asp Asn Asp Asn Ala Ser Phe
145 150 155 160
Gly Ile Phe Gly Gly Val Ala Ile Gly Gly Asn Thr Trp Lys Ser Ser
165 170 175
Ala Ala Asn Tyr Trp Lys Glu Gln Ile Ile Glu Ala Lys Gly Pro Asp
180 185 190
Val Cys Thr Pro Thr Tyr Cys Asn Pro Asn Ala Pro Tyr Ser Thr Lys
195 200 205
Thr Ser Thr Val Ala Phe Gln Val Trp Leu Asn Phe Gly Val Arg Ala
210 215 220
Asn Ile Tyr Lys His Asn Gly Val Glu Phe Gly Val Arg Val Pro Leu
225 230 235 240
Leu Ile Asn Lys Phe Leu Ser Ala Gly Pro Asn Ala Thr Asn Leu Tyr
245 250 255
Tyr His Leu Lys Arg Asp Tyr Ser Leu Tyr Leu Gly Tyr Asn Tyr Thr
260 265 270
Phe
(2) INFORMATION FOR SEQ ID NO: 153:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 310 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 10...279 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:
AAAAGGAGA GTG GCG GTG AAA AAA ATC GTT GTG AGT TGG TGT GTG GCG TTG 51 Val Ala Val Lys Lys Ile Val Val Ser Trp Cys Val Ala Leu 1 5 10
GCT TTT TTA AGC GCG GAT TCA GCA CAA GCC AAT AAA GCG ATC AGT AAT 99 Ala Phe Leu Ser Ala Asp Ser Ala Gln Ala Asn Lys Ala Ile Ser Asn 15 20 25 30
GCG GAT TTG ATT AAA GAG ATA AGG GAT TTA AAA AAA ATC ATC AGC GCG 147 Ala Asp Leu Ile Lys Glu Ile Arg Asp Leu Lys Lys Ile Ile Ser Ala 35 40 45
CAA AAC ACT GAG ATT AAC AAC TTA AGA AAA GTG CAA GAA GTG TTG TCT 195 Gln Asn Thr Glu Ile Asn Asn Leu Arg Lys Val Gln Glu Val Leu Ser 50 55 60
GGG CAA TTA GGG GAC ATG CGT AAG GAT ATA TTA AGC ACT AGA GAT TAT 243 Gly Gln Leu Gly Asp Met Arg Lys Asp Ile Leu Ser Thr Arg Asp Tyr
65 70 75
TGC ATT AGC TTA AGG CCT TAT ATC TAT AAT TGG CGC TAGGGGATAA TCCAAA 295 Cys Ile Ser Leu Arg Pro Tyr Ile Tyr Asn Trp Arg 80 85 90
AAATGAAAGC ATGCG 310
(2) INFORMATION FOR SEQ ID NO: 154:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:
Val Ala Val Lys Lys Ile Val Val Ser Trp Cys Val Ala Leu Ala Phe
1 5 10 15
Leu Ser Ala Asp Ser Ala Gln Ala Asn Lys Ala Ile Ser Asn Ala Asp
20 25 30
Leu Ile Lys Glu Ile Arg Asp Leu Lys Lys Ile Ile Ser Ala Gln Asn
35 40 45
Thr Glu Ile Asn Asn Leu Arg Lys Val Gln Glu Val Leu Ser Gly Gln
50 55 60
Leu Gly Asp Met Arg Lys Asp Ile Leu Ser Thr Arg Asp Tyr Cys Ile
65 70 75 80
Ser Leu Arg Pro Tyr Ile Tyr Asn Trp Arg
85 90
(2) INFORMATION FOR SEQ ID NO: 155:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 549 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 16...474 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:
TGTTAAGATC AGTTT ATG GAA CAA AAT ATT TTC TCC TTA CTC ATT CAA AAA 51 Met Glu Gln Asn Ile Phe Ser Leu Leu Ile Gln Lys 1 5 10
AAG TCT TAT AAA AAG CTT GAA ACC CTT TTG AAA CTC AAA AAG CTT AAG 99 Lys Ser Tyr Lys Lys Leu Glu Thr Leu Leu Lys Leu Lys Lys Leu Lys 15 20 25
GTT TTT ATG CCT TTA AGT TTA CAA GAA AAT TTG CTT TTT ATC TTC ATA 147 Val Phe Met Pro Leu Ser Leu Gln Glu Asn Leu Leu Phe Ile Phe Ile 30 35 40
AAA GAC TCT AAA TTG CTT TTT GCG TTT AAA GAC ATT TGG GCT TCT AAA 195 Lys Asp Ser Lys Leu Leu Phe Ala Phe Lys Asp Ile Trp Ala Ser Lys 45 50 55 60
GAA TTT AAC CAA CGA TTC GCT AAA GAA ATC AGC CAT TTT TTA AAC ACG 243 Glu Phe Asn Gln Arg Phe Ala Lys Glu Ile Ser His Phe Leu Asn Thr 65 70 75
CAA GGG CAT GCT TAT GGG TTT GAC GGG TTG AAT GGG TTA GAA ATT TTA 291 Gln Gly His Ala Tyr Gly Phe Asp Gly Leu Asn Gly Leu Glu Ile Leu 80 85 90
GGT TAT GTG CCT AAA GAC GCG CTA AAA AAA TCC AAT TTT TAT GCC CCC 339 Gly Tyr Val Pro Lys Asp Ala Leu Lys Lys Ser Asn Phe Tyr Ala Pro 95 100 105
ATT AAA AAA CAA GCC CGT TTT TTT CGC CCT AGT GCT TTA GGG TTG TTC 387 Ile Lys Lys Gln Ala Arg Phe Phe Arg Pro Ser Ala Leu Gly Leu Phe 110 115 120
CAT AAC CCC ATT AAA GAC GCT CGT TTG CAT GAA TGT TTT GAA AAA GCG 435
His Asn Pro Ile Lys Asp Ala Arg Leu His Glu Cys Phe Glu Lys Ala 125 130 135 140
CGC GCT TTG ATC CAC TAC CAA CGA AGT TTT TTT GAG GAA TGAATGGCTG AT 486 Arg Ala Leu Ile His Tyr Gln Arg Ser Phe Phe Glu Glu 145 150
TTATTGTCCA GTTTAAAAAA CCTTCCTAAC AGCAGTGGCG TGTATCAATA TTTTGATAAA 546 AAC 549
(2) INFORMATION FOR SEQ ID NO: 156:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 153 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:
Met Glu Gln Asn Ile Phe Ser Leu Leu Ile Gln Lys Lys Ser Tyr Lys
1 5 10 15
Lys Leu Glu Thr Leu Leu Lys Leu Lys Lys Leu Lys Val Phe Met Pro
20 25 30
Leu Ser Leu Gln Glu Asn Leu Leu Phe Ile Phe Ile Lys Asp Ser Lys
35 40 45
Leu Leu Phe Ala Phe Lys Asp Ile Trp Ala Ser Lys Glu Phe Asn Gln
50 55 60
Arg Phe Ala Lys Glu Ile Ser His Phe Leu Asn Thr Gln Gly His Ala
65 70 75 80
Tyr Gly Phe Asp Gly Leu Asn Gly Leu Glu Ile Leu Gly Tyr Val Pro
85 90 95
Lys Asp Ala Leu Lys Lys Ser Asn Phe Tyr Ala Pro Ile Lys Lys Gln
100 105 110
Ala Arg Phe Phe Arg Pro Ser Ala Leu Gly Leu Phe His Asn Pro Ile
115 120 125
Lys Asp Ala Arg Leu His Glu Cys Phe Glu Lys Ala Arg Ala Leu Ile
130 135 140
His Tyr Gln Arg Ser Phe Phe Glu Glu
145 150
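As a hedged consistency check (not part of the original listing), the LOCATION ranges of the genomic DNA entries above can be compared with the stated lengths of the corresponding protein entries. The sketch assumes that each LOCATION range is inclusive at both ends and excludes the stop codon, which is what the listed sequences suggest; PAIRS and codon_count are hypothetical names introduced for this example.

```python
# (DNA SEQ ID, CDS start, CDS end, protein SEQ ID, stated length in amino acids),
# copied from the LOCATION and LENGTH fields of SEQ ID NOs: 141 to 156.
PAIRS = [
    (141, 82, 270, 142, 63),
    (143, 115, 921, 144, 269),
    (145, 88, 603, 146, 172),
    (147, 87, 1280, 148, 398),
    (149, 336, 443, 150, 36),
    (151, 19, 837, 152, 273),
    (153, 10, 279, 154, 90),
    (155, 16, 474, 156, 153),
]


def codon_count(start, end):
    """Number of codons in an inclusive, 1-based coding range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    return length // 3


for dna_id, start, end, protein_id, stated in PAIRS:
    computed = codon_count(start, end)
    status = "consistent" if computed == stated else "MISMATCH"
    print(f"SEQ ID NO: {dna_id} ({start}...{end}) -> {computed} codons; "
          f"SEQ ID NO: {protein_id} states {stated} amino acids: {status}")
```

A mismatch here would point to a transcription error in either the LOCATION field or the LENGTH field rather than to a biological discrepancy.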
(2) INFORMATION FOR SEQ ID NO: 157:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2627 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA (ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 18...2582 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:
AAAGACATGT GCAACCG ATG AAA TCT AAA AAA CTT TAT TTG GCT TTA ATC 50
Met Lys Ser Lys Lys Leu Tyr Leu Ala Leu Ile 1 5 10
ATA GGG GTT TTA TTA GCG TTT TTA ACC CTA TCT TCA TGG CTG GGT AAT 98 Ile Gly Val Leu Leu Ala Phe Leu Thr Leu Ser Ser Trp Leu Gly Asn 15 20 25
AGC GGT TTA GTG GGG CGT TTT GGG GTG TGG TTT GCC GCA CTC AAT AAA 146 Ser Gly Leu Val Gly Arg Phe Gly Val Trp Phe Ala Ala Leu Asn Lys 30 35 40
AAA TAT TTT GGG CAT CTT TCA TTC ATT AAT TTA CCC TAT TTA GCA TGG 194 Lys Tyr Phe Gly His Leu Ser Phe Ile Asn Leu Pro Tyr Leu Ala Trp
45 50 55
GTT TTA TTC CTT TTA TAC AAG ACT AAA AAC CCT TTT ACA GAA ATC GTT 242 Val Leu Phe Leu Leu Tyr Lys Thr Lys Asn Pro Phe Thr Glu Ile Val 60 65 70 75
TTA GAA AAA ACT TTA GGG CAT CTA TTA GGC ATT TTA TCT TTG CTC TTT 290 Leu Glu Lys Thr Leu Gly His Leu Leu Gly Ile Leu Ser Leu Leu Phe 80 85 90
TTA CAA TCT AGC CTA TTA AAT CAA GGG GAA ATC GGC AAC AGC GCG CGT 338 Leu Gln Ser Ser Leu Leu Asn Gln Gly Glu Ile Gly Asn Ser Ala Arg 95 100 105
TTG TTT TTA CGC CCT TTT ATA GGG GAT TTT GGG CTT TAT GCG CTG ATA 386 Leu Phe Leu Arg Pro Phe Ile Gly Asp Phe Gly Leu Tyr Ala Leu Ile 110 115 120
ACG CTT ATG GTA GTT ATT TCT TAT TTG ATT CTA TTC AAA CTA CCC CCT 434 Thr Leu Met Val Val Ile Ser Tyr Leu Ile Leu Phe Lys Leu Pro Pro 125 130 135
AAA AGC GTT TTT TAT CCT TAT ATG AAC AAA ACA CAA AAC CTT TTA AAA 482 Lys Ser Val Phe Tyr Pro Tyr Met Asn Lys Thr Gln Asn Leu Leu Lys 140 145 150 155
GAG ATT TAC AAA CAA TGC TTA CAA GCC TTT AGC CCT AAT TTT AGC CCA 530 Glu Ile Tyr Lys Gln Cys Leu Gln Ala Phe Ser Pro Asn Phe Ser Pro 160 165 170
AAA AAA GAG GGT TTT GAA AAC ACC CCA TCA GAT ATT CAA AAA AAA GAA 578 Lys Lys Glu Gly Phe Glu Asn Thr Pro Ser Asp Ile Gln Lys Lys Glu 175 180 185
ACC AAA AAC GAC AAA GAA AAA GAA AAC CGC AAA GAA AAC CCT ATT AAT 626 Thr Lys Asn Asp Lys Glu Lys Glu Asn Arg Lys Glu Asn Pro Ile Asn 190 195 200
GAA AAC CAC AAA ACC CCT AAC GAA GAA CCG TTT TTA GCG ATC CCT ACC 674 Glu Asn His Lys Thr Pro Asn Glu Glu Pro Phe Leu Ala Ile Pro Thr 205 210 215
CCC TAT AAC ACG ACT TTA AAT GAT TCA GAG CCG CAA GAA GGC TTA GTC 722 Pro Tyr Asn Thr Thr Leu Asn Asp Ser Glu Pro Gln Glu Gly Leu Val 220 225 230 235
CAA ATT TCC TCC CAC CCC CCT ACC CAT TAC ACC ATT TAC CCT AAA AGA 770 Gln Ile Ser Ser His Pro Pro Thr His Tyr Thr Ile Tyr Pro Lys Arg 240 245 250
AAC CGA TTT GAT GAT TTG ACT AAC CCC ACT AAC CCC CCT TTA AAA GAA 818 Asn Arg Phe Asp Asp Leu Thr Asn Pro Thr Asn Pro Pro Leu Lys Glu 255 260 265
ATT AAA CAA GAA ACT AAA GAA AGA GAA CCC ACG CCT ACA AAA GAA ACT 866 Ile Lys Gln Glu Thr Lys Glu Arg Glu Pro Thr Pro Thr Lys Glu Thr 270 275 280
CTT ACG CCC ACC ACG CCC AAA CCT ATC ATG CCC ACA CTT GCA CCC ATA 914 Leu Thr Pro Thr Thr Pro Lys Pro Ile Met Pro Thr Leu Ala Pro Ile 285 290 295
ATA GAA AAT GAC AAC AAA ACA GAA AAC CAA AAA ACC CCC AAC CAC CCT 962 Ile Glu Asn Asp Asn Lys Thr Glu Asn Gln Lys Thr Pro Asn His Pro 300 305 310 315
AAA AAA GAA GAA AAC CCA CAA GAA AAC ACG CAA GAA GAA ATG ATA GAA 1010 Lys Lys Glu Glu Asn Pro Gln Glu Asn Thr Gln Glu Glu Met Ile Glu 320 325 330
GGA AGG ATA GAA GAA ATG ATA AAG GAA AAT CTA AAA AAA GAA GAA AAA 1058 Gly Arg Ile Glu Glu Met Ile Lys Glu Asn Leu Lys Lys Glu Glu Lys 335 340 345
GAA GTG CAA AAC GCT CCA AAC TTT AGC CCA GTA ACC CCC ACA AGC GCT 1106 Glu Val Gln Asn Ala Pro Asn Phe Ser Pro Val Thr Pro Thr Ser Ala 350 355 360
AAA AAA CCC GTT ATG GTT AAA GAA TTG AGC GAA AAT AAA GAG ATA TTA 1154 Lys Lys Pro Val Met Val Lys Glu Leu Ser Glu Asn Lys Glu Ile Leu 365 370 375
GAC GGA TTG GAT TAT GGC GAA GTG CAA AAA CCC AAA GAT TAT GAG CTT 1202 Asp Gly Leu Asp Tyr Gly Glu Val Gln Lys Pro Lys Asp Tyr Glu Leu 380 385 390 395
CCC ACC ACG CAA TTA TTG AAT GCG GTT TGT TTG AAA GAC ACT TCT TTA 1250 Pro Thr Thr Gln Leu Leu Asn Ala Val Cys Leu Lys Asp Thr Ser Leu 400 405 410
GAC GAA AAC GAG ATT GAC CAA AAA ATC CAG GAT CTA TTG AGC AAA CTG 1298 Asp Glu Asn Glu Ile Asp Gln Lys Ile Gln Asp Leu Leu Ser Lys Leu 415 420 425
CGC ACC TTT AAA ATT GAT GGC GAT ATT ATC CGC ACT TAT TCA GGC CCT 1346 Arg Thr Phe Lys Ile Asp Gly Asp Ile Ile Arg Thr Tyr Ser Gly Pro 430 435 440
ATT GTA ACC ACT TTT GAA TTC CGC CCA GCC CCT AAC GTT AAG GTG AGT 1394 Ile Val Thr Thr Phe Glu Phe Arg Pro Ala Pro Asn Val Lys Val Ser 445 450 455
CGT ATT TTA GGC TTG AGC GAT GAT TTA GCG ATG ACT TTA TGC GCT GAA 1442 Arg Ile Leu Gly Leu Ser Asp Asp Leu Ala Met Thr Leu Cys Ala Glu 460 465 470 475
TCC ATC CGC ATT CAA GCC CCT ATT AAG GGT AAA GAT GTC GTT GGC ATT 1490 Ser Ile Arg Ile Gln Ala Pro Ile Lys Gly Lys Asp Val Val Gly Ile 480 485 490
GAA ATC CCT AAC AGC CAA AGC CAA ATT ATT TAT TTA AGA GAA ATT CTA 1538 Glu Ile Pro Asn Ser Gln Ser Gln Ile Ile Tyr Leu Arg Glu Ile Leu 495 500 505
GAG AGC GAA TTG TTT CAA AAA TCC AGC TCG CCC TTA ACT CTA GCT TTA 1586 Glu Ser Glu Leu Phe Gln Lys Ser Ser Ser Pro Leu Thr Leu Ala Leu 510 515 520
GGC AAA GAC ATT GTG GGT AAC CCT TTC ATC ACG GAT TTA AAA AAG CTC 1634 Gly Lys Asp Ile Val Gly Asn Pro Phe Ile Thr Asp Leu Lys Lys Leu 525 530 535
CCC CAT TTG CTC ATC GCT GGC ACG ACA GGA AGC GGT AAG AGC GTG GGC 1682 Pro His Leu Leu Ile Ala Gly Thr Thr Gly Ser Gly Lys Ser Val Gly 540 545 550 555
GTG AAT GCG ATG ATT TTA TCC TTA CTT TAT AAA AAC CCT CCC GAT CAA 1730 Val Asn Ala Met Ile Leu Ser Leu Leu Tyr Lys Asn Pro Pro Asp Gln 560 565 570
CTC AAA TTA GTG ATG ATC GAT CCC AAA ATG GTA GAA TTT AGT ATT TAT 1778 Leu Lys Leu Val Met Ile Asp Pro Lys Met Val Glu Phe Ser Ile Tyr 575 580 585
GCG GAT ATC CCT CAT TTG CTC ACG CCC ATT ATC ACC GAC CCT AAA AAA 1826 Ala Asp Ile Pro His Leu Leu Thr Pro Ile Ile Thr Asp Pro Lys Lys 590 595 600
GCT ATT GGG GCT TTG CAA AGC GTG GCT AAA GAA ATG GAA CGC CGG TAT 1874 Ala Ile Gly Ala Leu Gln Ser Val Ala Lys Glu Met Glu Arg Arg Tyr 605 610 615
TCT TTA ATG AGC GAA TAC AAG GTT AAA ACC ATT GAT TCT TAT AAT GAA 1922 Ser Leu Met Ser Glu Tyr Lys Val Lys Thr Ile Asp Ser Tyr Asn Glu 620 625 630 635
CAA GCC CCA AGT AAC GGC GTT GAA GCG TTC CCC TAT TTG ATT GTG GTG 1970 Gln Ala Pro Ser Asn Gly Val Glu Ala Phe Pro Tyr Leu Ile Val Val 640 645 650
ATT GAT GAA TTA GCG GAT TTA ATG ATG ACA GGG GGC AAA GAA GCG GAG 2018 Ile Asp Glu Leu Ala Asp Leu Met Met Thr Gly Gly Lys Glu Ala Glu 655 660 665
TTT CCT ATC GCT AGA ATC GCT CAA ATG GGG CGC GCG AGC GGC TTA CAC 2066 Phe Pro Ile Ala Arg Ile Ala Gln Met Gly Arg Ala Ser Gly Leu His 670 675 680
CTC ATT GTA GCG ACC CAA CGC CCA AGC GTG GAT GTC GTA ACC GGC TTG 2114 Leu Ile Val Ala Thr Gln Arg Pro Ser Val Asp Val Val Thr Gly Leu
685 690 695
ATT AAA ACC AAC TTG CCT TCA AGG GTG AGT TTT AGG GTA GGC ACT AAG 2162 Ile Lys Thr Asn Leu Pro Ser Arg Val Ser Phe Arg Val Gly Thr Lys 700 705 710 715
ATT GAT TCT AAA GTG ATT TTA GAC ACT GAT GGG GCG CAA AGC TTG TTA 2210 Ile Asp Ser Lys Val Ile Leu Asp Thr Asp Gly Ala Gln Ser Leu Leu 720 725 730
GGA AGA GGC GAT ATG CTC TTT ACC CCC CCA GGA GCG AAC GGG TTA GTG 2258 Gly Arg Gly Asp Met Leu Phe Thr Pro Pro Gly Ala Asn Gly Leu Val 735 740 745
CGC TTG CAT GCC CCC TTT GCC ACT GAA GAT GAA ATC AAA AAA ATC GTG 2306 Arg Leu His Ala Pro Phe Ala Thr Glu Asp Glu He Lys Lys He Val 750 755 760
GAT TTT ATT AAA GCC CAA AAA GAA GTA CAA TAC GAT AAA GAT TTC TTG 2354 Asp Phe He Lys Ala Gin Lys Glu Val Gin Tyr Asp Lys Asp Phe Leu 765 770 775
CTA GAA GAA TCA CGC ATG CCT TTA GAC ACC CCT AAT TAT CAA GGC GAT 2402 Leu Glu Glu Ser Arg Met Pro Leu Asp Thr Pro Asn Tyr Gin Gly Asp 780 785 790 795
GAC ATT TTA GAA AGG GCT AAA GCG GTG ATT TTA GAA AAA AAG ATC ACT 2450 Asp He Leu Glu Arg Ala Lys Ala Val He Leu Glu Lys Lys He Thr 800 805 810
TCT ACG AGT TTT TTA CAA CGC CAA TTA AAA ATC GGC TAC AAC CAA GCC 2498 Ser Thr Ser Phe Leu Gin Arg Gin Leu Lys He Gly Tyr Asn Gin Ala 815 820 825
GCT ACC ATT ACT GAC GAA TTA GAA GCT CAA GGC TTT TTA TCC CCA AGA 2546 Ala Thr He Thr Asp Glu Leu Glu Ala Gin Gly Phe Leu Ser Pro Arg 830 835 840
AAC GCT AAA GGC AAC AGA GAG ATT TTG CAA AAC TTT TAGGCTTTGT TTTCAT 2598 Asn Ala Lys Gly Asn Arg Glu He Leu Gin Asn Phe 845 850 855
TGGATATTGG CAAACATTAT TTTTGATTT 2627
(2) INFORMATION FOR SEQ ID NO: 158:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 855 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:
Met Lys Ser Lys Lys Leu Tyr Leu Ala Leu Ile Ile Gly Val Leu Leu
1 5 10 15
Ala Phe Leu Thr Leu Ser Ser Trp Leu Gly Asn Ser Gly Leu Val Gly
20 25 30
Arg Phe Gly Val Trp Phe Ala Ala Leu Asn Lys Lys Tyr Phe Gly His
35 40 45
Leu Ser Phe He Asn Leu Pro Tyr Leu Ala Trp Val Leu Phe Leu Leu
50 55 60
Tyr Lys Thr Lys Asn Pro Phe Thr Glu He Val Leu Glu Lys Thr Leu 65 70 75 80
Gly His Leu Leu Gly He Leu Ser Leu Leu Phe Leu Gin Ser Ser Leu
85 90 95
Leu Asn Gin Gly Glu He Gly Asn Ser Ala Arg Leu Phe Leu Arg Pro
100 105 110
Phe He Gly Asp Phe Gly Leu Tyr Ala Leu He Thr Leu Met Val Val
115 120 125
He Ser Tyr Leu He Leu Phe Lys Leu Pro Pro Lys Ser Val Phe Tyr
130 135 140
Pro Tyr Met Asn Lys Thr Gin Asn Leu Leu Lys Glu He Tyr Lys Gin 145 150 155 160
Cys Leu Gin Ala Phe Ser Pro Asn Phe Ser Pro Lys Lys Glu Gly Phe
165 170 175
Glu Asn Thr Pro Ser Asp He Gin Lys Lys Glu Thr Lys Asn Asp Lys
180 185 190
Glu Lys Glu Asn Arg Lys Glu Asn Pro He Asn Glu Asn His Lys Thr
195 200 205
Pro Asn Glu Glu Pro Phe Leu Ala He Pro Thr Pro Tyr Asn Thr Thr
210 215 220
Leu Asn Asp Ser Glu Pro Gin Glu Gly Leu Val Gin He Ser Ser His 225 230 235 240
Pro Pro Thr His Tyr Thr He Tyr Pro Lys Arg Asn Arg Phe Asp Asp
245 250 255
Leu Thr Asn Pro Thr Asn Pro Pro Leu Lys Glu He Lys Gin Glu Thr
260 265 270
Lys Glu Arg Glu Pro Thr Pro Thr Lys Glu Thr Leu Thr Pro Thr Thr
275 280 285
Pro Lys Pro He Met Pro Thr Leu Ala Pro He He Glu Asn Asp Asn
290 295 300
Lys Thr Glu Asn Gin Lys Thr Pro Asn His Pro Lys Lys Glu Glu Asn 305 310 315 320
Pro Gin Glu Asn Thr Gin Glu Glu Met He Glu Gly Arg He Glu Glu 325 330 335
Met Ile Lys Glu Asn Leu Lys Lys Glu Glu Lys Glu Val Gln Asn Ala
340 345 350
Pro Asn Phe Ser Pro Val Thr Pro Thr Ser Ala Lys Lys Pro Val Met
355 360 365
Val Lys Glu Leu Ser Glu Asn Lys Glu He Leu Asp Gly Leu Asp Tyr
370 375 380
Gly Glu Val Gin Lys Pro Lys Asp Tyr Glu Leu Pro Thr Thr Gin Leu 385 390 395 400
Leu Asn Ala Val Cys Leu Lys Asp Thr Ser Leu Asp Glu Asn Glu He
405 410 415
Asp Gin Lys He Gin Asp Leu Leu Ser Lys Leu Arg Thr Phe Lys He
420 425 430
Asp Gly Asp He He Arg Thr Tyr Ser Gly Pro He Val Thr Thr Phe
435 440 445
Glu Phe Arg Pro Ala Pro Asn Val Lys Val Ser Arg He Leu Gly Leu
450 455 460
Ser Asp Asp Leu Ala Met Thr Leu Cys Ala Glu Ser He Arg He Gin 465 470 475 480
Ala Pro He Lys Gly Lys Asp Val Val Gly He Glu He Pro Asn Ser
485 490 495
Gin Ser Gin He He Tyr Leu Arg Glu He Leu Glu Ser Glu Leu Phe
500 505 510
Gin Lys Ser Ser Ser Pro Leu Thr Leu Ala Leu Gly Lys Asp He Val
515 520 525
Gly Asn Pro Phe He Thr Asp Leu Lys Lys Leu Pro His Leu Leu He
530 535 540
Ala Gly Thr Thr Gly Ser Gly Lys Ser Val Gly Val Asn Ala Met He 545 550 555 560
Leu Ser Leu Leu Tyr Lys Asn Pro Pro Asp Gin Leu Lys Leu Val Met
565 570 575
He Asp Pro Lys Met Val Glu Phe Ser He Tyr Ala Asp He Pro His
580 585 590
Leu Leu Thr Pro He He Thr Asp Pro Lys Lys Ala He Gly Ala Leu
595 600 605
Gin Ser Val Ala Lys Glu Met Glu Arg Arg Tyr Ser Leu Met Ser Glu
610 615 620
Tyr Lys Val Lys Thr He Asp Ser Tyr Asn Glu Gin Ala Pro Ser Asn 625 630 635 640
Gly Val Glu Ala Phe Pro Tyr Leu He Val Val He Asp Glu Leu Ala
645 650 655
Asp Leu Met Met Thr Gly Gly Lys Glu Ala Glu Phe Pro He Ala Arg
660 665 670
He Ala Gin Met Gly Arg Ala Ser Gly Leu His Leu He Val Ala Thr
675 680 685
Gin Arg Pro Ser Val Asp Val Val Thr Gly Leu He Lys Thr Asn Leu
690 695 700
Pro Ser Arg Val Ser Phe Arg Val Gly Thr Lys He Asp Ser Lys Val 705 710 715 720
He Leu Asp Thr Asp Gly Ala Gin Ser Leu Leu Gly Arg Gly Asp Met
725 730 735
Leu Phe Thr Pro Pro Gly Ala Asn Gly Leu Val Arg Leu His Ala Pro
740 745 750
Phe Ala Thr Glu Asp Glu He Lys Lys He Val Asp Phe He Lys Ala
755 760 765
Gin Lys Glu Val Gin Tyr Asp Lys Asp Phe Leu Leu Glu Glu Ser Arg
770 775 780
Met Pro Leu Asp Thr Pro Asn Tyr Gin Gly Asp Asp He Leu Glu Arg 785 790 795 800
Ala Lys Ala Val He Leu Glu Lys Lys He Thr Ser Thr Ser Phe Leu
805 810 815
Gin Arg Gin Leu Lys He Gly Tyr Asn Gin Ala Ala Thr He Thr Asp
820 825 830
Glu Leu Glu Ala Gin Gly Phe Leu Ser Pro Arg Asn Ala Lys Gly Asn
835 840 845
Arg Glu He Leu Gin Asn Phe 850 855
(2) INFORMATION FOR SEQ ID NO: 159:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1986 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 56...1945
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:
GGTGTCCTTA AACAGCAGGG TGAAAGAGAT TTTAAAAGAA AGCGCTCTGC ATTCT ATG 58
Met 1
CAA GAT AGT TTG CAT TTT AAG GTT AAT GAA GTG CAA GGG GTT TTA GAA 106 Gin Asp Ser Leu His Phe Lys Val Asn Glu Val Gin Gly Val Leu Glu 5 10 15
AAC ACT TAT ACG AGC ATG GGC ATT GTT AAA GAA ATG CTC CCT AAA GAC 154 Asn Thr Tyr Thr Ser Met Gly He Val Lys Glu Met Leu Pro Lys Asp 20 25 30
ACC AAA AGA GAA ATC AAA ATC GGC TTG TTA AAA AAC TTC ATT TTA GCC 202 Thr Lys Arg Glu He Lys He Gly Leu Leu Lys Asn Phe He Leu Ala 35 40 45
AAT TCG CAT GTC GCT GGG GTG AGC ATG TTT TTT AAA GGC AGA GAA GAT 250 Asn Ser His Val Ala Gly Val Ser Met Phe Phe Lys Gly Arg Glu Asp 50 55 60 65
TTA AGA TTA ACG CTT TTA AGG GAT AAC AAT ACG ATT AAG CTA GTG GAA 298 Leu Arg Leu Thr Leu Leu Arg Asp Asn Asn Thr He Lys Leu Val Glu 70 75 80
AAT CCG TCA TTA GA AAT AGC CCT TTA GCG CAA AAA GCG ATG AAA AAT 346
Asn Pro Ser Leu Glu Asn Ser Pro Leu Ala Gln Lys Ala Met Lys Asn 85 90 95
AAA GAA ATT TCT AAA AGT TTG GGT TAT TAT AGG AAA ATG CCT AAT GGG 394
Lys Glu He Ser Lys Ser Leu Gly Tyr Tyr Arg Lys Met Pro Asn Gly
100 105 110
GCG GAA GTT TAT GGG GTG GAT ATT CTT TTA CCT TTA TTG AAT GAG AAC 442
Ala Glu Val Tyr Gly Val Asp He Leu Leu Pro Leu Leu Asn Glu Asn 115 120 125
GCT CAA GAG GTT GTA GGG GCT TTG ATG ATT TTT ATT TCC ATT GAC AGC 490
Ala Gin Glu Val Val Gly Ala Leu Met He Phe He Ser He Asp Ser 130 135 140 145
TTC AGC AAT GAA ATC ACT AAA AAC AGG AGC GAT TTA TTT TTA ATT GGC 538
Phe Ser Asn Glu He Thr Lys Asn Arg Ser Asp Leu Phe Leu He Gly 150 155 160
ACT AAA GGT AAA GTG CTT TTG AGC GCG AAT AAG AGT TTG CAA GAC AAA 586
Thr Lys Gly Lys Val Leu Leu Ser Ala Asn Lys Ser Leu Gin Asp Lys 165 170 175
CCT ATC GCA GAA ATT TAT AAG AGC GTG CCT AAA GCC ACC AAC GAA GTG 634
Pro He Ala Glu He Tyr Lys Ser Val Pro Lys Ala Thr Asn Glu Val
180 185 190
ATG GCT ATT TTA GAA AAC GGC TCT AAA GCG ACT TTA GAA TAC TTA GAT 682
Met Ala He Leu Glu Asn Gly Ser Lys Ala Thr Leu Glu Tyr Leu Asp 195 200 205
CCC TTT AGC CAT AAG GAA AAT TTT TTA GCC GTT GAA ACC TTT AAA ATG 730
Pro Phe Ser His Lys Glu Asn Phe Leu Ala Val Glu Thr Phe Lys Met 210 215 220 225
CTA GGC AAA ACA GAA AGT AAA GAC AAT CTT AAT TGG ATG ATC GCT TTA 778
Leu Gly Lys Thr Glu Ser Lys Asp Asn Leu Asn Trp Met He Ala Leu 230 235 240
ATC ATT GAA AAA GAC AAG GTC TAT GAG CAA GTA GGC TCG GTG CGT TTT 826
He He Glu Lys Asp Lys Val Tyr Glu Gin Val Gly Ser Val Arg Phe 245 250 255
GTG GTG ATC ATA GCG AGC GCA ATC ATG GTG TTA GCC TTG ATT ATA GCG 874
Val Val He He Ala Ser Ala He Met Val Leu Ala Leu He He Ala
260 265 270
ATC ACT CTC TTA ATG CGA GCG ATC GTG AGC AGT CGT TTG GAA GCC GTT 922
He Thr Leu Leu Met Arg Ala He Val Ser Ser Arg Leu Glu Ala Val 275 280 285
TCT AGC ACC TTG TCT CAT TTC TTT AAA TTA TTG AAC AAT CAA GCC AAT 970
Ser Ser Thr Leu Ser His Phe Phe Lys Leu Leu Asn Asn Gin Ala Asn 290 295 300 305
TCT AGC GGT ATT AAA TTG ATT GAA GCG AAA TCC AAT GAC GAG TTA GGC 1018
Ser Ser Gly Ile Lys Leu Ile Glu Ala Lys Ser Asn Asp Glu Leu Gly 310 315 320
CGC ATG CAA ACA GCG ATC AAT AAA AAT ATC TTG CAA ACC CAA AAA ATC 1066
Arg Met Gin Thr Ala He Asn Lys Asn He Leu Gin Thr Gin Lys He
325 330 335
ATG CAA GAA GAC AGG CAA GCC GTC CAA GAC ACC ATT AAA GTG GTT TCA 1114
Met Gin Glu Asp Arg Gin Ala Val Gin Asp Thr He Lys Val Val Ser 340 345 350
GAT GTG AAA GCA GGG AAT TTT GCG GTG CGC ATC ACG GCT GAG CCC GCA 1162
Asp Val Lys Ala Gly Asn Phe Ala Val Arg He Thr Ala Glu Pro Ala 355 360 365
AGC CCT GAT TTG AAA GAA TTG AGG GAC GCG CTA AAT GGG ATC ATG GAT 1210
Ser Pro Asp Leu Lys Glu Leu Arg Asp Ala Leu Asn Gly He Met Asp
370 375 380 385
TAT TTG CAA GAA AGC GTA GGG ACT CAC ATG CCA AGC ATT TTC AAA ATC 1258
Tyr Leu Gin Glu Ser Val Gly Thr His Met Pro Ser He Phe Lys He 390 395 400
TTT GAA AGC TAT TCT GGT TTG GAT TTT AGA GGC CGG ATC CAA AAC GCT 1306
Phe Glu Ser Tyr Ser Gly Leu Asp Phe Arg Gly Arg He Gin Asn Ala
405 410 415
TCG GGT AGG GTG GAA CTG GTT ACT AAC GCT TTA GGG CAA GAA ATC CAA 1354
Ser Gly Arg Val Glu Leu Val Thr Asn Ala Leu Gly Gin Glu He Gin 420 425 430
AAA ATG CTA GAA ACT TCG TCT AAT TTT GCC AAA GAT TTA GCG AAC GAT 1402
Lys Met Leu Glu Thr Ser Ser Asn Phe Ala Lys Asp Leu Ala Asn Asp 435 440 445
AGC GCG AAT TTA AAA GAG TGC GTG CAA AAT TTA GAA AAA GCT TCA AAC 1450
Ser Ala Asn Leu Lys Glu Cys Val Gin Asn Leu Glu Lys Ala Ser Asn
450 455 460 465
TCC CAA CAC AAA AGC TTG ATG GAA ACT TCC AAA ACG ATA GAA AAT ATC 1498
Ser Gin His Lys Ser Leu Met Glu Thr Ser Lys Thr He Glu Asn He 470 475 480
ACC ACT TCC ATT CAA GGC GTG AGC TCT CAA AGT GAA GCC ATG ATT GAA 1546
Thr Thr Ser He Gin Gly Val Ser Ser Gin Ser Glu Ala Met He Glu
485 490 495
CAA GGG CAA GAC ATT AAA AGC ATT GTA GAA ATC ATT AGA GAT ATT GCT 1594
Gin Gly Gin Asp He Lys Ser He Val Glu He He Arg Asp He Ala 500 505 510
GAT CAA ACC AAT CTT TTA GCC TTA AAC GCC GCT ATT GAA GCC GCA AGG 1642
Asp Gin Thr Asn Leu Leu Ala Leu Asn Ala Ala He Glu Ala Ala Arg 515 520 525
GCC GGC GAG CAT GGC AGA GGC TTT GCG GTG GTG GCT GAT GAG GTA AGA 1690 Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg 530 535 540 545
AAG CTC GCT GAA AGG ACG CAA AAA TCG CTC AGC GAG ATT GAA GCC AAT 1738 Lys Leu Ala Glu Arg Thr Gin Lys Ser Leu Ser Glu He Glu Ala Asn 550 555 560
ATC AAT ATT TTA GTG CAA AGC ATT TCA GAC ACG AGC GAA AGC ATT AAA 1786 He Asn He Leu Val Gin Ser He Ser Asp Thr Ser Glu Ser He Lys 565 570 575
AAC CAG GTT AAA GAA GTG GAA GAA ATC AAC GCT TCT ATT GAA GCC TTA 1834 Asn Gin Val Lys Glu Val Glu Glu He Asn Ala Ser He Glu Ala Leu 580 585 590
AGA TCG GTT ACT GAG GGC AAT CTA AAA ATC GCT AGC GAT TCT TTA GAA 1882 Arg Ser Val Thr Glu Gly Asn Leu Lys He Ala Ser Asp Ser Leu Glu 595 600 605
ATC AGT CAA GAA ATT GAC AAA GTT TCT AAC GAT ATT TTA GAA GAT GTG 1930 He Ser Gin Glu He Asp Lys Val Ser Asn Asp He Leu Glu Asp Val 610 615 620 625
AAT AAA AAG CAG TTT TAATGCTCAT TCATATTTGC TGCTCAGTGG ATAACCTCTA T 1986 Asn Lys Lys Gin Phe 630
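The pairing of each codon with the residue printed beneath it can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code applies to the coding sequences in this listing; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative and do not come from the specification. Codons containing an undetermined base (such as N) are reported as Xaa, matching the convention used in the listing.

```python
# Standard genetic code with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue codes; "X" covers undetermined codons.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "X": "Xaa",
}


def translate(cds: str) -> str:
    """Translate a coding sequence into space-separated three-letter codes.

    Translation stops at the first stop codon; trailing bases that do not
    form a complete codon are ignored.
    """
    residues = []
    usable = len(cds) - len(cds) % 3
    for pos in range(0, usable, 3):
        aa = CODON_TABLE.get(cds[pos:pos + 3].upper(), "X")
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)


# First eight codons of the coding sequence of SEQ ID NO: 159.
print(translate("ATGCAAGATAGTTTGCATTTTAAG"))
```

Applied to the first eight codons of SEQ ID NO: 159, the sketch yields Met Gln Asp Ser Leu His Phe Lys, which agrees with the residues printed beneath those codons.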
(2) INFORMATION FOR SEQ ID NO: 160:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 630 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:
Met Gin Asp Ser Leu His Phe Lys Val Asn Glu Val Gin Gly Val Leu
1 5 10 15
Glu Asn Thr Tyr Thr Ser Met Gly He Val Lys Glu Met Leu Pro Lys
20 25 30
Asp Thr Lys Arg Glu He Lys He Gly Leu Leu Lys Asn Phe He Leu
35 40 45
Ala Asn Ser His Val Ala Gly Val Ser Met Phe Phe Lys Gly Arg Glu
50 55 60
Asp Leu Arg Leu Thr Leu Leu Arg Asp Asn Asn Thr He Lys Leu Val 65 70 75 80
Glu Asn Pro Ser Leu Glu Asn Ser Pro Leu Ala Gin Lys Ala Met Lys
85 90 95
Asn Lys Glu He Ser Lys Ser Leu Gly Tyr Tyr Arg Lys Met Pro Asn
100 105 110
Gly Ala Glu Val Tyr Gly Val Asp He Leu Leu Pro Leu Leu Asn Glu
115 120 125
Asn Ala Gin Glu Val Val Gly Ala Leu Met He Phe He Ser He Asp
130 135 140
Ser Phe Ser Asn Glu He Thr Lys Asn Arg Ser Asp Leu Phe Leu He 145 150 155 160
Gly Thr Lys Gly Lys Val Leu Leu Ser Ala Asn Lys Ser Leu Gin Asp
165 170 175
Lys Pro He Ala Glu He Tyr Lys Ser Val Pro Lys Ala Thr Asn Glu
180 185 190
Val Met Ala He Leu Glu Asn Gly Ser Lys Ala Thr Leu Glu Tyr Leu
195 200 205
Asp Pro Phe Ser His Lys Glu Asn Phe Leu Ala Val Glu Thr Phe Lys
210 215 220
Met Leu Gly Lys Thr Glu Ser Lys Asp Asn Leu Asn Trp Met He Ala 225 230 235 240
Leu He He Glu Lys Asp Lys Val Tyr Glu Gin Val Gly Ser Val Arg
245 250 255
Phe Val Val He He Ala Ser Ala He Met Val Leu Ala Leu He He
260 265 270
Ala He Thr Leu Leu Met Arg Ala He Val Ser Ser Arg Leu Glu Ala
275 280 285
Val Ser Ser Thr Leu Ser His Phe Phe Lys Leu Leu Asn Asn Gin Ala
290 295 300
Asn Ser Ser Gly He Lys Leu He Glu Ala Lys Ser Asn Asp Glu Leu 305 310 315 320
Gly Arg Met Gin Thr Ala He Asn Lys Asn He Leu Gin Thr Gin Lys
325 330 335
He Met Gin Glu Asp Arg Gin Ala Val Gin Asp Thr He Lys Val Val
340 345 350
Ser Asp Val Lys Ala Gly Asn Phe Ala Val Arg He Thr Ala Glu Pro
355 360 365
Ala Ser Pro Asp Leu Lys Glu Leu Arg Asp Ala Leu Asn Gly He Met
370 375 380
Asp Tyr Leu Gin Glu Ser Val Gly Thr His Met Pro Ser He Phe Lys 385 390 395 400
He Phe Glu Ser Tyr Ser Gly Leu Asp Phe Arg Gly Arg He Gin Asn
405 410 415
Ala Ser Gly Arg Val Glu Leu Val Thr Asn Ala Leu Gly Gin Glu He
420 425 430
Gin Lys Met Leu Glu Thr Ser Ser Asn Phe Ala Lys Asp Leu Ala Asn
435 440 445
Asp Ser Ala Asn Leu Lys Glu Cys Val Gin Asn Leu Glu Lys Ala Ser
450 455 460
Asn Ser Gin His Lys Ser Leu Met Glu Thr Ser Lys Thr He Glu Asn 465 470 475 480
He Thr Thr Ser He Gin Gly Val Ser Ser Gin Ser Glu Ala Met He
485 490 495
Glu Gin Gly Gin Asp He Lys Ser He Val Glu He He Arg Asp He
500 505 510
Ala Asp Gin Thr Asn Leu Leu Ala Leu Asn Ala Ala He Glu Ala Ala
515 520 525
Arg Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val 530 535 540
Arg Lys Leu Ala Glu Arg Thr Gin Lys Ser Leu Ser Glu He Glu Ala 545 550 555 560
Asn He Asn He Leu Val Gin Ser He Ser Asp Thr Ser Glu Ser He
565 570 575
Lys Asn Gin Val Lys Glu Val Glu Glu He Asn Ala Ser He Glu Ala
580 585 590
Leu Arg Ser Val Thr Glu Gly Asn Leu Lys He Ala Ser Asp Ser Leu
595 600 605
Glu He Ser Gin Glu He Asp Lys Val Ser Asn Asp He Leu Glu Asp
610 615 620
Val Asn Lys Lys Gin Phe 625 630
(2) INFORMATION FOR SEQ ID NO: 161:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1758 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 8...1702
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:
GAGATAA ATG ATG TTT TCT TCA ATG TTT GCT TCG TTG GGG ACT CGT ATC 49 Met Met Phe Ser Ser Met Phe Ala Ser Leu Gly Thr Arg He 1 5 10
ATG CTG GTC GTG TTA GCC GCT CTT TTA GGT TTA GGG GGG CTT TTT ATT 97 Met Leu Val Val Leu Ala Ala Leu Leu Gly Leu Gly Gly Leu Phe He 15 20 25 30
GGT TTT GTA AAG GTT ATG CAA AAA GAT GTG TTA GCG CAA CTC ATG GAG 145 Gly Phe Val Lys Val Met Gin Lys Asp Val Leu Ala Gin Leu Met Glu 35 40 45
CAT TTA GAA ACC GGG CAA TAC AAA AAG CGT GAA AAA ACG CTC GCT TAC 193 His Leu Glu Thr Gly Gin Tyr Lys Lys Arg Glu Lys Thr Leu Ala Tyr 50 55 60
ATG ACA AAA ATT ATT GAA CAG GGC ATT CAT GAG TAT TAC AAA AAT TTT 241 Met Thr Lys He He Glu Gin Gly He His Glu Tyr Tyr Lys Asn Phe
65 70 75
GAC AAT GCT ACT GCA AGA AAA ATG GCG TTA GAT TAT TTC AAA CGC ATC 289 Asp Asn Ala Thr Ala Arg Lys Met Ala Leu Asp Tyr Phe Lys Arg He 80 85 90
AAC GAC GAT AAG GGC ATG ATT TAT ATG GTG GTG GTG GAT AAA AAC GGG 337
Asn Asp Asp Lys Gly Met He Tyr Met Val Val Val Asp Lys Asn Gly 95 100 105 110
GTG GTA TTG TTT GAT CCG GTC AAT CCT AAA ACC GTA GNC CAA TCA GGG 385
Val Val Leu Phe Asp Pro Val Asn Pro Lys Thr Val Xaa Gin Ser Gly
115 120 125
CTT GAC GCT CAG AGC GTT GAT GGG GTG TAT TAT GTT AGG GGG TAT TTG 433
Leu Asp Ala Gin Ser Val Asp Gly Val Tyr Tyr Val Arg Gly Tyr Leu 130 135 140
GAG GCG GCC AAA AAA GGG GGA GGC TAC ACT TAT TAT AAA ATG CCT AAA 481
Glu Ala Ala Lys Lys Gly Gly Gly Tyr Thr Tyr Tyr Lys Met Pro Lys
145 150 155
TAC GAT GGA GGC GTA CCG GAG AAA AAA TTC GCC TAC TCG CAT TAT GAT 529
Tyr Asp Gly Gly Val Pro Glu Lys Lys Phe Ala Tyr Ser His Tyr Asp 160 165 170
GAA GTT TCT CAA ATG GTG ATC GCA ACG ACT TCC TAT TAC ACT GAC ATT 577
Glu Val Ser Gin Met Val He Ala Thr Thr Ser Tyr Tyr Thr Asp He 175 180 185 190
AAC ACA GAA AAT AAA GCG ATC AAA GAA GGC GTG AAT AAG GTT TTT GAT 625
Asn Thr Glu Asn Lys Ala He Lys Glu Gly Val Asn Lys Val Phe Asp
195 200 205
GAA AAC ACC ACG AAA TTA TTC CTT TGG ATA CTG ACA GCG ACG ATA GCG 673
Glu Asn Thr Thr Lys Leu Phe Leu Trp He Leu Thr Ala Thr He Ala 210 215 220
CTA GTG GTT TTG ACG CTC ATA TAC GCT AAA TTA AGG ATC GTG AAA CGC 721
Leu Val Val Leu Thr Leu He Tyr Ala Lys Leu Arg He Val Lys Arg
225 230 235
ATT GAT GAA CTG GTC CTT AAA ATC AAC GCT TTT AGC CGT GGG GAT AAG 769
He Asp Glu Leu Val Leu Lys He Asn Ala Phe Ser Arg Gly Asp Lys 240 245 250
GAT TTG AGA GCC AAA ATT GAT GTG GGT GAT CGC AAC GAT GAA ATC TCG 817
Asp Leu Arg Ala Lys He Asp Val Gly Asp Arg Asn Asp Glu He Ser 255 260 265 270
CAA GTG GGC CGT GGG ATC AAT TTG TTT GTG GAA AAC GCC CGC TTG ATT 865
Gin Val Gly Arg Gly He Asn Leu Phe Val Glu Asn Ala Arg Leu He
275 280 285
ATG GAA GAG ATT AAA GGG ATT TCC ACC CTC AAT AAA ACT TCA ATG GAT 913
Met Glu Glu He Lys Gly He Ser Thr Leu Asn Lys Thr Ser Met Asp 290 295 300
AAA TTA GTC CAA ATC ACG CAA GAA ACC CAA AAG AGC ATG AAA GAT TCC 961
Lys Leu Val Gin He Thr Gin Glu Thr Gin Lys Ser Met Lys Asp Ser
305 310 315
TCA ACC ACC CTA AAT luu GTG AAA AAT AAA GCC ACT GAT ATA GCG AGC 1009 Ser Thr Thr Leu Asn Ser Val Lys Asn Lys Ala Thr Asp He Ala Ser 320 325 330
ATG ATG AAT GCT TCC ATA GAG CAA TCT CAA GGG TTA AGG AAG CGT TTG 1057 Met Met Asn Ala Ser He Glu Gin Ser Gin Gly Leu Arg Lys Arg Leu 335 340 345 350
ATT GAA ACG CAA GGG CTG GTC AAA GAG AGC AAG GAT GCG ATC GGG GAT 1105 He Glu Thr Gin Gly Leu Val Lys Glu Ser Lys Asp Ala He Gly Asp 355 360 365
TTA TTT TCT CAA ATC ACA GAG AGC GCG CAC ACT GAA GAG GAA CTC TCT 1153 Leu Phe Ser Gin He Thr Glu Ser Ala His Thr Glu Glu Glu Leu Ser 370 375 380
AGC AAA GTG GAG CAG CTA AGC CGT AAC GCT GAT GAT GTC AAA TCC ATT 1201 Ser Lys Val Glu Gin Leu Ser Arg Asn Ala Asp Asp Val Lys Ser He 385 390 395
CTG GAT ATT ATC AAT GAT ATT GCC GAT CAA ACG AAT TTA TTA GCC CTA 1249 Leu Asp He He Asn Asp He Ala Asp Gin Thr Asn Leu Leu Ala Leu 400 405 410
AAC GCT GCT ATT GAA GCC GCA AGG GCT GGC GAG CAT GGC AGA GGC TTT 1297 Asn Ala Ala He Glu Ala Ala Arg Ala Gly Glu His Gly Arg Gly Phe 415 420 425 430
GCG GTG GTG GCT GAT GAA GTT AGG AAT TTA GCC GGG CGC ACT CAA AAG 1345 Ala Val Val Ala Asp Glu Val Arg Asn Leu Ala Gly Arg Thr Gin Lys 435 440 445
TCT TTA GCC GAA ATC AAT TCC ACT ATC ATG GTG ATT GTC CAA GAA ATC 1393 Ser Leu Ala Glu He Asn Ser Thr He Met Val He Val Gin Glu He 450 455 460
AAT GCC GTG AGT TCG CAA ATG AAT CTC AAT TCG CAA AAA ATG GAG CGT 1441 Asn Ala Val Ser Ser Gin Met Asn Leu Asn Ser Gin Lys Met Glu Arg 465 470 475
TTG AGC GAT ATG AGT AAA AGC GTG CAA GAA ACT TAC GAA AAA ATG AGT 1489 Leu Ser Asp Met Ser Lys Ser Val Gin Glu Thr Tyr Glu Lys Met Ser 480 485 490
TCT AAT TTA AGC TCA GTC GTG TCA GAC AGC AAT CAA AGC ATG GAC GAT 1537 Ser Asn Leu Ser Ser Val Val Ser Asp Ser Asn Gin Ser Met Asp Asp 495 500 505 510
TAC GCC AAA TCC GGA CAC CAA ATT GAA GTT ATG GTA AGC GAT TTT GCA 1585 Tyr Ala Lys Ser Gly His Gin He Glu Val Met Val Ser Asp Phe Ala 515 520 525
GAG GTG GAA AAA GTG GCT TCT AAG ACT TTA GCG GAT TCT TCA GAT ATT 1633 Glu Val Glu Lys Val Ala Ser Lys Thr Leu Ala Asp Ser Ser Asp He 530 535 540
TTA AAC ATC GCT ACG CAT GTG AGT GGA ACG ACC ATG AAT TTA GAC AAA 1681 Leu Asn Ile Ala Thr His Val Ser Gly Thr Thr Met Asn Leu Asp Lys 545 550 555
CAA GTG AAT TTG TTT AAA ACT TAATCAGGGG GAGTTTATTA AAAAAGGGTT GGAT 1736 Gin Val Asn Leu Phe Lys Thr 560 565
TGTTAAAAGT TTCTGTGATC AC 1758
(2) INFORMATION FOR SEQ ID NO: 162:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 565 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:
Met Met Phe Ser Ser Met Phe Ala Ser Leu Gly Thr Arg He Met Leu
1 5 10 15
Val Val Leu Ala Ala Leu Leu Gly Leu Gly Gly Leu Phe He Gly Phe
20 25 30
Val Lys Val Met Gin Lys Asp Val Leu Ala Gin Leu Met Glu His Leu
35 40 45
Glu Thr Gly Gin Tyr Lys Lys Arg Glu Lys Thr Leu Ala Tyr Met Thr
50 55 60
Lys He He Glu Gin Gly He His Glu Tyr Tyr Lys Asn Phe Asp Asn 65 70 75 80
Ala Thr Ala Arg Lys Met Ala Leu Asp Tyr Phe Lys Arg He Asn Asp
85 90 95
Asp Lys Gly Met He Tyr Met Val Val Val Asp Lys Asn Gly Val Val
100 105 110
Leu Phe Asp Pro Val Asn Pro Lys Thr Val Xaa Gin Ser Gly Leu Asp
115 120 125
Ala Gin Ser Val Asp Gly Val Tyr Tyr Val Arg Gly Tyr Leu Glu Ala
130 135 140
Ala Lys Lys Gly Gly Gly Tyr Thr Tyr Tyr Lys Met Pro Lys Tyr Asp 145 150 155 160
Gly Gly Val Pro Glu Lys Lys Phe Ala Tyr Ser His Tyr Asp Glu Val
165 170 175
Ser Gin Met Val He Ala Thr Thr Ser Tyr Tyr Thr Asp He Asn Thr
180 185 190
Glu Asn Lys Ala He Lys Glu Gly Val Asn Lys Val Phe Asp Glu Asn
195 200 205
Thr Thr Lys Leu Phe Leu Trp He Leu Thr Ala Thr He Ala Leu Val
210 215 220
Val Leu Thr Leu He Tyr Ala Lys Leu Arg He Val Lys Arg He Asp 225 230 235 240
Glu Leu Val Leu Lys He Asn Ala Phe Ser Arg Gly Asp Lys Asp Leu
245 250 255
Arg Ala Lys He Asp Val Gly Asp Arg Asn Asp Glu He Ser Gin Val
260 265 270
Gly Arg Gly He Asn Leu Phe Val Glu Asn Ala Arg Leu He Met Glu
275 280 285
Glu He Lys Gly He Ser Thr Leu Asn Lys Thr Ser Met Asp Lys Leu
290 295 300
Val Gin He Thr Gin Glu Thr Gin Lys Ser Met Lys Asp Ser Ser Thr 305 310 315 320
Thr Leu Asn Ser Val Lys Asn Lys Ala Thr Asp He Ala Ser Met Met
325 330 335
Asn Ala Ser He Glu Gin Ser Gin Gly Leu Arg Lys Arg Leu He Glu
340 345 350
Thr Gin Gly Leu Val Lys Glu Ser Lys Asp Ala He Gly Asp Leu Phe
355 360 365
Ser Gin He Thr Glu Ser Ala His Thr Glu Glu Glu Leu Ser Ser Lys
370 375 380
Val Glu Gin Leu Ser Arg Asn Ala Asp Asp Val Lys Ser He Leu Asp 385 390 395 400
He He Asn Asp He Ala Asp Gin Thr Asn Leu Leu Ala Leu Asn Ala
405 410 415
Ala He Glu Ala Ala Arg Ala Gly Glu His Gly Arg Gly Phe Ala Val
420 425 430
Val Ala Asp Glu Val Arg Asn Leu Ala Gly Arg Thr Gin Lys Ser Leu
435 440 445
Ala Glu He Asn Ser Thr He Met Val He Val Gin Glu He Asn Ala
450 455 460
Val Ser Ser Gin Met Asn Leu Asn Ser Gin Lys Met Glu Arg Leu Ser 465 470 475 480
Asp Met Ser Lys Ser Val Gin Glu Thr Tyr Glu Lys Met Ser Ser Asn
485 490 495
Leu Ser Ser Val Val Ser Asp Ser Asn Gin Ser Met Asp Asp Tyr Ala
500 505 510
Lys Ser Gly His Gin He Glu Val Met Val Ser Asp Phe Ala Glu Val
515 520 525
Glu Lys Val Ala Ser Lys Thr Leu Ala Asp Ser Ser Asp He Leu Asn
530 535 540
He Ala Thr His Val Ser Gly Thr Thr Met Asn Leu Asp Lys Gin Val 545 550 555 560
Asn Leu Phe Lys Thr 565
(2) INFORMATION FOR SEQ ID NO: 163:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 686 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 16...660
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:
TATAAGGTTG CTCTC ATG AAA AAA CCC TAT AGG AAG ATT TCT GAT TAT GCG 51 Met Lys Lys Pro Tyr Arg Lys He Ser Asp Tyr Ala 1 5 10
ATC GTG GGT GGT TTG AGC GCG TTA GTG ATG GTG AGC ATT GTG GGG TGT 99 He Val Gly Gly Leu Ser Ala Leu Val Met Val Ser He Val Gly Cys 15 20 25
AAG AGC AAT GCT GAT GAC AAA CCA AAA GAG CAA AGC TCT TTA AGT CAA 147 Lys Ser Asn Ala Asp Asp Lys Pro Lys Glu Gin Ser Ser Leu Ser Gin 30 35 40
AGC GTT CAA AAA GGC GCG TTT GTG ATT TTA GAA GAG CAA AAG GAT AAA 195 Ser Val Gin Lys Gly Ala Phe Val He Leu Glu Glu Gin Lys Asp Lys 45 50 55 60
TCT TAC AAG GTT GTT GAA GAA TAC CCC AGC TCA AGA ACC CAC ATT ATA 243 Ser Tyr Lys Val Val Glu Glu Tyr Pro Ser Ser Arg Thr His He He 65 70 75
GTG CGC GAT TTG CAA GGC AAT GAA CGC GTG TTA AGC AAT GAA GAG ATT 291 Val Arg Asp Leu Gin Gly Asn Glu Arg Val Leu Ser Asn Glu Glu He 80 85 90
CAA AAG CTC ATC AAA GAA GAA GAA GCT AAA ATT GAT AAC GGC ACG AGC 339 Gin Lys Leu He Lys Glu Glu Glu Ala Lys He Asp Asn Gly Thr Ser 95 100 105
AAG CTT GTC CAG CCT AAT AAT GGA GGG AGT AAT GAA GGC TCA GGC TTT 387 Lys Leu Val Gin Pro Asn Asn Gly Gly Ser Asn Glu Gly Ser Gly Phe 110 115 120
GGC TTG GGG AGC GCG ATT TTA GGG AGC GCG GCG GGG GCG ATT TTA GGG 435
Gly Leu Gly Ser Ala He Leu Gly Ser Ala Ala Gly Ala He Leu Gly 125 130 135 140
AGT TAT ATT GGT AAT AAG CTT TTC AAT AAC CCT AAT TAC CAG CAA AAC 483
Ser Tyr He Gly Asn Lys Leu Phe Asn Asn Pro Asn Tyr Gin Gin Asn
145 150 155
GCC CAA CGG ACC TAC AAA TCC CCA CAA GCT TAC CAA CGC TCT CAA AAT 531 Ala Gin Arg Thr Tyr Lys Ser Pro Gin Ala Tyr Gin Arg Ser Gin Asn 160 165 170
TCC TTT TCT AAA AGT GCG CCC AGT GCT TCA AGC ATG GGC GGA GCG AGT 579 Ser Phe Ser Lys Ser Ala Pro Ser Ala Ser Ser Met Gly Gly Ala Ser 175 180 185
AAG GGA CAG AGC GGG TTT TTT GGC TCT AGT AGG CCT ACT AGT TCA CCG 627 Lys Gly Gin Ser Gly Phe Phe Gly Ser Ser Arg Pro Thr Ser Ser Pro 190 195 200
GCG GTA AGC TCT GGu ACA AGG GGC TTT AAC TCA TAATTTAATT GATTCAAGGC 680 Ala Val Ser Ser Gly Thr Arg Gly Phe Asn Ser 205 210 215
TAAAAA 686
(2) INFORMATION FOR SEQ ID NO: 164:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 215 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala Ile Val Gly Gly
1 5 10 15
Leu Ser Ala Leu Val Met Val Ser Ile Val Gly Cys Lys Ser Asn Ala
20 25 30
Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln Ser Val Gln Lys
35 40 45
Gly Ala Phe Val Ile Leu Glu Glu Gln Lys Asp Lys Ser Tyr Lys Val
50 55 60
Val Glu Glu Tyr Pro Ser Ser Arg Thr His Ile Ile Val Arg Asp Leu 65 70 75 80
Gln Gly Asn Glu Arg Val Leu Ser Asn Glu Glu Ile Gln Lys Leu Ile
85 90 95
Lys Glu Glu Glu Ala Lys Ile Asp Asn Gly Thr Ser Lys Leu Val Gln
100 105 110
Pro Asn Asn Gly Gly Ser Asn Glu Gly Ser Gly Phe Gly Leu Gly Ser
115 120 125
Ala Ile Leu Gly Ser Ala Ala Gly Ala Ile Leu Gly Ser Tyr Ile Gly
130 135 140
Asn Lys Leu Phe Asn Asn Pro Asn Tyr Gln Gln Asn Ala Gln Arg Thr 145 150 155 160
Tyr Lys Ser Pro Gln Ala Tyr Gln Arg Ser Gln Asn Ser Phe Ser Lys
165 170 175
Ser Ala Pro Ser Ala Ser Ser Met Gly Gly Ala Ser Lys Gly Gln Ser
180 185 190
Gly Phe Phe Gly Ser Ser Arg Pro Thr Ser Ser Pro Ala Val Ser Ser
195 200 205
Gly Thr Arg Gly Phe Asn Ser 210 215
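The LOCATION fields of the coding sequences can be cross-checked against the LENGTH fields of the corresponding protein entries. The sketch below assumes that each LOCATION range is inclusive and excludes the stop codon, which is consistent with the entries shown here; the function name `cds_residue_count` and the `LISTING` dictionary are illustrative only.

```python
def cds_residue_count(start: int, end: int) -> int:
    """Number of residues encoded by an inclusive, stop-free CDS range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    return length // 3


# (CDS start, CDS end, stated protein length), copied from the listing:
# SEQ ID NO: 159/160, 161/162 and 163/164.
LISTING = {
    159: (56, 1945, 630),
    161: (8, 1702, 565),
    163: (16, 660, 215),
}

for seq_id, (start, end, stated) in LISTING.items():
    computed = cds_residue_count(start, end)
    status = "consistent" if computed == stated else "MISMATCH"
    print(f"SEQ ID NO: {seq_id}: {computed} residues computed, {stated} stated ({status})")
```

For the three pairs above the computed and stated lengths agree. The same calculation applied to the range 16...8694 given for SEQ ID NO: 165 gives 2893 residues.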
(2) INFORMATION FOR SEQ ID NO: 165:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8748 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 16...8694
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:
AGAGGGTAGC ATTTA ATG AAA AAG TTT AAA AAG AAA CCA AAA AGT ATC AAA 51 Met Lys Lys Phe Lys Lys Lys Pro Lys Ser He Lys 1 5 10
CGA TCG CAT CAA AAT CAA AAA ACA ATC TTA AAG CGT CCT TTA TGG CTT 99 Arg Ser His Gin Asn Gin Lys Thr He Leu Lys Arg Pro Leu Trp Leu 15 20 25
ATG CCT TTA CTC ATC AGC GGG TTT GCT AGT GGG GTG TAT GCG AAT AAT 147 Met Pro Leu Leu He Ser Gly Phe Ala Ser Gly Val Tyr Ala Asn Asn 30 35 40
CTG TGG GAT TTG TTA AAC CCA AAA GTG GGG GGT GAG TAT GTG CAT TGG 195 Leu Trp Asp Leu Leu Asn Pro Lys Val Gly Gly Glu Tyr Val His Trp 45 50 55 60
GTT AAG GGC AGT CAG TAT TGT GCA TGG TGG GAA TTT GCT GGG TGT TTA 243 Val Lys Gly Ser Gin Tyr Cys Ala Trp Trp Glu Phe Ala Gly Cys Leu 65 70 75
AAG AAT GTA TGG GGG GCA AAT CAT AAA GGC TAT GAT GCT GGA AAC GCC 291 Lys Asn Val Trp Gly Ala Asn His Lys Gly Tyr Asp Ala Gly Asn Ala 80 85 90
GCT AAC TAT TTG TCT TCT CAA AAC TAT CAA GCT ATT TCG GTG GGT AGT 339 Ala Asn Tyr Leu Ser Ser Gin Asn Tyr Gin Ala He Ser Val Gly Ser 95 100 105
GGG AAT GAA ACG GGG ACT TAT AGT TTA AGC GGT TTT ACC AAT TAT GTT 387 Gly Asn Glu Thr Gly Thr Tyr Ser Leu Ser Gly Phe Thr Asn Tyr Val 110 115 120
GGG GGC AAT CTC ACG ATC AAT CTA GGC AAT AGC GTT GTT TTA GAT TTA 435 Gly Gly Asn Leu Thr He Asn Leu Gly Asn Ser Val Val Leu Asp Leu 125 130 135 140
AGC GGT TCT AAT AGT TTC ACT TCG TAT CAA GGT TAT AAT CAA GGC AAA 483 Ser Gly Ser Asn Ser Phe Thr Ser Tyr Gin Gly Tyr Asn Gin Gly Lys 145 150 155
GAT GAT GTA ACA TTT ACG GTT GGC GCA ATC AAT TTA AAC GGC ACT TTA 531 Asp Asp Val Thr Phe Thr Val Gly Ala He Asn Leu Asn Gly Thr Leu 160 165 170
GAA GTG GGT AAT CGT GTG GGA TCG GGA GCT GGC ACG CAC ACC GGC ACA 579 Glu Val Gly Asn Arg Val Gly Ser Gly Ala Gly Thr His Thr Gly Thr 175 180 185
GCC ACT TTA AAC TTG AAC GCT AAT AAG GTC AAT ATC AAT TCC AAT ATC 627 Ala Thr Leu Asn Leu Asn Ala Asn Lys Val Asn He Asn Ser Asn He 190 195 200
AAC GCG TAT AAA ACT TCG CAA GTG AAT ATA GGC AAC GCT AAC AGC GTT 675 Asn Ala Tyr Lys Thr Ser Gin Val Asn He Gly Asn Ala Asn Ser Val 205 210 215 220
ATT ACC ATT GGT TCG GTT TCT TTG AGT GGG GAT GTT TGC AGT TCT TTA 723 He Thr He Gly Ser Val Ser Leu Ser Gly Asp Val Cys Ser Ser Leu 225 230 235
GCT AGC GTT GGG ATA GGG GCT AAT TGC TCC ACT TCT GGG CCT AGC TAT 771 Ala Ser Val Gly He Gly Ala Asn Cys Ser Thr Ser Gly Pro Ser Tyr 240 245 250
TCT TTT AAA GGG ACG ACT AAC GCT ACT AAC ACG GCG TTT AGT AAT GCA 819 Ser Phe Lys Gly Thr Thr Asn Ala Thr Asn Thr Ala Phe Ser Asn Ala 255 260 265
AGC GGC AGT TTC ACT TTT GAA GAG AAC GCC ACT TTT AGC GGG GCG AAA 867 Ser Gly Ser Phe Thr Phe Glu Glu Asn Ala Thr Phe Ser Gly Ala Lys 270 275 280
TGG AAT GGG GGG ACT TAT ACC TTT AAT AAA GAG TTT AGC GCT ACC AAT 915 Trp Asn Gly Gly Thr Tyr Thr Phe Asn Lys Glu Phe Ser Ala Thr Asn 285 290 295 300
AAC ACC GCC TTT AGT AGC GGT AGT TTT AAT TTT AAA GGT GTA AGC TCT 963 Asn Thr Ala Phe Ser Ser Gly Ser Phe Asn Phe Lys Gly Val Ser Ser 305 310 315
TTT AAT GGT ACT TCG TTT AGT AAC GCT TCT TAT ACT TTT GAC AAT CAA 1011 Phe Asn Gly Thr Ser Phe Ser Asn Ala Ser Tyr Thr Phe Asp Asn Gin 320 325 330
GCC ACT TTC CAA AAC AGC TCC TTT AAT GGG GGG ACT TTT ACT TTT AAT 1059 Ala Thr Phe Gin Asn Ser Ser Phe Asn Gly Gly Thr Phe Thr Phe Asn 335 340 345
AAC CAA ACT AAT CCA ACT AAC AAC GCT CAG CAC CCC CAA ATT CAA AAC 1107 Asn Gin Thr Asn Pro Thr Asn Asn Ala Gin His Pro Gin He Gin Asn 350 355 360
AGC TCT TTT AGT GGT AAC GCT ACC ACT CTT AAG GGC TTT GTG AAT TTC 1155 Ser Ser Phe Ser Gly Asn Ala Thr Thr Leu Lys Gly Phe Val Asn Phe 365 370 375 380
CAG CAA GCC TTT AAC AAT TCA AAC CAC CAA CTA ACG ATC CAA AAC GCT 1203 Gin Gin Ala Phe Asn Asn Ser Asn His Gin Leu Thr He Gin Asn Ala 385 390 395
TCC TTT AAT AAC GCC ACT TTT AAC AAT ACC GGT AAA ATC ACT ATA GAA 1251
Ser Phe Asn Asn Ala Thr Phe Asn Asn Thr Gly Lys Ile Thr Ile Glu 400 405 410
AAA GAT GCG AGT TTT AAT AAC ACG ACA TTC AAC ACT TCT GTT GAT ACA 1299
Lys Asp Ala Ser Phe Asn Asn Thr Thr Phe Asn Thr Ser Val Asp Thr 415 420 425
AAC AAC ATG AGT GTT ACC GGT GGC GTT ACT TTA AGC GGT AAA AAT GAC 1347
Asn Asn Met Ser Val Thr Gly Gly Val Thr Leu Ser Gly Lys Asn Asp 430 435 440
TTG AAA AAT GGC TCA ACC CTT GAT TTT GGG AGT TCT AAA ATC ACT CTC 1395
Leu Lys Asn Gly Ser Thr Leu Asp Phe Gly Ser Ser Lys He Thr Leu
445 450 455 460
GCT CAA GGG ACG ACT TTC AAC CTC ACA AGT TTA GGC AGT GAG AAG AGC 1443
Ala Gin Gly Thr Thr Phe Asn Leu Thr Ser Leu Gly Ser Glu Lys Ser 465 470 475
GTA ACG ATT TTA AAT TCT AGC GGT GGG ATC ACT TAT AGT AAC CTT TTA 1491
Val Thr He Leu Asn Ser Ser Gly Gly He Thr Tyr Ser Asn Leu Leu 480 485 490
AAC CAT GCA ATC AAC GGC TTG ACA AGT GCC TTA AAA ACG AAC GAA AGC 1539
Asn His Ala He Asn Gly Leu Thr Ser Ala Leu Lys Thr Asn Glu Ser 495 500 505
CTT TCA AAT CCG CAA AGT TTC GCT CAA GGT TTG TGG GAT ATA ATC ACT 1587
Leu Ser Asn Pro Gin Ser Phe Ala Gin Gly Leu Trp Asp He He Thr 510 515 520
TAC AAT GGG GTT ACC GGG CAG CTT TTG AAT GAA AAC GCT GCA ACA TCT 1635
Tyr Asn Gly Val Thr Gly Gin Leu Leu Asn Glu Asn Ala Ala Thr Ser
525 530 535 540
AAA CCC ACT GAC TCT TCG CCC TCT AAA TCC TCT ACA AAC TCT ACG CAA 1683
Lys Pro Thr Asp Ser Ser Pro Ser Lys Ser Ser Thr Asn Ser Thr Gin 545 550 555
GTC TAT CAA GTG GGT TAC AAA ATA GGG GAT ACT ATC TAC AAA CTG CAA 1731
Val Tyr Gin Val Gly Tyr Lys He Gly Asp Thr He Tyr Lys Leu Gin 560 565 570
GAA ACT TTC AGC CAC AAT TCC ATT ATT ATT CAG GCT TTA GAG AGC GGG 1779
Glu Thr Phe Ser His Asn Ser He He He Gin Ala Leu Glu Ser Gly 575 580 585
ACT TAC ACG CCA CCC CCT GTC ATT AAC GGC TCC AAA TTT GAC TTA TCC 1827
Thr Tyr Thr Pro Pro Pro Val He Asn Gly Ser Lys Phe Asp Leu Ser 590 595 600
GCT TCA AAT TAT ATC AAT GCT GAC ATG CCT TGG TAT GAC CAT AAA TAT 1875
Ala Ser Asn Tyr He Asn Ala Asp Met Pro Trp Tyr Asp His Lys Tyr
605 610 615 620
TAC ATC CCT AAA TCC CAA AAT TTT ACA GAG AGC GGG ACT TAT TAC TTG 1923 Tyr He Pro Lys Ser Gin Asn Phe Thr Glu Ser Gly Thr Tyr Tyr Leu 625 630 635
CCG AGC GTC CAA ATA TGG GGG AGC TAC ACT AAC TCG TTT AAA CAA ACT 1971 Pro Ser Val Gin He Trp Gly Ser Tyr Thr Asn Ser Phe Lys Gin Thr 640 645 650
TTT AGC GCA AAT GGT AGT AAT CTG GTG ATT GGG TAT AAC TCA ACA TGG 2019 Phe Ser Ala Asn Gly Ser Asn Leu Val He Gly Tyr Asn Ser Thr Trp 655 660 665
ACT GAT CAT AAT GTC TCT TCT AGC GGC ACG GTG TCT TTT GGG GAC ACT 2067 Thr Asp His Asn Val Ser Ser Ser Gly Thr Val Ser Phe Gly Asp Thr 670 675 680
TCA GGG AGC GCT CTT AAT GGG CAT TGC GGA CCT TGG CCG TAT TAC CAA 2115 Ser Gly Ser Ala Leu Asn Gly His Cys Gly Pro Trp Pro Tyr Tyr Gin 685 690 695 700
TGC ACA GGC ACG ACT AAC GGC ACT TAT AGC GCC TAT CAT GTG TAT ATC 2163 Cys Thr Gly Thr Thr Asn Gly Thr Tyr Ser Ala Tyr His Val Tyr He 705 710 715
ACA GCG AAT CTG CGT TCT GGC AAT CGT ATA GGC ACC GGT GGG GCA GCT 2211 Thr Ala Asn Leu Arg Ser Gly Asn Arg He Gly Thr Gly Gly Ala Ala
720 725 730
AAT CTA ATC TTT AAT GGG GTA GAT AGT ATC AAT ATC GCT AAC GCT ACC 2259 Asn Leu He Phe Asn Gly Val Asp Ser He Asn He Ala Asn Ala Thr 735 740 745
ATC ACG CAA CAT AAC GCC GGA ATC TAT TCA AGC TCT ATG ACT TTT TCC 2307 He Thr Gin His Asn Ala Gly He Tyr Ser Ser Ser Met Thr Phe Ser 750 755 760
ACG CAA AGC ATG GAT AAT TCG CAG AAT TTG AAT GGT CTA AAT TCT AAC 2355 Thr Gin Ser Met Asp Asn Ser Gin Asn Leu Asn Gly Leu Asn Ser Asn 765 770 775 780
GGC AAA CTT TCG GTG TAT GGC ACC ACT TTC ACT AAC GAA GCT AAA GAT 2403 Gly Lys Leu Ser Val Tyr Gly Thr Thr Phe Thr Asn Glu Ala Lys Asp 785 790 795
GGG AAA TTC ATT TTC AAT GCA GGG CAA GCG GTT TTT GAA AAC ACC AAC 2451 Gly Lys Phe He Phe Asn Ala Gly Gin Ala Val Phe Glu Asn Thr Asn 800 805 810
TTT AAT GGA GGG AGT TAC CAA TTC AGC GGC GAT AGC TTG AAT TTT TCA 2499 Phe Asn Gly Gly Ser Tyr Gin Phe Ser Gly Asp Ser Leu Asn Phe Ser 815 820 825
AAC AAC AAC CAG TTC AAT AGC GGT TCG TTT GAA ATT AGC GCA AAA AAC 2547 Asn Asn Asn Gin Phe Asn Ser Gly Ser Phe Glu He Ser Ala Lys Asn 830 835 840
GCT TCG TTC AAT AAC GCN AAC TTT AAC AAC AGC GCT TCT TTT AAT TTC 2595 Ala Ser Phe Asn Asn Ala Asn Phe Asn Asn Ser Ala Ser Phe Asn Phe 845 850 855 860
AAT AAT TCT AAC GCG ACC ACT TCG TTT GTG GGG GAT TTC ACT AAC GCT 2643 Asn Asn Ser Asn Ala Thr Thr Ser Phe Val Gly Asp Phe Thr Asn Ala 865 870 875
AAT TCA AAT TTG CAA ATC GCC GGG AAC GCT GTT TTT GGG AAC TCT ACT 2691 Asn Ser Asn Leu Gin He Ala Gly Asn Ala Val Phe Gly Asn Ser Thr 880 885 890
AAT GGC TCT CAA AAT ACC GCT AAT TTT AAT AAT ACC GGC TCT GTT AAT 2739 Asn Gly Ser Gin Asn Thr Ala Asn Phe Asn Asn Thr Gly Ser Val Asn 895 900 905
ATT TCA GGG AAT GCA ACC TTT GAT AAT GTG GTG TTT AAT GGC CCT ACG 2787 He Ser Gly Asn Ala Thr Phe Asp Asn Val Val Phe Asn Gly Pro Thr 910 915 920
AAC ACG AGC GTG AAA GGG CAG GTT ACT TTA AAT AAC ATC ACT TTA AAA 2835 Asn Thr Ser Val Lys Gly Gin Val Thr Leu Asn Asn He Thr Leu Lys 925 930 935 940
AAC CTG AAC GCC CCT TTG TCT TTT GGC GAT GGG ACG ATT ACT TTT AAC 2883 Asn Leu Asn Ala Pro Leu Ser Phe Gly Asp Gly Thr He Thr Phe Asn 945 950 955
GCT CAT TCG GTG ATT AAT ATT GCT GAA TCT ATC ACT AAT GGC AAC CCT 2931 Ala His Ser Val He Asn He Ala Glu Ser He Thr Asn Gly Asn Pro 960 965 970
ATC ACT CTT GTA AGC TCT TCT AAA GAA ATT GAA TAC AAC AAC GCT TTC 2979 He Thr Leu Val Ser Ser Ser Lys Glu He Glu Tyr Asn Asn Ala Phe 975 980 985
AGT AAA AAT CTA TGG CAG CTC ATC AAC TAC CAA GGG CAT GGG GCA AGC 3027 Ser Lys Asn Leu Trp Gin Leu He Asn Tyr Gin Gly His Gly Ala Ser 990 995 1000
AGT GAA AAG CTC GTC TCT AGC GCG GGT AAT GGC GTT TAT GAT GTG GTG 3075 Ser Glu Lys Leu Val Ser Ser Ala Gly Asn Gly Val Tyr Asp Val Val 1005 1010 1015 1020
TAT TCT TTC AAT AAC CAA ACC TAC AAT TTC CAA GAG GTT TTT TCA CAA 3123 Tyr Ser Phe Asn Asn Gin Thr Tyr Asn Phe Gin Glu Val Phe Ser Gin 1025 1030 1035
AAC AGC ATT TCT ATC CGG CGT TTG GGC GTT AAC ATG GTG TTT GAT TAT 3171 Asn Ser He Ser He Arg Arg Leu Gly Val Asn Met Val Phe Asp Tyr 1040 1045 1050
GTG GAT ATG GAA AAA TCG GAT CAT TTA TAT TAT CAA AAC GCT CTC GGT 3219 Val Asp Met Glu Lys Ser Asp His Leu Tyr Tyr Gin Asn Ala Leu Gly 1055 1060 1065
TTT ATG ACC TAC ATG CCT AAY AGC TAT AAC AAT AAT TTA GGG AAT GCA 3267
Phe Met Thr Tyr Met Pro Asn Ser Tyr Asn Asn Asn Leu Gly Asn Ala
1070 1075 1080
AAC AAC ACC ATT TAC TAT TAC GAC AAG AGC ATT GAT TTT TAT GCG AGC 3315
Asn Asn Thr He Tyr Tyr Tyr Asp Lys Ser He Asp Phe Tyr Ala Ser
1085 1090 1095 1100
GGG AAA ACT CTA TTC ACT AAA GCG GAA TTT TCT CAA ACA TTC ACC GGG 3363
Gly Lys Thr Leu Phe Thr Lys Ala Glu Phe Ser Gin Thr Phe Thr Gly
1105 1110 1115
CAA AAC AGC GCG ATC GTT TTT GGG GCT AAA AGC ATA TGG ACG AGC TTA 3411
Gin Asn Ser Ala He Val Phe Gly Ala Lys Ser He Trp Thr Ser Leu
1120 1125 1130
AGC GAT GCA CCG CAG TCT AAC ACC ATC ATT CGC TTT GGG GAC AAT AAG 3459
Ser Asp Ala Pro Gin Ser Asn Thr He He Arg Phe Gly Asp Asn Lys
1135 1140 1145
GGA GCA GGG AGT AAT GAT GCG AGC GGG CAT TGC TGG AAT TTG CAA TGC 3507
Gly Ala Gly Ser Asn Asp Ala Ser Gly His Cys Trp Asn Leu Gin Cys
1150 1155 1160
ATA GGC TTT ATT ACA GGG CAT TAT GAA GCG CAA AAG ATT TAC ATC ACC 3555
He Gly Phe He Thr Gly His Tyr Glu Ala Gin Lys He Tyr He Thr
1165 1170 1175 1180
GGT AGC ATT GAA AGC GGG AAT CGC ATT TCT AGC GGT GGG GGC GCG AGC 3603
Gly Ser He Glu Ser Gly Asn Arg He Ser Ser Gly Gly Gly Ala Ser
1185 1190 1195
CTT AAT TTT AAC GGG CTT CAA GGC ATT CTT TTA ACG AAC GCG ACT TTG 3651
Leu Asn Phe Asn Gly Leu Gin Gly He Leu Leu Thr Asn Ala Thr Leu
1200 1205 1210
TAT AAC CGC GCC GCT GGC ACG CAA AGC TCG TCT ATG AAT TTT ATC TCT 3699
Tyr Asn Arg Ala Ala Gly Thr Gin Ser Ser Ser Met Asn Phe He Ser
1215 1220 1225
AAC AGC GCG AAC ATT CAG GCT CAA AAC TCC TAT TTT ATA GAC GAT ACC 3747
Asn Ser Ala Asn He Gin Ala Gin Asn Ser Tyr Phe He Asp Asp Thr
1230 1235 1240
GCA CAA AAT GGC GGT AAC CCT AAT TTC AGT TTC AAC GCT TTG AAT CTG 3795
Ala Gin Asn Gly Gly Asn Pro Asn Phe Ser Phe Asn Ala Leu Asn Leu
1245 1250 1255 1260
GAT TTT TCT AAC AGC TCT TTT AGA GGC TAT GTG GGG AAA ACG CAA TCT 3843
Asp Phe Ser Asn Ser Ser Phe Arg Gly Tyr Val Gly Lys Thr Gin Ser
1265 1270 1275
GTT TTT AAA TTC AAT GCC AAG AAT GCG ATC AGT TTC ACC AAC AGC ACG 3891
Val Phe Lys Phe Asn Ala Lys Asn Ala He Ser Phe Thr Asn Ser Thr
1280 1285 1290
AAT TTA AGC TCT GGN YTG TAT CAA ATG CAA GCT AAA AGC GTG TTG TTT 3939 Asn Leu Ser Ser Gly Leu Tyr Gln Met Gln Ala Lys Ser Val Leu Phe 1295 1300 1305
GAC AAT TCC AAT TTA AGC GTT TCA GTG GGG ACA AGC AGT ATT AAA GCC 3987 Asp Asn Ser Asn Leu Ser Val Ser Val Gly Thr Ser Ser He Lys Ala 1310 1315 1320
AAT GCG ATC AAT CTT TCT CAA AAT GCC TCT ATT AAT GCG AGC AAC CAT 4035 Asn Ala He Asn Leu Ser Gin Asn Ala Ser He Asn Ala Ser Asn His 1325 1330 1335 1340
TCA ACC TTA GAA CTT CAA GGC GAT TTG AAT GTG AAC GAC ACC AGC TCG 4083 Ser Thr Leu Glu Leu Gin Gly Asp Leu Asn Val Asn Asp Thr Ser Ser 1345 1350 1355
CTC AAC CTC AAC CAA AGC ACG ATT AAT GTT TCC AAT AAC GCC ACG ATC 4131
Leu Asn Leu Asn Gin Ser Thr He Asn Val Ser Asn Asn Ala Thr He
1360 1365 1370
AAC GAT TAT GCG AGC TTG ATT GCG AGT AAT GGC TCT CAC CTT AAT TTT 4179
Asn Asp Tyr Ala Ser Leu He Ala Ser Asn Gly Ser His Leu Asn Phe 1375 1380 1385
AAC GGG GCG GTT AAT TTC AAT TCA GCG AAT ATT ACT ACG AGT TTG AAT 4227 Asn Gly Ala Val Asn Phe Asn Ser Ala Asn He Thr Thr Ser Leu Asn 1390 1395 1400
AAT TCC TCT ATC GTG TTT AAG GGG GCG GTC TCT TTA GGA GGG CAG TTT 4275 Asn Ser Ser He Val Phe Lys Gly Ala Val Ser Leu Gly Gly Gin Phe 1405 1410 1415 1420
AAT TTA AGC AAT AAC TCT TCT TTA GAT TTC CAA GGC TCT AGC GCT ATC 4323 Asn Leu Ser Asn Asn Ser Ser Leu Asp Phe Gin Gly Ser Ser Ala He 1425 1430 1435
ACC TCT AAC ACG GCG TTT AAT TTC TAT GAT AAC GCT TTT TCT CAA AGC 4371 Thr Ser Asn Thr Ala Phe Asn Phe Tyr Asp Asn Ala Phe Ser Gin Ser 1440 1445 1450
CCC ATC ACT TTC CAT CAA GCC CTT GAC ATT AAA GCG CCC TTA AGT TTG 4419 Pro He Thr Phe His Gin Ala Leu Asp He Lys Ala Pro Leu Ser Leu 1455 1460 1465
GGA GGC AAC CTT TTA AAC CCT AAC AAC AGC AGC GTG CTG GAT TTA AAA 4467 Gly Gly Asn Leu Leu Asn Pro Asn Asn Ser Ser Val Leu Asp Leu Lys 1470 1475 1480
AAC AGC CAG CTT GTT TTT GGC GAT CAA GGG AGT TTG AAT ATC GCT AAC 4515 Asn Ser Gin Leu Val Phe Gly Asp Gin Gly Ser Leu Asn He Ala Asn 1485 1490 1495 1500
ATT GAT TTA CTA AGC GAT CTA AAT GAT AAT AAA AAT CGT GTG TAT AAC 4563 He Asp Leu Leu Ser Asp Leu Asn Asp Asn Lys Asn Arg Val Tyr Asn 1505 1510 1515
ATC ATT CAA GCG GAC ATG AAT AGT AAT TGG TAT GAG CGT ATC AGC TTC 4611 Ile Ile Gln Ala Asp Met Asn Ser Asn Trp Tyr Glu Arg Ile Ser Phe 1520 1525 1530
TTT GGC ATG CAC ATC AAT GAC GGG ATT TAT GAT GCT AAA AAC CAA ACT 4659 Phe Gly Met His He Asn Asp Gly He Tyr Asp Ala Lys Asn Gin Thr 1535 1540 1545
TAT AGT TTC ACT AAC CCC CTT AAT AAC GCC CTA AAA ATC ACC GAG AGC 4707 Tyr Ser Phe Thr Asn Pro Leu Asn Asn Ala Leu Lys He Thr Glu Ser 1550 1555 1560
TTT AAA GAC AAC CAA CTA AGC GTT ACG CTC TCT CAA ATC CCG GGT ATT 4755 Phe Lys Asp Asn Gin Leu Ser Val Thr Leu Ser Gin He Pro Gly He 1565 1570 1575 1580
AAA AAC ACG CTC TAT AAC ATT GGC TCT GAA ATT TTT AAC TAC CAA AAA 4803 Lys Asn Thr Leu Tyr Asn He Gly Ser Glu He Phe Asn Tyr Gin Lys 1585 1590 1595
GTT TAT AAC AAC GCT AAT GGC GTG TAT TCT TAT AGC GAT GAT GCA CAA 4851 Val Tyr Asn Asn Ala Asn Gly Val Tyr Ser Tyr Ser Asp Asp Ala Gin 1600 1605 1610
GGC GTG TTT TAT CTC ACA AGC AAC GTG AAA GGC TAT TAC AAC CCT AAC 4899 Gly Val Phe Tyr Leu Thr Ser Asn Val Lys Gly Tyr Tyr Asn Pro Asn 1615 1620 1625
CAA TCC TAT CAA GCC AGC GGC AGT AAC AAC ACC ACG AAA AAT AAT AAT 4947 Gin Ser Tyr Gin Ala Ser Gly Ser Asn Asn Thr Thr Lys Asn Asn Asn 1630 1635 1640
CTA ACC TCT GAA TCT TCT ATC ATC TCG CAA ACC TAT AAC GCG CAA GGC 4995 Leu Thr Ser Glu Ser Ser He He Ser Gin Thr Tyr Asn Ala Gin Gly 1645 1650 1655 1660
AAC CCT ATT AGC GCG TTG CAC ATC TAT AAC AAG GGC TAT AAT TTC AAC 5043 Asn Pro He Ser Ala Leu His He Tyr Asn Lys Gly Tyr Asn Phe Asn 1665 1670 1675
AAT ATC AAA GCG TTA GGG CAA ATG GCT CTC AAA CTC TAC CCT GAA ATC 5091 Asn He Lys Ala Leu Gly Gin Met Ala Leu Lys Leu Tyr Pro Glu He 1680 1685 1690
AAA AAG GTA TTA GGG AAT GAT TTT TCG CCC TCA AGT TTG AAC GCT TTA 5139 Lys Lys Val Leu Gly Asn Asp Phe Ser Pro Ser Ser Leu Asn Ala Leu 1695 1700 1705
AAC TCT AAT GCG CTA AAC CAA CTT ACC AAA CTC ATC ACG CCT AAC GAC 5187 Asn Ser Asn Ala Leu Asn Gin Leu Thr Lys Leu He Thr Pro Asn Asp 1710 1715 1720
TGG AAA AAC ATT AAC GAG TTG ATT GAT AAC GCA AAC AAT TCG GTG GTG 5235 Trp Lys Asn He Asn Glu Leu He Asp Asn Ala Asn Asn Ser Val Val 1725 1730 1735 1740
CAA AAT TTC AAT AAC GGN ACT TTG ATT GTG GGA GCG ACT CAA ATA GGG 5283
Gln Asn Phe Asn Asn Gly Thr Leu Ile Val Gly Ala Thr Gln Ile Gly 1745 1750 1755
CAA ACA GAC ACC AAT AGC GCG GTT GTT TTT GGG GGC TTG GGC TAT CAA 5331
Gin Thr Asp Thr Asn Ser Ala Val Val Phe Gly Gly Leu Gly Tyr Gin 1760 1765 1770
ACA CCT TGT GAT TAT ACT GAT ATT GTG TGC CAA AAA TTT AGA GGC ACT 5379
Thr Pro Cys Asp Tyr Thr Asp He Val Cys Gin Lys Phe Arg Gly Thr 1775 1780 1785
TAT TTA GGA CAG CTT TTA GAG TCC AGC TCG GCT GAT TTG GGC TAT ATT 5427
Tyr Leu Gly Gin Leu Leu Glu Ser Ser Ser Ala Asp Leu Gly Tyr He 1790 1795 1800
GAC ACG ACT TTT AAC GCT AAA GAA ATT TAT CTT ACC GGC ACT TTA GGG 5475
Asp Thr Thr Phe Asn Ala Lys Glu He Tyr Leu Thr Gly Thr Leu Gly
1805 1810 1815 1820
AGC GGG AAC GCA TGG GGG ACT GGG GGG AGC GCG AGC GTA ACT TTT AAC 5523
Ser Gly Asn Ala Trp Gly Thr Gly Gly Ser Ala Ser Val Thr Phe Asn 1825 1830 1835
AGC CAA ACT TCG CTC ATT CTC AAT CAG GCT AAT ATC GTA AGC TCG CAA 5571
Ser Gin Thr Ser Leu He Leu Asn Gin Ala Asn He Val Ser Ser Gin 1840 1845 1850
ACC GAT GGG ATC TTT AGC ATG CTG GGT CAA GAG GGT ATT AAT AAG GTT 5619
Thr Asp Gly He Phe Ser Met Leu Gly Gin Glu Gly He Asn Lys Val 1855 1860 1865
TTC AAT CAA GCC GGG CTC GCT AAT ATT TTG GGC GAA GTG GCG GTG CAA 5667
Phe Asn Gin Ala Gly Leu Ala Asn He Leu Gly Glu Val Ala Val Gin 1870 1875 1880
TCC ATC AAC AAA GCC GGG GGA TTA GGG AAT TTG ATA GTA AAT ACG CTA 5715
Ser He Asn Lys Ala Gly Gly Leu Gly Asn Leu He Val Asn Thr Leu
1885 1890 1895 1900
GGG AGT AAT AGC GTG ATT GGG GGG TAT TTA ACG CCT GAA CAA AAA AAT 5763
Gly Ser Asn Ser Val He Gly Gly Tyr Leu Thr Pro Glu Gin Lys Asn 1905 1910 1915
CAA ACC CTA AGC CAG CTT TTA GGG CAG AAT AAC TTT GAT AAT CTC ATG 5811
Gin Thr Leu Ser Gin Leu Leu Gly Gin Asn Asn Phe Asp Asn Leu Met 1920 1925 1930
AAC GAT AGC GGT TTG AAT ACG GCG ATT AAG GAT TTG ATC AGA CAA AAA 5859
Asn Asp Ser Gly Leu Asn Thr Ala He Lys Asp Leu He Arg Gin Lys 1935 1940 1945
TTA GGC TTT TGG ACC GGG CTA GTG GGG GGA TTA GCC GGA CTA GGG GGC 5907
Leu Gly Phe Trp Thr Gly Leu Val Gly Gly Leu Ala Gly Leu Gly Gly 1950 1955 1960
ATT GAT TTG CAA AAY CCT GAA AAG CTT ATA GGC AGC ATG TCA ATC AAT 5955 Ile Asp Leu Gln Asn Pro Glu Lys Leu Ile Gly Ser Met Ser Ile Asn 1965 1970 1975 1980
GAT TTA TTG AGT AAA AAA GGG TTG TTC AAT CAG ATC ACC GGC TTT ATT 6003 Asp Leu Leu Ser Lys Lys Gly Leu Phe Asn Gin He Thr Gly Phe He 1985 1990 1995
TCC GCT AAC GAT ATA GGG CAA GTC ATA AGC GTA ATG TTG CAA GAT ATT 6051 Ser Ala Asn Asp He Gly Gin Val He Ser Val Met Leu Gin Asp He
2000 2005 2010
GTC AAA CCG AGC AAC GCT TTA AAA AAC GAT GTA GCG GCT TTA GGC AAG 6099 Val Lys Pro Ser Asn Ala Leu Lys Asn Asp Val Ala Ala Leu Gly Lys 2015 2020 2025
CAA ATG ATT GGC GAA TTT TTA GGC CAA GAC ACG CTC AAT TCT TTA GAA 6147 Gin Met He Gly Glu Phe Leu Gly Gin Asp Thr Leu Asn Ser Leu Glu 2030 2035 2040
AGC TTG TTG CAA AAC CAG CAG ATT AAA AGC GTT TTA GAC AAA GTC CTA 6195 Ser Leu Leu Gin Asn Gin Gin He Lys Ser Val Leu Asp Lys Val Leu 2045 2050 2055 2060
GCG GCT AAA GGT TTA GGG CCT ATT TAT GAA CAA GGC TTG GGG GAT TTG 6243 Ala Ala Lys Gly Leu Gly Pro He Tyr Glu Gin Gly Leu Gly Asp Leu 2065 2070 2075
ATA CCT AAT CTT GGT AAA AAA GGG CTT TTC GCT CCT TAT GGC TTG AGT 6291 He Pro Asn Leu Gly Lys Lys Gly Leu Phe Ala Pro Tyr Gly Leu Ser 2080 2085 2090
CAA GTG TGG CAA AAA GGG GAT TTT AGT TTC AAC GCA CAA GGC AAT GTT 6339 Gin Val Trp Gin Lys Gly Asp Phe Ser Phe Asn Ala Gin Gly Asn Val 2095 2100 2105
TTT GTG CAA AAT TCC ACT TTC TCT AAC GCC AAT GGA GGC ACG CTC TCT 6387 Phe Val Gin Asn Ser Thr Phe Ser Asn Ala Asn Gly Gly Thr Leu Ser 2110 2115 2120
TTT AAC GCA GGA AAT TCG CTC ATT TTT GCC GGA AAC AAT CAT ATT GCA 6435 Phe Asn Ala Gly Asn Ser Leu He Phe Ala Gly Asn Asn His He Ala 2125 2130 2135 2140
TTC ACT AAC CAC GCT GGA ACT CTT CAA TTA TTG TCC GAT CAA GTT TCT 6483 Phe Thr Asn His Ala Gly Thr Leu Gin Leu Leu Ser Asp Gin Val Ser 2145 2150 2155
AAC ATT AAC ATC ACC ACG CTT AAC GCT AGC AAC GGC CTT AAG ATT AAC 6531 Asn He Asn He Thr Thr Leu Asn Ala Ser Asn Gly Leu Lys He Asn 2160 2165 2170
GCC GCT AAT AAC AAT GTT TCT GTG TCT CAA GGC AAT CTG TTT GTC AGC 6579 Ala Ala Asn Asn Asn Val Ser Val Ser Gin Gly Asn Leu Phe Val Ser 2175 2180 2185
GCT AGC TGC GCG CAA CAA AGC GAT CCA ACT ACA GCT AAT ATT GCA AAC 6627 Ala Ser Cys Ala Gin Gin Ser Asp Pro Thr Thr Ala Asn He Ala Asn 2190 2195 2200
CCT TGC GCG CTT AGC GCC CAA AGC ACG AAT GGC GCT TCT TCT AAT AAT 6675 Pro Cys Ala Leu Ser Ala Gin Ser Thr Asn Gly Ala Ser Ser Asn Asn 2205 2210 2215 2220
GCG TCA AAT AAC GCG CCA ATC GCC TTG AGT AAT AAC GAT GAA AGC TTG 6723 Ala Ser Asn Asn Ala Pro He Ala Leu Ser Asn Asn Asp Glu Ser Leu 2225 2230 2235
ATG GTT GCG GCG AAT GAT TTC AAT TTT TCA GGC AAT ATT TAC GCT AAT 6771 Met Val Ala Ala Asn Asp Phe Asn Phe Ser Gly Asn He Tyr Ala Asn 2240 2245 2250
GGG GTG GTT GAT TTT TCA AAG ATT AAA GGC TCT GCA AAC ATT AAA AAC 6819 Gly Val Val Asp Phe Ser Lys He Lys Gly Ser Ala Asn He Lys Asn 2255 2260 2265
CTG TAT CTT TAC AAT AAC GCT CAA TTC CAA GCC AAC AAT CTC ACT ATT 6867 Leu Tyr Leu Tyr Asn Asn Ala Gin Phe Gin Ala Asn Asn Leu Thr He 2270 2275 2280
TCC AAT CAA GCG GTG TTA GAA AAA AAC GCC AGC TTT GTA ACG AAT AAT 6915 Ser Asn Gin Ala Val Leu Glu Lys Asn Ala Ser Phe Val Thr Asn Asn 2285 2290 2295 2300
TTA AAC ATT CAA GGA GCG TTT AAC AAC AAC GCC ACG CAA AAA ATA GAG 6963 Leu Asn He Gin Gly Ala Phe Asn Asn Asn Ala Thr Gin Lys He Glu 2305 2310 2315
GTG CTT CAA AAT TTA GTG ATC GCT TCA AAC GCT TCT TTA AGC ACC GGG 7011 Val Leu Gin Asn Leu Val He Ala Ser Asn Ala Ser Leu Ser Thr Gly 2320 2325 2330
ATT TAT GGG TTA GAA GTA GGG GGG GCT TTG AAT AAT TCT GGA GCG ATC 7059 He Tyr Gly Leu Glu Val Gly Gly Ala Leu Asn Asn Ser Gly Ala He 2335 2340 2345
CAT TTT AAT TTA GAA AAT ACC CAA ACG CCA ACG CCG CTC ATT CAA GCA 7107 His Phe Asn Leu Glu Asn Thr Gin Thr Pro Thr Pro Leu He Gin Ala 2350 2355 2360
GAG GGG ATC ATT AAC CTC AAC ACC ACC CAA ACG CCT TTT ATG AAT GTC 7155 Glu Gly He He Asn Leu Asn Thr Thr Gin Thr Pro Phe Met Asn Val 2365 2370 2375 2380
AAT AAC AGC ATG GCC AAT AAT ACG ACT TAC ACT TTA TTA AAA AGC AGC 7203 Asn Asn Ser Met Ala Asn Asn Thr Thr Tyr Thr Leu Leu Lys Ser Ser 2385 2390 2395
CGT TAC ATT GAT TAC AAT ATC AAC CCC AAC AGC TTG CAA TCG TAT TTG 7251 Arg Tyr He Asp Tyr Asn He Asn Pro Asn Ser Leu Gin Ser Tyr Leu 2400 2405 2410
AAT CTC TAC ACT TTA ATC AAT ATC AAC GGG AAC CAC ATA GAG GAA AAA 7299 Asn Leu Tyr Thr Leu Ile Asn Ile Asn Gly Asn His Ile Glu Glu Lys 2415 2420 2425
AAC GGC GCA TTG ACT TAT TTG GGC CAA CGG GTT TTG TTG CAA GAT AAG 7347 Asn Gly Ala Leu Thr Tyr Leu Gly Gin Arg Val Leu Leu Gin Asp Lys 2430 2435 2440
GGG TTA TTG TTA AGC GTA GCG CTG CCC AAC TCA AAC AAC GCT TCT CAA 7395 Gly Leu Leu Leu Ser Val Ala Leu Pro Asn Ser Asn Asn Ala Ser Gin 2445 2450 2455 2460
AAC AAC ATT TTA AGC CTT TCT GTC CTT TAT AAC CAA GTT AAA ATG TCT 7443 Asn Asn He Leu Ser Leu Ser Val Leu Tyr Asn Gin Val Lys Met Ser 2465 2470 2475
TGC GGC GAT AAA GCG ATG GAT TTT ACC CCC CCT ACC TTA CAA GAT TAC 7491 Cys Gly Asp Lys Ala Met Asp Phe Thr Pro Pro Thr Leu Gin Asp Tyr 2480 2485 2490
ATT GTG GGC ATT CAA GGG CAA AGC GCG CTC AAT CAA ATT GAA GCT GTT 7539 He Val Gly He Gin Gly Gin Ser Ala Leu Asn Gin He Glu Ala Val 2495 2500 2505
GGG GGG AAC GCT ATC AAG TGG CTT TCA ACA TTG ATG ATG GAG ACT AAA 7587 Gly Gly Asn Ala He Lys Trp Leu Ser Thr Leu Met Met Glu Thr Lys 2510 2515 2520
GAA AAC CCG TTT TTT GCG CCG ATT TAT TTA AAA AAC CAC TCT TTG AAT 7635 Glu Asn Pro Phe Phe Ala Pro He Tyr Leu Lys Asn His Ser Leu Asn 2525 2530 2535 2540
GAA ATC TTA GGC GTA ACA AAA GAT CTT CAA AAC ACC GCA AGC TTG ATT 7683 Glu He Leu Gly Val Thr Lys Asp Leu Gin Asn Thr Ala Ser Leu He 2545 2550 2555
TCT AAC CCT AAT TTT AGA GAT AAC GCT ACC AAT CTT TTA GAA TTG GCG 7731 Ser Asn Pro Asn Phe Arg Asp Asn Ala Thr Asn Leu Leu Glu Leu Ala 2560 2565 2570
AGT TAC ACC CAA CAA ACC AGC CGT TTA ACA AAA CTC TCT GAT TTT AGA 7779 Ser Tyr Thr Gin Gin Thr Ser Arg Leu Thr Lys Leu Ser Asp Phe Arg 2575 2580 2585
TCT AGA GAG GGA GAG TCT GAT TTT TCT TTG TTA GAG CTT AAA AAC AAG 7827 Ser Arg Glu Gly Glu Ser Asp Phe Ser Leu Leu Glu Leu Lys Asn Lys 2590 2595 2600
CGT TTT AGC GAT CCT AAT CCA GAG GTT TTT GTC AAA TAC TCT CAA CTT 7875 Arg Phe Ser Asp Pro Asn Pro Glu Val Phe Val Lys Tyr Ser Gin Leu 2605 2610 2615 2620
AGC AAA CAC CCA AAT AAC CTT TGG GTT CAA GGG GTG GGA GGA GCG AGC 7923 Ser Lys His Pro Asn Asn Leu Trp Val Gin Gly Val Gly Gly Ala Ser 2625 2630 2635
TTT ATT TCT GGG GGC AAT GGC ACG CTT TAT GGC TTG AAT GCG GGC TAT 7971
Phe Ile Ser Gly Gly Asn Gly Thr Leu Tyr Gly Leu Asn Ala Gly Tyr
2640 2645 2650
GAC AGG TTG GTT AAA AAT GTG ATC CTT GGG GGT TAT GTG GCT TAT GGC 8019
Asp Arg Leu Val Lys Asn Val He Leu Gly Gly Tyr Val Ala Tyr Gly 2655 2660 2665
TAT AGC GAC TTT AAT GGG AAC ATC ATG CAT TCT TTG GGT AAT AAT GTG 8067 Tyr Ser Asp Phe Asn Gly Asn He Met His Ser Leu Gly Asn Asn Val 2670 2675 2680
GAT GTG GGG ATG TAT GCG AGG GCT TTT TTA AAA AGG AAC GAA TTC ACT 8115 Asp Val Gly Met Tyr Ala Arg Ala Phe Leu Lys Arg Asn Glu Phe Thr 2685 2690 2695 2700
TTG AGC GCG AAT GAA ACT TAT GGA GGC AAT GCA ACT AGT ATC AAT TCT 8163 Leu Ser Ala Asn Glu Thr Tyr Gly Gly Asn Ala Thr Ser He Asn Ser 2705 2710 2715
TCT AAT TCT TTG CTC TCT GTG TTG AAC CAA CGC TAC AAC TAC AAC ACC 8211 Ser Asn Ser Leu Leu Ser Val Leu Asn Gin Arg Tyr Asn Tyr Asn Thr 2720 2725 2730
TGG ACA ACG AGC GTG AAC GGG AAT TAC GGC TAT GAT TTC ATG TTC AAA 8259 Trp Thr Thr Ser Val Asn Gly Asn Tyr Gly Tyr Asp Phe Met Phe Lys 2735 2740 2745
CAA AAA AGC GTG GTG CTA AAA CCT CAA GTG GGT TTG AGC TAT CAT TTC 8307 Gin Lys Ser Val Val Leu Lys Pro Gin Val Gly Leu Ser Tyr His Phe 2750 2755 2760
ATA GGT CTA AGT GGG ATG AAA GGC AAT GAT GCC GCT TAC AAA CAA TTC 8355 He Gly Leu Ser Gly Met Lys Gly Asn Asp Ala Ala Tyr Lys Gin Phe 2765 2770 2775 2780
CTC ATG CAT TCA AAC CCC TCT AAC GAA TCG GTT TTA ACG CTC AAC ATG 8403 Leu Met His Ser Asn Pro Ser Asn Glu Ser Val Leu Thr Leu Asn Met 2785 2790 2795
GGG TTG GAG AGC CGT AAA TAT TTT GGT AAA AAT TCC TAT TAT TTT GTA 8451 Gly Leu Glu Ser Arg Lys Tyr Phe Gly Lys Asn Ser Tyr Tyr Phe Val 2800 2805 2810
ACG GCG AGA CTA GGT AGG GAT CTT TTG ATC AAA TCT AAA GGC AGC AAT 8499 Thr Ala Arg Leu Gly Arg Asp Leu Leu He Lys Ser Lys Gly Ser Asn 2815 2820 2825
ACG GTG CGT TTT GTG GGC GAA AAC ACT TTA TTG TAT CGC AAG GGG GAA 8547 Thr Val Arg Phe Val Gly Glu Asn Thr Leu Leu Tyr Arg Lys Gly Glu 2830 2835 2840
GTT TTT AAC ACT TTT GCG AGC GTG ATT ACA GGG GGC GAA ATG CAT TTG 8595 Val Phe Asn Thr Phe Ala Ser Val He Thr Gly Gly Glu Met His Leu 2845 2850 2855 2860
TGG CGT TTG GTG TAY GTG AAT GCG GGG GTG GGG CTT AAG ATG GGC TTG 8643 Trp Arg Leu Val Tyr Val Asn Ala Gly Val Gly Leu Lys Met Gly Leu 2865 2870 2875
CAA TAC CAA GAT ATT AAT ATA ACC GGG AAT GTG GGC ATG CGA GTG GCG 8691 Gin Tyr Gin Asp He Asn He Thr Gly Asn Val Gly Met Arg Val Ala 2880 2885 2890
TTT TAGCTTTTTT GCTATAATGC TTCGTTCAAA TTTTATGGTT AGGTTTTTCT ATGT 8748 Phe
(2) INFORMATION FOR SEQ ID NO: 166:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2893 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:
Met Lys Lys Phe Lys Lys Lys Pro Lys Ser Ile Lys Arg Ser His Gln
1 5 10 15
Asn Gln Lys Thr Ile Leu Lys Arg Pro Leu Trp Leu Met Pro Leu Leu
20 25 30
Ile Ser Gly Phe Ala Ser Gly Val Tyr Ala Asn Asn Leu Trp Asp Leu
35 40 45
Leu Asn Pro Lys Val Gly Gly Glu Tyr Val His Trp Val Lys Gly Ser
50 55 60
Gln Tyr Cys Ala Trp Trp Glu Phe Ala Gly Cys Leu Lys Asn Val Trp 65 70 75 80
Gly Ala Asn His Lys Gly Tyr Asp Ala Gly Asn Ala Ala Asn Tyr Leu
85 90 95
Ser Ser Gln Asn Tyr Gln Ala Ile Ser Val Gly Ser Gly Asn Glu Thr
100 105 110
Gly Thr Tyr Ser Leu Ser Gly Phe Thr Asn Tyr Val Gly Gly Asn Leu
115 120 125
Thr Ile Asn Leu Gly Asn Ser Val Val Leu Asp Leu Ser Gly Ser Asn
130 135 140
Ser Phe Thr Ser Tyr Gln Gly Tyr Asn Gln Gly Lys Asp Asp Val Thr 145 150 155 160
Phe Thr Val Gly Ala Ile Asn Leu Asn Gly Thr Leu Glu Val Gly Asn
165 170 175
Arg Val Gly Ser Gly Ala Gly Thr His Thr Gly Thr Ala Thr Leu Asn
180 185 190
Leu Asn Ala Asn Lys Val Asn Ile Asn Ser Asn Ile Asn Ala Tyr Lys
195 200 205
Thr Ser Gln Val Asn Ile Gly Asn Ala Asn Ser Val Ile Thr Ile Gly
210 215 220
Ser Val Ser Leu Ser Gly Asp Val Cys Ser Ser Leu Ala Ser Val Gly 225 230 235 240
Ile Gly Ala Asn Cys Ser Thr Ser Gly Pro Ser Tyr Ser Phe Lys Gly
245 250 255
Thr Thr Asn Ala Thr Asn Thr Ala Phe Ser Asn Ala Ser Gly Ser Phe
260 265 270
Thr Phe Glu Glu Asn Ala Thr Phe Ser Gly Ala Lys Trp Asn Gly Gly
275 280 285
Thr Tyr Thr Phe Asn Lys Glu Phe Ser Ala Thr Asn Asn Thr Ala Phe
290 295 300
Ser Ser Gly Ser Phe Asn Phe Lys Gly Val Ser Ser Phe Asn Gly Thr 305 310 315 320
Ser Phe Ser Asn Ala Ser Tyr Thr Phe Asp Asn Gln Ala Thr Phe Gln
325 330 335
Asn Ser Ser Phe Asn Gly Gly Thr Phe Thr Phe Asn Asn Gln Thr Asn
340 345 350
Pro Thr Asn Asn Ala Gln His Pro Gln Ile Gln Asn Ser Ser Phe Ser
355 360 365
Gly Asn Ala Thr Thr Leu Lys Gly Phe Val Asn Phe Gln Gln Ala Phe
370 375 380
Asn Asn Ser Asn His Gln Leu Thr Ile Gln Asn Ala Ser Phe Asn Asn 385 390 395 400
Ala Thr Phe Asn Asn Thr Gly Lys Ile Thr Ile Glu Lys Asp Ala Ser
405 410 415
Phe Asn Asn Thr Thr Phe Asn Thr Ser Val Asp Thr Asn Asn Met Ser
420 425 430
Val Thr Gly Gly Val Thr Leu Ser Gly Lys Asn Asp Leu Lys Asn Gly
435 440 445
Ser Thr Leu Asp Phe Gly Ser Ser Lys Ile Thr Leu Ala Gln Gly Thr
450 455 460
Thr Phe Asn Leu Thr Ser Leu Gly Ser Glu Lys Ser Val Thr Ile Leu 465 470 475 480
Asn Ser Ser Gly Gly Ile Thr Tyr Ser Asn Leu Leu Asn His Ala Ile
485 490 495
Asn Gly Leu Thr Ser Ala Leu Lys Thr Asn Glu Ser Leu Ser Asn Pro
500 505 510
Gln Ser Phe Ala Gln Gly Leu Trp Asp Ile Ile Thr Tyr Asn Gly Val
515 520 525
Thr Gly Gln Leu Leu Asn Glu Asn Ala Ala Thr Ser Lys Pro Thr Asp
530 535 540
Ser Ser Pro Ser Lys Ser Ser Thr Asn Ser Thr Gln Val Tyr Gln Val 545 550 555 560
Gly Tyr Lys Ile Gly Asp Thr Ile Tyr Lys Leu Gln Glu Thr Phe Ser
565 570 575
His Asn Ser Ile Ile Ile Gln Ala Leu Glu Ser Gly Thr Tyr Thr Pro
580 585 590
Pro Pro Val Ile Asn Gly Ser Lys Phe Asp Leu Ser Ala Ser Asn Tyr
595 600 605
Ile Asn Ala Asp Met Pro Trp Tyr Asp His Lys Tyr Tyr Ile Pro Lys
610 615 620
Ser Gln Asn Phe Thr Glu Ser Gly Thr Tyr Tyr Leu Pro Ser Val Gln 625 630 635 640
Ile Trp Gly Ser Tyr Thr Asn Ser Phe Lys Gln Thr Phe Ser Ala Asn
645 650 655
Gly Ser Asn Leu Val Ile Gly Tyr Asn Ser Thr Trp Thr Asp His Asn
660 665 670
Val Ser Ser Ser Gly Thr Val Ser Phe Gly Asp Thr Ser Gly Ser Ala
675 680 685
Leu Asn Gly His Cys Gly Pro Trp Pro Tyr Tyr Gln Cys Thr Gly Thr
690 695 700
Thr Asn Gly Thr Tyr Ser Ala Tyr His Val Tyr Ile Thr Ala Asn Leu 705 710 715 720
Arg Ser Gly Asn Arg Ile Gly Thr Gly Gly Ala Ala Asn Leu Ile Phe
725 730 735
Asn Gly Val Asp Ser Ile Asn Ile Ala Asn Ala Thr Ile Thr Gln His
740 745 750
Asn Ala Gly Ile Tyr Ser Ser Ser Met Thr Phe Ser Thr Gln Ser Met
755 760 765
Asp Asn Ser Gln Asn Leu Asn Gly Leu Asn Ser Asn Gly Lys Leu Ser
770 775 780
Val Tyr Gly Thr Thr Phe Thr Asn Glu Ala Lys Asp Gly Lys Phe Ile 785 790 795 800
Phe Asn Ala Gly Gln Ala Val Phe Glu Asn Thr Asn Phe Asn Gly Gly
805 810 815
Ser Tyr Gln Phe Ser Gly Asp Ser Leu Asn Phe Ser Asn Asn Asn Gln
820 825 830
Phe Asn Ser Gly Ser Phe Glu Ile Ser Ala Lys Asn Ala Ser Phe Asn
835 840 845
Asn Ala Asn Phe Asn Asn Ser Ala Ser Phe Asn Phe Asn Asn Ser Asn
850 855 860
Ala Thr Thr Ser Phe Val Gly Asp Phe Thr Asn Ala Asn Ser Asn Leu 865 870 875 880
Gln Ile Ala Gly Asn Ala Val Phe Gly Asn Ser Thr Asn Gly Ser Gln
885 890 895
Asn Thr Ala Asn Phe Asn Asn Thr Gly Ser Val Asn Ile Ser Gly Asn
900 905 910
Ala Thr Phe Asp Asn Val Val Phe Asn Gly Pro Thr Asn Thr Ser Val
915 920 925
Lys Gly Gln Val Thr Leu Asn Asn Ile Thr Leu Lys Asn Leu Asn Ala
930 935 940
Pro Leu Ser Phe Gly Asp Gly Thr Ile Thr Phe Asn Ala His Ser Val 945 950 955 960
Ile Asn Ile Ala Glu Ser Ile Thr Asn Gly Asn Pro Ile Thr Leu Val
965 970 975
Ser Ser Ser Lys Glu Ile Glu Tyr Asn Asn Ala Phe Ser Lys Asn Leu
980 985 990
Trp Gln Leu Ile Asn Tyr Gln Gly His Gly Ala Ser Ser Glu Lys Leu
995 1000 1005
Val Ser Ser Ala Gly Asn Gly Val Tyr Asp Val Val Tyr Ser Phe Asn
1010 1015 1020
Asn Gln Thr Tyr Asn Phe Gln Glu Val Phe Ser Gln Asn Ser Ile Ser 1025 1030 1035 1040
Ile Arg Arg Leu Gly Val Asn Met Val Phe Asp Tyr Val Asp Met Glu
1045 1050 1055
Lys Ser Asp His Leu Tyr Tyr Gln Asn Ala Leu Gly Phe Met Thr Tyr
1060 1065 1070
Met Pro Asn Ser Tyr Asn Asn Asn Leu Gly Asn Ala Asn Asn Thr Ile
1075 1080 1085
Tyr Tyr Tyr Asp Lys Ser Ile Asp Phe Tyr Ala Ser Gly Lys Thr Leu
1090 1095 1100
Phe Thr Lys Ala Glu Phe Ser Gln Thr Phe Thr Gly Gln Asn Ser Ala 1105 1110 1115 1120
Ile Val Phe Gly Ala Lys Ser Ile Trp Thr Ser Leu Ser Asp Ala Pro
1125 1130 1135
Gln Ser Asn Thr Ile Ile Arg Phe Gly Asp Asn Lys Gly Ala Gly Ser
1140 1145 1150
Asn Asp Ala Ser Gly His Cys Trp Asn Leu Gln Cys Ile Gly Phe Ile
1155 1160 1165
Thr Gly His Tyr Glu Ala Gln Lys Ile Tyr Ile Thr Gly Ser Ile Glu
1170 1175 1180
Ser Gly Asn Arg Ile Ser Ser Gly Gly Gly Ala Ser Leu Asn Phe Asn 1185 1190 1195 1200
Gly Leu Gln Gly Ile Leu Leu Thr Asn Ala Thr Leu Tyr Asn Arg Ala
1205 1210 1215
Ala Gly Thr Gln Ser Ser Ser Met Asn Phe Ile Ser Asn Ser Ala Asn
1220 1225 1230
Ile Gln Ala Gln Asn Ser Tyr Phe Ile Asp Asp Thr Ala Gln Asn Gly
1235 1240 1245
Gly Asn Pro Asn Phe Ser Phe Asn Ala Leu Asn Leu Asp Phe Ser Asn
1250 1255 1260
Ser Ser Phe Arg Gly Tyr Val Gly Lys Thr Gln Ser Val Phe Lys Phe 1265 1270 1275 1280
Asn Ala Lys Asn Ala Ile Ser Phe Thr Asn Ser Thr Asn Leu Ser Ser
1285 1290 1295
Gly Leu Tyr Gln Met Gln Ala Lys Ser Val Leu Phe Asp Asn Ser Asn
1300 1305 1310
Leu Ser Val Ser Val Gly Thr Ser Ser Ile Lys Ala Asn Ala Ile Asn
1315 1320 1325
Leu Ser Gln Asn Ala Ser Ile Asn Ala Ser Asn His Ser Thr Leu Glu
1330 1335 1340
Leu Gln Gly Asp Leu Asn Val Asn Asp Thr Ser Ser Leu Asn Leu Asn 1345 1350 1355 1360
Gln Ser Thr Ile Asn Val Ser Asn Asn Ala Thr Ile Asn Asp Tyr Ala
1365 1370 1375
Ser Leu Ile Ala Ser Asn Gly Ser His Leu Asn Phe Asn Gly Ala Val
1380 1385 1390
Asn Phe Asn Ser Ala Asn Ile Thr Thr Ser Leu Asn Asn Ser Ser Ile
1395 1400 1405
Val Phe Lys Gly Ala Val Ser Leu Gly Gly Gln Phe Asn Leu Ser Asn
1410 1415 1420
Asn Ser Ser Leu Asp Phe Gln Gly Ser Ser Ala Ile Thr Ser Asn Thr 1425 1430 1435 1440
Ala Phe Asn Phe Tyr Asp Asn Ala Phe Ser Gln Ser Pro Ile Thr Phe
1445 1450 1455
His Gln Ala Leu Asp Ile Lys Ala Pro Leu Ser Leu Gly Gly Asn Leu
1460 1465 1470
Leu Asn Pro Asn Asn Ser Ser Val Leu Asp Leu Lys Asn Ser Gln Leu
1475 1480 1485
Val Phe Gly Asp Gln Gly Ser Leu Asn Ile Ala Asn Ile Asp Leu Leu
1490 1495 1500
Ser Asp Leu Asn Asp Asn Lys Asn Arg Val Tyr Asn Ile Ile Gln Ala 1505 1510 1515 1520
Asp Met Asn Ser Asn Trp Tyr Glu Arg Ile Ser Phe Phe Gly Met His
1525 1530 1535
Ile Asn Asp Gly Ile Tyr Asp Ala Lys Asn Gln Thr Tyr Ser Phe Thr
1540 1545 1550
Asn Pro Leu Asn Asn Ala Leu Lys Ile Thr Glu Ser Phe Lys Asp Asn
1555 1560 1565
Gln Leu Ser Val Thr Leu Ser Gln Ile Pro Gly Ile Lys Asn Thr Leu
1570 1575 1580
Tyr Asn Ile Gly Ser Glu Ile Phe Asn Tyr Gln Lys Val Tyr Asn Asn 1585 1590 1595 1600
Ala Asn Gly Val Tyr Ser Tyr Ser Asp Asp Ala Gln Gly Val Phe Tyr
1605 1610 1615
Leu Thr Ser Asn Val Lys Gly Tyr Tyr Asn Pro Asn Gln Ser Tyr Gln
1620 1625 1630
Ala Ser Gly Ser Asn Asn Thr Thr Lys Asn Asn Asn Leu Thr Ser Glu
1635 1640 1645
Ser Ser Ile Ile Ser Gln Thr Tyr Asn Ala Gln Gly Asn Pro Ile Ser
1650 1655 1660
Ala Leu His Ile Tyr Asn Lys Gly Tyr Asn Phe Asn Asn Ile Lys Ala 1665 1670 1675 1680
Leu Gly Gln Met Ala Leu Lys Leu Tyr Pro Glu Ile Lys Lys Val Leu
1685 1690 1695
Gly Asn Asp Phe Ser Pro Ser Ser Leu Asn Ala Leu Asn Ser Asn Ala
1700 1705 1710
Leu Asn Gln Leu Thr Lys Leu Ile Thr Pro Asn Asp Trp Lys Asn Ile
1715 1720 1725
Asn Glu Leu Ile Asp Asn Ala Asn Asn Ser Val Val Gln Asn Phe Asn
1730 1735 1740
Asn Gly Thr Leu Ile Val Gly Ala Thr Gln Ile Gly Gln Thr Asp Thr 1745 1750 1755 1760
Asn Ser Ala Val Val Phe Gly Gly Leu Gly Tyr Gln Thr Pro Cys Asp
1765 1770 1775
Tyr Thr Asp Ile Val Cys Gln Lys Phe Arg Gly Thr Tyr Leu Gly Gln
1780 1785 1790
Leu Leu Glu Ser Ser Ser Ala Asp Leu Gly Tyr Ile Asp Thr Thr Phe
1795 1800 1805
Asn Ala Lys Glu Ile Tyr Leu Thr Gly Thr Leu Gly Ser Gly Asn Ala
1810 1815 1820
Trp Gly Thr Gly Gly Ser Ala Ser Val Thr Phe Asn Ser Gln Thr Ser 1825 1830 1835 1840
Leu Ile Leu Asn Gln Ala Asn Ile Val Ser Ser Gln Thr Asp Gly Ile
1845 1850 1855
Phe Ser Met Leu Gly Gln Glu Gly Ile Asn Lys Val Phe Asn Gln Ala
1860 1865 1870
Gly Leu Ala Asn Ile Leu Gly Glu Val Ala Val Gln Ser Ile Asn Lys
1875 1880 1885
Ala Gly Gly Leu Gly Asn Leu Ile Val Asn Thr Leu Gly Ser Asn Ser
1890 1895 1900
Val Ile Gly Gly Tyr Leu Thr Pro Glu Gln Lys Asn Gln Thr Leu Ser 1905 1910 1915 1920
Gln Leu Leu Gly Gln Asn Asn Phe Asp Asn Leu Met Asn Asp Ser Gly
1925 1930 1935
Leu Asn Thr Ala Ile Lys Asp Leu Ile Arg Gln Lys Leu Gly Phe Trp
1940 1945 1950
Thr Gly Leu Val Gly Gly Leu Ala Gly Leu Gly Gly Ile Asp Leu Gln
1955 1960 1965
Asn Pro Glu Lys Leu Ile Gly Ser Met Ser Ile Asn Asp Leu Leu Ser
1970 1975 1980
Lys Lys Gly Leu Phe Asn Gln Ile Thr Gly Phe Ile Ser Ala Asn Asp 1985 1990 1995 2000
Ile Gly Gln Val Ile Ser Val Met Leu Gln Asp Ile Val Lys Pro Ser
2005 2010 2015
Asn Ala Leu Lys Asn Asp Val Ala Ala Leu Gly Lys Gln Met Ile Gly
2020 2025 2030
Glu Phe Leu Gly Gln Asp Thr Leu Asn Ser Leu Glu Ser Leu Leu Gln
2035 2040 2045
Asn Gln Gln Ile Lys Ser Val Leu Asp Lys Val Leu Ala Ala Lys Gly
2050 2055 2060
Leu Gly Pro Ile Tyr Glu Gln Gly Leu Gly Asp Leu Ile Pro Asn Leu
2065 2070 2075 2080
Gly Lys Lys Gly Leu Phe Ala Pro Tyr Gly Leu Ser Gln Val Trp Gln
2085 2090 2095
Lys Gly Asp Phe Ser Phe Asn Ala Gln Gly Asn Val Phe Val Gln Asn
2100 2105 2110
Ser Thr Phe Ser Asn Ala Asn Gly Gly Thr Leu Ser Phe Asn Ala Gly
2115 2120 2125
Asn Ser Leu Ile Phe Ala Gly Asn Asn His Ile Ala Phe Thr Asn His
2130 2135 2140
Ala Gly Thr Leu Gln Leu Leu Ser Asp Gln Val Ser Asn Ile Asn Ile
2145 2150 2155 2160
Thr Thr Leu Asn Ala Ser Asn Gly Leu Lys Ile Asn Ala Ala Asn Asn
2165 2170 2175
Asn Val Ser Val Ser Gln Gly Asn Leu Phe Val Ser Ala Ser Cys Ala
2180 2185 2190
Gln Gln Ser Asp Pro Thr Thr Ala Asn Ile Ala Asn Pro Cys Ala Leu
2195 2200 2205
Ser Ala Gln Ser Thr Asn Gly Ala Ser Ser Asn Asn Ala Ser Asn Asn
2210 2215 2220
Ala Pro Ile Ala Leu Ser Asn Asn Asp Glu Ser Leu Met Val Ala Ala
2225 2230 2235 2240
Asn Asp Phe Asn Phe Ser Gly Asn Ile Tyr Ala Asn Gly Val Val Asp
2245 2250 2255
Phe Ser Lys Ile Lys Gly Ser Ala Asn Ile Lys Asn Leu Tyr Leu Tyr
2260 2265 2270
Asn Asn Ala Gln Phe Gln Ala Asn Asn Leu Thr Ile Ser Asn Gln Ala
2275 2280 2285
Val Leu Glu Lys Asn Ala Ser Phe Val Thr Asn Asn Leu Asn Ile Gln
2290 2295 2300
Gly Ala Phe Asn Asn Asn Ala Thr Gln Lys Ile Glu Val Leu Gln Asn
2305 2310 2315 2320
Leu Val Ile Ala Ser Asn Ala Ser Leu Ser Thr Gly Ile Tyr Gly Leu
2325 2330 2335
Glu Val Gly Gly Ala Leu Asn Asn Ser Gly Ala Ile His Phe Asn Leu
2340 2345 2350
Glu Asn Thr Gln Thr Pro Thr Pro Leu Ile Gln Ala Glu Gly Ile Ile
2355 2360 2365
Asn Leu Asn Thr Thr Gln Thr Pro Phe Met Asn Val Asn Asn Ser Met
2370 2375 2380
Ala Asn Asn Thr Thr Tyr Thr Leu Leu Lys Ser Ser Arg Tyr Ile Asp
2385 2390 2395 2400
Tyr Asn Ile Asn Pro Asn Ser Leu Gln Ser Tyr Leu Asn Leu Tyr Thr
2405 2410 2415
Leu Ile Asn Ile Asn Gly Asn His Ile Glu Glu Lys Asn Gly Ala Leu
2420 2425 2430
Thr Tyr Leu Gly Gln Arg Val Leu Leu Gln Asp Lys Gly Leu Leu Leu
2435 2440 2445
Ser Val Ala Leu Pro Asn Ser Asn Asn Ala Ser Gln Asn Asn Ile Leu
2450 2455 2460
Ser Leu Ser Val Leu Tyr Asn Gln Val Lys Met Ser Cys Gly Asp Lys
2465 2470 2475 2480
Ala Met Asp Phe Thr Pro Pro Thr Leu Gln Asp Tyr Ile Val Gly Ile
2485 2490 2495
Gln Gly Gln Ser Ala Leu Asn Gln Ile Glu Ala Val Gly Gly Asn Ala
2500 2505 2510
Ile Lys Trp Leu Ser Thr Leu Met Met Glu Thr Lys Glu Asn Pro Phe
2515 2520 2525
Phe Ala Pro Ile Tyr Leu Lys Asn His Ser Leu Asn Glu Ile Leu Gly
2530 2535 2540
Val Thr Lys Asp Leu Gln Asn Thr Ala Ser Leu Ile Ser Asn Pro Asn
2545 2550 2555 2560
Phe Arg Asp Asn Ala Thr Asn Leu Leu Glu Leu Ala Ser Tyr Thr Gln
2565 2570 2575
Gln Thr Ser Arg Leu Thr Lys Leu Ser Asp Phe Arg Ser Arg Glu Gly
2580 2585 2590
Glu Ser Asp Phe Ser Leu Leu Glu Leu Lys Asn Lys Arg Phe Ser Asp
2595 2600 2605
Pro Asn Pro Glu Val Phe Val Lys Tyr Ser Gln Leu Ser Lys His Pro
2610 2615 2620
Asn Asn Leu Trp Val Gln Gly Val Gly Gly Ala Ser Phe Ile Ser Gly
2625 2630 2635 2640
Gly Asn Gly Thr Leu Tyr Gly Leu Asn Ala Gly Tyr Asp Arg Leu Val
2645 2650 2655
Lys Asn Val Ile Leu Gly Gly Tyr Val Ala Tyr Gly Tyr Ser Asp Phe
2660 2665 2670
Asn Gly Asn Ile Met His Ser Leu Gly Asn Asn Val Asp Val Gly Met
2675 2680 2685
Tyr Ala Arg Ala Phe Leu Lys Arg Asn Glu Phe Thr Leu Ser Ala Asn
2690 2695 2700
Glu Thr Tyr Gly Gly Asn Ala Thr Ser Ile Asn Ser Ser Asn Ser Leu
2705 2710 2715 2720
Leu Ser Val Leu Asn Gln Arg Tyr Asn Tyr Asn Thr Trp Thr Thr Ser
2725 2730 2735
Val Asn Gly Asn Tyr Gly Tyr Asp Phe Met Phe Lys Gln Lys Ser Val
2740 2745 2750
Val Leu Lys Pro Gln Val Gly Leu Ser Tyr His Phe Ile Gly Leu Ser
2755 2760 2765
Gly Met Lys Gly Asn Asp Ala Ala Tyr Lys Gln Phe Leu Met His Ser
2770 2775 2780
Asn Pro Ser Asn Glu Ser Val Leu Thr Leu Asn Met Gly Leu Glu Ser
2785 2790 2795 2800
Arg Lys Tyr Phe Gly Lys Asn Ser Tyr Tyr Phe Val Thr Ala Arg Leu
2805 2810 2815
Gly Arg Asp Leu Leu Ile Lys Ser Lys Gly Ser Asn Thr Val Arg Phe
2820 2825 2830
Val Gly Glu Asn Thr Leu Leu Tyr Arg Lys Gly Glu Val Phe Asn Thr
2835 2840 2845
Phe Ala Ser Val Ile Thr Gly Gly Glu Met His Leu Trp Arg Leu Val
2850 2855 2860
Tyr Val Asn Ala Gly Val Gly Leu Lys Met Gly Leu Gln Tyr Gln Asp
2865 2870 2875 2880
Ile Asn Ile Thr Gly Asn Val Gly Met Arg Val Ala Phe
2885 2890
(2) INFORMATION FOR SEQ ID NO: 167:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1376 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 13...1338
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:
TGGTAGTTAA GA ATG GGT AAT CAT TTT TCT AAA TTA GGA TTT GTT TTA GCC 51
Met Gly Asn His Phe Ser Lys Leu Gly Phe Val Leu Ala
1 5 10
GCA TTA GGA AGC GCG ATA GGT TTA GGG CAT ATC TGG CGT TTC CCC TAC 99
Ala Leu Gly Ser Ala Ile Gly Leu Gly His Ile Trp Arg Phe Pro Tyr
15 20 25
ATG ACT GGG GTG AGT GGT GGG GGT GCT TTT GTT TTA TTG TTT TTA TTT 147
Met Thr Gly Val Ser Gly Gly Gly Ala Phe Val Leu Leu Phe Leu Phe
30 35 40 45
TTA TCT TTA AGC GTT GGC GCG GCG ATG TTT ATC GCT GAA ATG CTA TTA 195
Leu Ser Leu Ser Val Gly Ala Ala Met Phe Ile Ala Glu Met Leu Leu
50 55 60
GGA CAA AGC ACT CAA AAA AAT GTA ACA GAA GCT TTT AAA GAG CTT GAC 243
Gly Gln Ser Thr Gln Lys Asn Val Thr Glu Ala Phe Lys Glu Leu Asp
65 70 75
ATT AAC CCC AAA AAA CGC TGG AAA TAC GCA GGG CTT TTG CTT GTT TCT 291
Ile Asn Pro Lys Lys Arg Trp Lys Tyr Ala Gly Leu Leu Leu Val Ser
80 85 90
GGG CCA TTA ATA CTG ACT TTT TAC GGC ACG ATT TTA GGT TGG GTG CTT 339
Gly Pro Leu Ile Leu Thr Phe Tyr Gly Thr Ile Leu Gly Trp Val Leu
95 100 105
TAT TAT TTG GTG AGT GTT AGT TTT AAT TTG CCT AAC AAT ATC CAA GAA 387
Tyr Tyr Leu Val Ser Val Ser Phe Asn Leu Pro Asn Asn Ile Gln Glu
110 115 120 125
TCT GAA CAA ATT TTT ACT CAA ACT TTG CAG TCT ATA GGG CTA CAA TCC 435
Ser Glu Gln Ile Phe Thr Gln Thr Leu Gln Ser Ile Gly Leu Gln Ser
130 135 140
ATA GGG CTT TTT AGC GTT TTA TTG ATA ACC GGA TGG ATT GTT TCT AGG 483
Ile Gly Leu Phe Ser Val Leu Leu Ile Thr Gly Trp Ile Val Ser Arg
145 150 155
GGG ATT AAA GAA GGC ATT GAA AAG CTC AAT TTG GTT TTA ATG CCC TTA 531
Gly Ile Lys Glu Gly Ile Glu Lys Leu Asn Leu Val Leu Met Pro Leu
160 165 170
CTC TTT GCT ACT TTT TTT GGT TTG CTT TTC TAT GCG ATG AGC ATG GAT 579
Leu Phe Ala Thr Phe Phe Gly Leu Leu Phe Tyr Ala Met Ser Met Asp
175 180 185
TCT TTT TCT AAA GCT TTT CAT TTC ATG TTT GAT TTC AAA CCA AAA GAT 627
Ser Phe Ser Lys Ala Phe His Phe Met Phe Asp Phe Lys Pro Lys Asp
190 195 200 205
TTG ACC TCT CAA GTG TTC ACT TAT TCC TTG GGG CAG GTT TTC TTT TCC 675
Leu Thr Ser Gln Val Phe Thr Tyr Ser Leu Gly Gln Val Phe Phe Ser
210 215 220
TTA AGC ATC GGT TTA GGG ATC AAT ATC ACT TAC GCT GCG GTT ACG GAT 723
Leu Ser Ile Gly Leu Gly Ile Asn Ile Thr Tyr Ala Ala Val Thr Asp
225 230 235
AAA ACG CAG AAT TTG CTT AAA AGC ACT ATT TGG GTG GTT TTA TCA GGA 771
Lys Thr Gln Asn Leu Leu Lys Ser Thr Ile Trp Val Val Leu Ser Gly
240 245 250
ATT CTA ATT TCT CTT GTG GCA GGA CTT ATG ATT TTC ACT TTT GTG TTT 819
Ile Leu Ile Ser Leu Val Ala Gly Leu Met Ile Phe Thr Phe Val Phe
255 260 265
GAA TAT GGG GCG AAT GTC TCA CAA GGC ACA GGG TTA ATC TTC ACT TCT 867
Glu Tyr Gly Ala Asn Val Ser Gln Gly Thr Gly Leu Ile Phe Thr Ser
270 275 280 285
TTA CCG GTG GTT TTT GGC CAA ATG GGA GCG ATA GGC ATT CTT GTT TCG 915
Leu Pro Val Val Phe Gly Gln Met Gly Ala Ile Gly Ile Leu Val Ser
290 295 300
ATT CTT TTC TTG CTC GCG CTC GCT TTT GCT GGC ATC ACT TCT ACG GTG 963
Ile Leu Phe Leu Leu Ala Leu Ala Phe Ala Gly Ile Thr Ser Thr Val
305 310 315
GCT TTA TTG GAG CCA AGC GTG ATG TAT CTT ACC GAA AGG TAT CAA TAC 1011
Ala Leu Leu Glu Pro Ser Val Met Tyr Leu Thr Glu Arg Tyr Gln Tyr
320 325 330
TCT CGT TTT AAG GTT ACT TGG GGT CTT GTA GCA CTA ATT TTT GTG GTA 1059
Ser Arg Phe Lys Val Thr Trp Gly Leu Val Ala Leu Ile Phe Val Val
335 340 345
GGC GTG GTG TTG ATT TTC TCG CTC CAT AAG GAT TAT AAA GAT TAT CTC 1107
Gly Val Val Leu Ile Phe Ser Leu His Lys Asp Tyr Lys Asp Tyr Leu
350 355 360 365
ACT TTC TTT GAA AAA AGT CTT TTT GAT TGG TTG GAT TTT GCA TCA AGC 1155
Thr Phe Phe Glu Lys Ser Leu Phe Asp Trp Leu Asp Phe Ala Ser Ser
370 375 380
ACC ATT ATC ATG CCT TTA GGC GGG ATG GCA ACC TTT ATT TTT ATG GGT 1203
Thr Ile Ile Met Pro Leu Gly Gly Met Ala Thr Phe Ile Phe Met Gly
385 390 395
TGG GTT TTG AAA AAA GAA AAA TTG CGT CTT TTG AGC GTG CAC TTT TTA 1251
Trp Val Leu Lys Lys Glu Lys Leu Arg Leu Leu Ser Val His Phe Leu
400 405 410
GGC CCT AAA TTG TTT GCA ACT TGG TAT TTC TTG CTT AAA TAT ATC ACC 1299
Gly Pro Lys Leu Phe Ala Thr Trp Tyr Phe Leu Leu Lys Tyr Ile Thr
415 420 425
CCT TTA ATT GTG TTT TCC ATT TGG TTG AGC AAG ATT TAT TAAAATATTT GG 1350
Pro Leu Ile Val Phe Ser Ile Trp Leu Ser Lys Ile Tyr
430 435 440
CATGGGAAAA TTTTCTAAAT TAGGCT 1376
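The feature table for SEQ ID NO: 167 places the coding sequence at positions 13 to 1338 of a 1376-base record, and SEQ ID NO: 168 is listed as 442 amino acids. The following is a minimal Python sketch, not part of the original listing, that checks the span is a whole number of codons and translates the first 13 codons as printed. The function names and the partial codon table are illustrative assumptions; the table holds only the standard-code codons that occur in this excerpt.

```python
# Values copied from the SEQ ID NO: 167 feature table (1-based, inclusive).
CDS_START, CDS_END = 13, 1338

# First 51 nucleotides of SEQ ID NO: 167 as printed in the listing.
FIRST_51 = (
    "TGGTAGTTAA GA ATG GGT AAT CAT TTT TCT AAA TTA GGA TTT GTT TTA GCC"
).replace(" ", "")

# Partial standard codon table: only the codons that occur in this excerpt.
CODON_TO_RESIDUE = {
    "ATG": "Met", "GGT": "Gly", "AAT": "Asn", "CAT": "His", "TTT": "Phe",
    "TCT": "Ser", "AAA": "Lys", "TTA": "Leu", "GGA": "Gly", "GTT": "Val",
    "GCC": "Ala",
}


def codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in a 1-based inclusive span."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a multiple of 3")
    return span // 3


def translate_from(seq: str, start: int) -> list[str]:
    """Translate the complete codons of seq from 1-based position start."""
    coding = seq[start - 1:]
    usable = len(coding) - len(coding) % 3
    return [CODON_TO_RESIDUE[coding[i:i + 3]] for i in range(0, usable, 3)]


print(codon_count(CDS_START, CDS_END))
print(" ".join(translate_from(FIRST_51, CDS_START)))
```

Under these assumptions the span covers 1326 nucleotides, or 442 codons, which agrees with the stated length of SEQ ID NO: 168. The TAA that follows position 1338 lies outside the annotated location.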
(2) INFORMATION FOR SEQ ID NO: 168:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 442 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:
Met Gly Asn His Phe Ser Lys Leu Gly Phe Val Leu Ala Ala Leu Gly
1 5 10 15
Ser Ala Ile Gly Leu Gly His Ile Trp Arg Phe Pro Tyr Met Thr Gly
20 25 30
Val Ser Gly Gly Gly Ala Phe Val Leu Leu Phe Leu Phe Leu Ser Leu
35 40 45
Ser Val Gly Ala Ala Met Phe Ile Ala Glu Met Leu Leu Gly Gln Ser
50 55 60
Thr Gln Lys Asn Val Thr Glu Ala Phe Lys Glu Leu Asp Ile Asn Pro
65 70 75 80
Lys Lys Arg Trp Lys Tyr Ala Gly Leu Leu Leu Val Ser Gly Pro Leu
85 90 95
Ile Leu Thr Phe Tyr Gly Thr Ile Leu Gly Trp Val Leu Tyr Tyr Leu
100 105 110
Val Ser Val Ser Phe Asn Leu Pro Asn Asn Ile Gln Glu Ser Glu Gln
115 120 125
Ile Phe Thr Gln Thr Leu Gln Ser Ile Gly Leu Gln Ser Ile Gly Leu
130 135 140
Phe Ser Val Leu Leu Ile Thr Gly Trp Ile Val Ser Arg Gly Ile Lys
145 150 155 160
Glu Gly Ile Glu Lys Leu Asn Leu Val Leu Met Pro Leu Leu Phe Ala
165 170 175
Thr Phe Phe Gly Leu Leu Phe Tyr Ala Met Ser Met Asp Ser Phe Ser
180 185 190
Lys Ala Phe His Phe Met Phe Asp Phe Lys Pro Lys Asp Leu Thr Ser
195 200 205
Gln Val Phe Thr Tyr Ser Leu Gly Gln Val Phe Phe Ser Leu Ser Ile
210 215 220
Gly Leu Gly Ile Asn Ile Thr Tyr Ala Ala Val Thr Asp Lys Thr Gln
225 230 235 240
Asn Leu Leu Lys Ser Thr Ile Trp Val Val Leu Ser Gly Ile Leu Ile
245 250 255
Ser Leu Val Ala Gly Leu Met Ile Phe Thr Phe Val Phe Glu Tyr Gly
260 265 270
Ala Asn Val Ser Gln Gly Thr Gly Leu Ile Phe Thr Ser Leu Pro Val
275 280 285
Val Phe Gly Gln Met Gly Ala Ile Gly Ile Leu Val Ser Ile Leu Phe
290 295 300
Leu Leu Ala Leu Ala Phe Ala Gly Ile Thr Ser Thr Val Ala Leu Leu
305 310 315 320
Glu Pro Ser Val Met Tyr Leu Thr Glu Arg Tyr Gln Tyr Ser Arg Phe
325 330 335
Lys Val Thr Trp Gly Leu Val Ala Leu Ile Phe Val Val Gly Val Val
340 345 350
Leu Ile Phe Ser Leu His Lys Asp Tyr Lys Asp Tyr Leu Thr Phe Phe
355 360 365
Glu Lys Ser Leu Phe Asp Trp Leu Asp Phe Ala Ser Ser Thr Ile Ile
370 375 380
Met Pro Leu Gly Gly Met Ala Thr Phe Ile Phe Met Gly Trp Val Leu
385 390 395 400
Lys Lys Glu Lys Leu Arg Leu Leu Ser Val His Phe Leu Gly Pro Lys
405 410 415
Leu Phe Ala Thr Trp Tyr Phe Leu Leu Lys Tyr Ile Thr Pro Leu Ile
420 425 430
Val Phe Ser Ile Trp Leu Ser Lys Ile Tyr
435 440
(2) INFORMATION FOR SEQ ID NO: 169:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1392 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 22...1356
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:
TTTAAAAGGT ATTTTATAAC G ATG AAA ATT TTT GGG ACT GAT GGC GTG AGG 51
Met Lys Ile Phe Gly Thr Asp Gly Val Arg
1 5 10
GGT AAA GCA GGG GTG AAA CTC ACC CCC ATG TTT GTG ATG CGT TTA GGC 99
Gly Lys Ala Gly Val Lys Leu Thr Pro Met Phe Val Met Arg Leu Gly
15 20 25
ATT GCT GCC GGA TTG TAT TTT AAA AAA CAT TCT CAA ACG AAT AAA ATT 147
Ile Ala Ala Gly Leu Tyr Phe Lys Lys His Ser Gln Thr Asn Lys Ile
30 35 40
CTA ATC GGT AAA GAC ACC AGA AAA AGC GGC TAT ATG GTA GAA AAC GCT 195
Leu Ile Gly Lys Asp Thr Arg Lys Ser Gly Tyr Met Val Glu Asn Ala
45 50 55
TTA GTG AGC GCT CTA ACT TCC ATA GGC TAT AAT GTG ATT CAA ATA GGG 243
Leu Val Ser Ala Leu Thr Ser Ile Gly Tyr Asn Val Ile Gln Ile Gly
60 65 70
CCT ATG CCC ACC CCT GCG ATT GCG TTT TTA ACT GAA GAC ATG CGC TGT 291
Pro Met Pro Thr Pro Ala Ile Ala Phe Leu Thr Glu Asp Met Arg Cys
75 80 85 90
GAT GCG GGT ATT ATG ATA AGC GCG AGC CAC AAC CCT TTT GAA GAT AAT 339
Asp Ala Gly Ile Met Ile Ser Ala Ser His Asn Pro Phe Glu Asp Asn
95 100 105
GGC ATT AAG TTT TTC AAT TCT TAT GGC TAT AAG CTT AAA GAA GAA GAA 387
Gly Ile Lys Phe Phe Asn Ser Tyr Gly Tyr Lys Leu Lys Glu Glu Glu
110 115 120
GAA AAA GCG ATT GAA GAA ATC TTT CAT GAT GAA GAA TTA CTG CAT TCT 435
Glu Lys Ala Ile Glu Glu Ile Phe His Asp Glu Glu Leu Leu His Ser
125 130 135
AGC TAT AAA GTG GGT GAG AGC GTC GGT AGC GCT AAA AGG ATA GAC GAT 483
Ser Tyr Lys Val Gly Glu Ser Val Gly Ser Ala Lys Arg Ile Asp Asp
140 145 150
GTC ATA GGG CGC TAT ATT GCA CAT TTA AAA CAC TCT TTC CCC AAA CAT 531
Val Ile Gly Arg Tyr Ile Ala His Leu Lys His Ser Phe Pro Lys His
155 160 165 170
TTG AAT TTA CAG AGT TTA AGG ATC GTG CTA GAT ACG GCT AAT GGC GCG 579
Leu Asn Leu Gln Ser Leu Arg Ile Val Leu Asp Thr Ala Asn Gly Ala
175 180 185
GCT TAT AAG GTG GCT CCG GTC GTT TTT AGC GAG CTT GGG GCT GAT GTG 627
Ala Tyr Lys Val Ala Pro Val Val Phe Ser Glu Leu Gly Ala Asp Val
190 195 200
TTA GTG ATT AAT GAT GAG CCT AAC GGG TGT AAC ATT AAT GAT CAA TGC 675
Leu Val Ile Asn Asp Glu Pro Asn Gly Cys Asn Ile Asn Asp Gln Cys
205 210 215
GGG GCT TTA CAC CCC AAC CAA TTA AGC CAG GAA GTG AAA AAA TAC CGC 723
Gly Ala Leu His Pro Asn Gln Leu Ser Gln Glu Val Lys Lys Tyr Arg
220 225 230
GCA GAT TTA GGC TTT GCT TTT GAT GGC GAT GCT GAC AGG CTA GTG GTG 771
Ala Asp Leu Gly Phe Ala Phe Asp Gly Asp Ala Asp Arg Leu Val Val
235 240 245 250
GTG GAT AAT TTA GGG AAT ATC GTG CAT GGG GAT AAG CTT TTA GGG GTG 819
Val Asp Asn Leu Gly Asn Ile Val His Gly Asp Lys Leu Leu Gly Val
255 260 265
TTA GGG GTT TAT CAA AAA TCT AAA AAC GCC CTT TCT TCT CAA GCG GTT 867
Leu Gly Val Tyr Gln Lys Ser Lys Asn Ala Leu Ser Ser Gln Ala Val
270 275 280
GTC GCC ACA AAC ATG AGC AAT TTA GCC CTT AAA GAA TAT TTA AAA TCC 915
Val Ala Thr Asn Met Ser Asn Leu Ala Leu Lys Glu Tyr Leu Lys Ser
285 290 295
CAA GAT TTG GAA TTG AAG CAT TGC GCG ATT GGG GAT AAG TTT GTG AGC 963
Gln Asp Leu Glu Leu Lys His Cys Ala Ile Gly Asp Lys Phe Val Ser
300 305 310
GAA TGC ATG CAA TTG AAT AAA GCC AAT TTT GGA GGC GAG CAA AGC GGG 1011
Glu Cys Met Gln Leu Asn Lys Ala Asn Phe Gly Gly Glu Gln Ser Gly
315 320 325 330
CAT ATC ATT TTT AGC GAT TAC GCT AAA ACA GGC GAT GGT TTG GTG TGC 1059
His Ile Ile Phe Ser Asp Tyr Ala Lys Thr Gly Asp Gly Leu Val Cys
335 340 345
GCT TTG CAA GTG AGC GCG TTA GTG TTA GAA AGC AAG CAG GTA AGC TCT 1107
Ala Leu Gln Val Ser Ala Leu Val Leu Glu Ser Lys Gln Val Ser Ser
350 355 360
GTT GCG TTA AAC CCC TTT GAA TTA TAC CCC CAA AGC CTA GTG AAT TTG 1155
Val Ala Leu Asn Pro Phe Glu Leu Tyr Pro Gln Ser Leu Val Asn Leu
365 370 375
AAT GTC CAA AAA AAG CCC CCT TTA GAA AGC CTG AAA GGT TAT AGC GCT 1203
Asn Val Gln Lys Lys Pro Pro Leu Glu Ser Leu Lys Gly Tyr Ser Ala
380 385 390
CTT TTA AAA GAA TTA GAC AAG CTA GAA ATC CGC CAT TTG ATC CGT TAT 1251
Leu Leu Lys Glu Leu Asp Lys Leu Glu Ile Arg His Leu Ile Arg Tyr
395 400 405 410
AGC GGC ACT GAA AAC AAA TTG CGA ATC CTT TTA GAA GCT AAA GAT GAA 1299
Ser Gly Thr Glu Asn Lys Leu Arg Ile Leu Leu Glu Ala Lys Asp Glu
415 420 425
AAG CTT TTA GAA TCC AAA ATG CAA GAA TTA AAA GAG TTT TTT GAA GGG 1347
Lys Leu Leu Glu Ser Lys Met Gln Glu Leu Lys Glu Phe Phe Glu Gly
430 435 440
CAT TTG TGC TAAAAACCAC TAAAAAAAGC CTGTTGGTTT TTATGG 1392
His Leu Cys
445
(2) INFORMATION FOR SEQ ID NO: 170:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 445 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
Met Lys Ile Phe Gly Thr Asp Gly Val Arg Gly Lys Ala Gly Val Lys
1 5 10 15
Leu Thr Pro Met Phe Val Met Arg Leu Gly Ile Ala Ala Gly Leu Tyr
20 25 30
Phe Lys Lys His Ser Gln Thr Asn Lys Ile Leu Ile Gly Lys Asp Thr
35 40 45
Arg Lys Ser Gly Tyr Met Val Glu Asn Ala Leu Val Ser Ala Leu Thr
50 55 60
Ser Ile Gly Tyr Asn Val Ile Gln Ile Gly Pro Met Pro Thr Pro Ala
65 70 75 80
Ile Ala Phe Leu Thr Glu Asp Met Arg Cys Asp Ala Gly Ile Met Ile
85 90 95
Ser Ala Ser His Asn Pro Phe Glu Asp Asn Gly Ile Lys Phe Phe Asn
100 105 110
Ser Tyr Gly Tyr Lys Leu Lys Glu Glu Glu Glu Lys Ala Ile Glu Glu
115 120 125
Ile Phe His Asp Glu Glu Leu Leu His Ser Ser Tyr Lys Val Gly Glu
130 135 140
Ser Val Gly Ser Ala Lys Arg Ile Asp Asp Val Ile Gly Arg Tyr Ile
145 150 155 160
Ala His Leu Lys His Ser Phe Pro Lys His Leu Asn Leu Gln Ser Leu
165 170 175
Arg Ile Val Leu Asp Thr Ala Asn Gly Ala Ala Tyr Lys Val Ala Pro
180 185 190
Val Val Phe Ser Glu Leu Gly Ala Asp Val Leu Val Ile Asn Asp Glu
195 200 205
Pro Asn Gly Cys Asn Ile Asn Asp Gln Cys Gly Ala Leu His Pro Asn
210 215 220
Gln Leu Ser Gln Glu Val Lys Lys Tyr Arg Ala Asp Leu Gly Phe Ala
225 230 235 240
Phe Asp Gly Asp Ala Asp Arg Leu Val Val Val Asp Asn Leu Gly Asn
245 250 255
Ile Val His Gly Asp Lys Leu Leu Gly Val Leu Gly Val Tyr Gln Lys
260 265 270
Ser Lys Asn Ala Leu Ser Ser Gln Ala Val Val Ala Thr Asn Met Ser
275 280 285
Asn Leu Ala Leu Lys Glu Tyr Leu Lys Ser Gln Asp Leu Glu Leu Lys
290 295 300
His Cys Ala Ile Gly Asp Lys Phe Val Ser Glu Cys Met Gln Leu Asn
305 310 315 320
Lys Ala Asn Phe Gly Gly Glu Gln Ser Gly His Ile Ile Phe Ser Asp
325 330 335
Tyr Ala Lys Thr Gly Asp Gly Leu Val Cys Ala Leu Gln Val Ser Ala
340 345 350
Leu Val Leu Glu Ser Lys Gln Val Ser Ser Val Ala Leu Asn Pro Phe
355 360 365
Glu Leu Tyr Pro Gln Ser Leu Val Asn Leu Asn Val Gln Lys Lys Pro
370 375 380
Pro Leu Glu Ser Leu Lys Gly Tyr Ser Ala Leu Leu Lys Glu Leu Asp
385 390 395 400
Lys Leu Glu Ile Arg His Leu Ile Arg Tyr Ser Gly Thr Glu Asn Lys
405 410 415
Leu Arg Ile Leu Leu Glu Ala Lys Asp Glu Lys Leu Leu Glu Ser Lys
420 425 430
Met Gln Glu Leu Lys Glu Phe Phe Glu Gly His Leu Cys
435 440 445
(2) INFORMATION FOR SEQ ID NO: 171:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:
GCCGGATCCA TGACTTATGG GTATGGGGAA 30
(2) INFORMATION FOR SEQ ID NO: 172:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
GCCCTCGAGA CTTTTATTGA TTCACCATTT CATT 34
(2) INFORMATION FOR SEQ ID NO: 173:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:
GCCGGATCCA TCGCTGAAGA AAATGGGGCG 30
(2) INFORMATION FOR SEQ ID NO: 174:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
GCCCGGCCGC CCTAAAAACT ATAAACATAA CTC 33
(2) INFORMATION FOR SEQ ID NO: 175:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:
GCCGGATCCG GTATTAGGAA GCTTATACCA TC 32
(2) INFORMATION FOR SEQ ID NO: 176:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:
GCCCTCGAGA AGTTCTATTT TTAATTCCTT GAGAG 35
(2) INFORMATION FOR SEQ ID NO: 177:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:
GCCGGATCCT CTGATAGCCA TAAAGAAAAA AAGGAC 36
(2) INFORMATION FOR SEQ ID NO: 178:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:
GCCCTCGAGA TCTTTAGAAA TCAACCCCCA AAGC 34
(2) INFORMATION FOR SEQ ID NO: 179:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:
GCCGGATCCG ACTTAGAACA TTTTAACACG CTC 33
(2) INFORMATION FOR SEQ ID NO: 180:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
GCCCTCGAGT CATTTTAAAC GACTCAAAAC AAA 33
(2) INFORMATION FOR SEQ ID NO: 181:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
GCCGGATCCG GCCAAAGCGT GCGCACTTAT TGG 33
(2) INFORMATION FOR SEQ ID NO: 182:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
GCCCTCGAGT TATTGTTCCA ACCCCCACGC ATC 33
(2) INFORMATION FOR SEQ ID NO: 183:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:
GCCGGATCCA AGAGCAATGC TGATGACAAA CC 32
(2) INFORMATION FOR SEQ ID NO: 184:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:
GCCCTCGAGT TATGAGTTAA AGCCCCTTGT CC 32
(2) INFORMATION FOR SEQ ID NO: 185:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:
GCCGGATCCG AATCAGTAAA AACAGGAAAA AC 32
(2) INFORMATION FOR SEQ ID NO: 186:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:
GCCCTCGAGC GGCTCTTTGG AGTTTTATTG 30
(2) INFORMATION FOR SEQ ID NO: 187:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:
GCCGGATCCA TCATTCCCTC TCGCTCTATG G 31
(2) INFORMATION FOR SEQ ID NO: 188:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:
GCCCTCGAGA CCTTAATGCG TTGCGTTTTC TTT 33
(2) INFORMATION FOR SEQ ID NO: 189:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:
GCCGAGCTCC AAGCAAAAAA ATGTCAATTA AAAGGG 36
(2) INFORMATION FOR SEQ ID NO: 190:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:
GCCCTCGAGG TCTAAATTAG AATAAGTGTT GTT 33
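Each primer in SEQ ID NOs: 171 to 190 starts with GCC followed by a six-base motif that matches a well-known restriction enzyme recognition sequence (for example GGATCC or CTCGAG). The following Python sketch, which is not part of the original listing, checks the stated lengths of a few of these primers and reports which motif follows the leading GCC. The enzyme names are an interpretation added here, since this part of the listing does not name any enzymes, and the helper name `check_primer` is hypothetical.

```python
# Recognition sequences of four common restriction enzymes (standard values).
SITES = {
    "GGATCC": "BamHI",
    "CTCGAG": "XhoI",
    "GAGCTC": "SacI",
    "CGGCCG": "EagI",
}

# (SEQ ID NO, stated length, sequence as printed) for four of the primers.
PRIMERS = [
    (171, 30, "GCCGGATCCA TGACTTATGG GTATGGGGAA"),
    (172, 34, "GCCCTCGAGA CTTTTATTGA TTCACCATTT CATT"),
    (174, 33, "GCCCGGCCGC CCTAAAAACT ATAAACATAA CTC"),
    (189, 36, "GCCGAGCTCC AAGCAAAAAA ATGTCAATTA AAAGGG"),
]


def check_primer(stated_length, printed):
    """Return (length_ok, enzyme name or None) for one printed primer."""
    seq = printed.replace(" ", "").upper()
    # Assumption: the site, if present, directly follows the leading "GCC".
    enzyme = SITES.get(seq[3:9])
    return len(seq) == stated_length, enzyme


for seq_id, stated_length, printed in PRIMERS:
    length_ok, enzyme = check_primer(stated_length, printed)
    print(seq_id, length_ok, enzyme)
```

Only the six bases after the leading GCC are inspected, so a site located elsewhere in a primer would not be reported by this sketch.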
Claims (24)
1. An isolated polynucleotide that encodes:
(i) a polypeptide comprising an amino acid sequence that is homologous to the amino acid sequence of a Helicobacter polypeptide, wherein said amino acid sequence of said Helicobacter polypeptide is selected from the group consisting of the amino acid sequences as shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or
(ii) a derivative of said polypeptide encoded by said polynucleotide.
2. The isolated polynucleotide of claim 1, which encodes a mature form of said polypeptide.
3. The isolated polynucleotide of claim 1 or 2, wherein the polynucleotide is a DNA molecule.
4. A compound, in a substantially purified form, that is the mature form or a derivative of a polypeptide comprising an amino acid sequence that is homologous to a Helicobacter amino acid sequence that is selected from the group consisting of the amino acid sequences as shown in SEQ ID NO:2 (GHPO 13), SEQ ID NO:4 (GHPO 73), SEQ ID NO:6 (GHPO 90), SEQ ID NO:8 (GHPO 107), SEQ ID NO:10 (GHPO 136), SEQ ID NO:12 (GHPO 191), SEQ ID NO:14 (GHPO 213), SEQ ID NO:16 (GHPO 240), SEQ ID NO:18 (GHPO 408), SEQ ID NO:20 (GHPO 411), SEQ ID NO:22 (GHPO 419), SEQ ID NO:24 (GHPO 431), SEQ ID NO:26 (GHPO 474), SEQ ID NO:28 (GHPO 591), SEQ ID NO:30 (GHPO 596), SEQ ID NO:32 (GHPO 699), SEQ ID NO:34 (GHPO 724), SEQ ID NO:36 (GHPO 730), SEQ ID NO:38 (GHPO 761), SEQ ID NO:40 (GHPO 804), SEQ ID NO:42 (GHPO 805), SEQ ID NO:44 (GHPO 812), SEQ ID NO:46 (GHPO 879), SEQ ID NO:48 (GHPO 888), SEQ ID NO:50 (GHPO 986), SEQ ID NO:52 (GHPO 1056), SEQ ID NO:54 (GHPO 1081), SEQ ID NO:56 (GHPO 1100), SEQ ID NO:58 (GHPO 1140), SEQ ID NO:60 (GHPO 1148), SEQ ID NO:62 (GHPO 1200), SEQ ID NO:64 (GHPO 1212), SEQ ID NO:66 (GHPO 1258), SEQ ID NO:68 (GHPO 1263), SEQ ID NO:70 (GHPO 1273), SEQ ID NO:72 (GHPO 1284), SEQ ID NO:74 (GHPO 1299), SEQ ID NO:76 (GHPO 1327), SEQ ID NO:78 (GHPO 1346), SEQ ID NO:80 (GHPO 1378), SEQ ID NO:82 (GHPO 1412), SEQ ID NO:84 (GHPO 1443), SEQ ID NO:86 (GHPO 1466), SEQ ID NO:88 (GHPO 1476), SEQ ID NO:90 (GHPO 1536), SEQ ID NO:92 (GHPO 1559), SEQ ID NO:94 (GHPO 427), SEQ ID NO:96 (GHPO 1045), SEQ ID NO:98 (GHPO 1262), SEQ ID NO:100 (GHPO 1688), SEQ ID NO:102 (GHPO 1538), SEQ ID NO:104 (GHPO 346), SEQ ID NO:106 (GHPO 1012), SEQ ID NO:108 (GHPO 470), SEQ ID NO:110 (GHPO 1398), SEQ ID NO:112 (GHPO 1550), SEQ ID NO:114 (GHPO 276), SEQ ID NO:116 (GHPO 1501), SEQ ID NO:118 (GHPO 706), SEQ ID NO:120 (GHPO 1001), SEQ ID NO:122 (GHPO 732), SEQ ID NO:124 (GHPO 329), SEQ ID NO:126 (GHPO 574), SEQ ID NO:128 (GHPO 1190), SEQ ID NO:130 (GHPO 1374), SEQ ID NO:132 (GHPO 1620), SEQ ID NO:134 (GHPO 956), SEQ ID NO:136 (HPO 98), SEQ ID NO:138 (GHPO 689), SEQ ID NO:140 (GHPO 208), SEQ ID NO:142 (GHPO 296), SEQ ID NO:144 (GHPO 726), SEQ ID NO:146 (GHPO 1026), SEQ ID NO:148 (GHPO 1301), SEQ ID NO:150 (GHPO 1536), SEQ ID NO:152 (GHPO 166), SEQ ID NO:154 (GHPO 253), SEQ ID NO:156 (GHPO 297), SEQ ID NO:158 (GHPO 615), SEQ ID NO:160 (GHPO 1278), SEQ ID NO:162 (GHPO 1282), SEQ ID NO:164 (GHPO 1420), SEQ ID NO:166 (GHPO 1484), SEQ ID NO:168 (GHPO 1719), and SEQ ID NO:170 (GHPO 1252); or
(ii) a derivative of said polypeptide.
5. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of a compound of claim 4.
6. The method of claim 5, further comprising administering to said mammal an antibiotic, an antisecretory agent, a bismuth salt, or a combination thereof.
7. The method of claim 6, wherein said antibiotic is selected from the group consisting of amoxicillin, clarithromycin, tetracycline, metronidazole, and erythromycin, and said bismuth salt is selected from the group consisting of bismuth subcitrate and bismuth subsalicylate.
8. The method of claim 6, wherein said antisecretory agent is a proton pump inhibitor, an H2-receptor antagonist, or a prostaglandin analog.
9. The method of claim 8, wherein said proton pump inhibitor is selected from the group consisting of omeprazole, lansoprazole, and pantoprazole; said H2-receptor antagonist is selected from the group consisting of ranitidine, cimetidine, famotidine, nizatidine, and roxatidine; and said prostaglandin analog is selected from the group consisting of misoprostol and enprostil.
10. The method of claim 5, further comprising administering to said mammal a prophylactically or therapeutically effective amount of a second Helicobacter polypeptide or a derivative thereof.
11. The method of claim 10, wherein the second Helicobacter polypeptide is a Helicobacter urease, a subunit, or a derivative thereof.
12. A composition comprising a compound of claim 4, together with a physiologically acceptable diluent or carrier.
13. The composition of claim 12, further comprising an adjuvant.
14. The composition of claim 12, further comprising a second Helicobacter polypeptide or a derivative thereof.
15. The composition of claim 14, wherein said second Helicobacter polypeptide is a Helicobacter urease, a subunit, or a derivative thereof.
16. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of a polynucleotide of claim 1.
17. A composition comprising a viral vector, in the genome of which is inserted a DNA molecule of claim 1, said DNA molecule being placed under conditions for expression in a mammalian cell and said viral vector being admixed with a physiologically acceptable diluent or carrier.
18. A composition that comprises a bacterial vector comprising a DNA molecule of claim 1, said DNA molecule being placed under conditions for expression and said bacterial vector being admixed with a physiologically acceptable diluent or carrier.
19. The composition of claim 18, wherein said vector is selected from the group consisting of Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin, and Streptococcus.
20. A composition comprising a polynucleotide of claim 1, together with a physiologically acceptable diluent or carrier.
21. The composition of claim 20, wherein said polynucleotide is a DNA molecule that is inserted in a plasmid that is unable to replicate and to substantially integrate in a mammalian genome and is placed under conditions for expression in a mammalian cell.
22. An expression cassette comprising a DNA molecule of claim 1, said DNA molecule being placed under conditions for expression in a procaryotic or eucaryotic cell.
23. A process for producing a compound of claim 4, which comprises culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette of claim 22, and recovering said compound from the cell culture.
24. A method of preventing or treating Helicobacter infection in a mammal, said method comprising administering to said mammal a prophylactically or therapeutically effective amount of an antibody that binds to the compound of claim 4.
Applications Claiming Priority (13)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US74905196A | 1996-11-14 | 1996-11-14 | |
US08/749051 | 1996-11-14 | ||
US83130997A | 1997-04-01 | 1997-04-01 | |
US83345797A | 1997-04-01 | 1997-04-01 | |
US08/833457 | 1997-04-01 | ||
US08/834705 | 1997-04-01 | ||
US08/834,705 US20030023066A1 (en) | 1996-11-14 | 1997-04-01 | Helicobacter polypeptides and corresponding polynucleotide molecules |
US08/831309 | 1997-04-01 | ||
US88122797A | 1997-06-24 | 1997-06-24 | |
US08/881227 | 1997-06-24 | ||
US90261597A | 1997-07-29 | 1997-07-29 | |
US08/902615 | 1997-07-29 | ||
PCT/US1997/021353 WO1998021225A1 (en) | 1996-11-14 | 1997-11-14 | Helicobacter polypeptides and corresponding polynucleotide molecules |
Publications (2)
Publication Number | Publication Date |
---|---|
AU5266298A (en) | 1998-06-03 |
AU735391B2 (en) | 2001-07-05 |
Family
ID=27560272
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU52662/98A Ceased AU735391B2 (en) | 1996-11-14 | 1997-11-14 | Helicobacter polypeptides and corresponding polynucleotide molecules |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1021458A4 (en) |
AU (1) | AU735391B2 (en) |
CA (1) | CA2271774A1 (en) |
WO (1) | WO1998021225A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6503747B2 (en) * | 1998-07-14 | 2003-01-07 | University Of Hawaii | Serotype-specific probes for Listeria monocytogenes |
WO2000049044A1 (en) * | 1999-02-19 | 2000-08-24 | Astrazeneca Ab | Expression of helicobacter polypeptides in pichia pastoris |
WO2000073502A2 (en) * | 1999-05-31 | 2000-12-07 | MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. | Essential genes and gene products for identifying, developing and optimising immunological and pharmacological active ingredients for the treatment of microbial infections |
EP1337647A2 (en) * | 2000-11-15 | 2003-08-27 | Ludwig Deml | Helicobacter cysteine rich protein a (hcpa) and uses thereof |
CN111793137A (en) * | 2019-12-12 | 2020-10-20 | 南京蛋球球生物医学技术合伙企业(有限合伙) | Hp tetravalent antigen and preparation method and application thereof |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB8928625D0 (en) * | 1989-12-19 | 1990-02-21 | 3I Res Expl Ltd | H.pylori dna probes |
US5733740A (en) * | 1992-10-13 | 1998-03-31 | Vanderbilt University | Taga gene and methods for detecting predisposition to peptic ulceration and gastric carcinoma |
WO1996033732A1 (en) * | 1995-04-28 | 1996-10-31 | Oravax, Inc. | Multimeric, recombinant urease vaccine |
SK165197A3 (en) * | 1995-06-07 | 1999-01-11 | Astra Ab | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics |
WO1997019098A1 (en) * | 1995-11-17 | 1997-05-29 | Astra Aktiebolag | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics |
NZ332565A (en) * | 1996-03-29 | 2000-03-27 | Astra Ab | Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof |
1997
- 1997-11-14 EP EP97947620A patent/EP1021458A4/en not_active Withdrawn
- 1997-11-14 AU AU52662/98A patent/AU735391B2/en not_active Ceased
- 1997-11-14 CA CA002271774A patent/CA2271774A1/en not_active Abandoned
- 1997-11-14 WO PCT/US1997/021353 patent/WO1998021225A1/en not_active Application Discontinuation
Also Published As
Publication number | Publication date |
---|---|
AU735391B2 (en) | 2001-07-05 |
CA2271774A1 (en) | 1998-05-22 |
EP1021458A4 (en) | 2001-12-12 |
EP1021458A1 (en) | 2000-07-26 |
WO1998021225A1 (en) | 1998-05-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU784193B2 (en) | Chlamydia antigens and corresponding DNA fragments and uses thereof | |
WO2002018595A9 (en) | Moraxella polypeptides and corresponding dna fragments and uses thereof | |
AU756010B2 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
JP2000125889A (en) | Protein from actinobacillus pleuropneumoniae | |
AU734052B2 (en) | Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof | |
AU739641B2 (en) | Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof | |
AU735391B2 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
AU750792B2 (en) | 76 kDa, 32 kDa, and 50 kDa helicobacter polypeptides and corresponding polynucleotide molecules | |
US20030158396A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
AU2002211201B2 (en) | aopB Gene, protein, homologs, fragments and variants thereof, and their use for cell surface display | |
US20030124141A1 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
AU2002211201A1 (en) | aopB Gene, protein, homologs, fragments and variants thereof, and their use for cell surface display | |
US20080019994A1 (en) | Immunization Against Chlamydia Infection | |
US20030023066A1 (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
US20020115078A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
WO1997019098A9 (en) | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics | |
US20030069404A1 (en) | Helicobacter antigens and corresponding DNA fragments | |
US20020160456A1 (en) | Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome | |
US20020026035A1 (en) | Helicobacter ghpo 1360 and ghpo 750 polypeptides and corresponding polynucleotide molecules | |
US20020044949A1 (en) | 76 kda helicobacter polypeptides and corresponding polynucleotide molecules | |
JP2001503637A (en) | Helicobacter polypeptides and corresponding polynucleotide molecules | |
US20080166376A1 (en) | Immunization Against Chlamydia Infection | |
CA2223395A1 (en) | Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics | |
MXPA99004890A (en) | Nucleic acid and amino acid sequences relating to helicobacter pylori |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FGA | Letters patent sealed or granted (standard patent) |